Brother-in-law graduated med school in the early 90s and has been a practicing ER physician since. We discussed this recently and he related that his advisors told him not to go into radiology back in the late 80s because the assumption was that computers were going to take over the field. He's not too far away from retirement and it's only now that we're starting to see some signs of this prediction from 30+ years ago.
As others in the thread note, there are plenty of concerns around operational use of AI solutions in the medical space, but radiology has a much larger target painted on it than other practices because a fair portion of the job (but certainly not all!) can boil down to high-skill pattern recognition from visual inputs. The current list of AI-enabled devices going through FDA approval is public; more than 3/4 of the entries target radiology use cases: https://www.fda.gov/medical-devices/software-medical-device-...
The issue with radiologists is that on average they spot only ~35% of correct diagnoses, while the world's best radiologists manage ~45%. AI might get us to ~50%, which is ~15 percentage points better than an average radiologist (who still needs to review it).
It's fine to ask for sources. It's also fine to not give sources when relaying information in freeform comments. It's not fine to ask for sources in the tone you are using, though, as though you are annoyed and simply expect sources to always be included with claims. There are better ways of accomplishing your goals.
Someone drops very specific percentages about diagnostic accuracy... numbers that, if true, have serious implications for patient outcomes... and your concern is that I did not ask nicely enough for a source? I cannot think of a more typically HN response...
I did not even call the claim false, even if it almost deserves it... I said, essentially: let's see the references so we can treat this as fact rather than opinion.
What you did is write a longer and more prescriptive comment about my tone than anything anyone has written about the actual substance :-)). You tone-policed a one-line request for evidence while giving a complete pass to unsourced medical statistics presented as fact.
If we are ranking things that erode discourse quality, I would say you are higher on the list.
Three comments in... and you still have not said a single word about whether radiologists actually catch 35% of diagnoses. But you have found time to call me passive-aggressive, entitled, lazy, and immature. For one sentence. Asking for a source...
You are now, multiple comments deep, doing the thing you accuse me of...being more invested in tone than substance.
The lack of self awareness you display is impressive. Grade A troll or bot. As someone who sometimes misses things, I find it mildly interesting when someone is so confidently not on the same page as others. Good luck.
If you give specific numbers then I expect sources. If you give out incredibly bold claims then I also expect sources.
It's one thing to talk casually, in which case I agree with you. But as soon as hard numbers are on the table, it's no longer casual, and if you do not provide sources then the assumption has to be that you pulled the numbers out of your ass and you are not to be trusted.
To get around that, just don't provide numbers and don't speak authoritatively. It's very easy, I don't know why people speak authoritatively if they know they can't back it up.
There's a grey area in the middle here that you seem to be pretending is obviously navigated. You're speaking pretty authoritatively on this, by the way. Do you have the moral, propositional-logic, and epistemological justification for these claims?
I’m not sure I find this to be a comparable example.
If someone was making an important calculation or decision based on the circumference of the earth, then they would likely want the number cited/confirmed and not just thrown out by a random person that doesn’t pass the smell test. “Radiologists are only right 35% of the time” does not pass the smell test and a cursory search makes the case even worse.
I didn't make any claims; all of that is my opinion. There are literally no claims there. I just said that people who spew out numbers and then can't provide a source aren't trustworthy - that's an opinion.
And there's obviously a difference between an established and obvious fact and a BOLD claim. This person made a BOLD claim. And provided numbers. To me, that requires a source.
Yes, there is a middle ground, but this isn't in the middle ground. I think this type of claim requires a source. A different claim, without specific percentages, would not. Or an obvious claim, like the Earth's circumference, also would not.
Maybe "radiologist" means something different in my country, but here radiologists don't diagnose (except for a broken bone or something); oncologists do. I did an observation internship with a radiologist when I was 20 (95% of my family are doctors/nurses/PTs; I wanted to know what a degree in physics could let me do in the field, and radiology was the only path into medicine from my initial training where I only lost a year, not two). You spend your time calculating doses, finding patient history, and calibrating machines; it's much more a technician role than an MD's. In any case, even if radiologists in the US do diagnose cancer, that's such a small part of their job it shouldn't matter.
^ Knowing this, I believe the best course of action for a hospital administrator would be to implement a "blind workflow" to reduce risk & lawsuits.
A radiologist should review a scan independently, an AI should review it independently, and then the two results should be combined for a final review.
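For illustration only, a minimal sketch of how such a blind double-read might be combined; every name here (Finding, adjudicate, escalate_to_second_reader) is hypothetical, not any vendor's actual workflow API:

    # Hypothetical sketch of a blind double-read: two independent reads, then
    # agreement is accepted and disagreement is escalated. Not a real product API.
    from dataclasses import dataclass

    @dataclass
    class Finding:
        label: str  # e.g. "nodule, right upper lobe"

    def adjudicate(human: list[Finding], ai: list[Finding]) -> list[Finding]:
        """Combine two independent reads; disagreements go to a second human."""
        human_labels = {f.label for f in human}
        ai_labels = {f.label for f in ai}
        agreed = [f for f in human if f.label in ai_labels]
        disputed = [f for f in human if f.label not in ai_labels]
        disputed += [f for f in ai if f.label not in human_labels]
        return agreed + escalate_to_second_reader(disputed)

    def escalate_to_second_reader(findings: list[Finding]) -> list[Finding]:
        # Placeholder: in practice this would queue the scan for another radiologist.
        return findings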
I worry that rational takes like this end up completely lost in the battle between motivated parties who yell far louder, but have minimal investment in actual outcomes for those who will be depending on these technologies. The debate over self-driving vehicles is another example.
If your original argument was “it could be useful for more difficult/niche observations” then I think most of us wouldn’t have objected.
I also really don’t understand why you still aren’t sharing any links. Is this all LLM-generated without citations or something? Where are you getting your numbers?
You’re mixing up “using” with “copying”. You are allowed to “use” all of a book or movie or code by listening to or watching or reviewing the whole thing. Copyright protects copies. The legal claim here is that training an LLM is sufficiently transformative that it cannot be construed as a copy.
People seem to struggle with the concept of private datacenters these days. Palantir customers tend to be the sorts of orgs that are pretty paranoid about their data, and they wouldn't be handing it over to some schmucks without being confident that those concerns were addressed. Militaries and governments generally aren't fuckin around with things like intelligence data, so I think it's reasonable that Palantir is able to make a convincing case to the world's most paranoid orgs that their data isn't being sent anywhere (and it'd likely be air gapped anyway).
Just because everything you touch is in the cloud doesn't mean other orgs aren't still building their own datacenters and then buying software to run inside.
I'm not sure if you're familiar with the work from the lab of Mike Levin at Tufts but I'm betting you'll find it interesting if not. Here's a taste https://pmc.ncbi.nlm.nih.gov/articles/PMC6923654/
While I disagree with your notion that this is explicitly due to gravity, the rest of your argument seems to align with some of this lab's work. Learning can be demonstrated on scales as low as a few molecules, way below what we would normally call "life".
Modern LLMs, just like everyone reading this, will instead reach for a calculator to perform such tasks. I can't do that in my head either, but a Python script can, so that's what any tool-using LLM will (and should) do.
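To make that concrete, here is a rough sketch of the kind of calculator tool an LLM could call instead of multiplying in-context; the function name and shape are illustrative, not any specific provider's tool-calling API:

    # Illustrative only: a minimal "calculator" tool an LLM could call instead of
    # multiplying in its context window. Name and interface are made up.
    def calculator(expression: str) -> str:
        """Evaluate a basic arithmetic expression and return the result as text."""
        import ast, operator
        ops = {ast.Add: operator.add, ast.Sub: operator.sub,
               ast.Mult: operator.mul, ast.Div: operator.truediv,
               ast.Pow: operator.pow, ast.USub: operator.neg}
        def ev(node):
            if isinstance(node, ast.Constant):
                return node.value
            if isinstance(node, ast.BinOp):
                return ops[type(node.op)](ev(node.left), ev(node.right))
            if isinstance(node, ast.UnaryOp):
                return ops[type(node.op)](ev(node.operand))
            raise ValueError("unsupported expression")
        return str(ev(ast.parse(expression, mode="eval").body))

    print(calculator("123456789 * 987654321"))  # 121932631112635269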
Long multiplication is a trivial form of reasoning that is taught at the elementary level. Furthermore, the LLM isn't doing things "in its head" - the headline feature of GPT LLMs is attention across all previous tokens, all of its "thoughts" are on paper. That was Opus with extended reasoning; it had every opportunity to get it right, but didn't. There are people who can quickly multiply such numbers in their head (I am not one of them).
I tried this with Claude - it has to be explicitly instructed to not make an external tool call, and it can get the right answer if asked to show its work long-form.
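For reference, the "show its work long-form" procedure is just grade-school long multiplication, one partial product per digit; a minimal sketch:

    # Grade-school long multiplication written out as one partial product per digit of b.
    def long_multiply(a: int, b: int) -> int:
        partials = []
        for place, digit_char in enumerate(reversed(str(b))):
            partials.append(a * int(digit_char) * 10 ** place)
        return sum(partials)

    assert long_multiply(123456, 789) == 123456 * 789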
Mathematics is not the only kind of reasoning, so your conclusion is false. The human brain also has compartments for different types of activities. Why shouldn't an AI be able to use tools to augment its intelligence?
There are many examples of current limitations, but do you see a reason to think they are fundamental limitations? (I'm not saying they aren't, I'm curious what the evidence is for that.)
It's because of how transformers work, especially the fact that the output layer produces a distribution over tokens from which we quite literally make a weighted random choice. My hunch is that diffusion models would have a higher chance of doing real reasoning - or something like a latent space for reasoning.
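A minimal sketch of that sampling step, assuming the usual softmax-over-logits setup (greedy decoding or near-zero temperature removes the randomness):

    # Logits -> softmax -> weighted random choice of the next token id.
    import numpy as np

    def sample_next_token(logits: np.ndarray, temperature: float = 1.0) -> int:
        scaled = logits / max(temperature, 1e-8)
        probs = np.exp(scaled - scaled.max())  # subtract max for numerical stability
        probs /= probs.sum()
        return int(np.random.choice(len(probs), p=probs))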
Thinking that LLMs are intelligent arises from an incomplete understanding of how they work or, alternatively, having shareholders to keep happy.
> Furthermore, the LLM isn't doing things "in its head" - the headline feature of GPT LLMs is attention across all previous tokens, all of its "thoughts" are on paper
LOL, talk about special pleading. Whatever it takes to reshape the argument into one you can win, I guess...
LLMs don't reason.
Let's see you do that multiplication in your head. Then, when you fail, we'll conclude you don't reason. Sound fair?
I can do it with a scratch pad. And I can also tell you when the calculation exceeds what I can do in my head and when I need a scratch pad. I can also check a long multiplication answer in my head (casting 9s, last digit etc.) and tell if there’s a mistake.
The LLMs also have access to a scratch pad. And importantly don’t know when they need to use it (as in, they will sometimes get long multiplication right if you ask them to show their work but if you don’t ask them to they will almost certainly get it wrong).
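The two quick checks mentioned above (casting out nines and the last-digit check) are easy to write down; a small sketch, noting they can only catch errors, never confirm an answer:

    # Sanity checks for a claimed product a * b == claimed.
    def digit_root(n: int) -> int:
        return n % 9  # casting out nines: n mod 9 equals its digit sum mod 9

    def plausible_product(a: int, b: int, claimed: int) -> bool:
        nines_ok = (digit_root(a) * digit_root(b)) % 9 == digit_root(claimed)
        last_digit_ok = ((a % 10) * (b % 10)) % 10 == claimed % 10
        return nines_ok and last_digit_ok

    assert plausible_product(12345, 6789, 12345 * 6789)
    assert not plausible_product(12345, 6789, 12345 * 6789 + 1)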
The context is the scratch pad. LLMs have perfect recall (ignoring "lost in the middle") across the entire context, unlike humans. LLMs "think on paper."
Plenty of humans can't do arithmetic. Can they also not reason?
Reasoning isn't a binary switch. It's a multidimensional continuum. AI can clearly reason to some extent even if it also clearly doesn't reason in the same way that a human would.
> Plenty of humans can't do arithmetic. Can they also not reason?
I just pointed out that this isn't valid reasoning ... it's the fallacy of denying the antecedent. No one is arguing that because LLMs can't do arithmetic, therefore they can't reason. After all, zamalek said that he can't quickly multiply large numbers in his head, but he isn't saying that therefore he can't reason.
> Reasoning isn't a binary switch. It's a multidimensional continuum.
Indeed, and a lot of humans are very bad at it, as is clear from the comments I'm responding to.
> AI can clearly reason to some extent
The claim was about LLMs, not AI. This is like if someone said that chihuahuas are little and someone responded by saying that dogs are tall to some extent.
LLMs do not reason ... they do syntactic pattern matching. The appearance of reasoning is because of all the reasoning by humans that is implicit in the training data.
I've had this argument too many times ... it never goes anywhere. So I won't respond again ... over and out.
> Indeed, and a lot of humans are very bad at it, as is clear from the comments I'm responding to.
This is your idea of "conversing curiously" and "editing out swipes," I suppose.
> I've had this argument too many times ... it never goes anywhere. So I won't respond again ... over and out.
A real reasoning entity might pause for self-examination here. Maybe run its chain of thought for a few more iterations, or spend some tokens calling research tools. Just to probe the apparent mismatch between its own priors and those of "a lot of humans," most of whom are not, in fact, morons.
> you’re just abstracting it away into this new “systems” definition
when someone says LLMs today they obviously mean software that does more than just text; if you want to be extra pedantic, you can even say LLMs by themselves can't even generate text, since they are just model files if you don't add them to a “system” that makes use of those files, doh
> when someone says LLMs today they obviously mean ...
LLMs, if the someone is me or others who understand why it's important to be precise. And in this context, the distinction between LLM and AI mattered--not pedantic at all.
bunnie, your book "Hacking the XBox" taught me how to get started on reversing electronics, took the fear out of the process, and replaced it with fun. Thanks for the decades-long effort you've made to make these tools available and accessible and approachable. Your contributions to the hacker community are immeasurable and I cannot say thank you enough.
If code becomes essentially free (ignoring for a moment the environmental cost or the long-term cost of allowing code generation to be tollboothed by AI megacorps), the value of code must lie in its track record.
The 5-day-old code in chardet has little to no value. The battle-tested years-old code that was casually flushed away to make room for it had value.
What you describe is essentially what happened: the AI, working from specs and tests, produced a result that was more performant than the original. The real AI you describe just rewrote chardet without looking at the source, only better.
Is there any visibility or accountability recording exactly what it did and did not look at? I doubt it. So we're left with a kind of Rorschach test: some people think LLMs follow rules like law-abiding citizens, and some people distrust commercial LLMs because they understand that commercial LLMs were never designed for visibility and accountability.
There should exist a .jsonl file somewhere with exactly that information in it - it might be worth Dan preserving that; it should be in a ~/.claude/projects folder.
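If someone wants to check, a hypothetical sketch of scanning those transcripts for tool calls; the directory comes from the comment above, but the JSON field names are guesses, not a documented schema:

    # Hypothetical: the field names ("type", "name") are assumptions, not a documented schema.
    import json, pathlib

    for path in pathlib.Path.home().glob(".claude/projects/**/*.jsonl"):
        with open(path) as f:
            for line in f:
                if not line.strip():
                    continue
                event = json.loads(line)
                if event.get("type") == "tool_use":  # assumed event type
                    print(path.name, event.get("name"))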