
A process is described here: https://arxiv.org/pdf/2506.22405

>A physician or AI begins with a short case abstract and must iteratively request additional details from a gatekeeper model that reveals findings only when explicitly queried. Performance is assessed not just by diagnostic accuracy but also by the cost of physician visits and tests performed.
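
To make the setup concrete, here is a minimal sketch of that gatekeeper loop in Python. The names (gatekeeper, diagnostician, run_case), the cost figures, and the stub logic are my own assumptions for illustration, not the paper's actual benchmark code; the point is just the shape of the interaction: findings are revealed only when explicitly requested, and the run is scored on both the final diagnosis and the accumulated cost.

    # Illustrative sketch only; function names, costs, and stub logic are
    # assumptions, not the paper's implementation.

    COSTS = {"visit": 300, "test": 150}  # hypothetical per-action costs (USD)

    def gatekeeper(case_findings, query):
        """Reveal a finding only when it is explicitly requested."""
        return case_findings.get(query, "Not available / not performed.")

    def diagnostician(transcript, pending_questions):
        """Stand-in for the AI or physician being evaluated."""
        if pending_questions:
            return {"type": "test", "query": pending_questions.pop(0)}
        return {"type": "diagnose", "diagnosis": "community-acquired pneumonia"}

    def run_case(abstract, findings, questions, max_turns=10):
        transcript = [f"Case abstract: {abstract}"]
        total_cost = COSTS["visit"]  # every case starts with a physician visit
        for _ in range(max_turns):
            action = diagnostician(transcript, questions)
            if action["type"] == "diagnose":
                # Scored on diagnostic accuracy AND total cost incurred.
                return action["diagnosis"], total_cost
            finding = gatekeeper(findings, action["query"])
            transcript.append(f"{action['query']}: {finding}")
            total_cost += COSTS["test"]
        raise RuntimeError("diagnostician never committed to a diagnosis")

    if __name__ == "__main__":
        diagnosis, cost = run_case(
            abstract="58-year-old with fever and productive cough",
            findings={"chest x-ray": "right lower lobe consolidation"},
            questions=["chest x-ray"],
        )
        print(diagnosis, cost)  # e.g. ('community-acquired pneumonia', 450)

In the real benchmark the diagnostician and gatekeeper are both language models (or a human physician on the diagnostician side), but the scoring idea is the same: fewer questions and tests mean lower cost, so ordering every test is penalized even if it eventually yields the right answer.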



I believe that dataset was built off of cases that were selected for being unusual enough for physicians to submit to the New England Journal of Medicine. The real-world diagnostic accuracy of physicians in these cases was 100% - the hospital figured out a diagnosis and wrote it up. In the real world these cases are solved by a team of human doctors working together, consulting with different specialists. Comparing the model's results to the results of a single human physician - particularly when all the irrelevant details have been stripped away and you're just left with the clean case report - isn't really reflective of how medicine works in practice. They're also not the kind of situations that you as a patient are likely to experience, and your doctor probably sees them rarely if ever.


Either way, the AI model performed better than the humans on average, so it would be reasonable to infer that AI would be a net positive collaborator in a team of internists.


Okay, you have a point. AI probably would do really well when short case abstracts start walking into clinics.


How else would a study scientifically determine the accuracy of an AI model in diagnosis? By testing it on real people before they know how good it is?


Why not? Have the AI do it, then have a human doctor do a follow-up/review? I might not be a fan of this for urgent care, but for general visits I wouldn't mind spending a bit of extra time if the AI was followed by an expert exam.



