A study published in The New England Journal of Medicine has demonstrated that an AI diagnostic system can match or beat seasoned physicians at identifying rare diseases. Not marginally. Significantly. The research, reported by The Next Web, centers on a system built around large language models that was tested against clinical geneticists — doctors who specialize in precisely this kind of difficult diagnostic work.
The implications are hard to overstate for a medical field where patients routinely wait years for answers.
Rare diseases affect roughly 300 million people worldwide, according to the World Health Organization. Despite that staggering number, the “rare” label applies because each individual condition affects a small fraction of the population, making any single diagnosis extraordinarily difficult. Patients often endure what’s known as a “diagnostic odyssey” — bouncing between specialists for an average of five to seven years before receiving a correct diagnosis, if they ever do. Many don’t. The AI system tested in this study was designed to short-circuit that painful process by analyzing patient symptoms, genetic data, and clinical notes to generate ranked lists of possible diagnoses.
So how did it actually perform? The researchers pitted the AI against a group of clinical geneticists, presenting both with identical case information drawn from real patient records. The AI system correctly identified the diagnosis more frequently than the physicians did, and it did so faster. It also ranked the correct diagnosis higher on its differential list — meaning when it was right, it was confidently right, not just guessing in the neighborhood.
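Ranked differential lists like the ones described here are typically scored with metrics such as top-k accuracy (was the true diagnosis in the top k candidates?) and mean reciprocal rank (how high did it land?). The study doesn't publish its exact metric definitions in this article, so the following is a minimal sketch with entirely hypothetical case data, not the researchers' own evaluation code:

```python
# Sketch of common metrics for scoring ranked differential diagnoses.
# Case data below is hypothetical, for illustration only.

def top_k_accuracy(ranked_lists, truths, k):
    """Fraction of cases where the true diagnosis appears in the top k."""
    hits = sum(1 for ranks, truth in zip(ranked_lists, truths)
               if truth in ranks[:k])
    return hits / len(truths)

def mean_reciprocal_rank(ranked_lists, truths):
    """Average of 1/rank of the true diagnosis (0 if it never appears)."""
    total = 0.0
    for ranks, truth in zip(ranked_lists, truths):
        total += 1 / (ranks.index(truth) + 1) if truth in ranks else 0.0
    return total / len(truths)

# Three hypothetical cases, each with a model's ranked differential list.
differentials = [
    ["Fabry disease", "Gaucher disease", "Pompe disease"],
    ["Marfan syndrome", "Ehlers-Danlos syndrome"],
    ["Wilson disease", "Hemochromatosis", "Alpha-1 antitrypsin deficiency"],
]
truths = ["Fabry disease", "Ehlers-Danlos syndrome",
          "Alpha-1 antitrypsin deficiency"]

print(top_k_accuracy(differentials, truths, k=1))   # 1 of 3 right at rank 1
print(mean_reciprocal_rank(differentials, truths))  # (1 + 1/2 + 1/3) / 3
```

"Ranked the correct diagnosis higher" corresponds to a higher mean reciprocal rank: a system that puts the answer first scores 1 for that case, while one that buries it at rank 5 scores only 0.2.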
That’s a striking result. Clinical geneticists aren’t generalists fumbling through unfamiliar territory. They’re among the most specialized diagnosticians in medicine, trained specifically to recognize patterns in rare and genetic conditions. For an AI to outperform them on their home turf sends a clear signal about where diagnostic medicine is heading.
But context matters here. The system wasn’t operating in a vacuum or replacing doctors in a clinical setting. It was tested under controlled research conditions with structured patient data. Real-world medicine is messier — incomplete records, patients who struggle to articulate symptoms, comorbidities that muddy the picture. The study demonstrates capability, not deployment readiness. And the researchers themselves have been careful to frame the AI as a tool to assist clinicians, not supplant them.
Still, the performance gap was notable enough to warrant serious attention from hospital systems and health tech companies alike. The study adds to a growing body of evidence that LLM-based systems, when fine-tuned on medical data and paired with structured clinical inputs, can perform diagnostic reasoning at expert level. Google’s Med-PaLM 2 showed similar promise in earlier research, scoring at an “expert” level on medical licensing exam questions. Research published in Nature has documented how these models can synthesize vast amounts of medical literature in ways no individual physician could replicate from memory alone.
The rare disease space is particularly ripe for AI-assisted diagnosis because the core challenge is pattern recognition across an impossibly large search space. There are over 7,000 known rare diseases. No single doctor can hold all of them in working memory. An AI trained on comprehensive datasets can. That asymmetry — human cognitive limits versus machine-scale pattern matching — is exactly where these systems shine.
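That asymmetry is easy to see in miniature: a machine can score one patient's phenotype terms against every disease in a catalog simultaneously, something no clinician can do from memory across 7,000 conditions. The sketch below uses a toy three-disease catalog and simple set overlap; real systems use rich ontologies (such as HPO terms) and learned models, so treat every name and score here as hypothetical:

```python
# Illustrative sketch of machine-scale pattern matching for diagnosis:
# score a patient's phenotype terms against an entire disease catalog.
# Diseases, terms, and the overlap metric are all toy examples.

def jaccard(a, b):
    """Set-overlap similarity: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Hypothetical catalog mapping diseases to characteristic phenotype terms.
disease_phenotypes = {
    "Disease A": {"angiokeratoma", "neuropathic pain", "corneal opacity"},
    "Disease B": {"hepatomegaly", "bone pain", "anemia"},
    "Disease C": {"neuropathic pain", "hypohidrosis", "corneal opacity"},
}

patient = {"neuropathic pain", "corneal opacity", "hypohidrosis"}

# Rank every disease in the catalog by overlap with the patient's terms.
ranked = sorted(disease_phenotypes,
                key=lambda d: jaccard(patient, disease_phenotypes[d]),
                reverse=True)
print(ranked)  # "Disease C" first: it shares all three patient terms
```

The same loop that ranks three diseases here ranks 7,000 just as easily; the search-space size that defeats human working memory is, for the machine, a constant-factor detail.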
And there’s a health equity angle that shouldn’t be ignored. Patients in rural areas or developing countries often lack access to clinical geneticists entirely. An AI diagnostic tool, deployed through telemedicine platforms or integrated into electronic health records, could democratize access to expert-level rare disease diagnosis. The technology doesn’t need to be perfect to be transformative; it just needs to be better than the alternative, which in many regions is no specialist access at all.
The business signal is equally clear. Companies like Phenomics Health and others working at the intersection of genomics and AI stand to benefit as health systems look for ways to reduce diagnostic delays and associated costs. Rare disease patients are among the most expensive in healthcare — not because their treatments are always costly, but because the years of misdiagnosis and unnecessary testing that precede a correct diagnosis generate enormous waste.
There are legitimate concerns. Bias in training data could lead to worse outcomes for underrepresented populations. Liability questions around AI-assisted diagnosis remain unresolved in most jurisdictions. And clinician trust is a real barrier — doctors won’t adopt tools they don’t understand or can’t verify.
But the direction of travel is unmistakable. This study isn’t an outlier. It’s part of a pattern. AI diagnostic systems are getting better, the evidence base is growing, and the clinical need — especially in rare diseases — is acute. The question for health systems and regulators isn’t whether these tools will enter clinical practice. It’s how fast, and under what guardrails.
For the millions of patients still waiting for a diagnosis, faster can’t come soon enough.
An AI System Just Outperformed Experienced Doctors at Diagnosing Rare Diseases — Here’s What That Means first appeared on Web and IT News.