Doctors 'vastly outperform' symptom checker apps

"Doctors correctly diagnose illness 'twice as often as online symptom checkers'," The Sun reports.

A US study ran a head-to-head comparison between doctors and a series of symptom checkers using what are known as clinical vignettes.

Clinical vignettes have been used for many years to help hone trainee doctors' diagnostic skills. They are essentially diagnostic puzzles based on real-life case reports designed to test training and clinical knowledge.

The researchers provided 45 clinical vignettes to more than 200 doctors. They found doctors were about twice as likely as online symptom-checking applications to list the correct diagnosis first.

But these findings are not entirely reliable – vignettes can never fully replicate the real-life diagnosis of patients. And many of the doctors involved were still in training posts.

It's often the case in the field of artificial intelligence that tasks computers find incredibly easy – like multiplying 30-digit prime numbers – humans find incredibly hard.

But the reverse is also true – tasks that are second nature to us, like understanding jokes, computers just cannot do.

It may be that diagnosis relies in part on intuition, and not just on an algorithmic approach to processing information.

That said, artificial intelligence has a great deal to offer medicine. For example, Google is working with the NHS to come up with software that can quickly and accurately scan radiotherapy images.

Applications may well become a diagnostic tool for doctors, rather than a replacement for them.

Where did the story come from?

The study was carried out by researchers from Harvard Medical School. No source of funding was reported in the paper.

It was published in the peer-reviewed JAMA Internal Medicine.

Symptom checkers are websites and apps that help patients with self-diagnosis. As these are becoming more popular, it is important that they are investigated thoroughly and the findings made public.

The media presented the facts of the study well, reporting the main findings accurately, although there was no discussion about the research's limitations.

What kind of research was this?

This comparative study aimed to assess the diagnostic accuracy of doctors and computer algorithms known as symptom checkers.

This type of study is a useful way of drawing comparisons and highlighting areas for further research.

However, the small sample of scenarios assessed here cannot be representative of all the different combinations of signs and symptoms patients may have.

What did the research involve?

The researchers compared the diagnostic accuracy of online symptom checkers with that of doctors.

The study used 45 vignettes, covering 26 common and 19 uncommon conditions.

The 234 physicians involved were hospital doctors specialising in general medicine, rather than other specialities such as surgery or paediatrics. They were asked to rank diagnoses for each case. Each vignette was solved by at least 20 physicians.

The responses were reviewed by another two doctors, who independently decided whether the diagnosis was correct or in the top three diagnoses. Discrepancies were resolved by a third member of the research team.
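
As a rough illustration of this review step, the Python sketch below shows how two independent judgements might be combined, with a third reviewer consulted only when the first two disagree. The label names and the function are assumptions made for this example, not the researchers' own scoring code.

```python
from typing import Callable

# Hypothetical labels for one doctor's ranked answer to one vignette.
LABELS = {"first", "top_three", "incorrect"}


def adjudicate(reviewer_a: str, reviewer_b: str,
               third_reviewer: Callable[[], str]) -> str:
    """Return the agreed label, asking a third reviewer only on disagreement."""
    for label in (reviewer_a, reviewer_b):
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
    if reviewer_a == reviewer_b:
        return reviewer_a
    # The two independent reviews differ, so the third reviewer decides.
    decision = third_reviewer()
    if decision not in LABELS:
        raise ValueError(f"unknown label: {decision}")
    return decision


# The reviewers disagree here, so the third reviewer's call is used.
print(adjudicate("first", "top_three", lambda: "top_three"))
```

Passing the third reviewer as a function means that reviewer is only consulted when needed, which mirrors the description of discrepancies being resolved after the two independent reviews.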

For each vignette, the doctors' accuracy was compared with the symptom checkers' accuracy.

What were the basic results?

The study found that physicians listed the correct diagnosis first more often than symptom checkers across all vignettes (72.1% vs 34.0%). They were also more likely to include the correct diagnosis among their top three (84.3% vs 51.2%).
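
To see how these percentages relate to the "twice as often" headline, the short Python sketch below works out the ratio and the percentage-point gap for each pair of figures. Only the four percentages come from the study; the function name and the rounding are choices made for this example.

```python
def compare(doctors_pct: float, checkers_pct: float) -> tuple[float, float]:
    """Return (ratio, percentage-point gap) between two accuracy rates."""
    ratio = doctors_pct / checkers_pct
    gap = doctors_pct - checkers_pct
    return round(ratio, 2), round(gap, 1)


# Correct diagnosis listed first: doctors 72.1%, symptom checkers 34.0%.
first_listed = compare(72.1, 34.0)
# Correct diagnosis within the top three: doctors 84.3%, symptom checkers 51.2%.
top_three = compare(84.3, 51.2)

print(first_listed)  # (2.12, 38.1)
print(top_three)     # (1.65, 33.1)
```

On these figures doctors were about 2.1 times as likely to list the correct diagnosis first, but only about 1.6 times as likely to have it in their top three.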

Doctors were more likely to give the correct diagnosis across all severities of presentation, as well as for common and uncommon presentations.

How did the researchers interpret the results?

The researchers concluded that: "In what we believe to be the first direct comparison of diagnostic accuracy, physicians vastly outperformed computer algorithms in diagnostic accuracy (84.3% vs 51.2% correct diagnosis in the top three listed).

"Despite physicians' superior performance, they provided the incorrect diagnosis in about 15% of cases, similar to prior estimates (10%-15%) for physician diagnostic error."

They went on to say: "While in this project we compared diagnostic performance, future work should test whether computer algorithms can augment physician diagnostic accuracy."

Conclusion

This study aimed to assess the diagnostic accuracy of online symptom checkers versus the accuracy of doctors.

The researchers found doctors were much more likely to accurately diagnose a condition than symptom checkers.

However, this research did have some limitations:

Clinical vignettes were used for diagnosis instead of real patients, and the vignettes did not include physical examination or test results.
Doctors involved in this study may not be representative of all doctors. The study only included doctors practising hospital medicine, rather than across the range of medical and surgical specialities. Many doctors were also still in training posts. Diagnostic accuracy may differ between individual doctors and between levels of qualification.
Symptom checkers are only one form of computer diagnostic tool, and other tools may perform better.
The 45 vignettes assessed are only a small fraction of all possible sign and symptom combinations that adults or children may present with.

That said, computer programs could help to reduce diagnostic error, as long as the symptom checkers are accurate.

This research highlights the need for future work to improve the performance of these programs.

It will probably be many years until an application becomes sophisticated enough to replace your GP, but these types of applications could one day be a useful tool in a doctor's (virtual) kitbag.