Chatbots provided incorrect, conflicting medical advice, researchers found: “Despite all the hype, AI just isn’t ready to take on the role of the physician.”

“In an extreme case, two users sent very similar messages describing symptoms of a subarachnoid hemorrhage but were given opposite advice,” the study’s authors wrote. “One user was told to lie down in a dark room, and the other user was given the correct recommendation to seek emergency care.”

  • rumba@lemmy.zip · 4 hours ago

    Chatbots make terrible everything.

    But an LLM properly trained on sufficient patient data, metrics, and outcomes, in the hands of a decent doctor, can cut through bias, catch things that would otherwise fall through the cracks, and pack thousands of doctors’ worth of up-to-date CME into a tool that can look at a case and go, you know, you might want to check for X. The right model can be fucking clutch at pointing out nearly invisible abnormalities on an X-ray.

    You can’t ask an LLM trained on general bullshit to help you diagnose anything. You’ll end up with 32,000 Reddit posts’ worth of incompetence.
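
    To make that distinction concrete: “properly trained” means supervised fine-tuning on labeled scans and outcomes, not prompting a general-purpose chatbot. A rough sketch of the X-ray case in PyTorch (the xray_data/ folder and its labels are made up for illustration; DenseNet-121 is the architecture the CheXNet chest X-ray work popularized; this is a toy loop, not a clinical tool):

    ```python
    import torch
    import torch.nn as nn
    from torchvision import datasets, models, transforms

    # Standard ImageNet preprocessing; X-rays are grayscale, so replicate
    # the single channel to three for the pretrained backbone.
    preprocess = transforms.Compose([
        transforms.Grayscale(num_output_channels=3),
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])

    # Hypothetical folder layout: xray_data/train/<finding>/*.png
    train_set = datasets.ImageFolder("xray_data/train", transform=preprocess)
    loader = torch.utils.data.DataLoader(train_set, batch_size=16, shuffle=True)

    # Start from ImageNet weights, then swap the classifier head for one
    # output per radiographic finding in the dataset.
    model = models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)
    model.classifier = nn.Linear(model.classifier.in_features, len(train_set.classes))

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()

    # One toy epoch of supervised fine-tuning on the labeled scans.
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
    ```

    The point is that every weight update comes from scans with known findings, which is a completely different animal from an LLM that has only ever seen text scraped off the open web.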

    • Ricaz@lemmy.dbzer0.com · 30 minutes ago

      Just sharing my personal experience with this:

      I used Gemini multiple times and it worked great. I described some weird symptoms I have to Gemini, and it came up with a few possibilities, the most likely being “Superior Canal Dehiscence Syndrome”.

      My doctor had never heard of it, and only after I showed them the articles Gemini linked as sources would they even consider ordering a CT scan.

      Turns out Gemini was right.

    • cøre@leminal.space · 31 minutes ago

      They have to be specialized for a particular type of treatment or procedure, such as reading patient X-rays or other scans. Just slopping PHI into an LLM and expecting it to diagnose random patient issues is what produces the false diagnoses.

    • SuspciousCarrot78@lemmy.world · 55 minutes ago (edited)

      Agree.

      I’m sorta kicking myself that I didn’t sign up for Google’s Med-PaLM 2 when I had the chance. Last I checked, it scored 96% on the USMLE and 88% on radiology interpretation / report writing.

      I remember looking at the sign-up and seeing that it requested credit card details to verify identity (I didn’t have a Google account at the time). I bounced… but gotta admit, it might have been fun to play with.

      Oh well; one door closes, another opens.

      In any case, I believe this article confirms GIGO: the LLMs appear to have been vastly more accurate when clinicians fed them correct inputs than when laypeople did.