Chatbots provided incorrect, conflicting medical advice, researchers found: “Despite all the hype, AI just isn’t ready to take on the role of the physician.”

“In an extreme case, two users sent very similar messages describing symptoms of a subarachnoid hemorrhage but were given opposite advice,” the study’s authors wrote. “One user was told to lie down in a dark room, and the other user was given the correct recommendation to seek emergency care.”

  • rumba@lemmy.zip · 4 hours ago

    Chatbots make terrible everything.

    But an LLM properly trained on sufficient patient data, metrics, and outcomes, in the hands of a decent doctor, can cut through bias, catch things that would otherwise fall through the cracks, and pack thousands of doctors’ worth of up-to-date CME into a tool that can look at a case and go, you know, you might want to check for X. The right model can be fucking clutch at pointing out nearly invisible abnormalities on an X-ray.

    You can’t ask an LLM trained on general bullshit to help you diagnose anything. You’ll end up with 32,000 Reddit posts’ worth of incompetence.
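
    To make that distinction concrete: “properly trained” means supervised fine-tuning on labeled scans and outcomes, not prompting a general-purpose chatbot. A rough sketch of the X-ray case in PyTorch (the xray_data/ folder and its labels are made up for illustration; DenseNet-121 is the architecture the CheXNet chest X-ray work popularized; this is a toy loop, not a clinical tool):

    ```python
    import torch
    import torch.nn as nn
    from torchvision import datasets, models, transforms

    # Standard ImageNet preprocessing; X-rays are grayscale, so replicate
    # the single channel to three for the pretrained backbone.
    preprocess = transforms.Compose([
        transforms.Grayscale(num_output_channels=3),
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])

    # Hypothetical folder layout: xray_data/train/<finding>/*.png
    train_set = datasets.ImageFolder("xray_data/train", transform=preprocess)
    loader = torch.utils.data.DataLoader(train_set, batch_size=16, shuffle=True)

    # Start from ImageNet weights, then swap the classifier head for one
    # output per radiographic finding in the dataset.
    model = models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)
    model.classifier = nn.Linear(model.classifier.in_features, len(train_set.classes))

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()

    # One toy epoch of supervised fine-tuning on the labeled scans.
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
    ```

    The point is that every weight update comes from scans with known findings, which is a completely different animal from an LLM that has only ever seen text scraped off the open web.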

    • Ricaz@lemmy.dbzer0.com · 30 minutes ago

      Just sharing my personal experience with this:

      I used Gemini multiple times and it worked great. I described some weird symptoms I have to Gemini, and it came up with a few possibilities, the most likely being “Superior Canal Dehiscence Syndrome”.

      My doctor had never heard of it, and only after I showed them the articles Gemini linked as sources would they even consider ordering a CT scan.

      Turns out Gemini was right.

    • cøre@leminal.space · 31 minutes ago

      They have to be specialized for a particular type of treatment or procedure, such as reading patient X-rays or other scans. Just slopping PHI into an LLM and expecting it to diagnose random patient issues is what produces the false diagnoses.

    • SuspciousCarrot78@lemmy.world · 55 minutes ago (edited)

      Agree.

      I’m sorta kicking myself that I didn’t sign up for Google’s Med-PaLM 2 when I had the chance. Last I checked, it scored 96% on the USMLE and 88% on radiology interpretation / report writing.

      I remember looking at the sign-up and seeing that it requested credit card details to verify identity (I didn’t have a Google account at the time). I bounced… but gotta admit, it might have been fun to play with.

      Oh well; one door closes, another opens.

      In any case, I believe this article confirms GIGO: the LLMs appear to have been vastly more accurate when clinicians fed them correct inputs than when laypeople did.