New research highlights strengths of large language models in uncovering social determinants of health while underscoring the need for human oversight and improved de-identification methods

ISB Study Highlights AI’s Potential and Pitfalls in Analyzing Health Data

Media contact:
Joe Myxter
Director of Communications, ISB
jmyxter@isbscience.org

Institute for Systems Biology (ISB) researchers have gained new insights into the strengths and limitations of using artificial intelligence (AI) to identify social determinants of health from electronic health records. Their peer-reviewed results were published on Wednesday.

The ISB team, collaborating with Providence, used large language models (LLMs) based on generative pre-trained transformers (GPT). Their research was conducted entirely within the secure Providence internal environment.

The study, which aimed to detect housing instability, drew on more than 25,000 clinical notes from 795 pregnant women. It evaluated two large language models (GPT-4 and GPT-3.5), a named entity recognition model, regular expressions, and human review.

This research goes beyond previous studies in two important ways. First, researchers measured how well AI can find housing challenges, distinguish between current and past housing instability, and provide direct evidence from clinical notes. Second, they measured whether AI performed differently if the notes had been de-identified.

GPT-4 was the most effective of the four technologies tested, and was better than humans at finding cases of housing instability (recall). Humans, however, were better at avoiding false positives, flagging housing instability only when it was actually present (precision). Humans were also better at providing correct evidence from a clinical note.
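For readers less familiar with the two metrics, the following is a minimal sketch of how recall and precision are computed from yes/no labels. The labels are invented for illustration and are not data from the study; the function name is likewise an assumption.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)       # flagged, and truly unstable
    false_pos = sum(1 for p, a in pairs if p and not a)  # flagged, but actually stable
    false_neg = sum(1 for p, a in pairs if not p and a)  # missed case
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Invented example: the reviewer finds every true case but raises one false alarm.
actual = [True, True, True, False, False, False]
predicted = [True, True, True, True, False, False]

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case recall is perfect because no true case is missed, while precision drops because one of the four flagged patients did not have housing instability.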

“These results show that LLMs present a scalable, cost-effective solution for an initial search for patients who may benefit from outreach,” said ISB Associate Professor Jennifer Hadlock, MD, corresponding author of the paper.

GPT-4 generally provided the same text that humans had selected to justify answers. Notably, no hallucinated comments appeared in the GPT-4 responses that were reviewed, most likely because the researchers designed the LLM instructions to request verbatim evidence from notes.
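Requesting verbatim evidence also makes the output checkable by machine. The sketch below shows one way such a check could look, assuming the evidence is returned as a plain string; the function names and the sample note are hypothetical and do not come from the study.

```python
import re


def _normalize(text: str) -> str:
    """Collapse whitespace and lowercase so line breaks do not cause false mismatches."""
    return re.sub(r"\s+", " ", text).strip().lower()


def is_verbatim(evidence: str, note: str) -> bool:
    """True if the quoted evidence appears word for word in the clinical note."""
    normalized_evidence = _normalize(evidence)
    return bool(normalized_evidence) and normalized_evidence in _normalize(note)


# Invented note text, for illustration only.
note = "Pt reports she was evicted last month and is\nstaying with a friend."

print(is_verbatim("evicted last month", note))   # quote is present in the note
print(is_verbatim("patient is homeless", note))  # paraphrase, not a quote
```

A check like this only confirms that the quoted text exists in the note. It cannot tell whether the model interpreted that text correctly.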

However, there were cases where the AI interpretation of note text was incorrect in ways that could be misleading. This is especially important because housing status can intersect with many other challenging or risky situations, such as domestic abuse.

“When a healthcare professional decides whether and how to reach out to offer help, they take great care to consider patient safety. Our results illustrate that it would still be essential to have a human read the actual text in the chart, not just the LLM summary,” Hadlock added.

Further, in a novel experiment, researchers showed that recall was worse when the models were run on de-identified versions of the same clinical notes. These notes had been de-identified with an automated technique called “hide in plain sight,” which replaces potentially sensitive information (such as names, locations and dates) with realistic but fictitious alternatives. The de-identification sometimes changed critical information enough to reduce the ability to accurately determine housing instability.
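The toy sketch below illustrates one hypothetical way surrogate replacement could remove a housing cue. It is not the de-identification tool used in the study, and the release does not describe the actual failure modes; the patterns, surrogate values, and note text are all invented.

```python
import re

# Hypothetical replacement rules: each pattern is swapped for a fixed fictitious value.
SURROGATES = {
    r"\b\d{1,2}/\d{1,2}/\d{4}\b": "03/14/2021",               # dates
    r"\b[A-Z][a-z]+ (?:Shelter|Mission)\b": "Maple Clinic",   # named locations
}


def deidentify(note: str) -> str:
    """Replace matched spans with realistic-looking surrogates."""
    for pattern, surrogate in SURROGATES.items():
        note = re.sub(pattern, surrogate, note)
    return note


def mentions_shelter(note: str) -> bool:
    """A deliberately simple housing cue detector."""
    return re.search(r"\bshelter\b", note, re.IGNORECASE) is not None


# Invented note text, for illustration only.
original = "Seen 01/05/2022. Currently staying at Harbor Shelter."
deidentified = deidentify(original)

print(deidentified)
print(mentions_shelter(original), mentions_shelter(deidentified))
```

Here the location name carried the only housing signal, so replacing it with an unrelated surrogate hides the case from any downstream detector.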

“This highlights the need to refine de-identification methods to preserve privacy without losing important details about social determinants of health,” said Alexandra Ralevski, PhD, lead author of the study.

About ISB

Institute for Systems Biology (ISB) is a collaborative and cross-disciplinary non-profit biomedical research organization based in Seattle. We focus on some of the most pressing issues in human health, including aging, brain health, cancer, chronic illness, infectious disease, and more. Our science is translational, and we champion sound scientific research that results in real-world clinical impacts. ISB is an affiliate of Providence, one of the largest not-for-profit healthcare systems in the United States. Follow us online at isbscience.org, and on YouTube, Facebook, LinkedIn, X, Bluesky and Instagram.

