Generative Artificial Intelligence Successfully Automates Data Extraction From Unstructured Magnetic Resonance Imaging Reports: Feasibility in Prostate Cancer Care.

Odisho AY; Liu AW; Pace WA; Carlisle MN; Krumm R; Cowan JE; Carroll PR; Cooperberg MR

doi:10.1200/CCI-24-00334

JCO Clinical Cancer Informatics, 2026, Vol. 10, e2400334

PMID 41499717

Abstract

[PURPOSE] Radiology reports are stored as plain text in most electronic health records, rendering the data computationally inaccessible. Large language models are powerful tools for analyzing unstructured text but are relatively untested in urologic oncology. We aimed to develop a pipeline that extracts data from plain-text prostate magnetic resonance imaging (MRI) reports using GPT-4.0 and to compare its accuracy against manually abstracted data.

[METHODS] We developed a data pipeline using a secure, enterprise-wide deployment of OpenAI's GPT-4.0 to automatically extract data elements from the text of prostate MRI reports. Identical prompts and reports were sent multiple times to determine response variability. We extracted 15 data elements per report and compared accuracy against a manually abstracted gold standard.
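The evaluation described above (repeated identical requests, then comparison against a gold standard) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `extract` callable stands in for whatever call reaches the GPT-4.0 deployment, the field names are hypothetical, and the variability definition used here (share of runs that differ from the most common answer) is an assumption, since the abstract does not define the metric.

```python
from collections import Counter
from typing import Callable, Dict, List

# One extraction result: field name -> extracted value (hypothetical schema).
Extraction = Dict[str, str]


def run_repeated(extract: Callable[[str], Extraction],
                 report_text: str,
                 n_runs: int = 5) -> List[Extraction]:
    """Send the same report through the extractor n_runs times."""
    return [extract(report_text) for _ in range(n_runs)]


def field_accuracy(predicted: List[Extraction],
                   gold: List[Extraction],
                   field: str) -> float:
    """Share of reports where the extracted value equals the gold-standard value."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and the same length")
    matches = sum(p.get(field) == g.get(field) for p, g in zip(predicted, gold))
    return matches / len(gold)


def field_variability(runs: List[Extraction], field: str) -> float:
    """Assumed metric: share of repeated runs that differ from the modal answer."""
    if not runs:
        raise ValueError("runs must be non-empty")
    values = [r.get(field) for r in runs]
    modal_count = Counter(values).most_common(1)[0][1]
    return 1 - modal_count / len(values)
```

Passing the extractor in as a callable keeps the scoring logic independent of any particular model client, so the same functions could score manual abstraction or a different model.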

[RESULTS] Across 424 prostate MRI reports, GPT-4.0 response accuracy was consistently above 95%. Individual field accuracies were 98.3% (96.3%-99.3%) for prostate-specific antigen density, 97.4% (95.4%-98.7%) for extracapsular extension, and 98.1% (96.3%-99.2%) for TNM stage. Across all fields, the median accuracy was 98.1% (96.3%-99.2%) and the mean was 97.2% (95.2%-98.3%); accuracy ranged from 99.8% (98.7%-100.0%) for number of suspicious lesions down to 87.7% (84.2%-90.7%) for identification of lesion location in the base of the prostate. Response variability over five repeated runs ranged from 0.14% to 3.61%, differed by the data element extracted (P < .001), and was inversely correlated with accuracy (P < .001). In disagreements between manually abstracted and GPT-4.0-extracted data, an additional reviewer more often deemed the GPT-4.0 responses correct.
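The parenthesized ranges that accompany each accuracy figure read like interval estimates for a proportion. As a rough illustration of how such an interval can be computed, the sketch below uses the Wilson score interval. The abstract does not say which interval method the authors used, so Wilson is an assumption here, and the count of 423 correct out of 424 is only inferred from the rounded 99.8% figure.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for roughly 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Assumed counts: 423 of 424 reports correct, inferred from the rounded 99.8%.
low, high = wilson_interval(423, 424)
print(f"accuracy {423 / 424:.1%}, interval {low:.1%} to {high:.1%}")
```

The Wilson interval stays inside [0, 1] and behaves sensibly near 100% accuracy, which is where most of the reported fields sit; an exact (Clopper-Pearson) interval would be another reasonable choice.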

[CONCLUSION] GPT-4.0 had high accuracy with low variability in extracting data points from prostate cancer MRI reports with low upfront programming requirements. This represents an effective tool to expedite medical data extraction for clinical and research use cases.
