Artificial intelligence for immunotherapy response assessment in lung cancer using PET/CT reports.

Japanese journal of radiology 2025 Vol.43(12) p. 2042-2050

Ismayilov R, Altundag O, Gencoglu EA, Aktas A, Alparslan S, Ozcicek A, Turhanoglu D, Oguz A, Farzaliyeva A, Ramazanoglu MN, Kocak M, Akcali Z

Cite this paper

APA Ismayilov R, Altundag O, et al. (2025). Artificial intelligence for immunotherapy response assessment in lung cancer using PET/CT reports. Japanese journal of radiology, 43(12), 2042-2050. https://doi.org/10.1007/s11604-025-01840-3
MLA Ismayilov R, et al. "Artificial intelligence for immunotherapy response assessment in lung cancer using PET/CT reports." Japanese journal of radiology, vol. 43, no. 12, 2025, pp. 2042-2050.
PMID 41091339

Abstract

[BACKGROUND] Accurate and timely assessment of immunotherapy response is vital for optimizing lung cancer management. This study evaluates the efficacy of large language models (LLMs) in automating response assessment using positron emission tomography/computed tomography (PET/CT) reports based on the European Organization for Research and Treatment of Cancer (EORTC) criteria.

[METHODS] An effective prompting strategy was developed using Google Gemini 2.5 Pro Experimental 03-25, with explicit instructions for applying EORTC criteria via few-shot prompting. This prompt was then tested with both Gemini 2.5 Pro and OpenAI ChatGPT 4o to assess cross-model performance. Pre- and post-immunotherapy PET/CT reports in text format from 36 lung cancer patients were independently classified by the LLMs and an experienced nuclear medicine specialist. Performance metrics, including precision, recall, F1-score, and support, were calculated for each response category. Inter-rater agreement was assessed using Cohen's Kappa.
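
For reference, the metrics named in the methods have the following standard definitions. The abstract itself does not spell these out, so the notation is an assumption: for a response category c, TP_c, FP_c, and FN_c count the LLM's true positives, false positives, and false negatives against the specialist's label, n_c^expert and n_c^LLM are the number of cases each rater assigned to c, and N is the total number of cases.

```latex
\begin{align}
\mathrm{Precision}_c &= \frac{TP_c}{TP_c + FP_c}, &
\mathrm{Recall}_c &= \frac{TP_c}{TP_c + FN_c}, \\
F_{1,c} &= \frac{2\,\mathrm{Precision}_c\,\mathrm{Recall}_c}
                {\mathrm{Precision}_c + \mathrm{Recall}_c}, &
\mathrm{Support}_c &= TP_c + FN_c, \\
\kappa &= \frac{p_o - p_e}{1 - p_e}, &
p_e &= \sum_c \frac{n_c^{\mathrm{expert}}\, n_c^{\mathrm{LLM}}}{N^2},
\end{align}
```

Here p_o is the observed proportion of cases on which the two raters agree, and p_e is the agreement expected by chance from their marginal label frequencies.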

[RESULTS] The nuclear medicine specialist classified 5, 21, 6, and 4 reports as complete metabolic response (CMR), progressive metabolic disease (PMD), partial metabolic response (PMR), and stable metabolic disease (SMD), respectively, while Gemini 2.5 Pro classified 4, 21, 8, and 3 of them. Gemini achieved an overall accuracy of 94% and demonstrated strong agreement with the expert (overall Cohen's Kappa: 0.907). F1-scores were 0.86 for PMR and SMD, 0.89 for CMR, and 1.00 for PMD, with per-label Kappa scores ranging from 0.824 (PMR) to 1.00 (PMD). In comparison, ChatGPT 4o achieved perfect agreement with the expert across all 36 cases (accuracy = 100%, Cohen's Kappa = 1.000).
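
The Gemini figures above can be cross-checked with a short sketch. The abstract does not report case-level labels, so the two label lists below are a reconstruction and not the study data: they assume the two discordant cases were one expert-labelled CMR and one expert-labelled SMD, both classified as PMR by Gemini, which appears to be the only pattern consistent with the reported counts, accuracy, and F1-scores. Function and variable names are illustrative.

```python
from collections import Counter

LABELS = ["CMR", "PMD", "PMR", "SMD"]

# Reconstructed (assumed) labels, ordered so that position i is the same case.
expert = ["CMR"] * 5 + ["PMD"] * 21 + ["PMR"] * 6 + ["SMD"] * 4
gemini = (
    ["CMR"] * 4 + ["PMR"]      # assumed: one CMR case read as PMR
    + ["PMD"] * 21
    + ["PMR"] * 6
    + ["SMD"] * 3 + ["PMR"]    # assumed: one SMD case read as PMR
)


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / n**2
    return (observed - expected) / (1 - expected)


def per_label_metrics(truth, pred, label):
    """Return (precision, recall, F1, support) for one category."""
    tp = sum(t == label and p == label for t, p in zip(truth, pred))
    fp = sum(t != label and p == label for t, p in zip(truth, pred))
    fn = sum(t == label and p != label for t, p in zip(truth, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1, tp + fn


accuracy = sum(e == g for e, g in zip(expert, gemini)) / len(expert)
print(f"accuracy={accuracy:.3f}  kappa={cohen_kappa(expert, gemini):.3f}")

for label in LABELS:
    p, r, f1, support = per_label_metrics(expert, gemini, label)
    # Per-label kappa: treat the task as one-vs-rest for this category.
    k = cohen_kappa([e == label for e in expert], [g == label for g in gemini])
    print(f"{label}: precision={p:.2f} recall={r:.2f} F1={f1:.2f} "
          f"support={support} kappa={k:.3f}")
```

Under this reconstruction the script gives an accuracy of 34/36 (about 94%), an overall kappa of 0.907, and a one-vs-rest kappa of 0.824 for PMR, matching the values reported above.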

[CONCLUSIONS] When guided by a structured and task-specific prompt, both Gemini 2.5 Pro and ChatGPT 4o demonstrated strong capability for automating accurate immunotherapy response assessment in lung cancer using PET/CT reports. These results underscore the potential of LLMs to streamline clinical workflows and improve efficiency. Validation with larger data sets is warranted to support clinical implementation.

MeSH Terms

Humans; Positron Emission Tomography Computed Tomography; Lung Neoplasms; Immunotherapy; Artificial Intelligence; Male; Female; Aged; Middle Aged; Treatment Outcome
