A vision-language model-based approach for lung cancer diagnosis using lossless 3D CT images: evaluation of GPT-4.1 and GPT-4o for patient-level malignancy assessment.

Health Information Science and Systems, 2026, Vol. 14(1), p. 16. Open access.
OpenAlex topics: Lung Cancer Diagnosis and Treatment; Radiomics and Machine Learning in Medical Imaging; AI in cancer detection

Shi N, Liu Z, Wan Z, Yang G, Shi Y, Wang P

Cite this article

APA: Ning Shi, Zhenpeng Liu, et al. (2026). A vision-language model-based approach for lung cancer diagnosis using lossless 3D CT images: evaluation of GPT-4.1 and GPT-4o for patient-level malignancy assessment. Health Information Science and Systems, 14(1), 16. https://doi.org/10.1007/s13755-025-00417-8
MLA: Ning Shi, et al. "A vision-language model-based approach for lung cancer diagnosis using lossless 3D CT images: evaluation of GPT-4.1 and GPT-4o for patient-level malignancy assessment." Health Information Science and Systems, vol. 14, no. 1, 2026, p. 16.
PMID: 41439200

Abstract

[PURPOSE] Large vision-language models (VLMs) such as GPT-4.1 and GPT-4o have shown strong potential in medical tasks, but their application to lossless 3D medical image analysis remains underexplored. This study proposes a GPT-based diagnostic approach that preserves voxel-level fidelity during data ingestion, structures multi-slice visual inputs for model interpretation, and integrates consensus guidelines to align predictions with clinical standards. The approach may thereby provide interpretable, guideline-consistent decision support, even for less experienced clinicians.

[METHODS] We designed an approach that processes 3D chest CT scans directly in NIfTI format, preserves full voxel fidelity during data import and analysis, and is compatible with GPT-based workflows. For each lung nodule, we guided GPT through multi-slice visual inputs, including bounding annotations, segmentation overlays, and cropped views. Guidelines (Fleischner, BTS, ACCP) were embedded in the prompts to promote standardized interpretation and to guide reasoning from nodule-level characteristics to patient-level assessment. The method was tested on the LIDC-IDRI dataset under three NIfTI-based input settings: (1) nodule coordinates only; (2) coordinates with guideline-based prompting; (3) segmentation overlays with guideline-based prompting. To assess generalization, we performed external validation on the Lung Nodule Database (LNDb).
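
To make the "cropped views" step concrete, the following is a minimal sketch, not the authors' implementation, of extracting a multi-slice crop around a nodule center from a 3D volume without resampling or intensity rescaling. The function name `crop_nodule_slices`, its parameters, and the (z, y, x) axis order are hypothetical choices made for illustration, and a synthetic array stands in for a CT volume that would in practice be read from a NIfTI file.

```python
import numpy as np

def crop_nodule_slices(volume, center, half_size=16, n_slices=3):
    """Return axial crops around a nodule center given as (z, y, x).

    Hypothetical helper: the abstract does not specify crop sizes or axis order.
    """
    z, y, x = center
    depth, height, width = volume.shape
    # Clamp the slice range and the in-plane window to the volume bounds.
    z_lo = max(z - n_slices // 2, 0)
    z_hi = min(z + n_slices // 2 + 1, depth)
    y_lo, y_hi = max(y - half_size, 0), min(y + half_size, height)
    x_lo, x_hi = max(x - half_size, 0), min(x + half_size, width)
    # Plain slicing copies raw voxel values: dtype and intensities are unchanged.
    return volume[z_lo:z_hi, y_lo:y_hi, x_lo:x_hi].copy()

# Synthetic stand-in for a CT volume in Hounsfield units (assumption, not real data).
rng = np.random.default_rng(0)
volume = rng.integers(-1000, 400, size=(20, 64, 64), dtype=np.int16)

crops = crop_nodule_slices(volume, center=(10, 30, 30), half_size=8, n_slices=3)
print(crops.shape, crops.dtype)
```

Because the crop is taken by array slicing alone, every voxel in the output equals the corresponding voxel in the input, which is one simple reading of "lossless" in this setting.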

[RESULTS] GPT-4.1 outperformed GPT-4o overall, especially with full-context input, while GPT-4o had higher sensitivity with limited input. With segmentation and guideline-based prompting, GPT-4.1 achieved an accuracy of 0.722 and an AUC of 0.780 on the LIDC-IDRI dataset. In external validation on the LNDb dataset, it reached an accuracy of 0.767 and an AUC of 0.780. GPT-4.1 remained competitive with representative deep-learning baselines and radiologist readers. It also provided stronger interpretability through guideline-grounded, patient-level reasoning with explicit textual justifications.
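
For readers less familiar with the reported metrics, here is a small sketch of how patient-level accuracy, sensitivity, and AUC can be computed from binary labels and malignancy scores. The labels and scores below are toy values, not data from the study, and the assumption that each patient receives a single numeric malignancy score with a 0.5 decision threshold is ours; the abstract does not describe how scores were derived from the model output.

```python
def accuracy(labels, preds):
    """Fraction of patients whose predicted class matches the label."""
    return sum(int(l == p) for l, p in zip(labels, preds)) / len(labels)

def sensitivity(labels, preds):
    """Fraction of malignant patients (label 1) predicted as malignant."""
    positives = [p for l, p in zip(labels, preds) if l == 1]
    return sum(positives) / len(positives)

def auc(labels, scores):
    """AUC as the probability that a malignant case outscores a benign one.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy patient-level data (illustrative only).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
preds = [int(s >= 0.5) for s in scores]  # assumed 0.5 decision threshold

print(accuracy(labels, preds), sensitivity(labels, preds), auc(labels, scores))
```

Accuracy and sensitivity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can have higher sensitivity yet a lower AUC than another.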

[CONCLUSIONS] This study presents a clinically aligned and interpretable approach to GPT-based lung cancer diagnosis using lossless 3D CT images. The results demonstrate the potential of combining large vision-language models with structured visual and guideline-based context in real-world diagnostic workflows.

[SUPPLEMENTARY INFORMATION] The online version contains supplementary material available at 10.1007/s13755-025-00417-8.
