
Vision-language model-based semantic-guided imaging biomarker for lung nodule malignancy prediction.

Journal of Biomedical Informatics, 2025, Vol. 172, p. 104947

Authors

Zhuang L, Tabatabaei SMH, Salehi-Rad R, Tran LM, Aberle DR, Prosper AE


Cite this article

APA: Zhuang L, Tabatabaei SMH, et al. (2025). Vision-language model-based semantic-guided imaging biomarker for lung nodule malignancy prediction. Journal of Biomedical Informatics, 172, 104947. https://doi.org/10.1016/j.jbi.2025.104947
MLA: Zhuang L, et al. "Vision-language model-based semantic-guided imaging biomarker for lung nodule malignancy prediction." Journal of Biomedical Informatics, vol. 172, 2025, p. 104947.
PMID: 41161557

Abstract

[OBJECTIVE] Machine learning models have utilized semantic features, deep features, or both to assess lung nodule malignancy. However, their reliance on manual annotation during inference, limited interpretability, and sensitivity to imaging variations hinder their application in real-world clinical settings. Thus, this research aims to integrate semantic features derived from radiologists' assessments of nodules, guiding the model to learn clinically relevant, robust, and explainable imaging features for predicting lung cancer.

[METHODS] We obtained 938 low-dose CT scans from the National Lung Screening Trial (NLST) with 1,261 nodules and semantic features. We also used the Lung Image Database Consortium (LIDC) dataset, which contains 1,018 CT scans with 2,625 lesions annotated for nodule characteristics. Three external datasets were obtained from UCLA Health, the LUNGx Challenge, and the Duke Lung Cancer Screening study. For imaging input, we obtained 2D nodule slices in nine directions from a 50 × 50 × 50 mm nodule crop. We converted structured semantic features into sentences using Gemini. We fine-tuned a pretrained Contrastive Language-Image Pretraining (CLIP) model with a parameter-efficient fine-tuning approach to align imaging and semantic text features and predict the one-year lung cancer diagnosis.
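The nine-direction slicing step can be illustrated with a small sketch. A common 2.5D scheme (used here as an assumption; the paper's exact nine directions may differ, and the helper `nine_view_slices` is hypothetical) takes the three orthogonal central planes of the cubic crop plus three main-diagonal and three anti-diagonal planes:

```python
import numpy as np

def nine_view_slices(vol):
    """Extract nine 2D views from a cubic nodule crop.

    3 orthogonal central planes + 3 main-diagonal planes +
    3 anti-diagonal planes. This is a common 2.5D sampling
    scheme, sketched as an assumption about the paper's setup.
    """
    n = vol.shape[0]
    c = n // 2
    views = [
        vol[c, :, :],  # axial central plane
        vol[:, c, :],  # coronal central plane
        vol[:, :, c],  # sagittal central plane
        # main-diagonal planes across each pair of axes
        np.diagonal(vol, axis1=0, axis2=1),
        np.diagonal(vol, axis1=0, axis2=2),
        np.diagonal(vol, axis1=1, axis2=2),
        # anti-diagonal planes: flip one axis, then take the diagonal
        np.diagonal(np.flip(vol, axis=0), axis1=0, axis2=1),
        np.diagonal(np.flip(vol, axis=0), axis1=0, axis2=2),
        np.diagonal(np.flip(vol, axis=1), axis1=1, axis2=2),
    ]
    return [np.asarray(v) for v in views]
```

Each returned view of an N×N×N crop is an N×N image, so all nine can be fed to the same 2D image encoder.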

[RESULTS] Our model outperformed the state-of-the-art (SOTA) models on the NLST test set with an AUROC of 0.901 and an AUPRC of 0.776. It also showed robust results on the external datasets. Using CLIP, we also obtained predictions on semantic features through zero-shot inference, such as nodule margin (AUROC: 0.807), nodule consistency (0.812), and pleural attachment (0.840).
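The zero-shot step follows the standard CLIP recipe: embed the nodule image once, embed a text prompt per candidate value of a semantic feature, and score by scaled cosine similarity. A minimal numpy sketch (the embeddings below are random placeholders standing in for real CLIP encoder outputs, and the example prompts are hypothetical):

```python
import numpy as np

def zero_shot_scores(image_emb, text_embs, temperature=100.0):
    """CLIP-style zero-shot scoring: softmax over the scaled cosine
    similarities between one image embedding and each prompt embedding."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = temperature * (txt @ img)
    exp = np.exp(logits - logits.max())  # stable softmax
    return exp / exp.sum()

# Placeholder embeddings standing in for encoder outputs on prompts like
# "a nodule with a spiculated margin" vs "a nodule with a smooth margin".
rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)
text_embs = rng.normal(size=(2, 512))
probs = zero_shot_scores(image_emb, text_embs)
```

Because scoring needs only text prompts, no per-feature classifier head has to be trained, which is what makes the semantic-feature predictions (margin, consistency, pleural attachment) available at inference time.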

[CONCLUSION] By incorporating semantic features into the vision-language model, our approach surpasses the SOTA models in predicting lung cancer from CT scans collected from diverse clinical settings. It provides explainable outputs, aiding clinicians in comprehending the underlying meaning of model predictions. The code is available at https://github.com/luotingzhuang/CLIP_nodule.
