Longitudinal multisource clinical model for early lung cancer risk stratification and screening.
- Study design: cohort study
APA
Chien CH, Chang SC, et al. (2026). Longitudinal multisource clinical model for early lung cancer risk stratification and screening. BMJ Health & Care Informatics, 33(1). https://doi.org/10.1136/bmjhci-2025-101989
MLA
Chien CH, et al. "Longitudinal multisource clinical model for early lung cancer risk stratification and screening." BMJ Health & Care Informatics, vol. 33, no. 1, 2026.
PMID
41734977
Abstract
[OBJECTIVES] Lung cancer is the leading cause of cancer-related mortality worldwide, with poor prognosis largely due to late-stage diagnosis. Current screening methods such as low-dose CT face accessibility and cost barriers in resource-limited settings. This study develops a lightweight multichannel convolutional neural network for lung cancer screening support through longitudinal risk stratification using routine pre-diagnostic healthcare data.
[METHODS] We conducted a retrospective cohort study using Taiwan's National Health Insurance Research Database, comprising 99 615 individuals (575 lung cancer cases; 99 040 non-cancer controls). Diagnostic codes, medication records and medical orders within a 36-month observation window were extracted. Log-likelihood ratio feature selection was implemented to reduce dimensionality, achieving 99.8% reduction in computational requirements while retaining clinical relevance. A multichannel Convolutional Neural Network (CNN) architecture was designed to process these heterogeneous data modalities simultaneously.
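The abstract does not state which form of log-likelihood ratio was used for feature selection. As a minimal sketch, assuming a Dunning-style G² statistic on a 2x2 table (feature present or absent, by case or control), candidate features could be scored and ranked as follows. The function names and the feature counts are hypothetical and serve only as illustration; the group sizes of 575 cases and 99 040 controls come from the abstract.

```python
import math


def xlogx(x):
    """Return x * ln(x), with the usual convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def llr_g2(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table.

    k11 = cases with the feature,    k12 = cases without it,
    k21 = controls with the feature, k22 = controls without it.
    Equivalent to 2 * sum(observed * ln(observed / expected)).
    """
    n = k11 + k12 + k21 + k22
    cells = xlogx(k11) + xlogx(k12) + xlogx(k21) + xlogx(k22)
    rows = xlogx(k11 + k12) + xlogx(k21 + k22)
    cols = xlogx(k11 + k21) + xlogx(k12 + k22)
    return 2.0 * (cells + xlogx(n) - rows - cols)


def select_top_features(counts, n_cases, n_controls, k):
    """Rank features by G^2 and keep the k highest-scoring names.

    counts maps a feature name to (cases_with_feature, controls_with_feature).
    """
    scores = {
        name: llr_g2(c_with, n_cases - c_with, ctl_with, n_controls - ctl_with)
        for name, (c_with, ctl_with) in counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical counts, NOT taken from the study.
example_counts = {
    "code_A": (120, 5000),  # clearly over-represented among cases
    "code_B": (30, 5100),   # about equally common in both groups
    "code_C": (3, 500),     # rare and about equally common
}
top = select_top_features(example_counts, n_cases=575, n_controls=99040, k=1)
```

G² measures the strength of association only, so a feature that is strongly under-represented among cases also scores highly. Whether the study filtered by direction is not stated in the abstract.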
[RESULTS] The proposed method achieved an F₁-score of 0.5738, precision of 0.7149, Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.8316 and Area Under the Precision-Recall Curve (AUPRC) of 0.1617, outperforming baseline methods in precision and F₁-score. Ablation studies confirmed that medical orders provide primary predictive value, while medication features contribute limited discriminative signal in the pre-diagnostic phase. SHapley Additive exPlanations analysis revealed that routine healthcare utilisation patterns, rather than cancer-specific features, drive risk stratification.
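Two quantities that help in reading these figures are not reported in the abstract: the recall implied by the F₁-score and precision, and the cohort case prevalence. The sketch below derives both. It assumes that F₁ and precision come from the same confusion matrix (if they were averaged over folds, the identity holds only approximately). The prevalence equals the AUPRC expected from a random classifier only if the evaluation set has the same case ratio as the full cohort, which the abstract does not state.

```python
def recall_from_f1_precision(f1, precision):
    """Invert F1 = 2PR / (P + R) for R, giving R = F1 * P / (2P - F1)."""
    denom = 2.0 * precision - f1
    if denom <= 0:
        raise ValueError("F1 and precision are inconsistent (need F1 < 2 * precision)")
    return f1 * precision / denom


# Values reported in the abstract.
f1_score = 0.5738
precision = 0.7149
n_cases = 575
n_total = 99_615

implied_recall = recall_from_f1_precision(f1_score, precision)  # about 0.479
prevalence = n_cases / n_total                                  # about 0.0058

print(f"implied recall:   {implied_recall:.4f}")
print(f"case prevalence:  {prevalence:.4f}")
```

Under these assumptions the model would recover roughly 48% of the lung cancer cases at the reported operating point, in a cohort where about 0.58% of individuals are cases.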
[DISCUSSION] The lightweight architecture enables deployment in resource-constrained clinical environments while maintaining robust performance, offering potential as a preliminary screening tool to identify high-risk individuals for further diagnostic examination.
[CONCLUSION] Efficient deep learning models using routine clinical data can facilitate lung cancer risk stratification and screening, providing a scalable solution for clinical implementation.
MeSH Terms
Humans; Lung Neoplasms; Retrospective Studies; Taiwan; Early Detection of Cancer; Risk Assessment; Male; Female; Middle Aged; Neural Networks, Computer; Aged
Highly cited papers by the same first author (3)
- Substantive theory of family resilience among couples dealing with prostate cancer: A grounded theory study.
- Effects of an app-assisted self-management intervention for urinary incontinence on self-efficacy and related outcomes in men with prostate cancer: A randomized controlled feasibility trial.
- Overall Survival and Complication Rates in the Treatment of Liver Carcinoma: A Comparative Study of Ultrasound, Computed Tomography, and Combined Ultrasound and Computed Tomography Guidance for Radiofrequency Ablation.