Development of machine learning-based prognostic models for small cell lung cancer with brain metastases: an analysis of SEER and Chinese populations.
APA
Guo M, Zhu J, et al. (2025). Development of machine learning-based prognostic models for small cell lung cancer with brain metastases: an analysis of SEER and Chinese populations. Journal of Thoracic Disease, 17(11), 9622-9641. https://doi.org/10.21037/jtd-2025-961
MLA
Guo M, et al. "Development of machine learning-based prognostic models for small cell lung cancer with brain metastases: an analysis of SEER and Chinese populations." Journal of Thoracic Disease, vol. 17, no. 11, 2025, pp. 9622-9641.
PMID
41376903
Abstract
[BACKGROUND] Accurate survival prediction is critical for optimizing clinical management in small cell lung cancer (SCLC) patients with initial brain metastasis (BM). This study aims to develop a machine learning (ML)-based prognostic model to improve survival prediction, thereby enhancing clinical decision-making and treatment outcomes.
[METHODS] Data on SCLC patients with initial BM were extracted from the Surveillance, Epidemiology, and End Results (SEER) database for the period from 2010 to 2020 and split into training (70%) and testing (30%) sets. We used a Cox proportional hazards regression model to select modeling features in the training set. Four ML methods, namely least absolute shrinkage and selection operator (LASSO), random forest (RF), eXtreme Gradient Boosting (XGBoost), and gradient boosting machine (GBM), were used to establish prognostic models, which were validated on the SEER testing set and on an external cohort of patients from Harbin Medical University Cancer Hospital. Model performance was evaluated using area under the curve (AUC) values, accuracy, and F1 scores. Furthermore, we ranked feature importance using SHapley Additive exPlanations (SHAP) at various time points.
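The split and evaluation steps described in the methods can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the seed, the helper names (`split_indices`, `horizon_labels`, `auc`, `accuracy_f1`), the 0.5 decision threshold, the toy follow-up data, and in particular the handling of censoring (patients censored before the horizon are dropped) are all assumptions that the abstract does not specify.

```python
import random

def split_indices(n, train_frac=0.7, seed=0):
    """Shuffle 0..n-1 once and cut at floor(n * train_frac). Seed is arbitrary."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train_frac)
    return idx[:n_train], idx[n_train:]

def horizon_labels(months, died, horizon):
    """1 = died by the horizon, 0 = followed past it, None = censored earlier.

    Dropping the None cases is an assumption; the abstract does not say
    how censored patients were treated at each time point.
    """
    labels = []
    for t, d in zip(months, died):
        if t <= horizon and d:
            labels.append(1)
        elif t > horizon:
            labels.append(0)
        else:
            labels.append(None)
    return labels

def auc(y_true, scores):
    """Probability that a random positive scores above a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def accuracy_f1(y_true, y_pred):
    """Accuracy and F1 for binary labels."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return accuracy, f1

# A 70/30 cut of 4,227 records gives the set sizes reported in the results.
train_idx, test_idx = split_indices(4227)

# Toy follow-up data (hypothetical, not from the study).
months = [2, 5, 8, 14, 3, 20]
died = [1, 1, 1, 0, 0, 1]
risk = [0.9, 0.6, 0.7, 0.2, 0.5, 0.1]  # hypothetical model risk scores

labels = horizon_labels(months, died, horizon=6)
kept = [(y, s) for y, s in zip(labels, risk) if y is not None]
y_true = [y for y, _ in kept]
y_score = [s for _, s in kept]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

auc_6m = auc(y_true, y_score)
acc_6m, f1_6m = accuracy_f1(y_true, y_pred)
print(len(train_idx), len(test_idx), round(auc_6m, 3), acc_6m, f1_6m)
```

Repeating the last block with `horizon` set to 12, 24, and 36 months would give one AUC, accuracy, and F1 value per time point, which is the shape of the evaluation the abstract reports.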
[RESULTS] A total of 4,227 patients from the SEER database were enrolled, with 2,958 cases allocated to the training set and 1,269 to the testing set. Based on the results of univariate and multivariate Cox regression analyses, we identified age, sex, marital status, primary tumor size, N stage, bone metastasis, liver metastasis, lung metastasis, months from diagnosis to therapy, surgery, radiotherapy, and chemotherapy as model features. The LASSO model outperformed the other models, with AUCs of 0.771, 0.724, 0.753, and 0.718 at 6 months, 1 year, 2 years, and 3 years in the testing set, and 0.801, 0.763, 0.838, and 0.900 in the external validation set. Additionally, feature importance analysis consistently identified liver metastasis (highest rank), surgery, radiotherapy, lung metastasis, and chemotherapy as key predictors across all time points.
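The feature ranking reported above can be illustrated with a small sketch of how SHAP values produce a global importance order. It assumes a linear risk score with features treated as independent, in which case the SHAP value of feature j for one patient is w_j * (x_j - mean(x_j)). The abstract does not state which SHAP estimator the authors used, and the feature names, weights, and data below are placeholders, not study values.

```python
def linear_shap(weights, X):
    """SHAP values for a linear score, assuming independent features."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(weights))]
    return [[w * (x - m) for w, x, m in zip(weights, row, means)] for row in X]

def rank_features(names, shap_values):
    """Order features by mean absolute SHAP value, largest first."""
    n = len(shap_values)
    importance = {
        name: sum(abs(row[j]) for row in shap_values) / n
        for j, name in enumerate(names)
    }
    return sorted(importance.items(), key=lambda kv: kv[1], reverse=True)

# Placeholder inputs (hypothetical): three binary features, four patients.
names = ["feature_a", "feature_b", "feature_c"]
weights = [0.8, -0.5, 0.2]
X = [
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
]

shap_values = linear_shap(weights, X)
ranking = rank_features(names, shap_values)
for name, value in ranking:
    print(name, round(value, 3))
```

Running the same ranking on the model fitted for each prediction horizon is one way to check whether the top features stay stable across time points.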
[CONCLUSIONS] The LASSO model demonstrated high accuracy in predicting survival for SCLC patients with initial BM, particularly in external validation. This model may provide valuable prognostic insights for personalized treatment strategies.
Highly cited papers by the same first author (5)
- Polysaccharide of Danggui Buxue Decoction Attenuates Colorectal Cancer via Modulating Intestinal Microflora and Metabolites.
- Adjuvant-metal-ion-chelating PTEN mRNA with cell-membrane-coating augments the immune sensitivity for precise cancer immunotherapy.
- Comment on "Prehabilitation Interventions in Patients Undergoing Colorectal Cancer Surgery: A Systematic Review and Meta-Analysis".
- Heterogeneous inorganic nanomedicine delivery system loaded with anlotinib for enhanced treatment of non-small cell lung cancer (NSCLC).
- Suppressive role of SCN4B in the epithelial‑mesenchymal transition of lung adenocarcinoma.