Development and validation of a machine learning model for predicting early death in metastatic pancreatic ductal adenocarcinoma: a study based on the SEER database.

Translational cancer research 2026 Vol.15(1) p. 53

Zhang L, He J

Cite this paper

APA Zhang L, He J (2026). Development and validation of a machine learning model for predicting early death in metastatic pancreatic ductal adenocarcinoma: a study based on the SEER database. Translational cancer research, 15(1), 53. https://doi.org/10.21037/tcr-2025-1276
MLA Zhang L, et al. "Development and validation of a machine learning model for predicting early death in metastatic pancreatic ductal adenocarcinoma: a study based on the SEER database." Translational cancer research, vol. 15, no. 1, 2026, p. 53.
PMID 41674965

Abstract

[BACKGROUND] Metastatic pancreatic ductal adenocarcinoma (mPDAC) has a poor prognosis, with a significant number of patients experiencing early death. Identifying these high-risk patients at diagnosis is critical for personalizing treatment intensity, facilitating timely palliative care discussions, and improving clinical trial stratification. Therefore, this study aimed to develop and validate a machine learning (ML)-based algorithm to estimate the probability of early death in patients with mPDAC.

[METHODS] We included a total of 14,820 patients diagnosed with mPDAC from the Surveillance, Epidemiology, and End Results (SEER) database. Patients with missing data on survival time or essential variables were excluded. The cohort was randomly split into a training set (70%) and an internal test set (30%). For external validation, we retrospectively enrolled patients with mPDAC from a Chinese medical center (2017-2019), representing a distinct geographic and healthcare population. The primary outcome was early death, defined as all-cause mortality within three months of diagnosis. Baseline clinical predictors included demographic, tumor, and treatment characteristics. Four ML models were constructed based on clinical and pathological features. Model performance was assessed using the area under the curve (AUC), calibration plots, and decision curve analysis (DCA). The optimal model was selected based on 10-fold cross-validation, and its generalizability was validated both internally and externally. Additionally, Shapley values for relevant features were calculated using the SHapley Additive exPlanations (SHAP) method.
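
As a rough illustration of the evaluation workflow described in the methods, the following Python sketch implements a 70/30 random split, unstratified 10-fold index generation, the AUC, and the net benefit that underlies decision curve analysis, using only the standard library. It is not the authors' code: the abstract does not describe their implementation, and every function name, the random seed, and the toy labels and scores are assumptions made for illustration.

```python
import random


def train_test_split_indices(n, test_fraction=0.3, seed=0):
    """Shuffle indices 0..n-1 and return (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an arbitrary illustrative choice
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def kfold_indices(indices, k=10):
    """Yield (train, validation) index lists for k roughly equal, unstratified folds."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def auc(y_true, y_score):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_score, threshold):
    """Net benefit at threshold probability pt: TP/n - FP/n * pt / (1 - pt)."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Toy labels (1 = early death) and predicted risks, invented for illustration.
y = [0, 0, 1, 1]
risk = [0.1, 0.4, 0.35, 0.8]

train, test = train_test_split_indices(14820)
print(len(train), len(test))  # 10374 4446
print(auc(y, risk))           # 0.75
print(net_benefit(y, risk, threshold=0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the model against the "treat all" and "treat none" strategies. A quadratic-time AUC is adequate for a sketch; a rank-based version would be preferable for a cohort of this size.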

[RESULTS] The extreme gradient boosting classifier (XGBoost) model demonstrated the best performance (AUC = 0.757). Crucially, it maintained strong generalizability in the independent external Chinese cohort (AUC = 0.780), demonstrating robust cross-population applicability. In the feature importance ranking, chemotherapy was the most important feature, followed by age and marital status.
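
To make the idea behind a Shapley-based feature ranking concrete, the sketch below computes exact Shapley values for a tiny toy risk function by enumerating all feature subsets. It is only a simplified illustration, not the SHAP library or the authors' XGBoost model: `toy_risk`, its coefficients, and the single all-zero baseline are hypothetical choices, and features outside a subset are simply set to the baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline value,
    which is a simplification of how SHAP handles absent features.
    """
    n = len(x)

    def value(subset):
        z = [x[j] if j in subset else baseline[j] for j in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Hypothetical risk score over three binary features (made-up coefficients)."""
    return 0.2 + 0.3 * z[0] + 0.1 * z[1] + 0.05 * z[2] + 0.1 * z[0] * z[1]


x = [1, 1, 1]         # instance to explain
baseline = [0, 0, 0]  # reference point, an illustrative assumption
phi = shapley_values(toy_risk, x, baseline)

# Rank features by absolute contribution, as in an importance plot.
ranking = sorted(range(len(phi)), key=lambda i: abs(phi[i]), reverse=True)
print([round(p, 3) for p in phi], ranking)
```

The contributions sum to `toy_risk(x) - toy_risk(baseline)`, which is the efficiency property that makes Shapley values additive. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on model-specific or sampling-based approximations.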

[CONCLUSIONS] We developed and validated an interpretable ML model that accurately predicts the risk of early death in mPDAC patients. The model's robust performance across US and Chinese populations underscores its broad clinical utility. This tool can assist clinicians in identifying high-risk individuals at diagnosis, thereby informing personalized treatment strategies, prioritizing palliative care, and optimizing resource allocation in diverse healthcare settings.
