Explainable Machine Learning Model for Predicting Postoperative Survival in Patients With Locally Advanced Gastric Cancer.
APA
Gong Z, Zhou L, et al. (2025). Explainable Machine Learning Model for Predicting Postoperative Survival in Patients With Locally Advanced Gastric Cancer. Cancer Medicine, 14(22), e71408. https://doi.org/10.1002/cam4.71408
MLA
Gong Z, et al. "Explainable Machine Learning Model for Predicting Postoperative Survival in Patients With Locally Advanced Gastric Cancer." Cancer Medicine, vol. 14, no. 22, 2025, pp. e71408.
PMID
41271251
Abstract
[PURPOSE] This study aims to develop and validate an explainable machine learning model for predicting postoperative survival in patients with locally advanced gastric cancer (LAGC), optimizing predictive accuracy while ensuring clinical applicability to facilitate personalized prognostication for patients.
[METHODS] Data from 8616 LAGC patients who underwent gastrectomy (2004-2015) in the Surveillance, Epidemiology, and End Results (SEER) database were used for model development and validation, with external validation performed using 235 postoperative LAGC cases (2016-2022) from Maoming People's Hospital (Maoming, China). Five predictive models were developed using the training set: Cox proportional hazards model (CoxPH), random survival forest (RSF), extreme gradient boosting (XGBoost), gradient boosting machine (GBM), and DeepSurv. Model performance was evaluated using the concordance index (C-index), area under the receiver operating characteristic curve (AUROC), and Brier score. Additionally, 1-, 3-, and 5-year receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA) were employed for further assessment. The optimal model was interpreted using explainability tools such as SurvSHAP and SurvLIME. Finally, an interactive prediction tool was created to provide personalized survival evaluation for LAGC patients.
[RESULTS] RSF exhibited the highest predictive performance, with a C-index of 0.732 (95% CI: 0.720-0.745) in the validation set and 0.723 (95% CI: 0.696-0.755) in the external validation set. The 1-, 3-, and 5-year AUROCs were 0.771, 0.803, and 0.809 in the validation set, and 0.802, 0.711, and 0.721 in the external validation set. Explainability analysis identified lymph node ratio (LNR), AJCC stage, and age as the most influential prognostic factors. An interactive prediction tool was developed to provide individualized prognosis visualization.
[CONCLUSION] This study developed an RSF-based model to predict postoperative survival in LAGC patients, emphasizing the prognostic significance of LNR, AJCC stage, and age. The interactive prediction tool enhances clinical utility, facilitating personalized treatment decision-making for physicians.
1
Introduction
Gastric cancer (GC) ranks as the fifth most frequently diagnosed malignancy worldwide and the third leading cause of cancer‐related deaths [1]. Due to the absence of specific early symptoms, the majority of patients are diagnosed at locally advanced or advanced stages [2]. For those diagnosed with locally advanced gastric cancer (LAGC), surgery combined with adjuvant therapy remains the primary treatment approach [3, 4]. However, the overall postoperative survival rate remains unsatisfactory, with a 5‐year survival rate below 30% [5, 6, 7]. The interplay of multiple factors, including tumor staging, molecular subtypes, and individual patient heterogeneity, significantly impacts prognosis, posing a considerable challenge for accurate prognostic prediction [8]. The American Joint Committee on Cancer (AJCC) staging system is the predominant prognostic evaluation instrument for LAGC in clinical practice. While it provides guidance for treatment decisions, it does not offer sufficient prognostic information [9]. Therefore, developing an accurate predictive model for postoperative survival in LAGC patients is essential to optimize clinical decision‐making and enhance patient outcomes.
Machine learning (ML), a core branch of artificial intelligence, enables algorithms to identify patterns in data and build predictive models [10]. Traditional linear statistical models have been used extensively to predict postoperative survival in LAGC patients and have shown some utility [11, 12, 13]. In real‐world clinical practice, however, considering only linear relationships among clinical variables is insufficient. Compared with traditional statistical methods, ML offers substantial advantages in handling high‐dimensional data with nonlinear effects and complex interactions, enabling the integration of multimodal clinical data to enhance predictive accuracy [14]. At the same time, the “black‐box” nature of ML models makes it difficult to interpret the dynamic contributions of key prognostic factors to individual survival risk, potentially undermining clinicians' trust in the predictions [15]. In recent years, various explainable artificial intelligence (XAI) algorithms have been introduced [16] to balance predictive accuracy with interpretability. For patients with LAGC, however, explainable ML models for predicting postoperative survival remain scarce.
This study integrates data from the Surveillance, Epidemiology, and End Results (SEER) database and Maoming People's Hospital (Maoming, China) to develop and validate an explainable ML‐based model for predicting postoperative survival in LAGC patients. The model is designed to optimize predictive accuracy while ensuring clinical applicability, offering personalized prognostic insights for both clinicians and patients.
2
Materials and Methods
2.1
Data Collection
This study was a retrospective analysis, with the overall workflow illustrated in Figure 1. Data were extracted from the SEER Research Data (17 Registries, Nov 2023 submission, 2000–2021) using SEER*Stat software (v8.4.4), selecting cases diagnosed between 2004 and 2015 according to the specified inclusion criteria: (i) age between 18 and 80 years; (ii) primary gastric malignancies (anatomical location: C16.0–C16.9; ICD‐O‐3 histological codes: 8140/3–8490/3); and (iii) patients who underwent gastrectomy (RX Summ‐Surg Prim Site codes 30–80). Exclusion criteria were: (i) distant metastatic (M1) or early‐stage gastric cancer (T1‐2N0M0); (ii) cases with missing key clinical variables or incomplete follow‐up data; and (iii) overall survival (OS) < 1 month. Additionally, an external validation set of 235 LAGC patients treated at Maoming People's Hospital (Maoming, China) between 2016 and 2022 was included, complying with the identical inclusion and exclusion criteria. The study was approved by the Ethics Committee of Maoming People's Hospital (approval number: PJ2025MI‐K038‐01). Given that the study utilized only de‐identified historical data, the ethics committee waived the requirement for informed consent.
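The inclusion and exclusion criteria above amount to a record-level filter. The following is a minimal Python sketch of that logic, not the authors' extraction procedure (the study used SEER*Stat and R). It assumes each case is a dictionary, and every field name (`age`, `site`, `histology`, `surgery_code`, `t_stage`, `n_stage`, `m_stage`, `os_months`) is hypothetical rather than an actual SEER column name.

```python
# Illustrative cohort filter; all field names are hypothetical, not SEER's.
REQUIRED = ("age", "site", "histology", "surgery_code",
            "t_stage", "n_stage", "m_stage", "os_months")


def is_eligible(case):
    """Return True if a case passes the stated inclusion/exclusion criteria."""
    # Exclusion (ii): missing key clinical variables or follow-up data.
    if any(case.get(key) is None for key in REQUIRED):
        return False
    # Inclusion (i): age between 18 and 80 years.
    if not 18 <= case["age"] <= 80:
        return False
    # Inclusion (ii): primary site C16.0-C16.9, histology codes 8140-8490
    # (malignant behaviour /3 is assumed to be encoded elsewhere).
    if not case["site"].startswith("C16."):
        return False
    if not 8140 <= case["histology"] <= 8490:
        return False
    # Inclusion (iii): gastrectomy, surgery codes 30-80.
    if not 30 <= case["surgery_code"] <= 80:
        return False
    # Exclusion (i): distant metastasis (M1) or early-stage disease (T1-2N0M0).
    if case["m_stage"] != "M0":
        return False
    if case["t_stage"][:2] in ("T1", "T2") and case["n_stage"] == "N0":
        return False
    # Exclusion (iii): overall survival shorter than 1 month.
    return case["os_months"] >= 1
```

Applying `is_eligible` to both the SEER extract and the hospital records would be one way to keep the two cohorts on identical criteria, as the text requires.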
2.2
Variables
Patient survival data (overall survival and the corresponding endpoint event status) and clinical characteristics were systematically collected, including demographic variables (sex, age, race, marital status), tumor characteristics (location, size, grade, and histological type), staging details (TNM stage standardized according to the 8th edition of the AJCC classification), lymph node metastatic burden (number of positive lymph nodes and total examined lymph nodes), and treatment modalities (extent of surgery and receipt of chemotherapy or radiotherapy). Univariate Cox regression analyses were conducted for all variables, and those with p < 0.05 were incorporated into multivariate Cox regression analysis to ascertain independent prognostic factors.
A multi‐stage feature selection strategy was employed, where clinical variables were screened using Least Absolute Shrinkage and Selection Operator (LASSO), Recursive Feature Elimination (RFE), and the Boruta algorithm. The final feature set comprised consensus variables identified through the intersection of all three methods. LASSO is an embedded feature selection method that applies L1 regularization, shrinking the coefficients of non‐informative variables to zero, effectively handling multicollinearity and reducing overfitting [17]. RFE is a feature selection technique that employs a backward elimination strategy to iteratively refine the feature subset [18]. In this study, we propose a GBM‐RFE variant, which utilizes gradient boosting machine (GBM) to determine feature importances and iteratively remove the least important features, making it well‐suited to capturing complex or nonlinear relationships. Both LASSO and RFE were implemented with five‐fold cross‐validation to enhance selection stability. The Boruta algorithm, a wrapper‐based feature selection method, introduces “shadow features” as benchmark references. A random forest model then computes Z‐scores for real and synthetic noise features, applying statistical hypothesis testing to determine feature importance [19].
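The consensus step (keeping only variables selected by all three methods) can be illustrated with a set intersection. This is a minimal sketch; the three selections below are hypothetical placeholders chosen for illustration and are not the selections obtained in the study.

```python
# Hypothetical selections; feature names are placeholders, not study results.
lasso = {"age", "t_stage", "lnr", "tumor_size", "sex"}
rfe = {"age", "t_stage", "lnr", "tumor_size"}
boruta = {"age", "t_stage", "lnr", "tumor_size", "marital_status"}


def consensus_features(*selections):
    """Keep only the features chosen by every selection method."""
    return sorted(set.intersection(*map(set, selections)))


selected = consensus_features(lasso, rfe, boruta)
print(selected)  # ['age', 'lnr', 't_stage', 'tumor_size']
```

Because an intersection can never be larger than its smallest input, the most conservative method bounds the size of the final feature set.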
2.3
Model Development
Patients from the SEER database were randomly assigned to a training set and a validation set in a 7:3 ratio. Five representative predictive models were developed using the training set: Cox proportional hazards model (CoxPH), random survival forest (RSF), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), and DeepSurv.
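A 7:3 random split can be sketched as follows. The seed and the use of floor rounding are assumptions for illustration; the text does not state how the authors seeded or rounded the split, or whether it was stratified.

```python
import random


def split_7_3(patient_ids, seed=42):
    """Shuffle IDs reproducibly, then split them 70% / 30%."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)   # seed is an arbitrary assumption
    n_train = len(ids) * 7 // 10       # floor of 70%, exact integer arithmetic
    return ids[:n_train], ids[n_train:]


train, valid = split_7_3(range(8616))
print(len(train), len(valid))  # 6031 2585
```

With 8616 SEER patients this yields 6031 training cases, which matches the training-set size reported for the SurvSHAP(t) analysis; the 2585 validation cases are a derived figure, not one stated in the text.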
CoxPH is a classical semi‐parametric survival analysis method that establishes a log‐linear relationship between covariates and the hazard function using partial likelihood estimation, offering an interpretable hazard ratio framework. RSF, a non‐parametric ensemble learning approach, constructs multiple survival trees and employs splitting criteria to maximize survival differences across nodes, effectively capturing nonlinear effects and interactions [20]. GBM iteratively optimizes an additive model by minimizing the negative log‐likelihood loss function, enabling adaptive learning for heterogeneous survival data [21]. XGBoost extends GBM by incorporating L1/L2 regularization, second‐order gradient approximation, and parallel computation. Through feature binning and a sparsity‐aware algorithm, it enhances computational efficiency, rendering it especially appropriate for high‐dimensional survival prediction tasks [22]. DeepSurv, a deep learning‐based survival model, utilizes neural networks to capture high‐order feature interactions through nonlinear transformations in hidden layers, enabling the modeling of complex survival risk patterns [23]. Hyperparameters for all machine learning models were optimized using grid search with five‐fold cross‐validation, evaluating performance by the concordance index (C‐index).
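Since the C‐index drives hyperparameter tuning, a small reference implementation may help. The sketch below computes Harrell's C‐index in plain Python; it is an assumption that this variant corresponds to the one used in the study, as the measure implemented in "mlr3proba" may differ (for example, in how ties or censoring weights are handled).

```python
def harrell_c_index(times, events, risks):
    """Harrell's C: share of comparable pairs that the risk score ranks correctly.

    times  -- observed follow-up times
    events -- 1 if the event (death) was observed, 0 if censored
    risks  -- predicted risk scores (higher means worse expected survival)
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # strictly before patient j's observed time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # earlier event had the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The double loop is quadratic in the number of patients, which is acceptable for illustration but slow for cohorts of several thousand.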
2.4
Model Evaluation
Multiple metrics were used to evaluate model performance. The C‐index and the integrated area under the receiver operating characteristic curve (integrated AUROC) assessed global discriminatory ability, while 1‐year, 3‐year, and 5‐year ROC curves were plotted to evaluate time‐specific discrimination. Higher C‐index and AUROC values indicate better discriminative ability. The integrated Brier score (IBS) and calibration curves were used to evaluate predictive accuracy. An IBS < 0.25 was considered indicative of an acceptable overall prediction error (0.25 is the Brier score obtained by predicting a probability of 0.5 for every patient), while calibration curves were used to visualize local calibration bias across risk strata, with closer alignment to the diagonal reflecting lower bias. Decision curve analysis (DCA) was performed to quantify the net benefit of each model at different probability thresholds, providing insights into clinical utility. The integrated AUROC was calculated as the inverse‐probability‐of‐censoring‐weighted (IPCW) mean of the time‐dependent AUROC values at 1, 3, and 5 years, and the IBS was calculated similarly.
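The net benefit underlying DCA can be sketched for a single time horizon. The example below uses the standard binary-outcome formula, net benefit = TP/n − (FP/n) × p_t/(1 − p_t), and assumes every patient's status at the horizon is known. This is a simplification: a survival DCA such as the one in this study must also account for censoring, which the sketch ignores.

```python
def net_benefit(died, predicted_risk, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(died)
    flagged = [d for d, r in zip(died, predicted_risk) if r >= threshold]
    true_pos = sum(flagged)                 # flagged patients who died
    false_pos = len(flagged) - true_pos     # flagged patients who survived
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(died, threshold):
    """Reference strategy: treat everyone regardless of predicted risk."""
    prevalence = sum(died) / len(died)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data, invented for illustration only.
died = [1, 1, 0, 0]
risk = [0.9, 0.8, 0.3, 0.6]
print(net_benefit(died, risk, 0.5))        # 0.25
print(net_benefit_treat_all(died, 0.5))    # 0.0
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy).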
Risk categories (high‐, medium‐, and low‐risk) were established using tertile cutoffs derived from risk scores of the optimal model. Prognostic distinctions among these groups were evaluated using Kaplan–Meier survival analysis with log‐rank tests.
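Tertile-based stratification can be sketched with the standard library. The cut points here come from `statistics.quantiles` with its default interpolation; which cohort the authors derived their cutoffs from, and which quantile definition they used, are not stated, so both are assumptions of this example.

```python
from statistics import quantiles


def tertile_groups(risk_scores):
    """Label each risk score as low, medium, or high using tertile cutoffs."""
    low_cut, high_cut = quantiles(risk_scores, n=3)  # two cut points
    groups = []
    for score in risk_scores:
        if score <= low_cut:
            groups.append("low")
        elif score <= high_cut:
            groups.append("medium")
        else:
            groups.append("high")
    return groups


# Toy risk scores, invented for illustration only.
print(tertile_groups([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

To stratify a validation cohort consistently, the two cutoffs would be computed once on the reference cohort and then reused, rather than recomputed per cohort.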
2.5
Model Explainability and Application
A multi‐level explainability framework was implemented to elucidate the predictions of the optimal model. At the global level, we generated time‐dependent feature‐importance plots, partial dependence survival profiles, and SurvSHAP(t) (a Shapley Additive Explanations variant for survival) plots. These plots were used to assess the contributions of individual features to model predictions and to illustrate how these contributions evolve over survival time. At the local level, case‐specific explanations were provided through SurvSHAP(t) plots and Survival Local Interpretable Model‐agnostic Explanations (SurvLIME) plots, enabling visualization of individualized prediction mechanisms.
Finally, an interactive application was created based on the optimal model to facilitate personalized postoperative survival prediction for LAGC patients, providing a user‐friendly platform for clinicians and patients.
2.6
Statistical Analysis
All statistical analyses and data visualizations were performed using R (version 4.3.3). Categorical variables were compared using the chi‐square test, while continuous variables were analyzed using the Mann–Whitney U test. Survival prediction models were developed using the “mlr3proba” framework, with DeepSurv implemented via the reticulate interface to call the Pycox module. Model explainability was conducted using the “survex” package, and the interactive prognostic prediction platform was built using the Shiny framework. A p value < 0.05 was deemed statistically significant.
3
Results
3.1
The Characteristics of Patients
A total of 8616 LAGC patients from the SEER database and 235 patients from Maoming People's Hospital who underwent gastrectomy were included in the study. Table 1 summarizes the baseline characteristics of patients, highlighting notable differences between the SEER and Chinese cohorts. Regarding demographic data, racial composition differed entirely between the two cohorts, and the percentage of married patients was markedly greater in the China cohort compared to the SEER dataset (95.7% vs. 64.2%, p < 0.001). Regarding tumor characteristics, the proportion of T4 stage cases was significantly higher in the China cohort than in the SEER dataset (68.5% vs. 33.7%, p < 0.001). Additionally, while 55.3% of patients in the SEER dataset were diagnosed at stage III, the proportion was even higher in the China cohort (65.5%), indicating that patients in the China cohort were diagnosed at a more advanced tumor stage. In terms of treatment modalities, the China cohort had a lower percentage of patients receiving chemotherapy or radiotherapy. Notably, only a very small proportion of patients in the China cohort received radiotherapy. Furthermore, discrepancies were noted in N staging, tumor location, histological type, surgical scope, and the quantity of lymph nodes examined between the two datasets.
3.2
Variable Selection
Table 2 presents the results of the Cox regression analysis. Multivariate Cox regression revealed that all variables, except for sex and AJCC stage, were independent prognostic factors. Subsequently, three feature selection methods—LASSO, RFE, and the Boruta algorithm—were applied to clinical variables (Figure 2). LASSO regression identified 13 variables, while Boruta and RFE selected 15 and 12 variables, respectively. The intersection of these methods yielded 12 key prognostic features: age, race, T stage, AJCC stage, tumor size, tumor location, histological grade, RNE (number of regional nodes examined), LNR (lymph node ratio; defined as positive lymph nodes/total lymph nodes examined), extent of surgery, chemotherapy, and radiotherapy. These features were incorporated into the model development process.
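The LNR definition given above translates directly into code. In this minimal sketch, the handling of zero examined nodes (raising an error) is an assumption, since the text does not say how such cases were treated.

```python
def lymph_node_ratio(positive_nodes, examined_nodes):
    """LNR = positive lymph nodes / total lymph nodes examined."""
    if examined_nodes <= 0:
        # Assumption: LNR is undefined when no nodes were examined.
        raise ValueError("at least one lymph node must be examined")
    if not 0 <= positive_nodes <= examined_nodes:
        raise ValueError("positive nodes must be between 0 and examined nodes")
    return positive_nodes / examined_nodes


print(lymph_node_ratio(5, 20))  # 0.25
```

Because LNR is bounded between 0 and 1, it normalizes nodal burden by the extent of lymphadenectomy, whereas RNE captures the extent itself.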
3.3
Model Comparisons
Five prediction models were evaluated in the validation and external validation sets; their performance metrics are summarized in Table 3. RSF performed best, with C‐index values of 0.732 (95% CI: 0.720–0.745) and 0.723 (95% CI: 0.696–0.755), integrated AUROCs of 0.794 and 0.757, and integrated Brier scores of 0.171 and 0.172 in the validation and external validation cohorts, respectively, indicating the strongest discrimination and calibration among the five models.
ROC curves, calibration plots, and DCA were generated for all models (Supplementary Figure 1: training set; Figures 3 and 4: validation and external validation sets). RSF demonstrated superior discriminative performance, achieving higher AUROCs in both the validation set (1‐, 3‐, and 5‐year AUROCs: 0.771, 0.803, and 0.809) and the external validation set (1‐, 3‐, and 5‐year AUROCs: 0.802, 0.711, and 0.721). In the calibration plots, all models except GBM demonstrated good calibration in the validation set. Although calibration performance declined in the external validation set, RSF maintained relatively acceptable calibration. The DCA plots indicated that RSF provided a consistently higher net benefit than the other models across a broad range of threshold probabilities in both validation cohorts. Based on these findings, RSF was selected as the optimal model for further investigation.
A risk stratification system was constructed using RSF‐derived risk scores (Figure 5), which effectively classified patients into high‐, medium‐, and low‐risk categories. Significant differences in prognosis were observed across these risk strata in all three cohorts (p < 0.001), demonstrating the model's robust prognostic stratification capability.
3.4
Global Explanation
To enhance model interpretability, we began with a global interpretation of the model. As shown in Figure 6, we assessed time‐dependent variable importance for the RSF model using two distinct permutation‐based approaches—the Brier score loss and the C/D AUC loss. Greater changes in these loss metrics after permutation indicate higher importance of the respective clinical variables over time. LNR emerged as the most influential feature, particularly during the early survival period, with its importance gradually declining over time yet remaining significant in later survival stages. AJCC stage, T stage, and tumor location maintained high importance throughout the follow‐up period. Chemotherapy and radiotherapy played a crucial role in early survival predictions but exhibited diminishing effects in later periods. In contrast to LNR and AJCC stage, whose prognostic significance declined over time, the influence of age increased and ultimately exceeded both variables in later survival intervals.
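The idea behind permutation-based importance can be shown at a single time point. The sketch below shuffles one feature, re-scores the model, and reports the average increase in loss. It uses a plain squared-error (Brier-type) loss on toy data with a toy model; the study's analysis instead used time‐dependent Brier score and C/D AUC losses through the "survex" package, so this illustrates the principle rather than reproducing that procedure.

```python
import random


def squared_error_loss(model, rows, outcomes):
    """Mean squared difference between predicted probability and outcome."""
    return sum((model(r) - y) ** 2 for r, y in zip(rows, outcomes)) / len(rows)


def permutation_importance(model, rows, outcomes, feature, repeats=10, seed=0):
    """Average increase in loss after shuffling one feature's values."""
    rng = random.Random(seed)
    baseline = squared_error_loss(model, rows, outcomes)
    increases = []
    for _ in range(repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)  # break the link between feature and outcome
        permuted = [{**r, feature: v} for r, v in zip(rows, shuffled)]
        increases.append(squared_error_loss(model, permuted, outcomes) - baseline)
    return sum(increases) / repeats


# Toy data and toy model, invented for illustration only.
rows = [{"lnr_high": i % 2, "noise": i % 3} for i in range(20)]
outcomes = [r["lnr_high"] for r in rows]          # outcome depends on lnr_high only


def toy_model(row):
    """Predicted probability of death; ignores the 'noise' feature."""
    return float(row["lnr_high"])


importance = {f: permutation_importance(toy_model, rows, outcomes, f)
              for f in ("lnr_high", "noise")}
```

In this toy setup the feature the model relies on receives a positive importance, while the ignored feature receives exactly zero, mirroring how larger loss changes in Figure 6 indicate more important variables.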
Next, partial dependence survival curves were generated (Figure 7) to illustrate how variations in individual variables influenced overall survival while holding other variables constant. These curves confirmed that LNR, age, AJCC stage, and T stage were the most influential predictors of survival, with chemotherapy, radiotherapy, and tumor location also playing notable prognostic roles.
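A partial dependence survival curve is obtained by fixing one variable at a chosen value for every patient, predicting each patient's survival curve, and averaging the curves. The sketch below shows that procedure. It assumes a toy parametric model in place of the fitted RSF, and its coefficients, patient records, and function names are invented for illustration.

```python
import math


def survival_curve(x, times):
    """Toy stand-in for a fitted model: S(t | x) for each t in times."""
    rate = 0.05 * math.exp(2.0 * x["lnr"] + 0.02 * (x["age"] - 60))
    return [math.exp(-rate * t) for t in times]


def partial_dependence(patients, feature, grid, times):
    """Average predicted survival curve with `feature` fixed at each grid value."""
    curves = {}
    for value in grid:
        sums = [0.0] * len(times)
        for p in patients:
            modified = dict(p, **{feature: value})  # copy; original untouched
            for i, s in enumerate(survival_curve(modified, times)):
                sums[i] += s
        curves[value] = [s / len(patients) for s in sums]
    return curves


# Hypothetical patients; other variables keep their observed values.
patients = [
    {"lnr": 0.05, "age": 52},
    {"lnr": 0.30, "age": 67},
    {"lnr": 0.60, "age": 74},
]
times = [12, 36, 60]  # months
pd_lnr = partial_dependence(patients, "lnr", [0.0, 0.5, 1.0], times)
print(pd_lnr)
```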
Finally, a SurvSHAP(t) summary plot was generated using RSF model predictions for 6031 patients in the training set (Figure 8). SurvSHAP(t), an explainable artificial intelligence (XAI) methodology for survival analysis, extends the classical SHAP framework by quantifying feature contributions across the survival time dimension [24]. Figure 8A (left) presents a bar plot ranking the aggregated absolute SurvSHAP(t) values (|SurvSHAP(t)|), indicating variable importance. LNR was the most influential predictor, followed by AJCC stage, age, and radiotherapy. Tumor location, T stage, and chemotherapy also played significant roles, while tumor size, histological grade, and extent of surgery had relatively lower importance. The right panel of Figure 8A displays time‐dependent trends of variable importance, showing that LNR was crucial for short‐term survival prediction, whereas age and tumor location became more important for long‐term survival, consistent with the time‐dependent feature importance plot in Figure 6.
The beeswarm plot in Figure 8B further visualizes the distribution of SurvSHAP(t) values across 6031 patients. Variables are arranged by importance, and the horizontal axis represents SurvSHAP(t) values, with larger values indicating greater impact on survival risk. The color intensity represents variations in the variables: for continuous variables, a transition from light to dark indicates an increase in value, while for categorical variables, different colors represent different categories. This visualization further supports the conclusion that LNR, AJCC stage, and age are key prognostic predictors.
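A ranking by aggregated |SurvSHAP(t)|, like the bar plot described for Figure 8A, can be sketched as below. It assumes the aggregate is the time integral of the absolute attribution curve, averaged over patients, which may differ from the exact aggregation used for the figure. The attribution values are made up and the function names are illustrative.

```python
def integrate_abs(times, values):
    """Trapezoidal integral of |values| over times."""
    total = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        total += 0.5 * width * (abs(values[i]) + abs(values[i - 1]))
    return total


def global_importance(times, attributions):
    """Mean over patients of the time-integrated absolute attribution.

    attributions[feature] is a list of per-patient curves phi(t).
    """
    return {
        feature: sum(integrate_abs(times, curve) for curve in curves) / len(curves)
        for feature, curves in attributions.items()
    }


times = [0, 12, 36, 60]  # months
attributions = {  # made-up numbers, not from the paper
    "LNR": [[0.0, -0.10, -0.06, -0.03], [0.0, 0.08, 0.05, 0.02]],
    "age": [[0.0, 0.01, 0.03, 0.05], [0.0, -0.01, -0.04, -0.06]],
}
ranking = sorted(global_importance(times, attributions).items(),
                 key=lambda kv: kv[1], reverse=True)
print(ranking)
```

Taking absolute values before integrating keeps contributions of opposite sign, at different times or in different patients, from cancelling out.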
3.5
Local Explanation
SurvSHAP(t) also enables individualized explanations for specific patients. As shown in Figure 9A, we selected patient #5581, whose risk score was at the median of the training set, for evaluation. This patient was a 56‐year‐old White individual with a 4 cm tumor in the cardia, pathologically staged as pT4aN1M0 (Stage IIIA), who underwent partial gastrectomy accompanied by chemoradiotherapy. The LNR was 0.09 and the RNE was 11. The vertical axis represents SurvSHAP(t) values, reflecting the relative contribution of each variable to survival prediction. Positive values denote an elevated survival probability, while negative values suggest a reduced survival probability. For this patient, a relatively lower LNR, younger age, and receipt of chemoradiotherapy contributed to improved survival, whereas advanced T and AJCC staging, along with inadequate lymph node examination, were associated with a reduced survival probability.
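The paper defines LNR as the number of positive lymph nodes divided by the total number of lymph nodes examined. The reported values for this patient (LNR 0.09, RNE 11) are consistent with one positive node among 11 examined. That count is an inference from the two numbers, not a figure stated in the text. The helper below is illustrative.

```python
def lymph_node_ratio(positive_nodes, examined_nodes):
    """LNR = positive lymph nodes / total lymph nodes examined."""
    if examined_nodes <= 0:
        raise ValueError("at least one lymph node must be examined")
    if not 0 <= positive_nodes <= examined_nodes:
        raise ValueError("positive nodes must lie between 0 and examined nodes")
    return positive_nodes / examined_nodes


# Assumed count: 1 positive node out of the 11 examined.
lnr = lymph_node_ratio(1, 11)
print(round(lnr, 2))
```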
SurvLIME, an adaptation of the LIME algorithm specifically for survival models, interprets individual predictions by locally perturbing input data and approximating the black‐box model behavior with a simpler linear model [25]. As shown in Figure 9B, the left panel displays variable contributions to this patient's survival prediction, with red bars denoting positive impacts and green bars denoting negative effects. Chemotherapy, age, LNR, and radiotherapy were the strongest positive predictors, while AJCC stage, T stage, and RNE were the strongest negative predictors, consistent with the SurvSHAP(t) results. The right panel of Figure 9B shows that the approximation model closely aligns with the black‐box model, enhancing the reliability of its interpretation.
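The perturb-and-approximate idea can be sketched in a few lines. This is a simplification rather than the published SurvLIME algorithm, which fits a Cox-type surrogate across time points. Here a weighted linear regression is fitted to the log cumulative hazard at a single time, the black-box model is a toy function, and all names and parameters are hypothetical.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def local_surrogate(log_cum_hazard, x0, t, n_samples=500, scale=0.3, seed=0):
    """Weighted linear fit of log H(t | x) around the point x0.

    Returns [intercept, coef_1, ..., coef_d].
    """
    rng = random.Random(seed)
    d = len(x0)
    XtWX = [[0.0] * (d + 1) for _ in range(d + 1)]
    XtWy = [0.0] * (d + 1)
    for _ in range(n_samples):
        x = [v + rng.gauss(0.0, scale) for v in x0]  # perturbed neighbour
        dist2 = sum((a - b) ** 2 for a, b in zip(x, x0))
        w = math.exp(-dist2 / (2 * scale ** 2))  # closer samples weigh more
        row = [1.0] + x
        y = log_cum_hazard(x, t)
        for i in range(d + 1):
            XtWy[i] += w * row[i] * y
            for j in range(d + 1):
                XtWX[i][j] += w * row[i] * row[j]
    return solve(XtWX, XtWy)


def black_box(x, t):
    """Toy nonlinear model: log cumulative hazard with an interaction term."""
    return math.log(0.05 * t) + 1.2 * x[0] - 0.7 * x[1] + 0.5 * x[0] * x[1]


coefs = local_surrogate(black_box, x0=[0.4, -0.2], t=36.0)
print(coefs)
```

The fitted coefficients approximate the local slopes of the black-box model around the chosen patient. Checking how closely the surrogate tracks the black-box predictions, as described for the right panel of Figure 9B, indicates whether those coefficients can be trusted.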
3.6
The Individual Prognostic Prediction
An interactive prognostic prediction tool based on RSF was developed to provide personalized postoperative survival estimates for LAGC patients (Figure 10). The interface comprises two main sections: (1) a user input panel for entering clinical variables and (2) a results panel displaying survival probabilities and survival curves. The tool is available on GitHub (https://github.com/GZJ0526/LAGC). To use it, download the required files from the repository, launch the application, enter a patient's information, and click “Predict” to obtain the 1‐, 3‐, and 5‐year survival probabilities and the survival curve.
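Fixed-horizon probabilities such as the 1‐, 3‐, and 5‐year values can be read from a predicted step survival curve, as in the sketch below. This is not the tool's code. The curve values and the function name are hypothetical, and time is assumed to be measured in months.

```python
from bisect import bisect_right


def survival_at(event_times, survival_values, t):
    """Evaluate a right-continuous step survival curve at time t.

    event_times must be sorted ascending; survival_values[i] is S(t) from
    event_times[i] until the next event time. S(t) = 1 before the first one.
    """
    idx = bisect_right(event_times, t)
    return 1.0 if idx == 0 else survival_values[idx - 1]


# Hypothetical predicted curve: time in months and survival probability.
curve_times = [3, 9, 14, 27, 40, 58, 71]
curve_surv = [0.97, 0.91, 0.84, 0.70, 0.61, 0.52, 0.45]
report = {f"{years}-year": survival_at(curve_times, curve_surv, 12 * years)
          for years in (1, 3, 5)}
print(report)
```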
4
Discussion
LAGC is characterized by deep tumor infiltration, a high incidence of regional lymph node metastasis, and significant biological heterogeneity, resulting in inferior postoperative survival outcomes compared to early‐stage gastric cancer [8]. Accurate prognostic prediction in these patients is crucial for guiding clinical management. In fact, multiple prognostic models have been developed for predicting survival outcomes in patients with LAGC following surgery. For example, Sun et al. created a nomogram to predict survival in elderly patients with LAGC, achieving C‐indices of 0.687 and 0.713 in internal and external validation sets, respectively [11]. Yu et al. constructed a nomogram to predict survival in patients with locally advanced gastric signet ring cell carcinoma, reporting 1‐, 3‐, and 5‐year AUCs of 0.704, 0.759, and 0.767 in the validation set [13]. Although these nomogram‐based models incorporate multiple prognostic factors, they are constrained by the assumptions of linear models, limiting their ability to capture complex and dynamic interactions among clinical variables. Furthermore, traditional models rely on static predictions and cannot account for time‐dependent risks, reducing their utility in individualized and dynamic prognostic assessment.
With advancements in artificial intelligence, machine‐learning methods are increasingly applied to diagnostic classification and survival prognostication. Using data from 23,867 patients with colorectal cancer in the SEER program, Qiu et al. developed an XGBoost model to predict distant metastasis, achieving an AUC of 0.814 in an external validation cohort [26]. Zhao et al. leveraged the Survival Quilts framework to estimate post‐gastrectomy survival across multiple horizons (6 months to 10 years), with external‐validation C‐index ranging from 0.691 to 0.756 [27]. Beyond outcome prediction, machine learning also helps highlight key drivers of clinical outcomes. For example, in a random forest model predicting prolonged ICU length of stay after coronary artery bypass grafting, Jafarkhani et al. identified the duration of endotracheal intubation, body mass index, age, and operative time as the most influential predictors [28]. Likewise, a systematic review of prostate cancer studies highlighted age, prostate‐specific antigen (PSA), total PSA, free PSA, and PSA density as key risk indicators [29]. These studies can further help clinicians identify the clinical features most critical to disease progression, thereby guiding the optimization of treatment. However, machine learning models specifically designed for LAGC patients remain scarce. In this research, we utilized a multi‐method modeling technique that included (i) traditional statistical models (CoxPH), (ii) three major ML frameworks (RSF, XGBoost, and GBM), and (iii) deep learning architectures (DeepSurv). This comprehensive strategy systematically covers conventional statistical methodologies, state‐of‐the‐art ML techniques, and advanced neural network approaches. By comparing the performance of these models across internal and external validation datasets, RSF emerged as the optimal predictive model. 
RSF, an ensemble model composed of multiple survival trees, employs random feature selection and node‐splitting optimization strategies, enabling the capture of complex nonlinear effects without predefined risk function assumptions. RSF has been extensively utilized across multiple disease areas owing to its strong predictive efficacy [20, 30, 31].
Despite the strong predictive power of ML models, their “black‐box” nature presents a substantial obstacle to clinical use, as the lack of interpretability prevents clinicians from verifying the biological plausibility of model decisions. Even when models achieve excellent performance, their clinical utility remains limited if they do not provide interpretable decision rules or feature contribution rankings [32]. Efforts have been made to address this challenge through post hoc explainability frameworks such as LIME [33] and SHAP [34], which enhance interpretability while preserving model performance by utilizing local linear approximations or game‐theoretic feature attributions.
After identifying RSF as the best‐performing model, we conducted a multidimensional explainability analysis to examine the contributions and temporal dynamics of key prognostic variables. Among these variables, LNR emerged as the most important prognostic factor. LNR is a composite measure of both tumor lymphatic burden and the quality of lymphadenectomy. In our study, SHAP value distributions indicated that high LNR values were associated with a markedly higher risk of mortality. Prior research has likewise shown that higher LNR values are strongly correlated with shorter survival in gastric cancer patients [35, 36, 37]. Additionally, our analysis revealed the time‐dependent nature of LNR's prognostic impact, with its influence being most pronounced in the early postoperative period but gradually diminishing over time. Mechanistically, a high LNR probably marks residual tumor burden and micrometastatic spread, which drive early relapse and cancer‐related death following gastrectomy [38, 39]. At the same time, patients with a high LNR who remain event‐free beyond the early postoperative period represent a selected subgroup with lower residual disease, in whom the later association between LNR and survival is attenuated. AJCC stage and T stage also exhibited significant prognostic value in the model, consistent with the consensus that the AJCC staging system remains the most widely used standard for evaluating prognosis in gastric cancer [40]. Notably, AJCC stage was significant in univariable analysis but was not retained in the multivariable Cox model. The most plausible explanation is collinearity with T and N stages, together with the limited flexibility of the standard Cox model to capture nonlinearity or interactions, which leaves AJCC stage with little incremental prognostic value in that model.
Age also emerged as a key prognostic factor, in line with prior research that has established advanced age as an independent prognostic factor in gastric cancer [12, 41]. Notably, age also exhibited a distinct time‐dependent effect. The increasing prognostic influence of age over time likely reflects growing competing risks in older patients. As cardiovascular disease, infections, accidental injuries, and other age‐related comorbidities accumulate during follow‐up, non‐cancer mortality contributes more to late deaths while tumor‐related hazards decline.
Regarding treatment, although the optimal timing and modality of chemoradiotherapy for gastric cancer remain controversial, its overall efficacy in improving patient survival has been well established by numerous studies [1, 42, 43, 44, 45]. Consistent with this, our model identified chemoradiotherapy as a significant prognostic factor. Chemotherapy and radiotherapy exert their greatest influence in the early postoperative years by lowering the near‐term risk of recurrence. Patients who remain event‐free beyond this period form a selected subgroup with lower residual disease, so the treatment signal attenuates during later follow‐up. However, due to inherent limitations of the SEER database, key treatment‐related details (such as chemotherapy cycles and specific drug regimens) were unavailable, and the relatively small number of cases with complete radiotherapy data further constrained our ability to perform a more detailed treatment stratification analysis. Furthermore, categorizing untreated patients together with those whose treatment status was unknown might have introduced systematic bias into the evaluation of therapeutic efficacy, potentially resulting in either underestimation or overestimation of the actual impact of chemoradiotherapy on clinical outcomes. Additional research is required to refine the prognostic role of chemoradiotherapy.
The interactive prognostic prediction tool puts the model into practice. Using routine postoperative data, the RSF tool produces a patient‐specific survival curve and 1‐, 3‐, and 5‐year survival probabilities. Clinicians can use these estimates to plan treatment and follow‐up schedules. High‐risk patients can then receive closer monitoring and more frequent visits, which may improve survival.
This study has several limitations. First, because SEER does not include important features such as vascular invasion, perineural invasion, and molecular markers, the model lacks these predictors, which can lower discrimination, introduce miscalibration in specific subgroups, and weaken transportability across centers. Second, the external validation cohort was retrospective, single‐center, and relatively small, which limits the demonstration of generalizability. The model's robustness may be affected by unavoidable selection bias or residual overfitting. In the future, we plan to build a large multicenter prospective cohort with comprehensive molecular, pathologic, and treatment information to support model development and validation.
Conclusions
In conclusion, this study developed five ML‐based models using SEER and single‐center Chinese data to predict postoperative survival in LAGC patients, identifying RSF as the best‐performing model in both validation and external validation cohorts. Through global and local explainability analyses, we identified LNR, AJCC stage, and age as key prognostic drivers and elucidated their time‐dependent effects. The RSF‐based interactive prediction tool facilitates individualized prognosis visualization, offering an accurate and practical decision‐support system for dynamic clinical management, with the potential to optimize risk stratification strategies in gastric cancer.
Author Contributions
Zhijie Gong: data curation, formal analysis, project administration, writing – original draft, writing – review and editing. Liping Zhou: data curation, formal analysis, project administration, writing – review and editing, writing – original draft. Yinghao He: data curation, methodology. Yanjie Deng: data curation. Jun Zhou: data curation. Weiwei Wang: data curation. Qiangbang Yang: data curation. Jian Pan: data curation. Yingze Li: data curation. Xiaolu Yuan: project administration, writing – review and editing. Minghui Ma: project administration, writing – review and editing.
Ethics Statement
The study was approved by the Ethics Committee of Maoming People's Hospital (approval number: PJ2025MI‐K038‐01). Informed consent to participate was not applicable for this retrospective study.
Conflicts of Interest
The authors declare no conflicts of interest.
Supporting information
Data S1: Supporting Information.
Source: PubMed Central (JATS). The license follows the original publisher's policy; please credit the original article when citing.