An interpretable machine learning model for predicting distant organ metastasis after radical resection of colorectal cancer.
- Study design: cohort study
APA
Weibin L, Weixiang N, et al. (2026). An interpretable machine learning model for predicting distant organ metastasis after radical resection of colorectal cancer. Frontiers in Oncology, 16, 1764032. https://doi.org/10.3389/fonc.2026.1764032
MLA
Weibin L, et al. "An interpretable machine learning model for predicting distant organ metastasis after radical resection of colorectal cancer." Frontiers in Oncology, vol. 16, 2026, article 1764032.
PMID
41815553
Abstract
[OBJECTIVE] Distant organ metastasis remains the primary factor affecting long-term survival following radical surgery for colorectal cancer (CRC). This study aimed to develop and validate an interpretable machine learning (ML) model to predict the 5-year cumulative risk of distant metastasis after radical CRC surgery.
[METHODS] A retrospective observational cohort study was conducted using clinical and follow-up data from 341 CRC patients who underwent radical surgery. The cohort was randomly divided into a training set and a validation set at a 7:3 ratio. Feature selection was performed using least absolute shrinkage and selection operator (LASSO) regression, identifying variables associated with the 5-year cumulative occurrence of metastasis. Prediction models were constructed using seven algorithms. Model performance was evaluated through multiple metrics: area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, F1 score, calibration plots, and decision curve analysis. The SHapley Additive exPlanations (SHAP) method was applied to improve model interpretability.
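The workflow described in the methods (7:3 split, LASSO feature selection, an SVM classifier, AUC on the validation set) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' code. The 30 candidate predictors, the stratified split, the RBF kernel, the sigmoid (Platt) calibration used to obtain probabilities, and all variable names are assumptions. Plain linear LASSO (`LassoCV`) on the 0/1 outcome stands in for whichever LASSO variant the study used, and the tenfold cross-validation follows the results paragraph.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LassoCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the cohort: 341 "patients", 30 hypothetical predictors.
X, y = make_classification(n_samples=341, n_features=30, n_informative=8,
                           n_redundant=0, random_state=0)

# Random 7:3 split into training and validation sets
# (stratification by outcome is an assumption, not stated in the abstract).
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Standardise with training-set statistics only, so the validation set
# does not leak into preprocessing.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)

# LASSO with tenfold cross-validation; features with a nonzero
# coefficient at the selected penalty are kept.
lasso = LassoCV(cv=10, random_state=0).fit(X_train_s, y_train)
selected = np.flatnonzero(lasso.coef_)

# SVM on the selected features. SVC outputs decision scores, so a sigmoid
# calibration layer is wrapped around it to get predicted risks.
svm = CalibratedClassifierCV(SVC(kernel="rbf"), method="sigmoid", cv=5)
svm.fit(X_train_s[:, selected], y_train)

# Predicted risk and discrimination (AUC) on the held-out validation set.
risk_val = svm.predict_proba(X_val_s[:, selected])[:, 1]
auc_val = roc_auc_score(y_val, risk_val)
print(f"{len(selected)} features selected, validation AUC = {auc_val:.3f}")
```

Because the data here are simulated, the number of selected features and the AUC printed by this sketch have no relation to the values reported in the study.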
[RESULTS] LASSO combined with tenfold cross-validation selected 11 key features for model development. Among the models tested, the SVM model demonstrated superior performance, achieving a Brier score of 0.144 and an AUC of 0.865 in the validation set. Calibration and clinical decision curves confirmed the SVM model's strong calibration and clinical applicability. The SHAP dependence plots and force analysis provided explanations at both feature and individual patient levels for the model's 5-year risk predictions.
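For readers less familiar with the reported metrics, the sketch below shows how a Brier score, an AUC, and the net benefit used in decision curve analysis are commonly computed from predicted risks. The four outcomes and risks are made-up toy values, not study data, and the helper names `brier_score` and `net_benefit` are hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy example: observed outcomes (1 = metastasis) and predicted risks.
y_true = np.array([0, 0, 1, 1])
risk = np.array([0.10, 0.40, 0.35, 0.80])


def brier_score(y, p):
    """Mean squared difference between predicted risk and outcome (lower is better)."""
    return float(np.mean((p - y) ** 2))


def net_benefit(y, p, threshold):
    """Net benefit at a risk threshold, as used in decision curve analysis:
    TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(y)
    treated = p >= threshold
    tp = np.sum(treated & (y == 1))
    fp = np.sum(treated & (y == 0))
    return float(tp / n - fp / n * threshold / (1 - threshold))


brier = brier_score(y_true, risk)
auc = roc_auc_score(y_true, risk)
nb_model = net_benefit(y_true, risk, threshold=0.3)
# "Treat all" reference strategy: everyone is classified as high risk.
nb_all = net_benefit(y_true, np.ones_like(risk), threshold=0.3)

print(f"Brier = {brier:.3f}, AUC = {auc:.2f}")
print(f"Net benefit at 0.3: model = {nb_model:.3f}, treat-all = {nb_all:.3f}")
```

A decision curve repeats the net-benefit calculation over a range of thresholds and compares the model against the treat-all and treat-none (net benefit 0) strategies.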
[CONCLUSION] This study established an accurate and interpretable ML model for predicting the 5-year cumulative risk of distant organ metastasis after radical colorectal cancer surgery. External validation is still needed to confirm its clinical utility.