Explainable AI for colorectal cancer mortality and risk factor prediction in Korea: A nationwide cancer cohort study.
Cohort
PICO automatic extraction (heuristic, conf 2/4)
P · Population (target patients / population)
A total of 9,069 patients with CRC were analyzed for all-cause mortality (1,878 deaths) and 8,589 patients for CRC-specific mortality (1,398 deaths).
I · Intervention (intervention / procedure)
Not extracted
C · Comparison (control / comparison)
Not extracted
O · Outcome (results / conclusions)
[CONCLUSIONS] We developed the first interpretable machine learning model that accurately predicts CRC survival in a nationwide Korean cohort. Age-specific risk factors identified by SHAP not only support personalized care but also advance the application of precision oncology in Asian settings.
[BACKGROUND] Colorectal cancer (CRC) prognosis varies significantly, yet conventional statistical models struggle to capture the complex, non-linear interactions among clinical variables.
- 95% CI 0.80-0.85
- Study design: cohort study
APA
Park SW, Yeo NY, et al. (2026). Explainable AI for colorectal cancer mortality and risk factor prediction in Korea: A nationwide cancer cohort study. International Journal of Medical Informatics, 205, 106125. https://doi.org/10.1016/j.ijmedinf.2025.106125
MLA
Park SW, et al. "Explainable AI for colorectal cancer mortality and risk factor prediction in Korea: A nationwide cancer cohort study." International Journal of Medical Informatics, vol. 205, 2026, article 106125.
PMID
41066920
Abstract
[BACKGROUND] Colorectal cancer (CRC) prognosis varies significantly, yet conventional statistical models struggle to capture the complex, non-linear interactions among clinical variables. Furthermore, most predictive models are based on Western populations, limiting their applicability to Korean patients. This study aimed to develop an explainable AI (XAI) model for CRC mortality prediction using a nationwide Korean cohort to provide clinically actionable insights.
[METHODS] We conducted a retrospective cohort study using the Korean Cancer Public Library Database. A total of 9,069 patients with CRC were analyzed for all-cause mortality (1,878 deaths) and 8,589 patients for CRC-specific mortality (1,398 deaths). Four ML algorithms (support vector machine, random forest, XGBoost, and LightGBM) were constructed. We employed explainable AI techniques, including SHapley Additive exPlanations (SHAP), to quantify the contribution of each predictor and ensure model interpretability.
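As a rough illustration of what SHAP quantifies, the sketch below computes exact Shapley values for a single prediction of a small toy function. Everything in it is an assumption made for illustration: `shapley_values`, `toy_risk`, the feature values, and the single-row baseline are hypothetical and are not the study's model, data, or code. The SHAP library itself typically uses faster approximations and a background dataset rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for instance x relative to a baseline row.

    Features outside a coalition are set to their baseline value.
    The cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy risk score with one interaction term (NOT the paper's model).
def toy_risk(z):
    stage, cea, age = z
    return 0.5 * stage + 0.2 * cea + 0.1 * stage * age


x = [3.0, 2.0, 1.0]          # hypothetical patient
baseline = [1.0, 0.0, 0.0]   # hypothetical reference row
phi = shapley_values(toy_risk, x, baseline)

# Additivity: the contributions sum to f(x) - f(baseline).
total = sum(phi)
gap = toy_risk(x) - toy_risk(baseline)
print(phi, total, gap)
```

The additivity check at the end is the property that lets per-feature contributions be read as a decomposition of one prediction.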
[RESULTS] All models showed good discrimination (AUC: 0.82-0.94). LightGBM was presented as the best-optimized model with an AUC of 0.824 [95% CI 0.80-0.85] in all-cause mortality. For CRC-specific mortality, LightGBM again yielded an AUC of 0.867 [95% CI 0.84-0.89]. SHAP revealed tumor stage and carcinoembryonic antigen as top mortality predictors across ages. Metabolic markers (e.g., hypertension, cholesterol) and liver enzymes were more predictive in younger patients.
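The abstract does not say how the 95% confidence intervals around the AUC were obtained. One common approach, shown here purely as an assumption, is a percentile bootstrap over patients. The helper names `auc` and `bootstrap_ci` and the synthetic labels and scores are hypothetical and unrelated to the study data.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as 0.5. Labels must be 0 or 1.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=300, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic example: about 20% events, scores shifted upward for events.
rng = random.Random(42)
labels = [1 if rng.random() < 0.2 else 0 for _ in range(200)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

point = auc(labels, scores)
low, high = bootstrap_ci(labels, scores)
print(f"AUC {point:.3f} [95% CI {low:.2f}-{high:.2f}]")
```

The pairwise AUC here is quadratic in sample size, which is acceptable for a sketch but would be replaced by a rank-based computation for a cohort of several thousand patients.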
[CONCLUSIONS] We developed the first interpretable machine learning model that accurately predicts CRC survival in a nationwide Korean cohort. Age-specific risk factors identified by SHAP not only support personalized care but also advance the application of precision oncology in Asian settings.
MeSH Terms
Humans; Republic of Korea; Colorectal Neoplasms; Male; Female; Risk Factors; Middle Aged; Retrospective Studies; Aged; Prognosis; Artificial Intelligence; Algorithms; Cohort Studies; Adult