Machine learning application in colon cancer treatment outcome prediction.
APA
Ghasemi H, Hosseini SV, et al. (2026). Machine learning application in colon cancer treatment outcome prediction. Scientific Reports, 16(1), 6159. https://doi.org/10.1038/s41598-026-36917-0
MLA
Ghasemi H, et al. "Machine learning application in colon cancer treatment outcome prediction." Scientific Reports, vol. 16, no. 1, 2026, article 6159.
PMID
41580567
Abstract
Colon cancer represents a significant global health burden, accounting for a substantial portion of cancer-related morbidity and mortality worldwide. Many studies have sought to predict survival outcomes, but most have relied on basic statistical methods. The aim of this study was to apply machine learning techniques to build models for survival prediction in patients with colon cancer. A retrospective review of 764 colon cancer patients treated over a 10-year period yielded a detailed dataset containing 44 predictor variables and one dependent variable, the survival status of the patient (alive or dead). The data were randomly split into training (80%) and testing (20%) sets. Prognostic features from the database were used to train machine learning models, including random forest, logistic regression, XGBoost, gradient boosting, categorical boosting (CatBoost), light gradient boosting machine (LightGBM), multilayer perceptron (MLP), and one-dimensional convolutional neural network (1D-CNN), to predict progressive disease outcomes. Models were evaluated for accuracy, sensitivity, and specificity, with predictive ability assessed by receiver operating characteristic (ROC) curves and area under the curve (AUC) calculations. The algorithms produced broadly similar accuracy and precision; among the evaluated models, CatBoost achieved the highest accuracy (0.813), random forest the best precision (0.727), and logistic regression the highest recall (0.658) on the test set. The random forest algorithm also achieved the highest AUC (0.83), indicating the best overall trade-off between sensitivity and specificity among the models compared.
In summary, this research highlights the potential of machine learning models to support personalized and timely interventions for colon cancer patients, ultimately aiming to improve patient care and outcomes.
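The evaluation steps named in the abstract (a random 80/20 split, then accuracy, precision, recall/sensitivity, specificity, and AUC on the held-out set) can be illustrated with a minimal, standard-library-only Python sketch. This is not the authors' code: the function names, the fixed seed, the 0.5 decision threshold, the label encoding (1 for the positive class), and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the study.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of `rows` and return (train, test).

    The seed and the rounding rule are illustrative assumptions.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall (sensitivity), and specificity
    for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Report 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative one (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example with invented labels and model scores (not study data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)

# Splitting 764 placeholder patient indices 80/20, as in the abstract.
train_ids, test_ids = train_test_split(range(764), test_fraction=0.2)
```

Because accuracy, precision, and recall depend on the chosen threshold while AUC does not, different models can lead on different metrics, which is consistent with the pattern the abstract reports for CatBoost, random forest, and logistic regression.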
[SUPPLEMENTARY INFORMATION] The online version contains supplementary material available at 10.1038/s41598-026-36917-0.