Machine learning approaches for predicting progression in hormone-sensitive prostate cancer patients.
APA
Zhu B, Jiang H, et al. (2026). Machine learning approaches for predicting progression in hormone-sensitive prostate cancer patients. Frontiers in Oncology, 16, 1704671. https://doi.org/10.3389/fonc.2026.1704671
MLA
Zhu B, et al. "Machine learning approaches for predicting progression in hormone-sensitive prostate cancer patients." Frontiers in Oncology, vol. 16, 2026, pp. 1704671.
PMID
41768252
Abstract
[OBJECTIVE] Almost all hormone-sensitive prostate cancer (HSPC) cases eventually progress to castration-resistant prostate cancer (CRPC) following androgen deprivation therapy (ADT). This study aims to develop a machine learning (ML) model to predict the progression of HSPC patients. Additionally, we conducted statistical analyses on the dataset to identify significant features and clinical markers predictive of HSPC transitioning to CRPC.
[METHODS] Data from 410 HSPC patients treated at Yunnan Cancer Hospital between 1 January 2017 and 31 May 2022 were analyzed. Predictive analyses were performed on features recorded at the patients' initial visits. The ML methods employed were decision tree (DT), random forest (RF), XGBoost, artificial neural network (ANN), and support vector machine (SVM). Feature selection was conducted using a genetic algorithm (GA). The models were trained on 80% of the data and validated on the remaining 20%, held out as a test set. Model fit and calibration were assessed using the area under the ROC curve (AUC), calibration plots, and learning curves. Evaluation metrics included accuracy (ACC), precision (PRE), specificity (SPE), sensitivity (SEN), and F1 score.
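As an illustration of the evaluation metrics named above, the following minimal sketch computes ACC, PRE, SPE, SEN, and F1 from binary labels using their standard confusion-matrix definitions. It is not the authors' code: the function name `binary_metrics`, the label convention (1 = progressed to CRPC, 0 = did not progress), and the toy labels are assumptions made for the example only.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    specificity: float
    sensitivity: float
    f1: float


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred):
    """Compute ACC, PRE, SPE, SEN and F1 from 0/1 labels (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)  # also called recall
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, len(pairs)),
        precision=precision,
        specificity=_ratio(tn, tn + fp),
        sensitivity=sensitivity,
        f1=_ratio(2 * precision * sensitivity, precision + sensitivity),
    )


# Toy labels for illustration only; these are not study data.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
toy_metrics = binary_metrics(toy_true, toy_pred)
print(toy_metrics)
```

Reporting specificity alongside sensitivity matters when the two classes are imbalanced, because accuracy alone can look high for a model that rarely predicts progression.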
[RESULTS] Evaluation metrics were visualized using confusion matrices and ROC curves. Ensemble learning methods, particularly RF and XGBoost, demonstrated the best model performance. RF achieved a score of 0.838 (95% CI: 0.8324-0.902) on the training dataset and 0.817 (95% CI: 0.659-0.829) on the test dataset (AUC: 0.873, 95% CI: 0.730-0.878). XGBoost achieved a score of 0.814 (95% CI: 0.790-0.878) on the training dataset and 0.805 (95% CI: 0.707-0.829) on the test dataset (AUC: 0.866, 95% CI: 0.780-0.871). Calibration curves indicated good model calibration, and learning curves for the training and test sets suggested no significant overfitting.
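The abstract does not state how the 95% confidence intervals were obtained. One common approach, shown here purely as an assumption, is a percentile bootstrap over the test cases. The sketch below computes the AUC as the fraction of positive/negative pairs that the model ranks correctly (ties count one half) and then resamples cases with replacement. The names `roc_auc` and `bootstrap_auc_ci`, the defaults, and the toy data are hypothetical.

```python
import random


def roc_auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not the paper's)."""
    roc_auc(y_true, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; draw again
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy scores for illustration only; these are not study data.
toy_true = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_true, toy_scores))
print(bootstrap_auc_ci(toy_true, toy_scores, n_boot=200))
```

With a 20% test split of 410 patients (roughly 82 cases), a bootstrap interval of this kind can be fairly wide, which fits the authors' call for larger, multi-center validation.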
[CONCLUSION] Our findings demonstrate that ensemble learning methods, particularly RF, exhibit superior performance in predicting HSPC progression. This study represents a preliminary step toward a predictive tool, highlighting the potential of baseline clinical data for risk stratification. Future prospective studies with larger, multi-center cohorts are warranted to validate and refine this approach for possible clinical integration.
Highly cited papers by the same first author (5)
- Synergistic Activation of Immunogenic Cell Death and the cGAS-STING Pathway by Engineered Zinc/Manganese-Based Metal-Organic Framework Nanoplatforms for Colon Cancer Immunotherapy.
- Repression of FOSL1 augments ferroptosis to overcome oxaliplatin resistance in colorectal cancer by acting on SRSF2.
- Case report: tigecycline-induced procalcitonin elevation.
- Causal association between oxidative stress and lymphomas: A two-sample Mendelian randomization study.
- Mass spectrometry-based multi-omics analysis elucidates immune microenvironmental characteristics and the risk of distant metastasis in N1c colorectal cancer.