Interpretable machine learning model using CT body composition combined with inflammatory and nutritional indicators to predict pathological complete response after neoadjuvant therapy in breast cancer: a retrospective study.
APA
Zhong L, Zeng Q, et al. (2026). Interpretable machine learning model using CT body composition combined with inflammatory and nutritional indicators to predict pathological complete response after neoadjuvant therapy in breast cancer: a retrospective study. PeerJ, 14, e21051. https://doi.org/10.7717/peerj.21051
MLA
Zhong L, et al. "Interpretable machine learning model using CT body composition combined with inflammatory and nutritional indicators to predict pathological complete response after neoadjuvant therapy in breast cancer: a retrospective study." PeerJ, vol. 14, 2026, pp. e21051.
PMID
41940388
Abstract
[OBJECTIVE] Accurate prediction of pathological complete response (pCR) following neoadjuvant therapy (NAT) is critical for optimizing treatment in breast cancer. This study develops and validates an interpretable, cost-effective machine learning (ML) model integrating computed tomography (CT)-based body composition parameters with routine inflammatory and nutritional biomarkers to predict pCR.
[METHODS] In this retrospective single-center study (n = 189; January 2019-June 2023), patients were divided into training (n = 142) and independent temporal test (n = 47) sets. CT-based body composition parameters and blood test variables were analyzed. Independent predictors were identified using Least Absolute Shrinkage and Selection Operator (LASSO) regression followed by multivariate logistic regression. Eight ML algorithms were compared, and the optimal model was selected based on Area Under the Curve (AUC), calibration, and clinical utility. SHapley Additive exPlanations (SHAP) analysis was used to visualize each predictor's contribution.
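As an illustration of the AUC criterion used for model selection, the following is a minimal sketch that computes the AUC as the probability that a randomly chosen pCR case receives a higher predicted score than a randomly chosen non-pCR case, together with a percentile bootstrap confidence interval. The abstract does not state how the authors computed their intervals, so the bootstrap approach, the function names `auc` and `bootstrap_auc_ci`, and the toy data are assumptions made only for this example.

```python
import random


def auc(y_true, y_score):
    """AUC via pairwise comparison: P(score of a positive > score of a negative).

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_y = [y_true[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if len(set(resampled_y)) < 2:
            continue
        stats.append(auc(resampled_y, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy example (hypothetical labels and scores, not study data).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))  # 3 of 4 positive/negative pairs are ranked correctly
print(bootstrap_auc_ci(labels, scores, n_boot=200))
```

With a test set as small as the one described here, the bootstrap interval is expected to be wide, which is consistent with the broader interval reported for the independent test set than for internal validation.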
[RESULTS] Six independent predictors were identified: visceral adipose tissue density, skeletal muscle density, intramuscular adipose tissue content, albumin-to-alkaline phosphatase ratio, systemic inflammation response index, and molecular subtype. The eXtreme Gradient Boosting (XGBoost) model performed best, achieving an AUC of 0.888 (95% CI [0.837-0.939]) in internal validation and 0.831 (95% CI [0.723-0.938]) in the independent test set. The model exhibited good calibration (Brier score = 0.180). SHAP analysis highlighted the contribution of host-related factors alongside tumor biology.
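To make two of the blood-based predictors and the calibration metric concrete, here is a minimal sketch of how they are conventionally computed. The abstract does not give formulas, so the definitions below (systemic inflammation response index as neutrophils × monocytes / lymphocytes, albumin-to-alkaline phosphatase ratio as albumin / ALP, Brier score as the mean squared difference between predicted probability and outcome) are the commonly used ones and are assumed here; function names, units, and sample values are hypothetical.

```python
def siri(neutrophils, monocytes, lymphocytes):
    """Systemic inflammation response index (assumed conventional definition).

    All three inputs are absolute cell counts in the same unit (e.g. 10^9/L).
    """
    return neutrophils * monocytes / lymphocytes


def aapr(albumin, alkaline_phosphatase):
    """Albumin-to-alkaline phosphatase ratio (assumed: albumin in g/L, ALP in U/L)."""
    return albumin / alkaline_phosphatase


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    0 is perfect; a constant prediction of 0.5 always scores 0.25.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical values for illustration only (not study data).
print(siri(4.0, 0.5, 2.0))                 # 1.0
print(aapr(40.0, 80.0))                    # 0.5
print(brier_score([0, 1], [0.5, 0.5]))     # 0.25
```

Read against this scale, the reported Brier score of 0.180 is lower than the 0.25 obtained by always predicting 0.5, although how favorable it is also depends on the pCR rate in the cohort.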
[CONCLUSIONS] This interpretable ML model effectively integrates host-related body composition and inflammatory-nutritional markers to predict pCR. By utilizing routinely available data, this approach offers a practical, accessible tool for initial risk stratification, complementing existing imaging-based strategies and supporting personalized clinical decision-making.
MeSH Terms
Humans; Female; Retrospective Studies; Neoadjuvant Therapy; Body Composition; Middle Aged; Breast Neoplasms; Machine Learning; Tomography, X-Ray Computed; Adult; Inflammation; Aged
Highly cited papers by the same first author (5)
- Leukocyte-Hitchhiking Nanomedicine for Sensitized Ferroptosis Therapy.
- A novel multimodal combining radiomics and tumor-stroma ratio (TSR) improves diagnosis of gastric cancer peritoneal metastasis.
- Association between periodontitis and hepatocellular carcinoma across different tumor stages.
- A multi-omics analysis integrating mendelian randomization, brain functional connectivity, and transcriptomics to explore risk-associated features in non-small cell lung cancer.
- Allogeneic Whole Eye Transplantation in Macaques Achieves 19-Day Graft Survival With Structural and Functional Viability.