Development and validation of an Interpretable Machine learning model for Discriminating between benign and malignant breast cancer.
- Sample size (n): 745
APA
Wang Z, Liu W, et al. (2026). Development and validation of an Interpretable Machine learning model for Discriminating between benign and malignant breast cancer. International journal of medical informatics, 210, 106300. https://doi.org/10.1016/j.ijmedinf.2026.106300
MLA
Wang Z, et al. "Development and validation of an Interpretable Machine learning model for Discriminating between benign and malignant breast cancer." International journal of medical informatics, vol. 210, 2026, article 106300.
PMID
41592419
Abstract
[OBJECTIVE] Breast cancer prognosis depends on early detection. We developed and externally validated a model using routine, readily available clinical and laboratory variables to discriminate malignant from benign breast lesions, aiming to reduce unnecessary biopsies and support early decision-making.
[METHODS] This retrospective two-center study included a development cohort (cohort 1) from Jiujiang First People's Hospital (N = 745; malignant 573, benign 172) and an external cohort (cohort 2) from the First Affiliated Hospital of Nanchang University (N = 221; malignant 161, benign 60). Cohort 1 was randomly split 70:30 into training and test sets. Five-fold cross-validation was used to compare multiple algorithms and lock the model and hyperparameters; the locked model was then evaluated on the fixed test set and the external cohort. The primary metric was AUC; secondary assessments were sensitivity, specificity, F1, Brier score, calibration curves, and decision curve analysis (DCA), with SHAP used for explanation.
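The split-then-cross-validate design described in the methods can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function name `split_and_folds`, the seed, and the integer rounding of the 30% test share are assumptions, and the abstract does not say whether the split was stratified (this sketch is not).

```python
import random

def split_and_folds(n, test_pct=30, k=5, seed=0):
    """Return (train_idx, test_idx, folds) for n samples.

    folds is a list of k (fit_idx, val_idx) pairs drawn from train_idx only,
    so the held-out test indices never influence model selection.
    """
    rng = random.Random(seed)            # seed is an arbitrary illustrative choice
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = n * test_pct // 100         # integer arithmetic; rounds down (assumption)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    folds = []
    for i in range(k):
        val = train_idx[i::k]            # every k-th training index, offset i
        val_set = set(val)
        fit = [j for j in train_idx if j not in val_set]
        folds.append((fit, val))
    return train_idx, test_idx, folds

# Cohort 1 size from the abstract.
train_idx, test_idx, folds = split_and_folds(745)
print(len(train_idx), len(test_idx), [len(val) for _, val in folds])
```

Because the folds are built from the training indices alone, the test set stays fixed and untouched until the model and hyperparameters are locked, which is the property the methods emphasize.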
[RESULTS] Logistic regression was selected, using Age, TT, APTT, CEA, and Ca. Cross-validated AUCs were 0.910 (training) and 0.905 (internal validation). The fixed test set yielded AUC 0.865 (sensitivity 0.802; specificity 0.712; F1 0.849; Brier 0.112). External validation achieved AUC 0.861, specificity 0.883, and PPV 0.934. DCA showed net benefit over "treat-all/none" across 20%-95% threshold probabilities. SHAP identified Age, TT, CEA, APTT and Ca as the dominant contributors.
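The threshold-dependent metrics above all derive from one confusion matrix. The sketch below defines them and, as a worked example, searches for integer counts consistent with the external-cohort figures. The counts it finds are a back-calculation, not values reported in the abstract, and they hold only under the assumption that the reported specificity and PPV were computed at a single threshold on all 221 external cases. The helper name `rates` is hypothetical.

```python
def rates(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and F1 from confusion-matrix counts."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    ppv = tp / (tp + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return sens, spec, ppv, f1

# External cohort sizes from the abstract.
N_MALIGNANT, N_BENIGN = 161, 60

# Enumerate every integer confusion matrix that reproduces the reported
# specificity 0.883 and PPV 0.934 after rounding to three decimals.
matches = []
for tn in range(N_BENIGN + 1):
    for tp in range(1, N_MALIGNANT + 1):   # tp >= 1 keeps tp + fp nonzero
        fp, fn = N_BENIGN - tn, N_MALIGNANT - tp
        sens, spec, ppv, f1 = rates(tp, fp, tn, fn)
        if round(spec, 3) == 0.883 and round(ppv, 3) == 0.934:
            matches.append((tp, fp, tn, fn, round(sens, 3)))

print(matches)  # -> [(99, 7, 53, 62, 0.615)]
```

Under that assumption, exactly one matrix fits (TP 99, FP 7, TN 53, FN 62), which would imply an external sensitivity of about 0.615. The abstract does not report external sensitivity, so this figure should be read as an inference to check against the full paper rather than as a stated result.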
[CONCLUSIONS] A logistic model based on routine laboratory variables effectively distinguishes malignant from benign breast lesions, with robust external performance and clear clinical net benefit, enabling early risk stratification and fewer unnecessary biopsies. This study proposes a tool that quantifies breast tumor malignancy risk using only objective indicators, without subjective factors. Online tool: prediction-for-bc.shinyapps.io/dynnomapp/.
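The "net benefit" claimed here is the quantity plotted in decision curve analysis. A minimal sketch of the standard formula, net benefit = TP/N - (FP/N) x pt/(1 - pt), is given below, applied to the "treat-all" reference strategy; "treat-none" is zero by definition. The function names are hypothetical, and the development-cohort counts are used purely for illustration, since the abstract does not state which cohort the DCA was run on.

```python
def net_benefit(tp, fp, n, pt):
    """Net benefit of a decision rule at threshold probability pt (0 < pt < 1)."""
    return tp / n - (fp / n) * pt / (1 - pt)

def net_benefit_treat_all(n_pos, n_neg, pt):
    """Treat-all strategy: every positive is a TP, every negative an FP."""
    return net_benefit(n_pos, n_neg, n_pos + n_neg, pt)

# Development-cohort counts from the abstract: 573 malignant, 172 benign.
for pt in (0.20, 0.50, 0.80, 0.95):
    print(pt, round(net_benefit_treat_all(573, 172, pt), 3))
```

The treat-all curve falls to zero where the threshold equals the prevalence (573/745, about 0.77, in this illustration) and is negative beyond it, so a model shows net benefit at a given threshold only if its curve lies above both treat-all and zero.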
MeSH Terms
Humans; Breast Neoplasms; Female; Machine Learning; Retrospective Studies; Middle Aged; Adult; Aged; Diagnosis, Differential; Algorithms; Sensitivity and Specificity
Highly cited papers by the same first author (5)
- Flap perfusion assessment with indocyanine green angiography in deep inferior epigastric perforator flap breast reconstruction: A systematic review and meta-analysis.
- A case of pulmonary mucosa-associated lymphoid tissue lymphoma with plasmacytic differentiation and amyloid deposition: case report and literature review.
- Role of ferroptosis and autophagy in pulmonary diseases.
- NUP62 Elevates USP10 Expression and Promotes Tamoxifen Resistance of Breast Cancer by Deubiquitinating ERα.
- Multi-omics analysis identifies a glycosyltransferase-related prognostic signature linked to the immune landscape in colorectal cancer.