Development and validation of an Interpretable Machine learning model for Discriminating between benign and malignant breast cancer.

Wang Z; Liu W; Hua L; Li X; Xue G

doi:10.1016/j.ijmedinf.2026.106300

International Journal of Medical Informatics, 2026, Vol. 210, p. 106300

Cite this article

APA Wang Z, Liu W, et al. (2026). Development and validation of an Interpretable Machine learning model for Discriminating between benign and malignant breast cancer. International Journal of Medical Informatics, 210, 106300. https://doi.org/10.1016/j.ijmedinf.2026.106300

MLA Wang Z, et al. "Development and validation of an Interpretable Machine learning model for Discriminating between benign and malignant breast cancer." International Journal of Medical Informatics, vol. 210, 2026, p. 106300.

PMID 41592419

DOI 10.1016/j.ijmedinf.2026.106300

Abstract

[OBJECTIVE] Breast cancer prognosis depends on early detection. We developed and externally validated a model using routine, readily available clinical and laboratory variables to discriminate malignant from benign breast lesions, aiming to reduce unnecessary biopsies and support early decision-making.

[METHODS] This retrospective two-center study included a development cohort (Cohort 1) from Jiujiang First People's Hospital (N = 745; malignant 573, benign 172) and an external cohort (Cohort 2) from the First Affiliated Hospital of Nanchang University (N = 221; malignant 161, benign 60). Cohort 1 was randomly split 70:30 into training and test sets. Five-fold cross-validation was used to compare multiple algorithms and to lock the model and its hyperparameters; the locked model was then evaluated on the fixed test set and on the external cohort. The primary metric was AUC; secondary assessments were sensitivity, specificity, F1, Brier score, calibration curves, and decision curve analysis (DCA), with SHAP used for explanation.
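
As a minimal sketch of the evaluation metrics named above, the following pure-Python functions compute AUC, Brier score, and threshold-based sensitivity, specificity, and F1 from predicted probabilities. This is not the authors' code: the function names, the 0.5 threshold, and the toy labels and probabilities are illustrative assumptions, with 1 standing for malignant and 0 for benign.

```python
from typing import Dict, Sequence


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def threshold_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Sensitivity, specificity, and F1 at a probability cut-off.
    Assumes both classes are present and at least one case is predicted positive."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


# Toy data (hypothetical, not from the study).
toy_labels = [1, 1, 1, 0, 1, 0]
toy_probs = [0.9, 0.8, 0.4, 0.6, 0.7, 0.2]
toy_auc = auc(toy_labels, toy_probs)
toy_brier = brier_score(toy_labels, toy_probs)
toy_metrics = threshold_metrics(toy_labels, toy_probs)
```

AUC and Brier score are threshold-free, whereas sensitivity, specificity, and F1 depend on the chosen cut-off, which the abstract does not state.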

[RESULTS] Logistic regression was selected, using Age, TT, APTT, CEA, and Ca. Cross-validated AUCs were 0.910 (training) and 0.905 (internal validation). The fixed test set yielded AUC 0.865 (sensitivity 0.802; specificity 0.712; F1 0.849; Brier 0.112). External validation achieved AUC 0.861, specificity 0.883, and PPV 0.934. DCA showed net benefit over the "treat-all" and "treat-none" strategies across threshold probabilities of 20%-95%. SHAP identified Age, TT, CEA, APTT, and Ca as the dominant contributors.
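
To make the DCA comparison concrete, here is a short sketch of the standard net-benefit formula, net benefit = TP/N - (FP/N) x pt/(1 - pt), where pt is the threshold probability. The function names are illustrative, and the model's true-positive and false-positive counts below are hypothetical; only the external cohort size and class counts (N = 221; malignant 161, benign 60) come from the abstract.

```python
def net_benefit(tp: int, fp: int, n: int, pt: float) -> float:
    """Net benefit of a decision rule at threshold probability pt."""
    if not 0.0 < pt < 1.0:
        raise ValueError("pt must lie strictly between 0 and 1")
    return tp / n - (fp / n) * (pt / (1.0 - pt))


def net_benefit_treat_all(n_pos: int, n_neg: int, pt: float) -> float:
    """Treat-all strategy: every positive is a TP, every negative is an FP."""
    return net_benefit(n_pos, n_neg, n_pos + n_neg, pt)


# Treat-none always has net benefit 0, since there are no TPs and no FPs.
NET_BENEFIT_TREAT_NONE = 0.0

# External cohort class counts as reported in the abstract.
n_malignant, n_benign = 161, 60

# Treat-all at two threshold probabilities.
treat_all_50 = net_benefit_treat_all(n_malignant, n_benign, 0.5)
treat_all_80 = net_benefit_treat_all(n_malignant, n_benign, 0.8)

# Hypothetical model counts at pt = 0.5 (NOT reported in the abstract).
model_50 = net_benefit(tp=120, fp=10, n=n_malignant + n_benign, pt=0.5)
```

Because the odds weight pt/(1 - pt) grows with the threshold, the treat-all strategy turns negative once pt exceeds the cohort's malignancy prevalence (161/221, about 0.73), so at high thresholds a model only has to beat treat-none.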

[CONCLUSIONS] A logistic model based on routine laboratory variables effectively distinguishes malignant from benign breast lesions, with robust external performance and clear clinical net benefit, enabling early risk stratification and fewer unnecessary biopsies. This study proposes a tool that quantifies breast tumor malignancy risk using only objective indicators, without subjective factors. Online tool: prediction-for-bc.shinyapps.io/dynnomapp/.

MeSH Terms

Humans; Breast Neoplasms; Female; Machine Learning; Retrospective Studies; Middle Aged; Adult; Aged; Diagnosis, Differential; Algorithms; Sensitivity and Specificity
