Binary and Ternary Classification Prediction for Breast Cancer and Breast Sclerosing Adenosis With Interpretable Artificial Intelligence From Clinical and Imaging Features: A Retrospective, Diagnostic Accuracy Cohort Study.
APA
Qu Y, Lian J, et al. (2026). Binary and Ternary Classification Prediction for Breast Cancer and Breast Sclerosing Adenosis With Interpretable Artificial Intelligence From Clinical and Imaging Features: A Retrospective, Diagnostic Accuracy Cohort Study. Cancer innovation, 5(1), e70049. https://doi.org/10.1002/cai2.70049
MLA
Qu Y, et al. "Binary and Ternary Classification Prediction for Breast Cancer and Breast Sclerosing Adenosis With Interpretable Artificial Intelligence From Clinical and Imaging Features: A Retrospective, Diagnostic Accuracy Cohort Study." Cancer innovation, vol. 5, no. 1, 2026, pp. e70049.
PMID
41694644
Abstract
[BACKGROUND] Sclerosing adenosis (SA) and breast cancer (BC) often exhibit overlapping clinical, imaging, and pathological characteristics, making them difficult to differentiate. SA may also coexist with BC (SA + BC), including ductal carcinoma in situ (SA-DCIS) and invasive breast cancer (SA-IBC), which complicates diagnosis even when core-needle biopsy (CNB) suggests SA. This study aimed to develop interpretable AI-based binary and ternary classification models that leverage clinical and imaging features to distinguish SA-only from SA + BC and to further differentiate among SA-only, SA-DCIS, and SA-IBC.
[METHODS] We retrospectively analyzed a cohort of 726 patients with SA (January 2006 to December 2021), comprising 537 SA-only and 189 SA + BC cases (90 SA-DCIS, 99 SA-IBC). Multiple machine learning algorithms (logistic regression, support vector machine, decision tree, XGBoost, and random forest) were compared using AUC, accuracy, F1-score, and C-index. Model interpretability was assessed with SHAP to elucidate feature contributions and identify key predictors. Additionally, we incorporated an independent external validation cohort consisting of 113 patients to verify the model's effectiveness.
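As an illustration of three of the comparison metrics named above (AUC, accuracy, and F1-score), the following is a minimal standard-library sketch for the binary task. It is not the authors' code: the abstract does not say how the metrics were computed, and the labels, scores, 0.5 decision threshold, and function names here are hypothetical toy choices.

```python
from typing import Sequence


def auc_score(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of cases whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = SA + BC, 0 = SA-only (not taken from the study).
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
scores = [0.10, 0.20, 0.30, 0.60, 0.70, 0.80, 0.40, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

toy_auc = auc_score(y_true, scores)
toy_acc = accuracy(y_true, y_pred)
toy_f1 = f1_score(y_true, y_pred)
print(f"AUC={toy_auc:.3f} accuracy={toy_acc:.3f} F1={toy_f1:.3f}")
```

AUC is computed from the continuous scores, whereas accuracy and F1 depend on the chosen threshold, which is one reason the metrics can rank models differently.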
[RESULTS] XGBoost consistently outperformed other algorithms in both tasks. Eight features emerged as most informative: age, ultrasound BI-RADS category, maximum and minimum ultrasound diameters, ultrasound margin characteristics, biopsy procedure, mammographic density, and microcalcifications. For binary classification (SA-only vs. SA + BC), XGBoost achieved an AUC of 0.925, accuracy of 0.883, and C-index of 0.844. For ternary classification (SA-only, SA-DCIS, SA-IBC), the model achieved an AUC of 0.888, accuracy of 0.811, and C-index of 0.813. Age, ultrasound BI-RADS, and minimum lesion diameter were consistently top predictors. We further proposed a three-tier interpretability framework (global, cohort-level; local, subgroup-level; and individual, case-level) to facilitate clinical translation.
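The abstract does not state how the single AUC for the three-class task was aggregated. One common convention, shown here purely as an assumption, is a macro-averaged one-vs-rest AUC: each class is scored against the other two and the three AUCs are averaged. The probabilities and helper names below are hypothetical toy values, not study data.

```python
from typing import List, Sequence

CLASSES = ["SA-only", "SA-DCIS", "SA-IBC"]


def binary_auc(is_pos: Sequence[bool], scores: Sequence[float]) -> float:
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly,
    with ties counted as half."""
    pos = [s for flag, s in zip(is_pos, scores) if flag]
    neg = [s for flag, s in zip(is_pos, scores) if not flag]
    if not pos or not neg:
        raise ValueError("each class needs at least one case and one non-case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true: Sequence[int], probs: Sequence[Sequence[float]]) -> List[float]:
    """Return the one-vs-rest AUC of every class followed by their mean."""
    per_class = []
    for k in range(len(CLASSES)):
        is_pos = [y == k for y in y_true]
        class_scores = [row[k] for row in probs]
        per_class.append(binary_auc(is_pos, class_scores))
    return per_class + [sum(per_class) / len(per_class)]


# Hypothetical predicted probabilities (rows sum to 1); labels index CLASSES.
y_true = [0, 0, 0, 1, 1, 2, 2]
probs = [
    [0.80, 0.10, 0.10],
    [0.60, 0.30, 0.10],
    [0.50, 0.20, 0.30],
    [0.30, 0.60, 0.10],
    [0.55, 0.35, 0.10],
    [0.20, 0.20, 0.60],
    [0.10, 0.50, 0.40],
]

*per_class_auc, macro_auc = macro_ovr_auc(y_true, probs)
for name, value in zip(CLASSES, per_class_auc):
    print(f"{name}: one-vs-rest AUC = {value:.3f}")
print(f"macro-averaged AUC = {macro_auc:.3f}")
```

Macro averaging weights the three classes equally regardless of size; with an imbalanced cohort, a prevalence-weighted average would give a different figure.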
[CONCLUSION] Given the substantial risk of SA coexisting with DCIS or IBC, and the potential for CNB to underestimate disease due to limited sampling, lesions diagnosed as SA on CNB should be evaluated with additional modalities before determining the need for surgical excision. The proposed interpretable AI model enhances discrimination between SA-only and SA with concomitant breast cancer (SA + BC), thereby supporting more informed clinical decision-making in breast disease management.