A distributed fusion framework for breast cancer recurrence prediction using MapReduce.

Scientific Reports · 2026 · Open access (journal OA rate 95.5%)
Sources: PubMed, DOI, OpenAlex (last enriched 2026-04-30)
OpenAlex topics: AI in cancer detection; Machine Learning in Healthcare; Breast Cancer Treatment Studies

Shahare P, Mahalwar A, Shahade AK


Cite this paper

APA: Prachi Damodhar Shahare, Abha Mahalwar, Aniket K. Shahade (2026). A distributed fusion framework for breast cancer recurrence prediction using MapReduce. Scientific Reports. https://doi.org/10.1038/s41598-026-47382-0
MLA: Prachi Damodhar Shahare, et al. "A distributed fusion framework for breast cancer recurrence prediction using MapReduce." Scientific Reports, 2026.
PMID: 41935148

Abstract

Breast cancer recurrence remains a major clinical challenge, significantly influencing long-term survival and treatment planning. Accurate early prediction is hindered by heterogeneous clinical factors, imbalanced datasets, and the distributed nature of medical records stored across hospitals, registries, and laboratories. To address these challenges, this study proposes a MapReduce-aligned hybrid framework that combines distributed Spark-based Gradient Boosted Trees (Spark-GBT), denoising autoencoder (AE)-derived latent representations, calibrated XGBoost, and deep tabular models (FT-Transformer and TabTransformer). The framework is designed to operate efficiently on heterogeneous, large-scale datasets while preserving data locality. Two benchmark datasets, the SEER breast cancer recurrence cohort and the Wisconsin Diagnostic Breast Cancer dataset, were used to evaluate performance across clinical data. Experimental results show that the proposed calibrated XGBoost and AE-augmented fusion models achieved superior discrimination and calibration, with the Wisconsin dataset reaching ROC-AUC values of 0.9954 and MCC ≥ 0.981. On the SEER dataset, characterized by high heterogeneity and sparse recurrence signals, the fusion model attained improved recall, while calibrated XGBoost offered the best overall balance between precision and stability. The findings demonstrate that combining tree-based embedded feature selection, latent AE compression, and transformer-based contextual modeling yields consistent performance gains. Moreover, the Spark-GBT integration ensures scalability and suitability for multi-institutional environments where data centralization is restricted. Overall, the proposed fusion framework performs competitively with strong baseline models such as calibrated XGBoost and improves recall and robustness for minority-class recurrence prediction: fusion learning improves sensitivity and model stability, whereas calibrated XGBoost provides the strongest overall discrimination performance. The proposed framework presents a reliable, scalable, and clinically meaningful solution for individualized recurrence-risk prediction.
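
The abstract leans on two ideas that a small example can make concrete: MapReduce-style processing that keeps records at their source, and MCC as a metric suited to imbalanced recurrence labels. The following is a minimal sketch in plain Python, not the authors' Spark implementation. The site names, the toy (true label, predicted label) pairs, and the helper functions `map_confusion`, `reduce_confusion`, `mcc`, and `accuracy` are all assumptions made for illustration. Each site computes only its local confusion counts (map), and the counts are summed centrally (reduce), so no patient-level record has to leave the site.

```python
from functools import reduce
from math import sqrt

# Hypothetical per-site data: (true_label, predicted_label), 1 = recurrence.
# Toy values for illustration only; not drawn from SEER or the Wisconsin dataset.
SITE_RECORDS = {
    "hospital_a": [(1, 1), (0, 0), (0, 0), (0, 1), (0, 0)],
    "hospital_b": [(1, 0), (0, 0), (0, 0), (1, 1), (0, 0)],
    "registry_c": [(0, 0), (0, 0), (1, 1), (0, 0), (0, 0)],
}


def map_confusion(records):
    """Map step: reduce one site's records to local (tp, fp, fn, tn) counts."""
    tp = sum(1 for y, p in records if y == 1 and p == 1)
    fp = sum(1 for y, p in records if y == 0 and p == 1)
    fn = sum(1 for y, p in records if y == 1 and p == 0)
    tn = sum(1 for y, p in records if y == 0 and p == 0)
    return (tp, fp, fn, tn)


def reduce_confusion(left, right):
    """Reduce step: element-wise sum of two count tuples."""
    return tuple(a + b for a, b in zip(left, right))


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


# Only aggregate counts cross site boundaries, never individual records.
local_counts = [map_confusion(records) for records in SITE_RECORDS.values()]
global_counts = reduce(reduce_confusion, local_counts, (0, 0, 0, 0))

print("global (tp, fp, fn, tn):", global_counts)
print("accuracy:", round(accuracy(*global_counts), 4))
print("MCC:", round(mcc(*global_counts), 4))
```

On this toy data, accuracy comes out noticeably higher than MCC because the non-recurrence class dominates. That gap is one reason MCC is commonly reported next to ROC-AUC when the positive class is rare.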
