A distributed fusion framework for breast cancer recurrence prediction using MapReduce.
OpenAlex topics:
AI in cancer detection
Machine Learning in Healthcare
Breast Cancer Treatment Studies
APA
Prachi Damodhar Shahare, Abha Mahalwar, Aniket K. Shahade (2026). A distributed fusion framework for breast cancer recurrence prediction using MapReduce. Scientific Reports. https://doi.org/10.1038/s41598-026-47382-0
MLA
Prachi Damodhar Shahare, et al. "A distributed fusion framework for breast cancer recurrence prediction using MapReduce." Scientific Reports, 2026.
PMID
41935148
Abstract
Breast cancer recurrence remains a major clinical challenge, significantly influencing long-term survival and treatment planning. Accurate early prediction is hindered by heterogeneous clinical factors, imbalanced datasets, and the distributed nature of medical records stored across hospitals, registries, and laboratories. To address these challenges, this study proposes a MapReduce-aligned hybrid framework that combines distributed Spark-based Gradient Boosted Trees, denoising autoencoder (AE)-derived latent representations, calibrated XGBoost, and deep tabular frameworks (FT-Transformer and TabTransformer). The framework is designed to operate efficiently on heterogeneous, large-scale datasets while preserving data locality. Two benchmark datasets, the SEER breast cancer recurrence cohort and the Wisconsin Diagnostic Breast Cancer dataset, were used to evaluate framework performance across clinical data. Experimental results show that the proposed calibrated XGBoost and AE-augmented fusion frameworks obtained superior discrimination and calibration, reaching ROC-AUC values of 0.9954 and MCC ≥ 0.981 on the Wisconsin dataset. On the SEER dataset, characterized by high heterogeneity and sparse recurrence signals, the fusion framework attained improved recall, while calibrated XGBoost offered the best overall balance between precision and stability. The findings demonstrate that combining tree-based embedded feature selection, latent AE compression, and transformer-based contextual modeling yields consistent performance gains. Moreover, the Spark-GBT integration ensures scalability and suitability for multi-institutional environments where data centralization is restricted. The experimental results show that the proposed fusion framework provides competitive performance compared to strong baseline frameworks such as calibrated XGBoost, along with improved recall and robustness for minority-class recurrence prediction.
The results indicate that fusion learning improves sensitivity and framework stability, whereas calibrated XGBoost provides the strongest overall discrimination performance. The proposed framework presents a reliable, scalable, and clinically meaningful solution for individualized recurrence-risk prediction.
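The abstract reports MCC and recall for data that stays distributed across institutions. As a minimal sketch of how such metrics could be computed in a MapReduce style without moving patient records, the example below has each site emit only its local confusion-matrix counts (map), sums those counts (reduce), and then derives MCC and recall from the totals. This is an illustration under stated assumptions, not the authors' implementation: the function names, the binary label encoding (1 = recurrence), and the toy partitions `site_a` and `site_b` are all hypothetical.

```python
from functools import reduce
from math import sqrt


def map_partition(records):
    """Map step: count confusion-matrix cells within one site's records.

    Each record is a (true_label, predicted_label) pair of binary labels,
    where 1 is assumed to mean recurrence. Only counts leave the site.
    """
    tp = fp = tn = fn = 0
    for y_true, y_pred in records:
        if y_true == 1 and y_pred == 1:
            tp += 1
        elif y_true == 0 and y_pred == 1:
            fp += 1
        elif y_true == 0 and y_pred == 0:
            tn += 1
        else:  # y_true == 1 and y_pred == 0 under the binary-label assumption
            fn += 1
    return (tp, fp, tn, fn)


def reduce_counts(a, b):
    """Reduce step: element-wise sum of two (tp, fp, tn, fn) tuples."""
    return tuple(x + y for x, y in zip(a, b))


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 if the denominator is 0."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def recall(tp, fp, tn, fn):
    """Sensitivity for the positive (recurrence) class."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy partitions standing in for two institutions.
site_a = [(1, 1), (0, 0), (0, 0), (1, 0)]
site_b = [(0, 0), (0, 1), (1, 1), (0, 0)]

totals = reduce(reduce_counts, map(map_partition, [site_a, site_b]))
print("counts (tp, fp, tn, fn):", totals)
print("MCC:", round(mcc(*totals), 4))
print("recall:", round(recall(*totals), 4))
```

Because the confusion-matrix counts are additive, the reduce step gives the same totals as pooling all records centrally, which is what makes this pattern compatible with the data-locality constraint described above.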
Related open-access articles sharing this paper's MeSH terms/keywords
- Management of pleural relapse after breast cancer resection in a middle-aged man: a case report.
- Exploring the Experiences and Perspectives of Patients With Early Breast Cancer, Caregivers, and Health Care Professionals: Italian Social Media Listening Study.
- Metaplastic carcinoma of the breast mimicking breast implant-associated squamous cell carcinoma: a challenging differential diagnosis.
- Machine learning approaches for predicting breast cancer recurrence using clinical and histopathological data.
- Imaging of the Reconstructed Breast.
- Importance of long-term monitoring of patients with breast reconstructions: a case of 10-year cancer recurrence.