
Interpretable machine learning for survival prediction and risk stratification in elderly patients with breast cancer after breast-conserving surgery.

Gland Surgery, 2026, Vol. 15(2), p. 38


Ling Q, Sun Z

Cite this article

APA: Ling Q, Sun Z (2026). Interpretable machine learning for survival prediction and risk stratification in elderly patients with breast cancer after breast-conserving surgery. Gland Surgery, 15(2), 38. https://doi.org/10.21037/gs-2025-aw-462
MLA: Ling Q, et al. "Interpretable machine learning for survival prediction and risk stratification in elderly patients with breast cancer after breast-conserving surgery." Gland Surgery, vol. 15, no. 2, 2026, p. 38.
PMID: 41808811

Abstract

[BACKGROUND] The necessity of adjuvant radiotherapy following breast-conserving surgery (BCS) in elderly patients with early-stage breast cancer remains controversial. Existing studies focus predominantly on population-level benefits without identifying specific prognostic subgroups with different baseline survival probabilities. We aimed to develop interpretable machine learning models to predict survival and establish precise prognostic risk stratification that could inform individualised treatment discussions.

[METHODS] Using the Surveillance, Epidemiology, and End Results database (2016-2022), we included patients aged ≥70 years with T1-2N0M0, oestrogen receptor-positive, human epidermal growth factor receptor 2 (HER2) negative breast cancer who underwent BCS. We developed six machine learning survival models incorporating age, tumour grade, T stage, progesterone receptor status, race, histology, and chemotherapy. Model performance was evaluated using time-dependent area under the curve (AUC) and concordance index. The optimal model was interpreted using SHapley Additive exPlanations (SHAP) framework. Patients were stratified into three risk groups, with survival differences assessed using Kaplan-Meier analysis.

[RESULTS] A total of 39,872 patients were included (training set: 31,897; test set: 7,975). The eXtreme Gradient Boosting (XGBoost) model demonstrated optimal performance with 1-, 3-, and 5-year AUCs of 0.714, 0.692, and 0.711, respectively. SHAP analysis identified age as the most important predictor, followed by tumour grade and T stage. Risk stratification successfully delineated three distinct prognostic groups: low-risk (37% of patients, 5-year overall survival 88-90%), intermediate-risk (33% of patients, 5-year overall survival 82-84%), and high-risk (30% of patients, 5-year overall survival 65-67%) (log-rank P<0.001). Notably, the low-risk group's survival rate was comparable to radiotherapy-treated patients in previous studies (88.6%).

[CONCLUSIONS] We successfully established a prognostic risk stratification system identifying three distinct survival groups (low-risk, intermediate-risk, and high-risk). The low-risk group's 5-year survival matched radiotherapy-treated patients in a previous study (Yang et al., 88.6%). Our system provides prognostic information that, integrated with existing radiotherapy evidence, can inform individualised treatment discussions. Prospective studies comparing radiotherapy outcomes within risk strata are needed to validate clinical utility for treatment decision-making.


Introduction

Breast cancer is the most common malignancy among women globally, with over 2.3 million new cases diagnosed in 2020 (1). As populations age, the incidence of breast cancer in elderly populations continues to rise. In the US, more than 40% of patients are aged 70 years or older at diagnosis, with this proportion projected to increase further (2,3). Breast cancer in elderly patients typically exhibits indolent biological behaviour, characterised by high rates of oestrogen receptor (ER) and progesterone receptor (PR) positivity and low rates of human epidermal growth factor receptor 2 (HER2) amplification, features that portend favourable prognosis (4).
For early-stage breast cancer, breast-conserving surgery (BCS) combined with adjuvant radiotherapy represents the standard of care, yielding survival outcomes equivalent to mastectomy whilst preserving breast morphology and enhancing quality of life (5,6). However, the necessity of adjuvant radiotherapy in elderly patients remains contentious. The landmark CALGB 9343 trial demonstrated that for women aged ≥70 years with T1N0M0, ER-positive disease, radiotherapy addition to BCS and endocrine therapy did not improve overall survival (OS) despite reducing local recurrence (10-year: 2% vs. 9%) (7), leading National Comprehensive Cancer Network (NCCN) guidelines to permit radiotherapy omission in this population (8). However, recent large-scale retrospective cohort studies have challenged this paradigm. Yang and colleagues, analysing 26,586 patients from the 2010–2014 Surveillance, Epidemiology, and End Results (SEER) database—a comprehensive population-based cancer registry covering approximately 48% of the US population—reported significantly superior 5-year OS in patients receiving adjuvant radiotherapy compared with those who did not [88.6% vs. 72.1%, hazard ratio (HR) 0.589, P<0.001] (9-12), prompting reconsideration of current practice. Critically, existing studies focus on population-level average benefits without identifying specific subgroups who genuinely require radiotherapy versus those who can safely forgo it. This limitation creates a clinical dilemma: guideline-sanctioned radiotherapy omission might deny treatment to high-risk patients, whilst universal radiotherapy recommendations subject low-risk patients to unnecessary treatment burden (13,14).
In this context, precision medicine and individualised treatment decision-making assume paramount importance (15). Machine learning, a powerful tool for data mining and pattern recognition, has demonstrated substantial potential in tumour prognostication and risk stratification (16,17). Compared with traditional statistical approaches, machine learning can process high-dimensional data, capture non-linear relationships, and identify complex variable interactions. Nevertheless, conventional machine learning models are frequently criticised as “black boxes”, lacking interpretability and thus limiting clinical application. Recent advances in explainable artificial intelligence (XAI), particularly the development of SHapley Additive exPlanations (SHAP), provide transparent interpretive frameworks for understanding model decisions (18).
Against this background, we aimed to develop interpretable machine learning models using the SEER database to predict survival in elderly (≥70 years) patients with early-stage breast cancer following BCS, and through precise prognostic risk stratification, provide quantitative prognostic information that could inform treatment discussions. Our specific objectives were: (I) to develop and compare multiple machine learning survival models; (II) to establish an interpretable three-tier prognostic risk stratification system using SHAP analysis; (III) to characterize survival outcomes across risk strata and contextualize findings within existing radiotherapy evidence. We emphasize that this study develops a prognostic model that identifies patients with different baseline survival risks rather than a predictive model that directly quantifies treatment-specific benefits. The prognostic information from our model, when integrated with existing evidence on radiotherapy outcomes from randomized trials (e.g., CALGB 9343) and real-world studies (e.g., Yang et al. 2022), may help inform individualised treatment discussions in clinical practice, though it cannot directly dictate treatment decisions. We present this article in accordance with the TRIPOD reporting checklist (available at https://gs.amegroups.com/article/view/10.21037/gs-2025-aw-462/rc).

Methods

Study design and participants
This was a population-based retrospective cohort study. We used data from the November 2024 submission of the SEER database of the National Cancer Institute (NCI), which encompasses 17 population-based cancer registries. We included female patients aged 70 years or older diagnosed with breast cancer between January 1, 2016, and December 31, 2022. Eligible patients met the following criteria: (I) pathologically confirmed T1–2N0M0 invasive ductal or lobular carcinoma that was ER-positive and HER2-negative; and (II) previous treatment with BCS. We excluded patients who received neoadjuvant therapy, those with a history of other malignancies, and cases with missing data on key prognostic variables such as tumour grade or receptor status. After applying all inclusion and exclusion criteria, the final cohort comprised 39,872 patients for analysis. This study used de-identified data from the publicly available SEER database. As the SEER database contains only de-identified patient information, institutional review board approval and informed consent were waived in accordance with the Common Rule (45 CFR 46.102). The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Outcome measures
The primary outcome was OS, defined as the time from diagnosis to death from any cause. Patients alive at the end of follow-up were censored at the date of last known contact.

Statistical analysis
The final cohort was randomly divided into a training set for model development and an independent test set for model validation. Predictor variables included patient age, race, tumour grade, T stage, PR status, histology, and receipt of adjuvant chemotherapy.
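The random split can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function name `split_indices`, the seed, and the use of integer floor division are assumptions, and the 8:2 ratio is taken from the Results. With 39,872 patients, flooring 80% reproduces the reported set sizes.

```python
import random

def split_indices(n_patients, train_num=8, train_den=10, seed=0):
    """Shuffle patient indices and split them into training and test sets."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility (assumed)
    n_train = n_patients * train_num // train_den  # floor of 80% of the cohort
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_indices(39872)
print(len(train_idx), len(test_idx))  # 31897 7975
```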
We developed and compared six machine learning survival models, including Cox proportional hazards model, random survival forest (RSF), eXtreme Gradient Boosting (XGBoost) survival model, and three additional standard algorithms (naive Bayes, neural network, and stochastic gradient boosting). Model performance was primarily assessed using time-dependent area under the receiver operating characteristic curve (AUC) and overall concordance index (C-index) for 10-year OS calculated in the test set. To interpret the best-performing model, we employed the SHAP framework. We generated SHAP summary plots to visualise the direction and magnitude of each predictor’s influence on model output, and created heatmaps of patient characteristics sorted by predicted risk to illustrate global relationships between patient features, predicted risk, and actual outcomes.
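For readers unfamiliar with the concordance index, the sketch below shows one common definition (Harrell's C) in plain Python. It is an illustration under stated assumptions rather than the authors' implementation: the function name, the handling of tied risk scores (counted as 0.5), and the toy data are all hypothetical. A pair of patients is comparable when the one with the shorter follow-up time had an observed death; the pair is concordant when that patient also has the higher predicted risk.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs ranked correctly by risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data (hypothetical): follow-up in months, 1 = death, 0 = censored.
times = [5, 10, 15, 20]
events = [1, 0, 1, 0]
risks = [0.9, 0.4, 0.1, 0.6]
print(concordance_index(times, events, risks))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.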
Based on 10-year OS predictions from the optimal model, we stratified all patients in the entire cohort into low-, intermediate-, and high-risk groups according to predefined percentile thresholds. The clinical validity of this risk stratification was assessed by plotting Kaplan-Meier survival curves for each group and comparing them using the log-rank test. All statistical analyses were performed using Python (version 3.9).
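The Kaplan-Meier estimator used for the survival curves can be written in a few lines. The following is a minimal standard-library sketch, not the authors' code; the function name and the toy follow-up data are hypothetical. At each distinct time with at least one death, the running survival probability is multiplied by (1 − deaths / number at risk), and censored patients simply leave the risk set.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Toy data (hypothetical): months of follow-up, 1 = death, 0 = censored.
times = [6, 12, 12, 30, 48, 60]
events = [1, 1, 0, 1, 0, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 4))
# 6 0.8333
# 12 0.6667
# 30 0.4444
```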

Results

Baseline characteristics of the study population
This study included 39,872 patients, randomly divided into a training set (n=31,897) and test set (n=7,975) at an 8:2 ratio (Table 1). Age distribution showed that 50.9% were aged 70–74 years, 30.0% were 75–79 years, and 5.3% were 85 years or older. Racial composition was 85.9% White, 6.5% Black, and 7.6% other ethnicities. Regarding tumour pathological characteristics, moderately differentiated tumours (grade II) comprised 53.1%, well-differentiated (grade I) 36.0%, and poorly differentiated (grade III) 10.9%. Infiltrating ductal carcinoma accounted for 74.1% and lobular carcinoma for 13.0%. T1 stage represented 82.3% whilst T2 stage comprised 17.7%. PR positivity was observed in 88.7% of patients. Only 4.5% of patients received adjuvant chemotherapy. The training and test sets demonstrated consistent distribution across all baseline characteristics (difference <0.2%), ensuring dataset balance.

Development and performance evaluation of predictive models
Based on clinical variables including age, race, tumour grade, T stage, PR status, histology, and chemotherapy receipt, we constructed six machine learning survival prediction models and evaluated their performance for predicting 1-, 3-, and 5-year OS in the independent test set (Figure 1). Most models achieved AUC values between 0.6 and 0.8, representing moderate discrimination. This suggests that useful survival prediction tools can be constructed using only routinely available clinicopathological features. Among all tested models, XGBoost demonstrated the best performance, achieving AUCs of 0.714, 0.692, and 0.711 for 1-, 3-, and 5-year survival prediction, respectively, with stable discrimination across time points. RSF performed similarly, with corresponding AUCs of 0.711, 0.681, and 0.699. Logistic regression and stochastic gradient boosting models also achieved acceptable predictive performance (AUC range, 0.629–0.651).
Given XGBoost’s consistently optimal and stable predictive performance across all time points, we selected this model as the final survival prediction tool for subsequent feature importance analysis and risk stratification studies.

Interpretability analysis of the XGBoost model
To understand how the XGBoost model performs survival prediction, we employed the SHAP framework for interpretability analysis at both global and individual levels (Figure 2). Feature importance ranking (Figure 2A) revealed age as the most important predictor influencing outcomes, followed by tumour grade and T stage. The SHAP summary plot illustrated the directional influence of each feature value on model output: elderly patients (red dots) predominantly distributed in the positive SHAP value region, indicating that increasing age significantly elevated mortality risk; high tumour grade (grade III) was similarly associated with elevated mortality risk; T2 tumours increased risk compared with T1 stage. In contrast, race, histology, PR status, and chemotherapy demonstrated smaller effects, whilst still contributing to model predictions.
Individual-level prediction interpretation (Figure 2B) demonstrated the risk prediction process for a specific patient with model output f(x)=0.016. In this case, age reduced risk (SHAP value =−0.37), whilst grade III differentiation (+0.37) and T2 stage (+0.24) substantially increased risk. PR-positive status (−0.2), chemotherapy (−0.1), and certain histological features (−0.16) moderately reduced risk, whilst race exerted negligible influence on this patient’s prediction (+0). This individualised risk decomposition shows how the clinical features jointly contribute to the patient’s OS prediction.
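SHAP values are additive: the model output for a patient equals the expected (base) value plus the sum of the per-feature contributions. The short check below applies this identity to the contributions quoted above. It treats f(x)=0.016 as the model output for this patient and back-calculates the base value, which the text does not report; both steps are assumptions, and because the plotted values are rounded, the result is approximate.

```python
# Per-feature SHAP contributions as quoted in the text for the example patient.
shap_values = {
    "age": -0.37,
    "grade III": 0.37,
    "T2 stage": 0.24,
    "PR-positive": -0.20,
    "chemotherapy": -0.10,
    "histology": -0.16,
    "race": 0.0,
}
model_output = 0.016  # f(x) for this patient, as quoted

total_contribution = sum(shap_values.values())
# Additivity: f(x) = base value + sum of SHAP values, so the base is implied.
implied_base_value = model_output - total_contribution

print(round(total_contribution, 2))  # -0.22
print(round(implied_base_value, 3))  # 0.236
```

Under these assumptions the favourable and unfavourable factors nearly cancel, leaving this patient's output about 0.22 below the cohort average.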

Association patterns between patient characteristics and risk scores
To visually demonstrate global relationships between predicted risk, patient clinical characteristics, and actual outcomes, we created feature heatmaps (Figure 3) with all patients sorted by risk score from low to high. The heatmap clearly displayed characteristic patterns of high-risk patient populations. From left to right (low to high risk), the age row exhibited obvious colour gradient changes, with the high-risk region (right side) showing more yellow-green colours, indicating that advanced age represents a key driver of increased risk, highly concordant with SHAP analysis results. Tumour grade demonstrated relatively complex distribution patterns across the entire cohort, with moderate grade (cyan) predominating, though the proportion of high-grade tumours (yellow) increased in high-risk regions.
The colour distribution of the chemotherapy row (predominantly yellow) reflected the treatment characteristics of this cohort—the vast majority of patients did not receive adjuvant chemotherapy, consistent with standard treatment practice for elderly, early-stage, hormone receptor-positive breast cancer. Race, PR status, T stage, and histology displayed relatively dispersed distributions across different risk tiers, indicating that whilst these factors contributed to prediction, they did not dominate risk stratification.
The risk score row demonstrated smooth transition from low to high (dark purple to yellow-green), confirming successful patient allocation across a continuous risk spectrum. Importantly, the vital status row showed more colour variation in the high-risk region (right side), suggesting death events concentrated more heavily amongst high-risk patients, validating excellent concordance between model-predicted risk and actual clinical outcomes.

Risk stratification and survival analysis based on the XGBoost model
Based on individualised risk scores predicted by the XGBoost model, we stratified all patients in the entire cohort (n=39,872) into three risk tiers according to predefined percentile thresholds: low-risk group (n=14,731, 36.9%), intermediate-risk group (n=13,221, 33.2%), and high-risk group (n=11,920, 29.9%). Kaplan-Meier survival curve analysis (Figure 4) demonstrated significant and sustained survival differences amongst the three risk groups.
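Percentile-based stratification of this kind can be sketched as below. The exact thresholds are not stated in the text, so the cut-offs used here (the lowest 37% of scores as low risk, the next 33% as intermediate, the remainder as high) are hypothetical values chosen only to resemble the reported group sizes; the function name and toy scores are likewise illustrative.

```python
def stratify_by_percentile(scores, low_cut=0.37, high_cut=0.70):
    """Label each patient 'low', 'intermediate', or 'high' by risk-score rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # ascending risk
    n_low = round(n * low_cut)        # patients below the lower cut-off
    n_below_high = round(n * high_cut)  # patients below the upper cut-off
    labels = [None] * n
    for rank, i in enumerate(order):
        if rank < n_low:
            labels[i] = "low"
        elif rank < n_below_high:
            labels[i] = "intermediate"
        else:
            labels[i] = "high"
    return labels

# Toy risk scores (hypothetical) for ten patients.
scores = [0.05, 0.80, 0.30, 0.10, 0.55, 0.95, 0.20, 0.40, 0.70, 0.15]
labels = stratify_by_percentile(scores)
print(labels.count("low"), labels.count("intermediate"), labels.count("high"))
# 4 3 3
```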
Survival rates exhibited clear gradient separation amongst groups. The low-risk group demonstrated optimal survival outcomes, with the survival curve consistently positioned at the top throughout follow-up. At approximately 90 months of follow-up, the low-risk group maintained OS of approximately 88–90%. The intermediate-risk group’s survival curve occupied the middle position, with corresponding survival of approximately 82–84%. The high-risk group exhibited the lowest survival, declining to approximately 65–67% at the end of follow-up, representing a difference of approximately 20–25 percentage points compared with the low-risk group.
Inter-group differences expanded progressively over time. During early follow-up (first 12 months), survival curves remained relatively close amongst the three groups, with survival rates exceeding 95% in all groups. As follow-up duration extended, curve separation became progressively more pronounced. By mid-follow-up (approximately 40–50 months), survival rates in the intermediate and high-risk groups began to decline notably, with increasing curve slopes. This time-dependent pattern of differences suggests that the model not only identifies short-term risk but also effectively predicts long-term prognosis.
Risk stratification demonstrates clinical utility. The log-rank test revealed highly statistically significant survival differences amongst the three groups (P<0.001). This risk stratification system, constructed from routine clinical variables, effectively identifies patient subgroups with markedly divergent prognoses, providing objective evidence for clinical decision-making. Particularly for low-risk group patients (comprising 37% of the total population), their excellent long-term survival (5-year survival rate 88–90%) suggests these patients might benefit from adjuvant radiotherapy omission, whilst high-risk group patients might require more aggressive therapeutic intervention or intensive surveillance.
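The log-rank comparison can be illustrated with a simplified two-group version. The study compared three groups, which requires a chi-square statistic with two degrees of freedom; the sketch below deliberately handles only two groups so that it stays short and dependency-free. The function name and toy data are hypothetical. At each death time it accumulates observed and expected deaths in group A under the null hypothesis of equal hazards, then refers the statistic to a chi-square distribution with one degree of freedom.

```python
import math

def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_a = expected_a = variance = 0.0
    for t in sorted({ti for ti, e, _ in data if e}):
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk in A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk in B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance of deaths in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 df
    return chi2, p_value

# Toy data (hypothetical): group A dies early, group B dies late.
chi2, p = logrank_two_groups([1, 2, 3, 4, 5], [1] * 5,
                             [10, 11, 12, 13, 14], [1] * 5)
print(p < 0.01)  # True
```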

Discussion

This study, encompassing 39,872 elderly (≥70 years) patients with early-stage breast cancer from the SEER database, employed interpretable machine learning methods to construct survival prediction models and achieve effective risk stratification. Our principal findings include three elements. First, using only routinely available clinicopathological variables (age, tumour grade, T stage, PR status, race, histology, and chemotherapy), we constructed models with robust predictive performance; the XGBoost model achieved AUCs of 0.714, 0.692, and 0.711 for 1-, 3-, and 5-year survival prediction, respectively. Second, SHAP interpretability analysis identified age as the most important prognostic factor, followed by tumour grade and T stage, findings concordant with established clinical knowledge and thereby enhancing model credibility. Third, the three-tier risk stratification system successfully delineated patient subgroups with markedly divergent prognoses: 5-year OS rates were 88–90%, 82–84%, and 65–67% for low-risk, intermediate-risk, and high-risk groups, respectively, providing quantitative tools for individualised treatment decisions.
Before discussing our findings, we must clarify a fundamental distinction that has important implications for interpreting our results. Our study developed a prognostic model that predicts OS based on baseline patient and tumor characteristics, rather than a predictive model that directly quantifies radiotherapy benefit. While our risk stratification successfully identifies subgroups with markedly different prognoses, it does not directly measure how much each subgroup would benefit from radiotherapy compared to omitting it.
The excellent survival observed in the low-risk group (5-year OS 88–90%) is comparable to that reported for radiotherapy-treated patients in Yang et al.’s real-world study (88.6%). This observation is hypothesis-generating rather than conclusive. It suggests—but does not prove—that some patients in this subgroup might maintain favorable outcomes without radiotherapy. However, this comparable survival could result from multiple factors: (I) most low-risk patients in our cohort may have actually received radiotherapy (radiotherapy data were not available in our dataset); (II) this subgroup may have inherently favorable disease biology regardless of radiotherapy; (III) a combination of both factors.
Future prospective studies should specifically evaluate radiotherapy outcomes (with versus without) within each risk stratum identified by our model to determine which subgroups truly benefit from treatment intensification versus de-escalation. Such studies would need to directly compare survival and recurrence outcomes between radiotherapy and no radiotherapy arms within homogeneous prognostic subgroups. Our model provides prognostic context that, when combined with existing evidence from randomized trials (e.g., CALGB 9343) and real-world studies (e.g., Yang et al. 2022) on radiotherapy benefits, can inform but not dictate treatment decisions. The value of our prognostic stratification lies in identifying which patients to prioritize for future treatment comparison studies and in providing a framework for integrating prognostic information with treatment evidence during individualized clinical decision-making.
The role of adjuvant radiotherapy following BCS in elderly patients has remained controversial for two decades. The 2004 CALGB 9343 prospective randomised controlled trial demonstrated that for women aged 70 years or older with T1N0M0, ER-positive disease, BCS combined with endocrine therapy yielded comparable OS regardless of radiotherapy addition, despite radiotherapy significantly reducing local recurrence (7). Based on this evidence, NCCN guidelines permit radiotherapy omission in this population (8). However, recent large-scale retrospective cohort studies have challenged this conclusion. Yang and colleagues, analysing 26,586 patients from the 2010–2014 SEER database, reported 5-year OS of 88.6% in the radiotherapy group, significantly superior to 72.1% in the no-radiotherapy group (HR 0.589, P<0.001) (9). Studies from the National Cancer Database (10-12) and Ontario Cancer Registry (12) yielded similar conclusions, consistently demonstrating OS benefits with adjuvant radiotherapy.
These discordant findings likely stem from multiple factors. Prospective randomised trials impose rigorous inclusion criteria and mandate adequate surgical margins, whereas real-world clinical practice encompasses greater complexity and heterogeneity (13,14). Additionally, large-sample retrospective studies possess enhanced statistical power to detect modest yet genuine survival differences. Crucially, however, these studies share a common limitation: they focus on population-level average benefits without distinguishing subgroups with varying radiotherapy requirements.
Our study addresses this gap from an individualised precision medicine perspective. Rather than negating radiotherapy’s value, we sought through precise risk stratification to identify populations genuinely requiring radiotherapy versus those who can safely forgo it. We identified that the low-risk group, comprising 37% of the total population, achieved 5-year OS of 88–90%, comparable to the 88.6% reported by Yang and colleagues for patients receiving radiotherapy (9). This crucial finding suggests that carefully selected low-risk patients, identified through rigorous multifactorial assessment, might maintain excellent long-term survival without adjuvant radiotherapy. Considering radiotherapy-associated acute and chronic toxicity, healthcare costs, treatment duration, and potential impacts on quality of life and compliance in elderly patients (19,20), our risk stratification tool can help clinicians identify subgroups genuinely suitable for radiotherapy omission, thereby avoiding unnecessary overtreatment.
Current NCCN guidelines for radiotherapy omission in elderly patients with early-stage breast cancer employ relatively simple criteria: age ≥70 years, T1N0M0, and ER-positive status (8). However, these criteria exhibit obvious limitations. First, they consider only three factors, failing to incorporate other important prognostic variables such as tumour grade and PR status. Second, they employ simple binary classification (guideline-concordant or non-concordant) rather than continuous risk scoring, inadequately capturing patient heterogeneity. Third, they lack quantitative risk prediction, hindering individualised benefit-risk assessment.
Our study, by integrating seven clinicopathological variables, stratifies patients into three distinct risk tiers, providing more refined guidance for clinical decisions.

Low-risk group (37% of patients)
This subgroup, characterised predominantly by age 70–74 years, grade I–II tumours, and T1 stage, exhibits excellent long-term survival (5-year OS 88–90%). SHAP analysis demonstrates that favourable prognostic factors (younger age, low tumour grade, small tumour size) generate cumulative protective effects, resulting in substantially better baseline prognosis compared to higher-risk groups.
The excellent prognosis of this subgroup deserves careful interpretation in the context of existing evidence. Notably, their 5-year survival rate (88–90%) is comparable to that reported for radiotherapy-treated patients in Yang et al.’s 2022 real-world study (88.6%), which analyzed 26,586 elderly patients from the SEER database between 2010 and 2014. This observation suggests that low-risk patients identified by our model may represent candidates for treatment de-escalation discussions. However, we must emphasize several critical caveats:
❖ First, our prognostic model does not directly quantify how much this subgroup would benefit from radiotherapy. The comparable survival could result from most patients in this subgroup actually receiving radiotherapy (our dataset lacks radiotherapy information), inherently favorable disease biology, or both factors.

❖ Second, the observation that their survival matches radiotherapy-treated patients in previous studies is hypothesis-generating rather than conclusive evidence for radiotherapy omission.

❖ Third, our model provides prognostic stratification but cannot determine individual treatment benefit.

For patients in this subgroup, decisions regarding radiotherapy should integrate multiple sources of information: (I) our model’s prognostic assessment indicating excellent baseline survival probability; (II) evidence from the CALGB 9343 trial showing no OS benefit from radiotherapy in selected elderly patients; (III) real-world evidence from studies like Yang et al. showing potential radiotherapy benefit in broader populations; (IV) individualized assessment of comorbidities, functional status, life expectancy; (V) patient preferences regarding treatment burden versus potential benefit; (VI) access to modern radiotherapy techniques with reduced toxicity.
The appropriate clinical application of our risk stratification for this subgroup is to facilitate informed, shared decision-making rather than to provide definitive treatment recommendations. Prospective studies that specifically compare outcomes with and without radiotherapy within this low-risk subgroup, ideally using our risk model for patient selection, would provide the definitive evidence needed to guide treatment de-escalation decisions.

Intermediate-risk group (33% of patients)
This group exhibits intermediate survival (5-year OS 82–84%), representing a clinical zone of equipoise. From SHAP analysis and feature distribution patterns, this subgroup is characterized by mixed prognostic profiles: predominantly patients aged 75–79 years (accounting for approximately 35% of this group) with moderate tumor characteristics. Typical intermediate-risk profiles include: (I) younger patients (70–74 years) with grade II–III tumors or T2 stage; (II) patients aged 75–79 years with grade II tumors and T1 stage; (III) patients aged 80–84 years with predominantly favorable tumor characteristics (grade I–II, T1) where age becomes the primary risk driver.
The key distinguishing feature of this group compared to low-risk patients is the presence of at least one unfavorable prognostic factor—either more advanced age (≥75 years) or more aggressive tumor biology (higher grade or larger size)—that elevates baseline mortality risk. However, these patients lack the multiple compounding adverse factors seen in the high-risk group. The moderate baseline prognosis of these patients (approximately 4–6 percentage points lower 5-year survival than the low-risk group) suggests they may derive meaningful benefit from adjuvant radiotherapy, though our prognostic model does not quantify this benefit directly.
Treatment decisions for this subgroup should particularly emphasize shared decision-making that integrates prognostic information with comprehensive clinical assessment. The decision framework should incorporate: (I) our model’s prognostic prediction indicating intermediate baseline risk; (II) existing evidence on radiotherapy benefits from clinical trials and real-world studies; (III) thorough assessment of comorbidities and functional status; (IV) realistic estimation of life expectancy considering both cancer and non-cancer mortality risks; (V) patient values and preferences regarding treatment burden, potential side effects, and survival benefit; (VI) practical considerations including proximity to radiotherapy facilities and support systems; (VII) access to modern radiotherapy techniques that may reduce treatment burden through hypofractionation or accelerated partial breast irradiation.
For this intermediate-risk group, neither routine radiotherapy nor routine omission is appropriate—rather, truly individualized decision-making informed by prognostic context is essential. Future research should investigate whether specific subsets within this intermediate-risk category might benefit more or less from radiotherapy based on additional clinical or molecular factors.

High-risk group (30% of patients)
This group demonstrates significantly reduced survival (5-year OS 65–67%), approximately 20–25 percentage points lower than the low-risk group, indicating substantially elevated mortality risk. SHAP analysis reveals that this group is defined by the convergence of multiple unfavorable prognostic factors, creating compounding risk effects.
Clinical characterization shows this group comprises predominantly: (I) patients aged ≥85 years (representing approximately 25% of high-risk group) regardless of tumor characteristics, where advanced age becomes the dominant prognostic determinant; (II) patients aged 80–84 years with adverse tumor features (grade III, T2, or PR-negative), where age and aggressive tumor biology synergistically elevate risk; (III) patients aged 75–79 years with multiple adverse tumor characteristics (grade III combined with T2 stage, or grade III with PR-negative status); and less commonly (IV) younger patients (70–74 years) with particularly aggressive tumors (grade III, T2, PR-negative) where tumor biology overwhelms the protective effect of younger age.
The defining feature distinguishing this group is the presence of multiple concurrent risk factors rather than a single adverse characteristic. SHAP force plots demonstrate that in high-risk patients, individual negative factors (advanced age, high grade, larger tumor) amplify each other’s effects rather than simply adding linearly. For instance, grade III disease in a patient aged 85 years confers substantially higher risk than the sum of risks from each factor alone. This multiplicative risk effect explains the dramatic survival difference between high and low-risk groups and suggests these patients have fundamentally different disease biology and/or host factors that substantially increase mortality risk from both breast cancer and competing causes.
Given this poor baseline prognosis, these patients may derive considerable benefit from radiotherapy and other intensive interventions, although, again, our prognostic model does not directly measure treatment-specific benefits.
The marked difference in survival between high-risk and low-risk groups (20–25 percentage points) suggests substantial heterogeneity in disease biology and prognosis within elderly breast cancer patients. For this high-risk subgroup, several clinical implications merit consideration:
❖ First, given the elevated baseline mortality risk, these patients may represent a population where radiotherapy provides meaningful absolute survival benefit, even if the relative risk reduction is similar across risk groups.

❖ Second, the poor prognosis indicates that these patients should be considered for comprehensive treatment intensification, potentially including not only radiotherapy but also optimized systemic therapy, aggressive management of comorbidities, and closer surveillance for recurrence.

❖ Third, the substantial mortality risk in this subgroup emphasizes the importance of realistic discussion about prognosis and treatment goals, balancing potential benefits of aggressive therapy against quality of life considerations and patient preferences.

Importantly, while this subgroup’s poor prognosis suggests they might benefit most from radiotherapy, this remains a hypothesis requiring validation. Future studies should specifically evaluate whether high-risk patients identified by our model derive greater absolute benefit from radiotherapy compared to lower-risk groups. Additionally, research should investigate whether this high-risk group has specific biological features (e.g., Ki-67 index, molecular subtypes) that could further refine treatment recommendations.
Traditional machine learning models are frequently criticised as “black boxes”, which limits their clinical implementation. Our study employs the SHAP framework to provide transparent model interpretation. The SHAP summary plot displays the predictor importance ranking: age emerges as the most important predictor, followed by tumour grade and T stage. These findings are highly concordant with clinical experience, which supports the clinical plausibility of the model. Importantly, SHAP provides an individualised risk decomposition for each patient, enabling clinicians to communicate, for example: “Your younger age represents a favourable factor; however, your high tumour grade increases risk.” This transparent communication facilitates shared decision-making, may enhance treatment compliance, and represents a crucial safeguard for safe medical AI deployment (21,22).
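The idea of an individualised risk decomposition can be made concrete with a small example. The sketch below computes exact Shapley values for a toy risk score by enumerating all feature coalitions against a single reference patient. The function `toy_risk`, its coefficients, and the feature names are hypothetical and are not our fitted model. The SHAP library uses background data and model-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def toy_risk(age_85_plus: int, grade_3: int, t2: int) -> float:
    """HYPOTHETICAL risk score; all coefficients are invented for illustration."""
    return (0.10 + 0.15 * age_85_plus + 0.08 * grade_3 + 0.05 * t2
            + 0.10 * age_85_plus * grade_3)  # age-by-grade interaction term


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    names = list(x)
    n = len(names)

    def value(coalition):
        # Features inside the coalition take the patient's value, the rest the reference value.
        args = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(**args)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


patient = {"age_85_plus": 1, "grade_3": 1, "t2": 0}    # hypothetical patient
reference = {"age_85_plus": 0, "grade_3": 0, "t2": 0}  # hypothetical reference profile
phi = shapley_values(toy_risk, patient, reference)
for name, contribution in phi.items():
    print(f"{name:>12}: {contribution:+.2f}")
```

The contributions sum to the difference between the patient's score and the reference score, which is what allows a per-patient explanation of the kind quoted above. In this toy score the age-by-grade interaction is split equally between the two features, so an additive attribution distributes an interaction effect rather than displaying it as a separate term.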

Limitations
Several limitations warrant acknowledgment. The most important limitation of our study is that we developed a prognostic model rather than a predictive model for radiotherapy benefit. This distinction has critical implications for interpreting our findings and their clinical application.
Our model stratifies patients by their OS risk based on baseline characteristics, but does not directly quantify how much each risk group would benefit from radiotherapy compared to omitting it. While we observed that the low-risk group’s 5-year survival rate (88–90%) is comparable to radiotherapy-treated patients in Yang et al.’s real-world study (88.6%), this observation does not prove that low-risk patients can safely omit radiotherapy. This comparable survival could result from several scenarios: (I) most low-risk patients in our cohort may have actually received radiotherapy (we lack radiotherapy data in our dataset), meaning their excellent survival might depend on radiotherapy receipt; (II) this subgroup may have inherently favorable disease biology that ensures good outcomes regardless of radiotherapy; (III) a combination of both factors may contribute.
To truly determine which patients can safely omit radiotherapy, future studies must directly compare outcomes between patients receiving and not receiving radiotherapy within each risk stratum identified by our model. Such predictive modeling would require: (I) detailed information on radiotherapy receipt; (II) outcome comparison between treatment and no-treatment groups within each risk category; (III) sufficient sample sizes in both treatment arms within each risk stratum; (IV) appropriate methods to account for selection bias (e.g., propensity score matching, instrumental variables).
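The within-stratum comparison described in (II) could be sketched as follows. This is a minimal illustration that assumes a future dataset recording radiotherapy receipt, which ours does not. The records in `TOY_PATIENTS` are invented, and the tuple layout `(stratum, radiotherapy, years, died)` is a hypothetical convention. The sketch uses a plain Kaplan-Meier estimate and makes no adjustment for selection bias, so it does not address requirement (IV).

```python
from collections import defaultdict


def kaplan_meier_at(records, horizon):
    """Kaplan-Meier survival estimate at `horizon`.

    records: iterable of (time, event) pairs; event is 1 for death, 0 for censoring.
    """
    records = sorted(records)
    at_risk = len(records)
    survival = 1.0
    i = 0
    while i < len(records) and records[i][0] <= horizon:
        t = records[i][0]
        deaths = 0
        leaving = 0
        # Group all records sharing the same time point.
        while i < len(records) and records[i][0] == t:
            deaths += records[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
        at_risk -= leaving
    return survival


def survival_difference_by_stratum(patients, horizon=5.0):
    """Treated-minus-untreated survival at `horizon`, per risk stratum."""
    groups = defaultdict(list)
    for stratum, radiotherapy, years, died in patients:
        groups[(stratum, radiotherapy)].append((years, died))
    result = {}
    for stratum in sorted({s for s, _ in groups}):
        treated = groups.get((stratum, True))
        untreated = groups.get((stratum, False))
        if treated and untreated:  # skip strata lacking one of the two arms
            result[stratum] = (kaplan_meier_at(treated, horizon)
                               - kaplan_meier_at(untreated, horizon))
    return result


# HYPOTHETICAL records: (stratum, radiotherapy, follow-up years, died).
TOY_PATIENTS = [
    ("high", True, 2.0, 1), ("high", True, 6.0, 0),
    ("high", True, 6.0, 0), ("high", True, 6.0, 0),
    ("high", False, 1.0, 1), ("high", False, 3.0, 1),
    ("high", False, 4.0, 0), ("high", False, 6.0, 0),
    ("low", True, 6.0, 0),  # no untreated counterpart, so this stratum is skipped
]

differences = survival_difference_by_stratum(TOY_PATIENTS)
print(differences)
```

A stratum with only one treatment arm is omitted rather than reported, which mirrors requirement (III). In observational data, any such raw difference would still need the bias-control methods listed in (IV) before it could be read as a treatment effect.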
Additionally, our dataset does not include information on radiotherapy receipt, local recurrence, or distant metastasis, precluding any direct analysis of radiotherapy effectiveness or pattern of failure within our cohort. The SEER database provides only OS and breast cancer-specific survival endpoints, limiting our ability to evaluate treatment impact on disease control.
This fundamental limitation constrains our ability to make treatment recommendations and necessitates that our findings be interpreted as prognostic stratification providing context for treatment discussions rather than as predictive evidence for treatment selection. Our model identifies patients at different baseline risk levels, which is clinically valuable for risk communication and contextualization, but cannot replace studies specifically designed to evaluate treatment efficacy within defined prognostic subgroups.
Several further limitations apply. First, as a retrospective study, residual confounding from unmeasured factors cannot be entirely excluded: the SEER database lacks data on comorbidities, functional status, and endocrine therapy compliance, all of which substantially influence prognosis in elderly patients (23,24). Second, as noted above, the database does not record local recurrence or distant metastasis, precluding assessment of model performance for recurrence prediction. Third, our models did not include molecular markers such as the 21-gene recurrence score (Oncotype DX) and the 70-gene signature (MammaPrint) (25,26); future integration of multi-omic data might further enhance predictive precision. Finally, our models were developed using US SEER data and require external validation in diverse populations across different countries and regions.
Future research should encompass prospective clinical trials validating this risk stratification system’s clinical utility; integration of gene expression profiles, radiomics, and other multi-omic data to construct more precise predictive models (27,28); development of online risk calculators facilitating clinical application; establishment of differentiated surveillance strategies for distinct risk groups; and health economic evaluation assessing cost-effectiveness of precision-stratified individualised radiotherapy decisions (29).

Conclusions

Through interpretable machine learning, we successfully developed a robust prognostic risk stratification system that identifies distinct patient subgroups with markedly different survival outcomes in elderly breast cancer patients after BCS. Our three-tier classification system (low-risk: 37%, intermediate-risk: 33%, high-risk: 30%) demonstrates substantial prognostic heterogeneity, with 5-year survival rates ranging from 65–67% in high-risk patients to 88–90% in low-risk patients—a clinically meaningful difference of approximately 20–25 percentage points.
Our risk stratification system provides quantitative prognostic information that can complement existing evidence on radiotherapy benefits to inform individualized treatment discussions. Notably, the excellent prognosis observed in the low-risk subgroup (5-year OS 88–90%) is comparable to radiotherapy-treated patients in previous real-world studies (Yang et al. 2022, 88.6%), providing a foundation for hypothesis generation regarding potential treatment de-escalation in carefully selected patients. However, we emphasize that our prognostic model does not directly measure treatment-specific benefits, and prospective studies directly comparing outcomes with and without radiotherapy within each risk stratum are essential to validate the safety and efficacy of any de-escalation strategies and to determine which prognostic subgroups truly benefit most from radiotherapy.
This work represents an important step toward integrating prognostic stratification with treatment evidence to advance from population-level recommendations toward patient-level precision medicine. Our interpretable risk stratification tool can help identify which patients to prioritize for future treatment comparison studies and provide a framework for contextualized clinical decision-making. However, definitive treatment recommendations await prospective validation in studies specifically designed to evaluate treatment-specific benefits within well-defined prognostic subgroups. The clinical value of our prognostic model lies not in replacing treatment trials but in providing the prognostic framework necessary to design and interpret such trials effectively, ultimately enabling truly personalized treatment strategies that balance efficacy with quality of life considerations in this elderly, vulnerable population.

Supplementary

The article’s supplementary files as

Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.
