
The value of different machine learning radiomics based on DCE-MRI in predicting axillary lymph node status of breast cancer.

Translational cancer research, 2025, Vol. 14(12), p. 8475-8488

Wang H, Gong L


Cite this article

APA Wang H, Gong L (2025). The value of different machine learning radiomics based on DCE-MRI in predicting axillary lymph node status of breast cancer. Translational cancer research, 14(12), 8475-8488. https://doi.org/10.21037/tcr-2025-1418
MLA Wang H, et al. "The value of different machine learning radiomics based on DCE-MRI in predicting axillary lymph node status of breast cancer." Translational cancer research, vol. 14, no. 12, 2025, pp. 8475-8488.
PMID 41510122 ↗

Abstract

[BACKGROUND] The accurate preoperative assessment of axillary lymph node (ALN) status is critical for therapeutic decision-making in primary breast cancer (BC), yet current methods are either invasive or lack precision. The objective of this study was to investigate the performance of machine learning models based on dynamic contrast-enhanced magnetic resonance imaging (MRI), in conjunction with clinicopathologic data, in predicting different American Joint Committee on Cancer (AJCC) lymph node (N) stages in patients with BC.

[METHODS] The data of 605 BC patients were retrospectively analyzed and separated into training and test sets. Following dimensionality reduction and feature selection, a predictive model was established via machine learning techniques. Clinicopathologic features were assessed through both univariable and multivariable logistic regressions (LRs) to select variables for constructing clinical models. The optimal radiomics and clinical models were identified via receiver operating characteristic (ROC) curve analysis and integrated into a combined model. The clinical utility of this combined model was evaluated via decision curve analysis (DCA), which confirmed its superior diagnostic accuracy in detecting axillary lymph node metastasis (ALNM).

[RESULTS] The combined model yielded area under the curve (AUC) values of 0.890 and 0.854 in the training and test sets, respectively. Additionally, in differentiating the N1 group from the N2-3 group, the combined model showed strong performance, with AUC values of 0.973 and 0.835 in the training and test sets, respectively. Moreover, the model effectively classified the N0, N1, and N2-3 groups, achieving a micro-AUC of 0.861 and a macro-AUC of 0.812.

[CONCLUSIONS] The integration of radiomics features with clinicopathologic characteristics provides a robust predictive tool for ALNM, potentially offering a noninvasive and effective approach for clinical decision-making.


Introduction

Breast cancer (BC) is the leading cause of cancer-related mortality among women globally (1). Importantly, the prognosis of the disease varies significantly across different lymph node stages (N stages), necessitating different surgical and adjuvant treatment approaches (2,3). Initially, the N stage is evaluated through either sentinel lymph node biopsy (SLNB) or axillary lymph node dissection (ALND) on the basis of the patient’s individual circumstances (4,5). The National Comprehensive Cancer Network (NCCN) strongly recommends the application of preoperative systemic therapy in patients with N2–3 disease. Postmastectomy radiotherapy is also recommended for patients with N2–3 disease according to both the European Society for Medical Oncology and the American Society of Clinical Oncology (6). Nevertheless, both SLNB and ALND are invasive procedures and carry potential complications, such as numbness, seromas, lymphedema, and infections (7). Moreover, SLNB has been criticized for its high false-negative rate (8). Therefore, it is crucial to explore and develop noninvasive and accurate diagnostic methods for the preoperative assessment of the patient’s N stage. Such methods could reduce unnecessary lymphadenectomies, alleviate the psychological and physical burdens on patients, and minimize activity limitations and the risk of surgical complications.
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is widely used in clinical settings for identifying high-risk individuals, determining tumor stage, and evaluating the efficacy of neoadjuvant chemotherapy (NACT) (9). Despite its broad application (10,11), DCE-MRI generally requires manual annotation of a limited number of qualitative descriptors of the tumor, potentially introducing observer bias into the results (12,13).
Machine learning-based radiomics methods have received considerable attention, despite demonstrating various limitations in clinical trials (14,15). Radiomics involves the automated extraction of numerous quantitative image features that are often imperceptible to humans (16). High-dimensional data, including texture features, intensity, and shape, can be extracted via specialized software and analyzed with specific algorithms (17,18).
However, existing models focus primarily on the qualitative analysis of axillary lymph node metastasis (ALNM) (19), with only a few conducting quantitative assessments (20-22). Furthermore, these quantitative analyses often overlook the N stage as defined by the American Joint Committee on Cancer (AJCC)/Union for International Cancer Control (UICC) tumor-node-metastasis (TNM) staging system (8th edition) (23), a gap that is particularly critical to address for patients with supraclavicular or internal mammary lymph node metastasis.
To address these issues, we developed a machine learning framework that integrates DCE-MRI radiomics of primary tumors with clinicopathologic data from The Cancer Genome Atlas-The Cancer Imaging Archive (TCGA-TCIA) repository, aiming to noninvasively discriminate between N0, N1, and N2–3 ALNM. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-1418/rc).

Methods


Study population
The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The dataset used in this study was obtained from the TCIA, particularly from the Duke-Breast-Cancer-MRI section (24). The retrieved data comprise preoperative DCE-MRI scans of 922 patients with biopsy-validated BC, including details of the tumor characteristics. All MRI examinations were performed at initial diagnosis, prior to any systemic or neoadjuvant therapy. The use of this public database adhered to the citation requirements and data use policies outlined on the TCGA-TCIA website. This study was exempt from institutional review board oversight, as patient identifiers are not accessible to database users.
Among the 922 patients, 622 had unilateral BC. The axillary lymph node (ALN) status for each patient was determined through postoperative pathological evaluation and subsequently classified according to the N classification of the AJCC TNM staging system, 8th edition. Following the exclusion of 18 patients lacking lymph node pathology data, 604 patients were included in the study.

Imaging data
Preoperative DCE-MRI scan parameters were obtained from the TCIA. Both 1.5T and 3T MRI scanners were used, and the patients were scanned primarily in the prone position. The images from the following MR sequences were obtained in DICOM format: a fat-saturated gradient echo T1-weighted precontrast sequence, a non-fat-saturated T1-weighted sequence, and four postcontrast T1-weighted sequences obtained after the administration of intravenous contrast material (weight-based protocol, 0.2 mL/kg). The scanner and MRI acquisition parameters have been documented in detail in previous studies (25,26).

Clinicopathologic and radiological analysis
Clinicopathologic and radiological data were acquired from the TCIA. Clinicopathologic parameters, such as age, menopausal status, tumor location, histologic type, Nottingham grade, and T stage (tumor size), were retrospectively retrieved and analyzed. In addition, we analyzed the following imaging parameters: multicentricity, lymphadenopathy, skin or nipple involvement, and chest involvement.

Radiomics feature extraction and selection
A compilation of 529 computer-extracted imaging features was obtained from the TCIA (27). This dataset included commonly published features from the literature alongside uniquely extracted features. The patients were divided into training and testing cohorts at a ratio of 8:2. In the training cohort, dimensionality reduction and feature selection entailed the following steps. First, all feature values were normalized via z score normalization, in which features are rescaled to a mean of zero and a standard deviation of one, ensuring comparability across features with varying scales. Spearman’s correlation analysis was then used to assess correlations between features; for each pair of features whose correlation coefficient exceeded 0.9, only one was retained to minimize multicollinearity. The minimum redundancy maximum relevance (mRMR) algorithm was then employed to identify the most pertinent features for tumor classification. Finally, a least absolute shrinkage and selection operator (LASSO) regression model with 10-fold cross-validation was developed, retaining features with nonzero coefficients. All procedures were validated in the data of the test cohort.
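The first two selection steps described above can be sketched in plain Python. This is a minimal illustration and not the authors' code: the function names (`zscore`, `spearman`, `drop_correlated`), the use of the population standard deviation, the use of the absolute correlation, and the rule of keeping the first feature of a correlated pair are all assumptions. The mRMR and LASSO steps are not shown.

```python
import math
from statistics import mean, pstdev

def zscore(column):
    """Rescale one feature column to mean 0 and standard deviation 1.
    Assumption: population SD; the paper does not say which SD was used."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def drop_correlated(features, threshold=0.9):
    """features: dict mapping feature name -> list of values.
    Hypothetical rule: walk features in order and keep one only if its
    absolute rho with every already-kept feature is at most the threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Toy data: "b" is a monotone transform of "a" (rho = 1), so it is dropped.
toy = {"a": [1, 2, 3, 4, 5], "b": [2, 4, 6, 8, 11], "c": [3, 1, 4, 1, 5]}
selected = drop_correlated(toy)
```

In a real pipeline the mean and standard deviation would be estimated on the training cohort only and then applied unchanged to the test cohort.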

Machine learning model construction
Radiomics models were constructed using the final set of selected features. Machine learning techniques were applied to create precise, objective, and reliable models to aid in clinical decision-making (28). Four widely used algorithms were evaluated: support vector machine (SVM), random forest (RF), logistic regression (LR), and extreme gradient boosting (XGBoost). The diagnostic models were compared via metrics such as the area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
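The comparison metrics listed above follow directly from the confusion matrix and from pairwise score comparisons. The sketch below is illustrative only: the function names and the toy labels are hypothetical, and it assumes binary labels coded as 1 (node-positive) and 0 (node-negative) with both classes present.

```python
def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV, and NPV from hard predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

def auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy cohort: 4 positive and 6 negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.35, 0.05]
metrics = diagnostic_metrics(y_true, y_pred)
toy_auc = auc(y_true, scores)
```

The pairwise form of the AUC is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients.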

Clinical and combined model construction
Univariable LR analysis was conducted on the clinicopathologic and radiological characteristics. Multivariable LR analysis was subsequently conducted for those features that were deemed significantly different between the groups in the univariable analysis to determine the final predictor variables for model development.
The optimal radiomics model was integrated with the clinical predictors to create a combined model. The performance of the combined model was assessed via receiver operating characteristic (ROC) curves, whereas the clinical efficacy of the model in tumor classification was evaluated via decision curve analysis (DCA), which quantifies the net benefit across various threshold probabilities.
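For readers unfamiliar with DCA, the net benefit at a threshold probability p_t is commonly defined as TP/n - (FP/n) × p_t/(1 - p_t), where n is the cohort size. The sketch below assumes that standard definition; the paper does not state its exact formula. The function names, the convention of treating a patient as positive when the predicted probability is at least p_t, and the toy data are all hypothetical.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of acting on predictions at one threshold probability.
    Assumes 0 < threshold < 1 and labels coded 1 (ALNM) / 0 (no ALNM)."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: every patient is treated as node-positive."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)

# Hypothetical toy cohort of five patients.
y_true = [1, 1, 0, 0, 0]
probs = [0.9, 0.6, 0.7, 0.2, 0.1]
nb_model = net_benefit(y_true, probs, 0.5)
nb_all = net_benefit_treat_all(y_true, 0.5)
```

A decision curve plots these values over a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is zero.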

Statistical analysis
Statistical analysis was performed using SPSS version 25.0, Python version 3.5.6, and R version 3.5.3. Independent t-tests were used to compare continuous variables, and chi-square tests or Fisher’s exact tests were used to compare categorical variables. Statistical significance was defined as a two-tailed P value less than 0.05.

Results


Prediction of ALN status between N0 and N+ (≥1 metastatic ALN)
A total of 605 patients were included in the study (Table 1). Univariable and multivariable LR analyses revealed that multifocal tumor status, T stage, lymphadenopathy or suspicious nodes on MRI, and metastasis outside the lymph nodes were independent predictors of ALNM (Table 2). LASSO regression analysis was employed for dimensionality reduction (λ=0.0295).
The results of ROC curve analysis are displayed in Table 3. The XGBoost model achieved an AUC of 0.999 in the training cohort, indicating almost perfect performance. However, its AUC decreased substantially to 0.776 in the test cohort, reflecting relatively poor generalizability to unseen data. This discrepancy suggests overfitting, a condition wherein the model captures noise and specific patterns in the training data that do not generalize well to independent datasets. In contrast, the SVM model performed more consistently, with AUCs of 0.869 in the training cohort and 0.813 in the test cohort, making it the optimal choice of model for this study.
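The overfitting argument can be made concrete by comparing the gap between training and test AUC for the two models whose values are quoted in this paragraph. The snippet is only a worked illustration of that arithmetic: the variable names are hypothetical, values for the other two algorithms are not given in the text and are therefore omitted, and picking the model with the highest test AUC is our reading of the paragraph rather than a stated procedure.

```python
# (training AUC, test AUC) as reported in the text for two of the four models.
reported_auc = {
    "XGBoost": (0.999, 0.776),
    "SVM": (0.869, 0.813),
}

def generalization_gap(train_auc, test_auc):
    """Drop in AUC from the training cohort to the test cohort."""
    return round(train_auc - test_auc, 3)

gaps = {name: generalization_gap(*pair) for name, pair in reported_auc.items()}

# Assumed selection rule: highest test AUC among the listed models.
best_model = max(reported_auc, key=lambda name: reported_auc[name][1])
```

With these numbers the gap is about 0.223 for XGBoost and 0.056 for SVM, which matches the paragraph's conclusion that the SVM model generalizes more consistently.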
By incorporating radiomics features with clinicopathologic and radiological predictors, a combined model was constructed, which demonstrated superior discriminatory capacity over the individual clinical and radiomics feature models in both the training and test cohorts, as shown in its higher AUC values (0.890 in the training cohort, 0.854 in the testing cohort) (Figure 1). Figure 2 displays the DCA plots for the three models, demonstrating that the combined model provided the greatest net benefit in classifying ALN involvement in both the training and test cohorts.

Comparison between N1 and N2–3 patients
For this analysis, N1 BC was adopted as the negative reference standard (Table 1). In the training cohort, the AUC of the radiomics model was 0.972, whereas that of the clinicopathological model was only 0.505. Combining the clinical predictors with the radiomics model yielded an AUC of 0.973. In the test cohort, the AUC of the combined model decreased to 0.835 but remained at a reasonable value. The details of the statistical results are summarized in Table 4. Figure 3 shows the corresponding ROC curves for the comparisons, whereas Figure 4 presents the DCA plots for the three models.

Prediction of the N stage for N0, N1 and N2–3 patients
The present study expanded upon the existing model to accommodate three distinct task groups for predicting ALN status. The clinical endpoints were divided into three groups, the N0, N1 and N2–3 groups, with 359, 182 and 64 lesions, respectively. In multiclass classification, the micro-AUC evaluates overall predictive capability by aggregating predictions across all classes into a single ROC curve, giving equal weight to each instance. The macro-AUC, on the other hand, averages the AUCs of the individual classes, treating all classes equally in an attempt to highlight balanced performance across them. In this study, the combined model achieved a micro-AUC of 0.861, indicating strong overall predictive performance, and a macro-AUC of 0.812, demonstrating some level of effectiveness in maintaining balanced prediction across the N0, N1, and N2–3 groups. These metrics collectively highlight the robustness and generalizability of the combined model. The confusion matrix is depicted in Figure 5, while the ROC curves for the combined model are illustrated in Figure 6. Notably, the combined model demonstrated favorable performance in distinguishing between the N0 and N2–3 groups but yielded less satisfactory outcomes in identifying the N1 group.
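The difference between the two multiclass summaries can be illustrated with a small one-vs-rest sketch. It is a sketch under stated assumptions, not the authors' implementation: the function names, the one-vs-rest scheme, and the toy probabilities are hypothetical, and the AUC is computed by pairwise comparison with ties counted as half.

```python
def binary_auc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(labels, scores) if t == 1]
    neg = [s for t, s in zip(labels, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_auc(y_true, prob_rows, classes):
    """Unweighted mean of the one-vs-rest AUC of each class."""
    per_class = []
    for k, c in enumerate(classes):
        labels = [1 if t == c else 0 for t in y_true]
        scores = [row[k] for row in prob_rows]
        per_class.append(binary_auc(labels, scores))
    return sum(per_class) / len(per_class)

def micro_auc(y_true, prob_rows, classes):
    """Pool every (patient, class) pair into one binary problem, then take its AUC."""
    labels, scores = [], []
    for t, row in zip(y_true, prob_rows):
        for k, c in enumerate(classes):
            labels.append(1 if t == c else 0)
            scores.append(row[k])
    return binary_auc(labels, scores)

# Hypothetical toy predictions; columns follow the order of `classes`.
classes = ["N0", "N1", "N2-3"]
y_true = ["N0", "N0", "N1", "N2-3"]
prob_rows = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.3, 0.3, 0.4],  # the N1 patient is poorly separated
    [0.1, 0.2, 0.7],
]
toy_macro = macro_auc(y_true, prob_rows, classes)
toy_micro = micro_auc(y_true, prob_rows, classes)
```

In this toy case the weak N1 class lowers the macro-AUC more than the micro-AUC, which is the same pattern as the reported gap between 0.812 and 0.861 when one class is harder to identify.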

Discussion


Clinical significance and current challenges in predicting ALNM
ALNM is a well-established prognostic indicator in BC and significantly influences both the clinical course and treatment decisions for affected patients (21,22). Several studies have employed imaging modalities such as ultrasound (US), mammography (MMG), and MRI to evaluate the utility of breast radiomics in diagnosing BC to identify prognostic factors and to predict therapeutic responses (29-32).
However, research specifically focusing on the application of breast radiomics for predicting ALNM remains limited (33). Most existing studies have focused primarily on the qualitative analysis of ALNM (29,34), with only a few addressing the quantitative assessment of this burden in BC. These studies typically aimed to distinguish between patients with low-load ALNM, defined by one to two positive nodes (N1–2), and those with heavy-load metastasis, defined by three or more positive nodes (N ≥3) (33,35), or, relatedly, to differentiate between (N1–3) and (N ≥4) groups (36). However, these quantitative analyses largely rely on the number of ALNs as the sole classification criterion, neglecting the classification based on the N stage itself in the conventional TNM classification system. This oversight is particularly important when considering metastasis to supraclavicular or internal mammary lymph nodes.
Previous studies have shown that radiomics features obtained from ALNs may aid in predicting ALNM in BC patients (37,38). However, this approach is limited by some challenges, including discrepancies between imaging and pathological findings and the limited scanning scope of standard MRI. Additionally, biases may be introduced by excluding patients with small ALNs to mitigate issues associated with manual delineation.
Notably, models built from radiomics features derived from breast tumor images captured during the peak enhancement phase of DCE-MRI outperform models relying on features from the initial imaging phase in predicting ALNM (39,40). This greater predictive power likely results from the improved visibility of tumor heterogeneity and aggressive characteristics during peak contrast enhancement (41).

Novelty and key findings of the present study
This study is the first to demonstrate the potential of integrating radiomics features extracted from DCE-MRI with clinicopathologic characteristics to predict the N stage of ALN in BC patients. A unique aspect of this study is the classification of lymph nodes according to the conventional TNM staging system rather than a reliance solely on the number of metastatic nodes. This approach specifically distinguishes among patients with stages N0, N1, and N2–3. This stratification is clinically significant, as different N stages are associated with different patient outcomes and therapeutic strategies. The results highlighted the superior performance of the combined model, with AUC values of 0.890 and 0.854 in the training and test cohorts, respectively, in differentiating between patients with no ALNM (N0) and those with at least one metastatic lymph node (N+). Furthermore, ROC curve analysis yielded AUC values of 0.973 and 0.835 in the training and test cohorts, respectively, demonstrating the model’s high accuracy in differentiating between the N1 and N2–3 stages. Importantly, the model also performs well in distinguishing among the three categories (N0, N1, and N2–3), and thus may reduce the need for invasive procedures such as SLNB and ALND.
The integrated model demonstrated increased diagnostic capability because of the synergistic interaction between radiomics and clinicopathologic features. Radiomics texture metrics, such as entropy and uniformity, reflect tumor heterogeneity (42), whereas shape parameters, such as sphericity and compactness, capture invasive growth patterns (43). Meanwhile, clinicopathologic variables, such as tumor size (T stage), multifocality, and lymphadenopathy, provide an essential clinical context.
In contrast to single-center studies, our research utilized publicly available datasets from the TCGA-TCIA repository (44). Developed by leading institutions, these resources ensure reliability through rigorous quality control and standardized protocols. The comprehensive nature of these datasets enhances their statistical power and generalizability while minimizing the time and resources required for data acquisition by researchers.

Limitations and future directions
While the findings are promising, there are notable limitations in this study. First, differences in imaging acquisition parameters may have introduced heterogeneity into the dataset. While such differences can improve generalizability, they may also affect the reproducibility of the radiomics features. Future work should prioritize standardized imaging acquisition parameters to address this issue (45). Second, the exclusion of patients with bilateral BC limits the model’s applicability to unilateral cases. The development of models that can predict ALNM in patients with bilateral BC remains crucial for future research. Future studies should also explore the integration of multimodal data, such as genomic and metabolic data, to refine the model’s predictive accuracy (46). Third, it is important to acknowledge that the pathological N staging employed as our reference standard did not distinguish between macrometastases, micrometastases, and isolated tumor cells, due to the inconsistent availability of such detailed information within the dataset. Furthermore, deep learning frameworks, such as convolutional neural networks (CNNs) (47), represent a critical frontier for exploration (48). Unlike traditional radiomics, which relies on manually curated features, deep learning can automatically extract complex patterns from imaging data, offering the potential to identify novel biomarkers for ALNM (49).

Conclusions

This study demonstrates that integrating DCE-MRI radiomics of the primary tumor with clinicopathologic data provides a powerful, non-invasive tool for accurately predicting AJCC/UICC TNM N stage in BC patients, offering significant potential to refine preoperative planning and reduce reliance on invasive axillary procedures.

Supplementary

The article’s supplementary files as

Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.
