Factors predicting the risk of breast cancer: construction and validation of a nomogram model.

Ren T; Huang R; Xiao W; Huang X; Yan J; Ji Q; Song Y; Guo Y; Li X; Yang M; Xu Z; Liang F

doi:10.21037/qims-24-2220

Quantitative Imaging in Medicine and Surgery, 2026, Vol. 16(4), p. 277

PMCID: PMC13066840 (free full text)

PMID: 41972057


Abstract

[BACKGROUND] A simple and practicable strategy for predicting the risk of breast cancer (BC) urgently needs to be established. This study aimed to construct a nomogram model based on age and ultrasound (US) features to predict BC.

[METHODS] Consecutive adult females with a breast mass who underwent breast US followed by biopsy or surgery (Breast Imaging Reporting and Data System (BI-RADS) categories III-V) or who received follow-up US more than 6 months after the initial US (BI-RADS category II) from August 2020 to November 2023 were enrolled in this prospective multicenter study. The participants were allocated to three groups (the training set, internal validation set, and external validation set). A logistic regression analysis of the training set was performed to identify the independent variables associated with BC, based on which the breast nomogram model (B-NM) was constructed. The performance of the B-NM was evaluated using the area under the curve (AUC) of the receiver operating characteristic curve and calibration diagrams.

[RESULTS] The training set comprised 306 females (43.0±11.9 years), the internal validation set comprised 45 females (46.0±12.6 years), and the external validation set comprised 114 females (40.6±11.4 years). Age and four US variables (mass size, orientation, margin, and vascularity) were found to be independently associated with BC. The B-NM combining these variables demonstrated relatively good performance {AUC [95% confidence interval (CI)]: 0.914 (0.882-0.946), 0.905 (0.827-0.982), and 0.846 (0.750-0.940)} in the training, internal validation, and external validation sets, respectively. The calibration diagram showed that the nomogram's predicted probabilities were highly consistent with the observed values.

[CONCLUSIONS] The B-NM incorporating age and US variables demonstrated favorable discrimination and calibration in predicting the risk of BC.

Introduction
Breast cancer (BC) is the most commonly diagnosed malignancy and represents a serious threat to female health (1). It is estimated that there will be approximately 4.3 million new BC cases worldwide in 2040, and 1.05 million BC-related deaths (2). The projected number of cases in China is expected to reach approximately 481,000, accounting for 11.2% of the global total, while the projected number of deaths in China is expected to reach almost 174,000, accounting for 16.6% of the global total (3). Due to differences in breast density, mammography is the first-line screening method for BC in western countries (4); however, ultrasound (US)-based screening is considered more suitable in China (5).
To improve the clinical management of breast masses, the American College of Radiology (ACR) developed the Breast Imaging Reporting and Data System (BI-RADS) for mammography in 1992 (6). The first BI-RADS US was published in 2003 (7), and the fifth edition of the BI-RADS in 2013 (8). The ACR BI-RADS 2013 has been widely implemented worldwide and demonstrates relatively effective risk stratification capabilities (9). However, after nearly a decade of clinical application, certain limitations have been identified. During routine clinical application of the BI-RADS 2013, high interobserver variability has been observed, which can lead to inaccurate categorization (10-13).
Nomograms have been extensively applied as predictive tools in oncology (14-16). Able to generate the individual probability of a clinical incident by integrating diverse prognostic and determinative variables, nomograms have been widely applied across clinical disciplines [e.g., to predict treatment response to chemotherapy (17), to guide biopsy decisions (18), and to determine respiratory strategies for adults with Coronavirus Disease 2019 (19)], enabling the establishment of biologically and clinically integrated models and advancing personalized medicine. The application of nomograms has recently extended to diagnostic imaging, such as constructing models to predict BC using US images combined with radiomics features or blood sample analysis results (20,21). Radiomics and blood sample analyses demonstrate good predictive performance but are relatively complex, limiting their clinical application.
We aimed to construct a simple and practicable nomogram based on age and US features to predict the malignant risk of breast masses, which we designated the breast nomogram model (B-NM). We hypothesized that the B-NM would demonstrate robust risk prediction performance in adult female patients with breast masses. We present this article in accordance with the TRIPOD+AI reporting checklist (available at https://qims.amegroups.com/article/view/10.21037/qims-24-2220/rc).

Methods

Study design
This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The prospective investigation was approved by the Medical Ethics Committee of The Seventh Affiliated Hospital, Sun Yat-sen University (No. KY-2023-134-02). All participants signed the written informed consent form. This research was registered on https://www.chictr.org.cn/ (No. ChiCTR2400082497; name: “Constructing a Points-Based BI-RADS by Nomograms: A Prospective Multicenter Study”). The other participating hospitals were informed and agreed to the study.

Participants
From August 2020 to November 2023, consecutive patients at The Seventh Affiliated Hospital, Sun Yat-sen University, Shenzhen (training and internal validation set), Dongguan Songshan Lake Tungwah Hospital, Dongguan, China, Hengyang Central Hospital Affiliated, Hunan Normal University, Hengyang, China, and Chao’an District People’s Hospital, Chaozhou, China (external validation set) were prospectively enrolled in the study. The study inclusion criteria were as follows: (I) women aged ≥18 years; and (II) breast masses classified as ACR BI-RADS 2013 categories II–V, among which categories III–V had a pathologically confirmed diagnosis. ACR BI-RADS 2013 category II lesions included breast cysts confirmed by US. The study exclusion criteria were as follows: (I) pregnant or lactating women; (II) a history of previous breast surgery; (III) diffuse breast masses; (IV) images with large artefacts or poor resolution; and/or (V) three or more masses in the ipsilateral breast for which pathological results could not be clearly attributed to individual masses.

Test methods

US image acquisition and evaluation
For each enrolled patient, gray-scale images of the most malignant features of the breast masses and their orthogonal gray-scale views were obtained. For each enrolled patient, color Doppler static images of the breast masses showing the richest blood flow were obtained, along with perpendicular Doppler static images. The cross-section, longitudinal section, and multi-section gray-scale dynamic view of each breast mass were simultaneously acquired and stored. The US was performed with ultrasonic instruments from multiple brands (Table S1).
The US features of each breast mass and patient demographic characteristics (age, height, weight, body mass index, and mass location) were collected and recorded in a case report form (Table S2) independently by the same radiologists who performed the US. The masses that underwent biopsy or surgical resection were the same masses that had been evaluated on US. According to the ACR BI-RADS 2013, the US features of each breast mass were recorded, including breast tissue composition (fat, fibroglandular, or heterogeneous background echotexture); mass size; mass orientation (parallel or not parallel); mass margin (circumscribed, indistinct, angular, microlobulated, or spiculated); mass echo pattern (anechoic, hyperechoic, complex cystic and solid, hypoechoic, isoechoic, or heterogeneous); calcifications/echogenic foci (within a mass, outside a mass, or intraductal); associated features (architectural distortion, duct change, skin changes, or edema); and vascularity (absent, internal vascularity, or vessels in rim) (8).

Reference standard
The interval between the US examination and biopsy or surgical resection was limited to 7 days, and no clinical intervening events were reported. The following criteria were established as reference standards: (I) for BI-RADS categories III–V, histopathologic results from biopsy or surgical resection; and (II) for benign masses without a pathological diagnosis, BI-RADS category II on follow-up US performed more than 6 months later.
Image interpretation was prospectively performed under a masked scheme, under which the readers were blinded to the clinical and pathological information. The image review of the US images and videos was performed in consensus by four board-certified breast radiologists (F.L., T.R., Y.S., and Y.G., with 12, 5, 5, and 4 years of experience, respectively). If the reading results of the four radiologists were inconsistent, a consensus was reached through discussion. If a consensus could not be reached, a senior radiologist (Q.J., with 13 years of experience) was consulted. The B-NM classified cases with a predicted probability of BC greater than 50% as malignant masses (positive outcome), and those with a predicted probability of 50% or less as benign masses (negative outcomes). The pathologists had access to the clinical information and US diagnoses of the masses before making their assessments. A pathological report indicating a malignant mass was classified as a positive result.

Statistical analysis
Descriptive statistics were presented as frequencies and percentages for categorical variables, and as means and standard deviations for continuous variables. Differences among means were evaluated using analysis of variance, and differences among percentages were assessed using the χ² test. Absolute dispersions were assessed using the pairwise confidence interval (CI) method in the R software package.
In the training set, univariable and multivariable logistic regression analyses were performed to identify the main factors associated with BC. Odds ratios (ORs) and 95% CIs were calculated per 1-unit increase for all continuous variables, and for the calculation of. The variables identified as significant (P<0.10) in the univariable analysis were included in multivariable logistic regression models, and backward stepwise selection was performed based on improvement in goodness of fit, as indicated by a decrease in the Akaike information criterion. Variables were excluded if the number of events was too small to compute the ORs. To improve the generalizability and simplicity of the nomogram model, this study excluded all risk factors dependent on laboratory parameters (e.g., estradiol levels).
The final nomogram model was created from the entire training set. The final multivariable model for predicting the likelihood of BC at time t was expressed using the following formula: ln[p/(1 − p)] = α + (β1X1 + β2X2 + … + βiXi), where p is the predicted probability of BC, β is the regression coefficient, X is the reported value of the covariates that were found to be significantly associated in the multivariable regression, and α is the baseline constant, assessed from the dataset. The regression coefficients were applied to build the variable axes in the predicted model. The performance of the nomogram model was evaluated in terms of discrimination and calibration. Discrimination was quantified using the area under the curve (AUC) of the receiver operating characteristic curve, while calibration was assessed using calibration diagrams. To analyze the consistency between the nomogram predictions and actual outcomes in the training set, 1,000 bootstrap resamples (with replacement) were generated, and calibration curves were constructed. The 95% CIs were calculated and compared with those of each independently related variable in the training set. The Akaike information criterion was calculated to measure the goodness of fit of the nomogram.
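To make the discrimination metric concrete, the following is a minimal sketch of how an AUC can be computed from predicted probabilities and binary outcomes using the rank-based (Mann-Whitney) formulation. The function name `auc` and the toy data are illustrative assumptions and do not come from the study, which performed its analyses in R and SPSS.

```python
def auc(probs, labels):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive case has the higher predicted probability.
    Ties between a positive and a negative case count as one half."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three malignant (1) and three benign (0) masses.
toy_probs = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auc(toy_probs, toy_labels)
```

In this toy set, eight of the nine malignant-benign pairs are ranked correctly. The pairwise loop is quadratic in the number of cases, which is acceptable at the sample sizes reported here but would normally be replaced by a sort-based computation for larger data.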
All statistical analyses were performed using RStudio (version 4.2.3) and SPSS (version 26.0.0.0), and a P value of less than 0.05 was considered statistically significant. For the AUC analysis, cases predicted by the B-NM to have a 50% risk of BC (considered equivocal) were considered negative. All cases had clear pathological diagnoses, with no equivocal results, and there were no missing data.
Using the events per variable method, the sample size was determined based on the number of events for each independent variable. In multivariable logistic regression analysis, the minimum number of participants included should be at least 10 times the number of independent variables in the multivariable regression model. In this study, it was expected that five independent variables would be included in the final model; thus, the sample size for the case group was 10×5=50 cases. Given a BC incidence rate of 17% in this study, the required sample size was calculated as 50/0.17, indicating that at least 294 cases were needed. The sample size for model evaluation was designed to be at least 30% of that used for model construction; thus, the internal and external validation sets each required a minimum of 88 cases, resulting in a total sample size of at least 470 cases. Based on a 20% non-response rate, a minimum of 564 patient samples needed to be collected. For the ease of task allocation at each center, the planned total number of participants was set at 570, a convenient multiple of 10.
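The arithmetic in this paragraph can be traced step by step. The sketch below reproduces the reported figures using integer arithmetic; the constant and function names are illustrative, and two details are assumptions inferred from the reported numbers: fractional results are rounded down (50/0.17 is reported as 294), and the non-response adjustment is applied as a 20% increase (470 × 1.2 = 564).

```python
EVENTS_PER_VARIABLE = 10   # rule of thumb stated in the text
N_PREDICTORS = 5           # expected number of variables in the final model
MALIGNANCY_PCT = 17        # assumed BC rate, in percent
VALIDATION_PCT = 30        # each validation set relative to the training set
NON_RESPONSE_PCT = 20      # allowance for non-response


def planned_sample_size():
    """Reproduce the sample size steps described in the text."""
    events = EVENTS_PER_VARIABLE * N_PREDICTORS           # malignant cases needed
    training = events * 100 // MALIGNANCY_PCT             # rounded down, as reported
    validation_each = training * VALIDATION_PCT // 100    # per validation set
    total = training + 2 * validation_each                # training + internal + external
    with_non_response = total + total * NON_RESPONSE_PCT // 100
    return {
        "events": events,
        "training": training,
        "validation_each": validation_each,
        "total": total,
        "with_non_response": with_non_response,
    }


plan = planned_sample_size()
```

A more conservative calculation would round each step up rather than down, which would raise the training requirement slightly; the version above follows the figures as published.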

Results

Participant characteristics
From August 28, 2020, to November 28, 2023, a total of 465 patients (with 532 masses) were identified as eligible for inclusion in the final analysis of this study. Three hundred and six consecutive patients (353 masses, mean age ± standard deviation, 43.0±11.9 years) were included in the training set from August 28, 2020, to July 21, 2023. Forty-five consecutive patients (51 masses, mean age ± standard deviation, 46.0±12.6 years) were included in the internal validation set from July 22, 2023, to November 28, 2023. One hundred and fourteen consecutive patients (128 masses, mean age ± standard deviation, 40.6±11.4 years) were included in the external validation set from September 7, 2020, to November 14, 2023. The external validation set comprised patients from three centers (Dongguan Songshan Lake Tungwah Hospital, 92; Hengyang Central Hospital Affiliated, Hunan Normal University, 17; and Chao’an District People’s Hospital, 5) (Figure 1, Table 1). In the training set, 85 (24.1%) of 353 masses were malignant. In the internal and external validation sets, 19 (37.3%) of 51 and 30 (23.4%) of 128 masses were malignant, respectively. All the patients included in this study were part of a previously published cohort (22). The current study provides updated data on these patients, including an additional 16 months of follow-up. The elasticity parameters of breast masses were excluded from this analysis.

Factors associated with BC: univariable and multivariable analyses
The results of the univariable and multivariable logistic regression revealed that the independent risk factors associated with BC included age and US variables (mass size, orientation, margin, and vascularity). In our nomogram model, age [OR (95% CI): 1.106 (1.069–1.150)] and US variables, including mass size [OR (95% CI): 1.066 (1.029–1.107)]; orientation: not parallel [OR (95% CI): 3.699 (1.825–7.829)]; margin: indistinct [OR (95% CI): 9.038 (2.801–13.346)], angular [OR (95% CI): 0.010 (0.000–1.480)], microlobulated [OR (95% CI): 11.361 (2.364–15.378)], spiculated [OR (95% CI): 15.624 (2.364–20.674)], two features [OR (95% CI): 18.786 (6.565–25.778)], three or more features [OR (95% CI): 43.866 (9.011–52.233)]; and vascularity: internal vascularity [OR (95% CI): 6.904 (2.649–8.124)], vessels in rim [OR (95% CI): 1.816 (0.542–5.930)], mixed vascularity [OR (95% CI): 3.875 (1.419–5.879)], were independently associated with BC (Table 2).
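Because the ORs for continuous variables are expressed per 1-unit increase, their effect over a larger difference compounds multiplicatively. As a hedged illustration, the sketch below converts the reported age OR into a log-odds coefficient and into the odds multiplier for a 10-year age difference; the function name `odds_multiplier` is illustrative, and the calculation assumes the effect is linear on the log-odds scale, as in a standard logistic regression.

```python
import math


def odds_multiplier(or_per_unit, units):
    """Odds multiplier implied by a per-unit odds ratio over `units` units,
    assuming a linear effect on the log-odds scale."""
    return math.exp(math.log(or_per_unit) * units)


AGE_OR_PER_YEAR = 1.106                      # reported OR for age
age_beta = math.log(AGE_OR_PER_YEAR)         # log-odds change per year of age
ten_year_multiplier = odds_multiplier(AGE_OR_PER_YEAR, 10)
```

Under this assumption, a 10-year age difference multiplies the odds of BC by roughly 2.7, with all other variables held equal.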

Nomogram development and validation
Based on the final multivariable model, a nomogram was developed by giving a weighted score to each of the elements associated with BC (Figure 2). The total score was calculated as follows: –11.138 + (0.1009 × age) + (0.0639 × size) + [1.308 × orientation (not parallel)] + [2.946 × margin (indistinct)] + [14.5788 × margin (angular)] + [2.4302 × margin (microlobulated)] + [3.5730 × margin (spiculated)] + [3.3599 × margin (two features)] + [3.7812 × margin (three or more features)] + [1.932 × vascularity (internal vascularity)] + [0.5967 × vascularity (vessels in rim)] + [1.3547 × vascularity (mixed vascularity)].
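The total score is a value on the log-odds scale, so it can be turned into a predicted probability with the logistic function p = 1/(1 + e^(−score)). The sketch below shows one way to do this with the coefficients as printed. Several details are assumptions rather than statements from the study: each categorical term is treated as a 0/1 indicator, a circumscribed margin and absent vascularity are taken as reference categories contributing 0, age is taken in years, and the unit of mass size is not stated in this text (millimetres is assumed). The function names and the example patient are hypothetical.

```python
import math

INTERCEPT = -11.138
AGE_COEF = 0.1009             # per year of age (assumed unit)
SIZE_COEF = 0.0639            # per unit of mass size (mm assumed, not stated)
NOT_PARALLEL_COEF = 1.308

# Margin coefficients as printed; "circumscribed" is an assumed reference level.
MARGIN_COEF = {
    "circumscribed": 0.0,
    "indistinct": 2.946,      # assumed to be the BI-RADS "indistinct" category
    "angular": 14.5788,       # as printed; sign looks at odds with the reported OR of 0.010
    "microlobulated": 2.4302,
    "spiculated": 3.5730,
    "two features": 3.3599,
    "three or more features": 3.7812,
}

# Vascularity coefficients as printed; "absent" is an assumed reference level.
VASCULARITY_COEF = {
    "absent": 0.0,
    "internal vascularity": 1.932,
    "vessels in rim": 0.5967,
    "mixed vascularity": 1.3547,
}


def bnm_score(age, size, not_parallel, margin, vascularity):
    """Total score (log-odds) from the printed formula."""
    return (
        INTERCEPT
        + AGE_COEF * age
        + SIZE_COEF * size
        + NOT_PARALLEL_COEF * (1 if not_parallel else 0)
        + MARGIN_COEF[margin]
        + VASCULARITY_COEF[vascularity]
    )


def bnm_probability(age, size, not_parallel, margin, vascularity):
    """Predicted probability of BC via the logistic function."""
    score = bnm_score(age, size, not_parallel, margin, vascularity)
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical examples, not patients from the study.
high_risk = bnm_probability(50, 20, True, "spiculated", "internal vascularity")
low_risk = bnm_probability(30, 10, False, "circumscribed", "absent")
```

With the 50% cut-off described in the reference standard, the first hypothetical mass would be classified as malignant and the second as benign. This is an illustration of the formula only and should not be used for clinical decisions without checking the coefficients against Table 2 and Figure 2 of the original article.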
We assessed the ability of our final model (the B-NM) to differentiate between BC and benign breast masses using AUC values. The AUC of the B-NM in predicting BC in the training set was 0.914 (95% CI: 0.882–0.946), which was significantly higher than the AUC of each variable in the model (Figure 3A). The bootstrap-corrected calibration diagram (apparent and bias-corrected) aligned with the ideal line in the training set, demonstrating high consistency between the B-NM predictions and observed values (Figure 3B).
The B-NM demonstrated good discrimination performance in the internal validation set comprising 45 females [AUC (95% CI): 0.905 (0.827–0.982)] (Figure 3C). It also demonstrated good calibration, showing strong agreement in the calibration diagram (Figure 3D). The B-NM demonstrated good discrimination performance in the external validation set comprising 114 females [AUC (95% CI): 0.846 (0.750–0.940)] (Figure 3E). It also demonstrated good calibration, showing strong agreement in the calibration diagram (Figure 3F). No adverse events (serious or non-serious) were observed in the trial. A representative example of BC is provided in Figure 4. Figure 5 presents a case in which the B-NM predicted a high risk of BC, but the histopathological outcome was a benign mass.

Discussion
Patients diagnosed with BC at advanced tumor stages (T stages) (T3–4) exhibit significantly higher mortality compared to those diagnosed at earlier stages. Thus, the early prediction and accurate diagnosis of BC at earlier T stages (T1–2) are essential for facilitating timely interventions by healthcare professionals and enhancing patient prognosis. In this study, we developed and validated a nomogram for the simple prediction of BC in patients. Our nomogram incorporating age and US variables (mass size, orientation, margin, and vascularity) demonstrated good discrimination ability [AUC (95% CI): 0.905 (0.827–0.982) and 0.846 (0.750–0.940)] in the internal and external validation sets, respectively, as well as good calibration ability in both cohorts.
Several studies have shown that T3–4 stage BC is associated with a poor prognosis, including a low 5-year survival rate (23). The early diagnosis of BC is particularly important for low-income patients, as access to expensive targeted therapies is often limited in this group. Previous studies have suggested that the ACR BI-RADS 2013 performs well in the classification of breast masses at intermediate or high risk (24,25); however, some scholars hold differing views (13). The discriminatory ability of the model in the training set was similar to that in the internal and external validation sets. Therefore, we identified five risk factors from the ACR BI-RADS 2013 guidelines to enhance the usability of the B-NM.
The ACR BI-RADS 2013 provides a range of malignancy probabilities for breast masses; however, the B-NM provides a quantitative malignancy probability for each individual patient, which is more in line with the trend toward individualized diagnosis and treatment. In addition, the ACR BI-RADS 2013 is relatively complex, and radiologists must evaluate more than 10 indicators to stratify breast masses by malignancy. Conversely, the B-NM evaluates only five risk factors, greatly simplifying the analysis process. Notably, the performance of our nomogram surpassed that of the BI-RADS (AUC: 0.914 vs. 0.68–0.80, respectively) (24,26). Further, the B-NM is easy to apply in clinical practice. The nomogram could also be used to identify patients with BC in remote areas, where the availability of breast magnetic resonance imaging is limited. Moreover, this nomogram, which can detect BC more efficiently and cost-effectively than the ACR BI-RADS, may be particularly useful for patients who cannot undergo breast magnetic resonance imaging for regional or financial reasons. These patients could benefit from early diagnosis and clinical intervention.
Unlike previous reports (8,27), the present study showed that microcalcification presence was less strongly associated with BC. This may be related to the inherent limitations of US, which has a very low detection rate for breast microcalcifications. The development of breast US MicroPure imaging technology has improved breast microcalcification detection (28); however, the ability of mammography to detect breast microcalcifications remains superior to that of US (29). Therefore, in the future, we intend to construct a nomogram incorporating both breast US and mammography imaging data to determine whether it can improve the performance of our current nomogram model, which is based solely on age and breast US features.
The “vessel-in-rim” phenomenon is considered a characteristic vascularity feature of fibroadenomas (30); thus, the OR of 1.816 was somewhat surprising. It may be that many fibroadenomas lack vascularity and that some BCs exhibit the “vessel-in-rim” feature. Additionally, the blood vessels located in the central region of invasive BC are often too small in diameter to be detected by US probes. As a result, only blood flow signals from relatively larger peripheral vessels are visualized on US images. This misclassification of a subtype of invasive carcinoma as displaying a peripheral vascular pattern has led to an overestimation of the proportion of invasive cancers characterized by peripheral vascularity.
This study had several limitations. First, the distribution of data was imbalanced; the number of collected cases across the three centers (Dongguan Songshan Lake Tungwah Hospital, Hengyang Central Hospital Affiliated, Hunan Normal University, and Chao’an District People’s Hospital) varied greatly, which may have introduced potential bias in the external validation dataset. The number of cases in the internal validation set (which only included 51 masses) did not reach our estimated sample size of 88. The ratio of cases between the external and internal validation sets exceeded 2:1, which may have caused model overfitting during internal validation. Second, only Han Chinese patients from four centers in central to southern China were enrolled in the study; however, regional and ethnic differences in the incidence of BC have been reported (23,31), which may limit the external applicability of the B-NM. Third, only four US features were included in the development of the nomogram model, one of which was the margin. However, the margin is easily influenced by observer subjectivity, which may have led to inter-observer inconsistency and may reduce the external consistency of the model. Finally, the ability of our nomogram model to distinguish between BC and chronic inflammatory masses of the breast is limited. This may be because the margin, which was one of the US features, was overweighted in predicting BC risk, as the margins of inflammatory masses are similar to those of invasive BC. This observation warrants further validation in future research.

Conclusions
Our B-NM model, which incorporates age and US variables, demonstrated favorable discrimination and calibration performances in predicting the risk of BC. The B-NM is straightforward to implement in clinical practice, providing a quantitative assessment of the malignant probability for each individual patient.

Supplementary
The article’s supplementary files as

Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.
