Modeling Individual-Level Uncertainty From Missing Data in Multifactorial Breast Cancer Risk Prediction.
APA
White BL, Ficorella L, et al. (2026). Modeling Individual-Level Uncertainty From Missing Data in Multifactorial Breast Cancer Risk Prediction. JCO precision oncology, 10, e2500852. https://doi.org/10.1200/PO-25-00852
MLA
White BL, et al. "Modeling Individual-Level Uncertainty From Missing Data in Multifactorial Breast Cancer Risk Prediction." JCO precision oncology, vol. 10, 2026, pp. e2500852.
PMID
41650366
Abstract
[PURPOSE] Multifactorial breast cancer (BC) risk prediction models use a range of predictors to estimate an individual's chance of developing BC. Data on risk factors are often incomplete, and point estimates calculated when data are missing can mask considerable uncertainty. Quantifying this uncertainty is critical for effective risk communication.
[METHODS] We used Monte Carlo simulation methods to estimate the distribution of 10-year BC risk for individuals with missing data, using the BOADICEA multifactorial model as an example. Multivariate imputation by chained equations with large representative reference data sets was used to sample missing covariates. We developed a framework for estimating the uncertainty distribution, uncertainty intervals (UIs), and probability of reclassification, which can be applied to any given individual with missing risk factor data. This was applied to estimating individual-level uncertainty distributions and quantifying the probability of reclassification when groups of risk factors are measured, for a range of example women.
[RESULTS] Women with limited risk factor data had considerable uncertainty in their estimated BC risk, and 95% UIs spanned all risk categories. This was especially relevant for women classified as moderate-risk, such as those with strong family history or a moderate-risk pathogenic variant. Reclassification probability in this case was as high as 57.5%, with 95% UI of 0.9% to 9.3% for the 10-year risk from age 40 years. Risk certainty improved with additional data collection, particularly genetic information or mammographic density measurement.
[CONCLUSION] Our results demonstrate that, in some cases, there is considerable probability of reclassification after collecting missing data. Methodology presented here can identify situations where it would be most beneficial to collect additional information, to enable better informed clinical decision making.
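The quantities named in the abstract (uncertainty distribution, 95% UI, and probability of reclassification) can be illustrated with a minimal Monte Carlo sketch. This is not the authors' implementation: `toy_risk` is a hypothetical stand-in for a risk model such as BOADICEA, the normal draws stand in for covariates that the paper samples by multivariate imputation by chained equations, and the 3% and 8% category thresholds are assumed for illustration because the abstract does not state them.

```python
import numpy as np

# Assumed 10-year risk cut-offs (illustrative only; not given in the abstract).
THRESHOLDS = (0.03, 0.08)
CATEGORIES = ("near-population", "moderate", "high")


def categorize(risk):
    """Map risk values to category indices 0, 1, 2 using THRESHOLDS."""
    return np.searchsorted(THRESHOLDS, risk, side="right")


def toy_risk(known_log_rr, missing_log_rr, baseline=0.015):
    """Hypothetical risk model: baseline hazard scaled by log relative risks."""
    return 1.0 - np.exp(-baseline * np.exp(known_log_rr + missing_log_rr))


def uncertainty_summary(known_log_rr, sampled_missing_log_rr,
                        point_missing_log_rr=0.0):
    """Summarize the risk distribution induced by sampled missing covariates."""
    # One risk value per Monte Carlo draw of the missing covariates.
    risks = toy_risk(known_log_rr, sampled_missing_log_rr)
    # Point estimate computed with the missing covariates left at a default.
    point = toy_risk(known_log_rr, point_missing_log_rr)
    lo, hi = np.percentile(risks, [2.5, 97.5])
    # Share of draws whose category differs from the point-estimate category.
    p_reclass = np.mean(categorize(risks) != categorize(point))
    return {
        "point": float(point),
        "ui95": (float(lo), float(hi)),
        "p_reclass": float(p_reclass),
    }


# Hypothetical draws standing in for imputed covariate effects (not MICE).
rng = np.random.default_rng(0)
draws = rng.normal(loc=0.0, scale=0.6, size=10_000)
summary = uncertainty_summary(known_log_rr=np.log(3.0),
                              sampled_missing_log_rr=draws)
print(CATEGORIES[int(categorize(summary["point"]))], summary)
```

In this sketch, a wide `ui95` that crosses a threshold together with a large `p_reclass` would flag an individual for whom collecting the missing data is likely to change the risk category.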
MeSH Terms
Humans; Breast Neoplasms; Female; Uncertainty; Risk Assessment; Monte Carlo Method; Risk Factors; Middle Aged