Machine Learning-based Prediction of Mean Heart Dose and Deep Inspiration Breath-hold Selection in Left-sided Breast Cancer Volumetric Modulated Arc Therapy Radiotherapy Planning.
APA
Patil D, Zope MK, et al. (2026). Machine Learning-based Prediction of Mean Heart Dose and Deep Inspiration Breath-hold Selection in Left-sided Breast Cancer Volumetric Modulated Arc Therapy Radiotherapy Planning. Journal of Medical Physics, 51(1), 77-88. https://doi.org/10.4103/jmp.jmp_292_25
MLA
Patil D, et al. "Machine Learning-based Prediction of Mean Heart Dose and Deep Inspiration Breath-hold Selection in Left-sided Breast Cancer Volumetric Modulated Arc Therapy Radiotherapy Planning." Journal of Medical Physics, vol. 51, no. 1, 2026, pp. 77-88.
PMID
42039731
Abstract
[BACKGROUND] Deep inspiration breath-hold (DIBH) can reduce cardiac radiation exposure in left-sided breast cancer, but resource limitations necessitate appropriate patient selection.
[PURPOSE] To develop and evaluate a machine learning-based tool for predicting heart mean dose and identifying patients who would benefit from DIBH using simple anatomical predictors in left-sided breast cancer.
[MATERIALS AND METHODS] A retrospective study analyzed 120 patients' treatment plans on free-breathing (FB) scans from left-sided postmastectomy breast cancer patients treated with volumetric modulated arc therapy (VMAT). All plans were generated using three techniques: VMAT 2-field plan (VMAT-2P), VMAT 4-field plan (VMAT-4P), and VMAT 5-field plan (VMAT-5P). Two anatomical predictors, maximum heart distance (MHD) and heart-to-PTV distance (HPD), were measured. Elastic Net regression was used for continuous dose prediction, whereas logistic regression was applied for binary classification of DIBH necessity, using a 5 Gy heart mean dose threshold. An independent cohort (n = 25) with paired FB-DIBH scans validated predictions.
[RESULTS] In the validation cohort (n = 25), DIBH reduced mean heart dose by 34% (1.72–1.86 Gy, P < 0.001 for both techniques) and decreased high-risk patients (>5 Gy) by 69%–80%. Strong correlations were observed between FB predictions and DIBH-achieved doses for anatomical parameters and VMAT-2P (r = 0.667–0.720, P < 0.001), with moderate correlation for VMAT-4P (r = 0.545, P = 0.005). In the independent test cohort from the model development dataset (n = 24), Elastic Net achieved mean absolute errors of 0.81–1.02 Gy. Logistic regression demonstrated 87.5% accuracy with 83%–92% sensitivity and 83%–92% specificity for VMAT-2P and VMAT-4P (area under the curve [AUC]: 0.85–0.94). The VMAT-5P technique showed reduced classification performance (58.3% accuracy, AUC: 0.83).
[CONCLUSIONS] Machine learning software demonstrated accurate prediction of mean heart dose during pre-planning for left-sided breast cancer, enabling informed DIBH selection for cardiac sparing based on simple anatomical metrics from FB computed tomography (CT) scans.
INTRODUCTION
Breast cancer remains the most prevalent malignancy diagnosed in women worldwide, with 2.30 million new cases reported in 2020.[1] Five-year survival rates exceed 90% in developed countries, which has shifted the clinical focus toward minimizing the toxicity associated with treatments.[2] Volumetric modulated arc therapy (VMAT) allows for precise dose delivery while sparing normal tissues,[3] yet the issue of cardiac toxicity is still critical in left-sided breast radiotherapy despite these advances.[4] Darby et al.[5] demonstrated that the incidence of major coronary events increases linearly by 7.4% for each gray of mean heart dose, with the risk persisting for more than 20 years.
Deep inspiration breath-hold (DIBH) has been identified as an effective technique for cardiac sparing, as it increases lung volume and moves the heart inferiorly and posteriorly by approximately 2–3 cm. Clinical research demonstrates that DIBH can achieve a mean heart dose reduction of 30%–40% compared to free-breathing (FB).[6,7] However, DIBH requires additional equipment, specialized training, and patient cooperation. Not all patients experience significant reductions in cardiac dose; some show only minimal benefit because of their anatomy,[8,9] which necessitates careful patient selection. Current methodologies rely on comparative planning that generates both FB and DIBH plans to evaluate eligibility. This process is time-intensive, resource-demanding, and produces inconsistent patient selection.[9] Meanwhile, machine learning techniques are showing potential for improving decision-making in radiation oncology.[10,11]
Previous research has demonstrated that heart dose can be predicted from anatomical features using machine learning, including maximum heart distance, heart volume within the field, total heart volume, and PTV volume.[12] Other investigators have explored anatomical predictors with varying complexity. Mendez et al.[13] developed geometric predictors reporting moderate correlations (r = 0.54–0.68) with heart dose. Cao et al.[14] achieved area under the curve (AUC) values of 0.72–0.79 for predicting DIBH benefit using heart-in-field volume analysis, though this required sophisticated three-dimensional segmentation and extensive processing. While effective, these complex volumetric methods have limited practical utility as rapid decision-support tools.
Simple predictive models that are anatomically based and use readily accessible computed tomography (CT) measurements could provide efficient heart dose estimation before starting time-consuming comparative planning. Two parameters are of clinical relevance: maximum heart distance (MHD) and heart-to-PTV distance (HPD), which quantify the spatial separation between cardiac structures and the planning target volume. These measurements can be obtained in under 2–3 min without the need for complex segmentation algorithms.
The purpose of this study was to develop and validate a machine learning tool aimed at predicting the mean heart dose in left-sided post-mastectomy radiotherapy, utilizing simple MHD and HPD measurements from FB CT scans. This study makes three key contributions: (1) the simultaneous evaluation of three distinct VMAT techniques within a cohesive framework, (2) a comprehensive assessment of continuous dose prediction and binary classification for DIBH selection based on the 5 Gy threshold, and (3) validation using actual DIBH scans, with correlation analysis between FB and DIBH anatomical and dosimetric parameters.
MATERIALS AND METHODS
Patient selection and study design
After obtaining approval from the institutional review board, we conducted a retrospective analysis of 120 consecutive patients who received left-sided post-mastectomy VMAT from December 2022 to September 2024. The study design, software development, and validation workflow are presented in Figure 1.
Model development cohort (n = 120, FB scan): The inclusion criteria required that patients be at least 18 years old, diagnosed with left-sided breast cancer, and have undergone a modified radical mastectomy, along with complete documentation for postmastectomy radiotherapy planning. Exclusion criteria included prior thoracic radiotherapy, the presence of a cardiac pacemaker, and bilateral disease.
DIBH validation cohort (n = 25, FB and DIBH scan): A separate validation cohort comprised 25 consecutive patients who underwent both FB and DIBH CT simulations as part of routine institutional practice, rather than being selected on the basis of predicted cardiac dose. This cohort was independent of the 120-patient model development cohort and was used solely to validate the accuracy of FB-to-DIBH predictions.
Patient characteristics
The study included 145 patients in total: 120 in the model development cohort with FB scans and an independent 25-patient cohort with paired FB-DIBH scans for validation. In the 120-patient development cohort, the median age was 54 years (range: 32–76 years), and the median body mass index was 26.2 kg/m² (range: 19.1–34.8 kg/m²). A majority of patients were diagnosed with T2 disease (62.5%) and node-positive disease (61.7%). Patient demographic and clinical characteristics are provided in Table 1.
Computed tomography simulation and deep inspiration breath-hold protocol
All patients underwent FB CT (2.5-mm slices, Revolution EVO, GE Healthcare) in a supine position, using standard breast board immobilization. The images were imported into the Eclipse Treatment Planning System version 16.1 (Varian Medical Systems) for planning purposes. For the DIBH validation cohort, both FB and DIBH scans utilized respiratory gating (RGSC system, Varian) with an infrared reflective marker on the anterior abdomen. The DIBH level was set at 80% of the maximum inspiration amplitude, with reproducibility criteria of ±3 mm over 3–5 consecutive breath-holds. Both scans were acquired in the same session with identical positioning and no patient movement between scans.
Anatomical measurements
Target volumes and organs-at-risk were delineated according to the Radiation Therapy Oncology Group (RTOG) breast atlas guidelines.[15] Two anatomical predictors were systematically measured on all FB CT scans, with 20 patients independently evaluated by two physicists to assess reproducibility using the intraclass correlation coefficient (ICC) [Supplementary Table 1]. Maximum heart distance (MHD) quantifies the maximum cardiac protrusion toward the treatment field and was recorded as the greatest perpendicular distance from any cardiac contour point to the tangent line connecting the medial and lateral posterior PTV edges on axial slices demonstrating maximum cardiac extent [Figure 2a]. Heart-to-PTV distance (HPD) was defined as the minimum distance between the posterior heart contour and the anterior PTV contour in the coronal plane, measured perpendicular to the chest wall [Figure 2b]. Interobserver reliability was quantified using two-way random effects ICCs.[16]
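The MHD definition above is a point-to-line distance problem on an axial slice. The following is a minimal sketch of that geometry, assuming 2D contour coordinates in centimetres; the function names and the `field_side_point` argument (any point known to lie on the treatment-field side of the tangent line) are illustrative assumptions and are not taken from the study's software.

```python
import math


def signed_distance(point, line_a, line_b):
    """Signed perpendicular distance from `point` to the line through line_a and line_b."""
    dx, dy = line_b[0] - line_a[0], line_b[1] - line_a[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("line_a and line_b must be distinct points")
    # The 2D cross product divided by the line length gives the signed distance.
    return (dx * (point[1] - line_a[1]) - dy * (point[0] - line_a[0])) / length


def max_heart_distance(heart_contour, ptv_medial_edge, ptv_lateral_edge, field_side_point):
    """Largest protrusion of the heart contour past the tangent line, toward the field.

    Assumption: only contour points on the same side as `field_side_point`
    count as protrusion; the result is 0.0 if the heart does not cross the line.
    """
    side = math.copysign(1.0, signed_distance(field_side_point, ptv_medial_edge, ptv_lateral_edge))
    protrusions = (
        side * signed_distance(p, ptv_medial_edge, ptv_lateral_edge) for p in heart_contour
    )
    return max(0.0, max(protrusions))


# Toy coordinates (cm), invented purely to exercise the function.
heart = [(3.0, 2.5), (4.0, 1.0), (5.0, -3.0)]
mhd_cm = max_heart_distance(heart, (0.0, 0.0), (10.0, 0.0), field_side_point=(5.0, 5.0))
```

Using a signed distance keeps contour points behind the tangent line from inflating the result, which a plain absolute distance would not.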
Treatment planning
Three different partial-arc VMAT plans were generated on FB scans for each patient:
VMAT-2P: Two tangential arcs (295°–300° to 160°–165°, clockwise and counter-clockwise) with collimator rotations of ±30°, representing a basic chest wall approach without jaw splitting;
VMAT-4P: Four arcs including two opposing tangential arcs plus anterior (300°–41°) and posterior (165°–80°) arcs with collimator angles of ±30°, 35°, and 345°;
VMAT-5P: Five arcs (310°–41°, 81°–160°, 331°–160°, 160°–81°, and 41°–310°) with collimator angles of 17°, 343°, 80°, 357°, and 3°, incorporating jaw splitting.
For the DIBH validation cohort, VMAT-2P and VMAT-4P plans were developed based on both FB and DIBH scans; VMAT-5P was not included in the DIBH validation due to limited institutional experience with 5-arc DIBH treatments at the time of validation cohort treatment. The treatment planning utilized 6-MV photon beams (TrueBeam SVC linear accelerator) to deliver 40.05 Gy in 15 fractions (2.67 Gy per fraction) to the PTV. Inverse planning was executed using Photon Optimizer (version 16.1), and final dose calculations were carried out with the Anisotropic Analytical Algorithm (version 16.1) on a 2.5-mm grid.
The planning objectives for target coverage were PTV V95% ≥95% and V107% ≤2% to limit hotspots, consistent with ICRU Report 83 recommendations. The constraints for organs at risk, based on QUANTEC recommendations[17] and institutional protocols, were: mean heart dose ≤5 Gy and heart V5Gy <20%; ipsilateral lung mean dose ≤15 Gy, lung V5Gy ≤65%, and V20Gy ≤25%; and left anterior descending artery (LAD) mean dose ≤10 Gy. The 5 Gy mean heart dose constraint, based on QUANTEC guidance and Darby’s linear dose-response relationship,[4] served as both a planning goal and a classification threshold.
Prediction framework and dataset preparation
We constructed a dual-component prediction framework that incorporates MHD and HPD from FB scans: (1) continuous heart dose prediction via Elastic Net regression and (2) binary DIBH recommendations through logistic regression.
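For reference, the Elastic Net estimator as parameterized in scikit-learn minimizes the objective below. The notation is introduced here and is not taken from the paper: X holds the standardized MHD and HPD values for n patients, y the planned mean heart dose, w the two regression weights (intercept omitted for brevity), α the overall penalty strength, and ρ the L1 mixing ratio. The scikit-learn defaults are α = 1.0 and ρ = 0.5; whether those exact values were used in the study is an assumption.

```latex
\hat{w} \;=\; \arg\min_{w}\;
  \frac{1}{2n}\,\lVert y - Xw \rVert_2^2
  \;+\; \alpha\,\rho\,\lVert w \rVert_1
  \;+\; \frac{\alpha\,(1-\rho)}{2}\,\lVert w \rVert_2^2
```

Setting ρ = 1 recovers the Lasso penalty and ρ = 0 a Ridge-type penalty, which is why Elastic Net sits between the two in the algorithm comparison.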
Dataset and preprocessing
The dataset of 120 patients was split into training (n = 96) and testing (n = 24) sets in an 80:20 ratio using stratified random sampling to maintain class balance. Features were standardized using z-score normalization, with the mean and standard deviation estimated on the training set only and then applied to the test set to prevent data leakage.
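The split and scaling steps described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the study's code: the feature values and the evenly balanced labels are synthetic, and the helper names are hypothetical. In scikit-learn the same steps would typically be done with `train_test_split(..., stratify=labels)` and `StandardScaler`.

```python
import random
import statistics


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping class proportions equal in both sets."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return sorted(train_idx), sorted(test_idx)


def fit_zscore(rows):
    """Estimate per-column mean and population SD from the training rows only."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    sds = [statistics.pstdev(c) for c in columns]
    return means, sds


def apply_zscore(rows, means, sds):
    """Scale rows with previously fitted statistics (no refitting on test data)."""
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


# Synthetic stand-in data: 120 patients, columns are [MHD, HPD] in cm.
rng = random.Random(42)
features = [[rng.uniform(1.0, 3.9), rng.uniform(0.2, 6.0)] for _ in range(120)]
labels = [i % 2 for i in range(120)]  # 0: <=5 Gy, 1: >5 Gy (balanced by construction)

train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=1)
means, sds = fit_zscore([features[i] for i in train_idx])
train_scaled = apply_zscore([features[i] for i in train_idx], means, sds)
test_scaled = apply_zscore([features[i] for i in test_idx], means, sds)
```

Fitting the scaler on the 96 training rows and reusing its statistics for the 24 test rows is what keeps information about the test set out of the model.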
Deep inspiration breath-hold classification threshold
We established a mean heart dose threshold of 5 Gy for DIBH classification (Class 0: ≤5 Gy, no DIBH; Class 1: >5 Gy, DIBH recommended) based on: (1) QUANTEC cardiac dose constraints,[18] (2) Darby’s linear dose-response relationship, (3) the institutional median heart dose to support balanced training classes, and (4) resource optimization that prioritizes patients expected to experience reductions exceeding 1–2 Gy. This threshold is consistent with established DIBH selection criteria (3–5 Gy).[19]
Algorithm selection
We systematically evaluated six regression algorithms (Linear, Ridge, Lasso, Elastic Net, Support Vector Regression, and Random Forest) and eight classification algorithms (Decision Tree, AdaBoost, Extra Trees, Random Forest, Gradient Boosting, K-Nearest Neighbors, Support Vector Classifier, and Logistic Regression) through stratified 5-fold cross-validation on the training dataset, ensuring that class proportions were preserved across the folds. Default hyperparameters were used because preliminary testing showed minimal performance gains (<2%) from hyperparameter tuning, and defaults preserved model simplicity and generalizability.
Software implementation
The models were developed in Python 3.9, utilizing Scikit-learn version 1.0.[20] A clinical decision-support interface was created with the Streamlit framework [Figure 3], enabling real-time predictions in under 1 s. The interface consists of two modules: (a) continuous dose prediction, which displays mean heart dose estimates for all three VMAT techniques simultaneously, and (b) binary classification, which provides DIBH recommendations (required or not required) according to the 5 Gy threshold, with confidence probabilities.
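What the two modules compute at prediction time can be sketched as follows. Both fitted models are linear in the standardized predictors, so inference reduces to a weighted sum, passed through a sigmoid for the classifier. The class and function names are hypothetical, and the coefficient values in the example are arbitrary placeholders, not the fitted values from the study.

```python
import math
from dataclasses import dataclass


@dataclass
class LinearCoefficients:
    """Intercept and weights for standardized MHD and HPD (placeholder values only)."""
    intercept: float
    mhd: float
    hpd: float

    def score(self, z_mhd: float, z_hpd: float) -> float:
        return self.intercept + self.mhd * z_mhd + self.hpd * z_hpd


def predict_mean_heart_dose(coef: LinearCoefficients, z_mhd: float, z_hpd: float) -> float:
    """Module (a): continuous mean heart dose estimate in Gy."""
    return coef.score(z_mhd, z_hpd)


def recommend_dibh(coef: LinearCoefficients, z_mhd: float, z_hpd: float, cutoff: float = 0.5):
    """Module (b): probability of the '>5 Gy' class and the resulting recommendation."""
    probability = 1.0 / (1.0 + math.exp(-coef.score(z_mhd, z_hpd)))
    label = "DIBH required" if probability >= cutoff else "DIBH not required"
    return label, probability


# Arbitrary placeholder coefficients; signs follow the reported correlations
# (dose rises with MHD and falls with HPD).
regression = LinearCoefficients(intercept=5.0, mhd=1.0, hpd=-0.5)
classifier = LinearCoefficients(intercept=0.0, mhd=2.0, hpd=-1.0)

dose_gy = predict_mean_heart_dose(regression, z_mhd=1.0, z_hpd=-1.0)
label, probability = recommend_dibh(classifier, z_mhd=1.0, z_hpd=-1.0)
```

In the described interface this pair of calls would be repeated once per VMAT technique, each with its own coefficient set.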
Statistical analysis
Pearson correlation coefficients were used to quantify the linear relationships between (1) anatomical predictors (MHD and HPD) and the mean heart dose across the entire cohort, and (2) FB and DIBH anatomical and dosimetric parameters within the validation cohort. Correlation strength was assessed using standard thresholds: |r| <0.3 (weak), 0.3–0.7 (moderate), and >0.7 (strong). Statistical significance was set at P < 0.05, with a Bonferroni correction applied to the correlations in the primary validation cohort (4 comparisons, adjusted α = 0.0125). Normality was assessed using the Shapiro–Wilk test for all continuous variables within the DIBH validation cohort; each parameter showed a normal distribution (P > 0.05), which supports the use of parametric paired t-tests to compare FB and DIBH measurements. Continuous variables are presented as median (range or interquartile range [IQR]), while categorical variables are expressed as frequency (percentage). All tests were two-tailed. Sample size calculations using G*Power 3.1 suggested that n = 25 yields >80% statistical power to detect correlations of r ≥ 0.50 at α = 0.05 (two-tailed).[20]
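The correlation rules above translate directly into a few lines of code. The sketch below, with hypothetical function names, computes Pearson's r, applies the stated strength thresholds, and derives the Bonferroni-adjusted significance level; the handling of values exactly at 0.3 or 0.7 is an assumption, since the text does not specify the boundaries.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r):
    """Label |r| using the thresholds in the text (boundary handling assumed)."""
    magnitude = abs(r)
    if magnitude < 0.3:
        return "weak"
    if magnitude <= 0.7:
        return "moderate"
    return "strong"


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


adjusted_alpha = bonferroni_alpha(0.05, 4)

# The abstract reports r = 0.545 with P = 0.005 for VMAT-4P.
vmat4p_strength = correlation_strength(0.545)
vmat4p_significant = 0.005 < adjusted_alpha
```

With four comparisons, the reported VMAT-4P correlation (P = 0.005) stays below the adjusted level of 0.0125.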
RESULTS
Anatomical parameters and dosimetric measurements
Median anatomical measurements showed considerable variability among patients: MHD 2.5 cm (IQR: 2.1–2.9 cm; range: 1.0–3.9 cm) and HPD 2.5 cm (IQR: 1.8–3.2 cm; range: 0.2–6.0 cm). The median heart volume was 477 cm³ (IQR: 420–530 cm³). Interobserver reliability was excellent for both distance parameters (ICC: 0.988 for MHD, 95% confidence interval [CI]: 0.970–0.995; ICC: 0.991 for HPD, 95% CI: 0.978–0.997), with mean measurement differences of 1.69% and 3.60%, respectively [Supplementary Table 1]. ICC values above 0.98 are well beyond the 0.75 threshold for clinical acceptability, confirming that the measurements are robust to interobserver variability.
The median mean heart dose on FB scans was 5.5 Gy (IQR: 4.6–6.4 Gy) for VMAT-2P, 5.1 Gy (IQR: 4.5–6.0 Gy) for VMAT-4P, and 5.1 Gy (IQR: 4.1–5.9 Gy) for VMAT-5P, as shown in Supplementary Table 1.
Correlation between anatomical predictors and cardiac dose
Within the full cohort (n = 120), both anatomical predictors correlated significantly with cardiac dose across all VMAT techniques [Table 2]. MHD showed moderate positive correlations (r = 0.575–0.607, all P < 0.001), whereas HPD showed moderate inverse correlations (r = −0.339 to −0.388, all P < 0.001). These consistent findings support MHD and HPD as predictors of cardiac dose.
Five-fold cross-validation performance
Stratified 5-fold cross-validation on the training set (n = 96) showed consistent performance across all folds [Supplementary Tables 2 and 3]. Elastic Net regression was stable, with mean absolute errors (MAE) of 0.98 ± 0.11 Gy (VMAT-2P), 0.93 ± 0.09 Gy (VMAT-4P), and 0.82 ± 0.08 Gy (VMAT-5P). Root mean square errors (RMSE) were consistently below 1.3 Gy across all techniques: 1.21 ± 0.05 Gy (VMAT-2P), 1.14 ± 0.04 Gy (VMAT-4P), and 1.01 ± 0.05 Gy (VMAT-5P). The coefficients of variation were low (CV: 9.7%–11.2% for MAE; 3.5%–4.9% for RMSE), indicating stable fold-to-fold performance, with narrow fold-by-fold MAE ranges of 0.89–1.04 Gy (VMAT-2P), 0.85–0.97 Gy (VMAT-4P), and 0.74–0.88 Gy (VMAT-5P). Logistic regression achieved strong discriminative performance at the 5 Gy mean heart dose threshold. The mean AUC values were 0.91 ± 0.05 (VMAT-2P), 0.88 ± 0.06 (VMAT-4P), and 0.85 ± 0.07 (VMAT-5P), with classification accuracies of 0.86 ± 0.04, 0.82 ± 0.04, and 0.77 ± 0.04, respectively. Sensitivity (0.83–0.88) and specificity (0.68–0.85) were reasonably balanced across folds. The low AUC coefficient of variation (CV < 9%) indicated consistent performance across all VMAT techniques.
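The coefficients of variation quoted for the cross-validation folds follow directly from the reported mean ± SD values (CV = SD / mean). The sketch below recomputes them for MAE and AUC from the rounded figures in the text; the helper name `coefficient_of_variation` and the dictionary layout are illustrative assumptions, and results derived from rounded inputs may differ in the last digit from values computed on unrounded fold data.

```python
def coefficient_of_variation(mean: float, sd: float) -> float:
    """Return the coefficient of variation (SD / mean) as a percentage."""
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * sd / mean


# Fold-wise (mean, SD) pairs as reported in the text (MAE in Gy, AUC unitless).
cv_summary = {
    "MAE": {
        "VMAT-2P": (0.98, 0.11),
        "VMAT-4P": (0.93, 0.09),
        "VMAT-5P": (0.82, 0.08),
    },
    "AUC": {
        "VMAT-2P": (0.91, 0.05),
        "VMAT-4P": (0.88, 0.06),
        "VMAT-5P": (0.85, 0.07),
    },
}

# Recompute the CV for every metric and technique.
cv_percent = {
    metric: {
        technique: coefficient_of_variation(mean, sd)
        for technique, (mean, sd) in by_technique.items()
    }
    for metric, by_technique in cv_summary.items()
}

for metric, by_technique in cv_percent.items():
    for technique, cv in by_technique.items():
        print(f"{metric} {technique}: CV = {cv:.1f}%")
```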
Algorithm selection
Elastic Net regression was identified as the optimal choice for continuous dose prediction based on: (1) the lowest mean absolute error across the various techniques (0.77–0.98 Gy), (2) its superior interpretability through linear coefficients, and (3) its effective management of potential multicollinearity. For DIBH classification, logistic regression was selected based on: (1) the highest AUC values (0.83–0.94), (2) an optimal balance of sensitivity (83%–92%) and specificity (83%–92%), and (3) its transparent decision-making process facilitated by interpretable coefficients. Complete performance metrics for all algorithms are provided in Supplementary Table 4.
Regression model performance
Among six regression algorithms, Elastic Net demonstrated the highest predictive accuracy for estimating continuous cardiac doses while ensuring model interpretability. In the training cohort (n = 96), the mean absolute errors varied from 0.77 to 0.98 Gy across VMAT techniques, with root mean squared errors remaining below 1.2 Gy [Table 3]. In the independent test cohort (n = 24), the Elastic Net model recorded mean absolute errors of 1.02 Gy (VMAT-2P), 0.99 Gy (VMAT-4P), and 0.81 Gy (VMAT-5P), with the respective root mean squared errors being 1.27 Gy, 1.20 Gy, and 1.03 Gy. Strong correlations between the observed and predicted doses were noted across all techniques (Pearson r = 0.798–0.825, all P < 0.001), indicating robust generalization with minimal overfitting.
Classification model performance
Among the eight classification algorithms evaluated, logistic regression showed the best performance. The model exhibited strong discriminative power, with AUCs of 0.94 (95% CI: 0.82–1.00), 0.85 (95% CI: 0.67–1.00), and 0.83 (95% CI: 0.63–0.99) for VMAT-2P, VMAT-4P, and VMAT-5P, respectively. In the independent test set (n = 24), the classification accuracy was 87.5% for both VMAT-2P and VMAT-4P [Table 4 and Figure 4]. The model achieved high sensitivity (92% and 83%, respectively) in identifying DIBH candidates, with specificity ranging from 83% to 92% and positive predictive values of 85% to 91%. In contrast, VMAT-5P showed markedly lower accuracy (58.3%) despite a reasonable AUC (0.83) and high sensitivity (83.3%), owing to notably reduced specificity (33.3%). An examination of false positives revealed that 4 of 6 patients incorrectly classified as needing DIBH had heart doses between 4.5 and 5.5 Gy, clustered near the 5 Gy threshold, indicating potential threshold instability for this complex technique.
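For readers who want to see how accuracy, sensitivity, specificity, and positive predictive value relate to one another, the sketch below derives them from confusion-matrix counts. The source does not report the confusion matrices, so the counts used here are hypothetical: they were chosen only because, for a 24-patient test set, they reproduce the rounded percentages quoted for VMAT-2P and VMAT-4P. The class name `ConfusionCounts` is likewise an illustrative assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary 'DIBH needed' classifier (positive = mean heart dose > 5 Gy)."""
    tp: int  # correctly flagged as needing DIBH
    fn: int  # DIBH candidates that were missed
    tn: int  # correctly classified as not needing DIBH
    fp: int  # flagged for DIBH although dose was at or below threshold

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fn + self.tn + self.fp)

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)


# HYPOTHETICAL counts (not reported in the source), consistent with the
# rounded percentages quoted in the text for n = 24.
examples = {
    "VMAT-2P": ConfusionCounts(tp=11, fn=1, tn=10, fp=2),
    "VMAT-4P": ConfusionCounts(tp=10, fn=2, tn=11, fp=1),
}

for technique, counts in examples.items():
    print(
        f"{technique}: accuracy={counts.accuracy:.1%}, "
        f"sensitivity={counts.sensitivity:.1%}, "
        f"specificity={counts.specificity:.1%}, PPV={counts.ppv:.1%}"
    )
```

Because sensitivity and specificity depend on the chosen probability cut-off while AUC does not, a model can keep a reasonable AUC and still show low accuracy at one fixed threshold, which is the pattern described for VMAT-5P.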
Deep inspiration breath-hold validation cohort: Clinical outcomes
Data distribution and statistical assumptions: All continuous variables in the DIBH validation cohort (n = 25) were found to follow a normal distribution based on Shapiro–Wilk tests (all P > 0.05): MHD (W = 0.965, P = 0.518), HPD (W = 0.953, P = 0.293), VMAT-2P heart dose (W = 0.961, P = 0.425), and VMAT-4P heart dose (W = 0.968, P = 0.584). These results affirmed the suitability of parametric paired t-tests for comparing measurements obtained during FB and DIBH.
Anatomical changes with deep inspiration breath-hold
DIBH led to significant anatomical changes [Table 5]. The maximum heart distance (MHD) decreased by 27.9% (from 2.44 ± 0.62 cm to 1.76 ± 0.71 cm, t(24) = 6.12, P < 0.001), indicating a reduction in cardiac protrusion toward the tangent line during deep inspiration. The heart-to-PTV distance (HPD) increased by 66.5% (from 2.51 ± 1.19 cm to 4.18 ± 1.61 cm, t(24) = −7.45, P < 0.001), indicating a favorable posterior displacement of the heart away from the treatment volume.
Dosimetric impact of deep inspiration breath-hold
DIBH achieved substantial dose reductions for both validated techniques [Table 5]:
VMAT-2P: 34.1% reduction (5.45 ± 1.88 Gy to 3.59 ± 1.38 Gy, mean difference 1.86 Gy, 95% CI: 1.42–2.30, Cohen’s d = 1.16, t(24) = 8.65, P < 0.001).
VMAT-4P: 34.7% reduction (4.96 ± 1.69 Gy to 3.24 ± 1.12 Gy, mean difference 1.72 Gy, 95% CI: 1.31–2.13, Cohen’s d = 1.19, t(24) = 8.42, P < 0.001).
The proportion of high-risk patients (>5 Gy) decreased substantially: from 52.0% (13/25) to 16.0% (4/25) for VMAT-2P and from 60.0% (15/25) to 12.0% (3/25) for VMAT-4P, a relative reduction of 69%–80%.
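The percentage reductions above can be rechecked from the reported means and patient counts. The sketch below uses a single hypothetical helper, `relative_reduction`, and only numbers quoted in the text; it works from group means, so it does not reproduce the paired t-statistics or confidence intervals, which require per-patient data.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage reduction from `before` to `after`."""
    if before == 0:
        raise ValueError("relative reduction is undefined when `before` is zero")
    return 100.0 * (before - after) / before


# Mean heart dose (Gy), FB versus DIBH, as reported in the text.
dose_reduction = {
    "VMAT-2P": relative_reduction(5.45, 3.59),
    "VMAT-4P": relative_reduction(4.96, 3.24),
}

# Number of high-risk patients (> 5 Gy) out of 25, FB versus DIBH.
high_risk_reduction = {
    "VMAT-2P": relative_reduction(13, 4),
    "VMAT-4P": relative_reduction(15, 3),
}

for technique in dose_reduction:
    print(
        f"{technique}: mean heart dose reduced by {dose_reduction[technique]:.1f}%, "
        f"high-risk patients reduced by {high_risk_reduction[technique]:.0f}%"
    )
```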
Validation of prediction accuracy
Correlations between FB and DIBH measurements indicate that FB anatomical parameters carry predictive information about DIBH outcomes [Table 5 and Figure 5]. The anatomical predictors showed strong correlations: MHD (r = 0.720, 95% CI: 0.456–0.868, unadjusted P < 0.001) and HPD (r = 0.720, 95% CI: 0.455–0.868, unadjusted P < 0.001), both below the Bonferroni-adjusted α of 0.0125. The mean heart dose showed moderate correlations, higher for VMAT-2P (r = 0.667, 95% CI: 0.376–0.840, unadjusted P < 0.001) than for VMAT-4P (r = 0.545, 95% CI: 0.197–0.774, unadjusted P = 0.005, Bonferroni-adjusted P = 0.020). The VMAT-4P correlation therefore remains significant after correction (unadjusted P = 0.005 vs. α = 0.0125, or equivalently adjusted P = 0.020 vs. α = 0.05), but by the narrowest margin of the four comparisons, suggesting a need for greater caution in interpreting individual patient predictions for this technique. The weaker correlation and broader CI for VMAT-4P (lower CI bound = 0.197) in comparison with VMAT-2P indicate greater prediction uncertainty for the 4-arc technique.
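The Bonferroni arithmetic and the confidence intervals for r can be illustrated with a short sketch. Two points are assumptions rather than statements from the source: the interval is computed with the Fisher z-transformation, which the paper does not name as its method, and the function names are hypothetical. A Bonferroni-adjusted P (raw P multiplied by the number of comparisons) is compared with the original α, whereas a raw P is compared with α divided by the number of comparisons; the two checks are equivalent and should not be mixed.

```python
import math
from typing import Tuple


def bonferroni_adjust(p_value: float, n_comparisons: int) -> float:
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_comparisons)


def fisher_z_ci(r: float, n: int, z_crit: float = 1.959964) -> Tuple[float, float]:
    """Approximate 95% CI for a Pearson r via the Fisher z-transformation.

    Assumption: this is one common method; the source does not state which
    method produced its reported intervals.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


ALPHA = 0.05
N_COMPARISONS = 4
N_PATIENTS = 25

# VMAT-4P: r and unadjusted P as reported in the text.
r_4p, p_raw_4p = 0.545, 0.005
p_adj_4p = bonferroni_adjust(p_raw_4p, N_COMPARISONS)

# The two equivalent significance checks.
significant_raw_vs_adjusted_alpha = p_raw_4p < ALPHA / N_COMPARISONS
significant_adjusted_p_vs_alpha = p_adj_4p < ALPHA

ci_low, ci_high = fisher_z_ci(r_4p, N_PATIENTS)
print(f"adjusted P = {p_adj_4p:.3f}")
print(f"raw P < alpha/m: {significant_raw_vs_adjusted_alpha}")
print(f"adjusted P < alpha: {significant_adjusted_p_vs_alpha}")
print(f"approximate 95% CI for r = {r_4p}: {ci_low:.3f} to {ci_high:.3f}")
```

Under these assumptions the interval lands close to the reported one; small differences are expected from rounding of r or from a different interval method.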
Clinical decision support interface
The clinical decision support interface [Figure 3] provides real-time predictions. Radiation oncologists and medical physicists enter MHD and HPD measurements and receive predictions in <1 s. The interface comprises two integrated modules: (a) continuous dose prediction displaying mean heart dose estimates for all three VMAT techniques simultaneously with CIs, and (b) binary classification providing DIBH recommendations (required or not required) based on the 5 Gy threshold with associated probability scores. This rapid pre-planning evaluation supports informed DIBH patient selection without the need for time-intensive comparative planning.
DISCUSSION
This study developed and validated a machine learning framework for predicting mean heart dose and guiding DIBH selection in left-sided postmastectomy radiotherapy, using straightforward and readily available anatomical measurements. The dual-component system exhibited clinically acceptable performance, validated through paired FB-DIBH scans in 25 independent patients (distinct from the 120-patient model development cohort), accurately predicting the dosimetric advantages of DIBH (r = 0.545–0.720 between FB-predicted and DIBH-achieved doses, all P ≤ 0.020 after Bonferroni correction), with DIBH resulting in a 34% reduction in mean heart dose and a 69%–80% reduction in the proportion of high-risk patients.
Anatomical predictors and clinical efficiency
Our distance measurements, MHD and HPD, correlated significantly with cardiac dose across all VMAT techniques (r = 0.58–0.61 for MHD; r = −0.34 to −0.39 for HPD; all P < 0.001), consistent with previous studies. Ferdinand et al.[21] noted that heart volume in the field and maximum heart distance independently predicted cardiac sparing, whereas Mendez et al.[12] highlighted that distance-based metrics provided superior predictive accuracy (AUC 0.72–0.89) when compared to traditional measurements.
The DIBH validation cohort confirmed significant changes in both parameters (27.9% reduction in MHD, 66.5% increase in HPD, both P < 0.001), which were correlated with reductions in cardiac dose (r = 0.545–0.720, all P ≤ 0.020 after Bonferroni correction). The mean heart dose reduction we observed, between 1.72 and 1.86 Gy (34% reduction), aligns with the findings of previous DIBH studies by Vikström et al.,[22] who noted a decrease from 3.7 to 1.7 Gy. DIBH-induced cardiac dose reductions ranging from approximately 30% to over 60% have been documented in the literature,[5,6,24,32] corroborating our results, with greater benefits observed in patients undergoing regional nodal irradiation, particularly involving internal mammary chain nodes.
Crucially, strong correlations between FB and DIBH measurements (r = 0.720 for both MHD and HPD) indicate that FB anatomical parameters can predict DIBH dosimetric benefits, which is critical for clinical workflow optimization. Our findings (AUC 0.83–0.94) align with or exceed the AUC values reported by Mendez et al.,[12] while using simpler measurements obtainable in 2–3 min with standard tools versus 10–15 min for volumetric analysis requiring specialized software. The excellent interobserver reliability (ICC > 0.98 for both predictors) confirms measurement reproducibility, an important factor often missing from previous studies.
Model performance and risk translation
Our Elastic Net regression achieved mean absolute errors of 0.77–1.02 Gy across techniques, representing clinically acceptable accuracy given typical DIBH dose reductions of 1.5–2.0 Gy documented in multiple studies.[5,6,22,23,24,31,32] Recent machine learning investigations support this approach: Kamizaki et al.[25] reported deep neural network RMSE of 77.4 cGy for mean heart dose prediction. Our Elastic Net selection achieves a balance between prediction accuracy and interpretability, addressing the “black box” concern that limits clinical adoption.
Our observed dose reductions of 1.72–1.86 Gy translate to an estimated 13%–14% relative reduction in cardiac event risk, based on Darby et al.’s[4] landmark finding of a 7.4% relative increase per gray (95% CI: 2.9%–14.5%, P < 0.001), with no apparent threshold below which radiation is safe. For patients with a 10% baseline 20-year cardiac risk, this represents an absolute reduction of approximately 1.3 percentage points, though this varies with individual comorbidities. The 69%–80% reduction in the proportion of high-risk patients (>5 Gy) provides meaningful clinical impact, supported by Zhang et al.’s[26] demonstration that DIBH significantly reduced subclinical acute cardiac injury markers.
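The risk translation above is simple linear arithmetic, sketched below. It assumes, as the text does, that the 7.4% per gray figure can be multiplied directly by the dose reduction and then applied to a baseline risk; this is a simplification for illustration, not a validated risk model, and the function names are hypothetical.

```python
EXCESS_RELATIVE_RISK_PER_GY = 0.074  # 7.4% per gray, as cited in the text


def relative_risk_change(dose_reduction_gy: float) -> float:
    """Estimated relative risk reduction for a mean heart dose reduction (linear assumption)."""
    return EXCESS_RELATIVE_RISK_PER_GY * dose_reduction_gy


def absolute_risk_change(baseline_risk: float, relative_change: float) -> float:
    """Absolute risk reduction given a baseline risk and a relative reduction."""
    return baseline_risk * relative_change


BASELINE_20_YEAR_RISK = 0.10  # illustrative baseline used in the text

risk_estimates = {}
for technique, reduction_gy in {"VMAT-4P": 1.72, "VMAT-2P": 1.86}.items():
    relative = relative_risk_change(reduction_gy)
    absolute = absolute_risk_change(BASELINE_20_YEAR_RISK, relative)
    risk_estimates[technique] = (relative, absolute)
    print(
        f"{technique}: {reduction_gy:.2f} Gy -> relative reduction {relative:.1%}, "
        f"absolute reduction {absolute * 100:.2f} percentage points"
    )
```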
Our classification performance (87.5% accuracy, 83%–92% sensitivity and specificity) compares favorably with Koide et al.’s[27] reported AUC of 0.71–0.78 for cardiotoxicity prediction and Talebi et al.’s[28] machine learning models combining radiomics and dosimetric features. High sensitivity ensures few DIBH candidates are missed, while strong specificity minimizes unnecessary planning.
Volumetric modulated arc therapy technique considerations
The moderate correlation of VMAT-4P (r = 0.545) relative to VMAT-2P (r = 0.667) reflects a certain level of technical complexity. Multiarc techniques, which involve additional anterior-posterior arcs and complex collimator angles, result in heterogeneous dose distributions. This heterogeneity may compromise the linear relationships between simple distance metrics and the doses achieved. This is consistent with the findings of Popescu et al.[29] and Wang et al.,[30] who showed that VMAT achieves superior conformity and cardiac sparing through an increased number of optimization degrees of freedom. In spite of a weaker correlation, the clinical applicability of VMAT-4P predictions remains intact, as evidenced by the sustained classification performance (87.5% accuracy and 83% sensitivity) and significant dose reductions observed with DIBH (1.72 Gy, 34.7%, P < 0.001).
In testing involving 24 independent patients treated with VMAT-5P, our classification model revealed a reduced accuracy (58.3%), although it maintained a reasonable level of discriminatory power (AUC = 0.826). Notably, 67% of false positives were found within ±0.5 Gy of the 5 Gy threshold, indicating instability in the threshold for complex multi-arc techniques. We suggest utilizing our model for high-sensitivity initial screening, followed by selective verification planning for borderline cases (4.5–5.5 Gy) when applying advanced multiarc techniques.
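The suggested two-step workflow, screening followed by verification planning for borderline cases, could be sketched as a simple rule on the predicted mean heart dose. This is an illustrative assumption rather than the authors' implementation: the published tool classifies with logistic regression probabilities, whereas this sketch applies the 5 Gy threshold and the 4.5–5.5 Gy borderline band from the text directly to a predicted dose. The names `triage` and `Recommendation` are hypothetical.

```python
from enum import Enum


class Recommendation(Enum):
    FREE_BREATHING = "DIBH not required"
    VERIFY = "borderline: perform verification planning"
    DIBH = "DIBH recommended"


THRESHOLD_GY = 5.0              # mean heart dose threshold used in the study
BORDERLINE_HALF_WIDTH_GY = 0.5  # borderline band of 4.5-5.5 Gy suggested in the text


def triage(predicted_heart_dose_gy: float) -> Recommendation:
    """Map a predicted mean heart dose (Gy) to a screening recommendation."""
    if predicted_heart_dose_gy < 0:
        raise ValueError("predicted dose cannot be negative")
    if abs(predicted_heart_dose_gy - THRESHOLD_GY) <= BORDERLINE_HALF_WIDTH_GY:
        return Recommendation.VERIFY
    if predicted_heart_dose_gy > THRESHOLD_GY:
        return Recommendation.DIBH
    return Recommendation.FREE_BREATHING


# Example predicted doses (hypothetical values for illustration only).
for dose in (3.2, 4.8, 5.5, 6.4):
    print(f"{dose:.1f} Gy -> {triage(dose).value}")
```

Routing borderline predictions to verification planning trades some planning time for fewer misclassifications near the threshold, which is where the text locates most VMAT-5P false positives.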
Clinical implementation and future directions
A key strength of this study is the simultaneous validation of three VMAT techniques, which addresses variability among institutions. Our real-time interface [Figure 3], with processing times of <1 s, significantly improves upon dual-planning methods that typically require 30–45 min. However, generalizability to different treatment planning systems, dose calculation algorithms, and institutional protocols requires prospective external validation.
Prospective multi-institutional validation is critical to establish clinical utility across diverse patient populations and institutional protocols. Expanded DIBH validation cohorts should correlate predicted doses with those achieved and assess long-term cardiovascular outcomes to validate clinical relevance and refine treatment thresholds. Deep learning approaches for capturing nonlinear relationships and integrating doses to cardiac substructures, particularly the LAD artery, warrant further investigation. Comparative effectiveness studies should evaluate automated versus conventional DIBH selection workflows, quantifying impacts on treatment efficiency, patient outcomes, and resource utilization.
Study limitations
This study has important limitations. Our data are derived from a single institution that employs standardized protocols. The validation cohort of 25 patients demonstrated strong FB–DIBH correlations and clinically significant benefits of DIBH; however, this sample size necessitates validation in larger prospective studies. The retrospective design may introduce selection bias, and the homogeneity of the postmastectomy population limits the generalizability of the results to patients undergoing breast-conserving surgery. Therefore, external validation across diverse populations, treatment methods, and institutional protocols is essential.
This research developed and validated a machine learning framework for the prediction of mean heart dose and the guidance of DIBH selection in left-sided postmastectomy radiotherapy, utilizing straightforward and readily available anatomical measurements. The dual-component system exhibited clinically acceptable performance, validated through paired FB-DIBH scans in 25 independent patients (distinct from the 120-patient model development cohort), accurately predicting the dosimetric advantages of DIBH (r = 0.545–0.720 between FB-predicted and DIBH-achieved doses, all P ≤ 0.020 after Bonferroni correction), with DIBH resulting in a 34% reduction in mean heart dose and a 69%–80% reduction for high-risk patients.
Anatomical predictors and clinical efficiency
Our distance measurements, which include MHD and HPD, demonstrated significant correlations with cardiac dose across all VMAT techniques (r = 0.58–0.61 for MHD; r = –0.34 to − 0.39 for HPD; all P < 0.001), consistent with previous studies. Ferdinand S, Mondal M, Mallik S, et al.[21] noted that heart volume in the field and maximum heart distance independently predicted cardiac sparing, whereas Mendez et
al.[12] highlighted that distance-based metrics provided superior predictive accuracy (AUC 0.72–0.89) when compared to traditional measurements.
The validation cohort for DIBH confirmed significant changes in parameters (27.9% reduction in MHD, 66.5% increase in HPD, both P < 0.001), which were strongly correlated with reductions in cardiac dose (r = 0.545–0.720, all P ≤ 0.020 after Bonferroni correction). The mean heart dose reduction we observed, between 1.72 and 1.86 Gy (34% reduction), aligns with the findings of previous DIBH studies by Vikström et al.,[22] who noted a decrease from 3.7 to 1.7 Gy. DIBH-induced cardiac dose reductions varying widely from approximately 30% to over 60% have been documented in the literature,[562432] corroborating our results, with significantly enhanced benefits observed in patients undergoing regional nodal irradiation, particularly involving internal mammary chain nodes.
Crucially, strong correlations between FB and DIBH measurements (r = 0.720 for both MHD and HPD) validate that FB anatomical parameters reliably predict DIBH dosimetric benefits, critical for clinical workflow optimization. Our findings (AUC 0.83-0.94) align with or exceed these reported values, while using simpler measurements obtainable in <2–3 min using standard tools versus 10–15 min for volumetric analysis requiring specialized software. The excellent inter-observer reliability (ICC >0.98 for both predictors) confirms measurement reproducibility, an important factor often missing from previous studies.
Model performance and risk translation
Our Elastic Net regression achieved mean absolute errors of 0.77–1.02 Gy across techniques, representing clinically acceptable accuracy given typical DIBH dose reductions of 1.5–2.0 Gy documented in multiple studies.[562223243132] Recent machine learning investigations support this approach: Kamizaki et al.[25] reported deep neural network RMSE of 77.4 cGy for mean heart dose prediction. Our Elastic Net selection achieves a balance between prediction accuracy and interpretability, addressing the “black box” concern that limits clinical adoption.
Our observed dose reductions of 1.72–1.86 Gy translate to estimated 13%–14% relative cardiac event risk reduction based on Darby et al.’s[4] landmark finding of 7.4% relative increase per gray (95% CI: 2.9%–14.5%, P < 0.001), with no apparent threshold below which radiation is safe. For patients with 10% baseline 20-year cardiac risk, this represents approximately 1.3% points absolute reduction, though varying with individual comorbidities. The 69%–80% reduction in high-risk patients (>5 Gy) provides meaningful clinical impact, supported by Zhang et al.’s[26] demonstration that DIBH significantly reduced subclinical acute cardiac injury markers.
Our classification performance (87.5% accuracy, 83%–92% sensitivity and specificity) compares favorably with Koide et al.’s[27] reported AUC of 0.71–0.78 for cardiotoxicity prediction and Talebi et al.’s[28] machine learning models combining radiomics and dosimetric features. High sensitivity ensures few DIBH candidates are missed, while strong specificity minimizes unnecessary planning.
Volumetric modulated arc therapy technique considerations
The moderate correlation of VMAT-4P (r = 0.545) relative to VMAT-2P (r = 0.667) reflects a certain level of technical complexity. Multiarc techniques, which involve additional anterior-posterior arcs and complex collimator angles, result in heterogeneous dose distributions. This heterogeneity may compromise the linear relationships between simple distance metrics and the doses achieved. This is consistent with the findings of Popescu et al.[29] and Wang et al.,[30] who showed that VMAT achieves superior conformity and cardiac sparing through an increased number of optimization degrees of freedom. In spite of a weaker correlation, the clinical applicability of VMAT-4P predictions remains intact, as evidenced by the sustained classification performance (87.5% accuracy and 83% sensitivity) and significant dose reductions observed with DIBH (1.72 Gy, 34.7%, P < 0.001).
In testing involving 24 independent patients treated with VMAT-5P, our classification model revealed a reduced accuracy (58.3%), although it maintained a reasonable level of discriminatory power (AUC = 0.826). Notably, 67% of false positives were found within ±0.5 Gy of the 5 Gy threshold, indicating instability in the threshold for complex multi-arc techniques. We suggest utilizing our model for high-sensitivity initial screening, followed by selective verification planning for borderline cases (4.5–5.5 Gy) when applying advanced multiarc techniques.
Clinical implementation and future directions
A key strength of this study is the simultaneous validation of three VMAT techniques, which addresses variability among institutions. Our real-time interface [Figure 3], with processing times of <1 s, significantly improves upon dual-planning methods that typically require 30–45 min. However, generalizability to different treatment planning systems, dose calculation algorithms, and institutional protocols requires prospective external validation.
Prospective multi-institutional validation is critical to establish clinical utility across diverse patient populations and institutional protocols. Expanded DIBH validation cohorts should correlate predicted doses with those achieved and assess long-term cardiovascular outcomes to validate clinical relevance and refine treatment thresholds. Deep learning approaches for capturing nonlinear relationships and integrating doses to cardiac substructures, particularly the LAD artery, warrant further investigation. Comparative effectiveness studies should evaluate automated versus conventional DIBH selection workflows, quantifying impacts on treatment efficiency, patient outcomes, and resource utilization.
Study limitations
It is essential to acknowledge the important limitations of this study. Our data are derived from a single institution that employs standardized protocols. The validation cohort, which consisted of 25 patients, demonstrated strong correlations with FB-DIBH and clinically significant benefits of DIBH; however, this sample size necessitates validation in larger prospective studies. The retrospective design may introduce selection bias, and the homogeneity of the post-mastectomy population limits the generalizability of the results to patients undergoing breast-conserving surgery. Therefore, external validation across diverse populations, treatment methods, and institutional protocols is essential.
CONCLUSIONS
This research presents a validated machine learning framework for predicting mean heart dose and guiding DIBH selection in left-sided post-mastectomy VMAT, based on easily obtainable anatomical measurements. The model exhibited strong predictive accuracy across multiple VMAT techniques, and DIBH was associated with mean heart dose reductions of 1.72–1.86 Gy (34%), with substantial benefits for high-risk patients. Independent testing yielded a classification accuracy of 87.5%, with balanced sensitivity and specificity for standard techniques, although performance was lower for complex multiarc plans.
As a quick pre-planning screening tool, this method offers practical advantages for streamlining DIBH patient selection and enhancing cardiac protection. The reliance on basic distance metrics – maximum heart distance and heart-to-PTV distance – facilitates its implementation across various clinical settings without the requirement for specialized software. While the data from a single institution and the modest sample sizes indicate the need for multi-institutional validation, these findings reinforce the clinical potential of automated, anatomy-based cardiac dose prediction for efficient DIBH screening in contemporary breast radiotherapy.
Data availability
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request, subject to institutional review board approval and data use agreements.
Code availability
The machine learning models (Python code using Scikit-learn) are available from the corresponding author on reasonable request for research purposes.
Ethics statement
This retrospective study was approved by the Institutional Ethics Committee. The requirement for informed consent was waived, given the retrospective nature of the study using de-identified data.
Conflicts of interest
There are no conflicts of interest.
Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.