External evaluation of an open-source deep learning model for prostate cancer detection on bi-parametric MRI.
1/5 보강
[OBJECTIVES] This study aims to evaluate the diagnostic accuracy of an open-source deep learning (DL) model for detecting clinically significant prostate cancer (csPCa) in biparametric MRI (bpMRI).
- 95% CI 0.80-0.92
APA
Johnson PM, Tong A, et al. (2026). External evaluation of an open-source deep learning model for prostate cancer detection on bi-parametric MRI.. European radiology, 36(2), 1084-1092. https://doi.org/10.1007/s00330-025-11865-x
MLA
Johnson PM, et al.. "External evaluation of an open-source deep learning model for prostate cancer detection on bi-parametric MRI.." European radiology, vol. 36, no. 2, 2026, pp. 1084-1092.
PMID
40753327
Abstract
[OBJECTIVES] This study aims to evaluate the diagnostic accuracy of an open-source deep learning (DL) model for detecting clinically significant prostate cancer (csPCa) in biparametric MRI (bpMRI). It also aims to outline the necessary components of the model that facilitate effective sharing and external evaluation of PCa detection models.
[MATERIALS AND METHODS] This retrospective diagnostic accuracy study evaluated a publicly available DL model trained to detect PCa on bpMRI. External validation was performed on bpMRI exams from 151 biologically male patients (mean age, 65 ± 8 years). The model's performance was evaluated using patient-level classification of PCa with both radiologist interpretation and histopathology serving as the ground truth. The model processed bpMRI inputs to generate lesion probability maps. Performance was assessed using the area under the receiver operating characteristic curve (AUC) for PI-RADS ≥ 3, PI-RADS ≥ 4, and csPCa (defined as Gleason ≥ 7) at an exam level.
[RESULTS] The model achieved AUCs of 0.86 (95% CI: 0.80-0.92) and 0.91 (95% CI: 0.85-0.96) for predicting PI-RADS ≥ 3 and ≥ 4 exams, respectively, and 0.78 (95% CI: 0.71-0.86) for csPCa. Sensitivity and specificity for csPCa were 0.87 and 0.53, respectively. Fleiss' kappa for inter-reader agreement was 0.51.
[CONCLUSION] The open-source DL model offers high sensitivity to clinically significant prostate cancer. The study underscores the importance of sharing model code and weights to enable effective external validation and further research.
[KEY POINTS] Question Inter-reader variability hinders the consistent and accurate detection of clinically significant prostate cancer in MRI. Findings An open-source deep learning model demonstrated reproducible diagnostic accuracy, achieving AUCs of 0.86 for PI-RADS ≥ 3 and 0.78 for CsPCa lesions. Clinical relevance The model's high sensitivity for MRI-positive lesions (PI-RADS ≥ 3) may provide support for radiologists. Its open-source deployment facilitates further development and evaluation across diverse clinical settings, maximizing its potential utility.
[MATERIALS AND METHODS] This retrospective diagnostic accuracy study evaluated a publicly available DL model trained to detect PCa on bpMRI. External validation was performed on bpMRI exams from 151 biologically male patients (mean age, 65 ± 8 years). The model's performance was evaluated using patient-level classification of PCa with both radiologist interpretation and histopathology serving as the ground truth. The model processed bpMRI inputs to generate lesion probability maps. Performance was assessed using the area under the receiver operating characteristic curve (AUC) for PI-RADS ≥ 3, PI-RADS ≥ 4, and csPCa (defined as Gleason ≥ 7) at an exam level.
[RESULTS] The model achieved AUCs of 0.86 (95% CI: 0.80-0.92) and 0.91 (95% CI: 0.85-0.96) for predicting PI-RADS ≥ 3 and ≥ 4 exams, respectively, and 0.78 (95% CI: 0.71-0.86) for csPCa. Sensitivity and specificity for csPCa were 0.87 and 0.53, respectively. Fleiss' kappa for inter-reader agreement was 0.51.
[CONCLUSION] The open-source DL model offers high sensitivity to clinically significant prostate cancer. The study underscores the importance of sharing model code and weights to enable effective external validation and further research.
[KEY POINTS] Question Inter-reader variability hinders the consistent and accurate detection of clinically significant prostate cancer in MRI. Findings An open-source deep learning model demonstrated reproducible diagnostic accuracy, achieving AUCs of 0.86 for PI-RADS ≥ 3 and 0.78 for CsPCa lesions. Clinical relevance The model's high sensitivity for MRI-positive lesions (PI-RADS ≥ 3) may provide support for radiologists. Its open-source deployment facilitates further development and evaluation across diverse clinical settings, maximizing its potential utility.
MeSH Terms
Humans; Male; Prostatic Neoplasms; Deep Learning; Aged; Retrospective Studies; Magnetic Resonance Imaging; Image Interpretation, Computer-Assisted; Middle Aged; Reproducibility of Results; Sensitivity and Specificity