Meta learning optimized TabNet for small sample repeat prostate biopsy prediction.
APA
Lou J, Xu J, et al. (2026). Meta learning optimized TabNet for small sample repeat prostate biopsy prediction. Discover oncology, 17(1). https://doi.org/10.1007/s12672-026-04777-9
MLA
Lou J, et al. "Meta learning optimized TabNet for small sample repeat prostate biopsy prediction." Discover oncology, vol. 17, no. 1, 2026.
PMID
41807867
Abstract
[PURPOSE] Repeat prostate biopsy prediction remains limited by small patient cohorts that constrain artificial intelligence application despite theoretical advantages in capturing complex clinical patterns. This study develops and validates a meta-learning optimized TabNet framework using readily available clinical parameters to overcome sample size constraints and enhance repeat biopsy (RB) prediction accuracy through knowledge transfer from larger initial biopsy (IB) cohorts, with particular applicability to resource-limited settings where mpMRI remains unavailable. Meta-learning enables rapid model adaptation by leveraging knowledge from related tasks with minimal training examples.
[METHODS] This retrospective study analyzed 2,087 initial prostate biopsies and 139 subsequent RBs without mpMRI data. A two-stage training paradigm implemented Model-Agnostic Meta-Learning for pre-training on IB data, followed by fine-tuning on the RB cohort. Performance evaluation included discrimination analysis, calibration assessment, and decision curve analysis compared to original TabNet and conventional machine learning approaches, with classification performance benchmarked against established clinical risk calculators.
[RESULTS] Among 139 RB patients, cancer was detected in 40 cases (28.8%), including 31 clinically significant cancers (77.5% of detected cancers). On the independent testing set of 42 patients, meta-learning TabNet achieved higher discriminative performance (AUROC 0.872) than XGBoost (0.808), original TabNet (0.800), and conventional approaches. The model was well calibrated (Brier score 0.068, ECE 0.100) and showed high specificity (90.0%) with only three false positives, substantially outperforming the ERSPC and PCPT calculators.
[CONCLUSION] Meta-learning optimization successfully addresses sample size limitations in repeat prostate biopsy prediction without requiring advanced imaging. This provides an evidence-based decision support tool enhancing diagnostic accuracy while minimizing unnecessary procedures.
Introduction
Initial prostate biopsies yield negative results in approximately 70% of cases [1]. However, patients with persistent suspicion of prostate cancer (PCa) present a significant clinical dilemma regarding the necessity for repeat tissue sampling. Among this select population with previous negative biopsies but ongoing clinical concern, repeat biopsy (RB) procedures yield positive results in merely 10–35% of cases [2, 3]. Accurate prediction of positive RB outcomes therefore represents a critical need for identifying patients most likely to harbor undetected PCa while avoiding unnecessary invasive procedures in those unlikely to benefit from additional tissue sampling.
Despite advances in biomarker development such as prostate cancer antigen 3 demonstrating enhanced predictive capability for RB outcomes [4, 5], clinical implementation of these novel markers remains constrained by cost considerations, limited availability in community practice settings, and lack of standardization across healthcare systems. Similarly, while multiparametric MRI (mpMRI) has emerged as a valuable tool for PCa detection, its routine application in RB decision-making remains limited by resource availability, extended waiting times, and substantial cost barriers in many clinical settings [6]. Consequently, current clinical decision-making for RB continues to rely predominantly on conventional risk assessment tools utilizing readily available clinical parameters. The European Randomized Study of Screening for Prostate Cancer (ERSPC) and Prostate Cancer Prevention Trial (PCPT) risk calculators represent the most widely validated instruments, employing multivariable logistic regression (LR) frameworks to integrate clinical variables including age, prostate-specific antigen (PSA) kinetics, digital rectal examination (DRE) findings, prostate volume, and prior biopsy characteristics [7, 8]. These tools have demonstrated moderate discriminative performance with area under the receiver operating characteristic curve (AUROC) values typically ranging from 0.63 to 0.71 in external validation cohorts [9]. The persistent limitation stems from the inherent linear assumptions of LR modeling [10], which inadequately captures complex non-linear interactions among clinical variables that influence cancer detection probability in RB scenarios.
Artificial intelligence (AI) algorithms offer potential to overcome these limitations through their capacity to identify complex multidimensional patterns without predetermined assumptions regarding variable relationships [11]. Among these approaches, gradient boosting tree and artificial neural network architectures have demonstrated superior performance in comparative studies, achieving AUROC values ranging from 0.76 to 0.83 [12, 13]. However, AI applications in RB prediction face a fundamental constraint imposed by inherently small cohort sizes [14]. RB populations typically encompass several hundred patients, with cancer-positive cases representing only a small fraction of this limited sample. This data scarcity creates a critical mismatch with conventional AI algorithms that require thousands of training examples for robust generalization and stable parameter estimation, particularly problematic for deep learning methods with high parameter counts relative to sample size [14]. The resulting overfitting risk when model complexity exceeds available data represents the primary barrier to clinical translation of advanced prediction techniques.
This study aims to address the fundamental challenge of small sample AI in repeat prostate biopsy prediction by developing a meta-learning optimized TabNet framework specifically designed for limited-data clinical settings. The TabNet architecture employs sequential attention mechanisms that enable effective feature selection and capture complex variable interactions in small tabular datasets through sparse instance-wise feature learning capacity [15, 16]. Meta-learning, also known as “learning to learn,” enables rapid model adaptation by extracting generalizable knowledge from related tasks during pre-training, then fine-tuning on target datasets with substantially fewer training examples than conventional deep learning approaches require [17]. By overcoming historical sample size constraints that have limited AI applications in this domain, the proposed framework offers urologists an evidence-based decision support tool that enhances diagnostic accuracy while reducing unnecessary invasive procedures for patients with previous negative biopsies.
Materials and methods
This retrospective study received institutional review board approval from the Affiliated People’s Hospital of Ningbo University (NDFRLS 2023-116). Given the retrospective design and use of deidentified clinical data for predictive model development, the requirement for individual informed consent was waived. All patient information was systematically deidentified prior to analysis according to institutional privacy standards to ensure confidentiality.
Data source
Clinical data were extracted from the institutional prostate biopsy database encompassing 2,087 procedures performed between January 2019 and May 2025. Initial systematic biopsies yielded positive results in 755 cases and negative results in 1,332 cases. Among patients with negative initial biopsies (IBs), 147 subsequently developed persistent clinical suspicion for PCa warranting RB evaluation. Persistent suspicion was defined according to established clinical practice guidelines [18] as persistently elevated or rising PSA levels, abnormal DRE findings, suspicious imaging lesions, or high-grade prostatic intraepithelial neoplasia or atypical small acinar proliferation on IB specimens. All 147 patients underwent systematic RB with complete pathological documentation. Eight patients with concurrent malignancies that could confound biomarker interpretation were excluded from analysis. The final analytical cohort comprised 139 patients, among whom RB detected PCa in 40 cases (28.8%).
Biopsy procedures and clinical data collection
All prostate biopsies were performed under transrectal ultrasound (TRUS) guidance using a standardized systematic sampling protocol. The procedure employed a transrectal approach with patients placed in the lateral decubitus position following prophylactic antibiotic administration and periprostatic local anesthesia. A 10–12 core systematic biopsy template was utilized, sampling both the peripheral and transition zones bilaterally in accordance with contemporary clinical practice guidelines [18]. When suspicious lesions were identified on ultrasound imaging, additional targeted biopsies were obtained. Prostate volume was measured by TRUS immediately prior to biopsy using the standard ellipsoid formula (height × width × length × π/6), with height and length measured in the sagittal plane and width in the axial plane.
Clinical and laboratory parameters were systematically extracted from electronic medical records. For the IB cohort, collected data included demographic characteristics (age, body mass index [BMI]), serum PSA level (ng/mL), free-to-total PSA ratio (fPSA/PSA, %), prostate volume (mL), PSA density (PSAD, ng/mL/cm³), DRE findings, imaging results, number of biopsy cores, and pathological outcomes classified according to the International Society of Urological Pathology grading system. For the RB cohort, the same baseline variables were collected at both the IB and RB timepoints. In addition, PSA velocity was calculated as the change in PSA concentration divided by the time interval between biopsies, expressed as ng/mL/year: PSA velocity = (PSA at RB - PSA at IB) / interval in years. Pathological diagnosis at IB was recorded as benign prostatic hyperplasia (BPH), chronic inflammation, prostatic intraepithelial neoplasia (PIN), or atypical small acinar proliferation (ASAP).
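The derived variables described above (ellipsoid prostate volume, PSAD, and PSA velocity) reduce to simple arithmetic. The following is a minimal Python sketch of those three formulas; the function names and the input checks are illustrative assumptions and are not taken from the authors' code.

```python
import math


def ellipsoid_volume_ml(height_cm, width_cm, length_cm):
    """Prostate volume by the ellipsoid formula: height x width x length x pi/6.

    Dimensions in cm give a volume in cm^3, which equals mL.
    """
    return height_cm * width_cm * length_cm * math.pi / 6.0


def psa_density(psa_ng_ml, volume_ml):
    """PSAD: serum PSA divided by prostate volume (ng/mL/cm^3)."""
    if volume_ml <= 0:
        raise ValueError("prostate volume must be positive")
    return psa_ng_ml / volume_ml


def psa_velocity(psa_at_ib, psa_at_rb, interval_years):
    """PSA velocity = (PSA at RB - PSA at IB) / interval in years (ng/mL/year)."""
    if interval_years <= 0:
        raise ValueError("interval between biopsies must be positive")
    return (psa_at_rb - psa_at_ib) / interval_years
```

For example, a rise from 6.0 to 9.0 ng/mL over 18 months corresponds to `psa_velocity(6.0, 9.0, 1.5)`, that is, 2.0 ng/mL/year.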
Data preprocessing
Continuous variables were standardized using z-score normalization. Categorical variables were encoded with binary format preserved and multi-class pathological categories transformed using ordinal encoding to maintain clinical hierarchy. Missing values were imputed using median values stratified by cancer status for continuous variables and mode values for categorical variables. The two-stage meta-learning framework required hierarchical data partitioning across the IB and RB cohorts. The IB cohort was partitioned into pre-training (80%) and validation (20%) sets using stratified random sampling to maintain balanced cancer representation, serving as the source domain for meta-learning knowledge transfer. Subsequently, the RB cohort was divided into training (70%) and testing (30%) sets with stratified sampling, constituting the target domain for task-specific model fine-tuning and independent evaluation. Hyperparameter optimization employed 5-fold cross-validation on the RB training set. No data augmentation was applied as meta-learning pre-training addressed sample size limitations while preserving authentic clinical distributions.
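The stratified partitioning and z-score normalization described above can be sketched in plain Python. This is an illustration under stated assumptions: the helper names, the rounding rule for the per-class test count, the random seed, and the choice to fit the normalization statistics on the training split only are all hypothetical, since the text does not report these details.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, seed=0):
    """Split indices so each class keeps roughly the same share in both sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)  # assumed rounding rule
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


def fit_zscore(values):
    """Return (mean, std) of a continuous variable; std falls back to 1.0 if zero."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = variance ** 0.5
    return mean, (std if std > 0 else 1.0)


def apply_zscore(values, mean, std):
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / std for v in values]
```

With 40 positive and 99 negative labels and `test_fraction=0.30`, `stratified_split` returns 97 training and 42 testing indices, with 12 positives in the testing set.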
Model development
A meta-learning optimized TabNet framework was developed following a two-stage training paradigm. During the pre-training phase, the TabNet architecture was trained on the IB pre-training set using the Model-Agnostic Meta-Learning (MAML) algorithm, with the IB validation set monitoring convergence. The pre-training employed the Adam optimizer with adaptive learning rate scheduling and nested gradient updates with inner and outer loop learning rates to facilitate knowledge transfer. Task-specific fine-tuning was subsequently performed on the RB training set with reduced learning rate to preserve pre-learned representations while adapting to RB characteristics. Early stopping was implemented for both phases based on validation loss with patience parameter 10. Hyperparameter optimization was conducted through systematic grid search with 5-fold cross-validation on the RB training set, evaluating combinations of architectural parameters (decision steps, feature dimensions), learning rates, meta-learning configurations, batch sizes, and regularization coefficients. The final model predicted positive RB outcomes using clinical and pathological features from both initial and repeat biopsy timepoints.
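To make the nested inner and outer updates concrete, the sketch below shows the two-loop structure on a deliberately tiny model. It is not the authors' implementation: it swaps TabNet for logistic regression, uses the first-order approximation of MAML (the outer step reuses the query-set gradient at the adapted weights rather than differentiating through the inner loop), and assumes that each task is a (support, query) subset of the source cohort, which the text does not specify. All function names are hypothetical.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def grad_logloss(w, X, y):
    """Gradient of mean binary cross-entropy; w[-1] is the bias term."""
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]
        err = sigmoid(z) - yi
        for j, xj in enumerate(xi):
            grad[j] += err * xj
        grad[-1] += err
    return [g / len(X) for g in grad]


def adapt(w, X, y, lr, steps):
    """Inner loop (also usable for fine-tuning): a few gradient steps from w."""
    for _ in range(steps):
        g = grad_logloss(w, X, y)
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    return w


def fomaml_pretrain(tasks, n_features, inner_lr, outer_lr, inner_steps, meta_epochs):
    """First-order MAML: learn an initialization that adapts quickly.

    tasks: list of (X_support, y_support, X_query, y_query) tuples.
    """
    w = [0.0] * (n_features + 1)
    for _ in range(meta_epochs):
        meta_grad = [0.0] * len(w)
        for X_s, y_s, X_q, y_q in tasks:
            w_task = adapt(w, X_s, y_s, inner_lr, inner_steps)
            # First-order approximation: query gradient at the adapted weights.
            g = grad_logloss(w_task, X_q, y_q)
            meta_grad = [a + b for a, b in zip(meta_grad, g)]
        w = [wj - outer_lr * gj / len(tasks) for wj, gj in zip(w, meta_grad)]
    return w
```

In the two-stage paradigm described above, the weights returned by `fomaml_pretrain` on source-cohort tasks would serve as the starting point for `adapt` on the target training set, typically with a smaller learning rate.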
Model interpretation
Feature importance was evaluated using TabNet’s inherent attention mechanism, which provides interpretable insights into clinical variable contributions through learned attention weights across sequential decision steps. Global feature importance scores were calculated by aggregating attention weights across all patients in the testing set. Comparative analysis between meta-learning TabNet and original TabNet was performed to assess how pre-training influenced feature prioritization and to evaluate alignment with established clinical predictors of PCa risk.
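The aggregation step described above can be illustrated as follows. This is a simplified sketch: it assumes the per-patient attention masks are already available as nested lists indexed by decision step, patient, and feature, and it sums them without weighting. TabNet implementations typically weight each step's mask by that step's contribution to the decision; that weighting is omitted here, and the function name is hypothetical.

```python
def global_feature_importance(masks, feature_names):
    """Aggregate attention masks into normalized global importance scores.

    masks[step][patient][feature] holds a non-negative attention weight.
    Returns a dict mapping feature name to its share of total attention.
    """
    totals = [0.0] * len(feature_names)
    for step_masks in masks:
        for patient_mask in step_masks:
            for j, weight in enumerate(patient_mask):
                totals[j] += weight
    grand_total = sum(totals)
    if grand_total == 0:
        raise ValueError("all attention weights are zero")
    return {name: t / grand_total for name, t in zip(feature_names, totals)}
```

Normalizing to a sum of 1 makes the scores comparable between two models, which is what a side-by-side ranking of feature priorities requires.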
Model validation
Model performance was evaluated on the RB testing set comprising 42 patients completely excluded from all training and hyperparameter optimization procedures. Five comparison models were developed to benchmark the meta-learning optimized TabNet framework: LR, random forest (RF), extreme gradient boosting (XGBoost), support vector machine (SVM), and original TabNet, all trained on the RB training set using identical input features with hyperparameters optimized through 5-fold cross-validation grid search. Discriminative ability was assessed as the primary performance metric with model calibration evaluated to assess agreement between predicted probabilities and observed outcomes. Decision curve analysis (DCA) was performed to quantify clinical utility by evaluating net benefit across threshold probabilities. Classification performance analysis subsequently evaluated all AI models alongside established clinical risk calculators including the ERSPC risk calculator 4 and PCPT risk calculator 2.0. Both calculators are specifically validated for RB scenarios in patients with previous negative biopsies. Secondary metrics including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and overall accuracy were calculated.
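The secondary classification metrics and the net benefit used in decision curve analysis follow directly from confusion-matrix counts. The sketch below uses the standard definitions, with net benefit at threshold probability p_t computed as TP/n - (FP/n) x p_t/(1 - p_t); the function names are illustrative and not from the authors' code.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, NPV, and accuracy from confusion counts."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


def net_benefit(tp, fp, n, threshold):
    """Net benefit of a model at one threshold probability (decision curve analysis)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return tp / n - (fp / n) * threshold / (1.0 - threshold)
```

As a consistency check on the reported figures: if 12 of the 42 testing patients had cancer (as the 28.6% prevalence implies), then three false positives among the 30 cancer-free patients give a specificity of 27/30 = 90.0%.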
Statistical analysis
Discriminative performance was quantified through receiver operating characteristic (ROC) curve analysis, with the area under the curve (AUC) serving as the primary metric. Statistical comparisons of AUC values between the meta-learning optimized TabNet and the comparison models were conducted using the DeLong test for correlated ROC curves, with Bonferroni correction applied for multiple comparisons. Model calibration was assessed through calibration plots and Brier score calculation, with calibration slope and intercept derived from logistic regression of observed outcomes against predicted probabilities. Expected calibration error (ECE) was computed as the weighted average of absolute differences between predicted probabilities and observed frequencies across ten equally sized probability bins. All statistical analyses were performed using Python 3.8, with two-sided p-values less than 0.05 considered statistically significant.
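The calibration metrics and the multiple-comparison adjustment named above can be sketched in a few lines. Two assumptions are made here: the ten bins are taken to be equal-width intervals of predicted probability (the text could also be read as equal-count bins), and Bonferroni correction is applied by multiplying each p-value by the number of comparisons and capping at 1. Function names are illustrative.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Weighted mean of |mean predicted probability - observed frequency| per bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        k = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[k].append((y, p))
    n = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins carry zero weight
        mean_prob = sum(p for _, p in members) / len(members)
        observed = sum(y for y, _ in members) / len(members)
        ece += (len(members) / n) * abs(mean_prob - observed)
    return ece


def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

Lower values indicate better calibration for both the Brier score and ECE. With five comparison models, a Bonferroni-adjusted p-value below 0.05 corresponds to an unadjusted p-value below 0.01.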
Results
Patient characteristics and cohort distribution
Initial systematic prostate biopsy yielded negative results in 63.8% of patients. Comparative analysis between cancer and non-cancer groups in the IB cohort demonstrated significant differences across key clinical parameters including age, PSA levels, fPSA/PSA ratios, prostate volumes, PSAD, DRE findings, and imaging characteristics (all P < 0.05, Table 1). Among the 139 patients in the RB cohort, cancer was detected in 40 cases (28.8%), with 31 cases (77.5%) representing clinically significant PCa. Gleason score distribution showed predominantly intermediate-grade disease, including 21 cases of 3 + 4 = 7 (52.5%), 8 cases of 4 + 3 = 7 (20.0%), 2 cases of score 8 (5.0%), and 9 cases of score 6 (22.5%). The cancer group exhibited distinct clinical profiles characterized by elevated PSA parameters, increased PSA velocity, and higher prevalence of high-risk pathological findings on IB specimens compared to the non-cancer group (all P < 0.05, Table 2).
The IB cohort was allocated into meta-learning pre-training (n = 1,670, cancer prevalence 36.2%) and validation (n = 417, cancer prevalence 36.2%) sets, while the RB cohort was divided into training (n = 97, cancer prevalence 28.9%) and testing (n = 42, cancer prevalence 28.6%) cohorts using stratified random sampling. Detailed comparisons between pre-training and validation sets for the IB cohort and between training and testing sets for the RB cohort are presented in Supplementary Tables S1 and S2, respectively, with no statistically significant differences observed across all evaluated parameters between respective cohorts (all P > 0.05).
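A stratified random split of the RB cohort can be sketched as below. This is a hedged illustration, not the authors' procedure: the 0.3 test fraction, the seed, and the function name are assumptions chosen because they reproduce the reported set sizes (97 training, 42 testing) from the 40 cancer and 99 non-cancer cases.

```python
import random


def stratified_split(labels, test_fraction, seed=0):
    """Split indices into train/test while preserving each class's proportion."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)  # per-class test quota
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# RB cohort composition from the text: 40 cancers among 139 patients.
labels = [1] * 40 + [0] * 99
train_idx, test_idx = stratified_split(labels, test_fraction=0.3)

train_cancers = sum(labels[i] for i in train_idx)
test_cancers = sum(labels[i] for i in test_idx)
train_prevalence = train_cancers / len(train_idx)  # 28 / 97
test_prevalence = test_cancers / len(test_idx)     # 12 / 42
```

Splitting within each class separately is what keeps the cancer prevalence nearly identical in the two sets, which matters when the smaller set holds only a dozen positive cases.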
Model training
The meta-learning optimized TabNet model was established through a two-stage training paradigm (Fig. 1). During the pre-training phase on the IB cohort, the network was trained for 60 epochs using 4 decision steps, feature dimension 32, attention dimension 16, and the Adam optimizer with initial learning rate 0.01 and cosine annealing, batch size 128, and weight decay 1e-4. Meta-learning optimization followed the MAML scheme with 5 inner-loop updates, inner learning rate 0.001, and outer learning rate 0.005. Pre-training achieved effective knowledge acquisition with validation loss decreasing from 0.720 to 0.311, with early stopping triggered at epoch 25 using patience parameter 10. The fine-tuning phase employed learning rate 0.001, batch size 64, dropout 0.20, and maximum 30 epochs with early stopping monitoring. Meta-learning TabNet achieved optimal performance at epoch 16 with validation loss 0.196, while original TabNet reached early stopping at epoch 29 with validation loss of 0.288, demonstrating the initialization advantage conferred by pre-trained representations.
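The inner-loop and outer-loop structure of MAML can be illustrated with a deliberately small example. The sketch below is not the authors' implementation: it swaps TabNet for a two-parameter logistic model, uses the first-order approximation of MAML (no second-order gradients), and runs on synthetic data. Only the 5 inner updates, inner learning rate 0.001, and outer learning rate 0.005 are taken from the text; every function name, the task construction, and the number of meta-iterations are assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_and_grad(w, data):
    """Mean binary cross-entropy and its gradient for a logistic model."""
    loss, grad = 0.0, [0.0] * len(w)
    for x, y in data:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        p_safe = min(max(p, 1e-12), 1.0 - 1e-12)  # avoid log(0)
        loss -= y * math.log(p_safe) + (1 - y) * math.log(1.0 - p_safe)
        for j, xj in enumerate(x):
            grad[j] += (p - y) * xj
    n = len(data)
    return loss / n, [g / n for g in grad]


def adapt(w, support, inner_lr=0.001, inner_steps=5):
    """Inner loop: a few gradient steps on one task's support set."""
    w = list(w)
    for _ in range(inner_steps):
        _, g = loss_and_grad(w, support)
        w = [wi - inner_lr * gi for wi, gi in zip(w, g)]
    return w


def fomaml_step(w, tasks, inner_lr=0.001, outer_lr=0.005, inner_steps=5):
    """Outer loop: update the shared initialisation with query-set gradients
    evaluated at the adapted weights (first-order approximation)."""
    meta_grad = [0.0] * len(w)
    for support, query in tasks:
        adapted = adapt(w, support, inner_lr, inner_steps)
        _, g = loss_and_grad(adapted, query)
        meta_grad = [m + gi / len(tasks) for m, gi in zip(meta_grad, g)]
    return [wi - outer_lr * mi for wi, mi in zip(w, meta_grad)]


# Synthetic toy data (not patient data): label is 1 when the feature is positive.
rng = random.Random(0)


def make_samples(n):
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [([1.0, x], 1 if x > 0 else 0) for x in xs]  # [bias, feature]


w_init = [0.0, 0.0]
w_meta = list(w_init)
for _ in range(200):  # number of meta-iterations is an arbitrary choice
    tasks = [(make_samples(20), make_samples(20)) for _ in range(4)]
    w_meta = fomaml_step(w_meta, tasks)

# Compare adaptation on a small new task from the two starting points.
eval_data = make_samples(200)
loss_before, _ = loss_and_grad(adapt(w_init, eval_data[:20]), eval_data[20:])
loss_after, _ = loss_and_grad(adapt(w_meta, eval_data[:20]), eval_data[20:])
```

The comparison at the end mirrors the idea behind the two-stage paradigm: the same few adaptation steps on a small sample reach a lower loss when they start from the meta-learned initialisation than when they start from scratch.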
Feature importance analysis revealed both shared priorities and distinct clinical variable weighting patterns between the two models (Fig. 2). Both models identified PSAD at RB as the most critical predictor, though meta-learning TabNet assigned substantially higher importance. Meta-learning TabNet demonstrated enhanced utilization of historical clinical information, prioritizing imaging suspicion as the second-ranked feature and elevating PSAD at IB to fourth position. This pattern reflects successful knowledge transfer from pre-training on the larger IB cohort. In contrast, original TabNet ranked PSA at RB second and PSA velocity third, with imaging suspicion and historical PSAD parameters receiving relatively reduced weighting at fourth and lower positions respectively.
Model evaluation
To benchmark the meta-learning optimized TabNet framework, five comparison models were trained on the RB training set using identical input features with hyperparameters optimized through 5-fold cross-validation grid search. The meta-learning optimized TabNet demonstrated superior discriminative ability on the independent testing set (Fig. 3A), achieving the highest AUROC of 0.872 compared to XGBoost, original TabNet, RF, SVM, and LR, with statistical significance confirmed by DeLong test (all P < 0.001 after Bonferroni correction). Calibration analysis revealed optimal performance for meta-learning TabNet with the lowest Brier score of 0.068 and ECE of 0.100 (Fig. 3B; Table 3), substantially outperforming comparison models which exhibited Brier scores ranging from 0.182 to 0.270. DCA demonstrated superior net clinical benefit for meta-learning TabNet across threshold probabilities from 0.1 to 0.6 (Fig. 3C), maintaining peak net benefit of approximately 0.25 between threshold probabilities of 0.2 and 0.4, consistently exceeding both treat-all and treat-none strategies while comparison models showed progressively diminishing clinical utility.
Classification performance analysis
Classification performance evaluation demonstrated that meta-learning TabNet achieved superior balanced diagnostic accuracy compared to conventional ML approaches and established clinical assessment tools (Table 4). All variables required for ERSPC and PCPT calculator implementation, including age, PSA levels, DRE findings, prostate volume, and prior biopsy characteristics, were completely available in our dataset. The model attained the highest overall accuracy and specificity while generating only three false positive classifications, indicating strong capability in avoiding unnecessary repeat biopsies. This contrasts markedly with XGBoost, which despite achieving the second-highest accuracy, produced eight false positives and correspondingly lower specificity, reflecting the typical trade-off where aggressive cancer detection leads to more unnecessary procedures. The established clinical risk calculators ERSPC and PCPT, which serve as standard decision support tools for prostate cancer risk stratification in routine practice, performed comparably to LR with notably limited overall diagnostic performance (accuracy 0.714), highlighting the constraints of conventional linear modeling approaches in complex clinical prediction scenarios.
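The link between false-positive counts and specificity can be checked directly. The sketch below is illustrative: the function names are introduced here, and the count of 30 non-cancer test patients is derived from the paper's statement that the 42-patient testing set contained 12 cancer-positive cases. True-positive and false-negative counts are not given in this section, so the general metrics function is shown without study values.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from a 2x2 confusion matrix (all cells assumed > 0)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def specificity_from_false_positives(fp, n_negative):
    """Specificity when only the false-positive count and negatives are known."""
    return (n_negative - fp) / n_negative


# Testing set: 42 patients, 12 with cancer, hence 30 without.
n_negative = 42 - 12
spec_meta_tabnet = specificity_from_false_positives(3, n_negative)  # 27 / 30
spec_xgboost = specificity_from_false_positives(8, n_negative)      # 22 / 30
```

Three false positives among 30 cancer-free patients correspond to a specificity of 0.90, while eight correspond to roughly 0.73, which is the gap the paragraph above describes.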
Discussion
While traditional LR modeling inadequately captures complex non-linear interactions among clinical variables in RB scenarios, current AI approaches for RB prediction remain fundamentally constrained by the inherent small sample sizes characteristic of this clinical population, creating a critical gap between the potential of advanced ML techniques and their practical implementation in urological practice. This investigation developed and validated a meta-learning optimized TabNet framework specifically designed to address limited-data clinical scenarios through a novel two-stage training paradigm that leverages knowledge transfer from IB cohorts to enhance RB prediction accuracy. The proposed framework demonstrated superior discriminative performance compared to conventional ML approaches, original TabNet architecture, and established clinical risk calculators. To our knowledge, this study represents the first successful application of meta-learning optimization to overcome sample size limitations in RB prediction, demonstrating that transfer learning from related clinical tasks can effectively bridge the gap between advanced AI capabilities and small-cohort urological applications.
Meta-learning in clinical prediction
Meta-learning optimization addresses limited-data clinical scenarios through “learning to learn,” whereby models rapidly adapt to new tasks using minimal training examples by leveraging knowledge from related tasks during pre-training [17]. MAML achieves this by optimizing initial model parameters for effective generalization with limited task-specific data [19]. This approach has demonstrated promise across diverse medical applications, including few-shot image classification for rare diseases, clinical decision support, and survival analysis where traditional deep learning faces data scarcity constraints [20–22]. Recent systematic reviews have highlighted meta-learning’s value in medical imaging for rare conditions where annotated datasets remain scarce and expensive, demonstrating that meta-learned models achieve performance comparable to conventional approaches with substantially fewer training examples [23–25]. For RB prediction specifically, this knowledge transfer capability from the larger IB cohort to the limited RB population directly addresses the mismatch between deep learning’s data requirements and the inherently small sample sizes of specialized clinical populations.
Performance evaluation of meta-learning TabNet
In our study, the distinct training convergence patterns between meta-learning TabNet and original TabNet provide compelling evidence for knowledge transfer effectiveness in small sample clinical scenarios. Meta-learning TabNet achieved optimal performance at epoch 16 with validation loss 0.196, while original TabNet required 29 epochs to converge at 0.288, demonstrating that pre-trained representations from the larger IB cohort accelerated learning efficiency and enhanced final model performance. This accelerated convergence aligns with established meta-learning principles where pre-acquired knowledge enables rapid adaptation to new tasks with limited data [26], particularly valuable in clinical domains where specialized patient cohorts inherently constrain sample sizes.
Beyond training efficiency, the knowledge transfer manifested in distinct feature importance patterns that favored clinically established predictors. Meta-learning TabNet elevated PSAD at IB to fourth-ranked importance and prioritized imaging suspicion as the second-most critical feature, while original TabNet demonstrated greater reliance on PSA velocity with relatively diminished weighting of historical pathological information. This enhanced utilization of historically validated predictors reflects successful knowledge integration from the larger IB cohort and contributes directly to the model’s superior discriminative performance achieved by the meta-learning framework. The model prioritization of PSAD and imaging suspicion aligns with extensive clinical evidence demonstrating their superior predictive value for PCa detection. PSAD consistently outperforms PSA alone in multiple validation cohorts by accounting for benign prostatic hyperplasia effects [27, 28], while imaging suspicion identified on TRUS represents a well-established indicator of malignancy risk in RB scenarios [29, 30]. This alignment between model-derived feature importance and clinically validated predictors enhances interpretability and facilitates clinical acceptance by reflecting established decision-making frameworks used by urologists.
The superior discriminative performance of meta-learning TabNet compared to original TabNet and other ML approaches reflects meaningful improvements in complex pattern recognition capabilities, consistent with recent systematic reviews demonstrating meta-learning advantages in medical applications with data scarcity constraints [31]. Notably, our meta-learning TabNet outperformed XGBoost, which has shown robust performance in previous PCa prediction studies [12], suggesting that attention-based sequential feature selection mechanisms combined with knowledge transfer provide superior modeling of non-linear clinical variable interactions. The calibration analysis revealed striking improvements for meta-learning TabNet with the lowest Brier score and ECE, substantially outperforming comparison models with Brier scores ranging from 0.182 to 0.270, indicating that meta-learning optimization enhanced both discrimination and reliability of probability estimates crucial for clinical decision-making.
The classification performance demonstrated clinically meaningful specificity improvements that directly address the core challenge in RB scenarios. By minimizing false positive predictions while maintaining reasonable sensitivity, the framework reduces unnecessary invasive procedures without compromising cancer detection. This becomes particularly valuable given that RB procedures yield positive results in only 10–35% of cases [32], making accurate identification of patients unlikely to harbor cancer essential for optimal clinical resource allocation. The high proportion of clinically significant disease among detected cancers indicates that positive predictions in this RB cohort predominantly correspond to malignancies requiring clinical intervention rather than indolent disease, supporting the model’s practical utility for guiding RB decisions. Our conventional ML models demonstrated performance metrics comparable to recent literature reports, with XGBoost achieving an AUROC of 0.808, in line with the 0.761 reported by Zhang et al. [12]. The established clinical risk calculators ERSPC and PCPT, both specifically designed for RB scenarios, achieved an accuracy of 0.714, broadly in keeping with their reported AUROC values of 0.71–0.79 in external validation cohorts [33]. Because all variables required by both calculators were completely available in our dataset, their limited performance cannot be attributed to missing inputs; it instead points to the persistent limitations of conventional linear modeling approaches in capturing the complex clinical variable interactions that influence cancer detection probability in RB scenarios.
Clinical implications for urological practice
The clinical implications of our meta-learning optimized TabNet framework are substantial, providing urologists with an evidence-based decision support tool that could reduce unnecessary invasive procedures while maintaining optimal cancer detection rates in patients with previous negative biopsies and persistent clinical suspicion. The high specificity achieved translates directly to clinical benefit by preventing unnecessary RBs in patients unlikely to harbor cancer, thereby reducing patient morbidity and healthcare costs. The model achieves this through risk-adapted patient stratification that enables tailored management decisions. In practice, the framework integrates multiple clinical parameters to generate individual cancer probability estimates that guide decision-making in cases where conventional risk assessment provides insufficient clarity. For instance, consider a patient presenting with moderately elevated PSA at 15 ng/mL and PSAD 0.28 ng/mL/cm³, values that fall into an ambiguous range and create clinical uncertainty. Despite these concerning parameters, when the model incorporates additional features including stable PSA velocity, benign initial pathology, and absence of imaging suspicion, it generates a predicted probability of 0.38, indicating low cancer risk that supports continued surveillance rather than immediate RB. This capacity to identify low-risk patients within clinically uncertain scenarios allows appropriate observation in cases with low malignancy probability while ensuring timely biopsy for those with substantial cancer risk. Such accurate patient prioritization proves particularly valuable in healthcare systems with limited biopsy capacity, optimizing resource allocation and reducing waiting times for highest-risk patients [34]. 
The framework’s reliance on routinely collected clinical parameters without requiring specialized biomarkers or advanced imaging further enhances accessibility across diverse practice settings, addressing a key implementation barrier that has constrained clinical adoption of many artificial intelligence applications in PCa management.
Limitations and future directions
However, several limitations warrant consideration. This single-center retrospective analysis of 139 RB patients may limit generalizability to diverse populations and clinical settings. The modest cohort size constrains model development in multiple ways. While meta-learning optimization partially addresses sample size limitations through knowledge transfer from larger IB cohorts, the independent testing set comprises only 42 patients with 12 cancer-positive cases, which limits statistical precision of performance estimates and may result in wider confidence intervals than larger validation cohorts would provide. An additional methodological consideration involves patient overlap between training phases. The RB cohort comprises patients who were present in the IB cohort during meta-learning pre-training, representing 6.7% of the source domain population. Although the model aims to extract generalizable discriminative patterns rather than memorize individual patient characteristics, and RB prediction incorporates longitudinal features unavailable during pre-training, this overlap may introduce optimistic bias in performance estimates. These design-level limitations collectively underscore the critical need for external validation on completely independent patient populations with adequate sample sizes to establish true generalizability and obtain stable performance estimates.
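The effect of the small testing set on statistical precision can be made concrete with a Wilson score interval for a proportion. The sketch below is an illustration added here, not an analysis from the study: the function name is an assumption, the 27-of-30 figure is derived from the three false positives among the 30 cancer-free test patients reported in the Results, and the tenfold larger cohort is purely hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half


# Specificity in the testing set: 27 of 30 cancer-free patients classified correctly.
spec_lo, spec_hi = wilson_interval(27, 30)

# Hypothetical cohort ten times larger with the same observed proportion.
big_lo, big_hi = wilson_interval(270, 300)

width_small = spec_hi - spec_lo
width_large = big_hi - big_lo
```

With only 30 negative cases the interval around a specificity of 0.90 spans more than 0.2, and it narrows considerably at the larger hypothetical sample size, which is why external validation in a bigger cohort is needed for stable estimates.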
Beyond study design considerations, clinical implementation choices impose additional constraints. Our cohort predominantly employed a transrectal biopsy approach, whereas transperineal biopsy offers lower infection risk and improved sampling accuracy. Model performance under transperineal protocols requires future validation. Similarly, our systematic TRUS-guided biopsy protocols with targeted sampling of ultrasound-visible lesions differed from the mpMRI fusion-guided approaches increasingly advocated by current international guidelines, including the 2025 EAU guidelines [18]. This methodological choice prioritized broad applicability and data consistency given considerable variation in mpMRI availability and protocols across our study period. While our framework utilizing readily available clinical parameters and conventional ultrasound guidance addresses practical resource constraints in many healthcare systems, incorporating mpMRI-derived features could potentially further enhance predictive performance in settings where advanced imaging is routinely available. Finally, comprehensive external validation across diverse clinical environments with varied technical protocols remains essential to establish model robustness and clinical utility.
Future research should prioritize multicenter external validation to establish model robustness across diverse clinical settings with varied patient demographics and biopsy protocols. Such validation efforts could proceed through two complementary approaches. Retrospectively, collaborative multi-institutional studies pooling RB cohorts would enhance meta-learning training datasets while enabling comprehensive assessment of model performance across different healthcare environments. Prospectively, integration of the framework into actual clinical workflows represents the critical next step, allowing evaluation of not only diagnostic accuracy but also real-world impact on clinical decision-making patterns, RB rates, and patient outcomes. These prospective studies should systematically assess clinician acceptance and identify implementation barriers to inform deployment strategies. Beyond validation of the current framework, future model development could explore hybrid approaches incorporating advanced imaging features alongside conventional clinical parameters to further enhance predictive accuracy while maintaining broad feasibility. Successful clinical translation will ultimately require seamless electronic health record integration and user-friendly decision support interfaces that fit naturally into existing urological practice workflows.
While traditional LR modeling inadequately captures complex non-linear interactions among clinical variables in RB scenarios, current AI approaches for RB prediction remain fundamentally constrained by the inherent small sample sizes characteristic of this clinical population, creating a critical gap between the potential of advanced ML techniques and their practical implementation in urological practice. This investigation developed and validated a meta-learning optimized TabNet framework specifically designed to address limited-data clinical scenarios through a novel two-stage training paradigm that leverages knowledge transfer from IB cohorts to enhance RB prediction accuracy. The proposed framework demonstrated superior discriminative performance compared to conventional ML approaches, original TabNet architecture, and established clinical risk calculators. To our knowledge, this study represents the first successful application of meta-learning optimization to overcome sample size limitations in RB prediction, demonstrating that transfer learning from related clinical tasks can effectively bridge the gap between advanced AI capabilities and small-cohort urological applications.
Meta-learning in clinical prediction
Meta-learning optimization addresses limited-data clinical scenarios through “learning to learn,” whereby models rapidly adapt to new tasks using minimal training examples by leveraging knowledge from related tasks during pre-training [17]. MAML achieves this by optimizing initial model parameters for effective generalization with limited task-specific data [19]. This approach has demonstrated promise across diverse medical applications, including few-shot image classification for rare diseases, clinical decision support, and survival analysis where traditional deep learning faces data scarcity constraints [20–22]. Recent systematic reviews have highlighted meta-learning’s value in medical imaging for rare conditions where annotated datasets remain scarce and expensive, demonstrating that meta-learned models achieve performance comparable to conventional approaches with substantially fewer training examples [23–25]. For RB prediction specifically, this knowledge transfer capability from the larger IB cohort to the limited RB population directly addresses the mismatch between deep learning’s data requirements and the inherently small sample sizes of specialized clinical populations.
Performance evaluation of meta-learning TabNet
In our study, the distinct training convergence patterns between meta-learning TabNet and original TabNet provide compelling evidence for knowledge transfer effectiveness in small sample clinical scenarios. Meta-learning TabNet achieved optimal performance at epoch 16 with validation loss 0.196, while original TabNet required 29 epochs to converge at 0.288, demonstrating that pre-trained representations from the larger IB cohort accelerated learning efficiency and enhanced final model performance. This accelerated convergence aligns with established meta-learning principles where pre-acquired knowledge enables rapid adaptation to new tasks with limited data [26], particularly valuable in clinical domains where specialized patient cohorts inherently constrain sample sizes.
Beyond training efficiency, the knowledge transfer manifested in distinct feature importance patterns that favored clinically established predictors. Meta-learning TabNet elevated PSAD at IB to fourth-ranked importance and prioritized imaging suspicion as the second-most critical feature, while original TabNet demonstrated greater reliance on PSA velocity with relatively diminished weighting of historical pathological information. This enhanced utilization of historically validated predictors reflects successful knowledge integration from the larger IB cohort and contributes directly to the model’s superior discriminative performance achieved by the meta-learning framework. The model prioritization of PSAD and imaging suspicion aligns with extensive clinical evidence demonstrating their superior predictive value for PCa detection. PSAD consistently outperforms PSA alone in multiple validation cohorts by accounting for benign prostatic hyperplasia effects [27, 28], while imaging suspicion identified on TRUS represents a well-established indicator of malignancy risk in RB scenarios [29, 30]. This alignment between model-derived feature importance and clinically validated predictors enhances interpretability and facilitates clinical acceptance by reflecting established decision-making frameworks used by urologists.
The superior discriminative performance of meta-learning TabNet compared to original TabNet and other ML approaches reflects meaningful improvements in complex pattern recognition capabilities, consistent with recent systematic reviews demonstrating meta-learning advantages in medical applications with data scarcity constraints [31]. Notably, our meta-learning TabNet outperformed XGBoost, which has shown robust performance in previous PCa prediction studies [12], suggesting that attention-based sequential feature selection mechanisms combined with knowledge transfer provide superior modeling of non-linear clinical variable interactions. The calibration analysis revealed striking improvements for meta-learning TabNet with the lowest Brier score and ECE, substantially outperforming comparison models with Brier scores ranging from 0.182 to 0.270, indicating that meta-learning optimization enhanced both discrimination and reliability of probability estimates crucial for clinical decision-making.
The classification performance demonstrated clinically meaningful specificity improvements that directly address the core challenge in RB scenarios. By minimizing false positive predictions while maintaining reasonable sensitivity, the framework reduces unnecessary invasive procedures without compromising cancer detection. This becomes particularly valuable given that RB procedures yield positive results in only 10–35% of cases [32], making accurate identification of patients unlikely to harbor cancer essential for optimal clinical resource allocation. The high proportion of clinically significant disease among detected cancers indicates that positive predictions in this RB cohort predominantly correspond to malignancies requiring clinical intervention rather than indolent disease, supporting the model’s practical utility for guiding RB decisions. Our conventional ML models demonstrated performance metrics comparable to recent literature reports, with XGBoost achieving AUROC 0.808 similar to the 0.761 reported by Zhang et al. [12]. The established clinical risk calculators ERSPC and PCPT, both specifically designed for RB scenarios, achieved accuracy of 0.714, consistent with their reported AUROC values of 0.71–0.79 in external validation cohorts [33]. All required variables for both calculators were completely available in our dataset, confirming the persistent limitations of conventional linear modeling approaches in capturing complex clinical variable interactions that influence cancer detection probability in RB scenarios.
Clinical implications for urological practice
The clinical implications of our meta-learning optimized TabNet framework are substantial, providing urologists with an evidence-based decision support tool that could reduce unnecessary invasive procedures while maintaining optimal cancer detection rates in patients with previous negative biopsies and persistent clinical suspicion. The high specificity achieved translates directly to clinical benefit by preventing unnecessary RBs in patients unlikely to harbor cancer, thereby reducing patient morbidity and healthcare costs. The model achieves this through risk-adapted patient stratification that enables tailored management decisions. In practice, the framework integrates multiple clinical parameters to generate individual cancer probability estimates that guide decision-making in cases where conventional risk assessment provides insufficient clarity. For instance, consider a patient presenting with moderately elevated PSA at 15 ng/mL and PSAD 0.28 ng/mL/cm³, values that fall into an ambiguous range and create clinical uncertainty. Despite these concerning parameters, when the model incorporates additional features including stable PSA velocity, benign initial pathology, and absence of imaging suspicion, it generates a predicted probability of 0.38, indicating low cancer risk that supports continued surveillance rather than immediate RB. This capacity to identify low-risk patients within clinically uncertain scenarios allows appropriate observation in cases with low malignancy probability while ensuring timely biopsy for those with substantial cancer risk. Such accurate patient prioritization proves particularly valuable in healthcare systems with limited biopsy capacity, optimizing resource allocation and reducing waiting times for highest-risk patients [34]. 
The framework’s reliance on routinely collected clinical parameters without requiring specialized biomarkers or advanced imaging further enhances accessibility across diverse practice settings, addressing a key implementation barrier that has constrained clinical adoption of many artificial intelligence applications in PCa management.
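The threshold-based stratification described above can be illustrated with a minimal Python sketch. It assumes a hypothetical decision threshold of 0.50, since the text does not state the operating threshold, and it uses the standard decision-curve net benefit formula. The function names `recommend` and `net_benefit` and all counts are illustrative and are not taken from the study.

```python
def recommend(probability: float, threshold: float = 0.50) -> str:
    """Map a predicted cancer probability to a management suggestion.

    The default threshold of 0.50 is an assumption for illustration only.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return "repeat biopsy" if probability >= threshold else "continued surveillance"


def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Decision-curve net benefit: TP/n - (FP/n) * pt / (1 - pt).

    tp and fp count patients flagged for biopsy who do and do not have cancer,
    n is the cohort size, and pt is the threshold probability.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    weight = threshold / (1.0 - threshold)  # relative harm of one unnecessary biopsy
    return tp / n - (fp / n) * weight


if __name__ == "__main__":
    # The worked example in the text: a predicted probability of 0.38.
    print(recommend(0.38))

    # Hypothetical cohort of 100 patients with 25 cancers (illustrative counts).
    print(net_benefit(tp=20, fp=5, n=100, threshold=0.50))   # model-guided biopsy
    print(net_benefit(tp=25, fp=75, n=100, threshold=0.50))  # biopsy everyone
```

In this illustration the model-guided strategy has a positive net benefit while biopsying every patient does not, which is the kind of comparison a decision curve analysis reports across a range of thresholds.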
Limitations and future directions
Several limitations warrant consideration. This single-center retrospective analysis of 139 RB patients may limit generalizability to diverse populations and clinical settings. The modest cohort size constrains model development in multiple ways. While meta-learning optimization partially addresses sample size limitations through knowledge transfer from larger IB cohorts, the independent testing set comprises only 42 patients with 12 cancer-positive cases, which limits the statistical precision of performance estimates and yields wider confidence intervals than larger validation cohorts would provide. An additional methodological consideration involves patient overlap between training phases. The RB cohort comprises patients who were also present in the IB cohort used for meta-learning pre-training, representing 6.7% of the source domain population. Although the model aims to extract generalizable discriminative patterns rather than memorize individual patient characteristics, and RB prediction incorporates longitudinal features unavailable during pre-training, this overlap may introduce optimistic bias in performance estimates. These design-level limitations collectively underscore the critical need for external validation on completely independent patient populations with adequate sample sizes to establish true generalizability and obtain stable performance estimates.
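To make the precision concern concrete, the following Python sketch computes a Wilson score interval for a proportion estimated from a small test set. It assumes that the reported specificity of 90.0% refers to the 30 cancer-negative patients in the test set (42 patients minus 12 positives), which would correspond to 27 of 30. That count is inferred here and is not stated in the text. The comparison cohort of 300 negatives is purely hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("require n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Assumed count: 27 true negatives among 30 cancer-negative test patients.
    small = wilson_interval(27, 30)
    # Hypothetical cohort ten times larger with the same observed specificity.
    large = wilson_interval(270, 300)
    print(f"n=30:  {small[0]:.3f} to {small[1]:.3f}")
    print(f"n=300: {large[0]:.3f} to {large[1]:.3f}")
```

Under these assumptions the interval for 30 negatives spans more than 20 percentage points, while the same observed proportion in a cohort ten times larger gives a much narrower interval, which is why external validation with adequate sample sizes is needed for stable estimates.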
Beyond study design considerations, clinical implementation choices impose additional constraints. Our cohort predominantly employed the transrectal biopsy approach, whereas transperineal biopsy offers lower infection risk and improved sampling accuracy. Model performance under transperineal protocols requires future validation. Similarly, our systematic TRUS-guided biopsy protocols with targeted sampling of ultrasound-visible lesions differed from the mpMRI fusion-guided approaches increasingly advocated by current international guidelines, including the 2025 EAU guidelines [18]. This methodological choice prioritized broad applicability and data consistency given considerable variation in mpMRI availability and protocols across our study period. While our framework, which uses readily available clinical parameters and conventional ultrasound guidance, addresses practical resource constraints in many healthcare systems, incorporating mpMRI-derived features could further enhance predictive performance in settings where advanced imaging is routinely available. Finally, comprehensive external validation across diverse clinical environments with varied technical protocols remains essential to establish model robustness and clinical utility.
Future research should prioritize multicenter external validation to establish model robustness across diverse clinical settings with varied patient demographics and biopsy protocols. Such validation efforts could proceed through two complementary approaches. Retrospectively, collaborative multi-institutional studies pooling RB cohorts would enhance meta-learning training datasets while enabling comprehensive assessment of model performance across different healthcare environments. Prospectively, integration of the framework into actual clinical workflows represents the critical next step, allowing evaluation of not only diagnostic accuracy but also real-world impact on clinical decision-making patterns, RB rates, and patient outcomes. These prospective studies should systematically assess clinician acceptance and identify implementation barriers to inform deployment strategies. Beyond validation of the current framework, future model development could explore hybrid approaches incorporating advanced imaging features alongside conventional clinical parameters to further enhance predictive accuracy while maintaining broad feasibility. Successful clinical translation will ultimately require seamless electronic health record integration and user-friendly decision support interfaces that fit naturally into existing urological practice workflows.
Conclusion
To our knowledge, this study represents the first successful application of meta-learning optimization to repeat prostate biopsy prediction, demonstrating superior performance compared to conventional ML approaches and established clinical risk calculators. By leveraging knowledge transfer from larger IB cohorts, the framework addresses the fundamental challenge of limited sample sizes in specialized clinical populations. The high specificity achieved translates to meaningful reductions in unnecessary invasive procedures while maintaining optimal cancer detection rates. The meta-learning optimized TabNet provides urologists with an evidence-based decision support tool that enhances diagnostic accuracy and optimizes resource allocation for patients with previous negative biopsies and persistent clinical suspicion. However, further validation across diverse populations and clinical settings is needed to establish generalizability.
Supplementary Information
Below is the link to the electronic supplementary material.
Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.