Simulation and empirical evaluation of biologically-informed neural network performance.
APA
Miller GA, Roman A, et al. (2025). Simulation and empirical evaluation of biologically-informed neural network performance. bioRxiv: the preprint server for biology. https://doi.org/10.1101/2025.11.13.687845
MLA
Miller GA, et al. "Simulation and empirical evaluation of biologically-informed neural network performance." bioRxiv: the preprint server for biology, 2025.
PMID
41292768
Abstract
Biologically-informed neural networks (BiNNs) offer interpretable deep learning models for biological data, but the dataset characteristics required for strong performance remain poorly understood. For instance, we previously developed P-NET, a BiNN with an architecture based on the Reactome pathway database, and applied this model to predict metastatic status of patients with prostate cancer using somatic mutation and copy number information. It seems likely that including additional relevant signal (e.g., germline variation in this context) should improve model performance, but we currently lack a principled approach to assess whether BiNNs will successfully detect this signal. Here, we developed two simulation frameworks to evaluate the factors that influence BiNN performance, including signal type, signal strength, feature sparsity, and sample size, and empirically tested how integrating germline and somatic data affects the model's ability to predict prostate cancer metastatic status. Simulations revealed that small sample size, weak signal strength, and especially extreme feature sparsity limit BiNN performance, and that the model preferentially uses linear over nonlinear signal. Empirically, P-NET performed poorly on sparse germline data, and while adding germline to somatic data did not improve prediction, it improved gene prioritization and model interpretation. Broadly, our simulation frameworks enable systematic evaluation of how dataset-level characteristics affect BiNN performance and provide a principled framework for benchmarking novel methods.
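To make the four simulated factors concrete, the following is a minimal sketch of how a dataset with tunable signal type, signal strength, feature sparsity, and sample size might be generated. It is not the authors' simulation framework: the function name `simulate_dataset`, its parameters, the choice of binary features, and the logistic label model are all assumptions made for illustration.

```python
import math
import random


def simulate_dataset(n_samples, n_features, sparsity, signal_strength,
                     signal_type="linear", n_causal=4, seed=0):
    """Simulate sparse binary features and binary labels (hypothetical sketch).

    sparsity        -- fraction of feature entries that are zero (0.99 = very sparse)
    signal_strength -- scales the effect of the causal features on the log-odds
    signal_type     -- "linear": sum of the causal features;
                       "nonlinear": sum of products of adjacent causal pairs
    n_causal        -- the first n_causal features carry signal (an assumption)
    """
    rng = random.Random(seed)
    p_one = 1.0 - sparsity
    pairs = [(j, j + 1) for j in range(0, n_causal - 1, 2)]
    X, y = [], []
    for _ in range(n_samples):
        row = [1 if rng.random() < p_one else 0 for _ in range(n_features)]
        if signal_type == "linear":
            score = sum(row[:n_causal])
            expected = n_causal * p_one
        elif signal_type == "nonlinear":
            score = sum(row[a] * row[b] for a, b in pairs)
            expected = len(pairs) * p_one ** 2
        else:
            raise ValueError("signal_type must be 'linear' or 'nonlinear'")
        # Center the score on its expectation so the classes stay roughly balanced.
        logit = signal_strength * (score - expected)
        prob = 1.0 / (1.0 + math.exp(-logit))
        X.append(row)
        y.append(1 if rng.random() < prob else 0)
    return X, y


if __name__ == "__main__":
    # Sweep sparsity at a fixed sample size and signal strength.
    for s in (0.5, 0.9, 0.99):
        X, y = simulate_dataset(200, 50, s, 2.0)
        carriers = sum(1 for row in X if any(row[:4]))
        print(f"sparsity={s}: {carriers} of {len(X)} samples carry a causal feature")
```

Under these assumptions, raising `sparsity` leaves fewer samples with any nonzero causal feature, which gives one intuition for why extreme sparsity could limit what a model can learn at a fixed sample size.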