Bioinformatics and machine learning integration reveals a novel 4-gene (GFUS, ARHGAP8, NBL1, and ACTB) biomarker model for prostate cancer.
- Sensitivity 94.02%
- Specificity 95.80%
APA
Kilicarslan S, Cicekliyurt MM, et al. (2026). Bioinformatics and machine learning integration reveals a novel 4-gene (GFUS, ARHGAP8, NBL1, and ACTB) biomarker model for prostate cancer. Discover Oncology, 17(1). https://doi.org/10.1007/s12672-025-04369-z
PMID
41721923
Abstract
[BACKGROUND] Prostate cancer remains a significant health burden worldwide, largely because of its genetic heterogeneity and the low specificity of known biomarkers. Precise molecular signatures are needed to increase early detection, improve prognosis, and enable personalized treatment.
[METHODS] We combined four GEO microarray datasets (GSE3325, GSE6919, GSE55945, and GSE26910; n = 179) and applied the same preprocessing steps to each: background correction, log2 transformation, and quantile normalization. Gene identifiers across the different probes were harmonized to HGNC symbols. We then used limma to detect differentially expressed genes (DEGs), with diagnosis and batch as covariates, and retained DEGs with |log2FC| > 1 and BH-FDR ≤ 0.05. Next, we propose a novel Graph-Convolutional Feature Selection (GCFS) framework that ranks genes by combining expression data with network topology. Performance was evaluated with Hybrid Random Forest and LightGBM classifiers, and independent validation was performed on the GSE46602 dataset (n = 50).
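As an illustration of the preprocessing and filtering steps described in the methods, the following is a minimal pure-Python sketch of quantile normalization, Benjamini-Hochberg (BH) adjustment, and the |log2FC| > 1 and FDR ≤ 0.05 filter. It is not the authors' pipeline: limma is an R/Bioconductor package, and its moderated statistics and batch handling are not reproduced here. The function names (`quantile_normalize`, `bh_adjust`, `select_degs`) are hypothetical, and the sketch assumes per-gene log2 fold changes and raw p-values have already been computed elsewhere.

```python
def quantile_normalize(samples):
    """Quantile-normalize samples (each a list of per-gene values, equal length).

    Simplification: ties are broken by order of appearance rather than averaged.
    """
    n_genes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean across samples at each rank.
    rank_means = [
        sum(s[i] for s in sorted_samples) / len(samples) for i in range(n_genes)
    ]
    normalized = []
    for s in samples:
        order = sorted(range(n_genes), key=lambda i: s[i])
        out = [0.0] * n_genes
        for rank, gene_idx in enumerate(order):
            out[gene_idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2fc, pvalues, fc_cutoff=1.0, fdr_cutoff=0.05):
    """Keep genes with |log2FC| > fc_cutoff and BH-adjusted p <= fdr_cutoff."""
    fdr = bh_adjust(pvalues)
    return [
        g for g, fc, q in zip(genes, log2fc, fdr)
        if abs(fc) > fc_cutoff and q <= fdr_cutoff
    ]
```

In practice, a library routine (for example limma's own `topTable` in R) would handle the adjustment and filtering; the sketch only makes the thresholds concrete.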
[RESULTS] We identified a promising four-gene signature comprising GFUS, ARHGAP8, NBL1, and ACTB, genes implicated in crucial cancer pathways such as PI3K–Akt, JAK–STAT, and NF-κB. In the discovery set, the Hybrid model performed best, with an AUC of 0.9612, an accuracy of 95.37%, a sensitivity of 94.02%, and a specificity of 95.80%. The other models also performed well, with AUCs of 0.9257 (C5), 0.9098 (AdaBoost), 0.8926 (SVM), 0.9519 (RF), and 0.9578 (LightGBM), supporting the reliability of the identified genes. The results were then validated in the GSE46602 dataset, where the four-gene panel showed good diagnostic capability: the Hybrid model reached an AUC of 0.90 and an accuracy above 91%. Other models, including SVM and AdaBoost, performed similarly, supporting the generalizability of the biomarker panel.
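For readers less familiar with the reported metrics, the sketch below shows how accuracy, sensitivity, specificity, and AUC are conventionally computed from binary labels and classifier scores. The function names and the toy labels and scores in the usage note are hypothetical and unrelated to the study's data; this is not the authors' evaluation code.

```python
def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels (1 = tumor)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

A toy call such as `diagnostic_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])` shows how one missed tumor and one false alarm each lower sensitivity and specificity independently, which is why the abstract reports both alongside accuracy.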
[CONCLUSIONS] This study presents a reproducible, network-integrated, machine learning-based biomarker discovery framework for prostate cancer. The four-gene panel was consistently predictive in both the discovery and validation datasets, which highlights its potential as a clinically useful diagnostic and prognostic tool. To the best of our knowledge, the combination of the novel GCFS with ensemble learning based on RF and LightGBM has not previously been reported in any prostate cancer investigation.