
Predicting clinical outcomes in Helicobacter pylori-positive patients using supervised learning through the integration of demographic and genomic features.

BMC Gastroenterology, 2026, Vol. 26(1), p. 143

Narasimhan V, Pulakkat Warrier S, John JJ, T MP, Varadaraj N, Thomas GG, Veeraraghavan B


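Key clinical statistic (auto-extracted from the abstract; verify against the original)
  • Logistic regression recall for gastric cancer: 0.737 (95% CI 0.637–0.830)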

Cite this paper

APA: Narasimhan V, Pulakkat Warrier S, et al. (2026). Predicting clinical outcomes in Helicobacter pylori-positive patients using supervised learning through the integration of demographic and genomic features. BMC Gastroenterology, 26(1), 143. https://doi.org/10.1186/s12876-025-04595-3
MLA: Narasimhan V, et al. "Predicting clinical outcomes in Helicobacter pylori-positive patients using supervised learning through the integration of demographic and genomic features." BMC Gastroenterology, vol. 26, no. 1, 2026, p. 143.
PMID: 41606475

Abstract

[BACKGROUND] Helicobacter pylori infection is widespread globally and is linked to outcomes ranging from chronic gastritis to gastric cancer. However, only a minority of infected individuals progress to malignancy, and progression is influenced by a mix of bacterial, host, and environmental factors. Current predictive approaches are limited because they rely mainly on clinical and lifestyle data. Genomic approaches have been used only sparsely, and incorporating them into machine learning models could support earlier and more personalized detection. This study aimed to evaluate the impact of integrating host metadata with genomic features from H. pylori to predict gastric cancer outcomes and identify associated variables.

[METHODS] A total of 1,363 publicly available H. pylori genomes with associated host information, dated between 1991 and 2024, were collected from NCBI and EnteroBase. Demographic features, virulence genes, and sequence-derived and variant-based features were extracted. Machine learning models were then developed to classify infection outcomes as gastric cancer or non-gastric cancer. The models were trained using internal cross-validation folds within the training set, which comprised 80% of the dataset. Logistic regression, an interpretable baseline model, was compared against higher-performance ensemble models (XGBoost, Random Forest). Final model performance was assessed on the held-out test set using recall, precision, AUROC, and AUPRC.
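The split described above can be sketched in plain Python. This is only an illustration under stated assumptions: the abstract does not give the number of folds, the random seed, or whether the split was stratified by outcome, so `n_folds=5`, `seed=0`, the unstratified shuffle, and the helper name `split_and_fold` are all hypothetical.

```python
import random


def split_and_fold(n_samples, train_fraction=0.8, n_folds=5, seed=0):
    """Return (train_idx, test_idx, folds) for a hold-out split plus k-fold CV.

    n_folds and seed are assumptions; the abstract states only that 80% of
    the dataset formed the training set and that folds were internal to it.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible, unstratified shuffle

    n_train = int(n_samples * train_fraction)
    train_idx, test_idx = indices[:n_train], indices[n_train:]

    # Deal the training indices round-robin into n_folds validation folds.
    # The held-out test indices never appear in any fold.
    folds = [train_idx[i::n_folds] for i in range(n_folds)]
    return train_idx, test_idx, folds


# 1,363 genomes, as reported in the abstract.
train_idx, test_idx, folds = split_and_fold(1363)
```

Each fold in turn would serve as the validation set while the remaining folds are used for fitting; the test indices would be touched only once, for the final evaluation.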

[RESULTS] The logistic regression model achieved a recall of 0.737 (95% CI: 0.637–0.830) for gastric cancer and an AUROC of 0.830 (95% CI: 0.779–0.880). Both XGBoost and Random Forest models outperformed the baseline model, with AUROC values ranging from 0.950 to 0.954 (95% CI: 0.904–0.976). Recall for gastric cancer detection in these black-box models improved relative to the baseline by 8.14% for XGBoost (0.797, 95% CI: 0.711–0.877) and 11.3% for Random Forest (0.820, 95% CI: 0.734–0.896). Across models, patient age consistently emerged as the strongest predictor of gastric cancer, and several sequence-derived genomic features beyond pre-established virulence genes contributed to the differences in infection outcome.
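The reported recall gains appear consistent with relative improvements over the logistic regression baseline, not absolute percentage-point differences. A minimal check using only the figures quoted above (the function name `relative_improvement` is illustrative):

```python
def relative_improvement(baseline, candidate):
    """Percentage change of candidate relative to baseline."""
    return (candidate - baseline) / baseline * 100.0


baseline_recall = 0.737  # logistic regression recall from the abstract

xgb_gain = relative_improvement(baseline_recall, 0.797)  # XGBoost recall
rf_gain = relative_improvement(baseline_recall, 0.820)   # Random Forest recall

print(f"XGBoost: {xgb_gain:.2f}%")
print(f"Random Forest: {rf_gain:.1f}%")
```

In absolute terms, the same gains are 6.0 and 8.3 percentage points of recall, respectively.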

[CONCLUSION] This study demonstrates that combining pathogen genomics with host demographics uncovers novel risk factors and supports early detection with high predictive power. The use of explainability methods such as SHAP makes the models more interpretable for clinical professionals and supports informed decision-making. While internal validation showed strong performance, external validation on independent data and translation into clinical practice are still necessary, using broader, more diverse datasets and additional host and lifestyle variables.

[SUPPLEMENTARY INFORMATION] The online version contains supplementary material available at 10.1186/s12876-025-04595-3.
