
Development and validation of an interpretable machine learning model for non-invasive screening of precancerous gastric lesions using symptom and lifestyle data: a multicentre cohort study.

EClinicalMedicine, 2026, Vol. 92, p. 103756

PICO auto-extraction (heuristic, conf 2/4)

P · Population (patients / study population)
1034 participants recruited at two hospitals between Nov 16, 2022, and Apr 7, 2023.
I · Intervention (procedure)
Not extracted
C · Comparison (control)
Not extracted
O · Outcome (result / conclusion)
Primary outcome: presence of PLGC. The model achieved AUCs of 0.82 (95% CI: 0.77-0.87) in the internal hold-out test set, 0.80 (95% CI: 0.78-0.82) in the external retrospective validation set, and 0.79 (95% CI: 0.77-0.81) in the prospective validation set.

Wang L, Tang K, Zhang P, Liu J, Wu B, Chen J, Li Y, Du S, Wang Y, Li S


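🔬 Key clinical statistics (auto-extracted from the abstract; verification against the original is recommended)
  • Sample size (n): 620 (training set; 2511 participants in total)
  • p-value: p < 0.001 (model vs. guideline-based strategies)
  • 95% CI: 0.77-0.87 (AUC 0.82, internal hold-out test set)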

Cite this paper

APA Wang L, Tang K, et al. (2026). Development and validation of an interpretable machine learning model for non-invasive screening of precancerous gastric lesions using symptom and lifestyle data: a multicentre cohort study. EClinicalMedicine, 92, 103756. https://doi.org/10.1016/j.eclinm.2026.103756
MLA Wang L, et al. "Development and validation of an interpretable machine learning model for non-invasive screening of precancerous gastric lesions using symptom and lifestyle data: a multicentre cohort study." EClinicalMedicine, vol. 92, 2026, p. 103756.
PMID 41623856

Abstract

[BACKGROUND] Precancerous gastric lesions (PLGC) are a critical stage in gastric cancer progression, where timely intervention can substantially reduce mortality. However, current screening strategies are predominantly endoscopic, which are invasive, costly, and often inaccessible in resource-limited settings. We aimed to develop and validate an interpretable machine learning model for non-invasive PLGC screening using symptom and lifestyle data.

[METHODS] In this multicentre study, we enrolled eligible adult participants undergoing or scheduled to undergo upper gastrointestinal endoscopy with no prior diagnosis of malignancy. The development cohort comprised 1034 participants recruited at two hospitals between Nov 16, 2022, and Apr 7, 2023. Symptom and lifestyle data from this cohort were used to construct the development dataset, which was randomly split into a training set (n = 620), an internal validation set (n = 207), and a hold-out test set (n = 207). External performance was assessed in a retrospective hospital-based cohort from four additional hospitals (n = 630; May 21, 2018, to Jul 30, 2023) and a prospective community-based cohort from 32 screening sites (n = 847; Jun 21, 2023, to Nov 7, 2023). We developed a stacking ensemble model to predict the primary outcome (presence of PLGC) by integrating seven base learners (Gaussian Naïve Bayes, Logistic Regression, K-Nearest Neighbours, Gradient Boosting Classifier, eXtreme Gradient Boosting, Random Forest, and Adaptive Boosting) and applied Shapley Additive Explanations (SHAP) for clinical interpretability. Model performance was compared with two guideline-based screening strategies, using the area under the receiver operating characteristic curve (AUC; 95% CI), sensitivity, specificity, positive predictive value, and negative predictive value.
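
The split sizes and evaluation metrics named in the methods can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' pipeline: the helper names (`split_indices`, `auc`, `screening_metrics`), the random seed, the 0.5 threshold, and the toy labels and scores are all assumptions made for illustration, and no study data are used. The AUC here is computed as the fraction of positive/negative pairs ranked correctly, which is one standard way to obtain it.

```python
import random

def split_indices(n, sizes, seed=0):
    """Randomly partition range(n) into disjoint chunks of the given sizes."""
    assert sum(sizes) == n, "chunk sizes must add up to n"
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seed is an arbitrary, hypothetical choice
    parts, start = [], 0
    for size in sizes:
        parts.append(idx[start:start + size])
        start += size
    return parts

def auc(labels, scores):
    """AUC as P(random positive scores higher than random negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def screening_metrics(labels, scores, threshold):
    """Sensitivity, specificity, PPV and NPV at a fixed score threshold."""
    pred = [int(s >= threshold) for s in scores]
    tp = sum(1 for y, p in zip(labels, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, pred) if y == 0 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Development cohort sizes as reported: 1034 = 620 + 207 + 207.
train, val, test = split_indices(1034, [620, 207, 207])

# Toy, made-up labels (1 = PLGC present) and model scores; NOT study data.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
toy_auc = auc(toy_labels, toy_scores)
toy_metrics = screening_metrics(toy_labels, toy_scores, threshold=0.5)
print(len(train), len(val), len(test), toy_auc, toy_metrics)
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sketch at this cohort size; a rank-based formulation would be the usual choice for larger data.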

[FINDINGS] In total, 2511 participants (male: n = 871, 34.7%; female: n = 1640, 65.3%) were included. The primary outcome, PLGC, was present in 509 of 1034 participants (49.2%) in the development cohort, in 331 of 630 participants (52.5%) in the retrospective validation cohort, and in 312 of 847 participants (36.8%) in the prospective validation cohort. The model showed robust performance for non-invasive PLGC screening, with AUCs of 0.82 (95% CI: 0.77-0.87) in the internal hold-out test set, 0.80 (95% CI: 0.78-0.82) in the external retrospective validation set, and 0.79 (95% CI: 0.77-0.81) in the prospective validation set. With AUC improvements of 0.18-0.35, our model exceeded both guideline-based strategies across all datasets (internal hold-out test: 0.82 (95% CI: 0.77-0.87) vs. 0.47 (95% CI: 0.42-0.53)/0.48 (95% CI: 0.42-0.53); external retrospective validation: 0.80 (95% CI: 0.78-0.82) vs. 0.62 (95% CI: 0.60-0.64)/0.58 (95% CI: 0.55-0.60); prospective validation: 0.79 (95% CI: 0.77-0.81) vs. 0.57 (95% CI: 0.54-0.59)/0.52 (95% CI: 0.50-0.55); all p < 0.001). In a cost-effectiveness analysis, this translated into a 37.1% reduction in the average cost per detected PLGC case versus guideline-based tools. SHAP analysis further identified 15 key predictors, including infection, age, and melaena.
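
The counts and AUC differences reported in the findings can be cross-checked with a few lines of arithmetic. The sketch below uses only figures quoted in the abstract; the variable names and the dictionary layout are illustrative assumptions, and the check says nothing about how the authors computed their values.

```python
# (PLGC cases, cohort size) per cohort, as quoted in the abstract.
cohorts = {
    "development": (509, 1034),
    "retrospective validation": (331, 630),
    "prospective validation": (312, 847),
}

total_participants = sum(n for _, n in cohorts.values())
sex_total = 871 + 1640  # male + female counts from the abstract

# PLGC prevalence per cohort, in percent, rounded to one decimal place.
prevalence = {
    name: round(100 * cases / n, 1) for name, (cases, n) in cohorts.items()
}

# (model AUC, guideline strategy 1 AUC, guideline strategy 2 AUC) per dataset.
aucs = {
    "internal hold-out test": (0.82, 0.47, 0.48),
    "external retrospective validation": (0.80, 0.62, 0.58),
    "prospective validation": (0.79, 0.57, 0.52),
}

# AUC gain of the model over each guideline strategy in each dataset.
gains = [
    round(model - guideline, 2)
    for model, *guidelines in aucs.values()
    for guideline in guidelines
]

print(total_participants, sex_total, prevalence, min(gains), max(gains))
```

Under these inputs the totals agree with the 2511 participants reported, the prevalences match the stated 49.2%, 52.5%, and 36.8%, and the gains span the stated 0.18-0.35 range.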

[INTERPRETATION] An interpretable machine learning model integrating symptom and lifestyle information, some of which was implicated by traditional medicine, outperformed guideline-based screening strategies for non-invasive PLGC screening in both hospital-based and community-based populations. However, generalisability may be limited by the cohorts' age and regional distribution; further studies should incorporate more non-invasive metrics to optimise the screening model and pursue broader external validation and real-world implementation.

[FUNDING] National Natural Science Foundation of China. Innovation Team and Talents Cultivation Program of the National Administration of Traditional Chinese Medicine. Fundamental and Interdisciplinary Disciplines Breakthrough Plan of the Ministry of Education of China.
