Enhancing Large Language Models With AI Agents for Chronic Gastritis Management: Comprehensive Comparative Study.
APA
Wang S, Ye Q (2025). Enhancing Large Language Models With AI Agents for Chronic Gastritis Management: Comprehensive Comparative Study. JMIR Medical Informatics, 13, e73857. https://doi.org/10.2196/73857
MLA
Wang S, et al. "Enhancing Large Language Models With AI Agents for Chronic Gastritis Management: Comprehensive Comparative Study." JMIR Medical Informatics, vol. 13, 2025, pp. e73857.
PMID
41236204
DOI
10.2196/73857
Abstract
[BACKGROUND] Chronic gastritis is highly prevalent and, without timely intervention, may eventually progress to gastric cancer. Managing it requires comprehensive lifestyle changes. However, the current health care environment does not support continuous follow-up by professional health care providers, which makes self-management a key component of postdiagnosis care. Researchers are increasingly exploring large language models (LLMs) for patient management, but LLMs have limitations, including hallucinations, limited knowledge scope, and lack of timeliness. Artificial intelligence (AI) agents may provide a more effective solution. It remains uncertain, however, whether AI agents can effectively support postdiagnosis self-management for patients with chronic gastritis.
[OBJECTIVE] The purpose of this study was to explore the effectiveness of AI agents in the postdiagnosis management of patients with chronic gastritis from different perspectives.
[METHODS] In this study, we developed an agent framework for the health management of patients with chronic gastritis based on LLMs in conjunction with retrieval-augmented generation and a search engine tool. We collected real questions from patients with chronic gastritis in clinical settings and tested the framework's performance across different difficulty levels and scenarios. We analyzed its safety and robustness and compared it with state-of-the-art models to comprehensively evaluate its effectiveness.
[RESULTS] Using a dual-evaluation framework comprising automated metrics and expert manual assessments, our results demonstrated that AI agents substantially outperformed LLMs in addressing high-complexity questions (embedding average score: 82.849 for AI agents vs 77.825 for LLMs) and were particularly effective in clinical consultation tasks. Physicians rated the safety of the agents' responses on a 5-point Likert scale, with a mean rating of 4.98 (SD 0.15; 95% CI 4.96-4.99). Across 30 repeated experiments, the mean absolute deviations of the AI agents' embedding average score and BERTScore were 0.0167 and 0.0387, respectively. The safety and robustness analysis therefore confirmed that the AI agents can produce safe, stable, and minimally variable responses. In addition, comparisons with advanced medical-domain LLMs (Baichuan-14B-M1 and MedGemma-27B) and a general-domain LLM (Qwen3-32B) showed that the AI agents in this study performed strongly on chronic gastritis questions. Our findings underscore the superior reliability, interpretability, and practical applicability of AI agents over conventional LLMs in chronic gastritis management, offering a robust foundation for their broader adoption in health care settings.
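The abstract reports an "embedding average score" without defining it. As a minimal sketch only, the block below assumes the common formulation: average the word vectors of the answer and of the reference, take the cosine similarity of the two averages, and scale by 100. The scaling, the whitespace tokenization, and the tiny `TOY_VECTORS` table are illustrative assumptions, not the authors' implementation.

```python
import math

# Hypothetical 2-dimensional word vectors, for illustration only.
TOY_VECTORS = {
    "avoid": [1.0, 0.0],
    "spicy": [0.0, 1.0],
    "food": [1.0, 1.0],
}


def average_embedding(tokens, vectors):
    """Average the vectors of known tokens; unknown tokens are skipped."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token has an embedding")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def embedding_average_score(candidate, reference, vectors):
    """Cosine similarity of averaged word vectors, scaled to 0-100 (assumed scale)."""
    cand = average_embedding(candidate.lower().split(), vectors)
    ref = average_embedding(reference.lower().split(), vectors)
    return 100.0 * cosine(cand, ref)
```

Under this assumed definition, an answer identical to its reference scores 100, and the score drops as the averaged vectors diverge.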
[CONCLUSIONS] AI agents based on LLMs have high application value in the management of chronic gastritis. They can effectively guide patients with chronic diseases in addressing common issues, which may reduce the workload of physicians and improve the quality of patient home care.
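The robustness figures in the results are mean absolute deviations over 30 repeated experiments. The sketch below shows one way such a value could be computed, assuming the deviation is taken around the mean (the abstract does not say whether the mean or the median was used). The function name and the sample scores are hypothetical.

```python
from statistics import fmean


def mean_absolute_deviation(scores):
    """Mean of absolute deviations from the arithmetic mean (assumed center)."""
    if not scores:
        raise ValueError("scores must not be empty")
    center = fmean(scores)
    return fmean([abs(s - center) for s in scores])


# Hypothetical metric values from repeated runs of the same question set.
run_scores = [82.83, 82.85, 82.86, 82.84]
print(f"mean absolute deviation: {mean_absolute_deviation(run_scores):.4f}")
```

A value close to zero means the metric barely moves between runs, which is the sense in which the abstract describes the responses as stable.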