Automated Literature Screening for Hepatocellular Carcinoma Treatment Through Integration of 3 Large Language Models: Methodological Study.
APA
Pan C, Lu W, et al. (2025). Automated Literature Screening for Hepatocellular Carcinoma Treatment Through Integration of 3 Large Language Models: Methodological Study. JMIR Medical Informatics, 13, e76252. https://doi.org/10.2196/76252
MLA
Pan C, et al. "Automated Literature Screening for Hepatocellular Carcinoma Treatment Through Integration of 3 Large Language Models: Methodological Study." JMIR Medical Informatics, vol. 13, 2025, e76252.
PMID
40921065
DOI
10.2196/76252
Abstract
[BACKGROUND] Primary liver cancer, particularly hepatocellular carcinoma (HCC), poses significant clinical challenges due to late-stage diagnosis, tumor heterogeneity, and rapidly evolving therapeutic strategies. While systematic reviews and meta-analyses are essential for updating clinical guidelines, their labor-intensive nature limits timely evidence synthesis.
[OBJECTIVE] This study proposes an automated literature screening workflow powered by large language models (LLMs) to accelerate evidence synthesis for HCC treatment guidelines.
[METHODS] We developed a tripartite LLM framework integrating Doubao-1.5-pro-32k, Deepseek-v3, and DeepSeek-R1-Distill-Qwen-7B to simulate collaborative decision-making for study inclusion and exclusion. The system was evaluated across 9 reconstructed datasets derived from published HCC meta-analyses, with performance assessed using accuracy, agreement metrics (κ and prevalence-adjusted bias-adjusted κ), recall, precision, F-scores, and computational efficiency parameters (processing time and cost).
[RESULTS] The framework demonstrated good performance, with a weighted accuracy of 0.96 and substantial agreement (prevalence-adjusted bias-adjusted κ=0.91), achieving high weighted recall (0.90) but modest weighted precision (0.15) and F-scores (0.22). Computational efficiency varied across datasets (processing time: 248-5850 s; cost: US $0.14-$3.68 per dataset).
[CONCLUSIONS] This LLM-driven approach shows promise for accelerating evidence synthesis in HCC care by reducing screening time while maintaining methodological rigor. Key limitations related to clinical context sensitivity and error propagation highlight the need for reinforcement learning integration and domain-specific fine-tuning. LLM agent architectures with reinforcement learning offer a practical path for streamlining guideline updates, though further optimization is needed to improve specialization and reliability in complex clinical settings.
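The metrics named in the methods can all be derived from a single 2×2 confusion matrix of screening decisions (model include/exclude versus the reference meta-analysis). The sketch below is illustrative only: the counts are hypothetical and are not taken from the study, the names `Confusion` and `screening_metrics` are invented for this example, and the study's weighting across its 9 datasets is not reproduced. It uses the standard two-category definitions, where prevalence-adjusted bias-adjusted κ equals 2 × observed agreement − 1. It also shows how high accuracy and recall can coexist with low precision when truly eligible studies are rare.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 screening outcome counts (model decision vs. reference standard)."""
    tp: int  # eligible studies the model included
    fp: int  # ineligible studies the model included
    fn: int  # eligible studies the model excluded
    tn: int  # ineligible studies the model excluded


def screening_metrics(c: Confusion) -> dict[str, float]:
    n = c.tp + c.fp + c.fn + c.tn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (c.tp + c.tn) / n  # observed agreement, p_o
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: agreement corrected for chance agreement p_e,
    # computed from the marginal include/exclude rates of both raters.
    p_include = ((c.tp + c.fp) / n) * ((c.tp + c.fn) / n)
    p_exclude = ((c.fn + c.tn) / n) * ((c.fp + c.tn) / n)
    p_e = p_include + p_exclude
    kappa = (accuracy - p_e) / (1 - p_e) if p_e < 1 else 0.0

    # PABAK for two categories depends only on observed agreement.
    pabak = 2 * accuracy - 1

    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "kappa": kappa,
        "pabak": pabak,
    }


if __name__ == "__main__":
    # Hypothetical dataset: 1000 screened records, only 10 truly eligible.
    example = Confusion(tp=9, fp=40, fn=1, tn=950)
    for name, value in screening_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

With these made-up counts, 40 false inclusions among 990 ineligible records barely lower accuracy or PABAK, yet they outnumber the 9 true inclusions, so precision and plain κ drop sharply. This is the same qualitative pattern as the reported results, though the figures here are not the study's.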