Clinical utility of large language models in metastatic prostate cancer: A multicenter expert validation for decision support.

European Journal of Cancer, 2026, Vol. 238, p. 116667
TL;DR Current LLMs can support mPCa care as drafting assistants, but ∼15%-21% of outputs breached safety thresholds under a strict gate, precluding unsupervised use at initial treatment-planning encounters.

Chen Y, He K, Qu W, Ma L, Su X, Liu X, Kang Z, Li F, Yan J, Wang Z, Li J, Li Z


Cite this paper

APA Yiqun Chen, Kai He, et al. (2026). Clinical utility of large language models in metastatic prostate cancer: A multicenter expert validation for decision support. European Journal of Cancer, 238, 116667. https://doi.org/10.1016/j.ejca.2026.116667
MLA Yiqun Chen, et al. "Clinical utility of large language models in metastatic prostate cancer: A multicenter expert validation for decision support." European Journal of Cancer, vol. 238, 2026, p. 116667.
PMID: 41831267

Abstract

[BACKGROUND] Initial systemic treatment planning for metastatic prostate cancer (mPCa) requires rapid synthesis of heterogeneous clinical and biomarker information. Large language models (LLMs) could assist clinicians, but their safety and acceptability in this high-stakes setting remain uncertain.

[METHODS] We conducted a multicenter retrospective evaluation of 238 consecutive mPCa cases from three tertiary centers (2018-2025). Five contemporary LLMs were tested via publicly available web interfaces under a locked, zero-shot prompting protocol to generate a clinical summary, a first-line systemic treatment recommendation, and a rationale. Outputs underwent two-stage assessment: (1) multidisciplinary team (MDT) binary safety adjudication using a one-strike gate with a prespecified taxonomy of critical errors; unsafe outputs were assigned a Likert score of 1 for all domains; (2) three senior medical oncologists independently rated safety-passed outputs on 5-point Likert scales for summary accuracy, guideline-concordant and patient-tailored recommendations, and rationale quality. Paired ordinal outcomes were analyzed with Friedman tests and Holm-adjusted post hoc comparisons, and binary safety outcomes with Cochran's Q and McNemar tests.

[RESULTS] Safety rates ranged from 79.0% to 84.9%. Among safety-passed outputs, mean utility scores (5-point Likert) were in the low-to-mid 4 range. Between-model differences were most apparent for summarization, whereas treatment recommendations and rationales showed modest separation after multiplicity adjustment. Failures clustered in hard cases with incomplete documentation and were dominated by missingness-related extraction errors, disease-state/pathway errors, guideline logic deviations, and safety-check omissions.

[CONCLUSIONS] Current LLMs can support mPCa care as drafting assistants, but ∼15%-21% of outputs breached safety thresholds under a strict gate, precluding unsupervised use at initial treatment-planning encounters.
