[The application of large language models in the diagnosis of clinically significant prostate cancer].

Zhonghua wai ke za zhi [Chinese journal of surgery], 2026, Vol. 64(2), p. 182-190

PICO automatic extraction (heuristic, conf 2/4)

P · Population
1,077 patients who underwent ultrasound-guided systematic prostate biopsy at the Department of Urology, Peking University Third Hospital, from January 2018 to December 2024; median (IQR) age 69 (13) years (range: 38 to 90 years), including 391 patients in the gray zone (prostate-specific antigen 4 to 10 μg/L).
I · Intervention
Not extracted
C · Comparison
Not extracted
O · Outcome
Four LLMs (GPT 4.1, DeepSeek R1, Qwen3-235B-A22B, Qwen3-32B) were used to diagnose csPCa based on patient information, and their performance was evaluated using biopsy histopathological results as the gold standard.

Qiu L, Ni QY, Li ZA, Lin XS, Zhao ZK, Wu JL


Cite this paper

APA Qiu L, Ni QY, et al. (2026). [The application of large language models in the diagnosis of clinically significant prostate cancer]. Zhonghua wai ke za zhi [Chinese journal of surgery], 64(2), 182-190. https://doi.org/10.3760/cma.j.cn112139-20250814-00402
MLA Qiu L, et al. "[The application of large language models in the diagnosis of clinically significant prostate cancer]." Zhonghua wai ke za zhi [Chinese journal of surgery], vol. 64, no. 2, 2026, pp. 182-190.
PMID 41667933

Abstract

To explore the performance of large language models (LLMs) in diagnosing clinically significant prostate cancer (csPCa), and the improvement in diagnostic performance of an open-source LLM after low-rank adaptation (LoRA) fine-tuning. This is a retrospective case series study. Data were collected from 1,077 patients who underwent ultrasound-guided systematic prostate biopsy at the Department of Urology, Peking University Third Hospital, from January 2018 to December 2024. The median (IQR) age was 69 (13) years (range: 38 to 90 years), and 391 patients were in the gray zone (prostate-specific antigen 4 to 10 μg/L). The collected data included patients' clinical characteristics, prostate MRI reports, and biopsy histopathological results. Four LLMs (GPT 4.1, DeepSeek R1, Qwen3-235B-A22B, Qwen3-32B) were used to diagnose csPCa based on patient information, and their performance was evaluated using biopsy histopathological results as the gold standard. Subsequently, the data from the 1,077 patients were divided into training and test sets at an 8:2 ratio, and LoRA fine-tuning was performed on Qwen3-32B. The fine-tuned model was named PCD-Qwen3, and its diagnostic efficacy in the test set was evaluated. Receiver operating characteristic curves were plotted, and the area under the curve (AUC) and its 95% confidence interval (CI) were calculated to evaluate the diagnostic performance of each LLM. The DeLong test was used to compare AUCs between groups. Among all patients, DeepSeek R1 had the highest AUC for diagnosing csPCa at 0.848 (95% CI: 0.826 to 0.871), with statistically significant differences compared to Qwen3-235B-A22B (0.827, 95% CI: 0.803 to 0.851; Z=2.34, P=0.020) and Qwen3-32B (0.753, 95% CI: 0.724 to 0.781; Z=7.35, P<0.01), but no significant difference compared to GPT 4.1 (0.842, 95% CI: 0.819 to 0.865; P>0.05). The accuracy, sensitivity, and specificity of DeepSeek R1 for diagnosing csPCa were 77.3%, 70.2%, and 84.1%, respectively.
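The metrics reported above (AUC, accuracy, sensitivity, specificity) can be illustrated with a minimal Python sketch. The abstract does not say which software the authors used, so the function names `auc_mann_whitney` and `binary_metrics` are hypothetical, and the toy labels and scores are invented for illustration only; they are not study data. The AUC is computed here in its rank-based (Mann-Whitney) form, which is equivalent to the area under the empirical ROC curve.

```python
from typing import Dict, Sequence


def auc_mann_whitney(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def binary_metrics(labels: Sequence[int], preds: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from binary predictions."""
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy example (invented values): 1 = csPCa on biopsy, 0 = no csPCa.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.2, 0.7, 0.6]
# Assumed decision threshold of 0.5, chosen only for this illustration.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

toy_auc = auc_mann_whitney(toy_labels, toy_scores)
toy_metrics = binary_metrics(toy_labels, toy_preds)
```

Comparing two correlated AUCs, as the DeLong test does, additionally requires the covariance of the two estimates on the same patients and is not shown in this sketch.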
In the gray zone population (total prostate-specific antigen 4 to 10 μg/L), DeepSeek R1 had an AUC of 0.765 (95% CI: 0.715 to 0.816) for diagnosing csPCa. Using DeepSeek R1 to diagnose gray zone patients could avoid unnecessary biopsies in 46.3% (181/391) of patients while missing csPCa in 5.9% (23/391). The PI-RADS scores assigned by the three LLMs other than Qwen3-32B showed moderate agreement with those of radiologists. After LoRA fine-tuning, the diagnostic performance of PCD-Qwen3 was significantly improved compared to Qwen3-32B. In the test set of 216 patients, its accuracy, sensitivity, specificity, and AUC were 77.3%, 75.5%, 79.1%, and 0.831 (95% CI: 0.776 to 0.885), respectively, comparable to the performance of DeepSeek R1 (all P>0.05). Among the four LLMs, DeepSeek R1 had the best performance in diagnosing csPCa. After LoRA fine-tuning, PCD-Qwen3 achieved performance comparable to DeepSeek R1. LLMs demonstrated promising application value in diagnosing csPCa.
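The arithmetic behind the gray zone percentages and the test set size can be checked with a short Python sketch. The counts (181, 23, 391, 1,077, 216) come from the abstract. The helper names `pct` and `split_indices`, the random seed, and the choice to round the test set size up are assumptions made for illustration; the abstract does not describe how the split was drawn.

```python
import math
import random
from typing import List, Tuple


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


# Counts reported in the abstract for the gray zone subgroup.
GRAY_ZONE_TOTAL = 391
BIOPSIES_AVOIDED = 181
CSPCA_MISSED = 23

avoided_pct = pct(BIOPSIES_AVOIDED, GRAY_ZONE_TOTAL)
missed_pct = pct(CSPCA_MISSED, GRAY_ZONE_TOTAL)


def split_indices(n: int, test_fraction: float = 0.2, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle patient indices and split them into (train, test).

    Assumption: the test set size is rounded up, which reproduces the
    216 test patients reported for 1,077 patients at an 8:2 ratio.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n * test_fraction)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(1077)
print(avoided_pct, missed_pct, len(train_idx), len(test_idx))
```

Both percentages use all 391 gray zone patients as the denominator, which matches the fractions printed in the abstract.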
