[The application of large language models in the diagnosis of clinically significant prostate cancer].
Case series
PICO automatic extraction (heuristic, conf 2/4)
P · Population (target patients / population)
1 077 patients who underwent ultrasound-guided systematic prostate biopsy at the Department of Urology, Peking University Third Hospital, from January 2018 to December 2024; median (IQR) age 69 (13) years (range: 38 to 90 years); 391 patients were in the gray zone (prostate-specific antigen 4 to 10 μg/L).
I · Intervention (intervention / procedure)
Not extracted
C · Comparison (control / comparison)
Not extracted
O · Outcome (results / conclusion)
Four LLMs (GPT 4.1, DeepSeek R1, Qwen3-235B-A22B, Qwen3-32B) were used to diagnose csPCa based on patient information, and their performance was evaluated using biopsy histopathological results as the gold standard.
APA
Qiu L, Ni QY, et al. (2026). [The application of large language models in the diagnosis of clinically significant prostate cancer]. Zhonghua wai ke za zhi [Chinese journal of surgery], 64(2), 182-190. https://doi.org/10.3760/cma.j.cn112139-20250814-00402
MLA
Qiu L, et al. "[The application of large language models in the diagnosis of clinically significant prostate cancer]." Zhonghua wai ke za zhi [Chinese journal of surgery], vol. 64, no. 2, 2026, pp. 182-190.
PMID
41667933
Abstract
To explore the performance of large language models (LLMs) in diagnosing clinically significant prostate cancer (csPCa), and the improvement in the diagnostic performance of an open-source LLM after low-rank adaptation (LoRA) fine-tuning. This is a retrospective case series study. Data were collected from 1 077 patients who underwent ultrasound-guided systematic prostate biopsy at the Department of Urology, Peking University Third Hospital, from January 2018 to December 2024. The median (IQR) age was 69 (13) years (range: 38 to 90 years), and 391 patients were in the gray zone (prostate-specific antigen 4 to 10 μg/L). The collected data included patients' clinical characteristics, prostate MRI reports, and biopsy histopathological results. Four LLMs (GPT 4.1, DeepSeek R1, Qwen3-235B-A22B, Qwen3-32B) were used to diagnose csPCa based on patient information, and their performance was evaluated using biopsy histopathological results as the gold standard. Subsequently, the data from the 1 077 patients were divided into training and test sets at an 8:2 ratio, and LoRA fine-tuning was performed on Qwen3-32B. The fine-tuned model was named PCD-Qwen3, and its diagnostic efficacy in the test set was evaluated. Receiver operating characteristic curves were plotted, and the area under the curve (AUC) with its 95% CI was calculated to evaluate the diagnostic performance of the LLMs. The DeLong test was used to compare AUCs between groups.
Among all patients, DeepSeek R1 had the highest AUC for diagnosing csPCa at 0.848 (95% CI: 0.826 to 0.871), significantly higher than that of Qwen3-235B-A22B (0.827; 95% CI: 0.803 to 0.851; Z=2.34, P=0.020) and Qwen3-32B (0.753; 95% CI: 0.724 to 0.781; Z=7.35, P<0.01), but not significantly different from that of GPT 4.1 (0.842; 95% CI: 0.819 to 0.865; P>0.05). The accuracy, sensitivity, and specificity of DeepSeek R1 for diagnosing csPCa were 77.3%, 70.2%, and 84.1%, respectively.
In the gray zone population (total prostate-specific antigen 4 to 10 μg/L), DeepSeek R1 had an AUC of 0.765 (95% CI: 0.715 to 0.816) for diagnosing csPCa. Using DeepSeek R1 to diagnose gray zone patients could avoid unnecessary biopsy in 46.3% (181/391) of patients while missing csPCa in 5.9% (23/391). Except for Qwen3-32B, the PI-RADS scores assigned by the other three LLMs achieved moderate agreement with those of radiologists. After LoRA fine-tuning, the diagnostic performance of PCD-Qwen3 was significantly improved compared to Qwen3-32B. In the test set of 216 patients, the accuracy, sensitivity, specificity, and AUC of PCD-Qwen3 were 77.3%, 75.5%, 79.1%, and 0.831 (95% CI: 0.776 to 0.885), respectively, comparable to the performance of DeepSeek R1 (all P>0.05). Among the four LLMs, DeepSeek R1 performed best in diagnosing csPCa. After LoRA fine-tuning, PCD-Qwen3 achieved performance comparable to DeepSeek R1. LLMs demonstrated promising application value in diagnosing csPCa.
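The evaluation metrics named in the abstract (accuracy, sensitivity, specificity, AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. The AUC here uses the rank-based (Mann-Whitney) formulation, and the final lines only re-derive the gray-zone percentages from the counts reported above (181/391 and 23/391).

```python
from typing import Dict, Sequence


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for binary labels (1 = csPCa)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc_rank(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (Mann-Whitney U divided by n_pos * n_neg).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT from the study: biopsy result and model score.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = confusion_metrics(y_true, y_pred)
auc = auc_rank(y_true, scores)

# Re-derive the gray-zone percentages from the counts given in the abstract.
avoided_pct = round(100 * 181 / 391, 1)  # biopsies avoided
missed_pct = round(100 * 23 / 391, 1)    # csPCa cases missed

print(metrics, auc, avoided_pct, missed_pct)
```

The confidence intervals and the DeLong comparison reported in the abstract are not reproduced here; they would need a dedicated statistical routine.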
Most-cited papers by the same first author (5)
- SETD2 inhibited T-cell acute lymphocytic leukemia invasion and infiltration by inhibiting the JAK/STAT pathway.
- Genomic proximity mapping: a promising next generation cytogenomic assay for comprehensive assessment of acute myeloid leukemia.
- Antiviral prophylaxis for hepatitis B virus reactivation in T-cell lymphoma patients with resolved hepatitis B virus infection.
- Predictive value of the pretreatment serum sialic acid/total protein ratio for bone metastases in newly diagnosed prostate cancer patients: development of a nomogram model.
- Effectiveness of nutrition support team-led care on perioperative outcomes in malnourished older adults with gastric cancer.
Same keywords · free full text (based on this paper's MeSH terms/keywords)
- A Phase I Study of Hydroxychloroquine and Suba-Itraconazole in Men with Biochemical Relapse of Prostate Cancer (HITMAN-PC): Dose Escalation Results.
- Self-management of male urinary symptoms: qualitative findings from a primary care trial.
- Clinical and Liquid Biomarkers of 20-Year Prostate Cancer Risk in Men Aged 45 to 70 Years.
- Diagnostic accuracy of Ga-PSMA PET/CT versus multiparametric MRI for preoperative pelvic invasion in the patients with prostate cancer.
- Clinical Presentation and Outcomes of Patients Undergoing Surgery for Thyroid Cancer.
- Association of patient health education with the postoperative health related quality of life in low-intermediate recurrence risk differentiated thyroid cancer patients.