GPT-4 vs. radiologists: who advances mediastinal tumor classification better across report quality levels? a cohort study.

Wen R; Li X; Chen K; Sun M; Zhu C; Xu P; Chen F; Ji C; Pei M; Li X; Deng X; Yang Q; Song W; Shang Y; Huang S; Zhou M; Wang J; Zhou C; Chen W; Liu C

doi:10.1097/JS9.0000000000003127

International journal of surgery (London, England), 2025, Vol. 111(12), p. 9000-9011

Free full text (PMC): PMC12695208

Cite this article

APA Wen R, Li X, et al. (2025). GPT-4 vs. radiologists: who advances mediastinal tumor classification better across report quality levels? a cohort study. International journal of surgery (London, England), 111(12), 9000-9011. https://doi.org/10.1097/JS9.0000000000003127

MLA Wen R, et al. "GPT-4 vs. radiologists: who advances mediastinal tumor classification better across report quality levels? a cohort study." International journal of surgery (London, England), vol. 111, no. 12, 2025, pp. 9000-9011.

PMID: 40788014

Abstract

[BACKGROUND] Accurate mediastinal tumor classification is crucial for treatment planning, but diagnostic performance varies with radiologists' experience and report quality.

[PURPOSE] To evaluate the diagnostic accuracy of Generative Pre-trained Transformer 4 (GPT-4) in classifying mediastinal tumors from radiological reports of varying quality, compared with radiologists of different experience levels.

[MATERIALS AND METHODS] We conducted a retrospective study of 1494 patients from five tertiary hospitals with mediastinal tumors diagnosed via chest CT and pathology. Radiological reports were categorized into low-, medium-, and high-quality based on predefined criteria assessed by experienced radiologists. Six radiologists (two residents, two attending radiologists, and two associate senior radiologists) and GPT-4 evaluated the chest CT reports. Diagnostic performance was analyzed overall, by report quality, and by tumor type using Wald χ² tests and 95% CIs calculated via the Wilson method.
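
The Wilson method named above can be illustrated with a minimal sketch. This is not the authors' analysis code. The abstract reports percentages only, so the count of correct classifications used here (1095 of 1494) is an assumption back-calculated from the reported 73.3% overall accuracy, and the function name `wilson_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, total: int, confidence: float = 0.95):
    """Return the (lower, upper) Wilson score interval for a binomial proportion."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")

    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / total

    # Wilson score interval: a shrunken center plus/minus an adjusted half-width.
    denom = 1 + z * z / total
    center = (p_hat + z * z / (2 * total)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width


# ASSUMPTION: 1095 correct classifications is back-calculated from the reported
# 73.3% accuracy over 1494 reports; the abstract does not state the raw count.
ASSUMED_CORRECT = 1095
TOTAL_REPORTS = 1494

low, high = wilson_interval(ASSUMED_CORRECT, TOTAL_REPORTS)
print(f"{ASSUMED_CORRECT / TOTAL_REPORTS:.1%} (95% CI: {low:.1%}-{high:.1%})")
```

Under that assumed count, the interval rounds to the 71.0-75.5 range reported for GPT-4's overall accuracy. Unlike the simpler Wald interval, the Wilson interval stays within 0 and 1 and behaves better when a proportion is near either boundary, which matters for the smaller per-tumor-type subgroups.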

[RESULTS] GPT-4 achieved an overall diagnostic accuracy of 73.3% (95% CI: 71.0-75.5), comparable to associate senior radiologists (74.3%, 95% CI: 72.0-76.5; P >0.05). For low-quality reports, GPT-4 outperformed associate senior radiologists (60.8% vs. 51.1%, P <0.001). In high-quality reports, GPT-4 was comparable to attending radiologists (80.6% vs. 79.4%, P >0.05). Diagnostic performance varied by tumor type: GPT-4 was comparable to radiology residents for neurogenic tumors (44.9% vs. 50.3%, P >0.05), similar to associate senior radiologists for teratomas (68.1% vs. 65.9%, P >0.05), and superior to associate senior radiologists in diagnosing lymphoma (75.4% vs. 60.4%, P <0.001).

[CONCLUSION] GPT-4 demonstrated interpretation accuracy comparable to that of associate senior radiologists, outperforming them on low-quality reports and in diagnosing lymphoma. These findings underscore GPT-4's potential to enhance diagnostic performance in challenging scenarios.

[SUMMARY] In this retrospective study of 1494 chest CT reports of varying quality from five tertiary hospitals, GPT-4 classified mediastinal tumors with diagnostic accuracy comparable to that of associate senior radiologists. It outperformed them on low-quality reports and in diagnosing specific tumor types such as lymphoma, showing its potential to enhance diagnostic performance in challenging scenarios.
