Comparative Performance of 3 Artificial Intelligence Systems for Lung Nodule Characterization in Low-Dose Computed Tomography Screening.

Journal of thoracic imaging 2026

PICO automatic extraction (heuristic, conf 2/4)
P · Population (target patients/population)
100 subjects with low-dose computed tomography (LDCT) scans.
I · Intervention (intervention/procedure)
Not extracted
C · Comparison (control/comparison)
Not extracted
O · Outcome (results/conclusion)
Inter-AI agreement was substantial (κ=0.66 to 0.78), and diameter/volume measurements showed moderate to good reliability (ICC=0.57 to 0.87). [CONCLUSION] Commercial AI systems show variable performance in nodule detection and classification, underscoring the need for users to understand each system's characteristics and interpret results within clinical context.

Khurelsukh K, Lin YP, Chang HM, Hsu WC, Huang PC, Wu CT, Wan YL

🔬 Key clinical statistics (auto-extracted from the abstract; verification against the original text is recommended)
  • p-value P<0.001

Cite this paper

APA Khurelsukh K, Lin YP, et al. (2026). Comparative Performance of 3 Artificial Intelligence Systems for Lung Nodule Characterization in Low-Dose Computed Tomography Screening. Journal of thoracic imaging. https://doi.org/10.1097/RTI.0000000000000877
MLA Khurelsukh K, et al. "Comparative Performance of 3 Artificial Intelligence Systems for Lung Nodule Characterization in Low-Dose Computed Tomography Screening." Journal of thoracic imaging, 2026.
PMID 41815002

Abstract

[PURPOSE] This study evaluates 3 artificial intelligence (AI) systems in detecting, characterizing, and classifying lung nodules on low-dose computed tomography (LDCT) scans of 100 subjects, assessing agreement with a reference standard and inter-vendor consistency.

[MATERIALS AND METHODS] Performance of 3 commercially available AI platforms (AI 1, AI 2, and AI 3) was retrospectively analyzed against evaluations by 2 thoracic radiologists, with discordances resolved by consensus to form the reference standard. Agreement was assessed for nodule presence, type (solid, part-solid, ground-glass), and Lung-RADS category using Cohen's kappa. Agreement for continuous measurements (nodule diameter and volume) across AI systems was evaluated using intraclass correlation coefficients (ICC). Group comparisons for continuous variables were performed using the Kruskal-Wallis test, with Mann-Whitney U tests for post hoc pairwise comparisons. Categorical variables were compared using χ² tests. Bland-Altman analysis evaluated variability in diameter and volume measurements.
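
As an illustration of two of the agreement statistics named above, the following is a minimal sketch of Cohen's kappa for categorical labels and of Bland-Altman bias with 95% limits of agreement. The function names and all input values are hypothetical and are not taken from the study; the abstract does not describe the software or exact procedure the authors used.

```python
from collections import Counter
from statistics import mean, stdev


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per label.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) as mean difference +/- 1.96 SD."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


# Hypothetical nodule-type labels from one AI system and the reference standard.
ai_types = ["solid", "solid", "part-solid", "ground-glass", "solid", "ground-glass"]
ref_types = ["solid", "solid", "ground-glass", "ground-glass", "solid", "part-solid"]
kappa = cohen_kappa(ai_types, ref_types)

# Hypothetical diameters (mm) of the same nodules measured by two AI systems.
diam_a = [6.0, 8.5, 4.0, 12.0, 7.0]
diam_b = [5.5, 9.0, 4.5, 11.0, 7.5]
bias, lower, upper = bland_altman(diam_a, diam_b)

print(f"kappa={kappa:.3f}, bias={bias:.2f} mm, limits=({lower:.2f}, {upper:.2f}) mm")
```

Kappa corrects raw agreement for agreement expected by chance, which is why it can be low even when two raters match on most items; the Bland-Altman limits describe how far two measurements of the same nodule can be expected to differ.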

[RESULTS] The 3 AI systems detected 435, 152, and 70 nodules, respectively, whereas radiologists identified 126 nodules (P<0.001). Sensitivity, specificity, and accuracy were 77.0%, 8.2%, and 25.7% for AI 1; 72.2%, 83.4%, and 80.6% for AI 2; and 42.9%, 95.7%, and 82.2% for AI 3. Agreement with the reference standard was perfect for AI 2 and almost perfect for AI 3, but absent for AI 1. Inter-AI agreement was substantial (κ=0.66 to 0.78), and diameter/volume measurements showed moderate to good reliability (ICC=0.57 to 0.87).
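
The relationship between the reported detection counts and the sensitivity, specificity, and accuracy figures can be sketched as follows. The per-system confusion-matrix counts below are back-calculated assumptions, not values stated in the abstract: they assume a pooled set of 494 candidate nodules (126 reference-positive, 368 reference-negative), which is one set of counts consistent with the reported percentages. The function name `detection_metrics` is likewise hypothetical.

```python
def detection_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, accuracy) from 2x2 confusion counts."""
    sensitivity = tp / (tp + fn)  # detected share of reference-positive nodules
    specificity = tn / (tn + fp)  # correctly rejected share of reference-negatives
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, accuracy


# ASSUMED counts (tp, fp, fn, tn), back-calculated so that tp + fp equals the
# number of nodules each system detected (435, 152, 70) and tp + fn equals the
# 126 nodules identified by the radiologists. Not reported in the abstract.
assumed_counts = {
    "AI 1": (97, 338, 29, 30),
    "AI 2": (91, 61, 35, 307),
    "AI 3": (54, 16, 72, 352),
}

results = {name: detection_metrics(*c) for name, c in assumed_counts.items()}

for name, (sens, spec, acc) in results.items():
    print(f"{name}: sensitivity={sens:.1%}, specificity={spec:.1%}, accuracy={acc:.1%}")
```

Under these assumed counts, AI 1 pairs fairly high sensitivity with very low specificity because most of its 435 detections would be false positives, while AI 3 shows the opposite trade-off.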

[CONCLUSION] Commercial AI systems show variable performance in nodule detection and classification, underscoring the need for users to understand each system's characteristics and interpret results within clinical context.