
Quality and Usability of Prostate Cancer Information Generated by Artificial Intelligence Chatbots: A Comparative Analysis.

Cancers 2026 Vol.18(6)

Al-Khanaty A, Santucci J, Hennes D, Sathianathen N, Delgado C, Sharma K, Dinneen E, Sandhu K, Chen D, Eapen R, Moon D, Jack G, Goad J, Siva S, Ali M, Bolton D, Lawrentschuk N, Murphy DG, Perera M

Citation: Al-Khanaty A, Santucci J, et al. (2026). Quality and Usability of Prostate Cancer Information Generated by Artificial Intelligence Chatbots: A Comparative Analysis. Cancers, 18(6). https://doi.org/10.3390/cancers18060906
PMID 41899512

Abstract

[BACKGROUND] Artificial intelligence chatbots are increasingly used by patients to obtain health information, including for prostate cancer. While these platforms offer accessible and conversational responses, concerns remain regarding the quality, usability, and clinical relevance of AI-generated content. This study comparatively evaluated patient-directed prostate cancer information generated by commonly used AI chatbots.

[METHODS] Standardised prostate cancer-related prompts were developed using Google Trends and authoritative healthcare resources. Identical queries were submitted to five publicly accessible AI chatbots: ChatGPT 5.2, Google Gemini, Claude AI, Microsoft Copilot, and Perplexity. Responses were independently assessed by two blinded reviewers using the DISCERN instrument for information quality and the Patient Education Materials Assessment Tool for printable materials (PEMAT-P) for understandability and actionability. Inter-rater reliability was assessed using intraclass correlation coefficients (ICCs). Readability was evaluated using the Flesch-Kincaid Reading Ease score. Descriptive statistics were used for comparative and pooled analyses.

[RESULTS] Overall information quality was moderate, with a pooled median (interquartile range [IQR]) DISCERN score of 56.5 (53.0-61.0). Higher mean DISCERN scores were observed for ChatGPT 5.2 and Microsoft Copilot, whereas lower scores were observed for Claude and Perplexity. PEMAT-P understandability was consistently high across platforms, with a pooled median (IQR) score of 91.7% (83.3-91.7%). In contrast, PEMAT-P actionability was uniformly poor, with a pooled median (IQR) score of 0% (0-0%). Readability analysis demonstrated moderate complexity, with a pooled median (IQR) Flesch-Kincaid Reading Ease score of 50.4 (49.2-52.5) and a median word count of 666 (657-1022). Inter-rater reliability was good for PEMAT understandability (ICC 0.841) and moderate for DISCERN (ICC 0.712).

[CONCLUSIONS] AI chatbots provide highly understandable but only moderate-quality patient-directed prostate cancer information, with a consistent lack of actionable guidance. Although content quality varied across platforms, significant limitations remain in evidence transparency and practical patient support. Future development should prioritise integration of evidence-based resources and actionable decision-support tools to enhance the role of AI chatbots in prostate cancer education.
