Quality and Usability of Prostate Cancer Information Generated by Artificial Intelligence Chatbots: A Comparative Analysis.
APA
Al-Khanaty, A., Santucci, J., et al. (2026). Quality and usability of prostate cancer information generated by artificial intelligence chatbots: A comparative analysis. Cancers, 18(6). https://doi.org/10.3390/cancers18060906
MLA
Al-Khanaty, A., et al. "Quality and Usability of Prostate Cancer Information Generated by Artificial Intelligence Chatbots: A Comparative Analysis." Cancers, vol. 18, no. 6, 2026.
PMID
41899512
Abstract
[BACKGROUND] Artificial intelligence chatbots are increasingly used by patients to obtain health information, including for prostate cancer. While these platforms offer accessible and conversational responses, concerns remain regarding the quality, usability, and clinical relevance of AI-generated content. This study comparatively evaluated patient-directed prostate cancer information generated by commonly used AI chatbots.
[METHODS] Standardised prostate cancer-related prompts were developed using Google Trends and authoritative healthcare resources. Identical queries were submitted to five publicly accessible AI chatbots: ChatGPT 5.2, Google Gemini, Claude AI, Microsoft Copilot, and Perplexity. Responses were independently assessed by two blinded reviewers using the DISCERN instrument for information quality and the Patient Education Materials Assessment Tool for printable materials (PEMAT-P) for understandability and actionability. Inter-rater reliability was assessed using intraclass correlation coefficients (ICCs). Readability was evaluated using the Flesch-Kincaid Reading Ease score. Descriptive statistics were used for comparative and pooled analyses.
[RESULTS] Overall information quality was moderate, with a pooled median (interquartile range [IQR]) DISCERN score of 56.5 (53.0-61.0). Higher mean DISCERN scores were observed for ChatGPT 5.2 and Microsoft Copilot, whereas lower scores were observed for Claude and Perplexity. PEMAT-P understandability was consistently high across platforms, with a pooled median (IQR) score of 91.7% (83.3-91.7%). In contrast, PEMAT-P actionability was uniformly poor, with a pooled median (IQR) score of 0% (0-0%). Readability analysis demonstrated moderate complexity, with a pooled median (IQR) Flesch-Kincaid Reading Ease score of 50.4 (49.2-52.5) and a median word count of 666 (657-1022). Inter-rater reliability was good for PEMAT understandability (ICC 0.841) and moderate for DISCERN (ICC 0.712).
[CONCLUSIONS] AI chatbots provide highly understandable but only moderately high-quality patient-directed prostate cancer information, with a consistent lack of actionable guidance. Although variation in content quality was observed across platforms, significant limitations remain in evidence transparency and practical patient support. Future development should prioritise integration of evidence-based resources and actionable decision-support tools to enhance the role of AI chatbots in prostate cancer education.
Highly cited papers by the same first author (5)
- Guideline of guidelines: lutetium-177 PSMA radioligand therapy in advanced prostate cancer.
- From salvage to spotlight: how 2025 transformed PSMA radioligand therapy.
- Positive Margin Location and Prostate Biopsy Route: A Consecutive Cohort Comparison of Transperineal and Transrectal Techniques.
- Neoadjuvant Systemic Therapy in High-risk Localised Prostate Cancer: Current Evidence and Future Directions.
- Prostate Cancer Diagnosis by Transurethral Resection of the Prostate Is Associated with Compromised Oncologic Outcomes Post-Prostatectomy.