
Assessing the Safety and Clinical Appropriateness of Breast Cancer Advice From Consumer-Grade Large Language Models.

Journal of medical imaging and radiation oncology

Njunge M, Huang Y, Li R, Karunairajah A, Burns N, Falkner N, Porter G


Cite this paper

APA: Michael Njunge, Yang Huang, et al. (2026). Assessing the Safety and Clinical Appropriateness of Breast Cancer Advice From Consumer-Grade Large Language Models. Journal of medical imaging and radiation oncology. https://doi.org/10.1111/1754-9485.70092
MLA: Michael Njunge, et al. "Assessing the Safety and Clinical Appropriateness of Breast Cancer Advice From Consumer-Grade Large Language Models." Journal of medical imaging and radiation oncology, 2026.
PMID: 41937254

Abstract

[INTRODUCTION] Freely available consumer large language models (LLMs) have become a common source of health information for patients. Though convenient, their use by patients raises concerns about accuracy, safety and applicability to local clinical practice. We set out to assess the reliability and clinical appropriateness of breast cancer advice from three widely used LLMs (ChatGPT 3.5o, Gemini 2.0 and Perplexity (Standard)) in a Western Australian (WA) context.

[METHOD] We developed 31 questions covering breast cancer prevention, screening, imaging and management. Each LLM was asked the same question three times. The final answers were assessed for qualitative and quantitative reliability and graded for clinical appropriateness by a blinded panel of Consultant Breast Surgeons and Radiologists.
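The repeated-query reliability check described above can be sketched as follows. This is a minimal illustration only, not the study's protocol: the function names (`consistency`, `assess_model`), the stub in place of a real LLM call, and the sample question are all assumptions; the paper's actual grading was performed by a blinded clinical panel.

```python
from collections import Counter

def consistency(answers):
    """Fraction of repeated answers matching the most common one (1.0 = fully consistent)."""
    top_count = Counter(answers).most_common(1)[0][1]
    return top_count / len(answers)

def assess_model(ask, questions, repeats=3):
    """Ask each question `repeats` times and return per-question consistency scores."""
    return {q: consistency([ask(q) for _ in range(repeats)]) for q in questions}

# Demo with a deterministic stub standing in for a real LLM call:
demo = assess_model(lambda q: "From age 40 via BreastScreen WA.",
                    ["When should breast screening start?"])
print(demo)  # {'When should breast screening start?': 1.0}
```

A real harness would replace the stub with API calls to each model and pass the transcripts on for clinical grading; the consistency score only captures the quantitative reliability step.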

[RESULTS] All three models performed well in terms of reliability, with ChatGPT and Perplexity providing consistent answers to all questions. ChatGPT had the highest rate of clinically appropriate answers (97%), followed by Perplexity (90%) and Gemini (87%). Inappropriate responses were more common when questions included WA-specific terminology, particularly for Perplexity and Gemini. Agreement between Surgeons was strong, while Radiologists showed variability in their ratings.

[CONCLUSION] LLMs can provide reliable and generally appropriate breast cancer advice, but performance degrades on questions involving WA-specific breast screening terminology. Our findings show that LLM performance is region-sensitive, a limitation likely to generalise to other areas of medicine where practice varies regionally. Overall, LLMs are useful as educational tools, but their outputs should always be interpreted in light of local guidelines and with clinical oversight.
