Accuracy of ChatGPT, Gemini, Copilot, and Claude to Blepharoplasty-Related Questions.
TL;DR
ChatGPT demonstrated superior performance in both medical accuracy and clinical relevance among evaluated LLMs regarding upper eyelid blepharoplasty, particularly excelling in postoperative monitoring and follow-up categories.
📈 Citations by year (2025–2026) · Total: 3
OpenAlex topics
Artificial Intelligence in Healthcare and Education
Meta-analysis and systematic reviews
Pain Management and Placebo Effect
APA
Köksaldı, S., Kayabaşı, M., et al. (2025). Accuracy of ChatGPT, Gemini, Copilot, and Claude to blepharoplasty-related questions. Aesthetic Plastic Surgery, 49(17), 4775-4785. https://doi.org/10.1007/s00266-025-05071-9
MLA
Köksaldı, Seher, et al. "Accuracy of ChatGPT, Gemini, Copilot, and Claude to Blepharoplasty-Related Questions." Aesthetic Plastic Surgery, vol. 49, no. 17, 2025, pp. 4775-4785.
PMID
40691658
Abstract
[BACKGROUND] This study aimed to evaluate the performance of four large language models (LLMs), ChatGPT, Gemini, Copilot, and Claude, in responding to upper eyelid blepharoplasty-related questions, focusing on medical accuracy, clinical relevance, response length, and readability.
[METHODS] A set of queries regarding upper eyelid blepharoplasty, covering six categories (anatomy, surgical procedure, additional intraoperative procedures, postoperative monitoring, follow-up, and postoperative complications), was posed to each LLM. An identical prompt establishing clinical context was provided before each question. Responses were evaluated by three ophthalmologists using a 5-point Likert scale for medical accuracy and a 3-point Likert scale for clinical relevance. The length of the responses was assessed. Readability was also evaluated using the Flesch Reading Ease Score, Flesch-Kincaid Grade Level, Coleman-Liau Index, Gunning Fog Index, and Simple Measure of Gobbledygook (SMOG) grade.
[RESULTS] A total of 30 standardized questions were presented to each LLM. None of the responses from any LLM received a score of 1 regarding medical accuracy for any question. ChatGPT achieved an 80% 'highly accurate' response rate, followed by Claude (60%), Gemini (40%), and Copilot (20%). None of the responses from ChatGPT and Claude received a score of 1 regarding clinical relevance, whereas 10% of Gemini's responses and 26.7% of Copilot's responses received a score of 1. ChatGPT also provided the most clinically 'relevant' responses (86.7%), outperforming the other LLMs. Copilot generated the shortest responses, while ChatGPT generated the longest. Readability analyses revealed that all responses required advanced reading skills at a 'college graduate' level or higher, with Copilot's responses being the most complex.
[CONCLUSION] ChatGPT demonstrated superior performance in both medical accuracy and clinical relevance among evaluated LLMs regarding upper eyelid blepharoplasty, particularly excelling in postoperative monitoring and follow-up categories. While all models generated complex texts requiring advanced literacy, ChatGPT's detailed responses offer valuable guidance for ophthalmologists managing upper eyelid blepharoplasty cases.
[LEVEL OF EVIDENCE V] This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors (www.springer.com/00266).
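As a point of reference for the five readability metrics named in the Methods, the sketch below computes them in Python with the textstat package. This is a hedged illustration, not the authors' pipeline: the paper does not state which tool was used, and `sample_response` is an invented stand-in for an LLM answer.

```python
# A minimal sketch of computing the abstract's five readability metrics with
# the textstat package (pip install textstat). The authors do not report
# their tooling, so this choice is an assumption; sample_response is invented.
import textstat

sample_response = (
    "Upper eyelid blepharoplasty involves excision of redundant skin and, "
    "when indicated, conservative resection of preaponeurotic fat. "
    "Postoperative monitoring should include assessment for retrobulbar "
    "hemorrhage, lagophthalmos, and wound dehiscence. "
    "Follow-up visits are typically scheduled at one week and one month."
)

metrics = {
    # Flesch Reading Ease: higher is easier; 'college graduate' text scores
    # roughly 0-30 on this scale.
    "Flesch Reading Ease": textstat.flesch_reading_ease(sample_response),
    # The remaining indices approximate the US school grade level required.
    "Flesch-Kincaid Grade Level": textstat.flesch_kincaid_grade(sample_response),
    "Coleman-Liau Index": textstat.coleman_liau_index(sample_response),
    "Gunning Fog Index": textstat.gunning_fog(sample_response),
    "SMOG grade": textstat.smog_index(sample_response),
}

for name, value in metrics.items():
    print(f"{name}: {value:.1f}")
```

Dense clinical prose like this sample typically scores at or beyond the college level on all five indices, which is the pattern the abstract reports for every model's responses.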
Extracted Medical Entities (NER)
| Type | English term | Korean / gloss | UMLS CUI | Source | Count |
|---|---|---|---|---|---|
| Procedure | blepharoplasty | 안검성형술 | | dict | 5 |
| Anatomy | upper eyelid | 눈꺼풀 | | dict | 4 |
| Complication | eyelid | | | scispacy | 1 |
| Drug | ChatGPT | | | scispacy | 1 |
| Drug | Claude-in | | | scispacy | 1 |
| Drug | [RESULTS] A | | | scispacy | 1 |
| Drug | Gemini | | | scispacy | 1 |
| Disease | LLM | | | scispacy | 1 |
| Other | Gemini | | | scispacy | 1 |
| Other | ChatGPT | | | scispacy | 1 |
| Other | Copilot | | | scispacy | 1 |
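The rows sourced "scispacy" above are automated biomedical NER output, which explains oddities such as chatbot names filed under "Drug" and the "[RESULTS] A" fragment. The sketch below shows how such output is typically produced; the page does not name the model used, so en_ner_bc5cdr_md (whose only labels are CHEMICAL and DISEASE) is an assumption.

```python
# Minimal scispaCy NER sketch. Assumes the en_ner_bc5cdr_md model, which
# only knows the labels CHEMICAL and DISEASE -- the page does not say which
# model produced its entity table, so this choice is an assumption.
# Setup: pip install scispacy, then install the en_ner_bc5cdr_md package
# from the scispaCy releases page.
import spacy

nlp = spacy.load("en_ner_bc5cdr_md")

abstract_snippet = (
    "This study evaluated four large language models, ChatGPT, Gemini, "
    "Copilot, and Claude, in responding to upper eyelid "
    "blepharoplasty-related questions."
)

doc = nlp(abstract_snippet)
for ent in doc.ents:
    # A two-label model must force every detection into CHEMICAL or DISEASE,
    # so product names can surface as CHEMICAL (rendered as "Drug" above).
    print(ent.text, ent.label_)
```

The "dict" rows, by contrast, appear to come from direct lookups against a Korean-English medical term list, which would explain why only those rows carry Korean glosses.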
MeSH Terms
Blepharoplasty; Humans; Surveys and Questionnaires; Female; Male; Language; Comprehension; Generative Artificial Intelligence
🔗 Co-occurring domains
Categories that frequently appear alongside this paper's category in other papers
Highly cited papers by the same first author (1)
Related papers
- Penetrating globe injury following periocular hyaluronic acid filler injection: A case report.
- Implications of Dermatologic Disorders in Facial Cosmetic Surgery: A Systematic Review.
- Mohs Surgery Defect Closure Using Blepharoplasty.
- Are large language models consistent with the ASPS and AAPS guidelines? A comparison of AI chatbot recommendations and plastic surgery clinical guidance.
- Application of the SCIA-Pure Skin Perforator Flap in Bilateral Upper Eyelid Reconstruction: A Case Report and Review of the Literature.