Evaluating ChatGPT in Pediatric Craniofacial Surgery Counseling: A Vignette-Based Assessment of Educational Quality, Specificity, Readability, and Emotional Content.

The Journal of Craniofacial Surgery, 2026, Vol. 37(3-4), pp. 812-817

Miller K, Antonevich S, Sturm S, Sandino A, Brochu B, Kassira W, Thaller S

Abstract

[INTRODUCTION] Large language models (LLMs) such as ChatGPT have the potential to improve patient education, but their role in pediatric plastic surgery counseling remains underexplored. This study evaluated ChatGPT-4o's responses to common parent questions across 4 pediatric craniofacial procedures using 5 metrics: DISCERN, specificity, Flesch-Kincaid Grade Level (FKGL), emotion scoring, and the Patient Education Materials Assessment Tool (PEMAT).

[METHODS] Twelve standardized vignettes were developed for cleft lip and palate, craniosynostosis, facial trauma from a dog bite, and otoplasty. Each case featured prompts on surgical risks, recovery, and procedure-specific concerns. All were submitted on the same day using the same ChatGPT-4o profile. DISCERN scores were rated by 2 board-certified plastic surgeons. Specificity and emotion were rated on a 5-point Likert scale by 2 medical students. Readability was calculated with FKGL. PEMAT was used to assess understandability and actionability.
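The FKGL readability metric named above is the standard formula 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The abstract does not say which tool the authors used to compute it, so the following is only a minimal sketch. The function names and the vowel-group syllable heuristic are assumptions made for illustration, and dedicated readability tools count syllables more carefully.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption, not the study's method):
    count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level of an English text."""
    # Sentences are split on runs of ., ! or ?; words are runs of letters.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    # Hypothetical response text, not taken from the study.
    sample = "Your child will stay in the hospital overnight. Most children recover quickly."
    print(round(fkgl(sample), 1))
```

A result near 9 or 10 corresponds to a ninth- or tenth-grade reading level, which is the range the study reports for ChatGPT-4o's responses.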

[RESULTS] Mean DISCERN score was 43.7/75 (reliability 23.8/40, treatment quality 20.3/35). Mean specificity ranged from 1.7 (craniosynostosis) to 3.0 (otoplasty and dog bite). Average FKGL was 9.5 (10th-grade level). Mean emotion score was 3.1. PEMAT scores averaged 62% for understandability and 27% for actionability. The facial trauma vignettes scored highest in both PEMAT domains.
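PEMAT percentages such as those reported above are conventionally computed per domain as the share of applicable items rated "Agree", with "Not Applicable" items excluded from the denominator. The sketch below illustrates that arithmetic only. The function name and the example ratings are hypothetical and are not data from the study.

```python
from typing import Optional, Sequence


def pemat_percent(ratings: Sequence[Optional[int]]) -> float:
    """Percentage score for one PEMAT domain.

    Each rating is 1 (Agree), 0 (Disagree), or None (Not Applicable).
    Not Applicable items are dropped before computing the percentage.
    """
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        raise ValueError("at least one applicable item is required")
    return 100.0 * sum(applicable) / len(applicable)


if __name__ == "__main__":
    # Hypothetical ratings for a single response: 5 Agree out of 7 applicable items.
    example = [1, 1, 0, None, 1, 0, 1, 1]
    print(round(pemat_percent(example), 1))
```

Understandability and actionability are scored separately with this same rule, which is why a response can be readable (62% on average here) while offering few concrete next steps (27%).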

[CONCLUSIONS] ChatGPT-4o produced organized, accessible responses but underperformed in reliability, quality, specificity, and actionability. The reading level exceeded the sixth- to eighth-grade level recommended for patient education materials. Emotional tone was moderate but not consistently tailored to sensitive pediatric contexts. These findings suggest ChatGPT is insufficient for unsupervised use. With refinement, LLMs may support, but not replace, physician-led counseling in pediatric craniofacial surgery.

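Extracted Medical Entities (NER)

| Type | English term | Korean / gloss | UMLS CUI | Source | Occurrences |
|---|---|---|---|---|---|
| Procedure | otoplasty | 귀성형술 (Korean term for otoplasty) | | dict | 2 |
| Drug | [INTRODUCTION] Large | | | scispacy | 1 |
| Drug | [CONCLUSIONS] ChatGPT-4o | | | scispacy | 1 |
| Disease | cleft lip | Cleft upper lip | C0008924 | scispacy | 1 |
| Disease | palate | Palate | C0700374 | scispacy | 1 |
| Disease | craniosynostosis | Craniosynostosis | C0010278 | scispacy | 1 |
| Disease | trauma | Wounds and Injuries | C0043251 | scispacy | 1 |
| Disease | FKGL | Flesch-Kincaid Grade Level | | scispacy | 1 |
| Disease | lip | | | scispacy | 1 |
| Other | ChatGPT | | | scispacy | 1 |
| Other | patient | | | scispacy | 1 |
| Other | FKGL | Flesch-Kincaid Grade Level | | scispacy | 1 |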

MeSH Terms

Humans; Patient Education as Topic; Comprehension; Craniofacial Abnormalities; Emotions; Counseling; Cleft Palate; Child; Parents; Plastic Surgery Procedures; Reproducibility of Results; Cleft Lip; Sensitivity and Specificity; Female; Generative Artificial Intelligence
