Evaluating ChatGPT in Pediatric Craniofacial Surgery Counseling: A Vignette-Based Assessment of Educational Quality, Specificity, Readability, and Emotional Content.

The Journal of craniofacial surgery 2026 Vol.37(3-4) p. 812-817

Miller K, Antonevich S, Sturm S, Sandino A, Brochu B, Kassira W, Thaller S

Abstract

[INTRODUCTION] Large language models (LLMs) such as ChatGPT have the potential to improve patient education, but their role in pediatric plastic surgery counseling remains underexplored. This study evaluated ChatGPT-4o's responses to common parent questions across 4 pediatric craniofacial procedures using 5 metrics: DISCERN, specificity, Flesch-Kincaid Grade Level (FKGL), emotion scoring, and the Patient Education Materials Assessment Tool (PEMAT).

[METHODS] Twelve standardized vignettes were developed for cleft lip and palate, craniosynostosis, facial trauma from a dog bite, and otoplasty. Each case featured prompts on surgical risks, recovery, and procedure-specific concerns. All were submitted on the same day using the same ChatGPT-4o profile. DISCERN scores were rated by 2 board-certified plastic surgeons. Specificity and emotion were rated on a 5-point Likert scale by 2 medical students. Readability was calculated with FKGL. PEMAT was used to assess understandability and actionability.
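The FKGL readability metric named above follows a fixed published formula: 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The abstract does not say which tool the authors used to compute it, so the following is only a minimal Python sketch. The function names `count_syllables` and `fkgl` are hypothetical, and the syllable counter is a rough vowel-group heuristic, so its scores may differ slightly from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (heuristic, not exact)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a trailing silent 'e' ("made"), but not "-le" or "-ee" endings.
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level of an English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )
```

Calling `fkgl(response_text)` on each model response and averaging the results would give a mean grade level comparable in kind to the one reported in the results.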

[RESULTS] Mean DISCERN score was 43.7/75 (reliability 23.8/40, treatment quality 20.3/35). Mean specificity ranged from 1.7 (craniosynostosis) to 3.0 (otoplasty and dog bite). Average FKGL was 9.5 (10th-grade level). Mean emotion score was 3.1. PEMAT scores averaged 62% for understandability and 27% for actionability. The facial trauma vignettes scored highest in both PEMAT domains.
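PEMAT percentages such as those above are conventionally computed by rating each item as agree (1) or disagree (0), excluding items marked not applicable, and dividing the points earned by the number of applicable items. The sketch below assumes that convention. The function name `pemat_percent` and the example ratings are hypothetical and are not data from the study.

```python
from typing import Optional, Sequence


def pemat_percent(ratings: Sequence[Optional[int]]) -> float:
    """Percent score for one PEMAT domain.

    Each rating is 1 (agree), 0 (disagree), or None (not applicable).
    """
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        raise ValueError("at least one applicable item is required")
    if any(r not in (0, 1) for r in applicable):
        raise ValueError("ratings must be 0, 1, or None")
    return 100.0 * sum(applicable) / len(applicable)


# Hypothetical item ratings for a single response (illustration only).
example_actionability = [1, 0, 0, None, 0, 1, 0]
print(f"{pemat_percent(example_actionability):.1f}%")
```

Understandability and actionability are scored separately with the same calculation, which is why the abstract reports two percentages.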

[CONCLUSIONS] ChatGPT-4o produced organized, accessible responses but underperformed in reliability, quality, specificity, and actionability. The reading level exceeded the recommended sixth- to eighth-grade standard for patient education materials. Emotional tone was moderate but not consistently tailored to sensitive pediatric contexts. These findings suggest that ChatGPT is insufficient for unsupervised use. With refinement, LLMs may support, but not replace, physician-led counseling in pediatric craniofacial surgery.

Extracted Medical Entities (NER)

Type	English term	Korean / gloss	UMLS CUI	Source	Occurrences
Procedure	`otoplasty`	귀성형술		dict	2
Disease	`cleft lip`		C0008924 Cleft upper lip	scispacy	1
Disease	`palate`		C0700374 Palate	scispacy	1
Disease	`craniosynostosis`		C0010278 Craniosynostosis	scispacy	1
Disease	`trauma`		C0043251 Wounds and Injuries	scispacy	1
Disease	`lip`			scispacy	1
Other	`ChatGPT`			scispacy	1
Other	`patient`			scispacy	1
Other	`FKGL` → Flesch-Kincaid Grade Level			scispacy	1

MeSH Terms

Humans; Patient Education as Topic; Comprehension; Craniofacial Abnormalities; Emotions; Counseling; Cleft Palate; Child; Parents; Plastic Surgery Procedures; Reproducibility of Results; Cleft Lip; Sensitivity and Specificity; Female; Generative Artificial Intelligence

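Co-occurring Domains

Categories that are frequently covered in the same papers as this paper's category

Rhinoplasty (12) Infection (10) Blepharoplasty (6) Flap reconstruction (6) Auricular cartilage (6) Facelift (5) Asymmetry (5) Mammaplasty (3)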
