Evaluating Plastic Surgery Chatbot Performance: Insights into Medical Triage, Classification Accuracy, and Escalation Trends.

Aesthetic Surgery Journal 2026 Vol. 46(2) p. 122-129

Wolmer S, Shauly O

Abstract

[BACKGROUND] The integration of AI chatbots into plastic surgery websites is now standard, providing asynchronous, real-time engagement for patients. Although promoted as scheduling and medical guidance tools, their contribution to clinical workflow improvement and patient satisfaction remains unclear.

[OBJECTIVES] The aim of this study was to evaluate AI chatbot performance in the clinical triage of plastic surgery patients, focusing on triage accuracy and the quality of patient interactions.

[METHODS] Chatbots on top-ranking plastic surgery websites, identified by search engine optimization (SEO) rankings, were tested with standardized clinical scenarios representing emergent, urgent, and elective patient inquiries. Responses were assessed for triage sensitivity and specificity, classification accuracy, escalation metrics, and content quality. Patient experience was quantified with a chatbot usability questionnaire and a visual analog scale. A subgroup analysis by chatbot platform was performed, along with a thematic analysis to identify tonal patterns in chatbot language.
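The triage metrics named above can be sketched as a one-vs-rest calculation, in which one urgency level (here "emergent") is treated as the positive class and the other two as negative. This is a minimal illustration, not the authors' analysis code: the function name `one_vs_rest_metrics` and the label lists are hypothetical, and the example labels are invented for illustration rather than taken from the study.

```python
from typing import Dict, Sequence


def one_vs_rest_metrics(
    reference: Sequence[str], predicted: Sequence[str], target: str
) -> Dict[str, float]:
    """Compute triage metrics for one urgency level against all others.

    `reference` holds the clinician-assigned labels and `predicted` the
    chatbot-assigned labels; both names are assumptions for this sketch.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    tp = fn = fp = tn = 0
    for ref, pred in zip(reference, predicted):
        if ref == target and pred == target:
            tp += 1  # target case correctly flagged
        elif ref == target:
            fn += 1  # target case missed (under-triaged or mislabeled)
        elif pred == target:
            fp += 1  # non-target case flagged as target
        else:
            tn += 1  # non-target case correctly not flagged

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a class is absent.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "npv": ratio(tn, tn + fn),
        "false_negative_rate": ratio(fn, tp + fn),
    }


# Invented example labels (NOT study data): 5 scenarios per urgency level.
reference = ["emergent"] * 5 + ["urgent"] * 5 + ["elective"] * 5
predicted = (
    ["emergent", "urgent", "urgent", "urgent", "urgent"]
    + ["urgent", "urgent", "urgent", "elective", "emergent"]
    + ["elective"] * 5
)

metrics = one_vs_rest_metrics(reference, predicted, "emergent")
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because sensitivity and the false negative rate share the same denominator (all truly emergent cases), they always sum to 1, so a low sensitivity directly implies a high false negative rate.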

[RESULTS] Performance varied significantly across 60 clinical scenarios, particularly in urgency classification. Emergent scenarios were most often mislabeled as urgent, yielding low sensitivity (20%), a negative predictive value of 0.71, and a high false negative rate (80.0%). Agreement with physician-determined classifications was moderate (Cohen's kappa = 0.47), and over half of conversations required escalation to a human provider. Misclassified interactions were associated with lower patient usability scores than correctly classified interactions (49.1 vs 60.8, P < .05). Thematic analysis revealed reliance on templated, administrative language.
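Cohen's kappa, the agreement statistic reported above, corrects raw agreement for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below shows one way to compute it for two sets of categorical triage labels. The function name `cohens_kappa` and the example labels are hypothetical and invented for illustration; they are not the study's data.

```python
import math
from collections import Counter
from typing import Sequence


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Unweighted Cohen's kappa for two raters labeling the same items."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must label the same non-empty item set")

    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: sum over labels of the product of each rater's
    # marginal proportion for that label.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined.
        return math.nan
    return (observed - expected) / (1.0 - expected)


# Invented example labels (NOT study data).
physician = ["emergent", "emergent", "urgent", "urgent", "elective", "elective"]
chatbot = ["urgent", "emergent", "urgent", "elective", "elective", "elective"]

kappa = cohens_kappa(physician, chatbot)
print(f"kappa = {kappa:.2f}")
```

Under the commonly used Landis and Koch benchmarks, values between 0.41 and 0.60 are described as moderate agreement, which matches the wording used for the reported kappa of 0.47.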

[CONCLUSIONS] Chatbots are practical and useful tools for managing elective plastic surgery inquiries but are ill-equipped to handle urgent and emergent patient needs. For chatbots to move beyond use as basic administrative assistants, more clinically adept systems must be deployed.

MeSH Terms

Humans; Triage; Surgery, Plastic; Patient Satisfaction; Internet; Artificial Intelligence; Surveys and Questionnaires; Plastic Surgery Procedures; Workflow; Generative Artificial Intelligence