
A Comparative Study on the Use of DeepSeek-R1 and ChatGPT-4.5 in Different Aspects of Plastic Surgery.

Aesthetic plastic surgery 2026 Vol.50(7) p. 2776-2792


Abstract

[BACKGROUND] Artificial intelligence (AI) has the potential to enhance medical practice, but its application in plastic surgery remains underexplored. DeepSeek-R1 and ChatGPT-4.5 are AI models that can assist with clinical tasks, but their performance in plastic surgery-related queries needs evaluation. This study compares the two models in providing clinically relevant, detailed, and accurate responses.

[OBJECTIVE] This study evaluates and compares the performance of DeepSeek-R1 and ChatGPT-4.5 across 10 plastic surgery-related tasks, focusing on accuracy, detail, and clinical relevance.

[METHODS] Two senior plastic surgeons reviewed the AI-generated responses for each task and rated them on a 1-10 scale for accuracy, completeness, and clinical relevance. The tasks included general knowledge questions as well as more complex, clinically relevant tasks such as medical history notes and hospital admission/discharge slips. After scoring, the mean and standard deviation (SD) were calculated for each model to assess overall performance and consistency.
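The scoring summary described in the methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The ratings are hypothetical placeholders rather than study data, the names `PLACEHOLDER_RATINGS`, `summarize`, and `summarize_all` are invented for illustration, and the use of the sample SD (n - 1 denominator) is an assumption, since the abstract does not state which SD formula was used.

```python
from statistics import mean, stdev

# Hypothetical placeholder ratings on the 1-10 scale, NOT data from the study.
# Each list holds one evaluator's scores for the 10 tasks.
PLACEHOLDER_RATINGS = {
    "model_a": {
        "evaluator_1": [9, 8, 9, 10, 9, 8, 9, 9, 10, 9],
        "evaluator_2": [8, 9, 9, 9, 10, 9, 8, 9, 9, 9],
    },
    "model_b": {
        "evaluator_1": [7, 8, 6, 9, 7, 8, 6, 7, 9, 7],
        "evaluator_2": [8, 7, 6, 8, 7, 9, 6, 7, 8, 6],
    },
}


def summarize(scores):
    """Return (mean, sample SD) for a list of ratings on the 1-10 scale."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to compute a sample SD")
    if any(not 1 <= s <= 10 for s in scores):
        raise ValueError("scores must lie on the 1-10 scale")
    # Assumption: sample SD (n - 1 denominator); the abstract does not say which.
    return mean(scores), stdev(scores)


def summarize_all(ratings):
    """Map each (model, evaluator) pair to its (mean, SD) summary."""
    return {
        (model, evaluator): summarize(scores)
        for model, by_evaluator in ratings.items()
        for evaluator, scores in by_evaluator.items()
    }


if __name__ == "__main__":
    for (model, evaluator), (avg, sd) in summarize_all(PLACEHOLDER_RATINGS).items():
        print(f"{model} / {evaluator}: mean={avg:.2f}, SD={sd:.2f}")
```

In this sketch a higher mean corresponds to better rated performance and a lower SD to more consistent ratings across tasks, which is how the abstract interprets the two statistics.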

[RESULTS] DeepSeek-R1 consistently outperformed ChatGPT-4.5 across all tasks, with higher average scores from both evaluators. DeepSeek-R1 excelled in tasks requiring high clinical detail, comprehensive explanations, and professional-level accuracy, particularly in tasks involving botulinum toxin, medical documentation, and novel research topics. In contrast, ChatGPT-4.5 was rated higher for tasks requiring concise responses, providing accurate but less detailed overviews. The mean scores for DeepSeek-R1 were significantly higher, with lower standard deviations, indicating greater consistency in its responses. ChatGPT-4.5, though performing well for general inquiries, showed more variability and scored lower in complex clinical tasks.

[CONCLUSION] DeepSeek-R1 is better suited for tasks needing clinical detail and professional-level accuracy, while ChatGPT-4.5 excels in providing quick, concise responses. Both models show promise in supporting plastic surgery practice and education, but should complement, not replace, human expertise.

[LEVEL OF EVIDENCE V] This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors at www.springer.com/00266.

Extracted medical entities (NER)

Type	English term	Korean term / gloss	Source	Count
Procedure	`botulinum toxin`	보툴리눔독소 주사 (botulinum toxin injection)	dict	1
Other	`human`		scispacy	1

MeSH Terms

Humans; Surgery, Plastic; Artificial Intelligence; Plastic Surgery Procedures; Female; Clinical Competence; Male; Generative Artificial Intelligence

Citation relations

References cited by this paper: 20

External PMIDs: 14 (not collected in the database)
