Performance of Large Language Models vs Conventional Machine Learning for Predicting Clinical Outcomes With Limited Data: Comparative Study.
PICO automatic extraction (heuristic, conf 2/4)
P · Population (target patients/population)
Patients from 3 recently published clinical datasets (sepsis, gastric cancer, and acute leukemia), sampled to vary the training size; the reported LLM advantage holds for training sizes below 50 patients.
I · Intervention (intervention/procedure)
Not extracted
C · Comparison (control/comparison)
Not extracted
O · Outcome (results/conclusion)
[CONCLUSIONS] These preliminary results may be further optimized. They already show the potential of LLM-based ML to enable new clinical use cases when data are available for only a few tens of patients.
APA
Bigan E, Dufour S (2026). Performance of Large Language Models vs Conventional Machine Learning for Predicting Clinical Outcomes With Limited Data: Comparative Study. JMIR AI, 5, e83853. https://doi.org/10.2196/83853
MLA
Bigan E, et al. "Performance of Large Language Models vs Conventional Machine Learning for Predicting Clinical Outcomes With Limited Data: Comparative Study." JMIR AI, vol. 5, 2026, pp. e83853.
PMID
41921208
DOI
10.2196/83853
Abstract
[BACKGROUND] Machine learning (ML) can be used to predict clinical outcomes. Training predictive models typically requires data for hundreds or thousands of patients. Lowering this requirement to a few tens of patients would enable new applications in clinical trials (eg, optimizing the design of a phase III trial by training a predictive model on phase II data and applying it to synthetic phase III patients) or in clinical decision support systems (for rare diseases or narrow indications). Large language models (LLMs) have recently been shown to outperform conventional ML algorithms for predictions on tabular data when the training dataset is small.
[OBJECTIVE] This study aims to investigate the advantage of LLMs compared with conventional ML for predicting clinical outcomes by applying state-of-the-art models to recently published clinical datasets.
[METHODS] We compared 2 LLMs, 1 proprietary (from OpenAI) and 1 open source (from the Meta Llama family), with conventional ML classification algorithms to predict clinical outcomes using 3 recently published clinical datasets spanning distinct conditions (sepsis, gastric cancer, and acute leukemia). Datasets were chosen such that their publication date was after the LLM knowledge cutoff date to ensure that models were never exposed to these data during pretraining. Datasets were sampled to vary the training size.
[RESULTS] On average, the 2 tested LLMs performed better than conventional ML for training sizes below 50 patients, as measured by the area under the receiver operating characteristic curve, the F-score, the average precision, and the balanced accuracy. Contextual information was found to be key to this advantage.
[CONCLUSIONS] These preliminary results may be further optimized. They already show the potential of LLM-based ML to enable new clinical use cases when data are available for only a few tens of patients.
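The four metrics named in the results, together with the idea of sampling a dataset to vary the training size, can be illustrated with a minimal pure-Python sketch. This is not the authors' code. The function names, the 0.5 decision threshold, the reading of "F-score" as F1, the uniform random sampling scheme, and the toy labels and scores are all assumptions made for illustration; the abstract does not specify any of them. Average precision is computed here without special handling of tied scores.

```python
import random
from typing import List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of precision@k over the ranks k at which a positive is retrieved."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / k
    if hits == 0:
        raise ValueError("average precision needs at least one positive")
    return total / hits


def confusion(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels in {0, 1}."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); assumed here to be the 'F-score'."""
    tp, fp, _, fn = confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity; both classes must be present."""
    tp, fp, tn, fn = confusion(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("balanced accuracy needs both classes in y_true")
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def sample_training_indices(n_patients: int, size: int, seed: int) -> List[int]:
    """Hypothetical helper: draw `size` distinct patient indices, reproducibly."""
    return sorted(random.Random(seed).sample(range(n_patients), size))


if __name__ == "__main__":
    # Toy labels and predicted probabilities (made up, not from the study).
    y_true = [1, 0, 1, 0, 0, 1]
    scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold
    print("ROC AUC:", roc_auc(y_true, scores))
    print("Average precision:", average_precision(y_true, scores))
    print("F1:", f1_score(y_true, y_pred))
    print("Balanced accuracy:", balanced_accuracy(y_true, y_pred))
    for size in (10, 20, 50):  # illustrative training sizes
        print(size, sample_training_indices(200, size, seed=0)[:5])
```

In a comparison of this kind, each model would be fitted on the sampled training indices and scored on held-out patients, with the metrics averaged over repeated samples per training size. The threshold-free metrics (ROC AUC and average precision) use the predicted scores directly, whereas F1 and balanced accuracy depend on the chosen decision threshold.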
Related open-access articles with the same keywords (based on this paper's MeSH terms/keywords)
- Comparative efficacy of different therapeutic approaches in treatment naïve FLT3-mutated AML eligible for intensive chemotherapy: a Bayesian network meta-analysis of randomized trials.
- From mutation to treatment: The dual role of BRCA1 and BRCA2 in gynecological malignancy development and management a systematic review.
- IDH enzyme inhibition in cancer therapy: mechanisms, mutational insights, and effects of IDH inhibitors in glioma, acute myeloid leukemia and chondrosarcoma.
- DualPG-DTA: A Large Language Model-Powered Graph Neural Network Framework for Enhanced Drug-Target Affinity Prediction and Discovery of Novel CDK9 Inhibitors Exhibiting In Vivo Anti-Leukemia Activity.
- Spatial omics at the forefront: emerging technologies, analytical innovations, and clinical applications.
- Day 14 Bone Marrow Biopsy in Acute Myeloid Leukemia Induction: The End of Story or Not Yet?