
Benchmarking pathology foundation models for predicting microsatellite instability in colorectal cancer histopathology.

Computerized medical imaging and graphics: the official journal of the Computerized Medical Imaging Society, 2026, Vol. 127, p. 102680

Bilal M, Gulzar MA, Jaffar N, Alabduljabbar A, Altherwy Y, Alsuhaibani A, Almarshad F

📝 One-line summary for patients

The rapid evolution of pathology foundation models necessitates rigorous benchmarking for clinical tasks.

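🔬 Key clinical statistics (auto-extracted from the abstract; verify against the original)
  • Sample size (n): 409 (TCGA cohort only; the abstract also reports PAIP cohorts of n = 47, n = 21, and n = 78)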

Cite this paper

APA Bilal M, Gulzar MA, et al. (2026). Benchmarking pathology foundation models for predicting microsatellite instability in colorectal cancer histopathology. Computerized medical imaging and graphics: the official journal of the Computerized Medical Imaging Society, 127, 102680. https://doi.org/10.1016/j.compmedimag.2025.102680
MLA Bilal M, et al. "Benchmarking pathology foundation models for predicting microsatellite instability in colorectal cancer histopathology." Computerized medical imaging and graphics: the official journal of the Computerized Medical Imaging Society, vol. 127, 2026, p. 102680.
PMID 41352179

Abstract

The rapid evolution of pathology foundation models necessitates rigorous benchmarking for clinical tasks. We evaluated three leading foundation models, UNI, Virchow2, and CONCH, for predicting microsatellite instability status from colorectal cancer whole-slide images, an essential routine clinical test. Our comprehensive framework assessed stain, tissue, and resolution invariance using datasets from The Cancer Genome Atlas (TCGA, USA; n = 409) and Pathology Artificial Intelligence Platform (PAIP, South Korea; training n = 47, testing n = 21 and n = 78). We developed an efficient pipeline with minimal preprocessing, omitting stain normalization, color augmentation, and tumor segmentation. To improve contextual encoding, we applied a five-crop strategy per patch, averaging embeddings from the center and four peripheral crops. We compared three slide-level aggregation strategies and four efficient adaptation strategies. CONCH, using 2-cluster aggregation and ProtoNet adaptation, achieved the top balanced accuracies (0.775 and 0.778) in external validation on PAIP. Conversely, UNI, with mean-averaging aggregation and ANN adaptation, excelled in TCGA cross-validation (0.778) but not in external validation (0.764), suggesting potential overfitting. The proposed five-crop augmentation enhances robustness to scale in UNI and CONCH and reflects the intrinsic invariance achieved by Virchow2 through large-scale pretraining. For prescreening, CONCH demonstrated specificities of 0.65 and 0.45 at sensitivities of 0.90 and 0.94, respectively, highlighting its effectiveness in identifying stable cases and minimizing the number of rapid molecular tests needed. Interestingly, a fine-tuned ResNet34 adaptation achieved superior performance (0.836) in the smaller internal validation cohort, suggesting that current training recipes for pathology foundation models may not generalize sufficiently without task-specific fine-tuning.
Interpretability analyses using CONCH's multimodal embeddings identified plasma cells as key morphological features differentiating microsatellite instability from stability, validated by pathologists (accuracy up to 92.4%). This study underscores the feasibility and clinical significance of adapting foundation models to enhance diagnostic efficiency and patient outcomes.
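
The five-crop strategy and mean-averaging aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the "four peripheral crops" are the four corner crops (the abstract does not say where they are placed), that patches arrive as NumPy arrays of shape (height, width, channels), and that the encoder is any callable returning a fixed-length vector. The names five_crop_boxes, five_crop_embedding, slide_embedding, and toy_encoder are hypothetical; toy_encoder merely stands in for a foundation model such as UNI or CONCH.

```python
import numpy as np

def five_crop_boxes(height, width, crop):
    """Return (top, left) offsets for the centre crop and four corner crops.

    Assumption: the "peripheral" crops are the four corners of the patch.
    """
    if crop > height or crop > width:
        raise ValueError("crop size must not exceed the patch size")
    top_c, left_c = (height - crop) // 2, (width - crop) // 2
    return [
        (top_c, left_c),                # centre
        (0, 0),                         # top-left
        (0, width - crop),              # top-right
        (height - crop, 0),             # bottom-left
        (height - crop, width - crop),  # bottom-right
    ]

def five_crop_embedding(patch, encoder, crop):
    """Average the encoder's embeddings over the five crops of one patch."""
    height, width = patch.shape[:2]
    embeddings = [
        encoder(patch[top:top + crop, left:left + crop])
        for top, left in five_crop_boxes(height, width, crop)
    ]
    return np.mean(embeddings, axis=0)

def slide_embedding(patches, encoder, crop):
    """Mean-average the per-patch embeddings into one slide-level vector."""
    per_patch = [five_crop_embedding(p, encoder, crop) for p in patches]
    return np.mean(per_patch, axis=0)

def toy_encoder(crop_pixels):
    """Hypothetical stand-in for a foundation model: per-channel mean intensity."""
    return crop_pixels.reshape(-1, crop_pixels.shape[-1]).mean(axis=0)

# Four random 256x256 RGB "patches" standing in for tiles of one slide.
rng = np.random.default_rng(0)
patches = [rng.random((256, 256, 3)) for _ in range(4)]
vec = slide_embedding(patches, toy_encoder, crop=224)
print(vec.shape)  # (3,)
```

Averaging over shifted crops exposes the encoder to slightly different fields of view of the same tissue, which is one plausible reading of how the strategy improves contextual encoding. The patch size (256), crop size (224), and corner placement used here are illustrative choices, not values taken from the paper.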

MeSH Terms

Microsatellite Instability; Humans; Colorectal Neoplasms; Benchmarking; Artificial Intelligence

Highly cited papers by the same first author (2)