
Bridging radiology and pathology: domain-generalized cross-modal learning for clinical.

NPJ digital medicine 2026 Vol.9(1)

Zhong X, Gu Z, Shanmuganathan M, Li M, Sun H, Du M, Chen Q, Jiang G


Cite this paper

APA Zhong X, Gu Z, et al. (2026). Bridging radiology and pathology: domain-generalized cross-modal learning for clinical. NPJ digital medicine, 9(1). https://doi.org/10.1038/s41746-026-02423-w
MLA Zhong X, et al. "Bridging radiology and pathology: domain-generalized cross-modal learning for clinical." NPJ digital medicine, vol. 9, no. 1, 2026.
PMID 41699055

Abstract

Reliable interpretation of clinical imaging requires integrating complementary evidence across modalities, yet most AI systems remain limited by single-modality analysis and poor generalization across institutions. We propose a unified cross-modal framework that bridges mammography and histopathology for breast cancer diagnosis through: (1) a shared vision transformer encoder with lightweight modality-specific adapters, (2) a weakly supervised patient-level contrastive alignment module that learns cross-modal correspondences without pixel-level supervision, (3) domain generalization strategies combining MixStyle augmentation and invariant risk minimization, and (4) causal test-time adaptation for unseen target domains. The model jointly addresses classification, lesion localization, and pathological grading while generating reasoning-guided attention maps that explicitly link suspicious mammographic regions with corresponding histopathological evidence. Evaluated on four public benchmarks (CBIS-DDSM, INbreast, BACH, CAMELYON16/17), the framework consistently outperforms state-of-the-art unimodal, multimodal, and domain generalization baselines, achieving mean AUC of 0.90 under rigorous leave-one-domain-out evaluation and substantially smaller domain gaps (0.03 vs. 0.06-0.10). Visualization and interpretability analyses further confirm that predictions align with clinically meaningful features, supporting transparency and trust. By advancing multimodal integration, cross-institutional robustness, and explainability, this study represents a step toward clinically deployable AI systems for diagnostic decision support.
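The MixStyle augmentation named in the abstract synthesizes novel "styles" during training by mixing instance-level feature statistics (per-channel mean and standard deviation) between samples from different domains. A minimal NumPy sketch of that statistic-mixing step, assuming (B, C, H, W) feature maps; the function name, argument layout, and shapes are illustrative and not taken from the paper:

```python
import numpy as np

def mixstyle(x, lam, perm, eps=1e-6):
    """Mix per-instance feature statistics between samples (MixStyle-like).

    x    : (B, C, H, W) feature maps.
    lam  : mixing weight in [0, 1]; in practice sampled from a Beta distribution.
    perm : permutation of the batch indices supplying the partner statistics.
    """
    # Per-instance, per-channel style statistics over the spatial dimensions.
    mu = x.mean(axis=(2, 3), keepdims=True)
    sig = x.std(axis=(2, 3), keepdims=True) + eps

    # Remove this sample's style, then re-apply a convex mix of its own
    # statistics and those of the permuted partner sample.
    x_norm = (x - mu) / sig
    mu_mix = lam * mu + (1.0 - lam) * mu[perm]
    sig_mix = lam * sig + (1.0 - lam) * sig[perm]
    return x_norm * sig_mix + mu_mix
```

With `lam = 1.0` the input is returned unchanged; smaller values pull each sample's feature statistics toward those of another sample, which is what makes the encoder less sensitive to institution-specific appearance shifts.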
