End-to-End Multimodal Multiple Instance Learning for Cancer Histopathology Classification with Dual-Attention Fusion.

Journal of medical systems 2026 Vol.50(1)

Shirae S, Debsarkar SS, Kawanaka H, Aronow BJ, Prasath VBS

Cite this paper

APA Shirae S, Debsarkar SS, et al. (2026). End-to-End Multimodal Multiple Instance Learning for Cancer Histopathology Classification with Dual-Attention Fusion. Journal of medical systems, 50(1). https://doi.org/10.1007/s10916-026-02379-0
MLA Shirae S, et al. "End-to-End Multimodal Multiple Instance Learning for Cancer Histopathology Classification with Dual-Attention Fusion." Journal of medical systems, vol. 50, no. 1, 2026.
PMID 41986692

Abstract

In recent years, computational pathology has advanced toward integrating histopathological images and genomic data to improve diagnostic accuracy and biological interpretability. In this study, we propose a lightweight end-to-end multi-instance learning model that integrates whole slide images (WSIs) with gene expression profiles. The proposed method efficiently extracts image features using a reduced MobileNetV4 and treats gene information at the gene set level based on Gene Set Enrichment Analysis (GSEA) to capture functional relationships among genes. Image and gene features are aggregated within and across modalities through self-attention and cross-attention mechanisms. Our experiments on three cancer types, namely low-grade glioma (LGG), non-small cell lung cancer (NSCLC), and breast cancer (BRCA), showed that the proposed model outperformed image-only models and conventional single-vector gene input models. In particular, the ROC-AUC reached 0.740 for LGG, 0.982 for NSCLC, and 0.966 for BRCA, with a notable improvement in PR-AUC observed for BRCA classification. These results indicate that the integration of morphological and molecular information is effective for capturing disease characteristics. Furthermore, the proposed model maintains high classification performance while reducing computational resources, suggesting its potential applicability to large-scale pathological datasets and clinical applications.

[SUPPLEMENTARY INFORMATION] The online version contains supplementary material available at 10.1007/s10916-026-02379-0.
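
The abstract describes aggregating image and gene features "within and across modalities through self-attention and cross-attention mechanisms". The following is a minimal sketch of that general idea, not the authors' implementation. It assumes patch features and gene-set features have already been embedded to the same dimension, and it omits the learned query/key/value projections, the MobileNetV4 feature extractor, and the classifier head. The function names (`attention`, `dual_attention_fusion`), the feature sizes, and the mean-pooling step are all hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key rows."""
    d = queries.shape[-1]
    weights = softmax(queries @ keys.T / np.sqrt(d), axis=-1)
    return weights @ values, weights


def dual_attention_fusion(patch_feats, geneset_feats):
    """Hypothetical fusion of one slide's patch bag with its gene-set tokens."""
    # Self-attention: aggregate within each modality.
    img_self, _ = attention(patch_feats, patch_feats, patch_feats)
    gene_self, _ = attention(geneset_feats, geneset_feats, geneset_feats)

    # Cross-attention: each modality queries the other.
    img_cross, img_to_gene = attention(img_self, gene_self, gene_self)
    gene_cross, _ = attention(gene_self, img_self, img_self)

    # Assumed pooling: average the tokens of each modality, then concatenate
    # to obtain a single slide-level vector for a downstream classifier.
    slide_vec = np.concatenate([img_cross.mean(axis=0), gene_cross.mean(axis=0)])
    return slide_vec, img_to_gene


# Toy inputs: 50 image patches and 8 gene sets, both embedded in 16 dimensions.
rng = np.random.default_rng(0)
patches = rng.normal(size=(50, 16))
gene_sets = rng.normal(size=(8, 16))

slide_vec, img_to_gene = dual_attention_fusion(patches, gene_sets)
print(slide_vec.shape, img_to_gene.shape)
```

In this sketch each row of `img_to_gene` is a probability distribution over gene sets for one patch, which is the kind of weight map that could in principle be inspected for interpretability. Whether the paper exposes its attention weights this way is not stated in the abstract.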