
Explainable multi-modal deep learning for transparent cancer diagnosis: integrating radiology, clinical features, and decision visualization.

Frontiers in Artificial Intelligence, 2026, Vol. 9, p. 1767612

Dash S, Bewoor L, Dongre Y, Bhosle A, Patil K, Jadhav S, Mohapatra B, Walia B



APA: Dash S, Bewoor L, et al. (2026). Explainable multi-modal deep learning for transparent cancer diagnosis: integrating radiology, clinical features, and decision visualization. Frontiers in Artificial Intelligence, 9, 1767612. https://doi.org/10.3389/frai.2026.1767612
MLA Dash S, et al.. "Explainable multi-modal deep learning for transparent cancer diagnosis: integrating radiology, clinical features, and decision visualization.." Frontiers in artificial intelligence, vol. 9, 2026, pp. 1767612.
PMID 41809581

Abstract

[INTRODUCTION] Although artificial intelligence-based cancer diagnostic models have demonstrated strong predictive performance, their lack of transparency and reliance on single-modality data continue to limit clinical trust and adoption. Effectively integrating multi-modal data with interpretable decision-making remains a key challenge.

[METHODS] We propose an explainable multi-modal deep learning framework that integrates radiological imaging and structured clinical features using attention-based fusion. Image-level explanations are generated using Grad-CAM++, while SHAP is employed to quantify clinical feature contributions, enabling unified and cross-modal aligned interpretation rather than independent uni-modal explanations. The framework was evaluated on publicly available datasets, including CBIS-DDSM mammography, Duke Breast Cancer MRI, and TCGA cohorts (BRCA, LUAD, and GBM), comprising a total of 3,842 images from 2,917 patients.
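The attention-based fusion described above can be illustrated in miniature. The paper does not publish code, so the following is a minimal, hypothetical sketch: it assumes each modality has already been embedded into an equal-length feature vector (e.g., by an imaging backbone and a clinical-feature encoder), scores each embedding with a learned weight vector, and softmax-normalizes the scores into modality attention weights. All function and variable names are illustrative, not the authors' implementation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fusion(img_emb, clin_emb, w_img, w_clin):
    """Fuse two modality embeddings with softmax attention weights.

    img_emb, clin_emb : equal-length feature vectors (one per modality)
    w_img, w_clin     : scoring weights (learned in a real model;
                        fixed here for illustration)
    Returns the fused vector and the per-modality attention weights.
    """
    # Score each modality, then normalize scores into attention weights.
    s_img = sum(a * b for a, b in zip(img_emb, w_img))
    s_clin = sum(a * b for a, b in zip(clin_emb, w_clin))
    alpha = softmax([s_img, s_clin])
    # Attention-weighted sum, in contrast to plain feature concatenation.
    fused = [alpha[0] * i + alpha[1] * c for i, c in zip(img_emb, clin_emb)]
    return fused, alpha

# Toy usage: two 2-dimensional modality embeddings.
fused, alpha = attention_fusion([1.0, 2.0], [3.0, 4.0], [0.5, 0.5], [0.1, 0.1])
```

The key design point, which the results section echoes, is that the attention weights let the model rebalance modality contributions per case, whereas concatenation fixes the mixing implicitly in downstream layers.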

[RESULTS] The proposed model consistently outperformed uni-modal approaches and simple fusion baselines, achieving an improved balance between sensitivity and specificity. Attention-based fusion demonstrated superior performance compared with feature concatenation, and the integration of explainability did not compromise predictive accuracy. Visual and clinical explanations highlighted diagnostically relevant tumor regions and established oncological risk factors. Stable performance across datasets indicates strong generalization capability.

[DISCUSSION] These results demonstrate that explainable multi-modal learning can effectively combine accuracy, interpretability, and robustness, supporting the development of reliable AI-based decision-support systems for cancer diagnosis.
