Anatomy-guided visual prompt tuning for cross-modal breast cancer understanding.

NPJ digital medicine 2026 Vol.9(1)

Zhao S, Meng Q, He Y, Xu X, Zhu J, Qiu J, Wu C, Han Y, Deng J, Pan T, Liu J

Cite this paper

APA Zhao S, Meng Q, et al. (2026). Anatomy-guided visual prompt tuning for cross-modal breast cancer understanding. NPJ digital medicine, 9(1). https://doi.org/10.1038/s41746-026-02417-8
MLA Zhao S, et al. "Anatomy-guided visual prompt tuning for cross-modal breast cancer understanding." NPJ digital medicine, vol. 9, no. 1, 2026.
PMID 41688744

Abstract

Early and reliable detection of breast cancer across imaging modalities remains a long-standing challenge due to the heterogeneous appearance of lesions and the lack of cross-domain consistency among medical imaging systems. Recent advances in Vision Transformers (ViTs) and parameter-efficient fine-tuning (PEFT) techniques have enabled rapid model adaptation, yet most existing approaches remain data-driven and fail to incorporate domain-specific anatomical priors. In this work, we propose A-VPT (Anatomy-Guided Visual Prompt Tuning), a novel framework that integrates explicit anatomical structure into the prompt space of a frozen ViT backbone. Unlike conventional prompt tuning methods, A-VPT dynamically generates tissue-aware prompts guided by glandular, fatty, and ductal region embeddings, and performs hierarchical prompt-token interaction across transformer layers. Furthermore, a cross-modal contrastive alignment strategy harmonizes anatomical semantics among mammography, ultrasound, and MRI, enabling robust multi-domain generalization. Extensive experiments on three benchmark datasets (INbreast, BUSI, and Duke-Breast-MRI) demonstrate that A-VPT achieves state-of-the-art performance in both lesion classification and segmentation while using less than 2% of the tunable parameters required for full fine-tuning. Qualitative analyses confirm that anatomy-guided prompts yield interpretable attention patterns consistent with radiological structures. Our results suggest that embedding anatomical priors into prompt tuning not only enhances efficiency and generalization but also provides an interpretable bridge between deep learning representations and human anatomical reasoning.
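To put the "less than 2% of the tunable parameters" figure in context, the following back-of-envelope count may help. It is only a sketch under stated assumptions: the abstract does not name the backbone size, the number of prompt tokens, or the head, so the ViT-B/16 dimensions, the 10 prompts per layer, and the 2-class linear head below are all hypothetical choices. It also counts plain per-layer learnable prompts only and does not model the anatomy-guided prompt generator or the contrastive alignment module described above, which would add further tunable parameters.

```python
# Rough parameter budget for deep visual prompt tuning on a frozen ViT.
# ASSUMPTIONS (not taken from the paper): ViT-B/16 dimensions, 224x224 RGB
# input, 10 learnable prompt tokens per layer, a 2-class linear head.


def vit_backbone_params(dim=768, depth=12, mlp_dim=3072,
                        patch=16, image=224, channels=3):
    """Count parameters of a standard ViT encoder, excluding the task head."""
    n_patches = (image // patch) ** 2
    patch_embed = patch * patch * channels * dim + dim      # projection + bias
    cls_token = dim
    pos_embed = (n_patches + 1) * dim                       # patches + [CLS]
    layer_norm = 2 * dim                                    # scale + shift
    attention = (dim * 3 * dim + 3 * dim) + (dim * dim + dim)  # qkv + out proj
    mlp = (dim * mlp_dim + mlp_dim) + (mlp_dim * dim + dim)
    block = 2 * layer_norm + attention + mlp
    # The trailing layer_norm is the final norm after the last block.
    return patch_embed + cls_token + pos_embed + depth * block + layer_norm


def head_params(dim=768, num_classes=2):
    """Linear classification head: weights + bias."""
    return dim * num_classes + num_classes


def prompt_params(dim=768, depth=12, prompts_per_layer=10):
    """One independent set of learnable prompt tokens per transformer layer."""
    return prompts_per_layer * depth * dim


backbone = vit_backbone_params()
head = head_params()
full_finetune = backbone + head          # everything is trainable
tunable = prompt_params() + head         # backbone stays frozen
fraction = tunable / full_finetune

print(f"full fine-tuning: {full_finetune:,} parameters")
print(f"prompt tuning:    {tunable:,} parameters")
print(f"fraction tunable: {fraction:.3%}")
```

Under these assumptions the tunable share comes out well below the 2% bound quoted in the abstract, which leaves headroom for the additional modules a method like A-VPT would need. Changing `prompts_per_layer` or the backbone dimensions shows how quickly that headroom is consumed.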
