Leveraging pretrained vision-language model for enhanced breast cancer diagnosis with multi-view mammography.
Automatic PICO extraction (heuristic, confidence 2/4)
P · Population (target patients / population)
This study highlights the potential of applying fine-tuned vision-language models for developing multi-view, image-text-based CAD schemes for breast cancer.
I · Intervention (intervention / procedure)
Not extracted
C · Comparison (control / comparison)
Not extracted
O · Outcome (results / conclusions)
[CONCLUSIONS] The proposed Mammo-CLIP demonstrates superior breast cancer diagnosis performance compared to SOTA methods. This study highlights the potential of applying fine-tuned vision-language models for developing multi-view, image-text-based CAD schemes for breast cancer.
APA
Chen, X., Li, Y., et al. (2026). Leveraging pretrained vision-language model for enhanced breast cancer diagnosis with multi-view mammography. Medical Physics, 53(1), e70261. https://doi.org/10.1002/mp.70261
MLA
Chen, X., et al. "Leveraging Pretrained Vision-Language Model for Enhanced Breast Cancer Diagnosis with Multi-View Mammography." Medical Physics, vol. 53, no. 1, 2026, p. e70261.
PMID
41532302
DOI
10.1002/mp.70261
Abstract
[BACKGROUND] Although fusing information from multiple mammographic views plays an important role in increasing the accuracy of breast cancer detection, developing multi-view mammogram-based computer-aided diagnosis (CAD) schemes still faces major challenges, and no such scheme has been used in clinical practice.
[PURPOSE] To overcome these challenges, we investigate a new approach based on contrastive language-image pre-training (CLIP), which has sparked interest across various medical imaging tasks. The aim is to solve two challenges: (1) effectively adapting the single-view CLIP for multi-view feature fusion, and (2) efficiently fine-tuning this parameter-dense model with limited samples and computational resources.
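For readers unfamiliar with CLIP, its pre-training objective is a symmetric contrastive loss over matched image-text pairs in a batch. The following is a minimal PyTorch sketch of that generic objective, not code from the paper; the function name, the temperature value, and the batch-diagonal pairing convention are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb: torch.Tensor,
                          text_emb: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired (batch, dim) embeddings."""
    # L2-normalize so the dot product is a cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # (batch, batch) similarity matrix; the diagonal holds the matched pairs.
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Cross-entropy in both directions: image-to-text and text-to-image.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2
```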
[METHODS] We introduce Mammo-CLIP, the first multi-modal framework to process multi-view mammograms and corresponding simple texts. Mammo-CLIP uses an early feature-fusion strategy to learn multi-view relationships among four mammograms acquired from the craniocaudal (CC) and mediolateral oblique (MLO) views of the left and right breasts. To enhance learning efficiency, plug-and-play adapters are added to CLIP's image and text encoders, fine-tuning the model efficiently while limiting updates to about 1% of its parameters. For evaluation, we retrospectively assembled two datasets. The first, comprising 470 malignant and 479 benign cases, was used for few-shot fine-tuning and internal evaluation of Mammo-CLIP via 5-fold cross-validation. The second, including 60 malignant and 294 benign cases, was used to test the generalizability of Mammo-CLIP.
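A hedged sketch of the two components named above, written as generic patterns rather than Mammo-CLIP's actual layers: a residual bottleneck adapter (the frozen backbone stays untouched; only the small adapter trains) and a simple concatenation-based fusion of the four standard views. The class names, the bottleneck width of 64, and fusion-by-concatenation are all assumptions; the paper does not publish this code.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: the only trainable block inside a frozen encoder."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection preserves the frozen encoder's features.
        return x + self.up(self.act(self.down(x)))

class MultiViewFusion(nn.Module):
    """Early fusion of four views (L-CC, R-CC, L-MLO, R-MLO) into one case embedding."""
    def __init__(self, dim: int):
        super().__init__()
        self.fuse = nn.Linear(4 * dim, dim)

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, 4, dim) per-view features -> (batch, dim) case feature.
        return self.fuse(views.flatten(start_dim=1))
```

Freezing every backbone parameter (`p.requires_grad = False`) and training only the adapters and fusion head is what keeps the trainable budget near the roughly 1% of parameters the abstract reports.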
[RESULTS] Mammo-CLIP outperforms the state-of-the-art (SOTA) cross-view transformer, evaluated using areas under ROC curves (AUC = 0.841 ± 0.017 vs. 0.817 ± 0.012 and 0.837 ± 0.034 vs. 0.807 ± 0.036) on the two datasets, respectively. It also surpasses two previous CLIP-based methods by 20.3% and 14.3% in AUC.
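To make the reported "mean ± std" AUC figures concrete, they can be reproduced from a 5-fold protocol as sketched below. `fit_predict` is a hypothetical callable standing in for fine-tuning a model on the training fold and scoring the test fold; the paper does not describe its evaluation code.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def cross_validated_auc(features: np.ndarray, labels: np.ndarray,
                        fit_predict, n_splits: int = 5) -> str:
    """Report AUC as mean +/- std over stratified folds, as in the abstract."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    aucs = []
    for train_idx, test_idx in skf.split(features, labels):
        # Train on one fold split, score the held-out cases.
        scores = fit_predict(features[train_idx], labels[train_idx],
                             features[test_idx])
        aucs.append(roc_auc_score(labels[test_idx], scores))
    return f"AUC = {np.mean(aucs):.3f} ± {np.std(aucs):.3f}"
```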
[CONCLUSIONS] The proposed Mammo-CLIP demonstrates superior breast cancer diagnosis performance compared to SOTA methods. This study highlights the potential of applying fine-tuned vision-language models for developing multi-view, image-text-based CAD schemes for breast cancer.
Highly cited papers by the same first author (5)
- Rare fusion transcript in a refractory adult T-cell lymphoblastic lymphoma.
- Rabdosin B suppresses proliferation of nonsmall cell lung cancer by regulating the SRC/PI3K/AKT signaling pathway.
- Development of a chemiluminescence immunoassay for proGRP in human serum.
- Genetically encoded biosensors in microbes for Tumor targeting.
- Analysis of discordant results in multi-technique platform-based MRD detection in multiple myeloma and the clinical decision-making dilemma.