
Thyroid nodule segmentation in ultrasound images using transformer models with masked autoencoder pre-training.

Frontiers in Artificial Intelligence, 2025, Vol. 8, p. 1618426

Authors

Xiang Y, Acharya R, Le Q, Tan JH, Chng CL


Cite this article

APA: Xiang Y, Acharya R, et al. (2025). Thyroid nodule segmentation in ultrasound images using transformer models with masked autoencoder pre-training. Frontiers in Artificial Intelligence, 8, 1618426. https://doi.org/10.3389/frai.2025.1618426
MLA: Xiang Y, et al. "Thyroid nodule segmentation in ultrasound images using transformer models with masked autoencoder pre-training." Frontiers in Artificial Intelligence, vol. 8, 2025, p. 1618426.
PMID 40777517

Abstract

[INTRODUCTION] Thyroid nodule segmentation in ultrasound (US) images is a valuable yet challenging task, playing a critical role in diagnosing thyroid cancer. The difficulty arises from factors such as the absence of prior knowledge about the thyroid region, low contrast between anatomical structures, and speckle noise, all of which obscure boundary detection and introduce variability in nodule appearance across different images.

[METHODS] To address these challenges, we propose a transformer-based model for thyroid nodule segmentation. Unlike traditional convolutional neural networks (CNNs), transformers capture global context from the first layer, enabling more comprehensive image representation, which is crucial for identifying subtle nodule boundaries. In this study, we first pre-train a Masked Autoencoder (MAE) to reconstruct masked patches, then fine-tune on thyroid US data, and further explore a cross-attention mechanism to enhance information flow between encoder and decoder.
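The random patch masking at the heart of MAE pre-training can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, patch size, and 75% mask ratio are illustrative assumptions (75% is the ratio commonly used in MAE-style pre-training), shown here on a plain NumPy array rather than a transformer pipeline:

```python
import numpy as np

def mask_patches(image, patch_size=4, mask_ratio=0.75, seed=0):
    """Split a square image into non-overlapping patches and zero out a
    random subset, mimicking the masking step of MAE pre-training.
    Returns the masked image and a boolean mask over patch indices."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    ph, pw = h // patch_size, w // patch_size      # patches per side
    n_patches = ph * pw
    n_masked = int(n_patches * mask_ratio)
    masked_idx = rng.choice(n_patches, size=n_masked, replace=False)

    out = image.copy()
    mask = np.zeros(n_patches, dtype=bool)
    mask[masked_idx] = True
    for idx in masked_idx:
        r, c = divmod(idx, pw)                     # patch grid position
        out[r*patch_size:(r+1)*patch_size,
            c*patch_size:(c+1)*patch_size] = 0.0   # hide this patch
    return out, mask

# A 16x16 image with 4x4 patches yields 16 patches; 75% (12) are hidden.
img = np.ones((16, 16), dtype=np.float32)
masked, mask = mask_patches(img)
```

During pre-training, the encoder sees only the visible patches and the decoder is trained to reconstruct the hidden ones; the `mask` array indicates which patch reconstructions contribute to the loss.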

[RESULTS] Our experiments on the public AIMI, TN3K, and DDTI datasets show that MAE pre-training accelerates convergence. However, overall improvements are modest: the model achieves Dice Similarity Coefficient (DSC) scores of 0.63, 0.64, and 0.65 on AIMI, TN3K, and DDTI, respectively, highlighting limitations under small-sample conditions. Furthermore, adding cross-attention did not yield consistent gains, suggesting that data volume and diversity may be more critical than additional architectural complexity.

[DISCUSSION] MAE pre-training notably reduces training time and helps the model learn transferable features, yet overall accuracy remains constrained by limited data and nodule variability. Future work will focus on scaling up data, pre-training cross-attention layers, and exploring hybrid architectures to further boost segmentation performance.
