Thyroid nodule segmentation in ultrasound images using transformer models with masked autoencoder pre-training.

Xiang Y; Acharya R; Le Q; Tan JH; Chng CL

doi:10.3389/frai.2025.1618426


Frontiers in Artificial Intelligence, 2025, Vol. 8, p. 1618426




Cite this paper

APA Xiang Y, Acharya R, et al. (2025). Thyroid nodule segmentation in ultrasound images using transformer models with masked autoencoder pre-training. Frontiers in Artificial Intelligence, 8, 1618426. https://doi.org/10.3389/frai.2025.1618426

MLA Xiang Y, et al. "Thyroid nodule segmentation in ultrasound images using transformer models with masked autoencoder pre-training." Frontiers in Artificial Intelligence, vol. 8, 2025, pp. 1618426.

PMID 40777517

DOI 10.3389/frai.2025.1618426

Abstract

[INTRODUCTION] Thyroid nodule segmentation in ultrasound (US) images is a valuable yet challenging task, playing a critical role in diagnosing thyroid cancer. The difficulty arises from factors such as the absence of prior knowledge about the thyroid region, low contrast between anatomical structures, and speckle noise, all of which obscure boundary detection and introduce variability in nodule appearance across different images.

[METHODS] To address these challenges, we propose a transformer-based model for thyroid nodule segmentation. Unlike traditional convolutional neural networks (CNNs), transformers capture global context from the first layer, enabling more comprehensive image representation, which is crucial for identifying subtle nodule boundaries. In this study, we first pre-train a Masked Autoencoder (MAE) to reconstruct masked patches, then fine-tune on thyroid US data, and further explore a cross-attention mechanism to enhance information flow between encoder and decoder.
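The masking step in MAE pre-training can be illustrated with a minimal sketch. It only shows how a random subset of image patches might be hidden from the encoder; it is not the authors' implementation. The function name `random_patch_mask`, the 224x224 input size, the 16x16 patch size, and the 0.75 mask ratio are all assumptions chosen for illustration, since the abstract does not state these settings.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (visible, masked) sets, MAE-style.

    mask_ratio is an assumed hyperparameter; the abstract does not
    report the value used in the study.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)  # random permutation of patch indices
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(order[:num_masked])   # patches the decoder must reconstruct
    visible = sorted(order[num_masked:])  # patches the encoder actually sees
    return visible, masked


# Assumed geometry: a 224x224 image cut into 16x16 patches gives a 14x14 grid.
grid = 224 // 16
visible, masked = random_patch_mask(grid * grid, mask_ratio=0.75, seed=0)
```

In a setup like this, the encoder would process only the `visible` patches and the reconstruction loss would be computed on the `masked` ones; after pre-training, the encoder weights would be reused for fine-tuning on the segmentation task.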

[RESULTS] Our experiments on the public AIMI, TN3K, and DDTI datasets show that MAE pre-training accelerates convergence. However, overall improvements are modest: the model achieves Dice Similarity Coefficient (DSC) scores of 0.63, 0.64, and 0.65 on AIMI, TN3K, and DDTI, respectively, highlighting limitations under small-sample conditions. Furthermore, adding cross-attention did not yield consistent gains, suggesting that data volume and diversity may be more critical than additional architectural complexity.
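For readers unfamiliar with the metric, the DSC between a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for two small binary masks. The function name `dice_coefficient`, the toy masks, and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice Similarity Coefficient for two binary masks given as 2D lists."""
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels labelled as nodule in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy example: each mask marks 3 pixels, and 2 of them overlap.
pred_mask = [[0, 1, 1],
             [0, 1, 0]]
true_mask = [[0, 1, 0],
             [0, 1, 1]]
score = dice_coefficient(pred_mask, true_mask)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they do not overlap at all, so the reported values of 0.63 to 0.65 indicate partial overlap between predicted and annotated nodules.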

[DISCUSSION] MAE pre-training notably reduces training time and helps the model learn transferable features, yet overall accuracy remains constrained by limited data and nodule variability. Future work will focus on scaling up data, pre-training cross-attention layers, and exploring hybrid architectures to further boost segmentation performance.
