A novel expert-annotated single-cell dataset for thyroid cancer diagnosis with deep learning benchmarks.
APA
Huy NQ, Do TH, et al. (2025). A novel expert-annotated single-cell dataset for thyroid cancer diagnosis with deep learning benchmarks. PLOS Digital Health, 4(12), e0001120. https://doi.org/10.1371/journal.pdig.0001120
MLA
Huy NQ, et al. "A novel expert-annotated single-cell dataset for thyroid cancer diagnosis with deep learning benchmarks." PLOS Digital Health, vol. 4, no. 12, 2025, e0001120.
PMID
41401160
Abstract
This paper introduces a novel, expert-annotated single-cell image dataset for thyroid cancer diagnosis, comprising 3,419 individual cell images extracted from high-resolution histopathological slides and annotated with nine clinically significant nuclear features. The dataset, collected and annotated in collaboration with pathologists at the 108 Military Central Hospital (Vietnam), is a significant resource for advancing research in automated cytological analysis. We establish a series of robust deep-learning baseline pipelines for multi-label classification on this dataset. These baselines incorporate ConvNeXt, Vision Transformer (ViT), and ResNet backbones, along with techniques to address class imbalance, including conditional CutMix, weighted sampling, and SPA loss with Label Pairwise Regularization (LPR). Experiments show that the proposed pipelines perform well, expose the challenges posed by the dataset's characteristics, and provide a benchmark for future studies in interpretable and reliable AI-based cytological diagnosis. The results highlight the importance of effective model architectures and data-centric strategies for accurate multi-label classification of single-cell images.
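The abstract names weighted sampling as one of the class-imbalance techniques but does not specify the weighting scheme. The sketch below is only an illustration of one common choice for multi-label data, not the authors' method: each sample is weighted by the inverse frequency of its rarest positive label, so cells carrying uncommon features are drawn more often. The function names (`label_frequencies`, `sample_weights`, `draw_epoch`), the fallback weight of 1.0 for samples with no positive label, and the toy label matrix are all assumptions made for this example.

```python
import random
from typing import List, Sequence


def label_frequencies(labels: Sequence[Sequence[int]]) -> List[float]:
    """Fraction of samples in which each label is positive."""
    n = len(labels)
    num_labels = len(labels[0])
    return [sum(row[j] for row in labels) / n for j in range(num_labels)]


def sample_weights(labels: Sequence[Sequence[int]]) -> List[float]:
    """Weight each sample by the inverse frequency of its rarest positive label."""
    freqs = label_frequencies(labels)
    weights = []
    for row in labels:
        positives = [freqs[j] for j, y in enumerate(row) if y == 1]
        # Assumption: a sample with no positive label keeps a neutral weight of 1.0.
        weights.append(1.0 / min(positives) if positives else 1.0)
    return weights


def draw_epoch(labels: Sequence[Sequence[int]], num_draws: int, seed: int = 0) -> List[int]:
    """Draw sample indices with replacement, proportionally to their weights."""
    rng = random.Random(seed)
    weights = sample_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_draws)


# Toy multi-hot label matrix (hypothetical): 4 cells, 3 features.
# The real dataset would have 3,419 rows and 9 columns.
toy_labels = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],  # the only cell with the rare second feature
]
weights = sample_weights(toy_labels)
indices = draw_epoch(toy_labels, num_draws=8)
```

In this toy case the last cell is four times as likely to be drawn as any other, because its rarest positive label appears in only a quarter of the samples. In a PyTorch pipeline, weights of this kind would typically be passed to a weighted sampler rather than used with `random.choices` directly.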