Automatic and accurate auxiliary detection of lung cancer pathological classification based on novel lightweight deep learning model.
APA
Wang S, Tian F, Niu Y (2026). Automatic and accurate auxiliary detection of lung cancer pathological classification based on novel lightweight deep learning model. Discover Oncology, 17(1), 325. https://doi.org/10.1007/s12672-026-04487-2
PMID
41579280
Abstract
[BACKGROUND] Lung cancer is one of the major cancers worldwide, and rapid, accurate diagnosis is crucial for subsequent treatment and management. Currently, pathological subtype detection requires clinical experts to invest significant time and effort, making the development of automatic, efficient detection models essential.
[METHODS] This study developed BreezeNet, a novel lightweight deep learning framework designed for precise, automated recognition of lung adenocarcinoma, lung squamous cell carcinoma, and benign lung tissue. Compared with current mainstream deep learning models such as VGG, GoogleNet, and MobileNet, BreezeNet demonstrated superior performance on key metrics such as precision and accuracy.
[RESULTS] In our study, we developed a lightweight deep learning model named BreezeNet for the automatic classification of lung cancer cells. The experimental results show that BreezeNet performs excellently across various metrics, particularly in terms of the number of parameters. Specifically, BreezeNet achieved a precision of 0.9749, a recall of 0.9742, an F1-score of 0.9742, and an accuracy of 0.9789, which are slightly better than traditional deep learning models such as AlexNet, VGG, GoogleNet, ResNet, and MobileNet. However, the most significant advantage of BreezeNet lies in its parameter count, which is only 1,256,679, far lower than AlexNet's 14,587,587 and ResNet's 23,514,179. This means that our model is not only competitive in terms of performance but also significantly reduces the computational resource requirements, greatly enhancing the model's lightweight nature and deployment efficiency.
[CONCLUSION] Compared with traditional deep learning models such as AlexNet, VGG, and ResNet, BreezeNet achieves slightly better performance across all key metrics, with up to 1.6% higher accuracy, 1.76% higher F1-score, and over 18× fewer parameters, highlighting its superior lightweight design and diagnostic effectiveness. Our developed deep learning model can efficiently perform automated subtyping of lung cancer cells, providing accurate diagnostic recommendations for doctors. This will help improve the efficiency of lung cancer diagnosis, thereby enhancing patient survival rates.
Introduction
Lung cancer is one of the most common types of cancer worldwide, with an incidence rate of 22.41 per 100,000 people [1], and its cases and mortality rates continue to rise [2]. In 2020 alone, approximately 1.8 million people globally succumbed to lung cancer, accounting for 18% of all cancer-related deaths [3]. The overall 5-year survival rate for lung cancer is less than 20% [4], with inadequate diagnosis being a significant contributing factor to this high mortality [5]. Therefore, timely and accurate diagnosis of lung cancer is crucial for improving prognosis and reducing mortality rates.
Despite advancements in lung cancer diagnostic methods, such as significant progress in biomarkers and radiomics, histopathology remains the gold standard for definitive diagnosis [6, 7]. Traditionally, the pathology diagnostic process requires pathologists to manually analyze a variety of complex images and identify subtle pathological patterns. However, this process faces challenges due to the shortage of pathologists and the substantial workload involved. From 2007 to 2017, the proportion of pathologists in the total number of physicians in the United States decreased from 2.03% to 1.43%, while their workload increased by 41.73% [8]. In addition to the significant manpower required, the pathology diagnostic process is also inevitably subjective [9], which can impact the accuracy of diagnoses. Therefore, the current reliance on manual analysis in pathology limits the speed and accuracy of image interpretation, highlighting the necessity for the development of fully automated and efficient auxiliary diagnostic methods.
With the rise of artificial intelligence, deep learning has demonstrated powerful analytical capabilities in the field of image processing and achieved good results [10]. It can accurately identify and extract key features from massive amounts of complex data, thereby enabling efficient predictive analysis [11]. In previous studies, some progress has been made in the classification of certain respiratory diseases, such as pneumonia, tuberculosis, and COVID-19 [12–14]. In addition, deep learning is increasingly being used in the detection and classification of tumor diseases, often outperforming traditional machine learning methods in predictive accuracy [15]. Ahmed et al. [16] developed a model based on an improved convolutional neural network (CNN) that can reduce the time to detect lung cancer pathology images to within 10 s while maintaining high accuracy (over 96%). Similarly, Kriegsmann et al. [17] used CNN to classify common subtypes of lung cancer, demonstrating the potential of this model for pathological diagnosis. Janßen et al. [18] successfully applied a dual-modal classification algorithm to subtype lung cancer tissue slices. Another study applied an optimised deep learning framework to the analysis of mammography image datasets, achieving a breast cancer detection accuracy of 0.942 and a sensitivity of 0.982, with excellent results [19]. However, current research in this field still has some limitations, such as the use of single models, single-center data sources, or the need for further improvement in model accuracy, which restricts their application.
In this study, we propose BreezeNet, a lightweight and efficient deep learning model designed for the automatic detection of lung cancer cells in pathological images. The model integrates three key techniques to balance diagnostic accuracy with computational efficiency. First, we employ group convolutions in place of standard 3 × 3 kernels, substantially reducing the number of parameters while preserving effective feature extraction. This design reduces computational costs and facilitates deployment in resource-constrained environments. Second, we incorporate a Squeeze-and-Excitation (SE) attention module, which adaptively recalibrates channel-wise feature responses to emphasize informative signals. This enhances the model’s sensitivity to critical diagnostic patterns without adding significant computational overhead. Finally, we optimize the residual structure of ResNet to improve information flow and reduce redundancy, further strengthening both performance and efficiency. Collectively, these innovations enable BreezeNet to deliver high classification accuracy with significantly fewer parameters compared to conventional architectures like ResNet and AlexNet. The model’s lightweight design supports rapid and reliable deployment on standard medical hardware and mobile platforms, offering practical value in real-world clinical scenarios. To further illustrate the effectiveness of our method, we compared BreezeNet with several widely used models, including AlexNet, VGG, GoogleNet, ResNet, and MobileNet. Across both internal and independent evaluations, BreezeNet achieved slightly higher precision, recall, F1-score, and accuracy—e.g., up to 1.76% higher F1-score—while reducing the total number of model parameters by more than 18 times compared to ResNet. This combination of improved performance and efficiency demonstrates BreezeNet’s unique advantages in lightweight deployment and clinical applicability. The workflow of this study is illustrated in Fig. 1. 
In summary, the originality of this study lies in three aspects: (1) the integration of grouped convolution with SE attention for enhanced feature calibration in pathology image analysis; (2) the significant reduction in parameters—over 18 times fewer than ResNet—while maintaining or surpassing accuracy; and (3) the demonstration of a lightweight design suitable for real-time and resource-limited clinical environments.
Methods
Data acquisition
This study utilized a publicly available dataset from the Kaggle platform (https://www.kaggle.com/datasets/javaidahmadwani/lc25000), which focuses on the classification of lung diseases through images, including lung adenocarcinoma, lung squamous cell carcinoma, and benign lung tissue. The dataset used in this study comprises a total of 15,000 hematoxylin and eosin (H&E)-stained lung histopathological image patches, retrospectively collected from a single medical institution. These image patches were extracted from whole slide images (WSIs) by trained technicians using standardized pathological slicing protocols and a uniform 40× magnification. The dataset is evenly distributed across three pathological categories: 5,000 images for lung adenocarcinoma (lung_aca), 5,000 for lung squamous cell carcinoma (lung_scc), and 5,000 for normal (benign) lung tissue (lung_n). Each image is stored in RGB format with a resolution of 224 × 224 pixels, suitable for deep learning model input. To facilitate model training and evaluation, the dataset is organized into three dedicated subfolders, each corresponding to one of the three classes, ensuring clear label indexing. Figure 2 shows some of the dataset images we used. All image classifications are based on detailed clinical diagnoses and histopathological evaluations, ensuring high accuracy and reliability of the data.
Lung adenocarcinoma is the most common type of non-small cell lung cancer, originating from alveolar or bronchial glandular epithelial cells. Its pathological features typically include gland formation, mucus secretion, and mutational characteristics of tumor cells. Lung squamous cell carcinoma, also a type of non-small cell lung cancer, mainly occurs in the central bronchi and is highly associated with smoking. Its pathological criteria include keratinization, intercellular bridge formation, and the presence of polymorphic tumor cells.
As LC25000 is a patch-level benchmark that may include correlated or near-duplicate tiles, we discuss the potential implications for generalizability and data leakage in the Discussion section.
Data preprocessing
In this study, we employed a series of image preprocessing techniques to optimize the input data quality for the lung cancer detection model. First, through image cleaning, we removed unnecessary background information and interfering elements, making the cancerous tissues more prominent. Next, we used denoising and filtering methods to reduce image noise and enhance detail representation, thereby improving the recognition of cancerous tissues. Additionally, we adjusted the image contrast to clearly distinguish between normal and abnormal tissues, enhancing the model’s discrimination capability.
To enhance the model’s generalization ability and performance, we implemented a series of data augmentation techniques on the training and validation sets. Specifically, during the training phase, we used RandomResizedCrop to resize the images to 224 × 224 pixels and applied RandomHorizontalFlip to increase data diversity. Then, we converted the images into PyTorch tensors (ToTensor) and normalized them using a mean of (0.5, 0.5, 0.5) and a standard deviation of (0.5, 0.5, 0.5) to standardize the data distribution. For the validation set, we resized the images uniformly to 224 × 224 pixels, followed by the same tensor conversion and normalization processes.
Construction of auxiliary diagnostic model
Proposal of a deep learning-based model
In modern healthcare, using deep learning models to identify and classify lung disease images has become a critical technology. Specifically, classical deep learning architectures such as AlexNet, VGG, GoogleNet, ResNet, and MobileNet have been widely applied in medical image analysis, demonstrating outstanding capabilities in handling complex image data [20–24]. However, despite their excellent performance in many aspects, these models often require substantial computational resources and can become inefficient when processing particularly large medical datasets. Additionally, they may encounter overfitting issues when dealing with heterogeneous data. To address these limitations, we developed BreezeNet, a lightweight architecture based on an improved ResNet model. This new model is specifically designed for image recognition of lung adenocarcinoma, lung squamous cell carcinoma, and benign lung tissue, aiming to provide more efficient processing capabilities and better generalization performance. According to recent studies, the ResNet architecture has been proven highly effective in identifying lung diseases, with the ResNet-152 architecture demonstrating high accuracy and F1 scores in similar tasks [25]. BreezeNet builds on this foundation, reducing model complexity while maintaining high accuracy, making it effectively operable in resource-constrained environments. These improvements not only enhance the model’s practicality but also expand its potential for clinical applications, particularly in the early diagnosis and precise treatment of lung cancer.
BreezeNet is an innovative, lightweight deep learning model built on a substantially improved ResNet architecture and designed to accurately identify various lung lesions, including lung adenocarcinoma, lung squamous cell carcinoma, and benign lung tissue. By replacing traditional convolutional layers with grouped convolutions, the model reduces computational complexity while retaining the ability to capture key features. For attention, we deliberately avoid Transformer-based mechanisms, which typically require extensive computational resources and large-scale data; instead, BreezeNet incorporates the Squeeze-and-Excitation (SE) module, a channel-wise recalibration strategy with minimal parameter overhead that dynamically weights the features of different channels, making the network more responsive to diagnostically important information and thereby improving overall accuracy. Combined with grouped convolution, this yields a lightweight yet effective solution tailored to histopathology tasks. During training, the model utilizes a cross-entropy loss function and the Adam optimizer, with appropriate learning-rate settings and decay strategies to ensure fast and stable convergence. The standardized input image size is 224 × 224 pixels, facilitating the model’s ability to process medical image data from various sources. To further enhance performance in practical applications, BreezeNet incorporates data augmentation during the training phase, such as random cropping, rotation, and flipping. These techniques not only increase the model’s robustness to new, unseen samples but also effectively expand the training dataset, thereby reducing the risk of overfitting.
Additionally, by applying Dropout and L2 regularization, the model optimizes parameter configuration and prevents overfitting during training without compromising diagnostic performance. Figure 3 illustrates the primary architecture adopted by our model.
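BreezeNet's exact layer configuration is not reproduced here, so the following PyTorch sketch is illustrative only: a residual block combining grouped 3 × 3 convolutions with an SE module, as described above. The channel count, group count, and reduction ratio are our assumptions, not values from the paper:

```python
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global-pool each channel, then learn per-channel weights."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # "squeeze" to (B, C, 1, 1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                              # "excitation": weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                   # recalibrate channels


class GroupedSEResidualBlock(nn.Module):
    """Residual block with grouped 3x3 convolutions and an SE module on the branch."""

    def __init__(self, channels: int, groups: int = 4):
        super().__init__()
        # groups=g cuts a convolution's parameter count by a factor of g
        # relative to a standard (ungrouped) convolution of the same shape.
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, groups=groups, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, groups=groups, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.se = SEBlock(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.se(self.bn2(self.conv2(out)))
        return self.relu(out + x)                      # identity shortcut
```

For a 64-channel block with 4 groups, each 3 × 3 convolution holds 64 × 16 × 9 = 9,216 weights instead of the 36,864 of a standard convolution, which illustrates how grouping drives down the overall parameter count.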
The BreezeNet model is highly suitable for detecting lung cancer in H&E stained pathology slides. Firstly, BreezeNet’s lightweight network architecture significantly reduces computational resource requirements, allowing it to efficiently operate in standard clinical settings without high-end GPU support. This is particularly important for rapid diagnostics, as it enables real-time image processing and analysis, thereby accelerating the diagnostic process. Secondly, BreezeNet incorporates advanced SE attention mechanisms in its design, enabling the model to more precisely focus on the lung cancer characteristic regions in H&E stained slides. The SE module enhances the model’s ability to recognize lung cancer-specific markers, such as tumor cell morphology and staining characteristics, by weighting the feature channels within the network. This detailed feature extraction is crucial for distinguishing between benign and malignant lung tissues, and even different types of lung cancer, such as lung adenocarcinoma and lung squamous cell carcinoma.
Additionally, BreezeNet employs data augmentation strategies that further enhance the model’s generalization capability for lung cancer pathology images. By simulating various changes that pathology slides might encounter in real clinical environments (such as rotation, scaling, and flipping), BreezeNet maintains its diagnostic accuracy when handling real-world data, which is particularly important when dealing with images from different hospitals and devices.
Comparative models
AlexNet is a groundbreaking model in the field of deep learning for image recognition, developed by Alex Krizhevsky and his colleagues in 2012. Its structure includes multiple convolutional layers, pooling layers, and fully connected layers; notably, it was the first to use the ReLU activation function and the Dropout regularization strategy in CNNs. In medical image processing, especially lung cancer detection, AlexNet has been used to automatically identify and classify lung nodules in CT images, demonstrating potential in effectively distinguishing benign from malignant tumors.
VGG, developed by the Visual Geometry Group at Oxford University, deepened and broadened the network structure of AlexNet, particularly by stacking repeated small (3 × 3) convolutional kernels to construct deeper networks. In medical image analysis, VGG has been used to analyze lung X-rays and CT images, helping to diagnose lung cancer and other pulmonary diseases by extracting rich feature maps that enhance disease recognition accuracy.
GoogleNet, developed by Google, introduced a new network structure called Inception, which increases the network’s width and depth without adding computational burden. Its multi-scale convolutional kernels enable the model to capture image features at different scales, which is particularly important when processing medical images of varying sizes and shapes. In lung cancer image recognition, GoogleNet has been used to automatically detect and classify lung nodules from CT scans, effectively improving early detection rates.
ResNet (Residual Network), developed by researchers at Microsoft Research, addresses the vanishing gradient problem in deep networks by introducing a residual learning framework. In lung cancer detection, ResNet is widely used to analyze lung CT and MRI images, where its deep feature extraction capabilities significantly enhance diagnostic accuracy and reliability.
MobileNet is a lightweight deep neural network proposed by Google and optimized for mobile and edge devices. It uses depthwise separable convolutions to reduce model size and computational demands, allowing efficient operation on low-power devices without sacrificing much performance. In lung cancer detection applications, MobileNet can be used for real-time lung image analysis on mobile devices, enabling rapid screening in resource-constrained environments.
Experimental setup
In this study, we ensured the accuracy and reliability of the model’s performance through meticulous internal evaluations and tests using independent test datasets. We divided the data into training and test sets in an 80% and 20% ratio to examine the model’s generalization ability to new data. Within the training set (80%), we further split the data into 70% for actual training and 30% for validation. This approach allowed us to fine-tune the model during training and validate its performance before final testing on the independent test datasets. To achieve optimal model performance, all model parameters were systematically optimized. We utilized a combination of grid search and random search techniques for parameter tuning to effectively explore the parameter space and find the best configuration. Additionally, we implemented an early stopping mechanism and monitored the validation loss to effectively prevent overfitting, ensuring stable convergence and good generalization of the model. The experimental environment was set as follows: all experiments were conducted on a high-performance computing platform equipped with an NVIDIA 4090Ti GPU with 32GB of memory, and an Intel Xeon CPU. Our experimental software environment was based on the PyTorch deep learning framework, version 1.7.1. To ensure the reproducibility of the experiments, we fixed all random seeds that could affect the results, including those for PyTorch, Numpy, and Python itself.
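The splitting and seeding scheme described above can be sketched as follows. This is a minimal illustration using index placeholders rather than the actual dataset; the seed value 42 is our assumption, as the paper does not state which seed was fixed:

```python
import random

import numpy as np
import torch
from torch.utils.data import random_split


def set_seed(seed: int = 42) -> None:
    """Fix every random seed the text mentions (Python, NumPy, PyTorch)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


set_seed(42)

n_total = 15000                          # 3 classes x 5,000 patches
n_test = int(n_total * 0.20)             # 20% held-out test set

g = torch.Generator().manual_seed(42)
trainval, test = random_split(range(n_total), [n_total - n_test, n_test], generator=g)

n_val = int(len(trainval) * 0.30)        # 30% of the remaining 80% for validation
train, val = random_split(trainval, [len(trainval) - n_val, n_val], generator=g)
```

This yields 8,400 training, 3,600 validation, and 3,000 test samples, matching the 80/20 outer split and 70/30 inner split stated above.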
Model evaluation
To comprehensively evaluate the model’s performance, we applied several standard classification evaluation metrics, including accuracy, precision, recall, F1 score, ROC curve, and confusion matrix. Accuracy reflects the proportion of correctly predicted samples out of the total samples. Precision measures the proportion of true positive samples among those predicted as positive, while recall assesses the proportion of actual positive samples correctly identified by the model. The F1 score is the harmonic mean of precision and recall, used to evaluate the balance between these two metrics. The ROC curve shows the relationship between the true positive rate (TPR) and false positive rate (FPR) at different threshold settings, serving as a crucial tool for assessing the overall performance of the model. The confusion matrix clearly displays the correspondence between the model’s predictions and the actual labels, including true positives (TP), false negatives (FN), true negatives (TN), and false positives (FP). By combining these metrics, we can conduct a comprehensive performance evaluation of the model, ensuring the objectivity and accuracy of the assessment results and enhancing confidence in the model’s effectiveness. The relevant formulas can be expressed as follows:
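For completeness, the standard definitions behind these metrics can be written out from the counts defined above (a reconstruction, as the formulas did not survive extraction; for the three-class task we assume the per-class scores are combined by the usual macro-averaging):

```latex
\begin{align}
\text{Accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{Precision} &= \frac{TP}{TP + FP} \\
\text{Recall}    &= \frac{TP}{TP + FN} \\
F1               &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \\
TPR              &= \frac{TP}{TP + FN}, \qquad FPR = \frac{FP}{FP + TN}
\end{align}
```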
Data acquisition
This study utilized a publicly available dataset from the Kaggle platform (https://www.kaggle.com/datasets/javaidahmadwani/lc25000), which focuses on the classification of lung diseases through images, including lung adenocarcinoma, lung squamous cell carcinoma, and benign lung tissue. The dataset used in this study comprises a total of 15,000 hematoxylin and eosin (H&E)-stained lung histopathological image patches, retrospectively collected from a single medical institution. These image patches were extracted from whole slide images (WSIs) by trained technicians using standardized pathological slicing protocols and a uniform 40× magnification. The dataset is evenly distributed across three pathological categories: 5,000 images for lung adenocarcinoma (lung_aca), 5,000 for lung squamous cell carcinoma (lung_scc), and 5,000 for normal (benign) lung tissue (lung_n). Each image is stored in RGB format with a resolution of 224 × 224 pixels, suitable for deep learning model input. To facilitate model training and evaluation, the dataset is organized into three dedicated subfolders, each corresponding to one of the three classes, ensuring clear label indexing. Figure 2 shows some of the dataset images we used. All image classifications are based on detailed clinical diagnoses and histopathological evaluations, ensuring high accuracy and reliability of the data.Lung adenocarcinoma is the most common type of non-small cell lung cancer, originating from alveolar or bronchial glandular epithelial cells. Its pathological features typically include gland formation, mucus secretion, and mutational characteristics of tumor cells. Lung squamous cell carcinoma, also a type of non-small cell lung cancer, mainly occurs in the central bronchi and is highly associated with smoking. Its pathological criteria include keratinization, intercellular bridge formation, and the presence of polymorphic tumor cells. 
As LC25000 is a patch-level benchmark that may include correlated or near-duplicate tiles, we discuss the potential implications for generalizability and data leakage in the Discussion section.
Data preprocessing
In this study, we employed a series of image preprocessing techniques to optimize the input data quality for the lung cancer detection model. First, through image cleaning, we removed unnecessary background information and interfering elements, making the cancerous tissues more prominent. Next, we used denoising and filtering methods to reduce image noise and enhance detail representation, thereby improving the recognition of cancerous tissues. Additionally, we adjusted the image contrast to clearly distinguish between normal and abnormal tissues, enhancing the model’s discrimination capability.
To enhance the model’s generalization ability and performance, we implemented a series of data augmentation techniques on the training and validation sets. Specifically, during the training phase, we used RandomResizedCrop to resize the images to 224 × 224 pixels and applied RandomHorizontalFlip to increase data diversity. Then, we converted the images into PyTorch tensors (ToTensor) and normalized them using a mean of (0.5, 0.5, 0.5) and a standard deviation of (0.5, 0.5, 0.5) to standardize the data distribution. For the validation set, we resized the images uniformly to 224 × 224 pixels, followed by the same tensor conversion and normalization processes.
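As a minimal illustration of the normalization step described above (a sketch, not the authors' code), the following pure-Python function reproduces the arithmetic applied per channel by `transforms.Normalize` with mean 0.5 and standard deviation 0.5, which maps pixel values from [0, 1] onto [-1, 1]:

```python
def normalize_pixel(value, mean=0.5, std=0.5):
    """Channel-wise normalization: output = (input - mean) / std.
    With mean = std = 0.5, the [0, 1] pixel range maps onto [-1, 1]."""
    return (value - mean) / std

# A black pixel (0.0) maps to -1.0, mid-gray (0.5) to 0.0, white (1.0) to 1.0.
```

Centering the data around zero in this way is a common choice that tends to stabilize gradient-based optimization during training.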
Construction of auxiliary diagnostic model
Proposal of a deep learning-based model
In modern healthcare, using deep learning models to identify and classify lung disease images has become a critical technology. Specifically, classical deep learning architectures such as AlexNet, VGG, GoogleNet, ResNet, and MobileNet have been widely applied in medical image analysis, demonstrating outstanding capabilities in handling complex image data [20–24]. However, despite their excellent performance in many aspects, these models often require substantial computational resources and can become inefficient when processing particularly large medical datasets. Additionally, they may encounter overfitting issues when dealing with heterogeneous data. To address these limitations, we developed BreezeNet, a lightweight architecture based on an improved ResNet model. This new model is specifically designed for image recognition of lung adenocarcinoma, lung squamous cell carcinoma, and benign lung tissue, aiming to provide more efficient processing capabilities and better generalization performance. According to recent studies, the ResNet architecture has been proven highly effective in identifying lung diseases, with the ResNet-152 architecture demonstrating high accuracy and F1 scores in similar tasks [25]. BreezeNet builds on this foundation, reducing model complexity while maintaining high accuracy, making it effectively operable in resource-constrained environments. These improvements not only enhance the model’s practicality but also expand its potential for clinical applications, particularly in the early diagnosis and precise treatment of lung cancer.
BreezeNet is an innovative, lightweight deep learning model that significantly improves upon the ResNet architecture, specifically designed to accurately identify various lung lesions, including lung adenocarcinoma, lung squamous cell carcinoma, and benign lung tissue. By adopting grouped convolutions in place of standard convolution layers, the model reduces computational complexity while retaining the ability to capture key features. Unlike Transformer-based attention mechanisms, which typically require extensive computational resources and large-scale data, BreezeNet incorporates the Squeeze-and-Excitation (SE) attention module, a channel-wise recalibration strategy that dynamically weights the features of different channels with minimal parameter overhead, significantly enhancing the model's responsiveness to important information and thereby improving overall diagnostic accuracy. Combined with grouped convolution, this yields a lightweight yet effective solution tailored to histopathology tasks. During training, the model utilizes a cross-entropy loss function and the Adam optimizer, with appropriate learning-rate settings and decay strategies to ensure fast and stable convergence. The standardized input image size is 224 × 224 pixels, facilitating the model's ability to process medical image data from various sources. To further enhance the model's performance in practical applications, BreezeNet incorporates various data augmentation techniques during the training phase, such as random cropping, rotation, and flipping. These techniques not only increase the model's robustness to new, unseen samples but also effectively expand the training dataset, thereby reducing the risk of overfitting.
Additionally, by applying Dropout and L2 regularization, the model optimizes parameter configuration and prevents overfitting during training without compromising diagnostic performance. Figure 3 illustrates the primary architecture adopted by our model.
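To make the SE recalibration concrete, the following dependency-free Python sketch implements the squeeze (global average pooling), excitation (bottleneck fully connected layers with ReLU and sigmoid), and scale steps; it is an illustrative approximation, not the BreezeNet implementation, and the weights `w1`, `b1`, `w2`, `b2` are hypothetical:

```python
import math

def se_recalibrate(feature_maps, w1, b1, w2, b2):
    """Squeeze-and-Excitation channel recalibration (illustrative sketch).
    feature_maps: list of C channels, each a 2-D list (H x W).
    w1/b1: bottleneck layer (C/r x C); w2/b2: expansion layer (C x C/r)."""
    # Squeeze: global average pooling per channel -> C-dimensional descriptor
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # Excitation, step 1: bottleneck fully connected layer + ReLU
    h = [max(0.0, sum(wi * zi for wi, zi in zip(row, z)) + bi)
         for row, bi in zip(w1, b1)]
    # Excitation, step 2: expansion fully connected layer + sigmoid gate per channel
    s = [1.0 / (1.0 + math.exp(-(sum(wi * hi for wi, hi in zip(row, h)) + bi)))
         for row, bi in zip(w2, b2)]
    # Scale: reweight every spatial location of each channel by its gate
    return [[[v * s[c] for v in row] for row in ch]
            for c, ch in enumerate(feature_maps)]
```

In a real network the gates are learned end to end, so informative channels receive weights near 1 and uninformative ones are suppressed toward 0, at the cost of only two small fully connected layers per block.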
The BreezeNet model is highly suitable for detecting lung cancer in H&E stained pathology slides. Firstly, BreezeNet’s lightweight network architecture significantly reduces computational resource requirements, allowing it to efficiently operate in standard clinical settings without high-end GPU support. This is particularly important for rapid diagnostics, as it enables real-time image processing and analysis, thereby accelerating the diagnostic process. Secondly, BreezeNet incorporates advanced SE attention mechanisms in its design, enabling the model to more precisely focus on the lung cancer characteristic regions in H&E stained slides. The SE module enhances the model’s ability to recognize lung cancer-specific markers, such as tumor cell morphology and staining characteristics, by weighting the feature channels within the network. This detailed feature extraction is crucial for distinguishing between benign and malignant lung tissues, and even different types of lung cancer, such as lung adenocarcinoma and lung squamous cell carcinoma. Additionally, BreezeNet employs data augmentation strategies that further enhance the model’s generalization capability for lung cancer pathology images. By simulating various changes that pathology slides might encounter in real clinical environments (such as rotation, scaling, and flipping), BreezeNet maintains its diagnostic accuracy when handling real-world data, which is particularly important when dealing with images from different hospitals and devices.
Comparative models
AlexNet is a groundbreaking model in the field of deep learning for image recognition, developed by Alex Krizhevsky and his colleagues in 2012. The structure of AlexNet includes multiple convolutional layers, pooling layers, and fully connected layers. Notably, it was the first to use the ReLU activation function and Dropout regularization strategy in CNNs. In medical image processing, especially in lung cancer detection, AlexNet has been used to automatically identify and classify lung nodules in CT images, demonstrating potential in effectively distinguishing between benign and malignant tumors. VGG, developed by the Visual Geometry Group at Oxford University, further deepened and broadened the network structure based on AlexNet, particularly by using repeated small convolutional kernels (3 × 3) to construct deeper networks. In the field of medical image analysis, VGG has been used to analyze lung X-rays and CT images, helping to diagnose lung cancer and other pulmonary diseases by extracting rich feature maps to enhance disease recognition accuracy. GoogleNet, developed by Google, introduced a new network structure called Inception, which increases the network’s width and depth without adding computational burden. GoogleNet uses multi-scale convolutional kernels, enabling the model to capture image features at different scales. This is particularly important in processing medical images of varying sizes and shapes. In the field of lung cancer image recognition, GoogleNet has been used to automatically detect and classify lung nodules from CT scans, effectively improving early lung cancer detection rates. ResNet, or Residual Network, was developed by researchers at Microsoft Research. It addresses the vanishing gradient problem in deep networks by introducing a residual learning framework. In lung cancer detection, ResNet is widely used to analyze lung CT and MRI images. 
Its deep feature extraction capabilities significantly enhance diagnostic accuracy and reliability. MobileNet is a lightweight deep neural network optimized for mobile and edge devices, proposed by Google. It uses depthwise separable convolutions to reduce model size and computational demands, allowing efficient operation on low-power devices without sacrificing much performance. In lung cancer detection applications, MobileNet can be used for real-time lung image analysis on mobile devices, enabling rapid screening in resource-constrained environments.
Experimental setup
In this study, we ensured the accuracy and reliability of the model’s performance through meticulous internal evaluations and tests using independent test datasets. We divided the data into training and test sets in an 80% and 20% ratio to examine the model’s generalization ability to new data. Within the training set (80%), we further split the data into 70% for actual training and 30% for validation. This approach allowed us to fine-tune the model during training and validate its performance before final testing on the independent test datasets. To achieve optimal model performance, all model parameters were systematically optimized. We utilized a combination of grid search and random search techniques for parameter tuning to effectively explore the parameter space and find the best configuration. Additionally, we implemented an early stopping mechanism and monitored the validation loss to effectively prevent overfitting, ensuring stable convergence and good generalization of the model. The experimental environment was set as follows: all experiments were conducted on a high-performance computing platform equipped with an NVIDIA 4090Ti GPU with 32GB of memory, and an Intel Xeon CPU. Our experimental software environment was based on the PyTorch deep learning framework, version 1.7.1. To ensure the reproducibility of the experiments, we fixed all random seeds that could affect the results, including those for PyTorch, Numpy, and Python itself.
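The nested split described above (80/20 train/test, then 70/30 train/validation within the training portion) can be sketched as follows; this is an illustrative reimplementation under those ratios, not the authors' code:

```python
import random

def split_dataset(samples, test_frac=0.2, val_frac=0.3, seed=42):
    """Shuffle, hold out test_frac for testing, then carve val_frac of the
    remaining training portion off as a validation set."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    test, rest = shuffled[:n_test], shuffled[n_test:]
    n_val = int(len(rest) * val_frac)
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test
```

For the 15,000-image dataset used here, these ratios yield 8,400 training, 3,600 validation, and 3,000 test images.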
Model evaluation
To comprehensively evaluate the model’s performance, we applied several standard classification evaluation metrics, including accuracy, precision, recall, F1 score, ROC curve, and confusion matrix. Accuracy reflects the proportion of correctly predicted samples out of the total samples. Precision measures the proportion of true positive samples among those predicted as positive, while recall assesses the proportion of actual positive samples correctly identified by the model. The F1 score is the harmonic mean of precision and recall, used to evaluate the balance between these two metrics. The ROC curve shows the relationship between the true positive rate (TPR) and false positive rate (FPR) at different threshold settings, serving as a crucial tool for assessing the overall performance of the model. The confusion matrix clearly displays the correspondence between the model’s predictions and the actual labels, including true positives (TP), false negatives (FN), true negatives (TN), and false positives (FP). By combining these metrics, we can conduct a comprehensive performance evaluation of the model, ensuring the objectivity and accuracy of the assessment results and enhancing confidence in the model’s effectiveness. The relevant formulas can be expressed as follows:
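Using TP, FP, TN, and FN as defined above, the standard definitions are:

```latex
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \mathrm{TPR} = \frac{TP}{TP + FN},

\mathrm{F1} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \qquad
\mathrm{FPR} = \frac{FP}{FP + TN}
```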
Results
Internal validation results
During the training process of our model in this study, we meticulously set multiple hyperparameters, selected an appropriate optimizer, and adjusted the learning rate and batch size to ensure the model could effectively learn and optimize its performance. First, for the optimizer, we chose the Adam optimizer because it combines the advantages of RMSprop and momentum methods, automatically adjusting the learning rate, which helps avoid getting stuck in local optima during training. The Adam optimizer is widely regarded as one of the most stable optimizers for deep learning tasks. Regarding the learning rate, we set an initial learning rate of 1e-4 and applied a learning rate decay strategy, halving the learning rate every 10 training epochs to ensure finer weight adjustments as the model approaches the optimal solution, thus improving the model’s final performance. For the batch size, we selected a relatively small batch size of 32. This choice ensures that each iteration’s computational resources are not overly consumed, while maintaining sufficient data diversity to prevent overfitting and fully utilizing the GPU’s parallel computing capabilities. Additionally, we implemented an Early Stopping mechanism, terminating training early if the performance on the validation set did not significantly improve for 10 consecutive training epochs. This strategy helps prevent overfitting during training and saves unnecessary training time and resources.
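The step-decay schedule and early-stopping rule described above can be sketched in a few lines (an illustrative reimplementation under the stated settings, not the authors' training code):

```python
def scheduled_lr(epoch, base_lr=1e-4, decay_every=10, factor=0.5):
    """Step decay: halve the learning rate every `decay_every` epochs."""
    return base_lr * factor ** (epoch // decay_every)

class EarlyStopping:
    """Stop when validation loss has not improved for `patience` epochs."""
    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best, self.bad_epochs = val_loss, 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience  # True means: stop training
```

In PyTorch, the same schedule could instead be expressed with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)`; the pure-Python version above is shown only to make the decay arithmetic explicit.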
In our task of detecting lung cancer in pathology images, the proposed new model demonstrated exceptional performance. After rigorous evaluation, the model achieved impressive results across multiple key performance metrics. Specifically, the model reached a precision of 97.49%, indicating its high accuracy in identifying lung cancer images. The recall reached 97.42%, showing the model’s effectiveness in identifying the vast majority of lung cancer cases, thus avoiding missed diagnoses. The accuracy was 97.89%, indicating high diagnostic consistency and reliability overall. Additionally, the F1 score reached 97.42%, reflecting a good balance between precision and recall, ensuring the model’s overall performance in recognizing both positive and negative samples. Through ROC curve analysis, the model’s AUC (Area Under the Curve) value reached 0.99, demonstrating the model’s excellent capability in distinguishing between lung cancer and normal images, maintaining high performance across different threshold settings. The results from the confusion matrix further confirmed the model’s accuracy in classification tasks, showing the specific performance in predicting different categories, including the numbers of true positives, false positives, true negatives, and false negatives, further validating the model’s diagnostic efficiency and accuracy. Figure 4 illustrates the performance metrics of our model.
In our study, in addition to proposing a new model, we also compared the performance of several classic and advanced deep learning models in the task of detecting lung cancer in pathology images. These comparison models include AlexNet, VGG, GoogleNet, ResNet, and MobileNet. Compared to these models, our proposed new model performed better across all key metrics. Table 1 shows the performance comparison of the various models, Fig. 5 presents the ROC curve evaluation metrics for each model, and Fig. 6 displays the confusion matrix evaluation metrics for each model.
Independent test results
Results of the BreezeNet
To further validate the generalization capability of our model, we used independent testing, achieving excellent results. Figure 7 presents the ROC curve evaluation metrics of our model in the independent testing.
Results of the comparison model
In the independent testing, our BreezeNet model was compared with other classic deep learning models such as AlexNet, VGG, GoogleNet, ResNet, and MobileNet. The results showed that BreezeNet demonstrated superior performance across key metrics, including accuracy, precision, recall, F1 score, and AUC value. Specifically, BreezeNet achieved an accuracy of 97.42%, precision of 96.25%, recall of 96.25%, F1 score of 96.24%, and an AUC value as high as 0.99, all of which significantly outperformed the other models. These results not only demonstrate BreezeNet’s superiority in the task of lung cancer image recognition but also highlight its potential in ensuring high precision and high recall, making it the preferred model in the field of complex medical image analysis. Table 2 shows the performance metrics of the various comparison models in independent testing, and Fig. 8 presents the ROC curve evaluation metrics for each model in independent testing.
The comparison of the training cost of the models
As illustrated in Table 1, BreezeNet excels in several key performance metrics, including precision, recall, F1-score, and accuracy, slightly outperforming traditional deep learning models such as AlexNet, VGG, GoogleNet, ResNet, and MobileNet. Its most significant advantage, however, lies in its lightweight design, reflected in a substantial reduction in parameter count: BreezeNet has only 1,256,679 parameters, roughly 11.6× fewer than AlexNet (14,587,587), 10.7× fewer than VGG (13,427,835), 8.2× fewer than GoogleNet (10,312,505), 18.7× fewer than ResNet (23,514,179), and 1.8× fewer than MobileNet (2,227,715). By significantly reducing the number of parameters, BreezeNet lowers training costs and computational resource requirements, making it particularly advantageous in resource-limited environments such as mobile devices and edge computing platforms. This lightweight characteristic not only enhances the practical application value of the model but also increases its feasibility and versatility in different environments.
Clinical interpretability
In the pathological classification of lung cancer, lung adenocarcinoma, lung squamous cell carcinoma, and benign lung tissue are common categories. Lung adenocarcinoma typically occurs in the peripheral areas of the lung and is often associated with non-smokers, being more common in women and younger patients. Lung squamous cell carcinoma is closely related to long-term smoking and generally occurs in the central part of the lung near the main bronchus. Unlike these two malignant tumors, benign lung tissue does not exhibit malignant growth and may present as non-tumorous lesions such as cysts or inflammation. In the medical field, traditional methods for diagnosing lung cancer rely on doctors analyzing pathology slides under a microscope to identify specific cytological features. This process is not only time-consuming but also requires a high level of professional skill. In recent years, the application of deep learning models, especially through techniques like activation heatmaps, has brought revolutionary advances to lung cancer detection. Activation heatmaps visually display the areas that deep learning models focus on when analyzing images, clearly indicating cells or tissues identified as cancerous. This not only helps explain the model’s decision logic but also effectively assists doctors in quickly locating potential cancerous areas, thereby improving the speed and accuracy of diagnosis. The application of this technology has greatly enhanced the efficiency and accuracy of medical diagnostics. Figure 9 illustrates the activation heatmap.
Discussion
BreezeNet is designed to balance diagnostic performance with computational efficiency for histopathology patch classification. By combining grouped convolutions with a lightweight channel-attention module (SE) and a streamlined residual pathway, the model reduces parameter redundancy while preserving discriminative feature extraction for lung adenocarcinoma, lung squamous cell carcinoma, and benign tissue. This architecture supports practical deployment in resource-constrained environments and enables faster training/inference compared with heavier backbones. Consistent with this design goal, BreezeNet achieves competitive (and in our experiments slightly higher) precision, recall, F1-score, and accuracy while using substantially fewer parameters than conventional CNN baselines. These results suggest that careful architectural choices can yield an efficient model without sacrificing classification quality under this benchmark setting.
In analyzing the reasons behind the observed performance trends, several key architectural choices in BreezeNet appear to contribute significantly to its effectiveness. The use of group convolutions reduces parameter redundancy while maintaining robust feature extraction, which helps prevent overfitting and improves generalization, especially on limited medical datasets. The SE attention mechanism further enhances this by enabling the model to focus on the most informative feature channels, thereby improving class separability and aiding in the accurate identification of subtle pathological features. The streamlined residual structure also facilitates efficient information flow and reduces computational burden. On the other hand, we acknowledge that certain misclassifications may occur due to inherent challenges in pathology images. These include overlapping morphological features between benign and malignant tissues, low-contrast regions, or noisy background artifacts. Additionally, heterogeneity in staining quality or tissue preservation may introduce visual ambiguities that can affect model predictions. These factors highlight the importance of incorporating diverse and high-quality training data in future studies to further improve the model’s robustness and reliability in real-world applications.
In comparison with conventional CNN-based models such as AlexNet, VGG, and ResNet, BreezeNet demonstrates competitive or superior performance across all key evaluation metrics, while maintaining a significantly reduced parameter count. This highlights its practical advantage for deployment in resource-constrained environments where computational efficiency is critical. Furthermore, unlike deeper models that often suffer from overfitting due to redundant capacity, BreezeNet’s lightweight design helps mitigate this issue while still capturing essential morphological patterns in histopathological images. However, we also acknowledge that our current architecture does not incorporate advanced global attention mechanisms or transformer-based modules, which could further enhance feature representation in more complex diagnostic scenarios. Future iterations of the model may consider integrating such mechanisms to boost generalization without compromising model size.
In recent years, the application of artificial intelligence in the medical field has evolved from machine learning to deep learning. Deep learning is gradually being applied to lung cancer pathology detection [26], extracting valuable information from images that are difficult for the human eye to distinguish [27]. This can help overcome the subjectivity inherent in pathologists’ visual evaluations, thus improving the accuracy of pathological diagnoses [28]. By learning from and analyzing a large number of pathology images and data, the system can assist pathologists in making more precise and objective diagnoses [29], thereby enhancing diagnostic efficiency. Moreover, previous studies have often relied on high-performance computer systems and relatively complex analytical procedures, which greatly hinder the clinical application and popularization of such technologies [30]. Our study aims to optimize the analytical processes of deep learning models while maintaining computational performance, thereby reducing their dependence on advanced computer systems and promoting the integration of deep learning into clinical practice.
In addition to the strong overall performance, we also recognize that the model may encounter failure cases under certain conditions. For example, pathology images with excessive noise, uneven staining, low contrast between tumor and surrounding tissues, or ambiguous tumor borders can increase the risk of misclassification [31–33]. These factors may obscure discriminative cellular structures and reduce the model’s ability to distinguish subtle differences between subtypes. Although such cases were not explicitly visualized in this study, acknowledging these potential sources of error provides important context for interpreting the results and highlights areas for future methodological improvement.
Despite the promising results, this study has several limitations. First, the current model performs only three-class classification (lung adenocarcinoma, lung squamous cell carcinoma, and benign tissue), which may limit its applicability to more complex or rare histological subtypes; future research will explore multi-class and multi-label frameworks to accommodate a broader range of pathological categories. Second, although we conducted both internal and independent validation, the external dataset was limited in size and source diversity, and the model was validated on only a small number of centers and staining protocols, which may affect its generalizability; multi-center datasets encompassing varied staining protocols and imaging equipment will therefore be included in future work. Third, although BreezeNet achieves strong performance with fewer parameters, its lightweight structure may limit its ability to capture certain fine-grained pathological patterns in highly heterogeneous samples, and the absence of global attention modules might constrain its feature abstraction capacity compared with transformer-based models. Lastly, BreezeNet has not yet been deployed in real-world clinical workflows. Future efforts will focus on prospective studies in collaboration with clinical partners to evaluate model performance in routine practice and to facilitate clinical translation.
Dataset-related limitation (LC25000). We acknowledge an additional limitation related to the benchmark dataset used in this study. LC25000 is a patch-based dataset composed of tissue tiles extracted from whole-slide images, and it is widely used for proof-of-concept evaluation [34]. However, patch-level benchmarks may contain highly correlated or near-duplicate tiles (including augmented variants of the same tissue region). As a result, random patch-level splitting can inadvertently place correlated tiles into both training and test sets, leading to data leakage and potentially optimistic performance estimates. Recent work has explicitly investigated this issue in LC25000 and proposed cleaning/grouping strategies to separate augmented/near-duplicate tiles for more reliable reporting [35]. Therefore, while our results support the efficiency of BreezeNet under this benchmark setting, the reported performance should be interpreted with caution regarding real-world generalization. Future work will adopt leakage-aware splitting (e.g., grouping correlated tiles) and evaluate the model on independent multi-center whole-slide datasets and/or cleaned LC25000 variants.
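The leakage-aware splitting described above can be sketched as follows. This is a minimal illustration only: the tile filenames and the `group_of` helper are hypothetical conventions assumed for the example, not the actual LC25000 file naming, and the idea is simply that all tiles sharing a source group (e.g. augmented variants of the same tissue region) must land in the same partition.

```python
import random

def group_aware_split(tile_ids, group_of, test_frac=0.2, seed=0):
    """Split tiles so that every tile sharing a source group (e.g.
    augmented variants of one tissue region) lands in the same
    partition, preventing train/test leakage."""
    groups = sorted({group_of(t) for t in tile_ids})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, int(len(groups) * test_frac))
    held_out = set(groups[:n_test])
    train = [t for t in tile_ids if group_of(t) not in held_out]
    test = [t for t in tile_ids if group_of(t) in held_out]
    return train, test

# Hypothetical tile names encoding a source region and an augmentation index.
tiles = [f"region{r}_aug{a}.png" for r in range(10) for a in range(4)]
train, test = group_aware_split(tiles, lambda t: t.split("_")[0])
```

Splitting at the group level rather than the patch level guarantees that no region contributes tiles to both partitions, which is the property that random patch-level splitting fails to provide.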
BreezeNet is designed to balance diagnostic performance with computational efficiency for histopathology patch classification. By combining grouped convolutions with a lightweight channel-attention module (SE) and a streamlined residual pathway, the model reduces parameter redundancy while preserving discriminative feature extraction for lung adenocarcinoma, lung squamous cell carcinoma, and benign tissue. This architecture supports practical deployment in resource-constrained environments and enables faster training/inference compared with heavier backbones. Consistent with this design goal, BreezeNet achieves competitive (and in our experiments slightly higher) precision, recall, F1-score, and accuracy while using substantially fewer parameters than conventional CNN baselines. These results suggest that careful architectural choices can yield an efficient model without sacrificing classification quality under this benchmark setting.
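As a rough illustration of the channel-attention ingredient named above, the following sketch implements a squeeze-and-excitation (SE) gate on a (C, H, W) feature map. The weights here are random placeholders and the layer sizes are arbitrary; nothing in this sketch reproduces BreezeNet's actual configuration.

```python
import numpy as np

def se_block(x, reduction=4, seed=0):
    """Squeeze-and-Excitation channel attention on a feature map of
    shape (C, H, W). Weights are random placeholders for illustration."""
    rng = np.random.default_rng(seed)
    c = x.shape[0]
    # Squeeze: global average pooling -> one descriptor per channel.
    z = x.mean(axis=(1, 2))                            # shape (C,)
    # Excitation: bottleneck MLP (ReLU), then a sigmoid gate in (0, 1).
    w1 = rng.standard_normal((c // reduction, c)) * 0.1
    w2 = rng.standard_normal((c, c // reduction)) * 0.1
    s = np.maximum(w1 @ z, 0.0)                        # ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ s)))             # sigmoid, shape (C,)
    # Scale: reweight each channel of the original feature map.
    return x * gate[:, None, None]
```

The squeeze-excitation-scale sequence is what lets the network emphasize informative channels at negligible parameter cost, since the gate depends only on two small fully connected layers per block.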
In analyzing the reasons behind the observed performance trends, several key architectural choices in BreezeNet appear to contribute significantly to its effectiveness. The use of group convolutions reduces parameter redundancy while maintaining robust feature extraction, which helps prevent overfitting and improves generalization, especially on limited medical datasets. The SE attention mechanism further enhances this by enabling the model to focus on the most informative feature channels, thereby improving class separability and aiding in the accurate identification of subtle pathological features. The streamlined residual structure also facilitates efficient information flow and reduces computational burden. On the other hand, we acknowledge that certain misclassifications may occur due to inherent challenges in pathology images. These include overlapping morphological features between benign and malignant tissues, low-contrast regions, or noisy background artifacts. Additionally, heterogeneity in staining quality or tissue preservation may introduce visual ambiguities that can affect model predictions. These factors highlight the importance of incorporating diverse and high-quality training data in future studies to further improve the model’s robustness and reliability in real-world applications.
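The parameter saving from grouped convolution noted above can be made concrete with simple arithmetic. The counts below are generic for any convolutional layer and are not taken from BreezeNet's actual configuration; with `groups` groups, each output channel connects to only `c_in / groups` input channels.

```python
def conv_params(c_in, c_out, k, groups=1, bias=True):
    """Parameter count of a k x k 2-D convolution: with `groups`
    groups, each output channel sees only c_in / groups inputs."""
    assert c_in % groups == 0 and c_out % groups == 0
    return c_out * (c_in // groups) * k * k + (c_out if bias else 0)

dense = conv_params(128, 128, 3)              # 147,584 parameters
grouped = conv_params(128, 128, 3, groups=4)  # 36,992 parameters
```

The weight count shrinks by roughly the number of groups, which is how architectures built on grouped convolutions reach million-scale rather than tens-of-millions-scale parameter budgets.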
Conclusion
We have developed a novel lightweight deep learning model—BreezeNet—for the automated classification of lung cancer in pathology images. The model achieves excellent predictive performance, with an accuracy of 97.89%, precision of 97.49%, recall of 97.42%, and an F1-score of 97.42%. One of the most notable advantages of BreezeNet lies in its highly efficient architecture: it contains only 1.26 million parameters, a significant reduction compared to classical models such as AlexNet (14.59 million) and ResNet (23.51 million). This substantial reduction in model complexity translates to lower training costs, reduced memory requirements, and improved inference speed—making BreezeNet particularly well-suited for deployment in standard clinical settings and resource-constrained environments. By effectively integrating grouped convolution and SE attention mechanisms within a streamlined residual framework, BreezeNet offers a practical and scalable solution for aiding pathologists in rapid and accurate subtype identification of lung cancer. This can enhance diagnostic workflows, reduce workload, and improve decision-making efficiency. Looking forward, we plan to extend this work through multi-center validations, cross-platform deployment tests, and real-world pilot studies, with the goal of accelerating clinical translation and contributing to improved patient outcomes in oncology care.
Source: PubMed Central (JATS). The license follows the original publisher's policy; please cite the original source when reusing.