A lightweight CNN for enhanced non-small cell lung cancer classification using CT scan image.
Abstract
Lung cancer is a leading cause of cancer-related mortality worldwide, and its early and accurate detection is critical for improving patient outcomes. Computed tomography (CT) scans are widely used to diagnose lung cancer; however, the accuracy of diagnosis often depends on the expertise of radiologists. Recently, deep learning-based clinical decision support systems have shown promise in assisting diagnosis by providing reliable and consistent predictions. In this paper, we propose MiniConvNet, a lightweight convolutional neural network designed to detect and classify non-small cell lung cancer (NSCLC) and its subtypes, adenocarcinoma, squamous cell carcinoma, and large cell carcinoma, from CT images. We also evaluate the model's generalizability on a histopathological lung cancer dataset, demonstrating its robustness across imaging modalities. We benchmark MiniConvNet against several established CNN architectures, including ResNet50, VGG16, VGG19, Inception V3, MobileNetV3Small, EfficientNetV2B0, and ConvNeXtTiny, under identical experimental conditions. Extensive experiments on two publicly available datasets show that MiniConvNet achieves competitive or superior performance compared to the baselines while maintaining a significantly smaller model size and faster inference. These results highlight MiniConvNet's potential as an efficient and deployable tool for lung cancer subtype classification in resource-constrained clinical settings.
Introduction
Lung cancer (LC) remains the most prevalent cause of cancer deaths globally1. Cancer incidence is driven primarily by population ageing and growth, together with the adoption of cancer-causing behaviours in developing countries2. Tobacco smoking continues to be the predominant contributing factor3. According to World Health Organization statistics, LC was the leading cause of cancer deaths in 2020, accounting for approximately 1.8 million deaths, and about 2.48 million new cases were reported in 2022, making it one of the most common cancers globally4. Within LC, non-small cell lung cancer (NSCLC) is the predominant subtype3, accounting for approximately 85% of cases. NSCLC comprises three primary subtypes: adenocarcinoma (AC), squamous cell carcinoma (SCC), and large cell carcinoma (LCC). AC is the prevailing type of LC, and advancements in oncology, molecular biology, pathology, radiology, and surgery have necessitated a comprehensive and standardized classification system5. AC has become the most prevalent NSCLC subtype, with its incidence steadily increasing6,7. SCC also holds a significant share of NSCLC cases, with each subtype exhibiting both distinct and shared clinicopathologic characteristics6. SCC typically presents as central lesions and is considered genetically complex, lacking targetable abnormalities8; it is an aggressive form of LC known for early spread and is strongly associated with tobacco use9. LCC is a variant of LC that lacks the distinct features of small-cell LC, AC, or SCC10; it is rare but highly aggressive11. In the medical field, CT scanning has emerged as a valuable tool for detecting a range of lung disorders12,13, and the suitability of CT images for detecting lung diseases has been recognized in several studies14. Traditional diagnostic methods, such as biopsy and imaging, are often costly. In medical diagnosis, artificial intelligence (AI) has demonstrated the potential to reduce diagnostic time significantly while maintaining high accuracy15,16, although it must be used responsibly and ethically17. The emergence of machine learning, and convolutional neural networks (CNNs) in particular, has transformed medical imaging by providing efficient diagnostic solutions, and CNNs have demonstrated remarkable capabilities in medical image analysis18.
Motivation and novel contribution
Lung cancer remains one of the leading causes of cancer-related mortality worldwide, with non-small cell lung cancer (NSCLC) accounting for approximately 85% of cases. Early and accurate detection is essential for improving patient outcomes. While recent advances in deep learning have shown great promise in automating lung cancer diagnosis, many state-of-the-art models rely on deep, computationally intensive architectures. These models often demand substantial memory, long training times, and high-end GPUs, limiting their feasibility for real-time clinical deployment, particularly in resource-constrained settings. To overcome these limitations, we propose MiniConvNet, a novel lightweight convolutional neural network specifically designed to balance diagnostic accuracy and computational efficiency. The architecture was developed through systematic hyperparameter optimization and iterative refinement, aiming to maximize sensitivity and specificity while minimizing computational demands.
The motivation behind this lightweight design lies in addressing the practical challenges posed by conventional deep learning models, which typically contain tens of millions of parameters and deep hierarchical layers, resulting in significant resource consumption. In contrast, MiniConvNet employs a compact structure with substantially fewer trainable parameters, leading to faster training, lower memory usage, and reduced power requirements. Despite its compactness, the model achieves performance comparable to, and in some cases surpassing, larger architectures, making it well-suited for real-time clinical applications. By prioritizing computational efficiency without compromising accuracy, MiniConvNet addresses a critical gap in AI-driven lung cancer diagnostics. It offers a scalable, accessible, and clinically relevant solution that facilitates the practical integration of deep learning into medical imaging workflows.
Literature review
Traditional machine learning approaches
Early approaches in lung cancer detection employed handcrafted features and classical classifiers. Aggarwal et al.19 used feature extraction with thresholding and LDA, achieving 84% accuracy. Bhuvaneswari and Therese20 combined Genetic Algorithms with KNN to reach 90% accuracy. Maleki et al.21 converted segmented CT images into numerical data and applied Gradient Boosting, Random Forest, and SVM, achieving 95% accuracy. These methods, while lightweight, were limited in their ability to generalize to complex and high-dimensional image data.
2D CNN-based deep learning models
The rise of convolutional neural networks (CNNs) significantly advanced lung cancer classification. Jin et al.22 used a basic 2D CNN and achieved 84.6% accuracy. Eun et al.23 applied single-view CNNs with non-nodule categorization, reaching 92% accuracy. Alakwaa et al.24 utilized a modified U-Net architecture with 3D CNN components, achieving 86.6%. Teramoto et al.25 applied deep CNNs to cytological images with 71.1% accuracy. Shi et al.26 proposed a deconvolutional CNN framework with VGG16 feature extraction, reporting 82.62% sensitivity. Wang et al.27 applied CNNs to raw CT patches, achieving 92.8% sensitivity with 8 false positives per scan. Huang et al.28 proposed A-CNN, achieving sensitivity of 81.7% and 85.1% at an average of 0.125 and 0.25 false positives per scan.
3D CNN and volumetric CT models
To capture spatial depth in CT data, many studies explored 3D CNNs. Gong et al.29 proposed a 3D squeeze-and-excitation ResNet model with sensitivities of 93.6% and 95.7%. Li et al.30 introduced a 3DCN with IoU self-normalization and maxout units, yielding an FROC score of 0.912. Pezeshk et al.31 developed the DeepMed CAD system, achieving 91% sensitivity and only 2 false positives per scan. Yuan et al.32 built a hierarchical 3D CNN, scoring 0.881 on the LUNA16 dataset. Zhang et al.33 reported 84.4% sensitivity and 83.0% specificity using 3D CNNs on LUNA16 and Kaggle datasets. Zheng et al.34 implemented a dual-stage multiplanar framework, reaching 96.0% sensitivity with 2 false positives per scan.
Ensemble and optimization-driven approaches
Kumar et al.35 proposed a Fully Parallel Systolic CNN (FPSOCNN) combined with SVM, achieving 98% accuracy. Masood et al.36 enhanced mRFCN with multilayer fusion and achieved 98.1% sensitivity and 97.91% accuracy. Mohamed et al.37 introduced the EOSA CNN optimized by evolutionary algorithms, reaching 93.21%. Hanaoka et al.38 proposed HoTPiG features for CAD, achieving 80% sensitivity with 3 false positives. Nasrullah et al.39 integrated clinical biomarkers into CMixNet, reporting 94% sensitivity and 91% specificity. Shimazaki et al.40 used segmentation on chest radiographs and achieved 0.73 sensitivity.
Lightweight and transfer learning models
Transfer learning and efficient architectures have gained attention due to their low computational costs. Agarwal et al.41 employed AlexNet-based transfer learning and achieved 96% accuracy. Salah et al.42 applied EfficientNet-B3 for lung cancer detection with 96% accuracy. Maozhi et al.43 proposed a Res2Net-based U-Net variant with a sensitivity of 83.74%. Wang et al.44 introduced DPCA-Net, integrating dual-path 3D attention, scoring 0.849 in FROC. Wu et al.45 designed the Entropy Degradation Method (EDM) for small cell lung cancer, achieving 77.8% accuracy on a small dataset. Elnakib et al.46 used VGG19 with SVM for early detection, reaching 96.25% accuracy.
While existing studies demonstrate substantial progress in lung cancer detection using deep and hybrid neural networks, the majority of these models suffer from high computational complexity and memory demands. Many rely on large datasets, pretrained architectures, or ensemble methods that are difficult to deploy in real-time clinical environments. Additionally, most prior work emphasizes binary classification (cancer vs. non-cancer) or nodule detection, with limited focus on multiclass classification of NSCLC subtypes, particularly when using small datasets. Our proposed MiniConvNet addresses these limitations by offering a lightweight yet high-performing CNN architecture specifically designed for NSCLC subtype classification. With only 0.5 million parameters and a 6 MB model size, MiniConvNet provides substantial reductions in memory and computational requirements while maintaining strong classification accuracy. This makes it highly suitable for deployment on low-resource hardware such as embedded systems or edge devices in clinical workflows.
Proposed method
The workflow of this study follows a systematic and reproducible sequence of steps, incorporating both established pretrained models (ResNet50, VGG16, VGG19, Inception V3, MobileNetV3Small, EfficientNetV2B0, and ConvNeXtTiny) and the proposed lightweight model, MiniConvNet. The primary objective is to evaluate and compare the effectiveness of these models when trained under identical conditions on the lung cancer datasets. Figure 1 illustrates the overall methodology. The process begins with dataset preparation, including loading and partitioning the data into training, validation, and test subsets. To mitigate overfitting and enhance the variability of the training data, a range of data augmentation techniques are applied. Each model, whether a pretrained architecture or MiniConvNet, is then trained independently using the same experimental settings. Model performance is evaluated using a comprehensive set of metrics, including training, validation, and test loss, accuracy, precision, recall, and F-score. These metrics enable a robust comparative analysis of the models' diagnostic performance and computational efficiency.
Dataset
This study utilizes two publicly available datasets to evaluate the performance and generalizability of the proposed MiniConvNet model. The first dataset consists of computed tomography (CT) scan images of the lungs, capturing various NSCLC subtypes and healthy lung tissue. The second dataset comprises histopathological images of lung tissue, providing an alternative imaging modality for lung cancer diagnosis. Both datasets were preprocessed and partitioned into training, validation, and test sets to ensure robust evaluation. Detailed descriptions of each dataset are provided in the following subsections.
Lung CT scan dataset
The lung cancer CT dataset, obtained from the Kaggle platform47, comprises a diverse collection of CT images categorized into four classes: adenocarcinoma (AC), large cell carcinoma (LCC), squamous cell carcinoma (SCC), and healthy lung tissue. The dataset contains a total of 900 images, distributed as follows:
Adenocarcinoma (AC): 338 images. Tumors located in the left lower lobe of the lung, classified as T2-stage lesions (largest dimension >3 cm and ≤5 cm), often exhibiting limited visceral pleural invasion or associated atelectasis/pneumonitis.
Large cell carcinoma (LCC): 187 images. Tumors situated near the left hilum, classified as T2-stage with evidence of regional lymph node metastasis, involving the main bronchus with associated atelectasis/pneumonitis.
Squamous cell carcinoma (SCC): 260 images. Tumors located at the lung hilum, classified as T1-stage (largest dimension ≤3 cm), also involving the main bronchus and showing regional lymph node metastasis.
Healthy lung: 115 images depicting normal lung tissue without visible abnormalities.
Figures 2, 3, 4 and 5 show representative examples of each class. Figure 2 illustrates AC lesions in the left lower lobe, characterized by T2-stage size and possible visceral pleural involvement. Figure 3 presents an LCC tumor near the left hilum, also T2-stage with lymph node metastasis. Figure 4 depicts SCC at the lung hilum, classified as T1-stage, and Fig. 5 shows a healthy lung CT scan.
Lung histopathological image dataset
To further assess the generalizability of MiniConvNet across imaging modalities, we utilized the publicly available Lung and Colon Cancer Histopathological Images dataset from Kaggle [larxel2020lc]. This dataset comprises 15,000 high-resolution histopathology images evenly distributed across three classes: adenocarcinoma, squamous cell carcinoma, and healthy lung tissue. The balanced and diverse nature of this dataset provides an additional benchmark for evaluating the model’s performance beyond CT imaging.
Dataset preprocessing
Both datasets were divided into three subsets: training (80%), validation (10%), and testing (10%). The CT scan images were preprocessed using the ImageDataGenerator class from Keras, which included rescaling pixel values to the [0,1] range and applying various augmentation techniques. These steps ensured that the datasets were properly organized and diversified, facilitating effective model training and evaluation. The CT dataset comprises only 900 images, which poses a potential risk of overfitting, particularly in deep learning applications where larger datasets are typically preferred. To mitigate this limitation, we applied a comprehensive set of data augmentation strategies during training, including horizontal and vertical flipping, random rotations, brightness adjustments, and zoom transformations. These augmentations effectively increased the variability of the training data, enhancing the model's robustness and generalization to unseen examples. For the histopathology dataset, which contains 15,000 images evenly distributed across three classes, the same train-validation-test split ratio was applied, yielding 12,000 images for training and 1,500 each for validation and testing. All images were resized to a common input resolution and normalized in the same manner as the CT scan dataset to ensure consistency across modalities.
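To make the preprocessing concrete, the following Keras sketch reproduces the described pipeline. The directory layout, the 224 × 224 target size, and the augmentation magnitudes are assumptions for illustration; the paper specifies only rescaling to [0, 1], horizontal and vertical flips, random rotations, brightness adjustments, and zoom.

```python
# Sketch of the described preprocessing, assuming an on-disk 80/10/10 split.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_SIZE = (224, 224)   # assumed input resolution
BATCH_SIZE = 16

train_gen = ImageDataGenerator(
    rescale=1.0 / 255,             # normalize pixel values to [0, 1]
    horizontal_flip=True,          # augmentations named in the paper;
    vertical_flip=True,            # magnitudes below are assumptions
    rotation_range=15,
    brightness_range=(0.8, 1.2),
    zoom_range=0.1,
)
val_gen = ImageDataGenerator(rescale=1.0 / 255)  # rescale only at evaluation time

train_data = train_gen.flow_from_directory(
    "data/train", target_size=IMG_SIZE, batch_size=BATCH_SIZE,
    class_mode="categorical",
)
val_data = val_gen.flow_from_directory(
    "data/val", target_size=IMG_SIZE, batch_size=BATCH_SIZE,
    class_mode="categorical",
)
```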
Deep CNN baseline models
This study evaluates the performance of several well-established pretrained convolutional neural network (CNN) architectures as baselines. Each model was fine-tuned on the lung cancer datasets under identical experimental conditions and compared against the proposed MiniConvNet architecture.
VGG networks
VGG16 and VGG19 are two variants of the Visual Geometry Group (VGG) network, developed at the University of Oxford48. These models are characterized by their deep yet conceptually simple architecture, which stacks multiple convolutional layers with small 3×3 kernels. VGG16 comprises 16 weight layers organized into five convolutional blocks, followed by three fully connected layers, where the first two have 4096 units each and the final layer corresponds to the number of output classes. VGG19 extends this design by adding additional convolutional layers, resulting in a total of 19 weight layers, with deeper third, fourth, and fifth blocks. While VGG models achieve strong performance, their primary drawback lies in their computational and memory demands, due to the large number of trainable parameters, which limits their applicability in resource-constrained environments.
ResNet-50
ResNet-5049 is a 50-layer deep CNN that introduces the concept of residual learning through skip connections, enabling the training of much deeper networks by mitigating the vanishing gradient problem. The architecture consists of convolutional and pooling layers organized into residual blocks, followed by global average pooling and a fully connected SoftMax classification layer. Although ResNet-50 achieves high accuracy and stable optimization, its depth and complexity still entail substantial computational costs during training and inference.
Inception V3
Inception V350 is an enhanced version of Google’s Inception architecture, designed to improve computational efficiency without compromising accuracy. Each inception module combines convolutions with multiple kernel sizes and pooling operations in parallel, enabling the extraction of multi-scale features. Despite its improved efficiency compared to other deep networks, the intricate design and large number of operations in Inception V3 can complicate implementation and optimization, particularly in scenarios requiring real-time processing.
MobileNetV3Small
MobileNetV351 is a family of lightweight convolutional neural networks designed for mobile and embedded vision applications. The Small variant strikes a balance between latency and accuracy for low-resource environments. It combines depthwise separable convolutions with squeeze-and-excitation blocks and a streamlined architecture discovered via neural architecture search.
EfficientNetV2B0
EfficientNetV252 is an improved version of the EfficientNet family that uses Fused-MBConv layers and a progressive training strategy to achieve higher accuracy with faster training. The B0 variant is the smallest model in the V2 series, designed for speed and compactness while still benefiting from compound scaling of depth, width, and resolution.
ConvNeXtTiny
ConvNeXt53 is a modernized convolutional network architecture inspired by transformer design principles but retaining pure convolutions. The Tiny variant adapts the architecture for smaller parameter budgets while keeping many of the improvements from large-scale ConvNeXt models such as large kernel sizes, inverted bottlenecks, and layer normalization.
MiniConvNet model
The proposed MiniConvNet is designed to achieve an optimal balance between computational efficiency and diagnostic accuracy for the detection and classification of NSCLC. The architecture comprises several essential components, each contributing to its lightweight yet effective design. At its core, the model employs convolutional layers as the fundamental building blocks for hierarchical feature extraction from input images. These layers are interleaved with activation functions, pooling operations, and regularization mechanisms to enhance the model's representational capacity while minimizing overfitting and computational overhead, as presented in Eq. (1):

$$y_{i,j} = \sum_{m=1}^{k}\sum_{n=1}^{k} x_{i+m,\,j+n}\, w_{m,n} + b \tag{1}$$

where $y$ is the output feature map, $x$ is the input image patch, $w$ is the convolution filter of size $k \times k$, and $b$ is the bias term. This operation allows the model to learn features by applying filters across the input image, effectively capturing patterns. After the convolutional operation, ReLU is applied to introduce non-linearity, ensuring that all negative values are set to zero, which helps in preventing the vanishing gradient problem and ensures faster convergence during training, as depicted in Eq. (2):

$$f(z) = \max(0, z) \tag{2}$$

where $z$ is the input to a neuron. Max pooling is used after convolutional layers to reduce the spatial dimensions of the feature maps and to control overfitting. For a typical pooling window, the operation selects the maximum value from each window, effectively down-sampling the feature map while retaining the most prominent features, as presented in Eq. (3):

$$p_{i,j} = \max_{(m,n) \in \mathcal{R}_{i,j}} x_{m,n} \tag{3}$$

where $p_{i,j}$ is the output of the pooling operation and $x_{m,n}$ is the input feature map within the pooling window $\mathcal{R}_{i,j}$. Batch normalization is applied to improve stability. This layer normalizes the output of a previous activation layer by adjusting and scaling the activations, as shown in Eqs. (4) and (5):

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}} \tag{4}$$

$$y_i = \gamma \hat{x}_i + \beta \tag{5}$$

where $\mu_B$ and $\sigma_B^2$ are the mean and variance of the batch, $\epsilon$ is a small constant added for numerical stability, and $\gamma$ and $\beta$ are learnable parameters used for scaling and shifting the normalized value.
MiniConvNet architecture
In Fig. 6, the first block of the MiniConvNet architecture commences with an input layer of dimensions $H \times W \times C$, accepting medical images of height $H$, width $W$, and $C$ channels, which is then subjected to a convolutional operation employing 16 filters. The following blocks consist of additional convolutional layers, batch normalization, and pooling layers. A flattening layer transforms the multi-dimensional feature tensor into a single linear vector, which is followed by dense layers for classification. The final output layer uses a SoftMax activation function.
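A minimal Keras sketch of a MiniConvNet-style network following this description is given below. The 16-filter first convolution, batch normalization, pooling, flattening, dense layers, and SoftMax output come from the text; the number of blocks, the later filter counts, kernel sizes, and dense width are illustrative assumptions chosen to land near the reported ~0.5 M parameter budget, not the authors' exact configuration.

```python
from tensorflow.keras import layers, models

def build_miniconvnet(input_shape=(224, 224, 3), num_classes=4):
    """Illustrative MiniConvNet-style model, not the authors' exact layout."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(16, 3, padding="same", activation="relu"),  # 1st block: 16 filters
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Flatten(),                       # linearize the feature tensor
        layers.Dense(32, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
```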
Training and evaluation
The models are trained using the Adam optimizer, which combines the advantages of both AdaGrad and RMSProp. The weight update rule for Adam is shown in Eq. (6):

$$\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon}\,\hat{m}_t \tag{6}$$

where $\theta_t$ are the model parameters at time step $t$, $\eta$ is the learning rate, $\hat{m}_t$ is the first moment estimate (mean of gradients), $\hat{v}_t$ is the second moment estimate (uncentered variance of gradients), and $\epsilon$ is a small constant to prevent division by zero. The model is trained using the categorical cross-entropy loss function, which quantifies the disparity between predicted and actual class probabilities, as presented in Eq. (7):

$$\mathcal{L} = -\sum_{c=1}^{C} y_c \log(\hat{y}_c) \tag{7}$$

where $y_c$ is the true label, $\hat{y}_c$ is the predicted probability for class $c$, and $C$ is the number of classes. The model's performance is evaluated using accuracy, precision, recall, and F-score, as shown in Eqs. (8)–(11):

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{8}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{9}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{10}$$

$$\text{F-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{11}$$

where $TP$ denotes True Positives, $TN$ True Negatives, $FP$ False Positives, and $FN$ False Negatives. Comparisons are made against pretrained models using the same dataset and evaluation metrics.
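Continuing the sketch above, the training objective and evaluation metrics translate directly into code; the optimizer and loss follow the paper, and the helper function is a plain restatement of Eqs. (8)–(11) from per-class confusion counts.

```python
model = build_miniconvnet()                       # from the earlier sketch
model.compile(optimizer="adam",                   # Adam update of Eq. (6)
              loss="categorical_crossentropy",    # cross-entropy of Eq. (7)
              metrics=["accuracy"])

def classification_metrics(tp, tn, fp, fn):
    """Eqs. (8)-(11), computed from one class's confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_score
```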
Experiments and results
This section presents a comprehensive evaluation of the proposed MiniConvNet alongside established pretrained CNN architectures, ResNet50, VGG16, VGG19, Inception V3, MobileNetV3Small, EfficientNetV2B0, and ConvNeXtTiny, on both CT scan and histopathology datasets. Results are analyzed across several dimensions: overall performance, class-wise metrics, confusion matrices, computational efficiency, and training dynamics.
Implementation details
Table 1 summarizes the architectural attributes of the baseline models and the proposed MiniConvNet. The table reports key characteristics, including the number of layers, trainable and non-trainable parameters, and total model size, facilitating a direct comparison of their computational complexity and memory footprint. All models were trained under identical experimental conditions to ensure fair comparison. A batch size of 16 was used for all experiments. Training was conducted using the Adam optimizer, which provides adaptive learning rates and facilitates faster convergence while promoting good generalization. The multi-class classification problem was addressed using the categorical cross-entropy loss function, which is appropriate for scenarios where each instance belongs to a single class and effectively penalizes misclassifications. The number of steps per epoch was determined by dividing the total number of images in the training set by the batch size, ensuring complete and balanced coverage of the training data in each epoch. To further enhance model generalization and avoid overfitting, early stopping was implemented with a patience parameter of 10, monitoring validation loss to terminate training when improvement plateaued.
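These settings can be expressed as the following hedged sketch, reusing the generators and model from the earlier sketches; the epoch cap of 50 and restore_best_weights are assumptions, while the batch size, optimizer, loss, steps-per-epoch rule, and patience of 10 are as stated above.

```python
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor="val_loss",          # stated stopping criterion
                           patience=10,                 # stated patience
                           restore_best_weights=True)   # assumed

history = model.fit(
    train_data,
    validation_data=val_data,
    epochs=50,                                  # assumed upper bound
    steps_per_epoch=train_data.samples // 16,   # training images / batch size
    callbacks=[early_stop],
)
```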
Evaluation on pretrained models
The performance of the pretrained models on the LC dataset varied significantly, reflecting the differences in architecture and optimization strategies. Each model demonstrated a unique trajectory in terms of loss reduction and accuracy improvement, indicating their varying capacities to learn and generalize from the dataset. ResNet 50, with its deep residual connections, showed steady improvement but ultimately achieved moderate accuracy. VGG 19 and VGG 16, both known for their deep convolutional layers, displayed higher peaks in accuracy, particularly with VGG 16 reaching an impressive 0.83 accuracy on the training set. Inception V3, with its complex inception modules, demonstrated strong performance, aligning closely between training and validation accuracy, indicating good generalization. Among the newer lightweight and efficient architectures, MobileNet V3 Small recorded a training accuracy of 0.45 and validation accuracy of 0.51, while EfficientNet V2 B0 achieved 0.42 training accuracy and 0.53 validation accuracy. ConvNeXt Tiny showed relatively stronger results with 0.56 training accuracy and 0.63 validation accuracy. Notably, the MiniConvNet, despite its smaller and more straightforward architecture, achieved near-perfect accuracy on the training set, showcasing its robustness and adaptability to the specific characteristics of the LC dataset.
ResNet 50
The training of the ResNet 50 model concluded after 22 epochs. The initial loss value at the beginning of training was 1.39, gradually improving to achieve its lowest recorded value of 0.937. Similarly, during the validation phase, the loss commenced at 1.29 and progressed to its minimum of 0.919. Regarding accuracy, the training process started with an accuracy score of 0.41, which gradually increased to a peak of 0.52. On the validation set, the accuracy began at 0.37 and reached its highest point at 0.54. Visual representation of these trends is shown in Fig. 7.
VGG 19
The training process of VGG 19 encompassed a span of 30 epochs. Commencing with an initial loss value of 1.18, the network exhibited a downward trajectory, culminating in the achievement of a minimal loss of 0.530. Concurrently, on the validation set, an initial loss of 1.10 later converged to a minimum of 0.551. Throughout the training regimen, accuracy experienced fluctuations. Beginning at 0.44, it ascended and reached a peak of 0.78. Similarly, the validation set commenced at 0.41 and demonstrated a notable ascent, plateauing at a peak accuracy of 0.77. These pivotal training metrics, along with their corresponding validation counterparts, are visually represented in Fig. 8.
VGG 16
The training process of the VGG 16 neural network model concluded after 28 epochs. The initial loss value at the beginning of training was 1.14, reaching its lowest point at 0.45. Similarly, the validation loss commenced at 1.06 and minimized to 0.51. The model’s accuracy commenced at 0.45 and steadily climbed, eventually peaking at 0.83 on the training set. On the validation set, the accuracy began at 0.44 and reached its apex at 0.82. These outcomes are visually represented in Fig. 9.
Inception V3
The training process of Inception V3 extended over 22 epochs. The initial loss commenced at 1.11, ultimately achieving a low of 0.21. Meanwhile, the corresponding validation loss embarked on its journey at 1.89, eventually stabilizing at a minimum of 0.33. The training accuracy, originating at 0.44, ended at 0.903. Simultaneously, the validation accuracy commenced at 0.258 and reached a peak of 0.901, aligning closely with the training accuracy. Visual representation of these trends is in Fig. 10.
MobileNet V3 small
The MobileNet V3 Small model51 was trained for 15 epochs. Training loss decreased from 1.40 to 1.21, while validation loss fell from 1.35 to 1.16. Training accuracy started at 0.27 and reached about 0.45, while validation accuracy began at 0.45 and peaked at 0.51 before early stopping. These trends indicate moderate improvements on both sets, with consistent loss reduction and slight accuracy gains. Visual representation of these trends is in Fig. 11.
EfficientNet V2 B0
EfficientNet V2 B052 was fine-tuned for up to 50 epochs and stopped at epoch 30. Training accuracy rose from about 0.27 to 0.42, while validation accuracy improved from 0.25 to a maximum of 0.53 at epoch 20. Loss declined from 1.39 to around 1.17 on the validation set, reflecting effective learning. Precision increased steadily (above 0.80 on training, near 1.0 on validation), but recall stayed low, indicating conservative predictions. Visual representation of these trends is in Fig. 12.
ConvNeXt Tiny
ConvNeXt Tiny was trained for up to 50 epochs, stopping early at epoch 31. Training accuracy improved from 0.26 with a loss of 1.57 to 0.56 with a loss of 0.86 at epoch 21. Validation accuracy climbed from 0.44 to 0.63 with a loss of 0.82 at the same epoch. Precision remained high (0.97), while recall was modest (0.25). Visual representation of these trends is in Fig. 13.
Evaluation on MiniConvNet
The training process of the MiniConvNet model encompassed 23 epochs. Commencing with an initial loss of 1.16, the model iteratively refined its performance, reaching a remarkable low of 0.0003 in terms of loss. Similarly, on the validation dataset, the initial loss of 2.36 underwent substantial improvement, settling at a minimal value of 0.27. Throughout the training phase, the model exhibited consistent augmentation in accuracy. Commencing at a modest accuracy of 0.56, the MiniConvNet steadily progressed to achieve a perfect accuracy of 1.00 on the training dataset. Likewise, the validation accuracy began at 0.098 and significantly elevated to a peak accuracy of 0.96. The representations of these metrics are encapsulated in Fig. 14.
The model also achieved outstanding performance across the training, validation, and test sets of the histopathological image dataset. On the training set, it reached an accuracy of 0.991 with a corresponding loss of 0.026, indicating effective learning without overfitting. During validation, the model maintained a high accuracy of 0.98 with a validation loss of 0.10. Precision and recall were both 0.98, highlighting strong classification consistency across classes. The visual representation of these trends is in Fig. 15.
These results demonstrate MiniConvNet’s ability to generalize to different imaging modalities and domains. The high performance across both datasets indicates its robustness and potential for broad applicability in cancer subtype classification.
Results
Table 2 presents the performance of all models on the CT dataset. Among the pretrained architectures, Inception V3 achieved the highest accuracy at 82%, followed by VGG16 (77%), VGG19 (69%), ConvNeXt Tiny (58%), ResNet-50 (52%), EfficientNet V2 (52%), and MobileNet V3 (46%). In contrast, MiniConvNet outperformed all baselines, attaining an accuracy of 96% alongside consistently higher precision, recall, and F-score (each 0.96), and the lowest loss (0.15), compared to Inception V3's loss of 0.42. As shown in Fig. 16, MiniConvNet achieved superior accuracy and minimal loss during training. Its evaluation on the unseen test set confirmed strong generalization, with robust performance across all metrics, as detailed in Table 2 and visualized in Fig. 17.
On the histopathology dataset, MiniConvNet achieved 96.8% test accuracy, confirming its ability to generalize across modalities. Tables 3 and 4 provide detailed class-wise metrics. On the CT dataset, MiniConvNet achieved perfect recall (1.00) and precision (1.00) on the Healthy Lung (HL) class, while slightly lower scores were observed for SCC and AC. Notably, the SCC class had more misclassifications, consistent with its clinical difficulty. On the histopathology dataset, MiniConvNet maintained high precision and recall across all four classes, with particularly strong performance on LCC and HL classes.
Figures 18 and 19 illustrate the confusion matrices. On the CT dataset, MiniConvNet achieved near-perfect classification of HL, while misclassifications were primarily between AC and SCC. On the histopathology dataset, very few misclassifications occurred, with most predictions aligning along the diagonal, confirming high specificity and sensitivity.
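For reference, confusion matrices of this kind can be computed from test-set predictions as sketched below; scikit-learn is an assumption (the paper does not name its tooling), and the test generator must not shuffle so that predictions stay aligned with the stored labels.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

test_data = val_gen.flow_from_directory(         # rescale-only generator from earlier
    "data/test", target_size=IMG_SIZE, batch_size=16,
    class_mode="categorical", shuffle=False,     # keep order aligned with .classes
)
y_pred = np.argmax(model.predict(test_data), axis=1)
print(confusion_matrix(test_data.classes, y_pred))  # rows: true class, columns: predicted
```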
Computational efficiency
Table 5 compares training and inference times. MiniConvNet required less training time than most baselines on the CT dataset. On the larger histopathology dataset, it maintained acceptable training time while further improving inference speed to 3.6 ms. This efficiency is critical for clinical deployment.
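One simple way to estimate per-image inference latency of the kind reported in Table 5 is shown below; the paper does not describe its measurement procedure, so the warm-up call, input shape, and run count are assumptions.

```python
import time
import numpy as np

batch = np.random.rand(1, 224, 224, 3).astype("float32")  # one dummy image
model.predict(batch, verbose=0)               # warm-up, excluded from timing
n_runs = 100
start = time.perf_counter()
for _ in range(n_runs):
    model.predict(batch, verbose=0)
elapsed_ms = (time.perf_counter() - start) / n_runs * 1000
print(f"mean inference time: {elapsed_ms:.1f} ms/image")
```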
Comparative evaluation with and without dropout
To assess the impact of regularization on model performance, we compared the results of MiniConvNet before and after introducing dropout layers. The baseline MiniConvNet model, which did not include dropout, was trained for 23 epochs, whereas the modified architecture incorporating dropout between the dense layers converged after only 17 epochs. The dropout variant surpassed the baseline in validation accuracy, reaching 0.9710 with a lower validation loss of 0.1750; these metrics are illustrated in Fig. 20. The comparison highlights that while both architectures performed well, the addition of dropout provided a more balanced trade-off between training and validation performance.
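The dropout variant differs from the earlier architecture sketch only in its classification head, along the following lines; the rate of 0.5 is an assumption, as the paper does not report the value used.

```python
from tensorflow.keras import layers

head_with_dropout = [
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.5),                      # assumed rate; zeroes 50% of units in training
    layers.Dense(4, activation="softmax"),    # four CT classes
]
```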
Cross-validation evaluation
To further validate the robustness of the proposed model and address the concern of potential overfitting raised during the review process, a five-fold cross-validation experiment was conducted. In this approach, the entire dataset was randomly partitioned into five equally sized folds, ensuring class stratification. In each run, four folds were used for training and the remaining fold for validation/testing. This process was repeated five times so that each fold served as the test set once, and the performance metrics were averaged over all runs.
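A sketch of this stratified five-fold protocol using scikit-learn follows; `X` (image array) and `y` (integer class labels) stand in for the loaded dataset, and the per-fold epoch count is an assumption.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# X: (n_samples, 224, 224, 3) float array of images; y: (n_samples,) int labels.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_scores = []
for train_idx, test_idx in skf.split(X, y):
    model = build_miniconvnet()                  # fresh model for each fold
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",  # integer-label form
                  metrics=["accuracy"])
    model.fit(X[train_idx], y[train_idx], epochs=25, batch_size=16, verbose=0)
    _, acc = model.evaluate(X[test_idx], y[test_idx], verbose=0)
    fold_scores.append(acc)
print(f"mean accuracy: {np.mean(fold_scores):.3f}")
```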
Across the five folds, the model consistently achieved high predictive performance. As summarized in Table 6, the test accuracies ranged from 94.1 to 97.4%, with corresponding precision, recall, and F1-scores closely aligned. The average performance across all folds was approximately 96.6% accuracy, 96.6% precision, 96.4% recall, and 96.5% F1-score. Table 6 also presents a full overview of the performance metrics during training.
These results confirm that the model's performance is stable across different data splits and is not limited to a single train–test partition. The small variation between folds reflects natural differences in the underlying data distributions, while the consistently high scores demonstrate that the proposed method generalizes well to unseen data.
Results
This section presents a comprehensive evaluation of the proposed MiniConvNet alongside established pretrained CNN architectures (ResNet50, VGG16, VGG19, Inception V3, MobileNetV3Small, EfficientNetV2B0, and ConvNeXtTiny) on both CT scan and histopathology datasets. Results are analyzed across several dimensions: overall performance, class-wise metrics, confusion matrices, computational efficiency, and training dynamics.
Implementation details
Table 1 summarizes the architectural attributes of the baseline models and the proposed MiniConvNet. The table reports key characteristics, including the number of layers, trainable and non-trainable parameters, and total model size, facilitating a direct comparison of their computational complexity and memory footprint. All models were trained under identical experimental conditions to ensure fair comparison. A batch size of 16 was used for all experiments. Training was conducted using the Adam optimizer, which provides adaptive learning rates and facilitates faster convergence while promoting good generalization. The multi-class classification problem was addressed using the categorical cross-entropy loss function, which is appropriate for scenarios where each instance belongs to a single class and effectively penalizes misclassifications. The number of steps per epoch was determined by dividing the total number of images in the training set by the batch size, ensuring complete and balanced coverage of the training data in each epoch. To further enhance model generalization and avoid overfitting, early stopping was implemented with a patience parameter of 10, monitoring validation loss to terminate training when improvement plateaued.
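As a concrete illustration, the configuration above translates roughly into the following TensorFlow/Keras training setup. This is a minimal sketch under stated assumptions: the directory layout, the 224 × 224 input size, and the `build_miniconvnet` constructor (sketched in the Discussion) are illustrative placeholders, not the authors' released code.

```python
# Minimal sketch of the training setup described above (TensorFlow/Keras assumed).
# Paths, input size, and build_miniconvnet are illustrative placeholders.
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

BATCH_SIZE = 16  # as reported for all experiments

# One subfolder per class (AC, LCC, SCC, HL); 20% held out for validation.
datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)
train_gen = datagen.flow_from_directory(
    "data/ct", target_size=(224, 224), batch_size=BATCH_SIZE,
    class_mode="categorical", subset="training")
val_gen = datagen.flow_from_directory(
    "data/ct", target_size=(224, 224), batch_size=BATCH_SIZE,
    class_mode="categorical", subset="validation")

model = build_miniconvnet(num_classes=4)  # hypothetical constructor

# Adam optimizer and categorical cross-entropy, as stated above.
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Early stopping with patience 10, monitoring validation loss.
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10)

# Steps per epoch = training images // batch size, as described.
model.fit(train_gen,
          steps_per_epoch=train_gen.samples // BATCH_SIZE,
          validation_data=val_gen,
          epochs=50,
          callbacks=[early_stop])
```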
Evaluation on pretrained models
The performance of the pretrained models on the LC dataset varied significantly, reflecting the differences in architecture and optimization strategies. Each model demonstrated a unique trajectory in terms of loss reduction and accuracy improvement, indicating their varying capacities to learn and generalize from the dataset. ResNet 50, with its deep residual connections, showed steady improvement but ultimately achieved moderate accuracy. VGG 19 and VGG 16, both known for their deep convolutional layers, displayed higher peaks in accuracy, particularly with VGG 16 reaching an impressive 0.83 accuracy on the training set. Inception V3, with its complex inception modules, demonstrated strong performance, aligning closely between training and validation accuracy, indicating good generalization. Among the newer lightweight and efficient architectures, MobileNet V3 Small recorded a training accuracy of 0.45 and validation accuracy of 0.51, while EfficientNet V2 B0 achieved 0.42 training accuracy and 0.53 validation accuracy. ConvNeXt Tiny showed relatively stronger results with 0.56 training accuracy and 0.63 validation accuracy. Notably, the MiniConvNet, despite its smaller and more straightforward architecture, achieved near-perfect accuracy on the training set, showcasing its robustness and adaptability to the specific characteristics of the LC dataset.
ResNet 50
The training of the ResNet 50 model concluded after 22 epochs. The initial loss value at the beginning of training was 1.39, gradually improving to achieve its lowest recorded value of 0.937. Similarly, during the validation phase, the loss commenced at 1.29 and progressed to its minimum of 0.919. Regarding accuracy, the training process started with an accuracy score of 0.41, which gradually increased to a peak of 0.52. On the validation set, the accuracy began at 0.37 and reached its highest point at 0.54. Visual representation of these trends is shown in Fig. 7.
VGG 19
The training of VGG 19 spanned 30 epochs. Starting from an initial loss of 1.18, the network improved steadily to a minimum loss of 0.530; on the validation set, the loss fell from 1.10 to a minimum of 0.551. Accuracy fluctuated during training, rising from 0.44 to a peak of 0.78, while validation accuracy climbed from 0.41 and plateaued at 0.77. These training and validation metrics are visually represented in Fig. 8.
VGG 16
The training process of the VGG 16 neural network model concluded after 28 epochs. The initial loss value at the beginning of training was 1.14, reaching its lowest point at 0.45. Similarly, the validation loss commenced at 1.06 and minimized to 0.51. The model’s accuracy commenced at 0.45 and steadily climbed, eventually peaking at 0.83 on the training set. On the validation set, the accuracy began at 0.44 and reached its apex at 0.82. These outcomes are visually represented in Fig. 9.
Inception V3
The training process of Inception V3 extended over 22 epochs. The loss started at 1.11 and ultimately reached a low of 0.21, while the validation loss started at 1.89 and stabilized at a minimum of 0.33. Training accuracy rose from 0.44 to 0.903, and validation accuracy climbed from 0.258 to a peak of 0.901, aligning closely with the training accuracy. Visual representation of these trends is in Fig. 10.
MobileNet V3 small
The MobileNet V3 Small model51 was trained for 15 epochs. Training loss decreased from 1.40 to 1.21, while validation loss fell from 1.35 to 1.16. Training accuracy started at 0.27 and reached about 0.45, while validation accuracy began at 0.45 and peaked at 0.51 before early stopping. These trends indicate moderate improvement on both sets, with consistent loss reduction and slight accuracy gains. Visual representation of these trends is in Fig. 11.
EfficientNet V2 B0
EfficientNet V2 B052 was fine-tuned for up to 50 epochs, with training stopping at epoch 30. Training accuracy rose from about 0.27 to 0.42, while validation accuracy improved from 0.25 to a maximum of 0.53 at epoch 20. Validation loss declined from 1.39 to around 1.17, reflecting effective learning. Precision increased steadily (above 0.80 on training, near 1.0 on validation), but recall stayed low, indicating conservative predictions. Visual representation of these trends is in Fig. 12.
ConvNeXt Tiny
ConvNeXt-Tiny was trained for up to 50 epochs, with early stopping at epoch 31. Training accuracy improved from 0.26 (loss 1.57) to 0.56 (loss 0.86) at epoch 21, while validation accuracy climbed from 0.44 to 0.63 with a loss of 0.82 at the same epoch. Precision remained high (0.97), but recall was modest (0.25). Visual representation of these trends is in Fig. 13.
Evaluation on MiniConvNet
The training of the MiniConvNet model encompassed 23 epochs. From an initial loss of 1.16, the model iteratively refined its performance, reaching a remarkably low training loss of 0.0003. On the validation set, the initial loss of 2.36 improved substantially, settling at a minimum of 0.27. Accuracy increased consistently throughout training: starting at a modest 0.56, MiniConvNet progressed to a perfect 1.00 on the training set, while validation accuracy rose from 0.098 to a peak of 0.96. These metrics are shown in Fig. 14.
The model also performed strongly on the histopathological image dataset. On the training set, it achieved an accuracy of 0.991 with a loss of 0.026, indicating effective learning without overfitting. During validation, it maintained an accuracy of 0.98 with a validation loss of 0.10, and precision and recall were both 0.98, highlighting consistent classification across classes. These trends are visualized in Fig. 15.
These results demonstrate MiniConvNet’s ability to generalize to different imaging modalities and domains. The high performance across both datasets indicates its robustness and potential for broad applicability in cancer subtype classification.
Overall performance
Table 2 presents the performance of all models on the CT dataset. Among the pretrained architectures, Inception V3 achieved the highest accuracy at 82%, followed by VGG16 (77%), VGG19 (69%), ConvNeXt Tiny (58%), ResNet-50 (52%), EfficientNet V2 (52%), and MobileNet V3 (46%). In contrast, MiniConvNet outperformed all baselines, attaining an accuracy of 96% alongside consistently higher precision, recall, and F-score (each 0.96) and the lowest loss (0.15), compared to Inception V3’s loss of 0.42. As shown in Fig. 16, MiniConvNet achieved superior accuracy and minimal loss during training. Its evaluation on the unseen test set confirmed strong generalization, with robust performance across all metrics, as detailed in Table 2 and visualized in Fig. 17.
On the histopathology dataset, MiniConvNet achieved 96.8% test accuracy, confirming its ability to generalize across modalities. Tables 3 and 4 provide detailed class-wise metrics. On the CT dataset, MiniConvNet achieved perfect recall (1.00) and precision (1.00) on the Healthy Lung (HL) class, while slightly lower scores were observed for SCC and AC. Notably, the SCC class had more misclassifications, consistent with its clinical difficulty. On the histopathology dataset, MiniConvNet maintained high precision and recall across all four classes, with particularly strong performance on LCC and HL classes.
Figures 18 and 19 illustrate the confusion matrices. On the CT dataset, MiniConvNet achieved near-perfect classification of HL, while misclassifications were primarily between AC and SCC. On the histopathology dataset, very few misclassifications occurred, with most predictions aligning along the diagonal, confirming high specificity and sensitivity.
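For reference, class-wise tables and confusion matrices of this kind can be reproduced with standard tooling. The snippet below is a sketch assuming scikit-learn, the trained Keras `model` from the earlier sketch, and a test generator built with `shuffle=False`; the class ordering comes from the generator's own mapping.

```python
# Sketch: class-wise metrics and confusion matrix (scikit-learn assumed).
# test_gen must be created with shuffle=False so labels align with predictions.
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

probs = model.predict(test_gen)          # (n_samples, n_classes) probabilities
y_pred = np.argmax(probs, axis=1)
y_true = test_gen.classes                # integer ground-truth labels

names = list(test_gen.class_indices)     # class names ordered by index
print(classification_report(y_true, y_pred, target_names=names))
print(confusion_matrix(y_true, y_pred))  # rows: true class, columns: predicted
```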
Computational efficiency
Table 5 compares training and inference times. MiniConvNet required less training time than most baselines on the CT dataset. On the larger histopathology dataset, it maintained acceptable training time while further improving inference speed to 3.6 ms. This efficiency is critical for clinical deployment.
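Per-image latency figures such as the 3.6 ms in Table 5 are typically measured by averaging repeated forward passes after a warm-up run; the measurement method and input size below are assumptions for the sketch, not details reported by the authors.

```python
# Sketch: measuring mean per-image inference latency after a warm-up pass.
import time
import numpy as np

x = np.random.rand(1, 224, 224, 3).astype("float32")  # input shape assumed

model.predict(x, verbose=0)   # warm-up: excludes one-time graph setup

n_runs = 200
t0 = time.perf_counter()
for _ in range(n_runs):
    model.predict(x, verbose=0)
print(f"mean latency: {(time.perf_counter() - t0) / n_runs * 1e3:.2f} ms")
```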
Comparative evaluation with and without dropout
To assess the impact of regularization on model performance, we compared MiniConvNet before and after introducing dropout layers. The baseline model, without dropout, trained for 23 epochs, whereas the modified architecture with dropout between the dense layers converged in only 17 epochs. The dropout variant surpassed the baseline in validation accuracy, reaching 0.9710 with a lower validation loss of 0.1750; these metrics are illustrated in Fig. 20. The comparison shows that while both architectures performed well, the addition of dropout provided a better-balanced trade-off between training and validation performance.
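A sketch of the kind of change evaluated here is shown below, with dropout inserted between the dense layers; the 0.5 rate and layer widths are illustrative assumptions, since the paper does not specify them in this section.

```python
# Sketch: dense classification head with and without dropout between dense
# layers. The dropout rate (0.5) and layer widths are illustrative assumptions.
from tensorflow.keras import layers

def dense_head(x, num_classes=4, use_dropout=True):
    x = layers.Dense(128, activation="relu")(x)
    if use_dropout:
        x = layers.Dropout(0.5)(x)   # randomly zeroes activations at train time
    x = layers.Dense(64, activation="relu")(x)
    if use_dropout:
        x = layers.Dropout(0.5)(x)
    return layers.Dense(num_classes, activation="softmax")(x)
```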
Cross-validation evaluation
To further validate the robustness of the proposed model and guard against potential overfitting, a five-fold cross-validation experiment was conducted. The entire dataset was randomly partitioned into five equally sized, class-stratified folds. In each run, four folds were used for training and the remaining fold for validation/testing. This process was repeated five times so that each fold served as the test set exactly once, and the performance metrics were averaged over all runs.
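This protocol corresponds to a standard stratified five-fold split. The sketch below assumes the images and integer labels are held in memory and reuses the hypothetical `build_miniconvnet` constructor; the sparse cross-entropy loss is a simplification for integer labels rather than the one-hot setup described earlier.

```python
# Sketch: stratified five-fold cross-validation as described above.
# images: float array (N, H, W, 3); labels: integer class ids of shape (N,).
import numpy as np
from sklearn.model_selection import StratifiedKFold

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_accs = []

for fold, (tr, te) in enumerate(skf.split(images, labels), start=1):
    model = build_miniconvnet(num_classes=4)   # fresh weights each fold
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",  # integer labels
                  metrics=["accuracy"])
    model.fit(images[tr], labels[tr], batch_size=16, epochs=30, verbose=0)
    _, acc = model.evaluate(images[te], labels[te], verbose=0)
    fold_accs.append(acc)
    print(f"fold {fold}: test accuracy = {acc:.3f}")

print(f"mean accuracy: {np.mean(fold_accs):.3f} +/- {np.std(fold_accs):.3f}")
```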
Across the five folds, the model consistently achieved high predictive performance. As summarized in Table 6, test accuracies ranged from 94.1 to 97.4%, with precision, recall, and F1-scores closely aligned; averaged across folds, performance was approximately 96.6% accuracy, 96.6% precision, 96.4% recall, and 96.5% F1-score. Table 6 also presents a full overview of the performance metrics during training.
These results confirm that the model’s performance is stable across different data splits and is not tied to a single train–test partition. The small variation between folds reflects natural differences in the underlying data distributions, while the consistently high scores demonstrate that the proposed method generalizes well to unseen data.
Discussion
In this study, we introduced MiniConvNet, an efficient neural network architecture designed with multiple convolutional layers. Each layer is followed by ReLU activation and pooling, allowing the model to effectively extract features from input images. The network strikes a careful balance between depth and computational efficiency, avoiding the excessive parameters found in deeper models like ResNet-50 or VGG19.
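As a structural illustration only, a MiniConvNet-style stack along these lines can be expressed in Keras as follows; the filter counts, depth, input size, and dense widths are assumptions for the sketch, not the authors' exact configuration.

```python
# Illustrative MiniConvNet-style architecture: stacked Conv2D + ReLU + pooling
# blocks feeding a compact dense head. Filter counts, depth, and input size
# are assumptions for this sketch, not the published configuration.
from tensorflow.keras import layers, models

def build_miniconvnet(input_shape=(224, 224, 3), num_classes=4):
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),   # conv + ReLU ...
        layers.MaxPooling2D(),                     # ... followed by pooling
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
```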
MiniConvNet performed remarkably well during training, achieving 100% accuracy, and maintained strong generalization with accuracies of 96% and 98% across the validation and test sets, demonstrating its ability to handle unseen data from the same dataset. Its lightweight design, with significantly fewer parameters than deeper architectures, makes it an excellent choice where computational resources are limited. This design not only speeds up training but also delivers faster inference, which can be crucial in clinical settings where timely diagnosis is critical. Compared against ResNet50, VGG16, VGG19, Inception V3, MobileNet V3 Small, EfficientNet V2 B0, and ConvNeXt Tiny in terms of accuracy, precision, and recall, MiniConvNet outperformed all of them on the NSCLC dataset. For instance, while Inception V3 achieved 82% test accuracy with a test loss of 0.43, MiniConvNet surpassed these benchmarks by a large margin.
That said, the model is not without challenges. The drop from 100% accuracy in training to 96% in validation and testing suggests it may struggle to generalize to data that differs from the training set. Additionally, despite being more efficient than deeper networks, a risk of overfitting remains, especially given the relatively small dataset. While its computational efficiency is a strength, real-time applications might still demand substantial resources. Future work could focus on enhancing the model’s generalizability, for example by integrating transfer learning techniques or expanding the dataset.
Conclusion
Our research addresses an important goal: detecting and classifying lung cancer from CT images using a CNN with high accuracy. The primary aim was to develop a lightweight model that excels at detecting lung cancer with high precision, and we followed a detailed process to improve classification accuracy. The study utilizes a diverse dataset of CT scan images spanning adenocarcinoma, large cell carcinoma, squamous cell carcinoma, and healthy lung tissue, providing a robust foundation for the work. A key aspect is the comparison of the proposed MiniConvNet against well-known pretrained CNN architectures: ResNet50, VGG16, VGG19, Inception V3, MobileNet V3 Small, EfficientNet V2 B0, and ConvNeXt Tiny. The results highlight MiniConvNet’s consistent superiority, as it outperformed these established models across multiple metrics, including accuracy and F1-score. One standout feature is its remarkably low loss value of 0.05, showcasing its ability to learn and generalize the features critical for accurate classification. The model achieved a training accuracy of 100% and a test accuracy of 96%, reflecting its strong capability to identify patterns in the data. By delivering these results and surpassing established models, our study contributes to advancing medical diagnostics with deep learning. MiniConvNet’s efficiency and high performance also make it a strong candidate for real-time clinical use. Future research should validate the model on larger, more diverse datasets and explore its integration into clinical workflows to further improve diagnostic processes.