Improvement of machine learning models for predicting high-grade subtypes of lung adenocarcinoma based on delta radiomics: A multicenter cohort study.
Cohort
- Sample size (n): 491
- Study design: cohort study
APA
Zhong F, Li T, et al. (2025). Improvement of machine learning models for predicting high-grade subtypes of lung adenocarcinoma based on delta radiomics: A multicenter cohort study. European Journal of Radiology Open, 15, 100699. https://doi.org/10.1016/j.ejro.2025.100699
MLA
Zhong F, et al. "Improvement of machine learning models for predicting high-grade subtypes of lung adenocarcinoma based on delta radiomics: A multicenter cohort study." European Journal of Radiology Open, vol. 15, 2025, pp. 100699.
PMID
41244303
Abstract
[OBJECTIVES] To evaluate the effectiveness of delta radiomics in predicting high-grade components in lung adenocarcinoma and to develop a robust machine learning model for clinical application.
[METHODS] This retrospective multi-center cohort study included lung cancer patients from three hospitals who had pre-surgery CT follow-up scans. Training (n = 491) and validation (n = 210) were performed using cases from Center 1, and testing was conducted using cases from Centers 2 and 3 (n = 92). Radiomic features were extracted from baseline and follow-up CT images, and delta radiomic features were calculated. The LASSO algorithm was used for radiomic feature selection, and rad-score and delta rad-score were constructed. Significant clinical and radiomic features were combined to build the final machine learning model. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), DeLong test, decision curve analysis (DCA), and integrated discrimination improvement (IDI) analysis.
[RESULTS] In the external test cohort, the integrated machine learning model constructed based on clinical features (CTR, smoking status, maximum diameter of the solid component), rad-score, and delta rad-score showed that the random forest model performed the best, with an AUC of 0.91. The random forest model outperformed the clinical model (AUC = 0.80), rad-score (AUC = 0.79), and delta rad-score (AUC = 0.81). DCA and IDI indicated that the random forest model provides superior clinical benefit and improvement.
[CONCLUSION] Delta radiomics significantly aids in identifying high-grade subtypes of lung adenocarcinoma. The integrated machine learning model offers an effective approach for prediction of high-grade components, with potential clinical implications.
[CLINICAL RELEVANCE STATEMENT] This study presents a novel application of delta radiomics to predict high-grade lung adenocarcinoma, which may influence surgical management and improve patient outcomes.
1
Introduction
Lung cancer remains the leading cause of cancer incidence and mortality worldwide [1], with adenocarcinoma representing the most common histological subtype, accounting for nearly half of all cases [2]. The prognosis of lung adenocarcinoma (LUAD) varies significantly across its histological subtypes. The 2015 World Health Organization (WHO) classification identified micropapillary and solid-predominant tumors as poorly differentiated [3]. However, studies have shown that even a small proportion of high-grade subtypes, less than 5 %, significantly affects patient prognosis [4], [5], [6]. Surgical resection is considered the optimal treatment choice for early-stage non-small cell lung cancer. Segmentectomy has been shown to provide clinical benefits comparable to lobectomy [7] and can better preserve patients' lung function, leading to a higher quality of life. However, the presence of high-grade components has been identified as an independent risk factor for local recurrence in patients treated with limited resection [8]. Therefore, preoperative identification of high-grade components in lung adenocarcinoma is crucial for the formulation of patient treatment plans.
At present, the preoperative diagnosis of lung cancer primarily relies on small-sample percutaneous biopsies and computed tomography (CT) imaging examinations. However, biopsy samples are often inadequate for complete histological subtyping [9], and conventional imaging struggles to detect non-predominant high-grade components. Consequently, radiologists frequently recommend 3–6 month follow-up CT for indeterminate cases [10], creating an opportunity for longitudinal analysis.
Radiomics enables the non-invasive transformation of medical images into a large array of quantitative features, which can be mined to explore potential biomarkers. It has been widely researched and applied in the study of various tumors [11]. However, conventional radiomics relies on a single-timepoint 'snapshot' of the tumor, which may not fully capture its dynamic biological aggressiveness and intrinsic heterogeneity. Delta radiomics is an emerging approach that compares radiomic feature changes in the same lesion across different time points [12]. This dynamic perspective may provide a more sensitive and biologically relevant assessment than static features alone, as it directly measures the tumor's growth pattern and phenotypic instability over time. In the field of pulmonary tumors, delta radiomics has been employed to predict the benign or malignant nature of lung nodules [13], assess the invasiveness of part-solid nodules in lung adenocarcinoma [14], and forecast the presence of spread through air spaces (STAS) in lung cancer patients [15]. During the preoperative follow-up of lung adenocarcinoma, the rate of tumor volume change has been shown to be closely associated with histological subtype [16]. This evidence supports the concept that the trajectory of imaging feature evolution holds prognostic information. We hypothesize that extending this concept beyond simple volume to a comprehensive set of quantitative texture and shape features (i.e., delta radiomics) will yield a more powerful tool for identifying high-grade components, potentially capturing subtle, pre-morphological signs of aggression. However, to the best of our knowledge, no studies have yet investigated the utility of delta radiomics, based on preoperative routine follow-up, in identifying high-grade components of primary lung adenocarcinoma.
The aim of this study is to evaluate whether delta radiomics can effectively predict the presence of high-grade components in primary lung adenocarcinoma. Furthermore, we seek to develop and validate a robust machine learning model based on clinical imaging features, delta radiomics, and preoperative radiomics to guide clinical practice.
2
Materials and methods
2.1
Participants
This retrospective study enrolled patients who underwent surgical resection for lung cancer from January 2011 to June 2023 at Center 1 (the First Medical Center of the PLA General Hospital), Center 2 (Zhongnan Hospital of Wuhan University), and Center 3 (the Sixth Medical Center of the PLA General Hospital). The inclusion criteria were as follows: (1) patients who underwent complete surgical resection and were pathologically diagnosed with primary invasive non-mucinous adenocarcinoma of the lung; (2) patients who underwent multiple preoperative thin-slice chest CT scans (1.0–1.5 mm), with the interval between one of these scans and the last preoperative CT scan being 50–200 days. The exclusion criteria were as follows: (1) patients with missing data or poor image quality; (2) a time interval exceeding one month between the last preoperative CT and surgery; (3) patients who had received any form of antitumor treatment prior to surgery. A detailed flowchart of the inclusion and exclusion process is provided in Fig. 1. In cases where a patient had multiple lesions, only the largest lesion was analyzed. All patients from Center 1 were randomly divided into a training cohort and a validation cohort in a 7:3 ratio, while patients from Center 2 and Center 3 were combined as the test cohort. This study was conducted in accordance with the Declaration of Helsinki and was approved by the ethics committees of our institutions. Informed consent was waived due to the retrospective nature of the study.
2.2
Histopathology and clinical data
Pathological subtypes of resected lung adenocarcinomas were collected from pathology reports. Patients with at least 5 % micropapillary, solid, cribriform, and fused gland patterns were included in the high-grade group [5], [6], while the remaining patients were categorized into the non-high-grade group.
The clinical information of the patients was collected through the electronic medical record system, including gender, age, smoking status, presence of chronic obstructive pulmonary disease (COPD), tumor history, and family history of tumors.
2.3
CT imaging acquisition
The CT scans were performed using either the Brilliance 256i CT (Philips, Netherlands), the Optima CT660 (GE, USA), the Siemens SOMATOM Force dual-source CT (Siemens Healthineers, Germany), the SOMATOM Definition scanner (Siemens Healthineers, Germany), or the GE Discovery 750HD scanner (GE, USA). All patients underwent chest CT scans during an inspiratory breath-hold in the supine position. The scanning range was from lung apices to costo-phrenic angles. CT scanning parameters were as follows: voltage 120 kV; automatic tube current; pitch, 0.993; image matrix, 512 × 512; the reconstructed slice thickness, 1.00 mm or 1.25 mm.
For each patient, two CT scans were selected for analysis: the follow-up CT, defined as the last CT examination before operation, and the baseline CT, defined as another scan taken within 50–200 days prior to the follow-up CT.
2.4
CT imaging assessment
The CT images were independently analyzed by two radiologists with 6 and 8 years of experience in thoracic imaging diagnosis, respectively, using a Picture Archiving and Communication System (PACS) without knowledge of the pathological results. In cases of disagreement, a senior radiologist with 20 years of experience served as the arbitrator. All images were reviewed in the lung window setting (width, 1500 HU; level, −600 HU) and the mediastinal window setting (width, 400 HU; level, 40 HU).
Lesion location was classified by lung lobe (right upper, right middle, right lower, left upper including lingula, or left lower). The maximum tumor diameter was defined as the longest dimension of the entire lesion measured in the lung window. The maximum solid component diameter was measured as the longest dimension of the solid portion of the lesion. The consolidation-to-tumor ratio (CTR) was calculated as the ratio of the maximum solid component diameter to the maximum tumor diameter.
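The CTR definition above is a simple ratio, and a minimal Python sketch may make the arithmetic concrete. The function name, the millimetre units, and the input checks (including the assumption that the solid component is never larger than the whole lesion) are illustrative choices, not details taken from the study.

```python
def consolidation_to_tumor_ratio(solid_diameter_mm: float, tumor_diameter_mm: float) -> float:
    """CTR = maximum solid component diameter / maximum tumor diameter."""
    if tumor_diameter_mm <= 0:
        raise ValueError("tumor diameter must be positive")
    # Assumption for this sketch: the solid part cannot exceed the whole lesion.
    if not 0 <= solid_diameter_mm <= tumor_diameter_mm:
        raise ValueError("solid diameter must lie between 0 and the tumor diameter")
    return solid_diameter_mm / tumor_diameter_mm


# Hypothetical part-solid nodule: 9 mm solid component inside an 18 mm lesion.
ctr = consolidation_to_tumor_ratio(9.0, 18.0)
print(ctr)  # 0.5
```

Under these assumptions a pure ground-glass nodule has a CTR of 0 and a fully solid nodule has a CTR of 1.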
Lesion morphology was categorized as either round/oval or irregular. The tumor-lung interface was classified as well-defined (sharp demarcation from adjacent lung parenchyma) or indistinct. Additional features assessed included: cavitation/cyst, spiculation, lobulation, air bronchogram sign, vascular convergence sign, pleural retraction, and mediastinal lymph node enlargement (short-axis diameter >10 mm). Each of these features was recorded as either present or absent.
2.5
Tumor ROI segmentation
A chest radiologist with 6 years of experience in thoracic CT diagnosis imported the original Digital Imaging and Communications in Medicine (DICOM) files into the open-source 3D Slicer software (version 5.2.1; www.slicer.org). Subsequently, manual tumor region of interest (ROI) segmentation was performed on both CT scans for each patient, carefully avoiding large blood vessels, bronchi, and the chest wall. Additionally, a senior radiologist with 20 years of experience reviewed the delineation of all ROIs. To assess the robustness of inter-radiologist ROI segmentation, 30 randomly selected lesions were re-segmented by a different radiologist (8 years’ experience) for intraclass correlation coefficient (ICC) analysis.
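The ICC analysis can be illustrated with a short sketch. The text does not state which ICC form was used, so the code below assumes ICC(2,1) (two-way random effects, absolute agreement, single rater); the feature names and values are made up, and the 0.75 cut-off is the reproducibility threshold the paper applies during feature selection.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one per lesion, each holding one value per reader.
    """
    n = len(ratings)      # number of lesions
    k = len(ratings[0])   # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_rows - ms_err) / denom


# Hypothetical values of two features measured on four lesions by two readers.
paired = {
    "feature_a": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.0]],
    "feature_b": [[1.0, 3.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]],
}
icc_values = {name: icc_2_1(vals) for name, vals in paired.items()}
reproducible = [name for name, value in icc_values.items() if value > 0.75]
print(reproducible)  # ['feature_a']
```

Because this form measures absolute agreement, a constant offset between readers lowers the ICC even when their rankings agree perfectly.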
2.6
Feature extraction and calculation of delta radiomics features
First, the images were resampled to a voxel size of 1 mm × 1 mm × 1 mm. Subsequently, using the radiomics module in 3D Slicer, a total of 1037 features were extracted in accordance with the Image Biomarker Standardization Initiative (IBSI) guidelines [17]. These included 198 first-order features, 14 shape-related features, 264 gray-level co-occurrence matrix (GLCM) features, 176 gray-level run-length matrix (GLRLM) features, 176 gray-level size zone matrix (GLSZM) features, 154 gray-level dependence matrix (GLDM) features, and 55 neighboring gray-tone difference matrix (NGTDM) features.
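Delta radiomics features were calculated using the following formula [15]:
$$\text{Delta radiomics} = \frac{\text{Index}_{\text{follow-up}} - \text{Index}_{\text{baseline}}}{T_{\text{follow-up}} - T_{\text{baseline}}}$$
Delta radiomics essentially represents the slope of radiomic features over time. Here, $\text{Index}_{\text{follow-up}} - \text{Index}_{\text{baseline}}$ denotes the difference in radiomic features between the follow-up and baseline lesions, while $T_{\text{follow-up}} - T_{\text{baseline}}$ represents the time interval between the follow-up and baseline scans.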
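As a rough illustration of this slope definition, the sketch below computes delta features for one lesion. The feature names, values, and scan dates are hypothetical, and measuring the interval in days is an assumption, since the text does not state the time unit.

```python
from datetime import date


def delta_features(baseline, follow_up, baseline_date, follow_up_date):
    """Per-feature slope: (follow-up value - baseline value) / interval in days."""
    interval_days = (follow_up_date - baseline_date).days
    if interval_days <= 0:
        raise ValueError("follow-up scan must be later than the baseline scan")
    return {
        name: (follow_up[name] - baseline[name]) / interval_days
        for name in baseline
    }


# Hypothetical radiomic values for one lesion, scanned 100 days apart.
baseline = {"volume_mm3": 1000.0, "glcm_contrast": 2.0}
follow_up = {"volume_mm3": 1200.0, "glcm_contrast": 2.5}
delta = delta_features(baseline, follow_up, date(2023, 1, 1), date(2023, 4, 11))
print(delta["volume_mm3"])  # 2.0 (mm3 per day)
```

Dividing by the interval makes lesions with different follow-up gaps (50–200 days in this study) comparable, which a plain difference between the two scans would not.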
2.7
Feature selection
To remove the effect of scale differences between radiomic features, min-max normalization was applied. The inter-observer consistency of the radiomic features was assessed using ICCs; features with an ICC greater than 0.75 were considered to demonstrate good reproducibility and were carried forward to univariate analysis. Radiomic features that differed significantly (p < 0.05) between the high-grade and non-high-grade groups in univariate analysis were then subjected to correlation analysis, and features with a correlation coefficient greater than 0.8 were excluded from subsequent analyses. To identify the most discriminative radiomics and delta radiomics features, we applied Least Absolute Shrinkage and Selection Operator (LASSO) regression. The complexity of the LASSO model was controlled by the λ parameter, with larger values imposing greater penalties on feature coefficients and reducing the number of non-zero terms. Features with non-zero coefficients, identified through ten-fold cross-validation with a fixed random seed, were selected. The selected radiomics and delta radiomics features were used to construct the Rad-Score and Delta Rad-Score, respectively, by weighting each feature with its corresponding coefficient and summing the results.
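The last step of this pipeline, turning selected features into a score, can be sketched as follows. The code does not fit LASSO itself; the coefficients, feature names, and values are hypothetical stand-ins for a fitted model's non-zero terms, and learning the min-max bounds from the training rows only is an assumption of this sketch.

```python
def fit_min_max(rows):
    """Learn per-feature (min, max) bounds from training rows (list of dicts)."""
    names = rows[0].keys()
    return {n: (min(r[n] for r in rows), max(r[n] for r in rows)) for n in names}


def apply_min_max(row, bounds):
    """Scale one patient's features to [0, 1] using the learned bounds."""
    scaled = {}
    for name, (low, high) in bounds.items():
        # A constant feature carries no information; map it to 0.
        scaled[name] = 0.0 if high == low else (row[name] - low) / (high - low)
    return scaled


def rad_score(scaled_row, coefficients):
    """Weighted sum of the selected features, as described for the Rad-Score."""
    return sum(coef * scaled_row[name] for name, coef in coefficients.items())


# Hypothetical training values for two selected features.
training_rows = [
    {"feat_a": 0.0, "feat_b": 10.0},
    {"feat_a": 4.0, "feat_b": 30.0},
    {"feat_a": 2.0, "feat_b": 20.0},
]
bounds = fit_min_max(training_rows)

# Hypothetical non-zero LASSO coefficients (features shrunk to zero are dropped).
coefficients = {"feat_a": 2.0, "feat_b": -1.0}

patient = {"feat_a": 1.0, "feat_b": 25.0}
score = rad_score(apply_min_max(patient, bounds), coefficients)
print(score)  # -0.25
```

The same three functions would serve for the Delta Rad-Score, with delta features and their own coefficients as inputs.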
For clinical and conventional imaging features, univariate analyses were first conducted in the training cohort to identify variables that significantly differed between high-grade and non-high-grade subtypes of lung adenocarcinoma. Features with significant differences were then entered into a multivariate logistic regression with forward stepwise selection to identify independent risk factors associated with high-grade subtypes.
2.8
Model construction
Integrated models were developed using five machine learning algorithms, including decision tree (DT), logistic regression (LR), random forest (RF), support vector machine (SVM), and neural network (NN), based on the independent clinical risk factors, Rad-Score, and Delta Rad-Score. Further details are provided in the Supplementary Material.
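Since the modelling details are deferred to the Supplementary Material, the following is only a sketch of how five such classifiers might be compared with scikit-learn. The synthetic data stands in for the five inputs (three clinical factors, Rad-Score, Delta Rad-Score); every hyperparameter, the scaling steps, and the stratified split are assumptions of this example, not the authors' settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data with five predictors (not study data).
X, y = make_classification(
    n_samples=400, n_features=5, n_informative=3, n_redundant=0,
    weights=[0.85, 0.15], random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hypothetical configurations for the five algorithm families named in the text.
models = {
    "DT": DecisionTreeClassifier(max_depth=3, random_state=0),
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "NN": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
    ),
}

# Fit each model on the training split and score it on the held-out split.
val_auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    val_auc[name] = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])

for name, value in sorted(val_auc.items(), key=lambda item: -item[1]):
    print(f"{name}: AUC = {value:.3f}")
```

The ranking on synthetic data says nothing about the study's results; the point is only the shared fit-and-score loop across algorithm families.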
2.9
Statistical analysis
All statistical analyses were performed using SPSS (version 26.0), R (version 3.6.3), and Python (version 3.11.5). Continuous data were expressed as medians with interquartile ranges. Comparisons between two groups were performed using independent samples t-tests or Mann-Whitney U tests, depending on the normality of the data distribution. Categorical variables were represented by counts, with comparisons between groups made using chi-squared tests or Fisher's exact tests.
Receiver operating characteristic (ROC) curves were used to assess model performance, with cut-off values determined by the Youden index in the training cohort. For each model, the area under the curve (AUC), accuracy, sensitivity, and specificity were calculated to quantify model discrimination. The DeLong test was used to compare AUCs between models. Additionally, the integrated discrimination improvement (IDI) metric [18] was computed to assess differences in predictive capability between the models, and decision curve analysis (DCA) was conducted to evaluate clinical utility. A two-tailed p-value of less than 0.05 was considered statistically significant. Bootstrapping (n = 1000) was conducted to calculate the 95 % confidence interval (CI).
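Three of these steps (AUC, the Youden cut-off, and a bootstrap confidence interval) can be sketched in plain Python. The labels and scores are invented, the percentile method is an assumed choice of bootstrap interval, and the DeLong test, IDI, and DCA are not covered here.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative (ties = 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(labels, scores):
    """Threshold t maximizing J = sensitivity + specificity - 1 (positive if score >= t)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= t)
        tn = sum(1 for l, s in zip(labels, scores) if l == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def bootstrap_auc_ci(labels, scores, n_boot=1000, seed=0):
    """Percentile 95 % CI for the AUC from n_boot resamples with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled = [labels[i] for i in idx]
        if 0 < sum(resampled) < n:  # skip resamples missing one class
            stats.append(auc(resampled, [scores[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Hypothetical predicted probabilities for ten patients (1 = high-grade).
labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.7, 0.75, 0.9]

auc_value = auc(labels, scores)
cutoff, j_stat = youden_cutoff(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(round(auc_value, 3), cutoff, round(j_stat, 3))  # 0.833 0.4 0.667
```

Consistent with the text, the cut-off would be derived once on the training cohort and then applied unchanged to the validation and test cohorts.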
Materials and methods
2.1
Participants
This retrospective study enrolled patients who underwent surgical resection for lung cancer from January 2011 to June 2023 at the Center 1 (the First Medical Center of the PLA General Hospital), Center 2 (Zhongnan Hospital of Wuhan University), and Center 3 (the Sixth Medical Center of the PLA General Hospital). The inclusion criteria were as follows: (1) patients who underwent complete surgical resection and were pathologically diagnosed with primary invasive non-mucinous adenocarcinoma of the lung. (2) patients who underwent multiple preoperative thin-slice chest CT scans (1.0–1.5 mm), with the time interval between one of these CT scans and the last preoperative CT scan being within 50–200 days. The exclusion criteria were as follows: (1) patients with missing data or poor image quality; (2) a time interval exceeding one month between the last preoperative CT and surgery; (3) patients who had received any form of antitumor treatment prior to surgery. A detailed flowchart illustrating the inclusion and exclusion process of patients was provided in Fig. 1. In cases where a patient had multiple lesions, only the largest tumor lesion was used for analysis. All patients from the Center 1 were randomly divided into a training cohort and a validation cohort in a 7:3 ratio, while patients from the Center 2 and Center 3 were combined as a test cohort. This study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of our institutions. Informed consent was waived due to the retrospective nature of the study.
2.2
Histopathology and clinical data
Pathological subtypes of resected lung adenocarcinomas were collected from pathology reports. Patients with at least 5 % micropapillary, solid, cribriform, and fused gland patterns were included in the high-grade group [5], [6], while the remaining patients were categorized into the non-high-grade group.
The clinical information of the patients was collected through the electronic medical record system, including gender, age, smoking status, presence of chronic obstructive pulmonary disease (COPD), tumor history, and family history of tumors.
2.3
CT imaging acquisition
The CT scans were performed using either the Brilliance 256i CT (Philips, Netherlands), the Optima CT660 (GE, USA), the Siemens SOMATOM Force dual-source CT (Siemens Healthineers, Germany), the SOMATOM Definition scanner (Siemens Healthineers, Germany), or the GE Discovery 750HD scanner (GE, USA). All patients underwent chest CT scans during an inspiratory breath-hold in the supine position. The scanning range was from lung apices to costo-phrenic angles. CT scanning parameters were as follows: voltage 120 kV; automatic tube current; pitch, 0.993; image matrix, 512 × 512; the reconstructed slice thickness, 1.00 mm or 1.25 mm.
For each patient, two CT scans were selected for analysis: the follow-up CT, defined as the last CT examination before operation, and the baseline CT, defined as another scan taken within 50–200 days prior to the follow-up CT.
2.4
CT imaging assessment
The CT images were independently analyzed by two radiologists with 6 and 8 years of experience in thoracic imaging diagnosis, respectively, using a Picture Archiving and Communication System (PACS) without knowledge of the pathological results. In cases of disagreement, a senior radiologist with 20 years of experience served as the arbitrator. All images were reviewed in the lung window setting (width, 1500 HU; level, −600 HU) and the mediastinal window setting (width, 400 HU, level, 40 HU).
Lesion location was classified by lung lobe (right upper, right middle, right lower, left upper including lingula, or left lower). The maximum tumor diameter was defined as the longest dimension of the entire lesion measured in the lung window. The maximum solid component diameter was measured as the longest dimension of the solid portion of the lesion. The consolidation-to-tumor ratio (CTR) was calculated as the ratio of the maximum solid component diameter to the maximum tumor diameter.
Lesion morphology was categorized as either round/oval or irregular. The tumor-lung interface was classified as well-defined (sharp demarcation from adjacent lung parenchyma) or indistinct. Additional features assessed included: cavitation/cyst, spiculation, lobulation, air bronchogram sign, vascular convergence sign, pleural retraction, and mediastinal lymph node enlargement (short-axis diameter >10 mm). Each of these features was recorded as either present or absent.
2.5
Tumor ROI segmentation
A chest radiologist with 6 years of experience in thoracic CT diagnosis imported the original Digital Imaging and Communications in Medicine (DICOM) files into a commercial 3D Slicer software (version 5.2.1; www.slicer.org). Subsequently, manual tumor region of interest (ROI) segmentation was performed on both CT scans for each patient, carefully avoiding large blood vessels, bronchi, and the chest wall. Additionally, a senior radiologist with twenty years of experience reviewed the delineation of all ROIs. To assess the robustness of inter-radiologist ROI segmentation, 30 randomly selected lesions were re-segmented by a different radiologist (8 years’ experience) for intraclass correlation (ICC) analysis.
2.6
Feature extraction and calculation of delta radiomics features
Firstly, the images were resampled (the voxel size was resampled to 1 mm×1 mm×1 mm). Subsequently, using the radiomics module in 3D Slicer, a total of 1037 features, in accordance with the Image Biomarker Standardization Initiative (IBSI) guidelines [17], were extracted. These included 198 first-order features, 14 shape-related features, 264 gray-level co-occurrence matrix (GLCM) features, 176 gray-level run-length matrix (GLRLM) features, 176 gray-level size zone matrix (GLSZM) features, 154 gray-level dependence matrix (GLDM) features, and 55 neighboring gray-tone difference matrix (NGTDM) features.
Delta radiomics features were calculated using the following formula [15]:
Delta radiomics essentially represents the slope of radiomic features over time. Here, Index[follow-up] – Index[baseline] denotes the difference in radiomic features between the follow-up and baseline lesions, while T[follow-up] - T[baseline] represents the time interval between the follow-up and baseline scans.
2.7
Feature selection
To eliminate the impact of dimensionality differences between radiomic features, the min-max standardization method was applied. The inter-observer consistency of the radiomic features was assessed using the ICCs, with features having an ICC greater than 0.75 considered to demonstrate good reproducibility and thus included in further univariate analysis. The results of the univariate analysis showed that radiomic variables with statistically significant correlations (p < 0.05) between the high-grade and non-high-grade groups were subjected to correlation analysis. Features with a correlation coefficient greater than 0.8 were excluded from subsequent analyses. To identify the most discriminative radiomics and delta radiomics features, we applied the Least Absolute Shrinkage and Selection Operator (LASSO) regression method. The complexity of LASSO regression was controlled using the λ parameter, with larger values imposing greater penalties on feature coefficients and reducing the number of non-zero terms. Significant features were selected based on non-zero coefficients identified through ten-fold cross-validation with a fixed random seed. The selected radiomics and delta radiomics features were used to construct the Rad-Score and Delta Rad-Score, respectively, by weighting each feature with its corresponding coefficient and summing the results.
For clinical and conventional imaging features, univariate analyses were first conducted in the training cohort to identify variables that significantly differed between high-grade and non-high-grade subtypes of lung adenocarcinoma. Features with significant differences were then entered into a multivariate logistic regression with forward stepwise selection to identify independent risk factors associated with high-grade subtypes.
2.8
Model construction
Integrated models were developed using five machine learning algorithms, including decision tree (DT), logistic regression (LR), random forest (RF), support vector machine (SVM), and neural network (NN), based on the independent clinical risk factors, Rad-Score, and Delta Rad-Score. Further details are provided in the Supplementary Material.
2.9
Statistical analysis
All statistical analyses were performed using SPSS (version 26.0), R (version 3.6.3), and Python (version 3.11.5). Continuous data were expressed as medians with interquartile ranges. Comparisons between two groups were performed using independent samples t-tests or Mann-Whitney U tests, depending on the normality of the data distribution. Categorical variables were represented by counts, with comparisons between groups made using chi-squared tests or Fisher's exact tests.
Receiver operating characteristic (ROC) curves were used to assess model performance, with cut-off values determined by the Youden index in the training cohort. For each model, the area under the curve (AUC), accuracy, sensitivity, and specificity were calculated to quantify model discrimination. The DeLong test was used to compare AUCs between models. Additionally, the integrated discrimination improvement (IDI) metric [18] was computed to assess differences in predictive capability between models, and decision curve analysis (DCA) was conducted to evaluate clinical utility. A two-tailed p-value of less than 0.05 was considered statistically significant. Bootstrapping (n = 1000) was used to calculate the 95 % confidence interval (CI).
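As an illustration of the metrics named above, the following sketch computes an AUC, a Youden-index cut-off, and a percentile bootstrap confidence interval using only the Python standard library. It is not the authors' code: the labels and scores are toy values, the percentile method is one common choice of bootstrap interval and is assumed here, and the function names are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative
    case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(labels, scores):
    """Threshold t maximising J = sensitivity + specificity - 1,
    where a case is predicted positive when score >= t."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC; resamples that contain only
    one class are redrawn because the AUC is undefined for them."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int(n_boot * alpha / 2)]
    hi = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi


# Toy data (1 = high-grade, 0 = non-high-grade); values are illustrative.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.45, 0.6, 0.7, 0.9]

auc_value = auc(labels, scores)
cutoff, j_stat = youden_cutoff(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
```

Following the text, the cut-off would be derived once on the training cohort and then applied unchanged to the validation and test cohorts.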
3
Results
3.1
Patient characteristics
A total of 793 patients were ultimately included in this study, comprising 491 patients in the training cohort (425 non-high-grade LUAD and 66 high-grade LUAD, representing 13.44 % high-grade subtypes) and 210 patients in the validation cohort (182 non-high-grade LUAD and 28 high-grade LUAD, 13.33 %), along with an external test cohort of 92 patients (73 non-high-grade LUAD and 19 high-grade LUAD, 20.65 %). The external test cohort included patients from Center 2 (48 non-high-grade and 10 high-grade) and Center 3 (25 non-high-grade and 9 high-grade). The baseline characteristics of the training, validation, and test cohorts are presented in Table 1.
In the training cohort, univariable analysis revealed significant differences between high-grade and non-high-grade LUAD regarding gender, age, smoking status, volume, maximum diameter of the solid component, maximum diameter of the tumor, CTR, tumor-lung interface, spiculation, air bronchogram sign, vascular convergence sign, pleural retraction, and mediastinal lymph node enlargement (p < 0.05, Table 2). Subsequent multivariable logistic regression analysis indicated that CTR (OR, 8.86; 95 % CI, 2.45–32.02), smoking status (OR, 2.18; 95 % CI, 1.12–4.23), and maximum diameter of the solid component (OR, 1.10; 95 % CI, 1.03–1.15) were independent risk factors for the high-grade group (p < 0.05, Table 2). These three clinical features were then used to construct the clinical model and the integrated models.
3.2
Feature selection
Among the 1037 radiomic features extracted, 1025 features with an ICC > 0.75 were included for further analysis (median [IQR] ICC: 0.98 [0.96, 1.00]). Based on the training cohort, univariate analysis identified 731 radiomic variables that exhibited statistically significant differences between the two groups. Subsequently, a correlation analysis was performed to reduce dimensionality, resulting in 111 radiomic features with low correlation. LASSO regression (Fig. 2) was then applied to select 10 optimal radiomic features, which were used to construct the Rad-Score (Supplementary Table S1). Similarly, univariate analysis identified 278 delta radiomic features with p < 0.05. Further correlation analysis selected 86 features, and LASSO regression ultimately identified 22 optimal delta radiomic features, which were used to construct the Delta Rad-Score (Supplementary Table S2). To evaluate the consistency of the Rad-Score across different scanning platforms, we compared scores from the three participating centers. In the high-grade group, Rad-Scores were 0.14 ± 0.16 for Center 1, 0.13 ± 0.16 for Center 2, and 0.16 ± 0.19 for Center 3 (p = 0.097). Similarly, in the non-high-grade group, Rad-Scores were 0.10 ± 0.13 for Center 1, 0.12 ± 0.14 for Center 2, and 0.07 ± 0.11 for Center 3 (p = 0.194). The absence of statistically significant differences suggests that the Rad-Score is robust across different CT scanners. In each cohort, tumors in the high-grade group exhibited significantly higher Rad-Score and Delta Rad-Score values (all p < 0.001, Fig. 3).
3.3
Model performance
The AUCs of the clinical model, Rad-Score, Delta Rad-Score, and five integrated models are detailed in Table 3 and Fig. 4. The clinical model achieved an AUC of 0.80 [0.69, 0.90] in the external test cohort. The discriminative ability of the Rad-Score and Delta Rad-Score was comparable to that of the clinical model, with AUCs of 0.79 [0.67, 0.90] and 0.81 [0.68, 0.91], respectively; neither differed significantly from the clinical model (p = 0.88 and 0.89, Fig. 5). The integrated models demonstrated strong and robust predictive performance in the external test cohort. The RF model exhibited the highest AUC of 0.91 [0.85, 0.97], which was significantly higher than that of the SVM model (AUC = 0.80 [0.69, 0.90]; p = 0.0026), NN model (AUC = 0.80 [0.68, 0.90]; p = 0.0019), and LR model (AUC = 0.80 [0.68, 0.90]; p = 0.0029), and numerically but not significantly higher than that of the DT model (AUC = 0.88 [0.79, 0.96]; p = 0.2686). The model performance in the training cohort is shown in Supplementary Table S3.
DCA revealed that the RF model provided greater overall net benefit in distinguishing the high-grade and non-high-grade groups than the clinical model or the radiomic models (Supplementary Figure S1). The IDI analysis indicated that the RF and DT models generally showed positive improvement compared with the other models (Supplementary Figure S2).
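For readers unfamiliar with the two metrics, the sketch below shows the standard definitions in standard-library Python: IDI as the change in discrimination slope (mean predicted probability in events minus mean in non-events) between two models, and the DCA net benefit at a threshold probability p_t as TP/n − (FP/n) × p_t/(1 − p_t). The labels and probabilities are toy values chosen for illustration, and the function names are hypothetical.

```python
from statistics import mean


def idi(labels, p_old, p_new):
    """Integrated discrimination improvement of p_new over p_old:
    the difference between the two models' discrimination slopes."""
    def slope(probs):
        events = [p for y, p in zip(labels, probs) if y == 1]
        nonevents = [p for y, p in zip(labels, probs) if y == 0]
        return mean(events) - mean(nonevents)
    return slope(p_new) - slope(p_old)


def net_benefit(labels, probs, threshold):
    """Net benefit at one threshold probability (0 < threshold < 1);
    a case counts as test-positive when its probability >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(labels, probs) if y == 0 and p >= threshold)
    return tp / n - fp / n * threshold / (1 - threshold)


# Toy data (1 = high-grade); probabilities are illustrative only.
labels = [1, 1, 0, 0, 0]
p_old = [0.6, 0.4, 0.5, 0.3, 0.2]   # e.g. a reference model
p_new = [0.8, 0.6, 0.4, 0.2, 0.1]   # e.g. an integrated model

idi_value = idi(labels, p_old, p_new)
nb_old = net_benefit(labels, p_old, 0.5)
nb_new = net_benefit(labels, p_new, 0.5)
nb_treat_all = net_benefit(labels, [1.0] * len(labels), 0.5)
```

A decision curve is obtained by repeating the net-benefit calculation over a range of thresholds and plotting each model against the treat-all and treat-none (net benefit 0) strategies.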
4
Discussion
The high-grade subtypes of LUAD are independent prognostic factors for patients undergoing surgical resection. Preoperative identification of high-grade subtypes can provide more targeted therapeutic strategies for patients. To the best of our knowledge, this is the first study to use delta radiomics to predict the high-grade subtypes of LUAD. Our study confirms that the delta radiomics score, calculated based on differences in tumor imaging features from short-term follow-up, plays a significant role in identifying high-grade subtypes (test set, AUC = 0.81). The integrated model, incorporating features such as the CTR, smoking status, maximum diameter of the solid component, radiomics scores, and delta radiomics scores, effectively and robustly identifies high-grade subtypes preoperatively and non-invasively (random forest classifier, training cohort, AUC = 1.00; validation cohort, AUC = 0.92; test cohort, AUC = 0.91), thereby providing valuable insights for clinical decision-making. Specifically, a prediction indicating a high-grade subtype would support opting for an anatomical resection over a limited resection to achieve wider margins and minimize the risk of local recurrence. Conversely, a low prediction score could bolster confidence in pursuing lung-preserving segmentectomy in appropriately selected cases, balancing oncological safety with functional preservation. Thus, our model offers valuable preoperative insights that can be integrated with other clinical factors to personalize surgical management in LUAD.
Radiomics allows for the high-throughput extraction of information from imaging data, enabling the quantification of features that may be imperceptible to radiologists. Previous studies have shown that machine learning algorithms based on radiomics are promising tools for predicting the high-grade subtypes of LUAD, with good predictive performance demonstrated in independent external test sets (AUC range from 0.730 to 0.860) [19], [20], [21], [22], [23]. However, these studies used image data from a single time point only and therefore could not exploit the imaging changes caused by tumor progression over time. Tumor growth reflects the proliferation of tumor cells and contributes to assessing the potential invasiveness of the tumor [24], [25]. Malignant nodules exhibit more dynamic radiomic changes compared to benign ones [26]. Sohee et al. demonstrated that different histological subtypes of primary LUAD exhibit varying volume doubling times (VDT) [16]. Jung et al. further revealed that histological subtypes are independently associated with tumor volume and mass doubling times (MDT), which can be used to distinguish LUAD tumors with predominantly solid/micropapillary subtypes from other subtypes, with AUC values of 0.791 and 0.795, respectively [27]. Compared to these studies, our delta radiomics approach utilizes a broader spectrum of image information, not just tumor volume. In a larger cohort than previous similar studies, our model, integrating easily obtainable clinical features, preoperative radiomics, and delta radiomics features, achieved an AUC of 0.91 in the external test cohort, outperforming traditional radiomics-based models and those relying solely on tumor VDT and MDT.
For highly suspicious pulmonary nodules detected during imaging examinations, radiologists often recommend follow-up in 3–6 months [28]. However, for various reasons, patients may attend their follow-up appointments slightly earlier or later than scheduled. Consequently, we limited the preoperative follow-up period to 50–200 days in this study. Although the delta radiomics features we computed represent the slope of feature variation, tumors are inherently heterogeneous and their growth is usually nonlinear [29]. Therefore, we believe that limiting the preoperative follow-up time more closely approximates the actual clinical environment.
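The idea of a delta feature as a slope over a bounded follow-up interval can be made concrete with a small sketch. It assumes the simplest possible definition, the difference between follow-up and baseline values divided by the interval in days; the exact formula used in the study is defined in its Methods and may differ. The function names and feature values are hypothetical.

```python
def within_window(interval_days, lo=50, hi=200):
    """True if the scan interval falls inside the 50-200 day window."""
    return lo <= interval_days <= hi


def delta_feature(baseline, follow_up, interval_days):
    """Per-day rate of change of one radiomic feature between two scans
    (assumed definition: simple difference divided by elapsed days)."""
    if interval_days <= 0:
        raise ValueError("follow-up scan must come after the baseline scan")
    return (follow_up - baseline) / interval_days


# Two hypothetical patients with the same absolute change in a feature
# but different follow-up intervals: the slopes differ by a factor of two.
slope_short = delta_feature(0.40, 0.55, 75)
slope_long = delta_feature(0.40, 0.55, 150)

eligible = [days for days in (30, 75, 150, 260) if within_window(days)]
```

Dividing by the interval makes patients with different follow-up timings comparable, while the window keeps that interval short enough that a straight-line approximation of change remains plausible.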
Our study also validates the importance of conventional clinical and imaging features. Tumor size is generally considered to have a significant correlation with malignancy risk and is a key factor in the follow-up management of lung nodules [30]. Furthermore, previous studies have indicated that high-grade LUAD subtypes often present specific imaging characteristics preoperatively, including higher CTR and larger tumor diameter [31], [32], which were confirmed in this study as independent predictive factors. Smoking has been identified as a major epidemiological risk factor for lung cancer, and an epidemiological study based on East Asian populations found that smokers are more likely to develop advanced ADC compared to non-smokers [33]. In our study, smoking history was an independent predictor for high-grade LUAD, possibly because tobacco exposure accelerates LUAD progression [34], [35].
There are several limitations in this study. First, it is a retrospective analysis, and external validation was conducted at two centers with small sample sizes. The proportion of high-grade subtypes varied across centers, a discrepancy that may reflect underlying selection bias. Future research should include larger multi-center cohorts and employ prospective designs. Second, ROIs were manually segmented by radiologists, which inevitably introduces some inter-observer variability. We aim to explore suitable automated segmentation methods in future studies. Lastly, variations in scanning devices, reconstruction parameters, and image segmentation techniques can impact the stability of radiomic features extracted from lung nodules [36], [37], [38]. Future research will further explore the application of robust AI methods in this field.
In conclusion, delta radiomics features encapsulate the overall changes in tumor heterogeneity during preoperative follow-up and provide valuable information for predicting high-grade components of LUAD. Machine learning models that integrate clinical, imaging, radiomics, and delta radiomics features can effectively predict the presence of high-grade components in LUAD and demonstrate promising performance in external validation sets, thereby providing important insights for clinical management strategies.
CRediT authorship contribution statement
Feiyang Zhong: Writing – original draft, Visualization, Validation, Software, Methodology, Formal analysis, Data curation, Conceptualization. Ting Li: Validation, Software, Data curation. Wenping Li: Methodology, Investigation, Data curation. Lijun Wu: Software, Methodology, Data curation. Pengju Zhang: Resources, Data curation. Yu pengxin: Validation, Software, Methodology. Yuan Fang: Methodology, Data curation. Meiyan Liao: Writing – review & editing, Validation. Zhao Shaohong: Writing – review & editing, Validation, Supervision, Resources, Investigation, Conceptualization.
Ethics approval and consent to participate
This study was conducted in accordance with the Declaration of Helsinki and was approved by the Ethics Committee of the PLA General Hospital and Zhongnan Hospital of Wuhan University. Informed consent was waived due to the retrospective nature of the study.
Funding
Capital's Funds for Health Improvement and Research (CFH 2022-2-5023).
Declaration of Competing Interest
The authors declare that they have no competing interests.