A comprehensive analysis identified BECN1 as potential diagnostic biomarker for homotypic cell-in-cell in non-small cell lung cancer through integrated bioinformatics and clinical validation approaches.
APA
Liu X, Guo R, et al. (2026). A comprehensive analysis identified BECN1 as potential diagnostic biomarker for homotypic cell-in-cell in non-small cell lung cancer through integrated bioinformatics and clinical validation approaches. Discover oncology, 17(1). https://doi.org/10.1007/s12672-026-04541-z
MLA
Liu X, et al. "A comprehensive analysis identified BECN1 as potential diagnostic biomarker for homotypic cell-in-cell in non-small cell lung cancer through integrated bioinformatics and clinical validation approaches." Discover oncology, vol. 17, no. 1, 2026.
PMID
41617944
Abstract
[BACKGROUNDS] Non-small cell lung cancer (NSCLC) is an aggressive malignant tumor characterized by early recurrence and poor prognosis. Homotypic cell-in-cell (HoCIC) structures are significantly associated with adverse outcomes in multiple tumors and serve as valuable indicators for patient outcome assessment. However, current HoCIC diagnosis relies primarily on manual microscopic observation and lacks standardized detection biomarkers and methodologies, which may introduce bias into research findings. Therefore, this study aims to identify diagnostic markers for HoCIC in NSCLC, laying the foundation for further research into the biological roles and mechanisms of HoCIC.
[METHODS] Kaplan‒Meier curves and log-rank tests were used to investigate the relationship between HoCIC and prognosis. Bioinformatics analysis of HoCIC-related NSCLC gene expression data from the Gene Expression Omnibus (GEO) database revealed differentially expressed HoCIC-related genes (HoCICDEGs) between tumor tissues and normal tissues. We identified an overlapping HoCIC hub gene, BECN1, among the HoCICDEGs and autophagy-related genes (ARGs). The expression and biological functions of BECN1 were analysed via The Cancer Genome Atlas (TCGA) database. The Kaplan‒Meier plotter, TIMER2.0, cBioPortal, and GSCA public databases were subsequently used to investigate the prognosis, immune infiltration, genetic alterations, and drug sensitivity associated with BECN1. Finally, clinical NSCLC samples were collected for immunohistochemical experiments to validate BECN1 expression and its diagnostic value for HoCIC.
[RESULTS] HoCIC was significantly correlated with poor overall survival (OS) and disease-free survival (DFS). We identified BECN1 as a core gene associated with HoCIC in NSCLC, which is highly expressed in tumor tissues and is correlated with unfavourable prognosis. BECN1 is correlated with the mitotic spindle, G2M checkpoint, and MYC pathways, suppresses immune cell infiltration, and is sensitive to most anticancer drugs. In our validated NSCLC cohort, BECN1 protein was highly expressed in tumor tissues and demonstrated a significant association with HoCIC, serving as an independent risk factor for HoCIC. The HoCIC prediction model constructed on the basis of BECN1 demonstrated favourable diagnostic capability, discriminatory power, and clinical benefit.
[CONCLUSIONS] In summary, this study identified BECN1 as a diagnostic biomarker associated with HoCIC in NSCLC, providing a strong foundation for improving diagnostic and research strategies related to this phenomenon.
Introduction
Lung cancer ranks as the second most commonly diagnosed malignant tumor among new cancer cases worldwide and ranks first in terms of cancer-related mortality [1], with non-small cell lung cancer (NSCLC) accounting for approximately 85% of lung cancer cases [2]. Early-stage screening and diagnostic techniques have continued to improve, surgical treatment of lung cancer has advanced, and targeted therapy and immunotherapeutic drugs have been widely adopted in recent years [3, 4]. Although the incidence and mortality rates of lung cancer show a slightly declining trend, the 5-year survival rate for lung cancer patients remains concerning [5].
Cell-in-cell (CIC) refers to a cellular structure formed when a living cell (target cell) is engulfed by another living cell (host cell), which is observable in both physiological and pathological contexts [6]. CIC can be classified into homotypic CIC (HoCIC) and heterotypic CIC (HeCIC) on the basis of the participating cell types. The former denotes target cells and host cells being of identical type, whereas the latter indicates different cell types between the two. Entosis and cannibalism represent distinct processes that form HoCIC: entosis involves active invasion of target cells into host cells, while cannibalism involves active engulfment of target cells by host cells [7]. HoCIC is widely observed in various tumor tissues [8] and has been proven to be correlated with prognosis in multiple cancer types [9]. However, the current HoCIC assessment relies primarily on manual counting and lacks standardized evaluation methods. This introduces subjectivity that may cause significant variability and lead to inconsistent research outcomes. Additionally, identifying HoCIC within dense tumor cell populations via microscopy requires substantial time and patience. Given the heavy workload of pathologists, systematically examining the HoCIC in every case remains impractical. Therefore, exploring diagnostic biomarkers for HoCIC is critically urgent.
Target cells within HoCIC structures are typically degraded through a lysosome-dependent mechanism [10]. The degradation products can be recycled by the engulfing cells. Therefore, under hostile conditions such as nutrient deprivation, ischemia, hypoxia, and low pH, tumor cells can engulf sibling cells for degradation and digestion. The recycled metabolites serve as nutrients to promote tumor cell survival and proliferation [7, 11, 12]. This phenomenon may be related to tumor metabolic reprogramming, representing a survival mechanism for tumor cells in adverse environments. Autophagy, the process by which cells ‘consume themselves’, participates in metabolic reprogramming under stress conditions, promoting self-degradation and recycling of intracellular components to cope with adverse environments. The HoCIC phenomenon exhibits significant similarities to autophagy in both function and formation process. The mechanisms regulating autophagy closely resemble those governing HoCIC formation, with numerous autophagy-related proteins also participating in HoCIC development [13].
BECN1 (also known as Beclin1) was the first identified mammalian autophagy protein. It serves as the mammalian homologue of the yeast autophagy-related gene ATG6/VPS30 and has high evolutionary conservation [14]. BECN1 is crucial for embryonic survival and normal development and participates in phagocytosis and the clearance of apoptotic cells during embryogenesis [15]. The BECN1 gene is located on human chromosome 17q21, encoding a coiled-coil protein with a relative molecular mass of approximately 60 kDa [16]. Comprising 450 amino acids, it forms three domains: the Bcl-2 binding domain (BH3), the coiled-coil domain (CCD), and the evolutionarily conserved domain (ECD) [17]. As a key regulator of autophagy, BECN1 forms three distinct complexes with proteins including VPS34, ATG14, UVRAG, and Rubicon, influencing autophagosome formation, maturation, and trafficking to modulate autophagic activity [18]. Currently, the use of bioinformatics approaches to identify novel tumor biomarkers provides crucial targets for investigating the biological basis of cancer [19]. Examining the mechanisms of target genes offers new perspectives for tumor diagnosis, prognosis, and treatment.
In this study, we observed and quantified HoCIC structures in NSCLC tissues and evaluated their impact on overall survival (OS) and disease-free survival (DFS) in NSCLC patients. However, due to the absence of definitive HoCIC biomarkers in NSCLC, current research still relies on manual counting, which may introduce subjective bias into the study outcomes. To address this limitation, through meticulous analysis of public NSCLC datasets and the application of bioinformatics techniques, we aimed to identify genes closely associated with HoCIC, which may serve as potential diagnostic biomarkers for HoCIC. Ultimately BECN1 was identified as a pivotal molecule for HoCIC in NSCLC. We further leveraged public databases and platforms to explore the biological functions, prognostic significance, immune infiltration patterns, and drug sensitivity associated with BECN1. Finally, we utilized immunohistochemistry to detect BECN1 expression in clinical NSCLC samples and evaluated its value as a diagnostic biomarker for HoCIC. The workflow of this research is presented in Fig. 1.
Materials and methods
Data sources and acquisition
We obtained three datasets (GSE101929, GSE40791, and GSE68465) from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/) [20], a public repository of the National Center for Biotechnology Information (NCBI). The first two datasets were generated by the GPL570 Affymetrix Human Genome U133 Plus 2.0 Array platform, while the last dataset utilized the GPL96 Affymetrix Human Genome U133A Array platform. GSE101929 contains 32 tumor samples and 34 normal samples; GSE40791 contains 94 tumor samples and 100 normal samples; GSE68465 contains 443 tumor samples and 19 normal samples. Additionally, we obtained RNA-seq data for lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/) [21], which includes 539 tumor samples and 59 normal samples in the LUAD dataset, along with 502 tumor samples and 49 normal samples in the LUSC dataset.
Clinical sample collection
This study retrospectively collected data from LUAD and LUSC patients who underwent surgical resection at The Second Affiliated Hospital of Xi’an Jiaotong University. Inclusion criteria: (1) pathologically confirmed diagnosis of LUAD or LUSC; (2) no preoperative adjuvant therapy; (3) single tumor lesion. Exclusion criteria: (1) metastatic or recurrent NSCLC; (2) incomplete clinicopathological information. The study was approved by the Biomedical Ethics Committee of the Faculty of Medicine, Xi’an Jiaotong University (No. 2021.954), and was conducted in accordance with the ethical guidelines of the Declaration of Helsinki. Informed consent was obtained from all the subjects and/or their legal guardian(s).
For the analysis of the relationship between HoCIC and prognosis, we selected 168 patients treated between January 2018 and December 2021, with the follow-up period extending from the date of surgery to June 2024. This study focused on two primary endpoints: overall survival (OS) and disease-free survival (DFS). OS was defined as the time interval from surgery to death from any cause. DFS was defined as the time interval from surgery to tumor recurrence, distant metastasis, or death from any cause. Diagnoses of tumor recurrence and distant metastasis were confirmed by imaging examinations, cytological analysis, or biopsy findings. The median OS was 24.3 months (range: 0.24–6.58 months), and the median DFS was 31.5 months (range: 0.96–78.9 months).
To validate BECN1 expression as a diagnostic marker for HoCIC, we selected paraffin-embedded tissues from 98 patients treated between December 2022 and June 2023 for immunohistochemical staining. Clinicopathological information of NSCLC patients was collected by reviewing electronic medical records and tissue sections, including age, gender, smoking history, involved lung lobe, tumor location, histologic type, differentiation status, tumor size, TNM stage based on the eighth edition of the American Joint Committee on Cancer (AJCC) Staging Manual [22], lymph node metastasis, pleural invasion, vascular invasion, necrosis, spread through air spaces (STAS), and Ki-67 proliferation index.
Quantification of HoCIC
The tumor tissue slides stained with hematoxylin‒eosin (HE) were observed under an optical microscope (Carl Zeiss Microscopy GmbH, Germany) to identify HoCIC. Following the methodology described in previous research [23], ten high-power fields were randomly selected within the tumor cell enrichment area for evaluation. According to Mackay’s criteria [24], a HoCIC structure was identified if it met any four of the following six characteristics: (1) nucleus of the target cell; (2) cytoplasm of the target cell; (3) nucleus of the host cell; (4) cytoplasm of the host cell; (5) crescent-shaped nucleus of the host cell; (6) vacuole enclosing the target cells. For classification purposes, tumors in which at least one HoCIC structure was detected were defined as HoCIC-positive, whereas tumors with no detectable HoCIC structures were classified as HoCIC-negative.
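The classification rule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the feature names are hypothetical labels for the six listed characteristics, and "any four of six" is assumed to mean at least four.

```python
from typing import Iterable, List

# Hypothetical labels for the six morphological characteristics listed above.
FEATURES = (
    "target_nucleus",
    "target_cytoplasm",
    "host_nucleus",
    "host_cytoplasm",
    "host_crescent_nucleus",
    "vacuole_around_target",
)


def is_hocic_structure(observed: Iterable[str], min_features: int = 4) -> bool:
    """Return True if at least `min_features` of the six features are present.

    Assumption: "any four of six" is read as "four or more".
    """
    present = set(observed) & set(FEATURES)
    return len(present) >= min_features


def classify_tumor(candidates: List[Iterable[str]]) -> str:
    """A tumor is HoCIC-positive if at least one candidate structure qualifies."""
    if any(is_hocic_structure(c) for c in candidates):
        return "HoCIC-positive"
    return "HoCIC-negative"
```

Here `candidates` would hold one feature list per suspicious structure seen across the ten high-power fields of a tumor.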
Identification of HoCIC-related genes and autophagy-related genes
The GeneCards database (https://www.genecards.org/) provides comprehensive information on human genes. We searched GeneCards using ‘homotypic cell-in-cell’, ‘homotypic cell cannibalism’, and ‘entosis’ as keywords. Genes meeting the criteria of “Relevance score > 1” and “Protein Coding” were identified as HoCIC-related genes, yielding a total of 1456 HoCIC-related genes (Table S1). From the ‘KEGG_REGULATION_OF_AUTOPHAGY’ gene set in the MSigDB database (https://www.gsea-msigdb.org/gsea/msigdb/), we obtained 35 core autophagy-related genes (ARGs) (Table S2).
Analysis of differentially expressed genes
Analysis of the three GEO datasets was performed via R software (version 4.2.2) and relevant R packages. First, we verified whether the raw data had undergone log2 transformation and applied the ‘normalizeBetweenArrays’ function of the ‘limma’ package to correct for batch effects. Second, the gene annotation of probes was conducted through the ‘hgu133plus2.db’ package, followed by screening for HoCIC-related genes. Next, differential gene expression analysis was performed with the ‘limma’ package. This package was selected for microarray data analysis because of its flexibility in handling diverse data issues and enhancing result reliability [25]. To identify differentially expressed genes between NSCLC tumor tissues and normal tissues, we applied thresholds of fold change (FC) > 1.25 and p value < 0.001. Volcano plots were generated with the ‘ggplot2’ package to visualize differentially expressed genes. HoCIC differentially expressed genes (HoCICDEGs) were subsequently identified via the ‘VennDiagram’ package, and a heatmap of HoCICDEGs was created with the ‘pheatmap’ package.
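As a rough illustration of the thresholding and overlap steps, the Python sketch below filters per-dataset statistics and intersects the results. It rests on several assumptions not stated in the text: the FC cutoff is applied symmetrically on the log2 scale (up- or down-regulated), HoCICDEGs are taken as genes passing the cutoffs in every dataset and belonging to the HoCIC-related gene list, and the input dictionaries stand in for limma output. The analysis itself was done in R.

```python
import math
from typing import Dict, List, Set, Tuple

FC_CUTOFF = 1.25   # fold-change threshold quoted in the text
P_CUTOFF = 0.001   # p-value threshold quoted in the text


def select_degs(stats: Dict[str, Tuple[float, float]]) -> Set[str]:
    """Select genes from a mapping gene -> (log2 fold change, p value).

    Assumption: the cutoff applies in both directions, i.e. |log2FC| > log2(1.25).
    """
    log2_cutoff = math.log2(FC_CUTOFF)
    return {
        gene
        for gene, (log2fc, p_value) in stats.items()
        if abs(log2fc) > log2_cutoff and p_value < P_CUTOFF
    }


def hocic_degs(per_dataset: List[Dict[str, Tuple[float, float]]],
               hocic_genes: Set[str]) -> Set[str]:
    """Genes that pass the cutoffs in every dataset and are HoCIC-related.

    Assumption: the Venn-diagram step is an intersection across datasets.
    """
    deg_sets = [select_degs(stats) for stats in per_dataset]
    shared = set.intersection(*deg_sets) if deg_sets else set()
    return shared & hocic_genes
```

The same set intersection, applied to the HoCICDEGs and the ARG list, would give the overlapping hub gene candidates.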
Functional enrichment analysis
Gene Ontology (GO) has proven highly valuable for elucidating functional and biological significance from extremely large datasets, aiming to provide biologically meaningful annotations for genes and their products across diverse organisms [26]. We applied the ‘clusterProfiler’ package to identify characteristic biological processes (BP) within the GO framework for HoCICDEGs, subsequently counting the frequency of enrichment for each HoCICDEG in these biological processes.
Gene set enrichment analysis (GSEA) was employed to explore the functional roles of BECN1. Correlation analysis between BECN1 expression levels and other genes in the TCGA cohort was performed, with the results sorted by the correlation coefficient. The hallmark gene set (h.all.v2024.1.Hs.symbols.gmt) was acquired from the MSigDB database. GSEA analysis was conducted by the ‘clusterProfiler’ package.
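The ranking step that feeds GSEA can be sketched as follows. The text does not state which correlation coefficient was used for this step, so Spearman's rank correlation is an assumption here, and the expression dictionary is a hypothetical stand-in for the TCGA matrix. The enrichment scoring itself (done with ‘clusterProfiler’ in R) is not reproduced.

```python
from typing import Dict, List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_genes_by_correlation(expr: Dict[str, Sequence[float]],
                              target: str = "BECN1") -> List[Tuple[str, float]]:
    """Return (gene, correlation with target) sorted from highest to lowest."""
    ref = expr[target]
    scores = [(g, spearman(ref, v)) for g, v in expr.items() if g != target]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

The sorted list plays the role of the pre-ranked gene list that a GSEA implementation consumes together with the hallmark gene sets.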
Prognosis analysis
To assess the prognostic impact of HoCIC, the relationships between HoCIC and the OS and DFS of patients with NSCLC were analysed via the ‘survival’ package. Kaplan‒Meier survival curves were generated with the ‘survminer’ and ‘ggplot2’ packages, with differences between the HoCIC-negative and HoCIC-positive groups evaluated by the log-rank test. The relationship between BECN1 expression and prognosis was analyzed using the Kaplan‒Meier plotter online platform (http://kmplot.com/) [27].
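For readers unfamiliar with the estimator behind these curves, a minimal Python sketch of the Kaplan‒Meier product-limit calculation is given below. It is an illustration only (the study used the R ‘survival’ package), it omits the log-rank test and confidence intervals, and the event coding (1 = event, 0 = censored) is an assumption of the sketch.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate of survival.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) occurred, 0 if censored
    Returns (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases both leave the risk set
    return curve
```

Running the function separately on the HoCIC-negative and HoCIC-positive groups would give the two step curves that a log-rank test then compares.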
Association between BECN1 and immune infiltration
Utilizing the TIMER2.0 database (http://timer.cistrome.org/) [28], the interactions between BECN1 and various immune infiltrating cells were analyzed. Additionally, to validate reliability of the results, we employed the single-sample gene set enrichment analysis (ssGSEA) algorithm to investigate immune infiltration levels associated with BECN1 expression [29].
Analysis of genetic alterations in BECN1
The cBio Cancer Genomics Portal (cBioPortal) (http://cbioportal.org) serves as an open resource for exploring tumor genetic data [30]. We utilized it to analyse the frequency of BECN1 genetic alterations, mutation types, and specific mutation sites in LUAD and LUSC. Furthermore, we investigated copy number alterations of BECN1.
BECN1 drug sensitivity analysis
GSCA (https://guolab.wchscu.cn/GSCA/#/) is a publicly accessible portal for dynamic analysis and visualization of gene sets in cancer, as well as correlations between genes and drug sensitivity [31]. It integrates gene expression signatures and drug sensitivity information from the Genomics of Drug Sensitivity in Cancer (GDSC) dataset and the Cancer Therapeutics Response Portal (CTRP) dataset. We utilized this platform to analyse the relationship between BECN1 expression and sensitivity to anticancer drugs.
Immunohistochemistry assay
Paraffin-embedded blocks of NSCLC tumor tissue samples were collected, and serial sections (4 µm thick) were cut. The slides were baked at 60 °C for 2 h, followed by antigen retrieval using high-temperature water bath heating with EDTA buffer (pH 9.0). The enhanced labelled polymer system (ELPS) method was employed for staining. To eliminate endogenous peroxidase activity, the sections were incubated with a 0.5% peroxidase solution at room temperature for 10 min. The primary anti-BECN1 antibody (dilution: 1:150, cat. No: 11306-1-AP, Proteintech) was applied to the sections, which were incubated overnight at room temperature in a humid chamber. The secondary antibody was subsequently added to the sections, which were incubated for 30 min, followed by washing in PBS. Next, a diaminobenzidine solution was used as the chromogen, with incubation for 5 min. Finally, the sections were counterstained with hematoxylin for 2 min.
The immunohistochemically stained slides were scored by two pathologists using a double-blind method, and in cases of disagreement, a third pathologist was consulted to determine the final score. The staining pattern of the cells on each slide was first observed under low-power magnification to assess the overall staining. Five random high-power fields were then selected to calculate the percentage of positive cells relative to the total number of cells and to evaluate the staining intensity. The staining intensity was scored as follows: 0 (no staining), 1 (weak staining, faint yellow), 2 (moderate staining, yellow), and 3 (strong staining, light brown). The percentage of positive cells was scored as: 0 (no positive cells), 1 (≤ 10%), 2 (11%–30%), 3 (31%–70%), and 4 (71%–100%). The final expression score was calculated as the intensity score × percentage score, ranging from 0 to 12. The samples were divided into four groups according to their total scores as follows: 0 = negative, 1–2 = weakly positive, 3–7 = moderately positive, and 8–12 = strongly positive. For the analysis of BECN1 protein expression, cases with negative or weakly positive results were pooled into the low expression group, whereas those with moderately or strongly positive results were combined into the high expression group.
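The scoring scheme above is simple arithmetic and can be written out as a short Python sketch. Function names are illustrative. Two assumptions are made: fractional percentages between the stated bins (for example 10.5%) fall into the higher bin, and the high expression group is read as moderately or strongly positive (score 3 or above).

```python
def percentage_score(percent_positive: float) -> int:
    """Map the percentage of positive cells to a score of 0-4."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return 0
    if percent_positive <= 10:
        return 1
    if percent_positive <= 30:
        return 2
    if percent_positive <= 70:
        return 3
    return 4


def ihc_score(intensity: int, percent_positive: float) -> int:
    """Final score = intensity score (0-3) x percentage score (0-4)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2 or 3")
    return intensity * percentage_score(percent_positive)


def expression_group(score: int) -> str:
    """Four-level grouping: 0, 1-2, 3-7, 8-12."""
    if score == 0:
        return "negative"
    if score <= 2:
        return "weakly positive"
    if score <= 7:
        return "moderately positive"
    return "strongly positive"


def binary_group(score: int) -> str:
    """Low = negative or weakly positive; high = score of 3 or more (assumed)."""
    return "low" if score <= 2 else "high"
```

For example, moderate staining (intensity 2) in 25% of cells gives a percentage score of 2 and a final score of 4, which falls in the moderately positive and therefore high expression group.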
Statistical analysis
The statistical analyses were performed by SPSS version 24.0 (SPSS, Chicago, IL) and R Studio version 4.4.2 (The R Foundation for Statistical Computing, Vienna, Austria). Categorical data were presented as whole numbers and frequencies (%). Differences between groups were evaluated by the chi-square test. The Wilcoxon rank-sum test was employed for two-group comparisons, while the Spearman rank correlation test was used to analyse correlations between two variables. P < 0.05 was considered statistically significant. Least absolute shrinkage and selection operator (LASSO) regression and multivariate logistic regression analysis were applied to identify the optimal risk factors for HoCIC. A predictive model for HoCIC based on BECN1 was subsequently established and internally validated through a bootstrap resampling method (1000 resamples). We plotted the receiver operating characteristic (ROC) curve and calculated the area under the curve (AUC) to assess the discriminative performance of the BECN1 prediction model. An AUC > 0.70 indicated that the BECN1 prediction model demonstrated superior predictive ability. The calibration curve and Hosmer–Lemeshow test were utilized to assess the agreement between the actual and predicted outcomes. A higher P value in the Hosmer–Lemeshow test indicated enhanced accuracy of the BECN1 prediction model. Finally, decision curve analysis (DCA) was used to compute the net benefit and determine the practicality and value of the BECN1 prediction model.
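To make the discrimination metric concrete, the Python sketch below computes the AUC as the probability that a HoCIC-positive case receives a higher predicted score than a HoCIC-negative one, and resamples it by bootstrap. This is a simplification under stated assumptions: labels are coded 1/0, the predicted scores are treated as fixed, and the model is not refitted in each resample, so it does not reproduce a full optimism-corrected internal validation as typically done in R.

```python
import random
from typing import List, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC via pairwise comparison of positive and negative cases (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc(scores: Sequence[float], labels: Sequence[int],
                  n_resamples: int = 1000, seed: int = 0) -> List[float]:
    """AUC values from resampling cases with replacement.

    Resamples that contain only one class are redrawn.
    """
    auc(scores, labels)  # fail early if the data lack one of the classes
    rng = random.Random(seed)
    n = len(scores)
    values = []
    while len(values) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            values.append(auc([scores[i] for i in idx], ys))
    return values
```

The spread of the bootstrap values gives a rough sense of how stable the AUC estimate is for a cohort of this size.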
Data sources and acquisition
We obtained three datasets (GSE101929, GSE40791, and GSE68465) from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/) [20], a public repository of the National Center for Biotechnology Information (NCBI). The first two datasets were generated by the GPL570 Affymetrix Human Genome U133 Plus 2.0 Array platform, while the last dataset utilized the GPL96 Affymetrix Human Genome U133A Array platform. GSE101929 contains 32 tumor samples and 34 normal samples; GSE40791 contains 94 tumor samples and 100 normal samples; GSE68465 contains 443 tumor samples and 19 normal samples. Additionally, we obtained RNA-seq data for lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/) [21], which includes 539 tumor samples and 59 normal samples in the LUAD dataset, along with 502 tumor samples and 49 normal samples in the LUSC dataset.
Clinical sample collection
This study retrospectively collectedc data from LUAD and LUSC patients who underwent surgical resection at The Second Affiliated Hospital of Xi’an Jiaotong University. Inclusion criteria: (1) pathologically confirmed diagnosis of LUAD or LUSC; (2) no preoperative adjuvant therapy; (3) single tumor lesion. Exclusion criteria: (1) metastatic or recurrent NSCLC; (2) incomplete clinicopathological information. The study was approved by the Biomedical Ethics Committee of faculty of medicine Xi’an Jiaotong University (No. 2021.954), and was conducted in accordance with the ethical guidelines of the Declaration of Helsinki. Informed consent was obtained from all the subjects and/or their legal guardian(s).
For the analysis of the relationship between HoCIC and prognosis, we selected 168 patients treated between January 2018 and December 2021, with the follow-up period extending from the date of surgery to June 2024. This study focused on two primary endpoints: overall survival (OS) and disease free survival (DFS). OS was defined as the time interval from surgery to death from any cause. DFS was defined as the time interval from surgery to tumor recurrence, distant metastasis, or death from any cause. Diagnoses of tumor recurrence and distant metastasis were confirmed by imaging examinations, cytological analysis, or biopsy findings. The median OS was 24.3 months (range: 0.24–6.58 months), and the median DFS was 31.5 months (range: 0.96–78.9 months).
To validate BECN1 expression as a diagnostic marker for HoCIC, we selected paraffin-embedded tissues from 98 patients treated between December 2022 and June 2023 for immunohistochemical staining. Clinicopathological information of NSCLC patients was collected through reviewing electronic medical records and tissue sections, including age, gender, smoking history, involved lung lobe, tumor location, histologic type, differentiation status, tumor size, and TNM stage based on the eighth edition of the American Joint Committee on Cancer (AJCC) Staging Manual [22], lymph node metastasis, pleural invasion, vascular invasion, necrosis, spread through air spaces (STAS), Ki-67 proliferation index.
Quantification of HoCIC
The tumor tissue slides stained with hematoxylin‒eosin (HE) were observed under an optical microscope (Carl Zeiss Microscopy GmbH, Germany) to identify HoCIC structures. Following the methodology described in previous research [23], ten high-power fields were randomly selected within the tumor cell enrichment area for evaluation. According to Mackay’s criteria [24], a HoCIC structure was identified if it met any four of the following six characteristics: (1) nucleus of the target cell; (2) cytoplasm of the target cell; (3) nucleus of the host cell; (4) cytoplasm of the host cell; (5) crescent-shaped nucleus of the host cell; (6) vacuole enclosing the target cell. For classification purposes, tumors in which at least one HoCIC structure was detected were defined as HoCIC-positive, whereas tumors with no detectable HoCIC structures were classified as HoCIC-negative.
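The classification rule above can be written out as a short sketch. It is illustrative only: the feature labels and function names are hypothetical, and "any four of the six" is interpreted here as "at least four", which is an assumption rather than something the text states explicitly.

```python
# Illustrative sketch; feature labels and function names are hypothetical.
MACKAY_FEATURES = (
    "target_nucleus",
    "target_cytoplasm",
    "host_nucleus",
    "host_cytoplasm",
    "host_crescent_nucleus",
    "vacuole_around_target",
)


def is_hocic(observed_features, min_features=4):
    """True if a candidate structure shows at least `min_features` of the six criteria."""
    present = set(observed_features) & set(MACKAY_FEATURES)
    return len(present) >= min_features


def classify_tumor(candidate_structures):
    """A tumor is HoCIC-positive if any candidate structure in the evaluated fields qualifies."""
    if any(is_hocic(features) for features in candidate_structures):
        return "HoCIC-positive"
    return "HoCIC-negative"
```

For example, a structure showing both nuclei and both cytoplasms qualifies, whereas one showing only three features does not.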
Identification of HoCIC related genes and autophagy related genes
The GeneCards database (https://www.genecards.org/) provides comprehensive information on human genes. We searched GeneCards using ‘homotypic cell-in-cell’, ‘homotypic cell cannibalism’, and ‘entosis’ as keywords. Genes meeting the criteria of “Relevance score > 1” and “Protein Coding” were identified as HoCIC related genes, yielding a total of 1456 HoCIC related genes (Table S1). From the ‘KEGG_REGULATION_OF_AUTOPHAGY’ gene set in the MSigDB database (https://www.gsea-msigdb.org/gsea/msigdb/), we obtained 35 core autophagy related genes (ARGs) (Table S2).
Analysis of differentially expressed genes
Analysis of three GEO datasets was performed via R software (version 4.2.2) and relevant R packages. First, we verified whether the raw data had undergone log2 transformation and applied the ‘normalizeBetweenArrays’ function (part of the ‘limma’ package) to correct for batch effects. Second, gene annotation of the probes was conducted with the ‘hgu133plus2.db’ package, followed by screening for HoCIC related genes. Next, differential gene expression analysis was performed with the ‘limma’ package. This package was selected for microarray data analysis because of its flexibility in handling diverse data issues and enhancing result reliability [25]. To identify differentially expressed genes between NSCLC tumor tissues and normal tissues, we applied thresholds of fold change (FC) > 1.25 and p value < 0.001. Volcano plots were generated with the ‘ggplot2’ package to visualize differentially expressed genes. HoCIC differentially expressed genes (HoCICDEGs) were subsequently identified via the ‘VennDiagram’ package, and a heatmap of HoCICDEGs was created with the ‘pheatmap’ package.
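The thresholding and cross-dataset intersection can be sketched as follows. This is not the limma workflow itself: it assumes that each dataset has already produced a log2 fold change and a p value per gene, and that the FC > 1.25 cut-off is applied symmetrically on the log2 scale (|log2FC| > log2(1.25)). Both are assumptions, and the function names are hypothetical.

```python
import math

FC_THRESHOLD = 1.25   # fold-change cut-off stated in the text
P_THRESHOLD = 0.001   # p-value cut-off stated in the text


def classify_gene(log2_fc, p_value):
    """Label a gene 'up', 'down', or 'ns' (not significant).

    Assumption: the fold-change cut-off is symmetric on the log2 scale.
    """
    cutoff = math.log2(FC_THRESHOLD)
    if p_value >= P_THRESHOLD:
        return "ns"
    if log2_fc > cutoff:
        return "up"
    if log2_fc < -cutoff:
        return "down"
    return "ns"


def shared_degs(datasets, direction):
    """Genes with the same direction ('up' or 'down') in every dataset.

    `datasets` is a list of dicts mapping gene symbol -> (log2_fc, p_value).
    """
    per_dataset = [
        {gene for gene, (lfc, p) in table.items() if classify_gene(lfc, p) == direction}
        for table in datasets
    ]
    return set.intersection(*per_dataset) if per_dataset else set()
```

Intersecting per direction, rather than pooling all DEGs first, keeps a gene only when it changes consistently in all three datasets.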
Functional enrichment analysis
Gene Ontology (GO) has proven highly valuable for elucidating functional and biological significance from extremely large datasets, aiming to provide biologically meaningful annotations for genes and their products across diverse organisms [26]. We applied the ‘clusterProfiler’ package to identify characteristic biological processes (BP) within the GO framework for HoCICDEGs, subsequently counting the frequency of enrichment for each HoCICDEG in these biological processes.
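A minimal sketch of the frequency count is given below. It assumes the enrichment result has been exported as a mapping from each enriched biological process to its member genes; the function names are hypothetical. The default cut-off of 50 follows the "frequency ≥ 50" criterion used later to define high frequency HoCICDEGs.

```python
from collections import Counter


def enrichment_frequency(enriched_terms):
    """Count, for each gene, the number of enriched terms it belongs to.

    `enriched_terms` maps a GO biological-process term to an iterable of gene symbols.
    """
    counts = Counter()
    for genes in enriched_terms.values():
        counts.update(set(genes))  # count each gene at most once per term
    return counts


def high_frequency_genes(counts, min_count=50):
    """Genes appearing in at least `min_count` enriched terms."""
    return {gene for gene, n in counts.items() if n >= min_count}
```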
Gene set enrichment analysis (GSEA) was employed to explore the functional roles of BECN1. Correlation analysis between BECN1 expression levels and other genes in the TCGA cohort was performed, with the results sorted by the correlation coefficient. The hallmark gene set (h.all.v2024.1.Hs.symbols.gmt) was acquired from the MSigDB database. GSEA analysis was conducted by the ‘clusterProfiler’ package.
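The ranked gene list that serves as GSEA input can be sketched as below. The text does not state which correlation coefficient was used for ranking; Spearman correlation is assumed here because it is the method named in the statistical analysis section. The function names are hypothetical, and the enrichment step itself (performed with ‘clusterProfiler’) is not reproduced.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_genes_by_correlation(target_expression, other_expression):
    """Sort genes by their correlation with the target gene, highest first.

    `other_expression` maps gene symbol -> expression values across the same samples.
    """
    scores = {gene: spearman(target_expression, values)
              for gene, values in other_expression.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```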
Prognosis analysis
To assess the prognostic impact of HoCIC, the relationships between HoCIC status and the OS and DFS of patients with NSCLC were analysed via the ‘survival’ package. Kaplan‒Meier survival curves were generated with the ‘survminer’ and ‘ggplot2’ packages, with differences between the HoCIC negative and positive groups evaluated by the log-rank test. The relationship between BECN1 expression and prognosis was analysed using the Kaplan‒Meier plotter online platform (http://kmplot.com/) [27].
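For readers unfamiliar with the two procedures, the following is a plain-Python sketch of the Kaplan‒Meier estimator and the two-group log-rank test. It is a teaching illustration under simplifying assumptions (event coded as 1, censoring as 0, two groups only), not a substitute for the ‘survival’ package used in the study; the function names are hypothetical.

```python
import math


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    `events`: 1 = event observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p value) with 1 df.

    Raises ZeroDivisionError if the variance is zero (e.g. no events at all).
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```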
Association between BECN1 and immune infiltration
Utilizing the TIMER2.0 database (http://timer.cistrome.org/) [28], the interactions between BECN1 and various immune infiltrating cells were analyzed. Additionally, to validate reliability of the results, we employed the single-sample gene set enrichment analysis (ssGSEA) algorithm to investigate immune infiltration levels associated with BECN1 expression [29].
Analysis of genetic alterations in BECN1
The cBio Cancer Genomics Portal (cBioPortal) (http://cbioportal.org) serves as an open resource for exploring tumor genetic data [30]. We utilized it to analyse the frequency of BECN1 genetic alterations, mutation types, and specific mutation sites in LUAD and LUSC. Furthermore, we investigated copy number alterations of BECN1.
BECN1 drug sensitivity analysis
GSCA (https://guolab.wchscu.cn/GSCA/#/) is a publicly accessible portal for dynamic analysis and visualization of gene sets in cancer, as well as correlations between genes and drug sensitivity [31]. It integrates gene expression signatures and drug sensitivity information from the Genomics of Drug Sensitivity in Cancer (GDSC) dataset and the Cancer Therapeutics Response Portal (CTRP) dataset. We utilized this platform to analyse the relationship between BECN1 expression and sensitivity to anticancer drugs.
Immunohistochemistry assay
Paraffin-embedded blocks of NSCLC tumor tissue samples were collected, and serial sections (4 µm thick) were cut. The slides were baked at 60 °C for 2 h, followed by antigen retrieval using high-temperature water bath heating with EDTA buffer (pH 9.0). The enhanced labelled polymer system (ELPS) method was employed for staining. To eliminate endogenous peroxidase activity, the sections were incubated with a 0.5% peroxidase solution at room temperature for 10 min. The primary antibody against BECN1 (dilution: 1:150, cat. No: 11306-1-AP, Proteintech) was applied to the sections, which were incubated overnight at room temperature in a humid chamber. The secondary antibody was subsequently added to the sections, which were incubated for 30 min, followed by washing in PBS. Next, a diaminobenzidine solution was used as the chromogen and incubated for 5 min. Finally, the sections were counterstained with hematoxylin for 2 min.
The immunohistochemically stained slides were scored by two pathologists using a double-blind method, and in cases of disagreement, a third pathologist was consulted to determine the final score. The staining pattern of the cells on each slide was first observed under low-power magnification to assess the overall staining. Five random high-power fields were then selected to calculate the percentage of positive cells relative to the total number of cells and to evaluate the intensity of staining. The staining intensity was scored as follows: 0 (no staining), 1 (weak staining, faint yellow), 2 (moderate staining, yellow), and 3 (strong staining, light brown). The percentage of positive cells was scored as: 0 (no positive cells), 1 (≤ 10%), 2 (11%-30%), 3 (31%-70%), and 4 (71%-100%). The final expression score was calculated as the intensity score × percentage score, ranging from 0 to 12. The samples were divided into four groups according to their total scores as follows: 0 = negative, 1–2 = weakly positive, 3–7 = moderately positive, and 8–12 = strongly positive. For the analysis of BECN1 protein expression, cases with negative or weakly positive results were pooled into the low expression group, whereas those with moderately positive or strongly positive results were combined into the high expression group.
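The scoring scheme can be expressed as a small sketch. The function names are hypothetical, and two readings are assumed: non-integer percentages falling between the stated bins (for example 10.5%) are assigned to the higher bin, and the high expression group comprises total scores of 3 or above.

```python
def percentage_score(percent_positive):
    """Map the percentage of positive cells (0-100) to a score of 0-4."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return 0
    if percent_positive <= 10:
        return 1
    if percent_positive <= 30:
        return 2
    if percent_positive <= 70:
        return 3
    return 4


def expression_score(intensity, percent_positive):
    """Final score = intensity score (0-3) x percentage score (0-4)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    return intensity * percentage_score(percent_positive)


def expression_group(score):
    """Four-level grouping of the final score."""
    if score == 0:
        return "negative"
    if score <= 2:
        return "weakly positive"
    if score <= 7:
        return "moderately positive"
    return "strongly positive"


def expression_level(score):
    """Binary grouping: negative/weak -> low; moderate/strong -> high (assumed reading)."""
    return "low" if score <= 2 else "high"
```

Because the final score is a product, some values in the 0 to 12 range (5, 7, 10, 11) cannot occur.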
Statistical analysis
The statistical analyses were performed with SPSS version 24.0 (SPSS, Chicago, IL) and R Studio version 4.4.2 (The R Foundation for Statistical Computing, Vienna, Austria). Categorical data are presented as counts and percentages (%). Differences between groups were evaluated by the chi-square test. The Wilcoxon rank-sum test was employed for two-group comparisons, while the Spearman rank correlation test was used to analyse correlations between two variables. P < 0.05 was considered statistically significant. Least absolute shrinkage and selection operator (LASSO) regression and multivariate logistic regression analysis were applied to identify the optimal risk factors for HoCIC. A predictive model for HoCIC based on BECN1 was subsequently established and internally validated through a bootstrap resampling method (1000 resamples). We plotted the receiver operating characteristic (ROC) curve and calculated the area under the curve (AUC) to assess the discriminative performance of the BECN1 prediction model. An AUC > 0.70 was taken to indicate good discriminative ability of the BECN1 prediction model. The calibration curve and Hosmer–Lemeshow test were utilized to assess the agreement between the predicted and observed outcomes. A larger P value in the Hosmer–Lemeshow test indicated better agreement between the predictions of the BECN1 model and the observed outcomes. Finally, decision curve analysis (DCA) was used to compute the net benefit and determine the practicality and value of the BECN1 prediction model.
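As a rough illustration of the discrimination metric, the AUC can be computed as the probability that a randomly chosen HoCIC-positive case receives a higher predicted score than a randomly chosen HoCIC-negative case. The sketch below also shows a simple percentile bootstrap of the AUC; this is an assumed simplification and may differ from the internal validation procedure actually used, which typically refits the model in each resample. The function names are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def bootstrap_auc(labels, scores, n_resamples=1000, seed=0):
    """Mean AUC and 95% percentile interval over bootstrap resamples.

    The input must contain both classes; resamples with a single class are redrawn.
    """
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        if len(set(resampled_labels)) < 2:
            continue
        values.append(auc(resampled_labels, [scores[i] for i in idx]))
    values.sort()
    lower = values[int(0.025 * n_resamples)]
    upper = values[int(0.975 * n_resamples) - 1]
    return sum(values) / n_resamples, (lower, upper)
```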
Results
HoCIC phenomena in NSCLC
We identified the presence of HoCIC structures within NSCLC tumor tissues. As illustrated in Fig. 2, typical HoCIC structures clearly show target cells localized inside host cells, surrounded by vesicle-like structures. Both host and target cells maintain intact cellular architectures, including cell membranes, cytoplasm, and nuclei. However, most target cells gradually undergo degradation via the lysosomal pathway after being engulfed by host cells [10], making intact structures of target cells rarely observable. Therefore, according to Mackay’s criteria [24], a structure meeting any four of the six characteristics illustrated in Fig. 2 was classified as HoCIC. We also examined adjacent normal tissues but found no evidence of HoCIC.
HoCIC is significantly associated with poor prognosis in NSCLC
To explore the role of HoCIC in NSCLC, we employed Kaplan‒Meier curves and log-rank tests to evaluate the relationships between HoCIC and both OS and DFS. HoCIC positive status was significantly correlated with poorer OS (P = 0.00017) (Fig. 3A). Stratified analysis across clinicopathological characteristics further revealed that HoCIC positivity was associated with worse OS in all subgroups examined, including M0 (P = 0.00015) (Fig. 3B), N0 (P = 0.0084) (Fig. 3C), N1-N3 (P = 0.021) (Fig. 3D), early-stage (P = 0.0086) (Fig. 3E), advanced-stage (P = 0.037) (Fig. 3F), age ≤ 60 years (P = 0.0051) (Fig. 3G), and age > 60 years (P = 0.026) (Fig. 3H). HoCIC positivity was also significantly associated with poorer DFS (P < 0.0001) (Fig. 3I). Subgroup analysis revealed significant differences in DFS between the HoCIC positive and HoCIC negative groups for M0 (P < 0.0001) (Fig. 3J), N0 (P = 0.017) (Fig. 3K), N1-N3 (P = 0.0015) (Fig. 3L), early-stage (P = 0.0043) (Fig. 3M), advanced-stage (P = 0.028) (Fig. 3N), age ≤ 60 years (P = 0.00099) (Fig. 3O), and age > 60 years (P = 0.02) (Fig. 3P). Within each subgroup, HoCIC positive patients exhibited worse DFS than HoCIC negative patients did. Taken together, these results indicate that the presence of HoCIC in patients with NSCLC is associated with an unfavourable prognosis.
Identification of BECN1 as a key gene for HoCIC in NSCLC
Considering that HoCIC are exclusively detected in tumor tissues of NSCLC but not observed in normal tissues, we compared the expression levels of 1,456 HoCIC related genes between tumor tissues and normal tissues using NSCLC transcriptome data from GEO databases. A total of 537, 800, and 709 differentially expressed genes (DEGs) were identified from the GSE101929, GSE40791, and GSE68465 datasets, respectively. These genes included 235 upregulated and 303 downregulated genes in GSE101929, 362 upregulated and 438 downregulated genes in GSE40791, and 383 upregulated and 326 downregulated genes in GSE68465 (Fig. 4A). To increase the reliability of the results, we intersected the DEGs from these three datasets by Venn diagrams, ultimately identifying 48 upregulated and 40 downregulated HoCIC differentially expressed genes (HoCICDEGs) (Fig. 4B). Heatmaps were generated to visualize these shared HoCICDEGs (Fig. 4C). Gene Ontology (GO) analysis revealed that HoCICDEGs participate in multiple biological processes (BP), such as positive regulation of cell adhesion (Fig. 4D). We calculated the enrichment frequency of each gene across pathways to identify key factors, designating genes with a frequency ≥ 50 as high frequency HoCICDEGs (HF-HoCICDEGs) (Fig. 4E). Given the substantial similarities between autophagy processes and HoCIC formation, autophagy related genes may concurrently participate in HoCIC development. Therefore, we screened 35 key autophagy-related genes from the autophagy pathway in the MSigDB database, which yielded only one overlapping gene, BECN1, when intersected with HF-HoCICDEGs (Fig. 4F). Compared with that in normal tissue controls, BECN1 expression was significantly elevated in NSCLC tumor tissues (Fig. 4G).
Functional enrichment analysis of BECN1
To further explore the potential biological roles and mechanisms of BECN1, we performed correlation analysis on BECN1 and conducted gene set enrichment analysis (GSEA) on the genes ranked by correlation coefficient. This analysis provides insight into the biological processes and signaling pathways associated with BECN1 in NSCLC. Figure 5A and C show the GSEA functional enrichment results of BECN1 in LUAD and LUSC, respectively. In LUAD, BECN1 expression is correlated with protein secretion (closely associated with protein synthesis, processing, transport, and secretion processes), mitotic spindle (which utilizes microtubules and motor proteins to separate sister chromatids before cell division, ensuring accurate chromosome segregation [32]), MYC (a key member of the proto-oncogene family and a universal transcriptional enhancer that regulates nearly every physiological process within cells, including the cell cycle, proliferation, metabolism, differentiation, and apoptosis [33]), G2M checkpoint (the G2/M checkpoint prevents cells from entering mitosis with unrepaired DNA damage, and tumor cells primarily rely on it to halt the cell cycle for DNA damage repair [34]), and E2F (the E2F transcription factor family primarily regulates the expression of genes related to cell proliferation, apoptosis, and differentiation in a cell cycle-dependent manner [35]) (Fig. 5B). In LUSC, the expression of BECN1 is correlated with mitotic spindle, MYC, G2M checkpoint, E2F, and TGFβ signaling (TGFβ signaling promotes tumor growth and invasion, evasion of immune surveillance, as well as dissemination and metastasis of cancer cells [36]) (Fig. 5D).
Prognostic analysis of BECN1 in NSCLC
The Kaplan-Meier plotter, a comprehensive database integrating GEO, EGA, and TCGA data, was utilized to investigate the association between BECN1 expression and NSCLC prognosis. Analysis of OS revealed that high BECN1 expression was correlated with poorer OS (P = 0.0016) (Fig. 6A). Significant differences in OS between the high and low BECN1 expression groups were also observed in several subgroup analyses, including females (P = 0.00077) (Fig. 6B), stage II patients (P = 0.00097) (Fig. 6C), adenocarcinoma cases (P = 0.027) (Fig. 6D), and patients without chemotherapy treatment (P = 0.017) (Fig. 6E). Analysis of progression-free survival (PFS) revealed that high BECN1 expression was significantly associated with poorer PFS (P = 0.0042) (Fig. 6F). Subgroup analyses demonstrated this adverse prognostic relationship in males (P = 0.011) (Fig. 6G), stage II patients (P = 0.0011) (Fig. 6H), and individuals with a smoking history (P = 0.00081) (Fig. 6I).
Association between BECN1 expression and immune infiltration
Given that HoCIC structures contribute to tumor cell evasion from immune cell killing in the tumor microenvironment, we hypothesized that immune cell infiltration within the tumor microenvironment influences HoCIC formation. In addition, exploring the complex immune infiltration patterns in the tumor microenvironment facilitates the discovery of novel immunotherapy targets while advancing our understanding of tumorigenesis and progression mechanisms. Therefore, this study utilized the TIMER2.0 database to analyse the relationship between BECN1 and immune cell infiltration, with a focus on B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells. In LUAD, BECN1 expression was positively correlated with CD4+ T cells (r = 0.133, P < 0.001), macrophages (r = 0.139, P < 0.001), and neutrophils (r = 0.104, P < 0.001) (Fig. 7A). In LUSC, BECN1 expression was negatively correlated with CD8+ T cells (r = -0.127, P < 0.001), neutrophils (r = -0.103, P < 0.001), and dendritic cells (r = -0.101, P < 0.001) (Fig. 7B). To corroborate these findings, we performed immune infiltration analysis on TCGA data with the ssGSEA algorithm. The results revealed predominantly negative correlations between BECN1 expression and immune cells in both LUAD and LUSC (Fig. 7C, D). Because the TIMER2.0 and ssGSEA results for LUAD conflicted, we additionally employed the xCell algorithm to clarify the relationship between BECN1 expression and immune cells. In LUAD, B cells, CD8+ T cells, CD4+ Th1 cells, macrophages, activated myeloid dendritic cells, and plasmacytoid dendritic cells were negatively correlated with BECN1 expression. In LUSC, BECN1 expression was negatively correlated with B cells, CD8+ T cells, effector memory CD4+ T cells, CD4+ Th1 cells, CD4+ Th2 cells, macrophages, myeloid dendritic cells, activated myeloid dendritic cells, and plasmacytoid dendritic cells (Fig. 7E).
Genetic alterations of BECN1
Genetic alterations of BECN1 in NSCLC were analyzed through the cBioPortal database, as such alterations play crucial roles in tumorigenesis and progression. Analysis revealed alterations in BECN1 in 1.3% of patients (Fig. 8A). The predominant genetic alteration type in LUAD was amplification of BECN1, whereas mutation of BECN1 constituted the primary alteration type in LUSC (Fig. 8B). Figure 8C delineates the types, frequency, and loci of BECN1 gene mutations, identifying 16 missense mutation sites across 450 amino acids. Specific changes within the BECN1 domain, exemplified by the Q309H variant, were observed in NSCLC patients. Furthermore, we investigated the relationship between BECN1 mRNA expression and copy number alterations. In LUAD, the most prevalent copy number change types associated with BECN1 expression were gain-of-function and diploid (Fig. 8D). In LUSC, the predominant types of copy number alterations were gain-of-function, diploid, and shallow deletion (Fig. 8E).
Drug sensitivity analysis of BECN1
Correlation analysis between BECN1 mRNA expression and drug sensitivity was performed with the GDSC and CTRP databases. A positive correlation indicates that high expression may lead to drug resistance, whereas a negative correlation suggests that high expression may increase drug sensitivity. In the GDSC database, the top three drugs positively correlated with BECN1 expression were talazoparib, genentech cpd 10, and AG-014699, while the top three drugs negatively correlated with BECN1 expression were RDEA119, selumetinib, and trametinib (Fig. 9A). In the CTRP database, ML239, phloretin, and valdecoxib were most strongly negatively correlated with BECN1 expression, while GDC-0879, trametinib, and selumetinib were the three drugs most strongly positively correlated with BECN1 expression (Fig. 9B).
Validation of BECN1 expression profile in NSCLC clinical samples
To further validate the expression pattern of BECN1 in NSCLC, immunohistochemical experiments were performed to evaluate BECN1 expression in 98 NSCLC surgical specimens obtained from The Second Affiliated Hospital of Xi’an Jiaotong University. BECN1 staining appeared as pale yellow to brown granular material and was primarily localized in the cytoplasm of NSCLC cells, with minimal expression observed in the nucleus (Fig. 10A-C). The high expression rate of Beclin1 was 16.3% in normal lung tissues and 56.1% in tumor tissues. BECN1 expression was significantly greater in tumor tissues than in normal lung tissues (P < 0.0001) (Fig. 10D). Furthermore, BECN1 expression correlated with the degree of differentiation in NSCLC. Compared with well-differentiated tumors, moderately differentiated (P = 0.0033) and poorly differentiated (P = 0.0026) NSCLC tissues presented a significantly increased proportion of high BECN1 expression (Fig. 10E). Notably, multiple HoCIC structures were observed in tumor tissues with high BECN1 expression (Fig. 10F). Moreover, BECN1 expression was significantly elevated in HoCIC positive patients compared with HoCIC negative patients (P = 0.032) (Fig. 10G).
Baseline characteristics of the validation cohort
This study enrolled 98 NSCLC patients, whose clinicopathological characteristics are summarized in Table 1. HoCIC positive patients accounted for 45.9%, whereas HoCIC negative patients constituted 54.1%. The histological subtypes of NSCLC were LUAD and LUSC, accounting for 68.4% and 31.6% of the cases, respectively. There were 76 patients (77.6%) with early-stage disease (TNM I-II) and 22 patients (22.4%) with advanced-stage disease (TNM III-IV). Compared with the HoCIC negative group, HoCIC positive patients exhibited significantly higher rates of larger tumor size (37.1% vs. 61.1%, P = 0.037), poorer differentiation (21.7% vs. 40.8% vs. 76.9%, P < 0.001), advanced TNM stage (39.5% vs. 68.2%, P = 0.033), pleural invasion (43.0% vs. 100%, P = 0.018), vascular invasion (39.1% vs. 100%, P < 0.001), necrosis (33.3% vs. 75.9%, P < 0.001), Ki67 ≥ 36% (32.1% vs. 64.3%, P = 0.003), and high Beclin1 expression (32.6% vs. 56.4%, P = 0.032).
Validation of BECN1 as a diagnostic marker for HoCIC in NSCLC
Although BECN1 is a HoCIC core gene in NSCLC, its utility as a HoCIC biomarker requires further validation. Therefore, we propose constructing a HoCIC diagnostic model to evaluate the role of BECN1.
To prevent overfitting, LASSO regression analysis was employed to perform dimensionality reduction on the variables listed in Table 1, thereby mitigating the potential impact of collinearity among variables on the accuracy of the final model. Seven of the sixteen variables exhibited non-zero coefficients in the LASSO regression model (Fig. 11A, B): tumor size, differentiation state, pleural invasion, vascular invasion, necrosis, Ki67, and BECN1. Multivariate logistic regression analysis using backward stepwise screening of these seven variables identified necrosis and BECN1 as independent risk factors for HoCIC (Fig. 11C). Necrosis and high BECN1 expression were significantly more frequent in HoCIC positive patients than in HoCIC negative patients. A diagnostic nomogram incorporating Beclin1 was established on the basis of the risk factors screened through LASSO regression and multivariate logistic regression (Fig. 11D). In the nomogram, each patient receives points corresponding to their necrosis and BECN1 status on the points axis. Summing the points of all variables yields a total score. A vertical line drawn downward from the total points axis indicates the probability of HoCIC presence in NSCLC patients.
We evaluated the performance of the predictive model in terms of discriminatory power and calibration. The ROC curve was employed to assess the discriminatory power of the predictive model. Necrosis alone demonstrated a suboptimal predictive ability for HoCIC (AUC = 0.678, 95% CI 0.591–0.765). However, incorporating BECN1 increased the AUC to 0.723 (95% CI 0.626–0.821), outperforming the single-indicator AUC values (Fig. 11E). Internal validation using 1000 bootstrap repetitions yielded a validated model AUC of 0.723 (95% CI 0.624–0.822) (Fig. 11F), indicating favourable discriminatory power for the BECN1-based predictive model. Calibration was evaluated via the Hosmer‒Lemeshow test and calibration curve. The Hosmer‒Lemeshow test indicated good fit (P = 0.4614): the deviation between the BECN1 risk predictions and the observed values was not statistically significant, which is consistent with good model calibration. Furthermore, the calibration curve showed strong agreement between the predicted and actual probabilities. Post-validation calibration analysis revealed that the bias-corrected curve of the BECN1 model remained close to the ideal curve (Fig. 11G), suggesting robust predictive stability and consistency. Decision curve analysis (DCA) was performed to compare the BECN1-based model against single-predictor models (Fig. 11H). The DCA demonstrated a greater net benefit for the BECN1 model than for the single indicators, a finding that was maintained after internal validation (Fig. 11I), supporting the clinical applicability of the model. These results collectively indicate the potential of BECN1 as a diagnostic biomarker for HoCIC.
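The net benefit plotted in a decision curve follows the standard definition: at a threshold probability pt, net benefit = TP/n − (FP/n) × pt/(1 − pt), where TP and FP are the numbers of true and false positives among n patients when everyone with predicted probability ≥ pt is classified as positive. The sketch below illustrates this formula and the "treat all" reference strategy; the function names are hypothetical and the code is not the software used to produce Fig. 11H and I.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of classifying as positive everyone with probability >= threshold.

    `labels`: 1 = HoCIC positive, 0 = HoCIC negative; `threshold` must lie in [0, 1).
    """
    n = len(labels)
    true_pos = sum(1 for l, p in zip(labels, probabilities) if p >= threshold and l == 1)
    false_pos = sum(1 for l, p in zip(labels, probabilities) if p >= threshold and l == 0)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy in which every patient is classified as positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

A model is useful at a given threshold when its net benefit exceeds both the "treat all" value and zero, the latter being the net benefit of classifying no one as positive.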
HoCIC phenomena in NSCLC
We identified the presence of HoCIC structures within NSCLC tumor tissues. As illustrated in Fig. 2, typical HoCIC structures clearly show host cells localized inside target cells, surrounded by vesicle-like structures. Both host and target cells maintain intact cellular architectures, including cell membranes, cytoplasm, and nuclei. However, most target cells gradually undergo degradation via the lysosomal pathway after being engulfed by host cells [10], making intact structures of target cells rarely observable. Therefore, according to Mackay’s criteria [24], meeting any four of the six characteristics illustrated in Fig. 2 qualifies as a HoCIC. Concurrently, this study investigated HoCIC phenomena in adjacent normal tissues but found no evidence of HoCIC existence.
HoCIC is significantly associated with poor prognosis in NSCLC
To explore the role of HoCIC in NSCLC, we employed Kaplan‒Meier curves and log-rank tests to evaluate the relationships between HoCIC and both OS and DFS, thereby revealing the potential prognostic significance of HoCIC in NSCLC. In terms of OS, HoCIC positive status was significantly correlated with poor prognosis (P = 0.00017 ) (Fig. 3A). Stratified analysis across clinicopathological characteristics further revealed that HoCIC positivity consistently exerted adverse effects on OS in all subgroups, including M0 (P = 0.00015 ) (Fig. 3B), N0 (P = 0.0084 ) (Fig. 3C), N1-N3 (P = 0.021 ) (Fig. 3D), early-stage (P = 0.0086 ) (Fig. 3E), advanced-stage (P = 0.037 ) (Fig. 3F), age ≤ 60 years (P = 0.0051 ) (Fig. 3G), and age > 60 years (P = 0.026 ) (Fig. 3H), where HoCIC positive status correlated with worse OS outcomes. For DFS, HoCIC positivity was significantly associated with poorer DFS (P < 0.0001) (Fig. 3I). Further analysis across different subgroups revealed significant correlations in DFS between the HoCIC positive and HoCIC negative groups for M0 (P < 0.0001) (Fig. 3J), N0 (P = 0.017) (Fig. 3K), N1-N3 (P = 0.0015) (Fig. 3L), early-stage (P = 0.0043) (Fig. 3M), advanced-stage (P = 0.028) (Fig. 3N), age ≤ 60 years (P = 0.00099) (Fig. 3O), and age > 60 years (P = 0.02) (Fig. 3P). Within each subgroup, HoCIC positive patients exhibited worse DFS than HoCIC negative patients did. Consequently, HoCIC is clearly associated with poorer prognosis in patients with NSCLC, indicating that the presence of HoCIC in patients with NSCLC likely signifies an unfavourable prognosis.
Identification of BECN1 as a key gene for hocic in NSCLC
Considering that HoCIC are exclusively detected in tumor tissues of NSCLC but not observed in normal tissues, we compared the expression levels of 1,456 HoCIC related genes between tumor tissues and normal tissues using NSCLC transcriptome data from GEO databases. A total of 537, 800, and 709 differentially expressed genes (DEGs) were identified from the GSE101929, GSE40791, and GSE68465 datasets, respectively. These genes included 235 upregulated and 303 downregulated genes in GSE101929, 362 upregulated and 438 downregulated genes in GSE40791, and 383 upregulated and 326 downregulated genes in GSE68465 (Fig. 4A). To increase the reliability of the results, we intersected the DEGs from these three datasets by Venn diagrams, ultimately identifying 48 upregulated and 40 downregulated HoCIC differentially expressed genes (HoCICDEGs) (Fig. 4B). Heatmaps were generated to visualize these shared HoCICDEGs (Fig. 4C). Gene Ontology (GO) analysis revealed that HoCICDEGs participate in multiple biological processes (BP), such as positive regulation of cell adhesion (Fig. 4D). We calculated the enrichment frequency of each gene across pathways to identify key factors, designating genes with a frequency ≥ 50 as high frequency HoCICDEGs (HF-HoCICDEGs) (Fig. 4E). Given the substantial similarities between autophagy processes and HoCIC formation, autophagy related genes may concurrently participate in HoCIC development. Therefore, we screened 35 key autophagy-related genes from the autophagy pathway in the MSigDB database, which yielded only one overlapping gene, BECN1, when intersected with HF-HoCICDEGs (Fig. 4F). Compared with that in normal tissue controls, BECN1 expression was significantly elevated in NSCLC tumor tissues (Fig. 4G).
Functional enrichment analysis of BECN1
To further explore the potential biological roles and mechanisms of BECN1, we performed correlation analysis on BECN1 and conducted gene set enrichment analysis (GSEA) by ranking correlation coefficients. This comprehensive analysis deepens our insights into the biological processes and signaling pathways influenced by BECN1 in NSCLC. Figure 5A and C respectively show the GSEA functional enrichment results of BECN1 in LUAD and LUSC, respectively. In LUAD, BECN1 expressionis is correlated with protein secretion (closely associated with protein synthesis, processing, transport, and secretion processes), mitotic spindle (which utilizes microtubules and motor proteins to separate sister chromatids before cell division, ensuring accurate chromosome segregation [32]), MYC (a key member of the proto-oncogene family and a universal transcriptional enhancer that regulates nearly every physiological process within cells, including the cell cycle, proliferation, metabolism, differentiation, and apoptosis [33]), G2M checkpoint (the G2/M checkpoint functions to prevent cells from entering mitosis with unrepaired DNA damage, tumor cells primarily rely on the G2-M checkpoint to halt the cell cycle for DNA damage repair [34]), E2F (the E2F transcription factor family primarily regulates the expression of genes related to cell proliferation, apoptosis, and differentiation in a cell cycle-dependent manner [35]) (Fig. 5B). In LUSC, the expression of BECN1 is correlated mitotic spindle, MYC, G2M checkpoint, E2F, and TGFβ signaling (TGFβ signaling promotes tumor growth and invasion, evasion of immune surveillance, as well as dissemination and metastasis of cancer cells [36]) (Fig. 5D).
Prognostic analysis of BECN1 in NSCLC
The Kaplan-Meier plotter, a comprehensive database integrating GEO, EGA, and TCGA data, was utilized to investigate the association between BECN1 expression and NSCLC prognosis. Analysis of OS revealed that high BECN1 expression was correlated with poorer OS (P = 0.0016) (Fig. 6A). Significant differences in OS between the high and low BECN1 expression groups were consistently observed across multiple subgroup analyses, including females (P = 0.00077) (Fig. 6B), II stage patients (P = 0.00097) (Fig. 6C), adenocarcinoma cases (P = 0.027) (Fig. 6D), and patients without chemotherapy treatment (P = 0.017) (Fig. 6E). Analysis of progression-free survival (PFS) revealed that high BECN1 expression was significantly associated with poorer PFS (P = 0.0042) (Fig. 6F). Subgroup analyses consistently demonstrated this adverse prognostic relationship in males (P = 0.011) (Fig. 6G), II stage patients (P = 0.0011) (Fig. 6H), and individuals with smoking history (P = 0.00081) (Fig. 6I).
Association between BECN1 expression and immune infiltration
Given that HoCIC structures contribute to tumor cell evasion from immune cell killing in the tumor microenvironment, it was hypothesized that immune cell infiltration within the tumor microenvironment influences HoCIC formation. Concurrently, exploring the complex immune infiltration patterns in the tumor microenvironment facilitates the discovery of novel immunotherapy targets while advancing our understanding of tumorigenesis and progression mechanisms. Therefore, this study utilized the TIMER2.0 database to analyse the relationship between BECN1 and immune cell infiltration, with a focus on B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells. In LUAD, BECN1 expression was positively correlated with CD4+ T cells (r = 0.133, P < 0.001), macrophages (r = 0.139, P < 0.001), and neutrophils (r = 0.104, P < 0.001) (Fig. 7A). In LUSC, BECN1 expression was negatively correlated with CD8+ T cells (r = -0.127, P < 0.001), neutrophils (r = -0.103, P < 0.001), and dendritic cells (r = -0.101, P < 0.001) (Fig. 7B). To corroborate these findings, we performed immune infiltration analysis on TCGA data using the ssGSEA algorithm. The results revealed predominantly negative correlations between BECN1 expression and immune cells in both LUAD and LUSC (Fig. 7C, D). Finally, because the TIMER2.0 and ssGSEA results for LUAD conflicted, we employed the XCELL algorithm to clarify the relationship between BECN1 expression and immune cells. The data demonstrated that in LUAD, B cells, CD8+ T cells, CD4+ Th1 cells, macrophages, activated myeloid dendritic cells, and plasmacytoid dendritic cells were negatively correlated with BECN1 expression. In LUSC, BECN1 expression was negatively correlated with B cells, CD8+ T cells, effector memory CD4+ T cells, CD4+ Th1 cells, CD4+ Th2 cells, macrophages, myeloid dendritic cells, activated myeloid dendritic cells, and plasmacytoid dendritic cells (Fig. 7E).
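The correlation coefficients reported above relate a gene's expression to an estimated immune cell abundance across samples. As a minimal sketch, assuming a Spearman rank correlation (an assumption, since the text does not name the coefficient or any purity adjustment), the computation can be written in plain Python. The function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Here `x` would hold BECN1 expression per sample and `y` the infiltration estimate for one cell type. With coefficients of roughly ±0.1, as reported above, the association is weak even when the P value is small.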
Genetic alterations of BECN1
Genetic alterations of BECN1 in NSCLC were analyzed through the cBioPortal database, as such alterations play crucial roles in tumorigenesis and progression. The analysis revealed alterations in BECN1 in 1.3% of patients (Fig. 8A). The predominant genetic alteration type in LUAD was amplification of BECN1, whereas mutation of BECN1 constituted the primary alteration type in LUSC (Fig. 8B). Figure 8C delineates the types, frequency, and loci of BECN1 gene mutations, identifying 16 missense mutation sites across 450 amino acids. Specific changes within the BECN1 domain, exemplified by the Q309H variant, were observed in NSCLC patients. Furthermore, we investigated the relationship between BECN1 mRNA expression and copy number alterations. In LUAD, the most prevalent copy number categories associated with BECN1 expression were gain and diploid (Fig. 8D). In LUSC, the predominant copy number categories were gain, diploid, and shallow deletion (Fig. 8E).
Drug sensitivity analysis of BECN1
Correlation analysis between BECN1 mRNA expression and drug sensitivity was performed using the GDSC and CTRP databases. A positive correlation indicates that high expression may lead to drug resistance, whereas a negative correlation suggests that high expression may increase drug sensitivity. In the GDSC database, the top three drugs positively correlated with BECN1 expression were talazoparib, Genentech Cpd 10, and AG-014699, while the top three drugs negatively correlated with BECN1 expression were RDEA119, selumetinib, and trametinib (Fig. 9A). In the CTRP database, ML239, phloretin, and valdecoxib were most strongly negatively correlated with BECN1 expression, while GDC-0879, trametinib, and selumetinib were the three drugs most strongly positively correlated with BECN1 expression (Fig. 9B).
Validation of BECN1 expression profile in NSCLC clinical samples
To further validate the expression pattern of BECN1 in NSCLC, immunohistochemical experiments were performed to evaluate BECN1 expression in 98 NSCLC surgical specimens obtained from The Second Affiliated Hospital of Xi’an Jiaotong University. BECN1 staining, which appeared as pale yellow to brown granules, was primarily localized in the cytoplasm of NSCLC cells, with minimal expression observed in the nucleus (Fig. 10A-C). The rate of high Beclin1 expression was 16.3% in normal lung tissues and 56.1% in tumor tissues. BECN1 expression was significantly greater in tumor tissues than in normal lung tissues (P < 0.0001) (Fig. 10D). Furthermore, BECN1 expression correlated with the degree of differentiation in NSCLC. Compared with well-differentiated tumors, moderately differentiated (P = 0.0033) and poorly differentiated (P = 0.0026) NSCLC tissues presented a significantly increased proportion of high BECN1 expression (Fig. 10E). Notably, multiple HoCIC structures were observed in tumor tissues with high BECN1 expression (Fig. 10F). Moreover, BECN1 expression was significantly elevated in HoCIC positive patients compared with HoCIC negative patients (P = 0.032) (Fig. 10G).
Baseline characteristics of the validation cohort
This study enrolled 98 NSCLC patients, whose clinicopathological characteristics are summarized in Table 1. HoCIC positive patients accounted for 45.9%, whereas HoCIC negative patients constituted 54.1%. The histological subtypes of NSCLC were LUAD and LUSC, accounting for 68.4% and 31.6% of the cases, respectively. There were 76 patients (77.6%) with early-stage disease (TNM I-II) and 22 patients (22.4%) with advanced-stage disease (TNM III-IV). The HoCIC positive rate was significantly higher in patients with larger tumor size (37.1% vs. 61.1%, P = 0.037), poorer differentiation (21.7% vs. 40.8% vs. 76.9%, P < 0.001), advanced TNM stage (39.5% vs. 68.2%, P = 0.033), pleural invasion (43.0% vs. 100%, P = 0.018), vascular invasion (39.1% vs. 100%, P < 0.001), necrosis (33.3% vs. 75.9%, P < 0.001), Ki67 ≥ 36% (32.1% vs. 64.3%, P = 0.003), and high Beclin1 expression (32.6% vs. 56.4%, P = 0.032).
Validation of BECN1 as a diagnostic marker for HoCIC in NSCLC
Although BECN1 is a HoCIC core gene in NSCLC, its utility as a HoCIC biomarker requires further validation. Therefore, we propose constructing a HoCIC diagnostic model to evaluate the role of BECN1.
To prevent overfitting, LASSO regression analysis was employed to perform dimensionality reduction on the variables listed in Table 1, thereby mitigating the potential impact of collinearity among variables on the final model’s accuracy. Seven of the sixteen variables exhibited non-zero coefficients in the LASSO regression model (Fig. 11A, B): tumor size, differentiation state, pleural invasion, vascular invasion, necrosis, Ki67, and BECN1. Multivariate logistic regression analysis using backward stepwise screening of these seven variables identified necrosis and BECN1 as independent risk factors for HoCIC (Fig. 11C). NSCLC patients with necrosis and high BECN1 expression were significantly more likely to be HoCIC positive. A diagnostic nomogram incorporating Beclin1 was established on the basis of the risk factors screened through LASSO regression and multivariate logistic regression (Fig. 11D). In the nomogram, each patient receives points corresponding to their necrosis and BECN1 status on the points axis. Summing the points of all variables yields a total score. A vertical line drawn downward from the total points axis indicates the probability of HoCIC presence in NSCLC patients.
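The way a nomogram turns a logistic regression model into points and a probability can be sketched in Python as follows. The coefficients and intercept below are purely hypothetical placeholders chosen for illustration; they are not the fitted values from this study, and the function names are likewise assumptions. The sketch assumes binary (0/1) predictors with positive coefficients.

```python
import math

# Hypothetical log-odds coefficients for two binary (0/1) predictors.
# These are NOT the fitted values reported in the study.
COEFS = {"necrosis": 1.5, "becn1_high": 0.9}
INTERCEPT = -1.2


def nomogram_points(coefs):
    """Scale coefficients so the strongest predictor is worth 100 points."""
    top = max(coefs.values())
    if min(coefs.values()) < 0 or top <= 0:
        raise ValueError("this sketch assumes positive coefficients")
    return {name: 100.0 * beta / top for name, beta in coefs.items()}


def predict(patient, coefs=COEFS, intercept=INTERCEPT):
    """Return (total points, predicted probability of HoCIC) for one patient."""
    points = nomogram_points(coefs)
    total = sum(points[name] * patient[name] for name in coefs)
    # Map total points back to the linear predictor of the logistic model.
    linear_predictor = intercept + total * max(coefs.values()) / 100.0
    probability = 1.0 / (1.0 + math.exp(-linear_predictor))
    return total, probability


total_both, prob_both = predict({"necrosis": 1, "becn1_high": 1})
total_none, prob_none = predict({"necrosis": 0, "becn1_high": 0})
```

Reading the nomogram by hand and evaluating the logistic model directly give the same probability; the points axis is only a rescaling of the regression coefficients.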
We evaluated the performance of the predictive model in terms of discriminatory power and calibration. The ROC curve was employed to assess the discriminatory power of the predictive model. Necrosis alone demonstrated a suboptimal predictive ability for HoCIC (AUC = 0.678, 95% CI 0.591–0.765). Incorporating BECN1 increased the AUC to 0.723 (95% CI 0.626–0.821), outperforming the single-indicator AUC values (Fig. 11E). Internal validation using 1000 bootstrap repetitions yielded a validated model AUC of 0.723 (95% CI 0.624–0.822) (Fig. 11F), indicating favourable discriminatory power for the BECN1-based predictive model. Calibration was evaluated via the Hosmer‒Lemeshow test and calibration curve. The Hosmer‒Lemeshow test showed no evidence of poor fit (P = 0.4614); that is, the deviation between the risks predicted by the BECN1 model and the observed values was not statistically significant. The calibration curve likewise showed good agreement between the predicted and actual probabilities. Post-validation calibration analysis revealed that the bias-corrected curve of the BECN1 model remained close to the ideal curve (Fig. 11G), suggesting stable and consistent predictions. Decision curve analysis (DCA) was performed to compare the BECN1-based model against single-predictor models (Fig. 11H). The DCA showed a greater net benefit for the BECN1 model than for single indicators, a finding that was maintained after internal validation (Fig. 11I), supporting the model’s clinical applicability. These results collectively indicate the potential of BECN1 as a diagnostic biomarker for HoCIC.
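The evaluation metrics named above can be made concrete with a short Python sketch covering the AUC, a percentile bootstrap confidence interval, and the net benefit used in decision curve analysis. This is an illustration under stated assumptions: the function names and toy data are hypothetical, and the percentile bootstrap shown here is a simpler stand-in for whatever bootstrap validation procedure the study actually used.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    auc(labels, scores)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled = [labels[i] for i in idx]
        if 0 < sum(resampled) < n:  # skip resamples missing a class
            stats.append(auc(resampled, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def net_benefit(labels, probs, threshold):
    """Net benefit at one threshold probability, as plotted in a decision curve."""
    n = len(labels)
    tp = sum(1 for l, p in zip(labels, probs) if p >= threshold and l == 1)
    fp = sum(1 for l, p in zip(labels, probs) if p >= threshold and l == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy data: 1 = HoCIC positive, 0 = HoCIC negative (hypothetical values).
toy_labels = [0, 0, 1, 1]
toy_probs = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_probs)
toy_nb = net_benefit(toy_labels, toy_probs, threshold=0.3)
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the model against the treat-all and treat-none strategies.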
Discussion
Although numerous molecular indicators have emerged to guide clinical treatment and predict the prognosis of patients with NSCLC, their detection remains time-consuming and labor-intensive. Given the persistently poor prognosis of NSCLC patients, identifying novel and more direct prognostic predictors and therapeutic targets is particularly urgent. CIC is associated with embryonic development, viral infection, immune homeostasis in the liver, and tumor progression [8]. HoCIC occurs most frequently in tumor tissues. In our cohort, HoCIC was observed in 45.9% of NSCLC patients. The roles of HoCIC in tumors include supplying nutrients to the engulfing tumor cells [37, 38], inducing genomic instability [10], and promoting clonal selection through tumor cell competition and evolution [39–41], thereby impacting tumor prognosis. This study reveals that HoCIC positive patients exhibited worse overall survival (OS) and disease-free survival (DFS). Furthermore, HoCIC had adverse effects on patient survival across different clinicopathological subgroups. However, microscopic counting of HoCIC in tumor tissue sections is subjective. Therefore, identifying biomarkers capable of diagnosing HoCIC is essential to ensure the reliability of subsequent research on this phenomenon.
Mounting evidence indicates a connection between autophagy and CIC formation. During nutrient deprivation, the energy sensor AMPK becomes activated, a known inducer of autophagy. AMPK activation elevates cellular tension, increasing cell stiffness. This creates mechanical disparities between adjacent cells, which promotes HoCIC formation [11]. TM9SF4, a member of the transmembrane 9 protein (TM9) superfamily, functions as an autophagy regulator that initiates autophagy in response to nutrient starvation by inhibiting the nutrient-sensitive kinase complex mTORC1. Silencing TM9SF4 leads to increased mTORC1 activity, reduced autophagy levels, and decreased cell survival under starvation conditions [42]. Concurrently, silencing TM9SF4 significantly suppresses cell-in-cell (CIC) formation [43]. Both the cargo-encapsulating autophagosomes formed during autophagy and the vesicles enclosing internalized cells in HoCIC ultimately fuse with lysosomes. Relying on hydrolytic enzymes within lysosomes, the cargo and internalized cells are degraded, allowing the host cell to recycle nutrients and replenish energy deficits. Therefore, it is speculated that the degradation of internalized cells in HoCIC also depends on autophagy related proteins. Florey et al. [44] reported that the vacuolar membrane enclosing internalized cells recruits autophagy-related proteins including Atg5, Atg7, and Vps34 via autophagy mechanisms, along with microtubule-associated protein 1 light chain 3 (LC3), without forming autophagosomes. The LC3-targeted endocytic vacuoles recruit lysosomes, leading to degradation of the internalized cells. Taken together, under adverse microenvironments, autophagy and HoCIC formation mechanisms can be co-regulated. Tumor cells that initiate autophagy may simultaneously engulf adjacent tumor cells to form HoCIC structures. Consequently, key proteins involved in autophagy may also participate in HoCIC formation and hold promise as signature biomarkers for HoCIC.
With advancements in high-throughput sequencing technologies and computational biology, bioinformatics has been increasingly widely and deeply applied in tumor research, becoming an important tool for deciphering tumorigenesis mechanisms and guiding clinical interventions. Publicly available bioinformatics tools and data resources can be employed for differential expression analysis, immune infiltration analysis, functional enrichment analysis, etc., enabling the identification of key molecules and biological processes closely associated with tumors based on large-scale omics data [19]. Differential expression analysis reveals changes in gene expression levels under different physiological or pathological states, facilitating the identification of significantly up- or down-regulated genes in tumor tissues. These genes are often implicated in critical processes such as cell proliferation, apoptosis, invasion, and metastasis, thereby providing a theoretical foundation for screening potential oncogenes. Immune infiltration analysis can be used to evaluate the distribution and abundance of various immune cells within the tumor microenvironment, helping to elucidate immune evasion mechanisms and predict immunotherapy responses, thus supporting the development of tumor immunotherapy strategies. Functional enrichment analysis enables the biological annotation of differentially expressed genes, identifying their potential involvement in functional categories and signaling pathways—such as cell cycle regulation, DNA damage repair, and apoptosis—thereby providing a basis for elucidating the molecular mechanisms of tumors. In clinical applications, bioinformatics methods empower researchers to construct risk scoring models for prognostic prediction and identify core molecules with potential drug-targeting value, thus advancing the development of individualized therapy and precision medicine. 
In summary, bioinformatics technology not only facilitates the systematic elucidation of the molecular basis of tumors but also demonstrates significant clinical translational value in tumor diagnosis, treatment, and prognosis assessment.
This study utilized NSCLC datasets from GEO to analyse HoCIC differentially expressed genes in normal versus tumor tissues. By integrating autophagy pathway marker molecules, we identified BECN1 as a pivotal gene in HoCIC formation. BECN1 is an evolutionarily conserved autophagy-related protein that plays a key role in autophagosome biogenesis and is recognized as a positive regulator of autophagy [18]. Autophagy plays a role in tumorigenesis and is often described as a ‘double-edged sword’, concurrently exhibiting tumor-suppressive and tumor-promoting functions [45]. This paradoxical nature depends on the specific stage of cancer development and is further influenced by contextual factors such as nutrient availability and microenvironmental stress [46]. Consistent with the paradoxical role of autophagy, BECN1 expression may exhibit completely opposite patterns across different tumors or even in distinct studies of the same tumor type. In studies by Lei et al. [47] and XiuYi et al. [48], BECN1 expression was lower in NSCLC tumor tissues than in adjacent noncancerous tissues. However, Wan et al. [49] demonstrated through PCR and FISH analyses that BECN1 expression was significantly elevated in NSCLC tumor tissues compared with normal tissues and was correlated with poor prognosis. This conclusion aligns with our findings: analysis of NSCLC transcriptomic data from the TCGA database revealed significantly increased BECN1 expression in tumor tissues compared with normal tissues, and high BECN1 expression was associated with poorer OS and progression-free survival (PFS). Furthermore, we validated BECN1 expression in NSCLC clinical samples by immunohistochemistry. The results similarly demonstrated significantly higher BECN1 expression in tumor tissues than in normal tissues, with a greater proportion of high expression in moderately and poorly differentiated tumors. Notably, BECN1 expression was also correlated with HoCIC.
BECN1 expression was significantly higher in HoCIC positive samples than in HoCIC negative cases (P < 0.05). Among HoCIC positive patients, the rate of high BECN1 expression reached 68.9%. Therefore, high expression of BECN1 may indicate the presence of HoCIC in NSCLC patients.
Our study identified BECN1 as a key gene with diagnostic potential for HoCIC. In both LUAD and LUSC, BECN1 functions were significantly enriched in the mitotic spindle, G2M checkpoint, and MYC hallmark pathways. The mitotic spindle and G2M checkpoint are critical processes in mitosis. Mitosis can drive HoCIC formation, a process regulated by Cdc42, which controls mitotic morphology. Deficiency of Cdc42 enhances the loss of adhesion and cell rounding during mitosis, enabling neighboring tumor cells to engulf mitotic cells and form HoCIC structures. These biophysical changes depend on RhoA activation and can be achieved by inhibiting Rap1, thereby creating conditions for subsequent endocytosis. Cdk1 inhibitors arrest cells in the G2/M phase, preventing entry into mitosis and significantly reducing endocytosis, whereas paclitaxel arrests cells in prophase, increasing the number of rounded mitotic cells and enhancing endocytosis [50]. In addition, cells become more rigid during mitosis, which is consistent with the observed mechanism whereby stiffer cells invade softer cells to form HoCIC structures [51]. Within tumors, distinct cell populations compete directly or indirectly for survival on the basis of differential gene expression, exemplifying survival of the fittest. Tumor cells exhibiting high c-Myc expression engulf those with low c-Myc expression. This endocytosis may confer a survival advantage to ‘winner’ tumor cells acquiring Myc mutations, thereby driving intratumoral heterogeneity [52].
Analysis of tumor tissue sections from murine models pre- and post-immunotherapy revealed increased occurrence of one tumor cell residing within another. This phenomenon was corroborated by in vitro experiments. Further investigation demonstrated that immunotherapy induces HoCIC formation, which shields internal tumor cells from immune cell-mediated killing, consequently contributing to tumor recurrence and drug resistance [53]. Our study revealed that BECN1 expression is negatively correlated with B cells, CD8+ T cells, CD4+ Th1 cells, macrophages, activated myeloid dendritic cells, and plasmacytoid dendritic cells in both LUAD and LUSC. These findings indicate that high BECN1 expression may suppress immune cell infiltration within the tumor microenvironment, thereby negatively regulating immune responses and promoting tumor progression. Consequently, these findings suggest that BECN1 likely does not facilitate HoCIC formation through the induction of immune cells.
To further determine the diagnostic role of BECN1 in HoCIC, we performed LASSO and multivariate logistic regression analyses, which revealed that BECN1 is an independent risk factor for HoCIC with favourable diagnostic capability. Necrosis was also identified as being associated with HoCIC. Consequently, a diagnostic model for HoCIC was constructed based on BECN1. Receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA) were employed to evaluate the model’s discriminatory power, calibration, and clinical utility. The results demonstrated that the BECN1-based diagnostic model exhibited robust performance. Therefore, BECN1 has potential as a diagnostic marker for HoCIC.
Although this study systematically analysed and identified the role of BECN1 in the diagnosis of HoCIC and its underlying mechanisms, certain limitations remain. First, our analysis utilized data from different databases, which may harbor systematic biases. Second, the validation cohort was relatively small, and a larger number of clinical samples is needed to enhance the credibility of the results. Finally, we did not experimentally validate the impact of BECN1 on HoCIC formation at the cellular level. In future studies, we will address these limitations to deepen the understanding of BECN1 as a HoCIC biomarker, thereby establishing a more solid foundation for HoCIC research in NSCLC.
In summary, HoCIC is associated with poor prognosis in patients with NSCLC. However, as HoCIC structures reside within densely packed tumor cells, they are easily overlooked during visual inspection, and manual counting is highly subjective. Moreover, incomplete sampling of tumor tissue may hinder the accurate representation of true tumor status, potentially leading to underestimation of HoCIC and impacting clinical prognosis assessment. Therefore, using public datasets, bioinformatics analysis, and clinical sample validation, we identified an association between BECN1 and the HoCIC phenomenon. This association may involve the participation of BECN1 in tumor cell mitosis and the MYC pathway, influencing the HoCIC formation process. A significant difference in BECN1 expression levels was observed between the HoCIC positive and HoCIC negative NSCLC groups. Compared with HoCIC negative patients, HoCIC positive patients presented a markedly greater rate of high BECN1 expression. High BECN1 expression was identified as an independent risk factor for HoCIC. We constructed a diagnostic model for HoCIC based on BECN1, which demonstrated favourable diagnostic performance. These findings indicate the potential of BECN1 as a specific detection biomarker for HoCIC, offering insights and a foundation for further research into HoCIC.
Although numerous molecular indicators have emerged to guide clinical treatment and predict the prognosis of patients with NSCLC, their detection remains time-consuming and labor-intensive. Given the persistently poor prognosis of NSCLC patients, identifying novel and more direct prognostic predictors and therapeutic targets is particularly urgent. CIC is associated with embryonic development, viral infection, immune homeostasis in the liver, and tumor progression [8]. HoCIC occurs most frequently in tumor tissues. Our findings revealed that HoCIC was observed in 45.9% of NSCLC patients. The roles of HoCIC in tumors include: supplying nutrients for themselves [37, 38], inducing genomic instability [10], and promoting clonal selection through tumor cell competition and evolution [39–41], thereby impacting tumor prognosis. This study reveal that HoCIC positive patients exhibited worse overall survival (OS) and disease-free survival (DFS). Furthermore, across different clinicopathological characteristic subgroups, HoCIC has adverse effects on patient survival. However, microscopic counting of HoCIC in tumor tissue sections is subjective. Therefore, identifying biomarkers capable of diagnosing HoCIC is essential to ensure the reliability of subsequent research on this phenomenon.
Mounting evidence indicates a connection between autophagy and CIC formation. During nutrient deprivation, the energy sensor AMPK becomes activated, a known inducer of autophagy. AMPK activation elevates cellular tension, increasing cell stiffness. This creates mechanical disparities between adjacent cells, which promotes HoCIC formation [11]. TM9SF4, a member of the transmembrane 9 protein (TM9) superfamily, functions as an autophagy regulator that initiates autophagy in response to nutrient starvation by inhibiting the nutrient-sensitive kinase complex mTORC1. Silencing TM9SF4 leads to increased mTORC1 activity, reduced autophagy levels, and decreased cell survival under starvation conditions [42]. Concurrently, silencing TM9SF4 significantly suppresses cell-in-cell (CIC) formation [43]. Both the cargo-encapsulating autophagosomes formed during autophagy and the vesicles enclosing internalized cells in HoCIC ultimately fuse with lysosomes. Relying on hydrolytic enzymes within lysosomes, the cargo and internalized cells are degraded, allowing the host cell to recycle nutrients and replenish energy deficits. Therefore, it is speculated that the degradation of internalized cells in HoCIC also depends on autophagy related proteins. Florey et al. [44] reported that the vacuolar membrane enclosing internalized cells recruits autophagy-related proteins including Atg5, Atg7, and Vps34 via autophagy mechanisms, along with microtubule-associated protein 1 light chain 3 (LC3), without forming autophagosomes. The LC3-targeted endocytic vacuoles recruit lysosomes, leading to degradation of the internalized cells. Taken together, under adverse microenvironments, autophagy and HoCIC formation mechanisms can be co-regulated. Tumor cells that initiate autophagy may simultaneously engulf adjacent tumor cells to form HoCIC structures. Consequently, key proteins involved in autophagy may also participate in HoCIC formation and hold promise as signature biomarkers for HoCIC.
With advancements in high-throughput sequencing technologies and computational biology, bioinformatics has been increasingly widely and deeply applied in tumor research, becoming an important tool for deciphering tumorigenesis mechanisms and guiding clinical interventions. Publicly available bioinformatics tools and data resources can be employed for differential expression analysis, immune infiltration analysis, functional enrichment analysis, etc., enabling the identification of key molecules and biological processes closely associated with tumors based on large-scale omics data [19]. Differential expression analysis reveals changes in gene expression levels under different physiological or pathological states, facilitating the identification of significantly up- or down-regulated genes in tumor tissues. These genes are often implicated in critical processes such as cell proliferation, apoptosis, invasion, and metastasis, thereby providing a theoretical foundation for screening potential oncogenes. Immune infiltration analysis can be used to evaluate the distribution and abundance of various immune cells within the tumor microenvironment, helping to elucidate immune evasion mechanisms and predict immunotherapy responses, thus supporting the development of tumor immunotherapy strategies. Functional enrichment analysis enables the biological annotation of differentially expressed genes, identifying their potential involvement in functional categories and signaling pathways—such as cell cycle regulation, DNA damage repair, and apoptosis—thereby providing a basis for elucidating the molecular mechanisms of tumors. In clinical applications, bioinformatics methods empower researchers to construct risk scoring models for prognostic prediction and identify core molecules with potential drug-targeting value, thus advancing the development of individualized therapy and precision medicine. 
In summary, bioinformatics technology not only facilitates the systematic elucidation of the molecular basis of tumors but also demonstrates significant clinical translational value in tumor diagnosis, treatment, and prognosis assessment.
This study utilized NSCLC datasets from GEO to analyse HoCIC differentially expressed genesin normal versus tumor tissues. By integrating autophagy pathway marker molecules, we identified BECN1 as a pivotal gene in HoCIC formation. BECN1 is an evolutionarily conserved autophagy-related protein that plays a key role in autophagosome biogenesis and is recognized as a positive regulator of autophagy [18]. Autophagy plays a role in tumorigenesis and is often described as a ‘double-edged sword’, concurrently exhibiting tumor-suppressive and tumor-promoting functions [45]. This paradoxical nature depends on the specific stage of cancer development and is further influenced by contextual factors such as nutrient availability and microenvironmental stress [46]. Consistent with the paradoxical role of autophagy, BECN1 expression may exhibit completely opposite patterns across different tumors or even in distinct studies of the same tumor type. In studies by Lei et al. [47] and XiuYi et al. [48], BECN1 expression was lower in NSCLC tumor tissues than in adjacent noncancerous tissues. However, Wan et al. [49] demonstrated through PCR and FISH analyses that BECN1 expression was significantly elevated in NSCLC tumor tissues compared with normal tissues and was correlated with poor prognosis. This conclusion aligns with our findings: analysis of NSCLC transcriptomic data from the TCGA database revealed significantly increased BECN1 expression in tumor tissues compared with normal tissues, which was closely associated with poorer OS and progression-free survival (PFS). Furthermore, we validated BECN1 expression in NSCLC clinical samples by immunohistochemistry. The results similarly demonstrated significant differences in BECN1 expression between normal and tumor tissues, showing a positive correlation with the degree of tumor differentiation. Notably, we observed with excitement that BECN1 expression correlates with HoCIC. 
Compared with HoCIC negative cases, BECN1 expression in HoCIC positive samples demonstrated statistical significance (P < 0.05). Among HoCIC positive patients, the high expression rate of BECN1 reached 68.9%. Therefore, high expression of BECN1 may indicate the presence of HoCIC in NSCLC patients.
Our study identified BECN1 as a key gene with diagnostic potential for HoCIC. In both LUAD and LUSC, BECN1 functions were significantly enriched in the mitotic spindle, G2M checkpoint, and MYC hallmark pathways. The mitotic spindle and G2M checkpoint are critical processes in mitosis. Mitosis can drive HoCIC formation, a process regulated by Cdc42 which controls mitotic morphology. Deficiency of Cdc42 enhances the loss of adhesion and cell rounding during mitosis, enabling neighboring tumor cells to engulf mitotic cells and form HoCIC structures. These biophysical changes depend on RhoA activation and can be achieved by inhibiting Rap1, thereby creating conditions for subsequent endocytosis. Cdk1 inhibitors arrest cells in the G2/M phase, preventing entry into mitosis and significantly reducing endocytosis; whereas paclitaxel arrests cells in prophase, increasing the number of rounded mitotic cells and enhancing endocytosis [50]. Secondly, during mitosis, cells become more rigid, which is consistent with the observed mechanism whereby stiffer cells invade softer cells to form HoCIC structures [51]. Within tumors, distinct cell populations compete directly or indirectly for survival on the basis of differential gene expression—exemplifying survival of the fittest. Tumor cells exhibiting high c-Myc expression engulf those with low c-Myc expression. This endocytosis may confer a survival advantage to ‘winner’ tumor cells acquiring Myc mutations, thereby driving intratumoral heterogeneity [52].
Analysis of tumor tissue sections from murine models pre- and post-immunotherapy revealed increased occurrence of one tumor cell residing within another. This phenomenon was corroborated by in vitro experiments. Further investigation demonstrated that immunotherapy induces HoCIC formation, which shields internal tumor cells from immune cell-mediated killing, consequently contributing to tumor recurrence and drug resistance [53]. Our study revealed that BECN1 expression is negatively correlated with B cells, CD8 + T cells, CD4 + Th1 cells, macrophages, activated myeloid dendritic cells, and plasmacytoid dendritic cells in both LUAD and LUSC. These findings indicate that high BECN1 expression may suppress immune cell infiltration within the tumor microenvironment, thereby negatively regulating immune responses and promoting tumor progression. Consequently, these findings suggest that BECN1 likely does not facilitate HoCIC formation through the induction of immune cells.
To further assess the diagnostic role of BECN1 in HoCIC, we performed LASSO and multivariate logistic regression analyses, which showed that BECN1 is an independent risk factor for HoCIC with favourable diagnostic capability. Necrosis was also identified as being associated with HoCIC. Consequently, a diagnostic model for HoCIC was constructed based on BECN1. Receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA) were employed to evaluate the model's discriminatory power, calibration, and clinical utility, respectively. The results demonstrated that the BECN1-based diagnostic model exhibited robust performance. Therefore, BECN1 has potential as a diagnostic marker for HoCIC.
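As a rough illustration of two of the evaluation metrics named above, the following minimal sketch computes a ROC AUC and a decision-curve net benefit in plain Python. It is not the authors' analysis pipeline. The function names, the toy labels (1 = HoCIC-positive, 0 = HoCIC-negative) and the predicted probabilities are hypothetical placeholders, not study data, and the net benefit uses the standard definition TP/n - (FP/n) * pt/(1 - pt) at threshold probability pt.

```python
# Illustrative sketch only: toy inputs, NOT data or code from the study.

def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Decision-curve net benefit of the model at a threshold probability:
    TP/n - (FP/n) * threshold / (1 - threshold)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def treat_all_net_benefit(labels, threshold):
    """Net benefit of the reference strategy that calls every case positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy cohort: observed status and model-predicted probabilities.
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    print("AUC:", roc_auc(toy_labels, toy_scores))
    print("Net benefit (model, pt=0.5):", net_benefit(toy_labels, toy_scores, 0.5))
    print("Net benefit (treat all, pt=0.5):", treat_all_net_benefit(toy_labels, 0.5))
```

In a decision curve, the model's net benefit is plotted across a range of threshold probabilities and compared with the "treat all" and "treat none" (net benefit 0) strategies; a model is considered clinically useful over the thresholds where its curve lies above both.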
Although this study systematically analysed and identified the role of BECN1 in the diagnosis of HoCIC and its underlying mechanisms, certain limitations remain. First, our analysis utilized data from different databases, which may harbour systematic biases. Second, the validation cohort was relatively small, and a larger set of clinical samples is necessary to enhance the credibility of the results. Finally, we did not experimentally validate the impact of BECN1 on HoCIC formation at the cellular level. In future studies, we will address these limitations to deepen the understanding of BECN1 as a HoCIC biomarker, thereby establishing a more solid foundation for HoCIC research in NSCLC.
In summary, HoCIC is associated with poor prognosis in patients with NSCLC. However, because HoCIC structures reside within densely packed tumor cells, they are easily overlooked during visual inspection, and manual counting is highly subjective. Moreover, incomplete sampling of tumor tissue may fail to represent the true tumor status, potentially leading to underestimation of HoCIC and affecting clinical prognosis assessment. Therefore, using public datasets, bioinformatics analysis, and clinical sample validation, we identified an association between BECN1 and the HoCIC phenomenon. This association may involve the participation of BECN1 in tumor cell mitosis and the MYC pathway, which could influence the HoCIC formation process. BECN1 expression levels differed significantly between the HoCIC-positive and HoCIC-negative NSCLC groups: compared with HoCIC-negative patients, HoCIC-positive patients presented a markedly greater rate of high BECN1 expression. High BECN1 expression was identified as an independent risk factor for HoCIC. We constructed a diagnostic model for HoCIC based on BECN1, which demonstrated robust diagnostic performance. These findings indicate the potential of BECN1 as a specific detection biomarker for HoCIC, offering insights and a foundation for further research into HoCIC.
Conclusions
Using public datasets, bioinformatics, multivariate statistical analysis, and clinical sample validation, this study identified BECN1 as a core gene associated with HoCIC in NSCLC. BECN1 shows considerable potential value in HoCIC diagnosis and represents a promising biomarker for diagnosing and predicting HoCIC. It appears to function in HoCIC formation via pathways related to mitosis and MYC. This study provides a foundation for more reliable investigations into the roles and mechanisms of HoCIC. Future research should further elucidate how BECN1 influences the formation process and specific mechanisms of HoCIC at the cellular level, thereby providing more robust evidence for its utility as a HoCIC diagnostic marker.
Supplementary Information
Below is the link to the electronic supplementary material.
Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.