CKAP2, miR-941, miR-548 and LINC02577 as biomarkers for early diagnosis in colorectal cancer.
APA
Maharati A, Malekifar MJ, et al. (2025). CKAP2, miR-941, miR-548 and LINC02577 as biomarkers for early diagnosis in colorectal cancer. Scientific Reports, 16(1), 740. https://doi.org/10.1038/s41598-025-30520-5
MLA
Maharati A, et al. "CKAP2, miR-941, miR-548 and LINC02577 as biomarkers for early diagnosis in colorectal cancer." Scientific Reports, vol. 16, no. 1, 2025, p. 740.
PMID
41476086
Abstract
Colorectal cancer (CRC) is a leading cause of cancer-related mortality globally, necessitating biomarkers for early detection and targeted therapies. We integrated RNA-seq (GSE180440) weighted gene co-expression analysis with experimental validation to identify coding and long non-coding RNA biomarkers associated with tumor biology and treatment response. Bioinformatics analysis of the GSE180440 dataset (145 tumor, 45 normal samples) identified differentially expressed genes (DEGs) and lncRNAs (DELs) using DESeq2. Weighted gene co-expression network analysis (WGCNA) identified modules linked to CRC traits. Functional enrichment, protein-protein interaction (PPI), and miRNA-gene networks were constructed. Validation used TCGA-COAD and RT-qPCR on 61 paired CRC samples. Diagnostic performance was assessed by ROC/AUC (pROC). Drug-response and survival analyses were performed with GEPIA3. From 672 DEGs and WGCNA, LINC02577, LINC00294 and CKAP2 emerged as hub candidates together with miR-548k and miR-941. In TCGA-COAD, CKAP2 (log2FC = 1.501, P < 0.001) and LINC02577 (log2FC = 6.676, P < 0.001) were significantly upregulated, while LINC00294 was downregulated (log2FC = −1.104, P < 0.001). Diagnostic AUCs (tumor vs. normal) were: LINC02577 0.982, CKAP2 0.797 and CEA 0.663; LINC00294 showed poor discrimination (AUC = 0.055). In our cohort, CKAP2 was associated with tumor size > 5 cm (P = 0.025, AUC = 0.667) and LINC02577 was associated with nodal involvement in females (P = 0.028, AUC = 0.736). miR-548k overexpression correlated with early invasion (P = 0.044, AUC = 0.667). LINC02577 expression was higher in patients with progressive disease after leucovorin-containing regimens (P = 0.009). None of the selected genes was prognostic for overall survival. We identified CKAP2, LINC02577, miR-941, and miR-548k as key biomarkers with distinct expression patterns in early-stage CRC. Their altered expression in small, early-stage tumors reveals their potential for early CRC detection.
LINC02577 and CKAP2 show strong diagnostic utility (AUCs 0.982 and 0.797) outperforming CEA, and LINC02577 may predict poor response to leucovorin-containing therapy (P = 0.009). These candidates warrant functional studies to define mechanisms and evaluate clinical applicability.
Keywords / MeSH
- Humans
- MicroRNAs
- Colorectal Neoplasms
- RNA, Long Noncoding
- Biomarkers, Tumor
- Gene Expression Regulation, Neoplastic
- Female
- Early Detection of Cancer
- Male
- Gene Regulatory Networks
- Prognosis
- Gene Expression Profiling
- Computational Biology
- Protein Interaction Maps
- Middle Aged
- Biomarker
- CKAP2
- Colorectal cancer
- LINC02577
- miR-941
- miR-548k
Introduction
Colorectal cancer (CRC) is the third most commonly diagnosed cancer globally, with an estimated 1.9 million new cases in 2022. It is also the second leading cause of cancer-related mortality, responsible for nearly 904,000 deaths worldwide. In men and women alike, CRC ranks third in both incidence and mortality rates [1]. Between 2014 and 2017, approximately 43,580 new cases of CRC were reported in Iran, with an age-standardized incidence rate (ASR) of 114.49 per 100,000 individuals. The incidence was notably higher in males (ASR of 134.45) compared to females (ASR of 94.85). By 2025, the number of CRC cases in Iran is projected to rise to 17,812, with an expected ASR of 17.72. Advancing the diagnosis of CRC from stage IV to stage III is projected to result in a 38% reduction in CRC-specific mortality, corresponding to approximately 11 fewer deaths per 100,000 individuals. This underscores the critical role of earlier detection in significantly improving survival outcomes and reducing disease burden in CRC patients [3].
Due to the complex and heterogeneous nature of CRC, a wide range of genetic and epigenetic alterations contribute to various signaling pathways related to CRC [4]. Some alterations have already been integrated into clinical practice as predictive and prognostic markers, such as KRAS/NRAS and BRAF mutations [5–9]. In 2009, the assessment of KRAS mutation status was incorporated into the therapeutic decision-making process for metastatic CRC patients treated with panitumumab and cetuximab, as the presence of KRAS mutations renders these therapies ineffective. This marked CRC as the first common cancer for which molecular testing became a mandatory step in therapeutic planning [10]. Additionally, the overexpression of specific microRNAs (miRNAs), such as miR-92a and miR-21, has shown significant potential as diagnostic biomarkers [11–13]. The consistent alteration in the expression of non-coding RNAs (ncRNAs) across various biological specimens, including tissue, blood, and stool, highlights their relevance in monitoring cancer progression and recurrence. Gaining deeper insights into these alterations can facilitate the identification of robust diagnostic, prognostic, and predictive biomarkers, thereby enhancing future management strategies for CRC [14]. Long non-coding RNAs (lncRNAs), another class of ncRNAs, are emerging as potential contributors to CRC management, though they remain significantly understudied compared to microRNAs. LncRNAs are defined as non-coding transcripts longer than 200 nucleotides and are involved in a wide range of biological processes, including genome organization, gene expression regulation and cellular functions such as differentiation, development, stress responses, and metabolic regulation [15].
LncRNAs can act as oncogenes or tumor suppressor genes and are closely associated with cancer hallmarks, including cell cycle progression, inhibition of apoptosis, angiogenesis, and metastasis. They facilitate metastasis through various signaling pathways, influencing metastasis-related genes at both transcriptional and post-transcriptional levels. This can occur via their interactions with chromatin-remodeling complexes or by acting as decoys, sponging anti-metastatic microRNAs [16,17].
This connection between lncRNAs and miRNAs directly ties into the concept of competing endogenous RNAs (ceRNAs). ceRNAs, including lncRNAs, compete for shared miRNAs, affecting the regulation of gene expression. When lncRNAs act as “decoys” for miRNAs, they prevent miRNAs from silencing their target mRNAs and effectively influence multiple cancer-related pathways [18]. ceRNA activity impacts processes such as tumor growth, metastasis, and resistance to cell death by altering the balance of gene expression in many cancers, including CRC. Understanding the functions of lncRNAs, including lncRNA-miRNA-mRNA relationships, is therefore essential for identifying new biomarkers and developing more effective treatments in CRC management [19].
Certain lncRNAs have shown promise as non-invasive diagnostic biomarkers. For example, PCA3 was approved by the FDA in 2012 for molecular diagnosis of prostate cancer, and MALAT1 has been identified as a salivary biomarker for oral squamous cell carcinoma [20,21]. In CRC, HOTAIR and CCAT1 were among the first lncRNAs proposed as diagnostic markers due to their elevated plasma levels in patients compared to healthy individuals [22,23]. Recent studies have identified NALT1, which facilitates late-stage CRC development by sponging miR-574-5p [24]; LINC00887, which promotes CRC metastasis by increasing H3K27cr levels; and SNHG1, which enhances CRC cell metastasis through recruitment of the HNRNPD protein to stabilize SERPINA3 mRNA [25,26]. Such discoveries are laying the foundation for the clinical application of lncRNAs, helping to unlock their full potential as biomarkers and therapeutic targets in CRC.
Next-generation sequencing (NGS) advancements have revolutionized the exploration of ncRNAs, including lncRNAs. These technologies enable discovery and detailed characterization of lncRNA structures and functions, providing insights into their roles in gene regulation and cellular processes. Additionally, the integration of NGS with multi-omics approaches facilitates a comprehensive understanding of lncRNA-related regulatory mechanisms [27,28]. NGS allows large-scale studies comparing lncRNA expression between different cancer stages and tissues, confirming their biomarker potential for diagnosis and prognosis [29,30]. Machine learning (ML) plays a crucial role in analyzing the large volumes of complex data produced by NGS. ML algorithms can classify lncRNAs, predict their interactions with other molecules, and identify patterns linked to clinical outcomes. ML also integrates multi-omic data to reveal how lncRNAs fit into broader cancer regulatory networks, aiding in the identification of therapeutic targets. Furthermore, ML improves the accuracy of CRC subtype classification based on lncRNA profiles, helping to personalize treatments [31].
This study aims to investigate the roles of specific non-coding and coding genes in the pathogenesis of colorectal cancer (CRC). Initially, we identified two lncRNAs and one coding gene through bioinformatics analyses, based on their potential significance in CRC. These candidates will be further examined through wet lab experiments to validate their functional roles and assess their viability as biomarkers for CRC.
Materials and methods
Raw data acquisition, clinical information and differential expression
We downloaded the GSE180440 dataset, an RNA-sequencing profile generated on the GPL11154 (Illumina HiSeq 2000) platform, from the Gene Expression Omnibus (GEO) to investigate DElncRNAs and DEmRNAs. The dataset includes 145 tumor samples (12 stage I, 61 stage II, and 72 stage III; 122 microsatellite stable (MSS) and 23 microsatellite instability (MSI)) and 45 non-tumor samples. Sequencing FASTQ files were mapped to the GRCh38/hg38 human reference genome with HISAT2. Gene counts were produced by HTSeq-count, and differential expression of long noncoding and protein-coding transcripts was determined using DESeq2 in R (4.3.3), with significance defined as |log2FC| > 1, adjusted P < 0.05 and baseMean > 50.
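The three significance thresholds above can be illustrated with a small filter. This is a minimal sketch in Python rather than the authors' R/DESeq2 code; the `DEResult` record, the gene names, and all numeric values are hypothetical and only show how the cut-offs combine.

```python
from dataclasses import dataclass


@dataclass
class DEResult:
    """One row of a differential-expression table (hypothetical layout)."""
    gene: str
    base_mean: float  # mean normalized count across samples
    log2_fc: float    # log2 fold change, tumor vs. normal
    padj: float       # multiple-testing adjusted P value


def is_significant(row, lfc_cut=1.0, padj_cut=0.05, base_mean_cut=50.0):
    """Apply the thresholds from the text: |log2FC| > 1, adjusted P < 0.05, baseMean > 50."""
    return (
        abs(row.log2_fc) > lfc_cut
        and row.padj < padj_cut
        and row.base_mean > base_mean_cut
    )


# Invented rows for illustration only (not values from the study).
rows = [
    DEResult("GENE_A", 820.0, 2.4, 1e-6),   # passes all three filters
    DEResult("GENE_B", 12.0, 3.1, 1e-4),    # fails: baseMean too low
    DEResult("GENE_C", 400.0, 0.6, 1e-3),   # fails: effect size too small
    DEResult("GENE_D", 150.0, -1.8, 0.20),  # fails: adjusted P too high
    DEResult("GENE_E", 95.0, -1.2, 0.01),   # passes (downregulated)
]

selected = [row.gene for row in rows if is_significant(row)]
print(selected)
```

Because the filter uses the absolute value of the fold change, both upregulated and downregulated transcripts can pass.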
WGCNA analysis
The DEGs/DELs were used to build a weighted co-expression network with the R package “WGCNA” [32]. We used the “goodSamplesGenes” function to check the data for excessive missing values and to identify outlier samples. The pairwise gene similarity matrix, based on Pearson correlation, was converted into an adjacency matrix. A scale-free co-expression network was then built: the pickSoftThreshold function was used to select the lowest soft-thresholding power (β = 18) at which the adjacency matrix satisfied the scale-free topology criterion. Scale independence and mean connectivity were calculated, and the clinical trait data were loaded. The adjacency matrix was transformed into a topological overlap matrix (TOM) to assess gene network connectedness, defined here as the sum of a gene’s adjacency scores with all other genes. This TOM was used for module detection with a minimum module size of 30 and a module merging cut height of 0.25. Hierarchical clustering of the TOM produced a dendrogram from which modules were identified. Genes not assigned to a specific module were colored grey. Module eigengenes (MEs) were then related to the clinical traits. We determined the module membership (MM), defined as the correlation between a module eigengene and a gene’s expression profile, and the gene significance (GS), which measures the association of an individual gene with the clinically important traits. Finally, genes with GS > 0.6 and MM > 0.8 were selected as candidate hub genes.
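The adjacency and TOM steps can be sketched on a toy matrix. The code below is an illustrative Python re-implementation, not the WGCNA package itself. It assumes an unsigned network (adjacency = |Pearson r|^β) and the standard unsigned topological-overlap formula; the text does not state which network type the authors used, and the expression values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency(expr, beta):
    """Unsigned soft-threshold adjacency: a_ij = |cor(i, j)| ** beta, zero diagonal."""
    n = len(expr)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                adj[i][j] = abs(pearson(expr[i], expr[j])) ** beta
    return adj


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity: summed adjacency to all other genes
    tom = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                shared = sum(adj[i][u] * adj[u][j] for u in range(n) if u not in (i, j))
                tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


# Toy expression matrix: rows are genes, columns are samples (invented values).
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],  # almost perfectly correlated with the first gene
    [5.0, 1.0, 4.0, 2.0, 3.0],   # weakly correlated with the first gene
]
adj = adjacency(expr, beta=18)   # beta = 18 is the power reported in the text
tom = topological_overlap(adj)
print(round(tom[0][1], 3), tom[0][2])
```

With a power as high as 18, only very strong correlations retain appreciable adjacency, which is why the first two toy genes overlap strongly while the third is effectively disconnected.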
Functional enrichment analysis for important modules
We conducted gene ontology (GO) enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis [33–35], and gene set enrichment analysis of the selected modules using ShinyGO (version 0.76.1; http://bioinformatics.sdstate.edu). A false discovery rate (FDR) threshold of 0.05 was used to identify enriched ontology terms and pathways.
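To make the FDR cut-off concrete, the sketch below applies a Benjamini-Hochberg adjustment to a list of raw enrichment P values. It assumes the FDR is a Benjamini-Hochberg adjusted P value, which the text does not state explicitly; the term names and P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices from smallest to largest P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity of adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw enrichment P values for five placeholder terms.
terms = ["term_1", "term_2", "term_3", "term_4", "term_5"]
raw_p = [0.001, 0.008, 0.039, 0.041, 0.20]

fdr = benjamini_hochberg(raw_p)
enriched = [t for t, q in zip(terms, fdr) if q < 0.05]
print(enriched)
```

In this toy case two terms with raw P below 0.05 no longer pass after adjustment, which is the practical effect of filtering on FDR rather than on raw P values.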
Identification of hub genes
We identified hub genes by intersecting the genes from the turquoise module of the weighted gene co-expression network with the differentially expressed genes. To refine the selection, we performed a detailed literature review using PubMed, assessing each gene for its novelty and relevance to the study’s conceptual framework. We investigated the relationships among the hub genes in the turquoise module using PPI networks. A PPI network centered on CKAP2 was constructed by incorporating interacting proteins, identified through the STRING database (https://string-db.org/). Additionally, we utilized the miRNet platform (https://www.mirnet.ca/), which integrates data from 14 miRNA databases, to identify candidate microRNAs (miRNAs) that interact with CKAP2, providing further insights into its regulatory network.
Validation of genes in TCGA cohort and ROC analysis
We downloaded the COAD-TCGA cohort (colon adenocarcinoma) using the TCGAbiolinks package in R. Differential expression analysis (DEA) was performed with the DESeq2 package, and the expression levels of selected genes were validated within this cohort. Gene expression was visualized using the EnhancedVolcano and ggplot2 packages. To assess the diagnostic performance of the selected genes in distinguishing between tumor and normal tissues, we calculated the area under the curve (AUC) and plotted receiver operating characteristic (ROC) curves using the pROC package. Carcinoembryonic antigen (CEA) was used as a control, and its AUC was compared with those of the selected genes. Furthermore, we evaluated the AUC of the selected genes across clinicopathological features identified in the previous section.
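The AUC used to compare markers can be read as the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample. The sketch below computes it by direct pairwise comparison; it is an illustration in Python, not the pROC implementation, and the expression values are invented.

```python
def roc_auc(tumor_scores, normal_scores):
    """AUC as P(tumor score > normal score), counting ties as one half."""
    wins = 0.0
    for t in tumor_scores:
        for n in normal_scores:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_scores) * len(normal_scores))


# Hypothetical expression values for one marker (not data from the study).
tumor = [5.1, 6.3, 4.8, 7.0]
normal = [3.2, 4.9, 2.7]

auc = roc_auc(tumor, normal)
print(round(auc, 4))
```

Here 11 of the 12 tumor-normal pairs are ordered correctly, so the AUC is 11/12. Swapping the two groups gives the complementary value, 1 minus the AUC.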
Survival analysis and drug response
We used the newly developed GEPIA3 platform (https://gepia3.bioinfoliu.com/) to perform survival analysis and to assess the expression levels of the selected genes across different response groups (complete response, partial response, stable disease, and progressive disease) to oxaliplatin, leucovorin, and 5-fluorouracil in colorectal cancer. The built-in visualization tools of GEPIA3 were employed to generate Kaplan–Meier survival plots and drug response profiles.
Sample collection
We obtained 61 paired specimens of fresh tumor tissue and adjacent normal mucosa from colorectal cancer patients who had not received chemo-radiation before surgery. All cases were confirmed by histopathology and staged according to the tumor–node–metastasis (TNM) system per American Joint Committee on Cancer (AJCC) criteria [36]. Collected tissues were immediately immersed in RNAlater (Qiagen, Germany) to protect RNA integrity and kept at −20 °C until extraction. Written informed consent was secured from every participant prior to sampling. The study protocol adhered to ethical guidelines and was approved by the Ethics Committee of Mashhad University of Medical Sciences (MUMS), Mashhad, Iran (ethical code: IR.MUMS.MEDICAL.REC.1402.103). The clinicopathological characteristics of the participants are summarized in Table 1.
Real-Time RT-PCR and statistical analysis
Total RNA was isolated from the tissue specimens using the Total RNA Extraction Kit (Parstous, Iran). Expression analysis and RNA quantification were carried out by reverse transcription quantitative PCR (RT-qPCR) with SYBR Green chemistry (Amplicon, Denmark) on a LightCycler instrument (Roche, Germany). Transcript levels of LINC00294, LINC02577, CKAP2, miR-548k and miR-941 were normalized to glyceraldehyde-3-phosphate dehydrogenase (GAPDH) as the endogenous control. Primers were designed in AlleleID v6.0 and their specificity was verified via NCBI BLAST. Sequences are listed in Table 2. Relative expression was determined using the −ΔΔCT method. For classification, expression increases above one-fold were considered upregulation, decreases below one-fold were considered downregulation, and changes within ±1-fold were treated as normal. Statistical analyses were performed with SPSS v27.0.1 (SPSS, Chicago, IL) and the HAVij R package (https://github.com/amirmaharati/HAVij). Associations between gene expression and clinicopathological parameters were evaluated using Chi-square (χ²) or Fisher’s exact test, independent-samples t-test, Mann–Whitney U, Kruskal–Wallis, and ANOVA as appropriate; correlations were assessed with Pearson’s and Spearman’s rank tests. A two-sided P ≤ 0.05 was considered statistically significant.
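The −ΔΔCT calculation and the three-way classification can be sketched as follows. This is an illustrative Python example, not the authors' SPSS or HAVij workflow. It assumes that the ±1 cut-off is applied directly to the −ΔΔCT value (a log2-scale fold change), which is one reading of the rule stated above; the CT values are invented.

```python
def neg_delta_delta_ct(ct_target_tumor, ct_ref_tumor, ct_target_normal, ct_ref_normal):
    """Return -ddCT for one paired sample, normalizing each tissue to the reference gene."""
    delta_tumor = ct_target_tumor - ct_ref_tumor     # dCT in tumor tissue
    delta_normal = ct_target_normal - ct_ref_normal  # dCT in matched normal tissue
    return -(delta_tumor - delta_normal)


def classify(value, cutoff=1.0):
    """Three-way call, assuming the cut-off applies to the -ddCT scale."""
    if value > cutoff:
        return "upregulated"
    if value < -cutoff:
        return "downregulated"
    return "normal"


# Hypothetical CT values for one patient (target gene vs. GAPDH reference).
score = neg_delta_delta_ct(
    ct_target_tumor=24.0, ct_ref_tumor=18.0,
    ct_target_normal=27.5, ct_ref_normal=18.5,
)
print(score, classify(score))
```

A lower CT means more template, so a target that reaches threshold three cycles earlier in tumor than in normal tissue (after reference normalization) yields a −ΔΔCT of 3.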
Results
Raw data preprocessing and DEGs/DELs identification
To minimize technical variability, expression values were subjected to quantile normalization. Differential analysis of the GSE180440 dataset yielded 672 DEGs (adjusted P < 0.05 and |log2 fold change| > 1); the set of differentially expressed lncRNAs and mRNAs was visualized with a volcano plot to confirm significance and effect size (Fig. 1A). The log2 fold-change values for all features were then presented in an MA-plot (Fig. 1B), which shows fold change against the average normalized expression across samples for both coding and noncoding markers. Finally, principal component analysis (PCA) was used to assess replicate reproducibility and detect outliers, and it clearly separated normal from tumor samples within the GSE180440 cohort (Fig. 1C).
WGCNA analysis and module identification
Based on the differential expression analysis, 23,767 genes were included in WGCNA. Sample clustering revealed two outliers, which were removed, leaving 188 samples for analysis. WGCNA identified 11 distinct gene co-expression modules, shown as dendrogram branches of different colors (Fig. 2A). A module eigengene was computed for each module to assess the correlation between modules and clinical traits. The turquoise and blue modules were associated with tumor status, higher stage, and MSS state, and were chosen for further analysis (Fig. 2B; r = 0.98, P = 9e-07). The association between module membership (MM) and gene significance (GS) in the selected modules, which were the modules most strongly related to clinical traits, is shown in Fig. 2C.
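The module-trait association described above is, at its core, a correlation between a module eigengene and a trait coded numerically. The sketch below computes a Pearson correlation in plain Python. It is not the WGCNA implementation: the eigengene values are hypothetical and are supplied directly rather than derived from expression data, and the trait coding (1 = tumor, 0 = normal) is an assumption for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical eigengene values for six samples and their tumor status.
eigengene = [0.20, 0.30, 0.25, -0.20, -0.30, -0.25]
is_tumor = [1, 1, 1, 0, 0, 0]

r_module_trait = pearson(eigengene, is_tumor)
```

A value near +1 would mean the module's summary expression rises with the trait, and a value near −1 would mean it falls; in a full analysis the eigengene is typically taken as the first principal component of the module's expression matrix.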
Functional enrichment analysis of modules
We performed GO, KEGG and GSEA analyses to investigate the biological functions of the DEmRNAs in the selected modules and thereby predict the function of the associated DElncRNAs in the turquoise module; the results are shown in Fig. 3. All GO and KEGG enrichment results were ordered by enrichment score (−log P value). In the turquoise module, chromosome segregation, ribosome biogenesis, ribonucleoprotein complex biogenesis, and chromosome organization were the most enriched biological process (BP) terms (Fig. 3A). In the cellular component (CC) analysis, condensed chromosome, chromosomal region, and centromeric region were the most enriched terms (Fig. 3B). In molecular function (MF), catalytic activity acting on DNA, DNA helicase activity, and single-stranded DNA binding were the most enriched terms (Fig. 3C). Further, DNA replication, ribosome biogenesis in eukaryotes, and cell cycle were the top three pathways associated with the turquoise module in the KEGG pathway analysis (Fig. 3D). Our functional analysis indicated that the turquoise DEmRNAs, and possibly the DElncRNAs, are implicated in nuclear processes.
Gene selection and network analysis
We identified 289 hub genes by intersecting the genes from the turquoise module of the WGCNA with the DEGs. These hub genes consisted of 156 lncRNAs and 133 mRNAs. A search of public databases, including PubMed and Scopus, led to the selection of LINC00294, LINC02577, and CKAP2 (Fig. 4A). Functional enrichment analysis had previously shown the turquoise module to be associated with nuclear processes, leading us to hypothesize that the selected lncRNAs may be involved in these processes by interacting with coding genes in the module. We constructed a PPI network based on CKAP2 and other coding genes co-expressed with CKAP2 in the turquoise module (Fig. 4B). To explore regulatory interactions, we focused on miRNAs targeting the coding gene. Using the miRNet database, we built a miRNA-gene interaction network and identified miR-548k and miR-941 as key candidates based on novelty and supporting literature (Fig. 4C).
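The hub-gene step above is a set intersection followed by a split by biotype. The sketch below illustrates it on a toy example. Only LINC00294, LINC02577 and CKAP2 are taken from the text; the other identifiers, the biotype lookup, and the variable names are hypothetical.

```python
# Toy gene sets: three genes named in the text plus hypothetical placeholders.
turquoise_module = {"CKAP2", "LINC02577", "LINC00294", "gene_x", "gene_y"}
deg_set = {"CKAP2", "LINC02577", "LINC00294", "gene_y", "gene_z"}

# Hypothetical biotype annotation for each identifier.
biotype = {
    "CKAP2": "mRNA",
    "LINC02577": "lncRNA",
    "LINC00294": "lncRNA",
    "gene_x": "mRNA",
    "gene_y": "mRNA",
    "gene_z": "lncRNA",
}

# Hub candidates are genes that are both in the module and differentially expressed.
hub_genes = sorted(turquoise_module & deg_set)

# Split the hub candidates into lncRNAs and mRNAs.
hub_lncrnas = [g for g in hub_genes if biotype[g] == "lncRNA"]
hub_mrnas = [g for g in hub_genes if biotype[g] == "mRNA"]
```

The intersection keeps only genes supported by both lines of evidence, co-expression and differential expression, which is why the resulting list is much shorter than either input.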
RT-qPCR of coding and non-coding genes
We enrolled 61 CRC cases (30 males and 31 females) aged 29–79 years (mean ± SD: 56 ± 13 years), of whom 20 were under 50 and 41 were over 50 years old. Tumor size ranged from 1 to 12 cm (mean ± SD: 4.9 ± 2.2 cm). Male patients were slightly older than female patients (57 ± 12 vs. 54 ± 13 years) and had slightly larger tumors (5.1 vs. 4.7 cm). The majority of tumors were grade II (31/61, 50.8%), had T3/4 depth of invasion (46/61, 75.4%), and had N0/N1 lymph node involvement (56/61, 91.8%). All clinicopathological features are summarized in Table 1. CKAP2, LINC00294, LINC02577, miR-941, and miR-548k exhibited varying patterns of dysregulation among patients. CKAP2 was underexpressed in 16.3% and overexpressed in 31.1% of cases, with fold changes ranging from −1.86 to −1.01 (mean −1.42 ± 0.27) and from 1.07 to 4.34 (mean 2.18 ± 1.02), respectively, indicating a predominance of upregulation. Similarly, LINC02577 was underexpressed in 29.5% and overexpressed in 44.3% of patients, with fold changes of −1.33 to −10.59 (mean −4.27 ± 2.38) and 1.2 to 9.02 (mean 3.74 ± 2.76), again reflecting a trend toward upregulation. LINC00294 was downregulated in 27.9% and upregulated in 26.2% of patients, with fold changes between −1.62 and −6.03 (mean −3.73 ± 1.42) and between 1.2 and 9.02 (mean 3.46 ± 2.69). For miR-941, underexpression was observed in 44.3% and overexpression in 41% of cases, with fold changes from −1.11 to −16.01 (mean −5.02 ± 2.38) and from 1.22 to 14.06 (mean 5.2 ± 2.76). Lastly, miR-548k was downregulated in 37.7% and upregulated in 36.1% of patients, with fold changes ranging from −1.1 to −21.25 (mean −6.33 ± 6.42) and from 1.59 to 25.73 (mean 7.42 ± 7.53). Overall, CKAP2 and LINC02577 showed a stronger trend toward upregulation than the other genes (Fig. 5A–E).
Association between genes and clinicopathological features
We assessed associations between the expression levels of the selected genes and the clinicopathological features of the CRC patients to clarify the role of these genes in CRC biology. CKAP2, a cytoskeleton-associated protein involved in mitotic regulation and oncogenesis, was significantly upregulated in tumors larger than 5 cm (P = 0.025, AUC = 0.667). Moreover, it was downregulated in T4 compared with T1/2 tumors in female patients (P = 0.005, AUC = 0.636). These results suggest that CKAP2 may play a significant role in tumor growth and could serve as a biomarker for larger tumors early in the disease. LINC02577 was downregulated in advanced-stage tumors of female patients (P = 0.028, AUC = 0.264). It was also downregulated in T4 tumors of female patients (P = 0.035, AUC = 0.664), whereas it was overexpressed in node-positive female cases (P = 0.028, AUC = 0.736). These results suggest a tumor-suppressive role and potential as a biomarker for early-stage tumors.
LINC00294 showed significant co-expression with LINC02577 (P = 0.03) and a positive correlation with miR-941 overexpression (P = 0.022), suggesting a coordinated regulatory network in CRC. miR-941 was downregulated in larger tumors (P = 0.018, AUC = 0.342) but upregulated in females with deeper invasion (P = 0.043, AUC = 0.647). This pattern suggests that miR-941 may suppress bulk growth while promoting local invasion. Additionally, miR-941 downregulation was observed in high-grade tumors with deeper invasion (P = 0.028, AUC = 0.411). miR-548k exhibited oncogenic behavior: its overexpression correlated with increased invasion in early-stage tumors (P = 0.044, AUC = 0.667), larger tumor size in advanced stages (P = 0.040, AUC = 0.51), and lymph node involvement in T1/T2 tumors (P = 0.033, AUC = 0.547). A positive correlation was observed between miR-548k and miR-941 in low-grade tumors (P = 0.021), consistent with a cooperative role in early disease progression. miR-548k may therefore help predict tumor invasion and node involvement at early stages.
All associations between gene expression levels and clinicopathological data are provided in Tables 3 and 4. Future functional studies should validate these associations and explore the mechanisms of the observed expression patterns.
TCGA cohort validation and AUC calculation
We downloaded the TCGA-COAD dataset, which included 481 tumor tissues and 41 normal tissues. Differential expression analysis was performed, yielding log2 fold-change values for 60,660 genes. The global distribution of differentially expressed genes was visualized as a volcano plot (Fig. 6A). From these results, we examined the expression patterns of our genes of interest, including both protein-coding genes and long non-coding RNAs, and displayed their expression levels as box plots. CKAP2 was significantly upregulated in tumor tissues (log2FC = 1.501, P < 0.001). Similarly, LINC02577 showed strong upregulation in tumor tissues compared with normal tissues (log2FC = 6.676, P < 0.001), while LINC00294 was significantly downregulated (log2FC = −1.104, P < 0.001) (Fig. 6B). We next calculated AUC values and plotted ROC curves for our genes of interest, comparing their performance in discriminating between tumor and normal tissues with that of carcinoembryonic antigen (CEA), a well-established biomarker for CRC (Fig. 7). LINC02577 exhibited a markedly higher AUC than CEA (0.982 vs. 0.663). CKAP2 also demonstrated better diagnostic performance than CEA (AUC = 0.797 vs. 0.663). In contrast, LINC00294 showed a substantially lower AUC than CEA (AUC = 0.055). We also plotted ROC curves to evaluate the diagnostic and predictive power of gene expression for adverse clinicopathological features (Fig. 7).
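The AUC values above can be read as the probability that a randomly chosen tumor sample has a higher expression value than a randomly chosen normal sample. The sketch below computes the AUC from that pairwise definition in plain Python. It is an illustration with hypothetical expression values and a hypothetical function name, not the pROC workflow used in the study.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair, as in the Mann-Whitney U statistic.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical expression values for one marker.
tumor = [5.1, 6.3, 7.0, 4.8]
normal = [2.0, 3.5, 5.0]

auc_tumor_high = auc(tumor, normal)  # marker assumed higher in tumor
auc_tumor_low = auc(normal, tumor)   # same data, opposite direction
```

Because the two directions sum to 1, an AUC far below 0.5 (such as the 0.055 reported for LINC00294) indicates that the marker is consistently lower in tumor tissue. Scored in the opposite direction, the same data would give 1 − 0.055 = 0.945, which matches the downregulation reported for this gene in TCGA-COAD.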
Survival analysis and drug response evaluation
We assessed whether the expression levels of our genes of interest could predict responses to commonly used adjuvant drugs in CRC. The analysis revealed that LINC02577 was significantly associated with progressive disease following treatment with leucovorin as part of the FOLFOX regimen, suggesting its potential as a predictive biomarker prior to chemotherapy (Fig. 8A). Specifically, LINC02577 expression was significantly upregulated (p = 0.009) in patients who experienced disease progression compared with those who achieved complete or partial response. In addition, survival analysis was performed; however, none of the selected genes demonstrated significant prognostic value (Fig. 8B).
Discussion
CRC is a highly prevalent malignancy, with its incidence and mortality rates increasing significantly in recent years [37]. Consequently, the identification of reliable prognostic and diagnostic biomarkers is essential for early detection, risk stratification, and the development of personalized therapeutic approaches. This study focuses on elucidating the interactions between coding and non-coding regions of the genome, which may provide critical insights into identifying clinically relevant biomarkers and advancing targeted therapeutic strategies for CRC [38,39]. Therefore, we aimed to investigate this highly lethal cancer and explore the genes involved in its pathogenesis.
The miRNA–lncRNA–gene (miR–lnc–gene) axis plays a crucial role in cancer by regulating gene expression at multiple levels, influencing tumor initiation, progression, and therapy resistance [40–42]. In CRC, this axis contributes to key oncogenic processes such as proliferation, invasion, metastasis, and chemoresistance by modulating crucial signaling pathways. For instance, the lncRNA ZFAS1 has been shown to act as a ceRNA by sponging miR-200b, leading to the upregulation of key oncogenes such as ZEB1. Similarly, the interaction between lncRNA H19 and miR-29b-3p promoted CRC progression by enhancing the epithelial-mesenchymal transition (EMT) process, which in turn induced Wnt/β-catenin signaling in CRC cells [43]. Additionally, LINC00543 regulated miR-506-3p to control the expression of FOXQ1 and enhanced EMT and CRC metastasis [44]. Given the vast number of potential regulatory interactions, bioinformatics plays a pivotal role in identifying functionally relevant miR–lncRNA–gene axes in cancer. In this study, we applied bioinformatics approaches and network biology principles to systematically prioritize genes implicated in CRC. Although experimental validation methods such as RNA pull-down or dual-luciferase assays were not performed to confirm the predicted interactions, we identified both coding and non-coding genes, highlighting novel candidates potentially involved in CRC progression through ceRNA regulatory mechanisms.
The CKAP2 gene is located on the long arm of chromosome 13 [45]. It encodes a cytoskeleton-associated protein and potent microtubule growth factor that promotes microtubule assembly and stability, which are essential for cell division [46,47]. CKAP2 is one of the most important proteins identified for promoting microtubule assembly [46]. Aberrant expression and function of CKAP2 have been implicated in various cancers. Studies have shown that CKAP2 is often overexpressed and can be used as a prognostic marker in a variety of cancers, including breast cancer [48,49], lung cancer [50], and glioma [51]. The long non-coding RNA DLEU1, by acting as a coactivator for HIF-1α, upregulated CKAP2 expression and promoted the growth of breast cancer cells [52]. Similarly, lncRNA DARS-AS1 enhanced CKAP2 expression through competitive binding to miR-3200-5p, thereby enhancing the growth and metastasis of hepatocellular carcinoma [53]. These examples show how non-coding RNAs can regulate the expression of protein-coding genes such as CKAP2. Furthermore, CKAP2 silencing reduced cell proliferation, migration, and invasion in CRC cells. CKAP2 promoted tumor progression by enhancing these malignant features and modulating the tumor microenvironment through M2 macrophage polarization and angiogenesis [54]. Our study further supported the oncogenic role of CKAP2, as significant upregulation was observed in larger tumors (> 5 cm, P = 0.025), indicating a potential role in tumor growth. This aligns with previous reports linking CKAP2 overexpression to enhanced proliferation and malignancy. Additionally, it was downregulated in T4 compared with T1/2 tumors in female patients (P = 0.005). CKAP2 was significantly upregulated in the TCGA cohort and demonstrated a higher AUC than CEA in distinguishing tumor from normal tissues. Therefore, we propose that CKAP2 may serve as an oncogenic biomarker in the early stages of tumor development, with potential utility in predicting tumor growth.
Using WGCNA, we also identified LINC00294 and LINC02577 within the same module as CKAP2 and hypothesize that these lncRNAs may regulate its expression. Given CKAP2’s role in mitotic regulation based on our functional enrichment analysis on its module, further studies are warranted to investigate its involvement in cell cycle pathways and to validate its functional interactions with co-expressed non-coding RNAs.
LINC02577 and LINC00294 are lncRNAs implicated in cancer progression and genomic regulation. LINC02577 has been identified as a prognostic marker in pancreatic adenocarcinoma, where it was associated with genomic instability and immune microenvironment regulation [55]. Similarly, LINC00294 has been linked to multiple cancers. In hepatocellular carcinoma, it promoted tumor progression through METTL3/YTHDC1-mediated mechanisms, while in CRC, its overexpression induced apoptosis and cell cycle arrest via the miR-499a-5p/LARP4B axis [56,57]. Additionally, the LINC00294/miR-620/MKRN2 axis has been reported to negatively regulate malignant progression in CRC [58]. In glioma, LINC00294 is downregulated and exerts its tumor-suppressive function by upregulating CASKIN1 expression through sponging miR-21-5p and activating the cAMP signaling pathway, thereby inducing apoptosis and impairing mitochondrial function [59]. In cervical cancer cells, LINC00294 promoted tumor progression by regulating the cell cycle and the Hedgehog pathway [60]. Furthermore, LINC00294 exhibited tumor-suppressive effects in glioma by sponging miR-1278 and promoting neurofilament medium (NEFM) expression [61]. LINC00294 has also been proposed as a prognostic biomarker in diffuse large B cell lymphoma [62]. Our study revealed that LINC02577 was predominantly upregulated (44.3% of cases), while LINC00294 was downregulated in 27.9% of cases, suggesting a potential role in CRC. Functional enrichment analysis of their co-expression module identified significant associations with chromosome segregation, ribosome biogenesis, and DNA replication, reinforcing their relevance in nuclear processes. Correlation analysis between lncRNA expression and clinicopathological features further supported the suppressive role of LINC02577 and its potential as an early-stage tumor marker.
We observed significant upregulation of LINC02577 in TCGA colorectal cancer patients, most of whom were at early stages, further emphasizing its value as an early diagnostic marker. Drug response analysis revealed that patients who did not respond to leucovorin as part of the FOLFOX regimen had higher expression levels of LINC02577. Moreover, LINC02577 achieved the highest AUC among our selected genes and outperformed CEA, reinforcing its diagnostic and predictive potential as a biomarker in CRC. In contrast, no significant correlation was observed between LINC00294 expression and clinicopathological features.
Using the miRNet database, we selected two microRNAs, miR-941 and miR-548k, that interact with LINC00294, LINC02577 and CKAP2. miR-941 enhanced the resistance of breast cancer cells to 5-fluorouracil, while its inhibition decreased cell proliferation and histone phosphorylation [63]. miR-941 was significantly upregulated in laryngeal squamous cell carcinoma (LSCC) tissues, cells and serum exosomes, and its upregulation increased the growth and invasion of LSCC cells [64]. Additionally, bioinformatic analyses have suggested that miR-941 has a crucial role in gastric cancer (GC), colon cancer and prostate cancer [65–67]. In our cohort, its downregulation in larger tumors (P = 0.018) suggests tumor-suppressive effects on bulk growth, while its upregulation in female patients with deeper tumor invasion (P = 0.043) is consistent with its previously reported pro-invasive roles. There is extensive evidence that microRNAs can exert dual roles, functioning as either oncogenes or tumor suppressors, not only across different cancer types but even within the same cancer type [68]. The seemingly contradictory findings regarding miR-941 in our study may be attributed to this context-dependent dual function or to the relatively small sample size analyzed.
miR-548k has been shown to suppress PTEN expression, leading to enhanced cell proliferation and reduced apoptosis through activation of the PI3K/Akt pathway in breast cancer [69]. Additionally, histone acetylation, an epigenetic modification regulated by histone deacetylases (HDACs), plays a critical role in gene expression. Recent studies have highlighted the essential functions of HDACs in tumorigenesis. Specifically, miR-548k was found to be sponged by lncRNA-LET, which regulates HDAC3, thereby mediating the initiation and progression of GC [70]. Nuclear factor 90 (NF90), an RNA-binding protein, has been implicated in the regulation of miRNA biogenesis [71]. miR-548k exhibited oncogenic properties by upregulating NF90 and suppressing lncRNA-LET, with its expression levels significantly correlated with patient prognosis in esophageal squamous cell carcinoma (ESCC) [72,73]. Furthermore, miR-548k overexpression in early-stage ESCC patients promoted lymphangiogenesis and tumor metastasis [74]. miR-548k has also been identified as a ubiquitination-related microRNA in retinoblastoma [75]. In our study, miR-548k was associated with increased invasion in early-stage tumors (P = 0.044), larger tumor size in advanced stages (P = 0.040), and lymph node involvement in T1/T2 tumors (P = 0.033). These associations extend its established oncogenic roles in ESCC metastasis and GC progression to CRC. Therefore, miR-548k could serve as a potential biomarker for early-stage CRC and for early diagnosis.
Our study revealed distinct dysregulation patterns and significant clinicopathological associations for CKAP2, LINC02577, LINC00294, miR-941, and miR-548k in CRC patients. Notably, CKAP2 was significantly upregulated in larger and T1/T2 tumors, indicating its potential as a novel biomarker for early diagnosis, and it showed moderate diagnostic power (AUC = 0.797). LINC02577 showed consistent downregulation in advanced-stage and deeply invasive tumors and was co-expressed with LINC00294. This lncRNA may have tumor-suppressive and therapeutic roles in CRC. Our results indicate that LINC02577 has great potential as an accurate diagnostic marker (AUC = 0.982) and is associated with failure of leucovorin-containing chemotherapy (P = 0.009). LINC00294 was co-expressed with LINC02577 and correlated with miR-941 overexpression. miR-941 appeared to inhibit bulk growth and promote local invasion, with its downregulation observed in high-grade tumors with deeper invasion. miR-548k showed oncogenic behavior, with overexpression correlating with increased invasion in early-stage tumors, larger tumor size in advanced stages, and lymph node involvement in T1/T2 tumors. These findings highlight the potential of CKAP2 and miR-548k as oncogenic drivers, while LINC02577 appears to function as a tumor suppressor. Additionally, CKAP2, miR-548k, miR-941, and LINC02577 could serve as biomarkers for the early diagnosis of CRC because of their expression patterns in early-stage disease. These markers warrant further validation for therapeutic targeting and prognostic stratification in CRC.
One limitation of our study is the relatively small sample size, which may affect the generalizability of our findings. Future studies should assess the expression of the identified biomarkers in a larger cohort and in serum or plasma samples to confirm their clinical relevance. Additionally, protein-level validation using techniques such as immunohistochemistry and western blotting is necessary to support our transcriptomic findings. While our results suggest a novel lncRNA–miRNA–protein regulatory axis in colorectal cancer, definitive proof will require functional studies. Future studies should perform luciferase reporter assays to validate direct targeting, RNA immunoprecipitation (RIP) to detect RNP complexes in vivo, and RNA pull-down experiments to identify interacting proteins. Together these experiments will confirm the interactions and illuminate the molecular mechanisms driving the observed effects.
CRC is a highly prevalent malignancy, with its incidence and mortality rates increasing significantly in recent years [37]. Consequently, the identification of reliable prognostic and diagnostic biomarkers is essential for early detection, risk stratification, and the development of personalized therapeutic approaches. This study focuses on elucidating the interactions between coding and non-coding regions of the genome, which may provide critical insights into identifying clinically relevant biomarkers and advancing targeted therapeutic strategies for CRC [38,39]. Therefore, we aimed to investigate this highly lethal cancer and explore the genes involved in its pathogenesis.
The miRNA-lncRNA-gene (miR-lnc-gene) axis plays a crucial role in cancer by regulating gene expression at multiple levels, influencing tumor initiation, progression, and therapy resistance [40–42]. In CRC, this axis contributes to key oncogenic processes such as proliferation, invasion, metastasis, and chemoresistance by modulating crucial signaling pathways. For instance, the lncRNA ZFAS1 has been shown to act as a competing endogenous RNA (ceRNA) by sponging miR-200b, leading to the upregulation of key oncogenes such as ZEB1. Similarly, the interaction between lncRNA H19 and miR-29b-3p promoted CRC progression by enhancing the epithelial-mesenchymal transition (EMT) process, which in turn induced Wnt/β-catenin signaling in CRC cells [43]. Additionally, LINC00543 regulated miR-506-3p to control the expression of FOXQ1, enhancing EMT and CRC metastasis [44]. Given the vast number of potential regulatory interactions, bioinformatics plays a pivotal role in identifying functionally relevant miR–lncRNA–gene axes in cancer. In this study, we applied bioinformatics approaches and network biology principles to systematically prioritize genes implicated in CRC. Although experimental validation methods such as RNA pull-down or dual-luciferase assays were not performed to confirm the predicted interactions, we identified both coding and non-coding genes, highlighting novel candidates potentially involved in CRC progression through ceRNA regulatory mechanisms.
The CKAP2 gene is located on the long arm of chromosome 13 [45]. It encodes a cytoskeleton-associated protein and potent microtubule growth factor that promotes microtubule assembly and stability, which are essential for cell division [46,47]. CKAP2 is one of the most important proteins identified for promoting microtubule assembly [46]. The aberrant expression and function of CKAP2 have been implicated in various cancers. Studies have shown that CKAP2 is often overexpressed and can be used as a prognostic marker in a variety of cancers, including breast cancer [48,49], lung cancer [50], and glioma [51]. The long non-coding RNA DLEU1, by acting as a coactivator for HIF-1α, up-regulated CKAP2 expression and promoted the growth of breast cancer cells [52]. Similarly, lncRNA DARS-AS1 enhanced CKAP2 expression through competitive binding to miR-3200-5p, thereby enhancing the growth and metastasis of hepatocellular carcinoma [53]. These examples show how non-coding RNAs can regulate the expression of protein-coding genes such as CKAP2. Furthermore, CKAP2 silencing reduced cell proliferation, migration, and invasion in CRC cells, and CKAP2 promoted tumor progression by enhancing these malignant features and modulating the tumor microenvironment through M2 macrophage polarization and angiogenesis [54]. Our study further supported the oncogenic role of CKAP2, as it was significantly upregulated in larger tumors (> 5 cm, P = 0.025), indicating a potential role in tumor growth. This aligns with previous reports linking CKAP2 overexpression to enhanced proliferation and malignancy. Additionally, CKAP2 was downregulated in T4 compared with T1/T2 tumors in female patients (P = 0.005). CKAP2 was significantly upregulated in the TCGA cohort and demonstrated a higher AUC than CEA in distinguishing tumor from normal tissues. Therefore, we propose that CKAP2 may serve as an oncogenic biomarker in the early stages of tumor development, with potential utility in predicting tumor growth.
Using WGCNA, we also identified LINC00294 and LINC02577 within the same module as CKAP2 and hypothesize that these lncRNAs may regulate its expression. Given CKAP2’s role in mitotic regulation based on our functional enrichment analysis on its module, further studies are warranted to investigate its involvement in cell cycle pathways and to validate its functional interactions with co-expressed non-coding RNAs.
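As a rough illustration of what sharing a WGCNA module implies, the sketch below computes a Pearson correlation between two expression profiles and raises its absolute value to a soft-threshold power, the form of adjacency used in an unsigned WGCNA network. The expression values, the function names and the power of 6 are illustrative assumptions, not the data or parameters of this study.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def soft_adjacency(x: Sequence[float], y: Sequence[float], beta: float = 6.0) -> float:
    """Unsigned soft-threshold adjacency: |cor(x, y)| ** beta (beta is assumed)."""
    return abs(pearson(x, y)) ** beta


# Hypothetical expression profiles across three samples (invented values).
gene_a = [1.0, 2.0, 3.0]
gene_b = [1.0, 3.0, 2.0]  # correlation with gene_a is 0.5
gene_c = [2.0, 4.0, 6.0]  # perfectly correlated with gene_a

print(soft_adjacency(gene_a, gene_b))
print(soft_adjacency(gene_a, gene_c))
```

Raising the correlation to a power suppresses moderate correlations (0.5 becomes roughly 0.016 at a power of 6) while leaving strong ones near 1. Co-membership in a module therefore reflects strong co-expression, which is compatible with, but does not demonstrate, a regulatory relationship.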
LINC02577 and LINC00294 are lncRNAs implicated in cancer progression and genomic regulation. LINC02577 has been identified as a prognostic marker in pancreatic adenocarcinoma, where it was associated with genomic instability and immune microenvironment regulation [55]. Similarly, LINC00294 has been linked to multiple cancers. In hepatocellular carcinoma, it promoted tumor progression through METTL3/YTHDC1-mediated mechanisms, while in CRC, its overexpression induced apoptosis and cell cycle arrest via the miR-499a-5p/LARP4B axis [56,57]. Additionally, the LINC00294/miR-620/MKRN2 axis has been reported to negatively regulate malignant progression in CRC [58]. In glioma, LINC00294 is downregulated and exerts its tumor-suppressive function by upregulating CASKIN1 expression through sponging miR-21-5p and activating the cAMP signaling pathway, thereby inducing apoptosis and impairing mitochondrial function [59]. In cervical cancer cells, LINC00294 promoted tumor progression by regulating the cell cycle and the Hedgehog pathway [60]. Furthermore, LINC00294 exhibited tumor-suppressive effects in glioma by sponging miR-1278 and promoting neurofilament medium (NEFM) expression [61]. LINC00294 has also been proposed as a prognostic biomarker in diffuse large B cell lymphoma [62]. Our study revealed that LINC02577 was predominantly upregulated (44.3% of cases), while LINC00294 was downregulated in 27.9% of cases, suggesting a potential role for both lncRNAs in CRC. Functional enrichment analysis of their co-expression module identified significant associations with chromosome segregation, ribosome biogenesis, and DNA replication, reinforcing their relevance in nuclear processes. Correlation analysis between lncRNA expression and clinicopathological features further supported the suppressive role of LINC02577 and its potential as an early-stage tumor marker.
We observed significant upregulation of LINC02577 in TCGA colorectal cancer patients, most of whom were at early stages, further emphasizing its value as an early diagnostic marker. Drug response analysis revealed that patients who did not respond to leucovorin as part of the FOLFOX regimen had higher expression levels of LINC02577. Moreover, LINC02577 achieved the highest AUC among our selected genes and outperformed CEA, reinforcing its diagnostic and predictive potential as a biomarker in CRC. In contrast, no significant correlation was observed between LINC00294 expression and clinicopathological features.
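To make the AUC comparison concrete, the following minimal sketch computes an AUC as the probability that a randomly chosen tumor sample has a higher marker value than a randomly chosen normal sample (the Mann-Whitney formulation of the ROC AUC). The function name and the expression values are hypothetical and invented for illustration; the study itself reports using pROC, which this sketch does not reproduce.

```python
from typing import Sequence


def auc_rank(tumor: Sequence[float], normal: Sequence[float]) -> float:
    """AUC as P(tumor value > normal value), counting ties as one half."""
    if len(tumor) == 0 or len(normal) == 0:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for t in tumor:
        for n in normal:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor) * len(normal))


# Hypothetical marker values (invented, not data from this study).
tumor_marker = [5.1, 6.3, 4.8, 7.0, 5.9]
normal_marker = [2.0, 3.1, 5.0, 2.7]

print(f"AUC = {auc_rank(tumor_marker, normal_marker):.3f}")
```

Under this definition, an AUC near 1 means the marker is almost always higher in tumor tissue, 0.5 means no discrimination, and a value well below 0.5 means the marker separates the groups in the opposite direction (lower in tumor).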
Using the miRNet database, we selected two microRNAs predicted to interact with LINC00294, LINC02577 and CKAP2: miR-941 and miR-548k. miR-941 enhanced the resistance of breast cancer cells to 5-fluorouracil, while its inhibition decreased cell proliferation and histone phosphorylation [63]. miR-941 was significantly upregulated in laryngeal squamous cell carcinoma (LSCC) tissues, cells and serum exosomes, and its upregulation increased the growth and invasion of LSCC cells [64]. Additionally, bioinformatic analyses have indicated that miR-941 has a crucial role in gastric cancer (GC), colon cancer and prostate cancer [65–67]. In our cohort, its downregulation in larger tumors (P = 0.018) suggests tumor-suppressive effects on bulk growth, while its upregulation in female patients with deeper tumor invasion (P = 0.043) supports its previously reported pro-invasive roles. There is extensive evidence that microRNAs can exert dual roles, functioning as either oncogenes or tumor suppressors, not only across different cancer types but even within the same cancer type [68]. The seemingly contradictory findings regarding the role of miR-941 in our study may be attributed to this context-dependent dual function or to the relatively small sample size analyzed.
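The selection step can be pictured as a set intersection over an interaction table. The sketch below assumes the criterion was a predicted interaction with all three targets, which the text does not state explicitly; the interaction table, the placeholder miRNA names and the function name are hypothetical and do not come from miRNet.

```python
from typing import Dict, List, Set


def shared_mirnas(interactions: Dict[str, Set[str]], targets: List[str]) -> Set[str]:
    """Return the miRNAs listed as interactors of every requested target."""
    missing = [t for t in targets if t not in interactions]
    if missing:
        raise KeyError(f"no interaction data for: {missing}")
    sets = [interactions[t] for t in targets]
    return set.intersection(*sets) if sets else set()


# Hypothetical interaction table with placeholder miRNA names (invented).
interactions = {
    "LINC00294": {"miR-A", "miR-B", "miR-C"},
    "LINC02577": {"miR-A", "miR-B"},
    "CKAP2": {"miR-A", "miR-B", "miR-D"},
}

candidates = shared_mirnas(interactions, ["LINC00294", "LINC02577", "CKAP2"])
print(sorted(candidates))
```

A stricter or looser rule (for example, interaction with CKAP2 plus at least one lncRNA) would only change the set operation, so the criterion is worth stating explicitly when reporting such a selection.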
miR-548k has been shown to suppress PTEN expression, leading to enhanced cell proliferation and reduced apoptosis through activation of the PI3K/Akt pathway in breast cancer [69]. Additionally, histone acetylation, an epigenetic modification regulated by histone deacetylases (HDACs), plays a critical role in gene expression. Recent studies have highlighted the essential functions of HDACs in tumorigenesis. Specifically, miR-548k was found to be sponged by lncRNA-LET, which regulates HDAC3, thereby mediating the initiation and progression of GC [70]. Nuclear factor 90 (NF90), an RNA-binding protein, has been implicated in the regulation of miRNA biogenesis [71]. miR-548k exhibited oncogenic properties by upregulating NF90 and suppressing lncRNA-LET, with its expression levels significantly correlated with patient prognosis in esophageal squamous cell carcinoma (ESCC) [72,73]. Furthermore, miR-548k overexpression in early-stage ESCC patients promoted lymphangiogenesis and tumor metastasis [74]. Additionally, miR-548k has been identified as a ubiquitination-related microRNA in retinoblastoma [75]. In our study, miR-548k was associated with increased invasion in early-stage tumors (P = 0.044), larger tumor size in advanced stages (P = 0.040), and lymph node involvement in T1/T2 tumors (P = 0.033). This pattern extends its established oncogenic roles in ESCC metastasis and GC progression to CRC. Therefore, miR-548k could serve as a potential biomarker for early-stage CRC and for early diagnosis.
Our study revealed distinct dysregulation patterns and significant clinicopathological associations for CKAP2, LINC02577, LINC00294, miR-941, and miR-548k in CRC patients. Notably, CKAP2 was significantly upregulated in larger and T1/T2 tumors, indicating its potential as a novel biomarker for early diagnosis; it also showed moderate diagnostic power (AUC = 0.797). LINC02577 showed consistent downregulation in advanced-stage and deeply invasive tumors and was co-expressed with LINC00294. This lncRNA exhibits potential tumor-suppressive and therapeutic roles in CRC. LINC02577 also showed strong potential as an accurate diagnostic marker (AUC = 0.982) and was associated with failure of leucovorin-containing chemotherapy (P = 0.009). LINC00294 correlated with miR-941 overexpression. miR-941 appeared to inhibit bulk growth and promote local invasion, with its downregulation observed in high-grade tumors with deeper invasion. miR-548k showed oncogenic behavior, with overexpression correlating with increased invasion in early-stage tumors, larger tumor size in advanced stages, and lymph node involvement in T1/T2 tumors. These findings highlight the potential of CKAP2 and miR-548k as oncogenic drivers, while LINC02577 appears to function as a tumor suppressor. Additionally, CKAP2, miR-548k, miR-941, and LINC02577 could serve as biomarkers for the early diagnosis of CRC due to their expression patterns in early-stage disease. These markers warrant further validation for therapeutic targeting and prognostic stratification in CRC.
One limitation of our study is the relatively small sample size, which may affect the generalizability of our findings. Future studies should assess the expression of the identified biomarkers in a larger cohort and in serum or plasma samples to confirm their clinical relevance. Additionally, protein-level validation using techniques such as immunohistochemistry and western blotting is necessary to support our transcriptomic findings. While our results suggest a novel lncRNA–miRNA–protein regulatory axis in colorectal cancer, definitive proof will require functional studies: luciferase reporter assays to validate direct targeting, RNA immunoprecipitation (RIP) to detect ribonucleoprotein (RNP) complexes in vivo, and RNA pull-down experiments to identify interacting proteins. Together, these experiments would test the predicted interactions and clarify the molecular mechanisms driving the observed effects.
Source: PubMed Central (JATS). The license follows the original publisher's policy; please cite the original article when quoting.