The possibility of prognostic and functional values of the 8q24 and 20q13 chromosomal bands in colorectal cancer.
APA: Siadat SA (2026). The possibility of prognostic and functional values of the 8q24 and 20q13 chromosomal bands in colorectal cancer. Molecular biology research communications, 15(1), 3-10. https://doi.org/10.22099/mbrc.2025.54114.2202
MLA: Siadat SA. "The possibility of prognostic and functional values of the 8q24 and 20q13 chromosomal bands in colorectal cancer." Molecular biology research communications, vol. 15, no. 1, 2026, pp. 3-10.
PMID: 41346766
Abstract
Colorectal cancer (CRC) remains a major global health concern, especially given its increasing incidence among younger individuals. While genome-wide association studies (GWAS) have identified numerous CRC-associated polymorphisms, their spatial distribution and functional implications are not fully understood. This study examined the locations of 1,346 CRC-linked polymorphisms across chromosomal bands. The results revealed significant nonrandom clustering across thirteen chromosomal bands: 1q41, 6p21, 8q24, 9q34, 10p14, 10q25, 11q12, 12p13, 15q13, 18q21, 19q13, 20p12, and 20q13. Functional enrichment analysis of genes within these bands revealed several Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Reciprocal chromosomal enrichment confirmed that many of these terms and pathways were not randomly localized within the same bands, highlighting their potential biological significance. Survival analysis using TCGA data identified three KEGG pathways and 33 GO terms mapped to nine of the thirteen bands that were significantly associated with poor prognosis. Notably, the 8q24 and 20q13 regions were enriched for differentially expressed genes and survival-associated terms yet showed no significant enrichment for genes with high somatic mutation rates. These results imply that 8q24 and 20q13 act as regulatory hotspots rather than mutation-driven regions. Overall, this integrative approach identified functionally and clinically relevant genomic regions that may contribute to inherited CRC risk and progression, providing valuable targets for the development of diagnostic and prognostic biomarkers.
INTRODUCTION
Colorectal cancer (CRC) is a common malignancy and the second leading cause of cancer-related deaths worldwide. Despite advances in prevention, screening, and treatment, the incidence of CRC is increasing, particularly among younger individuals. This highlights the need for a better understanding of its genetic landscape [1]. Genome-wide association studies (GWAS) have identified numerous genetic variants associated with CRC risk, shedding light on inherited predispositions [2-4]. However, although these individual risk loci have been extensively cataloged, the functional significance, spatial distribution, and collective impact of these variants across chromosomal regions remain poorly understood. This knowledge gap hinders their translation into clinical applications. Investigating the genomic organization and chromosomal localization of these polymorphisms could reveal nonrandom clustering patterns and highlight regulatory hotspots involved in tumorigenesis.
Previous studies have demonstrated that the nonrandom distribution of cancer-associated polymorphisms may indicate chromosomal regions containing key oncogenes, tumor suppressors, or elements involved in gene regulation [5-7]. Other studies have shown that alterations in specific chromosomal bands can affect gene expression, disrupt cellular pathways, and promote malignant transformation [8, 9]. Despite these observations, a comprehensive analysis of chromosomal bands enriched for CRC-associated polymorphisms and their functional implications is lacking. Therefore, the present study sought to address this gap.
MATERIALS AND METHODS
The current study consisted of five steps (Fig. 1). They are described below.
Step 1: Chromosomal Distribution of CRC-Associated Polymorphisms:
On July 12, 2024, we obtained the polymorphisms from the MONDO_0005575 dataset, consisting of 75 GWAS on colorectal cancer (see Supplementary Data). We subsequently evaluated the nonrandom distribution of these polymorphic sites across human chromosomes [10].
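The text defers to reference [10] for the nonrandomness test and does not state the statistic here. As a rough illustration only, one common null model assumes that each variant falls in a band with probability proportional to the band's length, and asks whether the observed count is unusually high under a binomial distribution. The function names, the band length, the genome length, and the variant counts per band below are all hypothetical; only the total of 1,346 unique variants comes from the study.

```python
from math import exp, lgamma, log


def log_binom_pmf(i: int, n: int, p: float) -> float:
    """Log of P(X = i) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1 - p))


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if k <= 0:
        return 1.0
    return min(1.0, sum(exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1)))


def band_clustering_pvalue(hits_in_band: int, total_hits: int,
                           band_length_bp: int, genome_length_bp: int) -> float:
    """One-sided p-value that a band holds more variants than its length predicts.

    Assumption (not from the paper): variants are uniformly distributed
    along the genome under the null hypothesis.
    """
    expected_share = band_length_bp / genome_length_bp
    return binomial_upper_tail(hits_in_band, total_hits, expected_share)


# Hypothetical band: 10 Mb long in a 3,100 Mb genome, 1,346 variants overall.
p_dense = band_clustering_pvalue(40, 1346, 10_000_000, 3_100_000_000)
p_sparse = band_clustering_pvalue(4, 1346, 10_000_000, 3_100_000_000)
print(p_dense, p_sparse)
```

A uniform-by-length null ignores differences in SNP density and array coverage between bands, so a real analysis would likely need a more careful background.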
Step 2: Enrichment Analyses:
Two enrichment analyses were performed in this step. First, we used the biomaRt package in R to extract all genes located within chromosomal bands harboring nonrandom CRC-associated polymorphic sites from Ensembl. We then performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses on these gene sets using the clusterProfiler package in R. The GO terms were categorized into biological processes (BP), molecular functions (MF), and cellular components (CC). Second, we used the AnnotationDbi and KEGGREST packages to retrieve genes associated with each GO term and KEGG pathway, respectively. Finally, we conducted a chromosomal location enrichment analysis of these genes using the msigdbr package.
Step 3: Survival Analysis:
Survival analysis was conducted using the cSurvival web tool (https://tau.cmmt.ubc.ca/cSurvival) [11], based on the TCGA-COAD and TCGA-READ datasets. Kaplan-Meier estimates and log-rank tests were used to analyze the relationship between the expression levels of the genes in each KEGG pathway and GO term and overall survival.
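For readers unfamiliar with these two methods, the following is a minimal pure-Python sketch of the Kaplan-Meier estimator and a two-group log-rank test. It is not the cSurvival implementation. The grouping into two arms (for example, high versus low gene-set expression) and the follow-up times are hypothetical.

```python
from math import erfc, sqrt


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    events[i] is 1 if the subject died at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk, surv, curve, i = len(data), 1.0, [], 0
    while i < len(data):
        t, deaths, removed = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value), df = 1."""
    pooled = zip(list(times_a) + list(times_b), list(events_a) + list(events_b))
    event_times = sorted({t for t, e in pooled if e})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance if variance > 0 else 0.0
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Hypothetical follow-up times (months); every subject has an observed event.
high = [1, 2, 3, 4, 5]
low = [10, 11, 12, 13, 14]
chi2, p_value = logrank_test(high, [1] * 5, low, [1] * 5)
print(kaplan_meier(high, [1] * 5), chi2, p_value)
```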
Step 4: Differentially Expressed Genes (DEGs) Analysis:
The top 500 differentially expressed genes from TCGA-COAD, including 250 upregulated and 250 downregulated genes, were obtained via UALCAN (https://ualcan.path.uab.edu/index.html) [12]. Enrichment of these DEGs within chromosomal bands was assessed using hypergeometric testing.
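The hypergeometric test used here asks how likely it is to see at least the observed number of DEGs in a band if the 500 DEGs were drawn at random from all genes. The sketch below computes that upper-tail probability with exact integer arithmetic. The background size of 20,000 genes, the band size of 300 genes, and the overlap of 20 are hypothetical illustration values, not figures from the study.

```python
from math import comb


def hypergeom_upper_tail(k: int, population: int, in_band: int, drawn: int) -> float:
    """P(X >= k) when `drawn` genes are sampled without replacement
    from `population` genes, of which `in_band` lie in the band."""
    k = max(k, 0)
    upper = min(in_band, drawn)
    total = comb(population, drawn)
    favourable = sum(
        comb(in_band, i) * comb(population - in_band, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: 20,000 background genes, 300 genes in the band,
# 500 DEGs in total, 20 of which fall in the band (7.5 expected by chance).
p_enriched = hypergeom_upper_tail(20, 20_000, 300, 500)
print(p_enriched)
```

The choice of background (all annotated genes versus only genes measured in the expression dataset) changes the result, so it should be stated explicitly in any real analysis.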
Step 5: Mutation Burden Analysis in Chromosomal Bands:
Somatic mutation data for COAD and READ were obtained from The Cancer Genome Atlas (TCGA) via the TCGAbiolinks package. Mutations within coding sequences (CDS) were normalized using the following formula:
where Q1 represents the first quartile of gene lengths, mitigating gene size bias. Gene length information was obtained from Ensembl using the biomaRt package. This adjustment accounts for gene size while emphasizing the relative mutation burden in longer genes. Genes with adjusted mutation counts above the median were classified as high-mutation genes. A hypergeometric test was used to assess their enrichment within chromosomal bands. Moreover, the mean adjusted mutation burden of genes in each band was compared with background genes using a t-test.
Statistical Analyses and Data Visualization:
Statistical analyses were conducted in R, and significant bands, along with their polymorphism burden, were visualized using the chromPlot package. To account for multiple hypothesis testing and control the false discovery rate, the Benjamini-Hochberg method was used to adjust p-values. Statistical significance was defined as an adjusted p-value less than 0.05.
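The Benjamini-Hochberg adjustment can be illustrated with a short pure-Python sketch. It is not the R code used in the study, and the input p-values are made up. Each p-value is multiplied by the number of tests divided by its rank, and the results are then made monotone from the largest rank downward.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for four tests.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]  # threshold used in the study
print(adj, significant)
```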
RESULTS
A total of 1,796 CRC-associated polymorphisms were compiled from 75 GWAS studies. After removing duplicates, 1,346 unique variants were retained for further analysis. Of these variants, 637 were located within protein-coding genes, with the majority (577) occurring in intronic regions. The remaining 709 variants were mapped to non-coding regions, including 503 in unannotated intergenic elements. To investigate their genomic distribution, we examined the location of these variants across chromosomal bands. The results revealed significant nonrandom clustering across thirteen chromosomal bands (1q41, 6p21, 8q24, 9q34, 10p14, 10q25, 11q12, 12p13, 15q13, 18q21, 19q13, 20p12, and 20q13) (Table 1, Figure S1).
Next, we performed two enrichment analyses. First, we retrieved all genes located within the aforementioned bands and subjected them to enrichment analysis to elucidate their functional significance. This analysis revealed a total of 207 GO terms and 10 KEGG pathways, which are summarized in Tables S1 and S2, respectively. Genes involved in a given biological pathway are generally dispersed across different chromosomes. Second, we therefore tested whether the genes associated with these KEGG pathways and GO terms are nonrandomly distributed and located primarily within the thirteen identified chromosomal bands. We retrieved all genes associated with these pathways and terms and tested them for enrichment within the previously identified chromosomal bands (see Tables S3 and S4). All chromosomal bands except 10p14 were significantly enriched for genes associated with at least one respective KEGG pathway and/or GO term. These results suggest the functional relevance of these chromosomal bands.
Following the previous enrichment analyses, in the third step we performed survival analysis on the KEGG pathways and GO terms that were mutually enriched within the aforementioned 12 chromosomal bands. This analysis evaluated the clinical relevance of our findings. The goal was to determine whether functionally enriched GO terms and KEGG pathways were associated with patient prognosis. The analysis identified three KEGG pathways and 33 GO terms that were significantly associated with survival in CRC (Table 2). Notably, these prognostic pathways and terms were associated with nine chromosomal bands: 8q24, 9q34, 10q25, 11q12, 12p13, 15q13, 18q21, 19q13, and 20q13.
In the fourth step, we examined whether the top 500 DEGs from the TCGA-COAD project were nonrandomly distributed across the CRC-associated chromosomal bands identified in the first step. Only two chromosomal bands, 8q24 and 20q13, were significantly enriched for DEGs (Table 3).
Lastly, we examined the somatic mutation burden of genes located in the CRC-associated chromosomal bands identified in the first step. No statistically significant association was found between somatic mutation burden and any of these bands (Table S6).
DISCUSSION
The present study identified thirteen chromosomal bands exhibiting a nonrandom distribution of CRC-associated GWAS polymorphisms. This finding suggests the functional relevance of these chromosomal bands in CRC. These regions have previously been associated with cancer progression. For example, 8q24 contains regulatory elements that control MYC expression [13, 14].
These regions may play critical roles in CRC pathogenesis by harboring key regulatory elements, tumor suppressor genes, or oncogenes. As previously suggested, genetic polymorphisms in certain chromosomal regions could be leveraged to develop a laboratory diagnostic test for mass screening programs, enabling the identification of individuals at high risk for various diseases [15, 16]. Similarly, genetic variations in the 13 chromosomal regions identified in this study could facilitate the development of diagnostic kits for colorectal cancer.
Survival analysis revealed that three KEGG pathways and 33 GO terms, associated with nine chromosomal bands, were significantly linked to poor CRC survival (Table 2). Many of these pathways and terms, such as those involved in olfactory and taste receptor signaling, immune regulation, microRNA (miRNA) transcription control, and enzyme inhibition, have been previously linked to CRC processes, including tumor growth, immune evasion, and microenvironmental remodeling [17-23]. These results suggest that these chromosomal bands and their associated pathways could serve as indicators of CRC prognosis.
The present study showed that the 8q24 and 20q13 chromosomal bands were enriched for DEGs and six GO terms associated with CRC patient survival. The 8q24 region, which harbors the MYC oncogene, plays a central role in cell proliferation, and its amplification has been linked to increased CRC susceptibility [13, 24]. Similarly, 20q13 is involved in the disruption of cell cycle regulation and apoptosis, with recurrent amplifications and deletions contributing to tumorigenesis [25-27]. Notably, amplification of the 20q13.33 sub-band has been reported as an early and frequent event in sporadic colorectal cancer and proposed as a potential biomarker for early tumor detection [26]. Importantly, neither of these regions showed enrichment for genes with high somatic mutation rates, and their average mutation burden was comparable to background levels. These findings support the idea that 8q24 and 20q13 act primarily as transcriptional regulatory hotspots rather than as regions whose function is altered by somatic mutations.
Collectively, this study revealed a nonrandom distribution of CRC-associated GWAS polymorphisms across thirteen chromosomal bands. Enrichment analyses highlighted 8q24 and 20q13 as regulatory hotspots, enriched for differentially expressed genes and survival-associated pathways, but not for highly mutated genes. These findings underscore the functional and prognostic importance of these regions and suggest their potential utility as genetic biomarkers for CRC diagnosis, prognosis, and population-level screening.
Acknowledgements:
I wish to express my sincere gratitude to Professor Mostafa Saadat for his invaluable guidance and support, which were instrumental in the advancement of this research.
Conflict of Interest:
The author declares no conflicts of interest.
Ethics approval and consent to participate:
Not required.
Authors’ Contribution:
SARS: Conceptualization, Methodology, Investigation, Formal Analysis, Data Curation, Visualization, Validation, and Writing.
Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.