Liquid-liquid phase separation-driven molecular subtyping and prognostic modeling in colorectal cancer.
1/5 보강
[BACKGROUND] Liquid-liquid phase separation (LLPS) orchestrates the spatiotemporal organization of biomolecular condensates and regulates numerous biological processes.
APA
Guan H, Tian C, et al. (2025). Liquid-liquid phase separation-driven molecular subtyping and prognostic modeling in colorectal cancer.. Frontiers in immunology, 16, 1741979. https://doi.org/10.3389/fimmu.2025.1741979
MLA
Guan H, et al.. "Liquid-liquid phase separation-driven molecular subtyping and prognostic modeling in colorectal cancer.." Frontiers in immunology, vol. 16, 2025, pp. 1741979.
PMID
41583431 ↗
Abstract 한글 요약
[BACKGROUND] Liquid-liquid phase separation (LLPS) orchestrates the spatiotemporal organization of biomolecular condensates and regulates numerous biological processes. However, the extent to which dysregulated LLPS facilitates the progression of colorectal cancer (CRC) has not been elucidated. Elucidating how LLPS influences CRC possibly offers valuable insights into diagnosis and therapeutic intervention.
[METHODS] Differentially expressed genes (DEGs) were identified from 566 CRC samples and 19 normal controls in the GSE39582 dataset. LLPS-linked genes were collected from the DrLLPS database. Prognostically significant genes were identified via univariate Cox regression, least absolute shrinkage and selection operator regression, and stepwise akaike information criterion algorithm. The risk score was derived utilizing the LLPS-linked gene signature. Patient characteristics were evaluated concerning the computed risk scores. The biological and clinical distinctions across high-risk and low-risk cohorts were further investigated, leveraging the COAD, READ, and GSE17536 validation cohorts. The expression and spatial distribution of the five prognostic genes were examined via the GSE166555 dataset and spatial transcriptomics analysis. The hydroxyacyl-coenzyme A dehydrogenase expression-related enrichment pathways were further analysed via weighted gene coexpression network analysis combined with Metascape. The expression and biological functions of were verified .
[RESULTS] A total of 430 LLPS-related DEGs were identified, from which five prognostic genes were selected to construct the LLPS-associated risk signature. Marked differences in gene expression profiles, overall prognosis, clinicopathological attributes, somatic mutations, signaling pathway activity, tumor microenvironment composition, and drug sensitivity were noted across the high-risk and low-risk populations. Furthermore, the expression of the five prognostic genes and biological functions of were validated through experiments.
[CONCLUSIONS] An LLPS-related prognostic model was created, enabling the stratification of the CRC population according to risk and informing individualized therapeutic strategies.
[METHODS] Differentially expressed genes (DEGs) were identified from 566 CRC samples and 19 normal controls in the GSE39582 dataset. LLPS-linked genes were collected from the DrLLPS database. Prognostically significant genes were identified via univariate Cox regression, least absolute shrinkage and selection operator regression, and stepwise akaike information criterion algorithm. The risk score was derived utilizing the LLPS-linked gene signature. Patient characteristics were evaluated concerning the computed risk scores. The biological and clinical distinctions across high-risk and low-risk cohorts were further investigated, leveraging the COAD, READ, and GSE17536 validation cohorts. The expression and spatial distribution of the five prognostic genes were examined via the GSE166555 dataset and spatial transcriptomics analysis. The hydroxyacyl-coenzyme A dehydrogenase expression-related enrichment pathways were further analysed via weighted gene coexpression network analysis combined with Metascape. The expression and biological functions of were verified .
[RESULTS] A total of 430 LLPS-related DEGs were identified, from which five prognostic genes were selected to construct the LLPS-associated risk signature. Marked differences in gene expression profiles, overall prognosis, clinicopathological attributes, somatic mutations, signaling pathway activity, tumor microenvironment composition, and drug sensitivity were noted across the high-risk and low-risk populations. Furthermore, the expression of the five prognostic genes and biological functions of were validated through experiments.
[CONCLUSIONS] An LLPS-related prognostic model was created, enabling the stratification of the CRC population according to risk and informing individualized therapeutic strategies.
🏷️ 키워드 / MeSH 📖 같은 키워드 OA만
- Humans
- Colorectal Neoplasms
- Prognosis
- Biomarkers
- Tumor
- Gene Expression Profiling
- Gene Expression Regulation
- Neoplastic
- Transcriptome
- Gene Regulatory Networks
- Male
- Female
- Databases
- Genetic
- Phase Separation
- colorectal cancer
- liquid-liquid phase separation
- prognosis
- spatial transcriptomics
- tumor immune microenvironment
같은 제1저자의 인용 많은 논문 (5)
- Global research states and trends of the circadian clock and cancer from 2001 to 2024: A bibliometric and visualization analysis.
- Nebivolol-induced bi-directional modulation of intracellular kinases contributes to the inhibition of triple-negative breast cancer.
- Melatonin Attenuates Synovial Hyperplasia, Inflammation, and Fibrosis and Postpones Osteoarthritis Progression.
- Shared immune dysregulation in systemic lupus erythematosus and colorectal cancer: a multi-omics guided discovery of DNASE1L3-centric efferocytosis deficiency.
- Nurse-led management of sialorrhea in Parkinson's disease: a pilot randomized controlled trial.
📖 전문 본문 읽기 PMC JATS · ~89 KB · 영문
Introduction
Introduction
Liquid–liquid phase separation (LLPS) refers to the process by which biomolecules such as proteins and nucleic acids form liquid membraneless organelles (MLOs) through transient, low-affinity, multivalent weak interactions (1, 2). For example, nuclei, nuclear speckles, paraspeckles, Cajal vesicles in the nucleus, and stress granules and processors in the cytoplasm are all formed through LLPS (3, 4). LLPS-related biomolecules are functionally classified into scaffolds, clients, and regulators (5). Scaffolds are drivers of LLPS and are indispensable for MLOs, which can be formed by MLOs alone or with other molecules (6). Clients are recruited and regulated by scaffolds and do not undergo LLPS by themselves, forming MLOs with scaffolds only under certain circumstances (7). Regulators typically do not constitute the MLOs themselves but include other molecules, such as proteins, RNA, or ATP, that modulate the formation or stability of MLOs (8).
MLOs participate in many physiological and signaling pathways for heterochromatin formation, DNA damage repair, RNA metabolism, and autophagy through LLPS (9, 10). For instance, the MRN complex binds to DNA double-strand breaks (DSBs), and the MRN-interacting protein undergoes LLPS, which induces autophosphorylation of ataxia telangiectasia mutated protein, activates DNA damage response signaling, and promotes homologous recombination-mediated DSB repair (11). RAP80 undergoes LLPS at DSB sites, facilitating the rapid formation of Lys63-linked polyubiquitin chains, which in turn enhance RAP80 phase separation, recruit BRCA1, and contribute to DNA repair (12).
Dysregulated LLPS is crucial in malignant tumors (13, 14). For example, in triple-negative breast cancer cells, the disordered N-terminal domain and phosphorylated residues of histone deacetylase 6 promote aberrant LLPS, resulting in disorganized chromatin structure and uncontrolled cell proliferation (15). Mechanistically, DAZAP1, an RNA-binding protein, accumulates in the nucleus through LLPS, modulates pre-mRNA alternative splicing, enhances the expression of cytochrome-c oxidase 16, and promotes mitochondrial energy metabolism and invasion in oral squamous carcinoma (16). ARID1A, a chromatin remodeling factor, forms MLOs via prion-like domain-mediated LLPS, localizes to the EWS/FLT1 enhancer, and facilitates the formation of functional chromatin remodeling centers at oncogenic target sites, inducing long-range chromatin remodeling and driving proliferation and invasion in Ewing sarcoma (17).
Colorectal cancer (CRC) is a critical global health issue. In 2022, among 20 million new cases and 9.7 million deaths, CRC was the third most prevalent (1,926,118 cases; 9.6%) and second deadliest (903,859 deaths; 9.3%) (18). Its risk factors include obesity, smoking, and poor dietary patterns. The hallmarks of CRC pathogenesis include genomic instability, epigenetic dysregulation, and aberrant gene expression (19). Despite progress in systemic therapies such as surgery, radiotherapy, chemotherapy, and immunotherapy, a marked proportion of patients remain at risk for recurrence and metastasis, resulting in poor clinical outcomes (20). Therefore, unravelling pathogenic mechanisms and finding novel treatment targets are imperative to improve customized treatments and improve the prognosis of CRC patients.
LLPS is involved in the initiation, progression, and prognosis of CRC. For example, MEX3A in the cytoplasm of CRC cells undergoes LLPS, interacts with circMPP6, and the MEX3A/circMPP6 complex recruits processing bodies, enhances the degradation of PDE5A mRNA, which facilitates proliferation and invasion, and suppresses CRC cell autophagy, thus resulting in poor survival outcomes in CRC patients (21). In another example, oxaliplatin treatment alters nucleolar LLPS dynamics in CRC cells, impairing rRNA transcription and ribosomal function, triggering cell cycle arrest, and ultimately inducing cell death (22). However, many LLPS-associated molecules remain uncharacterized in the context of CRC, and their functional significance warrants further investigation.
Given these considerations, the exploration of LLPS-associated molecular mechanisms may provide critical insights into CRC pathogenesis and prospective therapeutic targets. LLPS-related differentially expressed genes (DEGs) across malignant and healthy tissues were identified. A prognostic signature was subsequently constructed, which demonstrated robust predictive performance. Furthermore, distinct clinical outcomes, biological functions, genomic alterations, features in tumor microenvironment (TME), and cancer stemness profiles were noted across high- and low-risk CRC subtypes. In vitro experiments verified that hydroxyacyl-coenzyme A dehydrogenase (HADH) significantly affected the proliferation, apoptosis and migration of CRC cells. These findings can improve prognostic evaluation and inform personalized therapeutic strategies for CRC patients.
Liquid–liquid phase separation (LLPS) refers to the process by which biomolecules such as proteins and nucleic acids form liquid membraneless organelles (MLOs) through transient, low-affinity, multivalent weak interactions (1, 2). For example, nuclei, nuclear speckles, paraspeckles, Cajal vesicles in the nucleus, and stress granules and processors in the cytoplasm are all formed through LLPS (3, 4). LLPS-related biomolecules are functionally classified into scaffolds, clients, and regulators (5). Scaffolds are drivers of LLPS and are indispensable for MLOs, which can be formed by MLOs alone or with other molecules (6). Clients are recruited and regulated by scaffolds and do not undergo LLPS by themselves, forming MLOs with scaffolds only under certain circumstances (7). Regulators typically do not constitute the MLOs themselves but include other molecules, such as proteins, RNA, or ATP, that modulate the formation or stability of MLOs (8).
MLOs participate in many physiological and signaling pathways for heterochromatin formation, DNA damage repair, RNA metabolism, and autophagy through LLPS (9, 10). For instance, the MRN complex binds to DNA double-strand breaks (DSBs), and the MRN-interacting protein undergoes LLPS, which induces autophosphorylation of ataxia telangiectasia mutated protein, activates DNA damage response signaling, and promotes homologous recombination-mediated DSB repair (11). RAP80 undergoes LLPS at DSB sites, facilitating the rapid formation of Lys63-linked polyubiquitin chains, which in turn enhance RAP80 phase separation, recruit BRCA1, and contribute to DNA repair (12).
Dysregulated LLPS is crucial in malignant tumors (13, 14). For example, in triple-negative breast cancer cells, the disordered N-terminal domain and phosphorylated residues of histone deacetylase 6 promote aberrant LLPS, resulting in disorganized chromatin structure and uncontrolled cell proliferation (15). Mechanistically, DAZAP1, an RNA-binding protein, accumulates in the nucleus through LLPS, modulates pre-mRNA alternative splicing, enhances the expression of cytochrome-c oxidase 16, and promotes mitochondrial energy metabolism and invasion in oral squamous carcinoma (16). ARID1A, a chromatin remodeling factor, forms MLOs via prion-like domain-mediated LLPS, localizes to the EWS/FLT1 enhancer, and facilitates the formation of functional chromatin remodeling centers at oncogenic target sites, inducing long-range chromatin remodeling and driving proliferation and invasion in Ewing sarcoma (17).
Colorectal cancer (CRC) is a critical global health issue. In 2022, among 20 million new cases and 9.7 million deaths, CRC was the third most prevalent (1,926,118 cases; 9.6%) and second deadliest (903,859 deaths; 9.3%) (18). Its risk factors include obesity, smoking, and poor dietary patterns. The hallmarks of CRC pathogenesis include genomic instability, epigenetic dysregulation, and aberrant gene expression (19). Despite progress in systemic therapies such as surgery, radiotherapy, chemotherapy, and immunotherapy, a marked proportion of patients remain at risk for recurrence and metastasis, resulting in poor clinical outcomes (20). Therefore, unravelling pathogenic mechanisms and finding novel treatment targets are imperative to improve customized treatments and improve the prognosis of CRC patients.
LLPS is involved in the initiation, progression, and prognosis of CRC. For example, MEX3A in the cytoplasm of CRC cells undergoes LLPS, interacts with circMPP6, and the MEX3A/circMPP6 complex recruits processing bodies, enhances the degradation of PDE5A mRNA, which facilitates proliferation and invasion, and suppresses CRC cell autophagy, thus resulting in poor survival outcomes in CRC patients (21). In another example, oxaliplatin treatment alters nucleolar LLPS dynamics in CRC cells, impairing rRNA transcription and ribosomal function, triggering cell cycle arrest, and ultimately inducing cell death (22). However, many LLPS-associated molecules remain uncharacterized in the context of CRC, and their functional significance warrants further investigation.
Given these considerations, the exploration of LLPS-associated molecular mechanisms may provide critical insights into CRC pathogenesis and prospective therapeutic targets. LLPS-related differentially expressed genes (DEGs) across malignant and healthy tissues were identified. A prognostic signature was subsequently constructed, which demonstrated robust predictive performance. Furthermore, distinct clinical outcomes, biological functions, genomic alterations, features in tumor microenvironment (TME), and cancer stemness profiles were noted across high- and low-risk CRC subtypes. In vitro experiments verified that hydroxyacyl-coenzyme A dehydrogenase (HADH) significantly affected the proliferation, apoptosis and migration of CRC cells. These findings can improve prognostic evaluation and inform personalized therapeutic strategies for CRC patients.
Methods
Methods
Overall flowchart
The workflow is provided in Figure 1. Initially, DEGs were identified from the GSE39582 dataset. Genes overlapping with the LLPS-related gene set were considered core genes linked to CRC. The hub genes were subsequently screened via univariate Cox and least absolute shrinkage and selection operator (LASSO) regression, as well as the stepwise Akaike information criterion (stepAIC) algorithm. They were utilized to create a prognostic signature, with sample-specific risk scores derived accordingly. Differences in the TME, drug sensitivity, gene mutations, signalling pathways, and clinical prognosis were subsequently compared across high- and low-risk populations. The prognostic models were verified via the GSE17536, colon adenocarcinoma (COAD) and rectal adenocarcinoma (READ) datasets. The GSE166555 single-cell dataset and spatial transcriptome data were leveraged to assess prognostic gene expression and distribution. Finally, the expression and function of the hub genes were verified in vitro.
Data collection
Normalized expression matrices containing RNA data from CRC patients were downloaded from the GEO database via the “GEOquery” package, with a particular focus on the GSE39582 and GSE17536 datasets developed on the GPL570 platform. Separate gene expression datasets for COAD and READ were also acquired from the TCGA database via the UCSC Xena interface (https://xena.ucsc.edu/) (23). Genes with missing expression values were removed, and only protein-coding genes with protein products were retained. Only genes present in both the annotation database and the expression matrix were preserved to ensure biological relevance in subsequent analyses. For cases where multiple probes corresponded to the same gene, the probe with the highest expression level was selected to represent that gene.
After removing samples lacking prognostic information, we subsequently analyzed 566 malignant and 19 control samples from the GSE39582 dataset, 177 tumor samples from GSE17536, 430 tumor samples from the COAD dataset, and 154 tumor samples from the READ dataset (24, 25). Details of all datasets were provided in Supplementary Table S1. Leveraging 10x Genomics single-cell 3’ library technology, scRNA-seq data from twelve CRC patients in the GSE166555 dataset were retrieved from the GEO database. Spatial transcriptomic analysis of three CRC patients was performed via the Sparkle website (https://grswsci.top/). Information on genes related to LLPS was acquired from data resource of liquid–liquid phase separation database (DrLLPS, http://llps.biocuckoo.cn/) (26). A total of 3,585 genes specific to humans (85 scaffolds, 355 regulators, and 3,145 clients) were isolated for downstream investigation, details were listed in Supplementary Table S2 (27). All LLPS-related genes were assigned equal weight.
DEG screening and functional analysis
DEGs across tumor and control tissues were identified via the “limma” package in the GSE39582 dataset, and the results were displayed as volcano plots. Statistically significant genes were defined as those with a |log2-fold change| > 1 and an adjusted P ≤ 0.05. The overlap between DEGs and LLPS-related genes was illustrated via Venn diagrams. The expression of intersecting genes was presented in a heatmap generated with the “pheatmap” package. To elucidate their biological functions, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed via the “clusterProfiler” package. Multiple testing correction was performed using the Benjamini–Hochberg method, and the top 15 enriched pathways were selected for visualization bar plots. Additionally, a protein–protein interaction (PPI) network was constructed via STRING (https://cn.string-db.org/) to assess possible interactions among the shared genes.
Construction of an LLPS-linked gene signature
A three-step approach was employed to develop a robust LLPS-associated gene signature for prognostic evaluation utilizing the GSE39582 dataset. First, univariate Cox regression analysis identified DEGs linked to overall survival (OS). Next, the LASSO regression algorithm, implemented via the “glmnet” package, was applied to further screen the variables. The stepAIC from “MASS” subsequently refined the selection to retain the most prognostically significant genes. Finally, these key LLPS-related genes were integrated into a multivariate Cox proportional hazards model. The gene signature was derived via the following methods:
In the model, “Expi” denotes the expression of a gene, whereas “Coei” indicates its corresponding coefficient. With these variables, every patient was assigned a risk score according to the LLPS-associated gene signature. The participants were then divided into high- and low-risk cohorts via the median risk score as the cut-off. The “survival” package facilitated Kaplan–Meier survival analysis and log-rank testing for OS comparison across groups. The “ggrisk” package was employed to graphically display the risk score distribution, survival duration, and mortality status. For external validation, the GSE17536, COAD, and READ datasets were analysed to evaluate the broader applicability of this LLPS-based signature.
Nomogram model development
The prognostic independence of clinical parameters (age, chemotherapy, clinical stage, T, N, and M classifications, and risk score) was estimated in the GSE39582 dataset via univariate and multivariate Cox regression analyses. A prognostic nomogram was subsequently generated utilizing the aforementioned significant risk factors with “rms” and “survival” package. To determine the nomogram’s predictive precision, researchers have employed the area under the receiver operating characteristic (ROC) curve (AUC), Kaplan–Meier survival analysis, calibration curves, and decision curve analysis (DCA). The COAD cohort data provided external validation of the model’s performance.
TME evaluation
Additionally, the expression profiles of immune checkpoint-linked molecules and human leukocyte antigen (HLA) genes were examined. Estimation of stromal and immune cells in malignant tumour tissues using expression data (ESTIMATE) method based on single-sample gene set enrichment analysis (ssGSEA) algorithm was applied to infer tumor purity, stromal, immune, and ESTIMATE scores (combining stromal and immune components) via the “estimate” package (28). The tumor immune dysfunction and exclusion (TIDE, https://tide.nki.nl/) score was employed to predict the response to immunotherapy between the two groups. The CIBERSORT method based on linear support vector regression was conducted to assess the composition of tumor-infiltrating immune cells via the “CIBERSORT” package (29). Furthermore, disparities in immune cell infiltration were evaluated across high-risk and low-risk populations. The relationships among immune cells and their potential correlations with hub gene expression were revealed via the “corrplot” package.
Stemness signature analysis
StemChecker (http://stemchecker.sysbiolab.eu/) was utilized to identify and examine stemness-associated signatures within the relevant gene sets (30). A collection of 26 stemness gene sets was first obtained from StemChecker, incorporating diverse data types such as transcriptional profiles, RNAi screening results, manually compiled literature, computationally inferred transcription factor (TF)-regulated genes, and Nanog-linked signatures. The stemness score was evaluated based on ssGSEA by calculating the similarity between tumor cells and predefined stem cell gene signatures.
Functional enrichment analysis
DEGs across two risk cohorts in the GSE39582 dataset were identified and visualized via a volcano plot. The possible biological functions of these genes were examined through GO enrichment analysis. The enriched pathways and biological processes were further integrated via the Metascape database (https://Metascape.org/gp/index.html) (31). GSEA based on KEGG pathways was performed with the “clusterProfiler” package to identify differentially enriched pathways between two risk groups, applying the thresholds of |normalized enrichment score (NES)| > 1, nominal p < 0.05, and q < 0.25 (32). Additionally, gene set variation analysis (GSVA) was carried out via “GSVA” package to derive enrichment scores of predefined gene sets across samples, with the threshold of adjusted P < 0.05.
Comprehensive network analysis
NetworkAnalyst (http://www.networkanalyst.ca) is an application that maps genes of interest to related databases and performs network analysis online (33). Five LLPS-related genes were uploaded to the NetworkAnalyst website, and their links to TFs and miRNAs were analysed. Regulatory interaction data, which were carefully extracted from published studies, were acquired through the RegNetwork repository. TF–gene–miRNA interaction networks were visualized via Cytoscape 3.10.2.
Somatic variant analysis
Somatic variant data for CRC were downloaded from the UCSC Xena platform, and mutation patterns in 430 DEGs were analyzed using the “maftools” package (34). Additionally, copy number variation (CNV) data encompassing gene amplifications and deletions were retrieved from the UCSC Xena platform for COAD and READ cohorts to analyze mutation patterns of the 430 DEGs.
Single cell RNA sequencing analysis
The “Seurat”package facilitated the conversion of scRNA-seq analysis data of GSE166555 dataset into an analysable object. After normalization (NormalizeData), highly variable genes were selected (FindVariableFeatures) and data were scaled. Batch-corrected principal component analysis was conducted using these genes. Cell clusters were identified at a resolution of 0.8 and annotated against marker databases. Subsequently, cell clustering and annotation were performed, followed by dimensionality reduction and visualization using the Uniform Manifold Approximation and Projection (UMAP) method. The activity of 430 LLPS-related DEGs in each cell was subsequently quantified with the “AUCell” package and visualized with the “Seurat” package (LLPS enrichment score). The expression and distribution of the five prognostic genes across cell types were examined. Taking normal epithelial cells as the reference, the CNV level of tumor epithelial cells was evaluated via the “inferCNV” package. The cells were divided into high and low groups on the basis of the median value of the CNV score, and the expression levels of five prognostic genes in the two groups were evaluated.
Spatial transcriptome analysis
Spatial transcriptomics enabled the spatial distribution and expression evaluation of the five prognostic genes in the tumor tissues of the CRC population. Initially, the cellular composition of each microregion was estimated via inverse convolution methods. Quality control procedures were implemented on the basis of gene expression counts, unique molecular identifiers, and the mitochondrial RNA content per cell. A signature score matrix was created by averaging the expression of the top 25 cell sorting-specific genes for each microregion. Enrichment score matrices were generated via the “Cottrazm” package. Microregions with enrichment scores equal to 1 were classified as malignant; others were considered nonmalignant. Differential expression of the five prognostic genes across subcohorts was examined via the Wilcoxon rank-sum test. Spearman correlation analyses between gene expression and cellular composition were presented as a heatmap generated with the “linkET” package.
Screening of potential small molecule therapeutics for CRC
Data on tumor cell sensitivity to small-molecule compounds were retrieved from Genomics of Drug Sensitivity in Cancer 2 (GDSC2) database (https://osf.io/c6tfx/) (35). Drug sensitivity (represented as the IC50) was predicted for each sample in the GSE39582 cohort via the “oncoPredict” package. Spearman correlation coefficients were used to explore the relationships of the expression of the five prognostic genes with the IC50 values. Disparities in the IC50 across high- and low-risk cohorts were visualized as violin plots. The chemical structures of the compounds were obtained from ChemSpider (http://www.chemspider.com) and rendered via ChemDraw software (version 22.0.0.22).
Biological functions of HADH genes analysed via weighted gene coexpression network analysis
We further compared the expression levels of the five prognostic genes and OS in the high-risk and low-risk groups. The results of the comprehensive analysis of the bulk RNA transcriptome, scRNA-seq, and spatial transcriptome suggested that HADH was downregulated in tumor tissues and was expressed mainly in tumor cells with low CNV, as was its favourable prognostic value. This gene was selected for further functional investigation. First, a scale-free coexpression gene network and average connectivity were constructed via the “WGCNA” package to optimize the soft threshold power. The modular trait was subsequently constructed to evaluate the associations between different module eigengenes and HADH expression. The module with the most significant positive correlation of module-HADH expression was screened. The module genes were imported into the Metascape website to explore the pathways relevant to HADH expression.
Immunohistochemistry
Tissue microarrays comprising tumor specimens and paired adjacent nontumorous tissues from 80 individuals with CRC were obtained from AIFANG Biotech Co., Ltd. (AF-CocSur2201, China). The tissues were first dewaxed and rehydrated. The antigens were retrieved with EDTA buffer. The endogenous peroxidase activity of the sections was subsequently inhibited with hydrogen peroxide, and the sections were blocked with 3% BSA. The sections were incubated with an antibody that targets HADH (Proteintech, 19828-1-AP, China) at 4 °C overnight. After washing, a secondary antibody (Servicebio, GB23303, China) was applied for 1 hour. DAB was used to visualize the immunoreaction, and hematoxylin was used to contrast the cell nuclei. The stained samples were imaged and analysed under a microscope (KF-FL-020, KFBIO, China). For scoring, staining intensity was classified as 0 (no staining), 1 (light yellow), 2 (tan), or 3 (deep brown). The percentages of positively stained cells were 0 (0–5%), 1 (6–25%), 2 (26–50%), 3 (51–75%), or 4 (>75%). The ultimate immunoreactivity score (IRS) was obtained by multiplying the intensity and percentage. Tumor tissues were divided into high- and low-HADH expression cohorts according to the median IRS value. Clinical and pathological differences across cohorts were compared, and the results were visualized via the “ComplexHeatmap” package.
Cell culture
Colorectal cancer cell lines (HCT116, RKO, HT29, HCT8, SW620, and KM12), immortalized NCM460 epithelial cells, and HEK-293T cells were obtained from OriCell (China). HT29 cells were cultured in McCoy’s 5A medium (G4541, Servicebio, China) supplemented with 10% fetal bovine serum (FBS, A5256701, Gibco, USA) and 1% penicillin/streptomycin (P/S, 15140122, Gibco, USA), and the other cells were cultured in DMEM (11965092, Gibco, USA) supplemented with FBS and P/S. All the cells were placed in a humid incubator (Thermo Fisher Scientific, USA) (37°C, 5% CO2).
Lentiviral transfection
The HADH overexpression plasmid (HADH), control plasmid (NC), and helper plasmids (pSPAX2 and pMD2G) were purchased from Miao Ling Company (China). HEK-293T cells were transfected with plasmids containing polyethylenimine (40816ES01, Yeasen, China) and cultured for 72 h. The lentivirus-containing cell supernatant was subsequently collected. The KM12 and RKO cells were incubated with the lentiviral supernatant and 10 μg/mL polybrene (C0351, Beyotime, China) for 48 hours and then with puromycin (2 μg/mL; ST551, Beyotime, China). Successful overexpression was confirmed by real−time quantitative polymerase chain reaction (RT−qPCR) and Western blotting.
RT−qPCR
Total cellular RNA was extracted with TRIzol reagent (15596026CN, Thermo Fisher Scientific, USA). cDNA was derived via reverse transcription according to the manufacturer’s guidelines for the Reverse Transcription Kit (R433, Vazyme, China). Fifteen pairs of tumor and matched paracancerous cDNA microarrays were procured from Outdo Biotech Company (MecDNA-HColA030CS03, China). qPCR was performed with a SYBR Green PCR Kit (Q712, Vazyme, China). The primer sequences were detailed in Supplementary Table S3. The relative expression of mRNA was computed via the 2-ΔΔCt approach and standardized with β-actin as the internal control.
Western blotting
The cells were lysed in RIPA buffer (CW2333S, CWBIO, China) supplemented with 1% protease inhibitor cocktail (CW2200S, CWBIO, China) for 30 minutes. Protein concentrations were determined via a BCA kit (23227, Thermo Fisher Scientific, USA). Proteins were transferred onto PVDF membranes (IPVH00010, Millipore, USA) after being separated via SDS–PAGE. After being blocked for an hour in 5% nonfat milk, the samples were incubated overnight at 4°C with antibodies targeting HADH or β-tubulin (1:5000, 80762-1-RR; Proteintech, China). The membranes were then incubated with HRP-conjugated secondary antibodies (GAR1007, MultiSciences, China) for 1 hour. Following washes, protein signals were detected via a chemiluminescence reagent (GK10008, GLPBIO, USA). Densitometric analysis of HADH expression relative to that of β-tubulin was performed via ImageJ.
Cell viability assay
Briefly, 1,000 cells per well were seeded in 96-well plates. At 0, 24, 48, 72, and 96 h, the cells were treated with culture medium supplemented with 10% CCK-8 solution (GK10001, GLPBIO, USA) and incubated for 2 h at 37°C. The optical density at 450 nm was measured via a microplate reader (Synergy H1, BioTek, USA).
Colony formation assay
Cells were seeded in plates with 6 wells with1,000 cells in each well and cultured with medium replacement every 72 h. After two weeks, the cells were rinsed with PBS, fixed with 4% paraformaldehyde (G1101, Servicebio, China) for 30 min, and stained with crystal violet reagent (C0121, Beyotime, China) for 30 min. Colonies were visualized via the Amersham ImageQuant 800 system (Cytiva, Japan), and quantification was performed via ImageJ.
Cell proliferation assay
The cells were seeded in 6-well plates and incubated for 24 hours before they were incubated with an EdU solution (C0078S, Beyotime, China) for 4 hours. Following fixation in 4% paraformaldehyde, the cells were washed three times with PBS supplemented with 3% BSA and incubated with permeabilization solution (P0097, Beyotime, China) for 15 minutes. The samples were incubated with click additive solution for half an hour, stained with Hoechst for 10 minutes, and visualized via a microscope (BX53, Olympus, Japan). The ratio of EdU-positive cells was determined via ImageJ software. For each well, the percentage of EdU-positive cells was determined by counting approximately 300 cells per field across five randomly selected fields of view, and the mean value was calculated.
Apoptosis assay
Apoptosis was assessed via an Annexin V/PI detection kit (HY-K1093, MedChemExpress, USA). The cells were cultured for 24 hours and harvested via trypsin digestion (C0207, Beyotime, China). Subsequent to washing with PBS, the cell suspensions in binding buffer were subjected to 15 minutes of incubation in the dark with both Annexin V and PI stains. Fluorescent signals were detected via flow cytometry (Canto, BD Biosciences, USA). Data analysis was enabled by FlowJo 10.8.1 (Tree Star, USA).
Wound healing assay
The cells were seeded in 12-well plates until approximately 90% confluence was achieved. Mechanical wounds were created with a 10 μl pipette tip. After being washed, the cells were cultured in serum-free medium. The observation time points included immediately after wounding and 24 hours after wounding, and a microscope was used. The area of wound closure was quantified via ImageJ.
Transwell assay
Transwell chambers (8 μm pore polyester membrane inserts; TCS020024, Jet, China) were either uncoated or precoated with 60 μl of matrix gel (082704T, Mogengel, Xiamen, China) to assess cell migration or invasion. Cells (25,000 KM12 or 100,000 RKO per well) suspended in serum-free medium were seeded into the upper chambers. The lower chambers were supplemented with 600 μl of medium containing 20% FBS as a chemoattractant. After 48 h, the cells remaining in the upper chambers were removed. The inserts were then fixed with paraformaldehyde and stained with crystal violet. The cells that had migrated or invaded were visualized and quantified via a microscope and ImageJ.
Statistical analysis
All data originating from public databases were analysed via R 4.4.1. All experiments were performed in triplicate with three technical replicates, and independently repeated three or more times as biological replicates. The results are presented as the mean ± SEM of independent determinations. Normally distributed variables in the two cohorts were compared via unpaired t tests, whereas nonnormally distributed variables were explored via the Mann–Whitney test. The log-rank test was used to evaluate survival disparities through Kaplan–Meier survival curves. The false discovery rate (FDR) was controlled through Benjamini–Hochberg correction.
Overall flowchart
The workflow is provided in Figure 1. Initially, DEGs were identified from the GSE39582 dataset. Genes overlapping with the LLPS-related gene set were considered core genes linked to CRC. The hub genes were subsequently screened via univariate Cox and least absolute shrinkage and selection operator (LASSO) regression, as well as the stepwise Akaike information criterion (stepAIC) algorithm. They were utilized to create a prognostic signature, with sample-specific risk scores derived accordingly. Differences in the TME, drug sensitivity, gene mutations, signalling pathways, and clinical prognosis were subsequently compared across high- and low-risk populations. The prognostic models were verified via the GSE17536, colon adenocarcinoma (COAD) and rectal adenocarcinoma (READ) datasets. The GSE166555 single-cell dataset and spatial transcriptome data were leveraged to assess prognostic gene expression and distribution. Finally, the expression and function of the hub genes were verified in vitro.
Data collection
Normalized expression matrices containing RNA data from CRC patients were downloaded from the GEO database via the “GEOquery” package, with a particular focus on the GSE39582 and GSE17536 datasets developed on the GPL570 platform. Separate gene expression datasets for COAD and READ were also acquired from the TCGA database via the UCSC Xena interface (https://xena.ucsc.edu/) (23). Genes with missing expression values were removed, and only protein-coding genes with protein products were retained. Only genes present in both the annotation database and the expression matrix were preserved to ensure biological relevance in subsequent analyses. For cases where multiple probes corresponded to the same gene, the probe with the highest expression level was selected to represent that gene.
After removing samples lacking prognostic information, we subsequently analyzed 566 malignant and 19 control samples from the GSE39582 dataset, 177 tumor samples from GSE17536, 430 tumor samples from the COAD dataset, and 154 tumor samples from the READ dataset (24, 25). Details of all datasets were provided in Supplementary Table S1. Leveraging 10x Genomics single-cell 3’ library technology, scRNA-seq data from twelve CRC patients in the GSE166555 dataset were retrieved from the GEO database. Spatial transcriptomic analysis of three CRC patients was performed via the Sparkle website (https://grswsci.top/). Information on genes related to LLPS was acquired from data resource of liquid–liquid phase separation database (DrLLPS, http://llps.biocuckoo.cn/) (26). A total of 3,585 genes specific to humans (85 scaffolds, 355 regulators, and 3,145 clients) were isolated for downstream investigation, details were listed in Supplementary Table S2 (27). All LLPS-related genes were assigned equal weight.
DEG screening and functional analysis
DEGs across tumor and control tissues were identified via the “limma” package in the GSE39582 dataset, and the results were displayed as volcano plots. Statistically significant genes were defined as those with a |log2-fold change| > 1 and an adjusted P ≤ 0.05. The overlap between DEGs and LLPS-related genes was illustrated via Venn diagrams. The expression of intersecting genes was presented in a heatmap generated with the “pheatmap” package. To elucidate their biological functions, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed via the “clusterProfiler” package. Multiple testing correction was performed using the Benjamini–Hochberg method, and the top 15 enriched pathways were selected for visualization bar plots. Additionally, a protein–protein interaction (PPI) network was constructed via STRING (https://cn.string-db.org/) to assess possible interactions among the shared genes.
Construction of an LLPS-linked gene signature
A three-step approach was employed to develop a robust LLPS-associated gene signature for prognostic evaluation utilizing the GSE39582 dataset. First, univariate Cox regression analysis identified DEGs linked to overall survival (OS). Next, the LASSO regression algorithm, implemented via the “glmnet” package, was applied to further screen the variables. The stepAIC from “MASS” subsequently refined the selection to retain the most prognostically significant genes. Finally, these key LLPS-related genes were integrated into a multivariate Cox proportional hazards model. The gene signature was derived via the following methods:
In the model, “Expi” denotes the expression of a gene, whereas “Coei” indicates its corresponding coefficient. With these variables, every patient was assigned a risk score according to the LLPS-associated gene signature. The participants were then divided into high- and low-risk cohorts via the median risk score as the cut-off. The “survival” package facilitated Kaplan–Meier survival analysis and log-rank testing for OS comparison across groups. The “ggrisk” package was employed to graphically display the risk score distribution, survival duration, and mortality status. For external validation, the GSE17536, COAD, and READ datasets were analysed to evaluate the broader applicability of this LLPS-based signature.
Nomogram model development
The prognostic independence of clinical parameters (age, chemotherapy, clinical stage, T, N, and M classifications, and risk score) was estimated in the GSE39582 dataset via univariate and multivariate Cox regression analyses. A prognostic nomogram was subsequently generated utilizing the aforementioned significant risk factors with “rms” and “survival” package. To determine the nomogram’s predictive precision, researchers have employed the area under the receiver operating characteristic (ROC) curve (AUC), Kaplan–Meier survival analysis, calibration curves, and decision curve analysis (DCA). The COAD cohort data provided external validation of the model’s performance.
TME evaluation
Additionally, the expression profiles of immune checkpoint-linked molecules and human leukocyte antigen (HLA) genes were examined. Estimation of stromal and immune cells in malignant tumour tissues using expression data (ESTIMATE) method based on single-sample gene set enrichment analysis (ssGSEA) algorithm was applied to infer tumor purity, stromal, immune, and ESTIMATE scores (combining stromal and immune components) via the “estimate” package (28). The tumor immune dysfunction and exclusion (TIDE, https://tide.nki.nl/) score was employed to predict the response to immunotherapy between the two groups. The CIBERSORT method based on linear support vector regression was conducted to assess the composition of tumor-infiltrating immune cells via the “CIBERSORT” package (29). Furthermore, disparities in immune cell infiltration were evaluated across high-risk and low-risk populations. The relationships among immune cells and their potential correlations with hub gene expression were revealed via the “corrplot” package.
Stemness signature analysis
StemChecker (http://stemchecker.sysbiolab.eu/) was utilized to identify and examine stemness-associated signatures within the relevant gene sets (30). A collection of 26 stemness gene sets was first obtained from StemChecker, incorporating diverse data types such as transcriptional profiles, RNAi screening results, manually compiled literature, computationally inferred transcription factor (TF)-regulated genes, and Nanog-linked signatures. The stemness score was evaluated based on ssGSEA by calculating the similarity between tumor cells and predefined stem cell gene signatures.
Functional enrichment analysis
DEGs across two risk cohorts in the GSE39582 dataset were identified and visualized via a volcano plot. The possible biological functions of these genes were examined through GO enrichment analysis. The enriched pathways and biological processes were further integrated via the Metascape database (https://Metascape.org/gp/index.html) (31). GSEA based on KEGG pathways was performed with the “clusterProfiler” package to identify differentially enriched pathways between two risk groups, applying the thresholds of |normalized enrichment score (NES)| > 1, nominal p < 0.05, and q < 0.25 (32). Additionally, gene set variation analysis (GSVA) was carried out via “GSVA” package to derive enrichment scores of predefined gene sets across samples, with the threshold of adjusted P < 0.05.
Comprehensive network analysis
NetworkAnalyst (http://www.networkanalyst.ca) is an application that maps genes of interest to related databases and performs network analysis online (33). Five LLPS-related genes were uploaded to the NetworkAnalyst website, and their links to TFs and miRNAs were analysed. Regulatory interaction data, which were carefully extracted from published studies, were acquired through the RegNetwork repository. TF–gene–miRNA interaction networks were visualized via Cytoscape 3.10.2.
Somatic variant analysis
Somatic variant data for CRC were downloaded from the UCSC Xena platform, and mutation patterns in 430 DEGs were analyzed using the “maftools” package (34). Additionally, copy number variation (CNV) data encompassing gene amplifications and deletions were retrieved from the UCSC Xena platform for COAD and READ cohorts to analyze mutation patterns of the 430 DEGs.
Single cell RNA sequencing analysis
The “Seurat”package facilitated the conversion of scRNA-seq analysis data of GSE166555 dataset into an analysable object. After normalization (NormalizeData), highly variable genes were selected (FindVariableFeatures) and data were scaled. Batch-corrected principal component analysis was conducted using these genes. Cell clusters were identified at a resolution of 0.8 and annotated against marker databases. Subsequently, cell clustering and annotation were performed, followed by dimensionality reduction and visualization using the Uniform Manifold Approximation and Projection (UMAP) method. The activity of 430 LLPS-related DEGs in each cell was subsequently quantified with the “AUCell” package and visualized with the “Seurat” package (LLPS enrichment score). The expression and distribution of the five prognostic genes across cell types were examined. Taking normal epithelial cells as the reference, the CNV level of tumor epithelial cells was evaluated via the “inferCNV” package. The cells were divided into high and low groups on the basis of the median value of the CNV score, and the expression levels of five prognostic genes in the two groups were evaluated.
Spatial transcriptome analysis
Spatial transcriptomics enabled the spatial distribution and expression evaluation of the five prognostic genes in the tumor tissues of the CRC population. Initially, the cellular composition of each microregion was estimated via inverse convolution methods. Quality control procedures were implemented on the basis of gene expression counts, unique molecular identifiers, and the mitochondrial RNA content per cell. A signature score matrix was created by averaging the expression of the top 25 cell sorting-specific genes for each microregion. Enrichment score matrices were generated via the “Cottrazm” package. Microregions with enrichment scores equal to 1 were classified as malignant; others were considered nonmalignant. Differential expression of the five prognostic genes across subcohorts was examined via the Wilcoxon rank-sum test. Spearman correlation analyses between gene expression and cellular composition were presented as a heatmap generated with the “linkET” package.
Screening of potential small molecule therapeutics for CRC
Data on tumor cell sensitivity to small-molecule compounds were retrieved from Genomics of Drug Sensitivity in Cancer 2 (GDSC2) database (https://osf.io/c6tfx/) (35). Drug sensitivity (represented as the IC50) was predicted for each sample in the GSE39582 cohort via the “oncoPredict” package. Spearman correlation coefficients were used to explore the relationships of the expression of the five prognostic genes with the IC50 values. Disparities in the IC50 across high- and low-risk cohorts were visualized as violin plots. The chemical structures of the compounds were obtained from ChemSpider (http://www.chemspider.com) and rendered via ChemDraw software (version 22.0.0.22).
Biological functions of HADH genes analysed via weighted gene coexpression network analysis
We further compared the expression levels of the five prognostic genes and OS in the high-risk and low-risk groups. The results of the comprehensive analysis of the bulk RNA transcriptome, scRNA-seq, and spatial transcriptome suggested that HADH was downregulated in tumor tissues and was expressed mainly in tumor cells with low CNV, as was its favourable prognostic value. This gene was selected for further functional investigation. First, a scale-free coexpression gene network and average connectivity were constructed via the “WGCNA” package to optimize the soft threshold power. The modular trait was subsequently constructed to evaluate the associations between different module eigengenes and HADH expression. The module with the most significant positive correlation of module-HADH expression was screened. The module genes were imported into the Metascape website to explore the pathways relevant to HADH expression.
Immunohistochemistry
Tissue microarrays comprising tumor specimens and paired adjacent nontumorous tissues from 80 individuals with CRC were obtained from AIFANG Biotech Co., Ltd. (AF-CocSur2201, China). The tissues were first dewaxed and rehydrated. The antigens were retrieved with EDTA buffer. The endogenous peroxidase activity of the sections was subsequently inhibited with hydrogen peroxide, and the sections were blocked with 3% BSA. The sections were incubated with an antibody that targets HADH (Proteintech, 19828-1-AP, China) at 4 °C overnight. After washing, a secondary antibody (Servicebio, GB23303, China) was applied for 1 hour. DAB was used to visualize the immunoreaction, and hematoxylin was used to contrast the cell nuclei. The stained samples were imaged and analysed under a microscope (KF-FL-020, KFBIO, China). For scoring, staining intensity was classified as 0 (no staining), 1 (light yellow), 2 (tan), or 3 (deep brown). The percentages of positively stained cells were 0 (0–5%), 1 (6–25%), 2 (26–50%), 3 (51–75%), or 4 (>75%). The ultimate immunoreactivity score (IRS) was obtained by multiplying the intensity and percentage. Tumor tissues were divided into high- and low-HADH expression cohorts according to the median IRS value. Clinical and pathological differences across cohorts were compared, and the results were visualized via the “ComplexHeatmap” package.
Cell culture
Colorectal cancer cell lines (HCT116, RKO, HT29, HCT8, SW620, and KM12), immortalized NCM460 epithelial cells, and HEK-293T cells were obtained from OriCell (China). HT29 cells were cultured in McCoy’s 5A medium (G4541, Servicebio, China) supplemented with 10% fetal bovine serum (FBS, A5256701, Gibco, USA) and 1% penicillin/streptomycin (P/S, 15140122, Gibco, USA), and the other cells were cultured in DMEM (11965092, Gibco, USA) supplemented with FBS and P/S. All the cells were placed in a humid incubator (Thermo Fisher Scientific, USA) (37°C, 5% CO2).
Lentiviral transfection
The HADH overexpression plasmid (HADH), control plasmid (NC), and helper plasmids (pSPAX2 and pMD2G) were purchased from Miao Ling Company (China). HEK-293T cells were transfected with plasmids containing polyethylenimine (40816ES01, Yeasen, China) and cultured for 72 h. The lentivirus-containing cell supernatant was subsequently collected. The KM12 and RKO cells were incubated with the lentiviral supernatant and 10 μg/mL polybrene (C0351, Beyotime, China) for 48 hours and then with puromycin (2 μg/mL; ST551, Beyotime, China). Successful overexpression was confirmed by real−time quantitative polymerase chain reaction (RT−qPCR) and Western blotting.
RT−qPCR
Total cellular RNA was extracted with TRIzol reagent (15596026CN, Thermo Fisher Scientific, USA). cDNA was derived via reverse transcription according to the manufacturer’s guidelines for the Reverse Transcription Kit (R433, Vazyme, China). Fifteen pairs of tumor and matched paracancerous cDNA microarrays were procured from Outdo Biotech Company (MecDNA-HColA030CS03, China). qPCR was performed with a SYBR Green PCR Kit (Q712, Vazyme, China). The primer sequences were detailed in Supplementary Table S3. The relative expression of mRNA was computed via the 2-ΔΔCt approach and standardized with β-actin as the internal control.
Western blotting
The cells were lysed in RIPA buffer (CW2333S, CWBIO, China) supplemented with 1% protease inhibitor cocktail (CW2200S, CWBIO, China) for 30 minutes. Protein concentrations were determined via a BCA kit (23227, Thermo Fisher Scientific, USA). Proteins were transferred onto PVDF membranes (IPVH00010, Millipore, USA) after being separated via SDS–PAGE. After being blocked for an hour in 5% nonfat milk, the samples were incubated overnight at 4°C with antibodies targeting HADH or β-tubulin (1:5000, 80762-1-RR; Proteintech, China). The membranes were then incubated with HRP-conjugated secondary antibodies (GAR1007, MultiSciences, China) for 1 hour. Following washes, protein signals were detected via a chemiluminescence reagent (GK10008, GLPBIO, USA). Densitometric analysis of HADH expression relative to that of β-tubulin was performed via ImageJ.
Cell viability assay
Briefly, 1,000 cells per well were seeded in 96-well plates. At 0, 24, 48, 72, and 96 h, the cells were treated with culture medium supplemented with 10% CCK-8 solution (GK10001, GLPBIO, USA) and incubated for 2 h at 37°C. The optical density at 450 nm was measured via a microplate reader (Synergy H1, BioTek, USA).
Colony formation assay
Cells were seeded in plates with 6 wells with1,000 cells in each well and cultured with medium replacement every 72 h. After two weeks, the cells were rinsed with PBS, fixed with 4% paraformaldehyde (G1101, Servicebio, China) for 30 min, and stained with crystal violet reagent (C0121, Beyotime, China) for 30 min. Colonies were visualized via the Amersham ImageQuant 800 system (Cytiva, Japan), and quantification was performed via ImageJ.
Cell proliferation assay
The cells were seeded in 6-well plates and incubated for 24 hours before they were incubated with an EdU solution (C0078S, Beyotime, China) for 4 hours. Following fixation in 4% paraformaldehyde, the cells were washed three times with PBS supplemented with 3% BSA and incubated with permeabilization solution (P0097, Beyotime, China) for 15 minutes. The samples were incubated with click additive solution for half an hour, stained with Hoechst for 10 minutes, and visualized via a microscope (BX53, Olympus, Japan). The ratio of EdU-positive cells was determined via ImageJ software. For each well, the percentage of EdU-positive cells was determined by counting approximately 300 cells per field across five randomly selected fields of view, and the mean value was calculated.
Apoptosis assay
Apoptosis was assessed via an Annexin V/PI detection kit (HY-K1093, MedChemExpress, USA). The cells were cultured for 24 hours and harvested via trypsin digestion (C0207, Beyotime, China). Subsequent to washing with PBS, the cell suspensions in binding buffer were subjected to 15 minutes of incubation in the dark with both Annexin V and PI stains. Fluorescent signals were detected via flow cytometry (Canto, BD Biosciences, USA). Data analysis was enabled by FlowJo 10.8.1 (Tree Star, USA).
Wound healing assay
The cells were seeded in 12-well plates until approximately 90% confluence was achieved. Mechanical wounds were created with a 10 μl pipette tip. After being washed, the cells were cultured in serum-free medium. The observation time points included immediately after wounding and 24 hours after wounding, and a microscope was used. The area of wound closure was quantified via ImageJ.
Transwell assay
Transwell chambers (8 μm pore polyester membrane inserts; TCS020024, Jet, China) were either uncoated or precoated with 60 μl of matrix gel (082704T, Mogengel, Xiamen, China) to assess cell migration or invasion. Cells (25,000 KM12 or 100,000 RKO per well) suspended in serum-free medium were seeded into the upper chambers. The lower chambers were supplemented with 600 μl of medium containing 20% FBS as a chemoattractant. After 48 h, the cells remaining in the upper chambers were removed. The inserts were then fixed with paraformaldehyde and stained with crystal violet. The cells that had migrated or invaded were visualized and quantified via a microscope and ImageJ.
Statistical analysis
All data originating from public databases were analysed via R 4.4.1. All experiments were performed in triplicate with three technical replicates, and independently repeated three or more times as biological replicates. The results are presented as the mean ± SEM of independent determinations. Normally distributed variables in the two cohorts were compared via unpaired t tests, whereas nonnormally distributed variables were explored via the Mann–Whitney test. The log-rank test was used to evaluate survival disparities through Kaplan–Meier survival curves. The false discovery rate (FDR) was controlled through Benjamini–Hochberg correction.
Results
Results
LLPS-related DEGs were identified in the CRC cohort
A total of 1,967 genes demonstrated significant differences in expression (1,087 with elevated expression and 880 with reduced expression) in tumor versus normal tissues within the GSE39582 dataset (Figure 2A). Among these DEGs, 430 DEGs related to LLPS were identified between tumor and normal tissues, representing distinct transcriptomic features (Figure 2B). The expression of these 430 LLPS-linked DEGs in the GSE39582 cohort was presented as a heatmap (Figure 2C). The complex interactions among the proteins encoded by the 430 DEGs were revealed via a protein–protein interaction (PPI) network (Figure 2D). The biological functions and regulatory mechanisms of LLPS-linked DEGs were elucidated via GO and KEGG enrichment analyses. The former revealed significant enrichment in ribonucleoprotein complex biogenesis, noncoding RNA processing, ribosome biogenesis, organelle fission, and ribosomal RNA metabolic processes (Figure 2E). The latter indicated enrichment in pathways related to the cell cycle, DNA replication, ribosome biogenesis in eukaryotes, progesterone-mediated oocyte maturation, and fatty acid metabolism (Figure 2F).
The LLPS related gene signature was constructed to predict prognosis
The prognostic model construction was conducted as follows: Starting from 430 DEGs, 79 genes were identified via univariate Cox regression. Subsequent LASSO regression narrowed the candidates to seven genes, and stepAIC optimization finalized a five-gene signature (Figures 3A–C). Multivariate Cox proportional hazards modeling generated coefficients for these genes, yielding the risk score formula: Risk score = (-0.3059) × aquaporin 11 (AQP11) + (-0.2237) × coiled-coil domain-containing protein 34 (CCDC34) + (-0.4115) × fibrinogen-like protein (FBL) + (-0.2149) × HADH + (-0.2564) × RAS-associated protein 15 (RAB15) (Figure 3D). Using the median risk score as the threshold, patients were stratified into high-risk (score ≥ median) and low-risk (score < median) groups. Notably, all coefficients were negative, indicating that higher expression of these five genes correlates with lower risk scores. This recalibration clarifies that elevated expression of these genes confers protective effects, consistent with low-risk group association. Kaplan‒Meier analysis revealed notably poorer OS in the high-risk cohort than in the low-risk cohort within the GSE39582 cohort (median OS: 102.0 months vs. 183.0 months, P < 0.0001; Figure 3E). Consistent findings were noted in three independent validation cohorts: the high-risk population displayed worse OS in the GSE17536 and COAD datasets (P = 0.026 and P = 0.0021), with a similar trend noted in the READ set (P = 0.069; Figures 3F-H). The risk score distributions, survival rates, and gene expression profiles across the CRC cohort were presented in Figures 3I-L. An increasing trend in mortality was noted with increasing risk scores. These findings confirm the robustness and prediction performance of the LLPS-associated gene signature across CRC datasets.
A nomogram model based on the risk score was established to predict patient prognosis
In the GSE39582 cohort, the prognostic significance of the risk score and other clinical variables, such as age, chemotherapy status, clinical stage, and TNM classification, was evaluated via univariate and multivariate Cox regression analyses. The data revealed age, M classification, and the LLPS-based risk score as independent prognostic indicators in comparison with other clinical parameters (Figures 4A, B). A nomogram model integrating age, M classification, and the risk score was created to forecast the CRC population in the GSE39582 cohort (Figure 4C). The AUCs for the nomogram were 0.803, 0.783, and 0.751 for 1-, 3-, and 5-year OS, respectively (Figure 4D). The disparity in survival outcomes was marked across the high- and low-risk cohorts (P < 0.0001; Figure 4E). DCA indicated greater net clinical benefit in predicting 3- and 5-year OS than in predicting 1-year OS (Figure 4F). The calibration curves demonstrated a high level of agreement between the predicted and real survival probabilities (Figure 4G). The prediction precision of the nomogram was confirmed in the external COAD cohort, where the model effectively predicted OS (Figures 4H-K). Therefore, the LLPS-based nomogram possesses strong predictive power and clinical utility for the prognostic assessment of CRC patients.
Different patterns of somatic mutations were identified in high- and low-risk cohorts
The somatic mutations of 430 LLPS-related genes were analyzed across two risk cohorts within the COAD and READ cohorts. Mutation frequencies were comparable across cohorts, with APC, TP53, TTN, and KRAS emerging as the most frequently mutated genes (Figures 5A, B). In contrast, CNV frequencies demonstrated notable differences between the COAD and READ cohorts. In COAD, copy number amplifications were predominantly noted in CSE1L, DDX27, and PABC1L, whereas deletions were more frequent in TRIB3, SNRPB, and NOP56 (Figure 5C). In READ, copy number amplifications were more common in AURKA, CSE1L, and DDX27, whereas deletions were more prevalent in TGIF1, SMCHD1, and NDC80 (Figure 5D).
Differences in the TME were observed between the high- and low-risk cohorts
Marked disparities in the TME were noted between the two risk groups. Most immune checkpoint genes, with the exception of HHLA2, were markedly upregulated in the high-risk cohort, indicating possible variations in sensitivity to immune checkpoint inhibitors (ICIs) across risk strata (Figure 5E). A similar pattern was noted in the expression of HLA genes, with the majority being upregulated in the high-risk group (Figure 5F). Stemness analyses based on 26 gene sets were shownin Figure 5G. Analysis via the ESTIMATE algorithm revealed higher immune, stromal, and ESTIMATE scores, along with lower tumor purity, in the high-risk cohort (Figure 5H). Furthermore, TIDE score was markedly higher in the high-risk cohort than in the low-risk cohort (Figure 5I).
Significant differences via the CIBERSORT algorithm were found in the proportions of 22 immune cell types between tumor and control samples in the GSE39582 cohort (Figure 6A). Notably, M2 macrophages were substantially enriched in the high-risk cohort, whereas activated memory CD4+ T cells, resting NK cells, and activated dendritic cells were more abundant in the low-risk cohort (Figure 6B). Correlation analyses revealed strong relationships among immune cell populations; for example, a marked positive relationship between follicular helper T cells and M1 macrophages was detected (r = 0.449), and a notable negative relationship was detected between activated mast cells and resting mast cells (r = -0.661) (Figure 6C). Additionally, the expression of the five prognostic genes was strongly linked to the proportions of various immune cell types (Figure 6D). The high expression of inhibitory immune genes and the infiltration of immunosuppressive cells may contribute to a greater likelihood of immune escape, poor immune response, and ultimately unfavorable prognosis in the high-risk group. These findings highlight substantial TME heterogeneity between risk cohorts, which may lead to the development of more individualized and effective antitumour therapeutic strategies.
Different biological mechanisms drive tumor progression in high- and low-risk groups
To elucidate the underlying biological mechanisms distinguishing the high- and low-risk cohorts, DEGs were identified, and 230 upregulated and 262 downregulated genes were identified (Figure 7A). GO enrichment analysis revealed their involvement in mitotic cell cycle regulation, cell cycle control, extracellular matrix components, DNA replication, and DNA repair (Figure 7B). Functional annotation via the Metascape database further confirmed enrichment in similar processes, such as the mitotic cell cycle, modulation of the cell cycle, collagen-containing extracellular matrix, DNA metabolic process, and positive cell cycle progression regulation (Figure 7C). GSEA revealed that pathways such as ECM-receptor interaction, cell adhesion, and B-cell receptor signalling, as well as Th1, Th2 and Th17 cell differentiation, were predominantly enriched in the high-risk cohort. Conversely, the Fanconi anaemia pathway, mismatch repair, and oxidative phosphorylation were enriched in the low-risk population (Figure 7D). GSVA revealed further enrichment in the low-risk population of pathways such as meiotic cell cycle phase transition, mitochondrial RNA modification, DNA replication checkpoint signalling, DNA replication initiation, and cell cycle DNA replication. In contrast, pathways such as sequestration of extracellular ligands, negative regulation of the humoral immune response, inhibition of complement activation, cell junction disassembly, and neutrophil extravasation were also notably enriched in the high-risk population of pathways (Figure 7E). KEGG pathway analysis revealed the main links of the DEGs to DNA replication, cell cycle regulation, mismatch repair, the p53 signalling pathway, and base excision repair (Figure 7F). Comprehensive regulatory network analysis utilizing the NetworkAnalyst database identified several TFs and miRNAs as key regulators of the five prognostic genes. For example, HADH was closely linked to hsa-miR-124 and EGR1; CCDC34 with hsa-miR-381 and CEBPB; FBL with hsa-miR-144 and DDX5; AQP11 with hsa-miR-380 and POU2F1; and RAB15 with hsa-miR-98 (Figure 7G).
ScRNA-seq analysis revealed different distributions and expression patterns of the five prognostic genes
The scRNA-seq data from the GSE166555 dataset were analysed, with 13 distinct cell clusters identified and classified into nine cell types, including B cell, endothelial cell, epithelial cell, inflammatory cancer-associated fibroblast (iCAF)cell, mast cell, myofibroblastic cancer-associated fibroblast cell (mCAF), monocyte/macrophage, plasma cell, and T cell (Figures 8A, B). LLPS enrichment scores were markedly higher in epithelial cells than in the other eight cell types (Figures 8C, D), indicating that LLPS-related DEGs were predominantly expressed in epithelial cells. The expression and distribution of AQP11, CCDC34, FBL, HADH, and RAB15 were visualized across these immune cell types (Figures 8E, F). UMAP and dot plot analyses revealed widespread expression of FBL across all immune cell types except plasma cells. RAB15 and HADH were preferentially expressed in epithelial cells, and CCDC34 was expressed mainly in inflammatory cancer-associated fibroblasts and epithelial cells. Moreover, AQP11 and HADH were predominantly expressed in the low-CNV group, whereas CCDC34, FBL, and RAB15 were enriched in the high-CNV group, suggesting that the former were primarily associated with low-grade malignant cells, and the latter with highly malignant cells (Figure 8G).
Spatial transcriptome analysis revealed different spatial distributions and expression patterns of the five prognostic genes
Integrated spatial transcriptome sequencing analysis was conducted via the Sparkle platform to investigate the spatial distribution and expression patterns of five prognostic genes in CRC tissues. Hematoxylin and eosin staining of the pathological sections was illustrated in Supplementary Figure S1A. Ten cell types were identified, namely B cell, CD4 T cell, dendritic cell (DC), endothelial cell, epithelial cell, fibroblast, macrophage, neutrophil, plasma cell, and tumor cell (Supplementary Figure S1B). The distribution of the five prognostic genes across microregions was shown in Figure 9A. The tissue sections were further segmented into malignant and non-malignant regions (Supplementary Figure S1C), and AUC scores of 430 LLPS-related genes across all regions were calculated (Supplementary Figure S1D). The results revealed higher AUC values for these genes in malignant regions, suggesting their predominant distribution in tumor areas. Correlation analysis further confirmed a significant positive association between the AUC values of these genes and tumor cell content (Supplementary Figure S1E). Moreover, the expression levels of the five prognostic genes were markedly higher in malignant regions compared to non-malignant regions (Figure 9B). Correlation analysis demonstrated a strong positive relationship between the expression of these five prognostic genes and tumor cell abundance (Figure 9C). Similar results were observed in two additional tissue sections, as shown in Supplementary Figures S2–S5.
Small molecule compounds screened for CRC treatment
Among the 198 candidate small-molecule drugs, 162 presented marked disparities across high- and low-risk individuals. We screened 162 drugs on the basis of the correlation (r > 0.3; p < 0.05) of their IC50 values with the risk score, which yielded 59 candidates. Subsequently, 20 of these drugs, which were deemed potentially effective for CRC treatment, were selected for further analysis. A heatmap revealed strong relationships between the expression of the five prognostic genes and the IC50 values of the selected compounds in the CRC population (Figure 10A). The risk scores were negatively related to the IC50 values of AZD8055, BMS-754807, doramapimod, JQ1, NU7441, and SB216763, indicating reduced IC50 values in high-risk patients. Conversely, positive relationships were noted between the risk scores and the IC50 values of 14 other compounds, namely, bortezomib, carmustine, dihydrorotenone, gallibiscoquinazole, GSK1904529A, LGK974, ML323, OF-1, oxaliplatin, palbociclib, sinularin, sorafenib, ulixertinib, and VE821, suggesting reduced IC50 values in low-risk individuals (Figure 10B; Supplementary Figure S6). The chemical structures of the 20 compounds were obtained from ChemSpider and plotted in Supplementary Figure S7.
HADH expression-related enrichment pathways
Analysis of the GSE39582 dataset revealed notably elevated expression of CCDC34, FBL, and RAB15 in tumor tissues, whereas AQP11 and HADH were notably downregulated (Figure 11A). High expression levels of HADH, CCDC34, FBL, and RAB15 were linked to markedly improved OS compared with their respective low-expression cohorts. For AQP11, a similar trend toward better survival in the high-expression group was noted (Figure 11B).
Our comprehensive findings demonstrated that HADH expression was significantly lower in tumor tissues than in normal tissues. Spatial and single-cell analyses further revealed that within the TME, HADH was predominantly expressed in tumor epithelial cells with low copy number variation rather than in immune or stromal cells. Patients in the high-HADH expression subgroup presented a markedly more favourable prognosis than did those in the low-HADH expression subgroup. These results suggest that increasing HADH levels in tumor cells may suppress malignant proliferation and improve clinical outcomes in CRC, positioning HADH as a potential therapeutic target. On the basis of this integrative evidence, we selected HADH for further functional investigation.
The most relevant gene modules related to HADH expression were analysed via WGCNA. The optimal soft threshold power was set to 5 (Figures 11C, D). Twenty modules were generated via this soft threshold, and a clustering dendrogram of the modules was presented in Figure 11E. The correlations between HADH expression and gene modules were shown in Figure 11F. The results revealed that the gray 60 module presented the greatest significant positive correlation with HADH expression (451 genes, r = 047, p = 3e-12). A total of 451 genes in the gray 60 module were imported into the Metascape website for enrichment analysis. The data revealed that HADH was enriched mainly in the maintenance of the gastrointestinal epithelium, nucleoside diphosphate phosphatase activity, the intrinsic apoptotic signalling pathway in response to endoplasmic reticulum stress, the extracellular matrix, etc. (Figure 11G).
Experimental verification of HADH expression in CRC
RT-qPCR analysis of 15 paired tumor and adjacent healthy tissue samples confirmed the differential expression of five prognostic genes (Figure 12A). Subsequent validation revealed significantly lower HADH expression in HCT116, RKO, KM12, HT29, HCT8, and SW620 cells than in NCM460 cells (Figures 12B, C). IHC staining of tumor and adjacent tissues from 80 CRC patients revealed lower HADH protein expression in tumor tissues (Figures 12D, E). The diagnostic performance of HADH for CRC, as measured by the AUC, was 0.7477 (Figure 12F). High HADH expression was further linked to a lower histological grade and earlier clinical stage (Figures 12G, H). The clinical and pathological characteristics of the patients stratified by HADH expression were summarized in Figure 12I and Supplementary Table S4.
HADH overexpression inhibited the viability of CRC cells
HADH function was elucidated via overexpression experiments in the RKO and KM12 cell lines. HADH overexpression was confirmed via RT-qPCR and Western blotting (Figures 13A, B). CCK-8 assays revealed decreased cell viability following HADH overexpression (Figure 13C). Flow cytometry analysis indicated a marked increase in apoptosis (Figure 13D). Colony formation and EdU incorporation assays confirmed the reduced proliferative capacity of HADH-overexpressing cells (Figures 13E, F). Furthermore, wound healing and transwell assays revealed marked suppression of cell migration and invasion upon HADH overexpression (Figures 13G, H).
LLPS-related DEGs were identified in the CRC cohort
A total of 1,967 genes demonstrated significant differences in expression (1,087 with elevated expression and 880 with reduced expression) in tumor versus normal tissues within the GSE39582 dataset (Figure 2A). Among these DEGs, 430 DEGs related to LLPS were identified between tumor and normal tissues, representing distinct transcriptomic features (Figure 2B). The expression of these 430 LLPS-linked DEGs in the GSE39582 cohort was presented as a heatmap (Figure 2C). The complex interactions among the proteins encoded by the 430 DEGs were revealed via a protein–protein interaction (PPI) network (Figure 2D). The biological functions and regulatory mechanisms of LLPS-linked DEGs were elucidated via GO and KEGG enrichment analyses. The former revealed significant enrichment in ribonucleoprotein complex biogenesis, noncoding RNA processing, ribosome biogenesis, organelle fission, and ribosomal RNA metabolic processes (Figure 2E). The latter indicated enrichment in pathways related to the cell cycle, DNA replication, ribosome biogenesis in eukaryotes, progesterone-mediated oocyte maturation, and fatty acid metabolism (Figure 2F).
The LLPS related gene signature was constructed to predict prognosis
The prognostic model construction was conducted as follows: Starting from 430 DEGs, 79 genes were identified via univariate Cox regression. Subsequent LASSO regression narrowed the candidates to seven genes, and stepAIC optimization finalized a five-gene signature (Figures 3A–C). Multivariate Cox proportional hazards modeling generated coefficients for these genes, yielding the risk score formula: Risk score = (-0.3059) × aquaporin 11 (AQP11) + (-0.2237) × coiled-coil domain-containing protein 34 (CCDC34) + (-0.4115) × fibrinogen-like protein (FBL) + (-0.2149) × HADH + (-0.2564) × RAS-associated protein 15 (RAB15) (Figure 3D). Using the median risk score as the threshold, patients were stratified into high-risk (score ≥ median) and low-risk (score < median) groups. Notably, all coefficients were negative, indicating that higher expression of these five genes correlates with lower risk scores. This recalibration clarifies that elevated expression of these genes confers protective effects, consistent with low-risk group association. Kaplan‒Meier analysis revealed notably poorer OS in the high-risk cohort than in the low-risk cohort within the GSE39582 cohort (median OS: 102.0 months vs. 183.0 months, P < 0.0001; Figure 3E). Consistent findings were noted in three independent validation cohorts: the high-risk population displayed worse OS in the GSE17536 and COAD datasets (P = 0.026 and P = 0.0021), with a similar trend noted in the READ set (P = 0.069; Figures 3F-H). The risk score distributions, survival rates, and gene expression profiles across the CRC cohort were presented in Figures 3I-L. An increasing trend in mortality was noted with increasing risk scores. These findings confirm the robustness and prediction performance of the LLPS-associated gene signature across CRC datasets.
A nomogram model based on the risk score was established to predict patient prognosis
In the GSE39582 cohort, the prognostic significance of the risk score and other clinical variables, such as age, chemotherapy status, clinical stage, and TNM classification, was evaluated via univariate and multivariate Cox regression analyses. The data revealed age, M classification, and the LLPS-based risk score as independent prognostic indicators in comparison with other clinical parameters (Figures 4A, B). A nomogram model integrating age, M classification, and the risk score was created to forecast the CRC population in the GSE39582 cohort (Figure 4C). The AUCs for the nomogram were 0.803, 0.783, and 0.751 for 1-, 3-, and 5-year OS, respectively (Figure 4D). The disparity in survival outcomes was marked across the high- and low-risk cohorts (P < 0.0001; Figure 4E). DCA indicated greater net clinical benefit in predicting 3- and 5-year OS than in predicting 1-year OS (Figure 4F). The calibration curves demonstrated a high level of agreement between the predicted and real survival probabilities (Figure 4G). The prediction precision of the nomogram was confirmed in the external COAD cohort, where the model effectively predicted OS (Figures 4H-K). Therefore, the LLPS-based nomogram possesses strong predictive power and clinical utility for the prognostic assessment of CRC patients.
Different patterns of somatic mutations were identified in high- and low-risk cohorts
The somatic mutations of 430 LLPS-related genes were analyzed across two risk cohorts within the COAD and READ cohorts. Mutation frequencies were comparable across cohorts, with APC, TP53, TTN, and KRAS emerging as the most frequently mutated genes (Figures 5A, B). In contrast, CNV frequencies demonstrated notable differences between the COAD and READ cohorts. In COAD, copy number amplifications were predominantly noted in CSE1L, DDX27, and PABC1L, whereas deletions were more frequent in TRIB3, SNRPB, and NOP56 (Figure 5C). In READ, copy number amplifications were more common in AURKA, CSE1L, and DDX27, whereas deletions were more prevalent in TGIF1, SMCHD1, and NDC80 (Figure 5D).
Differences in the TME were observed between the high- and low-risk cohorts
Marked disparities in the TME were noted between the two risk groups. Most immune checkpoint genes, with the exception of HHLA2, were markedly upregulated in the high-risk cohort, indicating possible variations in sensitivity to immune checkpoint inhibitors (ICIs) across risk strata (Figure 5E). A similar pattern was noted in the expression of HLA genes, with the majority being upregulated in the high-risk group (Figure 5F). Stemness analyses based on 26 gene sets were shownin Figure 5G. Analysis via the ESTIMATE algorithm revealed higher immune, stromal, and ESTIMATE scores, along with lower tumor purity, in the high-risk cohort (Figure 5H). Furthermore, TIDE score was markedly higher in the high-risk cohort than in the low-risk cohort (Figure 5I).
Significant differences via the CIBERSORT algorithm were found in the proportions of 22 immune cell types between tumor and control samples in the GSE39582 cohort (Figure 6A). Notably, M2 macrophages were substantially enriched in the high-risk cohort, whereas activated memory CD4+ T cells, resting NK cells, and activated dendritic cells were more abundant in the low-risk cohort (Figure 6B). Correlation analyses revealed strong relationships among immune cell populations; for example, a marked positive relationship between follicular helper T cells and M1 macrophages was detected (r = 0.449), and a notable negative relationship was detected between activated mast cells and resting mast cells (r = -0.661) (Figure 6C). Additionally, the expression of the five prognostic genes was strongly linked to the proportions of various immune cell types (Figure 6D). The high expression of inhibitory immune genes and the infiltration of immunosuppressive cells may contribute to a greater likelihood of immune escape, poor immune response, and ultimately unfavorable prognosis in the high-risk group. These findings highlight substantial TME heterogeneity between risk cohorts, which may lead to the development of more individualized and effective antitumour therapeutic strategies.
Different biological mechanisms drive tumor progression in high- and low-risk groups
To elucidate the underlying biological mechanisms distinguishing the high- and low-risk cohorts, DEGs were identified, and 230 upregulated and 262 downregulated genes were identified (Figure 7A). GO enrichment analysis revealed their involvement in mitotic cell cycle regulation, cell cycle control, extracellular matrix components, DNA replication, and DNA repair (Figure 7B). Functional annotation via the Metascape database further confirmed enrichment in similar processes, such as the mitotic cell cycle, modulation of the cell cycle, collagen-containing extracellular matrix, DNA metabolic process, and positive cell cycle progression regulation (Figure 7C). GSEA revealed that pathways such as ECM-receptor interaction, cell adhesion, and B-cell receptor signalling, as well as Th1, Th2 and Th17 cell differentiation, were predominantly enriched in the high-risk cohort. Conversely, the Fanconi anaemia pathway, mismatch repair, and oxidative phosphorylation were enriched in the low-risk population (Figure 7D). GSVA revealed further enrichment in the low-risk population of pathways such as meiotic cell cycle phase transition, mitochondrial RNA modification, DNA replication checkpoint signalling, DNA replication initiation, and cell cycle DNA replication. In contrast, pathways such as sequestration of extracellular ligands, negative regulation of the humoral immune response, inhibition of complement activation, cell junction disassembly, and neutrophil extravasation were also notably enriched in the high-risk population of pathways (Figure 7E). KEGG pathway analysis revealed the main links of the DEGs to DNA replication, cell cycle regulation, mismatch repair, the p53 signalling pathway, and base excision repair (Figure 7F). Comprehensive regulatory network analysis utilizing the NetworkAnalyst database identified several TFs and miRNAs as key regulators of the five prognostic genes. For example, HADH was closely linked to hsa-miR-124 and EGR1; CCDC34 with hsa-miR-381 and CEBPB; FBL with hsa-miR-144 and DDX5; AQP11 with hsa-miR-380 and POU2F1; and RAB15 with hsa-miR-98 (Figure 7G).
ScRNA-seq analysis revealed different distributions and expression patterns of the five prognostic genes
The scRNA-seq data from the GSE166555 dataset were analysed, with 13 distinct cell clusters identified and classified into nine cell types, including B cell, endothelial cell, epithelial cell, inflammatory cancer-associated fibroblast (iCAF)cell, mast cell, myofibroblastic cancer-associated fibroblast cell (mCAF), monocyte/macrophage, plasma cell, and T cell (Figures 8A, B). LLPS enrichment scores were markedly higher in epithelial cells than in the other eight cell types (Figures 8C, D), indicating that LLPS-related DEGs were predominantly expressed in epithelial cells. The expression and distribution of AQP11, CCDC34, FBL, HADH, and RAB15 were visualized across these immune cell types (Figures 8E, F). UMAP and dot plot analyses revealed widespread expression of FBL across all immune cell types except plasma cells. RAB15 and HADH were preferentially expressed in epithelial cells, and CCDC34 was expressed mainly in inflammatory cancer-associated fibroblasts and epithelial cells. Moreover, AQP11 and HADH were predominantly expressed in the low-CNV group, whereas CCDC34, FBL, and RAB15 were enriched in the high-CNV group, suggesting that the former were primarily associated with low-grade malignant cells, and the latter with highly malignant cells (Figure 8G).
Spatial transcriptome analysis revealed different spatial distributions and expression patterns of the five prognostic genes
Integrated spatial transcriptome sequencing analysis was conducted via the Sparkle platform to investigate the spatial distribution and expression patterns of five prognostic genes in CRC tissues. Hematoxylin and eosin staining of the pathological sections was illustrated in Supplementary Figure S1A. Ten cell types were identified, namely B cell, CD4 T cell, dendritic cell (DC), endothelial cell, epithelial cell, fibroblast, macrophage, neutrophil, plasma cell, and tumor cell (Supplementary Figure S1B). The distribution of the five prognostic genes across microregions was shown in Figure 9A. The tissue sections were further segmented into malignant and non-malignant regions (Supplementary Figure S1C), and AUC scores of 430 LLPS-related genes across all regions were calculated (Supplementary Figure S1D). The results revealed higher AUC values for these genes in malignant regions, suggesting their predominant distribution in tumor areas. Correlation analysis further confirmed a significant positive association between the AUC values of these genes and tumor cell content (Supplementary Figure S1E). Moreover, the expression levels of the five prognostic genes were markedly higher in malignant regions compared to non-malignant regions (Figure 9B). Correlation analysis demonstrated a strong positive relationship between the expression of these five prognostic genes and tumor cell abundance (Figure 9C). Similar results were observed in two additional tissue sections, as shown in Supplementary Figures S2–S5.
Small molecule compounds screened for CRC treatment
Among the 198 candidate small-molecule drugs, 162 presented marked disparities across high- and low-risk individuals. We screened 162 drugs on the basis of the correlation (r > 0.3; p < 0.05) of their IC50 values with the risk score, which yielded 59 candidates. Subsequently, 20 of these drugs, which were deemed potentially effective for CRC treatment, were selected for further analysis. A heatmap revealed strong relationships between the expression of the five prognostic genes and the IC50 values of the selected compounds in the CRC population (Figure 10A). The risk scores were negatively related to the IC50 values of AZD8055, BMS-754807, doramapimod, JQ1, NU7441, and SB216763, indicating reduced IC50 values in high-risk patients. Conversely, positive relationships were noted between the risk scores and the IC50 values of 14 other compounds, namely, bortezomib, carmustine, dihydrorotenone, gallibiscoquinazole, GSK1904529A, LGK974, ML323, OF-1, oxaliplatin, palbociclib, sinularin, sorafenib, ulixertinib, and VE821, suggesting reduced IC50 values in low-risk individuals (Figure 10B; Supplementary Figure S6). The chemical structures of the 20 compounds were obtained from ChemSpider and plotted in Supplementary Figure S7.
HADH expression-related enrichment pathways
Analysis of the GSE39582 dataset revealed notably elevated expression of CCDC34, FBL, and RAB15 in tumor tissues, whereas AQP11 and HADH were notably downregulated (Figure 11A). High expression levels of HADH, CCDC34, FBL, and RAB15 were linked to markedly improved OS compared with their respective low-expression cohorts. For AQP11, a similar trend toward better survival in the high-expression group was noted (Figure 11B).
Our comprehensive findings demonstrated that HADH expression was significantly lower in tumor tissues than in normal tissues. Spatial and single-cell analyses further revealed that within the TME, HADH was predominantly expressed in tumor epithelial cells with low copy number variation rather than in immune or stromal cells. Patients in the high-HADH expression subgroup presented a markedly more favourable prognosis than did those in the low-HADH expression subgroup. These results suggest that increasing HADH levels in tumor cells may suppress malignant proliferation and improve clinical outcomes in CRC, positioning HADH as a potential therapeutic target. On the basis of this integrative evidence, we selected HADH for further functional investigation.
The most relevant gene modules related to HADH expression were analysed via WGCNA. The optimal soft threshold power was set to 5 (Figures 11C, D). Twenty modules were generated via this soft threshold, and a clustering dendrogram of the modules was presented in Figure 11E. The correlations between HADH expression and gene modules were shown in Figure 11F. The results revealed that the gray 60 module presented the greatest significant positive correlation with HADH expression (451 genes, r = 047, p = 3e-12). A total of 451 genes in the gray 60 module were imported into the Metascape website for enrichment analysis. The data revealed that HADH was enriched mainly in the maintenance of the gastrointestinal epithelium, nucleoside diphosphate phosphatase activity, the intrinsic apoptotic signalling pathway in response to endoplasmic reticulum stress, the extracellular matrix, etc. (Figure 11G).
Experimental verification of HADH expression in CRC
RT-qPCR analysis of 15 paired tumor and adjacent healthy tissue samples confirmed the differential expression of five prognostic genes (Figure 12A). Subsequent validation revealed significantly lower HADH expression in HCT116, RKO, KM12, HT29, HCT8, and SW620 cells than in NCM460 cells (Figures 12B, C). IHC staining of tumor and adjacent tissues from 80 CRC patients revealed lower HADH protein expression in tumor tissues (Figures 12D, E). The diagnostic performance of HADH for CRC, as measured by the AUC, was 0.7477 (Figure 12F). High HADH expression was further linked to a lower histological grade and earlier clinical stage (Figures 12G, H). The clinical and pathological characteristics of the patients stratified by HADH expression were summarized in Figure 12I and Supplementary Table S4.
HADH overexpression inhibited the viability of CRC cells
HADH function was elucidated via overexpression experiments in the RKO and KM12 cell lines. HADH overexpression was confirmed via RT-qPCR and Western blotting (Figures 13A, B). CCK-8 assays revealed decreased cell viability following HADH overexpression (Figure 13C). Flow cytometry analysis indicated a marked increase in apoptosis (Figure 13D). Colony formation and EdU incorporation assays confirmed the reduced proliferative capacity of HADH-overexpressing cells (Figures 13E, F). Furthermore, wound healing and transwell assays revealed marked suppression of cell migration and invasion upon HADH overexpression (Figures 13G, H).
Discussion
Discussion
Emerging evidence reveals that LLPS regulates biological processes through biomolecular condensates, while its dysregulation contributes to pathologies such as oncogenic signaling activation, genomic instability, and TME remodeling (36). The identification of LLPS-linked biomarkers may provide insights for tumor subtyping and prognostic assessment. Our study investigated how LLPS influences CRC. On the basis of the expression profiles of the 430 LLPS-associated DEGs, five prognostic genes were identified via univariate Cox regression, LASSO regression and stepAIC algorithms, culminating in the construction of a nomogram model. The derived risk scores exhibited strong associations with drug sensitivity, clinical outcomes, tumor stemness, the TME, clinicopathological features, and genomic alterations.
Our results revealed that the 430 LLPS-related DEGs were enriched primarily in pathways related to the cell cycle, DNA replication, and DNA damage repair. Dysregulated LLPS can repair DNA damage in tumor cells induced by radiotherapy and chemotherapy, leading to treatment failure. For example, under physiological conditions, SUMOylated RNF168 is inefficiently recruited to DSB sites, thereby impairing H2A ubiquitination and 53BP1 accumulation within nuclear condensates, ultimately reducing DNA repair efficiency. In CRC, the overexpression of SENP1 decreases RNF168 SUMOylation, enhances DNA repair capacity, and correlates with poor prognosis (37). Notably, the LLPS-based prognostic model revealed a markedly worse overall prognosis in high-risk individuals than in low-risk individuals. Compounds such as JQ-1 and BMS-754807 may serve as promising therapeutic candidates for high-risk individuals, whereas agents such as gefitinib, sorafenib, and oxaliplatin may exhibit greater efficacy in low-risk individuals. These findings prove the importance of LLPS-related genes in CRC progression and present promising avenues for personalized treatments.
Tumor immune evasion and resistance to ICIs remain major obstacles in cancer immunotherapy. Recent studies suggest that LLPS modulates immune cell infiltration within the TME. For example, LLPS-mediated disruption of T-cell receptor signaling by upregulated LAG3 inhibits T-cell proliferation and differentiation, thereby promoting an immunosuppressive milieu (38). IFN-γ-triggered phase separation of the KAT8-IRF1 complex enhances PD-L1 expression and facilitates immune evasion (39). In the present analysis, notable disparities in immune checkpoint expression, HLA levels, immune cell infiltration, and genomic mutation profiles were noted between risk cohorts. Additionally, higher immune and stromal scores, alongside lower tumor purity, were noted in the high-risk group, indicating the presence of more extensive immune cells in the TME. Notably, the high-risk group exhibited elevated TIDE scores, suggesting potentially poorer responses to immunotherapy, which may partially account for their unfavorable prognosis. Overall, LLPS can regulate immune dynamics and is a possible predictive biomarker for the immunotherapeutic response.
Our prognostic model, comprising five genes (HADH, AQP11, CCDC34, FBL, and RAB15), demonstrated robust predictive ability for patient prognosis, the immune response, and mutational landscapes. AQP11 AQP11 encodes a water-channel membrane protein that is downregulated in CRC and facilitates tumor cell proliferation, migration, invasion, and adhesion (40). CCDC34, a coiled-coil domain-containing protein, is upregulated in CRC and linked to reduced apoptosis and increased metastasis, possibly via the modulation of Bcl-2, survivin, E-cadherin, N-cadherin, and MMP-9 (41). FBL is a highly conserved nucleolar methyltransferase that promotes nucleolar assembly and pre-rRNA synthesis via phase separation, contributing to ribosome biogenesis and cellular proliferation (42). Long non-coding RNA (lncRNA) binds to multiple FBL molecules and partially blocks the glycine- and arginine-rich (GAR) domain, thereby inhibiting excessive solid phase condensation caused by GAR-GAR interactions. This mechanism maintains the liquid phase fluidity of FBL and consequently sustains cancer cell growth (43). RAB15 is a small GTPase belonging to the RAS superfamily that is predominantly localized to membrane-bound organelles where it participates in vesicle-mediated intracellular transport. It is critical for modulating cell division, migration, and signal transduction in tumor cells, and modulates immune receptor trafficking as well as the secretion of chemokines and cytokines within the immune system (44).
Metabolic reprogramming is a recognized hallmark of cancer progression. HADH, a member of the 3-hydroxyacyl-CoA dehydrogenase family, encodes a mitochondrial enzyme essential for fatty acid β-oxidation (45). HADH expression is reduced in gastric cancer, where it inhibits PTEN expression and promotes AKT phosphorylation, thereby activating the Akt signaling pathway and facilitating gastric cancer cell migration and invasion (46). Similar tumor-suppressive roles have been reported in renal, esophageal, hepatic, and ampullary adenocarcinomas. In CRC, reduced HADH expression is associated with advanced disease stages and poor prognosis (47–49). These effects possibly arise from HADH deficiency-induced disruption of fatty acid metabolism and subsequent lipid accumulation, leading to metabolic reprogramming and uncontrolled tumor cell proliferation (50). Our study, on the other hand, supported the idea that HADH was significantly reduced in CRC tissues and was more highly expressed in early tumor tissues. Spatial transcriptomic analysis confirmed the predominant expression of HADH in tumor cells. Therefore, HADH possibly influences CRC progression, although the underlying mechanisms remain to be fully elucidated. HADH overexpression experiments revealed that HADH inhibited cell proliferation, migration, and invasion while facilitating apoptosis. Hypermethylation of the HADH gene has been identified as one mechanism underlying its downregulation in CRC, where it modulates cancer cell proliferation through the Wnt/β-catenin and Wnt/ROR2/DVL2 signaling pathways (46, 51). HADH overexpression may suppress tumor cell proliferation by inhibiting NRF2-mediated glutathione synthesis and inducing oxidative damage in cancer cells (52). Therefore, HADH can serve not only as a diagnostic biomarker for CRC but is also closely associated with its prognosis. Future studies involving a series of experiments in vitro and in vivo are warranted to further explore the mechanisms by which HADH regulates CRC progression.
According to infomation from the DrLLPS database, AQP11, CCDC34, HADH, RAB15, and FBL are all classified as client proteins, with AQP11 involved in nucleolus formation, CCDC34 in centrosome/spindle pole body formation, HADH in both nucleolus and postsynaptic density formation, RAB15 in postsynaptic density formation, and FBL in the formation of Cajal bodies, nucleoli, P-bodies, and stress granules. LLPS has previously been shown to explain tumor heterogeneity in gliomas and melanomas. Previous studies have developed numerous prognostic models for CRC based on a variety of molecular signatures, including immune infiltration, tumor metabolism, stemness, non-coding RNAs, epigenetic regulators, and cell death pathways. While each model offers unique insights into tumor biology and demonstrates robust predictive capacity, our LLPS-related 5-gene model emerges as a novel predictor. It demonstrates prognostic accuracy that is comparable, if not superior, to these established signatures. Uniquely, our model links prognosis to the functional integrity of organelles involved in phase separation. By capturing critical features of tumor aggressiveness and undergoing extensive validation, the LLPS-related model exhibits significant clinical utility. Furthermore, its correlation with multiple biological processes—such as ribosome biogenesis, metabolism (specifically HADH), and immune modulation—may underpin its superior performance. Moving forward, the integration of molecular signatures with clinicopathological factors appears to be crucial for achieving effective risk stratification. This provides a novel, mechanism-driven perspective for understanding the heterogeneity and malignant progression of CRC, offering new insights for individualized precision therapy in CRC.
Nonetheless, there are several limitations. The prognostic model was developed primarily utilizing open datasets, representing a retrospective analysis based on existing data. The mechanistic contribution of HADH to CRC pathogenesis remains incompletely defined. Furthermore, while functional validation was performed for HADH, the other four prognostic genes were not subjected to further experimental exploration. Future studies involving prospective clinical cohorts and mechanistic investigations at the molecular level for all five signature genes are essential to further validate and expand upon these findings.
Emerging evidence reveals that LLPS regulates biological processes through biomolecular condensates, while its dysregulation contributes to pathologies such as oncogenic signaling activation, genomic instability, and TME remodeling (36). The identification of LLPS-linked biomarkers may provide insights for tumor subtyping and prognostic assessment. Our study investigated how LLPS influences CRC. On the basis of the expression profiles of the 430 LLPS-associated DEGs, five prognostic genes were identified via univariate Cox regression, LASSO regression and stepAIC algorithms, culminating in the construction of a nomogram model. The derived risk scores exhibited strong associations with drug sensitivity, clinical outcomes, tumor stemness, the TME, clinicopathological features, and genomic alterations.
Our results revealed that the 430 LLPS-related DEGs were enriched primarily in pathways related to the cell cycle, DNA replication, and DNA damage repair. Dysregulated LLPS can repair DNA damage in tumor cells induced by radiotherapy and chemotherapy, leading to treatment failure. For example, under physiological conditions, SUMOylated RNF168 is inefficiently recruited to DSB sites, thereby impairing H2A ubiquitination and 53BP1 accumulation within nuclear condensates, ultimately reducing DNA repair efficiency. In CRC, the overexpression of SENP1 decreases RNF168 SUMOylation, enhances DNA repair capacity, and correlates with poor prognosis (37). Notably, the LLPS-based prognostic model revealed a markedly worse overall prognosis in high-risk individuals than in low-risk individuals. Compounds such as JQ-1 and BMS-754807 may serve as promising therapeutic candidates for high-risk individuals, whereas agents such as gefitinib, sorafenib, and oxaliplatin may exhibit greater efficacy in low-risk individuals. These findings prove the importance of LLPS-related genes in CRC progression and present promising avenues for personalized treatments.
Tumor immune evasion and resistance to ICIs remain major obstacles in cancer immunotherapy. Recent studies suggest that LLPS modulates immune cell infiltration within the TME. For example, LLPS-mediated disruption of T-cell receptor signaling by upregulated LAG3 inhibits T-cell proliferation and differentiation, thereby promoting an immunosuppressive milieu (38). IFN-γ-triggered phase separation of the KAT8-IRF1 complex enhances PD-L1 expression and facilitates immune evasion (39). In the present analysis, notable disparities in immune checkpoint expression, HLA levels, immune cell infiltration, and genomic mutation profiles were noted between risk cohorts. Additionally, higher immune and stromal scores, alongside lower tumor purity, were noted in the high-risk group, indicating the presence of more extensive immune cells in the TME. Notably, the high-risk group exhibited elevated TIDE scores, suggesting potentially poorer responses to immunotherapy, which may partially account for their unfavorable prognosis. Overall, LLPS can regulate immune dynamics and is a possible predictive biomarker for the immunotherapeutic response.
Our prognostic model, comprising five genes (HADH, AQP11, CCDC34, FBL, and RAB15), demonstrated robust predictive ability for patient prognosis, the immune response, and mutational landscapes. AQP11 AQP11 encodes a water-channel membrane protein that is downregulated in CRC and facilitates tumor cell proliferation, migration, invasion, and adhesion (40). CCDC34, a coiled-coil domain-containing protein, is upregulated in CRC and linked to reduced apoptosis and increased metastasis, possibly via the modulation of Bcl-2, survivin, E-cadherin, N-cadherin, and MMP-9 (41). FBL is a highly conserved nucleolar methyltransferase that promotes nucleolar assembly and pre-rRNA synthesis via phase separation, contributing to ribosome biogenesis and cellular proliferation (42). Long non-coding RNA (lncRNA) binds to multiple FBL molecules and partially blocks the glycine- and arginine-rich (GAR) domain, thereby inhibiting excessive solid phase condensation caused by GAR-GAR interactions. This mechanism maintains the liquid phase fluidity of FBL and consequently sustains cancer cell growth (43). RAB15 is a small GTPase belonging to the RAS superfamily that is predominantly localized to membrane-bound organelles where it participates in vesicle-mediated intracellular transport. It is critical for modulating cell division, migration, and signal transduction in tumor cells, and modulates immune receptor trafficking as well as the secretion of chemokines and cytokines within the immune system (44).
Metabolic reprogramming is a recognized hallmark of cancer progression. HADH, a member of the 3-hydroxyacyl-CoA dehydrogenase family, encodes a mitochondrial enzyme essential for fatty acid β-oxidation (45). HADH expression is reduced in gastric cancer, where it inhibits PTEN expression and promotes AKT phosphorylation, thereby activating the Akt signaling pathway and facilitating gastric cancer cell migration and invasion (46). Similar tumor-suppressive roles have been reported in renal, esophageal, hepatic, and ampullary adenocarcinomas. In CRC, reduced HADH expression is associated with advanced disease stages and poor prognosis (47–49). These effects possibly arise from HADH deficiency-induced disruption of fatty acid metabolism and subsequent lipid accumulation, leading to metabolic reprogramming and uncontrolled tumor cell proliferation (50). Our study, on the other hand, supported the idea that HADH was significantly reduced in CRC tissues and was more highly expressed in early tumor tissues. Spatial transcriptomic analysis confirmed the predominant expression of HADH in tumor cells. Therefore, HADH possibly influences CRC progression, although the underlying mechanisms remain to be fully elucidated. HADH overexpression experiments revealed that HADH inhibited cell proliferation, migration, and invasion while facilitating apoptosis. Hypermethylation of the HADH gene has been identified as one mechanism underlying its downregulation in CRC, where it modulates cancer cell proliferation through the Wnt/β-catenin and Wnt/ROR2/DVL2 signaling pathways (46, 51). HADH overexpression may suppress tumor cell proliferation by inhibiting NRF2-mediated glutathione synthesis and inducing oxidative damage in cancer cells (52). Therefore, HADH can serve not only as a diagnostic biomarker for CRC but is also closely associated with its prognosis. Future studies involving a series of experiments in vitro and in vivo are warranted to further explore the mechanisms by which HADH regulates CRC progression.
According to infomation from the DrLLPS database, AQP11, CCDC34, HADH, RAB15, and FBL are all classified as client proteins, with AQP11 involved in nucleolus formation, CCDC34 in centrosome/spindle pole body formation, HADH in both nucleolus and postsynaptic density formation, RAB15 in postsynaptic density formation, and FBL in the formation of Cajal bodies, nucleoli, P-bodies, and stress granules. LLPS has previously been shown to explain tumor heterogeneity in gliomas and melanomas. Previous studies have developed numerous prognostic models for CRC based on a variety of molecular signatures, including immune infiltration, tumor metabolism, stemness, non-coding RNAs, epigenetic regulators, and cell death pathways. While each model offers unique insights into tumor biology and demonstrates robust predictive capacity, our LLPS-related 5-gene model emerges as a novel predictor. It demonstrates prognostic accuracy that is comparable, if not superior, to these established signatures. Uniquely, our model links prognosis to the functional integrity of organelles involved in phase separation. By capturing critical features of tumor aggressiveness and undergoing extensive validation, the LLPS-related model exhibits significant clinical utility. Furthermore, its correlation with multiple biological processes—such as ribosome biogenesis, metabolism (specifically HADH), and immune modulation—may underpin its superior performance. Moving forward, the integration of molecular signatures with clinicopathological factors appears to be crucial for achieving effective risk stratification. This provides a novel, mechanism-driven perspective for understanding the heterogeneity and malignant progression of CRC, offering new insights for individualized precision therapy in CRC.
Nonetheless, there are several limitations. The prognostic model was developed primarily utilizing open datasets, representing a retrospective analysis based on existing data. The mechanistic contribution of HADH to CRC pathogenesis remains incompletely defined. Furthermore, while functional validation was performed for HADH, the other four prognostic genes were not subjected to further experimental exploration. Future studies involving prospective clinical cohorts and mechanistic investigations at the molecular level for all five signature genes are essential to further validate and expand upon these findings.
Conclusion
Conclusion
In conclusion, an LLPS-based prognostic model was established to stratify CRC patients into two molecular subtypes with distinct clinical outcomes: genomic alterations, immune profiles, and therapeutic sensitivities. This model facilitates a deeper understanding of CRC heterogeneity and may serve as a valuable tool for guiding individualized therapeutic strategies.
In conclusion, an LLPS-based prognostic model was established to stratify CRC patients into two molecular subtypes with distinct clinical outcomes: genomic alterations, immune profiles, and therapeutic sensitivities. This model facilitates a deeper understanding of CRC heterogeneity and may serve as a valuable tool for guiding individualized therapeutic strategies.
출처: PubMed Central (JATS). 라이선스는 원 publisher 정책을 따릅니다 — 인용 시 원문을 표기해 주세요.
🏷️ 같은 키워드 · 무료전문 — 이 논문 MeSH/keyword 기반
- A Phase I Study of Hydroxychloroquine and Suba-Itraconazole in Men with Biochemical Relapse of Prostate Cancer (HITMAN-PC): Dose Escalation Results.
- Self-management of male urinary symptoms: qualitative findings from a primary care trial.
- Clinical and Liquid Biomarkers of 20-Year Prostate Cancer Risk in Men Aged 45 to 70 Years.
- Diagnostic accuracy of Ga-PSMA PET/CT versus multiparametric MRI for preoperative pelvic invasion in the patients with prostate cancer.
- Comprehensive analysis of androgen receptor splice variant target gene expression in prostate cancer.
- Clinical Presentation and Outcomes of Patients Undergoing Surgery for Thyroid Cancer.