본문으로 건너뛰기
← 뒤로

Multiomics analysis and prognostic modeling reveal the molecular features and potential therapeutic targets of breast cancer brain metastasis.

1/5 보강
Clinical and experimental medicine 📖 저널 OA 96.3% 2022: 0/1 OA 2023: 2/3 OA 2024: 7/7 OA 2025: 83/83 OA 2026: 62/65 OA 2022~2026 2025 Vol.26(1) p. 99
Retraction 확인
출처

He D, Huang G, Jia X, Yu Y, Ge X, Chu L

📝 환자 설명용 한 줄

[UNLABELLED] Breast cancer brain metastasis (BCBM) is a major cause of poor prognosis in breast cancer, driven by complex molecular mechanisms.

이 논문을 인용하기

↓ .bib ↓ .ris
APA He D, Huang G, et al. (2025). Multiomics analysis and prognostic modeling reveal the molecular features and potential therapeutic targets of breast cancer brain metastasis.. Clinical and experimental medicine, 26(1), 99. https://doi.org/10.1007/s10238-025-02003-4
MLA He D, et al.. "Multiomics analysis and prognostic modeling reveal the molecular features and potential therapeutic targets of breast cancer brain metastasis.." Clinical and experimental medicine, vol. 26, no. 1, 2025, pp. 99.
PMID 41474469 ↗

Abstract

[UNLABELLED] Breast cancer brain metastasis (BCBM) is a major cause of poor prognosis in breast cancer, driven by complex molecular mechanisms. Innovative diagnostic and therapeutic strategies are urgently needed. We integrated single-cell RNA sequencing, multiomics profiling from the TCGA database, and machine learning to explore the molecular features of BCBM. Cell composition, gene expression, and subtypes were characterized. Prognostic models were developed, and potential drug targets were computationally identified through analysis of differentially expressed genes and molecular interactions. Experimental validation of these targets was performed using orthotopic implantation of MDA-MB-231-Luc cells in nude mice. scRNA-seq revealed 10 cell types and 1,479 differentially expressed genes, highlighting significant differences between the primary brain cancer and BCBM. Multiomics clustering defined two distinct subtypes (CS1 and CS2) with differential prognosis. A CoxBoost + RSF model identified hub genes (BTG2, PSMB8, SRGN, HLA-DPB1) and demonstrated high predictive accuracy for 3-, 5-, and 10-year survival (AUCs: 0.813, 0.788, and 0.776, respectively). Drug sensitivity analysis highlighted five candidate agents, with molecular docking confirming strong binding affinity to targeted proteins. In vivo experiments confirmed that PSMB8 and HLA-DPB1 promoted brain metastasis, while BTG2 and SRGN suppressed it. High-risk patients exhibited elevated monocyte proportions, which were involved in intercellular interactions. This study delineates the molecular landscape of BCBM, establishes robust prognostic models, and identifies promising therapeutic targets, offering a framework for precision diagnosis and individualized treatment.

[SUPPLEMENTARY INFORMATION] The online version contains supplementary material available at 10.1007/s10238-025-02003-4.

🏷️ 키워드 / MeSH 📖 같은 키워드 OA만

같은 제1저자의 인용 많은 논문 (5)

📖 전문 본문 읽기 PMC JATS · ~62 KB · 영문

Introduction

Introduction
Breast cancer (BC) is a prevalent malignancy in women worldwide. Owing to progress in the development of early detection and therapeutic strategies, the survival rates of most BC patients have improved substantially, but the prognosis remains poor for patients with metastatic disease and specific BC subtypes [1–3]. Primary BC can disseminate to the brain via the bloodstream or lymphatic system, giving rise to secondary brain tumors, which are associated with a particularly poor prognosis. The median survival time after the diagnosis of brain metastasis is often only a few months [4, 5]. The process of BC brain metastasis (BCBM) is highly complex and involves a series of molecular events. Initially, primary tumor cells must acquire invasive and migratory capacities to penetrate blood vessel walls and enter the bloodstream. These cells must subsequently acquire the ability to survive in the bloodstream and evade immune surveillance [6, 7]. Finally, circulating tumor cells (CTCs) must traverse the blood-brain barrier (BBB) and form new tumor colonies. This complex process is affected by interactions among multiple factors, such as changes in gene expression, activation of signaling pathways, and adaptation to the microenvironment [8].
The diagnosis and treatment of BCBM require individualized decision-making on the basis of a comprehensive evaluation of imaging findings, molecular characteristics, and clinical status [9, 10]. For diagnosis, contrast-enhanced magnetic resonance imaging (MRI) is the preferred modality for detecting brain metastases[11]. Computed tomography (CT) or positron emission tomography–computed tomography (PET-CT) may also be used complementarily to assess lesion characteristics [12, 13]. Analysis of cerebrospinal fluid CTCs [14] or cell-free DNA (ctDNA) [15] can aid in molecular subtyping. However, the sensitive detection of early micrometastasis remains a significant challenge. Systemic therapy for BCBM relies heavily on targeted drugs, such as HER2 inhibitors [16] and CDK4/6 inhibitors [17]. However, the efficacy of these treatments is often hindered by their limited ability to penetrate the BBB and the heterogeneity of tumors. Emerging strategies focus on enhancing BBB penetration (e.g., nanocarriers [18] and focused ultrasound [19]), implementing precision medicine guided by multiomics classification, and modulation of the immune microenvironment. Therefore, it is crucial to integrate multiomics biomarkers with single-cell sequencing data to develop models for early detection and identify potential therapeutic targets, thereby facilitating a paradigm shift from “monotherapy” to “precision stratification” in the diagnosis and treatment of BCBM.
Multiomics analysis has emerged as a powerful tool in high-throughput sequencing and bioinformatics, offering a comprehensive view of tumor biology and treatment responses by integrating genomics, transcriptomics, proteomics, and metabolomics data [20]. Moreover, proteomics and metabolomics analyses have revealed that during the cascade of organ-specific metastasis, tumor cells adapt to distinct microenvironments and activate corresponding molecular pathways [21]. By analyzing the cellular community structure and gene expression patterns in both normal tissues and brain metastasis samples, researchers can elucidate how tumor cells interact with their surroundings and identify potential therapeutic targets.
The aim of this study was to comprehensively investigate the molecular features and potential biological significance of BCBM through multiomics-based data analysis. We conducted differential expression analysis, pathway enrichment analysis, copy number variation (CNV) analysis, tumor mutation burden (TMB) analysis, DNA methylation profiling, and miRNA differential expression analysis using data from the TCGA-BRCA cohort to identify genes and regulatory networks highly associated with brain metastasis. Furthermore, by combining Mendelian randomization (MR) analysis and colocalization analysis, we aimed to identify clinically relevant candidate genes and construct a multiomics classification model to evaluate prognostic differences among different BC subtypes. Finally, experimental validation was performed to provide valuable insights into the biology of BCBM.

Methods

Methods

Data acquisition and preprocessing
We downloaded relevant data from The Cancer Genome Atlas (TCGA) Breast Cancer (TCGA-BRCA) cohort, which included mRNA, lncRNA, and miRNA expression data, DNA methylation data, and somatic mutation data. These datasets were retrieved using the “TCGAbiolinks” R package. The somatic mutation data were processed using the “maftools” R package. The mRNA, lncRNA, and miRNA expression data were log2-transformed for further analysis. These datasets were merged to form a multiomics dataset comprising transcriptomic, epigenomic, and mutation data from 651 samples. DNA methylation data were obtained from the UCSC Xena database. Additionally, transcriptomic datasets GSE20685 (327 BC samples) and GSE58812 (107 BC samples) were downloaded from the Gene Expression Omnibus (GEO) database and batch-corrected using the “sva” R package to create a meta-dataset. Single-cell RNA sequencing (scRNA-seq) data were obtained from GEO datasets GSE186344 (three primary BC samples) and GSE248288 (four BCBM samples), and spatial transcriptomic data were obtained from GSE214571 (four BC samples). PCA analysis confirmed successful batch correction between GSE20685 and GSE58812 datasets (Figure S1).

Multiomics subtype identification
To identify the top 1,500 most variable gene features, we used the “MOVICS” R package. We subsequently identified prognostic genes with p < 0.05 in each data dimension using clinical data. Finally, we selected the top 5% most frequently mutated genes in the binary mutation data. This process yielded 283 mRNAs, 116 lncRNAs, 115 miRNAs, 14 mutation features, and 182 DNA methylation sites. Clustering analysis was performed with the “getMOIC” function, and consensus across multiple algorithms was integrated using the “getConsensusMOIC” function to improve clustering robustness. Differentially expressed genes (DEGs) across subtypes were identified using the “runDEA” function. Gene set enrichment analysis (GSEA) was conducted based on the “c5.bp.v7.1.symbols.xls” collection. Gene set variation analysis (GSVA) was performed on a custom gene set file (“gene sets of interest.gmt”) from the MOVICS package.

MR analysis
A two-sample MR framework was employed to investigate potential causal relationships between brain metastasis-related gene expression and BC risk. Genes that exhibited BCBM-associated differential expression were selected and their cis-acting expression quantitative trait loci (cis-eQTLs) were used as exposure instruments. Gene symbols were transformed into ENSEMBL IDs using the clusterProfiler package and the org.Hs.eg.db database. eQTL data for these genes were retrieved from the IEU GWAS database, and single nucleotide polymorphisms (SNPs)  significantly associated with gene expression (p < 5 × 10⁻⁸) were selected as instrumental variables (IVs). Linkage disequilibrium (LD) clumping (r² < 0.001 within a 10,000 kb window) was performed to removed correlated IVs. The strength of the IVs was assessed using F statistics, and SNPs with F > 10 were retained to minimize weak instrument bias. GWAS summary statistics for BC were obtained sourced from the IEU database (ukb-b-16890). Inverse variance-weighted (IVW) regression and the Wald ratio method were applied to estimate causal effects. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated to quantify the associations. Sensitivity analyses included heterogeneity assessment, pleiotropy evaluation, and leave-one-out analysis. The “locuscomparer” package was used to plot regional associations assess shared causal variants between eQTLs and BC GWAS signals, and calculate the proportion of variance explained (PVE) for significant genes. The “TwoSampleMR” R package was used for executing MR workflows, “ggplot2” was used for visualization, and “coloc” was used for colocalization tests. A significance threshold of p < 0.05 was applied for primary MR analyses.

Immune landscape exploration
The activity of cancer-related signatures was quantified using the single-sample GSEA (ssGSEA) algorithm implemented in the “gsva” R package. ssGSEA enables the evaluation of gene set enrichment across different samples and biological conditions. This algorithm was applied to compute immune function scores and immune cell infiltration profiles in all samples. Additionally, immune phenotype scores, levels of immune regulators, and the abundance of effector and inhibitory cells were also calculated. A random forest algorithm was used to predict responses to anti-CTLA-4 and anti-PD-1 immunotherapies based on these immune features. We evaluated differences in the expression of common immune checkpoint molecules between groups, estimated immune and stromal scores using the “ESTIMATE” R package, and assessed immune cell infiltration using CIBERSORT. Tumor-infiltrating lymphocyte DNA methylation (MeTIL) scores were calculated following published protocols. TMB and immunotherapy response scores‒such as the Tumor Immune Dysfunction and Exclusion (TIDE) and The Cancer Imaging Archive (TCIA) scores‒were obtained from relevant databases. The “IOBR” R package was applied to analyze differences in immune cell types in the tumor microenvironment (TME), immune suppression signatures, immune exclusion signatures, and immunotherapy biomarkers between the high- and low-risk groups.

Prognostic model construction based on multiple machine learning approaches
The prognostic model was constructed via the integration of variable selection and prognostic modeling, combining traditional statistical models with machine learning algorithms to efficiently analyze high-dimensional biomarker data. The data were preprocessed and normalized; genes with low expression and invalid samples were removed to ensure consistency between the training and validation sets. More than one hundred combinations of algorithms combinations, including stepwise Cox regression, elastic net (Enet), random survival forests (RSF), SuperPC, CoxBoost, and GBM, were applied for feature selection and model construction. Hyperparameters were optimized via cross-validation. Model performance was evaluated primarily using the concordance index (C-index), which was calculated for both the training and validation cohorts. The model with the highest average C-index was selected as the optimal model, and the selected identified prognostic genes were further assessed using univariate Cox regression analysis (p < 0.05).

Drug sensitivity and molecular docking analysis
Drug sensitivity analysis was performed with the “oncoPredict” R package together with data from the GDSC2 database. The significance threshold was set to p < 0.00001. Molecular docking analysis was performed by first identifying small molecules targeting prognostic genes through the DSigDB database. The top two candidate compounds‒mepacrine and etacrynic acid‒were selected for further analysis. Three-dimensional (3D) structures of these compounds were retrieved from the PubChem database, and the 3D structures of the corresponding protein targets were obtained from the Protein Data Bank (PDB). Molecular docking was then conducted using the CB-Dock2 web server.

ScRNA-seq analysis
The raw scRNA-seq data were normalized using the LogNormalize method in the “Seurat” R package to minimize technical noise. Batch effects were independently corrected. High variable genes (HVGs) were identified via the FindVariableFeatures() function. Data scaling was performed with the ScaleData() function. Principal component analysis (PCA) was performed for dimensionality reduction, typically retaining the top 20 principal components (PCs) that captured the majority of biological variation. Cells were clustered based on similarity in the PCA space using the FindNeighbors() and FindClusters() functions, and the resulting clusters were visualized in two dimensions via t-SNE or UMAP. Cell types were annotated using the SingleR package with HumanPrimaryCellAtlasData as the reference. Cell‒cell communication networks were inferred and visualized using the “CellChat” R package. Key ligand‒receptor interactions were identified. Cell differentiation trajectories were reconstructed with the “monocle2” package, and a directed acyclic graph (DAG) was constructed to reveal cell state transitions. Transcription factor (TF) activity in each cell was quantified using the DoRothEA tool, which calculates TF activity scores.

Spatial transcriptomics analysis
The analysis of spatial transcriptomics (ST) data was performed with the “Seurat” R package. ST data from four distinct samples (GSM6612124, GSM6833486, GSM6833485, and GSM6833484) were loaded with the `Read10X()` function, and the corresponding high-resolution tissue images were loaded with the `Read10X_Image()` function. The ST data were integrated into a Seurat object via the `CreateSeuratObject()` function. The four Seurat objects were then merged into a single Seurat object (`stRNA`), followed by quality control steps, including the removal of mitochondrial and ribosomal genes, as well as genes expressed in fewer than 10 spots. Data normalization, dimensionality reduction, and clustering were performed using the `SCTransform()` function and standard Seurat workflows.

Experimental methods

Cell lines and stable overexpression
Human triple-negative BC (TNBC) MDA-MB-231 cells expressing luciferase (MDA-MB-231-Luc) were transduced with lentiviral vectors encoding BTG2, PSMB8, HLA-DPB1, or SRGN; cells transduced with an empty vector served as the negative control (NC). The lentiviruses were produced in HEK293T cells with standard packaging plasmids. MDA-MB-231-Luc cells were transduced at a multiplicity of infection (MOI) of 10 in the presence of 8 µg/mL polybrene. Stable transformants were selected by culturing cells with 2 µg/mL puromycin for one week. Gene overexpression was confirmed by quantitative real-time PCR (qRT-PCR).

Orthotopic implantation and animal care
Female BALB/c nude mice (3 weeks old, n = 15) were housed under specific pathogen-free (SPF) conditions, with ad libitum access to food and water. All experimental protocols were approved by the Institutional Animal Care and Use Committee (IACUC, Approval No. 2502449). Mice were randomly allocated into five groups (n = 3 per group): negative control (NC), BTG2-OE, PSMB8-OE, HLA-DPB1-OE, and SRGN-OE. Orthotopic tumors were established by injecting 1 × 10⁶ cells in 50 µL of Matrigel-PBS (1:1) into the right fourth mammary fat pad under isoflurane anesthesia. Tumor size was monitored weekly using calipers.

Bioluminescence imaging and end-point analysis
Prior to imaging, mice received an intraperitoneal injection of D-luciferin (150 mg/kg). The head region was defined as the region of interest (ROI) to quantify cerebral bioluminescence flux (photons per second). On Day 50 post-implantation, mice were euthanized, and brains were harvested for ex vivo imaging. Brain tissues were fixed in 4% paraformaldehyde (PFA),  embedded in paraffin, and sectioned at 5 µm thickness. Tissue sections were stained with hematoxylin and eosin (H&E). Brain metastatic foci (≥ 50 μm in diameter) were counted in three nonoverlapping fields per section by an observer blinded to the experimental groups.

Statistical analysis
Between-group differences in omics-related analyses were evaluated using the Wilcoxon rank-sum test. Spearman correlation coefficients were calculated to assess correlations between variables. The MOVICS software package was used for multiomics clustering. In vivo experiments data are presented as the means ± standard errors of the means (SEMs). Differences between groups were analyzed by one-way analysis of variance (ANOVA), followed by Tukey’s post hoc test for multiple comparisons. A p value < 0.05 was considered statistically significant.

Results

Results

Multiomics profiling of BCBM
Figure 1 presents a schematic diagram of the methodological framework used in this study. This workflow illustrates the integration of multiomics analysis, predictive modeling, and experimental validation. Our analysis began with scRNA-seq. Initially, we conducted rigorous quality control for cells from multiple samples by evaluating the number of detected genes per cell (nFeature_RNA), the number of unique molecular identifiers (UMIs) per cell (nCount_RNA), and the percentage of transcripts mapped to mitochondrial genes (percent.mt) (Fig. 2A). To mitigate potential batch effects that could confound the interpretation of cellular heterogeneity, we used the Harmony algorithm for correction. The batch-corrected data are shown in Fig. 2B, where cells are projected into a two-dimensional space based on integrated gene expression profiles, with colors denoting the sample origin. T-distributed stochastic neighbor embedding (t-SNE) was subsequently used for nonlinear dimensionality reduction to visualize distinct cell clusters in two dimensions. Figure 2C illustrates the ten major cell types identified in the dataset‒macrophages, tissue stem cells, epithelial cells, endothelial cells, fibroblasts, T cells, neurons, monocytes, B cells, and mesenchymal stem cells‒with distinct colors in the t-SNE plot corresponding to each type as indicated in the accompanying legend. The proportional abundance of each cell types in BC and BCBM samples is presented in Fig. 2D. Notably, the proportions of multiple cell types varied significantly between the two groups, as confirmed by differential abundance analysis. In total, 1,479 differentially expressed genes (DEGs) were identified (Table S2). In addition, MR analysis was used to identify DEGs with putative causal relationships with BC risk.
We then integrated multiomics data (mRNA, lncRNA, miRNA, DNA methylation, and mutation data) from the TCGA-BRCA cohort to perform clustering analysis, aiming to identify BC subtypes on the basis of brain metastasis-related DEGs. The clustering heatmap (Fig. 3A) revealed distinct gene expression patterns for two identified subtypes (CS1 and CS2), highlighting their unique molecular features. The clustering performance was rigorously evaluated via the clustering prediction index and Gap statistic, both of which indicated robust clustering results (Fig. 3B). Cluster silhouette analysis (Fig. 3C) further validated the subtype assignments, with average silhouette widths of 0.44 for CS1 and 0.41 for CS2. Consensus clustering analysis across ten algorithms demonstrated high concordance, supporting the stability of the identified subtypes (Fig. 3D). The consensus matrix heatmap (Fig. 3E) visually confirmed the consistency of the clustering results. Kaplan‒Meier survival analysis revealed significantly different overall survival probabilities between CS1 and CS2 subtypes (log-rank p = 0.001), indicating distinct clinical outcomes (Fig. 3F).
Figure 4 delineates the molecular landscapes of the two identified subtypes (CS1 and CS2). Results from gene set variation analysis (GSVA) are presented in Figures 4A and 4B. Fig. 4A reveals distinct enrichment patterns: subtype CS1 exhibited significant enrichment in cell cycle and DNA replication-related processes, whereas subtype CS2 demonstrated predominant enrichment in immune response pathways.  Figure 4B further details these immune disparities, highlighting enrichment of the B cell receptor pathway in CS1 and the T cell regulation pathway in CS2. These findings were corroborated by GSEA (Fig. 4C), which confirmed the association of CS1 with cell cycle regulation and DNA repair, and of CS2 with immune response and protein transport processes. Figure 4D displays the activity profiles of various transcription factors (TFs) (upper panel) and key chromatin remodeling regulators (lower panel) across the two subtypes. The distinct TF regulatory networks and subtype-associated chromatin remodelers revealed in each panel collectively suggest divergent transcriptional and epigenetic control mechanisms between CS1 and CS2. Figure 4E summarizes the immune landscape within the TCGA-BRCA cohort, showing annotations of immune cells, DNA methylation levels, and stromal components. The upper panel highlights the expression profiles of canonical immune checkpoint genes, while the lower panel illustrates the abundance of TME-related immune cells. This comprehensive view is further complemented by annotations of tumor-infiltrating lymphocytes and DNA methylation enrichment scores. Figure 4F shows that these immune characteristics were confirmed in the META-BRCA cohort using nearest template prediction (NTP). Figure 4G and 4H show the concordance between consensus subtypes (CSs) and NTP or PAM (partitioning around medoids) clustering in the TCGA-BRCA cohort, respectively. Figure 4I demonstrates the correspondence between NTP and PAM results within the META-BRCA cohort, further confirming the reproducibility of the molecular classification.

Construction and validation of the prognostic model
To elucidate the genetic underpinnings of BCBM prognosis, we intersected DEGs identified between BCBM and primary BC with those DEGs between the two multiomics subtypes, resulting in 112 overlapping genes (Figure S2). Using these intersecting genes, we constructed prognostic models to predict patient survival; the results are summarized in Fig. 5. An integrated computational framework was used to generate 99 machine learning model combinations, which were evaluated using the C-index in both the TCGA-BRCA (training) and META (validation) cohorts. The models were ranked based on average validation performance. The CoxBoost + RSF model demonstrated the best performance and was selected as the optimal prognostic model (Fig. 5A). Hub genes were identified with the CoxBoost algorithm (Fig. 5B) and further validated through by univariate Cox regression analysis (Fig. 5C). This analysis confirmed the significant prognostic value of four genes, including BTG2, PSMB8, SRGN, and HLA-DPB1, all of which showed notable prognostic significance. Kaplan-Meier survival analysis based on expression levels of these hub genes revealed significant differences in overall survival between the high- and low-expression groups in both the TCGA-BRCA (log-rank p < 0.0001, Fig. 5E) and META (log-rank p < 0.0001, Fig. 5D) cohorts. These results suggest that the identified hub genes can stratify BCBM patients into distinct prognostic risk groups and predict their survival outcomes.  Detailed KM survival analyses for each gene in TCGA and META cohorts are shown in Figure S3.
Furthermore, we performed a comprehensive analysis of TME features, such as immune cell types, immune suppression signatures, immune exclusion signatures, and immunotherapy biomarkers, within the BCBM cohort. The results revealed significant differences between the high- and low-risk groups as defined by the prognostic model (Fig. 6A and 6D). Specifically, the high-risk group exhibited elevated levels of immune suppression and immune exclusion signatures, along with increased expression of biomarkers associated with immunotherapy resistance. In contrast, the low-risk group demonstrated decreased immune suppression signatures and a more favorable immunotherapy-related biomarker profile, suggesting better potential for therapeutic response. In addition, correlation analysis was performed to explore the associations between the prognostic genes and immune cell infiltration abundance (Fig. 6E). PSMB8 expression showed strong positive correlations with the infiltration of cytotoxic immune cells, particularly CD8+ T cells, activated memory CD4+ T cells, activated NK cells, and M1 macrophages, implicating its involvement in enhancing anti-tumor effector responses. Conversely, BTG2 expression was positively associated with naïve B cells, monocytes, and resting dendritic cells, yet inversely correlated with activated dendritic cells, M0 macrophages, and activated memory CD4+ T cells. This pattern indicates a potential role for BTG2 in regulating immune activation thresholds. SRGN expression was positively linked to CD8+ T cells, M1 macrophages, and activated memory CD4+ T cells, supporting its contribution to pro-inflammatory, effector immune functions within the TME. HLA-DPB1 exhibited a complex association profile: positive correlations with memory B cells, resting dendritic cells, regulatory T cells (Tregs), and activated NK and CD8+ T cells, alongside negative correlations with M0 macrophages and resting NK cells. These associations highlight its dual potential in antigen presentation and fine-tuning of immune regulation in the BCBM microenvironment. Predicted immunotherapy responses also significantly differed between the two groups across all the four different prediction scenarios (Figs. 6F-I).
We also comprehensively evaluated copy number variants (CNVs), DNA methylation, and expression differences for PSMB8, BTG2, SRGN, and HLA-DPB1 across multiple cancer types. Pan-cancer genomic and epigenetic profiles of the four genes are summarized in Figure S4. Figure S4A shows the percentages of these genes with CNV events, categorized by heterozygous/homozygous amplification and deletion. The frequencies varied among cancer types, indicating heterogeneous genomic alterations. Figure S4B shows log2 (fold change) in methylation levels between tumor and adjacent normal tissues, with significant differences (FDR ≤ 0.05) highlighted. Figure S4C presents the expression heatmap of these genes with FDR and log2 (fold change) annotations. Figure S4D summarizes functional roles of the four genes in activating or inhibiting canonical pathways across cancers. Together, these results support a complex genetic and epigenetic regulatory network involving these genes in BCBM and possibly in other tumor contexts.

Identification of independent prognostic factors and drugs with therapeutic potential
The prognostic value of diverse clinical features was assessed via univariate Cox regression analysis. Our results revealed that several clinical parameters, including age, M stage, and N stage, were significantly associated with patient prognosis (Fig. 7A and 7B). We subsequently constructed a comprehensive clinical nomogram model that integrates prognostic genes identified through multivariate Cox regression analysis (Fig. 7C). This nomogram model incorporates both independent prognostic factors and gene expression signatures to provide a holistic assessment of the prognostic risk for BCBM patients. The calibration curve of the nomogram model demonstrated excellent concordance between predicted and observed survival (Fig. 7D), highlighting the model’s accuracy and clinical utility. Additionally, we assessed the nomogram model’s ability to predict survival rates at 3, 5, and 10 years. The model exhibited robust predictive capabilities across different time points, as demonstrated by the receiver operating characteristic (ROC) area under the curve (AUC) values of 0.813 for 3-year survival, 0.788 for 5-year survival, and 0.776 for 10-year survival (Fig. 7E).
Following IC50 prediction for 1,265 compounds in the GDSC2 database using the oncoPredict package, we applied an initial filter based on a >2-fold decrease in IC50 (i.e., |ΔIC50| > 2) and significantly lower sensitivity (FDR < 0.001) in the low-risk group, yielding 37 candidate drugs. Among these, effect sizes (Cohen’s d) between high- and low-risk groups were calculated. Elephantin, mitoxantrone, PRIMA-1MET, sabutoclax, and topotecan exhibited Cohen’s d values > 1.5, the highest among all candidates (Fig. 8A–E), indicating the most pronounced inter-group IC50 differences and thus were prioritized for subsequent molecular docking analyses. In the high-risk group, these agents had significantly higher IC50 values, indicating greater sensitivity in the low-risk group. Subsequently, molecular docking analyses were conducted to examine binding affinities between mepacrine and etacrynic acid and their respective protein targets derived from the prognostic genes. Figures 8F-I present the molecular docking results for two small-molecule compounds (mepacrine and etacrynic acid) in complex with these proteins. The binding affinities and interaction patterns of the compounds toward their protein targets were evaluated. Mepacrine showed robust binding affinity for its target protein, featuring key interactions such as hydrogen bonds and hydrophobic interactions. Etacrynic acid also had favorable binding characteristics, highlighting its potential for use as a therapeutic agent. Figure S5 shows the enrichment analysis results based on drug‒gene associations. Figure S5A presents the drug-gene association analysis results in the form of a bubble plot, in which bubble size indicates the significance of the association (p.adjust). Figure S5B shows the associations between these two drugs and their target genes.

Single-cell and spatial transcriptomic insights
The expression patterns of the prognostic genes were systematically analyzed using scRNA-seq data. Gene signature scores were calculated for individual cells via the AUCell algorithm and visualized on a t-NSE plot (Fig. 9A); showing distinct distributions between the high- and low-expression groups. Figure 9B shows the expression patterns of the four prognostic genes across all cells stratified by sample group (BC vs. BCBM). GSEA revealed significantly enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (Fig. 9C), including endocytosis and leukocyte transendothelial migration. Analysis of the intercellular communication network (Fig. 9D and 9E) and specific interactions involving monocytes (Fig. 9F and 9G) revealed significant signaling pathways, highlighting monocytes as key communication hubs. Finally, pseudotime analysis ordered cells along a trajectory (Fig. 9H and 9I), providing insights into the temporal dynamics of cell states and potential interactions. Monocytes were predominantly located in the mid-to-late pseudotime stages. In addition, the spatial expression patterns of prognostic genes across different tissue sections and clusters were explored using spatially resolved transcriptomics (ST) data from primary BC samples (Figure S6).

Differential impact of gene overexpression on brain metastasis
Our results demonstrate that the overexpression of specific prognostic genes significantly alters the brain metastatic potential of TNBC cells. Specifically, PSMB8 and HLA-DPB1 enhanced metastatic progression to the brain, whereas BTG2 and SRGN suppressed it. In vivo bioluminescence imaging on Day 0 post-implantation confirmed comparable orthotopic signals across all five groups (NC, BTG2-OE, PSMB8-OE, HLA-DPB1-OE, and SRGN-OE) (Fig. 10A), indicating similar initial tumor loads. On Day 50 (Fig. 10B), mice implanted with PSMB8-OE or HLA-DPB1-OE cells exhibited significantly stronger cranial bioluminescence signals compared to the NC group, while those in the BTG2-OE and SRGN-OE groups exhibited minimal fluorescence. Ex vivo imaging of harvested brains (Fig. 10C) corroborated these observation, revealing prominent metastatic foci in the PSMB8-OE and HLA-DPB1-OE groups, but minimal or absent signal in the BTG2-OE and SRGN-OE groups. H&E staining of brain sections (Fig. 10D) revealed numerous metastatic nodules in the brains of mice in the PSMB8-OE and HLA-DPB1-OE groups, an intermediate level in the NC group, and sparse or absent nodules in the BTG2-OE and SRGN-OE groups. Quantitative analysis of metastatic nodules (Fig. 10E) confirmed these results: PSMB8 or HLA-DPB1 overexpression significantly increased brain metastatic burden (p < 0.001), whereas BTG2 or SRGN overexpression significantly suppressed it.

Discussion

Discussion
This study comprehensively elucidated the molecular characteristics and potential biological significance of BCBM through integrated multiomics analysis. Our findings not only revealed the cellular heterogeneity and molecular subtypes of BCBM but also established a prognostic model on the basis of multiomics data and identified potential therapeutic targets. scRNA-seq analysis revealed diverse cellular landscapes in BCBM, including epithelial cells, endothelial cells, tissue stem cells, fibroblasts, T cells, neurons, and macrophages. Notably, the relative abundance of these cell types differed significantly between primary BC and BCBM samples, indicating that substantial remodeling of the TME occurs during brain metastasis. Moreover, MR analysis identified DEGs with a putative causal relationships to BC risk, offering new insights into the genetic drivers of metastatic progression. Integrative multiomics clustering analysis of multiomics data revealed two molecular subtypes (CS1 and CS2) with distinct gene expression patterns, biological process enrichment, and pathway enrichment profiles. CS1 was characterized by enrichment in the cell cycle and DNA repair associated pathways, whereas CS2 showed enriched in immune response-related processes. These findings are consistent with known BC biology. For example, in BC, the histone ubiquitination pathway influences tumor progression and metastasis through key regulatory proteins and epigenetic modification. Notably, prior studies have reported that nucleolar transcription factor 1 (UBTF) activates L3MBTL2, which subsequently suppresses Nischarin (NISCH) expression via H2AK119 monoubiquitination (H2AK119ub) [22]. Additionally, dysregulated spindle assembly checkpoint, which contributes to CDK4/6 inhibitor resistance, suggesting that targeting regulators such as TTK could be a therapeutic strategy to overcome resistance [23]. Survival analysis confirmed divergent clinical outcomes, with the CS1 subtype exhibiting significantly worse overall survival than the CS2 subtype. These findings indicate that multiomics subtyping is a promising strategy for determining the prognosis of BCBM. Similar subtyping approaches have proven effective in lung cancer brain metastasis, suggesting potential translational impact across tumor types.
Among the machine learning models that were evaluated, the integrated CoxBoost + RSF model demonstrated the most robust prognostic performance. Univariate Cox regression analysis identified four hub genes‒BTG2, PSMB8, SRGN, and HLA-DPB1‒with significant prognostic relevance. Prior studies have demonstrated the roles of these genes in BC and metastasis [24–28], findings further substantiated here. BTG2 suppresses BCBM progression by reducing tumor invasion and metastasis, potentially through interactions with the HER2 pathway and modulation of the endocrine therapy response, thereby reducing brain metastasis risk and correlating with improved prognosis [24]. PSMB8 plays a significant role in immune evasion, and its elevated expression may facilitate brain metastasis [25]. HLA-DPB1 is associated with antitumor immunity in BC, and it promotes immune cell infiltration, making it a promising immunotherapeutic marker [26]. SRGN modulates tumor-associated macrophages (TAMs) and influences immune function within the TME [27]. The expression levels of these key genes stratified patients into high- and low-risk cohorts, revealing notable differences in survival probability. Moreover, integrating the prognostic gene signature with clinical variables into a nomogram model substantially enhanced the accuracy and reliability of prognostic predictions. This comprehensive strategy not only provides a robust framework for outcome prediction but also holds valuable implications for guiding the clinical management of metastatic BC, particularly BCBM.
The tumor immune microenvironment plays a pivotal role in BCBM progression and therapy response. Our study revealed substantial differences in immune profiles between the high- and low-risk groups, as defined by the prognostic model. Specifically, the high-risk group exhibited features of immune suppression and exclusion, whereas the low-risk group displayed an immunologically active profile, suggesting greater potential to benefit from immunotherapy. Correlation analysis revealed that the four hub genes were positively associated with distinct immune cell subsets within the BCBM microenvironment (Fig. 6E). PSMB8 showed broad positive correlations with diverse immune infiltrates, including activated NK cells, CD8+ T cells, activated memory CD4+ T cells, M1 macrophages, and activated dendritic cells. These associations suggest that PSMB8 may contribute to enhanced immune activation and cytotoxic activity, particularly in TNBC, consistent with previous findings [28]. HLA-DPB1 showed positive correlations with memory B cells, resting dendritic cells, regulatory T cells, and activated NK and CD8+ T cells, consistent with its role in antigen presentation and immune modulation within the TME of BCBM [26]. BTG2 was positively associated with naïve B cells, monocytes, and resting dendritic cells, indicating a connection with baseline immune functions. SRGN also correlated with M1 macrophages and cytotoxic T cell subsets, supporting its involvement in pro-inflammatory immune activity [27].
We further comprehensively analyzed CNVs, methylation patterns, and expression profiles of prognostic genes to elucidate the complex genetic and epigenetic alterations of the prognostic genes (PSMB8, BTG2, SRGN, and HLA-DPB1) across a spectrum of cancers. Our findings revealed that these hub genes are involved in critical biological pathways‒including cell cycle, DNA damage repair, and immune response‒thereby positioning them as promising therapeutic targets. To translate this finding into potential therapeutic strategies, we performed drug-gene association enrichment analysis and molecular docking experiments. Drug sensitivity screening revealed significant differences in the half-maximal inhibitory concentration (IC50) of several agents between the prognostic high- and low-risk groups. Notably, compounds such elephantin, mitoxantrone, PRIMA-1MET, sabutoclax, and topotecan, showed more favorable response profiles in the low-risk group, highlighting their risk-specific therapeutic applicability and validating the potential of targeting these genes. Molecular docking analysis highlighted the strong binding potential of two candidate compounds, mepacrine and etacrynic acid, to their respective targets (BTG2, SRGN, HLA-DPB1, and PSMB8), underscoring their potential therapeutic promise for BCBM.  To advance these findings toward clinical application, key pharmacological limitations must be addressed. For instance, the efficacy of agents like mitoxantrone in treating brain metastases is likely constrained by its poor BBB permeability. To overcome this, combination strategies with BBB modulators or the use of advanced delivery systems (e.g., nanocarriers, liposomes) could be essential for achieving therapeutic concentrations in the brain [29]. Similarly, sabutoclax‒a BCL-2 family inhibitor‒has demonstrated efficacy against chemotherapy-resistant and stem-like breast cancer cells by reactivating apoptosis and inhibiting the IL-6/STAT3 pathway [30]. Sabutoclax inhibits antiapoptotic proteins (e.g., BCL-2, MCL-1, and BCL-XL), thereby reactivating apoptosis, and inhibits the IL-6/STAT3 pathway, contributing to reductions in CSC numbers. Given its notable cytotoxicity toward drug-resistant BC cells and synergistic effects with chemotherapeutic agents, these mechanisms are especially relevant to BCBM treatment. Although the direct efficacy of sabutoclax in BCBM models requires further study, its potential to target cancer stem cells (CSCs) and its utility in combination regimens position it as a compelling for future BCBM therapeutic exploration [31].
Our in vivo functional validation confirmed the critical, opposing roles of the identified hub genes in brain metastasis: overexpression of PSMB8 and HLA-DPB1 significantly promoted brain metastasis, whereas BTG2 and SRGN exerted suppressive effects. These results strongly support their biological relevance and highlight their potential as therapeutic targets or modifiers in BCBM. Despite these promising findings, several limitations of the current study must be acknowledged to guide future research. First, the in vivo experiments, while revealing clear trends, were conducted with a limited number of animals (n = 3 per group), limits the statistical power and generalizability of these results. To enhance statistical robustness and generalizability, validation in larger preclinical cohorts and patient-derived xenograft (PDX) models is warranted. Second, while we performed key in vivo functional validation experiments on the selected genes, the study relied primarily on bioinformatics analyses; further mechanistic experimental validation is warranted. Although bioluminescence imaging and histology provided strong evidence about the effects on metastatic potential, further in vitro and in vivo studies are needed to delineate the downstream signaling pathways and interaction networks involved. For example, studies should investigate the downstream signaling pathways that are regulated by PSMB8 and HLA-DPB1 to promote metastasis and the mechanisms underlying the suppressive effects of BTG2 and SRGN. Such validation would strengthen the translational potential of our findings. Third, the complexity of BCBM pathogenesis necessitates a more comprehensive approach to understanding the underlying biological mechanisms. Our multiomics analysis of scRNA-seq and transcriptomic data provided foundational insights. To obtain a more holistic understanding of the metastatic process, the integration of additional omics methods, such as metabolomics and proteomics profiles would be invaluable for uncovering critical post-translational modifications and metabolic adaptations within the brain tumor microenvironment. Integrating these omics data with our current findings could lead to the discovery of more robust biomarkers and therapeutic targets.

Conclusions

Conclusions
This study systematically elucidated the molecular landscape and potential biological mechanisms of BCBM by integrating scRNA-seq, multiomics analysis, and machine learning approaches. We identified significant alterations in the cellular composition and gene expression profiles between BCBM and primary BC tissues. Multiomics clustering analysis identified two distinct molecular subtypes (CS1 and CS2) with divergent prognostic and immune profiles. A robust prognostic nomogram model was developed based on four hub genes (BTG2, PSMB8, SRGN, and HLA-DPB1), demonstrating excellent performance in predicting patient survival (AUCs of 0.813, 0.788, and 0.776 for 3-, 5-, and 10-year survival, respectively). Functional validation in vivo confirmed that PSMB8 and HLA-DPB1 promoted brain metastasis, whereas BTG2 and SRGN suppressed it. Drug sensitivity analysis and molecular docking analyses further identified several candidate therapeutic agents with selective efficacy in the low-risk patient group, supporting the feasibility of risk-stratified therapeutic approaches. By contrast, the high-risk group may require alternative therapeutic strategies, such as immune-based interventions, in light of their distinct immune exclusion profiles. Furthermore, monocytes demonstrated increased communication activity within the high-risk group served as key signal transmitters and receivers in cell-cell communication networks, suggesting their central role in shaping the BCBM microenvironment. Collectively, these findings provide a comprehensive molecular framework for BCBM, offering novel insights for its precise diagnosis, risk stratification, and the development of individualized therapeutic strategies. Future validation in larger clinical cohorts and deeper mechanistic studies are warranted to translate these biomarkers and therapeutic candidates into clinical practice.

Supplementary Information

Supplementary Information
Below is the link to the electronic supplementary material.

출처: PubMed Central (JATS). 라이선스는 원 publisher 정책을 따릅니다 — 인용 시 원문을 표기해 주세요.

🏷️ 같은 키워드 · 무료전문 — 이 논문 MeSH/keyword 기반

🟢 PMC 전문 열기