
Novel fatty acid metabolism-related molecular subtyping and prognostic signature for breast cancer.

Translational cancer research, 2026, Vol. 15(1), p. 29

Du J, Liu X, Jin Z, Jiang Q, Li Y, Liu C, Wang B, Liu Y


Cite this article

APA Du J, Liu X, et al. (2026). Novel fatty acid metabolism-related molecular subtyping and prognostic signature for breast cancer. Translational cancer research, 15(1), 29. https://doi.org/10.21037/tcr-2025-1424
MLA Du J, et al. "Novel fatty acid metabolism-related molecular subtyping and prognostic signature for breast cancer." Translational cancer research, vol. 15, no. 1, 2026, p. 29.
PMID 41674979

Abstract

[BACKGROUND] Breast cancer (BRCA) is one of the most prevalent malignant tumors in women worldwide, characterized by significant heterogeneity. Fatty acid metabolism (FAM) plays a crucial biological role in the initiation and progression of cancer. This study aims to identify novel, effective biomarkers related to FAM for improved risk stratification and treatment selection in BRCA patients.

[METHODS] Gene expression data from 1,217 BRCA patients were obtained from The Cancer Genome Atlas (TCGA) database. A comprehensive machine learning approach, incorporating ten different methods, was used to develop a FAM-related gene prognostic model (FAMGM). The Kaplan-Meier method and correlation analysis were employed to assess differences in overall survival (OS) and immune characteristics between high- and low-risk groups. External validation was performed using independent datasets. Single-cell RNA sequencing (scRNA-seq) data from 26 BRCA patients were analyzed, and the potential functions and mechanisms of the model genes were investigated using single-sample gene set enrichment analysis (ssGSEA), CellChat, and other algorithms. Finally, spatial transcriptomics (ST) analysis was conducted to examine the expression of model genes in the malignant regions of tumors.

[RESULTS] The FAMGM, developed using CoxBoost and random survival forest (RSF) methods, was identified as the optimal prognostic model. FAMGM demonstrated stable and robust performance in predicting clinical outcomes for BRCA. The high-risk group showed poor survival prognosis, typically associated with advanced clinical stages, reduced immune cell infiltration, and increased tumor mutational burden (TMB). Model genes were predominantly enriched in macrophages and appeared to influence tumor progression through the upregulation of multiple signaling pathways. Additionally, these model genes exhibited higher expression in malignant tumor regions.

[CONCLUSIONS] FAMGM holds significant potential as a prognostic marker and could be used in the subsequent diagnosis, treatment, prognostic prediction, and mechanistic research of BRCA.


Introduction

Breast cancer (BRCA) ranks among the most prevalent malignancies affecting women globally. In recent years, BRCA has become the second most common cancer worldwide, following lung cancer, with an annual incidence of approximately 2.31 million new cases, representing 11.6% of all new cancer cases, and this number continues to rise (1). Based on hormone receptor status and molecular characteristics, BRCA can be categorized into four subtypes: luminal A, luminal B, human epidermal growth factor receptor 2 (HER2) overexpressing, and triple-negative BRCA (TNBC) (basal-like) (1,2). Despite advancements in treatment strategies, BRCA remains a significant health challenge. The marked heterogeneity of this disease leads to wide variations in subtypes and clinical characteristics among patients, with diverse manifestations in appearance, clinical behavior, and genetics, complicating treatment responses and prognostic outcomes (3). Consequently, there is an urgent need for novel biomarkers to enhance BRCA detection, guide innovative therapeutic strategies, and monitor prognostic outcomes (4). Studies have indicated that cancer cells frequently modify metabolic processes during their proliferation and spread, with alterations in fatty acid metabolism (FAM) being a hallmark of tumor cells (5,6). Compared to normal cells, BRCA cells typically exhibit increased fatty acid uptake, de novo lipogenesis, and fatty acid oxidation (7). Furthermore, the breast tumor microenvironment is often enriched with adipocytes capable of producing and secreting fatty acids, which in turn can enhance the aggressiveness of cancer cells and impact disease progression (8). The strong association between fatty acids and BRCA underscores the potential for targeting FAM-related genes as a novel therapeutic strategy for BRCA.
Fatty acids in the human body are essential lipids with crucial biological functions. They not only provide and store energy but also serve as signaling molecules and key components of cell membranes (9). FAM encompasses multiple pathways, including fatty acid transport, lipolysis, fatty acid oxidation, storage in lipid droplets as triglycerides and cholesterol esters, as well as de novo synthesis and integration (10). Unlike normal cells, cancer cells, with their rapid proliferation rates, often require increased fatty acid synthesis. Consequently, FAM must be reprogrammed to meet the demands of cancer cells. This reprogramming is vital for the survival and proliferation of cancer cells and plays a critical role in the differentiation and migration of tumor-associated immune cells (11). Researchers have identified specific genes involved in FAM that significantly impact the development and progression of BRCA. These genes are linked to various molecular subtypes of BRCA and may serve as promising therapeutic targets (12-15). In recent years, deep exploration of clinical data and gene sequencing information from public databases has led to the identification of numerous biomarkers related to the prognosis of BRCA. Although these biomarkers show great potential, their clinical utility needs further validation and assessment in real-world cohorts. Therefore, more scientific and standardized methods are required to continually identify more accurate and sensitive prognostic features for BRCA.
In this study, we first identified 16 prognosis-associated differentially expressed genes (DEGs) related to FAM within The Cancer Genome Atlas (TCGA)-BRCA cohort. Using an unsupervised clustering algorithm, we then classified the cohort into two molecular subtypes and compared their differences in immune infiltration, survival outcomes, and clinical characteristics. Subsequently, we developed an advanced FAM-related gene prognostic model (FAMGM) using comprehensive machine learning algorithms to evaluate the prognosis, immune features, mutation landscape, and immunotherapy response in BRCA patients. Additionally, single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics (ST) were utilized to map the cellular landscape of BRCA, identifying the tumor macrophage population most enriched for FAMGM. Using single-sample gene set enrichment analysis (ssGSEA), we quantified FAMGM and found significant activation of pathways such as MAPK, p53, glycolysis, and FAM in macrophages with high FAMGM enrichment. Furthermore, interactions between the high FAMGM group and other cell types were observed to be more frequent and robust. ST analysis indicated that the FAMGM model genes were highly expressed to varying degrees in malignant tumor regions. These findings suggest that FAMGM is closely linked to the malignant progression of tumors. Overall, this study highlights the significant correlation between FAMGM and various clinical and molecular features of BRCA. The establishment of FAMGM may aid in optimizing personalized treatment strategies and open new avenues for developing targeted therapies against BRCA. We present this article in accordance with the TRIPOD reporting checklist (available at https://tcr.amegroups.com/article/view/10.21037/tcr-2025-1424/rc).

Methods


Data collection
The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. Initially, we utilized the GSEA database to identify gene sets related to FAM within the Molecular Signatures Database (MSigDB) v7.1 (http://software.broadinstitute.org/gsea/msigdb/index.jsp) (16). A total of 8 gene sets were selected: FATTY_ACID_OXIDATION, GOBP_FATTY_ACID_CATABOLIC_PROCESS, GOBP_FATTY_ACID_TRANSPORT, HALLMARK_FATTY_ACID_METABOLISM, KEGG_FATTY_ACID_METABOLISM, REACTOME_FATTY_ACID_METABOLISM, REACTOME_TRANSPORT_OF_FATTY_ACIDS, and WP_FATTY_ACID_BIOSYNTHESIS. After duplicate genes were removed, 450 FAM-related genes were incorporated into subsequent analyses. Bulk RNA sequencing (RNA-seq) data and clinical information for BRCA patients were retrieved from the TCGA database (http://gdc.cancer.gov) (17) and the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/) (18). The TCGA-BRCA cohort includes 1,217 samples, while the GEO dataset (GSE96058) comprises 3,069 samples. Additionally, the scRNA-seq dataset (GSE176078), obtained from the GEO database, consists of 11 estrogen receptor (ER)+, 5 HER2+, and 10 TNBC samples. ST data from patients were acquired through the 10x Genomics official website, with specific datasets accessible via the following links: https://www.10xgenomics.com/datasets/human-breast-cancer-visium-fresh-frozen-whole-transcriptome-1-standard and https://www.10xgenomics.com/datasets/human-breast-cancer-whole-transcriptome-analysis-1-standard-1-2-0.

Differential expression analysis of FAM-related genes
We conducted a differential expression analysis of TCGA-BRCA samples from both tumor and normal tissues using the R package “limma” (http://bioconductor.org/packages/release/bioc/html/limma.html, version 3.48.3) (19). After Benjamini-Hochberg correction for multiple testing, genes with a P value of less than 0.05 and an absolute log2 fold change (|log2FC|) greater than 0.585 (corresponding to a fold change of more than 1.5) were identified as DEGs. The intersection of FAM-related genes with these DEGs was identified as the differentially expressed FAM genes, which were further subjected to mutational landscape and copy number variation (CNV) analysis. To investigate the biological functions and pathway processes of these differentially expressed FAM genes, we employed the R package “clusterProfiler” (http://bioconductor.org/packages/release/bioc/html/clusterProfiler.html, version 4.4.4) (20) to perform Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses. The GO analysis provided information about the involvement of target genes in biological processes (BPs), cellular components (CCs), and molecular functions (MFs), while the KEGG analysis offered pathway annotations for the target genes.
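
The thresholding step described above can be sketched in a few lines. This is a minimal illustration only, not the authors' pipeline (which used limma in R). It assumes that fold changes are on a log2 scale, as the stated equivalence between 0.585 and a 1.5-fold change implies, and that the 0.05 cutoff is applied to Benjamini-Hochberg-adjusted P values. The function names, gene labels, and numbers are hypothetical.

```python
import math

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_degs(genes, log2fc, pvalues, p_cut=0.05, lfc_cut=0.585):
    """Keep genes passing both the adjusted-P and |log2FC| cutoffs."""
    adj = bh_adjust(pvalues)
    return [g for g, lfc, p in zip(genes, log2fc, adj)
            if p < p_cut and abs(lfc) > lfc_cut]

# A 1.5-fold change corresponds to log2(1.5), roughly 0.585.
assert abs(math.log2(1.5) - 0.585) < 1e-3

# Toy input (made-up values, not data from the study).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2fc = [1.2, -0.9, 0.3, 0.8]
pvals = [0.001, 0.01, 0.02, 0.5]
degs = select_degs(genes, log2fc, pvals)
print(degs)
```

In this toy input, GENE_C fails the fold-change cutoff and GENE_D fails the adjusted P value cutoff, so only the first two genes are retained.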

Establishing molecular subtypes of TCGA-BRCA
We employed the R package “survival” (http://bioconductor.org/packages/survivalr/, version 2.41-1) to perform univariate Cox regression analysis on differentially expressed FAM genes within the TCGA-BRCA cohort, aiming to identify genes associated with prognosis, where a P value of less than 0.05 was deemed statistically significant. Following this, unsupervised hierarchical clustering analysis was conducted on BRCA patient samples using the R package “ConsensusClusterPlus” (http://www.bioconductor.org/packages/release/bioc/html/ConsensusClusterPlus.html, version 1.60.0) (21) to ascertain the optimal number of clusters. The R package “survival” was also utilized to evaluate survival differences across subtype groups and to generate Kaplan-Meier survival curves. Additionally, clinical data from the TCGA-BRCA cohort were integrated to analyze the relationship between molecular subtypes and clinical characteristics [such as age, sex, and tumor stage (T stage)] using the Chi-squared test. A Sankey diagram was used to visualize the clinical characteristics of samples within each subtype.

Immune cell infiltration analysis
We utilized the CIBERSORT algorithm from the R package “IOBR” to evaluate the abundance of 22 types of human immune cells in the TCGA-BRCA dataset, followed by a visualization of the differences in immune cell profiles among subtype groups. To more precisely identify immune cell subtypes of interest and validate the CIBERSORT findings, the ssGSEA algorithm was also applied to determine the differences in immune cell composition between subtypes.
Moreover, the reference file “infiltration_estimation_for_tcga” was downloaded from the TIMER database (http://timer.cistrome.org/) (22) to facilitate immune cell infiltration analysis. The extent of immune cell infiltration in BRCA patients was assessed using seven different algorithms: TIMER, CIBERSORT, CIBERSORT-ABS, QUANTISEQ, MCPCOUNTER, XCELL, and EPIC. The R package “ggplot2” was used to compute the correlation between risk scores and immune cell infiltration.

Construction of the FAMGM based on combined machine learning
To develop the FAMGM, we first integrated the differentially expressed FAM prognostic genes that were common to both the TCGA-BRCA and GSE96058 datasets. The TCGA-BRCA cohort was designated as the training set, while the GSE96058 cohort served as the external validation set. We constructed 101 model combinations using 10 machine learning algorithms, including support vector machine (SVM), least absolute shrinkage and selection operator (LASSO), gradient boosting machine (GBM), random forest, elastic net, stepwise Cox, ridge, CoxBoost, super partial correlation (SuperPC), and partial least squares with Cox regression (plsRcox). The concordance index (C-index) for each model was calculated within both cohorts, and models were ranked by the average C-index from highest to lowest across the two cohorts (23). The model with the highest average C-index was identified as the optimal model. C-index calculations were performed using the R package “Hmisc”. Samples in the training and validation sets were scored based on the selected model and categorized into high- and low-risk groups according to the median score within each cohort. The R package “survminer” was utilized for survival analysis. To assess the predictive performance of the FAMGM, we used the R package “timeROC” to generate time-dependent receiver operating characteristic (ROC) curves and calculated the area under the curve (AUC) values.
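
The model-selection rule above (compute a C-index per cohort, then rank by the average) can be illustrated with a small sketch. It implements a plain Harrell-style concordance index in pure Python as an assumption about the general technique; the study itself used the R package “Hmisc”, whose handling of ties and censoring may differ. Model names and all numbers below are made up for illustration.

```python
def harrell_c_index(times, events, risks):
    """Harrell-style concordance: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i had an observed event
    (events[i] == 1) strictly before subject j's follow-up time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0      # higher risk failed earlier
                elif risks[i] == risks[j]:
                    concordant += 0.5      # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

def rank_models(cindex_by_model):
    """Sort model names by mean C-index across cohorts, highest first."""
    def mean(values):
        return sum(values) / len(values)
    return sorted(cindex_by_model,
                  key=lambda name: mean(cindex_by_model[name]),
                  reverse=True)

# Toy survival data: follow-up time, event indicator, model risk score.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_toy = harrell_c_index(times, events, risks)

# Hypothetical (training, validation) C-index pairs for three candidate models.
candidates = {"model_a": (0.70, 0.60),
              "model_b": (0.90, 0.55),
              "model_c": (0.65, 0.62)}
ranking = rank_models(candidates)
print(c_toy, ranking)
```

Ranking by the cross-cohort average, rather than by the training C-index alone, penalizes combinations that fit the training cohort well but transfer poorly to the validation cohort.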

Development and assessment of a nomogram based on FAMGM
To assess the potential of FAMGM as an independent prognostic marker for BRCA patients, we performed univariate and multivariate Cox regression analyses using the TCGA-BRCA cohort, evaluating the significance of FAMGM in conjunction with relevant clinical parameters. A nomogram was then constructed by integrating age, T stage, node stage (N stage), metastasis stage (M stage), and risk score using the R package “rms” (24). By summing the scores for each variable, a comprehensive survival prediction model for individual patients was established. The accuracy of the nomogram was evaluated using calibration curves and the C-index.

Association between the FAMGM signature and BRCA molecular subtypes
To further assess the relationship between the risk model and BRCA molecular subtypes, we compared the distribution of risk scores across intrinsic subtypes. According to the PAM50 classification in the TCGA cohort, patients were categorized into four subtypes: basal-like, HER2-enriched, luminal A, and luminal B. Median risk score differences among subtypes were evaluated using the Wilcoxon test. Additionally, to examine the subtype composition within the high- and low-risk groups, stacked bar plots were generated, and Chi-squared tests were conducted to assess statistical significance.

Somatic mutation analysis and immunotherapy prediction
We obtained BRCA-related gene mutation data from the TCGA database and analyzed the mutation status of each gene in the TCGA-BRCA cohort using the R package “maftools” (version 2.8.0, https://bioconductor.org/packages/release/bioc/html/maftools.html) (25). Genes were ranked by mutation frequency, and the top 20 most frequently mutated genes were visualized. The calculation of tumor mutational burden (TMB) and the analysis of CNV were also performed using the “maftools” package. The Tumor Immune Dysfunction and Exclusion (TIDE; http://tide.dfci.harvard.edu/) (26) online tool was used to assess specific transcriptomic biomarkers, indicating potential clinical benefits of immunotherapy across different risk groups. TIDE scores were generated for each sample from this platform, and the Wilcoxon test was used to compare TIDE score differences between different risk groups. A higher TIDE score suggests a higher likelihood of tumor immune evasion and reduced effectiveness of treatment. Furthermore, we evaluated additional immunotherapy predictive biomarkers for each sample, including CD8A/PD-L1 expression levels, cytolytic activity (CYT), microsatellite instability (MSI), and tertiary lymphoid structure (TLS) scores. The CYT score was calculated as the average expression value of GZMA and PRF1. The TLS score was derived using the gene set variation analysis (GSVA) algorithm based on the expression of TLS signature genes (CCR6, CD1D, CD79B, CETP, EIF1AY, LAT, PTGDS, RBP5, and SKAP1). Differences in these biomarker scores between different risk groups were also assessed using the Wilcoxon test.
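
As a small worked example of the biomarker comparison described above, the sketch below computes the CYT score as the arithmetic mean of GZMA and PRF1 expression, following the definition given in the text, and then counts the pairwise wins that underlie a Wilcoxon rank-sum (Mann-Whitney U) comparison between two groups. It is illustrative only: the expression values are invented, the helper names are hypothetical, and no P value is computed.

```python
def cyt_score(expr):
    """Arithmetic mean of GZMA and PRF1 expression for one sample."""
    return (expr["GZMA"] + expr["PRF1"]) / 2.0

def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs where a > b, with ties counted as 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Toy expression values (made up, not data from the study).
samples = {
    "S1": {"GZMA": 4.0, "PRF1": 2.0},
    "S2": {"GZMA": 1.0, "PRF1": 3.0},
}
scores = {sid: cyt_score(expr) for sid, expr in samples.items()}

# Toy CYT scores for two hypothetical risk groups.
high_risk = [2.0, 1.5, 1.0]
low_risk = [3.0, 2.5, 1.2]
u_high = mann_whitney_u(high_risk, low_risk)
print(scores, u_high)
```

A U value far below half the number of pairs (here 9 pairs) indicates that scores in the first group tend to be lower than in the second; a full test would convert U into a P value.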

Drug sensitivity analysis
To evaluate patient sensitivity to chemotherapy drugs, we used data from the Genomics of Drug Sensitivity in Cancer (GDSC; https://www.cancerrxgene.org/) database. The R package “pRRophetic” (https://github.com/paulgeeleher/pRRophetic) (27) was employed to predict drug sensitivity, comparing the differences in half-maximal inhibitory concentration (IC50) values for 138 chemotherapy drugs across different risk groups.

Immunohistochemical analysis using the Human Protein Atlas (HPA) database
The HPA (https://www.proteinatlas.org/) offers comprehensive data on the tissue and cellular distribution of 26,000 human proteins. Researchers utilize highly specific antibodies and various immunoassay techniques, including immunoblotting, immunofluorescence, and immunohistochemistry, to detail the expression of each protein across 64 cell lines, 48 types of normal human tissues, and 20 types of cancer tissues. We employed this database to examine the immunohistochemical profiles of the model genes in both normal and BRCA tissues.

scRNA-seq analysis
We retrieved scRNA-seq data from the GEO database under accession code GSE176078 for this study. We analyzed a total of 100,064 cells from 11 ER+, 5 HER2+, and 10 TNBC samples. The data were processed using the R package “Seurat”, which involved quality control, batch effect correction, and data integration. Mitochondrial gene expression percentages were calculated with the “PercentageFeatureSet” function, and cells with mitochondrial expression ≤20% were retained. The raw data were normalized using the “NormalizeData” function, and the top 2,000 highly variable genes were selected using the “FindVariableFeatures” function. Data scaling was done with the “ScaleData” function, and principal component analysis (PCA) was conducted with the “RunPCA” function. Clustering of cells was performed at a resolution of 0.2 using the “FindNeighbors” and “FindClusters” functions, and visualization was achieved with the “RunUMAP” function, incorporating 30 PCA dimensions and default parameters. Lastly, single-cell sequencing data were annotated using the “SingleR” package and marker genes.
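
The mitochondrial quality-control filter described above can be sketched independently of Seurat. The example assumes that mitochondrial genes are recognized by the "MT-" prefix of human gene symbols, which is a common convention but is not stated in the text, and that cells at or below the 20% threshold are kept. The cell identifiers and counts are invented.

```python
def percent_mito(counts, prefix="MT-"):
    """Percentage of a cell's total counts that come from mitochondrial genes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    mito = sum(c for gene, c in counts.items() if gene.startswith(prefix))
    return 100.0 * mito / total

def keep_cells(cells, max_mito=20.0):
    """Return IDs of cells whose mitochondrial percentage is <= max_mito."""
    return [cid for cid, counts in cells.items()
            if percent_mito(counts) <= max_mito]

# Toy count table: cell -> {gene: count} (made-up values).
cells = {
    "cell_1": {"MT-CO1": 10, "ACSL1": 40, "APOD": 50},     # 10% mitochondrial
    "cell_2": {"MT-CO1": 30, "MT-ND1": 20, "ALDH2": 50},   # 50% mitochondrial
    "cell_3": {"MT-ND1": 20, "PSME1": 80},                 # exactly 20%
}
kept = keep_cells(cells)
print(kept)
```

Because the threshold is inclusive, the cell at exactly 20% is retained, matching the "≤20%" criterion in the text.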

GSEA and GSVA
GSEA and GSVA are methodologies used for genomic enrichment analysis. In this study, we utilized ssGSEA to assess the enrichment levels of the model genes. Based on the median enrichment score, samples were categorized into high FAMGM and low FAMGM groups. GSEA was used to explore differences in signaling pathways between high and low FAMGM groups. GSVA, an unsupervised and non-parametric method, converts gene expression data from individual gene levels to gene set enrichment levels by integrating scores from target gene sets, allowing for the assessment of potential biological changes. Specifically, GSVA was applied to evaluate pathway activity profiles within the high FAMGM group. GMT files, sourced from public databases such as MSigDB, were used for annotating pathways and gene sets during the analysis.
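
The median-based grouping described above amounts to a single comparison per sample. The sketch below assumes that scores strictly above the median are labeled "high" and all others "low"; the text does not say how ties at the median are handled, so that choice is an assumption, and the identifiers and scores are invented.

```python
import statistics

def split_by_median(scores):
    """Label each sample 'high' or 'low' relative to the median score."""
    cutoff = statistics.median(scores.values())
    return {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}

# Toy ssGSEA-like enrichment scores (made-up values).
enrichment = {"c1": 0.1, "c2": 0.4, "c3": 0.7, "c4": 0.9}
groups = split_by_median(enrichment)
print(groups)
```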

Cell metabolic activity analysis
The R package “scMetabolism” (version 0.2.1) (28) was employed to quantify metabolic activity at the single-cell level by integrating scRNA-seq data with information on metabolic pathways. This method enabled us to discern dynamic changes in metabolic pathways across different groups, underscoring the significance of metabolism in disease progression.

Cell-cell communication and pseudotime analysis
CellChat (version 1.4.0) was utilized to infer the interactions between various cell types within the BRCA microenvironment. We specifically examined categories such as “secreted signals” and “cell-cell contact” from the CellChat database, focusing on comparative analysis of macrophages between high and low FAMGM groups. This method aided in identifying dysregulated signaling pathways, characterized by the upregulation or downregulation of specific ligand-receptor interactions in different conditions.
For exploring the dynamic developmental trajectories of cell populations, we conducted pseudotime analysis using the R packages “CytoTRACE” and “Monocle2”. “CytoTRACE” (version 0.3.3) was employed to evaluate the differentiation levels of macrophage subpopulations and to determine the initial states of their developmental trajectories, while “Monocle2” (version 2.26.0) provided further insights into the trajectory features of different macrophage subpopulations. These analyses offered a comprehensive view of evolving cell states and lineage relationships at single-cell resolution.

ST analysis
We utilized the R package “Seurat” for the processing and visualization of ST data. Normalization of ST data was performed using the SCT method. Unsupervised clustering was then applied to group similar ST data points. The cell cluster annotation was based on hematoxylin and eosin (HE) stained slices and highly variable genes identified within each cluster. Scores for cell-specific features (FAMGM model genes) were calculated using the ssGSEA algorithm on scRNA-seq data. The SpatialDimPlot and SpatialFeaturePlot functions were used together to visualize cellular expression levels within the ST data.

Statistical analysis
Statistical analyses were conducted using R software version 4.3.0. Comparisons were carried out with the Student’s t-test or the Wilcoxon rank-sum test. A P value of less than 0.05 was considered statistically significant.

Results


Differential expression analysis of FAM-related genes in TCGA-BRCA
The expression data and clinical information utilized in this study were sourced from the TCGA and GEO databases. Initially, we gathered eight gene sets related to FAM from the MSigDB v7.1 database. Upon extracting and deduplicating these gene sets, we identified a total of 450 FAM-related genes (Figure S1A, table available at https://cdn.amegroups.cn/static/public/tcr-2025-1424-1.xlsx). Differential expression analysis of the TCGA-BRCA cohort was then conducted to compare normal and BRCA tissues, resulting in the identification of 2,062 up-regulated and 2,446 down-regulated genes (Figure 1A). A Venn diagram was subsequently employed to pinpoint the DEGs involved in FAM, yielding 132 key genes (Figure 1B). Furthermore, we examined the mutation landscape and CNV of these 132 genes in BRCA. The analysis revealed a high frequency of copy number amplifications in genes such as ADIPOR1 and MAPKAPK2, and a high frequency of deletions in genes such as ACADVL and ALOX15B (Figure S1B). Among the 252 samples analyzed, 90 (35.71%) samples exhibited somatic mutations, with PTPRG and ACACB showing the highest mutation rates at 4% (Figure 1C). A Circos plot was used to map the chromosomal locations of these 132 genes (Figure 1D). Subsequently, GO and KEGG enrichment analyses were performed on the identified genes. The KEGG analysis revealed that these genes were primarily associated with metabolic pathways, PPAR signaling, fatty acid degradation, and FAM (Figure 1E). The GO analysis indicated that they were mainly linked to small molecule metabolic processes, lipid metabolism, mitochondrial function, endoplasmic reticulum activity, and oxidoreductase activity (Figure 1F-1H).

Molecular subtype classification based on FAM-related genes
To further explore the influence of FAM-related genes on BRCA prognosis, we extracted 132 DEGs from the TCGA-BRCA cohort. Univariate Cox regression analysis identified 16 genes significantly associated with prognosis (Figure 2A, Table S1). Consensus clustering based on the expression profiles of these 16 prognostic genes was performed, with the optimal number of clusters determined by the lowest “proportion of ambiguous clustering” (PAC). The analysis revealed that when k=2, the distinction between the two clusters was most pronounced, resulting in two distinct subtypes: cluster 1 (C1, 465 samples) and cluster 2 (C2, 617 samples) (Figure 2B,2C, Figure S2). A heatmap of the clustering for k=2 was generated (Figure 2D, table available at https://cdn.amegroups.cn/static/public/tcr-2025-1424-2.xlsx). Kaplan-Meier survival analysis indicated that patients classified in the C1 subtype had a poorer prognosis compared to those in the C2 subtype (Figure 2E). The heatmap in Figure 2F illustrates the expression patterns of the 16 genes across the subtypes and their correlations with clinical characteristics. Figure 2G provides a detailed correlation analysis between the subtypes and clinical parameters. The results indicate that the C1 subtype is predominantly associated with advanced clinical stages, while the C2 subtype is associated with early clinical stages, aligning with the survival outcomes.
Recognizing the close interplay between FAM and the immune microenvironment, we assessed immune cell infiltration across the two subtypes. CIBERSORT analysis of 22 immune cell types revealed statistically significant differences in 15 of them. Higher infiltration levels of resting natural killer (NK) cells, M0 and M2 macrophages, and activated mast cells were observed in the high-risk C1 group, whereas naive and memory B cells, plasma cells, CD8+ T cells, follicular helper T cells, regulatory T cells (Tregs), gamma delta T cells, M1 macrophages, resting dendritic cells, resting mast cells, and neutrophils exhibited higher infiltration in the low-risk C2 group (Figure 2H). The ssGSEA algorithm further validated the CIBERSORT findings, yielding consistent results (Figure 2I). Notably, macrophages, especially the M0 and M2 subtypes, constituted a significant proportion of immune cells in the CIBERSORT assessment.
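
The PAC criterion used above to choose the number of clusters can be illustrated as follows. PAC is commonly computed as the fraction of sample pairs whose consensus index falls in an intermediate interval, meaning the pair is neither consistently co-clustered nor consistently separated. The interval bounds of 0.1 and 0.9 and the strict inequalities are assumptions of this sketch, since the text does not report the settings used, and the two consensus matrices are toy examples.

```python
def pac(consensus, lower=0.1, upper=0.9):
    """Fraction of sample pairs with an ambiguous consensus index.

    `consensus` is a symmetric matrix; entry [i][j] is the fraction of
    resampling runs in which samples i and j landed in the same cluster.
    """
    n = len(consensus)
    ambiguous = 0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):   # each unordered pair once, no diagonal
            pairs += 1
            if lower < consensus[i][j] < upper:
                ambiguous += 1
    return ambiguous / pairs

# Crisp partition: every pair is always together (1.0) or never together (0.0).
crisp = [[1.0, 1.0, 0.0, 0.0],
         [1.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 1.0],
         [0.0, 0.0, 1.0, 1.0]]

# Fuzzier partition: two of the six pairs have intermediate consensus values.
fuzzy = [[1.0, 0.95, 0.5, 0.05],
         [0.95, 1.0, 0.4, 0.0],
         [0.5, 0.4, 1.0, 1.0],
         [0.05, 0.0, 1.0, 1.0]]

pac_crisp = pac(crisp)
pac_fuzzy = pac(fuzzy)
print(pac_crisp, pac_fuzzy)
```

Under this criterion, the value of k whose consensus matrix gives the lowest PAC is preferred, because it yields the most stable assignment of samples across resampling runs.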

Development and validation of the FAMGM based on integrative computational framework
To develop a more precise prognostic model for FAM-related genes, we selected 16 prognostic genes that were significantly associated with overall survival (OS) from the previous analysis. These genes were processed through 101 combinations of machine learning algorithms to construct the FAMGM. We subsequently calculated the C-index for each model combination within both the training cohort (TCGA-BRCA) and the validation cohort (GSE96058). Among the 101 models, the combination of CoxBoost and random survival forest (RSF) algorithms achieved the highest average C-index (Figure 3A). Consequently, we used these two algorithms for gene selection and model construction. CoxBoost identified nine key genes, which were further refined by the RSF algorithm. This process resulted in a model composed of nine core genes: ACSL1, ALDH2, SUCLA2, APOD, ACSL5, CNR1, ECHDC2, PSME1, and SLC27A2 (Figure 3B,3C, Table S2). We then calculated the FAMGM score for each patient in both cohorts. Patients were stratified into high- and low-risk groups based on the median score within their respective cohorts, with Kaplan-Meier curves demonstrating a significantly better prognosis for patients in the low-risk group compared to the high-risk group (Figure 3D,3E). In the TCGA-BRCA training cohort, the AUC values for 1-, 2-, 3-, 4-, and 5-year survival were 0.984, 0.98, 0.987, 0.987, and 0.991, respectively. However, in the GSE96058 validation cohort, the AUC values were approximately 0.6 (Figure 3F). To facilitate a comprehensive comparison between FAMGM and other prognostic signatures, we incorporated previously published models into the TCGA-BRCA dataset. Notably, FAMGM exhibited superior AUC values relative to all other models in the TCGA-BRCA cohort, underscoring the exceptional predictive performance of our model (Figure 3G).

FAMGM is an independent predictor for survival of BRCA patients
To clarify the relationship between FAMGM and clinical characteristics in BRCA, we analyzed data from the TCGA-BRCA cohort. Our results showed that the risk score was significantly correlated with age and with T, N, and M stages (Figure 4A). Patients in the high-risk group predominantly exhibited features of advanced clinical stages, whereas those in the low-risk group were more likely to present early-stage characteristics. We then performed univariate and multivariate Cox regression analyses to evaluate the association between the risk score, other clinical features, and patient prognosis (Figure 4B,4C). The univariate Cox analysis demonstrated that age, T, N, and M stages, and the risk score were significantly associated with OS (all P<0.001). In the multivariate Cox analysis, FAMGM remained an independent prognostic factor after adjustment for the other clinical variables (P<0.001). To enhance the model's clinical applicability, we developed a clinicopathological nomogram incorporating age, T, N, and M stages, and the risk score, which achieved a C-index of 0.887 (Figure 4D). This nomogram provided predictions for 1-, 3-, and 5-year survival, aiding clinical decision-making. The accuracy of the nomogram was validated using calibration curves, which demonstrated good agreement between the predicted and observed 1-, 3-, and 5-year survival rates (Figure 4E), supporting the nomogram's predictive reliability.
Given the critical role of hormone receptor status in guiding BRCA treatment decisions, we explored the association between FAMGM and the four molecular subtypes of BRCA. The results revealed significant variations in risk scores among intrinsic subtypes. The basal-like and HER2-enriched subtypes exhibited the highest median risk scores, whereas the luminal A subtype showed the lowest (P<0.001) (Figure 4F). This pattern aligns with the established clinical consensus that basal-like and HER2-enriched subtypes are more aggressive and associated with poorer prognoses. Furthermore, analysis of subtype distribution between the high- and low-risk groups demonstrated a significant imbalance, as confirmed by the Chi-squared test (P<0.001) (Figure 4G). The high-risk group was enriched in basal-like and HER2-enriched subtypes, while the low-risk group predominantly comprised the luminal A subtype, which is typically linked to favorable outcomes. Collectively, these findings indicate that FAMGM-based risk stratification is strongly correlated with the intrinsic molecular subtypes of BRCA.

Immune and genomic variation landscape in risk groups
Somatic mutations are a common feature of BRCA, as shown by our initial mapping of the mutation landscape within the TCGA-BRCA cohort (Figure 5A). Missense mutations and single-nucleotide polymorphisms were the most prevalent mutation categories, and C>T transitions were the most common type of single-nucleotide variation. We then quantified the mutation frequency of each gene across the samples and visualized the mutation profiles of the 20 most frequently mutated genes in the high- and low-risk groups using a waterfall plot (Figure 5B). This analysis encompassed 955 BRCA samples, with mutations detected in 821 samples, corresponding to a mutation rate of 85.97%. TP53, a well-known tumor suppressor gene, had the highest mutation frequency in both groups and was mutated more frequently in the high-risk group than in the low-risk group. Given the significance of TMB in predicting patient responses to immunotherapy, we assessed the correlation between the risk score and TMB. The TMB values were log-transformed, as depicted in Figure S3. TMB differed significantly between the high- and low-risk groups, with the high-risk group showing elevated TMB levels (Figure 5C). Additionally, the risk score was significantly positively correlated with TMB (Figure 5D). Furthermore, leveraging the TIMER database, we employed seven different algorithms to analyze immune cell profiles between the risk groups. The findings revealed that 87 immune cell types were significantly correlated with the risk score (P<0.05) (Figure 5E). Notably, most immune cell levels were inversely correlated with the risk score, indicating that patients in the low-risk group may exhibit higher levels of immune cell infiltration.
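The mutation rate quoted above follows directly from the sample counts, and the short sketch below reproduces that arithmetic. The log transformation of TMB is shown only as an illustration: the text does not state the base or any offset, so the `log2(TMB + 1)` form and the function name `log_transform_tmb` are assumptions.

```python
import math

# Sample counts reported in the text.
mutated_samples = 821
total_samples = 955
mutation_rate = mutated_samples / total_samples
print(f"Mutation rate: {mutation_rate:.2%}")


def log_transform_tmb(tmb_values):
    """Assumed transform: log2(TMB + 1), which keeps zero-TMB samples defined."""
    return [math.log2(value + 1.0) for value in tmb_values]


# Hypothetical TMB values (mutations per megabase), for illustration only.
example_tmb = [0.0, 1.0, 3.0]
example_log_tmb = log_transform_tmb(example_tmb)
```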

Predictive effects of FAMGM in immunotherapy
Immunotherapy has become a cornerstone in the treatment of many cancers. We evaluated differences in immune-related scores, including CD8A/PD-L1 expression level, MSI score, CYT score, and TLS score, between high- and low-risk groups to predict their response to immunotherapy. The analysis revealed significant correlations between CD8A/PD-L1 expression level, TLS score, and the risk score, with both displaying a significant inverse relationship with the FAMGM score (Figure 6A,6B, Figure S4A,S4B). Furthermore, the high-risk group exhibited a lower TIDE score compared to the low-risk group, and the TIDE score was significantly negatively correlated with the FAMGM score (Figure 6C,6D). These findings suggest that the high-risk group may represent an immune subtype that is more responsive to immunotherapy and could derive greater benefits from it. Additionally, to investigate the potential connection between our model and drug sensitivity, we utilized the GDSC database to analyze the IC50 values of 138 chemotherapeutic agents in BRCA samples, assessing the relationship between risk score and drug efficacy. The results identified 80 drugs with significantly different IC50 values between the high- and low-risk groups (P<0.01) (table available at https://cdn.amegroups.cn/static/public/tcr-2025-1424-3.xlsx), with the most significant differences observed for rapamycin, erlotinib, and LFM-A13. Notably, the IC50 values for these three drugs were higher in the high-risk group (Figure 6E-6G). Overall, these findings indicated a potential link between our model and immune markers as well as drug sensitivity, offering valuable insights for personalized treatment strategies in BRCA.

The model genes affect the progression of BRCA through the involvement of macrophages
Single-cell sequencing data from 26 BRCA patients were annotated using clearly defined marker genes (Figure 7A, Figure S5A). The pie chart (Figure S5B) illustrates that TNBC patients exhibit higher percentages of macrophages and B cells infiltration compared to other BRCA types, while ER+ patients show higher percentages of endothelial and epithelial cells. We then assessed the expression and distribution of FAMGM model genes across the six cell types, as shown in FeaturePlot and violin plots (Figure 7B, Figure S5C). Further, using the ssGSEA algorithm, we calculated the FAMGM model gene scores for each cell type (Figure 7C). Notably, FAMGM was more highly enriched in macrophages compared to other cell types. Therefore, the role of FAMGM in macrophages was further investigated.
We isolated macrophages characterized by CD68 and CD163 expression (Figure S5A) and categorized them into seven subgroups (Figure 7D). All macrophages were then stratified into high and low FAMGM groups according to the median FAMGM score. To elucidate the impact of FAMGM on tumor progression, we analyzed pathways associated with tumor progression in both groups. The high FAMGM group exhibited significant activation of key tumor progression pathways, including the AKT, MAPK, NF-κB, NOTCH, p53, and TGFB signaling pathways (Figure 7E). Further GSEA revealed significant activation of the pentose phosphate and glycolysis/gluconeogenesis pathways, as well as upregulation of arachidonic acid, β-alanine, butyrate, fatty acid, pyruvate, and tryptophan metabolism in the high FAMGM group (Figure 7F, Figure S5D). This suggested that FAMGM may influence the metabolic reprogramming of macrophages. To test this hypothesis, we performed metabolic activity analysis using the “scMetabolism” package, which revealed that metabolic pathways such as fatty acid metabolism and glycolysis were significantly upregulated in macrophages from the high FAMGM group (Figure 7G). The upregulation of these metabolic pathways indicated an enhanced capacity to support tumor progression, consistent with the high metabolic activity characteristic of cancer cells. Thus, our findings suggested a strong association between FAMGM enrichment in macrophages and tumor malignancy.
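The median-based stratification of macrophages can be sketched as follows. The cell identifiers and scores are hypothetical, and the handling of cells that fall exactly on the median (assigned to the low group here) is an assumption, since the text does not specify it.

```python
from statistics import median


def split_by_median(scores):
    """Label each cell 'high' or 'low' relative to the median score.

    scores : dict mapping a cell identifier to its FAMGM enrichment score.
    Cells exactly at the median are labelled 'low' (an assumption).
    """
    cutoff = median(scores.values())
    groups = {
        cell: ("high" if score > cutoff else "low")
        for cell, score in scores.items()
    }
    return groups, cutoff


# Hypothetical per-cell scores, for illustration only.
toy_scores = {"cell_a": 0.12, "cell_b": 0.45, "cell_c": 0.30, "cell_d": 0.80}
toy_groups, toy_cutoff = split_by_median(toy_scores)
print(toy_groups, toy_cutoff)
```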

Analysis of differentiation trajectory and cell communication of macrophages
Previously, FAMGM was found to be significantly enriched in macrophages. Given the functional plasticity of macrophages in the tumor microenvironment, we further examined the relationship between macrophage differentiation trajectories, cell communication, and FAMGM enrichment. CytoTRACE analysis of macrophages (Figure 8A) identified cluster 6 as the differentiation origin, exhibiting the highest differentiation potential, while cluster 5 represented the terminal state. Macrophages in the high FAMGM group exhibited higher CytoTRACE scores, indicating a less differentiated phenotype and significantly increased malignancy (Figure 8B). Pseudotime trajectory analysis revealed dynamic FAMGM enrichment patterns across two cell fates, with differentiation initiated from cluster 6 and maturation progressing from right to left (Figure 8C-8E). The relationship between macrophage developmental trajectories and FAMGM enrichment was further illustrated in Figure 8F and Figure S6A. FAMGM displayed a temporally regulated expression pattern during differentiation, with ACSL1, ACSL5, ALDH2, APOD, SLC27A2, PSME1, and CNR1 highly expressed in later stages, whereas ECHDC2 and SUCLA2 predominated in early stages (Figure 8G). These findings suggest that FAMGM may contribute to BRCA malignancy progression by dynamically regulating macrophage differentiation.
We next investigated whether the heightened enrichment of FAMGM in macrophages coincided with altered intercellular communication in BRCA. To address this, we conducted a CellChat analysis to identify differences in cell communication between the high FAMGM and low FAMGM groups. Macrophages in the high FAMGM group had more frequent and stronger interactions with other cells than those in the low FAMGM group (Figure 8H,8I, Figure S6B,S6C). Exploring additional signaling pathways, we found that macrophages in the high FAMGM group regulated BRCA via the SEMA4, SEMA7, and SELPLG pathways (Figure 8J). Moreover, compared with the low FAMGM group, the high FAMGM group exhibited increased activation of the CXCL, EGF, CD86, IL10, APRIL, and ITGB2 signaling pathways (Figure S6D). Collectively, these findings indicated that the enrichment of FAMGM in macrophages could potentially regulate the initiation and progression of BRCA through a complex network of signaling pathways.

The model genes demonstrate high expression levels in malignant areas of BRCA
We sourced ST data from the 10× Genomics website and analyzed two slices, each initially containing 36,601 genes. To investigate FAMGM enrichment patterns across different regions, we analyzed slices 1 and 2 separately (Figure 9A,9B, Figure S7A,S7B) without merging the data. Initially, dimensional reduction and point clustering in each ST array allowed us to identify and categorize spatial locations into different major clusters. Six major cell clusters were identified in slice 1, while eight were identified in slice 2. We examined the expression of model genes in tissue sections. The analysis revealed that ECHDC2, ALDH2, and PSME1 exhibited higher expression in the slices (Figure 9C, Figure S7C). Subsequently, to explore spatial expression features further, we used the ssGSEA algorithm to assess the ST features of FAMGM. We observed that FAMGM was significantly enriched in clusters 2 and 4 of slice 1, and in clusters 6 and 7 of slice 2 (Figure 9D, Figure S7D). Figure 9E and Figure S7E displayed the enrichment levels of model genes within the spatial locations of the slices. To further explore the malignancy within regions of high FAMGM enrichment, we conducted chromosomal CNV analysis on two slices using SPATA2 (Figure 9F, Figure S7F). The resulting CNV heatmaps were depicted in Figure 9G and Figure S7G. In the first slice, significant amplification events were observed on chromosome 1. Clusters 2 and 4, which were enriched in FAMGM, predominantly exhibited amplifications on chromosomes 1, 6, 12, and 17, and a deletion on chromosome 4 (Figure 9H). Similarly, in the second slice, the primary chromosomal alteration was an amplification event on chromosome 1. Clusters 6 and 7, which were enriched in FAMGM, demonstrated amplifications on chromosomes 1, 11, and 12, a deletion on chromosome 16, and a tendency for amplification on chromosome 17 (Figure S7H). 
We further analyzed the average CNV distribution across chromosomes 1, 6, 12, and 17 in both slices, finding that regions with high FAMGM enrichment frequently displayed CNV events (Figure 9I, Figure S7I).
Lastly, we used the HPA database (https://www.proteinatlas.org/), under the Creative Commons Attribution (CC BY) license, to validate the expression of the model genes at the protein level in BRCA and adjacent normal tissues. Immunohistochemical analysis revealed higher protein expression levels of SUCLA2, APOD, and PSME1 in BRCA samples than in normal tissues (Figure S8). Collectively, these findings suggested an accumulation of CNV events in regions of high FAMGM enrichment, indicating that these regions were associated with malignant characteristics and implying a potential spatial correlation between high expression of the model genes and increased tumor malignancy.

Discussion

Tang et al. developed a prognostic model for FAM in BRCA using univariate Cox and LASSO analyses (29). Similarly, Qian et al. employed the LASSO-Cox method to create gene signatures associated with FAM to predict BRCA prognosis and immune response (30). However, these models often depend on individual preferences for algorithm selection or lack validation across multiple datasets, which can result in suboptimal performance.
This study systematically elucidated the pivotal role of FAMGM in BRCA progression and their clinical translational potential. Through integrated genomic analysis, 132 DEGs significantly associated with FAM were identified, and their co-occurrence with high-frequency mutations and CNVs in BRCA samples was revealed, indicating that FAM reprogramming is a fundamental molecular hallmark of BRCA. Univariate Cox regression and unsupervised consensus clustering based on survival data successfully defined two molecular subtypes (C1/C2) with distinct prognostic outcomes, with the C1 subtype associated with significantly poorer survival compared to the C2 subtype. The FAMGM prognostic model, developed using machine learning algorithms, exhibited outstanding clinical predictive performance, offering a novel tool for precise prognosis assessment in BRCA. Further multi-omics analyses demonstrated that single-cell sequencing identified a specific enrichment of FAMGM signature genes in tumor-associated macrophages (TAMs), with expression patterns closely linked to protumor signaling pathways and metabolic reprogramming. ST further confirmed the association between FAMGM signature genes and malignant tumor phenotypes. Collectively, these multi-omics findings underscore the central role of the FAM regulatory network in BRCA malignancy progression, highlighting its molecular characteristics as potential prognostic biomarkers and providing a theoretical basis for developing therapeutic strategies targeting the tumor metabolic microenvironment.
Machine learning algorithms have emerged as a powerful tool for analyzing multi-omics data, significantly contributing to cancer diagnosis, prognosis prediction, and treatment planning (31). To develop a robust prognostic model, we created a novel computational framework that integrates 101 combinations derived from 10 different machine learning algorithms. Ultimately, we constructed a robust FAMGM comprising just nine genes (ACSL1, ALDH2, SUCLA2, APOD, ACSL5, CNR1, ECHDC2, PSME1, and SLC27A2). Long-chain acyl-CoA synthetase family members 1 and 5 (ACSL1 and ACSL5) are isoenzymes of the long-chain fatty acid CoA ligase family. ACSL1 has been found to promote BRCA invasion and metastasis by activating long-chain fatty acids (32), whereas ACSL5 exhibits pro-apoptotic, tumor-suppressive functions that potentially influence BRCA prognosis (33). ALDH2, an aldehyde dehydrogenase critical for maintaining tumor cell stemness, is highly expressed in triple-negative and luminal B subtypes of BRCA (34). SUCLA2 enhances cancer cell resistance to anoikis by regulating redox homeostasis, positioning it as a potential tumor regulator (35). APOD inhibits BRCA cell proliferation and serves as a favorable prognostic biomarker (36). CNR1 is a G protein-coupled receptor in the endocannabinoid system whose role in BRCA remains to be fully elucidated (37). ECHDC2 is involved in fatty acid β-oxidation and is predicted to disrupt cancer cell energy supply by inhibiting glycolysis and shifting metabolism towards oxidative phosphorylation, thereby suppressing cancer progression (38). PSME1 is a component of the immunoproteasome and plays a key role in the presentation of tumor antigens via major histocompatibility complex (MHC) class I molecules. SLC27A2 encodes a protein involved in regulating lipid metabolism and potentially impacts tumor progression, making it a prospective therapeutic target in BRCA (39).
Based on the median FAMGM levels, survival rates were lower in the high-risk group compared to the low-risk group across both training and validation cohorts. It is important to note that overfitting remains a significant challenge when modeling large-scale biological data and predicting disease outcomes using machine learning algorithms. Prognostic models often exhibit strong performance in training datasets but may underperform in validation datasets due to excessive complexity (40). To mitigate overfitting, we utilized the average C-index from both validation and training sets as a ranking criterion, minimizing overfitting effects. Notably, our carefully curated FAMGM model demonstrated superior predictive value in the TCGA dataset compared to other published models, highlighting the unique predictive capabilities of our algorithm.
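The ranking criterion described above, the average C-index across training and validation cohorts, can be sketched as below. The model labels, cohort names, and C-index values are all hypothetical placeholders and do not correspond to the algorithm combinations or results of the study.

```python
def rank_by_mean_c_index(c_index_by_model):
    """Rank candidate models by their mean C-index across all cohorts.

    c_index_by_model : dict of {model_name: {cohort_name: c_index}}
    Returns a list of (model_name, mean_c_index), best first.
    """
    means = {
        name: sum(per_cohort.values()) / len(per_cohort)
        for name, per_cohort in c_index_by_model.items()
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values: model_A fits the training cohort very well but
# generalizes poorly; model_B is more stable across cohorts.
toy_results = {
    "model_A": {"train": 0.95, "validation_1": 0.58, "validation_2": 0.60},
    "model_B": {"train": 0.75, "validation_1": 0.70, "validation_2": 0.71},
}
ranking = rank_by_mean_c_index(toy_results)
best_model, best_mean = ranking[0]
print(best_model, round(best_mean, 3))
```

In this toy example the more stable model ranks first even though its training C-index is lower, which is the behavior the averaging criterion is meant to encourage.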
In the context of immunotherapy, PD-L1, MSI, and TMB are typically regarded as the three most definitive predictive biomarkers (41). While there was no significant difference in MSI between the two groups, our study observed higher TMB and PD-L1 levels in the high-risk group. This finding suggests that patients in the high-risk group may respond more favorably to immune checkpoint inhibitors, potentially indicating stronger antitumor immunity. The TIDE tool, commonly used to predict the likelihood of tumor immune evasion, showed elevated scores in the low FAMGM group, supporting the link between immune evasion and low FAMGM and suggesting a higher sensitivity to immunotherapy in the high-risk group. This aligns with our analysis, highlighting that FAMGM might be a valuable biomarker for early identification of patients likely to respond to immunotherapy.
Furthermore, our study offers insights into the relationship between FAMGM characteristics and drug sensitivity in BRCA patients. We observed a significant correlation between the IC50 values of certain chemotherapy drugs and FAMGM expression levels. Notably, FAMGM was most strongly correlated with rapamycin, erlotinib, and LFM-A13, with the low FAMGM group showing increased sensitivity to these three drugs. Previous studies have reported that rapamycin can prevent cyclophosphamide-induced loss of primordial follicles and inhibit tumor proliferation in BRCA xenograft mouse models (42). Combining rapamycin with itraconazole can arrest cells in the G0/G1 phase of the cell cycle, potentially serving as a therapeutic option for patients with TNBC (43). Erlotinib has also been demonstrated to effectively inhibit the growth of cancer cells and the characteristics of cancer stem cells in TNBC when used in conjunction with monensin. This combination regulates both the EGFR/ERK and PI3K/AKT signaling pathways and reduces the protein levels of SOX2 and CD133 (44). LFM-A13, a specific inhibitor of Bruton’s tyrosine kinase and Polo-like kinases, has been shown to suppress 7,12-dimethylbenz(a)anthracene (DMBA)-induced BRCA progression in mice by inhibiting cancer cell proliferation and promoting apoptosis. Importantly, LFM-A13 does not harm cells of the immune system or other tissues and demonstrates enhanced efficacy when combined with paclitaxel (45). This combination could represent a promising strategy for BRCA treatment.
Currently, single-cell sequencing is extensively used in cancer research to reveal genetic and transcriptomic differences between individual cells. This technology can capture variations in gene expression, metabolic states, and other characteristics among different tumor cells, providing a precise assessment of gene functions within specific cell types and elucidating the molecular mechanisms underlying tumor initiation and progression (46). In our study, the core FAMGM genes were found to be significantly enriched in macrophages. TAMs, including classically activated M1 macrophages, M2 macrophages, and unactivated M0 macrophages, are regarded as critical components of the tumor microenvironment. M1 macrophages exhibit pro-inflammatory properties, secreting pro-inflammatory cytokines to activate the immune system and promote antitumor immunity (47). In contrast, M2 macrophages, which possess limited antigen-presenting capabilities, secrete anti-inflammatory cytokines that promote tumor growth, invasion, and metastasis, thereby facilitating tumor progression (48). The link between FAM and macrophages lies primarily in metabolic reprogramming, which regulates macrophage function and thereby affects immune regulation and tumor progression. TAMs support immunosuppression and tumor advancement via fatty acid oxidation, which is characteristic of the M2 phenotype. In liver cancer, FAM reprogramming has been reported to enhance the pro-tumor function of TAMs (49), and S100a4+ alveolar macrophages have been noted to accelerate precancerous lung lesions through palmitic acid metabolism (50). In our study, FAMGM genes were associated with the activation of signaling pathways in macrophages, such as AKT, p53, and NF-κB, as well as metabolic pathways including the pentose phosphate pathway and glycolysis.
Previous studies demonstrated that AKT activation upregulates fatty acid synthase (FASN) to enhance fatty acid synthesis and drive tumor growth (51), while NF-κB pathway activation promotes the secretion of pro-inflammatory factors (e.g., IL6 and TNF-α), induces lipolysis, and releases free fatty acids (FFAs) for tumor utilization (52). These findings advance our understanding of the role of FAMGM in BRCA and its association with malignant tumor progression. Future efforts will focus on elucidating the specific roles of each model gene within cellular populations and further exploring the complex relationship between FAM and BRCA.
Nevertheless, this study has certain limitations. Firstly, the training dataset comes from the TCGA database, in which there is a notable disparity between the numbers of cancer and normal samples. Secondly, the data were sourced entirely from public databases, which may introduce selection bias, and the quality of the genomic data could directly influence the accuracy and reliability of our findings. Thirdly, we categorized BRCA samples into high and low FAMGM groups based on the median FAMGM score, whereas an optimal FAMGM cutoff might stratify BRCA patients better. Lastly, this study is based on a retrospective analysis, and its conclusions require validation through larger-scale prospective studies. Additionally, further in vivo and in vitro experiments are necessary to substantiate our findings.

Conclusions

Our study developed a robust prognostic model based on FAM-related genes. Through consensus clustering, we identified two molecular subtypes of BRCA, uncovering their distinct prognostic outcomes and clinical characteristics, which may enhance the molecular classification of BRCA. By employing a machine learning algorithm framework, we established FAMGM, which demonstrated superior performance across multiple cohorts. FAMGM serves as an effective tool for accurately predicting patient prognosis and identifying patient groups that are more likely to benefit from immunotherapy or chemotherapy, thereby providing a scientific basis for personalized treatment decisions. In essence, our study clarifies the potential biological roles of FAM-related genes in BRCA, offering a theoretical foundation and background support for future clinical translational research.

Supplementary

The article’s supplementary files as

Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.
