
SELE is associated with reduced breast cancer susceptibility: Evidence from Mendelian randomization and single-cell transcriptome.

Translational Oncology, 2026, Vol. 66, p. 102713 (open access)

Chen H, Hu W, Liu R, Liu Q, Cheng X


Citation: Chen H, Hu W, et al. (2026). SELE is associated with reduced breast cancer susceptibility: Evidence from Mendelian randomization and single-cell transcriptome. Translational Oncology, 66, 102713. https://doi.org/10.1016/j.tranon.2026.102713 (PMID 41722201)

Abstract

[BACKGROUND] The role of circulating proteins in breast cancer (BC) early diagnosis remains unclear. We investigated genetically predicted associations between circulating proteins and BC risk using Mendelian randomization (MR). This study aims to identify novel protein biomarkers through an integrative multi-omics approach.

[METHODS] Using a two-sample MR framework, we assessed genetically determined circulating protein associations with BC risk/subtypes. Analysis incorporated large-scale protein quantitative trait loci (pQTL) and genome-wide association studies (GWAS) data, strengthened by cross-validation, sensitivity analyses (MR-Egger, MR-PRESSO), and meta-analysis. We further performed genetic colocalization, molecular docking, and phenome-wide MR (PheWAS-MR). Bulk and single-cell RNA sequencing data were analyzed to compare gene expression of causal proteins between healthy and BC tissues. This multi-layered validation enhances the robustness of causal inference.

[RESULTS] Three circulating proteins are associated with reduced BC risk: SELE (OR = 0.98, 95 % CI: 0.97-0.98), CDH1 (OR = 0.94, 95 % CI: 0.93-0.95), and ALPI (OR = 0.95, 95 % CI: 0.94-0.96). CNTNAP2 is associated with elevated BC risk (OR = 1.02, 95 % CI: 1.01-1.03). Colocalization supported shared causal variants for SELE and ALPI. Molecular docking simulation indicates high binding affinity between SELE and simvastatin. SELE expression was significantly reduced in endothelial cells of BC tissue, and PheWAS-MR revealed associations between SELE and 123 phenotypes, highlighting its extensive pleiotropic effects.

[CONCLUSIONS] This study provides robust genetic evidence for the causal roles of SELE, CDH1, and ALPI in reducing BC risk. The integrative proteomic-genetic-transcriptomic approach identifies potential therapeutic targets and offers new insights into BC pathogenesis, presenting hypotheses for clinical validation.


Introduction

According to the 2022 Global Cancer Statistics, breast cancer (BC) is the most prevalent malignant tumor and the leading cause of cancer-related death in women worldwide [1]. Early detection and risk assessment are crucial for prevention and treatment, yet simple and reliable blood-based tools for early screening remain scarce. This underscores the urgent need to identify new biomarkers for BC susceptibility. Given evidence linking proteins such as alpha-1-acid glycoprotein (AGP) and dermal protein to tumor progression and metastasis [2], exploring their potential as biomarkers could offer valuable insights for early detection and personalized treatment strategies.
Circulating proteins have been related to multiple cancer risks, including BC. Circulating proteins are involved in multiple biological functions through various pathways, including cancer cell secretomes or immune system inducers/effectors [3]. The "cancer cell secretome" consists of proteins secreted or released by cancer or cancer-related cells or different types of cells that are part of dynamic interactions in a highly complex local tumor environment [4]. For instance, a large pooled study found that circulating insulin-like growth factor 1 (IGF1) is associated with BC risk [5]. Thus, circulating proteins represent promising biomarkers for refining risk assessment and early intervention strategies.
Most existing studies on circulating proteins and BC risk are observational and are therefore vulnerable to reverse causality, residual confounding, and measurement error. Mendelian randomization (MR), a method that uses genetic variants as instruments to infer genetically predicted associations, has emerged as a powerful tool to address these limitations. This approach minimizes confounding and reverse causality, providing more reliable evidence for genetically predicted associations [6]. Previous MR studies have identified several proteins associated with BC risk. For example, Shu et al. used large-scale protein quantitative trait locus (pQTL) data to identify 56 significant proteins [7], while another study linked cluster of differentiation 160 (CD160), 2'-deoxynucleoside 5'-phosphate N-hydrolase 1 (DNPH1), layilin (LAYN), leucine rich repeat containing 37 member A2 (LRRC37A2), and toll like receptor 1 (TLR1) to BC risk [8]. However, many studies rely on a single pQTL dataset, which limits generalizability, and few integrate multi-omics validation.
In this study, we applied a two-sample MR framework with cross-validation using summary statistics from four independent data sources. To enhance robustness, we incorporated meta-analysis, colocalization analysis, and sensitivity tests for pleiotropy and heterogeneity. We further explored genetically predicted associations between proteins and BC subtypes, and validated findings using bulk and single-cell transcriptomic data from healthy and BC tissues. Our multi-omics approach strengthens causal inference and provides deeper insights into underlying mechanisms. Our primary goal was to identify circulating proteins with causal links to BC risk, offering potential targets for biological research, drug development, and early screening strategies.

Materials and methods

We first employed two-sample MR to identify proteins associated with BC susceptibility. This study was conducted in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology using Mendelian Randomization (STROBE-MR) guidelines [9]. To ensure the reliability and reproducibility of our findings, we integrated data from multiple sources (two exposure datasets and two outcome datasets) and adhered to a structured "discovery and cross-validation" framework. This analytical strategy was designed to minimize false positives and enhance the robustness of causal inference. Furthermore, we used "coloc" colocalization analysis to explore the potential genetic mechanisms underlying the proteins flagged by MR. We also investigated the causal relationship between these significant proteins and BC subtypes. Additionally, we examined the expression patterns of the RNAs encoding these proteins in both bulk BC tissue and single-cell data to understand their expression dynamics. Finally, to assess the broad health implications of a protein that is differentially expressed in cancerous tissue, we conducted a phenome-wide Mendelian randomization (PheWAS-MR) study using comprehensive multi-omics data. Through this approach, we aimed to identify proteins present in blood and tissues that may be causally linked to BC development, thereby providing omics-based insights for early detection and targeted treatment of BC.

Exposure data source
MR analysis requires exposure and outcome samples with matching genetic backgrounds, so we used data from multiple sources of European ancestry. Exposure data were collected from the two largest pQTL resources: pQTL data for 4907 aptamers from 35,559 Icelanders (DeCODE genetics) [10], and large-scale pQTL data for 2940 proteins from 54,219 UK Biobank participants (UKBPPP) [11]. Proteomic analysis of plasma samples was conducted by DeCODE using the SomaScan platform and by UKBPPP using the Olink platform. The use of two independent pQTL datasets allowed for cross-validation of findings and enhanced the generalizability of results.

Outcome data source
The outcome data for BC susceptibility were sourced from two large-scale GWAS consortia to maximize statistical power and population representativeness. We used summary statistics from the FinnGen study (Release 9, R9, n = 182,869) [12] and from the Breast Cancer Association Consortium (BCAC, n = 214,675) [13]. Both consortia provide well-curated GWAS summary statistics for BC susceptibility, facilitating reliable two-sample MR analyses. Details are provided in Table 1 and the corresponding original publications.

Two sample MR analysis and cross-validation
MR uses genetic variants strongly linked to exposure factors as instrumental variables to estimate genetically predicted associations. An instrumental variable (IV) is defined here as a SNP that is strongly associated with the exposure (p < 5e-8) and not significantly associated with the outcome (p > 1e-5). We used the "TwoSampleMR" R package [14] for our MR analysis. The inverse variance weighted (IVW) method, a widely adopted and statistically efficient technique [15], was our primary choice. If only one IV is available, the Wald ratio method [16] is the only option.
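As a rough illustration of the two estimators named above, the following Python sketch computes a Wald ratio for a single instrument and a fixed-effect IVW estimate for several instruments. It is not the TwoSampleMR implementation. The function names, the first-order standard error, and the toy numbers are assumptions made for this example.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument estimate: SNP-outcome effect divided by SNP-exposure effect."""
    estimate = beta_outcome / beta_exposure
    # First-order approximation that ignores uncertainty in beta_exposure (assumption).
    std_error = abs(se_outcome / beta_exposure)
    return estimate, std_error


def ivw_estimate(betas_exposure, betas_outcome, ses_outcome):
    """Fixed-effect inverse variance weighted estimate across instruments."""
    if len(betas_exposure) == 1:
        # With one instrument, IVW reduces to the Wald ratio.
        return wald_ratio(betas_exposure[0], betas_outcome[0], ses_outcome[0])
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(betas_exposure, betas_outcome, ses_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(betas_exposure, ses_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Toy summary statistics (hypothetical values, not taken from the study).
log_or, se = ivw_estimate([0.1, 0.2], [0.01, 0.02], [0.01, 0.01])
odds_ratio = math.exp(log_or)  # MR estimates on the log-odds scale are reported as ORs
```

Exponentiating the pooled log-odds estimate gives the OR per unit increase in the genetically predicted protein level.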
First, we collected the IVs of each circulating protein from DeCODE. Next, we used PLINK (v1.9) for linkage disequilibrium (LD) clumping to obtain approximately independent genetic IVs. Given that proteins do not have LD structures as complex as those of diseases, independence was defined as R² < 0.1 within a clumping window of 100 kb [17]. IVs with F < 10 were defined as weak and removed to avoid weak instrument bias. The F-statistic of each IV was calculated as F = R² × (N − 2) / (1 − R²), where R² is the proportion of variance in the exposure explained by the IV and N is the sample size; R² was calculated as R² = 2 × EAF × (1 − EAF) × β², where EAF is the effect allele frequency and β is the estimated effect size [18]. We then extracted these IVs from the outcome data. If certain SNPs were not present in the outcome data, we did not seek proxies for them. Third, exposure and outcome SNP effects were harmonized so that they referred to the same effect allele. The "mr" function was then used to process the harmonized data and generate MR results. To mitigate false positives from multiple testing, we applied the "p.adjust" function to perform false discovery rate (FDR) correction. A significance threshold of FDR < 0.05 was applied for all primary analyses.
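The weak-instrument filter can be sketched in a few lines of Python. The sketch assumes the commonly used approximations R² = 2 × EAF × (1 − EAF) × β² and F = R² × (N − 2) / (1 − R²), with β expressed in standard-deviation units of the exposure. The function names, dictionary keys, and SNP labels are hypothetical.

```python
def variance_explained(eaf, beta):
    """Approximate R2 of one SNP (assumes beta is in SD units of the exposure)."""
    return 2.0 * eaf * (1.0 - eaf) * beta ** 2


def f_statistic(r2, n):
    """Single-instrument F statistic from R2 and the exposure sample size n."""
    return r2 * (n - 2) / (1.0 - r2)


def keep_strong_instruments(snps, n, threshold=10.0):
    """Return the SNPs whose F statistic is at least `threshold`."""
    strong = []
    for snp in snps:
        r2 = variance_explained(snp["eaf"], snp["beta"])
        if f_statistic(r2, n) >= threshold:
            strong.append(snp)
    return strong


# Hypothetical instruments; 35,559 is the DeCODE sample size quoted in the text.
candidates = [
    {"snp": "rs_strong_example", "eaf": 0.5, "beta": 0.10},
    {"snp": "rs_weak_example", "eaf": 0.5, "beta": 0.01},
]
retained = keep_strong_instruments(candidates, n=35559)
```

Only the first hypothetical SNP survives the F ≥ 10 filter here, because the second explains too little variance in the exposure.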
Heterogeneity and pleiotropy are primary factors influencing the validity of MR studies. We employed Cochran's Q statistic [19] to evaluate heterogeneity and applied the MR-Egger method to assess pleiotropy within the integrated dataset. Additionally, we conducted MR-PRESSO analysis to detect and correct for potential horizontal pleiotropy. A p-value exceeding 0.05 was taken to indicate no significant heterogeneity or pleiotropy. Furthermore, we conducted an MR Steiger directionality test to rule out reverse causality.
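To make the MR-Egger idea concrete, the sketch below fits a weighted regression of SNP-outcome effects on SNP-exposure effects with an intercept. An intercept far from zero hints at directional pleiotropy. This is a simplified point-estimate version written for illustration: it omits the standard errors and p-values that a full analysis would report, and the function name and toy data are assumptions.

```python
def mr_egger_point_estimates(betas_exposure, betas_outcome, ses_outcome):
    """Return (intercept, slope) of the weighted regression used by MR-Egger."""
    if len(betas_exposure) < 3:
        raise ValueError("MR-Egger needs at least three instruments")
    # Orient every SNP so that its exposure effect is positive.
    oriented = [(abs(bx), by if bx >= 0 else -by, se)
                for bx, by, se in zip(betas_exposure, betas_outcome, ses_outcome)]
    weights = [1.0 / se ** 2 for _, _, se in oriented]
    total = sum(weights)
    x_mean = sum(w * x for w, (x, _, _) in zip(weights, oriented)) / total
    y_mean = sum(w * y for w, (_, y, _) in zip(weights, oriented)) / total
    sxx = sum(w * (x - x_mean) ** 2 for w, (x, _, _) in zip(weights, oriented))
    sxy = sum(w * (x - x_mean) * (y - y_mean)
              for w, (x, y, _) in zip(weights, oriented))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Toy data built so that outcome = 0.01 + 0.1 * exposure after orientation.
intercept, slope = mr_egger_point_estimates(
    [0.1, -0.2, 0.3], [0.02, -0.03, 0.04], [0.01, 0.01, 0.02])
```

In this toy example the non-zero intercept (0.01) is the pleiotropy signal, while the slope (0.1) is the pleiotropy-adjusted causal estimate.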

Meta analysis for significant MR results
To ensure robustness of the results and to obtain a single pooled OR and p value for each protein, we performed a meta-analysis of all significant associations using the "meta" R package (V6.2–1) [20]. Heterogeneity among studies was examined using Cochran's Q test and Higgins's I² statistic [21]. If P < 0.05 or I² > 50 %, the studies were considered heterogeneous and we adopted the random effects model; otherwise, the fixed effect model was chosen as the main meta-analysis method. All meta-analysis results are presented with 95 % confidence intervals.
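The model-selection rule described above can be sketched as follows. This is an illustrative Python version, not the "meta" package itself. It assumes inverse-variance pooling on the log-odds scale and a DerSimonian-Laird estimate of between-study variance, which may differ from the estimator the authors used. All names and inputs are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def pool_log_odds(log_ors, std_errors):
    """Pool study-level log ORs; use random effects if Q p < 0.05 or I2 > 50 %."""
    if len(log_ors) < 2:
        raise ValueError("need at least two studies")
    weights = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (b - fixed) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    model = "fixed"
    if chi2_sf(q, df) < 0.05 or i2 > 50.0:
        model = "random"
        # DerSimonian-Laird between-study variance (assumed estimator).
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c)
        weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return {
        "model": model,
        "or": math.exp(pooled),
        "ci95": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "i2": i2,
    }


# Two hypothetical, perfectly concordant estimates select the fixed effect model.
result = pool_log_odds([-0.02, -0.02], [0.005, 0.005])
```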

Colocalization analysis
In a given gene region, the protein and BC may share one or more causal variants, so we used the "coloc" R package [22] for genetic colocalization analysis. We defined the 1 Mb region upstream and downstream of each protein's lead SNP as the test region. If the posterior probability of hypothesis 4 (PP.H4) was ≥ 0.8, the two traits were considered to have strong evidence of colocalization [23]. If PP.H4 was ≥ 0.5 and < 0.8, the two traits were considered to have moderate evidence of colocalization. This approach helps distinguish true causal associations from those driven by linkage disequilibrium.
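The window definition and the PP.H4 thresholds can be written down as a small helper. This Python sketch only encodes the decision rule stated above and does not perform colocalization itself. The function names, the example position, and the "insufficient" label for PP.H4 < 0.5 are assumptions.

```python
def colocalization_window(lead_snp_position, flank_bp=1_000_000):
    """Return (start, end) covering 1 Mb on each side of the lead SNP."""
    start = max(0, lead_snp_position - flank_bp)
    return start, lead_snp_position + flank_bp


def classify_pp_h4(pp_h4):
    """Map a coloc PP.H4 value onto the evidence categories used in the text."""
    if not 0.0 <= pp_h4 <= 1.0:
        raise ValueError("PP.H4 is a probability and must lie in [0, 1]")
    if pp_h4 >= 0.8:
        return "strong"
    if pp_h4 >= 0.5:
        return "moderate"
    return "insufficient"  # label chosen for this sketch, not used in the source


# Hypothetical lead SNP position on its chromosome.
window = colocalization_window(169_700_000)
```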

Molecular docking
In view of the causal relationship of SELE in reducing BC susceptibility, and studies showing that simvastatin can induce increased SELE expression [24], we conducted a molecular docking of SELE and simvastatin. This analysis was performed as an exploratory investigation to assess potential binding interactions. Molecular docking was performed using two software tools: AutoDock Tools 1.5.6 (https://autodocksuite.scripps.edu/adt/) and PyMOL (http://www.pymol.org/pymol). The 3D structure of the target protein (receptor) was obtained from the PDB database (https://www.rcsb.org/) [25], and co-crystal structures with RMSD values < 2 Å were selected for subsequent processing, including water removal and hydrogen atom addition. Small molecule ligand files were downloaded from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/) [26], converted using Open Babel GUI (https://openbabel.org/docs/GUI/GUI.html), and then docked with the receptor. Finally, PyMOL software was employed for visualization of the docking results. Binding energies were calculated to evaluate interaction strength, with lower values indicating more favorable binding.

MR analysis of proteins and BC subtypes
Based on the expression of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2), BC was classified into five molecular subtypes: luminal A-like, luminal B/HER2-negative, luminal B-like, HER2-enriched, and triple-negative or basal-like. We used the same MR approach to explore the causal relationship between each causal protein and the BC subtypes. Subtype-specific analyses were powered by the large sample sizes of the consortium data.

Bulk and single cell transcriptome analysis
To explore the expression of genes encoding causal proteins in BC tissue, we used bulk RNA-seq data and the largest collection of BC single cell expression data to date to analyze transcriptome-level RNA expression patterns and differences. The bulk RNA-seq data of female BC patients were collected from The Cancer Genome Atlas (TCGA) database [27]. It contains control data and bulk sequencing expression of BC tissue. Single cell data were derived from GEO (https://www.ncbi.nlm.nih.gov/geo/, GSE161529) [28] and included control data and single cell sequencing data from individual BC tissues. We applied standardized analytical methods to these data, which can be found in our previously published articles.

PheWAS-MR analysis
PheWAS-MR was performed to identify potential pleiotropic effects and clinical implications associated with the proteins of interest. This hypothesis-generating approach systematically evaluates the causal effects of genetically predicted protein levels on a wide spectrum of disease outcomes, allowing for the discovery of both potential therapeutic benefits and adverse effects.
We evaluated the associations between significant proteins and all available health outcomes in the FinnGen R9 database (n = 2099 phenotypes, with case numbers > 100 for sufficient statistical power). To account for the high-dimensional testing burden inherent in PheWAS, we applied FDR correction with a significance threshold of FDR < 0.05. Significant associations were further interpreted within the context of their phenotypic categories to identify potential shared biological pathways, such as vascular inflammation or immune regulation, rather than merely assessing pleiotropy as a source of bias.
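For readers unfamiliar with FDR adjustment, the sketch below reproduces the Benjamini-Hochberg procedure in plain Python. It assumes this is the variant applied here, since it is what R's p.adjust computes for method "BH" or "fdr"; the source does not name the method explicitly. The function name and p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four phenotypes.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = bh_adjust(raw)
significant = [i for i, value in enumerate(fdr) if value < 0.05]
```

With 2099 phenotypes the same function applies unchanged; only associations whose adjusted value falls below 0.05 would be carried forward.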

Results

The overall workflow is presented in Fig. 1. A detailed description of the methodology can be found in the Materials and methods section. In brief, we employed two-sample MR to identify circulating proteins associated with BC susceptibility. We used cross-validation, sensitivity analysis, meta-analysis, and other methods to ensure the reliability of our findings. Subsequently, we explored the genetic potential and expression patterns of these causal proteins through colocalization analyses, single-cell transcriptome profiling, PheWAS-MR, and additional complementary approaches.

Four proteins are significantly associated with BC risk
After LD clumping and rigorous screening of the exposure data, some proteins did not meet the IV criteria and were excluded from the MR analysis. The MR and sensitivity analysis results for each exposure-outcome pair are displayed in Table S1 (UKBPPP and FinnGen), Table S2 (DeCODE and FinnGen), Table S3 (UKBPPP and BCAC), and Table S4 (DeCODE and BCAC).
After cross-validation, four proteins passed all MR and sensitivity analyses, and their odds ratios (ORs) and 95 % confidence intervals (CIs) are shown in Fig. 2. Among these, three proteins, alkaline phosphatase intestinal (ALPI), cadherin 1 (CDH1), and selectin E (SELE), were associated with a reduced risk of BC, whereas contactin associated protein like 2 (CNTNAP2) was associated with an elevated BC risk. Notably, while the effect sizes for some proteins (particularly SELE with OR = 0.98) are modest and unlikely to be directly useful for individual-level risk prediction, they align with expectations for circulating protein biomarkers in MR studies and are interpreted as indicative of pathway-level biological signals worthy of further investigation.
After meta analysis, three proteins are associated with reduced BC risk—ALPI (Fig. 3A, random effects, OR = 0.94, 95 % CI: 0.93–0.96, p < 0.001), CDH1 (Fig. 3B, random effects, OR = 0.92, 95 % CI: 0.87–0.96, p < 0.001), SELE (Fig. 3D, common effect, OR = 0.98, 95 % CI: 0.97–0.98, p < 0.001). CNTNAP2 is associated with elevated BC risk (Fig. 3C, common effect, OR = 1.02, 95 % CI: 1.01–1.03, p < 0.001). Summary information regarding the instrumental variables—such as R² values and F-statistics—for the four proteins (ALPI, CDH1, CNTNAP2 and SELE) that showed significant associations is provided in Table S5. All instruments for our significant proteins demonstrated strong F-statistics (F > 15, exceeding the conventional threshold of 10), effectively minimizing the risk of weak instrument bias.

Colocalization analysis indicates shared causal variants
PP.H4 ≥ 0.8 indicates strong evidence of colocalization, and PP.H4 ≥ 0.5 and < 0.8 indicates moderate evidence. Colocalization analysis can verify whether two traits share a common variant in a specified gene region. Of the four proteins associated with BC risk, ALPI and SELE had strong colocalization evidence with BC risk (Fig. 4A, Table S6). The lack of strong colocalization evidence for CDH1 (PP.H4 < 0.5) suggests that its association with BC risk may be driven by distinct genetic mechanisms or pleiotropic pathways, warranting a more cautious interpretation compared to proteins with strong colocalization support (e.g., ALPI and SELE).
As a supplement, we drew a PhenoGram plot to show the genomic locations of the genes encoding the four proteins (Fig. 4B). The four genes are located on chromosomes 1, 2, 7, and 16.

The molecular docking simulation indicates high binding affinity of SELE-simvastatin
The molecular docking simulation of SELE-simvastatin was performed to evaluate the binding activity between the two based on binding energy. Lower binding energy indicates stronger binding affinity. Specifically, binding energy below −5 kJ/mol suggests relatively high binding affinity, while values below −7 kJ/mol indicate high binding affinity [29]. The results showed a binding energy of −8.62 kJ/mol, demonstrating high binding affinity between SELE and simvastatin. The visualization using PyMOL software revealed the interaction details, as shown in the figure below. It is important to note that this analysis represents a computational prediction that requires experimental validation and should be interpreted as hypothesis-generating rather than confirmatory.
Three binding sites—LYS-28, LEU-112, and ASN-85—were identified between the ligand simvastatin and the receptor SELE. Hydrogen bond length is a critical parameter in molecular docking, as it reflects the strength of intermolecular interactions and binding stability [30]. The hydrogen bond lengths between simvastatin and SELE were measured as 1.8 Å, 1.9 Å, and 2.1 Å. Shorter hydrogen bond lengths generally correspond to stronger hydrogen bonds, which contribute to stable complex formation. Among these, the 1.8 Å and 1.9 Å bonds are particularly strong, while the 2.1 Å bond is slightly weaker but still contributes to overall stability. These interactions were visualized and analyzed using PyMOL, confirming the critical role of hydrogen bonds in stabilizing the ligand-receptor complex.

Three proteins are significantly associated with BC subtype risk
Three of the four proteins were associated with the risk of BC subtypes (Fig. 5, Table S7). Specifically, ALPI was negatively associated with the triple-negative (TNBC) and luminal B/HER2-negative subtypes. This pattern hints that circulating ALPI may be negatively correlated with HER2 gene expression in cancer tissue, but more evidence is needed. CDH1 was negatively associated with the luminal B subtype. SELE was negatively associated with the luminal B/HER2-negative subtype. These subtype-specific associations highlight the potential differential roles of these proteins across molecular subtypes of BC.

SELE is differentially expressed between BC and normal tissues
Further transcriptome analysis was performed to investigate the expression patterns of genes encoding significant proteins in BC and normal control tissues. In the single cell transcriptome data, seven cell types were initially identified (Fig. 6A). Differentially expressed genes for each cell type between normal controls and BC tissue were displayed in Table S8. Of the four genes, ALPI was not significantly expressed in breast tissue, so the expression of this gene was not measured in the single-cell transcriptome data. This finding supports the hypothesis that ALPI likely influences BC risk through systemic mechanisms rather than direct local action in breast tissue. Among the other 3 genes, SELE expression decreased significantly in BC tissues (Fig. 6B). Specifically in the BC subtype, SELE expression decreased most significantly in ER+ and PR+ subtypes (Fig. 6C). In terms of cell type, SELE was expressed in endothelial cells at the highest level. Similarly, specific to endothelial cells, SELE expression was significantly reduced in BC tissues (Fig. 6D). Therefore, the expression changes of SELE in bulk transcriptome sequencing data were further explored. SELE in TCGA BC data is also significantly reduced compared with normal samples (Fig. 6E). Further comparison of paired samples showed the same pattern (Fig. 6F). The concordant reduction of SELE expression across multiple datasets and analytical approaches strengthens the biological relevance of this observation. We propose a working model wherein genetically elevated circulating SELE may confer systemic protection, while local tumor microenvironment suppresses endothelial SELE expression as a potential immune evasion mechanism; this model reconciles the observed genetic and transcriptomic data but requires further experimental validation.

SELE is associated with multiple health outcomes
To comprehensively assess the potential pleiotropic effects of SELE, PheWAS-MR analyses of SELE were performed against all FinnGen health outcomes. After FDR correction and sensitivity analysis, SELE was significantly associated with 123 distinct phenotypes spanning multiple disease categories, including gastrointestinal, vascular, respiratory, and autoimmune conditions. Notably, SELE was linked to an increased risk of certain diseases, such as carcinoma in situ of colon, vascular disorders of the intestines, and diseases of the respiratory system. This may be related to its expression in vascular endothelial cells. This extensive pleiotropy highlights both the biological importance of SELE and potential challenges for therapeutic targeting, as modulation might have widespread effects beyond BC risk reduction. Detailed analysis results are shown in Fig. 7 and Table S9.

Discussion

Circulating proteins may play a potential role in early screening and diagnosis of cancer [31]. In our study, we performed proteome-wide MR analyses to explore causality for >4000 circulating proteins associated with BC risk. We implemented strict quality control measures, sensitivity analyses, and p value adjustments. Additionally, we conducted a meta-analysis to verify the robustness of our findings. Out of thousands of proteins, four (ALPI, CDH1, CNTNAP2, SELE) were consistently found to be associated with BC risk. In transcriptome analysis, SELE expression was observed to be reduced in endothelial cells within BC tissues. Finally, PheWAS-MR analysis also explored the potential wide range of health effects that SELE may have. Our multi-omics approach aligns with recent advances in proteomic cancer research, demonstrating the value of integrating genetic instruments with transcriptomic and phenome-wide data to strengthen causal inference for complex diseases like BC [8,32].
Our study revealed that genetically determined SELE is associated with reduced BC susceptibility, suggesting that the SELE protein may play a protective role in BC development. Using single-cell RNA-seq data, we identified differentially expressed genes (DEGs) between normal and BC endothelial cells, showing that SELE expression was significantly reduced in BC endothelial cells compared to normal endothelial cells. This apparent discrepancy, in which genetically elevated circulating SELE is protective yet its local expression in tumors is reduced, is best interpreted as a working model rather than a resolved mechanism: systemic SELE levels could enhance immune surveillance, while tumor-induced endothelial dysfunction may suppress local SELE expression as an immune evasion mechanism [33]. SELE, a cell surface glycoprotein, plays a crucial role in the inflammatory response and immune system regulation. It is induced by various inflammatory mediators and serves as an important marker of endothelial cell activation. Although the effect size for SELE (OR = 0.98) is modest and unlikely to be directly useful for individual-level risk prediction, it aligns with expectations for circulating proteins in MR analyses and should be interpreted as a pathway-level signal worthy of further biological investigation, as even small effect sizes can inform population-level disease mechanisms [32]. Cancer cells bind to SELE on the blood vessel wall, adhere to endothelial cells, transmigrate through gaps between endothelial cells, and then enter surrounding tissues or migrate to distant organs via the bloodstream [34]. SELE has been shown to be associated with the metastasis of various malignant tumors, such as BC [35], gastric cancer [36], and colorectal cancer [37]. A prospective study found that preoperative serum soluble SELE levels were positively correlated with the TNM stage of BC and reflected the severity of invasive disease.
Consequently, numerous studies have focused on the development of targeted therapeutic strategies aimed at inhibiting the metastatic process of BC to enhance therapeutic efficacy [38,39]. Our molecular docking analysis suggesting potential SELE-simvastatin interaction should be interpreted as hypothesis-generating; while the computational prediction shows favorable binding energy, biological validation is needed to confirm functional relevance. We speculate that the role of SELE in reducing BC susceptibility may be closely related to its function in innate immune response. SELE promotes pathogen clearance and maintains immune response homeostasis by regulating the rapid and efficient recruitment, attachment, and migration of immune cells to infected or inflammatory areas. Notably, statins such as simvastatin exhibit pleiotropic immunomodulatory effects beyond cholesterol reduction, including suppression of pro-inflammatory cytokines (e.g., IL-1β, TNF-α) and modulation of GTPase signaling pathways (e.g., RhoA, Rac1), which may indirectly influence SELE-mediated immune surveillance in the tumor microenvironment [40,41]. Davies et al. found that SELE induces up-regulation of CD86 expression, thereby initiating and amplifying the innate immune response at an early stage [42]. CD86 activates T cells by binding to the CD28 molecule on T cells and plays a tumor immune role. This study may be related to our discovery of the underlying mechanism by which SELE reduces susceptibility to BC. The pleiotropic effects of SELE identified through PheWAS-MR, with associations to 123 phenotypes, highlight both its biological importance and potential challenges for therapeutic targeting; this extensive pleiotropy suggests that modulation of SELE may have widespread effects beyond BC risk reduction. Therefore, studying the role of SELE in immune regulation could shed light on its impact on the tumor microenvironment and provide new insights into the prevention and treatment of BC.
Our study found that ALPI is associated with a reduced BC risk, particularly in the TNBC and luminal B/HER2-negative subtypes. Our findings suggest an inverse relationship between circulating ALPI levels and HER2 expression in tissues, though further investigation is needed to establish causality. ALPI, an alkaline phosphatase, plays a critical role in maintaining the integrity of the intestinal mucosal barrier and gut function. Dysregulation of ALPI has been implicated in various gastrointestinal conditions, including inflammatory bowel disease (IBD), necrotizing enterocolitis, and metabolic syndrome. The role of ALPI in innate immunity is multifaceted, encompassing regulation of inflammation, modulation of immune cell function, maintenance of the intestinal barrier, microbiome balance, and lipid metabolism through diverse mechanisms. These functions collectively underscore its significant impact on overall immune health [43]. Given that ALPI is not significantly expressed in breast tissue, its protective effects likely operate through systemic mechanisms rather than local action; potential pathways include the gut-breast axis, immune modulation, and metabolic regulation, which represent promising directions for future research [8]. Malo et al. [44] demonstrated that zinc finger binding protein-89 (ZBP-89) modulates endogenous ALPI gene expression in the human colorectal cancer cell lines HT-29 and Caco-2 and may have a tumor suppressive effect. Together, these findings suggest that intestinal alkaline phosphatase may be involved in reducing BC risk through a variety of mechanisms, including its role in anti-inflammatory response, metabolic regulation, immune response, and hormone metabolism. However, studies on ALPI and BC are still lacking. Our study shows that genetically determined ALPI is associated with a reduced BC risk, which provides a potential opportunity for the treatment of BC, especially the TNBC subtype, for which treatment options are relatively limited.
Our study showed that CDH1 was associated with a reduced risk of Luminal B subtype BC. CDH1 (E-cadherin) is a calcium-dependent cell adhesion molecule belonging to the cadherin family. It acts as a tumor suppressor: loss of CDH1 disrupts the cell-cell adhesion system, resulting in decreased cell adhesion, release of cytoplasmic β-catenin, enhanced Wnt signaling, and increased tumor aggressiveness [45]. In certain malignancies, including breast, gastric, and colorectal cancers, reduced CDH1 expression is associated with increased tumor aggressiveness, lymph node metastasis, and poor prognosis [46]. Bruner et al. suggest that early loss of CDH1 activates two distinct carcinogenic signals. First, cytoplasmic p120 induces constitutive actomyosin activation and subsequent anchorage independence. Second, loss of CDH1 leads to hypersensitization of growth factor receptor (GFR) signaling, which weakens transcriptional inhibition of the key pro-apoptotic factors BMF and BIM [47]. CDH1 may therefore protect against BC through its functions in cell adhesion, signal regulation, the tumor microenvironment, and genomic stability. The association between CDH1 and reduced Luminal B subtype risk is consistent with its established role as a tumor suppressor. However, the lack of strong colocalization evidence (PP.H4 < 0.5) suggests that this relationship may involve complex pathways or pleiotropic effects rather than shared causal variants, which distinguishes CDH1 from proteins with stronger colocalization support such as SELE and ALPI. The exact biological mechanism remains unclear and requires further study.
Our research has several key advantages. First, we aggregated and integrated summary statistics from four distinct sources to perform the MR analysis, which enabled cross-validation and a comprehensive exploration of the relationship between the broadest available pQTL data and BC risk, thereby reducing false positives. Second, we conducted rigorous sensitivity analyses and meta-analyses to validate our findings, enhancing the reliability and accuracy of the results. Finally, we performed scRNA-seq analysis of the gene expression profiles of disease-associated proteins in BC tissues, combining these omics data and methods to provide complementary evidence and a foundation for further biological research. The integration of multiple analytical approaches strengthens causal inference and addresses key assumptions of MR, as demonstrated in recent comprehensive proteomic studies.
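To make the pooling step concrete, the following is a minimal sketch of how per-dataset MR estimates for one protein could be combined. It assumes a fixed-effect, inverse-variance-weighted model on the log odds ratio scale; the text above does not state which meta-analysis model was used, so this choice, the function name `fixed_effect_meta`, and the input numbers are illustrative assumptions rather than the study's actual procedure or results.

```python
import math

def fixed_effect_meta(log_ors, ses):
    """Pool per-dataset log odds ratios using inverse-variance weights.

    Assumption: fixed-effect model (one shared true effect across datasets).
    """
    if len(log_ors) != len(ses) or not log_ors:
        raise ValueError("log_ors and ses must be non-empty and equal in length")
    weights = [1.0 / se ** 2 for se in ses]      # weight = 1 / variance
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = 1.959963984540054                        # 97.5th percentile of N(0, 1)
    return {
        "log_or": pooled,
        "se": pooled_se,
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - z * pooled_se),
        "ci_high": math.exp(pooled + z * pooled_se),
    }

# Hypothetical (log OR, SE) pairs for one protein in four datasets;
# these are placeholders, not values reported by the study.
result = fixed_effect_meta([-0.02, -0.03, -0.01, -0.025],
                           [0.010, 0.020, 0.015, 0.010])
print(f"OR {result['or']:.3f} "
      f"(95% CI {result['ci_low']:.3f}-{result['ci_high']:.3f})")
```

A random-effects model would be the natural alternative if the estimates differed more than their standard errors suggest.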
Several limitations of our analysis should be noted. First, we did not account for gender differences when processing the exposure data, whereas the outcome data were restricted to women, which may affect the generalizability and accuracy of our findings. Second, all participants were of European descent, potentially limiting the applicability of our results to other ethnic or genetic backgrounds. Third, to ensure the validity of our results, we applied strict selection criteria, which may have excluded some true associations (false negatives). Finally, as with all MR studies, we cannot completely rule out residual confounding or pleiotropy despite our rigorous sensitivity analyses; future studies in diverse populations are needed to validate these associations.
In conclusion, this study investigated the potential causal relationship between circulating proteins and BC risk using a multi-omics approach. Levels of three circulating proteins (SELE, CDH1, and ALPI) were associated with reduced BC risk, and transcriptome data supported a role for SELE in tissue. Future research should prioritize functional validation of these proteins' roles in BC pathogenesis: testing the proposed working model for SELE's dual roles and the therapeutic implications of its pleiotropic effects, clarifying the systemic mechanisms through which ALPI may influence BC risk, and investigating the distinct pathways underlying CDH1's association in the absence of strong colocalization evidence. Studies should also consider the roles of these proteins across BC subtypes and their interactions with other known risk factors. Understanding these mechanisms could inform new prevention and treatment strategies, facilitate personalized treatment plans, and help prevent potential adverse effects.

Conclusions

This study provides robust evidence for genetically predicted associations between circulating proteins and BC risk through a comprehensive multi-omics MR framework. Employing cross-validation, sensitivity analyses (e.g., MR-Egger, MR-PRESSO), and FDR corrections, we consistently identified four proteins, ALPI, CDH1, and SELE (protective) and CNTNAP2 (risk), with stable effect estimates across diverse datasets. Novel findings include SELE's association with reduced risk, supported by transcriptomic downregulation in tumor endothelial cells and hypothesis-generating molecular docking with simvastatin, suggesting potential immune-mediated pathways. Limitations include the exclusive use of European-ancestry data and unaccounted gender-specific effects, which may affect generalizability. Future work should prioritize validation in multi-ethnic cohorts, experimental mechanistic studies, and investigations into subtype-specific therapeutic applications.
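As an illustration of the FDR correction mentioned above, the sketch below computes adjusted p-values with the Benjamini-Hochberg step-up procedure. The text does not name the FDR method, so Benjamini-Hochberg is an assumption here, and the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical MR p-values for four proteins (not from the study).
raw = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(raw)
significant = [i for i, value in enumerate(q) if value < 0.05]
print(q, significant)
```

A protein would then be carried forward only if its adjusted value falls below the chosen FDR threshold, 0.05 in this sketch.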

Ethical approval and consent to participate

All studies were approved by their corresponding ethics review boards, and all subjects provided informed consent. This study used publicly available GWAS summary statistics data without individual information, and thus no ethical approval was required.

Consent for publication

This study has been approved by all authors for publication.

Availability of supporting data

The GWAS summary statistics are available at https://bcac.ccge.medschl.cam.ac.uk/bcacdata/oncoarray/oncoarray-and-combined-summary-result/ and https://finngen.gitbook.io/documentation/data-download. The deCODE pQTL data are available at https://www.decode.com/summarydata/. The UKB-PPP pQTL data are available at https://metabolomips.org/ukbbpgwas/.

Funding

The study was supported by the Natural Science Foundation of Henan Province of China (232300421183); the Henan Province Traditional Chinese Medicine Key Discipline Construction Project (Grant No. CZ0366-07); and the Doctoral Research Foundation of the First Affiliated Hospital of Henan University of Chinese Medicine (Grant No. 2024BSJJ044).

CRediT authorship contribution statement

Hanghang Chen: Writing – original draft, Visualization, Software, Methodology, Funding acquisition, Formal analysis, Conceptualization. Weihua Hu: Writing – original draft, Validation. Ruidong Liu: Validation, Data curation. Qi Liu: Visualization, Software. Xufeng Cheng: Writing – review & editing, Software, Funding acquisition, Conceptualization.

Declaration of competing interest

The authors declare that they have no competing interest.

Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.
