Challenges to case-only analysis for interaction detection using polygenic risk scores: model assumptions and biases in large biobanks.
APA
Zhang W, Lu Q, Lu T (2026). Challenges to case-only analysis for interaction detection using polygenic risk scores: model assumptions and biases in large biobanks. Genetics, 232(3). https://doi.org/10.1093/genetics/iyag006
MLA
Zhang W, et al. "Challenges to case-only analysis for interaction detection using polygenic risk scores: model assumptions and biases in large biobanks." Genetics, vol. 232, no. 3, 2026.
PMID
41529202
Abstract
Understanding gene-environment and gene-gene interactions is important for studying complex diseases. Case-only analysis has been proposed to improve power for detecting interactions. However, case-only analysis relies on key assumptions, including correct specification of the disease risk model and marginal independence between variables. In this study, we systematically investigate the challenges of case-only analysis using polygenic risk scores (PRS) as genetic variables in large biobanks. Through simulations, we demonstrate that the false positive control of PRS-based case-only analysis depends on the log-linear disease risk model and weak main effects, and that it is prone to false positives under other commonly used disease risk models. We then conduct case-only analyses for breast cancer, prostate cancer, class 3 obesity, and short stature in the UK Biobank, using PRS derived from non-overlapping chromosome sets (e.g. even-numbered and odd-numbered chromosomes) that are unlikely to interact with each other. The resulting case-only regression estimates consistently show negative shifts compared to population-based estimates, suggesting false positives driven by collider bias due to model misspecification. Furthermore, correlations between chromosome set-specific PRS, likely driven by assortative mating or population stratification, suggest additional sources of confounding. Our results underscore the challenges of applying PRS-based case-only analysis in large biobank settings and highlight the need for caution when interpreting case-only results.
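The dependence on the disease risk model described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or simulation design; the function name `case_only_slope`, the sample size, and all effect sizes are assumptions chosen only to make the pattern visible. Two independent standard normal scores stand in for chromosome set-specific PRS, disease status is drawn with no interaction term, and the case-only estimate is taken as the slope from regressing one score on the other among cases. Under a log-linear risk model the case distribution still factorizes, so the slope should stay near zero; under a logistic model with strong main effects, selecting on case status induces a negative association (collider bias) even though no interaction was simulated.

```python
import numpy as np


def case_only_slope(n, b0, b1, b2, link, seed=0):
    """Simulate two independent scores and a disease with NO interaction,
    then return the case-only regression slope of g1 on g2 and the case count.

    All parameter values passed in are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    g1 = rng.standard_normal(n)  # stand-in for PRS from one chromosome set
    g2 = rng.standard_normal(n)  # stand-in for PRS from the other set
    eta = b0 + b1 * g1 + b2 * g2  # main effects only, no g1*g2 term

    if link == "log":
        # Log-linear risk model: P(D) = exp(eta), capped at 1 for safety
        risk = np.minimum(np.exp(eta), 1.0)
    elif link == "logit":
        # Logistic risk model: P(D) = 1 / (1 + exp(-eta))
        risk = 1.0 / (1.0 + np.exp(-eta))
    else:
        raise ValueError("link must be 'log' or 'logit'")

    case = rng.random(n) < risk
    # Case-only analysis: association between the two scores among cases
    slope = np.polyfit(g2[case], g1[case], 1)[0]
    return slope, int(case.sum())


# Rare disease, weak main effects, log-linear model: slope expected near 0
slope_log, n_log = case_only_slope(500_000, np.log(0.02), 0.3, 0.3, "log")

# Common disease, strong main effects, logistic model: slope expected below 0
slope_logit, n_logit = case_only_slope(500_000, -1.0, 1.0, 1.0, "logit")

print(f"log-linear: slope={slope_log:+.3f} (cases={n_log})")
print(f"logistic:   slope={slope_logit:+.3f} (cases={n_logit})")
```

A negative slope in the logistic setting would be read as an interaction by a case-only test, which is the kind of false positive the abstract attributes to model misspecification. This toy setup does not model assortative mating or population stratification, which the abstract names as additional sources of correlation between the scores.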
MeSH Terms
Humans; Multifactorial Inheritance; Biological Specimen Banks; Models, Genetic; Male; Prostatic Neoplasms; Female; Genetic Predisposition to Disease; Breast Neoplasms; Genome-Wide Association Study; Gene-Environment Interaction; Genetic Risk Score
Highly cited papers by the same first author (5)
- USP32 Promotes Cancer Cell Invasion, Macrophage M2 Polarization, and CD8+ T Cell Apoptosis in Gastric Cancer Through Upregulation of DAPK1.
- United multi-omics and machine learning refine regulatory T cell-defined hepatocellular carcinoma subtypes.
- A SLC7A5-Specific Near-Infrared Fluorescent Probe for Cancer-Targeted Imaging Applications.
- Dynamic liver dysfunction predicts poor survival in patients with EGFR-mutant non-small cell lung cancer and liver metastases treated with EGFR tyrosine kinase inhibitors.
- Development of liquid biopsy for screening colorectal cancer through the combination of an antibody microarray-based metal-enhanced sandwich immunofluorescent assay of cytokines with machine learning.