Detecting and mitigating doppelgänger bias in microbiome data: impacts on machine learning and disease classification.

Gut microbes, 2025, Vol. 17(1), p. 2554196

Zhou R, Ng SK, Sung JJY, Wong SH, Goh WWB

Cite this article

APA Zhou R, Ng SK, et al. (2025). Detecting and mitigating doppelgänger bias in microbiome data: impacts on machine learning and disease classification. Gut microbes, 17(1), 2554196. https://doi.org/10.1080/19490976.2025.2554196
MLA Zhou R, et al. "Detecting and mitigating doppelgänger bias in microbiome data: impacts on machine learning and disease classification." Gut microbes, vol. 17, no. 1, 2025, p. 2554196.
PMID: 40888678

Abstract

Highly similar microbiome samples - so-called "doppelgänger pairs" - can distort analysis outcomes, yet are rarely addressed in microbiome studies. Here, we demonstrate that even a small proportion of such pairs (1-10% of samples) can substantially inflate machine learning performance across diverse disease cohorts including colorectal cancer (CRC), inflammatory bowel diseases (IBD), Clostridioides difficile infection (CDI), and obesity. Doppelgänger pairs also bias statistical tests and distort microbial network topology. In predictive models, classification accuracy was artificially boosted by 15-30 percentage points across KNN, SVM, and Random Forest classifiers. In association testing, doppelgängers increased false-positive rates and decreased effect size stability; their removal reduced bootstrap variance by up to 28.3%. Moreover, the removal of doppelgängers yielded more stable networks. These effects were consistently observed across 16S, shotgun metagenomic, and simulated datasets. By accounting for highly similar samples, we reduce analytical noise and false discoveries, ultimately enabling more accurate and biologically meaningful microbiome insights.
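
The abstract does not say how sample similarity is measured or what cut-off defines a doppelgänger pair. As a rough illustration only, the sketch below assumes Pearson correlation between abundance profiles and a hypothetical threshold of 0.95, then flags similar pairs that straddle a train/test split, which is the situation in which inflated classifier accuracy would be expected. The function names, the threshold, and the toy profiles are all assumptions for illustration and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: correlation undefined, treat as 0
    return cov / sqrt(var_x * var_y)


def find_similar_pairs(profiles, threshold=0.95):
    """Return (i, j, r) for every sample pair with correlation >= threshold.

    The default threshold is a hypothetical placeholder, not a value
    reported in the paper.
    """
    pairs = []
    for i, j in combinations(range(len(profiles)), 2):
        r = pearson(profiles[i], profiles[j])
        if r >= threshold:
            pairs.append((i, j, r))
    return pairs


def pairs_across_split(pairs, train_idx, test_idx):
    """Keep only pairs with one member in training and the other in test."""
    train, test = set(train_idx), set(test_idx)
    return [
        (i, j, r) for i, j, r in pairs
        if (i in train and j in test) or (i in test and j in train)
    ]


# Toy relative-abundance profiles (made-up numbers, four taxa per sample).
profiles = [
    [0.50, 0.30, 0.15, 0.05],  # sample 0
    [0.49, 0.31, 0.15, 0.05],  # sample 1: near-duplicate of sample 0
    [0.05, 0.15, 0.30, 0.50],  # sample 2
    [0.25, 0.25, 0.40, 0.10],  # sample 3
]

similar = find_similar_pairs(profiles)
leaks = pairs_across_split(similar, train_idx=[0, 2], test_idx=[1, 3])
```

In this toy split, samples 0 and 1 form the only flagged pair and they fall on opposite sides of the split. One possible mitigation under these assumptions is to drop one member of each such pair, or to keep both members on the same side, before evaluating a classifier.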
