
Gut fungal landscape in colorectal cancer and its cross-kingdom interplay with gut microbial ecology.

iScience, 2026, Vol. 29(2), p. 114664 (open access)

Yinhang W, Xueli J, Zheng W, Xiaojian Y, Shu X, Qingjie Z


Cite this article

Yinhang W, Xueli J, et al. (2026). Gut fungal landscape in colorectal cancer and its cross-kingdom interplay with gut microbial ecology. iScience, 29(2), 114664. https://doi.org/10.1016/j.isci.2026.114664
PMID: 41704769

Abstract

The gut microbiota is a key hallmark of colorectal cancer (CRC), yet gut fungi remain understudied. We characterized the gut fungal landscape and its associations with bacteria, metabolites, and trace elements in CRC using fecal samples from healthy controls (n = 401), colorectal polyp patients (n = 162), and CRC patients (n = 253). Fungal annotation was performed using genomic data from NCBI (PRJNA833221) as reference. Fungal diversity increased in CRC patients, with seven genera showing differential abundance. Rhizopus was specifically enriched in CRC, while Sporisorium and Cladosporium were enriched in polyps. An ablation study identified an optimal 31-microbial-marker panel (28 bacteria and three fungi) that effectively distinguished intestinal disease groups (AUC = 0.89). Structural equation modeling revealed three fungal markers (Penicillium citrinum, Penicillium sp. PG10607D, and Rhizopus stolonifer) that influence bacterial-metabolite-trace element networks. This study delineates the gut fungal atlas in CRC and reveals complex cross-kingdom interactions, offering new insights into CRC pathogenesis.


Introduction

The human gastrointestinal tract harbors a diverse ecosystem of microorganisms that exhibit remarkable abundance and complexity.1 Among these microbial communities, gut microbiota play fundamental roles in maintaining intestinal homeostasis2,3 and significantly influence host health and disease susceptibility. While bacterial populations have been the primary focus of microbiome research, fungi represent an essential yet understudied component of the human microbiota, colonizing various anatomical sites including the skin, oral cavity, and gastrointestinal tract.4,5
Although fungi account for less than 0.1% of total gut microorganisms, they establish stable colonization and contribute substantially to gut microecological stability.6 As an emerging research frontier in human microbiome studies, gut mycobiota participate in critical physiological processes including immune modulation, metabolic regulation, and microbial community homeostasis.7,8 The gut fungal ecosystem is predominantly composed of Ascomycota and Basidiomycota phyla, with Candida, Aspergillus, and Rhizopus being the most prevalent genera.9 These fungi maintain gut barrier integrity and immune homeostasis through diverse metabolic activities (e.g., polysaccharide degradation and secondary metabolite production) and complex interactions with bacterial communities.10,11
Emerging evidence highlights the significance of fungi in tumor microenvironments. For instance, fungal species can stimulate pancreatic ductal adenocarcinoma (PDAC) cells to secrete IL-33, subsequently recruiting and activating type 2 innate lymphoid cells (ILC2s) to facilitate tumor progression.12 In colorectal cancer (CRC) pathogenesis, dysbiosis of gut fungal communities has been implicated in both malignant transformation and precancerous lesion development.13 Despite this potential, fungal contributions to CRC remain poorly characterized compared to bacterial studies. Clinical observations reveal that Candida albicans overgrowth in inflammatory bowel disease patients promotes chronic intestinal inflammation and elevates CRC risk.14,15 Furthermore, dynamic alterations in fungal-fungal interactions during disease progression have been documented: CRC patients exhibit distinct co-occurrence patterns of Ascomycota (A. rambellii) and M. perniciosa compared to healthy controls, suggesting potential synergistic or antagonistic roles in carcinogenesis.16 These findings position gut fungi as promising diagnostic biomarkers and therapeutic targets for CRC.
Elucidating the mechanistic roles of fungi in CRC pathogenesis may yield novel strategies for cancer prevention and treatment while advancing our understanding of intestinal microecological complexity. To systematically analyze the role of fungi in CRC, we conducted metagenomic sequencing of fecal samples across a clinical spectrum encompassing healthy controls, patients with colorectal polyps, and CRC cases. We established a specialized fungal genomic database (PRJNA833221) for metagenomic annotation. Through multi-omics analyses, we characterized the gut fungal atlas across the CRC disease continuum (healthy-polyp-CRC), revealing stage-specific fungal signatures. We also applied ablation studies to identify an optimal microbial-marker panel for intestinal disease discrimination, and we delineated the relationships between fungal communities and bacterial microbiota, metabolic profiles, and trace element composition. These results provide insights into fungal involvement in CRC and a framework for developing microbiome-based diagnostic and therapeutic strategies.

Results

Distribution of differential fungi across different groups and taxonomic levels
We analyzed fungal profiles across 253 CRC patients, 162 polyp patients, and 401 healthy controls using metagenomic sequencing of fecal samples (Figure 1A; Table 1). Pairwise comparisons among the normal, polyp, and CRC groups revealed distinct fungal distribution patterns at multiple taxonomic levels, and alpha diversity analysis showed significantly lower fungal diversity in healthy controls than in the polyp and CRC groups at the genus level (Figure 1B). We identified seven differentially abundant genera: Rhizopus was specifically enriched in CRC; Sporisorium, Cladosporium, Aureobasidium, Zygoascus, and Meyerozyma were more abundant in polyps; and Penicillium was uniquely associated with healthy controls (Figures 1C, 1D, and S1). These genus-level differences provide a basis for developing risk prediction models and investigating multi-omics causal relationships in colorectal carcinogenesis (Figure 1A).

Identification of fungal biomarkers and development of optimized disease classification model
Building upon our discovery of differential fungal genera, we performed species-level biomarker screening and comparative analysis across the three study groups. The ablation study identified 31 optimal markers for the disease discrimination model (Figure 2A). The analysis revealed distinct abundance patterns: (1) the CRC vs. normal comparison identified 33 species enriched and 64 species depleted in CRC; (2) the polyp vs. normal comparison showed 17 species increased and 18 decreased in polyps; and (3) the CRC vs. polyp comparison showed 24 species elevated and 72 reduced in CRC (Figure 2B, Data S1).
From these findings, we selected 21 species-level fungal markers corresponding to the 7 differential genera to construct a random forest classifier for distinguishing healthy controls from intestinal disease (polyp/CRC) groups. Using solely these fungal biomarkers, the model achieved an area under the curve (AUC) of 0.79 in the test set (Figure 2C). Parallel analyses of bacterial and viral communities identified 53 bacterial and nine viral markers showing significant intergroup variation (normal vs. polyp vs. CRC, p < 0.05) (Data S2 and S3). Individual evaluation showed bacterial markers exhibited superior predictive performance (AUC = 0.82) compared to viral markers (AUC = 0.79) (Figure 2C).
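The AUC values reported here can be read as the probability that a randomly chosen diseased sample receives a higher classifier score than a randomly chosen healthy one. The following is a minimal sketch of that calculation in plain Python, using the rank-sum formulation. The function name `roc_auc` and the example labels and scores are illustrative assumptions, not data or code from this study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) formulation.

    labels: 1 for disease, 0 for control. scores: classifier outputs.
    Tied scores receive their average rank.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Hypothetical scores for two controls (0) and two patients (1).
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In this toy input, three of the four patient-control pairs are ordered correctly, so `example_auc` is 0.75.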
To optimize predictive capability, we implemented a systematic multi-kingdom integration approach. Through combinatorial testing, we found that the combined fungal-bacterial-viral marker set achieved optimal diagnostic performance (AUC = 0.87, Figure 2C), significantly outperforming all single-kingdom and dual-kingdom combinations in the test set. This integrated model demonstrates the synergistic value of multi-kingdom microbial biomarkers for disease identification.
In summary, our multi-kingdom analysis identified 83 potential “fungal-bacterial-viral” co-markers. To refine this biomarker panel, we employed a two-step selection process: (1) initial random forest modeling, followed by (2) feature importance ranking using MeanDecreaseGini values. Through this iterative approach, we reduced the model to 31 key biomarkers that achieved peak diagnostic performance (AUC = 0.89; Figure 2D). The robustness of this 31-microbial-marker panel was evaluated in four ways: (i) receiver operating characteristic (ROC) analysis with 10-fold cross-validation in the training set; (ii) internal validation by ROC analysis with 10-fold cross-validation within our discovery cohort, which yielded an AUC of 0.89 (Figure 2E); (iii) external validation on an independent, publicly available cohort (PRJNA731589, comprising 82 CRC patients and 84 healthy controls), in which the panel achieved an AUC of 0.72 (Figure 2E), supporting its diagnostic utility in a distinct population, although with lower accuracy; and (iv) confirmation of consistent abundance patterns across disease groups (Figure 2F). We also performed ROC analyses to evaluate the panel across three key clinical comparisons: normal vs. CRC, normal vs. polyp, and polyp vs. CRC. The panel achieved AUCs of 0.91, 0.83, and 0.71, respectively (Figure S2). These results indicate that the microbial signature distinguishes CRC from healthy controls and, with somewhat lower accuracy, polyps from healthy controls. The decline in diagnostic accuracy across the normal-polyp-CRC sequence is consistent with the progressive nature of colorectal carcinogenesis.
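The text does not spell out the exact ablation procedure, so the following is only one plausible reading of the marker-panel refinement: rank features by importance, drop the least important one at a time, and keep the subset with the best validation score. The function names, the toy scoring function, and the feature labels are hypothetical. In practice the score would come from a cross-validated classifier, not a formula.

```python
def backward_ablation(ranked_features, score_fn, min_features=1):
    """Drop the lowest-ranked feature one at a time; return the best-scoring subset.

    ranked_features: feature names, most important first (for example, ordered
        by a Gini-based importance from a random forest).
    score_fn: hypothetical callable mapping a feature list to a validation
        score such as AUC.
    """
    current = list(ranked_features)
    best_subset, best_score = list(current), score_fn(current)
    history = [(len(current), best_score)]
    while len(current) > min_features:
        current = current[:-1]  # remove the least important remaining feature
        score = score_fn(current)
        history.append((len(current), score))
        if score >= best_score:  # on ties, prefer the smaller panel
            best_subset, best_score = list(current), score
    return best_subset, best_score, history


def toy_score(features):
    """Stand-in for a cross-validated AUC (assumption for illustration only).

    Three 'informative' features raise the score; every feature carries a
    small penalty, so uninformative ones lower it.
    """
    informative = {"f1", "f2", "f3"}
    return 0.5 + 0.1 * len(informative & set(features)) - 0.01 * len(features)


panel, panel_score, trace = backward_ablation(
    ["f1", "f2", "f3", "noise1", "noise2"], toy_score
)
```

With the toy score, the search settles on the three informative features. A real analysis might also re-rank importances after each removal, which this sketch does not do.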
Notably, correlation network analysis revealed two significant findings: (a) fungal markers (highlighted in red) formed distinct interaction clusters, while (b) viral markers were absent from the final biomarker set (Figures 2G and S3).

Multi-omics integration reveals complex fungal-bacterial-ion-metabolite interactions in CRC
To elucidate the complex relationships among fungi, bacteria, trace elements, and metabolites in the gut environment, we analyzed trace elements (Figure 3A) and metabolites (Figure 3B) across 816 samples. The top two most influential features from each dataset were identified via random forest analysis: zinc (Zn) and calcium (Ca) for trace elements, and kanamycin and N-acetyl-D-tryptophan for metabolites. The optimized diagnostic model highlighted three fungal species as key predictors: Penicillium citrinum, Penicillium sp. PG10607D, and Rhizopus stolonifer. Additionally, the two bacterial taxa most strongly associated with these fungi were identified.
Integrating these features, structural equation modeling (SEM) was employed to investigate potential causal interactions among fungi, bacteria, trace elements, and metabolites in CRC from a multi-omics perspective. The SEM results demonstrated distinct correlation patterns:
Penicillium citrinum exhibited positive associations with the ion group (coefficient = 0.298; Zn2+, Ca2+) and the bacterial group (coefficient = 0.468; Flavobacterium, UBA7642), but a negative correlation with the metabolome group (coefficient = −0.057; kanamycin) (Figure 3C).
Penicillium sp. PG10607D showed negative correlations with the ion group (coefficient = −0.247; Zn2+) and the bacterial group (coefficient = −0.322; Anaerostipes), while displaying a positive association with the metabolome group (coefficient = 0.05; kanamycin) (Figure 3D).
Rhizopus stolonifer was positively linked to trace elements (coefficient = 0.431; Zn2+) and the bacterial group (coefficient = 0.113; Cupriavidus, Haloferax), but negatively correlated with the metabolome group (coefficient = −2.258; kanamycin) (Figure 3E).
These findings uncover potential causal relationships between fungal communities, bacterial taxa, trace elements, and metabolites in CRC, offering novel insights for future research.

Discussion

This study marks the first construction of a specialized fungal genomic database (NCBI: PRJNA833221) for annotating metagenomic sequencing data, enabling high-resolution taxonomic profiling of gut fungi. We constructed a disease-associated fungal atlas by comparing the gut fungal composition of healthy individuals, polyp patients, and CRC patients, and developed a CRC-related disease classification model based on it. The ablation study identified an optimal 31-microbial-marker panel (28 bacterial and three fungal species) that effectively distinguished intestinal disease groups (AUC = 0.89). In addition, the study integrated fungal, metabolomic, and trace element data, revealing potential mechanisms by which fungal communities affect the host microenvironment through metabolic reprogramming (e.g., short-chain fatty acid synthesis) and trace element regulation (e.g., Zn2+, Ca2+ signaling pathways). These findings provide novel biomarkers for the early diagnosis of CRC and lay a theoretical foundation for intervention strategies targeting the fungal-metabolic axis.
Research on fungal communities remains relatively limited, primarily constrained by the limitations of bioinformatics tools and the scarcity of specialized databases.17 In this study, we integrated high-quality fungal genomic data from NCBI (PRJNA833221) and established an optimized taxonomic annotation pipeline, which significantly improved both the classification accuracy and coverage of gut fungal annotation based on metagenomic sequencing. This advancement provides a more reliable annotation foundation for subsequent gut fungal atlas construction and biomarker screening. It is also noteworthy that although viral communities were profiled and initially included in the random forest model for biomarker screening, no viral markers were retained in the final 31-microbial-marker panel after rigorous ablation feature selection. This outcome may be explained by several factors: (1) the relatively low abundance and highly fragmented nature of viral DNA in stool samples, complicating consistent detection and quantification compared to bacterial and fungal elements; (2) inherent limitations of current viral databases and annotation workflows, which remain less comprehensive than those available for bacteria and fungi, potentially affecting taxonomic accuracy and reproducibility; and (3) the possibility that any discriminatory signal from viral constituents was outweighed by the stronger associations of bacterial and fungal markers in this cohort. Future studies employing viral DNA enrichment strategies and expanded reference databases will be valuable to clarify the potential role of the gut virome in colorectal carcinogenesis.
Contrary to the well-established positive correlation between bacterial diversity and host health,18,19 our study revealed significantly lower fungal diversity in healthy individuals compared to CRC and polyp patients. We hypothesize that in healthy states, dominant fungal taxa maintain functional homeostasis by suppressing opportunistic pathogens via resource competition, resulting in a stable, low-diversity community. In contrast, disease states exhibit higher fungal diversity due to the overgrowth of pathogenic fungi (e.g., Aspergillus spp.), which disrupts community function despite increased species richness.
Through species-level screening of fungal, bacterial, and viral markers, we constructed a binary CRC-related disease classification model optimized via 10-fold cross-validation. The ablation study showed that the combined fungal-bacterial model (AUC = 0.89) outperformed models relying solely on viral markers or traditional single-omics approaches (e.g., host gene mutations or protein expression).20,21,22,23 Ablation study is a model optimization technique originally developed for machine learning, and here we applied it to microbial biomarker discovery for disease discrimination. Rather than screening markers in an “all-or-nothing” manner, this “many-to-best” reverse optimization evaluates and refines biomarker combinations through sequential elimination of candidate markers, quantifying the impact of each marker’s removal on diagnostic performance. The procedure identified an optimal 31-microbial panel (28 bacterial and three fungal species) with strong discriminative power (AUC = 0.89). Applied to microbiome research, this strategy shifts biomarker discovery from conventional additive selection to subtractive optimization and offers a framework for developing diagnostic signatures from complex microbial communities.
Although direct evidence linking Penicillium citrinum and Rhizopus stolonifer to the promotion or suppression of human CRC is currently limited, their biological characteristics and our findings allow for scientifically plausible speculation. For instance, novel anthrone dimers with antitumor activity have been isolated from Penicillium species, suggesting that certain metabolites from this genus may hold potential for development as anti-CRC agents.24
Rhizopus stolonifer is known to secrete cell wall-degrading enzymes that contribute to plant pathogenicity, and its potential to modulate immune responses25 could represent a mechanism by which it increases CRC risk.
Using SEM, we deciphered the intricate relationships between fungi, host metabolites, and trace elements. The metabolite kanamycin exhibited a strong negative correlation with the pathogenic fungus Rhizopus stolonifer (coefficient = −0.468, p < 0.001). Kanamycin, an aminoglycoside broad-spectrum antibiotic, primarily exerts antibacterial effects by inhibiting bacterial protein synthesis and is not typically considered an antifungal agent. However, the observed negative correlation between Rhizopus stolonifer and kanamycin suggests that antibacterial agents may indirectly influence fungal ecology. On one hand, kanamycin-mediated suppression of susceptible bacteria may reshape the gut microbial community structure. The removal of bacteria that compete ecologically with Rhizopus stolonifer could alleviate competitive pressure and provide the fungus with a proliferation advantage, reflecting an indirect interaction at the microbial community level.26 On the other hand, fungi may employ multiple resistance mechanisms to counteract antibiotic stress. Studies indicate that Rhizopus stolonifer may form biofilms that act as physical barriers, utilize efflux pumps to actively expel drugs, or undergo genetic mutations that alter drug targets.27 These mechanisms may collectively enhance its survival capacity in the presence of kanamycin. Fungal communities interacted dynamically with trace elements (e.g., Zn2+ and Ca2+), implicating ionic homeostasis in disease progression—possibly through modulation of fungal enzymatic activities.28,29 Specifically, Zn2+ plays a pivotal role in maintaining mucosal barrier integrity and modulating host nutritional immunity by mediating intermicrobial competition.30,31 Concurrently, Ca2+ orchestrates the formation of cross-kingdom biofilms and virulence responses.32 Alterations in these trace elements can thereby reshape gut microbial dynamics, ultimately influencing host health through modulated microbial interactions. 
These insights support the development of “fungal-metabolic axis”-targeted interventions, such as dietary supplementation with specific metabolites (e.g., short-chain fatty acids) to reshape fungal community structure. By integrating multi-omics data and cross-scale analyses, this study is the first to systematically characterize gut fungi-metabolite-trace element interactions in CRC, transcending the limitations of traditional bacteria-centric research.
Our study delineates a cross-kingdom microbial network in CRC by integrating fungal, bacterial, metabolic, and trace element data, thereby extending the prevailing bacterio-centric view of CRC pathogenesis. While previous large-scale sequencing efforts have identified multi-kingdom microbial signatures in CRC,33 our application of SEM provides unprecedented insights into the potential ecological interactions among these kingdoms, particularly the influential role of specific fungi like Penicillium citrinum and Rhizopus stolonifer. The translational potential of our 31-microbial-marker panel is consistent with a growing body of evidence that supports the use of microbial biomarkers for non-invasive diagnosis, even at precancerous stages.34 However, establishing robust and causal microbiome-disease relationships remains a central challenge in the field.35 Our observational findings, while revealing significant associations, necessitate cautious interpretation regarding causality. Future studies employing advanced designs, such as prospectively collected cohorts and functional validations in gnotobiotic models, are crucial to move beyond correlation and establish pathogenic mechanisms. In this regard, the innovative study design of Chen et al.,36 which compared Crohn’s disease patients with their healthy first-degree relatives to control for genetic and environmental confounders, offers a powerful paradigm for future CRC microbiome research to enhance the specificity of biomarker discovery and causal inferences.
Our study established the gut fungi atlas for CRC, demonstrating progressive increase of fungal diversity along the healthy-polyp-carcinoma continuum. We identified distinct disease-specific fungal signatures, including CRC-predominant Rhizopus enrichment and polyp-predominant Sporisorium/Cladosporium colonization patterns. A multi-kingdom diagnostic model incorporating 31 microbial biomarkers (28 bacterial and three fungal species) achieved superior intestinal disease discrimination (AUC = 0.89). Through SEM, we revealed three keystone fungal species (Penicillium citrinum, Penicillium sp. PG10607D, and Rhizopus stolonifer) that orchestrate cross-kingdom interactions by functionally bridging bacterial communities, metabolic pathways, and trace element homeostasis. These findings not only deliver novel fungal biomarkers for clinical translation but also propose a new conceptual framework for CRC research through tripartite microbial-metabolite-trace elements networks.

Limitations of the study
While this study advances our understanding of fungal contributions to CRC, certain limitations remain. Although predictive modeling and SEM revealed interactions among fungi, bacteria, metabolites, and trace elements, these findings require validation in controlled animal experiments. The complexity of the gut microenvironment poses challenges in designing accurate experimental models, but our SEM framework provides a foundation for future mechanistic studies. Further research is needed to dissect how specific fungi (e.g., Rhizopus stolonifer) regulate CRC progression via metabolic pathways. Addressing these questions will refine targeted therapeutic strategies. Collectively, this work not only redefines the role of fungi in CRC but also provides a new direction for innovative research on microbiome-based diagnostics and therapies.

Resource availability

Lead contact
Further information and requests regarding resources and analyses should be directed to and will be fulfilled by the lead contact, Han Shuwen (shuwenhan985@163.com).

Materials availability
Due to ethical regulations and institutional policies, the availability of human fecal samples and related materials generated in this study may be subject to restrictions. Requests for materials should be directed to the lead contact and reviewed by the Ethics Committee of Huzhou Central Hospital.

Data and code availability

• The fungal genomic annotation was performed by first downloading the BioProject PRJNA833221 dataset from NCBI as a reference database (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA833221/).

• The external validation dataset was obtained from the NCBI Sequence Read Archive (SRA) under BioProject accession PRJNA731589, corresponding to the study published under PMID 35087227.

• The metagenomic sequencing data generated in this study have been deposited in the China National GeneBank Database (CNGBdb) under accession number CNP0004360 and are publicly accessible at https://db.cngb.org/data_resources/project/CNP0004360/. The raw metabolomics data have been deposited in the MetaboLights database under the identifier MTBLS12776.

• All custom code used for statistical analysis and visualization in this study is available without restrictions in the GitHub repository at https://github.com/pegasusCN/Fungi_pipeline.

• Any additional information required to reanalyze the data reported in this article is available from the lead contact upon request.

Acknowledgments

The authors gratefully thank the patients and volunteers for their contributions to sample collection. The graphical abstract was created in BioRender. Shuwen, H. (2025) https://BioRender.com/wnvsmjt. This research was supported by the Open Subject of the Innovation Center for Basic Research on Fungal Infectious Diseases of the Ministry of Education (YXX2024-KF02-03), the Public Welfare Technology Application Research Program of Huzhou (no. 2024GY22), the Zhejiang Medical and Health Technology Project (no. 2025KY1531), and the Zhejiang Province Traditional Chinese Medicine Science and Technology Project (no. 2024ZL1018).

Author contributions

H.S. conceived and drafted the manuscript. W.Y. wrote the paper. W.Z. and Y.X. analyzed the data. X.S. and Z.Q. collected the basic patient information, clinical indicators, and imaging data. J.X. and L.Y. designed and generated the figures. All authors read and approved the paper.

Declaration of interests

The authors declare no competing interests.

STAR★Methods

Key resources table

Experimental model and study participant details

Ethics approval and consent for the use of human specimens
The study was approved by the Ethics Committee of Huzhou Central Hospital (202202005-01 and 202202005-02).

Study participants
The study cohort comprised 401 healthy controls, 162 patients with colorectal polyps, and 253 CRC patients (Table 1) recruited from Huzhou Central Hospital between January 2022 and December 2023.
Inclusion criteria were as follows:
CRC group: Patients pathologically confirmed with colorectal adenocarcinoma.
Polyp group: Individuals diagnosed with benign colorectal polyps.
Healthy controls: Individuals showing no evidence of gastrointestinal pathologies upon colonoscopic examination and no histological evidence of gastrointestinal abnormalities.
All participants or their legal guardians provided written informed consent prior to enrollment.
Exclusion criteria were as follows: benign diseases such as hemorrhoids; hematologic diseases; metastatic tumors from other sites or multiple concurrent tumors (e.g., pancreatic cancer, lung cancer, or prostate cancer); psychiatric disorders or cognitive or communication dysfunction; other gut disorders such as Crohn’s disease or ulcerative colitis; and major diseases of the cardiac, cerebral, vascular, or respiratory systems.

Method details

Fecal sample collection protocol
All fecal samples were collected from fasting subjects without prior use of laxatives or lubricating agents. Participants provided approximately 5–10 g of fresh fecal material using sterile collection kits, which were immediately placed on ice packs. Samples were transferred to −80°C cryopreservation within 30 min post-collection to ensure biomolecular stability. The maximum storage duration prior to processing was standardized at 30 days to minimize potential degradation effects.

Metagenomic analysis protocol
Total microbial DNA was extracted from stool samples using the E.Z.N.A. Viral DNA Kit (Omega Bio-tek, Norcross, GA, USA) following the manufacturer’s protocol. DNA purity and concentration were verified spectrophotometrically, and only samples meeting the quality criteria (A260/A280 = 1.8–2.2; A260/A230 ≥ 2.0) proceeded to library preparation. The remaining supernatant was subjected to lysis, and viral DNA was extracted using the QIAamp Viral RNA Mini Kit without carrier RNA (Qiagen). Metagenomic shotgun sequencing libraries were constructed and sequenced at Shanghai Biozeron Biological Technology Co. Ltd. Microbiome profiling was performed following standard protocols.46 Briefly, quality-filtered reads were taxonomically classified with the BWA-MEM algorithm against the human gut MAGs of the Unified Human Gastrointestinal Genome (UHGG) collection. Microbial abundances were calculated using the formula as follows.

Fungal genomic annotation
The fungal genomic annotation was performed by first downloading the PRJNA833221 dataset from NCBI as a reference database (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA833221/). Using Kraken2 software, we constructed the taxonomic index, followed by fungal genome annotation of our metagenomic data through Bracken software.
The external validation dataset was obtained from the NCBI Sequence Read Archive (SRA) under BioProject accession PRJNA731589, corresponding to the study published under PMID 35087227.

Metabolites detection
Untargeted metabolomic profiling was performed using liquid chromatography–mass spectrometry (LC-MS). Fecal samples were lyophilized, and approximately 0.5 g was accurately weighed into digestion vessels. After low-temperature heating to remove volatile compounds, samples were digested with 10 mL of a nitric–perchloric acid (10:1) mixture. Digestion continued until the solution turned colorless or pale yellow. After cooling, extracts were diluted to 50 mL with ultrapure water. A blank control was processed in parallel.
Metal ions were quantified using an inductively coupled plasma optical emission spectrometer (Thermo iCAP 7200 HSDuo) under the following conditions: RF power, 1150 W; carrier gas, 0.7 L/min; auxiliary gas, 1.0 L/min; cooling gas, 12.0 L/min; axial detection mode. A 32-element mixed standard (BWT30121-100-100, B22120033) was used for calibration, with concentrations of 0, 2, 5, 10, and 20 mg/L.

Trace elements detection
Trace elements were quantified by inductively coupled plasma mass spectrometry (ICP-MS).
Trace element analysis was conducted using inductively coupled plasma–mass spectrometry (ICP-MS; Thermo iCAP 7200 HSDuo). Fecal samples were lyophilized, weighed, and subjected to acid digestion in graphite tubes. The digestion protocol consisted of sequential addition of 15 mL nitric acid (80°C, 20 min) and 3 mL perchloric acid (80°C, 10 min), followed by heating at 130°C for 15 min and 180°C for 120 min. After achieving a colorless digestate and near-evaporation of perchloric acid, the residue was cooled, transferred to a 50 mL volumetric flask, and diluted to volume with distilled water. Calibration was performed using a mixed-element standard at concentrations of 0, 2, 5, 10, and 20 mg/L. Elemental concentrations (W, mg/kg) were calculated as:
W = (C0 × V × f)/M
where C0 is the measured concentration (mg/L), V is the final volume (mL), f is the dilution factor, and M is the sample mass (g).
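As a quick check of the units in the formula above, the calculation can be sketched in Python. The function name and the example reading are hypothetical and are not taken from the study; the sketch only restates W = (C0 × V × f)/M, in which mg/L × mL / g reduces to mg/kg.

```python
def element_concentration_mg_per_kg(c0_mg_per_l, volume_ml, dilution_factor, mass_g):
    """Return W = (C0 * V * f) / M in mg/kg.

    mg/L * mL / g reduces to mg/kg because the factor 1e-3 for
    mL -> L cancels against the factor 1e-3 for g -> kg.
    """
    if mass_g <= 0:
        raise ValueError("sample mass must be positive")
    return c0_mg_per_l * volume_ml * dilution_factor / mass_g


# Hypothetical reading (not from the paper): 0.8 mg/L measured in a
# 50 mL digest, no additional dilution (f = 1), from a 0.5 g sample.
w = element_concentration_mg_per_kg(0.8, 50.0, 1.0, 0.5)
print(f"{w:.1f} mg/kg")
```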

Alpha diversity analysis
Rarefaction analysis was performed in Mothur v.1.21.141 to compute the Simpson and Shannon diversity indices. Beta diversity was analyzed using the R community ecology package vegan.
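For orientation, the two indices named above can be sketched in a few lines of Python. This is an illustration only, not the Mothur implementation: it assumes the natural-log Shannon index and the Gini-Simpson form (1 minus the sum of squared proportions), whereas tools differ in which Simpson variant they report. The counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)


def gini_simpson_index(counts):
    """Gini-Simpson index 1 - sum(p_i^2); an assumed convention, see lead-in."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Hypothetical read counts for four taxa in one sample.
counts = [10, 10, 10, 10]
print(shannon_index(counts), gini_simpson_index(counts))
```

A perfectly even community of four taxa gives H = ln 4, the maximum for four taxa, which is a convenient sanity check.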

Comparative analysis of differential fungi

Taxonomic distribution of differential fungi across subgroups
Differential fungal taxa among the normal, polyp, and CRC groups were identified through pairwise comparisons using the R package MaAsLin2, with significance thresholds of |log2FC| > 0.5 and FDR < 0.05. The FDR was controlled using the Benjamini-Hochberg procedure for multiple comparisons. Subsequently, the top 10 most significantly differentially abundant features (ranked by |log2FC|) were selected at the Order, Family, Genus, and Species levels, and their relative abundances were compared across the subgroups.
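The thresholding step can be illustrated with a short Python sketch. It does not reproduce MaAsLin2, which fits per-feature models in R; it only shows Benjamini-Hochberg adjustment followed by the |log2FC| > 0.5 and FDR < 0.05 filter. The taxon names, fold changes, and p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_differential(features, log2fc, pvalues, fc_cut=0.5, fdr_cut=0.05):
    """Keep features with |log2FC| > fc_cut and BH-adjusted p < fdr_cut."""
    q = benjamini_hochberg(pvalues)
    return [f for f, fc, qv in zip(features, log2fc, q)
            if abs(fc) > fc_cut and qv < fdr_cut]


# Hypothetical inputs, for illustration only.
features = ["taxon_a", "taxon_b", "taxon_c", "taxon_d"]
log2fc = [1.2, 0.3, -0.8, 0.6]
pvalues = [0.01, 0.04, 0.03, 0.005]
print(select_differential(features, log2fc, pvalues))
```

In this example taxon_b passes the FDR cut but is dropped because its fold change is too small, which shows that the two thresholds act jointly.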

Fungal diversity analysis
Fungal diversity across sample subgroups was assessed using the phyloseq R package. Alpha diversity metrics were employed to evaluate and compare microbial community richness and evenness among the different subgroups.

Screening for differential gut microbes
Pairwise comparative analysis of the normal, polyp, and CRC groups was performed using the MaAsLin2 R package, with thresholds set at |log2FC| > 0.5 and FDR <0.05 (adjusted via the Benjamini-Hochberg method). This analysis identified fungi, bacteria, and viruses that exhibited significant differences at the genus level.

Construction of a binary classifier using microbial markers
Based on species-level differential analysis, we extracted 21 fungal species, 53 bacterial species, and 9 viral species associated with seven key fungal genera. These were used to train a random forest classifier (R package: randomForest) to distinguish between Normal and Intestinal Disease (Polyp + CRC) groups. The dataset was split into training (80%) and test (20%) sets, with model performance evaluated via 10-fold cross-validation.
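The training scheme described above (80/20 split, 10-fold cross-validation, random forest) can be sketched as follows. The study used the R package randomForest; this sketch substitutes scikit-learn's RandomForestClassifier and runs on synthetic data, so the feature matrix, the number of trees, and the random seeds are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

# Synthetic stand-in for an abundance table: 300 samples x 83 markers,
# with binary labels (0 = normal, 1 = intestinal disease).
X, y = make_classification(n_samples=300, n_features=83, n_informative=10,
                           random_state=0)

# 80% training / 20% test, stratified so both sets keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)

# 10-fold cross-validated AUC on the training set only.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv_auc = cross_val_score(model, X_train, y_train, cv=cv, scoring="roc_auc")

# Final fit on the full training set, then AUC on the held-out test set.
model.fit(X_train, y_train)
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

print(f"mean CV AUC = {cv_auc.mean():.3f}, test AUC = {test_auc:.3f}")
```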

Combined marker classifier performance
The three microbial marker types (fungi, bacteria, viruses) were individually and jointly incorporated into random forest models. Their predictive performance was assessed and visualized using the test set.

Optimization of co-markers
Based on a total of 83 fungal, bacterial, and viral markers, a random forest model was first constructed. Subsequently, ablation experiments were performed sequentially according to the Mean Decrease Gini values of each feature: in each round of experiments, the least contributing feature (i.e., the one with the lowest Mean Decrease Gini value) was removed, and the model was retrained. By comparing the AUC values of each model on the test set, the key features with optimal performance were ultimately identified. This process refined the model to 31 optimal markers, with performance visualized via:
ROC curves from 10-fold cross-validation (training set).
ROC curves from the independent test set.
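The ablation procedure above can be sketched as a backward-elimination loop. This is a minimal illustration, not the authors' code: it uses scikit-learn's impurity-based feature_importances_ as a stand-in for Mean Decrease Gini from the R randomForest package, runs on synthetic data with 12 features rather than 83, and the function name and parameters are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def ablate_by_importance(X_train, y_train, X_test, y_test, seed=0):
    """Drop the least important feature each round and track test AUC."""
    kept = list(range(X_train.shape[1]))
    history = []  # one (auc, feature indices) entry per round
    while kept:
        model = RandomForestClassifier(n_estimators=50, random_state=seed)
        model.fit(X_train[:, kept], y_train)
        auc = roc_auc_score(y_test, model.predict_proba(X_test[:, kept])[:, 1])
        history.append((auc, list(kept)))
        # Remove the feature with the lowest impurity-based importance.
        kept.pop(int(np.argmin(model.feature_importances_)))
    # Best AUC wins; ties go to the smaller feature set.
    best_auc, best_features = max(history, key=lambda h: (h[0], -len(h[1])))
    return best_auc, best_features, history


X, y = make_classification(n_samples=200, n_features=12, n_informative=4,
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y,
                                          random_state=1)
best_auc, best_features, history = ablate_by_importance(X_tr, y_tr, X_te, y_te)
print(len(best_features), round(best_auc, 3))
```

Choosing the feature set by test-set AUC follows the description above, but it can make the reported AUC optimistic; an external validation cohort is the usual safeguard against this.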

Correlation analysis of co-markers
The average abundance shifts of the 31 markers in the intestinal disease groups (polyps and CRC) were visualized. Subsequently, pairwise correlations among these markers were computed and displayed using Wilcoxon tests and the corrplot R package.

Structural equation modeling (SEM)
We used Python’s scikit-learn package to screen for the bacteria and fungi showing the greatest variability across the normal, polyp, and CRC groups. For the selected taxa, we used Python scripts to compute Pearson correlations with the other omics features (trace elements and metabolome) and retained the most strongly correlated features in each group. Using the vegan package in R, we then applied the Mantel test to assess the relationship between these filtered features and the different omics datasets in the normal, polyp, and CRC groups. To bring all features onto a common scale, we applied a 0–1 transformation to all data using the following formula:
X∗ = (X − Xmin)/(Xmax − Xmin)
Here X∗ is the normalized feature value, which lies in the range 0 to 1; X is the original feature value; Xmin is the sample minimum for the feature; and Xmax is the sample maximum for the feature. The transformed values were used for all subsequent analyses. In addition, we constructed structural equation models using Python’s semopy package to determine the contribution of different features in a multi-omics approach. We assessed model validity using the chi-square test p value, comparative fit index (CFI), goodness-of-fit index (GFI), normed fit index (NFI), and root-mean-square error of approximation (RMSEA). A higher chi-square p value (i.e., a smaller difference between the expected and observed covariance matrices), higher CFI, GFI, and NFI, and a lower RMSEA indicate a better model fit.
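The 0–1 transformation described above is ordinary min-max scaling applied per feature. A minimal NumPy sketch follows; the function name and the example matrix are hypothetical, and the handling of constant features (mapped to 0) is an assumption, since the text does not say how such features were treated.

```python
import numpy as np


def min_max_scale(values):
    """Scale each column to [0, 1] via (X - Xmin) / (Xmax - Xmin)."""
    x = np.asarray(values, dtype=float)
    col_min = x.min(axis=0)
    span = x.max(axis=0) - col_min
    # Assumption: a constant column has zero span; map it to 0 instead
    # of dividing by zero.
    span = np.where(span == 0, 1.0, span)
    return (x - col_min) / span


# Hypothetical table: rows are samples, columns are features.
table = [[1.0, 10.0],
         [3.0, 10.0],
         [5.0, 10.0]]
print(min_max_scale(table))
```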

Quantification and statistical analysis
All error bars indicate the SD. Differences in microbial abundances, diversity indices, and marker performance were assessed using the Wilcoxon rank-sum test, Student’s t test, and ROC curve analysis, respectively. For inter-group comparisons of microbial taxa, MaAsLin2 was used with |log2FC| > 0.5 and FDR < 0.05 as significance thresholds. Correlation analyses were performed using Pearson correlation and the Mantel test. All statistical analyses were conducted in R, Python, and GraphPad Prism. p values < 0.05 were considered statistically significant (∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001).

Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.
