STGNET: extending panel coverage in imaging-based spatial transcriptomics using deep generative adversarial networks.

Briefings in Bioinformatics, 2026, Vol. 27(2)

Wang T, Wang B, Shu H, Zhen P, Hu J, Wang Y

Cite this article

APA Wang T, Wang B, et al. (2026). STGNET: extending panel coverage in imaging-based spatial transcriptomics using deep generative adversarial networks. Briefings in bioinformatics, 27(2). https://doi.org/10.1093/bib/bbag122
MLA Wang T, et al. "STGNET: extending panel coverage in imaging-based spatial transcriptomics using deep generative adversarial networks." Briefings in bioinformatics, vol. 27, no. 2, 2026.
PMID 41875024
DOI 10.1093/bib/bbag122

Abstract

Imaging-based spatial transcriptomics (ST) technologies offer unparalleled resolution for mapping gene expression within intact tissues but are fundamentally constrained by the limited size of their gene panels. This restriction hinders comprehensive biological discovery by omitting potentially crucial genes from analysis. To overcome this limitation, we introduce STGNET, a deep learning framework that extends gene panel coverage by integrating generative adversarial networks (GANs) with graph neural networks. STGNET employs a multi-stage GAN to learn the global transcriptomic distribution from single-cell RNA sequencing data, followed by a spatially aware graph convolutional network that refines imputations by modeling both physical cell proximity and transcriptional similarity. We rigorously benchmarked STGNET against seven state-of-the-art methods across nine diverse ST datasets. STGNET consistently achieved superior performance, demonstrating enhanced accuracy in gene imputation, and exceptional preservation of cellular topology. We further showcase its biological utility by accurately reconstructing developmental marker patterns in mouse embryogenesis, revealing a novel transitional cell state in breast cancer progression, and uncovering extensive, previously obscured cell-cell communication networks in the mouse brain. STGNET provides a powerful and robust solution for unlocking the full potential of targeted ST assays, thereby enabling deeper and more comprehensive spatial biology. STGNET is freely accessible at https://github.com/wuyuanwuhuii/STGNET.

Introduction
The functional architecture of complex tissues is fundamentally governed by the precise spatial organization of and communication between their constituent cells, a critical dimension obliterated by dissociative single-cell RNA sequencing (scRNA-seq) [1, 2]. The emergence of spatial transcriptomics (ST) has bridged this gap, offering an unprecedented window into the molecular geography of tissues [3, 4]. Among these technologies, imaging-based in situ profiling methods—such as MERFISH [5], seqFISH+ [6], and the commercial Xenium [7] and CosMx platforms [8]—have set a new standard for sensitivity and spatial resolution, enabling the mapping of hundreds to thousands of genes at subcellular precision. Despite their transformative potential, these powerful methods are intrinsically constrained by a fundamental limitation: their reliance on a predefined gene panel, which severely limits transcriptome-wide discovery and can introduce bias by requiring a priori gene selection [9]. This physical bound, imposed by optical crowding, spectral overlap, and lengthy imaging cycles, creates a critical bottleneck [10]. It forces a trade-off between resolution and discovery, potentially obscuring novel biomarkers, unanticipated cellular states, and the completeness of biological pathways, thereby limiting the utility of these rich datasets for hypothesis-free exploration [11, 12].
In response to this panel limitation, several computational strategies have been employed, yet they are fundamentally constrained by their reliance on single-cell RNA sequencing reference atlases and are often ill-suited for the specific task of high-resolution panel extension. A prevalent paradigm, used by tools like Tangram [13], SpaGE [14], and deep learning-based methods like gimVI [15] and stPlus [16], relies on aligning spatial and single-cell data through a limited set of shared genes. This approach is inherently susceptible to technical confounders like batch effects and dropout events, which can distort the mapping and compromise the accuracy of imputed expression values. Moreover, the entire inference is bottlenecked by the number of shared genes, limiting the effective prediction of a broader transcriptome. More recently, diffusion-based models like SpatialScope [17] and stDiff [18] have emerged, framing data enhancement as a generative process. While powerful, these models primarily learn from the correlation structure of gene expression within dissociated scRNA-seq cells, treating expression profiles as the primary determinant of cell state. Consequently, they fail to explicitly and effectively leverage the rich, continuous spatial context—such as cellular neighborhoods and morphological regions—that is uniquely present in imaging-based ST data [19]. This oversight prevents them from capturing the critical spatial expression patterns and gradients that are essential for accurate, context-aware gene imputation in situ, thus creating a pronounced methodological gap for a dedicated solution.
To address these limitations, we introduce STGNET, a deep learning framework architected around two core principles: robust generative modeling and direct spatial integration. STGNET’s first key innovation is a multi-stage generative adversarial network (GAN) process, stabilized by the Wasserstein GAN with gradient penalty (WGAN-GP) [20], which is trained to learn the underlying distribution of gene expression from a scRNA-seq reference. This approach focuses on capturing the fundamental structure of the transcriptome, making it more resilient to the technical noise and dropout events that plague individual cells in reference data. The second, and crucial, innovation is the direct incorporation of spatial context through a multi-view graph convolutional network (GCN). This GCN simultaneously models the tissue architecture using a spatial graph of physical cell adjacencies and a feature graph of transcriptional similarities. An integrated attention mechanism then dynamically fuses these two views, allowing the model to adaptively weight the importance of a cell’s physical neighborhood versus its transcriptional identity for each imputation event. This ensures that the final predicted expression values are not only transcriptionally plausible but also spatially coherent within the tissue microenvironment.
We rigorously evaluate its performance against state-of-the-art methods across nine public datasets, showing that STGNET robustly imputes missing gene expression with high fidelity to held-out ground truth data. We further validate that the imputed data preserves critical spatial patterns and enhances downstream biological analyses, including the identification of spatially variable genes and the inference of developmental trajectories. By providing a method that directly addresses the core limitations of existing approaches, STGNET establishes itself as a powerful tool for unlocking the full potential of spatially resolved transcriptomics.

Materials and methods

Employing multiple generative adversarial networks to facilitate initial spatial transcriptomics imputation
STGNET is a deep learning framework that integrates GANs and graph neural networks (GNNs) to impute and extend gene expression profiles in ST data. STGNET first employs multiple GANs to produce an initial ST imputation. This phase operates in three sequential adversarial training steps to ensure robustness: first, learning a global reference distribution from scRNA-seq data; second, imputing missing values in scRNA-seq data using a masked learning approach; and third, transferring and refining these imputations within the spatial context of ST data. An overview of the architecture is provided in Supplementary Figs S1–S3.

Adversarial training with Wasserstein generative adversarial network and gradient penalty
To ensure stable training and mitigate mode collapse common in standard GANs, all generative components in STGNET are built upon the Wasserstein GAN with Gradient Penalty (WGAN-GP) framework. The WGAN objective minimizes the Earth-Mover (Wasserstein-1) distance, which provides a more meaningful learning gradient than the Jensen–Shannon divergence used in original GANs.
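The general objective function for a generator $G$ and a discriminator $D$ is:
$$\min_{G}\max_{D\in\mathcal{D}}\;\mathbb{E}_{x\sim P_r}\big[D(x)\big]-\mathbb{E}_{z\sim P_z}\big[D(G(z))\big]-\lambda\,\mathbb{E}_{\hat{x}\sim P_{\hat{x}}}\Big[\big(\lVert\nabla_{\hat{x}}D(\hat{x})\rVert_2-1\big)^2\Big]$$
where $\mathcal{D}$ is the set of 1-Lipschitz functions, $P_r$ is the real data distribution, and $P_z$ is the noise distribution. The final term is the gradient penalty, where $\hat{x}$ is a random interpolation between real and generated samples ($\hat{x}=\epsilon x+(1-\epsilon)G(z)$ with $\epsilon\sim U[0,1]$), and $\lambda$ is a hyperparameter. This penalty enforces the 1-Lipschitz constraint more reliably than weight clipping.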
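The gradient penalty described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the toy critic `tanh(w . x)` with its analytic gradient stands in for a neural network trained with automatic differentiation, and the names `critic`, `critic_grad`, and the default `lam=10.0` are assumptions made for illustration.

```python
import numpy as np

def critic(x, w):
    """Toy critic D(x) = tanh(w . x); a stand-in for a neural network."""
    return np.tanh(x @ w)

def critic_grad(x, w):
    """Analytic gradient of the toy critic with respect to each input row."""
    s = 1.0 - np.tanh(x @ w) ** 2          # derivative of tanh, shape (batch,)
    return s[:, None] * w[None, :]         # shape (batch, genes)

def gradient_penalty(real, fake, w, rng):
    """Mean of (||grad D(x_hat)||_2 - 1)^2 over random interpolates x_hat."""
    eps = rng.uniform(size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake
    norms = np.linalg.norm(critic_grad(x_hat, w), axis=1)
    return float(np.mean((norms - 1.0) ** 2))

def critic_loss(real, fake, w, rng, lam=10.0):
    """Critic loss to minimise: E[D(fake)] - E[D(real)] + lam * penalty."""
    gp = gradient_penalty(real, fake, w, rng)
    return float(critic(fake, w).mean() - critic(real, w).mean() + lam * gp)

rng = np.random.default_rng(0)
real = rng.normal(size=(8, 5))   # hypothetical real expression profiles
fake = rng.normal(size=(8, 5))   # hypothetical generator outputs
w = rng.normal(size=5)
gp = gradient_penalty(real, fake, w, rng)
loss = critic_loss(real, fake, w, rng)
```

Minimising this critic loss is equivalent to maximising the objective in the text, since the two differ only in sign. In practice the gradient would come from an autodiff framework rather than a closed form.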

Learning the global single-cell RNA sequencing distribution
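The objective of the first phase is to learn a robust generator $G_1$ that captures the underlying distribution of the global scRNA-seq reference atlas. The input scRNA-seq data matrix $X\in\mathbb{R}^{n\times g}$ (with $n$ cells and $g$ genes) is normalized (TPM followed by a log transformation). Generator $G_1$ takes random noise $z$ and, optionally, cell-type labels $c$ as input to produce a synthetic expression profile $\tilde{x}=G_1(z,c)$. The discriminator $D_1$ is trained to distinguish between samples from the real data distribution $P_r$ and the generated distribution $P_g$. The loss for this phase is defined as:
$$\mathcal{L}_1=\min_{G_1}\max_{D_1}\;\mathbb{E}_{x\sim P_r}\big[D_1(x)\big]-\mathbb{E}_{\tilde{x}\sim P_g}\big[D_1(\tilde{x})\big]-\lambda\,\mathbb{E}_{\hat{x}\sim P_{\hat{x}}}\Big[\big(\lVert\nabla_{\hat{x}}D_1(\hat{x})\rVert_2-1\big)^2\Big]$$
where $\hat{x}$ and $\lambda$ are the interpolated sample and the penalty weight of the WGAN-GP objective.
A well-trained $G_1$ provides a foundational model of plausible transcriptomic states, which bootstraps the subsequent imputation phases.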

Single-cell RNA sequencing data imputation
The objective of the second phase is to train a generator $G_2$ that can accurately impute missing values (dropouts) in scRNA-seq data. This phase employs a U-Net-based generator (see Supplementary Information). The model learns in a self-supervised manner by artificially masking a portion of the genes in a real scRNA-seq cell profile $x$, creating a masked vector $x_m$ and a corresponding binary mask matrix $M$. The generator takes the tuple $(x_m, z, M, c)$, with noise $z$ and cell-type label $c$, as input and outputs an imputed vector $\tilde{x}=G_2(x_m,z,M,c)$. The adversarial loss for this phase is:
$$\mathcal{L}_2=\min_{G_2}\max_{D_2}\;\mathbb{E}_{x\sim P_r}\big[D_2(x)\big]-\mathbb{E}\big[D_2(G_2(x_m,z,M,c))\big]-\lambda\,\mathbb{E}_{\hat{x}\sim P_{\hat{x}}}\Big[\big(\lVert\nabla_{\hat{x}}D_2(\hat{x})\rVert_2-1\big)^2\Big]$$
This forces $G_2$ to generate imputations that are indistinguishable from non-masked, real data, thereby learning to correct for technical noise and dropouts. After the training process converges, $G_2$ is used for the imputation of scRNA-seq data.
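The self-supervised masking step can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `mask_profile`, the masking fraction, and the convention that a mask value of 1 marks an observed gene are all hypothetical, since the text does not specify them.

```python
import numpy as np

def mask_profile(x, mask_frac, rng):
    """Hide a random fraction of genes; return (masked vector, binary mask).

    Assumed convention: mask == 1 marks an observed gene, 0 a hidden one.
    """
    n_genes = x.shape[0]
    n_hidden = int(round(mask_frac * n_genes))
    hidden = rng.choice(n_genes, size=n_hidden, replace=False)
    mask = np.ones(n_genes, dtype=np.float64)
    mask[hidden] = 0.0
    return x * mask, mask

rng = np.random.default_rng(0)
# Strictly positive toy profile so hidden entries are easy to spot.
x = rng.poisson(3.0, size=20).astype(np.float64) + 1.0
x_masked, mask = mask_profile(x, mask_frac=0.25, rng=rng)
```

The generator would then receive `x_masked` and `mask` (plus noise and a cell-type label) and be trained so that its output is hard to tell apart from the unmasked profile `x`.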

Spatial-aware imputation for spatial transcriptomics data
The objective of the third phase is to adapt the imputation model to the spatial context of ST data, training a final generator $G_3$. This is the core spatial integration phase. The input is ST data, which includes the gene expression matrix, noise, mask matrix, cell type, and the spatial coordinates of each spot/cell. $G_3$ takes the same kinds of inputs as $G_2$ (masked expression, noise, mask matrix, and cell type) but now also propagates information across the coordinates. By introducing spatial coordinates, the data generated for similar nodes become more similar. The discriminator $D_3$ is trained to discriminate between data generated by $G_3$ and data generated by the well-trained scRNA-seq imputer $G_2$. This aligns the ST imputations with the global transcriptomic distribution learned from scRNA-seq. Additionally, a mean squared error (MSE) loss between the generated data and the real, non-masked portions of the ST data is added to ensure fidelity. The composite loss function for this phase is:
$$\mathcal{L}_3=\mathcal{L}_{\mathrm{adv}}(G_3,D_3)+\alpha\,\mathcal{L}_{\mathrm{MSE}}$$
where $\mathcal{L}_{\mathrm{adv}}$ is the WGAN-GP adversarial loss between the outputs of $G_3$ and $G_2$, $\mathcal{L}_{\mathrm{MSE}}$ is the MSE over the non-masked entries, and $\alpha$ is a hyperparameter balancing the adversarial and reconstruction losses.
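A short sketch of how an adversarial term and a masked reconstruction term might be combined in this phase. The names `masked_mse`, `composite_loss`, and `alpha`, and the convention that `mask == 1` marks a measured entry, are assumptions for illustration; the adversarial loss is passed in as a precomputed number.

```python
import numpy as np

def masked_mse(generated, observed, mask):
    """MSE over entries where mask == 1 (assumed to mark measured values)."""
    diff = (generated - observed) * mask
    return float((diff ** 2).sum() / max(mask.sum(), 1.0))

def composite_loss(adv_loss, generated, observed, mask, alpha=1.0):
    """Adversarial loss plus alpha-weighted reconstruction on measured entries."""
    return adv_loss + alpha * masked_mse(generated, observed, mask)

observed = np.array([[1.0, 2.0, 0.0], [0.0, 4.0, 5.0]])
mask = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])
generated = np.array([[1.0, 3.0, 9.0], [7.0, 4.0, 3.0]])
total = composite_loss(adv_loss=0.5, generated=generated,
                       observed=observed, mask=mask, alpha=2.0)
```

Only the four measured entries contribute to the reconstruction term, so the generator is free to fill the hidden positions as long as the critic accepts them.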

Cell type label handling
When explicit cell type labels are unavailable for the scRNA-seq or ST data, STGNET performs unsupervised clustering (e.g. Leiden clustering) on the expression data to assign provisional labels, which are then used as conditional inputs to the generators.

Integrating spatial information to refine spatial transcriptomics imputation
Building upon the initial imputation from the multiple GANs phase, STGNET employs a graph-based framework to further enhance imputation quality by explicitly modeling the spatial dependencies inherent in transcriptomic data. This refinement phase operates on the fundamental biological principle that spatial proximity often correlates with functional similarity, driven by shared microenvironments and cell–cell communication. This stage is divided into three parts. First, STGNET constructs adjacency graphs from spatial coordinates and from gene expression, respectively, to fully integrate tissue structural information. Second, an attention mechanism dynamically merges the two resulting sets of features. Third, a comprehensive optimization objective is employed to achieve precise imputation of ST data.

Multi-view graph construction
To capture both physical adjacency and transcriptional relationships, we construct two complementary graph representations of the ST data.
The spatial graph $\mathcal{G}_s=(A_s,X)$ encodes physical tissue organization, where $A_s$ represents the spatial adjacency matrix, and $X$ denotes the imputed gene expression matrix. Spatial connectivity is determined by Euclidean distance between spots:
$$A_s(i,j)=\begin{cases}1, & d_{ij}\le r\\ 0, & \text{otherwise}\end{cases}$$
where $d_{ij}$ represents the Euclidean distance between spatial coordinates of spots $i$ and $j$, and $r$ is a predefined radius threshold.
The feature graph $\mathcal{G}_f=(A_f,X)$ captures transcriptional similarity using cosine similarity in gene expression space:
$$S_{ij}=\frac{x_i\cdot x_j}{\lVert x_i\rVert\,\lVert x_j\rVert}$$
The feature adjacency matrix $A_f$ is constructed by connecting each spot to its $k$-nearest neighbors based on transcriptional similarity.
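The two graph constructions can be sketched in NumPy as below. The function names, the exclusion of self-loops at this stage, and the symmetrisation of the k-nearest-neighbor graph are assumptions made for illustration rather than details given in the text.

```python
import numpy as np

def spatial_adjacency(coords, radius):
    """A_s[i, j] = 1 when spots i and j lie within `radius` of each other."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    adj = (dist <= radius).astype(np.float64)
    np.fill_diagonal(adj, 0.0)  # self-loops are left to the GCN step
    return adj

def feature_adjacency(expr, k):
    """Symmetric k-nearest-neighbour graph under cosine similarity."""
    norms = np.linalg.norm(expr, axis=1, keepdims=True)
    unit = expr / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)            # a spot is not its own neighbour
    nearest = np.argsort(-sim, axis=1)[:, :k]
    adj = np.zeros(sim.shape, dtype=np.float64)
    rows = np.repeat(np.arange(expr.shape[0]), k)
    adj[rows, nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)             # symmetrise (an assumption)

coords = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0], [5.0, 1.0]])
expr = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
A_s = spatial_adjacency(coords, radius=1.5)
A_f = feature_adjacency(expr, k=1)
```

In this toy example the two graphs coincide because the spatially close spots also have similar expression. On real tissue they generally differ, which is what motivates fusing the two views.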

Multi-view graph convolutional encoder
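We employ a multi-view GCN to extract and integrate information from both spatial and feature perspectives. The spatial convolution operates as:
$$H_s^{(l+1)}=\sigma\Big(\tilde{D}_s^{-1/2}\,\tilde{A}_s\,\tilde{D}_s^{-1/2}\,H_s^{(l)}\,W_s^{(l)}\Big)$$
where $\tilde{A}_s=A_s+I$, $\tilde{D}_s$ is the corresponding degree matrix, $W_s^{(l)}$ are trainable weights, $\sigma$ is a nonlinear activation, and $H_s^{(0)}=X$.
Similarly, the feature convolution is defined as:
$$H_f^{(l+1)}=\sigma\Big(\tilde{D}_f^{-1/2}\,\tilde{A}_f\,\tilde{D}_f^{-1/2}\,H_f^{(l)}\,W_f^{(l)}\Big)$$
with $\tilde{A}_f=A_f+I$ and $H_f^{(0)}=X$.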
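A single graph-convolution step of the kind described here can be written in a few lines. This is a sketch of the standard symmetric-normalised propagation rule, assuming a ReLU activation and random weights in place of trained ones; the function names are hypothetical.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric adjacency matrix."""
    a_tilde = adj + np.eye(adj.shape[0])
    deg = a_tilde.sum(axis=1)                 # at least 1 because of self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, h, weight):
    """One propagation step with a ReLU activation (assumed nonlinearity)."""
    return np.maximum(a_norm @ h @ weight, 0.0)

adj = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
a_norm = normalize_adjacency(adj)
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))    # H^(0) = X: 3 spots, 4 genes
w = rng.normal(size=(4, 2))    # would be trainable weights in a real model
h1 = gcn_layer(a_norm, x, w)
```

The same function would be applied once with the spatial adjacency and once with the feature adjacency to obtain the two views.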

Attention-based feature fusion
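To adaptively balance contributions from spatial and transcriptional views, we employ an attention mechanism that assigns each spot one weight for its spatial embedding and one for its feature embedding.
The final integrated representation is obtained by combining the two embeddings according to these attention weights and passing the result through a linear transformation layer that captures highly variable features of the latent representations.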
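One common way to realise such a two-view attention fusion is a per-spot softmax over scores computed from each embedding. The sketch below assumes that form; the scoring function (a `tanh` projection followed by a shared vector `q`), the parameter names, and the omission of the final linear layer are illustrative assumptions, not the formulation used by STGNET.

```python
import numpy as np

def attention_fuse(h_spatial, h_feature, w_att, q):
    """Per-spot softmax attention over the spatial and feature embeddings."""
    views = np.stack([h_spatial, h_feature], axis=1)      # (spots, 2, dim)
    scores = np.tanh(views @ w_att) @ q                   # (spots, 2)
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    fused = (weights[:, :, None] * views).sum(axis=1)     # (spots, dim)
    return fused, weights

rng = np.random.default_rng(0)
h_s = rng.normal(size=(4, 3))     # hypothetical spatial-view embeddings
h_f = rng.normal(size=(4, 3))     # hypothetical feature-view embeddings
w_att = rng.normal(size=(3, 5))   # would be learned in a real model
q = rng.normal(size=5)
fused, weights = attention_fuse(h_s, h_f, w_att, q)
```

Each row of `weights` sums to one, so a spot whose neighborhood is more informative than its own profile can lean on the spatial view, and vice versa.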

Zero-inflated negative binomial decoder
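To accurately model the characteristics of spatial transcriptomic data, we employ a zero-inflated negative binomial (ZINB) decoder that accounts for over-dispersion and excess zeros:
$$\mathrm{NB}(x;\mu,\theta)=\frac{\Gamma(x+\theta)}{x!\,\Gamma(\theta)}\left(\frac{\theta}{\theta+\mu}\right)^{\theta}\left(\frac{\mu}{\theta+\mu}\right)^{x},\qquad \mathrm{ZINB}(x;\pi,\mu,\theta)=\pi\,\delta_0(x)+(1-\pi)\,\mathrm{NB}(x;\mu,\theta)$$
where $\mu$ is the mean, $\theta$ the dispersion, and $\pi$ the zero-inflation (dropout) probability.
The reconstruction loss is formulated as the negative log-likelihood:
$$\mathcal{L}_{\mathrm{ZINB}}=-\sum_{i,j}\log\mathrm{ZINB}\big(x_{ij};\pi_{ij},\mu_{ij},\theta_{ij}\big)$$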
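The ZINB negative log-likelihood can be evaluated numerically as in the sketch below, which uses the standard parameterisation by mean `mu`, dispersion `theta`, and zero-inflation probability `pi`. Averaging over entries, the small `eps` for numerical safety, and the function name are choices made here for illustration.

```python
import math
import numpy as np

_lgamma = np.vectorize(math.lgamma)   # element-wise log-gamma from the stdlib

def zinb_nll(x, mu, theta, pi, eps=1e-10):
    """Mean negative log-likelihood of counts x under ZINB(mu, theta, pi)."""
    log_nb = (
        _lgamma(x + theta) - _lgamma(theta) - _lgamma(x + 1.0)
        + theta * np.log(theta / (theta + mu) + eps)
        + x * np.log(mu / (theta + mu) + eps)
    )
    nb_zero = np.power(theta / (theta + mu), theta)       # NB probability of a zero
    log_zero = np.log(pi + (1.0 - pi) * nb_zero + eps)    # zero from either source
    log_nonzero = np.log(1.0 - pi + eps) + log_nb
    return float(-np.where(x < 0.5, log_zero, log_nonzero).mean())

x = np.array([0.0, 1.0, 3.0])
mu = np.array([1.0, 1.0, 2.0])
theta = np.array([1.0, 1.0, 2.0])
pi = np.array([0.1, 0.1, 0.1])
nll = zinb_nll(x, mu, theta, pi)
```

A zero count is explained either by the dropout component or by the negative binomial itself, which is why raising `pi` lowers the loss on zeros while raising it on non-zero counts.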

Spatial regularization constraints
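To preserve spatial neighborhood relationships in the latent space, we impose a spatial regularization constraint defined over each spot $i$ and its spatial neighbors $j\in\mathcal{N}(i)$, so that neighboring spots keep similar latent representations.
Here $s_{ij}$ represents the cosine similarity between latent representations of spots $i$ and $j$, and $\mathcal{N}(i)$ denotes the set of spatial neighbors for spot $i$.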
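A regularizer built from the cosine similarity of neighboring latent vectors could look like the following sketch. The specific form, the mean of one minus cosine similarity over adjacent pairs, is an assumption chosen for simplicity and is not taken from the source; the function name is hypothetical.

```python
import numpy as np

def spatial_regularizer(z, adj):
    """Mean (1 - cosine similarity) over spatially adjacent spot pairs."""
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    unit = z / np.clip(norms, 1e-12, None)
    cos = unit @ unit.T
    n_pairs = adj.sum()
    if n_pairs == 0:
        return 0.0
    return float(((1.0 - cos) * adj).sum() / n_pairs)

adj = np.array([[0.0, 1.0], [1.0, 0.0]])      # two spots that are neighbours
aligned = np.array([[1.0, 0.0], [2.0, 0.0]])  # same direction in latent space
orthogonal = np.array([[1.0, 0.0], [0.0, 1.0]])
low = spatial_regularizer(aligned, adj)
high = spatial_regularizer(orthogonal, adj)
```

The penalty is zero when neighbors point in the same latent direction and grows as they diverge, which is the behavior the constraint is meant to encourage.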

Joint optimization framework
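The complete optimization objective integrates all components through a weighted combination:
$$\mathcal{L}=\lambda_1\,\mathcal{L}_{\mathrm{rec}}+\lambda_2\,\mathcal{L}_{\mathrm{spa}}+\lambda_3\,\mathcal{L}_{\mathrm{dist}}$$
where $\mathcal{L}_{\mathrm{rec}}$ is the ZINB reconstruction loss, $\mathcal{L}_{\mathrm{spa}}$ is the spatial regularization term, $\mathcal{L}_{\mathrm{dist}}$ enforces consistency with the original data, and $\lambda_1$, $\lambda_2$, and $\lambda_3$ are balancing coefficients that control the contributions of reconstruction accuracy, spatial coherence, and distributional consistency with the original data, respectively. This multi-objective optimization ensures that the refined imputations are both biologically plausible and spatially coherent.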

Validation datasets
To ensure a comprehensive evaluation of STGNET’s performance, we curated a diverse collection of nine publicly available ST datasets. These datasets were generated using a variety of experimental platforms (e.g. MERFISH, seqFISH, and Xenium) and encompass multiple tissue and organ types, providing a robust testbed that reflects the heterogeneity of real-world data. The selected datasets exhibit considerable variation in the number of cells/spots and genes, as detailed in Table 1.
A subset of these ST datasets (MERFISH, seqFISH, and Xenium) includes authoritative cell type annotations. For the remaining datasets lacking such annotations, we first performed unsupervised clustering on the paired single-cell RNA sequencing (scRNA-seq) reference data using the Leiden algorithm [21] to assign provisional cell type labels. These labels were then transferred to the ST data using the robust deconvolution method implemented in SpatialScope [17], ensuring consistent cell type information across all datasets for downstream evaluation.

Benchmarking and baseline methods
We benchmarked STGNET against seven state-of-the-art methods for spatial data imputation and integration:

Tangram [13]: A mapping-based method that aligns scRNA-seq data to spatial data.

gimVI [15]: A deep generative model based on a variational autoencoder for joint analysis of ST and scRNA-seq data.

stPlus [16]: A method that uses reference scRNA-seq data and an autoencoder for joint embedding to enhance ST data.

SpaGE [14]: A machine learning approach that integrates scRNA-seq and ST data to predict unmeasured genes.

uniPort [33]: A unified framework for the integration and imputation of single-cell and spatial data.

SpatialScope [17]: A method employing diffusion-based models to reference scRNA-seq data for ST data completion.

stDiff [18]: A recent diffusion-model-based approach for enhancing ST data.

For all baseline methods, data preprocessing steps—including normalization and scaling—were performed in strict accordance with the instructions and source code provided by the original authors. Detailed source code references and parameter settings are documented in Supplementary Materials.

Performance metrics
We conducted a multi-faceted evaluation of imputation performance from both cellular and genic perspectives.

Cellular-level evaluation
To assess the ability of imputed data to preserve biological identity and cellular relationships, we evaluated the consistency of cell clustering results between the imputed and ground-truth ST data. We employed four established clustering metrics [34]:

Adjusted Rand Index (ARI): Measures the similarity between two data clusterings, corrected for chance.

Adjusted Mutual Information (AMI): Quantifies the mutual information between two clusterings, adjusted for chance.

Normalized Mutual Information (NMI): A normalization of the Mutual Information score to scale the results between 0 (no mutual information) and 1 (perfect correlation).

Homogeneity (Homo): Measures whether each cluster contains only members of a single class.

Higher values for all four metrics indicate better clustering consistency and thus superior preservation of cellular states.
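As an illustration of the first of these metrics, the Adjusted Rand Index can be computed directly from the contingency counts of two labelings. The sketch below uses only the standard library; established packages provide equivalent, well-tested implementations of all four metrics, and the function name here is chosen for illustration.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """ARI from pair counts; 1.0 for identical partitions, near 0 for random."""
    n = len(labels_true)
    total = comb(n, 2)
    if total == 0:
        return 1.0
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / total
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        return 1.0                      # both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)

truth = [0, 0, 1, 1]
relabelled = [1, 1, 0, 0]   # same partition, different cluster names
crossed = [0, 1, 0, 1]      # every true cluster split across predictions
ari_same = adjusted_rand_index(truth, relabelled)
ari_crossed = adjusted_rand_index(truth, crossed)
```

Because the index depends only on which cells are grouped together, renaming clusters leaves it unchanged, which is what makes it suitable for comparing clusterings of imputed and ground-truth data.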

Genic-level evaluation
To evaluate the accuracy of gene expression prediction, we employed a five-fold cross-validation strategy, holding out a portion of genes as ground truth. We used four complementary metrics:

Spearman’s rank correlation coefficient (SPCC): A non-parametric measure of the monotonic relationship between the predicted and true expression values for a gene across all cells. A higher SPCC indicates better preservation of expression rankings.

Structural similarity index measure (SSIM): Assesses the perceptual similarity between the spatial expression patterns of predicted and true values for a gene. A higher SSIM indicates better preservation of spatial structure.

Root mean square error (RMSE): Measures the magnitude of the absolute differences between predicted and true expression values. A lower RMSE indicates higher numerical accuracy.

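Jensen–Shannon divergence (JS): A symmetric and bounded measure of the similarity between the probability distributions of predicted and true expression values for a gene. A lower JS divergence indicates greater distributional similarity. For a predicted distribution $P$ and a true distribution $Q$, JS is computed from the Kullback–Leibler (KL) divergence, which is given by:
$$\mathrm{KL}(P\,\|\,Q)=\sum_i P(i)\log\frac{P(i)}{Q(i)},\qquad \mathrm{JS}(P\,\|\,Q)=\tfrac{1}{2}\,\mathrm{KL}(P\,\|\,M)+\tfrac{1}{2}\,\mathrm{KL}(Q\,\|\,M),\quad M=\tfrac{1}{2}(P+Q)$$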

For SPCC and SSIM, higher values denote better performance, while for RMSE and JS, lower values are preferable.
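Three of these per-gene metrics are simple enough to sketch in NumPy. The code below is illustrative only: how expression vectors are turned into probability distributions for the JS divergence (here, dividing by their sum) is an assumption, and SSIM is omitted because it requires rasterising the spatial pattern into an image.

```python
import numpy as np

def _average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    values = np.asarray(values, dtype=np.float64)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=np.float64)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(pred, true):
    """Pearson correlation of the rank-transformed vectors."""
    return float(np.corrcoef(_average_ranks(pred), _average_ranks(true))[0, 1])

def rmse(pred, true):
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(true)) ** 2)))

def _kl(a, b, eps):
    return float(np.sum(a * np.log((a + eps) / (b + eps))))

def js_divergence(pred, true, eps=1e-12):
    """JS divergence (natural log) after normalising each vector to sum to 1."""
    p = np.asarray(pred, dtype=np.float64) / np.sum(pred)
    q = np.asarray(true, dtype=np.float64) / np.sum(true)
    m = 0.5 * (p + q)
    return 0.5 * _kl(p, m, eps) + 0.5 * _kl(q, m, eps)

true = np.array([1.0, 2.0, 3.0, 4.0])
pred = 2.0 * true                      # right ranking and shape, wrong scale
scores = (spearman(pred, true), rmse(pred, true), js_divergence(pred, true))
```

The toy prediction shows why the metrics are complementary: a prediction that is off by a constant factor has perfect rank correlation and zero distributional divergence, yet a clearly non-zero RMSE.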

Results

Overview of STGNET
We present STGNET, a deep generative framework designed to overcome the gene panel limitation of imaging-based ST by accurately imputing the expression of unmeasured genes. The core challenge is to generate a complete, spatially coherent transcriptome from a limited initial measurement, thereby enabling more powerful downstream biological discovery.
As illustrated in Fig. 1, STGNET operates through a structured, two-stage pipeline. In the first stage, the model leverages a multi-stage GAN to learn the complex, high-dimensional distribution of gene expression from a reference scRNA-seq atlas. This phase is stabilized using the Wasserstein GAN with gradient penalty (WGAN-GP), ensuring robust learning of global transcriptomic features that are resilient to technical noise and dropout events common in single-cell data.
The second stage refines these initial imputations by explicitly modeling the spatial context of the tissue. Here, a multi-view GCN integrates two complementary graphs constructed from the ST data: a spatial graph, encoding physical proximity between cells/spots, and a feature graph, encoding transcriptional similarity. A key innovation is an integrated attention mechanism that dynamically fuses information from these two views, allowing the model to adaptively prioritize neighborhood context or cellular identity for each node. This ensures the final imputed expression matrix is not only transcriptionally plausible but also spatially consistent, faithfully capturing the tissue’s structural organization. By synergistically combining global generative modeling with local spatial refinement, STGNET transforms sparse, targeted ST panels into comprehensive spatial transcriptomes, unlocking their full potential for analyses such as spatial domain identification, cell–cell communication inference, and trajectory analysis.

STGNET accurately imputes gene expression in spatial transcriptomics data
We rigorously evaluated the gene imputation performance of STGNET against seven state-of-the-art methods (i.e. Tangram, gimVI, SpaGE, stPlus, uniPort, SpatialScope, and stDiff) using a five-fold cross-validation framework. For each ST dataset, the gene panel was partitioned into five subsets. In each fold, one subset was held out as ground truth for validation, while the remaining four subsets were used as input to train the models. This procedure ensured a comprehensive assessment of each method’s ability to predict unmeasured genes.
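The gene-wise five-fold split described above can be sketched as follows. The random shuffling, the seed, and the function name `gene_folds` are assumptions for illustration; the text does not state how genes were assigned to folds.

```python
import numpy as np

def gene_folds(gene_names, n_folds=5, seed=0):
    """Yield (input_genes, held_out_genes) for each cross-validation fold."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(np.asarray(gene_names))
    parts = np.array_split(shuffled, n_folds)
    for i, held_out in enumerate(parts):
        inputs = np.concatenate([p for j, p in enumerate(parts) if j != i])
        yield inputs.tolist(), held_out.tolist()

genes = [f"gene{i}" for i in range(12)]   # hypothetical panel of 12 genes
folds = list(gene_folds(genes))
```

Each gene is held out exactly once, so every gene in the panel contributes one ground-truth comparison to the reported metrics.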
Performance was quantified using four complementary metrics: SPCC to assess monotonic relationships, SSIM to evaluate spatial pattern preservation, RMSE to measure numerical accuracy, and Jensen–Shannon divergence (JS) to compare expression distributions. We applied this evaluation to three imaging-based ST datasets representing different biological backgrounds and technology platforms: mouse embryo profiled with seqFISH, mouse primary motor cortex profiled with MERFISH, and mouse osteosarcoma profiled with MERFISH.
As shown in Fig. 2, STGNET demonstrated superior performance across all evaluation metrics and datasets. Notably, STGNET achieved the highest SPCC scores, with improvements ranging from 0.1145 to 0.1344 over the second-best performing methods, indicating stronger correlation between imputed and actual expression values. STGNET also excelled in preserving spatial expression patterns, as evidenced by superior SSIM scores, while simultaneously minimizing both numerical error and distributional divergence. The consistent outperformance across diverse datasets and multiple evaluation perspectives demonstrates STGNET’s robust capability for accurate ST imputation.

STGNET preserves cellular topology and spatial domains
Accurate preservation of tissue topology is paramount for ST imputation, as the spatial organization of cells into distinct domains underpins fundamental biological processes. We therefore evaluated whether imputed data maintained the latent topological patterns present in the original tissue by comparing cluster assignments derived from imputed data against ground-truth annotations.
We first performed a detailed analysis on mouse primary motor cortex sequenced by MERFISH, which possesses authoritative cell type labels. Application of the Leiden algorithm to the data imputed by STGNET using the five-fold cross validation method and the full-transcriptome data revealed spatial domains that closely recapitulated the original biological structure (Fig. 3A and B). Notably, clustering on STGNET-imputed data achieved higher concordance with the ground truth than clustering on the original, limited gene panel, indicating that STGNET not only preserves but can enhance the signal for spatial domain identification by augmenting the expression profiles.
To establish statistical robustness, we extended this validation to six additional datasets using a five-fold cross-validation framework: mouse embryo (seqFISH), mouse gastrulation (seqFISH), primary visual cortex (ExSeq), somatosensory cortex (osmFISH), primary visual cortex (MERFISH), and Drosophila embryo (FISH). The similarity between clusters from the imputed data and the ground-truth clusters was quantified using four established metrics (ARI, AMI, Homogeneity, and NMI). As summarized in Fig. 3C, STGNET consistently outperformed all benchmark methods, achieving the highest scores across all metrics and datasets. While methods like stDiff showed competitive performance on select datasets, their results were inconsistent. In contrast, STGNET reliably produced imputations that maintained the inherent topological structure of the tissue, underscoring its robustness and superiority for preserving the spatial integrity essential for downstream analyses. Comprehensive results are provided in Supplementary Tables S1 and S2.
To verify the impact of gene imputation on spatial clustering, we ran GraphST [35], DeepST [36], and Spaceflow [37] on raw and STGNET-imputed gene expression data from the mouse primary motor cortex dataset. Imputation improved spatial clustering performance compared with the original data; the specific results can be found in Supplementary Fig. S9.

STGNET accurately reconstructs spatial expression patterns of key developmental genes
A critical challenge in ST imputation is the accurate reconstruction of biologically meaningful expression patterns for genes not included in the original panel. To evaluate this capability, we applied STGNET to a mouse gastrulation embryo dataset sequenced by seqFish, focusing on well-characterized marker genes with established spatial distributions across distinct tissue domains. Using a five-fold cross-validation framework, we masked these marker genes and assessed each method’s ability to recover their characteristic spatial patterns.
We selected four key developmental markers with documented expression patterns: Sox2, a neural tube marker expressed in the brain and dorsal embryo regions [38]; Popdc2, a cardiomyocyte marker localized to the developing heart tube [39]; Foxa1, an endoderm marker expressed along the anterior–posterior axis of the gut tube [40]; and Foxf1, a mesoderm marker present in gut tube mesoderm, lateral plate mesoderm, and allantois [41].
As shown in Fig. 4B, STGNET successfully reconstructed the precise spatial distributions of all four marker genes, closely matching the patterns observed in the ground truth data. In contrast, other methods showed varying degrees of failure: stDiff, SpaGE, and stPlus generated diffuse or incorrect spatial patterns for most genes. While SpaGE and stPlus partially captured Sox2 expression, their reconstructions lacked the spatial precision achieved by STGNET. Notably, for Foxa1 and Popdc2, only STGNET accurately reconstructed the characteristic expression patterns, while all other methods failed. For Foxf1, although stDiff, SpaGE, and Tangram showed some spatial organization, their patterns lacked clear structural definition and contained regions inconsistent with the true biological distribution.
Quantitative analysis of Spearman correlation (Fig. 4C) confirmed STGNET’s superior performance, with significantly higher SPCC values across all evaluated markers. Results for eight additional marker genes (Supplementary Fig. S4) further demonstrate STGNET’s consistent ability to reconstruct biologically accurate spatial expression patterns, establishing its utility for discovering spatial gene distributions in developmental contexts.

STGNET reveals cancer progression trajectory and identifies novel markers in human breast cancer
Breast cancer progression involves a critical transition from ductal carcinoma in situ (DCIS) to invasive ductal carcinoma (IDC), a process that remains incompletely understood at the molecular level. To investigate this transition and identify potential drivers of invasion, we applied STGNET to a human breast cancer dataset generated by the Xenium platform, using the imputed whole transcriptome to reconstruct developmental trajectories and discover novel cancer-associated genes.
We first applied Louvain clustering to the raw data and computed four clustering metrics: ARI 0.3179, AMI 0.5401, Homo 0.5666, and NMI 0.5411. Under the same parameter settings, the imputed data yielded ARI 0.4572, AMI 0.7656, Homo 0.8896, and NMI 0.7664. STGNET’s imputation revealed substantially enhanced cellular heterogeneity compared with the original data. While Louvain clustering identified 11 distinct clusters in the original ST data (Fig. 5A), the imputed transcriptome resolved 20 fine-grained cellular communities (Fig. 5B). This enhanced resolution suggests that the complete transcriptome captured by STGNET uncovers previously obscured cellular states within the tumor microenvironment.
Trajectory inference using stLearn [42] on the imputed data revealed a more complex progression pathway from DCIS to IDC than was apparent in the original data. Whereas the original data suggested a direct transition from DCIS (Cluster 1) to Proliferative_Invasive_Tumor (Cluster 8), which subsequently develops into IDC, the imputed data identified an intermediate cellular state (Cluster 14) through which cells must pass during the DCIS-to-IDC transition (Fig. 5B). This intermediate state, which appears to be located in a region primarily labeled as Unlabeled in the original data, may represent a previously obscured cellular state that becomes statistically resolvable with enhanced whole-transcriptome context.
Leveraging the whole-transcriptome coverage enabled by STGNET, we identified several genes specifically associated with DCIS and proliferative invasive tumor clusters that are supported by existing cancer literature (Fig. 5C). These include CHI3L1 [43], known to be elevated in cancer tissues and implicated in carcinogenesis; DKK3, frequently downregulated in malignant breast cancer cell lines [44]; EREG, enhanced in early breast lesions [45]; ITGB3, associated with tumor invasiveness and metastatic potential [46]; MUC16, overexpressed in breast cancer tissues [47]; and SCGB1D2, a highly specific breast cancer marker [48]. The identification of these established cancer genes validates STGNET’s ability to recover biologically meaningful expression patterns, while also providing a platform for discovering novel candidates in the complete set of imputed genes.

STGNET enhanced differential expression analysis and marker gene detection
Due to the inherent sparsity and noise in ST data, many cell type-specific marker genes are difficult to detect in the original measurements, particularly those with low expression levels or layer-specific patterns. This sparsity limits the ability to resolve cellular subtypes and cortical laminar organization. Therefore, recovering masked biological signals and improving the sensitivity of cell-type marker detection remain key challenges in ST analysis. On the mouse primary motor cortex dataset, we show that STGNET-imputed data allow more sensitive detection of cell type-specific markers.
STGNET enables the discovery of new marker genes that cannot be detected in the original gene expression matrix (Fig. 6A). Differential expression analysis identified Clstn2 [49] as a novel layer-specific marker for L5 ET neurons (Fig. 6B), a finding supported by existing literature on cortical layer organization. Gene ontology enrichment analysis further demonstrated that the imputed data significantly enhanced the detection of neural function, tissue development, and synaptic activity pathways compared with the original data (Supplementary Figs S5 and S6).
These results collectively demonstrate that STGNET not only recovers known biology but also reveals novel insights into the complex signaling networks and molecular programs underlying motor cortex organization and function.

Computational performance and robustness analysis
To provide a comprehensive evaluation of STGNET’s practical utility and architectural design, we conducted analyses across three dimensions: computational resource requirements, component contribution, and parameter sensitivity.
The time cost and memory usage experiments were conducted on an Ubuntu server with an Intel Sky Lake-E processor, a GeForce RTX 2080 Ti GPU, and 256 GB of memory. We evaluated the runtime and peak memory consumption of STGNET against all seven benchmark methods on the Somatosensory Cortex dataset under a standardized five-fold cross-validation protocol. Runtime was recorded as the total wall-clock time for a complete training and imputation cycle per fold. Peak memory usage was monitored throughout the process. The results, detailed in Supplementary Table S3, reveal a clear spectrum of computational cost. The mapping-based method Tangram and the autoencoder-based methods (stPlus, gimVI) were computationally efficient, completing imputation within minutes using <1.1 GB of memory. In contrast, deep generative models requiring more complex optimization, including the diffusion-based methods (SpatialScope, stDiff) and our GAN-based STGNET, incurred significantly higher resource costs. STGNET required a mean runtime of 1.72 h and 7 GB of peak memory per fold. This cost is attributed to the joint adversarial training of multiple generator–discriminator pairs and the subsequent message-passing operations of the GCN, which are intrinsic to achieving high-fidelity, spatially coherent imputations.
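The per-fold measurement described above (wall-clock time for one training and imputation cycle, plus peak memory) can be sketched as follows. This is a minimal illustration and not the authors' benchmarking script: `profile_fold`, `dummy_fold`, and `summarize` are hypothetical names, and `tracemalloc` only tracks Python heap allocations, so GPU memory or memory held by native libraries would need a different monitor.

```python
import time
import tracemalloc


def profile_fold(run_fold, *args, **kwargs):
    """Run one fold and return (result, wall-clock seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = run_fold(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def dummy_fold(n):
    # Hypothetical stand-in for one complete train-and-impute cycle.
    data = [float(i) for i in range(n)]
    return sum(data)


def summarize(n_folds=5, n=100_000):
    """Mean runtime and maximum peak memory over n_folds independent runs."""
    runs = [profile_fold(dummy_fold, n) for _ in range(n_folds)]
    mean_time = sum(r[1] for r in runs) / n_folds
    max_peak = max(r[2] for r in runs)
    return mean_time, max_peak
```

In practice `dummy_fold` would be replaced by the function that trains and imputes on a single cross-validation split, and the per-fold figures would be averaged as in Supplementary Table S3.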
To rigorously evaluate the contribution of each core component within the STGNET architecture, we performed a systematic ablation study. We designed six distinct model variants, each removing a specific module or constraint from the full pipeline (see Supplementary notes). Each variant was evaluated on the Somatosensory Cortex dataset using our established five-fold cross-validation protocol. Performance was assessed using the full suite of gene-level (SPCC, SSIM, RMSE, and JS) and cluster-level (ARI, AMI, Homogeneity, and NMI) metrics. The results, detailed in Supplementary Fig. S7 and Table S4, provide quantitative validation of our architectural design. The most substantial performance degradation occurred when either the primary generative module (W/O ) or the primary spatial module (W/O MG) was removed. This confirms that the two-stage paradigm—global distribution learning followed by local spatial refinement—is fundamental to STGNET’s success. The ablation of individual graph views (W/O Spatial Graph, W/O Feature Graph) or auxiliary loss terms (W/O SR, W/O KL) resulted in consistent, though less severe, declines across all metrics. This demonstrates that the multi-view graph integration and the composite loss function provide complementary information and constraints, collectively optimizing the imputed data for both transcriptional accuracy and spatial coherence.
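To make the gene-level comparison between variants concrete, the sketch below computes three per-gene scores and averages them over genes. It is an assumption-laden illustration: plain Pearson correlation stands in for the paper's correlation metric, SSIM is omitted, the Jensen-Shannon divergence is taken in base 2 over sum-normalized expression vectors, and the function names (`pearson`, `rmse`, `js_divergence`, `evaluate`) are hypothetical. The exact normalizations used by the authors are not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rmse(x, y):
    """Root mean squared error between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def js_divergence(x, y):
    """Jensen-Shannon divergence (base 2) of two non-negative vectors."""
    sx, sy = sum(x), sum(y)
    if sx <= 0 or sy <= 0:
        raise ValueError("both vectors need a positive sum")
    p = [a / sx for a in x]
    q = [b / sy for b in y]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(u, v):
        # Terms with u_i == 0 contribute nothing to the KL sum.
        return sum(a * math.log2(a / b) for a, b in zip(u, v) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def evaluate(truth, imputed):
    """Average each metric over genes; inputs map gene name -> per-cell values."""
    scores = {"pcc": [], "rmse": [], "js": []}
    for gene in sorted(truth):
        scores["pcc"].append(pearson(truth[gene], imputed[gene]))
        scores["rmse"].append(rmse(truth[gene], imputed[gene]))
        scores["js"].append(js_divergence(truth[gene], imputed[gene]))
    return {name: sum(vals) / len(vals) for name, vals in scores.items()}
```

Running `evaluate` once per ablation variant on the held-out genes of each fold would give a table comparable in layout to the one the paragraph refers to, with higher correlation and lower RMSE and JS indicating better imputation.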
To verify the robustness of the model, we systematically varied the hyperparameters and evaluated the imputed results under five-fold cross-validation on the Somatosensory Cortex dataset. The results are shown in Supplementary Fig. S8 and Tables S5 and S6. They indicate that STGNET requires an appropriate to maintain the stability of GAN training. At the same time, a suitable is needed to ensure a balance between adversarial loss and reconstruction loss during the training process. When and , the model reaches its optimum. As these two values increase, redundant information may be introduced. Typically, four neighboring cells are selected to construct the adjacency graph. The appropriate is determined based on the number of -nearest neighbors to construct a gene expression similarity adjacency matrix with consistent numbers. We also examined the hyperparameters of the imputation optimization objectives in the multi-view graph neural network; various parameter combinations had little effect on the gene similarity metrics. These experiments demonstrate the robustness of STGNET.
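The spatial adjacency graph built from four neighboring cells can be illustrated with a brute-force nearest-neighbor search. This is only a sketch under stated assumptions: the function name `knn_adjacency` is hypothetical, distances are Euclidean on cell centroid coordinates, and the graph is symmetrized (an edge is kept if either cell selects the other), which the source does not specify.

```python
import math


def knn_adjacency(coords, k=4):
    """Binary, symmetric adjacency matrix linking each cell to its k nearest cells."""
    n = len(coords)
    if k >= n:
        raise ValueError("k must be smaller than the number of cells")
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        # Sort all other cells by Euclidean distance to cell i (ties broken by index).
        dists = sorted(
            (math.dist(coords[i], coords[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i][j] = 1
            adj[j][i] = 1  # assumption: symmetrize the graph
    return adj
```

The same routine applied to expression vectors instead of coordinates would yield a feature-similarity graph, although the paper's similarity measure for that view may differ. For datasets with many cells, a tree-based or approximate neighbor search would be preferable to this quadratic loop.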

Conclusion
In this study, we present STGNET, a novel deep learning framework that integrates GANs and GCNs to address the fundamental limitation of gene panel size in imaging-based ST. By leveraging a multi-stage GAN architecture, STGNET first learns the continuous distribution of gene expression from single-cell RNA sequencing references, generating transcriptomically plausible data. The model further refines these imputations through a spatially-aware graph neural network that explicitly incorporates both physical proximity and transcriptional similarity, ensuring the final predictions maintain biological coherence within the tissue architecture.
Our comprehensive evaluation across nine diverse datasets demonstrates that STGNET consistently outperforms existing state-of-the-art methods in both gene-level accuracy and structural preservation. Quantitative assessments using multiple clustering and similarity metrics confirm that STGNET produces imputed data that closely match ground truth measurements while maintaining the topological relationships essential for spatial biology. More importantly, the practical utility of STGNET is evidenced through multiple biological applications: it successfully reconstructed key developmental gene expression patterns in mouse embryogenesis and revealed novel transitional states in breast cancer progression.
The ability of STGNET to extend limited gene panels to comprehensive transcriptome coverage while preserving spatial context represents a significant advance in ST analysis. By enabling more complete characterization of cellular heterogeneity, developmental processes, and disease mechanisms, STGNET provides researchers with a powerful tool to extract deeper biological insights from existing ST datasets. As the field continues to evolve, we anticipate that computational approaches like STGNET will play an increasingly vital role in maximizing the information yield from spatial genomics technologies and advancing our understanding of tissue organization and function in both health and disease.

Key Points
This study proposed STGNET, a novel deep learning framework that uniquely integrates multi-stage generative adversarial networks with a spatially-aware graph convolutional network to address the gene panel limitation in imaging-based spatial transcriptomics (ST).

STGNET outperforms seven state-of-the-art methods across nine diverse datasets, demonstrating superior accuracy in gene expression imputation and exceptional preservation of cellular topological structures.

STGNET provides a robust computational solution to extend the effective coverage of targeted ST assays, transforming sparse, panel-based data into comprehensive ST for deeper and more reliable biological investigation.

STGNET enables new biological discoveries, including reconstructing key developmental gene patterns, identifying a novel transitional cell state in breast cancer progression, and uncovering extensive cell–cell communication networks previously obscured by limited gene panels.

Supplementary Material
supplementary_information_bbag122

Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.
