Graph neural network modeling of spatial tumor-immune interactions identifies prognostic cellular niches in non‑small cell lung cancer.

Hoebel KV; Lindsay JR; Altreuter J; Alessi JV; Weirather JL; Dryg I; Giobbie-Hurder A; Li Z; Yu KH; Awad MM; Rodig SJ; Lotter W

doi:10.1038/s41698-026-01314-3

← 뒤로

Graph neural network modeling of spatial tumor-immune interactions identifies prognostic cellular niches in non‑small cell lung cancer.

1/5 보강

NPJ precision oncology 📖 저널 OA 91.8% 2023~2026 2026 Vol.10(1)

PICO 자동 추출 (휴리스틱, conf 2/4)

유사 논문

P · Population 대상 환자/모집단

506 patients.

I · Intervention 중재 / 시술

추출되지 않음

C · Comparison 대조 / 비교

추출되지 않음

O · Outcome 결과 / 결론

Latent-space clustering identified distinct TME states predictive of outcome, reflecting varying balances of immune activation and evasion. Our approach underscores the prognostic significance of spatially resolved immune-tumor interactions, providing a blueprint for developing next-generation spatial biomarkers to guide precision treatment strategies in NSCLC.

Hoebel KV, Lindsay JR, Altreuter J, Alessi JV, Weirather JL, Dryg I

📖 무료 전문 🟢 PMC 전문 PMC13096322

PubMed ↗ DOI ↗ BibTeX ↓ RIS ↓

📝 환자 설명용 한 줄

The spatial organization of immune and tumor cells within the tumor microenvironment (TME) has a critical influence on antitumor immunity and patient survival.

이 논문을 인용하기

↓ .bib ↓ .ris

APA Hoebel KV, Lindsay JR, et al. (2026). Graph neural network modeling of spatial tumor-immune interactions identifies prognostic cellular niches in non‑small cell lung cancer.. NPJ precision oncology, 10(1). https://doi.org/10.1038/s41698-026-01314-3

MLA Hoebel KV, et al.. "Graph neural network modeling of spatial tumor-immune interactions identifies prognostic cellular niches in non‑small cell lung cancer.." NPJ precision oncology, vol. 10, no. 1, 2026.

PMID 41794974 ↗

DOI 10.1038/s41698-026-01314-3

Abstract

The spatial organization of immune and tumor cells within the tumor microenvironment (TME) has a critical influence on antitumor immunity and patient survival. However, hand-engineered metrics such as cell densities or pairwise proximity scores fail to capture the complexity of local cell-cell interactions. Understanding these higher-order spatial patterns and their relation to patient outcomes is especially important for non-small cell lung cancer (NSCLC), the deadliest cancer worldwide. Here, we elucidate the NSCLC TME using a graph neural network (GNN)-based framework to model spatially localized cellular neighborhoods in multiplex immunofluorescence data from a clinical cohort of 506 patients. The GNN predicted patient survival with high accuracy (concordance index: 0.82) and remained a significant prognostic factor when adjusted for clinical covariates. Interpretability analyses revealed that specific combinations of cell types, particularly involving CD8+ T cells, PD-L1+ immune cells, and FOXP3+ regulatory T cells, modulated predictions depending on their spatial context. In-silico manipulation experiments applied to the trained GNN, used here as an interpretable surrogate model, suggested that the impact of CD8+ cells on survival were estimated as favorable when in direct tumor contact and less favorable when adjacent to immunosuppressive cells. Latent-space clustering identified distinct TME states predictive of outcome, reflecting varying balances of immune activation and evasion. Our approach underscores the prognostic significance of spatially resolved immune-tumor interactions, providing a blueprint for developing next-generation spatial biomarkers to guide precision treatment strategies in NSCLC.

📖 전문 본문 읽기 PMC JATS · ~76 KB · 영문

Introduction

Introduction
Despite significant improvements in survival rates, primarily driven by advances in targeted therapies and immunotherapies for non-small cell lung cancer (NSCLC)1–4, lung cancer remains the leading cause of cancer-related mortality2. A significant hurdle is the variability in patient response to therapies, compounded by the development of drug resistance5,6.
The tumor microenvironment (TME) is central to this variability—a dynamic ecosystem of tumor and non-tumor cells that orchestrates tumor evolution and progression, influencing patient prognosis7–10. The spatial organization and cellular and molecular signaling within the TME are increasingly recognized as key determinants of clinical outcomes. Yet, traditional approaches to studying the TME often fail to capture its complexity7.
Tumor-infiltrating lymphocytes (TILs), particularly CD8+ immune cells, are critical mediators of antitumor immunity, with high densities of CD8+ TILs strongly correlating with improved survival and response to immune checkpoint blockade independent of tumor stage and therapy11–16. However, the TME in NSCLC is defined by significant spatial heterogeneity and complex cell-cell interactions, making density-based metrics, such as TIL density or PD-L1 tumor proportion score (TPS), insufficient for capturing its intricate architecture17–19. Towards addressing this, proximity-based approaches, like the G-cross function, have quantified spatial relationships between cell populations, uncovering, for instance, that regulatory T cells close to tumor cells are associated with poorer survival outcomes20,21. Yet, such pairwise analyses provide a limited view, failing to reflect the broader, multifaceted interactions among diverse cell types that shape the TME.
Neighborhood analysis has emerged as a powerful approach for characterizing localized spatial patterns within the TME, thereby bridging this gap. By defining ‘neighborhoods’ of adjacent and potentially interacting cells, this method better captures high-order spatial organization that is lost in pairwise metrics and global analyses22–26. Studies have demonstrated the prognostic value of spatial neighborhoods in various cancers, such as colorectal cancer, where the enrichment of neighborhoods containing PD1+ CD4+ immune cells is associated with improved survival24. In NSCLC, the co-localization of tumor and immune cells, such as tumor neighborhoods containing CD4+ effector cells and macrophages, significantly improved the efficacy of immune checkpoint inhibitors27.
Despite its promise, neighborhood analysis poses significant computational challenges due to many possible interactions among cell types, for which traditional, ‘hand-engineered’ metrics fail to adequately capture. Graph neural networks (GNNs) offer a promising solution to this complexity by leveraging data-driven deep learning techniques while offering enhanced interpretability compared to other deep learning approaches. GNNs model cells as nodes and spatial relationships as edges, enabling the integration of multivariate interactions into predictive frameworks28. Through end-to-end training, these networks can uncover latent patterns in the TME, linking cellular networks to clinical outcomes like patient survival without relying on predefined features.
By leveraging GNNs to analyze spatial neighborhood graphs, this study aims to advance the characterization of the NSCLC TME, addressing the limitations of existing methods and uncovering novel insights into how localized spatial interactions associate with patient survival. Using a real-world clinical cohort, we train GNNs to estimate patient overall survival based on multiplex immunofluorescence (mIF) data, and subsequently interrogate the features learned. This approach has the potential to provide a more comprehensive understanding of the TME, identifying actionable spatial features that may drive therapeutic response.

Results

Results

Patient cohort
This study is a secondary analysis of a cohort of 506 NSCLC patients treated at the Dana-Farber Cancer Institute11,12. The cohort represents an unselected, prospectively collected set of patient samples, with the only criteria for inclusion being patient consent and the availability of adequate tissue for analysis. Spatial distribution maps detailing the expression of five markers (Cytokeratin [CK], CD8, PD-L1, PD-1, FOXP3) for each cell in a region of interest (ROI) were generated using a standardized mIF assay for 542 formalin-fixed paraffin-embedded tissue biopsy samples (Fig. 1a)11,12. After quality control, 538 tumor samples from 506 patients, encompassing 1992 ROIs with 6,941,946 cells, were included in the analysis (Supplementary Fig. 1). The median follow-up was 2.9 years. Patient demographics are detailed in Fig. 2a and Supplementary Table 1. For analysis purposes, we defined eight different cell types based on the co-occurrence of the five mIF markers: PD-L1- tumor (CK+ ), PD-L1+ tumor (CK+ ), PD-1-/CD8+ , PD-1+/CD8+ , PD-L1+ immune cells (CK-), PD-1 immune cells (CD8-), FOXP3+ and unspecified (negative for all five markers). See Supplementary Table 4 for an overview.
For model development, the data were split into training, validation, and test cohorts of 305 (332/1229), 72 (73/275), and 129 (129/471) patients (samples/ROIs), respectively. Training and validation cohorts were used for model training and selection. The dataset split was performed at the patient level and stratified by patient survival. Clinicopathologic characteristics and patient survival distributions were comparable across splits (Supplementary Fig. 2 and Supplementary Table 1). Factors significantly associated with overall survival (univariate Cox regression; Fig. 2b) included tumor stage (logHR: 0.86) and immuno-therapy (IO) treatment status (yes/no, logHR: 1.15). This analysis was restricted to the training and validation datasets to avoid test data leakage.

Neighborhood graph neural network accurately predicts patient survival
We trained an interpretable GNN to predict the relationship between spatially localized TME patterns and patient survival, following the SPACE-GM architecture of Wu et al.28. Graphs were constructed for each ROI, where nodes represented individual cells with the binary marker expression as node features, and edges connected neighboring cells (Fig. 1a). From these ROI graphs, spatial neighborhood graphs were created around each cell in the ROI by including all cells connected to it by up to four edges (Fig. 1b). The average neighborhood graph contained 42 cells. The model was trained to predict whether a given neighborhood was associated with a positive or negative effect on overall survival (OS) (Fig. 1b), where the prediction, pi, for a spatial neighborhood graph, gi, can be interpreted as the negative log hazard ratio of the neighborhood. Patient-level survival predictions were generated by averaging all neighborhood predictions in all available samples for a patient (Supplementary Fig. 3).
The spatial neighborhood GNN demonstrated strong predictive performance at the patient level. On the hold-out test dataset, the concordance index (c-index) between patient-level predictions and OS was 0.82 (95% CI: 0.75–0.87), indicating a high level of agreement between the model’s survival predictions and the actual survival outcomes. To assess model variability, we additionally trained four models using different random patient-level splits of the development data (training + validation) while keeping the same hold-out test set to avoid data leakage (see Methods). All five models achieved comparable test performance (c-indices = 0.80–0.82), confirming the robustness of the reported results. Kaplan-Meier curves (Fig. 3a) illustrate the survival stratification for high versus low patient-level GNN predictions.
We compared the GNN’s survival prediction performance on the hold-out test dataset to two baselines. The first was CD8+ TIL density quantified from the same ROIs, which achieved a c-index of 0.75 (95% CI: 0.66–0.84, p = 0.069 compared to spatial neighborhood GNN)11. The second baseline was a spatial arrangement-only GNN to evaluate the relationship between the spatial organization of cells and patient survival. This model incorporated only spatial arrangement information of cells, excluding all marker expression data, and was trained following the same procedures as those for the full spatial neighborhood GNN. The spatial arrangement-only GNN model achieved a c-index of 0.715 (0.617–0.804, p = 0.0025 compared to the spatial neighborhood GNN). Thus, the spatial neighborhood GNN, utilizing information about the spatial organization of cells and their marker expression patterns, offered promising performance in relation to these baselines (Supplementary Table 2).
We also tested whether the GNN predictions provide prognostic information independent of known clinical factors by fitting a multivariable Cox regression model including tumor stage and immunotherapy status. The GNN predictions remained significant (p = 0.006, Fig. 3b), suggesting it has identified complementary prognostic features of patient overall survival that are not captured by stage and immunotherapy status.

Associations between neighborhood composition and GNN prediction
To evaluate the biological and spatial features driving the spatial neighborhood GNN predictions, we analyzed the characteristics of neighborhoods associated with favorable or unfavorable survival predictions (Fig. 1c–e).
First, we assessed the relationship between the composition of a neighborhood and the GNN predictions (Fig. 1c). A prediction was generated for each neighborhood in the test dataset, where the GNN prediction corresponds to the negative log hazard ratio, such that higher values indicate a stronger positive association with patient survival. The neighborhoods were then ranked from most favorable to least favorable survival prediction and divided into 20 equal partitions, g1-g20 in Fig. 4a (84,468 subgraphs each). Each spatial neighborhood was treated as an independent entity, regardless of the patient sample from which it originated.
For a subset of cell types, a monotonic relationship was observed between the neighborhood enrichment and survival predictions across the partitions (Fig. 4a). Neighborhoods with higher enrichment of CD8+ and PD-1+ immune cells were associated with more favorable survival predictions, exhibiting a pronounced positive correlation for PD-1+ CD8+ cells. Conversely, increasing tumor cell proportions correlated with unfavorable predictions. The relationship for PD-L1+ immune cells was nonlinear, showing average enrichment in the most unfavorable partitions, depletion in the most favorable partitions, and positive enrichment in between. FOXP3+ cells exhibited no apparent pattern, consistent with previous studies that reported variable associations between FOXP3+ cell density and survival outcomes in NSCLC and other cancer types11,12.
At the extremes of the prediction spectrum (Fig. 4b, c, magenta and green boxes in Fig. 4a), enrichment patterns deviated slightly from these broader trends, where subdividing these extreme partitions, g1 and g20 in Fig. 4a, into 20 subgroups (4,223 neighborhoods each) revealed additional insights (Fig. 4b, c). In the most favorable partition g1 (Fig. 4b), CD8+ cell enrichment continued to associate with high survival predictions across all subgroups (b1-b20). However, a more nuanced trend was observed for tumor cells. Most of the subgroups indicated low tumor cell enrichment, but the most favorable subgroup (b1) indicated an enrichment of both PD-L1- and PD-L1+ tumor cells, suggesting a positive interaction between CD8+ and tumor cells. Representative neighborhoods from this subgroup (Fig. 4d) featured diverse immune infiltrates, including PD-1+ and PD-1- CD8+ cells, alongside PD-L1+ tumor cells.
Some nuanced patterns were also observed when subdividing the neighborhood group with the least favorable predictions (g20, Fig. 4c). Most of these subgroups exhibited an enrichment of PD-L1+ tumor cells and a depletion of most immune populations. However, at the extreme subgroup w20, tumor cells were depleted, and there was a slight enrichment of FOXP3+ cells. Representative neighborhoods of this subgroup w20 (Fig. 4e) were populated by cells negative for each marker, with few CD8+ cells, scattered PD-L1- tumor cells, and FOXP3+ cells.
Notably, each patient sample contained spatial neighborhoods encompassing a range of GNN predictions as illustrated in Supplementary Tables 3a–c and Supplementary Fig. 4, meaning that both favorable and unfavorable neighborhoods were present within individual samples. As a result, the partitions in Fig. 4a–c contained neighborhoods from a broad distribution of patients in the test dataset, rather than being dominated by a small subset of cases. This highlights the heterogeneity of spatial TME patterns with individual tumors and reinforces the idea that survival-relevant spatial interactions occur across diverse patient samples rather than being confined to a select few.

Latent space clustering – Spatial and cellular contexts of CD8+ cells influence survival predictions
The enrichment of CD8+ cells, both PD-1+ and PD-1-, in neighborhoods demonstrated the strongest trend with favorable survival predictions. However, non-linear behavior at the extremes of the prediction spectrum suggested that the GNN survival predictions were additionally influenced by the co-localization of CD8+ cells with other cell types, such as tumor cells, and potentially their spatial arrangement.
To further investigate these spatial, non-linear effects, we analyzed the latent space representation of neighborhoods within the GNN (Fig. 1d). Using k-means clustering, we grouped neighborhoods into seven distinct clusters based on latent space embeddings, revealing substantial variation in cell type enrichment and survival predictions across clusters (Fig. 5a). As further described in the methods, clustering was performed on training data embeddings and applied to all neighborhoods in the test dataset to characterize spatial neighborhood states and their relationship with survival. The optimal number of clusters was determined using the elbow method and the Davies-Bouldin index.
Clusters 1 and 5 exemplified immune-enriched environments but with distinct characteristics. Cluster 1 was characterized by an enrichment of PD-L1+ immune and tumor cells alongside CD8+ and PD-1+ cells. Predictions for this cluster were neutral; survival predictions were neither strongly favorable nor unfavorable, reflecting a potential balancing effect between immune activation (CD8+ and PD-1+ cells) and suppression (PD-L1+ immune and tumor cells). Conversely, cluster 5, enriched in CD8+ and PD-1+ cells with depletion of tumor cells, particularly PD-L1+ tumor cells, was associated with the most favorable survival predictions and represented immune-activated neighborhoods. These findings indicate that higher CD8+ cell enrichment alone does not guarantee positive predictions; specific cell type combinations appeared critical.
Clusters with fewer CD8+ cells were linked to unfavorable survival predictions. Cluster 4 represented immune-desert neighborhoods dominated by PD-L1- tumor cells, while cluster 6 reflected immune evasion, characterized by PD-L1+ tumor and immune cells, with a higher proportion of FOXP3+ cells than cluster 4. Cluster 7, notable for a high enrichment of PD-L1+ tumor cells and depletion of PD-L1- tumor cells, CD8+ cells, and PD-1+ cells, was also associated with unfavorable predictions. Going beyond the univariate trends suggested in the prior analysis (Fig. 4), the clustering analysis thus identified distinct neighborhood types characterized by certain combinations of cells, which were associated with certain survival predictions by the GNN model. Examples of neighborhoods for each cluster, highlighting their unique spatial and cellular characteristics, are illustrated in Fig. 5c.
To validate that the clusters were not only associated with the GNN’s predictions but also the ground truth patient outcomes, we computed the proportion of neighborhoods assigned to each cluster per patient sample and their relationship to OS using univariate Cox regression. Higher proportions of clusters 2 (logHR = -4.3, 95% CI: -7.4 to -1.2, p = 0.007) and 5 (logHR = -10.1, 95% CI: -14.9 to -5.4, p = 3e-6), and lower proportions of clusters 4 (logHR = 2.3, 95% CI: 0.5 to 4.1, p = 0.01), 6 (logHR = 4.6, 95% CI: 2.6 to 6.6, p = 5e-6), and 7 (logHR = 2.4, 95% CI: 0.4 to 4.5, p = 0.02) were significantly associated with better OS after correcting for multiple comparisons (Fig. 5b). This confirmed that the proportions of favorable and unfavorable clusters within a patient sample were predictive of patient survival.

Effect of targeted manipulations on GNN neighborhood predictions
The latent space analysis of the GNN provided insights into how the co-localization of different cell types influenced model predictions and patient survival. Next, we aimed to explicitly explore how the spatial arrangement of cells within neighborhoods affected the GNN predictions as a way to interrogate the model’s learned decision rules. We employed a spatial manipulation framework in which the location of specific cell types within a neighborhood was altered, followed by an assessment of the corresponding change in the GNN prediction. This approach enabled the evaluation of rare but potentially critical spatial arrangements, providing a controlled framework for uncovering interactions that contribute to the predictive power of the model (Fig. 1e). All experiments in this section were performed by manipulating spatial neighborhoods selected from the test dataset.
While greater densities of CD8+ TILs have been often linked to improved patient survival11–16, the interaction between CD8+ TILs, their PD-1 expression status, and tumor PD-L1 status is complex and context-dependent. Prior studies have indicated that PD-1+ CD8+ cells may exhibit differential cytotoxicity depending on tumor PD-L1 expression29–31. To better understand these dynamics, we investigated how introducing CD8+ cells into tumor microenvironments affects GNN survival predictions, focusing on the impact of PD-1 and PD-L1 status (Fig. 6a). To isolate the effect of tumor cells on CD8+ cells, we performed this experiment in tumor-cell-only neighborhoods from the test dataset and quantified the effects for neighborhoods with only PD-L1- tumor cells (n = 11,675).
As expected, converting a randomly selected tumor cell into a CD8+ cell increased the GNN predictions, corresponding to predictions of longer survival, irrespective of the tumor’s PD-L1 status or the CD8+ cell’s PD-1 status (p < 0.0001, effect sizes >0.86, average change in log HR 0.15 and 0.19 for PD-1 negative and positive CD8+ cells respectively, Fig. 6b, c). The positive impact of CD8+ cells was more pronounced in PD-L1- tumors than in PD-L1+ tumors (p < 0.0001, effect sizes: 0.69 for PD-1+ CD8+ cells and 0.53 for PD-1- CD8+ cells, Fig. 6a). CD8+ cells that were also positive for PD-1 demonstrated a greater positive effect than their PD-1- counterparts, for both PD-L1- and PD-L1+ tumor cells (p < 0.0001, effect sizes: 0.86 in PD-L1- tumors and 0.83 in PD-L1+ tumors, Fig. 6b, c). The additive effect of PD-1 expression on CD8+ cells was significantly larger in PD-L1- tumors (p < 0.0001, effect size: 0.71), indicative of an expected inhibitory role of tumor PD-L1 on PD-1+ CD8+ cells (Supplementary Fig. 5b).
Prior studies have linked closer proximity between CD8+ and tumor cells to improved survival outcomes32–34. However, these analyses defined proximity only by spatial distance and as an average across a patient sample. Through our controlled manipulation approach, we assessed whether contact between CD8+ cells and tumor cells further influenced GNN predictions. Given the potential biological significance of the tumor-stromal interface and CD8+ cell infiltration, we specifically focused this experiment on neighborhoods containing a mix of tumor cells and unspecified cells, which often represent stromal cells upon visual inspection.
Starting from spatial neighborhoods with balanced proportions of PD-L1- tumor cells and unspecified cells (tumor/unspecified ratio: 0.96–1.04; n = 6668 neighborhoods), we manipulated unspecified cells into PD-1- CD8+ cells in two scenarios as illustrated in Fig. 6d: 1) the resulting CD8+ cell was directly connected to at least one tumor cell (M2 in Fig. 6d), and 2) the CD8+ cell was not connected to any tumor cell (M1 in Fig. 6d).
Adding a CD8+ cell increased predictions in both scenarios (Fig. 6e); however, the increase was significantly larger when the CD8+ cell was directly connected to a tumor cell (p < 0.0001, effect size: 0.58, average difference in log HR 0.039), indicating a greater positive effect on survival. Thus, beyond neighborhood composition, these results highlight the role of spatial organization in modulating the effect of CD8+ cells on survival predictions.
PD-L1-expressing immune cells, particularly tumor-associated macrophages (TAMs), are key mediators of immune suppression within the TME35,36. In NSCLC, high PD-L1 expression on TAMs has been linked to reduced CD8+ T cell infiltration and function, thereby promoting tumor progression37. Through interactions with CD8+ TILs, these cells suppress antitumor immunity37,38, a mechanism that spatial proximity may exacerbate. For instance, in pancreatic ductal adenocarcinoma39, the close spatial proximity between PD-L1+ TAMs and CD8+ TILs is associated with poor prognosis.
To evaluate the impact of PD-L1+ immune cells on survival predictions, we first analyzed neighborhoods from the test dataset containing PD-L1- tumor cells, PD-1- CD8+ cells, PD-L1+ immune cells, and, optionally, unspecified cells (n = 47,664). These neighborhoods were selected to represent typical tumor neighborhoods with CD8+ TILs. Removing PD-L1+ immune cells from these spatial neighborhoods by converting them into unspecified cells resulted in a modest but significant positive effect on GNN predictions (p < 0.0001, effect size: 0.44, Supplementary Fig. 6a, b). Removing more than two PD-L1+ cells had a significantly greater positive impact than removing fewer (p < 0.0001, effect size: 0.44, Fig. 6f), suggesting an additive impact of PD-L1+ immune cells on immune suppression. However, despite the overall positive effect of PD-L1+ cell removal on patient survival predictions, there was considerable variability in the magnitude of the response.
Interestingly, the effect of removing PD-L1+ immune cells did not vary based on the number of CD8+ cells present in the neighborhood (Supplementary Fig. 6c). This led us to hypothesize that the spatial arrangement between these cell types, rather than their abundance, might modulate this effect. Consequently, we limited the analysis to neighborhoods with equal numbers of PD-1- CD8+ cells and PD-L1+ immune cells (n = 13,078), ensuring the observed effects were not driven by differences in cell counts. We then repositioned PD-L1+ cells to establish direct contact with CD8+ cells while maintaining the overall composition of the neighborhood (Fig. 6g). This manipulation significantly reduced survival predictions (effect size: 0.46, average change in log HR: -0.013, Fig. 6h). The magnitude of this effect increased with the number of connections between PD-L1+ and CD8+ cells, with log HR changes of -0.010, -0.024, and -0.039 for one, two, and three or more interactions, respectively. This effect was independent of the PD-1 status of CD8+ cells, as the difference in predictions between PD-1+ and PD-1- CD8+ cells was negligible (effect size: 0.19, Supplementary Fig. 6d). These findings suggest that, within the GNN, spatial interactions between CD8+ cells and PD-L1+ immune cells are used as signals consistent with immunosuppression and influencing patient outcomes, as quantified by changes in the GNN predictions, which was trained in a purely data-driven fashion.
The role of FOXP3+ cells in the NSCLC TME remains controversial, with mixed reports on their impact on patient outcomes11,40,41. Additionally, the influence of spatial interactions between FOXP3+ cells and CD8+ cells remains unclear. For instance, in colorectal carcinoma, increased CD8+ infiltration among FOXP3+ cells has been hypothesized to mitigate the general adverse effects of FOXP3+ cells20; however, it is still unclear whether direct cell-cell contact is required for FOXP3+ cells to exert cytolytic effects on CD8+ cells42. Hence, we investigated the spatial dynamics of FOXP3+ and CD8+ cells in NSCLC to address these gaps using our in-silico GNN framework.
To assess both spatial and compositional effects between FOXP3+ and CD8+ cells, we analyzed spatial neighborhoods from the test dataset which contained PD-1- CD8+ and FOXP3+ cells, excluding neighborhoods with PD-L1+ cells (n = 42,777). Predictions were significantly more favorable towards longer survival for neighborhoods where CD8+ cells outnumbered FOXP3+ cells (CD8/FOXP3 ratio >1) compared to neighborhoods with equal ratios (CD8/FOXP3 ratio = 1) or a predominance of FOXP3+ cells (ratio <1) (Fig. 6i). The differences across all three groups were statistically significant (Dunn test, all p-values < 0.0001). In neighborhoods with equal proportions of CD8+ and FOXP3+ cells, predictions were not influenced by the number of each cell type (p = 0.18, Kruskal-Wallis test, Supplementary Fig. 7).
To explore the spatial effects of FOXP3+ cells, we manipulated neighborhoods with equal CD8/FOXP3 ratios (n = 17,468) by repositioning FOXP3+ cells to ensure contact with CD8+ cells (Fig. 6j). This direct interaction significantly reduced GNN predictions (p < 0.0001, effect size: 0.35; Fig. 6k). Moreover, the magnitude of this effect increased with the number of direct interactions between FOXP3+ and CD8+ cells, mirroring the trend observed for PD-L1+ immune cells and CD8+ cells. The average change in log HR was -0.0091, -0.021, and -0.036 for one, two, and three or more connections, respectively, suggesting that, within the GNN model, the number of FOXP3+ and CD8+ cells in contact is interpreted as increasingly unfavorable contexts, consistent with a cumulative suppressive effect in the TME.

Discussion

Discussion
This study demonstrates the power of GNNs to uncover critical spatial relationships within the TME that are associated with survival in NSCLC. We leverage a prospectively collected, unselected cohort of over 500 patients with diverse clinical outcomes and a validated mIF assay11,12,19, which provides spatially resolved, single-cell data, allowing for a detailed and real-world assessment of cell interactions. Unlike conventional approaches that rely on global density or simple distance metrics, we integrate cell marker expression and spatial arrangement within an entirely data-driven framework to investigate how local interactions associate with patient survival. Through a combination of neighborhood composition analysis, latent-space clustering, and in silico manipulation experiments (Fig. 1c–e), our findings provide new insights into the role of TME architecture in shaping the overall survival of patients with NSCLC. Together, these analyses highlight the crucial interplay between cellular diversity, specific cell type combinations, and spatial relationships within the TME, demonstrating the GNN’s capacity to capture biologically meaningful patterns that influence patient outcomes.
Our results highlight the added prognostic value of spatially resolved immune interactions. In a multivariable Cox model accounting for tumor stage and immunotherapy status, the GNN-based predictions remained significantly prognostic (Fig. 3b). This underscores the robustness of the GNN predictions despite the heterogeneity of treatments, tumor stage, and biopsy sites, suggesting that spatial TME features provide broad prognostic power. Each sample exhibited a mix of high- and low-prediction neighborhoods, reflecting intra-tumoral heterogeneity. The average GNN prediction per sample, reflecting the balance of favorable and unfavorable local contexts, was strongly associated with overall survival.
Previous studies have shown that high CD8+ TIL density is correlated with better survival in NSCLC11,12,16,43. Our study extends this finding by demonstrating that the spatial interactions of CD8+ T cells with tumor cells and other immune cell populations are key in modifying the effectiveness of CD8+ cells. Density-based metrics or pairwise proximity analyses (e.g., G-cross function) may fail to capture crucial but rare cell-cell interactions because they average over entire tumor regions, losing the granularity of localized cellular environments44,45. Here, the GNN serves not merely as a black-box predictor of patient survival but as a surrogate model for analyzing spatial neighborhood characteristics and their effect on survival28. This approach enables a more nuanced understanding of how local interactions impact survival in NSCLC.
By evaluating neighborhood composition and associated GNN predictions, we confirmed broader trends observed in NSCLC (Fig. 4). Overall, neighborhoods enriched in CD8+ cells were associated with more favorable survival. Enrichment of PD-L1+ tumor cells showed a more substantial adverse association with survival predictions than PD-L1- tumor cells. This aligns with prior reports showing that patients with high PD-L1 expression generally have shorter overall survival46,47, potentially due to a adverse effect of PD-L1 expression against cytotoxic T-cell responses48. Although patients with a high PD-L1 TPS were more likely to receive immunotherapy in our cohort, the GNN’s predictions were prognostically independent of immunotherapy treatment status in multivariable modeling, indicating that the local immune context remains a critical determinant of survival regardless of treatment exposure.
Our analysis also revealed that neighborhoods depleted of PD-1+ CD8- cells were associated with unfavorable survival predictions (Fig. 4b). While this population is not commonly studied in NSCLC, it likely includes activated CD4+ T cells, tumor-associated macrophages (TAMs), dendritic cells, and other immunoregulatory subsets. Additionally, FOXP3+ cell density did not show clear trends in survival predictions, consistent with prior reports describing mixed associations between Tregs and patient outcomes11,49,50. However, deeper analysis revealed that their impact on survival predictions was significantly modified by their spatial arrangement to CD8 cells (Fig. 6j/k).
Latent-space clustering identified distinct TME states with prognostic relevance (Fig. 5). Some of these states recapitulated overall trends observed between neighborhood composition and survival predictions. For example, neighborhoods enriched in CD8+ and PD-1+ cells but with only a few PD-L1+ tumor cells (cluster 5) were associated with the most favorable survival predictions. However, cluster 1, despite showing the strongest CD8+ cell enrichment, was also enriched in PD-L1+ cells, suggesting an equilibrium where PD-L1-mediated immune evasion counterbalances cytotoxic immune pressure51. Conversely, immune-desert clusters dominated by PD-L1- tumor cells (cluster 4) and immune-suppressive clusters with high PD-L1+ immune and tumor cell content (cluster 6) correlated with poor outcomes. These findings emphasize that the conventional “hot” versus “cold” tumor classification52 may be insufficient, as immune-infiltrated tumors can exhibit either productive immune activation (high CD8+ , PD-1+ , low PD-L1) or immune exhaustion (high CD8+ , PD-1+ , high PD-L1), leading to differential survival outcomes. Notably, these distinct TME states mirror observations from a recent pan-cancer mIF study by Lindsay et al., in which patterns of immune enrichment and immune evasion, driven by both cell densities and tumor–immune proximities, emerged as key axes of variation19, paralleling the relationships identified by our GNN-based analyses.
Our manipulation experiments demonstrate the utility of leveraging the GNN as a surrogate model of the TME to probe how it encodes different hypotheses about cell-cell interactions. Through targeted manipulations to spatial neighborhoods, we found that the positive impact of CD8+ cells was limited by the presence of PD-L1 on tumor cells, consistent with reports that in PD-L1+ tumors, more CD8+ cells are required to improve prognosis43 and the idea that PD-L1 signaling drives T-cell exhaustion53.
Additionally, inducing direct contact between PD-L1+ non-tumor cells, likely TAMs or cancer-associated fibroblasts, and CD8+ cells led to a shift towards less favorable survival predictions, reinforcing the role of PD-L1 in suppressing cytotoxicity. Interestingly, this effect was independent of PD-1 expression on CD8+ cells, raising questions about mechanisms other than classical PD-1/PD-L1 signaling that may contribute to immune suppression in these contexts.
While PD-1 expression on CD8 + T cells has been linked to both activation and exhaustion depending on context54,55, our findings raise the possibility that PD-L1+ immune cells exert suppressive effects even in the absence of detectable PD-1. This supports emerging evidence that PD-L1 expression on non-tumor cells can modulate therapy response29,56.
Our results also underscore the role of physical proximity in effective tumor suppression. CD8+ cells were predicted to be most effective when in direct contact with tumor cells, consistent with previous findings linking shorter distances between CD8+ and tumor cells to better survival57. Additionally, the CD8+ cell function was significantly influenced by interactions with FOXP3+ cells, i.e., Tregs. Contact between FOXP3+ and CD8+ cells decreased survival predictions, consistent with prior reports that direct Treg interactions suppress T-cell cytotoxicity42. This observation may help to reconcile conflicting reports on FOXP3+ cells and their prognostic impact in NSCLC.
These in-silico perturbations should be interpreted as validation and interrogation of the GNN’s learned decision rules, generating biologically plausible hypotheses and inspiring additional experimental investigation rather than providing definitive mechanistic evidence.
Our findings highlight the need for next-generation spatial biomarkers integrating cell composition and spatial arrangement within the TME. The independent prognostic value of GNN predictions, even when adjusted for tumor stage, suggests that local immune interactions can refine patient stratification beyond traditional clinical markers. Furthermore, these findings could potentially inform therapeutic strategies. For instance, interactions such as CD8+ to PD-L1+ immune contact or CD8+ to FOXP3+ contact could serve as actionable targets to optimize immunotherapy. These results provide a rationale for combination therapies, such as Treg depletion or macrophage reprogramming58,59, alongside PD-1/PD-L1 blockade to restore T-cell function and maximize tumor control.
Our study is subject to several limitations. While our cohort is large for an mIF study, it may be underpowered to detect rare interactions, and our restricted marker panel limits the interpretation of other immune populations (e.g., B cells, CD4+ cells, macrophages). Expanding to additional markers would provide a more comprehensive view of the TME. Additionally, survival in our cohort may be influenced by patients’ mutational status and treatment history, which we did not account for in our analyses. Future studies should incorporate these factors to disentangle the effects of spatial TME features from underlying genomic and therapeutic influences. Furthermore, static tissue samples prevent us from capturing neighborhoods’ dynamic evolution over time or under therapy. Longitudinal datasets will be essential to assess how TME interactions shift pre-and post-treatment. Finally, it will be important in future work to validate these findings in wholly independent cohorts.
In conclusion, this study presents data-driven evidence that spatially localized interactions among CD8+ T cells, PD-L1+ cells, and FOXP3+ Tregs critically modulate survival in NSCLC. Our GNN-based analysis framework refines immune stratification beyond density metrics by integrating spatial arrangement with neighborhood-level interactions, while facilitating targeted hypothesis testing. These findings reinforce the importance of spatially nuanced biomarkers in patient outcomes and provide a blueprint for extending this methodology to other cancers. Harnessing these local TME signatures may improve patient stratification and guide precision immunotherapy strategies.

Methods

Methods

Ethics approval and assay validation
This study is a secondary analysis of data from the ImmunoProfile cohort, a prospectively collected dataset at the Dana-Farber Cancer Institute (DFCI) from 2018 to 2021. All participants provided written informed consent under protocols DFCI 11-104, 17-000 and 20-000, approved by the Dana-Farber/Harvard Cancer Center Institutional Review Board in accordance with the Declaration of Helsinki. The mIF assay and end-to-end analytic workflow were developed and fully validated within the CLIA-certified molecular diagnostics laboratory at Brigham and Women’s Hospital (CLIA license #22D2156930) in accordance with Clinical Laboratory Improvement Amendments (CLIA) regulations.

Clinical cohort and tissue processing
Formalin-fixed, paraffin-embedded (FFPE) tumor sections from 506 prospectively enrolled NSCLC patients in the ImmunoProfile study were analyzed. Each section underwent a five-plex immunofluorescence assay targeting Cytokeratin, CD8, FOXP3, PD-L1, and PD-1; nuclei were counterstained with DAPI to enable automated single-cell segmentation.
Regions of interest (ROIs) within tumor tissue were defined by trained pathologists at 20x magnification following an established protocol11. ROIs were selected to capture immunological “hotspot” regions rather than obtained through automated tiling, resulting in an average of approximately four ROIs per sample (see Supplementary Fig. 4). Each ROI measured approximately 1809 ± 150.1 × 1384 ± 46.2 pixels (mean ± standard deviation).
Cell segmentation was performed using the inForm® software’s machine-learning–based segmentation pipeline. The quantification of biomarker expression was performed with the inForm® software (Akoya Biosciences) and validated machine learning pipelines, including iterative quality control and manual review11. Only ROIs with at least 90% of their cells in the inner tumor region were included. Spatial coordinates and binary marker expression data for individual cells were automatically derived to enable downstream graph-based modeling.
For analysis purposes, we defined eight different cell types based on the co-occurrence of the five mIF markers: PD-L1- tumor (CK+), PD-L1+ tumor (CK+), PD-1- CD8 + , PD-1+ CD8 + , PD-L1+ immune cells (CK-), PD-1+ immune cells (CD8-), FOXP3+ and unspecified (negative for all five markers). See Supplementary Table 4 for an overview.

Graph construction and preprocessing
For each ROI, we first constructed a full graph using Delaunay triangulation, where nodes represented individual cells with binary expression features for the five markers and edges connected neighboring cells. Edge features included the Euclidean distance between connected cells calculated from the inForm-derived cell coordinates (20x magnification, 0.496 µm/pixel). We then defined subgraphs or “spatial neighborhood graphs” from the ROI graph by extracting all nodes within four hops of a given center cell. Spatial neighborhoods were generated for all cells within an ROI whenever computationally feasible. Neighborhoods were further constrained to a maximum diameter of 200 pixels ( ~ 99.2 µm). An average spatial neighborhood contained 42 cells.

Dataset handling and splitting
The dataset included 538 tumor samples from 506 patients after quality control (Supplementary Fig. 1). The dataset was split into training (305 patients, 60%), validation (72 patients, 15%), and test (129 patients, 25%) sets. To ensure a balanced distribution of patient outcomes across the datasets, we stratified the dataset split based on survival status, maintaining the ratio of deceased to censored patients across all subsets (Supplementary Table 1 and Supplementary Fig. 2). Samples from the same patient were treated as independent during training, but only one sample per patient was used for evaluation following the same selection criteria as used by Alessi et al. and Lindsay et al.11,19. Detailed data distribution and patient characteristics are provided in Supplementary Table 1.
To evaluate the stability of model training, we performed five random patient-level splits of the development dataset (training + validation) while keeping the same test set to avoid information leakage60. Each split was used to train an independent spatial neighborhood GNN following identical hyperparameters and early-stopping criteria. The model trained using the first (original) cross-validation split was used for all downstream interpretability analyses. This single-model strategy ensured reproducible latent-space embeddings and neighborhood assignments across analyses.

Graph neural network development
We trained a GNN following the SPACE-GM architecture proposed by Wu et al.28. The network consisted of three graph isomorphism layers, a 512-dimensional embedding layer with max pooling, and a fully connected prediction head (512, 128, 32 neurons) with a 0.25 dropout rate. The model was trained using Cox proportional hazard loss to predict the relationship between spatial neighborhoods and overall survival (OS). The loss was computed at the subgraph level, while model performance was monitored at the ROI level by averaging predictions across all subgraphs within an ROI. During training, we tracked the c-index on the validation dataset at the ROI level and selected the model checkpoint with the highest validation c-index. Training utilized equal sampling across the cell type of a neighborhoods center cell to ensure balanced representation of neighborhood characteristics. Clinical covariates, e.g. tumor stage, immunotherapy status, or histopathology, were not used as model inputs. Hyperparameters and training details are provided in Supplementary Methods.

Model evaluation
ROI-level predictions were generated by averaging predictions from all spatial neighborhoods within an ROI. Predictions were further averaged for samples with multiple ROIs to obtain a single patient-level prediction. The c-index was used as the primary evaluation metric. Model independence from clinicopathologic factors was assessed using multivariable Cox regression. If not mentioned otherwise, all performance metrics and analyses were determined exclusively using the hold-out test dataset.
We also developed two baseline models for comparison:
1. CD8+ Density Cox Regression: Univariate Cox regression model using the spatial density of CD8+ cells across the ROIs for a given sample as the sole predictor11.
2. Spatial Arrangement-Only GNN: A GNN using only spatial arrangement features (edge features) without biomarker expression data (node features).
Performance comparisons between models were performed using matched bootstrapping with 2000 samples. See Supplementary Material for details on the matched bootstrapping process.

Subgraph composition analysis
To investigate the relationship between subgraph composition and GNN predictions, we sorted the spatial neighborhoods from the test dataset based on their associated model predictions. All spatial neighborhoods from the test dataset were first ranked by their GNN predictions. We then calculated the proportions of the eight defined cell types: PD-L1- tumor (CK+), PD-L1+ tumor (CK+), PD-1- CD8 + , PD-1+ CD8 + , FOXP3+ , PD-L1+ immune cells (CK-), PD-1+ (CD8-), and unspecified for each neighborhood. Subsequently, neighborhoods were divided into equally sized 20 bins such that each bin represented a distinct segment of the prediction range, from the most favorable to the least favorable survival predictions. The first bin included subgraphs with predictions in the top 5th percentile, the second bin encompassed predictions between the 5th and 10th percentiles, and so on. For each bin, we computed cell type enrichment by comparing the observed proportions of cell types within neighborhoods to the overall distribution of cell types in the test dataset.
We calculate the composition of a cell type in a spatial neighborhood with a total number of cells as follows:
The composition of cell type in a partition consisting of spatial neighborhoods is then calculated based on the average composition of all spatial neighborhoods within p:
To obtain the enrichment of cell type in a partition , we normalize by , the global composition of cell type within the dataset :
This approach allowed us to identify monotonic or nonlinear relationships between subgraph composition and GNN predictions.

Neighborhood clustering
To interpret latent space features, we extracted 32-dimensional embeddings from the penultimate layer of the GNN for clustering. K-means clustering was applied to the embedding of 200,000 spatial subgraphs randomly sampled from the training dataset. The optimal number of clusters was determined using the elbow method and the Davies-Bouldin index. The clustering was then applied to the embeddings of all subgraphs in the test datasets for further analysis. Clusters were characterized by cell type enrichment (see Eqs. 1–3) and GNN prediction trends, providing insights into the types of spatial neighborhoods present in the NSCLC TME and their associations with patient survival.

Neighborhood manipulation experiments
To assess how specific spatial features influence GNN predictions, we conducted targeted manipulation experiments on a subset of spatial neighborhoods from the test dataset and quantified the effect of a manipulation on the GNN predictions (Fig. 1e). These experiments were intended as model-interpretability analyses that probe the GNN’s learned decision rules, rather than experimental proof of new biological mechanisms. Neighborhoods were selected based on predefined criteria, such as the presence or absence of specific cell types (e.g., PD-1- CD8+ cells or FOXP3+ cells).
Manipulations involved altering the expression features of specific nodes or the spatial arrangement of cells within neighborhoods. For example:
Expression manipulation: We modified node features to simulate biological changes, such as converting a PD-1- CD8+ cell into a PD-1+ CD8+ cell by changing the PD-1 expression feature from 0 to 1 (Fig. 6a). This type of manipulation can be either applied to a single randomly selected cell of a desired cell type (e.g., turning a randomly selected PD-L1- tumor cell into a PD-1- CD8+ cell, Fig. 6a, M1) or to all cells of the specified cell type (e.g., turning all PD-L1- tumor cells into PD-L1+ tumor cells, as in Fig. 6a).
Spatial manipulation: We altered cell positions by selecting two nodes and exchanging their expression features, effectively simulating changes in spatial relationships while preserving neighborhood composition (Fig. 6g and j).
For each manipulation, we computed the difference in GNN predictions between the original and manipulated spatial neighborhood graphs. This difference served as a measure of the effect of the manipulation, providing insights into how specific cell types and their spatial arrangement modulate the GNN’s survival predictions.

Statistical analysis
Statistical analyses included Kaplan-Meier survival estimates, Cox proportional hazards models, and the log-rank test for survival comparisons. Differences between the distributions of continuous variables were assessed using the Student’s t-test or Wilcoxon rank-sum test, as appropriate and reported alongside the test result, with effect sizes quantified using Cohen’s d or point-biserial correlation. Multiple comparisons were adjusted using the Benjamini-Hochberg procedure. All analyses were conducted in Python (version 3.8.10) with the lifelines (version 0.27.8) and scipy (version 1.10.1) packages.
Matched bootstrapping was used for performance comparison between the different survival prediction models (spatial neighborhood GNN and baselines). We sampled 2000 bootstrap samples from the test dataset and computed the performance of the three baseline models and the spatial neighborhood GNN on the identical bootstrap samples. Subsequently, we computed the confidence intervals over the performance distribution of each method on the bootstrap samples.

Supplementary information

출처: PubMed Central (JATS). 라이선스는 원 publisher 정책을 따릅니다 — 인용 시 원문을 표기해 주세요.