Decades of omics in lung cancer research: a bibliometric analysis and visualization from 2004 to 2024.
APA
Wang X, Dong H, et al. (2026). Decades of omics in lung cancer research: a bibliometric analysis and visualization from 2004 to 2024. Journal of thoracic disease, 18(2), 143. https://doi.org/10.21037/jtd-2025-aw-2245
MLA
Wang X, et al. "Decades of omics in lung cancer research: a bibliometric analysis and visualization from 2004 to 2024." Journal of thoracic disease, vol. 18, no. 2, 2026, pp. 143.
PMID
41816375
Abstract
[BACKGROUND] Omics, encompassing genomics, transcriptomics, proteomics and metabolomics, plays a pivotal role in elucidating the molecular mechanisms underlying lung cancer and advancing precision oncology. While existing studies have primarily focused on the technical development and clinical efficacy of omics applications in cancer, there remains a notable gap in comprehensive assessments of the global research landscape. At different stages of lung cancer initiation, progression, and metastasis, genomics and transcriptomics predominantly reveal oncogenic alterations and dysregulated signaling networks, whereas proteomics and metabolomics capture functional protein dynamics and metabolic reprogramming that drive tumor growth and metastatic adaptation. Importantly, the integration of multi-omics data enables a systematic understanding of the crosstalk between genetic alterations, transcriptional regulation, protein expression, and metabolic remodeling throughout lung cancer evolution. This bibliometric analysis study aims to systematically evaluate scientific output, research trends and hotspots in omics-related lung cancer research.
[METHODS] Relevant publications were retrieved from the Web of Science Core Collection (WoSCC) from January 1, 2004 to April 27, 2024. Bibliometric analyses and knowledge domain visualizations were conducted using VOSviewer (v1.6.20), CiteSpace (v6.3), R (v4.3.3), and Origin (2024).
[RESULTS] A total of 19,087 publications were included, demonstrating sustained growth over two decades (2004-2024). China contributed the largest volume of publications, whereas the USA showed higher citation impact and stronger influence in collaboration networks. Keyword co-occurrence and burst analyses illustrated that "expression", "lung cancer", "gene expression", "tumor microenvironment", "mutation" and "immunotherapy" are dominant and emerging themes. These findings indicate a clear shift from single-omics approaches and gene-centric investigations toward integrative multi-omics frameworks, with increasing emphasis on the tumor microenvironment (TME) and immunotherapy. The burst analysis of keywords also highlights the rising prominence of artificial intelligence (AI) and machine learning (ML), which have emerged as rapidly growing methodological backbones in recent years.
[CONCLUSIONS] Research on omics in lung cancer has rapidly evolved toward integrative, TME-focused and immunotherapy-oriented paradigms, with AI/ML serving as an enabling analytical infrastructure. This study underscores the critical role that omics plays in facilitating early detection, guiding personalized therapeutic strategies, and improving prognostic accuracy. The findings suggest that enhancing cross-disciplinary collaboration and accelerating the clinical translation of multi-omics data may help overcome current challenges in precision oncology.
Introduction
According to the Global Cancer Statistics 2022, there were approximately 20 million new cancer cases and 9.7 million cancer-related deaths worldwide in 2022. Lung cancer remains the leading cause of cancer mortality, accounting for an estimated 1.8 million deaths (18.7% of all cancer deaths) in 2022. Projections indicate that the annual number of new cancer cases could reach 35 million by 2050, highlighting an urgent need for improved prevention, early detection, and precision treatment strategies (1).
Omics—the comprehensive, high-throughput analysis of biological molecules within cells, tissues, or organisms—has become indispensable in cancer research, particularly in the context of lung cancer. This suite of technologies, including genomics, transcriptomics, proteomics, and metabolomics, enables system-level insights into tumorigenesis, progression, and therapeutic response.
At the molecular level, lung cancer development is a multistep and spatially heterogeneous process in which different omics layers capture complementary, stage-specific mechanisms. During tumor initiation, genomic alterations—such as driver mutations and copy-number variations—establish oncogenic signaling pathways. These genetic events are subsequently modulated by epigenetic regulation, transcriptional reprogramming, protein network rewiring, and metabolic adaptation, collectively shaping tumor phenotypes and influencing clinical outcomes. As tumors progress locally, proteomic analyses offer a more precise reflection of signaling pathway activity, including cell-cycle control, DNA damage response, and receptor-mediated signaling, that cannot be fully captured by DNA or RNA data alone. Concurrently, metabolomic profiling uncovers metabolic reprogramming, including dysregulated glycolysis and lipid metabolism, which supports rapid proliferation and enhances survival under microenvironmental stress (2).
During invasion and metastasis, these molecular layers become increasingly interconnected: genomic instability and transcriptional plasticity facilitate phenotypic adaptation; proteomic alterations drive cellular motility and invasive capacity; and metabolic adaptations enable survival in circulation and successful colonization at distant sites. Importantly, these tumor-intrinsic programs are dynamically modulated by interactions with the tumor microenvironment (TME), including immune infiltration, stromal remodeling, hypoxia, and cytokine signaling networks—all of which contribute to therapeutic resistance and metastatic progression. These stage-specific, interdependent molecular mechanisms highlight the necessity of integrated multi-omics approaches to comprehensively capture the continuum of lung cancer evolution (3).
Consequently, integrated multi-omics approaches are increasingly essential for reconstructing the causal continuum from genetic alterations to functional phenotypes and clinical outcomes, linking mutation-driven programs to signaling activity, metabolic fitness, TME interactions, and ultimately disease progression and treatment response. Recent advances in single-cell and spatial multi-omics further highlight the importance of characterizing TME crosstalk and intratumoral heterogeneity at high resolution (4,5). However, the rapid expansion of omics technologies, coupled with their fragmented application across research subfields, has produced a vast and complex body of literature, making it challenging to obtain a coherent understanding of the knowledge structure, collaborative networks, and evolving mechanistic priorities. Bibliometric analysis provides a systematic, quantitative framework for synthesizing large-scale scientific publications, evaluating research productivity and impact, and identifying emerging themes and cross-disciplinary trends (4,5).
In this study, we employ bibliometric and visualization methods to depict the current state of omics research in lung cancer, with a focus on publication trends, research hotspots, and emerging directions, thereby providing a structured reference for future omics-based precision medicine in lung cancer. We present this article in accordance with the BIBLIO reporting checklist (available at https://jtd.amegroups.com/article/view/10.21037/jtd-2025-aw-2245/rc).
Methods
Data collection and extraction
We selected the Web of Science Core Collection (WoSCC) as the primary data source for our bibliometric analysis of English-language publications. WoSCC was chosen for its standardized, high-quality citation indexing and comprehensive bibliographic metadata, which are compatible with mainstream bibliometric software and enable reproducible co-authorship, co-citation, and burst detection analyses. We did not merge additional databases such as Scopus, PubMed, and Embase, because differences in indexing criteria, document types, and citation formats may introduce substantial duplication and normalization bias, potentially compromising the comparability of citation-based indicators (6-9). The search was restricted to publications indexed between January 1, 2004, and April 27, 2024, encompassing all relevant studies on omics applications in lung cancer. The detailed retrieval strategy is provided in Table S1.
Statistical analysis
This study conducted descriptive and exploratory statistical analyses to summarize publication outputs, collaboration patterns, and research trends in omics-related lung cancer research. We employed a suite of established analytical tools, including VOSviewer 1.6.20, CiteSpace 6.3, R version 4.3.3, and Origin 2024, to process and analyze statistics from the WoSCC database. These tools are widely recognized in the field of bibliometric research and data visualization.
VOSviewer was used to construct and visualize bibliometric maps, offering intuitive graphical representations of publication and citation patterns, including output by country and institution. CiteSpace was applied to detect emerging trends and evolving research fronts through keyword burst analysis, enabling temporal tracking of thematic shifts.
RStudio facilitated comprehensive bibliometric analysis using packages like Bibliometrix and ggplot2, enabling the creation of co-citation, co-authorship, and co-occurrence networks, with the H-index primarily calculated using this tool. Finally, Origin provided advanced data processing and graphing functionalities, allowing for rigorous statistical analysis and the creation of detailed visualizations of bibliometric indicators.
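As a minimal sketch of the H-index mentioned above, the following Python function applies the standard definition (the largest h such that h papers each have at least h citations). It is not the authors' R/Bibliometrix workflow, and the citation counts in the example are hypothetical, chosen only for illustration.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the paper at this rank still has enough citations
        else:
            break  # every later paper has fewer citations than its rank
    return h


# Hypothetical citation counts for one author's papers (not data from the study).
example_citations = [25, 8, 5, 3, 3, 1]
print(h_index(example_citations))
```

Sorting in descending order lets the loop stop at the first paper whose citation count falls below its rank, which is why a single pass is enough.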
Results
General overview
A total of 20,336 records were initially retrieved from the WoSCC database. After filtering for document types (articles and reviews only) and language (English only), the dataset was refined to 19,087 publications. The data retrieval process is illustrated in Figure 1.
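The screening step described above can be sketched as a simple filter. This is an illustration only: the `Record` class, its field names, and the three sample records are hypothetical and do not reflect the actual WoSCC export; only the two totals (20,336 retrieved, 19,087 included) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical, simplified view of one bibliographic record.
    title: str
    doc_type: str
    language: str


KEEP_TYPES = {"Article", "Review"}


def screen(records):
    """Keep only English-language articles and reviews."""
    return [
        r for r in records
        if r.doc_type in KEEP_TYPES and r.language == "English"
    ]


# Hypothetical sample records (not from the study's dataset).
sample = [
    Record("Paper A", "Article", "English"),
    Record("Paper B", "Meeting Abstract", "English"),
    Record("Paper C", "Review", "German"),
]
kept = screen(sample)
print(len(kept), "of", len(sample), "sample records kept")

# Totals reported in the text: records removed by screening.
retrieved, included = 20336, 19087
excluded = retrieved - included
print(excluded, "records excluded")
```

With the reported totals, the filter removes 1,249 records, about 6% of those initially retrieved.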
This study focuses on these 19,087 publications related to omics in lung cancer, which were saved as “plain text” and exported as “full-text citation records” for analysis using VOSviewer and CiteSpace.
The annual publication distribution from 2004 to 2024, shown in Figure S1, reveals a consistent upward trajectory through 2022, followed by a modest decline in 2023. The number of publications increased from 136 in 2004 to 942 in 2016, driven by technological advancements and growing research interest.
From 2017 onward, growth accelerated markedly, peaking at 2,008 publications in 2021 and reaching a record high of 2,289 in 2022. Overall, the field of omics in lung cancer has experienced sustained expansion over the past two decades, as reflected in the cumulative publication count, which continues to rise despite a slight decrease in 2023.
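To put the reported counts on a common scale, one option is a compound annual growth rate. The study does not report this figure; the sketch below simply applies the usual formula to the two counts given in the text (136 publications in 2004 and 942 in 2016), and the function name `annual_growth_rate` is our own.

```python
def annual_growth_rate(start_count, end_count, years):
    """Compound annual growth rate between two counts separated by `years`."""
    if start_count <= 0 or years <= 0:
        raise ValueError("start_count and years must be positive")
    return (end_count / start_count) ** (1 / years) - 1


# Counts taken from the text: 136 publications in 2004, 942 in 2016.
growth_2004_2016 = annual_growth_rate(136, 942, 2016 - 2004)
print(f"{growth_2004_2016:.1%} per year")
```

For these two counts the rate works out to roughly 17 to 18 percent per year, which gives a baseline against which the faster growth after 2017 can be compared.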
Country distribution and geographical analysis
Figure 2A presents the top 10 most productive countries in omics research related to lung cancer from 2004 to 2024, ranked by publication count. China leads with 28,281 publications, followed closely by the USA with 26,557. Germany (4,508), Japan (4,026), and the UK (3,571) rank third through fifth, while Italy, South Korea, France, Spain, and Canada exhibit considerably lower publication and citation counts. Despite its high output, China's citation rate remains relatively low compared with that of the USA, which has accumulated 358,031 citations, indicating greater scholarly influence in the field. This citation-to-publication ratio suggests that while China demonstrates exceptional productivity, its academic impact may not be proportionally equivalent.
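The citation-to-publication ratio referred to above is a simple quotient. As a hedged illustration, the snippet below computes it for the USA from the two figures given in the text (358,031 citations and 26,557 publications). China's citation total is not stated in this section, so it is deliberately left out rather than guessed.

```python
def citations_per_publication(total_citations, total_publications):
    """Average number of citations per publication."""
    if total_publications <= 0:
        raise ValueError("total_publications must be positive")
    return total_citations / total_publications


# Figures for the USA as reported in the text.
usa_ratio = citations_per_publication(358031, 26557)
print(f"USA: {usa_ratio:.2f} citations per publication")
```

This yields about 13.5 citations per publication for the USA.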
Figure 2B illustrates the global distribution of research output in omics and lung cancer, with darker shades indicating higher publication counts, particularly concentrated in North America, Europe, and East Asia—especially China and the USA. Figure 2C further illustrates the temporal distribution of publications among the top five countries (China, USA, Germany, Japan, and the UK) from 2004 to 2024. The color gradient ranging from deep blue to red reflects increasing publication intensity over time. China’s research output has grown dramatically, rising from only 44 articles in 2004 to a peak of 4,965 in 2022, underscoring its increasing dominance in this domain since 2016. In contrast, the USA maintained consistently high productivity, peaking at 2,377 in 2021, followed by a recent decline. Germany, Japan, and the UK exhibited steady but moderate growth without significant surges.
Using VOSviewer, we analyzed international collaborations through co-authorship and citation networks. Cooler colors (blue) indicate earlier collaborations, while warmer colors (yellow) represent more recent partnerships. With a threshold of 30 occurrences, 48 countries were identified, grouped into four clusters. Figure S2A depicts the co-authorship network, while Figure S2B illustrates the citation network. China’s prominent position in both networks highlights its growing global research output and strong collaborative ties, particularly with the USA, South Korea, and Australia, underscoring strategic alliances and the central role of the USA in facilitating international academic cooperation.
Contribution of authors
A total of 96,357 researchers participated in the field of omics in lung cancer research. Table 1 lists the top 10 authors ranked by publication count, including each author’s number of publications, citation count, and H-index—an indicator of research impact—calculated using VOSviewer and R. ZHANG W leads with 108 publications and an H-index of 34, while WANG J has the highest H-index of 55, reflecting substantial academic influence.
Using VOSviewer, we conducted an author co-authorship analysis with a minimum threshold of 25 publications, identifying 128 authors clustered into six groups (Figure S3). Node size corresponds to the number of publications, and color indicates cluster membership. The red cluster exhibits the highest density, comprising 75 authors, indicating intense collaboration within this group. Figure S3 also includes a time overlay map, where the color gradient from blue to yellow reflects temporal progression, with blue representing earlier years. AMOS CI, an early contributor, achieved the highest citation count (3,537) and an H-index of 40, highlighting his enduring impact on the field.
Distribution of organizations
We employed VOSviewer to analyze the organizational distribution in omics research on lung cancer, constructing a co-authorship network with a threshold of 150 occurrences, which yielded 45 organizations. Figure 3A illustrates the collaborative landscape among these global research institutions, where nodes represent institutions and links indicate their collaborative efforts. Node size reflects the frequency and intensity of collaboration. The University of Texas MD Anderson Cancer Center emerges as the most interconnected institution, emphasizing its pivotal role in global research networks. Other leading institutions—Nanjing Medical University, Shanghai Jiao Tong University, and Fudan University—show recent collaborative activity (indicated in yellow), reflecting China’s increasing prominence in this domain. Figure 3B shows the citation network of organizations, highlighting the citation-based influence patterns across institutions.
Table 2 ranks the top 20 institutions by citation count from 2004 to 2024, incorporating metrics such as citations per document and total link strength. Harvard University ranks first with 86,059 citations and a total link strength of 721, demonstrating its leadership in omics lung cancer research. Notably, the top 12 institutions are based in the USA, reflecting their dominant publication output and scholarly influence. MIT achieves the highest citations-per-document ratio (397), indicating exceptional research impact. In contrast, MD Anderson, despite high productivity, shows a lower ratio of 73. Overall, all listed institutions exhibit robust collaboration networks, with total link strengths exceeding 100, contributing significantly to their visibility and influence.
Citation status of publications
We used CiteSpace to identify the most highly cited publications among 19,087 retrieved articles. Figure S4 showcases the top 25 publications with the strongest citation bursts from 2014 to 2024. Notably, “Comprehensive genomic characterization of squamous cell lung cancers” [2014] exhibited a burst strength of 91.91, lasting from 2014 to 2017 (10). This study advanced understanding of pathogenic genes through genomic data analysis and promoted targeted therapies for lung cancer. The article with the highest burst strength (112.08) is “Comprehensive molecular profiling of lung adenocarcinoma” (11), influential from 2015 to 2019. By analyzing recurrent alterations in key signaling pathways, this work elucidated subtypes of lung adenocarcinoma (LUAD), offering insights that support clinical advancements and deepen our understanding of lung cancer biology.
Table S2 displays the top 10 most cited publications, which are foundational to genomics and cancer research. Leading with 31,422 citations is “Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles” (12), a seminal methodological contribution (2005) by Subramanian et al. that enables functional interpretation of large-scale omics data. Published in Proceedings of the National Academy of Sciences (PNAS), the article introduces Gene Set Enrichment Analysis (GSEA) and uses it to reveal biological processes in malignant tumors such as LUAD. It provides insights into the potential of GSEA in elucidating cancer biology and demonstrates how integrated analysis yields a clearer understanding of underlying biological mechanisms. Following this, “A microRNA expression signature of human solid tumors defines cancer gene targets” has accumulated 4,749 citations, substantially fewer than the first article (13). Most publications listed in Table S2 appear in high-impact journals such as PNAS, Nature, and Science, underscoring their significant academic influence.
Frequency and co-occurrence analysis of keywords/terms
Keywords serve as concise summaries of core research themes, making high-frequency terms particularly valuable for identifying research hotspots. In this study, we employed the Bibliometrix R package to analyze 19,087 publications, extracting the top 20 most frequently used keywords in omics-related lung cancer research from 2004 to 2024. Visualizations were generated using Origin 2024, RStudio, and VOSviewer.
Figure 4A presents the top 20 keywords, with “expression” being the most frequent (4,274 occurrences), highlighting the central role of gene and protein expression studies in cancer omics research. This is followed by “lung cancer” (3,631 occurrences) and “cancer” (3,043 occurrences), reflecting the strong focus on lung cancer within broader oncological investigations. Additional keywords such as “survival”, “growth”, and “mutations” emphasize the research interest in cancer prognosis and genetic alterations. Collectively, these keyword trends offer insight into the evolving academic landscape and dominant research trajectories.
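Keyword frequency counts of this kind can be sketched with a plain counter. The example below is not the Bibliometrix implementation used in the study. It assumes each record carries one semicolon-separated keyword string, and the three sample strings are hypothetical.

```python
from collections import Counter


def keyword_counts(keyword_fields):
    """Count in how many records each keyword appears (case-insensitive)."""
    counts = Counter()
    for field in keyword_fields:
        # A set ensures a keyword is counted once per record, even if repeated.
        terms = {t.strip().lower() for t in field.split(";") if t.strip()}
        counts.update(terms)
    return counts


# Hypothetical keyword strings, one per record (not data from the study).
sample_fields = [
    "Expression; Lung cancer",
    "expression; survival",
    "lung cancer; expression; EXPRESSION",
]
counts = keyword_counts(sample_fields)
for term, n in counts.most_common(3):
    print(term, n)
```

Lowercasing and de-duplicating within a record are design choices made here for the sketch. Real pipelines usually also merge synonyms and spelling variants, which this example does not attempt.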
Figure 4B illustrates the three-field plot linking keywords, institutions, and countries. Institutions overall cover a broad range of research topics across the keyword spectrum. In addition, leading institutions such as Nanjing Medical University and Fudan University focus primarily on “lung cancer”, “lung adenocarcinoma” and “non-small cell lung cancer”, indicating concentrated expertise in these areas. In contrast, U.S.-based institutions show stronger linkages with keywords such as “biomarker” and “immunotherapy”, reflecting a strategic emphasis on novel therapeutic strategies. Overall, China maintains a leading position in lung cancer omics research, particularly in disease-specific genomic and molecular studies.
Figure 4C displays the cluster network visualization of the top 60 keywords. Node size reflects keyword frequency, while link thickness indicates co-occurrence strength between terms. The keywords are grouped into five distinct clusters. The red cluster (No. 1) centers on tumor biology, featuring terms such as “expression”, “progression”, “proliferation”, and “metastasis”. The green cluster (No. 2) encompasses terms like “cancer”, “prognosis”, “lung adenocarcinoma”, and “chemotherapy”, focusing on cancer outcomes and therapeutic strategies. The yellow cluster (No. 3) includes specific cancer types, while the blue cluster (No. 4) highlights molecular mechanisms with keywords like “identification”, “metabolism”, and “proteomics”. The purple cluster (No. 5) contains related terms from cluster 4. Overall, Figure 4C reveals a balanced integration of fundamental biological processes and clinical applications, with strong interconnections between “expression” and “prognosis”, underscoring the importance of gene expression profiling in predicting cancer outcomes. Figure 4D provides a temporal overlay visualization, showcasing a shift from foundational biological research (2016) toward clinical oncology applications (2019), particularly in “progression”, “prognosis”, and “immunotherapy”. This shift highlights an increasing emphasis on improving patient treatment outcomes through translational research.
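The co-occurrence strength behind such a network can be illustrated by counting how often two keywords appear in the same record. This is a minimal sketch of plain pair counting, not VOSviewer's own counting or clustering method, and the keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records):
    """Count how many records contain each unordered pair of keywords."""
    pairs = Counter()
    for keywords in records:
        unique = sorted(set(keywords))  # sort so each pair has one canonical order
        pairs.update(combinations(unique, 2))
    return pairs


# Hypothetical keyword lists, one per record (not data from the study).
sample_records = [
    ["expression", "prognosis", "metastasis"],
    ["expression", "prognosis"],
    ["immunotherapy", "prognosis"],
]
pair_counts = cooccurrence(sample_records)
print(pair_counts[("expression", "prognosis")])
```

In a network view, each pair count would become the weight of the link between two keyword nodes, so frequently paired terms such as the hypothetical “expression” and “prognosis” here would be drawn with a thicker link.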
Author keywords are selected by researchers to encapsulate the primary themes and core concepts of their studies, making them critical for tracking emerging trends and shifts in research priorities. We used the Bibliometrix R package to conduct this analysis and generated heatmaps to visualize temporal changes in keyword usage. Figure S5A presents the heatmap of author keyword distribution from 2004 to 2024. Each row represents a specific keyword, each column a time point, and color intensity reflects keyword frequency—darker shades indicate lower values, while brighter shades correspond to higher values (scale: 0–1).
The author keyword “mass spectrometry” emerged prominently in 2005, signaling early interest in omics technologies for cancer research. Topics related to genomics and molecular biology, such as “epigenetics”, “proteomics”, and “microRNA”, began gaining traction after 2010 and showed marked increases in usage between 2010 and 2015. This pattern reflects rising interest in the molecular underpinnings of cancer and in potential therapeutic targets. By around 2016, terms related to the tumor microenvironment, including “immune infiltration”, “tumor microenvironment”, and “immunotherapy”, became increasingly prominent, reflecting their growing recognition in cancer research. In recent years, the keyword “machine learning” has emerged sharply, indicating the expanding application of artificial intelligence (AI) in advanced cancer omics studies. Figure S5B illustrates temporal changes in terms extracted from article titles, showing subtle differences compared to Figure S5A. While the title-term heatmap captures broader thematic shifts, the author keyword heatmap highlights specific emerging fields and technological advancements. Together, both visualizations underscore the rising influence of cutting-edge technologies and evolving research priorities in cancer omics, particularly over the past decade.
Keyword burst analysis
“Burst terms” refer to keywords that experience a significant increase in citation frequency, indicating emerging hotspots within the research field. Analyzing keyword citation bursts offers valuable insights into dynamic shifts in research focus and potential future directions. Using CiteSpace, we evaluated keyword burst strengths, as shown in Figure S6. Since 2014, “genome-wide association” has consistently demonstrated high burst strength, highlighting an early emphasis on identifying genetic variations linked to lung cancer. A noticeable shift in research themes began around 2016, with increasing focus on specific cancer types such as “lung adenocarcinoma” (strength: 22.89) and “breast cancer” (strength: 14.18), accompanied by immune-related terms including “immunotherapy” (strength: 15.58) and “immune infiltration” (strength: 36.52). This trend reflects growing interest in both the molecular profiling of specific cancers and their clinical implications. In more recent years, microenvironment-related keywords—such as “tumor microenvironment” (strength: 41.18) and “immune microenvironment” (strength: 14.99)—have gained significant momentum. Keywords with the strongest burst signals provide predictive insight into emerging frontiers in cancer omics, suggesting expanding research on tumor-immune interactions, integrative use of genomics and proteomics, and their connections to immunotherapy and personalized treatment strategies (14,15).
General overview
A total of 20,336 records were initially retrieved from the WoSCC database. After filtering for document types (articles and reviews only) and language (English only), the dataset was refined to 19,087 publications. The data retrieval process is illustrated in Figure 1.
This study focuses on these 19,087 publications related to omics in lung cancer, which were saved as “plain text” and exported as “full-text citation records” for analysis using VOSviewer and CiteSpace.
The annual publication distribution from 2004 to 2024, shown in Figure S1, reveals a consistent upward trajectory through 2022, followed by a modest decline in 2023. The number of publications increased from 136 in 2004 to 942 in 2016, driven by technological advancements and growing research interest.
From 2017 onward, growth accelerated markedly, reaching 2,008 publications in 2021 and a record high of 2,289 in 2022. Overall, the field of omics in lung cancer has experienced sustained expansion over the past two decades, as reflected in the cumulative publication count, which continues to rise despite a slight decrease in annual output in 2023.
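To make the pace of growth concrete, the increase from 136 publications in 2004 to 942 in 2016 can be expressed as a compound annual growth rate. The sketch below uses the standard compound-growth formula; the function name is hypothetical, and this is not necessarily the calculation the bibliometric software performs for its own growth statistic.

```python
def compound_annual_growth_rate(first_count, last_count, n_years):
    """Average yearly growth rate that turns first_count into last_count."""
    return (last_count / first_count) ** (1 / n_years) - 1

# Counts taken from the text: 136 publications in 2004, 942 in 2016.
rate_2004_2016 = compound_annual_growth_rate(136, 942, 2016 - 2004)
print(f"{rate_2004_2016 * 100:.1f}% per year")  # approximately 17.5% per year
```

The same function could be applied to any other pair of yearly counts, provided both years are fully indexed; a partial year such as 2024 would understate the rate.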
Country distribution and geographical analysis
Figure 2A presents the top 10 most productive countries in omics research related to lung cancer from 2004 to 2024, ranked by publication counts. Notably, China leads with 28,281 publications, followed closely by the USA with 26,557. Germany [4,508], Japan [4,026], and the UK [3,571] rank third through fifth, while Italy, South Korea, France, Spain, and Canada exhibit considerably lower publication and citation counts. Despite its high output, China’s citation rate remains relatively low compared to that of the USA, which has accumulated 358,031 citations, indicating greater scholarly influence in the field. This citation-to-publication ratio suggests that while China demonstrates exceptional productivity, its academic impact may not be proportionally equivalent.
Figure 2B illustrates the global distribution of research output in omics and lung cancer, with darker shades indicating higher publication counts, particularly concentrated in North America, Europe, and East Asia—especially China and the USA. Figure 2C further illustrates the temporal distribution of publications among the top five countries (China, USA, Germany, Japan, and the UK) from 2004 to 2024. The color gradient ranging from deep blue to red reflects increasing publication intensity over time. China’s research output has grown dramatically, rising from only 44 articles in 2004 to a peak of 4,965 in 2022, underscoring its increasing dominance in this domain since 2016. In contrast, the USA maintained consistently high productivity, peaking at 2,377 in 2021, followed by a recent decline. Germany, Japan, and the UK exhibited steady but moderate growth without significant surges.
Using VOSviewer, we analyzed international collaborations through co-authorship and citation networks. Cooler colors (blue) indicate earlier collaborations, while warmer colors (yellow) represent more recent partnerships. With a threshold of 30 occurrences, 48 countries were identified, grouped into four clusters. Figure S2A depicts the co-authorship network, while Figure S2B illustrates the citation network. China’s prominent position in both networks highlights its growing global research output and strong collaborative ties, particularly with the USA, South Korea, and Australia, underscoring strategic alliances and the central role of the USA in facilitating international academic cooperation.
Contribution of authors
A total of 96,357 researchers participated in the field of omics in lung cancer research. Table 1 lists the top 10 authors ranked by publication count, including each author’s number of publications, citation count, and H-index—an indicator of research impact—calculated using VOSviewer and R. ZHANG W leads with 108 publications and an H-index of 34, while WANG J has the highest H-index of 55, reflecting substantial academic influence.
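For readers unfamiliar with the metric, the H-index is the largest number h such that an author has h publications with at least h citations each. A minimal sketch of that definition follows; the citation lists are invented for illustration and do not correspond to any author in Table 1.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h

# Hypothetical citation counts for one author's five papers.
print(h_index([10, 8, 5, 4, 3]))  # 4
```

Because the index depends on both output and citations, an author with fewer publications can still have a higher H-index than a more prolific one, as the comparison between ZHANG W and WANG J shows.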
Using VOSviewer, we conducted an author co-authorship analysis with a minimum threshold of 25 publications, identifying 128 authors clustered into six groups (Figure S3). Node size corresponds to the number of publications, and color indicates cluster membership. The red cluster exhibits the highest density, comprising 75 authors, indicating intense collaboration within this group. Figure S3 also includes a time overlay map, where the color gradient from blue to yellow reflects temporal progression, with blue representing earlier years. AMOS CI, an early contributor, achieved the highest citation count [3,537] and an H-index of 40, highlighting his enduring impact on the field.
Distribution of organizations
We employed VOSviewer to analyze the organizational distribution in omics research on lung cancer, constructing a co-authorship network with a threshold of 150 occurrences, which yielded 45 organizations. Figure 3A illustrates the collaborative landscape among these global research institutions, where nodes represent institutions and links indicate their collaborative efforts. Node size reflects the frequency and intensity of collaboration. The University of Texas MD Anderson Cancer Center emerges as the most interconnected institution, emphasizing its pivotal role in global research networks. Other leading institutions—Nanjing Medical University, Shanghai Jiao Tong University, and Fudan University—show recent collaborative activity (indicated in yellow), reflecting China’s increasing prominence in this domain. Figure 3B shows the citation network of organizations, highlighting the citation-based influence patterns across institutions.
Table 2 ranks the top 20 institutions by citation count from 2004 to 2024, incorporating metrics such as citations per document and total link strength. Harvard University ranks first with 86,059 citations and a total link strength of 721, demonstrating its leadership in omics lung cancer research. Notably, the top 12 institutions are based in the USA, reflecting their dominant publication output and scholarly influence. MIT achieves the highest citations-per-document ratio [397], indicating exceptional research impact. In contrast, MD Anderson, despite high productivity, shows a lower ratio of 73. Overall, all listed institutions exhibit robust collaboration networks, with total link strengths exceeding 100, contributing significantly to their visibility and influence.
Citation status of publications
We used CiteSpace to identify the most highly cited publications among 19,087 retrieved articles. Figure S4 showcases the top 25 publications with the strongest citation bursts from 2014 to 2024. Notably, “Comprehensive genomic characterization of squamous cell lung cancers” [2014] exhibited a burst strength of 91.91, lasting from 2014 to 2017 (10). This study advanced understanding of pathogenic genes through genomic data analysis and promoted targeted therapies for lung cancer. The article with the highest burst strength (112.08) is “Comprehensive molecular profiling of lung adenocarcinoma” (11), influential from 2015 to 2019. By analyzing recurrent alterations in key signaling pathways, this work elucidated subtypes of lung adenocarcinoma (LUAD), offering insights that support clinical advancements and deepen our understanding of lung cancer biology.
Table S2 displays the top 10 most cited publications, which are foundational to genomics and cancer research. Leading with 31,422 citations is “Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles” [2005] (12), a seminal methodological contribution by Subramanian et al. that enables functional interpretation of large-scale omics data. Published in Proceedings of the National Academy of Sciences (PNAS), the article introduces Gene Set Enrichment Analysis (GSEA) and uses it to reveal biological processes in malignant tumors such as LUAD. It illustrates the potential of GSEA for elucidating cancer biology and demonstrates how integrated analysis yields a clearer understanding of underlying biological mechanisms. Following this, “A microRNA expression signature of human solid tumors defines cancer gene targets” has accumulated 4,749 citations, substantially fewer than the first article (13). Most publications listed in Table S2 appear in high-impact journals such as PNAS, Nature, and Science, underscoring their significant academic influence.
Frequency and co-occurrence analysis of keywords/terms
Keywords serve as concise summaries of core research themes, making high-frequency terms particularly valuable for identifying research hotspots. In this study, we employed the Bibliometrix R package to analyze 19,087 publications, extracting the top 20 most frequently used keywords in omics-related lung cancer research from 2004 to 2024. Visualizations were generated using Origin 2024, RStudio, and VOSviewer.
Figure 4A presents the top 20 keywords, with “expression” being the most frequent (4,274 occurrences), highlighting the central role of gene and protein expression studies in cancer omics research. This is followed by “lung cancer” (3,631 occurrences) and “cancer” (3,043 occurrences), reflecting the strong focus on lung cancer within broader oncological investigations. Additional keywords such as “survival”, “growth”, and “mutations” emphasize the research interest in cancer prognosis and genetic alterations. Collectively, these keyword trends offer insight into the evolving academic landscape and dominant research trajectories.
Figure 4B illustrates the three-field plot linking keywords, institutions, and countries. Overall, institutions cover a broad range of research topics across the keyword spectrum. Leading Chinese institutions such as Nanjing Medical University and Fudan University focus primarily on “lung cancer”, “lung adenocarcinoma”, and “non-small cell lung cancer”, indicating concentrated expertise in these areas. In contrast, U.S.-based institutions show stronger linkages with keywords such as “biomarker” and “immunotherapy”, reflecting a strategic emphasis on novel therapeutic strategies. Overall, China maintains a leading position in lung cancer omics research, particularly in disease-specific genomic and molecular studies.
Discussion
Global trends and contributions
This study presents a bibliometric analysis of omics research in lung cancer from 2004 to 2024, based on 19,087 publications indexed in the WoSCC. The steadily increasing annual publication output (growth rate of 8.35%) and the rising international co-authorship index indicate sustained expansion and growing global collaboration in this field. These quantitative indicators confirm that omics-based approaches have evolved from exploratory tools into central methodologies in lung cancer research.
In terms of academic contributions, China and the USA have emerged as the two dominant contributors. China leads in publication volume, while the USA ranks first in citations, reflecting their advanced technological development and economic backgrounds. As illustrated in Figure S7, although the USA has historically dominated, China has risen rapidly since 2022, driven by substantial investments in science and technology and large-scale data generation capabilities, indicative of its growing research capacity. Notably, eight of the top ten most productive authors in omics lung cancer research are affiliated with Chinese institutions, highlighting China’s dominant role in driving research output. In contrast, seven of the ten most cited articles were published in U.S.-based journals, underscoring the United States’ leadership in methodological innovation and global scholarly influence.
Early genomics foundations
From a temporal perspective, early omics research in lung cancer [2004–2015] was predominantly genomics-driven, focusing on identifying genetic alterations underlying tumor initiation and progression. Representative studies from this period include Janne et al. [2004], who employed high-resolution single-nucleotide polymorphism arrays to characterize loss of heterozygosity in lung cancer cell lines, thereby establishing a foundational understanding of genomic instability in lung cancer (16). Similarly, Calin et al. [2004] demonstrated that microRNA genes are frequently located in fragile genomic regions implicated in cancer, revealing a regulatory mechanism linking genomic alterations to oncogenesis, progression, and metastasis (17). These seminal studies are consistent with our keyword burst analysis, in which terms such as “gene expression” and mutation-related concepts dominated the early research landscape. Between 2004 and 2015, omics research in lung cancer advanced gradually, with hotspots expanding from genomics to proteomics (18-21) and metabolomics (22-24). This expansion provided multidimensional biological insights for elucidating the mechanisms of cancer onset and identifying new therapeutic targets. It also reflects the growing recognition that lung cancer progression and treatment response are driven by complex, multi-layered molecular interactions rather than isolated genomic alterations.
Between 2004 and 2015, keyword bursts were predominantly centered on “gene expression”, “immune”, and “immunotherapy”. Notably, Anderson’s 2005 study “The Sentinel Within: Exploiting the Immune System for Cancer Biomarkers” (25) leveraged the immune system to identify cancer biomarkers, demonstrating that simultaneous analysis of multiple autoantigens enhances the sensitivity and specificity of cancer detection—thereby improving clinical diagnostic accuracy and enabling more precise identification of immunotherapy targets.
Furthermore, metabolically oriented studies extended omics beyond genetic changes to include their functional metabolic consequences. For example, “Nuclear magnetic resonance in conjunction with functional genomics suggests mitochondrial dysfunction in a murine model of cancer cachexia” [2010] (26) combined nuclear magnetic resonance with functional genomics to reveal mitochondrial dysfunction and altered energy metabolism in a cancer cachexia model, offering potential biomarkers and novel insights for early diagnosis and therapeutic intervention in lung cancer.
Integrated multi-omics era
Between 2016 and 2022, omics research in lung cancer entered a phase of rapid expansion, with output peaking at 2,289 publications in 2022. This period was marked by a clear transition from single-omics approaches to integrated multi-omics strategies. In a representative study, Jiang et al. (27) combined genomics and proteomics to identify tumor immune-related biomarkers and characterize immune cell composition within the tumor microenvironment (TME). Concurrently, keywords including “tumor immune infiltration” and “network pharmacology” gained prominence, reflecting growing interest in systems-level analyses of tumor-immune interactions and drug-target networks. “Comprehensive analyses of tumor immunity: implications for cancer immunotherapy” [2016] (28) further underscored the importance of integrating genomic alterations with chemokine and receptor expression profiles to elucidate mechanisms of immune infiltration and predict therapeutic response.
AI/ML in lung cancer omics
The accelerated growth of omics research in lung cancer is driven by three interrelated factors. First, after more than a decade of methodological development, omics technologies (particularly genomics, transcriptomics, and metabolomics) have matured, enabling deeper investigation into complex diseases like cancer and significantly increasing research output. Second, the rise of precision medicine as a central paradigm in both cancer research and clinical practice has created a strong demand for detailed molecular characterization, thereby further accelerating omics-based studies (29,30). Third, the widespread adoption of AI and machine learning (ML) has greatly enhanced the efficiency of analyzing high-dimensional genomic and multi-omics data, accelerating biomarker discovery and advancing cancer diagnosis, prognosis, and personalized therapy (31-33).
In lung-cancer omics, AI/ML addresses several disease-specific bottlenecks highlighted by our keyword dynamics. These methods enable efficient feature selection from high-dimensional data, allowing robust signal extraction despite the pronounced molecular heterogeneity of lung tumors. AI/ML also facilitates multi-modal integration across omics layers and clinical phenotypes, supporting clinically meaningful patient stratification—such as molecular subtypes specific to LUAD—and outcome prediction. In multi-center lung cancer cohorts, ML-based methods improve data harmonization by handling missing values and mitigating batch effects. Moreover, for immune-related hotspots, AI/ML-based deconvolution and cell-state inference help characterize the TME, linking immune contextures to immunotherapy response and resistance mechanisms. Collectively, these applications transform lung cancer multi-omics from descriptive molecular profiling into actionable clinical decision support tools.
Emerging hotspots and links
The research hotspots and emerging frontiers identified in 2024 continue to center on multi-omics integration, precision medicine, and AI in lung cancer. Keyword frequency and burst analyses suggest these trends represent direct responses to persistent clinical challenges, including therapeutic resistance, insufficient predictive biomarkers, and variable treatment outcomes. A recent review article by Aldea et al. (34) highlighted that effective precision medicine relies on rational drug-target matching and the expanding role of data science in translating intricate molecular data into clinical decisions.
Clinically, multi-omics integration offers a powerful approach to capture tumor heterogeneity and mechanisms of resistance by jointly profiling genomic, transcriptional, signaling, and metabolic features, while AI-based approaches facilitate the extraction of predictive patterns from high-dimensional, heterogeneous datasets. Accordingly, the emerging hotspots identified in our analysis exhibit an interconnected rather than isolated structure. The strong association between “expression” and “prognosis” indicates that transcriptomic discoveries are increasingly linked to clinically relevant endpoints. Over time, this linkage extends toward immunotherapy and TME-related terms, reflecting a shift from tumor-intrinsic alterations to immune context-dependent outcomes. Within this framework, multi-omics integration provides mechanistic continuity, whereas ML serves as an enabling layer that connects diverse molecular features to predictive clinical outcomes.
Translational challenges and opportunities
Beyond dominant research themes, our keyword network reveals several underexplored yet promising interdisciplinary directions. The relatively weak connections between molecular mechanism-focused clusters—such as proteomics and metabolism—and clinical outcome-oriented clusters suggest opportunities to strengthen translational integration. In addition, although immune-related terms show significant burst activity, tighter coupling between tumor-intrinsic programs and TME-driven processes is still needed to improve modeling of therapeutic resistance and metastasis. The emergence of ML and network pharmacology points to potential methodological bridges; however, their integration with multi-omics evidence for clinically actionable validation remains limited.
Despite substantial advances in molecular profiling and targeted therapies, lung cancer management continues to be constrained by several fundamental challenges, including therapeutic resistance, metastatic progression, phenotypic plasticity, and the limited robustness of predictive biomarkers. These obstacles stem from the dynamic and context-dependent nature of tumor evolution, underscoring the need for integrative analytical frameworks that go beyond single-layer molecular analyses.
Growing evidence shows that therapeutic resistance in lung cancer is rarely caused by isolated molecular events but instead arises through coordinated, multi-layer interactions involving genomic and epigenomic plasticity, transcriptional reprogramming, signaling rewiring, metabolic adaptation, and selective pressures exerted by the TME (3). This systems-level adaptability enables tumor cells to survive therapeutic stress and represents a central obstacle to durable treatment responses. Consistent with this biological understanding, our bibliometric analysis reveals sustained high-frequency usage and burst signals related to the TME, immunotherapy, and multi-omics integration, indicating a field-wide transition from single-gene paradigms toward integrative, context-aware models (35,36).
Recent spatially resolved multi-omics studies have provided direct tissue-scale evidence that tumor evolution and immune dysfunction are strongly shaped by spatial microenvironmental organization. Mo et al. demonstrated that integrating spatial transcriptomics with multiplexed imaging directly links microenvironmental context to immune exhaustion (37). Complementarily, a large-scale spatial multi-omics study in advanced non-small cell lung cancer (NSCLC) patients receiving PD-1 immunotherapy identified spatial immune and tumor signatures that robustly predicted clinical outcomes across independent cohorts (38). These findings illustrate how the bibliometric hotspot centered on “TME/immunotherapy” is being translated into clinically actionable biomarkers for response prediction and resistance stratification.
Metastasis, which accounts for the majority of lung cancer-related deaths, emerges in our keyword network as a convergence point of invasion, immune modulation, and metabolic stress adaptation. The increasing prominence of single-cell and spatial immune terms aligns with recent high-resolution atlases showing that metastatic progression is closely associated with immune cell functional states and plasticity—not merely cell abundance. This insight is exemplified by a 2024 Nature Communications multi-omics mapping of tumor-associated macrophage reprogramming (39).
Finally, an under-recognized yet clinically urgent challenge is the early detection of histologic transformation and lineage plasticity, such as small-cell-like transition in LUAD with EGFR mutation, which is increasingly highlighted by recent keyword bursts related to precision medicine and AI. Notably, El Zarif et al. demonstrated that epigenomic profiling of circulating cell-free DNA (cfDNA) can detect small-cell transformation prior to radiographic or clinical progression, underscoring the value of integrative, minimally invasive approaches beyond conventional genotyping (40).
Study implications and limits
Collectively, the convergence of bibliometric patterns and recent multi-omics evidence indicates that the rapid expansion of omics research in lung cancer is not solely driven by technological advancements, but also reflects a strategic response to persistent diagnostic and therapeutic challenges. By systematically mapping research hotspots and thematic evolution, this bibliometric study elucidates how integrative, spatially informed, and data-driven approaches are being leveraged to address key challenges in lung cancer, specifically therapeutic resistance, metastasis, and limitations in predictive biomarker development. In doing so, it highlights the utility of bibliometric analysis in guiding future precision oncology research.
As a bibliometric study, this research has inherent limitations. Relying on WoSCC may lead to incomplete coverage, especially of non-English publications and of studies published in journals with low influence or a regional focus, which may introduce selection bias. This choice nonetheless prioritizes citation consistency, indexing quality, and methodological reproducibility over exhaustive inclusion of all global research outputs.
Global trends and contributions
This study presents a bibliometric analysis of omics research in lung cancer from 2004 to 2024, based on 19,087 publications indexed in the WoSCC. The steadily increasing annual publication output (growth rate of 8.35%) and the rising international co-authorship index indicate sustained expansion and growing global collaboration in this field. These quantitative indicators confirm that omics-based approaches have evolved from exploratory tools into central methodologies in lung cancer research.
In terms of academic contributions, China and the USA have emerged as the two dominant contributors. China leads in publication volume, while the USA ranks first in citations, reflecting their advanced technological development and economic backgrounds. As illustrated in Figure S7, although the USA has historically dominated, China has risen rapidly since 2022, driven by substantial investments in science and technology and large-scale data generation capabilities, indicative of its growing research capacity. Notably, eight of the top ten most productive authors in omics lung cancer research are affiliated with Chinese institutions, highlighting China’s dominant role in driving research output. In contrast, seven of the ten most cited articles were published in U.S.-based journals, underscoring the United States’ leadership in methodological innovation and global scholarly influence.
Early genomics foundations
From a temporal perspective, early omics research in lung cancer [2004–2015] was predominantly genomics-driven, focusing on identifying genetic alterations underlying tumor initiation and progression. Representative studies from this period include Janne et al. [2004], who employed high-resolution single-nucleotide polymorphism arrays to characterize loss of heterozygosity in lung cancer cell lines, thereby establishing a foundational understanding of genomic instability in lung cancer (16). Similarly, Calin et al. [2004] demonstrated that microRNA genes are frequently located in fragile genomic regions implicated in cancer, revealing a regulatory mechanism linking genomic alterations to oncogenesis, progression, and metastasis (17). These seminal studies are consistent with our keyword burst analysis, in which terms such as “gene expression” and mutation-related concepts dominated the early research landscape. The development of omics in lung cancer research gradually steps forward from 2004 to 2015, expanding the hotspot from genomics to proteomics (18-21) and metabolomics (22-24), providing multidimensional biological insights for elucidating the mechanisms of cancer onset and identifying new therapeutic targets. This evolution reflects the growing recognition that lung cancer progression and treatment response are driven by complex, multi-layered molecular interactions rather than isolated genomic alterations.
Between 2004 and 2015, keyword bursts were predominantly centered on “gene expression”, “immune”, and “immunotherapy”. Notably, Anderson’s 2005 study “The Sentinel Within: Exploiting the Immune System for Cancer Biomarkers” (25) leveraged the immune system to identify cancer biomarkers, demonstrating that simultaneous analysis of multiple autoantigens enhances the sensitivity and specificity of cancer detection—thereby improving clinical diagnostic accuracy and enabling more precise identification of immunotherapy targets.
Furthermore, metabolically oriented studies extended omics beyond genetic changes to their functional metabolic consequences. For example, “Nuclear magnetic resonance in conjunction with functional genomics suggests mitochondrial dysfunction in a murine model of cancer cachexia” [2010] (26) combined nuclear magnetic resonance with functional genomics to reveal mitochondrial dysfunction and altered energy metabolism in a cancer cachexia model, offering potential biomarkers and novel insights for early diagnosis and therapeutic intervention in lung cancer.
Integrated multi-omics era
Between 2016 and 2022, omics research in lung cancer entered a phase of rapid expansion, with publication output peaking in 2022 at 2,289 publications. This period was marked by a clear transition from single-omics approaches to integrated multi-omics strategies. Representative studies, such as Jiang et al. (27), combined genomics and proteomics to identify tumor immune-related biomarkers and characterize immune cell composition within the TME. Concurrently, keywords including “tumor immune infiltration” and “network pharmacology” gained prominence, reflecting growing interest in systems-level analyses of tumor-immune interactions and drug-target networks. “Comprehensive analyses of tumor immunity: implications for cancer immunotherapy” [2016] (28) further underscored the importance of integrating genomic alterations with chemokine and receptor expression profiles to elucidate mechanisms of immune infiltration and predict therapeutic response.
AI/ML in lung cancer omics
The accelerated growth of omics research in lung cancer is driven by three interrelated factors. First, after more than a decade of methodological development, omics technologies (particularly genomics, transcriptomics, and metabolomics) have matured, enabling deeper investigation into complex diseases like cancer and significantly increasing research output. Second, the rise of precision medicine as a central paradigm in both cancer research and clinical practice has created a strong demand for detailed molecular characterization, further accelerating omics-based studies (29,30). Third, the widespread adoption of AI and machine learning (ML) has greatly enhanced the efficiency of analyzing high-dimensional genomic and multi-omics data, accelerating biomarker discovery and advancing cancer diagnosis, prognosis, and personalized therapy (31-33).
In lung-cancer omics, AI/ML addresses several disease-specific bottlenecks highlighted by our keyword dynamics. These methods enable efficient feature selection from high-dimensional data, allowing robust signal extraction despite the pronounced molecular heterogeneity of lung tumors. AI/ML also facilitates multi-modal integration across omics layers and clinical phenotypes, supporting clinically meaningful patient stratification—such as molecular subtypes specific to LUAD—and outcome prediction. In multi-center lung cancer cohorts, ML-based methods improve data harmonization by handling missing values and mitigating batch effects. Moreover, for immune-related hotspots, AI/ML-based deconvolution and cell-state inference help characterize the TME, linking immune contextures to immunotherapy response and resistance mechanisms. Collectively, these applications transform lung cancer multi-omics from descriptive molecular profiling into actionable clinical decision support tools.
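The data harmonization step mentioned above can be illustrated with a deliberately simple sketch. It assumes a single feature measured across two hypothetical centers, fills missing values with the within-center median, and then subtracts each center's mean. The function name `harmonize`, the toy values, and the center labels are illustrative assumptions, not taken from any cited study; ML-based harmonization in practice is considerably more elaborate.

```python
from collections import defaultdict
from statistics import mean, median


def harmonize(values, batches):
    """Impute missing values per batch, then center each batch on zero.

    values  : list of floats, with None marking a missing measurement
    batches : list of batch (center) labels, same length as values
    Assumes every batch has at least one observed value.
    """
    # Collect the observed (non-missing) values for each batch.
    observed = defaultdict(list)
    for v, b in zip(values, batches):
        if v is not None:
            observed[b].append(v)

    # Fill each missing value with the median of its own batch.
    fill = {b: median(vs) for b, vs in observed.items()}
    filled = [fill[b] if v is None else v for v, b in zip(values, batches)]

    # Subtract the batch mean so that batches share a common baseline.
    grouped = defaultdict(list)
    for v, b in zip(filled, batches):
        grouped[b].append(v)
    batch_mean = {b: mean(vs) for b, vs in grouped.items()}
    return [v - batch_mean[b] for v, b in zip(filled, batches)]


# Toy feature from two hypothetical centers with a systematic offset.
expression = [2.0, None, 4.0, 12.0, 10.0, 14.0]
center = ["A", "A", "A", "B", "B", "B"]
harmonized = harmonize(expression, center)
print(harmonized)
```

Mean centering removes only an additive offset between centers; it does not correct differences in variance or confounding between center and biology, which is why dedicated batch-correction methods are normally preferred.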
Emerging hotspots and links
The research hotspots and emerging frontiers identified in 2024 continue to center on multi-omics integration, precision medicine, and AI in lung cancer. Keyword frequency and burst analyses suggest these trends represent direct responses to persistent clinical challenges, including therapeutic resistance, insufficient predictive biomarkers, and variable treatment outcomes. A recent review article by Aldea et al. (34) highlighted that effective precision medicine relies on rational drug-target matching and the expanding role of data science in translating intricate molecular data into clinical decisions.
Clinically, multi-omics integration offers a powerful approach to capture tumor heterogeneity and mechanisms of resistance by jointly profiling genomic, transcriptional, signaling, and metabolic features, while AI-based approaches facilitate the extraction of predictive patterns from high-dimensional, heterogeneous datasets. The emerging hotspots identified in our analysis exhibit an interconnected rather than isolated structure. The strong association between “expression” and “prognosis” indicates that transcriptomic discoveries are increasingly linked to clinically relevant endpoints. Over time, this linkage extends toward immunotherapy and TME-related terms, reflecting a shift from tumor-intrinsic alterations to immune context-dependent outcomes. Within this framework, multi-omics integration provides mechanistic continuity, whereas ML serves as an enabling layer that connects diverse molecular features to predictive clinical outcomes.
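The notion of association strength between two keywords can be made concrete with a minimal sketch that counts how often keyword pairs co-occur in the same publication. The records below are invented toy data and the function name `cooccurrence` is hypothetical. This is a plain pair count under those assumptions, not the normalization or clustering performed by VOSviewer or CiteSpace.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records):
    """Count, for every keyword pair, the number of records containing both."""
    links = Counter()
    for keywords in records:
        # Normalize case and drop duplicates within a single record.
        unique = sorted({k.lower() for k in keywords})
        # Each unordered pair in a record contributes one co-occurrence.
        for a, b in combinations(unique, 2):
            links[(a, b)] += 1
    return links


# Toy keyword lists, one per hypothetical publication.
records = [
    ["expression", "prognosis", "lung adenocarcinoma"],
    ["expression", "prognosis", "immunotherapy"],
    ["immunotherapy", "tumor microenvironment"],
    ["expression", "Prognosis"],
]
links = cooccurrence(records)
print(links[("expression", "prognosis")])
```

Pairs are stored in alphabetical order, so a lookup should use the sorted form of the two keywords. A high count for a pair such as “expression” and “prognosis” corresponds to the kind of strong link discussed above.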
Translational challenges and opportunities
Beyond dominant research themes, our keyword network reveals several underexplored yet promising interdisciplinary directions. The relatively weak connections between molecular mechanism-focused clusters—such as proteomics and metabolism—and clinical outcome-oriented clusters suggest opportunities to strengthen translational integration. In addition, although immune-related terms show significant burst activity, tighter coupling between tumor-intrinsic programs and TME-driven processes is still needed to improve modeling of therapeutic resistance and metastasis. The emergence of ML and network pharmacology points to potential methodological bridges; however, their integration with multi-omics evidence for clinically actionable validation remains limited.
Despite substantial advances in molecular profiling and targeted therapies, lung cancer management continues to be constrained by several fundamental challenges, including therapeutic resistance, metastatic progression, phenotypic plasticity, and the limited robustness of predictive biomarkers. These obstacles stem from the dynamic and context-dependent nature of tumor evolution, underscoring the need for integrative analytical frameworks that go beyond single-layer molecular analyses.
Growing evidence shows that therapeutic resistance in lung cancer is rarely caused by isolated molecular events but instead arises through coordinated, multi-layer interactions involving genomic and epigenomic plasticity, transcriptional reprogramming, signaling rewiring, metabolic adaptation, and selective pressures exerted by the TME (3). This systems-level adaptability enables tumor cells to survive therapeutic stress and represents a central obstacle to durable treatment responses. Consistent with this biological understanding, our bibliometric analysis reveals sustained high-frequency usage and burst signals related to the TME, immunotherapy, and multi-omics integration, indicating a field-wide transition from single-gene paradigms toward integrative, context-aware models (35,36).
Recent spatially resolved multi-omics studies have provided direct tissue-scale evidence that tumor evolution and immune dysfunction are strongly shaped by spatial microenvironmental organization. Mo et al. demonstrated that integrating spatial transcriptomics with multiplexed imaging directly links microenvironmental context to immune exhaustion (37). Complementarily, a large-scale spatial multi-omics study in advanced non-small cell lung cancer (NSCLC) patients receiving PD-1 immunotherapy identified spatial immune and tumor signatures that robustly predicted clinical outcomes across independent cohorts (38). These findings illustrate how the bibliometric hotspot centered on “TME/immunotherapy” is being translated into clinically actionable biomarkers for response prediction and resistance stratification.
Metastasis, which accounts for the majority of lung cancer-related deaths, emerges in our keyword network as a convergence point of invasion, immune modulation, and metabolic stress adaptation. The increasing prominence of single-cell and spatial immune terms aligns with recent high-resolution atlases showing that metastatic progression is closely associated with immune cell functional states and plasticity—not merely cell abundance. This insight is exemplified by a 2024 Nature Communications multi-omics mapping of tumor-associated macrophage reprogramming (39).
Finally, an under-recognized yet clinically urgent challenge is the early detection of histologic transformation and lineage plasticity, such as small-cell-like transition in LUAD with EGFR mutation. This challenge is increasingly highlighted by recent keyword bursts related to precision medicine and AI. Notably, El Zarif et al. demonstrated that epigenomic profiling of circulating cell-free DNA (cfDNA) can detect small-cell transformation prior to radiographic or clinical progression, underscoring the value of integrative, minimally invasive approaches beyond conventional genotyping (40).
Study implications and limits
Collectively, the convergence of bibliometric patterns and recent multi-omics evidence indicates that the rapid expansion of omics research in lung cancer is not solely driven by technological advancements, but also reflects a strategic response to persistent diagnostic and therapeutic challenges. By systematically mapping research hotspots and thematic evolution, this bibliometric study elucidates how integrative, spatially informed, and data-driven approaches are being leveraged to address key challenges in lung cancer, specifically therapeutic resistance, metastasis, and limitations in predictive biomarker development, thereby highlighting the utility of bibliometric analysis in guiding future precision oncology research.
As a bibliometric study, this research has inherent limitations. Relying solely on WoSCC may lead to incomplete coverage, especially of non-English publications and of studies published in low-impact or regionally focused journals, which may introduce selection bias. This choice prioritizes citation consistency, indexing quality, and methodological reproducibility over exhaustive inclusion of all global research outputs.
Conclusions
This bibliometric analysis summarizes the evolution of omics research in lung cancer from 2004 to 2024, revealing sustained growth and increasing global collaboration, with China and the USA as the leading contributors. The transition from single-omics approaches to integrative multi-omics strategies, coupled with the expanding application of AI, reflects a broader shift toward data-driven frameworks in lung cancer research. Critically, the identified thematic progression demonstrates that omics research has become increasingly aligned with major clinical challenges, including tumor heterogeneity, therapeutic resistance, metastatic progression, and the need for robust and clinically actionable biomarkers.
Future studies that integrate diverse data sources and AI-driven multi-omics analyses may offer a more comprehensive perspective and help identify clinically significant biomarkers. Overall, these advances are expected to enable earlier diagnosis, improved patient stratification, and more precise treatment strategies in lung cancer.
Supplementary
The article’s supplementary files as
Source: PubMed Central (JATS). Licensing follows the original publisher's policy; please cite the original article when quoting.