Low OLFM1 and BMP6 Expression Predicts Recurrence in Early-Stage Nonsquamous NSCLC with Pure Solid Tumor Appearance.
1/5 보강
PICO 자동 추출 (휴리스틱, conf 2/4)
유사 논문P · Population 대상 환자/모집단
33 patients with nonsquamous NSCLC with disease recurrence (recurrence group) and 33 counterparts (control group) extracted from patients without disease recurrence using propensity score matching (cohort 1).
I · Intervention 중재 / 시술
추출되지 않음
C · Comparison 대조 / 비교
추출되지 않음
O · Outcome 결과 / 결론
we found that low expression of six genes (BMP6, KCNK3, NFASC, OLFM1, PEG3, and TNXB) was associated with disease recurrence.
[UNLABELLED] Although pathologic stage I (pStage I) non-small cell lung cancer (NSCLC) has a good prognosis, some patients experience disease recurrence.
- 표본수 (n) 125
APA
Suda K, Yoshida Y, et al. (2025). Low OLFM1 and BMP6 Expression Predicts Recurrence in Early-Stage Nonsquamous NSCLC with Pure Solid Tumor Appearance.. Cancer research communications, 5(12), 2186-2196. https://doi.org/10.1158/2767-9764.CRC-25-0186
MLA
Suda K, et al.. "Low OLFM1 and BMP6 Expression Predicts Recurrence in Early-Stage Nonsquamous NSCLC with Pure Solid Tumor Appearance.." Cancer research communications, vol. 5, no. 12, 2025, pp. 2186-2196.
PMID
41259663 ↗
Abstract 한글 요약
[UNLABELLED] Although pathologic stage I (pStage I) non-small cell lung cancer (NSCLC) has a good prognosis, some patients experience disease recurrence. Identification of prognostic markers for pStage I NSCLC may facilitate personalized perioperative treatment by expanding the candidates for adjuvant treatment. Patients with NSCLC with ground-glass opacity have excellent survival outcomes. Therefore, in this study, we explored prognostic biomarkers in pure solid nonsquamous NSCLC. We focused on nonsquamous NSCLC because the gene expression status is distinct from that of squamous cell carcinomas. RNA sequencing was performed on frozen tumor specimens from 33 patients with nonsquamous NSCLC with disease recurrence (recurrence group) and 33 counterparts (control group) extracted from patients without disease recurrence using propensity score matching (cohort 1). The candidate genes were further refined using an independent real-world cohort 2a (N = 125) and validated in The Cancer Genome Atlas cohort. Through the analysis of cohort 1 and cohort 2a, we found that low expression of six genes (BMP6, KCNK3, NFASC, OLFM1, PEG3, and TNXB) was associated with disease recurrence. The prognostic impact of the six genes was confirmed using The Cancer Genome Atlas lung adenocarcinoma database. Multivariable proportional hazard analysis using the cohort 2a dataset revealed that the combination of BMP6 and OLFM1 status predicted recurrence-free survival. In conclusion, we found that low BMP6/OLFM1 gene status is a potential biomarker to identify patients with high-risk pStage I pure solid nonsquamous NSCLC after pulmonary resection.
[SIGNIFICANCE] The identification of subgroups with a high risk of disease recurrence among patients with NSCLC with pStage I is an important clinical issue. We explored genetic factors in patients with pStage I NSCLC with a high-risk clinical feature and identified low expression of OLFM1 and BMP6 genes as candidate biomarkers. Clinical application of these biomarkers, together with others, will open an opportunity to administer adequate adjuvant treatment for these high-risk patients.
[SIGNIFICANCE] The identification of subgroups with a high risk of disease recurrence among patients with NSCLC with pStage I is an important clinical issue. We explored genetic factors in patients with pStage I NSCLC with a high-risk clinical feature and identified low expression of OLFM1 and BMP6 genes as candidate biomarkers. Clinical application of these biomarkers, together with others, will open an opportunity to administer adequate adjuvant treatment for these high-risk patients.
🏷️ 키워드 / MeSH 📖 같은 키워드 OA만
같은 제1저자의 인용 많은 논문 (1)
📖 전문 본문 읽기 PMC JATS · ~60 KB · 영문
Introduction
Introduction
Complete surgical resection is the cornerstone of treatment for early-stage non–small cell lung cancer (NSCLC). Although adjuvant platinum doublet is administered to patients with pathologic stage (pStage) II to III disease, the National Comprehensive Cancer Network guidelines do not recommend such adjuvant chemotherapy for pStage IA patients (1). The Japanese guidelines also do not recommend the use of adjuvant platinum doublet for patients with pStage I NSCLC and instead recommend oral tegafur uracil (1–2 years) for pStage I NSCLC with tumor sizes larger than 2 cm. Although a good prognosis is expected for patients with pStage I NSCLC, some patients experience disease recurrence after complete surgical resection.
The great success of recent clinical trials on molecular-targeted drugs (2, 3) and immune checkpoint inhibitors (4–6) in adjuvant settings have led to the approvals of these agents as adjuvant treatments for patients with pStage II to III (IB–III for some drugs) NSCLC after pulmonary resection. However, it is not realistic to use these new drugs in all patients with pStage I NSCLC from the standpoints of cost-effectiveness and drug side effects, because the large percentage of these patients is cured by surgery alone. Therefore, the establishment of poor prognostic biomarkers in pStage I NSCLC may help identify candidates for adjuvant treatments, thereby improving the outcomes of patients with early-stage NSCLC after surgical resection.
Numerous studies have investigated the poor prognostic factors in patients with early-stage NSCLC. One of the easiest prognostic factors for clinical application in stage I nonsquamous NSCLC is the absence of ground-glass opacity (GGO) in preoperative CT imaging (7–9). Pathologic factors such as vascular/lymphatic invasion and/or high histologic grade are also reported as poor prognostic factors (10, 11). Mutational status of driver genes (12–14) and tumor suppressor genes (15, 16) is also reportedly prognostic; however, the magnitude of prognostic differences is not large enough to affect clinical decision-making.
Gene expression status is another potential prognostic biomarker for NSCLC although the results vary among studies. One potential obstacle in using gene expression status as a biomarker is the variation in gene expression by tumor histology. For example, genetic alterations in lung squamous cell carcinoma resemble alterations in squamous cell carcinomas of other organs rather than nonsquamous NSCLCs (17). Additionally, the prognostic implication of gene expression can vary by tumor stage. For example, TGFβ expression has different prognostic impacts in early-stage NSCLC versus advanced-stage disease (18).
Therefore, in this study, we aimed to identify genetic biomarkers associated with a high risk of disease recurrence by investigating a cohort with relatively homogeneous gene expression status, namely patients with pStage I nonsquamous NSCLC without GGO.
Complete surgical resection is the cornerstone of treatment for early-stage non–small cell lung cancer (NSCLC). Although adjuvant platinum doublet is administered to patients with pathologic stage (pStage) II to III disease, the National Comprehensive Cancer Network guidelines do not recommend such adjuvant chemotherapy for pStage IA patients (1). The Japanese guidelines also do not recommend the use of adjuvant platinum doublet for patients with pStage I NSCLC and instead recommend oral tegafur uracil (1–2 years) for pStage I NSCLC with tumor sizes larger than 2 cm. Although a good prognosis is expected for patients with pStage I NSCLC, some patients experience disease recurrence after complete surgical resection.
The great success of recent clinical trials on molecular-targeted drugs (2, 3) and immune checkpoint inhibitors (4–6) in adjuvant settings have led to the approvals of these agents as adjuvant treatments for patients with pStage II to III (IB–III for some drugs) NSCLC after pulmonary resection. However, it is not realistic to use these new drugs in all patients with pStage I NSCLC from the standpoints of cost-effectiveness and drug side effects, because the large percentage of these patients is cured by surgery alone. Therefore, the establishment of poor prognostic biomarkers in pStage I NSCLC may help identify candidates for adjuvant treatments, thereby improving the outcomes of patients with early-stage NSCLC after surgical resection.
Numerous studies have investigated the poor prognostic factors in patients with early-stage NSCLC. One of the easiest prognostic factors for clinical application in stage I nonsquamous NSCLC is the absence of ground-glass opacity (GGO) in preoperative CT imaging (7–9). Pathologic factors such as vascular/lymphatic invasion and/or high histologic grade are also reported as poor prognostic factors (10, 11). Mutational status of driver genes (12–14) and tumor suppressor genes (15, 16) is also reportedly prognostic; however, the magnitude of prognostic differences is not large enough to affect clinical decision-making.
Gene expression status is another potential prognostic biomarker for NSCLC although the results vary among studies. One potential obstacle in using gene expression status as a biomarker is the variation in gene expression by tumor histology. For example, genetic alterations in lung squamous cell carcinoma resemble alterations in squamous cell carcinomas of other organs rather than nonsquamous NSCLCs (17). Additionally, the prognostic implication of gene expression can vary by tumor stage. For example, TGFβ expression has different prognostic impacts in early-stage NSCLC versus advanced-stage disease (18).
Therefore, in this study, we aimed to identify genetic biomarkers associated with a high risk of disease recurrence by investigating a cohort with relatively homogeneous gene expression status, namely patients with pStage I nonsquamous NSCLC without GGO.
Materials and Methods
Materials and Methods
Patient selection and tumor samples
This study included two independent patient cohorts: the discovery cohort (cohort 1) and the validation cohort (cohort 2; Fig. 1A). Cohort 1 was established as follows: Between January 2013 and June 2021, 942 patients received lobectomy plus mediastinal lymph node dissection for pStage I (ninth edition of tumor–node–metastasis classification) nonsquamous NSCLC at Kindai University Faculty of Medicine. Among these 942 patients, 145 patients with pure solid nodules at preoperative CT were initially recruited to this study, including 44 patients who experienced disease recurrence and 101 patients without disease recurrence and who received follow-up for 5 years or more. Among the 44 patients who experienced disease recurrence, 33 had available frozen tumor specimens for molecular analysis; these were included as the recurrence group in cohort 1 (Fig. 1A). To establish a control group for cohort 1, 1:1 propensity score matching was performed to extract 33 matched patients from the 101 patients who did not experience disease recurrence. Factors used for propensity score matching were age, sex, smoking status, pathologic tumor size, and the presence/absence of pleural invasion. Patient characteristics were similar in the two groups (Supplementary Table S1). We performed RNA sequencing (described below) on frozen tumor specimens from patients in the recurrence group (N = 33) and in the matched control group (N = 33).
Cohort 2 included patients with pStage I nonsquamous NSCLC who received lobectomy plus mediastinal lymph node dissection at the National Cancer Center (NCC) Hospital. These patients were selected from those who were enrolled in the PRISM project, in which RNA sequencing of the primary tumor tissue was prospectively performed (19, 20). The inclusion criteria were pulmonary resection before July 2018, R0 resection, pure solid tumor at preoperative CT examination, and follow-up period of 5 years or more for patients without disease recurrence. Exclusion criteria were tumors with neuroendocrine or pleomorphic component and patients with multiple lung cancers or advanced-stage tumors in other organs. A total of 149 patients were selected for cohort 2. Mutational status and expression data for the genes identified in the analysis of cohort 1 and anonymized patient data were obtained from the PRISM database for these 149 patients.
For cohort E (EGFR-mutated cohort), we extracted patients with activating EGFR mutations from cohort 1 and cohort 2. A total of 108 patients were included in cohort E.
This study adhered to the Declaration of Helsinki and was conducted according to protocols approved by the institutional review boards of Kindai University Faculty of Medicine (no. R04-078) and NCC Hospital (no. 2022-119). In this study, we analyzed tumor specimens collected under a separate research protocol that was approved by the institutional review board of Kindai University Faculty of Medicine (no. 24-071). At the time of specimen collection, written informed consent was obtained from all patients. For patients currently attending our hospital, written informed consent was obtained again for use in this study as well, and for patients who cannot provide the second informed consent because of discontinuation of treatment at our hospital, informed consent was obtained through an opt-out method. Cohort 2 patient samples were obtained from the NCC Biobank and the RNA sequencing analysis for this cohort was approved by the Institutional Review Board (IORG0002238) of the NCC (no. 2005-109). This study has been registered at Japan Registry of Clinical Trials (jRCT1050220114).
RNA sequencing
Total RNA was extracted from frozen tissue specimens obtained from the 66 patients in cohort 1 using the RNeasy Mini Kit (Qiagen, cat. #74104). The quality of RNA was assessed by the RNA integrity number score.
RNA sequencing was performed at the Division of Multiomics research, Yamagata University Well-being Institute. The cDNA sequence library was constructed from 1 μg of rRNA-depleted RNA using the Ion Total RNA-Seq Kit v2 (Thermo Fisher Scientific, cat. #4479789) following the manufacturer’s protocol. Library quantification was performed on a 4150 TapeStation system using a D1000 ScreenTape (Agilent Technologies, cat. #5067-5582). Massive parallel semiconductor sequencing was performed on an Ion GeneStudio S5 XL system for 520 cycles using the Ion 540 Kit-OT2 and the Ion 540 Chip Kit (Thermo Fisher Scientific, cat. #A27766). More than 40 million reads were obtained and subjected to subsequent analysis.
Gene expression analysis
Sequencing reads were computed using an in-house nextflow-based workflow pipeline on High Performance Computing (total 360-core CPU, total 3.85TB RAM, 10-nodes, and Linux OS) under the SLURM job scheduler management system. Sequencing reads data were preprocessed to correct base pair errors and filter shorter reads <70 bp and mapped to the human genome reference sequence (GRCh38/hg38) using the STAR program (RRID: SCR_004463). RefSeq gene-level transcript quantification and transcripts per million (TPM) calculation were performed using the RSEM bioconductor package (RRID: SCR_000262) in the R statistical program. Finally, data for 15,129 genes were obtained.
To evaluate the 15,129 transcripts in the recurrence and nonrecurrence groups, we analyzed TPM expression data to compare the cluster of expression patterns by the orthogonal partial least squares using the ropls bioconductor package (RRID: SCR_016888) for discriminant binary outcome with sevenfold cross-validation and 20 permutations. Significant levels were set at 0.05 in adjusted P and Q values. To investigate differentially expressed genes, a Mann–Whitney U test was performed on TPM expression data of each gene; significant levels were set as the fold change |log (recurrence/nonrecurrence)| > 1 and FDR <0.2. Differentially expressed genes in the recurrence group and control group were analyzed for expression patterns in the significant gene set using the nonnegative matrix factorization [k (italic)-rank = 2] algorithm in nmf packages (RRID: SCR_024284) in the R program.
Gene mutation analysis
To analyze somatic functional variants, sequencing reads were computed using another in-house nextflow pipeline on the High Performance Computing. Sequencing reads were preprocessed as above and then mapped to the reference sequence using BWA (RRID: SCR_010910) and Bowtie2 (RRID: SCR_016368) program. Variant calling was conducted on data to determine a variant in the setting >0.2 allele fraction and 20 or more read counts using the Freebayes program (RRID: SCR_010761). After variant calling, high-confidence somatic variants were selected from gnomAD (RRID: SCR_014964) population frequencies in <0.0001 and unregistered SNPs in the Single Nucleotide Polymorphism Database or unregistered gnomAD database and registered SNPs. Functional somatic variants as predicted deleterious mutations were selected from the type of variants excluding synonymous, intergenic, or intronic variants.
To analyze the mutation spectrum, all functional somatic variants were analyzed for the mutational signature on an SBS signature database using the maftools bioconductor package (RRID: SCR_024519). A machine learning–based gene selection analysis was conducted using a hybrid method, integrating a random forest–based recursive feature elimination algorithm and a partial least square algorithm. This was performed using the caret package (RRID: SCR_021138) in the R programing language to identify candidate genes associated with recurrence and nonrecurrence.
Statistical analysis
Recurrence-free survival (RFS) was defined as the interval between the date of pulmonary resection and the date of recurrence or death by any cause. Overall survival (OS) was defined as the interval between the date of pulmonary resection and the date of death by any cause. RFS and OS were estimated by the Kaplan–Meier method and compared using the log-rank test. Univariable and multivariable Cox regression analyses for associations between clinicopathologic parameters/biomarker status and RFS were performed. The Cochran–Armitage trend test was applied to compare clinicopathologic characteristics between subgroups divided by gene expression status.
All statistical analyses were performed with EZR (Saitama Medical Center, Jichi Medical University; ref. 21), which is a graphical user interface for R (The R Foundation for Statistical Computing). EZR is a modified version of R commander designed to add statistical functions frequently used in biostatistics.
Patient selection and tumor samples
This study included two independent patient cohorts: the discovery cohort (cohort 1) and the validation cohort (cohort 2; Fig. 1A). Cohort 1 was established as follows: Between January 2013 and June 2021, 942 patients received lobectomy plus mediastinal lymph node dissection for pStage I (ninth edition of tumor–node–metastasis classification) nonsquamous NSCLC at Kindai University Faculty of Medicine. Among these 942 patients, 145 patients with pure solid nodules at preoperative CT were initially recruited to this study, including 44 patients who experienced disease recurrence and 101 patients without disease recurrence and who received follow-up for 5 years or more. Among the 44 patients who experienced disease recurrence, 33 had available frozen tumor specimens for molecular analysis; these were included as the recurrence group in cohort 1 (Fig. 1A). To establish a control group for cohort 1, 1:1 propensity score matching was performed to extract 33 matched patients from the 101 patients who did not experience disease recurrence. Factors used for propensity score matching were age, sex, smoking status, pathologic tumor size, and the presence/absence of pleural invasion. Patient characteristics were similar in the two groups (Supplementary Table S1). We performed RNA sequencing (described below) on frozen tumor specimens from patients in the recurrence group (N = 33) and in the matched control group (N = 33).
Cohort 2 included patients with pStage I nonsquamous NSCLC who received lobectomy plus mediastinal lymph node dissection at the National Cancer Center (NCC) Hospital. These patients were selected from those who were enrolled in the PRISM project, in which RNA sequencing of the primary tumor tissue was prospectively performed (19, 20). The inclusion criteria were pulmonary resection before July 2018, R0 resection, pure solid tumor at preoperative CT examination, and follow-up period of 5 years or more for patients without disease recurrence. Exclusion criteria were tumors with neuroendocrine or pleomorphic component and patients with multiple lung cancers or advanced-stage tumors in other organs. A total of 149 patients were selected for cohort 2. Mutational status and expression data for the genes identified in the analysis of cohort 1 and anonymized patient data were obtained from the PRISM database for these 149 patients.
For cohort E (EGFR-mutated cohort), we extracted patients with activating EGFR mutations from cohort 1 and cohort 2. A total of 108 patients were included in cohort E.
This study adhered to the Declaration of Helsinki and was conducted according to protocols approved by the institutional review boards of Kindai University Faculty of Medicine (no. R04-078) and NCC Hospital (no. 2022-119). In this study, we analyzed tumor specimens collected under a separate research protocol that was approved by the institutional review board of Kindai University Faculty of Medicine (no. 24-071). At the time of specimen collection, written informed consent was obtained from all patients. For patients currently attending our hospital, written informed consent was obtained again for use in this study as well, and for patients who cannot provide the second informed consent because of discontinuation of treatment at our hospital, informed consent was obtained through an opt-out method. Cohort 2 patient samples were obtained from the NCC Biobank and the RNA sequencing analysis for this cohort was approved by the Institutional Review Board (IORG0002238) of the NCC (no. 2005-109). This study has been registered at Japan Registry of Clinical Trials (jRCT1050220114).
RNA sequencing
Total RNA was extracted from frozen tissue specimens obtained from the 66 patients in cohort 1 using the RNeasy Mini Kit (Qiagen, cat. #74104). The quality of RNA was assessed by the RNA integrity number score.
RNA sequencing was performed at the Division of Multiomics research, Yamagata University Well-being Institute. The cDNA sequence library was constructed from 1 μg of rRNA-depleted RNA using the Ion Total RNA-Seq Kit v2 (Thermo Fisher Scientific, cat. #4479789) following the manufacturer’s protocol. Library quantification was performed on a 4150 TapeStation system using a D1000 ScreenTape (Agilent Technologies, cat. #5067-5582). Massive parallel semiconductor sequencing was performed on an Ion GeneStudio S5 XL system for 520 cycles using the Ion 540 Kit-OT2 and the Ion 540 Chip Kit (Thermo Fisher Scientific, cat. #A27766). More than 40 million reads were obtained and subjected to subsequent analysis.
Gene expression analysis
Sequencing reads were computed using an in-house nextflow-based workflow pipeline on High Performance Computing (total 360-core CPU, total 3.85TB RAM, 10-nodes, and Linux OS) under the SLURM job scheduler management system. Sequencing reads data were preprocessed to correct base pair errors and filter shorter reads <70 bp and mapped to the human genome reference sequence (GRCh38/hg38) using the STAR program (RRID: SCR_004463). RefSeq gene-level transcript quantification and transcripts per million (TPM) calculation were performed using the RSEM bioconductor package (RRID: SCR_000262) in the R statistical program. Finally, data for 15,129 genes were obtained.
To evaluate the 15,129 transcripts in the recurrence and nonrecurrence groups, we analyzed TPM expression data to compare the cluster of expression patterns by the orthogonal partial least squares using the ropls bioconductor package (RRID: SCR_016888) for discriminant binary outcome with sevenfold cross-validation and 20 permutations. Significant levels were set at 0.05 in adjusted P and Q values. To investigate differentially expressed genes, a Mann–Whitney U test was performed on TPM expression data of each gene; significant levels were set as the fold change |log (recurrence/nonrecurrence)| > 1 and FDR <0.2. Differentially expressed genes in the recurrence group and control group were analyzed for expression patterns in the significant gene set using the nonnegative matrix factorization [k (italic)-rank = 2] algorithm in nmf packages (RRID: SCR_024284) in the R program.
Gene mutation analysis
To analyze somatic functional variants, sequencing reads were computed using another in-house nextflow pipeline on the High Performance Computing. Sequencing reads were preprocessed as above and then mapped to the reference sequence using BWA (RRID: SCR_010910) and Bowtie2 (RRID: SCR_016368) program. Variant calling was conducted on data to determine a variant in the setting >0.2 allele fraction and 20 or more read counts using the Freebayes program (RRID: SCR_010761). After variant calling, high-confidence somatic variants were selected from gnomAD (RRID: SCR_014964) population frequencies in <0.0001 and unregistered SNPs in the Single Nucleotide Polymorphism Database or unregistered gnomAD database and registered SNPs. Functional somatic variants as predicted deleterious mutations were selected from the type of variants excluding synonymous, intergenic, or intronic variants.
To analyze the mutation spectrum, all functional somatic variants were analyzed for the mutational signature on an SBS signature database using the maftools bioconductor package (RRID: SCR_024519). A machine learning–based gene selection analysis was conducted using a hybrid method, integrating a random forest–based recursive feature elimination algorithm and a partial least square algorithm. This was performed using the caret package (RRID: SCR_021138) in the R programing language to identify candidate genes associated with recurrence and nonrecurrence.
Statistical analysis
Recurrence-free survival (RFS) was defined as the interval between the date of pulmonary resection and the date of recurrence or death by any cause. Overall survival (OS) was defined as the interval between the date of pulmonary resection and the date of death by any cause. RFS and OS were estimated by the Kaplan–Meier method and compared using the log-rank test. Univariable and multivariable Cox regression analyses for associations between clinicopathologic parameters/biomarker status and RFS were performed. The Cochran–Armitage trend test was applied to compare clinicopathologic characteristics between subgroups divided by gene expression status.
All statistical analyses were performed with EZR (Saitama Medical Center, Jichi Medical University; ref. 21), which is a graphical user interface for R (The R Foundation for Statistical Computing). EZR is a modified version of R commander designed to add statistical functions frequently used in biostatistics.
Results
Results
Gene expression analysis in cohort 1
The clinicopathologic characteristics of the 66 patients in cohort 1, including 33 patients in the recurrence group and 33 matched patients in the control group, are summarized in Supplementary Table S1. Through the gene expression analysis of RNA sequencing data from samples of cohort 1, data for 15,129 genes were obtained and processed as described in the “Materials and Methods” section. First, we performed OPLS-DA analysis using RNA sequencing data of the 66 patients. As shown in Fig. 1B, OPLS-DA scores of patients who experienced disease recurrence were clearly separated from those of the control group. This suggests that gene expression status may be a useful tool to identify patients with a higher risk of disease recurrence in pStage I NSCLCs with pure solid appearance. We next screened candidate genes with prognostic impacts using volcano plots. Using the criteria of NM FDR <0.2 and fold changes of >2.0 or <0.5, we extracted 168 candidate genes (Fig. 1C). Nonnegative matrix factorization analysis was performed using the 168 candidate genes (Supplementary Fig. S1). The candidate genes were further narrowed down to 101 genes, including 44 genes with higher expression in the recurrence group (clustered in R1, R2, and R3) and 57 genes with lower expression in the recurrence group (clustered in N1, N2, and N3). The 101 genes are listed in Supplementary Table S2.
We next evaluated the function of 101 genes included in each gene set by functional analysis using Gene Ontology enrichment and pathway analyses. Genes in N1/N2/N3 (with lower expression in the recurrence group) were associated with potential cancer-related pathways, such as NK cell–mediated cytotoxicity, DNA replication–dependent chromatin assembly, and regulation of vascular permeability, whereas genes in R1/R2/R3 (with higher expression in the recurrence group) were less associated with cancer biology (Supplementary Fig. S2). Therefore, we hypothesized that genes with downregulated expression in the recurrence group rather than those with upregulated expression may be useful prognostic biomarkers.
Analysis of gene mutational status in cohort 1 and cohort 2
Mutational data were compared between the 33 patients in the recurrence group and 32 patients in the control group (after excluding one patient from the control group because of high sequence noise). We first performed mutational signature analysis by comparing the two groups; however, we did not observe differences between them [e.g., defective DNA mismatch repair (SBS26) was found in both groups, Supplementary Fig. S3].
We then compared the frequencies of mutated genes between the two groups (Supplementary Fig. S4). We found that three genes were more frequently mutated in the recurrence group, SYNE1 (recurrence group: 12% vs. control group: 0%), MACF1, (15% vs. 3%), and DCAF8L2 (21% vs. 6%), whereas two genes were frequently mutated in the control group, HLA-F (recurrence group: 12% vs. control group: 34%) and RPL10 (6% vs. 22%). However, analysis in the validation cohort (cohort 2) revealed no prognostic impact of the mutation status of these five genes. Therefore, we focused on gene expression status rather than gene mutation status as a candidate biomarker to predict disease recurrence in pStage I NSCLCs with pure solid appearance.
Prognostic impacts of candidate genes in cohort 2
Cohort 2 included 149 patients with pStage I nonsquamous NSCLC who met the inclusion and exclusion criteria described in the Methods (median follow-up of 76.9 months, Supplementary Table S1). PolyA sequencing had been performed in 125 patients (the clinicopathologic characteristics of these patients are summarized in Supplementary Table S3) and other RNA sequencing (ribo-zero NGS or SMART-seq) had been performed in 24 patients. Because the distribution of gene expression differed between these two groups (e.g., expression status of GJB6, Supplementary Fig. S5), we focused on the data of the 125 patients whose tumors were analyzed using PolyA sequencing (designated as cohort 2a, Fig. 1A) in the following analyses.
Among the 101 candidate genes identified in the analysis of cohort 1 (Supplementary Table S1), the expression status of six genes, bone morphogenetic protein 6 (BMP6), potassium two pore domain channel subfamily K member 3 (KCNK3), neurofascin (NFASC), olfactomedin 1 (OLFM1), paternally expressed 3 (PEG3), and tenascin XB (TNXB), was associated with disease status (recurrence or disease-free survival for at least 5 years) in cohort 2a. Lower expression of these six genes was associated with disease recurrence in both cohort 1 and cohort 2a. We thus next evaluated the prognostic impact of these six genes in lung adenocarcinoma using the Human Protein Atlas (proteinatlas.org; refs. 22, 23), which contains RNA expression and survival data of The Cancer Genome Atlas (TCGA; ref. 24). We observed that lower expression of these six genes was associated with poorer survival in the TCGA adenocarcinoma cohort; only BMP6 expression did not show statistical significance (Supplementary Fig. S6).
Survival analysis of cohort 2a by expression status of the six genes
We then evaluated prognostic implication of the expression status of the six genes on RFS in cohort 2a (Fig. 2). For all six genes, patients with lower gene expression had worse RFS compared with patients with higher gene expression. We next analyzed the correlations of expression status of the six genes. As shown in Supplementary Fig. S7, the expression status of BMP6 was not concordant with that of the other five genes; however, the expression status of the other five genes was associated with each other with high concordance rates (57.6%–75.2%). Furthermore, multivariable analysis of the expression status of the six genes revealed that BMP6 and OLFM1 expressions were independent prognostic factors for RFS (Supplementary Table S4). These results indicate that the expressions of BMP6 and OLFM1 were candidate prognostic markers. Prognostic implications of BMP6 and OLFM1 expression status were further validated using the TCGA validation cohort from the Human Protein Atlas (Supplementary Fig. S8). We also observed that the prognostic implications of these genes were prominent in early-stage lung adenocarcinomas compared with later-stage disease (Supplementary Fig. S9).
Combination of OLFM1 and BMP6 expression status as a prognostic factor
We next evaluated the prognostic implications of the combinations of OLFM1 and BMP6 expression status on RFS. As shown in Fig. 3A and B, we found that patients with lower expression for both genes had worse prognosis, followed by patients who had lower expression for either of the two genes. Patients with higher expression of both BMP6 and OLFM1 showed RFS of 91.2% (95% CI, 75.1%–95.1%) at 5 years, compared with patients with lower expression of OLFM1 or BMP6 and patients with lower expression for both genes, who showed 5-year RFS of 60.7% [95% confidence interval (CI), 47.3%–71.6%] and 40% (95% CI, 22.8%–56.7%), respectively. We also found that the expression status of OLFM1 and BMP6 was associated with OS (Supplementary Fig. S10).
Multivariable analysis revealed that OLFM1 and BMP6 expression status was an independent prognostic factor for RFS; the HR was 4.28 (95% CI, 1.47–12.46; P = 0.008) in patients with low expression for either OLFM1 or BMP6 and 7.32 (95% CI, 2.36–22.72; P < 0.001) in patients with low expression for both OLFM1 and BMP6 compared with those who had high expression for both genes (Table 1). Similar findings were observed in multivariable analysis for OS. We also found that vascular invasion (HR, 3.13; 95% CI, 1.37–7.36; P = 0.007) and male sex (HR, 2.46; 95% CI, 1.19–5.06; P = 0.014) were independent poor prognostic factors for OS.
We next examined the clinicopathologic characteristics of the subgroups of patients divided by OLFM1 and BMP6 expression status (Table 2). We observed that high-risk gene expression status was significantly associated with male sex (P = 0.028) and pathologic high-risk status (P = 0.012), which was defined as the presence of one of following factors: large invasive tumor size (>2.0 cm), pleural invasion, and/or lympho-vascular invasion.
Prognostic impacts of OLFM1 and BMP6 expression status in cohort 2
We further evaluated our findings using the entire cohort 2 (N = 149), which included patients whose tumors were analyzed using RNA sequencing platforms other than PolyA sequencing. Because it is not possible to simply compare gene expression data obtained by different platforms (Supplementary Fig. S6), we adjusted the gene expression status of tumors analyzed by platforms other than PolyA sequencing before combining with cohort 2a. Survival analysis showed that the combination of BMP6 and OLFM1 was also prognostic in the entire cohort 2 (Supplementary Fig. S11).
Prognostic impacts of OLFM1 and BMP6 in patients with EGFR mutation
To evaluate the prognostic impacts of OLFM1 and BMP6 in patients with EGFR mutation, the most frequent molecular subtype of NSCLC in East Asians, we further performed an analysis combining all patients positive for EGFR mutation from cohort 1 and cohort 2 into cohort E. The clinicopathologic characteristics in patient groups divided by OLFM1 and BMP6 gene expression status are summarized in Table 2. Similar to the analysis of cohort 2a, we observed that 91% of patients with lower OLFM1 and BMP6 gene expression, which means high-risk gene expression status, had pathologic high-risk features compared with 72% in those with higher OLFM1 and BMP6 gene expression status.
In univariable and multivariable analyses, the expression status of OLFM1 and BMP6 was a prognostic factor for RFS and OS in the EGFR-positive cohort (Fig. 3C and D; Supplementary Fig. S10B; Table 3).
Gene expression analysis in cohort 1
The clinicopathologic characteristics of the 66 patients in cohort 1, including 33 patients in the recurrence group and 33 matched patients in the control group, are summarized in Supplementary Table S1. Through the gene expression analysis of RNA sequencing data from samples of cohort 1, data for 15,129 genes were obtained and processed as described in the “Materials and Methods” section. First, we performed OPLS-DA analysis using RNA sequencing data of the 66 patients. As shown in Fig. 1B, OPLS-DA scores of patients who experienced disease recurrence were clearly separated from those of the control group. This suggests that gene expression status may be a useful tool to identify patients with a higher risk of disease recurrence in pStage I NSCLCs with pure solid appearance. We next screened candidate genes with prognostic impacts using volcano plots. Using the criteria of NM FDR <0.2 and fold changes of >2.0 or <0.5, we extracted 168 candidate genes (Fig. 1C). Nonnegative matrix factorization analysis was performed using the 168 candidate genes (Supplementary Fig. S1). The candidate genes were further narrowed down to 101 genes, including 44 genes with higher expression in the recurrence group (clustered in R1, R2, and R3) and 57 genes with lower expression in the recurrence group (clustered in N1, N2, and N3). The 101 genes are listed in Supplementary Table S2.
We next evaluated the function of 101 genes included in each gene set by functional analysis using Gene Ontology enrichment and pathway analyses. Genes in N1/N2/N3 (with lower expression in the recurrence group) were associated with potential cancer-related pathways, such as NK cell–mediated cytotoxicity, DNA replication–dependent chromatin assembly, and regulation of vascular permeability, whereas genes in R1/R2/R3 (with higher expression in the recurrence group) were less associated with cancer biology (Supplementary Fig. S2). Therefore, we hypothesized that genes with downregulated expression in the recurrence group rather than those with upregulated expression may be useful prognostic biomarkers.
Analysis of gene mutational status in cohort 1 and cohort 2
Mutational data were compared between the 33 patients in the recurrence group and 32 patients in the control group (after excluding one patient from the control group because of high sequence noise). We first performed mutational signature analysis by comparing the two groups; however, we did not observe differences between them [e.g., defective DNA mismatch repair (SBS26) was found in both groups, Supplementary Fig. S3].
We then compared the frequencies of mutated genes between the two groups (Supplementary Fig. S4). We found that three genes were more frequently mutated in the recurrence group, SYNE1 (recurrence group: 12% vs. control group: 0%), MACF1, (15% vs. 3%), and DCAF8L2 (21% vs. 6%), whereas two genes were frequently mutated in the control group, HLA-F (recurrence group: 12% vs. control group: 34%) and RPL10 (6% vs. 22%). However, analysis in the validation cohort (cohort 2) revealed no prognostic impact of the mutation status of these five genes. Therefore, we focused on gene expression status rather than gene mutation status as a candidate biomarker to predict disease recurrence in pStage I NSCLCs with pure solid appearance.
Prognostic impacts of candidate genes in cohort 2
Cohort 2 included 149 patients with pStage I nonsquamous NSCLC who met the inclusion and exclusion criteria described in the Methods (median follow-up of 76.9 months, Supplementary Table S1). PolyA sequencing had been performed in 125 patients (the clinicopathologic characteristics of these patients are summarized in Supplementary Table S3) and other RNA sequencing (ribo-zero NGS or SMART-seq) had been performed in 24 patients. Because the distribution of gene expression differed between these two groups (e.g., expression status of GJB6, Supplementary Fig. S5), we focused on the data of the 125 patients whose tumors were analyzed using PolyA sequencing (designated as cohort 2a, Fig. 1A) in the following analyses.
Among the 101 candidate genes identified in the analysis of cohort 1 (Supplementary Table S1), the expression status of six genes, bone morphogenetic protein 6 (BMP6), potassium two pore domain channel subfamily K member 3 (KCNK3), neurofascin (NFASC), olfactomedin 1 (OLFM1), paternally expressed 3 (PEG3), and tenascin XB (TNXB), was associated with disease status (recurrence or disease-free survival for at least 5 years) in cohort 2a. Lower expression of these six genes was associated with disease recurrence in both cohort 1 and cohort 2a. We thus next evaluated the prognostic impact of these six genes in lung adenocarcinoma using the Human Protein Atlas (proteinatlas.org; refs. 22, 23), which contains RNA expression and survival data of The Cancer Genome Atlas (TCGA; ref. 24). We observed that lower expression of these six genes was associated with poorer survival in the TCGA adenocarcinoma cohort; only BMP6 expression did not show statistical significance (Supplementary Fig. S6).
Survival analysis of cohort 2a by expression status of the six genes
We then evaluated prognostic implication of the expression status of the six genes on RFS in cohort 2a (Fig. 2). For all six genes, patients with lower gene expression had worse RFS compared with patients with higher gene expression. We next analyzed the correlations of expression status of the six genes. As shown in Supplementary Fig. S7, the expression status of BMP6 was not concordant with that of the other five genes; however, the expression status of the other five genes was associated with each other with high concordance rates (57.6%–75.2%). Furthermore, multivariable analysis of the expression status of the six genes revealed that BMP6 and OLFM1 expressions were independent prognostic factors for RFS (Supplementary Table S4). These results indicate that the expressions of BMP6 and OLFM1 were candidate prognostic markers. Prognostic implications of BMP6 and OLFM1 expression status were further validated using the TCGA validation cohort from the Human Protein Atlas (Supplementary Fig. S8). We also observed that the prognostic implications of these genes were prominent in early-stage lung adenocarcinomas compared with later-stage disease (Supplementary Fig. S9).
Combination of OLFM1 and BMP6 expression status as a prognostic factor
We next evaluated the prognostic implications of the combinations of OLFM1 and BMP6 expression status on RFS. As shown in Fig. 3A and B, we found that patients with lower expression for both genes had worse prognosis, followed by patients who had lower expression for either of the two genes. Patients with higher expression of both BMP6 and OLFM1 showed RFS of 91.2% (95% CI, 75.1%–95.1%) at 5 years, compared with patients with lower expression of OLFM1 or BMP6 and patients with lower expression for both genes, who showed 5-year RFS of 60.7% [95% confidence interval (CI), 47.3%–71.6%] and 40% (95% CI, 22.8%–56.7%), respectively. We also found that the expression status of OLFM1 and BMP6 was associated with OS (Supplementary Fig. S10).
Multivariable analysis revealed that OLFM1 and BMP6 expression status was an independent prognostic factor for RFS; the HR was 4.28 (95% CI, 1.47–12.46; P = 0.008) in patients with low expression for either OLFM1 or BMP6 and 7.32 (95% CI, 2.36–22.72; P < 0.001) in patients with low expression for both OLFM1 and BMP6 compared with those who had high expression for both genes (Table 1). Similar findings were observed in multivariable analysis for OS. We also found that vascular invasion (HR, 3.13; 95% CI, 1.37–7.36; P = 0.007) and male sex (HR, 2.46; 95% CI, 1.19–5.06; P = 0.014) were independent poor prognostic factors for OS.
We next examined the clinicopathologic characteristics of the subgroups of patients divided by OLFM1 and BMP6 expression status (Table 2). We observed that high-risk gene expression status was significantly associated with male sex (P = 0.028) and pathologic high-risk status (P = 0.012), which was defined as the presence of one of following factors: large invasive tumor size (>2.0 cm), pleural invasion, and/or lympho-vascular invasion.
Prognostic impacts of OLFM1 and BMP6 expression status in cohort 2
We further evaluated our findings using the entire cohort 2 (N = 149), which included patients whose tumors were analyzed using RNA sequencing platforms other than PolyA sequencing. Because it is not possible to simply compare gene expression data obtained by different platforms (Supplementary Fig. S6), we adjusted the gene expression status of tumors analyzed by platforms other than PolyA sequencing before combining with cohort 2a. Survival analysis showed that the combination of BMP6 and OLFM1 was also prognostic in the entire cohort 2 (Supplementary Fig. S11).
Prognostic impacts of OLFM1 and BMP6 in patients with EGFR mutation
To evaluate the prognostic impacts of OLFM1 and BMP6 in patients with EGFR mutation, the most frequent molecular subtype of NSCLC in East Asians, we further performed an analysis combining all patients positive for EGFR mutation from cohort 1 and cohort 2 into cohort E. The clinicopathologic characteristics in patient groups divided by OLFM1 and BMP6 gene expression status are summarized in Table 2. Similar to the analysis of cohort 2a, we observed that 91% of patients with lower OLFM1 and BMP6 gene expression, which means high-risk gene expression status, had pathologic high-risk features compared with 72% in those with higher OLFM1 and BMP6 gene expression status.
In univariable and multivariable analyses, the expression status of OLFM1 and BMP6 was a prognostic factor for RFS and OS in the EGFR-positive cohort (Fig. 3C and D; Supplementary Fig. S10B; Table 3).
Discussion
Discussion
Although numerous prognostic factors have been reported in patients with NSCLC after pulmonary resection, pathologic stage is still the only determinant for the application of adjuvant treatment in many guidelines. One of the potential difficulties to identify additional prognostic biomarkers is the diversity of surgically resectable NSCLC. Therefore, in this study, we focused on pStage I nonsquamous NSCLC, and we also excluded patients with GGO who tend to show excellent survival after pulmonary resection (7–9). This strategy was useful to investigate gene expression status as a prognostic biomarker because molecular aberrations are reportedly distinct between nonsquamous NSCLCs with and without GGO component (25, 26). We found that lower expression of OLFM1 and BMP6 was associated with disease recurrence in three independent cohorts (cohort 1, cohort 2a/cohort 2, and TCGA database). We also observed that lower expression of OLFM1 and BMP6 was a poor prognostic factor in patients with pStage I nonsquamous NSCLC with EGFR mutation (cohort E). Although our results were consistent in these independent cohorts, as limitations of this study, we note that this study was performed using retrospective cohorts that included relatively small numbers of patients and that the biological implications of the identified biomarkers are not fully understood.
The OLFM1 (olfactomedin 1) gene is located at 9q34.3 and is a member of the olfactomedin domain–containing protein family that encodes a protein that is abundantly expressed in brain. Although the exact function of this protein is unknown, some studies suggested that decreased OLFM1 expression is associated with tumorigenesis, increased malignant potential, and worse progression-free survival and OS in colorectal cancer, endometrial cancer, and thyroid carcinoma (27–29). One of these studies suggested that decreased expression of OLFM1 accelerates tumor progression through activation of the NF-κB signaling (27). The association between decreased OLFM1 expression and activated Wnt/β-catenin signaling pathway was also reported in a study on embryo transition (30, 31).
The biological function of BMP6 (bone morphogenetic protein 6) has been clarified. The BMP6 gene encodes a secreted ligand of the TGFβ superfamily of proteins. BMP6 binds various TGFβ receptors, leading to recruitment and activation of SMAD family transcription factors that regulate gene expression. Loss of BMP signaling, and that of its receptors, has been shown to increase WNT signaling and cancer stem cells in colorectal cancer (32). Associations between decreased BMP6 expression and tumorigenesis, high malignant potential, and/or poorer survival were reported in patients with breast cancer, gastric cancer, and NSCLC (33–39), whereas opposite associations were reported in prostate cancer (40–42). The molecular mechanisms for decreased OLFM1 and BMP6 expression in high-risk nonsquamous NSCLC tumors are unclear. However, Wnt activation reportedly downregulates OLFM1 expression in fallopian tubal epithelial cells (30), and it is possible that decreased BMP6 expression downregulates OLFM1 expression through activated Wnt signaling.
In this study, we found that the expression status of only two genes (OLFM1 and BMP6) predicted patient outcomes after pulmonary resection. This contrasts with previous studies that identified 10 to 15 gene signatures as a prognostic biomarker in surgically resected patients with NSCLC (43–46). We consider that the exclusion of squamous cell carcinomas in this study partially contributed to the identification of these two biomarkers because analyses of the TCGA dataset suggested that six genes identified through the analyses of cohort 1 and cohort 2a did not have prognostic impact (OLFM1 and NFASC) or had opposite prognostic implications (BMP6, KCNK3, PEG3, and TNXB) in squamous cell carcinomas, as shown in Supplementary Fig. S6.
Molecular-targeted drugs and immune checkpoint inhibitors, which are effective but costly treatments, are now being applied in the perioperative setting for early-stage NSCLCs. However, the risk of disease recurrence is much lower in patients with pStage I NSCLC compared with those with stage II to III disease. Therefore, it does not make sense to recommend adjuvant treatment for all patients with pStage I NSCLC but biomarker selection is definitely necessary to identify high-risk patients to maximize the cost-effectiveness of adjuvant treatments and reduce the risk of unnecessary side effects. Among patients with nonsquamous NSCLC, postoperative disease recurrences are usually seen in tumors without GGO, which often accompany pathologic high-risk features and other high-risk clinical factors. Therefore, additional biomarkers such as those found in this study may help to narrow the list of patients who benefit from adjuvant treatment. Some clinical trials of adjuvant treatment are now enrolling patients with stage I NSCLCs with pathologic high-risk features such as pleural and/or lympho-vascular invasion (47). In this study, we observed that OLFM1 and BMP6 expression status was significantly associated with pathologic poor prognostic factors, including lympho-vascular invasion, in both cohort 2a and cohort E (patients with EGFR mutation; Table 2). Multivariable analyses revealed that OLFM1/BMP6 expression status was an independent prognostic factor after adjustment for these pathologic factors (Tables 1 and 3).
Another strategy to identify high-risk patients is the detection of minimal residual disease (MRD) using liquid biopsy such as ctDNA analysis. The usefulness of ctDNA analysis as a tool to stratify the risk of recurrence for adjuvant treatment has been reported in some solid tumors such as colorectal carcinomas and urothelial cancer (48, 49). However, the sensitivity of ctDNA analysis in detecting MRD is much lower in NSCLCs (50), including stage I NSCLCs with EGFR mutation (51). Therefore, MRD analysis is currently far from clinical application in pStage I NSCLC although preoperative ctDNA detection may indicate clinical stage I NSCLCs with high malignant potential (52).
In conclusion, through a three-step analysis, (i) exploration of candidate genes in a case–control setting, (ii) narrowing down candidate genes using a real-world cohort, and (iii) validation using TCGA data and combined cohorts, we identified that the lower expression of OLFM1 and BMP6 is a biomarker to identify patients with high-risk pStage I nonsquamous NSCLC with pure solid appearance at preoperative CT. We found that the genetic high-risk score was associated with pathologic high-risk features. Therefore, we consider that the combination of these factors may be useful to identify patients with high-risk stage I NSCLC who may benefit from adjuvant treatment after pulmonary resection.
Although numerous prognostic factors have been reported in patients with NSCLC after pulmonary resection, pathologic stage is still the only determinant for the application of adjuvant treatment in many guidelines. One of the potential difficulties to identify additional prognostic biomarkers is the diversity of surgically resectable NSCLC. Therefore, in this study, we focused on pStage I nonsquamous NSCLC, and we also excluded patients with GGO who tend to show excellent survival after pulmonary resection (7–9). This strategy was useful to investigate gene expression status as a prognostic biomarker because molecular aberrations are reportedly distinct between nonsquamous NSCLCs with and without GGO component (25, 26). We found that lower expression of OLFM1 and BMP6 was associated with disease recurrence in three independent cohorts (cohort 1, cohort 2a/cohort 2, and TCGA database). We also observed that lower expression of OLFM1 and BMP6 was a poor prognostic factor in patients with pStage I nonsquamous NSCLC with EGFR mutation (cohort E). Although our results were consistent in these independent cohorts, as limitations of this study, we note that this study was performed using retrospective cohorts that included relatively small numbers of patients and that the biological implications of the identified biomarkers are not fully understood.
The OLFM1 (olfactomedin 1) gene is located at 9q34.3 and is a member of the olfactomedin domain–containing protein family that encodes a protein that is abundantly expressed in brain. Although the exact function of this protein is unknown, some studies suggested that decreased OLFM1 expression is associated with tumorigenesis, increased malignant potential, and worse progression-free survival and OS in colorectal cancer, endometrial cancer, and thyroid carcinoma (27–29). One of these studies suggested that decreased expression of OLFM1 accelerates tumor progression through activation of the NF-κB signaling (27). The association between decreased OLFM1 expression and activated Wnt/β-catenin signaling pathway was also reported in a study on embryo transition (30, 31).
The biological function of BMP6 (bone morphogenetic protein 6) has been clarified. The BMP6 gene encodes a secreted ligand of the TGFβ superfamily of proteins. BMP6 binds various TGFβ receptors, leading to recruitment and activation of SMAD family transcription factors that regulate gene expression. Loss of BMP signaling, and that of its receptors, has been shown to increase WNT signaling and cancer stem cells in colorectal cancer (32). Associations between decreased BMP6 expression and tumorigenesis, high malignant potential, and/or poorer survival were reported in patients with breast cancer, gastric cancer, and NSCLC (33–39), whereas opposite associations were reported in prostate cancer (40–42). The molecular mechanisms for decreased OLFM1 and BMP6 expression in high-risk nonsquamous NSCLC tumors are unclear. However, Wnt activation reportedly downregulates OLFM1 expression in fallopian tubal epithelial cells (30), and it is possible that decreased BMP6 expression downregulates OLFM1 expression through activated Wnt signaling.
In this study, we found that the expression status of only two genes (OLFM1 and BMP6) predicted patient outcomes after pulmonary resection. This contrasts with previous studies that identified 10 to 15 gene signatures as a prognostic biomarker in surgically resected patients with NSCLC (43–46). We consider that the exclusion of squamous cell carcinomas in this study partially contributed to the identification of these two biomarkers because analyses of the TCGA dataset suggested that six genes identified through the analyses of cohort 1 and cohort 2a did not have prognostic impact (OLFM1 and NFASC) or had opposite prognostic implications (BMP6, KCNK3, PEG3, and TNXB) in squamous cell carcinomas, as shown in Supplementary Fig. S6.
Molecular-targeted drugs and immune checkpoint inhibitors, which are effective but costly treatments, are now being applied in the perioperative setting for early-stage NSCLCs. However, the risk of disease recurrence is much lower in patients with pStage I NSCLC compared with those with stage II to III disease. Therefore, it does not make sense to recommend adjuvant treatment for all patients with pStage I NSCLC but biomarker selection is definitely necessary to identify high-risk patients to maximize the cost-effectiveness of adjuvant treatments and reduce the risk of unnecessary side effects. Among patients with nonsquamous NSCLC, postoperative disease recurrences are usually seen in tumors without GGO, which often accompany pathologic high-risk features and other high-risk clinical factors. Therefore, additional biomarkers such as those found in this study may help to narrow the list of patients who benefit from adjuvant treatment. Some clinical trials of adjuvant treatment are now enrolling patients with stage I NSCLCs with pathologic high-risk features such as pleural and/or lympho-vascular invasion (47). In this study, we observed that OLFM1 and BMP6 expression status was significantly associated with pathologic poor prognostic factors, including lympho-vascular invasion, in both cohort 2a and cohort E (patients with EGFR mutation; Table 2). Multivariable analyses revealed that OLFM1/BMP6 expression status was an independent prognostic factor after adjustment for these pathologic factors (Tables 1 and 3).
Another strategy to identify high-risk patients is the detection of minimal residual disease (MRD) using liquid biopsy such as ctDNA analysis. The usefulness of ctDNA analysis as a tool to stratify the risk of recurrence for adjuvant treatment has been reported in some solid tumors such as colorectal carcinomas and urothelial cancer (48, 49). However, the sensitivity of ctDNA analysis in detecting MRD is much lower in NSCLCs (50), including stage I NSCLCs with EGFR mutation (51). Therefore, MRD analysis is currently far from clinical application in pStage I NSCLC although preoperative ctDNA detection may indicate clinical stage I NSCLCs with high malignant potential (52).
In conclusion, through a three-step analysis, (i) exploration of candidate genes in a case–control setting, (ii) narrowing down candidate genes using a real-world cohort, and (iii) validation using TCGA data and combined cohorts, we identified that the lower expression of OLFM1 and BMP6 is a biomarker to identify patients with high-risk pStage I nonsquamous NSCLC with pure solid appearance at preoperative CT. We found that the genetic high-risk score was associated with pathologic high-risk features. Therefore, we consider that the combination of these factors may be useful to identify patients with high-risk stage I NSCLC who may benefit from adjuvant treatment after pulmonary resection.
Supplementary Material
Supplementary Material
Supplementary Figure S1Figure S1. Study One-hundred sixty eight genes which were differently expressed between these two groups were further narrowed down by NMF analysis. For the details of genes included in R1/R2/R3 and N1/N2/N3, please see Supplementary Table S1.
Supplementary Figure S2Figure S2. Results of the Reactome Pathway Analysis in each gene set; A, B, C, D, E, and F show the results of pathway analysis (cnet) for genes included in R1. R2. R3. N1, N2, and N3, respectively.
Supplementary Figure S3Figure S3. Mutational signature analysis in Recurrence group (A) and in Control group (B).
Supplementary Figure S4Figure S4. List of genes of which the frequencies of mutation were different between recurrence group and control group. Green, red, and black indicate missense mutation, nonsense mutation, and multi-hits, respectively.
Supplementary Figure S5Figure S5. Differences in gene expression levels between samples analyzed by PolyA sequencing vs. samples analyzed by other NGS (ribozero NGS or SMART-seq).
Supplementary Figure S6Figure S6. Prognostic implications of 6 genes identified through the analysis of Cohort 1 and Cohort 2P.
Supplementary Figure S7Figure S7. Concordance rates (A) between identified 6 genes and p-values (B).
Supplementary Figure S8Figure S8. Prognostic implications of BMP6 and OLFM1 in TCGA variation cohort in the Human Protein Atlas.
Supplementary Figure S9Figure S9. Prognostic implications of BMP6 and OLFM1 (stage-specific analysis).
Supplementary Figure S10Figure S10. Overall survivals based on BMP6 and OLFM1 expression status in Cohort 2P (A) and in EGFR mutated cohort (B).
Supplementary Figure S11Figure S11. Recurrence-free (A) and overall survivals (B) based on BMP6 and OLFM1 expression status in the entire Cohort 2.
Supplementary Table S1Table S1. Clinical characteristics of Patients (Cohort 1 and Cohort 2)
Supplementary Table S2Table S2. The list of differently expressed genes in Cohort 1
Supplementary Table S3Table S3. Clinical characteristics of Patients
Supplementary Table S4Table S4. Multivariable analysis of 6 genes for RFS in the Cohort 2a (N = 125)
Supplementary Figure S1Figure S1. Study One-hundred sixty eight genes which were differently expressed between these two groups were further narrowed down by NMF analysis. For the details of genes included in R1/R2/R3 and N1/N2/N3, please see Supplementary Table S1.
Supplementary Figure S2Figure S2. Results of the Reactome Pathway Analysis in each gene set; A, B, C, D, E, and F show the results of pathway analysis (cnet) for genes included in R1. R2. R3. N1, N2, and N3, respectively.
Supplementary Figure S3Figure S3. Mutational signature analysis in Recurrence group (A) and in Control group (B).
Supplementary Figure S4Figure S4. List of genes of which the frequencies of mutation were different between recurrence group and control group. Green, red, and black indicate missense mutation, nonsense mutation, and multi-hits, respectively.
Supplementary Figure S5Figure S5. Differences in gene expression levels between samples analyzed by PolyA sequencing vs. samples analyzed by other NGS (ribozero NGS or SMART-seq).
Supplementary Figure S6Figure S6. Prognostic implications of 6 genes identified through the analysis of Cohort 1 and Cohort 2P.
Supplementary Figure S7Figure S7. Concordance rates (A) between identified 6 genes and p-values (B).
Supplementary Figure S8Figure S8. Prognostic implications of BMP6 and OLFM1 in TCGA variation cohort in the Human Protein Atlas.
Supplementary Figure S9Figure S9. Prognostic implications of BMP6 and OLFM1 (stage-specific analysis).
Supplementary Figure S10Figure S10. Overall survivals based on BMP6 and OLFM1 expression status in Cohort 2P (A) and in EGFR mutated cohort (B).
Supplementary Figure S11Figure S11. Recurrence-free (A) and overall survivals (B) based on BMP6 and OLFM1 expression status in the entire Cohort 2.
Supplementary Table S1Table S1. Clinical characteristics of Patients (Cohort 1 and Cohort 2)
Supplementary Table S2Table S2. The list of differently expressed genes in Cohort 1
Supplementary Table S3Table S3. Clinical characteristics of Patients
Supplementary Table S4Table S4. Multivariable analysis of 6 genes for RFS in the Cohort 2a (N = 125)
출처: PubMed Central (JATS). 라이선스는 원 publisher 정책을 따릅니다 — 인용 시 원문을 표기해 주세요.
🏷️ 같은 키워드 · 무료전문 — 이 논문 MeSH/keyword 기반
- A Phase I Study of Hydroxychloroquine and Suba-Itraconazole in Men with Biochemical Relapse of Prostate Cancer (HITMAN-PC): Dose Escalation Results.
- Self-management of male urinary symptoms: qualitative findings from a primary care trial.
- Clinical and Liquid Biomarkers of 20-Year Prostate Cancer Risk in Men Aged 45 to 70 Years.
- Diagnostic accuracy of Ga-PSMA PET/CT versus multiparametric MRI for preoperative pelvic invasion in the patients with prostate cancer.
- Comprehensive analysis of androgen receptor splice variant target gene expression in prostate cancer.
- Clinical Presentation and Outcomes of Patients Undergoing Surgery for Thyroid Cancer.