CN114959020A - Application of three SNPs in preparation of product for predicting survival of non-small cell lung cancer patient - Google Patents

Publication number
CN114959020A
Authority
CN
China
Prior art keywords
survival
snps
lung cancer
small cell
cell lung
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110215224.4A
Other languages
Chinese (zh)
Inventor
王启鸣
杨森
张哲�
何振
吴育锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Henan Cancer Hospital
Original Assignee
Henan Cancer Hospital
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Henan Cancer Hospital
Priority to CN202110215224.4A
Publication of CN114959020A

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention finds that three functional SNPs (ERAP1 rs469783 T>C, PSMF1 rs13040574 C>A and NCF2 rs36071574 G>A) are independently and significantly associated with the survival of patients with non-small cell lung cancer (NSCLC). The hazard ratios (HRs) for overall survival for the three SNPs were 0.83, 0.86 and 1.31, respectively. In functional analysis, the ERAP1 rs469783 C and PSMF1 rs13040574 A alleles were associated with high expression of the corresponding genes, whereas the NCF2 rs36071574 A allele was associated with low expression of NCF2. Thus, these three SNPs in the MHC-I pathway are biomarkers that predict NSCLC survival, and they exert this effect by regulating the expression of the corresponding genes.

Description

Application of three SNPs in preparation of product for predicting survival of non-small cell lung cancer patient
Technical Field
The invention relates to the technical field of biology, in particular to application of three SNPs (single nucleotide polymorphisms) in preparing a product for predicting survival of a non-small cell lung cancer patient.
Background
Lung cancer is one of the most common human malignancies and has the highest cancer-related mortality worldwide. An estimated 228,150 new cases of lung cancer and 142,670 deaths from lung cancer occurred in the United States in 2019. Non-small cell lung cancer (NSCLC) is the most common histological type, accounting for approximately 90% of all lung cancer cases. Current treatment options for advanced NSCLC include chemotherapy, radiation therapy and targeted therapy, but the first two only moderately improve survival, and patients inevitably develop drug resistance. In recent years, the role of the immune system in the development and progression of cancer has been widely recognized. Immunotherapy is now established as the "fourth pillar" of cancer treatment, and patients with high PD-L1 expression can be treated with immunotherapy alone. Immunotherapy combined with chemotherapy has become the standard first-line treatment for patients with metastatic NSCLC. However, many patients do not benefit from current immunotherapy, and there is an urgent need for predictive biomarkers that identify the patients most likely to benefit.
Major histocompatibility complex class I (MHC-I) proteins are primary regulators of cellular immunity; they control the function of cytotoxic T lymphocytes (CTLs) through antigen presentation. For example, an effective CTL response depends on the ability of MHC-I proteins to present a variety of peptides. In addition, the killing effect of cellular immunity on tumor cells is highly dependent on the distribution of MHC-I-activated CTLs on the surface of cancer cells and dendritic cells.
Because hypothesis-free genome-wide association studies (GWASs) focus only on the most significant SNPs or genes that pass a stringent P-value threshold after multiple-testing correction, most of the top SNPs identified lack functional annotation. To date, few new functional SNPs have been associated with the prognosis of lung cancer patients in GWASs of populations of European ancestry.
Thus, as a promising hypothesis-driven approach in the post-GWAS era, biological pathway-based analysis has been used to reanalyze published GWAS datasets and assess the cumulative effects of SNPs across multiple genes in the same biological pathway. Because the candidate genes of a given biological pathway contain far fewer SNPs, the burden of multiple testing on SNPs that have no biological significance is reduced, which improves the power to detect SNPs that have potential biological functions but only modest statistical significance.
Disclosure of Invention
The invention provides use of a substance for detecting the polymorphism or genotype of rs469783, rs13040574 and/or rs36071574 in the human genome in the preparation of a product for predicting the survival of non-small cell lung cancer patients.
The invention provides use of a substance for detecting the polymorphism or genotype of rs469783, rs13040574 and rs36071574 in the human genome in the preparation of a product for the prognosis of non-small cell lung cancer patients.
The invention provides use of a substance for detecting the polymorphism or genotype of rs469783, rs13040574 and rs36071574 in the human genome in the preparation of a product for detecting single nucleotide polymorphisms associated with non-small cell lung cancer patients.
rs469783 is located at chr5:96121524 (GRCh37), rs13040574 at chr20:1123070 (GRCh37), and rs36071574 at chr1:183545603 (GRCh37).
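As a purely illustrative sketch, the three loci and their alleles can be held in a small lookup table so that a detected genotype is translated into a count of variant alleles. The coordinates, genes and alleles are those stated in this document; the class name `Snp`, the table `SNPS` and the helper `variant_allele_count` are hypothetical names introduced here and are not part of the invention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    rsid: str
    gene: str
    chrom: str
    pos_grch37: int
    ref: str  # first allele as written in the text (e.g. T in T>C)
    alt: str  # variant allele (e.g. C in T>C)


# Coordinates and alleles as given in the description (GRCh37).
SNPS = {
    "rs469783": Snp("rs469783", "ERAP1", "5", 96121524, "T", "C"),
    "rs13040574": Snp("rs13040574", "PSMF1", "20", 1123070, "C", "A"),
    "rs36071574": Snp("rs36071574", "NCF2", "1", 183545603, "G", "A"),
}


def variant_allele_count(rsid: str, genotype: str) -> int:
    """Return 0, 1 or 2 copies of the variant allele in a two-letter genotype."""
    snp = SNPS[rsid]
    genotype = genotype.upper()
    if len(genotype) != 2 or any(a not in (snp.ref, snp.alt) for a in genotype):
        raise ValueError(f"unexpected genotype {genotype!r} for {rsid}")
    return genotype.count(snp.alt)


if __name__ == "__main__":
    for rsid, gt in [("rs469783", "TC"), ("rs13040574", "AA"), ("rs36071574", "GG")]:
        print(rsid, SNPS[rsid].gene, gt, variant_allele_count(rsid, gt))
```

A real assay would also have to settle strand orientation and allele naming, which this sketch assumes are already harmonized with the notation used here.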
The invention provides any one of the following applications:
B1) use of the polymorphisms or genotypes of rs469783, rs13040574 and rs36071574 in the human genome in a product for predicting the survival of non-small cell lung cancer patients;
B2) use of the polymorphisms or genotypes of rs469783, rs13040574 and rs36071574 in the human genome in the preparation of a product for the prognosis of non-small cell lung cancer patients;
B3) use of the polymorphisms or genotypes of rs469783, rs13040574 and rs36071574 in the human genome in the preparation of a product for detecting single nucleotide polymorphisms associated with non-small cell lung cancer patients.
Optionally, the substance for detecting the polymorphism or genotype of rs469783, rs13040574 and rs36071574 in the human genome is a PCR primer and/or a single-base extension primer for amplifying a genomic DNA fragment that includes rs469783, rs13040574 and rs36071574.
Optionally, the survival is overall survival time or disease-specific survival.
Optionally, the non-small cell lung cancer is lung adenocarcinoma or lung squamous cell carcinoma.
The technical scheme of the invention has the following advantages. Using two previously published genome-wide association studies (PLCO and HLCS), the invention performs multivariate Cox proportional hazards regression analysis to evaluate the association between variants in MHC-I-related pathway genes and the survival time of non-small cell lung cancer patients, and then analyzes the relationship between gene expression levels and the variant sites. In the discovery stage, 7,811 single nucleotide polymorphisms (SNPs) in 89 genes were tested for association with survival in 1,185 NSCLC patients from the PLCO study; in the validation stage, 24 SNPs were validated in 984 NSCLC patients from the HLCS study. In multivariate stepwise regression analysis, three functional SNPs (ERAP1 rs469783 T>C, PSMF1 rs13040574 C>A and NCF2 rs36071574 G>A) were found to be independently and significantly associated with patient survival. The hazard ratios (HRs) for overall survival (OS) for the three SNPs were 0.83 [95% confidence interval (CI) = 0.77-0.89, P = 8.0 × 10^-7], 0.86 (0.80-0.93, P = 9.4 × 10^-5) and 1.31 (1.11-1.54, P = 0.001), respectively. In functional analysis, the ERAP1 rs469783 C and PSMF1 rs13040574 A alleles were associated with high expression of the corresponding genes, whereas the NCF2 rs36071574 A allele was associated with low expression of NCF2. Thus, these three SNPs in the MHC-I pathway are biomarkers that predict NSCLC survival, and they exert this effect by regulating the expression of the corresponding genes.
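For readers less familiar with how a hazard ratio and its confidence interval arise from a Cox proportional hazards model, the standard form of the model is sketched below. The notation is introduced here for illustration only and is not taken from the original text: g denotes the number of variant alleles (an additive coding is assumed), z the vector of adjustment covariates, and h_0(t) the unspecified baseline hazard.

```latex
h(t \mid g, \mathbf{z}) = h_0(t)\,\exp\!\left(\beta g + \boldsymbol{\gamma}^{\top}\mathbf{z}\right),
\qquad
\mathrm{HR} = e^{\hat\beta},
\qquad
95\%\ \mathrm{CI} = \exp\!\left(\hat\beta \pm 1.96\,\mathrm{SE}(\hat\beta)\right)
```

Under this reading, an HR below 1 (such as 0.83) corresponds to a lower hazard of death per additional copy of the variant allele, and an HR above 1 (such as 1.31) to a higher hazard.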
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is the experimental flow chart of the present invention. Abbreviations: PLCO, Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial; NSCLC, non-small cell lung cancer; HLCS, Harvard Lung Cancer Susceptibility Study; ERAP1, endoplasmic reticulum aminopeptidase 1; PSMF1, proteasome inhibitor subunit 1; NCF2, neutrophil cytosolic factor 2;
FIG. 2 shows survival prediction by the combined risk genotypes and eQTL analysis of the SNPs in ERAP1 and PSMF1;
(a) Kaplan-Meier (KM) survival curves for OS by risk genotype in the PLCO dataset;
(b) KM survival curves for OS by dichotomized NUG group in the PLCO dataset;
(c) KM survival curves for DSS by risk genotype in the PLCO dataset;
(d) KM survival curves for DSS by dichotomized NUG group in the PLCO dataset;
(e) in 373 European individuals from the 1000 Genomes Project, the ERAP1 rs469783 C allele was associated with high mRNA expression of ERAP1;
(f) in 373 European individuals from the 1000 Genomes Project, the PSMF1 rs13040574 A allele was associated with high mRNA expression of PSMF1;
(g) in normal lung tissue from the GTEx project, the ERAP1 rs469783 C allele was associated with high mRNA expression of ERAP1;
(h) in whole blood from the GTEx project, the ERAP1 rs469783 C allele was associated with high mRNA expression of ERAP1;
FIG. 3 shows differential mRNA expression analysis and OS analysis of the three genes in the TCGA database;
(a) in 109 pairs of NSCLC tumor tissues and adjacent normal tissues, higher ERAP1 mRNA expression levels were found in adjacent normal tissues (p = 0.0001);
(b) in 109 pairs of NSCLC tumor tissues and adjacent normal tissues, higher PSMF1 mRNA expression levels were found in tumor tissues (p = 0.0006);
(c) in 109 pairs of NSCLC tumor tissues and adjacent normal tissues, higher NCF2 mRNA expression levels were found in adjacent normal tissues (p < 0.0001);
(d) Kaplan-Meier curves of overall survival (OS) for high versus low ERAP1 mRNA expression; among NSCLC patients, those with high ERAP1 mRNA expression had a higher survival rate than those with low expression;
(e) Kaplan-Meier curves of OS for high versus low PSMF1 mRNA expression; among NSCLC patients, those with high PSMF1 mRNA expression had a higher survival rate than those with low expression;
(f) Kaplan-Meier curves of OS for high versus low NCF2 mRNA expression;
FIG. 4 shows prediction of ten-year NSCLC survival by the three SNPs using ROC curves in the PLCO dataset;
(a) time-dependent AUC estimation for OS, based on clinical variables (age, sex, smoking status, histology, tumor grade, tumor stage, chemotherapy, surgery and the top four principal components of Table 6) and the risk genotypes of the 3 genes;
(b) prediction of ten-year NSCLC OS by ROC curve;
(c) time-dependent AUC estimation for DSS, based on clinical variables (age, sex, smoking status, histology, tumor grade, tumor stage, chemotherapy, surgery and the top four principal components of Table 6) and the risk genotypes of the 3 genes;
(d) prediction of ten-year NSCLC DSS by ROC curve;
FIG. 5 shows the correlation of the rs36071574 and rs13040574 genotypes with the corresponding mRNA expression levels;
(a) in 373 European individuals from the 1000 Genomes Project, the association between the NCF2 rs36071574 A allele and mRNA expression levels was not significant;
(b) in normal lung tissue from the GTEx project, the NCF2 rs36071574 A allele was not found to correlate with NCF2 mRNA expression levels;
(c) in whole blood from the GTEx project, the NCF2 rs36071574 A allele was not found to correlate with NCF2 mRNA expression levels;
(d) in normal lung tissue from the GTEx project, there was no significant correlation between the PSMF1 rs13040574 A allele and mRNA expression (P = 0.933);
(e) in whole blood from the GTEx project, the PSMF1 rs13040574 A allele was associated with high PSMF1 mRNA expression;
FIG. 6 shows the correlation of the rs12094228 genotype with the corresponding mRNA expression level;
(a) in 373 European individuals from the 1000 Genomes Project, the NCF2 rs12094228 G allele was not associated with NCF2 mRNA expression;
(b) in normal lung tissue from the GTEx project, the NCF2 rs12094228 G allele was associated with lower NCF2 mRNA expression;
(c) no data were available for the NCF2 rs12094228 G allele in whole blood from the GTEx project.
FIG. 7 shows differential ERAP1 mRNA expression analysis in the TCGA database;
(a) in 58 pairs of lung adenocarcinoma (LUAD) tumor tissues and adjacent normal tissues, higher ERAP1 expression levels were found in adjacent normal tissues;
(b) in 51 pairs of lung squamous cell carcinoma (LUSC) tumor tissues and adjacent normal tissues, higher ERAP1 expression levels were found in adjacent normal tissues;
(c) in unpaired LUAD tumor and normal tissues, ERAP1 mRNA expression levels were higher in normal tissues than in tumor tissues;
(d) in unpaired LUSC tumor and normal tissues, ERAP1 mRNA expression levels were higher in normal tissues than in tumor tissues;
FIG. 8 shows differential PSMF1 mRNA expression analysis in the TCGA database;
(a) in 58 pairs of LUAD tumor tissues and adjacent normal tissues, PSMF1 mRNA expression was not found to be higher in tumor tissues than in normal tissues;
(b) in 51 pairs of LUSC tumor tissues and adjacent normal tissues, higher PSMF1 expression levels were found in tumor tissues;
(c) in unpaired LUAD tumor and normal tissues, PSMF1 mRNA expression was not found to be higher in tumor tissues than in normal tissues;
(d) in unpaired LUSC tumor and normal tissues, PSMF1 mRNA expression levels were higher in tumor tissues than in normal tissues;
FIG. 9 shows differential NCF2 mRNA expression analysis in the TCGA database;
(a) in 58 pairs of LUAD tumor tissues and adjacent normal tissues, higher NCF2 expression levels were found in adjacent normal tissues;
(b) in 51 pairs of LUSC tumor tissues and adjacent normal tissues, higher NCF2 expression levels were found in adjacent normal tissues;
(c) in unpaired LUAD tissues, NCF2 mRNA expression levels were higher in normal tissues than in tumor tissues;
(d) in unpaired LUSC tissues, NCF2 mRNA expression levels were higher in normal tissues than in tumor tissues.
Detailed Description
Example 1
The specific experimental procedure is shown in FIG. 1.
(I) Experimental method
In a two-stage analysis, the GWAS dataset of lung cancer patients of European descent from the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial was used as the discovery dataset. PLCO is a randomized controlled study conducted by the National Cancer Institute (NCI) that included 77,500 men and 77,500 women aged 55-74 years, enrolled at 10 medical centers in the United States between 1993 and 2011. Participants were randomly assigned to either the intervention group, which received trial screening, or the control group, which received standard care, and were then followed for at least 13 years after enrollment. Blood samples and personal information on smoking status and family history were collected at enrollment, and cancer diagnosis, histopathology, staging and treatment were recorded during follow-up (Hocking WG, Hu P, Oken MM, Winslow SD, Kvale PA, Prorok PC, Ragard LR, Commins J, Lynch DA, Andriole GL, Buys SS, Fouad MN, Fuhrman CR, Isaacs C, Yokochi LA, Riley TL, Pinsky PF, Gohagan JK, Berg CD, PLCO Project Team. Lung cancer screening in the randomized Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial. J Natl Cancer Inst 2010; 102: 722-31). Of these approximately 150,000 participants, a total of 1,185 NSCLC patients were eligible for survival analysis after two individuals without follow-up information were excluded (Oken MM, Marcus PM, Hu P, Beck TM, Hocking W, Kvale PA, Cordes J, Riley TL, Winslow SD, Peace S, Levin DL, Prorok PC, Gohagan JK, PLCO Project Team. Baseline chest radiograph for lung cancer detection in the randomized Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial. J Natl Cancer Inst 2005; 97: 1832-9). Genomic DNA extracted from whole blood samples of these participants was genotyped with Illumina HumanHap240Sv1.0 and HumanHap550v3.0 arrays (dbGaP accession: phs000093.v2.p2 and phs000336.v1.p1).
All participants provided written informed consent allowing use of the PLCO trial dataset, and the institutional review board of each participating institution approved use of the dataset.
Another GWAS dataset, comprising 984 histologically confirmed Caucasian NSCLC patients from the Harvard Lung Cancer Susceptibility Study (HLCS), which began in 1992, was used as the validation dataset. In the HLCS study, whole blood samples and personal information were collected after diagnosis; DNA was extracted from the blood samples using the Auto Pure Large Sample nucleic acid purification system (QIAGEN, Venlo, Limburg, the Netherlands) and genotyped using the Illumina Humanhap610-Quad array. Genotypes were imputed with the Mach3 software based on sequencing data from the 1000 Genomes Project [Zhai R, Yu X, Wei Y, Su L, Christiani DC. Smoking and smoking cessation in relation to the development of co-existing non-small cell lung cancer with chronic obstructive pulmonary disease. Int J Cancer 2014; 134: 961-70].
Both the Institutional Review Board of the Duke University School of Medicine (project number Pro00054575) and the National Center for Biotechnology Information (NCBI), through which the dbGaP database of genotypes and phenotypes was accessed (project #6404), approved the use of these two GWAS datasets in this study. Table 1 compares the characteristics of the PLCO trial (n = 1185) and the HLCS study (n = 984).
Gene and SNP selection
Genes involved in the MHC-I associated pathway were selected from the Molecular Signatures Database with the keyword "MHC I AND peptoid" (http:// software. After removal of 85 duplicate genes, 1 pseudogene and 3 genes on the X chromosome, 89 genes remained as candidate genes for further analysis. Genotypes in these candidate genes were imputed with IMPUTE2 using 1000 Genomes Project data (phase 3). The SNPs in these genes and their ±2 kb flanking regions were then extracted according to the following criteria: imputation information score ≥ 0.8, genotyping rate ≥ 95%, minor allele frequency (MAF) ≥ 5%, and Hardy-Weinberg equilibrium (HWE) P value ≥ 1 × 10^-5. As a result, a total of 7,811 SNPs (527 genotyped and 7,284 imputed) were selected from the PLCO GWAS dataset (dbGaP accession numbers: phs000093.v2.p2 and phs000336.v1.p1) for further analysis. The HLCS study was used as the validation dataset, and the SNP inclusion criteria for the HLCS genotyping data were identical to those for the PLCO genotyping data.
TABLE 1 Comparison of characteristics between the PLCO trial and the HLCS study
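The SNP inclusion criteria described above amount to four threshold checks per variant. A minimal sketch of such a filter follows; the thresholds are those stated in the text, while the record layout (`SnpQc`), the function names and the example records are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SnpQc:
    rsid: str
    info_score: float  # imputation information score
    call_rate: float   # genotyping rate as a fraction (0-1)
    maf: float         # minor allele frequency as a fraction (0-1)
    hwe_p: float       # Hardy-Weinberg equilibrium P value


def passes_qc(s: SnpQc) -> bool:
    """Apply the four inclusion thresholds stated in the text."""
    return (
        s.info_score >= 0.8
        and s.call_rate >= 0.95
        and s.maf >= 0.05
        and s.hwe_p >= 1e-5
    )


def filter_snps(snps: Iterable[SnpQc]) -> List[SnpQc]:
    return [s for s in snps if passes_qc(s)]


if __name__ == "__main__":
    # Made-up records, for illustration only.
    candidates = [
        SnpQc("rs_example_1", 0.93, 0.99, 0.21, 0.40),
        SnpQc("rs_example_2", 0.71, 0.99, 0.21, 0.40),  # fails the info score
        SnpQc("rs_example_3", 0.95, 0.99, 0.02, 0.40),  # fails the MAF
    ]
    print([s.rsid for s in filter_snps(candidates)])
```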
Figure BDA0002952942940000071
Figure BDA0002952942940000081
A chi-square test was used to compare the distribution of each clinical variable between the PLCO trial and the HLCS study.
(II) results of the experiment
1. Association between SNPs in MHC-I associated pathway genes and NSCLC survival
The workflow of this study is shown in FIG. 1. The basic characteristics of the 1,185 NSCLC patients from the PLCO trial and the 984 NSCLC patients from the HLCS study have been described elsewhere (Wang Y, Liu H, Ready NE, Su L, Wei Y, Christiani DC, Wei Q. Genetic variants in ABCG1 are associated with survival of nonsmall-cell lung cancer patients. Int J Cancer 2016; 138: 2592-). In the PLCO discovery genotype dataset, single-locus multivariate Cox regression analysis was performed on the 7,811 selected SNPs. After multiple-testing correction, none of the SNPs remained significant under Bonferroni correction (P > 0.05) or the false discovery rate (FDR > 0.20), possibly because of high linkage disequilibrium (LD) among the SNPs. In addition, the purpose of this pre-screening was to identify functional candidate SNPs for further analysis. Therefore, the inventors used the Bayesian false discovery probability (BFDP) method for multiple-testing correction. A total of 206 SNPs were significantly associated with NSCLC OS (P < 0.05) with BFDP ≤ 0.80, of which 24 SNPs remained significant after further validation in the HLCS genotype dataset (FIG. 1). Subsequently, a combined analysis of the PLCO and HLCS datasets for these 24 newly discovered SNPs showed that the SNPs in ERAP1 and PSMF1 were associated with better survival, whereas the SNPs in NCF2 were associated with poorer survival, with no heterogeneity between the two studies (Table 2).
TABLE 2 Association with OS of the 24 validated significant SNPs in the discovery and validation datasets from two previously published NSCLC GWASs
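To make the multiple-testing step more concrete, the sketch below computes a Bonferroni threshold for 7,811 tests and a Bayesian false discovery probability from a hazard ratio and its 95% CI, following Wakefield's approximate Bayes factor. The text gives only the BFDP cutoff of 0.80; the prior probability of association (0.10) and the upper bound on the hazard ratio (3.0) used here are assumptions made for illustration, as are the function names.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def bfdp(hr: float, ci_low: float, ci_high: float,
         prior_alt: float = 0.10, max_hr: float = 3.0) -> float:
    """Bayesian false discovery probability from an HR and its 95% CI.

    prior_alt and max_hr are ASSUMED settings, not values from the text.
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    v = se ** 2                               # variance of the estimate
    w = (math.log(max_hr) / 1.96) ** 2        # prior variance of the log HR
    z = beta / se
    # Approximate Bayes factor for the null versus the alternative.
    abf = math.sqrt((v + w) / v) * math.exp(-0.5 * z * z * w / (v + w))
    prior_odds_null = (1 - prior_alt) / prior_alt
    return abf * prior_odds_null / (abf * prior_odds_null + 1)


if __name__ == "__main__":
    print(bonferroni_threshold(0.05, 7811))
    # HR and CI values reported in this document for two genotype contrasts.
    print(bfdp(0.65, 0.52, 0.80))
    print(bfdp(0.94, 0.81, 1.10))
```

Under these assumed priors, a strong association gives a BFDP well below the 0.80 cutoff, while a weak one stays above it.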
Figure BDA0002952942940000091
a Adjusted for age, sex, stage, histology, smoking status, chemotherapy, radiotherapy, surgery,
PC1, PC2, PC3 and PC4;
b Adjusted for age, sex, stage, histology, smoking status, chemotherapy, radiotherapy, surgery,
PC1, PC2 and PC3 in an additive genetic model;
c Phet: P value for heterogeneity by Cochran's Q test;
d Meta-analysis in the fixed-effects model;
e SNPs rs30380, rs469758, rs30378, rs246453 and rs246454 are in high linkage disequilibrium with SNP rs30379, with substantially identical results;
f SNPs rs26510, rs27710 and rs27529 are in high linkage disequilibrium with SNP rs30187, with substantially identical results.
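The fixed-effects meta-analysis and Cochran's Q heterogeneity test named in the footnotes to Table 2 can be sketched as an inverse-variance combination of the log hazard ratios from the two studies. This is a generic illustration and not the inventors' code; the function names are hypothetical, the input values in the usage lines are made up, and the heterogeneity P value is computed only for the two-study case (one degree of freedom).

```python
import math
from typing import List, Tuple


def log_hr_and_se(hr: float, ci_low: float, ci_high: float) -> Tuple[float, float]:
    """Recover the log HR and its standard error from a 95% CI."""
    return math.log(hr), (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)


def fixed_effect_meta(studies: List[Tuple[float, float, float]]):
    """Pool exactly two (hr, ci_low, ci_high) results with inverse-variance weights.

    Returns (pooled_hr, ci_low, ci_high, q, p_het).
    """
    if len(studies) != 2:
        raise ValueError("this sketch handles exactly two studies")
    betas, weights = [], []
    for hr, lo, hi in studies:
        b, se = log_hr_and_se(hr, lo, hi)
        betas.append(b)
        weights.append(1.0 / se ** 2)
    wsum = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / wsum
    se_pooled = math.sqrt(1.0 / wsum)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    # With two studies Q has 1 degree of freedom; the chi-square tail
    # probability for 1 df equals erfc(sqrt(q / 2)).
    p_het = math.erfc(math.sqrt(q / 2.0))
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se_pooled),
            math.exp(pooled + 1.96 * se_pooled),
            q, p_het)


if __name__ == "__main__":
    # Made-up discovery and validation results, for illustration only.
    print(fixed_effect_meta([(0.82, 0.74, 0.91), (0.85, 0.75, 0.96)]))
```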
2. Independent SNPs in PLCO data sets associated with NSCLC survival
When the 24 validated SNPs were included in a multivariate stepwise Cox regression model for the PLCO dataset (since the HLCS dataset had no detailed genotyping data), only three SNPs remained significantly associated with survival. This example then extended the model by further including 15 other previously reported survival-predictive SNPs from the same PLCO dataset; the three SNPs remained significantly associated with survival (Table 3).
TABLE 3 Multivariate stepwise Cox regression analysis of the 3 independent SNPs, adjusted for other covariates and for SNPs previously published from the PLCO trial GWAS dataset
Figure BDA0002952942940000101
a Stepwise analysis including age, sex, smoking status, stage, histology, chemotherapy, radiotherapy, surgery, PC1, PC2, PC3, PC4 and SNPs;
b Stepwise analysis further adjusted for 15 published SNPs: five SNPs (PMID: 27557513), one SNP (PMID: 29978465), two SNPs (PMID: 30259978), two SNPs (PMID: 26757251), three SNPs (PMID: 30650190) and two SNPs (PMID: 30989732) reported in previous publications.
In the PLCO dataset, with full adjustment for the available covariates, patients carrying the ERAP1 rs469783 C allele or the PSMF1 rs13040574 A allele had a reduced risk of death, i.e. longer survival (ERAP1 rs469783: Ptrend = 0.0003 for OS and Ptrend = 0.0008 for DSS; PSMF1 rs13040574: Ptrend = 0.006 for OS and Ptrend = 0.014 for DSS), whereas patients carrying the NCF2 rs36071574 A allele had an increased risk of death, i.e. shorter survival (Ptrend = 0.018 for OS and Ptrend = 0.012 for DSS) (Table 4). In particular, compared with the TT genotype, the ERAP1 rs469783 C variant genotypes were associated with longer survival (TC: OS HR = 0.94, 95% CI = 0.81-1.10, P = 0.462 and DSS HR = 0.95, 95% CI = 0.81-1.12, P = 0.539; CC: OS HR = 0.65, 95% CI = 0.52-0.80, P < 0.0001 and DSS HR = 0.64, 95% CI = 0.51-0.81, P = 0.0002; TC + CC: OS HR = 0.86, 95% CI = 0.74-0.99, P = 0.041 and DSS HR = 0.86, 95% CI = 0.74-1.01, P = 0.063). Compared with the CC genotype, the PSMF1 rs13040574 A variant genotypes were also associated with longer survival (CA: OS HR = 0.79, 95% CI = 0.66-0.93, P = 0.005 and DSS HR = 0.83, 95% CI = 0.69-0.99, P = 0.038; AA: OS HR = 0.75, 95% CI = 0.61-0.92, P = 0.007 and DSS HR = 0.76, 95% CI = 0.61-0.95, P = 0.015; CA + AA: OS HR = 0.78, 95% CI = 0.66-0.91, P = 0.002 and DSS HR = 0.81, 95% CI = 0.68-0.96, P = 0.015) (Table 4). In contrast, compared with the GG genotype, the NCF2 rs36071574 A variant genotypes were associated with shorter survival (GA: OS HR = 1.26, 95% CI = 1.00-1.60, P = 0.051 and DSS HR = 1.29, 95% CI = 1.01-1.65, P = 0.043; AA: OS HR = 2.35, 95% CI = 0.87-6.38, P = 0.093 and DSS HR = 2.68, 95% CI = 0.99-7.29, P = 0.053; GA + AA: OS HR = 1.29, 95% CI = 1.03-1.62, P = 0.030 and DSS HR = 1.32, 95% CI = 1.04-1.68) (Table 4).
TABLE 4 Associations between the 3 independent SNPs and survival of NSCLC patients in the PLCO trial (OS in Table 4-1, DSS in Table 4-2)
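The hazard ratios, confidence intervals and P values listed above are linked through the Wald statistic, so a reported P value can be checked against its HR and CI. The following sketch assumes a two-sided Wald test on the log hazard ratio and a symmetric 95% CI on the log scale; the function name is hypothetical, and the result is approximate because the published limits are rounded to two decimals.

```python
import math


def wald_p_from_ci(hr: float, ci_low: float, ci_high: float) -> float:
    """Approximate two-sided Wald P value implied by an HR and its 95% CI."""
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    z = beta / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # ERAP1 rs469783 CC versus TT for OS, as reported in the text.
    print(wald_p_from_ci(0.65, 0.52, 0.80))
    # ERAP1 rs469783 TC versus TT for OS, as reported in the text.
    print(wald_p_from_ci(0.94, 0.81, 1.10))
```

For the CC contrast the implied P value falls below 0.0001, in line with the reported value; for the weaker TC contrast the check gives a clearly non-significant value, and exact agreement with the reported 0.462 should not be expected given the rounding.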
4-1
Figure BDA0002952942940000111
Figure BDA0002952942940000121
a Adjusted for age, sex, smoking status, histology, stage, chemotherapy, surgery and principal components.
b Missing data were excluded.
c 10 missed visits were excluded; d 10 missed visits were excluded; e 10 missed visits were excluded; f Unfavorable genotypes: ERAP1 rs469783 TT, PSMF1 rs13040574 CC and NCF2 rs36071574 GA + AA.
Trend test.
4-2
Figure BDA0002952942940000131
Figure BDA0002952942940000141
a Adjusted for age, sex, smoking status, histology, stage, chemotherapy, surgery and principal components.
b Missing data were excluded.
c 10 missed visits were excluded; d 10 missed visits were excluded; e 10 missed visits were excluded; f Unfavorable genotypes: ERAP1 rs469783 TT, PSMF1 rs13040574 CC and NCF2 rs36071574 GA + AA.
Trend test.
Since the available validation dataset from the HLCS study lacks detailed genotyping data, this example used the PLCO dataset to assess the combined effect of the three independent SNPs on NSCLC OS and DSS. For the combined analysis, the reference alleles at some loci were switched according to their observed direction of effect so that the effect directions were aligned across loci. First, the risk genotypes (i.e., ERAP1 rs469783 TT, PSMF1 rs13040574 CC and NCF2 rs36071574 GA + AA) were combined into a cumulative score, the number of unfavorable genotypes (NUGs). As shown in Table 4, an increasing NUG score was associated with decreased survival in the multivariate analysis of the PLCO dataset (Ptrend < 0.0001 for OS and Ptrend = 0.0002 for DSS).
Then, to facilitate further stratified analysis, this example dichotomized the NUG score, classifying all patients into a 0-1 group and a 2-3 group. Survival was significantly reduced in the 2-3 group compared with the 0-1 group (OS: HR = 1.52, 95% CI = 1.24-1.86, P < 0.0001; DSS: HR = 1.51, 95% CI = 1.22-1.87, P < 0.0001) (Table 4). Kaplan-Meier survival curves further depict these associations between risk genotypes and NSCLC OS and DSS (FIGS. 2a-2d).
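The scoring and dichotomization just described can be sketched in a few lines. The unfavorable genotypes are those named in the text (ERAP1 rs469783 TT, PSMF1 rs13040574 CC, NCF2 rs36071574 GA or AA); the function names, the dictionary layout and the example patient are hypothetical.

```python
# Unfavorable (risk) genotypes as stated in the text; both allele orders
# are listed for the heterozygote so that "GA" and "AG" are treated alike.
UNFAVORABLE = {
    "rs469783": {"TT"},
    "rs13040574": {"CC"},
    "rs36071574": {"GA", "AG", "AA"},
}


def nug_score(genotypes: dict) -> int:
    """Count the unfavorable genotypes (0-3) carried by one patient."""
    return sum(
        1 for rsid, risk in UNFAVORABLE.items()
        if genotypes[rsid].upper() in risk
    )


def nug_group(score: int) -> str:
    """Dichotomize the score into the 0-1 and 2-3 groups used in the text."""
    if not 0 <= score <= 3:
        raise ValueError("NUG score must be between 0 and 3")
    return "0-1" if score <= 1 else "2-3"


if __name__ == "__main__":
    # Made-up patient, for illustration only.
    patient = {"rs469783": "TT", "rs13040574": "CA", "rs36071574": "GA"}
    score = nug_score(patient)
    print(score, nug_group(score))
```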
3. ROC curve and time dependent AUC
This example further evaluated the predictive value of 3 SNPs in the PLCO dataset using a time-dependent AUC and ROC curve at month 60 (or 5 year survival) (follow-up time 0.03 to 155.83 months, median follow-up time 19.80 months). The SNP combination covariate (age, gender, smoking status, histology, staging, chemotherapy, radiation therapy, surgery and the first four principal component (table 5) covariate model did not improve the model's predicted performance at 60 months (or 5 years survival) compared to the time-AUC plots for the model including age, gender, smoking status, histology, staging, chemotherapy, radiation therapy, surgery and the first four principal component (table 5). however, the predicted performance of the model was significantly improved when the correlated AUC and ROC curves were performed at 120 months (or ten years survival). the AUC of OS increased from 84.41% to 86.68% (P0.022) and the new AUC of DSS increased from 84.88% to 87.08% (P0.030) (fig. 4), suggesting that these discovered SNPs contribute to predicting the overall survival of NSCLC patients in the PLCO dataset.
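As a rough illustration of what an AUC at a fixed time horizon measures, the sketch below computes the probability that a patient who died before the horizon had a higher risk score than a patient still under follow-up beyond it. This is a deliberate simplification: patients censored before the horizon are simply dropped, whereas proper time-dependent AUC estimators weight for censoring, and the estimator actually used in this example is not specified in the text. The function name and the records are hypothetical.

```python
from typing import Iterable, Tuple


def auc_at_horizon(records: Iterable[Tuple[float, float, bool]],
                   horizon_months: float) -> float:
    """Simplified AUC at a fixed horizon.

    Each record is (risk_score, followup_months, died).
    Cases: died at or before the horizon.
    Controls: followed beyond the horizon (alive at the horizon).
    Patients censored before the horizon have unknown status and are skipped.
    """
    cases, controls = [], []
    for score, months, died in records:
        if died and months <= horizon_months:
            cases.append(score)
        elif months > horizon_months:
            controls.append(score)
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                concordant += 1.0
            elif c == k:
                concordant += 0.5  # ties count as half
    return concordant / (len(cases) * len(controls))


if __name__ == "__main__":
    # Made-up records: (risk score, follow-up in months, died).
    data = [(3.0, 10, True), (2.0, 50, True), (1.0, 130, False),
            (0.5, 125, True), (2.5, 40, False)]
    print(auc_at_horizon(data, horizon_months=120))
```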
TABLE 5 correlation of the first 10 major components of NSCLC to OS in PLCO testing
Figure BDA0002952942940000151
Note: the first 4 principal components were used to adjust for population stratification in the multivariate stepwise Cox model.
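The idea of comparing two models by their AUC at a fixed time horizon can be sketched as follows. This is a deliberately simplified version: patients censored before the horizon are dropped, whereas published time-dependent AUC estimators usually reweight for censoring, and the text does not state which estimator was used. The function name, risk scores and follow-up data are all invented.

```python
def auc_at_horizon(times, events, risk, horizon):
    """AUC for separating deaths by `horizon` from survivors beyond it.

    Cases are patients with an observed death at or before the horizon;
    controls are patients still followed after the horizon. Patients
    censored before the horizon are excluded (a simplifying assumption).
    """
    rows = list(zip(times, events, risk))
    cases = [r for t, e, r in rows if e == 1 and t <= horizon]
    controls = [r for t, e, r in rows if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for rc in cases:
        for rn in controls:
            if rc > rn:
                concordant += 1.0
            elif rc == rn:
                concordant += 0.5  # ties count as half
    return concordant / (len(cases) * len(controls))


# Invented toy data: months of follow-up and death indicators.
times = [10, 30, 50, 70, 90, 130]
events = [1, 1, 0, 1, 0, 0]
# Hypothetical risk scores from a covariate-only model and from a model
# that also includes the SNP combination.
risk_covariates = [0.9, 0.4, 0.5, 0.6, 0.3, 0.2]
risk_with_snps = [0.9, 0.7, 0.5, 0.6, 0.3, 0.2]

auc_base = auc_at_horizon(times, events, risk_covariates, horizon=60)
auc_snp = auc_at_horizon(times, events, risk_with_snps, horizon=60)
print(round(auc_base, 4), round(auc_snp, 4))
```

Evaluating the same two models at a different horizon changes which patients count as cases and controls, which is one reason an added predictor can help at 120 months but not at 60 months.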
4. eQTL analysis
This example performed eQTL analysis to explore the correlation between the genotypes of the three newly identified survival-predictive SNPs and the mRNA expression levels of their corresponding genes. Using data from 373 lymphoblastoid cell lines of European descent in the 1000 Genomes Project, the analysis found that both the ERAP1 rs469783 C and PSMF1 rs13040574 A alleles were significantly associated with increased mRNA expression levels of their genes (P < 2.0 × 10^-16 and P = 0.004, respectively; Figs. 2e and 2f) (Lappalainen T, et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 2013; 501: 506-11); however, the correlation between the NCF2 rs36071574 A allele and mRNA expression level was not significant (P = 0.701) (Fig. 5a).
Then, this example performed eQTL analysis using data from 369 whole blood samples and 383 normal lung tissue samples of the GTEx project, and found that the rs469783 C allele was still associated with higher mRNA expression levels of ERAP1 in normal lung tissue (P = 6.2 × 10^-15) and in whole blood (P = 1.2 × 10^-19) (Figs. 2g and 2h) (GTEx Consortium. Human genomics. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans. Science 2015; 348: 648-60). However, there was no significant correlation between the PSMF1 rs13040574 A allele and mRNA expression in normal lung tissue (P = 0.933) or whole blood (P = 0.498) (Figs. 5d and 5e), nor between the NCF2 rs36071574 A allele and mRNA expression levels in normal lung tissue (P = 0.326) or whole blood (P = 0.671) (Figs. 5b and 5c).
Next, this example evaluated SNPs in high linkage disequilibrium (LD; r2 > 0.80) with NCF2 rs36071574 to determine whether any of them is likely to affect mRNA expression of NCF2. The NCF2 rs12094228 G allele, which is in high LD (r2 = 0.93) with the NCF2 rs36071574 A allele, was associated with lower expression levels of NCF2 in normal lung tissue (P = 0.027) (Fig. 6b). No whole blood data were available for the rs12094228 G allele in the GTEx project (Fig. 6c), and the allele was not associated with NCF2 mRNA expression in the 373 European lymphoblastoid cell lines of the 1000 Genomes Project (Fig. 6a).
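eQTL associations of this kind are commonly tested by regressing expression on allele dosage (0, 1 or 2 copies of the allele of interest). The text does not state the model used, so the sketch below is only an illustration of that common approach, with a hypothetical function name and invented dosage and expression values; computing a P value would additionally require a t distribution and is omitted.

```python
import math


def eqtl_fit(dosage, expression):
    """Least-squares fit of expression on allele dosage.

    Returns (slope, intercept, pearson_r). A positive slope means each
    extra copy of the counted allele goes with higher expression.
    """
    n = len(dosage)
    mean_x = sum(dosage) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosage)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(dosage, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Invented toy data: copies of the counted allele and normalized expression.
dosage = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]
slope, intercept, r = eqtl_fit(dosage, expression)
print(round(slope, 3), round(intercept, 3), round(r, 3))
```

Coding the genotype as a dosage assumes an additive effect per allele, which is a modelling choice rather than something the source confirms.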
Finally, this example used three online bioinformatics tools to predict the functions of these SNPs: SNPinfo (Xu Z, Taylor JA. SNPinfo: integrating GWAS and candidate gene information into functional SNP selection for genetic association studies. Nucleic Acids Res 2009; 37: W600-5), RegulomeDB (Boyle AP, Hong EL, Hariharan M, Cheng Y, Schaub MA, Kasowski M, Karczewski KJ, Park J, Hitz BC, Weng S, Cherry JM, Snyder M. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res 2012; 22: 1790-7) and HaploReg (Ward LD, Kellis M. HaploReg v4: systematic mining of putative causal variants, cell types, regulators and target genes for human complex traits and disease. Nucleic Acids Res 2016; 44: D877-81). Although none of these four SNPs had an obvious function according to SNPinfo, they had some predicted functions according to RegulomeDB and HaploReg. For example, ERAP1 rs469783 T>C has an effect on enhancer histone marks, DNase and motifs; PSMF1 rs13040574 C>A has an effect on motifs; NCF2 rs36071574 G>A has an effect on enhancer histone marks and motifs; and NCF2 rs12094228 T>G has an effect on enhancer histone marks.
5. Differential mRNA expression analysis
This example evaluated the mRNA expression levels of the three genes identified by the SNPs in 109 pairs of tumor and adjacent normal tissue samples from NSCLC patients in the TCGA database, and in unpaired tumor and normal tissue samples in the UALCAN database (http://ualcan.path.uab.edu/). This example also assessed the correlation between mRNA expression levels and survival probability using www.kmplot.com. Probes were selected according to the website's recommendations (209788_s_at, 236012_at and 209949_at for ERAP1, PSMF1 and NCF2, respectively). As shown in Figs. 3a, 7a and 7b, the mRNA expression level of ERAP1 was lower in tumor tissues than in adjacent normal tissues in all samples. In the UALCAN database, the mRNA expression level of ERAP1 was also lower in tumor tissues of LUAD (P = 0.059) and LUSC (P = 7.8 × 10^-5) (Figs. 7c and 7d). In addition, higher ERAP1 mRNA expression levels correlated with better NSCLC survival (Fig. 3d). Similarly, as shown in Figs. 3b, 8a and 8b, the PSMF1 mRNA expression level was higher in tumor tissues than in adjacent normal tissues in the combined NSCLC samples (P = 0.006) and in LUSC (P < 0.0001), but not in LUAD (P = 0.571). The results were similar in the UALCAN database: the PSMF1 mRNA expression level was higher in tumor tissues than in normal tissues in LUSC (P = 3.3 × 10^-7), but not in LUAD (P = 0.340) (Figs. 8c and 8d). However, higher PSMF1 mRNA expression levels correlated with better NSCLC survival (Fig. 3e), probably because PSMF1, as a component of the proteasome inhibitor PI31, is indirectly upregulated in tumor tissues.
Finally, as shown in Figs. 3c, 9a and 9b, tumor tissues had lower NCF2 mRNA expression levels than adjacent normal tissues in the combined LUAD and LUSC samples (P < 0.0001), in LUAD (P < 0.0001) and in LUSC (P < 0.0001). The results were similar in the UALCAN database: the mRNA expression level of NCF2 was lower in tumor tissues than in normal tissues in LUAD (P < 1.0 × 10^-12) and in LUSC (P = 1.6 × 10^-12) (Figs. 9c and 9d). In addition, higher NCF2 mRNA expression levels also correlated with better NSCLC survival (Fig. 3f).
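For paired tumor and adjacent normal samples, one common way to test such a difference is a paired t statistic on the per-patient differences. The text does not say which test was used, so the following is only a sketch under that assumption, with a hypothetical function name and invented expression values.

```python
import math
import statistics


def paired_t(tumor, normal):
    """Return (mean difference, t statistic) for paired samples.

    Differences are tumor minus normal, so a negative mean indicates
    lower expression in tumor tissue. The statistic has n - 1 degrees
    of freedom; converting it to a P value is left out of this sketch.
    """
    diffs = [t - n for t, n in zip(tumor, normal)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err


# Invented toy expression values for four hypothetical patients.
tumor = [5.1, 4.8, 5.0, 4.6]
normal = [6.0, 5.9, 6.2, 5.5]
mean_diff, t_stat = paired_t(tumor, normal)
print(round(mean_diff, 3), round(t_stat, 2))
```

Pairing removes between-patient variation from the comparison, which is why paired TCGA samples and unpaired UALCAN samples can give different P values for the same gene.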
It should be understood that the above examples are given only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to list all embodiments exhaustively here. Obvious variations or modifications derived therefrom remain within the scope of the invention.

Claims (7)

1. Use of a substance for detecting the polymorphism or genotype of ERAP1 rs469783 T>C, PSMF1 rs13040574 C>A and/or NCF2 rs36071574 G>A in the human genome in the preparation of a product for predicting the survival of non-small cell lung cancer patients.
2. Use of a substance for detecting the polymorphism or genotype of ERAP1 rs469783 T>C, PSMF1 rs13040574 C>A and/or NCF2 rs36071574 G>A in the human genome in the preparation of a product for the prognosis of non-small cell lung cancer patients.
3. Use of a substance for detecting the polymorphism or genotype of ERAP1 rs469783 T>C, PSMF1 rs13040574 C>A and/or NCF2 rs36071574 G>A in the human genome in the preparation of a product relating to single nucleotide polymorphisms of non-small cell lung cancer patients.
4. Any of the following uses:
B1) use of the polymorphism or genotype of ERAP1 rs469783 T>C, PSMF1 rs13040574 C>A and/or NCF2 rs36071574 G>A in the human genome in the preparation of a product for predicting the survival of non-small cell lung cancer patients;
B2) use of the polymorphism or genotype of ERAP1 rs469783 T>C, PSMF1 rs13040574 C>A and/or NCF2 rs36071574 G>A in the human genome in the preparation of a product for the prognosis of non-small cell lung cancer patients;
B3) use of the polymorphism or genotype of ERAP1 rs469783 T>C, PSMF1 rs13040574 C>A and/or NCF2 rs36071574 G>A in the human genome in the preparation of a product relating to single nucleotide polymorphisms of non-small cell lung cancer patients.
5. Use according to any one of claims 1 to 4, characterized in that: the substance for detecting the polymorphism or genotype of ERAP1 rs469783 T>C, PSMF1 rs13040574 C>A and/or NCF2 rs36071574 G>A in the human genome is a PCR primer and/or a single-base extension primer for amplifying a genomic DNA fragment comprising ERAP1 rs469783 T>C, PSMF1 rs13040574 C>A and/or NCF2 rs36071574 G>A.
6. Use according to any one of claims 1 to 5, characterized in that: the survival is overall survival or disease-specific survival.
7. Use according to any one of claims 1 to 6, characterized in that: the non-small cell lung cancer is lung adenocarcinoma or lung squamous carcinoma.
CN202110215224.4A 2021-02-25 2021-02-25 Application of three SNPs in preparation of product for predicting survival of non-small cell lung cancer patient Pending CN114959020A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110215224.4A CN114959020A (en) 2021-02-25 2021-02-25 Application of three SNPs in preparation of product for predicting survival of non-small cell lung cancer patient

Publications (1)

Publication Number Publication Date
CN114959020A true CN114959020A (en) 2022-08-30

Family

ID=82973385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110215224.4A Pending CN114959020A (en) 2021-02-25 2021-02-25 Application of three SNPs in preparation of product for predicting survival of non-small cell lung cancer patient

Country Status (1)

Country Link
CN (1) CN114959020A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107058562A (en) * 2017-05-17 2017-08-18 成都仕康美生物科技有限公司 The application of SNP site and its detection kit
KR20170116342A (en) * 2016-04-11 2017-10-19 경북대학교 산학협력단 EGFR gene polymorphisms marker for predicting survival in patients with lung cancer and method for predicting survival using the same
KR20190004530A (en) * 2017-07-04 2019-01-14 경북대학교 산학협력단 DIAGNOSTIC METHODS FOR PROGNOSIS OF NON-SMALL-CELL LUNG CANCER USING BUB3, AURKB, PTTG1 AND RAD21 SNPs
CN111394456A (en) * 2020-03-19 2020-07-10 中国医学科学院肿瘤医院 Early lung adenocarcinoma patient prognosis evaluation system and application thereof

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
S.YANG ET AL.: "Genetic Variants in ERAP1 and NCF2 in the MHC Class I Related Genes Are Associated with Non-Small Cell Lung Cancer Survival", 《JOURNAL OF THORACIC ONCOLOGY》, vol. 14, no. 10, 31 October 2019 (2019-10-31), pages 256 - 257 *
SEN YANG ET AL.: "Potentially functional variants of ERAP1, PSMF1 and NCF2 in the MHC‑I‑related pathway predict non‑small cell lung cancer survival", 《CANCER IMMUNOLOGY》, 2 March 2021 (2021-03-02), pages 2819 - 2833, XP037559223, DOI: 10.1007/s00262-021-02877-9 *
YUFENG WU ET AL.: "Potentially functional variants of HBEGF and ITPR3 in GnRH signaling pathway genes predict survival of non-small cell lung cancer patients", 《TRANSLATIONAL RESEARCH》, 2 January 2021 (2021-01-02), pages 1 - 12 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination