CN109735619B - Molecular marker related to non-small cell lung cancer prognosis and application thereof - Google Patents

Publication number
CN109735619B
Authority
CN
China
Prior art keywords
lung cancer
small cell
cell lung
genes
prognosis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811574788.1A
Other languages
Chinese (zh)
Other versions
CN109735619A (en
Inventor
米双利
张健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Genomics of CAS
Original Assignee
Beijing Institute of Genomics of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Beijing Institute of Genomics of CAS
Priority to CN201811574788.1A
Publication of CN109735619A
Application granted
Publication of CN109735619B

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a molecular marker related to non-small cell lung cancer (NSCLC) prognosis and application thereof. The invention combines NSCLC gene chip expression data sets, uses a random forest method and univariate Cox regression analysis to find, among candidate genes, genes whose expression level is significantly associated with patient survival time, uses multivariate Cox regression analysis to build survival models on these genes, and evaluates model performance with the Kaplan-Meier method and time-dependent ROC analysis, yielding a 17-gene lung cancer prognostic signature. Specific gene combinations within the 17 genes stratify postoperative patients by survival time and by recurrence risk so that treatment can be individualized. The molecular marker is not affected by factors such as NSCLC tissue type, disease stage, age and sex, can be used as a tool for assessing NSCLC prognosis, and is generally applicable to NSCLC patients.

Description

Molecular marker related to non-small cell lung cancer prognosis and application thereof
Technical Field
The invention relates to the field of medical molecular biology, in particular to the technical field of postoperative recurrence evaluation and medication guidance of lung cancer patients, and specifically relates to a molecular marker related to non-small cell lung cancer prognosis and application thereof.
Background
Lung cancer is one of the most common malignancies. Lung cancer includes small cell lung cancer and non-small cell lung cancer, wherein the non-small cell lung cancer accounts for 85% of the overall incidence rate of lung cancer. The non-small cell lung cancer mainly comprises squamous carcinoma and adenocarcinoma in tissue morphology, and is characterized by slow disease development and inconspicuous metastasis and spread.
Non-small cell lung cancer patients are prone to relapse and metastasis, so the cure rate is extremely low, and the five-year postoperative survival rate is less than 15 percent. Although there have been significant advances in the diagnosis and treatment of non-small cell lung cancer in recent years, there is still no effective prognostic marker for it. In practice, most clinicians choose the treatment, and decide on the next treatment scheme after surgery, according to the patient's tumor tissue type, pathological stage and response to the relevant drugs. However, lung cancer is a heterogeneous disease with pronounced differences between patients: even individuals with the same pathological classification show different drug responses, treatment effects and times to recurrence, so the prognosis is complex and difficult to judge. Therefore, the development of molecular prognostic markers with high sensitivity and high accuracy is of great significance in guiding the clinical treatment of non-small cell lung cancer.
Unlike conventional clinicopathological observation, a molecular marker based on gene expression levels can predict survival probability and recurrence risk more accurately and can support a targeted, individualized adjuvant treatment scheme for postoperative patients. Therefore, the invention aims to provide a prognostic marker for non-small cell lung cancer that offers individualized clinical guidance for the postoperative adjuvant therapy of lung cancer patients.
Disclosure of Invention
The first purpose of the invention is to provide an effective molecular marker for judging prognosis for a patient with non-small cell lung cancer, namely a molecular marker related to the prognosis of non-small cell lung cancer.
The second purpose of the invention is to provide a kit for the prognostic evaluation of patients with non-small cell lung cancer.
The third purpose of the invention is to provide an evaluation system for analyzing the length of life cycle and the recurrence risk after operation and assisting in guiding clinical treatment of non-small cell lung cancer patients.
The purpose of the invention is realized by the following technical scheme. The present invention first determines a candidate gene set. Extracellular secreted proteins in the tumor microenvironment have been reported to play an important role in the development of tumors. In the initial stage of lung cancer, tumor cells actively secrete exosomes that act on the surrounding fibroblasts, promoting their activation into tumor-associated fibroblasts, changing the levels of extracellularly secreted proteins, and remodeling the tumor microenvironment in a way that favors further tumor development. The proteins whose extracellular secretion level is altered under the influence of tumor exosomes are therefore crucial for tumor development. In theory, the genes corresponding to these proteins are equally important, and changes in their expression levels in tumor tissue contribute to tumor development. The coding genes of the differentially secreted proteins are therefore used as candidate genes for judging the prognosis of patients with non-small cell lung cancer. To further narrow the range of candidate genes, the invention combines non-small cell lung cancer gene chip expression data sets published by several research organizations and uses a random forest method and univariate Cox regression analysis to find, among the candidate genes, a group of genes whose expression level is significantly associated with patient survival time. This group of genes is then modeled for survival analysis using multivariate Cox regression analysis, and the performance of the models is evaluated using the Kaplan-Meier method and time-dependent ROC (receiver operating characteristic) analysis.
Finally, a 17-gene lung cancer prognostic signature is obtained, and specific gene combinations within the 17 genes are each effective in judging the postoperative prognosis of non-small cell lung cancer patients.
Specifically, the invention provides a molecular marker related to non-small cell lung cancer prognosis, which is one or more of COL6A1, PLOD2, CTSZ, STMN2, EIF3B, SEPT2, GSR, DYNC1H1, SEC23A, MARCKS, TUBB, NRP1, LRP1, EIF5A, ARSA, CD81 and ANPEP.
Preferably, the present invention provides the following combinations of molecular markers associated with prognosis of non-small cell lung cancer:
(1) PLOD2, EIF3B, SEC23A, MARCKS; or
(2) PLOD2, EIF3B, MARCKS, LRP1; or
(3) COL6A1, PLOD2, CTSZ, EIF3B, MARCKS; or
(4) COL6A1, PLOD2, EIF3B, LRP1, ANPEP; or
(5) PLOD2, EIF3B, SEPT2, SEC23A, MARCKS; or
(6) PLOD2, EIF3B, MARCKS, NRP1, LRP1; or
(7) COL6A1, PLOD2, CTSZ, EIF3B, SEPT2, MARCKS; or
(8) PLOD2, EIF3B, SEPT2, SEPT2, SEC23A, MARCKS; or
(9) PLOD2, EIF3B, DYNC1H1, MARCKS, NRP1, LRP1; or
(10) COL6A1, PLOD2, CTSZ, EIF3B, DYNC1H1, MARCKS; or
(11) COL6A1, PLOD2, CTSZ, EIF3B, MARCKS; or
(12) COL6A1, PLOD2, CTSZ, EIF3B, MARCKS, NRP1; or
(13) COL6A1, PLOD2, CTSZ, EIF3B, MARCKS, ANPEP; or
(14) PLOD2, EIF3B, SEPT2, MARCKS, TUBB, LRP1, ANPEP; or
(15) PLOD2, EIF3B, DYNC1H1, SEC23A, MARCKS, LRP1, ANPEP; or
(16) COL6A1, PLOD2, CTSZ, EIF3B, SEPT2, SEPT2, MARCKS; or
(17) COL6A1, PLOD2, CTSZ, EIF3B, SEPT2, DYNC1H1, SEC23A; or
(18) COL6A1, PLOD2, CTSZ, EIF3B, SEPT2, SEC23A, MARCKS; or
(19) COL6A1, PLOD2, CTSZ, EIF3B, DYNC1H1, MARCKS; or
(20) COL6A1, PLOD2, CTSZ, EIF3B, SEC23A, MARCKS; or
(21) COL6A1, PLOD2, CTSZ, EIF3B, MARCKS, NRP1; or
(22) COL6A1, PLOD2, CTSZ, EIF3B, SEPT2, SEPT2, DYNC1H1, SEC23A; or
(23) COL6A1, PLOD2, CTSZ, SEPT2, SEPT2, DYNC1H1, SEC23A, LRP1; or
(24) EIF3B, SEPT2, DYNC1H1, SEC23A, MARCKS, NRP1, LRP1; or
(25) COL6A1, PLOD2, CTSZ, EIF3B, SEPT2, SEPT2, DYNC1H1, MARCKS, ANPEP; or
(26) PLOD2, EIF3B, SEPT2, DYNC1H1, SEC23A, MARCKS, MARCKS, NRP1, LRP1.
Preferably, in the embodiment of the present invention, for combination (1), the corresponding probes are: 202619_s_at, 211501_s_at, 204344_s_at, 201668_x_at;
for combination (2), the corresponding probes are: 202619_s_at, 211501_s_at, 201668_x_at, 1555353_at;
for combination (3), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 201668_x_at; or
216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 213002_at;
for combination (4), the corresponding probes are: 216904_at, 202619_s_at, 211501_s_at, 1555353_at, 234576_at;
for combination (5), the corresponding probes are: 202619_s_at, 211501_s_at, 1554747_a_at, 204344_s_at, 201668_x_at; or
202619_s_at, 211501_s_at, 200778_s_at, 204344_s_at, 201668_x_at;
for combination (6), the corresponding probes are: 202619_s_at, 211501_s_at, 213002_at, 210615_at, 1555353_at;
for combination (7), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 1554747_a_at, 213002_at;
for combination (8), the corresponding probes are: 202619_s_at, 211501_s_at, 200778_s_at, 1554747_a_at, 204344_s_at, 201668_x_at;
for combination (9), the corresponding probes are: 202619_s_at, 211501_s_at, 229115_at, 213002_at, 210615_at, 1555353_at;
for combination (10), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 229115_at, 213002_at;
for combination (11), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 201668_x_at, 213002_at;
for combination (12), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 213002_at, 210615_at;
for combination (13), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 213002_at, 234576_at;
for combination (14), the corresponding probes are: 202619_s_at, 211501_s_at, 200778_s_at, 201668_x_at, 209026_x_at, 1555353_at, 234576_at;
for combination (15), the corresponding probes are: 202619_s_at, 211501_s_at, 229115_at, 204344_s_at, 213002_at, 1555353_at, 234576_at;
for combination (16), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 1554747_a_at, 200778_s_at, 213002_at;
for combination (17), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 1554747_a_at, 229115_at, 204344_s_at;
for combination (18), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 1554747_a_at, 204344_s_at, 213002_at;
for combination (19), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 229115_at, 201668_x_at, 213002_at;
for combination (20), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 204344_s_at, 201668_x_at, 213002_at;
for combination (21), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 201668_x_at, 213002_at, 210615_at;
for combination (22), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 1554747_a_at, 200778_s_at, 229115_at, 204344_s_at;
for combination (23), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 1554747_a_at, 200778_s_at, 229115_at, 204344_s_at, 1555353_at;
for combination (24), the corresponding probes are: 211501_s_at, 200778_s_at, 229115_at, 204344_s_at, 201668_x_at, 213002_at, 210615_at, 1555353_at;
for combination (25), the corresponding probes are: 216904_at, 202619_s_at, 212562_s_at, 211501_s_at, 1554747_a_at, 200778_s_at, 229115_at, 213002_at, 234576_at;
for combination (26), the corresponding probes are: 202619_s_at, 211501_s_at, 200778_s_at, 229115_at, 204344_s_at, 201668_x_at, 213002_at, 210615_at, 1555353_at.
The probe IDs are from the Affymetrix U133 series of chips. In one embodiment of the invention, the probes are from the Affymetrix U133 Plus 2.0 chip.
The invention provides application of the molecular marker in preparation of a non-small cell lung cancer patient prognosis evaluation kit, reagent or chip.
The invention provides application of the molecular marker in preparation of a system for evaluating the postoperative survival time or recurrence risk of non-small cell lung cancer patients.
The invention provides application of the molecular marker in preparation of a system for guiding the postoperative medication of non-small cell lung cancer patients.
Preferably, the system is a kit or an apparatus for the purpose of non-small cell lung cancer therapy.
For the molecular markers of each of the above combinations, a corresponding multivariate Cox regression model can be established using, as a training set, an existing database of non-small cell lung cancer tumor tissue gene expression that contains patient survival information (the training set used in the embodiments of the present invention is a random 50% of the samples from three integrated data sets). This yields the coefficient of the probe variable corresponding to each gene of the combination and the risk score cutoff. To predict the survival of a postoperative patient, the expression values of the corresponding probes are measured in the patient's tumor tissue and substituted into the established regression model; the resulting value is the patient's risk score. The risk score is then compared with the previously obtained cutoff: a score above the cutoff indicates short postoperative survival and a high likelihood of recurrence, and a score below the cutoff indicates long postoperative survival and a low likelihood of recurrence.
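The scoring step described above can be sketched as follows. This is a minimal illustration, assuming the risk score is the usual Cox linear predictor (the sum of coefficient times expression value over the probes of a combination). The probe IDs are those of combination (1); the coefficients, the cutoff and the patient expression values are hypothetical placeholders, not the values of Table 5 or Table 6, and the function names are illustrative.

```python
from typing import Dict


def risk_score(coefficients: Dict[str, float], expression: Dict[str, float]) -> float:
    """Cox linear predictor: sum of coefficient * expression value over the probes."""
    return sum(coef * expression[probe] for probe, coef in coefficients.items())


def classify(score: float, cutoff: float) -> str:
    """Scores above the cutoff are high risk; ties are assigned to low risk here,
    which is an assumption because the text does not specify the tie case."""
    return "high risk" if score > cutoff else "low risk"


# Hypothetical coefficients for the four probes of combination (1).
coefficients = {
    "202619_s_at": 0.30,   # PLOD2
    "211501_s_at": 0.25,   # EIF3B
    "204344_s_at": -0.20,  # SEC23A
    "201668_x_at": 0.15,   # MARCKS
}
cutoff = 0.0  # hypothetical risk score cutoff

# Hypothetical z-score normalized expression values of one patient.
patient = {
    "202619_s_at": 1.2,
    "211501_s_at": 0.4,
    "204344_s_at": -0.5,
    "201668_x_at": 0.8,
}

score = risk_score(coefficients, patient)
label = classify(score, cutoff)
print(f"risk score = {score:.2f}, group = {label}")
```

With real use, the coefficients and cutoff would be taken from the model fitted on the training set, and the expression values would be normalized in the same way as the training data.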
The invention provides a kit containing a detection reagent and/or a detection instrument for detecting the molecular marker; the kit is a kit for prognostic evaluation of non-small cell lung cancer patients, a kit for evaluating the postoperative survival time or recurrence risk of non-small cell lung cancer patients, or a kit for guiding the postoperative medication of non-small cell lung cancer patients.
The invention combines non-small cell lung cancer (NSCLC) gene chip expression data sets, uses a random forest method and univariate Cox regression analysis to find, among candidate genes, genes whose expression level is significantly associated with patient survival time, uses multivariate Cox regression analysis to build survival models on these genes, and evaluates model performance with the Kaplan-Meier method and time-dependent ROC analysis, yielding a 17-gene lung cancer prognostic signature. Specific gene combinations within the 17 genes stratify postoperative NSCLC patients by survival time and by recurrence risk, indicating which groups should receive correspondingly individualized adjuvant treatment. In addition, the molecular marker is independent of factors such as tissue type, disease stage, age, sex and smoking status of the non-small cell lung cancer patient. Therefore, the molecular marker provided by the invention can be used as a reliable prognostic evaluation tool for NSCLC and is generally applicable to non-small cell lung cancer patients.
Drawings
FIG. 1 shows the identification in example 1 of the set of molecules differentially secreted by fibroblasts under the influence of lung cancer exosomes: A is the morphology and size of lung cancer exosomes observed by transmission electron microscopy (bar = 200 nm); B is the particle size distribution of lung cancer exosomes detected by dynamic light scattering; C is the western blot detection of the lung cancer exosome surface marker CD63, which indicates that the isolated pellet consists of exosomes secreted by non-small cell lung cancer cells; D is the mass spectrometry and bioinformatic comparison of the proteins secreted by lung cancer exosome-treated fibroblasts and by normal fibroblasts, which shows that the expression levels of 302 proteins differ significantly (fold difference greater than 1.5 and significance P value less than 0.05).
FIG. 2 is the establishment of the prognosis model of non-small cell lung cancer in example 2: the results of the statistical analysis of Kaplan-Meier of a group of gene combination labels in a training set and log-rank test show that the group of prognostic genes can distinguish non-small cell lung cancer patients with short survival period and long survival period.
FIGS. 3A-3D are validation of the non-small cell lung cancer prognostic model for predicting overall survival: the results of the statistical analysis of Kaplan-Meier and log-rank test of a group of gene combination labels in a validation set and three independent test sets show that the group of prognostic genes can distinguish non-small cell lung cancer patients with short survival time and long survival time.
FIGS. 4A-4D are ROC curve analyses of non-small cell lung cancer prognostic models predicting 5-year overall survival rates: the results of time-dependent ROC curve analysis of a group of gene combination labels in a validation set and three independent test sets show that the prognostic gene combination can obviously distinguish non-small cell lung cancer patients with high mortality and low mortality within 5 years.
FIGS. 5A-5D are validation of the non-small cell lung cancer prognostic model for predicting the risk of cancer recurrence: the results of Kaplan-Meier analysis and log-rank tests of a group of gene combination labels in four independent verification sets show that the group of prognostic genes can distinguish non-small cell lung cancer patients with high recurrence rate and low recurrence rate.
FIGS. 6A-6D are ROC curve analyses of the non-small cell lung cancer prognostic model predicting 5-year recurrence-free survival: the results of time-dependent ROC curve analysis of a group of gene combination labels in four independent verification sets show that the prognostic gene combination can significantly distinguish non-small cell lung cancer patients with high recurrence rate and low recurrence rate within 5 years.
Detailed Description
The following examples further illustrate the present invention but are not to be construed as limiting the invention. Modifications or substitutions to methods, procedures, or conditions of the invention may be made without departing from the spirit and scope of the invention.
Materials, reagents and the like used in the following examples are commercially available unless otherwise specified; the technical means used in the examples are conventional means well known to those skilled in the art. Unless otherwise specified, the lung cancer cells described in the examples below are all non-small cell lung cancer cells.
Example 1 Lung cancer exosomes transform fibroblasts and alter the levels of extracellularly secreted proteins
Extracting exosomes secreted by non-small cell lung cancer cells: the supernatant of serum-free medium in which lung cancer cell lines had been cultured was collected. It was centrifuged at 3000 g for 15 minutes at 4 °C and the supernatant was transferred to a new centrifuge tube. This was centrifuged at 16000 g for 1 hour at 4 °C and the supernatant was again transferred to a new centrifuge tube. After centrifugation at 120000 g for 2 hours at 4 °C, the supernatant was discarded. The pellet, i.e. the exosomes, was resuspended in PBS. The structure of the collected exosomes was identified by electron microscopy: 10 µL of the resuspension was dropped onto a carbon-coated copper grid and air dried, 2% uranyl acetate was added and left to stand for 5 min, and the grid was rinsed with PBS for 2 min; excess liquid was removed with filter paper, the washing was repeated three times, and the grid was air dried. Under a transmission electron microscope at 80 kV, the exosomes were observed to be vesicles enclosed by an outer membrane; the result is shown in panel A of FIG. 1.
The overall particle size distribution of the collected exosomes was studied using a Nanosight3000 instrument from Malvern, UK: the resuspension was further diluted with PBS to a suitable optical signal detection level, mixed and measured. The diameter of the collected vesicles ranged from 30 to 200 nm, which is consistent with the diameter range of exosomes; the result is shown in panel B of FIG. 1.
The surface protein markers of the collected exosomes were verified by western blotting: the protein concentration of the resuspension was determined by BCA protein assay, and the surface protein markers CD63 and TSG101 were detected by western blotting using 1 µg of the resuspension. The exosome surface proteins CD63 and TSG101 were detected with mouse monoclonal antibodies (1:1000 dilution); the cell characteristic protein Tubulin was detected with a mouse monoclonal antibody (1:4000 dilution); the internal reference protein GAPDH was detected with a rabbit polyclonal antibody (1:10000 dilution). The results are shown in panel C of FIG. 1: the collected vesicles express the exosome markers CD63 and TSG101 but not the cell characteristic protein Tubulin, indicating that the collected vesicles are exosomes.
Incubating lung fibroblasts with tumor exosomes: the resuspended tumor exosome solution was added to fibroblast culture medium. In a 6-well plate, medium containing 6 µg of tumor exosomes (MEM medium + tumor exosomes) was added to each of three wells, and control medium without tumor exosomes (MEM medium) was added to the other three wells. After 72 hours of incubation, the medium in all wells was replaced with fresh serum-free medium (MEM medium), the cells were incubated for a further 48 hours, and the supernatant was collected. The supernatant was centrifuged at 3000 g for 15 min at 4 °C and collected, and the protein solution was analyzed by mass spectrometry on a Thermo Ultimate 3000 instrument equipped with Q Exactive software. After the detected raw sequence information was searched against the human Uniprot database (version 2018.1), the extracellular protein profiles secreted by fibroblasts incubated with tumor exosomes and by fibroblasts not incubated with tumor exosomes were obtained. Subsequent bioinformatic comparison identified a total of 302 differentially secreted proteins. The results are shown in panel D of FIG. 1; the criteria for selecting differentially secreted proteins were a fold difference greater than 1.5 and a significance P value of the difference less than 0.05. The genes corresponding to these 302 differential proteins are used as the gene set in which tumor prognostic factors are searched for.
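The selection criteria stated above (fold difference greater than 1.5 and P value less than 0.05) can be illustrated with a short filter. This is only a sketch: the protein names and values are hypothetical, and it assumes that "fold difference" is symmetric, i.e. that both increased secretion (ratio above 1.5) and decreased secretion (ratio below 1/1.5) count, which the text does not state explicitly.

```python
from typing import List, NamedTuple


class ProteinResult(NamedTuple):
    name: str
    fold_change: float  # abundance ratio, exosome-treated / control (must be > 0)
    p_value: float      # significance of the difference


def is_differential(result: ProteinResult, min_fold: float = 1.5, alpha: float = 0.05) -> bool:
    """Apply the two thresholds; the symmetric fold rule is an assumption."""
    changed = result.fold_change > min_fold or result.fold_change < 1.0 / min_fold
    return changed and result.p_value < alpha


def select_differential(results: List[ProteinResult]) -> List[str]:
    """Return the names of proteins that pass both thresholds."""
    return [r.name for r in results if is_differential(r)]


# Hypothetical measurements for four placeholder proteins.
results = [
    ProteinResult("protein_A", 2.10, 0.010),  # up, significant -> kept
    ProteinResult("protein_B", 0.50, 0.020),  # down, significant -> kept
    ProteinResult("protein_C", 1.20, 0.001),  # change too small -> dropped
    ProteinResult("protein_D", 3.00, 0.200),  # not significant -> dropped
]

selected = select_differential(results)
print(selected)
```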
Example 2 establishment of non-Small cell Lung cancer prognostic model
In order to establish a reliable non-small cell lung cancer prognosis model, the invention collects a data set which is related to the non-small cell lung cancer and has survival information. The specific sample information of the data set is shown in table 1 and table 2.
TABLE 1 non-small cell Lung cancer patient information relating to Total survival
[Table 1 is reproduced as an image in the original publication and is not available in this text.]
TABLE 2 non-small cell Lung cancer patient information relating to recurrence-free survival
[Table 2 is reproduced as an image in the original publication and is not available in this text.]
The experimental platform for all data sets was the Affymetrix U133 Plus 2.0 chip. The gene expression values of each data set were z-score normalized, i.e. (gene expression value - mean of that gene's expression values in the data set) / (standard deviation of that gene's expression values in the data set), where "-" denotes subtraction and "/" denotes division. Three data sets, GSE50081, GSE37745 and GSE101929, with 409 samples in total, were then integrated; 205 samples were randomly selected from the integrated set as the training set, and the remaining 204 samples were used as the validation set. The data sets related to overall survival comprise GSE3141, GSE31210 and GSE30219. The data sets related to relapse-free survival comprise GSE31210, GSE30219, GSE8894 and GSE50081. These seven data sets serve as independent test sets. The names of the 302 differentially secreted proteins obtained from the analysis were then mapped to their corresponding genes, these genes were applied to the training set, and genes with an important influence on survival time were identified by the random forest method.
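The normalization and the 205/204 split described above can be sketched as follows. This is an illustration under stated assumptions: the text does not say whether the sample or the population standard deviation was used (the sample standard deviation is used here), the random seed and sample identifiers are hypothetical, and the function names are illustrative.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def z_score(values: Sequence[float]) -> List[float]:
    """Normalize one gene within one data set: (value - mean) / standard deviation.
    Uses the sample standard deviation, which is an assumption."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def split_samples(sample_ids: Sequence[str], n_train: int, seed: int) -> Tuple[List[str], List[str]]:
    """Randomly assign n_train samples to the training set and the rest to validation."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Toy expression values of one gene in one data set: mean 4, sample SD 2.
normalized = z_score([2.0, 4.0, 6.0])
print(normalized)

# 409 placeholder sample identifiers standing in for the integrated data sets.
sample_ids = [f"sample_{i}" for i in range(409)]
training_set, validation_set = split_samples(sample_ids, n_train=205, seed=0)
print(len(training_set), len(validation_set))
```

Normalizing each data set separately before integration, as the text describes, puts chips from different studies on a comparable scale.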
Univariate Cox regression analysis was then performed to select 17 genes that were significantly associated with patient survival (P < 0.05); the genetic information for these 17 prognostic signature genes is shown in Table 3.
TABLE 3 prognostic signature genes
[Table 3 is reproduced as an image in the original publication and is not available in this text.]
Then, the 19 probes corresponding to the 17 genes were combined into non-repeating combinations. For each combination, multivariate Cox regression analysis was carried out in the training set to establish a prognostic model, and the risk score cutoff of the model was calculated from the coefficients and expression values of the probe variables in the model. To further verify the models, each prognostic model and its risk score cutoff were used for Kaplan-Meier survival analysis in the 1 validation set, the 3 independent test sets related to overall survival and the 4 independent test sets related to relapse-free survival, with significance assessed by the log-rank test. Finally, the 28 multivariate Cox regression models whose log-rank P values were less than 0.05 in the training set, the validation set and all 7 independent test sets were retained as prognostic models for non-small cell lung cancer. Any of these models can significantly distinguish patients with long and short postoperative survival, and also patients with high and low postoperative recurrence risk. The gene combinations of the models are shown in Table 4, the prediction models in Table 5, and the risk score cutoffs obtained with the prediction models in Table 6.
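The enumeration and retention steps above can be sketched as follows. This is an illustration only: it assumes that "non-repeating combinations" means subsets of the 19 probes in which no probe is used twice, it uses placeholder probe names, and the P values are hypothetical stand-ins for the log-rank results of one model (fitting the Cox models themselves is outside the scope of the sketch).

```python
from itertools import combinations
from math import comb
from typing import Dict, Iterator, Sequence, Tuple


def candidate_panels(probes: Sequence[str], size: int) -> Iterator[Tuple[str, ...]]:
    """Yield every probe subset of the given size, each probe used at most once."""
    return combinations(probes, size)


def retained(p_values: Dict[str, float], alpha: float = 0.05) -> bool:
    """A model is kept only if its log-rank P value is below alpha in every data set."""
    return all(p < alpha for p in p_values.values())


# 19 placeholder probe names (the real probes map to the 17 signature genes).
probes = [f"probe_{i}" for i in range(1, 20)]

# Number of distinct four-probe panels that would have to be fitted and tested.
n_four_probe_panels = sum(1 for _ in candidate_panels(probes, 4))
print(n_four_probe_panels, comb(19, 4))

# Hypothetical log-rank P values of one model in the nine evaluation sets.
model_p_values = {
    "training": 0.001, "validation": 0.020,
    "OS_GSE3141": 0.040, "OS_GSE31210": 0.010, "OS_GSE30219": 0.003,
    "RFS_GSE31210": 0.020, "RFS_GSE30219": 0.010,
    "RFS_GSE8894": 0.030, "RFS_GSE50081": 0.045,
}
print(retained(model_p_values))
```

Requiring significance in all nine sets at once is a strict filter, which is consistent with only 28 models surviving out of the many possible combinations.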
TABLE 4 Gene combinations for non-small cell Lung cancer prognostic models
[Table 4 is reproduced as an image in the original publication and is not available in this text.]
TABLE 5 prognostic model
[Table 5 is reproduced as an image in the original publication and is not available in this text.]
Note: in Table 5, the name of a probe denotes the expression value of that probe, and the formula listed for each gene combination calculates the risk score ("*" denotes multiplication).
TABLE 6 prognostic model Risk score cutoff values
Gene combination   Risk score cutoff   Gene combination   Risk score cutoff
1                  -0.004284           15                 -0.014003
2                  0.003898            16                 -0.036386
3                  -0.040707           17                 0.003143
4                  -0.003746           18                 -0.010045
5                  -0.055524           19                 -0.033844
6                  -0.000928           20                 0.005911
7                  -0.003877           21                 -0.009103
8                  0.013358            22                 0.005290
9                  -0.025576           23                 0.005758
10                 -0.000036           24                 0.003738
11                 -0.004525           25                 -0.033186
12                 -0.020949           26                 -0.024634
13                 -0.003874           27                 0.005087
14                 0.007349            28                 -0.028955
And applying the prognosis models to carry out prognosis prediction on lung cancer patients, selecting any model, and calculating the risk score of each patient according to the probe set expression value corresponding to the model gene combination of each patient. Patients with a risk score greater than the risk score threshold for the corresponding model are classified as high risk of recurrence patients or short-lived patients, and patients with a risk score less than the risk score threshold for the corresponding model are classified as low risk of recurrence patients or long-lived patients.
From the above prognostic models, a model consisting of 5 probe sets is taken as an example, in which the genes are SEC23A, PLOD2, MARCKS, SEPT2 and EIF3B. The performance of this 5-gene prognostic model in the training set is shown in FIG. 2, which presents the Kaplan-Meier analysis of the model built from the training set for the combination "SEC23A, PLOD2, MARCKS, SEPT2, EIF3B" together with the significance P value obtained by the log-rank test (P < 0.05). Table 7 gives the results of the Kaplan-Meier survival analysis, tested with the log-rank test, of the non-small cell lung cancer prognostic models in the training set.
TABLE 7 non-small cell Lung cancer prognostic Gene combination signatures Kaplan-Meier survival analysis in Total survival training set, log-rank test results
[Table 7 is reproduced as an image in the original publication and is not available in this text.]
Example 3 application of non-Small cell Lung cancer prognostic model
The non-small cell lung cancer prognosis models obtained from the training set obtained in example 2 are applied to a validation set, three independent test sets related to overall survival and four independent test sets related to relapse-free survival, respectively, to predict the survival risk and the relapse risk of the disease of the postoperative patient.
Taking the survival rate prediction of the non-small cell lung cancer prognosis model composed of the 5 probes on the data set related to the overall survival rate as an example, the risk score of each patient in the related data set is obtained according to the prognosis model, the patient with the risk score larger than the risk score critical value corresponding to the prognosis model is classified as a high-risk patient, and the patient with the risk score smaller than the risk score critical value corresponding to the gene combination model is classified as a low-risk patient. And calculating the significance P value of Kaplan-Meier survival analysis according to the obtained prediction result and the postoperative survival condition of the patient. The P value is less than 0.05, which indicates that the prognosis model can predict patients with long postoperative survival period and patients with short postoperative survival period.
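The significance test used to compare the high-risk and low-risk groups can be sketched with a two-group log-rank test. This is a minimal standard-library implementation for illustration, not the software used in the embodiments; the survival times and event indicators below are hypothetical, and the P value is taken from the chi-square distribution with one degree of freedom.

```python
from math import erfc, sqrt
from typing import Sequence, Tuple


def logrank_test(times_a: Sequence[float], events_a: Sequence[int],
                 times_b: Sequence[float], events_b: Sequence[int]) -> Tuple[float, float]:
    """Two-group log-rank test; events are 1 (death or recurrence) or 0 (censored).
    Returns (chi-square statistic, P value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, group B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    p_value = erfc(sqrt(chi2 / 2.0))  # survival function of chi-square with 1 df
    return chi2, p_value


# Hypothetical survival times in months; every patient has an observed event.
high_risk_times = [2, 3, 4, 5, 6]
low_risk_times = [20, 25, 30, 35, 40]
chi2, p_value = logrank_test(high_risk_times, [1] * 5, low_risk_times, [1] * 5)
print(f"chi-square = {chi2:.2f}, P = {p_value:.4f}")
```

A P value below 0.05 is the criterion the text uses to conclude that the two risk groups have different survival.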
FIG. 3 shows the prediction of postoperative overall survival by this model. FIG. 3A is the significance analysis of the Kaplan-Meier survival analysis of the model in the validation set. FIG. 3B is the significance analysis of the Kaplan-Meier survival analysis of the model in independent test set GSE30219. FIG. 3C is the significance analysis of the Kaplan-Meier survival analysis of the model in independent test set GSE31210. FIG. 3D is the significance analysis of the Kaplan-Meier survival analysis of the model in independent test set GSE3141. FIG. 4 shows the ROC curve analysis of the prediction of overall survival by the non-small cell lung cancer prognostic model. FIG. 4A is the 5-year overall survival ROC curve analysis of the model in the validation set. FIG. 4B is the 5-year overall survival ROC curve analysis of the model in independent test set GSE30219. FIG. 4C is the 5-year overall survival ROC curve analysis of the model in independent test set GSE31210. FIG. 4D is the 5-year overall survival ROC curve analysis of the model in independent test set GSE3141.
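A 5-year ROC analysis of a risk score can be approximated as in the sketch below. It is a deliberate simplification and not the time-dependent ROC estimator named in the abstract: patients censored before the 5-year horizon are simply dropped rather than weighted, and the function name and the use of months as the time unit are assumptions.

```python
from typing import List, Sequence


def auc_at_horizon(scores: Sequence[float],
                   times: Sequence[float],
                   events: Sequence[int],
                   horizon: float = 60.0) -> float:
    """Area under the ROC curve for 'event before horizon' vs. 'event-free at horizon'.

    times are follow-up times (months assumed), events are 1 for an
    observed event and 0 for censoring. Patients censored before the
    horizon have unknown status and are excluded (simplification).
    """
    cases: List[float] = []      # event observed at or before the horizon
    controls: List[float] = []   # still event-free after the horizon
    for score, time, event in zip(scores, times, events):
        if time <= horizon and event:
            cases.append(score)
        elif time > horizon:
            controls.append(score)
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # Probability that a random case scores higher than a random control;
    # ties count as one half.
    wins = 0.0
    for case in cases:
        for control in controls:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    print(auc_at_horizon([3.0, 2.0, 1.0, 0.5, 2.5],
                         [10, 70, 80, 20, 30],
                         [1, 0, 1, 0, 1]))
```

An AUC near 1 means the higher-scoring patients are the ones who had the event within 5 years; an AUC near 0.5 means the score carries no information at that horizon.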
Similarly, take as an example the prediction of relapse-free survival by the non-small cell lung cancer prognostic model composed of the above 5 probe sets. The risk score of each patient in the relevant data set is calculated with the prognostic model. Patients whose risk score is greater than the risk score critical value of the prognostic model are classified as high-recurrence patients, and patients whose risk score is less than that critical value are classified as low-recurrence patients. The significance P value of the Kaplan-Meier survival analysis is then calculated from the predicted grouping and the postoperative relapse-free survival of the patients. A P value below 0.05 indicates that the prognostic model can distinguish patients with a high postoperative recurrence rate from patients with a low postoperative recurrence rate.
FIG. 5 shows the prediction of postoperative relapse-free survival by this model. FIG. 5A is the significance analysis of the Kaplan-Meier survival analysis of the model in independent test set GSE8894. FIG. 5B is the significance analysis of the Kaplan-Meier survival analysis of the model in independent test set GSE30219. FIG. 5C is the significance analysis of the Kaplan-Meier survival analysis of the model in independent test set GSE31210. FIG. 5D is the significance analysis of the Kaplan-Meier survival analysis of the model in independent test set GSE50081. FIG. 6 shows the ROC curve analysis of the prediction of relapse-free survival by the non-small cell lung cancer prognostic model. FIG. 6A is the 5-year relapse-free survival ROC curve analysis of the model in independent test set GSE8894. FIG. 6B is the 5-year relapse-free survival ROC curve analysis of the model in independent test set GSE30219. FIG. 6C is the 5-year relapse-free survival ROC curve analysis of the model in independent test set GSE31210. FIG. 6D is the 5-year relapse-free survival ROC curve analysis of the model in independent test set GSE50081.
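The log-rank comparison of the two risk groups, used for both the overall survival and the relapse-free survival endpoints, can be sketched with the standard two-group log-rank statistic. The function and argument names are illustrative assumptions; the P value uses the chi-square distribution with one degree of freedom.

```python
import math
from typing import Sequence, Tuple


def logrank_test(times_a: Sequence[float], events_a: Sequence[int],
                 times_b: Sequence[float], events_b: Sequence[int]
                 ) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square statistic, P value).

    events are 1 for an observed event (death or relapse) and 0 for censoring.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events occurring exactly at time t, per group.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))

    if variance == 0.0:
        raise ValueError("variance is zero; the groups cannot be compared")
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    high = ([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
    low = ([10, 11, 12, 13, 14], [1, 1, 1, 1, 1])
    print(logrank_test(high[0], high[1], low[0], low[1]))
```

A P value below 0.05 from this test corresponds to the significance criterion used in the text for separating the two groups.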
Table 8 shows the results of the Kaplan-Meier survival analysis, tested with the log-rank test, for the non-small cell lung cancer prognostic gene combination signatures in the validation set and the independent test sets.
TABLE 8 Test of the prediction results of the non-small cell lung cancer prognostic gene combination signatures in the data sets relating to overall survival and relapse-free survival
[Table 8 is reproduced as images in the original publication: Figure BDA0001916460390000211 and Figure BDA0001916460390000221]
The invention performed Kaplan-Meier survival analysis of the non-small cell lung cancer gene combinations in the validation set and the independent test sets and tested the results with the log-rank test. The P values were below 0.05, i.e. the gene combinations separate patients with long postoperative overall survival from patients with short postoperative overall survival, and patients with a high risk of postoperative recurrence from patients with a low risk of recurrence. The validation results of the prognostic gene combination signatures in the data sets relating to overall survival and relapse-free survival show that specific gene combinations among the 17 screened genes can grade postoperative non-small cell lung cancer patients by length of survival and by recurrence risk, respectively, so that individualized treatment can be carried out. The molecular marker is not affected by factors such as NSCLC tissue type, disease stage, age or sex, can be used as a tool for assessing NSCLC prognosis, and is generally applicable to NSCLC patients.
While the invention has been described in detail in the foregoing by way of general description, specific embodiments and experiments, it will be apparent to those skilled in the art that certain modifications and improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims (3)

1. Use of a reagent for detecting a molecular marker combination related to non-small cell lung cancer prognosis in the preparation of a kit, reagent or chip for evaluating the prognosis of non-small cell lung cancer patients, wherein the molecular marker combination is SEC23A, PLOD2, MARCKS, SEPT2 and EIF3B.
2. Use of a reagent for detecting a molecular marker combination related to non-small cell lung cancer prognosis in the preparation of a system for evaluating the length of postoperative survival or the level of recurrence risk of non-small cell lung cancer patients, wherein the molecular marker combination is SEC23A, PLOD2, MARCKS, SEPT2 and EIF3B.
3. The use of claim 2, wherein the system is a kit.
CN201811574788.1A 2018-12-21 2018-12-21 Molecular marker related to non-small cell lung cancer prognosis and application thereof Active CN109735619B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811574788.1A CN109735619B (en) 2018-12-21 2018-12-21 Molecular marker related to non-small cell lung cancer prognosis and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811574788.1A CN109735619B (en) 2018-12-21 2018-12-21 Molecular marker related to non-small cell lung cancer prognosis and application thereof

Publications (2)

Publication Number Publication Date
CN109735619A CN109735619A (en) 2019-05-10
CN109735619B true CN109735619B (en) 2022-04-15

Family

ID=66359518

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811574788.1A Active CN109735619B (en) 2018-12-21 2018-12-21 Molecular marker related to non-small cell lung cancer prognosis and application thereof

Country Status (1)

Country Link
CN (1) CN109735619B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111564177B (en) * 2020-05-22 2023-03-31 四川大学华西医院 Construction method of early non-small cell lung cancer recurrence model based on DNA methylation
CN114934117A (en) * 2022-05-31 2022-08-23 深圳市陆为生物技术有限公司 Product for evaluating recurrence risk of lung cancer patient

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101638656A (en) * 2009-08-28 2010-02-03 南京医科大学 Blood serum/blood plasma miRNA marker related to non-small cell lung cancer (SCLC) prognosis and application thereof
CN103492590A (en) * 2011-02-22 2014-01-01 卡里斯生命科学卢森堡控股有限责任公司 Circulating biomarkers
CN108315413A (en) * 2017-12-31 2018-07-24 郑州大学第附属医院 A kind of human liver cancer marker and application thereof
CN108753962A (en) * 2018-05-14 2018-11-06 丽水市人民医院 Purposes of the hsa-miR-130a in non-small cell lung cancer prognosis

Also Published As

Publication number Publication date
CN109735619A (en) 2019-05-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant