CN115497552A - Gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic gene and application - Google Patents
- Publication number
- CN115497552A (application number CN202211185735.7A)
- Authority
- CN
- China
- Prior art keywords
- gastric cancer
- risk
- endoplasmic reticulum
- reticulum stress
- genes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G16B5/00: ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
- G16B35/20: ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides; Screening of libraries
- G16B50/00: ICT programming tools or database systems specially adapted for bioinformatics
- G16H50/30: ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for calculating health indices; for individual health risk assessment
- G16H50/50: ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for simulation or modelling of medical disorders
- G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for mining of medical data, e.g. analysing previous cases of other patients
Abstract
The invention belongs to the technical field of tumor markers and biomedical detection, and particularly relates to a gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic genes and application thereof. Data collected from the TCGA database, comprising 375 gastric cancer samples and 32 paracancerous samples, were used as the training set, and 387 gastric cancer samples from the GEO database were used as an external validation set. Six endoplasmic reticulum stress characteristic genes, NOS3, PON1, CXCR4, MATN3, ANXA5 and SERPINE1, were screened out through comprehensive mining of transcriptome profiles and tumor microenvironment characteristics. The prediction model performs well in predicting the overall survival of gastric cancer patients in both the training set and the validation set. A risk score based on these 6 endoplasmic reticulum stress-associated genes can classify gastric cancer patients into high-risk and low-risk populations, which may aid in the selection of clinical treatment regimens.
Description
Technical Field
The invention belongs to the technical field of tumor markers and biomedical detection, and particularly relates to a gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic genes and application thereof.
Background
Gastric cancer is the third leading cause of cancer death and the fifth most common malignant tumor worldwide, with over 1 million new cases annually. China has a high incidence of gastric cancer: its new cases and deaths account for 42.6% and 45.0% of the global totals, respectively, and the age-standardized 5-year survival rate is 27.4%. Early symptoms of gastric cancer are insidious, and treatment outcomes and prognosis are poor. Most gastric cancer patients are diagnosed at an advanced stage, leading to poor overall prognosis, manifested by metastasis, high intratumoral heterogeneity and chemoresistance. Despite the rapid development of immunotherapy, targeted therapy and conversion therapy in the treatment of gastric cancer, the overall survival rate of most patients remains low.
During tumorigenesis and tumor development, the hypermetabolism and rapid proliferation of tumor cells lead to ischemia and hypoxia inside tumors, so that the tumor cells enter a state of sustained endoplasmic reticulum stress. Studies have shown that endoplasmic reticulum stress of a certain intensity can affect the development of cancer through a variety of mechanisms, including promoting cancer cell growth and metastasis, angiogenesis, immune escape and chemoresistance. Endoplasmic reticulum stress has a particularly important influence on the occurrence, progression and treatment of gastric cancer: it can promote the progression of gastric cancer through interaction with Helicobacter pylori and Epstein-Barr virus, and it can also drive the migration and invasion of gastric cancer cells by promoting their epithelial-mesenchymal transition.
Biomarkers can be useful in predicting the prognosis of cancer patients. In recent years, many studies have used genes as biomarkers for tumor development and prognosis. At present, histological diagnosis and tumor-lymph node-metastasis (TNM) staging remain the main methods for assessing the prognosis of gastric cancer. Owing to the high heterogeneity of gastric cancer and individual differences among patients, prognosis and treatment efficacy differ greatly even among patients with similar clinical and pathological characteristics, or even the same TNM stage. This suggests that the existing gastric cancer prognostic indicators may have reached their limit in predicting patient prognosis and treatment benefit. Therefore, there is an urgent need to identify new biomarkers to complement the current prognostic indicators and provide a basis for the prognostic evaluation and individualized treatment of gastric cancer.
Disclosure of Invention
Aiming at the lack of accurate prognostic indicators caused by the high heterogeneity of gastric cancer and individual differences among patients, the invention provides an endoplasmic reticulum stress-based gastric cancer prognosis risk scoring model, a construction method and application thereof.
In order to achieve the purpose, the invention adopts the following technical scheme:
a gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic genes, wherein the endoplasmic reticulum stress genes comprise: NOS3, PON1, CXCR4, MATN3, ANXA5, SERPINE1;
risk score model = (0.052 × NOS3 expression level) + (0.137 × PON1 expression level) + (0.067 × CXCR4 expression level) + (0.131 × MATN3 expression level) + (0.116 × ANXA5 expression level) + (0.09 × SERPINE1 expression level).
A construction method of a gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic genes comprises the following steps:
to ensure comparability of the data, RNA-seq data are converted to transcripts per million (TPM) and normalized by log2 (TPM + 1) transformation for subsequent analysis;
screening genes for constructing a risk score model includes: NOS3, PON1, CXCR4, MATN3, ANXA5, SERPINE1;
constructing a risk scoring model:
risk score model = (0.052 × NOS3 expression level) + (0.137 × PON1 expression level) + (0.067 × CXCR4 expression level) + (0.131 × MATN3 expression level) + (0.116 × ANXA5 expression level) + (0.09 × SERPINE1 expression level);
performing Kaplan-Meier (K-M) curve analysis on the identified high-risk group and low-risk group by using the "survminer" program package, and comparing the difference in overall survival (OS) between the two groups of samples; using the "timeROC" program package to draw time-dependent receiver operating characteristic (ROC) curves, calculating the area under the curve (AUC) for the gastric cancer samples at a plurality of time points, and evaluating the ability of the risk model to predict the prognosis of the gastric cancer samples; the same risk scoring formula and cut-off value are used in the validation set to verify the accuracy of the model.
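The risk scoring formula above is a weighted sum of six expression levels. A minimal Python sketch follows, using the coefficients stated in the formula; the function name `risk_score` and the example expression values are hypothetical and serve only as illustration, with expression assumed to be on the log2 (TPM + 1) scale described in the method.

```python
# Coefficients as stated in the risk score formula of this document.
COEFFICIENTS = {
    "NOS3": 0.052,
    "PON1": 0.137,
    "CXCR4": 0.067,
    "MATN3": 0.131,
    "ANXA5": 0.116,
    "SERPINE1": 0.09,
}


def risk_score(expression):
    """Return the weighted sum of the six gene expression levels.

    `expression` maps gene symbol -> expression level
    (assumed here to be log2(TPM + 1) values).
    """
    missing = [gene for gene in COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Hypothetical sample, not taken from the TCGA or GEO data.
example = {
    "NOS3": 3.0,
    "PON1": 1.0,
    "CXCR4": 6.0,
    "MATN3": 2.0,
    "ANXA5": 7.0,
    "SERPINE1": 5.0,
}
print(round(risk_score(example), 3))
```

Because all six coefficients are positive, a higher expression level of any of the six genes raises the score.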
Use of the gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic genes in the preparation of a kit.
The kit is used in products for predicting the overall survival rate of gastric cancer patients.
The kit is used in products for diagnosing the overall survival rate of gastric cancer patients.
The kit is used in products for the auxiliary diagnosis of the overall survival rate of gastric cancer patients.
The invention develops a practical gastric cancer prognosis risk model from endoplasmic reticulum stress-related characteristic genes in order to predict the overall survival of gastric cancer patients. The invention analyzes gene expression profiles of gastric cancer patients from the TCGA database and the GEO database. Data collected from the TCGA database, comprising 375 gastric cancer samples and 32 paracancerous samples, were used as the training set, and 387 gastric cancer samples from the GEO database were used for validation. Differentially expressed genes (DEGs) between gastric cancer tissues and paracancerous tissues in the TCGA database were screened with the R software package "limma". ER stress-associated genes among the DEGs were identified through the GeneCards database. Based on the DEG data in the training set, a prognosis model with 6 endoplasmic reticulum stress-related characteristic genes was established by univariate Cox regression analysis and LASSO regression analysis, and gastric cancer patients were divided into a high-risk group and a low-risk group. Nomograms were constructed by combining clinical characteristics and risk scores to predict the survival probability of gastric cancer patients. The calibration curve showed good agreement between nomogram predictions and actual observations. The risk score of gastric cancer patients in the training cohort was significantly correlated with OS (p < 0.05).
ROC curve analysis showed that the AUC in the training set was 0.695, 0.786 and 0.698 at 3, 5 and 8 years of follow-up, respectively. In the validation set, the 3-year, 5-year and 8-year AUC values were 0.580, 0.625 and 0.627, respectively, so the predictive performance was also checked in an external cohort. Independent prognostic factor analysis showed that the risk score determined by the risk model is a prognostic factor independent of other clinicopathological characteristics. A risk score based on the 6 ER stress-associated genes can classify gastric cancer patients into high-risk and low-risk populations, which may aid in the selection of clinical treatment regimens.
Compared with the prior art, the invention has the following advantages:
the invention establishes a prognosis model with 6 endoplasmic reticulum stress-related characteristic genes and divides gastric cancer samples into a high-risk group and a low-risk group. The model shows good predictive performance in both the training set and the validation set.
Drawings
FIG. 1 is a volcano plot of differentially expressed endoplasmic reticulum stress-associated signature genes;
FIG. 2 is a forest plot of the ERS-RGs identified by univariate Cox regression analysis as being significantly correlated with prognosis;
FIG. 3 shows the development of a prognostic model based on ERS-RGs in the TCGA training set; FIG. 3 (A-B) identification of 6 ERS-RGs by LASSO regression analysis;
FIG. 4 shows validation of the prognostic model developed based on ERS-RGs in the TCGA training set; FIG. 4 (A) survival analysis of the feature-defined risk groups; (B) time-dependent ROC curves of the prognostic model constructed from the 6 ERS-RGs;
FIG. 5 shows validation of the prognostic model developed based on ERS-RGs in the GEO validation set; FIG. 5 (A) survival analysis of the feature-defined risk groups; (B) time-dependent ROC curves of the prognostic model constructed from the 6 ERS-RGs;
FIG. 6 shows the construction of a nomogram for survival prediction; FIG. 6 (A) forest plots showing the results of the single-factor and multi-factor independent prognostic analyses performed on the risk score; (B) nomogram incorporating the risk score and clinical characteristics; (C) calibration curves showing a high agreement between the nomogram-predicted and actual survival rates.
Detailed Description
In order that the manner in which the above recited objects, features and advantages of the present invention are obtained will be readily understood, a more particular description of the invention briefly described above will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced otherwise than as specifically described herein and, therefore, the present invention is not limited to the specific embodiments disclosed in the following description.
Example 1
Gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic gene
Transcriptome data and corresponding clinical information for 375 gastric cancer and 32 paracancerous samples were obtained from the TCGA database (http://cancer.nih.gov/) and used as training set samples; 387 gastric cancer samples were downloaded from the GEO database (http://www.ncbi.nlm.nih.gov/geo/) as validation set samples (GSE84437). To ensure comparability of the data, RNA-seq data were converted to transcripts per million (TPM) and normalized by log2 (TPM + 1) transformation for subsequent analysis. ERS-related signature genes (ERS-RGs) were obtained from the GeneCards database (https:// www.
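The TPM conversion and log2 (TPM + 1) normalization described above can be sketched in Python as follows. This is an illustration under stated assumptions: the text does not say what the starting values were, so raw read counts and gene lengths in kilobases are assumed as inputs, and the function names `counts_to_tpm` and `log2_tpm` are hypothetical.

```python
import math


def counts_to_tpm(counts, lengths_kb):
    """Convert raw read counts to transcripts per million (TPM).

    counts:     gene -> read count for one sample (assumed input)
    lengths_kb: gene -> gene length in kilobases (assumed input)
    """
    # Length-normalize first, then scale so the sample sums to one million.
    rates = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


def log2_tpm(tpm):
    """Apply the log2(TPM + 1) transformation gene by gene."""
    return {gene: math.log2(value + 1.0) for gene, value in tpm.items()}


# Hypothetical two-gene sample.
tpm = counts_to_tpm({"A": 100, "B": 300}, {"A": 1.0, "B": 3.0})
normalized = log2_tpm(tpm)
```

The pseudocount of 1 keeps genes with zero expression at a transformed value of 0 rather than an undefined logarithm.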
Differential expression analysis between gastric cancer tissue and paracancerous tissue was performed using the "limma" package in R software. To include a more comprehensive set of gastric cancer differential genes, P < 0.05 and |log2 (Fold Change)| > 0 were used as screening criteria. The R software package "clusterProfiler" was used to perform Gene Ontology (GO) function analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis and visualization on the differential genes, characterizing the key pathways in which the gastric cancer differential genes are involved and revealing potential molecular mechanisms.
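The screening criteria above amount to a simple filter over a table of per-gene statistics. The sketch below shows that filter only; it does not reproduce the limma model fit, the gene names and values are hypothetical, and whether the P value in the text is raw or adjusted is not stated, so the code simply takes whatever P value it is given.

```python
def select_degs(results, p_cutoff=0.05, lfc_cutoff=0.0):
    """Split genes into up- and down-regulated sets.

    results: gene -> (log2 fold change, P value), e.g. exported from
             a differential expression tool (hypothetical input format).
    A gene passes when P < p_cutoff and |log2FC| > lfc_cutoff.
    """
    up, down = [], []
    for gene, (log2fc, p_value) in results.items():
        if p_value < p_cutoff and abs(log2fc) > lfc_cutoff:
            (up if log2fc > 0 else down).append(gene)
    return sorted(up), sorted(down)


# Hypothetical statistics for four genes.
stats = {
    "GENE_A": (1.8, 0.001),   # up-regulated, significant
    "GENE_B": (-0.4, 0.020),  # down-regulated, significant
    "GENE_C": (2.5, 0.300),   # large change but not significant
    "GENE_D": (0.0, 0.010),   # significant P but no change
}
up_genes, down_genes = select_degs(stats)
```

With a fold-change threshold of 0, the P value does almost all of the filtering, which matches the stated aim of keeping a broad gene set.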
Single-factor Cox regression analysis was carried out using the "survival" package in R software to screen out the differentially expressed, ERS-related gastric cancer genes with prognostic value. The multigene risk model was constructed by performing least absolute shrinkage and selection operator (LASSO) regression analysis using the "glmnet" package. The risk score of each sample was calculated from the regression coefficients of the model genes (risk score = ∑ (regression coefficient × expression level of the model gene)). The median risk score of the samples in the training set was taken as the cut-off value to divide the samples into a high-risk group and a low-risk group. The constructed risk scoring model was then validated in an external data set (the GEO data set), as shown in FIG. 5.
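The median split and its reuse in the external data set can be sketched as below. The sample identifiers and scores are hypothetical, and assigning a score exactly equal to the cut-off to the high-risk group is an assumption, since the text does not say how ties are handled.

```python
from statistics import median


def split_by_cutoff(scores, cutoff):
    """Assign each sample to the high- or low-risk group.

    scores: sample id -> risk score
    Samples with score >= cutoff are labeled high risk (assumed tie rule).
    """
    high = sorted(s for s, value in scores.items() if value >= cutoff)
    low = sorted(s for s, value in scores.items() if value < cutoff)
    return high, low


# Hypothetical risk scores for a training and a validation cohort.
training_scores = {"T1": 1.0, "T2": 2.0, "T3": 3.0}
validation_scores = {"V1": 0.5, "V2": 2.5}

# The cut-off is derived from the training set only ...
cutoff = median(training_scores.values())
train_high, train_low = split_by_cutoff(training_scores, cutoff)

# ... and then reused unchanged in the validation set.
valid_high, valid_low = split_by_cutoff(validation_scores, cutoff)
```

Fixing the cut-off on the training cohort keeps the validation cohort from influencing its own group assignment.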
The identified high-risk and low-risk groups were subjected to Kaplan-Meier (K-M) curve analysis using the "survminer" package, comparing the difference in overall survival (OS) between the two groups of samples. Time-dependent receiver operating characteristic (ROC) curves were drawn using the "timeROC" package, and the areas under the curve (AUC) for 3-year, 5-year and 8-year OS of the gastric cancer samples were calculated to further evaluate the ability of the risk model to predict prognosis. To assess whether the risk model has independent prognostic value, single-factor and multi-factor independent prognostic analyses were performed on the risk score. The independent prognostic factors identified by the multi-factor analysis were used as variables to draw a nomogram with the "rms" package for the comprehensive assessment of 3-year, 5-year and 8-year survival rates.
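For readers unfamiliar with the Kaplan-Meier curves mentioned above, the product-limit estimator can be written in a few lines of Python. This is a generic textbook sketch, not the "survminer" implementation; the function name `kaplan_meier` and the example follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate of the survival function.

    times:  follow-up time of each sample
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather every sample that leaves the risk set at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


# Hypothetical group: deaths at 1, 3 and 4 years, one sample censored at 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Running the estimator separately on the high-risk and low-risk groups gives the two curves that are then compared, typically with a log-rank test.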
To evaluate immune cell infiltration in the gastric cancer tumor microenvironment of the high-risk and low-risk groups, the infiltration abundance of characteristic immune cells in the two groups was analyzed with the deconvolution-based CIBERSORT algorithm. The cytolytic activity score (CYT score) was calculated as the geometric mean of the expression levels of granzyme A (GZMA) and perforin 1 (PRF1). Tumor mutation burden (TMB) is the total number of somatic gene coding errors, base substitutions, gene insertions or deletion errors detected per million bases. TMB was calculated from the mutation status of the gastric cancer samples, determined with the "maftools" package.
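The two summary scores defined above reduce to short formulas. The Python sketch below follows those definitions directly; the function names, the example expression values and the covered region size are hypothetical, and it does not reproduce the "maftools" calculation.

```python
import math


def cyt_score(gzma, prf1):
    """Cytolytic activity score: geometric mean of GZMA and PRF1 expression."""
    if gzma < 0 or prf1 < 0:
        raise ValueError("expression levels must be non-negative")
    return math.sqrt(gzma * prf1)


def tumor_mutation_burden(n_mutations, covered_megabases):
    """Somatic mutations per million bases of sequenced region.

    covered_megabases is the size of the sequenced region in Mb
    (an assumed input; the text does not state the value used).
    """
    if covered_megabases <= 0:
        raise ValueError("covered region must be positive")
    return n_mutations / covered_megabases


# Hypothetical sample values.
cyt = cyt_score(4.0, 9.0)
tmb = tumor_mutation_burden(76, 38.0)
```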
All statistical analyses and visualizations were performed in the R language (version 4.1.3) with R packages. P values less than 0.05 were considered statistically significant.
Example 2 identification of gastric cancer ERS-associated prognostic genes and construction of Cox Risk model
Differential gene analysis between gastric cancer samples and paracancerous samples was carried out in the TCGA training cohort, yielding a total of 5054 significantly up-regulated genes and 4229 significantly down-regulated genes. A total of 785 ERS-associated signature genes (ERS-RGs) were obtained from the GeneCards database. Among them, 444 genes were differentially expressed in gastric cancer, including 168 significantly down-regulated genes and 276 significantly up-regulated genes. GO and KEGG functional enrichment showed that the differential ERS signature genes were mainly enriched in biological processes such as protein processing in the endoplasmic reticulum, ECM-receptor interaction and the unfolded protein response (P < 0.05). Single-factor Cox regression analysis was performed on the 444 ERS characteristic genes differentially expressed in gastric cancer, as shown in FIG. 2, and 12 high-risk genes (HR > 1) significantly associated with prognosis were screened out for LASSO regression analysis, as shown in FIG. 3, which gave the optimal lambda value = 6 and the beta regression coefficient of each gene. Substituting the genes gives the following risk score formula: risk score = 0.052 × NOS3 expression level + 0.137 × PON1 expression level + 0.067 × CXCR4 expression level + 0.131 × MATN3 expression level + 0.116 × ANXA5 expression level + 0.09 × SERPINE1 expression level. Samples were divided into a high-risk group (N = 169) and a low-risk group (N = 168) according to the median risk score of 2.369 in the training cohort.
Example 3 prognostic efficacy evaluation and validation of risk models
The visualization result of the risk curve shows that the death ratio of the gastric cancer sample in the high risk group is higher than that in the low risk group, and the high risk group sample is suggested to have poor prognosis. NOS3, PON1, CXCR4, MATN3, ANXA5 and SERPINE1 are all highly expressed in the high risk group, which shows that the high expression of the six model construction genes is positively correlated with the high risk.
The Kaplan-Meier curves indicate that the low-risk group has higher survival in both the training set (P < 0.0001) and the validation set (P = 0.0013). Time-dependent ROC analysis gave AUC values of 0.695 at 3 years, 0.786 at 5 years and 0.698 at 8 years in the TCGA data set, and an AUC value of 0.625 at 5 years in the validation set. These results indicate that the risk model has good sensitivity and specificity for predicting the prognosis of gastric cancer samples, particularly the 5-year overall survival rate.
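For readers unfamiliar with the Kaplan-Meier estimate behind these survival curves, the following minimal sketch computes it in pure Python. The follow-up times and event flags are hypothetical, the function name is an assumption, and the sketch does not reproduce the P values or the time-dependent ROC analysis reported above.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct death time.

    times  -- follow-up time of each sample
    events -- 1 if death was observed at that time, 0 if the sample was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all samples that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit update: multiply by the fraction surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored samples both leave the risk set after time t.
        at_risk -= removed
    return curve


# Hypothetical follow-up data (years), for illustration only.
high_risk = kaplan_meier([1, 2, 3, 4], [1, 1, 0, 1])
low_risk = kaplan_meier([2, 3, 5, 6], [0, 1, 0, 0])
print(high_risk)
print(low_risk)
```

Plotting the two returned step functions against time gives curves of the kind compared in the text; a log-rank test would be needed on top of this to obtain a P value.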
Example 4 Independent prognostic factor analysis of the risk score and establishment of the nomogram
Age, gender, grade, stage and risk score were included as variables in the univariate independent prognostic factor analysis. The univariate analysis showed that the risk score was a risk factor significantly associated with gastric cancer prognosis (HR = 3.601, 95% CI …). The multivariate independent prognostic analysis showed that the risk score was a prognostic factor independent of the other clinicopathological characteristics (HR = 3.598, 95% CI …). To further integrate clinical information and enable multivariate survival analysis of the gastric cancer samples, the factors with independent prognostic value identified by the multivariate analysis were combined into a nomogram for comprehensively estimating the 3-year, 5-year and 8-year survival rates of gastric cancer samples, as shown in FIG. 6.
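As a reading aid for the hazard ratios, and assuming the independent prognostic analysis uses the standard Cox proportional hazards model (the same family as the univariate Cox regression of Example 2), the hazard for a sample with covariates x_1, ..., x_p and the hazard ratio of covariate j can be written as follows. The symbols h_0, beta_j and x_j are notation introduced here, not taken from the original.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \dots + \beta_p x_p\right),
\qquad
\mathrm{HR}_j = e^{\beta_j}

% For the risk score in the multivariate analysis:
\beta_{\text{risk}} = \ln(3.598) \approx 1.28
```

Under this assumption, a one-unit increase in the risk score multiplies the hazard of death by about 3.6 when the other covariates are held fixed.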
Example 5 Assessment of ERS status and immune microenvironment status in the high-risk and low-risk gastric cancer groups
The expression levels of related proteins such as ATF6, HSPA5, XBP1 and ATF4 are the most commonly used indicators of ERS intensity in cells or tissues, and the expression levels of these markers were examined in 375 gastric cancer samples. Expression of these markers was significantly higher in the high-risk group (P < 0.05), indicating that ER stress intensity is significantly higher in the high-risk group than in the low-risk group. The CIBERSORT results show that the abundance of infiltrating immune cells in the tumor microenvironment differs between the high-risk and low-risk groups: the abundance of activated CD4 memory T cells is significantly lower in the high-risk group, while the abundance of M0 and M2 macrophages is significantly higher (P < 0.05). Furthermore, the expression levels of common immune checkpoints were significantly higher in the high-risk group than in the low-risk group (P < 0.05). The cytolytic activity score of the high-risk group samples was also significantly elevated (P < 0.05).
Matters not described in detail in this specification are well known to those skilled in the art. Although illustrative embodiments of the present invention have been described above to help those skilled in the art understand it, the present invention is not limited to the scope of these embodiments. Changes that are apparent to those skilled in the art fall within the invention as long as they remain within its spirit and scope as defined by the appended claims, and everything that makes use of the inventive concept is protected.
Claims (5)
1. A gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic genes, characterized in that the endoplasmic reticulum stress genes are: NOS3, PON1, CXCR4, MATN3, ANXA5 and SERPINE1;
risk score = (0.052 × NOS3 expression level) + (0.137 × PON1 expression level) + (0.067 × CXCR4 expression level) + (0.131 × MATN3 expression level) + (0.116 × ANXA5 expression level) + (0.09 × SERPINE1 expression level).
2. Use of the gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic genes according to claim 1 in the preparation of a kit.
3. The kit according to claim 2, for use in a product for predicting the overall survival rate of gastric cancer patients.
4. The use of the kit according to claim 2 in a product for diagnosing the overall survival rate of gastric cancer patients.
5. The use of the kit according to claim 2 in a product for assisting in diagnosing the overall survival rate of a gastric cancer patient.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211185735.7A CN115497552A (en) | 2022-09-27 | 2022-09-27 | Gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic gene and application |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211185735.7A CN115497552A (en) | 2022-09-27 | 2022-09-27 | Gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic gene and application |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115497552A (en) | 2022-12-20 |
Family
ID=84471803
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211185735.7A Withdrawn CN115497552A (en) | 2022-09-27 | 2022-09-27 | Gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic gene and application |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115497552A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115641912A (en) * | 2022-12-07 | 2023-01-24 | 北京泽桥医疗科技股份有限公司 | Biomarker for treating and diagnosing gastric cancer metastasis and identification method thereof |
CN116665898A (en) * | 2023-06-01 | 2023-08-29 | 南方医科大学南方医院 | Biomarker for predicting prognosis of gastric cancer based on histone modification regulator characteristics, scoring model and application |
CN116844685A (en) * | 2023-07-03 | 2023-10-03 | 广州默锐医药科技有限公司 | Immunotherapeutic effect evaluation method, device, electronic equipment and storage medium |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115641912A (en) * | 2022-12-07 | 2023-01-24 | 北京泽桥医疗科技股份有限公司 | Biomarker for treating and diagnosing gastric cancer metastasis and identification method thereof |
CN115641912B (en) * | 2022-12-07 | 2023-04-07 | 北京泽桥医疗科技股份有限公司 | Biomarker for treating and diagnosing gastric cancer metastasis and identification method thereof |
CN116665898A (en) * | 2023-06-01 | 2023-08-29 | 南方医科大学南方医院 | Biomarker for predicting prognosis of gastric cancer based on histone modification regulator characteristics, scoring model and application |
CN116665898B (en) * | 2023-06-01 | 2024-01-30 | 南方医科大学南方医院 | Biomarker for predicting prognosis of gastric cancer based on histone modification regulator characteristics, scoring model and application |
CN116844685A (en) * | 2023-07-03 | 2023-10-03 | 广州默锐医药科技有限公司 | Immunotherapeutic effect evaluation method, device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Litchfield et al. | Meta-analysis of tumor-and T cell-intrinsic mechanisms of sensitization to checkpoint inhibition | |
CN109859801B (en) | Model for predicting lung squamous carcinoma prognosis by using seven genes as biomarkers and establishing method | |
CN115497552A (en) | Gastric cancer prognosis risk model based on endoplasmic reticulum stress characteristic gene and application | |
JP2020522690A (en) | Method and system for identifying or monitoring lung disease | |
CN115497562B (en) | Pancreatic cancer prognosis prediction model construction method based on copper death related gene | |
AU2015289758A1 (en) | Methods for evaluating lung cancer status | |
CN110577998A (en) | Construction of molecular model for predicting postoperative early recurrence risk of liver cancer and application evaluation thereof | |
CN115410713A (en) | Hepatocellular carcinoma prognosis risk prediction model construction based on immune-related gene | |
CN111951893B (en) | Method for constructing tumor mutation load TMB panel | |
CN113355419B (en) | Breast cancer prognosis risk prediction marker composition and application | |
CN113517073B (en) | Method for constructing survival rate prediction model after lung cancer surgery and prediction model system | |
CN111863137A (en) | Complex disease state evaluation method established based on high-throughput sequencing data and clinical phenotype and application | |
CN115588507A (en) | Prognosis model of lung adenocarcinoma EMT related gene, construction method and application | |
CN112831562A (en) | Biomarker combination and kit for predicting recurrence risk of liver cancer patient after resection | |
CN115482880A (en) | Head and neck squamous carcinoma glycolysis related gene prognosis model, construction method and application | |
CN116287204A (en) | Application of mutation condition of detection characteristic gene in preparation of venous thromboembolism risk detection product | |
CN113234823B (en) | Pancreatic cancer prognosis risk assessment model and application thereof | |
CN112037863B (en) | Early NSCLC prognosis prediction system | |
Bhandari et al. | Comparing continuous and discrete analyses of breast cancer survival information | |
US20240194294A1 (en) | Artificial-intelligence-based method for detecting tumor-derived mutation of cell-free dna, and method for early diagnosis of cancer, using same | |
CN115862876A (en) | Device and computer readable storage medium for predicting lung adenocarcinoma patient prognosis and medication guidance based on immune microenvironment gene group | |
CN115331812A (en) | Establishment and verification method of serous ovarian cancer prognostic marker model | |
Cheng et al. | Early signatures of breast cancer up to seven years prior to clinical diagnosis in plasma cell-free DNA methylomes | |
Zhong et al. | Distinguishing kawasaki disease from febrile infectious disease using gene pair signatures | |
Liu et al. | Differentially expressed mutant genes reveal potential prognostic markers for lung adenocarcinoma |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | WW01 | Invention patent application withdrawn after publication | Application publication date: 20221220 |