CN112746108B - Gene marker for tumor prognosis hierarchical evaluation, evaluation method and application - Google Patents

Publication number
CN112746108B
Authority
CN
China
Prior art keywords
gene
exp
prognosis
mettl5
hierarchical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110028913.4A
Other languages
Chinese (zh)
Other versions
CN112746108A (en)
Inventor
赫捷
高亦博
孙思进
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cancer Hospital and Institute of CAMS and PUMC
Original Assignee
Cancer Hospital and Institute of CAMS and PUMC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cancer Hospital and Institute of CAMS and PUMC
Priority to CN202110028913.4A
Publication of CN112746108A
Application granted
Publication of CN112746108B
Active
Anticipated expiration

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B5/00: ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/118: Prognosis of disease development
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/158: Expression markers

Abstract

The invention discloses a gene marker for stratified evaluation of tumor prognosis, an evaluation method and an application thereof. The gene marker comprises METTL5, RAC1, RCCD1, C11orf24 and SLC7A5. The evaluation method comprises the following steps: substituting the expression levels of the gene marker into a model formula and calculating a risk score for each sample; and grouping the samples according to the model threshold and performing between-group survival analysis. The invention provides a METTL5-based prognostic stratification for lung adenocarcinoma. Because the stratification is independent of tumor pathological stage, it can be applied to lung adenocarcinoma patients at every stage; dividing patients into risk strata reduces the heterogeneity of lung adenocarcinoma and provides important guidance for the precision medicine of patients.

Description

Gene marker for tumor prognosis hierarchical evaluation, evaluation method and application
Technical Field
The invention relates to the field of medical diagnosis, in particular to a gene marker for tumor prognosis hierarchical assessment, an assessment method and application.
Background
Lung cancer has the highest incidence and mortality of any cancer worldwide, resulting in over 70 million deaths each year. Lung adenocarcinoma accounts for over 40% of all lung cancer patients, and the proportion of adenocarcinoma is even higher among non-smokers, especially in Asian populations. Prognosis varies among patients at different stages, and there is considerable heterogeneity in treatment response and prognosis regardless of tumor stage. Identifying patients with a poorer prognosis is therefore of great help in guiding treatment. Moreover, although chemotherapy, targeted therapy, immunotherapy and other measures have been adopted in recent years, the 5-year survival rate of patients with advanced lung adenocarcinoma is still below 25%. The search for effective prognostic factors and therapeutic targets is an urgent task in lung adenocarcinoma research.
With the development of gene sequencing technologies, including next-generation sequencing, the understanding of tumorigenesis and tumor development keeps deepening. In addition to DNA mutations, gene expression levels and their regulatory mechanisms also influence the treatment and prognosis of tumors. N6-methyladenosine (m6A) modification of RNA is an important mechanism for regulating RNA function, particularly mRNA function. In previous studies, multiple m6A modification genes were found to be involved in tumor development and prognosis, including METTL3 and METTL4, which promote m6A methylation; FTO, which mediates demethylation; and the YTHDF family of genes, which influence mRNA expression levels. RNA methylation therefore plays a significant role in the development of tumors.
Current research on RNA methylation in tumors focuses on mRNA methylation, while ribosomal RNA methylation has been neglected, and existing m6A-related prognostic stratification models do not stratify patients satisfactorily. Methyltransferase-like protein 5 (METTL5) is a gene that modifies ribosomal RNA by methylation and has some prognostic potential. A better RNA-modification-based gene marker and evaluation method for prognostic stratification therefore urgently need to be developed.
Disclosure of Invention
The invention provides a gene marker for tumor prognosis hierarchical evaluation, which comprises METTL5, RAC1, RCCD1, C11orf24 and SLC7A5.
According to an embodiment of the present invention, the gene marker is used for predicting lung cancer.
According to another embodiment of the present invention, the gene marker is used for predicting lung adenocarcinoma.
In another aspect, the present invention provides a method for the stratified evaluation of tumor prognosis, comprising: substituting the expression levels of the gene marker into a model formula and calculating the risk score of each sample; and grouping the samples according to the model threshold and performing between-group survival analysis; wherein the gene marker comprises METTL5, RAC1, RCCD1, C11orf24 and SLC7A5.
According to an embodiment of the present invention, the model formula is $RS = (0.130 \times EXP_{METTL5}) + (0.078 \times EXP_{RAC1}) + (0.031 \times EXP_{RCCD1}) + (0.053 \times EXP_{C11orf24}) + (0.096 \times EXP_{SLC7A5})$, where $EXP_{METTL5}$ is the METTL5 gene expression level, $EXP_{RAC1}$ is the RAC1 gene expression level, $EXP_{RCCD1}$ is the RCCD1 gene expression level, $EXP_{C11orf24}$ is the C11orf24 gene expression level, and $EXP_{SLC7A5}$ is the SLC7A5 gene expression level.
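The model formula above is a weighted sum of five expression values, which can be sketched in a few lines of Python. This is a minimal illustration only: the function name `risk_score`, the dictionary layout and the sample values are assumptions made for this example, not part of the patent; only the five coefficients come from the text.

```python
# Coefficients taken from the model formula; keys are gene symbols.
COEFFICIENTS = {
    "METTL5": 0.130,
    "RAC1": 0.078,
    "RCCD1": 0.031,
    "C11orf24": 0.053,
    "SLC7A5": 0.096,
}


def risk_score(expression):
    """Return RS for one sample, given a mapping gene symbol -> expression level."""
    missing = [gene for gene in COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {', '.join(missing)}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Hypothetical sample: these expression values are invented for illustration.
sample = {"METTL5": 1.0, "RAC1": 2.0, "RCCD1": 0.0, "C11orf24": -1.0, "SLC7A5": 0.5}
print(round(risk_score(sample), 3))
```

Raising an error on a missing gene, rather than treating it as zero, is a design choice of this sketch: a silently dropped term would shift the score and could move a sample across the grouping threshold.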
According to another embodiment of the present invention, the step of obtaining the gene marker comprises: extracting a transcriptome data set of the tumor in the GEO data; annotating the probe according to the platform file; merging the probe expressions with the same gene; and normalizing the gene expression of each sample.
According to another embodiment of the present invention, the normalization is LASSO regression.
According to another embodiment of the invention, the tumor is lung cancer.
According to another embodiment of the invention, the tumor is lung adenocarcinoma.
The invention also provides the application of the gene marker in the stratified evaluation of tumor prognosis.
The invention provides a METTL5-based prognostic stratification for lung adenocarcinoma. Because the stratification is independent of tumor pathological stage, it can be applied to lung adenocarcinoma patients at every stage; dividing patients into risk strata reduces the heterogeneity of lung adenocarcinoma and provides important guidance for the precision medicine of patients. The maximum 2-year survival AUC of the 5-gene prognostic model reaches 0.823, the 3-year survival AUC reaches 0.705, and the 5-year survival AUC reaches 0.699.
Drawings
The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
FIG. 1 shows the top 15 genes of the random survival forest ranked by importance (the numbers in the figure indicate the minimum depth).
FIG. 2 shows the prognostic value of the risk score in the GSE3141 dataset, evaluated using ROC curves.
FIG. 3 shows the prognostic value of the risk score in the GSE13213 dataset, evaluated using ROC curves.
FIG. 4 shows the prognostic value of the risk score in the GSE31210 dataset, evaluated using ROC curves.
Detailed Description
The technical solution of the present invention will be described in detail with reference to the following embodiments, which are a part of the embodiments of the present invention, but not all of them. Other embodiments, which can be derived by one of ordinary skill in the art from the embodiments of the present invention without creative efforts, are within the protection scope of the present invention.
To construct a robust prognostic risk model based on the METTL5 gene, a lung adenocarcinoma transcriptome dataset was downloaded from TCGA (The Cancer Genome Atlas), the model was built on a training set by integrating several machine learning algorithms, and the optimal model parameters were obtained by cross-validation. The resulting risk model scores the risk of lung adenocarcinoma patients from the mRNA expression levels of 5 genes.
The model is then validated as follows: (1) extracting a transcriptome dataset of lung adenocarcinoma from GEO; (2) annotating the probes according to the platform file; (3) merging the expression values of probes that map to the same gene; (4) normalizing the gene expression of each sample; (5) extracting the normalized expression levels of the genes included in the model; (6) calculating the risk score of each sample according to the model formula; and (7) grouping the samples according to the model threshold and performing between-group survival analysis.
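Steps (2) to (4) of the validation procedure can be sketched as follows. This is a hedged illustration under stated assumptions: the helper names `merge_probes` and `zscore`, the probe IDs and the values are hypothetical, and per-sample z-score standardization is assumed only for the example, because the text does not specify the normalization in this passage. Merging by the mean follows the description given in Example 2.

```python
from collections import defaultdict
from statistics import mean, pstdev


def merge_probes(probe_values, probe_to_gene):
    """Average the values of all probes annotated to the same gene (one sample)."""
    grouped = defaultdict(list)
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene:  # skip probes that have no gene annotation
            grouped[gene].append(value)
    return {gene: mean(values) for gene, values in grouped.items()}


def zscore(gene_values):
    """Standardize one sample's gene-level values (assumed normalization)."""
    values = list(gene_values.values())
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a sample with constant expression")
    return {gene: (value - mu) / sd for gene, value in gene_values.items()}


# Hypothetical platform annotation and one hypothetical sample.
annotation = {"p1": "METTL5", "p2": "METTL5", "p3": "RAC1", "p4": ""}
sample = {"p1": 2.0, "p2": 4.0, "p3": 5.0, "p4": 9.0}

merged = merge_probes(sample, annotation)
normalized = zscore(merged)
print(merged)
print(normalized)
```

In a real dataset the annotation would come from the GEO platform file and the normalization would run over all genes on the array, not only two.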
The method is applicable to patients diagnosed with lung adenocarcinoma.
To facilitate a clearer understanding of the contents of the present invention, reference will now be made in detail to the following specific examples:
Example 1 Construction of a METTL5-based risk scoring model in the TCGA lung adenocarcinoma cohort based on gene expression profiling data
To develop a METTL5-based prognostic risk model, the Spathial algorithm was first applied to perform an evolutionary analysis and find genes associated with different METTL5 expression levels. Spathial uses the principal path algorithm to find the genes that matter most along the transition. The samples were divided into two groups according to the upper and lower quantiles of METTL5 expression, and the start and end points of the path were set to the centroids of the two groups. In the path analysis, the number of path waypoints was set to 50. Genes were ranked by significance according to their corrected P values, and the top 100 genes were selected, as listed in Table 1.
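The choice of path endpoints described above can be illustrated with a small sketch. It covers only the grouping and centroid step, not the path algorithm itself, which Spathial implements. The function name `path_endpoints`, the data layout and the use of quartiles as the "upper" and "lower" quantiles are assumptions for illustration; the text does not state which quantiles were used.

```python
from statistics import mean, quantiles


def path_endpoints(expr, gene="METTL5"):
    """Return (low_centroid, high_centroid) for samples split on one gene.

    expr maps sample id -> {gene symbol: expression value}.
    """
    values = [profile[gene] for profile in expr.values()]
    # Quartile cut points (assumed); "inclusive" keeps them within the data range,
    # so neither group can be empty.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    low = [p for p in expr.values() if p[gene] <= q1]
    high = [p for p in expr.values() if p[gene] >= q3]
    genes = sorted(next(iter(expr.values())))

    def centroid(group):
        # Mean expression of every gene across the samples of one group.
        return {g: mean(p[g] for p in group) for g in genes}

    return centroid(low), centroid(high)


# Hypothetical cohort of 8 samples with an invented second gene "X".
cohort = {f"s{i}": {"METTL5": float(i), "X": 10.0 * i} for i in range(1, 9)}
start, end = path_endpoints(cohort)
print(start, end)
```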
TABLE 1
[Table 1 is provided as images in the original publication: BDA0002891309570000041, BDA0002891309570000051]
The number of features in the table above was further reduced with random survival forests, using randomForestSRC. The "importance" parameter of the algorithm was set to TRUE and all other parameters were left at their default values. The genes were ranked by importance, and the top 15 were included in the penalized regression analysis (FIG. 1). Finally, a METTL5-based prognostic risk model was constructed by least absolute shrinkage and selection operator (LASSO) regression, with the optimal parameter selected by ten-fold cross-validation. According to the optimal lambda value, 5 genes were incorporated into the model, and the model formula is $RS = (0.130 \times EXP_{METTL5}) + (0.078 \times EXP_{RAC1}) + (0.031 \times EXP_{RCCD1}) + (0.053 \times EXP_{C11orf24}) + (0.096 \times EXP_{SLC7A5})$, where $EXP_{METTL5}$ is the METTL5 gene expression level, $EXP_{RAC1}$ is the RAC1 gene expression level, $EXP_{RCCD1}$ is the RCCD1 gene expression level, $EXP_{C11orf24}$ is the C11orf24 gene expression level, and $EXP_{SLC7A5}$ is the SLC7A5 gene expression level.
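For orientation, LASSO-penalized Cox regression is commonly written as the maximization below. The text does not state the exact objective or the software used, so the notation here is an assumption: $\ell(\beta)$ denotes the Cox log partial likelihood, $p$ the number of candidate genes (15 at this step), and $\lambda$ the penalty weight chosen by cross-validation; scaling conventions differ between implementations.

```latex
\hat{\beta} \;=\; \arg\max_{\beta}\; \Bigl[\, \ell(\beta) \;-\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \,\Bigr]
```

The L1 penalty shrinks some coefficients exactly to zero, which is how a candidate set of 15 genes can reduce to the 5 genes with nonzero coefficients that appear in the model formula.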
Example 2 verification of prognostic risk models based on independent external data sets
The microarray datasets are from the Gene Expression Omnibus (GEO); details of the datasets are given in Table 2.
TABLE 2
[Table 2 is provided as an image in the original publication: BDA0002891309570000052]
First, the probe IDs were annotated with gene symbols according to the platform file provided by GEO, different probes corresponding to the same gene were merged by taking their mean, and the samples were then normalized. The expression values of the genes included in the model were substituted into the formula to calculate a risk score for each sample, and the samples were grouped according to the median risk score in each cohort. Survival analysis was performed on the subgroups, and AUC (area under the ROC curve) values were calculated for 2-year, 3-year and 5-year survival to evaluate the prognostic performance of the model.
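The median-based grouping described above can be sketched as follows. The function name `stratify_by_median`, the sample identifiers and the scores are hypothetical, and assigning samples that sit exactly on the median to the low-risk group is an assumption of this sketch, since the text does not say how ties are handled.

```python
from statistics import median


def stratify_by_median(risk_scores):
    """Label each sample 'high' if its score exceeds the cohort median, else 'low'."""
    cutoff = median(risk_scores.values())
    return {
        sample: ("high" if score > cutoff else "low")
        for sample, score in risk_scores.items()
    }


# Hypothetical risk scores for four samples of one cohort.
scores = {"s1": 0.2, "s2": 0.5, "s3": 0.9, "s4": 0.4}
groups = stratify_by_median(scores)
print(groups)
```

Because the cutoff is recomputed per cohort, the same risk score can fall into different groups in different datasets.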
The Kaplan-Meier survival curves show that overall survival (OS) in the high-risk group is significantly lower than in the low-risk group in all three cohorts (P < 0.05). Furthermore, univariate Cox regression showed that the risk score was a significant risk factor for OS in all three cohorts (P < 0.05). The predictive performance was evaluated using ROC curves, with AUC values ranging from 0.647 to 0.823 (FIG. 4). Notably, the risk score also has good predictive power in the early-stage lung adenocarcinoma cohort, suggesting that it may also reduce cohort heterogeneity in early-stage lung adenocarcinoma.
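For readers unfamiliar with the Kaplan-Meier estimator used for the survival curves, a minimal sketch of the product-limit calculation follows. It is a generic textbook implementation, not the code used for the reported analysis; the function name `kaplan_meier` and the follow-up data are invented for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a death was observed.

    times: follow-up time per patient; events: 1 if death observed, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients both leave the risk set
    return curve


# Hypothetical follow-up data: the second patient is censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
```

Running the function separately on the high-risk and low-risk groups gives the two curves that a log-rank test would then compare.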
Example 3 Validation that the constructed risk score is an independent risk factor for the prognosis of patients with lung adenocarcinoma
To further demonstrate that the constructed risk score is an independent risk factor, relevant information was extracted from the cohorts with clinical information, specifically the patient's age at diagnosis, sex, tumor pathological stage and smoking history. Multivariate Cox regression analysis was performed on these datasets to further illustrate the prognostic significance of the signature. The results show that the constructed risk score was an independent prognostic factor (P < 0.05) in all three datasets, with hazard ratios for overall survival of 2.106, 2.514 and 3.666 in the TCGA lung adenocarcinoma, GSE13213 and GSE31210 datasets, respectively.
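The multivariate analysis can be read against the standard Cox proportional hazards model, written below with the covariates named in the text. The symbols are assumptions for illustration, and the text does not state whether the risk score enters as a continuous value or as the high/low group indicator.

```latex
h(t \mid x) = h_0(t)\,\exp\!\bigl(\beta_{RS}\,x_{RS} + \beta_{age}\,x_{age} + \beta_{sex}\,x_{sex} + \beta_{stage}\,x_{stage} + \beta_{smoking}\,x_{smoking}\bigr),
\qquad
HR_{RS} = \exp(\beta_{RS})
```

A hazard ratio above 1 for the risk score, with the other covariates held fixed, is what supports calling the score an independent prognostic factor.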
The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims (5)

1. A gene marker for prognosis stratification evaluation of lung adenocarcinoma, wherein the gene marker is METTL5, RAC1, RCCD1, C11orf24, SLC7A5.
2. The application of a gene marker in constructing a lung adenocarcinoma prognosis hierarchical evaluation model is characterized in that the gene marker is METTL5, RAC1, RCCD1, C11orf24 and SLC7A5.
3. The use according to claim 2, comprising:
substituting the expression quantity of the gene marker into a model formula, and calculating the risk score of each sample; and
grouping the samples according to the threshold value of the model and carrying out intergroup survival analysis;
wherein the model formula is $RS = (0.130 \times EXP_{METTL5}) + (0.078 \times EXP_{RAC1}) + (0.031 \times EXP_{RCCD1}) + (0.053 \times EXP_{C11orf24}) + (0.096 \times EXP_{SLC7A5})$, in which $EXP_{METTL5}$ is the expression level of the METTL5 gene, $EXP_{RAC1}$ is the expression level of the RAC1 gene, $EXP_{RCCD1}$ is the expression level of the RCCD1 gene, $EXP_{C11orf24}$ is the expression level of the C11orf24 gene, and $EXP_{SLC7A5}$ is the expression level of the SLC7A5 gene.
4. The use according to claim 3, wherein the step of obtaining the genetic marker comprises:
extracting a transcriptome data set of the tumor in the GEO data;
annotating the probe according to the platform file;
merging the probe expressions with the same gene; and
normalizing the gene expression of each sample.
5. The use of claim 4, wherein the normalization is LASSO regression.
CN202110028913.4A 2021-01-11 2021-01-11 Gene marker for tumor prognosis hierarchical evaluation, evaluation method and application Active CN112746108B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110028913.4A CN112746108B (en) 2021-01-11 2021-01-11 Gene marker for tumor prognosis hierarchical evaluation, evaluation method and application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110028913.4A CN112746108B (en) 2021-01-11 2021-01-11 Gene marker for tumor prognosis hierarchical evaluation, evaluation method and application

Publications (2)

Publication Number Publication Date
CN112746108A CN112746108A (en) 2021-05-04
CN112746108B true CN112746108B (en) 2022-04-05

Family

ID=75650534

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110028913.4A Active CN112746108B (en) 2021-01-11 2021-01-11 Gene marker for tumor prognosis hierarchical evaluation, evaluation method and application

Country Status (1)

Country Link
CN (1) CN112746108B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116844685B (en) * 2023-07-03 2024-04-12 广州默锐医药科技有限公司 Immunotherapeutic effect evaluation method, device, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014072086A1 (en) * 2012-11-09 2014-05-15 Philip Morris Products S.A. Biomarkers for prognosis of lung cancer
CN109082471A (en) * 2018-09-18 2018-12-25 蚌埠医学院 A kind of patients with lung adenocarcinoma prognosis prediction peripheral blood mRNA marker and its screening technique and application

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SG2013079173A (en) * 2013-10-18 2015-05-28 Agency Science Tech & Res Sense-antisense gene pairs for patient stratification, prognosis, and therapeutic biomarkers identification
WO2016049276A1 (en) * 2014-09-25 2016-03-31 Moffitt Genetics Corporation Prognostic tumor biomarkers

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Construction and Comprehensive Analyses of a METTL5-Associated Prognostic Signature With Immune Implication in Lung Adenocarcinomas"; Sijin Sun et al.; Frontiers in Genetics; 2021-02-19; full text *
"The rRNA m6A methyltransferase METTL5 is involved in pluripotency and developmental programs"; Ignatova et al.; GENES & DEVELOPMENT; 2020-03-26; full text *
Screening and validation of key prognostic genes in lung adenocarcinoma and analysis of their regulatory pathways; Li Ang et al.; Shandong Medical Journal (山东医药); 2020-08-15 (No. 23); full text *

Also Published As

Publication number Publication date
CN112746108A (en) 2021-05-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant