CN113355426B - Evaluation gene set and kit for predicting liver cancer prognosis - Google Patents


Info

Publication number
CN113355426B
Authority
CN
China
Prior art keywords
gene
prognosis
risk score
patient
genes
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110916132.9A
Other languages
Chinese (zh)
Other versions
CN113355426A (en)
Inventor
王维锋
张欣
王丛茂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhiben Medical Laboratory Co ltd
Origimed Technology Shanghai Co ltd
Original Assignee
Shanghai Zhiben Medical Laboratory Co ltd
Origimed Technology Shanghai Co ltd
Application filed by Shanghai Zhiben Medical Laboratory Co ltd and Origimed Technology Shanghai Co ltd
Priority to CN202110916132.9A
Publication of CN113355426A
Application granted
Publication of CN113355426B

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407 Specifically defined cancers
    • G01N33/57438 Specifically defined cancers of liver, pancreas or kidney
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57484 Immunoassay; Biospecific binding assay; Materials therefor for cancer involving compounds serving as markers for tumor, cancer, neoplasia, e.g. cellular determinants, receptors, heat shock/stress proteins, A-protein, oligosaccharides, metabolites
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Immunology (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Urology & Nephrology (AREA)
  • General Health & Medical Sciences (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Hematology (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Microbiology (AREA)
  • Organic Chemistry (AREA)
  • Cell Biology (AREA)
  • Hospice & Palliative Care (AREA)
  • Biochemistry (AREA)
  • Oncology (AREA)
  • Biophysics (AREA)
  • Medical Informatics (AREA)
  • General Physics & Mathematics (AREA)
  • Medicinal Chemistry (AREA)
  • Food Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Epidemiology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Public Health (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)

Abstract

The invention relates to an evaluation gene set for liver cancer prognosis. Specifically, the invention uses a gene set consisting of 62 specific immune-related genes to predict and evaluate the prognosis of a liver cancer patient and to provide a scientific basis for medical decisions. The invention also relates to a kit, a computing device and a storage medium for predicting liver cancer prognosis.

Description

Evaluation gene set and kit for predicting liver cancer prognosis
Technical Field
The present invention relates to a method for assessing the prognosis of a patient with liver cancer using a specific set of immune-related genes. Specifically, the invention relates to a characteristic gene set of 62 immune-related genes, which can be used to evaluate the prognosis of a liver cancer patient and to provide a scientific basis for medical decisions.
Background
Malignant tumors are often incurable, and the goal of treatment is well defined: to prolong the patient's survival after diagnosis. Liver cancer is one of the most common cancer types in China, with high morbidity and mortality. Treatment generally combines surgery, radiotherapy, chemotherapy and traditional Chinese medicine. Nevertheless, the recurrence rate and the rate of poor outcomes of liver cancer remain high, and the 5-year recurrence-free survival rate and overall survival rate remain low. Existing clinical prognostic indicators, such as alpha-fetoprotein level and TNM staging, neither comprehensively reflect tumor characteristics nor accurately assess a patient's survival risk. There is therefore an urgent need for reliable prognostic indicators with which to establish prognostic models, make accurate survival predictions and, in combination with the corresponding molecular characteristics, guide targeted therapeutic intervention. Providing doctors and patients with liver cancer prognosis information early in clinical practice is an urgent problem; solving it would help in formulating individualized treatment plans and is of clinical importance for improving postoperative survival and achieving precision treatment of liver cancer.
Both tumor cells and immune cells in the tumor microenvironment are involved in the initiation and development of tumors, so tumor immunology has attracted wide attention. Tumor-infiltrating immune cells are key cellular components of the host immune response and important members of the tumor microenvironment. Many studies have demonstrated that tumor-infiltrating immune cells are associated with therapeutic response and prognosis in a variety of cancers.
The advantage of using the expression values of a characteristic gene set to evaluate patient prognosis is objectivity: the result is free of the researcher's subjective bias. The disadvantage is that the observation time is long and the occurrence of all events, i.e. the death of all patients, must be recorded. Published immune-related gene markers usually involve only a single immune gene or a small number of immune cells. However, an immune response in vivo involves multiple types of immune cells, so evaluating prognosis from a single immune gene or a small number of immune cells is incomplete. Therefore, there remains a need for more accurate and efficient models for predicting the prognosis of cancer patients.
Disclosure of Invention
The method is based on TCGA hepatocellular carcinoma samples. The samples are randomly divided into a training set and a test set, and, by combining gene expression value data with the screening of immune-related genes, an evaluation gene set is selected that can predict the prognosis of hepatocellular carcinoma from gene expression values.
In a first aspect, the present invention relates to an evaluation gene set for predicting the prognosis of a liver cancer patient, the evaluation gene set comprising the 62 genes listed in Table 1 below:
Table 1: Genes of the selected characteristic evaluation gene set
(Table 1 is provided as an image in the original publication and is not reproduced here.)
In another aspect, the present invention also relates to a kit for predicting the prognosis of a patient with liver cancer, comprising a reagent that can specifically detect gene expression values, wherein the genes are the 62 genes in Table 1.
In this context, the terms "expression level" and "expression value" of a gene are used interchangeably to refer to the value of a parameter that measures the degree of expression of a given gene. The expression value can be determined by measuring the level of mRNA encoded by the gene of interest or by measuring the amount of protein encoded by the gene.
In some embodiments, the kit comprises one or more of nucleic acid extraction reagents, PCR reagents, genome/transcriptome sequencing reagents, gene-specific primers or probes, antibodies specific for gene expression products.
In some embodiments, the reagent is any reagent known in the art that can be used to detect the level of gene expression; in particular embodiments, the reagent is a reagent for performing one or more of the following methods: real-time fluorescent quantitative PCR, northern blotting, western blotting, genome sequencing, transcriptome sequencing, biological mass spectrometry or specific antibody detection.
In some embodiments, the kit further comprises sample processing reagents, such as sample lysis reagents, sample purification reagents, and nucleic acid extraction reagents, among others.
Transcriptome sequencing can rapidly and comprehensively obtain almost all transcripts and gene sequences of a specific cell or tissue of a certain species in a certain state through a second-generation sequencing platform, and can be used for researching gene expression quantity, gene function, structure, alternative splicing, prediction of new transcripts and the like. In addition, by designing appropriate primers, the transcription expression level of a gene can be determined by PCR such as reverse transcription PCR. The protein expression level of each gene can also be measured by an immunoassay such as immunohistochemistry, ELISA, or the like using an antibody specific to the gene protein.
Preferably, the gene expression value is a value obtained by annotating transcriptome sequencing data.
In another aspect, the present invention also relates to a method for predicting the prognosis of a patient with liver cancer, comprising the steps of:
a) sample collection and data detection: collecting a sample from the patient and determining the expression values of the 62 genes of the evaluation gene set in Table 1;
b) calculating the risk score: calculating the patient's total expression value over the 62 genes of the evaluation gene set, namely the Risk Score; the risk score is calculated as follows:
$$\text{Risk Score} = \sum_{i=1}^{n} E_i \times \mathrm{Coef}_i$$
wherein $E_i$ is the expression value of each gene, $\mathrm{Coef}_i$ is the weight coefficient of each gene, and $n$ is the number of genes in the characteristic gene set, namely 62;
c) predicting the patient's prognosis according to the calculated risk score: the lower the patient's risk score, the better the prognosis; the risk score is compared with a defined value, and the prognosis is predicted to be poor if the risk score is higher than the defined value and good if the risk score is lower than the defined value.
In some embodiments, the defined value is about 4.14.
In some embodiments, the patient sample is from a tissue of the patient, including tumor tissue, which is a primary lesion or a metastatic lesion.
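Steps b) and c) above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the gene names, expression values and coefficients are hypothetical placeholders (Table 1 and the real weights are not reproduced here), and treating a score exactly equal to the defined value as "good" is an assumption, since the text only covers the "higher" and "lower" cases.

```python
from typing import Mapping

CUTOFF = 4.14  # defined value given in the text ("about 4.14")


def risk_score(expression: Mapping[str, float], coef: Mapping[str, float]) -> float:
    """Return the sum of E_i * Coef_i over every gene in the coefficient table."""
    missing = [gene for gene in coef if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(expression[gene] * coef[gene] for gene in coef)


def predict_prognosis(score: float, cutoff: float = CUTOFF) -> str:
    """Scores above the cutoff map to 'poor'; all others map to 'good'.

    Assumption: a score exactly equal to the cutoff is reported as 'good'.
    """
    return "poor" if score > cutoff else "good"


# Hypothetical three-gene illustration (the real gene set has 62 genes).
coef = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 1.0}
expr = {"GENE_A": 4.0, "GENE_B": 2.0, "GENE_C": 3.0}
score = risk_score(expr, coef)  # 2.0 - 0.5 + 3.0
label = predict_prognosis(score)
```

Raising an error on a missing gene, rather than silently skipping it, keeps a partial measurement from producing a misleadingly low score.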
As used herein, "about" when used in reference to a numerical value indicates that the calculation or measurement allows the value to encompass some approximation of the exact numerical value, or a reasonably close numerical value; "about" herein means at least the variation in value that can result from the usual methods of measuring or using such parameters; it should be understood that the presence or absence of "about" does not affect the interpretation of the numerical value; preferably, it denotes all values within plus or minus 10% of the stated value.
Those skilled in the art will appreciate that all or part of the functions of the above-described method steps may be implemented by hardware or by a computer program. When all or part of the functions of the above method steps are implemented by a computer program, the program may be stored in a computer-readable storage medium, which may include a read-only memory, a random access memory, a magnetic disk, an optical disk, a hard disk, etc., and the program is executed by a computer to realize the above functions. For example, the program may be stored in a memory of the device, and when the program in the memory is executed by the processor, all or part of the functions described above are implemented. In addition, the program may be stored in a storage medium such as a server, another computer, a magnetic disk, an optical disk, a flash disk or a removable hard disk, and may be downloaded or copied to the memory of a local device, or used to update the version of the local device's system; when the program in the memory is executed by a processor, all or part of the functions in the above embodiments are implemented.
In another aspect, the invention also relates to a system for predicting prognosis of a patient with liver cancer, comprising the following modules:
a) a data collection module: collecting a sample of the patient, determining the expression values of 62 genes in the evaluation gene set in table 1, and inputting the expression value data of each gene to a model calculation module;
b) a model calculation module: calculating the total expression value of 62 genes in the evaluation gene set of the liver cancer patient, namely a Risk Score (Risk Score); the risk score calculation formula is as described above;
c) an output prediction module: predicting the patient's prognosis according to the risk score data of the liver cancer patient, wherein the lower the patient's risk score, the better the prognosis; the risk score data are compared with a defined value, and the module outputs a poor predicted prognosis if the risk score is higher than the defined value and a good predicted prognosis if the risk score is lower than the defined value.
In some embodiments, the defined value is about 4.14.
In some embodiments, the patient sample is from a tissue of the patient, including tumor tissue, which is a primary lesion or a metastatic lesion.
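The three modules described above could be wired together as three small functions, as in the hedged sketch below. All function and class names are hypothetical, the two-gene coefficient table is a made-up placeholder for the 62 weights, and mapping a score equal to the defined value to "good" is an assumption not stated in the text.

```python
from dataclasses import dataclass
from typing import Dict, Mapping


@dataclass(frozen=True)
class PrognosisReport:
    risk_score: float
    prognosis: str  # "good" or "poor"


def data_collection_module(measured: Mapping[str, float],
                           panel: Mapping[str, float]) -> Dict[str, float]:
    """Pass on only the panel genes; reject samples with missing genes."""
    missing = sorted(gene for gene in panel if gene not in measured)
    if missing:
        raise ValueError(f"sample lacks expression values for: {missing}")
    return {gene: float(measured[gene]) for gene in panel}


def model_calculation_module(expression: Mapping[str, float],
                             coef: Mapping[str, float]) -> float:
    """Risk score: sum of expression value times weight coefficient."""
    return sum(expression[gene] * coef[gene] for gene in coef)


def output_prediction_module(score: float, cutoff: float) -> PrognosisReport:
    """Assumption: a score equal to the cutoff is reported as 'good'."""
    return PrognosisReport(score, "poor" if score > cutoff else "good")


def run_system(measured: Mapping[str, float], coef: Mapping[str, float],
               cutoff: float = 4.14) -> PrognosisReport:
    expression = data_collection_module(measured, coef)
    score = model_calculation_module(expression, coef)
    return output_prediction_module(score, cutoff)


# Hypothetical two-gene panel; "UNUSED" is measured but not part of the panel.
report = run_system({"GENE_A": 2.0, "GENE_B": 4.0, "UNUSED": 9.0},
                    {"GENE_A": 1.0, "GENE_B": 0.5})
```

Keeping the modules as separate functions mirrors the text's division of responsibilities and lets each stage be tested on its own.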
In another aspect, the present invention also relates to the use of the reagent for detecting the expression value of the genes described in table 1 in the preparation of kits and systems for predicting liver cancer prognosis.
In some embodiments, the kit or system is the kit or system of the invention as described above.
In some embodiments, the reagent for detecting expression values is selected from one or more of nucleic acid extraction reagents, PCR reagents, genome/transcriptome sequencing reagents, gene-specific primers or probes, and antibodies specific for gene expression products. In some embodiments, the reagent is a reagent for performing one or more of the following: real-time fluorescent quantitative PCR, northern blotting, western blotting, genome sequencing, transcriptome sequencing, biological mass spectrometry or specific antibody detection.
In another aspect, the invention also relates to a computing device comprising:
at least one processing unit; and
at least one memory coupled to the processing unit and storing instructions for execution by the processing unit, wherein the instructions, when executed, enable the device to predict the prognosis of a liver cancer patient, the prediction comprising the steps of:
a) calculating a risk score for the patient based on the collected and determined expression values for 62 genes in the evaluation set of genes in table 1 for the patient sample; the risk score calculation formula is as described above;
b) predicting the prognosis condition of the patient according to the risk score data of the liver cancer patient, wherein the lower the risk score of the patient is, the better the prognosis is; comparing the risk score data with a defined value, if the risk score data is higher than the defined value, the prognosis is predicted to be poor, and if the risk score data is lower than the defined value, the prognosis is predicted to be good.
Preferably, the defined value is about 4.14.
In another aspect, the present invention also relates to a computer readable storage medium storing a computer program executable by a machine to perform the steps of predicting a prognosis for a patient with liver cancer, the steps comprising:
a) calculating a risk score for the patient based on the collected and determined expression values for 62 genes in the evaluation set of genes in table 1 for the patient sample; the risk score calculation formula is as described above;
b) predicting the prognosis condition of the patient according to the risk score data of the liver cancer patient, wherein the lower the risk score of the patient is, the better the prognosis is; comparing the risk score data with a defined value, if the risk score data is higher than the defined value, the prognosis is predicted to be poor, and if the risk score data is lower than the defined value, the prognosis is predicted to be good.
Preferably, the defined value is about 4.14.
The beneficial effects of the invention are as follows:
The invention provides an evaluation gene set for liver cancer prognosis prediction and a corresponding kit, which can be applied more reliably in clinical practice. The characteristic gene set comprises 62 immune-related genes covering 22 types of immune cells, and its predictive performance was verified in the test set. Compared with methods that relate prognosis to the mutation of a single gene, the method of the invention is less constrained by the mutation frequency in the population and by the effect of limited sample collection on the stability of survival analysis results, and can predict the prognosis of a liver cancer patient more accurately; it can also be applied in clinical testing to provide a scientific basis for medical decisions.
Drawings
FIG. 1: Flow chart for screening the characteristic gene set.
FIG. 2: Results for the selected characteristic gene set. Patients were divided into a high-risk group (high group) and a low-risk group (low group) according to the median of their risk scores for the 62 selected genes; FIG. 2 shows the survival probability of the high-risk and low-risk groups in the training set.
FIG. 3: Results for the selected characteristic gene set. Patients were divided into a high-risk group (high group) and a low-risk group (low group) according to the median of their risk scores for the 62 selected genes; FIG. 3 shows the survival probability of the high-risk and low-risk groups in the test set.
FIG. 4: Results for the randomly selected gene set. Patients were divided into a high-risk group (high group) and a low-risk group (low group) according to the median of their risk scores for the 62 randomly selected genes; FIG. 4 shows the survival probability of the high-risk and low-risk groups in the training set.
FIG. 5: Results for the randomly selected gene set. Patients were divided into a high-risk group (high group) and a low-risk group (low group) according to the median of their risk scores for the 62 randomly selected genes; FIG. 5 shows the survival probability of the high-risk and low-risk groups in the test set.
Detailed Description
The following describes embodiments of the present invention with reference to the drawings. For the specific methods or materials used in the embodiments, those skilled in the art can make routine alternatives according to the existing technologies based on the technical idea of the present invention, and not limited to the specific embodiments of the present invention.
Example 1: Establishing a model by LASSO regression to obtain the selected characteristic gene set
Data processing and screening of immune genes related to liver cancer prognosis
Gene expression data for hepatocellular carcinoma, together with clinical data such as overall survival time and survival endpoint, were downloaded from The Cancer Genome Atlas (TCGA); the gene expression data comprise 363 hepatocellular carcinoma samples and 60483 genes. To construct a liver cancer prognosis prediction model, 547 immune-related genes were selected from the 60483 genes for the subsequent screening of a gene set for predicting patient prognosis.
Construction of liver cancer prognosis model
The 363 liver cancer patient samples in the TCGA dataset were randomly divided, with reference to clinical stage, into a training set (80%, 290 samples) and a test set (20%, 73 samples). Least Absolute Shrinkage and Selection Operator (LASSO) regression analysis was then performed on the training set samples using the 547 immune-related genes.
The LASSO regression analysis and the construction of the multi-risk prediction model were carried out with the R package glmnet. The cv.glmnet function was applied to the training set, specifying a LASSO regression model and a Cox model and using the C-index as the evaluation metric, to obtain the penalty coefficient for screening the characteristic gene set; the penalty coefficient was 0.033. The resulting model was validated by 20-fold cross-validation. Finally, the 62 genes whose weight coefficients were not 0 were selected as the final characteristic gene set. A multi-risk prediction model for predicting patient prognosis was thus established; see Table 2.
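The random 80/20 split with reference to clinical stage can be illustrated as a stratified split. The sketch below is an assumption about how such a split might be done, not the authors' procedure: the sample IDs and stage labels are synthetic, and because rounding is applied within each stage, the resulting counts may differ slightly from the 290/73 reported in the text.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(sample_ids: Sequence[str], stages: Sequence[str],
                     train_fraction: float = 0.8,
                     seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly split samples into train/test sets within each clinical stage."""
    by_stage: Dict[str, List[str]] = defaultdict(list)
    for sample_id, stage in zip(sample_ids, stages):
        by_stage[stage].append(sample_id)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train: List[str] = []
    test: List[str] = []
    for stage in sorted(by_stage):
        members = by_stage[stage]
        rng.shuffle(members)
        n_train = round(len(members) * train_fraction)
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return train, test


# Synthetic illustration: 363 hypothetical sample IDs with made-up stage labels.
ids = [f"S{i:03d}" for i in range(363)]
stage_labels = [("I", "II", "III", "IV")[i % 4] for i in range(363)]
train_ids, test_ids = stratified_split(ids, stage_labels)
```

Splitting within each stage keeps the stage distribution of the test set close to that of the training set, which is one plausible reading of "with reference to clinical stage".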
Table 2: The 62 genes of the selected characteristic evaluation gene set and their weight coefficients
(Table 2 is provided as an image in the original publication and is not reproduced here.)
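To show why a LASSO penalty leaves only some genes with non-zero weights, the sketch below applies the soft-thresholding operator used in coordinate-descent LASSO solvers and then keeps the genes whose shrunken coefficient is not 0. It is a deliberately simplified illustration: it does not fit a Cox model or reproduce cv.glmnet, and the gene names and raw effect sizes are hypothetical. Only the penalty value 0.033 comes from the text.

```python
import math
from typing import Dict


def soft_threshold(z: float, penalty: float) -> float:
    """Shrink z toward zero by `penalty`; values within the penalty become 0."""
    return math.copysign(max(abs(z) - penalty, 0.0), z)


def select_nonzero(coefficients: Dict[str, float]) -> Dict[str, float]:
    """Keep only genes whose weight coefficient is not zero."""
    return {gene: c for gene, c in coefficients.items() if c != 0.0}


PENALTY = 0.033  # penalty coefficient reported in the text

# Hypothetical unpenalised effects for four made-up genes.
raw_effects = {"GENE_A": 0.50, "GENE_B": -0.20, "GENE_C": 0.02, "GENE_D": -0.01}
shrunk = {gene: soft_threshold(z, PENALTY) for gene, z in raw_effects.items()}
selected = select_nonzero(shrunk)  # GENE_C and GENE_D are shrunk to exactly 0
```

In this toy case the two weak effects fall inside the penalty band and are dropped, which is the same mechanism by which 62 of the 547 candidate genes would retain non-zero weights.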
The weight coefficient of each gene was used to calculate the risk score of the characteristic gene set, which is the sum of the products of the expression values of the 62 genes and their respective weights.
The total expression value of the characteristic gene set, namely the Risk Score, is calculated as follows:
$$\text{Risk Score} = \sum_{i=1}^{n} E_i \times \mathrm{Coef}_i$$
that is, the sum over all genes of each gene's expression value multiplied by its weight coefficient, wherein $E_i$ is the expression value of each gene, $\mathrm{Coef}_i$ is the weight coefficient of each gene, and $n$ is the number of genes in the characteristic gene set ($n = 62$ in the present invention); the weight coefficient of each gene is shown in Table 2.
Verifying model accuracy
After model training, the established model and the selected gene set were applied to the test set with a prediction function to assess their predictive ability on the test set data.
Using the formula above, the total expression values (risk scores) of the patients in the training set were calculated and sorted by size, and the patients in the training set and the test set were divided by the median value, 4.14, into a high-risk group (high group) and a low-risk group (low group). The survival probabilities of the high and low groups were compared by plotting patient survival time (days) on the abscissa against survival probability on the ordinate.
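The grouping by median and the survival curve described above can be sketched with the Kaplan-Meier estimator, which is the usual way to obtain survival probability as a function of time from censored data. The text does not name the estimator, so its use here is an assumption; the scores, times and event flags are synthetic.

```python
from statistics import median
from typing import List, Sequence, Tuple


def split_by_cutoff(scores: Sequence[float], cutoff: float) -> List[str]:
    """Label each patient 'high' (score above cutoff) or 'low' (otherwise)."""
    return ["high" if s > cutoff else "low" for s in scores]


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the patient died at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # deaths and censored patients leave the risk set
    return curve


# Synthetic risk scores: the median plays the role of the 4.14 cutoff in the text.
scores = [3.0, 5.0, 4.0, 6.0]
cutoff = median(scores)
groups = split_by_cutoff(scores, cutoff)

# Synthetic follow-up data (days) for one group.
curve = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
```

Plotting one such curve per group, with time on the abscissa and survival probability on the ordinate, gives the kind of comparison shown in the figures.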
Multi-risk model prediction was performed with the coxph function of the R package survival. The inputs to the function were each patient's group, survival time and survival status. The results were then examined with the log-rank test. For the training set, p < 0.0001 and the hazard ratio of the low-risk group was 0.11 (95% CI 0.066-0.18). For the test set, p = 0.006 and the hazard ratio of the low-risk group was 0.3 (95% CI 0.13-0.75).
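The comparison of the two groups' survival can be illustrated with a plain-Python two-group log-rank test. This is a sketch under stated assumptions rather than the analysis performed in the text (which used the R package survival): the data are synthetic, and the p-value uses the chi-square distribution with one degree of freedom, computed through the complementary error function.

```python
import math
from typing import Sequence, Tuple


def logrank_test(times_a: Sequence[float], events_a: Sequence[int],
                 times_b: Sequence[float], events_b: Sequence[int]) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank variance is zero; groups cannot be compared")
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return chi2, p_value


# Synthetic groups: every patient in group A dies before anyone in group B.
chi2, p_value = logrank_test([1, 2, 3, 4], [1, 1, 1, 1],
                             [10, 11, 12, 13], [1, 1, 1, 1])
```

A small p-value indicates that the two survival curves differ more than would be expected by chance, which is how the reported p-values for the high-risk and low-risk groups are read.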
Fig. 2 shows the survival probability of the high-risk group and the low-risk group of the training set. The two groups of the training set differ significantly: the survival probability of the high-risk group is significantly lower than that of the low-risk group (p < 0.0001).
The C-index of the training set was then calculated and found to be 0.83. The C-index, or concordance index, is used to evaluate the predictive power of a model; it is the proportion, among all pairs of patients, of pairs for which the predicted result is consistent with the actual result.
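The concordance index can be computed directly from its pairwise definition. The sketch below follows Harrell's formulation as an assumption (the text does not say which variant was used): a pair is comparable when the patient with the shorter observed time had an event, tied risk scores count as half-concordant, and pairs with tied times are skipped for simplicity. The data are synthetic.

```python
from typing import Sequence


def concordance_index(times: Sequence[float], events: Sequence[int],
                      scores: Sequence[float]) -> float:
    """Fraction of comparable patient pairs ranked correctly by risk score.

    A higher score is expected to go with a shorter survival time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Patient i must have the shorter time and an observed event.
            if times[i] < times[j] and events[i]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Synthetic data: risk scores perfectly ordered against survival time.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [4.0, 3.0, 2.0, 1.0])
reversed_order = concordance_index(times, events, [1.0, 2.0, 3.0, 4.0])
```

A value of 1 means every comparable pair is ordered correctly, 0.5 corresponds to random ordering, and values such as 0.83 or 0.7 fall in between.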
Test set data verification of liver cancer prognosis model
To verify the constructed liver cancer prognosis model, the risk scores of the liver cancer patients in the test set were calculated by the same procedure, using the same risk score formula and weight coefficients, and the test set was likewise divided into a high group and a low group using the same cutoff value, so as to verify the accuracy of the liver cancer prognosis model based on the evaluation gene set of 62 genes. Fig. 3 shows the survival probability of the high-risk group and the low-risk group of the test set. As can be seen from Fig. 3, the survival probability of the high-risk group is significantly lower than that of the low-risk group (p = 0.006); that is, the test set data support the reliability of the prognostic model. The C-index of the test set was calculated to be 0.7.
Example 2: predictive power comparison of selected signature gene sets to random gene sets
To further verify the validity of the selected evaluation gene set of 62 genes, another 62 genes were randomly selected from the 547 genes (excluding the 62 genes selected above) to form a "random gene set", which was compared with the selected "evaluation gene set"; the genes of the random gene set and their weight coefficients are shown in Table 3.
Table 3: The 62 genes of the random gene set and their weight coefficients
(Table 3 is provided as an image in the original publication and is not reproduced here.)
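Drawing the random gene set amounts to sampling 62 genes without replacement from the candidate genes that are not already in the selected set. A minimal sketch follows, assuming hypothetical gene names and an arbitrary seed; it does not reproduce the actual genes of Table 3.

```python
import random
from typing import List, Sequence


def draw_random_gene_set(candidates: Sequence[str], selected: Sequence[str],
                         k: int = 62, seed: int = 0) -> List[str]:
    """Sample k genes without replacement, excluding the already selected genes."""
    pool = sorted(set(candidates) - set(selected))
    if len(pool) < k:
        raise ValueError("not enough remaining genes to sample from")
    return random.Random(seed).sample(pool, k)


# Hypothetical names standing in for the 547 immune-related genes.
candidate_genes = [f"GENE_{i:03d}" for i in range(547)]
selected_genes = candidate_genes[:62]
random_genes = draw_random_gene_set(candidate_genes, selected_genes)
```

Sorting the pool before sampling makes the draw reproducible for a given seed, since set iteration order is not guaranteed.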
Following the procedure described in Example 1, the patients were again divided into a training set (80%) and a test set (20%), and the risk scores of the test set patients under the random model were calculated using each gene in the random gene set and its weight coefficient in Table 3. The risk score for the random gene set was calculated as in Example 1. The C-index was calculated for the training set and the test set: the training set C-index was 0.86 and the test set C-index was 0.55.
The training set patients were likewise divided into a high-risk group and a low-risk group by the median risk score of the training set (1.66 by calculation). Fig. 4 shows the survival probability of the high-risk group and the low-risk group of the training set for the "random gene set". The two groups of the training set differ significantly: the survival probability of the high-risk group is significantly lower than that of the low-risk group (p < 0.0001).
However, the same risk score formula and weight coefficients were then used to calculate the risk scores of the liver cancer patients in the test set for the random gene set, and the test set was likewise divided into high and low groups by the median value (1.66) obtained from the training set, in order to verify the accuracy of the liver cancer prognosis model based on the 62-gene "random gene set". Fig. 5 shows the survival probability of the high-risk group and the low-risk group of the test set for the "random gene set"; the survival probability of the high-risk group does not differ significantly from that of the low-risk group (p = 0.48). Verification on the test set thus shows that the random gene set cannot effectively predict the prognosis of liver cancer patients.
As shown by the comparison between the selected gene set and the random gene set, the estimated gene set of 62 specific genes constructed by the invention can effectively predict the prognosis of the liver cancer patient, but the randomly selected gene set cannot be realized.
To predict the prognostic risk of liver cancer patients accurately, the invention identifies 62 immune-related genes for predicting liver cancer prognosis, so that high-risk and low-risk liver cancer patients can be effectively distinguished. These immune-related genes can be developed into potential in vitro diagnostic products for predicting and detecting the prognosis of liver cancer patients, enabling preventive medication or treatment and providing an accurate basis for decisions on further auxiliary treatment of liver cancer patients.
The embodiments described above represent only several embodiments of the present invention. Although they are described specifically and in detail, they are not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of protection of the present invention. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (11)

1. An evaluation gene set for predicting prognosis of a liver cancer patient, wherein the evaluation gene set comprises 62 genes as follows:
Figure DEST_PATH_IMAGE001
2. A kit for predicting prognosis of a patient with liver cancer, comprising a reagent for detecting the expression level of a gene; wherein the genes are 62 genes in the evaluation gene set of claim 1.
3. The kit of claim 2, further comprising one or more of nucleic acid extraction reagents, PCR reagents, genomic/transcriptome sequencing reagents, gene-specific primers or probes, antibodies specific for gene expression products.
4. A system for predicting prognosis in a patient with liver cancer, comprising the following modules:
a) a data collection module: collecting the sample of the patient, measuring the gene expression value of the sample, and outputting the expression value data of each gene to a model calculation module; wherein the genes are 62 genes in the evaluation gene set of claim 1;
b) a model calculation module: calculating the total expression value of the 62 genes of the liver cancer patient, namely the risk score (Risk Score); the risk score calculation formula is as follows:
Figure DEST_PATH_IMAGE002
wherein E_i is the expression value of each gene, Coef_i is the weight coefficient corresponding to each gene, and n is the total number of genes, namely 62;
wherein, each gene and the corresponding weight coefficient are as follows:
Figure DEST_PATH_IMAGE003
c) an output prediction module: predicting the prognosis of the patient according to the risk score data of the liver cancer patient, wherein the lower the risk score of the patient is, the better the prognosis is; and comparing the risk score with a defined value, and if the risk score is higher than the defined value, outputting that the prognosis is not good, and if the risk score is lower than the defined value, outputting that the prognosis is good.
5. The system of claim 4, the defined value being 4.14.
6. A computing device, comprising:
at least one processing unit; and
at least one memory coupled to the processing unit and storing instructions for execution by the processing unit, the instructions, when executed, enabling the device to predict the prognosis of a liver cancer patient, the prediction comprising the steps of:
a) calculating a risk score for the patient based on the collected and determined expression values of the genes in the patient sample; the genes are 62 genes in the evaluation gene set of claim 1; the risk score calculation formula is as follows:
Figure DEST_PATH_IMAGE004
wherein E_i is the expression value of each gene, Coef_i is the weight coefficient corresponding to each gene, and n is the total number of genes, namely 62; wherein each gene and the corresponding weight coefficient are as shown in claim 4;
b) predicting the prognosis condition of the patient according to the risk score of the liver cancer patient, wherein the lower the risk score of the patient is, the better the prognosis is; and comparing the risk score with a defined value, if the risk score is higher than the defined value, the prognosis is predicted to be poor, and if the risk score is lower than the defined value, the prognosis is predicted to be good.
7. The computing device of claim 6, wherein the defined value is 4.14.
8. A computer-readable storage medium storing a computer program executable by a machine to perform steps for predicting prognosis of a patient with liver cancer, the steps comprising:
a) calculating a risk score for the patient based on the collected and determined gene expression values for the patient sample; wherein the genes are 62 genes in the evaluation gene set of claim 1; the risk score calculation formula is as follows:
Figure DEST_PATH_IMAGE005
wherein E_i is the expression value of each gene, Coef_i is the weight coefficient corresponding to each gene, and n is the total number of genes, namely 62; wherein each gene and the corresponding weight coefficient are as shown in claim 4;
b) predicting the prognosis condition of the patient according to the risk score data of the liver cancer patient, wherein the lower the risk score of the patient is, the better the prognosis is; comparing the risk score data with a defined value, if the risk score data is higher than the defined value, the prognosis is predicted to be poor, and if the risk score data is lower than the defined value, the prognosis is predicted to be good.
9. The computer-readable storage medium of claim 8, wherein the defined value is 4.14.
10. Use of a reagent for detecting gene expression levels in the preparation of a kit or system for predicting prognosis in a patient with liver cancer; wherein the genes are 62 genes in the evaluation gene set of claim 1.
11. The use according to claim 10, wherein the kit is a kit according to claim 2 or 3; the system is the system of claim 4 or 5.
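The model calculation and output prediction steps recited in claims 4 to 9 can be sketched as below. The formula image is not reproduced in this text, so the linear weighted-sum form (the sum over i of E_i times Coef_i) is an assumption inferred from the definitions of E_i, Coef_i and n. The gene names and weights in the example are hypothetical stand-ins for the 62 genes and coefficients of claim 4, and the behaviour when the score exactly equals the defined value is likewise an assumption, since the claims do not address it.

```python
from dataclasses import dataclass

DEFINED_VALUE = 4.14  # defined value recited in claims 5, 7 and 9


@dataclass
class Prediction:
    risk_score: float
    prognosis: str


def compute_risk_score(expression, coefficients):
    """ASSUMED form: sum of E_i * Coef_i over every gene in the gene set."""
    missing = [gene for gene in coefficients if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(expression[gene] * coefficients[gene] for gene in coefficients)


def predict_prognosis(expression, coefficients, defined_value=DEFINED_VALUE):
    """Compare the risk score with the defined value, as in step b) of the claims."""
    score = compute_risk_score(expression, coefficients)
    if score > defined_value:
        label = "poor"
    elif score < defined_value:
        label = "good"
    else:
        label = "undetermined"  # assumption: equality is not covered by the claims
    return Prediction(risk_score=score, prognosis=label)


# Hypothetical two-gene stand-in for the 62-gene coefficient table of claim 4.
toy_coefficients = {"GENE_A": 1.0, "GENE_B": 2.0}
high = predict_prognosis({"GENE_A": 1.0, "GENE_B": 2.0}, toy_coefficients)
low = predict_prognosis({"GENE_A": 1.0, "GENE_B": 1.0}, toy_coefficients)
```

Raising an error when an expression value is missing, rather than silently treating it as zero, avoids understating a patient's risk score when a gene was not measured.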
CN202110916132.9A 2021-08-11 2021-08-11 Evaluation gene set and kit for predicting liver cancer prognosis Active CN113355426B (en)

Publications (2)

Publication Number Publication Date
CN113355426A (en) 2021-09-07
CN113355426B (en) 2021-11-09

Family

ID=77522945

Country Status (1)

Country Link
CN (1) CN113355426B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Evaluation gene set and kit for predicting liver cancer prognosis

Granted publication date: 20211109

Pledgee: Agricultural Bank of China Limited Shanghai Free Trade Zone Branch

Pledgor: Shanghai Zhiben medical laboratory Co.,Ltd.|ORIGIMED TECHNOLOGY (SHANGHAI) Co.,Ltd.|Zhiben medical technology (Chongqing) Co.,Ltd.

Registration number: Y2024980012757
