CN115747329A - Gene marker combination, kit and system for predicting tumor progression and prognosis - Google Patents

Publication number
CN115747329A
CN115747329A CN202211094688.5A CN202211094688A CN115747329A CN 115747329 A CN115747329 A CN 115747329A CN 202211094688 A CN202211094688 A CN 202211094688A CN 115747329 A CN115747329 A CN 115747329A
Authority
CN
China
Prior art keywords
gene
prognosis
marker combination
tumor progression
gene marker
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211094688.5A
Other languages
Chinese (zh)
Other versions
CN115747329B (en)
Inventor
吴玲祥
吴维
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ankai Life Technology Suzhou Co ltd
Original Assignee
Ankai Life Technology Suzhou Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ankai Life Technology Suzhou Co ltd
Priority to CN202311514765.2A (CN117385040A)
Priority to CN202311515115.XA (CN117385042A)
Priority to CN202311514772.2A (CN117385041A)
Priority to CN202211094688.5A (CN115747329B)
Publication of CN115747329A
Application granted
Publication of CN115747329B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention discloses a gene marker combination for predicting tumor progression and prognosis, belonging to the field of medical molecular biology. The gene marker combination comprises at least one gene of S100A10, FOSL2, SPP1, CAV1, ANXA1, VIM, CD44, SERPINH1, LGALS3, CEBPB, ATF5 and LGALS1. The invention also discloses a kit and a system based on the gene marker combination. The invention can be used for predicting the development and prognosis of various tumors and has very high clinical application value.

Description

Gene marker combination, kit and system for predicting tumor progression and prognosis
Technical Field
The invention belongs to the field of medical molecular biology, and particularly relates to a gene marker combination, a kit and a system for predicting tumor progression and prognosis.
Background
Glioblastoma (GBM) is the most common invasive brain tumor in adults. Because these tumors progress malignantly, improving the prognosis of GBM is a major challenge. The median survival time of GBM patients under standard treatment regimens is approximately 15 months, and most patients relapse rapidly, within ten months of initial treatment.
Several approaches have been developed in the prior art to explore the onset and progression of GBM. For example, Ozawa et al. analyzed human GBM data and found that chromosomal changes may be the cause of GBM (Ozawa, Tatsuya, et al. "Most human non-GCIMP glioblastoma subtypes evolve from a common proneural-like precursor glioma." Cancer Cell 26.2 (2014): 288-300). Other studies have explored the spatiotemporal variation of GBM using longitudinal samples: the progression of GBM is inferred from paired primary and recurrent samples, and these techniques reveal therapy-driven genomic features of GBM patients.
Although the prior art has advanced the understanding of GBM progression and of treatment failure, it is based primarily on genomic changes that follow treatment. There is still a lack of suitable models and methods for predicting how tumor cells progress during the natural course of the disease.
Disclosure of Invention
In order to solve at least one of the above technical problems, the invention obtains an RNA sample from tumor tissue, obtains transcriptome sequencing data by whole transcriptome high-throughput sequencing, preprocesses the raw (off-line) sequencer data, aligns the data to a human reference genome, and performs expression quantification to obtain expression profile information. The expression profile data are then used to derive combinations of gene markers that can be used to predict tumor progression and/or prognosis. By conducting the same analysis on a variety of tumor samples, the inventors unexpectedly obtained a combination of gene markers suitable for predicting the progression and/or prognosis of various tumors, thereby completing the present invention.
In the present invention, high-throughput whole transcriptome sequencing refers to sequencing several hundred thousand to several million RNA molecules at a time. The transcriptome is the sum of all RNAs that a particular cell can transcribe in a given functional state, mainly including mRNA and non-coding RNAs. Transcriptome research is the basis and starting point of research on gene function and structure. With next-generation high-throughput sequencing, almost all transcript sequence information of a specific tissue or organ of a species in a given state can be obtained comprehensively and quickly, and the method is widely applied in basic research, clinical diagnosis, drug research and development, and other fields.
In the present invention, a gene expression profile is obtained by constructing an unbiased cDNA library from cells or tissues in a specific state, performing large-scale cDNA sequencing to collect cDNA sequence fragments, and qualitatively and quantitatively analyzing the composition of the mRNA population, so as to describe the types and abundance of genes expressed by the specific cells or tissues in that state. The data table compiled in this way is called a gene expression profile.
In the invention, the reads obtained by high-throughput sequencing are all sequence fragments of 300-500 bp and cannot be used directly for downstream analysis. Reads from mRNA sequencing therefore need to be aligned to the human reference genome to determine which gene on the reference genome each mRNA sequence fragment originated from.
In the present invention, the human reference genome refers to the human genome map and preliminary analysis results published on February 12, 2001 by the international Human Genome Project, in which scientists from 6 countries participated. Currently, the reference sequence may be the human genome sequence hg38, hg19 or another version. In an embodiment of the invention, the human reference genome is hg19.
The technical scheme adopted by the invention is as follows:
In a first aspect, the invention provides a gene marker combination comprising at least one gene of S100A10, FOSL2, SPP1, CAV1, ANXA1, VIM, CD44, SERPINH1, LGALS3, CEBPB, ATF5 and LGALS1.
In the present invention, the terms "gene marker" and "signature" have equivalent meanings and refer to genes involved in tumor progression and/or prognosis.
The protein encoded by the S100A10 (S100 Calcium Binding Protein A10) gene is a member of the S100 protein family, whose members contain 2 EF-hand calcium-binding motifs. S100 calcium-binding proteins are located in the cytoplasm and/or nucleus of a variety of cells and are involved in the regulation of many cellular processes, such as cell cycle progression and differentiation. The S100 genes comprise at least 13 members, which are located as a cluster on chromosome 1q21.
FOSL2 (FOS Like 2, AP-1 Transcription Factor Subunit) is a member of the Fos gene family; the remaining members are FOS, FOSB, and FOSL1. The leucine zipper proteins encoded by these genes can dimerize with proteins of the JUN family, thereby forming the transcription factor complex AP-1. FOS proteins are therefore considered modulators of cell proliferation, differentiation and transformation.
The protein encoded by the SPP1 (Secreted Phosphoprotein 1) gene is involved in the attachment of osteoclasts to mineralized bone matrix. The SPP1 protein is secreted and binds hydroxyapatite with high affinity. The osteoclast vitronectin receptor is present in the cell membrane and may be involved in binding to SPP1 protein. The SPP1 protein is also a cytokine that up-regulates interferon-gamma and interleukin-12 expression.
The scaffold protein encoded by the CAV1 (Caveolin 1) gene is a major component of the caveolae plasma membranes found in most cell types. The CAV1 protein links integrin subunits to the tyrosine kinase FYN, which is the initial step in coupling integrins to the Ras-ERK pathway and promoting cell cycle progression. The CAV1 gene is a candidate tumor suppressor gene and a negative regulator of the Ras-p42/44 mitogen-activated kinase cascade. CAV1 and CAV2 are adjacent to each other on chromosome 7 and express colocalizing proteins that form a stable hetero-oligomeric complex.
The ANXA1 (Annexin A1) gene encodes a membrane-localized protein that binds to phospholipids. ANXA1 protein inhibits phospholipase A2 and has anti-inflammatory activity.
The VIM (Vimentin) gene encodes a type III intermediate filament protein. Intermediate filaments, together with microtubules and actin microfilaments, form the cytoskeleton. The VIM protein is responsible for maintaining cell shape and cytoplasmic integrity and for stabilizing cytoskeletal interactions. The VIM protein is involved in neurogenesis and cholesterol transport and functions as an organizer of many other key proteins involved in cell attachment, migration and signaling.
The protein encoded by the CD44 (CD44 Molecule) gene is a cell surface glycoprotein involved in cell-cell interactions, cell adhesion and migration. The CD44 protein is a receptor for hyaluronic acid (HA) and can also interact with other ligands, such as osteopontin, collagen and matrix metalloproteinases (MMPs). The CD44 protein is involved in a variety of cellular functions, including lymphocyte activation, recirculation and homing, hematopoiesis, and tumor metastasis.
The SERPINH1 (Serpin Family H Member 1) gene encodes a member of the serpin superfamily of serine protease inhibitors. The SERPINH1 protein localizes to the endoplasmic reticulum and plays a role in collagen biosynthesis as a collagen-specific chaperone. Nucleotide polymorphisms of the SERPINH1 gene may be associated with preterm birth caused by preterm premature rupture of membranes, and a pseudogene of this gene is located on the short arm of chromosome 9.
The LGALS3 (Galectin 3) gene encodes one of the Galectin family members of carbohydrate binding proteins. Members of this family of proteins have affinity for beta-galactosides. The LGALS3 protein is characterized by an N-terminal proline-rich tandem repeat domain and a single C-terminal carbohydrate recognition domain. LGALS3 proteins can self-associate through the N-terminal domain, allowing binding to multivalent sugar ligands. The LGALS3 protein localizes to the extracellular matrix, cytoplasm, and nucleus, and plays a role in a number of cellular functions, including apoptosis, innate immunity, cell adhesion, and T cell regulation.
CEBPB (CCAAT Enhancer Binding Protein Beta) is an intron-free gene that encodes a transcription factor comprising a basic leucine zipper (bZIP) domain. The CEBPB protein functions as a homodimer, but may also form heterodimers with CCAAT/enhancer binding proteins α, δ, and γ. The activity of CEBPB proteins is important in the regulation of genes involved in immune and inflammatory responses, as well as other processes.
The ATF5 (Activating Transcription Factor 5) gene enables several functions, including RNA polymerase II-specific DNA-binding transcription activator activity, sequence-specific DNA binding to RNA polymerase II transcriptional regulatory regions, and tubulin binding activity. It is involved in a number of processes, including adipocyte differentiation, regulation of cell cycle processes, and regulation of transcription.
The LGALS1 (Galectin 1) gene encodes one of the Galectin family members of carbohydrate binding proteins. The LGALS1 protein can be used as an autocrine negative growth factor for regulating cell proliferation.
Each gene in the above gene marker combination has a certain value in predicting tumor progression and/or prognosis, and one skilled in the art can select any combination for prediction, for example any 1 gene, any 2 genes, any 3 genes, any 4 genes, any 5 genes, ..., any 10 genes, or any 11 genes. Although the embodiments of the present invention only exemplify the results of predicting tumor progression with single genes and some combinations, any of the above combinations can in fact achieve good prediction results.
In some embodiments of the invention, the gene marker combination comprises FOSL2, ANXA1 and SERPINH1. In other embodiments of the invention, the gene marker combination comprises FOSL2, ANXA1, SERPINH1, VIM and CAV1. In still further embodiments of the invention, the gene marker combination comprises S100A10, FOSL2, SPP1, CAV1, ANXA1, VIM, CD44, SERPINH1, LGALS3, CEBPB, ATF5, and LGALS1.
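As an illustration of how many sub-combinations of the 12-gene panel exist, the following minimal Python sketch enumerates subsets of a given size. The function names `marker_subsets` and `count_all_subsets` are hypothetical and are not part of the invention; the sketch only lists candidate combinations and says nothing about their predictive performance.

```python
from itertools import combinations

# The 12 marker genes listed in the first aspect.
MARKER_GENES = [
    "S100A10", "FOSL2", "SPP1", "CAV1", "ANXA1", "VIM",
    "CD44", "SERPINH1", "LGALS3", "CEBPB", "ATF5", "LGALS1",
]


def marker_subsets(size):
    """Return every subset of the marker panel with exactly `size` genes."""
    return list(combinations(MARKER_GENES, size))


def count_all_subsets():
    """Count all non-empty subsets of the panel (sizes 1 through 12)."""
    return sum(len(marker_subsets(k)) for k in range(1, len(MARKER_GENES) + 1))
```

For example, `marker_subsets(3)` contains the three-gene embodiment (FOSL2, ANXA1, SERPINH1) among the 220 possible three-gene combinations.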
In a second aspect, the present invention provides the use of an expression level detection reagent of the gene marker combination according to the first aspect of the present invention in the preparation of a kit for predicting tumor progression and/or prognosis.
In the present invention, obtaining the expression level of the gene marker combination refers to obtaining the expression level of each gene in the gene marker combination. In some embodiments of the invention, the expression level is a relative expression level, i.e., relative to the expression level of an internal reference gene. The reference gene may be any reference gene known or commonly used in the art. The expression level of the reference gene may be the expression level of one reference gene, or may be an average value, a mode value, or a median value of the expression levels of a plurality of reference genes.
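The relative expression level described above can be sketched as follows. This is a minimal illustration that assumes linear-scale abundance values (not Ct values) and supports only the mean and median summaries; the function name `relative_expression` and the reference genes used in the usage note are assumptions chosen for illustration.

```python
from statistics import mean, median


def relative_expression(target_levels, reference_levels, summary="mean"):
    """Express each target gene relative to a summary of the reference genes.

    target_levels:    dict of gene name -> linear-scale expression level.
    reference_levels: dict of reference gene name -> linear-scale level.
    summary:          "mean" or "median" of the reference gene levels.
    """
    if not reference_levels:
        raise ValueError("at least one reference gene is required")
    values = list(reference_levels.values())
    if summary == "mean":
        ref = mean(values)
    elif summary == "median":
        ref = median(values)
    else:
        raise ValueError("summary must be 'mean' or 'median'")
    if ref <= 0:
        raise ValueError("reference expression must be positive")
    # Divide every target level by the single reference value.
    return {gene: level / ref for gene, level in target_levels.items()}
```

Calling `relative_expression({"VIM": 20.0}, {"GAPDH": 10.0, "ACTB": 10.0})` would give VIM a relative level of 2.0; GAPDH and ACTB are used here only as familiar examples of reference genes.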
In some embodiments of the invention, the expression level detection reagent for the gene marker combination is a primer and/or a probe. Further, the expression level of the gene marker combination is obtained using at least one of whole transcriptome sequencing, capture sequencing and qRT-PCR. For example, one skilled in the art can design probes for each gene in the gene marker combination, prepare the probes as a gene chip, and perform capture sequencing using the gene chip. As another example, primers can be designed for each gene in the gene marker combination, and the amplified sequences can be sequenced after PCR amplification. As a further example, primers and probes can be designed for each gene in the gene marker combination, and the expression level of each gene can be detected by qRT-PCR. Alternatively, whole transcriptome high-throughput sequencing can be carried out directly; after the sequencing data are obtained, they are aligned to a human reference genome and expression is quantified to obtain expression profile information. Of course, those skilled in the art can obtain the expression level of each gene in the gene marker combination by any other method.
In the present invention, the expression level is detected based on an RNA sample. In particular, RNA samples can be extracted for detection after obtaining the biological sample, for example by whole transcriptome sequencing, capture sequencing or qRT-PCR. In other embodiments of the invention, single cell RNA sequencing (scRNA-seq) can also be used to obtain the expression levels of the genes in a single tumor cell.
In some embodiments of the invention, the biological sample is a tissue, organ or bodily fluid. The bodily fluids include, but are not limited to, blood, serum, plasma, interstitial fluid, lymph fluid, pleural fluid, peritoneal fluid, cerebrospinal fluid, urine, saliva, tears, semen, and vaginal fluid. It is worth noting that the applicable tissue or bodily fluid may differ between tumors; for example, for GBM, any one of blood, cerebrospinal fluid and brain tissue may be selected. A person skilled in the art can select the most suitable sample through practice, and all such choices fall within the scope of protection of the present invention.
In a third aspect, the present invention provides a kit for predicting tumor progression and/or prognosis, which comprises an expression level detection reagent for the gene marker combination according to claim 1.
In some embodiments of the invention, the kit further comprises an RNA extraction reagent.
A fourth aspect of the invention provides a system for predicting tumor progression and/or prognosis, comprising:
a data input module for obtaining the expression level of each gene in the gene marker combination according to the first aspect of the invention;
a prediction module, connected to the data input module, for predicting tumor progression and/or prognosis from the expression level of each gene using a single-sample gene set enrichment analysis method.
In some embodiments of the invention, the prediction module obtains an enrichment score for the gene marker combination using a single-sample gene set enrichment analysis method. The higher the enrichment score, the more advanced the tumor progression; the more advanced the progression, the more resistant the tumor is to treatment modalities including surgery, chemotherapy and radiation therapy, and hence the worse the prognosis.
Single-sample gene set enrichment analysis (ssGSEA) is an extension of the GSEA method, designed mainly for the case where GSEA cannot be applied because only a single sample is available.
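To make the idea of a single-sample enrichment score concrete, the following is a simplified rank-based sketch in the spirit of ssGSEA. It is not the implementation used by the inventors and omits the normalization steps found in published tools. The function name `ssgsea_score`, the default exponent `alpha=0.25` and the tie-breaking rule are assumptions made for illustration.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score for one sample.

    expression: dict of gene name -> expression level in this sample.
    gene_set:   iterable of gene names (e.g. the marker combination).
    alpha:      exponent applied to the ranks of gene-set members.
    """
    members = set(gene_set) & set(expression)
    if not members or len(members) == len(expression):
        raise ValueError("need at least one gene inside and one outside the set")

    # Order genes from highest to lowest expression; ties are broken by name
    # so that the result is deterministic.
    ordered = sorted(expression, key=lambda g: (-expression[g], g))
    n = len(ordered)

    # The top-ranked gene gets rank n, the lowest-ranked gene gets rank 1.
    hit_total = sum((n - i) ** alpha for i, g in enumerate(ordered) if g in members)
    miss_total = n - len(members)

    score = 0.0
    p_hit = 0.0   # weighted cumulative fraction of gene-set members seen so far
    p_miss = 0.0  # cumulative fraction of non-members seen so far
    for i, gene in enumerate(ordered):
        if gene in members:
            p_hit += (n - i) ** alpha / hit_total
        else:
            p_miss += 1.0 / miss_total
        # Accumulate the difference between the two running distributions.
        score += p_hit - p_miss
    return score
```

Under this sketch, a sample in which the marker genes sit near the top of the expression ranking receives a positive score, and a sample in which they sit near the bottom receives a negative score.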
In some embodiments of the invention, the expression level of each gene is obtained by at least one method selected from the group consisting of whole transcriptome sequencing, capture sequencing and qRT-PCR based on the RNA sample.
In some embodiments of the invention, the system further comprises a parameter storage module, connected to the prediction module, for storing the enrichment score reference value. The enrichment score reference value is a set of value intervals; a score falling within a given interval means that the tumor is at the corresponding stage of progression. For example, for GBM, progression can be divided into three stages, early (young), middle (intermediate) and late (old), corresponding to three value ranges: a first interval, a second interval, and a third interval. After the prediction module obtains the enrichment score of the sample, it compares the score with the enrichment score reference value in the parameter storage module. If the enrichment score falls within the first interval, the source tumor sample is in an early stage and the prognosis is good; if it falls within the second interval, the source tumor sample is in the middle stage and the prognosis is intermediate; if it falls within the third interval, the source tumor sample is in a late stage and the prognosis is poor.
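The interval lookup performed against the parameter storage module can be sketched as below. The cut points are hypothetical placeholders, since the text does not disclose numeric interval boundaries, and the choice of which interval owns a boundary value is likewise an assumption.

```python
def classify_stage(score, early_upper, middle_upper):
    """Map an enrichment score to a (stage, prognosis) pair.

    early_upper:  upper bound of the first interval (hypothetical value).
    middle_upper: upper bound of the second interval (hypothetical value).
    Scores equal to a cut point are assigned to the later stage (assumption).
    """
    if early_upper >= middle_upper:
        raise ValueError("early_upper must be smaller than middle_upper")
    if score < early_upper:
        return ("early", "good")
    if score < middle_upper:
        return ("middle", "intermediate")
    return ("late", "poor")
```

With illustrative cut points of 0.3 and 0.6, a score of 0.1 would be reported as early stage with good prognosis and a score of 0.9 as late stage with poor prognosis.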
In some embodiments of the invention, the enrichment score reference is obtained using a population sample. In some preferred embodiments of the invention, the population sample comprises more than 20 samples, such as 30, 50, 80, 100, 150, 200, 300, 500 or more.
In some embodiments of the invention, the enrichment score reference in the parameter storage module is updated in accordance with the prediction in the prediction module. Specifically, the enrichment score and the actual progression and/or prognosis data are used as a training set together with the population data, and retrained to obtain the enrichment score reference value.
In the present invention, the tumor includes, but is not limited to, human sarcomas and carcinomas such as fibrosarcoma, myosarcoma, liposarcoma, chondrosarcoma, osteogenic sarcoma, chordoma, angiosarcoma, endotheliosarcoma, lymphangioendotheliosarcoma, synovioma, mesothelioma, Ewing's tumor, leiomyosarcoma, rhabdomyosarcoma, colon cancer, pancreatic cancer, prostate cancer, squamous cell cancer, basal cell cancer, adenocarcinoma, sweat gland cancer, sebaceous gland cancer, papillary adenocarcinoma, cystadenocarcinoma, medullary cancer, bronchial cancer, hepatoma, cholangiocarcinoma, choriocarcinoma, seminoma, embryonic cancer, Wilms' tumor, cervical cancer, testicular tumor, lung cancer, small cell lung cancer, epithelial cancer, glioma, astrocytoma, medulloblastoma, craniopharyngioma, ependymoma, pinealoma, hemangioblastoma, acoustic neuroma, oligodendroglioma, meningioma, melanoma, neuroblastoma, glioblastoma, retinoblastoma; leukemias, such as acute lymphocytic leukemia and acute myeloblastic leukemia (myeloblastic, promyelocytic, myelomonocytic, monocytic and erythrocytic leukemias); chronic leukemia (chronic myeloid (granulocytic) leukemia and chronic lymphocytic leukemia); and polycythemia vera, lymphomas (Hodgkin's disease and non-Hodgkin's disease), multiple myeloma, Waldenstrom's macroglobulinemia, and heavy chain disease.
In some embodiments of the invention, the tumor is glioblastoma, bladder cancer, colorectal cancer, esophageal cancer, ovarian cancer, gastric cancer, non-small cell lung cancer, breast cancer, liver cancer, or pancreatic cancer.
Advantageous Effects
Compared with the prior art, the invention has the following beneficial effects:
The invention uses transcriptome sequencing to study GBM and obtains a gene marker combination for predicting GBM tumor progression, achieving an unprecedented high-resolution characterization of the natural evolution of the tumor.
The gene marker combination, the kit or the system can be used to predict the progression and/or prognosis of various tumors including GBM, and offers broad applicability, high accuracy and very high clinical application value.
The gene marker combination, the kit or the system can be used for detecting not only tumor samples but also other biological samples such as peripheral blood, and has wide application prospect.
Drawings
FIG. 1 shows the expression distribution of the 12 genes in different tumor cell clusters. The tumor cell clusters are classified into three groups, early (young), middle (intermediate) and late (old), according to progression status.
Figure 2 shows the correlation of the enrichment score of the tumor progression and prognosis prediction system of the invention with the prognosis of a patient.
FIG. 3 shows the prediction results based on preoperative peripheral blood samples using the tumor progression and prognosis prediction system of the present invention. preGBM denotes a preoperative peripheral blood sample from a patient with GBM (high grade), and preLGG denotes a preoperative peripheral blood sample from a patient with low-grade glioma (LGG).
Detailed Description
Unless otherwise indicated, implicit from the context, or customary in the art, all parts and percentages herein are based on weight, and the testing and characterization methods used are current as of the filing date of the present application. Where applicable, the contents of any patent, patent application, or publication referred to in this application are hereby incorporated by reference in their entirety, and the equivalent family of patents is also incorporated by reference, especially with respect to the definitions of relevant terms in the art disclosed in these documents. To the extent that a definition of a particular term disclosed in the prior art is inconsistent with any definition provided herein, the definition provided herein controls.
The numerical ranges in this application are approximations, and thus may include values outside of the ranges unless otherwise specified. A numerical range includes all numbers from the lower value to the upper value, in increments of 1 unit, provided that there is a separation of at least 2 units between any lower value and any higher value.
The terms "comprising," "including," "having," and derivatives thereof do not exclude the presence of any other component, step or procedure, and are not intended to exclude the presence of other elements, steps or procedures not expressly disclosed herein.
In order to make the technical problems, technical solutions and advantageous effects solved by the present invention more clear, the present invention is further described in detail below with reference to the embodiments.
Examples
The following examples are used herein to demonstrate preferred embodiments of the invention. It will be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventor to function in the invention, and thus can be considered to constitute preferred modes for its practice. Those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit or scope of the invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs and the disclosures and references cited herein and the materials to which they refer are incorporated by reference.
Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.
The experimental procedures in the following examples are all conventional ones unless otherwise specified. The instruments used in the following examples, unless otherwise specified, were all conventional laboratory instruments; the test materials used in the following examples were purchased from a conventional biochemical reagent store unless otherwise specified.
Example 1: Obtaining GBM tumor progression signature molecules based on transcriptome sequencing
In this example, RNA is extracted from primary GBM tumor samples and subjected to whole transcriptome sequencing; the raw (off-line) sequencing data are preprocessed, and sequence alignment and gene expression quantification are then performed to obtain the final gene expression profile. The specific steps are as follows:
1. RNA extraction and transcriptome sequencing
RNA extraction was performed on each tumor sample to obtain total RNA: for each tumor sample, total RNA was obtained separately from the first and second lesions (lesions that appear separate from each other on imaging).
The RNA was reverse transcribed into cDNA; samples with excessively large cDNA fragments were fragmented to 200-350 base pairs by ultrasonication; the fragmented cDNA molecules were subjected to end repair, adenine addition (A-tailing), library adaptor ligation and other operations; and sequencing was performed on a high-throughput sequencer.
2. Off-line data preprocessing
After the high-throughput sequencing data are obtained, they are preprocessed using methods common in the art to filter out adaptor sequences and sequences that do not meet quality requirements.
3. Sequence alignment and Gene expression quantification
First, an index of the reference genome is built using STAR's built-in genomeGenerate run mode, and each sample is then aligned with STAR to generate a BAM file. The BAM file is processed with the HTSeq tool to obtain the sample gene expression matrix. Finally, the expression matrix is normalized as FPKM (fragments per kilobase of transcript per million mapped fragments).
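The FPKM normalization step can be sketched as follows, using the standard definition FPKM = count × 10^9 / (gene length in bp × total counted fragments). This minimal example assumes that the total is taken over the genes in the count table; the function name `fpkm` and the input layout are illustrative assumptions rather than the inventors' pipeline.

```python
def fpkm(counts, gene_lengths_bp):
    """Convert raw fragment counts for one sample to FPKM values.

    counts:          dict of gene name -> fragment count (e.g. from HTSeq).
    gene_lengths_bp: dict of gene name -> gene (or transcript) length in bp.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no counted fragments")
    # 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling.
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total)
        for gene, count in counts.items()
    }
```

Two genes with equal counts but lengths of 1000 bp and 2000 bp receive FPKM values in a 2:1 ratio, which is the length correction that makes expression comparable across genes.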
4. Identification of tumor progression characterization molecules
To dissect the molecular features associated with GBM tumor progression, i.e. the gene markers associated with GBM tumor progression, the inventors identified genes differentially expressed between tumor cells of the first and second lesions in the tumor sample, retaining genes whose absolute difference in expression between tumor cells of the two lesions was greater than 5%. For ease of description, the differentially expressed genes of the first and second lesions are denoted oDEG and yDEG, respectively.
To identify tumor clusters at older and younger stages of progression, the inventors first calculated the progression potential state (referred to as PE) of each tumor cluster using the following formula:
[Formula image BDA0003830971350000091 is not reproduced in this text; it defines PE for tumor cluster i in terms of O_i and Y_i.]
where O_i and Y_i denote the percentages of oDEG and yDEG in tumor cluster i, respectively. PE indicates tumor progression: a higher PE indicates an older tumor cluster, i.e., more advanced tumor progression.
The inventors further ranked each tumor cluster from high to low by PE score, selecting the top 20% and bottom 20% as older and younger clusters, respectively.
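The ranking and selection step can be sketched as follows. The sketch takes precomputed PE scores as input and does not reproduce the PE formula itself; the function name `split_by_pe`, the cluster identifiers, the scores and the rounding rule (floor, with at least one cluster per group) are assumptions made for illustration.

```python
def split_by_pe(pe_scores, fraction=0.2):
    """Return (older, younger) cluster ids: top and bottom `fraction` by PE.

    pe_scores: dict mapping cluster id -> PE score
    """
    # Rank clusters from high PE to low PE.
    ranked = sorted(pe_scores, key=pe_scores.get, reverse=True)
    # Assumed rounding rule: floor, but keep at least one cluster per group.
    k = max(1, int(len(ranked) * fraction))
    older = ranked[:k]      # highest PE scores
    younger = ranked[-k:]   # lowest PE scores
    return older, younger


# Hypothetical PE scores for ten tumor clusters.
pe = {"c1": 0.9, "c2": 0.1, "c3": 0.5, "c4": 0.7, "c5": 0.3,
      "c6": 0.8, "c7": 0.2, "c8": 0.6, "c9": 0.4, "c10": 0.05}
older, younger = split_by_pe(pe)
```

With ten clusters and a fraction of 20%, each group contains two clusters.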
The inventors established an expression profile using older clusters of the first lesion and younger clusters of the second lesion and performed differential expression analysis based on the expression profile. The criteria used for differential expression analysis included:
(1) Fold Change (FC) ≥ 1.5;
(2) Wilcoxon rank-sum test corrected p-value < 10^-3;
(3) The gene is expressed in at least 10% of each group.
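The three criteria above can be applied as a simple filter once the per-gene statistics are available. The sketch below assumes the fold change, corrected p-value and per-group expression fractions have already been computed elsewhere; the field names (`fc`, `adj_p`, `pct_a`, `pct_b`), the reading of criterion (3) as a per-group expressed fraction, and the example values are all assumptions for illustration.

```python
def passes_criteria(stat, min_fc=1.5, max_p=1e-3, min_pct=0.10):
    """Check one gene against the three differential expression criteria."""
    return (stat["fc"] >= min_fc            # (1) fold change >= 1.5
            and stat["adj_p"] < max_p       # (2) corrected p-value < 10^-3
            and stat["pct_a"] >= min_pct    # (3) expressed in >= 10% of group A
            and stat["pct_b"] >= min_pct)   #     and in >= 10% of group B


def filter_genes(stats):
    """Return the sorted list of genes that satisfy all criteria."""
    return sorted(g for g, s in stats.items() if passes_criteria(s))


# Hypothetical per-gene statistics.
stats = {
    "geneA": {"fc": 2.0, "adj_p": 1e-5, "pct_a": 0.40, "pct_b": 0.25},
    "geneB": {"fc": 1.2, "adj_p": 1e-6, "pct_a": 0.50, "pct_b": 0.50},  # FC too low
    "geneC": {"fc": 3.0, "adj_p": 5e-2, "pct_a": 0.30, "pct_b": 0.30},  # p too high
    "geneD": {"fc": 1.8, "adj_p": 1e-4, "pct_a": 0.05, "pct_b": 0.60},  # pct too low
}
selected = filter_genes(stats)
```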
This yielded 6083 genes that can serve as potential candidate signature molecules for tumor progression.
To ensure that the tumor progression signature (gene) is applicable to a variety of GBMs, the inventors performed validation in samples of each lesion and 4 additional GBM patients. Significant differentially expressed genes were also identified using the following criteria:
(1) FC ≥ 1.5;
(2) Wilcoxon rank-sum test corrected p-value < 10^-3;
(3) The gene is expressed in at least 10% of each group.
Intersecting the differentially expressed genes obtained in all analyses yielded 12 genes, referred to as tumor progression signature molecules or tumor progression predictive gene markers. The 12 genes are: S100A10, FOSL2, SPP1, CAV1, ANXA1, VIM, CD44, SERPINH1, LGALS3, CEBPB, ATF5 and LGALS1, whose expression at different stages of progression is shown in FIG. 1. The combination of these gene markers, or a subset thereof, can be used to predict tumor progression; moreover, because prognosis is inseparable from tumor progression, the combination or a subset thereof can also be used to predict tumor prognosis.
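The intersection step amounts to keeping only genes that appear in every differential expression result. A minimal sketch, with a hypothetical function name and toy gene lists that are not the patent's actual analysis outputs:

```python
def intersect_gene_sets(gene_lists):
    """Return the genes present in every list, sorted alphabetically."""
    sets = [set(genes) for genes in gene_lists]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Toy differential expression results from three hypothetical analyses.
analysis_results = [
    ["VIM", "CD44", "SPP1", "geneX"],
    ["CD44", "VIM", "geneY"],
    ["geneZ", "VIM", "CD44", "SPP1"],
]
shared = intersect_gene_sets(analysis_results)
```

Only genes found in all three toy lists survive, which mirrors how the 12-gene set is described as the overlap of all analyses.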
To verify that subsets of the above gene marker combination can also be used to predict tumor progression and prognosis, we made predictions for GBM using single genes and combinations of fewer than 31 genes; the results are shown in Table 1:
TABLE 1 Prediction results for subsets of the tumor progression prediction gene marker combination
Figure BDA0003830971350000101
Figure BDA0003830971350000111
It can be seen that any subset of the combinations of gene markers identified by the screening of this example can also be used for prediction of tumor progression and prognosis, and all have very high accuracy.
Example 2 tumor progression or prognosis prediction System
The present embodiment establishes a computer system for predicting tumor progression or prognosis, comprising a data input module and a prediction module. The data input module obtains the expression level of each gene in the gene marker combination obtained in example 1, and the prediction module predicts tumor progression or prognosis using the single-sample gene set enrichment analysis (ssGSEA) method (Barbie, D., Tamayo, P., Boehm, J. et al.
In summary, the ssGSEA algorithm first rank-normalizes the gene expression values of a given sample and then calculates an Enrichment Score (ES) using empirical cumulative distribution functions. ssGSEA analysis can be carried out with the R-language GSVA package, which is distributed on Bioconductor.
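The two steps described above (rank normalization, then an enrichment score built from empirical cumulative distribution functions) can be sketched in plain Python. This is a simplified illustration, not the GSVA package's implementation: the weighting exponent `alpha = 0.25`, the handling of ties, the function name and the toy expression values are assumptions.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score.

    expression: dict mapping gene -> expression value for one sample
    gene_set: set of marker genes
    """
    # Rank normalization: lowest expression gets rank 1, highest gets rank N.
    ascending = sorted(expression, key=lambda g: expression[g])
    rank = {g: i + 1 for i, g in enumerate(ascending)}
    ordered = ascending[::-1]  # walk from highest to lowest expression

    hits = [g in gene_set for g in ordered]
    n_hits = sum(hits)
    n_miss = len(ordered) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, genes")

    # Rank-weighted ECDF for genes in the set, unweighted ECDF for the rest.
    total_weight = sum(rank[g] ** alpha for g, h in zip(ordered, hits) if h)
    cdf_in = cdf_out = 0.0
    score = 0.0
    for g, h in zip(ordered, hits):
        if h:
            cdf_in += rank[g] ** alpha / total_weight
        else:
            cdf_out += 1.0 / n_miss
        score += cdf_in - cdf_out  # accumulate the difference of the ECDFs
    return score


markers = {"VIM", "CD44", "SPP1"}
# Toy samples: g1..g7 are hypothetical background genes.
high = {"VIM": 90, "CD44": 80, "SPP1": 70,
        "g1": 7, "g2": 6, "g3": 5, "g4": 4, "g5": 3, "g6": 2, "g7": 1}
low = {"VIM": 1, "CD44": 2, "SPP1": 3,
       "g1": 10, "g2": 20, "g3": 30, "g4": 40, "g5": 50, "g6": 60, "g7": 70}
score_high = ssgsea_score(high, markers)
score_low = ssgsea_score(low, markers)
```

A sample in which the marker genes sit at the top of the ranking receives a positive score, and one in which they sit at the bottom receives a negative score.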
Specifically, ssGSEA is used to obtain the enrichment score of the gene marker combination, from which tumor progression or prognosis is predicted: the higher the enrichment score, the more advanced the tumor progression and the worse the prognosis.
The prediction of GBM prognosis is shown in FIG. 2. As can be seen from FIG. 2, the system effectively predicts the prognosis of GBM patients: analysis of data from The Cancer Genome Atlas (TCGA), Gravendeel et al. (Gravendeel, Lonneke AM, et al. "Intrinsic gene expression profiles of gliomas are a better predictor of survival than histology." Cancer Research 69.23 (2009): 9065-9072.), and the Chinese Glioma Genome Atlas (CGGA) shows that patients with high enrichment scores have shorter survival times.
Example 3 application of tumor progression or prognosis prediction System
To verify the accuracy and reliability of the system of example 2, and its applicability to tumors other than GBM, the inventors collected public data from TCGA, whole genome sequencing (we) data from the International Cancer Genome Consortium (ICGC) database, and sequencing data for various samples from the open literature. For each sample, the system of example 2 was used to calculate a tumor progression score and predict prognosis.
The results are shown in table 2:
TABLE 2 Application of the tumor progression or prognosis prediction system in various tumor samples
Figure BDA0003830971350000121
The results show that the tumor progression and prognosis prediction system of the invention achieves high accuracy in a variety of tumor samples. For the prediction of progression and prognosis of bladder cancer, colorectal cancer, esophageal cancer and ovarian cancer, the AUC exceeds 90%, indicating very high accuracy. For the prediction of progression of gastric cancer and non-small cell lung cancer, the AUC exceeds 80%, which is also high. For the prediction of progression and prognosis of breast cancer, liver cancer and pancreatic cancer, the AUC exceeds 75%, which is of clear clinical value.
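For readers unfamiliar with the AUC metric reported above, the following sketch computes it from per-sample scores and binary labels using the pairwise (Mann-Whitney) formulation. The patent does not state how its AUC values were calculated or how outcome labels were defined, so the function name, the label convention (1 = progressed or poor prognosis, 0 = otherwise) and the toy values are assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: list of numeric scores (e.g. enrichment scores)
    labels: list of 0/1 outcome labels, same length as scores
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive sample ranked above negative sample
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy examples: perfect separation, then partial separation.
auc_perfect = auc([0.9, 0.8, 0.4, 0.3, 0.6, 0.2], [1, 1, 0, 0, 1, 0])
auc_partial = auc([0.9, 0.4, 0.6, 0.5], [1, 1, 0, 0])
```

An AUC of 1.0 means every positive sample scores above every negative sample, while 0.5 corresponds to no discrimination.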
Example 4 use of tumor progression or prognosis prediction System in peripheral blood-based applications
To further verify that the system of example 2 can be applied to samples other than tumor tissue, the inventors obtained pre-operative peripheral blood samples of GBM and low-grade glioma (LGG) and made predictions for each. The results are shown in FIG. 3. The enrichment score of GBM was significantly higher than that of LGG in both the primary (Primary) and recurrent (Recurrence) groups, indicating that the system of example 2 can also predict tumor progression well from peripheral blood samples, which further broadens the application prospects of the system of example 2.
All documents mentioned in this application are incorporated by reference in this application as if each were individually incorporated by reference. Furthermore, it should be understood that various changes or modifications of the present invention can be made by those skilled in the art after reading the above teachings of the present invention, and these equivalents also fall within the scope of the appended claims of the present application.

Claims (10)

1. A gene marker combination comprising at least one gene of S100A10, FOSL2, SPP1, CAV1, ANXA1, VIM, CD44, SERPINH1, LGALS3, CEBPB, ATF5 and LGALS1.
2. A gene marker combination according to claim 1, wherein the gene marker combination comprises S100A10, FOSL2, SPP1, CAV1, ANXA1, VIM, CD44, SERPINH1, LGALS3, CEBPB, ATF5 and LGALS1.
3. Use of the gene marker combination expression level detection reagent of claim 1 for the preparation of a kit for the prediction of tumor progression and/or prognosis.
4. The use according to claim 2, wherein the gene marker combination expression level detection reagent is a primer and/or a probe.
5. A kit for predicting tumor progression and/or prognosis, comprising an expression level detection reagent of the gene marker combination according to claim 1.
6. A system for predicting tumor progression and/or prognosis, comprising:
a data input module for obtaining the expression level of each gene in the gene marker combination of claim 1;
and a prediction module connected to the data input module, for predicting tumor progression and/or prognosis from the expression level of each gene using a single-sample gene set enrichment analysis method.
7. The system of claim 6, wherein the prediction module obtains an enrichment score for the combination of gene markers using a single-sample gene set enrichment analysis method, wherein the higher the enrichment score, the more advanced the tumor progression, and the poorer the prognosis.
8. The system of claim 6, wherein the expression level of each gene is obtained by at least one method selected from the group consisting of transcriptome sequencing, capture sequencing and qRT-PCR based on the RNA sample.
9. The system of claim 6, further comprising a parameter storage module coupled to the prediction module for storing the enrichment score reference, the enrichment score reference being obtained using a population sample.
10. The system according to claim 9, wherein the enrichment score reference in the parameter storage module is updated according to the prediction result in the prediction module.
CN202211094688.5A 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis Active CN115747329B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202311514765.2A CN117385040A (en) 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis
CN202311515115.XA CN117385042A (en) 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis
CN202311514772.2A CN117385041A (en) 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis
CN202211094688.5A CN115747329B (en) 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211094688.5A CN115747329B (en) 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis

Related Child Applications (3)

Application Number Title Priority Date Filing Date
CN202311514772.2A Division CN117385041A (en) 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis
CN202311515115.XA Division CN117385042A (en) 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis
CN202311514765.2A Division CN117385040A (en) 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis

Publications (2)

Publication Number Publication Date
CN115747329A true CN115747329A (en) 2023-03-07
CN115747329B CN115747329B (en) 2023-10-17

Family

ID=85349657

Family Applications (4)

Application Number Title Priority Date Filing Date
CN202211094688.5A Active CN115747329B (en) 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis
CN202311514772.2A Pending CN117385041A (en) 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis
CN202311514765.2A Pending CN117385040A (en) 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis
CN202311515115.XA Pending CN117385042A (en) 2022-09-03 2022-09-03 Gene marker combination, kit and system for predicting tumor progression and prognosis

Country Status (1)

Country Link
CN (4) CN115747329B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080292546A1 (en) * 2003-06-09 2008-11-27 The Regents Of The University Of Michigan Compositions and methods for treating and diagnosing cancer
JP2010131006A (en) * 2008-10-31 2010-06-17 Dna Chip Research Inc Neuroglioma prognosis prediction method and kit usable therefore
US20100167939A1 (en) * 2007-03-02 2010-07-01 Kenneth Aldape Multigene assay to predict outcome in an individual with glioblastoma
WO2012126542A2 (en) * 2011-03-23 2012-09-27 Universite De Rennes 1 Biomarkers and methods for the prognosis of glioblastoma
US20140045915A1 (en) * 2010-08-31 2014-02-13 The General Hospital Corporation Cancer-related biological materials in microvesicles
CN107034305A (en) * 2017-06-19 2017-08-11 上海市第十人民医院 A kind of diagnosis marker of glioblastoma
CN107058596A (en) * 2017-06-19 2017-08-18 上海市第十人民医院 A kind of mark related to glioblastoma diagnosis and its application
WO2018065525A1 (en) * 2016-10-05 2018-04-12 University Of East Anglia Classification and prognosis of cancer
CN108949982A (en) * 2018-07-09 2018-12-07 中国医科大学附属第医院 A method of glioma clinical prognosis is evaluated using co-stimulators
KR20190143058A (en) * 2018-06-20 2019-12-30 연세대학교 산학협력단 Method of predicting prognosis of brain tumors
CN112002372A (en) * 2020-08-03 2020-11-27 李里 Screening method and application of prognosis target gene of human glioblastoma multiforme
CN112980952A (en) * 2021-02-05 2021-06-18 中山大学肿瘤防治中心(中山大学附属肿瘤医院、中山大学肿瘤研究所) Marker for predicting treatment efficacy of isocitrate dehydrogenase 1 gene wild-type glioma in prognosis and anti-PD 1 and application thereof
CN113481298A (en) * 2021-06-18 2021-10-08 广东中科清紫医疗科技有限公司 Application of immune related gene in kit and system for predicting diffuse glioma prognosis
CN114512184A (en) * 2021-10-11 2022-05-17 上海市胸科医院 Method for predicting cancer curative effect and prognosis, device and application thereof

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
HE et al.: "Single-Cell Transcriptomic Analysis Revealed a Critical Role of SPP1/CD44-Mediated Crosstalk Between Macrophages and Cancer Cells in Glioma", 《FRONTIERS IN CELL AND DEVELOPMENTAL BIOLOGY》, vol. 9, pages 779319 *
ZHANG et al.: "S100A gene family: immune-related prognostic biomarkers and therapeutic targets for low-grade glioma", 《AGING》, vol. 13, no. 11, pages 154459 - 15478 *
刘洁 et al.: "Analysis of multi-grade glioma heterogeneity and the immune microenvironment based on single-cell transcriptomes reveals potential prognostic biomarkers", 《生物工程学报》, vol. 38, no. 10, pages 3790 - 3808 *
许广志 et al.: "Analysis of factors influencing the prognosis of patients with IDH wild-type glioblastoma", 《临床神经外科杂志》, vol. 19, no. 2, pages 130 - 134 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116844638A (en) * 2023-06-08 2023-10-03 上海信诺佰世医学检验有限公司 Child acute leukemia typing system and method based on high-throughput transcriptome sequencing
CN116656829A (en) * 2023-08-01 2023-08-29 昂凯生命科技(苏州)有限公司 Gene marker combination, kit and system for predicting bad prognosis of gastric cancer
CN116656829B (en) * 2023-08-01 2024-04-12 昂凯生命科技(苏州)有限公司 Gene marker combination, kit and system for predicting bad prognosis of gastric cancer

Also Published As

Publication number Publication date
CN115747329B (en) 2023-10-17
CN117385041A (en) 2024-01-12
CN117385040A (en) 2024-01-12
CN117385042A (en) 2024-01-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant