CN111778336B - Gene marker combination for comprehensive quantitative evaluation of tumor microenvironment and application - Google Patents


Info

Publication number
CN111778336B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010718484.9A
Other languages
Chinese (zh)
Other versions
CN111778336A (en
Inventor
张如奎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Bangkai Gene Technology Co ltd
Original Assignee
Suzhou Bangkai Gene Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Bangkai Gene Technology Co ltd filed Critical Suzhou Bangkai Gene Technology Co ltd
Priority to CN202010718484.9A priority Critical patent/CN111778336B/en
Publication of CN111778336A publication Critical patent/CN111778336A/en
Application granted granted Critical
Publication of CN111778336B publication Critical patent/CN111778336B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C12Q2600/158 Expression markers
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/20 Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis

Abstract

The gene marker combination disclosed by the invention enables comprehensive quantitative evaluation of the tumor microenvironment; the evaluation result predicts tumor metastasis with an accuracy of 100% and tumor recurrence with an accuracy of over 80%. The invention breaks down and refines the tumor microenvironment into four constituent factors and searches for the genes related to each factor through NCBI. The genes are then simplified and refined according to function to form a gene marker combination for quantitative evaluation of the tumor microenvironment. The four constituent factors are a vascular proliferation index, a chemokine expression index, an immune cell infiltration index and a tumor growth and invasion index, which together form the gene marker combination for comprehensive quantitative evaluation of the tumor microenvironment.

Description

Gene marker combination for comprehensive quantitative evaluation of tumor microenvironment and application
Technical Field
The invention belongs to the field of molecular biomedicine, and particularly relates to comprehensive quantitative evaluation of a tumor microenvironment.
Background
At present, pathological research on tumors and research on anti-tumor drugs and treatment regimens focus on the tumor cells themselves, for example pathological section staining, tumor gene mutation mapping, radiotherapy, chemotherapy and targeted therapy, whereas research on the tumor microenvironment in which tumor cells survive lags markedly behind. The British surgeon Stephen Paget proposed the "seed and soil" hypothesis of tumor metastasis in 1889, which can be regarded as the origin of the tumor microenvironment concept. However, this hypothesis was not scientifically confirmed, and research on the tumor microenvironment did not regain attention until PD-1/PD-L1 immune checkpoint inhibitors came into wide clinical use and anti-angiogenic targeted therapies were re-examined. Even now, research on the tumor microenvironment is not systematic; it remains largely at the conceptual level, and the microenvironment cannot be accurately described or quantitatively evaluated. Some studies address only isolated factors related to the tumor microenvironment, such as inferring the composition of tumor immune cells from the expression of genes related to the tumor immune response. Such studies are too narrow and do not fully reflect the real state of the tumor microenvironment, so their results differ or are even completely opposite, and the tumor microenvironment is difficult to use as a target or index in clinical practice. The tumor microenvironment is a complex system (its molecular and cellular components are complex and the factors involved are highly varied). Whether it can be evaluated comprehensively and accurately, and how it is evaluated, determine whether the tumor microenvironment can play a role in clinical research and clinical application.
At present, no method or scheme exists for comprehensively evaluating the state of the tumor microenvironment. The invention proposes decomposing and summarizing the tumor microenvironment into four constituent factors and screening out a gene marker combination to evaluate it comprehensively and quantitatively, so that the tumor microenvironment is understood more concretely, measured more accurately, and applied in clinical practice to bring clinical value.
Disclosure of Invention
The invention solves the problems of comprehensive evaluation and quantitative evaluation of the tumor microenvironment, so that the tumor microenvironment can better play a role in clinical practice as a measurable biomarker.
According to one aspect of the invention, a gene marker combination for comprehensive quantitative evaluation of tumor microenvironment is provided, and the gene marker combination comprises a vascular proliferation index gene marker, a chemokine expression index gene marker, an immune cell infiltration index gene marker and a tumor growth and invasion index gene marker.
According to one aspect of the invention, the invention also provides a gene marker combination for comprehensive quantitative evaluation of tumor microenvironment, wherein the vascular proliferation index gene markers comprise 16 genes: ANG, ANGPT1, ANGPT2, ANGPTL4, DLL4, EDN1, FGF1, FGF2, FLT1, HIF1A, PDGFB, SERPINB5, TYMP, VEGFA, VEGFB and VEGFC;
the chemokine expression index gene markers comprise 18 genes: CCL1, CCL2, CCL3, CCL4, CCL5, CCL8, CCL18, CCL19, CCL21, CXCL1, CXCL2, CXCL3, CXCL8, CXCL9, CXCL10, CXCL11, CXCL12 and CXCL13;
the immune cell infiltration index gene markers comprise 37 genes: IDO1, HLA-DRA, STAT1, IFNG, PRF1, GZMA, GZMB, NKG7, GZMH, KLRK1, KLRB1, KLRD1, CTSW, GNLY, CD14, CD15, CD19, CD68, CD163, CD33, CEACAM8, CD80, CD86, BATF3, TNFRSF17, CD20, TNFRSF4, CD4, TNFRSF9, CD8A, CD8B, LAG3, CD39, CXCR5, TBX21, FOXP3 and CD45RO;
the tumor growth and invasion index gene markers comprise 25 genes: CDH1, CTNNB1, EPCAM, ITGAM, ITGAV, ITGAX, MACC1, MMP1, MMP2, MMP3, MMP9, MMP11, MMP13, MMP14, MKI67, MYC, PLAU, RAN, SNAI1, SNAI2, TIMP1, TNC, TWIST1, ZEB1 and ZEB2.
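As a rough cross-check of the panel composition, the four gene lists above can be written as a lookup table. This is an illustrative sketch only: the names `PANEL`, `ALIASES` and `panel_symbols` are hypothetical, and the alias mapping (CD15 to FUT4, CD20 to MS4A1, CD39 to ENTPD1, CD45RO to PTPRC) is an assumption inferred from the gene symbols that appear in the weight list of this document.

```python
# Four-factor gene panel as listed in the text (illustrative data structure).
PANEL = {
    "vascular_proliferation": [
        "ANG", "ANGPT1", "ANGPT2", "ANGPTL4", "DLL4", "EDN1", "FGF1", "FGF2",
        "FLT1", "HIF1A", "PDGFB", "SERPINB5", "TYMP", "VEGFA", "VEGFB", "VEGFC",
    ],
    "chemokine_expression": [
        "CCL1", "CCL2", "CCL3", "CCL4", "CCL5", "CCL8", "CCL18", "CCL19",
        "CCL21", "CXCL1", "CXCL2", "CXCL3", "CXCL8", "CXCL9", "CXCL10",
        "CXCL11", "CXCL12", "CXCL13",
    ],
    "immune_cell_infiltration": [
        "IDO1", "HLA-DRA", "STAT1", "IFNG", "PRF1", "GZMA", "GZMB", "NKG7",
        "GZMH", "KLRK1", "KLRB1", "KLRD1", "CTSW", "GNLY", "CD14", "CD15",
        "CD19", "CD68", "CD163", "CD33", "CEACAM8", "CD80", "CD86", "BATF3",
        "TNFRSF17", "CD20", "TNFRSF4", "CD4", "TNFRSF9", "CD8A", "CD8B",
        "LAG3", "CD39", "CXCR5", "TBX21", "FOXP3", "CD45RO",
    ],
    "tumor_growth_invasion": [
        "CDH1", "CTNNB1", "EPCAM", "ITGAM", "ITGAV", "ITGAX", "MACC1", "MMP1",
        "MMP2", "MMP3", "MMP9", "MMP11", "MMP13", "MMP14", "MKI67", "MYC",
        "PLAU", "RAN", "SNAI1", "SNAI2", "TIMP1", "TNC", "TWIST1", "ZEB1",
        "ZEB2",
    ],
}

# Assumed mapping from marker names in the lists to symbols in the weight list.
ALIASES = {"CD15": "FUT4", "CD20": "MS4A1", "CD39": "ENTPD1", "CD45RO": "PTPRC"}


def panel_symbols(panel, aliases):
    """Return the sorted, de-duplicated gene symbols after alias substitution."""
    return sorted({aliases.get(g, g) for genes in panel.values() for g in genes})


counts = {factor: len(genes) for factor, genes in PANEL.items()}
symbols = panel_symbols(PANEL, ALIASES)
print(counts)
print(len(symbols))
```

The per-factor counts add up to the 96 genes referred to in the rest of the description.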
According to one aspect of the invention, the invention further provides the application of the gene marker combination, and the gene marker combination is applied to comprehensive quantitative evaluation of tumor microenvironment.
According to one aspect of the present invention, there is also provided a kit for predicting the accuracy of tumor metastasis and/or tumor recurrence, the kit comprising a probe for detecting the above-described combination of genetic markers.
According to one aspect of the invention, the establishment method of the comprehensive quantitative evaluation model for the tumor microenvironment is further provided, and the gene marker combination comprising the vascular proliferation index gene marker, the chemokine expression index gene marker, the immune cell infiltration index gene marker and the tumor growth and invasion index gene marker is used as the marker for the comprehensive quantitative evaluation of the tumor microenvironment.
According to one aspect of the invention, the method for establishing the comprehensive quantitative evaluation model of the tumor microenvironment analyzes the gene expression data and clinical data (staging/recurrence/metastasis) of the gene marker combination in the GEO database. First, the first 150 samples in GSE62254 [HG-U133_Plus_2] are selected as the training set, and the MAS5 method is used for background correction and normalization to obtain the expression values of the 96 genes. When a gene has multiple probes, the maximum value is taken as the expression value of the gene. The expression values are log2-transformed, and the weight values are calculated by logistic regression.
According to one aspect of the invention, in the method for establishing the comprehensive quantitative evaluation model for the tumor microenvironment, the weight values of the gene markers are as follows:
ANG weight value -20.1306
ANGPT1 weight value -34.0557
ANGPT2 weight value 28.6868
ANGPTL4 weight value -7.5101
BATF3 weight value -31.0749
CCL1 weight value 13.6034
CCL18 weight value 10.4873
CCL19 weight value -20.9738
CCL2 weight value 11.7576
CCL21 weight value -5.1514
CCL3 weight value 54.0877
CCL4 weight value -53.5021
CCL5 weight value 27.601
CCL8 weight value -7.8325
CD14 weight value 49.8048
CD163 weight value 20.646
CD19 weight value 21.3692
CD33 weight value 4.0713
CD4 weight value 9.214
CD68 weight value -24.3628
CD80 weight value 30.8515
CD86 weight value 70.5589
CD8A weight value 5.4726
CD8B weight value 7.7914
CDH1 weight value -5.3397
CEACAM8 weight value 0.21
CTNNB1 weight value -20.1479
CTSW weight value -6.4395
CXCL1 weight value -20.0591
CXCL10 weight value 29.703
CXCL11 weight value -13.3524
CXCL12 weight value 19.7961
CXCL13 weight value -20.5527
CXCL2 weight value 13.4255
CXCL3 weight value 23.3393
CXCL8 weight value -43.1623
CXCL9 weight value 9.0169
CXCR5 weight value 5.4856
DLL4 weight value 17.0118
EDN1 weight value -4.8511
ENTPD1 weight value 3.4117
EPCAM weight value -33.8795
FGF1 weight value 25.9813
FGF2 weight value -12.3332
FLT1 weight value -31.7469
FOXP3 weight value 13.9705
FUT4 weight value 13.9706
GNLY weight value -10.0842
GZMA weight value 35.9977
GZMB weight value -9.9711
GZMH weight value 15.305
HIF1A weight value -125.904
HLA-DRA weight value -51.5725
IDO1 weight value -7.3857
IFNG weight value -3.9568
ITGAM weight value -20.4378
ITGAV weight value 91.4138
ITGAX weight value -17.7399
KLRB1 weight value 6.4642
KLRD1 weight value -27.2995
KLRK1 weight value -31.2776
LAG3 weight value 5.9364
MACC1 weight value 4.823
MKI67 weight value -28.3144
MMP1 weight value -19.0372
MMP11 weight value 9.8452
MMP13 weight value 3.3756
MMP14 weight value 29.6593
MMP2 weight value -25.6822
MMP3 weight value 16.0266
MMP9 weight value 8.0522
MS4A1 weight value 1.1736
MYC weight value 20.084
NKG7 weight value -31.6593
PDGFB weight value 35.2784
PLAU weight value 20.9008
PRF1 weight value 46.0077
PTPRC weight value -108.985
RAN weight value 5.157
SERPINB5 weight value 18.3407
SNAI1 weight value -10.2615
SNAI2 weight value 7.6711
STAT1 weight value 18.7689
TBX21 weight value 7.2855
TIMP1 weight value 8.1053
TNC weight value -27.6527
TNFRSF17 weight value 28.0124
TNFRSF4 weight value -19.4462
TNFRSF9 weight value 7.2481
TWIST1 weight value 12.0057
TYMP weight value -15.5035
VEGFA weight value 15.2587
VEGFB weight value -15.0916
VEGFC weight value 1.8606
ZEB1 weight value 47.2097
ZEB2 weight value -15.7898.
According to one aspect of the present invention, the above method for establishing the comprehensive quantitative evaluation model of the tumor microenvironment uses a logistic classifier, which is a classifier based on a generalized linear model with the linear discriminant $\mathrm{TMEscore} = \sum_i \omega_i x_i$, where $\omega_i$ is the weight value of each gene, $x_i$ is the expression value of that gene, and TMEscore represents the quantitative evaluation value of the tumor microenvironment.
According to one aspect of the invention, in the method for establishing the comprehensive quantitative evaluation model of the tumor microenvironment, a sample whose TMEscore is greater than the threshold is classified as metastatic; otherwise it is classified as non-metastatic.
According to one aspect of the invention, the threshold of the logistic classifier in the method for establishing the comprehensive quantitative evaluation model of the tumor microenvironment is -427.9891.
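The scoring rule described above amounts to a weighted sum followed by a threshold comparison. The following is a minimal sketch, assuming log2-transformed expression values are available per gene. Only three of the 96 weights are used, the expression values are invented for illustration, and the function names `tme_score` and `classify` are hypothetical. Because a partial weight set is not comparable with the stated threshold, the demonstration passes a placeholder threshold of 0.0.

```python
THRESHOLD = -427.9891  # threshold stated in the text for the full 96-gene model


def tme_score(weights, expression):
    """Weighted sum of expression values over the genes present in `weights`."""
    missing = [gene for gene in weights if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(w * expression[gene] for gene, w in weights.items())


def classify(score, threshold=THRESHOLD):
    """Apply the rule from the text: score > threshold means metastatic."""
    return "metastatic" if score > threshold else "non-metastatic"


# Three weights copied from the list above; this is NOT the full model.
demo_weights = {"ANG": -20.1306, "ANGPT1": -34.0557, "ANGPT2": 28.6868}
# Hypothetical log2 expression values, invented for this example.
demo_expression = {"ANG": 8.0, "ANGPT1": 6.0, "ANGPT2": 7.0}

score = tme_score(demo_weights, demo_expression)
label = classify(score, threshold=0.0)  # placeholder threshold for the demo
print(round(score, 4), label)
```

Scoring a real sample would require all 96 weights and expression values preprocessed in the same way as the training data.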
The gene marker combination for quantitative evaluation of the tumor microenvironment disclosed by the invention can well carry out comprehensive quantitative evaluation on the tumor microenvironment, the prediction accuracy of the evaluation result on tumor metastasis is 100%, and the prediction accuracy on tumor recurrence is over 80%.
The above is only an overview of the gene marker combination for quantitative evaluation of the tumor microenvironment disclosed by the invention, which enables comprehensive quantitative evaluation of the tumor microenvironment, with a prediction accuracy of 100% for tumor metastasis and over 80% for tumor recurrence.
In order to make the technical means of the present invention more clearly understood and to make the same practical in accordance with the present disclosure, preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Drawings
FIG. 1 is the ROC curve (AUC) in Example 2 of the present invention;
FIG. 2 is an ROC curve (AUC) in Example 3 of the present invention;
FIG. 3 is another ROC curve (AUC) in Example 3 of the present invention;
FIG. 4 is the metastasis ROC curve in Example 4 of the present invention;
FIG. 5 is another ROC curve in Example 4 of the present invention;
FIG. 6 is the GSE57303 ROC curve in Example 4 of the present invention;
FIG. 7 is the GSE8167 ROC curve in Example 4 of the present invention.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
Example 1 selection of Gene markers
The invention defines four constituent factors of the tumor microenvironment (TME). The microenvironment in which tumor cells live has a far-reaching influence on them: for example, the blood and oxygen supplied by the vascular system affect tumor cell metabolism, the body's immune cells monitor or tolerate tumor cells, and stromal cells and the extracellular matrix restrict tumor growth and metastasis. To better characterize the tumor microenvironment, it is broken down and refined into four factors; the genes related to each factor are searched through NCBI (National Center for Biotechnology Information) and then simplified and refined according to function to form a gene marker combination for quantitative evaluation of the tumor microenvironment. The four constituent factors are a vascular proliferation index, a chemokine expression index, an immune cell infiltration index and a tumor growth and invasion index, which together form the gene marker combination for comprehensive quantitative evaluation of the tumor microenvironment. The gene markers are as follows:
I. Vascular proliferation index gene markers (16 genes): ANG, ANGPT1, ANGPT2, ANGPTL4, DLL4, EDN1, FGF1, FGF2, FLT1, HIF1A, PDGFB, SERPINB5, TYMP, VEGFA, VEGFB, VEGFC
II. Chemokine expression index gene markers (18 genes): CCL1, CCL2, CCL3, CCL4, CCL5, CCL8, CCL18, CCL19, CCL21, CXCL1, CXCL2, CXCL3, CXCL8, CXCL9, CXCL10, CXCL11, CXCL12, CXCL13
III. Immune cell infiltration index gene markers (37 genes): IDO1, HLA-DRA, STAT1, IFNG, PRF1, GZMA, GZMB, NKG7, GZMH, KLRK1, KLRB1, KLRD1, CTSW, GNLY, CD14, CD15, CD19, CD68, CD163, CD33, CEACAM8, CD80, CD86, BATF3, TNFRSF17, CD20, TNFRSF4, CD4, TNFRSF9, CD8A, CD8B, LAG3, CD39, CXCR5, TBX21, FOXP3, CD45RO
IV. Tumor growth and invasion index gene markers (25 genes): CDH1, CTNNB1, EPCAM, ITGAM, ITGAV, ITGAX, MACC1, MMP1, MMP2, MMP3, MMP9, MMP11, MMP13, MMP14, MKI67, MYC, PLAU, RAN, SNAI1, SNAI2, TIMP1, TNC, TWIST1, ZEB1, ZEB2
Example 2 construction of classifier model
The following analysis was carried out using the gene marker combination of the constituent factors of the immune microenvironment (hereinafter referred to as the 96 genes):
① The gene expression data and clinical data (staging/recurrence/metastasis) of the above gene marker combination in the GEO database are analyzed. First, the first 150 samples in GSE62254 [Affymetrix HG-U133_Plus_2] are selected as the training set, and the MAS5 method is used for background correction and normalization to obtain the expression values of the 96 genes. If a gene has multiple probes, the maximum value is taken as the expression value of the gene, and the expression values are log2-transformed. The weight values are calculated by logistic regression.
[Table images not reproduced]
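The probe-collapsing and log2 steps of the preprocessing described above can be sketched as follows. This assumes MAS5 background correction and normalization have already been run upstream and produced positive signal values. The probe identifiers, signal values and the function name `collapse_probes` are hypothetical.

```python
import math


def collapse_probes(probe_values, probe_to_gene):
    """Take the maximum signal per gene across its probes, then apply log2.

    probe_values: dict mapping probe ID to a positive MAS5 signal.
    probe_to_gene: dict mapping probe ID to gene symbol (panel genes only).
    """
    gene_max = {}
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # probe does not belong to a panel gene
        if gene not in gene_max or value > gene_max[gene]:
            gene_max[gene] = value
    return {gene: math.log2(value) for gene, value in gene_max.items()}


# Hypothetical probe IDs and signals for one sample.
probe_to_gene = {"probe_a": "VEGFA", "probe_b": "VEGFA", "probe_c": "MYC"}
probe_values = {"probe_a": 256.0, "probe_b": 1024.0, "probe_c": 64.0, "probe_x": 500.0}

expression = collapse_probes(probe_values, probe_to_gene)
print(expression)
```

Because log2 is monotonic, taking the maximum before or after the transformation gives the same per-gene value.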
② Because tumor clinical outcomes such as recurrence or metastasis are yes/no outcomes whose probability distribution follows a Bernoulli distribution, a binary logistic classifier is adopted. The classifier is based on a generalized linear model with the linear discriminant $\mathrm{TMEscore} = \sum_i \omega_i x_i$, where $\omega_i$ is the weight value of each gene, $x_i$ is the expression value of that gene, and TMEscore represents the quantitative evaluation value of the tumor microenvironment. If the TMEscore of a sample is greater than the threshold, the sample is classified as metastatic; otherwise it is classified as non-metastatic. The threshold of the logistic classifier, obtained by maximum likelihood estimation (MLE), is -427.9891.
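One way to read the threshold, offered as an interpretation under a standard logistic model rather than as the derivation used in the patent: if the fitted model has an intercept $b$ (a symbol introduced here, not in the original), then classifying at a predicted probability of one half is equivalent to comparing the weighted sum with $-b$.

```latex
P(y = 1 \mid x) = \frac{1}{1 + e^{-\left(b + \sum_i \omega_i x_i\right)}},
\qquad
P(y = 1 \mid x) > \tfrac{1}{2}
\iff
\sum_i \omega_i x_i > -b .
```

Under this assumption the threshold would play the role of $-b$; the text does not state how the fitted intercept relates to the value -427.9891.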
The classification accuracy of the classifier on the training set is 100%:
Classifier            Accuracy (%)
logistic classifier   100
The sensitivity and specificity obtained by cross-validation were:
[Table image not reproduced]
Explanation of the above table, taking metastasis as an example:
Sensitivity is the proportion of metastatic samples that are correctly judged to be metastatic.
Specificity is the proportion of non-metastatic samples that are correctly judged to be non-metastatic.
PPV (positive predictive value) is the probability that a sample judged to be metastatic is actually metastatic.
NPV (negative predictive value) is the probability that a sample judged to be non-metastatic is actually non-metastatic.
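The four quantities defined above follow directly from the counts of true and false positives and negatives. A minimal sketch, with 1 denoting metastatic and 0 non-metastatic; the labels below are invented for illustration and the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, PPV and NPV for binary labels (1 = metastatic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)

    def ratio(numerator, denominator):
        # Undefined when a class or prediction group is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical true and predicted labels for eight samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```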
The corresponding logistic classifier ROC AUC reached 1.0, as in fig. 1.
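For readers unfamiliar with the measure, the ROC AUC equals the probability that a randomly chosen metastatic sample scores higher than a randomly chosen non-metastatic one, with ties counted as one half. An AUC of 1.0 therefore means the scores of the two groups do not overlap. The sketch below uses invented scores, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(scores, labels):
    """Rank-based AUC: fraction of (positive, negative) pairs ordered correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute an AUC")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical scores: the first set separates the classes perfectly, the second does not.
auc_separated = roc_auc([-300.0, -350.0, -500.0, -600.0], [1, 1, 0, 0])
auc_overlap = roc_auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
print(auc_separated, auc_overlap)
```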
Example 3 validation of classifier models
To test the stability of the 96-gene marker used to construct the classifier and to guard against over-fitting, another part of the GSE62254 [HG-U133_Plus_2] dataset was used for predictive verification of the marker.
The classification accuracy on the GSE62254 prediction set is 100%:
Classifier            Accuracy (%)
logistic classifier   100
The sensitivity and specificity obtained by cross-validation were:
[Table image not reproduced]
The corresponding logistic classifier ROC AUC reached 1.0, as shown in FIG. 2.
To further examine the stability of the marker used to construct the classifier and to guard against over-fitting, the marker was validated with the GSE57303 dataset and with in-house data from 22 tumor samples (clinically diagnosed as gastric/gastroesophageal cancer; 9 metastatic and 13 non-metastatic).
The classification accuracy on the GSE57303 prediction set and the 22 in-house tumor samples is 100%:
Classifier            Accuracy (%)
logistic classifier   100
The sensitivity and specificity obtained by cross-validation were:
[Table image not reproduced]
The corresponding logistic classifier ROC AUC reached 1.0, as shown in FIG. 3.
Example 4
Comparison with the 25-gene comprehensive quantitative evaluation model of the tumor microenvironment developed earlier by the inventor
Similarly, the first 150 samples of GSE62254 were used as the training set, and the MAS5 method was used for background correction and normalization to obtain the expression values of the 25 genes. If a gene has multiple probes, the maximum value is taken as the expression value of the gene, and the expression values are log2-transformed. The weight values are calculated by logistic regression to establish the model, which is then applied to the other prediction sets to obtain the prediction accuracy. From this verification, ROC curves are obtained for each group, and their diagnostic sensitivity and specificity are evaluated.
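The train-then-predict workflow described above can be illustrated with a small logistic regression fitted by plain gradient descent. This is a sketch under stated assumptions, not the solver used in the patent (which is not specified): the two-feature toy data are invented, and the names `fit_logistic` and `predict` are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and intercept by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Classify as 1 when the linear score exceeds zero (probability above 0.5)."""
    return [1 if b + sum(wj * xj for wj, xj in zip(w, xi)) > 0 else 0 for xi in X]


# Invented, linearly separable toy data: two "genes", 1 = metastatic.
X_train = [[1.0, 5.0], [1.5, 4.5], [2.0, 4.0], [4.0, 2.0], [4.5, 1.5], [5.0, 1.0]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[1.2, 4.8], [4.8, 1.2]]

w, b = fit_logistic(X_train, y_train)
train_pred = predict(X_train, w, b)
test_pred = predict(X_test, w, b)
print(train_pred, test_pred)
```

On perfectly separable training data, unregularized logistic regression has no finite maximum-likelihood solution, so the fitted weights keep growing in magnitude as the number of iterations increases.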
Metastasis:
Screening of the training-set marker
According to the expression values and grouping of the GSE62254 [HG-U133_Plus_2] samples, logistic regression is used to classify the training set into metastatic and non-metastatic groups, giving the weight coefficients:
[Table images not reproduced]
Let the weight value of each gene be $\omega_i$ and the expression value of the gene be $x_i$. If $\sum_i \omega_i x_i$ is greater than the threshold for a sample, the sample is classified as metastatic. The threshold of the classifier is -0.735.
The classification accuracy of these markers on the training set is:
Classifier            Accuracy (%)
logistic classifier   94
The sensitivity and specificity obtained by cross-validation were:
[Table image not reproduced]
Explanation of the above table, taking metastasis as an example:
Sensitivity is the proportion of metastatic samples that are correctly judged to be metastatic.
Specificity is the proportion of non-metastatic samples that are correctly judged to be non-metastatic.
PPV (positive predictive value) is the probability that a sample judged to be metastatic is actually metastatic.
NPV (negative predictive value) is the probability that a sample judged to be non-metastatic is actually non-metastatic.
The corresponding logistic classifier ROC graph is shown in fig. 4.
Verification of the prediction-set marker
To test the stability of the screened marker and guard against over-fitting, the marker is verified with another part of the dataset.
The classification accuracy on the GSE62254 prediction set is:
Classifier            Accuracy (%)
logistic classifier   80
The sensitivity and specificity obtained by cross-validation were:
[Table image not reproduced]
The corresponding logistic classifier ROC curve is shown in FIG. 5.
To test the stability of the screened marker and guard against over-fitting, the marker is verified with the GSE57303 dataset.
The classification accuracy on the GSE57303 prediction set is:
Classifier            Accuracy (%)
logistic classifier   79
The sensitivity and specificity obtained by cross-validation were:
[Table images not reproduced]
The corresponding logistic classifier ROC curve is shown in FIG. 6.
GSE8167
To test the stability of the screened marker and guard against over-fitting, the marker is verified with the GSE8167 dataset.
The classification accuracy on the GSE8167 prediction set is:
Classifier            Accuracy (%)
logistic classifier   79
The sensitivity and specificity obtained by cross-validation were:
[Table image not reproduced]
The corresponding logistic classifier ROC curve is shown in FIG. 7.
From Examples 1 to 4, it can be seen that evaluation using the 96-gene combination of the invention predicts tumor metastasis with an accuracy of 100% and tumor recurrence with an accuracy of more than 80%. Accurate quantitative evaluation cannot be achieved simply by arbitrarily selecting some related genes.
The above embodiments are merely preferred embodiments of the present invention, and are not intended to limit the present invention, it should be noted that, for those skilled in the art, several modifications and variations can be made without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as the protection scope of the present invention.

Claims (6)

1. The gene marker combination for comprehensive quantitative evaluation of a tumor microenvironment is characterized by comprising a vascular proliferation index gene marker, a chemokine expression index gene marker, an immune cell infiltration index gene marker and a tumor growth and invasion index gene marker;
the vascular proliferation indicator gene markers are 16 genes in total, namely ANG, ANGPT1, ANGPT2, ANGPTL4, DLL4, EDN1, FGF1, FGF2, FLT1, HIF1A, PDGFB, SERPINB5, TYMP, VEGFA, VEGFB and VEGFC;
the chemokine expression index gene markers are 18 genes in total, namely CCL1, CCL2, CCL3, CCL4, CCL5, CCL8, CCL18, CCL19, CCL21, CXCL1, CXCL2, CXCL3, CXCL8, CXCL9, CXCL10, CXCL11, CXCL12 and CXCL13;
the immune cell infiltration index gene markers are 37 genes in total, namely IDO1, HLA-DRA, STAT1, IFNG, PRF1, GZMA, GZMB, NKG7, GZMH, KLRK1, KLRB1, KLRD1, CTSW, GNLY, CD14, CD15, CD19, CD68, CD163, CD33, CEACAM8, CD80, CD86, BATF3, TNFRSF17, CD20, TNFRSF4, CD4, TNFRSF9, CD8A, CD8B, LAG3, CD39, CXCR5, TBX21, FOXP3 and CD45RO;
the tumor growth and invasion index gene markers are 25 genes in total, namely CDH1, CTNNB1, EPCAM, ITGAM, ITGAV, ITGAX, MACC1, MMP1, MMP2, MMP3, MMP9, MMP11, MMP13, MMP14, MKI67, MYC, PLAU, RAN, SNAI1, SNAI2, TIMP1, TNC, TWIST1, ZEB1 and ZEB2.
2. A kit for predicting the accuracy of tumor metastasis and/or tumor recurrence comprising a probe that detects the combination of genetic markers of claim 1.
3. A method for establishing a comprehensive quantitative evaluation model of a tumor microenvironment, characterized in that the method uses the gene marker combination of claim 1, comprising a vascular proliferation index gene marker, a chemokine expression index gene marker, an immune cell infiltration index gene marker and a tumor growth and invasion index gene marker, as the marker for comprehensive quantitative evaluation of the tumor microenvironment; the method analyzes gene expression data and clinical data of the gene marker combination of claim 1 in the GEO database: first, the first 150 samples in GSE62254 [HG-U133_Plus_2] are selected as a training set, the MAS5 method is used for background correction and normalization to obtain the expression values of the 96 genes, when a gene has multiple probes the maximum value is taken as the expression value of the gene, log2 transformation is performed on the expression values, and logistic regression is used to calculate the weight values; the weight values of the gene markers are as follows:
Gene        Weight value
ANG         -20.1306
ANGPT1      -34.0557
ANGPT2      28.6868
ANGPTL4     -7.5101
BATF3       -31.0749
CCL1        13.6034
CCL18       10.4873
CCL19       -20.9738
CCL2        11.7576
CCL21       -5.1514
CCL3        54.0877
CCL4        -53.5021
CCL5        27.601
CCL8        -7.8325
CD14        49.8048
CD163       20.646
CD19        21.3692
CD33        4.0713
CD4         9.214
CD68        -24.3628
CD80        30.8515
CD86        70.5589
CD8A        5.4726
CD8B        7.7914
CDH1        -5.3397
CEACAM8     0.21
CTNNB1      -20.1479
CTSW        -6.4395
CXCL1       -20.0591
CXCL10      29.703
CXCL11      -13.3524
CXCL12      19.7961
CXCL13      -20.5527
CXCL2       13.4255
CXCL3       23.3393
CXCL8       -43.1623
CXCL9       9.0169
CXCR5       5.4856
DLL4        17.0118
EDN1        -4.8511
ENTPD1      3.4117
EPCAM       -33.8795
FGF1        25.9813
FGF2        -12.3332
FLT1        -31.7469
FOXP3       13.9705
FUT4        13.9706
GNLY        -10.0842
GZMA        35.9977
GZMB        -9.9711
GZMH        15.305
HIF1A       -125.904
HLA-DRA     -51.5725
IDO1        -7.3857
IFNG        -3.9568
ITGAM       -20.4378
ITGAV       91.4138
ITGAX       -17.7399
KLRB1       6.4642
KLRD1       -27.2995
KLRK1       -31.2776
LAG3        5.9364
MACC1       4.823
MKI67       -28.3144
MMP1        -19.0372
MMP11       9.8452
MMP13       3.3756
MMP14       29.6593
MMP2        -25.6822
MMP3        16.0266
MMP9        8.0522
MS4A1       1.1736
MYC         20.084
NKG7        -31.6593
PDGFB       35.2784
PLAU        20.9008
PRF1        46.0077
PTPRC       -108.985
RAN         5.157
SERPINB5    18.3407
SNAI1       -10.2615
SNAI2       7.6711
STAT1       18.7689
TBX21       7.2855
TIMP1       8.1053
TNC         -27.6527
TNFRSF17    28.0124
TNFRSF4     -19.4462
TNFRSF9     7.2481
TWIST1      12.0057
TYMP        -15.5035
VEGFA       15.2587
VEGFB       -15.0916
VEGFC       1.8606
ZEB1        47.2097
ZEB2        -15.7898.
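The per-gene preprocessing described in claim 3 (take the maximum across probes that map to the same gene, then apply a log2 transform) can be sketched as follows. This is a minimal illustration, not the patentee's implementation: the function name `collapse_probes`, the probe identifiers, and the intensity values are hypothetical, and the inputs are assumed to be positive intensities that have already been background-corrected and normalized with MAS5.

```python
import math


def collapse_probes(probe_values, probe_to_gene):
    """Return {gene: log2(max probe intensity)} for one sample.

    probe_values  -- {probe_id: normalized intensity}, assumed positive
    probe_to_gene -- {probe_id: gene symbol} for the panel genes
    """
    per_gene = {}
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # probe does not map to a panel gene
        # Keep the largest intensity seen so far for this gene.
        if gene not in per_gene or value > per_gene[gene]:
            per_gene[gene] = value
    # log2-transform the selected per-gene values.
    return {gene: math.log2(value) for gene, value in per_gene.items()}


# Hypothetical probe IDs and intensities, for illustration only.
probe_values = {"probe_1": 256.0, "probe_2": 1024.0, "probe_3": 64.0, "probe_4": 8.0}
probe_to_gene = {"probe_1": "VEGFA", "probe_2": "VEGFA", "probe_3": "CDH1"}

result = collapse_probes(probe_values, probe_to_gene)
print(result)
```

Here the two hypothetical VEGFA probes collapse to the larger intensity before the transform, and the unmapped probe is ignored. Fitting the logistic regression that produces the weight values is a separate step and is not shown.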
4. The method for establishing the comprehensive quantitative evaluation model of the tumor microenvironment according to claim 3, wherein a logistic classifier is used, the logistic classifier being a classifier based on a generalized linear model whose linear discriminant is

$$\mathrm{TMEscore} = \sum_{i} w_i x_i$$

where $w_i$ is the weight value of gene $i$, $x_i$ is the expression value of gene $i$, and TMEscore is the quantitative evaluation value of the tumor microenvironment.
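As an illustration of the linear discriminant in claim 4, the sketch below computes a weighted sum of expression values. It assumes the discriminant is a plain weighted sum with no separate intercept term. Only three of the 96 weights from claim 3 are used to keep the example short, so the result is not a real TMEscore; the function name `tme_score` and the expression values are hypothetical.

```python
def tme_score(weights, expression):
    """Weighted sum of per-gene expression values (weight * expression)."""
    missing = [gene for gene in weights if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(weight * expression[gene] for gene, weight in weights.items())


# Three of the 96 weight values listed in claim 3 (subset for illustration).
WEIGHTS_SUBSET = {"ANG": -20.1306, "ANGPT1": -34.0557, "ANGPT2": 28.6868}

# Hypothetical log2 expression values for a single sample.
sample = {"ANG": 8.0, "ANGPT1": 6.0, "ANGPT2": 7.0}

score = tme_score(WEIGHTS_SUBSET, sample)
print(round(score, 4))
```

Raising an error on a missing gene, rather than silently treating it as zero, is a design choice made here so that an incomplete expression profile cannot produce a misleading score.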
5. The method of claim 4, wherein a tumor sample is classified as metastatic if its TMEscore is greater than the threshold, and as non-metastatic otherwise.
6. The method for establishing the comprehensive quantitative evaluation model of the tumor microenvironment according to claim 4, wherein the threshold of the logistic classifier is -427.9891.
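The decision rule of claims 5 and 6 can be sketched as a strict comparison against the stated threshold. The function name `classify` and the example scores are hypothetical; the threshold is the value given in claim 6, and a score exactly equal to it falls in the non-metastatic class because the claim uses a strict "greater than".

```python
THRESHOLD = -427.9891  # threshold value stated in claim 6


def classify(score, threshold=THRESHOLD):
    """Return 'metastatic' if score > threshold, else 'non-metastatic'."""
    return "metastatic" if score > threshold else "non-metastatic"


# Hypothetical TMEscore values, for illustration only.
for example_score in (-400.0, -427.9891, -500.0):
    print(example_score, classify(example_score))
```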
CN202010718484.9A 2020-07-23 2020-07-23 Gene marker combination for comprehensive quantitative evaluation of tumor microenvironment and application Active CN111778336B (en)

Publications (2)

Publication Number Publication Date
CN111778336A (en) 2020-10-16
CN111778336B (en) 2021-02-26

Family

ID=72763960

Country Status (1)

Country Link
CN (1) CN111778336B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant