CN114927231B - Method and device for predicting early lung adenocarcinoma progress based on gene expression information - Google Patents
- Publication number
- CN114927231B (application CN202210391575.5A)
- Authority
- CN
- China
- Prior art keywords
- adenocarcinoma
- tumor
- expression
- genes
- immune
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B35/00—ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
- G16B35/20—Screening of libraries
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
Abstract
The invention discloses a method and a device for predicting early lung adenocarcinoma progression based on gene expression information. The method comprises: screening out tumor-related genes and immune-related genes using a statistical method; calculating a tumor growth index and an immune function index from the expression of the two groups of genes; and using the difference between the tumor growth index and the immune function index as the tumor progression index of the lung adenocarcinoma, to predict early lung adenocarcinoma progression. The method can predict the tumor-immune system balance state of a lung adenocarcinoma patient and evaluate the progression level of the tumor according to that state.
Description
Technical Field
The invention relates to the technical field of bioinformatics, in particular to a method and a device for predicting early lung adenocarcinoma progress based on gene expression information.
Background
Among all pathological subtypes of lung cancer, lung adenocarcinoma (LUAD) is the most common. If surgical resection can be performed at the stage of precancerous lesions, adenocarcinoma in situ (AIS) or minimally invasive adenocarcinoma (MIA), the five-year postoperative survival rate of patients can reach or approach 100%. Once the disease advances to invasive adenocarcinoma, the patient's prognosis worsens significantly, so it is necessary to study the evolution of lung adenocarcinoma in order to discover new targets and develop new treatments. Although genomic and immune analyses of AIS, MIA and LUAD patients have been reported, systematic studies of the key molecular events that drive the evolution of lung adenocarcinoma are lacking.
Because of the limited number of pre-invasive tumor samples of lung adenocarcinoma, there is currently very little international genomics research on pre-cancerous lesions of lung adenocarcinoma, and no technology is currently available to predict the progression of early lung adenocarcinoma.
Accordingly, the prior art is still in need of improvement and development.
Disclosure of Invention
In view of the above shortcomings of the prior art, the present invention provides a method and a device for predicting early lung adenocarcinoma progression based on gene expression information, in order to solve the problem that the prior art lacks a method and a device capable of accurately predicting early lung adenocarcinoma progression.
The technical scheme of the invention is as follows:
A method of predicting early lung adenocarcinoma progression based on gene expression information, comprising the steps of:
dividing the obtained lung tissue, according to tumor pathological features, into four types in sequence (normal tissue, in-situ adenocarcinoma, micro-invasive adenocarcinoma and invasive adenocarcinoma), and performing whole-transcriptome sequencing on each of the four types of tissue to generate a sequencing library;
forming three control groups from the whole-transcriptome sequencing results of the normal tissue, the in-situ adenocarcinoma, the micro-invasive adenocarcinoma and the invasive adenocarcinoma, wherein the three control groups comprise a normal tissue and in-situ adenocarcinoma control group, an in-situ adenocarcinoma and micro-invasive adenocarcinoma control group, and a micro-invasive adenocarcinoma and invasive adenocarcinoma control group;
performing analysis of variance on each gene in the sequencing library within each of the three control groups to determine differentially expressed genes;
screening, from the differentially expressed genes, a group of tumor-related genes whose expression levels show a significant rising trend in all three control groups, and calculating a tumor growth index from the tumor-related genes;
screening, from the differentially expressed genes, a group of immune-related genes whose expression levels show a significant declining trend in the normal tissue and in-situ adenocarcinoma control group and in the micro-invasive adenocarcinoma and invasive adenocarcinoma control group, and calculating an immune function index from the immune-related genes;
using the difference between the tumor growth index and the immune function index as the tumor progression index of the lung adenocarcinoma, to predict early lung adenocarcinoma progression.
The method for predicting early lung adenocarcinoma progression based on gene expression information, wherein the step of performing analysis of variance on each gene in the sequencing library within the three control groups to determine the differentially expressed genes comprises:
performing analysis of variance on each gene in the sequencing library within each of the three control groups, and taking genes with p < 0.0001 and an inter-group |log2 fold change| of 2 or more as the differentially expressed genes.
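The two thresholds above can be illustrated with a minimal sketch. It assumes the ANOVA p-value and the per-group mean expression levels have already been computed for one control group; the gene names, the input values and the function names are hypothetical and do not come from the patent. Whether a gene must pass the thresholds in one or in all three control groups is not settled by the text, so the sketch filters a single control group only.

```python
import math

def log2_fold_change(mean_earlier, mean_later):
    """log2 of the ratio of the later stage's mean expression to the earlier stage's."""
    return math.log2(mean_later / mean_earlier)

def is_differentially_expressed(p_value, mean_earlier, mean_later,
                                p_cutoff=1e-4, lfc_cutoff=2.0):
    """Apply the two thresholds from the text: p < 0.0001 and |log2 FC| >= 2."""
    lfc = log2_fold_change(mean_earlier, mean_later)
    return p_value < p_cutoff and abs(lfc) >= lfc_cutoff

# Hypothetical per-gene summaries for one control group:
# (ANOVA p-value, mean expression in earlier stage, mean expression in later stage)
example = {
    "GENE_A": (1e-6, 2.0, 16.0),   # log2 FC = 3, passes both thresholds
    "GENE_B": (1e-6, 8.0, 12.0),   # fold change too small
    "GENE_C": (1e-2, 1.0, 32.0),   # p-value too large
    "GENE_D": (5e-5, 40.0, 10.0),  # log2 FC = -2, passes both thresholds
}

selected = sorted(gene for gene, (p, a, b) in example.items()
                  if is_differentially_expressed(p, a, b))
print(selected)
```

The absolute value means that both strongly up-regulated and strongly down-regulated genes are retained, which matches the later screening of rising (tumor-related) and declining (immune-related) genes.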
The method for predicting early lung cancer progression based on gene expression information, wherein the group of tumor-related genes whose expression levels show a significant rising trend in all three control groups, screened from the differentially expressed genes, comprises BCL2L15, COMP, CST1, FAM83A, SLC A5, PGLYRP4, CLPSL2, ARSH, CDH17, COL10A1, SPP1, MMP3, DDX4, FGF11 and CASR.
The method for predicting early lung cancer progression based on gene expression information, wherein the step of calculating a tumor growth index from the tumor-associated gene comprises:
The expression levels of the tumor-related genes are log2-transformed, and then, for each sample, the tumor growth index is calculated as the mean of the log2-transformed expression levels of the tumor-related genes: $$\text{tumor growth index} = \frac{1}{N}\sum_{i=1}^{N}\log_2(\mathrm{TPM}_i)$$ where $\mathrm{TPM}_i$ is the expression level of the $i$-th tumor-related gene and $N$ is the number of tumor-related genes.
The method for predicting early lung cancer progression based on gene expression information, wherein the group of immune-related genes whose expression levels show a significant declining trend in the normal tissue and in-situ adenocarcinoma control group and in the micro-invasive adenocarcinoma and invasive adenocarcinoma control group, screened from the differentially expressed genes, comprises ITLN2, MARCO, C8B, MASP1, CD36, TAL1, PPBP, CDH5, MSR1, TBX21, C6, MCAM, GZMH, CZMB, CXCL12, LILRB2, CXCR1, CXCR2, LAMP3 and IL1RL1.
The method for predicting early lung cancer progression based on gene expression information, wherein the step of calculating an immune function index from the immune-related genes comprises:
The expression levels of the immune-related genes are log2-transformed, and then, for each sample, the immune function index is calculated as the mean of the log2-transformed expression levels of the immune-related genes: $$\text{immune function index} = \frac{1}{n}\sum_{i=1}^{n}\log_2(\mathrm{TPM}_i)$$ where $\mathrm{TPM}_i$ is the expression level of the $i$-th immune-related gene and $n$ is the number of immune-related genes.
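The two index definitions above, together with the difference used as the tumor progression index, can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the function names, the TPM values and the optional `pseudocount` argument are hypothetical. The text does not say how genes with zero expression are handled, so the pseudocount defaults to 0 and a zero TPM would raise an error.

```python
import math

def mean_log2_expression(tpm_by_gene, genes, pseudocount=0.0):
    """Mean of log2-transformed TPM over the given gene set for one sample.

    `pseudocount` is an assumption, not part of the text; with the default
    of 0.0, a gene with TPM == 0 raises ValueError.
    """
    values = [math.log2(tpm_by_gene[g] + pseudocount) for g in genes]
    return sum(values) / len(values)

def tumor_progression_index(tpm_by_gene, tumor_genes, immune_genes, pseudocount=0.0):
    """Tumor growth index minus immune function index for one sample."""
    growth = mean_log2_expression(tpm_by_gene, tumor_genes, pseudocount)
    immune = mean_log2_expression(tpm_by_gene, immune_genes, pseudocount)
    return growth - immune

# Small subsets of the listed gene groups, with made-up TPM values for one sample.
tumor_genes = ["COMP", "CST1", "SPP1"]
immune_genes = ["CD36", "CDH5", "CXCL12"]
sample_tpm = {"COMP": 8.0, "CST1": 16.0, "SPP1": 32.0,
              "CD36": 2.0, "CDH5": 4.0, "CXCL12": 8.0}

growth_index = mean_log2_expression(sample_tpm, tumor_genes)   # (3 + 4 + 5) / 3
immune_index = mean_log2_expression(sample_tpm, immune_genes)  # (1 + 2 + 3) / 3
progression_index = tumor_progression_index(sample_tpm, tumor_genes, immune_genes)
print(growth_index, immune_index, progression_index)
```

Because both indices are means on the log2 scale, their difference is positive when the tumor-related genes are, on average, more highly expressed than the immune-related genes in that sample.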
An apparatus for predicting early lung cancer progression based on gene expression information, comprising:
The sequencing module is used for dividing the acquired lung tissue into four types of normal tissue, in-situ adenocarcinoma, micro-invasive adenocarcinoma and invasive adenocarcinoma in sequence according to tumor pathological characteristics, and respectively carrying out full transcriptome sequencing on the four types of tissue to generate a sequencing library;
the grouping module is used for forming three control groups from the sequencing results of the full transcriptome of the normal tissue, the in-situ adenocarcinoma, the micro-invasive adenocarcinoma and the invasive adenocarcinoma, wherein the three control groups comprise a normal tissue and in-situ adenocarcinoma control group, an in-situ adenocarcinoma and micro-invasive adenocarcinoma control group and a micro-invasive adenocarcinoma and invasive adenocarcinoma control group;
the differential expression gene determining module is used for respectively carrying out variance analysis on each gene in the sequencing library among the three control groups to determine differential expression genes;
a tumor growth index calculation module, which is used for screening a group of tumor-related genes with the expression quantity showing a significant rising trend in three control groups from the differential expression genes, and calculating a tumor growth index according to the tumor-related genes;
The immune function index calculation module is used for screening, from the differentially expressed genes, a group of immune-related genes whose expression levels show a significant declining trend in the normal tissue and in-situ adenocarcinoma control group and in the micro-invasive adenocarcinoma and invasive adenocarcinoma control group, and calculating an immune function index from the immune-related genes;
A tumor progress index calculation module for taking the difference between the tumor growth index and the immune function index as the tumor progress index of the lung adenocarcinoma for predicting the early lung adenocarcinoma progress.
The device for predicting early lung cancer progression based on expression information, wherein the tumor growth index calculation module comprises:
A tumor-related gene screening unit for screening, from the differentially expressed genes, a group of tumor-related genes whose expression levels show a significant rising trend in all three control groups, the group comprising BCL2L15, COMP, CST1, FAM83A, SLC A5, PGLYRP4, CLPSL2, ARSH, CDH17, COL10A1, SPP1, MMP3, DDX4, FGF11 and CASR;
A tumor growth index calculation unit for log2-transforming the expression levels of the tumor-related genes and then, for each sample, calculating the tumor growth index as the mean of the log2-transformed expression levels of the tumor-related genes: $$\text{tumor growth index} = \frac{1}{N}\sum_{i=1}^{N}\log_2(\mathrm{TPM}_i)$$ where $\mathrm{TPM}_i$ is the expression level of the $i$-th tumor-related gene and $N$ is the number of tumor-related genes.
The device for predicting early lung cancer progression based on expression information, wherein the immune function index calculation module comprises:
An immune-related gene screening unit for screening, from the differentially expressed genes, a group of immune-related genes whose expression levels show a significant declining trend in the normal tissue and in-situ adenocarcinoma control group and in the micro-invasive adenocarcinoma and invasive adenocarcinoma control group, the group comprising ITLN2, MARCO, C8B, MASP1, CD36, TAL1, PPBP, CDH5, MSR1, TBX21, C6, MCAM, GZMH, CZMB, CXCL12, LILRB2, CXCR1, CXCR2, LAMP3 and IL1RL1;
An immune function index calculation unit for log2-transforming the expression levels of the immune-related genes and then, for each sample, calculating the immune function index as the mean of the log2-transformed expression levels of the immune-related genes: $$\text{immune function index} = \frac{1}{n}\sum_{i=1}^{n}\log_2(\mathrm{TPM}_i)$$ where $\mathrm{TPM}_i$ is the expression level of the $i$-th immune-related gene and $n$ is the number of immune-related genes.
Beneficial effects: based on whole-transcriptome sequencing data of lung adenocarcinoma at different development stages, the invention screens key genes capable of predicting the evolution from adenocarcinoma in situ (AIS) to minimally invasive adenocarcinoma (MIA) to invasive adenocarcinoma (LUAD). The key genes comprise a group of tumor-related genes associated with the growth potential of the tumor and a group of immune-related genes associated with the functional level of the surrounding immune system. The standardized expression levels of the key genes are used for modeling: the corresponding tumor growth index and immune function index are calculated, and the difference between the tumor growth index and the immune function index is used as the tumor progression index of the lung adenocarcinoma to predict early lung adenocarcinoma progression. The invention can reflect the tumor-immune balance state of lung adenocarcinoma at different development stages, thereby enabling prediction of lung adenocarcinoma progression and patient prognosis.
Drawings
FIG. 1 is a flow chart of a method for predicting early lung adenocarcinoma progression based on gene expression information according to the present invention.
FIG. 2 shows 12 expression patterns of 2023 genes during lung adenocarcinoma progression.
FIG. 3 is a graph showing the results of comparing tumor progression index in A) the data set of the present application, B) the external validation set, and C) the TCGA-LUAD data set for different stages of lung adenocarcinoma progression.
FIG. 4 is a graph showing the comparison of prognosis survival for lung adenocarcinoma patients with different tumor progression indices in the data set of the present application and the TCGA-LUAD data set.
Detailed Description
The invention provides a method and a device for predicting early lung adenocarcinoma progression based on gene expression information. To make the purposes, technical solutions and effects of the invention clearer, the invention is described in further detail below. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a flowchart of a method for predicting early lung adenocarcinoma progress based on gene expression information, which includes the steps of:
S10, dividing the acquired lung tissue into four types of normal tissue, in-situ adenocarcinoma, micro-invasive adenocarcinoma and invasive adenocarcinoma in sequence according to tumor pathological features, and respectively performing full transcriptome sequencing on the four types of tissue to generate a sequencing library;
S20, forming three control groups from the whole-transcriptome sequencing results of the normal tissue, the in-situ adenocarcinoma, the micro-invasive adenocarcinoma and the invasive adenocarcinoma, wherein the three control groups comprise a normal tissue and in-situ adenocarcinoma control group, an in-situ adenocarcinoma and micro-invasive adenocarcinoma control group and a micro-invasive adenocarcinoma and invasive adenocarcinoma control group;
S30, performing analysis of variance on each gene in the sequencing library within each of the three control groups to determine differentially expressed genes;
S40, screening, from the differentially expressed genes, a group of tumor-related genes whose expression levels show a significant rising trend in the three control groups, and calculating a tumor growth index from the tumor-related genes;
S50, screening, from the differentially expressed genes, a group of immune-related genes whose expression levels show a significant declining trend in the normal tissue and in-situ adenocarcinoma control group and in the micro-invasive adenocarcinoma and invasive adenocarcinoma control group, and calculating an immune function index from the immune-related genes;
S60, using the difference value between the tumor growth index and the immune function index as a tumor progress index of the lung adenocarcinoma to predict early lung adenocarcinoma progress.
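The grouping in steps S10 and S20 can be sketched as follows. This is only an illustrative sketch: the stage labels, the function names, and the sample IDs are assumptions introduced here, not part of the described method.

```python
# Stage labels follow the four tissue types named in step S10 (label spelling is an assumption).
STAGES = ["normal", "AIS", "MIA", "LUAD"]


def adjacent_comparisons(stages):
    """Pair each stage with the next one, giving the three control groups of step S20."""
    return list(zip(stages, stages[1:]))


def group_samples_by_stage(stage_by_sample):
    """Collect sample IDs per stage; stage_by_sample maps sample ID -> stage label."""
    groups = {stage: [] for stage in STAGES}
    for sample_id, stage in stage_by_sample.items():
        groups[stage].append(sample_id)
    return groups


# Hypothetical sample IDs, for illustration only.
labels = {"s1": "normal", "s2": "AIS", "s3": "MIA", "s4": "LUAD", "s5": "MIA"}
comparisons = adjacent_comparisons(STAGES)
groups = group_samples_by_stage(labels)
print(comparisons)
print(groups["MIA"])
```

Each pair in `comparisons` corresponds to one control group, so the later differential-expression step runs once per pair.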
Specifically, by analyzing large-scale whole-transcriptome sequencing data from Chinese lung adenocarcinoma patients, the invention examines the changes in gene expression level across the whole course of lung adenocarcinoma, from normal tissue to precancerous lesions and then to invasive lung adenocarcinoma. Two groups of representative genes are screened out by statistical methods: one group reflects the inherent growth potential of the tumor, and the other reflects the functional state of the immune system. A tumor progression index is designed based on the expression levels of the two groups of genes; it can predict the tumor-immune system balance state of a lung adenocarcinoma patient and is used to evaluate the level of tumor progression. The tumor-immune system balance state predicted by the tumor progression index can be verified with external lung adenocarcinoma data, and the technology of the invention provides a theoretical basis and technical support for predicting the progression of lung adenocarcinoma.
The tumor progression index designed by the invention can predict the prognosis of lung adenocarcinoma patients, and the result can be verified with external data, thereby providing a new index for predicting the prognosis of lung adenocarcinoma patients.
Through whole-transcriptome sequencing, the invention screens out gene combinations that reflect the inherent growth potential of tumors and the functional state of the immune system. A specific detection mode for these 2 groups of genes can be designed in the future according to the gene combinations, thereby providing a new approach for the development of detection kits.
The invention is further illustrated by the following examples:
First, fresh lung tumor tissue and paired paracancerous normal tissue from 150 cases, obtained by surgical resection or biopsy, were divided into 4 groups according to the pathological characteristics of the tumor: 150 normal tissues, 16 in-situ adenocarcinomas (AIS), 52 micro-invasive adenocarcinomas (MIA) and 82 invasive adenocarcinomas (LUAD). After sampling, total RNA was extracted using an RNA extraction kit from Macherey-Nagel (Germany), ribosomal RNA was removed, a sequencing library was generated, and paired-end sequencing was performed on the Illumina HiSeq X Ten platform with a read length of 150 bp.
Next, the whole-transcriptome sequencing results of the normal tissue, the in-situ adenocarcinoma, the micro-invasive adenocarcinoma, and the invasive adenocarcinoma are combined into three control groups: a normal tissue and in-situ adenocarcinoma control group (normal tissue vs AIS), an in-situ adenocarcinoma and micro-invasive adenocarcinoma control group (AIS vs MIA), and a micro-invasive adenocarcinoma and invasive adenocarcinoma control group (MIA vs LUAD). Analysis of variance is performed on each gene in the sequencing library within each of the three control groups, and genes with p < 0.0001 and an inter-group |log2 fold change| of 2 or more are taken as differentially expressed genes. Genes that show significant differential expression in at least 1 of the 3 comparisons are then selected for downstream analysis, and 12 expression patterns are determined from their up- or down-regulation between adjacent sample groups, as shown in FIG. 2. Pathway analysis is performed to reveal the biological function of each pattern. Specifically, 2023 statistically different genes were selected in this example, and the 12 expression patterns were determined according to the change in expression level between two adjacent stages of tumor development. For example, in expression pattern 1, the second group is higher than the first, the third is higher than the second, and the fourth is higher than the third, giving a gradually rising trend; in expression pattern 2, the second group is higher than the first, the third is not significantly different from the second, and the fourth is higher than the third; and so on. Biologically significant genes in the upward and downward trends are then selected as representative genes to determine the final candidate genes.
FIG. 2 shows the 12 expression patterns; the leftmost line graph represents the trend of the gene expression levels between adjacent groups, and the annotations in the graph indicate the main biological functions of the genes.
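The thresholds and the pattern assignment described above can be illustrated with a short sketch. It assumes the p-value and log2 fold change for each comparison have already been computed elsewhere, and that the fold change is expressed as later stage relative to earlier stage; the function names and the "up"/"flat"/"down" labels are illustrative assumptions.

```python
P_CUTOFF = 1e-4        # p < 0.0001, as stated in the text
LOG2FC_CUTOFF = 2.0    # |log2 fold change| >= 2, as stated in the text


def direction(p_value, log2_fc):
    """Classify one adjacent-stage comparison as 'up', 'down', or 'flat'."""
    if p_value < P_CUTOFF and abs(log2_fc) >= LOG2FC_CUTOFF:
        return "up" if log2_fc > 0 else "down"
    return "flat"


def expression_pattern(stats):
    """stats: three (p_value, log2_fc) pairs for normal vs AIS, AIS vs MIA, MIA vs LUAD."""
    return tuple(direction(p, fc) for p, fc in stats)


def is_differentially_expressed(stats):
    """A gene is kept if it is significant in at least 1 of the 3 comparisons."""
    return any(d != "flat" for d in expression_pattern(stats))


# Made-up statistics for two hypothetical genes.
rising = [(1e-6, 2.5), (1e-5, 2.1), (1e-7, 3.0)]       # like expression pattern 1
step_rise = [(1e-6, 2.5), (0.2, 0.3), (1e-7, 3.0)]     # like expression pattern 2
unchanged = [(0.5, 0.1), (0.01, 2.4), (1e-6, 1.0)]     # fails one threshold each time

print(expression_pattern(rising))
print(expression_pattern(step_rise))
print(is_differentially_expressed(unchanged))
```

Note that both thresholds must hold at once: a large fold change with p = 0.01, or a very small p-value with a fold change below 2, is still classified as "flat" here.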
Next, to minimize possible confounding from low-expression genes, the invention filters out genes whose average expression level (TPM) across all samples is < 1.0, and screens out from the differentially expressed genes a group of tumor-related genes whose expression levels increase significantly in the three control groups, including BCL2L15, COMP, CST1, FAM83A, SLC A5, PGLYRP4, CLPSL2, ARSH, CDH17, COL10A1, SPP1, MMP3, DDX4, FGF11, CASR. The expression levels of the tumor-related genes are log2-transformed, and for each sample the tumor growth index is calculated as the mean of the log2-transformed expression levels of the tumor-related genes. The calculation formula of the tumor growth index is:
$$\text{tumor growth index} = \frac{1}{N}\sum_{i=1}^{N}\log_2\left(\mathrm{TPM}_i\right)$$
where $\mathrm{TPM}_i$ is the expression level of the $i$-th tumor-related gene and $N$ is the number of tumor-related genes.
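A minimal sketch of the low-expression filter and the per-sample index is given below. The TPM values are made up, `LOWGENE` is a hypothetical gene name, and the `pseudocount` parameter is an assumption added only to show how zero values could be handled; the text itself describes a plain mean of log2-transformed TPM.

```python
import math


def expressed_genes(tpm_across_samples, min_mean_tpm=1.0):
    """Keep genes whose mean TPM over all samples is at least min_mean_tpm."""
    return {
        gene
        for gene, values in tpm_across_samples.items()
        if sum(values) / len(values) >= min_mean_tpm
    }


def mean_log2_index(sample_tpm, genes, pseudocount=0.0):
    """Mean of log2(TPM) over the given genes for one sample.

    pseudocount is NOT in the source text; it defaults to 0 so the result
    matches the plain mean of log2-transformed expression levels.
    """
    logs = []
    for gene in genes:
        value = sample_tpm[gene] + pseudocount
        if value <= 0:
            raise ValueError(f"log2 is undefined for {gene} with TPM {sample_tpm[gene]}")
        logs.append(math.log2(value))
    return sum(logs) / len(logs)


# Made-up cohort-level TPM values (gene -> TPM in each sample).
cohort = {"COMP": [0.5, 3.5], "CST1": [8.0, 8.0], "LOWGENE": [0.1, 0.3]}
kept = expressed_genes(cohort)

# Made-up TPM values for one sample.
sample = {"COMP": 2.0, "CST1": 8.0, "SPP1": 32.0}
tumor_growth_index = mean_log2_index(sample, ["COMP", "CST1", "SPP1"])
print(sorted(kept), tumor_growth_index)
```

The same `mean_log2_index` helper applies unchanged to the immune-related gene set, since both indices are defined as a mean of log2-transformed expression levels.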
Further, a group of immune-related genes whose expression levels decrease significantly in the normal tissue and in-situ adenocarcinoma control group and in the micro-invasive adenocarcinoma and invasive adenocarcinoma control group is screened out from the differentially expressed genes, including ITLN2, MARCO, C8B, MASP1, CD36, TAL1, PPBP, CDH5, MSR1, TBX21, C6, MCAM, GZMH, CZMB, CXCL12, LILRB2, CXCR1, CXCR2, LAMP3, IL1RL1. The expression levels of the immune-related genes are log2-transformed, and for each sample the immune function index is calculated as the mean of the log2-transformed expression levels of the immune-related genes. The calculation formula of the immune function index is:
$$\text{immune function index} = \frac{1}{n}\sum_{i=1}^{n}\log_2\left(\mathrm{TPM}_i\right)$$
where $\mathrm{TPM}_i$ is the expression level of the $i$-th immune-related gene and $n$ is the number of immune-related genes.
Finally, the present invention defines the difference between the tumor growth index and the immune function index as the tumor progression index of lung adenocarcinoma, i.e. tumor progression index = tumor growth index - immune function index. The tumor progression index calculated with this formula gradually increases as the tumor progresses, as shown in FIG. 3, so the present invention considers that the evolution and progression of lung adenocarcinoma can be predicted from the expression levels of these 2 groups of genes. In the present invention, a negative tumor progression index indicates that the immune system has sufficient capacity to inhibit tumor progression, while a positive tumor progression index indicates that the immune system is no longer capable of inhibiting tumor cell growth and the balance between the tumor and the immune system is broken. In our study cohort, the tumor progression index was negative in normal tissue but positive in AIS and later stages, indicating that immune escape was already present in AIS, the pre-invasive stage of lung adenocarcinoma, and became more severe as the disease progressed, as shown in FIG. 3A.
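The difference and its sign-based reading can be written out as a small sketch. The index values below are made up, the function names are illustrative, and the handling of an index of exactly zero is an assumption, since the text only discusses negative and positive values.

```python
def tumor_progression_index(tumor_growth_index, immune_function_index):
    """Tumor progression index = tumor growth index - immune function index."""
    return tumor_growth_index - immune_function_index


def interpret(tpi):
    """Sign-based reading of the index as described in the text."""
    if tpi < 0:
        return "immune system still restrains the tumor"
    if tpi > 0:
        return "tumor-immune balance is broken"
    return "balanced"  # zero is not discussed in the source; labeled here as an assumption


# Made-up index values for two hypothetical samples.
normal_like = tumor_progression_index(3.0, 4.5)
tumor_like = tumor_progression_index(5.0, 3.5)
print(normal_like, interpret(normal_like))
print(tumor_like, interpret(tumor_like))
```

Because both indices are means of log2 values, the difference is on a log2 scale: a change of 1 unit corresponds to a two-fold change in the ratio of the geometric mean expression of the two gene sets.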
To verify the tumor progression index of the present invention, we examined another dataset and observed the same increasing trend: the tumor progression index rose significantly from normal tissue to Atypical Adenomatous Hyperplasia (AAH) and then to LUAD, as shown in FIG. 3B. The present invention further found that in the AAH stage the tumor progression index was negative, at which time the tumor could not overcome the immune system to metastasize further. Although the TCGA-LUAD dataset does not contain the pre-invasive stage of lung adenocarcinoma, we calculated the tumor progression index for each sample and compared the normal and tumor samples. In this partial comparison, we observed that the tumor progression index of the tumor samples was significantly higher than that of the normal samples, as shown in FIG. 3C.
In the invention, 2 groups of representative genes are screened for calculating the tumor progression index. Among the up-regulated genes, the expression product of the BCL2L15 gene is involved in regulating apoptosis. The expression product of the COMP gene is a non-collagenous extracellular matrix protein; its high expression has been reported to promote epithelial-mesenchymal transition of cancer cells and is associated with poor prognosis in patients. The expression product of the CST1 gene is serum cystatin, whose high expression is associated with poor prognosis in a variety of cancers. FAM83A is a possible proto-oncogene that plays a role in the Epidermal Growth Factor Receptor (EGFR) pathway, activating the downstream RAS/MAPK and PI3K/AKT/TOR signaling pathways and promoting cell growth. Among the down-regulated genes, the expression product of the ITLN gene is involved in the body's defense against pathogens. MARCO is a receptor on the surface of macrophages and plays an important role in the innate immunity of the body. The C8B gene encodes the beta chain of complement C8. The MASP1 gene encodes a serine protease that functions in both innate and adaptive immunity as a component of the lectin pathway of complement activation. The CD36 gene encodes a major glycoprotein on the surface of platelets, which serves as a receptor for thrombospondin in platelets and various other cell lines and plays an important role in tumor immunity. TAL1 is related to the origin of hematopoietic malignancies and has been reported to be associated with pre-T cell acute lymphoblastic leukemia and childhood T cell acute lymphoblastic leukemia. The PPBP gene encodes a platelet-derived growth factor belonging to the CXC chemokine family, which activates neutrophils. The CDH5 gene encodes a classical cadherin of the cadherin superfamily.
Furthermore, to investigate whether the tumor progression index designed according to the present application has prognostic guidance value for lung adenocarcinoma patients, we performed survival analysis on the data set of the present application and on the TCGA-LUAD data set. The analysis results showed that, in the data set of the present application, patients with a high tumor progression index had significantly worse recurrence-free survival (RFS) and overall survival (OS), as shown in FIG. 4A and B. In the TCGA-LUAD data set, patients with a higher tumor progression index had worse OS, but the two groups had comparable progression-free survival (PFS), as shown in FIG. 4C and D.
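One way such a comparison could be set up is sketched below. The text does not state how patients were split into high and low groups or which estimator was used, so the median split and the Kaplan-Meier estimator here are assumptions, and all patient IDs, index values, and follow-up times are made up.

```python
from statistics import median


def split_by_median(tpi_by_patient):
    """Split patients into high/low groups at the median index (assumed cutoff)."""
    cutoff = median(tpi_by_patient.values())
    high = [p for p, v in tpi_by_patient.items() if v > cutoff]
    low = [p for p, v in tpi_by_patient.items() if v <= cutoff]
    return high, low


def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up time per patient; events: 1 if the event occurred, 0 if censored.
    Returns a list of (time, survival probability) at each distinct event time.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for ti in times if ti >= t)
        n_events = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - n_events / at_risk
        curve.append((t, survival))
    return curve


# Made-up index values and follow-up data, for illustration only.
tpi = {"a": -1.0, "b": 0.5, "c": 2.0, "d": 3.0}
high, low = split_by_median(tpi)
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
print(high, low)
print(curve)
```

In practice the two groups' curves would be compared with a statistical test such as the log-rank test, which is omitted from this sketch.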
Based on the method, the invention also provides a device for predicting early lung cancer progress based on gene expression information, which comprises:
a sequencing module, which is used for dividing the acquired lung tissue into four types of normal tissue, in-situ adenocarcinoma, micro-invasive adenocarcinoma and invasive adenocarcinoma in sequence according to tumor pathological characteristics, and respectively performing whole-transcriptome sequencing on the four types of tissue to generate a sequencing library;
a grouping module, which is used for forming three control groups from the whole-transcriptome sequencing results of the normal tissue, the in-situ adenocarcinoma, the micro-invasive adenocarcinoma and the invasive adenocarcinoma, wherein the three control groups comprise a normal tissue and in-situ adenocarcinoma control group, an in-situ adenocarcinoma and micro-invasive adenocarcinoma control group and a micro-invasive adenocarcinoma and invasive adenocarcinoma control group;
a differentially expressed gene determining module, which is used for performing analysis of variance on each gene in the sequencing library within each of the three control groups to determine differentially expressed genes;
a tumor growth index calculation module, which is used for screening, from the differentially expressed genes, a group of tumor-related genes whose expression levels show a significant rising trend in the three control groups, and calculating a tumor growth index from the tumor-related genes;
an immune function index calculation module, which is used for screening, from the differentially expressed genes, a group of immune-related genes whose expression levels show a significant declining trend in the normal tissue and in-situ adenocarcinoma control group and in the micro-invasive adenocarcinoma and invasive adenocarcinoma control group, and calculating an immune function index from the immune-related genes;
a tumor progression index calculation module, which is used for taking the difference between the tumor growth index and the immune function index as the tumor progression index of lung adenocarcinoma for predicting early lung adenocarcinoma progression.
By designing and calculating the tumor progression index, the device provided by the invention measures the degree of imbalance between the inherent growth potential of the tumor and the immune microenvironment. The tumor progression index differs significantly across the developmental stages of lung adenocarcinoma and can predict the postoperative survival time of lung adenocarcinoma patients, as verified with an external data set.
In some embodiments, the tumor growth index calculation module comprises:
a tumor-related gene screening unit, which is used for screening out, from the differentially expressed genes, a group of tumor-related genes whose expression levels increase significantly in the three control groups, including BCL2L15, COMP, CST1, FAM83A, SLC A5, PGLYRP4, CLPSL2, ARSH, CDH17, COL10A1, SPP1, MMP3, DDX4, FGF11, CASR;
a tumor growth index calculation unit, which is used for log2-transforming the expression levels of the tumor-related genes and then, for each sample, calculating the tumor growth index as the mean of the log2-transformed expression levels of the tumor-related genes, the calculation formula of the tumor growth index being:
$$\text{tumor growth index} = \frac{1}{N}\sum_{i=1}^{N}\log_2\left(\mathrm{TPM}_i\right)$$
where $\mathrm{TPM}_i$ is the expression level of the $i$-th tumor-related gene and $N$ is the number of tumor-related genes.
In some embodiments, the immune function index calculation module comprises:
an immune-related gene screening unit, which is used for screening out, from the differentially expressed genes, a group of immune-related genes whose expression levels decrease significantly in the normal tissue and in-situ adenocarcinoma control group and in the micro-invasive adenocarcinoma and invasive adenocarcinoma control group, including ITLN2, MARCO, C8B, MASP1, CD36, TAL1, PPBP, CDH5, MSR1, TBX21, C6, MCAM, GZMH, CZMB, CXCL12, LILRB2, CXCR1, CXCR2, LAMP3, IL1RL1;
an immune function index calculation unit, which is used for log2-transforming the expression levels of the immune-related genes and then, for each sample, calculating the immune function index as the mean of the log2-transformed expression levels of the immune-related genes, the calculation formula of the immune function index being:
$$\text{immune function index} = \frac{1}{n}\sum_{i=1}^{n}\log_2\left(\mathrm{TPM}_i\right)$$
where $\mathrm{TPM}_i$ is the expression level of the $i$-th immune-related gene and $n$ is the number of immune-related genes.
In summary, the research results of the invention show that an increase in the inherent growth potential of the tumor and an impaired immune response to the tumor together drive the progression of lung adenocarcinoma. The tumor progression index, built from 2 groups of genes that respectively represent the inherent growth potential of the tumor and the immune function, measures the level of imbalance between the inherent growth potential of the tumor and the tumor immune microenvironment, and therefore has prognostic value for lung adenocarcinoma patients.
It is to be understood that the invention is not limited in its application to the examples described above, but is capable of modification and variation in light of the above teachings by those skilled in the art, and that all such modifications and variations are intended to be included within the scope of the appended claims.
Claims (9)
1. A method for predicting early lung adenocarcinoma progression based on gene expression information, comprising the steps of:
dividing the obtained lung tissue into four types of normal tissue, in-situ adenocarcinoma, micro-invasive adenocarcinoma and invasive adenocarcinoma in sequence according to tumor pathological features, and respectively performing whole-transcriptome sequencing on the four types of tissue to generate a sequencing library;
forming three control groups from the whole-transcriptome sequencing results of the normal tissue, the in-situ adenocarcinoma, the micro-invasive adenocarcinoma and the invasive adenocarcinoma, wherein the three control groups comprise a normal tissue and in-situ adenocarcinoma control group, an in-situ adenocarcinoma and micro-invasive adenocarcinoma control group and a micro-invasive adenocarcinoma and invasive adenocarcinoma control group;
performing analysis of variance on each gene in the sequencing library within each of the three control groups to determine differentially expressed genes;
screening, from the differentially expressed genes, a group of tumor-related genes whose expression levels show a significant rising trend in the three control groups, and calculating a tumor growth index from the tumor-related genes;
screening, from the differentially expressed genes, a group of immune-related genes whose expression levels show a significant declining trend in the normal tissue and in-situ adenocarcinoma control group and in the micro-invasive adenocarcinoma and invasive adenocarcinoma control group, and calculating an immune function index from the immune-related genes;
using the difference between the tumor growth index and the immune function index as the tumor progression index of lung adenocarcinoma for predicting early lung adenocarcinoma progression.
2. The method of predicting early lung adenocarcinoma progression based on gene expression information of claim 1, wherein the step of performing analysis of variance on each gene in the sequencing library within each of the three control groups to determine differentially expressed genes comprises:
performing analysis of variance on each gene in the sequencing library within each of the three control groups, and taking genes with p < 0.0001 and an inter-group |log2 fold change| of 2 or more as differentially expressed genes.
3. The method for predicting early lung cancer progression according to claim 1, wherein the group of tumor-related genes, screened from the differentially expressed genes, whose expression levels increase significantly in the three control groups comprises BCL2L15, COMP, CST1, FAM83A, SLC A5, PGLYRP4, CLPSL2, ARSH, CDH17, COL10A1, SPP1, MMP3, DDX4, FGF11, CASR.
4. The method for predicting early lung cancer progression based on gene expression information of claim 3, wherein the step of calculating a tumor growth index from the tumor-related genes comprises:
log2-transforming the expression levels of the tumor-related genes, and then, for each sample, calculating the tumor growth index as the mean of the log2-transformed expression levels of the tumor-related genes, the calculation formula of the tumor growth index being:
$$\text{tumor growth index} = \frac{1}{N}\sum_{i=1}^{N}\log_2\left(\mathrm{TPM}_i\right)$$
where $\mathrm{TPM}_i$ is the expression level of the $i$-th tumor-related gene and $N$ is the number of tumor-related genes.
5. The method for predicting early lung cancer progression based on gene expression information of claim 1, wherein the group of immune-related genes, screened from the differentially expressed genes, whose expression levels decrease significantly in the normal tissue and in-situ adenocarcinoma control group and in the micro-invasive adenocarcinoma and invasive adenocarcinoma control group comprises ITLN2, MARCO, C8B, MASP1, CD36, TAL1, PPBP, CDH5, MSR1, TBX21, C6, MCAM, GZMH, CZMB, CXCL12, LILRB2, CXCR1, CXCR2, LAMP3, IL1RL1.
6. The method of predicting early lung cancer progression based on gene expression information of claim 5, wherein the step of calculating an immune function index from the immune-related genes comprises:
log2-transforming the expression levels of the immune-related genes, and then, for each sample, calculating the immune function index as the mean of the log2-transformed expression levels of the immune-related genes, the calculation formula of the immune function index being:
$$\text{immune function index} = \frac{1}{n}\sum_{i=1}^{n}\log_2\left(\mathrm{TPM}_i\right)$$
where $\mathrm{TPM}_i$ is the expression level of the $i$-th immune-related gene and $n$ is the number of immune-related genes.
7. An apparatus for predicting early lung cancer progression based on gene expression information, comprising:
a sequencing module, which is used for dividing the acquired lung tissue into four types of normal tissue, in-situ adenocarcinoma, micro-invasive adenocarcinoma and invasive adenocarcinoma in sequence according to tumor pathological characteristics, and respectively performing whole-transcriptome sequencing on the four types of tissue to generate a sequencing library;
a grouping module, which is used for forming three control groups from the whole-transcriptome sequencing results of the normal tissue, the in-situ adenocarcinoma, the micro-invasive adenocarcinoma and the invasive adenocarcinoma, wherein the three control groups comprise a normal tissue and in-situ adenocarcinoma control group, an in-situ adenocarcinoma and micro-invasive adenocarcinoma control group and a micro-invasive adenocarcinoma and invasive adenocarcinoma control group;
a differentially expressed gene determining module, which is used for performing analysis of variance on each gene in the sequencing library within each of the three control groups to determine differentially expressed genes;
a tumor growth index calculation module, which is used for screening, from the differentially expressed genes, a group of tumor-related genes whose expression levels show a significant rising trend in the three control groups, and calculating a tumor growth index from the tumor-related genes;
an immune function index calculation module, which is used for screening, from the differentially expressed genes, a group of immune-related genes whose expression levels show a significant declining trend in the normal tissue and in-situ adenocarcinoma control group and in the micro-invasive adenocarcinoma and invasive adenocarcinoma control group, and calculating an immune function index from the immune-related genes;
a tumor progression index calculation module, which is used for taking the difference between the tumor growth index and the immune function index as the tumor progression index of lung adenocarcinoma for predicting early lung adenocarcinoma progression.
8. The apparatus for predicting early lung cancer progression based on expression information of claim 7, wherein the tumor growth index calculation module comprises:
a tumor-related gene screening unit, which is used for screening out, from the differentially expressed genes, a group of tumor-related genes whose expression levels increase significantly in the three control groups, including BCL2L15, COMP, CST1, FAM83A, SLC A5, PGLYRP4, CLPSL2, ARSH, CDH17, COL10A1, SPP1, MMP3, DDX4, FGF11, CASR;
a tumor growth index calculation unit, which is used for log2-transforming the expression levels of the tumor-related genes and then, for each sample, calculating the tumor growth index as the mean of the log2-transformed expression levels of the tumor-related genes, the calculation formula of the tumor growth index being:
$$\text{tumor growth index} = \frac{1}{N}\sum_{i=1}^{N}\log_2\left(\mathrm{TPM}_i\right)$$
where $\mathrm{TPM}_i$ is the expression level of the $i$-th tumor-related gene and $N$ is the number of tumor-related genes.
9. The apparatus for predicting early lung cancer progression based on expression information of claim 7, wherein the immune function index calculation module comprises:
an immune-related gene screening unit, which is used for screening out, from the differentially expressed genes, a group of immune-related genes whose expression levels decrease significantly in the normal tissue and in-situ adenocarcinoma control group and in the micro-invasive adenocarcinoma and invasive adenocarcinoma control group, including ITLN2, MARCO, C8B, MASP1, CD36, TAL1, PPBP, CDH5, MSR1, TBX21, C6, MCAM, GZMH, CZMB, CXCL12, LILRB2, CXCR1, CXCR2, LAMP3, IL1RL1;
an immune function index calculation unit, which is used for log2-transforming the expression levels of the immune-related genes and then, for each sample, calculating the immune function index as the mean of the log2-transformed expression levels of the immune-related genes, the calculation formula of the immune function index being:
$$\text{immune function index} = \frac{1}{n}\sum_{i=1}^{n}\log_2\left(\mathrm{TPM}_i\right)$$
where $\mathrm{TPM}_i$ is the expression level of the $i$-th immune-related gene and $n$ is the number of immune-related genes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210391575.5A CN114927231B (en) | 2022-04-14 | 2022-04-14 | Method and device for predicting early lung adenocarcinoma progress based on gene expression information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114927231A (en) | 2022-08-19 |
CN114927231B (en) | 2024-07-09 |
Family
ID=82807432
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210391575.5A Active CN114927231B (en) | 2022-04-14 | 2022-04-14 | Method and device for predicting early lung adenocarcinoma progress based on gene expression information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114927231B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109841281A (en) * | 2017-11-29 | 2019-06-04 | 郑州大学第一附属医院 | Construction method based on coexpression similitude identification adenocarcinoma of lung early diagnosis mark and risk forecast model |
CN112582028A (en) * | 2020-12-30 | 2021-03-30 | 华南理工大学 | Lung cancer prognosis prediction model, construction method and device |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1892303A1 (en) * | 2006-08-22 | 2008-02-27 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Methods for identifying therapeutical targets in tumors and for determining and targeting angiogenesis and hemostasis related to adenocarcinomas of the lung |
CN113140258B (en) * | 2021-04-28 | 2024-03-19 | 上海海事大学 | Method for screening potential prognosis biomarkers of lung adenocarcinoma based on tumor invasive immune cells |
Also Published As
Publication number | Publication date |
---|---|
CN114927231A (en) | 2022-08-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |