CN113234831A - Model, product and system for predicting lung cancer prognosis
- Publication number: CN113234831A
- Application number: CN202110729777.1A
- Authority: CN (China)
- Prior art keywords: lung cancer, biomarkers, prognosis, kit, ERO1A
- Legal status: Pending (the listed status is an assumption, not a legal conclusion)
Classifications
- C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes, for diseases caused by alterations of genetic material, for cancer
- G01N33/57423: Immunoassay; Biospecific binding assay; Materials therefor, for cancer; Specifically defined cancers of lung
- G01N33/57484: Immunoassay; Biospecific binding assay; Materials therefor, for cancer, involving compounds serving as markers for tumor, cancer, neoplasia, e.g. cellular determinants, receptors, heat shock/stress proteins, A-protein, oligosaccharides, metabolites
- G01N33/57488: as G01N33/57484, involving compounds identifiable in body fluids
- G16B25/20: Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
- G16B40/00: ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
- G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for computer-aided diagnosis, e.g. based on medical expert systems
- G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining, for mining of medical data, e.g. analysing previous cases of other patients
- C12Q2600/118: Prognosis of disease development
- C12Q2600/136: Screening for pharmacological compounds
- C12Q2600/158: Expression markers
Abstract
The invention discloses a model, a product and a system for predicting lung cancer prognosis.
Description
Technical Field
The invention relates to the field of disease diagnosis, in particular to a model, a product and a system for predicting lung cancer prognosis.
Background
Lung cancer is the most common malignancy worldwide, with the highest morbidity and mortality among both men and women (Bray F, Ferlay J, et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: a cancer journal for clinicians, 68(6), 394 (2018)). Lung cancer is classified into small cell lung cancer (SCLC) and non-small cell lung cancer (NSCLC), and 80-85% of lung cancer patients have NSCLC. NSCLC is broadly classified into three histological types: lung adenocarcinoma, lung squamous cell carcinoma and large cell carcinoma, with lung adenocarcinoma being the predominant histological type, accounting for about 40% (Bender E. Epidemiology: Dominant malignancy. Nature, 513(7517), S2-3 (2014)). Different histological types respond differently to chemotherapy. The development of lung cancer is very complex. Research over the past decades has shown that mutations in certain genes (KRAS, EGFR, HER2, MET, PIK3CA) and rearrangements of the ROS1 and ALK genes play an important role in the pathogenesis of lung cancer. These alterations have also become key targets of current lung cancer treatment and have laid a foundation for the era of personalized medicine (Bergethon K, Shaw AT, Ou SH et al. ROS1 rearrangements define a unique molecular class of lung cancers. Journal of clinical oncology: official journal of the American Society of Clinical Oncology, 30(8), 863-). In recent years, significant progress has been made in the diagnosis and treatment of lung cancer owing to the spread of early tumor screening, advances in medical technology and improvements in lifestyle. However, epidemiological data show that the 5-year overall survival rate across all stages of lung cancer is as low as 15.9% (Ettinger DS, Akerley W, Borghaei H et al. Non-small cell lung cancer, version 2.2013. Journal of the National Comprehensive Cancer Network: JNCCN, 11(6), 645-653; quiz 653 (2013)), and the main factors affecting the survival time of lung cancer patients are relapse and metastasis.
Currently, the TNM (tumor, node, metastasis) staging system is commonly used in clinical practice as an index for judging the prognosis of lung cancer patients. The lung cancer TNM staging standard is promulgated by the Union for International Cancer Control (UICC) and is the most widely applied tumor staging system in current lung cancer diagnosis and treatment. The TNM staging system assigns one of four stages (stage I, stage II, stage III and stage IV) according to three indexes: the state of the primary tumor (T), the regional lymph node status (N) and the distant metastasis status (M). However, the TNM staging system has limited predictive capability, and there is a strong clinical need for novel markers that can accurately predict the prognosis of patients with lung cancer (Shi X, Li R, Dong X et al. IRGS: an immune-related gene classifier for lung adenocarcinoma prognosis. Journal of Translational Medicine, 18(1), 55 (2020)).
Disclosure of Invention
The object of the invention is to provide the use of biomarkers in predicting lung cancer prognosis, as well as a product and a system/device that predict lung cancer prognosis using molecular markers.
In order to achieve the above objects, the present invention provides, in a first aspect, use of a reagent for detecting biomarkers including ANLN, ERO1A, FAM83A, KRT8, and/or TMEM178A in the preparation of a product for predicting lung cancer prognosis.
Further, the biomarkers are ANLN, ERO1A, FAM83A, KRT8, and TMEM178A.
Further, the reagent comprises a reagent for detecting the expression level of the biomarker in a sample by digital imaging technology, protein immunoassay technology, staining technology, nucleic acid sequencing technology, nucleic acid hybridization technology, chromatographic technology or mass spectrometry technology.
Further, the sample comprises tissue and body fluid.
In a second aspect, the invention provides a product for predicting the prognosis of lung cancer, the product comprising reagents for detecting biomarkers comprising ANLN, ERO1A, FAM83A, KRT8 and/or TMEM178A.
Further, the product comprises a chip and a kit.
Further, the kit comprises a qPCR kit, an immunoblotting detection kit, an immunochromatography detection kit, a flow cytometry kit, an immunohistochemical detection kit, an ELISA kit and an electrochemiluminescence detection kit.
Further, the kit also includes instructions for predicting a prognosis for lung cancer.
Further, the reagents comprise primers or probes that specifically bind to the biomarker genes; an antibody, peptide, aptamer, or compound that specifically binds to the marker protein.
In a third aspect, the present invention provides a system/apparatus for predicting lung cancer prognosis, comprising:
an acquisition unit for acquiring data of biomarkers in a sample to be tested, wherein the biomarkers comprise ANLN, ERO1A, FAM83A, KRT8 and/or TMEM178A; and
a processing unit for inputting the data of the biomarkers into a lung cancer prognosis prediction model to obtain a prediction result of lung cancer progression for the sample to be tested.
Further, the prognostic prediction model is a Cox regression model.
Further, the Cox regression model is a LASSO Cox regression model.
Further, the formula of the prognostic prediction model is: risk score = C1 × ExpANLN + C2 × ExpERO1A + C3 × ExpFAM83A + C4 × ExpKRT8 + C5 × ExpTMEM178A;
wherein ExpANLN, ExpERO1A, ExpFAM83A, ExpKRT8 and ExpTMEM178A represent the expression levels of ANLN, ERO1A, FAM83A, KRT8 and TMEM178A respectively.
Further, C1, C2, C3, C4 and C5 are 0.1452, 0.1702, 0.0722, 0.1918 and -0.0861, respectively.
A fourth aspect of the present invention provides a computer-readable storage medium storing a program for executing a lung cancer prognosis prediction model constructed from the biomarkers ANLN, ERO1A, FAM83A, KRT8, and/or TMEM178A.
Further, the prognostic prediction model is a Cox regression model.
Further, the Cox regression model is a LASSO Cox regression model.
Further, the formula of the prognostic prediction model is: risk score = C1 × ExpANLN + C2 × ExpERO1A + C3 × ExpFAM83A + C4 × ExpKRT8 + C5 × ExpTMEM178A.
Further, C1, C2, C3, C4 and C5 are 0.1452, 0.1702, 0.0722, 0.1918 and -0.0861, respectively.
A fifth aspect of the present invention provides an electronic apparatus, comprising:
a client component, wherein the client component comprises a user interface;
a server component, wherein the server component comprises at least one memory unit configured to receive data input comprising sequencing data for biomarkers generated from a sample, the biomarkers including ANLN, ERO1A, FAM83A, KRT8, and/or TMEM178A;
the user interface operatively coupled with the server component; and
a computer processor operatively coupled to the at least one memory unit, wherein the computer processor is programmed to execute a program that runs a lung cancer prognosis prediction model constructed from the biomarkers.
Further, the prognostic prediction model is a Cox regression model.
Further, the Cox regression model is a LASSO Cox regression model.
Further, the formula of the prognostic prediction model is: risk score = C1 × ExpANLN + C2 × ExpERO1A + C3 × ExpFAM83A + C4 × ExpKRT8 + C5 × ExpTMEM178A.
Further, C1, C2, C3, C4 and C5 are 0.1452, 0.1702, 0.0722, 0.1918 and -0.0861, respectively.
In a sixth aspect, the present invention provides the use of a reagent for detecting biomarkers including ANLN, ERO1A, FAM83A, KRT8 and/or TMEM178A in the manufacture of a product for evaluating the effect of a medicament on treating lung cancer.
Further, the reagents comprise primers or probes that specifically bind to the biomarker genes; an antibody, peptide, aptamer, or compound that specifically binds to the marker protein.
The invention has the advantages and beneficial effects that:
According to the invention, ANLN, ERO1A, FAM83A, KRT8 and/or TMEM178A are selected as biomarkers, so that the prognosis of a lung cancer patient can be effectively predicted, enabling early intervention and early treatment.
Drawings
FIG. 1 is a survival curve for the combination of ANLN, ERO1A, FAM83A, KRT8, and TMEM178A predicting the prognosis of lung adenocarcinoma in the training set;
FIG. 2 is a survival curve for the combination of ANLN, ERO1A, FAM83A, KRT8, and TMEM178A predicting the prognosis of lung adenocarcinoma in the validation set;
FIG. 3 is a ROC curve for the combination of ANLN, ERO1A, FAM83A, KRT8, and TMEM178A predicting the prognosis of lung adenocarcinoma in the training set;
FIG. 4 is a ROC curve for the combination of ANLN, ERO1A, FAM83A, KRT8, and TMEM178A predicting the prognosis of lung adenocarcinoma in the validation set.
Detailed Description
Some aspects and embodiments of the invention will now be discussed with reference to the figures. Other aspects and embodiments will become apparent to those skilled in the art. All documents mentioned herein are incorporated herein by reference.
Sample(s)
As used herein, a "sample" may be a cell or tissue sample (e.g., a biopsy), a biological fluid, an extract (e.g., a protein or DNA extract obtained from a subject). In particular, the sample may be a tumor sample, e.g. a solid tumor, e.g. lung adenocarcinoma. The sample may be a sample freshly obtained from the subject, or may be a sample that has been processed and/or stored (e.g., frozen, fixed, or subjected to one or more purification, enrichment, or extraction steps) prior to making the determination.
As used herein, "and/or" should be viewed as specifically disclosing each of the two specified features or components, with or without the other. For example, "a and/or B" will be considered a specific disclosure of each of (i) a, (ii) B, and (iii) a and B, as if each were individually listed herein.
Biomarkers
As used herein, "biomarker" refers to a biomolecule that is present in an individual at different concentrations that can be used to predict the cancer status of the individual. Biomarkers can include, but are not limited to, nucleic acids, proteins, and variants and fragments thereof. A biomarker may be DNA comprising all or part of a nucleic acid sequence encoding the biomarker, or the complement of such a sequence. Biomarker nucleic acids useful in the present invention are considered to include DNA and RNA comprising all or part of any nucleic acid sequence of interest.
In a particular embodiment of the invention, the biomarkers comprise ANLN, ERO1A, FAM83A, KRT8 and/or TMEM178A. The biomarkers ANLN (anillin actin binding protein, gene ID: 54443), ERO1A (endoplasmic reticulum oxidoreductase 1 alpha, gene ID: 30001), FAM83A (family with sequence similarity 83 member A, gene ID: 84985), KRT8 (keratin 8, gene ID: 3856) and TMEM178A (transmembrane protein 178A, gene ID: 130733) include the genes and their encoded proteins, as well as homologues, mutants, and equivalents thereof. The term encompasses full-length, unprocessed biomarkers, as well as any form of biomarker that results from processing in a cell. The term encompasses naturally occurring variants (e.g., splice variants or allelic variants) of the biomarkers. The gene IDs are available at https://www.ncbi.nlm.nih.gov/gene/. The nucleotide sequence of each gene disclosed under its NCBI gene ID as of 23 June 2021 is expressly incorporated herein by reference.
Gene expression
Reference to determining an expression level refers to determining the expression level of an expression product of a gene. The expression level can be determined at the nucleic acid level or at the protein level.
The determined gene expression level can be considered to provide an expression profile. By "expression profile" is meant a set of data relating to the expression levels of one or more related genes in an individual in a form that allows comparison with comparable expression profiles (e.g., from individuals with known prognoses) to help determine the prognosis and select an appropriate treatment for the individual patient.
Determination of the gene expression level may involve determining the presence or amount of mRNA in a cancer cell sample. Methods for doing so are well known to the skilled person. Gene expression levels can be determined in cancer cell samples using any conventional method, for example using nucleic acid microarrays or using nucleic acid synthesis (e.g., quantitative PCR).
Alternatively or additionally, the determination of the level of gene expression may involve determining the level of protein expressed from the gene in a sample comprising cancer cells obtained from the individual. Protein expression levels can be determined by any useful means, including the use of immunoassays. For example, expression levels can be determined by Immunohistochemistry (IHC), western blotting, ELISA, immunoelectrophoresis, immunoprecipitation, flow cytometry, mass cytometry, and immunostaining. Using any of these methods, the relative expression levels of the proteins of the biomarkers disclosed herein can be determined.
As an alternative embodiment, the expression level of the gene may also be detected using advanced sequencing methods. For example, biomarkers can be detected using Illumina next-generation sequencing (e.g., sequencing-by-synthesis or TruSeq methods using, for example, the HiSeq, HiScan, Genome Analyzer, or MiSeq systems). Biomarkers can also be detected using Ion Torrent sequencing or other suitable semiconductor sequencing methods.
As an alternative embodiment, biomarkers can be quantified by mass spectrometry using RNase mapping. Prior to analysis by MS or tandem MS (MS/MS) methods, the isolated RNA may be enzymatically digested with a highly specific RNA endonuclease (RNase) (e.g., RNase T1, which cleaves at the 3' side of all unmodified guanosine residues). The first method developed used reverse-phase HPLC coupled directly to ESI-MS to perform on-line chromatographic separation of endonuclease digests. The presence of post-transcriptional modifications can be revealed by mass shifts from the values expected based on the RNA sequence. Ions with abnormal mass/charge values can then be isolated for tandem MS sequencing, thereby locating the sequence position of the post-transcriptionally modified nucleoside.
Matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) has also been used as an analytical method to obtain information about post-transcriptionally modified nucleosides. MALDI-based methods can be distinguished from ESI-based methods by separation steps. In MALDI-MS, mass spectrometry is used to separate biomarkers.
The term "primer" as used herein refers to a short nucleic acid sequence having a free 3'-hydroxyl group, which can form base pairs with a complementary template and serves as the starting point for replication of the template strand. The primers can prime DNA synthesis in the presence of reagents for polymerization (i.e., DNA polymerase or reverse transcriptase) and four different nucleoside triphosphates in an appropriate buffer solution and at an appropriate temperature. The PCR conditions and the lengths of the sense and antisense primers can be appropriately selected according to the techniques known in the art.
The term "probe" as used herein refers to a nucleic acid fragment (e.g., RNA or DNA) corresponding to several bases to several hundred bases that can specifically bind to mRNA, and the presence or absence and expression level of a particular mRNA can be confirmed by a tag. The probe may be prepared in the form of an oligonucleotide probe, a single-stranded DNA probe, a double-stranded DNA probe, or an RNA probe. Suitable probes and hybridization conditions may be appropriately selected according to techniques known in the art.
The term "antibody" as used herein is well known in the art and refers to a specific immunoglobulin directed against an antigenic site. The antibody of the present invention refers to an antibody that specifically binds to the biomarker protein of the present invention, and can be produced according to a conventional method in the art. Forms of antibodies include polyclonal or monoclonal antibodies, antibody fragments (such as Fab, Fab', F(ab')2, and Fv fragments), single-chain Fv (scFv) antibodies, multispecific antibodies (such as bispecific antibodies), monospecific antibodies, monovalent antibodies, chimeric antibodies, humanized antibodies, human antibodies, fusion proteins comprising an antigen binding site of an antibody, and any other modified immunoglobulin molecule comprising an antigen binding site, so long as the antibody exhibits the desired biological binding activity.
The term "peptide" as used herein refers to a peptide that binds a target substance with high affinity and does not denature during heat or chemical treatment. Owing to its small size, it can also be attached to other proteins and used as a fusion protein. In particular, since it can be specifically attached to a high-molecular-weight protein chain, it can be used in diagnostic kits and as a drug delivery substance.
The term "aptamer" as used herein refers to a polynucleotide composed of a specific type of single-stranded nucleic acid (DNA, RNA or modified nucleic acid) which itself has a stable tertiary structure and has the property of being able to bind with high affinity and specificity to a target molecule. As described above, since the aptamer can specifically bind to an antigenic substance like an antibody, but is more stable and has a simple structure than a protein, and is composed of a polynucleotide that is easily synthesized, it can be used instead of an antibody.
In addition, the kit of the present invention may comprise an antibody that specifically binds to the marker component; a secondary antibody conjugate conjugated to a marker developed by reaction with a substrate; a chromogenic substrate solution that undergoes a chromogenic reaction with the marker, a washing solution, an enzyme reaction termination solution, and the like, and may be prepared as a plurality of separate packages or compartments containing the reagent components used.
Prognosis
Whether a prognosis is considered good or poor can vary with the cancer and the disease stage. In general, a good prognosis is one in which overall survival (OS) and/or progression-free survival (PFS) is longer than the average for that stage and cancer type. If PFS and/or OS is below the average for the stage and type of cancer, the prognosis may be considered poor. The average may be the median OS or PFS.
In general, a "good prognosis" is a prognosis in which the survival (OS and/or PFS) of an individual patient may be favorable compared to the expectation for a population of patients in a comparable disease setting. This can be defined as better than median survival (i.e., survival exceeding that of 50% of the patients in the population).
Chip/kit
In the present invention, a "chip", also referred to as an "array", refers to a solid support comprising attached nucleic acid or peptide probes. Arrays typically comprise a plurality of different nucleic acid or peptide probes attached to the surface of a substrate at different known locations. These arrays, also known as "microarrays," can generally be produced using mechanical synthesis methods or light-directed synthesis methods that combine photolithography with solid-phase synthesis. The array may comprise a flat surface, or may be nucleic acids or peptides on beads, gels, polymer surfaces, fibers such as optical fibers, glass, or any other suitable substrate. The array may be packaged in a manner that allows for diagnostic use or other manipulation of the fully functional device.
A "microarray" is an ordered arrangement of hybridizable array elements, such as polynucleotide probes (e.g., oligonucleotides) or binding agents (e.g., antibodies), on a substrate. The substrate may be a solid substrate, for example, a glass or silica slide, beads, or a fiber optic binder, or a semi-solid substrate, for example, a nitrocellulose membrane. The nucleotide sequence may be DNA, RNA or any permutation thereof.
In the present invention, the components of the kit may be packaged in an aqueous medium or in lyophilized form. Suitable containers in the kit generally include at least one vial, test tube, flask, bottle, syringe, or other container in which a component may be placed and, preferably, suitably aliquoted. Where more than one component is present in the kit, the kit will also typically comprise a second, third or other additional container in which the additional components are separately disposed. However, different combinations of components may be contained in one vial. The kit of the invention will also typically include a container for holding the reagents, sealed for commercial sale. Such containers may include injection-molded or blow-molded plastic containers in which the desired vials may be retained.
Classification method based on gene expression
The present invention provides methods for classifying, predicting or monitoring cancer in a subject. In particular, one or more pattern recognition algorithms may be used to evaluate data obtained from gene expression analysis. Such analytical methods may be used to form predictive models that may be used to classify test data. For example, one convenient and particularly effective classification method employs multivariate statistical analysis modeling, first using data from samples from known subgroups (e.g., from subjects known to have a particular cancer prognosis subgroup: high risk and low risk) ("modeled data") to form a model ("predictive model"), and second classifying unknown samples (e.g., "test samples") according to subgroups.
Pattern recognition methods have been widely used to characterize many different types of problems, for example in linguistics, fingerprinting, chemistry, and psychology. In the case of the methods described herein, pattern recognition is the use of multivariate statistics (both parametric and non-parametric) to analyze the data and thereby classify the samples based on a series of observed measurements and predict the values of some dependent variables. There are two main approaches. One group of methods is referred to as "unsupervised"; these simply reduce the data complexity in a reasonable manner and also produce a display map that can be interpreted by the human eye.
Another approach is referred to as "supervised", in which a mathematical model is generated using a training set of samples with known classes or outcomes and then evaluated using a separate validation dataset. Here, a "training set" of gene expression data is used to construct a statistical model that correctly predicts the "subgroup" of each sample. The model is then tested with independent data (called a test or validation set) to determine its robustness. These models are sometimes referred to as "expert systems," but may be based on a range of different mathematical procedures, such as support vector machines, decision trees, k-nearest neighbors and naive Bayes. Supervised methods may use datasets with reduced dimensionality (e.g., the first few principal components), but typically use unreduced data with all dimensions. In all cases, these methods allow the quantitative description of the multivariate boundaries that characterize and separate each subtype according to its intrinsic gene expression profile. Confidence limits for any prediction, e.g., a probability level for the goodness of fit, may also be obtained. The robustness of the predictive model can also be checked using cross-validation, by omitting selected samples from the analysis.
Unsupervised approaches of the kind described above, however, may not be suitable for developing clinical assays that can be used to classify samples derived from a subject without relying on the initial sample population used to train the predictive algorithm.
System/apparatus
The system/apparatus of the present invention comprises an acquisition unit for acquiring data of biomarkers in a sample to be tested, wherein the biomarkers comprise ANLN, ERO1A, FAM83A, KRT8 and/or TMEM178A;
and a processing unit for inputting the data of the biomarkers into a lung cancer prognosis prediction model to obtain a prediction result of lung cancer progression for the sample to be tested.
A device as applied herein shall at least comprise the above-mentioned units. The units of the device are operatively connected to each other. How the units are operatively linked will depend on the type of unit contained in the device. For example, in case a tool for automatic quantitative measurement of biomarkers is applied in the acquisition unit, the data obtained by said automatic operation unit may be processed by a processing unit, e.g. by a computer program running on a computer as data processor, in order to facilitate the diagnosis. In one embodiment, the data processor performs a comparison of the amount of the biomarker to a reference.
Further, in this case, the units may be comprised in a single device. However, the acquisition unit and the processing unit may also be physically separate. In that case, the operative connection may be realized via wired or wireless connections between the units that allow data transmission. The wireless connection may use a wireless LAN (WLAN) or the Internet. The wired connection may be achieved by optical or non-optical cable connections between the units. The cables used for the wired connection are also suitable for high-throughput data transmission.
Readable storage medium
The present invention provides a computer-readable storage medium storing a program for executing a lung cancer prognosis prediction model constructed from the biomarkers ANLN, ERO1A, FAM83A, KRT8, and/or TMEM178A. A computer-readable medium carrying computer-executable code may take many forms, including but not limited to tangible storage media, carrier wave media, or physical transmission media. Non-volatile storage media include, for example, optical or magnetic disks, such as any storage device in any computer or the like; volatile storage media include dynamic memory, such as the main memory of such a computer platform. Tangible transmission media include coaxial cables, copper wire and fiber optics, including the wires that comprise a bus within a computer system. Carrier-wave transmission media can take the form of electrical or electromagnetic signals, or acoustic or light waves, such as those generated during radio frequency and infrared data communications. Thus, common forms of computer-readable media include, for example: a floppy disk, a flexible disk, a hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer can read programming code and/or data. Many of these forms of computer-readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.
The following detailed description of embodiments of the present application will be made with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present application, are given by way of illustration and explanation only, and are not intended to limit the present application.
Example: Gene markers associated with the diagnosis and prognosis of lung cancer
1. Data download
RNA-seq data and clinical information for lung adenocarcinoma were obtained from the TCGA database; samples with missing survival information or a survival time of 0 were removed, and the remaining 496 samples were used as the training set. Microarray data and clinical information for lung adenocarcinoma were obtained from the GEO database; samples with missing survival information or a survival time of 0 were removed, and the remaining 226 samples were used as the validation set.
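The sample-exclusion rule described above can be sketched as a simple filter. This is a minimal illustration only: the `Sample` record, its field names and the toy cohort are hypothetical and are not taken from the TCGA or GEO data.

```python
from typing import List, NamedTuple, Optional


class Sample(NamedTuple):
    """Hypothetical clinical record; field names are assumptions."""
    sample_id: str
    survival_days: Optional[float]  # None when survival information is missing
    event: Optional[int]            # 1 = death observed, 0 = censored, None = missing


def keep_usable(samples: List[Sample]) -> List[Sample]:
    """Drop samples with missing survival information or a survival time of 0."""
    return [
        s for s in samples
        if s.survival_days is not None
        and s.event is not None
        and s.survival_days > 0
    ]


# Toy cohort (made-up values, for illustration only).
cohort = [
    Sample("S1", 420.0, 1),
    Sample("S2", None, None),   # missing survival information: removed
    Sample("S3", 0.0, 0),       # survival time of 0: removed
    Sample("S4", 1310.0, 0),
]
usable = keep_usable(cohort)
print([s.sample_id for s in usable])
```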
2. Data normalization
RNA-seq data from TCGA were normalized by the voom method, and microarray data from GEO were normalized by the RMA method.
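As a rough illustration of the first step of voom, the sketch below computes log2 counts-per-million with the usual offsets of 0.5 on the count and 1 on the library size. It is only a partial sketch under that assumption: the mean-variance modelling and precision weights that voom also produces, and the RMA procedure used for the microarray data, are not shown. The gene names and counts are made up.

```python
import math
from typing import Dict


def log2_cpm(counts: Dict[str, int]) -> Dict[str, float]:
    """log2 counts-per-million for one sample: log2((count + 0.5) / (library size + 1) * 1e6)."""
    library_size = sum(counts.values())
    return {
        gene: math.log2((count + 0.5) / (library_size + 1.0) * 1e6)
        for gene, count in counts.items()
    }


# Toy read counts for a single sample (made-up values).
toy_counts = {"GENE_A": 0, "GENE_B": 999999}
toy_logcpm = log2_cpm(toy_counts)
print(toy_logcpm)
```

The offsets keep the logarithm defined for genes with zero reads.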
3. Univariate Cox analysis
Univariate Cox analysis was performed on the genes in the training set and the validation set, and genes associated with lung cancer patient survival in both data sets were screened; a gene with P < 0.05 was considered to affect the survival of lung cancer patients.
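The screening step, keeping only genes that reach P < 0.05 in both data sets, amounts to a set intersection. The sketch below assumes the per-gene P values have already been computed by a Cox analysis elsewhere; the gene names and P values are placeholders, not results from the study.

```python
from typing import Dict, List, Set


def significant_genes(p_values: Dict[str, float], alpha: float = 0.05) -> Set[str]:
    """Genes whose univariate Cox P value is below the threshold."""
    return {gene for gene, p in p_values.items() if p < alpha}


def shared_survival_genes(
    train_p: Dict[str, float], valid_p: Dict[str, float], alpha: float = 0.05
) -> List[str]:
    """Genes significant in both the training and the validation set."""
    return sorted(significant_genes(train_p, alpha) & significant_genes(valid_p, alpha))


# Placeholder P values (made up for illustration).
train_p = {"GENE_A": 0.001, "GENE_B": 0.20, "GENE_C": 0.03}
valid_p = {"GENE_A": 0.04, "GENE_B": 0.01, "GENE_C": 0.049}
shared = shared_survival_genes(train_p, valid_p)
print(shared)
```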
4. LASSO Cox regression analysis
LASSO Cox regression analysis was performed to construct a LASSO regression model, with the TCGA data as the training set and the GEO data as the test set. A prognostic gene signature was constructed as a linear combination of the LASSO Cox regression coefficients and the mRNA expression levels, yielding a risk score formula.
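For orientation, the standard textbook form of the L1-penalized (LASSO) Cox estimate is given below. The notation is an assumption introduced here and does not appear in the source: x_i is the expression vector of sample i, t_i its survival time, δ_i its event indicator (1 = death observed), R(t_i) the set of samples still at risk at time t_i, and λ the penalty strength. Tied event times are ignored, and the source does not state how λ was chosen.

```latex
\hat{\beta} = \arg\max_{\beta}\left[\sum_{i:\,\delta_i = 1}\left(x_i^{\top}\beta - \log\sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right)\right) - \lambda\sum_{k=1}^{p}\lvert\beta_k\rvert\right],
\qquad \text{risk score}(x) = x^{\top}\hat{\beta}
```

The L1 penalty shrinks some coefficients exactly to zero, which is how a short gene signature can emerge from a larger candidate list.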
For validation on the GEO set, the risk score of each sample was calculated with the same formula. All samples were divided into a high-risk group and a low-risk group according to the median risk score, followed by survival analysis and receiver operating characteristic (ROC) curve analysis.
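The median split can be sketched as follows. The handling of scores exactly equal to the median (assigned to the low-risk group here) is an assumption, since the source does not specify it; the sample identifiers and scores are made up.

```python
from statistics import median
from typing import Dict, Tuple


def split_by_median(risk_scores: Dict[str, float]) -> Tuple[float, Dict[str, str]]:
    """Label each sample 'high' or 'low' relative to the cohort median risk score."""
    cutoff = median(risk_scores.values())
    # Assumption: scores equal to the median fall in the low-risk group.
    groups = {
        sample_id: ("high" if score > cutoff else "low")
        for sample_id, score in risk_scores.items()
    }
    return cutoff, groups


# Toy risk scores (made-up values).
toy_scores = {"S1": 0.25, "S2": 1.0, "S3": 1.5, "S4": 2.0}
cutoff, groups = split_by_median(toy_scores)
print(cutoff, groups)
```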
5. Survival Curve analysis
Survival analysis was performed and survival curves were plotted for the lung cancer patients in the high-risk and low-risk groups of the training set and the validation set using the R packages "survival", "survminer" and "ggplot2", and differences between the groups were compared by the log-rank test.
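To make the survival-curve step concrete, the sketch below implements the Kaplan-Meier product-limit estimator in plain Python. It is not the R "survival" package used in the example and it omits the log-rank test and plotting; the function name and the toy follow-up data are assumptions.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct time with at least one death.

    events[i] is 1 if death was observed at times[i], 0 if the subject was censored.
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


# Toy follow-up data in months (made-up values).
km_curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
print(km_curve)
```

Running the estimator separately on the high-risk and low-risk groups gives the two curves that a log-rank test would then compare.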
6. ROC curve analysis
To evaluate the accuracy of the prognostic model in predicting lung cancer prognosis, the R packages "survival" and "timeROC" were used to assess the 1-year, 3-year and 5-year prognostic performance of the biomarkers with time-dependent ROC curves. The significance of differences between ROC curves was tested by the bootstrap method, and P < 0.05 was considered statistically significant.
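The idea behind a time-dependent AUC can be illustrated with a deliberately simplified sketch. At a chosen horizon, subjects who died by the horizon are treated as cases and subjects followed beyond it as controls; subjects censored before the horizon are simply dropped. This naive exclusion is an assumption made for brevity and is not what the "timeROC" package does (it corrects for censoring with weights), so the numbers would differ on real data. All names and values below are illustrative.

```python
from typing import Sequence


def naive_auc_at(
    horizon: float,
    times: Sequence[float],
    events: Sequence[int],
    scores: Sequence[float],
) -> float:
    """Fraction of (case, control) pairs in which the case has the higher risk score."""
    cases = [s for t, e, s in zip(times, events, scores) if e == 1 and t <= horizon]
    controls = [s for t, e, s in zip(times, events, scores) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at this horizon")
    concordant = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return concordant / (len(cases) * len(controls))


# Toy data: follow-up in years, event flags and risk scores (made-up values).
auc = naive_auc_at(
    horizon=2.5,
    times=[1, 2, 3, 4, 5],
    events=[1, 1, 0, 1, 0],
    scores=[0.9, 0.4, 0.5, 0.8, 0.1],
)
print(round(auc, 4))
```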
7. Results
TCGA data were used as the training set to construct a prognostic gene signature as a linear combination of the LASSO Cox regression coefficients and the gene expression levels: risk score = 0.1452 × ExpANLN + 0.1702 × ExpERO1A + 0.0722 × ExpFAM83A + 0.1918 × ExpKRT8 - 0.0861 × ExpTMEM178A.
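The risk score above is a weighted sum and can be computed as in the following sketch. The coefficients are the ones stated in the text; the function name, the dictionary layout and the example expression values are assumptions, and the inputs are presumed to be normalized expression levels on the same scale as the training data.

```python
from typing import Dict

# Coefficients C1 to C5 as stated in the description.
COEFFICIENTS: Dict[str, float] = {
    "ANLN": 0.1452,
    "ERO1A": 0.1702,
    "FAM83A": 0.0722,
    "KRT8": 0.1918,
    "TMEM178A": -0.0861,
}


def risk_score(expression: Dict[str, float]) -> float:
    """Weighted sum of the five biomarker expression levels."""
    missing = sorted(set(COEFFICIENTS) - set(expression))
    if missing:
        raise KeyError(f"missing expression values for: {', '.join(missing)}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Hypothetical normalized expression levels for one sample.
example = {"ANLN": 1.0, "ERO1A": 1.0, "FAM83A": 1.0, "KRT8": 1.0, "TMEM178A": 1.0}
print(round(risk_score(example), 4))
```

The negative coefficient for TMEM178A means that higher expression of that gene lowers the score, while higher expression of the other four raises it.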
The lung cancer patients were divided into a high-risk group (high score) and a low-risk group (low score) according to the median risk score, and the survival times of the two groups were compared by Kaplan-Meier (KM) survival analysis; the cumulative survival rate of patients in the high-risk group was significantly lower than that of the low-risk group. The same formula was used to calculate the risk score in the GEO data. Consistent with the results for the TCGA training set, the cumulative survival of patients in the high-risk group was significantly lower than that of the low-risk group (FIG. 1 and FIG. 2).
Prognostic ROC curve analysis of the lung cancer patients in the training set and the validation set showed that the risk score prognostic model discriminates well between patient prognoses (FIG. 3 and FIG. 4).
In conclusion, the gene signature based on the five genes of the present invention can predict the prognosis of lung cancer.
The preferred embodiments of the present application have been described in detail with reference to the accompanying drawings, however, the present application is not limited to the details of the above embodiments, and various simple modifications can be made to the technical solution of the present application within the technical idea of the present application, and these simple modifications are all within the protection scope of the present application.
It should be noted that, in the foregoing embodiments, various features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, various possible combinations are not described in the present application.
In addition, any combination of the various embodiments of the present application is also possible, and the same should be considered as disclosed in the present application as long as it does not depart from the idea of the present application.
Claims (10)
1. Use of a reagent for detecting a biomarker comprising ANLN, ERO1A, FAM83A, KRT8 and/or TMEM178A in the manufacture of a product for predicting the prognosis of lung cancer.
2. The use according to claim 1, wherein the biomarkers are ANLN, ERO1A, FAM83A, KRT8 and TMEM178A.
3. The use of claim 2, wherein the reagent comprises a reagent for detecting the expression level of the biomarker in a sample by digital imaging technology, protein immunoassay technology, staining technology, nucleic acid sequencing technology, nucleic acid hybridization technology, chromatographic technology or mass spectrometry technology.
4. The use of claim 3, wherein the sample comprises tissue and body fluid.
5. A product for predicting the prognosis of lung cancer, said product comprising reagents for detecting biomarkers comprising ANLN, ERO1A, FAM83A, KRT8, and/or TMEM178A;
preferably, the product comprises a chip, a kit;
preferably, the kit comprises a qPCR kit, an immunoblotting detection kit, an immunochromatography detection kit, a flow cytometry kit, an immunohistochemical detection kit, an ELISA kit and an electrochemiluminescence detection kit;
preferably, the kit further comprises instructions for predicting the prognosis of lung cancer.
6. The product of claim 5, wherein the reagents comprise primers or probes that specifically bind to the biomarker genes; an antibody, peptide, aptamer, or compound that specifically binds to the marker protein.
7. A system/apparatus for predicting lung cancer prognosis, comprising:
an acquisition unit for acquiring data on biomarkers in a sample to be tested, wherein the biomarkers comprise ANLN, ERO1A, FAM83A, KRT8 and/or TMEM178A;
a processing unit for inputting the biomarker data into a lung cancer prognosis prediction model to obtain a prediction of lung cancer progression for the sample to be tested;
preferably, the prognostic prediction model is a Cox regression model;
preferably, the Cox regression model is a LASSO Cox regression model;
preferably, the prognostic prediction model is formulated as: risk score = C1 × ExpANLN + C2 × ExpERO1A + C3 × ExpFAM83A + C4 × ExpKRT8 + C5 × ExpTMEM178A;
preferably, C1, C2, C3, C4 and C5 are 0.1452, 0.1702, 0.0722, 0.1918 and -0.0861, respectively.
8. A computer-readable storage medium characterized by storing a program for executing a lung cancer prognosis prediction model constructed from biomarkers ANLN, ERO1A, FAM83A, KRT8, and/or TMEM178A;
preferably, the prognostic prediction model is a Cox regression model;
preferably, the Cox regression model is a LASSO Cox regression model;
preferably, the prognostic prediction model is formulated as: risk score = C1 × ExpANLN + C2 × ExpERO1A + C3 × ExpFAM83A + C4 × ExpKRT8 + C5 × ExpTMEM178A;
preferably, C1, C2, C3, C4 and C5 are 0.1452, 0.1702, 0.0722, 0.1918 and -0.0861, respectively.
9. An electronic device, comprising:
a client component, wherein the client component comprises a user interface;
a server component, wherein the server component comprises at least one memory unit configured to receive data input comprising sequencing data for biomarkers generated from a sample, the biomarkers including ANLN, ERO1A, FAM83A, KRT8, and/or TMEM178A;
the user interface operatively coupled with the server component; and
a computer processor operatively coupled to the at least one memory unit, wherein the computer processor is programmed to execute a program that runs a lung cancer prognosis prediction model constructed from the biomarkers;
preferably, the prognostic prediction model is a Cox regression model;
preferably, the Cox regression model is a LASSO Cox regression model;
preferably, the prognostic prediction model is formulated as: risk score = C1 × ExpANLN + C2 × ExpERO1A + C3 × ExpFAM83A + C4 × ExpKRT8 + C5 × ExpTMEM178A;
preferably, C1, C2, C3, C4 and C5 are 0.1452, 0.1702, 0.0722, 0.1918 and -0.0861, respectively.
10. Use of a reagent for detecting a biomarker for the manufacture of a product for evaluating the efficacy of a drug for the treatment of lung cancer, wherein the biomarker comprises ANLN, ERO1A, FAM83A, KRT8 and/or TMEM178A;
preferably, the reagents comprise primers or probes that specifically bind to the biomarker genes, or an antibody, peptide, aptamer, or compound that specifically binds to the biomarker proteins.
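As an illustration only, the risk score recited in claims 7 to 9 can be sketched in Python as a weighted sum of the five biomarker expression values. The coefficients are those stated in the claims; the function name `risk_score`, the dictionary layout, and the assumption that each expression value is already normalized in whatever way the model expects are hypothetical and not taken from the claims. No risk cutoff is shown because the claims do not state one.

```python
from typing import Mapping

# Coefficients C1..C5 as recited in claims 7-9, keyed by biomarker.
COEFFICIENTS = {
    "ANLN": 0.1452,      # C1
    "ERO1A": 0.1702,     # C2
    "FAM83A": 0.0722,    # C3
    "KRT8": 0.1918,      # C4
    "TMEM178A": -0.0861, # C5
}


def risk_score(expression: Mapping[str, float]) -> float:
    """Return the sum of C_i * Exp_i over the five biomarkers.

    `expression` maps each gene symbol to its expression value.
    Assumption: values are already on the scale the model was built on.
    """
    missing = [gene for gene in COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {', '.join(missing)}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


if __name__ == "__main__":
    # Made-up expression values, for demonstration only.
    sample = {"ANLN": 2.0, "ERO1A": 1.5, "FAM83A": 0.5, "KRT8": 3.0, "TMEM178A": 1.0}
    print(round(risk_score(sample), 4))
```

Because four coefficients are positive and the TMEM178A coefficient is negative, higher ANLN, ERO1A, FAM83A or KRT8 expression raises the score, while higher TMEM178A expression lowers it.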
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110729777.1A CN113234831A (en) | 2021-06-29 | 2021-06-29 | Model, product and system for predicting lung cancer prognosis |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110729777.1A CN113234831A (en) | 2021-06-29 | 2021-06-29 | Model, product and system for predicting lung cancer prognosis |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113234831A (en) | 2021-08-10 |
Family
ID=77141130
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110729777.1A Pending CN113234831A (en) | 2021-06-29 | 2021-06-29 | Model, product and system for predicting lung cancer prognosis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113234831A (en) |
- 2021-06-29: CN application CN202110729777.1A, published as CN113234831A (en), status: Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101857462B1 (en) | Pancreatic cancer biomarkers and uses thereof | |
JP7434161B2 (en) | Methods and systems for protein identification | |
CN106483290B (en) | Tumor-marker panel | |
JP5701212B2 (en) | Lung cancer biomarkers and their use | |
KR101921945B1 (en) | Lung cancer biomarkers and uses thereof | |
US20120101002A1 (en) | Lung Cancer Biomarkers and Uses Thereof | |
US20120143805A1 (en) | Cancer Biomarkers and Uses Thereof | |
CN103429753A (en) | Mesothelioma biomarkers and uses thereof | |
CN108603887A (en) | Nonalcoholic fatty liver disease (NAFLD) and nonalcoholic fatty liver disease (NASH) biomarker and application thereof | |
CN110662966A (en) | Panel of protein biomarkers for detecting colorectal cancer and advanced adenoma | |
WO2013190092A1 (en) | Gene signatures for copd diagnosis | |
WO2012009382A2 (en) | Molecular indicators of bladder cancer prognosis and prediction of treatment response | |
CN113430269A (en) | Application of biomarker in prediction of lung cancer prognosis | |
CN114875149A (en) | Application of reagent for detecting biomarkers in preparation of product for predicting gastric cancer prognosis | |
CN113388683A (en) | Biomarker related to lung cancer prognosis and application thereof | |
CN108026584B (en) | Protein biomarker panel for diagnosing non-small cell lung cancer and non-small cell lung cancer diagnosis method using same | |
US20220065872A1 (en) | Lung Cancer Biomarkers and Uses Thereof | |
CN113234831A (en) | Model, product and system for predicting lung cancer prognosis | |
CN113444795A (en) | Biomarker related to lung cancer survival time and application of biomarker in prediction of lung cancer prognosis | |
CN113444797A (en) | Biomarkers for predicting lung cancer prognosis | |
CN113322326A (en) | Lung cancer prognosis marker, prognosis model and related application | |
CN113373232A (en) | Biomarkers and related products for predicting survival of lung cancer patients | |
CN113430268A (en) | Prediction of lung cancer prognosis | |
CN113322327A (en) | Biomarker-based product for predicting lung cancer prognosis and related application | |
US20180356419A1 (en) | Biomarkers for detection of tuberculosis risk |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210810 |