CN109642258A - Method and system for tumor prognosis prediction
- Publication number: CN109642258A
- Application number: CN201880002164.4A
- Authority: CN (China)
- Prior art keywords: tumor, information, prognosis, model, patient
- Legal status: Granted (the listed legal status is an assumption and is not a legal conclusion)
Classifications
- C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes, for diseases caused by alterations of genetic material, for cancer
- C12Q2600/118: Prognosis of disease development
- C12Q2600/156: Polymorphic or mutational markers
Abstract
The embodiments of the present application disclose a method and system for tumor prognosis prediction. The tumor prognosis prediction method includes: obtaining characteristic information of a tumor patient, the characteristic information reflecting at least gene mutation information of the tumor patient at the tumor site; and determining a prognosis prediction result for the tumor patient according to a tumor prognosis prediction model, based on the characteristic information of the tumor patient. The application establishes the tumor prognosis prediction model from tumor patient data, which can improve the accuracy of tumor prognosis prediction.
Description
Technical Field
The present application relates to the medical field, and in particular, to a method and system for tumor prognosis prediction.
Background
Tumors (e.g., osteosarcoma) are the second leading cause of death worldwide, and their morbidity and mortality continue to rise. Despite advances in diagnosis and treatment, patient mortality is still not effectively controlled. Recurrence and metastasis are the main causes of death in tumor patients; osteosarcoma, for example, can metastasize to tissues and organs such as the lung and spinal cord, seriously threatening patients' lives.
At present, tumors are evaluated clinically mainly through pathological and imaging-based morphological changes, together with indicators such as the patient's age, the pathological type of the tumor, the surgical stage, and residual tumor. With the development of molecular biology, molecular epidemiology, molecular pathology and related technologies, screening for tumor-related genes and molecular markers has become a focus of current tumor research. At the molecular level of tumor cells, such screening can provide reference indications for surgery, predict postoperative recurrence or metastasis, provide objective indications for radical treatment of the tumor, and provide targets for anti-metastasis treatment.
It is therefore important to study differences in gene expression during tumor formation, development, and drug resistance, and to analyze the activation and inhibition of genes in tumors, so that a patient's condition and prognosis can be evaluated more comprehensively and accurately and treatment can be individualized. This is also a focus of attention for those skilled in the art.
Disclosure of Invention
One embodiment of the present application provides a method for predicting tumor prognosis, including: acquiring characteristic information of a tumor patient, wherein the characteristic information at least reflects gene mutation information of the tumor patient; and determining the prognosis prediction result of the tumor patient according to a tumor prognosis prediction model based on the characteristic information of the tumor patient.
In some embodiments, the gene mutation information includes genes mutated on DNA and their mutation abundances, and/or tumor prognosis prediction related genes on DNA and their mutation abundances.
In some embodiments, the obtaining characteristic information of the tumor patient further comprises: obtaining a tissue sample from the tumor patient; extracting DNA of the tissue sample; preparing a library of the DNA; performing gene sequencing according to the library to obtain a sequencing result; analyzing the sequencing result to determine the gene mutation information of the tumor patient.
In some embodiments, the characteristic information further comprises at least one of the following information of the tumor patient: age, gender, smoking history, educational age, working age, treatment regimen, and sample storage time.
In some embodiments, the tumor prognosis prediction model is a support vector machine model or a neural network model.
In some embodiments, the tumor prognosis prediction method further comprises: training an initial model using the characteristic information and the prognosis information of a plurality of tumor patients to obtain the tumor prognosis prediction model.
In some embodiments, the training an initial model using characteristic information of a plurality of tumor patients and prognosis information thereof to obtain the tumor prognosis prediction model comprises: removing, from the gene mutation information of the plurality of tumor patients, mutant gene information whose mutation abundance is less than a set threshold.
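By way of illustration only, the threshold-based removal described above might be sketched as follows. The data layout (one mapping of gene name to mutation abundance per patient, with abundance expressed as a fraction between 0 and 1), the function name `filter_low_abundance`, the threshold value of 0.01, and the sample values are all assumptions made for this sketch; the application does not specify them.

```python
from typing import Dict

# Hypothetical layout: patient id -> {gene name -> mutation abundance in [0, 1]}.
Cohort = Dict[str, Dict[str, float]]


def filter_low_abundance(cohort: Cohort, threshold: float) -> Cohort:
    """Return a copy of the cohort without mutations below the abundance threshold."""
    return {
        patient_id: {
            gene: abundance
            for gene, abundance in mutations.items()
            if abundance >= threshold
        }
        for patient_id, mutations in cohort.items()
    }


# Made-up values for illustration; not data from the application.
cohort = {
    "patient_1": {"TP53": 0.42, "RB1": 0.008, "ATRX": 0.15},
    "patient_2": {"TP53": 0.004, "NOTCH1": 0.27},
}
filtered = filter_low_abundance(cohort, threshold=0.01)
print(filtered)
```

In this sketch a mutation whose abundance equals the threshold is kept, which matches the wording "less than a set threshold" for removal.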
In some embodiments, the training an initial model using feature information of a plurality of tumor patients and prognosis information thereof to obtain the tumor prognosis prediction model comprises: removing redundant gene mutation information in the gene mutation information of the plurality of tumor patients.
In some embodiments, the tumor prognosis prediction model is a support vector machine model; the training an initial model using the characteristic information and the prognosis information of a plurality of tumor patients to obtain the tumor prognosis prediction model comprises: determining at least some genes as tumor prognosis prediction related genes according to the contribution value of each piece of gene mutation information in the characteristic information of the plurality of tumor patients to the support vector machine model; and training the initial model using the gene mutation information of the tumor prognosis prediction related genes of the plurality of tumor patients and the prognosis information thereof to obtain the tumor prognosis prediction model.
In some embodiments, the tumor prognosis prediction model is a support vector machine model; the training the initial model to obtain the tumor prognosis prediction model further comprises: optimizing the parameters of the support vector machine model using a particle swarm algorithm or a grid division method.
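The selection of prognosis-related genes by contribution value might, for example, look like the sketch below. It assumes that a linear support vector machine has already been trained and that the contribution value of a gene is the absolute value of its weight in the decision function; the application does not state how the contribution value is computed, so this definition, the function name `select_genes_by_contribution`, the `top_k` cutoff, and the weights shown are hypothetical.

```python
from typing import Dict, List


def select_genes_by_contribution(weights: Dict[str, float], top_k: int) -> List[str]:
    """Rank genes by |weight| (assumed contribution value) and keep the top_k."""
    ranked = sorted(weights, key=lambda gene: abs(weights[gene]), reverse=True)
    return ranked[:top_k]


# Made-up weights of a hypothetical linear SVM; not results from the application.
svm_weights = {"TP53": -1.8, "RB1": 0.9, "ATRX": 0.05, "NOTCH1": 1.2, "ALK": -0.3}
selected = select_genes_by_contribution(svm_weights, top_k=3)
print(selected)
```

The model would then be retrained using only the mutation information of the selected genes, as the paragraph above describes.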
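As a rough sketch of the grid division approach, the code below exhaustively evaluates a grid of candidate parameter pairs and keeps the best-scoring one. It assumes the parameters being tuned are a penalty factor `C` and a kernel parameter `gamma`, which the application does not name. The function `toy_score` is a stand-in for a real evaluation such as cross-validated accuracy and is purely hypothetical.

```python
import itertools
import math
from typing import Callable, Iterable, Tuple


def grid_search(
    score: Callable[[float, float], float],
    c_values: Iterable[float],
    gamma_values: Iterable[float],
) -> Tuple[Tuple[float, float], float]:
    """Evaluate every (C, gamma) pair on the grid and return the best pair and its score."""
    best_params, best_score = (math.nan, math.nan), -math.inf
    for c, gamma in itertools.product(c_values, gamma_values):
        current = score(c, gamma)
        if current > best_score:
            best_params, best_score = (c, gamma), current
    return best_params, best_score


def toy_score(c: float, gamma: float) -> float:
    """Hypothetical stand-in for model accuracy; peaks at C = 10, gamma = 0.01."""
    return -((math.log10(c) - 1.0) ** 2 + (math.log10(gamma) + 2.0) ** 2)


c_grid = [10.0 ** k for k in range(-2, 4)]      # 0.01 ... 1000
gamma_grid = [10.0 ** k for k in range(-4, 2)]  # 0.0001 ... 10
best_params, best_score = grid_search(toy_score, c_grid, gamma_grid)
print(best_params)
```

A particle swarm algorithm would replace the exhaustive loop with an iterative search over the same parameter space; it is not shown here.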
In some embodiments, the prognosis prediction result includes: disease progression, disease stabilization, partial remission, and complete remission; alternatively, the prognosis prediction result includes: good therapeutic effect and poor therapeutic effect.
In some embodiments, the tumor is osteosarcoma.
In some embodiments, the characteristic information reflects at least mutation information of at least one of the following genes in osteosarcoma patients: KMT2C, SOX9, LRP1B, NF-1, PRKDC, FAT1, STAG2, SLIT2, NOTCH1, EPHA7, ATRX, KDM6A, APC, RANBP2, RARA. AS1, C11orf30, ROS1, ARID2, TAF1, DICER1, MSH2, MSH6, TP53, KDM5A, JAK2, ALK, RB1, NOTCH2, and RICTOR.
In some embodiments, the tumor patient gene mutation information is gene mutation information of an osteosarcoma lesion site.
One embodiment of the present application provides a tumor prognosis prediction system, including an obtaining module and a prediction module, where the obtaining module is configured to obtain feature information of a tumor patient, and the feature information at least reflects gene mutation information of the tumor patient; the prediction module is used for determining the prognosis prediction result of the tumor patient according to the tumor prognosis prediction model based on the characteristic information of the tumor patient.
In some embodiments, the gene mutation information includes genes mutated on DNA and their mutation abundances, and/or tumor prognosis prediction related genes on DNA and their mutation abundances.
In some embodiments, the characteristic information further comprises at least one of the following information of the tumor patient: age, gender, smoking history, educational age, working age, treatment regimen, and sample storage time.
In some embodiments, the tumor prognosis prediction model is a support vector machine model or a neural network model.
In some embodiments, the system further comprises a training module for training an initial model to obtain the prognosis prediction model by using the feature information of a plurality of tumor patients and the prognosis information thereof.
In some embodiments, the training module is further configured to remove mutant gene information in which the abundance of mutation is less than a set threshold from the gene mutation information of the plurality of tumor patients.
In some embodiments, the training module is further configured to remove redundant gene mutation information from the gene mutation information of the plurality of tumor patients.
In some embodiments, the tumor prognosis prediction model is a support vector machine model; the training module is further configured to: determine at least some genes as tumor prognosis prediction related genes according to the contribution value of each piece of gene mutation information in the characteristic information of a plurality of tumor patients to the support vector machine model; and train the initial model using the gene mutation information of the tumor prognosis prediction related genes of the plurality of tumor patients and the prognosis information thereof to obtain the tumor prognosis prediction model.
In some embodiments, the tumor prognosis prediction model is a support vector machine model; the training module is further used for optimizing parameters of the support vector machine model by utilizing a particle swarm algorithm or a grid division method.
In some embodiments, the prognosis prediction result includes: disease progression, disease stabilization, partial remission, and complete remission; alternatively, the prognosis prediction result includes: good therapeutic effect and poor therapeutic effect.
In some embodiments, the tumor is osteosarcoma.
In some embodiments, the characteristic information reflects at least mutation information of at least one of the following genes in osteosarcoma patients: KMT2C, SOX9, LRP1B, NF-1, PRKDC, FAT1, STAG2, SLIT2, NOTCH1, EPHA7, ATRX, KDM6A, APC, RANBP2, RARA. AS1, C11orf30, ROS1, ARID2, TAF1, DICER1, MSH2, MSH6, TP53, KDM5A, JAK2, ALK, RB1, NOTCH2, and RICTOR.
In some embodiments, the tumor patient gene mutation information is gene mutation information of an osteosarcoma lesion site.
One embodiment of the present application provides a prognosis prediction apparatus for a tumor, the apparatus including at least one processor and at least one memory; the at least one memory is for storing computer instructions; the at least one processor is configured to execute at least a portion of the computer instructions to implement the method for prognosis of a tumor.
One of the embodiments of the present application provides a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the method for tumor prognosis prediction.
One embodiment of the present application provides a tumor prognosis prediction system, including: at least one computer-readable storage medium comprising a set of instructions for prognosis prediction of a tumor; and at least one processor in communication with the at least one storage medium, the at least one processor, when executing the set of instructions, configured to: acquiring characteristic information of a tumor patient, wherein the characteristic information at least reflects gene mutation information of the tumor patient; and determining the prognosis prediction result of the tumor patient according to the tumor prognosis prediction model based on the characteristic information of the tumor patient.
Drawings
The present application will be further explained by way of exemplary embodiments, which will be described in detail by way of the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals are used to indicate like structures, wherein:
FIG. 1 is a schematic diagram of an application scenario of a tumor prognosis prediction system according to some embodiments of the present application;
FIG. 2 is an architectural diagram of a computing device shown in accordance with some embodiments of the present application;
FIG. 3 is a block diagram of a prognostic tumor prediction system according to some embodiments of the present application;
FIG. 4 is an exemplary flow chart of a method of prognosis of a tumor according to some embodiments of the present application;
FIG. 5 is an exemplary flow chart for determining gene mutation information for a tumor patient according to some embodiments of the present application;
FIG. 6 is an exemplary flow chart for training an obtained prognosis prediction model for a tumor according to some embodiments of the present application;
FIG. 7 is a gene mutation heatmap of osteosarcoma patients according to exemplary embodiments of the present application;
FIG. 8 is a heat map of gene mutations in osteosarcoma patients with good therapeutic effect according to an exemplary embodiment of the present application;
FIG. 9 is a heat map of gene mutations in osteosarcoma patients with poor therapeutic effect according to exemplary embodiments of the present application; and
FIG. 10 is a schematic diagram illustrating verification of a prediction result of a tumor prognosis prediction model according to an exemplary embodiment of the present application.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some examples or embodiments of the application, and a person skilled in the art can apply the application to other similar scenarios according to these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system", "device", "unit" and/or "module" as used herein are terms for distinguishing different components, elements, parts, portions or assemblies at different levels. However, these words may be replaced by other expressions if the other expressions accomplish the same purpose.
As used in this application and the appended claims, the terms "a," "an," and/or "the" do not refer specifically to the singular and may also include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or apparatus may include other steps or elements.
Flow charts are used herein to illustrate operations performed by systems according to embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed in the exact order shown. Rather, the various steps may be processed in reverse order or simultaneously. In addition, other operations may be added to the processes, or one or more steps may be removed from the processes.
Fig. 1 is a schematic diagram illustrating an application scenario of a tumor prognosis prediction system 100 according to some embodiments of the present application. As shown in fig. 1, the prognosis prediction system 100 can include a server 110, a network 120, and a database 130. In some embodiments, the database 130 can store patient basic information, disease history, treatment plan data, and can also store patient genetic information, such as genetic mutation information of the tumor patient 140 at the tumor site, genetic information of normal tissue of the tumor patient, reference genetic information, and the like. A biological tissue sample or fluid sample from a patient, such as tissue sample 145 from tumor patient 140, may be stored in a dedicated storage facility for further processing, such as a genetic sequencing process. In particular, tissue sample 145 may comprise a tumor tissue sample from a patient or a tissue sample from another part of the patient's body. The server 110 may be used to process and analyze the relevant information to generate a prognostic prediction. In some embodiments, the server 110 may obtain relevant information and/or data from the database 130 (e.g., genetic mutation information of the tumor patient at the tumor site, basic information of the tumor patient, reference genetic data, etc.), or directly obtain relevant information and/or data obtained by a worker or other equipment processing the tissue sample 145 of the tumor patient 140.
The server 110 may be a single server or a server group. The server group may be centralized, such as a data center. The server group may also be distributed, such as a distributed system. The server 110 may be local or remote. In some embodiments, the server 110 may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an intermediate cloud, a multi-cloud, and the like, or any combination thereof. In some embodiments, server 110 may be implemented on a computing device 200 having at least one of the components shown in FIG. 2.
In some embodiments, the server 110 may include a processing engine 112. The processing engine 112 may be used to execute instructions (program code) of the server 110. For example, the processing engine 112 can execute instructions for analyzing the characteristic information of the tumor patient 140 to obtain a prognosis prediction of the tumor. The instructions for analyzing the characteristic information of the tumor patient 140 may be stored in the form of computer instructions in a computer-readable storage medium (not shown). In some embodiments, the processing engine 112 may include one or more sub-processing devices (e.g., a single core processing device or a multi-core processing device). By way of example only, the processing engine 112 may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), an Application Specific Instruction Processor (ASIP), a Graphics Processing Unit (GPU), a Physical Processing Unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a programmable logic circuit (PLD), a controller, a micro-controller unit, a Reduced Instruction Set Computer (RISC), a microprocessor, or the like, or any combination thereof.
The network 120 may provide a conduit for the exchange of information. In some embodiments, information may be exchanged between server 110 and database 130 via network 120. For example, server 110 may receive reference gene data in database 130 via network 120. In some embodiments, information related to tumor patient 140 and/or tissue sample 145 may be transmitted to server 110 and/or database 130 via network 120. For example, characteristic information (e.g., genetic mutation information, basic information, etc.) of the tumor patient 140 may be transmitted to the server 110 via the network 120. In some embodiments, the network 120 may be any type of wired or wireless network. For example, network 120 may include a cable network, a wired network, a fiber optic network, a telecommunications network, an intranet, the Internet, a Local Area Network (LAN), a Wide Area Network (WAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Public Switched Telephone Network (PSTN), a Bluetooth network, a ZigBee network, a Near Field Communication (NFC) network, the like, or any combination thereof.
The database 130 may be used to store data and/or sets of instructions. In some embodiments, database 130 may store data obtained from server 110. In some embodiments, database 130 may store information and/or instructions for server 110 to perform or use to perform the example methods described herein. In some embodiments, the database 130 may store reference gene data. Specifically, the database 130 may store gene data in various types of genomic databases and/or gene data having an influence (or significant influence) on tumorigenesis reported in the existing literature, and the like. The genomic database may include, but is not limited to, a COSMIC database, ClinVar database, HGMD database, OMIM database, TCGA database, GeneCards database, and the like, among others. In some embodiments, database 130 may include mass storage, removable storage, volatile read-write memory, read-only memory (ROM), the like, or any combination thereof. In some embodiments, database 130 may be implemented on a cloud platform. By way of example only, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an intermediate cloud, a multi-cloud, and the like, or any combination thereof. In some embodiments, database 130 may be part of server 110.
In some embodiments, the tumor patient 140 may be a patient having one or more tumor diseases. The tumor disease may include a carcinoma, a sarcoma, a benign tumor, or the like, or any combination thereof. Specifically, the carcinoma may include squamous carcinoma, adenocarcinoma, undifferentiated carcinoma, and the like. For example, squamous carcinoma may include carcinoma occurring in the skin, esophagus, lung, cervix, vagina, vulva, penis, and the like. Adenocarcinoma may include carcinoma occurring in the digestive tract, lung, uterine body, breast, ovary, prostate, thyroid, liver, kidney, pancreas, gall bladder, and the like. Sarcomas may include, but are not limited to: soft tissue sarcoma, osteosarcoma, malignant fibrous histiocytoma, bilateral sarcoma, rhabdomyosarcoma, lymphosarcoma, synovial sarcoma, leiomyoma, etc. Benign tumors may include, but are not limited to, hamartoma, benign tumors of the pancreas, thyroid adenoma, mammary gland fibroma, uterine tumor, gastrointestinal flat bone myoma, soft tissue fibroma, synovioma, ligament fibroma, and the like. In one embodiment of the present application, the tumor patient 140 can be an osteosarcoma patient. In some embodiments, the tumor patient 140 can be a patient with a tumor at various stages (e.g., early, mid, late, etc.). The tumor patient 140 can also be a patient at various stages of treatment (e.g., pre-treatment, under-treatment, post-treatment, etc.).
In some embodiments, the tissue sample 145 may be used to reflect relevant information about the tumor of the tumor patient 140. In particular, the tissue sample 145 may be a biological tissue or fluid sample taken from a tumor site (e.g., a target lesion) and/or a non-tumor site (e.g., a site other than a lesion) of the tumor patient 140. For example, tissue samples may include, but are not limited to: sputum, blood samples, fresh tissue (e.g., surgical tissue, punctured tissue, etc.), paraffin-embedded tissue, urine, serosal cavity effusion (e.g., ascites, pleural effusion, pericardial effusion, etc.), or tissue, cells, etc. extracted from a tumor site, or any combination thereof. In some embodiments, the tissue sample 145 may include tissue, cells of the tumor patient 140 at the tumor site as well as sites other than the tumor. In some embodiments, the tissue sample 145 may include only tissue, cells of the tumor patient 140 at the tumor site.
In some embodiments, information related to the tumor patient 140 and/or the tissue sample 145 may be transmitted to one or more components of the tumor prognosis prediction system 100 (e.g., server 110, database 130) manually (e.g., by personnel) or by machine (e.g., by a robotic device, etc.).
FIG. 2 is a schematic diagram of an architecture of a computing device 200 shown in accordance with some embodiments of the present application. As shown in fig. 2, computing device 200 may include a processor 210, a memory 220, input/output interfaces 230, and communication ports 240. Server 110 and/or database 130 may be implemented on the computing device 200. For example, the processing engine 112 may be implemented on the computing device 200 and configured to perform the functions of the processing engine 112 in the present application.
The processor 210 may execute the computing instructions (program code) and perform the functions of the server 110 described herein. Computing instructions may include programs, objects, components, data structures, procedures, modules, and functions (a function refers to a specific function described in this application). For example, processor 210 may process instructions for predicting the prognosis of a tumor in the tumor prognosis prediction system 100. In some embodiments, processor 210 may include microcontrollers, microprocessors, Reduced Instruction Set Computers (RISC), Application Specific Integrated Circuits (ASIC), application specific instruction set processors (ASIP), Central Processing Units (CPU), Graphics Processing Units (GPU), Physical Processing Units (PPU), microcontroller units, Digital Signal Processors (DSP), Field Programmable Gate Arrays (FPGA), Advanced RISC Machines (ARM), programmable logic devices, any circuit or processor capable of executing one or more functions, or the like, or any combination thereof. For illustration only, only one processor 210 is depicted in FIG. 2, but it should be noted that the present application may include multiple processors.
Memory 220 may store data/information obtained from any component in the tumor prognosis prediction system 100. In some embodiments, memory 220 may include mass storage, removable storage, volatile read and write memory, Read Only Memory (ROM), and the like, or any combination thereof. Exemplary mass storage devices may include magnetic disks, optical disks, solid state drives, and the like. The removable memory may include flash drives, floppy disks, optical disks, memory cards, U-disks, compact disks, removable hard disks, and the like. Volatile read and write memory can include Random Access Memory (RAM). RAM may include Dynamic RAM (DRAM), double-data-rate synchronous dynamic RAM (DDR SDRAM), Static RAM (SRAM), thyristor RAM (T-RAM), zero-capacitor RAM (Z-RAM), and the like. ROM may include Masked ROM (MROM), Programmable ROM (PROM), Erasable Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), compact disk ROM (CD-ROM), digital versatile disk ROM, and the like.
The input/output interface 230 may be used to input or output signals, data, or information. In some embodiments, the input/output interface 230 may be used for interaction between a user (e.g., the tumor patient 140, a user of the tumor prognosis prediction system 100, etc.) and the server 110. In some embodiments, the user may enter characteristic information of the tumor patient via the input/output interface 230. In some embodiments, input/output interface 230 may include an input device and an output device. Exemplary input devices may include a keyboard, mouse, touch screen, microphone, and the like, or any combination thereof. Exemplary output devices may include a display device, speakers, printer, projector, etc., or any combination thereof. Exemplary display devices may include Liquid Crystal Displays (LCDs), Light Emitting Diode (LED) based displays, flat panel displays, curved displays, television equipment, Cathode Ray Tubes (CRTs), and the like, or any combination thereof.
The communication port 240 may be connected to the network 120 for data communication. The connection may be a wired connection, a wireless connection, or a combination of both. The wired connection may include an electrical cable, an optical cable, or a telephone line, etc., or any combination thereof. The wireless connection may include bluetooth, WiFi, WiMax, WLAN, ZigBee, mobile networks (e.g., 3G, 4G, or 5G, etc.), etc., or any combination thereof. In some embodiments, the communication port 240 may be a standardized port, such as RS232, RS485, and the like. In some embodiments, the communication port 240 may be a specially designed port.
FIG. 3 is a block diagram of a tumor prognosis prediction system according to some embodiments of the present application. As shown in FIG. 3, the tumor prognosis prediction system may include an acquisition module 310, a prediction module 320, and a training module 330.
The acquisition module 310 may be used to acquire characteristic information of the tumor patient 140. In some embodiments, the characteristic information may reflect at least gene mutation information of the tumor patient. In some embodiments, the characteristic information of the tumor patient 140 may include: gene mutation information of tumor patients, basic information of tumor patients and the like.
The prediction module 320 may be used to determine a prognosis prediction result for a tumor patient. For example, the prediction module 320 may determine the prognosis prediction result of the tumor patient according to a tumor prognosis prediction model based on the characteristic information of the tumor patient.
The training module 330 may be used to train to obtain a prognosis prediction model for the tumor. Specifically, the training module 330 can obtain the characteristic information of a plurality of tumor patients and the prognosis information thereof. The training module 330 may train the initial model to obtain a tumor prognosis prediction model by using the feature information of a plurality of tumor patients and their prognosis information. In some embodiments, the training module 330 may remove mutant gene information in which the abundance of mutations is less than a certain set threshold. In some embodiments, the training module 330 may remove redundant gene mutation information from the gene mutation information. In some embodiments, the training module 330 may determine that at least part of the genes are tumor prognosis prediction related genes according to the contribution value of each gene mutation information in the feature information of a plurality of tumor patients to the support vector machine model. In some embodiments, the training module 330 may train the initial model to obtain the prognosis prediction model by using the gene mutation information of the prognosis prediction related genes of a plurality of tumor patients and the prognosis information thereof. In some embodiments, the training module 330 may also optimize the parameters of the support vector machine model using a particle swarm algorithm or a grid partitioning method.
It should be understood that the system and its modules shown in FIG. 3 may be implemented in a variety of ways. For example, in some embodiments, the system and its modules may be implemented in hardware, software, or a combination of software and hardware. Wherein the hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory for execution by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a diskette, CD-or DVD-ROM, a programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The system and its modules of the present application may be implemented not only by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., but also by software executed by various types of processors, for example, or by a combination of the above hardware circuits and software (e.g., firmware).
It should be noted that the above descriptions of the tumor prognosis prediction system and the modules thereof are only for convenience of description, and are not intended to limit the present application to the scope of the illustrated embodiments. It will be appreciated by those skilled in the art that, given the teachings of the present system, the modules may be combined arbitrarily, or a sub-system may be configured to connect to other modules, without departing from such teachings. For example, in some embodiments, the obtaining module 310, the predicting module 320, and the training module 330 may be different modules in a system, or may be a single module that implements the functions of two or more of the modules described above. For example, the obtaining module 310 and the predicting module 320 may be a single module having both obtaining and predicting functions. For example, the modules may share one memory module, or each module may have its own memory module. Such variations are within the scope of the present application.
Fig. 4 is an exemplary flow chart of a method of prognosis of a tumor according to some embodiments of the present application. As shown in fig. 4, the method for predicting tumor prognosis may include:
Step 410, obtaining characteristic information of the tumor patient, the characteristic information at least reflecting gene mutation information of the tumor patient. Specifically, step 410 may be performed by the obtaining module 310.
In some embodiments, the characteristic information of the tumor patient 140 may include gene mutation information of the tumor patient, basic information of the tumor patient, and the like. In some embodiments, the characteristic information of the tumor patient may include only gene mutation information of the tumor patient. Specifically, the gene mutation information of the tumor patient may include genes mutated on DNA and their mutation abundances, and/or tumor prognosis prediction related genes on DNA and their mutation abundances. The basic information of the tumor patient may reflect information related to the tumor patient other than the gene mutation information. For example, the basic information of the tumor patient may include the tumor patient's age, sex, smoking history, educational age, working age, sample storage time (e.g., blood storage time, tumor tissue storage time, storage time of other normal tissue of the patient), treatment regimen, etc., or any combination thereof. In some embodiments, the treatment regimen may include the type of treatment (e.g., radiation therapy, chemotherapy, immunotherapy, etc.), the duration of treatment, the radiation dose used, the drug dose, the name or type of drug, and the like. In some embodiments, the gene mutation information of the tumor patient may be gene mutation information of the tumor patient at a tumor site (e.g., a target lesion). For example, the gene mutation information of an osteosarcoma patient can be gene mutation information of an osteosarcoma lesion site. In some embodiments, the tumor patient 140 can be a patient at various stages of the tumor (e.g., early, intermediate, late, etc.), and/or at various stages of treatment (e.g., pre-treatment, under-treatment, post-treatment, etc.).
For example, characteristic information of an osteosarcoma patient before treatment (e.g., chemotherapy) can be obtained for predicting the prognosis effect of the treatment, and thus reference can be provided for formulation, selection and the like of a treatment scheme.
In some embodiments, obtaining/determining gene mutation information for the tumor patient 140 may include: obtaining a tissue sample 145 of the tumor patient 140, extracting DNA of the tissue sample, preparing a library of the DNA, performing gene sequencing according to the library to obtain a sequencing result, analyzing the sequencing result to determine gene mutation information of the tumor patient, and the like. For more details on determining the gene mutation information of the tumor patient 140, see FIG. 5 and its related description.
Step 420, determining the prognosis prediction result of the tumor patient according to the tumor prognosis prediction model, based on the characteristic information of the tumor patient. Specifically, step 420 may be performed by the prediction module 320.
In some embodiments, the characteristic information of the tumor patient can be input into a trained tumor prognosis prediction model to obtain a prognosis prediction result of the tumor patient. In some embodiments, the tumor prognosis prediction model can be a supervised learning model. Specifically, the supervised learning model may include: one or more of a support vector machine model, a decision tree model, a neural network model, a nearest neighbor classifier and the like. The training procedure for the tumor prognosis prediction model can be seen in fig. 6 and its related description.
In some embodiments, the prognosis prediction result may be the prognostic status within a period of time (e.g., 5 years) after treatment. For example, prognosis prediction results may be classified into four categories according to changes in the target lesion: progressive disease (PD), stable disease (SD), partial remission (PR), and complete remission (CR). Specifically, PD may mean that the sum of the maximum diameters of the target lesions increases by 20% or more, or that new lesions appear (e.g., new lesions caused by tumor metastasis); SD may mean that the sum of the maximum diameters of the target lesions has neither decreased enough to qualify as PR nor increased enough to qualify as PD; PR may mean that the sum of the maximum diameters of the target lesions decreases by 30% or more and the decrease is maintained for at least 4 weeks; CR may mean that all target lesions disappear, no new lesions appear, and tumor markers are normal, maintained for at least 4 weeks. In some embodiments, the prognosis prediction results may include: good treatment effect and poor treatment effect. Specifically, whether the treatment effect is good or poor may be determined according to clinical criteria. For example, the treatment effect is poor if the disease recurs within 5 years after treatment, and good if it does not recur within 5 years after treatment. As another example, PD and SD may be classified as poor treatment effect, and PR and CR as good treatment effect. As another example, the treatment effect is good if the patient survives more than 5 years after the first treatment, and poor if the patient survives less than 5 years after the first treatment.
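The four-category rule and its mapping to a good/poor label can be sketched in a few lines of Python. This is only an illustration of the thresholds stated above (20% increase, 30% decrease); the function names and arguments are assumptions, and the "maintained for at least 4 weeks" condition is not modelled.

```python
def classify_response(baseline_sum, followup_sum, new_lesions=False,
                      markers_normal=True):
    """Classify the change in target lesions as CR, PR, SD or PD.

    baseline_sum / followup_sum: sum of the maximum diameters of the target
    lesions before and after treatment (same unit, baseline_sum > 0).
    The 4-week confirmation window is NOT checked in this sketch.
    """
    if new_lesions:
        return "PD"                      # any new lesion counts as progression
    if followup_sum == 0 and markers_normal:
        return "CR"                      # all target lesions have disappeared
    change = (followup_sum - baseline_sum) / baseline_sum
    if change >= 0.20:
        return "PD"                      # increase of 20% or more
    if change <= -0.30:
        return "PR"                      # decrease of 30% or more
    return "SD"                          # neither PR nor PD


def to_binary_effect(category):
    """Map PR/CR to a good treatment effect and PD/SD to a poor one."""
    return "good" if category in ("PR", "CR") else "poor"


print(classify_response(100.0, 60.0), to_binary_effect("PR"))
```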
In alternative embodiments, the prognosis prediction results may be classified into other categories, which is not limited by the embodiments of the present application. For example, prognosis prediction results may be classified into three categories: good treatment effect, moderate treatment effect, and poor treatment effect. In some embodiments, the prognosis prediction result may also be a predicted value of a specific index. For example, such indices may include, but are not limited to, disease remission rate, disease recurrence rate, recurrence of the disease within a given number of years, survival rate, survival time, short-term mortality, long-term mortality, in-hospital mortality, out-of-hospital mortality, surgical mortality, and the like.
It should be noted that the above description related to the flow 400 is only for illustration and explanation, and does not limit the applicable scope of the present application. Various modifications and changes to flow 400 may occur to those skilled in the art in light of the teachings herein. However, such modifications and variations are intended to be within the scope of the present application.
FIG. 5 is an exemplary flow chart for determining gene mutation information for a tumor patient according to some embodiments of the present application. Specifically, the steps shown in fig. 5 may be performed by a worker (e.g., a doctor, a laboratory technician, an operator, etc.) and/or an instrument (e.g., a detector, an analyzer, etc.), etc. As shown in fig. 5, the process of determining the gene mutation information of the tumor patient may include:
Step 510, obtaining a tissue sample of the tumor patient.
In some embodiments, the tissue sample 145 may be used to reflect information related to the tumor. Specifically, the tissue sample 145 may be a biological tissue or fluid sample taken from a tumor site (e.g., a target lesion) and/or a non-tumor site (e.g., a site other than a lesion) of the tumor patient 140. For example, tissue samples may include, but are not limited to: sputum, blood samples, fresh tissue (e.g., surgical tissue, puncture tissue, etc.), paraffin-embedded tissue, urine, serous cavity effusion (e.g., ascites, pleural effusion, pericardial effusion, etc.), or tissue, cells, etc. extracted from a tumor site, or any combination thereof. In some embodiments, the tissue sample 145 may include tissue or cells of the tumor patient 140 from a tumor site or from a site other than the tumor. In some embodiments, the tissue sample 145 may include only tissue or cells of the tumor patient 140 from the tumor site. In some embodiments, inclusion criteria may be formulated for the tissue sample 145. For example, the collected tissue sample may be required to be surgical tissue, fresh tissue, puncture tissue, 10% neutral formalin-fixed tissue, paraffin-embedded tissue, and the like. As another example, unstained paraffin sections may be provided as 10 sections of 5 microns or 5 sections of 10 microns, and, to ensure that the sectioned tissue contains a sufficient proportion of tumor cells (e.g., tumor cells > 70%), a matching HE-stained section may be included (or the tumor cell content of the submitted specimen after HE staining may be reported by mail). As another example, for surgical or puncture tissue, the collected sample volume may be required to be > 0.3 cm³, and the sample may be quickly placed into an EP tube.
As another example, sample shipping criteria may be established: unstained paraffin sections may be sent for testing at room temperature within 2 weeks of being cut, for example in an EP tube whose opening is sealed with sealing film to prevent leakage during transport, and the pathology number of the submitted sample is written on the application form. As another example, criteria for screening tissue samples may be established, such as sample rejection criteria: tissue not fixed in 10% neutral formalin fixative, a mismatch between the submitted sample information and the application form, tissue autolysis or degeneration, and the like.
Step 520, extracting DNA from the tissue sample.
In some embodiments, the method of extracting DNA of a tissue sample may include a cetyltrimethylammonium bromide method (CTAB method), a glass bead method, an ultrasonic method, a milling method, a freeze-thaw method, a guanidinium isothiocyanate method, an alkaline lysis method, an enzymatic method, and the like, or any combination thereof. In some embodiments, any known method can be used to extract DNA from a tissue sample, which is not limited by the embodiments of the present application.
Step 530, a library of the DNA is prepared.
In some embodiments, the library preparation process may include some or all of the steps of DNA fragmentation, end repair, bead fragment screening, end tailing, linker ligation, PCR enrichment, sequencing by hybridization, etc. In addition, any known method can be used to prepare a library of DNA from a tissue sample, which is not limited in the examples of the present application.
Step 540, performing gene sequencing according to the library to obtain a sequencing result.
In some embodiments, the prepared library may be subjected to gene sequencing to obtain sequencing data. The gene sequencing technology may be a high-throughput sequencing technology. High-throughput sequencing technology ("NGS") may include one of, or any combination of: single-molecule real-time sequencing (Pacific Biosciences), ion semiconductor sequencing (Ion Torrent), pyrosequencing (454), sequencing by synthesis (Illumina), sequencing by ligation (SOLiD), and chain termination sequencing (Sanger). In addition, any known method may be used for gene sequencing, which is not limited in the embodiments of the present application.
Step 550, analyzing the sequencing result to determine the gene mutation information of the tumor patient.
In some embodiments, data analysis may be performed on the obtained sequencing data to obtain the gene mutation information of the tumor patient (including the genes mutated on the DNA and their mutation abundances, and/or the tumor prognosis prediction related genes on the DNA, the mutation abundance of each mutation site, the gene mutation abundance, etc.). In some embodiments, the gene mutation abundance may be the cumulative sum of the mutation abundances of those sites in the sequencing result whose single nucleotide variation (SNV) abundance is greater than a set value. The set value may be 0.05%, 0.1%, 0.2%, 1%, 2%, 3%, or 10%, and so forth. The mutation abundance of a mutation site refers to the proportion of reads carrying a given base mutation. Specifically, mutation abundance of a mutation site = number of mutant reads / (number of mutant reads + number of wild-type reads), where a read is a short sequencing fragment. For example, suppose sequencing shows that the mutant gene KMT2C of a certain patient has 5 mutation sites with mutation abundances of 1%, 3%, 4%, 6%, and 8%, and the set value is 2%. The mutation abundance of KMT2C is then the cumulative sum over the 4 mutation sites whose abundance exceeds 2%, i.e., 3% + 4% + 6% + 8% = 21%. In some embodiments, the data analysis may include: (1) removing adapter sequences from the sequencing data; (2) performing quality control and removing low-quality sequencing data (e.g., low-quality bases, excessively short reads, etc.); (3) aligning the processed sequencing data to reference gene data to identify mutant genes; (4) eliminating normal variation of genes (e.g., polymorphic variation, synonymous variation, etc.); (5) obtaining the gene mutation information of the tumor patient, and the like.
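The two abundance definitions above can be illustrated with a short Python sketch. The function names are assumptions introduced here for illustration; the numbers reproduce the KMT2C example from the text, and the read counts in the last line are made up.

```python
def site_abundance(mutant_reads, wildtype_reads):
    """Mutation abundance of one site: mutant / (mutant + wild-type) reads."""
    total = mutant_reads + wildtype_reads
    return mutant_reads / total if total else 0.0


def gene_abundance(site_abundances, set_value):
    """Gene mutation abundance: sum over sites whose abundance exceeds set_value."""
    return sum(a for a in site_abundances if a > set_value)


# KMT2C example from the text: five sites, set value 2%.
kmt2c_sites = [0.01, 0.03, 0.04, 0.06, 0.08]
kmt2c_abundance = gene_abundance(kmt2c_sites, set_value=0.02)
print(round(kmt2c_abundance, 4))          # the 1% site is excluded from the sum

# Illustrative read counts (hypothetical): 30 mutant reads, 970 wild-type reads.
print(site_abundance(30, 970))
```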
In some embodiments, the reference gene data may be normal gene data (e.g., gene data from normal cells at a non-tumor site of the tumor patient, gene data of a non-tumor patient, etc.), gene data of the corresponding tumor disease (e.g., prognosis prediction related genes for each tumor), or the like. 93 patients were sequenced by the above sequencing method. The target region coverage was 98.2% to 99.6%, with a mean of 99.41%; the average sequencing depth of the target region was 462.7 to 1252.89, with a mean of 705.51; and the target region capture efficiency was 75.6% to 84.6%, with a mean of 80.01%. In some embodiments, the reference gene data may be stored in the database 130 and retrieved from the database 130 when needed. In some embodiments, the mutation abundance of a gene may also be determined using any known method, for example, second-generation sequencing, BEAMing, PARE, etc.
Sequencing shows that different mutant genes are distributed differently across patient samples. FIGS. 7-9 are gene mutation heat maps of osteosarcoma patients according to some embodiments of the present application, wherein FIG. 7 is a gene mutation heat map of all osteosarcoma patients according to an exemplary embodiment of the present application; FIG. 8 is a gene mutation heat map of osteosarcoma patients with good treatment effect according to an exemplary embodiment of the present application; and FIG. 9 is a gene mutation heat map of osteosarcoma patients with poor treatment effect according to an exemplary embodiment of the present application.
In this example, the corresponding tissue and cell can be extracted from the target lesion (osteosarcoma lesion site) of osteosarcoma patients (93 samples of osteosarcoma patients as shown in FIG. 7), and the gene mutation information of osteosarcoma patients can be determined from the tissue and cell. Specifically, the genetic mutation information of osteosarcoma patients can be determined by the above procedure for determining genetic mutation information of tumor patients.
In this example, 315 genes of each sample (genes reported in the prior literature to have a relatively significant effect on cancer) were tested for mutations (e.g., gene mutation abundance). In some alternative embodiments, the number of genes tested may be increased or decreased as appropriate. Heat maps of the top 29 mutated genes for all osteosarcoma patients, patients with good prognosis, and patients with poor prognosis are shown in FIGS. 7-9, where the left ordinate represents the proportion of the 93 samples in which a given gene is mutated, the right ordinate represents the mutant gene, and the abscissa represents the sample. Specifically, in this example, the mutant genes with a high mutation ratio in the samples (part of which are shown in FIGS. 7-9) include: lysine methyltransferase 2C (KMT2C), SRY-box 9 (SOX9), LDL receptor related protein 1B (LRP1B), neurofibromin 1 (NF1), protein kinase, DNA-activated, catalytic subunit (PRKDC), FAT atypical cadherin 1 (FAT1), slit guidance ligand 2 (SLIT2), Notch1, EPH receptor A7 (EPHA7), ATRX, lysine demethylase 6A (KDM6A), APC, RAN binding protein 2 (RANBP2), ROS proto-oncogene 1 (ROS1), EMSY (C11orf30), [...], stromal antigen 2 (STAG2), polybromo 1 (PBRM1), melanogenesis associated transcription factor (MITF), cytochrome P450 family 2 subfamily C member 8 (CYP2C8), phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha (PIK3CA), phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit beta (PIK3CB), B-Raf proto-oncogene (BRAF), MET proto-oncogene (MET), [...], BRCA2 DNA repair associated (BRCA2), cell division cycle 73 (CDC73), cyclin dependent kinase 12 (CDK12), CREB binding protein (CREBBP), catenin alpha 1 (CTNNA1), CYLD lysine 63 deubiquitinase (CYLD), EPH receptor A3 (EPHA3), EPH receptor B1 (EPHB1), erb-b2 receptor tyrosine kinase 3 (ERBB3), erb-b2 receptor tyrosine kinase 4 (ERBB4), ERBB receptor feedback inhibitor 1 (ERRFI1), FA complementation group A (FAA), FA complementation group D2 (FANCD2), [...], mutL homolog 1 (MLH1), MYC proto-oncogene (MYC), MYCN proto-oncogene (MYCN), NFKB inhibitor alpha (NFKBIA), PARK2, phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit gamma (PIK3CG), phosphoinositide-3-kinase regulatory subunit 2 (PIK3R2), protein kinase C iota (PRKCI), patched 1 (PTCH1), ret proto-oncogene (RET), SET domain containing 2 (SETD2), SMAD family member 4 (SMAD4), SMARCA4, [...].
In addition, after 315 genes of each patient were sequenced, it was also found that the mutation abundances of different mutant genes in the patient samples were different, as shown in table 1.
Table 1 List of mutant genes and their mutation abundances for each patient (only 10 patients with good prognosis and 10 patients with poor prognosis are shown as examples).
It should be noted that the above description of the process for determining gene mutation information of tumor patients is only for illustration and description, and does not limit the application scope of the present application. For those skilled in the art, any information on gene mutation in tumor patients obtained by other technical means can be used under the guidance of the present application for the technical purpose of prognosis prediction of patients.
Fig. 6 is an exemplary flow chart for training to obtain a tumor prognosis prediction model according to some embodiments of the present application. Specifically, the process shown in fig. 6 (e.g., step 610, step 620, etc.) may be performed by the training module 330. As shown in fig. 6, an exemplary procedure for training to obtain a tumor prognosis prediction model may include:
Step 610, acquiring characteristic information and prognosis information of a plurality of tumor patients.
In some embodiments, the characteristic information of the plurality of tumor patients may include gene mutation information of the tumor patients, basic information of the tumor patients, and the like. Specifically, the gene mutation information of the plurality of tumor patients may include the genes mutated in the DNA of each tumor patient and their mutation abundances. In some embodiments, the gene mutation information of the plurality of tumor patients may be gene mutation information of the tumor patients at a tumor site (e.g., a target lesion). For a specific method of determining the gene mutation information of the plurality of tumor patients, reference may be made to the flow for determining gene mutation information of a tumor patient described in FIG. 5. The basic information of a tumor patient may reflect information related to the tumor patient other than the gene mutation information. For example, the basic information of a tumor patient may include the patient's age, sex, smoking history, years of education, years of work, treatment plan, sample storage time, type of medication, etc., or any combination thereof.
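As a rough illustration of how such characteristic information might be arranged for training, the sketch below turns per-patient records into fixed-length feature rows. The record layout, the helper name `build_feature_matrix`, and all values are hypothetical; encoding an unmutated gene as 0.0 and a poor prognosis as label 1 are assumptions of this sketch, not requirements of the method.

```python
def build_feature_matrix(patients, gene_panel, basic_fields=()):
    """Return one numeric row per patient: gene abundances, then basic info.

    A gene with no detected mutation is encoded as 0.0 (an assumption).
    """
    rows = []
    for patient in patients:
        row = [patient["mutations"].get(gene, 0.0) for gene in gene_panel]
        row += [float(patient["basic"][field]) for field in basic_fields]
        rows.append(row)
    return rows


# Hypothetical records; abundances and ages are made up for illustration.
patients = [
    {"mutations": {"KMT2C": 0.21, "SOX9": 0.05}, "basic": {"age": 17},
     "prognosis": "good"},
    {"mutations": {"LRP1B": 0.12}, "basic": {"age": 23},
     "prognosis": "poor"},
]
gene_panel = ["KMT2C", "SOX9", "LRP1B"]

X = build_feature_matrix(patients, gene_panel, basic_fields=("age",))
y = [1 if p["prognosis"] == "poor" else 0 for p in patients]
print(X, y)
```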
In some embodiments, the prognosis information of the plurality of tumor patients may be classified into four categories according to changes in the target lesion: progressive disease (PD), stable disease (SD), partial remission (PR), and complete remission (CR). As another example, the prognosis information may include: good treatment effect and poor treatment effect. In some embodiments, the prognosis information may also be a numerical value of a specific index. For example, the prognosis information may include, but is not limited to, disease remission rate, disease recurrence rate, recurrence of the disease within a given number of years, survival rate, survival time, short-term mortality, long-term mortality, in-hospital mortality, out-of-hospital mortality, surgical mortality, and the like. In some embodiments, the prognosis information described here may correspond to the prognosis prediction result determined in step 420.
Step 620, training an initial model with the characteristic information and prognosis information of the plurality of tumor patients to obtain the tumor prognosis prediction model. In some embodiments, the tumor prognosis prediction model may be a supervised learning model. Specifically, the supervised learning model may include one or more of a support vector machine model, a decision tree model, a neural network model, a nearest neighbor classifier, and the like. In this embodiment, a support vector machine model is taken as an example to describe the training process of the tumor prognosis prediction model.
In some embodiments, initial model parameters (e.g., parameter c (cost), parameter g (gamma), etc.) may be set to establish an initial support vector machine model. The optimal model parameters (e.g., parameter c (cost), parameter g (gamma), etc.) may then be searched for by a grid search method, based on the characteristic information and prognosis information of the plurality of tumor patients, so as to update and optimize the model. In some embodiments, a kernel function of the support vector machine model (e.g., a linear kernel function, polynomial kernel function, Gaussian (RBF) kernel function, or sigmoid kernel function) may be selected, and the support vector machine model may be obtained by training on the characteristic information and prognosis information of the plurality of tumor patients. In some embodiments, the optimal model parameters may be found by combining the grid search method with a validation method. For example, the model parameters (e.g., parameter c (cost), parameter g (gamma)) are adjusted by the grid search method, the model with the adjusted parameters is validated, and the optimal model parameters are selected according to the validation results.
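The grid search over the pair (c, g) can be sketched as follows. To keep the example self-contained, the validation step is represented by a caller-supplied `score_fn(c, g)`; the `toy_score` surface used here is purely hypothetical and only stands in for the validation accuracy of a support vector machine trained with those parameters. The power-of-two grids are likewise an assumption.

```python
import itertools
import math


def grid_search(score_fn, c_values, g_values):
    """Try every (c, g) pair and return (best_score, best_c, best_g)."""
    best = None
    for c, g in itertools.product(c_values, g_values):
        score = score_fn(c, g)           # e.g. cross-validated accuracy
        if best is None or score > best[0]:
            best = (score, c, g)
    return best


# Assumed search grids: powers of two, a common choice for c and g.
c_grid = [2.0 ** k for k in range(-5, 11, 2)]
g_grid = [2.0 ** k for k in range(-11, 4, 2)]


def toy_score(c, g):
    """Hypothetical stand-in for validation accuracy, peaking at c=8, g=0.125."""
    return 1.0 - 0.01 * ((math.log2(c) - 3) ** 2 + (math.log2(g) + 3) ** 2)


best_score, best_c, best_g = grid_search(toy_score, c_grid, g_grid)
print(best_score, best_c, best_g)
```

In practice `score_fn` would train the model on the training patients with the given parameters and return the validation result, so that the selected pair reflects the validation-driven choice described above.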
In still other embodiments, a particle swarm optimization algorithm may be employed to optimize the parameters of the support vector machine model. Specifically, the parameters of the particle swarm optimization algorithm may be initialized first, and the particle swarm optimization algorithm is then used to find the optimal parameters for updating the model (e.g., the parameter pair c, g, etc.), which are taken as the optimized model parameters. The particle swarm optimization algorithm may include, but is not limited to, a basic particle swarm optimization algorithm, an adaptive mutation particle swarm optimization algorithm, and the like. The parameters of the particle swarm optimization algorithm may include a local search capability parameter, a global search capability parameter, an elasticity coefficient of the velocity update, a maximum number of generations, a maximum population size, the number of cross-validation folds, the variation range of parameter c, the variation range of parameter g, and the like, or any combination thereof. In some embodiments, the parameters of the particle swarm optimization algorithm may be initialized manually or non-manually.
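A basic particle swarm search over (log2 c, log2 g) might look like the sketch below. The mapping of `c1`, `c2`, and `w` to the local search, global search, and velocity-update coefficients, the default values, and the toy objective are all assumptions made for illustration; in a real run the objective would be the cross-validated performance of the support vector machine.

```python
import random


def pso(score_fn, bounds, n_particles=20, n_iter=100,
        w=0.729, c1=1.49445, c2=1.49445, seed=0):
    """Maximise score_fn(position) inside box bounds with a basic PSO.

    c1 / c2: local / global search capability parameters; w: velocity-update
    coefficient; n_iter: number of generations; n_particles: population size.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_score = [score_fn(p) for p in pos]
    g_idx = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g_idx][:], pbest_score[g_idx]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            score = score_fn(pos[i])
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score > gbest_score:
                    gbest, gbest_score = pos[i][:], score
    return gbest, gbest_score


def toy_objective(p):
    """Hypothetical stand-in for validation accuracy at (log2 c, log2 g)."""
    return 1.0 - 0.01 * ((p[0] - 3.0) ** 2 + (p[1] + 3.0) ** 2)


search_bounds = [(-5.0, 10.0), (-11.0, 3.0)]   # assumed ranges for log2 c, log2 g
best_pos, best_val = pso(toy_objective, search_bounds)
print(best_pos, best_val)
```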
In other embodiments, the grid search and the particle swarm optimization algorithm can be jointly adopted to optimize the parameters of the support vector machine model. For example, the parameters of the support vector machine model may be optimized by grid search and then optimized again by the particle swarm optimization algorithm.
In order to improve the model precision or the training efficiency, the feature information of a plurality of tumor patients can be further screened, and the screened feature information is utilized to carry out model training.
In some embodiments, mutant gene information whose mutation abundance is less than a set threshold may be removed from the gene mutation information of the plurality of tumor patients. The gene mutation abundance may be the cumulative sum of the mutation abundances of a plurality of different mutation sites in the gene. A threshold for the mutation abundance of a mutation site (e.g., 0.05%, 0.1%, 0.2%, 1%, 2%, 3%, etc.) may be set manually, and mutation information whose abundance is less than the set threshold may be removed. For example, a mutation site whose abundance is less than a certain value (e.g., 0.05%, 0.1%, 0.2%, etc.) may be excluded when computing the mutation abundance of the gene.
In some embodiments, redundant gene mutation information may be removed from the gene mutation information of the plurality of tumor patients. Specifically, the gene mutation information may contain two or more genes that are highly correlated with each other. In some embodiments, two genes are considered highly correlated when their mutations are identical or similar, or when their mutation abundances behave similarly. Among such highly correlated genes, one or more may be regarded as redundant genes. By removing redundant gene mutation information (e.g., keeping only one of the highly correlated genes), the gene dimensionality can be effectively reduced without affecting the model training effect.
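One possible way to detect such redundancy is to compare the abundance profiles of genes across patients. The sketch below uses the Pearson correlation coefficient with a cut-off of 0.95; both the choice of measure and the cut-off are assumptions of this illustration, and the gene names and values are placeholders.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def drop_redundant(features, max_corr=0.95):
    """Keep a gene only if it is not highly correlated with a gene already kept."""
    kept = []
    for gene, values in features.items():
        if all(abs(pearson(values, features[k])) < max_corr for k in kept):
            kept.append(gene)
    return kept


# Placeholder abundance profiles across four patients (made-up numbers).
abundance = {
    "GENE_A": [0.10, 0.00, 0.25, 0.05],
    "GENE_B": [0.20, 0.00, 0.50, 0.10],   # proportional to GENE_A, so redundant
    "GENE_C": [0.00, 0.30, 0.02, 0.00],
}
kept_genes = drop_redundant(abundance)
print(kept_genes)
```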
In some embodiments, at least part of the genes that are relevant for predicting the tumor prognosis may be determined based on the contribution of each gene mutation information in the feature information of the plurality of tumor patients to the support vector machine model.
In some embodiments, the individual items of gene mutation information in the characteristic information of the plurality of tumor patients may be further screened. Specifically, a recursive feature elimination method may be used to screen the gene mutation information in the characteristic information of the plurality of tumor patients. Taking the prediction accuracy of the model as the evaluation criterion, each item of gene mutation information in the characteristic information is eliminated in turn to obtain a plurality of training sets, a model is trained on each training set, and the contribution values of the eliminated items of gene mutation information are ranked based on the prediction accuracy of the corresponding models. Finally, the items of gene mutation information are screened according to their contribution values to obtain at least part of the tumor prognosis prediction related genes. In some embodiments, a random forest algorithm may also be used to screen the gene mutation information in the characteristic information of the plurality of tumor patients.
Specifically, (1) decision trees are first constructed: the number P of trees in the forest (e.g., 20 trees, 40 trees, etc.) may be defined; a training sample set for each decision tree may be drawn from the 93 samples by bootstrap sampling, that is, in each of P sampling rounds, 93 draws with replacement from the 93 samples yield the training set of one decision tree; at each node of a decision tree, assuming there are 315 feature variables in total, m feature variables are randomly drawn from the 315, the feature giving the optimal split among the m is selected for branching, and no pruning is performed during growth; (2) the P trained decision trees are combined to obtain the random forest. Each of the plurality of tumor patients is predicted by the P decision trees, and the final prediction, obtained by weighting or voting, is the output of the random forest. During the training of each decision tree, the amount by which each feature reduces the impurity of the tree can be calculated. For the forest of decision trees, the average impurity reduction of each feature can be calculated and used as the criterion for evaluating contribution values. For example, the gene mutation information that reduces impurity the most may be taken as the feature with the largest contribution value, and so on, so that the contribution values of different mutant genes to the model can be determined (as shown in table 2), and at least part of the tumor prognosis prediction related genes can be screened out. For example, the n (e.g., 20, 29, 40, 100, etc.) mutant genes with the largest contribution to the tumor prognosis prediction model may be selected, from among the mutant genes having a significant influence on tumor occurrence, as the tumor prognosis prediction related genes.
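Two building blocks of this procedure, bootstrap sampling and the impurity reduction of a split, can be sketched as follows. Gini impurity is assumed as the impurity measure, and the helper `mean_impurity_decrease` evaluates one fixed split per bootstrap sample rather than growing full trees, so it is only a simplified illustration of the averaging idea. All data values are made up.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement (one training set for one tree)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def impurity_decrease(values, labels, cut):
    """Impurity reduction obtained by splitting on value <= cut."""
    left = [y for v, y in zip(values, labels) if v <= cut]
    right = [y for v, y in zip(values, labels) if v > cut]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


def mean_impurity_decrease(values, labels, cut, n_trees=20, seed=0):
    """Average the reduction over n_trees bootstrap samples (simplified)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trees):
        idx = bootstrap_indices(len(labels), rng)
        total += impurity_decrease([values[i] for i in idx],
                                   [labels[i] for i in idx], cut)
    return total / n_trees


labels = [0, 0, 1, 1]                       # 1 = poor prognosis (made-up data)
informative = [0.0, 0.0, 0.3, 0.4]          # abundance separates the classes
uninformative = [0.0, 0.3, 0.0, 0.4]        # abundance unrelated to the label
print(impurity_decrease(informative, labels, cut=0.1))
print(impurity_decrease(uninformative, labels, cut=0.1))
print(mean_impurity_decrease(informative, labels, cut=0.1))
```

Ranking the genes by this averaged reduction and keeping the n largest corresponds to the selection of the top contributing mutant genes described above.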
TABLE 2 List of contribution values of different mutant genes to the model
In some embodiments, the trained tumor prognosis prediction model may be validated. For example, for the support vector machine model, cross-validation may be used to verify the model's performance. Specifically, the cross-validation method may include: the hold-out method, K-fold cross-validation (K-CV), and leave-one-out cross-validation (LOO-CV). Taking LOO-CV as an example, the training samples may be divided into as many parts as there are samples (e.g., 93); 1 sample is used as the validation sample and the remaining 92 samples are used as training samples and input into the initial support vector machine model for training. This process is repeated 93 times, so that each sample serves as the validation sample once, yielding 93 validation results, which are combined to determine the final validation result of the trained tumor prognosis prediction model. Further, a receiver operating characteristic curve (ROC curve) may be drawn from the validation results for visual presentation (as shown in fig. 10). As shown in FIG. 10, the points on the ROC curve represent the sensitivity and specificity of the osteosarcoma prognosis prediction model under different cutoff conditions (e.g., prognosis effect classification criteria). The point on the ROC curve nearest the upper left corner lies close to that corner, which indicates that the osteosarcoma prognosis prediction model obtained in this embodiment has high prediction accuracy; the area under the ROC curve (AUC) is 0.988, which is very close to 1 and indicates that the model has a good classification effect; in addition, the model has a high mean sensitivity (0.95) and mean specificity (0.97) under the different cutoff conditions.
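The leave-one-out loop and the AUC computation can be sketched as below. To stay self-contained, the example uses a nearest-centroid scorer as a stand-in for the support vector machine (an assumption of this sketch, not the model described above), and the six one-feature samples are made up. The AUC is computed as the fraction of (poor, good) pairs that the scores rank correctly.

```python
import math


def fit_centroids(X, y):
    """Stand-in model (NOT an SVM): mean feature vector of each class."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def decision_score(centroids, x):
    """Higher score = closer to the class-1 centroid than to the class-0 one."""
    d0 = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, centroids[0])))
    d1 = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, centroids[1])))
    return d0 - d1


def leave_one_out_scores(X, y):
    """Hold out each sample once, train on the rest, score the held-out sample."""
    scores = []
    for i in range(len(X)):
        model = fit_centroids(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        scores.append(decision_score(model, X[i]))
    return scores


def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up, well-separated toy data: one feature, label 1 = poor prognosis.
X_toy = [[0.0], [0.1], [0.2], [0.8], [0.9], [1.0]]
y_toy = [0, 0, 0, 1, 1, 1]
loo_scores = leave_one_out_scores(X_toy, y_toy)
print(auc(loo_scores, y_toy))
```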
In this example, 6 additional osteosarcoma patients were selected (4 known to have a poor prognosis and 2 known to have a good prognosis). Gene mutation information of their osteosarcoma lesion sites was obtained, and, based on this information, the prognosis prediction results of the 6 patients were determined with the osteosarcoma prognosis prediction model trained in this embodiment (as shown in table 3, where the threshold of the predicted value is set to 0.5; a value less than 0.5 indicates a good prognosis and a value greater than 0.5 indicates a poor prognosis). The prediction results were fully consistent with the known prognosis.
TABLE 3 Comparison of predicted and actual prognosis for osteosarcoma patients
Sample name | Predicted value | Predicted prognosis | Actual prognosis |
Patient 1 | 0.335717 | Good prognosis | Good prognosis |
Patient 2 | 0.44896 | Good prognosis | Good prognosis |
Patient 3 | 0.67417 | Poor prognosis | Poor prognosis |
Patient 4 | 0.735268 | Poor prognosis | Poor prognosis |
Patient 5 | 0.756405 | Poor prognosis | Poor prognosis |
Patient 6 | 0.930926 | Poor prognosis | Poor prognosis |
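The thresholding rule applied to Table 3 can be checked with a short sketch. The prediction values and labels are copied from the table. The function names are hypothetical, treating poor prognosis as the positive class is an assumption, and so is the handling of a value of exactly 0.5, which the text does not specify.

```python
from typing import List, Tuple

THRESHOLD = 0.5  # cutoff stated in the description

# (sample name, prediction value, actual prognostic effect), copied from Table 3
TABLE_3: List[Tuple[str, float, str]] = [
    ("Patient 1", 0.335717, "Good prognosis"),
    ("Patient 2", 0.44896, "Good prognosis"),
    ("Patient 3", 0.67417, "Poor prognosis"),
    ("Patient 4", 0.735268, "Poor prognosis"),
    ("Patient 5", 0.756405, "Poor prognosis"),
    ("Patient 6", 0.930926, "Poor prognosis"),
]


def classify(value: float, threshold: float = THRESHOLD) -> str:
    # Assumption: a value exactly equal to the threshold is treated as poor prognosis.
    return "Poor prognosis" if value >= threshold else "Good prognosis"


def sensitivity_specificity(rows: List[Tuple[str, float, str]]) -> Tuple[float, float]:
    """Sensitivity and specificity at the fixed threshold, with poor prognosis as positive."""
    tp = fn = tn = fp = 0
    for _, value, actual in rows:
        predicted = classify(value)
        if actual == "Poor prognosis":
            if predicted == actual:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == actual:
                tn += 1
            else:
                fp += 1
    return tp / (tp + fn), tn / (tn + fp)


predictions = [classify(value) for _, value, _ in TABLE_3]
all_match = all(pred == actual for pred, (_, _, actual) in zip(predictions, TABLE_3))
sens, spec = sensitivity_specificity(TABLE_3)
print(all_match, sens, spec)
```

Six samples are far too few to estimate accuracy reliably; the sketch only reproduces the consistency check reported for these patients.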
It should be noted that the above description related to the flow 600 is only for illustration and explanation, and does not limit the applicable scope of the present application. Various modifications and changes to flow 600 may occur to those skilled in the art, given the benefit of this disclosure. However, such modifications and variations are intended to be within the scope of the present application.
The beneficial effects that may be brought by the embodiments of the present application include, but are not limited to: (1) the prognosis effect of a tumor patient can be predicted based on the gene mutation information of the tumor patient; (2) the accuracy of tumor prognosis prediction is improved; (3) the tumor prognosis prediction process is convenient to implement; (4) a reference is provided for the formulation and selection of treatment schemes. It is to be noted that different embodiments may produce different advantages, and in different embodiments, any one or a combination of the above advantages may be produced, or any other advantages may be obtained.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be considered merely illustrative and not restrictive of the broad application. Various modifications, improvements and adaptations to the present application may occur to those skilled in the art, although not explicitly described herein. Such modifications, improvements and adaptations are proposed in the present application and thus fall within the spirit and scope of the exemplary embodiments of the present application.
Also, this application uses specific language to describe embodiments of the application. Reference throughout this specification to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with at least one embodiment of the present application is included in at least one embodiment of the present application. Therefore, it is emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, some features, structures, or characteristics of one or more embodiments of the present application may be combined as appropriate.
Moreover, those skilled in the art will appreciate that aspects of the present application may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereon. Accordingly, various aspects of the present application may be embodied entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in a combination of hardware and software. The above hardware or software may be referred to as a "data block," "module," "engine," "unit," "component," or "system." Furthermore, aspects of the present application may be represented as a computer product, including computer readable program code, embodied in one or more computer readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of the present application may be written in any one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, Python, and the like, a conventional programming language such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, a dynamic programming language such as Python, Ruby, and Groovy, or other programming languages, and the like. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service, such as software as a service (SaaS).
Additionally, the order in which elements and sequences of the processes described herein are processed, the use of alphanumeric characters, or the use of other designations, is not intended to limit the order of the processes and methods described herein, unless explicitly claimed. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the application, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not intended to require more features than are expressly recited in the claims. Indeed, the embodiments may be characterized as having fewer than all of the features of a single embodiment disclosed above.
Numerals describing the number of components, attributes, etc. are used in some embodiments, it being understood that such numerals used in the description of the embodiments are modified in some instances by the use of the modifier "about", "approximately" or "substantially". Unless otherwise indicated, "about", "approximately" or "substantially" indicates that the number allows a variation of ± 20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending upon the desired properties of the individual embodiments. In some embodiments, the numerical parameter should take into account the specified significant digits and employ a general digit preserving approach. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the range are approximations, in the specific examples, such numerical values are set forth as precisely as possible within the scope of the application.
The entire contents of each patent, patent application publication, and other material cited in this application, such as articles, books, specifications, publications, and documents, are hereby incorporated by reference into this application, except for any application history documents that are inconsistent with or in conflict with the contents of this application, and except for any documents (currently or later attached to this application) that limit the broadest scope of the claims of this application. It is noted that if the descriptions, definitions, and/or use of terms in the material attached to this application are inconsistent with or contrary to those stated in this application, the descriptions, definitions, and/or use of terms in this application shall control.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present application. Other variations are also possible within the scope of the present application. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the present application can be viewed as being consistent with the teachings of the present application. Accordingly, the embodiments of the present application are not limited to only those embodiments explicitly described and depicted herein.
Claims (30)
1. A method for predicting prognosis of a tumor, comprising:
acquiring characteristic information of a tumor patient, wherein the characteristic information at least reflects gene mutation information of the tumor patient;
and determining the prognosis prediction result of the tumor patient according to a tumor prognosis prediction model based on the characteristic information of the tumor patient.
2. The method of predicting tumor prognosis of claim 1 wherein the gene mutation information includes a gene mutated in DNA and its mutation abundance, and/or a gene related to tumor prognosis prediction on DNA and its mutation abundance.
3. The method of predicting prognosis of tumor according to claim 1, wherein said obtaining the characteristic information of the tumor patient further comprises:
obtaining a tissue sample from the tumor patient;
extracting DNA of the tissue sample;
preparing a library of the DNA;
performing gene sequencing according to the library to obtain a sequencing result;
analyzing the sequencing result to determine the gene mutation information of the tumor patient.
4. The method of predicting prognosis of a tumor according to claim 1, wherein the characteristic information further includes at least one of the following information of the tumor patient: age, gender, smoking history, educational age, working age, treatment regimen, and sample storage time.
5. The method of predicting prognosis of a tumor according to claim 1, wherein the model for predicting prognosis of a tumor is a support vector machine model or a neural network model.
6. The method of predicting prognosis of a tumor according to claim 1, further comprising:
and training an initial model by utilizing the characteristic information and the prognosis information of a plurality of tumor patients to obtain the tumor prognosis prediction model.
7. The method of claim 6, wherein the training of the initial model using the feature information of a plurality of tumor patients and their prognosis information to obtain the tumor prognosis prediction model comprises:
and removing mutant gene information of which the mutation abundance is less than a certain set threshold value from the gene mutation information of the plurality of tumor patients.
8. The method of claim 6, wherein the training of the initial model using the feature information of a plurality of tumor patients and their prognosis information to obtain the tumor prognosis prediction model comprises:
removing redundant gene mutation information in the gene mutation information of the plurality of tumor patients.
9. The method of predicting prognosis of a tumor according to claim 6, wherein
the tumor prognosis prediction model is a support vector machine model;
the training of an initial model by using the characteristic information and the prognosis information of a plurality of tumor patients to obtain the tumor prognosis prediction model comprises:
determining at least part of genes as tumor prognosis prediction related genes according to the contribution value of each gene mutation information in the feature information of a plurality of tumor patients to the support vector machine model;
and training the initial model by using the gene mutation information and the prognosis information of the genes related to the prognosis prediction of the tumors of a plurality of tumor patients to obtain the prognosis prediction model of the tumors.
10. The method of predicting prognosis of a tumor according to claim 6, wherein
the tumor prognosis prediction model is a support vector machine model;
the training the initial model to obtain the tumor prognosis prediction model further comprises: and optimizing the parameters of the support vector machine model by utilizing a particle swarm algorithm or a grid division method.
11. The method of predicting prognosis of a tumor according to claim 1, wherein
the prognostic prediction results include: disease progression, disease stabilization, partial remission and complete remission; or,
the prognostic prediction results include: good curative effect and poor curative effect.
12. The method of any one of claims 1-11, wherein the tumor is an osteosarcoma.
13. The method of predicting prognosis of a tumor according to claim 12, wherein the characteristic information reflects at least mutation information of at least one of the following genes in osteosarcoma patients: KMT2C, SOX9, LRP1B, NF-1, PRKDC, FAT1, STAG2, SLIT2, NOTCH1, EPHA7, ATRX, KDM6A, APC, RANBP2, RARA. AS1, C11orf30, ROS1, ARID2, TAF1, DICER1, MSH2, MSH6, TP53, KDM5A, JAK2, ALK, RB1, NOTCH2, and RICTOR.
14. The method of predicting prognosis of tumor according to claim 12, wherein said information on gene mutation of tumor patient is information on gene mutation of osteosarcoma lesion site.
15. A tumor prognosis prediction system is characterized by comprising an acquisition module and a prediction module, wherein,
the acquisition module is used for acquiring characteristic information of a tumor patient, and the characteristic information at least reflects gene mutation information of the tumor patient;
the prediction module is used for determining the prognosis prediction result of the tumor patient according to the tumor prognosis prediction model based on the characteristic information of the tumor patient.
16. The system of claim 15, wherein the gene mutation information includes a gene mutated in DNA and its mutation abundance, and/or a gene related to tumor prognosis prediction on DNA and its mutation abundance.
17. The tumor prognosis prediction system of claim 15 wherein the characteristic information further comprises at least one of the following information of the tumor patient: age, gender, smoking history, educational age, working age, treatment regimen, and sample storage time.
18. The tumor prognosis prediction system of claim 15 wherein the tumor prognosis prediction model is a support vector machine model or a neural network model.
19. The system of claim 15, further comprising a training module for training an initial model to obtain the prognosis prediction model by using the feature information of a plurality of tumor patients and their prognosis information.
20. The system of claim 19, wherein the training module is further configured to remove mutated gene information from the gene mutation information of the plurality of tumor patients, wherein the abundance of mutation is less than a predetermined threshold.
21. The system of claim 19, wherein the training module is further configured to remove redundant gene mutation information from the gene mutation information of the plurality of tumor patients.
22. The tumor prognosis prediction system of claim 19, wherein
the tumor prognosis prediction model is a support vector machine model;
the training module is further configured to:
determining at least part of genes as tumor prognosis prediction related genes according to the contribution value of each gene mutation information in the feature information of a plurality of tumor patients to the support vector machine model;
and training the initial model by using the gene mutation information and the prognosis information of the genes related to the prognosis prediction of the tumors of a plurality of tumor patients to obtain the prognosis prediction model of the tumors.
23. The tumor prognosis prediction system of claim 19, wherein
the tumor prognosis prediction model is a support vector machine model;
the training module is further used for optimizing parameters of the support vector machine model by utilizing a particle swarm algorithm or a grid division method.
24. The tumor prognosis prediction system of claim 15, wherein
the prognostic prediction results include: disease progression, disease stabilization, partial remission and complete remission; or,
the prognostic prediction results include: good curative effect and poor curative effect.
25. The system of any one of claims 15-24, wherein the tumor is an osteosarcoma.
26. The tumor prognosis prediction system of claim 25 wherein the characteristic information reflects at least mutation information of at least one of the following genes in osteosarcoma patients: KMT2C, SOX9, LRP1B, NF-1, PRKDC, FAT1, STAG2, SLIT2, NOTCH1, EPHA7, ATRX, KDM6A, APC, RANBP2, RARA. AS1, C11orf30, ROS1, ARID2, TAF1, DICER1, MSH2, MSH6, TP53, KDM5A, JAK2, ALK, RB1, NOTCH2, and RICTOR.
27. The system of claim 26, wherein the tumor patient gene mutation information is gene mutation information of osteosarcoma lesion site.
28. An apparatus for prognosis of a tumor, the apparatus comprising at least one processor and at least one memory;
the at least one memory is for storing computer instructions;
the at least one processor is configured to execute at least a portion of the computer instructions to implement the method of any of claims 1-11.
29. A computer-readable storage medium storing computer instructions which, when executed by a processor, implement a method of prognosis prediction of a tumor according to any one of claims 1 to 11.
30. A tumor prognosis prediction system comprising:
at least one computer-readable storage medium comprising a set of instructions for prognosis prediction of a tumor; and
at least one processor in communication with the at least one storage medium, wherein the at least one processor, when executing the set of instructions, is configured to:
acquire characteristic information of a tumor patient, wherein the characteristic information at least reflects gene mutation information of the tumor patient; and
determine the prognosis prediction result of the tumor patient according to a tumor prognosis prediction model based on the characteristic information of the tumor patient.
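The preprocessing recited in claims 7 and 8 (and the corresponding system claims 20 and 21) can be illustrated with a minimal sketch. The claims do not fix the abundance threshold or define "redundant", so both are assumptions here: the threshold of 0.05 is hypothetical, and a gene is treated as redundant when its abundance values across all patients duplicate those of a gene already kept. All function names and sample values are illustrative.

```python
from typing import Dict, List, Tuple

# gene symbol -> mutation abundance (assumed to be a fraction between 0 and 1)
MutationProfile = Dict[str, float]


def filter_low_abundance(profiles: List[MutationProfile],
                         min_abundance: float) -> List[MutationProfile]:
    """Drop mutated-gene entries whose abundance is below the set threshold."""
    return [{gene: ab for gene, ab in profile.items() if ab >= min_abundance}
            for profile in profiles]


def drop_redundant_genes(profiles: List[MutationProfile]) -> List[str]:
    """Return the genes to keep, dropping genes whose per-patient abundance
    vector duplicates that of an earlier gene (assumed meaning of 'redundant')."""
    genes = sorted({gene for profile in profiles for gene in profile})
    seen: Dict[Tuple[float, ...], str] = {}
    kept: List[str] = []
    for gene in genes:
        # A gene absent from a patient's profile counts as abundance 0.0.
        vector = tuple(profile.get(gene, 0.0) for profile in profiles)
        if vector not in seen:
            seen[vector] = gene
            kept.append(gene)
    return kept


# Hypothetical abundance values for two patients.
raw_profiles = [
    {"TP53": 0.42, "RB1": 0.03, "ATRX": 0.30, "NOTCH1": 0.30},
    {"TP53": 0.10, "RB1": 0.01, "ATRX": 0.25, "NOTCH1": 0.25},
]
filtered = filter_low_abundance(raw_profiles, min_abundance=0.05)
kept_genes = drop_redundant_genes(filtered)
print(kept_genes)
```

Other definitions of redundancy, such as highly correlated rather than identical columns, would fit the claim language equally well; exact duplication is used here only because it needs no extra parameters.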
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/110565 WO2020077552A1 (en) | 2018-10-17 | 2018-10-17 | Tumor prognostic prediction method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109642258A (en) | 2019-04-16 |
CN109642258B (en) | 2020-06-09 |
Family
ID=66060220
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201880002164.4A (Active, CN109642258B) | Method and system for tumor prognosis prediction | 2018-10-17 | 2018-10-17 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109642258B (en) |
WO (1) | WO2020077552A1 (en) |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1764837A (en) * | 2003-03-24 | 2006-04-26 | 魏念之 | Methods for predicting an individual's clinical treatment outcome from sampling a group of patients' biological profiles |
CN102713606A (en) * | 2009-11-13 | 2012-10-03 | 无限制药股份有限公司 | Compositions, kits, and methods for identification, assessment, prevention, and therapy of cancer |
WO2012166700A2 (en) * | 2011-05-29 | 2012-12-06 | Lisanti Michael P | Molecular profiling of a lethal tumor microenvironment |
WO2013134786A2 (en) * | 2012-03-09 | 2013-09-12 | Caris Life Sciences Luxembourg Holdings, S.A.R.L. | Biomarker compositions and methods |
CN104769131A (en) * | 2012-09-21 | 2015-07-08 | 英特盖根公司 | A method for prognosis of global survival and survival without relapse in hepatocellular carcinoma |
WO2016141324A2 (en) * | 2015-03-05 | 2016-09-09 | Trovagene, Inc. | Early assessment of mechanism of action and efficacy of anti-cancer therapies using molecular markers in bodily fluids |
WO2017062989A1 (en) * | 2015-10-08 | 2017-04-13 | Urology Diagnostics, Inc. | Diagnostic assay for urine monitoring of bladder cancer |
JP2017216882A (en) * | 2016-06-02 | 2017-12-14 | 国立大学法人金沢大学 | Method of detecting the existence of osteopathy, osteopathy therapeutic agent, and method for screening osteopathy therapeutic agent |
CN107545144A (en) * | 2017-09-05 | 2018-01-05 | 上海市内分泌代谢病研究所 | Pheochromocytoma metastasis prediction system based on molecular marker |
CN107750279A (en) * | 2015-03-16 | 2018-03-02 | 个人基因组诊断公司 | Foranalysis of nucleic acids system and method |
CN108027372A (en) * | 2015-05-27 | 2018-05-11 | 卡纳比克斯制药公司 | System and method for high flux screening cancer cell |
US20180246104A1 (en) * | 2015-08-18 | 2018-08-30 | Agency For Science, Technology And Research | Method for detecting circulating tumor cells and uses thereof |
CN108474723A (en) * | 2015-12-02 | 2018-08-31 | 克莱尔莱特诊断有限责任公司 | Prepare and analyze the method that neoplasmic tissue sample is used to detecting and monitoring cancer |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106202969B (en) * | 2016-08-01 | 2018-10-23 | 东北大学 | A kind of tumor cells parting forecasting system |
CN106960122A (en) * | 2017-03-17 | 2017-07-18 | 晶能生物技术(上海)有限公司 | Genetic disease Forecasting Methodology and device caused by gene mutation |
CN107169264B (en) * | 2017-04-14 | 2019-12-06 | 广东药科大学 | complex disease diagnosis system |
CN107341366A (en) * | 2017-07-19 | 2017-11-10 | 西安交通大学 | A kind of method that complex disease susceptibility loci is predicted using machine learning |
CN107833636A (en) * | 2017-12-04 | 2018-03-23 | 浙江鸿赋堂健康管理有限公司 | A kind of tumour Forecasting Methodology |
CN108416190A (en) * | 2018-02-11 | 2018-08-17 | 广州市碳码科技有限责任公司 | Tumour methods for screening, device, equipment and medium based on deep learning |
Non-Patent Citations (5)
Title |
---|
CATERINA CHIAPPETTA等: "Whole-exome analysis in osteosarcoma to identify a personalized therapy", 《ONCOTARGET》 * |
MATLAB中文论坛: "《MATLAB神经网络30个案例分析》", 30 April 2010 * |
周敏: "《制造业信息化工程学》", 31 January 2017, 冶金工业出版社 * |
肖鑫等: "基于高通量测序技术的骨肉瘤基因研究与个体化治疗", 《中华骨科杂志》 * |
陈敏: "《认知计算导论》", 31 May 2017 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110675956A (en) * | 2019-08-26 | 2020-01-10 | 南京医渡云医学技术有限公司 | Method and device for determining facial paralysis treatment scheme, readable medium and electronic equipment |
CN110993106A (en) * | 2019-12-11 | 2020-04-10 | 深圳市华嘉生物智能科技有限公司 | Liver cancer postoperative recurrence risk prediction method combining pathological image and clinical information |
CN111528918A (en) * | 2020-04-30 | 2020-08-14 | 深圳开立生物医疗科技股份有限公司 | Tumor volume change trend graph generation device after ablation, equipment and storage medium |
CN111528918B (en) * | 2020-04-30 | 2023-02-21 | 深圳开立生物医疗科技股份有限公司 | Tumor volume change trend graph generation device after ablation, equipment and storage medium |
CN111784637A (en) * | 2020-06-04 | 2020-10-16 | 复旦大学附属中山医院 | Prognostic characteristic visualization method, system, equipment and storage medium |
CN112397172A (en) * | 2020-12-24 | 2021-02-23 | 上海墩庐生物医学科技有限公司 | Intelligent consultant internet application system for breast cancer survival |
CN113345564A (en) * | 2021-05-31 | 2021-09-03 | 电子科技大学 | Early prediction method and device for patient hospitalization duration based on graph neural network |
CN113345564B (en) * | 2021-05-31 | 2022-08-05 | 电子科技大学 | Early prediction method and device for patient hospitalization duration based on graph neural network |
CN114627969A (en) * | 2022-03-23 | 2022-06-14 | 中国医学科学院肿瘤医院 | Application of complement-associated-gene-based prognosis prediction model and kit for sarcoma patient |
Also Published As
Publication number | Publication date |
---|---|
CN109642258B (en) | 2020-06-09 |
WO2020077552A1 (en) | 2020-04-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109642258B (en) | Method and system for tumor prognosis prediction | |
US11996202B2 (en) | Cancer evolution detection and diagnostic | |
Chen et al. | Targeted gene expression profiling predicts meningioma outcomes and radiotherapy responses | |
CN109689891A (en) | The method of segment group spectrum analysis for cell-free nucleic acid | |
EP2622100A1 (en) | Gene marker sets and methods for classification of cancer patients | |
CN113066585A (en) | Method for efficiently and quickly evaluating prognosis of stage II colorectal cancer patient based on immune gene expression profile | |
Miller et al. | Chromosomal instability in untreated primary prostate cancer as an indicator of metastatic potential | |
Orzan et al. | A simplified integrated molecular and immunohistochemistry-based algorithm allows high accuracy prediction of glioblastoma transcriptional subtypes | |
Belvedere et al. | A computational index derived from whole-genome copy number analysis is a novel tool for prognosis in early stage lung squamous cell carcinoma | |
TWI671653B (en) | Subtyping of tnbc and methods | |
Rade et al. | A reliable transcriptomic risk-score applicable to formalin-fixed paraffin-embedded biopsies improves outcome prediction in localized prostate cancer | |
Feng et al. | A network-based method for identifying prognostic gene modules in lung squamous carcinoma | |
Xu et al. | Identification of Hub Genes as Biomarkers Correlated with the Proliferation and Prognosis in Lung Cancer: A Weighted Gene Co‐Expression Network Analysis | |
CN104846073B (en) | The biological markers of prostate cancer, therapy target and application thereof | |
CA3214391A1 (en) | Cell-free dna sequence data analysis method to examine nucleosome protection and chromatin accessibility | |
CN114496062A (en) | Prognosis model of endometrial cancer related to lipid metabolism and construction method | |
Ryser et al. | Growth Dynamics of Ductal Carcinoma in Situ Recapitulate Normal Breast Development | |
CN116312814B (en) | Construction method, equipment, device and kit of lung adenocarcinoma molecular typing model | |
LU502513B1 (en) | Breast cancer prognosis evaluation method and system based on autophagy-related incrna model | |
CN115982644B (en) | Esophageal squamous cell carcinoma classification model construction and data processing method | |
KR102683687B1 (en) | Method for differentiating true somatic mutations from artifacts in dna sequencing data generated from formalin fixed paraffin embedded tissue sample using deep learning and device using the same | |
Zhu et al. | Novel DNA methylation biomarkers in enhancer regions with chromatin interactions for diagnosis of non‐small‐cell lung cancer | |
Kim et al. | A metastasis prediction model in non-small cell lung cancer using GLCM_contrast and epithelial mesenchymal transition related genes | |
Pareja et al. | Genomic Applications in Breast Carcinoma | |
Benítez et al. | Targeted Next Generation Sequencing of a Custom Capture Panel to Target Sequence 112 Cancer Related Genes in Breast Cancer Tumors ERBB2 Positive from Lleida (Spain) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | Address after: 314001 building 5-1, 3556 linggongtang Road, Daqiao Town, Nanhu District, Jiaxing City, Zhejiang Province. Patentee after: Zhejiang Yunying Medical Technology Co.,Ltd. Address before: Room 201, building 18, Caohejing emerging technology development zone, No. 518, Xinzhuan Road, Songjiang District, Shanghai 201600. Patentee before: SHANGHAI YUNYING MEDICAL TECHNOLOGY Co.,Ltd. |