CN117079716A - Deep learning prediction method of tumor drug administration scheme based on gene detection
- Publication number
- CN117079716A (application number CN202311177095.XA)
- Authority
- CN
- China
- Prior art keywords
- training
- model
- mutation
- training model
- medication
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B15/30—Drug targeting using structural data; Docking or binding prediction
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/50—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The application discloses a deep learning method for predicting tumor medication regimens based on gene detection. It relates to the technical fields of deep learning and organ-on-a-chip technology, and addresses the technical problem that the predictions of deep learning models are insufficiently interpretable in clinical decision-making and biomedical research.
Description
Technical Field
The application relates to the technical fields of deep learning and organ-on-a-chip technology, and in particular to a deep learning method for predicting tumor medication regimens based on gene detection.
Background
Malignant tumors have become a serious threat to human life and health. By 2021, deaths from malignant tumors accounted for more than 20% of all deaths in China, and the proportion is still rising. Tumors place a heavy burden not only on individual patients but also on the medical system and social resources, and their prevention and treatment have become a social problem.
In clinical tumor control, traditional surgery, radiotherapy and chemotherapy can control tumor growth to some extent, but their effectiveness is unstable in some cases, they are often accompanied by serious side effects, and patient compliance is low. As targeted drugs have gradually entered the clinic, a new paradigm for formulating tumor treatment plans has emerged: objective evidence of the patient's genetic background and tumor phenotype is combined with the experience-based judgment of a professional clinical team to design a precise, personalized treatment regimen for each patient. However, the diversity of gene regulation and the complexity of biological signaling pathways make it impossible to give a reliable prognosis from a single or isolated target. In addition, the differing effects of random mutations are hard to classify with limited prior knowledge, and subtypes that are hard to classify reduce the value of targeted drugs. Effectively integrating the overall trend of the mutation information, and combining drug prediction with wet-experiment results, is one direction that can improve the predictive power of gene detection. When gene detection and genotype-based prediction are tightly combined with functional verification by in vitro experiments, the two support and complement each other, so that the actual effectiveness of a medication in a patient can be evaluated more accurately.
The core technology of precision medicine is genomic testing that identifies the genetic mutations in a patient's tumor sample. When these mutations are matched to drugs developed against them, they provide strong supporting evidence for the effect of a targeted drug. With advances in molecular biology and genomics research and detection technology, the cost of characterizing a patient's genetic background has fallen sharply, and the concept of precision medicine has received growing attention. Together, these factors have made gene detection one of the important information sources for clinical treatment decisions.
Research on tumor drug response in in vitro models that stand in for the patient has also been under way for many years. Before organoid and organ-on-a-chip technology emerged, in vitro tumor models relied on traditional cell culture of the patient's tumor tissue (Patient Derived Cell lines, PDC), xenograft-based animal models (Patient Derived Xenografts, PDX), and the like.
Traditional two-dimensional cell culture extracts human primary cells and cultures them in vitro. Because it allows intervention on and observation of human cells in vitro, it is important in drug development. However, two-dimensional cell culture cannot accurately reflect the microenvironment and heterogeneity of real tissue structure in vivo. In tumor research and applications in particular, tumor heterogeneity and the missing microenvironment mean that the phenotype of two-dimensional cells agrees poorly with the in vivo phenotype, so clinical usability is extremely low.
A patient-derived xenograft (Patient Derived Xenografts, PDX) animal model is an animal xenograft tumor model formed by transplanting tissue blocks or primary cell samples from a tumor patient into immunodeficient animals. Compared with two-dimensional cell culture, the xenograft animal model largely preserves the heterogeneity and microenvironment of tumor tissue, effectively reproduces outside the patient the phenotype and function of in vivo tumor tissue, and is widely accepted. However, it is difficult to construct: a PDX system based on immunodeficient mice has a culture period of at least two months, and a complete drug-sensitivity testing scheme costs about 200,000 yuan on average. The long cycle and high cost make PDX poorly accessible, so it cannot be applied effectively in the clinic.
Organoids constructed by sampling and culturing a patient's tumor tissue, and the organ-on-a-chip systems built on them, effectively overcome these problems. Organoid technology is an in vitro culture model optimized in a targeted way from traditional two-dimensional cell culture. Using media such as Matrigel, human cells are arranged into an ordered three-dimensional structure in vitro, so as to mimic as closely as possible the tissue structure of the corresponding organ in the human body and reproduce part of that organ's in vivo physiological characteristics. Organ-on-a-chip technology is an in vitro experimental platform that combines biology, materials science and engineering: microfluidic technology integrates one or more cell types, organoid tissues, microenvironments and so on into a single system arranged on a microchip. The chip and microfluidic device can accurately simulate the biomechanical and chemical environment. This not only yields data closer to in vivo experiments but also effectively reduces the need for animal experiments, and it has become an important direction for in vitro research in biomedicine.
Organoids derived from tumor tissue show both phenotypic and functional consistency with the tissue of origin, and organoid culture cycles and costs are significantly lower than those of xenograft animal models. Building on organoids and combining microfluidic technology, an organ-on-a-chip platform that connects several organs in series can better simulate the whole process of drug absorption, distribution, metabolism, excretion and toxicity in the human body, and reflect the effect of a drug in the body more comprehensively. Using organoids as a stand-in for tumor patients to evaluate and predict medication regimens in vitro has been accepted in clinical and scientific circles, and a preliminary consensus has formed.
Traditional drug-sensitivity screening, whether combined with two-dimensional cell culture or three-dimensional organoid culture, has the drawback of a long experimental cycle: primary sample extraction, primary organoid culture, and endpoint detection at the end of the dosing period must all be completed. The experimental result is then used as the input of a prediction model to decide on a medication regimen. The actual delivery time for drug-sensitivity screening is often about one calendar month, and at least two weeks. From the patient's perspective, every second counts in deciding on a treatment regimen: the earlier the regimen takes effect, the higher the final clinical survival rate. Gene detection services in the diagnostic field have reached considerable scale, their data delivery is becoming standardized, and their delivery time is markedly shorter than the experimental cycle. A prediction model that encodes the wet-experiment data into the model during training, and requires only gene detection data at prediction time, is therefore more timely, which markedly improves its application value.
Gene detection data and organ-on-a-chip experimental data are both high-throughput: the feature dimension is large and the data volume is large. Using these data reasonably and effectively, and extracting the clinically relevant core features, is a difficulty both for data processing and for putting the technology into practice. Thanks to the rapid development of computer hardware and of deep learning research, high-dimensional feature data can be analyzed and explored effectively with deep neural networks. However, training a deep learning model is data driven, and because feature extraction and the iterative training process are automated, the model exists as a black box: even when it predicts well, the basis of its predictions is hard to explain. The interpretability of deep learning predictions in clinical decision-making and biomedical research needs to be improved.
Disclosure of Invention
The application provides a deep learning prediction method of a tumor drug administration scheme based on gene detection, and the technical purpose of the method is to improve the interpretation of the prediction effect of a deep learning model in the fields of clinical decision and biological medicine.
The technical aim of the application is realized by the following technical scheme:
A method for deep learning prediction of a tumor dosing regimen based on gene detection, comprising:
S1: constructing the model structure framework of a pre-training model, wherein the pre-training model comprises a first coding module, a second coding module, a genetic information decoding module, a compound structure decoding module, a fully connected layer and an output layer;
S2: constructing a pre-training mutation drug-sensitivity data set, and training the pre-training model on the pre-training mutation drug-sensitivity data set to obtain a pan-cancer pre-training model;
S3: screening samples of the cancer type targeted by the target prediction model from an organoid sample library, and resuscitating the corresponding organoids;
S4: designing multiple medication schemes according to the cancer-type samples;
S5: carrying out wet experiments with the different medication schemes on the organ-chip system matched to the selected organoids to obtain a wet experiment data set;
S6: performing transfer learning on the pan-cancer pre-training model with the wet experiment data set until a target prediction model applicable to the target domain is obtained;
S7: predicting the rationality of a tumor medication scheme with the target prediction model.
Further, the step S2 includes:
S21: inputting the pre-training mutation drug-sensitivity data set into the first coding module and the second coding module to be coded respectively, obtaining mutation information codes and compound structure codes;
S22: inputting the mutation information code to the genetic information decoding module for decoding and output, and inputting the compound structure code to the compound structure decoding module for decoding and output;
S23: the fully connected layer concatenates the output features of the genetic information decoding module and the compound structure decoding module, and the result is then output through the output layer;
S24: repeating steps S21 to S23 until training of the pre-training model is completed, obtaining the pan-cancer pre-training model;
wherein the first coding module and the second coding module each use a cross-attention mechanism based on the Transformer architecture; the cross-attention mechanism correlates the hidden-layer features of the mutation information code with those of the compound structure code, weighting the compound structure code by the mutation information code while weighting the mutation information code by the compound structure code.
Further, the first coding module is an inverted KO module, and the inverted KO module maps the mutation profile of each sample in the pre-training mutation drug-sensitivity data set into a mutation information code by one-hot encoding; the second coding module is a Morgan fingerprint coding module, and the Morgan fingerprint coding module encodes every intervention compound involved in the pre-training mutation drug-sensitivity data set according to its compound structure to obtain compound structure codes.
Further, in step S4, at least 12 medication schemes are used, each medication scheme producing a concentration gradient of at least 5 different dose levels; where no drug-specific gradient is specified, the concentration gradient is standardized as [0, 0.016, 0.08, 0.4, 2, 10.0] micromolar.
Further, step S5 includes:
S51: according to the number of medication schemes, samples that have been resuscitated for 5-14 days and have passed organoid activity and count quality control are plated into the organoid chambers of organ-chip systems, the number of organ-chip systems being the number of medication schemes plus 1;
S52: each single organ chip is dosed according to one medication scheme and its concentration gradient from the medication scheme design, one organ-chip system to which no compound is added is used as the control group, and the organ-chip systems are cultured for 7 days;
S53: collection of the sample genetic information is completed by combining the sample library records with detection during organ-chip system culture;
S54: on the 7th day after dosing, activity detection is performed on the organoids on the organ-chip systems, and the effectiveness of each medication scheme is judged from the cell activity data to obtain the medication scheme effectiveness ranking;
S55: the sample genetic information and the medication scheme effectiveness ranking together form the wet experiment data set.
Further, in step S2, the pre-training mutation drug-sensitivity data set is divided into a training set, a test set and a verification set in the proportion [0.7, 0.2, 0.1]; the training set is used to train the pre-training model to obtain a first pre-training model; the test set is used to evaluate the performance of the first pre-training model, and the hyperparameters of the first pre-training model are adjusted iteratively according to that performance to obtain a second pre-training model; the verification set is used to verify the performance level of the second pre-training model, and if the verification result reaches the preset standard, the second pre-training model is the pan-cancer pre-training model; otherwise training of the second pre-training model continues until the pan-cancer pre-training model is obtained.
Further, in step S6, the wet experiment data set is divided into a training set and a verification set with a training proportion not lower than [0.9, 0.1]; the training set is used to perform transfer learning on the pan-cancer pre-training model to obtain a first target prediction model; the verification set is used to verify the performance level of the first target prediction model, and if the verification result reaches the preset standard, the first target prediction model is the target prediction model; otherwise training of the first target prediction model continues until the target prediction model is obtained.
Further, the validation set validates the performance level of the first target prediction model, including:
S61: predicting the effectiveness of different medication schemes for the tumor with the first target prediction model, and ranking the medication schemes from high to low effectiveness to obtain the predicted effectiveness ranking;
S62: obtaining the experimental effectiveness ranking of the different medication schemes from the organ-chip system experiments;
S63: calculating the consistency between the predicted effectiveness ranking and the experimental effectiveness ranking, expressed as:
$$\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^{2}}{n\,(n^{2}-1)}$$
wherein ρ (rho) represents the consistency coefficient; d_i represents the difference between the i-th rank in the predicted effectiveness ranking and the i-th rank in the experimental effectiveness ranking; and n represents the total number of data items;
S64: when the consistency coefficient ρ (rho) is greater than 0, performing a statistical t-test on ρ to obtain a test result (p-value) and judging whether the test result is smaller than the significance threshold; if so, the predicted effectiveness ranking is considered consistent with the experimental effectiveness ranking.
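Steps S61 to S64 can be illustrated with a minimal sketch. It assumes the consistency coefficient is Spearman's rank correlation (which matches the stated definitions of d_i and n) and that ranks contain no ties; the function names are hypothetical. The standard library has no t-distribution, so the sketch stops at the t statistic, which would then be compared against a critical value or converted to a p-value with a statistics package.

```python
import math

def to_ranks(scores, descending=True):
    """Convert scores to 1-based ranks (assumes no tied scores)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=descending)
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

def spearman_rho(pred_ranks, exp_ranks):
    """rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), assuming no ties."""
    n = len(pred_ranks)
    if n != len(exp_ranks) or n < 3:
        raise ValueError("need two rankings of equal length >= 3")
    d2 = sum((p - e) ** 2 for p, e in zip(pred_ranks, exp_ranks))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def t_statistic(rho, n):
    """t = rho * sqrt((n - 2) / (1 - rho^2)), with n - 2 degrees of freedom."""
    if abs(rho) >= 1.0:
        return math.copysign(math.inf, rho)
    return rho * math.sqrt((n - 2) / (1.0 - rho * rho))

# Toy rankings of five medication schemes (illustrative values only).
predicted = [1, 2, 3, 4, 5]
experimental = [2, 1, 3, 4, 5]
rho = spearman_rho(predicted, experimental)
t = t_statistic(rho, len(predicted))
```

Only a positive rho proceeds to the significance test, as S64 requires; a negative rho already indicates that the two rankings disagree.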
The application has the following beneficial effects: the wet experiment data obtained through the automated high-throughput organ-chip experimental platform represent real-world wet-experiment results; compared with traditional two-dimensional cell experiments they are closer to the phenotype and function of organs in the human body, and compared with animal models they avoid the inconsistency caused by species differences. Meanwhile, the deep learning model takes wet experiment data as input, which effectively improves the biological relevance of the model.
In addition, the network is constructed by inverting a map of genes and their functional pathways that was manually summarized and curated from prior biological research, so that the flow of features through the trained model agrees more closely with biological meaning, and interpretability at the biological level can be achieved by analyzing the weights and feature importance of the neural network framework.
Furthermore, a large number of prior two-dimensional cell experimental results are combined to construct the pan-cancer pre-training model, while small-scale experiments with organoids and organ chips are performed to construct a fine-tuning data set. Starting from the pre-training model built on two-dimensional cell culture, the three-dimensional organoid and organ-chip data are used for domain-transfer fine-tuning. This preserves the data volume and the feature-capturing ability of the pre-training model, and allows the model to be applied, at reasonable cost, to three-dimensional organoid structures whose data are closer to the in-vivo organ level. The technical route of the application balances model coverage, accuracy, consistency between the in-vitro platform and real in-vivo results, and cost, and can effectively provide evaluation and prediction of in-vivo anti-tumor effect.
In model inference, the input data comprise only mutation detection data and no wet-experiment step, so the data production cycle is shorter and the turnaround of model prediction is shorter; the method suits application scenarios that need inference results quickly and is more readily usable. Moreover, the application is not limited to mutations of individual targets: it integrates information across biological functional pathways, so that broader gene mutation information, together with gene interaction relationships and a wider view at the functional-pathway level, supports more accurate judgments.
In conclusion, the application combines gene sequencing technology with the close association between existing clinical drugs and their targets, relies on the large volume of data and the experimental verification provided by organoid and organ-chip wet-experiment technology, and processes the data with deep learning. By linking these fields, it can effectively develop an ex-vivo patient surrogate model and provide an evaluation, grounded in wet experiments in an in-vitro environment, of how effective a medication scheme is.
Drawings
FIG. 1 is a full flow chart of a method for deep learning prediction of a tumor dosing regimen based on gene detection according to the present application;
FIG. 2 is a frame diagram of an inverted KO module;
FIG. 3 is a schematic diagram of a pre-training model according to an embodiment of the present application;
FIG. 4 is a graph comparing the predicted and experimental real tumor suppression abilities of the TP_Trial_01_0002 sample model.
Detailed Description
The technical scheme of the application will be described in detail with reference to the accompanying drawings.
As shown in fig. 1, the method for deep learning prediction of tumor medication scheme based on gene detection according to the present application comprises:
S1: constructing the model structure framework of a pre-training model; as shown in fig. 3, the pre-training model comprises a first coding module, a second coding module, a genetic information decoding module, a compound structure decoding module, a fully connected layer and an output layer.
S2: constructing a pre-training mutation drug-sensitivity data set, and training the pre-training model on the pre-training mutation drug-sensitivity data set to obtain the pan-cancer pre-training model.
As a specific example, GDSC (Genomics of Drug Sensitivity in Cancer) is an open-source data set comprising 1939 cancer cell lines from different tissue and organ sources combined with 621 different compound interventions, for a total of about 570,000 independent two-dimensional cell culture drug-sensitivity experimental records. The open-source two-dimensional cell drug-sensitivity data in this data set are used as the training data source of the pre-training model. Specifically, all mutation data detected by the technical platforms in the data set are collected, and the 1048575 mutation detections from 307 cell line samples are classified and stored per cell line.
Taking the combination of a single sample and a single compound as the granularity of the data set, the pre-training mutation drug-sensitivity data set is divided into a training set, a test set and a verification set in the proportion [0.7, 0.2, 0.1]. The training set is used to train the pre-training model to obtain a first pre-training model; the test set is used to evaluate the performance of the first pre-training model, and the hyperparameters of the first pre-training model are adjusted iteratively according to that performance to obtain a second pre-training model; the verification set is used to verify the performance level of the second pre-training model, and if the verification result reaches the preset standard, the second pre-training model is the pan-cancer pre-training model; otherwise training of the second pre-training model continues until the pan-cancer pre-training model is obtained.
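The [0.7, 0.2, 0.1] split at the granularity of (sample, compound) pairs can be sketched as follows. This is a minimal illustration that assumes a simple random shuffle; the function name, the seed, and the toy records are hypothetical, and the source does not state how records are assigned to each subset.

```python
import random

def split_dataset(records, fractions=(0.7, 0.2, 0.1), seed=0):
    """Shuffle records and split them into (training, test, verification) sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n = len(shuffled)
    n_train = round(n * fractions[0])
    n_test = round(n * fractions[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    verification = shuffled[n_train + n_test:]  # remainder
    return train, test, verification

# Each record is one (sample, compound) pair, the granularity described above.
pairs = [(f"sample_{s}", f"compound_{c}") for s in range(10) for c in range(10)]
train, test, verification = split_dataset(pairs)
```

Splitting by pair means the same cell line can appear in more than one subset; splitting by sample instead would be a stricter test of generalization to unseen samples.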
Specifically, step S2 includes:
S21: inputting the pre-training mutation drug-sensitivity data set into the first coding module and the second coding module respectively for coding, obtaining mutation information codes and compound structure codes respectively.
Specifically, the first coding module is an inverted KO module, and the inverted KO module maps the mutation profile of each sample in the pre-training mutation drug-sensitivity data set into a mutation information code by one-hot encoding.
As a specific embodiment, the input length 17047 of the constructed neural network framework is used as the template and 0 is used as the initial value indicating no mutation; all samples in the pre-training mutation drug-sensitivity data set are traversed, and if a mutation is detected in a sample's gene at the corresponding position, that position is set to 1. This finally yields 307 binary mutation information coding vectors of length 17047, one per sample.
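The binary encoding described above can be sketched in a few lines. The gene names, sample names, and the four-position template below are toy assumptions for illustration; in the embodiment the template has 17047 positions and there are 307 samples.

```python
def encode_mutations(sample_mutations, gene_index):
    """Map each sample's mutated genes onto a fixed-length 0/1 vector.

    sample_mutations: dict of sample ID -> iterable of mutated gene IDs
    gene_index: dict of gene ID -> position in the input template
    """
    length = len(gene_index)
    encoded = {}
    for sample_id, genes in sample_mutations.items():
        vector = [0] * length          # 0 = no mutation detected
        for gene in genes:
            pos = gene_index.get(gene)
            if pos is not None:        # genes outside the template are ignored
                vector[pos] = 1
        encoded[sample_id] = vector
    return encoded

# Hypothetical template and mutation calls.
gene_index = {"EGFR": 0, "KRAS": 1, "TP53": 2, "ALK": 3}
mutations = {
    "cell_line_A": ["EGFR", "TP53"],
    "cell_line_B": ["KRAS", "BRAF"],   # BRAF is not in the toy template
}
vectors = encode_mutations(mutations, gene_index)
```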
The KO (KEGG Orthology) module is a manually curated classification of genes and proteins by functional pathway. With reference to the prior knowledge contained in wet-experiment studies, genes are assigned to functional pathways from shallow to deep according to functional homology. Functional pathways exist at different levels, and genes within the same functional pathway tend to be more strongly associated.
The hierarchical classification of the KO module has 4 layers in total. The first layer, BRITE, represents the broadest categories, comprising six top-level categories: metabolism, genetic information processing, environmental information processing, cellular processes, organismal systems, and human diseases. The second and third layers beneath it contain Module information and Orthology Group information respectively; both go progressively deeper and finer within a broad functional category, classifying genes and molecules into finer functional classes according to synergy and interaction. The fourth layer, Orthology, is the most detailed functional level and corresponds to single genes: each Orthology entry represents a gene or molecule generalized by homology together with its function, and because one gene may have several functions, one gene may correspond to one or more Orthology entries.
A corresponding inverted neural network framework is established according to the membership of KO module genes in functional pathways. Specifically, the inverted KO module is a feed-forward network with 4 layers in total, including the input layer, matching the 4 layers of the KO hierarchy. The input layer of the feed-forward network corresponds to the fourth (Orthology) layer of the KO module, and its neurons are built according to the number of all genes in the Orthology layer. The hidden layers of the feed-forward network are then established in turn from the remaining layers of the KO hierarchy, and the final output layer is the BRITE layer of the KO module, that is, it contains 6 neurons for the top-level categories. After the neurons of the 4 layers are established, the connections between neurons of different layers are established according to the membership of KO module genes. The result is a neural network framework that carries the biological meaning of functional classification and follows the functional assignment of genes in the KEGG database, as shown in figure 2.
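One way to realize connections that follow KO membership is a connectivity mask that zeroes every weight between a child node and a parent it does not belong to. The sketch below shows a single such layer; the node names, the unit weights, and the ReLU activation are assumptions for illustration and are not taken from the source.

```python
def build_mask(membership, children, parents):
    """mask[i][j] = 1 if children[i] belongs to parents[j], else 0."""
    parent_pos = {p: j for j, p in enumerate(parents)}
    mask = [[0] * len(parents) for _ in children]
    for i, child in enumerate(children):
        for parent in membership.get(child, ()):
            mask[i][parent_pos[parent]] = 1
    return mask

def masked_forward(inputs, weights, mask):
    """One sparse layer: only connections allowed by the mask contribute."""
    n_out = len(mask[0])
    out = [0.0] * n_out
    for i, x in enumerate(inputs):
        for j in range(n_out):
            out[j] += x * weights[i][j] * mask[i][j]
    return [max(0.0, v) for v in out]  # ReLU, an assumed activation

# Hypothetical Orthology entries and the groups they belong to.
orthologs = ["K_a", "K_b", "K_c"]
groups = ["G1", "G2"]
membership = {"K_a": ["G1"], "K_b": ["G1", "G2"], "K_c": ["G2"]}

mask = build_mask(membership, orthologs, groups)
weights = [[1.0, 1.0] for _ in orthologs]        # placeholder weights
activations = masked_forward([1, 0, 1], weights, mask)
```

Stacking three such layers (Orthology to Orthology Group, to Module, to BRITE) would give the 4-layer structure described above, and because every surviving weight links a named gene or pathway to its parent, the weights can be read back in biological terms.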
The second coding module is a Morgan fingerprint coding module, and the Morgan fingerprint coding module encodes every intervention compound involved in the pre-training mutation drug-sensitivity data set according to its compound structure to obtain compound structure codes. A Morgan fingerprint is an encoding that takes each single atom as a starting point and center, gradually expands the radius around it, and records the atoms enclosed at each radius as substructures. This encoding effectively captures the connectivity and relationships between atoms within the compound as well as information on its local functional groups, and it is a common compound encoding in cheminformatics. If a treatment involves the combined action of two or more compounds, each compound is encoded separately as a Morgan fingerprint, and the fingerprints are then added element by element to form a feature cross, which serves as the composite feature of the combination.
Specifically, the SMILES expression of each compound structure is obtained, the Mol object corresponding to the compound is obtained with the rdkit.Chem.MolFromSmiles() function, and the compound is converted into a Morgan fingerprint with the rdkit.Chem.AllChem.GetMorganFingerprintAsBitVect() function. The fingerprint is finally converted into a binary bit string. In the binary bit string, each position indicates whether a preset functional-group substructure is present in the structure of the compound: if present, the position is 1, otherwise 0. The Morgan fingerprint thus stores the structural information of the compound as a sparse vector.
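The element-wise addition used for drug combinations can be sketched as follows. The 8-bit vectors are toy stand-ins for real Morgan fingerprints, which would come from RDKit and are typically much longer; the function name is hypothetical.

```python
def combine_fingerprints(fingerprints):
    """Element-wise sum of equal-length bit vectors (feature cross for combinations)."""
    if not fingerprints:
        raise ValueError("at least one fingerprint is required")
    length = len(fingerprints[0])
    if any(len(fp) != length for fp in fingerprints):
        raise ValueError("all fingerprints must have the same length")
    return [sum(bits) for bits in zip(*fingerprints)]

# Toy 8-bit fingerprints for two hypothetical compounds.
drug_a = [1, 0, 1, 0, 0, 1, 0, 0]
drug_b = [0, 0, 1, 1, 0, 1, 0, 0]
combined = combine_fingerprints([drug_a, drug_b])
```

A position with value 2 marks a substructure shared by both compounds, so the combined vector is no longer strictly binary; a single-drug scheme passes through unchanged.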
S22: inputting the mutation information code into the genetic information decoding module for decoding and output, and inputting the compound structure code into the compound structure decoding module for decoding and output.
S23: the fully connected layer concatenates the output features of the genetic information decoding module and the compound structure decoding module, and the result is then output through the output layer.
The final output of the output layer is the IC50 regression value recorded in the data set, so that, given a sample with known genetic information and a known compound structure, a regression prediction of the IC50 value of the drug's killing effect is built from the data accumulated in the prior data set.
S24: repeating steps S21 to S23 until training of the pre-training model is completed, obtaining the pan-cancer pre-training model.
The first coding module and the second coding module both use a cross attention mechanism based on a Transformer infrastructure, the cross attention mechanism correlates mutation information codes with hidden layer characteristics of compound structural codes, the compound structural codes are weighted through the mutation information codes, and meanwhile, the mutation information codes are weighted through the compound structural codes, so that data perception interaction across data types in decoding layers constructed respectively is realized.
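One common way to write such bidirectional cross-attention is the scaled dot-product form of the Transformer. The formulation below is a sketch under that assumption and is not stated in the source; H_m and H_c denote the hidden-layer features of the mutation information code and the compound structure code, the W matrices are learned projections, and d_k is the key dimension, all introduced here for illustration.

```latex
\begin{aligned}
\tilde{H}_m &= \operatorname{softmax}\!\left(\frac{(H_m W_Q^{m})\,(H_c W_K^{c})^{\top}}{\sqrt{d_k}}\right) H_c W_V^{c},\\[4pt]
\tilde{H}_c &= \operatorname{softmax}\!\left(\frac{(H_c W_Q^{c})\,(H_m W_K^{m})^{\top}}{\sqrt{d_k}}\right) H_m W_V^{m}.
\end{aligned}
```

In the first line the mutation features act as queries over the compound features, and in the second line the roles are swapped, so each branch is reweighted by the other.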
During training of the pre-training model, MSE is used as the loss between the model's regression prediction and the true value, Adam is used as the optimizer, and hyperparameters such as learning rate lr = 1E-05, batch size = 16 and number of epochs = 300 are set. The performance of the model is evaluated on the test set, and the hyperparameters are adjusted iteratively according to that performance. After model training is completed, the verification set is used to obtain the performance level of the model on unseen data. The performance of the model before and after training is shown in table 1; the model converges during training and finally shows sufficient generalization on the verification data set.
TABLE 1
S3: screening samples of the cancer type targeted by the target prediction model from the organoid sample library, and resuscitating the corresponding organoids.
Because the pan-cancer pre-training model is built on a two-dimensional cell culture database, its predicted values cannot fully reflect how a compound actually acts at the organ level in vivo. Therefore, organoid-level experiments combining samples and compounds are designed on organ chips, and the pan-cancer pre-training model is fine-tuned on this small-scale data to transfer the model to the target domain, so that it captures the real effect of a compound at the level of three-dimensional organ structure.
Specifically, according to the cancer type of the current target prediction model and the samples available in the organoid tissue sample library, the tissue and organ type for the experiments is selected. The solid-tumor cancer types include breast cancer, cervical cancer, colorectal cancer, esophageal cancer, liver cancer, lung cancer, pancreatic cancer, prostate cancer and gastric cancer, and any one of them may be selected for screening samples in the sample library. The screened cryopreserved organoids are resuscitated with a culture medium dedicated to human tumor organoids; samples with good activity, and samples with complete prior mutation information in the tissue sample library, can be included in the subsequent organ-chip experiments for that cancer type. The experimental sample cohort for a single target cancer type should contain at least 30 samples.
S4: multiple medication schemes are designed according to cancer samples.
Specifically, for the selected cancer type and with reference to the opinion of a clinical expert team, the medication schemes commonly recommended in current guidelines and used in clinical development are collected and organized. They include targeted drugs and chemotherapeutic drugs, drugs already on the market and, in special cases, drugs still in clinical trials, and they cover different dimensions such as single-drug and multi-drug treatment. In the end, at least 12 clinically derived medication schemes are obtained for the target cancer type, each medication scheme producing a concentration gradient of at least 5 different dose levels; where no drug-specific gradient is specified, the concentration gradient is [0, 0.016, 0.08, 0.4, 2, 10.0] micromolar. This produces a training data set with sufficient volume in the compound dimension and ensures that the model can be trained adequately.
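The default gradient above is a 5-fold serial dilution from 10 micromolar plus a zero-dose point, so the dosing design can be generated rather than typed by hand. The sketch below is illustrative; the function names and the placeholder regimen names are hypothetical.

```python
def serial_dilution(top=10.0, factor=5.0, levels=5, include_zero=True):
    """Return ascending dose levels in micromolar, optionally with a 0 dose."""
    doses = sorted(round(top / factor ** k, 6) for k in range(levels))
    return ([0.0] if include_zero else []) + doses

def design_grid(regimens, doses):
    """One experimental condition per (regimen, dose) combination."""
    return [(regimen, dose) for regimen in regimens for dose in doses]

doses = serial_dilution()
regimens = [f"regimen_{i + 1}" for i in range(12)]   # placeholder names
grid = design_grid(regimens, doses)
```

With the minimum of 12 medication schemes and the six default dose points, the grid holds 72 conditions per sample.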
S5: carrying out wet experiments with the different medication schemes on the organ-chip system matched to the selected organoids to obtain a wet experiment data set.
Specifically, samples that grow stably and show good activity after cryopreservation and resuscitation are screened from the organoid tissue sample library, and the organ-chip platform and consumables are prepared for the selected target cancer type. With reference to the number of schemes in the medication scheme design, samples that have been resuscitated for 5-14 days and have passed organoid activity quality control are plated into the organoid chambers of organ-chip platforms, the number of platforms being the number of medication schemes plus 1. The additional organ-chip platform, to which no compound is added, serves as the control group, and each of the other organ chips is dosed according to one medication scheme and its concentration gradient from the medication scheme design. Depending on the tissue type and the technology of the organ-chip platform, microfluidic simulation of the microenvironment on the organ chip is combined with gravity-driven drug-sensitivity organ-chip culture; the dosing date is recorded as Day 0, and the organ chips are cultured for 7 days.
According to how complete the genetic data of the selected samples are in the organoid tissue sample library, the sample genetic information is collected by combining the sample library records with detection during organ-chip culture. Specifically, if the organoid tissue sample library already holds complete mutation information for a sample, that information is recorded in the corresponding experimental database; if the tissue sample library holds no genetic information, then after day 5 of the organoid resuscitation cycle an organoid sample on the order of 5 × 10^6 or above is washed out of the organoid chamber, its mutation information is detected, analyzed and organized by whole-exome sequencing, and the result is recorded in the experimental database per sample.
On Day 7 after dosing, activity detection is performed on the organoids on the organ-chip platform. The cell activity data are used to judge how the corresponding medication scheme intervenes in and affects the activity of the tumor organoids, and thereby to judge the effectiveness of the medication scheme. Specifically, with a cell activity assay such as CCK-8 or ATP, the optical density of each sample is measured at a specific wavelength with a microplate reader. For each medication scheme, a regression model is fitted to the optical density values at the different concentrations to obtain a best-fit curve; the drug concentration at which tumor organoid activity is exactly 50% of the maximum value is calculated from the fitted curve and recorded as the EC50 of that medication scheme. After all medication schemes have been processed, they are ranked by EC50 according to their effectiveness in inhibiting the tumor organoids, giving the medication scheme effectiveness ranking.
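The EC50 and ranking step can be sketched as follows. Note that this sketch swaps the fitted regression curve for piecewise log-linear interpolation between the two measured concentrations that bracket the 50% level, which keeps the example dependency-free; the function names and readout values are hypothetical.

```python
import math

def ec50_by_interpolation(concentrations, signals):
    """Estimate the concentration at which the signal falls to 50% of its maximum.

    Interpolates linearly in log10(concentration) between bracketing points.
    Returns None if the signal never crosses the 50% level.
    """
    half = 0.5 * max(signals)
    points = sorted(zip(concentrations, signals))
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if c_lo <= 0:
            continue  # log of a zero dose is undefined; skip that interval
        if s_lo >= half >= s_hi and s_lo != s_hi:
            frac = (s_lo - half) / (s_lo - s_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

def rank_regimens(ec50_by_regimen):
    """Lower EC50 means more effective; regimens with no EC50 are placed last."""
    return sorted(
        ec50_by_regimen,
        key=lambda r: (ec50_by_regimen[r] is None, ec50_by_regimen[r] or 0.0),
    )

# Hypothetical activity readouts for one regimen (arbitrary units).
concs = [0.016, 0.08, 0.4, 2.0, 10.0]
readout = [100.0, 90.0, 60.0, 40.0, 10.0]
ec50 = ec50_by_interpolation(concs, readout)
order = rank_regimens({"A": ec50, "B": None, "C": 0.1})
```

A regimen whose readout never drops to half of its maximum within the tested range has no resolvable EC50, so it is ranked as least effective instead of being assigned an extrapolated value.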
In summary, the sample genetic information and the order of effectiveness of the regimen together form a wet test dataset.
S6: performing transfer learning on the pan-cancer pre-training model with the wet experiment data set until a target prediction model suitable for the target domain is obtained.
Specifically, the pan-cancer pre-training model has been trained on the prior data set; its parameters are kept as the initial parameters, the organ-chip wet experiment data set is used as the fine-tuning data set, and the pan-cancer pre-training model is trained further. The inputs of the model are the coding vector of the sample mutation detection information from the experimental database and the codes of the compound structures under the different medication schemes; for medication schemes that combine several drugs, the feature-crossing treatment described above is used. The output of the model is the EC50 concentration value that evaluates how strongly the medication scheme inhibits tumor organoid activity.
The wet experiment data set is divided into a training set and a verification set with a training proportion not lower than [0.9, 0.1]; the training set is used to perform transfer learning on the pan-cancer pre-training model to obtain a first target prediction model; the verification set is used to verify the performance level of the first target prediction model, and if the verification result reaches the preset standard, the first target prediction model is the target prediction model; otherwise training of the first target prediction model continues until the target prediction model is obtained. During training on the training set, MSE is used as the loss between the model's regression prediction and the true value, Adam is used as the optimizer, and hyperparameters such as learning rate, batch size and number of epochs are set. The performance of the model is evaluated on the verification set, and the hyperparameters are adjusted iteratively according to that performance. Finally, inference is run with both the original parameters of the pan-cancer pre-training model and the fine-tuned parameters to compare model performance before and after fine-tuning. After model training is finished, the target prediction model is obtained.
In this embodiment, non-small cell lung cancer is selected as the target cancer type. 47 cryopreserved non-small cell lung cancer samples with complete mutation information are screened from the organoid tissue sample library. A human tumor organoid culture medium is used, which contains growth factors, nutrients and small-molecule inhibitors specially optimized for lung cancer cells, together with matrix gel or Matrigel for organoid cluster development and other components. The cryopreserved samples are resuscitated and subcultured; 34 samples are successfully resuscitated, pass activity quality control, and enter the organoid and organ-chip data set cohort.
Further, medication schemes consistent with clinical practice are selected. In this example, with reference to the recommendations of a clinical oncology expert group, the following 20 single-agent schemes, comprising targeted drugs, chemotherapeutic drugs and controls, were finally confirmed:
gefitinib, dacomitinib, osimertinib, alectinib, brigatinib, crizotinib, entrectinib, larotrectinib, ensartinib, savolitinib, selpercatinib, vandetanib, apatinib, afatinib, erlotinib, paclitaxel, gemcitabine and cisplatin, with two DMSO replicates used as controls. The dosing concentration gradient of each compound was set to 0 μmol/L, 0.016 μmol/L, 0.08 μmol/L, 0.4 μmol/L, 2.0 μmol/L and 10.0 μmol/L.
On day 7 of resuscitation of the group-in samples, organoid activity control was performed, confirming that 34 organoid samples in the queue were passed by the quality control, 20 compounds per sample, and 680 total experimental designs were performed, and wet experiments were performed using lung organoid chips. Specifically, separating organoid precipitate under the actions of centrifugal separation, digestive juice action, cell collection and the like by using a special digestive juice corresponding to Matrigel or Matrigel; after the organoid sediment is counted and quality controlled by a cell counter, matrigel or Matrigel is added again for blowing and beating uniformly, and finally the organoid sediment is plated in an organoid cavity on an organ chip, and attention is paid to ensuring that the organoid sediment is digested uniformly as much as possible but a small part of the organoid is kept in a clustered structure and uniformity during plating.
Further, on day 4 after plating, the compound at the corresponding concentration is added to the organoid chamber of each organ-chip system according to the experimental design. After dosing, the organoids are cultured for 7 days, with the medium changed once every 2 days, and kept in a sterile environment at 37 °C throughout the culture period.
Day 7 of dosing culture is the experimental end point. Organoid activity is measured with a standard ATP cell-viability assay kit, and the optical density of each organoid sample is read at a specific wavelength on a microplate reader.
Table 2 shows the ATP activity detection data for organ-on-a-chip sample TP 0302.
Compound | 0.016μM | 0.08μM | 0.4μM | 2μM | 10μM |
---|---|---|---|---|---|
Gefitinib | 7365 | 6544 | 8868 | 7559 | 6531 |
Dacomitinib | 7165 | 5959 | 7638 | 6208 | 7264 |
Osimertinib | 6908 | 7498 | 8764 | 7678 | 7936 |
Alectinib | 8057 | 6763 | 7915 | 8485 | 7508 |
Brigatinib | 8494 | 7384 | 6774 | 7209 | 7787 |
Crizotinib | 6880 | 7821 | 9351 | 5618 | 6877 |
Entrectinib | 7838 | 9041 | 9450 | 5556 | 8104 |
Larotrectinib | 9306 | 8899 | 9988 | 8329 | 9738 |
Ensartinib | 9702 | 8913 | 9241 | 9505 | 9732 |
Savolitinib | 7658 | 6481 | 7187 | 6660 | 7731 |
Selpercatinib | 6713 | 6489 | 7509 | 6110 | 7239 |
Vandetanib | 8451 | 9075 | 8333 | 9110 | 7712 |
Apatinib | 7974 | 9720 | 7789 | 8937 | 8855 |
Afatinib | 9472 | 7482 | 8442 | 7398 | 7103 |
Erlotinib | 9256 | 7670 | 9298 | 9811 | 9395 |
Cisplatin | 6263 | 5956 | 8812 | 8659 | 8148 |
Gemcitabine | 7385 | 7636 | 8218 | 7595 | 8053 |
Paclitaxel | 6157 | 7303 | 8584 | 6246 | 6970 |
DMSO/Control | 6885 | 6211 | 5735 | 5907 | 6672 |
TABLE 2
Taking the TP 0302 results in Table 2 as an example, the activity values of each sample under the different compounds are collated. For each medication regimen, a regression model is fitted to the optical density values at the different concentrations to obtain a best-fit curve; the drug concentration at which tumor organoid activity is exactly 50% of its maximum is calculated from the fitted curve and recorded as the EC50 of that regimen. After all regimens have been processed, they are ranked by their EC50 values according to their effectiveness in inhibiting the tumor organoids, giving the regimen effectiveness ranking.
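The application does not specify the regression model used for curve fitting. As an illustration only, the sketch below assumes that readings have first been normalized to the control (so activity lies between 0 and 1), fits a two-parameter Hill curve by a simple least-squares grid search, and ranks regimens by ascending EC50 (a lower EC50 meaning a more potent regimen). The function names, the grid ranges, and the synthetic data are assumptions, not details from the source.

```python
def hill(conc, ec50, slope):
    """Fraction of activity remaining at a concentration (top=1, bottom=0)."""
    return 1.0 / (1.0 + (conc / ec50) ** slope)


def fit_ec50(concs, viability):
    """Least-squares grid search over EC50 (log-spaced) and Hill slope."""
    best = (float("inf"), None, None)
    for i in range(0, 501):
        ec50 = 10 ** (-3 + 5 * i / 500)      # 0.001 to 100 umol/L
        for j in range(1, 41):
            slope = 0.1 * j                   # 0.1 to 4.0
            sse = sum((hill(c, ec50, slope) - v) ** 2
                      for c, v in zip(concs, viability))
            if sse < best[0]:
                best = (sse, ec50, slope)
    return best[1], best[2]


def rank_by_ec50(ec50_by_regimen):
    """Order regimen names from lowest EC50 (most potent) to highest."""
    return sorted(ec50_by_regimen, key=ec50_by_regimen.get)


# Synthetic, noise-free example generated from a known curve (EC50 = 0.5).
concs = [0.016, 0.08, 0.4, 2.0, 10.0]
viability = [hill(c, 0.5, 1.0) for c in concs]
ec50, slope = fit_ec50(concs, viability)
print(round(ec50, 2), round(slope, 1))
```

With real plate data, a dedicated curve-fitting routine and a four-parameter model with free top and bottom plateaus would usually be preferable to this coarse grid.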
In this embodiment of the application, the wet experimental data set is split into a training set and a validation set in the ratio [0.8, 0.2], and the training set is used to fine-tune the pan-cancer pre-training model. The model input consists of the encoding vector derived from each sample's mutation detection information in the experimental database and the encoding of the compound structure for each medication regimen; for multi-drug combination regimens, the feature-crossing approach is used. The target output is the EC50 value measured for the regimen's inhibition of tumor organoid activity. During training, MSE is used as the loss between the model's regression predictions and the true values, Adam is used as the optimizer, and hyperparameters such as the learning rate, batch size, and number of iterations are set. The MSE loss of the model before and after fine-tuning is recorded in Table 3.
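The split and the loss described above can be sketched as follows. This is an illustration under stated assumptions: the split is made at the sample level so that no sample contributes to both sets (the application gives only the ratio), the sample identifiers are hypothetical, and the optimizer and training loop are not shown.

```python
import random


def split_samples(sample_ids, train_fraction=0.8, seed=0):
    """Shuffle sample IDs and split them so no sample appears in both sets."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def mse(predicted, observed):
    """Mean squared error between predicted and measured values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


samples = [f"sample_{i:02d}" for i in range(1, 35)]  # hypothetical IDs
train_ids, val_ids = split_samples(samples)
print(len(train_ids), len(val_ids))  # 27 7
```

For 34 samples, a 0.8/0.2 split at the sample level leaves 7 samples for validation.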
MSE | Pre-training model weights | Weights after organ-chip fine-tuning |
---|---|---|
Training data set | 0.5732 | 0.0718 |
Validation data set | 0.4175 | 0.0953 |
TABLE 3
As Table 3 shows, the pan-cancer pre-training model already has some predictive ability before fine-tuning. This indicates that a pre-training model built on two-dimensional cell experiments predicts organoid-platform results with a certain level of consistency, although the error remains large. After fine-tuning on the organoid data, the target prediction model is clearly adapted to the organoid and organ-chip data domain and predicts the results of experiments on organ-like structures more accurately.
Further, the validation set validates the performance level of the first target prediction model, including:
S61: predicting the effectiveness of different tumor medication regimens with the first target prediction model, and sorting the regimens from high to low effectiveness to obtain the predicted effectiveness ranking;
S62: obtaining the experimental effectiveness ranking of the different medication regimens from the organ-on-a-chip system experiments;
S63: calculating the consistency between the predicted effectiveness ranking and the experimental effectiveness ranking by Spearman's rank correlation method, expressed as:
$$\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^{2}}{n\,(n^{2}-1)}$$
where ρ (rho) is the consistency coefficient, $d_i$ is the difference between the rank of the i-th item in the predicted effectiveness ranking and its rank in the experimental effectiveness ranking, and n is the total number of items ranked;
S64: when the consistency coefficient ρ is greater than 0, performing a statistical t-test on ρ to obtain a P-value and determining whether it is below the significance threshold; if so, the predicted effectiveness ranking is considered consistent with the experimental effectiveness ranking. The significance threshold in this embodiment of the application is 0.05.
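Steps S61 to S64 can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the applicant's implementation: both rankings are given as ordered lists of the same regimen names with no ties, the t statistic is the standard t = ρ·sqrt((n−2)/(1−ρ²)) with n−2 degrees of freedom, and the two-sided P-value is obtained by numerical integration of the Student t density. All function names are hypothetical.

```python
import math


def spearman_rho(predicted_order, experimental_order):
    """Spearman's rho from two orderings of the same items (no ties)."""
    n = len(predicted_order)
    if n < 3 or len(set(predicted_order)) != n \
            or set(predicted_order) != set(experimental_order):
        raise ValueError("orderings must contain the same >= 3 unique items")
    exp_rank = {name: r for r, name in enumerate(experimental_order, start=1)}
    d2 = sum((r - exp_rank[name]) ** 2
             for r, name in enumerate(predicted_order, start=1))
    return 1 - 6 * d2 / (n * (n * n - 1))


def t_test_p_value(rho, n, steps=2000):
    """Two-sided P-value for rho under the t approximation (df = n - 2)."""
    df = n - 2
    if abs(rho) >= 1.0:
        return 0.0
    t = abs(rho) * math.sqrt(df / (1 - rho * rho))
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

    # Simpson's rule on [0, t]; steps must be even.
    h = t / steps
    area = pdf(0.0) + pdf(t)
    for k in range(1, steps):
        area += (4 if k % 2 else 2) * pdf(k * h)
    area *= h / 3
    return max(0.0, 1 - 2 * area)


def rankings_consistent(predicted_order, experimental_order, alpha=0.05):
    """S64: rho > 0 and P-value below the significance threshold."""
    rho = spearman_rho(predicted_order, experimental_order)
    p = t_test_p_value(rho, len(predicted_order))
    return rho > 0 and p < alpha


predicted = ["a", "b", "c", "d", "e"]
experimental = ["b", "a", "c", "d", "e"]
print(round(spearman_rho(predicted, experimental), 3))  # 0.9
```

If tied ranks can occur, the closed-form expression no longer applies exactly and a Pearson correlation on average ranks would be the usual substitute.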
In particular, Spearman's rank correlation does not assume that the data are linearly related; it measures only the relationship between rank orders. It is therefore better suited to judging consistency between the target prediction model's outputs and the experimentally measured values, which may differ in scale and distribution, and it effectively reflects within-sample differences among medication regimens.
The validation set held out during fine-tuning is used to test the performance of the target prediction model by comparing the regimen effectiveness ranking predicted by the model with the ranking actually measured in the wet experiments. Table 4 shows the Spearman's correlation coefficients and the significance of their statistical t-tests for the 7 samples in the validation set:
Sample | Spearman's rho | P-value |
---|---|---|
TP_Validation_01_0892 | 0.41008916 | 0.046557884 |
TP_Validation_02_0084 | 0.593043478 | 0.002256206 |
TP_Validation_03_0138 | 0.368695652 | 0.076249688 |
TP_Validation_04_0756 | 0.474016102 | 0.019281635 |
TP_Validation_05_0100 | 0.31826087 | 0.129608777 |
TP_Validation_06_0262 | 0.505217391 | 0.011795445 |
TP_Validation_07_0619 | 0.406956522 | 0.048424943 |
TABLE 4
In Table 4, every sample in the validation set has Spearman's rho > 0, meaning that the model predictions correlate positively with the experimentally measured values. After the statistical t-test, 5 of the 7 samples fall below the significance threshold of 0.05 and are statistically significant, and the remaining two samples also have relatively small P-values. Further, comparing the predicted and measured regimen effectiveness rankings, the top-ranked regimen is identical for the 6 validation samples other than TP_Validation_05_0100, and the top three regimens are identical for 4 samples. This indicates that the target prediction model is of practical value for identifying the most effective medication regimens.
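The top-1 and top-3 comparison can be expressed as a simple check. The sketch below assumes that "identical" means the same regimens in the same order, which the application does not state explicitly; the sample and regimen names are placeholders, not data from the validation set.

```python
def top_k_agrees(predicted_order, experimental_order, k):
    """True if the first k regimens are identical and in the same order."""
    return list(predicted_order[:k]) == list(experimental_order[:k])


def count_top_k_agreement(rankings_by_sample, k):
    """Count samples whose predicted and measured top-k regimens agree."""
    return sum(top_k_agrees(pred, exp, k)
               for pred, exp in rankings_by_sample.values())


# Placeholder rankings: (predicted order, experimentally measured order).
rankings = {
    "sample_1": (["a", "b", "c", "d"], ["a", "b", "c", "d"]),
    "sample_2": (["a", "c", "b", "d"], ["a", "b", "c", "d"]),
    "sample_3": (["b", "a", "c", "d"], ["a", "b", "c", "d"]),
}
print(count_top_k_agreement(rankings, 1))  # 2
print(count_top_k_agreement(rankings, 3))  # 1
```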
S7: predicting the rationality of tumor medication regimens with the target prediction model.
In the prediction (inference) stage, a frozen sample that has not undergone mutation detection is sequenced to obtain its mutation data; the inference results of the target prediction model are then evaluated against the results of the actual organ-on-chip wet experiment.
Step S7 is the application stage of the target prediction model. For a sample requiring prediction of tumor-killing medication regimens, the tissue is first cleaned and given preliminary quality control, and the sample's DNA is extracted with a DNA extraction kit suited to its tissue type. The DNA is quality-controlled by methods such as DNA concentration measurement, purity measurement, and agarose gel electrophoresis. The DNA is then fragmented to about 300 bp, followed by end repair and adapter ligation, PCR amplification, and the necessary purification steps; sequencing is completed on an instrument of the NGS platform. The raw FASTQ data produced by sequencing are quality-controlled with a tool such as fastp and aligned to the reference genome with a tool such as bwa. Mutations are called from the alignments with a mutation detection tool, and an annotation tool is used to annotate the detected mutations at the gene level with reference to a mutation database such as COSMIC. Finally, the sample's mutation information is collated and summarized, low-frequency mutations are filtered out according to a preset threshold on mutation frequency in the sample, and the resulting queue of mutated genes is recorded.
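The final filtering step can be illustrated as follows. This is a minimal sketch: the record layout (`gene`, `vaf`), the threshold value of 0.05, and the function name are assumptions chosen for illustration, since the application states only that a preset mutation-frequency threshold is used.

```python
def mutated_gene_queue(variants, min_vaf=0.05):
    """Drop low-frequency variants and return the sorted list of mutated genes.

    `variants` is a list of dicts with a gene symbol and a variant allele
    frequency (VAF) between 0 and 1; `min_vaf` is a hypothetical threshold.
    """
    genes = {v["gene"] for v in variants if v["vaf"] >= min_vaf}
    return sorted(genes)


# Hypothetical annotated variant calls for one sample.
variants = [
    {"gene": "EGFR", "vaf": 0.32},
    {"gene": "TP53", "vaf": 0.41},
    {"gene": "KRAS", "vaf": 0.01},   # below threshold, filtered out
    {"gene": "TP53", "vaf": 0.08},   # second hit in the same gene
]
print(mutated_gene_queue(variants))  # ['EGFR', 'TP53']
```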
Specifically, the candidate medication regimens for the sample are collated into a queue of potential regimens. Regimens are selected from the queue in turn; the encoding of the corresponding compound is input into the target prediction model together with the mutation information encoding derived from the sample's mutated gene queue, the model infers the tumor inhibition effect of each regimen, and the regimens are ranked by inhibition effectiveness.
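The inference loop described above amounts to scoring each candidate regimen and sorting. In the sketch below the trained model is replaced by a toy stand-in function, because the real model is not reproduced here; it also assumes that the model outputs a predicted EC50, so that a lower value means stronger inhibition. All names and encodings are hypothetical.

```python
def rank_regimens(mutation_code, regimen_codes, predict_ec50):
    """Score every candidate regimen and sort by predicted EC50 (ascending)."""
    scored = [(name, predict_ec50(mutation_code, code))
              for name, code in regimen_codes.items()]
    return sorted(scored, key=lambda item: item[1])


def toy_predictor(mutation_code, compound_code):
    """Stand-in for the trained model: more overlap gives a lower EC50."""
    overlap = sum(m * c for m, c in zip(mutation_code, compound_code))
    return 10.0 / (1.0 + overlap)


mutation_code = [1, 0, 1]                       # hypothetical sample encoding
regimen_codes = {                               # hypothetical compound encodings
    "drug_a": [1, 0, 1],
    "drug_b": [0, 1, 0],
    "drug_c": [1, 0, 0],
}
ranking = rank_regimens(mutation_code, regimen_codes, toy_predictor)
print([name for name, _ in ranking])  # ['drug_a', 'drug_c', 'drug_b']
```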
In this embodiment of the application, the target prediction model obtained by fine-tuning on the wet experimental data set built for non-small cell lung cancer is used. The patient's mutation information is encoded and input into the model, and the compound queue of the potential medication regimens is input into the model to infer the tumor inhibition effect of each compound. The regimens are ranked by the model output, and recommendations on the medication regimen are derived from that ranking.
Further, to verify the reliability of the prediction for the sample, a result consistency test is performed. The line graph below shows the drug sensitivity results for the sample. Fig. 4 shows the model-predicted and experimentally measured tumor inhibition of each compound for sample tp_three_01_0002. The ordinate represents tumor inhibition, with lower values indicating stronger inhibition; the abscissa represents the different medication regimens. Of the two lines, the dashed line shows the experimentally measured values, with the compounds on the abscissa arranged in descending order of the measured values, and the solid line shows the values predicted by model inference.
Further, the three medication regimens with the strongest inhibitory effect on non-small cell lung cancer are examined; the top three in the experimentally measured ranking and in the ranking predicted by the target prediction model are identical: (1) entrectinib, (2) larotrectinib, (3) paclitaxel.
The foregoing is an exemplary embodiment of the application, the scope of which is defined by the claims and their equivalents.
Claims (8)
1. A deep learning prediction method for tumor medication regimens based on gene detection, characterized by comprising the following steps:
S1: constructing the model framework of a pre-training model, wherein the pre-training model comprises a first encoding module, a second encoding module, a genetic information decoding module, a compound structure decoding module, a fully connected layer, and an output layer;
S2: constructing a pre-training mutation drug-sensitivity data set based on a two-dimensional drug-sensitivity database, and training the pre-training model on the pre-training mutation drug-sensitivity data set to obtain a pan-cancer pre-training model;
S3: screening samples of the cancer type targeted by the target prediction model from an organoid sample library and resuscitating the corresponding organoids;
S4: designing multiple medication regimens according to the cancer-type samples;
S5: performing wet experiments with the different medication regimens on organ-on-chip systems matched to the selected organoids to obtain a wet experimental data set;
S6: performing transfer learning on the pan-cancer pre-training model with the wet experimental data set until a target prediction model applicable to the target domain is obtained;
S7: predicting the rationality of tumor medication regimens with the target prediction model.
2. The deep learning prediction method according to claim 1, wherein the step S2 includes:
S21: inputting the pre-training mutation drug-sensitivity data set into the first encoding module and the second encoding module for encoding, respectively, to obtain mutation information encodings and compound structure encodings;
S22: inputting the mutation information encoding into the genetic information decoding module for decoding and output, and inputting the compound structure encoding into the compound structure decoding module for decoding and output;
S23: the fully connected layer concatenates the features output by the genetic information decoding module and the compound structure decoding module, and the result is output through the output layer;
S24: repeating steps S21 to S23 until training of the pre-training model is complete, obtaining the pan-cancer pre-training model;
wherein the first encoding module and the second encoding module each use a cross-attention mechanism based on the Transformer architecture, which correlates the hidden-layer features of the mutation information encoding and the compound structure encoding, weighting the compound structure encoding by the mutation information encoding while weighting the mutation information encoding by the compound structure encoding.
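The bidirectional weighting described in this claim can be illustrated with a stripped-down, single-head scaled dot-product cross-attention. The sketch below is a simplification under stated assumptions: it omits the learned query, key, and value projections, multiple heads, and residual connections that a full Transformer block would have, and the token vectors are made-up placeholders rather than real encodings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Each query vector becomes a weighted average of the other modality."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[j] for w, kv in zip(weights, keys_values))
                    for j in range(d)])
    return out


def bidirectional_cross_attention(mutation_tokens, compound_tokens):
    """Weight compound features by mutation features, and vice versa."""
    mutation_view = cross_attention(mutation_tokens, compound_tokens)
    compound_view = cross_attention(compound_tokens, mutation_tokens)
    return mutation_view, compound_view


# Placeholder hidden-layer features (2 mutation tokens, 3 compound tokens).
mutation_tokens = [[1.0, 0.0], [0.0, 1.0]]
compound_tokens = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
m2c, c2m = bidirectional_cross_attention(mutation_tokens, compound_tokens)
print(len(m2c), len(c2m))  # 2 3
```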
3. The deep learning prediction method according to claim 2, wherein the first encoding module is an inverse KO module, which maps the mutation profile of each sample in the pre-training mutation drug-sensitivity data set by One-Hot encoding to obtain the mutation information encoding; and the second encoding module is a Morgan fingerprint encoding module, which encodes every intervention compound in the pre-training mutation drug-sensitivity data set according to its compound structure to obtain the compound structure encoding.
4. The deep learning prediction method of claim 1, wherein in step S4 at least 12 medication regimens are used, each medication regimen having a concentration gradient of at least 5 different concentration levels, and for non-specific drugs the concentration gradient is based on [0, 0.016, 0.08, 0.4, 2, 10.0] micromolar.
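The One-Hot mapping of a mutation profile can be sketched as a fixed-length binary vector with one position per gene. The gene panel, the function name, and the choice to ignore genes outside the panel are assumptions made for illustration; Morgan fingerprints are normally computed with a cheminformatics toolkit and are not shown here.

```python
def one_hot_mutations(mutated_genes, gene_panel):
    """Return a 0/1 vector with one position per gene in the panel.

    Genes not present in the panel are ignored (an assumption of this sketch).
    """
    mutated = set(mutated_genes)
    return [1 if gene in mutated else 0 for gene in gene_panel]


# Hypothetical gene panel and mutated-gene queue for one sample.
GENE_PANEL = ["EGFR", "ALK", "KRAS", "TP53", "MET", "RET"]
encoding = one_hot_mutations(["EGFR", "TP53"], GENE_PANEL)
print(encoding)  # [1, 0, 0, 1, 0, 0]
```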
5. The deep learning prediction method of claim 1, wherein step S5 includes:
S51: according to the number of medication regimens, plating samples that have been resuscitated for 5-14 days and have passed organoid activity and counting quality control into the organoid chambers of (number of medication regimens + 1) organ-chip systems;
S52: applying each medication regimen and its concentration gradient from the regimen design to a separate organ chip, using an organ-chip system with no compound added as the control group, and culturing the organ-chip systems for 7 days;
S53: collecting the genetic information of the sample by combining the sample library data with detection during organ-chip system culture;
S54: on day 7 after dosing, measuring the activity of the organoids on the organ-chip systems, and judging the effectiveness of each medication regimen from the cell activity data to obtain the regimen effectiveness ranking;
S55: the sample genetic information and the regimen effectiveness ranking together form the wet experimental data set.
6. The deep learning prediction method of claim 1, wherein in step S2 the pre-training mutation drug-sensitivity data set is divided into a training set, a test set, and a validation set in the ratio [0.7, 0.2, 0.1]; the training set is used to train the pre-training model to obtain a first pre-training model; the test set is used to evaluate the performance of the first pre-training model, and its hyperparameters are adjusted iteratively according to that performance to obtain a second pre-training model; and the validation set is used to verify the performance level of the second pre-training model: if the verification result meets the preset standard, the second pre-training model is the pan-cancer pre-training model; otherwise, training of the second pre-training model continues until the pan-cancer pre-training model is obtained.
7. The deep learning prediction method according to claim 1, wherein in step S6 the wet experimental data set is divided into a training set and a validation set in a ratio not lower than [0.9, 0.1]; the training set is used to perform transfer learning on the pan-cancer pre-training model to obtain a first target prediction model; and the validation set is used to verify the performance level of the first target prediction model: if the verification result meets the preset standard, the first target prediction model is the target prediction model; otherwise, training of the first target prediction model continues until the target prediction model is obtained.
8. The deep learning prediction method of claim 7, wherein the validation set validates a performance level of the first target prediction model, comprising:
S61: predicting the effectiveness of different tumor medication regimens with the first target prediction model, and sorting the regimens from high to low effectiveness to obtain the predicted effectiveness ranking;
S62: obtaining the experimental effectiveness ranking of the different medication regimens from the organ-on-a-chip system experiments;
S63: calculating the consistency between the predicted effectiveness ranking and the experimental effectiveness ranking, expressed as:
$$\rho = 1 - \frac{6\sum_{i=1}^{n} d_i^{2}}{n\,(n^{2}-1)}$$
where ρ (rho) is the consistency coefficient, $d_i$ is the difference between the rank of the i-th item in the predicted effectiveness ranking and its rank in the experimental effectiveness ranking, and n is the total number of items ranked;
S64: when the consistency coefficient ρ is greater than 0, performing a statistical t-test on ρ to obtain a P-value and determining whether it is below the significance threshold; if so, the predicted effectiveness ranking is considered consistent with the experimental effectiveness ranking.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311177095.XA CN117079716B (en) | 2023-09-13 | 2023-09-13 | Deep learning prediction method of tumor drug administration scheme based on gene detection |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117079716A (en) | 2023-11-17 |
CN117079716B (en) | 2024-04-05 |
Family
ID=88715393
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311177095.XA (Active; CN117079716B) | Deep learning prediction method of tumor drug administration scheme based on gene detection | 2023-09-13 | 2023-09-13 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117079716B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118571500A (en) * | 2024-07-31 | 2024-08-30 | 南方医科大学南方医院 | Colorectal cancer chemotherapy response prediction system and storage medium |
Citations (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190164632A1 (en) * | 2017-09-25 | 2019-05-30 | Syntekabio Co., Ltd. | Drug indication and response prediction systems and method using ai deep learning based on convergence of different category data |
CN111223577A (en) * | 2020-01-17 | 2020-06-02 | 江苏大学 | Deep learning-based synergistic anti-tumor multi-drug combination effect prediction method |
US20200365270A1 (en) * | 2019-05-15 | 2020-11-19 | International Business Machines Corporation | Drug efficacy prediction for treatment of genetic disease |
KR20210153540A (en) * | 2020-06-10 | 2021-12-17 | 주식회사 에이조스바이오 | System for phenotype-based anticancer drug screening using artificial intelligence deep learning |
CN114530196A (en) * | 2021-08-31 | 2022-05-24 | 天津工业大学 | Organ chip drug evaluation method based on deep learning prediction |
CN114582429A (en) * | 2022-03-03 | 2022-06-03 | 四川大学 | Method and device for predicting drug resistance of mycobacterium tuberculosis based on hierarchical attention neural network |
CN114882970A (en) * | 2022-06-02 | 2022-08-09 | 西安电子科技大学 | Drug interaction effect prediction method based on pre-training model and molecular graph |
WO2022170909A1 (en) * | 2021-02-09 | 2022-08-18 | 清华大学深圳国际研究生院 | Drug sensitivity prediction method, electronic device and computer-readable storage medium |
CN115116624A (en) * | 2022-06-29 | 2022-09-27 | 广西大学 | Drug sensitivity prediction method and device based on semi-supervised transfer learning |
WO2022222231A1 (en) * | 2021-04-23 | 2022-10-27 | 平安科技(深圳)有限公司 | Drug-target interaction prediction method and apparatus, device, and storage medium |
CN115376706A (en) * | 2022-10-26 | 2022-11-22 | 杭州艾名医学科技有限公司 | Prediction model-based breast cancer drug scheme prediction method and device |
CN115966316A (en) * | 2023-02-10 | 2023-04-14 | 北京大学 | Tumor drug sensitivity prediction method, system, device and storage medium |
CN116110509A (en) * | 2022-11-15 | 2023-05-12 | 浙江大学 | Method and device for predicting drug sensitivity based on histology consistency pretraining |
CN116403731A (en) * | 2023-04-11 | 2023-07-07 | 上海交通大学 | Missense mutation effect prediction method and system for clinical drug effect based on deep learning |
CN116486900A (en) * | 2023-04-25 | 2023-07-25 | 徐州医科大学 | Drug target affinity prediction method based on depth mode data fusion |
CN116597916A (en) * | 2023-06-15 | 2023-08-15 | 江苏运动健康研究院 | Prediction method of antitumor compound prognosis efficacy based on organ chip and deep learning |
US20230268026A1 (en) * | 2022-01-07 | 2023-08-24 | Absci Corporation | Designing biomolecule sequence variants with pre-specified attributes |
CN116646001A (en) * | 2023-06-05 | 2023-08-25 | 兰州大学 | Method for predicting drug target binding based on combined cross-domain attention model |
Non-Patent Citations (2)
Title |
---|
JIAN LI et al.: "OOCDB: A Comprehensive, Systematic, and Real-time Organs-on-a-chip Database", GENOMICS PROTEOMICS BIOINFORMATICS, vol. 21, no. 2, 30 April 2023 (2023-04-30), pages 243 - 258, XP087424283, DOI: 10.1016/j.gpb.2023.01.001 * |
LI Wei; YANG Jincai; HUANG Niu: "Application of deep learning in drug design and discovery", Acta Pharmaceutica Sinica, no. 05, 9 April 2019 (2019-04-09), pages 15 - 21 * |
Also Published As
Publication number | Publication date |
---|---|
CN117079716B (en) | 2024-04-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||