CN110730822A - Method for identifying compounds
- Publication number
- CN110730822A (application number CN201880040438.9A)
- Authority
- CN
- China
- Prior art keywords
- compound
- binding interaction
- binding
- target protein
- findings
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Concepts
- 150000001875 compounds Chemical class 0.000 title claims abstract description 183
- 238000000034 method Methods 0.000 title claims abstract description 102
- 238000009739 binding Methods 0.000 claims abstract description 156
- 230000027455 binding Effects 0.000 claims abstract description 155
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 142
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 142
- 230000003993 interaction Effects 0.000 claims abstract description 109
- 239000000126 substance Substances 0.000 claims description 60
- 125000003729 nucleotide group Chemical group 0.000 claims description 53
- 239000002773 nucleotide Substances 0.000 claims description 48
- 238000002474 experimental method Methods 0.000 claims description 11
- 239000013642 negative control Substances 0.000 claims description 7
- 238000000513 principal component analysis Methods 0.000 claims description 3
- 230000002194 synthesizing effect Effects 0.000 claims description 2
- 238000003041 virtual screening Methods 0.000 abstract description 7
- 239000003814 drug Substances 0.000 abstract description 6
- 229940124597 therapeutic agent Drugs 0.000 abstract description 6
- 238000011161 development Methods 0.000 abstract description 3
- 108091034117 Oligonucleotide Proteins 0.000 description 76
- -1 small molecule compounds Chemical class 0.000 description 63
- 125000006850 spacer group Chemical group 0.000 description 44
- 238000006243 chemical reaction Methods 0.000 description 40
- 150000003384 small molecules Chemical class 0.000 description 28
- 150000005829 chemical entities Chemical class 0.000 description 26
- 125000000524 functional group Chemical group 0.000 description 18
- 238000012549 training Methods 0.000 description 18
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 17
- 230000003321 amplification Effects 0.000 description 17
- 239000011230 binding agent Substances 0.000 description 17
- 238000003199 nucleic acid amplification method Methods 0.000 description 17
- 230000015572 biosynthetic process Effects 0.000 description 16
- 239000012634 fragment Substances 0.000 description 16
- 125000004429 atom Chemical group 0.000 description 14
- ZCCUUQDIBDJBTK-UHFFFAOYSA-N psoralen Chemical compound C1=C2OC(=O)C=CC2=CC2=C1OC=C2 ZCCUUQDIBDJBTK-UHFFFAOYSA-N 0.000 description 13
- KOANYEOXZLKPMN-UHFFFAOYSA-N 3-(9h-carbazol-1-yl)prop-2-enenitrile Chemical group C12=CC=CC=C2NC2=C1C=CC=C2C=CC#N KOANYEOXZLKPMN-UHFFFAOYSA-N 0.000 description 12
- 238000004132 cross linking Methods 0.000 description 12
- 230000000670 limiting effect Effects 0.000 description 12
- 108020004414 DNA Proteins 0.000 description 11
- 238000010801 machine learning Methods 0.000 description 11
- 229910052757 nitrogen Inorganic materials 0.000 description 11
- 239000003550 marker Substances 0.000 description 10
- 239000003960 organic solvent Substances 0.000 description 10
- 238000002198 surface plasmon resonance spectroscopy Methods 0.000 description 10
- 125000001731 2-cyanoethyl group Chemical group [H]C([H])(*)C([H])([H])C#N 0.000 description 9
- YLQBMQCUIZJEEH-UHFFFAOYSA-N Furan Chemical compound C=1C=COC=1 YLQBMQCUIZJEEH-UHFFFAOYSA-N 0.000 description 9
- 238000007792 addition Methods 0.000 description 9
- 230000000295 complement effect Effects 0.000 description 9
- 238000012163 sequencing technique Methods 0.000 description 9
- 238000003786 synthesis reaction Methods 0.000 description 9
- 108010042407 Endonucleases Proteins 0.000 description 8
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 8
- 239000003795 chemical substances by application Substances 0.000 description 8
- 238000005755 formation reaction Methods 0.000 description 8
- 238000003752 polymerase chain reaction Methods 0.000 description 8
- 230000002441 reversible effect Effects 0.000 description 8
- 238000004422 calculation algorithm Methods 0.000 description 7
- 238000013537 high throughput screening Methods 0.000 description 7
- 125000005647 linker group Chemical group 0.000 description 7
- 238000005457 optimization Methods 0.000 description 7
- 108090000765 processed proteins & peptides Proteins 0.000 description 7
- APQXWKHOGQFGTB-UHFFFAOYSA-N 1-ethenyl-9h-carbazole Chemical group C12=CC=CC=C2NC2=C1C=CC=C2C=C APQXWKHOGQFGTB-UHFFFAOYSA-N 0.000 description 6
- LMDZBCPBFSXMTL-UHFFFAOYSA-N 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide Chemical compound CCN=C=NCCCN(C)C LMDZBCPBFSXMTL-UHFFFAOYSA-N 0.000 description 6
- JUJWROOIHBZHMG-UHFFFAOYSA-N Pyridine Chemical compound C1=CC=NC=C1 JUJWROOIHBZHMG-UHFFFAOYSA-N 0.000 description 6
- 230000001588 bifunctional effect Effects 0.000 description 6
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 6
- 230000000694 effects Effects 0.000 description 6
- 238000009830 intercalation Methods 0.000 description 6
- 238000002372 labelling Methods 0.000 description 6
- 238000012986 modification Methods 0.000 description 6
- 230000004048 modification Effects 0.000 description 6
- 125000003396 thiol group Chemical group [H]S* 0.000 description 6
- 150000003573 thiols Chemical class 0.000 description 6
- JYEUMXHLPRZUAT-UHFFFAOYSA-N 1,2,3-triazine Chemical compound C1=CN=NN=C1 JYEUMXHLPRZUAT-UHFFFAOYSA-N 0.000 description 5
- VXGRJERITKFWPL-UHFFFAOYSA-N 4',5'-Dihydropsoralen Natural products C1=C2OC(=O)C=CC2=CC2=C1OCC2 VXGRJERITKFWPL-UHFFFAOYSA-N 0.000 description 5
- DLGOEMSEDOSKAD-UHFFFAOYSA-N Carmustine Chemical compound ClCCNC(=O)N(N=O)CCCl DLGOEMSEDOSKAD-UHFFFAOYSA-N 0.000 description 5
- 102000004533 Endonucleases Human genes 0.000 description 5
- 101000597662 Homo sapiens Serine/threonine-protein phosphatase 2B catalytic subunit alpha isoform Proteins 0.000 description 5
- 102100035348 Serine/threonine-protein phosphatase 2B catalytic subunit alpha isoform Human genes 0.000 description 5
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical group O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 5
- 108010072685 Uracil-DNA Glycosidase Proteins 0.000 description 5
- 102000006943 Uracil-DNA Glycosidase Human genes 0.000 description 5
- 150000001335 aliphatic alkanes Chemical group 0.000 description 5
- 150000001412 amines Chemical class 0.000 description 5
- 238000004458 analytical method Methods 0.000 description 5
- 238000004364 calculation method Methods 0.000 description 5
- 238000013461 design Methods 0.000 description 5
- 238000010494 dissociation reaction Methods 0.000 description 5
- 230000005593 dissociations Effects 0.000 description 5
- 230000002209 hydrophobic effect Effects 0.000 description 5
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 5
- 239000000203 mixture Substances 0.000 description 5
- 230000000269 nucleophilic effect Effects 0.000 description 5
- 238000002360 preparation method Methods 0.000 description 5
- 238000007637 random forest analysis Methods 0.000 description 5
- 230000002829 reductive effect Effects 0.000 description 5
- 230000008439 repair process Effects 0.000 description 5
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 4
- KBPLFHHGFOOTCA-UHFFFAOYSA-N 1-Octanol Chemical compound CCCCCCCCO KBPLFHHGFOOTCA-UHFFFAOYSA-N 0.000 description 4
- XPAWPLHTNPCLDL-KVQBGUIXSA-N 2-fluoro-9-[(2r,4s,5r)-4-hydroxy-5-(hydroxymethyl)oxolan-2-yl]-3h-purin-6-one Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC(F)=NC(O)=C2N=C1 XPAWPLHTNPCLDL-KVQBGUIXSA-N 0.000 description 4
- IJGRMHOSHXDMSA-UHFFFAOYSA-N Atomic nitrogen Chemical compound N#N IJGRMHOSHXDMSA-UHFFFAOYSA-N 0.000 description 4
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 4
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 4
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 4
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 4
- AOJJSUZBOXZQNB-TZSSRYMLSA-N Doxorubicin Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1 AOJJSUZBOXZQNB-TZSSRYMLSA-N 0.000 description 4
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 4
- 102000003960 Ligases Human genes 0.000 description 4
- 108090000364 Ligases Proteins 0.000 description 4
- 239000002253 acid Substances 0.000 description 4
- 125000003277 amino group Chemical group 0.000 description 4
- 238000013459 approach Methods 0.000 description 4
- 238000013528 artificial neural network Methods 0.000 description 4
- 230000001580 bacterial effect Effects 0.000 description 4
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 4
- 210000004027 cell Anatomy 0.000 description 4
- 239000003153 chemical reaction reagent Substances 0.000 description 4
- DQLATGHUWYMOKM-UHFFFAOYSA-L cisplatin Chemical compound N[Pt](N)(Cl)Cl DQLATGHUWYMOKM-UHFFFAOYSA-L 0.000 description 4
- 238000013135 deep learning Methods 0.000 description 4
- 201000010099 disease Diseases 0.000 description 4
- 230000002538 fungal effect Effects 0.000 description 4
- 239000003446 ligand Substances 0.000 description 4
- 238000002156 mixing Methods 0.000 description 4
- 150000008300 phosphoramidites Chemical class 0.000 description 4
- 229920001223 polyethylene glycol Polymers 0.000 description 4
- 102000004196 processed proteins & peptides Human genes 0.000 description 4
- 239000000376 reactant Substances 0.000 description 4
- 230000008263 repair mechanism Effects 0.000 description 4
- RYYWUUFWQRZTIU-UHFFFAOYSA-K thiophosphate Chemical class [O-]P([O-])([O-])=S RYYWUUFWQRZTIU-UHFFFAOYSA-K 0.000 description 4
- FRZBSUCNDQOKCK-UHFFFAOYSA-N 3-(9h-carbazol-1-yl)prop-2-enoic acid Chemical class C12=CC=CC=C2NC2=C1C=CC=C2C=CC(=O)O FRZBSUCNDQOKCK-UHFFFAOYSA-N 0.000 description 3
- LZKGFGLOQNSMBS-UHFFFAOYSA-N 4,5,6-trichlorotriazine Chemical compound ClC1=NN=NC(Cl)=C1Cl LZKGFGLOQNSMBS-UHFFFAOYSA-N 0.000 description 3
- BUNGCZLFHHXKBX-UHFFFAOYSA-N 8-methoxypsoralen Natural products C1=CC(=O)OC2=C1C=C1CCOC1=C2OC BUNGCZLFHHXKBX-UHFFFAOYSA-N 0.000 description 3
- UHOVQNZJYSORNB-UHFFFAOYSA-N Benzene Chemical group C1=CC=CC=C1 UHOVQNZJYSORNB-UHFFFAOYSA-N 0.000 description 3
- 108010082610 Deoxyribonuclease (Pyrimidine Dimer) Proteins 0.000 description 3
- 102000004099 Deoxyribonuclease (Pyrimidine Dimer) Human genes 0.000 description 3
- 108010036364 Deoxyribonuclease IV (Phage T4-Induced) Proteins 0.000 description 3
- 102100031780 Endonuclease Human genes 0.000 description 3
- 102000004190 Enzymes Human genes 0.000 description 3
- 108090000790 Enzymes Proteins 0.000 description 3
- QXKHYNVANLEOEG-UHFFFAOYSA-N Methoxsalen Chemical compound C1=CC(=O)OC2=C1C=C1C=COC1=C2OC QXKHYNVANLEOEG-UHFFFAOYSA-N 0.000 description 3
- 102000001708 Protein Isoforms Human genes 0.000 description 3
- 108010029485 Protein Isoforms Proteins 0.000 description 3
- RWRDLPDLKQPQOW-UHFFFAOYSA-N Pyrrolidine Chemical compound C1CCNC1 RWRDLPDLKQPQOW-UHFFFAOYSA-N 0.000 description 3
- RYYWUUFWQRZTIU-UHFFFAOYSA-N Thiophosphoric acid Chemical compound OP(O)(S)=O RYYWUUFWQRZTIU-UHFFFAOYSA-N 0.000 description 3
- 108091023040 Transcription factor Proteins 0.000 description 3
- 102000040945 Transcription factor Human genes 0.000 description 3
- 125000001931 aliphatic group Chemical group 0.000 description 3
- 229940100198 alkylating agent Drugs 0.000 description 3
- 239000002168 alkylating agent Substances 0.000 description 3
- 239000011324 bead Substances 0.000 description 3
- 230000006399 behavior Effects 0.000 description 3
- 230000008859 change Effects 0.000 description 3
- 238000000205 computational method Methods 0.000 description 3
- 230000002255 enzymatic effect Effects 0.000 description 3
- 150000002148 esters Chemical group 0.000 description 3
- 238000002875 fluorescence polarization Methods 0.000 description 3
- 125000001072 heteroaryl group Chemical group 0.000 description 3
- 230000006872 improvement Effects 0.000 description 3
- 125000002346 iodo group Chemical group I* 0.000 description 3
- 125000000468 ketone group Chemical group 0.000 description 3
- 230000007246 mechanism Effects 0.000 description 3
- 229910052751 metal Inorganic materials 0.000 description 3
- 239000002184 metal Substances 0.000 description 3
- 229960004469 methoxsalen Drugs 0.000 description 3
- SQBBOVROCFXYBN-UHFFFAOYSA-N methoxypsoralen Natural products C1=C2OC(=O)C(OC)=CC2=CC2=C1OC=C2 SQBBOVROCFXYBN-UHFFFAOYSA-N 0.000 description 3
- 150000004712 monophosphates Chemical class 0.000 description 3
- 150000007523 nucleic acids Chemical class 0.000 description 3
- 230000036961 partial effect Effects 0.000 description 3
- 150000004713 phosphodiesters Chemical class 0.000 description 3
- 238000006116 polymerization reaction Methods 0.000 description 3
- 238000000746 purification Methods 0.000 description 3
- UMJSCPRVCHMLSP-UHFFFAOYSA-N pyridine Natural products COC1=CC=CN=C1 UMJSCPRVCHMLSP-UHFFFAOYSA-N 0.000 description 3
- 150000003839 salts Chemical group 0.000 description 3
- 125000001424 substituent group Chemical group 0.000 description 3
- KTNLKCRJTSDVGD-UHFFFAOYSA-N 1-sulfonyl-2-(2-sulfonylethylsulfanyl)ethane Chemical class S(=O)(=O)=CCSCC=S(=O)=O KTNLKCRJTSDVGD-UHFFFAOYSA-N 0.000 description 2
- NCJINJWAWDKOBJ-UHFFFAOYSA-N 2-(9H-carbazol-1-yl)ethenamine Chemical group NC=CC1=CC=CC=2C3=CC=CC=C3NC1=2 NCJINJWAWDKOBJ-UHFFFAOYSA-N 0.000 description 2
- WEVJJMPVVFNAHZ-UHFFFAOYSA-N 4-amino-1-[4-hydroxy-5-(hydroxymethyl)oxolan-2-yl]-5-iodopyrimidin-2-one Chemical compound C1=C(I)C(N)=NC(=O)N1C1OC(CO)C(O)C1 WEVJJMPVVFNAHZ-UHFFFAOYSA-N 0.000 description 2
- KISUPFXQEHWGAR-RRKCRQDMSA-N 4-amino-5-bromo-1-[(2r,4s,5r)-4-hydroxy-5-(hydroxymethyl)oxolan-2-yl]pyrimidin-2-one Chemical compound C1=C(Br)C(N)=NC(=O)N1[C@@H]1O[C@H](CO)[C@@H](O)C1 KISUPFXQEHWGAR-RRKCRQDMSA-N 0.000 description 2
- WOVKYSAHUYNSMH-RRKCRQDMSA-N 5-bromodeoxyuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C(Br)=C1 WOVKYSAHUYNSMH-RRKCRQDMSA-N 0.000 description 2
- STQGQHZAVUOBTE-UHFFFAOYSA-N 7-Cyan-hept-2t-en-4,6-diinsaeure Natural products C1=2C(O)=C3C(=O)C=4C(OC)=CC=CC=4C(=O)C3=C(O)C=2CC(O)(C(C)=O)CC1OC1CC(N)C(O)C(C)O1 STQGQHZAVUOBTE-UHFFFAOYSA-N 0.000 description 2
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 2
- 102100033477 ADP-ribosylation factor-like protein 13A Human genes 0.000 description 2
- 102100033793 ALK tyrosine kinase receptor Human genes 0.000 description 2
- HRPVXLWXLXDGHG-UHFFFAOYSA-N Acrylamide Chemical group NC(=O)C=C HRPVXLWXLXDGHG-UHFFFAOYSA-N 0.000 description 2
- WOVKYSAHUYNSMH-UHFFFAOYSA-N BROMODEOXYURIDINE Natural products C1C(O)C(CO)OC1N1C(=O)NC(=O)C(Br)=C1 WOVKYSAHUYNSMH-UHFFFAOYSA-N 0.000 description 2
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 2
- KAKZBPTYRLMSJV-UHFFFAOYSA-N Butadiene Chemical class C=CC=C KAKZBPTYRLMSJV-UHFFFAOYSA-N 0.000 description 2
- FVLVBPDQNARYJU-XAHDHGMMSA-N C[C@H]1CCC(CC1)NC(=O)N(CCCl)N=O Chemical compound C[C@H]1CCC(CC1)NC(=O)N(CCCl)N=O FVLVBPDQNARYJU-XAHDHGMMSA-N 0.000 description 2
- 101100452003 Caenorhabditis elegans ape-1 gene Proteins 0.000 description 2
- 102100033620 Calponin-1 Human genes 0.000 description 2
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 2
- 108010077544 Chromatin Proteins 0.000 description 2
- CMSMOCZEIVJLDB-UHFFFAOYSA-N Cyclophosphamide Chemical group ClCCN(CCCl)P1(=O)NCCCO1 CMSMOCZEIVJLDB-UHFFFAOYSA-N 0.000 description 2
- 102000012410 DNA Ligases Human genes 0.000 description 2
- 108010061982 DNA Ligases Proteins 0.000 description 2
- 238000005698 Diels-Alder reaction Methods 0.000 description 2
- BWGNESOTFCXPMA-UHFFFAOYSA-N Dihydrogen disulfide Chemical compound SS BWGNESOTFCXPMA-UHFFFAOYSA-N 0.000 description 2
- 238000002965 ELISA Methods 0.000 description 2
- 241000196324 Embryophyta Species 0.000 description 2
- 101710081048 Endonuclease III Proteins 0.000 description 2
- 101000877447 Enterobacteria phage T4 Endonuclease V Proteins 0.000 description 2
- 102100025626 GTP-binding protein GEM Human genes 0.000 description 2
- 102100033962 GTP-binding protein RAD Human genes 0.000 description 2
- 108091006094 GTPase-accelerating proteins Proteins 0.000 description 2
- 239000004471 Glycine Substances 0.000 description 2
- 102000002812 Heat-Shock Proteins Human genes 0.000 description 2
- 108010004889 Heat-Shock Proteins Proteins 0.000 description 2
- 101000927075 Homo sapiens ADP-ribosylation factor-like protein 13A Proteins 0.000 description 2
- 101000945318 Homo sapiens Calponin-1 Proteins 0.000 description 2
- 101001132495 Homo sapiens GTP-binding protein RAD Proteins 0.000 description 2
- 101000652736 Homo sapiens Transgelin Proteins 0.000 description 2
- 101000857270 Homo sapiens Zinc finger protein GLIS1 Proteins 0.000 description 2
- XQFRJNBWHJMXHO-RRKCRQDMSA-N IDUR Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C(I)=C1 XQFRJNBWHJMXHO-RRKCRQDMSA-N 0.000 description 2
- SIKJAQJRHWYJAI-UHFFFAOYSA-N Indole Chemical compound C1=CC=C2NC=CC2=C1 SIKJAQJRHWYJAI-UHFFFAOYSA-N 0.000 description 2
- 102000004310 Ion Channels Human genes 0.000 description 2
- 108090000862 Ion Channels Proteins 0.000 description 2
- GQYIWUVLTXOXAJ-UHFFFAOYSA-N Lomustine Chemical compound ClCCN(N=O)C(=O)NC1CCCCC1 GQYIWUVLTXOXAJ-UHFFFAOYSA-N 0.000 description 2
- 101000777240 Micrococcus luteus (strain ATCC 4698 / DSM 20030 / JCM 1464 / NBRC 3333 / NCIMB 9278 / NCTC 2665 / VKM Ac-2230) Ultraviolet N-glycosylase/AP lyase Proteins 0.000 description 2
- 101100087528 Mus musculus Rhoj gene Proteins 0.000 description 2
- NWIBSHFKIJFRCO-WUDYKRTCSA-N Mytomycin Chemical compound C1N2C(C(C(C)=C(N)C3=O)=O)=C3[C@@H](COC(N)=O)[C@@]2(OC)[C@@H]2[C@H]1N2 NWIBSHFKIJFRCO-WUDYKRTCSA-N 0.000 description 2
- 238000010222 PCR analysis Methods 0.000 description 2
- 108010067902 Peptide Library Proteins 0.000 description 2
- 108700019535 Phosphoprotein Phosphatases Proteins 0.000 description 2
- 102000045595 Phosphoprotein Phosphatases Human genes 0.000 description 2
- 102000001253 Protein Kinase Human genes 0.000 description 2
- WTKZEGDFNFYCGP-UHFFFAOYSA-N Pyrazole Chemical compound C=1C=NNC=1 WTKZEGDFNFYCGP-UHFFFAOYSA-N 0.000 description 2
- KAESVJOAVNADME-UHFFFAOYSA-N Pyrrole Chemical compound C=1C=CNC=1 KAESVJOAVNADME-UHFFFAOYSA-N 0.000 description 2
- INVGWHRKADIJHF-UHFFFAOYSA-N Sanguinarin Chemical compound C1=C2OCOC2=CC2=C3[N+](C)=CC4=C(OCO5)C5=CC=C4C3=CC=C21 INVGWHRKADIJHF-UHFFFAOYSA-N 0.000 description 2
- FOCVUCIESVLUNU-UHFFFAOYSA-N Thiotepa Chemical compound C1CN1P(N1CC1)(=S)N1CC1 FOCVUCIESVLUNU-UHFFFAOYSA-N 0.000 description 2
- 108010083111 Ubiquitin-Protein Ligases Proteins 0.000 description 2
- 102000006275 Ubiquitin-Protein Ligases Human genes 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- DRTQHJPVMGBUCF-XVFCMESISA-N Uridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-XVFCMESISA-N 0.000 description 2
- 102100025883 Zinc finger protein GLIS1 Human genes 0.000 description 2
- 150000001345 alkine derivatives Chemical class 0.000 description 2
- 125000004453 alkoxycarbonyl group Chemical group 0.000 description 2
- 150000001413 amino acids Chemical class 0.000 description 2
- 239000012491 analyte Substances 0.000 description 2
- 125000003118 aryl group Chemical group 0.000 description 2
- 238000003556 assay Methods 0.000 description 2
- 150000001540 azides Chemical class 0.000 description 2
- 125000000852 azido group Chemical group *N=[N+]=[N-] 0.000 description 2
- YBHILYKTIRIUTE-UHFFFAOYSA-N berberine Chemical compound C1=C2CC[N+]3=CC4=C(OC)C(OC)=CC=C4C=C3C2=CC2=C1OCO2 YBHILYKTIRIUTE-UHFFFAOYSA-N 0.000 description 2
- 229940093265 berberine Drugs 0.000 description 2
- QISXPYZVZJBNDM-UHFFFAOYSA-N berberine Natural products COc1ccc2C=C3N(Cc2c1OC)C=Cc4cc5OCOc5cc34 QISXPYZVZJBNDM-UHFFFAOYSA-N 0.000 description 2
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 2
- 230000008827 biological function Effects 0.000 description 2
- 230000008236 biological pathway Effects 0.000 description 2
- 229960002685 biotin Drugs 0.000 description 2
- 235000020958 biotin Nutrition 0.000 description 2
- 239000011616 biotin Substances 0.000 description 2
- 229950004398 broxuridine Drugs 0.000 description 2
- 229910052799 carbon Inorganic materials 0.000 description 2
- 229960005243 carmustine Drugs 0.000 description 2
- 235000017168 chlorine Nutrition 0.000 description 2
- 210000003483 chromatin Anatomy 0.000 description 2
- 229960004316 cisplatin Drugs 0.000 description 2
- 230000000875 corresponding effect Effects 0.000 description 2
- MGNZXYYWBUKAII-UHFFFAOYSA-N cyclohexediene Chemical class C1CC=CC=C1 MGNZXYYWBUKAII-UHFFFAOYSA-N 0.000 description 2
- 229960004397 cyclophosphamide Drugs 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- 229940104302 cytosine Drugs 0.000 description 2
- 238000013480 data collection Methods 0.000 description 2
- 229960000975 daunorubicin Drugs 0.000 description 2
- STQGQHZAVUOBTE-VGBVRHCVSA-N daunorubicin Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(C)=O)[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1 STQGQHZAVUOBTE-VGBVRHCVSA-N 0.000 description 2
- 238000003066 decision tree Methods 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 208000035475 disorder Diseases 0.000 description 2
- 229960004679 doxorubicin Drugs 0.000 description 2
- ZSWFCLXCOIISFI-UHFFFAOYSA-N endo-cyclopentadiene Chemical class C1C=CC=C1 ZSWFCLXCOIISFI-UHFFFAOYSA-N 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 102000052116 epidermal growth factor receptor activity proteins Human genes 0.000 description 2
- 108700015053 epidermal growth factor receptor activity proteins Proteins 0.000 description 2
- GNBHRKFJIUUOQI-UHFFFAOYSA-N fluorescein Chemical compound O1C(=O)C2=CC=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 GNBHRKFJIUUOQI-UHFFFAOYSA-N 0.000 description 2
- 238000009472 formulation Methods 0.000 description 2
- 125000002485 formyl group Chemical class [H]C(*)=O 0.000 description 2
- 150000004676 glycans Chemical class 0.000 description 2
- 125000000623 heterocyclic group Chemical group 0.000 description 2
- 238000009396 hybridization Methods 0.000 description 2
- OAKJQQAXSVQMHS-UHFFFAOYSA-N hydrazine Substances NN OAKJQQAXSVQMHS-UHFFFAOYSA-N 0.000 description 2
- 229910052739 hydrogen Inorganic materials 0.000 description 2
- 239000001257 hydrogen Substances 0.000 description 2
- 125000001165 hydrophobic group Chemical group 0.000 description 2
- 150000003949 imides Chemical class 0.000 description 2
- 239000003112 inhibitor Substances 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- 238000010208 microarray analysis Methods 0.000 description 2
- YOHYSYJDKVYCJI-UHFFFAOYSA-N n-[3-[[6-[3-(trifluoromethyl)anilino]pyrimidin-4-yl]amino]phenyl]cyclopropanecarboxamide Chemical compound FC(F)(F)C1=CC=CC(NC=2N=CN=C(NC=3C=C(NC(=O)C4CC4)C=CC=3)C=2)=C1 YOHYSYJDKVYCJI-UHFFFAOYSA-N 0.000 description 2
- 230000009871 nonspecific binding Effects 0.000 description 2
- 102000039446 nucleic acids Human genes 0.000 description 2
- 108020004707 nucleic acids Proteins 0.000 description 2
- 239000012038 nucleophile Substances 0.000 description 2
- 238000002515 oligonucleotide synthesis Methods 0.000 description 2
- 230000003647 oxidation Effects 0.000 description 2
- 238000007254 oxidation reaction Methods 0.000 description 2
- 238000005192 partition Methods 0.000 description 2
- 125000001997 phenyl group Chemical group [H]C1=C([H])C([H])=C(*)C([H])=C1[H] 0.000 description 2
- NMHMNPHRMNGLLB-UHFFFAOYSA-N phloretic acid Chemical compound OC(=O)CCC1=CC=C(O)C=C1 NMHMNPHRMNGLLB-UHFFFAOYSA-N 0.000 description 2
- IIMIOEBMYPRQGU-UHFFFAOYSA-L picoplatin Chemical compound N.[Cl-].[Cl-].[Pt+2].CC1=CC=CC=N1 IIMIOEBMYPRQGU-UHFFFAOYSA-L 0.000 description 2
- 150000003057 platinum Chemical class 0.000 description 2
- 229920000642 polymer Polymers 0.000 description 2
- 229920001282 polysaccharide Polymers 0.000 description 2
- 239000005017 polysaccharide Substances 0.000 description 2
- 239000013641 positive control Substances 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 229940002612 prodrug Drugs 0.000 description 2
- 239000000651 prodrug Substances 0.000 description 2
- 108060006633 protein kinase Proteins 0.000 description 2
- 230000004850 protein–protein interaction Effects 0.000 description 2
- 150000003254 radicals Chemical class 0.000 description 2
- 238000007634 remodeling Methods 0.000 description 2
- 238000012216 screening Methods 0.000 description 2
- 229960003440 semustine Drugs 0.000 description 2
- 230000009870 specific binding Effects 0.000 description 2
- 125000000547 substituted alkyl group Chemical group 0.000 description 2
- 125000004426 substituted alkynyl group Chemical group 0.000 description 2
- JJAHTWIKCUJRDK-UHFFFAOYSA-N succinimidyl 4-(N-maleimidomethyl)cyclohexane-1-carboxylate Chemical compound C1CC(CN2C(C=CC2=O)=O)CCC1C(=O)ON1C(=O)CCC1=O JJAHTWIKCUJRDK-UHFFFAOYSA-N 0.000 description 2
- 238000012706 support-vector machine Methods 0.000 description 2
- 238000012360 testing method Methods 0.000 description 2
- 229940104230 thymidine Drugs 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- IUCJMVBFZDHPDX-UHFFFAOYSA-N tretamine Chemical compound C1CN1C1=NC(N2CC2)=NC(N2CC2)=N1 IUCJMVBFZDHPDX-UHFFFAOYSA-N 0.000 description 2
- 150000003918 triazines Chemical class 0.000 description 2
- 239000001226 triphosphate Substances 0.000 description 2
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 2
- HZSBSRAVNBUZRA-RQDPQJJXSA-J (1r,2r)-cyclohexane-1,2-diamine;tetrachloroplatinum(2+) Chemical compound Cl[Pt+2](Cl)(Cl)Cl.N[C@@H]1CCCC[C@H]1N HZSBSRAVNBUZRA-RQDPQJJXSA-J 0.000 description 1
- TYJPSIQEEXOQLC-UHFFFAOYSA-N (2,5-dioxopyrrolidin-1-yl) 6-[(2-methylpropan-2-yl)oxycarbonylamino]hexanoate Chemical compound CC(C)(C)OC(=O)NCCCCCC(=O)ON1C(=O)CCC1=O TYJPSIQEEXOQLC-UHFFFAOYSA-N 0.000 description 1
- HXNBAOLVPAWYLT-NVNXTCNLSA-N (5z)-5-[[5-bromo-2-[(2-bromophenyl)methoxy]phenyl]methylidene]-2-sulfanylidene-1,3-thiazolidin-4-one Chemical compound S\1C(=S)NC(=O)C/1=C/C1=CC(Br)=CC=C1OCC1=CC=CC=C1Br HXNBAOLVPAWYLT-NVNXTCNLSA-N 0.000 description 1
- FIDRAVVQGKNYQK-UHFFFAOYSA-N 1,2,3,4-tetrahydrotriazine Chemical compound C1NNNC=C1 FIDRAVVQGKNYQK-UHFFFAOYSA-N 0.000 description 1
- KTZQTRPPVKQPFO-UHFFFAOYSA-N 1,2-benzoxazole Chemical compound C1=CC=C2C=NOC2=C1 KTZQTRPPVKQPFO-UHFFFAOYSA-N 0.000 description 1
- 238000007106 1,2-cycloaddition reaction Methods 0.000 description 1
- CDKIEBFIMCSCBB-UHFFFAOYSA-N 1-(6,7-dimethoxy-3,4-dihydro-1h-isoquinolin-2-yl)-3-(1-methyl-2-phenylpyrrolo[2,3-b]pyridin-3-yl)prop-2-en-1-one;hydrochloride Chemical compound Cl.C1C=2C=C(OC)C(OC)=CC=2CCN1C(=O)C=CC(C1=CC=CN=C1N1C)=C1C1=CC=CC=C1 CDKIEBFIMCSCBB-UHFFFAOYSA-N 0.000 description 1
- ZHLXWDHNRVTLBX-BMYQGPEFSA-N 1-[(2R,3R,4S,5R)-2-ethenyl-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-2,4-dioxopyrimidine-5-carboxylic acid Chemical group C(=O)(O)C=1C(NC(N([C@]2([C@H](O)[C@H](O)[C@@H](CO)O2)C=C)C=1)=O)=O ZHLXWDHNRVTLBX-BMYQGPEFSA-N 0.000 description 1
- QUBHZNQRJVRJNT-JOAULVNJSA-N 1-[(2S,4S,5R)-2-ethenyl-4-hydroxy-5-(hydroxymethyl)oxolan-2-yl]-2,4-dioxopyrimidine-5-carboxylic acid Chemical compound C(=O)(O)C=1C(NC(N([C@]2(C[C@H](O)[C@@H](CO)O2)C=C)C1)=O)=O QUBHZNQRJVRJNT-JOAULVNJSA-N 0.000 description 1
- QGAFNZMEIFDENH-UHFFFAOYSA-N 1-hexanoyloxy-2,5-dioxopyrrolidine-3-sulfonic acid Chemical compound CCCCCC(=O)ON1C(=O)CC(S(O)(=O)=O)C1=O QGAFNZMEIFDENH-UHFFFAOYSA-N 0.000 description 1
- UZGKAASZIMOAMU-UHFFFAOYSA-N 124177-85-1 Chemical compound NP(=O)=O UZGKAASZIMOAMU-UHFFFAOYSA-N 0.000 description 1
- HYZJCKYKOHLVJF-UHFFFAOYSA-N 1H-benzimidazole Chemical compound C1=CC=C2NC=NC2=C1 HYZJCKYKOHLVJF-UHFFFAOYSA-N 0.000 description 1
- QMQZIXCNLUPEIN-UHFFFAOYSA-N 1h-imidazole-2-carbonitrile Chemical compound N#CC1=NC=CN1 QMQZIXCNLUPEIN-UHFFFAOYSA-N 0.000 description 1
- UEJJHQNACJXSKW-UHFFFAOYSA-N 2-(2,6-dioxopiperidin-3-yl)-1H-isoindole-1,3(2H)-dione Chemical compound O=C1C2=CC=CC=C2C(=O)N1C1CCC(=O)NC1=O UEJJHQNACJXSKW-UHFFFAOYSA-N 0.000 description 1
- HZOYZGXLSVYLNF-UHFFFAOYSA-N 2-amino-3,7-dihydropurin-6-one;1h-pyrimidine-2,4-dione Chemical compound O=C1C=CNC(=O)N1.O=C1NC(N)=NC2=C1NC=N2 HZOYZGXLSVYLNF-UHFFFAOYSA-N 0.000 description 1
- NOIRDLRUNWIUMX-UHFFFAOYSA-N 2-amino-3,7-dihydropurin-6-one;6-amino-1h-pyrimidin-2-one Chemical compound NC=1C=CNC(=O)N=1.O=C1NC(N)=NC2=C1NC=N2 NOIRDLRUNWIUMX-UHFFFAOYSA-N 0.000 description 1
- 125000002941 2-furyl group Chemical group O1C([*])=C([H])C([H])=C1[H] 0.000 description 1
- RSEBUVRVKCANEP-UHFFFAOYSA-N 2-pyrroline Chemical compound C1CC=CN1 RSEBUVRVKCANEP-UHFFFAOYSA-N 0.000 description 1
- MGADZUXDNSDTHW-UHFFFAOYSA-N 2H-pyran Chemical compound C1OC=CC=C1 MGADZUXDNSDTHW-UHFFFAOYSA-N 0.000 description 1
- HDAVJPSXEPLOMF-UHFFFAOYSA-N 3-(9h-carbazol-3-yl)prop-2-enenitrile Chemical compound C1=CC=C2C3=CC(C=CC#N)=CC=C3NC2=C1 HDAVJPSXEPLOMF-UHFFFAOYSA-N 0.000 description 1
- HVCOBJNICQPDBP-UHFFFAOYSA-N 3-[3-[3,5-dihydroxy-6-methyl-4-(3,4,5-trihydroxy-6-methyloxan-2-yl)oxyoxan-2-yl]oxydecanoyloxy]decanoic acid;hydrate Chemical compound O.OC1C(OC(CC(=O)OC(CCCCCCC)CC(O)=O)CCCCCCC)OC(C)C(O)C1OC1C(O)C(O)C(O)C(C)O1 HVCOBJNICQPDBP-UHFFFAOYSA-N 0.000 description 1
- 108010034927 3-methyladenine-DNA glycosylase Proteins 0.000 description 1
- AOJJSUZBOXZQNB-VTZDEGQISA-N 4'-epidoxorubicin Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@H]1C[C@H](N)[C@@H](O)[C@H](C)O1 AOJJSUZBOXZQNB-VTZDEGQISA-N 0.000 description 1
- DWAOUXYZOSPAOH-UHFFFAOYSA-N 4-[2-(diethylamino)ethoxy]furo[3,2-g]chromen-7-one;hydrochloride Chemical compound [Cl-].O1C(=O)C=CC2=C1C=C1OC=CC1=C2OCC[NH+](CC)CC DWAOUXYZOSPAOH-UHFFFAOYSA-N 0.000 description 1
- SSMVDPYHLFEAJE-UHFFFAOYSA-N 4-azidoaniline Chemical compound NC1=CC=C(N=[N+]=[N-])C=C1 SSMVDPYHLFEAJE-UHFFFAOYSA-N 0.000 description 1
- IFQUPKAISSPFTE-UHFFFAOYSA-N 4-benzoylbenzoic acid Chemical compound C1=CC(C(=O)O)=CC=C1C(=O)C1=CC=CC=C1 IFQUPKAISSPFTE-UHFFFAOYSA-N 0.000 description 1
- WVKOPZMDOFGFAK-UHFFFAOYSA-N 4-hydroperoxycyclophosphamide Chemical compound OOC1=NP(O)(N(CCCl)CCCl)OCC1 WVKOPZMDOFGFAK-UHFFFAOYSA-N 0.000 description 1
- IWWDWOYKHMTHSL-UHFFFAOYSA-N 4-methoxybuta-1,3-dien-2-yl(trimethyl)silane Chemical class COC=CC(=C)[Si](C)(C)C IWWDWOYKHMTHSL-UHFFFAOYSA-N 0.000 description 1
- SBZDIRMBQJDCLB-UHFFFAOYSA-N 5-azidopentanoic acid Chemical compound OC(=O)CCCCN=[N+]=[N-] SBZDIRMBQJDCLB-UHFFFAOYSA-N 0.000 description 1
- FFKUHGONCHRHPE-UHFFFAOYSA-N 5-methyl-1h-pyrimidine-2,4-dione;7h-purin-6-amine Chemical compound CC1=CNC(=O)NC1=O.NC1=NC=NC2=C1NC=N2 FFKUHGONCHRHPE-UHFFFAOYSA-N 0.000 description 1
- UBKVUFQGVWHZIR-UHFFFAOYSA-N 8-oxoguanine Chemical compound O=C1NC(N)=NC2=NC(=O)N=C21 UBKVUFQGVWHZIR-UHFFFAOYSA-N 0.000 description 1
- 102100028468 ADP-ribosylation factor-like protein 10 Human genes 0.000 description 1
- 102100039645 ADP-ribosylation factor-like protein 4A Human genes 0.000 description 1
- 102100022861 ADP-ribosylation factor-like protein 5A Human genes 0.000 description 1
- 101150020330 ATRX gene Proteins 0.000 description 1
- 101001082110 Acanthamoeba polyphaga mimivirus Eukaryotic translation initiation factor 4E homolog Proteins 0.000 description 1
- 102100022089 Acyl-[acyl-carrier-protein] hydrolase Human genes 0.000 description 1
- 102100027165 Alpha-2-macroglobulin receptor-associated protein Human genes 0.000 description 1
- 101710126837 Alpha-2-macroglobulin receptor-associated protein Proteins 0.000 description 1
- 108091093088 Amplicon Proteins 0.000 description 1
- 108010031677 Anaphase-Promoting Complex-Cyclosome Proteins 0.000 description 1
- 102000005446 Anaphase-Promoting Complex-Cyclosome Human genes 0.000 description 1
- 102100032187 Androgen receptor Human genes 0.000 description 1
- 102100021569 Apoptosis regulator Bcl-2 Human genes 0.000 description 1
- 101100056377 Arabidopsis thaliana ARL gene Proteins 0.000 description 1
- 101100190393 Arabidopsis thaliana CLMP1 gene Proteins 0.000 description 1
- 101100404726 Arabidopsis thaliana NHX7 gene Proteins 0.000 description 1
- 102100036781 Arf-GAP with GTPase, ANK repeat and PH domain-containing protein 2 Human genes 0.000 description 1
- 102100024003 Arf-GAP with SH3 domain, ANK repeat and PH domain-containing protein 1 Human genes 0.000 description 1
- GIDCUQKCIZAUKW-UHFFFAOYSA-N Aristolactam I N-beta-D-glucoside Natural products COC1=CC=CC(C=2C=3OCOC=3C=C(C3=2)C2=O)=C1C=C3N2C1OC(CO)C(O)C(O)C1O GIDCUQKCIZAUKW-UHFFFAOYSA-N 0.000 description 1
- 102100037211 Aryl hydrocarbon receptor nuclear translocator-like protein 1 Human genes 0.000 description 1
- 102100021631 B-cell lymphoma 6 protein Human genes 0.000 description 1
- 102100021247 BCL-6 corepressor Human genes 0.000 description 1
- 102100035080 BDNF/NT-3 growth factors receptor Human genes 0.000 description 1
- 102000008836 BTB/POZ domains Human genes 0.000 description 1
- 108050000749 BTB/POZ domains Proteins 0.000 description 1
- 108060000903 Beta-catenin Proteins 0.000 description 1
- 102000015735 Beta-catenin Human genes 0.000 description 1
- ZOXJGFHDIHLPTG-UHFFFAOYSA-N Boron Chemical compound [B] ZOXJGFHDIHLPTG-UHFFFAOYSA-N 0.000 description 1
- 102100026008 Breakpoint cluster region protein Human genes 0.000 description 1
- 102000001805 Bromodomains Human genes 0.000 description 1
- 108050009021 Bromodomains Proteins 0.000 description 1
- COVZYZSDYWQREU-UHFFFAOYSA-N Busulfan Chemical compound CS(=O)(=O)OCCCCOS(C)(=O)=O COVZYZSDYWQREU-UHFFFAOYSA-N 0.000 description 1
- 108010029697 CD40 Ligand Proteins 0.000 description 1
- 101150013553 CD40 gene Proteins 0.000 description 1
- 102100032937 CD40 ligand Human genes 0.000 description 1
- 102100021975 CREB-binding protein Human genes 0.000 description 1
- 101100065718 Caenorhabditis elegans ets-4 gene Proteins 0.000 description 1
- 101100450705 Caenorhabditis elegans hif-1 gene Proteins 0.000 description 1
- 101100507655 Canis lupus familiaris HSPA1 gene Proteins 0.000 description 1
- VEXZGXHMUGYJMC-UHFFFAOYSA-M Chloride anion Chemical compound [Cl-] VEXZGXHMUGYJMC-UHFFFAOYSA-M 0.000 description 1
- ZAMOUSCENKQFHK-UHFFFAOYSA-N Chlorine atom Chemical compound [Cl] ZAMOUSCENKQFHK-UHFFFAOYSA-N 0.000 description 1
- 102100036444 Clathrin interactor 1 Human genes 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 102000010970 Connexin Human genes 0.000 description 1
- 108050001175 Connexin Proteins 0.000 description 1
- 239000004971 Cross linker Substances 0.000 description 1
- 102100023580 Cyclic AMP-dependent transcription factor ATF-4 Human genes 0.000 description 1
- 102100024458 Cyclin-dependent kinase inhibitor 2A Human genes 0.000 description 1
- 102100039498 Cytotoxic T-lymphocyte protein 4 Human genes 0.000 description 1
- 108090000133 DNA helicases Proteins 0.000 description 1
- 102000003844 DNA helicases Human genes 0.000 description 1
- 230000006820 DNA synthesis Effects 0.000 description 1
- 102100039128 DNA-3-methyladenine glycosylase Human genes 0.000 description 1
- 101001082109 Danio rerio Eukaryotic translation initiation factor 4E-1B Proteins 0.000 description 1
- 102100031597 Dedicator of cytokinesis protein 2 Human genes 0.000 description 1
- 108010046331 Deoxyribodipyrimidine photo-lyase Proteins 0.000 description 1
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 1
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 1
- YXHKONLOYHBTNS-UHFFFAOYSA-N Diazomethane Chemical group C=[N+]=[N-] YXHKONLOYHBTNS-UHFFFAOYSA-N 0.000 description 1
- 101100339887 Drosophila melanogaster Hsp27 gene Proteins 0.000 description 1
- 101100295776 Drosophila melanogaster onecut gene Proteins 0.000 description 1
- 102000012199 E3 ubiquitin-protein ligase Mdm2 Human genes 0.000 description 1
- 108050002772 E3 ubiquitin-protein ligase Mdm2 Proteins 0.000 description 1
- 102100031918 E3 ubiquitin-protein ligase NEDD4 Human genes 0.000 description 1
- 102100039503 E3 ubiquitin-protein ligase RNF31 Human genes 0.000 description 1
- 102100032057 ETS domain-containing protein Elk-1 Human genes 0.000 description 1
- 102100023794 ETS domain-containing protein Elk-3 Human genes 0.000 description 1
- 102100023792 ETS domain-containing protein Elk-4 Human genes 0.000 description 1
- 102100032025 ETS homologous factor Human genes 0.000 description 1
- 102100039578 ETS translocation variant 4 Human genes 0.000 description 1
- 102100035079 ETS-related transcription factor Elf-3 Human genes 0.000 description 1
- 102100039247 ETS-related transcription factor Elf-4 Human genes 0.000 description 1
- 102100039244 ETS-related transcription factor Elf-5 Human genes 0.000 description 1
- 102100023078 Early endosome antigen 1 Human genes 0.000 description 1
- 102100023226 Early growth response protein 1 Human genes 0.000 description 1
- HTIJFSOGRVMCQR-UHFFFAOYSA-N Epirubicin Natural products COc1cccc2C(=O)c3c(O)c4CC(O)(CC(OC5CC(N)C(=O)C(C)O5)c4c(O)c3C(=O)c12)C(=O)CO HTIJFSOGRVMCQR-UHFFFAOYSA-N 0.000 description 1
- 239000004593 Epoxy Substances 0.000 description 1
- FCEXWTOTHXCQCQ-UHFFFAOYSA-N Ethoxydihydrosanguinarine Natural products C12=CC=C3OCOC3=C2C(OCC)N(C)C(C2=C3)=C1C=CC2=CC1=C3OCO1 FCEXWTOTHXCQCQ-UHFFFAOYSA-N 0.000 description 1
- LYCAIKOWRPUZTN-UHFFFAOYSA-N Ethylene glycol Chemical group OCCO LYCAIKOWRPUZTN-UHFFFAOYSA-N 0.000 description 1
- 101000914063 Eucalyptus globulus Leafy/floricaula homolog FL1 Proteins 0.000 description 1
- 102100030667 Eukaryotic peptide chain release factor subunit 1 Human genes 0.000 description 1
- 108060002716 Exonuclease Proteins 0.000 description 1
- 108091008794 FGF receptors Proteins 0.000 description 1
- 102100035264 FYVE and coiled-coil domain-containing protein 1 Human genes 0.000 description 1
- 102100037008 Factor in the germline alpha Human genes 0.000 description 1
- 101100058548 Felis catus BMI1 gene Proteins 0.000 description 1
- 102000004150 Flap endonucleases Human genes 0.000 description 1
- 108090000652 Flap endonucleases Proteins 0.000 description 1
- 101710181403 Frizzled Proteins 0.000 description 1
- 102000005698 Frizzled receptors Human genes 0.000 description 1
- 108010045438 Frizzled receptors Proteins 0.000 description 1
- 102000008412 GATA5 Transcription Factor Human genes 0.000 description 1
- 108010021779 GATA5 Transcription Factor Proteins 0.000 description 1
- 102000001267 GSK3 Human genes 0.000 description 1
- 108060006662 GSK3 Proteins 0.000 description 1
- 102000013446 GTP Phosphohydrolases Human genes 0.000 description 1
- 102100037941 GTP-binding protein Di-Ras1 Human genes 0.000 description 1
- 102100037949 GTP-binding protein Di-Ras2 Human genes 0.000 description 1
- 102100027362 GTP-binding protein REM 2 Human genes 0.000 description 1
- 102100027541 GTP-binding protein Rheb Human genes 0.000 description 1
- 102100022871 GTPase ERas Human genes 0.000 description 1
- 102100029974 GTPase HRas Human genes 0.000 description 1
- 102100030708 GTPase KRas Human genes 0.000 description 1
- 102100039788 GTPase NRas Human genes 0.000 description 1
- 102000018898 GTPase-Activating Proteins Human genes 0.000 description 1
- 108091006109 GTPases Proteins 0.000 description 1
- 101001077417 Gallus gallus Potassium voltage-gated channel subfamily H member 6 Proteins 0.000 description 1
- 102100039956 Geminin Human genes 0.000 description 1
- 108090000577 Geminin Proteins 0.000 description 1
- 108091093094 Glycol nucleic acid Proteins 0.000 description 1
- 229930186217 Glycolipid Natural products 0.000 description 1
- 108090000288 Glycoproteins Proteins 0.000 description 1
- 102000003886 Glycoproteins Human genes 0.000 description 1
- 102100021383 Guanine nucleotide exchange factor DBS Human genes 0.000 description 1
- 102100032134 Guanine nucleotide exchange factor VAV2 Human genes 0.000 description 1
- 101150019359 HSF1 gene Proteins 0.000 description 1
- 101150007616 HSP90AB1 gene Proteins 0.000 description 1
- 101150096895 HSPB1 gene Proteins 0.000 description 1
- 102100026973 Heat shock protein 75 kDa, mitochondrial Human genes 0.000 description 1
- 101710130649 Heat shock protein 75 kDa, mitochondrial Proteins 0.000 description 1
- 102100039170 Heat shock protein beta-6 Human genes 0.000 description 1
- 101710100489 Heat shock protein beta-6 Proteins 0.000 description 1
- 102100035108 High affinity nerve growth factor receptor Human genes 0.000 description 1
- 102100022901 Histone acetyltransferase KAT2A Human genes 0.000 description 1
- 102100033071 Histone acetyltransferase KAT6A Human genes 0.000 description 1
- 102100033070 Histone acetyltransferase KAT6B Human genes 0.000 description 1
- 102100038885 Histone acetyltransferase p300 Human genes 0.000 description 1
- 102100022103 Histone-lysine N-methyltransferase 2A Human genes 0.000 description 1
- 102100026265 Histone-lysine N-methyltransferase ASH1L Human genes 0.000 description 1
- 102100039121 Histone-lysine N-methyltransferase MECOM Human genes 0.000 description 1
- 102100032742 Histone-lysine N-methyltransferase SETD2 Human genes 0.000 description 1
- 102100029239 Histone-lysine N-methyltransferase, H3 lysine-36 specific Human genes 0.000 description 1
- 101000824356 Homarus americanus FMRFamide-like neuropeptide 4 Proteins 0.000 description 1
- 102100024208 Homeobox protein MIXL1 Human genes 0.000 description 1
- 102100029054 Homeobox protein notochord Human genes 0.000 description 1
- 101000769450 Homo sapiens ADP-ribosylation factor-like protein 10 Proteins 0.000 description 1
- 101000886015 Homo sapiens ADP-ribosylation factor-like protein 4A Proteins 0.000 description 1
- 101000974441 Homo sapiens ADP-ribosylation factor-like protein 5A Proteins 0.000 description 1
- 101000779641 Homo sapiens ALK tyrosine kinase receptor Proteins 0.000 description 1
- 101000824278 Homo sapiens Acyl-[acyl-carrier-protein] hydrolase Proteins 0.000 description 1
- 101000971171 Homo sapiens Apoptosis regulator Bcl-2 Proteins 0.000 description 1
- 101000928215 Homo sapiens Arf-GAP with GTPase, ANK repeat and PH domain-containing protein 2 Proteins 0.000 description 1
- 101000975752 Homo sapiens Arf-GAP with SH3 domain, ANK repeat and PH domain-containing protein 1 Proteins 0.000 description 1
- 101000740484 Homo sapiens Aryl hydrocarbon receptor nuclear translocator-like protein 1 Proteins 0.000 description 1
- 101000971234 Homo sapiens B-cell lymphoma 6 protein Proteins 0.000 description 1
- 101100165236 Homo sapiens BCOR gene Proteins 0.000 description 1
- 101000596896 Homo sapiens BDNF/NT-3 growth factors receptor Proteins 0.000 description 1
- 101000933320 Homo sapiens Breakpoint cluster region protein Proteins 0.000 description 1
- 101000896987 Homo sapiens CREB-binding protein Proteins 0.000 description 1
- 101000851951 Homo sapiens Clathrin interactor 1 Proteins 0.000 description 1
- 101000905743 Homo sapiens Cyclic AMP-dependent transcription factor ATF-4 Proteins 0.000 description 1
- 101000889276 Homo sapiens Cytotoxic T-lymphocyte protein 4 Proteins 0.000 description 1
- 101000866237 Homo sapiens Dedicator of cytokinesis protein 2 Proteins 0.000 description 1
- 101001053992 Homo sapiens Deleted in lung and esophageal cancer protein 1 Proteins 0.000 description 1
- 101000966403 Homo sapiens Dynein light chain 1, cytoplasmic Proteins 0.000 description 1
- 101000908706 Homo sapiens Dynein light chain 2, cytoplasmic Proteins 0.000 description 1
- 101000636713 Homo sapiens E3 ubiquitin-protein ligase NEDD4 Proteins 0.000 description 1
- 101001103583 Homo sapiens E3 ubiquitin-protein ligase RNF31 Proteins 0.000 description 1
- 101001048716 Homo sapiens ETS domain-containing protein Elk-4 Proteins 0.000 description 1
- 101000938776 Homo sapiens ETS domain-containing transcription factor ERF Proteins 0.000 description 1
- 101000921245 Homo sapiens ETS homologous factor Proteins 0.000 description 1
- 101000813747 Homo sapiens ETS translocation variant 4 Proteins 0.000 description 1
- 101000877395 Homo sapiens ETS-related transcription factor Elf-1 Proteins 0.000 description 1
- 101000877379 Homo sapiens ETS-related transcription factor Elf-3 Proteins 0.000 description 1
- 101000813135 Homo sapiens ETS-related transcription factor Elf-4 Proteins 0.000 description 1
- 101000813141 Homo sapiens ETS-related transcription factor Elf-5 Proteins 0.000 description 1
- 101001050162 Homo sapiens Early endosome antigen 1 Proteins 0.000 description 1
- 101001049697 Homo sapiens Early growth response protein 1 Proteins 0.000 description 1
- 101000938790 Homo sapiens Eukaryotic peptide chain release factor subunit 1 Proteins 0.000 description 1
- 101001022168 Homo sapiens FYVE and coiled-coil domain-containing protein 1 Proteins 0.000 description 1
- 101000878291 Homo sapiens Factor in the germline alpha Proteins 0.000 description 1
- 101000951240 Homo sapiens GTP-binding protein Di-Ras1 Proteins 0.000 description 1
- 101000951231 Homo sapiens GTP-binding protein Di-Ras2 Proteins 0.000 description 1
- 101000856606 Homo sapiens GTP-binding protein GEM Proteins 0.000 description 1
- 101000581787 Homo sapiens GTP-binding protein REM 2 Proteins 0.000 description 1
- 101000574654 Homo sapiens GTP-binding protein Rit1 Proteins 0.000 description 1
- 101000620820 Homo sapiens GTPase ERas Proteins 0.000 description 1
- 101000584633 Homo sapiens GTPase HRas Proteins 0.000 description 1
- 101000584612 Homo sapiens GTPase KRas Proteins 0.000 description 1
- 101000744505 Homo sapiens GTPase NRas Proteins 0.000 description 1
- 101000615232 Homo sapiens Guanine nucleotide exchange factor DBS Proteins 0.000 description 1
- 101000775776 Homo sapiens Guanine nucleotide exchange factor VAV2 Proteins 0.000 description 1
- 101000596894 Homo sapiens High affinity nerve growth factor receptor Proteins 0.000 description 1
- 101001046967 Homo sapiens Histone acetyltransferase KAT2A Proteins 0.000 description 1
- 101000944179 Homo sapiens Histone acetyltransferase KAT6A Proteins 0.000 description 1
- 101000944174 Homo sapiens Histone acetyltransferase KAT6B Proteins 0.000 description 1
- 101000882390 Homo sapiens Histone acetyltransferase p300 Proteins 0.000 description 1
- 101001045846 Homo sapiens Histone-lysine N-methyltransferase 2A Proteins 0.000 description 1
- 101000785963 Homo sapiens Histone-lysine N-methyltransferase ASH1L Proteins 0.000 description 1
- 101001033728 Homo sapiens Histone-lysine N-methyltransferase MECOM Proteins 0.000 description 1
- 101000654725 Homo sapiens Histone-lysine N-methyltransferase SETD2 Proteins 0.000 description 1
- 101000634050 Homo sapiens Histone-lysine N-methyltransferase, H3 lysine-36 specific Proteins 0.000 description 1
- 101001052462 Homo sapiens Homeobox protein MIXL1 Proteins 0.000 description 1
- 101000596925 Homo sapiens Homeobox protein TGIF1 Proteins 0.000 description 1
- 101000634521 Homo sapiens Homeobox protein notochord Proteins 0.000 description 1
- 101000852815 Homo sapiens Insulin receptor Proteins 0.000 description 1
- 101000599951 Homo sapiens Insulin-like growth factor I Proteins 0.000 description 1
- 101001033233 Homo sapiens Interleukin-10 Proteins 0.000 description 1
- 101001019598 Homo sapiens Interleukin-17 receptor A Proteins 0.000 description 1
- 101001043817 Homo sapiens Interleukin-31 receptor subunit alpha Proteins 0.000 description 1
- 101001046587 Homo sapiens Krueppel-like factor 1 Proteins 0.000 description 1
- 101001038339 Homo sapiens LIM homeobox transcription factor 1-alpha Proteins 0.000 description 1
- 101001043562 Homo sapiens Low-density lipoprotein receptor-related protein 2 Proteins 0.000 description 1
- 101000914251 Homo sapiens Major centromere autoantigen B Proteins 0.000 description 1
- 101000962483 Homo sapiens Max dimerization protein 1 Proteins 0.000 description 1
- 101000798109 Homo sapiens Melanotransferrin Proteins 0.000 description 1
- 101000967073 Homo sapiens Metal regulatory transcription factor 1 Proteins 0.000 description 1
- 101001016777 Homo sapiens Microtubule-associated protein 9 Proteins 0.000 description 1
- 101001018196 Homo sapiens Mitogen-activated protein kinase kinase kinase 5 Proteins 0.000 description 1
- 101000957106 Homo sapiens Mitotic spindle assembly checkpoint protein MAD1 Proteins 0.000 description 1
- 101001030197 Homo sapiens Myelin transcription factor 1 Proteins 0.000 description 1
- 101000997252 Homo sapiens NF-kappa-B inhibitor-interacting Ras-like protein 2 Proteins 0.000 description 1
- 101000586302 Homo sapiens Oncostatin-M-specific receptor subunit beta Proteins 0.000 description 1
- 101000613575 Homo sapiens Paired box protein Pax-1 Proteins 0.000 description 1
- 101000702559 Homo sapiens Probable global transcription activator SNF2L2 Proteins 0.000 description 1
- 101001098868 Homo sapiens Proprotein convertase subtilisin/kexin type 9 Proteins 0.000 description 1
- 101000817237 Homo sapiens Protein ECT2 Proteins 0.000 description 1
- 101000742054 Homo sapiens Protein phosphatase 1D Proteins 0.000 description 1
- 101000775749 Homo sapiens Proto-oncogene vav Proteins 0.000 description 1
- 101100038201 Homo sapiens RAP1GAP gene Proteins 0.000 description 1
- 101001069891 Homo sapiens RAS guanyl-releasing protein 1 Proteins 0.000 description 1
- 101000870953 Homo sapiens RAS guanyl-releasing protein 4 Proteins 0.000 description 1
- 101001092166 Homo sapiens RPE-retinal G protein-coupled receptor Proteins 0.000 description 1
- 101000709121 Homo sapiens Ral guanine nucleotide dissociation stimulator-like 1 Proteins 0.000 description 1
- 101000709135 Homo sapiens Ral guanine nucleotide dissociation stimulator-like 2 Proteins 0.000 description 1
- 101001023826 Homo sapiens Ras GTPase-activating protein nGAP Proteins 0.000 description 1
- 101001092172 Homo sapiens Ras-GEF domain-containing family member 1A Proteins 0.000 description 1
- 101000744515 Homo sapiens Ras-related protein M-Ras Proteins 0.000 description 1
- 101000686227 Homo sapiens Ras-related protein R-Ras2 Proteins 0.000 description 1
- 101000584785 Homo sapiens Ras-related protein Rab-7a Proteins 0.000 description 1
- 101001130465 Homo sapiens Ras-related protein Ral-A Proteins 0.000 description 1
- 101001130458 Homo sapiens Ras-related protein Ral-B Proteins 0.000 description 1
- 101001130441 Homo sapiens Ras-related protein Rap-2a Proteins 0.000 description 1
- 101000580043 Homo sapiens Ras-specific guanine nucleotide-releasing factor 2 Proteins 0.000 description 1
- 101001061898 Homo sapiens RasGAP-activating-like protein 1 Proteins 0.000 description 1
- 101000581155 Homo sapiens Rho GTPase-activating protein 12 Proteins 0.000 description 1
- 101001091996 Homo sapiens Rho GTPase-activating protein 22 Proteins 0.000 description 1
- 101001091991 Homo sapiens Rho GTPase-activating protein 25 Proteins 0.000 description 1
- 101001091984 Homo sapiens Rho GTPase-activating protein 26 Proteins 0.000 description 1
- 101001106395 Homo sapiens Rho GTPase-activating protein 5 Proteins 0.000 description 1
- 101001106322 Homo sapiens Rho GTPase-activating protein 7 Proteins 0.000 description 1
- 101001106309 Homo sapiens Rho GTPase-activating protein 8 Proteins 0.000 description 1
- 101000927776 Homo sapiens Rho guanine nucleotide exchange factor 11 Proteins 0.000 description 1
- 101000927774 Homo sapiens Rho guanine nucleotide exchange factor 12 Proteins 0.000 description 1
- 101000669917 Homo sapiens Rho-associated protein kinase 1 Proteins 0.000 description 1
- 101000711466 Homo sapiens SAM pointed domain-containing Ets transcription factor Proteins 0.000 description 1
- 101100477520 Homo sapiens SHOX gene Proteins 0.000 description 1
- 101100204204 Homo sapiens STARD8 gene Proteins 0.000 description 1
- 101000708009 Homo sapiens Sentrin-specific protease 8 Proteins 0.000 description 1
- 101000836906 Homo sapiens Signal-induced proliferation-associated protein 1 Proteins 0.000 description 1
- 101000618138 Homo sapiens Sperm-associated antigen 4 protein Proteins 0.000 description 1
- 101000647991 Homo sapiens StAR-related lipid transfer protein 13 Proteins 0.000 description 1
- 101000652747 Homo sapiens Target of rapamycin complex 2 subunit MAPKAP1 Proteins 0.000 description 1
- 101000819111 Homo sapiens Trans-acting T-cell-specific transcription factor GATA-3 Proteins 0.000 description 1
- 101000702545 Homo sapiens Transcription activator BRG1 Proteins 0.000 description 1
- 101000881764 Homo sapiens Transcription elongation factor 1 homolog Proteins 0.000 description 1
- 101000904152 Homo sapiens Transcription factor E2F1 Proteins 0.000 description 1
- 101000895882 Homo sapiens Transcription factor E2F4 Proteins 0.000 description 1
- 101000819074 Homo sapiens Transcription factor GATA-4 Proteins 0.000 description 1
- 101000825182 Homo sapiens Transcription factor Spi-B Proteins 0.000 description 1
- 101000825161 Homo sapiens Transcription factor Spi-C Proteins 0.000 description 1
- 101001010792 Homo sapiens Transcriptional regulator ERG Proteins 0.000 description 1
- 101000685104 Homo sapiens Transcriptional repressor scratch 1 Proteins 0.000 description 1
- 101000764872 Homo sapiens Transient receptor potential cation channel subfamily A member 1 Proteins 0.000 description 1
- 101000649014 Homo sapiens Triple functional domain protein Proteins 0.000 description 1
- 101000795659 Homo sapiens Tuberin Proteins 0.000 description 1
- 101001087394 Homo sapiens Tyrosine-protein phosphatase non-receptor type 1 Proteins 0.000 description 1
- 101000617285 Homo sapiens Tyrosine-protein phosphatase non-receptor type 6 Proteins 0.000 description 1
- 101100155061 Homo sapiens UBE3A gene Proteins 0.000 description 1
- 238000006736 Huisgen cycloaddition reaction Methods 0.000 description 1
- 102000004157 Hydrolases Human genes 0.000 description 1
- 108090000604 Hydrolases Proteins 0.000 description 1
- 102100031612 Hypermethylated in cancer 1 protein Human genes 0.000 description 1
- 101710133850 Hypermethylated in cancer 1 protein Proteins 0.000 description 1
- 101150012059 IKBKG gene Proteins 0.000 description 1
- 108010050332 IQ motif containing GTPase activating protein 1 Proteins 0.000 description 1
- XDXDZDZNSLXDNA-TZNDIEGXSA-N Idarubicin Chemical compound C1[C@H](N)[C@H](O)[C@H](C)O[C@H]1O[C@@H]1C2=C(O)C(C(=O)C3=CC=CC=C3C3=O)=C3C(O)=C2C[C@@](O)(C(C)=O)C1 XDXDZDZNSLXDNA-TZNDIEGXSA-N 0.000 description 1
- XDXDZDZNSLXDNA-UHFFFAOYSA-N Idarubicin Natural products C1C(N)C(O)C(C)OC1OC1C2=C(O)C(C(=O)C3=CC=CC=C3C3=O)=C3C(O)=C2CC(O)(C(C)=O)C1 XDXDZDZNSLXDNA-UHFFFAOYSA-N 0.000 description 1
- 102100036721 Insulin receptor Human genes 0.000 description 1
- 102100037852 Insulin-like growth factor I Human genes 0.000 description 1
- 102100039068 Interleukin-10 Human genes 0.000 description 1
- 102000013462 Interleukin-12 Human genes 0.000 description 1
- 108010065805 Interleukin-12 Proteins 0.000 description 1
- 102100035018 Interleukin-17 receptor A Human genes 0.000 description 1
- 102100021594 Interleukin-31 receptor subunit alpha Human genes 0.000 description 1
- 102000042838 JAK family Human genes 0.000 description 1
- 108091082332 JAK family Proteins 0.000 description 1
- 102100022248 Krueppel-like factor 1 Human genes 0.000 description 1
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 1
- 102100040290 LIM homeobox transcription factor 1-alpha Human genes 0.000 description 1
- 102100025833 Major centromere autoantigen B Human genes 0.000 description 1
- 102100039185 Max dimerization protein 1 Human genes 0.000 description 1
- 102100040514 Metal regulatory transcription factor 1 Human genes 0.000 description 1
- 108060004795 Methyltransferase Proteins 0.000 description 1
- 102000016397 Methyltransferase Human genes 0.000 description 1
- 108010050345 Microphthalmia-Associated Transcription Factor Proteins 0.000 description 1
- 102100030157 Microphthalmia-associated transcription factor Human genes 0.000 description 1
- 102100032485 Microtubule-associated protein 9 Human genes 0.000 description 1
- 102100033127 Mitogen-activated protein kinase kinase kinase 5 Human genes 0.000 description 1
- 102100025748 Mothers against decapentaplegic homolog 3 Human genes 0.000 description 1
- 101710143111 Mothers against decapentaplegic homolog 3 Proteins 0.000 description 1
- 101000590284 Mus musculus 26S proteasome non-ATPase regulatory subunit 14 Proteins 0.000 description 1
- 101100381525 Mus musculus Bcl6 gene Proteins 0.000 description 1
- 101100243940 Mus musculus Phox2a gene Proteins 0.000 description 1
- 101100524555 Mus musculus Rgl1 gene Proteins 0.000 description 1
- 101100087591 Mus musculus Rictor gene Proteins 0.000 description 1
- 101100155062 Mus musculus Ube3a gene Proteins 0.000 description 1
- SGSSKEDGVONRGC-UHFFFAOYSA-N N(2)-methylguanine Chemical compound O=C1NC(NC)=NC2=C1N=CN2 SGSSKEDGVONRGC-UHFFFAOYSA-N 0.000 description 1
- 108700026495 N-Myc Proto-Oncogene Proteins 0.000 description 1
- TWUKMYQCZJTUIC-UHFFFAOYSA-N N-[6-[2-cyanoethyl-[di(propan-2-yl)amino]-dihydroxy-lambda5-phosphanyl]hexyl]-2,2,2-trifluoroacetamide Chemical compound N#CCCP(O)(O)(N(C(C)C)C(C)C)CCCCCCNC(=O)C(F)(F)F TWUKMYQCZJTUIC-UHFFFAOYSA-N 0.000 description 1
- PCLIMKBDDGJMGD-UHFFFAOYSA-N N-bromosuccinimide Chemical compound BrN1C(=O)CCC1=O PCLIMKBDDGJMGD-UHFFFAOYSA-N 0.000 description 1
- 102100030124 N-myc proto-oncogene protein Human genes 0.000 description 1
- 108010057466 NF-kappa B Proteins 0.000 description 1
- 102000003945 NF-kappa B Human genes 0.000 description 1
- 102100034325 NF-kappa-B inhibitor-interacting Ras-like protein 2 Human genes 0.000 description 1
- 101150111783 NTRK1 gene Proteins 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 108091028043 Nucleic acid sequence Proteins 0.000 description 1
- JJAHTWIKCUJRDK-XYPYZODXSA-N O=C([C@@H]1CC[C@@H](CN2C(C=CC2=O)=O)CC1)ON1C(=O)CCC1=O Chemical compound O=C([C@@H]1CC[C@@H](CN2C(C=CC2=O)=O)CC1)ON1C(=O)CCC1=O JJAHTWIKCUJRDK-XYPYZODXSA-N 0.000 description 1
- 102100030127 Obscurin Human genes 0.000 description 1
- 101710194880 Obscurin Proteins 0.000 description 1
- 108010038807 Oligopeptides Proteins 0.000 description 1
- 102000015636 Oligopeptides Human genes 0.000 description 1
- 102100030098 Oncostatin-M-specific receptor subunit beta Human genes 0.000 description 1
- ZCQWOFVYLHDMMC-UHFFFAOYSA-N Oxazole Chemical compound C1=COC=N1 ZCQWOFVYLHDMMC-UHFFFAOYSA-N 0.000 description 1
- 102100040460 P2X purinoceptor 3 Human genes 0.000 description 1
- 101710189970 P2X purinoceptor 3 Proteins 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 102100024894 PR domain zinc finger protein 1 Human genes 0.000 description 1
- 102100040851 Paired box protein Pax-1 Human genes 0.000 description 1
- PTPHDVKWAYIFRX-UHFFFAOYSA-N Palmatine Natural products C1C2=C(OC)C(OC)=CC=C2C=C2N1CCC1=C2C=C(OC)C(OC)=C1 PTPHDVKWAYIFRX-UHFFFAOYSA-N 0.000 description 1
- 102100038633 Phosphatidylinositol 3,4,5-trisphosphate-dependent Rac exchanger 2 protein Human genes 0.000 description 1
- 101710188183 Phosphatidylinositol 3,4,5-trisphosphate-dependent Rac exchanger 2 protein Proteins 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- 108010051742 Platelet-Derived Growth Factor beta Receptor Proteins 0.000 description 1
- 102100026547 Platelet-derived growth factor receptor beta Human genes 0.000 description 1
- 102000012338 Poly(ADP-ribose) Polymerases Human genes 0.000 description 1
- 108010061844 Poly(ADP-ribose) Polymerases Proteins 0.000 description 1
- 229920000776 Poly(Adenosine diphosphate-ribose) polymerase Polymers 0.000 description 1
- 239000002202 Polyethylene glycol Substances 0.000 description 1
- 108010009975 Positive Regulatory Domain I-Binding Factor 1 Proteins 0.000 description 1
- 102100022807 Potassium voltage-gated channel subfamily H member 2 Human genes 0.000 description 1
- 101710163352 Potassium voltage-gated channel subfamily H member 4 Proteins 0.000 description 1
- 101710163348 Potassium voltage-gated channel subfamily H member 8 Proteins 0.000 description 1
- 102100031021 Probable global transcription activator SNF2L2 Human genes 0.000 description 1
- WDVSHHCDHLJJJR-UHFFFAOYSA-N Proflavine Chemical compound C1=CC(N)=CC2=NC3=CC(N)=CC=C3C=C21 WDVSHHCDHLJJJR-UHFFFAOYSA-N 0.000 description 1
- 102100038955 Proprotein convertase subtilisin/kexin type 9 Human genes 0.000 description 1
- 102100040437 Protein ECT2 Human genes 0.000 description 1
- 102100038675 Protein phosphatase 1D Human genes 0.000 description 1
- 102100024601 Protein tyrosine phosphatase type IVA 3 Human genes 0.000 description 1
- 101710138647 Protein tyrosine phosphatase type IVA 3 Proteins 0.000 description 1
- 102100032190 Proto-oncogene vav Human genes 0.000 description 1
- 238000004618 QSPR study Methods 0.000 description 1
- 108010010469 Qa-SNARE Proteins Proteins 0.000 description 1
- 102100034220 RAS guanyl-releasing protein 1 Human genes 0.000 description 1
- 102100033445 RAS guanyl-releasing protein 4 Human genes 0.000 description 1
- 101150020518 RHEB gene Proteins 0.000 description 1
- 101710086015 RNA ligase Proteins 0.000 description 1
- 238000010240 RT-PCR analysis Methods 0.000 description 1
- 102100023320 Ral guanine nucleotide dissociation stimulator Human genes 0.000 description 1
- 102100032665 Ral guanine nucleotide dissociation stimulator-like 1 Human genes 0.000 description 1
- 102100032786 Ral guanine nucleotide dissociation stimulator-like 2 Human genes 0.000 description 1
- 102100035582 Ral-GDS-related protein Human genes 0.000 description 1
- 102100038914 RalA-binding protein 1 Human genes 0.000 description 1
- 101150041852 Ralbp1 gene Proteins 0.000 description 1
- 101150015043 Ralgds gene Proteins 0.000 description 1
- 238000001069 Raman spectroscopy Methods 0.000 description 1
- 102100040088 Rap1 GTPase-activating protein 1 Human genes 0.000 description 1
- 102100031426 Ras GTPase-activating protein 1 Human genes 0.000 description 1
- 102100035410 Ras GTPase-activating protein nGAP Human genes 0.000 description 1
- 102100034419 Ras GTPase-activating-like protein IQGAP1 Human genes 0.000 description 1
- 102100035771 Ras-GEF domain-containing family member 1A Human genes 0.000 description 1
- 102100039789 Ras-related protein M-Ras Human genes 0.000 description 1
- 102100025003 Ras-related protein R-Ras2 Human genes 0.000 description 1
- 102100034485 Ras-related protein Rab-2A Human genes 0.000 description 1
- 102100030019 Ras-related protein Rab-7a Human genes 0.000 description 1
- 102100031424 Ras-related protein Ral-A Human genes 0.000 description 1
- 102100031425 Ras-related protein Ral-B Human genes 0.000 description 1
- 102100031420 Ras-related protein Rap-2a Human genes 0.000 description 1
- 102100027555 Ras-specific guanine nucleotide-releasing factor 2 Human genes 0.000 description 1
- 102100029554 RasGAP-activating-like protein 1 Human genes 0.000 description 1
- 108091030145 Retron msr RNA Proteins 0.000 description 1
- 102100027663 Rho GTPase-activating protein 12 Human genes 0.000 description 1
- 102100035757 Rho GTPase-activating protein 22 Human genes 0.000 description 1
- 102100035759 Rho GTPase-activating protein 25 Human genes 0.000 description 1
- 102100035744 Rho GTPase-activating protein 26 Human genes 0.000 description 1
- 102100021428 Rho GTPase-activating protein 5 Human genes 0.000 description 1
- 102100021446 Rho GTPase-activating protein 7 Human genes 0.000 description 1
- 102100021443 Rho GTPase-activating protein 8 Human genes 0.000 description 1
- 102100033194 Rho guanine nucleotide exchange factor 11 Human genes 0.000 description 1
- 102100033193 Rho guanine nucleotide exchange factor 12 Human genes 0.000 description 1
- 102100021709 Rho guanine nucleotide exchange factor 4 Human genes 0.000 description 1
- 101710128386 Rho guanine nucleotide exchange factor 4 Proteins 0.000 description 1
- 102100039313 Rho-associated protein kinase 1 Human genes 0.000 description 1
- 108010055623 S-Phase Kinase-Associated Proteins Proteins 0.000 description 1
- 102000000341 S-Phase Kinase-Associated Proteins Human genes 0.000 description 1
- 102100034018 SAM pointed domain-containing Ets transcription factor Human genes 0.000 description 1
- 102000000583 SNARE Proteins Human genes 0.000 description 1
- 108010041948 SNARE Proteins Proteins 0.000 description 1
- 108700022176 SOS1 Proteins 0.000 description 1
- 101150099493 STAT3 gene Proteins 0.000 description 1
- 101100197320 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) RPL35A gene Proteins 0.000 description 1
- 101100042881 Sambucus nigra SNA-I gene Proteins 0.000 description 1
- 102100031407 Sentrin-specific protease 8 Human genes 0.000 description 1
- 108700025071 Short Stature Homeobox Proteins 0.000 description 1
- 102100029992 Short stature homeobox protein Human genes 0.000 description 1
- 102100027163 Signal-induced proliferation-associated protein 1 Human genes 0.000 description 1
- 102100032929 Son of sevenless homolog 1 Human genes 0.000 description 1
- 101150100839 Sos1 gene Proteins 0.000 description 1
- 102100021907 Sperm-associated antigen 4 protein Human genes 0.000 description 1
- 102100037615 Spermatogenesis-associated protein 13 Human genes 0.000 description 1
- 101710202308 Spermatogenesis-associated protein 13 Proteins 0.000 description 1
- 102100025252 StAR-related lipid transfer protein 13 Human genes 0.000 description 1
- 102100026755 StAR-related lipid transfer protein 8 Human genes 0.000 description 1
- 108010057722 Synaptosomal-Associated Protein 25 Proteins 0.000 description 1
- 102000004183 Synaptosomal-Associated Protein 25 Human genes 0.000 description 1
- 102000050389 Syntaxin Human genes 0.000 description 1
- 102100030904 Target of rapamycin complex 2 subunit MAPKAP1 Human genes 0.000 description 1
- 102100033740 Tenomodulin Human genes 0.000 description 1
- 108020000411 Toll-like receptor Proteins 0.000 description 1
- 102100021386 Trans-acting T-cell-specific transcription factor GATA-3 Human genes 0.000 description 1
- 102100037116 Transcription elongation factor 1 homolog Human genes 0.000 description 1
- 102100024026 Transcription factor E2F1 Human genes 0.000 description 1
- 102100021783 Transcription factor E2F4 Human genes 0.000 description 1
- 102100021380 Transcription factor GATA-4 Human genes 0.000 description 1
- 102100022281 Transcription factor Spi-B Human genes 0.000 description 1
- 102100022285 Transcription factor Spi-C Human genes 0.000 description 1
- 102100023185 Transcriptional repressor scratch 1 Human genes 0.000 description 1
- 101710204707 Transforming growth factor-beta receptor-associated protein 1 Proteins 0.000 description 1
- 102100026186 Transient receptor potential cation channel subfamily A member 1 Human genes 0.000 description 1
- 108010023649 Tripartite Motif Proteins Proteins 0.000 description 1
- 102100028101 Triple functional domain protein Human genes 0.000 description 1
- 102100031638 Tuberin Human genes 0.000 description 1
- 102100033732 Tumor necrosis factor receptor superfamily member 1A Human genes 0.000 description 1
- 101710187743 Tumor necrosis factor receptor superfamily member 1A Proteins 0.000 description 1
- 102100033733 Tumor necrosis factor receptor superfamily member 1B Human genes 0.000 description 1
- 101710187830 Tumor necrosis factor receptor superfamily member 1B Proteins 0.000 description 1
- 102100040245 Tumor necrosis factor receptor superfamily member 5 Human genes 0.000 description 1
- 101710102803 Tumor suppressor ARF Proteins 0.000 description 1
- 102100033001 Tyrosine-protein phosphatase non-receptor type 1 Human genes 0.000 description 1
- 102100033019 Tyrosine-protein phosphatase non-receptor type 11 Human genes 0.000 description 1
- 101710116241 Tyrosine-protein phosphatase non-receptor type 11 Proteins 0.000 description 1
- 102100021657 Tyrosine-protein phosphatase non-receptor type 6 Human genes 0.000 description 1
- 102100030434 Ubiquitin-protein ligase E3A Human genes 0.000 description 1
- 102100035071 Vimentin Human genes 0.000 description 1
- 108010065472 Vimentin Proteins 0.000 description 1
- 102000056014 X-linked Nuclear Human genes 0.000 description 1
- 108700042462 X-linked Nuclear Proteins 0.000 description 1
- 108010016200 Zinc Finger Protein GLI1 Proteins 0.000 description 1
- 102100035535 Zinc finger protein GLI1 Human genes 0.000 description 1
- 238000002835 absorbance Methods 0.000 description 1
- BGLGAKMTYHWWKW-UHFFFAOYSA-N acridine yellow Chemical compound [H+].[Cl-].CC1=C(N)C=C2N=C(C=C(C(C)=C3)N)C3=CC2=C1 BGLGAKMTYHWWKW-UHFFFAOYSA-N 0.000 description 1
- 150000001251 acridines Chemical class 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 238000001261 affinity purification Methods 0.000 description 1
- 230000002776 aggregation Effects 0.000 description 1
- 238000004220 aggregation Methods 0.000 description 1
- 125000003172 aldehyde group Chemical group 0.000 description 1
- 150000003797 alkaloid derivatives Chemical class 0.000 description 1
- 150000001336 alkenes Chemical group 0.000 description 1
- 125000003545 alkoxy group Chemical group 0.000 description 1
- 125000003282 alkyl amino group Chemical group 0.000 description 1
- 125000000217 alkyl group Chemical group 0.000 description 1
- 229940045714 alkyl sulfonate alkylating agent Drugs 0.000 description 1
- 150000008052 alkyl sulfonates Chemical class 0.000 description 1
- 125000000304 alkynyl group Chemical group 0.000 description 1
- 229960000473 altretamine Drugs 0.000 description 1
- 229940059260 amidate Drugs 0.000 description 1
- 150000001408 amides Chemical class 0.000 description 1
- XCPGHVQEEXUHNC-UHFFFAOYSA-N amsacrine Chemical compound COC1=CC(NS(C)(=O)=O)=CC=C1NC1=C(C=CC=C2)C2=NC2=CC=CC=C12 XCPGHVQEEXUHNC-UHFFFAOYSA-N 0.000 description 1
- 229960001220 amsacrine Drugs 0.000 description 1
- 108010080146 androgen receptors Proteins 0.000 description 1
- 229940045799 anthracyclines and related substance Drugs 0.000 description 1
- 229940045719 antineoplastic alkylating agent nitrosoureas Drugs 0.000 description 1
- 229940027998 antiseptic and disinfectant acridine derivative Drugs 0.000 description 1
- 125000005602 azabenzimidazolyl group Chemical group 0.000 description 1
- 125000005334 azaindolyl group Chemical group N1N=C(C2=CC=CC=C12)* 0.000 description 1
- VSRXQHXAPYXROS-UHFFFAOYSA-N azanide;cyclobutane-1,1-dicarboxylic acid;platinum(2+) Chemical compound [NH2-].[NH2-].[Pt+2].OC(=O)C1(C(O)=O)CCC1 VSRXQHXAPYXROS-UHFFFAOYSA-N 0.000 description 1
- IVRMZWNICZWHMI-UHFFFAOYSA-N azide group Chemical group [N-]=[N+]=[N-] IVRMZWNICZWHMI-UHFFFAOYSA-N 0.000 description 1
- 150000001541 aziridines Chemical class 0.000 description 1
- 230000033590 base-excision repair Effects 0.000 description 1
- 125000003785 benzimidazolyl group Chemical group N1=C(NC2=C1C=CC=C2)* 0.000 description 1
- RWCCWEUUXYIKHB-UHFFFAOYSA-N benzophenone Chemical group C=1C=CC=CC=1C(=O)C1=CC=CC=C1 RWCCWEUUXYIKHB-UHFFFAOYSA-N 0.000 description 1
- 239000012965 benzophenone Substances 0.000 description 1
- 150000008366 benzophenones Chemical class 0.000 description 1
- 125000001797 benzyl group Chemical group [H]C1=C([H])C([H])=C(C([H])=C1[H])C([H])([H])* 0.000 description 1
- DRTQHJPVMGBUCF-PSQAKQOGSA-N beta-L-uridine Natural products O[C@H]1[C@@H](O)[C@H](CO)O[C@@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-PSQAKQOGSA-N 0.000 description 1
- 238000010256 biochemical assay Methods 0.000 description 1
- 238000007622 bioinformatic analysis Methods 0.000 description 1
- 238000006664 bond formation reaction Methods 0.000 description 1
- 229910052796 boron Inorganic materials 0.000 description 1
- 125000001246 bromo group Chemical class Br* 0.000 description 1
- 229960002092 busulfan Drugs 0.000 description 1
- 125000003917 carbamoyl group Chemical group [H]N([H])C(*)=O 0.000 description 1
- 125000002837 carbocyclic group Chemical group 0.000 description 1
- 229960004562 carboplatin Drugs 0.000 description 1
- 150000001732 carboxylic acid derivatives Chemical class 0.000 description 1
- 125000002843 carboxylic acid group Chemical group 0.000 description 1
- 239000003054 catalyst Substances 0.000 description 1
- 229960004630 chlorambucil Drugs 0.000 description 1
- JCKYGMPEJWAADB-UHFFFAOYSA-N chlorambucil Chemical group OC(=O)CCCC1=CC=C(N(CCCl)CCCl)C=C1 JCKYGMPEJWAADB-UHFFFAOYSA-N 0.000 description 1
- 239000000460 chlorine Substances 0.000 description 1
- 229910052801 chlorine Inorganic materials 0.000 description 1
- 125000001309 chloro group Chemical class Cl* 0.000 description 1
- 230000009137 competitive binding Effects 0.000 description 1
- 239000002131 composite material Substances 0.000 description 1
- 238000010205 computational analysis Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 238000010219 correlation analysis Methods 0.000 description 1
- 239000003431 cross linking reagent Substances 0.000 description 1
- 239000013078 crystal Substances 0.000 description 1
- 125000004093 cyano group Chemical group *C#N 0.000 description 1
- 125000004122 cyclic group Chemical group 0.000 description 1
- 238000006352 cycloaddition reaction Methods 0.000 description 1
- 150000001925 cycloalkenes Chemical class 0.000 description 1
- 125000000392 cycloalkenyl group Chemical group 0.000 description 1
- 125000000753 cycloalkyl group Chemical group 0.000 description 1
- 125000000596 cyclohexenyl group Chemical group C1(=CCCCC1)* 0.000 description 1
- 125000000113 cyclohexyl group Chemical group [H]C1([H])C([H])([H])C([H])([H])C([H])(*)C([H])([H])C1([H])[H] 0.000 description 1
- 125000001559 cyclopropyl group Chemical group [H]C1([H])C([H])([H])C1([H])* 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 150000004985 diamines Chemical class 0.000 description 1
- 150000001993 dienes Chemical class 0.000 description 1
- NAGJZTKCGNOGPW-UHFFFAOYSA-K dioxido-sulfanylidene-sulfido-$l^{5}-phosphane Chemical compound [O-]P([O-])([S-])=S NAGJZTKCGNOGPW-UHFFFAOYSA-K 0.000 description 1
- HGCZFZRDBQGZEF-UHFFFAOYSA-N diphenylmethanone;1h-pyrimidine-2,4-dione Chemical group O=C1C=CNC(=O)N1.C=1C=CC=CC=1C(=O)C1=CC=CC=C1 HGCZFZRDBQGZEF-UHFFFAOYSA-N 0.000 description 1
- ZBZKGHJCOHBOCB-UHFFFAOYSA-N diphenylmethanone;isothiocyanic acid Chemical compound N=C=S.C=1C=CC=CC=1C(=O)C1=CC=CC=C1 ZBZKGHJCOHBOCB-UHFFFAOYSA-N 0.000 description 1
- NAGJZTKCGNOGPW-UHFFFAOYSA-N dithiophosphoric acid Chemical group OP(O)(S)=S NAGJZTKCGNOGPW-UHFFFAOYSA-N 0.000 description 1
- AFOSIXZFDONLBT-UHFFFAOYSA-N divinyl sulfone Chemical group C=CS(=O)(=O)C=C AFOSIXZFDONLBT-UHFFFAOYSA-N 0.000 description 1
- 239000012039 electrophile Substances 0.000 description 1
- 229960001904 epirubicin Drugs 0.000 description 1
- 150000002118 epoxides Chemical class 0.000 description 1
- ZMMJGEGLRURXTF-UHFFFAOYSA-N ethidium bromide Chemical compound [Br-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CC)=C1C1=CC=CC=C1 ZMMJGEGLRURXTF-UHFFFAOYSA-N 0.000 description 1
- 229960005542 ethidium bromide Drugs 0.000 description 1
- NPUKDXXFDDZOKR-LLVKDONJSA-N etomidate Chemical compound CCOC(=O)C1=CN=CN1[C@H](C)C1=CC=CC=C1 NPUKDXXFDDZOKR-LLVKDONJSA-N 0.000 description 1
- 230000006846 excision repair Effects 0.000 description 1
- 102000013165 exonuclease Human genes 0.000 description 1
- 230000002349 favourable effect Effects 0.000 description 1
- 102000052178 fibroblast growth factor receptor activity proteins Human genes 0.000 description 1
- NKKLCOFTJVNYAQ-UHFFFAOYSA-N formamidopyrimidine Chemical compound O=CNC1=CN=CN=C1 NKKLCOFTJVNYAQ-UHFFFAOYSA-N 0.000 description 1
- 239000003292 glue Substances 0.000 description 1
- 125000003827 glycol group Chemical group 0.000 description 1
- RQFCJASXJCIDSX-UUOKFMHZSA-N guanosine 5'-monophosphate Chemical compound C1=2NC(N)=NC(=O)C=2N=CN1[C@@H]1O[C@H](COP(O)(O)=O)[C@@H](O)[C@H]1O RQFCJASXJCIDSX-UUOKFMHZSA-N 0.000 description 1
- 229910052736 halogen Inorganic materials 0.000 description 1
- 150000002367 halogens Chemical group 0.000 description 1
- 238000010438 heat treatment Methods 0.000 description 1
- 125000004404 heteroalkyl group Chemical group 0.000 description 1
- 125000004366 heterocycloalkenyl group Chemical group 0.000 description 1
- 125000000592 heterocycloalkyl group Chemical group 0.000 description 1
- UUVWYPNAQBNQJQ-UHFFFAOYSA-N hexamethylmelamine Chemical compound CN(C)C1=NC(N(C)C)=NC(N(C)C)=N1 UUVWYPNAQBNQJQ-UHFFFAOYSA-N 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 150000002429 hydrazines Chemical class 0.000 description 1
- 125000004435 hydrogen atom Chemical group [H]* 0.000 description 1
- 150000002443 hydroxylamines Chemical class 0.000 description 1
- 229960000908 idarubicin Drugs 0.000 description 1
- 229960001101 ifosfamide Drugs 0.000 description 1
- HOMGKSMUEGBAAB-UHFFFAOYSA-N ifosfamide Chemical compound ClCCNP1(=O)OCCCN1CCCl HOMGKSMUEGBAAB-UHFFFAOYSA-N 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 125000003453 indazolyl group Chemical class N1N=C(C2=C1C=CC=C2)* 0.000 description 1
- PZOUSPYUWWUPPK-UHFFFAOYSA-N indole Natural products CC1=CC=CC2=C1C=CN2 PZOUSPYUWWUPPK-UHFFFAOYSA-N 0.000 description 1
- RKJUIXBNRJVNHR-UHFFFAOYSA-N indolenine Natural products C1=CC=C2CC=NC2=C1 RKJUIXBNRJVNHR-UHFFFAOYSA-N 0.000 description 1
- 125000001041 indolyl group Chemical group 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 150000002484 inorganic compounds Chemical class 0.000 description 1
- 229910010272 inorganic material Inorganic materials 0.000 description 1
- PGLTVOMIXTUURA-UHFFFAOYSA-N iodoacetamide Chemical group NC(=O)CI PGLTVOMIXTUURA-UHFFFAOYSA-N 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 230000002427 irreversible effect Effects 0.000 description 1
- 125000000904 isoindolyl group Chemical group C=1(NC=C2C=CC=CC12)* 0.000 description 1
- 125000002183 isoquinolinyl group Chemical group C1(=NC=CC2=CC=CC=C12)* 0.000 description 1
- 230000000155 isotopic effect Effects 0.000 description 1
- CTAPFRYPJLPFDF-UHFFFAOYSA-N isoxazole Chemical compound C=1C=NOC=1 CTAPFRYPJLPFDF-UHFFFAOYSA-N 0.000 description 1
- 238000002898 library design Methods 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 238000004895 liquid chromatography mass spectrometry Methods 0.000 description 1
- 229960002247 lomustine Drugs 0.000 description 1
- 238000003670 luciferase enzyme activity assay Methods 0.000 description 1
- 238000004020 luminiscence type Methods 0.000 description 1
- 125000005439 maleimidyl group Chemical group C1(C=CC(N1*)=O)=O 0.000 description 1
- 210000004962 mammalian cell Anatomy 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 229960004961 mechlorethamine Drugs 0.000 description 1
- HAWPXGHAZFHHAD-UHFFFAOYSA-N mechlorethamine Chemical compound ClCCN(C)CCCl HAWPXGHAZFHHAD-UHFFFAOYSA-N 0.000 description 1
- 229960001924 melphalan Drugs 0.000 description 1
- SGDBTWWWUNNDEQ-LBPRGKRZSA-N melphalan Chemical group OC(=O)[C@@H](N)CC1=CC=C(N(CCCl)CCCl)C=C1 SGDBTWWWUNNDEQ-LBPRGKRZSA-N 0.000 description 1
- 238000002844 melting Methods 0.000 description 1
- 230000008018 melting Effects 0.000 description 1
- 239000012528 membrane Substances 0.000 description 1
- 125000000956 methoxy group Chemical group [H]C([H])([H])O* 0.000 description 1
- 125000001160 methoxycarbonyl group Chemical group [H]C([H])([H])OC(*)=O 0.000 description 1
- 230000011987 methylation Effects 0.000 description 1
- 238000007069 methylation reaction Methods 0.000 description 1
- 230000003278 mimic effect Effects 0.000 description 1
- 230000033607 mismatch repair Effects 0.000 description 1
- 229960004857 mitomycin Drugs 0.000 description 1
- 239000003607 modifier Substances 0.000 description 1
- 239000002062 molecular scaffold Substances 0.000 description 1
- NFVJNJQRWPQVOA-UHFFFAOYSA-N n-[2-chloro-5-(trifluoromethyl)phenyl]-2-[3-(4-ethyl-5-ethylsulfanyl-1,2,4-triazol-3-yl)piperidin-1-yl]acetamide Chemical compound CCN1C(SCC)=NN=C1C1CN(CC(=O)NC=2C(=CC=C(C=2)C(F)(F)F)Cl)CCC1 NFVJNJQRWPQVOA-UHFFFAOYSA-N 0.000 description 1
- 230000001537 neural effect Effects 0.000 description 1
- 230000036963 noncompetitive effect Effects 0.000 description 1
- 230000020520 nucleotide-excision repair Effects 0.000 description 1
- VOIFONALGHJFTH-UHFFFAOYSA-N o-(2,5-dioxopyrrolidin-1-yl) 3-oxobutanethioate Chemical compound CC(=O)CC(=S)ON1C(=O)CCC1=O VOIFONALGHJFTH-UHFFFAOYSA-N 0.000 description 1
- 229920001542 oligosaccharide Polymers 0.000 description 1
- 150000002482 oligosaccharides Chemical class 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 150000002894 organic compounds Chemical class 0.000 description 1
- 229950008017 ormaplatin Drugs 0.000 description 1
- AHLPHDHHMVZTML-BYPYZUCNSA-N ornithyl group Chemical group N[C@@H](CCCN)C(=O)O AHLPHDHHMVZTML-BYPYZUCNSA-N 0.000 description 1
- DWAFYCQODLXJNR-BNTLRKBRSA-L oxaliplatin Chemical compound O1C(=O)C(=O)O[Pt]11N[C@@H]2CCCC[C@H]2N1 DWAFYCQODLXJNR-BNTLRKBRSA-L 0.000 description 1
- 229960001756 oxaliplatin Drugs 0.000 description 1
- QUCQEUCGKKTEBI-UHFFFAOYSA-N palmatine Chemical compound COC1=CC=C2C=C(C3=C(C=C(C(=C3)OC)OC)CC3)[N+]3=CC2=C1OC QUCQEUCGKKTEBI-UHFFFAOYSA-N 0.000 description 1
- 230000000144 pharmacologic effect Effects 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 1
- UEZVMMHDMIWARA-UHFFFAOYSA-M phosphonate Chemical compound [O-]P(=O)=O UEZVMMHDMIWARA-UHFFFAOYSA-M 0.000 description 1
- 125000004437 phosphorous atom Chemical group 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 102000020233 phosphotransferase Human genes 0.000 description 1
- 230000010399 physical interaction Effects 0.000 description 1
- 229950005566 picoplatin Drugs 0.000 description 1
- 125000005936 piperidyl group Chemical group 0.000 description 1
- 230000010287 polarization Effects 0.000 description 1
- 229920001515 polyalkylene glycol Polymers 0.000 description 1
- 125000003367 polycyclic group Chemical group 0.000 description 1
- 229920000151 polyglycol Polymers 0.000 description 1
- 239000010695 polyglycol Substances 0.000 description 1
- 102000040430 polynucleotide Human genes 0.000 description 1
- 108091033319 polynucleotide Proteins 0.000 description 1
- 239000002157 polynucleotide Substances 0.000 description 1
- 229920000098 polyolefin Chemical group 0.000 description 1
- 229920001184 polypeptide Polymers 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 239000002243 precursor Substances 0.000 description 1
- 229960000286 proflavine Drugs 0.000 description 1
- 238000000159 protein binding assay Methods 0.000 description 1
- GNGHGFVFONSDEL-UHFFFAOYSA-N pyrazine;pyridazine Chemical compound C1=CC=NN=C1.C1=CN=CC=N1 GNGHGFVFONSDEL-UHFFFAOYSA-N 0.000 description 1
- 125000004076 pyridyl group Chemical group 0.000 description 1
- 125000000719 pyrrolidinyl group Chemical group 0.000 description 1
- ZVJHJDDKYZXRJI-UHFFFAOYSA-N pyrroline Natural products C1CC=NC1 ZVJHJDDKYZXRJI-UHFFFAOYSA-N 0.000 description 1
- 239000002096 quantum dot Substances 0.000 description 1
- 125000002943 quinolinyl group Chemical group N1=C(C=CC2=CC=CC=C12)* 0.000 description 1
- 108010067765 rab2 GTP Binding protein Proteins 0.000 description 1
- 239000000700 radioactive tracer Substances 0.000 description 1
- 238000006268 reductive amination reaction Methods 0.000 description 1
- 238000009877 rendering Methods 0.000 description 1
- 239000011347 resin Substances 0.000 description 1
- 229920005989 resin Polymers 0.000 description 1
- 108091008146 restriction endonucleases Proteins 0.000 description 1
- PYWVYCXTNDRMGF-UHFFFAOYSA-N rhodamine B Chemical compound [Cl-].C=12C=CC(=[N+](CC)CC)C=C2OC2=CC(N(CC)CC)=CC=C2C=1C1=CC=CC=C1C(O)=O PYWVYCXTNDRMGF-UHFFFAOYSA-N 0.000 description 1
- 238000007142 ring opening reaction Methods 0.000 description 1
- 239000011435 rock Substances 0.000 description 1
- 229940084560 sanguinarine Drugs 0.000 description 1
- YZRQUTZNTDAYPJ-UHFFFAOYSA-N sanguinarine pseudobase Natural products C1=C2OCOC2=CC2=C3N(C)C(O)C4=C(OCO5)C5=CC=C4C3=CC=C21 YZRQUTZNTDAYPJ-UHFFFAOYSA-N 0.000 description 1
- 238000009738 saturating Methods 0.000 description 1
- 230000009834 selective interaction Effects 0.000 description 1
- 238000011451 sequencing strategy Methods 0.000 description 1
- 239000002904 solvent Substances 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 125000005017 substituted alkenyl group Chemical group 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 208000024891 symptom Diseases 0.000 description 1
- 229960003433 thalidomide Drugs 0.000 description 1
- 230000001225 therapeutic effect Effects 0.000 description 1
- 229960001196 thiotepa Drugs 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 230000005945 translocation Effects 0.000 description 1
- 108091005703 transmembrane proteins Proteins 0.000 description 1
- 102000035160 transmembrane proteins Human genes 0.000 description 1
- 229950001353 tretamine Drugs 0.000 description 1
- 150000003852 triazoles Chemical class 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- DRTQHJPVMGBUCF-UHFFFAOYSA-N uracil arabinoside Natural products OC1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-UHFFFAOYSA-N 0.000 description 1
- 229940045145 uridine Drugs 0.000 description 1
- 210000005048 vimentin Anatomy 0.000 description 1
- 125000000391 vinyl group Chemical group [H]C([*])=C([H])[H] 0.000 description 1
- 239000007762 w/o emulsion Substances 0.000 description 1
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B15/30—Drug targeting using structural data; Docking or binding prediction
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1089—Design, preparation, screening or analysis of libraries using computer algorithms
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/04—Libraries containing only organic compounds
- C40B40/10—Libraries containing peptides or polypeptides, or derivatives thereof
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/30—Detection of binding sites or motifs
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/60—In silico combinatorial chemistry
- G16C20/64—Screening of libraries
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2563/00—Nucleic acid detection characterized by the use of physical, structural and functional properties
- C12Q2563/179—Nucleic acid detection characterized by the use of physical, structural and functional properties the label being a nucleic acid
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/50—Molecular design, e.g. of drugs
Landscapes
- Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Physics & Mathematics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- General Health & Medical Sciences (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Theoretical Computer Science (AREA)
- Biophysics (AREA)
- Medical Informatics (AREA)
- Organic Chemistry (AREA)
- Genetics & Genomics (AREA)
- Crystallography & Structural Chemistry (AREA)
- Evolutionary Biology (AREA)
- Medicinal Chemistry (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Analytical Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Biochemistry (AREA)
- Pharmacology & Pharmacy (AREA)
- Databases & Information Systems (AREA)
- Computing Systems (AREA)
- Chemical Kinetics & Catalysis (AREA)
- General Chemical & Material Sciences (AREA)
- Plant Pathology (AREA)
- Bioethics (AREA)
- Microbiology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Library & Information Science (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
Abstract
The present disclosure provides virtual screening methods that use data sets obtained from nucleotide-encoded libraries (e.g., DNA-encoded libraries). These methods allow high-confidence prediction of binding interactions between candidate compounds and proteins of interest for the development of therapeutic agents.
Description
Background
Virtual screening methods can extend the screening options available for a given target and can increase the likelihood of successful optimization. Virtual screening can be a fast and inexpensive way of identifying multiple scaffolds to be used as starting points for optimization. The power of virtual screening is generally limited by the size of the experimentally determined data set used, because it relies on comparison with known experimental data to generate virtual data. Therefore, there is a need for methods that combine robust computational methods with extremely large data sets to produce sufficient confidence in the computational predictions to replace traditional high-throughput screening methods.
Summary of The Invention
The present disclosure provides methods for identifying compounds that are useful as therapeutic agents and/or that can be used as starting points for optimization in the development of therapeutic agents. These methods combine computational methods for predicting binding between a compound and a protein with large sets of experimental data obtained using nucleotide-encoded libraries (e.g., DNA-encoded libraries). The combination of data generated with nucleotide-encoded libraries and computational methods allows for high-confidence prediction of binding interactions between candidate compounds and proteins of interest.
Accordingly, in one aspect, the present disclosure provides a method comprising the steps of: (a) providing a plurality of binding interaction findings (e.g., at least 250,000 findings) for a target protein in a physical computing device having a representation of a set of candidate compounds (e.g., small molecule compounds), wherein at least 50% (e.g., at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%) of the plurality of binding interaction findings represent a binding interaction between the target protein and a compound (e.g., a member of a DNA-encoded library) comprising a nucleotide tag encoding the identity of the compound; (b) using the plurality of binding interaction findings to generate, using the computing device, an estimated binding interaction for the candidate compounds; and (c) outputting a list of the candidate compounds that can be displayed and ranked by highest estimated binding interaction.
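Steps (a) to (c) can be illustrated with a minimal sketch. The disclosure does not prescribe a particular prediction model; the similarity-weighted nearest-neighbor estimate below, the `Finding` record, the `signal` field, and all example values are assumptions chosen only to make the data flow concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One binding interaction finding (hypothetical record layout)."""
    bits: frozenset   # indices of the "on" bits in the compound's fingerprint
    signal: float     # assumed binding signal; 0.0 means no binding observed
    tagged: bool      # True if measured on a nucleotide-tagged compound


def tanimoto(a, b):
    """Tanimoto similarity between two sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def estimate_binding(candidate_bits, findings, k=3):
    """Step (b): similarity-weighted mean signal of the k most similar findings."""
    scored = sorted(
        ((tanimoto(candidate_bits, f.bits), f.signal) for f in findings),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    return sum(sim * sig for sim, sig in scored) / total if total else 0.0


def rank_candidates(candidates, findings, min_tagged_fraction=0.5):
    """Steps (a) and (c): check the tagged fraction, then rank candidates."""
    tagged_fraction = sum(f.tagged for f in findings) / len(findings)
    if tagged_fraction < min_tagged_fraction:
        raise ValueError("too few findings come from nucleotide-tagged compounds")
    estimates = {
        name: estimate_binding(bits, findings) for name, bits in candidates.items()
    }
    return sorted(estimates.items(), key=lambda item: item[1], reverse=True)


# Toy data (made up for illustration only).
findings = [
    Finding(frozenset({1, 2, 3}), 10.0, True),
    Finding(frozenset({1, 2, 4}), 8.0, True),
    Finding(frozenset({7, 8, 9}), 0.0, True),
    Finding(frozenset({7, 8}), 0.0, False),
]
candidates = {"cand_A": frozenset({1, 2, 3, 4}), "cand_B": frozenset({7, 8, 10})}
ranked = rank_candidates(candidates, findings)
```

In this toy run the candidate that resembles the binders is ranked first. A real data set would hold hundreds of thousands of findings or more, and any model that maps structure to binding could replace `estimate_binding`.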
In some embodiments, the plurality of binding interaction findings comprises at least 250,000 (e.g., at least 500,000, at least one million, at least two million, at least five million, at least ten million, at least twenty-five million) binding interaction findings.
In some embodiments, at least 50% of the plurality of binding interaction findings are determined by simultaneously contacting a plurality (e.g., at least 250,000, at least 500,000, at least one million, at least two million, at least five million, at least ten million) of compounds comprising a nucleotide tag encoding the identity of the compound with the target protein (e.g., simultaneously in the same reaction vessel). For example, in some embodiments, at least 50% of the binding interaction findings for DNA-encoded library members used to generate the estimated binding interaction are determined in a single experiment in a single reaction vessel.
In some embodiments, the method further comprises providing one or more additional pluralities of binding interaction findings for one or more additional target proteins, wherein at least 50% of the one or more additional pluralities of binding interaction findings represent binding interactions between the additional target protein and compounds from the plurality of binding interaction findings with the target protein of step (a). In some embodiments, the method further comprises providing one or more additional pluralities of binding interaction findings of one or more negative control experiments, wherein at least 50% of the plurality of binding interaction findings represent a negative control for a compound from the plurality of binding interaction findings with the target protein of step (a). In some embodiments, the method further comprises providing one or more additional pluralities of binding interaction findings of one or more control experiments, wherein the plurality of binding interaction findings comprises binding interaction findings of a compound (e.g., a known inhibitor or natural ligand) having a known binding interaction with the target protein of step (a). In some embodiments, the method comprises generating a selectivity score by comparing the binding or estimated binding of the compound or candidate compound to the target protein with the binding or estimated binding of the compound or candidate compound to the one or more additional target proteins and/or a negative control. In some embodiments, the list of candidate compounds can be displayed and ranked by selectivity score. In some embodiments, the one or more additional target proteins comprise a mutant of the target protein.
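One way to turn the comparison described above into a number is sketched below. The text only says that target binding is compared with binding to additional proteins and/or a negative control; the log2 ratio, the pseudocount, and the function name are assumptions for illustration.

```python
import math


def selectivity_score(target_signal, competing_signals, pseudocount=1.0):
    """Assumed metric: log2 ratio of the target signal to the strongest
    competing signal (additional target proteins and/or negative control).

    The pseudocount keeps the ratio finite when a competing signal is zero.
    """
    strongest = max(competing_signals)
    return math.log2((target_signal + pseudocount) / (strongest + pseudocount))


# Made-up signals: strong on target, weak on a mutant and on the negative control.
selective = selectivity_score(31.0, [3.0, 1.0])   # log2(32 / 4) = 3.0
unselective = selectivity_score(7.0, [7.0])       # log2(8 / 8) = 0.0
```

A positive score indicates preferential binding to the target protein, and candidates can be sorted on this value in the same way as on the estimated binding interaction.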
In some embodiments, the estimated binding interaction is generated using chemical structure comparison, e.g., using molecular representations. Molecular representations include, but are not limited to, topological representations based on atoms, features, or functional groups and their connectivity (e.g., fingerprints, connection tables, molecular connectivity, and/or molecular graph representations), electrostatic representations (e.g., surface electrons), geometric representations (e.g., pharmacophores, pharmacophore fingerprints, shape-based fingerprints, and/or 3D molecular coordinates using atoms, features, or functional groups), and quantum chemical representations. In some embodiments, the estimated binding interaction is generated using a topological representation. In some embodiments, the estimated binding interaction is generated using an electrostatic representation. In some embodiments, the estimated binding interaction is generated using a geometric representation. In some embodiments, the estimated binding interaction is generated using a quantum chemical representation. In some embodiments, the estimated binding interaction is generated using a chemical fingerprint.
Chemical fingerprints may be used to aggregate structural information and binding interaction data of compounds in order to identify structural patterns indicative of binding to a target protein. Thus, in some embodiments, the method further comprises (i) providing a plurality of chemical fingerprints (e.g., chemical fingerprints such as ECFP6, FCFP6, ECFP4, MACCS, or Morgan/circular fingerprints having different numbers of bits (e.g., 166, 512, 1024)) for a plurality of compounds; and (ii) utilizing the plurality of chemical fingerprints in the generation of the estimated binding interactions. In some embodiments, such as in a training set, the plurality of chemical fingerprints includes chemical fingerprints of one or more of the compounds comprising nucleotide tags encoding the identity of the compounds, e.g., where the chemical fingerprint is a representation of the structure of the compound without the nucleotide tag. In some embodiments, for example in a prediction set, the plurality of chemical fingerprints includes chemical fingerprints of one or more candidate compounds. In some embodiments, the chemical fingerprint is an ECFP6 fingerprint.
In some embodiments, the method further comprises providing one or more property findings (e.g., molecular weight and/or clogP) for the set of candidate compounds. In some embodiments, the one or more property findings are used to generate an estimated binding interaction. In some embodiments, the list of candidate compounds is capable of being displayed and ranked by the one or more property findings.
In some embodiments, the method further comprises sending the list of candidate compounds over the internet or to a display device. In some embodiments, the physical computing devices are accessed and operated over the internet.
In some embodiments, the method further comprises generating a confidence score for each estimated binding interaction of a candidate compound, wherein the confidence score is generated using a chemical structure comparison (e.g., a principal component analysis) between the candidate compound and one or more compounds from the plurality of binding interaction findings for the target protein of step (a). For example, in some embodiments, the confidence score is generated by comparing the candidate compound to the chemical space defined by the compounds of the plurality of binding interaction findings of step (a), e.g., by determining the distance of the candidate compound from that chemical space as the Euclidean distance in the dimensions defined by the principal component analysis. In some embodiments, the list of candidate compounds can be displayed and ranked by the confidence score of the estimated binding interaction of the candidate compound.
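The principal-component distance described above can be sketched in plain Python. The text specifies only a Euclidean distance in the dimensions defined by a principal component analysis; the use of the nearest training compound as the reference point, the mapping of distance to a score via 1 / (1 + distance), the power-iteration PCA, and the toy descriptor vectors are all assumptions for illustration.

```python
import math


def fit_pca(rows, n_components=2, iters=200):
    """Return (mean, components) using power iteration with deflation."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(n_components):
        start = [float(i + 1) for i in range(d)]       # arbitrary non-zero start
        scale = math.sqrt(sum(x * x for x in start))
        v = [x / scale for x in start]
        for _step in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigval = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        # Deflate so the next pass converges to the next-largest component.
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return mean, components


def project(x, mean, components):
    """Coordinates of descriptor vector x in principal-component space."""
    return [sum((x[j] - mean[j]) * comp[j] for j in range(len(x)))
            for comp in components]


def confidence(candidate, training, mean, components):
    """Assumed score: 1 / (1 + distance to the nearest training compound)."""
    z = project(candidate, mean, components)
    nearest = min(math.dist(z, project(t, mean, components)) for t in training)
    return 1.0 / (1.0 + nearest)


# Toy descriptor vectors standing in for the compounds of step (a).
training = [[0, 0, 1], [2, 0, 1], [4, 0, 1], [2, 1, 1], [2, -1, 1]]
mean, components = fit_pca(training)
inside = confidence([2, 1, 1], training, mean, components)    # matches a training compound
outside = confidence([10, 5, 1], training, mean, components)  # far from the training set
```

A candidate that coincides with a training compound receives the maximum score of 1.0, and the score falls toward 0 as the candidate moves away from the chemical space covered by the experimental data.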
In some embodiments, the method further comprises (d) synthesizing one or more of the candidate compounds from a list of candidate compounds.
In some embodiments, the method further comprises (e) contacting one or more synthetic candidate compounds with the target protein to determine one or more experimental binding interactions.
In one aspect, the present disclosure provides a computer-readable medium having stored thereon executable instructions for directing a physical computing device to implement a method comprising:
(a) providing a plurality of binding interaction findings for a target protein in a physical computing device, the physical computing device having representations of a set of candidate compounds,
wherein at least 90% of the plurality of binding interaction findings represent binding interactions between the target protein and a compound comprising a nucleotide tag encoding the identity of the compound;
(b) using the plurality of binding interaction findings to generate, using the computing device, an estimated binding interaction for the candidate compounds; and
(c) outputting a list of the candidate compounds that can be displayed and ranked by highest estimated binding interaction.
In one aspect, the present disclosure provides a physical computing device having a representation of a set of candidate compounds and programmed with executable instructions to direct the device to perform a method comprising:
(a) providing a plurality of binding interaction findings for a target protein in a physical computing device, the physical computing device having representations of a set of candidate compounds,
wherein at least 90% of the plurality of binding interaction findings represent binding interactions between the target protein and a compound comprising a nucleotide tag encoding the identity of the compound;
(b) using the plurality of binding interaction findings to generate, using the computing device, an estimated binding interaction for the candidate compounds; and
(c) outputting a list of the candidate compounds that can be displayed and ranked by highest estimated binding interaction.
Definitions
As used herein, a "confidence score" refers to a calculation that indicates a confidence in an estimated binding interaction of a candidate compound based on the structural similarity between the candidate compound and one or more compounds in the dataset used to prepare the estimate.
The term "binding interaction" as used herein refers to an association (e.g., non-covalent or covalent) between two or more entities. "direct" binding refers to physical contact between entities or moieties; indirect binding involves physical interaction by way of physical contact with one or more intermediate entities. Binding between two or more entities can generally be assessed in any of a variety of contexts-including where interacting entities or moieties are studied separately or in the context of more complex systems (e.g., when covalently or otherwise associated with a carrier entity and/or in a biological system or cell).
The affinity of a molecule X for its partner Y can generally be represented by the dissociation constant (K_D). Affinity can be measured by conventional methods known in the art, including those described herein. The term "K_D," as used herein, means the dissociation equilibrium constant for a particular compound-protein or complex-protein interaction. Generally, the compounds of the present invention bind to the presenter protein with a dissociation equilibrium constant (K_D) of less than about 10^-6 M, e.g., less than about 10^-7 M, 10^-8 M, 10^-9 M, or 10^-10 M, or even lower, for example as determined by surface plasmon resonance (SPR) using the presenter protein as the analyte and the compound as the ligand. In some embodiments, the compounds of the present invention bind to a target protein (e.g., a eukaryotic target protein such as a mammalian target protein or a fungal target protein, or a prokaryotic target protein such as a bacterial target protein) with a dissociation equilibrium constant (K_D) of less than about 10^-6 M, e.g., less than about 10^-7 M, 10^-8 M, 10^-9 M, or 10^-10 M, or even lower, for example as determined by SPR using the target protein as the analyte and the compound as the ligand.
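For orientation, the standard relationships behind K_D for simple 1:1 binding can be written as a short sketch. The function names and example rate constants are illustrative assumptions; the 10^-6 M default cutoff mirrors the threshold stated above.

```python
import math


def kd_from_rates(k_off, k_on):
    """K_D in M from the off-rate (1/s) and on-rate (1/(M*s)), as obtained
    from a kinetic measurement such as SPR."""
    return k_off / k_on


def fraction_bound(ligand_conc, kd):
    """Fraction of protein occupied at a free ligand concentration (M),
    assuming 1:1 binding at equilibrium."""
    return ligand_conc / (ligand_conc + kd)


def meets_affinity_cutoff(kd, cutoff=1e-6):
    """True if K_D is below the cutoff (lower K_D means tighter binding)."""
    return kd < cutoff


kd = kd_from_rates(k_off=1e-3, k_on=1e5)          # about 1e-8 M, i.e. 10 nM
half = fraction_bound(ligand_conc=kd, kd=kd)      # half occupancy when [L] = K_D
```

The second call shows the usual interpretation of K_D: it is the free ligand concentration at which half of the protein is bound.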
As used herein, a "binding interaction finding" refers to the binding interaction between a compound and a protein (e.g., a target protein), or the lack thereof, as has been experimentally determined by, for example, SPR. For example, in some embodiments, a binding interaction finding refers to a determination that a compound does not interact with a protein (e.g., a target protein).
The term "molecular representation" refers, for example, to a topological, electrostatic, geometric, or quantum chemical representation of a compound. Molecular representations include, for example, chemical fingerprints.
The term "electrostatic representation" refers to a type of molecular representation that includes information such as surface electrons.
As used herein, "estimated binding interaction" refers to a binding interaction that has been predicted using computational analysis. In some embodiments, the estimated binding interaction of the candidate compound with the target protein is generated by comparing the chemical structure of the candidate compound to the chemical structure of one or more compounds for which binding interaction with the target protein has been experimentally determined.
The term "chemical fingerprint" as used herein refers to a machine-readable representation of the molecule of a compound, such as a bit string, i.e., a list of binary values (0 or 1), which characterizes the two- and/or three-dimensional structure of the molecule. Exemplary methods of generating chemical fingerprints are known in the art, including, but not limited to, MACCS, Extended Connectivity Fingerprints (ECFP), Functional Class Fingerprints (FCFP), Morgan/circular fingerprints, and chemical hashed fingerprints.
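The idea of a bit-string fingerprint can be shown with a toy sketch. This is not ECFP, FCFP, MACCS, or any published algorithm: the feature strings are hypothetical substructure labels, and hashing them into a fixed-length list of bits with SHA-256 is an assumption used only to show how a variable set of features becomes a fixed-length list of 0s and 1s.

```python
import hashlib


def fold_features(features, n_bits=1024):
    """Hash substructure feature labels into a fixed-length list of bits."""
    bits = [0] * n_bits
    for feature in features:
        digest = hashlib.sha256(feature.encode("utf-8")).digest()
        bits[int.from_bytes(digest[:8], "big") % n_bits] = 1
    return bits


def to_bit_string(bits):
    """Render the fingerprint as a string of '0' and '1' characters."""
    return "".join(str(b) for b in bits)


# Hypothetical substructure labels for one compound (illustration only).
features = ["aromatic_ring", "amide", "primary_amine", "triazine"]
fingerprint = fold_features(features, n_bits=1024)
bit_string = to_bit_string(fingerprint)
```

Because different features can hash to the same position, the number of set bits is at most the number of distinct features. Longer fingerprints reduce such collisions, which is one reason bit lengths such as 512 or 1024 are used.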
The term "clogP" as used herein refers to the calculated partition coefficient of a molecule or portion of a molecule. The partition coefficient is the ratio of the concentrations of a compound in the two phases of a mixture of two immiscible phases (e.g., octanol and water) at equilibrium, and it measures the hydrophobicity or hydrophilicity of the compound. A variety of methods are available in the art for determining clogP. For example, in some embodiments, clogP can be determined using quantitative structure-property relationship algorithms known in the art (e.g., using fragment-based prediction methods that predict the logP of a compound by summing the contributions of its non-overlapping molecular fragments). Several algorithms for calculating clogP are known in the art, including those used by molecular editing software such as CHEMDRAW Pro, version 12.0.2.1092 (CambridgeSoft, Cambridge, MA) and MARVINSKETCH (ChemAxon, Budapest, Hungary).
The term "comparable" as used herein refers to two or more compounds, entities, situations, sets of conditions, etc., which may not be identical to each other but are sufficiently similar to permit comparison between them, so that conclusions can reasonably be drawn based on the differences or similarities observed. In some embodiments, comparable sets of conditions, circumstances, individuals, or populations are characterized by a plurality of substantially identical features and one or a small number of varied features. One of ordinary skill in the art will understand, in context, what degree of identity is required in any given circumstance for two or more such compounds, entities, situations, sets of conditions, etc., to be considered comparable. For example, one of ordinary skill in the art will appreciate that sets of circumstances, individuals, or populations are comparable to one another when they are characterized by a sufficient number and type of substantially identical features to warrant a reasonable conclusion that differences in the results obtained or phenomena observed under or with the different sets of circumstances, individuals, or populations are caused by or indicative of the variation in those features that are varied.
Many of the methods described herein include a "determining" step. One of ordinary skill in the art will understand upon reading this specification that such "determining" can utilize, or be accomplished through use of, any of a variety of techniques available to those of skill in the art, including, for example, the specific techniques explicitly mentioned herein. In some embodiments, determining involves manipulation of a physical sample. In some embodiments, determining involves consideration and/or manipulation of data or information, for example using a computer or other processing unit adapted to perform a relevant analysis. In some embodiments, determining comprises receiving relevant information and/or material from a source. In some embodiments, determining comprises comparing one or more characteristics of a sample or entity to a comparable reference.
The term "geometric representation" refers to one type of molecular representation. The geometric representation may include information about, for example, pharmacophores, pharmacophore fingerprints, shape-based fingerprints, and/or 3D molecular coordinates using atoms, features, or functional groups.
The term "library" as used herein refers to a collection of 2, 5, 10^2, 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, or more different molecules. In some embodiments, at least 10% (e.g., at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99%, or 100%) of the compounds in the library are compounds that include a nucleotide tag that encodes their identity, such as DNA-encoded compounds.
The term "negative control" as used herein refers to an experiment in which binding interactions are determined in the absence of the target protein.
The term "polar surface area" refers to the sum of the surfaces of all polar atoms of a molecule or portion of a molecule, including the hydrogens attached to them. The polar surface area is determined computationally using a program such as CHEMDRAW Pro, version 12.0.2.1092 (CambridgeSoft, Cambridge, MA).
The term "positive control" as used herein refers to an experiment in which the binding interaction is determined, wherein the binding affinity of a compound in contact with a target protein is known.
As used herein, a "property finding" refers to a calculated or experimentally determined property (e.g., clogP, polar surface area, molecular weight) of a particular compound.
The term "selective", when used in reference to a compound having activity, is understood by those skilled in the art to mean that the compound distinguishes between potential target entities or states. For example, in some embodiments, a compound is said to "selectively" bind to a target if it preferentially binds to the target in the presence of one or more competing candidate targets. In many embodiments, selective interactions depend on the presence of specific structural features (e.g., epitopes, clefts, binding sites) of the target entity. It should be understood that selectivity need not be absolute. In some embodiments, selectivity can be assessed relative to the selectivity of a binding agent for one or more other potential target entities (e.g., competitors). In some embodiments, selectivity is assessed relative to selectivity for a reference selective binding agent. In some embodiments, selectivity is assessed relative to selectivity for a reference non-selective binding agent. In some embodiments, the agent or entity detectably does not bind to a competing candidate target under conditions for binding to its target entity. In some embodiments, a binding agent binds its target entity with a higher on-rate, a lower off-rate, increased affinity, decreased dissociation, and/or increased stability compared to a competing candidate target.
As used herein, a "selectivity score" refers to a calculation of the specificity of a compound for a target protein. In some embodiments, the selectivity score can be calculated by comparing the binding of a compound to the target protein with the binding of the compound to another protein (e.g., a mutant of the target protein or an unrelated protein). In other embodiments, the selectivity score can be calculated by comparing the binding of the compound to the target protein with a negative control.
The term "small molecule" refers to a low molecular weight organic and/or inorganic compound. Typically, a "small molecule" is a molecule that is less than about 5 kilodaltons (kD) in size. In some embodiments, the small molecule is less than about 4 kD, 3 kD, about 2 kD, or about 1 kD. In some embodiments, the small molecule is less than about 800 daltons (D), about 600D, about 500D, about 400D, about 300D, about 200D, or about 100D. In some embodiments, the small molecule is less than about 2000g/mol, less than about 1500g/mol, less than about 1000g/mol, less than about 800g/mol, or less than about 500 g/mol. In some embodiments, the small molecule is not a polymer. In some embodiments, the small molecule does not include a polymeric moiety. In some embodiments, the small molecule is not a protein or polypeptide (e.g., is not an oligopeptide or peptide). In some embodiments, the small molecule is not a polynucleotide (e.g., is not an oligonucleotide). In some embodiments, the small molecule is not a polysaccharide. In some embodiments, the small molecule does not include a polysaccharide (e.g., is not a glycoprotein, proteoglycan, glycolipid, etc.). In some embodiments, the small molecule is not a lipid. In some embodiments, the small molecule is a modulatory compound. In some embodiments, the small molecule is biologically active. In some embodiments, the small molecule is detectable (e.g., comprises at least one detectable moiety). In some embodiments, the small molecule is a therapeutic agent.
One of ordinary skill in the art, upon reading this disclosure, will appreciate that certain small molecule compounds described herein can be provided and/or utilized in any of a variety of forms, such as salt forms, protected forms, prodrug forms, ester forms, isomeric forms (e.g., optical and/or structural isomers), isotopic forms, and the like. In some embodiments, reference to a particular compound may relate to a particular form of the compound. In some embodiments, reference to a particular compound may relate to any form of that compound. In some embodiments, when a compound is one that is present or found in nature, the compound may be provided and/or utilized in accordance with the present invention in a form that is different from the form in which it is present or found in nature. One of ordinary skill in the art will appreciate that a preparation of a compound that includes one or more individual forms at different levels, amounts, or ratios from a reference preparation or source (e.g., a natural source) of the compound can be considered to be different forms of the compound described herein. Thus, in some embodiments, for example, a preparation of a single stereoisomer of a compound can be considered to be a different form of the compound than the racemic mixture of the compound; a particular salt of a compound may be considered to be a different form from another salt form of the compound; a formulation comprising one conformer of a double bond ((Z) or (E)) may be considered to be in a different form to a formulation comprising the other conformer of a double bond ((E) or (Z)); preparations in which one or more atoms is an isotope other than that present in the reference preparation can be considered to be in a different form; and so on.
The term "specific binding" or "specific for" or "specific to," as used herein, refers to an interaction between a binding agent and a target entity. As one of ordinary skill will appreciate, an interaction is considered "specific" if it is favored in the presence of alternative interactions, for example, binding with a K_D of less than 10 μM (e.g., less than 5 μM, less than 1 μM, less than 500 nM, less than 200 nM, less than 100 nM, less than 75 nM, less than 50 nM, less than 25 nM, less than 10 nM, or 10 nM to 100 nM, 50 nM to 250 nM, 100 nM to 500 nM, 250 nM to 1 μM, 500 nM to 2 μM, or 1 μM to 5 μM). In many embodiments, the specific interaction depends on the presence of a particular structural feature (e.g., epitope, cleft, binding site) of the target entity. It is to be understood that specificity need not be absolute. In some embodiments, specificity can be assessed relative to the specificity of a binding agent for one or more other potential target entities (e.g., competitors). In some embodiments, specificity is assessed relative to the specificity of a reference specific binding agent. In some embodiments, specificity is assessed relative to the specificity of a reference non-specific binding agent.
The term "structural similarity" refers to the similarity in the two-or three-dimensional arrangement and/or orientation of atoms or moieties relative to each other (e.g., the distance and/or angle between an agent of interest and a reference agent) in one or more different compounds.
The term "substantially" refers to the qualitative condition of exhibiting a total or near-total extent or degree of a characteristic or property of interest. One of ordinary skill in the biological arts will appreciate that biological and chemical phenomena rarely, if ever, go to completion and/or proceed to completeness or achieve or avoid an absolute result. The term "substantially" is therefore used herein to capture the potential lack of completeness inherent in many biological and chemical phenomena.
The term "does not substantially bind" to a particular protein, as used herein, can be exhibited, for example, by a molecule or portion of a molecule having a K_D for the target of 10^-4 M or greater, or 10^-5 M or greater, or 10^-6 M or greater, or 10^-7 M or greater, or 10^-8 M or greater, or 10^-9 M or greater, or 10^-10 M or greater, or 10^-11 M or greater, or 10^-12 M or greater, or a K_D in the range of 10^-4 M to 10^-12 M, or 10^-6 M to 10^-10 M, or 10^-7 M to 10^-9 M.
The term "target protein" refers to a protein that binds to a small molecule. In some embodiments, the target protein is involved in a biological pathway associated with a disease, disorder, or condition. In some embodiments, the target protein is a naturally occurring protein; in some such embodiments, the target protein is naturally present in certain mammalian cells (e.g., a mammalian target protein), fungal cells (e.g., a fungal target protein), bacterial cells (e.g., a bacterial target protein), or plant cells (e.g., a plant target protein). In some embodiments, the target protein is characterized by a natural interaction with one or more naturally occurring presenter protein/natural small molecule complexes. In some embodiments, the target protein is characterized by natural interactions with a plurality of different naturally occurring presenter protein/natural small molecule complexes; in some such embodiments, some or all of the complexes utilize the same presenter protein (and different small molecules). The target protein may be naturally occurring, e.g., wild-type. Alternatively, the target protein may differ from the wild-type protein but still retain biological function, e.g., as an allelic variant, a splice variant, or a biologically active fragment. Exemplary mammalian target proteins are GTPases, GTPase-activating proteins, guanine nucleotide exchange factors, heat shock proteins, ion channels, coiled-coil proteins, kinases, phosphatases, ubiquitin ligases, transcription factors, chromatin modifying/remodeling factors, proteins with classical protein-protein interaction domains and motifs, or any other protein involved in a biological pathway associated with a disease, disorder, or condition.
The term "topological representation" refers to a type of molecular representation that depends on the topology of the molecule and that indicates the positions of individual atoms and the bonding connections between them. The topological representation can be based on atoms, features, or functional groups and their connectivity (e.g., fingerprints, connection tables, molecular connectivity, and/or molecular graph representations). The topological representation may be computed from the molecular graph representation.
The term "quantum chemical representation" refers to a type of molecular representation. A quantum chemical representation may include information about, for example, the energy or electronic properties of a compound.
Brief Description of Drawings
FIG. 1 is a graph illustrating the prediction of binding interactions as the number of libraries increases.
FIG. 2 is a graph illustrating multiple prediction trials over time due to improvements in the prediction model.
Detailed Description
The present disclosure provides virtual screening methods for identifying compounds that are useful as therapeutic agents and/or that can be used as starting points for optimization in the development of therapeutic agents. These methods utilize large sets of experimental data obtained using DNA-encoded libraries to generate high-confidence predictions of binding interactions between candidate compounds and proteins of interest.
Encoded compounds
The invention features methods of using encoded chemical entities that include a chemical entity, one or more tags, and a headpiece operatively associated with the chemical entity and the one or more tags. The chemical entities, headpieces, tags, linkages, and bifunctional spacers are further described below.
Chemical entities
The coding compounds (e.g., small molecules) utilized in the methods of the invention can include one or more building blocks and optionally one or more scaffolds.
The scaffold S may be a monoatomic or molecular scaffold. Exemplary monoatomic scaffolds include carbon, boron, nitrogen, or phosphorus atoms, among others. Exemplary polyatomic scaffolds include cycloalkyls, cycloalkenyls, heterocycloalkyls, heterocycloalkenyls, aryls, or heteroaryls. Specific embodiments of heteroaryl scaffolds include triazines, such as1, 3, 5-triazine, 1,2, 3-triazine, or1, 2, 4-triazine; a pyrimidine; pyrazine; pyridazine; furan; pyrrole; pyrroline; a pyrrolidine; oxazole; pyrazole; isoxazole; a pyran; pyridine; indole; indazoles; or a purine.
The scaffold S can be operably linked to the tag by any available method. In one example, S is a triazine directly attached to the headpiece. To obtain this exemplary scaffold, trichlorotriazine (i.e., a chlorinated triazine precursor having three chlorines) is reacted with a nucleophilic group of a headpiece. Using this approach, S has three chlorine sites available for substitution, one of which links to the headpiece and two of which are available diversity nodes. Next, building block An is added to a diversity node of the scaffold, and the tag An that encodes building block An ("tag An") is attached to the headpiece, where these two steps can be performed in any order. Then, building block Bn can be added to the remaining diversity node, and the tag Bn that encodes building block Bn is attached to the end of tag An. In another example, S is a triazine operably linked to a nucleophilic group (e.g., an amino group) of a tag, where trichlorotriazine is reacted with a nucleophilic group of a PEG, aliphatic, or aromatic linker of the tag. Building blocks and associated tags may then be added as described above.
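The bookkeeping implied by this tagging scheme can be sketched as follows. This is a minimal illustration that assumes a hypothetical headpiece sequence and made-up tag sequences for two building-block cycles; none of these sequences or names come from the disclosure. Each encoding oligonucleotide is formed by joining tag A to the headpiece and tag B to the end of tag A, so the full sequence records which building blocks were added.

```python
from itertools import product

# Hypothetical headpiece and tag sequences (illustrative only).
HEADPIECE = "GGAC"
TAGS_A = {"A1": "ACGT", "A2": "TGCA"}
TAGS_B = {"B1": "CCAA", "B2": "GGTT", "B3": "ATAT"}


def enumerate_library(headpiece, tags_a, tags_b):
    """Map each encoding sequence to the building blocks it records."""
    library = {}
    for (a, tag_a), (b, tag_b) in product(tags_a.items(), tags_b.items()):
        # Tag A is joined to the headpiece, then tag B to the end of tag A.
        library[headpiece + tag_a + tag_b] = (a, b)
    return library


library = enumerate_library(HEADPIECE, TAGS_A, TAGS_B)
```

With two choices at the first diversity node and three at the second, the sketch yields six distinct encoded members.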
In another example, S is a triazine operably linked to building block An. To obtain such a scaffold, a building block An having two diversity nodes (e.g., an electrophilic group and a nucleophilic group, such as an Fmoc-amino acid) is reacted with a nucleophilic group of a linker (e.g., the terminal group of a PEG, aliphatic, or aromatic linker attached to the headpiece). Then, trichlorotriazine is reacted with a nucleophilic group of building block An. Using this approach, all three chlorine sites of S are used as diversity nodes for building blocks. As described herein, additional building blocks and tags may be added, and additional scaffolds Sn may be added.
Exemplary building blocks An include, for example, amino acids (e.g., alpha-, beta-, gamma-, delta-, and epsilon-amino acids, as well as derivatives of natural and unnatural amino acids), chemically reactive reactants with amines (e.g., azide or alkyne chains), or thiol reactants, or combinations thereof. The choice of building block An depends on, for example, the nature of the reactive group used in the linker, the nature of the scaffold moiety, and the solvent used for the chemical synthesis.
Exemplary building blocks Bn and Cn include any useful structural unit of a chemical entity, such as an optionally substituted aromatic group (e.g., optionally substituted phenyl or benzyl), an optionally substituted heterocyclic group (e.g., optionally substituted quinolinyl, isoquinolinyl, indolyl, isoindolyl, azaindolyl, benzimidazolyl, azabenzimidazolyl, benzisoxazolyl, pyridyl, piperidyl, or pyrrolidinyl), an optionally substituted alkyl group (e.g., optionally substituted straight or branched C1-6 alkyl or optionally substituted C1-6 aminoalkyl), or an optionally substituted carbocyclic group (e.g., optionally substituted cyclopropyl, cyclohexyl, or cyclohexenyl). Particularly useful building blocks Bn and Cn include those having one or more reactive groups, such as an optionally substituted group (e.g., any described herein) having one or more optional substituents that are reactive groups or that can be chemically modified to form reactive groups. Exemplary reactive groups include amines (-NR2, wherein each R is independently H or optionally substituted C1-6 alkyl), hydroxy, alkoxy (-OR, wherein R is optionally substituted C1-6 alkyl, such as methoxy), carboxyl (-COOH), amide, or chemically reactive substituents. For example, a restriction site can be introduced into tag Bn or Cn, so that the complex can be identified by performing PCR and restriction digestion with the corresponding restriction enzyme.
Headpiece
In an encoded chemical entity, the headpiece operably links each chemical entity to its encoding oligonucleotide tag. Generally, the headpiece is a starting oligonucleotide having at least two functional groups that can be further derivatized, wherein a first functional group operably links the chemical entity (or a component thereof) to the headpiece and a second functional group operably links one or more tags to the headpiece. A bifunctional spacer may optionally be used as a spacing moiety between the headpiece and the chemical entity.
The functional groups of the headpiece can be used to form a covalent bond with a component of the chemical entity and another covalent bond with a tag. The component may be any part of a small molecule, such as a scaffold having diversity nodes, or a building block. Alternatively, the headpiece can be derivatized to provide a spacer (e.g., a spacing moiety that separates the headpiece from the small molecule to be formed in the library) that terminates in a functional group (e.g., a hydroxyl, amine, carboxyl, thiol, alkynyl, azido, or phosphate group) used to form a covalent bond with a component of the chemical entity. The spacer may be attached to the 5'-end, to one of the internal sites, or to the 3'-end of the headpiece. When the spacer is attached to one of the internal sites, the spacer can be operably linked to a derivatized base (e.g., the C5 site of uridine) or placed internally within the oligonucleotide using standard techniques known in the art. Exemplary spacers are described herein.
The headpiece can have any useful configuration. The headpiece may be, for example, 1 to 100 nucleotides in length, preferably 5 to 20 nucleotides in length, and most preferably 5 to 15 nucleotides in length. As described herein, the headpiece can be single-stranded or double-stranded, and can be composed of natural or modified nucleotides. For example, a chemical moiety is operably linked to the 3'-terminus or the 5'-terminus of the headpiece. In particular embodiments, the headpiece includes a hairpin structure formed by complementary bases within the sequence. For example, a chemical moiety may be operably linked to an internal site, the 3'-terminus, or the 5'-terminus of the headpiece.
Generally, the headpiece includes a non-self-complementary sequence on the 5'- or 3'-end that allows an oligonucleotide tag to be attached by polymerization, enzymatic ligation, or chemical reaction. The headpiece may allow for ligation of oligonucleotide tags and for optional purification and phosphorylation steps. After the addition of the last tag, an additional adaptor sequence may be added to the 5'-end of the last tag. Exemplary adaptor sequences include a primer binding sequence or a sequence with a label (e.g., biotin). In cases where many building blocks and corresponding tags are used (e.g., 100), a split-and-mix strategy can be employed during the oligonucleotide synthesis step to form the desired number of tags. Such split-and-mix strategies for DNA synthesis are known in the art. The resulting library members may be amplified by PCR after selection for binding entities to the target of interest.
The headpiece or complex may optionally include one or more primer binding sequences. For example, the headpiece has a sequence in the loop region of a hairpin that serves as a primer binding region for amplification, where the primer binding region has a higher melting temperature for its complementary primer (e.g., which may include a flanking identifier region) than for the sequence in the headpiece. In other embodiments, the complex comprises two primer binding sequences on either side of one or more tags (which encode one or more building blocks) (e.g., to enable a PCR reaction). Alternatively, the headpiece may contain one primer binding sequence at the 5'- or 3'-end. In other embodiments, the headpiece is a hairpin, and the loop region forms a primer binding site, or the primer binding site is introduced by hybridization of an oligonucleotide to the headpiece on the 3' side of the loop. A primer oligonucleotide that contains a region homologous to the 3'-end of the headpiece and carries a primer binding region on its 5'-end (e.g., to enable a PCR reaction) may be hybridized to the headpiece and may carry a tag that encodes a building block or the addition of a building block. The primer oligonucleotide may contain additional information, such as a region of randomized nucleotides, e.g., 2 to 16 nucleotides in length, which is included for bioinformatic analysis.
The headpiece may optionally include a hairpin structure, where such a structure can be achieved by any useful method. For example, the headpiece can include complementary bases that form an intermolecular base-pairing partner, e.g., by Watson-Crick (Watson-Crick) base-pairing (e.g., adenine-thymine and guanine-cytosine) and/or by wobble base-pairing (e.g., guanine-uracil, inosine-adenine and inosine-cytosine). In another example, the headpiece may include modified or substituted nucleotides that can form higher affinity duplex formations than unmodified nucleotides, such modified or substituted nucleotides being known in the art. In another example, the headpiece includes one or more bases that are cross-linked to form a hairpin structure. For example, bases within a single strand or bases in different duplexes may be cross-linked, e.g., by using psoralen.
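As a rough illustration of the base-pairing rules just listed, the sketch below counts how many consecutive bases at the 5' end of a sequence can pair with bases at the 3' end, which is the stem of a simple hairpin. It is a toy check only: the one-letter code "I" for inosine, the minimum loop size of three bases, and the function names are assumptions of this sketch, and no thermodynamics are modeled.

```python
# Base pairs named in the text; "I" denotes inosine (an assumed one-letter code).
PAIRS = {
    frozenset("AT"), frozenset("GC"),                   # Watson-Crick
    frozenset("GU"), frozenset("IA"), frozenset("IC"),  # wobble
}


def can_pair(x, y):
    """Return True if bases x and y form one of the listed pairs."""
    return frozenset((x, y)) in PAIRS


def stem_length(seq, min_loop=3):
    """Count consecutive pairs between the 5' and 3' ends of seq.

    Pairing stops at the first mismatch or when fewer than `min_loop`
    unpaired bases would remain to form the loop.
    """
    i, j = 0, len(seq) - 1
    n = 0
    while j - i - 1 >= min_loop and can_pair(seq[i], seq[j]):
        i += 1
        j -= 1
        n += 1
    return n
```

For instance, `stem_length("GCGCTTTTGCGC")` finds a four-base-pair stem closed by a four-base loop, while a sequence with no complementary ends gives zero.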
The headpiece or complex may optionally include one or more labels for detection. For example, the headpiece, one or more oligonucleotide tags, and/or one or more primer sequences can include an isotope, a radioimaging agent, a marker, a tracer, a fluorescent tag (e.g., rhodamine or fluorescein), a chemiluminescent tag, a quantum dot, or a reporter molecule (e.g., biotin or histidine tag).
In other embodiments, the headpiece or tags may be modified to support solubility under semi-aqueous, reduced-aqueous, or non-aqueous (e.g., organic) conditions. For example, the C5 position of T or C bases can be modified with an aliphatic chain to make the nucleotide bases of the headpiece or tag more hydrophobic without significantly disrupting their ability to form hydrogen bonds with their complementary bases. Exemplary modified or substituted nucleotides are 5'-dimethoxytrityl-N4-diisobutylaminomethylidene-5-(1-propynyl)-2'-deoxycytidine, 3'-[(2-cyanoethyl)-(N,N-diisopropyl)]-phosphoramidite; 5'-dimethoxytrityl-5-(1-propynyl)-2'-deoxyuridine, 3'-[(2-cyanoethyl)-(N,N-diisopropyl)]-phosphoramidite; 5'-dimethoxytrityl-5-fluoro-2'-deoxyuridine, 3'-[(2-cyanoethyl)-(N,N-diisopropyl)]-phosphoramidite; and 5'-dimethoxytrityl-5-(pyren-1-yl-ethynyl)-2'-deoxyuridine, 3'-[(2-cyanoethyl)-(N,N-diisopropyl)]-phosphoramidite.
In addition, the headpiece oligonucleotide may be interspersed with modifications that increase solubility in organic solvents. For example, azobenzene phosphoramidite can introduce a hydrophobic moiety into the headpiece design. Such hydrophobic amidites may be inserted anywhere in the headpiece. However, the insertion cannot interfere with subsequent tagging using additional DNA tags during library synthesis, with subsequent PCR once selection is complete, or with microarray analysis if used for tag deconvolution. Such additions to the headpiece design described herein may render the headpiece soluble in, for example, 15%, 25%, 30%, 50%, 75%, 90%, 95%, 98%, 99%, or 100% organic solvent. Thus, the addition of hydrophobic residues to the headpiece design improves solubility under semi-aqueous or non-aqueous (e.g., organic) conditions while keeping the headpiece competent for oligonucleotide tagging. In addition, DNA tags subsequently introduced into the library may also be modified at the C5 site of T or C bases, so that they also render the library more hydrophobic and more soluble in organic solvents for subsequent steps of library synthesis.
In particular embodiments, the headpiece and the first tag may be the same entity, i.e., multiple headpiece-tag entities may be constructed, all sharing a common portion (e.g., a primer binding region) and all differing in another portion (e.g., a coding region). They can be used in the "split" step and pooled after the events they encode have occurred.
In particular embodiments, the headpiece may encode information, for example by including a sequence encoding the first split step or a sequence encoding the identity of the library, such as by using a particular sequence associated with a particular library.
Oligonucleotide tags
The oligonucleotide tags described herein (e.g., a tag, or a portion of a headpiece or tailpiece) can be used to encode any useful information, such as a molecule, a portion of a chemical entity, the addition of a component (e.g., a scaffold or a building block), a headpiece in a library, the identity of a library, the use of one or more library members (e.g., use of the members of an aliquot of a library), and/or the source of a library member (e.g., by using a source sequence).
Any sequence in the oligonucleotide may be used to encode any information. Thus, one oligonucleotide sequence may serve multiple purposes, for example to encode two or more types of information or to provide a starting oligonucleotide that also encodes one or more types of information. For example, the first tag may encode the addition of a first building block as well as the identity of the library. In another example, a headpiece can be used to provide a starting oligonucleotide that operably links a chemical entity to a tag, wherein the headpiece additionally includes a sequence encoding the identity of the library (i.e., a library recognition sequence). Thus, any of the information described herein can be encoded in separate oligonucleotide tags or can be combined and encoded in the same oligonucleotide sequence (e.g., an oligonucleotide tag, such as a tag or headpiece).
The building block sequence encodes the identity of the building block and/or the type of binding reaction to be performed using the building block. Such building block sequences are included in a tag, wherein the tag may optionally include one or more types of sequences (e.g., library-identifying sequences, use sequences, and/or source sequences) as described below.
The library recognition sequence encodes the identity of a particular library. To allow two or more libraries to be mixed, a library member may contain one or more library recognition sequences, such as in a library recognition tag (i.e., an oligonucleotide comprising a library recognition sequence), in a ligated tag, in a portion of the headpiece sequence, or in the tailpiece sequence. These library recognition sequences can be used to derive the encoding relationships by which the tag sequences are translated and correlated with chemical (synthetic) history information. Thus, these library recognition sequences allow two or more libraries to be mixed together for selection, amplification, purification, sequencing, and the like.
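The translation step described above can be sketched as a pair of lookups. The example assumes a hypothetical fixed-width read layout (a three-base library recognition sequence followed by four-base building block tags) and made-up lookup tables; real libraries need not use fixed widths or these sequences. It shows why the library recognition sequence matters when libraries are pooled: the same tag sequence can encode different building blocks in different libraries.

```python
# Hypothetical lookup tables; sequences and names are illustrative only.
LIBRARY_IDS = {"AAC": "library-1", "GGT": "library-2"}
BUILDING_BLOCKS = {
    "library-1": [{"ACGT": "A1", "TGCA": "A2"}, {"CCAA": "B1", "GGTT": "B2"}],
    "library-2": [{"ACGT": "X1"}, {"CCAA": "Y1"}],
}
ID_LEN, TAG_LEN = 3, 4  # assumed fixed field widths


def decode(read):
    """Translate a read into (library, building blocks), or None if unknown."""
    library = LIBRARY_IDS.get(read[:ID_LEN])
    if library is None:
        return None
    blocks = []
    for cycle, table in enumerate(BUILDING_BLOCKS[library]):
        start = ID_LEN + cycle * TAG_LEN
        block = table.get(read[start:start + TAG_LEN])
        if block is None:
            return None
        blocks.append(block)
    return library, tuple(blocks)
```

Here `decode("AACACGTGGTT")` and `decode("GGTACGTCCAA")` both contain the tag `ACGT` in the first cycle, yet resolve to different building blocks because the leading library recognition sequence selects a different table.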
The use sequence encodes the history (i.e., use) of one or more library members in an individual aliquot of the library. For example, separate aliquots can be treated with different reaction conditions, components, and/or selection steps. In particular, such a sequence can be used to identify these aliquots and infer their history (use), and thus allows aliquots of the same library having different histories (uses) (e.g., different selection experiments) to be mixed together for selection, amplification, purification, sequencing, and the like. These use sequences can be included in the headpiece, tailpiece, building block tag, use tag (i.e., an oligonucleotide that includes the use sequence), or any other tag described herein (e.g., a library-identifying tag or source tag).
The source sequence is a degenerate (i.e., randomly generated) oligonucleotide sequence of any useful length (e.g., about six nucleotides) that encodes the source of a library member. Such a sequence randomly subdivides library members that are otherwise identical in all respects into entities that are distinguishable by sequence information, so that amplification products derived from distinct progenitor templates (e.g., distinct selected library members) can be distinguished from multiple amplification products derived from the same progenitor template (e.g., a single selected library member). For example, after library formation and prior to the selection step, each library member may include a different source sequence, for example in a source tag. After selection, the selected library members can be amplified to produce amplification products, and the portion of each library member expected to include the source sequence (e.g., in the source tag) can be observed and compared with the source sequence in each of the other library members. Since the source sequences are degenerate, the amplification products of different library members should have different source sequences. Observation of the same source sequence in several amplification products may therefore indicate multiple amplicons derived from the same template molecule. The source tag may be used when it is desired to determine the statistics of the population of encoding tags prior to amplification rather than after amplification. These source sequences can be included in the headpiece, tailpiece, building block tag, source tag (i.e., an oligonucleotide that includes the source sequence), or any other tag described herein (e.g., a library-identifying tag or a use tag).
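The counting logic behind the source sequence can be sketched in a few lines. The example assumes reads have already been split into a hypothetical (encoding sequence, source sequence) pair; the sequences and the function name are illustrative. Reads that share both the encoding sequence and the source sequence are treated as amplification copies of one template, so each encoded compound is counted once per distinct source sequence.

```python
from collections import defaultdict


def count_unique_templates(reads):
    """Count distinct source sequences observed for each encoding sequence.

    `reads` is an iterable of (encoding_sequence, source_sequence) pairs.
    Identical pairs are assumed to be amplification duplicates.
    """
    sources = defaultdict(set)
    for encoding, source in reads:
        sources[encoding].add(source)
    return {encoding: len(seen) for encoding, seen in sources.items()}


# Hypothetical sequencing output after selection and PCR.
reads = [
    ("ACGTCCAA", "GATTAC"),
    ("ACGTCCAA", "GATTAC"),  # same source sequence: likely a PCR duplicate
    ("ACGTCCAA", "TTGCAA"),
    ("TGCAGGTT", "CCATGG"),
]
counts = count_unique_templates(reads)
```

A degenerate sequence of six positions can take 4^6 = 4096 values, so two independent templates of the same compound can occasionally share a source sequence by chance; this sketch ignores such collisions and sequencing errors.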
Any type of sequence described herein may be included in the headpiece. For example, the headpiece can include one or more of a building block sequence, a library recognition sequence, a use sequence, or a source sequence.
Any of these sequences described herein may be included in the tailpiece. For example, the tailpiece can include one or more of a library recognition sequence, a use sequence, or a source sequence.
Any of the tags described herein may include a linker having a fixed sequence at or near the 5'- or 3'-end. The linker facilitates the formation of a bond (e.g., a chemical bond) by providing a reactive group (e.g., a chemically reactive group or a photoreactive group) or by providing a site for a reagent that allows formation of a bond (e.g., a reagent having an intercalating moiety or a reversibly reactive group in the linker or in a cross-linking oligonucleotide). Each 5'-linker may be the same or different, and each 3'-linker may be the same or different. In an exemplary non-limiting complex with more than one tag, each tag can include a 5'-linker and a 3'-linker, where each 5'-linker has the same sequence and each 3'-linker has the same sequence (e.g., where the sequence of the 5'-linker can be the same as or different from the sequence of the 3'-linker). The linker provides a sequence that can be used for one or more bonds. To allow binding of the transfer primer or hybridization of the cross-linking oligonucleotide, the linker may include one or more functional groups that allow bond formation (e.g., a bond, such as a chemical bond, for which the polymerase has reduced read-through or translocation capability).
These sequences may include any modification described herein for an oligonucleotide, such as one or more modifications that promote solubility in organic solvents (e.g., any described herein, such as for a headpiece), that provide an analog of the natural phosphodiester bond (e.g., a phosphorothioate analog), or that provide one or more non-natural nucleotides (e.g., 2'-substituted nucleotides, such as 2'-O-methylated nucleotides and 2'-fluoro nucleotides, or any of the nucleotides described herein).
These sequences may include any of the features described herein for the oligonucleotides. For example, these sequences may be included in a tag of less than 20 nucleotides (e.g., a tag as described herein). In other examples, tags comprising one or more of these sequences have about the same mass (e.g., each tag has a mass within about +/-10% of the average mass of the particular set of tags that encodes a particular variable); lack a primer binding (e.g., constant) region; lack a constant region; or have a constant region of reduced length (e.g., less than 30 nucleotides, less than 25 nucleotides, less than 20 nucleotides, less than 19 nucleotides, less than 18 nucleotides, less than 17 nucleotides, less than 16 nucleotides, less than 15 nucleotides, less than 14 nucleotides, less than 13 nucleotides, less than 12 nucleotides, less than 11 nucleotides, less than 10 nucleotides, less than 9 nucleotides, less than 8 nucleotides, or less than 7 nucleotides in length).
Sequencing strategies for libraries and oligonucleotides of this length may optionally include concatenation or linkage strategies to increase read fidelity or sequencing depth, respectively. In particular, the selection of encoded libraries lacking primer binding regions has been described in the literature for SELEX, such as in Jarosch et al., Nucleic Acids Res. 34: e86 (2006), which is incorporated herein by reference. For example, library members can be modified (e.g., after the selection step) to include a first adaptor sequence on the 5'-end of the complex and a second adaptor sequence on the 3'-end of the complex, wherein the first adaptor sequence is substantially complementary to the second adaptor sequence and causes duplex formation. To further improve yield, two fixed dangling nucleotides (e.g., CC) are added to the 5'-end.
Bonds
The bonds of the invention are present between oligonucleotides that encode information (e.g., between the headpiece and a tag, between two tags, or between a tag and a tailpiece). Exemplary bonds include phosphodiester linkages, phosphonate linkages, and phosphorothioate linkages. In some embodiments, the polymerase has reduced ability to read through or translocate through one or more bonds. In certain embodiments, the chemical bond includes one or more chemically reactive groups, such as a monophosphate and/or hydroxyl group, a photoreactive group, an intercalating moiety, a cross-linking oligonucleotide, or a reversible co-reactive group.
A bond can be tested to determine whether the polymerase has reduced ability to read through or translocate through the bond. This ability can be tested by any useful method, such as liquid chromatography-mass spectrometry, RT-PCR analysis, sequence population statistics, and/or PCR analysis. In some embodiments, chemical ligation comprises the use of one or more chemically reactive pairs to provide the bond, such as a monophosphate and a hydroxyl group. As described herein, a readable bond can be synthesized by chemical ligation, for example, by reacting a monophosphate, monothiophosphate, or monophosphonate at the 5'- or 3'-terminus with a hydroxyl group at the 5'- or 3'-terminus in the presence of cyanoimidazole and a divalent metal source (e.g., ZnCl2).
Other exemplary chemically reactive pairs include: an optionally substituted alkynyl group and an optionally substituted azido group, which form a triazole via a Huisgen 1,3-dipolar cycloaddition reaction; an optionally substituted diene having a 4 π-electron system (e.g., an optionally substituted 1,3-unsaturated compound, such as optionally substituted 1,3-butadiene, 1-methoxy-3-trimethylsilyl-1,3-butadiene, cyclopentadiene, cyclohexadiene, or furan) and an optionally substituted dienophile or optionally substituted heterodienophile having a 2 π-electron system (e.g., an optionally substituted alkenyl group or an optionally substituted alkynyl group), which form a cycloalkene via a Diels-Alder reaction; a nucleophile (e.g., an optionally substituted amine or an optionally substituted thiol) and a strained heterocyclic electrophile (e.g., an optionally substituted epoxide, aziridinium ion, or episulfonium ion), which form a heteroalkyl group via a ring-opening reaction; a phosphorothioate group and an iodo group, as in the splinted ligation of an oligonucleotide containing 5'-iodo-dT with a 3'-phosphorothioate oligonucleotide; an optionally substituted amino group and an aldehyde or ketone group, such as the reaction of a 3'-aldehyde-modified oligonucleotide (which may optionally be obtained by oxidation of a commercially available 3'-glyceryl-modified oligonucleotide) with a 5'-amino oligonucleotide (i.e., in a reductive amination reaction) or a 5'-hydrazine oligonucleotide; an optionally substituted amino group and a carboxylic acid group or a thiol group (e.g., with or without the use of trans-4-(maleimidomethyl)cyclohexane-1-carboxylic acid succinimidyl ester (SMCC) or 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDAC)); an optionally substituted hydrazine and an aldehyde or ketone group; an optionally substituted hydroxylamine and an aldehyde or ketone group; or a nucleophile and an optionally substituted alkyl halide.
Platinum complexes, alkylating agents, or furan modified nucleotides may also be used as chemically reactive groups to form inter-or intra-chain linkages. Such a reagent may be used between two oligonucleotides, and it may optionally be present in a cross-linked oligonucleotide.
Exemplary non-limiting platinum complexes include cisplatin (cis-diamminedichloroplatinum(II), e.g., to form GG intrachain bonds), transplatin (trans-diamminedichloroplatinum(II), e.g., to form GXG interchain bonds, where X may be any nucleotide), carboplatin, picoplatin (ZD0473), ormaplatin, or oxaliplatin, to form, e.g., GC, CG, AG, or GG bonds. Any of these bonds may be interchain or intrachain bonds.
Exemplary non-limiting alkylating agents include nitrogen mustards (e.g., mechlorethamine (e.g., to form GG linkages), chlorambucil, melphalan, cyclophosphamide, or prodrug forms of cyclophosphamide (e.g., 4-hydroperoxycyclophosphamide and ifosfamide)), 1,3-bis(2-chloroethyl)-1-nitrosourea (BCNU, carmustine), aziridines (e.g., mitomycin C, triethylenemelamine, or triethylenethiophosphoramide (thiotepa) to form GG or AG linkages), hexamethylmelamine, alkyl sulfonates (e.g., busulfan to form GG linkages), or nitrosoureas (e.g., 2-chloroethylnitrosoureas to form GG or CG linkages, such as carmustine (BCNU), chlorozotocin, lomustine (CCNU), and semustine (methyl-CCNU)). Any of these bonds may be interchain or intrachain bonds.
Furan-modified nucleotides may also be used to form the bond. Upon in situ oxidation (e.g., with N-bromosuccinimide (NBS)), the furan moiety forms a reactive oxo-enal derivative that reacts with the complementary base to form an interchain bond. In some embodiments, the furan-modified nucleotide forms a bond with a complementary A or C nucleotide. Exemplary non-limiting furan-modified nucleotides include any 2'-(furan-2-yl)propionylamino-modified nucleotide, or an acyclic modified nucleotide of a 2-(furan-2-yl)ethyl glycol nucleic acid.
Photoreactive groups may also be used as reactive groups. Exemplary non-limiting photoreactive groups include an intercalating moiety, a psoralen derivative (e.g., psoralen, HMT-psoralen, or 8-methoxypsoralen), an optionally substituted cyanovinylcarbazole group, an optionally substituted vinylcarbazole group, an optionally substituted cyanovinyl group, an optionally substituted acrylamide group, an optionally substituted diazirine group, an optionally substituted benzophenone (e.g., the succinimidyl ester of 4-benzoylbenzoic acid or benzophenone isothiocyanate), an optionally substituted 5-(carboxy)vinyluridine group (e.g., 5-(carboxy)vinyl-2'-deoxyuridine), or an optionally substituted azide group (e.g., an aryl azide or haloaryl azide, such as the succinimidyl ester of 4-azido-2,3,5,6-tetrafluorobenzoic acid (ATFB)).
The intercalating moiety may also serve as a reactive group. Exemplary non-limiting intercalating moieties include psoralen derivatives, alkaloid derivatives (e.g., berberine, palmatine, sanguinarine (e.g., an iminium or alkanolamine form thereof), or aristololactam-β-D-glucoside), ethidium cations (e.g., ethidium bromide), acridine derivatives (e.g., proflavine, acridine yellow, or amsacrine), anthracycline derivatives (e.g., doxorubicin, epirubicin, daunorubicin (daunomycin), and idarubicin), or thalidomide.
For cross-linking oligonucleotides, any available reactive group (e.g., a group described herein) can be used to form inter- or intra-chain bonds. Exemplary reactive groups include chemically reactive groups, photoreactive groups, intercalating moieties, and reversible co-reactive groups. Cross-linking reagents for use with the cross-linking oligonucleotide include, but are not limited to, alkylating agents (e.g., as described herein), cisplatin (cis-diamminedichloroplatinum(II)), trans-diamminedichloroplatinum(II), psoralen, HMT-psoralen, 8-methoxypsoralen, furan-modified nucleotides, 2-fluoro-deoxyinosine (2-F-dI), 5-bromo-deoxycytidine (5-Br-dC), 5-bromo-deoxyuridine (5-Br-dU), 5-iodo-deoxycytidine (5-I-dC), 5-iodo-deoxyuridine (5-I-dU), trans-4-(maleimidomethyl)cyclohexane-1-carboxylic acid succinimidyl ester (SMCC), EDAC, or acetylthioacetic acid succinimidyl ester (SATA).
Oligonucleotides may also be modified to contain thiol moieties, which can react with various thiol-reactive groups, such as maleimides, halogens, and iodoacetamides, and thus can be used to cross-link two oligonucleotides. The thiol group may be attached to the 5'- or 3'-terminus of the oligonucleotide.
For interchain cross-linking between double-stranded oligonucleotides at pyrimidine (e.g., thymidine) positions, the intercalating photoreactive moiety psoralen may be selected. Upon irradiation with ultraviolet light (about 254 nm), psoralen intercalates into the duplex and forms covalent interchain crosslinks with pyrimidines, preferably at 5'-TpA sites. The psoralen moiety may be covalently linked to the modified oligonucleotide (e.g., via an alkane chain, such as C1-10 alkyl, or a polyglycol group, such as -(CH2CH2O)nCH2CH2-, where n is an integer from 1 to 50). Psoralen derivatives may also be used, with non-limiting examples including 4'-(hydroxyethoxymethyl)-4,5',8-trimethylpsoralen (HMT-psoralen) and 8-methoxypsoralen.
The various portions of the cross-linking oligonucleotide may be modified to introduce bonds. For example, a terminal phosphorothioate in an oligonucleotide may also be used to ligate two adjacent oligonucleotides. Halogenated uracils/cytosines may also be used as cross-linker modifications in oligonucleotides. For example, a 2-fluoro-deoxyinosine (2-F-dI) modified oligonucleotide may be reacted with a disulfide containing diamine or thiopropylamine to form a disulfide bond.
As described below, reversible co-reactive groups include those selected from the group consisting of cyanovinylcarbazole groups, cyanovinyl groups, acrylamide groups, thiol groups, and sulfonylethyl sulfides. Optionally substituted cyanovinylcarbazole (CNV) groups may also be used in oligonucleotides to cross-link to pyrimidine bases (e.g., cytosine, thymine, and uracil, and their modified bases) in the complementary strand. Upon irradiation at 366 nm, the CNV group undergoes [2+2] cycloaddition with the adjacent pyrimidine base, which produces an interchain crosslink. Irradiation at 312 nm reverses the crosslink and thus provides a means for reversible cross-linking of oligonucleotide strands. A non-limiting CNV group is 3-cyanovinylcarbazole, which may be included as a carboxyvinylcarbazole nucleotide (e.g., as 3-carboxyvinylcarbazole-1'-β-deoxynucleoside-5'-triphosphate).
The CNV group can be modified by replacing the reactive cyano group with another reactive group to provide an optionally substituted vinylcarbazole group. Exemplary non-limiting reactive groups for the vinylcarbazole group include an amide group of -CONRN1RN2, wherein each of RN1 and RN2 may be the same or different and is independently H or C1-6 alkyl, e.g., -CONH2; a carboxyl group of -CO2H; or a C2-7 alkoxycarbonyl group (e.g., methoxycarbonyl). Further, the reactive group may be located on the alpha or beta carbon of the vinyl group. Exemplary vinylcarbazole groups include cyanovinylcarbazole groups as described herein; aminovinylcarbazole groups (e.g., aminovinylcarbazole nucleotides, such as 3-aminovinylcarbazole-1'-β-deoxynucleoside-5'-triphosphate); carboxyvinylcarbazole groups (e.g., carboxyvinylcarbazole nucleotides, such as 3-carboxyvinylcarbazole-1'-β-deoxynucleoside-5'-triphosphate); and C2-7 alkoxycarbonylvinylcarbazole groups (e.g., alkoxycarbonylvinylcarbazole nucleotides, such as 3-methoxycarbonylvinylcarbazole-1'-β-deoxynucleoside-5'-triphosphate). Additional optionally substituted vinylcarbazole groups and nucleotides with such groups are provided in U.S. Patent 7,972,792 and in Yoshimura and Fujimoto, Org. Lett. 10:3227-3230 (2008), which are hereby incorporated by reference in their entirety.
Other reversibly reactive groups include a thiol group and another thiol group to form a disulfide, and a thiol group and a vinyl sulfone group to form a sulfonylethyl sulfide. The thiol-thiol group may optionally include a bond formed by reaction with bis- ((N-iodoacetyl) piperazinyl) sulforhodamine. Other reversibly reactive groups (e.g., such as certain photoreactive groups) include optionally substituted benzophenone groups. A non-limiting example is benzophenone uracil (BPU), which can be used for site-selective formation and sequence-selective formation of interchain crosslinks of BPU-containing oligonucleotide duplexes. This crosslinking can be reversed upon heating, providing a means for reversible crosslinking of the two oligonucleotide strands.
In other embodiments, chemical ligation includes the introduction of analogs of phosphodiester bonds, e.g., for post-selection PCR analysis and sequencing. Exemplary analogs of phosphodiesters include a phosphorothioate linkage (e.g., a linkage as introduced by use of a phosphorothioate group and a leaving group such as an iodo group), a phosphoamide linkage, or a phosphorodithioate linkage (e.g., a linkage as introduced by use of a phosphorodithioate group and a leaving group such as an iodo group).
For any group described herein (e.g., a chemically reactive group, a photoreactive group, an intercalating moiety, a cross-linked oligonucleotide, or a reversible co-reactive group), the group can be incorporated at or near the end of the oligonucleotide or between the 5 '-and 3' -ends. In addition, one or more groups may be present in each oligonucleotide. When a reactive group pair is desired, the oligonucleotide can be designed to facilitate the reaction between the group pair. In a non-limiting example of a cyanovinylcarbazole group co-reactive with a pyrimidine base, the first oligonucleotide may be designed to include a cyanovinylcarbazole group at or near the 5' -terminus. In this example, the second oligonucleotide may be designed to be complementary to the first oligonucleotide and include a co-reactive pyrimidine base at a site that aligns with the cyanovinylcarbazole group when the first and second oligonucleotides hybridize. Any of the groups herein and any oligonucleotide having one or more groups can be designed to facilitate a reaction between the groups to form one or more bonds.
Bifunctional spacer
The bifunctional spacer between the headpiece and the chemical entity may be altered to provide an appropriate spacer moiety and/or to increase the solubility of the headpiece in organic solvents. A variety of spacers are commercially available that can couple the headpiece to a small molecule library. The spacer is generally composed of straight or branched chains and may include C1-10 alkyl, 1- to 10-atom heteroalkyl, C2-10 alkenyl, C2-10 alkynyl, C5-10 aryl, 3- to 20-atom ring or polycyclic systems, phosphodiesters, peptides, oligosaccharides, oligonucleotides, oligomers, polymers, or polyalkylene glycols (e.g., polyethylene glycols, such as -(CH2CH2O)nCH2CH2-, where n is an integer from 1 to 50), or a combination thereof.
Bifunctional spacers can provide an appropriate spacer moiety between the headpiece and the chemical entity of the library. In certain embodiments, the bifunctional spacer comprises three moieties. Moiety 1 may be a reactive group that forms a covalent bond with DNA, such as a carboxylic acid, preferably activated as an N-hydroxysuccinimide (NHS) ester to react with an amino group (e.g., amino-modified dT) on the DNA; an amidite for modifying the 5'- or 3'-end of a single-stranded headpiece (by standard oligonucleotide chemistry); a chemically co-reactive pair (e.g., azido-alkyne cycloaddition in the presence of a Cu(I) catalyst, or any described herein); or a thiol-reactive group. Moiety 2 may also be a reactive group, one that forms a covalent bond with the chemical entity, whether building block An or a scaffold. Such reactive groups may be, for example, amines, thiols, azides, or alkynes. Moiety 3 may be a chemically inert spacer moiety of variable length, introduced between moieties 1 and 2. Such spacer moieties can be chains of ethylene glycol units (e.g., PEGs of different lengths), alkanes, alkenes, polyalkene chains, or peptide chains. The spacer may comprise branches or inserts with hydrophobic moieties (e.g., benzene rings) to improve solubility of the headpiece in organic solvents, as well as fluorescent moieties (e.g., fluorescein or Cy-3) for library detection purposes. Hydrophobic residues in the headpiece design may be varied together with the spacer design to facilitate library synthesis in organic solvents. For example, the headpiece and spacer combination is designed to have appropriate residues such that the octanol:water coefficient (Poct) is, e.g., from 1.0 to 2.5. Spacers can be empirically selected for a given small molecule library design such that the library can be synthesized in organic solvent, e.g., 15%, 25%, 30%, 50%, 75%, 90%, 95%, 98%, 99%, or 100% organic solvent.
A model reaction can be used prior to library synthesis to vary the spacer and select the appropriate chain length that solubilizes the headpiece in organic solvent. Exemplary spacers include those with increased alkyl chain length, increased polyethylene glycol units, positive charges (to neutralize the negative phosphate charges on the headpiece), or increased hydrophobicity (e.g., addition of benzene ring structures).
Examples of commercially available spacers include amino-carboxylic acid spacers, such as those that are peptides (e.g., Z-Gly-Gly-Gly-OSu (N-α-benzyloxycarbonyl-(glycine)3-N-succinimidyl ester) or Z-Gly-Gly-Gly-Gly-Gly-Gly-OSu (N-α-benzyloxycarbonyl-(glycine)6-N-succinimidyl ester, SEQ ID NO:13)), PEG (e.g., Fmoc-amino-PEG2000-NHS or amino-PEG(12-24)-NHS), or alkane acid chains (e.g., Boc-ε-aminocaproic acid-OSu); chemically co-reactive pair spacers, such as those described herein combined with a peptide moiety (e.g., azidohomoalanine-Gly-Gly-Gly-OSu (SEQ ID NO:2) or propargylglycine-Gly-Gly-Gly-OSu (SEQ ID NO:3)), PEG (e.g., azido-PEG-NHS), or an alkane chain moiety (e.g., 5-azidopentanoic acid, (S)-2-(azidomethyl)-1-Boc-pyrrolidine, 4-azidoaniline, or 4-azido-butan-1-oic acid N-hydroxysuccinimide ester); thiol-reactive spacers, such as those that are PEG (e.g., SM(PEG)n NHS-PEG-maleimide) or alkane chains (e.g., 3-(pyridin-2-yldisulfanyl)-propionic acid-OSu or sulfosuccinimidyl 6-(3'-[2-pyridyldithio]-propionamido)hexanoate); and amidites used in oligonucleotide synthesis, such as amino modifiers (e.g., 6-(trifluoroacetylamino)-hexyl-(2-cyanoethyl)-(N,N-diisopropyl)-phosphoramidite), thiol modifiers (e.g., S-trityl-6-mercaptohexyl-1-[(2-cyanoethyl)-(N,N-diisopropyl)]-phosphoramidite), or chemically co-reactive pair modifiers (e.g., 6-hexyn-1-yl-(2-cyanoethyl)-(N,N-diisopropyl)-phosphoramidite, 3-dimethoxytrityloxy-2-(3-(3-propargyloxypropionylamino)propionylamino)propyl-1-O-succinyl, long chain alkylamino CPG, or 4-azido-butan-1-oic acid N-hydroxysuccinimide ester).
Additional spacers are known in the art, and those that can be used during library synthesis include, but are not limited to, 5'-O-dimethoxytrityl-1',2'-dideoxyribose-3'-[(2-cyanoethyl)-(N,N-diisopropyl)]-phosphoramidite; 9-O-dimethoxytrityl-triethylene glycol, 1-[(2-cyanoethyl)-(N,N-diisopropyl)]-phosphoramidite; 3-(4,4'-dimethoxytrityloxy)propyl-1-[(2-cyanoethyl)-(N,N-diisopropyl)]-phosphoramidite; and 18-O-dimethoxytrityl hexaethylene glycol, 1-[(2-cyanoethyl)-(N,N-diisopropyl)]-phosphoramidite. Any of the spacers herein may be added in tandem with one another in different combinations to produce spacers of different desired lengths.
The spacers may also be branched. Branched spacers are well known in the art, and examples may consist of symmetric or asymmetric doublers or symmetric treblers. See, e.g., Newkome et al., Dendritic Molecules: Concepts, Synthesis, Perspectives, VCH Publishers (1996); Boussif et al., Proc. Natl. Acad. Sci. USA 92:7297; and Jansen et al., Science 266:1226 (1994).
Method for determining the nucleotide sequence of a complex
The invention features methods that include determining the nucleotide sequence of a complex such that an encoding relationship can be established between the sequence of the assembled tag sequence and the structural units (or building blocks) of the chemical entity. In particular, the identity and/or history of the chemical entity may be inferred from the base sequence in the oligonucleotide. Using this approach, libraries comprising different chemical entities or members (e.g., small molecules or peptides) can be addressed with specific tag sequences.
Any of the linkages described herein may be reversible or irreversible. Reversible linkages include photoreactive linkages (e.g., a cyanovinylcarbazole group and thymidine) and redox linkages. Additional linkages are described herein.
In an alternative embodiment, an "unreadable" linkage may be enzymatically repaired to produce a readable or at least displaceable linkage. Enzyme repair processes are well known to those skilled in the art and include, but are not limited to, pyrimidine (e.g., thymidine) dimer repair mechanisms (e.g., using a photolyase or a glycosylase (e.g., T4 pyrimidine dimer glycosylase (PDG))); base excision repair mechanisms (e.g., using a glycosylase, an apurinic/apyrimidinic (AP) endonuclease, a Flap endonuclease, or a poly ADP ribose polymerase (e.g., human apurinic/apyrimidinic (AP) endonuclease, APE 1; endonuclease III (Nth) protein; endonuclease IV; endonuclease V; formamidopyrimidine [fapy]-DNA glycosylase (Fpg); human 8-oxoguanine glycosylase 1 (α isoform) (hOGG1); human endonuclease VIII-like 1 (hNEIL1); uracil-DNA glycosylase (UDG); human single-strand selective monofunctional uracil DNA glycosylase (SMUG1); and human alkyladenine DNA glycosylase (hAAG)), which may optionally be combined with one or more endonucleases, DNA or RNA polymerases, and/or ligases for repair); methylation repair mechanisms (e.g., using methylguanine methyltransferase); AP repair mechanisms (e.g., using an apurinic/apyrimidinic (AP) endonuclease (e.g., APE 1; endonuclease III; endonuclease IV; endonuclease V; Fpg; hOGG1; and hNEIL1), which may optionally be combined with one or more endonucleases, DNA or RNA polymerases, and/or ligases for repair); nucleotide excision repair mechanisms (e.g., using excision repair cross-complementing proteins or excision nucleases, which may optionally be combined with one or more endonucleases, DNA or RNA polymerases, and/or ligases for repair); and mismatch repair mechanisms (e.g., using an endonuclease (e.g., T7 endonuclease I; MutS, MutH, and/or MutL), which may optionally be combined with one or more exonucleases, endonucleases, helicases, DNA or RNA polymerases, and/or ligases for repair).
Commercial enzyme mixtures can be used to readily provide these types of repair mechanisms, for example, PreCR Repair Mix (New England Biolabs Inc., Ipswich MA), which includes Taq DNA ligase, endonuclease IV, Bst DNA polymerase, Fpg, uracil-DNA glycosylase (UDG), T4 PDG (T4 endonuclease V), and endonuclease VIII.
Method for coding chemical entities within a library
The methods of the invention may utilize libraries having varying numbers of chemical entities encoded by oligonucleotide tags. Examples of building blocks and encoding DNA tags can be found in U.S. patent application publication 2007/0224607, which is incorporated herein by reference.
Each chemical entity is formed by one or more building blocks and optionally a scaffold. The scaffold serves to provide one or more diversity nodes in a particular geometry (e.g., a triazine providing three nodes spatially arranged around a heteroaryl ring, or a linear geometry).
Building blocks and their encoding tags can be added to the headpiece directly or indirectly (e.g., via a spacer) to form a complex. When the headpiece includes a spacer, a building block or scaffold is added to the end of the spacer. When a spacer is not present, a building block may be added directly to the headpiece, or the building block itself may include a spacer that reacts with a functional group of the headpiece. Exemplary spacers and headpieces are described herein.
The scaffold may be added in any useful manner. For example, a scaffold may be added to the end of a spacer or headpiece, and successive building blocks may be added to the available diversity nodes of the scaffold. In another example, building block An is first added to the spacer or headpiece, and then a diversity node of scaffold S is reacted with a functional group in building block An. Oligonucleotide tags encoding a particular scaffold may optionally be added to the headpiece or complex. For example, Sn is added to the complex in n reaction vessels, where n is an integer greater than 1, and tag Sn (i.e., tag S1, S2, …, Sn-1, Sn) is bound to a functional group of the complex.
Building blocks may be added in multiple synthetic steps. For example, an aliquot of the headpiece, optionally with a spacer attached, is divided into n reaction vessels, where n is an integer of 2 or greater. In a first step, building block An is added to each of the n reaction vessels (i.e., building blocks A1, A2, …, An-1, An are added to reaction vessels 1, 2, …, n-1, n), where n is an integer and each building block An is unique. In a second step, scaffold S is added to each reaction vessel to form An-S complexes. Optionally, scaffold Sn may be added to each reaction vessel to form An-Sn complexes, where n is an integer greater than two and each scaffold Sn may be unique. In a third step, building block Bn is added to each of the n reaction vessels containing an An-S complex (i.e., building blocks B1, B2, …, Bn-1, Bn are added to reaction vessels 1, 2, …, n-1, n containing the A1-S, A2-S, …, An-1-S, An-S complexes), where each building block Bn is unique. In a further step, building block Cn may be added to each of the n reaction vessels containing a Bn-An-S complex (i.e., building blocks C1, C2, …, Cn-1, Cn are added to reaction vessels 1, 2, …, n-1, n containing the B1-A1-S … Bn-An-S complexes), where each building block Cn is unique. The resulting library will have n³ complexes bearing n³ tags. In this way, additional synthetic steps can be used to incorporate additional building blocks to further diversify the library.
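The n³ figure above can be checked with a small enumeration. The following is a minimal sketch, assuming that the reaction vessels are pooled and re-split between steps so that every combination of building blocks arises; the names `enumerate_library`, `a_blocks`, `b_blocks`, and `c_blocks` and the value n = 3 are hypothetical and serve only as illustration.

```python
from itertools import product


def enumerate_library(*cycles):
    """Return every combination of one building block per synthesis cycle."""
    return ["-".join(combo) for combo in product(*cycles)]


n = 3  # hypothetical number of building blocks per cycle
a_blocks = [f"A{i}" for i in range(1, n + 1)]
b_blocks = [f"B{i}" for i in range(1, n + 1)]
c_blocks = [f"C{i}" for i in range(1, n + 1)]

# Three cycles of n building blocks each give n ** 3 library members.
library = enumerate_library(a_blocks, b_blocks, c_blocks)
library_size = len(library)
```

With n = 1,000 and three cycles the same count would reach 10⁹ members, which is why such libraries are enumerated virtually rather than listed by hand.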
After formation of the library, the resulting complex may optionally be purified and subjected to a polymerization or ligation reaction, e.g., to a headpiece. This general strategy can be extended to include additional diversity nodes and components (e.g., D, E, F, etc.). For example, the first diversity node reacts with the building block and/or S and is encoded by the oligonucleotide tag. Additional building blocks are then reacted with the resulting complex and subsequent diversity nodes are derived from the additional building blocks, which are encoded by the primers used in the polymerization or ligation reaction.
To form the encoded library, oligonucleotide tags are added to the complexes after or before each synthesis step. For example, before or after the addition of building block An to each reaction vessel, tag An is bound to a functional group of the headpiece (i.e., tags A1, A2, …, An-1, An are added to reaction vessels 1, 2, …, n-1, n containing the headpiece). Each tag An has a distinct sequence that correlates with each unique building block An, and determining the sequence of tag An provides the chemical structure of building block An. In this way, additional tags are used to encode additional building blocks or additional scaffolds.
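The encoding relationship described above amounts to a lookup from tag sequence to building block. The sketch below is an illustration only: it assumes fixed-length tags concatenated in synthesis order, and the tag sequences, the codebooks `TAGS_A` and `TAGS_B`, and the function `decode` are all hypothetical.

```python
# Hypothetical codebooks: each distinct tag sequence maps to one building block.
TAGS_A = {"ACGT": "A1", "TGCA": "A2"}
TAGS_B = {"GATC": "B1", "CTAG": "B2"}
TAG_LEN = 4  # assumed fixed tag length


def decode(read, codebooks, tag_len=TAG_LEN):
    """Split a read into consecutive tags and look up each building block.

    Returns a tuple of building block names, or None if any tag is unknown.
    """
    blocks = []
    for i, book in enumerate(codebooks):
        tag = read[i * tag_len:(i + 1) * tag_len]
        if tag not in book:
            return None
        blocks.append(book[tag])
    return tuple(blocks)


example = decode("ACGTCTAG", [TAGS_A, TAGS_B])
```

Real tags would also need to tolerate sequencing errors; that is outside the scope of this sketch.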
In addition, the last label added to the complex may also include a primer binding sequence or provide a functional group that allows binding (e.g., by ligation) of a primer binding sequence. The primer binding sequences can be used to amplify and/or sequence the oligonucleotide tags of the complexes. Exemplary methods for amplification and for sequencing include Polymerase Chain Reaction (PCR), linear amplification (LCR), Rolling Circle Amplification (RCA), or any other method known in the art for amplifying or determining nucleic acid sequences.
Using these methods, large libraries can be formed with large numbers of encoded chemical entities. For example, a headpiece is reacted with a spacer and building block An, where the building block includes 1,000 different variants (i.e., n = 1,000). For each building block An, DNA tag An is ligated or primer-extended to the headpiece. These reactions can be performed in a 1,000-well plate or in 10 x 100-well plates. All reactions can be pooled, optionally purified, and split into a second set of plates. Next, the same procedure can be performed with building block Bn, which also includes 1,000 different variants. DNA tag Bn can be ligated to the An-headpiece complex, and all reactions can be pooled. The resulting library includes 1,000 x 1,000 combinations of An x Bn (i.e., 1,000,000 compounds) tagged with 1,000,000 different combinations of tags. The same approach can be extended to add building blocks Cn, Dn, En, etc. The resulting library can then be used to identify compounds that bind to the target. The structure of the chemical entities that bind to the target can optionally be assessed by PCR and sequencing of the DNA tags to identify the compounds that were enriched.
This method may be modified to avoid tagging after the addition of each building block or to avoid pooling (or mixing). For example, the method can be modified by adding building block An to n reaction vessels (where n is an integer greater than 1) and adding the same building block B1 to each reaction well. Here, B1 is identical for each chemical entity and, therefore, an oligonucleotide tag encoding this building block is not needed. After building blocks are added, the complexes may or may not be pooled. For example, the library is not pooled after the final step of building block addition, and the pools are screened separately to identify compounds that bind to the target. To avoid pooling all the reactions after synthesis, binding on a sensor surface can be monitored in a high-throughput format (e.g., 384-well plates and 1,536-well plates) using, for example, ELISA, SPR, ITC, Tm shift, SEC, or a similar assay. For example, building block An can be encoded by DNA tag An, and building block B can be encoded by its position within the well plate. Candidate compounds can then be identified by using a binding assay (e.g., ELISA, SPR, ITC, Tm shift, SEC, or a similar assay) and by analyzing the An tags by sequencing, microarray analysis, and/or restriction digest analysis. This analysis allows the identification of combinations of building blocks An and Bn that produce the desired molecule.
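The positional encoding described above can be sketched as follows: the well identifies building block B, and the sequenced tags found in a well that scores in the binding assay identify building block A. This is a minimal sketch under stated assumptions; the function `identify_hits`, the well names, the signal values, and the threshold are hypothetical.

```python
def identify_hits(well_to_b, binding_signal, tag_reads, tag_to_a, threshold):
    """Combine plate position (encodes B) with sequenced tags (encode A).

    Only wells whose binding assay signal reaches the threshold are decoded.
    """
    hits = set()
    for well, signal in binding_signal.items():
        if signal < threshold:
            continue
        b_block = well_to_b[well]
        for tag in tag_reads.get(well, []):
            a_block = tag_to_a.get(tag)
            if a_block is not None:  # skip tags that are not in the codebook
                hits.add((a_block, b_block))
    return sorted(hits)


# Hypothetical two-well example.
well_to_b = {"W1": "B1", "W2": "B2"}
binding_signal = {"W1": 0.9, "W2": 0.1}
tag_reads = {"W1": ["ACGT", "ACGT", "TTTT"], "W2": ["TGCA"]}
tag_to_a = {"ACGT": "A1", "TGCA": "A2"}

hits = identify_hits(well_to_b, binding_signal, tag_reads, tag_to_a, threshold=0.5)
```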
The amplification method can optionally include forming a water-in-oil emulsion to form a plurality of aqueous microreactors. Reaction conditions (e.g., concentration of complexes and size of microreactors) can be adjusted to provide (on average) microreactors having at least one member of a library of compounds. Each microreactor may also comprise a target, a single bead capable of binding to a complex or a portion of a complex (e.g., one or more labels) and/or binding to a target, and an amplification reaction solution having one or more necessary reagents for nucleic acid amplification. After amplification of the label in the microreactor, the amplified copy of the label will bind to the bead in the microreactor and the coated bead can be identified by any available method.
Once the building blocks from the first library that bind to the target of interest are identified, a second library can be prepared in an iterative manner. For example, one or two additional diversity nodes can be added and a second library formed and sampled, as described herein. This process can be repeated as many times as necessary to form a molecule having the desired molecular and pharmaceutical properties.
Various ligation techniques may be used to add scaffolds, building blocks, spacers, linkages, and tags. Accordingly, any of the binding steps described herein may include any useful ligation technique or techniques. Exemplary ligation techniques include enzymatic ligation, e.g., using one or more RNA ligases and/or DNA ligases, as described herein; and chemical ligation, e.g., using a chemically co-reactive pair, as described herein.
Screening method
There are a number of established techniques for determining binding of a compound to a protein, e.g., by determining Kd. Methods for detecting or quantifying binding of a compound to a target protein include, for example, absorbance, fluorescence, Raman scattering, phosphorescence, luminescence, luciferase assays, and radioactivity. Exemplary techniques include surface plasmon resonance (SPR) and fluorescence polarization (FP). SPR measures the change in refractive index of a metal surface when a compound binds to a protein immobilized on the metal surface, and FP measures the change in tumbling rate caused by the binding of a compound to a protein using the loss of polarization of incident light. In some embodiments, these methods can be used to experimentally determine the binding of a candidate compound to a target protein predicted using the methods of the invention.
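For orientation, Kd relates compound concentration to the fraction of target protein occupied. The sketch below assumes simple 1:1 equilibrium binding with the compound in excess over the protein, so that fraction bound = [L] / (Kd + [L]); the function name `fraction_bound` is hypothetical and the model is a textbook simplification, not a method stated in this document.

```python
def fraction_bound(ligand_conc, kd):
    """Fraction of protein bound at equilibrium for 1:1 binding.

    Assumes free ligand concentration is approximately the total ligand
    concentration. Both arguments must use the same units (e.g., molar).
    """
    if ligand_conc < 0 or kd <= 0:
        raise ValueError("ligand_conc must be >= 0 and kd must be > 0")
    return ligand_conc / (kd + ligand_conc)


half_occupied = fraction_bound(1e-6, 1e-6)  # ligand at Kd gives 50% occupancy
```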
Alternatively, compounds that bind to the target protein can be identified using affinity-based methods. For example, a target protein with an affinity tag (e.g., a poly-His tag) can be pre-incubated with saturating concentrations of one or more candidate compounds. Subsequent affinity purification and compound identification (e.g., by using an identity tag) will allow identification of compounds that bind to the target protein.
Target protein
A target protein (e.g., a eukaryotic target protein such as a mammalian target protein or a fungal target protein or a prokaryotic target protein such as a bacterial target protein) is a protein that mediates a disease condition or a symptom of a disease condition. Thus, a desired therapeutic effect can be obtained by modulating (inhibiting or increasing) its activity.
The target protein may be naturally occurring, e.g., wild-type. Alternatively, the target protein may be different from the wild-type protein, but still retain biological function, e.g., as an allelic variant, splice mutant, or biologically active fragment.
In some embodiments, the target protein is an enzyme (e.g., a kinase). In some embodiments, the target protein is a transmembrane protein. In some embodiments, the target protein has a coiled coil structure. In certain embodiments, the target protein is a dimeric complex protein.
In some embodiments, the target protein is a GTPase, such as DIRAS1, DIRAS2, ERAS, GEM, HRAS, KRAS, MRAS, NKIRAS2, NRAS, RALA, RALB, RAP 12, RAP 22, RASD 102, RASL11 2, RASL 2, REM2, rerp, rgl, RRAD, RRAS2, RASL10 2, RASL11 RAB2, RASL 2, rasp 7, rasp 2, rasp 7 RAB 72, rasp 2, RASL 2, rasp 7 RAB7, rasp 7 RAB 72, rasp 7, rasp 7 RAB 72, rasp 7, rasp 72, rasp 7, rasp 72, rasp 7 RAB 72, rasp 7, rasp 72, rasp 7, rasp 72, rasp 7 RAB 72, rasp 7, rasp 72, rasp 7 rabb, rasp 7 RAB 72, rasp 7, rasp 72, RAP2, ARF, ARL5, ARL10, ARL13, ARL, TRIM, ARL4, ARFRP, ARL13, RAN, RHEB, RHEBL, RRAD, GEM, REM, RIT, RHOT, or RHOT. In some embodiments, the target protein is a GTPase activating protein, such as NF1, IQGAP1, PLEXIN-B1, RASAL1, RASAL2, ARHGAP5, ARHGAP8, ARHGAP12, ARHGAP22, ARHGAP25, BCR, DLC1, DLC2, DLC3, GRAF, RALBP1, RAP1GAP, SIPA1, TSC2, AGAP2, ASAP1, or ASAP3. In some embodiments, the target protein is a guanine nucleotide exchange factor, such as CNRASGEF, RASGEF1A, RASGRF2, RASGRP1, RASGRP4, SOS1, RALGDS, RGL1, RGL2, RGR, ARHGEF 2, ASEF/ARHGEF 2, ASEF2, DBS, ECT2, GEF-H2, LARG, NET 2, OBSCURIN, P-REX2, PDZ-RHOGEF, TEM 72, TIAM 2, TRIO, VAV2, DOCK2, C3 2, BIDODG 2/AREF 2, A EF3672, FG 2, or FBP 100. In certain embodiments, the target protein is a protein having a protein-protein interaction domain, such as ARM; BAR; BEACH; BH; BIR; BRCT; BROMO; BTB; C1; C2; CARD; CC; CALM; CH; CHROMO; CUE; DEATH; DED; DEP; DH; EF-hand; EH; ENTH; EVH1; F-box; FERM; FF; FH2; FHA; FYVE; GAT; GEL; GLUE; GRAM; GRIP; GYF; HEAT; HECT; IQ; LRR; MBT; MH1; MH2; MIU; NZF; PAS; PB1; PDZ; PH; POLO-Box; PTB; PUF; PWWP; PX; RGS; RING; SAM; SC; SH2; SH3; SOCS; SPRY; START; SWIRM; TIR; TPR; TRAF; SNARE; TUBBY; TUDOR; UBA; UEV; UIM; VHL; VHS; WD40; WW; SH2; SH3; TRAF; bromodomain; or TPR. In some embodiments, the target protein is a heat shock protein, such as Hsp20, Hsp27, Hsp70, Hsp84, αB-crystallin, TRAP-1, HSF1, or Hsp90.
In certain embodiments, the target protein is an ion channel, such as Cav2.2, Cav3.2, IKACh, Kv1.5, TRPA1, Nav1.7, Nav1.8, Nav1.9, P2X3, or P2X4. In some embodiments, the target protein is a coiled-coil protein, such as geminin, SPAG4, VAV1, MAD1, ROCK1, RNF31, NEDP1, HCCM, EEA1, vimentin, ATF4, NEMO, SNAP25, Syntaxin 1a, FYCO1, or CEP250. In certain embodiments, the target protein is a kinase, such as ABL, ALK, AXL, BTK, EGFR, FMS, FAK, FGFR, 2,3, 4, FLT, HER/ErbB, IGF1, INSR, JAK, KIT, MET, pdgf, PDGFRB, RET, RON, ROR, ROS, SRC, SYK, TIE, TRKA, TRKB, KDR, AKT, PDK, PKC, RHO, ROCK, RSK, RKS, ATM, ATR, CDK, ik, rkk, GSK3, JNK, ARuB, PLK, pkk, raf, PKN, fack, etc. In some embodiments, the target protein is a phosphatase, such as WIP1, SHP2, SHP1, PRL-3, PTP1B, or STEP. In certain embodiments, the target protein is a ubiquitin ligase, such as BMI-1, MDM2, NEDD4-1, β-TRCP, SKP2, E6AP, or APC/C. In some embodiments, the target protein is a chromatin modifying/remodeling factor, such as one encoded by the gene BRG1, BRM, ATRX, PRDM3, ASH1L, CBP, KAT6A, KAT6B, MLL, NSD1, SETD2, EP300, KAT2A, or CREBBP.
In some embodiments, the target protein is a transcription factor, such as a transcription factor encoded by: EHF, ELF1, ELF3, ELF4, ELF5, ELK1, ELK3, ELK4, ERF, ERG, ETS 4, ETV4, FEV, FLI 4, GABPA, SPDEF, SPI 4, SPIC, SPIB, E2F4, ARNTL, BHLHA 4, BHLHB 4, BHLBHB 4, BHE 4, BHLHE4, CLOCK, FIGLA 4, HES 4, HEY 4, HEHEHEY 4, HESABL 4, CALN 4, CALNF 4, CALN 4, CALNF 4, CALN 4, CALNF 4, CALN 4, CALNF 4, CALN, HOXA, HOXAB, HOXB, HOXC, HOXD, IRX, ISL, ISX, LBX, LHX, LMX1, MEIS, MEOX, MIXL, MNX, MSX, NKX-3, NKX-8, NKX-1, NKX-2, NOTO, ONECUT, ONECOX, OTX, PDX, PHOX2, PITX, PINOX, PROP, PRRX, RAX, RAXL, RHOXF, SHIF, SHOX, TGIF, POTGIF 2, VACX, HOXB, HOXCX, HOXB, LBX, MSX, LBX, MSX, NKX-2, NKX-2, NKX, POXU, PFX, SMAD3, CENPB, PAX1, BCL6 1, EGR1, GLIS1, GLI 1, GLIS1, HIC 1, HINFP1, KLF1, MTF1, PRDM1, SCRT1, SNAI 1, SP1, YY1, ZBZBZBN 1, ZBTB 71, FONXNFX 1, FONZNFX 1, FONXNFX 1, FONXN 1, FONX 1, FONXN 1, FONX 1, GATA3, GATA4, or GATA5; or c-Myc, Max, Stat3, androgen receptor, c-Jun, c-Fos, N-Myc, L-Myc, MITF, Hif-1α, Hif-2α, Bcl6, E2F1, NF-κB, Stat5, or ER (coact). In certain embodiments, the target protein is TrkA, P2Y14, mPEGS, ASK1, ALK, Bcl-2, BCL-XL, mSIN1, RORγt, IL17RA, eIF4E, TLR 7R, PCSK9, IgE R, CD40, CD40L, Shn-3, TNFR1, TNFR2, IL31RA, OSMR, IL12 β 1,2, Tau, FASN, KCTD 6, KCTD 9, Raptor, Rictor, RALGPA, annexin family members, BCOR, NCOR, β-catenin, AAC 11, PLD 12, Frizzled 12, RaplP, MLL-1, Myb, Ezh 12, RhoGD12, EGFR, CTLA4 (12), GCGC coact), Adiconin R72, GPRp 12, GPR-12, or Nrl 12-12, or NGPR 12-12.
Virtual screening method
Data collection and statistical result generation
In some embodiments, the steps in the virtual screening methods of the invention include obtaining data derived from a DNA-encoded library selection experiment (e.g., an affinity-based experiment) directed to a target protein. Selection data are read out as DNA sequences, which are then aggregated into statistical readouts, such as sequence counts. Aggregation into statistical readouts is based on grouping sequences that encode the same compound, e.g., the putative chemical structure encoded by the DNA (instance level) or partial substructures of the encoded chemical structure (mono-, di-, or tri-synthon level). A cut-off value applied to the statistical readouts obtained from sequencing of one or more selection conditions is used to determine whether a compound or a portion of the compound binds to the target (i.e., is a binder). Millions, or even billions, of sequences are used per selection condition in order to collect significant statistics that reflect true underlying small molecule/protein binding.
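The aggregation step described above can be sketched as counting decoded reads at different levels of grouping and applying a cut-off. This is a minimal sketch, assuming each read has already been decoded into a tuple of building blocks; the functions `aggregate_counts` and `call_binders`, the example reads, and the cut-off of 4 are hypothetical.

```python
from collections import Counter


def aggregate_counts(decoded_reads, positions=None):
    """Count reads per encoded compound or per partial substructure.

    positions=None groups by the full building block tuple (instance level);
    positions=(0, 1) groups by the first two building blocks (di-synthon level).
    """
    counts = Counter()
    for blocks in decoded_reads:
        key = blocks if positions is None else tuple(blocks[p] for p in positions)
        counts[key] += 1
    return counts


def call_binders(counts, cutoff):
    """Return the groups whose sequence count reaches the cut-off."""
    return {key for key, count in counts.items() if count >= cutoff}


# Hypothetical decoded reads from one selection condition.
reads = (
    [("A1", "B1", "C1")] * 3
    + [("A1", "B1", "C2")] * 2
    + [("A2", "B2", "C1")]
)

instance_counts = aggregate_counts(reads)
disynthon_counts = aggregate_counts(reads, positions=(0, 1))
disynthon_binders = call_binders(disynthon_counts, cutoff=4)
```

In this toy data no single compound reaches the cut-off, but the A1-B1 di-synthon does, which illustrates why aggregation at the substructure level can reveal signal that instance-level counts miss.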
Machine learning
Machine learning methods are known in the art. Non-limiting machine learning methods include naïve Bayes, random forests, decision trees, support vector machines, neural networks, and deep learning.
In some embodiments, each data point from the data collection step is used to train a machine learning algorithm. Each data point includes information derived from the molecular structure (in whole or in part) of a compound from the DNA-encoded library and the associated statistics from one or more selection experiments. The structure is used to generate numerical inputs (calculated chemical properties such as molecular weight and cLogP) and binary strings (e.g., chemical fingerprints that reflect atoms, groups of atoms, and connectivity within the structure). The outputs of these molecular calculations are used as input columns for training and prediction by the machine learning algorithm. In some embodiments, the model is constructed such that the only inputs required are those directly derived from the molecular structure. In some embodiments, any structure from which these fingerprints and properties can be calculated can produce a prediction.
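The input columns described above (a calculated property plus a binary string) can be illustrated with a toy featurizer. This sketch is not a real chemical fingerprint such as ECFP: it assumes the molecule is supplied as an element-count formula and a list of fragment strings, and it hashes fragments into bits with CRC32. All names (`ATOMIC_MASS`, `molecular_weight`, `hashed_fingerprint`, `feature_row`) and the ethanol fragments are hypothetical; a production workflow would use a cheminformatics toolkit.

```python
import zlib

# Standard atomic masses for a few common elements.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def molecular_weight(formula):
    """Sum atomic masses for a formula given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def hashed_fingerprint(fragments, n_bits=8):
    """Toy binary string: set one bit per fragment via a deterministic hash."""
    bits = [0] * n_bits
    for fragment in fragments:
        bits[zlib.crc32(fragment.encode("ascii")) % n_bits] = 1
    return bits


def feature_row(formula, fragments, n_bits=8):
    """One model input row: numerical property followed by fingerprint bits."""
    return [molecular_weight(formula)] + hashed_fingerprint(fragments, n_bits)


# Ethanol, described by hand-written fragments (illustrative only).
ethanol_row = feature_row({"C": 2, "H": 6, "O": 1}, ["CC", "CO", "OH"])
```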
In some embodiments, further structural derivatives of the compounds (e.g., core analysis with side chains removed) may be used to generate further fingerprints and property calculations, or alternative structural fingerprints for training and prediction.
In some embodiments, data from one or more DNA-encoded library selections are used to assess whether a molecule is considered to represent an example of a binder (positive), a non-binder (negative), or a non-specific binder (negative). Although the assessment (positive or negative) is based on the behavior of the encoded molecule in at least one DNA-encoded library selection, additional information from other sources can be used to assess the positive and negative classifications used for training. It is also noteworthy that structures known to have been synthesized in the library but showing no counts from sequencing are considered negative examples in training. In some embodiments, a positive control is included within the dataset. For example, binding interaction data from compounds with known binding affinity for the target protein (e.g., known inhibitors or natural ligands) can be included.
In one embodiment, the assessment of binding of the input molecule is determined by detecting a statistically significant enrichment (elevated sequence count) in the selection that includes the target protein. Enrichment under control conditions that do not include the target protein is also used to assess the specificity of binding. Such conditions typically include the resin used to capture the protein during selection, but without addition of the protein. Additional information can be used to determine whether a particular molecule or portion of a molecule is labeled positive, e.g., enrichment or lack of enrichment under additional conditions or in selections for a protein of interest. Information derived from selections against a number of non-target proteins may also be used, for example, a count of the total number of proteins for which a given molecule or portion of a molecule has been shown to be enriched in selection. For example, detecting enrichment of a given molecule for several additional targets in a database may result in a negative label due to lack of specificity.
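The labeling logic described above can be sketched as a small decision function. It is an illustration under stated assumptions: a simple count floor and fold-over-control rule stands in for a proper statistical significance test, and the function `label_compound` and its thresholds (`min_count`, `min_fold`, `max_other_targets`) are hypothetical.

```python
def label_compound(target_count, control_count, n_other_targets_enriched,
                   min_count=10, min_fold=5.0, max_other_targets=3):
    """Assign a training label from selection counts.

    target_count: sequence count in the selection with the target protein.
    control_count: sequence count in the no-protein (resin only) control.
    n_other_targets_enriched: number of unrelated proteins for which the
        compound was also enriched.
    """
    # A count of zero (synthesized but never observed) falls through to negative.
    enriched = (target_count >= min_count
                and target_count >= min_fold * max(control_count, 1))
    if not enriched:
        return "negative"
    if n_other_targets_enriched > max_other_targets:
        return "negative (non-specific)"
    return "positive"


labels = [
    label_compound(0, 0, 0),     # never observed in sequencing
    label_compound(100, 2, 0),   # enriched with target, clean control
    label_compound(100, 50, 0),  # also enriched on resin alone
    label_compound(100, 2, 5),   # enriched for many unrelated proteins
]
```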
Molecular representation
In some embodiments of the invention, molecular representations are used to generate the estimated binding calculations. Molecular representations include, for example, topological representations, electrostatic representations, geometric representations, or quantum chemical representations. Topological representations can be based on atoms, features, or functional groups and their connectivity (e.g., fingerprints, connection tables, molecular connectivity, and/or molecular graph representations). Electrostatic representations include, for example, surface electrons. Geometric representations are, for example, pharmacophores, pharmacophore fingerprints, shape-based fingerprints, and/or 3D molecular coordinates using atoms, features, or functional groups. In some embodiments, a quantum chemical representation is used. In some embodiments, the electronic molecular representation is a chemical fingerprint.
In some embodiments, a step in the virtual screening method of the invention comprises generating a chemical fingerprint both for the compounds for which binding interaction data has been generated and for the candidate compounds. Chemical fingerprints may be generated using any method known in the art, such as ECFP6, FCFP6, ECFP4, MACCS, or Morgan/circular fingerprints. The chemical fingerprints are then analyzed to identify patterns, e.g., to identify structural features that increase or decrease binding to the target protein. Information generated from chemical fingerprint comparisons of a large number of compounds (e.g., at least 250,000 molecules) may be used to increase the accuracy of the estimated binding interactions generated, as compared to chemical fingerprint comparisons of a smaller number of compounds, e.g., fewer than 100,000 compounds. In some embodiments, chemical fingerprints are used as the primary information for machine learning in the method.
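As a non-limiting illustration of a fingerprint comparison, the following sketch computes the Tanimoto coefficient between two bit-string fingerprints. The Tanimoto coefficient is a common similarity measure for chemical fingerprints but is not named in this specification, and the example bit strings are hypothetical rather than real ECFP or MACCS output.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two equal-length '0'/'1' bit strings."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    on_a = {i for i, c in enumerate(fp_a) if c == "1"}
    on_b = {i for i, c in enumerate(fp_b) if c == "1"}
    union = on_a | on_b
    if not union:
        return 0.0  # convention chosen here for two all-zero fingerprints
    return len(on_a & on_b) / len(union)


def most_similar(query_fp, reference_fps):
    """Return (compound_id, similarity) of the closest reference compound."""
    return max(
        ((cid, tanimoto(query_fp, fp)) for cid, fp in reference_fps.items()),
        key=lambda pair: pair[1],
    )
```

For instance, `tanimoto("10110010", "10100011")` shares three set bits out of five set in either fingerprint. A nearest-neighbor lookup such as `most_similar` is one simple way a candidate compound could be compared against compounds with measured binding interaction data.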
For example, an exemplary training set input for an 8-bit fingerprint may include:
Fingerprints are representations of chemical entities. Machine learning is performed by inputting the training rows, i.e., for each compound, the fingerprint-bit columns plus a training column indicating whether the compound is a positive or negative example.
Various algorithms (random forest (RF), naive Bayes, deep learning, neural networks, etc.) operate by finding patterns that correlate with the true or false label. These patterns may involve one or more bits. They can be found by explicitly analyzing statistical results (e.g., naive Bayes, random forests) or by empirical feedback from varying model parameters (e.g., neural networks).
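To make the training step concrete, the following is a minimal sketch of one of the named algorithm families, a Bernoulli naive Bayes classifier with Laplace smoothing, applied to 8-bit fingerprints. The six training rows are invented for illustration and are not data from this application; the function names are likewise hypothetical.

```python
import math


def train_bernoulli_nb(rows, labels):
    """Fit per-class bit frequencies with Laplace smoothing.

    rows: list of equal-length lists of 0/1 fingerprint bits.
    labels: list of bools (True = positive example).
    """
    n_bits = len(rows[0])
    model = {}
    for cls in (True, False):
        members = [r for r, y in zip(rows, labels) if y == cls]
        prior = (len(members) + 1) / (len(rows) + 2)
        p_bit = [(sum(r[j] for r in members) + 1) / (len(members) + 2)
                 for j in range(n_bits)]
        model[cls] = (prior, p_bit)
    return model


def predict_proba(model, row):
    """Return the modeled probability that a fingerprint row is positive."""
    log_post = {}
    for cls, (prior, p_bit) in model.items():
        lp = math.log(prior)
        for bit, p in zip(row, p_bit):
            lp += math.log(p if bit else 1.0 - p)
        log_post[cls] = lp
    top = max(log_post.values())  # subtract the max for numerical stability
    z = sum(math.exp(v - top) for v in log_post.values())
    return math.exp(log_post[True] - top) / z


# Hypothetical training rows: fingerprint bits plus a training column.
TRAIN_ROWS = [
    [1, 0, 1, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 0, 1],
    [0, 1, 1, 0, 0, 1, 0, 1],
]
TRAIN_LABELS = [True, True, True, False, False, False]
MODEL = train_bernoulli_nb(TRAIN_ROWS, TRAIN_LABELS)
```

In this toy set, bits 0 and 3 are set only in the positive rows, so a query fingerprint with those bits set receives a high probability. A practical training set would be many orders of magnitude larger.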
Another method that may be used is to add columns of calculated properties (e.g., MW, cLogP, tPSA) in addition to the fingerprint. In this case, the machine learning algorithm may utilize these additional columns in its statistical analysis or its model parameter search. Using the properties in the analysis may improve the accuracy of the prediction compared to a prediction performed without them.
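A brief sketch of how property columns might be appended to the fingerprint columns is given below. The min-max scaling step is an assumption added for the example (it is not described in this specification), and the dictionary layout and names are hypothetical.

```python
PROPERTY_COLUMNS = ("MW", "cLogP", "tPSA")


def build_feature_table(compounds):
    """Append min-max scaled property columns to each fingerprint row.

    compounds: list of dicts with a 'bits' list of 0/1 values and one
    numeric entry per name in PROPERTY_COLUMNS.
    """
    ranges = {}
    for name in PROPERTY_COLUMNS:
        values = [c[name] for c in compounds]
        ranges[name] = (min(values), max(values))

    table = []
    for c in compounds:
        row = list(c["bits"])
        for name in PROPERTY_COLUMNS:
            low, high = ranges[name]
            # A constant column carries no information; map it to 0.0.
            scaled = 0.0 if high == low else (c[name] - low) / (high - low)
            row.append(scaled)
        table.append(row)
    return table
```

Each output row then has the fingerprint bits followed by one column per property, and can be passed to whichever learning algorithm is used in place of the fingerprint-only row.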
The molecules subsequently predicted in this approach are represented in exactly the same way as those in the training set, the key difference being that the training column seen above is now unknown. The model generates prediction values to populate a binding characteristic column (e.g., a binding prediction column). In some embodiments, the column is Boolean (T/F), categorical (e.g., non-binding agent, competitive binding agent, non-competitive binding agent), or numeric (e.g., a probability score reflecting the likelihood of being a binding agent).
Molecules to be predicted that comprise only the fingerprint columns may be used with the model generated by the first approach described above.
The following is an exemplary prediction with input information extended to include properties that can be used with the model created by the second embodiment above.
Output
In some embodiments, the generated model produces a binary score that indicates whether the candidate compound is positive or negative, or a probability score (e.g., from 0 to 1) that indicates the likelihood the model assigns to the candidate compound being active/binding. This value can then be used to make a go/no-go decision for a given molecule (binary case) or to assign a priority to the candidate compound (probability score).
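As one possible illustration of how such an output could be consumed, the sketch below ranks candidates by probability score and attaches a go/no-go flag. The 0.5 threshold, the compound identifiers, and the function name are hypothetical choices made for the example.

```python
def prioritize(scores, threshold=0.5, top_n=None):
    """Rank candidates by score and flag each as go (True) or no-go (False).

    scores: dict mapping a compound identifier to a probability in [0, 1].
    Returns a list of (compound_id, score, go) tuples, highest score first.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    return [(cid, p, p >= threshold) for cid, p in ranked]
```

Calling `prioritize({"cpd-A": 0.91, "cpd-B": 0.12, "cpd-C": 0.67}, top_n=2)` would keep `cpd-A` and `cpd-C`, in that order, both flagged as go.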
Examples
Example 1
Selection data for soluble epoxide hydrolase (sEH) from a set of libraries are used to train one of several machine learning models (random forest, naive Bayes, or neural network), which is then used to predict the selection behavior, against the same target, of molecules from libraries not included in the training set. The libraries used in the training set included a linear peptide library with 25,844,065 compounds, a 3-cycle pyrazole library with 3,976,320 compounds, a 2-cycle pyridine library with 5,079,459 compounds, and a 4-cycle macrocycle library with 1,511,399,304 compounds. Libraries for the prediction set included a 3-cycle linear peptide library of 221,580,000 compounds, a 3-cycle pyridine library of 285,917,292 compounds, and a 2-cycle benzimidazole library of 1,622,820 compounds.
As shown in Fig. 1, an enrichment of binders was seen in the prediction set. The four quadrants in the figure represent the prediction of positive disynthons using increasing numbers of libraries (left to right, top to bottom). The Y-axis represents the enrichment of positives in the prediction set compared to random selection from the original population. The X-axis shows the percentage of positives from the original set that are found in the prediction set. The results show that for the training and test sets (with the test disynthons held out of the training set, but drawn from the same libraries), the enrichment of the prediction set was consistently 2 to 2.5 times that of the original population. The prediction set consists of disynthons from libraries not used for training. In this case, increasing the number of libraries used for training yields an increased rate of positives in the predicted population compared to the original population.
Example 2
Selection data for sEH from the same libraries as in Example 1 were used with machine learning algorithms (RF, MLP, deep learning) to train and generate models for predicting the activity of molecules not found in DNA-encoded libraries. For example, data is input and a model is generated that predicts the activity of tested molecules in a conventional high-throughput screening (HTS) assay (i.e., automated testing of 10,000 to millions of molecules). Predictions by the model are used as a filter to generate a short list (e.g., 100 compounds) from an initial list of 10,000 to 100,000 or more molecules. The goal is to select molecules for this short list such that the final list is greatly enriched (10X to 100X) relative to the underlying rate of active molecules in the initial set.
As shown in Fig. 2, enrichment of predicted molecules greater than 40X has been observed compared to random selection. Fig. 2 shows a number of trials over time as the predictive model was improved. The trend shows increasing enrichment of primary HTS hits and of more rigorously confirmed actives in the prediction set compared to random selection. The confirmed actives were subjected to a second, confirmatory biochemical assay in which activity was demonstrated. The best results show a greater than 40-fold improvement in the resulting prediction set compared to molecules randomly selected from the original population.
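The fold-enrichment figures quoted above can be read as a ratio of hit rates. The following sketch shows that arithmetic; the counts used in the usage note are hypothetical and are not taken from Fig. 2, and the function name is an assumption.

```python
def enrichment_factor(actives_selected, n_selected,
                      actives_population, n_population):
    """Hit rate of the predicted short list divided by the hit rate
    expected from random selection out of the original population."""
    if n_selected <= 0 or n_population <= 0:
        raise ValueError("set sizes must be positive")
    if actives_population <= 0:
        raise ValueError("population must contain at least one active")
    rate_selected = actives_selected / n_selected
    rate_random = actives_population / n_population
    return rate_selected / rate_random
```

For example, if a 100,000-compound collection contained 500 actives (0.5%) and a 100-compound predicted list contained 20 actives (20%), `enrichment_factor(20, 100, 500, 100_000)` would give an enrichment of about 40-fold.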
Example 3 optimization of prediction
For a given target or targets, there is a known set of HTS data. Multiple parameter settings are tested in order to achieve a high prediction rate. In practice, a high prediction rate is the result of fine-tuning the prediction based on HTS results. HTS is used to demonstrate suitability, and the model can then be used to make predictions for new or existing compounds (e.g., commercially available compounds or compounds from a pre-existing proprietary compound library). These molecules can then be tested with the expectation of a higher active rate within the prediction set (e.g., greater than 1% or 10% active molecules), regardless of the underlying active rate of a random sample.
Example 4 optimization of prediction
Data from selections directed against a given target but under different conditions (e.g., using different protein fragments, mutants, isoforms, using closely related targets, using known small molecule competitors, etc.) are used to further refine the definition of positive data in the training set used to train the model.
Example 5 optimization of prediction
Data from selections against 10 to 100 protein targets, mutants, isoforms, etc. are used as a series of additional data columns to define positive or negative examples for training machine learning models.
Other embodiments
Various modifications and variations of the described methods and systems of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. While the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in the medical, pharmacological, or related fields are intended to be within the scope of the present invention.
The following is claimed in the present application.
Claims (26)
1. A method, comprising the steps of:
(a) providing a plurality of binding interaction findings for a target protein in a physical computing device, the physical computing device having representations of a set of candidate compounds,
wherein at least 90% of the plurality of binding interaction findings represent binding interactions between the target protein and a compound comprising a nucleotide tag encoding the identity of the compound;
(b) using the plurality of binding interaction findings to generate, using the computing device, an estimated binding interaction for the candidate compounds; and
(c) outputting a list of candidate compounds that can be displayed and ranked by highest estimated binding interaction.
2. The method of claim 1, wherein the plurality of binding interaction findings comprises at least one million binding interaction findings.
3. The method of claim 1 or 2, wherein at least 95% of the plurality of binding interaction findings represent binding interactions between the target protein and a compound comprising a nucleotide tag encoding the identity of the compound.
4. The method of any one of claims 1 to 3, wherein at least 99% of the plurality of binding interaction findings represent a binding interaction between the target protein and a compound comprising a nucleotide tag encoding the identity of the compound.
5. The method of any one of claims 1 to 4, wherein at least 50% of the plurality of binding interaction findings are determined by simultaneously contacting a plurality of compounds comprising a nucleotide tag encoding the identity of the compound with the target protein.
6. The method of any one of claims 1 to 5, wherein the method further comprises providing one or more additional pluralities of binding interaction findings for one or more additional target proteins, wherein at least 50% of the additional plurality of binding interaction findings represent binding interactions between the additional target protein and compounds from the plurality of binding interaction findings for the target protein.
7. The method of claim 6, wherein the list of candidate compounds can be displayed and ranked by the selectivity of a candidate compound for the target protein relative to the one or more additional target proteins.
8. The method of claim 6 or 7, wherein the one or more additional target proteins comprise a mutant of the target protein.
9. The method of any one of claims 1 to 8, wherein the method further comprises providing one or more additional pluralities of binding interaction findings from one or more negative control experiments, wherein at least 50% of the additional plurality of binding interaction findings represent negative control experiments with compounds from the plurality of binding interaction findings for the target protein.
10. The method of any one of claims 1 to 9, wherein the method further comprises transmitting the list of candidate compounds over the internet or to a display device.
11. The method of any of claims 1-10, wherein the physical computing device is accessed and operated over the internet.
12. The method of any one of claims 1 to 11, wherein the estimated binding interaction is generated using chemical structure comparison.
13. The method of claim 12, wherein the chemical structure comparison utilizes molecular representation.
14. The method of claim 13, wherein the molecular representation comprises a chemical fingerprint.
15. The method of claim 14, wherein the chemical fingerprint is an ECFP6, FCFP6, ECFP4, MACCS, or Morgan/circular fingerprint.
16. The method of any one of claims 1 to 15, wherein the method further comprises generating a confidence score for each estimated binding interaction of a candidate compound, wherein the confidence score is generated using a chemical structure comparison of the candidate compound to one or more compounds from the plurality of binding interaction findings for the target protein.
17. The method of claim 16, wherein the chemical structure comparison is a principal component analysis.
18. The method of claim 16 or 17, wherein the list of candidate compounds is capable of being displayed and ranked by a confidence score of the estimated binding interaction of the candidate compounds.
19. The method of any one of claims 1 to 18, wherein the method further comprises providing one or more property findings for the set of candidate compounds.
20. The method of claim 19, wherein the one or more property findings comprise molecular weight and/or clogP.
21. The method of claim 19 or 20, wherein the one or more property findings are utilized to generate the estimated binding interaction.
22. The method of any one of claims 19 to 21, wherein the list of candidate compounds can be displayed and ranked by the one or more property findings.
23. The method of any one of claims 1 to 22, wherein the method further comprises (d) synthesizing one or more of the candidate compounds from the list of candidate compounds.
24. The method of claim 23, wherein the method further comprises contacting the one or more synthesized candidate compounds with the target protein to determine one or more experimental binding interactions.
25. A computer readable medium having stored thereon executable instructions for directing a physical computing device to perform a method comprising:
(a) providing a plurality of binding interaction findings for a target protein in a physical computing device, the physical computing device having representations of a set of candidate compounds,
wherein at least 90% of the plurality of binding interaction findings represent binding interactions between the target protein and a compound comprising a nucleotide tag encoding the identity of the compound;
(b) using the plurality of binding interaction findings to generate, using the computing device, an estimated binding interaction for the candidate compounds; and
(c) outputting a list of candidate compounds that can be displayed and ranked by highest estimated binding interaction.
26. A physical computing device having a representation of a set of candidate compounds and programmed with executable instructions to direct the device to perform a method comprising:
(a) providing a plurality of binding interaction findings for a target protein in a physical computing device, the physical computing device having representations of a set of candidate compounds,
wherein at least 90% of the plurality of binding interaction findings represent binding interactions between the target protein and a compound comprising a nucleotide tag encoding the identity of the compound;
(b) using the plurality of binding interaction findings to generate, using the computing device, an estimated binding interaction for the candidate compounds; and
(c) outputting a list of candidate compounds that can be displayed and ranked by highest estimated binding interaction.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201762486692P | 2017-04-18 | 2017-04-18 | |
US62/486692 | 2017-04-18 | ||
PCT/US2018/028050 WO2018195134A1 (en) | 2017-04-18 | 2018-04-18 | Methods for identifying compounds |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110730822A true CN110730822A (en) | 2020-01-24 |
CN110730822B CN110730822B (en) | 2024-03-08 |
Family
ID=63856100
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201880040438.9A Active CN110730822B (en) | 2017-04-18 | 2018-04-18 | Methods for identifying compounds |
Country Status (9)
Country | Link |
---|---|
US (1) | US20200143903A1 (en) |
EP (1) | EP3612545A4 (en) |
JP (2) | JP7277378B2 (en) |
CN (1) | CN110730822B (en) |
AU (2) | AU2018256367A1 (en) |
BR (1) | BR112019021786A2 (en) |
EA (1) | EA201992476A1 (en) |
MA (1) | MA51864A (en) |
WO (1) | WO2018195134A1 (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
BR112021024915A2 (en) * | 2019-06-12 | 2022-01-18 | Quantum Si Inc | Techniques for protein identification using machine learning and related systems and methods |
US20210303762A1 (en) * | 2020-03-31 | 2021-09-30 | International Business Machines Corporation | Expert-in-the-loop ai for materials discovery |
CN111863120B (en) * | 2020-06-28 | 2022-05-13 | 深圳晶泰科技有限公司 | Medicine virtual screening system and method for crystal compound |
CN112086145B (en) * | 2020-09-02 | 2024-04-16 | 腾讯科技(深圳)有限公司 | Compound activity prediction method and device, electronic equipment and storage medium |
WO2023069592A1 (en) * | 2021-10-21 | 2023-04-27 | Google Llc | Multi-label neural architecture for modeling dna-encoded libraries data |
WO2023239720A1 (en) * | 2022-06-06 | 2023-12-14 | The Trustees Of Indiana University | Method of predicting ms/ms spectra and properties of chemical compounds |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105659087A (en) * | 2013-06-13 | 2016-06-08 | 比奥德赛公司 | Method of screening candidate biochemical entities targeting a target biochemical entity |
TW201629069A (en) * | 2015-01-09 | 2016-08-16 | 霍普驅動生物科技股份有限公司 | Compounds that participate in cooperative binding and uses thereof |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
IL129498A0 (en) | 1996-11-04 | 2000-02-29 | Dimensional Pharm Inc | System method and computer program product for identifying chemical compounds having desired properties |
US20040010376A1 (en) * | 2001-04-17 | 2004-01-15 | Peizhi Luo | Generation and selection of protein library in silico |
WO2003099999A2 (en) | 2002-05-20 | 2003-12-04 | Abmaxis, Inc. | Generation and selection of protein library in silico |
WO2006078228A1 (en) * | 2002-09-16 | 2006-07-27 | Plexxikon, Inc. | Methods for the design of molecular scaffolds and ligands |
EA201992285A1 (en) | 2012-07-13 | 2020-05-31 | Икс-Чем, Инк. | DNA-CODED LIBRARIES CONTAINING LINKS OF CODING OLIGONUCLEOTIDES NOT AVAILABLE FOR READING BY POLYMERASES |
MA41298A (en) | 2014-12-30 | 2017-11-07 | X Chem Inc | DNA-CODED BANK MARKING PROCESSES |
-
2018
- 2018-04-18 JP JP2019556665A patent/JP7277378B2/en active Active
- 2018-04-18 MA MA051864A patent/MA51864A/en unknown
- 2018-04-18 EA EA201992476A patent/EA201992476A1/en unknown
- 2018-04-18 BR BR112019021786A patent/BR112019021786A2/en unknown
- 2018-04-18 CN CN201880040438.9A patent/CN110730822B/en active Active
- 2018-04-18 AU AU2018256367A patent/AU2018256367A1/en not_active Abandoned
- 2018-04-18 US US16/606,325 patent/US20200143903A1/en active Pending
- 2018-04-18 EP EP18788378.0A patent/EP3612545A4/en active Pending
- 2018-04-18 WO PCT/US2018/028050 patent/WO2018195134A1/en unknown
-
2023
- 2023-05-08 JP JP2023076466A patent/JP2023113620A/en active Pending
- 2023-07-18 AU AU2023206117A patent/AU2023206117A1/en active Pending
Non-Patent Citations (2)
Title |
---|
SANTIAGO VILAR et al.: "Computational Drug Target Screening through Protein Interaction Profiles" * |
WILLY DECURTINS et al.: "Automated screening for small organic ligands using DNA-encoded chemical libraries" * |
Also Published As
Publication number | Publication date |
---|---|
WO2018195134A1 (en) | 2018-10-25 |
BR112019021786A2 (en) | 2020-05-05 |
CN110730822B (en) | 2024-03-08 |
JP2020518898A (en) | 2020-06-25 |
JP2023113620A (en) | 2023-08-16 |
JP7277378B2 (en) | 2023-05-18 |
EP3612545A1 (en) | 2020-02-26 |
AU2023206117A1 (en) | 2023-08-10 |
AU2018256367A1 (en) | 2019-11-28 |
EP3612545A4 (en) | 2021-01-13 |
US20200143903A1 (en) | 2020-05-07 |
MA51864A (en) | 2020-02-26 |
EA201992476A1 (en) | 2020-02-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 40017902 |
GR01 | Patent grant | ||