WO2023196220A2 - Method for genome-wide functional perturbation of human microsatellites using engineered zinc fingers
- Publication number
- WO2023196220A2 (PCT/US2023/017257)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- ews
- seq
- zinc finger
- ggaa
- bold
Links
- 239000011701 zinc Substances 0.000 title claims abstract description 85
- 229910052725 zinc Inorganic materials 0.000 title claims abstract description 85
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 title claims abstract description 83
- 238000000034 method Methods 0.000 title claims abstract description 56
- 108091092878 Microsatellite Proteins 0.000 title description 43
- 241000282414 Homo sapiens Species 0.000 title description 15
- 208000006168 Ewing Sarcoma Diseases 0.000 claims abstract description 54
- 239000000203 mixture Substances 0.000 claims abstract description 30
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims abstract description 20
- 210000004027 cell Anatomy 0.000 claims description 143
- 230000014509 gene expression Effects 0.000 claims description 60
- 108020001507 fusion proteins Proteins 0.000 claims description 56
- 102000037865 fusion proteins Human genes 0.000 claims description 56
- 150000007523 nucleic acids Chemical class 0.000 claims description 48
- 102000011252 Krueppel-associated box Human genes 0.000 claims description 43
- 108050001491 Krueppel-associated box Proteins 0.000 claims description 43
- 239000013598 vector Substances 0.000 claims description 40
- 102000039446 nucleic acids Human genes 0.000 claims description 35
- 108020004707 nucleic acids Proteins 0.000 claims description 35
- 230000004927 fusion Effects 0.000 claims description 30
- 230000004913 activation Effects 0.000 claims description 23
- 206010028980 Neoplasm Diseases 0.000 claims description 21
- 230000000694 effects Effects 0.000 claims description 12
- 230000002103 transcriptional effect Effects 0.000 claims description 12
- 230000001594 aberrant effect Effects 0.000 claims description 9
- 201000010099 disease Diseases 0.000 claims description 7
- AOJJSUZBOXZQNB-TZSSRYMLSA-N Doxorubicin Chemical compound O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1 AOJJSUZBOXZQNB-TZSSRYMLSA-N 0.000 claims description 6
- 101710160287 Heterochromatin protein 1 Proteins 0.000 claims description 6
- 238000002347 injection Methods 0.000 claims description 6
- 239000007924 injection Substances 0.000 claims description 6
- 238000002512 chemotherapy Methods 0.000 claims description 5
- 230000003993 interaction Effects 0.000 claims description 5
- 238000002271 resection Methods 0.000 claims description 4
- 230000003584 silencer Effects 0.000 claims description 4
- 230000037426 transcriptional repression Effects 0.000 claims description 4
- CMSMOCZEIVJLDB-UHFFFAOYSA-N Cyclophosphamide Chemical compound ClCCN(CCCl)P1(=O)NCCCO1 CMSMOCZEIVJLDB-UHFFFAOYSA-N 0.000 claims description 3
- 239000012829 chemotherapy agent Substances 0.000 claims description 3
- 229960004397 cyclophosphamide Drugs 0.000 claims description 3
- 229960004679 doxorubicin Drugs 0.000 claims description 3
- VJJPUSNTGOMMGY-MRVIYFEKSA-N etoposide Chemical compound COC1=C(O)C(OC)=CC([C@@H]2C3=CC=4OCOC=4C=C3[C@@H](O[C@H]3[C@@H]([C@@H](O)[C@@H]4O[C@H](C)OC[C@H]4O3)O)[C@@H]3[C@@H]2C(OC3)=O)=C1 VJJPUSNTGOMMGY-MRVIYFEKSA-N 0.000 claims description 3
- 229960005420 etoposide Drugs 0.000 claims description 3
- HOMGKSMUEGBAAB-UHFFFAOYSA-N ifosfamide Chemical compound ClCCNP1(=O)OCCCN1CCCl HOMGKSMUEGBAAB-UHFFFAOYSA-N 0.000 claims description 3
- 229960001101 ifosfamide Drugs 0.000 claims description 3
- 230000005855 radiation Effects 0.000 claims description 3
- OGWKCGZFUXNPDA-XQKSVPLYSA-N vincristine Chemical compound C([N@]1C[C@@H](C[C@]2(C(=O)OC)C=3C(=CC4=C([C@]56[C@H]([C@@]([C@H](OC(C)=O)[C@]7(CC)C=CCN([C@H]67)CC5)(O)C(=O)OC)N4C=O)C=3)OC)C[C@@](C1)(O)CC)CC1=C2NC2=CC=CC=C12 OGWKCGZFUXNPDA-XQKSVPLYSA-N 0.000 claims description 3
- 229960004528 vincristine Drugs 0.000 claims description 3
- OGWKCGZFUXNPDA-UHFFFAOYSA-N vincristine Natural products C1C(CC)(O)CC(CC2(C(=O)OC)C=3C(=CC4=C(C56C(C(C(OC(C)=O)C7(CC)C=CCN(C67)CC5)(O)C(=O)OC)N4C=O)C=3)OC)CN1CCC1=C2NC2=CC=CC=C12 OGWKCGZFUXNPDA-UHFFFAOYSA-N 0.000 claims description 3
- 125000003275 alpha amino acid group Chemical group 0.000 claims 2
- 238000011282 treatment Methods 0.000 abstract description 20
- 239000002773 nucleotide Substances 0.000 abstract description 19
- 125000003729 nucleotide group Chemical group 0.000 abstract description 16
- 210000003811 finger Anatomy 0.000 description 87
- 108090000623 proteins and genes Proteins 0.000 description 82
- 108700037122 EWS-FLI fusion Proteins 0.000 description 75
- 208000009869 Neu-Laxova syndrome Diseases 0.000 description 68
- 108010077850 Nuclear Localization Signals Proteins 0.000 description 68
- 108090000740 RNA-binding protein EWS Proteins 0.000 description 64
- 102000004229 RNA-binding protein EWS Human genes 0.000 description 63
- 230000027455 binding Effects 0.000 description 43
- 150000001413 amino acids Chemical group 0.000 description 29
- 210000002901 mesenchymal stem cell Anatomy 0.000 description 27
- 235000001014 amino acid Nutrition 0.000 description 25
- 102000004169 proteins and genes Human genes 0.000 description 23
- 108010077544 Chromatin Proteins 0.000 description 20
- 210000003483 chromatin Anatomy 0.000 description 20
- 108020004414 DNA Proteins 0.000 description 19
- 235000018102 proteins Nutrition 0.000 description 18
- 108020005004 Guide RNA Proteins 0.000 description 17
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 16
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 16
- 230000006870 function Effects 0.000 description 16
- 239000005090 green fluorescent protein Substances 0.000 description 16
- 108010051109 Cell-Penetrating Peptides Proteins 0.000 description 14
- 102000020313 Cell-Penetrating Peptides Human genes 0.000 description 14
- 230000004568 DNA-binding Effects 0.000 description 14
- 108091028043 Nucleic acid sequence Proteins 0.000 description 13
- 238000003491 array Methods 0.000 description 13
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 12
- 208000035475 disorder Diseases 0.000 description 12
- 239000013612 plasmid Substances 0.000 description 12
- 238000010361 transduction Methods 0.000 description 12
- 230000026683 transduction Effects 0.000 description 12
- 239000003623 enhancer Substances 0.000 description 11
- 239000013603 viral vector Substances 0.000 description 11
- 238000002474 experimental method Methods 0.000 description 10
- 230000008685 targeting Effects 0.000 description 10
- 108090000765 processed proteins & peptides Proteins 0.000 description 9
- 201000011510 cancer Diseases 0.000 description 8
- 239000013604 expression vector Substances 0.000 description 8
- 239000008194 pharmaceutical composition Substances 0.000 description 8
- 230000001105 regulatory effect Effects 0.000 description 8
- 241000700605 Viruses Species 0.000 description 7
- 238000004458 analytical method Methods 0.000 description 7
- 230000008901 benefit Effects 0.000 description 7
- 150000001875 compounds Chemical class 0.000 description 7
- 102000004196 processed proteins & peptides Human genes 0.000 description 7
- -1 γ-carboxyglutamate Chemical compound 0.000 description 7
- DNIAPMSPPWPWGF-UHFFFAOYSA-N Propylene glycol Chemical compound CC(O)CO DNIAPMSPPWPWGF-UHFFFAOYSA-N 0.000 description 6
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 6
- 101710185494 Zinc finger protein Proteins 0.000 description 6
- 102100023597 Zinc finger protein 816 Human genes 0.000 description 6
- 210000000349 chromosome Anatomy 0.000 description 6
- 230000006698 induction Effects 0.000 description 6
- 239000000463 material Substances 0.000 description 6
- 238000001890 transfection Methods 0.000 description 6
- 230000003612 virological effect Effects 0.000 description 6
- 241000702421 Dependoparvovirus Species 0.000 description 5
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 5
- 241000713666 Lentivirus Species 0.000 description 5
- 230000001580 bacterial effect Effects 0.000 description 5
- 239000002299 complementary DNA Substances 0.000 description 5
- 239000003814 drug Substances 0.000 description 5
- 238000005516 engineering process Methods 0.000 description 5
- 210000005260 human cell Anatomy 0.000 description 5
- 238000003119 immunoblot Methods 0.000 description 5
- 108020004999 messenger RNA Proteins 0.000 description 5
- 238000003753 real-time PCR Methods 0.000 description 5
- 230000001177 retroviral effect Effects 0.000 description 5
- 239000000243 solution Substances 0.000 description 5
- 238000012546 transfer Methods 0.000 description 5
- CIWBSHSKHKDKBQ-JLAZNSOCSA-N Ascorbic acid Chemical compound OC[C@H](O)[C@H]1OC(=O)C(O)=C1O CIWBSHSKHKDKBQ-JLAZNSOCSA-N 0.000 description 4
- 241000894006 Bacteria Species 0.000 description 4
- CURLTUGMZLYLDI-UHFFFAOYSA-N Carbon dioxide Chemical compound O=C=O CURLTUGMZLYLDI-UHFFFAOYSA-N 0.000 description 4
- 102000004190 Enzymes Human genes 0.000 description 4
- 108090000790 Enzymes Proteins 0.000 description 4
- 241000238631 Hexapoda Species 0.000 description 4
- 101000939426 Homo sapiens UDP-glucuronosyltransferase 3A2 Proteins 0.000 description 4
- 101100539384 Homo sapiens UGT3A2 gene Proteins 0.000 description 4
- 241000283973 Oryctolagus cuniculus Species 0.000 description 4
- 108091093037 Peptide nucleic acid Proteins 0.000 description 4
- 238000003559 RNA-seq method Methods 0.000 description 4
- 102100029786 UDP-glucuronosyltransferase 3A2 Human genes 0.000 description 4
- 101150062351 UGT3A2 gene Proteins 0.000 description 4
- 239000003795 chemical substances by application Substances 0.000 description 4
- 238000010367 cloning Methods 0.000 description 4
- 239000002131 composite material Substances 0.000 description 4
- 230000008021 deposition Effects 0.000 description 4
- 239000006185 dispersion Substances 0.000 description 4
- 229940079593 drug Drugs 0.000 description 4
- 239000004615 ingredient Substances 0.000 description 4
- 239000002502 liposome Substances 0.000 description 4
- 231100000590 oncogenic Toxicity 0.000 description 4
- 230000002246 oncogenic effect Effects 0.000 description 4
- 229920000642 polymer Polymers 0.000 description 4
- 238000002360 preparation method Methods 0.000 description 4
- RXWNCPJZOCPEPQ-NVWDDTSBSA-N puromycin Chemical compound C1=CC(OC)=CC=C1C[C@H](N)C(=O)N[C@H]1[C@@H](O)[C@H](N2C3=NC=NC(=C3N=C2)N(C)C)O[C@@H]1CO RXWNCPJZOCPEPQ-NVWDDTSBSA-N 0.000 description 4
- 230000003252 repetitive effect Effects 0.000 description 4
- 230000010076 replication Effects 0.000 description 4
- 239000002904 solvent Substances 0.000 description 4
- 230000032258 transport Effects 0.000 description 4
- 241001430294 unidentified retrovirus Species 0.000 description 4
- WVDDGKGOMKODPV-UHFFFAOYSA-N Benzyl alcohol Chemical compound OCC1=CC=CC=C1 WVDDGKGOMKODPV-UHFFFAOYSA-N 0.000 description 3
- 241000283707 Capra Species 0.000 description 3
- 102100030768 ETS domain-containing transcription factor ERF Human genes 0.000 description 3
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 3
- 102100031181 Glyceraldehyde-3-phosphate dehydrogenase Human genes 0.000 description 3
- 108010093488 His-His-His-His-His-His Proteins 0.000 description 3
- 101000938776 Homo sapiens ETS domain-containing transcription factor ERF Proteins 0.000 description 3
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 3
- 239000004472 Lysine Substances 0.000 description 3
- 241001465754 Metazoa Species 0.000 description 3
- 108091034117 Oligonucleotide Proteins 0.000 description 3
- 206010060862 Prostate cancer Diseases 0.000 description 3
- 208000000236 Prostatic Neoplasms Diseases 0.000 description 3
- 238000011529 RT qPCR Methods 0.000 description 3
- 108091081062 Repeated sequence (DNA) Proteins 0.000 description 3
- 108091027981 Response element Proteins 0.000 description 3
- HEMHJVSKTPXQMS-UHFFFAOYSA-M Sodium hydroxide Chemical compound [OH-].[Na+] HEMHJVSKTPXQMS-UHFFFAOYSA-M 0.000 description 3
- 230000003213 activating effect Effects 0.000 description 3
- 239000013543 active substance Substances 0.000 description 3
- 239000003242 anti bacterial agent Substances 0.000 description 3
- 238000002487 chromatin immunoprecipitation Methods 0.000 description 3
- 230000008878 coupling Effects 0.000 description 3
- 238000010168 coupling process Methods 0.000 description 3
- 238000005859 coupling reaction Methods 0.000 description 3
- LOKCTEFSRHRXRJ-UHFFFAOYSA-I dipotassium trisodium dihydrogen phosphate hydrogen phosphate dichloride Chemical compound P(=O)(O)(O)[O-].[K+].P(=O)(O)([O-])[O-].[Na+].[Na+].[Cl-].[K+].[Cl-].[Na+] LOKCTEFSRHRXRJ-UHFFFAOYSA-I 0.000 description 3
- 239000002612 dispersion medium Substances 0.000 description 3
- 238000009510 drug design Methods 0.000 description 3
- 230000008482 dysregulation Effects 0.000 description 3
- 210000003527 eukaryotic cell Anatomy 0.000 description 3
- 239000012634 fragment Substances 0.000 description 3
- 238000001476 gene delivery Methods 0.000 description 3
- 230000002068 genetic effect Effects 0.000 description 3
- 108020004445 glyceraldehyde-3-phosphate dehydrogenase Proteins 0.000 description 3
- 238000001990 intravenous administration Methods 0.000 description 3
- 210000004962 mammalian cell Anatomy 0.000 description 3
- 238000004519 manufacturing process Methods 0.000 description 3
- 239000002105 nanoparticle Substances 0.000 description 3
- 210000004940 nucleus Anatomy 0.000 description 3
- 239000000546 pharmaceutical excipient Substances 0.000 description 3
- 239000002953 phosphate buffered saline Substances 0.000 description 3
- 229920001223 polyethylene glycol Polymers 0.000 description 3
- 229920001184 polypeptide Polymers 0.000 description 3
- 239000000843 powder Substances 0.000 description 3
- 238000003762 quantitative reverse transcription PCR Methods 0.000 description 3
- 230000001718 repressive effect Effects 0.000 description 3
- 239000011780 sodium chloride Substances 0.000 description 3
- 230000006641 stabilisation Effects 0.000 description 3
- 238000011105 stabilization Methods 0.000 description 3
- 238000001356 surgical procedure Methods 0.000 description 3
- 208000024891 symptom Diseases 0.000 description 3
- 230000001225 therapeutic effect Effects 0.000 description 3
- 210000001519 tissue Anatomy 0.000 description 3
- 241000701447 unidentified baculovirus Species 0.000 description 3
- 238000011144 upstream manufacturing Methods 0.000 description 3
- 230000035899 viability Effects 0.000 description 3
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Chemical compound O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 3
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 2
- 108020004705 Codon Proteins 0.000 description 2
- 102000006311 Cyclin D1 Human genes 0.000 description 2
- 108010058546 Cyclin D1 Proteins 0.000 description 2
- 102000053602 DNA Human genes 0.000 description 2
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 2
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 2
- 102000040848 ETS family Human genes 0.000 description 2
- 108091071901 ETS family Proteins 0.000 description 2
- 102100025137 Early activation antigen CD69 Human genes 0.000 description 2
- 241000196324 Embryophyta Species 0.000 description 2
- 241000588724 Escherichia coli Species 0.000 description 2
- 241000233866 Fungi Species 0.000 description 2
- 108010010803 Gelatin Proteins 0.000 description 2
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 2
- 101150086355 HBG gene Proteins 0.000 description 2
- 102100021519 Hemoglobin subunit beta Human genes 0.000 description 2
- 229920000209 Hexadimethrine bromide Polymers 0.000 description 2
- 102000003893 Histone acetyltransferases Human genes 0.000 description 2
- 108090000246 Histone acetyltransferases Proteins 0.000 description 2
- 101000934374 Homo sapiens Early activation antigen CD69 Proteins 0.000 description 2
- 101000899111 Homo sapiens Hemoglobin subunit beta Proteins 0.000 description 2
- 101001055144 Homo sapiens Interleukin-2 receptor subunit alpha Proteins 0.000 description 2
- VEXZGXHMUGYJMC-UHFFFAOYSA-N Hydrochloric acid Chemical compound Cl VEXZGXHMUGYJMC-UHFFFAOYSA-N 0.000 description 2
- 108010091358 Hypoxanthine Phosphoribosyltransferase Proteins 0.000 description 2
- 102100029098 Hypoxanthine-guanine phosphoribosyltransferase Human genes 0.000 description 2
- 102100026878 Interleukin-2 receptor subunit alpha Human genes 0.000 description 2
- LRQKBLKVPFOOQJ-YFKPBYRVSA-N L-norleucine Chemical group CCCC[C@H]([NH3+])C([O-])=O LRQKBLKVPFOOQJ-YFKPBYRVSA-N 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- 241000714177 Murine leukemia virus Species 0.000 description 2
- 101710141454 Nucleoprotein Proteins 0.000 description 2
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 2
- 101710182846 Polyhedrin Proteins 0.000 description 2
- 108091028664 Ribonucleotide Proteins 0.000 description 2
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 2
- 241000700584 Simplexvirus Species 0.000 description 2
- 108020004459 Small interfering RNA Proteins 0.000 description 2
- 108091023040 Transcription factor Proteins 0.000 description 2
- 102000040945 Transcription factor Human genes 0.000 description 2
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 2
- 238000010521 absorption reaction Methods 0.000 description 2
- 230000009471 action Effects 0.000 description 2
- 239000012190 activator Substances 0.000 description 2
- 239000004480 active ingredient Substances 0.000 description 2
- 125000000539 amino acid group Chemical group 0.000 description 2
- 230000000844 anti-bacterial effect Effects 0.000 description 2
- 230000002155 anti-virotic effect Effects 0.000 description 2
- 229940121375 antifungal agent Drugs 0.000 description 2
- 239000003429 antifungal agent Substances 0.000 description 2
- 238000013459 approach Methods 0.000 description 2
- 125000000637 arginyl group Chemical group N[C@@H](CCCNC(N)=N)C(=O)* 0.000 description 2
- 235000010323 ascorbic acid Nutrition 0.000 description 2
- 229960005070 ascorbic acid Drugs 0.000 description 2
- 239000011668 ascorbic acid Substances 0.000 description 2
- 239000011230 binding agent Substances 0.000 description 2
- 210000004369 blood Anatomy 0.000 description 2
- 239000008280 blood Substances 0.000 description 2
- 210000000988 bone and bone Anatomy 0.000 description 2
- 239000000872 buffer Substances 0.000 description 2
- 210000004899 c-terminal region Anatomy 0.000 description 2
- 239000002775 capsule Substances 0.000 description 2
- 229910002092 carbon dioxide Inorganic materials 0.000 description 2
- 230000003833 cell viability Effects 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 2
- OSASVXMJTNOKOY-UHFFFAOYSA-N chlorobutanol Chemical compound CC(C)(O)C(Cl)(Cl)Cl OSASVXMJTNOKOY-UHFFFAOYSA-N 0.000 description 2
- 238000000576 coating method Methods 0.000 description 2
- 239000013068 control sample Substances 0.000 description 2
- 230000007423 decrease Effects 0.000 description 2
- 230000003247 decreasing effect Effects 0.000 description 2
- 238000006471 dimerization reaction Methods 0.000 description 2
- 239000003937 drug carrier Substances 0.000 description 2
- 241001493065 dsRNA viruses Species 0.000 description 2
- 238000001378 electrochemiluminescence detection Methods 0.000 description 2
- 230000007613 environmental effect Effects 0.000 description 2
- 239000012894 fetal calf serum Substances 0.000 description 2
- 239000012530 fluid Substances 0.000 description 2
- 239000008273 gelatin Substances 0.000 description 2
- 229920000159 gelatin Polymers 0.000 description 2
- 235000019322 gelatine Nutrition 0.000 description 2
- 235000011852 gelatine desserts Nutrition 0.000 description 2
- 235000011187 glycerol Nutrition 0.000 description 2
- 239000001963 growth medium Substances 0.000 description 2
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 2
- 238000003384 imaging method Methods 0.000 description 2
- 230000001939 inductive effect Effects 0.000 description 2
- 238000001802 infusion Methods 0.000 description 2
- 230000002601 intratumoral effect Effects 0.000 description 2
- 239000007951 isotonicity adjuster Substances 0.000 description 2
- 230000000670 limiting effect Effects 0.000 description 2
- 238000011068 loading method Methods 0.000 description 2
- HQKMJHAJHXVSDF-UHFFFAOYSA-L magnesium stearate Chemical compound [Mg+2].CCCCCCCCCCCCCCCCCC([O-])=O.CCCCCCCCCCCCCCCCCC([O-])=O HQKMJHAJHXVSDF-UHFFFAOYSA-L 0.000 description 2
- 238000013507 mapping Methods 0.000 description 2
- 239000002609 medium Substances 0.000 description 2
- OSWPMRLSEDHDFF-UHFFFAOYSA-N methyl salicylate Chemical compound COC(=O)C1=CC=CC=C1O OSWPMRLSEDHDFF-UHFFFAOYSA-N 0.000 description 2
- 230000011987 methylation Effects 0.000 description 2
- 238000007069 methylation reaction Methods 0.000 description 2
- 244000005700 microbiome Species 0.000 description 2
- 210000003470 mitochondria Anatomy 0.000 description 2
- 239000013642 negative control Substances 0.000 description 2
- 210000003463 organelle Anatomy 0.000 description 2
- 108010043655 penetratin Proteins 0.000 description 2
- MCYTYTUNNNZWOK-LCLOTLQISA-N penetratin Chemical compound C([C@H](NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H](N)CCCNC(N)=N)[C@@H](C)CC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(N)=O)C1=CC=CC=C1 MCYTYTUNNNZWOK-LCLOTLQISA-N 0.000 description 2
- 108010011110 polyarginine Proteins 0.000 description 2
- 230000023603 positive regulation of transcription initiation, DNA-dependent Effects 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 239000000047 product Substances 0.000 description 2
- 210000001236 prokaryotic cell Anatomy 0.000 description 2
- 230000002062 proliferating effect Effects 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 229950010131 puromycin Drugs 0.000 description 2
- 238000011002 quantification Methods 0.000 description 2
- 239000002096 quantum dot Substances 0.000 description 2
- 238000001959 radiotherapy Methods 0.000 description 2
- 230000007115 recruitment Effects 0.000 description 2
- 230000009467 reduction Effects 0.000 description 2
- 239000002336 ribonucleotide Substances 0.000 description 2
- 108020004418 ribosomal RNA Proteins 0.000 description 2
- 238000012163 sequencing technique Methods 0.000 description 2
- 150000003384 small molecules Chemical class 0.000 description 2
- 238000010561 standard procedure Methods 0.000 description 2
- 238000003860 storage Methods 0.000 description 2
- UCSJYZPVAKXKNQ-HZYVHMACSA-N streptomycin Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 description 2
- 238000007920 subcutaneous administration Methods 0.000 description 2
- 239000000126 substance Substances 0.000 description 2
- 239000006228 supernatant Substances 0.000 description 2
- 231100000419 toxicity Toxicity 0.000 description 2
- 230000001988 toxicity Effects 0.000 description 2
- 238000013518 transcription Methods 0.000 description 2
- 230000035897 transcription Effects 0.000 description 2
- 108091006106 transcriptional activators Proteins 0.000 description 2
- 108091006107 transcriptional repressors Proteins 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- 230000005945 translocation Effects 0.000 description 2
- 108010062760 transportan Proteins 0.000 description 2
- PBKWZFANFUTEPS-CWUSWOHSSA-N transportan Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(N)=O)[C@@H](C)CC)NC(=O)CNC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](NC(=O)[C@H](CC=1C2=CC=CC=C2NC=1)NC(=O)CN)[C@@H](C)O)C1=CC=C(O)C=C1 PBKWZFANFUTEPS-CWUSWOHSSA-N 0.000 description 2
- 210000004881 tumor cell Anatomy 0.000 description 2
- 241000701161 unidentified adenovirus Species 0.000 description 2
- 239000003981 vehicle Substances 0.000 description 2
- DIGQNXIGRZPYDK-WKSCXVIASA-N (2R)-6-amino-2-[[2-[[(2S)-2-[[2-[[(2R)-2-[[(2S)-2-[[(2R,3S)-2-[[2-[[(2S)-2-[[2-[[(2S)-2-[[(2S)-2-[[(2R)-2-[[(2S,3S)-2-[[(2R)-2-[[(2S)-2-[[(2S)-2-[[(2S)-2-[[2-[[(2S)-2-[[(2R)-2-[[2-[[2-[[2-[(2-amino-1-hydroxyethylidene)amino]-3-carboxy-1-hydroxypropylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1-hydroxyethylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1,3-dihydroxypropylidene]amino]-1-hydroxyethylidene]amino]-1-hydroxypropylidene]amino]-1,3-dihydroxypropylidene]amino]-1,3-dihydroxypropylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1,3-dihydroxybutylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1-hydroxypropylidene]amino]-1,3-dihydroxypropylidene]amino]-1-hydroxyethylidene]amino]-1,5-dihydroxy-5-iminopentylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1,3-dihydroxybutylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1,3-dihydroxypropylidene]amino]-1-hydroxyethylidene]amino]-1-hydroxy-3-sulfanylpropylidene]amino]-1-hydroxyethylidene]amino]hexanoic acid Chemical compound C[C@@H]([C@@H](C(=N[C@@H](CS)C(=N[C@@H](C)C(=N[C@@H](CO)C(=NCC(=N[C@@H](CCC(=N)O)C(=NC(CS)C(=N[C@H]([C@H](C)O)C(=N[C@H](CS)C(=N[C@H](CO)C(=NCC(=N[C@H](CS)C(=NCC(=N[C@H](CCCCN)C(=O)O)O)O)O)O)O)O)O)O)O)O)O)O)O)N=C([C@H](CS)N=C([C@H](CO)N=C([C@H](CO)N=C([C@H](C)N=C(CN=C([C@H](CO)N=C([C@H](CS)N=C(CN=C(C(CS)N=C(C(CC(=O)O)N=C(CN)O)O)O)O)O)O)O)O)O)O)O)O DIGQNXIGRZPYDK-WKSCXVIASA-N 0.000 description 1
- MTCFGRXMJLQNBG-REOHCLBHSA-N (2S)-2-Amino-3-hydroxypropanoic acid Chemical compound OC[C@H](N)C(O)=O MTCFGRXMJLQNBG-REOHCLBHSA-N 0.000 description 1
- UKAUYVFTDYCKQA-UHFFFAOYSA-N -2-Amino-4-hydroxybutanoic acid Natural products OC(=O)C(N)CCO UKAUYVFTDYCKQA-UHFFFAOYSA-N 0.000 description 1
- IIZPXYDJLKNOIY-JXPKJXOSSA-N 1-palmitoyl-2-arachidonoyl-sn-glycero-3-phosphocholine Chemical compound CCCCCCCCCCCCCCCC(=O)OC[C@H](COP([O-])(=O)OCC[N+](C)(C)C)OC(=O)CCC\C=C/C\C=C/C\C=C/C\C=C/CCCCC IIZPXYDJLKNOIY-JXPKJXOSSA-N 0.000 description 1
- CNJLMVZFWLNOEP-UHFFFAOYSA-N 4,7,7-trimethylbicyclo[4.1.0]heptan-5-one Chemical compound O=C1C(C)CCC2C(C)(C)C12 CNJLMVZFWLNOEP-UHFFFAOYSA-N 0.000 description 1
- UZOVYGYOLBIAJR-UHFFFAOYSA-N 4-isocyanato-4'-methyldiphenylmethane Chemical compound C1=CC(C)=CC=C1CC1=CC=C(N=C=O)C=C1 UZOVYGYOLBIAJR-UHFFFAOYSA-N 0.000 description 1
- 102100033647 Activity-regulated cytoskeleton-associated protein Human genes 0.000 description 1
- 102100030379 Acyl-coenzyme A synthetase ACSM2A, mitochondrial Human genes 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- 101001030716 Arabidopsis thaliana Histone deacetylase HDT1 Proteins 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- 241000416162 Astragalus gummifer Species 0.000 description 1
- 241000194110 Bacillus sp. (in: Bacteria) Species 0.000 description 1
- 102100023932 Bcl-2-like protein 2 Human genes 0.000 description 1
- 108010081589 Becaplermin Proteins 0.000 description 1
- 208000018084 Bone neoplasm Diseases 0.000 description 1
- 206010006187 Breast cancer Diseases 0.000 description 1
- QCMYYKRYFNMIEC-UHFFFAOYSA-N COP(O)=O Chemical class COP(O)=O QCMYYKRYFNMIEC-UHFFFAOYSA-N 0.000 description 1
- 101710132601 Capsid protein Proteins 0.000 description 1
- 208000005623 Carcinogenesis Diseases 0.000 description 1
- 102100025064 Cellular tumor antigen p53 Human genes 0.000 description 1
- 101710150820 Cellular tumor antigen p53 Proteins 0.000 description 1
- 101710094648 Coat protein Proteins 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 229920002261 Corn starch Polymers 0.000 description 1
- 102000016736 Cyclin Human genes 0.000 description 1
- 108050006400 Cyclin Proteins 0.000 description 1
- PMATZTZNYRCHOR-CGLBZJNRSA-N Cyclosporin A Chemical compound CC[C@@H]1NC(=O)[C@H]([C@H](O)[C@H](C)C\C=C\C)N(C)C(=O)[C@H](C(C)C)N(C)C(=O)[C@H](CC(C)C)N(C)C(=O)[C@H](CC(C)C)N(C)C(=O)[C@@H](C)NC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)N(C)C(=O)[C@H](C(C)C)NC(=O)[C@H](CC(C)C)N(C)C(=O)CN(C)C1=O PMATZTZNYRCHOR-CGLBZJNRSA-N 0.000 description 1
- 108010036949 Cyclosporine Proteins 0.000 description 1
- FBPFZTCFMRRESA-FSIIMWSLSA-N D-Glucitol Natural products OC[C@H](O)[C@H](O)[C@@H](O)[C@H](O)CO FBPFZTCFMRRESA-FSIIMWSLSA-N 0.000 description 1
- FBPFZTCFMRRESA-KVTDHHQDSA-N D-Mannitol Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)[C@H](O)CO FBPFZTCFMRRESA-KVTDHHQDSA-N 0.000 description 1
- FBPFZTCFMRRESA-JGWLITMVSA-N D-glucitol Chemical compound OC[C@H](O)[C@@H](O)[C@H](O)[C@H](O)CO FBPFZTCFMRRESA-JGWLITMVSA-N 0.000 description 1
- 230000033616 DNA repair Effects 0.000 description 1
- 241000450599 DNA viruses Species 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 239000006144 Dulbecco’s modified Eagle's medium Substances 0.000 description 1
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 1
- 239000006145 Eagle's minimal essential medium Substances 0.000 description 1
- 108010067770 Endopeptidase K Proteins 0.000 description 1
- 102100030013 Endoribonuclease Human genes 0.000 description 1
- 108010093099 Endoribonucleases Proteins 0.000 description 1
- 101001091269 Escherichia coli Hygromycin-B 4-O-kinase Proteins 0.000 description 1
- 108091092566 Extrachromosomal DNA Proteins 0.000 description 1
- 108010001515 Galectin 4 Proteins 0.000 description 1
- 102100039556 Galectin-4 Human genes 0.000 description 1
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 102100021181 Golgi phosphoprotein 3 Human genes 0.000 description 1
- 208000002250 Hematologic Neoplasms Diseases 0.000 description 1
- 102000008157 Histone Demethylases Human genes 0.000 description 1
- 108010074870 Histone Demethylases Proteins 0.000 description 1
- 102000011787 Histone Methyltransferases Human genes 0.000 description 1
- 108010036115 Histone Methyltransferases Proteins 0.000 description 1
- 102000003964 Histone deacetylase Human genes 0.000 description 1
- 108090000353 Histone deacetylase Proteins 0.000 description 1
- 108010033040 Histones Proteins 0.000 description 1
- 102000010029 Homer Scaffolding Proteins Human genes 0.000 description 1
- 108010077223 Homer Scaffolding Proteins Proteins 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101100054737 Homo sapiens ACSM2A gene Proteins 0.000 description 1
- 101000904691 Homo sapiens Bcl-2-like protein 2 Proteins 0.000 description 1
- 101000599778 Homo sapiens Insulin-like growth factor 2 mRNA-binding protein 1 Proteins 0.000 description 1
- 101000615488 Homo sapiens Methyl-CpG-binding domain protein 2 Proteins 0.000 description 1
- 101000615495 Homo sapiens Methyl-CpG-binding domain protein 3 Proteins 0.000 description 1
- 101000903686 Homo sapiens Procollagen galactosyltransferase 1 Proteins 0.000 description 1
- 101000979455 Homo sapiens Protein Niban 3 Proteins 0.000 description 1
- 108010001336 Horseradish Peroxidase Proteins 0.000 description 1
- 241000701044 Human gammaherpesvirus 4 Species 0.000 description 1
- PMMYEEVYMWASQN-DMTCNVIQSA-N Hydroxyproline Chemical compound O[C@H]1CN[C@H](C(O)=O)C1 PMMYEEVYMWASQN-DMTCNVIQSA-N 0.000 description 1
- 206010021143 Hypoxia Diseases 0.000 description 1
- 206010062016 Immunosuppression Diseases 0.000 description 1
- 102100037924 Insulin-like growth factor 2 mRNA-binding protein 1 Human genes 0.000 description 1
- 108020004684 Internal Ribosome Entry Sites Proteins 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- UKAUYVFTDYCKQA-VKHMYHEASA-N L-homoserine Chemical group OC(=O)[C@@H](N)CCO UKAUYVFTDYCKQA-VKHMYHEASA-N 0.000 description 1
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical group CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 1
- QEFRNWWLZKMPFJ-ZXPFJRLXSA-N L-methionine (R)-S-oxide Chemical group C[S@@](=O)CC[C@H]([NH3+])C([O-])=O QEFRNWWLZKMPFJ-ZXPFJRLXSA-N 0.000 description 1
- QEFRNWWLZKMPFJ-UHFFFAOYSA-N L-methionine sulphoxide Chemical group CS(=O)CCC(N)C(O)=O QEFRNWWLZKMPFJ-UHFFFAOYSA-N 0.000 description 1
- 108010054278 Lac Repressors Proteins 0.000 description 1
- GUBGYTABKSRVRQ-QKKXKWKRSA-N Lactose Natural products OC[C@H]1O[C@@H](O[C@H]2[C@H](O)[C@@H](O)C(O)O[C@@H]2CO)[C@H](O)[C@@H](O)[C@H]1O GUBGYTABKSRVRQ-QKKXKWKRSA-N 0.000 description 1
- 238000012307 MRI technique Methods 0.000 description 1
- 101710125418 Major capsid protein Proteins 0.000 description 1
- 229930195725 Mannitol Natural products 0.000 description 1
- 244000246386 Mentha pulegium Species 0.000 description 1
- 235000016257 Mentha pulegium Nutrition 0.000 description 1
- 235000004357 Mentha x piperita Nutrition 0.000 description 1
- 102000003792 Metallothionein Human genes 0.000 description 1
- 108090000157 Metallothionein Proteins 0.000 description 1
- 102000006890 Methyl-CpG-Binding Protein 2 Human genes 0.000 description 1
- 108010072388 Methyl-CpG-Binding Protein 2 Proteins 0.000 description 1
- 102100021299 Methyl-CpG-binding domain protein 2 Human genes 0.000 description 1
- 102100021291 Methyl-CpG-binding domain protein 3 Human genes 0.000 description 1
- 108060004795 Methyltransferase Proteins 0.000 description 1
- 102000016397 Methyltransferase Human genes 0.000 description 1
- 229920000168 Microcrystalline cellulose Polymers 0.000 description 1
- 241000713869 Moloney murine leukemia virus Species 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- 241000204031 Mycoplasma Species 0.000 description 1
- 108091061960 Naked DNA Proteins 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 108020002230 Pancreatic Ribonuclease Proteins 0.000 description 1
- 102000005891 Pancreatic ribonuclease Human genes 0.000 description 1
- 241001631646 Papillomaviridae Species 0.000 description 1
- 229930182555 Penicillin Natural products 0.000 description 1
- JGSARLDLIJGVTE-MBNYWOFBSA-N Penicillin G Chemical compound N([C@H]1[C@H]2SC([C@@H](N2C1=O)C(O)=O)(C)C)C(=O)CC1=CC=CC=C1 JGSARLDLIJGVTE-MBNYWOFBSA-N 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- 241000288906 Primates Species 0.000 description 1
- 101710083689 Probable capsid protein Proteins 0.000 description 1
- 102100022982 Procollagen galactosyltransferase 1 Human genes 0.000 description 1
- 102100023095 Protein Niban 3 Human genes 0.000 description 1
- 239000013614 RNA sample Substances 0.000 description 1
- 230000004570 RNA-binding Effects 0.000 description 1
- 239000012980 RPMI-1640 medium Substances 0.000 description 1
- 241000714474 Rous sarcoma virus Species 0.000 description 1
- 241000607142 Salmonella Species 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 1
- DWAQJAXMDSEUJJ-UHFFFAOYSA-M Sodium bisulfite Chemical compound [Na+].OS([O-])=O DWAQJAXMDSEUJJ-UHFFFAOYSA-M 0.000 description 1
- 229920002472 Starch Polymers 0.000 description 1
- 241000193996 Streptococcus pyogenes Species 0.000 description 1
- 101000910035 Streptococcus pyogenes serotype M1 CRISPR-associated endonuclease Cas9/Csn1 Proteins 0.000 description 1
- 101001091268 Streptomyces hygroscopicus Hygromycin-B 7''-O-kinase Proteins 0.000 description 1
- 229930006000 Sucrose Natural products 0.000 description 1
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 1
- 108091046869 Telomeric non-coding RNA Proteins 0.000 description 1
- 239000004098 Tetracycline Substances 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-N Thiophosphoric acid Chemical class OP(O)(S)=O RYYWUUFWQRZTIU-UHFFFAOYSA-N 0.000 description 1
- 108010022394 Threonine synthase Proteins 0.000 description 1
- 102000006601 Thymidine Kinase Human genes 0.000 description 1
- 108020004440 Thymidine kinase Proteins 0.000 description 1
- 229920001615 Tragacanth Polymers 0.000 description 1
- 108700019146 Transgenes Proteins 0.000 description 1
- 241000700618 Vaccinia virus Species 0.000 description 1
- 108700005077 Viral Genes Proteins 0.000 description 1
- 108010067390 Viral Proteins Proteins 0.000 description 1
- 101100082060 Xenopus laevis pou5f1.1 gene Proteins 0.000 description 1
- 241000209149 Zea Species 0.000 description 1
- 108010017070 Zinc Finger Nucleases Proteins 0.000 description 1
- 230000007488 abnormal function Effects 0.000 description 1
- 239000003070 absorption delaying agent Substances 0.000 description 1
- 150000001242 acetic acid derivatives Chemical class 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- 229960000643 adenine Drugs 0.000 description 1
- 239000002671 adjuvant Substances 0.000 description 1
- 239000000443 aerosol Substances 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- 239000000783 alginic acid Substances 0.000 description 1
- 235000010443 alginic acid Nutrition 0.000 description 1
- 229920000615 alginic acid Polymers 0.000 description 1
- 229960001126 alginic acid Drugs 0.000 description 1
- 150000004781 alginic acids Chemical class 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 210000004102 animal cell Anatomy 0.000 description 1
- 230000003042 antagnostic effect Effects 0.000 description 1
- 239000003963 antioxidant agent Substances 0.000 description 1
- 235000006708 antioxidants Nutrition 0.000 description 1
- 239000007864 aqueous solution Substances 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 208000006673 asthma Diseases 0.000 description 1
- 230000003385 bacteriostatic effect Effects 0.000 description 1
- 239000011324 bead Substances 0.000 description 1
- 235000019445 benzyl alcohol Nutrition 0.000 description 1
- WQZGKKKJIJFFOK-VFUOTHLCSA-N beta-D-glucose Chemical compound OC[C@H]1O[C@@H](O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-VFUOTHLCSA-N 0.000 description 1
- 230000003115 biocidal effect Effects 0.000 description 1
- 238000007622 bioinformatic analysis Methods 0.000 description 1
- 230000031018 biological processes and functions Effects 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 230000008499 blood brain barrier function Effects 0.000 description 1
- 210000001218 blood-brain barrier Anatomy 0.000 description 1
- 210000001185 bone marrow Anatomy 0.000 description 1
- 210000004556 brain Anatomy 0.000 description 1
- 238000010804 cDNA synthesis Methods 0.000 description 1
- 239000001506 calcium phosphate Substances 0.000 description 1
- 229910000389 calcium phosphate Inorganic materials 0.000 description 1
- 235000011010 calcium phosphates Nutrition 0.000 description 1
- 230000036952 cancer formation Effects 0.000 description 1
- 229910052799 carbon Inorganic materials 0.000 description 1
- 239000001569 carbon dioxide Substances 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 231100000504 carcinogenesis Toxicity 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 210000000845 cartilage Anatomy 0.000 description 1
- 125000002091 cationic group Chemical group 0.000 description 1
- 230000010261 cell growth Effects 0.000 description 1
- 210000000170 cell membrane Anatomy 0.000 description 1
- 238000003570 cell viability assay Methods 0.000 description 1
- 238000012054 celltiter-glo Methods 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 239000002738 chelating agent Substances 0.000 description 1
- 238000010382 chemical cross-linking Methods 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 229960004926 chlorobutanol Drugs 0.000 description 1
- 210000003763 chloroplast Anatomy 0.000 description 1
- 239000013611 chromosomal DNA Substances 0.000 description 1
- 229960001265 ciclosporin Drugs 0.000 description 1
- 150000001860 citric acid derivatives Chemical class 0.000 description 1
- 239000011248 coating agent Substances 0.000 description 1
- 229940075614 colloidal silicon dioxide Drugs 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 239000002872 contrast media Substances 0.000 description 1
- 238000013270 controlled release Methods 0.000 description 1
- 239000008120 corn starch Substances 0.000 description 1
- 230000001351 cycling effect Effects 0.000 description 1
- 229930182912 cyclosporin Natural products 0.000 description 1
- 125000000151 cysteine group Chemical group N[C@@H](CS)C(=O)* 0.000 description 1
- 210000000805 cytoplasm Anatomy 0.000 description 1
- 231100000135 cytotoxicity Toxicity 0.000 description 1
- 230000003013 cytotoxicity Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 230000001934 delay Effects 0.000 description 1
- 230000003111 delayed effect Effects 0.000 description 1
- 230000017858 demethylation Effects 0.000 description 1
- 238000010520 demethylation reaction Methods 0.000 description 1
- 238000004925 denaturation Methods 0.000 description 1
- 230000036425 denaturation Effects 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 239000008121 dextrose Substances 0.000 description 1
- 102000004419 dihydrofolate reductase Human genes 0.000 description 1
- UGMCXQCYOVCMTB-UHFFFAOYSA-K dihydroxy(stearato)aluminium Chemical compound CCCCCCCCCCCCCCCCCC(=O)O[Al](O)O UGMCXQCYOVCMTB-UHFFFAOYSA-K 0.000 description 1
- 239000003085 diluting agent Substances 0.000 description 1
- 238000010790 dilution Methods 0.000 description 1
- 239000012895 dilution Substances 0.000 description 1
- PMMYEEVYMWASQN-UHFFFAOYSA-N dl-hydroxyproline Natural products OC1C[NH2+]C(C([O-])=O)C1 PMMYEEVYMWASQN-UHFFFAOYSA-N 0.000 description 1
- 230000003828 downregulation Effects 0.000 description 1
- 241001492478 dsDNA viruses, no RNA stage Species 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 230000013020 embryo development Effects 0.000 description 1
- 238000012407 engineering method Methods 0.000 description 1
- 230000001973 epigenetic effect Effects 0.000 description 1
- 239000000796 flavoring agent Substances 0.000 description 1
- 101150023663 flu gene Proteins 0.000 description 1
- 235000013355 food flavoring agent Nutrition 0.000 description 1
- 235000003599 food sweetener Nutrition 0.000 description 1
- 230000002538 fungal effect Effects 0.000 description 1
- 230000005021 gait Effects 0.000 description 1
- 239000007789 gas Substances 0.000 description 1
- 210000001035 gastrointestinal tract Anatomy 0.000 description 1
- 239000000499 gel Substances 0.000 description 1
- 239000007903 gelatin capsule Substances 0.000 description 1
- 238000010199 gene set enrichment analysis Methods 0.000 description 1
- 230000030279 gene silencing Effects 0.000 description 1
- 238000001415 gene therapy Methods 0.000 description 1
- 102000034356 gene-regulatory proteins Human genes 0.000 description 1
- 108091006104 gene-regulatory proteins Proteins 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 239000011521 glass Substances 0.000 description 1
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 1
- 239000010931 gold Substances 0.000 description 1
- 229910052737 gold Inorganic materials 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 230000009033 hematopoietic malignancy Effects 0.000 description 1
- 208000006454 hepatitis Diseases 0.000 description 1
- 231100000283 hepatitis Toxicity 0.000 description 1
- 239000000833 heterodimer Substances 0.000 description 1
- 125000000487 histidyl group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C([H])=N1 0.000 description 1
- 230000006801 homologous recombination Effects 0.000 description 1
- 238000002744 homologous recombination Methods 0.000 description 1
- 235000001050 hortel pimenta Nutrition 0.000 description 1
- 239000001257 hydrogen Substances 0.000 description 1
- 229910052739 hydrogen Inorganic materials 0.000 description 1
- 125000004435 hydrogen atom Chemical group [H]* 0.000 description 1
- 230000002209 hydrophobic effect Effects 0.000 description 1
- 229960002591 hydroxyproline Drugs 0.000 description 1
- 239000001866 hydroxypropyl methyl cellulose Substances 0.000 description 1
- 235000010979 hydroxypropyl methyl cellulose Nutrition 0.000 description 1
- 229920003088 hydroxypropyl methyl cellulose Polymers 0.000 description 1
- UFVKGYZPFZQRLF-UHFFFAOYSA-N hydroxypropyl methyl cellulose Chemical compound OC1C(O)C(OC)OC(CO)C1OC1C(O)C(O)C(OC2C(C(O)C(OC3C(C(O)C(O)C(CO)O3)O)C(CO)O2)O)C(CO)O1 UFVKGYZPFZQRLF-UHFFFAOYSA-N 0.000 description 1
- 230000007954 hypoxia Effects 0.000 description 1
- 230000028993 immune response Effects 0.000 description 1
- 230000001506 immunosuppresive effect Effects 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 230000000415 inactivating effect Effects 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 239000003701 inert diluent Substances 0.000 description 1
- 208000015181 infectious disease Diseases 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 239000007972 injectable composition Substances 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 239000012212 insulator Substances 0.000 description 1
- 230000010354 integration Effects 0.000 description 1
- 210000003093 intracellular space Anatomy 0.000 description 1
- 238000007912 intraperitoneal administration Methods 0.000 description 1
- 238000007913 intrathecal administration Methods 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 238000005304 joining Methods 0.000 description 1
- 239000008101 lactose Substances 0.000 description 1
- 235000010445 lecithin Nutrition 0.000 description 1
- 239000000787 lecithin Substances 0.000 description 1
- 229940067606 lecithin Drugs 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 230000033001 locomotion Effects 0.000 description 1
- 230000007774 longterm Effects 0.000 description 1
- 239000000314 lubricant Substances 0.000 description 1
- 238000004020 luminiscence type Methods 0.000 description 1
- 235000019359 magnesium stearate Nutrition 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 230000036210 malignancy Effects 0.000 description 1
- 239000000594 mannitol Substances 0.000 description 1
- 235000010355 mannitol Nutrition 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 239000012528 membrane Substances 0.000 description 1
- 210000003716 mesoderm Anatomy 0.000 description 1
- 229930182817 methionine Chemical group 0.000 description 1
- STZCRXQWRGQSJD-GEEYTBSJSA-M methyl orange Chemical compound [Na+].C1=CC(N(C)C)=CC=C1\N=N\C1=CC=C(S([O-])(=O)=O)C=C1 STZCRXQWRGQSJD-GEEYTBSJSA-M 0.000 description 1
- 229940012189 methyl orange Drugs 0.000 description 1
- 235000010270 methyl p-hydroxybenzoate Nutrition 0.000 description 1
- 229960001047 methyl salicylate Drugs 0.000 description 1
- LSDPWZHWYPCBBB-UHFFFAOYSA-O methylsulfide anion Chemical group [SH2+]C LSDPWZHWYPCBBB-UHFFFAOYSA-O 0.000 description 1
- 235000019813 microcrystalline cellulose Nutrition 0.000 description 1
- 239000008108 microcrystalline cellulose Substances 0.000 description 1
- 229940016286 microcrystalline cellulose Drugs 0.000 description 1
- 238000000520 microinjection Methods 0.000 description 1
- 239000004005 microsphere Substances 0.000 description 1
- 231100000324 minimal toxicity Toxicity 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 108091005601 modified peptides Proteins 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 239000002324 mouth wash Substances 0.000 description 1
- 229940051866 mouthwash Drugs 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 210000000663 muscle cell Anatomy 0.000 description 1
- 239000006199 nebulizer Substances 0.000 description 1
- 210000005036 nerve Anatomy 0.000 description 1
- 230000001537 neural effect Effects 0.000 description 1
- 230000004766 neurogenesis Effects 0.000 description 1
- 239000000346 nonvolatile oil Substances 0.000 description 1
- 230000008266 oncogenic mechanism Effects 0.000 description 1
- 230000002018 overexpression Effects 0.000 description 1
- 238000004806 packaging method and process Methods 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 230000007310 pathophysiology Effects 0.000 description 1
- 239000008188 pellet Substances 0.000 description 1
- 229940049954 penicillin Drugs 0.000 description 1
- 229960003742 phenol Drugs 0.000 description 1
- 235000021317 phosphate Nutrition 0.000 description 1
- 150000008298 phosphoramidates Chemical class 0.000 description 1
- 150000003013 phosphoric acid derivatives Chemical class 0.000 description 1
- BZQFBWGGLXLEPQ-REOHCLBHSA-N phosphoserine Chemical compound OC(=O)[C@@H](N)COP(O)(O)=O BZQFBWGGLXLEPQ-REOHCLBHSA-N 0.000 description 1
- 102000020233 phosphotransferase Human genes 0.000 description 1
- 239000002504 physiological saline solution Substances 0.000 description 1
- 239000006187 pill Substances 0.000 description 1
- 239000013600 plasmid vector Substances 0.000 description 1
- 239000004033 plastic Substances 0.000 description 1
- 229920003023 plastic Polymers 0.000 description 1
- 229920000724 poly(L-arginine) polymer Polymers 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 239000008389 polyethoxylated castor oil Substances 0.000 description 1
- 102000040430 polynucleotide Human genes 0.000 description 1
- 108091033319 polynucleotide Proteins 0.000 description 1
- 239000002157 polynucleotide Substances 0.000 description 1
- 229920005862 polyol Polymers 0.000 description 1
- 150000003077 polyols Chemical class 0.000 description 1
- 230000029279 positive regulation of transcription, DNA-dependent Effects 0.000 description 1
- 239000002243 precursor Substances 0.000 description 1
- 230000002265 prevention Effects 0.000 description 1
- 230000002035 prolonged effect Effects 0.000 description 1
- 239000003380 propellant Substances 0.000 description 1
- 230000004845 protein aggregation Effects 0.000 description 1
- 238000001742 protein purification Methods 0.000 description 1
- 210000001938 protoplast Anatomy 0.000 description 1
- 230000022983 regulation of cell cycle Effects 0.000 description 1
- 230000000754 repressing effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- CVHZOJJKTDOEJC-UHFFFAOYSA-N saccharin Chemical compound C1=CC=C2C(=O)NS(=O)(=O)C2=C1 CVHZOJJKTDOEJC-UHFFFAOYSA-N 0.000 description 1
- 229940081974 saccharin Drugs 0.000 description 1
- 235000019204 saccharin Nutrition 0.000 description 1
- 239000000901 saccharin and its Na,K and Ca salt Substances 0.000 description 1
- 239000000523 sample Substances 0.000 description 1
- 230000035939 shock Effects 0.000 description 1
- 239000013605 shuttle vector Substances 0.000 description 1
- 235000010267 sodium hydrogen sulphite Nutrition 0.000 description 1
- 210000004872 soft tissue Anatomy 0.000 description 1
- 239000000600 sorbitol Substances 0.000 description 1
- 125000006850 spacer group Chemical group 0.000 description 1
- 230000009870 specific binding Effects 0.000 description 1
- 239000007921 spray Substances 0.000 description 1
- 241001147420 ssDNA viruses Species 0.000 description 1
- 239000008107 starch Substances 0.000 description 1
- 235000019698 starch Nutrition 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 238000000528 statistical test Methods 0.000 description 1
- 230000001954 sterilising effect Effects 0.000 description 1
- 238000004659 sterilization and disinfection Methods 0.000 description 1
- 229960005322 streptomycin Drugs 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 239000000758 substrate Substances 0.000 description 1
- 239000005720 sucrose Substances 0.000 description 1
- 235000000346 sugar Nutrition 0.000 description 1
- 150000005846 sugar alcohols Polymers 0.000 description 1
- 150000008163 sugars Chemical class 0.000 description 1
- 239000004094 surface-active agent Substances 0.000 description 1
- 230000004083 survival effect Effects 0.000 description 1
- 239000000725 suspension Substances 0.000 description 1
- 238000013268 sustained release Methods 0.000 description 1
- 239000012730 sustained-release form Substances 0.000 description 1
- 239000003765 sweetening agent Substances 0.000 description 1
- 239000003826 tablet Substances 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 229930101283 tetracycline Natural products 0.000 description 1
- 229960002180 tetracycline Drugs 0.000 description 1
- 235000019364 tetracycline Nutrition 0.000 description 1
- 150000003522 tetracyclines Chemical class 0.000 description 1
- 229940124597 therapeutic agent Drugs 0.000 description 1
- 229940126585 therapeutic drug Drugs 0.000 description 1
- RTKIYNMVFMVABJ-UHFFFAOYSA-L thimerosal Chemical compound [Na+].CC[Hg]SC1=CC=CC=C1C([O-])=O RTKIYNMVFMVABJ-UHFFFAOYSA-L 0.000 description 1
- 229940033663 thimerosal Drugs 0.000 description 1
- 231100000331 toxic Toxicity 0.000 description 1
- 230000002588 toxic effect Effects 0.000 description 1
- FGMPLJWBKKVCDB-UHFFFAOYSA-N trans-L-hydroxy-proline Natural products ON1CCCC1C(O)=O FGMPLJWBKKVCDB-UHFFFAOYSA-N 0.000 description 1
- 238000003151 transfection method Methods 0.000 description 1
- 230000009261 transgenic effect Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 1
- 241001529453 unidentified herpesvirus Species 0.000 description 1
- 230000003827 upregulation Effects 0.000 description 1
- 238000001291 vacuum drying Methods 0.000 description 1
- 238000009777 vacuum freeze-drying Methods 0.000 description 1
- 210000000605 viral structure Anatomy 0.000 description 1
- 239000000277 virosome Substances 0.000 description 1
- 238000012800 visualization Methods 0.000 description 1
- 239000008215 water for injection Substances 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/46—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
- C07K14/47—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
- C07K14/4701—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
- C07K14/4702—Regulators; Modulating activity
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K2319/00—Fusion polypeptide
- C07K2319/80—Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor
- C07K2319/81—Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor containing a Zn-finger domain for DNA binding
Definitions
- The present invention relates to compositions comprising zinc fingers and methods of use thereof for the treatment of nucleotide repeat expansion disorders such as Ewing sarcoma.
- Nucleotide repeat expansion disorders involve the localized expansion of unstable repeats of sets of three, four, five, or more nucleotides and can result in loss of function of the gene in which the repeat resides, a gain of toxic function, or both. Expanded repeat regions within non-coding sequences can lead to aberrant expression of the gene while expanded repeats within coding regions (also known as codon reiteration disorders) may cause mis-folding and protein aggregation. The exact cause of the pathophysiology associated with the aberrant proteins is often not known.
- Ewing sarcoma is an aggressive pediatric malignancy that likely arises from neural crest- or mesoderm-derived mesenchymal stem cells (MSCs). It is driven by oncogenic fusions between EWS and genes in the ETS family (mostly FLI1). EWS-FLI1 binds DNA either at ETS-like consensus sites containing a GGAA core motif or, more specifically with respect to other ETS family members, at GGAA microsatellites, where the enhancer activity increases with the number of consecutive GGAA motifs.
- The human genome contains thousands of GGAA microsatellites. As such, Ewing sarcoma is caused by the widespread activation of GGAA microsatellites, which illustrates the need for therapeutic agents that are able to perturb these elements.
- Repeat elements can be dysregulated at genome-wide scale in human diseases.
- Ewing sarcoma hundreds of normally inert GGAA tandem repeats can be converted into de novo transcriptional enhancers when bound by the EWS-FLI1 oncogenic fusion protein.
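To make the notion of a GGAA microsatellite (a run of consecutive GGAA motifs) concrete, the following is a minimal sketch that scans a DNA string for such runs. The function name `find_ggaa_repeats`, the `min_repeats=4` threshold, and the example sequence are illustrative assumptions, not values taken from the source; the sketch also scans only the strand it is given.

```python
import re

def find_ggaa_repeats(seq, min_repeats=4):
    """Return (start, end, n_repeats) for each run of consecutive GGAA motifs.

    min_repeats is a hypothetical cutoff chosen for illustration only.
    Coordinates are 0-based, end-exclusive, on the strand provided.
    """
    pattern = re.compile(r"(?:GGAA){%d,}" % min_repeats)
    hits = []
    for match in pattern.finditer(seq.upper()):
        length = match.end() - match.start()
        hits.append((match.start(), match.end(), length // 4))
    return hits

# Made-up example: a five-unit GGAA run followed by an isolated GGAA motif.
example = "TTAC" + "GGAA" * 5 + "TCC" + "GGAA" + "TT"
print(find_ggaa_repeats(example))
```

Only the five-unit run is reported here; the isolated GGAA motif falls below the assumed cutoff.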
- ZFAs zinc finger arrays
- a fusion of a KRAB repression domain to a GGAA repeat-targeted ZFA could silence GGAA microsatellite enhancers genome-wide in Ewing sarcoma cells, thereby reducing expression of EWS-FLI1-activated genes.
- this KRAB-ZFA fusion showed selective toxicity against Ewing sarcoma cell lines compared with other non-Ewing cancer cell lines, consistent with its Ewing sarcoma-specific impact on the transcriptome.
- engineered zinc finger arrays comprising 6 zinc finger recognition regions, wherein the zinc finger array binds a target sequence of GGAAGGAAGGAAGGAAGG (SEQ ID NO:4) or AAGGAAGGAAGGAAGGAA (SEQ ID NO: 5).
- the engineered zinc finger array comprises at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to an amino acid sequence set forth in any one of SEQ ID NOs:24-39.
- the engineered zinc finger array comprises the amino acid sequence set forth in SEQ ID NO: 30.
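- By way of non-limiting illustration, the percent sequence identity thresholds recited above can be checked with a short calculation. The sketch below assumes that the two amino acid sequences have already been aligned to equal length with "-" marking gaps, and that identity is counted over the full alignment length; both are assumptions, since several conventions for computing identity exist and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: gaps are written as '-' and the denominator is the
    alignment length (one of several common conventions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    # Count positions where both sequences carry the same residue (not a gap).
    matches = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the identity is at least the given percentage."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two ten-residue sequences that differ at a single position would score 90% under this convention and would satisfy any of the recited thresholds up to 90%.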
- isolated cells comprising the zinc finger array according to any one of the aforementioned embodiments.
- provided herein are isolated nucleic acids encoding the zinc finger array according to any one of the aforementioned embodiments.
- provided herein is a vector comprising the isolated nucleic acid described above.
- fusion protein comprising the zinc finger arrays according to any one of the aforementioned embodiments fused to a heterologous functional domain, with an optional intervening linker, wherein the linker does not interfere with activity of the fusion protein.
- the heterologous functional domain is a transcriptional silencer or transcriptional repression domain.
- the transcriptional repression domain is a Krueppel-associated box (KRAB) domain, ERF repressor domain (ERD), or mSin3A interaction domain (SID).
- the transcriptional silencer is Heterochromatin Protein 1 (HP1).
- isolated cells comprising the fusion protein according to any one of the aforementioned embodiments.
- provided herein are isolated nucleic acids encoding the fusion according to any one of the aforementioned embodiments.
- provided herein is a vector comprising the isolated nucleic acid described above.
- Also provided herein are methods of reducing aberrant gene expression driven by activation of GGAA-microsatellites in a cell comprising contacting the cell with an effective amount of the fusion proteins as described above, or the isolated nucleic acid as described above. Also provided herein are methods of treating a subject who has a disease associated with aberrant gene expression driven by activation of GGAA-microsatellites in a cell, the method comprising administering to the subject an effective amount of a composition comprising the fusion proteins as described above, or the isolated nucleic acid as described above. In some embodiments of the methods described above, the subject has Ewing sarcoma.
- the composition is administered by injection into or near a tumor, or by application after surgical resection. In some embodiments of the methods described above, the composition is administered by injection into or near a tumor, or by application before surgical resection. In some embodiments, the method of treating a subject further comprises treating a subject with one or more chemotherapy agents. In some embodiments, the chemotherapy is one of vincristine, doxorubicin, cyclophosphamide, ifosfamide, etoposide, or a combination thereof. In some embodiments, the composition is administered before radiation.
- FIGs. 1A-1D Engineering ZFAs to bind GGAA microsatellites in the human genome and efficient activation of a target gene by engineered ZFAs fused to EWS.
- FIG. 1A Schematic of 16 ZFAs, each engineered to bind ~4.5 GGAA microsatellites.
- the ZFAs have six zinc fingers, and each finger recognizes three nucleotides.
- the target sequences of ZFA 1 through 8 start with GGA, and ZFA 9 through 16 with AAG.
- the amino acid compositions of recognition helices for each zinc finger are shown on the right. Multiple zinc fingers with different recognition helices can recognize the same nucleotides.
- FIG. 1B The amino acid compositions of recognition helices for each zinc finger are shown on the right. Multiple zinc fingers with different recognition helices can recognize the same nucleotides.
- FIG. 1C 32 fusions of EWS and ZFAs that target GGAA repeats were tested for UGT3A2 gene activation in U2OS cells by nucleofection.
- EWS-ZFA7 closely mimicked the activation level of EWS-FLI1 and therefore was selected for further experiments.
- FIG. 1D 32 fusions of EWS and ZFAs that target GGAA repeats were tested for UGT3A2 gene activation in U2OS cells by nucleofection.
- EWS-ZFA7 closely mimicked the activation level of EWS-FLI1 and therefore was selected for further experiments.
- mRNA expression of UGT3A2 in U2OS cells nucleofected with EWS-ZFA7, EWS-dCas9, dCas9-EWS or dCas9-based bi-partite EWS activators (dCas9-DmrA and DmrC-EWS).
- the bi-partite system increases the density of EWS molecules recruited to a target site.
- FIGs. 2A-2B Gene activation by dCas9-based EWS activators targeting specific promoters in the human genome.
- FIG. 2A mRNA expression levels of the endogenous IL2RA, CD69, HBB, and HBG promoters in the presence of EWS-dCas9, dCas9-EWS or dCas9-based bi-partite EWS with single gRNAs (1, 2, and 3) or pooled gRNAs (all) targeting promoter sequences in U2OS cells. Relative expression of each gene was measured by RT-qPCR, normalized to HPRT levels and calculated relative to that of a control sample expressing a non-targeting gRNA.
- FIGs. 3A-3H Efficient and specific binding of EWS-ZFA at GGAA repeats in MSCs induces active chromatin and activation of GGAA repeat associated genes.
- FIG. 3A GGAA repeat motifs identified at sites bound by EWS-ZFA in MSCs (AGGAAGGAAGGAAGGAAGGAAGGA, SEQ ID NO: 134).
- FIG. 3C Bar plots showing the fraction of GGAA repeats in the genome bound by EWS-ZFA (left) and EWS-FLI1 (right) upon lentiviral transduction in MSCs. Data from one of two biological replicate experiments is shown. The number of consecutive GGAA repeats in each category is shown on the x-axis.
- FIG. 3E Example showing the binding of 3xHA-tagged EWS-FLI1 or EWS-ZFA and accompanying H3K27ac levels in MSCs at the IGF2BP1 locus containing a GGAA repeat element and a canonical ETS binding site.
- FIG. 3F Example showing the binding of 3xHA-tagged EWS-FLI1 or EWS-ZFA and accompanying H3K27ac levels in MSCs at the IGF2BP1 locus containing a GGAA repeat element and a canonical ETS binding site.
- FIG. 3G Example showing the binding of 3xHA-tagged EWS-FLI1 or EWS-ZFA and accompanying H3K27ac levels in MSC at the NIBAN3 and COLGALT1 loci containing a canonical ETS binding site.
- FIG. 3H Example showing the binding of 3xHA-tagged EWS-FLI1 or EWS-ZFA and accompanying H3K27ac levels in MSC at the NIBAN3 and COLGALT1 loci containing a canonical ETS binding site.
- FIGs. 4A-4G Efficient binding of EWS-ZFA at GGAA repeats in MSCs and comparison of changes in H3K27ac ChIP-seq signals in MSCs after treatment with EWS-FLI1 and EWS-ZFA.
- FIG. 4A GGAA repeat motifs identified at sites bound by EWS-ZFA from a second biological replicate experiment in MSCs (GAAGGAAGGAAGGAAGGAAGGAAG, SEQ ID NO: 135).
- FIG. 4C Bar plot showing the number of GGAA repeat microsatellites genome-wide based on the number of consecutive GGAA repeats.
- FIG. 4D Bar plots showing the fraction of GGAA repeats in the genome bound by EWS-ZFA (left) and EWS-FLI1 (right) upon lentiviral transduction in MSCs. The number of consecutive GGAA repeats in each category is shown on the x-axis. The data shown corresponds to the second of two biological replicate experiments.
- FIG. 4E Bar plot showing the number of GGAA repeat microsatellites genome-wide based on the number of consecutive GGAA repeats.
- FIG. 4F Protein expression levels of Control (Empty Vector), EWS-ZFA and EWS-FLI1 in MSCs. Protein levels were determined by immunoblotting using specific antibodies directed against HA. GAPDH was used as loading control.
- FIG. 4G Protein expression levels of Control (Empty Vector), EWS-ZFA and EWS-FLI1 in MSCs. Protein levels were determined by immunoblotting using specific antibodies directed against HA. GAPDH was used as loading control.
- FIGs. 5A-5H Binding of KRAB-ZFA to GGAA repeats induces selective toxicity in Ewing sarcoma cell lines by repressing target gene expression.
- FIG. 5B Composite plot showing EWS-FLI1 occupancy of GGAA repeats after introduction of KRAB-ZFA or GFP (control) in SKNMC. The x axis represents a 10-Kb window centered on 812 GGAA repeats.
- FIG. 5C The x axis represents a 10-Kb window centered on 812 GGAA repeats.
- FIG. 5D Example showing the binding of KRAB-ZFA (3xHA-tagged), endogenous EWS-FLI1, H3K9me3 and H3K27ac at a GGAA repeat element associated with the CCND1 locus, after treatment of SKNMC cells with KRAB-ZFA constructs. GFP was used as control.
- FIG. 5E Example showing the binding of KRAB-ZFA (3xHA-tagged), endogenous EWS-FLI1, H3K9me3 and H3K27ac at a GGAA repeat element associated with the CCND1 locus, after treatment of SKNMC cells with KRAB-ZFA constructs. GFP was used as control.
- FIG. 5F Example showing the binding of KRAB-ZFA (3xHA-tagged), H3K9me3 and H3K27ac at a GGAA repeat element associated with the CCND1 locus, after the treatment of HEK293T cells with KRAB-ZFA construct. GFP was used as control.
- FIG. 5G Example showing the binding of KRAB-ZFA (3xHA-tagged), H3K9me3 and H3K27ac at a GGAA repeat element associated with the CCND1 locus, after the treatment of HEK293T cells with KRAB-ZFA construct. GFP was used as control.
- FIG. 5H Viability of Ewing sarcoma and non-Ewing cell lines 8 days post lentiviral transduction of KRAB-ZFA and GFP (control). Open circles indicate two biological replicates with three technical replicates, error bars show the s.e.m.
- FIGs. 6A-6F Changes in EWS-FLI1 occupancy and chromatin states upon binding of KRAB-ZFA at GGAA repeats and ETS canonical binding sites in Ewing Sarcoma cell lines.
- FIG. 6B Composite plot showing decreased EWS-FLI1 occupancy at GGAA repeat enhancers after introduction of KRAB-ZFA in A673. GFP was used as control.
- the x-axis represents a 10-Kb window centered on 812 GGAA repeats.
- FIG. 6C The x-axis represents a 10-Kb window centered on 812 GGAA repeats.
- FIG. 6D Composite plots showing maintained EWS-FLI1 occupancy at canonical ETS binding sites after introduction of KRAB-ZFA in SKNMC and A673 cells. GFP was used as control.
- FIG. 6F Boxplots showing changes in FLI1 (EWS-FLI1) ChIP-seq signals upon lentiviral induction of KRAB-ZFA in SKNMC and A673 cells at GGAA repeat microsatellites (blue, n
- FIGs. 7A-7E KRAB-ZFAs can silence GGAA repeat-associated genes in Ewing Sarcoma cells but not in HEK293T.
- FIG. 7B The results from two biological replicates.
- FIG. 7D
- FIG. 7E Protein levels of KRAB-ZFA and EWS-FLI1 across all cell lines tested (Figure 3a) were determined by immunoblotting using specific antibodies directed against HA (KRAB-ZFA) and FLI1 (EWS-FLI1). GAPDH was used as loading control.
- Microsatellite repeats are a class of simple tandem repeats that previous studies have shown can be dysregulated in multiple disease states (Subramanian, Mishra, and Singh 2003; Malik et al. 2021; Trost et al. 2020; Usdin 2008).
- Large-scale epigenetic dysregulation of microsatellite repeats has been observed in Ewing sarcoma, a pediatric bone tumor where the EWS-FLI1 translocation fusion protein operates as a transcriptional pioneer factor (Delattre et al. 1992; Riggi et al. 2014). This fusion includes both the N-terminal transactivation domain of EWS and the C-terminal DNA binding domain of FLI1.
- EWS-FLI1 can bind to both non-repeat GGAA motifs and GGAA microsatellite repeats.
- binding of EWS-FLI1 to the hundreds of GGAA microsatellites present throughout the human genome converts them into transcriptional enhancers, thereby inducing a tumor-specific gene regulatory program (Gangwal et al. 2008; Guillon et al. 2009; Riggi et al. 2014; Boulay et al. 2017).
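- By way of non-limiting illustration, the distinction between isolated GGAA motifs and GGAA microsatellites can be sketched as a scan for runs of consecutive GGAA units in a DNA string. The function name `find_ggaa_runs`, the default cutoff of two consecutive units, and the choice to scan only the given strand are assumptions made for illustration and are not taken from the disclosure.

```python
import re


def find_ggaa_runs(sequence: str, min_repeats: int = 2) -> list[tuple[int, int]]:
    """Return (start, repeat_count) for each run of consecutive GGAA units.

    Assumptions: only the strand given is scanned, and `min_repeats`
    is an illustrative cutoff separating single motifs from repeats.
    """
    if min_repeats < 1:
        raise ValueError("min_repeats must be at least 1")
    pattern = re.compile(r"(?:GGAA){%d,}" % min_repeats)
    return [
        (match.start(), len(match.group()) // 4)
        for match in pattern.finditer(sequence.upper())
    ]


# Hypothetical test sequence: one isolated GGAA and one run of five units.
example = "TTGGAATTTT" + "GGAA" * 5 + "CC"
runs = find_ggaa_runs(example, min_repeats=2)
```

Lowering `min_repeats` to 1 also reports the isolated motif, which allows the same routine to tabulate sites by the number of consecutive GGAA repeats.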
- This example together with the dysregulated expression of other repeat classes in other tumor types (Ting et al. 2011; Burns 2017), illustrates how aberrant transcriptional programs in cancer and other diseases can be caused by the widespread activation of specific repeat categories and highlights the need for robust tools to conduct genome-wide studies and perturbation of these elements.
- Described herein are engineered ZFAs that can be used to efficiently target and alter the chromatin state of a class of microsatellite repeats in human cells.
- EWS-FLI1 GGAA microsatellite repeats bound by EWS-FLI1
- engineered EWS-ZFA fusion proteins targeted to these repeats can be over an order of magnitude more efficient than an EWS-dCas9-targeted fusion for activating a GGAA repeat previously shown to be converted into a de novo enhancer by EWS-FLI1.
- EWS-ZFA fusions can effectively phenocopy the pioneer function of EWS-FLI1 at GGAA microsatellites and recapitulate the GGAA repeat-dependent chromatin landscape and gene expression profiles of Ewing sarcoma.
- coupling of a GGAA repeat-targeted ZFA to a transcriptional repressor KRAB domain resulted in genome-wide silencing of GGAA microsatellites and cytotoxicity that was selective for Ewing sarcoma cells through the targeted inactivation of oncogenic gene expression programs.
- Our results validate the power and efficacy of engineered ZF technology for targeting and altering the functional state of microsatellite repeats and illustrate how this platform can be deployed to interrogate the function of microsatellite repetitive elements at genome-scale.
- exogenous nucleic acid sequence is a nucleic acid sequence that is not normally present in a cell, but can be introduced into a cell by one or more genetic, biochemical or other methods. Normal presence in the cell is determined with respect to the particular developmental stage and environmental conditions of the cell. Thus, for example, as used herein, an extrachromosomal DNA sequence that is introduced into the cell is an exogenous nucleic acid (even if part or all of that sequence is also present in the genome of the cell). Similarly, a nucleic acid sequence that is present only during embryonic development of muscle is an exogenous nucleic acid sequence with respect to an adult muscle cell.
- a nucleic acid sequence induced by heat shock is an exogenous molecule with respect to a non-heat-shocked cell.
- An exogenous nucleic acid sequence can comprise, for example, a functioning version of a malfunctioning endogenous gene.
- an “endogenous” nucleic acid sequence is one that is normally present in a particular cell at a particular developmental stage under particular environmental conditions.
- an endogenous nucleic acid can comprise a chromosome, the genome of a mitochondrion, chloroplast or other organelle, or a naturally occurring episomal nucleic acid.
- Nucleic acid refers to deoxyribonucleotides or ribonucleotides in either single- or double-stranded form.
- the term encompasses nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which can be synthetic, naturally occurring, or non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, and peptide-nucleic acids (PNAs).
- PNAs peptide-nucleic acids
- nucleic acid is used interchangeably with gene, cDNA, mRNA, oligonucleotide, and polynucleotide.
- a “gene,” for the purposes of the present disclosure, includes a DNA region encoding a gene product (see infra), as well as all DNA regions which regulate the production of the gene product, whether or not such regulatory sequences are adjacent to coding and/or transcribed sequences.
- a gene includes, but is not necessarily limited to, promoter sequences, terminators, translational regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites and locus control regions.
- the terms “polypeptide,” “peptide,” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues.
- the terms apply to amino acid polymers in which one or more amino acid residue is an analog or mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers.
- amino acid refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids.
- Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
- Amino acid analog refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α-carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, and methionine methyl sulfonium.
- Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
- Ranges provided herein are understood to be shorthand for all of the values within the range.
- a range of 1 to 50 is understood to include any number, combination of numbers, or sub-range from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 (as well as fractions thereof unless the context clearly dictates otherwise).
- compositions comprising a zinc finger DNA-binding domain that specifically binds to a target site in any gene comprising a tetra-nucleotide repeat, e.g., GGAA.
- zinc finger refers to a polypeptide comprising a DNA binding domain that is stabilized by zinc.
- the individual DNA binding domains are typically referred to as “fingers.”
- a zinc finger protein has at least one finger, preferably two fingers, three fingers, four fingers, five fingers, or six fingers.
- a zinc finger protein having two or more zinc fingers is referred to as a “multi-finger” or “multi-zinc finger” protein or “multi-finger array” or “zinc finger array.”
- Each finger typically comprises an approximately 30 amino acid, zinc-chelating, DNA-binding domain.
- An exemplary motif characterizing one class of these proteins is X(2)-Cys-X(2,4)-Cys-X(12)-His-X(3-5)-His (SEQ ID NO: 1), where X is any amino acid, which is known as the “C(2)H(2)” class.
- Zinc finger units are joined together by non-canonical (non-TGEKP) linkers such as TGSQKP (SEQ ID NO:2) or CGSQKP (SEQ ID NO:3).
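- By way of non-limiting illustration, the exemplary C(2)H(2) motif recited above can be expressed as a regular expression and used to scan a protein sequence for candidate fingers. In the sketch below, X is restricted to the 20 standard one-letter residues, and the function name `find_c2h2_motifs` and the 25-residue test string are illustrative assumptions that do not correspond to any SEQ ID NO of this disclosure.

```python
import re

# Any one of the 20 standard amino acids (assumed meaning of "X").
X = "[ACDEFGHIKLMNPQRSTVWY]"

# X(2)-Cys-X(2,4)-Cys-X(12)-His-X(3-5)-His
C2H2_PATTERN = re.compile(
    X + "{2}" + "C" + X + "{2,4}" + "C" + X + "{12}" + "H" + X + "{3,5}" + "H"
)


def find_c2h2_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (start, matched_text) for each non-overlapping motif match."""
    return [
        (match.start(), match.group())
        for match in C2H2_PATTERN.finditer(protein.upper())
    ]


# Illustrative finger-like string (an assumption, used only to exercise the pattern).
candidate = "YACPVESCDRRFSRSDELTRHIRIH"
hits = find_c2h2_motifs(candidate)
```

Because `finditer` reports non-overlapping matches, a multi-finger array would yield one hit per finger provided that the inter-finger linkers do not themselves complete an additional match.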
- a single zinc finger of this C(2)H(2) class consists of an alpha helix containing the two invariant histidine residues coordinated with zinc along with the two cysteine residues of a single beta turn (Berg and Shi, Science 271:1081-1085 (1996)).
- Each finger within a zinc finger array binds to about two to about five nucleotides within a DNA sequence.
- a zinc finger array that includes three fingers typically recognizes a target site that includes 9 or 10 nucleotides; a zinc finger array that includes four fingers typically recognizes a target site that includes 12 to 14 nucleotides; while a zinc finger array having six fingers can recognize target sites that include 18 to 21 nucleotides.
- the zinc finger protein/array is a non-naturally occurring protein, in that it is engineered to bind to a target site of choice.
- An engineered zinc finger binding domain can have a novel binding specificity, compared to a naturally occurring zinc finger.
- Engineering methods include, but are not limited to, rational design and various types of selection. Rational design includes, for example, using databases comprising tri-nucleotide sequences and individual zinc finger amino acid sequences, in which each tri-nucleotide sequence is associated with one or more amino acid sequences of zinc fingers which bind the particular tri-nucleotide sequence.
- Engineered zinc finger proteins are non-naturally occurring zinc finger proteins whose recognition helices have been altered (e.g., by selection and/or rational design) to bind to a pre-selected target site.
- Any of the zinc finger arrays described herein may include 1, 2, 3, 4, 5, 6 or more zinc fingers, each zinc finger having a recognition helix that binds to a target subsite in the selected sequence(s) (e.g., gene(s)).
- the recognition helix is non-naturally occurring.
- the zinc finger proteins have the recognition helices shown in FIG. 1A.
- the DNA binding domain is an engineered zinc finger array including four to six fingers that is capable of recognizing target sites of 12 to 18 nucleotides (e.g., a zinc finger array having 6 fingers that recognizes target sites of 18 nucleotides).
- Each zinc finger within the array is designed to target a trinucleotide sequence.
- each zinc finger is designed to recognize GGA, AGG, AAG, or GAA. Therefore, when the zinc finger array is appropriately assembled, the zinc finger array can recognize sequences such as GGAAGGAAGGAAGGAAGG (SEQ ID NO:4) or AAGGAAGGAAGGAAGGAA (SEQ ID NO: 5).
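- By way of non-limiting illustration, the relationship between a six-finger array and its 18-nucleotide target can be sketched by splitting the target sequence, read 5' to 3', into six trinucleotide subsites and confirming that each is one of GGA, AGG, AAG, or GAA. The function name `split_into_subsites` is hypothetical, and the sketch makes no statement about which finger of the array contacts which subsite.

```python
ALLOWED_TRIPLETS = {"GGA", "AGG", "AAG", "GAA"}

SEQ_ID_NO_4 = "GGAAGGAAGGAAGGAAGG"
SEQ_ID_NO_5 = "AAGGAAGGAAGGAAGGAA"


def split_into_subsites(target: str, finger_count: int = 6) -> list[str]:
    """Split a target site into one trinucleotide subsite per finger."""
    target = target.upper()
    if len(target) != 3 * finger_count:
        raise ValueError(
            f"expected {3 * finger_count} nucleotides, got {len(target)}"
        )
    triplets = [target[i:i + 3] for i in range(0, len(target), 3)]
    unexpected = [t for t in triplets if t not in ALLOWED_TRIPLETS]
    if unexpected:
        raise ValueError(f"subsites outside the GGAA repeat frame: {unexpected}")
    return triplets


subsites_4 = split_into_subsites(SEQ_ID_NO_4)
subsites_5 = split_into_subsites(SEQ_ID_NO_5)
# 18 nucleotides divided by the 4-nucleotide GGAA unit gives 4.5 repeats.
repeats_covered = len(SEQ_ID_NO_4) / 4
```

The two target sequences differ only in the register at which the trinucleotide frame begins within the GGAA repeat, which is why the first begins with GGA and the second with AAG.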
- FIG. 1A is a schematic of 16 different ZFAs, each engineered to bind ~4.5 GGAA microsatellites.
- the ZFAs each have six zinc fingers, and each finger recognizes three nucleotides.
- the target sequences of ZFAs 1 through 8 start with GGA, and ZFAs 9 through 16 with AAG.
- the amino acid compositions of recognition helices for each zinc finger are shown on the right side of FIG. 1A. Multiple zinc fingers with different recognition helices can in certain instances recognize the same nucleotides.
- Fusion proteins comprising DNA-binding proteins as described herein and a heterologous regulatory (functional) domain (or functional fragment thereof) are also provided.
- Common domains include, e.g., transcriptional repressors (e.g., KRAB, ERD, SID, TGF-β-inducible early gene (TIEG), v-erbA, MBD2, MBD3, Rb, MeCP2, ROM2, AtHD2A, and others, e.g., amino acids 473-530 of the ets2 repressor factor (ERF) repressor domain (ERD), amino acids 1-97 of the KRAB domain of KOX1, or amino acids 1-36 of the Mad mSIN3 interaction domain (SID); see Beerli et al., PNAS USA 95:14628-14633 (1998)) or silencers such as Heterochromatin Protein 1 (HP1, also known as swi6), e.g., HP1α or HP1β; proteins or peptides that could
- the fusion proteins include a linker between the zinc finger array and the heterologous functional domains. Domains could also be proteins that recruit (either directly or indirectly) other proteins in the cell that in turn can modulate gene expression.
- linkers that can be used in these fusion proteins (or between fusion proteins in a concatenated structure) can include any sequence that does not interfere with the function of the fusion proteins.
- the linkers are short, e.g., 2-20 amino acids, and are typically flexible (i.e., comprising amino acids with a high degree of freedom such as glycine, alanine, and serine).
- the linker comprises one or more units consisting of GGGS (SEQ ID NO:6) or GGGGS (SEQ ID NO:7), e.g., two, three, four, or more repeats of the GGGS (SEQ ID NO:6) or GGGGS (SEQ ID NO:7) unit.
- Other linker sequences can also be used.
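- By way of non-limiting illustration, a flexible linker built from repeated GGGS or GGGGS units can be assembled and checked against the 2-20 amino acid length range mentioned above. The function names `build_linker` and `is_short_linker` are hypothetical, and treating the 2-20 range as a hard limit is an assumption made only for this sketch.

```python
LINKER_UNITS = {"GGGS", "GGGGS"}  # SEQ ID NO:6 and SEQ ID NO:7


def build_linker(unit: str = "GGGGS", repeats: int = 3) -> str:
    """Concatenate `repeats` copies of a GGGS or GGGGS unit."""
    if unit not in LINKER_UNITS:
        raise ValueError(f"unit must be one of {sorted(LINKER_UNITS)}")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return unit * repeats


def is_short_linker(linker: str, min_len: int = 2, max_len: int = 20) -> bool:
    """True if the linker length falls within the short-linker range."""
    return min_len <= len(linker) <= max_len


two_unit = build_linker("GGGS", 2)
four_unit = build_linker("GGGGS", 4)
```

Four GGGGS units give exactly 20 residues, so a fifth unit would fall outside the short range used in this sketch.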
- Indirect fusions include one or more dimerization systems (e.g., heterodimer systems containing DmrA and DmrC) that mediate coupling of different domains (e.g., DNA-binding domains and gene expression modulating domains), for example, by addition of a drug that induces activation of the dimerization systems.
- the zinc finger fusion protein e.g., a zinc finger that targets GGAA repeats and a repressor domain
- the nucleic acid encoding the zinc finger fusion protein can be cloned into an intermediate vector for transformation into prokaryotic or eukaryotic cells for replication and/or expression.
- Intermediate vectors are typically prokaryote vectors, e.g., plasmids, or shuttle vectors, or insect vectors, for storage or manipulation of the nucleic acid encoding the zinc finger fusion protein for production of the zinc finger fusion protein.
- the nucleic acid encoding the zinc finger fusion protein can also be cloned into an expression vector, for administration to a plant cell, animal cell, preferably a mammalian cell or a human cell, fungal cell, bacterial cell, or protozoan cell.
- a sequence encoding a zinc finger fusion protein is typically subcloned into an expression vector that contains a promoter to direct transcription.
- Suitable bacterial and eukaryotic promoters are well known in the art and described, e.g., in Sambrook et al., Molecular Cloning, A Laboratory Manual (3d ed. 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al., eds., 2010).
- Bacterial expression systems for expressing the engineered protein are available in, e.g., E. coli, Bacillus sp., and Salmonella (Palva et al., 1983, Gene 22:229-235). Kits for such expression systems are commercially available.
- Eukaryotic expression systems for mammalian cells, yeast, and insect cells are well known in the art and are also commercially available.
- the promoter used to direct expression of a nucleic acid depends on the particular application. For example, a strong constitutive promoter is typically used for expression and purification of fusion proteins. In contrast, when the zinc finger fusion protein is to be administered in vivo for gene regulation, either a constitutive or an inducible promoter can be used, depending on the particular use of the zinc finger fusion protein. In addition, a preferred promoter for administration of the zinc finger fusion protein can be a weak promoter, such as HSV TK or a promoter having similar activity.
- the promoter can also include elements that are responsive to transactivation, e.g., hypoxia response elements, Gal4 response elements, lac repressor response element, and small molecule control systems such as tetracycline-regulated systems and the RU-486 system (see, e.g., Gossen & Bujard, 1992, Proc. Natl. Acad. Sci. USA, 89:5547; Oligino et al., 1998, Gene Ther., 5:491-496; Wang et al., 1997, Gene Ther., 4:432-441; Neering et al., 1996, Blood, 88:1147-55; and Rendahl et al., 1998, Nat. Biotechnol., 16:757-761).
- the expression vector typically contains a transcription unit or expression cassette that contains all the additional elements required for the expression of the nucleic acid in host cells, either prokaryotic or eukaryotic.
- a typical expression cassette thus contains a promoter operably linked, e.g., to the nucleic acid sequence encoding the zinc finger fusion protein, and any signals required, e.g., for efficient polyadenylation of the transcript, transcriptional termination, ribosome binding sites, or translation termination.
- Additional elements of the cassette may include, e.g., enhancers, and heterologous spliced intronic signals.
- the particular expression vector used to transport the genetic information into the cell is selected with regard to the intended use of the zinc finger fusion protein, e.g., expression in plants, animals, bacteria, fungus, protozoa, etc.
- Standard bacterial expression vectors include plasmids such as pBR322 based plasmids, pSKF, pET23D, and commercially available tag-fusion expression systems such as GST and LacZ.
- Expression vectors containing regulatory elements from eukaryotic viruses are often used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus vectors, and vectors derived from Epstein-Barr virus.
- eukaryotic vectors include pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV40 early promoter, SV40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown effective for expression in eukaryotic cells.
- the vectors for expressing the zinc finger fusion protein can include RNA Pol III promoters to drive expression of the guide RNAs, e.g., the H1, U6 or 7SK promoters. These human promoters allow for expression of zinc finger fusion proteins in mammalian cells following plasmid transfection.
- Some expression systems have markers for selection of stably transfected cell lines such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate reductase.
- High yield expression systems are also suitable, such as using a baculovirus vector in insect cells, with the gRNA encoding sequence under the direction of the polyhedrin promoter or other strong baculovirus promoters.
- the elements that are typically included in expression vectors also include a replicon that functions in E. coli, a gene encoding antibiotic resistance to permit selection of bacteria that harbor recombinant plasmids, and unique restriction sites in nonessential regions of the plasmid to allow insertion of recombinant sequences.
- Standard transfection methods are used to produce bacterial, mammalian, yeast or insect cell lines that express large quantities of protein, which are then purified using standard techniques (see, e.g., Colley et al., 1989, J. Biol. Chem., 264:17619-22; Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, 1977, J. Bacteriol.
- nucleic acids encoding the fusion proteins, as well as cells, tissues, and transgenic animals comprising the nucleic acids and optionally expressing the fusion proteins.
- Any nucleic acid construct capable of directing expression and/or which can transfer sequences to target cells can be used to administer the nucleic acid sequences described herein encoding either the exogenous nucleic acid sequence to be inserted within the target site or the zinc finger nuclease fusion proteins.
- Nucleic acid sequences described herein can be delivered to cells with vector delivery systems, including viral vector delivery systems comprising DNA and RNA viruses, which have either episomal or integrated genomes after delivery to the cell.
- vector refers to nucleic acid molecules, usually double-stranded DNA, which may have inserted into it another nucleic acid molecule, such as a sequence encoding a nuclease fusion protein.
- the vector is used to transport the inserted nucleic acid molecule into a suitable host cell.
- a vector may contain the necessary elements that permit transcribing the inserted nucleic acid molecule, and translating the transcript into a polypeptide.
- Once in the host cell, the vector may for instance replicate independently of, or coincidental with, the host chromosomal DNA, and several copies of the vector and its inserted nucleic acid molecule may be generated.
- vector may thus also be defined as a gene delivery vehicle that facilitates gene transfer into a target cell.
- This definition includes both non-viral and viral vectors.
- gene delivery systems can be used to combine viral and non-viral components, such as nanoparticles or virosomes (Yamada et al. (2003) Nat Biotechnol . 21, 885-890).
- Non-viral vectors include but are not limited to cationic lipids, liposomes, nanoparticles, PEG, PEI, etc.
- Viral vectors are derived from viruses including but not limited to: retrovirus, lentivirus, adeno-associated virus, adenovirus, herpesvirus, hepatitis virus or the like.
- viral vectors are replication-deficient as they have lost the ability to propagate in a given cell since viral genes essential for replication have been eliminated from the viral vector.
- RNA or DNA viral based systems for the delivery of nucleic acids take advantage of highly evolved processes for targeting a virus to specific cells in the body and trafficking the viral payload to the nucleus.
- Viral vectors can be derived from lentivirus, adeno-associated virus, adenovirus, retroviruses and lentiviruses.
- Conventional viral based systems for the delivery of nucleic acid sequences could include retroviral, lentiviral, adenoviral, adeno-associated, herpes simplex virus, and TMV-like viral vectors for gene transfer. Integration in the host genome is possible with the retrovirus, lentivirus, and adeno-associated virus gene transfer methods, often resulting in long term expression of the inserted transgene. Additionally, high transduction efficiencies have been observed in many different cell types and target tissues.
- Retroviruses and lentiviruses are RNA viruses that have the ability to insert their genes into host cell chromosomes after infection. Retroviral and lentiviral vectors have been developed that lack the genes encoding viral proteins, but retain the ability to infect cells and insert their genes into the chromosomes of the target cell (Miller (1990) Mol Cell Biol. 10, 4239-4242; Naldini et al. (1996) Science 272, 263-267; VandenDriessche et al., (1999) Proc Natl Acad Sci USA. 96, 10379-10384).
- lentiviral vectors can transduce both dividing and non-dividing cells whereas MLV-based retroviral vectors can only transduce dividing cells.
- Adenoviral vectors are designed to be administered directly to a living subject. Unlike retroviral vectors, most of the adenoviral vector genomes do not integrate into the chromosome of the host cell. Instead, genes introduced into cells using adenoviral vectors are maintained in the nucleus as an extrachromosomal element (episome) that persists for an extended period of time. Adenoviral vectors will transduce dividing and nondividing cells in many different tissues (Chuah et al. (2003) Blood. 101, 1734-1743). Another viral vector is derived from the herpes simplex virus, a large, double-stranded DNA virus. Recombinant forms of the vaccinia virus, another dsDNA virus, can accommodate large inserts and are generated by homologous recombination.
- Adeno-associated virus (AAV) is a small ssDNA virus which infects humans and some other primate species, not known to cause disease and consequently causing only a very mild immune response. AAV can infect both dividing and non-dividing cells and may incorporate its genome into that of the host cell. These features make AAV a very attractive candidate for creating viral vectors for gene therapy, although the cloning capacity of the vector is relatively limited. In a specific embodiment described herein, the vector used is therefore derived from adeno-associated virus.
- the zinc finger fusion proteins described herein can be delivered to cells by conventional protein transduction methods known in the art.
- one or more Nuclear Localization Signals (NLS) or protein transduction domains e.g., penetratin or transportan
- Such methods are described, for example by Liu, J. et al, Molecular Therapy-Nucleic Acids (2015) 4, e232 and Gaj, T. et al, ACS Chem. Biol. 2014, 9, 1662-1667.
- Cys2His2 zinc fingers themselves harbor intrinsic cell transduction properties. See, e.g., Gaj T, Guo J, Kato Y, Sirk SJ, Barbas CF 3rd. Nat Methods.
- the zinc finger fusion proteins include a cell-penetrating peptide sequence that facilitates delivery to the intracellular space, e.g., HIV-derived TAT peptide or hCT derived cell-penetrating peptides, see, e.g., Caron et al., (2001) Mol Ther. 3(3):310-8; Langel, Cell-Penetrating Peptides: Processes and Applications (CRC Press, Boca Raton FL 2002); El-Andaloussi et al., (2005) Curr Pharm Des. 11(28):3597-611; and Deshayes et al., (2005) Cell Mol Life Sci. 62(16):1839-49.
- Cell penetrating peptides (CPPs)
- cytoplasm or other organelles e.g. the mitochondria and the nucleus.
- molecules that can be delivered by CPPs include therapeutic drugs, plasmid DNA, oligonucleotides, siRNA, peptide-nucleic acid (PNA), proteins, peptides, nanoparticles, and liposomes.
- CPPs are generally 30 amino acids or less, are derived from naturally or non-naturally occurring protein or chimeric sequences, and contain either a high relative abundance of positively charged amino acids, e.g.
- CPPs that are commonly used in the art include Tat (Frankel et al., (1988) Cell. 55:1189-1193, Vives et al., (1997) J. Biol. Chem. 272:16010-16017), penetratin (Derossi et al., (1994) J. Biol. Chem. 269:10444-10450), polyarginine peptide sequences (Wender et al., (2000) Proc. Natl. Acad. Sci. USA 97:13003-13008, Futaki et al., (2001) J. Biol. Chem. 276:5836-5840), and transportan (Pooga et al., (1998) Nat. Biotechnol. 16:857-861).
- CPPs can be linked with their cargo through covalent or non-covalent strategies.
- Methods for covalently joining a CPP and its cargo are known in the art, e.g. chemical cross-linking (Stetsenko et al., (2000) J. Org. Chem. 65:4900-4909, Gait et al. (2003) Cell. Mol. Life. Sci. 60:844-853) or cloning a fusion protein (Nagahara et al., (1998) Nat. Med. 4:1449-1453).
- Non-covalent coupling between the cargo and short amphipathic CPPs comprising polar and non-polar domains is established through electrostatic and hydrophobic interactions.
- CPPs have been utilized in the art to deliver potentially therapeutic biomolecules into cells. Examples include cyclosporine linked to polyarginine for immunosuppression (Rothbard et al., (2000) Nature Medicine 6(11):1253-1257), siRNA against cyclin B1 linked to a CPP called MPG for inhibiting tumorigenesis (Crombez et al., (2007) Biochem Soc. Trans. 35:44-46), tumor suppressor p53 peptides linked to CPPs to reduce cancer cell growth (Takenobu et al., (2002) Mol. Cancer Ther. 1(12):1043-1049, Snyder et al., (2004) PLoS Biol.
- CPPs have been utilized in the art to transport contrast agents into cells for imaging and biosensing applications.
- green fluorescent protein (GFP) attached to Tat has been used to label cancer cells (Shokolenko et al., (2005) DNA Repair 4(4): 511-518).
- Tat conjugated to quantum dots has been used to successfully cross the blood-brain barrier for visualization of the rat brain (Santra et al., (2005) Chem. Commun. 3144-3146).
- CPPs have also been combined with magnetic resonance imaging techniques for cell imaging (Liu et al., (2006) Biochem. and Biophys. Res. Comm.
- zinc finger fusion proteins include a moiety that has a high affinity for a ligand, for example GST, FLAG or hexahistidine sequences (one or more hexahistidine sequences).
- affinity tags can facilitate the purification of recombinant zinc finger fusion proteins.
- the zinc finger fusion proteins do not include a NLS or hexahistidine sequence.
- compositions and kits comprising the zinc finger fusion protein described herein.
- the kits can also include one or more additional reagents, e.g., additional enzymes (such as RNA polymerases) and buffers, e.g., for use in a method described herein.
- compositions comprising the zinc finger fusion proteins described herein as an active ingredient.
- compositions typically include a pharmaceutically acceptable carrier.
- pharmaceutically acceptable carrier includes saline, solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.
- compositions are typically formulated to be compatible with their intended route of administration.
- routes of administration include intrathecal, intraperitoneal, intraocular, oral, intravenous, intradermal, subcutaneous, intratumoral injection, administration by a gel for slow release, or an infusion pump.
- solutions or suspensions used for administration to the eye, parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide.
- the parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.
- compositions suitable for injectable use can include sterile aqueous solutions (where water-soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion.
- suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, NJ) or phosphate buffered saline (PBS).
- the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi.
- the carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof.
- the proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants.
- Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like.
- isotonic agents for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition.
- Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent that delays absorption, for example, aluminum monostearate and gelatin.
- Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization.
- dispersions are prepared by incorporating the active compound into a sterile vehicle, which contains a basic dispersion medium and the required other ingredients from those enumerated above.
- the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.
- Oral compositions generally include an inert diluent or an edible carrier.
- the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules.
- Oral compositions can also be prepared using a fluid carrier for use as a mouthwash.
- Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition.
- the tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.
- the compounds can be delivered in the form of an aerosol spray from a pressured container or dispenser that contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.
- compositions may also be formulated to provide slow, controlled or sustained release of the active agent using, by way of example, hydroxypropyl methyl cellulose in varying proportions or other polymer matrices, liposomes and/or microspheres.
- the pharmaceutical compositions described herein may contain opacifying agents and may be formulated so that they release the active agent only, or preferentially, in a certain portion of the gastrointestinal tract, optionally, in a delayed manner.
- the active agent can also be in micro-encapsulated form, if appropriate, with one or more of the above-described excipients.
- the methods described herein include methods for the treatment of disorders associated with GGAA tandem repeats.
- the disorder is Ewing Sarcoma.
- the disorder is prostate cancer (see, e.g., Kedage et al., An Interaction with Ewing's Sarcoma Breakpoint Protein EWS Defines a Specific Oncogenic Mechanism of ETS Factors Rearranged in Prostate Cancer, Cell Reports 2016 Oct 25;17(5):1289-1301, where dysregulation of GGAA repeats in prostate cancer due to TMPRSS2-ERG fusions is described).
- the disorder is a tumor in which ETS factors have abnormal functions, which may involve dysregulation of GGAA repeats (including hematopoietic malignancies with high levels of FLI1).
- the methods include administering a therapeutically effective amount of the compositions comprising a zinc finger fusion protein as described herein, to a subject who is in need of, or who has been determined to be in need of, such treatment.
- “patient” or “subject” refers to members of the animal kingdom including but not limited to human beings and “mammal” refers to all mammals, including, but not limited to human beings.
- the terms “treat,” “treating,” “treatment,” and the like refer to reducing or ameliorating a disorder and/or symptoms associated therewith by any suitable dosage regimen, procedure and/or administration route of a composition, device or structure with the object of achieving a desirable clinical/medical end-point. It will be appreciated that, although not precluded, treating a disorder or condition does not require that the disorder, condition or symptoms associated therewith be completely eliminated. In specific embodiments, the terms “treat,” “treatment,” and “treating” refer to the amelioration of at least one measurable physical parameter of a proliferative disorder, such as growth of a tumor, not necessarily discernible by the patient.
- the terms “treat,” “treatment,” and “treating” refer to the inhibition of the progression of a proliferative disorder, either physically by, e.g., stabilization of a discernible symptom, physiologically by, e.g., stabilization of a physical parameter, or both. In other embodiments the terms “treat,” “treatment,” and “treating” refer to the reduction or stabilization of tumor size or cancerous cell count.
- Ewing's Sarcoma is a type of cancerous tumor that grows in the bones or the soft tissue around bones, such as cartilage or the nerves. It results from a translocation which fuses the EWS gene on chromosome 22 with the FLI1 gene on chromosome 11. The resultant fusion, EWS-FLI1, functions as a transcriptional activator.
- Treatment for Ewing sarcoma usually begins with chemotherapy. The drugs may shrink the tumor and make it easier to remove the cancer with surgery or target with radiation therapy. After surgery or radiation therapy, chemotherapy treatments might continue in order to kill any cancer cells that might remain.
- compositions described herein are administered to a subject in need thereof (e.g., intravenous (similar to other chemotherapy treatments currently used for Ewing’s Sarcoma), through infusion pump, or intratumoral injection) in a therapeutically sufficient amount to reduce tumor size or to kill tumor cells.
- the compositions described herein are administered in a therapeutically sufficient amount to reduce the aberrant gene expression driven by activation of GGAA-microsatellites in a cell, which results because of the activity of EWS-FLI1.
- compositions described herein can be used in combination with one or more other treatments that are typically used to treat Ewing’s Sarcoma.
- chemotherapy agents e.g., vincristine, doxorubicin, cyclophosphamide, ifosfamide, and etoposide
- radiation, surgery, or any combination thereof.
- engineered ZFAs can be used to efficiently target and alter the chromatin state of a class of microsatellite repeats in human cells.
- Primary bone marrow-derived MSCs were collected with approval from the Institutional Review Board of the Centre Hospitalier Universitaire Vaudois. Samples were de-identified prior to our analysis. MSCs were cultured in IMDM (Life Technologies) containing 10% fetal calf serum (FCS) and 10 ng/ml platelet-derived growth factor BB (PeproTech). U2OS were obtained from Toni Cathomen (Freiburg). All other cell lines were obtained from ATCC and media from Life Technologies. Ewing sarcoma cell lines SKNMC, A673, EW7 were grown in RPMI 1640 and CHP100 in McCoy’s 5a Medium.
- HEK293T, HeLa and U2OS were grown in DMEM and MRC5 in EMEM. All media were supplemented with 1% penicillin and streptomycin (Life Technologies). McCoy’s 5a medium was supplemented with 15% FBS and all other media were supplemented with 10% FBS. Cells were cultured at 37 °C with 5% CO2. Media supernatant was analyzed biweekly for the presence of Mycoplasma using MycoAlert™ PLUS (Lonza). Cell lines were authenticated by ATCC STR profiling.
- Each of the 16 different ZFAs that recognize ~4.5 GGAA tandem repeats was generated by assembling pre-selected 2-ZF units from an unpublished Joung lab archive. Although we used an unpublished archive of engineered zinc finger modules to provide the various 2-ZF units for constructing our ZFAs, there are other published public sources of zinc finger units as well as protocols that can be used to create customized zinc finger arrays (Sander et al. 2010; Fu et al. 2009; Wright et al. 2006; Sander et al. 2011; Maeder et al. 2008, 2009). The assembled ZFAs were inserted into the pENTR3C vector and EWS N-terminus (Riggi et al.
- dCas9-EWS (NP173) was constructed by cloning EWS into BPK1179 digested with XhoI and NotI by Gibson assembly, and EWS-dCas9 (YET3486) was constructed by cloning EWS into pSQT digested with AgeI and BstZ17I by Gibson assembly.
- DmrC-EWS was generated by inserting EWS into DmrC entry vector digested with NruI, using Gibson assembly. Sequences of gRNAs used in this study are provided in Table 2A.
- Lentivirus was produced in HEK293T LentiX cells (Clontech) by LT1 (Mirus Bio) transfection with gene delivery vector and packaging vectors GAG/POL and VSV plasmids (Boulay et al. 2017). Viral supernatants were collected 72 h after transfection and concentrated using the LentiX concentrator (Clontech). Virus-containing pellets were resuspended in PBS and added dropwise on cells in the presence of growth media supplemented with 6 ug/ml polybrene.
- Cells infected with lentivirus were selected using puromycin (Invivogen) at a concentration of 1 ug/ml for SKNMC, EW7, CHP100, HEK293T, HeLa and U2OS or 2 ug/ml for A673 and MRC5 in the growth medium. MSCs were selected with 0.75 ug/ml puromycin. Overexpression efficiency was determined by immunoblot analysis.
- Immunoblot analyses were performed using standard protocols (Boulay et al. 2017). Primary antibodies were used at the following concentrations: rat anti-HA (Roche, 1 ug/ml), rabbit anti-FLI1 (Abcam, 1 ug/ml), and mouse anti-GAPDH (Millipore, 0.1 ug/ml). Secondary antibodies were goat anti-rabbit, goat anti-rat, and goat anti-mouse IgG respectively conjugated with horseradish peroxidase (Bio-Rad, 1:10,000 dilution). Membranes were developed using Western Lightning Plus-ECL enhanced chemiluminescence substrate (PerkinElmer) and visualized using photographic film.
- qPCR was performed using Roche LightCycler480 with the following cycling protocols: initial denaturation at 95 °C for 20 seconds (s) followed by 45 cycles of 95 °C for 3 s and 60 °C for 30 s. Ct values over 35 were considered as 35. Relative quantification of each target, normalized to an endogenous control (GAPDH or HPRT1), was performed using the comparative Ct method (Applied Biosystems).
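The relative quantification step above can be sketched as follows. This is a minimal illustration of the comparative Ct (2^-ddCt) calculation with the Ct cap of 35 described in the text; the function names and the example Ct values are hypothetical, and the sketch assumes roughly 100% amplification efficiency for both target and endogenous control.

```python
CT_CAP = 35.0  # per the text: Ct values over 35 are treated as 35


def cap_ct(ct, cap=CT_CAP):
    """Clamp a raw Ct value at the cap."""
    return min(ct, cap)


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target in a sample versus a control condition.

    Each target Ct is normalized to the endogenous control (e.g. GAPDH)
    measured in the same condition, then the two conditions are compared:
    fold change = 2 ** -(dCt_sample - dCt_control).
    """
    d_sample = cap_ct(ct_target_sample) - cap_ct(ct_ref_sample)
    d_control = cap_ct(ct_target_control) - cap_ct(ct_ref_control)
    return 2.0 ** (-(d_sample - d_control))


if __name__ == "__main__":
    # Hypothetical values: the target is undetected in the control (Ct 38,
    # capped to 35) and detected at Ct 24 in the sample.
    print(relative_expression(24.0, 18.0, 38.0, 18.0))
```

Capping Ct at 35 keeps undetected targets from producing arbitrarily large fold changes, at the cost of making the reported fold change a lower bound in those cases.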
- Cells that were transduced with lentiviral KRAB-ZFA plasmid or GFP control plasmid were grown for 8 days and cell viability was measured using the CellTiter-Glo luminescent assay (Promega) as described by the manufacturer. Endpoint luminescence was measured on a SpectraMax M5 plate reader (Molecular Devices).
- ChIP assays of MSCs, SKNMC, A673 and HEK293T cells were carried out using 2-5 x 10^6 cells per sample and per epitope, following the procedures described previously (Mikkelsen et al. 2007).
- chromatin from formaldehyde-fixed cells was fragmented to 200-700 bp with a Branson 250 sonifier. Solubilized chromatin was immunoprecipitated overnight at 4 °C with 3 µg of target-specific antibodies (rat anti-HA (Roche), rabbit anti-FLI1 (Abcam), rabbit anti-H3K27ac (Active Motif), and rabbit anti-H3K9me3 (Abcam)).
- Antibody-chromatin complexes were pulled down with protein G-Dynabeads (Life Technologies), washed, and then eluted. After crosslink reversal, RNase A, and proteinase K treatment, immunoprecipitated DNA was extracted with AMPure beads (Beckman Coulter). ChIP DNA was quantified with Qubit. Sequencing libraries were prepared with 1-5 ng of ChIP DNA samples and input samples using the Ovation Ultralow System V2 kit (Nugen). Libraries were sequenced with single-end (SE) 50-75 cycles on an Illumina NextSeq 500.
- Reads were aligned to human reference genome hg19 using bwa (Li and Durbin 2009). Aligned reads were then filtered to exclude PCR duplicates and were extended to 200 bp to approximate fragment sizes. Density maps were generated by counting the number of fragments overlapping each position using igvtools, and normalized to 10 million reads. We used MACS2 (Zhang et al. 2008) to call peaks using matching input controls with a q-value threshold of 0.01. Peaks were filtered to exclude blacklisted regions as defined by the ENCODE consortium (ENCODE Project Consortium 2012). Peaks within 200 bp of each other were merged. Genome-wide GGAA microsatellite repeats were previously annotated (Boulay et al.
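Three of the post-alignment steps described above (extending reads to 200 bp, scaling coverage to 10 million reads, and merging peaks within 200 bp) can be sketched in plain Python. This is an illustrative sketch only, not the pipeline actually used (which relied on bwa, igvtools and MACS2); the function names are hypothetical, coordinates are assumed to be 0-based half-open, and "within 200 bp" is assumed to mean a gap of at most 200 bp.

```python
def extend_read(start, end, strand, length=200):
    """Extend an aligned read to an approximate fragment of `length` bp.

    Plus-strand reads grow rightwards from their start; minus-strand
    reads grow leftwards from their end (clipped at position 0).
    """
    if strand == "+":
        return (start, start + length)
    return (max(0, end - length), end)


def scale_to_10m(fragment_count, total_reads, target=10_000_000):
    """Normalize a per-position fragment count to a 10-million-read library."""
    return fragment_count * target / total_reads


def merge_peaks(peaks, max_gap=200):
    """Merge peaks on the same chromosome separated by at most `max_gap` bp.

    `peaks` is an iterable of (chrom, start, end) tuples.
    """
    merged = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


if __name__ == "__main__":
    # Hypothetical peak coordinates for illustration.
    example = [("chr1", 100, 300), ("chr1", 450, 600),
               ("chr1", 900, 1000), ("chr2", 100, 200)]
    print(merge_peaks(example))
```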
- RNA libraries were prepared from 500 ng of total RNA treated with Ribo-Zero Gold to remove ribosomal RNA, using TruSeq Stranded Total RNA Library Prep Gold kit (Illumina, 20020599) and TruSeq RNA Single Indexes.
- the RNA libraries were sequenced with PE 32 cycles on an Illumina Nextseq500 system.
- RNA samples were sent to Novogene Corporation for mRNA sequencing.
- RNA libraries were sequenced with PE150 cycles on an Illumina NovaSeq 6000 system. Reads were aligned to hg19 using STAR (Dobin et al. 2013).
- Mapped reads were filtered to exclude PCR duplicates and reads mapping to known ribosomal RNA coordinates, obtained from the rmsk table in the UCSC database (genome .
- Gene expression was calculated using featureCounts (Liao, Smyth, and Shi 2014). Only primary alignments with mapping quality of 10 or more were counted. Counts were then normalized to 1 million reads. Signal tracks were generated using bedtools (Quinlan et al. 2010). Differential expression was calculated using DESeq2 (Love, Huber, and Anders 2014).
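The normalization of counts to 1 million reads can be illustrated with a short sketch. The function name and the gene identifiers are hypothetical, and the sketch assumes the denominator is the sum of counted reads across genes; the text does not state whether total mapped reads or total counted reads was used.

```python
def counts_per_million(counts):
    """Scale raw per-gene read counts so that the library sums to 1 million.

    `counts` maps gene identifier -> raw read count (e.g. as produced by
    a read-counting tool). Assumes the library total is the sum of counts.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no counted reads")
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}


if __name__ == "__main__":
    # Hypothetical two-gene library for illustration.
    print(counts_per_million({"geneA": 250, "geneB": 750}))
```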
- Gene set overlaps were computed using Gene Set Enrichment Analysis (GSEA, gsea-msigdb.org/gsea/msigdb/annotate.jsp). Gene lists for GSEA analysis were selected using a log2 fold change of 0.6 for upregulated genes and -0.6 for downregulated genes. An adjusted p-value threshold of <0.1 was also applied. Gene lists were then analyzed for overlaps with C2 (curated gene sets) and BP (GO biological process), with a FDR q-value <0.05.
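The gene list selection described above can be sketched as a simple filter over differential expression results. The function name and gene identifiers are hypothetical. The sketch assumes the adjusted p-value cutoff is a strict "less than 0.1", that the fold-change cutoffs are inclusive, and that genes without an adjusted p-value are excluded; none of these details is spelled out in the text.

```python
def select_gene_lists(results, lfc_cutoff=0.6, padj_cutoff=0.1):
    """Split differential expression results into up and down gene lists.

    `results` maps gene identifier -> (log2 fold change, adjusted p-value).
    An adjusted p-value of None (not reported) excludes the gene.
    """
    up, down = [], []
    for gene, (lfc, padj) in results.items():
        if padj is None or padj >= padj_cutoff:
            continue
        if lfc >= lfc_cutoff:
            up.append(gene)
        elif lfc <= -lfc_cutoff:
            down.append(gene)
    return up, down


if __name__ == "__main__":
    # Hypothetical results table for illustration.
    example = {
        "g1": (1.2, 0.01),    # up, significant
        "g2": (-0.9, 0.05),   # down, significant
        "g3": (2.0, 0.5),     # large change, not significant
        "g4": (0.3, 0.001),   # significant, change too small
        "g5": (-1.5, None),   # no adjusted p-value reported
    }
    print(select_gene_lists(example))
```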
- Example 1.1 Engineering sequence-specific DNA-binding domains to target GGAA microsatellite repeats
- dCas9 programmed by guide RNAs have similarly been used to create gene regulatory proteins that function efficiently in human cells (Qi et al. 2013; Maeder et al. 2013; Perez-Pinera et al. 2013) and offer the substantial additional advantage of simple targetability by altering the gRNA sequence.
- TALE repeats have also been used to build customized DNA-binding domains, utilizing assembled arrays of four TALE repeat domains as “building blocks”, with each recognizing one of the four different DNA bases (Scholze and Boch 2011; Boch et al. 2009; Moscou and Bogdanove 2009).
- EWS-ZFA7 EWS-ZFA7 fusion
- dCas9-DmrA fusions harboring two, three or four DmrA domains could mediate modest activation of the UGT3A2 gene (mean fold-activation of 3.1, 3, or 1.1, respectively), levels much lower compared to the activation observed using the EWS-ZFA fusion (FIG. 1D).
- these same dCas9-based EWS constructs were effective at activating various other genes when using different gRNAs directed to non-repetitive target sites within the promoters of those genes in U2OS cells and HEK293 cells (FIGs. 2A-2B).
- Example 1.2 An EWS-ZFA fusion recapitulates genome-wide activation of microsatellite repeats observed in Ewing sarcoma
- EWS-ZFA could target and activate GGAA microsatellites genome-wide by comparing its activity to EWS-FLI1 in mesenchymal stem cells (MSCs).
- MSCs are a model for the cell of origin of Ewing sarcoma and EWS-FLI1 has previously been shown to operate as a pioneer factor at GGAA repeats in these cells to induce a chromatin landscape and gene expression pattern similar to that of tumor cells (Riggi et al. 2014).
- GGAA repeats as the dominant motif found in EWS-ZFA peaks (more than 80% of EWS-ZFA binding sites contained more than four consecutive GGAA units, FIG. 3A, FIG. 4A, note alternate motifs include: GGAAGGAAGGAAGGAAGGAAGGAA (SEQ ID NO: 136) and AAGGAAGGAAGGAAGGAAGGAAGG (SEQ ID NO: 137)).
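The criterion of "more than four consecutive GGAA units" can be made concrete with a small sequence scan. This is a hedged sketch, not the motif analysis used to generate the figures: the function names are hypothetical, and only the forward strand is scanned (a TTCC run on the reverse strand would need the reverse complement to be checked as well).

```python
import re


def longest_ggaa_run(seq, unit="GGAA"):
    """Return the largest number of consecutive `unit` copies found in `seq`."""
    pattern = re.compile("(?:%s)+" % re.escape(unit))
    best = 0
    for match in pattern.finditer(seq.upper()):
        best = max(best, len(match.group(0)) // len(unit))
    return best


def has_ggaa_microsatellite(seq, min_units=5):
    """True if `seq` holds more than four (i.e. at least five) consecutive GGAA units."""
    return longest_ggaa_run(seq) >= min_units


if __name__ == "__main__":
    seq_136 = "GGAAGGAAGGAAGGAAGGAAGGAA"  # SEQ ID NO: 136 from the text
    seq_137 = "AAGGAAGGAAGGAAGGAAGGAAGG"  # SEQ ID NO: 137 from the text
    print(longest_ggaa_run(seq_136), longest_ggaa_run(seq_137))
```

Both alternate motifs quoted in the text pass the filter; the second is the same repeat read in a shifted frame, so it contains one fewer complete GGAA unit within the 24 bases shown.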
- EWS-ZFA also bound nearly all of the GGAA repeats in the genome bound by EWS-FLI1 (FIG. 3B, FIG.
- Example 1.3 KRAB-ZFAs can selectively silence the microsatellite-driven Ewing Sarcoma gene expression program
- EWS-ZFA can efficiently target and activate GGAA microsatellites in MSCs
- a fusion of our engineered ZFA to a repressive KRAB domain might conversely silence active GGAA microsatellites bound by endogenous EWS-FLI1 in Ewing sarcoma cells, thereby inactivating its downstream oncogenic gene expression program.
- This approach offers the possibility to delineate the precise functional role of GGAA repeats in Ewing Sarcoma cells, in isolation from the non-repeat GGAA target sites of EWS-FLI1.
- KRAB-ZFA binding was also associated with striking changes in chromatin states and the induction of repressive marks with increased H3K9me3 and decreased H3K27Ac signals at GGAA microsatellites (FIGs. 5A, 5C - 5D, FIGs. 6A, 6C). As expected, these changes were observed uniquely at GGAA repeats and not at non-repeat GGAA EWS-FLI1 binding sites, confirming the specificity of KRAB-ZFA (FIGs. 6D-6F).
- HEK293T cells were largely devoid of active chromatin marks at GGAA repeats and that there were no major changes in H3K27Ac signals induced with KRAB-ZFA expression (FIG. 7C).
- GGAA repeats in HEK293T cells accumulated strong repressive H3K9me3 signals after expression of KRAB-ZFA in the same manner as Ewing sarcoma cells (FIGs. 5E - 5F).
- Unlike Ewing sarcoma cells, HEK293T cells transduced with KRAB-ZFA displayed minimal transcriptional changes, which only included a handful of genes with GGAA repeats located within their promoters (FIG. 7D, Table 6).
- engineered ZFAs are highly effective and specific tools for targeting widely distributed repetitive elements and altering their chromatin states.
- Engineered ZFAs have distinct advantages for this purpose given their high DNA binding affinities, small size, and similarities to endogenous transcription factors.
- Our findings further demonstrate that engineered ZFAs can greatly facilitate the functional assessment of the important but challenging-to-study repetitive elements of the human genome and may provide a strategy for therapeutically modifying the non-coding function of these repeats.
Abstract
Described herein are compositions comprising zinc fingers and methods of use thereof for the treatment of nucleotide repeat expansion disorders such as Ewing Sarcoma.
Description
Method for genome-wide functional perturbation of human microsatellites using engineered zinc fingers
CLAIM OF PRIORITY
This application claims priority under 35 USC §119(e) to U.S. Patent Application Serial No. 63/327,175, filed on April 4, 2022, the entire contents of which are hereby incorporated by reference.
FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
This invention was made with Government support under Grant Nos. GM118158, CA211707, CA204954, OD006862, GM105378, and CA231637 awarded by the National Institutes of Health. The Government has certain rights in the invention.
TECHNICAL FIELD
The present invention relates to compositions comprising zinc fingers and methods of use thereof for the treatment of nucleotide repeat expansion disorders such as Ewing Sarcoma.
BACKGROUND
Nucleotide repeat expansion disorders involve the localized expansion of unstable repeats of sets of three, four, five, or more nucleotides and can result in loss of function of the gene in which the repeat resides, a gain of toxic function, or both. Expanded repeat regions within non-coding sequences can lead to aberrant expression of the gene while expanded repeats within coding regions (also known as codon reiteration disorders) may cause mis-folding and protein aggregation. The exact cause of the pathophysiology associated with the aberrant proteins is often not known.
Ewing sarcoma is an aggressive pediatric malignancy that likely arises from neural crest- or mesoderm-derived mesenchymal stem cells (MSCs). It is driven by oncogenic fusions between EWS and genes in the ETS family (mostly FLI1). EWS-FLI1 binds DNA either at ETS-like consensus sites containing a GGAA core motif or, more specifically with respect to other ETS family members, at GGAA microsatellites, where the enhancer activity increases with the number of consecutive GGAA motifs.
The human genome contains thousands of GGAA-microsatellites. Ewing sarcoma is thus driven by the widespread activation of GGAA microsatellites, which illustrates the need for therapeutic agents that are able to perturb these elements.
SUMMARY
Repeat elements can be dysregulated at genome-wide scale in human diseases. For example, in Ewing sarcoma, hundreds of normally inert GGAA tandem repeats can be converted into de novo transcriptional enhancers when bound by the EWS-FLI1 oncogenic fusion protein. Here we show that fusions of GGAA repeat-targeted engineered zinc finger arrays (ZFAs) to the EWS domain can function at least as efficiently as EWS-FLI1 for converting hundreds of GGAA repeats into active enhancers in an Ewing sarcoma precursor cell model. Furthermore, a fusion of a KRAB repression domain to a GGAA repeat-targeted ZFA could silence GGAA microsatellite enhancers genome-wide in Ewing sarcoma cells, thereby reducing expression of EWS-FLI1-activated genes. Remarkably, this KRAB-ZFA fusion showed selective toxicity against Ewing sarcoma cell lines compared with other non-Ewing cancer cell lines, consistent with its Ewing sarcoma-specific impact on the transcriptome. These findings demonstrate the value of ZFAs for functional annotation of repeats and illustrate how aberrant microsatellite activities might be regulated for potential therapeutic applications.
In some embodiments, provided herein are engineered zinc finger arrays comprising 6 zinc finger recognition regions, wherein the zinc finger array binds a target sequence of GGAAGGAAGGAAGGAAGG (SEQ ID NO:4) or AAGGAAGGAAGGAAGGAA (SEQ ID NO: 5). In some embodiments, the engineered zinc finger array comprises at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to an amino acid sequence set forth in any one of SEQ ID NOs:24-39. In some embodiments, the engineered zinc finger array comprises the amino acid sequence set forth in SEQ ID NO: 30.
In some embodiments, provided herein are isolated cells comprising the zinc finger array according to any one of the aforementioned embodiments.
In some embodiments, provided herein are isolated nucleic acids encoding the zinc finger array according to any one of the aforementioned embodiments. In some embodiments, provided herein is a vector comprising the isolated nucleic acid described above.
In some embodiments, provided herein are fusion proteins comprising the zinc finger arrays according to any one of the aforementioned embodiments fused to a heterologous functional domain, with an optional intervening linker, wherein the linker does not interfere with activity of the fusion protein. In some embodiments, the heterologous functional domain is a transcriptional silencer or transcriptional repression domain. In some embodiments, the transcriptional repression domain is a Krueppel-associated box (KRAB) domain, ERF repressor domain (ERD), or mSin3A interaction domain (SID). In some embodiments, the transcriptional silencer is Heterochromatin Protein 1 (HP1).
In some embodiments, provided herein are isolated cells comprising the fusion protein according to any one of the aforementioned embodiments.
In some embodiments, provided herein are isolated nucleic acids encoding the fusion protein according to any one of the aforementioned embodiments. In some embodiments, provided herein is a vector comprising the isolated nucleic acid described above.
Also provided herein are methods of reducing aberrant gene expression driven by activation of GGAA-microsatellites in a cell, the method comprising contacting the cell with an effective amount of the fusion proteins as described above, or the isolated nucleic acid as described above. Also provided herein are methods of treating a subject who has a disease associated with aberrant gene expression driven by activation of GGAA-microsatellites in a cell, the method comprising administering to the subject an effective amount of a composition comprising the fusion proteins as described above, or the isolated nucleic acid as described above. In some embodiments of the methods described above, the subject has Ewing sarcoma. In some embodiments of the methods described above, the composition is administered by injection into or near a tumor, or by application after surgical resection. In some embodiments of the methods described above, the composition is administered by injection into or near a tumor, or by application before surgical resection. In some embodiments, the method of treating a subject further comprises treating a subject with one or more chemotherapy agents. In some embodiments, the chemotherapy is one of vincristine, doxorubicin, cyclophosphamide, ifosfamide, etoposide, or a combination thereof. In some embodiments, the composition is administered before radiation.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials are described herein for use in the present invention; other, suitable methods and materials known in the art can also be used. The materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control.
Other features and advantages of the invention will be apparent from the following detailed description and figures, and from the claims.
DESCRIPTION OF DRAWINGS
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
FIGs. 1A-1D. Engineering ZFAs to bind GGAA microsatellites in the human genome and efficient activation of a target gene by engineered ZFAs fused to EWS. FIG. 1A. Schematic of 16 ZFAs, each engineered to bind ~4.5 GGAA microsatellites. The ZFAs have six zinc fingers, and each finger recognizes three nucleotides. The target sequences of ZFA 1 through 8 start with GGA, and ZFA 9 through 16 with AAG. The amino acid compositions of recognition helices for each zinc finger are shown on the right. Multiple zinc fingers with different recognition helices can recognize the same nucleotides. FIG. 1B. Schematic of ZFAs fused to EWS activating UGT3A2 by binding to an 11-unit GGAA microsatellite located ~2 kb upstream of the TSS. EWS is fused to the N-terminus (left panel) or C-terminus (right panel) of ZFAs. FIG. 1C. 32 fusions of EWS and ZFAs that target GGAA repeats were tested for UGT3A2 gene activation in U2OS cells by nucleofection. EWS-ZFA7 closely mimicked the activation level of EWS-FLI1 and therefore was selected for further experiments. FIG. 1D. mRNA expression of UGT3A2 in U2OS cells nucleofected with EWS-ZFA7, EWS-dCas9, dCas9-EWS or dCas9-based bi-partite EWS activators (dCas9-DmrA and DmrC-EWS). The bi-partite system increases the density of EWS molecules recruited to a target site.
FIGs. 2A-2B. Gene activation by dCas9-based EWS activators targeting specific promoters in the human genome. FIG. 2A. mRNA expression levels of the endogenous IL2RA, CD69, HBB, and HBG promoters in the presence of EWS-dCas9, dCas9-EWS or dCas9-based bi-partite EWS with single gRNAs (1, 2, and 3) or pooled gRNAs (all) targeting promoter sequences in U2OS cells. Relative expression of each gene was measured by RT-qPCR, normalized to HPRT levels and calculated relative to that of a control sample expressing a non-targeting gRNA. Open circles indicate biological replicates (n=3), bars the mean of replicates, and error bars the s.e.m. FIG. 2B. mRNA expression levels of the endogenous IL2RA, CD69, HBB, and HBG promoters in the presence of dCas9-EWS or dCas9-based bi-partite EWS with single gRNAs (1, 2, and 3) or pooled gRNAs (all) targeting promoter sequences in HEK293 cells. Relative expression of each gene was measured by RT-qPCR, normalized to HPRT levels and calculated relative to that of a control sample expressing a non-targeting gRNA. Open circles indicate biological replicates (n=3), bars the mean of replicates, and error bars the s.e.m.
FIGs. 3A-3H. Efficient and specific binding of EWS-ZFA at GGAA repeats in MSCs induces active chromatin and activation of GGAA repeat associated genes. FIG. 3A. GGAA repeat motifs identified at sites bound by EWS-ZFA in MSCs (AGGAAGGAAGGAAGGAAGGAAGGA, SEQ ID NO: 134). FIG. 3B. Scatterplot showing binding of 3xHA-tagged EWS-FLI1 and EWS-ZFA to GGAA repeats genomewide (n=13029) in MSCs determined using HA ChIP-seq. ChIP-seq signals are in a log2 scale. The Spearman correlation coefficient is 0.68 with p-value < 2.2e-16. Data from one of two biological replicate experiments is shown. FIG. 3C. Bar plots showing the fraction of GGAA repeats in the genome bound by EWS-ZFA (left) and EWS-FLI1 (right) upon lentiviral transduction in MSCs. Data from one of two biological replicate experiments is shown. The number of consecutive GGAA repeats in each category is shown on the x-axis. FIG. 3D. Heatmaps showing HA and H3K27ac ChIP-seq signals in MSCs at EWS-FLI1 bound GGAA repeats identified in Ewing sarcoma (n=812) upon lentiviral transduction of either 3xHA-tagged EWS-FLI1 or EWS-ZFA. 3xHA-tagged GFP was used as control. 10-kb windows in each panel are centered on EWS-FLI1 binding sites in Ewing sarcoma. FIG. 3E. Example showing the binding of 3xHA-tagged EWS-FLI1 or EWS-ZFA and accompanying H3K27ac levels in MSC at the IGF2BP1 locus containing a GGAA repeat element and a canonical ETS binding site. FIG. 3F. Heatmaps showing HA and H3K27ac ChIP-seq signals in MSCs at EWS-FLI1 bound canonical ETS binding sites identified in Ewing sarcoma (n=973) upon lentiviral transduction of either 3xHA-tagged EWS-FLI1 or EWS-ZFA. GFP was used as control. 10-kb windows in each panel are centered on EWS-FLI1 binding sites in Ewing sarcoma. FIG. 3G. Example showing the binding of 3xHA-tagged EWS-FLI1 or EWS-ZFA and accompanying H3K27ac levels in MSC at the NIBAN3 and COLGALT1 loci containing a canonical ETS binding site. FIG. 3H. Heatmaps of log2 fold changes in expression of GGAA-repeat-associated genes (n=126) in MSCs treated with EWS-FLI1 or EWS-ZFA constructs compared to a GFP control, determined by RNA-seq. Two biological replicates are shown. Spearman correlation of log2 fold changes in EWS-FLI1 and EWS-ZFA is 0.58 (p-value < 2.22e-16).
FIGs. 4A-4G. Efficient binding of EWS-ZFA at GGAA repeats in MSCs and comparison of changes in H3K27ac ChIP-seq signals in MSCs after treatment with EWS-FLI1 and EWS-ZFA. FIG. 4A. GGAA repeat motifs identified at sites bound by EWS-ZFA from a second biological replicate experiment in MSCs (GAAGGAAGGAAGGAAGGAAGGAAG, SEQ ID NO: 135). FIG. 4B. Scatterplot showing binding of 3xHA-tagged EWS-FLI1 and EWS-ZFA to GGAA repeats genomewide (n=13092) in MSCs determined using HA ChIP-seq. The ChIP-seq signals are in a log2 scale. The Spearman correlation coefficient is 0.68 with p-value < 2.2e-16. The data shown corresponds to the second of two biological replicate experiments. FIG. 4C. Bar plot showing the number of GGAA repeat microsatellites genome-wide based on the number of consecutive GGAA repeats. FIG. 4D. Bar plots showing the fraction of GGAA repeats in the genome bound by EWS-ZFA (left) and EWS-FLI1 (right) upon lentiviral transduction in MSCs. The number of consecutive GGAA repeats in each category is shown on the x-axis. The data shown corresponds to the second of two biological replicate experiments. FIG. 4E. Boxplots show changes in H3K27ac ChIP-seq signals in MSCs expressing either EWS-FLI1 or EWS-ZFA at GGAA repeat microsatellites (a, n=812). The results of two biological replicate experiments are shown. FIG. 4F. Protein expression levels of Control (Empty Vector), EWS-ZFA and EWS-FLI1 in MSCs. Protein levels were determined by immunoblotting using specific antibodies directed against HA. GAPDH was used as loading control. FIG. 4G. Boxplots show changes in H3K27ac ChIP-seq signals in MSCs expressing either EWS-FLI1 or EWS-ZFA at canonical ETS-binding sites (b, n=973) bound by EWS-FLI1 in Ewing sarcoma. The results of two biological replicate experiments are shown.
FIGs. 5A-5H. Binding of KRAB-ZFA to GGAA repeats induces selective toxicity in Ewing sarcoma cell lines by repressing target gene expression. FIG. 5A. Heatmaps showing binding of 3xHA tagged KRAB-ZFA and H3K9me3 deposition at EWS-FLI1 bound GGAA repeats (n=812) in SKNMC cells as determined using ChIP-seq. FIG. 5B. Composite plot showing EWS-FLI1 occupancy of GGAA repeats after introduction of KRAB-ZFA or GFP (control) in SKNMC. The x axis represents a 10-kb window centered on 812 GGAA repeats. FIG. 5C. Histograms showing changes in H3K27ac at 812 EWS-FLI1 bound GGAA repeats upon treatment of SKNMC cells with KRAB-ZFA. FIG. 5D. Example showing the binding of KRAB-ZFA (3xHA-tagged), endogenous EWS-FLI1, H3K9me3 and H3K27ac at a GGAA repeat element associated with the CCND1 locus, after treatment of SKNMC cells with KRAB-ZFA constructs. GFP was used as control. FIG. 5E. Heatmaps showing binding of KRAB-ZFA (3xHA tagged) and H3K9me3 deposition in HEK293T cells at GGAA repeats bound by EWS-FLI1 in Ewing sarcoma (n=812) as determined using ChIP-seq. FIG. 5F. Example showing the binding of KRAB-ZFA (3xHA-tagged), H3K9me3 and H3K27ac at a GGAA repeat element associated with the CCND1 locus, after the treatment of HEK293T cells with KRAB-ZFA construct. GFP was used as control. FIG. 5G. Heatmaps showing expression (row-normalized counts) of GGAA repeat-associated genes (n=235), in SKNMC and HEK293T cells treated with KRAB-ZFA or GFP (control) determined by RNA-seq. Data are from two biological replicates. FIG. 5H. Viability of Ewing sarcoma and non-Ewing cell lines 8 days post lentiviral transduction of KRAB-ZFA and GFP (control). Open circles indicate two biological replicates with three technical replicates, error bars show the s.e.m.
FIGs. 6A-6F. Changes in EWS-FLI1 occupancy and chromatin states upon binding of KRAB-ZFA at GGAA repeats and ETS canonical binding sites in Ewing Sarcoma cell lines. FIG. 6A. Heatmaps showing HA (KRAB-ZFA) and H3K9me3 ChIP-seq signals at EWS-FLI1 bound GGAA repeats (n=812) in A673 cells upon lentiviral transduction. FIG. 6B. Composite plot showing decreased EWS-FLI1 occupancy at GGAA repeat enhancers after introduction of KRAB-ZFA in A673. GFP was used as control. The x-axis represents a 10-kb window centered on 812 GGAA repeats. FIG. 6C. Histogram showing changes in H3K27ac ChIP-seq signals at 812 EWS-FLI1 bound GGAA repeats upon treatment of A673 cells with KRAB-ZFA. FIG. 6D. Composite plots showing maintained EWS-FLI1 occupancy at canonical ETS binding sites after introduction of KRAB-ZFA in SKNMC and A673 cells. GFP was used as control. The x-axis represents a 10-kb window centered on EWS-FLI1-bound canonical ETS binding sites (n=973). FIG. 6E. Boxplots showing changes in FLI1 (EWS-FLI1) ChIP-seq signals upon lentiviral induction of KRAB-ZFA in SKNMC and A673 cells at GGAA repeat microsatellites (blue, n=812) and canonical ETS-binding sites (gray, n=973). FIG. 6F. Scatterplots showing changes in H3K9me3 and H3K27ac ChIP-seq signals at EWS-FLI1 binding sites (GGAA repeats (top, n=812) and canonical ETS binding sites (bottom, n=973)) upon lentiviral induction of KRAB-ZFA in Ewing sarcoma cell lines SKNMC and A673 as well as the control cell line HEK293T. Log2 fold changes are shown.
FIGs. 7A-7E. KRAB-ZFAs can silence GGAA repeat-associated genes in Ewing Sarcoma cells but not in HEK293T. FIG. 7A. Heatmap showing row-normalized expression levels of GGAA repeat-associated genes (n=235) in A673 and HEK293T cells treated with KRAB-ZFA or GFP (control) determined by RNA-seq. Data are from two biological replicates. FIG. 7B. Bar plot showing the number of genes up or downregulated by 1.5-fold (p-value < 0.1) upon treatment with KRAB-ZFA construct targeting GGAA repeat microsatellites, in Ewing sarcoma cell lines SKNMC and A673 as well as the control cell line HEK293T. FIG. 7C. Heatmaps showing the absence of activity and the lack of changes in H3K27ac ChIP-seq signals in HEK293T cells at EWS-FLI1 bound GGAA repeats in Ewing sarcoma (n=812) upon KRAB-ZFA lentiviral transduction. GFP was used as control. 10-kb windows in each panel are centered on EWS-FLI1 binding sites. FIG. 7D. Binding of KRAB-ZFA accompanied by chromatin changes (H3K9me3 and H3K27ac) at a GGAA repeat located within the promoter of BCL2L2, a gene downregulated by KRAB-ZFA in HEK293T cells. FIG. 7E. Protein levels of KRAB-ZFA and EWS-FLI1 across all cell lines tested (Figure 3a) were determined by immunoblotting using specific antibodies directed against HA (KRAB-ZFA) and FLI1 (EWS-FLI1). GAPDH was used as loading control.
DETAILED DESCRIPTION
Microsatellite repeats are a class of simple tandem repeats that previous studies have shown can be dysregulated in multiple disease states (Subramanian, Mishra, and Singh 2003; Malik et al. 2021; Trost et al. 2020; Usdin 2008). For example, large scale epigenetic dysregulation of microsatellite repeats has been observed in Ewing sarcoma, a pediatric bone tumor where the EWS-FLI1 translocation fusion protein operates as a transcriptional pioneer factor (Delattre et al. 1992; Riggi et al. 2014). This fusion includes both the N-terminal transactivation domain of EWS and the C-terminal DNA binding domain of FLI1. In contrast to FLI1, which stably binds to only non-repeat GGAA sites, EWS-FLI1 can bind to both non-repeat GGAA motifs and GGAA microsatellite repeats. Notably, binding of EWS-FLI1 to the hundreds of GGAA microsatellites present throughout the human genome converts them into transcriptional enhancers, thereby inducing a tumor-specific gene regulatory program (Gangwal et al. 2008; Guillon et al. 2009; Riggi et al. 2014; Boulay et al. 2017). This example, together with the dysregulated expression of other repeat classes in other tumor types (Ting et al. 2011; Burns 2017), illustrates how aberrant transcriptional programs in cancer and other diseases can be caused by the widespread activation of specific repeat categories and highlights the need for robust tools to conduct genome-wide studies and perturbation of these elements.
Described herein are engineered ZFAs that can be used to efficiently target and alter the chromatin state of a class of microsatellite repeats in human cells. Using GGAA microsatellite repeats bound by EWS-FLI1, we showed that engineered EWS-ZFA fusion proteins targeted to these repeats can be over an order of magnitude more efficient than an EWS-dCas9-targeted fusion for activating a GGAA repeat previously shown to be converted into a de novo enhancer by EWS-FLI1. In addition, EWS-ZFA fusions can effectively phenocopy the pioneer function of EWS-FLI1 at GGAA microsatellites and recapitulate the GGAA repeat-dependent chromatin landscape and gene expression profiles of Ewing sarcoma. Remarkably, coupling of a GGAA repeat-targeted ZFA to a transcriptional repressor KRAB domain resulted in genome-wide silencing of GGAA microsatellites and cytotoxicity that was selective for Ewing sarcoma cells through the targeted inactivation of oncogenic gene expression programs. Our results validate the power and efficacy of engineered ZF technology for targeting and altering the functional state of microsatellite repeats and illustrate how this platform can be deployed to interrogate the function of microsatellite repetitive elements at genome-scale.
Definitions
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. In case of conflict, the present application, including definitions, will control.
An “exogenous” nucleic acid sequence is a nucleic acid sequence that is not normally present in a cell, but can be introduced into a cell by one or more genetic, biochemical or other methods. Normal presence in the cell is determined with respect to the particular developmental stage and environmental conditions of the cell. Thus, for example, as used herein, an extrachromosomal DNA sequence that is introduced into the cell is an exogenous nucleic acid (even if part or all of that sequence is also present in the genome of the cell). Similarly, a nucleic acid sequence that is present only during embryonic development of muscle is an exogenous nucleic acid sequence with respect to an adult muscle cell. Alternatively, a nucleic acid sequence induced by heat shock is an exogenous molecule with respect to a non-heat-shocked cell. An exogenous nucleic acid sequence can comprise, for example, a functioning version of a malfunctioning endogenous gene. By contrast, an “endogenous” nucleic acid sequence is one that is normally present in a particular cell at a particular developmental stage under particular environmental conditions. For example, an endogenous nucleic acid can comprise a chromosome, the genome of a mitochondrion, chloroplast or other organelle, or a naturally-occurring episomal nucleic acid.
“Nucleic acid” refers to deoxyribonucleotides or ribonucleotides in either single- or double-stranded form. The term encompasses nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which can be synthetic, naturally occurring, or non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs). Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated. The term nucleic acid is used interchangeably with gene, cDNA, mRNA, oligonucleotide, and polynucleotide. A “gene,” for the purposes of the present disclosure, includes a DNA region encoding a gene product (see infra), as well as all DNA regions which regulate the production of the gene product, whether or not such regulatory sequences are adjacent to coding and/or transcribed sequences. Accordingly, a gene includes, but is not necessarily limited to, promoter sequences, terminators, translational regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites and locus control regions.
The terms “polypeptide,” “peptide” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an analog or mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers.
The term “amino acid” refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analog refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α-carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, and methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
Ranges provided herein are understood to be shorthand for all of the values within the range. For example, a range of 1 to 50 is understood to include any number, combination of numbers, or sub-range from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 (as well as fractions thereof unless the context clearly dictates otherwise).
In this disclosure, “comprises,” “comprising,” “containing” and “having” and the like can have the meaning ascribed to them in U.S. Patent law and can mean “includes,” “including,” and the like; “consisting essentially of” or “consists essentially” likewise has the meaning ascribed in U.S. Patent law and the term is open-ended, allowing for the presence of more than that which is recited so long as basic or novel characteristics of that which is recited are not changed by the presence of more than that which is recited, but excludes prior art embodiments.
Other definitions appear in context throughout this disclosure.
Compositions
Described herein are compositions comprising a zinc finger DNA-binding domain that specifically binds to a target site in any gene comprising a tetra-nucleotide repeat, e.g., GGAA.
As used herein, the term zinc finger refers to a polypeptide comprising a DNA binding domain that is stabilized by zinc. The individual DNA binding domains are typically referred to as “fingers.” A zinc finger protein has at least one finger, preferably two fingers, three fingers, four fingers, five fingers, or six fingers. A zinc finger protein having two or more zinc fingers is referred to as a “multi-finger” or “multi-zinc finger” protein or “multi-finger array” or “zinc finger array.” Each finger typically comprises an approximately 30 amino acid, zinc-chelating, DNA-binding domain. An exemplary motif characterizing one class of these proteins is X(2)-Cys-X(2,4)-Cys-X(12)-His-X(3-5)-His (SEQ ID NO: 1), where X is any amino acid, which is known as the “C(2)H(2)” class. Zinc finger units are joined together by non-canonical (non-TGEKP) linkers such as TGSQKP (SEQ ID NO:2) or CGSQKP (SEQ ID NO:3). Studies have demonstrated that a single zinc finger of this C(2)H(2) class consists of an alpha helix containing the two invariant histidine residues coordinated with zinc along with the two cysteine residues of a single beta turn (Berg and Shi, Science 271:1081-1085 (1996)). Each finger within a zinc finger array binds to about two to about five nucleotides within a DNA sequence. A zinc finger array that includes three fingers typically recognizes a target site that includes 9 or 10 nucleotides; a zinc finger array that includes four fingers typically recognizes a target site that includes 12 to 14 nucleotides; while a zinc finger array having six fingers can recognize target sites that include 18 to 21 nucleotides.
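The C(2)H(2) motif above can be expressed as a regular expression for scanning a protein sequence. The following is a minimal sketch, not part of the disclosed methods; it assumes one-letter uppercase amino acid codes, reads X(2,4) and X(3-5) as "2 to 4" and "3 to 5" residues, and uses a hypothetical function name and an illustrative example peptide chosen only because it fits the pattern.

```python
import re

# X(2)-Cys-X(2,4)-Cys-X(12)-His-X(3-5)-His, with X = any residue.
# "." stands in for X; this assumes the input holds only one-letter codes.
C2H2_PATTERN = re.compile(r".{2}C.{2,4}C.{12}H.{3,5}H")


def find_c2h2_motifs(protein):
    """Return (start_index, matched_text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(protein)]


# Illustrative peptide (an assumption for demonstration, not a sequence
# taken from this disclosure): 2 residues, Cys, 4 residues, Cys,
# 12 residues, His, 3 residues, His.
EXAMPLE_PEPTIDE = "YACPVESCDRRFSRSDELTRHIRIH"

if __name__ == "__main__":
    for start, hit in find_c2h2_motifs(EXAMPLE_PEPTIDE):
        print(start, hit)
```

A pattern match only indicates that the spacing of the cysteine and histidine residues fits the motif; it says nothing about DNA-binding specificity, which depends on the recognition helix.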
In some embodiments the zinc finger protein/array is a non-naturally occurring protein, in that it is engineered to bind to a target site of choice. An engineered zinc finger binding domain can have a novel binding specificity, compared to a naturally-occurring zinc finger. Engineering methods include, but are not limited to, rational design and various types of selection. Rational design includes, for example, using databases comprising tri-nucleotide sequences and individual zinc finger amino acid sequences, in which each tri-nucleotide sequence is associated with one or more amino acid sequences of zinc fingers which bind the particular tri-nucleotide sequence.
Engineered zinc finger proteins are non-naturally occurring zinc finger proteins whose recognition helices have been altered (e.g., by selection and/or rational design) to bind to a pre-selected target site. Any of the zinc finger arrays described herein may include 1, 2, 3, 4, 5, 6 or more zinc fingers, each zinc finger having a recognition helix that binds to a target subsite in the selected sequence(s) (e.g., gene(s)). In some embodiments, the recognition helix is non-naturally occurring. In certain embodiments, the zinc finger proteins have the recognition helices shown in FIG. 1A.
In certain embodiments, the DNA binding domain is an engineered zinc finger array including four to six fingers that is capable of recognizing target sites of 12 to 18 nucleotides (e.g., a zinc finger array having 6 fingers that recognizes target sites of 18 nucleotides). Each zinc finger within the array is designed to target a trinucleotide sequence. For example, each zinc finger is designed to recognize GGA, AGG, AAG, or GAA. Therefore, when the zinc finger array is appropriately assembled, the zinc finger array can recognize sequences such as GGAAGGAAGGAAGGAAGG (SEQ ID NO:4) or AAGGAAGGAAGGAAGGAA (SEQ ID NO: 5).
See also, FIG. 1A, which is a schematic of 16 different ZFAs, each engineered to bind ~4.5 GGAA microsatellites. The ZFAs each have six zinc fingers, and each finger recognizes three nucleotides. The target sequences of ZFAs 1 through 8 start with GGA, and ZFAs 9 through 16 with AAG. The amino acid compositions of recognition helices for each zinc finger are shown on the right side of FIG. 1A. Multiple zinc fingers with different recognition helices can in certain instances recognize the same nucleotides.
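The relationship between the 18-nucleotide target sites, the trinucleotide subsites, and the "~4.5 GGAA" figure can be checked with a short script. This is an illustrative sketch only; the function names are hypothetical, and the triplets are listed in the 5'-to-3' order of the target sequence as written, without assigning them to particular fingers of the array.

```python
SEQ_ID_NO_4 = "GGAAGGAAGGAAGGAAGG"
SEQ_ID_NO_5 = "AAGGAAGGAAGGAAGGAA"

# Trinucleotides named in the text as individual finger targets.
ALLOWED_TRIPLETS = {"GGA", "AGG", "AAG", "GAA"}


def split_into_triplets(target):
    """Split a target site into consecutive 3-nt subsites (one per finger)."""
    if len(target) % 3 != 0:
        raise ValueError("target length must be a multiple of 3")
    return [target[i:i + 3] for i in range(0, len(target), 3)]


def is_ggaa_repeat(target):
    """True if the target is a window of a perfect GGAA tandem repeat."""
    return target in "GGAA" * (len(target) // 4 + 2)


def repeat_units_covered(target):
    """Number of 4-nt GGAA units spanned by the target (may be fractional)."""
    return len(target) / 4


if __name__ == "__main__":
    for name, seq in (("SEQ ID NO:4", SEQ_ID_NO_4), ("SEQ ID NO:5", SEQ_ID_NO_5)):
        triplets = split_into_triplets(seq)
        print(name, triplets, repeat_units_covered(seq), is_ggaa_repeat(seq))
```

Six fingers at three nucleotides each give 18 nucleotides, which spans 18 / 4 = 4.5 GGAA units, consistent with the description of FIG. 1A.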
Fusion Proteins
Fusion proteins comprising DNA-binding proteins as described herein and a heterologous regulatory (functional) domain (or functional fragment thereof) are also provided. Common domains include, e.g., transcriptional repressors (e.g., KRAB, ERD, SID, TGF-β-inducible early gene (TIEG), v-erbA, MBD2, MBD3, Rb, MeCP2, ROM2, AtHD2A, and others, e.g., amino acids 473-530 of the ets2 repressor factor (ERF) repressor domain (ERD), amino acids 1-97 of the KRAB domain of KOX1, or amino acids 1-36 of the Mad mSIN3 interaction domain (SID); see Beerli et al., PNAS USA 95:14628-14633 (1998)) or silencers such as Heterochromatin Protein 1 (HP1, also known as swi6), e.g., HP1α or HP1β; proteins or peptides that could recruit long noncoding RNAs (lncRNAs) fused to a fixed RNA binding sequence such as those bound by the MS2 coat protein, endoribonuclease Csy4, or the lambda N protein; enzymes that modify the methylation state of DNA (e.g., DNA methyltransferase (DNMT) or TET proteins); enzymes that modify histone subunits (e.g., histone acetyltransferases (HAT), histone deacetylases (HDAC), histone methyltransferases (e.g., for methylation of lysine or arginine residues) or histone demethylases (e.g., for demethylation of lysine or arginine residues)); and transcriptional activators (e.g., activation domains of NF-κB (e.g., p65), VP64, VPR, or p300).
In some embodiments, the fusion proteins include a linker between the zinc finger array and the heterologous functional domains. Domains could also be proteins that
recruit (either directly or indirectly) other proteins in the cell that in turn can modulate gene expression. For direct fusions, linkers that can be used in these fusion proteins (or between fusion proteins in a concatenated structure) can include any sequence that does not interfere with the function of the fusion proteins. In preferred embodiments, the linkers are short, e.g., 2-20 amino acids, and are typically flexible (i.e., comprising amino acids with a high degree of freedom such as glycine, alanine, and serine). In some embodiments, the linker comprises one or more units consisting of GGGS (SEQ ID NO:6) or GGGGS (SEQ ID NO:7), e.g., two, three, four, or more repeats of the GGGS (SEQ ID NO:6) or GGGGS (SEQ ID NO:7) unit. Other linker sequences can also be used. Indirect fusions include one or more dimerization systems (e.g., heterodimer systems containing DmrA and DmrC) that mediate coupling of different domains (e.g., DNA-binding domains and gene expression modulating domains), for example, by addition of a drug that induces activation of the dimerization systems.
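As an illustration of the repeat-unit linkers described above, the following minimal sketch builds a linker from GGGS or GGGGS units and checks it against the preferred 2-20 amino acid length range. The function names are hypothetical and are used only for illustration.

```python
# Linker repeat units named in the text (SEQ ID NO:6 and SEQ ID NO:7).
LINKER_UNITS = {"GGGS", "GGGGS"}


def build_linker(unit: str, repeats: int) -> str:
    """Concatenate `repeats` copies of a GGGS or GGGGS unit."""
    if unit not in LINKER_UNITS:
        raise ValueError(f"unsupported linker unit: {unit}")
    if repeats < 1:
        raise ValueError("at least one repeat is required")
    return unit * repeats


def is_preferred_length(linker: str, shortest: int = 2, longest: int = 20) -> bool:
    """True if the linker falls within the preferred 2-20 amino acid range."""
    return shortest <= len(linker) <= longest


linker = build_linker("GGGGS", 4)
print(linker, len(linker), is_preferred_length(linker))
```

Four GGGGS repeats give a 20-residue linker, at the upper end of the preferred range; a fifth repeat would exceed it, although longer linkers are not excluded by the text.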
Delivery and Expression Systems
To use the zinc finger fusion protein (e.g., a zinc finger that targets GGAA repeats and a repressor domain) described herein, it may be desirable to express it from a nucleic acid that encodes it. This can be performed in a variety of ways. For example, the nucleic acid encoding the zinc finger fusion protein can be cloned into an intermediate vector for transformation into prokaryotic or eukaryotic cells for replication and/or expression. Intermediate vectors are typically prokaryote vectors, e.g., plasmids, or shuttle vectors, or insect vectors, for storage or manipulation of the nucleic acid encoding the zinc finger fusion protein for production of the zinc finger fusion protein. The nucleic acid encoding the zinc finger fusion protein can also be cloned into an expression vector, for administration to a plant cell, an animal cell (preferably a mammalian cell or a human cell), a fungal cell, a bacterial cell, or a protozoan cell.
To obtain expression, a sequence encoding a zinc finger fusion protein is typically subcloned into an expression vector that contains a promoter to direct transcription. Suitable bacterial and eukaryotic promoters are well known in the art and described, e.g., in Sambrook et al., Molecular Cloning, A Laboratory Manual (3d ed. 2001); Kriegler,
Gene Transfer and Expression: A Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al., eds., 2010). Bacterial expression systems for expressing the engineered protein are available in, e.g., E. coli, Bacillus sp., and Salmonella (Palva et al., 1983, Gene 22:229-235). Kits for such expression systems are commercially available. Eukaryotic expression systems for mammalian cells, yeast, and insect cells are well known in the art and are also commercially available.
The promoter used to direct expression of a nucleic acid depends on the particular application. For example, a strong constitutive promoter is typically used for expression and purification of fusion proteins. In contrast, when the zinc finger fusion protein is to be administered in vivo for gene regulation, either a constitutive or an inducible promoter can be used, depending on the particular use of the zinc finger fusion protein. In addition, a preferred promoter for administration of the zinc finger fusion protein can be a weak promoter, such as HSV TK or a promoter having similar activity. The promoter can also include elements that are responsive to transactivation, e.g., hypoxia response elements, Gal4 response elements, lac repressor response elements, and small molecule control systems such as tetracycline-regulated systems and the RU-486 system (see, e.g., Gossen & Bujard, 1992, Proc. Natl. Acad. Sci. USA, 89:5547; Oligino et al., 1998, Gene Ther., 5:491-496; Wang et al., 1997, Gene Ther., 4:432-441; Neering et al., 1996, Blood, 88:1147-55; and Rendahl et al., 1998, Nat. Biotechnol., 16:757-761).
In addition to the promoter, the expression vector typically contains a transcription unit or expression cassette that contains all the additional elements required for the expression of the nucleic acid in host cells, either prokaryotic or eukaryotic. A typical expression cassette thus contains a promoter operably linked, e.g., to the nucleic acid sequence encoding the zinc finger fusion protein, and any signals required, e.g., for efficient polyadenylation of the transcript, transcriptional termination, ribosome binding sites, or translation termination. Additional elements of the cassette may include, e.g., enhancers, and heterologous spliced intronic signals.
The particular expression vector used to transport the genetic information into the cell is selected with regard to the intended use of the zinc finger fusion protein, e.g., expression in plants, animals, bacteria, fungus, protozoa, etc. Standard bacterial
expression vectors include plasmids such as pBR322 based plasmids, pSKF, pET23D, and commercially available tag-fusion expression systems such as GST and LacZ.
Expression vectors containing regulatory elements from eukaryotic viruses are often used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic vectors include pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV40 early promoter, SV40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown effective for expression in eukaryotic cells.
The vectors for expressing the zinc finger fusion protein can include RNA Pol III promoters to drive expression of the guide RNAs, e.g., the H1, U6 or 7SK promoters. These human promoters allow for expression of zinc finger fusion proteins in mammalian cells following plasmid transfection.
Some expression systems have markers for selection of stably transfected cell lines such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate reductase. High yield expression systems are also suitable, such as using a baculovirus vector in insect cells, with the gRNA encoding sequence under the direction of the polyhedrin promoter or other strong baculovirus promoters.
The elements that are typically included in expression vectors also include a replicon that functions in E. coli, a gene encoding antibiotic resistance to permit selection of bacteria that harbor recombinant plasmids, and unique restriction sites in nonessential regions of the plasmid to allow insertion of recombinant sequences.
Standard transfection methods are used to produce bacterial, mammalian, yeast or insect cell lines that express large quantities of protein, which are then purified using standard techniques (see, e.g., Colley et al., 1989, J. Biol. Chem., 264:17619-22; Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, 1977, J. Bacteriol. 132:349-351; Clark-Curtiss & Curtiss, Methods in Enzymology 101:347-362 (Wu et al., eds, 1983)).
Any of the known procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, nucleofection, liposomes, microinjection, naked DNA, plasmid vectors, viral vectors, both episomal and integrative, and any of the other well-known methods for introducing cloned genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g., Sambrook et al., supra). It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into the host cell capable of expressing the zinc finger fusion protein.
Also provided herein are nucleic acids encoding the fusion proteins, as well as cells, tissues, and transgenic animals comprising the nucleic acids and optionally expressing the fusion proteins. Any nucleic acid construct capable of directing expression and/or which can transfer sequences to target cells can be used to administer the nucleic acid sequences described herein encoding either the exogenous nucleic acid sequence to be inserted within the target site or the zinc finger nuclease fusion proteins. Nucleic acid sequences described herein can be delivered to cells with vector delivery systems, including viral vector delivery systems comprising DNA and RNA viruses, which have either episomal or integrated genomes after delivery to the cell.
The term “vector” as used herein refers to nucleic acid molecules, usually double-stranded DNA, which may have inserted into it another nucleic acid molecule, such as a sequence encoding a nuclease fusion protein. The vector is used to transport the inserted nucleic acid molecule into a suitable host cell. A vector may contain the necessary elements that permit transcribing the inserted nucleic acid molecule, and translating the transcript into a polypeptide. Once in the host cell, the vector may for instance replicate independently of, or coincidental with, the host chromosomal DNA, and several copies of the vector and its inserted nucleic acid molecule may be generated. The term “vector” may thus also be defined as a gene delivery vehicle that facilitates gene transfer into a target cell. This definition includes both non-viral and viral vectors. Alternatively, gene delivery systems can be used to combine viral and non-viral components, such as nanoparticles or virosomes (Yamada et al. (2003) Nat Biotechnol. 21, 885-890). Non-viral vectors include but are not limited to cationic lipids, liposomes, nanoparticles, PEG,
PEI, etc. Viral vectors are derived from viruses including but not limited to: retrovirus, lentivirus, adeno-associated virus, adenovirus, herpesvirus, hepatitis virus or the like. Typically, but not necessarily, viral vectors are replication-deficient as they have lost the ability to propagate in a given cell since viral genes essential for replication have been eliminated from the viral vector.
The use of RNA or DNA viral based systems for the delivery of nucleic acids takes advantage of highly evolved processes for targeting a virus to specific cells in the body and trafficking the viral payload to the nucleus. Viral vectors can be derived from adeno-associated virus, adenovirus, retroviruses, and lentiviruses. Conventional viral based systems for the delivery of nucleic acid sequences could include retroviral, lentiviral, adenoviral, adeno-associated, herpes simplex virus, and TMV-like viral vectors for gene transfer. Integration in the host genome is possible with the retrovirus, lentivirus, and adeno-associated virus gene transfer methods, often resulting in long term expression of the inserted transgene. Additionally, high transduction efficiencies have been observed in many different cell types and target tissues.
Retroviruses and lentiviruses are RNA viruses that have the ability to insert their genes into host cell chromosomes after infection. Retroviral and lentiviral vectors have been developed that lack the genes encoding viral proteins, but retain the ability to infect cells and insert their genes into the chromosomes of the target cell (Miller (1990) Mol Cell Biol. 10, 4239-4242; Naldini et al. (1996) Science 272, 263-267; VandenDriessche et al., (1999) Proc Natl Acad Sci USA. 96, 10379-10384). The difference between a lentiviral and a classical Moloney-murine leukemia-virus (MLV) based retroviral vector is that lentiviral vectors can transduce both dividing and non-dividing cells whereas MLV-based retroviral vectors can only transduce dividing cells.
Adenoviral vectors are designed to be administered directly to a living subject. Unlike retroviral vectors, most of the adenoviral vector genomes do not integrate into the chromosome of the host cell. Instead, genes introduced into cells using adenoviral vectors are maintained in the nucleus as an extrachromosomal element (episome) that persists for an extended period of time. Adenoviral vectors will transduce dividing and nondividing cells in many different tissues (Chuah et al. (2003) Blood. 101, 1734-1743). Another viral vector is derived from the herpes simplex virus, a large, double-stranded DNA virus.
Recombinant forms of the vaccinia virus, another dsDNA virus, can accommodate large inserts and are generated by homologous recombination.
Adeno-associated virus (AAV) is a small ssDNA virus that infects humans and some other primate species. It is not known to cause disease and consequently elicits only a very mild immune response. AAV can infect both dividing and non-dividing cells and may incorporate its genome into that of the host cell. These features make AAV a very attractive candidate for creating viral vectors for gene therapy, although the cloning capacity of the vector is relatively limited. In a specific embodiment described herein, the vector used is therefore derived from adeno-associated virus.
The zinc finger fusion proteins described herein can be delivered to cells by conventional protein transduction methods known in the art. In specific embodiments, one or more Nuclear Localization Signals (NLS) or protein transduction domains (e.g., penetratin or transportan) can be optionally added to the fusion protein. Such methods are described, for example, by Liu, J. et al., Molecular Therapy-Nucleic Acids (2015) 4, e232 and Gaj, T. et al., ACS Chem. Biol. 2014, 9, 1662-1667. In some instances, Cys2His2 zinc fingers themselves harbor intrinsic cell transduction properties. See, e.g., Gaj T, Guo J, Kato Y, Sirk SJ, Barbas CF 3rd. Nat Methods. 2012 Jul 1;9(8):805-7; Gaj T, Liu J, Anderson KE, Sirk SJ, Barbas CF 3rd. ACS Chem Biol. 2014 Aug 15;9(8):1662-7; Liu J, Gaj T, Wallen MC, Barbas CF 3rd. Mol Ther Nucleic Acids. 2015 Mar 10;4(3):e232; Liu J, et al. Nat Protoc. 2015 Nov;10(11):1842-59; Perdigao PRL, Cunha-Santos C, Barbas CF 3rd, Santa-Marta M, Goncalves J. Mol Ther Methods Clin Dev. 2020 May 22;18:145-158.
In other embodiments, the zinc finger fusion proteins include a cell-penetrating peptide sequence that facilitates delivery to the intracellular space, e.g., HIV-derived TAT peptide or hCT derived cell-penetrating peptides, see, e.g., Caron et al., (2001) Mol Ther. 3(3):310-8; Langel, Cell-Penetrating Peptides: Processes and Applications (CRC Press, Boca Raton FL 2002); El-Andaloussi et al., (2005) Curr Pharm Des. 11(28):3597-611; and Deshayes et al., (2005) Cell Mol Life Sci. 62(16):1839-49.
Cell penetrating peptides (CPPs) are short peptides that facilitate the movement of a wide range of biomolecules across the cell membrane into the cytoplasm or other organelles, e.g. the mitochondria and the nucleus. Examples of molecules that can be
delivered by CPPs include therapeutic drugs, plasmid DNA, oligonucleotides, siRNA, peptide-nucleic acid (PNA), proteins, peptides, nanoparticles, and liposomes. CPPs are generally 30 amino acids or less, are derived from naturally or non-naturally occurring protein or chimeric sequences, and contain either a high relative abundance of positively charged amino acids, e.g. lysine or arginine, or an alternating pattern of polar and nonpolar amino acids. CPPs that are commonly used in the art include Tat (Frankel et al., (1988) Cell. 55:1189-1193, Vives et al., (1997) J. Biol. Chem. 272:16010-16017), penetratin (Derossi et al., (1994) J. Biol. Chem. 269:10444-10450), polyarginine peptide sequences (Wender et al., (2000) Proc. Natl. Acad. Sci. USA 97:13003-13008, Futaki et al., (2001) J. Biol. Chem. 276:5836-5840), and transportan (Pooga et al., (1998) Nat. Biotechnol. 16:857-861).
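The two general CPP properties noted above (a length of 30 amino acids or less and a high relative abundance of lysine or arginine) can be illustrated with a short calculation. This is a minimal sketch with hypothetical function names. The Tat(47-57) sequence used below is the commonly cited basic region of HIV-1 Tat and is an assumption brought in for illustration; it is not recited in this document, and no numerical threshold for "high relative abundance" is implied.

```python
# Positively charged residues named in the text: lysine (K) and arginine (R).
BASIC_RESIDUES = set("KR")
MAX_CPP_LENGTH = 30  # "generally 30 amino acids or less"


def basic_fraction(peptide: str) -> float:
    """Fraction of residues in the peptide that are lysine or arginine."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    return sum(aa in BASIC_RESIDUES for aa in peptide) / len(peptide)


def within_typical_cpp_length(peptide: str) -> bool:
    """True if the peptide is no longer than 30 amino acids."""
    return len(peptide) <= MAX_CPP_LENGTH


# Assumption: commonly cited Tat(47-57) sequence, not taken from this document.
TAT_47_57 = "YGRKKRRQRRR"
# Hypothetical nine-residue polyarginine peptide for comparison.
POLY_ARG_9 = "R" * 9

for name, pep in (("Tat(47-57)", TAT_47_57), ("R9", POLY_ARG_9)):
    print(name, len(pep), within_typical_cpp_length(pep), round(basic_fraction(pep), 2))
```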
CPPs can be linked with their cargo through covalent or non-covalent strategies. Methods for covalently joining a CPP and its cargo are known in the art, e.g. chemical cross-linking (Stetsenko et al., (2000) J. Org. Chem. 65:4900-4909, Gait et al. (2003) Cell. Mol. Life. Sci. 60:844-853) or cloning a fusion protein (Nagahara et al., (1998) Nat. Med. 4:1449-1453). Non-covalent coupling between the cargo and short amphipathic CPPs comprising polar and non-polar domains is established through electrostatic and hydrophobic interactions.
CPPs have been utilized in the art to deliver potentially therapeutic biomolecules into cells. Examples include cyclosporine linked to polyarginine for immunosuppression (Rothbard et al., (2000) Nature Medicine 6(11):1253-1257), siRNA against cyclin B1 linked to a CPP called MPG for inhibiting tumorigenesis (Crombez et al., (2007) Biochem Soc. Trans. 35:44-46), tumor suppressor p53 peptides linked to CPPs to reduce cancer cell growth (Takenobu et al., (2002) Mol. Cancer Ther. 1(12):1043-1049, Snyder et al., (2004) PLoS Biol. 2:E36), and dominant negative forms of Ras or phosphoinositol 3 kinase (PI3K) fused to Tat to treat asthma (Myou et al., (2003) J. Immunol. 171:4399-4405).
CPPs have been utilized in the art to transport contrast agents into cells for imaging and biosensing applications. For example, green fluorescent protein (GFP) attached to Tat has been used to label cancer cells (Shokolenko et al., (2005) DNA Repair 4(4):511-518). Tat conjugated to quantum dots has been used to successfully cross the blood-brain barrier for visualization of the rat brain (Santra et al., (2005) Chem. Commun. 3144-3146). CPPs have also been combined with magnetic resonance imaging techniques for cell imaging (Liu et al., (2006) Biochem. and Biophys. Res. Comm. 347(1):133-140). See also Ramsey and Flynn, Pharmacol Ther. 2015 Jul 22. pii: S0163-7258(15)00141-2.
In some embodiments, zinc finger fusion proteins include a moiety that has a high affinity for a ligand, for example GST, FLAG or hexahistidine sequences (one or more hexahistidine sequences). Such affinity tags can facilitate the purification of recombinant zinc finger fusion proteins.
In some embodiments, the zinc finger fusion proteins do not include a NLS or hexahistidine sequence.
Also provided herein are compositions and kits comprising the zinc finger fusion protein described herein. The kits can also include one or more additional reagents, e.g., additional enzymes (such as RNA polymerases) and buffers, e.g., for use in a method described herein.
Pharmaceutical Compositions and Methods of Administration
The methods described herein include the use of pharmaceutical compositions comprising the zinc finger fusion proteins described herein as an active ingredient.
Pharmaceutical compositions typically include a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” includes saline, solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. Supplementary active compounds can also be incorporated into the compositions.
Pharmaceutical compositions are typically formulated to be compatible with their intended route of administration. Examples of routes of administration include intrathecal, intraperitoneal, intraocular, oral, intravenous, intradermal, subcutaneous, intratumoral injection, administration by a gel for slow release, or an infusion pump.
Methods of formulating suitable pharmaceutical compositions are known in the art, see, e.g., Remington: The Science and Practice of Pharmacy, 21st ed., 2005; and the
books in the series Drugs and the Pharmaceutical Sciences: a Series of Textbooks and Monographs (Dekker, NY). For example, solutions or suspensions used for administration to the eye, parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.
Pharmaceutical compositions suitable for injectable use can include sterile aqueous solutions (where water-soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, NJ) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It should be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about
by including in the composition an agent that delays absorption, for example, aluminum monostearate and gelatin.
Sterile injectable solutions can be prepared by incorporating the active compound in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filter sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle, which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.
Oral compositions generally include an inert diluent or an edible carrier. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules, e.g., gelatin capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.
For administration by inhalation, the compounds can be delivered in the form of an aerosol spray from a pressurized container or dispenser that contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer. Such methods include those described in U.S. Patent No. 6,468,798.
Pharmaceutical compositions may also be formulated to provide slow, controlled or sustained release of the active agent using, by way of example, hydroxypropyl methyl cellulose in varying proportions or other polymer matrices, liposomes and/or microspheres. In addition, the pharmaceutical compositions described herein may contain
opacifying agents and may be formulated so that they release the active agent only, or preferentially, in a certain portion of the gastrointestinal tract, optionally, in a delayed manner. The active agent can also be in micro-encapsulated form, if appropriate, with one or more of the above-described excipients.
Methods of Treatment
The methods described herein include methods for the treatment of disorders associated with GGAA tandem repeats. In some embodiments, the disorder is Ewing Sarcoma. In other embodiments, the disorder is prostate cancer (see, e.g., Kedage et al., An Interaction with Ewing's Sarcoma Breakpoint Protein EWS Defines a Specific Oncogenic Mechanism of ETS Factors Rearranged in Prostate Cancer, Cell Reports 2016 Oct 25;17(5):1289-1301, where dysregulation of GGAA repeats in prostate cancer due to TMPRSS2-ERG fusions is described). In other embodiments, the disorder is a tumor in which ETS factors have abnormal functions that may involve dysregulation of GGAA repeats (including hematopoietic malignancies with high levels of FLI1). Generally, the methods include administering a therapeutically effective amount of the compositions comprising a zinc finger fusion protein as described herein, to a subject who is in need of, or who has been determined to be in need of, such treatment.
As used herein, the term “patient” or “subject” refers to members of the animal kingdom including but not limited to human beings and “mammal” refers to all mammals, including, but not limited to human beings.
As used herein, the terms “treat,” “treating,” “treatment,” and the like refer to reducing or ameliorating a disorder and/or symptoms associated therewith by any suitable dosage regimen, procedure and/or administration route of a composition, device or structure with the object of achieving a desirable clinical/medical end-point. It will be appreciated that, although not precluded, treating a disorder or condition does not require that the disorder, condition or symptoms associated therewith be completely eliminated. In specific embodiments, the terms “treat,” “treatment,” and “treating” refer to the amelioration of at least one measurable physical parameter of a proliferative disorder, such as growth of a tumor, not necessarily discernible by the patient. In other embodiments the terms “treat,” “treatment,” and “treating” refer to the inhibition of the
progression of a proliferative disorder, either physically by, e.g., stabilization of a discernible symptom, physiologically by, e.g., stabilization of a physical parameter, or both. In other embodiments the terms “treat,” “treatment,” and “treating” refer to the reduction or stabilization of tumor size or cancerous cell count.
Ewing’s Sarcoma is a type of cancerous tumor that grows in the bones or the soft tissue around bones, such as cartilage or the nerves. It results from a translocation which fuses the EWS gene on chromosome 22 with the FLI1 gene on chromosome 11. The resultant fusion, EWS-FLI1, functions as a transcriptional activator. Treatment for Ewing sarcoma usually begins with chemotherapy. The drugs may shrink the tumor and make it easier to remove the cancer with surgery or target with radiation therapy. After surgery or radiation therapy, chemotherapy treatments might continue in order to kill any cancer cells that might remain. Accordingly, in some examples, the compositions described herein are administered to a subject in need thereof (e.g., intravenously (similar to other chemotherapy treatments currently used for Ewing’s Sarcoma), through an infusion pump, or by intratumoral injection) in a therapeutically sufficient amount to reduce tumor size or to kill tumor cells. In some instances, the compositions described herein are administered in a therapeutically sufficient amount to reduce the aberrant gene expression driven by activation of GGAA-microsatellites in a cell, which results from the activity of EWS-FLI1.
In some instances, the compositions described herein can be used in combination with one or more other treatments that are typically used to treat Ewing’s Sarcoma, for example, chemotherapy agents (e.g., vincristine, doxorubicin, cyclophosphamide, ifosfamide, and etoposide), radiation, surgery, or any combination thereof.
EXAMPLES
The present invention is additionally described by way of the following illustrative, non-limiting Examples that provide a better understanding of the present invention and of its many advantages.
As shown herein, engineered ZFAs can be used to efficiently target and alter the chromatin state of a class of microsatellite repeats in human cells.
Materials and Methods
The following materials and methods were used in the examples below.
EXPERIMENTAL MODEL AND SUBJECT DETAILS
Cell lines
Primary bone marrow-derived MSCs were collected with approval from the Institutional Review Board of the Centre Hospitalier Universitaire Vaudois. Samples were de-identified prior to our analysis. MSCs were cultured in IMDM (Life Technologies) containing 10% fetal calf serum (FCS) and 10 ng/ml platelet-derived growth factor BB (PeproTech). U2OS were obtained from Toni Cathomen (Freiburg). All other cell lines were obtained from ATCC and media from Life Technologies. Ewing sarcoma cell lines SKNMC, A673, and EW7 were grown in RPMI 1640 and CHP100 in McCoy’s 5a Medium. HEK293T, HeLa and U2OS were grown in DMEM and MRC5 in EMEM. All media were supplemented with 1% penicillin and streptomycin (Life Technologies). McCoy’s 5a medium was supplemented with 15% FBS and all other media were supplemented with 10% FBS. Cells were cultured at 37 °C with 5% CO2. Media supernatant was analyzed biweekly for the presence of Mycoplasma using MycoAlert™ PLUS (Lonza). Cell lines were authenticated by ATCC STR profiling.
METHODS DETAILS
Plasmids and oligonucleotides
Each of the 16 different ZFAs that recognize ~4.5 GGAA tandem repeats was generated by assembling pre-selected 2-ZF units from an unpublished Joung lab archive. Although we used an unpublished archive of engineered zinc finger modules to provide the various 2-ZF units for constructing our ZFAs, there are other published public sources of zinc finger units as well as protocols that can be used to create customized zinc finger arrays (Sander et al. 2010; Fu et al. 2009; Wright et al. 2006; Sander et al. 2011; Maeder et al. 2008, 2009). The assembled ZFAs were inserted into the pENTR3C vector and EWS N-terminus (Riggi et al. 2014) or KRAB (from BPK1407) was cloned into pENTR3C-ZFAs by Gibson assembly. The EWS-ZFA or KRAB-ZFA fusions thus generated were transferred to lentiviral pLIV vector containing EF1-alpha promoter via LR reactions using Gateway LR clonase II Enzyme Mix (Invitrogen). dCas9-EWS (NP173) was constructed by cloning EWS into BPK1179 digested with XhoI and NotI by Gibson assembly, and EWS-dCas9 (YET3486) was constructed by cloning EWS into pSQT digested with AgeI and BstZ17I by Gibson assembly. DmrC-EWS was generated by inserting EWS into DmrC entry vector digested with NruI, using Gibson assembly. Sequences of gRNAs used in this study are provided in Table 2A.
Transfection
For EWS-ZFA experiments in U2OS cells, 2 x 10^5 cells were transfected with 1 ug of plasmids by nucleofection using the DN-100 program on a Lonza 4-D Nucleofector with the SE Cell Line Kit (Lonza) and transfected cells were plated in 24-well plates. For dCas9-based EWS constructs, we used the nucleofection method described in detail previously (Tak et al. 2017).
Lentiviral Generation
Lentivirus was produced in HEK293T LentiX cells (Clontech) by LT1 (Mirus Bio) transfection with the gene delivery vector and the packaging vectors GAG/POL and VSV plasmids (Boulay et al. 2017). Viral supernatants were collected 72 h after transfection and concentrated using the LentiX concentrator (Clontech). Virus-containing pellets were resuspended in PBS and added dropwise onto cells in the presence of growth media supplemented with 6 ug/ml polybrene. Cells infected with lentivirus were selected using puromycin (Invivogen) in the growth medium at a concentration of 1 ug/ml for SKNMC, EW7, CHP100, HEK293T, HeLa and U2OS or 2 ug/ml for A673 and MRC5. MSCs were selected with 0.75 ug/ml puromycin. Overexpression efficiency was determined by immunoblot analysis.
Immunoblot Analysis
Immunoblot analyses were performed using standard protocols (Boulay et al. 2017). Primary antibodies were used at the following concentrations: rat anti-HA (Roche, 1 ug/ml), rabbit anti-FLI1 (Abcam, 1 ug/ml), and mouse anti-GAPDH (Millipore, 0.1 ug/ml). Secondary antibodies were goat anti-rabbit, goat anti-rat, and goat anti-mouse IgG, respectively, conjugated with horseradish peroxidase (Bio-Rad, 1:10,000 dilution).
Membranes were developed using Western Lightning Plus-ECL enhanced chemiluminescence substrate (PerkinElmer) and visualized using photographic film.
Real-Time Quantitative reverse transcription PCR
Total RNA was extracted from the transfected cells 72 hours post-transfection using the NucleoSpin® RNA Plus kit (Clontech), and 250 ng of purified RNA was used for cDNA synthesis in a 20 ul total reaction volume using the High-Capacity RNA-to-cDNA kit (ThermoFisher). cDNA was diluted 1:20 and 3 ul of cDNA was used for quantitative PCR (qPCR) using SYBR Green Real-Time PCR Master Mix (ThermoFisher) and primers specific for the target transcript (Table 2B). qPCR was performed using a Roche LightCycler 480 with the following cycling protocol: initial denaturation at 95 °C for 20 seconds (s) followed by 45 cycles of 95 °C for 3 s and 60 °C for 30 s. Ct values over 35 were set to 35. Relative quantification of each target, normalized to an endogenous control (GAPDH or HPRT1), was performed using the comparative Ct method (Applied Biosystems).
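The comparative Ct calculation described above can be sketched as follows. This is a minimal illustration only, assuming an amplification efficiency of 100% (one doubling per cycle); the function names and the Ct values in the usage note are hypothetical and are not taken from the study.

```python
CT_CAP = 35.0  # Ct values over 35 are treated as 35, as stated in the text


def cap_ct(ct, cap=CT_CAP):
    """Return the Ct value, limited to the cap."""
    return min(ct, cap)


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target transcript by the comparative Ct method.

    Assumes ~100% amplification efficiency (a simplifying assumption).
    """
    # Normalize the target to the endogenous control in each condition.
    d_ct_sample = cap_ct(ct_target_sample) - cap_ct(ct_ref_sample)
    d_ct_control = cap_ct(ct_target_control) - cap_ct(ct_ref_control)
    # Compare the treated sample to the control sample.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)
```

For example, with hypothetical Ct values of 25 (target) and 20 (reference) in a treated sample and 30 and 20 in the control sample, `relative_expression(25, 20, 30, 20)` gives a 32-fold increase.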
Cell Viability Assays
Cells that were transduced with lentiviral KRAB-ZFA plasmid or GFP control plasmid were grown for 8 days and cell viability was measured using the CellTiter-Glo luminescent assay (Promega) as described by the manufacturer. Endpoint luminescence was measured on a SpectraMax M5 plate reader (Molecular Devices).
Definition of target genes associated with EWS-FLI1 bound GGAA-repeats
In FIG. 3H, 126 GGAA repeat-associated genes were selected based on a maximum distance of 100 kb from EWS-FLI1 bound GGAA repeats (n=812) and upregulation upon EWS-FLI1 induction in MSCs (greater than 2-fold). In FIG. 5H, 235 GGAA repeat-associated genes were selected based on a maximum distance of 100 kb from EWS-FLI1 bound GGAA repeats (n=812) and downregulation upon EWS-FLI1 knockdown in both SKNMC and A673 Ewing sarcoma cell lines (greater than 2-fold) (Riggi et al. 2014).
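The selection criteria above (distance to a bound repeat and fold change) can be expressed as a simple filter. The sketch below is illustrative only: the record layout (`name`, `distance_bp`, `fold_change`) and the function name are hypothetical, and it handles one expression comparison at a time, so the requirement that a gene be downregulated in both SKNMC and A673 cells would need the intersection of two calls.

```python
MAX_DISTANCE_BP = 100_000   # maximum distance from a bound GGAA repeat
MIN_FOLD_CHANGE = 2.0       # "greater than 2-fold"


def select_repeat_associated_genes(genes, direction="up"):
    """Filter gene records by distance to the nearest bound repeat and fold change.

    `genes` is an iterable of dicts with hypothetical keys:
    'name', 'distance_bp', and 'fold_change' (treated / control).
    """
    selected = []
    for g in genes:
        if g["distance_bp"] > MAX_DISTANCE_BP:
            continue  # too far from any bound GGAA repeat
        fc = g["fold_change"]
        if direction == "up" and fc > MIN_FOLD_CHANGE:
            selected.append(g["name"])
        elif direction == "down" and fc < 1.0 / MIN_FOLD_CHANGE:
            selected.append(g["name"])
    return selected
```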
ChIP-seq
ChIP assays of MSCs, SKNMC, A673 and HEK293T cells were carried out using 2-5 x 10^6 cells per sample and per epitope, following the procedures described previously (Mikkelsen et al. 2007). In brief, chromatin from formaldehyde-fixed cells was fragmented to 200-700 bp with a Branson 250 sonifier. Solubilized chromatin was immunoprecipitated overnight at 4 °C with 3 ug of target-specific antibodies (rat anti-HA (Roche), rabbit anti-FLI1 (Abcam), rabbit anti-H3K27ac (Active Motif), and rabbit anti-H3K9me3 (Abcam)). Antibody-chromatin complexes were pulled down with protein G-Dynabeads (Life Technologies), washed, and then eluted. After crosslink reversal, RNase A, and proteinase K treatment, immunoprecipitated DNA was extracted with AMPure beads (Beckman Coulter). ChIP DNA was quantified with Qubit. Sequencing libraries were prepared with 1-5 ng of ChIP DNA samples and input samples using the Ovation Ultralow System V2 kit (Nugen). Libraries were sequenced with single-end (SE) 50-75 cycles on an Illumina NextSeq 500 instrument.
ChIP-seq Bioinformatic Analysis
Reads were aligned to human reference genome hg19 using bwa (Li and Durbin 2009). Aligned reads were then filtered to exclude PCR duplicates and were extended to 200 bp to approximate fragment sizes. Density maps were generated by counting the number of fragments overlapping each position using igvtools, and normalized to 10 million reads. We used MACS2 (Zhang et al. 2008) to call peaks using matching input controls with a q-value threshold of 0.01. Peaks were filtered to exclude blacklisted regions as defined by the ENCODE consortium (ENCODE Project Consortium 2012). Peaks within 200 bp of each other were merged. Genome-wide GGAA microsatellite repeats were previously annotated (Boulay et al. 2017, 2018). Peak intersections were identified using bedtools (Quinlan et al. 2010). Average ChIP-seq signals across intervals were calculated using bwtool (Pohl and Beato 2014). findMotifsGenome.pl from the HOMER suite of tools (Heinz et al. 2010) was used to identify de novo DNA motifs between 8 and 20 bp from all sites bound by EWS-ZFA. Signals shown in heatmaps (100 bp windows) and composite plots (10 bp windows) were calculated using bwtool (Pohl and Beato 2014). Heatmap signals are in log2 scale, centered around EWS-FLI1 binding sites (Riggi et al. 2014), and are capped at the 99th percentile.
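The peak-merging step mentioned above can be illustrated with a short sketch. It is not the pipeline used in this work; it is a minimal stand-alone example that assumes "within 200 bp" means a gap of at most 200 bp between the end of one peak and the start of the next on the same chromosome. The function name and the tuple layout are hypothetical.

```python
def merge_peaks(peaks, max_gap=200):
    """Merge peaks on the same chromosome that lie within `max_gap` bp.

    `peaks` is a list of (chrom, start, end) tuples.
    """
    merged = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            # Close enough to the previous peak: extend it.
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged
```

With this assumption, peaks at chr1:100-300 and chr1:450-600 (150 bp apart) are merged into chr1:100-600, whereas a peak at chr1:1000-1200 (400 bp away) stays separate.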
RNA-Seq
Total RNA was isolated from cells using NucleoSpin RNA Plus (Clontech). For Fig. 2h, RNA libraries were prepared from 500 ng of total RNA treated with Ribo-Zero Gold to remove ribosomal RNA, using the TruSeq Stranded Total RNA Library Prep Gold kit (Illumina, 20020599) and TruSeq RNA Single Indexes. The RNA libraries were sequenced with PE 32 cycles on an Illumina NextSeq 500 system. For Fig. 3g, RNA samples were sent to Novogene Corporation for mRNA sequencing. RNA libraries were sequenced with PE150 cycles on an Illumina NovaSeq 6000 system. Reads were aligned to hg19 using STAR (Dobin et al. 2013). Mapped reads were filtered to exclude PCR duplicates and reads mapping to known ribosomal RNA coordinates, obtained from the rmsk table in the UCSC database (genome . Gene expression was calculated using featureCounts (Liao, Smyth, and Shi 2014). Only primary alignments with mapping quality of 10 or more were counted. Counts were then normalized to 1 million reads. Signal tracks were generated using bedtools (Quinlan et al. 2010). Differential expression was calculated using DESeq2 (Love, Huber, and Anders 2014).
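The normalization of counts to 1 million reads can be sketched as a simple library-size scaling. This is an illustrative example only; the function name and the dictionary layout are hypothetical, and the same scaling with `scale=10_000_000` would correspond to the 10 million read normalization described for the ChIP-seq density maps.

```python
def normalize_counts(counts, scale=1_000_000):
    """Scale raw per-gene read counts so that they sum to `scale`.

    `counts` maps a gene name to its raw read count.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads counted")
    return {gene: n * scale / total for gene, n in counts.items()}
```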
GSEA analysis
Gene set overlaps were computed using Gene Set Enrichment Analysis (GSEA, gsea-msigdb.org/gsea/msigdb/annotate.jsp). Gene lists for GSEA analysis were selected using a log2 fold change of 0.6 for upregulated genes and -0.6 for downregulated genes. An adjusted p-value threshold of < 0.1 was also applied. Gene lists were then analyzed for overlaps with C2 (curated gene sets) and BP (GO biological process), with an FDR q-value < 0.05.
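The gene list selection above can be sketched as a threshold filter on a differential expression table. The sketch is a hedged illustration: the tuple layout and function name are hypothetical, and it assumes that the log2 fold-change cutoffs are inclusive, which the text does not specify.

```python
def split_gene_lists(results, lfc_cutoff=0.6, padj_cutoff=0.1):
    """Return (upregulated, downregulated) gene names.

    `results` is an iterable of (gene, log2_fold_change, adjusted_p) tuples,
    e.g. parsed from a differential expression table (hypothetical layout).
    """
    up, down = [], []
    for gene, lfc, padj in results:
        # Skip genes without an adjusted p-value or above the threshold.
        if padj is None or padj >= padj_cutoff:
            continue
        if lfc >= lfc_cutoff:
            up.append(gene)
        elif lfc <= -lfc_cutoff:
            down.append(gene)
    return up, down
```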
QUANTIFICATION AND STATISTICAL ANALYSIS
Information on the number of biological replicates, statistical tests and p-values is provided in the figure legends.
Example 1.1: Engineering sequence-specific DNA-binding domains to target GGAA microsatellite repeats
Although multiple platforms are available to create DNA-binding modules that might be capable of recognizing GGAA microsatellite repeats bound by the EWS-FLI1 fusion protein, we chose to focus on using engineered Zinc Finger Arrays (ZFAs) and RNA-guided dCas9 from Streptococcus pyogenes. Cys2His2 ZFAs can be engineered to recognize novel DNA sequences of interest and have been used successfully to build artificial transcription factors capable of influencing gene expression in human cells (Graslund et al. 2005; Beerli et al. 1998). Alternatively, dCas9 programmed by guide RNAs (gRNAs) has similarly been used to create gene regulatory proteins that function efficiently in human cells (Qi et al. 2013; Maeder et al. 2013; Perez-Pinera et al. 2013) and offers the substantial additional advantage of simple targetability by altering the gRNA sequence. TALE repeats have also been used to build customized DNA-binding domains, utilizing assembled arrays of four TALE repeat domains as “building blocks”, with each recognizing one of the four different DNA bases (Scholze and Boch 2011; Boch et al. 2009; Moscou and Bogdanove 2009). However, because the NN TALE repeat typically used to recognize guanine (G) has also been reported to recognize adenine (A) (albeit with less efficiency) (Streubel et al. 2012; Deng et al. 2012; Christian et al. 2012), we elected not to engineer TALE repeat arrays designed to recognize GGAA microsatellite repeat sequences.
We engineered ZFAs to recognize two 18 bp sequences that align within different registers of 4.5 GGAA repeats: 5’-GGAAGGAAGGAAGGAAGG and 5’-AAGGAAGGAAGGAAGGAA. A single ZF recognizes ~3 bp of DNA and previous work has shown that highly active arrays of six ZFs that recognize 18 bp target sites can be assembled by using pre-selected 2-ZF units joined together by non-canonical (non-TGEKP) linkers such as TGSQKP or CGSQKP (Sander et al. 2010; Fu et al. 2009; Wright et al. 2006; Sander et al. 2011; Maeder et al. 2009, 2008; Joung, Voytas, and Kamens 2015; Pearson 2008; Moore, Klug, and Choo 2001; Joung lab (unpublished data)). Using this strategy and an archive of pre-selected 2-ZF units engineered to bind to various specific target sequences, we assembled eight different 6-ZF arrays for each of the two 18 bp target sites (FIG. 1A) (Methods, Table 1). To test the abilities of these 16 ZFAs to bind to GGAA microsatellite repeats, we fused the disordered prion-like N-terminal domain of EWSR1 (Chong et al. 2018; Boulay et al. 2017) (hereafter referred to as the EWS domain) to the N-terminus or C-terminus of each of the ZFAs (FIG. 1B). We then assessed the abilities of each of these 32 fusions to activate the UGT3A2 gene (an EWS-FLI1 target gene that has 11 GGAA repeats positioned ~2 kb upstream of its promoter) in human U2OS cells. We found that all of these ZF-based fusions activated UGT3A2 with varying levels of efficiency (mean fold-activation ranging from 14- to 190-fold) (FIG. 1C). Because the ZF array ZFA7 exhibited approximately equivalent activity regardless of the position of the EWS domain (mean fold activation of 83- and 70-fold) and this level of activation was similar to that observed with EWS-FLI1 (mean fold activation of 121-fold) (FIG. 1C), we selected the EWS-ZFA7 fusion (hereafter referred to as EWS-ZFA) for use in further experiments.
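The relationship between the two 18 bp target sites and the GGAA repeat can be checked with a short sketch. The two site sequences are taken from the text; the helper name `repeat_register` is hypothetical and is used only for illustration.

```python
REPEAT_UNIT = "GGAA"
BP_PER_FINGER = 3          # a single ZF recognizes ~3 bp
FINGERS_PER_ARRAY = 6      # six-finger arrays

TARGET_SITES = [
    "GGAAGGAAGGAAGGAAGG",
    "AAGGAAGGAAGGAAGGAA",
]


def repeat_register(site, unit=REPEAT_UNIT):
    """Return the offset at which `site` occurs in a perfect tandem repeat,
    or None if the site is not a window of such a repeat."""
    tract = unit * (len(site) // len(unit) + 2)
    offset = tract.find(site)
    return offset if offset >= 0 else None


for site in TARGET_SITES:
    # 6 fingers x 3 bp = 18 bp, i.e. 4.5 GGAA units.
    assert len(site) == BP_PER_FINGER * FINGERS_PER_ARRAY
    print(site, "register offset:", repeat_register(site))
```

Both sites are 18 bp windows of a perfect GGAA tract, shifted by 2 bp relative to each other (offsets 0 and 2).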
To enable binding of GGAA repeats by dCas9, we also designed a gRNA that would target a 23 bp sequence composed of ~5.5 GGAA repeats: 5’-AGGAAGGAAGGAAGGAAGGAagg, consisting of a 20 nt spacer (upper case) and an NGG PAM (lower case). To our surprise, expression of this gRNA together with a fusion protein in which the EWS domain was fused to the N-terminal or C-terminal end of dCas9 (hereafter referred to as EWS-dCas9 or dCas9-EWS, respectively) failed to activate the UGT3A2 gene in U2OS cells (FIG. 1D). To increase the number of EWS domains recruited by our dCas9-gRNA complex, we used single and multimerized configurations of two domains called DmrA and DmrC that only interact in the presence of a small molecule A/C heterodimerizer. We co-expressed our gRNA and a DmrC-EWS domain fusion together with a dCas9-DmrA fusion protein harboring one, two, three, or four DmrA domains in U2OS cells (FIG. 1D). In the presence of heterodimerizer, we found that dCas9-DmrA fusions harboring two, three or four DmrA domains could mediate modest activation of the UGT3A2 gene (mean fold-activation of 3.1, 3, or 1.1, respectively), levels much lower than the activation observed using the EWS-ZFA fusion (FIG. 1D). In contrast, these same dCas9-based EWS constructs were effective at activating various other genes when using different gRNAs directed to non-repetitive target sites within the promoters of those genes in U2OS cells and HEK293 cells (FIGs. 2A-2B). The ability of our dCas9-based EWS constructs to mediate activation from unique sites in the genome but not from GGAA repeats suggests that it may be challenging for these fusions to recognize and/or bind to these repeats, which are present at over 13,000 loci in the human genome, including the UGT3A2 promoter. Taken together, our results show that an engineered EWS-ZFA fusion could more effectively activate an EWS-FLI1 target gene with upstream GGAA repeats than analogous dCas9-based fusions to the EWS domain.
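The structure of the gRNA target site (a 20 nt spacer followed by an NGG PAM) can be illustrated by scanning a repeat tract for matches. The spacer sequence is taken from the text; the function name is hypothetical, only one strand is scanned, and the tracts used are idealized perfect repeats rather than actual genomic sequence.

```python
import re

PROTOSPACER = "AGGAAGGAAGGAAGGAAGGA"  # 20 nt spacer from the text


def count_target_sites(tract, spacer=PROTOSPACER):
    """Count (possibly overlapping) spacer + NGG matches on the given strand."""
    # A lookahead is used so that overlapping matches are all counted.
    pattern = re.compile("(?=" + re.escape(spacer) + "[ACGT]GG)")
    return len(pattern.findall(tract.upper()))


assert len(PROTOSPACER) == 20
print(count_target_sites("GGAA" * 7))   # idealized 7-unit tract
print(count_target_sites("GGAA" * 11))  # idealized 11-unit tract
```

Under these assumptions, an idealized 7-unit tract contains one such 23 bp site and an idealized 11-unit tract contains five overlapping sites, each shifted by one repeat unit.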
Example 1.2: An EWS-ZFA fusion recapitulates genome-wide activation of microsatellite repeats observed in Ewing sarcoma
We next tested whether EWS-ZFA could target and activate GGAA microsatellites genome-wide by comparing its activity to EWS-FLI1 in mesenchymal stem cells (MSCs). MSCs are a model for the cell of origin of Ewing sarcoma and EWS-FLI1 has previously been shown to operate as a pioneer factor at GGAA repeats in these cells to induce a chromatin landscape and gene expression pattern similar to that of tumor cells (Riggi et al. 2014). Transduction of MSCs with lentiviral vectors expressing EWS-ZFA followed by ChIP-seq and unbiased sequence analysis identified GGAA repeats as the dominant motif found in EWS-ZFA peaks (more than 80% of EWS-ZFA binding sites contained more than four consecutive GGAA units, FIG. 3A, FIG. 4A; note that alternate motifs include: GGAAGGAAGGAAGGAAGGAAGGAA (SEQ ID NO: 136) and AAGGAAGGAAGGAAGGAAGGAAGG (SEQ ID NO: 137)). EWS-ZFA also bound nearly all of the GGAA repeats in the genome bound by EWS-FLI1 (FIG. 3B, FIG. 4B). We categorized 13,029 GGAA microsatellites with more than 4 consecutive GGAA units based on their length (FIG. 4C) and found that EWS-ZFA binds a higher fraction of GGAA microsatellites (10-20%) than EWS-FLI1 at each length interval (FIG. 3C, FIG. 4D).
We further tested whether EWS-ZFA binding would lead to the induction of active chromatin states at GGAA repeats in MSCs. To do this, we used ChIP-seq to measure the active chromatin mark H3K27ac at 812 GGAA microsatellites that are consistently bound by endogenous EWS-FLI1 in Ewing sarcoma cell lines (Riggi et al. 2014). We observed strong binding of EWS-ZFA at these same sites and de novo deposition of H3K27ac, often at higher levels than that induced by EWS-FLI1 (FIGs. 3D-3E; FIG. 4E). This higher activity and the higher binding of GGAA repeats by EWS-ZFA compared to EWS-FLI1 (FIG. 3C, FIG. 4D) may be due to higher protein expression levels observed upon lentiviral induction (FIG. 4F). However, we cannot rule out other possible explanations such as structural and functional differences between the ZFA and FLI1 DNA-binding domains (which may provide distinct DNA stability profiles to each fusion protein) or differing numbers of fusion proteins recruited to a given GGAA repeat (which may result in variable recruitment of chromatin co-factors involved in H3K27ac deposition). By contrast, canonical non-repeat GGAA sites bound by EWS-FLI1 showed no evidence of EWS-ZFA binding or chromatin state changes, thereby demonstrating, as expected, the specificity of the engineered EWS-ZFA fusion for GGAA repeats relative to non-repeat GGAA sites (FIGs. 3F-3G, FIG. 4G). In addition to changes in chromatin activity, we also measured transcriptional changes for genes in the vicinity of EWS-FLI1-bound GGAA repeats. Transcript analysis showed that 72% of the genes that are within 100 kb of EWS-FLI1-bound GGAA repeats (FIG. 3H, Table 3, shown below) and that are induced > 2-fold by EWS-FLI1 were also upregulated to a similar degree by EWS-ZFA.
Taken together, these data show that EWS-ZFA was able to phenocopy the chromatin and transcriptional activation observed in Ewing sarcoma, suggesting that localizing the N-terminal EWS domain to GGAA repeats via an engineered ZFA instead of the FLI1 DNA-binding domain was sufficient to initiate the recruitment of chromatin regulators required for pioneer function, enhancer activation and target gene expression. These results provide an important proof-of-concept for how engineered ZFAs can be an effective tool to target and alter the functional state of GGAA microsatellites genome-wide.
Example 1.3: KRAB-ZFAs can selectively silence the microsatellite-driven Ewing Sarcoma gene expression program
Given that EWS-ZFA can efficiently target and activate GGAA microsatellites in MSCs, we hypothesized that a fusion of our engineered ZFA to a repressive KRAB domain (Margolin et al. 1994; Groner et al. 2010) might conversely silence active GGAA microsatellites bound by endogenous EWS-FLI1 in Ewing sarcoma cells, thereby inactivating its downstream oncogenic gene expression program. This approach offers the possibility to delineate the precise functional role of GGAA repeats in Ewing sarcoma cells, in isolation from the non-repeat GGAA target sites of EWS-FLI1. We expressed a KRAB-ZFA fusion protein and found that it bound efficiently to GGAA microsatellites in two Ewing sarcoma cell lines (SKNMC and A673). Interestingly, KRAB-ZFA binding was followed by EWS-FLI1 eviction from the same genomic sites, as assessed by FLI1 ChIP-seq performed in SKNMC and A673 cells (FLI1 ChIP-seq can be used to detect the binding of EWS-FLI1 because these two cell lines do not express endogenous wild-type FLI1) (FIGs. 5A-5B, FIGs. 6A-6B). KRAB-ZFA binding was also associated with striking changes in chromatin states and the induction of repressive marks, with increased H3K9me3 and decreased H3K27ac signals at GGAA microsatellites (FIGs. 5A, 5C-5D, FIGs. 6A, 6C). As expected, these changes were observed uniquely at GGAA repeats and not at non-repeat GGAA EWS-FLI1 binding sites, confirming the specificity of KRAB-ZFA (FIGs. 6D-6F). Among the genes located within 100 kb of EWS-FLI1-bound GGAA repeats that showed > 2-fold decreases in EWS-FLI1-depleted cell lines, 49% and 47% showed a similar decrease due to KRAB-ZFA expression in SKNMC and A673 cells, respectively (FIG. 5G, FIG. 7A, Table 4, shown below). Genes involved in specific functional categories (e.g., cell cycle regulation and neurogenesis) that have previously been identified after EWS-FLI1 knockdown (Riggi et al. 2014) and are linked to Ewing sarcoma cell survival were enriched among the genes downregulated by KRAB-ZFA in SKNMC and A673 cells (FIG. 7B).
Because the KRAB-ZFA fusion would only be expected to alter the function of GGAA repeats in Ewing sarcoma cells in which EWS-FLI1 is expressed (and in which these repeats function as enhancers), we were interested in evaluating the effects of KRAB-ZFA expression in non-Ewing sarcoma cells. To do this, we analyzed genome-wide chromatin state changes in HEK293T cells upon expression of KRAB-ZFA. Similar to what has been observed in most non-Ewing cell types previously examined (Riggi et al. 2014), we found that HEK293T cells were largely devoid of active chromatin marks at GGAA repeats and that there were no major changes in H3K27ac signals induced by KRAB-ZFA expression (FIG. 7C). However, GGAA repeats in HEK293T cells accumulated strong repressive H3K9me3 signals after expression of KRAB-ZFA in the same manner as Ewing sarcoma cells (FIGs. 5E-5F). In contrast to Ewing sarcoma cells, HEK293T cells transduced with KRAB-ZFA displayed minimal transcriptional changes, which only included a handful of genes with GGAA repeats located within their promoters (FIG. 7D, Table 6).
Finally, we tested whether the selective antagonistic effect exerted by the KRAB-ZFA fusion on the EWS-FLI1-induced transcriptional program in Ewing sarcoma cells might also translate into a cell-type-specific impact on cell viability. To this end, we quantitatively compared the viability of four different Ewing sarcoma cell lines to four non-Ewing sarcoma control lines upon the expression of KRAB-ZFA or GFP (as a negative control) (FIG. 5H). Strikingly, despite similar KRAB-ZFA protein expression levels (FIG. 7E), only the viability of Ewing sarcoma cells was affected by KRAB-ZFA, with a reduction exceeding 80%, whereas minimal toxicity was observed in all negative control non-Ewing sarcoma cell lines (FIG. 5H).
Discussion
Our results show that engineered ZFAs are highly effective and specific tools for targeting widely distributed repetitive elements and altering their chromatin states. Engineered ZFAs have distinct advantages for this purpose given their high DNA binding affinities, small size, and similarities to endogenous transcription factors. Our findings further demonstrate that engineered ZFAs can greatly facilitate the functional assessment of the important but challenging-to-study repetitive elements of the human genome and may provide a strategy for therapeutically modifying the non-coding function of these repeats.
In the case of GGAA microsatellites that are activated genome-wide in Ewing sarcoma, the high degree of specificity conferred by ZFAs allowed us to isolate their function and determine that these elements are in fact responsible for large-scale gene activation in this tumor type. By recruiting specific regulatory domains without the involvement of endogenous DNA binding proteins, engineered ZFAs also make it possible to study the contribution of poorly understood proteins, as shown by our finding that the N-terminus of EWSR1 is sufficient to activate GGAA microsatellites in the absence of the ETS DNA binding domain contained in EWS-FLI1.
Intriguingly, we observed large differences between ZFA and RNA-guided dCas9 approaches for targeting GGAA repeats. These observations suggest that ZFAs may have advantages over dCas9 for studying the function of tandem repeats that occur at a high number of different locations in the human genome.
Sequences for Zinc-finger array fusion proteins
KRAB-ZFA1 (SEQ ID NO:66)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: KRAB, Underlined: ZFA1
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGC
ATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACCCGTACTCATAC
AGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTG
CACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCA
GTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATC
TACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAA
CTTCAGTCGTCAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGTTCCC
AGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCAT
TTGGACGCACATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGA
TCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACG
CACCTGAGGGGATCCTAA
KRAB-ZFA2 (SEQ ID NO:67)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: KRAB, Underlined: ZFA2
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGC
ATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACCCGTACTCATAC
AGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTG
CACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCA
GTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATC
TACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAA
CTTCAGTCGTCAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGTTCCC
AGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACAT
TTGTTGAACCATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGAT
CTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGC
ACCTGAGGGGATCCTAA
KRAB-ZFA3 (SEQ ID NO:68)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: KRAB, Underlined: ZFA3
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGC
ATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACCCGTACTCATAC
AGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTG
CACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCA
GTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATC
TACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAA
CTTCAGTCGTAACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCC
AGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCAT
TTGGACGCACATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGA
TCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACG
CACCTGAGGGGATCCTAA
KRAB-ZFA4 (SEQ ID NO:69)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: KRAB, Underlined: ZFA4
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGC
ATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACCCGTACTCATAC
AGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTG
CACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCA
GTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATC
TACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAA
CTTCAGTCGTAACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCC
AGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACAT
TTGTTGAACCATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGAT
CTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGC
ACCTGAGGGGATCCTAA
KRAB-ZFA5 (SEQ ID NO: 70)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: KRAB, Underlined: ZFA5
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGC
ATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATAC
AGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTG
CACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCA
GTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATC
TACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAA
CTTCAGTCGTCAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGTTCCC
AGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCAT
TTGGACGCACATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGA
TCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACG
CACCTGAGGGGATCCTAA
KRAB-ZFA6 (SEQ ID NO:71)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: KRAB, Underlined: ZFA6
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGC
ATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATAC
AGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTG
CACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCA
GTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATC
TACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAA
CTTCAGTCGTCAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGTTCCC
AGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACAT
TTGTTGAACCATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGAT
CTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGC
ACCTGAGGGGATCCTAA
KRAB-ZFA7 (SEQ ID NO: 72)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: KRAB, Underlined: ZFA7
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGC
ATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATAC
AGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTG
CACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCA
GTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATC
TACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAA
CTTCAGTCGTAACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCC
AGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCAT
TTGGACGCACATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGA
TCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACG
CACCTGAGGGGATCCTAA
KRAB-ZFA8 (SEQ ID NO: 73)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: KRAB, Underlined: ZFA8
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGC
ATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATAC
AGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTG
CACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCA
GTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATC
TACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAA
CTTCAGTCGTAACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCC
AGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACAT
TTGTTGAACCATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGAT
CTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGC
ACCTGAGGGGATCCTAA
KRAB-ZFA9 (SEQ ID NO: 74)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: KRAB, Underlined: ZFA9
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTA
TGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACAC
CGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAG
GACAACTTGTCTTGGCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCC
AGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACAT
ACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAA
ATTTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCC
CAGAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTA
ACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCG
AATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTGCGTA
CGCACCTGAGGGGATCCTAA
KRAB-ZFA10 (SEQ ID NO: 75)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: KRAB, Underlined: ZFA10
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTA
TGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACAC
CGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAG
GACAACTTGTCTTGGCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCC
AGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACAT
ACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAA
ATTTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCC
CAGAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAA
CTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGA
ATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCAGTATCACCTGCGTAC
GCACCTGAGGGGATCCTAA
KRAB-ZFA11 (SEQ ID NO: 76)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA11
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTA
TGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACAC
CGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAG
GACAACTTGTCTTGGCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCC
AGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCAT
ACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAA
ATTTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCC
CAGAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTA
ACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCG
AATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTGCGTA
CGCACCTGAGGGGATCCTAA
KRAB-ZFA12 (SEQ ID NO: 77)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA12
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTA
TGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACAC
CGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAG
GACAACTTGTCTTGGCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCC
AGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCAT
ACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAA
ATTTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCC
CAGAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAA
CTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGA
ATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCAGTATCACCTGCGTAC
GCACCTGAGGGGATCCTAA
KRAB-ZFA13 (SEQ ID NO: 78)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA13
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTA
TGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACC
GGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACT
CTTATTTGCAGTATCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCA
GTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATA
CCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAAT
TTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCA
GAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACT
TGCAGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAAT
ATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTGCGTACGC
ACCTGAGGGGATCCTAA
KRAB-ZFA14 (SEQ ID NO: 79)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA14
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTA
TGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACC
GGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACT
CTTATTTGCAGTATCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCA
GTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATA
CCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAAT
TTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCA
GAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACT
TGTTGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAAT
ATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCAGTATCACCTGCGTACGC
ACCTGAGGGGATCCTAA
KRAB-ZFA15 (SEQ ID NO: 80)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA15
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTA
TGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACC
GGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACT
CTTATTTGCAGTATCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCA
GTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATA
CCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAAT
TTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCA
GAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACT
TGCAGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAAT
ATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTGCGTACGC
ACCTGAGGGGATCCTAA
KRAB-ZFA16 (SEQ ID NO: 81)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA16
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCCGATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTT
CAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGAC
ACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGA
ACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGG
TTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAA
GAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTGG
AGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTA
TGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACC
GGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACT
CTTATTTGCAGTATCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCA
GTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATA
CCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAAT
TTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCA
GAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACT
TGTTGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAAT
ATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCAGTATCACCTGCGTACGC
ACCTGAGGGGATCCTAA
ZFA1-KRAB (SEQ ID NO: 82)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA1
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCCACATCATTTGGACGCACATACCCGTACTCATACAGGTGAAAAA
CCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAA
ACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATC
TGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCA
CACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGT
CAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGTTCCCAGAAGCCAT
TCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCA
CATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCG
AAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGG
ACACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGT
GGAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCT
GGAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCA
GATGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAG
AGAGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAA
TCAAATCATCAGTTTAA
ZFA2-KRAB (SEQ ID NO: 83)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA2
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCCACATCATTTGGACGCACATACCCGTACTCATACAGGTGAAAAA
CCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAA
ACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATC
TGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCA
CACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGT
CAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGTTCCCAGAAGCCAT
TCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAAC
CATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCG
AAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGG
ACACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGT
GGAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCT
GGAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCA
GATGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAG
AGAGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAA
TCAAATCATCAGTTTAA
ZFA3-KRAB (SEQ ID NO: 84)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA3
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCCACATCATTTGGACGCACATACCCGTACTCATACAGGTGAAAAA
CCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAA
ACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATC
TGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCA
CACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGT
AACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCCAGAAGCCATT
CCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCAC
ATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGA
AATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGGG
ATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGGA
CACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTG
GAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTG
GAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGA
TGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAG
AGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAATCA
AATCATCAGTTTAA
ZFA4-KRAB (SEQ ID NO: 85)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA4
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCCACATCATTTGGACGCACATACCCGTACTCATACAGGTGAAAAA
CCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAA
ACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATC
TGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCA
CACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGT
AACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCCAGAAGCCATT
CCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACC
ATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGA
AATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGGG
ATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGGA
CACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTG
GAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTG
GAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGA
TGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAG
AGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAATCA
AATCATCAGTTTAA
ZFA5-KRAB (SEQ ID NO: 86)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA5
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAAC
CCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAA
CGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATCT
GTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCA
CACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGT
CAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGTTCCCAGAAGCCAT
TCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCA
CATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCG
AAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGG
ACACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGT
GGAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCT
GGAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCA
GATGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAG
AGAGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAA
TCAAATCATCAGTTTAA
ZFA6-KRAB (SEQ ID NO: 87)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA6
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAAC
CCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAA
CGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATCT
GTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCA
CACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGT
CAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGTTCCCAGAAGCCAT
TCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAAC
CATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCG
AAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGG
ACACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGT
GGAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCT
GGAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCA
GATGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAG
AGAGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAA
TCAAATCATCAGTTTAA
ZFA7-KRAB (SEQ ID NO: 88)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA7
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAAC
CCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAA
CGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATCT
GTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCAC
ACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTA
ACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCCAGAAGCCATTC
CAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACA
TACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAA
ATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGGGA
TCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGGAC
ACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGG
AAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGG
AGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGAT
GTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGA
GAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAA
ATCATCAGTTTAA
ZFA8-KRAB (SEQ ID NO: 89)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA8
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAAC
CCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAA
CGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATCT
GTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCAC
ACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTA
ACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCCAGAAGCCATTC
CAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCA
TACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAA
ATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGGGA
TCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGGAC
ACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTGG
AAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTGG
AGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGAT
GTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGA
GAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAATCAA
ATCATCAGTTTAA
ZFA9-KRAB (SEQ ID NO: 90)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA9
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTC
TTGGCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGG
ACACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGT
GGAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCT
GGAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCA
GATGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAG
AGAGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAA
TCAAATCATCAGTTTAA
ZFA10-KRAB (SEQ ID NO: 91)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA10
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTC
TTGGCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTAACTCTTATTTGCAGTATCACCTGCGTACGCACCTGAGGGG
ATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGGA
CACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTG
GAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTG
GAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGA
TGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAG
AGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAATCA
AATCATCAGTTTAA
ZFA11-KRAB (SEQ ID NO: 92)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA11
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTC
TTGGCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGG
ACACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGT
GGAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCT
GGAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCA
GATGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAG
AGAGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAA
TCAAATCATCAGTTTAA
ZFA12-KRAB (SEQ ID NO: 93)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA12
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTC
TTGGCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTAACTCTTATTTGCAGTATCACCTGCGTACGCACCTGAGGGG
ATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGGA
CACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTG
GAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTG
GAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGA
TGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAG
AGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAATCA
AATCATCAGTTTAA
ZFA13-KRAB (SEQ ID NO: 94)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA13
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCA
GTATCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGG
ACACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGT
GGAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCT
GGAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCA
GATGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAG
AGAGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAA
TCAAATCATCAGTTTAA
ZFA14-KRAB (SEQ ID NO: 95)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA14
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCA
GTATCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTAACTCTTATTTGCAGTATCACCTGCGTACGCACCTGAGGGG
ATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGGA
CACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTG
GAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTG
GAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGA
TGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAG
AGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAATCA
AATCATCAGTTTAA
ZFA15-KRAB (SEQ ID NO: 96)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA15
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCA
GTATCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGG
ACACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGT
GGAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCT
GGAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCA
GATGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAG
AGAGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAA
TCAAATCATCAGTTTAA
ZFA16-KRAB (SEQ ID NO: 97)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: KRAB; Underlined: ZFA16
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCA
GTATCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTAACTCTTATTTGCAGTATCACCTGCGTACGCACCTGAGGGG
ATCCGGAGGCGGTGGAAGCGATGCTAAGTCACTAACTGCCTGGTCCCGGA
CACTGGTGACCTTCAAGGATGTATTTGTGGACTTCACCAGGGAGGAGTG
GAAGCTGCTGGACACTGCTCAGCAGATCGTGTACAGAAATGTGATGCTG
GAGAACTATAAGAACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGA
TGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAG
AGAAATTCACCAAGAGACCCATCCTGATTCAGAGACTGCATTTGAAATCA
AATCATCAGTTTAA
EWS-ZFA1 (SEQ ID NO: 98)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: EWS; Underlined: ZFA1
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGT
GTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACC
CGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTT
CTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAG
AAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTT
GCAGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATA
TGCATGCGCAACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTAAAAACCC
ACACCGGTTCCCAGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCG
CGTCCACATCATTTGGACGCACATACCCGTACTCATACAGGTGAAAAACCCT
TTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGT
CACCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA2 (SEQ ID NO: 99)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: EWS; Underlined: ZFA2
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGT
GTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACC
CGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTT
CTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAG
AAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTT
GCAGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATA
TGCATGCGCAACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTAAAAACCC
ACACCGGTTCCCAGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCG
CGTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAACCCTT
TCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTC
ACCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA3 (SEQ ID NO: 100)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: EWS; Underlined: ZFA3
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGT
GTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACC
CGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTT
CTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAG
AAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTT
GTTGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATA
TGCATGCGCAACTTCAGTCGTAACTCTTATTTGCAGTATCACCTAAAAACCCA
CACCGGTTCCCAGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGC
GTCCACATCATTTGGACGCACATACCCGTACTCATACAGGTGAAAAACCCTTT
CAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCA
CCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA4 (SEQ ID NO: 101)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: EWS; Underlined: ZFA4
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGT
GTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACC
CGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTT
CTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAG
AAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTT
GTTGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATA
TGCATGCGCAACTTCAGTCGTAACTCTTATTTGCAGTATCACCTAAAAACCCA
CACCGGTTCCCAGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGC
GTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAACCCTTT
CAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCA
CCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA5 (SEQ ID NO: 102)
Bold and Italic: NLS; Italic and underlined: 3X HA; Bold: EWS; Underlined: ZFA5
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGT
GTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACC
CGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTT
CTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAG
AAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTT
GCAGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATA
TGCATGCGCAACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTAAAAACCC
ACACCGGTTCCCAGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCG
CGTCCACATCATTTGGACGCACATACCCGTACTCATACAGGTGAAAAACCCT
TTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGT
CACCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA6 (SEQ ID NO: 103)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA6
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGT
GTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACC
CGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTT
CTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAG
AAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTT
GCAGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATA
TGCATGCGCAACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTAAAAACCC
ACACCGGTTCCCAGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCG
CGTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAACCCTT
TCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTC
ACCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA7 (SEQ ID NO: 104)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA7
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGT
GTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACC
CGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTT
CTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAG
AAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTT
GTTGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATA
TGCATGCGCAACTTCAGTCGTAACTCTTATTTGCAGTATCACCTAAAAACCCA
CACCGGTTCCCAGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGC
GTCCACATCATTTGGACGCACATACCCGTACTCATACAGGTGAAAAACCCTTT
CAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCA
CCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA8 (SEQ ID NO: 105)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA8
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTCCAGT
GTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACC
CGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTT
CTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCACTGCGGCAGCCAG
AAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTT
GTTGCGTCATCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATA
TGCATGCGCAACTTCAGTCGTAACTCTTATTTGCAGTATCACCTAAAAACCCA
CACCGGTTCCCAGAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGC
GTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAACCCTTT
CAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCA
CCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA9 (SEQ ID NO: 106)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA9
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGT
GTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTA
CGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACT
TCAGTCGTCAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGCAGCCA
GAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATT
TGGACGCACATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGAT
CTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCC
ACACCGGTTCCCAGAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCC
CAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAGCCAT
TCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTCTTGG
CACCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA10 (SEQ ID NO: 107)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA10
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGT
GTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTA
CGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACT
TCAGTCGTCAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGCAGCCA
GAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATT
TGGACGCACATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGAT
CTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCC
ACACCGGTTCCCAGAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCC
CAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAGCCAT
TCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCAGTAT
CACCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA11 (SEQ ID NO: 108)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA11
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGT
GTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTA
CGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACT
TCAGTCGTCAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGCAGCCA
GAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATT
TGTTGAACCATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATC
TGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCA
CACCGGTTCCCAGAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCC
AGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAGCCATT
CCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTCTTGGC
ACCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA12 (SEQ ID NO: 109)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA12
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGT
GTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTA
CGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACT
TCAGTCGTCAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGCAGCCA
GAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATT
TGTTGAACCATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATC
TGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCA
CACCGGTTCCCAGAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCC
AGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAGCCATT
CCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCAGTATC
ACCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA13 (SEQ ID NO: 110)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA13
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGT
GTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTA
CGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACT
TCAGTCGTAACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGCAGCCA
GAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATT
TGGACGCACATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGAT
CTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCC
ACACCGGTTCCCAGAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCC
CAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAGCCAT
TCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTCTTGG
CACCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA14 (SEQ ID NO: 111)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA14
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGT
GTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTA
CGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACT
TCAGTCGTAACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGCAGCCA
GAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATT
TGGACGCACATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGAT
CTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCC
ACACCGGTTCCCAGAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCC
CAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAGCCAT
TCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCAGTAT
CACCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA15 (SEQ ID NO: 112)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA15
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGT
GTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTA
CGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACT
TCAGTCGTAACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGCAGCCA
GAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATT
TGTTGAACCATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATC
TGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCA
CACCGGTTCCCAGAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCC
AGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAGCCATT
CCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTCTTGGC
ACCTGCGTACGCACCTGAGGGGATCCTAA
EWS-ZFA16 (SEQ ID NO: 113)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA16
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCAGCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCA
GGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACC
ACCCAGGCATATGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTG
ATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTATGGGCAGACCGC
CTATGCAACTTCTTATGGACAGCCTCCCACTGGTTATACTACTCCAACTG
CCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATGGCACTGGTGCTTA
TGATACCACCACTGCTACAGTCACCACCACCCAGGCCTCCTATGCAGCT
CAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCTATGGGCAGCAGC
CAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAACAAGCCCACTGA
GACTAGTCAACCTCAATCTAGCACAGGGGGTTACAACCAGCCCAGCCTA
GGATATGGACAGAGTAACTACAGTTATCCCCAGGTACCTGGGAGCTACC
CCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTACCAGCTATTCC
TCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGCAGAACA
CCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCAACA
AAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACG
GGCAGCAGGGAGGCGGTGGAAGCTCTAGACCCGGAGAGCGCCCATTTCAGT
GTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTA
CGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACT
TCAGTCGTAACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGCAGCCA
GAAGCCATTCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATT
TGTTGAACCATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATC
TGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAACGTCATCTACGTACCCA
CACCGGTTCCCAGAAGCCATTTCAGTGTCGGATCTGTATGCGGAACTTCTCCC
AGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAGCCATT
CCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCAGTATC
ACCTGCGTACGCACCTGAGGGGATCCTAA
ZFA1-EWS (SEQ ID NO: 114)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA1
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCCACATCATTTGGACGCACATACCCGTACTCATACAGGTGAAAAA
CCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAA
ACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATC
TGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCA
CACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGT
CAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGTTCCCAGAAGCCAT
TCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCA
CATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCG
AAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAA
GCTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAG
GATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTA
TGGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACC
TATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTT
ATACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTA
TGGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAG
GCCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGC
CTATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGA
AACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACA
ACCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGT
ACCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCT
CCTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTA
CTCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGT
AGCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACC
CACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACA
GAGCAGCAGCTACGGGCAGCAGTAA
ZFA2-EWS (SEQ ID NO: 115)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA2
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAAC
CCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAA
CGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATCT
GTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCA
CACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGT
CAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGTTCCCAGAAGCCAT
TCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCA
CATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCG
AAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAA
GCTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAG
GATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTA
TGGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACC
TATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTT
ATACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTA
TGGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAG
GCCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGC
CTATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGA
AACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACA
ACCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGT
ACCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCT
CCTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTA
CTCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGT
AGCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACC
CACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACA
GAGCAGCAGCTACGGGCAGCAGTAA
ZFA3-EWS (SEQ ID NO: 116)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA3
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCCACATCATTTGGACGCACATACCCGTACTCATACAGGTGAAAAA
CCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAA
ACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATC
TGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCA
CACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGT
AACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCCAGAAGCCATT
CCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCAC
ATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGA
AATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGGG
ATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAAG
CTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGG
ATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTAT
GGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACCT
ATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTTA
TACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTAT
GGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAGG
CCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCC
TATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGAA
ACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACAA
CCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGTA
CCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCTC
CTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTAC
TCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTA
GCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCC
ACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAG
AGCAGCAGCTACGGGCAGCAGTAA
ZFA4-EWS (SEQ ID NO: 117)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA4
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCCACATCATTTGGACGCACATACCCGTACTCATACAGGTGAAAAA
CCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAA
ACGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATC
TGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCA
CACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGT
AACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCCAGAAGCCATT
CCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACC
ATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGA
AATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGGG
ATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAAG
CTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGG
ATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTAT
GGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACCT
ATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTTA
TACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTAT
GGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAGG
CCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCC
TATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGAA
ACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACAA
CCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGTA
CCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCTC
CTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTAC
TCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTA
GCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCC
ACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAG
AGCAGCAGCTACGGGCAGCAGTAA
ZFA5-EWS (SEQ ID NO: 118)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA5
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAAC
CCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAA
CGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATCT
GTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCAC
ACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTA
ACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCCAGAAGCCATTC
CAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACA
TACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAA
ATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGGGA
TCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAAGC
TGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGA
TATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTATG
GACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTA
TGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTTAT
ACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATG
GCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAGGC
CTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCT
ATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAA
CAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACAAC
CAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGTAC
CTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCC
TACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTACT
CTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAG
CTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCA
CCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGA
GCAGCAGCTACGGGCAGCAGTAA
ZFA6-EWS (SEQ ID NO: 119)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA6
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAAC
CCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAA
CGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATCT
GTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCA
CACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGT
CAGGACAACTTGTCTTGGCACCTAAAAACCCACACCGGTTCCCAGAAGCCAT
TCCAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAAC
CATACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCG
AAATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAA
GCTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAG
GATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTA
TGGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACC
TATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTT
ATACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTA
TGGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAG
GCCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGC
CTATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGA
AACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACA
ACCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGT
ACCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCT
CCTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTA
CTCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGT
AGCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACC
CACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACA
GAGCAGCAGCTACGGGCAGCAGTAA
ZFA7-EWS (SEQ ID NO: 120)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA7
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAAC
CCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAA
CGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATCT
GTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCAC
ACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTA
ACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCCAGAAGCCATTC
CAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACA
TACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAA
ATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGGGA
TCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAAGC
TGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGA
TATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTATG
GACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTA
TGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTTAT
ACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATG
GCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAGGC
CTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCT
ATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAA
CAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACAAC
CAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGTAC
CTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCC
TACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTACT
CTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAG
CTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCA
CCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGA
GCAGCAGCTACGGGCAGCAGTAA
ZFA8-EWS (SEQ ID NO: 121)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA8
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTCCAGTGTCGGATTTGCATGCGGAACTT
TTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCATACAGGTGAAAAAC
CCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGTCTGCACATTTGAAA
CGTCATCTACGTACCCACTGCGGCAGCCAGAAGCCATTTCAGTGTCGGATCT
GTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCAC
ACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTA
ACTCTTATTTGCAGTATCACCTAAAAACCCACACCGGTTCCCAGAAGCCATTC
CAGTGTCGGATTTGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCA
TACCCGTACTCATACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAA
ATTTCTCCCAGTCTGCACATTTGAAACGTCACCTGCGTACGCACCTGAGGGGA
TCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAAGC
TGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGGA
TATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTATG
GACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACCTA
TGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTTAT
ACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTATG
GCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAGGC
CTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCCT
ATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGAAA
CAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACAAC
CAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGTAC
CTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCC
TACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTACT
CTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAG
CTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCA
CCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGA
GCAGCAGCTACGGGCAGCAGTAA
ZFA9-EWS (SEQ ID NO: 122)
Bold and Italic: NLS, Italic and underlined: 3X HA, Bold: EWS, Underlined: ZFA9
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTC
TTGGCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAA
GCTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAG
GATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTA
TGGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACC
TATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTT
ATACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTA
TGGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAG
GCCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGC
CTATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGA
AACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACA
ACCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGT
ACCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCT
CCTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTA
CTCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGT
AGCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACC
CACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACA
GAGCAGCAGCTACGGGCAGCAGTAA
ZFA10-EWS (SEQ ID NO: 123)
Bold and italic: NLS; italic and underlined: 3X HA; bold: EWS; underlined: ZFA10
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTC
TTGGCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTAACTCTTATTTGCAGTATCACCTGCGTACGCACCTGAGGGG
ATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAAG
CTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGG
ATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTAT
GGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACCT
ATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTTA
TACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTAT
GGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAGG
CCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCC
TATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGAA
ACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACAA
CCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGTA
CCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCTC
CTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTAC
TCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTA
GCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCC
ACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAG
AGCAGCAGCTACGGGCAGCAGTAA
ZFA11-EWS (SEQ ID NO: 124)
Bold and italic: NLS; italic and underlined: 3X HA; bold: EWS; underlined: ZFA11
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTC
TTGGCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAA
GCTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAG
GATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTA
TGGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACC
TATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTT
ATACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTA
TGGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAG
GCCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGC
CTATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGA
AACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACA
ACCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGT
ACCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCT
CCTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTA
CTCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGT
AGCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACC
CACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACA
GAGCAGCAGCTACGGGCAGCAGTAA
ZFA12-EWS (SEQ ID NO: 125)
Bold and italic: NLS; italic and underlined: 3X HA; bold: EWS; underlined: ZFA12
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTC
TTGGCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTAACTCTTATTTGCAGTATCACCTGCGTACGCACCTGAGGGG
ATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAAG
CTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGG
ATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTAT
GGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACCT
ATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTTA
TACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTAT
GGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAGG
CCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCC
TATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGAA
ACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACAA
CCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGTA
CCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCTC
CTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTAC
TCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTA
GCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCC
ACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAG
AGCAGCAGCTACGGGCAGCAGTAA
ZFA13-EWS (SEQ ID NO: 126)
Bold and italic: NLS; italic and underlined: 3X HA; bold: EWS; underlined: ZFA13
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCA
GTATCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAA
GCTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAG
GATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTA
TGGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACC
TATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTT
ATACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTA
TGGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAG
GCCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGC
CTATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGA
AACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACA
ACCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGT
ACCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCT
CCTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTA
CTCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGT
AGCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACC
CACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACA
GAGCAGCAGCTACGGGCAGCAGTAA
ZFA14-EWS (SEQ ID NO: 127)
Bold and italic: NLS; italic and underlined: 3X HA; bold: EWS; underlined: ZFA14
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCA
GTATCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCCACATCATTTGGACGCACATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTAACTCTTATTTGCAGTATCACCTGCGTACGCACCTGAGGGG
ATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAAG
CTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGG
ATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTAT
GGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACCT
ATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTTA
TACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTAT
GGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAGG
CCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCC
TATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGAA
ACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACAA
CCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGTA
CCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCTC
CTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTAC
TCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTA
GCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCC
ACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAG
AGCAGCAGCTACGGGCAGCAGTAA
ZFA15-EWS (SEQ ID NO: 128)
Bold and italic: NLS; italic and underlined: 3X HA; bold: EWS; underlined: ZFA15
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCA
GTATCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCCAGGTAACTTGCAGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTCAGGACAACTTGTCTTGGCACCTGCGTACGCACCTGAGGG
GATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAA
GCTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAG
GATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTA
TGGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACC
TATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTT
ATACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTA
TGGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAG
GCCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGC
CTATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGA
AACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACA
ACCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGT
ACCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCT
CCTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTA
CTCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGT
AGCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACC
CACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACA
GAGCAGCAGCTACGGGCAGCAGTAA
ZFA16-EWS (SEQ ID NO: 129)
Bold and italic: NLS; italic and underlined: 3X HA; bold: EWS; underlined: ZFA16
ATGGACCCCAAGAAGAAGAGGAAAGTCTCGAGCTACCCATACGATGTTCCAGAT
TACGCTTATCCTTATGACGTACCTGACTATGCATACCCTTATGATGTACCAGACTAC
GCTTCTAGACCCGGAGAGCGCCCATTTCAGTGTCGGATCTGTATGCGGAACTT
CTCCCAGCGTGGTAACTTGTTGCGTCATCTACGTACGCACACCGGAGAGAAG
CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTAACTCTTATTTGCA
GTATCACCTAAAAACCCACACCGGCAGCCAGAAGCCATTCCAGTGTCGGATT
TGCATGCGGAACTTTTCGCGTCGTGCACATTTGTTGAACCATACCCGTACTCA
TACAGGTGAAAAACCCTTTCAGTGTCGGATCTGTATGCGAAATTTCTCCCAGT
CTGCACATTTGAAACGTCATCTACGTACCCACACCGGTTCCCAGAAGCCATTT
CAGTGTCGGATCTGTATGCGGAACTTCTCCCAGCGTGGTAACTTGTTGCGTCA
TCTACGTACGCACACCGGAGAGAAGCCATTCCAATGCCGAATATGCATGCGC
AACTTCAGTCGTAACTCTTATTTGCAGTATCACCTGCGTACGCACCTGAGGGG
ATCCGGAGGCGGTGGAAGCGCGTCCACGGATTACAGTACCTATAGCCAAG
CTGCAGCGCAGCAGGGCTACAGTGCTTACACCGCCCAGCCCACTCAAGG
ATATGCACAGACCACCCAGGCATATGGGCAACAAAGCTATGGAACCTAT
GGACAGCCCACTGATGTCAGCTATACCCAGGCTCAGACCACTGCAACCT
ATGGGCAGACCGCCTATGCAACTTCTTATGGACAGCCTCCCACTGGTTA
TACTACTCCAACTGCCCCCCAGGCATACAGCCAGCCTGTCCAGGGGTAT
GGCACTGGTGCTTATGATACCACCACTGCTACAGTCACCACCACCCAGG
CCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCTTATCCAGCC
TATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGGATGGAA
ACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTACAA
CCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGTA
CCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCTC
CTACCAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTAC
TCTCAGCAGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTA
GCTATGGTCAACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCC
ACCCCAAACTGGTTCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAG
AGCAGCAGCTACGGGCAGCAGTAA
KRAB (SEQ ID NO: 130)
GATGCTAAGTCACTAACTGCCTGGTCCCGGACACTGGTGACCTTCAAGGATG
TATTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGGACACTGCTCAGCA
GATCGTGTACAGAAATGTGATGCTGGAGAACTATAAGAACCTGGTTTCCTTG
GGTTATCAGCTTACTAAGCCAGATGTGATCCTCCGGTTGGAGAAGGGAGAAG
AGCCCTGGCTGGTGGAGAGAGAAATTCACCAAGAGACCCATCCTGATTCAGA
GACTGCATTTGAAATCAAATCATCAGTT
3X HA (SEQ ID NO: 131)
TACCCATACGATGTTCCAGATTACGCTTATCCTTATGACGTACCTGACTATGC
ATACCCTTATGATGTACCAGACTACGCT
NLS (SEQ ID NO: 132)
CCCAAGAAGAAGAGGAAAGTC
EWS (SEQ ID NO: 133)
GCGTCCACGGATTACAGTACCTATAGCCAAGCTGCAGCGCAGCAGGGCTACA
GTGCTTACACCGCCCAGCCCACTCAAGGATATGCACAGACCACCCAGGCATA
TGGGCAACAAAGCTATGGAACCTATGGACAGCCCACTGATGTCAGCTATACC
CAGGCTCAGACCACTGCAACCTATGGGCAGACCGCCTATGCAACTTCTTATG
GACAGCCTCCCACTGGTTATACTACTCCAACTGCCCCCCAGGCATACAGCCA
GCCTGTCCAGGGGTATGGCACTGGTGCTTATGATACCACCACTGCTACAGTCA
CCACCACCCAGGCCTCCTATGCAGCTCAGTCTGCATATGGCACTCAGCCTGCT
TATCCAGCCTATGGGCAGCAGCCAGCAGCCACTGCACCTACAAGACCGCAGG
ATGGAAACAAGCCCACTGAGACTAGTCAACCTCAATCTAGCACAGGGGGTTA
CAACCAGCCCAGCCTAGGATATGGACAGAGTAACTACAGTTATCCCCAGGTA
CCTGGGAGCTACCCCATGCAGCCAGTCACTGCACCTCCATCCTACCCTCCTAC
CAGCTATTCCTCTACACAGCCGACTAGTTATGATCAGAGCAGTTACTCTCAGC
AGAACACCTATGGGCAACCGAGCAGCTATGGACAGCAGAGTAGCTATGGTCA
ACAAAGCAGCTATGGGCAGCAGCCTCCCACTAGTTACCCACCCCAAACTGGT
TCCTACAGCCAAGCTCCAAGTCAATATAGCCAACAGAGCAGCAGCTACGGGC
AGCAG
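By way of non-limiting illustration, the short tag sequences listed above can be checked by translating them with the standard genetic code. The following Python sketch is not part of the original disclosure; the function name `translate` and the way the codon table is built are assumptions chosen for illustration. Under the standard code, the NLS (SEQ ID NO: 132) translates to PKKKRKV and the 3X HA sequence (SEQ ID NO: 131) translates to three consecutive copies of YPYDVPDYA.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
# (Illustrative helper; not part of the original sequence listing.)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string, ignoring whitespace and case."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Sequences copied from SEQ ID NO: 132 (NLS) and SEQ ID NO: 131 (3X HA).
NLS = "CCCAAGAAGAAGAGGAAAGTC"
HA3 = (
    "TACCCATACGATGTTCCAGATTACGCTTATCCTTATGACGTACCTGACTATGC"
    "ATACCCTTATGATGTACCAGACTACGCT"
)

print(translate(NLS))  # PKKKRKV
print(translate(HA3))  # YPYDVPDYAYPYDVPDYAYPYDVPDYA
```

The same helper can be applied to any of the in-frame construct sequences above, provided line breaks and spaces introduced by the listing layout are removed first, which `translate` does on input.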
ZFA recognition helices:
QSAHLKR (SEQ ID NO: 138)
RPHHLDA (SEQ ID NO: 139)
RQDNLSW (SEQ ID NO: 140)
QPGNLQR (SEQ ID NO: 141)
RRAHLLN (SEQ ID NO: 142)
RNSYLQY (SEQ ID NO: 143)
QRGNLLR (SEQ ID NO: 144)
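As an illustrative sketch only, the recognition helices listed above can be located within a translated finger-coding region. The example assumes the standard genetic code and uses a 144-nucleotide in-frame fragment copied from the first two finger-coding units of ZFA9-EWS (SEQ ID NO: 122); the helper names `translate` and `locate_helices` are hypothetical and are not taken from the disclosure.

```python
from itertools import product

# Standard genetic code (illustrative helper, TCAG ordering).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Recognition helices keyed by SEQ ID NO, as listed above.
HELICES = {
    138: "QSAHLKR", 139: "RPHHLDA", 140: "RQDNLSW", 141: "QPGNLQR",
    142: "RRAHLLN", 143: "RNSYLQY", 144: "QRGNLLR",
}

# In-frame fragment covering the first two fingers of ZFA9 (SEQ ID NO: 122).
FRAGMENT = (
    "CCATTTCAGTGTCGGATCTGTATGCGGAACTT"
    "CTCCCAGCCAGGTAACTTGCAGCGTCATCTACGTACGCACACCGGAGAGAAG"
    "CCATTCCAATGCCGAATATGCATGCGCAACTTCAGTCGTCAGGACAACTTGTC"
    "TTGGCAC"
)


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the standard genetic code."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def locate_helices(protein: str):
    """Return (first position, SEQ ID NO, helix) for each helix present, sorted by position."""
    hits = [(protein.find(h), sid, h) for sid, h in HELICES.items() if h in protein]
    return sorted(hits)


protein = translate(FRAGMENT)
for pos, sid, helix in locate_helices(protein):
    print(f"{helix} (SEQ ID NO: {sid}) at residue offset {pos}")
```

In this fragment the sketch reports QPGNLQR (SEQ ID NO: 141) followed by RQDNLSW (SEQ ID NO: 140). Only the first occurrence of each helix is reported, so a full six-finger array in which a helix recurs would need a scan over all matches instead.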
References
Aksenova, Anna Y., and Sergei M. Mirkin. 2019. “At the Beginning of the End and in the Middle of the Beginning: Structure and Maintenance of Telomeric DNA Repeats and Interstitial Telomeric Sequences.” Genes
Beerli, R. R., D. J. Segal, B. Dreier, and C. F. Barbas. 1998. “Toward Controlling Gene Expression at Will: Specific Regulation of the erbB-2/HER-2 Promoter by Using Polydactyl Zinc Finger Proteins Constructed from Modular Building Blocks.” Proceedings of the National Academy of Sciences
Boch, Jens, Heidi Scholze, Sebastian Schornack, Angelika Landgraf, Simone Hahn, Sabine Kay, Thomas Lahaye, Anja Nickstadt, and Ulla Bonas. 2009. “Breaking the Code of DNA Binding Specificity of TAL-Type III Effectors.” Science 326 (5959): 1509-12.
Boulay, Gaylor, Gabriel J. Sandoval, Nicolo Riggi, Sowmya Iyer, Remi Buisson, Beverly Naigles, Mary E. Awad, et al. 2017. “Cancer-Specific Retargeting of BAF Complexes by a Prion-like Domain.” Cell 171 (1): 163-78.e19.
Boulay, Gaylor, Angela Volorio, Sowmya Iyer, Liliane C. Broye, Ivan Stamenkovic, Nicolo Riggi, and Miguel N. Rivera. 2018. “Epigenome Editing of Microsatellite Repeats Defines Tumor-Specific Enhancer Functions and Dependencies.” Genes & Development 32 (15-16): 1008-19.
“Mechanisms of action of key genetic abnormality in Ewing sarcoma.” August 20, 2018. sciencedaily.com/releases/2018/08/180802115649.htm
Burns, Kathleen H. 2017. “Transposable Elements in Cancer.” Nature Reviews. Cancer 17 (7): 415-24.
Chong, Shasha, Claire Dugast-Darzacq, Zhe Liu, Peng Dong, Gina M. Dailey, Claudia Cattoglio, Alec Heckert, et al. 2018. “Imaging Dynamic and Selective Low-Complexity Domain Interactions That Control Gene Transcription.” Science 361 (6400): eaar2555.
Christian, Michelle L., Zachary L. Demorest, Colby G. Starker, Mark J. Osborn, Michael D. Nyquist, Yong Zhang, Daniel F. Carlson, Philip Bradley, Adam J. Bogdanove, and Daniel F. Voytas. 2012. “Targeting G with TAL Effectors: A Comparison of Activities of TALENs Constructed with NN and NK Repeat Variable Di-Residues.” PloS One 7 (9): e45383.
Delattre, O., J. Zucman, B. Plougastel, C. Desmaze, T. Melot, M. Peter, H. Kovar, I. Joubert, P. de Jong, and G. Rouleau. 1992. “Gene Fusion with an ETS DNA-Binding Domain Caused by Chromosome Translocation in Human Tumours.” Nature 359 (6391): 162-65.
Deng, D., C. Yan, X. Pan, M. Mahfouz, and J. Wang. 2012. “Structural Basis for Sequence-Specific Recognition of DNA by TAL Effectors.” science.sciencemag.org/content/335/6069/720.abstract.
Dobin, Alexander, Carrie A. Davis, Felix Schlesinger, Jorg Drenkow, Chris Zaleski, Sonali Jha, Philippe Batut, Mark Chaisson, and Thomas R. Gingeras. 2013. “STAR: Ultrafast Universal RNA-Seq Aligner.” Bioinformatics 29 (1): 15-21.
ENCODE Project Consortium. 2012. “An Integrated Encyclopedia of DNA Elements in the Human Genome.” Nature 489 (7414): 57-74.
Fuentes, Daniel R., Tomek Swigut, and Joanna Wysocka. 2018. “Systematic Perturbation of Retroviral LTRs Reveals Widespread Long-Range Effects on Human Gene Regulation.” eLife 7 (August): e35989.
Fu, Fengli, Jeffry D. Sander, Morgan Maeder, Stacey Thibodeau-Beganny, J. Keith Joung, Drena Dobbs, Leslie Miller, and Daniel F. Voytas. 2009. “Zinc Finger Database (ZiFDB): A Repository for Information on C2H2 Zinc Fingers and Engineered Zinc-Finger Arrays.” Nucleic Acids Research 37 (Database issue): D279-83.
Gangwal, Kunal, Savita Sankar, Peter C. Hollenhorst, Michelle Kinsey, Stephen C. Haroldsen, Atul A. Shah, Kenneth M. Boucher, et al. 2008. “Microsatellites as EWS/FLI Response Elements in Ewing’s Sarcoma.” Proceedings of the National Academy of Sciences of the United States of America 105 (29): 10149-54.
Gao, Xuefei, Jason C. H. Tsang, Fortis Gaba, Donghai Wu, Liming Lu, and Pentao Liu. 2014. “Comparison of TALE Designer Transcription Factors and the CRISPR/dCas9 in Regulation of Gene Expression by Targeting Enhancers.” Nucleic Acids Research 42 (20): e155.
Gersbach, Charles A., Thomas Gaj, and Carlos F. Barbas 3rd. 2014. “Synthetic Zinc Finger Proteins: The Advent of Targeted Gene Regulation and Genome Modification Technologies.” Accounts of Chemical Research 47 (8): 2309-18.
Gräslund, Torbjörn, Xuelin Li, Laurent Magnenat, Mikhail Popkov, and Carlos F. Barbas 3rd. 2005. “Exploring Strategies for the Design of Artificial Transcription Factors: Targeting Sites Proximal to Known Regulatory Regions for the Induction of Gamma-Globin Expression and the Treatment of Sickle Cell Disease.” The Journal of Biological Chemistry 280 (5): 3707-14.
Groner, Anna C., Sylvain Meylan, Angela Ciuffi, Nadine Zangger, Giovanna Ambrosini, Nicolas Denervaud, Philipp Bucher, and Didier Trono. 2010. “KRAB-Zinc Finger Proteins and KAP1 Can Mediate Long-Range Transcriptional Repression through Heterochromatin Spreading.” PLoS Genetics 6 (3): e1000869.
Guillon, Noelle, Franck Tirode, Valentina Boeva, Andrei Zynovyev, Emmanuel Barillot, and Olivier Delattre. 2009. “The Oncogenic EWS-FLI1 Protein Binds in Vivo GGAA Microsatellite Sequences with Potential Transcriptional Activation Function.” PloS One 4 (3): e4932.
Heinz, Sven, Christopher Benner, Nathanael Spann, Eric Bertolino, Yin C. Lin, Peter Laslo, Jason X. Cheng, Cornelis Murre, Harinder Singh, and Christopher K. Glass. 2010. “Simple Combinations of Lineage-Determining Transcription Factors Prime Cis-Regulatory Elements Required for Macrophage and B Cell Identities.” Molecular Cell 38 (4): 576-89.
Holtzman, Liad, and Charles A. Gersbach. 2018. “Editing the Epigenome: Reshaping the Genomic Landscape.” Annual Review of Genomics and Human Genetics 19 (August): 43-71.
Jachowicz, Joanna W., Xinyang Bing, Julien Pontabry, Ana Boskovic, Oliver J. Rando, and Maria-Elena Torres-Padilla. 2017. “LINE-1 Activation after Fertilization Regulates Global Chromatin Accessibility in the Early Mouse Embryo.” Nature Genetics 49 (10): 1502-10.
Joung, J. Keith, Daniel F. Voytas, and Joanne Kamens. 2015. “Accelerating Research through Reagent Repositories: The Genome Editing Example.” Genome Biology 16 (November): 255.
Lander, E. S., L. M. Linton, B. Birren, C. Nusbaum, M. C. Zody, J. Baldwin, K. Devon, et al. 2001. “Initial Sequencing and Analysis of the Human Genome.” Nature 409 (6822): 860-921.
Liao, Yang, Gordon K. Smyth, and Wei Shi. 2014. “featureCounts: An Efficient General Purpose Program for Assigning Sequence Reads to Genomic Features.” Bioinformatics 30 (7): 923-30.
Li, Heng, and Richard Durbin. 2009. “Fast and Accurate Short Read Alignment with Burrows-Wheeler Transform.” Bioinformatics 25 (14): 1754-60.
Love, Michael I., Wolfgang Huber, and Simon Anders. 2014. “Moderated Estimation of Fold Change and Dispersion for RNA-Seq Data with DESeq2.” Genome Biology 15 (12): 550.
Maeder, Morgan L., Samantha J. Linder, Vincent M. Cascio, Yanfang Fu, Quan H. Ho, and J. Keith Joung. 2013. “CRISPR RNA-Guided Activation of Endogenous Human Genes.” Nature Methods 10 (10): 977-79.
Maeder, Morgan L., Stacey Thibodeau-Beganny, Anna Osiak, David A. Wright, Reshma M. Anthony, Magdalena Eichtinger, Tao Jiang, et al. 2008. “Rapid ‘Open-Source’ Engineering of Customized Zinc-Finger Nucleases for Highly Efficient Gene Modification.” Molecular Cell 31 (2): 294-301.
Maeder, Morgan L., Stacey Thibodeau-Beganny, Jeffry D. Sander, Daniel F. Voytas, and J. Keith Joung. 2009. “Oligomerized Pool Engineering (OPEN): An ‘Open-Source’ Protocol for Making Customized Zinc-Finger Arrays.” Nature Protocols 4 (10): 1471-1501.
Malik, Indranil, Chase P. Kelley, Eric T. Wang, and Peter K. Todd. 2021. “Molecular Mechanisms Underlying Nucleotide Repeat Expansion Disorders.” Nature Reviews. Molecular Cell Biology 22 (9): 589-607.
Margolin, J. F., J. R. Friedman, W. K. Meyer, H. Vissing, H. J. Thiesen, and F. J. Rauscher 3rd. 1994. “Krüppel-Associated Boxes Are Potent Transcriptional Repression Domains.” Proceedings of the National Academy of Sciences of the United States of America 91 (10): 4509-13.
Mikkelsen, Tarjei S., Manching Ku, David B. Jaffe, Biju Issac, Erez Lieberman, Georgia Giannoukos, Pablo Alvarez, et al. 2007. “Genome-Wide Maps of Chromatin State in Pluripotent and Lineage-Committed Cells.” Nature 448 (7153): 553-60.
Moore, M., A. Klug, and Y. Choo. 2001. “Improved DNA Binding Specificity from Polyzinc Finger Peptides by Using Strings of Two-Finger Units.” Proceedings of the National Academy of Sciences of the United States of America 98 (4): 1437-41.
Moscou, Matthew J., and Adam J. Bogdanove. 2009. “A Simple Cipher Governs DNA Recognition by TAL Effectors.” Science 326 (5959): 1501.
Payer, Lindsay M., and Kathleen H. Burns. 2019. “Transposable Elements in Human Genetic Disease.” Nature Reviews. Genetics 20 (12): 760-72.
Pearson, Helen. 2008. “Protein Engineering: The Fate of Fingers.” Nature 455 (7210): 160-64.
Pehrsson, Erica C., Mayank N. K. Choudhary, Vasavi Sundaram, and Ting Wang. 2019. “The Epigenomic Landscape of Transposable Elements across Normal Human Development and Anatomy.” Nature Communications 10 (1): 5640.
Perez-Pinera, Pablo, D. Dewran Kocak, Christopher M. Vockley, Andrew F. Adler, Ami M. Kabadi, Lauren R. Polstein, Pratiksha I. Thakore, et al. 2013. “RNA-Guided Gene Activation by CRISPR-Cas9-Based Transcription Factors.” Nature Methods 10 (10): 973-76.
Pohl, Andy, and Miguel Beato. 2014. “Bwtool: A Tool for bigWig Files.” Bioinformatics 30 (11): 1618-19.
Qi, Lei S., Matthew H. Larson, Luke A. Gilbert, Jennifer A. Doudna, Jonathan S. Weissman, Adam P. Arkin, and Wendell A. Lim. 2013. “Repurposing CRISPR as an RNA-Guided Platform for Sequence-Specific Control of Gene Expression.” Cell 152 (5): 1173-83.
Quinlan, Aaron R., Royden A. Clark, Svetlana Sokolova, Mitchell L. Leibowitz, Yujun Zhang, Matthew E. Hurles, Joshua C. Mell, and Ira M. Hall. 2010. “Genome-Wide Mapping and Assembly of Structural Variant Breakpoints in the Mouse Genome.” Genome Research 20 (5): 623-35.
Riggi, Nicolo, Birgit Knoechel, Shawn M. Gillespie, Esther Rheinbay, Gaylor Boulay, Mario L. Suva, Nikki E. Rossetti, et al. 2014. “EWS-FLI1 Utilizes Divergent Chromatin Remodeling Mechanisms to Directly Activate or Repress Enhancer Elements in Ewing Sarcoma.” Cancer Cell 26 (5): 668-81.
Sander, Jeffry D., Elizabeth J. Dahlborg, Mathew J. Goodwin, Lindsay Cade, Feng Zhang, Daniel Cifuentes, Shaun J. Curtin, et al. 2011. “Selection-Free Zinc-Finger-Nuclease Engineering by Context-Dependent Assembly (CoDA).” Nature Methods 8 (1): 67-69.
Sander, Jeffry D., and J. Keith Joung. 2014. “CRISPR-Cas Systems for Editing, Regulating and Targeting Genomes.” Nature Biotechnology 32 (4): 347-55.
Sander, Jeffry D., Morgan L. Maeder, Deepak Reyon, Daniel F. Voytas, J. Keith Joung, and Drena Dobbs. 2010. “ZiFiT (Zinc Finger Targeter): An Updated Zinc Finger Engineering Tool.” Nucleic Acids Research 38 (Web Server issue): W462-68.
Sawaya, Sterling, Andrew Bagshaw, Emmanuel Buschiazzo, Pankaj Kumar, Shantanu Chowdhury, Michael A. Black, and Neil Gemmell. 2013. “Microsatellite Tandem Repeats Are Abundant in Human Promoters and Are Associated with Regulatory Elements.” PloS One 8 (2): e54710.
Scholze, Heidi, and Jens Boch. 2011. “TAL Effectors Are Remote Controls for Gene Activation.” Current Opinion in Microbiology 14 (1): 47-53.
Streubel, Jana, Christina Blucher, Angelika Landgraf, and Jens Boch. 2012. “TAL Effector RVD Specificities and Efficiencies.” Nature Biotechnology 30 (7): 593-95.
Subramanian, Subbaya, Rakesh K. Mishra, and Lalji Singh. 2003. “Genome-Wide Analysis of Microsatellite Repeats in Humans: Their Abundance and Density in Specific Genomic Regions.” Genome Biology 4 (2): R13.
Tak, Y. Esther, Benjamin P. Kleinstiver, James K. Nuñez, Jonathan Y. Hsu, Joy E. Horng, Jingyi Gong, Jonathan S. Weissman, and J. Keith Joung. 2017. “Inducible and Multiplex Gene Regulation Using CRISPR-Cpf1-Based Transcription Factors.” Nature Methods 14 (12): 1163-66.
Ting, David T., Doron Lipson, Suchismita Paul, Brian W. Brannigan, Sara Akhavanfard, Erik J. Coffman, Gianmarco Contino, et al. 2011. “Aberrant Overexpression of Satellite Repeats in Pancreatic and Other Epithelial Cancers.” Science 331 (6017): 593-96.
Trost, Brett, Worrawat Engchuan, Charlotte M. Nguyen, Bhooma Thiruvahindrapuram, Egor Dolzhenko, Ian Backstrom, Mila Mirceta, et al. 2020. “Genome-Wide Detection of Tandem DNA Repeats That Are Expanded in Autism.” Nature 586 (7827): 80-86.
Usdin, Karen. 2008. “The Biological Effects of Simple Tandem Repeats: Lessons from the Repeat Expansion Diseases.” Genome Research 18 (7): 1011-19.
Wright, David A., Stacey Thibodeau-Beganny, Jeffry D. Sander, Ronnie J. Winfrey, Andrew S. Hirsh, Magdalena Eichtinger, Fengli Fu, et al. 2006. “Standardized Reagents and Protocols for Engineering Zinc Finger Nucleases by Modular Assembly.” Nature Protocols 1 (3): 1637-52.
Yarrington, Robert M., Surbhi Verma, Shaina Schwartz, Jonathan K. Trautman, and Dana Carroll. 2018. “Nucleosomes Inhibit Target Cleavage by CRISPR-Cas9 in Vivo.” Proceedings of the National Academy of Sciences of the United States of America 115 (38): 9351-58.
Zeitler, Bryan, Steven Froelich, Kimberly Marlen, David A. Shivak, Qi Yu, Davis Li, Jocelynn R. Pearl, et al. 2019. “Allele-Selective Transcriptional Repression of Mutant HTT for the Treatment of Huntington’s Disease.” Nature Medicine 25 (7): 1131-42.
Zhang, Yong, Tao Liu, Clifford A. Meyer, Jerome Eeckhoute, David S. Johnson, Bradley E. Bernstein, Chad Nusbaum, et al. 2008. “Model-Based Analysis of ChIP-Seq (MACS).” Genome Biology 9 (9): R137.
OTHER EMBODIMENTS
It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
Claims
1. An engineered zinc finger array comprising 6 zinc finger recognition regions, wherein the zinc finger array binds a target sequence of GGAAGGAAGGAAGGAAGG (SEQ ID NO:4) or AAGGAAGGAAGGAAGGAA (SEQ ID NO: 5).
2. The engineered zinc finger array of claim 1, wherein the engineered zinc finger array comprises at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to an amino acid sequence set forth in any one of SEQ ID NOs:24-39.
3. The engineered zinc finger array of claim 1, wherein the engineered zinc finger array comprises the amino acid sequence set forth in SEQ ID NO:30.
4. An isolated cell comprising the zinc finger array according to any one of claims 1 to 3.
5. An isolated nucleic acid encoding the zinc finger array according to any one of claims 1 to 3.
6. A vector comprising the isolated nucleic acid of claim 5.
7. A fusion protein comprising the zinc finger array according to any one of claims 1 to 3 fused to a heterologous functional domain, with an optional intervening linker, wherein the linker does not interfere with activity of the fusion protein.
8. The fusion protein of claim 7, wherein the heterologous functional domain is a transcriptional silencer or transcriptional repression domain.
9. The fusion protein of claim 8, wherein the transcriptional repression domain is a Krueppel-associated box (KRAB) domain, ERF repressor domain (ERD), or mSin3A interaction domain (SID).
10. The fusion protein of claim 8, wherein the transcriptional silencer is Heterochromatin Protein 1 (HP1).
11. An isolated cell comprising the fusion protein according to any one of claims 7 to 10.
12. An isolated nucleic acid encoding the fusion protein according to any one of claims 7 to 10.
13. A vector comprising the isolated nucleic acid of claim 12.
14. A method of reducing aberrant gene expression driven by activation of GGAA-microsatellites in a cell, the method comprising contacting the cell with an effective amount of the fusion protein of claims 7-10, or the isolated nucleic acid of claim 12.
15. A method of treating a subject who has a disease associated with aberrant gene expression driven by activation of GGAA-microsatellites in a cell, the method comprising administering to the subject an effective amount of a composition comprising the fusion protein of claims 7-10, or the isolated nucleic acid of claim 12.
16. The method of claims 14 or 15, wherein the subject has Ewing sarcoma.
17. The method of claim 15, wherein the composition is administered by injection into or near a tumor, or by application after surgical resection.
18. The method of claim 15, wherein the composition is administered by injection into or near a tumor, or by application before surgical resection.
19. The method of any one of claims 15-18, further comprising treating a subject with one or more chemotherapy agents.
20. The method of claim 19, wherein the chemotherapy is one of vincristine, doxorubicin, cyclophosphamide, ifosfamide, etoposide, or a combination thereof.
21. The method of any one of claims 15-20, wherein the composition is administered before radiation.
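For illustration only, the two 18-bp target sequences recited in claim 1 can be related to a GGAA microsatellite with a short Python sketch. The sketch is not part of the claims or the disclosure. It assumes, as a common approximation for C2H2 zinc fingers, that each of the six recognition regions contacts a 3-bp subsite; the function names and the six-unit example repeat are hypothetical.

```python
# Target sequences copied from claim 1.
SEQ_ID_4 = "GGAAGGAAGGAAGGAAGG"
SEQ_ID_5 = "AAGGAAGGAAGGAAGGAA"


def find_sites(dna: str, target: str) -> list[int]:
    """Return 0-based start offsets of every (possibly overlapping) match of target in dna."""
    dna = dna.upper()
    return [i for i in range(len(dna) - len(target) + 1) if dna.startswith(target, i)]


def triplets(target: str) -> list[str]:
    """Split a target into consecutive 3-bp subsites (assumed one per finger)."""
    return [target[i:i + 3] for i in range(0, len(target), 3)]


# Hypothetical example repeat: six GGAA units (24 bp).
microsatellite = "GGAA" * 6

print(triplets(SEQ_ID_4))                    # six 3-bp subsites
print(find_sites(microsatellite, SEQ_ID_4))  # [0, 4]
print(find_sites(microsatellite, SEQ_ID_5))  # [2, 6]
```

The two targets are the same repeat read in registers offset by 2 bp, so within a GGAA microsatellite each target recurs every 4 bp, and longer repeats offer proportionally more overlapping sites.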
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202263327175P | 2022-04-04 | 2022-04-04 | |
US63/327,175 | 2022-04-04 |
Publications (3)
Publication Number | Publication Date |
---|---|
WO2023196220A2 (en) | 2023-10-12 |
WO2023196220A3 (en) | 2023-11-23 |
WO2023196220A9 (en) | 2024-02-01 |
Family
ID=88243374
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2023/017257 WO2023196220A2 (en) | 2022-04-04 | 2023-04-03 | Method for genome-wide functional perturbation of human microsatellites using engineered zinc fingers |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2023196220A2 (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10465187B2 (en) * | 2017-02-06 | 2019-11-05 | Trustees Of Boston University | Integrated system for programmable DNA methylation |
US20220193210A1 (en) * | 2018-02-02 | 2022-06-23 | Danmarks Tekniske Universitet | Therapeutics for autoimmune kidney disease: synthetic antigens |
US11041155B2 (en) * | 2018-05-17 | 2021-06-22 | The General Hospital Corporation | CCCTC-binding factor variants |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 23785217 Country of ref document: EP Kind code of ref document: A2 |