WO2014172470A2 - Methods of mutating, modifying or modulating nucleic acid in a cell or nonhuman mammal - Google Patents
- Publication number
- WO2014172470A2 (PCT/US2014/034387)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- nucleic acid
- target nucleic
- acid sequences
- protein
- cell
- 150000007523 nucleic acids Chemical class 0.000 title claims abstract description 346
- 238000000034 method Methods 0.000 title claims abstract description 283
- 241000124008 Mammalia Species 0.000 title claims description 56
- 102000039446 nucleic acids Human genes 0.000 title claims description 55
- 108020004707 nucleic acids Proteins 0.000 title claims description 55
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 438
- 210000004027 cell Anatomy 0.000 claims abstract description 364
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 179
- 230000035772 mutation Effects 0.000 claims abstract description 145
- 230000014509 gene expression Effects 0.000 claims abstract description 142
- 108091033409 CRISPR Proteins 0.000 claims abstract description 121
- 108091028043 Nucleic acid sequence Proteins 0.000 claims abstract description 115
- 210000000130 stem cell Anatomy 0.000 claims abstract description 78
- 230000000694 effects Effects 0.000 claims abstract description 74
- 241000282414 Homo sapiens Species 0.000 claims abstract description 63
- 230000000295 complement effect Effects 0.000 claims abstract description 39
- 230000027455 binding Effects 0.000 claims abstract description 35
- 229920002477 rna polymer Polymers 0.000 claims abstract description 31
- 101710163270 Nuclease Proteins 0.000 claims abstract description 14
- 238000010354 CRISPR gene editing Methods 0.000 claims abstract 6
- 235000018102 proteins Nutrition 0.000 claims description 176
- 241000699666 Mus <mouse, genus> Species 0.000 claims description 131
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims description 99
- 239000012636 effector Substances 0.000 claims description 96
- 201000010099 disease Diseases 0.000 claims description 94
- 108020004414 DNA Proteins 0.000 claims description 81
- 210000001161 mammalian embryo Anatomy 0.000 claims description 73
- 239000002773 nucleotide Substances 0.000 claims description 53
- 125000003729 nucleotide group Chemical group 0.000 claims description 53
- 230000008672 reprogramming Effects 0.000 claims description 51
- 108091034117 Oligonucleotide Proteins 0.000 claims description 42
- 241000894007 species Species 0.000 claims description 38
- 238000003780 insertion Methods 0.000 claims description 33
- 230000037431 insertion Effects 0.000 claims description 33
- 108020001507 fusion proteins Proteins 0.000 claims description 32
- 102000037865 fusion proteins Human genes 0.000 claims description 32
- 235000001014 amino acid Nutrition 0.000 claims description 30
- 150000001413 amino acids Chemical group 0.000 claims description 30
- 239000013612 plasmid Substances 0.000 claims description 28
- 108700019146 Transgenes Proteins 0.000 claims description 27
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 27
- 108091026890 Coding region Proteins 0.000 claims description 25
- 102000040945 Transcription factor Human genes 0.000 claims description 25
- 108091023040 Transcription factor Proteins 0.000 claims description 24
- 230000004913 activation Effects 0.000 claims description 23
- 230000001105 regulatory effect Effects 0.000 claims description 23
- 108010077544 Chromatin Proteins 0.000 claims description 22
- 229940024606 amino acid Drugs 0.000 claims description 22
- 210000003483 chromatin Anatomy 0.000 claims description 22
- 108010033040 Histones Proteins 0.000 claims description 21
- 101100247004 Rattus norvegicus Qsox1 gene Proteins 0.000 claims description 20
- 102000004190 Enzymes Human genes 0.000 claims description 19
- 108090000790 Enzymes Proteins 0.000 claims description 19
- 238000013518 transcription Methods 0.000 claims description 19
- 230000035897 transcription Effects 0.000 claims description 19
- 238000005215 recombination Methods 0.000 claims description 17
- 230000001939 inductive effect Effects 0.000 claims description 16
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 16
- 230000002103 transcriptional effect Effects 0.000 claims description 16
- 108010091086 Recombinases Proteins 0.000 claims description 14
- 102000018120 Recombinases Human genes 0.000 claims description 14
- 108700009124 Transcription Initiation Site Proteins 0.000 claims description 13
- 230000003213 activating effect Effects 0.000 claims description 13
- 229920001184 polypeptide Polymers 0.000 claims description 13
- 230000023603 positive regulation of transcription initiation, DNA-dependent Effects 0.000 claims description 13
- 230000006916 protein interaction Effects 0.000 claims description 13
- 230000006798 recombination Effects 0.000 claims description 13
- 241000283984 Rodentia Species 0.000 claims description 12
- 210000001671 embryonic stem cell Anatomy 0.000 claims description 12
- 102000053602 DNA Human genes 0.000 claims description 11
- 210000004263 induced pluripotent stem cell Anatomy 0.000 claims description 11
- 102000005962 receptors Human genes 0.000 claims description 11
- 108020003175 receptors Proteins 0.000 claims description 11
- 239000003102 growth factor Substances 0.000 claims description 10
- 238000009396 hybridization Methods 0.000 claims description 10
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 claims description 9
- 108700020796 Oncogene Proteins 0.000 claims description 9
- 235000004279 alanine Nutrition 0.000 claims description 9
- 238000001514 detection method Methods 0.000 claims description 9
- 241000283690 Bos taurus Species 0.000 claims description 8
- 108010068250 Herpes Simplex Virus Protein Vmw65 Proteins 0.000 claims description 8
- 102000013566 Plasminogen Human genes 0.000 claims description 8
- 108010051456 Plasminogen Proteins 0.000 claims description 8
- 241000288906 Primates Species 0.000 claims description 8
- 108700020978 Proto-Oncogene Proteins 0.000 claims description 8
- 102000052575 Proto-Oncogene Human genes 0.000 claims description 8
- 210000005260 human cell Anatomy 0.000 claims description 8
- 230000008836 DNA modification Effects 0.000 claims description 7
- 102100038895 Myc proto-oncogene protein Human genes 0.000 claims description 7
- 210000000349 chromosome Anatomy 0.000 claims description 7
- 108700026220 vif Genes Proteins 0.000 claims description 7
- 241000282465 Canis Species 0.000 claims description 6
- 108010012236 Chemokines Proteins 0.000 claims description 6
- 102000019034 Chemokines Human genes 0.000 claims description 6
- 241000282324 Felis Species 0.000 claims description 6
- 108020004682 Single-Stranded DNA Proteins 0.000 claims description 6
- 239000012190 activator Substances 0.000 claims description 6
- 239000002299 complementary DNA Substances 0.000 claims description 6
- 239000003623 enhancer Substances 0.000 claims description 6
- 239000003607 modifier Substances 0.000 claims description 6
- 238000011144 upstream manufacturing Methods 0.000 claims description 6
- 102000000905 Cadherin Human genes 0.000 claims description 5
- 108050007957 Cadherin Proteins 0.000 claims description 5
- 108010078791 Carrier Proteins Proteins 0.000 claims description 5
- 108010076667 Caspases Proteins 0.000 claims description 5
- 102000011727 Caspases Human genes 0.000 claims description 5
- 102000004127 Cytokines Human genes 0.000 claims description 5
- 108090000695 Cytokines Proteins 0.000 claims description 5
- 241000283073 Equus caballus Species 0.000 claims description 5
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 claims description 5
- 101710135898 Myc proto-oncogene protein Proteins 0.000 claims description 5
- 230000004570 RNA-binding Effects 0.000 claims description 5
- 101710150448 Transcriptional regulator Myc Proteins 0.000 claims description 5
- 229940009098 aspartate Drugs 0.000 claims description 5
- 229940088597 hormone Drugs 0.000 claims description 5
- 239000005556 hormone Substances 0.000 claims description 5
- 230000000754 repressing effect Effects 0.000 claims description 5
- 108091006107 transcriptional repressors Proteins 0.000 claims description 5
- 241000271566 Aves Species 0.000 claims description 4
- 108010039209 Blood Coagulation Factors Proteins 0.000 claims description 4
- 102000015081 Blood Coagulation Factors Human genes 0.000 claims description 4
- 102000016289 Cell Adhesion Molecules Human genes 0.000 claims description 4
- 108010067225 Cell Adhesion Molecules Proteins 0.000 claims description 4
- 108010060434 Co-Repressor Proteins Proteins 0.000 claims description 4
- 102000008169 Co-Repressor Proteins Human genes 0.000 claims description 4
- 108010037362 Extracellular Matrix Proteins Proteins 0.000 claims description 4
- 102000010834 Extracellular Matrix Proteins Human genes 0.000 claims description 4
- 108010004889 Heat-Shock Proteins Proteins 0.000 claims description 4
- 102000002812 Heat-Shock Proteins Human genes 0.000 claims description 4
- 108060003951 Immunoglobulin Proteins 0.000 claims description 4
- 102000014150 Interferons Human genes 0.000 claims description 4
- 108010050904 Interferons Proteins 0.000 claims description 4
- 102100025169 Max-binding protein MNT Human genes 0.000 claims description 4
- 102000006404 Mitochondrial Proteins Human genes 0.000 claims description 4
- 108010058682 Mitochondrial Proteins Proteins 0.000 claims description 4
- 108010006519 Molecular Chaperones Proteins 0.000 claims description 4
- 108010057466 NF-kappa B Proteins 0.000 claims description 4
- 102000007999 Nuclear Proteins Human genes 0.000 claims description 4
- 108010089610 Nuclear Proteins Proteins 0.000 claims description 4
- 102000003800 Selectins Human genes 0.000 claims description 4
- 108090000184 Selectins Proteins 0.000 claims description 4
- 108060008682 Tumor Necrosis Factor Proteins 0.000 claims description 4
- 108700025716 Tumor Suppressor Genes Proteins 0.000 claims description 4
- 102000044209 Tumor Suppressor Genes Human genes 0.000 claims description 4
- 230000003115 biocidal effect Effects 0.000 claims description 4
- 239000003114 blood coagulation factor Substances 0.000 claims description 4
- 230000003081 coactivator Effects 0.000 claims description 4
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 claims description 4
- 102000018358 immunoglobulin Human genes 0.000 claims description 4
- 102000006495 integrins Human genes 0.000 claims description 4
- 108010044426 integrins Proteins 0.000 claims description 4
- 229940079322 interferon Drugs 0.000 claims description 4
- 108091005706 peripheral membrane proteins Proteins 0.000 claims description 4
- AAEVYOVXGOFMJO-UHFFFAOYSA-N prometryn Chemical compound CSC1=NC(NC(C)C)=NC(NC(C)C)=N1 AAEVYOVXGOFMJO-UHFFFAOYSA-N 0.000 claims description 4
- 238000007634 remodeling Methods 0.000 claims description 4
- 230000035939 shock Effects 0.000 claims description 4
- 102000035160 transmembrane proteins Human genes 0.000 claims description 4
- 108091005703 transmembrane proteins Proteins 0.000 claims description 4
- 102000003390 tumor necrosis factor Human genes 0.000 claims description 4
- 208000034951 Genetic Translocation Diseases 0.000 claims description 3
- 239000003242 anti bacterial agent Substances 0.000 claims description 3
- 230000011987 methylation Effects 0.000 claims description 2
- 238000007069 methylation reaction Methods 0.000 claims description 2
- 102000003945 NF-kappa B Human genes 0.000 claims 3
- 241000699670 Mus sp. Species 0.000 description 141
- 108700028369 Alleles Proteins 0.000 description 130
- 230000008685 targeting Effects 0.000 description 95
- 239000013598 vector Substances 0.000 description 74
- 108020004999 messenger RNA Proteins 0.000 description 69
- 241001465754 Metazoa Species 0.000 description 62
- 210000002257 embryonic structure Anatomy 0.000 description 57
- 210000002459 blastocyst Anatomy 0.000 description 55
- 230000001404 mediated effect Effects 0.000 description 51
- 238000002347 injection Methods 0.000 description 44
- 239000007924 injection Substances 0.000 description 44
- 238000007894 restriction fragment length polymorphism technique Methods 0.000 description 43
- 101150059736 SRY gene Proteins 0.000 description 35
- 206010028980 Neoplasm Diseases 0.000 description 32
- 210000001082 somatic cell Anatomy 0.000 description 32
- 101150083522 MECP2 gene Proteins 0.000 description 31
- 238000012217 deletion Methods 0.000 description 31
- 230000037430 deletion Effects 0.000 description 31
- 239000012634 fragment Substances 0.000 description 31
- 238000004458 analytical method Methods 0.000 description 29
- 239000000523 sample Substances 0.000 description 29
- 108010054624 red fluorescent protein Proteins 0.000 description 27
- 238000010453 CRISPR/Cas method Methods 0.000 description 26
- 238000010362 genome editing Methods 0.000 description 26
- -1 Klf Proteins 0.000 description 25
- 238000002105 Southern blotting Methods 0.000 description 24
- 238000003556 assay Methods 0.000 description 24
- 230000004927 fusion Effects 0.000 description 24
- 238000012239 gene modification Methods 0.000 description 24
- 238000000338 in vitro Methods 0.000 description 24
- 201000011510 cancer Diseases 0.000 description 22
- 230000006870 function Effects 0.000 description 21
- 238000010459 TALEN Methods 0.000 description 20
- 230000005017 genetic modification Effects 0.000 description 20
- 235000013617 genetically modified food Nutrition 0.000 description 20
- 238000012360 testing method Methods 0.000 description 20
- 239000003795 chemical substances by application Substances 0.000 description 19
- 230000034431 double-strand break repair via homologous recombination Effects 0.000 description 19
- 239000005090 green fluorescent protein Substances 0.000 description 19
- 230000004048 modification Effects 0.000 description 19
- 238000012986 modification Methods 0.000 description 19
- 230000009977 dual effect Effects 0.000 description 18
- 230000010354 integration Effects 0.000 description 18
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 17
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 17
- 101710126211 POU domain, class 5, transcription factor 1 Proteins 0.000 description 17
- 238000004519 manufacturing process Methods 0.000 description 17
- 108091000080 Phosphotransferase Proteins 0.000 description 16
- 238000003776 cleavage reaction Methods 0.000 description 16
- 229940088598 enzyme Drugs 0.000 description 16
- 238000002474 experimental method Methods 0.000 description 16
- 102000020233 phosphotransferase Human genes 0.000 description 16
- 230000008569 process Effects 0.000 description 16
- 230000007017 scission Effects 0.000 description 16
- 238000010363 gene targeting Methods 0.000 description 15
- 238000003205 genotyping method Methods 0.000 description 15
- 238000001890 transfection Methods 0.000 description 15
- 238000011282 treatment Methods 0.000 description 15
- 208000035199 Tetraploidy Diseases 0.000 description 14
- 230000004075 alteration Effects 0.000 description 14
- 238000011161 development Methods 0.000 description 14
- 230000018109 developmental process Effects 0.000 description 14
- 230000006780 non-homologous end joining Effects 0.000 description 14
- 108020004705 Codon Proteins 0.000 description 13
- 102100035423 POU domain, class 5, transcription factor 1 Human genes 0.000 description 13
- 230000002068 genetic effect Effects 0.000 description 13
- 239000000203 mixture Substances 0.000 description 13
- 239000000126 substance Substances 0.000 description 13
- 238000005516 engineering process Methods 0.000 description 12
- 238000012163 sequencing technique Methods 0.000 description 12
- 235000013601 eggs Nutrition 0.000 description 11
- 238000001727 in vivo Methods 0.000 description 11
- 230000001965 increasing effect Effects 0.000 description 11
- 210000001519 tissue Anatomy 0.000 description 11
- 210000002593 Y chromosome Anatomy 0.000 description 10
- 239000003814 drug Substances 0.000 description 10
- 239000013604 expression vector Substances 0.000 description 10
- 210000004602 germ cell Anatomy 0.000 description 10
- 230000003993 interaction Effects 0.000 description 10
- 102100034170 Interferon-induced, double-stranded RNA-activated protein kinase Human genes 0.000 description 9
- 102000001253 Protein Kinase Human genes 0.000 description 9
- 241000700159 Rattus Species 0.000 description 9
- 230000008859 change Effects 0.000 description 9
- 230000004069 differentiation Effects 0.000 description 9
- 210000000056 organ Anatomy 0.000 description 9
- 108060006633 protein kinase Proteins 0.000 description 9
- 238000011160 research Methods 0.000 description 9
- 210000000805 cytoplasm Anatomy 0.000 description 8
- 229940079593 drug Drugs 0.000 description 8
- 235000013336 milk Nutrition 0.000 description 8
- 210000004080 milk Anatomy 0.000 description 8
- 239000008267 milk Substances 0.000 description 8
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 7
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 7
- 108020005004 Guide RNA Proteins 0.000 description 7
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 7
- 238000010367 cloning Methods 0.000 description 7
- 230000003247 decreasing effect Effects 0.000 description 7
- 230000005782 double-strand break Effects 0.000 description 7
- 210000002950 fibroblast Anatomy 0.000 description 7
- 239000013642 negative control Substances 0.000 description 7
- 230000004850 protein–protein interaction Effects 0.000 description 7
- 108091008146 restriction endonucleases Proteins 0.000 description 7
- 241000894006 Bacteria Species 0.000 description 6
- 230000004568 DNA-binding Effects 0.000 description 6
- 241000282412 Homo Species 0.000 description 6
- 101001076407 Homo sapiens Interleukin-1 receptor antagonist protein Proteins 0.000 description 6
- 206010068052 Mosaicism Diseases 0.000 description 6
- 102000004022 Protein-Tyrosine Kinases Human genes 0.000 description 6
- 108090000412 Protein-Tyrosine Kinases Proteins 0.000 description 6
- 238000011529 RT qPCR Methods 0.000 description 6
- 108091081024 Start codon Proteins 0.000 description 6
- 230000002009 allergenic effect Effects 0.000 description 6
- 125000003275 alpha amino acid group Chemical group 0.000 description 6
- 230000006907 apoptotic process Effects 0.000 description 6
- 108010048367 enhanced green fluorescent protein Proteins 0.000 description 6
- 102000034287 fluorescent proteins Human genes 0.000 description 6
- 108091006047 fluorescent proteins Proteins 0.000 description 6
- 238000002744 homologous recombination Methods 0.000 description 6
- 230000006801 homologous recombination Effects 0.000 description 6
- 239000003550 marker Substances 0.000 description 6
- 210000004291 uterus Anatomy 0.000 description 6
- 238000001262 western blot Methods 0.000 description 6
- 238000011740 C57BL/6 mouse Methods 0.000 description 5
- 208000024172 Cardiovascular disease Diseases 0.000 description 5
- 102000003688 G-Protein-Coupled Receptors Human genes 0.000 description 5
- 108090000045 G-Protein-Coupled Receptors Proteins 0.000 description 5
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 5
- 101000620814 Homo sapiens Ras and EF-hand domain-containing protein Proteins 0.000 description 5
- 101000772122 Homo sapiens Twisted gastrulation protein homolog 1 Proteins 0.000 description 5
- 108091005461 Nucleic proteins Proteins 0.000 description 5
- 108700026244 Open Reading Frames Proteins 0.000 description 5
- 238000012408 PCR amplification Methods 0.000 description 5
- 102100022869 Ras and EF-hand domain-containing protein Human genes 0.000 description 5
- 210000001766 X chromosome Anatomy 0.000 description 5
- 101150044453 Y gene Proteins 0.000 description 5
- 239000000427 antigen Substances 0.000 description 5
- 108091007433 antigens Proteins 0.000 description 5
- 102000036639 antigens Human genes 0.000 description 5
- 238000013459 approach Methods 0.000 description 5
- 230000008033 biological extinction Effects 0.000 description 5
- 230000001413 cellular effect Effects 0.000 description 5
- 230000001419 dependent effect Effects 0.000 description 5
- 208000035475 disorder Diseases 0.000 description 5
- 101150111214 lin-28 gene Proteins 0.000 description 5
- 150000002632 lipids Chemical class 0.000 description 5
- 210000004962 mammalian cell Anatomy 0.000 description 5
- 239000002609 medium Substances 0.000 description 5
- 125000004573 morpholin-4-yl group Chemical group N1(CCOCC1)* 0.000 description 5
- 210000004940 nucleus Anatomy 0.000 description 5
- 210000000287 oocyte Anatomy 0.000 description 5
- 230000037361 pathway Effects 0.000 description 5
- 230000008439 repair process Effects 0.000 description 5
- 238000012546 transfer Methods 0.000 description 5
- SGKRLCUYIXIAHR-AKNGSSGZSA-N (4s,4ar,5s,5ar,6r,12ar)-4-(dimethylamino)-1,5,10,11,12a-pentahydroxy-6-methyl-3,12-dioxo-4a,5,5a,6-tetrahydro-4h-tetracene-2-carboxamide Chemical compound C1=CC=C2[C@H](C)[C@@H]([C@H](O)[C@@H]3[C@](C(O)=C(C(N)=O)C(=O)[C@H]3N(C)C)(O)C3=O)C3=C(O)C2=C1O SGKRLCUYIXIAHR-AKNGSSGZSA-N 0.000 description 4
- 208000023275 Autoimmune disease Diseases 0.000 description 4
- 238000011765 DBA/2 mouse Methods 0.000 description 4
- 241000275449 Diplectrum formosum Species 0.000 description 4
- 102100039620 Granulocyte-macrophage colony-stimulating factor Human genes 0.000 description 4
- 108700021430 Kruppel-Like Factor 4 Proteins 0.000 description 4
- 208000035752 Live birth Diseases 0.000 description 4
- 108060001084 Luciferase Proteins 0.000 description 4
- 101150012532 NANOG gene Proteins 0.000 description 4
- 108010076039 Polyproteins Proteins 0.000 description 4
- 102100028904 Serine/threonine-protein kinase MARK2 Human genes 0.000 description 4
- 102100022978 Sex-determining region Y protein Human genes 0.000 description 4
- NKANXQFJJICGDU-QPLCGJKRSA-N Tamoxifen Chemical compound C=1C=CC=CC=1C(/CC)=C(C=1C=CC(OCCN(C)C)=CC=1)/C1=CC=CC=C1 NKANXQFJJICGDU-QPLCGJKRSA-N 0.000 description 4
- 108091036066 Three prime untranslated region Proteins 0.000 description 4
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 4
- 241000700605 Viruses Species 0.000 description 4
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 4
- 230000001594 aberrant effect Effects 0.000 description 4
- 239000011543 agarose gel Substances 0.000 description 4
- 230000031018 biological processes and functions Effects 0.000 description 4
- 230000015572 biosynthetic process Effects 0.000 description 4
- 238000004113 cell culture Methods 0.000 description 4
- 238000006243 chemical reaction Methods 0.000 description 4
- 230000001086 cytosolic effect Effects 0.000 description 4
- 230000029087 digestion Effects 0.000 description 4
- 238000009826 distribution Methods 0.000 description 4
- 229960003722 doxycycline Drugs 0.000 description 4
- ZMMJGEGLRURXTF-UHFFFAOYSA-N ethidium bromide Chemical compound [Br-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CC)=C1C1=CC=CC=C1 ZMMJGEGLRURXTF-UHFFFAOYSA-N 0.000 description 4
- 229960005542 ethidium bromide Drugs 0.000 description 4
- 230000012010 growth Effects 0.000 description 4
- 102000046824 human IL1RN Human genes 0.000 description 4
- 238000012744 immunostaining Methods 0.000 description 4
- 230000004777 loss-of-function mutation Effects 0.000 description 4
- 235000013372 meat Nutrition 0.000 description 4
- 238000000386 microscopy Methods 0.000 description 4
- 238000010172 mouse model Methods 0.000 description 4
- 102000054765 polymorphisms of proteins Human genes 0.000 description 4
- 230000029279 positive regulation of transcription, DNA-dependent Effects 0.000 description 4
- 239000002243 precursor Substances 0.000 description 4
- 230000035755 proliferation Effects 0.000 description 4
- RXWNCPJZOCPEPQ-NVWDDTSBSA-N puromycin Chemical compound C1=CC(OC)=CC=C1C[C@H](N)C(=O)N[C@H]1[C@@H](O)[C@H](N2C3=NC=NC(=C3N=C2)N(C)C)O[C@@H]1CO RXWNCPJZOCPEPQ-NVWDDTSBSA-N 0.000 description 4
- 230000009467 reduction Effects 0.000 description 4
- 230000002829 reductive effect Effects 0.000 description 4
- 150000003384 small molecules Chemical class 0.000 description 4
- 239000000758 substrate Substances 0.000 description 4
- 230000003612 virological effect Effects 0.000 description 4
- 208000024827 Alzheimer disease Diseases 0.000 description 3
- 108091079001 CRISPR RNA Proteins 0.000 description 3
- 201000009030 Carcinoma Diseases 0.000 description 3
- 102100025064 Cellular tumor antigen p53 Human genes 0.000 description 3
- 108010051219 Cre recombinase Proteins 0.000 description 3
- 102000004533 Endonucleases Human genes 0.000 description 3
- 108010042407 Endonucleases Proteins 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- 108010051975 Glycogen Synthase Kinase 3 beta Proteins 0.000 description 3
- 102100038104 Glycogen synthase kinase-3 beta Human genes 0.000 description 3
- 101001059454 Homo sapiens Serine/threonine-protein kinase MARK2 Proteins 0.000 description 3
- 101000826130 Homo sapiens Sex-determining region Y protein Proteins 0.000 description 3
- 101000687905 Homo sapiens Transcription factor SOX-2 Proteins 0.000 description 3
- 102000004058 Leukemia inhibitory factor Human genes 0.000 description 3
- 108090000581 Leukemia inhibitory factor Proteins 0.000 description 3
- 239000005089 Luciferase Substances 0.000 description 3
- 108010025020 Nerve Growth Factor Proteins 0.000 description 3
- 239000004677 Nylon Substances 0.000 description 3
- 102000043276 Oncogene Human genes 0.000 description 3
- 208000018737 Parkinson disease Diseases 0.000 description 3
- 108091005804 Peptidases Proteins 0.000 description 3
- 108700019535 Phosphoprotein Phosphatases Proteins 0.000 description 3
- 102000045595 Phosphoprotein Phosphatases Human genes 0.000 description 3
- 239000004365 Protease Substances 0.000 description 3
- 108010029485 Protein Isoforms Proteins 0.000 description 3
- 102000001708 Protein Isoforms Human genes 0.000 description 3
- 108091046869 Telomeric non-coding RNA Proteins 0.000 description 3
- 102100027671 Transcriptional repressor CTCF Human genes 0.000 description 3
- 108700029634 Y-Linked Genes Proteins 0.000 description 3
- 230000003321 amplification Effects 0.000 description 3
- 230000008236 biological pathway Effects 0.000 description 3
- 210000004369 blood Anatomy 0.000 description 3
- 239000008280 blood Substances 0.000 description 3
- 239000000872 buffer Substances 0.000 description 3
- 230000004663 cell proliferation Effects 0.000 description 3
- 238000007796 conventional method Methods 0.000 description 3
- 206010012601 diabetes mellitus Diseases 0.000 description 3
- 210000002919 epithelial cell Anatomy 0.000 description 3
- 210000002304 esc Anatomy 0.000 description 3
- 238000002866 fluorescence resonance energy transfer Methods 0.000 description 3
- 235000013305 food Nutrition 0.000 description 3
- 239000000499 gel Substances 0.000 description 3
- 231100000118 genetic alteration Toxicity 0.000 description 3
- 230000004077 genetic alteration Effects 0.000 description 3
- 239000001963 growth medium Substances 0.000 description 3
- 210000003958 hematopoietic stem cell Anatomy 0.000 description 3
- 208000015181 infectious disease Diseases 0.000 description 3
- 238000002955 isolation Methods 0.000 description 3
- 231100000053 low toxicity Toxicity 0.000 description 3
- 239000000463 material Substances 0.000 description 3
- 239000012528 membrane Substances 0.000 description 3
- 238000002493 microarray Methods 0.000 description 3
- 238000000520 microinjection Methods 0.000 description 3
- 210000000472 morula Anatomy 0.000 description 3
- 238000002703 mutagenesis Methods 0.000 description 3
- 231100000350 mutagenesis Toxicity 0.000 description 3
- 238000003199 nucleic acid amplification method Methods 0.000 description 3
- 229920001778 nylon Polymers 0.000 description 3
- 239000008188 pellet Substances 0.000 description 3
- 230000019491 signal transduction Effects 0.000 description 3
- 210000003491 skin Anatomy 0.000 description 3
- 238000013519 translation Methods 0.000 description 3
- 239000013603 viral vector Substances 0.000 description 3
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 2
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 2
- NHBKXEKEPDILRR-UHFFFAOYSA-N 2,3-bis(butanoylsulfanyl)propyl butanoate Chemical compound CCCC(=O)OCC(SC(=O)CCC)CSC(=O)CCC NHBKXEKEPDILRR-UHFFFAOYSA-N 0.000 description 2
- RYVNIFSIEDRLSJ-UHFFFAOYSA-N 5-(hydroxymethyl)cytosine Chemical compound NC=1NC(=O)N=CC=1CO RYVNIFSIEDRLSJ-UHFFFAOYSA-N 0.000 description 2
- HRPVXLWXLXDGHG-UHFFFAOYSA-N Acrylamide Chemical compound NC(=O)C=C HRPVXLWXLXDGHG-UHFFFAOYSA-N 0.000 description 2
- 102100022900 Actin, cytoplasmic 1 Human genes 0.000 description 2
- 108010085238 Actins Proteins 0.000 description 2
- 241000272517 Anseriformes Species 0.000 description 2
- 241000203069 Archaea Species 0.000 description 2
- 238000011718 B6 albino mouse Methods 0.000 description 2
- 238000011719 B6C3F1 mouse Methods 0.000 description 2
- 238000011723 B6D2F1 (BDF1) mouse Methods 0.000 description 2
- 238000011725 BALB/c mouse Methods 0.000 description 2
- 238000011729 BALB/c nude mouse Methods 0.000 description 2
- 108091032955 Bacterial small RNA Proteins 0.000 description 2
- 102100023995 Beta-nerve growth factor Human genes 0.000 description 2
- 102000004219 Brain-derived neurotrophic factor Human genes 0.000 description 2
- 108090000715 Brain-derived neurotrophic factor Proteins 0.000 description 2
- 238000011735 C3H mouse Methods 0.000 description 2
- 238000011814 C57BL/6N mouse Methods 0.000 description 2
- 238000011748 CB6F1 mouse Methods 0.000 description 2
- 108010014064 CCCTC-Binding Factor Proteins 0.000 description 2
- 238000011761 CD2F1 (CDF1) mouse Methods 0.000 description 2
- 101150035324 CDK9 gene Proteins 0.000 description 2
- 108091007914 CDKs Proteins 0.000 description 2
- 208000006332 Choriocarcinoma Diseases 0.000 description 2
- 102100037146 Chromatin complexes subunit BAP18 Human genes 0.000 description 2
- 102100031668 Chromodomain Y-like protein Human genes 0.000 description 2
- 108010005939 Ciliary Neurotrophic Factor Proteins 0.000 description 2
- 102100031614 Ciliary neurotrophic factor Human genes 0.000 description 2
- 108010071942 Colony-Stimulating Factors Proteins 0.000 description 2
- 108010068106 Cyclin T Proteins 0.000 description 2
- 230000008301 DNA looping mechanism Effects 0.000 description 2
- 230000007067 DNA methylation Effects 0.000 description 2
- 230000007018 DNA scission Effects 0.000 description 2
- 102100024364 Disintegrin and metalloproteinase domain-containing protein 8 Human genes 0.000 description 2
- 102100024739 E3 ubiquitin-protein ligase UHRF1 Human genes 0.000 description 2
- 102000003951 Erythropoietin Human genes 0.000 description 2
- 108090000394 Erythropoietin Proteins 0.000 description 2
- 101150099612 Esrrb gene Proteins 0.000 description 2
- 238000011771 FVB mouse Methods 0.000 description 2
- 238000012413 Fluorescence activated cell sorting analysis Methods 0.000 description 2
- 201000011240 Frontotemporal dementia Diseases 0.000 description 2
- 241000233866 Fungi Species 0.000 description 2
- 241000963438 Gaussia <copepod> Species 0.000 description 2
- 108010010803 Gelatin Proteins 0.000 description 2
- 206010064571 Gene mutation Diseases 0.000 description 2
- 102000058061 Glucose Transporter Type 4 Human genes 0.000 description 2
- BCCRXDTUTZHDEU-VKHMYHEASA-N Gly-Ser Chemical compound NCC(=O)N[C@@H](CO)C(O)=O BCCRXDTUTZHDEU-VKHMYHEASA-N 0.000 description 2
- 208000024869 Goodpasture syndrome Diseases 0.000 description 2
- 108010017213 Granulocyte-Macrophage Colony-Stimulating Factor Proteins 0.000 description 2
- 206010072579 Granulomatosis with polyangiitis Diseases 0.000 description 2
- 101001092912 Haloferax volcanii (strain ATCC 29605 / DSM 3757 / JCM 8879 / NBRC 14742 / NCIMB 2012 / VKM B-1768 / DS2) Small archaeal modifier protein 1 Proteins 0.000 description 2
- 208000002250 Hematologic Neoplasms Diseases 0.000 description 2
- 102100038885 Histone acetyltransferase p300 Human genes 0.000 description 2
- 108090000246 Histone acetyltransferases Proteins 0.000 description 2
- 102000003893 Histone acetyltransferases Human genes 0.000 description 2
- 108090000353 Histone deacetylase Proteins 0.000 description 2
- 102100038720 Histone deacetylase 9 Human genes 0.000 description 2
- 102100035864 Histone lysine demethylase PHF8 Human genes 0.000 description 2
- 102100022102 Histone-lysine N-methyltransferase 2B Human genes 0.000 description 2
- 102100038970 Histone-lysine N-methyltransferase EZH2 Human genes 0.000 description 2
- 102100029234 Histone-lysine N-methyltransferase NSD2 Human genes 0.000 description 2
- 102100029768 Histone-lysine N-methyltransferase SETD1A Human genes 0.000 description 2
- 102100027704 Histone-lysine N-methyltransferase SETD7 Human genes 0.000 description 2
- 101000740094 Homo sapiens Chromatin complexes subunit BAP18 Proteins 0.000 description 2
- 101000777795 Homo sapiens Chromodomain Y-like protein Proteins 0.000 description 2
- 101000760417 Homo sapiens E3 ubiquitin-protein ligase UHRF1 Proteins 0.000 description 2
- 101001000378 Homo sapiens Histone lysine demethylase PHF8 Proteins 0.000 description 2
- 101001045848 Homo sapiens Histone-lysine N-methyltransferase 2B Proteins 0.000 description 2
- 101001008894 Homo sapiens Histone-lysine N-methyltransferase 2D Proteins 0.000 description 2
- 101000882127 Homo sapiens Histone-lysine N-methyltransferase EZH2 Proteins 0.000 description 2
- 101000634048 Homo sapiens Histone-lysine N-methyltransferase NSD2 Proteins 0.000 description 2
- 101000865038 Homo sapiens Histone-lysine N-methyltransferase SETD1A Proteins 0.000 description 2
- 101000650682 Homo sapiens Histone-lysine N-methyltransferase SETD7 Proteins 0.000 description 2
- 101000990902 Homo sapiens Matrix metalloproteinase-9 Proteins 0.000 description 2
- 101000615488 Homo sapiens Methyl-CpG-binding domain protein 2 Proteins 0.000 description 2
- 101000615492 Homo sapiens Methyl-CpG-binding domain protein 4 Proteins 0.000 description 2
- 101001130226 Homo sapiens Phosphatidylcholine-sterol acyltransferase Proteins 0.000 description 2
- 101001098868 Homo sapiens Proprotein convertase subtilisin/kexin type 9 Proteins 0.000 description 2
- 101000777789 Homo sapiens Testis-specific chromodomain protein Y 1 Proteins 0.000 description 2
- 101000801088 Homo sapiens Transmembrane protein 201 Proteins 0.000 description 2
- 108090000723 Insulin-Like Growth Factor I Proteins 0.000 description 2
- 102220470475 L-seryl-tRNA(Sec) kinase_C57L_mutation Human genes 0.000 description 2
- AYFVYJQAPQTCCC-GBXIJSLDSA-N L-threonine Chemical compound C[C@@H](O)[C@H](N)C(O)=O AYFVYJQAPQTCCC-GBXIJSLDSA-N 0.000 description 2
- 102000008192 Lactoglobulins Human genes 0.000 description 2
- 108010060630 Lactoglobulins Proteins 0.000 description 2
- 108091026898 Leader sequence (mRNA) Proteins 0.000 description 2
- 102000003960 Ligases Human genes 0.000 description 2
- 108090000364 Ligases Proteins 0.000 description 2
- 208000031422 Lymphocytic Chronic B-Cell Leukemia Diseases 0.000 description 2
- 108091054455 MAP kinase family Proteins 0.000 description 2
- 102000043136 MAP kinase family Human genes 0.000 description 2
- 102000007651 Macrophage Colony-Stimulating Factor Human genes 0.000 description 2
- 108010046938 Macrophage Colony-Stimulating Factor Proteins 0.000 description 2
- 102100030412 Matrix metalloproteinase-9 Human genes 0.000 description 2
- 102100021299 Methyl-CpG-binding domain protein 2 Human genes 0.000 description 2
- 102100021290 Methyl-CpG-binding domain protein 4 Human genes 0.000 description 2
- 108091030146 MiRBase Proteins 0.000 description 2
- 238000011784 NIH-III nude mouse Methods 0.000 description 2
- 238000011789 NOD SCID mouse Methods 0.000 description 2
- 238000011794 NU/NU nude mouse Methods 0.000 description 2
- 208000012902 Nervous system disease Diseases 0.000 description 2
- 206010029260 Neuroblastoma Diseases 0.000 description 2
- 208000029726 Neurodevelopmental disease Diseases 0.000 description 2
- 208000025966 Neurological disease Diseases 0.000 description 2
- 102220580210 Non-receptor tyrosine-protein kinase TYK2_D10A_mutation Human genes 0.000 description 2
- 206010033128 Ovarian cancer Diseases 0.000 description 2
- 206010061535 Ovarian neoplasm Diseases 0.000 description 2
- 238000011647 PGP mouse Methods 0.000 description 2
- 229910019142 PO4 Inorganic materials 0.000 description 2
- 241001494479 Pecora Species 0.000 description 2
- 102000010292 Peptide Elongation Factor 1 Human genes 0.000 description 2
- 108010077524 Peptide Elongation Factor 1 Proteins 0.000 description 2
- 102100031538 Phosphatidylcholine-sterol acyltransferase Human genes 0.000 description 2
- 108091007412 Piwi-interacting RNA Proteins 0.000 description 2
- 108010038512 Platelet-Derived Growth Factor Proteins 0.000 description 2
- 102000010780 Platelet-Derived Growth Factor Human genes 0.000 description 2
- 108010012271 Positive Transcriptional Elongation Factor B Proteins 0.000 description 2
- 102000019014 Positive Transcriptional Elongation Factor B Human genes 0.000 description 2
- 208000006664 Precursor Cell Lymphoblastic Leukemia-Lymphoma Diseases 0.000 description 2
- RJKFOVLPORLFTN-LEKSSAKUSA-N Progesterone Chemical compound C1CC2=CC(=O)CC[C@]2(C)[C@@H]2[C@@H]1[C@@H]1CC[C@H](C(=O)C)[C@@]1(C)CC2 RJKFOVLPORLFTN-LEKSSAKUSA-N 0.000 description 2
- 102100038955 Proprotein convertase subtilisin/kexin type 9 Human genes 0.000 description 2
- 102100034026 RNA-binding protein Musashi homolog 1 Human genes 0.000 description 2
- 101710129077 RNA-binding protein Musashi homolog 1 Proteins 0.000 description 2
- 102000004278 Receptor Protein-Tyrosine Kinases Human genes 0.000 description 2
- 108090000873 Receptor Protein-Tyrosine Kinases Proteins 0.000 description 2
- 241000242739 Renilla Species 0.000 description 2
- 108700008625 Reporter Genes Proteins 0.000 description 2
- 102100037486 Reverse transcriptase/ribonuclease H Human genes 0.000 description 2
- 108091006300 SLC2A4 Proteins 0.000 description 2
- 101100218590 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) BDF2 gene Proteins 0.000 description 2
- 206010039491 Sarcoma Diseases 0.000 description 2
- 206010039710 Scleroderma Diseases 0.000 description 2
- 201000010208 Seminoma Diseases 0.000 description 2
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 2
- 108020003224 Small Nucleolar RNA Proteins 0.000 description 2
- 102000042773 Small Nucleolar RNA Human genes 0.000 description 2
- 101100289792 Squirrel monkey polyomavirus large T gene Proteins 0.000 description 2
- 241000193996 Streptococcus pyogenes Species 0.000 description 2
- 108700026226 TATA Box Proteins 0.000 description 2
- 206010043276 Teratoma Diseases 0.000 description 2
- 102100031664 Testis-specific chromodomain protein Y 1 Human genes 0.000 description 2
- MUMGGOZAMZWBJJ-DYKIIFRCSA-N Testosterone Chemical compound O=C1CC[C@]2(C)[C@H]3CC[C@](C)([C@H](CC4)O)[C@@H]4[C@@H]3CCC2=C1 0.000 description 2
- 108091028113 Trans-activating crRNA Proteins 0.000 description 2
- 102100033708 Transmembrane protein 201 Human genes 0.000 description 2
- 108010020764 Transposases Proteins 0.000 description 2
- 102000008579 Transposases Human genes 0.000 description 2
- 108010078814 Tumor Suppressor Protein p53 Proteins 0.000 description 2
- 206010067584 Type 1 diabetes mellitus Diseases 0.000 description 2
- 102100021575 Tyrosine-protein kinase BAZ1B Human genes 0.000 description 2
- 108010073929 Vascular Endothelial Growth Factor A Proteins 0.000 description 2
- 102000005789 Vascular Endothelial Growth Factors Human genes 0.000 description 2
- 108010019530 Vascular Endothelial Growth Factors Proteins 0.000 description 2
- 108700029631 X-Linked Genes Proteins 0.000 description 2
- 108010017070 Zinc Finger Nucleases Proteins 0.000 description 2
- 230000002159 abnormal effect Effects 0.000 description 2
- 230000005856 abnormality Effects 0.000 description 2
- 239000002253 acid Substances 0.000 description 2
- 101150063416 add gene Proteins 0.000 description 2
- 210000001789 adipocyte Anatomy 0.000 description 2
- 210000004504 adult stem cell Anatomy 0.000 description 2
- 230000002776 aggregation Effects 0.000 description 2
- 238000004220 aggregation Methods 0.000 description 2
- 239000013566 allergen Substances 0.000 description 2
- 206010002026 amyotrophic lateral sclerosis Diseases 0.000 description 2
- 230000001174 ascending effect Effects 0.000 description 2
- 238000011717 athymic nude mouse Methods 0.000 description 2
- 230000001363 autoimmune Effects 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 230000004071 biological effect Effects 0.000 description 2
- 238000001574 biopsy Methods 0.000 description 2
- 238000011734 black swiss mouse Methods 0.000 description 2
- 230000000903 blocking effect Effects 0.000 description 2
- 229940077737 brain-derived neurotrophic factor Drugs 0.000 description 2
- 210000004899 c-terminal region Anatomy 0.000 description 2
- 231100000504 carcinogenesis Toxicity 0.000 description 2
- 230000024245 cell differentiation Effects 0.000 description 2
- 230000007910 cell fusion Effects 0.000 description 2
- 210000002230 centromere Anatomy 0.000 description 2
- 238000012512 characterization method Methods 0.000 description 2
- 239000012707 chemical precursor Substances 0.000 description 2
- 239000003153 chemical reaction reagent Substances 0.000 description 2
- 238000012761 co-transfection Methods 0.000 description 2
- 108010045512 cohesins Proteins 0.000 description 2
- 150000001875 compounds Chemical class 0.000 description 2
- 238000003977 dairy farming Methods 0.000 description 2
- 230000007423 decrease Effects 0.000 description 2
- 230000032459 dedifferentiation Effects 0.000 description 2
- 230000002950 deficient Effects 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 208000022602 disease susceptibility Diseases 0.000 description 2
- 229940069417 doxy Drugs 0.000 description 2
- HALQELOKLVRWRI-VDBOFHIQSA-N doxycycline hyclate Chemical group O.[Cl-].[Cl-].CCO.O=C1C2=C(O)C=CC=C2[C@H](C)[C@@H]2C1=C(O)[C@]1(O)C(=O)C(C(N)=O)=C(O)[C@@H]([NH+](C)C)[C@@H]1[C@H]2O.O=C1C2=C(O)C=CC=C2[C@H](C)[C@@H]2C1=C(O)[C@]1(O)C(=O)C(C(N)=O)=C(O)[C@@H]([NH+](C)C)[C@@H]1[C@H]2O HALQELOKLVRWRI-VDBOFHIQSA-N 0.000 description 2
- 210000003981 ectoderm Anatomy 0.000 description 2
- 230000012202 endocytosis Effects 0.000 description 2
- 210000001900 endoderm Anatomy 0.000 description 2
- 230000007613 environmental effect Effects 0.000 description 2
- 230000002255 enzymatic effect Effects 0.000 description 2
- 230000007608 epigenetic mechanism Effects 0.000 description 2
- 229940105423 erythropoietin Drugs 0.000 description 2
- 238000010195 expression analysis Methods 0.000 description 2
- 210000003754 fetus Anatomy 0.000 description 2
- 238000000684 flow cytometry Methods 0.000 description 2
- 230000037433 frameshift Effects 0.000 description 2
- 239000008273 gelatin Substances 0.000 description 2
- 229920000159 gelatin Polymers 0.000 description 2
- 235000019322 gelatine Nutrition 0.000 description 2
- 235000011852 gelatine desserts Nutrition 0.000 description 2
- 238000003209 gene knockout Methods 0.000 description 2
- 238000010353 genetic engineering Methods 0.000 description 2
- 210000001654 germ layer Anatomy 0.000 description 2
- 210000004209 hair Anatomy 0.000 description 2
- 230000036541 health Effects 0.000 description 2
- FDGQSTZJBFJUBT-UHFFFAOYSA-N hypoxanthine Chemical compound O=C1NC=NC2=C1NC=N2 FDGQSTZJBFJUBT-UHFFFAOYSA-N 0.000 description 2
- 238000002952 image-based readout Methods 0.000 description 2
- 210000000987 immune system Anatomy 0.000 description 2
- 230000001976 improved effect Effects 0.000 description 2
- 239000012212 insulator Substances 0.000 description 2
- NOESYZHRGYRDHS-UHFFFAOYSA-N insulin Chemical compound N1C(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(NC(=O)CN)C(C)CC)CSSCC(C(NC(CO)C(=O)NC(CC(C)C)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CCC(N)=O)C(=O)NC(CC(C)C)C(=O)NC(CCC(O)=O)C(=O)NC(CC(N)=O)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CSSCC(NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2C=CC(O)=CC=2)NC(=O)C(CC(C)C)NC(=O)C(C)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2NC=NC=2)NC(=O)C(CO)NC(=O)CNC2=O)C(=O)NCC(=O)NC(CCC(O)=O)C(=O)NC(CCCNC(N)=N)C(=O)NCC(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC(O)=CC=3)C(=O)NC(C(C)O)C(=O)N3C(CCC3)C(=O)NC(CCCCN)C(=O)NC(C)C(O)=O)C(=O)NC(CC(N)=O)C(O)=O)=O)NC(=O)C(C(C)CC)NC(=O)C(CO)NC(=O)C(C(C)O)NC(=O)C1CSSCC2NC(=O)C(CC(C)C)NC(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CC(N)=O)NC(=O)C(NC(=O)C(N)CC=1C=CC=CC=1)C(C)C)CC1=CN=CN1 NOESYZHRGYRDHS-UHFFFAOYSA-N 0.000 description 2
- 210000002510 keratinocyte Anatomy 0.000 description 2
- 208000017169 kidney disease Diseases 0.000 description 2
- 238000011813 knockout mouse model Methods 0.000 description 2
- 239000010985 leather Substances 0.000 description 2
- 231100000518 lethal Toxicity 0.000 description 2
- 230000001665 lethal effect Effects 0.000 description 2
- 230000000670 limiting effect Effects 0.000 description 2
- 206010025135 lupus erythematosus Diseases 0.000 description 2
- 210000003716 mesoderm Anatomy 0.000 description 2
- 239000002207 metabolite Substances 0.000 description 2
- VNWKTOKETHGBQD-UHFFFAOYSA-N methane Chemical compound C VNWKTOKETHGBQD-UHFFFAOYSA-N 0.000 description 2
- 208000010125 myocardial infarction Diseases 0.000 description 2
- 229940053128 nerve growth factor Drugs 0.000 description 2
- 230000001537 neural effect Effects 0.000 description 2
- 230000004770 neurodegeneration Effects 0.000 description 2
- 208000015122 neurodegenerative disease Diseases 0.000 description 2
- 230000000926 neurological effect Effects 0.000 description 2
- 210000002569 neuron Anatomy 0.000 description 2
- 238000011580 nude mouse model Methods 0.000 description 2
- 230000009438 off-target cleavage Effects 0.000 description 2
- 230000009437 off-target effect Effects 0.000 description 2
- 231100000590 oncogenic Toxicity 0.000 description 2
- 230000002246 oncogenic effect Effects 0.000 description 2
- 230000008520 organization Effects 0.000 description 2
- 230000002018 overexpression Effects 0.000 description 2
- 210000003101 oviduct Anatomy 0.000 description 2
- 244000045947 parasite Species 0.000 description 2
- 230000036961 partial effect Effects 0.000 description 2
- 235000021317 phosphate Nutrition 0.000 description 2
- 210000001778 pluripotent stem cell Anatomy 0.000 description 2
- OXCMYAYHXIHQOA-UHFFFAOYSA-N potassium;[2-butyl-5-chloro-3-[[4-[2-(1,2,4-triaza-3-azanidacyclopenta-1,4-dien-5-yl)phenyl]phenyl]methyl]imidazol-4-yl]methanol Chemical compound [K+].CCCCC1=NC(Cl)=C(CO)N1CC1=CC=C(C=2C(=CC=CC=2)C2=N[N-]N=N2)C=C1 OXCMYAYHXIHQOA-UHFFFAOYSA-N 0.000 description 2
- 230000001566 pro-viral effect Effects 0.000 description 2
- 108020001580 protein domains Proteins 0.000 description 2
- 229950010131 puromycin Drugs 0.000 description 2
- ZAHRKKWIAAJSAO-UHFFFAOYSA-N rapamycin Natural products COCC(O)C(=C/C(C)C(=O)CC(OC(=O)C1CCCCN1C(=O)C(=O)C2(O)OC(CC(OC)C(=CC=CC=CC(C)CC(C)C(=O)C)C)CCC2C)C(C)CC3CCC(O)C(C3)OC)C ZAHRKKWIAAJSAO-UHFFFAOYSA-N 0.000 description 2
- 238000010188 recombinant method Methods 0.000 description 2
- 208000023504 respiratory system disease Diseases 0.000 description 2
- 230000004044 response Effects 0.000 description 2
- 230000001177 retroviral effect Effects 0.000 description 2
- 230000002441 reversible effect Effects 0.000 description 2
- 238000007480 sanger sequencing Methods 0.000 description 2
- 230000008771 sex reversal Effects 0.000 description 2
- 229960002930 sirolimus Drugs 0.000 description 2
- QFJCIRLUMZQUOT-HPLJOQBZSA-N sirolimus Chemical compound C1C[C@@H](O)[C@H](OC)C[C@@H]1C[C@@H](C)[C@H]1OC(=O)[C@@H]2CCCCN2C(=O)C(=O)[C@](O)(O2)[C@H](C)CC[C@H]2C[C@H](OC)/C(C)=C/C=C/C=C/[C@@H](C)C[C@@H](C)C(=O)[C@H](OC)[C@H](O)/C(C)=C/[C@@H](C)C(=O)C1 QFJCIRLUMZQUOT-HPLJOQBZSA-N 0.000 description 2
- 206010041823 squamous cell carcinoma Diseases 0.000 description 2
- 238000010561 standard procedure Methods 0.000 description 2
- 230000004083 survival effect Effects 0.000 description 2
- 208000024891 symptom Diseases 0.000 description 2
- 230000009885 systemic effect Effects 0.000 description 2
- 229960001603 tamoxifen Drugs 0.000 description 2
- 108091035539 telomere Proteins 0.000 description 2
- 102000055501 telomere Human genes 0.000 description 2
- 210000003411 telomere Anatomy 0.000 description 2
- 229940124597 therapeutic agent Drugs 0.000 description 2
- 230000001225 therapeutic effect Effects 0.000 description 2
- 238000002560 therapeutic procedure Methods 0.000 description 2
- 201000002510 thyroid cancer Diseases 0.000 description 2
- 238000004448 titration Methods 0.000 description 2
- 108091006106 transcriptional activators Proteins 0.000 description 2
- 230000010474 transient expression Effects 0.000 description 2
- 230000014621 translational initiation Effects 0.000 description 2
- 230000017105 transposition Effects 0.000 description 2
- 125000001493 tyrosinyl group Chemical group [H]OC1=C([H])C([H])=C(C([H])=C1[H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 2
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 2
- 210000002268 wool Anatomy 0.000 description 2
- 239000002676 xenobiotic agent Substances 0.000 description 2
- WKBPZYKAUNRMKP-UHFFFAOYSA-N 1-[2-(2,4-dichlorophenyl)pentyl]1,2,4-triazole Chemical compound C=1C=C(Cl)C=C(Cl)C=1C(CCC)CN1C=NC=N1 WKBPZYKAUNRMKP-UHFFFAOYSA-N 0.000 description 1
- PRDFBSVERLRRMY-UHFFFAOYSA-N 2'-(4-ethoxyphenyl)-5-(4-methylpiperazin-1-yl)-2,5'-bibenzimidazole Chemical compound C1=CC(OCC)=CC=C1C1=NC2=CC=C(C=3NC4=CC(=CC=C4N=3)N3CCN(C)CC3)C=C2N1 PRDFBSVERLRRMY-UHFFFAOYSA-N 0.000 description 1
- YMHOBZXQZVXHBM-UHFFFAOYSA-N 2,5-dimethoxy-4-bromophenethylamine Chemical compound COC1=CC(CCN)=C(OC)C=C1Br YMHOBZXQZVXHBM-UHFFFAOYSA-N 0.000 description 1
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 1
- GOJUJUVQIVIZAV-UHFFFAOYSA-N 2-amino-4,6-dichloropyrimidine-5-carbaldehyde Chemical group NC1=NC(Cl)=C(C=O)C(Cl)=N1 GOJUJUVQIVIZAV-UHFFFAOYSA-N 0.000 description 1
- SYXACIGWSSQBAJ-UHFFFAOYSA-N 2-amino-6-ethyl-5-pyridin-4-ylpyridine-3-carbonitrile Chemical compound CCC1=NC(N)=C(C#N)C=C1C1=CC=NC=C1 SYXACIGWSSQBAJ-UHFFFAOYSA-N 0.000 description 1
- ASJSAQIRZKANQN-CRCLSJGQSA-N 2-deoxy-D-ribose Chemical group OC[C@@H](O)[C@@H](O)CC=O ASJSAQIRZKANQN-CRCLSJGQSA-N 0.000 description 1
- GJTBSTBJLVYKAU-XVFCMESISA-N 2-thiouridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=S)NC(=O)C=C1 GJTBSTBJLVYKAU-XVFCMESISA-N 0.000 description 1
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 1
- ZAYHVCMSTBRABG-UHFFFAOYSA-N 5-Methylcytidine Natural products O=C1N=C(N)C(C)=CN1C1C(O)C(O)C(CO)O1 ZAYHVCMSTBRABG-UHFFFAOYSA-N 0.000 description 1
- ZAYHVCMSTBRABG-JXOAFFINSA-N 5-methylcytidine Chemical compound O=C1N=C(N)C(C)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 ZAYHVCMSTBRABG-JXOAFFINSA-N 0.000 description 1
- LRSASMSXMSNRBT-UHFFFAOYSA-N 5-methylcytosine Chemical compound CC1=CNC(=O)N=C1N LRSASMSXMSNRBT-UHFFFAOYSA-N 0.000 description 1
- FVFVNNKYKYZTJU-UHFFFAOYSA-N 6-chloro-1,3,5-triazine-2,4-diamine Chemical compound NC1=NC(N)=NC(Cl)=N1 FVFVNNKYKYZTJU-UHFFFAOYSA-N 0.000 description 1
- OGHAROSJZRTIOK-KQYNXXCUSA-O 7-methylguanosine Chemical compound C1=2N=C(N)NC(=O)C=2[N+](C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OGHAROSJZRTIOK-KQYNXXCUSA-O 0.000 description 1
- 102100033793 ALK tyrosine kinase receptor Human genes 0.000 description 1
- 102000000872 ATM Human genes 0.000 description 1
- 102400000069 Activation peptide Human genes 0.000 description 1
- 101800001401 Activation peptide Proteins 0.000 description 1
- 208000024893 Acute lymphoblastic leukemia Diseases 0.000 description 1
- 208000014697 Acute lymphocytic leukaemia Diseases 0.000 description 1
- 208000031261 Acute myeloid leukaemia Diseases 0.000 description 1
- 102100034540 Adenomatous polyposis coli protein Human genes 0.000 description 1
- 208000016683 Adult T-cell leukemia/lymphoma Diseases 0.000 description 1
- 108010011170 Ala-Trp-Arg-His-Pro-Gln-Phe-Gly-Gly Proteins 0.000 description 1
- 208000032671 Allergic granulomatous angiitis Diseases 0.000 description 1
- 102100026882 Alpha-synuclein Human genes 0.000 description 1
- 108091093088 Amplicon Proteins 0.000 description 1
- 102000013455 Amyloid beta-Peptides Human genes 0.000 description 1
- 108010090849 Amyloid beta-Peptides Proteins 0.000 description 1
- 208000000058 Anaplasia Diseases 0.000 description 1
- 201000003076 Angiosarcoma Diseases 0.000 description 1
- 206010002556 Ankylosing Spondylitis Diseases 0.000 description 1
- 208000003343 Antiphospholipid Syndrome Diseases 0.000 description 1
- 108020005544 Antisense RNA Proteins 0.000 description 1
- 101150009927 Apobec3 gene Proteins 0.000 description 1
- 102100021569 Apoptosis regulator Bcl-2 Human genes 0.000 description 1
- 108090000448 Aryl Hydrocarbon Receptors Proteins 0.000 description 1
- 102100026792 Aryl hydrocarbon receptor Human genes 0.000 description 1
- 108020005224 Arylamine N-acetyltransferase Proteins 0.000 description 1
- 108010004586 Ataxia Telangiectasia Mutated Proteins Proteins 0.000 description 1
- 102100032306 Aurora kinase B Human genes 0.000 description 1
- 108090000749 Aurora kinase B Proteins 0.000 description 1
- 206010003805 Autism Diseases 0.000 description 1
- 208000020706 Autistic disease Diseases 0.000 description 1
- 101001125874 Autographa californica nuclear polyhedrosis virus Per os infectivity factor 3 Proteins 0.000 description 1
- 206010003827 Autoimmune hepatitis Diseases 0.000 description 1
- 206010064539 Autoimmune myocarditis Diseases 0.000 description 1
- 206010069002 Autoimmune pancreatitis Diseases 0.000 description 1
- 108091005950 Azurite Proteins 0.000 description 1
- 108091008875 B cell receptors Proteins 0.000 description 1
- 208000010839 B-cell chronic lymphocytic leukemia Diseases 0.000 description 1
- 108091012583 BCL2 Proteins 0.000 description 1
- 208000032791 BCR-ABL1 positive chronic myelogenous leukemia Diseases 0.000 description 1
- 102000036365 BRCA1 Human genes 0.000 description 1
- 108700020463 BRCA1 Proteins 0.000 description 1
- 101150072950 BRCA1 gene Proteins 0.000 description 1
- 108700020462 BRCA2 Proteins 0.000 description 1
- 102000052609 BRCA2 Human genes 0.000 description 1
- 102100027161 BRCA2-interacting transcriptional repressor EMSY Human genes 0.000 description 1
- 206010004146 Basal cell carcinoma Diseases 0.000 description 1
- 208000023328 Basedow disease Diseases 0.000 description 1
- 108010001572 Basic-Leucine Zipper Transcription Factors Proteins 0.000 description 1
- 102000000806 Basic-Leucine Zipper Transcription Factors Human genes 0.000 description 1
- 208000027496 Behcet disease Diseases 0.000 description 1
- 208000009137 Behcet syndrome Diseases 0.000 description 1
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 1
- 102100023962 Bifunctional arginine demethylase and lysyl-hydroxylase JMJD6 Human genes 0.000 description 1
- 102100037468 Bifunctional peptidase and arginyl-hydroxylase JMJD5 Human genes 0.000 description 1
- 208000008439 Biliary Liver Cirrhosis Diseases 0.000 description 1
- 208000033222 Biliary cirrhosis primary Diseases 0.000 description 1
- 102100033743 Biotin-[acetyl-CoA-carboxylase] ligase Human genes 0.000 description 1
- 108010039206 Biotinidase Proteins 0.000 description 1
- 102100026044 Biotinidase Human genes 0.000 description 1
- 208000020925 Bipolar disease Diseases 0.000 description 1
- 206010005003 Bladder cancer Diseases 0.000 description 1
- 201000004569 Blindness Diseases 0.000 description 1
- 208000013165 Bowen disease Diseases 0.000 description 1
- 208000019337 Bowen disease of the skin Diseases 0.000 description 1
- 208000003174 Brain Neoplasms Diseases 0.000 description 1
- 101150008921 Brca2 gene Proteins 0.000 description 1
- 206010006187 Breast cancer Diseases 0.000 description 1
- 208000026310 Breast neoplasm Diseases 0.000 description 1
- 102100033641 Bromodomain-containing protein 2 Human genes 0.000 description 1
- 102100021942 C-C motif chemokine 28 Human genes 0.000 description 1
- 102100039435 C-X-C motif chemokine 17 Human genes 0.000 description 1
- 102000038625 CMGCs Human genes 0.000 description 1
- 108091007913 CMGCs Proteins 0.000 description 1
- 102100021975 CREB-binding protein Human genes 0.000 description 1
- AQGNHMOJWBZFQQ-UHFFFAOYSA-N CT 99021 Chemical compound CC1=CNC(C=2C(=NC(NCCNC=3N=CC(=CC=3)C#N)=NC=2)C=2C(=CC(Cl)=CC=2)Cl)=N1 AQGNHMOJWBZFQQ-UHFFFAOYSA-N 0.000 description 1
- 102100033676 CUGBP Elav-like family member 1 Human genes 0.000 description 1
- 101100451537 Caenorhabditis elegans hsd-1 gene Proteins 0.000 description 1
- 101100042630 Caenorhabditis elegans sin-3 gene Proteins 0.000 description 1
- 101100257359 Caenorhabditis elegans sox-2 gene Proteins 0.000 description 1
- OYPRJOBELJOOCE-UHFFFAOYSA-N Calcium Chemical compound [Ca] OYPRJOBELJOOCE-UHFFFAOYSA-N 0.000 description 1
- 108030005456 Calcium/calmodulin-dependent protein kinases Proteins 0.000 description 1
- 241000282836 Camelus dromedarius Species 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- 101710132601 Capsid protein Proteins 0.000 description 1
- 208000009458 Carcinoma in Situ Diseases 0.000 description 1
- 206010007559 Cardiac failure congestive Diseases 0.000 description 1
- 208000031229 Cardiomyopathies Diseases 0.000 description 1
- 102000008122 Casein Kinase I Human genes 0.000 description 1
- 108010049812 Casein Kinase I Proteins 0.000 description 1
- 102100034356 Casein kinase I isoform alpha-like Human genes 0.000 description 1
- 241000700199 Cavia porcellus Species 0.000 description 1
- 102000000844 Cell Surface Receptors Human genes 0.000 description 1
- 108010001857 Cell Surface Receptors Proteins 0.000 description 1
- 108091005944 Cerulean Proteins 0.000 description 1
- 206010008342 Cervix carcinoma Diseases 0.000 description 1
- VEXZGXHMUGYJMC-UHFFFAOYSA-M Chloride anion Chemical compound [Cl-] VEXZGXHMUGYJMC-UHFFFAOYSA-M 0.000 description 1
- 101100041257 Chlorobaculum tepidum (strain ATCC 49652 / DSM 12025 / NBRC 103806 / TLS) rub2 gene Proteins 0.000 description 1
- 102100032919 Chromobox protein homolog 1 Human genes 0.000 description 1
- 102100032902 Chromobox protein homolog 3 Human genes 0.000 description 1
- 102100032918 Chromobox protein homolog 5 Human genes 0.000 description 1
- 102100031665 Chromodomain Y-like protein 2 Human genes 0.000 description 1
- 208000017667 Chronic Disease Diseases 0.000 description 1
- 208000006545 Chronic Obstructive Pulmonary Disease Diseases 0.000 description 1
- 208000010833 Chronic myeloid leukaemia Diseases 0.000 description 1
- 208000006344 Churg-Strauss Syndrome Diseases 0.000 description 1
- 108091005960 Citrine Proteins 0.000 description 1
- 108010067499 Clk dual-specificity kinases Proteins 0.000 description 1
- 101710094648 Coat protein Proteins 0.000 description 1
- 206010009900 Colitis ulcerative Diseases 0.000 description 1
- 102000008186 Collagen Human genes 0.000 description 1
- 108010035532 Collagen Proteins 0.000 description 1
- 206010009944 Colon cancer Diseases 0.000 description 1
- 102000007644 Colony-Stimulating Factors Human genes 0.000 description 1
- 208000035473 Communicable disease Diseases 0.000 description 1
- 206010056370 Congestive cardiomyopathy Diseases 0.000 description 1
- 241000186216 Corynebacterium Species 0.000 description 1
- 241000699800 Cricetinae Species 0.000 description 1
- 208000011231 Crohn disease Diseases 0.000 description 1
- 101150059484 CycT gene Proteins 0.000 description 1
- 102000016736 Cyclin Human genes 0.000 description 1
- 108050006400 Cyclin Proteins 0.000 description 1
- 102000002435 Cyclin T Human genes 0.000 description 1
- 108010025464 Cyclin-Dependent Kinase 4 Proteins 0.000 description 1
- 108010025454 Cyclin-Dependent Kinase 5 Proteins 0.000 description 1
- 102100024112 Cyclin-T2 Human genes 0.000 description 1
- 102100033234 Cyclin-dependent kinase 17 Human genes 0.000 description 1
- 102100036329 Cyclin-dependent kinase 3 Human genes 0.000 description 1
- 102100036252 Cyclin-dependent kinase 4 Human genes 0.000 description 1
- 102100026805 Cyclin-dependent-like kinase 5 Human genes 0.000 description 1
- 102000018832 Cytochromes Human genes 0.000 description 1
- 108010052832 Cytochromes Proteins 0.000 description 1
- 102100035298 Cytokine SCM-1 beta Human genes 0.000 description 1
- 102000010831 Cytoskeletal Proteins Human genes 0.000 description 1
- 108010037414 Cytoskeletal Proteins Proteins 0.000 description 1
- 102100024812 DNA (cytosine-5)-methyltransferase 3A Human genes 0.000 description 1
- 108050002829 DNA (cytosine-5)-methyltransferase 3A Proteins 0.000 description 1
- 102100024810 DNA (cytosine-5)-methyltransferase 3B Human genes 0.000 description 1
- 101710123222 DNA (cytosine-5)-methyltransferase 3B Proteins 0.000 description 1
- 102000012410 DNA Ligases Human genes 0.000 description 1
- 108010061982 DNA Ligases Proteins 0.000 description 1
- 230000035131 DNA demethylation Effects 0.000 description 1
- 102100021147 DNA mismatch repair protein Msh6 Human genes 0.000 description 1
- 108010008783 DNA modification methylase EcoHK31I Proteins 0.000 description 1
- 108010063593 DNA modification methylase SssI Proteins 0.000 description 1
- 101710150423 DNA nickase Proteins 0.000 description 1
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 1
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 1
- 102100022204 DNA-dependent protein kinase catalytic subunit Human genes 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 101100447432 Danio rerio gapdh-2 gene Proteins 0.000 description 1
- 101100193633 Danio rerio rag2 gene Proteins 0.000 description 1
- 206010011878 Deafness Diseases 0.000 description 1
- 102100038606 Death-associated protein kinase 3 Human genes 0.000 description 1
- 206010012335 Dependence Diseases 0.000 description 1
- 108010093668 Deubiquitinating Enzymes Proteins 0.000 description 1
- 102000001477 Deubiquitinating Enzymes Human genes 0.000 description 1
- 201000010046 Dilated cardiomyopathy Diseases 0.000 description 1
- 102100040862 Dual specificity protein kinase CLK1 Human genes 0.000 description 1
- 102100037573 Dual specificity protein phosphatase 12 Human genes 0.000 description 1
- 206010058314 Dysplasia Diseases 0.000 description 1
- 108091035710 E-box Proteins 0.000 description 1
- 102100021740 E3 ubiquitin-protein ligase BRE1A Human genes 0.000 description 1
- 102100021739 E3 ubiquitin-protein ligase BRE1B Human genes 0.000 description 1
- 102100023991 E3 ubiquitin-protein ligase DTX3L Human genes 0.000 description 1
- 108050002772 E3 ubiquitin-protein ligase Mdm2 Proteins 0.000 description 1
- 102000012199 E3 ubiquitin-protein ligase Mdm2 Human genes 0.000 description 1
- 102100037964 E3 ubiquitin-protein ligase RING2 Human genes 0.000 description 1
- 102100037240 E3 ubiquitin-protein ligase UBR2 Human genes 0.000 description 1
- 102100024748 E3 ubiquitin-protein ligase UHRF2 Human genes 0.000 description 1
- 108091005947 EBFP2 Proteins 0.000 description 1
- 108091005942 ECFP Proteins 0.000 description 1
- 102100035074 Elongator complex protein 3 Human genes 0.000 description 1
- 206010014733 Endometrial cancer Diseases 0.000 description 1
- 206010014759 Endometrial neoplasm Diseases 0.000 description 1
- 102100036448 Endothelial PAS domain-containing protein 1 Human genes 0.000 description 1
- 241000701832 Enterobacteria phage T3 Species 0.000 description 1
- 208000018428 Eosinophilic granulomatosis with polyangiitis Diseases 0.000 description 1
- 102100031968 Ephrin type-B receptor 2 Human genes 0.000 description 1
- 101000982540 Escherichia phage T7 Protein Ocr Proteins 0.000 description 1
- 208000000461 Esophageal Neoplasms Diseases 0.000 description 1
- 241000186394 Eubacterium Species 0.000 description 1
- 108091029865 Exogenous DNA Proteins 0.000 description 1
- 108010007457 Extracellular Signal-Regulated MAP Kinases Proteins 0.000 description 1
- 102100030863 Eyes absent homolog 1 Human genes 0.000 description 1
- 102100030862 Eyes absent homolog 2 Human genes 0.000 description 1
- 102100030861 Eyes absent homolog 3 Human genes 0.000 description 1
- 108091008794 FGF receptors Proteins 0.000 description 1
- 108010074860 Factor Xa Proteins 0.000 description 1
- 102000018233 Fibroblast Growth Factor Human genes 0.000 description 1
- 108050007372 Fibroblast Growth Factor Proteins 0.000 description 1
- 102000003971 Fibroblast Growth Factor 1 Human genes 0.000 description 1
- 108090000386 Fibroblast Growth Factor 1 Proteins 0.000 description 1
- 102100024804 Fibroblast growth factor 22 Human genes 0.000 description 1
- 102100037362 Fibronectin Human genes 0.000 description 1
- 108010067306 Fibronectins Proteins 0.000 description 1
- 201000008808 Fibrosarcoma Diseases 0.000 description 1
- 108090000331 Firefly luciferases Proteins 0.000 description 1
- 241000192125 Firmicutes Species 0.000 description 1
- 108010091824 Focal Adhesion Kinase 1 Proteins 0.000 description 1
- 102100033324 GATA zinc finger domain-containing protein 1 Human genes 0.000 description 1
- 102000001267 GSK3 Human genes 0.000 description 1
- 108060006662 GSK3 Proteins 0.000 description 1
- 101150033270 Gadd45a gene Proteins 0.000 description 1
- 241000287828 Gallus gallus Species 0.000 description 1
- 101150112014 Gapdh gene Proteins 0.000 description 1
- 206010051066 Gastrointestinal stromal tumour Diseases 0.000 description 1
- 102100036530 General transcription factor 3C polypeptide 4 Human genes 0.000 description 1
- 206010018364 Glomerulonephritis Diseases 0.000 description 1
- 102000053187 Glucuronidase Human genes 0.000 description 1
- 108010060309 Glucuronidase Proteins 0.000 description 1
- 102100021181 Golgi phosphoprotein 3 Human genes 0.000 description 1
- 241000282575 Gorilla Species 0.000 description 1
- 108010017080 Granulocyte Colony-Stimulating Factor Proteins 0.000 description 1
- 102100039619 Granulocyte colony-stimulating factor Human genes 0.000 description 1
- 208000015023 Graves' disease Diseases 0.000 description 1
- 108010051696 Growth Hormone Proteins 0.000 description 1
- 102100034221 Growth-regulated alpha protein Human genes 0.000 description 1
- 241000606790 Haemophilus Species 0.000 description 1
- 208000030836 Hashimoto thyroiditis Diseases 0.000 description 1
- 206010019280 Heart failures Diseases 0.000 description 1
- 208000001258 Hemangiosarcoma Diseases 0.000 description 1
- 102100027685 Hemoglobin subunit alpha Human genes 0.000 description 1
- 108091005902 Hemoglobin subunit alpha Proteins 0.000 description 1
- 102100035108 High affinity nerve growth factor receptor Human genes 0.000 description 1
- 108010074870 Histone Demethylases Proteins 0.000 description 1
- 102000008157 Histone Demethylases Human genes 0.000 description 1
- 108010036115 Histone Methyltransferases Proteins 0.000 description 1
- 102100022901 Histone acetyltransferase KAT2A Human genes 0.000 description 1
- 102100022846 Histone acetyltransferase KAT2B Human genes 0.000 description 1
- 102100022893 Histone acetyltransferase KAT5 Human genes 0.000 description 1
- 102100033071 Histone acetyltransferase KAT6A Human genes 0.000 description 1
- 102100033070 Histone acetyltransferase KAT6B Human genes 0.000 description 1
- 102100033068 Histone acetyltransferase KAT7 Human genes 0.000 description 1
- 102100033069 Histone acetyltransferase KAT8 Human genes 0.000 description 1
- 102100021467 Histone acetyltransferase type B catalytic subunit Human genes 0.000 description 1
- 102000003964 Histone deacetylase Human genes 0.000 description 1
- 102100039999 Histone deacetylase 2 Human genes 0.000 description 1
- 102100021455 Histone deacetylase 3 Human genes 0.000 description 1
- 102100021454 Histone deacetylase 4 Human genes 0.000 description 1
- 102100021453 Histone deacetylase 5 Human genes 0.000 description 1
- 102100022537 Histone deacetylase 6 Human genes 0.000 description 1
- 102100038715 Histone deacetylase 8 Human genes 0.000 description 1
- 102100025210 Histone-arginine methyltransferase CARM1 Human genes 0.000 description 1
- 102100027755 Histone-lysine N-methyltransferase 2C Human genes 0.000 description 1
- 102100026265 Histone-lysine N-methyltransferase ASH1L Human genes 0.000 description 1
- 102100035043 Histone-lysine N-methyltransferase EHMT1 Human genes 0.000 description 1
- 102100035042 Histone-lysine N-methyltransferase EHMT2 Human genes 0.000 description 1
- 102100027770 Histone-lysine N-methyltransferase KMT5B Human genes 0.000 description 1
- 102100027788 Histone-lysine N-methyltransferase KMT5C Human genes 0.000 description 1
- 102100029235 Histone-lysine N-methyltransferase NSD3 Human genes 0.000 description 1
- 102100029144 Histone-lysine N-methyltransferase PRDM9 Human genes 0.000 description 1
- 102100030095 Histone-lysine N-methyltransferase SETD1B Human genes 0.000 description 1
- 102100032742 Histone-lysine N-methyltransferase SETD2 Human genes 0.000 description 1
- 102100023696 Histone-lysine N-methyltransferase SETDB1 Human genes 0.000 description 1
- 102100023676 Histone-lysine N-methyltransferase SETDB2 Human genes 0.000 description 1
- 102100032804 Histone-lysine N-methyltransferase SMYD3 Human genes 0.000 description 1
- 102100028998 Histone-lysine N-methyltransferase SUV39H1 Human genes 0.000 description 1
- 102100028988 Histone-lysine N-methyltransferase SUV39H2 Human genes 0.000 description 1
- 102100029239 Histone-lysine N-methyltransferase, H3 lysine-36 specific Human genes 0.000 description 1
- 102100039489 Histone-lysine N-methyltransferase, H3 lysine-79 specific Human genes 0.000 description 1
- 208000017604 Hodgkin disease Diseases 0.000 description 1
- 208000010747 Hodgkins lymphoma Diseases 0.000 description 1
- 102000009331 Homeodomain Proteins Human genes 0.000 description 1
- 108010048671 Homeodomain Proteins Proteins 0.000 description 1
- 101000924577 Homo sapiens Adenomatous polyposis coli protein Proteins 0.000 description 1
- 101000798306 Homo sapiens Aurora kinase B Proteins 0.000 description 1
- 101001057996 Homo sapiens BRCA2-interacting transcriptional repressor EMSY Proteins 0.000 description 1
- 101000975541 Homo sapiens Bifunctional arginine demethylase and lysyl-hydroxylase JMJD6 Proteins 0.000 description 1
- 101001025948 Homo sapiens Bifunctional peptidase and arginyl-hydroxylase JMJD5 Proteins 0.000 description 1
- 101000871771 Homo sapiens Biotin-[acetyl-CoA-carboxylase] ligase Proteins 0.000 description 1
- 101000897477 Homo sapiens C-C motif chemokine 28 Proteins 0.000 description 1
- 101000889048 Homo sapiens C-X-C motif chemokine 17 Proteins 0.000 description 1
- 101000896987 Homo sapiens CREB-binding protein Proteins 0.000 description 1
- 101000944448 Homo sapiens CUGBP Elav-like family member 1 Proteins 0.000 description 1
- 101000797584 Homo sapiens Chromobox protein homolog 1 Proteins 0.000 description 1
- 101000797578 Homo sapiens Chromobox protein homolog 3 Proteins 0.000 description 1
- 101000797581 Homo sapiens Chromobox protein homolog 5 Proteins 0.000 description 1
- 101000777787 Homo sapiens Chromodomain Y-like protein 2 Proteins 0.000 description 1
- 101000944358 Homo sapiens Cyclin-dependent kinase 17 Proteins 0.000 description 1
- 101000715946 Homo sapiens Cyclin-dependent kinase 3 Proteins 0.000 description 1
- 101000804771 Homo sapiens Cytokine SCM-1 beta Proteins 0.000 description 1
- 101000950656 Homo sapiens DEP domain-containing protein 1B Proteins 0.000 description 1
- 101000968658 Homo sapiens DNA mismatch repair protein Msh6 Proteins 0.000 description 1
- 101000619536 Homo sapiens DNA-dependent protein kinase catalytic subunit Proteins 0.000 description 1
- 101000956149 Homo sapiens Death-associated protein kinase 3 Proteins 0.000 description 1
- 101000832767 Homo sapiens Disintegrin and metalloproteinase domain-containing protein 8 Proteins 0.000 description 1
- 101000924017 Homo sapiens Dual specificity protein phosphatase 1 Proteins 0.000 description 1
- 101000881110 Homo sapiens Dual specificity protein phosphatase 12 Proteins 0.000 description 1
- 101000896083 Homo sapiens E3 ubiquitin-protein ligase BRE1A Proteins 0.000 description 1
- 101000896080 Homo sapiens E3 ubiquitin-protein ligase BRE1B Proteins 0.000 description 1
- 101000904542 Homo sapiens E3 ubiquitin-protein ligase DTX3L Proteins 0.000 description 1
- 101001095815 Homo sapiens E3 ubiquitin-protein ligase RING2 Proteins 0.000 description 1
- 101000670537 Homo sapiens E3 ubiquitin-protein ligase RNF168 Proteins 0.000 description 1
- 101001107071 Homo sapiens E3 ubiquitin-protein ligase RNF8 Proteins 0.000 description 1
- 101000760434 Homo sapiens E3 ubiquitin-protein ligase UHRF2 Proteins 0.000 description 1
- 101000877382 Homo sapiens Elongator complex protein 3 Proteins 0.000 description 1
- 101000851937 Homo sapiens Endothelial PAS domain-containing protein 1 Proteins 0.000 description 1
- 101000938435 Homo sapiens Eyes absent homolog 1 Proteins 0.000 description 1
- 101000938438 Homo sapiens Eyes absent homolog 2 Proteins 0.000 description 1
- 101000938441 Homo sapiens Eyes absent homolog 3 Proteins 0.000 description 1
- 101001051971 Homo sapiens Fibroblast growth factor 22 Proteins 0.000 description 1
- 101000926786 Homo sapiens GATA zinc finger domain-containing protein 1 Proteins 0.000 description 1
- 101000714252 Homo sapiens General transcription factor 3C polypeptide 4 Proteins 0.000 description 1
- 101000746373 Homo sapiens Granulocyte-macrophage colony-stimulating factor Proteins 0.000 description 1
- 101001069921 Homo sapiens Growth-regulated alpha protein Proteins 0.000 description 1
- 101001046967 Homo sapiens Histone acetyltransferase KAT2A Proteins 0.000 description 1
- 101001047006 Homo sapiens Histone acetyltransferase KAT2B Proteins 0.000 description 1
- 101001046996 Homo sapiens Histone acetyltransferase KAT5 Proteins 0.000 description 1
- 101000944179 Homo sapiens Histone acetyltransferase KAT6A Proteins 0.000 description 1
- 101000944174 Homo sapiens Histone acetyltransferase KAT6B Proteins 0.000 description 1
- 101000944166 Homo sapiens Histone acetyltransferase KAT7 Proteins 0.000 description 1
- 101000944170 Homo sapiens Histone acetyltransferase KAT8 Proteins 0.000 description 1
- 101000898976 Homo sapiens Histone acetyltransferase type B catalytic subunit Proteins 0.000 description 1
- 101001035011 Homo sapiens Histone deacetylase 2 Proteins 0.000 description 1
- 101000899282 Homo sapiens Histone deacetylase 3 Proteins 0.000 description 1
- 101000899259 Homo sapiens Histone deacetylase 4 Proteins 0.000 description 1
- 101000899255 Homo sapiens Histone deacetylase 5 Proteins 0.000 description 1
- 101000899330 Homo sapiens Histone deacetylase 6 Proteins 0.000 description 1
- 101001032113 Homo sapiens Histone deacetylase 7 Proteins 0.000 description 1
- 101001032118 Homo sapiens Histone deacetylase 8 Proteins 0.000 description 1
- 101001032092 Homo sapiens Histone deacetylase 9 Proteins 0.000 description 1
- 101001045846 Homo sapiens Histone-lysine N-methyltransferase 2A Proteins 0.000 description 1
- 101001008892 Homo sapiens Histone-lysine N-methyltransferase 2C Proteins 0.000 description 1
- 101000785963 Homo sapiens Histone-lysine N-methyltransferase ASH1L Proteins 0.000 description 1
- 101000877314 Homo sapiens Histone-lysine N-methyltransferase EHMT1 Proteins 0.000 description 1
- 101000877312 Homo sapiens Histone-lysine N-methyltransferase EHMT2 Proteins 0.000 description 1
- 101001028782 Homo sapiens Histone-lysine N-methyltransferase EZH1 Proteins 0.000 description 1
- 101001008821 Homo sapiens Histone-lysine N-methyltransferase KMT5B Proteins 0.000 description 1
- 101001008824 Homo sapiens Histone-lysine N-methyltransferase KMT5C Proteins 0.000 description 1
- 101000634046 Homo sapiens Histone-lysine N-methyltransferase NSD3 Proteins 0.000 description 1
- 101001124887 Homo sapiens Histone-lysine N-methyltransferase PRDM9 Proteins 0.000 description 1
- 101000864672 Homo sapiens Histone-lysine N-methyltransferase SETD1B Proteins 0.000 description 1
- 101000654725 Homo sapiens Histone-lysine N-methyltransferase SETD2 Proteins 0.000 description 1
- 101000684609 Homo sapiens Histone-lysine N-methyltransferase SETDB1 Proteins 0.000 description 1
- 101000684615 Homo sapiens Histone-lysine N-methyltransferase SETDB2 Proteins 0.000 description 1
- 101000708574 Homo sapiens Histone-lysine N-methyltransferase SMYD3 Proteins 0.000 description 1
- 101000696705 Homo sapiens Histone-lysine N-methyltransferase SUV39H1 Proteins 0.000 description 1
- 101000696699 Homo sapiens Histone-lysine N-methyltransferase SUV39H2 Proteins 0.000 description 1
- 101000634050 Homo sapiens Histone-lysine N-methyltransferase, H3 lysine-36 specific Proteins 0.000 description 1
- 101000963360 Homo sapiens Histone-lysine N-methyltransferase, H3 lysine-79 specific Proteins 0.000 description 1
- 101001046870 Homo sapiens Hypoxia-inducible factor 1-alpha Proteins 0.000 description 1
- 101001082570 Homo sapiens Hypoxia-inducible factor 3-alpha Proteins 0.000 description 1
- 101001008896 Homo sapiens Inactive histone-lysine N-methyltransferase 2E Proteins 0.000 description 1
- 101001043764 Homo sapiens Inhibitor of nuclear factor kappa-B kinase subunit alpha Proteins 0.000 description 1
- 101001076292 Homo sapiens Insulin-like growth factor II Proteins 0.000 description 1
- 101001042360 Homo sapiens LIM domain kinase 2 Proteins 0.000 description 1
- 101000804764 Homo sapiens Lymphotactin Proteins 0.000 description 1
- 101000871869 Homo sapiens Lys-63-specific deubiquitinase BRCC36 Proteins 0.000 description 1
- 101000613958 Homo sapiens Lysine-specific demethylase 2A Proteins 0.000 description 1
- 101000614013 Homo sapiens Lysine-specific demethylase 2B Proteins 0.000 description 1
- 101000614017 Homo sapiens Lysine-specific demethylase 3A Proteins 0.000 description 1
- 101000614020 Homo sapiens Lysine-specific demethylase 3B Proteins 0.000 description 1
- 101000613625 Homo sapiens Lysine-specific demethylase 4A Proteins 0.000 description 1
- 101000613629 Homo sapiens Lysine-specific demethylase 4B Proteins 0.000 description 1
- 101001088893 Homo sapiens Lysine-specific demethylase 4C Proteins 0.000 description 1
- 101001088895 Homo sapiens Lysine-specific demethylase 4D Proteins 0.000 description 1
- 101001088892 Homo sapiens Lysine-specific demethylase 5A Proteins 0.000 description 1
- 101001088883 Homo sapiens Lysine-specific demethylase 5B Proteins 0.000 description 1
- 101001088887 Homo sapiens Lysine-specific demethylase 5C Proteins 0.000 description 1
- 101001088879 Homo sapiens Lysine-specific demethylase 5D Proteins 0.000 description 1
- 101001025967 Homo sapiens Lysine-specific demethylase 6A Proteins 0.000 description 1
- 101001025971 Homo sapiens Lysine-specific demethylase 6B Proteins 0.000 description 1
- 101001025945 Homo sapiens Lysine-specific demethylase 7A Proteins 0.000 description 1
- 101001050886 Homo sapiens Lysine-specific histone demethylase 1A Proteins 0.000 description 1
- 101000613960 Homo sapiens Lysine-specific histone demethylase 1B Proteins 0.000 description 1
- 101000581507 Homo sapiens Methyl-CpG-binding domain protein 1 Proteins 0.000 description 1
- 101000615495 Homo sapiens Methyl-CpG-binding domain protein 3 Proteins 0.000 description 1
- 101001055091 Homo sapiens Mitogen-activated protein kinase kinase kinase 8 Proteins 0.000 description 1
- 101000896657 Homo sapiens Mitotic checkpoint serine/threonine-protein kinase BUB1 Proteins 0.000 description 1
- 101000583839 Homo sapiens Muscleblind-like protein 1 Proteins 0.000 description 1
- 101000583841 Homo sapiens Muscleblind-like protein 2 Proteins 0.000 description 1
- 101000957333 Homo sapiens Muscleblind-like protein 3 Proteins 0.000 description 1
- 101001008816 Homo sapiens N-lysine methyltransferase KMT5A Proteins 0.000 description 1
- 101000708645 Homo sapiens N-lysine methyltransferase SMYD2 Proteins 0.000 description 1
- 101000616738 Homo sapiens NAD-dependent protein deacetylase sirtuin-6 Proteins 0.000 description 1
- 101000602926 Homo sapiens Nuclear receptor coactivator 1 Proteins 0.000 description 1
- 101000974356 Homo sapiens Nuclear receptor coactivator 3 Proteins 0.000 description 1
- 101000934489 Homo sapiens Nucleosome-remodeling factor subunit BPTF Proteins 0.000 description 1
- 101100137155 Homo sapiens POU5F1 gene Proteins 0.000 description 1
- 101000687346 Homo sapiens PR domain zinc finger protein 2 Proteins 0.000 description 1
- 101000589450 Homo sapiens Poly(ADP-ribose) glycohydrolase Proteins 0.000 description 1
- 101000702559 Homo sapiens Probable global transcription activator SNF2L2 Proteins 0.000 description 1
- 101000585728 Homo sapiens Protein O-GlcNAcase Proteins 0.000 description 1
- 101000757216 Homo sapiens Protein arginine N-methyltransferase 1 Proteins 0.000 description 1
- 101000757232 Homo sapiens Protein arginine N-methyltransferase 2 Proteins 0.000 description 1
- 101000775582 Homo sapiens Protein arginine N-methyltransferase 6 Proteins 0.000 description 1
- 101000693024 Homo sapiens Protein arginine N-methyltransferase 7 Proteins 0.000 description 1
- 101000796142 Homo sapiens Protein arginine N-methyltransferase 8 Proteins 0.000 description 1
- 101001051767 Homo sapiens Protein kinase C beta type Proteins 0.000 description 1
- 101001026854 Homo sapiens Protein kinase C delta type Proteins 0.000 description 1
- 101000695187 Homo sapiens Protein patched homolog 1 Proteins 0.000 description 1
- 101000742054 Homo sapiens Protein phosphatase 1D Proteins 0.000 description 1
- 101000686031 Homo sapiens Proto-oncogene tyrosine-protein kinase ROS Proteins 0.000 description 1
- 101000996935 Homo sapiens Putative oxidoreductase GLYR1 Proteins 0.000 description 1
- 101001012157 Homo sapiens Receptor tyrosine-protein kinase erbB-2 Proteins 0.000 description 1
- 101000945093 Homo sapiens Ribosomal protein S6 kinase alpha-4 Proteins 0.000 description 1
- 101000945096 Homo sapiens Ribosomal protein S6 kinase alpha-5 Proteins 0.000 description 1
- 101000835860 Homo sapiens SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily B member 1 Proteins 0.000 description 1
- 101000648174 Homo sapiens Serine/threonine-protein kinase 10 Proteins 0.000 description 1
- 101000880431 Homo sapiens Serine/threonine-protein kinase 4 Proteins 0.000 description 1
- 101000588540 Homo sapiens Serine/threonine-protein kinase Nek6 Proteins 0.000 description 1
- 101000588553 Homo sapiens Serine/threonine-protein kinase Nek9 Proteins 0.000 description 1
- 101000987310 Homo sapiens Serine/threonine-protein kinase PAK 2 Proteins 0.000 description 1
- 101001036145 Homo sapiens Serine/threonine-protein kinase greatwall Proteins 0.000 description 1
- 101000989953 Homo sapiens Serine/threonine-protein kinase haspin Proteins 0.000 description 1
- 101000637839 Homo sapiens Serine/threonine-protein kinase tousled-like 1 Proteins 0.000 description 1
- 101001068027 Homo sapiens Serine/threonine-protein phosphatase 2A catalytic subunit alpha isoform Proteins 0.000 description 1
- 101001068019 Homo sapiens Serine/threonine-protein phosphatase 2A catalytic subunit beta isoform Proteins 0.000 description 1
- 101001068219 Homo sapiens Serine/threonine-protein phosphatase 4 catalytic subunit Proteins 0.000 description 1
- 101000620653 Homo sapiens Serine/threonine-protein phosphatase 5 Proteins 0.000 description 1
- 101000702545 Homo sapiens Transcription activator BRG1 Proteins 0.000 description 1
- 101000975007 Homo sapiens Transcriptional regulator Kaiso Proteins 0.000 description 1
- 101000796673 Homo sapiens Transformation/transcription domain-associated protein Proteins 0.000 description 1
- 101000836339 Homo sapiens Transposon Hsmar1 transposase Proteins 0.000 description 1
- 101000971144 Homo sapiens Tyrosine-protein kinase BAZ1B Proteins 0.000 description 1
- 101000997832 Homo sapiens Tyrosine-protein kinase JAK2 Proteins 0.000 description 1
- 101000644815 Homo sapiens Ubiquitin carboxyl-terminal hydrolase 16 Proteins 0.000 description 1
- 101000607872 Homo sapiens Ubiquitin carboxyl-terminal hydrolase 21 Proteins 0.000 description 1
- 101000807524 Homo sapiens Ubiquitin carboxyl-terminal hydrolase 22 Proteins 0.000 description 1
- 101000777220 Homo sapiens Ubiquitin carboxyl-terminal hydrolase 3 Proteins 0.000 description 1
- 101000916547 Homo sapiens Zinc finger and BTB domain-containing protein 38 Proteins 0.000 description 1
- 101000788776 Homo sapiens Zinc finger and BTB domain-containing protein 4 Proteins 0.000 description 1
- 108090000144 Human Proteins Proteins 0.000 description 1
- 102000003839 Human Proteins Human genes 0.000 description 1
- 102000004157 Hydrolases Human genes 0.000 description 1
- 108090000604 Hydrolases Proteins 0.000 description 1
- 241000282597 Hylobates Species 0.000 description 1
- 206010020751 Hypersensitivity Diseases 0.000 description 1
- 206010020772 Hypertension Diseases 0.000 description 1
- UGQMRVRMYYASKQ-UHFFFAOYSA-N Hypoxanthine nucleoside Natural products OC1C(O)C(CO)OC1N1C(NC=NC2=O)=C2N=C1 UGQMRVRMYYASKQ-UHFFFAOYSA-N 0.000 description 1
- 102100022875 Hypoxia-inducible factor 1-alpha Human genes 0.000 description 1
- 102100030482 Hypoxia-inducible factor 3-alpha Human genes 0.000 description 1
- 208000010159 IgA glomerulonephritis Diseases 0.000 description 1
- 206010021263 IgA nephropathy Diseases 0.000 description 1
- 101150074243 Il1rn gene Proteins 0.000 description 1
- 206010061598 Immunodeficiency Diseases 0.000 description 1
- 102100027767 Inactive histone-lysine N-methyltransferase 2E Human genes 0.000 description 1
- 208000022559 Inflammatory bowel disease Diseases 0.000 description 1
- 102100021892 Inhibitor of nuclear factor kappa-B kinase subunit alpha Human genes 0.000 description 1
- 229930010555 Inosine Natural products 0.000 description 1
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 1
- 102100023915 Insulin Human genes 0.000 description 1
- 108090001061 Insulin Proteins 0.000 description 1
- 102000004218 Insulin-Like Growth Factor I Human genes 0.000 description 1
- 108090001117 Insulin-Like Growth Factor II Proteins 0.000 description 1
- 102000048143 Insulin-Like Growth Factor II Human genes 0.000 description 1
- 102100025947 Insulin-like growth factor II Human genes 0.000 description 1
- 102000005755 Intercellular Signaling Peptides and Proteins Human genes 0.000 description 1
- 108010070716 Intercellular Signaling Peptides and Proteins Proteins 0.000 description 1
- 102000000589 Interleukin-1 Human genes 0.000 description 1
- 108010002352 Interleukin-1 Proteins 0.000 description 1
- 102100026018 Interleukin-1 receptor antagonist protein Human genes 0.000 description 1
- 102000015696 Interleukins Human genes 0.000 description 1
- 108010063738 Interleukins Proteins 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- 102000004310 Ion Channels Human genes 0.000 description 1
- 108090000769 Isomerases Proteins 0.000 description 1
- 102000004195 Isomerases Human genes 0.000 description 1
- 208000003456 Juvenile Arthritis Diseases 0.000 description 1
- 206010059176 Juvenile idiopathic arthritis Diseases 0.000 description 1
- 208000008839 Kidney Neoplasms Diseases 0.000 description 1
- 101150072501 Klf2 gene Proteins 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- 125000002842 L-seryl group Chemical group O=C([*])[C@](N([H])[H])([H])C([H])([H])O[H] 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- 102100021756 LIM domain kinase 2 Human genes 0.000 description 1
- 241000254158 Lampyridae Species 0.000 description 1
- 208000018142 Leiomyosarcoma Diseases 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 102000004882 Lipase Human genes 0.000 description 1
- 108090001060 Lipase Proteins 0.000 description 1
- 239000004367 Lipase Substances 0.000 description 1
- 241000406668 Loxodonta cyclotis Species 0.000 description 1
- 102000006830 Luminescent Proteins Human genes 0.000 description 1
- 108010047357 Luminescent Proteins Proteins 0.000 description 1
- 206010058467 Lung neoplasm malignant Diseases 0.000 description 1
- 108090000856 Lyases Proteins 0.000 description 1
- 102000004317 Lyases Human genes 0.000 description 1
- 206010025323 Lymphomas Diseases 0.000 description 1
- 102100035304 Lymphotactin Human genes 0.000 description 1
- 102100033638 Lys-63-specific deubiquitinase BRCC36 Human genes 0.000 description 1
- 102100040598 Lysine-specific demethylase 2A Human genes 0.000 description 1
- 102100040584 Lysine-specific demethylase 2B Human genes 0.000 description 1
- 102100040581 Lysine-specific demethylase 3A Human genes 0.000 description 1
- 102100040582 Lysine-specific demethylase 3B Human genes 0.000 description 1
- 102100040863 Lysine-specific demethylase 4A Human genes 0.000 description 1
- 102100040860 Lysine-specific demethylase 4B Human genes 0.000 description 1
- 102100033230 Lysine-specific demethylase 4C Human genes 0.000 description 1
- 102100033231 Lysine-specific demethylase 4D Human genes 0.000 description 1
- 102100033246 Lysine-specific demethylase 5A Human genes 0.000 description 1
- 102100033247 Lysine-specific demethylase 5B Human genes 0.000 description 1
- 102100033249 Lysine-specific demethylase 5C Human genes 0.000 description 1
- 102100033143 Lysine-specific demethylase 5D Human genes 0.000 description 1
- 102100037462 Lysine-specific demethylase 6A Human genes 0.000 description 1
- 102100037461 Lysine-specific demethylase 6B Human genes 0.000 description 1
- 102100037465 Lysine-specific demethylase 7A Human genes 0.000 description 1
- 102100024985 Lysine-specific histone demethylase 1A Human genes 0.000 description 1
- 102100040596 Lysine-specific histone demethylase 1B Human genes 0.000 description 1
- 101710125418 Major capsid protein Proteins 0.000 description 1
- 241000206589 Marinobacter Species 0.000 description 1
- 108010080991 Mediator Complex Proteins 0.000 description 1
- 102000000490 Mediator Complex Human genes 0.000 description 1
- 208000000172 Medulloblastoma Diseases 0.000 description 1
- 101710087103 Melittin Proteins 0.000 description 1
- 208000024556 Mendelian disease Diseases 0.000 description 1
- 206010027476 Metastases Diseases 0.000 description 1
- 102000006890 Methyl-CpG-Binding Protein 2 Human genes 0.000 description 1
- 108010072388 Methyl-CpG-Binding Protein 2 Proteins 0.000 description 1
- 102100027383 Methyl-CpG-binding domain protein 1 Human genes 0.000 description 1
- 102100021291 Methyl-CpG-binding domain protein 3 Human genes 0.000 description 1
- 102000016397 Methyltransferase Human genes 0.000 description 1
- 108060004795 Methyltransferase Proteins 0.000 description 1
- 108700011259 MicroRNAs Proteins 0.000 description 1
- 102100026907 Mitogen-activated protein kinase kinase kinase 8 Human genes 0.000 description 1
- 102100021691 Mitotic checkpoint serine/threonine-protein kinase BUB1 Human genes 0.000 description 1
- YNAVUWVOSKDBBP-UHFFFAOYSA-N Morpholine Chemical group C1COCCN1 YNAVUWVOSKDBBP-UHFFFAOYSA-N 0.000 description 1
- 208000003445 Mouth Neoplasms Diseases 0.000 description 1
- 108010008699 Mucin-4 Proteins 0.000 description 1
- 102100022693 Mucin-4 Human genes 0.000 description 1
- 102100038732 Mucosa-associated lymphoid tissue lymphoma translocation protein 1 Human genes 0.000 description 1
- 208000034578 Multiple myelomas Diseases 0.000 description 1
- 241000699660 Mus musculus Species 0.000 description 1
- 101100225547 Mus musculus Ehmt2 gene Proteins 0.000 description 1
- 101100499219 Mus musculus Hsd11b1 gene Proteins 0.000 description 1
- 101100494762 Mus musculus Nedd9 gene Proteins 0.000 description 1
- 101100193635 Mus musculus Rag2 gene Proteins 0.000 description 1
- 101100257363 Mus musculus Sox2 gene Proteins 0.000 description 1
- 101100369079 Mus musculus Tdg gene Proteins 0.000 description 1
- 102100030965 Muscleblind-like protein 1 Human genes 0.000 description 1
- 102100030964 Muscleblind-like protein 2 Human genes 0.000 description 1
- 102100038751 Muscleblind-like protein 3 Human genes 0.000 description 1
- 208000033761 Myelogenous Chronic BCR-ABL Positive Leukemia Diseases 0.000 description 1
- 208000033776 Myeloid Acute Leukemia Diseases 0.000 description 1
- 102000003505 Myosin Human genes 0.000 description 1
- 108060008487 Myosin Proteins 0.000 description 1
- VQAYFKKCNSOZKM-IOSLPCCCSA-N N(6)-methyladenosine Chemical compound C1=NC=2C(NC)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O VQAYFKKCNSOZKM-IOSLPCCCSA-N 0.000 description 1
- 108700026495 N-Myc Proto-Oncogene Proteins 0.000 description 1
- WWGBHDIHIVGYLZ-UHFFFAOYSA-N N-[4-[3-[[[7-(hydroxyamino)-7-oxoheptyl]amino]-oxomethyl]-5-isoxazolyl]phenyl]carbamic acid tert-butyl ester Chemical compound C1=CC(NC(=O)OC(C)(C)C)=CC=C1C1=CC(C(=O)NCCCCCCC(=O)NO)=NO1 WWGBHDIHIVGYLZ-UHFFFAOYSA-N 0.000 description 1
- 102100027771 N-lysine methyltransferase KMT5A Human genes 0.000 description 1
- 102100032806 N-lysine methyltransferase SMYD2 Human genes 0.000 description 1
- 102100030124 N-myc proto-oncogene protein Human genes 0.000 description 1
- 102100031455 NAD-dependent protein deacetylase sirtuin-1 Human genes 0.000 description 1
- 102100022913 NAD-dependent protein deacetylase sirtuin-2 Human genes 0.000 description 1
- 102100030710 NAD-dependent protein deacetylase sirtuin-3, mitochondrial Human genes 0.000 description 1
- 102100021840 NAD-dependent protein deacetylase sirtuin-6 Human genes 0.000 description 1
- VQAYFKKCNSOZKM-UHFFFAOYSA-N NSC 29409 Natural products C1=NC=2C(NC)=NC=NC=2N1C1OC(CO)C(O)C1O VQAYFKKCNSOZKM-UHFFFAOYSA-N 0.000 description 1
- BDFNAGOUUFOPSP-UHFFFAOYSA-N Nasvin Natural products O1C2=C(Cl)C(O)=C(Cl)C(C)=C2C(=O)OC2=C1C(C(C)=CC)=C(Cl)C(O)=C2CCCC BDFNAGOUUFOPSP-UHFFFAOYSA-N 0.000 description 1
- 229930193140 Neomycin Natural products 0.000 description 1
- 208000034176 Neoplasms, Germ Cell and Embryonal Diseases 0.000 description 1
- 206010029164 Nephrotic syndrome Diseases 0.000 description 1
- 102000007072 Nerve Growth Factors Human genes 0.000 description 1
- 102000007530 Neurofibromin 1 Human genes 0.000 description 1
- 108010085793 Neurofibromin 1 Proteins 0.000 description 1
- 102000028517 Neuropeptide receptor Human genes 0.000 description 1
- 108070000018 Neuropeptide receptor Proteins 0.000 description 1
- 208000015914 Non-Hodgkin lymphomas Diseases 0.000 description 1
- 102100023050 Nuclear factor NF-kappa-B p105 subunit Human genes 0.000 description 1
- 102100037223 Nuclear receptor coactivator 1 Human genes 0.000 description 1
- 102100022883 Nuclear receptor coactivator 3 Human genes 0.000 description 1
- 101710141454 Nucleoprotein Proteins 0.000 description 1
- 102100025062 Nucleosome-remodeling factor subunit BPTF Human genes 0.000 description 1
- 206010030155 Oesophageal carcinoma Diseases 0.000 description 1
- 102000012547 Olfactory receptors Human genes 0.000 description 1
- 108050002069 Olfactory receptors Proteins 0.000 description 1
- 208000010191 Osteitis Deformans Diseases 0.000 description 1
- 108090000854 Oxidoreductases Proteins 0.000 description 1
- 102000004316 Oxidoreductases Human genes 0.000 description 1
- 238000012879 PET imaging Methods 0.000 description 1
- 102000038030 PI3Ks Human genes 0.000 description 1
- 108091007960 PI3Ks Proteins 0.000 description 1
- 102100024885 PR domain zinc finger protein 2 Human genes 0.000 description 1
- 108010011536 PTEN Phosphohydrolase Proteins 0.000 description 1
- 208000027868 Paget disease Diseases 0.000 description 1
- 241000282576 Pan paniscus Species 0.000 description 1
- 241000282577 Pan troglodytes Species 0.000 description 1
- 206010061902 Pancreatic neoplasm Diseases 0.000 description 1
- 241001504519 Papio ursinus Species 0.000 description 1
- 229930040373 Paraformaldehyde Natural products 0.000 description 1
- 241000606860 Pasteurella Species 0.000 description 1
- 206010034277 Pemphigoid Diseases 0.000 description 1
- 201000011152 Pemphigus Diseases 0.000 description 1
- 102000035195 Peptidases Human genes 0.000 description 1
- 241000286209 Phasianidae Species 0.000 description 1
- 102100032543 Phosphatidylinositol 3,4,5-trisphosphate 3-phosphatase and dual-specificity protein phosphatase PTEN Human genes 0.000 description 1
- 108090000430 Phosphatidylinositol 3-kinases Proteins 0.000 description 1
- 102000003993 Phosphatidylinositol 3-kinases Human genes 0.000 description 1
- 102000004160 Phosphoric Monoester Hydrolases Human genes 0.000 description 1
- 108090000608 Phosphoric Monoester Hydrolases Proteins 0.000 description 1
- 206010035226 Plasma cell myeloma Diseases 0.000 description 1
- 102100032347 Poly(ADP-ribose) glycohydrolase Human genes 0.000 description 1
- 108010022429 Polycomb-Group Proteins Proteins 0.000 description 1
- 102000012425 Polycomb-Group Proteins Human genes 0.000 description 1
- 241000282569 Pongo Species 0.000 description 1
- 108700011066 PreScission Protease Proteins 0.000 description 1
- 241000605861 Prevotella Species 0.000 description 1
- 208000012654 Primary biliary cholangitis Diseases 0.000 description 1
- 101710083689 Probable capsid protein Proteins 0.000 description 1
- 102100031021 Probable global transcription activator SNF2L2 Human genes 0.000 description 1
- 206010060862 Prostate cancer Diseases 0.000 description 1
- 208000000236 Prostatic Neoplasms Diseases 0.000 description 1
- 108010043400 Protamine Kinase Proteins 0.000 description 1
- 102000004245 Proteasome Endopeptidase Complex Human genes 0.000 description 1
- 108090000708 Proteasome Endopeptidase Complex Proteins 0.000 description 1
- 108090000315 Protein Kinase C Proteins 0.000 description 1
- 108700040121 Protein Methyltransferases Proteins 0.000 description 1
- 102000055027 Protein Methyltransferases Human genes 0.000 description 1
- 102100030122 Protein O-GlcNAcase Human genes 0.000 description 1
- 102100022985 Protein arginine N-methyltransferase 1 Human genes 0.000 description 1
- 102100022988 Protein arginine N-methyltransferase 2 Human genes 0.000 description 1
- 102100034607 Protein arginine N-methyltransferase 5 Human genes 0.000 description 1
- 101710084427 Protein arginine N-methyltransferase 5 Proteins 0.000 description 1
- 102100032140 Protein arginine N-methyltransferase 6 Human genes 0.000 description 1
- 102100026297 Protein arginine N-methyltransferase 7 Human genes 0.000 description 1
- 102100031365 Protein arginine N-methyltransferase 8 Human genes 0.000 description 1
- 102100027584 Protein c-Fos Human genes 0.000 description 1
- 102100024923 Protein kinase C beta type Human genes 0.000 description 1
- 102100037340 Protein kinase C delta type Human genes 0.000 description 1
- 102100028680 Protein patched homolog 1 Human genes 0.000 description 1
- 102100038675 Protein phosphatase 1D Human genes 0.000 description 1
- 108091000520 Protein-Arginine Deiminase Type 4 Proteins 0.000 description 1
- 102100035731 Protein-arginine deiminase type-4 Human genes 0.000 description 1
- 102100023347 Proto-oncogene tyrosine-protein kinase ROS Human genes 0.000 description 1
- 229930185560 Pseudouridine Natural products 0.000 description 1
- PTJWIQPHWPFNBW-UHFFFAOYSA-N Pseudouridine C Natural products OC1C(O)C(CO)OC1C1=CNC(=O)NC1=O PTJWIQPHWPFNBW-UHFFFAOYSA-N 0.000 description 1
- 201000004681 Psoriasis Diseases 0.000 description 1
- 201000001263 Psoriatic Arthritis Diseases 0.000 description 1
- 208000036824 Psoriatic arthropathy Diseases 0.000 description 1
- 102000000033 Purinergic Receptors Human genes 0.000 description 1
- 108010080192 Purinergic Receptors Proteins 0.000 description 1
- 102100034301 Putative oxidoreductase GLYR1 Human genes 0.000 description 1
- 102100033810 RAC-alpha serine/threonine-protein kinase Human genes 0.000 description 1
- 102100033479 RAF proto-oncogene serine/threonine-protein kinase Human genes 0.000 description 1
- 102000001183 RAG-1 Human genes 0.000 description 1
- 108060006897 RAG1 Proteins 0.000 description 1
- 102000015097 RNA Splicing Factors Human genes 0.000 description 1
- 108010039259 RNA Splicing Factors Proteins 0.000 description 1
- 102000004909 RNF168 Human genes 0.000 description 1
- 102000004910 RNF8 Human genes 0.000 description 1
- 101100163890 Rattus norvegicus Ascl2 gene Proteins 0.000 description 1
- 101150045121 Rbbp5 gene Proteins 0.000 description 1
- 102100030086 Receptor tyrosine-protein kinase erbB-2 Human genes 0.000 description 1
- 208000015634 Rectal Neoplasms Diseases 0.000 description 1
- 208000033464 Reiter syndrome Diseases 0.000 description 1
- 206010038389 Renal cancer Diseases 0.000 description 1
- 208000006265 Renal cell carcinoma Diseases 0.000 description 1
- 108010052090 Renilla Luciferases Proteins 0.000 description 1
- 102000004389 Ribonucleoproteins Human genes 0.000 description 1
- 108010081734 Ribonucleoproteins Proteins 0.000 description 1
- 102100033644 Ribosomal protein S6 kinase alpha-4 Human genes 0.000 description 1
- 102100033645 Ribosomal protein S6 kinase alpha-5 Human genes 0.000 description 1
- 238000011579 SCID mouse model Methods 0.000 description 1
- 108091005770 SIRT3 Proteins 0.000 description 1
- 102000001332 SRC Human genes 0.000 description 1
- 108010017324 STAT3 Transcription Factor Proteins 0.000 description 1
- 102100024790 SWI/SNF complex subunit SMARCC2 Human genes 0.000 description 1
- 101710169052 SWI/SNF complex subunit SMARCC2 Proteins 0.000 description 1
- 102100025746 SWI/SNF-related matrix-associated actin-dependent regulator of chromatin subfamily B member 1 Human genes 0.000 description 1
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 1
- 241000289605 Sarcophilus Species 0.000 description 1
- 102100028900 Serine/threonine-protein kinase 10 Human genes 0.000 description 1
- 102100037629 Serine/threonine-protein kinase 4 Human genes 0.000 description 1
- 102100031401 Serine/threonine-protein kinase Nek6 Human genes 0.000 description 1
- 102100031398 Serine/threonine-protein kinase Nek9 Human genes 0.000 description 1
- 102100027939 Serine/threonine-protein kinase PAK 2 Human genes 0.000 description 1
- 102100029064 Serine/threonine-protein kinase WNK1 Human genes 0.000 description 1
- 102100039278 Serine/threonine-protein kinase greatwall Human genes 0.000 description 1
- 102100029332 Serine/threonine-protein kinase haspin Human genes 0.000 description 1
- 102100032015 Serine/threonine-protein kinase tousled-like 1 Human genes 0.000 description 1
- 102100034464 Serine/threonine-protein phosphatase 2A catalytic subunit alpha isoform Human genes 0.000 description 1
- 102100034470 Serine/threonine-protein phosphatase 2A catalytic subunit beta isoform Human genes 0.000 description 1
- 102100034492 Serine/threonine-protein phosphatase 4 catalytic subunit Human genes 0.000 description 1
- 102100022346 Serine/threonine-protein phosphatase 5 Human genes 0.000 description 1
- 101150117538 Set2 gene Proteins 0.000 description 1
- 108700032475 Sex-Determining Region Y Proteins 0.000 description 1
- 102100024040 Signal transducer and activator of transcription 3 Human genes 0.000 description 1
- 108010041191 Sirtuin 1 Proteins 0.000 description 1
- 108010041216 Sirtuin 2 Proteins 0.000 description 1
- 208000021386 Sjogren Syndrome Diseases 0.000 description 1
- 208000000453 Skin Neoplasms Diseases 0.000 description 1
- 102000039471 Small Nuclear RNA Human genes 0.000 description 1
- 108020004459 Small interfering RNA Proteins 0.000 description 1
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 1
- 102000013275 Somatomedins Human genes 0.000 description 1
- 102100038803 Somatotropin Human genes 0.000 description 1
- 101150037203 Sox2 gene Proteins 0.000 description 1
- 208000005718 Stomach Neoplasms Diseases 0.000 description 1
- 241000194017 Streptococcus Species 0.000 description 1
- 241000194020 Streptococcus thermophilus Species 0.000 description 1
- 229930006000 Sucrose Natural products 0.000 description 1
- CZMRCDWAGMRECN-UGDNZRGBSA-N Sucrose Chemical compound O[C@H]1[C@H](O)[C@@H](CO)O[C@@]1(CO)O[C@@H]1[C@H](O)[C@@H](O)[C@H](O)[C@@H](CO)O1 CZMRCDWAGMRECN-UGDNZRGBSA-N 0.000 description 1
- 108091008874 T cell receptors Proteins 0.000 description 1
- 102000016266 T-Cell Antigen Receptors Human genes 0.000 description 1
- 208000029052 T-cell acute lymphoblastic leukemia Diseases 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 102100040347 TAR DNA-binding protein 43 Human genes 0.000 description 1
- 101710150875 TAR DNA-binding protein 43 Proteins 0.000 description 1
- 108010076818 TEV protease Proteins 0.000 description 1
- 102000018679 Tacrolimus Binding Proteins Human genes 0.000 description 1
- 108010027179 Tacrolimus Binding Proteins Proteins 0.000 description 1
- 208000024313 Testicular Neoplasms Diseases 0.000 description 1
- 206010057644 Testis cancer Diseases 0.000 description 1
- 239000004098 Tetracycline Substances 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-N Thiophosphoric acid Chemical class OP(O)(S)=O RYYWUUFWQRZTIU-UHFFFAOYSA-N 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 108090000190 Thrombin Proteins 0.000 description 1
- 208000024770 Thyroid neoplasm Diseases 0.000 description 1
- AUYYCJSJGJYCDS-LBPRGKRZSA-N Thyrolar Chemical class IC1=CC(C[C@H](N)C(O)=O)=CC(I)=C1OC1=CC=C(O)C(I)=C1 AUYYCJSJGJYCDS-LBPRGKRZSA-N 0.000 description 1
- 102000002689 Toll-like receptor Human genes 0.000 description 1
- 108020000411 Toll-like receptor Proteins 0.000 description 1
- 102100031027 Transcription activator BRG1 Human genes 0.000 description 1
- 102100024270 Transcription factor SOX-2 Human genes 0.000 description 1
- 102100030780 Transcriptional activator Myb Human genes 0.000 description 1
- 102100023011 Transcriptional regulator Kaiso Human genes 0.000 description 1
- 101710121478 Transcriptional repressor CTCF Proteins 0.000 description 1
- 108020004566 Transfer RNA Proteins 0.000 description 1
- 102000004357 Transferases Human genes 0.000 description 1
- 108090000992 Transferases Proteins 0.000 description 1
- 102100032762 Transformation/transcription domain-associated protein Human genes 0.000 description 1
- 102100027172 Transposon Hsmar1 transposase Human genes 0.000 description 1
- 229920004890 Triton X-100 Polymers 0.000 description 1
- 239000013504 Triton X-100 Substances 0.000 description 1
- 102100022596 Tyrosine-protein kinase ABL1 Human genes 0.000 description 1
- 102100033444 Tyrosine-protein kinase JAK2 Human genes 0.000 description 1
- 102100039079 Tyrosine-protein kinase TXK Human genes 0.000 description 1
- 101150056689 UBR2 gene Proteins 0.000 description 1
- 102100020730 Ubiquitin carboxyl-terminal hydrolase 16 Human genes 0.000 description 1
- 102100037184 Ubiquitin carboxyl-terminal hydrolase 22 Human genes 0.000 description 1
- 102100031287 Ubiquitin carboxyl-terminal hydrolase 3 Human genes 0.000 description 1
- 201000006704 Ulcerative Colitis Diseases 0.000 description 1
- 208000007097 Urinary Bladder Neoplasms Diseases 0.000 description 1
- 208000006105 Uterine Cervical Neoplasms Diseases 0.000 description 1
- 206010046851 Uveitis Diseases 0.000 description 1
- 102000009524 Vascular Endothelial Growth Factor A Human genes 0.000 description 1
- 108010073925 Vascular Endothelial Growth Factor B Proteins 0.000 description 1
- 108010073923 Vascular Endothelial Growth Factor C Proteins 0.000 description 1
- 108010073919 Vascular Endothelial Growth Factor D Proteins 0.000 description 1
- 102100038217 Vascular endothelial growth factor B Human genes 0.000 description 1
- 102100038232 Vascular endothelial growth factor C Human genes 0.000 description 1
- 102100038234 Vascular endothelial growth factor D Human genes 0.000 description 1
- 102100033178 Vascular endothelial growth factor receptor 1 Human genes 0.000 description 1
- 206010047112 Vasculitides Diseases 0.000 description 1
- 206010047115 Vasculitis Diseases 0.000 description 1
- 241000545067 Venus Species 0.000 description 1
- 240000004922 Vigna radiata Species 0.000 description 1
- 101150060771 WDR5 gene Proteins 0.000 description 1
- 102000007544 Whey Proteins Human genes 0.000 description 1
- 108010046377 Whey Proteins Proteins 0.000 description 1
- 102000013814 Wnt Human genes 0.000 description 1
- 208000028247 X-linked inheritance Diseases 0.000 description 1
- 208000028258 Y-linked inheritance Diseases 0.000 description 1
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 1
- 108010016200 Zinc Finger Protein GLI1 Proteins 0.000 description 1
- 108010088665 Zinc Finger Protein Gli2 Proteins 0.000 description 1
- 102100028125 Zinc finger and BTB domain-containing protein 38 Human genes 0.000 description 1
- 102100025349 Zinc finger and BTB domain-containing protein 4 Human genes 0.000 description 1
- 102100035535 Zinc finger protein GLI1 Human genes 0.000 description 1
- 102100035558 Zinc finger protein GLI2 Human genes 0.000 description 1
- 238000000862 absorption spectrum Methods 0.000 description 1
- 230000001133 acceleration Effects 0.000 description 1
- 230000021736 acetylation Effects 0.000 description 1
- 238000006640 acetylation reaction Methods 0.000 description 1
- 102000005421 acetyltransferase Human genes 0.000 description 1
- 108020002494 acetyltransferase Proteins 0.000 description 1
- 208000002552 acute disseminated encephalomyelitis Diseases 0.000 description 1
- 230000010933 acylation Effects 0.000 description 1
- 238000005917 acylation reaction Methods 0.000 description 1
- 210000005006 adaptive immune system Anatomy 0.000 description 1
- 230000000996 additive effect Effects 0.000 description 1
- 230000001800 adrenalinergic effect Effects 0.000 description 1
- 201000006966 adult T-cell leukemia Diseases 0.000 description 1
- TXUZVZSFRXZGTL-QPLCGJKRSA-N afimoxifene Chemical compound C=1C=CC=CC=1C(/CC)=C(C=1C=CC(OCCN(C)C)=CC=1)/C1=CC=C(O)C=C1 TXUZVZSFRXZGTL-QPLCGJKRSA-N 0.000 description 1
- 238000000246 agarose gel electrophoresis Methods 0.000 description 1
- 208000026935 allergic disease Diseases 0.000 description 1
- 230000000172 allergic effect Effects 0.000 description 1
- 230000007815 allergy Effects 0.000 description 1
- 208000004631 alopecia areata Diseases 0.000 description 1
- 108090000185 alpha-Synuclein Proteins 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- 229960000723 ampicillin Drugs 0.000 description 1
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 1
- 208000036878 aneuploidy Diseases 0.000 description 1
- 231100001075 aneuploidy Toxicity 0.000 description 1
- 238000010171 animal model Methods 0.000 description 1
- 230000000692 anti-sense effect Effects 0.000 description 1
- 229940088710 antibiotic agent Drugs 0.000 description 1
- 208000006673 asthma Diseases 0.000 description 1
- 210000001130 astrocyte Anatomy 0.000 description 1
- 230000003143 atherosclerotic effect Effects 0.000 description 1
- 208000010668 atopic eczema Diseases 0.000 description 1
- 230000004900 autophagic degradation Effects 0.000 description 1
- 210000003719 b-lymphocyte Anatomy 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- 210000002469 basement membrane Anatomy 0.000 description 1
- 230000006399 behavior Effects 0.000 description 1
- WGDUUQDYDIIBKT-UHFFFAOYSA-N beta-Pseudouridine Natural products OC1OC(CN2C=CC(=O)NC2=O)C(O)C1O WGDUUQDYDIIBKT-UHFFFAOYSA-N 0.000 description 1
- 201000009036 biliary tract cancer Diseases 0.000 description 1
- 208000020790 biliary tract neoplasm Diseases 0.000 description 1
- 238000012742 biochemical analysis Methods 0.000 description 1
- 239000003150 biochemical marker Substances 0.000 description 1
- 230000008238 biochemical pathway Effects 0.000 description 1
- 239000003124 biologic agent Substances 0.000 description 1
- 230000008827 biological function Effects 0.000 description 1
- 230000033228 biological regulation Effects 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 230000006696 biosynthetic metabolic pathway Effects 0.000 description 1
- 230000029803 blastocyst development Effects 0.000 description 1
- 210000001109 blastomere Anatomy 0.000 description 1
- 230000036772 blood pressure Effects 0.000 description 1
- 210000001124 body fluid Anatomy 0.000 description 1
- 239000010839 body fluid Substances 0.000 description 1
- 210000000481 breast Anatomy 0.000 description 1
- 238000009395 breeding Methods 0.000 description 1
- 230000001488 breeding effect Effects 0.000 description 1
- UDSAIICHUKSCKT-UHFFFAOYSA-N bromophenol blue Chemical compound C1=C(Br)C(O)=C(Br)C=C1C1(C=2C=C(Br)C(O)=C(Br)C=2)C2=CC=CC=C2S(=O)(=O)O1 UDSAIICHUKSCKT-UHFFFAOYSA-N 0.000 description 1
- 239000007853 buffer solution Substances 0.000 description 1
- 239000011575 calcium Substances 0.000 description 1
- 229910052791 calcium Inorganic materials 0.000 description 1
- 229910000389 calcium phosphate Inorganic materials 0.000 description 1
- 239000001506 calcium phosphate Substances 0.000 description 1
- 235000011010 calcium phosphates Nutrition 0.000 description 1
- 230000005880 cancer cell killing Effects 0.000 description 1
- 229930003827 cannabinoid Natural products 0.000 description 1
- 239000003557 cannabinoid Substances 0.000 description 1
- 210000004413 cardiac myocyte Anatomy 0.000 description 1
- 201000011529 cardiovascular cancer Diseases 0.000 description 1
- 101150038500 cas9 gene Proteins 0.000 description 1
- 230000006652 catabolic pathway Effects 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 125000002091 cationic group Chemical group 0.000 description 1
- 230000003915 cell function Effects 0.000 description 1
- 239000002771 cell marker Substances 0.000 description 1
- 230000033077 cellular process Effects 0.000 description 1
- 230000036755 cellular response Effects 0.000 description 1
- 210000003850 cellular structure Anatomy 0.000 description 1
- 208000026106 cerebrovascular disease Diseases 0.000 description 1
- 201000010881 cervical cancer Diseases 0.000 description 1
- 239000013043 chemical agent Substances 0.000 description 1
- 235000013330 chicken meat Nutrition 0.000 description 1
- 210000001612 chondrocyte Anatomy 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 230000002759 chromosomal effect Effects 0.000 description 1
- 230000008711 chromosomal rearrangement Effects 0.000 description 1
- 208000032852 chronic lymphocytic leukemia Diseases 0.000 description 1
- 239000011035 citrine Substances 0.000 description 1
- 108010030886 coactivator-associated arginine methyltransferase 1 Proteins 0.000 description 1
- ZPUCINDJVBIVPJ-LJISPDSOSA-N cocaine Chemical compound O([C@H]1C[C@@H]2CC[C@@H](N2C)[C@H]1C(=O)OC)C(=O)C1=CC=CC=C1 ZPUCINDJVBIVPJ-LJISPDSOSA-N 0.000 description 1
- 229920001436 collagen Polymers 0.000 description 1
- 208000029742 colonic neoplasm Diseases 0.000 description 1
- 229940047120 colony stimulating factors Drugs 0.000 description 1
- 239000003086 colorant Substances 0.000 description 1
- 230000002301 combined effect Effects 0.000 description 1
- 239000000287 crude extract Substances 0.000 description 1
- 108010082025 cyan fluorescent protein Proteins 0.000 description 1
- 230000034994 death Effects 0.000 description 1
- 238000012350 deep sequencing Methods 0.000 description 1
- 230000007850 degeneration Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 230000017858 demethylation Effects 0.000 description 1
- 238000010520 demethylation reaction Methods 0.000 description 1
- 238000009795 derivation Methods 0.000 description 1
- 201000001981 dermatomyositis Diseases 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- ANCLJVISBRWUTR-UHFFFAOYSA-N diaminophosphinic acid Chemical group NP(N)(O)=O ANCLJVISBRWUTR-UHFFFAOYSA-N 0.000 description 1
- 239000010432 diamond Substances 0.000 description 1
- 230000037213 diet Effects 0.000 description 1
- 235000005911 diet Nutrition 0.000 description 1
- FOCAHLGSDWHSAH-UHFFFAOYSA-N difluoromethanethione Chemical compound FC(F)=S FOCAHLGSDWHSAH-UHFFFAOYSA-N 0.000 description 1
- ZPTBLXKRQACLCR-XVFCMESISA-N dihydrouridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)CC1 ZPTBLXKRQACLCR-XVFCMESISA-N 0.000 description 1
- 229940042399 direct acting antivirals protease inhibitors Drugs 0.000 description 1
- 208000016097 disease of metabolism Diseases 0.000 description 1
- 238000002224 dissection Methods 0.000 description 1
- 230000009429 distress Effects 0.000 description 1
- 230000036267 drug metabolism Effects 0.000 description 1
- 108060002430 dynein heavy chain Proteins 0.000 description 1
- 102000013035 dynein heavy chain Human genes 0.000 description 1
- 230000004064 dysfunction Effects 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 238000000295 emission spectrum Methods 0.000 description 1
- 210000002889 endothelial cell Anatomy 0.000 description 1
- 210000001339 epidermal cell Anatomy 0.000 description 1
- 102000052116 epidermal growth factor receptor activity proteins Human genes 0.000 description 1
- 108700015053 epidermal growth factor receptor activity proteins Proteins 0.000 description 1
- 230000008995 epigenetic change Effects 0.000 description 1
- 230000001973 epigenetic effect Effects 0.000 description 1
- 230000004049 epigenetic modification Effects 0.000 description 1
- 206010015037 epilepsy Diseases 0.000 description 1
- 230000002922 epistatic effect Effects 0.000 description 1
- 201000004101 esophageal cancer Diseases 0.000 description 1
- 229940011871 estrogen Drugs 0.000 description 1
- 239000000262 estrogen Substances 0.000 description 1
- 230000028023 exocytosis Effects 0.000 description 1
- 230000023428 female meiosis Effects 0.000 description 1
- 231100000502 fertility decrease Toxicity 0.000 description 1
- 230000004720 fertilization Effects 0.000 description 1
- 210000000604 fetal stem cell Anatomy 0.000 description 1
- 231100001048 fetal toxicity Toxicity 0.000 description 1
- 229940126864 fibroblast growth factor Drugs 0.000 description 1
- 239000000796 flavoring agent Substances 0.000 description 1
- 235000019634 flavors Nutrition 0.000 description 1
- 238000001917 fluorescence detection Methods 0.000 description 1
- 238000000799 fluorescence microscopy Methods 0.000 description 1
- 239000007850 fluorescent dye Substances 0.000 description 1
- 108010021843 fluorescent protein 583 Proteins 0.000 description 1
- 239000012737 fresh medium Substances 0.000 description 1
- 206010017758 gastric cancer Diseases 0.000 description 1
- 201000011243 gastrointestinal stromal tumor Diseases 0.000 description 1
- 238000003304 gavage Methods 0.000 description 1
- 208000005017 glioblastoma Diseases 0.000 description 1
- 230000013595 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- 210000003714 granulocyte Anatomy 0.000 description 1
- 210000002503 granulosa cell Anatomy 0.000 description 1
- 239000000122 growth hormone Substances 0.000 description 1
- 210000000442 hair follicle cell Anatomy 0.000 description 1
- 201000009277 hairy cell leukemia Diseases 0.000 description 1
- 230000010370 hearing loss Effects 0.000 description 1
- 231100000888 hearing loss Toxicity 0.000 description 1
- 208000016354 hearing loss disease Diseases 0.000 description 1
- 210000002216 heart Anatomy 0.000 description 1
- 208000019622 heart disease Diseases 0.000 description 1
- 230000002008 hemorrhagic effect Effects 0.000 description 1
- 210000003494 hepatocyte Anatomy 0.000 description 1
- 238000004128 high performance liquid chromatography Methods 0.000 description 1
- 238000013537 high throughput screening Methods 0.000 description 1
- 238000012165 high-throughput sequencing Methods 0.000 description 1
- 102000047444 human SOX2 Human genes 0.000 description 1
- 230000002895 hyperchromatic effect Effects 0.000 description 1
- 238000010191 image analysis Methods 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 210000002865 immune cell Anatomy 0.000 description 1
- 239000007943 implant Substances 0.000 description 1
- 230000000415 inactivating effect Effects 0.000 description 1
- 239000000411 inducer Substances 0.000 description 1
- 239000012678 infectious agent Substances 0.000 description 1
- 208000027866 inflammatory disease Diseases 0.000 description 1
- 230000002757 inflammatory effect Effects 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 230000005764 inhibitory process Effects 0.000 description 1
- 229960003786 inosine Drugs 0.000 description 1
- 238000002743 insertional mutagenesis Methods 0.000 description 1
- 238000009434 installation Methods 0.000 description 1
- 229940125396 insulin Drugs 0.000 description 1
- 210000000936 intestine Anatomy 0.000 description 1
- 230000003834 intracellular effect Effects 0.000 description 1
- 238000007918 intramuscular administration Methods 0.000 description 1
- 238000007912 intraperitoneal administration Methods 0.000 description 1
- 238000001990 intravenous administration Methods 0.000 description 1
- 230000009545 invasion Effects 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 230000000302 ischemic effect Effects 0.000 description 1
- 238000005304 joining Methods 0.000 description 1
- 210000003734 kidney Anatomy 0.000 description 1
- 208000022013 kidney Wilms tumor Diseases 0.000 description 1
- 201000010982 kidney cancer Diseases 0.000 description 1
- 230000002147 killing effect Effects 0.000 description 1
- 210000002415 kinetochore Anatomy 0.000 description 1
- 231100000225 lethality Toxicity 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 208000012987 lip and oral cavity carcinoma Diseases 0.000 description 1
- 235000019421 lipase Nutrition 0.000 description 1
- 230000029226 lipidation Effects 0.000 description 1
- 206010024627 liposarcoma Diseases 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 201000007270 liver cancer Diseases 0.000 description 1
- 208000014018 liver neoplasm Diseases 0.000 description 1
- 238000011068 loading method Methods 0.000 description 1
- 210000004072 lung Anatomy 0.000 description 1
- 201000005202 lung cancer Diseases 0.000 description 1
- 208000020816 lung neoplasm Diseases 0.000 description 1
- 201000011649 lymphoblastic lymphoma Diseases 0.000 description 1
- 210000004698 lymphocyte Anatomy 0.000 description 1
- 229920002521 macromolecule Polymers 0.000 description 1
- 210000002540 macrophage Anatomy 0.000 description 1
- 238000002595 magnetic resonance imaging Methods 0.000 description 1
- 230000005389 magnetism Effects 0.000 description 1
- 230000003211 malignant effect Effects 0.000 description 1
- 208000015486 malignant pancreatic neoplasm Diseases 0.000 description 1
- 208000027202 mammary Paget disease Diseases 0.000 description 1
- 230000008774 maternal effect Effects 0.000 description 1
- 230000013011 mating Effects 0.000 description 1
- 208000023356 medullary thyroid gland carcinoma Diseases 0.000 description 1
- 210000002752 melanocyte Anatomy 0.000 description 1
- 201000001441 melanoma Diseases 0.000 description 1
- 208000030159 metabolic disease Diseases 0.000 description 1
- 230000037353 metabolic pathway Effects 0.000 description 1
- 230000009401 metastasis Effects 0.000 description 1
- MYWUZJCMWCOHBA-VIFPVBQESA-N methamphetamine Chemical compound CN[C@@H](C)CC1=CC=CC=C1 MYWUZJCMWCOHBA-VIFPVBQESA-N 0.000 description 1
- 239000002679 microRNA Substances 0.000 description 1
- 108091007426 microRNA precursor Proteins 0.000 description 1
- 210000000274 microglia Anatomy 0.000 description 1
- 206010063344 microscopic polyangiitis Diseases 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 210000001616 monocyte Anatomy 0.000 description 1
- 210000005087 mononuclear cell Anatomy 0.000 description 1
- 201000006417 multiple sclerosis Diseases 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 230000000869 mutational effect Effects 0.000 description 1
- 206010028417 myasthenia gravis Diseases 0.000 description 1
- NFVJNJQRWPQVOA-UHFFFAOYSA-N n-[2-chloro-5-(trifluoromethyl)phenyl]-2-[3-(4-ethyl-5-ethylsulfanyl-1,2,4-triazol-3-yl)piperidin-1-yl]acetamide Chemical compound CCN1C(SCC)=NN=C1C1CN(CC(=O)NC=2C(=CC=C(C=2)C(F)(F)F)Cl)CCC1 NFVJNJQRWPQVOA-UHFFFAOYSA-N 0.000 description 1
- YOHYSYJDKVYCJI-UHFFFAOYSA-N n-[3-[[6-[3-(trifluoromethyl)anilino]pyrimidin-4-yl]amino]phenyl]cyclopropanecarboxamide Chemical compound FC(F)(F)C1=CC=CC(NC=2N=CN=C(NC=3C=C(NC(=O)C4CC4)C=CC=3)C=2)=C1 YOHYSYJDKVYCJI-UHFFFAOYSA-N 0.000 description 1
- 108010069768 negative elongation factor Proteins 0.000 description 1
- 229960004927 neomycin Drugs 0.000 description 1
- 201000008383 nephritis Diseases 0.000 description 1
- 208000009928 nephrosis Diseases 0.000 description 1
- 231100001027 nephrosis Toxicity 0.000 description 1
- 210000005155 neural progenitor cell Anatomy 0.000 description 1
- 210000004498 neuroglial cell Anatomy 0.000 description 1
- 230000014511 neuron projection development Effects 0.000 description 1
- 239000002858 neurotransmitter agent Substances 0.000 description 1
- 239000003900 neurotrophic factor Substances 0.000 description 1
- 231100000956 nontoxicity Toxicity 0.000 description 1
- 238000011330 nucleic acid test Methods 0.000 description 1
- 239000002777 nucleoside Substances 0.000 description 1
- 125000003835 nucleoside group Chemical group 0.000 description 1
- 210000004248 oligodendroglia Anatomy 0.000 description 1
- 238000010915 one-step procedure Methods 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 210000004789 organ system Anatomy 0.000 description 1
- 210000003463 organelle Anatomy 0.000 description 1
- 201000008968 osteosarcoma Diseases 0.000 description 1
- 210000001672 ovary Anatomy 0.000 description 1
- 210000000496 pancreas Anatomy 0.000 description 1
- 201000002528 pancreatic cancer Diseases 0.000 description 1
- 208000008443 pancreatic carcinoma Diseases 0.000 description 1
- 229920002866 paraformaldehyde Polymers 0.000 description 1
- 230000008186 parthenogenesis Effects 0.000 description 1
- 230000008506 pathogenesis Effects 0.000 description 1
- 230000001575 pathological effect Effects 0.000 description 1
- 230000007170 pathology Effects 0.000 description 1
- 239000013610 patient sample Substances 0.000 description 1
- 201000001976 pemphigus vulgaris Diseases 0.000 description 1
- 239000000137 peptide hydrolase inhibitor Substances 0.000 description 1
- 208000030613 peripheral artery disease Diseases 0.000 description 1
- 230000002688 persistence Effects 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 1
- 150000003013 phosphoric acid derivatives Chemical class 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 230000000865 phosphorylative effect Effects 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 208000030761 polycystic kidney disease Diseases 0.000 description 1
- 230000003234 polygenic effect Effects 0.000 description 1
- 208000005987 polymyositis Diseases 0.000 description 1
- 239000013641 positive control Substances 0.000 description 1
- 231100000683 possible toxicity Toxicity 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 229960003387 progesterone Drugs 0.000 description 1
- 239000000186 progesterone Substances 0.000 description 1
- 230000002062 proliferating effect Effects 0.000 description 1
- 230000002035 prolonged effect Effects 0.000 description 1
- 230000004952 protein activity Effects 0.000 description 1
- 208000007153 proteostasis deficiencies Diseases 0.000 description 1
- PTJWIQPHWPFNBW-GBNDHIKLSA-N pseudouridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1C1=CNC(=O)NC1=O PTJWIQPHWPFNBW-GBNDHIKLSA-N 0.000 description 1
- 208000020016 psychiatric disease Diseases 0.000 description 1
- 208000005069 pulmonary fibrosis Diseases 0.000 description 1
- 208000002815 pulmonary hypertension Diseases 0.000 description 1
- 201000003651 pulmonary sarcoidosis Diseases 0.000 description 1
- 102000016914 ras Proteins Human genes 0.000 description 1
- 101150007867 rbfox2 gene Proteins 0.000 description 1
- 101150069431 rbr-2 gene Proteins 0.000 description 1
- 238000011536 re-plating Methods 0.000 description 1
- 239000011541 reaction mixture Substances 0.000 description 1
- 208000002574 reactive arthritis Diseases 0.000 description 1
- 206010038038 rectal cancer Diseases 0.000 description 1
- 201000001275 rectum cancer Diseases 0.000 description 1
- 230000008263 repair mechanism Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 230000003362 replicative effect Effects 0.000 description 1
- 230000001718 repressive effect Effects 0.000 description 1
- 230000001850 reproductive effect Effects 0.000 description 1
- 210000002345 respiratory system Anatomy 0.000 description 1
- 230000028617 response to DNA damage stimulus Effects 0.000 description 1
- 201000009410 rhabdomyosarcoma Diseases 0.000 description 1
- 206010039073 rheumatoid arthritis Diseases 0.000 description 1
- 108020004418 ribosomal RNA Proteins 0.000 description 1
- 210000003705 ribosome Anatomy 0.000 description 1
- DWRXFEITVBNRMK-JXOAFFINSA-N ribothymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 DWRXFEITVBNRMK-JXOAFFINSA-N 0.000 description 1
- 229940101201 ringl Drugs 0.000 description 1
- 210000003296 saliva Anatomy 0.000 description 1
- 229910052594 sapphire Inorganic materials 0.000 description 1
- 239000010980 sapphire Substances 0.000 description 1
- 201000000306 sarcoidosis Diseases 0.000 description 1
- 201000000980 schizophrenia Diseases 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 210000000717 sertoli cell Anatomy 0.000 description 1
- 210000002966 serum Anatomy 0.000 description 1
- 210000003765 sex chromosome Anatomy 0.000 description 1
- 230000001568 sexual effect Effects 0.000 description 1
- 101150022243 sgf29 gene Proteins 0.000 description 1
- 230000005783 single-strand break Effects 0.000 description 1
- 210000002363 skeletal muscle cell Anatomy 0.000 description 1
- 201000000849 skin cancer Diseases 0.000 description 1
- 108091029842 small nuclear ribonucleic acid Proteins 0.000 description 1
- 230000000391 smoking effect Effects 0.000 description 1
- 230000003381 solubilizing effect Effects 0.000 description 1
- 239000000243 solution Substances 0.000 description 1
- 238000010374 somatic cell nuclear transfer Methods 0.000 description 1
- 125000006850 spacer group Chemical group 0.000 description 1
- 210000001324 spliceosome Anatomy 0.000 description 1
- 208000017572 squamous cell neoplasm Diseases 0.000 description 1
- 108700010045 sry Genes Proteins 0.000 description 1
- 230000023895 stem cell maintenance Effects 0.000 description 1
- 238000005309 stochastic process Methods 0.000 description 1
- 210000002784 stomach Anatomy 0.000 description 1
- 201000011549 stomach cancer Diseases 0.000 description 1
- 210000002536 stromal cell Anatomy 0.000 description 1
- 238000007920 subcutaneous administration Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 239000005720 sucrose Substances 0.000 description 1
- 230000002459 sustained effect Effects 0.000 description 1
- 208000011580 syndromic disease Diseases 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
- 230000009897 systematic effect Effects 0.000 description 1
- 101150003509 tag gene Proteins 0.000 description 1
- 210000001138 tear Anatomy 0.000 description 1
- 201000003120 testicular cancer Diseases 0.000 description 1
- 210000001550 testis Anatomy 0.000 description 1
- 229960003604 testosterone Drugs 0.000 description 1
- 101150024821 tetO gene Proteins 0.000 description 1
- 229960002180 tetracycline Drugs 0.000 description 1
- 229930101283 tetracycline Natural products 0.000 description 1
- 235000019364 tetracycline Nutrition 0.000 description 1
- 150000003522 tetracyclines Chemical class 0.000 description 1
- 125000000341 threoninyl group Chemical group [H]OC([H])(C([H])([H])[H])C([H])(N([H])[H])C(*)=O 0.000 description 1
- 229960004072 thrombin Drugs 0.000 description 1
- 208000030829 thyroid gland adenocarcinoma Diseases 0.000 description 1
- 208000030901 thyroid gland follicular carcinoma Diseases 0.000 description 1
- 239000005495 thyroid hormone Substances 0.000 description 1
- 229940036555 thyroid hormone Drugs 0.000 description 1
- 206010043778 thyroiditis Diseases 0.000 description 1
- 239000011031 topaz Substances 0.000 description 1
- 229910052853 topaz Inorganic materials 0.000 description 1
- 230000000699 topical effect Effects 0.000 description 1
- 231100000419 toxicity Toxicity 0.000 description 1
- 230000001988 toxicity Effects 0.000 description 1
- 239000003053 toxin Substances 0.000 description 1
- 231100000765 toxin Toxicity 0.000 description 1
- 108700012359 toxins Proteins 0.000 description 1
- 230000002463 transducing effect Effects 0.000 description 1
- 238000010361 transduction Methods 0.000 description 1
- 230000026683 transduction Effects 0.000 description 1
- 238000011830 transgenic mouse model Methods 0.000 description 1
- 230000005945 translocation Effects 0.000 description 1
- 208000009174 transverse myelitis Diseases 0.000 description 1
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 1
- GWBUNZLLLLDXMD-UHFFFAOYSA-H tricopper;dicarbonate;dihydroxide Chemical compound [OH-].[OH-].[Cu+2].[Cu+2].[Cu+2].[O-]C([O-])=O.[O-]C([O-])=O GWBUNZLLLLDXMD-UHFFFAOYSA-H 0.000 description 1
- 208000001072 type 2 diabetes mellitus Diseases 0.000 description 1
- 241001430294 unidentified retrovirus Species 0.000 description 1
- 210000002444 unipotent stem cell Anatomy 0.000 description 1
- 210000003708 urethra Anatomy 0.000 description 1
- 210000003932 urinary bladder Anatomy 0.000 description 1
- 201000005112 urinary bladder cancer Diseases 0.000 description 1
- 230000002485 urinary effect Effects 0.000 description 1
- 210000002700 urine Anatomy 0.000 description 1
- VBEQCZHXXJYVRD-GACYYNSASA-N uroanthelone Chemical compound C([C@@H](C(=O)N[C@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CS)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O)C(C)C)[C@@H](C)O)NC(=O)[C@H](CO)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CC=1NC=NC=1)NC(=O)[C@H](CCSC)NC(=O)[C@H](CS)NC(=O)[C@@H](NC(=O)CNC(=O)CNC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CS)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)CNC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CO)NC(=O)[C@H](CO)NC(=O)[C@H]1N(CCC1)C(=O)[C@H](CS)NC(=O)CNC(=O)[C@H]1N(CCC1)C(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CC(N)=O)C(C)C)[C@@H](C)CC)C1=CC=C(O)C=C1 VBEQCZHXXJYVRD-GACYYNSASA-N 0.000 description 1
- 230000035899 viability Effects 0.000 description 1
- 108010071260 virus protein 2A Proteins 0.000 description 1
- 230000004393 visual impairment Effects 0.000 description 1
- 235000021119 whey protein Nutrition 0.000 description 1
- 238000012070 whole genome sequencing analysis Methods 0.000 description 1
- 230000002034 xenobiotic effect Effects 0.000 description 1
- 230000022814 xenobiotic metabolic process Effects 0.000 description 1
- 108091005957 yellow fluorescent proteins Proteins 0.000 description 1
- 239000011701 zinc Substances 0.000 description 1
- 229910052725 zinc Inorganic materials 0.000 description 1
- 210000004340 zona pellucida Anatomy 0.000 description 1
- DGVVWUTYPXICAM-UHFFFAOYSA-N β‐Mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/8509—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells for producing genetically modified animals, e.g. transgenic
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
- C12N15/902—Stable introduction of foreign DNA into chromosome using homologous recombination
- C12N15/907—Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K67/00—Rearing or breeding animals, not otherwise provided for; New or modified breeds of animals
- A01K67/027—New or modified breeds of vertebrates
- A01K67/0275—Genetically modified vertebrates, e.g. transgenic
- A01K67/0276—Knock-out vertebrates
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/46—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
- C07K14/47—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
- C07K14/4701—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
- C07K14/4702—Regulators; Modulating activity
- C07K14/4705—Regulators; Modulating activity stimulating, promoting or activating activity
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2207/00—Modified animals
- A01K2207/05—Animals modified by non-integrating nucleic acids, e.g. antisense, RNAi, morpholino, episomal vector, for non-therapeutic purpose
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2217/00—Genetically modified animals
- A01K2217/15—Animals comprising multiple alterations of the genome, by transgenesis or homologous recombination, e.g. obtained by cross-breeding
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2217/00—Genetically modified animals
- A01K2217/20—Animal model comprising regulated expression system
- A01K2217/206—Animal model comprising tissue-specific expression system, e.g. tissue specific expression of transgene, of Cre recombinase
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2227/00—Animals characterised by species
- A01K2227/10—Mammal
- A01K2227/105—Murine
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2267/00—Animals characterised by purpose
- A01K2267/03—Animal model, e.g. for test or diseases
- A01K2267/0393—Animal model comprising a reporter system for screening tests
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
Definitions
- mice Genetically modified mice represent a crucial tool for understanding gene function in development and disease. Mutant mice are conventionally generated by insertional mutagenesis (Copeland and Jenkins, 2010; Kool and Berns, 2009) or by gene targeting methods (Capecchi, 2005). In conventional gene targeting methods, mutations are introduced through homologous recombination in mouse embryonic stem (ES) cells. Targeted ES cells injected into wild-type blastocysts can contribute to the germline of chimeric animals, generating mice containing the targeted gene modification (Capecchi, 2005). It is costly and time-consuming to produce single gene knockout mice, and even more so to make double mutant mice.
- ES mouse embryonic stem
- DSBs induced by these site-specific nucleases can then be repaired by error-prone non-homologous end joining (NHEJ), resulting in mutant mice and rats carrying deletions or insertions at the cut site (Carbery et al., 2010; Geurts et al., 2009; Sung et al., 2013; Tesson et al., 2011). If a donor plasmid with homology to the ends flanking the DSB is co-injected, high-fidelity homologous recombination can produce animals with targeted integrations (Cui et al., 2011; Meyer et al., 2010).
- NHEJ error-prone non-homologous end joining
- ZFNs zinc finger nucleases
- TALENs Transcription activator-like effector nucleases
- CRISPR Clustered Regularly Interspaced Short Palindromic Repeats
- Cas CRISPR associated proteins
- CRISPR/Cas CRISPR/Cas system to drive both non-homologous end joining (NHEJ) based gene disruption and homology directed repair (HDR) based precise gene editing to achieve highly efficient and simultaneous targeting of multiple nucleic acid sequences in cells and nonhuman mammals.
- the invention is directed to a method of mutating one or more target nucleic acid sequences in a (one or more) stem cell or a zygote comprising introducing into the stem cell or zygote (i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to a portion of each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; and (ii) a Cas nucleic acid sequence or a variant thereof that encodes a Cas protein having nuclease activity.
- RNA ribonucleic acid
- the stem cell or zygote is maintained under conditions in which the one or more RNA sequences hybridize to the portion of each of the one or more target nucleic acid sequences, and the Cas protein cleaves each of the one or more target nucleic acid sequences upon hybridization of the one or more RNA sequences to the portion of the target nucleic acid sequence, thereby mutating one or more target nucleic acid sequences in the stem cell or zygote.
- the invention is directed to a method of producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences comprising introducing into a zygote or an embryo (i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to a portion of each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; and (ii) a Cas nucleic acid sequence or a variant thereof that encodes a Cas protein having nuclease activity.
- the zygote or the embryo is maintained under conditions in which RNA hybridizes to the portion of each of the one or more target nucleic acid sequences, and the Cas protein cleaves each of the one or more target nucleic acid sequences upon hybridization of the RNA to the portion of the target nucleic acid sequence, thereby producing an embryo having one or more mutated nucleic acid sequences.
- the embryo having one or more mutated nucleic acid sequences may be transferred into a foster nonhuman mammalian mother.
- the foster nonhuman mammalian mother is maintained under conditions in which one or more offspring carrying the one or more mutated nucleic acid sequences are produced, thereby producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences.
- the invention is directed to a method of modulating the expression and/or activity of one or more target nucleic acid sequences in one or more cells or zygotes comprising introducing into the cell or zygote (i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; (ii) a Cas nucleic acid sequence or a variant thereof that encodes the Cas protein that targets but does not cleave the target nucleic acid sequence; and (iii) an effector domain.
- Cas CRISPR associated
- the method further comprises maintaining the cell under conditions in which the one or more RNA sequences hybridize to the portion of each of the one or more target nucleic acid sequences, the Cas protein binds to each of the one or more RNA sequences and the effector domain modulates the expression and/or activity of the target nucleic acid, thereby modulating the expression and/or activity of the one or more target nucleic acid sequences in the cell or zygote.
- Figures 1A-1E show multiplexed gene targeting in mESCs.
- Figure 1A shows a schematic of the Cas9/sgRNA targeting sites in Tet1, 2, and 3.
- The sgRNA targeting sequence is underlined, and the protospacer-adjacent motif (PAM) sequence is labeled in green.
- The restriction sites at the target regions are bold and capitalized.
- Restriction enzymes used for restriction fragment length polymorphism (RFLP) and Southern blot analysis are shown, and the Southern blot probes are shown as orange boxes (SEQ ID NOs: 27-29).
- Figure 1B shows a surveyor assay for Cas9-mediated cleavage at the Tet1, 2, and 3 loci in mESCs.
- Figure 1C shows the genotyping of triple targeted mESCs; clones #51, #52, and #53 are shown.
- The upper panels in Figure 1C show the RFLP analysis. Tet1 PCR products were digested with SacI, Tet2 PCR products were digested with EcoRV, and Tet3 PCR products were digested with XhoI. The lower panel shows the Southern blot analysis.
- SacI digested genomic DNA was hybridized with a 5' probe.
- Expected fragment size: WT (wild type) 5.8 kb, TM (targeted mutation) 6.4 kb.
- For the Tet2 locus, SacI and EcoRV double digested genomic DNA was hybridized with a 3' probe.
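The RFLP readout described for Figure 1C rests on a simple rule: an allele whose restriction site survives is cut by the enzyme, while an allele whose site was disrupted at the Cas9 cut site resists digestion. The Python sketch below illustrates only that rule. The recognition sequences for SacI, EcoRV and XhoI are the standard ones; the function names, the genotype labels and the short example amplicons are hypothetical and are not taken from the patent.

```python
# Standard recognition sequences for the enzymes named in the legend.
# All three are palindromic, so scanning one strand is sufficient.
RECOGNITION_SITES = {"SacI": "GAGCTC", "EcoRV": "GATATC", "XhoI": "CTCGAG"}


def is_digested(amplicon: str, enzyme: str) -> bool:
    """Return True if the amplicon still carries the enzyme's site."""
    return RECOGNITION_SITES[enzyme] in amplicon.upper()


def rflp_call(allele_pair, enzyme: str) -> str:
    """Classify a diploid locus from the two allele amplicons (illustrative labels)."""
    resistant = sum(not is_digested(allele, enzyme) for allele in allele_pair)
    return {0: "wild type", 1: "monoallelic mutant", 2: "biallelic mutant"}[resistant]


# Hypothetical amplicons: the second one has lost one base inside the SacI site.
wt_allele = "ACGTTGAGCTCAAGGT"
indel_allele = "ACGTTGAGTCAAGGT"

print(rflp_call((wt_allele, wt_allele), "SacI"))
print(rflp_call((wt_allele, indel_allele), "SacI"))
print(rflp_call((indel_allele, indel_allele), "SacI"))
```

A real RFLP gel reports band sizes rather than sequences, so this sketch models the interpretation step only, not the assay itself.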
- Figures 2A-2F show single and double gene targeting in vivo by injection into fertilized eggs.
- Figure 2B shows genotyping of Tet2 single targeted mice.
- RFLP analysis is shown in the upper panel, and Southern blot analysis is shown in the lower panel of Figure 2B.
- Figure 2C is the sequence of both alleles of the targeted gene in Tet1 biallelic mutant mouse #2 and Tet2 biallelic mutant mouse #4 (SEQ ID NOs: 30, 32-34, 46, 47).
- Figure 2D is the genotyping of Tet1/Tet2 double mutant mice. Analysis of mice #1 to #12 is shown.
- RFLP analysis is shown in the upper panel, and Southern blot analysis is shown in the lower panel of Figure 2D.
- Figure 2E is the sequence of four mutant alleles from double mutant mouse #9 and #10 (SEQ ID NOs: 30, 33, 42, 46, 48-53). PAM sequences are labeled in red.
- Figure 2F is a picture of three-week-old double mutant mice. All RFLP and Southern digestions and probes are the same as those used in Figures 1A-1E. See also Figures 6A-6F, Tables 2 and 3.
- Figures 3A-3C show multiplexed HR-mediated genome editing in vivo.
- Figure 3A shows a schematic of the oligo targeting sites at the Tet1 and Tet2 loci (SEQ ID NOs: 54-57). The sgRNA targeting sequence is underlined, and the PAM sequence is labeled in green. The oligo targeting each gene is shown under the target site, with 2 bp changes labeled in red. Restriction enzyme sites used for RFLP analysis are bold and capitalized.
- Figure 3B is RFLP analysis of double oligo injection mice with HDR-mediated targeting at the Tet1 and Tet2 loci.
- Figure 3C shows the sequences of both alleles of Tet1 and Tet2 in mouse #5 and #7, showing simultaneous HDR-mediated targeting at one allele or two alleles of each gene, and NHEJ-mediated disruption at the other alleles (SEQ ID NOs: 30, 33, 40, 58-61). See also Figures 8A-8C.
- Figures 4A-4B show multiplexed genome editing in mES cells and mouse.
- Figure 4A is a diagram representing multiple gene targeting in mES cells.
- Figure 4B shows one step generation of mice with multiple mutations.
- Upper panel: multiple targeted mutations with random indels introduced through NHEJ.
- Lower panel: multiple predefined mutations introduced through HDR-mediated repair.
- Figures 5A-5D show single, triple, and quintuple gene targeting in mES cells.
- Figure 5A is RFLP analysis of clones from each single targeting experiment (#1 to #17 are shown).
- Figure 5B is RFLP analysis of triple gene targeted clones (#37 to #53 are shown). Tet1 PCR products were digested with SacI, Tet2 PCR products were digested with EcoRV, and Tet3 PCR products were digested with XhoI. WT control is shown in the last lane. Genotyping of clones #51, #52, and #53 is also shown in Figure 1C.
- Figure 5C is a schematic of the Cas9/sgRNA targeting sites in Sry and Uty (SEQ ID NOs: 62-64).
- The sgRNA targeting sequence is underlined, and the protospacer-adjacent motif (PAM) sequence is labeled in green.
- The restriction sites at the target regions are bold and capitalized. Restriction enzymes used for RFLP analysis are shown.
- Figure 5D is RFLP analysis of quintuple gene targeted clones (#1 to #10 are shown). Sry PCR products were digested with BsaJI, and Uty PCR products were digested with AvrII. WT control is shown in the last lane. RFLP analysis of the Tet1, 2, and 3 loci is not shown.
- Figures 5A-5D are related to Figures 1A-1E, Tables 1 and 4.
- Figures 6A-6F show one step generation of single gene mutant mice by zygote injection.
- Figure 6A is RFLP analysis of blastocysts injected with different concentrations of Cas9 mRNA and Tet1 sgRNA at 20 ng/μl. Tet1 PCR products were digested with SacI.
- Figure 6B shows commonly recovered Tet1 and Tet2 alleles resulting from MMEJ (SEQ ID NOs: 30, 33, 34, 40, 46, 52). The PAM sequence of each targeting sequence is labeled in green. Microhomology flanking the DSB is bold and underlined in the WT sequence.
- Figure 6C is RFLP analysis of eight Tet3 targeted blastocysts, demonstrating high targeting efficiency (embryos #3 and #5 failed to amplify). Tet3 PCR products were digested with XhoI.
- Figure 6D is a picture showing that some Tet3 targeted mice are smaller in size; all homozygous mutants died within one day after birth.
- Figure 6E is RFLP analysis of Tet3 single targeted newborn mice. Mouse #8 and #14 survived after birth. Sample #2 and #6 failed to amplify.
- Figure 6F shows the sequences of both Tet3 alleles of surviving Tet3 targeted mouse #14. PAM sequences are labeled in red.
- Figures 6A-6F are related to Figures 2A-2F and Table 2.
- Figures 7A-7B show off target analysis of double mutant mice.
- Figure 7A shows three potential off targets of the Tet1 sgRNA and four potential off targets of the Tet2 sgRNA (SEQ ID NOs: 66-74). The 12 bp perfectly matching seed sequence is labeled in blue, and the NGG PAM sequence is labeled in red.
- Figure 7B shows a surveyor assay of all seven potential off target loci in seven double mutant mice derived with high concentration of Cas9 mRNA (100 ng/μl) injection. WT control is included as the eighth sample. The weak cleavage activity at the Ubr1 locus is not due to an off target effect, since sequences of these PCR products show no mutations.
- Figures 7A-7B are related to Figures 2A-2F and Table 5.
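- The off target criterion of Figure 7A (a perfectly matching 12 bp seed followed by an NGG PAM) can be sketched as a simple sequence scan. This is a minimal illustration, not the procedure actually used for the figure. It assumes that the seed is the PAM-proximal 12 nt of the 20 nt targeting sequence and scans the forward strand only; the guide and genome strings and the name `seed_pam_hits` are hypothetical.

```python
import re


def seed_pam_hits(genome: str, guide: str, seed_len: int = 12) -> list[int]:
    """Return 0-based start positions of seed+NGG matches on the forward strand."""
    seed = guide.upper()[-seed_len:]  # assumed PAM-proximal end of the guide
    # A lookahead is used so that overlapping matches are also reported.
    pattern = re.compile(f"(?={re.escape(seed)}[ACGT]GG)")
    return [m.start() for m in pattern.finditer(genome.upper())]


guide = "GACCTTAGGCATCGATTACG"  # hypothetical 20 nt targeting sequence
seed = guide[-12:]

# Hypothetical genome: full target + PAM, seed + PAM, and seed without PAM.
genome = (
    "TTT" + guide + "TGG"
    + "AAAA" + "CCCCCCCC" + seed + "AGG"
    + "TT" + seed + "ATC"
)

hits = seed_pam_hits(genome, guide)
print(hits)
```

In this toy genome the first hit lies inside the intended target, the second is a seed-only match with a PAM (a candidate off target), and the third copy of the seed is skipped because no NGG follows it.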
- Figures 8A-8C show multiplexed precise HDR-mediated genome editing in vivo.
- Figure 8A is RFLP analysis of single oligo injection embryos with HDR-mediated targeting at the Tet1 and Tet2 loci.
- Figure 8B is RFLP analysis of double oligo injection embryos with multiplexed HDR-mediated targeting at both Tet1 and Tet2 loci.
- Figure 8C shows the sequences of both alleles of Tet1 and Tet2 in embryo #2, which show simultaneous HDR-mediated targeting at one allele of both genes and NHEJ-mediated gene disruption at the other allele of each gene (SEQ ID NOs: 30, 33, 53, 58, 59, 75).
- Figures 8A-8C are related to Figures 3A-3C.
- Figure 9 shows 20 bp sequences of Tet1, Tet2, Tet3, Sry and Uty and the full length sequences of the RNA sequences (SEQ ID NOs: 76-85).
- FIGS. 10A-10C: dCas9ta guided by sgRNA targeting tet binding site activates TetO promoter in HeLa cell.
- 10A Schematic of dCas9ta fusion protein generated by mutation of two amino acids of Cas9 protein and fusion to 3x VP16 minimal transactivation domain
- 10B Schematic of a TetO::tdTomato reporter system to test dCas9ta fusion protein.
- 10C Phase contrast, fluorescent microscopy and fluorescent activated cell sorting (FACS) profile of
- 11A Schematic of the experiment showing the target sites of the sgRNAs relative to Nanog locus and the NanogGFP reporter.
- Figures 12A-12D dCas9ta guided by sgRNA targeting tet binding site activates TetO promoter in NIH3T3 cell. Phase contrast, fluorescent microscopy and fluorescent activated cell sorting (FACS) profile of NIH3T3/TetO::tdTomato; EF1a::NLSM2rtTA cells transfected by pmaxGFP (12A), under dox exposure for 2 days (12B), transfected with dCas9ta without sgRNA (12C), and transfected with dCas9ta with sgRNA complementary to tet binding sites (12D).
- Figures 13A-13D Microscope pictures and FACS analysis of
- HeLa/TetO::tdTomato cells 13A
- No transfection 13B
- 13C Transfected with dCas9Cdk9 + dCas9CycT + sgTetO.
- 13D Transfected with dCas9ta + dCas9Cdk9 + dCas9CycT + sgTetO.
- Figure 14 shows the wild type (Wt) Cas9 (S. pyogenes) nucleotide sequence (SEQ ID NO: 485).
- Figure 15A Alignment of HMG box sequences of Sry proteins from different mammalian species (SEQ ID NOs: 86-91). Position 94 (shown in red) is highly conserved in different species (h: human; m: mouse; c: chimpanzee; pc: pygmy chimpanzee; g: gorilla; py: Pongo; hi: Hylobates; b: baboon; and cj: Callithrix jacchus). (From Shahid et al. BMC Medical Genetics 2010 11:131 doi: 10.1186/1471-2350-11-131).
- HMG high mobility group
- FIGS 16A-16F One step generation of the Sox2-V5 allele.
- PAM protospacer-adjacent motif
- PCR primers (SF, V5F, and SR) used for PCR genotyping are shown as red arrowheads.
- FIGS. 17A-17F One step generation of an endogenous reporter allele (SEQ ID NOs: 99 and 100).
- (17A) Schematic overview of strategy to generate a Nanog-mCherry knock-in allele.
- the sgRNA coding sequence is underlined, capitalized, and labeled in red.
- the protospacer-adjacent motif (PAM) sequence is labeled in green.
- the stop codon of Nanog is labeled in orange.
- the homologous arms of the donor vector are indicated as HA-L (2kb) and HA-R (3kb).
- the restriction enzyme used for Southern blot analysis is shown, and the Southern blot probes are shown as red boxes.
- (17B) Southern analysis of Nanog-mCherry targeted allele.
- the sgRNA coding sequence is underlined, capitalized, and labeled in red.
- the protospacer-adjacent motif (PAM) sequence is labeled in green.
- the homologous arms of the donor vector are indicated as HA-L (4.5kb) and HA-R (2kb).
- the IRES-eGFP transgene is indicated as a green box, and the PGK-Neo cassette is indicated as a grey box.
- the restriction enzyme used for Southern blot analysis is shown, and the Southern blot probes are shown as red boxes.
- Oct4-eGFP targeted blastocysts showed expression in ICM. Scale bar, 50 µm. Mouse ES cell lines derived from targeted blastocysts remain GFP positive.
- FIGS 18A-18E One step generation of a Mecp2 floxed allele.
- (18A) Schematic of the Cas9/sgRNA/oligo targeting sites in Mecp2 intron 2 and intron 3 (SEQ ID NOs: 102-106). The sgRNA coding sequence is underlined, capitalized, and labeled in red. The protospacer-adjacent motif (PAM) sequence is labeled in green. In the oligo donor sequence, the loxP site is indicated as an orange box, and the restriction site sequences are in bold and capitalized. Restriction enzymes used for RFLP and Southern blot analysis are shown, and the Southern blot probes are shown as red boxes.
- FIGS 19A-19C Integration of loxP sites at Tet1 and Tet2 loci.
- FIGS 20A-20C Characterization of Nanog-mCherry alleles.
- (20A) ES clone with mosaic expression of mCherry. The mCherry negative colony is indicated by the arrow.
- (20C) The blot was then stripped and hybridized with mCherry internal probe.
- FIGS 21A-21B Integration of loxP sites at Mecp2 intron 2 and 3.
- 21A Schematic of the Cas9/sgRNA/oligo targeting sites (SEQ ID NOs: 116-124). The sgRNA coding sequence is underlined, capitalized, and labeled in red. The protospacer-adjacent motif (PAM) sequence is labeled in green. In the oligo donor sequence, the loxP site is labeled as an orange box, and the restriction site sequence is in bold and capitalized. PCR primers used for RFLP analysis are shown as red arrows. For intron 2, two sgRNA coding sequences L1 and L2 are shown, and their corresponding oligos are named accordingly.
- PCR primers LF and LR are used to amplify the intron 2 region, while RF and RR are used to amplify the intron 3 region.
- Figures 22A-22C Analysis of Mecp2 floxed allele.
- 22A RFLP analysis detected loxP integration at intron 2 (Mecp2-L2) and intron 3 (Mecp2-R1) in mice derived from L2 and R1 double sgRNA/oligo injections. Primers LF and LR were used to amplify the intron 2 region, and RF and RR were used to amplify the intron 3 region. Mice containing loxP sites in both introns are marked by stars.
- 22B Partial chromatograph from one single sequencing file crossing both loxP sites, exon 3, and flanking intron sequences.
- 22C Partial chromatograph from sequences of Cre-mediated recombination PCR products (deletion and circular products from Fig. 18C).
- Figures 23A-23D CRISPR-on activates exogenous transgenes.
- 23A Schematic of the dCas9VP48 mediated transgene activation in HeLa cells.
- dCas9VP48 was generated by fusing dCas9 (indicated by black circle) to VP48 domain (indicated by green diamond). sgRNA complementary to rtTA binding site is indicated by small hairpin labeled sgTetO.
- (23B) dCas9VP48 activates TetO::tdTomato transgene in HeLa cells. Upper (top) panel, phase contrast picture of transfected cells; middle panel, tdTomato signal using fluorescent microscopy; bottom panel, FACS analysis of transfected cells.
- Column i, cells transfected with GFP plasmid; Column ii, cells treated with doxycycline; Column iii, cells transfected with dCas9VP48 only; Column iv, cells transfected with dCas9VP48 and sgTetO. Cells were transfected with the indicated plasmids and 48 hr later were analyzed by flow cytometry for tdTomato expression.
- dCas9VP48 Schematic of the dCas9VP48 mediated reporter activation in early mouse embryos.
- dCas9VP48, Nanog::EGFP vector, and 7 sgRNAs targeted on Nanog promoter were co-injected into mouse zygotes and cultured into blastocyst stage.
- dCas9VP48/sgRNA can activate gene in vivo. Left panel, embryos injected with dCas9VP48 and Nanog::EGFP vector; right panel, embryos injected with dCas9VP48, Nanog::EGFP vector and sgRNAs targeting Nanog promoter. Embryos two, three, and four days post-injection are shown.
- FIGS 24A-24G dCas9VP160 activated multiple endogenous genes simultaneously.
- 24A Protein architecture of dCas9VP160 compared to VP48.
- 24B Schematic of the human IL1RN promoter region. Locations of transcription start site (TSS) and start codon (ATG) are indicated. Short lines with number indicate targeting sites of the sgRNAs.
- 24C Activation of human IL1RN expression in HEK293T cells. Cells transfected with dCas9VP160 and indicated sgRNAs were analyzed by qRT-PCR 2 days later. sgTetO-mut, negative control sgRNA. Error bars show standard deviation (SD) among triplicates.
- SD standard deviation
- Figures 25A-25B Multiple exogenous and endogenous genes were simultaneously activated by CRISPR-on.
- 25A One exogenous and two endogenous genes were simultaneously activated by CRISPR-on.
- Cells transfected with dCas9VP160 and indicated sgRNAs were analyzed by qRT-PCR 2 days later.
- sgTetO-mut negative control sgRNA.
- Error bars show SD among triplicates.
- Three endogenous genes SOX2, IL1RN, and OCT4 can be simultaneously activated by dCas9VP160/sgRNAs. Cells were transfected with dCas9VP160 and indicated sgRNAs and were analyzed by qRT-PCR 2 days later. sgTetO-mut, negative control sgRNA.
- the last three sets of bars represent triple activation experiments using sgSOX2, sgOCT4 and sgIL1RN with three different ratios of sgSOX2:sgIL1RN, keeping the amount of sgOCT4 constant, as indicated by numbers above line. Error bars show SD among triplicates.
- FIGS 26A-26D CRISPR-on is specific.
- 26A A histogram showing distribution of log2 fold changes of gene expression in sample transfected with dCas9VP160/sgTetO over dCas9VP160/sgTetO-mut control.
- 26B A histogram showing distribution of log2 fold changes of gene expression in sample transfected with dCas9VP160/sgIL1RN 1-3 over dCas9VP160/sgTetO-mut control.
- the vertical line marks the fold change of the target gene IL1RN.
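- The quantity plotted in Figures 26A-26B, the log2 fold change of each gene in a sample relative to the control, can be sketched as follows. This is a minimal illustration under stated assumptions: the expression values are invented placeholders rather than data from the figures, the pseudocount of 1 is an assumption added to avoid division by zero, and the function names are hypothetical.

```python
import math
from collections import Counter


def log2_fold_changes(sample: dict[str, float], control: dict[str, float],
                      pseudocount: float = 1.0) -> dict[str, float]:
    """log2((sample + pseudocount) / (control + pseudocount)) for shared genes."""
    return {
        gene: math.log2((sample[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in sample
        if gene in control
    }


def histogram(values, bin_width: float = 1.0) -> dict[float, int]:
    """Count values per bin; each bin is labeled by its lower edge."""
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical expression values in arbitrary units.
control = {"IL1RN": 1.0, "GAPDH": 999.0, "ACTB": 499.0, "SOX2": 3.0}
sample = {"IL1RN": 63.0, "GAPDH": 999.0, "ACTB": 499.0, "SOX2": 3.0}

lfc = log2_fold_changes(sample, control)
print(lfc["IL1RN"])
print(histogram(lfc.values()))
```

With a specific activator, most genes fall in the bin around zero and the target gene appears as an isolated value far to the right, which is what the vertical line in the figure marks.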
- Figure 27 The persistence of CRISPR-on mediated transgene
- Figures 28A-28B CRISPR-on activates transgene in mouse cells.
- dCas9VP48 guided by sgRNA targeting rtTA binding site activates TetO promoter in NIH3T3/TetO::tdTomato;EF1a::rtTA-M2 cells.
- 28A Schematic of the dCas9VP48 mediated transgene activation in NIH3T3 cells.
- dCas9VP48 was generated by fusion of dCas9 to VP48 and then co-transfected with sgRNA complementary to tet binding site in NIH3T3/TetO::tdTomato;EF1a::rtTA-M2 cells.
- dCas9VP48 depends on sgRNA to bind to the target tetO promoter to activate TetO::tdTomato transgene in NIH3T3 cells.
- Cells were transfected with the indicated plasmids or sgRNAs and were analyzed by flow cytometry for tdTomato expression 48 hours later.
- Figure 29 CRISPR-on activated a single-copy transgene in ESCs.
- The indicated plasmids were transfected into a Tet-inducible MSI1 overexpression mouse embryonic stem cell (mESC) line, and the cells were analyzed by western blot for MSI1 expression 48 hours later.
- mESC mouse embryonic stem cell
- FIGS 30A-30B Tunable gene activation can be achieved by titration of sgRNA.
- FIGS 31A-31B dCas9VP48 with 6 sgRNAs failed to activate the IL1RN gene.
- 31A Schematic of the human IL1RN promoter region. Locations of transcription start site (TSS) and start codon (ATG) are indicated. Short lines with number indicate locations of sgRNAs.
- TSS transcription start site
- ATG start codon
- Cells were transfected with dCas9VP48 and six sgRNAs and 2 days later were analyzed by qRT-PCR. sgTetO-mut, negative control sgRNA. Error bars show SD among triplicates.
- Figures 32A-32C Nucleotide sequences of dCas9VP64 on pmax expression vector (SEQ ID NO: 486), dCas9VP96 on pmax expression vector (SEQ ID NO: 487), and dCas9VP160 on pmax expression vector (SEQ ID NO: 488).
- CRISPR clustered regularly interspaced short palindromic repeats
- Cas genes CRISPR associated genes
- CRISPR/Cas mediated gene editing allows the simultaneous disruption of five genes (Tet1, Tet2, Tet3, Sry, Uty - 8 alleles) in mouse embryonic stem cells (mESCs) with high efficiency.
- mESCs mouse embryonic stem cells
- Co-injection of Cas9 mRNA and single guide RNA (sgRNA) targeting Tet1 and Tet2 into zygotes generated mice with biallelic mutations in both genes with an efficiency of 80%.
- sgRNA single guide RNA
- Co-injection of Cas9 mRNA/sgRNAs with mutant oligos generated precise point mutations in target genes.
- a method described herein generates non-human mammals, e.g., mice, with biallelic mutations in 1, 2, 3, 4, 5, or more genes with an efficiency of between 20% and 95%, or even more, e.g., at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 85%, 90%, 95%, or more, e.g., up to 96%, 97%, 98%, 99%, or more.
- a method described herein generates non-human mammals, e.g., mice, with biallelic mutations in 2, 3, 4, 5, or more genes with an efficiency of at least 70%, 80%, 85%, 90%, 95%, or more, e.g., between 70% and 85%, 90%, 95%, 96%, 97%, 98%, 99%, or more.
- the invention is directed to a method of mutating or modulating one or more target nucleic acid sequences in a (one or more) stem cell or a zygote comprising introducing into the stem cell or zygote (i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is
- the stem cell or zygote is maintained under conditions in which the one or more RNA sequences hybridize to the portion of each of the one or more target nucleic acid sequences, and the Cas protein cleaves each of the one or more target nucleic acid sequences upon hybridization of the one or more RNA sequences to the portion of the target nucleic acid sequence, thereby mutating one or more target nucleic acid sequences in the stem cell or zygote.
- the stem cell or zygote into which the one or more RNA sequences and Cas nucleic acid sequence are introduced is an isolated stem cell or isolated zygote.
- the method can also further comprise introducing the stem cell or zygote into a nonhuman mammal.
- a stem cell is a pluripotent cell.
- a "pluripotent" cell has the ability to self-renew and to differentiate into cells of all three embryonic germ layers (endoderm, mesoderm and ectoderm) and, typically, has the potential to divide in vitro for a long period of time, e.g., at least 20, at least 25, or at least 30 passages, or more (e.g., up to 80 passages, or up to 1 year, or more), without losing its self-renewal and differentiation properties.
- a pluripotent cell is said to exhibit or be in a "pluripotent state”.
- a pluripotent cell line or cell culture is often characterized in that the cells can differentiate into a wide variety of cell types in vitro and in vivo.
- Cells that are able to form teratomas containing cells having characteristics of endoderm, mesoderm, and ectoderm when injected into SCID mice are considered pluripotent.
- Cells that possess ability to participate in formation of chimeras (upon injection into a blastocyst of the same species that is transferred to a suitable foster mother of the same species) that survive to term are pluripotent. If the germ line of the chimeric animal contains cells derived from the introduced cell, the cell is considered germline-competent in addition to being pluripotent.
- ES cells are examples of pluripotent cells.
- ES cells have been derived from mice, primates (including humans), and some other species.
- ES cells are often derived from cells obtained from the inner cell mass (ICM) of a vertebrate blastocyst but can also be derived from single blastomeres (e.g., removed from a morula).
- Pluripotent cells can also be obtained using somatic cell nuclear transfer in at least some species, e.g., mice and various non-human primates.
- Pluripotent cells can also be obtained using parthenogenesis, e.g., from germ cells, e.g., oocytes.
- Other pluripotent cells include embryonic carcinoma (EC) and embryonic germ (EG) cells. See, e.g., Yu J, Thomson JA, Pluripotent stem cell lines. Genes Dev. 22(15):1987-97, 2008.
- Reprogramming refers to a process that alters the differentiation state or identity of a cell.
- Induced pluripotent stem (iPS) cells are pluripotent, ES-like cells derived from somatic cells (e.g., fibroblasts, keratinocytes, hematopoietic cells, neural precursor cells) by reprogramming. Reprogramming can be performed using a variety of different methods. As used herein, "reprogramming protocol" refers to any treatment or combination of treatments that causes at least some cells to become reprogrammed.
- In some embodiments, "reprogramming protocol" refers to a set of manipulations (e.g., introduction of nucleic acid(s), e.g., vector(s), carrying particular genes) and/or culture conditions (e.g., culture in medium containing particular compounds) that generates pluripotent cells from somatic cells, e.g., in vitro.
- reprogramming factor encompasses genes, RNAs, or proteins that promote or contribute to cell reprogramming, e.g., in vitro. Many useful reprogramming factors are transcription factors.
- "Reprogramming", "reprogramming to a pluripotent state", and "reprogramming to pluripotency" refer to in vitro reprogramming methods that do not require and typically do not include nuclear or cytoplasmic transfer or cell fusion, e.g., with oocytes, embryos, germ cells, or pluripotent cells.
- Any embodiment or claim may specifically exclude compositions or methods relating to or involving nuclear or cytoplasmic transfer or cell fusion, e.g., fusion of a somatic cell with oocytes, embryos, germ cells, or pluripotent cells or transfer of a somatic cell nucleus to oocytes, embryos, germ cells, or pluripotent cells.
- Differentiated cells can be reprogrammed to a pluripotent state by overexpression of the four transcription factors Oct4, Sox2, Klf4, and c-Myc
- Fully reprogrammed induced pluripotent stem cells (iPSCs) can contribute to the three germ layers and give rise to fertile mice by tetraploid complementation (Wernig, M., et al. (2007).
- Direct cell reprogramming is a stochastic process amenable to acceleration. Nature 462, 595-601).
- the reprogramming process is characterized by widespread epigenetic changes that generate iPSCs that are functionally and molecularly similar to embryonic stem (ES) cells (Carey, B. W. et al. Reprogramming factor stoichiometry influences the epigenetic state and biological properties of induced pluripotent stem cells. Cell Stem Cell 9, 588-598, (2011)).
- ES embryonic stem
- Reprogramming somatic cells to a pluripotent state can be achieved by infecting cells with retroviruses that encode the transcription factors Oct4, Sox2, Klf4, and c-Myc (termed "OSKM factors") under control of a viral LTR.
- Oct4, Sox2 and Klf4 (“OSK factors") are also sufficient to reprogram mammalian, e.g., rodent or human, somatic cells to pluripotency.
- Other sets of reprogramming factors, e.g., Oct4, Sox2, Nanog, and Lin28 (OSNL factors), can be used to reprogram mammalian cells, e.g., rodent or human cells, with Lin28 being dispensable.
- the ectopically expressed factors induce expression of endogenous pluripotency genes such as Oct4 and Nanog. Since the retroviral vectors in iPS cells derived by this approach are silenced, maintenance of pluripotency relies on expression of such endogenous genes and establishment of an appropriate transcriptional network in the reprogrammed cells.
- reprogramming factors that are members of the same gene family may be used in place of one another in certain embodiments. For example, Klf2 and Klf5 can substitute for Klf4, Sox1 for Sox2 and N-Myc for c-Myc.
- reprogramming can be achieved using Sall4, Nanog, Esrrb, and Lin28 as reprogramming factors (SNEL factors) or using Sall4, Lin28, Esrrb, and Dppa2 (SLED factors) (Buganim Y, et al, Cell. 2012 Sep 14;150(6):1209-22).
- examples of reprogramming factors of interest for reprogramming somatic cells to pluripotency in vitro include Oct4, Sall4, Nanog, Esrrb, Lin28, Klf4, c-Myc, Dppa2, and any gene/RNA/protein that can substitute for one or more of these in a method of reprogramming somatic cells in vitro.
- Exogenous reprogramming factors may be introduced into somatic cells in any form that is capable of maintaining exogenous reprogramming factors for a period of time and at levels sufficient to activate endogenous pluripotency genes and for reprogramming of at least some of the somatic cells into which the exogenous reprogramming factors are introduced to occur.
- exogenous refers to a substance present in a cell or organism other than its native source.
- exogenous nucleic acid or “exogenous protein” refer to a nucleic acid or protein that has been introduced by a process involving the hand of man into a biological system such as a cell or organism in which it is not normally found or in which it is found in lower amounts.
- a substance will be considered exogenous if it is introduced into a cell or an ancestor of the cell that inherits the substance.
- endogenous refers to a substance that is native to the biological system.
- Somatic cells of use in aspects of the invention may be primary cells (non-immortalized cells), such as those freshly isolated from an animal, or may be derived from a cell line capable of prolonged proliferation in culture (e.g., for longer than 3 months) or indefinite proliferation (immortalized cells).
- Adult somatic cells may be obtained from individuals, e.g., human subjects, and cultured according to standard cell culture protocols available to those of ordinary skill in the art. Cells may be maintained in cell culture following their isolation from a subject. In certain embodiments, the cells are passaged once or more following their isolation from the individual (e.g., between 2-5, 5-10, 10-20, 20-50, 50-100 times, or more) prior to their use in a method of the invention.
- cells may be frozen and subsequently thawed prior to use. In some embodiments, cells will have been passaged no more than 1, 2, 5, 10, 20, or 50 times following their isolation from an individual prior to their use in a method of the invention.
- Somatic cells of use in aspects of the invention include mammalian cells, such as, for example, human cells, non-human primate cells, or rodent (e.g., mouse, rat) cells. They may be obtained by well-known methods from various organs, e.g., skin, lung, pancreas, liver, stomach, intestine, heart, breast, reproductive organs, muscle, blood, bladder, kidney, urethra and other urinary organs, etc., generally from any organ or tissue containing live somatic cells.
- Mammalian somatic cells useful in various embodiments include, for example, fibroblasts, Sertoli cells, granulosa cells, neurons, pancreatic cells, epidermal cells, epithelial cells, endothelial cells, hepatocytes, hair follicle cells, keratinocytes, hematopoietic cells, melanocytes, chondrocytes, lymphocytes (B and T lymphocytes), macrophages, monocytes, mononuclear cells, cardiac muscle cells, skeletal muscle cells, etc.
- a somatic cell is a terminally differentiated somatic cell.
- a somatic cell is a progenitor (precursor) cell, which has not terminally differentiated.
- reprogramming factors are introduced into somatic cells in the form of one or more nucleic acid sequences encoding the reprogramming factors. In some embodiments, the one or more nucleic acid sequences comprise DNA. In some embodiments, the one or more nucleic acid sequences comprise RNA. In some embodiments, the one or more nucleic acid sequences comprise a nucleic acid construct.
- the one or more nucleic acid sequences comprise a vector for delivery of the reprogramming factors into a target cell (e.g., a mammalian somatic cell, e.g., a human or mouse fibroblast cell).
- Any suitable vector may be used. Examples of suitable vectors are described by Stadtfeld and Hochedlinger (Genes Dev. 24:2239-2263, 2010, incorporated herein by reference in its entirety). Other suitable vectors are apparent to those skilled in the art.
- a vector comprises an inducible vector.
- the inducible vector is a doxycycline inducible vector (i.e., a vector that activates expression of said reprogramming factors in the presence of doxycycline in a culture medium).
- “Expression” refers to the cellular processes involved in producing RNA and proteins as applicable, for example, transcription, translation, folding, modification and processing.
- “Expression products” include RNA transcribed from a gene and polypeptides obtained by translation of mRNA transcribed from a gene.
- the inducible vector is a tamoxifen inducible vector or encodes a tamoxifen-inducible protein.
- a vector is an integrating vector that integrates into a genome of a host cell (e.g., a mammalian somatic cell).
- a vector comprises a viral vector, e.g., a retroviral vector, e.g., a lentiviral vector.
- a vector comprises an excisable vector.
- the excisable vector comprises a transposon, wherein said excisable vector is excisable from said genome by transient expression of a transposase.
- the transposon comprises a piggyBac transposon (See, e.g., Woltjen et al.
- the excisable vector comprises one or more loxP sites incorporated into said vector, wherein said vector can be excised from said genome by transient expression of a Cre recombinase (See, e.g., Kaji et al. Nature 458:771-775, 2009; Soldner et al. Cell 136:964-977, 2009, each of which is incorporated herein by reference in its entirety).
- the excisable vector comprises a floxed lentiviral vector.
- the vector does not integrate into the genome of said somatic cell.
- the vector comprises an adenoviral vector (See, e.g., Zhou and Freed. Stem Cells 27:2667-2674, 2009, the teachings of which are incorporated herein by reference).
- the vector comprises a Sendai viral vector (See, e.g., Fusaki et al. Proc Jpn Acad 85:348-362, 2009, the teachings of which are incorporated herein by reference).
- the vector comprises a plasmid.
- the vector comprises an episome (Yu et al. Science 324(5928):797-801, 2009, the teachings of which are incorporated herein by reference).
- a nucleic acid construct comprises a polycistronic vector that can transduce any combination of reprogramming factors with a goal of reducing the number of proviral integrations.
- polycistronic nucleic acid constructs, expression cassettes, and vectors that employ internal ribosomal entry sites and self-cleaving peptides and are capable of transducing any combination of reprogramming factors are described in PCT Application Publication No. WO 2009/152529, incorporated herein by reference in its entirety.
- reprogramming factors are provided by polycistronic nucleic acid constructs (e.g., expression cassettes, and vectors comprising such constructs).
- polycistronic nucleic acid constructs comprise a portion that encodes a self-cleaving peptide.
- a polycistronic nucleic acid construct comprises at least two, three, or four, coding regions, wherein the coding regions are linked to each by a nucleic acid that encodes a self-cleaving peptide so as to form a single open reading frame, and wherein the coding regions encode at least first and second reprogramming factors capable, either alone or in combination with one or more additional reprogramming factors, of reprogramming a mammalian somatic cell to pluripotency.
- the construct comprises two coding regions separated by a self-cleaving peptide.
- constructs encode a polyprotein that comprises 2, 3, or 4 reprogramming factors, separated by self-cleaving peptides.
- the construct comprises expression control element(s), e.g., a promoter, suitable to direct expression in mammalian cells, wherein the portion of the construct that encodes the polyprotein is operably linked to the expression control element(s).
- the promoter drives transcription of a polycistronic message that encodes the reprogramming factors, each reprogramming factor being linked to at least one other reprogramming factor by a self-cleaving peptide.
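- The single open reading frame arrangement described above, in which coding regions are joined by sequence encoding a self-cleaving peptide, can be sketched in Python. This is a minimal illustration under stated assumptions: the coding regions and the linker are short invented placeholders (the linker is not a real 2A-encoding sequence), and the function names are hypothetical. The sketch removes the stop codon from every coding region except the last and checks that the result stays in frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def strip_stop(cds: str) -> str:
    """Remove a terminal stop codon, if present."""
    return cds[:-3] if cds[-3:] in STOP_CODONS else cds


def build_polycistronic_orf(coding_regions: list[str], linker: str) -> str:
    """Join coding regions in frame with a linker; only the last keeps its stop."""
    parts = [c.upper() for c in coding_regions]
    if any(len(p) % 3 for p in parts) or len(linker) % 3:
        raise ValueError("every element must be a whole number of codons")
    body = [strip_stop(p) for p in parts[:-1]] + [parts[-1]]
    orf = linker.upper().join(body)
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    if any(c in STOP_CODONS for c in codons[:-1]):
        raise ValueError("internal in-frame stop codon")
    return orf


# Placeholder sequences, not real reprogramming factor or 2A sequences.
factor_a = "ATGGCCAAATAA"
factor_b = "ATGCCCGGGTGA"
placeholder_linker = "GGATCCGGA"

orf = build_polycistronic_orf([factor_a, factor_b], placeholder_linker)
print(orf, len(orf))
```

The same function accepts three or four coding regions, matching the constructs that encode a polyprotein of 2, 3, or 4 factors.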
- the promoter can be a viral promoter (e.g., a CMV promoter) or a mammalian promoter (e.g., a PGK promoter).
- the expression cassette or construct can comprise other genetic elements, e.g., to enhance expression or stability of a transcript.
- any of the foregoing constructs or expression cassettes may further include a coding region that does not encode a reprogramming factor, wherein the coding region is separated from adjacent coding region(s) by a self-cleaving peptide.
- the additional coding region encodes a selectable marker.
- the self-cleaving peptide is a viral 2A peptide. In some embodiments, the self-cleaving peptide is an aphthovirus 2A peptide.
- a construct comprises sites for a recombinase that is functional in mammalian cells, wherein the sites flank at least the portion of the construct that comprises the coding regions for the factors (i.e., one site is positioned 5' and a second site is positioned 3' to the portion of the construct that encodes the polyprotein), so that the sequence encoding the factors can be excised from the genome after reprogramming.
- the recombinase can be, e.g., Cre or Flp, where the corresponding recombinase sites are LoxP sites and Frt sites.
- the recombinase is a transposase.
- the recombinase sites need not be directly adjacent to the region encoding the polyprotein but will be positioned such that a region whose eventual removal from the genome is desired is located between the sites.
- the recombinase sites are on the 5' and 3' ends of an expression cassette. Excision may result in a residual copy of the recombinase site remaining in the genome, which in some embodiments is the only genetic change resulting from the reprogramming process.
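- The excision outcome described above, in which a flanked region is removed and a single residual recombinase site remains, can be sketched as a string operation. This is a simplified model under stated assumptions: both sites are identical and in the same orientation, the site is taken to be the commonly cited 34 bp loxP sequence, and the flanking and cassette sequences as well as the function name `excise` are hypothetical.

```python
# Commonly cited 34 bp loxP site (assumption: both copies in this orientation).
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"


def excise(genome: str, site: str = LOXP) -> tuple[str, str]:
    """Return (genome after excision, excised segment).

    The sequence between the first and last copy of `site` is removed
    together with one copy of the site, so one site stays in the genome.
    """
    first = genome.find(site)
    last = genome.rfind(site)
    if first == -1 or first == last:
        raise ValueError("need two copies of the recombinase site")
    excised = genome[first + len(site):last + len(site)]
    remaining = genome[:first + len(site)] + genome[last + len(site):]
    return remaining, excised


# Hypothetical flanks and cassette.
left, cassette, right = "GATTACA", "CCCGGGAAATTT", "TGCATGCA"
genome = left + LOXP + cassette + LOXP + right

remaining, excised = excise(genome)
print(remaining.count(LOXP))  # number of sites left in the genome
print(excised.startswith(cassette))
```

In this model the excised segment carries the cassette plus the second site, and the genome retains exactly one site at the former cassette position.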
- one or more nucleic acids for introducing reprogramming factors comprise mRNA that is translatable in a mammalian somatic cell.
- the mRNA can be introduced in vitro into somatic cells to be reprogrammed and translated by endogenous enzymes into proteins that can activate one or more endogenous pluripotency genes in the cell.
- pluripotency gene refers to a gene whose expression under normal conditions (e.g., in the absence of genetic engineering or other manipulation designed to alter gene expression) occurs in and is typically restricted to pluripotent stem cells, and is crucial for their functional identity as such.
- the polypeptide encoded by a pluripotency gene may be present as a maternal factor in the oocyte.
- the gene may be expressed by at least some cells of the embryo, e.g., throughout at least a portion of the preimplantation period and/or in germ cell precursors of the adult.
- the gene may be expressed in ES cells and/or in embryonic carcinoma cells.
- the pluripotency gene is typically substantially not expressed in somatic cell types that constitute the body of an adult animal under normal conditions (with the exception of germ cells or precursors thereof, or possibly in certain disease states such as cancer).
- the pluripotency gene may be one whose average expression level (based on RNA or protein) in ES cells is at least 50-fold or 100-fold greater than its average level in those terminally differentiated cell types present in the body of an adult mammal.
- the pluripotency gene is one that encodes multiple splice variants or isoforms of a protein, wherein one or more such variants or isoforms is expressed in at least some adult somatic cell types, while one or more other variants or isoforms is not substantially expressed in adult somatic cells under normal conditions.
- expression of the pluripotency gene is essential to maintain the viability or pluripotent state of iPSCs.
- the iPSCs are not formed, die or, in some embodiments, differentiate or cease to be pluripotent.
- the pluripotency gene is characterized in that its expression in an ES cell or iPS cell decreases (resulting in, e.g., a reduction in the average steady state level of RNA transcript and/or protein encoded by the gene by at least 50%, 60%, 70%, 80%, 90%, 95%, or more) when the cell differentiates into a terminally differentiated cell.
- Oct4 and Nanog are exemplary pluripotency genes.
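The quantitative criteria above (an ES-cell expression level at least 50-fold or 100-fold above the average in terminally differentiated cell types, and a reduction of at least 50% or more upon differentiation) can be illustrated with a small sketch. The function names, the default thresholds chosen from the listed values, and the example numbers are hypothetical and not part of the disclosure.

```python
def fold_enrichment(es_level: float, somatic_levels: list[float]) -> float:
    """Ratio of the ES-cell level to the mean level across differentiated cell types."""
    mean_somatic = sum(somatic_levels) / len(somatic_levels)
    if mean_somatic == 0:
        return float("inf")
    return es_level / mean_somatic


def percent_reduction(before: float, after: float) -> float:
    """Percentage drop in steady-state RNA or protein level upon differentiation."""
    return 100.0 * (before - after) / before


def meets_criteria(es_level: float, somatic_levels: list[float],
                   differentiated_level: float,
                   min_fold: float = 50.0, min_reduction: float = 50.0) -> bool:
    """True when both the fold-enrichment and the reduction thresholds are met."""
    return (fold_enrichment(es_level, somatic_levels) >= min_fold
            and percent_reduction(es_level, differentiated_level) >= min_reduction)
```

For example, with an ES-cell level of 1000 (arbitrary units) and differentiated-cell levels of 10, 20, and 30, the enrichment is 50-fold; a drop from 1000 to 100 upon differentiation is a 90% reduction.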
- the mRNA is in vitro transcribed mRNA. Non-limiting examples of producing in vitro transcribed mRNA are described by Warren et al. (Cell Stem Cell 7(5):618-30, 2010, Mandal PK, Rossi DJ. Nat Protoc.
- mRNA comprises a sequence encoding SV40 large T (LT).
- mRNA, e.g., in vitro transcribed mRNA, comprises a 5' cap. The cap may be wild-type or modified. Examples of suitable caps and methods of synthesizing mRNA containing such caps are apparent to those skilled in the art.
- mRNA, e.g., in vitro transcribed mRNA, comprises a polyA tail.
- Methods of adding a polyA tail to mRNA are known in the art, e.g., enzymatic addition via polyA polymerase or ligation with a suitable ligase.
- the methods provided herein can also be used to mutate or modulate one or more nucleic acids in stem cells that are present in cell compositions such as embryos, zygotes, fetuses, and post-natal mammals.
- a stem cell (e.g., an ES or iPS cell), zygote, embryo, or post-natal mammal is already genetically modified (already harbors one or more genetic modifications) prior to being subjected to the methods described herein.
- the stem cell (e.g., an ES or iPS cell), zygote, embryo, or post-natal mammal may be one into which an exogenous nucleic acid has been introduced by a process involving the hand of man (or may be descended at least in part from a cell or organism into which an exogenous nucleic acid has been introduced by a process involving the hand of man).
- the nucleic acid may for example contain a sequence that is exogenous to the cell, it may contain native sequences (i.e., sequences naturally found in the cells) but in a non-naturally occurring arrangement (e.g., a coding region linked to a promoter from a different gene), or altered versions of native sequences, etc.
- a stem cell (e.g., an ES or iPS cell), zygote, embryo, or post-natal mammal is not already genetically modified (does not already harbor one or more genetic modifications) prior to being subjected to the methods described herein.
- the stem cell, zygote, embryo, or post-natal mammal can be of vertebrate (e.g., mammalian) origin.
- the vertebrates are mammals or avians.
- primate (e.g., human), rodent (e.g., mouse, rat), canine, feline, bovine, equine, caprine, porcine, or avian (e.g., chickens, ducks, geese, turkeys) stem cells, zygotes, embryos, or post-natal mammals.
- the stem cell, zygote, embryo, or post-natal mammal is isolated (e.g., an isolated stem cell; an isolated zygote; an isolated embryo).
- a mouse stem cell, mouse zygote, mouse embryo, or mouse post-natal mammal is used.
- a rat stem cell, rat zygote, rat embryo, or rat post-natal mammal is used.
- a human stem cell, human zygote or human embryo is used.
- the invention is directed to a method of producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences comprising introducing into a zygote or an embryo (i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to a portion of each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; and (ii) a Cas nucleic acid sequence or a variant thereof that encodes a Cas protein having nuclease activity.
- the zygote or the embryo is maintained under conditions in which RNA hybridizes to the portion of each of the one or more target nucleic acid sequences, and the Cas protein cleaves each of the one or more target nucleic acid sequences upon hybridization of the RNA to the portion of the target nucleic acid sequence, thereby producing an embryo having one or more mutated nucleic acid sequences.
- the embryo having one or more mutated nucleic acid sequences may be transferred into a foster nonhuman mammalian mother.
- the foster nonhuman mammalian mother is maintained under conditions in which one or more offspring carrying the one or more mutated nucleic acid sequences are produced, thereby producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences.
- nonhuman mammals can also be produced using methods described herein and/or with conventional methods, see for example, U.S. Published Application No. 20110302665.
- a method of producing a non-human mammalian embryo can comprise injecting non-human mammalian ES cells (e.g., iPSCs) genetically modified according to an inventive method of the present invention into non-human tetraploid blastocysts and maintaining said resulting tetraploid blastocysts under conditions that result in formation of embryos, thereby producing a non-human mammalian embryo.
- said non-human mammalian cells are mouse cells and said non-human mammalian embryo is a mouse.
- said mouse cells are mutant mouse cells and are injected into said non-human tetraploid blastocysts by microinjection.
- laser-assisted micromanipulation or piezo injection is used.
- a non-human mammalian embryo comprises a mouse embryo.
- the embryo is then transferred (impregnated) into an appropriate foster mother, such as a pseudopregnant female (e.g., of the same species as the embryo).
- the foster mother is then maintained under conditions that result in development of live offspring that harbor the one or more mutations.
- Another example is the use of the tetraploid complementation assay in which cells of two mammalian embryos are combined to form a new embryo (Tam and Rossant, Development, 130:6156-6163 (2003)).
- the assay involves producing a tetraploid cell in which every chromosome exists fourfold. This is done by taking an embryo at the two-cell stage and fusing the two cells by applying an electrical current. The resulting tetraploid cell continues to divide, and all daughter cells will also be tetraploid. Such a tetraploid embryo develops normally to the blastocyst stage and will implant in the wall of the uterus.
- In the tetraploid complementation assay, a tetraploid embryo (either at the morula or blastocyst stage) is combined with normal diploid embryonic stem (ES) cells from a different organism. The embryo develops normally; the fetus is exclusively derived from the ES cells, while the extraembryonic tissues are exclusively derived from the tetraploid cells.
- Another conventional method used to produce nonhuman mammals includes pronuclear microinjection. DNA is introduced directly into the male pronucleus of a nonhuman mammal egg just after fertilization. Similar to the two-step cloning described above, the egg is implanted into a pseudopregnant female. Offspring are screened for the integrated transgene. Heterozygous offspring can be subsequently mated to generate homozygous animals.
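The last step above, mating heterozygous offspring to obtain homozygous animals, follows ordinary single-locus Mendelian expectations. The sketch below is only an illustration under that assumption (one integration site, equal viability of all genotypes); the allele labels "T" (transgene) and "t" (no transgene) and the function name are hypothetical.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> dict[str, float]:
    """Expected genotype fractions for a one-locus cross.

    Each parent is a two-character genotype, e.g. "Tt".
    Alleles within each offspring genotype are sorted so "tT" and "Tt" are pooled.
    """
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


# Heterozygote x heterozygote: one quarter of offspring are expected to be homozygous.
print(cross("Tt", "Tt"))
```

Under these assumptions, a heterozygous intercross is expected to yield homozygous transgenic, heterozygous, and non-transgenic offspring in a 1:2:1 ratio.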
- nonhuman mammals can be used in the methods described herein.
- the nonhuman mammal can be a rodent (e.g., mouse, rat, guinea pig, hamster), a nonhuman primate, a canine, a feline, a bovine, an equine, a porcine or a caprine.
- mice strains and mouse models of human disease are used in conjunction with the methods of producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences described herein.
- One of ordinary skill in the art appreciates that thousands of commercially and non-commercially available strains of laboratory mice exist for modeling human disease. Mouse models exist for diseases such as cancer, cardiovascular disease, autoimmune diseases and disorders, inflammatory diseases, diabetes (type 1 and 2), neurological diseases, and other diseases.
- Examples of commercially available research strains include, but are not limited to, 11BHSD2 Mouse, GSK3B Mouse, 129-E Mouse, HSD11B1 Mouse, AK Mouse, Immortomouse®, Athymic Nude Mouse, LCAT Mouse, B6 Albino Mouse, Lox-1 Mouse, B6C3F1 Mouse, Ly5 Mouse, B6D2F1 (BDF1) Mouse, MMP9 Mouse, BALB/c Mouse, NIH-III Nude Mouse, BALB/c Nude Mouse, NOD Mouse, NOD SCID Mouse, Black Swiss Mouse, NSE-p25 Mouse, C3H Mouse, NU/NU Nude Mouse, C57BL/6-E Mouse, PCSK9 Mouse, C57BL/6N Mouse, PGP Mouse (P-glycoprotein Deficient), CB6F1 Mouse, repTOP™ ERE-Luc Mouse, CD-1® Mouse, repTOP™ mitoIRE Mouse, CD-1® Nude Mouse, repTOP™ PPRE-Luc Mouse, CD1-E Mouse, Rip-
- mouse strains include BALB/c, C57BL/6, C57BL/10, C3H, ICR, CBA, A/J, NOD, DBA/1, DBA/2, MOLD, 129, HRS, MRL, NZB, NIH, AKR, SJL, NZW, CAST, KK, SENCAR, C57L, SAMR1, SAMP1, C57BR, and NZO.
- the method of producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences further comprises mating one or more commercially and/or non-commercially available nonhuman mammal with the nonhuman mammal carrying mutations in one or more target nucleic acid sequences produced by the methods described herein.
- the invention is also directed to nonhuman mammals produced by the methods described herein.
- one or more RNA sequences that comprise a portion that is complementary to a portion of each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein are introduced into the stem cell, zygote and/or embryo, etc.
- the RNA sequence is referred to as guide RNA (gRNA) or single guide RNA (sgRNA).
- a single RNA sequence can be complementary to one or more (e.g., all) of the target nucleic acid sequences that are being modulated or mutated.
- a single RNA is complementary to a single target nucleic acid sequence.
- multiple (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more) RNA sequences are introduced wherein each RNA sequence is complementary to (specific for) one target nucleic acid sequence.
- two or more, three or more, four or more, five or more, or six or more RNA sequences are complementary to (specific for) different parts of the same target sequence.
- two or more RNA sequences bind to different sequences of the same region (e.g. promoter) of DNA (see e.g., Figure 30A).
- a single RNA sequence is complementary to at least two or more (e.g., all) of the target nucleic acid sequences.
- the portion of the RNA sequence that is complementary to one or more of the target nucleic acid sequences and the portion of the RNA sequence that binds to Cas protein can be introduced as a single sequence or as 2 (or more) separate sequences into a cell, zygote, embryo or nonhuman animal.
- the sequence that binds to Cas protein comprises a stem-loop.
- the RNA sequence used to modify gene expression in a nonhuman mammal is a naturally occurring RNA sequence, a modified RNA sequence (e.g., a RNA sequence comprising one or more modified bases), a synthetic RNA sequence, or a combination thereof.
- a modified RNA is an RNA comprising one or more modifications (e.g., RNA comprising one or more non-standard and/or non-naturally occurring bases) to the RNA sequence (e.g., modifications to the backbone and/or sugar). Methods of modifying bases of RNA are well known in the art.
- modified bases include those contained in the nucleosides 5-methylcytidine (5mC), pseudouridine (Ψ), 5-methyluridine, 2'-O-methyluridine, 2-thiouridine, N-6 methyladenosine, hypoxanthine, dihydrouridine (D), inosine (I), and 7-methylguanosine (m7G).
- the RNA sequence is a morpholino. Morpholinos are typically synthetic molecules, of about 25 bases in length and bind to
- Morpholinos have standard nucleic acid bases, but those bases are bound to morpholine rings instead of deoxyribose rings and are linked through phosphorodiamidate groups instead of phosphates. Morpholinos do not degrade their target RNA molecules, unlike many antisense structural types (e.g., phosphorothioates, siRNA). Instead, morpholinos act by steric blocking and bind to a target sequence within a RNA and block molecules that might otherwise interact with the RNA.
- Each RNA sequence can vary in length from about 8 base pairs (bp) to about 200 bp. In some embodiments, the RNA sequence can be about 9 to about 190 bp; about 10 to about 150 bp; about 15 to about 120 bp; about 20 to about 100 bp; about 30 to about 90 bp; about 40 to about 80 bp; about 50 to about 70 bp in length.
- each target nucleic acid sequence to which each RNA sequence is complementary can also vary in size.
- the portion of each target nucleic acid sequence to which the RNA is complementary can be about 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 100 nucleotides (contiguous nucleotides) in length.
- each RNA sequence can be at least about 70%, 75%, 80%, 85%, 90%, 95%, 100%, etc. identical or similar to the portion of each target nucleic acid sequence.
- each RNA sequence is completely or partially identical or similar to each target nucleic acid sequence.
- each RNA sequence can differ from perfect complementarity to the portion of the target sequence by about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, etc. nucleotides.
- one or more RNA sequences are perfectly complementary (100%) across at least about 10 to about 25 (e.g., about 20) nucleotides of the target nucleic acid.
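The complementarity ranges above (percent match, number of positions that differ from perfect complementarity, and a perfectly complementary stretch of about 20 nucleotides) can be checked computationally. The following is a minimal sketch assuming simple Watson-Crick pairing between a guide RNA and the DNA strand it hybridizes to, with both sequences written 5' to 3'; the function names and example sequences are hypothetical.

```python
# Watson-Crick partner (DNA alphabet) for each RNA base.
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def mismatches(guide_rna: str, target_dna: str) -> int:
    """Count positions where the guide fails to pair with the target strand.

    Both sequences are given 5'->3'; pairing is antiparallel, so the
    target strand is read in reverse while walking along the guide.
    """
    if len(guide_rna) != len(target_dna):
        raise ValueError("guide and target portion must have equal length")
    return sum(RNA_TO_DNA_PAIR[g] != t
               for g, t in zip(guide_rna.upper(), reversed(target_dna.upper())))


def percent_complementarity(guide_rna: str, target_dna: str) -> float:
    """Percentage of guide positions that pair with the target strand."""
    n = len(guide_rna)
    return 100.0 * (n - mismatches(guide_rna, target_dna)) / n


guide = "GACUGACUGACUGACUGACU"            # hypothetical 20-nt guide portion
target = "AGTCAGTCAGTCAGTCAGTC"           # strand fully complementary to it
print(mismatches(guide, target), percent_complementarity(guide, target))
```

A guide that differs from perfect complementarity at one of 20 positions scores 95% in this scheme, which falls within the ranges listed above.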
- the one or more RNA sequences can further comprise one or more expression control elements.
- the RNA sequence comprises a promoter suitable to direct expression in cells, wherein the portion of the RNA sequence is operably linked to the expression control element(s).
- the promoter can be a viral promoter (e.g., a CMV promoter) or a mammalian promoter (e.g., a PGK promoter).
- the RNA sequence can comprise other genetic elements, e.g., to enhance expression or stability of a transcript.
- the additional coding region encodes a selectable marker (e.g., a reporter gene such as green fluorescent protein (GFP)).
- the one or more RNA sequences also comprise a (one or more) binding site for a (one or more) CRISPR associated (Cas) protein, and, upon hybridization of the one or more RNA sequences to the one or more target sequences, a (one or more) Cas protein or variant thereof cleaves or nicks each of the target nucleic acid sequences.
- upon hybridization of the one or more RNA sequences to the one or more target nucleic acid sequences, the Cas protein or variant thereof binds to the one or more RNA sequences and cleaves the one or more target nucleic acid sequences.
- RNA-based adaptive immune system that uses CRISPR (clustered regularly interspaced short palindromic repeat) and Cas (CRISPR-associated) proteins to detect and destroy invading viruses and plasmids (Horvath and Barrangou, Science, 327(5962): 167-170 (2010); Wiedenheft et al, Nature, 482(7385):331-338 (2012)).
- Cas proteins, CRISPR RNAs (crRNAs) and trans-activating crRNA (tracrRNA) form ribonucleoprotein complexes, which target and degrade foreign nucleic acids, guided by crRNAs (Gasiunas et al, Proc. Natl. Acad. Sci, 109(39):E2579-86 (2012); Jinek et al, Science, 337:816-821 (2012)).
- the method further comprises introducing one or more Cas nucleic acid or variant thereof into the cell, embryo, zygote, or non-human mammal.
- a Cas protein or variant thereof is introduced into the cell, embryo, zygote, or non-human mammal.
- a cell (e.g., a stem cell such as an ES or iPS cell), zygote, embryo, or animal may already harbor a nucleic acid that encodes Cas (which may be constitutive or inducible) and/or may already contain Cas protein.
- a cell or organism into which a nucleic acid encoding a Cas protein has been introduced by a process involving the hand of man.
- CRISPR associated (Cas) genes or proteins which are known in the art can be used in the methods of the invention and the choice of Cas protein will depend upon the particular conditions of the method (e.g.,
- Cas proteins include Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 and Cas10.
- the Cas nucleic acid or protein used in the methods is Cas9.
- a particular Cas protein, e.g., a particular Cas9 protein, may be selected to recognize a particular protospacer-adjacent motif (PAM) sequence.
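To illustrate how PAM recognition constrains the choice of target sites, the sketch below scans one strand of a DNA sequence for protospacers immediately followed by a PAM. The default 5'-NGG-3' motif is the PAM commonly reported for S. pyogenes Cas9 and is an assumption here, not something stated in this passage; the 20-nt protospacer length, the function name, and the example sequence are likewise illustrative.

```python
import re


def find_protospacers(dna: str, pam_regex: str = "[ACGT]GG", length: int = 20):
    """Return (start, protospacer, pam) for every site on the given strand.

    A zero-width lookahead is used so that overlapping candidate sites are
    all reported. Only the supplied strand is scanned; the reverse strand
    would need to be scanned separately.
    """
    dna = dna.upper()
    pattern = re.compile(rf"(?=([ACGT]{{{length}}})({pam_regex}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Hypothetical sequence: a 20-nt stretch followed by a TGG PAM.
print(find_protospacers("A" * 20 + "TGG" + "CCC"))
```

Passing a different `pam_regex` models a Cas protein selected for a different PAM, which is the design freedom the statement above describes.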
- a Cas protein may be obtained from a bacterium or archaeon or synthesized using known methods.
- a Cas protein may be from a Gram-positive bacterium or a Gram-negative bacterium.
- a Cas protein may be from a Streptococcus (e.g., a S. pyogenes, a S. thermophilus), a Cryptococcus, a
- nucleic acids encoding two or more different Cas proteins, or two or more Cas proteins may be introduced into a cell, zygote, embryo, or animal, e.g., to allow for recognition and modification of sites comprising the same, similar or different PAM motifs.
- the Cas protein can cleave one strand or both strands (e.g., of a double stranded target nucleic acid), or alternatively, nick one strand or both strands (e.g., of a double stranded target nucleic acid).
- a Cas9 nickase may be generated by inactivating one or more of the Cas9 nuclease domains.
- an amino acid substitution at residue 10 in the RuvC I domain of Cas9 converts the nuclease into a DNA nickase.
- the aspartate at amino acid residue 10 can be substituted with alanine (Cong et al, Science, 339:819-823).
- a catalytically inactive Cas9 protein includes a mutation at residue 10 and/or residue 840. Mutations at both residue 10 and residue 840 can create a catalytically inactive Cas9 protein, sometimes referred to herein as dCas9. For example, a Cas9 mutant carrying both D10A and H840A is catalytically inactive.
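The substitution notation used above (e.g., D10A: aspartate at position 10 replaced by alanine) and the resulting activity classes can be sketched as follows. This is a simplified illustration: the peptide is a short toy sequence with an aspartate at position 10 rather than a full Cas9 sequence, the function names are hypothetical, and treating H840A alone as a nickase is an assumption drawn from the general literature rather than from this passage.

```python
import re


def parse_substitution(code: str) -> tuple[str, int, str]:
    """Split a code such as 'D10A' into (wild-type residue, position, new residue)."""
    m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if m is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return m.group(1), int(m.group(2)), m.group(3)


def apply_substitution(protein: str, code: str) -> str:
    """Apply a 1-indexed point substitution after checking the wild-type residue."""
    wild_type, pos, new = parse_substitution(code)
    if protein[pos - 1] != wild_type:
        raise ValueError(f"expected {wild_type} at position {pos}")
    return protein[:pos - 1] + new + protein[pos:]


def classify_cas9(mutations: set[str]) -> str:
    """Classify by which of the two nuclease-inactivating substitutions are present."""
    inactivated = len(mutations & {"D10A", "H840A"})
    if inactivated == 2:
        return "catalytically inactive (dCas9)"
    if inactivated == 1:
        return "nickase"  # H840A-only case is an assumption, see lead-in
    return "nuclease"


toy_peptide = "MDKKYSIGLDIGTNS"  # toy sequence, D at position 10
print(apply_substitution(toy_peptide, "D10A"), classify_cas9({"D10A", "H840A"}))
```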
- fusions of a catalytically inactive (D10A; H840A) Cas9 protein (dCas9) tethered with all or a portion of (e.g., biologically active portion of) an (one or more) effector domain create chimeric proteins that can be guided to specific DNA sites by one or more RNA sequences (sgRNA) to modulate activity and/or expression of one or more target nucleic acids sequences (e.g., exert certain effects on transcription or chromatin organization, or bring specific kind of molecules into specific DNA loci, or act as sensor of local histone or DNA state).
- a "biologically active portion of an effector domain” is a portion that maintains the function (e.g.
- fusion of the Cas9 (e.g., dCas9) with all or a portion of one or more effector domains (e.g., transcriptional activation domains) created a chimeric protein.
- fusion of a dCas9 with one or more effector domains created a chimeric protein dCas9TA.
- the one or more effector domains are the same (e.g., VP16 transcriptional activation domains).
- the one or more effector (e.g., transcriptional activation) domains are different.
- dCas9TA is guided to specific nucleic acid sites by one or more RNA (e.g. sgRNA). In some aspects, dCas9TA is guided to specific nucleic acid sites by RNA (e.g. sgRNA) to modulate gene expression. In some aspects, all or a portion of one or more VP16 effector domains are fused with Cas9 (e.g., dCas9). In other aspects, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more VP16 effector domains (all or a biologically active portion) are fused with dCas9. In some aspects, a chimeric protein comprising a fusion of a catalytically inactive Cas to all or a portion of one or more effector domains is referred to herein as “CRISPRzyme” or “CRISPR-on”.
- fusion of Cas9 with all or a portion of one or more effector domains comprise one or more linkers.
- a “linker” is something that connects or fuses two or more effector domains (e.g., see Hermanson, Bioconjugate Techniques, 2nd Edition, which is hereby incorporated by reference in its entirety).
- a variety of linkers can be used.
- a linker comprises one or more amino acids.
- a linker comprises 2 or more amino acids.
- a linker comprises the amino acid sequence GS.
- fusion of Cas9 with two or more effector domains (e.g., VP16 core domain such as DALDDFDLDML) comprises one or more interspersed linkers (e.g., GS linkers) between the domains.
- dCas9 is fused with 3 VP16 core domains with interspersed linkers, referred to herein as dCas9VP48.
- dCas9 is fused with 4 VP16 core domains with interspersed GS linkers between the core domains, referred to herein as dCas9VP48 (SEQ ID NO: 14).
- dCas9 is fused with 6 VP16 core domains with interspersed GS linkers between the core domains, referred to herein as dCas9VP96 (SEQ ID NO: 15).
- dCas9VP160 is a fusion of dCas9 with 10 VP16 core domains with interspersed GS linkers between the core domains.
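The repeated-domain architecture described above (VP16 core domains such as DALDDFDLDML separated by GS linkers) can be written out as a short sketch. It builds only the activation-domain array, not the full fusion; the sequences in the referenced SEQ ID NOs may differ, for instance in how the array is joined to dCas9. The naming rule of 16 times the copy number is an inference from the 3-copy (VP48), 6-copy (VP96), and 10-copy (VP160) fusions, not a rule stated in the text, and the function names are hypothetical.

```python
VP16_CORE = "DALDDFDLDML"  # VP16 core domain sequence given in the text


def build_activation_array(copies: int, linker: str = "GS") -> str:
    """Concatenate VP16 core domains with a linker between adjacent copies."""
    if copies < 1:
        raise ValueError("at least one core domain is required")
    return linker.join([VP16_CORE] * copies)


def fusion_name(copies: int) -> str:
    """Name inferred from the 3-, 6- and 10-copy examples (16 x copy number)."""
    return f"dCas9VP{16 * copies}"


for n in (3, 6, 10):
    array = build_activation_array(n)
    print(fusion_name(n), len(array), array)
```

With a two-residue linker, an array of n copies is 11n + 2(n - 1) residues long, so the 3-copy array is 37 residues.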
- the invention is directed to a method of modulating the expression and/or activity of one or more target nucleic acid sequences in a cell or zygote comprising introducing into the cell or zygote (i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; (ii) a Cas nucleic acid sequence or a variant thereof that encodes the Cas protein that targets but does not cleave the target nucleic acid sequence; and (iii) an (one or more) effector domain.
- the method further comprises maintaining the cell or zygote under conditions in which the one or more RNA sequences hybridize to the portion of each of the one or more target nucleic acid sequences.
- the Cas protein binds to each of the one or more RNA sequences and the effector domain modulates the expression and/or activity of the target nucleic acid, thereby modulating the expression and/or activity of a target nucleic acid sequence.
- one or more RNA sequences, Cas nucleic acid sequences and effector domains can be introduced into a cell, zygote, embryo or non-human mammal.
- the method of modulating the expression and/or activation of one or more target nucleic acids in a cell is used to reprogram a cell's potency.
- Cells can be reprogrammed, e.g., by the methods described herein.
- the invention is directed to a method of modulating the expression and/or activity of one or more target nucleic acid sequences in a cell wherein the cell or cell's potency (e.g., totipotency, pluripotency, multipotency, oligopotency and unipotency) is reprogrammed (e.g., a differentiated cell; a non-differentiated cell).
- the method results in differentiation of a cell (e.g., a totipotent or pluripotent cell differentiates into a unipotent cell or differentiated cell).
- the method results in dedifferentiation of a cell (e.g. a differentiated cell reverts to an earlier developmental stage).
- the invention is directed to reprogramming a differentiated cell to a totipotent, pluripotent, or multipotent state.
- the method results in transdifferentiation of the cell (e.g. a fibroblast is reprogrammed to a fat cell or a fat cell is reprogrammed to a fibroblast).
- the one or more target nucleic acid sequences in a cell are overexpressed causing the cell to be reprogrammed.
- one or more transcription factors are modulated altering cell potency or dedifferentiation.
- one or more transcription factors such as Oct4, Sox2, Klf4, and c-Myc are modulated (e.g. overexpressed) in a cell (Takahashi, K. & Yamanaka, S. Cell 126, 663-676, 2006).
- the invention is directed to a method of modulating one or more target nucleic acid sequences comprising simultaneous activation of the one or more target nucleic acid sequences.
- the method of modulating one or more target nucleic acid sequences comprises adjusting the level of modulation of one or more target nucleic acid sequences by adjusting the amount (e.g. grams, milligrams, micrograms, nanograms, moles, millimoles, micromoles, nanomoles, stoichiometric amount, molar ratio) of the one or more ribonucleic acid sequences introduced into the cell or zygote (Figure 30B).
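Because the amount of RNA can be expressed either by mass or by moles, a conversion between the two is often useful when adjusting molar ratios. The sketch below uses a widely quoted approximation for single-stranded RNA (molecular weight of about 320.5 g/mol per nucleotide plus 159.0 g/mol for the 5' triphosphate); this formula, the function names, and the example values are assumptions for illustration and are not taken from the disclosure.

```python
def ssrna_molecular_weight(n_nt: int) -> float:
    """Approximate molecular weight (g/mol) of a single-stranded RNA of n_nt nucleotides."""
    return n_nt * 320.5 + 159.0


def ng_to_pmol(mass_ng: float, n_nt: int) -> float:
    """Convert a mass in nanograms to picomoles for an RNA of the given length."""
    return mass_ng * 1000.0 / ssrna_molecular_weight(n_nt)


def molar_ratio(mass_a_ng: float, len_a: int, mass_b_ng: float, len_b: int) -> float:
    """Molar ratio of RNA A to RNA B given their masses and lengths."""
    return ng_to_pmol(mass_a_ng, len_a) / ng_to_pmol(mass_b_ng, len_b)


# 100 ng of a hypothetical 100-nt RNA is roughly 3.1 pmol.
print(round(ng_to_pmol(100, 100), 3))
```

Equal masses of two RNAs of different lengths do not give equal molar amounts, which is why the molar ratio may be the more informative quantity when comparing levels of modulation across targets.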
- the level of modulation of one target nucleic acid sequence is the same or different compared to the level of modulation of another target nucleic acid sequence in the same cell or zygote (Figure 25B).
- multiple target nucleic acid sequences are modulated (e.g. multiplexed activation).
- the invention is directed to (e.g., a composition comprising, consisting essentially of, consisting of) a nucleic acid sequence that encodes a fusion protein (chimeric protein) comprising all or a portion of a Cas protein fused to all or a portion of an effector domain.
- the invention is directed to (e.g., a composition comprising, consisting essentially of, consisting of) a fusion protein comprising all or a portion of a Cas protein fused to all or a portion of an effector domain.
- all or a portion of the Cas protein has endonuclease activity (e.g., can cleave and/or nick a target nucleic acid sequence) and/or targeting activity.
- all or a portion of the Cas protein targets but does not cleave a nucleic acid sequence.
- the Cas protein can be fused to the N-terminus or C-terminus of the effector domain.
- the portion of the effector domain modulates the expression and/or activation of a target nucleic acid sequence (e.g., gene).
- nucleic acid sequence encoding the fusion protein and/or the fusion protein are isolated.
- a “substantially pure and isolated” nucleic acid sequence is one that is separated from nucleic acids that normally flank the gene or nucleotide sequence (as in genomic sequences) and/or has been completely or partially purified from other transcribed sequences (e.g., as in an RNA or cDNA library).
- an isolated nucleic acid of the invention may be substantially isolated with respect to the complex cellular milieu in which it naturally occurs, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized.
- an “isolated,” “substantially pure,” or “substantially pure and isolated” protein is one that is separated from or substantially isolated with respect to the complex cellular milieu in which it naturally occurs, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized.
- the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system, or reagent mix.
- the material may be purified to essential homogeneity, for example, as determined by agarose gel electrophoresis or column
- an isolated nucleic acid molecule comprises at least about 50%, 80%, 90%, 95%, 98% or 99% (on a molar basis) of all macromolecular species present.
- “Modulate” is used consistently with its use in the art, i.e., meaning to cause or facilitate a qualitative or quantitative change, alteration, or modification in a process, pathway, or phenomenon of interest. Without limitation, such change may be an increase, decrease, or change in relative strength or activity of different components or branches of the process, pathway, or phenomenon.
- a “modulator” is an agent that causes or facilitates a qualitative or quantitative change, alteration, or modification in a process, pathway, or phenomenon of interest.
- modulating (“modulates”; “modulation”) the expression and/or activity of a target nucleic acid sequence refers to any of a variety of alterations to the expression and/or activation of the one or more target nucleic acid sequences.
- the method of modulating the expression and/or activity of the one or more target nucleic acid sequences includes activating, increasing, decreasing, coactivating, regulating, repressing, organizing, remodeling, modifying, and/or fusing the expression and/or activity of one or more target nucleic acid sequences.
- the one or more RNA sequences can be complementary to all or a portion of any of a variety of target nucleic acid sequences that are to be modulated.
- the method of modulating one or more target nucleic acid sequences comprises introducing one or more RNA sequences that are complementary to all or a portion of a (one or more) regulatory region, an open reading frame (ORF; a splicing factor), an intronic sequence, a chromosomal region (e.g., telomere, centromere) of the one or more target nucleic acid sequences into a cell.
- the target nucleic acid sequence is all or a portion of a plasmid or linear double stranded DNA (dsDNA).
- the regulatory region targeted by the one or more target nucleic acid sequences is a promoter, enhancer, and/or operator region.
- all or a portion of the regulatory region is targeted by the one or more target nucleic acid sequences.
- the regulatory region targeted by the one or more target nucleic acid sequences is exactly or within about 25 bases, 50 bases, 100 bases, 200 bases, 300 bases, 400 bases, 500 bases, 600 bases, 700 bases, 800 bases, 900 bases, 1000 bases, 1500 bases, 2000 bases, or more upstream of the one or more genes (e.g., endogenous genes; exogenous genes) or a (one or more) transcription start site (TSS).
- the one or more target nucleic acid sequences is exactly or within about 25 bases, 50 bases, 100 bases, 200 bases, 300 bases, 400 bases, 500 bases, 600 bases, 700 bases, 800 bases, 900 bases, 1000 bases, 1500 bases, 2000 bases, or more downstream of the one or more genes (e.g., endogenous genes; exogenous genes) or a TSS.
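The upstream and downstream distance windows above can be evaluated with a strand-aware offset calculation. This is a minimal sketch assuming genomic coordinates on a single chromosome, with "upstream" defined relative to the direction of transcription; the function names, default window sizes, and coordinates are hypothetical.

```python
def offset_from_tss(site: int, tss: int, strand: str = "+") -> int:
    """Signed distance of a target site from the TSS.

    Negative values are upstream of the TSS and positive values are
    downstream, taking the transcribed strand into account.
    """
    if strand == "+":
        return site - tss
    if strand == "-":
        return tss - site
    raise ValueError("strand must be '+' or '-'")


def within_window(site: int, tss: int, strand: str = "+",
                  upstream: int = 1000, downstream: int = 0) -> bool:
    """True if the site lies within the given number of bases of the TSS."""
    return -upstream <= offset_from_tss(site, tss, strand) <= downstream


# A site 100 bases upstream of a plus-strand TSS at position 1000.
print(offset_from_tss(900, 1000, "+"), within_window(900, 1000, upstream=200))
```

For a gene on the minus strand, a site at a higher coordinate than the TSS is upstream, which is why the sign is flipped for that strand.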
- the regulatory region targeted by one or more target nucleic acid sequences can be entirely or partially found at or about the 5' end of the gene (e.g., endogenous or exogenous) or a TSS.
- the 5' end of a gene can include untranscribed (flanking) regions (e.g., all or a portion of a promoter) and a portion of the transcribed region.
- a "regulatory region” is any segment of a nucleic acid sequence capable of modulating (e.g. increasing, decreasing) expression and/or activity of one or more target nucleic acid sequences (e.g. genes).
- regulatory regions include a promoter, enhancer, telomere, locus control region, insulator, centromere, repeat sequence, transposable element, synthetic sequence, and operator.
- Specific examples of regulatory regions include CAAT box, CCAAT box, Pribnow box, TATA box, SECIS element, polyadenylation signals, A-box, Z-box, C-box, E-box, and/or G-box.
- the method of modulating one or more target nucleic acid sequences comprises introducing a Cas nucleic acid sequence or a variant thereof that encodes the Cas protein that targets but does not cleave the target nucleic acid sequence into the cell.
- a Cas protein or variant thereof is introduced into the cell.
- the Cas nucleic acid sequence encodes a Cas protein that does not have endonuclease activity.
- the Cas nucleic acid sequence encodes a Cas protein that does not have nickase activity.
- the Cas nucleic acid sequence encodes a Cas protein that does not have endonuclease and nickase activity.
- the Cas nucleic acid sequence encodes a Cas protein that does not have enzymatic activity or is catalytically inactive.
- the method of modulating one or more target nucleic acid sequences comprises introducing a Cas nucleic acid sequence or a variant thereof that encodes a Cas9 protein.
- the Cas nucleic acid sequence encodes a Cas9 protein that comprises one or more mutations.
- the Cas nucleic acid sequence encodes a Cas9 protein that comprises a mutation at amino acid position 10, 840, or a combination thereof.
- the Cas nucleic acid sequence encodes a Cas9 protein wherein the amino acid at position 10 is mutated from aspartate (D) to alanine (A) and the amino acid at position 840 is mutated from histidine (H) to alanine (A).
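The two substitutions named above (D10A and H840A) can be written down as position-based edits to a protein sequence. The following is a minimal sketch only: the helper names are hypothetical, the toy sequence is a placeholder and not a real Cas9 sequence, and positions are assumed to be 1-based as in the text.

```python
def apply_substitutions(protein, substitutions):
    """Return a copy of `protein` with 1-based point substitutions applied.

    `substitutions` maps a 1-based position to (expected_residue, new_residue).
    A ValueError is raised if the expected residue is not at that position.
    """
    residues = list(protein)
    for position, (expected, replacement) in substitutions.items():
        index = position - 1  # convert 1-based position to 0-based index
        if residues[index] != expected:
            raise ValueError(
                f"expected {expected} at position {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Substitutions named in the text: aspartate 10 -> alanine, histidine 840 -> alanine.
DCAS9_SUBSTITUTIONS = {10: ("D", "A"), 840: ("H", "A")}


def make_toy_protein(length=850):
    """Placeholder sequence (NOT real Cas9): glycine everywhere except the two sites."""
    residues = ["G"] * length
    residues[9] = "D"    # position 10
    residues[839] = "H"  # position 840
    return "".join(residues)
```

Checking the expected residue before substituting guards against applying the edit to a sequence with different numbering (for example, a Cas9 from another species).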
- the method of modulating one or more target nucleic acid sequences also comprises introducing one or more effector domains.
- an "effector domain" is a molecule (e.g., protein) that modulates the expression and/or activation of a target nucleic acid sequence (e.g., gene).
- the effector domain targets one or both alleles of a gene.
- the effector domain can be introduced as a nucleic acid sequence and/or as a protein.
- the effector domain can be a constitutive or an inducible effector domain.
- a Cas nucleic acid sequence or variant thereof and an effector domain nucleic acid sequence are introduced into the cell as a chimeric sequence.
- the effector domain is fused to a molecule that associates with (e.g., binds to) Cas protein (e.g., the effector molecule is fused to an antibody or antigen binding fragment thereof that binds to Cas protein).
- a Cas protein or variant thereof and an effector domain are fused or tethered creating a chimeric protein and are introduced into the cell as the chimeric protein.
- the Cas protein and effector domain bind as a protein-protein interaction.
- the Cas protein and effector domain are covalently linked.
- the effector domain associates non-covalently with the Cas protein.
- a Cas nucleic acid sequence and an effector domain nucleic acid sequence are introduced as separate sequences and/or proteins. In some aspects, the Cas protein and effector domain are not fused or tethered.
- effector domains include a transcription(al) activating domain (e.g., VP16, VP48, VP64, VP96 and VP160), a coactivator domain, a transcription factor, a transcriptional pause release factor domain, a negative regulator of transcriptional elongation domain, a transcriptional repressor domain, a chromatin organizer domain, a remodeler domain, a histone modifier domain, a DNA modification domain, an RNA binding domain, a protein interaction input device domain (Grunberg and Serrano, Nucleic Acids Research, 38(8):2663-2675 (2010)), and a protein interaction output device domain (Grunberg and Serrano, Nucleic Acids Research, 38(8):2663-2675 (2010)).
- a "protein interaction input device" and a "protein interaction output device" each refer to a protein-protein interaction (PPI).
- the PPI is regulatable, e.g., by a small molecule or by light.
- binding partners are targeted to different sites in the genome using the inactive Cas protein. The binding partners interact, thereby bringing the targeted loci into proximity.
- a protein interaction output device is a system for detecting/monitoring occurrence of a PPI, generally by producing a detectable signal when the PPI occurs (e.g., by reconstituting a fluorescent protein) or by triggering specific cellular responses (e.g., by reconstituting a caspase protein to induce apoptosis).
- the idea in this context is to target different sites in the genome with the components of the "output device". If the interaction occurs, the "output device" generates a signal. This can be used to determine or monitor the proximity of the targeted loci.
- cells are treated with an agent and the effect of the agent on the cell is determined.
- effector domains include histone marks readers/interactors
- the effector domain is a VP16 effector domain. In some aspects, the effector domain is a VP48 effector domain. In some aspects, the effector domain is a VP64 effector domain. In some aspects, the effector domain is a VP96 effector domain. In some aspects, the effector domain is a VP160 effector domain.
- the Cas9 can be fused to a single copy or to multiple/tandem copies of full-length or partial-length effectors.
- Other fusions can be with split (functionally complementary) versions of the effector domains.
- Effector domains for use in the methods include any one of the following classes of proteins: proteins that mediate drug inducible looping of DNA and/or contacts of genomic loci, proteins that aid in the three-dimensional proximity of genomic loci bound by dCas9 with different sgRNA.
- transcription activators or coactivators include VP16, tandem copies comprising all or a biologically active portion of the activation peptide from VP16 (e.g., minimal transactivation domain), such as
- ADALDDFDLDMLP SEQ ID NO: 125
- DALDDFDLDML SEQ ID NO: 126
- VP48, e.g., 3 copies of VP16 minimal transactivation domain
- VP64, e.g., 4 copies of VP16 minimal TA
- VP96, e.g., 6 copies of VP16 minimal TA
- VP160, e.g., 10 copies of VP16 minimal TA
- Brd4 p65.
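The tandem activators listed above differ only in how many copies of the VP16 minimal transactivation domain they carry. A minimal sketch of that relationship follows; the function name and the optional linker argument are hypothetical, and whether any linker is used between copies is an assumption not stated in the text.

```python
VP16_MINIMAL_TA = "DALDDFDLDML"  # SEQ ID NO: 126 as listed in the text

# Copy numbers as stated in the text.
TANDEM_COPIES = {"VP48": 3, "VP64": 4, "VP96": 6, "VP160": 10}


def tandem_activation_domain(name, linker=""):
    """Join N copies of the minimal VP16 domain, optionally separated by a linker.

    `linker` is a hypothetical parameter; the default joins copies directly.
    """
    copies = TANDEM_COPIES[name]
    return linker.join([VP16_MINIMAL_TA] * copies)
```

For example, `tandem_activation_domain("VP64")` returns four back-to-back copies of the 11-residue peptide, and each name encodes 16 times its copy number (3 x 16 = 48, 4 x 16 = 64, and so on).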
- a specific example of a transcription factor is MYC.
- transcriptional pause release factors include proteins in the PTEFb complex, such as Cyclin Tl, Cyclin T2, Cyclin T3, Cdk9.
- negative regulators of transcriptional elongation include negative elongation factor (NELF) components.
- transcriptional repressors include engrailed (EnR), KRAB, Sin3 -interaction domain (SID) and EMSY.
- chromatin organizers and remodelers include insulator proteins, such as CTCF (transcriptional repressor CTCF or CCCTC-binding factor) to disrupt interactions between enhancers and promoters, cohesin complex and mediator complex Med1 to activate gene expression, switch/sucrose nonfermentable (SWI/SNF) complex (INI1, BAF155b, BAF170, BRG1, hBRM) to open up chromatin, and polycomb repressive complex to induce repressive domains on chromatin.
- histone modifiers include histone acetyltransferases such as p300/EP300 (p300HAT), CBP/CREBBP (CBPHAT), MGEA5, CDYL, CLOCK, ELP3, GTF3C4, KAT2A, KAT2B, KAT5, MYST2, MYST3, MYST4, HAT1, NAT10, NCOA1, NCOA3, MYST1, CDY1B, CDY1; histone methyltransferases such as SET7, PRMT1, PRMT2, PRMT5, PRMT6, PRMT7, PRMT8, G9a, CARM1, MLL, Set2/SET1A, Ash2, Wdr5, Rbbp5, EZH1, EZH2, MLL2, MLL3, MLL4, MLL5, WHSC1L1, PRDM9, SETD1A, SETD1B, SETD2, SETD7, SETD8, SETDB1, SETDB2, SETMAR, SUV39H1, SUV39H2,
- DNA modifiers include 5hmC conversion from 5mC such as Tet1 (Tet1CD); DNA demethylation by Tet1, ACID A, MBD4, Apobec1, Apobec2, Apobec3, Tdg, Gadd45a, Gadd45b, ROS1; DNA methylation by Dnmt1, Dnmt3a, Dnmt3b, CpG Methyltransferase M.SssI, and/or M.EcoHK31I.
- RNA binding domains to bring RNA molecules to specific genomic loci include Rbfox2, CUG-BP, MBNL1, MBNL2, MBNL3, MS2 coat protein (MS2 hairpin), and engineered Pumilio.
- histone marks readers/interactors include Sgf29, BPTF, C17orf49/BAP18, GATAD1, TRRAP, PHF8, N-PAC, MSH-6, NSD1, NSD2, CBX1, CBX3, CBX5, CDYL, and CDYL2.
- DNA modification readers/interactors include MeCP2, MBD1, MBD2, MBD3, MBD4, ZBTB4, ZBTB33, ZBTB38, UHRF1, and UHRF2.
- the method of modulating one or more target nucleic acid sequences in a cell can further comprise introducing an effector molecule.
- an "effector molecule" is a molecule (e.g., nucleic acid sequence; protein; organic molecule; inorganic molecule; small molecule) or physical trigger that associates with (e.g., binds to; specifically binds to) the effector domain to modulate the expression and/or activity of a target nucleic acid sequence (e.g., an inducer molecule; a trigger molecule).
- the effector molecule is an antibiotic or derivatives/variants thereof.
- the antibiotic is doxycycline.
- One of ordinary skill in the art can appreciate other types of antibiotics used, including but not limited to, tetracycline, ampicillin, puromycin, and neomycin.
- the effector molecule is rapamycin, tamoxifen and/or derivatives/variants thereof (e.g., (Z)-4-hydroxytamoxifen).
- the effector molecule can also associate with one or more domains (e.g., binding domains) that are fused to or associated with the effector domain.
- the effector domain can be fused to or associated with a receptor domain and/or an antigen binding domain, and the effector molecule (e.g., a ligand specific to the receptor domain; an antibody specific to the antigen binding domain) can bind to the receptor domain and/or antigen binding domain which activates the effector domain, thereby modulating the expression and/or activity of the one or more target nucleic acid sequences.
- the method can further comprise introducing other molecules or factors into the cell to facilitate modulation of the activation and/or expression of the target nucleic acid sequence.
- molecules include coactivators, chromatin remodelers, histone acetylases, deacetylases, kinases, and methylases.
- the methods described herein can also be used to silence expression of a nucleic acid sequence (e.g., a gene) by guiding a repressor to a target nucleic acid sequence.
- target nucleic acid sequences can be mutated or modulated using the methods described herein and will depend upon the desired results.
- the target nucleic acid sequence is a gene sequence.
- the methods described herein can be used to genetically modify two or more different genes in the same gene family, two or more genes that have a redundant function (e.g., redundant may mean that one needs to inactivate at least two of the genes to produce a particular phenotype, e.g., a detectable phenotype), two or more genes of which at least one gene does not or is believed not to produce detectable phenotype when inactivated (e.g., in the strain background used), two or more genes at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or more identical, two or more copies of the same gene, two or more genes in same biological pathway (e.g., signaling pathway, metabolic pathway), two or more genes that share at least one biological activity and/or act
- the target nucleic acid sequence is associated with a disease or condition (e.g., see van der Weyden et al, Genome Biol, 12:224 (2011)).
- genetic modifications of interest include modifying sequence(s), e.g., gene(s) to match sequence in different species (e.g., change mouse sequence to human sequence for any gene(s) of interest), alter sites of potential or known post-translational modification of proteins (e.g.,
- mutating a cell or nonhuman mammal to insert an epitope tag or transgene at an endogenous locus make a reporter mouse, introduce loxP sites or FlpRT sites flanking certain genomic regions, and/or insert a cassette (e.g., a loxP-stop-loxP or FRT-stop-FRT cassette) in front of a gene to produce conditional alleles (e.g., see Frese and Tuveson, Nature Rev, 7:645-658 (2007); Nern et al., PNAS, 108(34):14198-14203 (2011); Freidal et al., Meth Molec Biol, 693:205-231 (2011)).
- one copy of the one or more target nucleic acid sequences is mutated. In some aspects, both copies of one or more of the target nucleic acid sequences in the stem cell or zygote are mutated. In some aspects, the one or more target nucleic acid sequences that are mutated are endogenous to the stem cell or zygote.
- At least two of the target nucleic acid sequences are endogenous nucleic acid sequences. In some aspects, at least two of the target nucleic acid sequences are exogenous nucleic acid sequences. In some aspects where there are at least two target nucleic acid sequences, at least one of the target nucleic acid sequences is an endogenous nucleic acid sequence and at least one of the target nucleic acid sequences is an exogenous nucleic acid sequence. In some aspects, at least two of the target nucleic acid sequences are endogenous genes. In some aspects, at least two of the target nucleic acid sequences are exogenous genes.
- At least one of the target nucleic acid sequences is an endogenous gene and at least one of the target nucleic acid sequences is an exogenous gene. In some aspects, at least two of the target nucleic acid sequences are at least 1 kB apart. In some aspects, at least two of the target nucleic acid sequences are on different chromosomes.
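Whether a set of targets meets the spacing conditions above (at least 1 kB apart, or on different chromosomes) can be checked pairwise from genomic coordinates. The sketch below is illustrative only: the `Target` type, the function name, and the reading of "1 kB" as 1000 base pairs between single-coordinate sites are assumptions.

```python
from itertools import combinations
from typing import NamedTuple


class Target(NamedTuple):
    name: str
    chromosome: str
    position: int  # coordinate of the target site on its chromosome


def describe_pairs(targets, min_distance=1000):
    """Label each pair of targets by chromosome and separation.

    `min_distance` defaults to 1000 bp, an assumed reading of "1 kB".
    """
    results = {}
    for a, b in combinations(targets, 2):
        if a.chromosome != b.chromosome:
            label = "different chromosomes"
        elif abs(a.position - b.position) >= min_distance:
            label = "same chromosome, at least min_distance apart"
        else:
            label = "same chromosome, closer than min_distance"
        results[(a.name, b.name)] = label
    return results
```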
- mutate refers to alteration of a sequence (a target sequence).
- a target sequence that has been mutated refers to the replacement, introduction, and/or deletion of one or more nucleotides in the target sequence.
- a target sequence has been mutated to replace one or more nucleotides in the sequence with one or more nucleotides that occur in one or more natural states of the sequence (e.g., target sequence that is mutated with respect to a wild type sequence has been mutated to replace one or more nucleotides in the sequence with one or more nucleotides that occur in a wild type sequence).
- a target sequence has been mutated to replace one or more nucleotides that occurs in one or more natural states of the sequence (wild type) with one or more other nucleotides.
- At least one mutation comprises an insertion of a tag (e.g., an epitope tag such as a V5 tag; a fluorescent tag), a transgene (e.g., a reporter gene such as p2A-mCherry, GFP), a translation initiation site (e.g., IRES sequence), a transcription initiation site (e.g., TATA box) and/or an insertion of a site recognized by a recombinase (e.g., Cre).
- at least one mutation renders expression of an endogenous gene conditional.
- the mutations comprise inserting recombination sites (e.g., loxP sites or FRT sites) flanking a selected genomic region, wherein the selected genomic region is optionally within a gene.
- the mutations can also comprise inserting a recombination-site-STOP-recombination site cassette (e.g., a loxP-STOP-loxP or FRT-STOP-FRT cassette) in a gene, between a promoter and a coding region of a gene, or in a regulatory region of a gene.
- the recombination-site-STOP-recombination site cassette is positioned so as to disrupt expression of the gene and wherein excision of the cassette by a recombinase renders the gene expressible.
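The behavior of a recombination-site-STOP-recombination-site cassette can be pictured as a string operation: recombinase action between two direct-repeat sites removes the intervening STOP segment and leaves a single site behind. The sketch below is a simplified model under stated assumptions: it uses the canonical 34 bp loxP sequence, handles only the first two sites in the same orientation, and the function name is hypothetical.

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34 bp loxP site


def excise_floxed(sequence, site=LOXP):
    """Simulate excision between the first two direct-repeat recombination sites.

    Returns the sequence with the intervening segment and one site removed,
    leaving a single site. Sequences with fewer than two sites are returned
    unchanged.
    """
    first = sequence.find(site)
    if first == -1:
        return sequence
    second = sequence.find(site, first + len(site))
    if second == -1:
        return sequence
    # Keep everything up to the first site, then resume at the second site.
    return sequence[:first] + sequence[second:]
```

In this model a promoter-loxP-STOP-loxP-coding construct becomes promoter-loxP-coding after excision, which corresponds to the cassette no longer separating the promoter from the coding region.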
- the methods provided herein provide for multiplexed genome editing in cells, embryos, zygotes and nonhuman mammals. As shown herein, cells, embryos, zygotes and nonhuman mammals carrying mutations in multiple genes can be generated in a single step. In some aspects, the methods described herein allow for the mutation of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, etc. nucleic acid sequences (e.g., genes) in a (single) cell, zygote, embryo or nonhuman mammal.
- 1 nucleic acid sequence is mutated in a (single) cell, zygote, embryo or nonhuman mammal.
- 2 nucleic acid sequences are mutated in a (single) cell, zygote, embryo or nonhuman mammal.
- 3 nucleic acid sequences are mutated in a (single) cell, zygote, embryo or nonhuman mammal.
- 4 nucleic acid sequences are mutated in a (single) cell, zygote, embryo or nonhuman mammal.
- 5 nucleic acid sequences are mutated in a (single) cell, zygote, embryo or nonhuman mammal, etc.
- the methods described herein can further comprise introducing one or more additional nucleic acid sequences that are complementary to a portion of the one or more target nucleic acid sequences cleaved by the Cas protein.
- a variety of nucleic acid sequences can be introduced, and include a single stranded
- the size of the nucleic acid sequences can vary and will depend upon the reason for introducing the nucleic acid sequence.
- the one or more nucleic acid sequences can be used to replace one or more nucleotides, introduce one or more additional nucleotides, delete one or more nucleotides or a combination thereof in the one or more target nucleic acid sequences.
- the one or more nucleic acid sequences introduce a point mutation in one or more of the target sequences.
- the one or more nucleic acid sequences replace one or more mutant nucleotides with one or more wild type nucleotides in one or more of the target sequences.
- the one or more nucleic acid sequences replace one or more wild type nucleotides with one or more (mutant) nucleotides in one or more of the target sequences.
- the one or more nucleic acids introduce a tag (e.g., a fluorescent protein such as green fluorescent protein), label and/or cleavage site.
- the nucleic acid sequence can be from about 10 nucleotides to about 5000 nucleotides, about 20 to 4500 nucleotides, about 30 to 4000 nucleotides, about 50 to 3500 nucleotides, about 60 to about 3000 nucleotides, about 70 to about 2500 nucleotides, about 80 to about 2000 nucleotides, about 90 to about 1500 nucleotides, about 100 to about 1000 nucleotides, etc.
- the nucleic acid sequence is about 10 to about 500 nucleotides.
- the nucleic acid sequence (e.g., oligonucleotide) is used to further modify (alter, edit, mutate) the cleaved target nucleic acid sequence (e.g., such oligo-mediated repair allows for precise genome editing).
- this aspect allows for genome editing; however, as shown herein, the other allele is often mutated through nonhomologous end joining (NHEJ; see Figs. 3B, 3C, and 8C).
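An oligonucleotide for this kind of repair can be thought of as the desired edit flanked by sequence matching the target on either side. The following sketch builds such a donor from a target sequence; the function names and the default arm length are hypothetical, and strand choice, cut-site position and other design considerations are not modeled.

```python
def design_donor(target, edit_start, edit_end, replacement, arm_length=30):
    """Build a donor oligonucleotide: left arm + replacement + right arm.

    `edit_start` and `edit_end` are 0-based, end-exclusive coordinates of the
    segment of `target` to be replaced. An empty `replacement` gives a
    deletion; `edit_start == edit_end` gives a pure insertion.
    `arm_length` is an illustrative default, not a value from the text.
    """
    if not 0 <= edit_start <= edit_end <= len(target):
        raise ValueError("edit coordinates fall outside the target sequence")
    left_arm = target[max(0, edit_start - arm_length):edit_start]
    right_arm = target[edit_end:edit_end + arm_length]
    return left_arm + replacement + right_arm


def apply_edit(target, edit_start, edit_end, replacement):
    """Return the target sequence as it would read after precise repair."""
    return target[:edit_start] + replacement + target[edit_end:]
```

The same two functions cover the replacement, insertion and deletion cases described above, since each is a choice of coordinates and replacement string.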
- a variety of methods can be used to introduce nucleic acid and/or protein into a stem cell, zygote, embryo, and/or mammal. Suitable methods include calcium phosphate or lipid-mediated transfection, electroporation, injection, and transduction or infection using a vector (e.g., a viral vector such as an adenoviral vector).
- the nucleic acid and/or protein is complexed with a vehicle, e.g., a cationic vehicle, that facilitates uptake of the nucleic acid and/or protein, e.g., via endocytosis.
- the method described herein can further comprise isolating the stem cell or zygote produced by the methods.
- the invention is directed to a stem cell or zygote (an isolated stem cell or zygote) produced by the methods described herein.
- the disclosure provides a clonal population of cells harboring the mutation(s), replicating cultures comprising cells harboring the mutation(s) and cells isolated from the generated animals.
- the methods described herein can further comprise crossing the generated animals with other animals harboring genetic modifications (optionally in same strain background) and/or having one or more phenotypes of interest (e.g., disease susceptibility - such as NOD mice).
- the methods may comprise modifying a stem cell, zygote, and/or animal from a strain that harbors one or more genetic modifications and/or has one or more phenotypes of interest (e.g., disease susceptibility).
- mouse strains and mouse models of human disease are used.
- One of ordinary skill in the art appreciates the thousands of commercially and non-commercially available strains of laboratory mice for modeling human disease.
- Mouse models exist for diseases such as cancer, cardiovascular disease, autoimmune disease, inflammatory disease, diabetes (type 1 and 2), neurobiological disorders, and other diseases.
- Examples of commercially available research strains include, but are not limited to, 11BHSD2 Mouse, GSK3B Mouse, 129-E Mouse, HSD11B1 Mouse, AK Mouse, Immortomouse®, Athymic Nude Mouse, LCAT Mouse, B6 Albino Mouse, Lox-1 Mouse, B6C3F1 Mouse, Ly5 Mouse, B6D2F1 (BDF1) Mouse, MMP9 Mouse, BALB/c Mouse, NIH-III Nude Mouse, BALB/c Nude Mouse, NOD Mouse, NOD SCID Mouse, Black Swiss Mouse, NSE-p25 Mouse, C3H Mouse, NU/NU Nude Mouse, C57BL/6-E Mouse, PCSK9 Mouse, C57BL/6N Mouse, PGP Mouse (P-glycoprotein Deficient), CB6F1 Mouse, repTOP™ ERE-Luc Mouse, CD-1® Mouse, repTOP™ mitoIRE Mouse, CD-1® Nude Mouse, repTOP™ PPRE-Luc Mouse, CD1-E Mouse, Rip-
- mouse strains include BALB/c, C57BL/6, C57BL/10, C3H, ICR, CBA, A/J, NOD, DBA/1, DBA/2, MOLD, 129, HRS, MRL, NZB, NIH, AKR, SJL, NZW, CAST, KK, SENCAR, C57L, SAMR1, SAMP1, C57BR, and NZO.
- the methods described herein can further comprise assessing whether the one or more target nucleic acids have been mutated and/or modulated using a variety of known methods.
- methods described herein are used to produce multiple genetic modifications in a stem cell, zygote, embryo, or animal, wherein at least one of the genetic modifications knocks out (functionally inactivates completely or partially) a gene whose knockout does not produce a detectable phenotype, and at least one of the genetic modifications is in a different gene or genomic location.
- the resulting stem cell, zygote, embryo, or animal, or a cell, zygote, embryo, or animal generated therefrom, is analyzed for the presence of one or more detectable phenotypes.
- Such methods may be used to identify genes or genomic locations that have synthetic effects (e.g., effects that are greater in degree or different in kind from the sum of the effects caused by either mutation alone).
- an effect is synthetic lethality.
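The notion of a synthetic effect "greater in degree" than the sum of the single-mutation effects can be expressed as a simple comparison. This is a sketch only: the effect values are assumed to be on a common numeric scale (for example, fractional loss of viability), the function names and tolerance are hypothetical, and effects that are "different in kind" are not captured.

```python
def interaction_score(effect_a, effect_b, effect_ab):
    """Deviation of the double-mutant effect from the additive expectation."""
    return effect_ab - (effect_a + effect_b)


def is_synthetic(effect_a, effect_b, effect_ab, tolerance=0.1):
    """True when the combined effect exceeds the sum of the single effects.

    `tolerance` is an illustrative margin for measurement noise.
    """
    return interaction_score(effect_a, effect_b, effect_ab) > tolerance
```

Synthetic lethality corresponds to the extreme case in which each single knockout has little or no effect on viability while the double knockout is lethal, e.g. `is_synthetic(0.0, 0.05, 1.0)`.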
- at least one of the genetic modifications may be conditional (e.g., the effect of the modification, such as gene knockout, only becomes manifest under certain conditions, which are typically under control of the artisan).
- animals are permitted to develop at least to post-natal stage, e.g., to adult stage.
- the appropriate conditions for the modification to produce an effect (sometimes termed "inducing conditions") are imposed, and the phenotype of the animal is subsequently analyzed. A phenotype may be compared to that of an unmodified animal or to the phenotype prior to the imposition of the inducing conditions.
- analysis may comprise any type of phenotypic analysis known in the art, e.g., examination of the structure, size, development, weight, or function, of any tissue, organ, or organ system (or the entire organism), analysis of behavior, activity of any biological pathway or process, level of any particular substance or gene product, etc.
- analysis comprises gene expression analysis, e.g., at the level of mRNA or protein.
- such analysis may comprise, e.g., use of microarrays (e.g., oligonucleotide microarrays, sometimes termed "chips"), high throughput sequencing (e.g., RNASeq), ChIP on Chip analysis, ChIPSeq analysis, etc.
- high content screening may be used, in which elements of high throughput screening may be applied to the analysis of individual cells through the use of automated microscopy and image analysis (see, e.g., Zanella et al., (2010). High content screening: seeing is believing. Trends Biotechnol. 28:237-245).
- analysis comprises quantitative analyses of components of cells such as spatio-temporal distributions of individual proteins, cytoskeletal structures, vesicles, and organelles, e.g., when contacted with test agents, e.g., chemical compounds.
- activation or inhibition of individual proteins and protein-protein interactions and/or changes in biological processes and cell functions may be assessed.
- a range of fluorescent probes for biological processes, functions, and cell components are available and may be used, e.g., with
- cells or animals generated according to methods herein may comprise a reporter, e.g., a fluorescent reporter or enzyme (e.g., a luciferase such as Gaussia, Renilla, or firefly luciferase) that, for example, reports on the expression or activity of particular genes.
- a non-invasive detection means e.g., an imaging or detection means such as PET imaging, MRI, fluorescence detection.
- Multiplexed genome editing may allow installation of reporters for detection of multiple proteins, e.g., 2 - 20 different proteins, e.g., in a cell, tissue, organ, or animal, e.g., in a living animal.
- Multiplexed genome editing according to the present invention may be useful to determine or examine the biological role(s) and/or roles in disease of genes of unknown function (e.g., genes whose complete knockout does not produce a detectable phenotype). For example, discovery of synthetic effects caused by mutations in first and second genes may pinpoint a genetic or biochemical pathway in which such gene(s) or encoded gene product(s) is involved. In some
- mutations may be generated in stem cells or zygotes from any existing knockout or deletion strain or animals produced according to methods described herein may be crossed with animals from such strain. In some embodiments one or more gain-of-function and/or loss-of-function alleles are generated.
- it is contemplated to use, in methods described herein, cells or zygotes generated in or derived from animals produced in projects such as the International Knockout Mouse Consortium (IKMC), the website of which is http://www.knockoutmouse.org.
- a mouse gene to be modified according to methods described herein is any gene from the Mouse Genome Informatics (MGI) database for which sequences and genome coordinates are available, e.g., any gene predicted by the NCBI, Ensembl, and Vega (Vertebrate Genome Annotation) pipelines for mouse Genome Build 37 (NCBI) or Genome Reference Consortium GRCm38.
- a gene or genomic location to be modified is included in genome of a species for which a fully sequenced genome exists. Genome sequences may be obtained, e.g., from the UCSC Genome Browser
- a human gene or sequence to be modified according to methods described herein may be found in Human Genome Build hg19 (Genome Reference Consortium).
- a gene is any gene for which a Gene ID has been assigned in the Gene Database of the NCBI (http://www.ncbi.nlm.nih.gov/gene).
- a gene is any gene for which a genomic, cDNA, mRNA, or encoded gene product (e.g., protein) sequence is available in a database such as any of those available at the National Center for Biotechnology Information (www.ncbi.nih.gov) or Universal Protein Resource (www.uniprot.org).
- Databases include, e.g., GenBank, RefSeq, Gene, UniProtKB/SwissProt, UniProtKB/Trembl, and the like.
- a gene encodes a polypeptide.
- a gene may not encode a polypeptide.
- a gene may, for example, comprise a template for transcription of a functional RNA, i.e., an RNA that has at least one function other than providing a messenger RNA (mRNA) to be translated into protein.
- Examples include, e.g., long non-coding RNA (e.g., greater than 200 bases in length, e.g., 200 - 5,000 bases), small RNA (e.g., small nuclear RNA), transfer RNA, ribosomal RNA, microRNA precursor, Piwi-interacting RNAs (piRNAs), small nucleolar RNAs (snoRNAs).
- RNA is 25 bases or less, 50 bases or less, 100 bases or less, 200 bases or less in length.
- Sequences of functional RNAs are available, e.g., from databases such as miRBase (website is http://www.mirbase.org ) (Kozomara A , et al., miRBase: integrating microRNA annotation and deep-sequencing data. NAR 2011 39(Database
- lncRNAdb, the Long Non-Coding RNA Database (website is http://www.lncrnadb.org/) (Amaral PP et al. (2011) lncRNAdb: a reference database for long noncoding RNAs. Nucleic Acids Res 39: D146-151).
- a genomic sequence may be suspected of potentially comprising a template for transcription of a functional RNA.
- a genetic modification may be made in the sequence to determine whether such genetic modification alters the phenotype of a cell or animal or affects production of an RNA or protein or alters susceptibility to a disease.
- a known or suspected regulatory region e.g., a known or suspected enhancer region or a known or suspected promoter region.
- the effect on expression of one or more genes may be assessed.
- a genetic modification may be made in the sequence to determine whether such genetic modification alters the phenotype of a cell or animal or affects production of an RNA or protein or alters susceptibility to a disease.
- any method described herein may comprise isolating one or more cells, samples, or substances from an animal generated according to methods described herein, e.g., any genetically modified animal generated as described herein.
- a method may further comprise analyzing the one or more cells, samples, or substances. Such analysis may, for example, assess the effect of the genetic modification(s) introduced according to the methods.
- animals generated according to methods described herein may be useful in the identification of candidate agents for treatment of disease and/or for testing agents for potential toxicity or side effects.
- any method described herein may comprise contacting an animal generated according to methods described herein, e.g., any genetically modified animal generated as described herein, with a test agent (e.g., a small molecule, nucleic acid, polypeptide, lipid, etc.).
- contacting comprises administering the test agent.
- Administration may be by any route (e.g., oral, intravenous, intraperitoneal, gavage, topical, transdermal, intramuscular, enteral, subcutaneous), may be systemic or local, may include any dose (e.g., from about 0.01 mg/kg to about 500 mg/kg), may involve a single dose or multiple doses.
- a method may further comprise analyzing the animal. Such analysis may, for example, assess the effect of the test agent in an animal having the genetic modification(s) introduced according to the methods.
- a test agent that reduces or enhances an effect of one or more genetic modification(s) may be identified.
- the test agent may be identified as a candidate agent for treatment of a disease associated with or produced by the genetic modification(s) or associated with or produced by naturally occurring mutations in a gene or genomic location harboring the genetic modification.
- a cell may be a diseased cell or may originate from a subject suffering from a disease, e.g., a disease affecting the cell or organ from which the cell was obtained.
- a mutation is introduced into a genomic region of the iPS cell that is associated with a disease (e.g., any disease of interest, such as diseases mentioned herein).
- it is of interest to knock out or otherwise modify a gene or genomic location that is known or suspected to be involved in disease pathogenesis and/or known or suspected to be associated with increased or decreased risk of developing a disease or particular manifestation(s) of a disease.
- Multiplexed genome editing as described herein may allow for production of cells or cell lines that are isogenic except with regard to, e.g., between 2 and 20 selected sites or genetic alterations. This may allow for the study of the combined effect of multiple mutations that are suspected of or known to play a role in disease risk, development or progression.
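A panel of lines that are isogenic except at selected sites can be enumerated as every combination of reference and edited states across those sites. The sketch below is illustrative only: the function name, the site labels and the two-state ("ref"/"edited") model are assumptions, and zygosity is not modeled.

```python
from itertools import product


def enumerate_isogenic_panel(sites):
    """Yield every combination of reference/edited states across the given sites."""
    for states in product(("ref", "edited"), repeat=len(sites)):
        yield dict(zip(sites, states))
```

The panel grows as 2 to the power of the number of sites, so 3 sites give 8 genotypes while 20 sites would give 1,048,576; in practice a subset of combinations would likely be chosen.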
- CRISPRzymes can be designed to target specific chromatin loci and exert modifications (e.g., methylation or demethylation) on genes that cause disease through an aberrant chromatin state, thereby correcting the chromatin state.
- CRISPRzymes can be used to detect/sense certain sequence variations or chromatin states at defined loci guided by an sgRNA, or interactions between genomic loci guided by pairs or sets of sgRNAs, and to exert specific therapeutic outcomes dependent on the chromatin state or the interaction of genomic loci.
- split fragments of Caspase can be fused to dCas9 and only reconstitute apoptosis-inducing activity when two genomic loci targeted by specific sgRNAs are proximal due to looping under certain disease conditions or cell types, e.g., cancer stem cells, [http://www.ncbi.nlm.nih.gov/pubmed/22070901].
- CRISPRzymes can be coupled with biosensors to kill cells on detecting specific histone or DNA modifications at specific loci, e.g., DNA methylation
- CRISPR-CaspaseA binds to a genomic locus (e.g., a hypermethylated gene in cancer) guided by an sgRNA.
- MBD1-CaspaseB binds to mCpG.
- Only at that defined locus, and only when the locus is methylated, is the caspase reconstituted, triggering the killing of cancer cells but not normal cells.
- CRISPRzymes can be used to detect chromosomal translocation events resulting in fusion of DNA fragments.
- dCas9 can be fused to split fragments of a fluorescent marker or a luciferase, and sgRNAs targeting the fused genes are used, so that the reporter is reconstituted only when the two specific gene fragments are fused.
- This strategy can be used to screen for or detect subtypes of cancer cells in patient samples or biopsies at single-cell resolution. Similarly, fusion with a split caspase will allow specific killing or depletion of aberrant cells characterized by specific chromosomal translocation events.
- CRISPRzymes can be used to restore DNA looping in patients with deficient DNA looping, e.g., Cornelia de Lange patients (defects in the cohesin complex).
- CRISPRzymes can also be used in pharmaceutical and/or academic research.
- a screen can be performed using a library of sgRNA sequences in combination with a CRISPRzyme or a set of CRISPRzymes.
- the screen can be in a library format, where each sample (cells, embryos, or tissues) is treated with a known, predefined sgRNA or set of sgRNAs.
- the screen can be pooled, whereby vectors expressing different sgRNAs are mixed and introduced into the target (cells, embryos, tissues, etc.), cells with the appropriate phenotype are selected or enriched, and the sgRNAs associated with the specific phenotype are identified by sequencing.
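The sequencing readout of a pooled screen can be illustrated with a minimal sketch. It assumes that each sequencing read has already been trimmed to the sgRNA spacer, that matching is exact, and that enrichment is scored as a log2 ratio of read fractions with a pseudocount. The function names, the toy spacer sequences, and the scoring scheme are illustrative assumptions and are not taken from the description.

```python
import math
from collections import Counter


def count_sgrnas(reads, library):
    """Count reads that exactly match a spacer in the sgRNA library.

    `library` maps a hypothetical sgRNA name to its spacer sequence.
    Reads that match no spacer are ignored.
    """
    spacer_to_name = {seq: name for name, seq in library.items()}
    counts = Counter({name: 0 for name in library})
    for read in reads:
        name = spacer_to_name.get(read)
        if name is not None:
            counts[name] += 1
    return counts


def log2_enrichment(selected, unselected, pseudocount=1.0):
    """Log2 ratio of each sgRNA's read fraction, selected vs. unselected pool.

    A pseudocount is added so that sgRNAs with zero reads do not cause
    division by zero (an assumed, commonly used convention).
    """
    n = len(selected)
    total_sel = sum(selected.values()) + pseudocount * n
    total_unsel = sum(unselected.values()) + pseudocount * n
    scores = {}
    for name in selected:
        f_sel = (selected[name] + pseudocount) / total_sel
        f_unsel = (unselected[name] + pseudocount) / total_unsel
        scores[name] = math.log2(f_sel / f_unsel)
    return scores


# Toy example with made-up 4-nt "spacers" (real spacers are longer).
library = {"sgA": "AAAA", "sgB": "CCCC"}
before = count_sgrnas(["AAAA", "CCCC", "AAAA", "CCCC", "GGGG"], library)
after = count_sgrnas(["AAAA", "AAAA", "AAAA", "CCCC"], library)
scores = log2_enrichment(after, before)
```

A positive score suggests that cells carrying that sgRNA were enriched by the selection; a negative score suggests depletion.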
- CRISPRzymes can be used to elicit chromatin state changes, or transcriptional activation of a specific gene or specific sets of genes, in somatic cells, adult stem cells or embryonic stem cells to induce them to reprogram into pluripotent states, to differentiate or to transdifferentiate.
- methods described herein may be used to produce non-human mammals that have a mutation in the SRY (sex determining region Y) gene.
- the SRY gene is an intronless gene located on the Y chromosome in therian mammals that encodes a transcription factor that is a member of the SOX (SRY-like box) gene family of DNA-binding proteins. Since a functional Sry protein is required for male development, a mammal that has an X and Y chromosome, wherein the Y chromosome harbors a loss-of-function mutation in SRY, is an anatomic female. An anatomic female may be recognized, e.g., by the presence of a uterus and ovaries and the absence of testes.
- the CRISPR/Cas system may be used to generate mutations in SRY, e.g., in a stem cell, zygote, or embryo.
- a target nucleic acid sequence mutated according to methods described herein is the SRY gene or a portion thereof.
- the mutation is a loss-of-function mutation.
- the loss-of-function mutation is a deletion of part or all of the SRY gene.
- the mutation, e.g., deletion, is in a portion of the gene that is essential for its function.
- a mutation is in the portion of the SRY gene that encodes the high mobility group (HMG) DNA binding domain of Sry, termed the HMG box.
- HMG box (Nasrin, Nature, 354, 317-320 (1991)).
- the HMG box extends from amino acid 58 to amino acid 137 of Sry.
- the corresponding sequences in other species are immediately evident upon aligning the Sry protein sequences with the human sequence (see, e.g., Fig. 15A).
- the Sry HMG box extends from amino acid 3 to amino acid 82.
- the HMG domain is essential for the function of SRY proteins.
- mice were generated using transcription activator-like effector nuclease (TALEN) technology to mutate the Sry gene in mouse ES cells.
- Two pairs of TALENs were generated to target the high mobility group (HMG) DNA binding domain of Sry and were transfected into mouse
- TALEN pairs 1 and 2 showed gene modification efficiencies of 15% and 20%, respectively, based on a Surveyor assay.
- the deletions ranged in size from 11 to 540 bp (Wang, H., supra).
- Three of the generated deletions are depicted schematically in Figure 15B.
- the TALEN cleavage site lies midway between the binding sites of the TALEN2 pair, as depicted in Figure 15B.
- the mutated ES cells were used to produce living mice by tetraploid
- mice were found to be anatomic females.
- insertion of a sequence encoding GFP at the same site led to sex reversal.
- Adult Sry-targeted mice (anatomic females) showed reduced fertility, but they were fertile and transmitted the Sry-mutated Y
- From the age of ~2 months, each of seven XYSry(tm1) females was housed with a single XYSry(dl1Rlb);Tg(Sry)2Ei male for 5-7 months. The result was that three XYSry(tm1) females gave birth to a total of eight litters (two eaten at birth). It has been reported that, in XY female meiosis, the X and Y chromosomes do not pair efficiently and segregate randomly, leading to sex chromosome aneuploidy in the offspring of XY females (1, 2). (a) These mice may carry either one or two X
- mice may also carry YSry(dl1Rlb).
- the portion of the SRY gene that is targeted is within or overlaps with the portion of the gene that encodes the HMG box.
- the mutation removes at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 40, 50, 100, or more nucleotides from the gene, e.g., at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 40, 50, 100, or more nucleotides from the portion of the gene that encodes the HMG box.
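As a hedged illustration of how a deletion can be checked against the HMG box coding region, the sketch below converts the amino acid range given above (58 to 137) into coding-sequence nucleotide positions and counts how many deleted bases fall inside it. It assumes 1-based inclusive coordinates counted from the first base of the start codon, and it relies on the gene being intronless so that coding positions are contiguous. The function names and the example deletions are hypothetical.

```python
def codon_range_to_cds(first_aa, last_aa):
    """Map a 1-based inclusive amino acid range to 1-based inclusive
    nucleotide positions in the coding sequence (start codon = bases 1-3)."""
    return (3 * (first_aa - 1) + 1, 3 * last_aa)


def deleted_bases_in_region(del_start, del_end, region):
    """Number of deleted bases (1-based inclusive interval) inside `region`."""
    region_start, region_end = region
    overlap = min(del_end, region_end) - max(del_start, region_start) + 1
    return max(0, overlap)


# HMG box range taken from the text: amino acids 58 to 137.
hmg_box = codon_range_to_cds(58, 137)  # bases 172 to 411, i.e. 240 bases

# Hypothetical deletions, for illustration only.
inside = deleted_bases_in_region(400, 450, hmg_box)
outside = deleted_bases_in_region(1, 100, hmg_box)
```

In this toy case the first deletion removes 12 bases of the HMG box coding region and the second removes none.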
- the mutation is in a portion of the gene upstream (5') of the region that encodes the HMG box, e.g., encoding a portion of the Sry protein that lies N-terminal to the HMG box.
- a mutation is an insertion upstream of or within the sequence that encodes the HMG box, wherein the insertion results in a frameshift or stop codon. For example, insertion of 1 or 2 nucleotides, or of a longer sequence whose length is not divisible by 3, would result in a frameshift. Insertion of a stop codon in the region located 5' of the sequence encoding the HMG box would result in a truncated and nonfunctional Sry protein.
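The effect of an insertion on the reading frame can be sketched as follows. An insertion whose length in nucleotides is not a multiple of 3 shifts the frame, while an in-frame stop codon truncates the protein. The toy coding sequence, the insertion positions, and the function names are assumptions made for illustration only; they do not represent SRY.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def causes_frameshift(insert_len):
    """An insertion shifts the reading frame unless its length is a multiple of 3."""
    return insert_len % 3 != 0


def first_stop_codon(cds):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def insert_sequence(cds, position, insert):
    """Insert `insert` before the 0-based nucleotide index `position`."""
    return cds[:position] + insert + cds[position:]


# Toy coding sequence: Met-Ala-Ala-Ala-stop.
cds = "ATGGCTGCTGCTTAA"
with_stop = insert_sequence(cds, 3, "TAG")  # in-frame stop right after the start codon
shifted = insert_sequence(cds, 3, "A")      # single-base insertion shifts the frame
```

In the toy sequence the inserted stop codon moves the first stop from codon index 4 to codon index 1, and the single-base insertion changes the frame so that the original stop codon is no longer read in frame.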
- a mutation may be located in a portion of the SRY gene that encodes a portion of Sry that is C-terminal to the HMG box.
- a mutation may be in a regulatory region, e.g., a promoter.
- a mutation may be upstream of the start codon, e.g., in a promoter.
- the SRY gene is mutated in a zygote, and the zygote is transferred to the uterus of a foster mother (e.g., a pseudopregnant female) to develop to birth. It will be understood that the zygote may be maintained in culture after mutation of the SRY gene, e.g., to an early embryonic stage (e.g., a blastocyst) and then transferred to the uterus of a foster mother. In some
- the invention provides a zygote having an X and Y chromosome, wherein the Y chromosome has an engineered mutation in the SRY gene, wherein the zygote is capable of developing to an anatomic female.
- the mammal may be any non-human mammal.
- a method comprises generating a non-human mammal that has an X chromosome and a Y chromosome (i.e., somatic cells of which contain an X and a Y chromosome).
- Methods of creating anatomic females may be useful in any context in which it is desired to reduce the number or proportion of male offspring and/or increase the number or proportion of anatomically female offspring.
- methods of generating anatomic females are useful in animal husbandry, which generally refers to the breeding and raising of non-human animals for any of a variety of purposes, e.g., for meat, as sources of animal products (e.g., milk, wool, hair, leather, skin, horn, eggs, or meat), for performing work, or providing companionship, e.g., as pets.
- the non-human mammal is allowed to develop at least until adulthood. In some embodiments the adult non-human mammal gives rise to offspring, which inherit the mutation.
- a useful product e.g., milk, wool, hair, leather, skin, horn, or meat, is obtained from the anatomically female non-human mammal.
- a non-human mammal useful in dairy farming is a cow, goat, sheep, or camel, or other non-human animal useful for the production of milk.
- a cow is of any of the following breeds: a Holstein (also referred to as Holstein-Friesian), Brown Swiss, Canadienne, Dutch Belted, Guernsey, Ayrshire, Jersey, Kerry, Milking Shorthorn, Milking Devon, or Norwegian Red.
- methods of creating anatomic females may be useful in the context of managing species at risk of extinction, e.g., in programs that attempt to maintain or increase the number of individuals of a particular species.
- a species at risk of extinction may be any species recognized as near threatened, threatened (vulnerable, endangered, or critically endangered), or extinct in the wild by the International Union for Conservation of Nature (IUCN).
- Such species are listed, e.g., on the IUCN Red List of Threatened Species (also known as the IUCN Red List or Red Data List), e.g., the 2012 version (available at the IUCN website at http://www.iucnredlist.org/).
- the population of a species at risk of extinction may be declining.
- a species, e.g., a species at risk of extinction may be, e.g., a bear, canine, caprine, elephant, feline, non-human primate, ovine, rodent, or ungulate species.
- a species, e.g., a species at risk of extinction, may be a marsupial, e.g., a Tasmanian Devil.
- methods of generating non-human mammals may comprise mutating one or more genes whose mutation results in a phenotype of interest. In some embodiments both copies of the gene are mutated.
- a phenotype of interest may be any phenotype, e.g., any property of interest.
- the non-human mammal is a source of food (e.g., milk or meat) or other products useful for humans.
- at least some humans may be allergic to a component, e.g., a protein, found in the food.
- a phenotype of interest may comprise reduced or absent production of an allergenic component, or alteration in an allergenic component so as to reduce its allergenicity.
- the gene encoding a whey protein, e.g., the whey protein beta-lactoglobulin (BLG), a component found in the milk of cows, sheep, and a variety of other species (but not humans) that constitutes a major milk allergen
- a gene is mutated so as to remove an allergenic epitope or alter it to a non-allergenic form, e.g., by changing or deleting one or more amino acids.
- the protein may still be produced and able to fulfill its normal function but is no longer allergenic or has reduced allergenicity to humans.
- a gene is mutated so as to reduce or eliminate production of the protein.
- a mutation is insertion of a stop codon or deletion or alteration of a start codon or at least a portion of a promoter.
- a phenotype of interest may comprise any alteration that qualitatively or quantitatively alters one or more characteristics of a product that is obtained from the non-human mammal, e.g., in a way that makes the product more useful, easier to manipulate, less allergenic, or improved in any way.
- a characteristic may be color, texture, flavor, consistency, viscosity, thickness, roughness, toughness, tenderness, stringiness, fat content, protein content, sugar content, etc.
- a phenotype of interest may comprise any alteration that increases the yield of a product (e.g., on a per animal basis, per month or year basis); increases the growth rate; reduces the amount of food, resources, or care consumed or required by the animal; renders the animal more resistant to disease; renders the animal more tolerant of high or low
- a phenotype may comprise increased milk production.
- a polymorphism e.g., a single nucleotide polymorphism
- a polymorphism may be identified as being associated with a phenotype of interest using methods known in the art (e.g., genetic association studies). Methods described herein may be used to generate non-human mammals having a
- methods of generating anatomically female non-human mammals may comprise mutating one or more additional nucleic acids in addition to the SRY gene. For example, any gene the mutation of which results in a phenotype of interest (e.g., reduced allergen content), may be mutated.
- the terms "disease", "disorder", and "condition" may refer to any alteration from a state of health and/or normal functioning of an organism, e.g., an abnormality of the body or mind that causes pain, discomfort, dysfunction, distress, degeneration, or death to the individual afflicted.
- Diseases include any disease known to those of ordinary skill in the art. In some
- a disease is a chronic disease, e.g., it typically lasts or has lasted for at least 3-6 months, or more, e.g., 1, 2, 3, 5, 10 or more years, or indefinitely.
- Disease may have a characteristic set of symptoms and/or signs that occur commonly in individuals suffering from the disease.
- Diseases and methods of diagnosis and treatment thereof are described in standard medical textbooks such as Longo, D., et al. (eds.), Harrison's Principles of Internal Medicine, 18th Edition; McGraw-Hill Professional, 2011 and/or Goldman's Cecil Medicine, Saunders; 24th edition (August 5, 2011).
- a disease is a multigenic disorder (also referred to as complex, multifactorial, or polygenic disorder).
- a multigenic disorder may be any disease for which it is known or suspected that multiple genes (e.g., particular alleles of such genes, particular polymorphisms in such genes) may contribute to risk of developing the disease and/or may contribute to the way the disease manifests (e.g., its severity, age of onset, rate of progression, etc.)
- a multigenic disease is a disease that has a genetic component as shown by familial aggregation (occurs more commonly in certain families than in the general population) but does not follow Mendelian laws of inheritance, e.g., the disease does not clearly follow a dominant, recessive, X-linked, or Y-linked inheritance pattern.
- a multigenic disease is one that is not typically controlled by variants of large effect in a single gene (as is the case with Mendelian disorders).
- a multigenic disease may occur in familial form and sporadically. Examples include, e.g., Parkinson's disease, Alzheimer's disease, and various types of cancer. Examples of multigenic diseases include many common diseases such as hypertension, diabetes mellitus (e.g., type II diabetes mellitus), cardiovascular disease, cancer, and stroke (ischemic,
- a disease, e.g., a multigenic disease, is a psychiatric, neurological, neurodevelopmental, neurodegenerative, cardiovascular, or autoimmune disease, cancer, a metabolic disease, or a respiratory disease.
- at least one gene is implicated in a familial form of a multigenic disease.
- a disease is cancer, which term is generally used interchangeably to refer to a disease characterized by one or more tumors, e.g., one or more malignant or potentially malignant tumors.
- tumor as used herein encompasses abnormal growths comprising aberrantly proliferating cells.
- tumors are typically characterized by excessive cell proliferation that is not appropriately regulated (e.g., that does not respond normally to physiological influences and signals that would ordinarily constrain proliferation) and may exhibit one or more of the following properties: dysplasia (e.g., lack of normal cell differentiation, resulting in an increased number or proportion of immature cells); anaplasia (e.g., greater loss of differentiation, more loss of structural organization, cellular pleomorphism, abnormalities such as large, hyperchromatic nuclei, high nuclear:cytoplasmic ratio, atypical mitoses, etc.);
- tumor includes malignant solid tumors, e.g., carcinomas (cancers arising from epithelial cells), sarcomas (cancers arising from cells of mesenchymal origin), and malignant growths in which there may be no detectable solid tumor mass (e.g., certain hematologic malignancies).
- Cancer includes, but is not limited to: breast cancer; biliary tract cancer; bladder cancer; brain cancer (e.g., glioblastomas, medulloblastomas); cervical cancer;
- melanoma; oral cancer including squamous cell carcinoma; ovarian cancer including ovarian cancer arising from epithelial cells, stromal cells, germ cells and mesenchymal cells; neuroblastoma; pancreatic cancer; prostate cancer; rectal cancer; sarcomas including angiosarcoma, gastrointestinal stromal tumors,
- osteosarcoma; renal cancer including renal cell carcinoma and Wilms tumor; skin cancer including basal cell carcinoma and squamous cell cancer; testicular cancer including germinal tumors such as seminoma, non-seminoma (teratomas, choriocarcinomas), stromal tumors, and germ cell tumors; thyroid cancer including thyroid adenocarcinoma and medullary carcinoma.
- a cancer is one for which mutation or
- a gene is an oncogene, proto-oncogene, or tumor suppressor gene.
- oncogene encompasses nucleic acids that, when expressed, can increase the likelihood of or contribute to cancer initiation or progression. Normal cellular sequences (“proto-oncogenes”) can be activated to become oncogenes (sometimes termed "activated oncogenes”) by mutation and/or aberrant expression.
- an oncogene can comprise a complete coding sequence for a gene product or a portion that maintains at least in part the oncogenic potential of the complete sequence or a sequence that encodes a fusion protein.
- Oncogenic mutations can result, e.g., in altered (e.g., increased) protein activity, loss of proper regulation, or an alteration (e.g., an increase) in RNA or protein level.
- Aberrant expression may occur, e.g., due to chromosomal rearrangement resulting in juxtaposition to regulatory elements such as enhancers, epigenetic mechanisms, or due to amplification, and may result in an increased amount of proto-oncogene product or production in an inappropriate cell type.
- Proto-oncogenes often encode proteins that control or participate in cell proliferation, differentiation, and/or apoptosis. These proteins include, e.g., various transcription factors, chromatin remodelers, growth factors, growth factor receptors, signal transducers, and apoptosis regulators.
- a tumor suppressor gene (TSG) may be any gene wherein a loss or reduction in function of an expression product of the gene can increase the likelihood of or contribute to cancer initiation or progression. Loss or reduction in function can occur, e.g., due to mutation or epigenetic mechanisms.
- Many TSGs encode proteins that normally function to restrain or negatively regulate cell proliferation and/or to promote apoptosis.
- Exemplary oncogenes include, e.g., MYC, SRC, FOS, JUN, MYB, RAS, RAF, ABL, ALK, AKT, TRK, BCL2, WNT, HER2/NEU, EGFR, MAPK, ERK, MDM2, CDK4, GLI1, GLI2, IGF2, TP53, etc.
- Exemplary TSGs include, e.g., RB, TP53, APC, NF1, BRCA1, BRCA2, PTEN, CDK inhibitory proteins (e.g., p16, p21), PTCH, WT1, etc. It will be understood that a number of these oncogene and TSG names encompass multiple family members and that many other TSGs are known.
- any such gene may be genetically modified, e.g., to generate a cancer model, which may be used, e.g., to determine effect of particular alterations on development of cancer, to determine effect of particular alterations on efficacy of or resistance to treatment, to identify or characterize existing or potential candidate therapeutic agents, etc. Similar methods are envisioned for genes associated with other diseases.
- a disease is a cardiovascular disease, e.g., atherosclerotic heart disease or vessel disease, congestive heart failure, myocardial infarction, cerebrovascular disease, peripheral artery disease, cardiomyopathy.
- a disease is a psychiatric, neurological, or neurodevelopmental disease, e.g., schizophrenia, depression, bipolar disorder, epilepsy, autism, addiction.
- Neurodegenerative diseases include, e.g., Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, frontotemporal dementia.
- a disease is an autoimmune disease, e.g., acute disseminated encephalomyelitis, alopecia areata, antiphospholipid syndrome, autoimmune hepatitis, autoimmune myocarditis, autoimmune pancreatitis, autoimmune polyendocrine syndromes, autoimmune uveitis, inflammatory bowel disease (Crohn's disease, ulcerative colitis), type I diabetes mellitus (e.g.
- ankylosing spondylitis, sarcoid, pemphigus vulgaris, pemphigoid, psoriasis, myasthenia gravis, systemic lupus erythematosus, rheumatoid arthritis, juvenile arthritis, psoriatic arthritis, Behcet's syndrome, Reiter's disease, Berger's disease, dermatomyositis, polymyositis, antineutrophil cytoplasmic antibody-associated vasculitides (e.g., granulomatosis with polyangiitis (also known as Wegener's granulomatosis), microscopic polyangiitis, and Churg-Strauss syndrome), scleroderma, Sjogren's syndrome, anti-glomerular basement membrane disease (including Goodpasture's syndrome), dilated cardiomyopathy, primary biliary cirrhosis,
- a disease is a respiratory disease, e.g., allergy affecting the respiratory system, asthma, chronic obstructive pulmonary disease, pulmonary hypertension, pulmonary fibrosis, and sarcoidosis.
- a disease is a renal disease, e.g., polycystic kidney disease, lupus, nephropathy (nephrosis or nephritis) or glomerulonephritis (of any kind).
- a disease is vision loss or hearing loss, e.g., associated with advanced age.
- a disease is an infectious disease, e.g., any disease caused by a virus, bacteria, fungus, or parasite. In some embodiments it is of interest to modify genes that may be involved in susceptibility to the disease.
- a disease is one for which at least one genome-wide association (GWA) study (GWAS) has been performed.
- a GWAS types multiple "cases" (subjects having a disease of interest or particular manifestations thereof) and "controls" (subjects not having the disease or manifestations) for several thousand to millions, e.g., 1 million or more, e.g., 1-5 million or more, alleles (e.g., single nucleotide polymorphisms) positioned throughout the genome or a substantial portion thereof (e.g., at least 80%, 90%, 95%, or more of the genome).
- control data may be obtained from historical data.
- Genotyping may be performed using microarrays or other methods. Alleles associated (e.g., in a statistically significant manner) with increased (or decreased) risk of a disease (or particular manifestations) may thereby be identified. It will be appreciated that statistical results may be corrected for multiple hypothesis testing, e.g., using methods known in the art. In some embodiments a p value of less than about 10^-7, 10^-8, or 10^-9 is considered evidence of association. In some embodiments a gene or allele or polymorphism has been identified as contributing to disease risk or severity in at least one GWAS.
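One way to picture the case/control comparison is a per-allele test on a 2x2 table of allele counts. The sketch below is an assumption-laden illustration: it uses a Pearson chi-square test with one degree of freedom and a Bonferroni correction as one example of a multiple-testing method. The description does not specify which test or correction is used, and the counts and function names are hypothetical.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    Returns (chi2, p_value). For one degree of freedom the upper-tail
    probability equals erfc(sqrt(chi2 / 2)).
    """
    row_case = case_alt + case_ref
    row_ctrl = ctrl_alt + ctrl_ref
    col_alt = case_alt + ctrl_alt
    col_ref = case_ref + ctrl_ref
    if 0 in (row_case, row_ctrl, col_alt, col_ref):
        raise ValueError("every row and column of the table must be non-empty")
    n = row_case + row_ctrl
    diff = case_alt * ctrl_ref - case_ref * ctrl_alt
    chi2 = n * diff ** 2 / (row_case * row_ctrl * col_alt * col_ref)
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical allele counts: alternate allele at 30% in cases, 20% in controls.
chi2, p_value = allelic_chi_square(600, 1400, 400, 1600)
is_associated = p_value < 1e-8  # one of the thresholds named in the text
```

With one million tested alleles and an overall alpha of 0.05, the Bonferroni threshold is 5 x 10^-8, which is of the same order as the thresholds named above.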
- a gene is one for which an allele or polymorphism is associated with an increased or decreased risk of developing a disease of at least 1.1, 1.2, 1.5, 2, 3, 4, 5, 7.5, 10, or more, relative to individuals not having the allele or polymorphism.
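The fold thresholds listed above can be read as relative risks. The sketch below computes a relative risk from carrier and non-carrier counts and checks it against a chosen fold value. Treating a decreased risk of "at least" a given fold as a relative risk at or below the reciprocal of that fold is an interpretive assumption, and the counts and function names are hypothetical.

```python
def relative_risk(affected_carriers, total_carriers,
                  affected_noncarriers, total_noncarriers):
    """Risk of disease in allele carriers divided by risk in non-carriers."""
    if total_carriers <= 0 or total_noncarriers <= 0:
        raise ValueError("group sizes must be positive")
    if affected_noncarriers <= 0:
        raise ValueError("risk in non-carriers must be non-zero")
    risk_carriers = affected_carriers / total_carriers
    risk_noncarriers = affected_noncarriers / total_noncarriers
    return risk_carriers / risk_noncarriers


def meets_fold_threshold(rr, fold):
    """True if risk is increased by at least `fold` or decreased by at least `fold`."""
    return rr >= fold or rr <= 1.0 / fold


# Hypothetical cohort: 30 of 100 carriers affected, 10 of 100 non-carriers affected.
rr = relative_risk(30, 100, 10, 100)
```

In this toy cohort the relative risk is about 3, which would meet a threshold of 2 but not a threshold of 5.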
- a phenotypic trait may be a physical sign (such as blood pressure), a biochemical marker, which in some embodiments may be detectable in a body fluid such as blood, saliva, urine, tears, etc., such as level of a metabolite, LDL, etc., wherein an abnormally low or high level of the marker may correlate with having or not having the disease or with susceptibility to or protection from a disease.
- a sequence to be inserted into a genome encodes a tag.
- the sequence may be inserted into a gene in an appropriate position such that a fusion protein comprising the tag is produced.
- the term "tag" is used in a broad sense to encompass any of a wide variety of polypeptides.
- a tag comprises a sequence useful for purifying, expressing, solubilizing, and/or detecting a polypeptide.
- a tag may serve multiple functions.
- a tag is a relatively small polypeptide, e.g., ranging from a few amino acids up to about 100 amino acids long.
- a tag is more than 100 amino acids long, e.g., up to about 500 amino acids long, or more.
- a tag comprises an HA, TAP, Myc, 6XHis, Flag, V5, or GST tag, to name a few examples.
- a tag (e.g., any of the afore-mentioned tags) that comprises an epitope against which an antibody, e.g., a monoclonal antibody, is available (e.g., commercially available) or known in the art may be referred to as an "epitope tag".
- a tag comprises a solubility-enhancing tag (e.g., a SUMO tag, NusA tag, SNUT tag, a Strep tag, or a monomeric mutant of the Ocr protein of bacteriophage T7). See, e.g., Esposito D and Chatterjee DK. Curr Opin Biotechnol.; 17(4):353-8 (2006).
- a tag is cleavable, so that at least a portion of it can be removed, e.g., by a protease.
- a protease cleavage site in the tag, e.g., adjacent or linked to a functional portion of the tag.
- exemplary proteases include, e.g., thrombin, TEV protease, Factor Xa, PreScission protease, etc.
- a "self-cleaving" tag is used. See, e.g., PCT/US05/05763.
- a tag comprises a fluorescent polypeptide (e.g., GFP or a derivative thereof such as enhanced GFP (EGFP)) or an enzyme that can act on a substrate to produce a detectable signal, e.g., a fluorescence or colorimetric signal.
- Luciferase e.g., a firefly, Renilla, or Gaussia luciferase
- fluorescent proteins include GFP and derivatives thereof, proteins comprising chromophores that emit light of different colors such as red, yellow, and cyan fluorescent proteins, etc.
- a tag e.g., a fluorescent protein, may be monomeric.
- a fluorescent protein is, e.g., Sirius, Azurite, EBFP2, TagBFP, mTurquoise, ECFP, Cerulean, TagCFP, mTFP1, mUkG1, mAG1,
- a tag may comprise a domain that binds to and/or acts as a sensor of a small molecule (e.g., a metabolite) or ion, e.g., calcium, chloride, or of intracellular voltage, pH, or other conditions. Any genetically encodable sensor may be used; a number of such sensors are known in the art. In some embodiments a FRET-based sensor may be used.
- different genes are modified to incorporate different tags, so that proteins encoded by the genes are distinguishably labeled. For example, between 2 and 20 distinct tags may be introduced. In some embodiments the tags have distinct emission and/or absorption spectra. In some embodiments a tag may absorb and/or emit light in the infrared or near-infrared region. It will be understood that any nucleic acid sequence encoding a tag may be codon-optimized for expression in a cell, zygote, embryo, or animal into which it is to be introduced.
- fragments or domains of a protein may act in a dominant negative manner and may, for example, disrupt normal function or interaction of the protein.
- a gene of interest encodes a protein, aggregation of which is associated with one or more diseases, which may be referred to as protein misfolding diseases.
- proteins misfolding diseases include, e.g., alpha-synuclein
- a gene of interest encodes a transcription factor, a transcriptional co-activator or co-repressor, an enzyme, a chaperone, a heat shock factor, a heat shock protein, a receptor, a secreted protein, a transmembrane protein, a histone (e.g., H1, H2A, H2B, H3, H4), a peripheral membrane protein, a soluble protein, a nuclear protein, a mitochondrial protein, a growth factor, a cytokine (e.g., an interleukin, e.g., any of IL-1 - IL-33), an interferon (e.g., alpha, beta, or gamma), a chemokine (e.g., a CC, CXC, C (or XC), or CX3C chemokine).
- a chemokine may be CCL1 - CCL28, CXCL1 - CXCL17, XCL1 or XCL2, or CX3CL1.
- a gene encodes a colony-stimulating factor, a hormone (e.g., insulin, thyroid hormone, growth hormone, estrogen, progesterone, testosterone), an extracellular matrix protein (e.g., collagen, fibronectin), a motor protein (e.g., dynein, myosin), cell adhesion molecule, a major or minor histocompatibility (MHC) gene, a transporter, a channel (e.g., an ion channel), an immunoglobulin (Ig) superfamily (IgSF) gene (e.g., a gene encoding an antibody, T cell receptor, B cell receptor), tumor necrosis factor, an NF-kappaB protein, an integrin, a cadherin superfamily member (e.g., a cadherin), a
- Growth factors include, e.g., members of the vascular endothelial growth factor (VEGF, e.g., VEGF-A, VEGF-B, VEGF-C, VEGF-D), epidermal growth factor (EGF), insulin-like growth factor (IGF; IGF-1, IGF-2), fibroblast growth factor (FGF, e.g., FGF1 - FGF22), platelet derived growth factor (PDGF), or nerve growth factor (NGF) families.
- a growth factor may be CSF1 (macrophage colony-stimulating factor), CSF2 (granulocyte macrophage colony-stimulating factor, GM-CSF), or CSF3 (granulocyte colony-stimulating factor, G-CSF).
- a gene encodes erythropoietin (EPO).
- a gene encodes a neurotrophic factor, i.e., a factor that promotes survival,
- neural lineage cells which term as used herein includes neural progenitor cells, neurons, and glial cells, e.g., astrocytes,
- the protein is a factor that promotes neurite outgrowth.
- the protein is ciliary neurotrophic factor (CNTF) or brain-derived neurotrophic factor (BDNF).
- a gene of interest encodes a polypeptide that is a subunit of any protein that is comprised of multiple subunits.
- An enzyme may be any protein that catalyzes a reaction of a type that has been assigned an Enzyme Commission number (EC number) by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB). Enzymes include, e.g., oxidoreductases, transferases, hydrolases, lyases, isomerases, ligases.
- EC number Enzyme Commission number
- NC-IUBMB Nomenclature Committee of the International Union of Biochemistry and Molecular Biology
- Examples include, e.g., kinases (protein kinases, e.g., Ser/Thr kinase, Tyr kinase), lipid kinases (e.g., phosphatidylinositide 3-kinases (PI 3-kinases or PI3Ks)), phosphatases, acetyltransferases, methyltransferases, deacetylases, demethylases, lipases, cytochrome P450s, glucuronidases, recombinases (e.g., Rag-1, Rag-2).
- An enzyme may participate in the biosynthesis, modification, or degradation of nucleotides, nucleic acids, amino acids, proteins, neurotransmitters, xenobiotics (e.g., drugs) or other macromolecules.
- the mammalian genome encodes at least about 500 different kinases.
- Kinases can be classified based on the nature of their typical substrates and include protein kinases (i.e., kinases that transfer phosphate to one or more protein(s)), lipid kinases (i.e., kinases that transfer a phosphate group to one or more lipid(s)), nucleotide kinases, etc.
- Protein kinases are of particular interest in certain aspects of the invention. PKs are often referred to as serine/threonine kinases (S/TKs) or tyrosine kinases (TKs) based on their substrate preference.
- Serine/threonine kinases (EC 2.7.11.1) phosphorylate serine and/or threonine residues while TKs (EC 2.7.10.1 and EC 2.7.10.2) phosphorylate tyrosine residues.
- TKs EC 2.7.10.1 and EC 2.7.10.2
- tyrosine residues EC 2.7.12.1
- the human protein kinase family can be further divided based on sequence/structural similarity into the following groups: (1) AGC kinases - containing PKA, PKC and PKG; (2) CaM kinases - containing the calcium/calmodulin-dependent protein kinases; (3) CK1 - containing the casein kinase 1 group; (4) CMGC - containing CDK, MAPK, GSK3 and CLK kinases; (5) STE - containing the homologs of yeast Sterile 7, Sterile 11, and Sterile 20 kinases; (6) TK - containing the tyrosine kinases; (7) TKL - containing the tyrosine-kinase like group of kinases.
- a further group referred to as "atypical protein kinases" contains proteins that lack sequence homology to the other groups but are known or predicted to have kina
- Receptors include, e.g., G protein coupled receptors, tyrosine kinase receptors, serine/threonine kinase receptors, Toll-like receptors, nuclear receptors, and immune cell surface receptors.
- a receptor is a receptor for any of the hormones, cytokines, growth factors, or secreted proteins mentioned herein.
- GPCRs G protein coupled receptors
- Numerous G protein coupled receptors (GPCRs) are known in the art. See, e.g., Vroling B, GPCRDB: information system for G protein-coupled receptors. Nucleic Acids Res. 2011 Jan;39(Database issue):D309-19. Epub 2010 Nov 2.
- G protein coupled receptors include, e.g., adrenergic, cannabinoid, purinergic receptors, neuropeptide receptors, olfactory receptors.
- Transcription factors (TFs) (sometimes called sequence-specific DNA-binding factors) bind to specific DNA sequences and (alone or in a complex with other proteins), regulate transcription, e.g., activating or repressing
- TFs are listed, for example, in the TRANSFAC® database, Gene Ontology (http://www.geneontology.org/) or DBD
- TFs can be classified based on the structure of their DNA binding domains (DBD). For example in certain
- a TF is a helix-loop-helix, helix-turn-helix, winged helix, leucine zipper, bZIP, zinc finger, homeodomain, or beta-scaffold factor with minor groove contacts protein.
- Transcription factors include, e.g., p53, STAT3, PAS family transcription factors (e.g., HIF family: HIF1A, HIF2A, HIF3A), aryl hydrocarbon receptor.
- an animal generated according to inventive methods is useful for studying drug metabolism.
- it may be of interest to genetically modify multiple enzymes involved in xenobiotic metabolism e.g., multiple P450s.
- an animal generated according to inventive methods is useful for studying the immune system and/or for generating animals that have a humanized immune system or that are immunocompromised and may serve as hosts for cells or tissues from other organisms of the same species or different species.
- Section headings used herein are not to be construed as limiting in any way. It is expressly contemplated that subject matter presented under any section heading may be applicable to any aspect or embodiment described herein.
- Embodiments or aspects herein may be directed to any agent, composition, article, kit, and/or method described herein. It is contemplated that any one or more embodiments or aspects can be freely combined with any one or more other embodiments or aspects whenever appropriate. For example, any combination of two or more agents, compositions, articles, kits, and/or methods that are not mutually inconsistent, is provided.
- compositions methods of using the composition as disclosed herein are provided, and methods of making the composition according to any of the methods of making disclosed herein are provided.
- a claim recites a method a composition for performing the method is provided.
- elements are presented as lists or groups, each subgroup is also disclosed. It should also be understood that, in general, where embodiments or aspects is/are referred to herein as comprising particular element(s), feature(s), agent(s), substance(s), step(s), etc., (or combinations thereof), certain embodiments or aspects may consist of, or consist essentially of, such element(s), feature(s), agent(s), substance(s), step(s), etc. (or combinations thereof).
- Any method of treatment may comprise a step of providing a subject in need of such treatment, e.g., a subject having a disease for which such treatment is warranted.
- Any method of treatment may comprise a step of diagnosing a subject as being in need of such treatment, e.g., diagnosing a subject as having a disease for which such treatment is warranted.
- V6.5 mESCs (on a 129/Sv x C57BL/6 F1 hybrid background) were cultured on gelatin-coated plates with standard mESC culture conditions.
- Cells were transfected with a plasmid expressing mammalian codon optimized Cas9 and sgRNA (single targeting), or three plasmids expressing Cas9 and sgRNAs targeting Tet1, Tet2, and Tet3 (triple targeting), or five PCR products each coding for sgRNA targeting Tet1, Tet2, Tet3, Sry, and Uty, along with a plasmid expressing PGK-puroR using FuGENE HD reagent (Promega), following manufacturer's instructions.
- 12 hours after transfection, mESCs were re-plated at a low density on DR4 MEF feeder layers. Puromycin (2 μg/ml) was added one day after replating and taken off after 48 hours. After recovering for 4 to 6 days, individual colonies were picked and genotyped by RFLP and Southern blot analysis, and the leftover mES cells on the plate were collected for the Surveyor assay.
- DNA was extracted from pre-plated mESCs following standard procedures. DNA was transferred to nylon membrane using BioRad slot blot vacuum manifold apparatus. Anti-5hmC (Active Motif 1:10000) was used to detect 5hmC following manufacturer's protocol.
[00197] Production of Cas9 mRNA and sgRNA
- T7 promoter was added to Cas9 coding region by PCR amplification using primer Cas9 F and R (Table 6).
- T7-Cas9 PCR product was gel-purified and used as the template for in vitro transcription (IVT) using mMESSAGE
- T7 promoter was added to the sgRNA templates by PCR amplification using primers Tet1 F and R, Tet2 F and R, Tet3 F and R (Table 6).
- the T7-sgRNA PCR product was gel-purified and used as the template for IVT using MEGAshortscript T7 kit (Life Technologies). Both the Cas9 mRNA and the sgRNAs were purified using MEGAclear kit (Life Technologies).
- B6D2F1 C57BL/6 X DBA2
- ICR mouse strains were used as embryo donors and foster mothers, respectively.
- Super-ovulated female B6D2F1 mice (7-8 weeks old) were mated to B6D2F1 stud males, and fertilized embryos were collected from oviducts.
- Cas9 mRNAs from 20 ng/μl to 200 ng/μl
- sgRNA from 20 ng/μl to 50 ng/μl
- Genomic DNA was separated on a 0.8% agarose gel after restriction digests with the appropriate enzymes, transferred to a nylon membrane (Amersham) and hybridized with 32P random primer (Stratagene)-labeled probes.
- MMMMM MMMMM TGG (SEQ ID NO: 139) were generated. Exact matches to these search sequences in the mouse genome (mm9) were found using bowtie and reported as potential targets of the CRISPR sgRNA.
- sgRNAs targeting the Ten-eleven translocation (Tet) family members, Tet1, Tet2 and Tet3 were designed (Fig 1A).
- Tet proteins (Tet1/2/3) convert 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) in various embryonic and adult tissues and mutant mice for each of these three genes have been produced by homologous recombination in ES cells (Dawlaty et al.
- plasmids expressing both the mammalian codon optimized Cas9 and a sgRNA targeting each gene were transfected into mouse ES cells, and the targeted cleavage efficiency was determined by the Surveyor assay (Guschin et al, Methods Mol Biol, 649:247-256 (2010)). All three Cas9-sgRNA transfections produced cleavage at target loci with high efficiency of 36% at Tet1, 48% at Tet2, and 36% at Tet3 (Fig 1B).
- each target locus contains a restriction enzyme recognition site (Fig 1A)
- a ~500 bp fragment around each target site was PCR amplified, and the PCR products were digested with the respective enzyme.
- a correctly targeted allele will lose the restriction site, which can be detected by failure to cleave upon enzyme treatment.
- RFLP restriction fragment length polymorphism
- mutant mice could be generated in vivo by direct embryo manipulation.
- Capped polyadenylated Cas9 mRNA was produced by in vitro transcription and co-injected with sgRNAs. Initially, to determine the optimal concentration of Cas9 mRNA for targeting in vivo, varying amounts of Cas9-encoding mRNA were injected with Tet1 targeting sgRNA at constant concentration (20 ng/μl) into pronuclear (PN) stage one-cell mouse embryos and the frequency of altered alleles at the blastocyst stage was assessed using the RFLP assay. As expected, higher concentration of Cas9 mRNA led to more efficient gene disruption (Fig. 6A). Nevertheless, even embryos injected with the highest amount of Cas9 mRNA (200 ng/μl) showed normal blastocyst development, indicating low toxicity.
- sgRNAs targeting Tet1 or Tet2 were co-injected with different concentrations of Cas9 mRNA. Blastocysts derived from the injected embryos were transplanted into foster mothers and newborn pups were obtained. As summarized in Table 2, about 10% of the transferred blastocysts developed to birth independent of the RNA concentrations used for injection, indicating low fetal toxicity of the Cas9 mRNA and sgRNA. RFLP, Southern blot, and sequencing analysis demonstrated that between 50 and 90% of the postnatal mice carried biallelic mutations in either target gene (Figs. 2A, 2B, 2C, Table 2).
- Blastocysts were also derived from zygotes injected with Cas9 mRNA and Tet3 sgRNA. Genotyping of the blastocysts demonstrated that of eight embryos three were homozygous and three were heterozygous Tet3 mutants (two failed to amplify) (Fig 6C). Some blastocysts were implanted into foster mothers and, upon C-section, multiple mice of smaller size (Fig 6D), many of which died soon after delivery, were readily identified. Genotyping shown in Fig 6E indicated that all pups with mutations in both Tet3 alleles died neonatally. Only two out of 15 mice survived that were either Tet3 heterozygous mutants or wt (Fig 6F).
- Tet1 and Tet2 sgRNAs were co-injected with 20 or 100 ng/μl Cas9 mRNA into zygotes. A total of 28 pups were born from 144 embryos transferred into foster mothers (21% live birth rate) that had been injected at the zygote stage with high concentrations of RNA (Cas9 mRNA at 100 ng/μl, sgRNAs at 50 ng/μl), consistent with low or no toxicity of the Cas9 mRNA and sgRNAs (Table 3).
- Blastocysts were derived from zygotes injected with Cas9 mRNA and sgRNAs and oligos targeting Tet1 or Tet2, respectively. DNA was isolated, amplified and digested with EcoRI to detect oligo mediated HDR events. Six out of nine Tet1 targeted embryos and nine out of 15 Tet2 targeted embryos incorporated an EcoRI site at the respective target locus, with several embryos having both alleles modified (Fig 8A).
- mice with HR-mediated precise mutations in multiple genes can be generated in one step by CRISPR/Cas mediated genome editing.
- Plasmids encoding Cas9 and sgRNAs targeting Tet1, Tet2, and Tet3 were transfected separately (single targeting) or in a pool (triple targeting) into mES cells.
- the number of total alleles mutated in each mES cell clone is listed from 0 to 2 for single targeting experiment, and 0 to 6 for triple targeting experiment.
- the number of clones containing each specific number of mutated alleles is shown in relation to the total number of clones screened in each experiment.
- mice containing each specific number of mutated alleles is shown in relation to the total number of mice screened in each experiment.
- Cas9 mRNA and sgRNAs targeting Tet1 and Tet2 were co-injected into fertilized eggs.
- the blastocysts derived from the injected embryos were transplanted into foster mothers and newborn pups were obtained and genotyped.
- the number of total alleles mutated in each mouse is listed from 0 to 4 for Tet1 and Tet2.
- the number of mice containing each specific number of mutated alleles is shown in relation to the number of total mice screened in each experiment.
- Table 4 Plasmids encoding Cas9 and five PCR products expressing sgRNAs targeting Tet1, Tet2, Tet3, Sry, and Uty were co-transfected into mES cells. The number of clones containing mutations in all six Tet alleles is listed in the Tet1, 2, 3 column; the number of clones containing mutations in all six Tet alleles and the Sry allele is listed in the Tet1, 2, 3 + Sry column; the number of clones containing mutations in all six Tet alleles and both the Sry and Uty alleles is listed in the Tet1, 2, 3 + Sry + Uty column.
- Cas9 R GCGAGCTCTAGGAATTCTTAC (SEQ ID NO: 170)
- Tet1 F TC (SEQ ID NO: 171)
- sgRNA R AAAAGCACCGACTCGGTGCC (SEQ ID NO: 172)
- sgRNA R AAAAGCACCGACTCGGTGCC (SEQ ID NO: 174)
- Tet3 F TTAATACGACTCACTATAGGAAGGAGGGGAAGAG sgRNA TTCTCG (SEQ ID NO: 175)
- The genetic manipulation of mice is a crucial approach for the study of development and disease.
- generation of mice with specific mutations is labor intensive and involves gene targeting by homologous recombination in ES cells, the production of chimeric mice and, after germ line transmission of the targeted ES cells, the interbreeding of heterozygous mice to produce the
- mice carrying mutations in several genes require time-consuming intercrossing of single mutant mice.
- generation of ES cells carrying homozygous mutations in several genes is usually achieved by sequential targeting, a process that is labor-intensive necessitating multiple consecutive cloning steps to target the genes and to delete the selectable markers.
- mouse embryos can be directly modified by injection of Cas9 mRNA and sgRNA into the fertilized egg, resulting in the efficient production of mice carrying biallelic mutations in a given gene. More significantly, co-injection of Cas9 with Tet1 and Tet2 sgRNAs into zygotes produced mice that carried mutations in both genes (Figure 4B, upper panel).
- mice carrying multiple mutations can be generated within 4 weeks, which is a much shorter time frame than can be achieved by conventional consecutive targeting of genes in ES cells and avoids time-consuming intercrossing of single mutant mice.
- CRISPR/Cas mediated targeting is useful to generate mutant alleles with predetermined alterations, and co-injection of single stranded oligos can introduce designed point mutations into two target genes in one step, allowing for multiplexed gene editing in a strictly controlled manner (Figure 4B, lower panel).
- This targeting system allows for the production of conditional alleles, or precise insertion of larger DNA fragments such as GFP markers so as to generate conditional knockout and reporter mice for specific genes.
- H840A of the human codon-optimized Cas9 nickase was mutated to generate nuclease-deficient dCas9 [PMID: 23452860] and a 3x minimal VP16 transcriptional activation domain (TAD) was fused to the C-terminus of the dCas9 protein (Figure 10A) to generate dCas9ta.
- TAD 3x minimal VP16 transcriptional activation domain
- dCas9ta can activate endogenous gene expression
- dCas9ta chimeric expression construct was designed and cloned with 8 different sgRNAs targeting Nanog promoter (sgmNanog) and transfected in NIH3T3 cells.
- sgmNanog Nanog promoter
- a NanogGFP plasmid [PMID: 18594521] containing 1.2kb promoter of Nanog was co-transfected.
- dCas9 was fused to Cdk9 and CycT, two components of the P-TEFb complex involved in the transcriptional pause release [PMID: 22986266] and their transactivation activity was tested on the
- TetO::tdTomato with or without dCas9ta (Figures 13A-13D). Transfection of dCas9ta resulted in 10% tdTomato positive cells. Transfection of both dCas9Cdk9 (pAC72) and dCas9CycT (pAC73) also activated tdTomato expression, though to a lesser extent (2%). Co-transfection of three plasmids, pAC5 (dCas9ta), pAC72 (dCas9Cdk9), pAC73 (dCas9CycT), with sgTetO resulted in 13% tdTomato positive cells. This additive effect indicates that co-transfection with, or fusion of, additional transcriptional activators or transactivation domains to dCas9ta would likely further augment dCas9ta transactivation activity.
- a two-step fusion PCR was used to amplify Cas9 Nickase ORF without stop codon from the pX335 vector, incorporate H840A mutation, EcoRI-AgeI restriction site on the 5' end as well as an FseI site on the 3' end (EcoRI-AgeI-dCas9-FseI fragment).
- the 3x minimal VP16 activation domain coding fragment (TAD) was excised from a vector (Addgene: 20342) containing NLSM2rtTA coding sequence by FseI and EcoRI digestion (FseI-TA-EcoRI fragment).
- pCR8/GW/TOPO Invitrogen
- pAC1 which contains the dCas9ta gene.
- the dCas9ta coding sequence was subsequently excised from pAC1 and cloned into pX335 vector (Addgene: 42335) by AgeI-EcoRI digestion to replace dCas9 Nickase to create a chimeric vector pAC2 that expresses both the dCas9ta and the sgRNA.
- sgRNA spacers were cloned into the BbsI-digested pAC2 vector.
- sgRNA targeting TetO sgTet
- aaacTATCAGTGATAGAGAAAAGC (SEQ ID NO: 180) onto BbsI-digested pAC2 vector to generate pAC5.
- FseI-EcoRI fragment from pAC5 or pAC1 was replaced by PCR amplicons of different domains or genes with FseI and EcoRI added on the primer sequences.
- dCas9 was cloned by PCR amplification of dCas9ta with reverse primer before the 3xTA domains and cloned into pCR8GWTOPO to create pAC84 and pAC5 to create pAC89.
- Non-chimeric versions of dCas9 fusions were generated by LR Clonase-mediated recombination to a pmax-DEST vector (pAC90).
- a TetO::tdTomato (plasmid pAC3) transgene and an EF1a::NLSM2rtTA (plasmid pAC4) transgene were delivered into NIH3T3 (mouse) and HeLa (human) cells by PiggyBac transposition.
- sgRNAs were designed to target TetO binding site (sgTetO).
- pmaxGFP (Clontech) was used as a transfection control. Transfection was done using FuGene HD following manufacturer's instructions.
- sgRNA designs DNA targets, oligos, and plasmids used to target different DNA.
- Last three bases are PAM (5'-NGG-3') motif.
- Lowercase letters in the target sequences indicate changes made (first g) to allow efficient U6
- Dual expression construct expressing both dCas9ta and sgRNA from pAC2 U6 promoter
- dCas9TA peptide sequence (Underlined sequence indicates the 3x VP16 minimal transactivation domains)
Abstract
The invention is directed to a method of mutating one or more target nucleic acid sequences in a stem cell or a zygote comprising introducing into the stem cell or zygote (i) ribonucleic acid (RNA) sequences that comprise a portion that is complementary to a portion of each of the target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; and (ii) a Cas nucleic acid sequence or a variant thereof that encodes a Cas protein having nuclease activity. The stem cell or zygote is maintained under conditions in which the target nucleic acid sequences are mutated in the stem cell or zygote. The invention is also directed to methods of producing a nonhuman mammal carrying mutations and methods of modulating the expression and/or activity of target nucleic acid sequences in cells or zygotes.
Description
METHODS OF MUTATING, MODIFYING OR MODULATING NUCLEIC ACID IN A CELL OR NONHUMAN MAMMAL
RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Provisional Application No. 61/812,720, filed on April 16, 2013; U.S. Provisional Application No. 61/824,920, filed on May 17, 2013; U.S. Provisional Application No. 61/858,437, filed on July 25, 2013; and U.S. Provisional Application No. 61/865,888, filed on August 14, 2013.
[0002] The entire teachings of the above applications are incorporated herein by reference.
GOVERNMENT SUPPORT
[0003] This invention was made with government support under HD 045022 and R37CA084198 from the National Institutes of Health. The government has certain rights in the invention.
BACKGROUND OF THE INVENTION
[0004] Genetically modified mice represent a crucial tool for understanding gene function in development and disease. Mutant mice are conventionally generated by insertional mutagenesis (Copeland and Jenkins, 2010; Kool and Berns, 2009) or by gene targeting methods (Capecchi, 2005). In conventional gene targeting methods, mutations are introduced through homologous recombination in mouse embryonic stem (ES) cells. Targeted ES cells injected into wild-type blastocysts can contribute to the germline of chimeric animals, generating mice containing the targeted gene modification (Capecchi, 2005). It is costly and time-consuming to produce single gene knockout mice, and even more so to make double mutant mice. Moreover, in most other mammalian species no established ES cell lines are available that contribute efficiently to chimeric animals, which greatly limits genetic studies in many species.
[0005] Alternative methods have been developed to accelerate the process of genome modification by directly injecting DNA or mRNA of site-specific nucleases into the one-cell embryo to generate a DNA double strand break (DSB) at a specified locus in various species (Bogdanove and Voytas, 2011; Carroll et al, 2008; Urnov et al, 2010). DSBs induced by these site-specific nucleases can then be repaired by error-prone non-homologous end joining (NHEJ), resulting in mutant mice and rats carrying deletions or insertions at the cut site (Carbery et al, 2010; Geurts et al, 2009; Sung et al, 2013; Tesson et al, 2011). If a donor plasmid with homology to the ends flanking the DSB is co-injected, high-fidelity homologous recombination can produce animals with targeted integrations (Cui et al., 2011; Meyer et al., 2010). Because these methods require the complex design of zinc finger nucleases (ZFNs) or transcription activator-like effector nucleases (TALENs) for each target gene and because the efficiency of targeting may vary substantially, no multiplexed gene targeting has been reported to date.
[0006] Thus, improved methods for producing genetically modified non-human mammals, such as mice, are needed.
SUMMARY OF THE INVENTION
[0007] Described herein is the use of the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and CRISPR associated (Cas) proteins (CRISPR/Cas) system to drive both non-homologous end joining (NHEJ) based gene disruption and homology directed repair (HDR) based precise gene editing to achieve highly efficient and simultaneous targeting of multiple nucleic acid sequences in cells and nonhuman mammals.
[0008] Accordingly, in one aspect, the invention is directed to a method of mutating one or more target nucleic acid sequences in a (one or more) stem cell or a zygote comprising introducing into the stem cell or zygote (i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to a portion of each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; and (ii) a Cas nucleic acid sequence or a variant thereof that encodes a Cas protein having nuclease activity. The stem cell or zygote is maintained under conditions in which the one or more RNA sequences hybridize to the portion of each of the one or more target nucleic acid sequences, and the Cas protein cleaves each of the one or more target nucleic acid sequences upon hybridization of the one or more RNA sequences to the portion of the target nucleic acid sequence, thereby mutating one or more target nucleic acid sequences in the stem cell or zygote.
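As an illustration of how a target portion for the RNA sequence might be located, the following minimal Python sketch scans a DNA sequence for candidate sites. It assumes a 20-nucleotide targeting sequence followed immediately by a 5'-NGG-3' protospacer-adjacent motif (PAM), as in the sgRNA designs noted elsewhere in this document. The function names, the default spacer length and the example sequence are hypothetical and are not taken from the disclosure.

```python
import re

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def find_candidate_sites(seq, spacer_len=20):
    """List (strand, start, protospacer, pam) for every site followed by NGG.

    Coordinates on the '-' strand are relative to the reverse complement.
    spacer_len=20 is an assumed default, not a value fixed by the text.
    """
    seq = seq.upper()
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    sites = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            sites.append((strand, match.start(), match.group(1), match.group(2)))
    return sites

# Hypothetical toy sequence: 20 nt, then a TGG PAM, then filler.
example = "A" * 20 + "TGG" + "CCC"
for site in find_candidate_sites(example):
    print(site)
```

A scan of this kind only enumerates sequence candidates; it says nothing about cleavage efficiency or off-target matches, which would have to be assessed separately.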
[0009] In some aspects, the invention is directed to a method of producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences comprising introducing into a zygote or an embryo (i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to a portion of each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; and (ii) a Cas nucleic acid sequence or a variant thereof that encodes a Cas protein having nuclease activity. The zygote or the embryo is maintained under conditions in which RNA hybridizes to the portion of each of the one or more target nucleic acid sequences, and the Cas protein cleaves each of the one or more target nucleic acid sequences upon hybridization of the RNA to the portion of the target nucleic acid sequence, thereby producing an embryo having one or more mutated nucleic acid sequences. The embryo having one or more mutated nucleic acid sequences may be transferred into a foster nonhuman mammalian mother. The foster nonhuman mammalian mother is maintained under conditions in which one or more offspring carrying the one or more mutated nucleic acid sequences are produced, thereby producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences.
[0010] In some aspects, the invention is directed to a method of modulating the expression and/or activity of one or more target nucleic acid sequences in one or more cells or zygotes comprising introducing into the cell or zygote (i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; (ii) a Cas nucleic acid sequence or a variant thereof that encodes the Cas protein that targets but does not cleave the target nucleic acid sequence; and (iii) an effector domain. The method further comprises maintaining the cell under conditions in which the one or more RNA sequences hybridize to the portion of each of the one or more target nucleic acid sequences, the Cas protein binds to each of the one or more RNA sequences and the effector domain modulates the expression and/or activity of the target nucleic acid, thereby modulating the expression and/or activity of the one or more target nucleic acid sequences in the cell or zygote.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
[0012] Figures 1A-1E show multiplexed gene targeting in mESCs. Figure 1A shows a schematic of the Cas9/sgRNA targeting sites in Tet1, 2, and 3. The sgRNA targeting sequence is underlined, and the protospacer-adjacent motif (PAM) sequence is labeled in green. The restriction sites at the target regions are bold and capitalized. Restriction enzymes used for restriction fragment length polymorphism (RFLP) and Southern blot analysis are shown, and the Southern blot probes are shown as orange boxes (SEQ ID NOs: 27-29). Figure 1B shows a Surveyor assay for Cas9-mediated cleavage at Tet1, 2, 3 loci in mESCs. Figure 1C shows the genotyping of triple targeted mESCs; clones #51, #52, and #53 are shown. The upper panel in Figure 1C is the RFLP analysis. Tet1 PCR products were digested with SacI, Tet2 PCR products were digested with EcoRV, and Tet3 PCR products were digested with XhoI. The lower panel is the Southern blot analysis. For the Tet1 locus, SacI digested genomic DNA was hybridized with a 5' probe. Expected fragment size: WT (wild type) = 5.8 kb, TM (targeted mutation) = 6.4 kb. For the Tet2 locus, SacI and EcoRV double digested genomic DNA was hybridized with a 3' probe. Expected fragment size: WT = 4.3 kb, TM = 5.6 kb. For the Tet3 locus, BamHI and XhoI double digested genomic DNA was hybridized with a 5' probe. Expected fragment size: WT = 3.2 kb, TM = 8.1 kb. Figure 1D shows the sequence of six mutant alleles in triple targeted mESC clones #14 and #41 (SEQ ID NOs: 30-45). The protospacer-adjacent motif (PAM) sequence is labeled in red. Figure 1E shows the analysis of 5hmC levels in DNA isolated from triple targeted mES clones by dot blot assay using anti-5hmC antibody. A previously characterized DKO clone derived using a traditional method is used as a control. See also Figures 5A-D, Tables 1 and 4.
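The expected Southern blot fragment sizes listed above lend themselves to a simple lookup. The sketch below is only an illustration of how observed band sizes could be compared with the stated WT and TM sizes. The function name, the 0.2 kb matching tolerance and the genotype labels are assumptions introduced here and are not part of the described method.

```python
# Expected Southern blot fragment sizes (kb) as stated in the figure description.
EXPECTED_KB = {
    "Tet1": {"WT": 5.8, "TM": 6.4},
    "Tet2": {"WT": 4.3, "TM": 5.6},
    "Tet3": {"WT": 3.2, "TM": 8.1},
}

def call_genotype(locus, observed_kb, tolerance_kb=0.2):
    """Classify a sample from its observed band sizes at one locus.

    tolerance_kb is an assumed matching window, chosen smaller than the
    smallest WT/TM size difference (0.6 kb at Tet1) so calls stay unambiguous.
    """
    expected = EXPECTED_KB[locus]
    alleles = set()
    for band in observed_kb:
        for label, size in expected.items():
            if abs(band - size) <= tolerance_kb:
                alleles.add(label)
    if alleles == {"WT"}:
        return "wild type"
    if alleles == {"TM"}:
        return "both alleles mutated"
    if alleles == {"WT", "TM"}:
        return "one allele mutated"
    return "uncallable"

print(call_genotype("Tet1", [6.4]))        # only the TM band is present
print(call_genotype("Tet2", [4.3, 5.6]))   # both WT and TM bands are present
```

A single TM band shows only that the restriction site is absent from every detected allele, so sequencing would still be needed to establish the exact mutation on each allele.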
[0013] Figures 2A-2F show single and double gene targeting in vivo by injection into fertilized eggs. Figure 2A shows genotyping of Tet1 single targeted mice. Figure 2B shows genotyping of Tet2 single targeted mice. RFLP analysis is shown in the upper panel, and Southern blot analysis is shown in the lower panel of Figure 2B. Figure 2C is the sequence of both alleles of the targeted gene in Tet1 biallelic mutant mouse #2 and Tet2 biallelic mutant mouse #4 (SEQ ID NOs: 30, 32-34, 46, 47). Figure 2D is the genotyping of Tet1/Tet2 double mutant mice. Analysis of mice #1 to #12 is shown. RFLP analysis is shown in the upper panel, and Southern blot analysis is shown in the lower panel of Figure 2D. The Tet1 locus is displayed in the left panel and the Tet2 locus in the right panel. Figure 2E is the sequence of four mutant alleles from double mutant mice #9 and #10 (SEQ ID NOs: 30, 33, 42, 46, 48-53). PAM sequences are labeled in red. Figure 2F is a picture of three-week-old double mutant mice. All RFLP and Southern digestions and probes are the same as those used in Figures 1A-E. See also Figures 6A-6F, Tables 2 and 3.
[0014] Figures 3A-3C show multiplexed HR-mediated genome editing in vivo. Figure 3A shows a schematic of the oligo targeting sites at the Tet1 and Tet2 loci (SEQ ID NOs: 54-57). The sgRNA targeting sequence is underlined, and the PAM sequence is labeled in green. The oligo targeting each gene is shown under the target site, with 2 bp changes labeled in red. Restriction enzyme sites used for RFLP analysis are bold and capitalized. Figure 3B is RFLP analysis of double oligo injection mice with HDR-mediated targeting at the Tet1 and Tet2 loci. Figure 3C shows the sequences of both alleles of Tet1 and Tet2 in mice #5 and #7, which show simultaneous HDR-mediated targeting at one allele or two alleles of each gene, and NHEJ-mediated disruption at the other alleles (SEQ ID NOs: 30, 33, 40, 58-61). See also Figures 8A-8C.
[0015] Figures 4A-4B show multiplexed genome editing in mES cells and mouse. Figure 4A is a diagram representing multiple gene targeting in mES cells. Figure 4B shows one step generation of mice with multiple mutations. Upper panel, multiple targeted mutations with random indels introduced through NHEJ. Lower panel, multiple predefined mutations introduced through HDR-mediated repair.
[0016] Figures 5A-5D show single, triple, and quintuple gene targeting in mES cells. Figure 5A is RFLP analysis of clones from each single targeting experiment (#1 to #17 are shown). Figure 5B is RFLP analysis of triple gene targeted clones (#37 to #53 are shown). Tet1 PCR products were digested with SacI, Tet2 PCR products were digested with EcoRV, and Tet3 PCR products were digested with XhoI. WT control is shown in the last lane. Genotyping of clones #51, #52, and #53 is also shown in Figure 1C. Figure 5C is a schematic of the Cas9/sgRNA targeting sites in Sry and Uty (SEQ ID NOs: 62-64). The sgRNA targeting sequence is underlined, and the protospacer-adjacent motif (PAM) sequence is labeled in green. The restriction sites at the target regions are bold and capitalized. Restriction enzymes used for RFLP analysis are shown. Figure 5D is RFLP analysis of quintuple gene targeted clones (#1 to #10 are shown). Sry PCR products were digested with BsaJI, and Uty PCR products were digested with AvrII. WT control is shown in the last lane. RFLP analysis of the Tet1, 2, and 3 loci is not shown. Figures 5A-5D are related to Figures 1A-1E, Tables 1 and 4.
[0017] Figures 6A-6F show one step generation of single gene mutant mice by zygote injection. Figure 6A is RFLP analysis of blastocysts injected with different concentrations of Cas9 mRNA and Tet1 sgRNA at 20 ng/μl. Tet1 PCR products were digested with SacI. Figure 6B shows commonly recovered Tet1 and Tet2 alleles resulting from MMEJ (SEQ ID NOs: 30, 33, 34, 40, 46, 52). The PAM sequence of each targeting sequence is labeled in green. Microhomology flanking the DSB is bold and underlined in the WT sequence. Figure 6C is RFLP analysis of eight Tet3 targeted blastocysts, demonstrating high targeting efficiency (embryos #3 and #5 failed to amplify). Tet3 PCR products were digested with XhoI. Figure 6D is a picture showing that some Tet3 targeted mice are smaller in size; all homozygous mutants died within one day after birth. Figure 6E is RFLP analysis of Tet3 single targeted newborn mice. Mice #8 and #14 survived after birth. Samples #2 and #6 failed to amplify. Figure 6F shows the sequences of both Tet3 alleles of surviving Tet3 targeted mouse #14. PAM sequences are labeled in red. Figures 6A-6F are related to Figures 2A-2F and Table 2.
[0018] Figures 7A-7B show off target analysis of double mutant mice. Figure 7A shows three potential off targets of Tet1 sgRNA and four potential off targets of Tet2 sgRNA (SEQ ID NOs: 66-74). The 12 bp perfect matching seed sequence is labeled in blue, and the NGG PAM sequence is labeled in red. Figure 7B shows a Surveyor assay of all seven potential off target loci in seven double mutant mice derived with high concentration of Cas9 mRNA (100 ng/μl) injection. WT control is included as the eighth sample. The weak cleavage activity at the Ubr1 locus is not due to an off target effect, since sequences of these PCR products show no mutations. Figures 7A-7B are related to Figures 2A-2F and Table 5.
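The off target criterion described for Figure 7A (a perfect match to the 12 bp seed sequence next to an NGG PAM) can be illustrated with a short script. The following is a minimal sketch under stated assumptions, not the procedure used to produce the figure: the function names, the example guide, and the toy sequence are hypothetical, and the seed is assumed to be the 12 PAM-proximal bases of a 20-nt targeting sequence.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_seed_matches(genome, guide, seed_len=12):
    """List sites where the PAM-proximal seed is immediately followed by NGG.

    Returns tuples (strand, pam_start, site). pam_start is the 0-based index
    of the PAM on the strand that was scanned; site is seed + PAM.
    Assumption: the seed is the last `seed_len` bases of the guide.
    """
    seed = guide.upper()[-seed_len:]
    hits = []
    for strand, seq in (("+", genome.upper()), ("-", reverse_complement(genome))):
        start = seq.find(seed)
        while start != -1:
            pam = seq[start + seed_len:start + seed_len + 3]
            # NGG: any first base, then two guanines
            if len(pam) == 3 and pam[1:] == "GG":
                hits.append((strand, start + seed_len, seq[start:start + seed_len + 3]))
            start = seq.find(seed, start + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical guide and toy sequence, for illustration only
    guide = "GACCTTGAACGTACGGATTC"
    toy = "TTTT" + guide + "AGG" + "CCCC" + guide[-12:] + "TAA"
    for hit in find_seed_matches(toy, guide):
        print(hit)
```

A site found this way is only a candidate; as in Figure 7B, whether cleavage actually occurs there has to be tested experimentally.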
[0019] Figures 8A-8C show multiplexed precise HDR-mediated genome editing in vivo. Figure 8A is RFLP analysis of single oligo injection embryos with HDR-mediated targeting at the Tet1 and Tet2 loci. Figure 8B is RFLP analysis of double oligo injection embryos with multiplexed HDR-mediated targeting at both the Tet1 and Tet2 loci. Figure 8C shows the sequences of both alleles of Tet1 and Tet2 in embryo #2, which show simultaneous HDR-mediated targeting at one allele of both genes, and NHEJ-mediated gene disruption at the other allele of each gene (SEQ ID NOs: 30, 33, 53, 58, 59, 75). Figures 8A-8C are related to Figures 3A-3C.
[0020] Figure 9 shows 20 bp sequences of Tet1, Tet2, Tet3, Sry and Uty and the full-length RNA sequences (SEQ ID NOs: 76-85).
[0021] Figures 10A-10C: dCas9ta guided by sgRNA targeting tet binding site activates TetO promoter in HeLa cells. (10A) Schematic of dCas9ta fusion protein generated by mutation of two amino acids of Cas9 protein and fusion to 3x VP16 minimal transactivation domain. (10B) Schematic of a TetO::tdTomato reporter system to test dCas9ta fusion protein. (10C) Phase contrast, fluorescence microscopy and fluorescence-activated cell sorting (FACS) profile of HeLa/TetO::tdTomato;EF1a::NLSM2rtTA cells transfected by pmaxGFP (i), under dox exposure for 2 days (ii), transfected with dCas9ta without sgRNA (iii), and transfected with dCas9ta with sgRNA (sgTetO) complementary to tet binding sites (iv).
[0022] Figures 11A-11C: dCas9ta guided by sgRNAs targeting Nanog promoter activates a NanogGFP reporter and endogenous Nanog gene in NIH3T3 cells. (11A) Schematic of the experiment showing the target sites of the sgRNAs relative to the Nanog locus and the NanogGFP reporter. (11B) qRT-PCR analysis of endogenous Nanog. The fold changes of (i) NanogGFP-only, (ii) NanogGFP + dCas9ta, (iii) NanogGFP + dCas9ta + sgmNanog were expressed relative to the NanogGFP-only control. (11C) Microscope pictures of cells transfected with (i) NanogGFP-only, (ii) NanogGFP + dCas9ta, (iii) NanogGFP + dCas9ta + sgmNanog.
[0023] Figures 12A-12D: dCas9ta guided by sgRNA targeting tet binding site activates TetO promoter in NIH3T3 cells. Phase contrast, fluorescence microscopy and fluorescence-activated cell sorting (FACS) profile of NIH3T3/TetO::tdTomato;EF1a::NLSM2rtTA cells transfected by pmaxGFP (12A), under dox exposure for 2 days (12B), transfected with dCas9ta without sgRNA (12C), and transfected with dCas9ta with sgRNA complementary to tet binding sites (12D).
[0024] Figures 13A-13D: Microscope pictures and FACS analysis of HeLa/TetO::tdTomato cells. (13A) No transfection. (13B) Transfected with dCas9ta + sgTetO. (13C) Transfected with dCas9Cdk9 + dCas9CycT + sgTetO. (13D) Transfected with dCas9ta + dCas9Cdk9 + dCas9CycT + sgTetO.
[0025] Figure 14: the wild type (Wt) Cas9 (S. pyogenes) nucleotide sequence (SEQ ID NO: 485).
[0026] Figure 15A: Alignment of HMG box sequences of Sry proteins from different mammalian species (SEQ ID NOs: 86-91). Position 94 (shown in red) is highly conserved in different species (h: human; m: mouse; c: Chimpanzee; pc: Pygmy Chimpanzee; g: gorilla; py: Pongo; hi: Hylobates; b: Baboon and cj: Callithrix jacchus). (From Shahid et al. BMC Medical Genetics 2010 11:131 doi:10.1186/1471-2350-11-131).
[0027] Figure 15B: Genetic modification of Sry using TALENs. Schematic of Sry TALEN pair 2 and its recognition sequence within the high mobility group (HMG) domain of Sry (SEQ ID NOs: 86-96). TAL repeats are color-coded to represent each of four repeat variable di-residues (RVDs); each RVD recognizes one corresponding DNA base (NI = A, NG = T, HD = C, NN = G). Nucleotides bound by TALENs are capitalized. Shown below are clones (targeted mutation [TM] alleles 1-3) in which Sry deletions induced by TALENs are indicated by dashed lines. Srytm4 (540-bp deletion) and Srytm5 (440-bp deletion) clones are not shown.
[0028] Figures 16A-16F: One step generation of the Sox2-V5 allele. (16A) Schematic of the Cas9/sgRNA/oligo targeting site at the Sox2 stop codon (SEQ ID NOs: 97 and 98). The sgRNA coding sequence is underlined, capitalized, and labeled in red. The protospacer-adjacent motif (PAM) sequence is labeled in green. The stop codon of Sox2 is labeled in orange. The oligo contained 60 bp homologies flanking the DSB. In the oligo donor sequence, the V5 tag sequence is labeled as a green box. PCR primers (SF, V5F, and SR) used for PCR genotyping are shown as red arrowheads. (16B) Upper panel, PCR genotyping using primers V5F and SR produced bands of the correct size in targeted ES samples T1 to T5, but not in the WT sample. Lower panel, PCR genotyping using primers SF and SR produced slightly larger products, indicating that the 42 bp V5 tag sequence was integrated. T1 contained only the larger product, suggesting that either both alleles were targeted or one allele failed to amplify. (16C) PCR products using primers SF and SR were cloned into plasmid and sequenced. Sequence across the targeting region confirmed correct fusion of the V5 tag to the last codon of Sox2. (16D) Western blot analysis identified Sox2-V5 protein using V5 antibody in ES cells containing the Sox2-V5 allele. Beta-actin is shown as the loading control. (16E) Immunostaining of targeted blastocyst using V5 antibody showed signal in the ICM. Scale bar, 50 μm. (16F) Immunostaining of targeted ES cells using V5 antibody showed uniform Sox2 expression. Scale bar, 100 μm.
[0029] Figures 17A-17F: One step generation of an endogenous reporter allele (SEQ ID NOs: 99 and 100). (17A) Schematic overview of the strategy to generate a Nanog-mCherry knock-in allele. The sgRNA coding sequence is underlined, capitalized, and labeled in red. The protospacer-adjacent motif (PAM) sequence is labeled in green. The stop codon of Nanog is labeled in orange. The homologous arms of the donor vector are indicated as HA-L (2 kb) and HA-R (3 kb). The restriction enzyme used for Southern blot analysis is shown, and the Southern blot probes are shown as red boxes. (17B) Southern analysis of the Nanog-mCherry targeted allele. NcoI-digested genomic DNA was hybridized with the 3' external probe. Expected fragment size: WT (wild type) = 11.5 kb, T (Targeted) = 5.6 kb. The blot was then stripped and hybridized with the mCherry internal probe. Expected fragment size: WT = N/A, T = 6.6 kb. (17C) Nanog-mCherry targeted blastocysts showed expression in the ICM. Mouse ES cell lines derived from targeted blastocysts remain mCherry positive, and the mCherry expression disappears upon differentiation. Scale bar, 100 μm. (17D) Schematic overview of the strategy to generate an Oct4-eGFP knock-in allele (SEQ ID NO: 101). The sgRNA coding sequence is underlined, capitalized, and labeled in red. The protospacer-adjacent motif (PAM) sequence is labeled in green. The homologous arms of the donor vector are indicated as HA-L (4.5 kb) and HA-R (2 kb). The IRES-eGFP transgene is indicated as a green box, and the PGK-Neo cassette is indicated as a grey box. The restriction enzyme used for Southern blot analysis is shown, and the Southern blot probes are shown as red boxes. (17E) Oct4-eGFP targeted blastocysts showed expression in the ICM. Scale bar, 50 μm. Mouse ES cell lines derived from targeted blastocysts remain GFP positive. Scale bar, 100 μm. (17F) Southern analysis of the Oct4-eGFP targeted allele. HindIII-digested genomic DNA was hybridized with the 3' external probe. Expected fragment size: WT = 9 kb, Targeted = 7.2 kb. The blot was then stripped and hybridized with the eGFP internal probe. Expected fragment size: WT = N/A, Targeted = 7.2 kb.
[0030] Figures 18A-18E: One step generation of a Mecp2 floxed allele. (18A) Schematic of the Cas9/sgRNA/oligo targeting sites in Mecp2 intron 2 and intron 3 (SEQ ID NOs: 102-106). The sgRNA coding sequence is underlined, capitalized, and labeled in red. The protospacer-adjacent motif (PAM) sequence is labeled in green. In the oligo donor sequence, the loxP site is indicated as an orange box, and the restriction site sequences are in bold and capitalized. Restriction enzymes used for RFLP and Southern blot analysis are shown, and the Southern blot probes are shown as red boxes. (18B) Southern analysis of targeted alleles. Data of five mice are shown. EcoRI/NheI-digested genomic DNA was hybridized with the exon3 probe. Expected fragment size: WT = 5.2 kb, 2loxP = 0.7 kb, L2-loxP = 3.9 kb, R1-loxP = 2 kb. The blot was then stripped and hybridized with the exon4 probe. Expected fragment size: WT = 5.2 kb, 2loxP = 3.2 kb, L2-loxP = 3.9 kb, R1-loxP = 3.2 kb. The sequence of the floxed allele is shown in Fig. 26B. (18C) In vitro Cre-mediated recombination of the floxed Mecp2 allele. The genomic DNA of targeted mice #1 and #3 was incubated with Cre recombinase, and used as PCR template. Primers DF and DR flanking the floxed allele produce shorter products upon Cre-dependent excision. Primers CF and CR detect the circular molecule, which only forms upon Cre-loxP recombination. The position of each primer is shown in the cartoon at the bottom. The deletion and circular PCR products were sequenced and the sequences are shown in Fig. 26C. (18D) Injection of Cas9 mRNA and both L2 and R1 sgRNA generated a Mecp2 mutant allele with deletion of exon 3. PCR genotyping using primers DF and DR identified defined deletion events in mice #1, #6, and #8 (indicated by stars). (18E) Sequences of three mutant alleles with exon 3 deletions in three mice (SEQ ID NOs: 107-111). R2 and L1 sgRNA coding sequences are underlined, capitalized, and labeled in red. The protospacer-adjacent motif (PAM) sequence is labeled in green.
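The two products described for Figure 18C (a shortened deletion product and a circular molecule) follow directly from how Cre acts on two loxP sites in the same orientation. The sketch below illustrates that logic on a toy sequence. It is an illustration under stated assumptions, not the analysis used in the figure: the function name `cre_excise` and the test sequence are hypothetical, only directly repeated sites on the given strand are handled, and the 34 bp wild-type loxP sequence is assumed.

```python
# Canonical 34 bp wild-type loxP site: 13 bp repeat, 8 bp spacer, 13 bp repeat
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def cre_excise(seq, loxp=LOXP):
    """Simulate Cre recombination between the first two direct-repeat loxP sites.

    Returns (deletion_product, circular_product), or None if fewer than two
    sites are found. The circular product is returned as a linear string whose
    two ends are understood to be joined; it carries one loxP site plus the
    floxed segment. The deletion product keeps the remaining loxP site.
    """
    seq = seq.upper()
    first = seq.find(loxp)
    if first == -1:
        return None
    second = seq.find(loxp, first + len(loxp))
    if second == -1:
        return None
    deletion = seq[:first] + seq[second:]
    circle = seq[first:second]
    return deletion, circle


if __name__ == "__main__":
    # Hypothetical floxed allele: flank, loxP, "exon", loxP, flank
    allele = "AAAA" + LOXP + "GGGGCCCC" + LOXP + "TTTT"
    deletion, circle = cre_excise(allele)
    print(len(allele), len(deletion), len(circle))
```

In this picture, a primer pair flanking both sites amplifies a shorter product from the deletion allele, while a primer pair pointing outward from the floxed segment yields a product only from the joined circle.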
[0031] Figures 19A-19C: Integration of loxP sites at the Tet1 and Tet2 loci. (19A) Schematic of the Cas9/sgRNA/oligo targeting sites in Tet1 exon 4 and Tet2 exon 3 (SEQ ID NOs: 112-115). The sgRNA coding sequence is underlined, capitalized, and labeled in red. The protospacer-adjacent motif (PAM) sequence is labeled in green. In the oligo donor sequence, the loxP site is indicated by an orange box, and the restriction site sequence is in bold and capitalized. (19B) RFLP analysis of double sgRNA/oligo injection mice with HDR-mediated targeting at the Tet1 and Tet2 loci. Regions of about 500 bp around the targeting sites at Tet1 and Tet2 were amplified from 16 embryos and digested with EcoRI. A correctly targeted allele is identified as a cleaved fragment. Samples containing targeted alleles are indicated by stars. (19C) The sequences of targeted alleles of Tet1 and Tet2 in samples #2 and #9 confirmed precise integration of loxP sites at both loci.
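The RFLP readout described for Figure 19B (an amplicon of about 500 bp is cleaved by EcoRI only if the donor-derived restriction site was integrated) can be mimicked in silico. The following is a minimal sketch assuming a linear amplicon and the EcoRI recognition site GAATTC, cut after the first G; the function names and the toy amplicons are hypothetical and are not the sequences used in the figure.

```python
def digest(amplicon, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear amplicon at every site.

    cut_offset is the number of bases of the recognition site that stay on the
    upstream fragment (EcoRI cuts G^AATTC, so the offset is 1).
    """
    seq = amplicon.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def call_rflp(amplicon, site="GAATTC", cut_offset=1):
    """Classify an amplicon as 'cleaved' (site present) or 'uncut'."""
    return "cleaved" if len(digest(amplicon, site, cut_offset)) > 1 else "uncut"


if __name__ == "__main__":
    wild_type = "ACGT" * 125                      # toy 500 bp amplicon without a site
    targeted = wild_type[:250] + "GAATTC" + wild_type[250:]
    print(call_rflp(wild_type), digest(wild_type))
    print(call_rflp(targeted), digest(targeted))
```

On a gel, a heterozygous sample would show both the uncut band and the cleaved fragments, which this per-allele sketch does not model.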
[0032] Figures 20A-20C: Characterization of Nanog-mCherry alleles. (20A) ES clone with mosaic expression of mCherry. The mCherry negative colony is indicated by the arrow. (20B) Southern analysis of the Nanog-mCherry targeted allele identified a mosaic animal. NcoI-digested genomic DNA was hybridized with the 3' external probe. Expected fragment size: WT = 11.5 kb, T = 5.6 kb. Mouse #6 is identified as mosaic, because the targeted band (indicated by arrow) is weaker than the WT band. (20C) The blot was then stripped and hybridized with the mCherry internal probe. Expected fragment size: WT = N/A, Targeted = 6.6 kb. In addition to the targeted allele, one extra band (indicated by arrow) is present in mouse #3, indicating a random insertion of the donor vector.
[0033] Figures 21A-21B: Integration of loxP sites at Mecp2 intron 2 and 3. (21A) Schematic of the Cas9/sgRNA/oligo targeting sites (SEQ ID NOs: 116-124). The sgRNA coding sequence is underlined, capitalized, and labeled in red. The protospacer-adjacent motif (PAM) sequence is labeled in green. In the oligo donor sequence, the loxP site is labeled as an orange box, and the restriction site sequence is in bold and capitalized. PCR primers used for RFLP analysis are shown as red arrows. For intron 2, two sgRNA coding sequences L1 and L2 are shown, and their corresponding oligos are named accordingly. For intron 3, R1, R2 and their targeting oligos are shown. PCR primers LF and LR are used to amplify the intron 2 region, while RF and RR are used to amplify the intron 3 region. (21B) RFLP analysis of single sgRNA/oligo injection mice with HDR-mediated targeting at Mecp2 intron 2 or intron 3. Cleavage of the PCR product upon NheI or EcoRI digestion indicates loxP integration at intron 2 or intron 3, respectively. LoxP integration efficiencies at the L1, L2, R1, and R2 sites are compared. Samples containing a loxP site are labeled by stars. Three out of eight samples contained a loxP site at the L1 site, and four out of eight contained a loxP site at the L2 site. Two out of six samples contained a loxP site at the R1 site, while none was detected at the R2 site. Primers used for each PCR are labeled.
[0034] Figures 22A-22C: Analysis of Mecp2 floxed allele. (22A) RFLP analysis detected loxP integration at intron 2 (Mecp2-L2) and intron 3 (Mecp2-R1) in mice derived from L2 and R1 double sgRNA/oligo injections. Primers LF and LR were used to amplify the intron 2 region, and RF and RR were used to amplify the intron 3 region. Mice containing loxP sites in both introns are marked by stars. (22B) Partial chromatograph from one single sequencing file crossing both loxP sites, exon 3, and flanking intron sequences. (22C) Partial chromatograph from sequences of Cre-mediated recombination PCR products (deletion and circular products from Fig. 18C).
[0035] Figures 23A-23D: CRISPR-on activates exogenous transgenes. (23A) Schematic of the dCas9VP48 mediated transgene activation in HeLa cells. dCas9VP48 was generated by fusing dCas9 (indicated by black circle) to the VP48 domain (indicated by green diamond). sgRNA complementary to the rtTA binding site is indicated by a small hairpin labeled sgTetO. (23B) dCas9VP48 activates the TetO::tdTomato transgene in HeLa cells. Upper (top) panel, phase contrast picture of transfected cells; middle panel, tdTomato signal using fluorescence microscopy; bottom panel, FACS analysis of transfected cells. Column i, cells transfected with GFP plasmid; Column ii, cells treated with doxycycline; Column iii, cells transfected with dCas9VP48 only; Column iv, cells transfected with dCas9VP48 and sgTetO. Cells were transfected with the indicated plasmids and 48 hr later were analyzed by flow cytometry for tdTomato expression. (23C) Schematic of the dCas9VP48 mediated reporter activation in early mouse embryos. dCas9VP48, Nanog::EGFP vector, and 7 sgRNAs targeting the Nanog promoter were co-injected into mouse zygotes, which were cultured to the blastocyst stage. (23D) dCas9VP48/sgRNA can activate a gene in vivo. Left panel, embryos injected with dCas9VP48 and Nanog::EGFP vector; right panel, embryos injected with dCas9VP48, Nanog::EGFP vector and sgRNAs targeting the Nanog promoter. Embryos two, three, and four days post-injection are shown.
[0036] Figures 24A-24G: dCas9VP160 activated multiple endogenous genes simultaneously. (24A) Protein architecture of dCas9VP160 compared to VP48. (24B) Schematic of the human IL1RN promoter region. Locations of transcription start site (TSS) and start codon (ATG) are indicated. Short lines with numbers indicate targeting sites of the sgRNAs. (24C) Activation of human IL1RN expression in HEK293T cells. Cells transfected with dCas9VP160 and indicated sgRNAs were analyzed by qRT-PCR 2 days later. sgTetO-mut, negative control sgRNA. Error bars show standard deviation (SD) among triplicates. (24D) Schematic of the human SOX2 promoter region. Locations of TSS and start codon (ATG) are indicated. Short lines with numbers indicate locations of sgRNAs. (24E) Activation of SOX2. Cells transfected with dCas9VP160 and indicated sgRNAs were analyzed by qRT-PCR 2 days later. sgTetO-mut, negative control sgRNA. Error bars show SD among triplicates. (24F) Schematic of the human OCT4 promoter region. Locations of transcription start site (TSS) and start codon (ATG) are indicated. Short lines with numbers indicate locations of sgRNAs. (24G) Activation of OCT4. Cells transfected with dCas9VP160 and indicated sgRNAs were analyzed by qRT-PCR 2 days later. sgTetO-mut, negative control sgRNA. Error bars show SD among triplicates.
[0037] Figures 25A-25B: Multiple exogenous and endogenous genes were simultaneously activated by CRISPR-on. (25A) One exogenous and two endogenous genes were simultaneously activated by CRISPR-on. Cells transfected with dCas9VP160 and indicated sgRNAs were analyzed by qRT-PCR 2 days later. sgTetO-mut, negative control sgRNA. Error bars show SD among triplicates. (25B) Three endogenous genes, SOX2, IL1RN, and OCT4, can be simultaneously activated by dCas9VP160/sgRNAs. Cells were transfected with dCas9VP160 and indicated sgRNAs and were analyzed by qRT-PCR 2 days later. sgTetO-mut, negative control sgRNA. The last three sets of bars represent triple activation experiments using sgSOX2, sgOCT4 and sgIL1RN with three different ratios of sgSOX2:sgIL1RN, keeping the amount of sgOCT4 constant, as indicated by numbers above the line. Error bars show SD among triplicates.
[0038] Figures 26A-26D: CRISPR-on is specific. (26A) A histogram showing the distribution of log2 fold changes of gene expression in a sample transfected with dCas9VP160/sgTetO over the dCas9VP160/sgTetO-mut control. (26B) A histogram showing the distribution of log2 fold changes of gene expression in a sample transfected with dCas9VP160/sgIL1RN1~3 over the dCas9VP160/sgTetO-mut control. The vertical line marks the fold change of the target gene IL1RN. (26C) Column graph showing the log2 fold changes of genes up-regulated by at least two fold in cells transfected with dCas9VP160/sgTetO over dCas9VP160/sgTetO-mut. The dotted line indicates the 2 fold cut-off. (26D) Column graph showing the log2 fold changes of genes up-regulated by more than two fold in cells transfected with dCas9VP160/sgIL1RN1~3 over dCas9VP160/sgTetO-mut.
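The quantities plotted in Figures 26A-26D (log2 fold change of each gene in a sample over a control, with a two fold cut-off for up-regulated genes) can be computed as sketched below. This is an illustrative sketch only: the function names, the expression values, and the pseudocount added to avoid division by zero are assumptions and are not taken from the analysis behind the figures.

```python
import math


def log2_fold_changes(sample, control, pseudocount=1.0):
    """Return {gene: log2((sample + p) / (control + p))} for genes in both inputs.

    The pseudocount p is an assumption made here so that genes with zero
    expression in the control do not cause a division by zero.
    """
    return {
        gene: math.log2((sample[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in sample
        if gene in control
    }


def upregulated(lfc, min_fold=2.0):
    """Keep genes whose fold change is at least min_fold (log2 >= log2(min_fold))."""
    cutoff = math.log2(min_fold)
    return {gene: value for gene, value in lfc.items() if value >= cutoff}


if __name__ == "__main__":
    # Hypothetical expression values, for illustration only
    sample = {"IL1RN": 79.0, "GAPDH": 99.0, "GENE_X": 3.0}
    control = {"IL1RN": 9.0, "GAPDH": 99.0, "GENE_X": 1.0}
    lfc = log2_fold_changes(sample, control)
    for gene, value in sorted(upregulated(lfc).items()):
        print(gene, round(value, 2))
```

A histogram of all values in `lfc` corresponds to the kind of distribution shown in panels 26A and 26B, and the filtered dictionary to the column graphs in 26C and 26D.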
[0039] Figure 27: The persistence of CRISPR-on mediated transgene expression. HeLa/TetO::tdTomato;EF1a::rtTA-M2 cells were transfected with dCas9VP48/sgTetO, and mean fluorescence measured by FACS is shown as fold change relative to the non-transfected control for the indicated samples 2 days, 12 days and 18 days after transfection.
[0040] Figures 28A-28B: CRISPR-on activates transgene in mouse cells. dCas9VP48 guided by sgRNA targeting the rtTA binding site activates the TetO promoter in NIH3T3/TetO::tdTomato;EF1a::rtTA-M2 cells. (28A) Schematic of the dCas9VP48 mediated transgene activation in NIH3T3 cells. dCas9VP48 was generated by fusion of dCas9 to VP48 and then co-transfected with sgRNA complementary to the tet binding site in NIH3T3/TetO::tdTomato;EF1a::rtTA-M2 cells. (28B) dCas9VP48 depends on sgRNA to bind to the target tetO promoter to activate the TetO::tdTomato transgene in NIH3T3 cells. Cells were transfected with the indicated plasmids or sgRNAs and were analyzed by flow cytometry for tdTomato expression 48 hours later.
[0041] Figure 29: CRISPR-on activated a single-copy transgene in ESCs. The indicated plasmids were transfected into a Tet-inducible MSI1 over-expression mouse embryonic stem cell (mESC) line, and the cells were analyzed by western blot for MSI1 expression 48 hours later.
[0042] Figures 30A-30B: Tunable gene activation can be achieved by titration of sgRNA. (30A) Schematic of CRISPR-on-mediated transgene activation with titration of sgRNA in HeLa cells. (30B) Fold changes of mean fluorescence of tdTomato under the control of the TetO promoter. Cells were transfected with dCas9 activator and the indicated amount of sgTetO, and mean fluorescence measured by FACS was analyzed 2 days later. NT = not transfected; C = negative control sgRNA.
[0043] Figures 31A-31B: dCas9VP48 with 6 sgRNAs failed to activate the IL1RN gene. (31A) Schematic of the human IL1RN promoter region. Locations of transcription start site (TSS) and start codon (ATG) are indicated. Short lines with numbers indicate locations of sgRNAs. (31B) Activation of human IL1RN expression in HEK293T cells by dCas9VP48/sgRNAs. Cells were transfected with dCas9VP48 and six sgRNAs and 2 days later were analyzed by qRT-PCR. sgTetO-mut, negative control sgRNA. Error bars show SD among triplicates.
[0044] Figures 32A-32C: Nucleotide sequences of dCas9VP64 on pmax expression vector (SEQ ID NO: 486), dCas9VP96 on pmax expression vector (SEQ ID NO: 487), and dCas9VP160 on pmax expression vector (SEQ ID NO: 488).
DETAILED DESCRIPTION OF THE INVENTION
[0045] A description of example embodiments of the invention follows.
[0046] Mice carrying mutations in multiple genes are traditionally generated by sequential recombination in embryonic stem cells and/or time-consuming intercrossing of mice carrying single mutations. Described herein is the development of an efficient technology for the generation of animals carrying multiple mutated genes. Specifically, the clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR associated genes (Cas genes), referred to herein as the CRISPR/Cas system, has been adapted as an efficient gene targeting technology, e.g., for multiplexed genome editing. Demonstrated herein is that CRISPR/Cas mediated gene editing allows the simultaneous disruption of five genes (Tet1, Tet2, Tet3, Sry, Uty - 8 alleles) in mouse embryonic stem cells (mESCs) with high efficiency. Co-injection of Cas9 mRNA and single guide RNA (sgRNA) targeting Tet1 and Tet2 into zygotes generated mice with biallelic mutations in both genes with an efficiency of 80%. In addition, co-injection of Cas9 mRNA/sgRNAs with mutant oligos generated precise point mutations in target genes. Thus, shown herein is that the CRISPR/Cas system allows the one step generation of animals carrying mutations in multiple genes, an approach that will greatly accelerate the in vivo study of, for example, functionally redundant genes and of epistatic gene interactions. In certain embodiments a method described herein generates non-human mammals, e.g., mice, with biallelic mutations in 1, 2, 3, 4, 5, or more genes with an efficiency of between 20% and 95%, or even more, e.g., at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 85%, 90%, 95%, or more, e.g., up to 96%, 97%, 98%, 99%, or more. For example, in certain embodiments a method described herein generates non-human mammals, e.g., mice, with biallelic mutations in 2, 3, 4, 5, or more genes with an efficiency of at least 70%, 80%, 85%, 90%, 95%, or more, e.g., between 70% and 85%, 90%, 95%, 96%, 97%, 98%, 99%, or more.
[0047] Accordingly, in one aspect, the invention is directed to a method of mutating or modulating one or more target nucleic acid sequences in a (one or more) stem cell or a zygote comprising introducing into the stem cell or zygote (i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to a portion of each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; and (ii) a Cas nucleic acid sequence or a variant thereof that encodes a Cas protein having nuclease activity. The stem cell or zygote is maintained under conditions in which the one or more RNA sequences hybridize to the portion of each of the one or more target nucleic acid sequences, and the Cas protein cleaves each of the one or more target nucleic acid sequences upon hybridization of the one or more RNA sequences to the portion of the target nucleic acid sequence, thereby mutating one or more target nucleic acid sequences in the stem cell or zygote. In a particular aspect, the stem cell or zygote into which the one or more RNA sequences and Cas nucleic acid sequence are introduced is an isolated stem cell or isolated zygote. The method can also further comprise introducing the stem cell or zygote into a nonhuman mammal.
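As an illustration of how a portion of a target nucleic acid sequence might be chosen for the complementary RNA, the sketch below lists candidate target sites in a DNA sequence. It is a hedged example rather than part of the claimed method: it assumes a 20-nt targeting sequence (the length given for the sequences of Figure 9) immediately followed by an NGG PAM (as labeled in Figure 7A), and the function names and test sequence are hypothetical.

```python
import re


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def candidate_targets(seq, length=20):
    """List candidate sites: `length` bases immediately followed by an NGG PAM.

    Both strands are scanned. `start` is the 0-based index of the protospacer
    on the strand that was scanned (for '-', an index into the reverse
    complement). A lookahead is used so that overlapping sites are all found.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    targets = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for match in pattern.finditer(s):
            targets.append({
                "strand": strand,
                "start": match.start(),
                "protospacer": match.group(1),
                "pam": match.group(2),
            })
    return targets


if __name__ == "__main__":
    # Hypothetical sequence, for illustration only
    toy = "TT" + "GACCTTGAACGTACGGATTC" + "TGG" + "AA"
    for target in candidate_targets(toy):
        print(target)
```

Each reported protospacer is a possible choice for the portion of the RNA sequence that is complementary to the target; selecting among candidates (for example, to avoid similar sites elsewhere in the genome) is a separate step not shown here.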
[0048] The methods described herein can be used to mutate or modulate one or more nucleic acid sequences in a variety of stem cells, which include totipotent, pluripotent, multipotent, oligopotent and unipotent stem cells. Specific examples of stem cells include embryonic stem cells, fetal stem cells, adult stem cells, and induced pluripotent stem cells (iPSCs) (e.g., see U.S. Published Application Nos. 20100144031, 20110076678, 20110088107, 20120028821, all of which are incorporated herein by reference).
[0049] In some embodiments a stem cell is a pluripotent cell. A "pluripotent" cell has the ability to self-renew and to differentiate into cells of all three embryonic germ layers (endoderm, mesoderm and ectoderm) and, typically, has the potential to divide in vitro for a long period of time, e.g., at least 20, at least 25, or at least 30 passages, or more (e.g., up to 80 passages, or up to 1 year, or more), without losing its self-renewal and differentiation properties. A pluripotent cell is said to exhibit or be in a "pluripotent state". A pluripotent cell line or cell culture is often characterized in that the cells can differentiate into a wide variety of cell types in vitro and in vivo. Cells that are able to form teratomas containing cells having characteristics of endoderm, mesoderm, and ectoderm when injected into SCID mice are considered pluripotent. Cells that possess the ability to participate in formation of chimeras (upon injection into a blastocyst of the same species that is transferred to a suitable foster mother of the same species) that survive to term are pluripotent. If the germ line of the chimeric animal contains cells derived from the introduced cell, the cell is considered germline-competent in addition to being pluripotent.
[0050] ES cells are examples of pluripotent cells. ES cells have been derived from mice, primates (including humans), and some other species. ES cells are often derived from cells obtained from the inner cell mass (ICM) of a vertebrate blastocyst but can also be derived from single blastomeres (e.g., removed from a morula). Pluripotent cells can also be obtained using somatic cell nuclear transfer in at least some species, e.g., mice and various non-human primates. Pluripotent cells can also be obtained using parthenogenesis, e.g., from germ cells, e.g., oocytes. Other pluripotent cells include embryonic carcinoma (EC) and embryonic germ (EG) cells. See, e.g., Yu J, Thomson JA, Pluripotent stem cell lines. Genes Dev. 22(15): 1987-97, 2008.
[0051] "Reprogramming", as used herein, refers to a process that alters the differentiation state or identity of a cell. Induced pluripotent stem (iPS) cells are pluripotent, ES-like cells derived from somatic cells (e.g., fibroblasts, keratinocytes, hematopoietic cells, neural precursor cells) by reprogramming. Reprogramming can be performed using a variety of different methods. As used herein, "reprogramming protocol" refers to any treatment or combination of treatments that causes at least some cells to become reprogrammed. In some embodiments "reprogramming protocol" refers to a set of manipulations (e.g., introduction of nucleic acid(s), e.g., vector(s), carrying particular genes) and/or culture conditions (e.g., culture in medium containing particular compounds) that generates pluripotent cells from somatic cells, e.g., in vitro. As used herein, the term "reprogramming factor" encompasses genes, RNAs, or proteins that promote or contribute to cell reprogramming, e.g., in vitro. Many useful reprogramming factors are transcription factors. In some aspects the terms "reprogramming", "reprogramming to a pluripotent state", "reprogramming to pluripotency", refer to in vitro reprogramming methods that do not require and typically do not include nuclear or cytoplasmic transfer or cell fusion, e.g., with oocytes, embryos, germ cells, or pluripotent cells. Any embodiment or claim may specifically exclude compositions or methods relating to or involving nuclear or cytoplasmic transfer or cell fusion, e.g., fusion of a somatic cell with oocytes, embryos, germ cells, or pluripotent cells or transfer of a somatic cell nucleus to oocytes, embryos, germ cells, or pluripotent cells.
[0052] Differentiated cells can be reprogrammed to a pluripotent state by overexpression of the four transcription factors Oct4, Sox2, Klf4, and c-Myc (Takahashi, K. & Yamanaka, S. Cell 126, 663-676, 2006). Fully reprogrammed induced pluripotent stem cells (iPSCs) can contribute to the three germ layers and give rise to fertile mice by tetraploid complementation (Wernig, M., et al. (2007). In vitro reprogramming of fibroblasts into a pluripotent ES-cell-like state. Nature 448, 318-324; Hanna, J., et al. (2009). Direct cell reprogramming is a stochastic process amenable to acceleration. Nature 462, 595-601). The reprogramming process is characterized by widespread epigenetic changes that generate iPSCs that are functionally and molecularly similar to embryonic stem (ES) cells (Carey, B. W. et al. Reprogramming factor stoichiometry influences the epigenetic state and biological properties of induced pluripotent stem cells. Cell Stem Cell 9, 588-598 (2011)).
[0053] Reprogramming somatic cells to a pluripotent state can be achieved by infecting cells with retroviruses that encode the transcription factors Oct4, Sox2, Klf4, and c-Myc (termed "OSKM factors") under control of a viral LTR. Oct4, Sox2 and Klf4 ("OSK factors") are also sufficient to reprogram mammalian, e.g., rodent or human, somatic cells to pluripotency. Other sets of reprogramming factors, e.g., Oct4, Sox2, Nanog, and Lin28 (OSNL factors) can be used to reprogram mammalian cells, e.g., rodent or human cells, with Lin28 being dispensable. The ectopically expressed factors induce expression of endogenous pluripotency genes such as Oct4 and Nanog. Since the retroviral vectors in iPS cells derived by this approach are silenced, maintenance of pluripotency relies on expression of such endogenous genes and establishment of an appropriate transcriptional network in the reprogrammed cells. Furthermore, reprogramming factors that are members of the same gene family may be used in place of one another in certain embodiments. For example, Klf2 and Klf5 can substitute for Klf4, Sox1 for Sox2, and N-Myc for c-Myc. It has recently been discovered that reprogramming can be achieved using Sall4, Nanog, Esrrb, and Lin28 as reprogramming factors (SNEL factors) or using Sall4, Lin28, Esrrb, and Dppa2 (SLED factors) (Buganim Y, et al., Cell. 2012 Sep 14;150(6):1209-22). Thus, examples of reprogramming factors of interest for reprogramming somatic cells to pluripotency in vitro include Oct4, Sall4, Nanog, Esrrb, Lin28, Klf4, c-Myc, Dppa2, and any gene/RNA/protein that can substitute for one or more of these in a method of reprogramming somatic cells in vitro.
[0054] Exogenous reprogramming factors may be introduced into somatic cells in any form that is capable of maintaining exogenous reprogramming factors for a period of time and at levels sufficient to activate endogenous pluripotency genes and for reprogramming of at least some of the somatic cells into which the exogenous reprogramming factors are introduced to occur. As used herein, "exogenous" refers to a substance present in a cell or organism other than its native source. For example, the terms "exogenous nucleic acid" or "exogenous protein" refer to a nucleic acid or protein that has been introduced by a process involving the hand of man into a biological system such as a cell or organism in which it is not normally found or in which it is found in lower amounts. A substance will be considered exogenous if it is introduced into a cell or an ancestor of the cell that inherits the substance. In contrast, the term "endogenous" refers to a substance that is native to the biological system.
[0055] Somatic cells of use in aspects of the invention may be primary cells (non-immortalized cells), such as those freshly isolated from an animal, or may be derived from a cell line capable of prolonged proliferation in culture (e.g., for longer than 3 months) or indefinite proliferation (immortalized cells). Adult somatic cells may be obtained from individuals, e.g., human subjects, and cultured according to standard cell culture protocols available to those of ordinary skill in the art. Cells may be maintained in cell culture following their isolation from a subject. In certain embodiments, the cells are passaged once or more following their isolation from the individual (e.g., between 2-5, 5-10, 10-20, 20-50, 50-100 times, or more) prior to their use in a method of the invention. In some embodiments, cells may be frozen and subsequently thawed prior to use. In some embodiments, cells will have been passaged no more than 1, 2, 5, 10, 20, or 50 times following their isolation from an individual prior to their use in a method of the invention. Somatic cells of use in aspects of the invention include mammalian cells, such as, for example, human cells, non-human primate cells, or rodent (e.g., mouse, rat) cells. They may be obtained by well-known methods from various organs, e.g., skin, lung, pancreas, liver, stomach, intestine, heart, breast, reproductive organs, muscle, blood, bladder, kidney, urethra and other urinary organs, etc., generally from any organ or tissue containing live somatic cells. Mammalian somatic cells useful in various embodiments include, for example, fibroblasts, Sertoli cells, granulosa cells, neurons, pancreatic cells, epidermal cells, epithelial cells, endothelial cells, hepatocytes, hair follicle cells, keratinocytes, hematopoietic cells, melanocytes, chondrocytes, lymphocytes (B and T lymphocytes), macrophages, monocytes, mononuclear cells, cardiac muscle cells, skeletal muscle cells, etc. In some embodiments a somatic cell is a terminally differentiated somatic cell. In some embodiments a somatic cell is a progenitor (precursor) cell, which has not terminally differentiated.
[0056] In some embodiments, reprogramming factors are introduced into somatic cells in the form of one or more nucleic acid sequences encoding the reprogramming factors. In some embodiments, the one or more nucleic acid sequences comprise DNA. In some embodiments, the one or more nucleic acid sequences comprise RNA. In some embodiments, the one or more nucleic acid sequences comprise a nucleic acid construct. In some embodiments, the one or more nucleic acid sequences comprise a vector for delivery of the reprogramming factors into a target cell (e.g., a mammalian somatic cell, e.g., a human or mouse fibroblast cell). Any suitable vector may be used. Examples of suitable vectors are described by Stadtfeld and Hochedlinger (Genes Dev. 24:2239-2263, 2010, incorporated herein by reference in its entirety). Other suitable vectors are apparent to those skilled in the art.
[0057] In some embodiments, a vector comprises an inducible vector. In some embodiments, the inducible vector is a doxycycline inducible vector (i.e., a vector that activates expression of said reprogramming factors in the presence of doxycycline in a culture medium). "Expression" refers to the cellular processes involved in producing RNA and proteins as applicable, for example, transcription, translation, folding, modification and processing. "Expression products" include RNA transcribed from a gene and polypeptides obtained by translation of mRNA transcribed from a gene. In some embodiments, the inducible vector is a tamoxifen inducible vector or encodes a tamoxifen-inducible protein. In some embodiments, a vector is an integrating vector that integrates into a genome of a host cell (e.g., a mammalian somatic cell). In some embodiments, a vector comprises a viral vector, e.g., a retroviral vector, e.g., a lentiviral vector. In some embodiments, a vector comprises an excisable vector. In some embodiments, the excisable vector comprises a transposon, wherein said excisable vector is excisable from said genome by transient expression of a transposase. In certain embodiments, the transposon comprises a piggyBac transposon (See, e.g., Woltjen et al. Nature 458:766-770, 2009; Yusa et al. Nat Methods 6:363-369, 2009, each of which is incorporated herein by reference in its entirety). In some embodiments, the excisable vector comprises one or more loxP sites incorporated into said vector, wherein said vector can be excised from said genome by transient expression of a Cre recombinase (See, e.g., Kaji et al. Nature 458:771-775, 2009; Soldner et al. Cell 136:964-977, 2009, each of which is incorporated herein by reference in its entirety). In some embodiments, the excisable vector comprises a floxed lentiviral vector.
[0058] In some embodiments, the vector does not integrate into the genome of said somatic cell. In some embodiments, the vector comprises an adenoviral vector (See, e.g., Zhou and Freed. Stem Cells 27:2667-2674, 2009, the teachings of which are incorporated herein by reference). In some embodiments, the vector comprises a Sendai viral vector (See, e.g., Fusaki et al. Proc Jpn Acad 85:348-362, 2009, the teachings of which are incorporated herein by reference). In some embodiments, the vector comprises a plasmid. In some embodiments, the vector comprises an episome (Yu et al. Science 324(5928):797-801, 2009, the teachings of which are incorporated herein by reference).
[0059] In some embodiments, to minimize the number of independent proviral integrations required for reprogramming, a nucleic acid construct comprises a polycistronic vector that can transduce any combination of reprogramming factors. Such polycistronic nucleic acid constructs, expression cassettes, and vectors, which employ internal ribosomal entry sites and self-cleaving peptides and are capable of transducing any combination of reprogramming factors, are described in PCT Application Publication No. WO 2009/152529, incorporated herein by reference in its entirety.
[0060] In certain embodiments reprogramming factors are provided by polycistronic nucleic acid constructs (e.g., expression cassettes, and vectors comprising such constructs). In certain embodiments the polycistronic nucleic acid constructs comprise a portion that encodes a self-cleaving peptide. In certain embodiments a polycistronic nucleic acid construct comprises at least two, three, or four coding regions, wherein the coding regions are linked to each other by a nucleic acid that encodes a self-cleaving peptide so as to form a single open reading frame, and wherein the coding regions encode at least first and second reprogramming factors capable, either alone or in combination with one or more additional reprogramming factors, of reprogramming a mammalian somatic cell to pluripotency. In some embodiments of the invention the construct comprises two coding regions separated by a self-cleaving peptide. In some embodiments constructs encode a polyprotein that comprises 2, 3, or 4 reprogramming factors, separated by self-cleaving peptides. In some embodiments the construct comprises expression control element(s), e.g., a promoter, suitable to direct expression in mammalian cells, wherein the portion of the construct that encodes the polyprotein is operably linked to the expression control element(s). The promoter drives transcription of a polycistronic message that encodes the reprogramming factors, each reprogramming factor being linked to at least one other reprogramming factor by a self-cleaving peptide. The promoter can be a viral promoter (e.g., a CMV promoter) or a mammalian promoter (e.g., a PGK promoter). The expression cassette or construct can comprise other genetic elements, e.g., to enhance expression or stability of a transcript. In some embodiments of the invention any of the foregoing constructs or expression cassettes may further include a coding region that does not encode a reprogramming factor, wherein the coding region is separated from adjacent coding region(s) by a self-cleaving peptide. In some embodiments the additional coding region encodes a selectable marker. In some embodiments, the self-cleaving peptide is a viral 2A peptide. In some embodiments, the self-cleaving peptide is an aphthovirus 2A peptide.
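The single-open-reading-frame arrangement described in this paragraph can be illustrated with a minimal Python sketch at the protein level. The sketch is an assumption-laden illustration, not a construct design: the factor sequences are hypothetical placeholders, the linker is the commonly cited porcine teschovirus-1 2A (P2A) peptide chosen only as one example of a viral 2A peptide, and the function names are invented for this example. Separation is modeled as occurring between the final glycine and proline of the 2A peptide.

```python
# Illustrative sketch only: a polyprotein whose coding regions are joined by a
# self-cleaving 2A peptide. Factor sequences below are hypothetical placeholders.

# Commonly cited P2A peptide; assumed here as one example of a viral 2A peptide.
P2A = "ATNFSLLKQAGDVEENPGP"


def build_polyprotein(factors, linker=P2A):
    """Join factor protein sequences with the 2A linker into one polyprotein."""
    return linker.join(factors)


def cleave_2a(polyprotein, linker=P2A):
    """Split the polyprotein between the final Gly and Pro of each 2A linker.

    Each upstream product keeps the 2A residues up to the Gly; each downstream
    product begins with the Pro.
    """
    products = []
    rest = polyprotein
    while linker in rest:
        cut = rest.index(linker) + len(linker) - 1  # index of the final Pro
        products.append(rest[:cut])
        rest = rest[cut:]
    products.append(rest)
    return products


# Hypothetical, shortened placeholder sequences (not real protein sequences).
factors = ["MAAAKKKEEE", "MCCCDDDHHH", "MEEEFFFIII"]
polyprotein = build_polyprotein(factors)
products = cleave_2a(polyprotein)

for product in products:
    print(product)
```

In this model each upstream product carries a C-terminal 2A remnant and each downstream product gains an N-terminal proline, which is one reason the order of coding regions in such a construct may matter.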
[0061] In some embodiments a construct comprises sites for a recombinase that is functional in mammalian cells, wherein the sites flank at least the portion of the construct that comprises the coding regions for the factors (i.e., one site is positioned 5' and a second site is positioned 3' to the portion of the construct that encodes the polyprotein), so that the sequence encoding the factors can be excised from the genome after reprogramming. The recombinase can be, e.g., Cre or Flp, where the corresponding recombinase sites are LoxP sites and Frt sites. In some embodiments the recombinase is a transposase. It will be understood that the recombinase sites need not be directly adjacent to the region encoding the polyprotein but will be positioned such that a region whose eventual removal from the genome is desired is located between the sites. In some embodiments the recombinase sites are on the 5' and 3' ends of an expression cassette. Excision may result in a residual copy of the recombinase site remaining in the genome, which in some embodiments is the only genetic change resulting from the reprogramming process.
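The excision outcome described in this paragraph, in which the flanked region is removed and one copy of the recombinase site remains, can be sketched symbolically in Python. The element names and the function are hypothetical and chosen for illustration; the sketch assumes two sites in the same orientation and does not model recombinase biochemistry or site orientation.

```python
def excise_between_sites(elements, site="loxP"):
    """Return the locus after excision between the first two recombinase sites.

    Everything from the first site up to (but not including) the second site
    is removed, so a single residual site remains. If fewer than two sites
    are present, the locus is returned unchanged.
    """
    positions = [i for i, element in enumerate(elements) if element == site]
    if len(positions) < 2:
        return list(elements)
    first, second = positions[0], positions[1]
    return list(elements[:first]) + list(elements[second:])


# Hypothetical integrated cassette flanked by two recombinase sites.
integrated_locus = [
    "genomic_5prime",
    "loxP",
    "promoter",
    "polyprotein_coding_region",
    "loxP",
    "genomic_3prime",
]

excised_locus = excise_between_sites(integrated_locus)
print(excised_locus)  # ['genomic_5prime', 'loxP', 'genomic_3prime']
```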
[0062] In some embodiments, one or more nucleic acids for introducing reprogramming factors comprise mRNA that is translatable in a mammalian somatic cell. In some embodiments, the mRNA can be introduced in vitro into somatic cells to be reprogrammed and translated by endogenous enzymes into proteins that can activate one or more endogenous pluripotency genes in the cell. As used herein, "pluripotency gene" refers to a gene whose expression under normal conditions (e.g., in the absence of genetic engineering or other manipulation designed to alter gene expression) occurs in and is typically restricted to pluripotent stem cells, and is crucial for their functional identity as such. It will be appreciated that the polypeptide encoded by a pluripotency gene may be present as a maternal factor in the oocyte. The gene may be expressed by at least some cells of the embryo, e.g., throughout at least a portion of the preimplantation period and/or in germ cell precursors of the adult. The gene may be expressed in ES cells and/or in embryonic carcinoma cells. The pluripotency gene is typically substantially not expressed in somatic cell types that constitute the body of an adult animal under normal conditions (with the exception of germ cells or precursors thereof, or possibly in certain disease states such as cancer). For example, the pluripotency gene may be one whose average expression level (based on RNA or protein) in ES cells is at least 50-fold or 100-fold greater than its average level in those terminally differentiated cell types present in the body of an adult mammal. In some embodiments, the pluripotency gene is one that encodes multiple splice variants or isoforms of a protein, wherein one or more such variants or isoforms is expressed in at least some adult somatic cell types, while one or more other variants or isoforms is not substantially expressed in adult somatic cells under normal conditions. In some embodiments, expression of the pluripotency gene is essential to maintain the viability or pluripotent state of iPSCs. Thus if the gene is knocked out or its expression is inhibited (i.e., its expression is eliminated or substantially reduced, e.g., such that the average steady state level of RNA transcript and/or protein encoded by the gene is decreased by at least 50%, 60%, 70%, 80%, 90%, 95%, or more), the iPSCs are not formed, die or, in some embodiments, differentiate or cease to be pluripotent. In some embodiments the pluripotency gene is characterized in that its expression in an ES cell or iPS cell decreases (resulting in, e.g., a reduction in the average steady state level of RNA transcript and/or protein encoded by the gene by at least 50%, 60%, 70%, 80%, 90%, 95%, or more) when the cell differentiates into a terminally differentiated cell. Oct4 and Nanog are exemplary pluripotency genes. In some embodiments, the mRNA is in vitro transcribed mRNA. Non-limiting examples of producing in vitro transcribed mRNA are described by Warren et al. (Cell Stem Cell 7(5):618-30, 2010), Mandal PK, Rossi DJ (Nat Protoc. 2013 Mar;8(3):568-82), and/or PCT/US2011/032679 (WO/2011/130624), the teachings of each of which are incorporated herein by reference. The protocols described may be adapted to produce one or more mRNAs of interest in the present invention. In some embodiments, mRNA, e.g., in vitro transcribed mRNA, comprises a sequence encoding SV40 large T (LT). In some embodiments, mRNA, e.g., in vitro transcribed mRNA, comprises one or more modifications that increase stability or translatability of said mRNA. In some embodiments, mRNA, e.g., in vitro transcribed mRNA, comprises a 5' cap. The cap may be wild-type or modified. Examples of suitable caps and methods of synthesizing mRNA containing such caps are apparent to those skilled in the art.
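The quantitative criteria mentioned in this paragraph (an ES-cell expression level at least 50-fold or 100-fold above the level in terminally differentiated cells, and a reduction of at least 50% on knockdown or differentiation) reduce to simple arithmetic. The following Python sketch shows one way to compute them; the function names, the default threshold and the expression values are hypothetical and serve only as an illustration.

```python
def fold_enrichment(es_level, somatic_level):
    """Ratio of average expression in ES cells to that in somatic cells."""
    if somatic_level <= 0:
        return float("inf")  # not detected in somatic cells
    return es_level / somatic_level


def percent_reduction(level_before, level_after):
    """Percent decrease in steady state level relative to the starting level."""
    return 100.0 * (level_before - level_after) / level_before


def meets_fold_criterion(es_level, somatic_level, min_fold=50.0):
    """True if ES-cell expression is at least min_fold times the somatic level."""
    return fold_enrichment(es_level, somatic_level) >= min_fold


# Hypothetical expression values in arbitrary units.
print(fold_enrichment(500.0, 5.0))        # 100.0
print(meets_fold_criterion(500.0, 5.0))   # True
print(percent_reduction(200.0, 40.0))     # 80.0
```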
[0063] In some embodiments, mRNA, e.g., in vitro transcribed mRNA, comprises an open reading frame flanked by a 5' untranslated region and a 3' untranslated region that enhance translation of said open reading frame, e.g., a 5' untranslated region that comprises a strong Kozak translation initiation signal, and/or a 3' untranslated region that comprises an alpha-globin 3' untranslated region.
[0064] In some embodiments, mRNA, e.g., in vitro transcribed mRNA comprises a polyA tail. Methods of adding a polyA tail to mRNA are known in the art, e.g., enzymatic addition via polyA polymerase or ligation with a suitable ligase.
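The mRNA layout described in the preceding paragraphs (a 5' untranslated region with a Kozak signal, an open reading frame, a 3' untranslated region and a polyA tail) can be sketched as a simple string assembly in Python. This is a hedged illustration only: the function name, the UTR placeholder sequences and the default tail length are assumptions, "GCCACC" is used as a commonly cited Kozak context placed immediately before the start codon, and the 5' cap is not modeled.

```python
KOZAK = "GCCACC"  # commonly cited Kozak context placed directly before AUG
STOP_CODONS = {"UAA", "UAG", "UGA"}


def build_mrna(orf, five_utr, three_utr, poly_a_length=120):
    """Assemble 5' UTR + Kozak + ORF + 3' UTR + polyA as an RNA string.

    The ORF may be given in DNA or RNA letters; it must start with a start
    codon, end with a stop codon, and have a length divisible by three.
    """
    orf = orf.upper().replace("T", "U")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if not orf.startswith("AUG"):
        raise ValueError("ORF must begin with the AUG start codon")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must end with a stop codon")
    return five_utr + KOZAK + orf + three_utr + "A" * poly_a_length


# Hypothetical placeholder sequences; real UTRs would be much longer.
example_mrna = build_mrna(
    orf="ATGGCCTAA",
    five_utr="GGGAAA",
    three_utr="CCCUUU",
    poly_a_length=10,
)
print(example_mrna)
```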
[0065] The methods provided herein can also be used to mutate or modulate one or more nucleic acids in stem cells that are present in cell compositions such as embryos, zygotes, fetuses, and post-natal mammals. In some embodiments, a stem cell (e.g., an ES or iPS cell), zygote, embryo, or post-natal mammal is already genetically modified (already harbors one or more genetic modifications) prior to being subjected to the methods described herein. For example, the stem cell (e.g., an ES or iPS cell), zygote, embryo, or post-natal mammal may be one into which an exogenous nucleic acid has been introduced by a process involving the hand of man (or may be descended at least in part from a cell or organism into which an exogenous nucleic acid has been introduced by a process involving the hand of man). The nucleic acid may for example contain a sequence that is exogenous to the cell, it may contain native sequences (i.e., sequences naturally found in the cells) but in a non-naturally occurring arrangement (e.g., a coding region linked to a promoter from a different gene), or altered versions of native sequences, etc. In some embodiments, a stem cell (e.g., an ES or iPS cell), zygote, embryo, or post-natal mammal is not already genetically modified (does not already harbor one or more genetic modifications) prior to being subjected to the methods described herein.
[0066] The stem cell, zygote, embryo, or post-natal mammal can be of vertebrate (e.g., mammalian) origin. In some aspects, the vertebrates are mammals or avians. Particular examples include primate (e.g., human), rodent (e.g., mouse, rat), canine, feline, bovine, equine, caprine, porcine, or avian (e.g., chickens, ducks, geese, turkeys) stem cells, zygotes, embryos, or post-natal mammals. In some embodiments, the stem cell, zygote, embryo, or post-natal mammal is isolated (e.g., an isolated stem cell; an isolated zygote; an isolated embryo). In some embodiments, a mouse stem cell, mouse zygote, mouse embryo, or mouse post-natal mammal is used. In some embodiments, a rat stem cell, rat zygote, rat embryo, or rat post-natal mammal is used. In some embodiments, a human stem cell, human zygote or human embryo is used.
[0067] In some aspects, the invention is directed to a method of producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences comprising introducing into a zygote or an embryo (i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to a portion of each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; and (ii) a Cas nucleic acid sequence or a variant thereof that encodes a Cas protein having nuclease activity. The zygote or the embryo is maintained under conditions in which RNA hybridizes to the portion of each of the one or more target nucleic acid sequences, and the Cas protein cleaves each of the one or more target nucleic acid sequences upon hybridization of the RNA to the portion of the target nucleic acid sequence, thereby producing an embryo having one or more mutated nucleic acid sequences. The embryo having one or more mutated nucleic acid sequences may be transferred into a foster nonhuman mammalian mother. The foster nonhuman mammalian mother is maintained under conditions in which one or more offspring carrying the one or more mutated nucleic acid sequences are produced, thereby producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences.
[0068] As will be apparent to those of skill in the art, the nonhuman mammals can also be produced using methods described herein and/or with conventional methods, see for example, U.S. Published Application No. 20110302665. A method of producing a non-human mammalian embryo can comprise injecting non-human mammalian ES cells (e.g., iPSCs) genetically modified according to an inventive method of the present invention into non-human tetraploid blastocysts and maintaining said resulting tetraploid blastocysts under conditions that result in formation of embryos, thereby producing a non-human mammalian embryo. In some embodiments, said non-human mammalian cells are mouse cells and said non-human mammalian embryo is a mouse. In some embodiments, said mouse cells are mutant mouse cells and are injected into said non-human tetraploid blastocysts by microinjection. In some embodiments laser-assisted micromanipulation or piezo injection is used. In some embodiments, a non-human mammalian embryo comprises a mouse embryo.
[0069] Another example of such conventional techniques is two-step cloning, which involves introducing embryonic stem (ES) and/or induced pluripotent stem (iPS) cells comprising the one or more mutations into a blastocyst (e.g., a tetraploid blastocyst) and maintaining the blastocyst under conditions that result in development of an embryo. The embryo is then transferred (impregnated) into an appropriate foster mother, such as a pseudopregnant female (e.g., of the same species as the embryo). The foster mother is then maintained under conditions that result in development of live offspring that harbor the one or more mutations.
[0070] Another example is the use of the tetraploid complementation assay in which cells of two mammalian embryos are combined to form a new embryo (Tam and Rossant, Development, 130:6156-6163 (2003)). The assay involves producing a tetraploid cell in which every chromosome exists fourfold. This is done by taking an embryo at the two-cell stage and fusing the two cells by applying an electrical current. The resulting tetraploid cell continues to divide, and all daughter cells will also be tetraploid. Such a tetraploid embryo develops normally to the blastocyst stage and will implant in the wall of the uterus. In the tetraploid complementation assay, a tetraploid embryo (either at the morula or blastocyst stage) is combined with normal diploid embryonic stem (ES) cells from a different organism. The embryo develops normally; the fetus is exclusively derived from the ES cells, while the extraembryonic tissues are exclusively derived from the tetraploid cells.
[0071] Another conventional method used to produce nonhuman mammals includes pronuclear microinjection. DNA is introduced directly into the male pronucleus of a nonhuman mammal egg just after fertilization. Similar to the two-step cloning described above, the egg is implanted into a pseudopregnant female. Offspring are screened for the integrated transgene. Heterozygous offspring can be subsequently mated to generate homozygous animals.
[0072] A variety of nonhuman mammals can be used in the methods described herein. For example, the nonhuman mammal can be a rodent (e.g., mouse, rat, guinea pig, hamster), a nonhuman primate, a canine, a feline, a bovine, an equine, a porcine or a caprine.
[0073] In some aspects, various mouse strains and mouse models of human disease are used in conjunction with the methods of producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences described herein. One of ordinary skill in the art appreciates the thousands of commercially and non-commercially available strains of laboratory mice for modeling human disease. Mouse models exist for diseases such as cancer, cardiovascular disease, autoimmune diseases and disorders, inflammatory diseases, diabetes (type 1 and 2), neurological diseases, and other diseases. Examples of commercially available research strains include, but are not limited to, 11BHSD2 Mouse, GSK3B Mouse, 129-E Mouse, HSD11B1 Mouse, AK Mouse, Immortomouse®, Athymic Nude Mouse, LCAT Mouse, B6 Albino Mouse, Lox-1 Mouse, B6C3F1 Mouse, Ly5 Mouse, B6D2F1 (BDF1) Mouse, MMP9 Mouse, BALB/c Mouse, NIH-III Nude Mouse, BALB/c Nude Mouse, NOD Mouse, NOD SCID Mouse, Black Swiss Mouse, NSE-p25 Mouse, C3H Mouse, NU/NU Nude Mouse, C57BL/6-E Mouse, PCSK9 Mouse, C57BL/6N Mouse, PGP Mouse (P-glycoprotein Deficient), CB6F1 Mouse, repTOP™ ERE-Luc Mouse, CD-1® Mouse, repTOP™ mitoIRE Mouse, CD-1® Nude Mouse, repTOP™ PPRE-Luc Mouse, CD1-E Mouse, Rip-HAT Mouse, CD2F1 (CDF1) Mouse, SCID Hairless Congenic (SHC™) Mouse, CF-1™ Mouse, SCID Hairless Outbred (SHO™) Mouse, DBA/2 Mouse, SJL-E Mouse, Fox Chase CB17™ Mouse, SKH1-E Mouse, Fox Chase SCID® Beige Mouse, Swiss Webster (CFW®) Mouse, Fox Chase SCID® Mouse, TARGATT™ Mouse, FVB Mouse, THE POUND MOUSE™, and GLUT 4 Mouse. Other mouse strains include BALB/c, C57BL/6, C57BL/10, C3H, ICR, CBA, A/J, NOD, DBA/1, DBA/2, MOLD, 129, HRS, MRL, NZB, NIH, AKR, SJL, NZW, CAST, KK, SENCAR, C57L, SAMR1, SAMP1, C57BR, and NZO.
[0074] In some aspects, the method of producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences further comprises mating one or more commercially and/or non-commercially available nonhuman mammals with the nonhuman mammal carrying mutations in one or more target nucleic acid sequences produced by the methods described herein. The invention is also directed to nonhuman mammals produced by the methods described herein.
[0075] In the methods provided herein, one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to a portion of each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein are introduced into the stem cell, zygote and/or embryo, etc. In some embodiments, the RNA sequence is referred to as guide RNA (gRNA) or single guide RNA (sgRNA).
[0076] In some aspects, a single RNA sequence can be complementary to one or more (e.g., all) of the target nucleic acid sequences that are being modulated or mutated. In one aspect, a single RNA is complementary to a single target nucleic acid sequence. In a particular aspect in which two or more target nucleic acid sequences are to be modulated or mutated, multiple (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more) RNA sequences are introduced wherein each RNA sequence is complementary to (specific for) one target nucleic acid sequence. In some aspects, two or more, three or more, four or more, five or more, or six or more RNA sequences are complementary to (specific for) different parts of the same target sequence. In one aspect, two or more RNA sequences bind to different sequences of the same region (e.g., promoter) of DNA (see e.g., Figure 30A). In some aspects, a single RNA sequence is complementary to at least two or more (e.g., all) of the target nucleic acid sequences. It will also be apparent to those of skill in the art that the portion of the RNA sequence that is complementary to one or more of the target nucleic acid sequences and the portion of the RNA sequence that binds to Cas protein can be introduced as a single sequence or as 2 (or more) separate sequences into a cell, zygote, embryo or nonhuman animal. In some embodiments the sequence that binds to Cas protein comprises a stem-loop.
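The relationships described in this paragraph, in which one RNA may be complementary to a single target or to several targets, can be illustrated with a minimal Python sketch that checks which double-stranded targets contain a site perfectly complementary to a given RNA. All sequences, target names and function names are hypothetical. The sketch assumes exact matching only, and it does not model the Cas-binding portion of the RNA or any other requirement for cleavage.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def find_targets(guide_rna, targets):
    """Return names of targets with a site perfectly complementary to the RNA.

    Each target is given as one strand of double-stranded DNA. The RNA can
    pair with either strand, so both the DNA-letter copy of the RNA and its
    reverse complement are searched for in the given strand.
    """
    site = guide_rna.upper().replace("U", "T")
    hits = []
    for name, sequence in targets.items():
        sequence = sequence.upper()
        if site in sequence or reverse_complement(site) in sequence:
            hits.append(name)
    return hits


# Hypothetical target sequences sharing one site (on opposite strands).
shared_site = "GACCGTAGGCTAACGT"
targets = {
    "gene_A": "TT" + shared_site + "AAGG",
    "gene_B": "CC" + reverse_complement(shared_site) + "TT",
    "gene_C": "ATATATATCGCGCGATATAT",
}

print(find_targets("GACCGUAGGCUAACGU", targets))  # ['gene_A', 'gene_B']
print(find_targets("AUAUCGCGCG", targets))        # ['gene_C']
```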
[0077] In some embodiments, the RNA sequence used to modify gene expression in a nonhuman mammal is a naturally occurring RNA sequence, a modified RNA sequence (e.g., an RNA sequence comprising one or more modified bases), a synthetic RNA sequence, or a combination thereof. As used herein a "modified RNA" is an RNA comprising one or more modifications (e.g., RNA comprising one or more non-standard and/or non-naturally occurring bases) to the RNA sequence (e.g., modifications to the backbone and/or sugar). Methods of modifying bases of RNA are well known in the art. Examples of such modified bases include those contained in the nucleosides 5-methylcytidine (5mC), pseudouridine (Ψ), 5-methyluridine, 2'-O-methyluridine, 2-thiouridine, N-6 methyladenosine, hypoxanthine, dihydrouridine (D), inosine (I), and 7-methylguanosine (m7G). It should be noted that any number of bases in an RNA sequence can be substituted in various embodiments. It should further be understood that combinations of different modifications may be used.
[0078] In some aspects, the RNA sequence is a morpholino. Morpholinos are typically synthetic molecules of about 25 bases in length that bind to complementary sequences of RNA by standard nucleic acid base-pairing. Morpholinos have standard nucleic acid bases, but those bases are bound to morpholine rings instead of deoxyribose rings and are linked through phosphorodiamidate groups instead of phosphates. Morpholinos do not degrade their target RNA molecules, unlike many antisense structural types (e.g., phosphorothioates, siRNA). Instead, morpholinos act by steric blocking and bind to a target sequence within an RNA and block molecules that might otherwise interact with the RNA.
[0079] Each RNA sequence can vary in length from about 8 base pairs (bp) to about 200 bp. In some embodiments, the RNA sequence can be about 9 to about 190 bp; about 10 to about 150 bp; about 15 to about 120 bp; about 20 to about 100 bp; about 30 to about 90 bp; about 40 to about 80 bp; about 50 to about 70 bp in length.
[0080] The portion of each target nucleic acid sequence to which each RNA sequence is complementary can also vary in size. In particular aspects, the portion of each target nucleic acid sequence to which the RNA is complementary can be about 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 100 nucleotides (contiguous nucleotides) in length. In some embodiments, each RNA sequence can be at least about 70%, 75%, 80%, 85%, 90%, 95%, 100%, etc. identical or similar to the portion of each target nucleic acid sequence. In some embodiments, each RNA sequence is completely or partially identical or similar to each target nucleic acid sequence. For example, each RNA sequence can differ from perfect complementarity to the portion of the target sequence by about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, etc. nucleotides. In some embodiments, one or more RNA sequences are perfectly complementary (100%) across at least about 10 to about 25 (e.g., about 20) nucleotides of the target nucleic acid.
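The notions of differing from perfect complementarity by a number of nucleotides, and of percent complementarity across a stretch of the target, can be made concrete with a short Python sketch. The sequences and function names are hypothetical. The sketch assumes Watson-Crick pairing only (G-U wobble pairs are counted as mismatches), equal-length ungapped sequences, and antiparallel hybridization of the RNA to the given DNA strand.

```python
# Watson-Crick DNA partner for each RNA base (wobble pairing is not modeled).
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def perfect_target(guide_rna):
    """DNA strand (5' to 3') that is perfectly complementary to the RNA."""
    return "".join(RNA_TO_DNA_PARTNER[base] for base in reversed(guide_rna.upper()))


def count_mismatches(guide_rna, target_strand):
    """Count positions at which the RNA does not pair with the DNA strand.

    Both sequences are written 5' to 3'; pairing is antiparallel, so the
    first RNA base pairs with the last base of the DNA strand.
    """
    if len(guide_rna) != len(target_strand):
        raise ValueError("sequences must have the same length")
    paired = reversed(target_strand.upper())
    return sum(
        1
        for rna_base, dna_base in zip(guide_rna.upper(), paired)
        if RNA_TO_DNA_PARTNER[rna_base] != dna_base
    )


def percent_complementarity(guide_rna, target_strand):
    """Percentage of positions that form Watson-Crick pairs."""
    length = len(guide_rna)
    return 100.0 * (length - count_mismatches(guide_rna, target_strand)) / length


# Hypothetical 20-nucleotide RNA and targets.
guide = "GACCGUAGGCUAACGUAGCU"
exact = perfect_target(guide)            # "AGCTACGTTAGCCTACGGTC"
variant = "C" + exact[1:-1] + "A"        # two substituted positions

print(count_mismatches(guide, exact))           # 0
print(count_mismatches(guide, variant))         # 2
print(percent_complementarity(guide, variant))  # 90.0
```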
[0081] As will be apparent to those of ordinary skill in the art, the one or more RNA sequences can further comprise one or more expression control elements. For example, in some embodiments the RNA sequences comprise a promoter suitable to direct expression in cells, wherein the portion of the RNA sequence is operably linked to the expression control element(s). The promoter can be a viral promoter (e.g., a CMV promoter) or a mammalian promoter (e.g., a PGK promoter). The RNA sequence can comprise other genetic elements, e.g., to enhance expression or stability of a transcript. In some embodiments the additional coding region encodes a selectable marker (e.g., a reporter gene such as green fluorescent protein (GFP)).
[0082] As described herein, the one or more RNA sequences also comprise a (one or more) binding site for a (one or more) CRISPR associated (Cas) protein, and, upon hybridization of the one or more RNA sequences to the one or more target sequences, a (one or more) Cas protein or variant thereof cleaves or nicks each of the target nucleic acid sequences. In a particular aspect, upon hybridization of the one or more RNA sequences to the one or more target nucleic acid sequences, the Cas protein or variant thereof binds to the one or more RNA sequences and cleaves the one or more target nucleic acid sequences. Bacteria and Archaea have evolved an RNA-based adaptive immune system that uses CRISPR (clustered regularly interspaced short palindromic repeat) and Cas (CRISPR-associated) proteins to detect and destroy invading viruses and plasmids (Horvath and Barrangou, Science, 327(5962): 167-170 (2010); Wiedenheft et al, Nature, 482(7385):331-338 (2012)). Cas proteins, CRISPR RNAs (crRNAs) and trans-activating crRNA (tracrRNA) form ribonucleoprotein complexes, which target and degrade foreign nucleic acids, guided by crRNAs (Gasiunas et al, Proc. Natl. Acad. Sci, 109(39):E2579-86 (2012); Jinek et al, Science, 337:816-821 (2012)).
[0083] In one aspect, the method further comprises introducing one or more Cas nucleic acid or variant thereof into the cell, embryo, zygote, or non-human mammal. In some aspects, a Cas protein or variant thereof is introduced into the cell, embryo, zygote, or non-human mammal. In some aspects, a cell, e.g., stem cell (ES or iPS cell), zygote, embryo, or animal may already harbor a nucleic acid that encodes Cas (may be constitutive or inducible) and/or may already contain Cas protein. For example, in some embodiments a cell, e.g., stem cell (ES or iPS cell), zygote, embryo, or animal, may be descended from a cell or organism into which a nucleic acid encoding a Cas protein has been introduced by a process involving the hand of man.
[0084] A variety of CRISPR associated (Cas) genes or proteins which are known in the art can be used in the methods of the invention and the choice of Cas protein will depend upon the particular conditions of the method (e.g., www.ncbi.nlm.nih.gov/gene/?term=cas9). Specific examples of Cas proteins include Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9 and Cas10. In a particular aspect, the Cas nucleic acid or protein used in the methods is Cas9. In some embodiments a Cas protein, e.g., a Cas9 protein, may be from any of a variety of prokaryotic species. In some embodiments a particular Cas protein, e.g., a particular Cas9 protein, may be selected to recognize a particular protospacer-adjacent motif (PAM) sequence. In certain embodiments a Cas protein, e.g., a Cas9 protein, may be obtained from a bacterium or archaeon or synthesized using known methods. In certain embodiments, a Cas protein may be from a gram positive bacterium or a gram negative bacterium. In certain embodiments, a Cas protein may be from a Streptococcus (e.g., a S. pyogenes, a S. thermophilus), a Cryptococcus, a Corynebacterium, a Haemophilus, a Eubacterium, a Pasteurella, a Prevotella, a Veillonella, or a Marinobacter. In some embodiments nucleic acids encoding two or more different Cas proteins, or two or more Cas proteins, may be introduced into a cell, zygote, embryo, or animal, e.g., to allow for recognition and modification of sites comprising the same, similar or different PAM motifs.
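As an illustration of selecting sites by PAM, the following sketch scans one DNA strand for candidate protospacers immediately followed by a given PAM. The default motif "NGG" is the PAM commonly reported for S. pyogenes Cas9 and is an assumption here, as are the 20-nucleotide protospacer length and the function name; the text itself does not fix any of these values.

```python
import re

# Subset of IUPAC nucleotide codes, expressed as regular-expression classes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "Y": "[CT]"}


def find_protospacers(dna: str, pam: str = "NGG", spacer_len: int = 20):
    """Return (start, protospacer, pam_sequence) for every site on this strand.

    A site is `spacer_len` bases followed directly by a match to `pam`.
    A lookahead is used so that overlapping sites are all reported.
    Only the given strand is scanned; the reverse complement is not.
    """
    dna = dna.upper()
    pam_re = "".join(IUPAC[base] for base in pam.upper())
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})({pam_re}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    # Short spacer length used only to keep the demonstration readable.
    for hit in find_protospacers("ACGTAGGTGG", pam="NGG", spacer_len=4):
        print(hit)
```

Passing a different `pam` string models a Cas protein chosen to recognize a different motif, which is the selection scenario the paragraph describes.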
[0085] The Cas protein can cleave one strand or both strands (e.g., of a double stranded target nucleic acid), or alternatively, nick one strand or both strands (e.g., of a double stranded target nucleic acid). In some embodiments a Cas9 nickase may be generated by inactivating one or more of the Cas9 nuclease domains. In some embodiments, an amino acid substitution at residue 10 in the RuvC I domain of Cas9 converts the nuclease into a DNA nickase. For example, the aspartate at amino acid residue 10 can be substituted with alanine (Cong et al, Science, 339:819-823). Other amino acid mutations that create a catalytically inactive Cas9 protein include mutations at residue 10 and/or residue 840. Mutations at both residue 10 and residue 840 can create a catalytically inactive Cas9 protein, sometimes referred to herein as dCas9. For example, a D10A and H840A Cas9 mutant is catalytically inactive.
[0086] As shown herein, fusions of a catalytically inactive (D10A; H840A) Cas9 protein (dCas9) tethered with all or a portion of (e.g., biologically active portion of) an (one or more) effector domain create chimeric proteins that can be guided to specific DNA sites by one or more RNA sequences (sgRNA) to modulate activity and/or expression of one or more target nucleic acid sequences (e.g., exert certain effects on transcription or chromatin organization, or bring specific kinds of molecules into specific DNA loci, or act as a sensor of local histone or DNA state). As used herein, a "biologically active portion of an effector domain" is a portion that maintains the function (e.g., completely, partially, minimally) of an effector domain (e.g., a "minimal" or "core" domain). Specifically, shown herein is that fusion of the Cas9 (e.g., dCas9) with all or a portion of one or more effector domains (e.g., transcriptional activation domains) created a chimeric protein. In one aspect, fusion of a dCas9 with one or more effector domains created a chimeric protein dCas9TA. In some aspects, the one or more effector domains are the same (e.g., VP16 transcriptional activation domains). In other aspects, the one or more effector (e.g., transcriptional activation) domains are different. In some aspects, dCas9TA is guided to specific nucleic acid sites by one or more RNA (e.g., sgRNA). In some aspects, dCas9TA is guided to specific nucleic acid sites by RNA (e.g., sgRNA) to modulate gene expression. In some aspects, all or a portion of one or more VP16 effector domains are fused with Cas9 (e.g., dCas9). In other aspects, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more VP16 effector domains (all or a biologically active portion) are fused with dCas9. In some aspects, a chimeric protein comprising a fusion of a catalytically inactive Cas to all or a portion of one or more effector domains is referred to herein as "CRISPRzyme" or "CRISPR-on".
[0087] In one aspect, fusion of Cas9 with all or a portion of one or more effector domains comprises one or more linkers. As used herein, a "linker" is something that connects or fuses two or more effector domains (e.g., see Hermanson, Bioconjugate Techniques, 2nd Edition, which is hereby incorporated by reference in its entirety). As will be appreciated by one of ordinary skill in the art, a variety of linkers can be used. In one aspect, a linker comprises one or more amino acids. In some aspects, a linker comprises 2 or more amino acids. In one aspect, a linker comprises the amino acid sequence GS. In some aspects, fusion of Cas9 (e.g., dCas9) with two or more effector domains (e.g., VP16 core domain such as DALDDFDLDML) comprises one or more interspersed linkers (e.g., GS linkers) between the domains. In some aspects, dCas9 is fused with 3 VP16 core domains with interspersed linkers, referred to herein as dCas9VP48. In other aspects, dCas9 is fused with 4 VP16 core domains with interspersed GS linkers between the core domains, referred to herein as dCas9VP64 (SEQ ID NO: 14). In other aspects, dCas9 is fused with 6 VP16 core domains with interspersed GS linkers between the core domains, referred to herein as dCas9VP96 (SEQ ID NO: 15). In other aspects, dCas9 is fused with 10 VP16 core domains with interspersed GS linkers between the core domains, referred to herein as dCas9VP160 (SEQ ID NO: 16).
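The tandem arrangement of VP16 core domains separated by GS linkers can be sketched as a simple string construction. The core peptide and the GS linker are taken from the text; the helper names are hypothetical, and the naming rule (16 times the copy number) is inferred from the copy numbers listed for VP48, VP64, VP96 and VP160 elsewhere in this description. The output is only the activation-domain portion and is not asserted to match any SEQ ID NO.

```python
VP16_CORE = "DALDDFDLDML"  # VP16 core domain quoted in the text
GS_LINKER = "GS"           # linker sequence quoted in the text


def tandem_activation_domain(copies: int,
                             core: str = VP16_CORE,
                             linker: str = GS_LINKER) -> str:
    """Concatenate `copies` core domains with a linker between adjacent cores."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return linker.join([core] * copies)


def vp_name(copies: int) -> str:
    """Name implied by the copy number (assumed rule: VP followed by 16 x copies)."""
    return f"VP{16 * copies}"


if __name__ == "__main__":
    for n in (3, 4, 6, 10):
        domain = tandem_activation_domain(n)
        print(vp_name(n), len(domain), domain)
```

Because the linker is placed only between cores, n copies give a peptide of 11n + 2(n - 1) residues; any linker joining the array to dCas9 itself would be added separately.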
[0088] Accordingly, in some aspects, the invention is directed to a method of modulating the expression and/or activity of one or more target nucleic acid sequences in a cell or zygote comprising introducing into the cell or zygote (i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; (ii) a Cas nucleic acid sequence or a variant thereof that encodes the Cas protein that targets but does not cleave the target nucleic acid sequence; and (iii) an (one or more) effector domain. The method further comprises maintaining the cell or zygote under conditions in which the one or more RNA sequences hybridize to the portion of each of the one or more target nucleic acid sequences. The Cas protein binds to each of the one or more RNA sequences and the effector domain modulates the expression and/or activity of the target nucleic acid, thereby modulating the expression and/or activity of a target nucleic acid sequence. As with some aspects of the invention, one or more RNA sequences, Cas nucleic acid sequences and effector domains can be introduced into a cell, zygote, embryo or non-human mammal.
[0089] In some aspects, the method of modulating the expression and/or activation of one or more target nucleic acids in a cell is used to reprogram a cell's potency. Cells can be reprogrammed, e.g., by the methods described herein. In one aspect, the invention is directed to a method of modulating the expression and/or activity of one or more target nucleic acid sequences in a cell wherein the cell or cell's potency (e.g., totipotency, pluripotency, multipotency, oligopotency and unipotency) is reprogrammed (e.g., a differentiated cell; a non-differentiated cell). In one aspect, the method results in differentiation of a cell (e.g., a totipotent or pluripotent cell differentiates into a unipotent cell or differentiated cell). In another aspect, the method results in dedifferentiation of a cell (e.g., a differentiated cell reverts to an earlier developmental stage). For example, the invention is directed to reprogramming a differentiated cell to a totipotent, pluripotent, or multipotent state. In other aspects the method results in transdifferentiation of the cell (e.g., a fibroblast is reprogrammed to a fat cell or a fat cell is reprogrammed to a fibroblast). In one aspect, the one or more target nucleic acid sequences in a cell are overexpressed, causing the cell to be reprogrammed. In another aspect, one or more transcription factors are modulated, altering cell potency or dedifferentiation. In another aspect, one or more transcription factors such as Oct4, Sox2, Klf4, and c-Myc are modulated (e.g., overexpressed) in a cell (Takahashi, K. & Yamanaka, S. Cell 126, 663-676, 2006).
[0090] In some aspects, the invention is directed to a method of modulating one or more target nucleic acid sequences comprising simultaneous activation of the one or more target nucleic acid sequences. In another aspect, the method of modulating one or more target nucleic acid sequences comprises adjusting the level of modulation of one or more target nucleic acid sequences by adjusting the amount (e.g. grams, milligrams, micrograms, nanograms, moles, millimoles, micromoles, nanomoles, stoichiometric amount, molar ratio) of the one or more ribonucleic acid sequences introduced into the cell or zygote (Figure 30B). In some aspects, the level of modulation of one target nucleic acid sequence is the same or different compared to the level of modulation of another target nucleic acid sequence in the same cell or zygote (Figure 25B). In one aspect, multiple target nucleic acid sequences are modulated (e.g. multiplexed activation).
[0091] In some aspects the invention is directed to (e.g., a composition comprising, consisting essentially of, consisting of) a nucleic acid sequence that encodes a fusion protein (chimeric protein) comprising all or a portion of a Cas protein fused to all or a portion of an effector domain. In some aspects, the invention is directed to (e.g., a composition comprising, consisting essentially of, consisting of) a fusion protein comprising all or a portion of a Cas protein fused to all or a portion of an effector domain. In some aspects, all or a portion of the Cas protein has endonuclease activity (e.g., can cleave and/or nick a target nucleic acid sequence) and/or targeting activity. In some aspects all or a portion of the Cas protein targets but does not cleave a nucleic acid sequence. In some aspects, the Cas protein can be fused to the N-terminus or C-terminus of the effector domain. In some aspects, the portion of the effector domain modulates the expression and/or activation of a target nucleic acid sequence (e.g., gene).
[0092] In some aspects, the nucleic acid sequence encoding the fusion protein and/or the fusion protein are isolated. An "isolated," "substantially pure," or "substantially pure and isolated" nucleic acid sequence, as used herein, is one that is separated from nucleic acids that normally flank the gene or nucleotide sequence (as in genomic sequences) and/or has been completely or partially purified from other transcribed sequences (e.g., as in an RNA or cDNA library). For example, an isolated nucleic acid of the invention may be substantially isolated with respect to the complex cellular milieu in which it naturally occurs, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. An "isolated," "substantially pure," or "substantially pure and isolated" protein (e.g., chimeric protein; fusion protein), as used herein, is one that is separated from or substantially isolated with respect to the complex cellular milieu in which it naturally occurs, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system, or reagent mix. In other circumstances, the material may be purified to essential homogeneity, for example, as determined by agarose gel electrophoresis or column chromatography such as HPLC. Preferably, an isolated nucleic acid molecule comprises at least about 50%, 80%, 90%, 95%, 98% or 99% (on a molar basis) of all macromolecular species present.
[0093] "Modulate" is used consistently with its use in the art, i.e., meaning to cause or facilitate a qualitative or quantitative change, alteration, or modification in a process, pathway, or phenomenon of interest. Without limitation, such change may be an increase, decrease, or change in relative strength or activity of different components or branches of the process, pathway, or phenomenon. A "modulator" is an agent that causes or facilitates a qualitative or quantitative change, alteration, or modification in a process, pathway, or phenomenon of interest.
[0094] In some aspects, "modulating" ("modulates"; "modulation") the expression and/or activity of a target nucleic acid sequence refers to any of a variety of alterations to the expression and/or activation of the one or more target nucleic acid sequences. For example, the method of modulating the expression and/or activity of the one or more target nucleic acid sequences includes activating, increasing, decreasing, coactivating, regulating, repressing, organizing, remodeling, modifying, and/or fusing the expression and/or activity of one or more target nucleic acid sequences.
[0095] Thus, the one or more RNA sequences can be complementary to any of a variety of all or a portion of a target nucleic acid sequence that is to be modulated. In some aspects of the invention, the method of modulating one or more target nucleic acid sequences comprises introducing one or more RNA sequences that are complementary to all or a portion of a (one or more) regulatory region, an open reading frame (ORF; a splicing factor), an intronic sequence, a chromosomal region (e.g., telomere, centromere) of the one or more target nucleic acid sequences into a cell. In some aspects, the target nucleic acid sequence is all or a portion of a plasmid or linear double stranded DNA (dsDNA). In some aspects, the regulatory region targeted by the one or more target nucleic acid sequences is a promoter, enhancer, and/or operator region. In some aspects, all or a portion of the regulatory region is targeted by the one or more target nucleic acid sequences. In some aspects, the regulatory region targeted by the one or more target nucleic acid sequences is exactly or within about 25 bases, 50 bases, 100 bases, 200 bases, 300 bases, 400 bases, 500 bases, 600 bases, 700 bases, 800 bases, 900 bases, 1000 bases, 1500 bases, 2000 bases, or more upstream to the one or more genes (e.g., endogenous genes; exogenous genes) or a (one or more) transcription start site (TSS). In some aspects, the one or more target nucleic acid sequences is exactly or within about 25 bases, 50 bases, 100 bases, 200 bases, 300 bases, 400 bases, 500 bases, 600 bases, 700 bases, 800 bases, 900 bases, 1000 bases, 1500 bases, 2000 bases, or more downstream to the one or more genes (e.g., endogenous genes; exogenous genes) or a TSS. As will be appreciated by one of ordinary skill in the art, the regulatory region targeted by one or more target nucleic acid sequences can be entirely or partially found at or about the 5' end of the gene (e.g., endogenous or exogenous) or a TSS. The 5' end of a gene can include untranscribed (flanking) regions (e.g., all or a portion of a promoter) and a portion of the transcribed region.
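Whether a candidate site falls within a stated distance upstream or downstream of a transcription start site can be checked with a strand-aware offset. The sketch below assumes integer genomic coordinates on a single chromosome and a "+"/"-" strand convention; the function names are hypothetical.

```python
def offset_from_tss(site_pos: int, tss_pos: int, strand: str = "+") -> int:
    """Signed distance of a site from the TSS in the direction of transcription.

    Negative values are upstream of the TSS, positive values are downstream.
    For a gene on the "-" strand, larger coordinates lie upstream.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    delta = site_pos - tss_pos
    return delta if strand == "+" else -delta


def within_upstream_window(site_pos: int, tss_pos: int,
                           window: int, strand: str = "+") -> bool:
    """True if the site lies at the TSS or no more than `window` bases upstream."""
    return -window <= offset_from_tss(site_pos, tss_pos, strand) <= 0


def within_downstream_window(site_pos: int, tss_pos: int,
                             window: int, strand: str = "+") -> bool:
    """True if the site lies at the TSS or no more than `window` bases downstream."""
    return 0 <= offset_from_tss(site_pos, tss_pos, strand) <= window


if __name__ == "__main__":
    print(offset_from_tss(900, 1000, "+"))           # 100 bases upstream
    print(within_upstream_window(900, 1000, 200))    # inside a 200-base window
    print(within_upstream_window(1100, 1000, 200, "-"))
```

The window argument corresponds to the distances enumerated above (25, 50, 100 bases, and so on); filtering candidate target sites is then a matter of applying one of the two predicates.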
[0096] As used herein, a "regulatory region" is any segment of a nucleic acid sequence capable of modulating (e.g., increasing, decreasing) expression and/or activity of one or more target nucleic acid sequences (e.g., genes). Examples of regulatory regions include a promoter, enhancer, telomere, locus control region, insulator, centromere, repeat sequence, transposable element, synthetic sequence, and operator. Specific examples of regulatory regions include CAAT box, CCAAT box, Pribnow box, TATA box, SECIS element, polyadenylation signals, A-box, Z-box, C-box, E-box, and/or G-box.
[0097] The method of modulating one or more target nucleic acid sequences comprises introducing a Cas nucleic acid sequence or a variant thereof that encodes the Cas protein that targets but does not cleave the target nucleic acid sequence into the cell. In some aspects, a Cas protein or variant thereof is introduced into the cell. In some aspects, the Cas nucleic acid sequence encodes a Cas protein that does not have endonuclease activity. In some aspects, the Cas nucleic acid sequence encodes a Cas protein that does not have nickase activity. In some aspects, the Cas nucleic acid sequence encodes a Cas protein that does not have endonuclease and nickase activity. In some aspects, the Cas nucleic acid sequence encodes a Cas protein that does not have enzymatic activity or is catalytically inactive.
[0098] In some aspects of the invention, the method of modulating one or more target nucleic acid sequences comprises introducing a Cas nucleic acid sequence or a variant thereof that encodes a Cas9 protein. In some aspects, the Cas nucleic acid sequence encodes a Cas9 protein that comprises one or more mutations. In some aspects, the Cas nucleic acid sequence encodes a Cas9 protein that comprises a mutation at amino acid position 10, 840, or a combination thereof. In some aspects, the Cas nucleic acid sequence encodes a Cas9 protein wherein the amino acid at position 10 is mutated from aspartate (D) to alanine (A) and the amino acid at position 840 is mutated from histidine (H) to alanine (A).
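The D10A and H840A substitutions can be expressed as a small routine that applies point mutations to a protein sequence after verifying the wild-type residue at each position. The function name is hypothetical, and the sequence used in the demonstration is a toy placeholder constructed only to carry D at position 10 and H at position 840; it is not a Cas9 sequence.

```python
def apply_substitutions(protein: str, substitutions: list[str]) -> str:
    """Apply substitutions written as <wild-type><1-based position><new>, e.g. 'D10A'."""
    residues = list(protein)
    for sub in substitutions:
        wild_type, position, new = sub[0], int(sub[1:-1]), sub[-1]
        if not 1 <= position <= len(residues):
            raise ValueError(f"{sub}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{sub}: expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    # Toy placeholder (NOT Cas9): D at position 10, H at position 840.
    toy = "M" + "K" * 8 + "D" + "G" * 829 + "H" + "S" * 10
    mutant = apply_substitutions(toy, ["D10A", "H840A"])
    print(toy[9], toy[839], "->", mutant[9], mutant[839])
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which matter here because the two positions are defined relative to a specific Cas9 reference sequence.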
[0099] The method of modulating one or more target nucleic acid sequences also comprises introducing one or more effector domains. As used herein an "effector domain" is a molecule (e.g., protein) that modulates the expression and/or activation of a target nucleic acid sequence (e.g., gene). In some aspects, the effector domain targets one or both alleles of a gene. The effector domain can be introduced as a nucleic acid sequence and/or as a protein. In some aspects, the effector domain can be a constitutive or an inducible effector domain. In some aspects, a Cas nucleic acid sequence or variant thereof and an effector domain nucleic acid sequence are introduced into the cell as a chimeric sequence. In some aspects, the effector domain is fused to a molecule that associates with (e.g., binds to) Cas protein (e.g., the effector molecule is fused to an antibody or antigen binding fragment thereof that binds to Cas protein). In some aspects, a Cas protein or variant thereof and an effector domain are fused or tethered creating a chimeric protein and are introduced into the cell as the chimeric protein. In some aspects, the Cas protein and effector domain bind as a protein-protein interaction. In some aspects, the Cas protein and effector domain are covalently linked. In some aspects, the effector domain associates non-covalently with the Cas protein. In some aspects, a Cas nucleic acid sequence and an effector domain nucleic acid sequence are introduced as separate sequences and/or proteins. In some aspects, the Cas protein and effector domain are not fused or tethered.
[00100] Examples of effector domains include a transcription(al) activating domain (e.g., VP16, VP48, VP64, VP96 and VP160), a coactivator domain, a transcription factor, a transcriptional pause release factor domain, a negative regulator of transcriptional elongation domain, a transcriptional repressor domain, a chromatin organizer domain, a remodeler domain, a histone modifier domain, a DNA modification domain, an RNA binding domain, a protein interaction input device domain (Grunberg and Serrano, Nucleic Acids Research, 38(8):2663-2675 (2010)), and a protein interaction output device domain (Grunberg and Serrano, Nucleic Acids Research, 38(8):2663-2675 (2010)). As used herein a "protein interaction input device" and a "protein interaction output device" refer to a protein-protein interaction (PPI). In some embodiments the PPI is regulatable, e.g., by a small molecule or by light. In some aspects, binding partners are targeted to different sites in the genome using the inactive Cas protein. The binding partners interact, thereby bringing the targeted loci into proximity. A protein interaction output device is a system for detecting/monitoring occurrence of a PPI, generally by producing a detectable signal when the PPI occurs (e.g., by reconstituting a fluorescent protein) or by triggering specific cellular responses (e.g., by reconstituting a caspase protein to induce apoptosis). The idea in this context is to target different sites in the genome with the components of the "output device". If the interaction occurs, the "output device" generates a signal. This can be used to determine or monitor the proximity of the targeted loci. In some aspects, cells are treated with an agent and the effect of the agent on the cell is determined. Other examples of effector domains include histone marks readers/interactors (http://www.cell.com/abstract/S0092-8674(10)00951-7) and DNA modification readers/interactors.
[00101] In some aspects, the effector domain is a VP16 effector domain. In some aspects, the effector domain is a VP48 effector domain. In some aspects, the effector domain is a VP64 effector domain. In some aspects, the effector domain is a VP96 effector domain. In some aspects, the effector domain is a VP160 effector domain.
[00102] In one aspect of the invention, the Cas9 can be fused to a single copy or to multiple/tandem copies of full-length or partial-length effectors. Other fusions can be with split (functionally complementary) versions of the effector domains. Effector domains for use in the methods include any one of the following classes of proteins: proteins that mediate drug inducible looping of DNA and/or contacts of genomic loci, and proteins that aid in the three-dimensional proximity of genomic loci bound by dCas9 with different sgRNA.
[00103] Specific examples of transcription activators or coactivators include VP16, tandem copies comprising all or a biologically active portion of the activation peptide from VP16 (e.g., minimal transactivation domain), such as ADALDDFDLDMLP (SEQ ID NO: 125) and DALDDFDLDML (SEQ ID NO: 126), VP48 (e.g., 3 copies of VP16 minimal transactivation domain), VP64 (e.g., 4 copies of VP16 minimal TA), VP96 (e.g., 6 copies of VP16 minimal TA), VP160 (e.g., 10 copies of VP16 minimal TA), Brd4, and p65.
[00104] A specific example of a transcription factor is MYC.
[00105] Specific examples of transcriptional pause release factors include proteins in the PTEFb complex, such as Cyclin T1, Cyclin T2, Cyclin T3, and Cdk9.
[00106] Specific examples of negative regulators of transcriptional elongation include negative elongation factor (NELF) components.
[00107] Specific examples of transcriptional repressors include engrailed (EnR), KRAB, Sin3-interaction domain (SID) and EMSY.
[00108] Specific examples of chromatin organizers and remodelers include insulator proteins, such as CTCF (transcriptional repressor CTCF or CCCTC-binding factor) to disrupt interactions between enhancers and promoters; cohesin complex and mediator complex Med1 to activate gene expression; switch/sucrose nonfermentable (SWI/SNF) complex (INI1, BAF155b, BAF170, BRG1, hBRM) to open up chromatin; and polycomb repressive complex to induce repressive domains on chromatin.
[00109] Specific examples of histone modifiers include histone acetyltransferases such as p300/EP300 (p300HAT), CBP/CREBBP (CBPHAT), MGEA5, CDYL, CLOCK, ELP3, GTF3C4, KAT2A, KAT2B, KAT5, MYST2, MYST3, MYST4, HAT1, NAT10, NCOA1, NCOA3, MYST1, CDY1B, CDY1; histone methyltransferases such as SET7, PRMT1, PRMT2, PRMT5, PRMT6, PRMT7, PRMT8, G9a, CARM1, MLL, Set2/SET1A, Ash2, Wdr5, Rbbp5, EZH1, EZH2, MLL2, MLL3, MLL4, MLL5, WHSC1L1, PRDM9, SETD1A, SETD1B, SETD2, SETD7, SETD8, SETDB1, SETDB2, SETMAR, SUV39H1, SUV39H2, SUV420H1, SUV420H2, NSD1, DOT1L, EHMT2, EHMT1, SMYD2, PRDM2, ASH1L, WHSC1, SMYD3; histone deiminases such as PADI4; histone biotinases such as HLCS; histone ribosylases such as PARP1; histone ubiquitinases such as RNF20, RNF40, DTX3L, HUWE1, RBX1, RING1, RNF2, RNF168, RNF8, UBR2, UHRF1, RAG1; histone kinases such as CDK17, CDK3, CDK5, DAPK3, PRKDC, GSK3B, CHUK, LIMK2, MASTL, MAP3K8, MLT, BUB1, PRKCB, PRKCD, RPS6KA4, RPS6KA5, ATM, STK10, AURKB, STK4, ATR, GSG2, PKN1, NEK6, NEK9, PAK2, TLK1, BAZ1B, JAK2; histone demethylases such as Jarid1, Rbr-2, JMJD6, PHF8, KDM2A, KDM2B, KDM3A, KDM3B, KDM4A, KDM4B, KDM4C, KDM4D, KDM5A, KDM5B, KDM5C, KDM5D, KDM6A, KDM6B, JHDM1D, JMJD5, C14orfl89, KDM1A, KDM1B; histone deribosylases such as PARG; histone deubiquitinases such as MSYM1, BRCC3, USP16, USP22, USP3; histone phosphatases such as DUSP1, EYA1, EYA2, EYA3, PPM1D, PPP2CA, PPP2CB, PPP4C, PPP5C, PPP1CC; histone deacetylases such as HDAC1, HDAC10, HDAC11, HDAC2, HDAC3, HDAC4, HDAC5, HDAC6, HDAC7, HDAC8, HDAC9, SIRT1, SIRT2, SIRT3, SIRT6.
[00110] Specific examples of DNA modifiers include 5hmC conversion from 5mC such as by Tet1 (Tet1CD); DNA demethylation by Tet1, ACID A, MBD4, Apobec1, Apobec2, Apobec3, Tdg, Gadd45a, Gadd45b, ROS1; DNA methylation by Dnmt1, Dnmt3a, Dnmt3b, CpG Methyltransferase M.SssI, and/or M.EcoHK31I.
[00111] Specific examples of RNA binding domains to bring RNA molecules to specific genomic loci include Rbfox2, CUG-BP, MBNL1, MBNL2, MBNL3, MS2 coat protein (MS2 hairpin), and engineered Pumilio.
[00112] Specific examples of protein interaction input devices (Grunberg and Serrano, Nucleic Acids Research, 38(8):2663-2675 (2010)) to mediate drug inducible looping of DNA and/or contact of genomic loci include rapamycin induced FKBP:FRB interaction, Jun:Fos, engineered variants of constitutive leucine zipper interaction, and light-inducible PIF3:PhyB interaction.
[00113] Specific examples of protein interaction output devices (Grunberg and Serrano, Nucleic Acids Research, 38(8):2663-2675 (2010)) to report three-dimensional proximity of genomic loci bound by dCas9 with different sgRNA targeting different genomic loci include split green fluorescent protein (GFP), Fluorescence Resonance Energy Transfer (FRET), split lactamase (antibiotic resistance-based selection) and split caspase. These proteins can also be extended to a screening platform for proximal domains in chromatin with a library of sgRNA expression constructs.
[00114] Specific examples of histone marks readers/interactors include Sgf29, BPTF, C17orf49/BAP18, GATAD1, TRRAP, PHF8, N-PAC, MSH-6, NSD1, NSD2, CBX1, CBX3, CBX5, CDYL, and CDYL2.
[00115] Specific examples of DNA modification readers/interactors include MeCP2, MBD1, MBD2, MBD3, MBD4, ZBTB4, ZBTB33, ZBTB38, UHRF1, and UHRF2.
[00116] In some aspects of the invention, the method of modulating one or more target nucleic acid sequences in a cell can further comprise introducing an effector molecule. As used herein, an "effector molecule" is a molecule (e.g., nucleic acid sequence; protein; organic molecule; inorganic molecule; small molecule) or physical trigger that associates with (e.g., binds to; specifically binds to) the effector domain to modulate the expression and/or activity of a target nucleic acid sequence (e.g., an inducer molecule; a trigger molecule). In some aspects, the effector molecule is a physical signal such as light (e.g., at one or more specific wavelengths); temperature (e.g., temperature-sensitivity); magnetism; a stressor; and the like. The effector molecule can be contacted with the cell and/or introduced into the cell (e.g., as a nucleic acid sequence or as a protein sequence). In some embodiments, the effector molecule is endogenous. In other embodiments, the effector molecule is exogenous. For example, an exogenous effector molecule can be introduced to the cell. In some aspects, the effector molecule binds to the effector domain. In some aspects, the effector molecule is a nucleic acid, protein, drug, small organic molecule and derivatives/variants thereof. In some aspects of the invention, the effector molecule is an antibiotic or derivatives/variants thereof. For example, the antibiotic is doxycycline. One of ordinary skill in the art can appreciate other types of antibiotics used, including but not limited to, tetracycline, ampicillin, puromycin, and neomycin. In some aspects, the effector molecule is rapamycin, tamoxifen and/or derivatives/variants thereof (e.g., (Z)-4-hydroxytamoxifen).
[00117] As will be appreciated by those of skill in the art, the effector molecule can also associate with one or more domains (e.g., binding domains) that are fused to or associated with the effector domain. For example, the effector domain can be fused to or associated with a receptor domain and/or an antigen binding domain, and the effector molecule (e.g., a ligand specific to the receptor domain; an antibody specific to the antigen binding domain) can bind to the receptor domain and/or antigen binding domain which activates the effector domain, thereby modulating the expression and/or activity of the one or more target nucleic acid sequences.
[00118] As will be apparent to those of skill in the art, the method can further comprise introducing other molecules or factors into the cell to facilitate modulation of the activation and/or expression of the target nucleic acid sequence. Examples of such molecules include coactivators, chromatin remodelers, histone acetylases, deacetylases, kinases, and methylases. The methods described herein can also be used to silence expression of a nucleic acid sequence (e.g., a gene) by guiding a repressor to a target nucleic acid sequence.
[00119] A variety of target nucleic acid sequences can be mutated or modulated using the methods described herein and will depend upon the desired results. In one aspect, the target nucleic acid sequence is a gene sequence. In particular aspects, the methods described herein can be used to genetically modify two or more different genes in the same gene family, two or more genes that have a redundant function (e.g., redundant may mean that one needs to inactivate at least two of the genes to produce a particular phenotype, e.g., a detectable phenotype), two or more genes of which at least one gene does not or is believed not to produce detectable phenotype when inactivated (e.g., in the strain background used), two or more genes at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or more identical, two or more copies of the same gene, two or more genes in same biological pathway (e.g., signaling pathway, metabolic pathway), two or more genes that share at least one biological activity and/or act on at least one common substrate and/or are part of the same
protein or protein-nucleic acid complex (e.g., a heterooligomeric protein, spliceosome, proteasome, RISC, transcription complex, replication complex, kinetochore, channel, transporter).
[00120] In some aspects, the target nucleic acid sequence is associated with a disease or condition (e.g., see van der Weyden et al., Genome Biol, 12:224 (2011)). Specific examples of genetic modifications of interest include modifying sequence(s) (e.g., gene(s)) to match the sequence in a different species (e.g., changing a mouse sequence to a human sequence for any gene(s) of interest), altering sites of potential or known post-translational modification of proteins (e.g., phosphorylation, glycosylation, lipidation, acylation, acetylation), altering sites of potential or known epigenetic modification, altering sites of potential or known protein-protein or protein-nucleic acid interaction, inserting a tag, e.g., an epitope tag, and/or inserting or deleting splice sites. Other examples include mutating a cell or nonhuman mammal to insert an epitope tag or transgene at an endogenous locus, make a reporter mouse, introduce loxP sites or FlpRT sites flanking certain genomic regions, and/or insert a cassette (e.g., a loxP-stop-loxP or FRT-stop-FRT cassette) in front of a gene to produce conditional alleles (e.g., see Frese and Tuveson, Nature Rev, 7:645-658 (2007); Nern et al., PNAS, 108(34):14198-14203 (2011); Freidal et al., Meth Molec Biol, 693:205-231 (2011)).
[00121] In some aspects, one copy of the one or more target nucleic acid sequences is mutated. In some aspects, both copies of one or more of the target nucleic acid sequences in the stem cell or zygote are mutated. In some aspects, the one or more target nucleic acid sequences that are mutated are endogenous to the stem cell or zygote.
[00122] In particular aspects, at least two of the target nucleic acid sequences are endogenous nucleic acid sequences. In some aspects, at least two of the target nucleic acid sequences are exogenous nucleic acid sequences. In some aspects where there are at least two target nucleic acid sequences, at least one of the target nucleic acid sequences is an endogenous nucleic acid sequence and at least one of the target nucleic acid sequences is an exogenous nucleic acid sequence. In some aspects, at least two of the target nucleic acid sequences are endogenous genes. In some aspects, at least two of the target nucleic acid sequences are exogenous genes. In some aspects where there are at least two target nucleic acid sequences, at least
one of the target nucleic acid sequences is an endogenous gene and at least one of the target nucleic acid sequences is an exogenous gene. In some aspects, at least two of the target nucleic acid sequences are at least 1 kB apart. In some aspects, at least two of the target nucleic acid sequences are on different chromosomes.
[00123] As used herein "mutate", "mutated" or "mutation" and the like refers to alteration of a sequence (a target sequence). For example, in some aspects, a target sequence that has been mutated refers to the replacement, introduction, and/or deletion of one or more nucleotides in the target sequence. In some aspects, a target sequence has been mutated to replace one or more nucleotides in the sequence with one or more nucleotides that occur in one or more natural states of the sequence (e.g., target sequence that is mutated with respect to a wild type sequence has been mutated to replace one or more nucleotides in the sequence with one or more nucleotides that occur in a wild type sequence). In some aspects, a target sequence has been mutated to replace one or more nucleotides that occurs in one or more natural states of the sequence (wild type) with one or more other nucleotides.
[00124] In particular aspects, at least one mutation comprises an insertion of a tag (e.g., an epitope tag such as a V5 tag; a fluorescent tag), a transgene (e.g., a reporter gene such as p2A-mCherry, GFP), a translation initiation site (e.g., IRES sequence), a transcription initiation site (e.g., TATA box) and/or an insertion of a site recognized by a recombinase (e.g., Cre). In some aspects, at least one mutation renders expression of an endogenous gene conditional. In yet other aspects, at least one mutation renders expression of an endogenous gene inducible, repressible, or tissue-specific. In still other aspects, the mutations comprise inserting recombination sites (e.g., loxP sites or FRT sites) flanking a selected genomic region, wherein the selected genomic region is optionally within a gene. The mutations can also comprise inserting a recombination-site-STOP-recombination-site cassette (e.g., a loxP-STOP-loxP or FRT-STOP-FRT cassette) in a gene, between a promoter and a coding region of a gene, or in a regulatory region of a gene. In this aspect, the recombination-site-STOP-recombination-site cassette is positioned so as to disrupt expression of the gene, and excision of the cassette by a recombinase renders the gene expressible.
[00125] The methods provided herein provide for multiplexed genome editing in cells, embryos, zygotes and nonhuman mammals. As shown herein, cells, embryos, zygotes and non-human mammals carrying mutations in multiple genes can be generated in a single step. In some aspects, the methods described herein allow for the mutation of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, etc. nucleic acid sequences (e.g., genes) in a (single) cell, zygote, embryo or nonhuman mammal. In a particular aspect, 1 nucleic acid sequence is mutated in a (single) cell, zygote, embryo or nonhuman mammal. In some aspects, 2 nucleic acid sequences are mutated in a (single) cell, zygote, embryo or nonhuman mammal. In some aspects, 3 nucleic acid sequences are mutated in a (single) cell, zygote, embryo or nonhuman mammal. In some aspects, 4 nucleic acid sequences are mutated in a (single) cell, zygote, embryo or nonhuman mammal. In some aspects, 5 nucleic acid sequences are mutated in a (single) cell, zygote, embryo or nonhuman mammal, etc.
[00126] The methods described herein can further comprise introducing one or more additional nucleic acid sequences that are complementary to a portion of the one or more target nucleic acid sequences cleaved by the Cas protein. A variety of nucleic acid sequences can be introduced, and include a single stranded
oligonucleotide, a double stranded oligonucleotide, a plasmid, a cDNA, a gene block (e.g., gBlocks™ Gene Fragments (IDT)), a PCR product and the like. Thus, the size of the nucleic acid sequences can vary and will depend upon the reason for introducing the nucleic acid sequence. For example, the one or more nucleic acid sequences can be used to replace one or more nucleotides, introduce one or more additional nucleotides, delete one or more nucleotides or a combination thereof in the one or more target nucleic acid sequences. In a particular aspect, the one or more nucleic acid sequences introduce a point mutation in one or more of the target sequences. In some aspects, the one or more nucleic acid sequences replace one or more mutant nucleotides with one or more wild type nucleotides in one or more of the target sequences. In some aspects, the one or more nucleic acid sequences replace one or more wild type nucleotides with one or more (mutant) nucleotides in one or more of the target sequences. In some aspects, the one or more nucleic acids introduce a tag (e.g., a fluorescent protein such as green fluorescent protein), label and/or cleavage site. Thus, the nucleic acid sequence can be from about 10 nucleotides to about 5000 nucleotides, about 20 to 4500 nucleotides, about 30 to 4000 nucleotides, about 50 to 3500 nucleotides, about 60 to about 3000 nucleotides,
about 70 to about 2500 nucleotides, about 80 to about 2000 nucleotides, about 90 to about 1500 nucleotides, about 100 to about 1000 nucleotides, etc. In a particular aspect, the nucleic acid sequence is about 10 to about 500 nucleotides.
[00127] In a particular aspect, the nucleic acid sequence (e.g., oligonucleotide) is used to further modify (alter, edit, mutate) the cleaved target nucleic acid sequence (e.g., such oligo-mediated repair allows for precise genome editing). Thus, this aspect allows for genome editing; however, as shown herein, the other allele is often mutated through nonhomologous end joining (NHEJ; see Fig. 3B, 3C, and 8C). Shown herein is that using a lower Cas mRNA concentration generates more mice with heterozygous mutations. Therefore, the methods provided herein allow for optimization of the system for more efficient generation of nonhuman mammals with only one oligo-modified allele. In some embodiments, employment of a Cas nickase is desirable, since it mainly induces DNA single strand breaks, which are typically repaired through HDR (Cong et al., Science 339:819-823 (2013); Mali et al., Science, 339:823-826 (2013)).
[00128] As will be apparent to those of skill in the art, a variety of methods can be used to introduce nucleic acid and/or protein into a stem cell, zygote, embryo, and/or mammal. Suitable methods include calcium phosphate or lipid-mediated transfection, electroporation, injection, and transduction or infection using a vector (e.g., a viral vector such as an adenoviral vector). In some aspects, the nucleic acid and/or protein is complexed with a vehicle, e.g., a cationic vehicle, that facilitates uptake of the nucleic acid and/or protein, e.g., via endocytosis.
[00129] The method described herein can further comprise isolating the stem cell or zygote produced by the methods. Thus, in some aspects, the invention is directed to a stem cell or zygote (an isolated stem cell or zygote) produced by the methods described herein. In some aspects, the disclosure provides a clonal population of cells harboring the mutation(s), replicating cultures comprising cells harboring the mutation(s) and cells isolated from the generated animals.
[00130] The methods described herein can further comprise crossing the generated animals with other animals harboring genetic modifications (optionally in same strain background) and/or having one or more phenotypes of interest (e.g., disease susceptibility - such as NOD mice). In addition, the methods may comprise modifying a stem cell, zygote, and/or animal from a strain that harbors one or more
genetic modifications and/or has one or more phenotypes of interest (e.g., disease susceptibility).
[00131] In some aspects, various mouse strains and mouse models of human disease are used. One of ordinary skill in the art appreciates that thousands of commercially and non-commercially available strains of laboratory mice exist for modeling human disease. Mouse models exist for diseases such as cancer, cardiovascular disease, autoimmune and inflammatory diseases, diabetes (type 1 and 2), neurobiological diseases, and other diseases. Examples of commercially available research strains include, but are not limited to, 11BHSD2 Mouse, GSK3B Mouse, 129-E Mouse, HSD11B1 Mouse, AK Mouse, Immortomouse®, Athymic Nude Mouse, LCAT Mouse, B6 Albino Mouse, Lox-1 Mouse, B6C3F1 Mouse, Ly5 Mouse, B6D2F1 (BDF1) Mouse, MMP9 Mouse, BALB/c Mouse, NIH-III Nude Mouse, BALB/c Nude Mouse, NOD Mouse, NOD SCID Mouse, Black Swiss Mouse, NSE-p25 Mouse, C3H Mouse, NU/NU Nude Mouse, C57BL/6-E Mouse, PCSK9 Mouse, C57BL/6N Mouse, PGP Mouse (P-glycoprotein Deficient), CB6F1 Mouse, repTOP™ ERE-Luc Mouse, CD-1® Mouse, repTOP™ mitoIRE Mouse, CD-1® Nude Mouse, repTOP™ PPRE-Luc Mouse, CD1-E Mouse, Rip-HAT Mouse, CD2F1 (CDF1) Mouse, SCID Hairless Congenic (SHC™) Mouse, CF-1™ Mouse, SCID Hairless Outbred (SHO™) Mouse, DBA/2 Mouse, SJL-E Mouse, Fox Chase CB17™ Mouse, SKH1-E Mouse, Fox Chase SCID® Beige Mouse, Swiss Webster (CFW®) Mouse, Fox Chase SCID® Mouse, TARGATT™ Mouse, FVB Mouse, THE POUND MOUSE™, and GLUT 4 Mouse. Other mouse strains include BALB/c, C57BL/6, C57BL/10, C3H, ICR, CBA, A/J, NOD, DBA/1, DBA/2, MOLD, 129, HRS, MRL, NZB, NIH, AKR, SJL, NZW, CAST, KK, SENCAR, C57L, SAMR1, SAMP1, C57BR, and NZO.
[00132] The methods described herein can further comprise assessing whether the one or more target nucleic acids have been mutated and/or modulated using a variety of known methods.
[00133] In some embodiments methods described herein are used to produce multiple genetic modifications in a stem cell, zygote, embryo, or animal, wherein at least one of the genetic modifications knocks out (functionally inactivates completely or partially) a gene whose knockout does not produce a detectable phenotype, and at least one of the genetic modifications is in a different gene or
genomic location. The resulting stem cell, zygote, embryo, or animal, or a cell, zygote, embryo, or animal generated therefrom, is analyzed for the presence of one or more detectable phenotypes. Such methods may be used to identify genes or genomic locations that have synthetic effects (e.g., effects that are greater in degree or different in kind from the sum of the effects caused by either mutation alone). In some embodiments an effect is synthetic lethality. In some embodiments at least one of the genetic modifications may be conditional (e.g., the effect of the modification, such as gene knockout, only becomes manifest under certain conditions, which are typically under control of the artisan). In some embodiments animals are permitted to develop at least to post-natal stage, e.g., to adult stage. The appropriate conditions for the modification to produce an effect (sometimes termed "inducing conditions") are imposed, and the phenotype of the animal is subsequently analyzed. A phenotype may be compared to that of an unmodified animal or to the phenotype prior to the imposition of the inducing conditions.
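By way of illustration only, the comparison underlying a "synthetic effect" (an effect of the double mutant that is greater in degree than the sum of the effects of either mutation alone) can be sketched as follows. The function names, the numeric effect scale, and the tolerance parameter are assumptions introduced for this sketch and are not part of the described methods; it covers only the "greater in degree" case, not effects that differ in kind.

```python
def synthetic_excess(effect_a, effect_b, effect_double):
    """Return how far the double-mutant effect exceeds the additive expectation.

    All three effects are assumed to be measured on the same numeric scale
    (e.g., fractional reduction in viability relative to an unmodified animal).
    """
    expected_additive = effect_a + effect_b
    return effect_double - expected_additive


def exceeds_additive(effect_a, effect_b, effect_double, tolerance=0.0):
    """True if the double-mutant effect is greater than the sum of the single
    effects by more than `tolerance` (a hypothetical noise margin)."""
    return synthetic_excess(effect_a, effect_b, effect_double) > tolerance


# Hypothetical example: each knockout alone has a small effect,
# while the double knockout has a large one.
print(exceeds_additive(0.05, 0.10, 0.90, tolerance=0.1))
```

A purely additive pair (for example, single effects of 0.2 and 0.3 with a double-mutant effect of 0.5) would not be flagged under this sketch.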
[00134] In any aspect or embodiment herein, analysis may comprise any type of phenotypic analysis known in the art, e.g., examination of the structure, size, development, weight, or function of any tissue, organ, or organ system (or the entire organism), analysis of behavior, activity of any biological pathway or process, level of any particular substance or gene product, etc. In some embodiments analysis comprises gene expression analysis, e.g., at the level of mRNA or protein. In some embodiments such analysis may comprise, e.g., use of microarrays (e.g., oligonucleotide microarrays, sometimes termed "chips"), high throughput sequencing (e.g., RNASeq), ChIP on Chip analysis, ChIPSeq analysis, etc. In some embodiments high content screening may be used, in which elements of high throughput screening may be applied to the analysis of individual cells through the use of automated microscopy and image analysis (see, e.g., Zanella et al. (2010). High content screening: seeing is believing. Trends Biotechnol. 28:237-245). In some embodiments analysis comprises quantitative analyses of components of cells such as spatio-temporal distributions of individual proteins, cytoskeletal structures, vesicles, and organelles, e.g., when contacted with test agents, e.g., chemical compounds. In some embodiments activation or inhibition of individual proteins and protein-protein interactions and/or changes in biological processes and cell functions may be assessed. A range of fluorescent probes for biological processes, functions, and cell components are available and may be used, e.g., with fluorescence microscopy. In some embodiments cells or animals generated according to methods herein may comprise a reporter, e.g., a fluorescent reporter or enzyme (e.g., a luciferase such as Gaussia, Renilla, or firefly luciferase) that, for example, reports on the expression or activity of particular genes. Such a reporter may be fused to a protein, so that the protein or its activity is rendered detectable, optionally using a non-invasive detection means, e.g., an imaging or detection means such as PET imaging, MRI, or fluorescence detection. Multiplexed genome editing according to the invention may allow installation of reporters for detection of multiple proteins, e.g., 2 - 20 different proteins, e.g., in a cell, tissue, organ, or animal, e.g., in a living animal.
[00135] Multiplexed genome editing according to the present invention may be useful to determine or examine the biological role(s) and/or roles in disease of genes of unknown function (e.g., genes whose complete knockout does not produce a detectable phenotype). For example, discovery of synthetic effects caused by mutations in first and second genes may pinpoint a genetic or biochemical pathway in which such gene(s) or encoded gene product(s) is involved. In some
embodiments mutations may be generated in stem cells or zygotes from any existing knockout or deletion strain or animals produced according to methods described herein may be crossed with animals from such strain. In some embodiments one or more gain-of-function and/or loss-of-function alleles are generated.
[00136] In some embodiments it is contemplated to use, in methods described herein, cells or zygotes generated in or derived from animals produced in projects such as the International Knockout Mouse Consortium (IKMC, the website of which is http://www.knockoutmouse.org). In some embodiments it is
contemplated to cross animals generated as described herein with animals generated by or available through the IKMC. For example, in some embodiments a mouse gene to be modified according to methods described herein is any gene from the Mouse Genome Informatics (MGI) database for which sequences and genome coordinates are available, e.g., any gene predicted by the NCBI, Ensembl, and Vega (Vertebrate Genome Annotation) pipelines for mouse Genome Build 37 (NCBI) or Genome Reference Consortium GRCm38.
[00137] In some embodiments a gene or genomic location to be modified is included in the genome of a species for which a fully sequenced genome exists. Genome sequences may be obtained, e.g., from the UCSC Genome Browser (http://genome.ucsc.edu/index.html). For example, in some embodiments a human gene or sequence to be modified according to methods described herein may be found in Human Genome Build hg19 (Genome Reference Consortium). In some embodiments a gene is any gene for which a Gene ID has been assigned in the Gene Database of the NCBI (http://www.ncbi.nlm.nih.gov/gene). In some embodiments a gene is any gene for which a genomic, cDNA, mRNA, or encoded gene product (e.g., protein) sequence is available in a database such as any of those available at the National Center for Biotechnology Information (www.ncbi.nih.gov) or Universal Protein Resource (www.uniprot.org). Databases include, e.g., GenBank, RefSeq, Gene, UniProtKB/SwissProt, UniProtKB/Trembl, and the like.
[00138] In some embodiments a gene encodes a polypeptide. In some
embodiments a gene may not encode a polypeptide. A gene may, for example, comprise a template for transcription of a functional RNA, i.e., an RNA that has at least one function other than providing a messenger RNA (mRNA) to be translated into protein. Examples include, e.g., long non-coding RNA (e.g., greater than 200 bases in length, e.g., 200 - 5,000 bases), small RNA (e.g., small nuclear RNA), transfer RNA, ribosomal RNA, microRNA precursor, Piwi-interacting RNAs (piRNAs), and small nucleolar RNAs (snoRNAs). In some embodiments a small RNA is 25 bases or less, 50 bases or less, 100 bases or less, or 200 bases or less in length. Sequences of functional RNAs are available, e.g., from databases such as miRBase (website is http://www.mirbase.org) (Kozomara A, et al., miRBase: integrating microRNA annotation and deep-sequencing data. NAR 2011 39(Database Issue):D152-D157), or the Long Non-Coding RNA Database, also called lncRNAdb (website is http://www.lncrnadb.org/) (Amaral PP, et al. (2011) lncRNAdb: a reference database for long noncoding RNAs. Nucleic Acids Res 39: D146-151). In some embodiments a genomic sequence may be suspected of potentially comprising a template for transcription of a functional RNA. A genetic modification may be made in the sequence to determine whether such genetic modification alters the phenotype of a cell or animal or affects production of an RNA or protein or alters susceptibility to a disease.
[00139] In some embodiments it is of interest to genetically modify a known or suspected regulatory region, e.g., a known or suspected enhancer region or a known or suspected promoter region. The effect on expression of one or more genes in the vicinity of the modification (e.g., within up to about 1, 2, 5, 10, 20, 50, 100, or 500 kb, or within about 1, 2, 5, or 10 Mb, from the modification) may be assessed. A genetic modification may be made in the sequence to determine whether such genetic modification alters the phenotype of a cell or animal or affects production of an RNA or protein or alters susceptibility to a disease.
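As a minimal sketch of how genes within a chosen distance of a modified regulatory region might be shortlisted for expression analysis, the following selects annotated genes whose nearest edge lies within a window around the modification. The gene names, coordinates, and the `Gene` record are hypothetical placeholders, not data from this disclosure; real coordinates would come from a genome annotation of the kind referenced above.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # 1-based start coordinate (assumed convention)
    end: int    # inclusive end coordinate


def genes_within(genes, chrom, position, window_bp):
    """Return (gene name, distance in bp) for genes on `chrom` whose nearest
    edge is within `window_bp` of `position`, closest first."""
    hits = []
    for gene in genes:
        if gene.chrom != chrom:
            continue
        if gene.start <= position <= gene.end:
            distance = 0  # the modification lies inside the gene
        else:
            distance = min(abs(position - gene.start), abs(position - gene.end))
        if distance <= window_bp:
            hits.append((gene.name, distance))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical annotation and modification site.
annotation = [
    Gene("geneA", "chr1", 100_000, 105_000),
    Gene("geneB", "chr1", 200_000, 210_000),
    Gene("geneC", "chr2", 150_000, 151_000),
]
print(genes_within(annotation, "chr1", 150_000, 50_000))
```

Widening `window_bp` (for example, from 50 kb toward 10 Mb) simply enlarges the candidate list; the choice of window is left to the artisan.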
[00140] In some embodiments any method described herein may comprise isolating one or more cells, samples, or substances from an animal generated according to methods described herein, e.g., any genetically modified animal generated as described herein. In some embodiments a method may further comprise analyzing the one or more cells, samples, or substances. Such analysis may, for example, assess the effect of a genetic modification(s) introduced according to the methods.
[00141] In some embodiments animals generated according to methods described herein may be useful in the identification of candidate agents for treatment of disease and/or for testing agents for potential toxicity or side effects. In some embodiments any method described herein may comprise contacting an animal generated according to methods described herein, e.g., any genetically modified animal generated as described herein, with a test agent (e.g., a small molecule, nucleic acid, polypeptide, lipid, etc.). In some embodiments contacting comprises administering the test agent. Administration may be by any route (e.g., oral, intravenous, intraperitoneal, gavage, topical, transdermal, intramuscular, enteral, subcutaneous), may be systemic or local, may include any dose (e.g., from about 0.01 mg/kg to about 500 mg/kg), and may involve a single dose or multiple doses. In some embodiments a method may further comprise analyzing the animal. Such analysis may, for example, assess the effect of the test agent in an animal having a genetic modification(s) introduced according to the methods. In some embodiments a test agent that reduces or enhances an effect of one or more genetic modification(s) may be identified. In some embodiments if a test agent reduces or inhibits development of a disease associated with or produced by the genetic
modification(s), (or reduces or inhibits one or more symptoms or signs of such a
disease) the test agent may be identified as a candidate agent for treatment of a disease associated with or produced by the genetic modification(s) or associated with or produced by naturally occurring mutations in a gene or genomic location harboring the genetic modification.
[00142] In some embodiments a cell (e.g., a somatic cell to be used to generate an iPS cell) may be a diseased cell or may originate from a subject suffering from a disease, e.g., a disease affecting the cell or organ from which the cell was obtained. In some embodiments a mutation is introduced into a genomic region of the iPS cell that is associated with a disease (e.g., any disease of interest, such as diseases mentioned herein). For example, in some embodiments it is of interest to knock out or otherwise modify a gene or genomic location that is known or suspected to be involved in disease pathogenesis and/or known or suspected to be associated with increased or decreased risk of developing a disease or particular manifestation(s) of a disease. In some embodiments it is of interest to knock out or otherwise modify a gene or genomic location and determine whether such knockout or modification alters the risk of developing a disease or one or more manifestations of a disease, alters progression of the disease, or alters the response of a subject to therapy or candidate therapy for a disease. In some embodiments it is of interest to modify an abnormal or disease-associated nucleotide or sequence to one that is normal or not associated with disease. In some embodiments this may allow production of genetically matched cells or cell lines (e.g., iPS cells or cell lines) that differ only at one or more selected sites of genetic modification. Multiplexed genome editing as described herein may allow for production of cells or cell lines that are isogenic except with regard to, e.g., between 2 and 20 selected sites or genetic alterations. This may allow for the study of the combined effect of multiple mutations that are suspected of or known to play a role in disease risk, development or progression.
[00143] The methods of modulating the expression and/or activity of one or more target nucleic acid sequences in a cell have a variety of uses (e.g., therapeutic, pharmaceutical and/or academic uses). For example, CRISPRzymes can be designed to target specific chromatin loci to exert modification (e.g., methylation or demethylation) on causative genes of diseases due to aberrant chromatin state to correct the chromatin states. In addition, CRISPRzymes can be used to detect/sense certain sequence variation or chromatin states at defined loci guided by sgRNA, or
interactions between genomic loci guided by pairs or set of sgRNAs and to exert specific therapeutic outcomes dependent on chromatin state or the interaction of genomic loci.
[00144] For example, split fragments of Caspase can be fused to dCas9 and only reconstitute apoptosis-inducing activity when two genomic loci targeted by specific sgRNAs are proximal due to looping under certain disease conditions or cell types, e.g., cancer stem cells, [http://www.ncbi.nlm.nih.gov/pubmed/22070901].
CRISPRzymes can be coupled with biosensors to kill cells on detecting specific histone or DNA modifications at specific loci, e.g., DNA methylation (http://www.ncbi.nlm.nih.gov/pubmed/21797230). For example, a pair of fusions can be used: a CRISPR-CaspaseA fusion and an MBD1-CaspaseB fusion. MBD1-CaspaseB binds to mCpG, and CRISPR-CaspaseA binds to a genomic locus (e.g., a hypermethylated gene in cancer) guided by an sgRNA. Only at that defined locus, and only when the locus is methylated, is the Caspase reconstituted, triggering the killing of cancer cells but not normal cells. CRISPRzymes can be used to detect chromosomal translocation events resulting in fusion of DNA fragments. dCas9 can be fused to split fragments of a fluorescent marker or luciferase gene, and sgRNAs targeting the fused genes are used; only when the two specific gene fragments are fused is the reporter reconstituted. This strategy can be used to screen for or detect subtypes of cancer cells in patient samples/biopsies at single cell resolution. Similarly, fusion with split caspase will allow specific killing/depletion of aberrant cells characterized by specific chromosomal translocation events. Conversely, CRISPRzymes can be used to restore DNA looping in patients with deficient DNA looping, e.g., Cornelia de Lange patients (defects in the cohesin complex).
[00145] CRISPRzymes can also be used in pharmaceutical and/or academic research. For example, a screen can be performed using a library of sgRNA sequences in combination with a CRISPRzyme or a set of CRISPRzymes. The screen can be in an arrayed library format, where each sample (cells, embryos, or tissues) is treated with a known and predefined sgRNA or set of sgRNAs. Alternatively, the screen can be pooled, whereby vectors expressing different sgRNAs are mixed and introduced to the target (cells, embryos, tissues, etc.), cells with the appropriate phenotype are selected or enriched, and the sgRNAs associated with the specific phenotype are identified by sequencing. CRISPRzymes can be used to elicit chromatin state changes, or transcription activation of a specific gene or specific sets of genes, in somatic cells, adult stem cells or embryonic stem cells to induce them to reprogram into pluripotent states, to differentiate or to transdifferentiate.
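The read-out of a pooled screen of this kind can be sketched, under simplifying assumptions, as counting sequencing reads per sgRNA before and after selection and ranking sgRNAs by enrichment. The exact-match counting, the pseudocount, and the four-base "sgRNA" sequences below are hypothetical simplifications chosen for brevity and are not a description of any particular analysis pipeline.

```python
import math
from collections import Counter


def count_guides(reads, library):
    """Count reads that exactly match an sgRNA sequence in the library."""
    known = set(library)
    return Counter(read for read in reads if read in known)


def log2_enrichment(before, after, library, pseudocount=1.0):
    """Log2 ratio of each sgRNA's read fraction after vs. before selection.

    A pseudocount is added so that sgRNAs with zero reads do not divide by zero.
    """
    total_before = sum(before[g] for g in library) + pseudocount * len(library)
    total_after = sum(after[g] for g in library) + pseudocount * len(library)
    scores = {}
    for guide in library:
        frac_before = (before[guide] + pseudocount) / total_before
        frac_after = (after[guide] + pseudocount) / total_after
        scores[guide] = math.log2(frac_after / frac_before)
    return scores


# Hypothetical toy data: two guides, one enriched after selection.
library = ["AAAA", "CCCC"]
pre = count_guides(["AAAA", "AAAA", "CCCC", "CCCC"], library)
post = count_guides(["AAAA"] * 6 + ["GGGG"], library)  # "GGGG" is not in the library
scores = log2_enrichment(pre, post, library)
print(sorted(scores, key=scores.get, reverse=True))
```

Guides with the highest scores would be the candidates associated with the selected phenotype; in practice replicate samples and a statistical test would be needed before drawing conclusions.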
[00146] In some aspects, methods described herein may be used to produce non-human mammals that have a mutation in the SRY (sex determining region Y) gene. The SRY gene is an intronless gene located on the Y chromosome in therian mammals that encodes a transcription factor that is a member of the SOX (SRY-like box) gene family of DNA-binding proteins. Since a functional Sry protein is required for male development, a mammal that has an X and Y chromosome, wherein the Y chromosome harbors a loss-of-function mutation in SRY, is an anatomic female. An anatomic female may be recognized, e.g., by the presence of a uterus and ovaries and the absence of testes.
[00147] As described herein, the CRISPR/Cas system may be used to generate mutations in SRY, e.g., in a stem cell, zygote, or embryo. Thus in some embodiments, a target nucleic acid sequence mutated according to methods described herein is the SRY gene or a portion thereof. In some embodiments the mutation is a loss-of-function mutation. In some embodiments the loss-of-function mutation is a deletion of part or all of the SRY gene. In some embodiments the mutation, e.g., deletion, is in a portion of the gene that is essential for its function. In some embodiments a mutation is in the portion of the SRY gene that encodes the high mobility group (HMG) DNA binding domain of Sry, termed the HMG box. The HMG box (Nasrin, Nature, 354, 317-320 (1991)) is the characteristic domain of the SOX (SRY-type HMG box) family of transcription factors. It is a 79 amino acid domain that is highly conserved among SRY proteins (at least 50% identical to the human Sry HMG box). In humans, the HMG box extends from amino acid 58 to amino acid 137 of Sry. The corresponding sequences in other species are immediately evident upon aligning the Sry protein sequences with the human sequence (see, e.g., Fig. 15A). For example, in mouse the Sry HMG box extends from amino acid 3 to amino acid 82. The HMG domain is essential for the function of SRY proteins.
[00148] In some aspects, the present disclosure relates to the recognition that targeted mutations in SRY cause anatomic sex reversal, resulting in non-human mammals that have X and Y chromosomes but are anatomic females. For example, Applicants have generated XY mice having a variety of deletions or insertions in the gene (Wang H, et al., TALEN-mediated editing of the mouse Y chromosome. Nat Biotechnol. 2013; May 12, doi: 10.1038/nbt.2595. ePub ahead of print, incorporated herein by reference). The mice were generated using transcription activator-like effector nuclease (TALEN) technology to mutate the Sry gene in mouse ES cells. Two pairs of TALENs were generated to target the high mobility group (HMG) DNA binding domain of Sry and were transfected into mouse embryonic stem (ES) cells to generate deletions. TALEN pairs 1 and 2 showed gene modification efficiencies of 15% and 20%, respectively, based on a Surveyor assay. The deletions ranged in size from 11 to 540 bp (Wang, H., supra). Three of the generated deletions are depicted schematically in Figure 15B. The TALEN cleavage site is in the middle between the binding sites of the TALEN2 pair as depicted in Figure 15B. The mutated ES cells were used to produce living mice by tetraploid complementation using standard methods. The resulting mice were found to be anatomic females. In addition, the insertion of a sequence encoding GFP at the same site led to sex reversal. Adult Sry-targeted mice (anatomic females) showed reduced fertility, but they were fertile and transmitted the Sry-mutated Y chromosome to offspring.
TALEN recognition sequences (bold and underlined) and amino acid sequences of the RVDs

Sry pair 1
5' TGGCCCAGCAGAATCCCAGCATGCAAAATACAGAGATCAGCAAGC (SEQ ID NO: 127)
3' ACCGGGTCGTCTTAGGGTCGTACGTTTTATGTCTCTAGTCGTTCG (SEQ ID NO: 128)
NG NN N HD HD HD NI NN HD NI NN NI NI NG (Left TALEN) (SEQ ID NO: 129)
NN HD NG NG NN HD NG NN NI NG HD NG HD NG NN NG (Right TALEN) (SEQ ID NO: 130)

Sry pair 2
5' GAATGCATTTATGGTGTGGTCCCGTGGTGAGAGGCACAAGTTGGCCCAGC (SEQ ID NO: 131)
3' CTTACGTAAATACCACACCAGGGCACCACTCTCCGTGTTCAACCGGGTCG (SEQ ID NO: 132)
NN NI NI NG NN HD NI NG NG NG NI NG NN NN NG NN NG (SEQ ID NO: 133)
NN HD NG NN NN NN HD HD NI NI HD NG NG NN NG NN HD HD NG (SEQ ID NO: 134)
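The RVD strings listed above can be checked against the recognition sequences with a short script. The following is a minimal sketch and is not part of the original disclosure: it assumes the commonly used RVD code (NI = A, HD = C, NG = T, NN = G), and the function names are hypothetical. It decodes the two RVD strings of Sry pair 2 and checks that one matches the start of the top strand while the other matches the reverse complement of its end.

```python
# Illustrative sketch only. Assumption: the commonly used RVD-to-base code.
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def decode_rvds(rvds):
    """Translate a space-separated RVD string into the DNA bases it recognizes."""
    return "".join(RVD_TO_BASE[r] for r in rvds.split())


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


# Sry pair 2: top strand (SEQ ID NO: 131) and RVDs (SEQ ID NOs: 133 and 134)
top = "GAATGCATTTATGGTGTGGTCCCGTGGTGAGAGGCACAAGTTGGCCCAGC"
left = decode_rvds("NN NI NI NG NN HD NI NG NG NG NI NG NN NN NG NN NG")
right = decode_rvds("NN HD NG NN NN NN HD HD NI NI HD NG NG NN NG NN HD HD NG")

# One TALEN binds the top strand at its 5' end; the other binds the bottom strand.
left_ok = top.startswith(left)
right_ok = top.endswith(reverse_complement(right))

# Bases between the two binding sites, where cleavage is expected to occur.
spacer = len(top) - len(left) - len(right)
print(left_ok, right_ok, spacer)
```

Under these assumptions both checks pass for Sry pair 2, and the two binding sites are separated by a 14 bp spacer.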
The distributions of genotypes and anatomic sexual phenotypes in progeny from six litters.
From the age of ~2 months, each of seven XYSry(tm1) females was housed with a single XYSry(dl1Rlb);Tg(Sry)2Ei male for 5-7 months. The result was that three XYSry(tm1) females gave birth to a total of eight litters (two eaten at birth). It has been reported that, in XY female meiosis, the X and Y chromosomes do not pair efficiently and segregate randomly, leading to sex chromosome aneuploidy in the offspring of XY females (1, 2).
a These mice may carry either one or two X chromosomes.
b These mice may also carry YSry(dl1Rlb).
[00149] In some embodiments the portion of the SRY gene that is targeted is within or overlaps with the portion of the gene that encodes the HMG box. In some embodiments the mutation removes at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 40, 50, 100, or more nucleotides from the gene, e.g., at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 40, 50, 100, or more nucleotides from the portion of the gene that encodes the HMG box. In some embodiments the mutation is in a portion of the gene upstream (5') of the region that encodes the HMG box, e.g., encoding a portion of the Sry protein that lies N-terminal to the HMG box. In some embodiments a mutation is an insertion upstream of or within the sequence that encodes the HMG box, wherein the insertion results in a frameshift or stop codon. For example, insertion of 1 or 2 nucleotides or a longer sequence whose length is not divisible by 3 would result in a frameshift. Insertion of a stop codon in the region located 5' of the sequence encoding the HMG box would result in a truncated and nonfunctional Sry protein. In some embodiments a mutation may be located in a portion of the SRY gene that encodes a portion of Sry that is C-terminal to the HMG box. In some embodiments a mutation may be in a regulatory region, e.g., a promoter. In some embodiments a mutation may be upstream of the start codon, e.g., in a promoter.
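The frameshift rule described in this paragraph can be illustrated with a short script. This is a minimal sketch under stated assumptions: the function names and the 12-nucleotide toy sequence are hypothetical and are not taken from the SRY gene. It shows that an insertion whose length is not a multiple of 3 shifts the reading frame, and that the shifted frame can expose a premature stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def causes_frameshift(inserted_nt):
    """An insertion shifts the reading frame when its length is not a multiple of 3."""
    return inserted_nt % 3 != 0


def first_stop_codon(coding_seq):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(coding_seq) - 2, 3):
        if coding_seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Hypothetical toy coding sequence (ATG GCT GAA CGT), not an SRY sequence.
original = "ATGGCTGAACGT"
# Insert a single T after the fifth nucleotide: ATG GCT TGA ACG T
mutated = original[:5] + "T" + original[5:]

print(causes_frameshift(1), causes_frameshift(2), causes_frameshift(3))
print(first_stop_codon(original), first_stop_codon(mutated))
```

In this toy case the one-nucleotide insertion creates an in-frame TGA at the third codon, so translation would terminate early, which is the kind of truncation the paragraph describes.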
[00150] In some embodiments the SRY gene is mutated in a zygote, and the zygote is transferred to the uterus of a foster mother (e.g., a pseudopregnant female) to develop to birth. It will be understood that the zygote may be maintained in culture after mutation of the SRY gene, e.g., to an early embryonic stage (e.g., a blastocyst) and then transferred to the uterus of a foster mother. In some embodiments, the invention provides a zygote having an X and Y chromosome, wherein the Y chromosome has an engineered mutation in the SRY gene, wherein the zygote is capable of developing to an anatomic female. The mammal may be any non-human mammal. In some embodiments a method comprises generating a non-human mammal that has an X chromosome and a Y chromosome (i.e., somatic cells of which contain an X and a Y chromosome).
[00151] Methods of creating anatomic females may be useful in any context in which it is desired to reduce the number or proportion of male offspring and/or increase the number or proportion of anatomically female offspring. In some embodiments, methods of generating anatomic females are useful in animal husbandry, which generally refers to the breeding and raising of non-human animals for any of a variety of purposes, e.g., for meat, as sources of animal products (e.g., milk, wool, hair, leather, skin, horn, eggs, or meat), for performing work, or providing companionship, e.g., as pets. In some embodiments it may be of interest to generate anatomic females which may be capable of producing offspring or serving as foster mothers for offspring of that species or producing a product of interest. In some embodiments the non-human mammal is allowed to develop at least until adulthood. In some embodiments the adult non-human mammal gives rise to offspring, which inherit the mutation. In some embodiments a useful product, e.g., milk, wool, hair, leather, skin, horn, or meat, is obtained from the anatomically female non-human mammal.
[00152] In the context of dairy farming there is considerable interest in reducing the number of male offspring, as they are not useful for producing milk. In some embodiments, a non-human mammal useful in dairy farming is a cow, goat, sheep, or camel, or other non-human animal useful for the production of milk. In some embodiments a cow is of any of the following breeds: a Holstein (also referred to as Holstein-Friesian), Brown Swiss, Canadienne, Dutch Belted, Guernsey, Ayrshire, Jersey, Kerry, Milking Shorthorn, Milking Devon, or Norwegian Red.
[00153] In some embodiments methods of creating anatomic females may be useful in the context of managing species at risk of extinction, e.g., in programs that attempt to maintain or increase the number of individuals of a particular species. In some embodiments a species at risk of extinction may be any species recognized as near threatened, threatened (vulnerable, endangered, or critically endangered), or extinct in the wild by the International Union for Conservation of Nature (IUCN). Such species are listed, e.g., on the IUCN Red List of Threatened Species (also known as the IUCN Red List or Red Data List), e.g., the 2012 version (available at the IUCN website at http://www.iucnredlist.org/). In some embodiments the population of a species at risk of extinction may be declining. In some embodiments a species, e.g., a species at risk of extinction, may be, e.g., a bear, canine, caprine, elephant, feline, non-human primate, ovine, rodent, or ungulate species. In some embodiments a species, e.g., a species at risk of extinction, may be a marsupial, e.g., a Tasmanian Devil.
[00154] In some embodiments, methods of generating non-human mammals may comprise mutating one or more genes whose mutation results in a phenotype of interest. In some embodiments both copies of the gene are mutated. A phenotype of interest may be any phenotype, e.g., any property of interest. In some embodiments the non-human mammal is a source of food (e.g., milk or meat) or other products useful for humans. In some embodiments at least some humans may be allergic to a component, e.g., a protein, found in the food. A phenotype of interest may comprise reduced or absent production of an allergenic component, or alteration in an allergenic component so as to reduce its allergenicity. For example, in some embodiments the gene encoding a whey protein, e.g., the whey protein beta-lactoglobulin (BLG), a component found in the milk of cows, sheep, and a variety of other species (but not humans) that constitutes a major milk allergen, is mutated. In some embodiments a gene is mutated so as to remove an allergenic epitope or alter it to a non-allergenic form, e.g., by changing or deleting one or more amino acids. The protein may still be produced and able to fulfill its normal function but is no longer allergenic or has reduced allergenicity to humans. In some embodiments a gene is mutated so as to reduce or eliminate production of the protein. In some embodiments a mutation is insertion of a stop codon or deletion or alteration of a start codon or at least a portion of a promoter.
[00155] In some embodiments a phenotype of interest may comprise any alteration that qualitatively or quantitatively alters one or more characteristics of a product that is obtained from the non-human mammal, e.g., in a way that makes the product more useful, easier to manipulate, less allergenic, or improved in any way. In some embodiments a characteristic may be color, texture, flavor, consistency, viscosity, thickness, roughness, toughness, tenderness, stringiness, fat content, protein content, sugar content, etc. In some embodiments a phenotype of interest may comprise any alteration that increases the yield of a product (e.g., on a per animal basis, per month or year basis); increases the growth rate; reduces the amount of food, resources, or care consumed or required by the animal; renders the animal more resistant to disease; renders the animal more tolerant of high or low
temperature, or reduces the environmental impact of the animal (e.g., reduces methane production). In some embodiments, a phenotype may comprise increased milk production.
[00156] In some embodiments a polymorphism, e.g., a single nucleotide polymorphism, may be identified as being associated with a phenotype of interest using methods known in the art (e.g., genetic association studies). Methods described herein may be used to generate non-human mammals having a
polymorphism that is associated with the phenotype. The animal may be compared with an otherwise isogenic animal that has not been genetically modified. The effect specifically due to variation at the polymorphic position may be determined. If a mutation or polymorphism confers a phenotype of interest, the non-human mammal may be used as a source of additional animals having the mutation or polymorphism and/or additional mammals having the mutation or polymorphism may be produced using methods described herein.
[00157] In some embodiments, methods of generating anatomically female non-human mammals may comprise mutating one or more additional nucleic acids in addition to the SRY gene. For example, any gene the mutation of which results in a phenotype of interest (e.g., reduced allergen content), may be mutated.
[00158] The terms "disease", "disorder" or "condition" are used interchangeably and may refer to any alteration from a state of health and/or normal functioning of an organism, e.g., an abnormality of the body or mind that causes pain, discomfort, dysfunction, distress, degeneration, or death to the individual afflicted. Diseases include any disease known to those of ordinary skill in the art. In some
embodiments a disease is a chronic disease, e.g., it typically lasts or has lasted for at least 3-6 months, or more, e.g., 1, 2, 3, 5, 10 or more years, or indefinitely. Disease may have a characteristic set of symptoms and/or signs that occur commonly in individuals suffering from the disease. Diseases and methods of diagnosis and treatment thereof are described in standard medical textbooks such as Longo, D., et al. (eds.), Harrison's Principles of Internal Medicine, 18th Edition; McGraw-Hill Professional, 2011 and/or Goldman's Cecil Medicine, Saunders; 24th edition (August 5, 2011). In certain embodiments a disease is a multigenic disorder (also referred to as complex, multifactorial, or polygenic disorder). Such diseases may be associated with the effects of multiple genes, sometimes in combination with environmental factors (e.g., exposure to particular physical or chemical agents or biological agents such as viruses, lifestyle factors such as diet, smoking, etc.). A multigenic disorder may be any disease for which it is known or suspected that multiple genes (e.g., particular alleles of such genes, particular polymorphisms in such genes) may contribute to risk of developing the disease and/or may contribute to the way the disease manifests (e.g., its severity, age of onset, rate of progression, etc.). In some embodiments a multigenic disease is a disease that has a genetic component as shown by familial aggregation (occurs more commonly in certain families than in the general population) but does not follow Mendelian laws of inheritance, e.g., the disease does not clearly follow a dominant, recessive, X-linked, or Y-linked inheritance pattern. In some embodiments a multigenic disease is one that is not typically controlled by variants of large effect in a single gene (as is the case with Mendelian disorders). In some embodiments a multigenic disease may occur in familial form and sporadically. Examples include, e.g., Parkinson's disease,
Alzheimer's disease, and various types of cancer. Examples of multigenic diseases include many common diseases such as hypertension, diabetes mellitus (e.g., type II diabetes mellitus), cardiovascular disease, cancer, and stroke (ischemic,
hemorrhagic). In some embodiments a disease, e.g., a multigenic disease is a psychiatric, neurological, neurodevelopmental disease, neurodegenerative disease, cardiovascular disease, autoimmune disease, cancer, metabolic disease, or respiratory disease. In some embodiments at least one gene is implicated in a familial form of a multigenic disease.
[00159] In some embodiments a disease is cancer, which term is generally used to refer to a disease characterized by one or more tumors, e.g., one or more malignant or potentially malignant tumors. The term "tumor" as used herein encompasses abnormal growths comprising aberrantly proliferating cells. As known in the art, tumors are typically characterized by excessive cell proliferation that is not appropriately regulated (e.g., that does not respond normally to physiological influences and signals that would ordinarily constrain proliferation) and may exhibit one or more of the following properties: dysplasia (e.g., lack of normal cell differentiation, resulting in an increased number or proportion of immature cells); anaplasia (e.g., greater loss of differentiation, more loss of structural organization, cellular pleomorphism, abnormalities such as large, hyperchromatic nuclei, high nuclear:cytoplasmic ratio, atypical mitoses, etc.);
invasion of adjacent tissues (e.g., breaching a basement membrane); and/or metastasis. Malignant tumors have a tendency for sustained growth and an ability to spread, e.g., to invade locally and/or metastasize regionally and/or to distant locations, whereas benign tumors often remain localized at the site of origin and are often self-limiting in terms of growth. The term "tumor" includes malignant solid tumors, e.g., carcinomas (cancers arising from epithelial cells), sarcomas (cancers arising from cells of mesenchymal origin), and malignant growths in which there may be no detectable solid tumor mass (e.g., certain hematologic malignancies). Cancer includes, but is not limited to: breast cancer; biliary tract cancer; bladder cancer; brain cancer (e.g., glioblastomas, medulloblastomas); cervical cancer;
choriocarcinoma; colon cancer; endometrial cancer; esophageal cancer; gastric cancer; hematological neoplasms including acute lymphocytic leukemia and acute myelogenous leukemia; T-cell acute lymphoblastic leukemia/lymphoma; hairy cell
leukemia; chronic lymphocytic leukemia, chronic myelogenous leukemia, multiple myeloma; adult T-cell leukemia/lymphoma; intraepithelial neoplasms including Bowen's disease and Paget's disease; liver cancer; lung cancer; lymphomas including Hodgkin's disease and lymphocytic lymphomas; neuroblastoma;
melanoma; oral cancer including squamous cell carcinoma; ovarian cancer including ovarian cancer arising from epithelial cells, stromal cells, germ cells and mesenchymal cells; pancreatic cancer; prostate cancer; rectal cancer; sarcomas including angiosarcoma, gastrointestinal stromal tumors,
leiomyosarcoma, rhabdomyosarcoma, liposarcoma, fibrosarcoma, and
osteosarcoma; renal cancer including renal cell carcinoma and Wilms tumor; skin cancer including basal cell carcinoma and squamous cell cancer; testicular cancer including germinal tumors such as seminoma, non-seminoma (teratomas, choriocarcinomas), stromal tumors, and germ cell tumors; thyroid cancer including thyroid adenocarcinoma and medullary carcinoma. It will be appreciated that a variety of different tumor types can arise in certain organs, which may differ with regard to, e.g., clinical and/or pathological features and/or molecular markers.
Tumors arising in a variety of different organs are discussed, e.g., in the WHO Classification of Tumours series, 4th ed, or 3rd ed (Pathology and Genetics of Tumours series), by the International Agency for Research on Cancer (IARC), WHO Press, Geneva, Switzerland, all volumes of which are incorporated herein by reference. In some embodiments a cancer is one for which mutation or
overexpression of particular genes is known or suspected to play a role in
development, progression, recurrence, etc., of a cancer. In some embodiments such genes are targets for genetic modification according to methods described herein. In some embodiments a gene is an oncogene, proto-oncogene, or tumor suppressor gene. The term "oncogene" encompasses nucleic acids that, when expressed, can increase the likelihood of or contribute to cancer initiation or progression. Normal cellular sequences ("proto-oncogenes") can be activated to become oncogenes (sometimes termed "activated oncogenes") by mutation and/or aberrant expression. In various embodiments an oncogene can comprise a complete coding sequence for a gene product or a portion that maintains at least in part the oncogenic potential of the complete sequence or a sequence that encodes a fusion protein. Oncogenic mutations can result, e.g., in altered (e.g., increased) protein activity, loss of proper
regulation, or an alteration (e.g., an increase) in RNA or protein level. Aberrant expression may occur, e.g., due to chromosomal rearrangement resulting in juxtaposition to regulatory elements such as enhancers, epigenetic mechanisms, or due to amplification, and may result in an increased amount of proto-oncogene product or production in an inappropriate cell type. Proto-oncogenes often encode proteins that control or participate in cell proliferation, differentiation, and/or apoptosis. These proteins include, e.g., various transcription factors, chromatin remodelers, growth factors, growth factor receptors, signal transducers, and apoptosis regulators. A tumor suppressor gene (TSG) may be any gene wherein a loss or reduction in function of an expression product of the gene can increase the likelihood of or contribute to cancer initiation or progression. Loss or reduction in function can occur, e.g., due to mutation or epigenetic mechanisms. Many TSGs encode proteins that normally function to restrain or negatively regulate cell proliferation and/or to promote apoptosis. Exemplary oncogenes include, e.g., MYC, SRC, FOS, JUN, MYB, RAS, RAF, ABL, ALK, AKT, TRK, BCL2, WNT, HER2/NEU, EGFR, MAPK, ERK, MDM2, CDK4, GLI1, GLI2, IGF2, TP53, etc. Exemplary TSGs include, e.g., RB, TP53, APC, NF1, BRCA1, BRCA2, PTEN, CDK inhibitory proteins (e.g., p16, p21), PTCH, WT1, etc. It will be understood that a number of these oncogene and TSG names encompass multiple family members and that many other TSGs are known. In some embodiments any such gene may be genetically modified, e.g., to generate a cancer model, which may be used, e.g., to determine the effect of particular alterations on development of cancer, to determine the effect of particular alterations on efficacy of or resistance to treatment, to identify or characterize existing or potential candidate therapeutic agents, etc. Similar methods are envisioned for genes associated with other diseases.
[00160] In some embodiments a disease is a cardiovascular disease, e.g., atherosclerotic heart disease or vessel disease, congestive heart failure, myocardial infarction, cerebrovascular disease, peripheral artery disease, cardiomyopathy.
[00161] In some embodiments a disease is a psychiatric, neurological, or neurodevelopmental disease, e.g., schizophrenia, depression, bipolar disorder, epilepsy, autism, addiction. Neurodegenerative diseases include, e.g., Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, frontotemporal dementia.
[00162] In some embodiments a disease is an autoimmune disease, e.g., acute disseminated encephalomyelitis, alopecia areata, antiphospholipid syndrome, autoimmune hepatitis, autoimmune myocarditis, autoimmune pancreatitis, autoimmune polyendocrine syndromes, autoimmune uveitis, inflammatory bowel disease (Crohn's disease, ulcerative colitis), type I diabetes mellitus (e.g., juvenile onset diabetes), multiple sclerosis, scleroderma, ankylosing spondylitis, sarcoid, pemphigus vulgaris, pemphigoid, psoriasis, myasthenia gravis, systemic lupus erythematosus, rheumatoid arthritis, juvenile arthritis, psoriatic arthritis, Behcet's syndrome, Reiter's disease, Berger's disease, dermatomyositis, polymyositis, antineutrophil cytoplasmic antibody-associated vasculitides (e.g., granulomatosis with polyangiitis (also known as Wegener's granulomatosis), microscopic polyangiitis, and Churg-Strauss syndrome), Sjogren's syndrome, anti-glomerular basement membrane disease (including Goodpasture's syndrome), dilated cardiomyopathy, primary biliary cirrhosis, thyroiditis (e.g., Hashimoto's thyroiditis, Graves' disease), transverse myelitis, and Guillain-Barré syndrome.
[00163] In some embodiments a disease is a respiratory disease, e.g., allergy affecting the respiratory system, asthma, chronic obstructive pulmonary disease, pulmonary hypertension, pulmonary fibrosis, and sarcoidosis.
[00164] In some embodiments a disease is a renal disease, e.g., polycystic kidney disease, lupus, nephropathy (nephrosis or nephritis) or glomerulonephritis (of any kind).
[00165] In some embodiments a disease is vision loss or hearing loss, e.g., associated with advanced age.
[00166] In some embodiments a disease is an infectious disease, e.g., any disease caused by a virus, bacteria, fungus, or parasite. In some embodiments it is of interest to modify genes that may be involved in susceptibility to the disease.
[00167] It will be understood that classification of diseases herein is not intended to be limiting. One of ordinary skill in the art will appreciate that various diseases may be appropriately classified in multiple different groups.
[00168] In some embodiments a disease is one for which at least one genome-wide association (GWA) study (GWAS) has been performed. In some embodiments a GWAS types multiple "cases" (subjects having a disease of interest or particular manifestations thereof) and "controls" (subjects not having the disease or manifestations) for several thousand to millions, e.g., 1 million or more, e.g., 1-5 million or more, alleles (e.g., single nucleotide polymorphisms) positioned throughout the genome or a substantial portion thereof (e.g., at least 80%, 90%, 95%, or more of the genome). It will be understood that control data may be obtained from historical data. Genotyping may be performed using microarrays or other methods. Alleles associated (e.g., in a statistically significant manner) with increased (or decreased) risk of a disease (or particular manifestations) may thereby be identified. It will be appreciated that statistical results may be corrected for multiple hypothesis testing, e.g., using methods known in the art. In some embodiments a p value of less than about 10^-7, 10^-8, or 10^-9 is considered evidence of association. In some embodiments a gene or allele or polymorphism has been identified as contributing to disease risk or severity in at least one GWAS. See, e.g., http://www.genome.gov/gwastudies for examples of GWAS studies and genetic variants (alleles, polymorphisms) associated with various diseases. In some embodiments a gene (or any sequence) is one for which an allele or polymorphism is associated with an increased or decreased risk of developing a disease of at least 1.1, 1.2, 1.5, 2, 3, 4, 5, 7.5, 10, or more, relative to individuals not having the allele or polymorphism. In some embodiments an allele or polymorphism is associated with an increased or decreased risk of developing a disease of at least 1.1, 1.2, 1.5, 2, 3, 4, 5, 7.5, 10, or more, relative to individuals not having the allele or polymorphism.
Genes, alleles, polymorphisms, or genetic loci that may contribute to any phenotypic trait of interest such as longevity, weight, resistance to infection, response or lack thereof to various therapeutic agents, resistance or susceptibility to potentially harmful substances such as toxins or infectious agents (e.g., viruses, bacteria, fungi, parasites), are of interest. A phenotypic trait may be a physical sign (such as blood pressure), a biochemical marker, which in some embodiments may be detectable in a body fluid such as blood, saliva, urine, tears, etc., such as level of a metabolite, LDL, etc., wherein an abnormally low or high level of the marker may correlate with having or not having the disease or with susceptibility to or protection from a disease.
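The multiple-testing correction and relative-risk figures mentioned in the preceding paragraph can be made concrete with a small calculation. This is a minimal sketch, assuming a Bonferroni correction (one of several known correction methods; the disclosure does not specify which is used) and entirely hypothetical case counts; the function names are illustrative only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def relative_risk(cases_with, total_with, cases_without, total_without):
    """Risk of disease in allele carriers divided by risk in non-carriers."""
    return (cases_with / total_with) / (cases_without / total_without)


# Assumption: family-wise alpha of 0.05 across 1 million tested alleles.
threshold = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical counts: 30 of 1000 carriers and 20 of 1000 non-carriers affected.
rr = relative_risk(30, 1000, 20, 1000)

print(f"per-test p-value threshold: {threshold:.1e}")
print(f"relative risk: {rr:.2f}")
```

With 1 million tests the corrected threshold falls in the same range as the p value cutoffs named above, and the hypothetical counts correspond to a relative risk of about 1.5.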
[00169] In some embodiments a sequence to be inserted into a genome encodes a tag. The sequence may be inserted into a gene in an appropriate position such that a fusion protein comprising the tag is produced. The term "tag" is used in a broad
sense to encompass any of a wide variety of polypeptides. In some embodiments, a tag comprises a sequence useful for purifying, expressing, solubilizing, and/or detecting a polypeptide. In some embodiments a tag may serve multiple functions. In some embodiments a tag is a relatively small polypeptide, e.g., ranging from a few amino acids up to about 100 amino acids long. In some embodiments a tag is more than 100 amino acids long, e.g., up to about 500 amino acids long, or more. In some embodiments, a tag comprises an HA, TAP, Myc, 6XHis, Flag, V5, or GST tag, to name a few examples. A tag (e.g., any of the afore-mentioned tags) that comprises an epitope against which an antibody, e.g., a monoclonal antibody, is available (e.g., commercially available) or known in the art may be referred to as an "epitope tag". In some embodiments a tag comprises a solubility-enhancing tag (e.g., a SUMO tag, NusA tag, SNUT tag, a Strep tag, or a monomeric mutant of the Ocr protein of bacteriophage T7). See, e.g., Esposito D and Chatterjee DK. Curr Opin Biotechnol.; 17(4):353-8 (2006). In some embodiments, a tag is cleavable, so that at least a portion of it can be removed, e.g., by a protease. In some
embodiments, this is achieved by including a protease cleavage site in the tag, e.g., adjacent or linked to a functional portion of the tag. Exemplary proteases include, e.g., thrombin, TEV protease, Factor Xa, PreScission protease, etc. In some embodiments, a "self-cleaving" tag is used. See, e.g., PCT/US05/05763. In some embodiments, a tag comprises a fluorescent polypeptide (e.g., GFP or a derivative thereof such as enhanced GFP (EGFP)) or an enzyme that can act on a substrate to produce a detectable signal, e.g., a fluorescence or colorimetric signal. Luciferase (e.g., a firefly, Renilla, or Gaussia luciferase) is an example of such an enzyme. Examples of fluorescent proteins include GFP and derivatives thereof, proteins comprising chromophores that emit light of different colors such as red, yellow, and cyan fluorescent proteins, etc. A tag, e.g., a fluorescent protein, may be monomeric. In certain embodiments a fluorescent protein is, e.g., Sirius, Azurite, EBFP2, TagBFP, mTurquoise, ECFP, Cerulean, TagCFP, mTFP1, mUkG1, mAG1, AcGFP1, TagGFP2, EGFP, mWasabi, EmGFP, TagYFP, EYFP, Topaz, SYFP2, Venus, Citrine, mKO, mKO2, mOrange, mOrange2, TagRFP, TagRFP-T, mStrawberry, mRuby, mCherry, mRaspberry, mKate2, mPlum, mNeptune, mTomato, T-Sapphire, mAmetrine, mKeima. See, e.g., Chalfie, M. and Kain, SR (eds.) Green fluorescent protein: properties, applications, and protocols (Methods of biochemical analysis, v. 47). Wiley-Interscience, Hoboken, N.J., 2006, and/or Chudakov, DM, et al, Physiol Rev. 90(3): 1103-63, 2010 for discussion of GFP and numerous other fluorescent or luminescent proteins. In some embodiments a tag may comprise a domain that binds to and/or acts as a sensor of a small molecule (e.g., a metabolite) or ion, e.g., calcium, chloride, or of intracellular voltage, pH, or other conditions. Any genetically encodable sensor may be used; a number of such sensors are known in the art. In some embodiments a FRET-based sensor may be used. In some embodiments different genes are modified to incorporate different tags, so that proteins encoded by the genes are distinguishably labeled. For example, between 2 and 20 distinct tags may be introduced. In some embodiments the tags have distinct emission and/or absorption spectra. In some embodiments a tag may absorb and/or emit light in the infrared or near-infrared region. It will be understood that any nucleic acid sequence encoding a tag may be codon-optimized for expression in a cell, zygote, embryo, or animal into which it is to be introduced.
[00170] In some embodiments it may be of interest to express fragments or domains of a protein, which may act in a dominant negative manner and may, for example, disrupt normal function or interaction of the protein.
[00171] In some embodiments a gene of interest encodes a protein the
aggregation of which is associated with one or more diseases, which may be referred to as protein misfolding diseases. Examples include, e.g., alpha-synuclein
(Parkinson's disease and related disorders), amyloid beta or tau (Alzheimer's disease), TDP-43 (frontotemporal dementia, ALS).
[00172] In some embodiments a gene of interest encodes a transcription factor, a transcriptional co-activator or co-repressor, an enzyme, a chaperone, a heat shock factor, a heat shock protein, a receptor, a secreted protein, a transmembrane protein, a histone (e.g., H1, H2A, H2B, H3, H4), a peripheral membrane protein, a soluble protein, a nuclear protein, a mitochondrial protein, a growth factor, a cytokine (e.g., an interleukin, e.g., any of IL-1 - IL-33), an interferon (e.g., alpha, beta, or gamma), or a chemokine (e.g., a CC, CXC, C (or XC), or CX3C chemokine). A chemokine may be CCL1 - CCL28, CXCL1 - CXCL17, XCL1 or XCL2, or CX3CL1. In some embodiments a gene encodes a colony-stimulating factor, a hormone (e.g., insulin, thyroid hormone, growth hormone, estrogen, progesterone, testosterone), an extracellular matrix protein (e.g., collagen, fibronectin), a motor protein (e.g., dynein, myosin), a cell adhesion molecule, a major or minor histocompatibility (MHC) gene, a transporter, a channel (e.g., an ion channel), an immunoglobulin (Ig) superfamily (IgSF) gene (e.g., a gene encoding an antibody, T cell receptor, B cell receptor), tumor necrosis factor, an NF-kappaB protein, an integrin, a cadherin superfamily member (e.g., a cadherin), a selectin, a clotting factor, a complement factor, a plasminogen, or a plasminogen activating factor. Growth factors include, e.g., members of the vascular endothelial growth factor (VEGF, e.g., VEGF-A, VEGF-B, VEGF-C, VEGF-D), epidermal growth factor (EGF), insulin-like growth factor (IGF; IGF-1, IGF-2), fibroblast growth factor (FGF, e.g., FGF1 - FGF22), platelet derived growth factor (PDGF), or nerve growth factor (NGF) families. It will be understood that the afore-mentioned protein families comprise multiple members. Any such member may be used in various embodiments. In some embodiments a growth factor promotes proliferation and/or differentiation of one or more hematopoietic cell types. For example, a growth factor may be CSF1 (macrophage colony-stimulating factor), CSF2 (granulocyte macrophage colony-stimulating factor, GM-CSF), or CSF3 (granulocyte colony-stimulating factor, G-CSF). In some embodiments a gene encodes erythropoietin (EPO). In some embodiments, a gene encodes a neurotrophic factor, i.e., a factor that promotes survival,
development and/or function of neural lineage cells (which term as used herein includes neural progenitor cells, neurons, and glial cells, e.g., astrocytes,
oligodendrocytes, microglia). For example, in some embodiments, the protein is a factor that promotes neurite outgrowth. In some embodiments, the protein is ciliary neurotrophic factor (CNTF) or brain-derived neurotrophic factor (BDNF).
[00173] In some embodiments a gene of interest encodes a polypeptide that is a subunit of any protein that is comprised of multiple subunits.
[00174] An enzyme may be any protein that catalyzes a reaction of a type that has been assigned an Enzyme Commission number (EC number) by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB). Enzymes include, e.g., oxidoreductases, transferases, hydrolases, lyases, isomerases, ligases. Examples include, e.g., kinases (protein kinases, e.g., Ser/Thr kinase, Tyr kinase; lipid kinases, e.g., phosphatidylinositide 3-kinases (PI 3-kinases or PI3Ks)), phosphatases, acetyltransferases, methyltransferases, deacetylases, demethylases, lipases, cytochrome P450s, glucuronidases, recombinases (e.g., Rag-1, Rag-2). An enzyme may participate in the biosynthesis, modification, or degradation of nucleotides, nucleic acids, amino acids, proteins, neurotransmitters, xenobiotics (e.g., drugs) or other macromolecules.
[00175] The mammalian genome encodes at least about 500 different kinases. Kinases can be classified based on the nature of their typical substrates and include protein kinases (i.e., kinases that transfer phosphate to one or more protein(s)), lipid kinases (i.e., kinases that transfer a phosphate group to one or more lipid(s)), nucleotide kinases, etc. Protein kinases (PKs) are of particular interest in certain aspects of the invention. PKs are often referred to as serine/threonine kinases (S/TKs) or tyrosine kinases (TKs) based on their substrate preference. Serine/threonine kinases (EC 2.7.11.1) phosphorylate serine and/or threonine residues while TKs (EC 2.7.10.1 and EC 2.7.10.2) phosphorylate tyrosine residues. A number of "dual specificity" kinases (EC 2.7.12.1) that are capable of phosphorylating both serine/threonine and tyrosine residues are known. The human protein kinase family can be further divided based on sequence/structural similarity into the following groups: (1) AGC kinases - containing PKA, PKC and PKG; (2) CaM kinases - containing the calcium/calmodulin-dependent protein kinases; (3) CK1 - containing the casein kinase 1 group; (4) CMGC - containing CDK, MAPK, GSK3 and CLK kinases; (5) STE - containing the homologs of yeast Sterile 7, Sterile 11, and Sterile 20 kinases; (6) TK - containing the tyrosine kinases; (7) TKL - containing the tyrosine-kinase like group of kinases. A further group referred to as "atypical protein kinases" contains proteins that lack sequence homology to the other groups but are known or predicted to have kinase activity, and in some instances are predicted to have a similar structural fold to typical kinases.
[00176] Receptors include, e.g., G protein coupled receptors, tyrosine kinase receptors, serine/threonine kinase receptors, Toll-like receptors, nuclear receptors, and immune cell surface receptors. In some embodiments a receptor is a receptor for any of the hormones, cytokines, growth factors, or secreted proteins mentioned herein. Numerous G protein coupled receptors (GPCRs) are known in the art. See, e.g., Vroling B, GPCRDB: information system for G protein-coupled receptors. Nucleic Acids Res. 2011 Jan;39(Database issue):D309-19. Epub 2010 Nov 2. The GPCRDB can be found online at http://www.gpcr.org/7tm/. G protein coupled receptors include, e.g., adrenergic, cannabinoid, and purinergic receptors, neuropeptide receptors, and olfactory receptors. Transcription factors (TFs) (sometimes called sequence-specific DNA-binding factors) bind to specific DNA sequences and (alone or in a complex with other proteins) regulate transcription, e.g., activating or repressing transcription. Exemplary TFs are listed, for example, in the TRANSFAC® database, Gene Ontology (http://www.geneontology.org/) or DBD (www.transcriptionfactor.org) (Wilson, et al., DBD - taxonomically broad transcription factor predictions: new content and functionality, Nucleic Acids Research 2008, doi: 10.1093/nar/gkm964). TFs can be classified based on the structure of their DNA binding domains (DBD). For example, in certain embodiments a TF is a helix-loop-helix, helix-turn-helix, winged helix, leucine zipper, bZIP, zinc finger, homeodomain, or beta-scaffold factor with minor groove contacts protein. Transcription factors include, e.g., p53, STAT3, PAS family transcription factors (e.g., HIF family: HIF1A, HIF2A, HIF3A), and aryl hydrocarbon receptor.
[00177] In some embodiments it may be of interest to genetically modify multiple genes that function in the same biological pathway or process, e.g., signal transduction pathway, biosynthetic pathway, xenobiotic metabolizing pathway, anabolic or catabolic pathway, apoptosis, autophagy, endocytosis, exocytosis. In some embodiments an animal generated according to inventive methods is useful for studying drug metabolism. For example, it may be of interest to genetically modify multiple enzymes involved in xenobiotic metabolism (e.g., multiple P450s). In some embodiments an animal generated according to inventive methods is useful for studying the immune system and/or for generating animals that have a humanized immune system or that are immunocompromised and may serve as hosts for cells or tissues from other organisms of the same species or different species.
[00178] The foregoing written specification is considered to be sufficient to enable one skilled in the art to practice the invention. Various modifications of the invention in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description and fall within the scope of the appended claims. The advantages and objects of the invention are not necessarily encompassed by each embodiment of the invention. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments described herein, which fall within the scope of the claims. The scope of the present invention is not to be limited by or to embodiments or examples described above.
[00179] Section headings used herein are not to be construed as limiting in any way. It is expressly contemplated that subject matter presented under any section heading may be applicable to any aspect or embodiment described herein.
[00180] Embodiments or aspects herein may be directed to any agent, composition, article, kit, and/or method described herein. It is contemplated that any one or more embodiments or aspects can be freely combined with any one or more other embodiments or aspects whenever appropriate. For example, any combination of two or more agents, compositions, articles, kits, and/or methods that are not mutually inconsistent, is provided.
[00181] Articles such as "a", "an", "the" and the like, may mean one or more than one unless indicated to the contrary or otherwise evident from the context.
[00182] The phrase "and/or" as used herein in the specification and in the claims, should be understood to mean "either or both" of the elements so conjoined.
Multiple elements listed with "and/or" should be construed in the same fashion, i.e., "one or more" of the elements so conjoined. Other elements may optionally be present other than the elements specifically identified by the "and/or" clause. As used herein in the specification and in the claims, "or" should be understood to have the same meaning as "and/or" as defined above. For example, when used in a list of elements, "or" or "and/or" shall be interpreted as being inclusive, i.e., the inclusion of at least one, but optionally more than one, of a list of elements, and, optionally, additional unlisted elements. Only terms clearly indicative to the contrary, such as "only one of" or "exactly one of", will refer to the inclusion of exactly one element of a number or list of elements. Thus claims that include "or" between one or more members of a group are considered satisfied if one, more than one, or all of the group members are present, employed in, or otherwise relevant to a given product or process unless indicated to the contrary. Embodiments are provided in which exactly one member of the group is present, employed in, or otherwise relevant to a given product or process. Embodiments are provided in which more than one, or all of the group members are present, employed in, or otherwise relevant to a given product or process. Any one or more claims may be amended to explicitly exclude any embodiment, aspect, feature, element, or characteristic, or any combination thereof. Any one or more claims may be amended to exclude any agent, composition, amount, dose, administration route, cell type, target, cellular marker, antigen, targeting moiety, or combination thereof.
[00183] Embodiments in which any one or more limitations, elements, clauses, descriptive terms, etc., of any claim (or relevant description from elsewhere in the specification) is introduced into another claim are provided. For example, a claim that is dependent on another claim may be modified to include one or more elements or limitations found in any other claim that is dependent on the same base claim. It is expressly contemplated that any amendment to a genus or generic claim may be applied to any species of the genus or any species claim that incorporates or depends on the generic claim.
[00184] Where a claim recites a composition, methods of using the composition as disclosed herein are provided, and methods of making the composition according to any of the methods of making disclosed herein are provided. Where a claim recites a method, a composition for performing the method is provided. Where elements are presented as lists or groups, each subgroup is also disclosed. It should also be understood that, in general, where embodiments or aspects is/are referred to herein as comprising particular element(s), feature(s), agent(s), substance(s), step(s), etc., (or combinations thereof), certain embodiments or aspects may consist of, or consist essentially of, such element(s), feature(s), agent(s), substance(s), step(s), etc. (or combinations thereof). It should also be understood that, unless clearly indicated to the contrary, in any methods claimed herein that include more than one step or act, the order of the steps or acts of the method is not necessarily limited to the order in which the steps or acts of the method are recited. Any method of treatment may comprise a step of providing a subject in need of such treatment, e.g., a subject having a disease for which such treatment is warranted. Any method of treatment may comprise a step of diagnosing a subject as being in need of such treatment, e.g., diagnosing a subject as having a disease for which such treatment is warranted.
[00185] Where ranges are given herein, embodiments in which the endpoints are included, embodiments in which both endpoints are excluded, and embodiments in which one endpoint is included and the other is excluded, are provided. It should be assumed that both endpoints are included unless indicated otherwise. Unless otherwise indicated or otherwise evident from the context and understanding of one of ordinary skill in the art, values that are expressed as ranges can assume any specific value or subrange within the stated ranges in various embodiments, to the tenth of the unit of the lower limit of the range, unless the context clearly dictates otherwise. "About" in reference to a numerical value generally refers to a range of values that fall within ±10%, in some embodiments ±5%, in some embodiments ±1%, in some embodiments ±0.5% of the value unless otherwise stated or otherwise evident from the context. In any embodiment in which a numerical value is prefaced by "about", an embodiment in which the exact value is recited is provided. Where an embodiment in which a numerical value is not prefaced by "about" is provided, an embodiment in which the value is prefaced by "about" is also provided. Where a range is preceded by "about", embodiments are provided in which "about" applies to the lower limit and to the upper limit of the range or to either the lower or the upper limit, unless the context clearly dictates otherwise. Where a phrase such as "at least", "up to", "no more than", or similar phrases, precedes a series of numbers, it is to be understood that the phrase applies to each number in the list in various embodiments (it being understood that, depending on the context, 100% of a value, e.g., a value expressed as a percentage, may be an upper limit), unless the context clearly dictates otherwise. For example, "at least 1, 2, or 3" should be understood to mean "at least 1, at least 2, or at least 3" in various embodiments. It will also be understood that any and all reasonable lower limits and upper limits are expressly contemplated.
[00186] Exemplification
[00187] Example 1
[00188] EXPERIMENTAL PROCEDURES
[00189] Procedures for generating sgRNA-expressing vectors
[00190] A bicistronic expression vector expressing Cas9 and sgRNA (Cong et al., Science 339:819-823 (2013)) was digested with BbsI and treated with Antarctic Phosphatase, and the linearized vector was gel-purified. A pair of oligos (Table 6) for each targeting site was annealed, phosphorylated, and ligated to the linearized vector.
[00191] Cell culture and Transfection
[00192] V6.5 mESCs (on a 129/Sv x C57BL/6 F1 hybrid background) were cultured on gelatin-coated plates under standard mESC culture conditions. Cells were transfected with a plasmid expressing mammalian codon-optimized Cas9 and sgRNA (single targeting), or three plasmids expressing Cas9 and sgRNAs targeting Tet1, Tet2, and Tet3 (triple targeting), or five PCR products each coding for an sgRNA targeting Tet1, Tet2, Tet3, Sry, or Uty, along with a plasmid expressing PGK-puroR, using FuGENE HD reagent (Promega) following the manufacturer's instructions. Twelve hours after transfection, mESCs were re-plated at low density on DR4 MEF feeder layers. Puromycin (2 μg/ml) was added one day after replating and removed after 48 hours. After recovering for 4 to 6 days, individual colonies were picked and genotyped by RFLP and Southern blot analysis, and the mES cells remaining on the plate were collected for the Surveyor assay.
[00193] Surveyor assay and RFLP analysis for genome modification
[00194] The Surveyor assay was performed as described by Guschin et al., Methods Mol Biol, 649:247-256 (2010). Genomic DNA from treated and control ES cells or targeted and control mice was extracted. Mouse genomic DNA samples were prepared from tail biopsies. PCR was performed using Tet1, Tet2, and Tet3 specific primers (Table S3) under the following conditions: 95°C for 5 min; 35x (95°C for 30 s, 60°C for 30 s, 68°C for 40 s); 68°C for 2 min; hold at 4°C. PCR products were then denatured, annealed, and treated with Surveyor nuclease (Transgenomic). DNA concentration of each band was measured on an ethidium bromide-stained 10% acrylamide Criterion TBE gel (BioRad) and quantified using ImageJ software. The same PCR products used for the Surveyor assay were used for RFLP analysis. 10 μl of Tet1, Tet2, or Tet3 PCR product was digested with SacI, EcoRV, or XhoI, respectively. Digested DNA was separated on an ethidium bromide-stained agarose gel (2%). For sequencing, PCR products were cloned using the Original TA Cloning Kit (Invitrogen), and mutations were identified by Sanger sequencing.
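The text does not state how band intensities were converted into a cleavage percentage. The sketch below shows one estimate commonly used with mismatch-nuclease assays of this kind, which assumes random reannealing of strands so that the cleaved fraction equals 1 - (1 - p)^2 for a modified fraction p. The function name and the example intensities are hypothetical and are not taken from the specification.

```python
import math


def surveyor_indel_percent(uncut: float, cut_a: float, cut_b: float) -> float:
    """Estimate percent modified alleles from Surveyor band intensities.

    Assumption (not stated in the source): strands reanneal at random, so the
    cleaved fraction f_cut relates to the modified fraction p by
    f_cut = 1 - (1 - p)**2, giving p = 1 - sqrt(1 - f_cut).
    """
    total = uncut + cut_a + cut_b
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    f_cut = (cut_a + cut_b) / total
    return 100.0 * (1.0 - math.sqrt(1.0 - f_cut))


if __name__ == "__main__":
    # Hypothetical intensities: parental band plus the two cleavage products.
    print(round(surveyor_indel_percent(25.0, 40.0, 35.0), 1))
```

The estimate is only as good as the gel quantification and the random-reannealing assumption, so it is best read as an approximate screening value.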
[00195] Dot blot
[00196] DNA was extracted from pre-plated mESCs following standard procedures. DNA was transferred to a nylon membrane using a BioRad slot blot vacuum manifold apparatus. Anti-5hmC antibody (Active Motif, 1:10,000) was used to detect 5hmC following the manufacturer's protocol.
[00197] Production of Cas9 mRNA and sgRNA
[00198] A T7 promoter was added to the Cas9 coding region by PCR amplification using primers Cas9 F and R (Table 6). The T7-Cas9 PCR product was gel-purified and used as the template for in vitro transcription (IVT) using the mMESSAGE mMACHINE T7 ULTRA kit (Life Technologies). A T7 promoter was added to the sgRNA templates by PCR amplification using primers Tet1 F and R, Tet2 F and R, and Tet3 F and R (Table 6). The T7-sgRNA PCR product was gel-purified and used as the template for IVT using the MEGAshortscript T7 kit (Life Technologies). Both the Cas9 mRNA and the sgRNAs were purified using the MEGAclear kit (Life Technologies) and eluted in RNase-free water.
[00199] One cell embryo injection
[00200] All animal procedures were performed according to NIH guidelines and approved by the Committee on Animal Care at MIT. B6D2F1 (C57BL/6 X DBA2) female mice and ICR mouse strains were used as embryo donors and foster mothers, respectively. Super-ovulated female B6D2F1 mice (7-8 weeks old) were mated to B6D2F1 stud males, and fertilized embryos were collected from oviducts. Cas9 mRNA (from 20 ng/μl to 200 ng/μl) and sgRNA (from 20 ng/μl to 50 ng/μl) were injected into the cytoplasm of fertilized eggs with well recognized pronuclei in M2 medium (Sigma). For oligo injection, Cas9 mRNA (100 ng/μl), sgRNA (50 ng/μl) and donor oligos (100 ng/μl) were mixed and injected into zygotes at the pronuclei stage. The injected zygotes were cultured in KSOM with amino acids at 37°C under 5% CO2 in air until the blastocyst stage at 3.5 days. Thereafter, 15-25 blastocysts were transferred into the uterus of pseudopregnant ICR females at 2.5 dpc.
[00201] Southern blotting
[00202] Genomic DNA was separated on a 0.8% agarose gel after restriction digests with the appropriate enzymes, transferred to a nylon membrane (Amersham) and hybridized with 32P random primer (Stratagene)-labeled probes.
[00203] Prediction of potential off-targets
[00204] Potential targets of CRISPR sgRNAs were found using the rules outlined in Mali et al., Science, 339:823-826 (2013). For a 20 nt sgRNA sequence of nnnnn nnMMM MMMMM MMMMM (SEQ ID NO: 135), where M are the seed bases preceding the PAM sequence NGG, four search sequences (MMM MMMMM MMMMM AGG (SEQ ID NO: 136); MMM MMMMM MMMMM CGG (SEQ ID NO: 137); MMM MMMMM MMMMM GGG (SEQ ID NO: 138); MMM MMMMM MMMMM TGG (SEQ ID NO: 139)) were generated. Exact matches to these search sequences in the mouse genome (mm9) were found using bowtie and reported as potential targets of the CRISPR sgRNA.
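The search-sequence construction described above can be sketched in a few lines. This is a minimal illustration, not the procedure actually run: the function names and the example sgRNA are hypothetical, `seed_len` defaults to 13 only because the pattern as printed shows thirteen M positions, and a plain string scan over both strands stands in for the bowtie exact-match search against mm9.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def search_sequences(sgrna: str, seed_len: int = 13) -> List[str]:
    """Build the four seed + NGG search sequences for a 20 nt sgRNA.

    seed_len is an assumption read off the printed pattern (13 'M' bases).
    """
    sgrna = sgrna.upper()
    if len(sgrna) != 20 or set(sgrna) - set("ACGT"):
        raise ValueError("expected a 20 nt sequence over A, C, G, T")
    seed = sgrna[-seed_len:]  # PAM-proximal (3') bases of the guide
    return [seed + n + "GG" for n in "ACGT"]


def find_exact_matches(genome: str, queries: List[str]) -> List[Tuple[int, str, str]]:
    """Return (0-based start, strand, query) for every exact match.

    Illustrative stand-in for an aligner; both strands are scanned.
    """
    genome = genome.upper()
    hits = []
    for query in queries:
        for strand, pattern in (("+", query), ("-", revcomp(query))):
            start = genome.find(pattern)
            while start != -1:
                hits.append((start, strand, query))
                start = genome.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    guide = "ACGTTGCAAGCTTGACCTGA"  # made-up 20 nt guide, not from Table 6
    queries = search_sequences(guide)
    toy_genome = "TTTT" + queries[3] + "AAAA" + revcomp(queries[1]) + "CC"
    for hit in find_exact_matches(toy_genome, queries):
        print(hit)
```

For a whole genome an indexed aligner is the practical choice; the string scan here is only meant to make the matching rule explicit.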
[00205] RESULTS
[00206] Simultaneous targeting of up to five genes in ES cells
[00207] To test the possibility of targeting functionally redundant genes from the same gene family, sgRNAs targeting the Ten-eleven translocation (Tet) family members, Tet1, Tet2 and Tet3, were designed (Fig 1A). Tet proteins (Tet1/2/3) convert 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) in various embryonic and adult tissues, and mutant mice for each of these three genes have been produced by homologous recombination in ES cells (Dawlaty et al., Cell Stem Cell, 9:166-175 (2011); Gu et al., Nature, 477:606-610 (2011); Li et al., Blood, 118:4509-4518 (2011); Moran-Crusio et al., Cancer Cell, 20:11-24 (2011)). To test whether the CRISPR/Cas system could produce targeted cleavage in the mouse genome, plasmids expressing both the mammalian codon-optimized Cas9 and an sgRNA targeting each gene (Cong et al., Science, 339:819-823 (2013); Mali et al., Science, 339:823-826 (2013)) were transfected into mouse ES cells, and the targeted cleavage efficiency was determined by the Surveyor assay (Guschin et al., Methods Mol Biol, 649:247-256 (2010)). All three Cas9-sgRNA transfections produced cleavage at target loci with high efficiency: 36% at Tet1, 48% at Tet2, and 36% at Tet3 (Fig 1B). Because each target locus contains a restriction enzyme recognition site (Fig 1A), a ~500 bp fragment around each target site was PCR amplified, and the PCR products were digested with the respective enzyme. A correctly targeted allele will lose the restriction site, which can be detected by failure to cleave upon enzyme treatment. Using this restriction fragment length polymorphism (RFLP) assay, 48 ES cell clones from each single targeting experiment were screened. Consistent with the Surveyor analysis, a high percentage of mESC clones were targeted, with a high probability of having both alleles mutated (Fig 5A). The results summarized in Table 1 demonstrate that between about 65% and about 81% of the tested ES cell clones carried mutations in the Tet genes, with up to about 77% having mutations in both alleles.
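The logic of the RFLP screen, in which a mutated allele is one whose amplicon no longer contains the recognition site, can be illustrated as follows. This is a simplified sketch: the recognition sequences are the standard ones for these enzymes, while the function names and the amplicon sequences are hypothetical and far shorter than the ~500 bp products described.

```python
# Standard recognition sequences; all are palindromic, so scanning one
# strand is sufficient.
SITES = {
    "SacI": "GAGCTC",
    "EcoRV": "GATATC",
    "XhoI": "CTCGAG",
    "EcoRI": "GAATTC",
}


def site_intact(amplicon: str, enzyme: str) -> bool:
    """True if the amplicon still carries the enzyme's recognition site."""
    return SITES[enzyme] in amplicon.upper()


def mutated_allele_count(alleles, enzyme: str) -> int:
    """Count alleles that would resist digestion (site lost)."""
    return sum(not site_intact(allele, enzyme) for allele in alleles)


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    wild_type = "ACGTACGT" + "GAGCTC" + "TTGACCAA"
    deletion = "ACGTACGT" + "GAG" + "TTGACCAA"  # 3 bp deletion within the site
    print(mutated_allele_count([wild_type, deletion], "SacI"))
    print(mutated_allele_count([deletion, deletion], "SacI"))
```

An indel that leaves the recognition site intact would be scored as unmodified by this test, which is one reason the text pairs RFLP with Southern blotting and sequencing.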
[00208] The high efficiency of single gene modification prompted testing of simultaneous targeting of all three genes. For this, ES cells were co-transfected with the constructs expressing Cas9 and three sgRNAs targeting Tet1, Tet2 and Tet3. Of 96 clones screened using the RFLP assay, 20 clones were identified as having mutations in all six alleles of the three genes (Fig 1C, 5B, Table 1). To exclude the possibility that PCR bias produced false positive results, Southern blot analysis was performed and complete agreement with the RFLP results was confirmed (Fig 1C). The PCR products of the Tet1, Tet2, and Tet3 targeted regions were sub-cloned and sequenced to verify that all eight tested clones carried biallelic mutations in all three genes, with most clones displaying two mutant alleles for each gene with small insertions or deletions (indels) at the target site (Fig 1D). To test whether these mutant alleles would abolish the function of Tet proteins, the 5hmC levels of targeted clones were compared to those of wt mES cells. Previously, a depletion of 5hmC in Tet1/Tet2 double knockout mES cells derived using traditional gene targeting methods was reported (Dawlaty et al., Cell Stem Cell, 9:166-175 (2011)). As expected from loss of function alleles, a significant reduction of 5hmC levels was found in all clones carrying biallelic mutations in the three genes (Fig 1E).
[00209] Recently, efficient targeting of two Y-linked genes, Sry and Uty, using TALENs was demonstrated (Wang et al., in press). To further test the potential of multiplexed gene targeting by the CRISPR/Cas system, sgRNAs targeting these two Y-linked genes were designed (Figure 5C). Short PCR products encoding sgRNAs targeting all five genes (Tet1, Tet2, Tet3, Sry, and Uty) were pooled and co-transfected with a Cas9 expressing plasmid and the PGK puroR cassette into ES cells. Of 96 clones that were screened using the RFLP assay, 10% carried mutations in all eight alleles of the five genes (Figure 5D, Table 4), demonstrating the capacity of the CRISPR/Cas9 system for highly efficient multiplexed gene targeting.
[00210] One step generation of single gene mutant mice by zygote injection
[00211] Whether mutant mice could be generated in vivo by direct embryo manipulation was tested. Capped polyadenylated Cas9 mRNA was produced by in vitro transcription and co-injected with sgRNAs. Initially, to determine the optimal concentration of Cas9 mRNA for targeting in vivo, varying amounts of Cas9-encoding mRNA were injected with Tet1 targeting sgRNA at constant concentration (20 ng/μl) into pronuclear (PN) stage one-cell mouse embryos, and the frequency of altered alleles at the blastocyst stage was assessed using the RFLP assay. As expected, higher concentrations of Cas9 mRNA led to more efficient gene disruption (Fig. 6A). Nevertheless, even embryos injected with the highest amount of Cas9 mRNA (200 ng/μl) showed normal blastocyst development, indicating low toxicity.
[00212] To investigate whether postnatal mice carrying targeted mutations could be generated, sgRNAs targeting Tet1 or Tet2 were co-injected with different concentrations of Cas9 mRNA. Blastocysts derived from the injected embryos were transplanted into foster mothers and newborn pups were obtained. As summarized in Table 2, about 10% of the transferred blastocysts developed to birth independent of the RNA concentrations used for injection, indicating low fetal toxicity of the Cas9 mRNA and sgRNA. RFLP, Southern blot, and sequencing analysis demonstrated that between 50 and 90% of the postnatal mice carried biallelic mutations in either target gene (Figs. 2A, 2B, 2C, Table 2).
[00213] Surprisingly, specific Δ9 Tet1 and specific Δ8 and Δ15 Tet2 mutant alleles were repeatedly recovered in independently derived mice. Preferential generation of these alleles is likely caused by a short sequence repeat flanking the DSB (see Figure 6B), consistent with a previous report demonstrating that perfect microhomology sequences flanking the cleavage sites can generate precise deletions by the microhomology-mediated end joining (MMEJ) repair mechanism (McVey and Lee, Trends Genet, 24:529-538 (2008); Symington and Gautier, Annu Rev Genet, 11:636-646 (2011)). A similar observation was also made when TALEN mRNA was injected into one-cell rat embryos (Tesson et al., Nat Biotechnol, 29:695-696 (2011)).
[00214] Blastocysts were also derived from zygotes injected with Cas9 mRNA and Tet3 sgRNA. Genotyping of the blastocysts demonstrated that, of eight embryos, three were homozygous and three were heterozygous Tet3 mutants (two failed to amplify) (Fig 6C). Some blastocysts were implanted into foster mothers and, upon C-section, multiple mice of smaller size (Fig 6D), many of which died soon after delivery, were readily identified. Genotyping shown in Fig 6E indicated that all pups with mutations in both Tet3 alleles died neonatally. Only two out of 15 mice survived, and these were either Tet3 heterozygous mutants or wt (Fig 6F). These results are consistent with the lethal neonatal phenotype of Tet3 knockout mice generated using traditional methods (Gu et al., Nature, 477:606-610 (2011)), although which of the Tet3 mutations produced loss of function rather than hypomorphic alleles has not been established.
[00215] One step generation of double gene mutant mice by zygote injection
[00216] To test whether Tet1/Tet2 double mutant mice could be produced from single embryos, Tet1 and Tet2 sgRNAs were co-injected with 20 or 100 ng/μl Cas9 mRNA into zygotes. A total of 28 pups were born from 144 embryos transferred into foster mothers (21% live birth rate) that had been injected at the zygote stage with high concentrations of RNA (Cas9 mRNA at 100 ng/μl, sgRNAs at 50 ng/μl), consistent with low or no toxicity of the Cas9 mRNA and sgRNAs (Table 3). RFLP, Southern blot analysis and sequencing identified 22 mice carrying targeted mutations at all four alleles of the Tet1 and Tet2 genes (Fig 2D, 2E), with the remaining mice carrying mutations in a subset of alleles (Table 3). Injection of zygotes with a low concentration of RNA (Cas9 mRNA at 20 ng/μl, sgRNAs at 20 ng/μl) yielded 19 pups from 75 transferred embryos (about 25% live birth rate), which is a higher survival rate than from embryos injected with 100 ng/μl of Cas9 mRNA. Nevertheless, more than about 50% of the pups were biallelic Tet1/Tet2 double mutants (Table 3). These results demonstrate that postnatal mice carrying biallelic mutations in two different genes can be generated within one month with high efficiency (Fig 2F).
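The live birth rates quoted above are simple ratios of pups born to embryos transferred, and can be recomputed from the counts given in the paragraph. The helper below is a hypothetical illustration. Direct division gives roughly 19.4% for 28 of 144 and 25.3% for 19 of 75; the first value is slightly below the 21% stated in the text, which may reflect rounding or a different denominator that the specification does not explain.

```python
def live_birth_rate(pups_born: int, embryos_transferred: int) -> float:
    """Percent of transferred embryos that developed to birth."""
    if embryos_transferred <= 0:
        raise ValueError("embryos_transferred must be positive")
    return 100.0 * pups_born / embryos_transferred


if __name__ == "__main__":
    # Counts taken from the paragraph above.
    for label, pups, embryos in (
        ("high RNA dose (100/50 ng/ul)", 28, 144),
        ("low RNA dose (20/20 ng/ul)", 19, 75),
    ):
        print(f"{label}: {live_birth_rate(pups, embryos):.1f}%")
```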
[00217] Although the high live birth rate and normal development of mutant mice indicate low toxicity of the CRISPR/Cas9 system, off-target effects in vivo were also assessed. Previous work in vitro, in bacteria, and in cultured human cells suggested that the protospacer-adjacent motif (PAM) sequence NGG and the 8-12 base "seed sequence" at the 3' end of the sgRNA are most important for determining the DNA cleavage specificity (Cong et al., Science, 339:819-823 (2013); Jiang et al., Nat Biotechnol, 31:233-239 (2013); Jinek et al., Science, 337:816-821 (2012)). Based on this rule, only three and four potential off-targets exist in the mouse genome for the Tet1 and Tet2 sgRNAs, respectively (Table 5, Experimental procedures), with each of them perfectly matching the 12 bp seed sequence at the 3' end and the NGG PAM sequence of the sgRNA (there is no potential off-target site for the Tet3 sgRNA using this prediction rule). From seven double mutant mice produced by injection with the high RNA concentration, ~400 bp fragments from all seven potential off-target loci were PCR amplified, and no cleavage was found in the Surveyor assay (Figs. 7A-7B), indicating a high specificity of the CRISPR/Cas system.
[00218] Multiplexed precise HDR-mediated genome editing in vivo
[00219] The NHEJ-mediated gene mutations described above produced mutant alleles with different and unpredictable insertions and deletions of variable size. The possibility of precise homology directed repair (HDR)-mediated genome editing by co-injecting Cas9 mRNA, sgRNAs and single stranded DNA oligos into one-cell embryos was explored. For this, an oligo targeting Tet1 so as to change two base pairs of a SacI restriction site and create instead an EcoRI site, and a second oligo targeting Tet2 with two base pair changes that would convert an EcoRV site into an EcoRI site, were designed (Fig 3A). Blastocysts were derived from zygotes injected with Cas9 mRNA and sgRNAs and oligos targeting Tet1 or Tet2, respectively. DNA was isolated, amplified and digested with EcoRI to detect oligo mediated HDR events. Six out of nine Tet1 targeted embryos and nine out of 15 Tet2 targeted embryos incorporated an EcoRI site at the respective target locus, with several embryos having both alleles modified (Fig 8A). When Cas9 mRNA, sgRNAs, and single stranded DNA oligos targeting both Tet1 and Tet2 were co-injected into zygotes, out of 14 embryos, four were identified that were targeted with the oligo at the Tet1 locus, seven that were targeted with the oligo at the Tet2 locus, and one embryo (#2) that had one allele of each gene correctly modified (Fig 8B). All four alleles of embryo #2 were sequenced, confirming that one allele of each gene contained the 2 bp changes directed by the oligo, while the other alleles were disrupted by NHEJ-mediated deletion and insertion (Fig 8C).
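The two-base-pair design described above can be checked against the standard recognition sequences of the enzymes involved: SacI (GAGCTC) and EcoRV (GATATC) each differ from EcoRI (GAATTC) at exactly two positions. The sketch below lists those positions; the function name is hypothetical, and the actual donor oligo sequences are not reproduced here.

```python
from typing import List, Tuple

# Standard recognition sequences for the enzymes named in the text.
SACI, ECORV, ECORI = "GAGCTC", "GATATC", "GAATTC"


def site_changes(old_site: str, new_site: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, old base, new base) where two sites differ."""
    if len(old_site) != len(new_site):
        raise ValueError("sites must have equal length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(old_site, new_site))
        if a != b
    ]


if __name__ == "__main__":
    print("SacI  -> EcoRI:", site_changes(SACI, ECORI))
    print("EcoRV -> EcoRI:", site_changes(ECORV, ECORI))
```

Because both conversions destroy the original site and create an EcoRI site, a single amplicon can be scored for HDR (EcoRI cuts) and for NHEJ disruption (neither the original enzyme nor EcoRI cuts).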
[00220] Blastocysts with double oligo injections were implanted into foster mothers and a total of 10 pups were born from 48 embryos transferred (21% live birth rate). Upon RFLP analysis using EcoRI, seven mice containing EcoRI sites at the Tet1 locus and eight mice containing EcoRI sites at the Tet2 locus were identified, with six mice containing EcoRI sites at both the Tet1 and Tet2 loci (Fig 3B). RFLP analysis using SacI and EcoRV at the Tet1 and Tet2 loci, respectively, was also applied, showing that all alleles not targeted by oligos contained disruptions, which is consistent with the high biallelic mutation rate obtained by Cas9 mRNA and sgRNA injection. These results were confirmed by sequencing, demonstrating mutations in all four alleles of mouse #5 and #7 (Fig 3C). The results herein demonstrate that mice with HDR-mediated precise mutations in multiple genes can be generated in one step by CRISPR/Cas mediated genome editing.
[00221] Table 1. CRISPR/Cas mediated gene targeting in V6.5 mES cells
[00222] Plasmids encoding Cas9 and sgRNAs targeting Tet1, Tet2, and Tet3 were transfected separately (single targeting) or in a pool (triple targeting) into mES cells. The number of total alleles mutated in each mES cell clone is listed from 0 to 2 for the single targeting experiments, and 0 to 6 for the triple targeting experiment. The number of clones containing each specific number of mutated alleles is shown in relation to the total number of clones screened in each experiment.
[00223] Table 2. CRISPR/Cas mediated single gene targeting in BDF2 mice.
[00224] Cas9 mRNA and sgRNAs targeting Tet1, Tet2, or Tet3 were injected into fertilized eggs. The blastocysts derived from injected embryos were transplanted into foster mothers and newborn pups were obtained and genotyped. The number of total alleles mutated in each mouse is listed from 0 to 2. The number of mice containing each specific number of mutated alleles is shown in relation to the total number of mice screened in each experiment.
sgRNA | Dose of Cas9 mRNA/gRNA (ng/μl) | Blastocysts/Injected zygotes | Transferred embryos (recipients) | Newborns (dead) | Mutant alleles per mouse/Total mice tested*: 2 | 1 | 0
Tet1 | 200/20 | 38/50 | 19(1) | 2(0) | 2/2 | 0/2 | 0/2
Tet1 | 100/20 | 50/60 | 25(1) | 3(0) | 2/3 | 0/3 | 1/3
Tet1 | 50/20 | 40/50 | 40(2) | 8(3) | 4/7 | 2/7 | 1/7
Tet1 | 100/50 | 167/198 | 60(3) | 12(2) | 9/11 | 1/11 | 1/11
Tet2 | 100/50 | 176/203 | 108(5) | 22(3) | 19/20 | 0/20 | 1/20
Tet3 | 100/50 | 85/112 | 64(4) | 15(13) | 9/13 | 2/13 | 2/13
*Some of the pups were cannibalized
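The per-condition biallelic mutation rates implied by Table 2 can be derived from the "2 / 1 / 0 mutant alleles" counts. The following sketch assumes that a mouse scored with two mutant alleles is a biallelic mutant; the tuple layout and the function name `biallelic_percent` are illustrative choices, not part of the source.

```python
# Rows of Table 2: (sgRNA, dose, mice with 2, 1 and 0 mutant alleles).
TABLE_2 = [
    ("Tet1", "200/20", 2, 0, 0),
    ("Tet1", "100/20", 2, 0, 1),
    ("Tet1", "50/20", 4, 2, 1),
    ("Tet1", "100/50", 9, 1, 1),
    ("Tet2", "100/50", 19, 0, 1),
    ("Tet3", "100/50", 9, 2, 2),
]


def biallelic_percent(two, one, zero):
    """Percentage of tested mice carrying two mutant alleles."""
    tested = two + one + zero
    if tested == 0:
        raise ValueError("no mice tested")
    return 100.0 * two / tested


summary = {
    (gene, dose): biallelic_percent(two, one, zero)
    for gene, dose, two, one, zero in TABLE_2
}

for (gene, dose), pct in summary.items():
    print(f"{gene} at {dose} ng/ul: {pct:.0f}% biallelic")
```

With these counts the Tet2 row (19 of 20 mice) gives the highest rate in the table.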
[00225] Table 3. CRISPR/Cas mediated double gene targeting in BDF2 mice.
[00226] Cas9 mRNA and sgRNAs targeting Tet1 and Tet2 were co-injected into fertilized eggs. The blastocysts derived from the injected embryos were transplanted into foster mothers and newborn pups were obtained and genotyped. The number of total alleles mutated in each mouse is listed from 0 to 4 for Tet1 and Tet2. The number of mice containing each specific number of mutated alleles is shown in relation to the number of total mice screened in each experiment.
*Some of the pups were cannibalized
[00227] Table 4 Plasmids encoding Cas9 and five PCR products expressing sgRNAs targeting Tet1, Tet2, Tet3, Sry, and Uty were co-transfected into mES cells. The number of clones containing mutations in all six Tet alleles is listed in the Tet1, 2, 3 column; the number of clones containing mutations in all six Tet alleles and the Sry allele is listed in the Tet1, 2, 3 + Sry column; the number of clones containing mutations in all six Tet alleles and both the Sry and Uty alleles is listed in the Tet1, 2, 3 + Sry + Uty column.
[00228] The increased efficiency of generating Tet1, 2, 3 triple targeted mES clones in this quintuple targeting experiment, compared to the triple targeting experiment (Table 1), is likely due to the use of short PCR products instead of plasmids that express sgRNAs. The much smaller size of pooled PCR products may ensure more efficient delivery into transfected cells. Table 4 is related to Table 1.
[00229] Table 5 Potential off targets of Tet1 and Tet2 sgRNAs
[00230] Table 6 Oligonucleotides used in this study, oligonucleotides used for cloning sgRNA expression vector
[00232] Oligonucleotides used for making template for in vitro transcription
Template | Direction | Sequence (5' to 3')
Cas9 | F | TAATACGACTCACTATAGGGAGAATGGACTATAAGGACCACGAC (SEQ ID NO: 169)
Cas9 | R | GCGAGCTCTAGGAATTCTTAC (SEQ ID NO: 170)
Tet1 sgRNA | F | TTAATACGACTCACTATAGGCTGCTGTCAGGGAGCTC (SEQ ID NO: 171)
Tet1 sgRNA | R | AAAAGCACCGACTCGGTGCC (SEQ ID NO: 172)
Tet2 sgRNA | F | TTAATACGACTCACTATAGGAAAGTGCCAACAGATATCC (SEQ ID NO: 173)
Tet2 sgRNA | R | AAAAGCACCGACTCGGTGCC (SEQ ID NO: 174)
Tet3 sgRNA | F | TTAATACGACTCACTATAGGAAGGAGGGGAAGAGTTCTCG (SEQ ID NO: 175)
Tet3 sgRNA | R | AAAAGCACCGACTCGGTGCC (SEQ ID NO: 176)
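The forward primers listed above each carry a T7 promoter ahead of the sequence to be transcribed in vitro. The sketch below locates that promoter and returns the portion of each primer from the transcription start onward. It assumes the standard T7 promoter core TAATACGACTCACTATAG, whose final G is conventionally taken as position +1; this consensus is general knowledge rather than something stated in the text, and the function name `transcribed_part` is hypothetical. Primer sequences are joined from the wrapped table cells.

```python
# Standard T7 promoter core (assumption); its last base, G, is taken as +1.
T7_PROMOTER = "TAATACGACTCACTATAG"

FORWARD_PRIMERS = {
    "Cas9": "TAATACGACTCACTATAGGGAGAATGGACTATAAGGACCACGAC",
    "Tet1 sgRNA": "TTAATACGACTCACTATAGGCTGCTGTCAGGGAGCTC",
    "Tet2 sgRNA": "TTAATACGACTCACTATAGGAAAGTGCCAACAGATATCC",
    "Tet3 sgRNA": "TTAATACGACTCACTATAGGAAGGAGGGGAAGAGTTCTCG",
}


def transcribed_part(primer):
    """Return the primer sequence starting at the +1 G of the T7 promoter."""
    start = primer.find(T7_PROMOTER)
    if start < 0:
        raise ValueError("no T7 promoter found in primer")
    return primer[start + len(T7_PROMOTER) - 1:]


transcribed = {name: transcribed_part(seq) for name, seq in FORWARD_PRIMERS.items()}
for name, seq in transcribed.items():
    print(f"{name}: {seq}")
```

Under these assumptions the Tet1 and Tet2 primer tails contain GAGCTC and GATATC respectively, consistent with the SacI and EcoRV sites used for RFLP analysis of those loci.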
[00233] Oligonucleotides used for HDR-mediated repair through embryo injection
DISCUSSION
[00234] The genetic manipulation of mice is a crucial approach for the study of development and disease. However, the generation of mice with specific mutations is labor intensive and involves gene targeting by homologous recombination in ES cells, the production of chimeric mice and, after germ line transmission of the targeted ES cells, the interbreeding of heterozygous mice to produce the homozygous experimental animals, a process that may take 6 to 12 months or longer (Capecchi, 2005). Producing mice carrying mutations in several genes requires time-consuming intercrossing of single mutant mice. Similarly, the generation of ES cells carrying homozygous mutations in several genes is usually achieved by sequential targeting, a labor-intensive process that necessitates multiple consecutive cloning steps to target the genes and to delete the selectable markers.
[00235] As summarized in Figures 4A-4B and described herein, three different approaches for the generation of mice carrying multiple genetic alterations have been established. Demonstrated herein is that CRISPR/Cas-mediated genome editing in ES cells can generate the simultaneous mutations of several genes with high efficiency, a single-step approach allowing the production of cells with mutations in five different genes (Figure 4A). Three Tet genes were chosen as targets because the respective mutant phenotypes have been well defined previously (Dawlaty et al, Cell Stem Cell, 9: 166-175 (2011); Gu et al, Nature, 477:606-610 (2011)). Cells mutant for Tet1, 2 and 3 were depleted of 5hmC as would be expected for loss of function mutations of the genes (Dawlaty et al, Dev Cell, 24:310-323 (2013)). However, which of the Cas9-mediated gene mutations produced loss of function rather than hypomorphic alleles has not been established.
[00236] Also shown herein is that mouse embryos can be directly modified by injection of Cas9 mRNA and sgRNA into the fertilized egg, resulting in the efficient production of mice carrying biallelic mutations in a given gene. More significantly, co-injection of Cas9 with Tet1 and Tet2 sgRNAs into zygotes produced mice that carried mutations in both genes (Figure 4B, upper panel). It was found that up to about 95% of newborn mice were biallelic mutant in the targeted gene when a single sgRNA was injected, and when two different sgRNAs were co-injected, up to about 80% carried biallelic mutations in both targeted genes. Thus, mice carrying multiple mutations can be generated within 4 weeks, which is a much shorter time frame than can be achieved by conventional consecutive targeting of genes in ES cells and avoids time-consuming intercrossing of single mutant mice.
[00237] The introduction of DSBs by CRISPR/Cas generates mutant alleles with varying deletions or insertions, in contrast to the designed precise mutations created by homologous recombination. The introduction of point mutations into human ES cells, cancer cell lines, and mice by ZFN or TALEN along with a DNA oligo has been demonstrated previously (Chen et al, Nat Methods, 8:753-755 (2011); Soldner et al, Cell, 146:318-331 (2011); Wefers et al, PNAS, USA, 110:3782-3787 (2013)). Demonstrated herein is that CRISPR/Cas mediated targeting is useful to generate mutant alleles with predetermined alterations, and that co-injection of single stranded oligos can introduce designed point mutations into two target genes in one step, allowing for multiplexed gene editing in a strictly controlled manner (Figure 4B, lower panel). This targeting system allows for the production of conditional alleles, or the precise insertion of larger DNA fragments such as GFP markers, so as to generate conditional knockout and reporter mice for specific genes.
[00238] It is likely that a much larger number of genomic loci than targeted in the present work can be modified simultaneously when pooled sgRNAs are introduced. The methods presented here provide for systematic genome engineering in mice, facilitating the investigation of entire signaling pathways, of synthetic lethal phenotypes or of genes that have redundant functions. A particularly interesting application is the possibility to produce mice carrying multiple alterations in candidate loci that have been identified in GWAS studies to play a role in the genesis of multigenic diseases. In summary, CRISPR/Cas mediated genome editing allows for the generation of ES cells and mice carrying multiple genetic alterations and facilitates the genetic dissection of development and complex diseases.
[00239] Example 2
[00240] RNA-programmable DNA binding enzymes (CRISPRzymes)
[00241] Reported herein is the generation of an RNA-guided, programmable transactivator based on the CRISPR/Cas system, CRISPRa, which provides a tool for modulation of a (one or more) nucleic acid sequence, e.g., gene activation, and serves as a proof of principle for CRISPR-based RNA-guided DNA binding enzymes (CRISPRzymes).
[00242] Results
[00243] dCas9ta guided by sgRNA targeting tet binding site activates TetO promoter
[00244] To build a CRISPR/Cas-based transcriptional activator, the H840A mutation was introduced into the human codon-optimized Cas9 nickase to generate nuclease-deficient dCas9 [PMID: 23452860], and a 3x minimal VP16 transcriptional activation domain (TAD) was fused to the C-terminus of the dCas9 protein (Figure 10A) to generate dCas9ta. dCas9ta was first tested on a tdTomato reporter under the control of a Tet-inducible promoter with seven copies of the tet binding site upstream of a CMV minimal promoter (TetO::tdTomato) (Figure 10B). HeLa and NIH3T3 cells carrying the TetO::tdTomato transgene were generated by PiggyBac transposition [PMID: 17576687]. As a positive control for the reporter activity, these cells also constitutively expressed the rtTA-M2 transactivator, which can induce tdTomato expression upon doxycycline treatment (Figure 11C panel ii; Figure 12B). Transfection of dCas9ta with sgRNA (sgTetO) complementary to the tet binding site activated the TetO::tdTomato reporter in the absence of doxycycline (Figure 11C panel iv; Figure 12D). Transfection of dCas9ta without sgRNA did not activate tdTomato expression, indicating that dCas9ta depends on sgRNA to bind to the target tet binding sites to activate tdTomato expression (Figure 11C panel iii; Figure 12C).
[00245] dCas9ta with sgRNA targeting Nanog promoter can activate both the NanogGFP reporter and the endogenous Nanog expression in NIH3T3 cells
[00246] To test whether dCas9ta can activate endogenous gene expression, dCas9ta chimeric expression constructs were designed and cloned with 8 different sgRNAs targeting the Nanog promoter (sgmNanog) and transfected into NIH3T3 cells. For comparison, a NanogGFP plasmid [PMID: 18594521] containing a 1.2 kb promoter of Nanog was co-transfected. Transfection of dCas9ta without sgRNA did not activate the exogenous NanogGFP reporter (Figure 11C; panel ii) or the endogenous Nanog gene (Figure 11B, column ii), while transfection of the 8 constructs expressing dCas9ta and sgmNanog activated the NanogGFP reporter (Figure 11C, panel iii) and endogenous Nanog expression (Figure 11B, column iii).
[00247] dCas9 fusion with P-TEFb components also activates gene expression
[00248] To test whether dCas9 can be used to bring other protein domains to DNA to regulate gene expression, dCas9 was fused to Cdk9 and CycT, two components of the P-TEFb complex involved in transcriptional pause release [PMID: 22986266], and their transactivation activity was tested on the TetO::tdTomato reporter with or without dCas9ta (Figures 13A-13D). Transfection of dCas9ta resulted in 10% tdTomato positive cells. Transfection of both dCas9Cdk9 (pAC72) and dCas9CycT (pAC73) also activated tdTomato expression, though to a lesser extent (2%). Co-transfection of three plasmids, pAC5 (dCas9ta), pAC72 (dCas9Cdk9) and pAC73 (dCas9CycT), with sgTetO resulted in 13% tdTomato positive cells. This additive effect indicates that co-transfection with, or fusion of, additional transcriptional activators or transactivation domains to dCas9ta would likely further augment dCas9ta transactivation activity.
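The additivity noted in this paragraph can be checked with simple arithmetic. The sketch below compares the observed percentage for the triple co-transfection with the plain sum of the two separate transfections; treating additivity as a simple sum is an assumption made here for illustration, and the dictionary keys and function name are hypothetical.

```python
def additive_expectation(*percent_positive):
    """Naive additive model: sum the individual percentages of positive cells."""
    return sum(percent_positive)


# Percent tdTomato-positive cells reported in the text.
observed = {
    "dCas9ta": 10,
    "dCas9Cdk9 + dCas9CycT": 2,
    "dCas9ta + dCas9Cdk9 + dCas9CycT": 13,
}

expected = additive_expectation(
    observed["dCas9ta"], observed["dCas9Cdk9 + dCas9CycT"]
)
difference = observed["dCas9ta + dCas9Cdk9 + dCas9CycT"] - expected

print(f"expected under additivity: {expected}%")
print(f"observed minus expected: {difference} percentage point(s)")
```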
[00249] Materials and Methods
[00250] Cloning
[00251] A two-step fusion PCR was used to amplify the Cas9 nickase ORF without a stop codon from the pX335 vector and to incorporate the H840A mutation, an EcoRI-AgeI restriction site on the 5' end and an FseI site on the 3' end (EcoRI-AgeI-dCas9-FseI fragment). The 3x minimal VP16 activation domain coding fragment (TAD) was excised from a vector (Addgene: 20342) containing the NLSM2rtTA coding sequence by FseI and EcoRI digestion (FseI-TA-EcoRI fragment). The two fragments were ligated into the pCR8/GW/TOPO (Invitrogen) vector digested by EcoRI to generate pAC1, which contains the dCas9ta gene. The dCas9ta coding sequence was subsequently excised from pAC1 and cloned into the pX335 vector (Addgene: 42335) by AgeI-EcoRI digestion to replace the Cas9 nickase and create a chimeric vector, pAC2, that expresses both dCas9ta and the sgRNA. sgRNA spacers were cloned into the BbsI-digested pAC2 vector. For example, sgRNA targeting TetO (sgTet) was cloned by ligating phosphorylated and annealed oligos sgTet-F: caccGCTTTTCTCTATCACTGATA (SEQ ID NO: 179) and sgTet-R: aaacTATCAGTGATAGAGAAAAGC (SEQ ID NO: 180) into the BbsI-digested pAC2 vector to generate pAC5. To replace the 3x minimal activation domain (3xmTAD) in the dCas9ta protein with other protein domains, the FseI-EcoRI fragment from pAC5 or pAC1 was replaced by PCR amplicons of different domains or genes with FseI and EcoRI sites added to the primer sequences. dCas9 was cloned by PCR amplification of dCas9ta with a reverse primer positioned before the 3xTA domains and cloned into pCR8GWTOPO to create pAC84 and into pAC5 to create pAC89. Non-chimeric versions of dCas9 fusions were generated by LR Clonase-mediated recombination to a pmax-DEST vector (pAC90).
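The relationship between a spacer and the pair of oligos annealed for BbsI cloning can be sketched in a few lines. The example below reproduces the pattern visible in the sgTet oligos given above (a lowercase `cacc` overhang on the forward oligo, and a lowercase `aaac` overhang followed by the reverse complement of the spacer on the reverse oligo). The function names are hypothetical, and the sketch is illustrative rather than a description of any software used in this work.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def bbsi_oligos(spacer):
    """Return (forward, reverse) oligos with BbsI-compatible overhangs."""
    spacer = spacer.upper()
    if not spacer or set(spacer) - set("ACGT"):
        raise ValueError("spacer must contain only A, C, G and T")
    forward = "cacc" + spacer
    reverse = "aaac" + reverse_complement(spacer)
    return forward, reverse


# Spacer taken from the sgTet oligos in the text.
fwd, rev = bbsi_oligos("GCTTTTCTCTATCACTGATA")
print("sgTet-F:", fwd)
print("sgTet-R:", rev)
```

Applied to the sgTet spacer, this yields the same two oligo sequences as those listed for sgTet-F and sgTet-R.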
[00252] A reporter assay for dCas9ta activity
[00253] A TetO::tdTomato (plasmid pAC3) transgene and an EF1a::NLSM2rtTA (plasmid pAC4) transgene were delivered into NIH3T3 (mouse) and HeLa (human) cells by PiggyBac transposition. sgRNAs were designed to target the TetO binding site (sgTetO). pmaxGFP (Clontech) was used as a transfection control. Transfection was performed using FuGene HD following the manufacturer's instructions.
[00254] qRT expression analysis
[00255] Pellets were snap-frozen and stored at -80°C. RNA was prepared from the pellets using the RNeasy kit (QIAGEN). cDNA was produced with Superscript III RT (Life Technology). qRT reactions were performed in triplicate using Gapdh as a control.
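The text does not state how the triplicate qRT measurements were converted into relative expression. One common approach, shown here purely as an assumed illustration, is the 2^(-ΔΔCt) calculation with Gapdh as the reference gene. The function name, argument names and the Ct values in the usage lines are hypothetical placeholders and are not data from this study.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change by the 2^(-ddCt) method, averaging technical replicates.

    Each argument is a list of Ct values (e.g. triplicates); the reference
    gene is assumed to be Gapdh.
    """
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    return 2 ** -(d_ct_sample - d_ct_control)


# Placeholder Ct values for illustration only (NOT measured data).
fold = relative_expression(
    ct_target_sample=[26.0, 26.0, 26.0],   # target gene, treated cells
    ct_ref_sample=[18.0, 18.0, 18.0],      # Gapdh, treated cells
    ct_target_control=[30.0, 30.0, 30.0],  # target gene, control cells
    ct_ref_control=[18.0, 18.0, 18.0],     # Gapdh, control cells
)
print(f"fold change relative to control: {fold}")
```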
[00256] qRT primers:
[00257] sgRNA designs, DNA targets, oligos, and plasmids used to target different DNA. The last three bases are the PAM (5'-NGG-3') motif. Lowercase letters in the target sequences indicate changes made (first g) to allow efficient U6 transcription or for mutational analysis (other changes). Lowercase letters in the oligo sequences indicate overhangs compatible with the BbsI-digested vectors. Target gene names with an m prefix indicate mouse genes while those with an h prefix indicate human genes.
Name | Target | Target sequence | Forward oligo | Reverse oligo | Plasmid
sgTet | Tet binding site | Gcttttctctatcactgataggg (SEQ ID NO: 189) | caccGCTTTTCTCTATCACTGATA (SEQ ID NO: 190) | aaacTATCAGTGATAGAGAAAAGC (SEQ ID NO: 191) | pAC5
sgmNanog-1 | mNanog promoter | GTAATGCAAAAGAAGCTGTAAGG (SEQ ID NO: 192) | caccGTAATGCAAAAGAAGCTGTA (SEQ ID NO: 193) | aaacTACAGCTTCTTTTGCATTAC (SEQ ID NO: 194) | pAC34
sgmNanog-2 | mNanog promoter | GATCTCTAGTGGGAAGTTTCAGG (SEQ ID NO: 195) | caccGATCTCTAGTGGGAAGTTTC (SEQ ID NO: 196) | aaacGAAACTTCCCACTAGAGATC (SEQ ID NO: 197) | pAC35
sgmNanog-3 | mNanog promoter | GCTCTTCACATTGGGAAACCTGG (SEQ ID NO: 198) | caccGCTCTTCACATTGGGAAACC (SEQ ID NO: 199) | aaacGGTTTCCCAATGTGAAGAGC (SEQ ID NO: 200) | pAC36
sgmNanog-4 | mNanog promoter | GAGTGTTTAAATTAATGTAGAGG (SEQ ID NO: 201) | caccGAGTGTTTAAATTAATGTAG (SEQ ID NO: 202) | aaacCTACATTAATTTAAACACTC (SEQ ID NO: 203) | pAC37
sgmNanog-5 | mNanog promoter | GAGTTTCACGTACCCGAGACTGG (SEQ ID NO: 204) | caccGAGTTTCACGTACCCGAGAC (SEQ ID NO: 205) | aaacGTCTCGGGTACGTGAAACTC (SEQ ID NO: 206) | pAC38
sgmNanog-6 | mNanog promoter | GCTTCTGTGTATAAGCAGAGAGG (SEQ ID NO: 207) | caccGCTTCTGTGTATAAGCAGAG (SEQ ID NO: 208) | aaacCTCTGCTTATACACAGAAGC (SEQ ID NO: 209) | pAC39
sgmNanog-7 | mNanog promoter | GCGTTAAAAAGCCGCACTTTTGG (SEQ ID NO: 210) | caccGCGTTAAAAAGCCGCACTTT (SEQ ID NO: 211) | aaacAAAGTGCGGCTTTTTAACGC (SEQ ID NO: 212) | pAC47
sgmNanog-8 | mNanog promoter | GTCTGTAGAAAGAA (SEQ ID NO: 213) | caccGTCTGTAGAAAGAATGGAAG (SEQ ID NO: 214) | aaacCTTCCATTCTTTCTACAGAC (SEQ ID NO: 215) | pAC48
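As the table legend notes, the last three bases of each target sequence are the PAM (5'-NGG-3'). The sketch below splits a target into protospacer and PAM and checks the NGG pattern; the function name `split_target` is hypothetical, and the example uses the sgmNanog-1 target from the table.

```python
import re

# 5'-NGG-3' PAM, with N restricted to the four standard bases.
PAM_PATTERN = re.compile(r"[ACGT]GG")


def split_target(target):
    """Split a target into (protospacer, PAM) and validate the NGG PAM."""
    target = target.upper()
    if len(target) <= 3:
        raise ValueError("target is too short to contain a PAM")
    protospacer, pam = target[:-3], target[-3:]
    if not PAM_PATTERN.fullmatch(pam):
        raise ValueError(f"last three bases {pam!r} are not an NGG PAM")
    return protospacer, pam


# sgmNanog-1 target sequence from the table above.
protospacer, pam = split_target("GTAATGCAAAAGAAGCTGTAAGG")
print("protospacer:", protospacer)
print("PAM:", pam)
```

For sgmNanog-1 the protospacer returned here matches the spacer portion of the forward oligo listed in the same table row.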
[00258] Genbank files
Plasmid name | Description
pAC1 | dCas9ta on pCR8GWTOPO
pAC2 | Dual expression construct expressing both dCas9ta and sgRNA from U6 promoter
pAC3 | TetO::tdTomato PiggyBac
pAC4 | EF1a::NLSM2rtTA PiggyBac
pAC5 | Dual expression construct expressing both dCas9ta and sgTetO
pAC34 | Dual expression construct expressing both dCas9ta and sgmNanog1
pAC35 | Dual expression construct expressing both dCas9ta and sgmNanog2
pAC36 | Dual expression construct expressing both dCas9ta and sgmNanog3
pAC37 | Dual expression construct expressing both dCas9ta and sgmNanog4
pAC38 | Dual expression construct expressing both dCas9ta and sgmNanog5
pAC39 | Dual expression construct expressing both dCas9ta and sgmNanog6
pAC47 | Dual expression construct expressing both dCas9ta and sgmNanog7
pAC48 | Dual expression construct expressing both dCas9ta and sgmNanog8
pAC71 | Dual expression construct expressing both dCas9Apobec1 and sgTetO
pAC72 | Dual expression construct expressing both dCas9Cdk9 and sgTetO
pAC73 | Dual expression construct expressing both dCas9CycT and sgTetO
pAC76 | Dual expression construct expressing both dCas9Gadd45a and sgTetO
pAC78 | Dual expression construct expressing both dCas9mCBPHAT and sgTetO
pAC80 | Dual expression construct expressing both puroR and sgTetO (control)
pAC81 | dCas9ER on pCR8GWTOPO
pAC82 | dCas9p300HAT on pCR8GWTOPO
pAC83 | dCas9Tet1CD on pCR8GWTOPO
pAC84 | dCas9 on pCR8GWTOPO
pAC85 | dCas9EnR on pmax expression vector
pAC86 | dCas9p300HAT on pmax expression vector
pAC87 | dCas9Tet1CD on pmax expression vector
pAC88 | dCas9ta on pmax expression vector
pAC89 | Dual expression construct expressing both dCas9 and sgTetO
[00259] CRISPRzyme dCas9 fusion peptides
[00260] Sequences: dCas9TA peptide sequence is shown below. The underlined sequence indicates the 3x VP16 minimal transactivation domains.
[00261] dCas9TA peptide sequence (underlined sequence indicates the 3x VP16 minimal transactivation domains)
[00262] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKK LIGALLFDSGETAEATRLKRTARRRYTR
RK RICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KK GLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAK LSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSK GYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDK LPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQK SRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDK RGKSDNVPSEEVVKKMK YWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREI NYHHAHDAYLNAVVGTALIK YPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKRNSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
G AP AAFKYFDTTIDRKRYT STKE VLD ATLIHQ SITGL YETRIDL S QLGGD SPK
KKR VEASGPADALDDFDLDMLPADALDDFDLDMLPADALDDFDLDMLP
G- (SEQ ID NO: 1)
[00263] dCas9Apobec1
[00264] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK YPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKRNSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGR RML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
G AP AAFKYFDTTIDRKRYT STKE VLD ATLIHQ SITGL YETRIDL S QLGGD SPK
KKRKVEASGPAMSSETGPVAVDPTLRRRIEPHEFEVFFDPRELRKETCLLYEI
NWGGRHSVWRHTSQNTSNHVEVNFLEKFTTERYFRPNTRCSITWFLSWSPC
GECSRAITEFLSRHPYVTLFIYIARLYHHTDQRNRQGLRDLISSGVTIQIMTEQ
EYCYCWRNFVNYPPSNEAYWPRYPHLWVKLYVLELYCIILGLPPCLKILRR
KQPQLTFFTITLQTCHYQRI PPHLLWATGLK (SEQ ID NO: 2)
[00265] dCas9Cdk9
[00266] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK YPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKRNSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGR RML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
G AP AAFKYFDTTIDRKRYT STKE VLD ATLIHQ SITGL YETRIDL S QLGGD SPK
KKRKVEASGPAMAKQYDSVECPFCDEVTKYEKLAKIGQGTFGEVFKAKHR
QTGQKVALK VLMENEKEGFPITALREIKILQLLKHENVVNLIEICRTKASPY
NRCKGSIYLVFDFCEHDLAGLLSNVLVKFTLSEIKRVMQMLLNGLYYIHRN
KILHRDMKAANVLITRDGVLKLADFGLARAFSLAK SQPNRYTNRVVTLW
YRPPELLLGERDYGPPIDLWGAGCIMAEMWTRSPIMQGNTEQHQLALISQL
CGSITPEVWPNVDKYELFEKLELVKGQKRKVKDRLKAYVRDPYALDLIDKL
LVLDPAQRIDSDDALNHDFFWSDPMPSDLKGMLSTHLTSMFEYLAPPRRKG
SQITQQSTNQSRNPATTNQTEFERVF(SEQ ID NO: 3)
[00267] dCas9CycT
[00268] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK YPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKR SDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
G AP AAFKYFDTTIDRKRYT STKE VLD ATLIHQ SITGL YETRIDL S QLGGD SPK
KKRKVEASGPAMEGERKNNNKRWYFTREQLENSPSRRFGVDSDKELSYRQ
QAANLLQDMGQRLNVSQLTINTAIVYMHRFYMIQSFTQFHRYSMAPAALFL
AAKVEEQPKKLEHVIKVAHTCLHPQESLPDTRSEAYLQQVQDLVILESIILQT
LGFELTIDHPHTHVVKCTQLVRASKDLAQTSYFMATNSLHLTTFSLQYTPPV
VACVCIHLACKWSNWEIPVSTDGKHWWEYVDATVTLELLDELTHEFLQILE
KTPSRLKRIRNWRAYQAAMKTKPDDRGADENTSEQTILNMISQTSSDTTIAG
LMSMSTASTSAVPSLPSSEESSSSLTSVDMLQGERWLSSQPPFKLEAAQGHR
TSESLALIGVDHSLQQDGSSAFGSQKQASKSVPSAKVSLKEYRAKHAEELAA
QKRQLENME AN VKS Q Y AY AAQNLL SHD SHS S VILKMPIE S SENPERPFLDK
ADKSALKMRLPVASGDKAVSSKPEEIKMRIKVHSAGDKHNSIEDSVTKSRE
HKEKQRTHPSNHHHHHNHHSHRHSHLQLPAGPVSKRPSDPKHSSQTSTLAH
KT YSLS STLS S SS STRKRGPPEETGAAVFDHP AKIAKSTKS SLNFPFPPLPTMT
QLPGHSSDTSGLPFSQPSCKTRVPHMKLDKGPPGANGHNATQSIDYQDTVN
MLHSLLSAQGVQPTQAPAFEFVHSYGEYMNPRAGAISS
RSGTTDKPRPPPLPSEPPPPLPPLPK (SEQ ID NO: 4)
[00269] dCas9EnR
[00270] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDK LPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNR VTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR FMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQK SRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDK RGKSDNVPSEEVVKKMK YWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREI NYHHAHDAYLNAVVGTALIK YPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKRNSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
G AP AAFKYFDTTIDRKRYT STKE VLD ATLIHQ SITGL YETRIDL S QLGGD SPK
KKRKVEASGPALEDRCSPQSAPSPITLQMQHLHHQQQQQQQQQQQMQHLH
QLQQLQQLHQQQLAAGVFHHPAMAFDAAAAAAAAAAAAAAHAHAAALQ
QRLSGSGSPASCSTPASSTPLTIKEEESDSVIGDMSFHNQTHTTNEEEEAEED
DDIDVDVDDTSAGGRLPPPAHQQQSTAKPSLAFSISNILSDRFGDVQKPGKSI
ENQASIFRPFEANRSQTATPSAFTRVDLLEFSRQQQAAAAAATAAMMLERA
NFLNCFNPAAYPRIHEEIVQSRLRRSAANAVIPPPMSSKMSDANPEKSALGS
MQPKLEQKLISEEDLN (SEQ ID NO: 5)
[00271] dCas9Gadd45a
[00272] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDK LPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQK SRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDK RGKSDNVPSEEVVKKMK YWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKRNSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
G AP AAFKYFDTTIDRKRYT STKE VLD ATLIHQ SITGL YETRIDL S QLGGD SPK
KKRKVE ASGP AMTLEEF S AAEQKTERMDT VGD ALEE VL SKARS QRTIT VG V
YEAAKLLNVDPDNVVLCLLAADEDDDRDVALQIHFTLIRAFCCENDINILRV
SNPGRL AELLLLEND AGP AE S GG AAQTPDLHC VL VTNPHS S Q WKDP ALS QL
ICFCRESRYMDQWVP VINLPER (SEQ ID NO: 6)
[00273] dCas9p300HAT
[00274] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTR SEETITPWNFEEVVDKGASAQSFIE
RMTNFDK LPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQK SRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDK RGKSDNVPSEEVVKKMK YWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKR SDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
G AP AAFKYFDTTIDRKRYT STKE VLD ATLIHQ SITGL YETRIDL S QLGGD SPK
KKRKVEASGPAYRQDPESLPFRQPVDPQLLGIPDYFDIVKSPMDLSTIKRKLD
TGQYQEPWQYIDDIWLMF NAWLYNRKTSRVYKYCSKLSEVFEQEIDPVM
QSLGYCCGRKLEFSPQTLCCYGKQLCTIPRDATYYSYQNRYHFCEKCFNEIQ
GESVSLGDDPSQPQTTINKEQFSKRK DTLDPELFVECTECGRKMHQICVLH
HEIIWPSGFVCDGCLK TARTRKENKLSAKRLPSTRLGTFLENRVNDFLRRQ
NHPESGEVTVRVVHASDKTVEVKPGMKARFVDSGEMAESFPYRTKALFAF
EEIDGVDLCFFGMHVQEYGSDCPPPNQRRVYISYLDSVHFFRPKCLRTAVYH
EILIGYLEYVKKLGYTTGHIWACPPSEGDDYIFHCHPPDQKIPKPKRLQEWY
KKMLDKAV SERI VHD YKDILKQ ATEDRLT S AKELP YFEGDF WPN VLEE SIKE
LEQEEEERKREENTSNESTDVTKGDSKNAKKKNNK TSK KSSLSRGNKK
PGMPNVSNDLSQKLYATMEKHKEVFFVIRLIACPAPNSLPPIVDPDPLIPCDL
MD GRD AFLTL ARDKHLEF S SLRRAQ WSTMCML VELHTQ S QDRF V YTCNEC
KHHVETRWHCTVCEDYDLCITCYNTK HDH KMEK (SEQ ID NO: 7)
[00275] dCas9mTet1CD
[00276] MYPYDVPDYASPKK R VEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKK LIGALLFDSGETAEATRLKRTARRRYTR
RK RICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KK GLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAK LSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSK GYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDK LPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQK SRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDK RGKSDNVPSEEVVKKMK YWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREI NYHHAHDAYLNAVVGTALIK YPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKR SDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
G AP AAFKYFDTTIDRKRYT STKE VLD ATLIHQ SITGL YETRIDL S QLGGD SPK
KKRKVEASGPAEAAPCDCDGGTQKEKGPYYTHLGAGPSVAAVRELMETRF
GQKGKAIRIEKIVFTGKEGKSSQGCPVAKWVIRRSGPEEKLICLVRERVDHH
CSTAVIVVLILLWEGIPRLMADRLYKELTENLRSYSGHPTDRRCTLNKKRTC
TCQGIDPKTCGASFSFGCSWSMYFNGCKFGRSENPRKFRLAPNYPLHNYYK
RITGMSSEGSDVKTGWIIPDRKTLISREEKQLEK LQELATVLAPLYKQMAP
VAYQNQVEYEEVAGDCRLGNEEGRPFSGVTCCMDFCAHSHKDIHNMHNGS
TVVCTLIRADGRDTNCPEDEQLHVLPLYRLADTDEFGSVEGMKAKIKSGAIQ
VNGPTRKRRLRFTEPVPRCGKRAKMKQNHNKSGSHNTKSFSSASSTSHLVK
DESTDFCPLQASSAETSTCTYSKTASGGFAETSSILHCTMPSGAHSGANAAA
GECTGTVQPAEVAAHPHQSLPTADSPVHAEPLTSPSEQLTSNQSNQQLPLLS
NSQKLASCQVEDERHPEADEPQHPEDDNLPQLDEFWSDSEEIYADPSFGGV
AIAPIHGSVLIECARKELHATTSLRSPKRGVPFRVSLVFYQHKSLNKPNHGFD
INKIKCKCK VTKKKPADRECPDVSPEANLSHQIPSRVASTLTRDNVVTVSP
YSLTHVAGPYNRWV (SEQ ID NO: 8)
[00277] dCas9hAICDA
[00278] MYPYDVPDYASPKKKRKVEASDK YSIGLAIGTNSVGWAVITDE
YKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRK
NRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYH
EKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDV
DKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEK
NGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQ
YADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKAL
VRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELL
VKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKI
LTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERM
TNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQK
AIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLL
KIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQL
KRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLT
FKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKVMGRH
KPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQL
QNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVL
TRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGG
LSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLK
SKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK YPKLESEFVY
GDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI
ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESILPKR
NSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELL
GITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA
GELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE
IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPA
AFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRK
VEASGPAMDSLLMNRRKFLYQFK VRWAKGRRETYLCYVVKRRDSATSFS
LDFGYLRNK GCHVELLFLRYISDWDLDPGRCYRVTWFTSWSPCYDCARH
VADFLRGNPNLSLRIFTARLYFCEDRKAEPEGLRRLHRAGVQIAIMTFKDYF
YCWNTFVENHERTFKAWEGLHENSVRLSRQLRRILLPLYEVDDLRDAFRTL
GL (SEQ ID NO: 9)
[00279] dCas9hDNMT1
[00280] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKRNSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
GAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPK
KKRKVE AS GP AMP ART AP ARVPTL AVP AI SLPDD VRRRLKDLERD SLTEKE
CVKEKLNLLHEFLQTEIK QLCDLETKLRKEELSEEGYLAKVKSLLNKDLSL
ENGAHAYNREVNGRLENGNQARSEARRVGMADANSPPKPLSKPRTPRRSK
SDGEAKRSRDPPASASQVTGIRAEPSPSPRITRKSTRQTTITSHFAKGPAKRKP
QEESERAKSDESIKEEDKDQDEKRRRVTSRERVARPLPAEEPERAKSGTRTE
KEEERDEKEEKRLRS QTKEPTPKQKLKEEPDRE ARAG VQ ADEDED GDEKDE
KKHRSQPKDLAAKRRPEEKEPEKVNPQISDEKDEDEKEEKRRKTTPKEPTEK
KMARAKTVMNSKTHPPKCIQCGQYLDDPDLKYGQHPPDAVDEPQMLTNEK
LSIFDANESGFESYEALPQHKLTCFSVYCKHGHLCPIDTGLIEK IELFFSGSA
KPIYDDDPSLEGGVNGK LGPINEWWITGFDGGEKALIGFSTSFAEYILMDPS
PEYAPIFGLMQEKIYISKIVVEFLQSNSDSTYEDLINKIETTVPPSGLNLNRFTE
DSLLRHAQFVVEQVESYDEAGDSDEQPIFLTPCMRDLIKLAGVTLGQRRAQ
ARRQTIRHSTREKDRGPTKATTTKLVYQIFDTFFAEQIEKDDREDKENAFKR
RRCGVCEVCQQPECGKCKACKDMVKFGGSGRSKQACQERRCPNMAMKEA
DDDEEVDDNIPEMPSPK MHQGKK KQNK RISWVGEAVKTDGK SYYK
KVCIDAETLEVGDCVSVIPDDSSKPLYLARVTALWEDSSNGQMFHAHWFCA
GTDTVLGATSDPLELFLVDECEDMQLSYIHSKVKVIYKAPSENWAMEGGM
DPESLLEGDDGKTYFYQLWYDQDYARFESPPKTQPTEDNKFKFCVSCARLA
EMRQKEIPRVLEQLEDLDSRVLYYSATK GILYRVGDGVYLPPEAFTFNIKL
SSPVKRPRKEPVDEDLYPEHYRKYSDYIKGSNLDAPEPYRIGRIKEIFCPK S
NGRPNETDIKIRVNKFYRPENTHKSTPASYHADINLLYWSDEEAVVDFKAV
QGRCTVEYGEDLPECVQVYSMGGPNRFYFLEAYNAKSKSFEDPPNHARSPG
NKGKGKGKGKGKPKS Q ACEP SEPEIEIKLPKLRTLD VF S GCGGL SEGFHQ AG
ISDTLWAIEMWDPAAQAFRL NPGSTVFTEDCNILLKLVMAGETTNSRGQR
LPQKGDVEMLCGGPPCQGFSGMNRFNSRTYSKFK SLVVSFLSYCDYYRPR
FFLLENVRNFVSFKRSMVLKLTLRCLVRMGYQCTFGVLQAGQYGVAQTRR
RAIILAAAPGEKLPLFPEPLHVFAPRACQLSVVVDDKKFVSNITRLSSGPFRTI
TVRDTMSDLPEVRNGASALEISYNGEPQSWFQRQLRGAQYQPILRDHICKD
MSALVAARMRHIPLAPGSDWRDLPNIEVRLSDGTMARKLRYTHHDRK GR
SSSGALRGVCSCVEAGKACDPAARQFNTLIPWCLPHTGNRHNHWAGLYGR
LEWDGFFSTTVTNPEPMGKQGRVLHPEQHRVVSVRECARSQGFPDTYRLFG
NILDKHRQVGNAVPP PLAKAIGLEIKLCMLAKARESASAKIKEEEAAKDID
(SEQ ID NO: 10)
[00281] dCas9hDNMT3a
[00282] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKRNSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
GAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPK
KKR VE AS GP AMP AMP S S GPGDT S S S AAEREEDRKDGEEQEEPRGKEERQE
PSTTAR VGRPGRKRKHPPVESGDTPKDPAVISKSPSMAQDSGASELLPNGD
LEKRSEPQPEEGSPAGGQKGGAPAEGEGAAETLPEASRAVENGCCTPKEGR
GAPAEAGKEQKETNIESMKMEGSRGRLRGGLGWESSLRQRPMPRLTFQAG
DPYYISKRKRDEWLARWKREAEKKAKVIAGMNAVEENQGPGESQKVEEAS
PPAVQQPTDPASPTVATTPEPVGSDAGDK ATKAGDDEPEYEDGRGFGIGE
LVWGKLRGFSWWPGRIVSWWMTGRSRAAEGTRWVMWFGDGKFSVVCVE
KLMPLSSFCSAFHQATYNKQPMYRKAIYEVLQVASSRAGKLFPVCHDSDES
DTAKAVEVQNKPMIEWALGGFQPSGPKGLEPPEEEK PYKEVYTDMWVEP
EAAAYAPPPPAKKPRKSTAEKPKVKEIIDERTRERLVYEVRQKCRNIEDICIS
CGSLNVTLEHPLFVGGMCQNCK CFLECAYQYDDDGYQSYCTICCGGREV
LMCGNNNCCRCFCVECVDLLVGPGAAQAAIKEDPWNCYMCGHKGTYGLL
RRREDWPSRLQMFFANNHDQEFDPPKVYPPVPAEKRKPIRVLSLFDGIATGL
LVLKDLGIQVDRYIASEVCEDSITVGMVRHQGKIMYVGDVRSVTQKHIQEW
GPFDLVIGGSPCNDLSIVNPARKGLYEGTGRLFFEFYRLLHDARPKEGDDRP
FFWLFENVVAMGVSDKRDISRFLESNPVMIDAKEVSAAHRARYFWGNLPG
MNRPLASTVNDKLELQECLEHGRIAKFSKVRTITTRSNSIKQGKDQHFPVFM
NEKEDILWCTEMERVFGFPVHYTDVSNMSRLARQRLLGRSWSVPVIRHLFA
PLKEYFACVID (SEQ ID NO: 11)
[00283] dCas9hDNMT3b
[00284] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKR SDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
GAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPK
KKR VE AS GP AMKGDTRHLNGEED AGGRED SIL VNG AC SD Q S SD SPPILE AI
RTPEIRGRRSSSRLSKREVSSLLSYTQDLTGDGDGEDGDGSDTPVMPKLFRE
TRTRSESPAVRTR NNSVSSRERHRPSPRSTRGRQGRNHVDESPVEFPATRSL
RRRATASAGTPWPSPPSSYLTIDLTDDTEDTHGTPQSSSTPYARLAQDSQQG
GMESPQVEADSGDGDSSEYQDGKEFGIGDLVWGKIKGFSWWPAMVVSWK
ATSKRQAMSGMRWVQWFGDGKFSEVSADKLVALGLFSQHFNLATFNKLV
SYRKAMYHALEKARVRAGKTFPSSPGDSLEDQLKPMLEWAHGGFKPTGIE
GLKPNNTQPENKTRRRTADDSATSDYCPAPKRLKTNCYNNGKDRGDEDQS
REQMASDVANNKSSLEDGCLSCGRKNPVSFHPLFEGGLCQTCRDRFLELFY
MYDDDGYQSYCTVCCEGRELLLCSNTSCCRCFCVECLEVLVGTGTAAEAK
LQEPWSCYMCLPQRCHGVLRRRKDWNVRLQAFFTSDTGLEYEAPKLYPAIP
AARRRPIRVLSLFDGIATGYLVLKELGIVGKYVASEVCEESIAVGTVKHEGNI
KYVNDVRNITKK IEEWGPFDLVIGGSPCDLSNVNPARKGLYEGTGRLFFEF
YHLLNYSRPKEGDDRPFFWMFENVVAMKVGDKRDISRFLECNPVMIDAIKV
SAAHRARYFWGNLPGMNRIFGFPVHYTDVSNMGRGARQKLLGRSWSVPVI
RH LFAPLKDYFACEID (SEQ ID NO: 12)
[00285] dCas9hG9a
[00286] MYPYDVPDYASPKK R VEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDK LPNEKVLPKHSLLEYFTVYNELTKVKYVTEGMRKPAFLSGEQ
KKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHD
LLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMK
QLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDS
LTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKVMG
RHKPENIVIEMARENQTTQKGQK SRERMKRIEEGIKELGSQILKEHPVENT
QLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNK
VLTRSDKNRGKSDNVPSEEVVKKMK YWRQLLNAKLITQRKFDNLTKAER
GGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVIT
LKSKLVSDFRKDFQFYKVREI NYHHAHDAYLNAVVGTALK YPKLESEFV
YGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRP
LIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESILPK
R SDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKEL
LGITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLAS
AGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYL
DEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGA
PAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKK
RKVEASGPAMAAAAGAAAAAAAEGEAPAEMGALLLEKETRGATERVHGS
LGDTPRSEETLPKATPDSLEPAGPSSPASVTVTVGDEGADTPVGATPLIGDES
ENLEGDGDLRGGRILLGHATKSFPSSPSKGGSCPSRAKMSMTGAGKSPPSVQ
SLAMRLLSMPGAQGAAAAGSEPPPATTSPEGQPKVHRARKTMSKPGNGQPP
VPEKRPPEIQHFRMSDDVHSLGKVTSDLAKRRKLNSGGGLSEELGSARRSGE
VTLTKGDPGSLEEWETVVGDDFSLYYDSYSVDERVDSDSKSEVEALTEQLS
EEEEEEEEEEEEEEEEEEEEEEEEDEESGNQSDRSGSSGRRKAKK WRKDSP
WVKPSRKRRKREPPRAKEPRGVSNDTSSLETERGFEELPLCSCRMEAPKIDRI
SERAGHKCM ATE S VDGEL S GCN AAILKRETMRP S SRV ALM VLCETHRARM
VKHHCCPGCGYFCTAGTFLECHPDFRVAHRFHKACVSQLNGMVFCPHCGE
DASEAQEVTIPRGDGVTPPAGTAAPAPPPLSQDVPGRADTSQPSARMRGHG
EPRRPPCDPLADTIDSSGPSLTLPNGGCLSAVGLPLGPGREALEKALVIQESE
RRKKLRFHPRQLYLSVKQGELQKVILMLLDNLDPNFQSDQQSKRTPLHAAA
QKGSVEICHVLLQAGANINAVDKQQRTPLMEAVNNHLEVARYMVQRGGC
VYSKEEDGSTCLHHAAKIGNLEMVSLLLSTGQVDVNAQDSGGWTPIIWAAE
HKHIEVIRMLLTRGADVTLTDNEENICLHWASFTGSAAIAEVLLNARCDLHA
VNYHGDTPLHIAARESYHDCVLLFLSRGANPELRNKEGDTAWDLTPERSDV
WFALQLNRKLRLGVGNRAIRTEKIICRDVARGYENVPIPCVNGVDGEPCPED
YKYISENCETSTMNIDRNITHLQHCTCVDDCSSSNCLCGQLSIRCWYDKDGR
LLQEFNKIEPPLIFECNQ AC S C WRNCK RV VQ S GIKVRLQL YRT AKMG WG V
RALQTIPQGTFICEYVGELISDAEADVREDDSYLFDLDNKDGEVYCIDARYY
GNISRFINHLCDPNIIPVRVFMLHQDLRFPRIAFFSSRDIRTGEELGFDYGDRF
WDIKSKYFTCQCGSEKCKHSAEAIALEQSRLARLDPHPELLPELGSLPPVNT
D (SEQ ID NO: 13)
[00287] dCas9VP64
[00288] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLEYFTVYNELTKVKYVTEGMRKPAFLSGEQ
KKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHD
LLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMK
QLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANR FMQLIHDDS
LTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKVMG
RHKPENIVIEMARENQTTQKGQK SRERMKRIEEGIKELGSQILKEHPVENT
QLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVAIVPQSFLKDDSIDNKV
LTRSDK RGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERG
GLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITL
KSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK YPKLESEFV
YGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRP
LIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESILPK
R SDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKEL
LGITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLAS
AGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYL
DEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGA
PAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKK
RKVEASGPAGSGRADALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLD
MLGSDALDDFDLDMLYID (SEQ ID NO: 14)
[00289] dCas9VP96
[00290] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKR SDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
GAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPK
KKRKVEASGPAGSGRADALDDFDLDMLGSDALDDFDLDMLGSDALDDFDL
DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLYID (SEQ
ID NO: 15)
[00291] dCas9VP160
[00292] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKR SDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
GAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPK
KKRKVEASGPAGSGRADALDDFDLDMLGSDALDDFDLDMLGSDALDDFDL
DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDD
FDLDMLGDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLYID
(SEQ ID NO: 16)
[00293] dCas9hINI1
[00294] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKR SDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
GAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPK
KKRKVEASGPAMMMMALSKTFGQKPVKFQLEDDGEFYMIGSEVGNYLRM
FRGSLYKRYPSLWRRLATVEERKKIVASSHGK TKPNTKDHGYTTLATSVT
LLKASEVEEILDGNDEKYKAVSISTEPPTYLREQKAKRNSQWVPTLPNSSHH
LDAVPCSTTINRNRMGRDKKRTFPLCFDDHDPAVIHENASQPEVLVPIRLDM
EIDGQKLRDAFTWNMNEKLMTPEMFSEILCDDLDLNPLTFVPAIASAIRQQIE
SYPTDSILEDQSDQRVIIKLNIHVGNISLVDQFEWDMSEKENSPEKFALKLCS
ELGLGGEFVTTIAYSIRGQLSWHQKTYAFSENPLPTVEIAIRNTGDADQWCP
LLETLTDAEMEKKIRDQDRNTRRMRRLANTAPAW (SEQ ID NO: 17)
[00295] dCas9hMBD4
[00296] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKRNSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
GAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPK
KKRKVEASGPAMGTTGLESLSLGDRGAAPTVTSSERLVPDPPNDLRKEDVA
MELERVGEDEEQMMIKRSSECNPLLQEPIASAQFGATAGTECRKSVPCGWE
RVVKQRLFGKTAGRFDVYFISPQGLKFRSKSSLANYLHK GETSLKPEDFDF
TVLSKRGIKSRYKDCSMAALTSHLQNQS NSNWNLRTRSKCKKDVFMPPSS
SSELQESRGLSNFTSTHLLLKEDEGVDDVNFRKVRKPKGKVTILKGIPIKKTK
KGCRKSCSGFVQSDSKRESVCNKADAESEPVAQKSQLDRTVCISDAGACGE
TLSVTSEENSLVKKKERSLSSGSNFCSEQKTSGIINKFCSAKDSEHNEKYEDT
FLESEEIGTKVEVVERKEHLHTDILKRGSEMDNNCSPTRKDFTGEKIFQEDTI
PRTQIERRKTSLYFSSKYNKEALSPPRRKAFK WTPPRSPFNLVQETLFHDP
WKLLIATIFLNRTSGKMAIPVLWKFLEKYPSAEVARTADWRDVSELLKPLG
LYDLRAKTIVKFSDEYLTKQWKYPIELHGIGKYGNDSYRIFCVNEWKQVHP
EDHKLNKYHDWLWENHEKLS LS (SEQ ID NO: 18)
[00297] dCas9hTDG
[00298] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKR SDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
GAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPK
KKRKVEASGPAMEAENAGSYSLQQAQAFYTFPFQQLMAEAPNMAVVNEQ
QMPEEVPAPAPAQEPVQEAPKGRKRKPRTTEPKQPVEPKKPVESK SGKSA
KSKEKQEKITDTFKVKRKVDRFNGVSEAELLTKTLPDILTFNLDIVIIGINPGL
MAAYKGHHYPGPGNHFWKCLFMSGLSEVQLNHMDDHTLPGKYGIGFTNM
VERTTPGSKDLSSKEFREGGRILVQKLQKYQPRIAVFNGKCIYEIFSKEVFGV
KVK LEFGLQPHKIPDTETLCYVMPSSSARCAQFPRAQDKVHYYIKLKDLR
DQLKGIERNMDVQEVQYTFDLQLAQEDAKKMAVKEEKYDPGYEAAYGGA
YGENPCSSEPCGFSSNGLIESVELRGESAFSGIPNGQWMTQSFTDQIPSFSNH
CGTQEQEEESHA (SEQ ID NO: 19)
[00299] dCas9hTET1CD
[00300] MYPYDVPDYASPKK RKVEASDK YSIGLAIGTNSVGWAVIT
DEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTR
RKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVA
YHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNS
DVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
KKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIG
DQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL
KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTE
ELLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKI
EKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIE
RMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGE
QKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYH
DLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM
KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHD
DSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKV
MGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVE
NTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSID
NKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKA
ERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKV
ITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESE
FVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRK
RPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESIL
PKR SDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVK
ELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRML
ASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKH
YLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNL
GAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPK
KKRKVEASGPAELPTCSCLDRVIQKDKGPYYTHLGAGPSVAAVREIMENRY
GQKGNAIRIEIVVYTGKEGKSSHGCPIAKWVLRRSSDEEKVLCLVRQRTGHH
CPTAVMVVLIMVWDGIPLPMADRLYTELTENLKSYNGHPTDRRCTLNENRT
CTCQGIDPETCGASFSFGCSWSMYFNGCKFGRSPSPRRFRIDPSSPLHEK LE
DNLQSLATRLAPIYKQYAPVAYQNQVEYENVARECRLGSKEGRPFSGVTAC
LDFCAHPHRDIHNM NGSTVVCTLTREDNRSLGVIPQDEQLHVLPLYKLSD
TDEFGSKEGMEAKIKSGAIEVLAPRRKKRTCFTQPVPRSGKKRAAMMTEVL
AHKIRAVEKKPIPRIKRKNNSTTT NSKPSSLPTLGSNTETVQPEVKSETEPHF
ILKSSDNTKTYSLMPSAPHPVKEASPGFSWSPKTASATPAPLK DATASCGF
SERSSTPHCTMPSGRLSGANAAAADGPGISQLGEVAPLPTLSAPVMEPLINSE
PSTGVTEPLTPHQPNHQPSFLTSPQDLASSPMEEDEQHSEADEPPSDEPLSDD
PLSPAEEKLPHIDEYWSDSEHIFLDANIGGVAIAPAHGSVLIECARRELHATTP
VEHPNRNHPTRLSLVFYQHK LNKPQHGFELNKIKFEAKEAK KKMKASE
QKDQAANEGPEQSSEVNELNQIPSHKALTLTHDNVVTVSPYALTHVAGPYN
HWVID (SEQ ID NO: 20)
[00301] dCas9hPRMT1
[00302] MYPYDVPDYASPKKKRKVEASDK YSIGLAIGTNSVGWAVITDE
YKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRK
NRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYH
EKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDV
DKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEK
NGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQ
YADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKAL
VRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELL
VKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKI
LTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERM
TNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQK
AIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLL
KIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQL
KRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLT
FKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKVMGRH
KPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQL
QNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVL
TRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGG
LSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLK
SKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK YPKLESEFVY
GDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI
ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESILPKR
NSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELL
GITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA
GELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE
IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPA
AFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRK
VEASGPAMAAAEAANCIMENFVATLANGMSLQPPLEEVSCGQAESSEKPNA
EDMTSKDYYFDSYAHFGIHEEMLKDEVRTLTYR SMFHNRHLFKDKVVLD
VGSGTGILCMFAAKAGARKVIGIECSSISDYAVKIVKANKLDHVVTIIKGKV
EEVELPVEKVDIIISEWMGYCLFYESMLNTVLYARDKWLAPDGLIFPDRATL
YVTAIEDRQYKDYKIHWWENVYGFDMSCIKDVAIKEPLVDVVDPKQLVTN
ACLIKEVDIYTVKVEDLTFTSPFCLQVKRNDYVHALVAYFNIEFTRCHKRTG
FSTSPESPYTHWKQTVFYMEDYLTVKTGEEIFGTIGMRPNAKNNRDLDFTID
LDFKGQLCELSCSTDYRMRID (SEQ ID NO: 21)
[00303] dCas9hSET7
[00304] MYPYDVPDYASPKKKRKVEASDK YSIGLAIGTNSVGWAVITDE
YKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRK
NRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYH
EKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDV
DKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEK
NGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQ
YADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKAL
VRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELL
VKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKI
LTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERM
TNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQK
AIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLL
KIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQL
KRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLT
FKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKVMGRH
KPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQL
QNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVL
TRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGG
LSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLK
SKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK YPKLESEFVY
GDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI
ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESILPKR
NSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELL
GITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA
GELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE
IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPA
AFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRK
VEASGPAMDSDDEMVEEAVEGHLDDDGLPHGFCTVTYSSTDRFEGNFVHG
EK GRGKFFFFDGSTLEGYYVDDALQGQGVYTYEDGGVLQGTYVDGELNG
PAQEYDTDGRLIFKGQYKDNIRHGVCWIYYPDGGSLVGEVNEDGEMTGEKI
AYVYPDERTALYGKFIDGEMIEGKLATLMSTEEGRPHFELMPGNSVYHFDK
STSSCISTNALLPDPYESERVYVAESLISSAGEGLFSKVAVGPNTVMSFYNGV
RITHQEVDSRDWALNGNTLSLDEETVIDVPEPYNHVSKYCASLGHKANHSF
TPNCIYDMFVHPRFGPIKCIRTLRAVEADEELTVAYGYDHSPPGKSGPEAPE
WYQVELKAFQATQQKID (SEQ ID NO: 22)
[00305] dCas9SID4x
[00306] MYPYDVPDYASPKKKRKVEASDK YSIGLAIGTNSVGWAVITDE
YKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRK
NRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYH
EKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDV
DKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEK
NGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQ
YADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKAL
VRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELL
VKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKI
LTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERM
TNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQK
AIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLL
KIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQL
KRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLT
FKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKVMGRH
KPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQL
QNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVL
TRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGG
LSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLK
SKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK YPKLESEFVY
GDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI
ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESILPKR
NSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELL
GITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA
GELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE
IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPA
AFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRK
VEASGPAASPKKKRKVEASGSGMNIQMLLEAADYLERREREAEHGYASML
PGSGMNIQMLLEAADYLERREREAEHGYASMLPGSGMNIQMLLEAADYLE
RREREAEHGYASMLPGSGMNIQMLLEAADYLERREREAEHGYASMLPSRS
RIDID (SEQ ID NO: 23)
[00307] dCas9hsSssIM
[00308] MYPYDVPDYASPKKKRKVEASDK YSIGLAIGTNSVGWAVITDE
YKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRK
NRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYH
EKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDV
DKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEK
NGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQ
YADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKAL
VRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELL
VKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKI
LTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERM
TNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQK
AIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLL
KIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQL
KRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLT
FKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKVMGRH
KPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQL
QNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVL
TRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGG
LSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLK
SKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK YPKLESEFVY
GDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI
ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESILPKR
NSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELL
GITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA
GELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE
IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPA
AFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRK
VEASGPAMSKVENKTKKLRVFEAFAGIGAQRKALEKVRKDEYEIVGLAEW
YVPAIVMYQAIHNNFHTKLEYKSVSREEMIDYLENKTLSWNSK PVSNGY
WKRKKDDELKIIYNAIKLSEKEGNIFDIRDLYKRTLK IDLLTYSFPCQDLSQ
QGIQKGMKRGSGTRSGLLWEIERALDSTEK DLPKYLLMENVGALLHKK
EEELNQWKQKLESLGYQNSIEVLNAADFGSSQARRRVFMISTLNEFVELPKG
DKKPKSIKKVLNKIVSEKDILNNLLKYNLTEFKKTKSNINKASLIGYSKFNSE
GYVYDPEFTGPTLTASGANSRIKIKDGSNIRKMNSDETFLYMGFDSQDGKRV
NEIEFLTENQKIFVAGNSISVEVLEAIIDKIGG (SEQ ID NO: 24)
[00309] dCas9hsSssIMA (split SssIM fragment A fusion)
[00310] MYPYDVPDYASPKKKRKVEASDK YSIGLAIGTNSVGWAVITDE
YKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRK
NRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYH
EKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDV
DKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEK
NGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQ
YADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKAL
VRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELL
VKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKI
LTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERM
TNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQK
AIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLL
KIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQL
KRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLT
FKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKVMGRH
KPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQL
QNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVL
TRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGG
LSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLK
SKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK YPKLESEFVY
GDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI
ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESILPKR
NSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELL
GITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA
GELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE
IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPA
AFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRK
VEASGPAMSKVENKTKKLRVFEAFAGIGAQRKALEKVRKDEYEIVGLAEW
YVPAIVMYQAIHNNFHTKLEYKSVSREEMIDYLENKTLSWNSK PVSNGY
WKRKKDDELKIIYNAIKLSEKEGNIFDIRDLYKRTLK IDLLTYSFPCQDLSQ
QGIQKGMKRGSGTRSGLLWEIERALDSTEK DLPKYLLMENVGALLHKK
EEELNQWKQKLESLGYQNSIEVLNAADFGSSQARRRVFMISTLNEFVELPKG
DKKPKSIKKVLNKIVSEKDILNNLLKYNLTEFKKTKSNINKASLIGYSKFNSE
GYV (SEQ ID NO: 25)
[00311] dCas9hsSssIMB (split SssIM fragment B fusion)
[00312] MYPYDVPDYASPKKKRKVEASDK YSIGLAIGTNSVGWAVITDE
YKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRK
NRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYH
EKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDV
DKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEK
NGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQ
YADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKAL
VRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELL
VKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKI
LTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERM
TNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQK
AIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLL
KIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQL
KRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLT
FKEDIQKAQVSGQGDSLHEHIANLAGSPAIK GILQTVKVVDELVKVMGRH
KPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQL
QNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVL
TRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGG
LSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLK
SKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIK YPKLESEFVY
GDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI
ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVK TEVQTGGFSKESILPKR
NSDKLIARKKDWDPK YGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELL
GITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASA
GELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDE
IIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPA
AFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSPKKKRK
VEASGPAEFVELPKGDKKPKSIKKVLNKIVSEKDILNNLLKYNLTEFKKTKS
NINKASLIGYSKFNSEGYVYDPEFTGPTLTASGANSRIKIKDGSNIRKMNSDE
TFLYMGFDSQDGKRVNEIEFLTENQKIFVAGNSISVEVLEAIIDKIGG (SEQ ID
NO: 26)
[00313] Example 3
[00314] One step generation of mice carrying reporter and conditional allele by CRISPR/Cas mediated genome editing
[00315] Here, reporter and conditional mutant mice were created by co-injection of zygotes with Cas9 mRNA, different guide RNAs (sgRNAs), and DNA vectors of different sizes. Using this one-step procedure, mice carrying a tag or a fluorescent reporter construct in the Nanog, Sox2, and Oct4 genes, as well as Mecp2 conditional mutant mice, were generated. In addition, using sgRNAs targeting two separate sites in the Mecp2 gene, mice harboring the predicted deletions of about 700 bp were produced. Finally, potential off-targets of four sgRNAs in gene-modified mice and ESC lines were analyzed, and off-target mutations were identified in only rare instances, indicating high specificity of genome editing by the CRISPR/Cas system.
[00316] EXPERIMENTAL PROCEDURES
[00317] Production of Cas9 mRNA and sgRNA
[00318] The bicistronic expression vector px330, which expresses Cas9 and sgRNA (Cong, L., et al., Science 339, 819-823 (2013)), was digested with BbsI and treated with Antarctic Phosphatase, and the linearized vector was gel-purified. A pair of oligos (Table 11) for each targeting site was annealed, phosphorylated, and ligated to the linearized vector.
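The oligo pair for a targeting site can be sketched in Python as follows. This is only an illustration under stated assumptions: the 4-nt overhangs `CACC` and `AAAC` follow the convention commonly used for cloning into BbsI-digested px330 and are not stated in this text, the guide sequence is hypothetical, and the function names are invented for the example. The oligos actually used are those listed in Table 11.

```python
# Assumption: CACC/AAAC overhangs, a common convention for BbsI-cut px330.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_oligo_pair(guide):
    """Build top and bottom oligos that anneal with 4-nt 5' overhangs."""
    guide = guide.upper()
    top = "CACC" + guide                         # top strand, 5' to 3'
    bottom = "AAAC" + reverse_complement(guide)  # bottom strand, 5' to 3'
    return top, bottom


hypothetical_guide = "GATTACAGATTACAGATTAC"  # placeholder, not from Table 11
top_oligo, bottom_oligo = design_oligo_pair(hypothetical_guide)
print(top_oligo)
print(bottom_oligo)
```

Some protocols also prepend a G to the guide when it does not already start with one; that variation is left out of this sketch.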
[00319] A T7 promoter was added to the Cas9 coding region by PCR amplification using primers Cas9 F and R (Table 11). The T7-Cas9 PCR product was gel-purified and used as the template for in vitro transcription (IVT) using the mMESSAGE mMACHINE T7 ULTRA kit (Life Technologies). A T7 promoter was added to the sgRNA template by PCR amplification using the primers listed in Table 11. The T7-sgRNA PCR product was gel-purified and used as the template for IVT using the MEGAshortscript T7 kit (Life Technologies). Both the Cas9 mRNA and the sgRNAs were purified using the MEGAclear kit (Life Technologies) and eluted in RNase-free water.
[00320] Single stranded and double stranded DNA donors
[00321] All single stranded oligos were ordered as Ultramer DNA oligos from Integrated DNA Technologies. The Nanog-2A-mCherry vector was modified from the previously published targeting vector Nanog-2A-mCherry-PGK-Neo (Faddah et al., 2013). Nanog-2A-mCherry-PGK-Neo was digested with PacI and AscI to drop out the PGK-Neo cassette; the 9.7 kb fragment was gel-purified, blunt-ended using T4 DNA polymerase (New England Biolabs), and then self-ligated using T4 DNA ligase (New England Biolabs). The Oct4-IRES-eGFP-PGK-Neo vector was previously published (Lengner et al., 2007).
[00322] Surveyor assay and RFLP analysis for genome modification
[00323] The Surveyor assay was performed as described (Guschin, D.Y., et al., Methods Mol Biol 649, 247-256 (2010)). Genomic DNA from targeted and control mice or embryos was extracted, and PCR was performed using gene-specific primers (Table 11) under the following conditions: 95°C for 5 min; 35x (95°C for 30 s, 60°C for 30 s, 68°C for 40 s); 68°C for 2 min; hold at 4°C. PCR products were then denatured, annealed, and treated with Surveyor nuclease (Transgenomic). The DNA concentration of each band was measured on an ethidium bromide-stained 10% acrylamide Criterion TBE gel (BioRad) and quantified using ImageJ software. For RFLP analysis, 10 μl of Tet1, Tet2, Mecp2-R1, or Mecp2-R2 PCR product was digested with EcoRI, and 10 μl of Mecp2-L1 or Mecp2-L2 PCR product was digested with NheI. Digested DNA was separated on an ethidium bromide-stained agarose gel (2%). For sequencing, PCR products were cloned using the Original TA Cloning Kit (Invitrogen), and mutations were identified by Sanger sequencing.
[00324] One cell embryo injection
[00325] All animal procedures were performed according to NIH guidelines and approved by the Committee on Animal Care at MIT. B6D2F1 (C57BL/6 X DBA2) female mice and ICR mouse strains were used as embryo donors and foster mothers, respectively. Super-ovulated female B6D2F1 mice (7-8 weeks old) were mated to B6D2F1 stud males, and fertilized embryos were collected from oviducts. Cas9 mRNA (100 ng/μl), sgRNA (50 ng/μl) and donor oligos (100 ng/μl) were mixed and injected into the cytoplasm of fertilized eggs with well-recognized pronuclei in M2 medium (Sigma). The injected zygotes were cultured in KSOM with amino acids at 37°C under 5% CO2 in air until the blastocyst stage at 3.5 days. Thereafter, 15-25 blastocysts were transferred into the uterus of pseudopregnant ICR females at 2.5 dpc.
[00326] Southern blotting
[00327] Genomic DNA was separated on a 0.8% agarose gel after restriction digests with the appropriate enzymes, transferred to a nylon membrane (Amersham) and hybridized with 32P random primer (Stratagene)-labeled probes. Between hybridizations, blots were stripped and checked for complete removal of radioactivity before rehybridization with a different probe.
[00328] In vitro Cre recombination
[00329] A 20-μl reaction containing 1 μg of genomic DNA and 10 units of recombinant Cre recombinase (New England Biolabs) in 1x buffer was incubated at 37 °C for one hour. For all targets, 1 μl of the Cre reaction mix was used as template for PCR reactions with gene-specific primers. For each target, primers DF and DR were used to detect the deletion product, and primers CF and CR were used to detect the circle product. All products were sequenced.
[00330] Immunostaining and Western blot analysis
[00331] For immunostaining, cells in 24-well plates were fixed in PBS supplemented with 4% paraformaldehyde for 15 min at room temperature (RT). The cells were then permeabilized using 0.2% Triton X-100 in PBS for 15 min at RT and blocked for 30 min in 1% BSA in PBS. Primary antibody against V5 (ab9137, Abcam) was diluted in the same blocking buffer and incubated with the samples overnight at 4 °C. The cells were then incubated with a fluorescently coupled secondary antibody for 1 hr at RT. The nuclei were stained with Hoechst 33342 (Sigma) for 5 min at RT.
[00332] For western blot, cell pellets were lysed on ice in Laemmli buffer (62.5 mM Tris-HCl, pH 6.8, 2% sodium dodecyl sulfate, 5% β-mercaptoethanol, 10% glycerol and 0.01% bromophenol blue) for 30 min in the presence of protease inhibitors (Roche Diagnostics), boiled for 5-7 min at 100 °C, and subjected to western blot analysis. Primary antibodies: V5 (1:1,000, ab9137, Abcam), beta-actin (1:2,000). Blots were probed with anti-goat or anti-rabbit IgG-HRP secondary antibody (1:10,000) and visualized using an ECL detection kit (GE Healthcare).
[00333] ESC derivation and differentiation
[00334] Morulas or blastocysts were selected to generate ES cell lines. The zona pellucida was removed using acid Tyrode solution. Each embryo was transferred into one well of a 96-well plate seeded with ICR embryonic fibroblast feeders in ESC medium supplemented with 20% knockout serum replacement, 1,500 U/ml leukemia inhibitory factor (LIF), 3 μM CHIR99021, and 1 μM PD0325901. After 4-5 days in culture, the colonies were trypsinized and transferred to a 96-well plate with a fresh feeder layer in fresh medium. Clonal expansion of the ESCs proceeded from 48-well plates to 6-well plates with feeder cells and then to 6-well plates for routine culture.
[00335] For ESC differentiation, cells were harvested by trypsinization and transferred to bacterial culture dishes in the ES medium without or LIF. After 3 days, aggregated cells were plated onto gelatin-coated tissue culture dishes and incubated for another 3 days.
[00336] Prediction of potential off-targets
[00337] Potential off-targets were predicted by searching the mouse genome (mm9) for matches to the 20-nt sgRNA sequence, allowing for up to 4 mismatches (Nanog) or 3 mismatches (Sox2, Mecp2-L2 and Mecp2-R1), followed by an NGG PAM sequence. Matches were ranked first by ascending number of mismatches, then by ascending distance from the PAM sequence.
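The search and ranking described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the pipeline that was actually used: the function name `find_off_targets`, the guide sequence, and the toy genome string are hypothetical; only the forward strand is scanned, whereas a real search would cover both strands of the mm9 assembly; and "distance from the PAM" is interpreted here as the position of the mismatch nearest the PAM.

```python
def find_off_targets(genome, guide, max_mismatches):
    """Return candidate sites on the forward strand, ranked as described."""
    genome, guide = genome.upper(), guide.upper()
    n = len(guide)
    hits = []
    # A candidate is an n-nt window immediately followed by a 3-nt NGG PAM.
    for start in range(len(genome) - n - 2):
        pam = genome[start + n:start + n + 3]
        if pam[1:] != "GG":
            continue
        site = genome[start:start + n]
        # Distance from the PAM: 1 is the PAM-adjacent base, n the PAM-distal base.
        positions = sorted(n - i for i in range(n) if site[i] != guide[i])
        if len(positions) <= max_mismatches:
            hits.append({"start": start, "site": site, "pam": pam,
                         "mismatches": len(positions), "positions": positions})
    # Rank by mismatch count, then by the PAM-nearest mismatch (assumed tie-break).
    hits.sort(key=lambda h: (h["mismatches"],
                             h["positions"][0] if h["positions"] else 0))
    return hits


guide = "GATTACAGATTACAGATTAC"   # hypothetical 20-nt guide
off_site = "C" + guide[1:]       # one mismatch at the PAM-distal end
genome = "AA" + guide + "TGG" + "AAAA" + off_site + "AGG" + "AA"  # toy sequence
for hit in find_off_targets(genome, guide, max_mismatches=3):
    print(hit["start"], hit["site"], hit["pam"], hit["mismatches"], hit["positions"])
```

A linear scan like this is adequate for a toy sequence; a genome-scale search would normally rely on an indexed aligner instead.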
[00338] Results
[00339] Targeted insertion of short DNA fragments
[00340] As described herein, precise introduction of base pair mutations into the Tet1 and Tet2 genes was achieved through homology directed repair (HDR)-mediated genome editing following co-injection of single stranded mutant DNA oligos, sgRNAs and Cas9 mRNA (Wang, H., et al., Cell 153, 910-918 (2013)). To test whether a larger DNA construct could be inserted at the same DSBs at Tet1 exon 4 and Tet2 exon 3, oligos were designed containing the 34 bp loxP site and a 6 bp EcoRI site flanked by 60 bp sequences adjoining the DSBs (Fig. 19A). Cas9 mRNA was co-injected with sgRNAs and single stranded DNA oligos targeting both Tet1 and Tet2 into zygotes. The restriction fragment length polymorphism (RFLP) assay shown in Fig. 19B identified six out of 15 tested embryos carrying the loxP site at the Tet1 locus, eight carrying the loxP site at the Tet2 locus, and three with at least one allele of each gene correctly modified. The correct integration of loxP sites was confirmed by sequencing PCR products of samples 2, 9 and 14 (Fig. 19C). These results demonstrate that HDR-mediated repair can introduce targeted integration of 40 bp DNA elements efficiently through CRISPR/Cas mediated genome editing (summarized in Table 7).
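The layout of such a donor oligo can be sketched as follows. This is an illustrative sketch only: the homology arms are placeholder sequences, the function name is invented, and the order of the loxP and EcoRI sites within the insert is an assumption, since the text gives only the element sizes (34 bp + 6 bp flanked by 60 bp on each side).

```python
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"  # canonical 34-bp loxP site
ECORI = "GAATTC"                              # 6-bp EcoRI recognition site


def build_donor_oligo(left_arm, right_arm, insert=LOXP + ECORI, arm_length=60):
    """Concatenate homology arms around the insert, checking arm length."""
    if len(left_arm) != arm_length or len(right_arm) != arm_length:
        raise ValueError(f"each homology arm must be {arm_length} nt long")
    return left_arm + insert + right_arm


# Placeholder arms; real arms would be the 60 bp adjoining the DSB.
left = "ACGT" * 15
right = "TGCA" * 15
oligo = build_donor_oligo(left, right)
print(len(LOXP + ECORI), len(oligo))  # insert length and total oligo length
```

With 60-nt arms, the 40-bp insert gives a 160-nt single stranded donor.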
[00341] Mice with reporters in the endogenous Nanog, Sox2 and Oct4 genes
[00342] Since the study of many genes and their protein products is limited by the availability of high quality antibodies, the potential of fusing a short epitope tag to an endogenous gene was explored. An sgRNA was designed to target the stop codon of Sox2, along with a corresponding oligo to fuse the 42 bp V5 tag to the last codon (Fig. 16A). After injection of the sgRNA, Cas9 mRNA and the oligo into zygotes, in vitro differentiated blastocysts were explanted into culture to derive ES cells. PCR genotyping and sequencing identified seven out of 16 ES cell lines carrying a correctly targeted insert (Fig. 16B and 16C). Western blot analysis revealed a protein band at the predicted size using V5 antibody in targeted ES cells but not in the control cells (Fig. 16D). As expected from a correctly targeted and functional allele, Sox2 expression was seen in targeted blastocysts and ES cells using V5 antibody (Fig. 16E and 16F). Twelve of 35 E13.5 (embryonic day 13.5) embryos and live born mice derived from injected zygotes carried the V5 tag correctly targeted into the Sox2 gene, as indicated by PCR genotyping and sequencing (Table 7).
[00343] To assess whether a marker transgene could be inserted into an endogenous locus, Cas9 mRNA, sgRNA and a double stranded donor vector designed to fuse a p2A-mCherry reporter with the last codon of the Nanog gene were co-injected (Fig. 17A). A circular donor vector was used to minimize random integrations. To assess toxicity and to optimize the concentration of donor DNA, different amounts of Nanog-2A-mCherry vector were microinjected. Injection with a high concentration of donor DNA (500 ng/μl) yielded mCherry-positive embryos with high efficiency, with most blastocysts being retarded, whereas injection with a lower donor DNA concentration (10 ng/μl) yielded mostly healthy blastocysts, most of which were mCherry-negative. When 200 ng/μl donor DNA was used, 75% of the injected zygotes developed to blastocysts, 9% of which were mCherry-positive (Figure 17C, Table 9). mCherry was mainly expressed in the inner cell mass (ICM), consistent with targeted integration of the mCherry transgene into the Nanog gene. Six ES cell lines were derived from mCherry-positive blastocysts, four of which uniformly expressed mCherry, with the signal disappearing upon cellular differentiation (Fig. 17C). The other two lines showed variegated mCherry expression, with some colonies being mCherry-positive and others negative (Fig. 20A), consistent with mosaic donor embryos. Such mosaicism would be expected if transgene insertion occurred at a stage later than the zygote, as has been previously observed with ZFN- and TALEN-mediated targeting (Cui, X., et al., Nat Biotechnol 29, 64-67 (2011); Wefers, B., et al., Proc Natl Acad Sci U S A 110, 3782-3787 (2013); Brown, A.J., et al. Nat Methods (2013)). Correct transgene integration in ES cell lines was confirmed by Southern blot analysis (Figure 17B). Postnatal mice from injected zygotes were also generated. Southern blot analysis (Fig. 20B and 20C) revealed that eight out of 86 E13.5 embryos and live born mice carried the mCherry transgene in the Nanog locus. One targeted mouse was mosaic, since the intensity of the targeted allele was lower than that of the wild type allele (Fig. 20B, #6). Two of the mice carried an additional randomly integrated transgene (Fig. 20C, #3). As summarized in Tables 7 and 9, the efficiency of targeted insertion of the transgene was about 10% in blastocysts and postnatal mice derived from injected zygotes.
[00344] Finally, an sgRNA targeting the Oct4 3' UTR was designed and co-injected with a published donor vector designed to integrate the 3 kb transgene cassette (IRES-eGFP-loxP-Neo-loxP; Figure 17D) at the 3' end of the Oct4 gene (Lengner, C.J., et al. Cell Stem Cell 1, 403-415 (2007)). Blastocysts were derived from injected zygotes, inspected for GFP expression and explanted to derive ES cells. About 20% of the blastocysts displayed uniform GFP expression in the inner cell mass (ICM) region. Three of nine derived ES cell lines expressed GFP (Figure 17E), including one that showed mosaic expression (Table 10). Three out of ten live born mice contained the targeted allele (Table 7). Correct targeting in mice and ES cell lines was confirmed by Southern blot analysis (Figure 17F). Conventionally, transgenic mice are generated by pronuclear instead of cytoplasmic injection of DNA. To optimize the generation of CRISPR/Cas9 targeted embryos, different concentrations of RNA and of the Nanog-mCherry and the Oct4-GFP DNA vectors were compared, as well as three different delivery modes: (i) simultaneous injection of all constructs into the cytoplasm, (ii) simultaneous injection of the RNA and the DNA into the pronucleus and (iii) injection of Cas9/sgRNA into the cytoplasm followed 2 hours later by pronuclear injection of the DNA vector. Table 9 shows that simultaneous injection of all constructs into the cytoplasm at a concentration of 100 ng/μl Cas9 RNA, 50 ng/μl of sgRNA and 200 ng/μl of vector DNA was optimal, resulting in 8% to 18% targeted blastocysts. Similarly, the simultaneous injection of 5 ng/μl Cas9 RNA, 2.5 ng/μl of sgRNA and 10 ng/μl of DNA vector into the pronucleus yielded between 10% and 18% targeted blastocysts. In contrast, the two-step procedure, with Cas9 and sgRNA simultaneously injected into the cytoplasm followed 2 hours later by pronuclear injection of different concentrations of DNA vector, yielded no or at most 3% positive blastocysts. Simultaneous injection of RNA and DNA into the cytoplasm or nucleus is therefore an efficient procedure to achieve targeted insertion.
[00345] Conditional Mecp2 mutant mice
[00346] Whether conditional mutant mice can be generated in one step by insertion of two loxP sites into the same allele of the Mecp2 gene was also investigated herein. To derive conditional mutant mice similar to those previously described using traditional homologous recombination methods in ES cells (Chen, R.Z., et al., Nat Genet 27, 327-331 (2001)), two sgRNAs targeting Mecp2 intron 2 (L1, L2) and two sgRNAs targeting intron 3 (R1, R2), as well as the corresponding loxP site oligos with 60 bp homology to sequences surrounding each sgRNA-mediated DSB, were designed (Fig. 21A). To facilitate detection of correct insertions, the oligos targeting intron 2 were engineered to contain an NheI restriction site and the oligos targeting intron 3 to contain an EcoRI site in addition to the loxP sequences (Fig. 18A, 21A). To determine the efficiency of single loxP site integration at the Mecp2 locus, Cas9 mRNA was injected together with each single sgRNA and its corresponding oligo into zygotes, which were cultured to the blastocyst stage and genotyped by the RFLP assay. As shown in Figure 21B, the L2 and R1 sgRNAs were more efficient in integrating the oligos, with 4 out of 8 embryos carrying the L2 oligo and 2 out of 6 embryos carrying the R1 oligo (there was no amplification in DNA from 2 embryos). Therefore, the L2 and R1 sgRNAs and the corresponding oligos were chosen for the generation of a floxed allele (Fig. 21A).
[00347] A total of 98 E13.5 (embryonic day 13.5) embryos or mice were generated from zygotes injected with Cas9 mRNA, sgRNAs, and DNA oligos targeting the L2 and R1 sites. Genomic DNA was digested with both NheI and EcoRI and analyzed by Southern blot using exon 3 and 4 probes (Fig. 18A and 18B). The L2 and R1 oligos contained, in addition to the loxP site, different restriction sites (NheI or EcoRI). Thus, single loxP site integration at L2 or R1 produces either a 3.9 kb or a 2 kb band, respectively, when hybridized with the exon 3 probe (Fig. 18A and 18B). It was found herein that about 50% of the embryos or mice carried a loxP site at the L2 site and about 25% at the R1 site. Importantly, integration of both loxP sites on the same DNA molecule, generating a floxed allele, produces a 700 bp band as detected by exon 3 probe hybridization (Fig. 18A and 18B). RFLP analysis, sequencing (Fig. 22A and 22B) and Southern blot analysis (Fig. 18B) showed that 16 out of the 98 mice tested contained two loxP sites flanking exon 3 on the same allele. Table 8 summarizes the frequency of all alleles and shows that the overall frequency of an L2 or R1 insertion was higher in females (32/38) than in males (38/60), consistent with the higher copy number of the X-linked Mecp2 gene in females. To confirm that the floxed allele was functional, genomic DNA was used for in vitro Cre-mediated recombination. Upon Cre treatment, both the deletion and circular products were detected by PCR in targeted mice, but not in DNA from wild type mice (Fig. 18C). The PCR products were sequenced, confirming precise Cre-loxP mediated recombination (Fig. 22C).
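The female and male insertion frequencies quoted above can be reproduced from the Table 8 counts with a short calculation. The sketch below assumes that the quoted figures are the sum of the L2 and R1 columns for each sex, which is inferred from the numbers matching rather than stated in the text; the dictionary layout and function name are illustrative.

```python
from fractions import Fraction

# Counts taken from Table 8 (mice with a loxP site at L2 or R1, per sex).
TABLE_8 = {
    "male": {"total": 60, "L2": 26, "R1": 12},
    "female": {"total": 38, "L2": 19, "R1": 13},
}


def insertion_frequency(row):
    """Frequency of L2 or R1 insertions, counted as L2 + R1 over total."""
    return Fraction(row["L2"] + row["R1"], row["total"])


for sex, row in TABLE_8.items():
    freq = insertion_frequency(row)
    print(f"{sex}: {row['L2'] + row['R1']}/{row['total']} = {float(freq):.2f}")
```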
[00348] Some pups carried large deletions but no loxP insertions, raising the possibility that two cleavage events may generate defined deletions. To confirm this, Cas9 mRNA and the Mecp2-L2 and R1 sgRNAs were co-injected without oligos. PCR genotyping and sequencing (Fig. 18D and 18E) revealed that eight out of 23 mice carried deletions of about 700 bp spanning the L2 and R1 sites and removing exon 3. This was confirmed by Southern analysis (not shown). Because DNA breaks are repaired through the non-homologous end joining (NHEJ) pathway, the ends of the breaks differ between deletion alleles (Fig. 18E).
[00349] Mosaicism
[00350] As mentioned above, some animals were mosaic for the targeted insertion. The frequency of mosaicism in Mecp2 targeted mice was characterized by Southern blot analysis. Since Mecp2 is an X-linked gene, more than one allele in males and more than two different alleles in females suggest mosaicism, which would be expected if integration occurred later than the zygote stage. For example, as shown in Fig. 18B, female mouse #2 contained three different alleles (one WT allele, one floxed allele, and one L2-loxP allele), and female mouse #4 contained four different alleles (one WT allele, one floxed allele, one L2-loxP allele, and one R1-loxP allele). Male mouse #5 contained two different alleles, with each allele carrying a single loxP site (Fig. 18B). Eight mosaics were identified among the 16 mice containing a Mecp2 floxed allele. The frequency of mosaicism among the 49 embryos and mice containing a loxP site was about 40% (Table 10). Since Southern blot analysis cannot detect small in-del mutations caused by NHEJ repair, it is possible that this underestimates the overall mosaicism frequency.
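The allele-counting rule for an X-linked gene can be written as a small check. This is a minimal sketch of the rule stated above, with illustrative names; the allele labels for the three example mice follow the description of Fig. 18B, and the function only counts distinct alleles rather than interpreting Southern blot bands.

```python
# Maximum number of distinct alleles expected for an X-linked gene
# in a non-mosaic animal: one X in males, two in females.
MAX_ALLELES = {"male": 1, "female": 2}


def is_mosaic(sex, alleles):
    """Flag an animal as mosaic if it carries more distinct alleles than expected."""
    return len(set(alleles)) > MAX_ALLELES[sex]


examples = {
    "female #2": ("female", ["WT", "floxed", "L2-loxP"]),
    "female #4": ("female", ["WT", "floxed", "L2-loxP", "R1-loxP"]),
    "male #5": ("male", ["L2-loxP", "R1-loxP"]),
}
for name, (sex, alleles) in examples.items():
    print(name, "mosaic" if is_mosaic(sex, alleles) else "not mosaic")
```

As the paragraph notes, a count based on Southern bands is a lower bound, because alleles differing only by small in-dels are not resolved.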
[00351] Off-target analysis
[00352] Recent studies identified a high level of off-target cleavage in human cell lines using the CRISPR/Cas system, with Cas9 targeting specificity shown to tolerate small numbers of mismatches between sgRNA and target DNA in a sequence- and position-dependent manner (Fu, Y., et al., Nat Biotechnol. (2013); Hsu, P.D., et al., Nat Biotechnol (2013)). Potential off-target (OT) mutations were characterized in mice and ES cell lines derived from zygotes injected with Cas9 and sgRNAs targeting the Sox2, Nanog and Mecp2 genes. All genomic loci containing up to three base pair mismatches compared to the 20 bp sgRNA coding sequence were identified (Table 11). All 13 potential OT sites of the Sox2 sgRNA were amplified in six mice and four ES cell lines carrying the Sox2-V5 allele and were tested for potential off-target mutations using the Surveyor assay. No mutation was detected at any locus. When nine Nanog sgRNA potential OT sites were tested in five correctly targeted mice and four targeted ES cell lines, mutations were found in seven samples at OT1 (Table 11). Since Nanog OT1 has only one base pair difference at the very 5' end of the sgRNA, it is perhaps not surprising to find such a high frequency of mutations at this locus. In contrast, no off-target mutation was seen at any other Nanog OT, each of which contains three or four base pair differences. Finally, four potential off-target sites for Mecp2 L2 and ten sites for Mecp2 R1 were analyzed in ten mice carrying a Mecp2 floxed allele. Only one off-target mutation was identified in one mouse at Mecp2 R1 OT2 (Table 11). In summary, all potential off-target sites differing by up to three or four base pairs were tested in 29 mice or ES lines, and mutations were identified in only one off-target site each for Nanog (7/9 samples) and Mecp2 (1/10 samples). Thus, the off-target mutation rate is substantially lower than was observed in previous studies using cultured human cancer cell lines (Fu, Y., et al., Nat Biotechnol. (2013); Hsu, P.D., et al., Nat Biotechnol (2013)).
[00353] DISCUSSION
[00354] This study shows that CRISPR/Cas technology can be used for efficient one-step insertion of a short epitope or longer fluorescent tags into precise genomic locations, which will facilitate the generation of mice carrying reporters in endogenous genes. Mice and/or embryos carrying reporter constructs in the Sox2, Nanog and Oct4 genes were derived from zygotes injected with Cas9 mRNA, sgRNAs and DNA oligos or vectors encoding a tag or a fluorescent marker. Also, microinjection of two Mecp2-specific sgRNAs, Cas9 mRNA and two different oligos encoding loxP sites into fertilized eggs allowed the one-step generation of conditional mutant mice. In addition, the introduction of two spaced sgRNAs targeting the Mecp2 gene produced mice carrying defined deletions of about 700 bp. Though all RNA and DNA constructs were injected into the cytoplasm or nucleus of zygotes, the gene modification events could happen at the one-cell stage or later. Indeed, Southern analyses revealed mosaicism in 13% to 40% of the targeted mice and ES cell lines, indicating that the insertion of the transgenes had occurred after the zygote stage (Table 10).
[00355] Previous experiments (Wang, H., et al., Cell 153, 910-918 (2013)) demonstrated an efficiency of CRISPR/Cas sgRNA-mediated cleavage that was high enough to allow for the one-step production of engineered mice, up to 90% of which carried homozygous mutations in two genes (4 mutant alleles). The results reported here show that sgRNA-mediated DSBs occur at a significantly higher frequency than insertion of exogenous DNA sequences. Therefore, the allele not carrying the insert will likely be mutated as a consequence of NHEJ-based gene disruption. Thus, the reporter allele would need to be segregated away from the mutant allele in order to produce mice carrying a reporter as well as a wt allele.
[00356] Two recent studies reported a high off-target mutation rate in CRISPR/Cas9 transfected human cell lines (Fu, Y., et al., Nat Biotechnol. (2013); Hsu, P.D., et al., Nat Biotechnol (2013)). Here, the off-target rate for four different sgRNAs was analyzed, and cleavage was identified at Nanog OT1 in 7 out of 9 samples and at Mecp2 R1 OT2 in 1 out of 10 mice tested. Nanog OT1 has only one base pair difference from the targeting sequence, at the extreme 5' end (position 20, numbered 1-20 in the 3' to 5' direction of the gRNA target site), while Mecp2 R1 OT2 has one base pair mismatch at position 20 and one mismatch at position 7. No mutations were detected in 34 potential OTs of Sox2, Nanog, Oct4 or Mecp2 containing 2, 3, or 4 bp mismatches in a total of 29 mice and ES cell lines tested. This result is consistent with the previous findings that Cas9 can catalyze DNA cleavage in the presence of single-base mismatches in the PAM-distal region (Cong et al., 2013; Hsu et al., 2013; Jiang et al., 2013; Jinek et al., 2012). Consistent with the observation that three or more interspaced mismatches dramatically reduce Cas9 cleavage (Hsu et al., 2013), no off-target mutations were observed at loci containing 3 bp mismatches.
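The position numbering used above (1-20 in the 3' to 5' direction, so that position 20 is the extreme 5', PAM-distal base) can be made concrete with a short helper. The sequences below are hypothetical stand-ins, not the actual Nanog or Mecp2 sites from Table 11; they are constructed only to reproduce the two mismatch patterns described.

```python
def mismatch_positions(guide, site):
    """List mismatch positions, numbered 1..n from the 3' (PAM-proximal) end."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    n = len(guide)
    # Index 0 is the 5' base, which maps to position n; the 3' base is position 1.
    return sorted(n - i for i in range(n) if guide[i].upper() != site[i].upper())


guide = "GATTACAGATTACAGATTAC"  # hypothetical 20-nt target, written 5' to 3'
# One mismatch at the extreme 5' end, as described for Nanog OT1.
one_mismatch_site = "C" + guide[1:]
# Mismatches at positions 20 and 7, as described for Mecp2 R1 OT2.
two_mismatch_site = "C" + guide[1:13] + "T" + guide[14:]

print(mismatch_positions(guide, one_mismatch_site))
print(mismatch_positions(guide, two_mismatch_site))
```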
[00357] There are several possibilities to explain the significant difference in off-target cleavage rate seen in animals derived from manipulated zygotes and the results reported for CRISPR/Cas treated human cell lines (Fu, Y., et al., Nat Biotechnol. (2013); Hsu, P.D., et al., Nat Biotechnol (2013)). Off-target mutagenesis was analyzed here in a "clonal genome" in animals derived from a single manipulated zygote, in contrast to the two previous reports, which analyzed heterogeneous cell populations. The Surveyor assay, based upon extensive PCR amplification, may identify any mutation, even very rare alleles that may be present in a heterogeneous population. The transformed human cell lines may have different DNA damage responses, resulting in a different mutagenesis rate than in the normal one-cell embryo. In the experiments described herein, CRISPR/Cas was injected as short-lived RNA, in contrast to Fu et al. and Hsu et al., who used DNA plasmid transfection, which may express the Cas9/sgRNA for longer time periods, leading to more extensive cleavage. Thus, these data suggest high specificity of the CRISPR/Cas9 system for gene editing in early embryos aimed at generating gene-modified mice. Nevertheless, characterization of off-target mutagenesis of the CRISPR/Cas system using whole genome sequencing would be highly informative and may allow the design of sgRNAs with higher specificity.
[00358] In summary, CRISPR/Cas mediated genome editing represents an efficient and simple method of generating sophisticated genetic modifications in mice, such as conditional alleles and endogenous reporters, in one step. The principles described in this study could be directly adapted to other mammalian species, providing sophisticated genome engineering in many species where ES cells are not available.
[00359] Table 7: Mice with reporters in the endogenous genes
Donor | Blastocysts / Injected zygotes | Targeted blastocysts / Total | Targeted ESCs / Total | Transferred embryos (recipients) | Knock-in pre- and postnatal mice / Total
Tet1-loxP + Tet2-loxP | 65/89 | Tet1-loxP 6/15; Tet2-loxP 6/15; Both | N/A | N/A | N/A
Sox2-V5 | 414/498 | N/A | 7/16 | 200 (10) | 12/35
Nanog-mCherry | 936/1262 | 86/936 | ND(a) | 415 (21) | 8/86
Oct4-GFP | 254/345 | 47/254 | 3/9 | 100 (4) | 3/10
Cas9 mRNA, sgRNAs targeting Tet1, Tet2, Sox2, Nanog, or Oct4, and single stranded DNA oligos or double stranded donor vectors were injected into fertilized eggs. Targeted blastocysts were identified by RFLP or fluorescence of reporters. The blastocysts derived from injected embryos were used to derive ES cell lines or were transplanted into foster mothers, and E13.5 embryos and postnatal mice were obtained and genotyped.
(a) Only mCherry-positive blastocysts were selected to generate ES cell lines.
ND, not determined; N/A, not applicable.
Table 8: Conditional Mecp2 mutant mice
Donor | Blastocysts / Injected zygotes | Transferred embryos (recipients) | Sex | Pre- and postnatal mice with loxP / Total: Total(a) | L2(b) | R1(c) | Two loxP in two alleles | Two loxP in one allele
Mecp2-L2 + Mecp2-R1 | 367/451 | 360 (18) | Male | 28/60 | 26/60 | 12/60 | 2(d)/60 | 8/60
Mecp2-L2 + Mecp2-R1 | 367/451 | 360 (18) | Female | 21/38 | 19/38 | 13/38 | 3/38 | 8/38
Mecp2-L2 + Mecp2-R1 | 367/451 | 360 (18) | Total | 49/98 | 45/98 | 25/98 | 5/98 | 16/98
Cas9 mRNA, sgRNAs targeting Mecp2-L2 and Mecp2-R1, and single-stranded DNA oligos were injected into fertilized eggs. The blastocysts derived from the injected embryos were transplanted into foster mothers, and pre- and postnatal mice were genotyped.
(a) Total mice containing loxP site integration in the genome.
(b) Mice containing a loxP site integrated at the L2 site.
(c) Mice containing a loxP site integrated at the R1 site.
(d) These male mice were mosaic.
[00361] Table 9: Efficiency of generation of reporter embryos by cytoplasmic and pronuclear injection
Donor | Dose of Cas9/sgRNA (ng/μl) | Dose of donor vector (ng/μl) | Injected zygotes | Targeted blastocysts / Total
One-step injection
Nanog-mCherry | 100/50 (Cyto) | 500 (Cyto) | 186 | 1/81
Nanog-mCherry | 100/50 (Cyto) | 200 (Cyto) | 1262 | 86/936
Nanog-mCherry | 100/50 (Cyto) | 50 (Cyto) | 402 | 7/308
Nanog-mCherry | 100/50 (Cyto) | 10 (Cyto) | 333 | 1/278
Oct4-GFP | 100/50 (Cyto) | 200 (Cyto) | 345 | 47/254
Nanog-mCherry | 5/2.5 (Nuc) | 10 (Nuc) | 98 | 7/75
Oct4-GFP | 5/2.5 (Nuc) | 10 (Nuc) | 105 | 13/72
Two-step injection
Nanog-mCherry | 100/50 (Cyto) | 50 (Nuc) | 45 | 0/0
Nanog-mCherry | 100/50 (Cyto) | 10 (Nuc) | 91 | 1/34
Nanog-mCherry | 100/50 (Cyto) | 2 (Nuc) | 85 | 1/68
Cas9 mRNA, sgRNAs targeting Nanog or Oct4, and double-stranded donor vectors were injected into the cytoplasm or pronuclei of zygotes. In one-step injection, the RNA and the DNA were simultaneously injected into the cytoplasm or pronucleus. In two-step injection, Cas9/sgRNA were injected into the cytoplasm, followed 2 hours later by pronuclear injection of the DNA vector. Targeted blastocysts were identified by fluorescence of the reporters. Cyto, cytoplasm; Nuc, nucleus.
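The comparison across injection conditions in Table 9 can be made explicit by converting each "Targeted blastocysts / Total" entry into a percentage. The following is a minimal sketch only; the counts are copied from Table 9, while the tuple layout and the names `TABLE_9`, `targeting_efficiency` and `best_condition` are illustrative assumptions and not part of the described methods.

```python
# Counts copied from Table 9:
# (donor, injection route, donor dose in ng/ul, targeted blastocysts, total blastocysts)
TABLE_9 = [
    ("Nanog-mCherry", "one-step cyto", 500, 1, 81),
    ("Nanog-mCherry", "one-step cyto", 200, 86, 936),
    ("Nanog-mCherry", "one-step cyto", 50, 7, 308),
    ("Nanog-mCherry", "one-step cyto", 10, 1, 278),
    ("Oct4-GFP", "one-step cyto", 200, 47, 254),
    ("Nanog-mCherry", "one-step nuc", 10, 7, 75),
    ("Oct4-GFP", "one-step nuc", 10, 13, 72),
    ("Nanog-mCherry", "two-step", 50, 0, 0),
    ("Nanog-mCherry", "two-step", 10, 1, 34),
    ("Nanog-mCherry", "two-step", 2, 1, 68),
]


def targeting_efficiency(targeted, total):
    """Return targeted blastocysts as a percentage of all blastocysts scored."""
    if total == 0:
        return None  # e.g. the 0/0 row: no blastocysts were scored
    return 100.0 * targeted / total


def best_condition(rows):
    """Return the row with the highest targeting efficiency, skipping 0/0 rows."""
    scored = [r for r in rows if r[4] > 0]
    return max(scored, key=lambda r: targeting_efficiency(r[3], r[4]))


if __name__ == "__main__":
    for donor, route, dose, targeted, total in TABLE_9:
        eff = targeting_efficiency(targeted, total)
        label = "n/a" if eff is None else f"{eff:.1f}%"
        print(f"{donor:14s} {route:14s} donor {dose:>3} ng/ul: {label}")
```

Percentages computed this way only restate the table; they carry no statistical weight beyond the underlying counts.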
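[00362] Table 10: Mosaicism in targeted mice
Donor | Sample | Mosaic / Total targeted
Nanog-cherry | Mice | 1/8
Nanog-cherry | ESCs | 2/6
Oct4-EGFP | Mice | 0/3
Oct4-EGFP | ESCs | 1/3
Mecp2-L2 + Mecp2-R1 | Male | 11/28
Mecp2-L2 + Mecp2-R1 | Female | 9/21
Mecp2-L2 + Mecp2-R1 | Total | 20/49*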
Targeted mice or ESCs were identified by RFLP, Southern blot or sequencing. The frequency of mosaicism in targeted mice was determined by fluorescent reporter or Southern blot analysis.
* These 49 mice contain at least one loxP integration.
[00363] Table 11: Off-target analysis
Site name | Sequence | SEQ ID NO | Indel mutation frequency (Mutant/Total) | Coordinate
Target_Sox2_Stop | TGCCCCTGTCGCACATGTGAGGG | 216 | / | chr3: 34550278-34550300
OT1_Sox2_Stop | TaCCCtTGTtGCACATGTGAAGG | 217 | 0/10 | chr4: 126636377-126636399
OT2_Sox2_Stop | TtCCCaTGTaGCACATGTGAGGG | 218 | 0/10 | chr14: 58830941-58830963
OT3_Sox2_Stop | TcCCCCTGTCaCACATGTGgTGG | 219 | 0/10 | chr1: 136641174-136641196
OT4_Sox2_Stop | TGCaCCTGTgGCACATGTGgGGG | 220 | 0/10 | chr9: 69081892-69081914
OT5_Sox2_Stop | TGCCaCaGTtGCACATGTGAGGG | 221 | 0/10 | chr1: 130633965-130633987
OT6_Sox2_Stop | TGCCaCTGTtGCAaATGTGAGGG | 222 | 0/10 | chr18: 61611640-61611662
OT7_Sox2_Stop | TGCCtCTGTCaCAgATGTGACGG | 223 | 0/10 | chr5: 136841014-136841036
OT8_Sox2_Stop | TGCCtCTGTCtCACATGTGcTGG | 224 | 0/10 | chr4: 141162434-141162456
OT9_Sox2_Stop | TGCCtCTGTCGCtCATGgGATGG | 225 | 0/10 | chr9: 48224429-48224451
OT10_Sox2_Stop | TGCCCaTGTCcCACATGgGATGG | 226 | 0/10 | chr7: 72596616-72596638
OT11_Sox2_Stop | TGCCCCTcTgtCACATGTGAAGG | 227 | 0/10 | chr18: 56473819-56473841
OT12_Sox2_Stop | TGCCCCTGTCatACATGTGgAGG | 228 | 0/10 | chr6: 98776389-98776411
OT13_Sox2_Stop | TGCCCCTGTCtCcCATGTGcTGG | 229 | 0/10 | chr4: 148868089-148868111
Target_Nanog_Stop | CGTAAGTCTCATATTTCACCTGG | 230 | / | chr6: 122663559-122663581
OT1_Nanog_Stop | tGTAAGTCTCATATTTCACCTGG | 231 | 7/9(b) | chrX: 87128718-87128740
OT2_Nanog_Stop | CccAAGTCTCATtTTTCACCAGG | 232 | 0/9 | chr14: 21598653-21598675
OT3_Nanog_Stop | GaTAAGgaTCATATTTCACCCGG | 233 | 0/9 | chrX: 88301349-88301371
OT4_Nanog_Stop | TGcAAtTtTCATATTTCACCTGG | 234 | 0/9 | chr12: 71841888-71841910
OT5_Nanog_Stop | GGTcAteCTCATATTTCACCAGG | 235 | 0/9 | chr11: 13346951-13346973
OT6_Nanog_Stop | TGTtAtTCaCATATTTCACCTGG | 236 | 0/9 | chr6: 13888078-13888100
OT7_Nanog_Stop | TGTgAGTagCATATTTCACCTGG | 237 | 0/9 | chr18: 41037112-41037134
OT8_Nanog_Stop | TGTAAaTaaCATATTTCACCCGG | 238 | 0/9 | chrX: 70168441-70168463
OT9_Nanog_Stop | CaTAgagCTCATATTTCACCAGG | 239 | 0/9 | chr4: 80388067-80388089
Target_Mecp2_L2 | CCCAAGGATACAGTATCCTAGGG | 240 | / | chrX: 71282802-71282824
OT1_Mecp2_L2 | CCCAAGGATgCttTATCCTAAGG | 241 | 0/10 | chr8: 121441299-121441321
OT2_Mecp2_L2 | CCCAtGGATAgAGTAgCCTAAGG | 242 | 0/10 | chr15: 55526927-55526949
OT3_Mecp2_L2 | CCCAAGaATACAGTgTgCTAAGG | 243 | 0/10 | chr4: 83371755-83371777
OT4_Mecp2_L2 | CCCAAGGAcACAGgATCCcAAGG | 244 | 0/10 | chr17: 27887352-27887374
Target_Mecp2_R1 | AGGAGTGAGGTCTAGTACTTGGG | 245 | / | chrX: 71282103-71282125
OT1_Mecp2_R1 | AGGgGTGAGtTCTAGTACTTCGG | 246 | 0/10 | chr13: 48912459-48912481
OT2_Mecp2_R1 | TGGAGTGAGGTCTtGTACTTGGG | 247 | 1/10(b) | chr12: 15404584-15404606
OT3_Mecp2_R1 | AGGAGTctGGgCTAGTACTTGGG | 248 | 0/10 | chr6: 115474148-115474170
Mismatches from the on-target sequence are lower-case, boldface and underlined.
Indel mutation frequencies in targeted mice or ESCs were calculated by Surveyor assay. The genomic coordinates of the sites are shown. OT, off-target; /, not tested.
(a) Nanog OT1 and OT2 contain 3 bp mismatches; OT3 to OT9 contain 4 bp mismatches lying in the PAM-distal region.
(b) PCR products were cloned and sequenced to confirm off-target mutations.
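The lower-case marking in Table 11 encodes where each off-target site differs from its on-target sequence. As a minimal sketch, the mismatch count over the 20-nt protospacer can be recomputed by a case-insensitive position-by-position comparison. The function name `count_mismatches` and the choice to exclude the 3-nt PAM from the comparison are assumptions made for illustration; the sequences are copied from Table 11.

```python
def count_mismatches(on_target, off_target, pam_len=3):
    """Count protospacer positions that differ between two equal-length sites.

    Both inputs are 5'->3' sequences that end in the PAM; the last `pam_len`
    bases are ignored (an assumption of this sketch). Case is ignored because
    lower case in Table 11 only marks the mismatched positions.
    """
    if len(on_target) != len(off_target):
        raise ValueError("sequences must have equal length")
    a = on_target.upper()[:-pam_len]
    b = off_target.upper()[:-pam_len]
    return sum(1 for x, y in zip(a, b) if x != y)


if __name__ == "__main__":
    nanog_target = "CGTAAGTCTCATATTTCACCTGG"  # Target_Nanog_Stop
    nanog_ot1 = "tGTAAGTCTCATATTTCACCTGG"     # OT1_Nanog_Stop
    print(count_mismatches(nanog_target, nanog_ot1))
```

A check like this is useful mainly for catching transcription errors in the table, since the count should agree with the number of lower-case bases in the protospacer.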
[00364] Table 12. Oligonucleotides used in this study.
[00365] Oligonucleotides used for cloning sgRNA expression vector
Gene target | Direction | Sequence (5' to 3')
Tet1 | F | caccggctgctgtcagggagctca (SEQ ID NO: 249)
Tet1 | R | aaactgagctccctgacagcagcc (SEQ ID NO: 250)
Tet2 | F | caccgaaagtgccaacagatatcc (SEQ ID NO: 251)
Tet2 | R | aaacggatatctgttggcactttc (SEQ ID NO: 252)
Sox2 | F | caccgtgcccctgtcgcacatgtga (SEQ ID NO: 253)
Sox2 | R | aaactcacatgtgcgacaggggcac (SEQ ID NO: 254)
Nanog | F | caccgcgtaagtctcatatttcacc (SEQ ID NO: 255)
Nanog | R | aaacggtgaaatatgagacttacgc (SEQ ID NO: 256)
Oct4 | F | caccgctcagtgatgctgttgatc (SEQ ID NO: 257)
Oct4 | R | aaacgatcaacagcatcactgagc (SEQ ID NO: 258)
Mecp2 L1 | F | caccgttgggccccagcttgaccca (SEQ ID NO: 259)
Mecp2 L1 | R | aaactgggtcaagctggggcccaac (SEQ ID NO: 260)
Mecp2 L2 | F | caccgcccaaggatacagtatccta (SEQ ID NO: 261)
Mecp2 L2 | R | aaactaggatactgtatccttgggc (SEQ ID NO: 262)
Mecp2 R1 | F | caccgaggagtgaggtctagtactt (SEQ ID NO: 263)
Mecp2 R1 | R | aaacaagtactagacctcactcctc (SEQ ID NO: 264)
Mecp2 R2 | F | caccgtttggtggtggattaggtct (SEQ ID NO: 265)
Mecp2 R2 | R | aaacagacctaatccaccaccaaac (SEQ ID NO: 266)
[00366] Oligonucleotides used for RFLP analysis and PCR genotyping.
Gene target | Direction | Sequence (5' to 3')
Tet1 | F | ttgttctctcctctgactgc (SEQ ID NO: 267)
Tet1 | R | tgattgatcaaataggcctgc (SEQ ID NO: 268)
Tet2 | F | cagatgcttaggccaatcaag (SEQ ID NO: 269)
Tet2 | R | agaagcaacacacatgaagatg (SEQ ID NO: 270)
Sox2 | SF | acatgatcagcatgtacctcc (SEQ ID NO: 271)
Sox2 | SR | taatttggatgggatggtgg (SEQ ID NO: 272)
V5 | V5F | acatgggcaagcccatcc (SEQ ID NO: 273)
Mecp2 L1, L2 | LF | aatgtgccactttaacagcac (SEQ ID NO: 274)
Mecp2 L1, L2 | LR | ttctgatgtttctgctttgcc (SEQ ID NO: 275)
Mecp2 R1, R2 | RF | aagcatgagccactacaacc (SEQ ID NO: 276)
Mecp2 R1, R2 | RR | cttgctcagaagccaaaacag (SEQ ID NO: 277)
[00367] Oligonucleotides used for making template for in vitro transcription
Template | Direction | Sequence (5' to 3')
Cas9 | F | taatacgactcactatagggagaatggactataaggaccacgac (SEQ ID NO: 278)
Cas9 | R | gcgagctctaggaattcttac (SEQ ID NO: 279)
Tet1 sgRNA | F | ttaatacgactcactataggctgctgtcagggagctc (SEQ ID NO: 280)
Tet1 sgRNA | R | aaaagcaccgactcggtgcc (SEQ ID NO: 281)
Tet2 sgRNA | F | ttaatacgactcactataggaaagtgccaacagatatcc (SEQ ID NO: 282)
Tet2 sgRNA | R | aaaagcaccgactcggtgcc (SEQ ID NO: 283)
Sox2 sgRNA | F | ttaatacgactcactatagtgcccctgtcgcacatgtga (SEQ ID NO: 284)
Sox2 sgRNA | R | aaaagcaccgactcggtgcc (SEQ ID NO: 285)
Nanog sgRNA | F | ttaatacgactcactatagcgtaagtctcatatttcacc (SEQ ID NO: 286)
Nanog sgRNA | R | aaaagcaccgactcggtgcc (SEQ ID NO: 287)
Oct4 sgRNA | F | ttaatacgactcactataggctcagtgatgctgttgatc (SEQ ID NO: 288)
Oct4 sgRNA | R | aaaagcaccgactcggtgcc (SEQ ID NO: 289)
Mecp2-L1 sgRNA | F | ttaatacgactcactatagttgggccccagcttgaccca (SEQ ID NO: 290)
Mecp2-L1 sgRNA | R | aaaagcaccgactcggtgcc (SEQ ID NO: 291)
Mecp2-L2 sgRNA | F | ttaatacgactcactatagcccaaggatacagtatccta (SEQ ID NO: 292)
Mecp2-L2 sgRNA | R | aaaagcaccgactcggtgcc (SEQ ID NO: 293)
Mecp2-R1 sgRNA | F | ttaatacgactcactatagaggagtgaggtctagtactt (SEQ ID NO: 294)
Mecp2-R1 sgRNA | R | aaaagcaccgactcggtgcc (SEQ ID NO: 295)
Mecp2-R2 sgRNA | F | ttaatacgactcactatagtttggtggtggattaggtct (SEQ ID NO: 296)
Mecp2-R2 sgRNA | R | aaaagcaccgactcggtgcc (SEQ ID NO: 297)
Oligonucleotides used for HDR-mediated repair through embryo
Gene target | Sequence (5' to 3')
Tet1-loxP | gaaaaaggcccatattatacacaccttggggcaggaccaagtgtggctgctgtcagggagGAATTCataacttcgtataatgtatgctatacgaagttatctcatggagactaggtgaggaactctgcttcccgctaacccattcttcccggtgacctgg (SEQ ID NO: 298)
Tet2-loxP | ctctgtgactataaggctctgactctcaagtcacagaaacacgtgaaagtgccaacagatGAATTCataacttcgtataatgtatgctatacgaagttatatccaggctgcagaatcggagaaccacgcccgagctgcagagcctcaagcaaccaaaagc (SEQ ID NO: 299)
Sox2-V5 | taccagagcggcccggtgcccggcacggccattaacggcacactgcccctgtcgcacatgGGCAAGCCCATCCCCAACCCCCTGCTGGGCCTGGACAGCACGtgagggctggactgcgaactggagaaggggagagatttcaaagagatacaagggaattg (SEQ ID NO: 300)
Mecp2-L1-loxP | tgtttgaccaatatcaccagcaacctaaagctgttaagaaatctttgggccccagcttgaGCTAGCataacttcgtataatgtatgctatacgaagttatcccaaggatacagtatcctagggaagttaccaaaatcagagatagtatgcagcagccagg (SEQ ID NO: 301)
Mecp2-L2-loxP | ccagcaacctaaagctgttaagaaatctttgggccccagcttgacccaaggatacagtatGCTAGCataacttcgtataatgtatgctatacgaagttatcctagggaagttaccaaaatcagagatagtatgcagcagccaggggtctcatgtgtggca (SEQ ID NO: 302)
Mecp2-R1-loxP | ccactcctctgtactccctggcttttccacaatccttaaactgaaggagtgaggtctagtataacttcgtatagcatacattatacgaagttatGAATTCacttgggggtcattgggctagactgaatatctttggttggtacccagacctaatccacca (SEQ ID NO: 303)
Mecp2-R2-loxP | ccaaaaaggctggacaccatgccttggttaaaatggaggaatgttttggtggtggattagGAATTCataacttcgtataatgtatgctatacgaagttatgtctgggtaccaaccaaagatattcagtctagcccaatgacccccaagtactagacctca (SEQ ID NO: 304)
[00369] Oligonucleotides used for off-target analysis
Gene target Direction Sequence (5' to 3')
F Atgacatgacctaagtaaaccc (SEQ ID NO: 305)
OTl_Sox2_Stop
R Ctccactctgtactaggcac (SEQ ID NO: 306)
F tgatggtttttggtgactgcc (SEQ ID NO: 307)
OT2_Sox2_Stop
R gacagatcatagatagaaaattg (SEQ ID NO: 308)
F aaactgaggcacagagtctg (SEQ ID NO: 309)
OT3_Sox2_Stop
R gtgacgaagccactttgacc (SEQ ID NO: 310)
F caccttaggttcatggcattc (SEQ ID NO: 311)
OT4_Sox2_Stop
R gatggatcagtgattaagagc (SEQ ID NO: 312)
F accatgatggactgtaccatc (SEQ ID NO: 313)
OT5_Sox2_Stop
R catggacgtcattactagatg (SEQ ID NO: 314)
F ttcctcgaagatgaaatgatt (SEQ ID NO: 315)
OT6_Sox2_Stop
R cagtgtgcagactctgagag (SEQ ID NO: 316)
F atgtgccacacaaggcaggc (SEQ ID NO: 317)
OT7_Sox2_Stop
R gcaaaacctctgaaagttgac (SEQ ID NO: 318)
F ttcctgtcctggcttccttc (SEQ ID NO: 319)
OT8_Sox2_Stop
R gcactagttgtcacgtgatg (SEQ ID NO: 320)
F gactcagatttccaagccatg (SEQ ID NO: 321)
OT9_Sox2_Stop
R acatctctgagctctaagcc (SEQ ID NO: 322)
F tgccatgtgctgtgttcacc (SEQ ID NO: 323)
OT10_Sox2_Stop
R ttgatatttaagacagggtctc (SEQ ID NO: 324)
F gtaaggaatgtaagaactcttg (SEQ ID NO: 325)
OTl l_Sox2_Stop
R aattctcaactgaggaatactg (SEQ ID NO: 326)
F tctcagacagaaacgctgtg (SEQ ID NO: 327)
OT12_Sox2_Stop
R gacttgatatgccaggatgag (SEQ ID NO: 328)
F agctgacagaagacgatgag (SEQ ID NO: 329)
OT13_Sox2_Stop
R taaacccaagcaaaggtcatg (SEQ ID NO: 330)
F gctggtgagatggctcagtg (SEQ ID NO: 331)
OTI Nanog Stop
R gtcttaacctgcttatagcaac (SEQ ID NO: 332)
F agatcccattacggatggttg (SEQ ID NO: 333)
OT2_Nanog_Stop
R ggacactcaccaatgtttgg (SEQ ID NO: 334)
F tagattatctagtgtgttccac (SEQ ID NO: 335)
OT3_Nanog_Stop R agtttcagtgctcagagcac (SEQ ID NO: 336)
F gacactttctaagtgggcttg (SEQ ID NO: 337)
OT4_Nanog_Stop
R gttaagggacagtgaatatcc (SEQ ID NO: 338)
F tcccatctaccctctgactg (SEQ ID NO: 339)
OT5_Nanog_Stop
R gcctgaagaaaagaaggtcc (SEQ ID NO: 340)
F tctgaggtgagcaaagcatg (SEQ ID NO: 341)
OT6_Nanog_Stop
R aatccaccatgtcttccgtg (SEQ ID NO: 342)
F caattttctcagtgaggtagg (SEQ ID NO: 343)
OT7_Nanog_Stop
R cttgttcagtgcattgctgc (SEQ ID NO: 344)
F tctcttcagaaaagagtaggc (SEQ ID NO: 345)
OT8_Nanog_Stop
R gttggcaactgcactctgtg (SEQ ID NO: 346)
F agctcatgcatgctgagctg (SEQ ID NO: 347)
OT9_Nanog_Stop
R aacttcaagtggaactgcttg (SEQ ID NO: 348)
F cacacacacactgaataaaatg (SEQ ID NO: 349)
OTl_Mecp2_Left2
R aagctggctttgagcaggac (SEQ ID NO: 350)
F tagtcacttatgtttactcctc (SEQ ID NO: 351)
OT2_Mecp2_Left2
R gtgatgccagcagttggcag (SEQ ID NO: 352)
F tcactttccctcagtactcc (SEQ ID NO: 353)
OT3_Mecp2_Left2
R caagtatcattctctgaacaag (SEQ ID NO: 354)
F gaactttgagacagggtctc (SEQ ID NO: 355)
OT4_Mecp2_Left2
R gacagagcagcttggccttc (SEQ ID NO: 356)
OT1 Mecp2 Right F gcagcaccagtggaatattac (SEQ ID NO: 357)
1 R gcctattgatgaatctgccc (SEQ ID NO: 358)
OT2 Mecp2 Right F acagatgcagccactcacag (SEQ ID NO: 359)
1 R gtccaagtcacttctcccac (SEQ ID NO: 360)
OT3 Mecp2 Right F tccgacaatggtttatgtctg (SEQ ID NO: 361)
1 R agatactagcagtgcagctg (SEQ ID NO: 362)
OT4 Mecp2 Right F gttcctgctggttttgtttcg (SEQ ID NO: 363)
1 R tagaccaatctacaaccacag (SEQ ID NO: 364)
OT5 Mecp2 Right F tgctgtgaaactcaggcatg (SEQ ID NO: 365)
1 R cttctaagacaagccagaaag (SEQ ID NO: 366)
OT6 Mecp2 Right F cggcataaacctcccattag (SEQ ID NO: 367)
1 R ctctgtgcttgtaaggcaaac (SEQ ID NO: 368)
OT7 Mecp2 Right F gccagacaataattcccaag (SEQ ID NO: 369)
1 R ctgatattgctactgctaacc (SEQ ID NO: 370)
OT8 Mecp2 Right F ccattgtgaaagtgggatgc (SEQ ID NO: 371)
1 R ggctgctctcgtaaacaaaac (SEQ ID NO: 372)
OT9 Mecp2 Right F gtcactctcatgtgcaggtg (SEQ ID NO: 373)
1 R ctagcacttgggaagcaaatg (SEQ ID NO: 374)
OT10 Mecp2 Right F ctaatcacacttctacaagctg (SEQ ID NO: 375)
1 R agagaggctccaattgttag (SEQ ID NO: 376)
[00370] Example 4
[00371] Multiplexed activation of endogenous genes by CRISPR-on, an RNA-guided transcriptional activator system
[00372] As described in Example 2, CRISPR-on is a two-component transcriptional activator consisting of a nuclease-dead Cas9 (dCas9) protein fused with a transcriptional activation domain and single guide RNAs (sgRNAs) with sequences complementary to gene promoters. It is demonstrated that CRISPR-on can efficiently activate exogenous reporter genes in both human and mouse cells in a tunable manner. In addition, robust reporter gene activation in vivo can be achieved by injecting the system components into mouse zygotes. Furthermore, CRISPR-on can activate the endogenous IL1RN, SOX2, and OCT4 genes. The most efficient gene activation was achieved by clusters of 3 to 4 sgRNAs binding to the proximal promoters, suggesting their synergistic action in gene induction. Significantly, when sgRNAs targeting multiple genes were simultaneously introduced into cells, robust multiplexed endogenous gene activation was achieved. Genome-wide expression profiling demonstrated high specificity of the system.
[00373] Materials and Methods
[00374] Cloning
[00375] A two-step fusion PCR was performed to amplify the Cas9 nickase ORF without a stop codon from the pX335 vector (Addgene: 42335) and to incorporate the H840A mutation, an EcoRI-AgeI restriction site on the 5' end, and an FseI site on the 3' end (EcoRI-AgeI-dCas9-FseI fragment). The 3x minimal VP16 activation domain coding fragment (VP48) was excised from a vector (Addgene: 20342) containing the NLSM2rtTA coding sequence by FseI and EcoRI digestion (FseI-TA-EcoRI fragment). The two fragments were ligated into the pCR8/GW/TOPO (Invitrogen) vector digested with EcoRI to generate a Gateway-compatible dCas9VP48 coding plasmid. The dCas9VP48 coding sequence was subsequently excised and cloned into the pX335 vector (Addgene: 42335) by AgeI-EcoRI digestion, replacing the Cas9 nickase, to create a chimeric vector that expresses both dCas9VP48 and the sgRNA (dCas9VP48-U6-sgRNA-chimeric). sgRNA spacers were cloned into the BbsI-digested vector by annealing oligos as previously described (Cong et al, Science; 339 (6121):819-823 (2013)). For construction of dCas9VP160 (SEQ ID NO: 16), a gBlocks gene fragment containing the coding sequence for 10 tandem repeats of the VP16 domain separated by glycine-serine (GS) linkers was ordered from Integrated DNA Technologies (IDT) and amplified with PCR primers containing FseI and EcoRI sites to replace the VP48 fragment in pCR8-dCas9VP48, generating pCR8-dCas9VP160. A pmax-DEST Gateway destination vector was constructed by replacing the GFP coding sequence in pmaxGFP (Clontech) with a Gateway destination cassette (Invitrogen). The pCR8-dCas9VP160 vector was then recombined with pmax-DEST via an LR Clonase-mediated reaction to create the pmax-dCas9VP160 expression plasmid. For the endogenous gene experiments, sgRNAs were cloned into a PBneo-sgRNA expression vector by the oligo cloning method mentioned above.
[00376] Culturing and transfection of HeLa, HEK293T and NIH3T3
[00377] HeLa, HEK293T and NIH3T3 cells were cultured in DMEM with 10% inactivated FBS, 1% Pen/Strep, 1% glutamine, and 1% non-essential amino acids. Transfection was performed using Fugene HD (Promega) at a 2:6 ratio (a total DNA amount of 2 μg and 6 μl of Fugene HD reagent) in 6-well plates. For TetO::tdTomato experiments, 2 μg of the chimeric vector was used. For endogenous gene activation experiments, the U6 promoter - sgRNA - terminator sequence was amplified from the PBneo-sgRNA plasmids, purified with a PCR purification kit (QIAGEN), and transfected as linear DNA (^g total sgRNA-expressing DNA) with ^g of pmax-dCas9VP160 plasmid. When multiple sgRNAs targeting multiple genes were used, the total amount of sgRNA-expressing DNA was divided evenly among the genes first, and then among the sgRNAs targeting each gene.
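The two-level division of sgRNA-expressing DNA described above (first among genes, then among the sgRNAs of each gene) can be sketched as follows. This is an illustration only: the function name `split_sgrna_dna`, the dictionary layout, and the total of 1.2 μg used in the example are assumptions chosen for easy arithmetic and are not amounts taken from the text.

```python
def split_sgrna_dna(total_ug, sgrnas_by_gene):
    """Divide total_ug evenly among genes, then evenly among each gene's sgRNAs.

    Returns a dict mapping each sgRNA name to its DNA amount in micrograms.
    """
    if not sgrnas_by_gene:
        raise ValueError("at least one target gene is required")
    per_gene = total_ug / len(sgrnas_by_gene)
    amounts = {}
    for gene, names in sgrnas_by_gene.items():
        if not names:
            raise ValueError(f"no sgRNAs listed for {gene}")
        for name in names:
            amounts[name] = per_gene / len(names)
    return amounts


if __name__ == "__main__":
    # Hypothetical total amount; the gene and sgRNA groupings follow the naming in Table 14.
    plan = split_sgrna_dna(
        1.2,
        {
            "IL1RN": ["sgIL1RN-1", "sgIL1RN-2", "sgIL1RN-3"],
            "SOX2": ["sgSOX2-5", "sgSOX2-6", "sgSOX2-7"],
            "TetO": ["sgTetO"],
        },
    )
    for name, ug in plan.items():
        print(f"{name}: {ug:.3f} ug")
```

Under this scheme a gene targeted by a single sgRNA receives the same total DNA as a gene targeted by several, so per-sgRNA amounts differ between genes.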
[00378] Transgene activation experiment in mouse embryonic stem cells (mESC)
[00379] Mouse ESCs from mice carrying a Dox-inducible Musashi-1 (MSI1) allele in the Col1a1 locus (Kharas et al, Nature medicine; 16 (8):903-908 (2010)) were transfected with dCas9VP48 using Xfect mESC transfection reagent (Clontech) or were cultured in mouse ES medium with 2 μg/ml doxycycline. After 48 hours, protein lysates were prepared on ice from cell pellets in SDS-Tris lysis buffer (10% SDS, 10% glycerol, 0.1 M DTT, 0.12 g/ml urea) supplemented with protease and phosphatase inhibitor tablets (1 tablet/10 ml, Roche) and analyzed by western blot. Blots were probed with primary rabbit anti-MSI1 (Cell Signaling Technologies, #2154) and mouse anti-alpha-tubulin (SIGMA) antibodies. Secondary HRP-conjugated anti-rabbit/anti-mouse IgG antibodies were used, and signals were visualized with ECL (GE Healthcare).
[00380] One cell embryo injection
[00381] All animal procedures were performed according to NIH guidelines and approved by the Committee on Animal Care at MIT. B6D2F1 (C57BL/6 X DBA2) female mice and ICR mouse strains were used as embryo donors and foster mothers, respectively. Super-ovulated female B6D2F1 mice (7-8 weeks old) were mated to B6D2F1 stud males, and fertilized embryos were collected from oviducts.
dCas9VP48 plasmid (200 ng/μl), Nanog::EGFP construct (200 ng/μl), and sgRNAs (50 ng/μl each) were mixed and injected into the cytoplasm of fertilized eggs with well-recognized pronuclei in M2 medium (Sigma). Injected oocytes were cultured in KSOM medium for 96 h to examine their development in vitro. Images of the resulting embryos were acquired with an inverted microscope under the same exposure parameters.
[00382] Bioinformatics analysis of gene expression and CRISPR off-target analysis
[00383] An Affymetrix U133A 2.0 array was used for microarray gene expression analysis. Gene expression values were processed and normalized using the affy package for R (Gautier et al., 2004).
[00384] qRT-PCR expression analysis
[00385] Total RNA was isolated using the RNeasy Kit (QIAGEN) and reverse transcribed using the SuperScript III First Strand Synthesis kit (Invitrogen). Quantitative RT-PCR analysis was performed in triplicate using the ABI 7900 HT system with FAST SYBR Green Master Mix (Applied Biosystems). Gene expression was normalized to GAPDH. Error bars represent the standard deviation (SD) of the mean of triplicate reactions. Primer sequences are included in Table 13.
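The text states that expression was normalized to GAPDH but does not state how fold changes were derived from the qRT-PCR data. One common approach is the 2^-ΔΔCt calculation, sketched below under that assumption together with an assumed amplification efficiency of 100%. The function names and the Ct values in the example are made up for illustration and are not data from this study.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene versus a control sample by the 2^-ddCt method.

    Each argument is a list of replicate Ct values. ct_ref and ct_ref_ctrl are
    the reference gene (GAPDH in the text) in the sample and in the control.
    Assumes 100% amplification efficiency (an assumption of this sketch).
    """
    d_ct_sample = mean(ct_target) - mean(ct_ref)
    d_ct_control = mean(ct_target_ctrl) - mean(ct_ref_ctrl)
    return 2 ** -(d_ct_sample - d_ct_control)


def replicate_sd(ct_values):
    """Sample standard deviation of replicate Ct values (needs at least two)."""
    return stdev(ct_values)


if __name__ == "__main__":
    # Made-up triplicate Ct values for illustration only.
    fold = relative_expression(
        ct_target=[24.0, 24.0, 24.0],
        ct_ref=[18.0, 18.0, 18.0],
        ct_target_ctrl=[27.0, 27.0, 27.0],
        ct_ref_ctrl=[18.0, 18.0, 18.0],
    )
    print(fold)
```

If primer efficiencies differ appreciably from 100%, an efficiency-corrected model would be needed instead of the fixed base of 2 used here.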
[00386] Table 13 qRT-PCR primers.
Gene | Forward primer | Reverse primer
SOX2 | CACCTACAGCATGTCCTACTCG (SEQ ID NO: 377) | GGTTTTCTCCATGCTGTTTCTT (SEQ ID NO: 378)
IL1RN | GACCCTCTGGGAGAAAATCC (SEQ ID NO: 379) | GTCCTTGCAAGTATCCAGCA (SEQ ID NO: 380)
OCT4 | GCTCGAGAAGGATGTGGTCC (SEQ ID NO: 381) | CGTTGTGCATAGTCGCTGCT (SEQ ID NO: 382)
GAPDH | CGAGATCCCTCCAAAATCAA (SEQ ID NO: 383) | ATCCACAGTCTTCTGGGTGG (SEQ ID NO: 384)
[00452] Table 14 sgRNA designs, DNA targets, oligos for cloning.
Name | Target | Target sequence | Forward oligo | Reverse oligo
sgTetO | Tet binding site | gCTTTTCTCTATCACTGATAGGG (SEQ ID NO: 385) | caccGCTTTTCTCTATCACTGATA (SEQ ID NO: 386) | aaacTATCAGTGATAGAGAAAAGC (SEQ ID NO: 387)
sgTetOMut | Mutant version of sgTetO | gCTTTTCTtTAtCAtTGgTAGGG (SEQ ID NO: 388) | caccGCTTTTCTTTATCATTGGTA (SEQ ID NO: 389) | aaacTACCAATGATAAAGAAAAGC (SEQ ID NO: 390)
sgNanog-1 | mouse Nanog | GTAATGCAAAAGAAGCTGTAAGG (SEQ ID NO: 391) | caccGTAATGCAAAAGAAGCTGTA (SEQ ID NO: 392) | aaacTACAGCTTCTTTTGCATTAC (SEQ ID NO: 393)
sgNanog-2 | mouse Nanog | GATCTCTAGTGGGAAGTTTCAGG (SEQ ID NO: 394) | caccGATCTCTAGTGGGAAGTTTC (SEQ ID NO: 395) | aaacGAAACTTCCCACTAGAGATC (SEQ ID NO: 396)
sgNanog-3 | mouse Nanog | GTCTGTAGAAAGAATGGAAGAGG (SEQ ID NO: 397) | caccGTCTGTAGAAAGAATGGAAG (SEQ ID NO: 398) | aaacCTTCCATTCTTTCTACAGAC (SEQ ID NO: 399)
sgNanog-4 | mouse Nanog | GCTCTTCACATTGGGAAACCTGG (SEQ ID NO: 400) | caccGCTCTTCACATTGGGAAACC (SEQ ID NO: 401) | aaacGGTTTCCCAATGTGAAGAGC (SEQ ID NO: 402)
sgNanog-5 | mouse Nanog | GCGTTAAAAAGCCGCACTTTTGG (SEQ ID NO: 403) | caccGCGTTAAAAAGCCGCACTTT (SEQ ID NO: 404) | aaacAAAGTGCGGCTTTTTAACGC (SEQ ID NO: 405)
sgNanog-6 | mouse Nanog | GAGTGTTTAAATTAATGTAGAGG (SEQ ID NO: 406) | caccGAGTGTTTAAATTAATGTAG (SEQ ID NO: 407) | aaacCTACATTAATTTAAACACTC (SEQ ID NO: 408)
sgNanog-7 | mouse Nanog | GAGTTTCACGTACCCGAGACTGG (SEQ ID NO: 409) | caccGAGTTTCACGTACCCGAGAC (SEQ ID NO: 410) | aaacGTCTCGGGTACGTGAAACTC (SEQ ID NO: 411)
sgIL1RN-1 | IL1RN | GCACCTCAGAGAGTACAAGGAGG (SEQ ID NO: 412) | caccGCACCTCAGAGAGTACAAGG (SEQ ID NO: 413) | aaacCCTTGTACTCTCTGAGGTGC (SEQ ID NO: 414)
sgIL1RN-2 | IL1RN | gGGCTGACTTGATGCCAAGCAGG (SEQ ID NO: 415) | caccGGGCTGACTTGATGCCAAGC (SEQ ID NO: 416) | aaacGCTTGGCATCAAGTCAGCCC (SEQ ID NO: 417)
sgIL1RN-3 | IL1RN | gGTTTCCAGGAGGGTGACTCAGG (SEQ ID NO: 418) | caccGGTTTCCAGGAGGGTGACTC (SEQ ID NO: 419) | aaacGAGTCACCCTCCTGGAAACC (SEQ ID NO: 420)
sgIL1RN-4 | IL1RN | gGGTTCTTATCTGCGTAAGATGG (SEQ ID NO: 421) | caccGGGTTCTTATCTGCGTAAGA (SEQ ID NO: 422) | aaacTCTTACGCAGATAAGAACCC (SEQ ID NO: 423)
sgIL1RN-5 | IL1RN | gATTGGGAACAAGCCAGACAAGG (SEQ ID NO: 424) | caccGATTGGGAACAAGCCAGACA (SEQ ID NO: 425) | aaacTGTCTGGCTTGTTCCCAATC (SEQ ID NO: 426)
sgIL1RN-6 | IL1RN | GATATGCTTTTGAGGGACCTAGG (SEQ ID NO: 427) | caccGATATGCTTTTGAGGGACCT (SEQ ID NO: 428) | aaacAGGTCCCTCAAAAGCATATC (SEQ ID NO: 429)
sgSOX2-1 | SOX2 | GGGGAGAGGAGGAGGGGAGGCGG (SEQ ID NO: 430) | caccGGGGAGAGGAGGAGGGGAGG (SEQ ID NO: 431) | aaacCCTCCCCTCCTCCTCTCCCC (SEQ ID NO: 432)
sgSOX2-2 | SOX2 | GAGAGAGGCAAACTGGAATCAGG (SEQ ID NO: 433) | caccGAGAGAGGCAAACTGGAATC (SEQ ID NO: 434) | aaacGATTCCAGTTTGCCTCTCTC (SEQ ID NO: 435)
sgSOX2-3 | SOX2 | gCATGTGACGGGGGCTGTCAGGG (SEQ ID NO: 436) | caccGCATGTGACGGGGGCTGTCA (SEQ ID NO: 437) | aaacTGACAGCCCCCGTCACATGC (SEQ ID NO: 438)
sgSOX2-4 | SOX2 | GCTGCCGGGTTTTGCATGAAAGG (SEQ ID NO: 439) | caccGCTGCCGGGTTTTGCATGAA (SEQ ID NO: 440) | aaacTTCATGCAAAACCCGGCAGC (SEQ ID NO: 441)
sgSOX2-5 | SOX2 | GCCGGCCGCGCGGGGGAGGCCGG (SEQ ID NO: 442) | caccGCCGGCCGCGCGGGGGAGGC (SEQ ID NO: 443) | aaacGCCTCCCCCGCGCGGCCGGC (SEQ ID NO: 444)
sgSOX2-6 | SOX2 | GGCAGGCGAGGAGGGGGAGGAGG (SEQ ID NO: 445) | caccGGCAGGCGAGGAGGGGGAGG (SEQ ID NO: 446) | aaacCCTCCCCCTCCTCGCCTGCC (SEQ ID NO: 447)
sgSOX2-7 | SOX2 | GTATCCCCTCTCGCAGCAACAGG (SEQ ID NO: 448) | caccGTATCCCCTCTCGCAGCAAC (SEQ ID NO: 449) | aaacGTTGCTGCGAGAGGGGATAC (SEQ ID NO: 450)
sgSOX2-8 | SOX2 | GCAGGGTACTTAAATGAGGATGG (SEQ ID NO: 451) | caccGCAGGGTACTTAAATGAGGA (SEQ ID NO: 452) | aaacTCCTCATTTAAGTACCCTGC (SEQ ID NO: 453)
sgSOX2-9 | SOX2 | GCAGCTAAGGTGCGGGGGTGGGG (SEQ ID NO: 454) | caccGCAGCTAAGGTGCGGGGGTG (SEQ ID NO: 455) | aaacCACCCCCGCACCTTAGCTGC (SEQ ID NO: 456)
sgSOX2-10 | SOX2 | GGCTGTCCAACTCGTATTTCTGG (SEQ ID NO: 457) | caccGGCTGTCCAACTCGTATTTC (SEQ ID NO: 458) | aaacGAAATACGAGTTGGACAGCC (SEQ ID NO: 459)
sgOCT4-1 | OCT4 | GAAGGAAGGCGCCCCAAGCCGGG (SEQ ID NO: 460) | caccGAAGGAAGGCGCCCCAAGCC (SEQ ID NO: 461) | aaacGGCTTGGGGCGCCTTCCTTC (SEQ ID NO: 462)
sgOCT4-2 | OCT4 | GGTGAAATGAGGGCTTGCGAAGG (SEQ ID NO: 463) | caccGGTGAAATGAGGGCTTGCGA (SEQ ID NO: 464) | aaacTCGCAAGCCCTCATTTCACC (SEQ ID NO: 465)
sgOCT4-3 | OCT4 | GGCCCCGCCCCCTGGATGGGTGG (SEQ ID NO: 466) | caccGGCCCCGCCCCCTGGATGGG (SEQ ID NO: 467) | aaacCCCATCCAGGGGGCGGGGCC (SEQ ID NO: 468)
sgOCT4-4 | OCT4 | GGGGGGAGAAACTGAGGCGAAGG (SEQ ID NO: 469) | caccGGGGGGAGAAACTGAGGCGA (SEQ ID NO: 470) | aaacTCGCCTCAGTTTCTCCCCCC (SEQ ID NO: 471)
sgOCT4-5 | OCT4 | GGTGGTGGCAATGGTGTCTGTGG (SEQ ID NO: 472) | caccGGTGGTGGCAATGGTGTCTG (SEQ ID NO: 473) | aaacCAGACACCATTGCCACCACC (SEQ ID NO: 474)
sgOCT4-6 | OCT4 | GACACAACTGGCGCCCCTCCAGG (SEQ ID NO: 475) | caccGACACAACTGGCGCCCCTCC (SEQ ID NO: 476) | aaacGGAGGGGCGCCAGTTGTGTC (SEQ ID NO: 477)
The last three bases of each target sequence are the PAM (5'-NGG-3') motif. Lowercase italic letters in the target sequences indicate either a mismatch (a leading g to allow efficient U6 transcription) or changes introduced for the mutant control (as in sgTetO-mut). Lowercase letters in the oligo sequences indicate overhangs compatible with the BbsI-digested vectors.
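The relationship between the three sequence columns of Table 14 can be sketched in a few lines: the forward oligo is the 20-nt spacer (target minus PAM) with a `cacc` overhang, and the reverse oligo is the reverse complement of the spacer with an `aaac` overhang. The function names `reverse_complement` and `design_cloning_oligos` are illustrative assumptions, and the sketch presumes the target is written 5' to 3' with a 3-nt PAM at its end, as in Table 14.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_cloning_oligos(target_with_pam, pam_len=3):
    """Derive the annealing oligo pair for a target written 5'->3' with its PAM.

    The spacer is upper-cased so that the lowercase overhangs stand out,
    matching the convention used in Table 14.
    """
    spacer = target_with_pam[:-pam_len].upper()
    forward = "cacc" + spacer
    reverse = "aaac" + reverse_complement(spacer)
    return forward, reverse


if __name__ == "__main__":
    fwd, rev = design_cloning_oligos("gCTTTTCTCTATCACTGATAGGG")  # sgTetO target
    print(fwd)
    print(rev)
```

Running the sketch on a Table 14 target and comparing the result with the listed oligos is a quick way to detect transcription errors in either column.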
[00387] Results
[00388] Fusion of nuclease-deficient Cas9 to a transactivation domain generated an RNA-programmable transcription factor
[00389] To generate a CRISPR/Cas-based transcription activator, the H840A mutation was introduced into the human codon-optimized Cas9(D10A) nickase (Cong et al, Science; 339 (6121):819-823 (2013)) to create a nuclease-deficient dCas9 (H840A; D10A), and a 3x minimal VP16 transcriptional activation domain (VP48) was fused to its C-terminus (Figure 23A). We first tested dCas9VP48 in human HeLa cells carrying an integrated tdTomato reporter transgene under the control of a tetracycline-inducible promoter composed of seven copies of the rtTA binding site and a CMV minimal promoter (TetO::tdTomato). As a positive control, these cells constitutively express the rtTA transactivator, which induced tdTomato expression upon doxycycline treatment (Figure 23B, panel ii). Transient transfection of dCas9VP48 with an sgRNA complementary to the rtTA binding site (sgTetO) activated the TetO::tdTomato reporter in the absence of doxycycline at almost the same efficiency as the positive control (Figure 23B, panel iv). Transfection of dCas9VP48 without sgRNA did not activate tdTomato expression (Figure 23B, panel iii). Activation of the TetO::tdTomato reporter lasted for one week but weakened afterwards (Figure 27). Similarly, co-expression of dCas9VP48 with sgTetO activated the tdTomato transgene in mouse NIH3T3 cells carrying an integrated TetO::tdTomato reporter (Figure 28B, panel iv), while expression of dCas9VP48 alone did not activate tdTomato expression (Figure 28B, panel iii). These results indicate that CRISPR-on activates a transgene reporter robustly in human and mouse cells, to a level similar to that of rtTA in the presence of doxycycline, and that the binding of dCas9VP48 to the tetO promoter is strictly dependent on sgTetO. The higher fraction of fluorescent HeLa cells compared with NIH3T3 cells is likely due to higher transfection efficiency.
[00390] CRISPR-on was then tested for its ability to activate a single-copy transgene in embryonic stem cells (ESCs). For this, dCas9VP48 was co-transfected with sgTetO into ESCs carrying a Tet-inducible Musashi1 (MSI1) transgene at the Col1a1 locus and the rtTA-M2 in the Rosa26 locus (Kharas et al., Nature medicine; 16 (8):903-908 (2010)) (Figure 29). Transient transfection of dCas9VP48 alone did not activate MSI1 expression (Figure 29, Lane 1), while co-transfection of dCas9VP48 with sgTetO or addition of doxycycline (positive control) activated MSI1 expression (Figure 29, Lanes 2 and 7). Neither expression of dCas9VP48 with a mutant TetO sgRNA (sgTetO-mut) carrying mismatches to the TetO binding sites (Figure 29, Lane 3) nor expression of sgTetO with dCas9 lacking an activation domain activated MSI1 expression (Figure 29, Lane 4).
[00391] To further characterize the system, HEK293T/TetO::tdTomato cells were transfected with the dCas9 activator and a serial titration of sgRNAs (Figure 30). A near-linear relationship was observed between the amount of sgTetO transfected and the mean fluorescence measured by FACS (Figure 30B), indicating that the level of gene activation could be controlled precisely using CRISPR-on.
[00392] To test whether CRISPR-on can activate genes in vivo, dCas9VP48 plasmid, seven different sgRNAs (sgNanog-1~7) targeting the mouse Nanog promoter, and a Nanog::EGFP construct containing the 1 kb promoter and 5' UTR of Nanog were co-injected into mouse zygotes (Figures 23C and 23D). Two days after injection, a GFP signal was detected in 4-cell embryos by fluorescence microscopy, and higher GFP expression was observed in morulae and blastocysts on day 3 and day 4, whereas no GFP signal was observed in control embryos without the sgRNAs. This indicates that dCas9VP48/sgNanogs can specifically activate the GFP transgene in mouse embryos.
[00393] Activation of endogenous genes
[00394] Having established that the CRISPR-on system can activate reporter transgenes, sgRNAs targeting the endogenous human IL1RN gene were designed and tested for transactivation activity in HEK293T cells. To identify the binding sites most efficient for gene induction, six sgRNAs were designed to span the 1 kb IL1RN promoter (Figures 31A-31B). Initially, dCas9VP48 was transfected with all six sgRNAs, but this failed to induce IL1RN gene expression (Figures 31A-31B). To test whether a stronger activation domain can activate IL1RN, a VP160 domain containing 10 tandem copies of the VP16 motif was fused with dCas9 to generate dCas9VP160 (Figure 24A). When co-transfected with multiple but not single sgRNAs, dCas9VP160 readily activated IL1RN (Figures 24B and 24C). Transduction of the three proximal sgRNAs (sgIL1RN1~3) activated IL1RN approximately 6 fold, whereas the three distal sgRNAs (sgIL1RN4~6) did not produce robust induction. Addition of sgIL1RN4~6 to the proximal sgRNAs (sgIL1RN1~3) did not significantly augment expression (Figure 24C). These data suggest that gene activation is synergistically promoted by multiple dCas9VP160/sgRNA binding events at the proximal region of the IL1RN promoter.
[00395] A similar result was obtained with 10 sgRNAs spanning the SOX2 promoter (Figures 24D and 24E). As for IL1RN, expression of single sgRNAs did not yield strong activation of SOX2, while the triple sgRNA combinations (3~5, 4~6, 5~7, 8~10) activated SOX2 more than 4 fold. Seven-fold activation was achieved with sgSOX2-4~6 and sgSOX2-5~7, while the more distal sgRNAs (sgSOX2-8~10) or those downstream of the TSS (sgSOX2-1~2) were less potent. The quintuple combination sgSOX2-1~5 had lower activity than the triple combination sgSOX2-3~5, suggesting that sgRNAs downstream of the TSS (sgSOX2-1~2) may be detrimental to activation. Binding of dCas9VP160 downstream of transcriptional start sites may sterically hinder transcription by blocking polymerase, consistent with a previous report on CRISPRi (Qi et al, Cell; 152 (5):1173-1183 (2013)). To further confirm this observation, six sgRNAs were designed spanning the OCT4 promoter, including two targeting sites downstream of the TSS (sgOCT4-1~2) (Figure 24F). An eight-fold activation was achieved with sgOCT4-3~6, whereas all six sgRNAs together (sgOCT4-1~6) had much lower activity than sgOCT4-3~6, confirming that sgRNAs downstream of the TSS (sgOCT4-1~2) have a negative effect on gene activation (Figure 24G). Therefore, in the IL1RN, SOX2, and OCT4 promoters, three to five dCas9VP160/sgRNAs binding within 300 bp upstream of the TSS induced the most efficient gene activation.
[00396] Multiple exogenous and endogenous genes can be simultaneously activated by CRISPR-on
[00397] Single, double and triple activation of a TetO::tdTomato transgene and the endogenous SOX2 and IL1RN genes (Figure 3A) was tested in HEK293T cells carrying the stably integrated TetO::tdTomato transgene (HEK293T/TetO::tdTomato). Transfection of sgRNAs targeting the individual promoters (sgTetO for TetO::tdTomato, sgSOX2-1~10 for SOX2, or sgIL1RN1~6 for IL1RN) activated the respective genes (TetO: 6.6x; SOX2: 3.5x; IL1RN: 10.7x) without affecting expression of the other two genes (Figure 25A). Simultaneous transfection of sgRNAs targeting two or three promoters activated the corresponding sets of genes (Figure 25A).
[00398] To test whether the system allows the activation of three different endogenous genes in a dose dependent manner, HEK293T cells were co-transfected with dCas9VP160 and the most efficient sgRNAs targeting all three genes (sgIL1RN1~3 for IL1RN, sgSOX2-5~7 for SOX2, and sgOCT4-1~3 for OCT4) in different ratios (Figure 25B). When sgRNAs targeting one gene or two genes were used, only the respective genes were activated. When sgRNAs targeting all three genes were transfected, albeit in different ratios, robust activation of all three genes was observed (Figure 25B). More significantly, when different ratios of sgRNAs targeting SOX2 and IL1RN were used while the OCT4 sgRNAs were held constant, the predicted change in the ratio of SOX2 and IL1RN expression levels was observed, while OCT4 expression remained stable (Figure 25B). These results demonstrate that the CRISPR-on system can be robustly used for multiplexed activation of endogenous genes.
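The ratio design described above (sgRNAs for one gene held constant while the remaining amount is divided between two other genes) can be sketched as a small calculation. The `sgrna_amounts` helper and every amount and ratio below are hypothetical placeholders; the transfection quantities actually used are not stated in this passage.

```python
def sgrna_amounts(fixed, variable_total, variable_ratio):
    """Return per-gene sgRNA amounts: `fixed` entries are kept as given and
    `variable_total` is divided among the other genes by relative parts."""
    parts = sum(variable_ratio.values())
    if parts <= 0:
        raise ValueError("variable_ratio must contain at least one positive part")
    amounts = dict(fixed)
    for gene, part in variable_ratio.items():
        amounts[gene] = variable_total * part / parts
    return amounts


# Hypothetical amounts (arbitrary units): OCT4 sgRNAs constant, SOX2:IL1RN varied.
mix_3_to_1 = sgrna_amounts({"OCT4": 200.0}, 400.0, {"SOX2": 3, "IL1RN": 1})
mix_1_to_3 = sgrna_amounts({"OCT4": 200.0}, 400.0, {"SOX2": 1, "IL1RN": 3})
print(mix_3_to_1)
print(mix_1_to_3)
```

In both hypothetical mixes the OCT4 amount and the total are unchanged, so only the SOX2:IL1RN ratio differs between conditions.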
[00399] CRISPR-on is highly specific
[00400] To test the specificity of CRISPR-on-mediated gene activation, microarray experiments were conducted to compare genome-wide gene expression profiles of cells transfected with dCas9VP160 and specific sgRNAs to cells transfected with dCas9VP160 and sgTetO-mut control sgRNA (Figures 26A-26D). While efficiently activating target genes, CRISPR-on did not cause major perturbations in the transcriptome (Figures 26A and 26B) as only three genes showed an over two fold up regulation upon transduction of dCas9VP160/sgTetO (Figure 26C). While CRISPR-on mediated activation of IL1RN induced the IL1RN target gene 13 fold, only 16 other genes showed an about twofold increase in expression (Figure 26D). The minor upregulation of these genes may be secondary due to the over-expression of tdTomato or IL1RN.
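The specificity criterion used above (genes other than the target showing more than two fold up regulation relative to the control sgRNA) can be expressed as a simple fold-change filter. The sketch below assumes linear-scale expression values keyed by gene name; the `upregulated_genes` helper, the gene names GENE_A to GENE_C and all values are hypothetical and are not the microarray data.

```python
def upregulated_genes(treated, control, min_fold=2.0, exclude=()):
    """Return {gene: fold} for genes whose treated/control ratio exceeds `min_fold`.

    Genes listed in `exclude` (for example the intended target) and genes without
    a positive value in both samples are skipped.
    """
    hits = {}
    for gene, value in treated.items():
        base = control.get(gene)
        if gene in exclude or base is None or base <= 0 or value <= 0:
            continue
        fold = value / base
        if fold > min_fold:
            hits[gene] = fold
    return hits


# Hypothetical linear-scale expression values, for illustration only.
control = {"IL1RN": 10.0, "GENE_A": 50.0, "GENE_B": 80.0, "GENE_C": 20.0}
treated = {"IL1RN": 130.0, "GENE_A": 55.0, "GENE_B": 170.0, "GENE_C": 19.0}

off_target = upregulated_genes(treated, control, exclude=("IL1RN",))
print(off_target)
```

Excluding the intended target keeps the count of perturbed genes separate from the on-target activation itself.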
[00401] Discussion
[00402] Artificial transcription factors (ATFs) are valuable tools for studying gene functions and transcriptional networks. Zinc-finger and TALE transcription factors have been developed over recent decades and show promise in both bioengineering and therapeutic applications (Sera T., Adv Drug Deliv Rev; 61 (7-8):513-526 (2009); Perez-Pinera et al, Nat Methods; 10 (3):239-242 (2013); Maeder et al, Nat Methods; 10 (3):243-245 (2013)). Here, CRISPR-on was established as a novel class of artificial transcription factors based on the CRISPR/Cas system. A major advantage of this system is that only one Cas9 protein is required to activate multiple genes individually or simultaneously and that its DNA binding specificity is determined by sgRNAs, which are designed based on simple RNA/DNA complementarity.
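Because sgRNA design rests on simple RNA/DNA complementarity, candidate target sites can be enumerated directly from a promoter sequence. The following Python sketch assumes a Streptococcus pyogenes-type Cas9, a 20-nucleotide protospacer and a 5'-NGG-3' PAM immediately 3' of the protospacer; the `find_protospacers` helper and the input sequence are hypothetical, and the sketch is not the design procedure used for the sgRNAs described herein.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_protospacers(seq, length=20):
    """Return (strand, start, protospacer, pam) for every site followed by NGG.

    `start` is 0-based on the strand that was scanned; for "-" hits it refers
    to the reverse-complemented sequence.
    """
    seq = seq.upper()
    # The lookahead allows overlapping candidate sites to be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{length}}})([ACGT]GG))")
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits


# Hypothetical sequence: a 20-nt run followed by a TGG PAM.
example = "A" * 20 + "TGG"
for hit in find_protospacers(example):
    print(hit)
```

Scanning both strands reflects that a target site can lie on either strand of the promoter; ranking or filtering the candidates for off-target matches is outside the scope of this sketch.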
[00403] Using CRISPR-on, robust activation was demonstrated of exogenous reporter genes in both human and mouse transformed cells as well as in ES cells. When the system was introduced into one-cell mouse embryos, efficient reporter gene activation occurred. This system can be used to manipulate transcriptional networks in early embryos.
[00404] Robust endogenous gene activation was achieved using the stronger activation domain VP160. Further optimization of activation domains, such as using different linker sequences, may improve the CRISPR-on activation efficiency even further. The promoter scanning experiments demonstrated that efficient activation of endogenous genes could be achieved by three to five sgRNAs binding within a 300 bp region upstream of transcription start sites. Using additional sgRNAs targeting further upstream or downstream regions did not significantly improve the level of induction. These data suggest that a small number of sgRNAs targeting the proximal promoter is sufficient to activate endogenous genes.
[00405] It is shown here that the CRISPR-on system can be used for the simultaneous induction of at least three different endogenous genes. More significantly, the stoichiometry of gene induction of multiple genes can be tuned by adjusting the relative amount of their cognate sgRNAs. Simultaneous activation of multiple endogenous genes with defined stoichiometry opens up novel opportunities for systems biology as it allows for the predictable manipulation of transcriptional networks.
[00406] Finally, with the ease of design and synthesis, a library of sgRNAs could be generated. When introduced into a cell line constitutively expressing dCas9 protein, gene activation screens mediated by RNA (RNAa) could be achieved. Since the specificity components (sgRNA) can be separately designed and constructed from the effector component (Cas protein), the same library of sgRNAs could be used with different dCas9 fusions (e.g., VP160 domain for transactivation, KRAB domain for transcriptional repression, chromatin modifier domains for specific histone modification) to exert different functions at particular genomic loci.
[00407] CRISPR dCas9 fusion peptide sequences
[00408] dCas9VP160-2A-puro
[00409] MYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTD
R HSIKK LIGALLFDSGETAEATRLKRTARRRYTRRK RICYLQEIFSNEMAKVDDSFFHR
LEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH
MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSR
RLENLIAQLPGEKK GLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLA
QIGDQYADLFLAAK LSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVR
QQLPEKYKEIFFDQSK GYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLR
KQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNS
RFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDK LPNEKVLPKHSLLYEYFTV
YNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEI
SGVEDRFNASLGTYHDLLKI IKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA
HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDD
SLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIV
IEMARENQTTQKGQK SRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGR
DMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDK RGKSDNVPSEEVVKKMK
NYWRQLLNAKLI QRKFDNLTKAERGGLSELDKAGFIKRQLVETRQI KHVAQILDSRMN TKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKK YPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKR PLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLI ARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEK PID FLEAKGYKEVKKDLI IKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLAS HYEKLKGSPEDNEQKQLFVEQHKHYLDEI IEQISEFSKRVILADANLDKVLSAYNKHRDK PIREQAENI IHLFTLTNLGAPAAFKYFDT IDRKRYTSTKEVLDATLIHQSI GLYETRI DLSQLGGDSPKKKRKVEASGPAGSGRADALDDFDLDMLGSDALDDFDLDMLGSDALDDFD LDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDA LDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLYIDRSGSGEGRGSLLTCGDVEENPG PRLEMTEYKPTVRLATRDDVPRAVRTLAAAFADYPATRHTVDPDRHIERVTELQELFLTR VGLDIGKVWVADDGAAVAVWTTPESVEAGAVFAEIGPRMAELSGSRLAAQQQMEGLLAPH RPKEPAWFLATVGVSPDHQGKGLGSAVVLPGVEAAERAGVPAFLETSAPRNLPFYERLGF TVTADVEVPEGPRTWCMTRKPGA (SEQ ID NO : 478)
[00410] dCas9VP160-2A-neo
[00411] MYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTD
R HSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHR
LEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH
MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSR
RLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLA
QIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVR
QQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLR
KQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNS
RFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTV
YNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEI
SGVEDRFNASLGTYHDLLKI IKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA
HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDD
SLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIV
IEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGR
DMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMK
NYWRQLLNAKLI QRKFDNLTKAERGGLSELDKAGFIKRQLVETRQI KHVAQILDSRMN
TKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKK
YPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKR
PLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLI
ARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPID
FLEAKGYKEVKKDLI IKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLAS
HYEKLKGSPEDNEQKQLFVEQHKHYLDEI IEQISEFSKRVILADANLDKVLSAYNKHRDK
PIREQAENI IHLFTLTNLGAPAAFKYFDT IDRKRYTSTKEVLDATLIHQSI GLYETRI
DLSQLGGDSPKKKRKVEASGPAGSGRADALDDFDLDMLGSDALDDFDLDMLGSDALDDFD LDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDA LDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLYIDRSGSGEGRGSLLTCGDVEENPG PRLETRMGSAIEQDGLHAGSPAAWVERLFGYDWAQQTIGCSDAAVFRLSAQGRPVLFVKT DLSGALNELQDEAARLSWLATTGVPCAAVLDVVTEAGRDWLLLGEVPGQDLLSSHLAPAE KVSIMADAMRRLHTLDPATCPFDHQAKHRIERARTRMEAGLVDQDDLDEEHQGLAPAELF ARLKARMPDGEDLVVTHGDACLPNIMVENGRFSGFIDCGRLGVADRYQDIALATRDIAEE LGGEWADRFLVLYGIAAPDSQRIAFYRLLDEFF (SEQ ID NO : 479)
[00412] dCas9p65
[00413] MYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTD
R HSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHR
LEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH
MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSR
RLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLA
QIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVR
QQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLR
KQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNS
RFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTV
YNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEI
SGVEDRFNASLGTYHDLLKI IKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA
HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDD
SLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIV
IEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGR
DMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMK
NYWRQLLNAKLI QRKFDNLTKAERGGLSELDKAGFIKRQLVETRQI KHVAQILDSRMN
TKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKK
YPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKR
PLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLI
ARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPID
FLEAKGYKEVKKDLI IKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLAS
HYEKLKGSPEDNEQKQLFVEQHKHYLDEI IEQISEFSKRVILADANLDKVLSAYNKHRDK
PIREQAENI IHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRI
DLSQLGGDSPKKKRKVEASGPASPMEFQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFS
GPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASAL
APAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEA
LLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEA
ITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLSQISSID
(SEQ ID NO: 480)
dCas9KRAB
[00414] MYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTD R HSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHR LEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSR RLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLA QIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVR QQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLR KQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNS RFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTV YNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEI SGVEDRFNASLGTYHDLLKI IKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDD SLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIV IEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGR DMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMK NYWRQLLNAKLI QRKFDNLTKAERGGLSELDKAGFIKRQLVETRQI KHVAQILDSRMN TKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKK YPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKR PLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLI ARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPID FLEAKGYKEVKKDLI IKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLAS HYEKLKGSPEDNEQKQLFVEQHKHYLDEI IEQISEFSKRVILADANLDKVLSAYNKHRDK PIREQAENI IHLFTLTNLGAPAAFKYFDT IDRKRYTSTKEVLDATLIHQSI GLYETRI DLSQLGGDSPKKKRKVEASGPAASPKKKRKVEASMDAKSLTAWSRTLVTFKDVFVDFTRE EWKLLDTAQQIVYRNVMLENYKNLVSLGYQLTKPDVILRLEKGEEPWLVSRGSID
(SEQ ID NO: 481)
[00415] dCas9PCP
[00416] MYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTD
R HSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHR
LEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH
MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSR
RLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLA
QIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVR
QQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLR
KQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNS
RFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTV
YNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEI
SGVEDRFNASLGTYHDLLKI IKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA
HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDD
SLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIV
IEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGR
DMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMK
NYWRQLLNAKLI QRKFDNLTKAERGGLSELDKAGFIKRQLVETRQI KHVAQILDSRMN
TKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKK
YPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKR
PLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLI
ARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPID
FLEAKGYKEVKKDLI IKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLAS
HYEKLKGSPEDNEQKQLFVEQHKHYLDEI IEQISEFSKRVILADANLDKVLSAYNKHRDK
PIREQAENI IHLFTLTNLGAPAAFKYFDT IDRKRYTSTKEVLDATLIHQSI GLYETRI
DLSQLGGDSPKKKRKVEASGPAIDMSKTIVLSVGEATRTLTEIQSTADRQIFEEKVGPLV
GRLRLTASLRQNGAKTAYRVNLKLDQADVVDCSTSVCGELPKVRYTQVWSHDVTIVANST
EASRKSLYDLTKSLVATSQVEDLVVNLVPLGR (SEQ ID NO : 482)
[00417] dCas9MS2
[00418] MYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTD
R HSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHR
LEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH
MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSR
RLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLA
QIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVR
QQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLR
KQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNS
RFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTV
YNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEI
SGVEDRFNASLGTYHDLLKI IKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA
HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDD
SLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIV IEMARENQTTQKGQK SRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGR DMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDK RGKSDNVPSEEVVKKMK NYWRQLLNAKLI QRKFDNLTKAERGGLSELDKAGFIKRQLVETRQI KHVAQILDSRMN TKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKK YPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKR PLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLI ARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEK PID FLEAKGYKEVKKDLI IKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLAS HYEKLKGSPEDNEQKQLFVEQHKHYLDEI IEQISEFSKRVILADANLDKVLSAYNKHRDK PIREQAENI IHLFTLTNLGAPAAFKYFDT IDRKRYTSTKEVLDATLIHQSI GLYETRI DLSQLGGDSPKKKRKVEASGPAMASNFTQFVLVDNGGTGDVTVAPSNFANGIAEWISSNS RSQAYKVTCSVRQSSAQNRKYTIKVEVPKGAWRSYLNMELTIPIFATNSDCELIVKAMQG LLKDGNPIPSAIAANSGIYA (SEQ ID NO: 483)
[00419] dCas9VP160ER
[00420] MYPYDVPDYASPKKKRKVEASDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTD
R HSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHR
LEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAH
MIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSR
RLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLA
QIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALVR
QQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLR
KQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNS
RFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTV
YNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEI
SGVEDRFNASLGTYHDLLKI IKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYA
HLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDD
SLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGRHKPENIV
IEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGR
DMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMK
NYWRQLLNAKLI QRKFDNLTKAERGGLSELDKAGFIKRQLVETRQI KHVAQILDSRMN
TKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKK
YPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKR
PLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLI
ARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPID
FLEAKGYKEVKKDLI IKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLAS
HYEKLKGSPEDNEQKQLFVEQHKHYLDEI IEQISEFSKRVILADANLDKVLSAYNKHRDK
PIREQAENI IHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRI
DLSQLGGDSPKKKRKVEASGPAGSGRADALDDFDLDMLGSDALDDFDLDMLGSDALDDFD
LDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDA LDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLYIDSAGDMRAANLWPSPLMIKRSKK NSLALSLTADQMVSALLDAEPPILYSEYDPTRPFSEASMMGLLTNLADRELVHMINWAKR VPGFVDLTLHDQVHLLECAWLEILMIGLVWRSMEHPVKLLFAPNLLLDRNQGKCVEGMVE IFDMLLATSSRFRMMNLQGEEFVCLKSI ILLNSGVYTFLSSTLKSLEEKDHIHRVLDKI DTLIHLMAKAGLTLQQQHQRLAQLLLILSHIRHMSNKGMEHLYSMKCK VVPLYDLLLEA ADAHRLHAPTSRGGASVEETDQSHLATAGSTSSHSLQKYYITGEAEGFPATV
(SEQ ID NO: 484)
[00421] The teachings of all patents, published applications and references cited herein are incorporated by reference in their entirety.
[00422] While this invention has been particularly shown and described with references to example embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.
Claims
1. A method of mutating one or more target nucleic acid sequences in a stem cell or zygote comprising:
(a) introducing into the stem cell or zygote
(i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to a portion of each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; and
(ii) a Cas nucleic acid sequence or a variant thereof that encodes a Cas protein having nuclease activity; and
(b) maintaining the cell or zygote under conditions in which the one or more RNA sequences hybridize to the portion of each of the one or more target nucleic acid sequences, and the Cas protein cleaves each of the one or more target nucleic acid sequences upon hybridization of the one or more RNA sequences to the portion of the target nucleic acid sequence;
thereby mutating one or more target nucleic acid sequences in the stem cell or zygote.
2. The method of claim 1 wherein the Cas protein is Cas9.
3. The method of any one of the preceding claims wherein the Cas protein cleaves one strand or both strands of one or more of the target nucleic acid sequences.
4. The method of any one of the preceding claims wherein the Cas protein nicks one strand or both strands of one or more of the target nucleic acid sequences.
5. The method of any one of the preceding claims wherein the RNA sequence is from about 10 base pairs to about 150 base pairs in length.
6. The method of any one of the preceding claims wherein the stem cell or zygote is a rodent stem cell or zygote, a primate stem cell or zygote, a canine stem cell or zygote, a feline stem cell or zygote, a bovine stem cell or zygote, an equine stem cell or zygote, a caprine stem cell or zygote, a porcine stem cell or zygote, or an avian stem cell or zygote.
7. The method of claim 6 wherein the rodent stem cell or zygote is a mouse stem cell or zygote.
8. The method of claim 6 wherein the primate stem cell or zygote is a human stem cell or zygote.
9. The method of any one of the preceding claims wherein the stem cell is an embryonic stem cell or an induced pluripotent stem cell.
10. The method of any one of the preceding claims wherein one or more of the target nucleic acid sequences are associated with a disease or condition.
11. The method of any one of the preceding claims wherein one or more of the target nucleic acid sequences are a gene.
12. The method of any one of the preceding claims wherein both copies of one or more of the target nucleic acid sequences in the stem cell or zygote are mutated.
13. The method of any one of the preceding claims wherein the one or more target nucleic acid sequences are endogenous to the stem cell or zygote.
14. The method of any one of the preceding claims wherein one or more of the target nucleic acid sequences encode a transcription factor, a transcriptional co-activator or co-repressor, an enzyme, a chaperone, a heat shock factor, a heat shock protein, a receptor, a secreted protein, a transmembrane protein, a histone, a peripheral membrane protein, a soluble protein, a nuclear protein, a mitochondrial protein, a growth factor, a cytokine, an interferon, a chemokine, a hormone, an extracellular matrix protein, a motor protein, a cell adhesion molecule, a major or minor histocompatibility (MHC) gene, a transporter, a channel, an immunoglobulin (Ig) superfamily (IgSF) gene, a tumor necrosis factor, an NF-kappaB protein, an integrin, a cadherin superfamily member, a selectin, a clotting factor, a complement factor, a plasminogen, plasminogen activating factor, a proto-oncogene, oncogene, or tumor suppressor gene.
15. The method of any one of the preceding claims further comprising introducing one or more nucleic acid sequences that are complementary to a portion of the one or more target nucleic acid sequences cleaved by the Cas protein.
16. The method of claim 15 wherein the one or more nucleic acid sequences are a single stranded oligonucleotide, a double stranded oligonucleotide, a plasmid, a cDNA, a gene block or a PCR product.
17. The method of claim 15 or 16 wherein the one or more nucleic acid sequences replace one or more nucleotides, introduce one or more additional nucleotides, delete one or more nucleotides or a combination thereof in the one or more target nucleic acid sequences.
18. The method of claim 15, 16, or 17 wherein the one or more nucleic acid sequences introduce a point mutation in one or more of the target sequences.
19. The method of claim 15, 16, or 17 wherein the one or more nucleic acid sequences replace one or more mutant nucleotides with one or more wild type nucleotides in one or more of the target sequences.
20. The method of claim 15, 16, 17, 18 or 19 wherein the nucleic acid sequences is from about 10 nucleotides to about 1000 nucleotides.
21. The method of any one of the preceding claims further comprising introducing the stem cell or zygote into a nonhuman mammal.
22. The method of claim 21 wherein the nonhuman mammal is of the same species as the stem cell or zygote.
23. The method of claim 21 wherein the nonhuman mammal is a mouse.
24. The method of any one of the preceding claims further comprising assessing whether the one or more of the target nucleic acid sequences have been mutated.
25. The method of any one of the preceding claims further comprising isolating the stem cell or zygote.
26. The method of any one of the preceding claims wherein 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 target nucleic acid sequences are mutated in the stem cell or zygote.
27. The method of any one of the preceding claims wherein at least two of the target nucleic acid sequences are endogenous nucleic acid sequences.
28. The method of any one of the preceding claims wherein at least two of the target nucleic acid sequences are endogenous genes.
29. The method of any one of the preceding claims wherein at least two of the target nucleic acid sequences are at least 1 kB apart.
30. The method of any one of the preceding claims wherein at least two of the target nucleic acid sequences are on different chromosomes.
31. The method of any one of the preceding claims wherein at least one mutation comprises an insertion of a tag, a transgene, or an insertion of a site recognized by a recombinase.
32. The method of any one of the preceding claims wherein at least one mutation renders expression of an endogenous gene conditional.
33. The method of any one of the preceding claims wherein at least one mutation renders expression of an endogenous gene inducible, repressible, or tissue-specific.
34. The method of any one of the preceding claims wherein the mutations comprise inserting recombination sites (e.g., loxP sites or FRT sites) flanking a selected genomic region, wherein the selected genomic region is optionally within a gene.
35. The method of any one of the preceding claims wherein the mutations comprise inserting a recombination-site-STOP-recombination site cassette (e.g., a loxP-STOP-loxP or FRT-STOP-FRT cassette) in a gene, between a promoter and a coding region of a gene, or in a regulatory region of a gene.
36. The method of claim 35, wherein the recombination-site-STOP-recombination site cassette is positioned so as to disrupt expression of the gene and wherein excision of the cassette by a recombinase renders the gene expressible.
37. An isolated mutated stem cell or zygote produced by the method of any one of the preceding claims.
38. A method of producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences comprising:
(a) introducing into a zygote or an embryo
(i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to a portion of each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR associated (Cas) protein; and
(ii) a Cas nucleic acid sequence or a variant thereof that encodes a Cas protein having nuclease activity; and
(b) maintaining the zygote or the embryo under conditions in which RNA hybridizes to the portion of each of the one or more target nucleic acid sequences, and the Cas protein cleaves each of the one or more target nucleic acid sequences upon hybridization of the RNA to the portion of the target nucleic acid sequence, thereby producing an embryo having one or more mutated nucleic acid sequences;
(c) introducing the embryo having one or more mutated nucleic acid sequences into a foster nonhuman mammalian mother; and
(d) maintaining the foster nonhuman mammalian mother under conditions in which one or more offspring carrying the one or more mutated nucleic acid sequences are produced,
thereby producing a nonhuman mammal carrying mutations in one or more target nucleic acid sequences.
39. The method of claim 38 wherein the Cas protein is Cas9.
40. The method of claim 38 or 39 wherein the Cas protein cleaves one strand or both strands of one or more of the target nucleic acid sequences.
41. The method of any one of claims 38-40 wherein the Cas protein nicks one strand or both strands of one or more of the target nucleic acid sequences.
42. The method of any one of claims 38-41 wherein the RNA sequence is from about 10 base pairs to about 150 base pairs in length.
43. The method of any one of claims 38-42 wherein the nonhuman mammal is a rodent, a nonhuman primate, a canine, a feline, a bovine, a porcine, an equine, or a caprine.
44. The method of claim 43 wherein the rodent is a mouse.
45. The method of any one of claims 38-44 wherein one or more of the target nucleic acid sequences are associated with a disease or condition.
46. The method of any one of claims 38-45 wherein one or more of the target nucleic acid sequences are a gene.
47. The method of any one of claims 38-46 wherein both copies of one or more of the target nucleic acid sequences in the zygote or the embryo are mutated.
48. The method of any one of claims 38-47 wherein the one or more target nucleic acid sequences are endogenous to the zygote or the embryo.
49. The method of any one of claims 38-48 wherein one or more of the target nucleic acid sequences encode a transcription factor, a transcriptional co-activator or co-repressor, an enzyme, a chaperone, a heat shock factor, a heat shock protein, a receptor, a secreted protein, a transmembrane protein, a histone, a peripheral membrane protein, a soluble protein, a nuclear protein, a mitochondrial protein, a growth factor, a cytokine, an interferon, a chemokine, a hormone, an extracellular matrix protein, a motor protein, a cell adhesion molecule, a major or minor histocompatibility (MHC) gene, a transporter, a channel, an immunoglobulin (Ig) superfamily (IgSF) gene, a tumor necrosis factor, an NF-kappaB protein, an integrin, a cadherin superfamily member, a selectin, a clotting factor, a complement factor, a plasminogen, plasminogen activating factor, a proto-oncogene, oncogene, or tumor suppressor gene.
50. The method of any one of claims 38-49 further comprising introducing into the zygote or the embryo one or more nucleic acid sequences that are complementary to a portion of the one or more target nucleic acid sequences cleaved by the Cas protein.
51. The method of claim 50 wherein the one or more nucleic acid sequences are a single stranded DNA oligonucleotide, a double stranded DNA oligonucleotide, a plasmid, a cDNA, a gene block or a PCR product.
52. The method of claim 50 or 51 wherein the one or more nucleic acid sequences replace one or more nucleotides, introduce one or more additional nucleotides, delete one or more nucleotides or a combination thereof in the one or more target nucleic acid sequences.
53. The method of claim 50, 51, or 52 wherein the one or more nucleic acid sequences introduce a point mutation in the target nucleic acid sequence.
54. The method of claim 50, 51, or 52 wherein the one or more nucleic acid sequences replace one or more mutant nucleotides with one or more wild type nucleotides in the target nucleic acid sequence.
55. The method of claim 50, 51, 52, 53, or 54 wherein the one or more nucleic acid sequences is from about 10 nucleotides to about 1000 nucleotides.
56. The method of any one of claims 38-55 wherein the nonhuman mammal is of the same species as the embryo or zygote.
57. The method of any one of claims 38-56 wherein the nonhuman mammal is a mouse.
58. The method of any one of claims 38-57 further comprising assessing whether the one or more target nucleic acid sequences have been mutated.
59. The method of any one of claims 31-50 wherein 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 target nucleic acid sequences are mutated in the nonhuman mammal.
60. The method of any one of the claims 38-59 wherein at least two of the target nucleic acid sequences are endogenous nucleic acid sequences.
61. The method of any one of claims 38-60 wherein at least two of the target nucleic acid sequences are endogenous genes.
62. The method of any one of claims 38-61 wherein at least two of the target nucleic acid sequences are at least 1 kB apart.
63. The method of any one of claims 38-62 wherein at least two of the target nucleic acid sequences are on different chromosomes.
64. The method of any one of the preceding claims wherein at least one mutation comprises an insertion of a tag, a transgene, or an insertion of a site recognized by a recombinase.
65. The method of any one of the preceding claims wherein at least one mutation renders expression of an endogenous gene conditional.
66. The method of any one of the preceding claims wherein at least one mutation renders expression of an endogenous gene inducible, repressible, or tissue-specific.
67. The method of any one of the preceding claims wherein the mutations comprise inserting recombination sites (e.g., loxP sites or FRT sites) flanking a selected genomic region, wherein the selected genomic region is optionally within a gene.
68. The method of any one of the preceding claims wherein the mutations comprise inserting a recombination-site-STOP-recombination site cassette (e.g., a loxP-STOP-loxP or FRT-STOP-FRT cassette) in a gene, between a promoter and a coding region of a gene, or in a regulatory region of a gene.
69. The method of claim 35, wherein the recombination-site-STOP-recombination site cassette is positioned so as to disrupt expression of the gene and wherein excision of the cassette by a recombinase renders the gene expressible.
70. A non-human mammal produced by the method of any one of claims 31-55.
71. A method of modulating the expression and/or activity of one or more target nucleic acid sequences in a cell comprising:
(a) introducing into the cell
(i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR-associated (Cas) protein;
(ii) a Cas nucleic acid sequence or a variant thereof that encodes the Cas protein that targets but does not cleave the target nucleic acid sequence; and
(iii) an effector domain;
(b) maintaining the cell under conditions in which the one or more RNA sequences hybridize to the portion of each of the one or more target nucleic acid sequences, the Cas protein binds to each of the one or more RNA sequences and the effector domain modulates the expression and/or activity of the target nucleic acid,
thereby modulating the expression and/or activity of one or more target nucleic acid sequences in the cell.
72. The method of claim 71, wherein the Cas protein and effector domain are fused creating a chimeric protein.
73. The method of any one of claims 71-72, wherein the one or more RNA sequences are complementary to all or a portion of a regulatory region of the one or more target nucleic acid sequences.
74. The method of claim 73, wherein the regulatory region of the one or more target nucleic acid sequences is a promoter, enhancer, and/or operator.
75. The method of any one of claims 71-74, wherein the Cas protein is Cas9.
76. The method of claim 75, wherein the Cas9 protein comprises one or more mutations.
77. The method of claim 76, wherein the Cas9 protein comprises a mutation at amino acid position 10, 840, or a combination thereof.
78. The method of claim 77, wherein the amino acid at position 10 is mutated from aspartate (D) to alanine (A) and the mutation at amino acid position 840 is mutated from histidine (H) to alanine (A).
79. The method of any one of claims 71-78, wherein the effector domain modulates the expression and/or activity of the one or more target nucleic acid sequences by activating, coactivating, regulating, repressing, organizing, remodeling, modifying, and/or fusing the expression and/or activity of one or more target nucleic acid sequences.
80. The method of any one of claims 71-79, wherein the effector domain is a transcription activating domain, a coactivator domain, a transcriptional pause release factor domain, a negative regulator of transcriptional elongation domain, a transcriptional repressor domain, a chromatin organizer domain, a remodeler domain, a histone modifier domain, a DNA modification domain, a RNA binding domain, a protein interaction input devices domain, or a protein interaction output device domain.
81. The method of any one of claims 71-80, wherein the effector domain is VP16, VP48, VP64, VP96, or VP160.
82. The method of any one of claims 71-81, further comprising introducing to the cell an effector molecule that activates the effector domain.
83. The method of claim 82, wherein the effector molecule binds to the effector domain.
84. The method of any one of claims 82-83, wherein the effector molecule is an antibiotic.
85. The method of any one of claims 71-84, wherein the RNA sequence is from about 10 base pairs to about 150 base pairs in length.
86. The method of any one of claims 71-85 wherein the cell is a rodent cell, a primate cell, a canine cell, a feline cell, a bovine cell, an equine cell, a caprine cell, a porcine cell, or an avian cell.
87. The method of claim 86 wherein the rodent cell is a mouse cell.
88. The method of claim 86 wherein the primate cell is a human cell.
89. The method of any one of claims 71-88 wherein the cell is a stem cell or zygote.
90. The method of any one of claims 71-89 wherein the stem cell is an embryonic stem cell or an induced pluripotent stem cell.
91. The method of any one of claims 71-90 wherein one or more of the target nucleic acid sequences are associated with a disease or condition.
92. The method of any one of claims 71-91 wherein one or more of the target nucleic acid sequences are a gene.
93. The method of any one of claims 71-92 wherein both copies of one or more of the target nucleic acid sequences in the cell are mutated.
94. The method of any one of claims 71-93 wherein the one or more target nucleic acid sequences are endogenous or exogenous to the cell.
95. The method of any one of claims 71-94 wherein one or more of the target nucleic acid sequences encode a mutated protein, a reprogramming factor, a transcription factor, a transcriptional co-activator or co-repressor, an enzyme, a chaperone, a heat shock factor, a heat shock protein, a receptor, a secreted protein, a transmembrane protein, a histone, a peripheral membrane protein, a soluble protein, a nuclear protein, a mitochondrial protein, a growth factor, a cytokine, an interferon, a chemokine, a hormone, an extracellular matrix protein, a motor protein, a cell adhesion molecule, a major or minor histocompatibility (MHC) gene, a transporter, a channel, an immunoglobulin (Ig) superfamily (IgSF) gene, a tumor necrosis factor, an NF-kappaB protein, an integrin, a cadherin superfamily member, a selectin, a clotting factor, a complement factor, a plasminogen, plasminogen activating factor, a proto-oncogene, oncogene, or tumor suppressor gene.
96. The method of claim 95, wherein the transcription factor is Oct4, Sox2, Klf4, or c-Myc.
97. The method of any one of claims 71-96 further comprising introducing the cell into a nonhuman mammal.
98. The method of claim 97 wherein the nonhuman mammal is of the same species as the cell.
99. The method of any one of claims 97 or 98 wherein the nonhuman mammal is a mouse.
100. The method of any one of claims 71-99 further comprising isolating the cell.
101. The method of any one of claims 71-100 wherein at least two of the target nucleic acid sequences are endogenous or exogenous nucleic acid sequences.
102. The method of any one of claims 71-101 wherein at least two of the target nucleic acid sequences are endogenous or exogenous genes.
103. The method of any one of claims 71-102 wherein at least two of the target nucleic acid sequences are at least 1 kB apart.
104. The method of any one of claims 71-103 wherein at least two of the target nucleic acid sequences are on different chromosomes.
105. The method of any one of claims 71-104, wherein the modulation of one or more target nucleic acid sequences comprises simultaneous activation of the one or more target nucleic acid sequences.
106. The method of any one of claims 71-105, further comprising adjusting the level of modulation of one or more target nucleic acid sequences by adjusting the amount of the one or more ribonucleic acid sequences introduced into the cell or zygote.
107. The method of any one of claims 71-106, wherein 2, 3, 4, or 5 RNA sequences which bind within 300 bases upstream to a transcription start site to each of the one or more target nucleic acid sequences are introduced into the cell or zygote.
108. The method of any one of claims 71-107, wherein modulation of the expression and/or activity of the one or more target nucleic acid sequence reprograms the cell's potency.
109. The method of claim 108, wherein a differentiated cell is reprogrammed to a pluripotent cell.
110. An isolated nucleic acid sequence that encodes a fusion protein comprising all or a portion of a Cas protein fused to all or a portion of an effector domain.
111. The isolated nucleic acid sequence of claim 110, wherein the Cas protein is Cas9.
112. The isolated nucleic acid sequence of claim 111, wherein the Cas9 protein comprises one or more mutations.
113. The isolated nucleic acid sequence of claim 112, wherein the Cas9 protein comprises a mutation at amino acid position 10, 840, or a combination thereof.
114. The isolated nucleic acid sequence of claim 113, wherein the amino acid at position 10 is mutated from aspartate (D) to alanine (A) and the mutation at amino acid position 840 is mutated from histidine (H) to alanine (A).
115. The isolated nucleic acid sequence of any one of claims 110-114, wherein the effector domain modulates the expression and/or activity of the one or more target nucleic acid sequences by activating, coactivating, regulating, repressing, organizing, remodeling, modifying, and/or fusing the expression and/or activity of one or more target nucleic acid sequences.
116. The isolated nucleic acid sequence of any one of claims 108-114, wherein the effector domain is a transcription activating domain, a coactivator domain, a transcriptional pause release factor domain, a negative regulator of transcriptional elongation domain, a transcriptional repressor domain, a chromatin organizer domain, a remodeler domain, a histone modifier domain, a DNA modification domain, a RNA binding domain, a protein interaction input devices domain, and a protein interaction output device domain.
117. The isolated nucleic acid sequence of any one of claims 108-115, wherein the effector domain is VP16, VP48, VP64, VP96, or VP160.
118. An isolated fusion protein comprising all or a portion of a Cas protein fused to all or a portion of an effector domain.
119. The isolated fusion protein of claim 118, wherein the Cas protein is Cas9.
120. The isolated fusion protein of claim 119, wherein the Cas9 protein comprises one or more mutations.
121. The isolated fusion protein of claim 120, wherein the Cas9 protein comprises a mutation at amino acid position 10, 840, or a combination thereof.
122. The isolated fusion protein of claim 121, wherein the amino acid at position 10 is mutated from aspartate (D) to alanine (A) and the mutation at amino acid position 840 is mutated from histidine (H) to alanine (A).
123. The isolated fusion protein of any one of claims 118-122, wherein the effector domain modulates the expression and/or activity of the one or more target nucleic acid sequences by activating, coactivating, regulating, repressing, organizing, remodeling, modifying, and/or fusing the expression and/or activity of one or more target nucleic acid sequences.
124. The isolated fusion protein of any one of claims 118-123, wherein the effector domain is a transcription activating domain, a coactivator domain, a transcriptional pause release factor domain, a negative regulator of transcriptional elongation domain, a transcriptional repressor domain, a chromatin organizer domain, a remodeler domain, a histone modifier domain, a DNA modification domain, a RNA binding domain, a protein interaction input devices domain, and a protein interaction output device domain.
125. The isolated fusion protein of any one of claims 118-124, wherein the effector domain is VP16, VP48, VP64, VP96, or VP160.
126. An isolated polypeptide comprising the amino acid sequence set forth in SEQ ID NO: 14 (dCas9VP64).
127. An isolated polypeptide comprising the amino acid sequence set forth in SEQ ID NO: 15 (dCas9VP96).
128. An isolated polypeptide comprising the amino acid sequence set forth in SEQ ID NO: 16 (dCas9VP160).
129. An isolated fusion protein comprising at least six VP16 transactivation domains fused to each other.
130. The fusion protein of claim 129, wherein at least one VP16 transactivation domain is a minimal transactivation domain.
131. The fusion protein of claim 130, wherein the VP16 minimal transactivation domain comprises the amino acid sequence DALDDFDLDML.
132. The fusion protein of any one of claims 129-131, comprising 6, 7, 8, 9, or 10 tandem copies of the VP16 transactivation domains.
133. The fusion protein of any one of claims 129-132, further comprising one or more linkers.
134. A method of modulating the expression and/or activity of one or more target nucleic acid sequences that cause a disease in an individual in need thereof comprising:
(a) introducing into the individual
(i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR-associated (Cas) protein;
(ii) a Cas nucleic acid sequence or a variant thereof that encodes the Cas protein that targets but does not cleave the target nucleic acid sequence; and
(iii) an effector domain;
(b) maintaining the cell under conditions in which the one or more RNA sequences hybridize to the portion of each of the one or more target nucleic acid sequences, the Cas protein binds to each of the one or more RNA sequences and the effector domain modulates the expression and/or activity of the target nucleic acid,
thereby modulating the expression and/or activity of one or more target nucleic acid sequences that cause a disease in the individual.
135. The method of Claim 134 wherein the one or more target nucleic acid sequences are one or more genes.
136. The method of Claims 134 or 135 wherein the effector domain represses the activity and/or expression of the one or more target nucleic acid sequences.
137. A method of detecting a sequence variation, chromatin state or a combination thereof at a defined loci of one or more target sequences of a cell and exerting an effect on the cell upon detection of the sequence variation, chromatin state or a combination thereof at the defined loci comprising:
(a) introducing into the cell
(i) one or more ribonucleic acid (RNA) sequences that comprise a portion that is complementary to each of the one or more target nucleic acid sequences and comprise a binding site for a CRISPR-associated (Cas) protein;
(ii) a Cas nucleic acid sequence or a variant thereof that encodes the Cas protein that targets but does not cleave the target nucleic acid sequence; and
(iii) an effector domain;
(b) maintaining the cell under conditions in which the one or more RNA sequences hybridize to the portion of each of the one or more target nucleic acid sequences, the Cas protein binds to each of the one or more RNA sequences and the effector domain exerts an effect on the cell upon detection of the sequence variation, chromatin state or a combination thereof at the defined loci,
thereby detecting a sequence variation, chromatin state or a combination thereof at a defined loci of one or more target sequences of the cell and exerting an effect on the cell upon detection of the sequence variation, chromatin state or a combination thereof at the defined loci in the cell.
138. The method of Claim 137 wherein the cell is in an individual in need thereof.
139. The method of Claim 137 or 138 wherein the effector domain is one or more caspases.
140. The method of any one of Claims 137-139 wherein the methylation, chromosomal translocation, or a combination thereof is detected at the defined loci.
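The tandem VP16 arrangement recited in claims 129-132 can be illustrated with a minimal sketch. It assumes the 11-residue minimal transactivation domain given in claim 131; the function names, the "GS" linker, and the range check are hypothetical choices made for illustration and are not taken from the specification or from SEQ ID NOs: 14-16.

```python
# Minimal VP16 transactivation domain, as recited in claim 131.
VP16_MINIMAL = "DALDDFDLDML"


def tandem_vp16(copies: int, linker: str = "GS") -> str:
    """Join `copies` minimal VP16 domains into one peptide string.

    The default "GS" linker is an assumption for illustration only;
    claim 133 recites "one or more linkers" without naming one here.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return linker.join([VP16_MINIMAL] * copies)


def within_claim_132_range(peptide: str) -> bool:
    """Return True if the peptide holds 6 to 10 copies of the minimal domain."""
    # str.count counts non-overlapping occurrences, which is what we want
    # because the minimal domain cannot overlap itself in this layout.
    return 6 <= peptide.count(VP16_MINIMAL) <= 10


if __name__ == "__main__":
    six_copies = tandem_vp16(6)
    print(len(six_copies), within_claim_132_range(six_copies))
```

A peptide built with four copies would fall outside the 6-10 range of claim 132, while six or ten copies would fall inside it.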
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/785,031 US20160186208A1 (en) | 2013-04-16 | 2014-04-16 | Methods of Mutating, Modifying or Modulating Nucleic Acid in a Cell or Nonhuman Mammal |
Applications Claiming Priority (8)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361812720P | 2013-04-16 | 2013-04-16 | |
US61/812,720 | 2013-04-16 | ||
US201361824920P | 2013-05-17 | 2013-05-17 | |
US61/824,920 | 2013-05-17 | ||
US201361858437P | 2013-07-25 | 2013-07-25 | |
US61/858,437 | 2013-07-25 | ||
US201361865888P | 2013-08-14 | 2013-08-14 | |
US61/865,888 | 2013-08-14 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2014172470A2 (en) | 2014-10-23 |
WO2014172470A3 (en) | 2015-10-29 |
Family
ID=51731977
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2014/034387 WO2014172470A2 (en) | 2013-04-16 | 2014-04-16 | Methods of mutating, modifying or modulating nucleic acid in a cell or nonhuman mammal |
Country Status (2)
Country | Link |
---|---|
US (1) | US20160186208A1 (en) |
WO (1) | WO2014172470A2 (en) |
Cited By (62)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104388560A (en) * | 2014-11-14 | 2015-03-04 | 中国农业大学 | Method for marking Y chromosome and application thereof |
US9068179B1 (en) | 2013-12-12 | 2015-06-30 | President And Fellows Of Harvard College | Methods for correcting presenilin point mutations |
US9163284B2 (en) | 2013-08-09 | 2015-10-20 | President And Fellows Of Harvard College | Methods for identifying a target site of a Cas9 nuclease |
US9228207B2 (en) | 2013-09-06 | 2016-01-05 | President And Fellows Of Harvard College | Switchable gRNAs comprising aptamers |
US9228208B2 (en) | 2013-12-11 | 2016-01-05 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for the targeted modification of a genome |
US9322037B2 (en) | 2013-09-06 | 2016-04-26 | President And Fellows Of Harvard College | Cas9-FokI fusion proteins and uses thereof |
US9322006B2 (en) | 2011-07-22 | 2016-04-26 | President And Fellows Of Harvard College | Evaluation and improvement of nuclease cleavage specificity |
WO2016080399A1 (en) * | 2014-11-20 | 2016-05-26 | 国立大学法人京都大学 | Method for knock-in of dna into target region of mammalian genome, and cell |
US9359599B2 (en) | 2013-08-22 | 2016-06-07 | President And Fellows Of Harvard College | Engineered transcription activator-like effector (TALE) domains and uses thereof |
US9487802B2 (en) | 2014-05-30 | 2016-11-08 | The Board Of Trustees Of The Leland Stanford Junior University | Compositions and methods to treat latent viral infections |
WO2016201399A1 (en) * | 2015-06-12 | 2016-12-15 | Lonza Walkersville, Inc. | Methods for nuclear reprogramming using synthetic transcription factors |
US9526784B2 (en) | 2013-09-06 | 2016-12-27 | President And Fellows Of Harvard College | Delivery system for functional nucleases |
CN106282118A (en) * | 2015-06-24 | 2017-01-04 | 武汉荣实医药科技有限公司 | The genetically engineered cell strain of alzheimer's disease key pathogenetic factor app and medicaments sifting model |
WO2017015637A1 (en) * | 2015-07-22 | 2017-01-26 | Duke University | High-throughput screening of regulatory element function with epigenome editing technologies |
WO2017031370A1 (en) * | 2015-08-18 | 2017-02-23 | The Broad Institute, Inc. | Methods and compositions for altering function and structure of chromatin loops and/or domains |
US9663782B2 (en) | 2013-07-19 | 2017-05-30 | Larix Bioscience Llc | Methods and compositions for producing double allele knock outs |
EP3219799A1 (en) | 2016-03-17 | 2017-09-20 | IMBA-Institut für Molekulare Biotechnologie GmbH | Conditional crispr sgrna expression |
JPWO2016104716A1 (en) * | 2014-12-26 | 2017-10-05 | 国立研究開発法人理化学研究所 | Gene knockout method |
US9834791B2 (en) | 2013-11-07 | 2017-12-05 | Editas Medicine, Inc. | CRISPR-related methods and compositions with governing gRNAS |
US9834786B2 (en) | 2012-04-25 | 2017-12-05 | Regeneron Pharmaceuticals, Inc. | Nuclease-mediated targeting with large targeting vectors |
US20180023064A1 (en) * | 2015-02-09 | 2018-01-25 | Duke University | Compositions and methods for epigenome editing |
WO2018035495A1 (en) | 2016-08-19 | 2018-02-22 | Whitehead Institute For Biomedical Research | Methods of editing dna methylation |
US9902971B2 (en) | 2014-06-26 | 2018-02-27 | Regeneron Pharmaceuticals, Inc. | Methods for producing a mouse XY embryonic (ES) cell line capable of producing a fertile XY female mouse in an F0 generation |
WO2018129544A1 (en) | 2017-01-09 | 2018-07-12 | Whitehead Institute For Biomedical Research | Methods of altering gene expression by perturbing transcription factor multimers that structure regulatory loops |
US10077453B2 (en) | 2014-07-30 | 2018-09-18 | President And Fellows Of Harvard College | CAS9 proteins including ligand-dependent inteins |
US10113163B2 (en) | 2016-08-03 | 2018-10-30 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US10117911B2 (en) | 2015-05-29 | 2018-11-06 | Agenovir Corporation | Compositions and methods to treat herpes simplex virus infections |
EP3284749A4 (en) * | 2015-04-13 | 2018-11-14 | The University of Tokyo | Set of polypeptides exhibiting nuclease activity or nickase activity with dependence on light or in presence of drug or suppressing or activating expression of target gene |
CN109022435A (en) * | 2018-07-19 | 2018-12-18 | 佛山科学技术学院 | A kind of conditionity inducing mouse spermatogonium Tet3 Knockout cells system and its construction method |
US10167457B2 (en) | 2015-10-23 | 2019-01-01 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
WO2019126799A1 (en) * | 2017-12-22 | 2019-06-27 | Distributed Bio, Inc. | Major histocompatibility complex (mhc) compositions and methods of use thereof |
US10337001B2 (en) | 2014-12-03 | 2019-07-02 | Agilent Technologies, Inc. | Guide RNA with chemical modifications |
US10385359B2 (en) | 2013-04-16 | 2019-08-20 | Regeneron Pharmaceuticals, Inc. | Targeted modification of rat genome |
US10457960B2 (en) | 2014-11-21 | 2019-10-29 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for targeted genetic modification using paired guide RNAs |
US10544405B2 (en) | 2013-01-16 | 2020-01-28 | Emory University | Cas9-nucleic acid complexes and uses related thereto |
WO2020061591A1 (en) | 2018-09-21 | 2020-03-26 | President And Fellows Of Harvard College | Methods and compositions for treating diabetes, and methods for enriching mrna coding for secreted proteins |
US20200231975A1 (en) * | 2017-07-17 | 2020-07-23 | The Broad Institute, Inc. | Novel type vi crispr orthologs and systems |
US10745677B2 (en) | 2016-12-23 | 2020-08-18 | President And Fellows Of Harvard College | Editing of CCR5 receptor gene to protect against HIV infection |
US10767175B2 (en) | 2016-06-08 | 2020-09-08 | Agilent Technologies, Inc. | High specificity genome editing using chemically modified guide RNAs |
EP3230460B1 (en) * | 2014-12-12 | 2020-10-07 | James Zhu | Methods and compositions for selectively eliminating cells of interest |
CN112522312A (en) * | 2020-11-23 | 2021-03-19 | 福建省立医院 | WKH rat model construction method |
JP2021509577A (en) * | 2017-12-28 | 2021-04-01 | ザ ジェイ. デビッド グラッドストーン インスティテューツ、 ア テスタメンタリー トラスト エスタブリッシュド アンダー ザ ウィル オブ ジェイ. デビッド グラッドストーン | Generation of induced pluripotent cells by CRISPR activation |
US11268082B2 (en) | 2017-03-23 | 2022-03-08 | President And Fellows Of Harvard College | Nucleobase editors comprising nucleic acid programmable DNA binding proteins |
US11306309B2 (en) | 2015-04-06 | 2022-04-19 | The Board Of Trustees Of The Leland Stanford Junior University | Chemically modified guide RNAs for CRISPR/CAS-mediated gene regulation |
US11306324B2 (en) | 2016-10-14 | 2022-04-19 | President And Fellows Of Harvard College | AAV delivery of nucleobase editors |
US11319532B2 (en) | 2017-08-30 | 2022-05-03 | President And Fellows Of Harvard College | High efficiency base editors comprising Gam |
US11421251B2 (en) | 2015-10-13 | 2022-08-23 | Duke University | Genome engineering with type I CRISPR systems in eukaryotic cells |
US11427817B2 (en) | 2015-08-25 | 2022-08-30 | Duke University | Compositions and methods of improving specificity in genomic engineering using RNA-guided endonucleases |
US11447770B1 (en) | 2019-03-19 | 2022-09-20 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
US11466271B2 (en) | 2017-02-06 | 2022-10-11 | Novartis Ag | Compositions and methods for the treatment of hemoglobinopathies |
US11492670B2 (en) | 2015-10-27 | 2022-11-08 | The Broad Institute Inc. | Compositions and methods for targeting cancer-specific sequence variations |
US11542496B2 (en) | 2017-03-10 | 2023-01-03 | President And Fellows Of Harvard College | Cytosine to guanine base editor |
US11542509B2 (en) | 2016-08-24 | 2023-01-03 | President And Fellows Of Harvard College | Incorporation of unnatural amino acids into proteins using base editing |
US11560566B2 (en) | 2017-05-12 | 2023-01-24 | President And Fellows Of Harvard College | Aptazyme-embedded guide RNAs for use with CRISPR-Cas9 in genome editing and transcriptional activation |
US11661590B2 (en) | 2016-08-09 | 2023-05-30 | President And Fellows Of Harvard College | Programmable CAS9-recombinase fusion proteins and uses thereof |
US11732274B2 (en) | 2017-07-28 | 2023-08-22 | President And Fellows Of Harvard College | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) |
US11795443B2 (en) | 2017-10-16 | 2023-10-24 | The Broad Institute, Inc. | Uses of adenosine base editors |
US11884915B2 (en) | 2021-09-10 | 2024-01-30 | Agilent Technologies, Inc. | Guide RNAs with chemical modification for prime editing |
US11898179B2 (en) | 2017-03-09 | 2024-02-13 | President And Fellows Of Harvard College | Suppression of pain by gene editing |
US11912985B2 (en) | 2020-05-08 | 2024-02-27 | The Broad Institute, Inc. | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence |
US11920128B2 (en) | 2013-09-18 | 2024-03-05 | Kymab Limited | Methods, cells and organisms |
US12098399B2 (en) | 2022-06-24 | 2024-09-24 | Tune Therapeutics, Inc. | Compositions, systems, and methods for epigenetic regulation of proprotein convertase subtilisin/kexin type 9 (PCSK9) gene expression |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10702581B2 (en) | 2015-05-04 | 2020-07-07 | Ilias Biologics Inc. | Compositions containing protein loaded exosome and methods for preparing and delivering the same |
SG10202109655VA (en) | 2015-12-04 | 2021-10-28 | Novartis Ag | Compositions and methods for immunooncology |
EP3510152A4 (en) | 2016-09-07 | 2020-04-29 | Flagship Pioneering, Inc. | Methods and compositions for modulating gene expression |
EP3356522A4 (en) * | 2016-09-30 | 2019-03-27 | Cellex Life Sciences, Incorporated | Compositions containing protein loaded exosome and methods for preparing and delivering the same |
US20200149039A1 (en) * | 2016-12-12 | 2020-05-14 | Whitehead Institute For Biomedical Research | Regulation of transcription through ctcf loop anchors |
WO2018148603A1 (en) * | 2017-02-09 | 2018-08-16 | Allen Institute | Genetically-tagged stem cell lines and methods of use |
MX2019011272A (en) * | 2017-03-22 | 2019-10-24 | Novartis Ag | Compositions and methods for immunooncology. |
US20180305719A1 (en) * | 2017-04-19 | 2018-10-25 | The Board Of Trustees Of The University Of Illinois | Vectors For Integration Of DNA Into Genomes And Methods For Altering Gene Expression And Interrogating Gene Function |
US10927168B2 (en) | 2017-10-02 | 2021-02-23 | Humanicen, Inc. | Method of reducing tumor relapse rate in immunotherapy by administration of lenzilumab |
SG11202003096UA (en) | 2017-10-02 | 2020-05-28 | Humanigen Inc | Methods of treating immunotherapy-related toxicity using a gm-csf antagonist |
US11130805B2 (en) | 2017-10-02 | 2021-09-28 | Humanigen, Inc. | Methods of treating CART-T cell therapy-induced neuroinflammation using a GM-CSF antagonist |
US10899831B2 (en) | 2017-10-02 | 2021-01-26 | Humanigen, Inc. | Method of reducing the level of non-GM-CSF cytokines/chemokines in immunotherapy-related toxicity |
CN111601817A (en) * | 2017-11-14 | 2020-08-28 | 纪念斯隆-凯特琳癌症中心 | IL-33 secreting immunoresponsive cells and uses thereof |
EP3873205A4 (en) * | 2018-10-31 | 2022-05-11 | Humanigen, Inc. | Materials and methods for treating cancer |
WO2020227255A1 (en) * | 2019-05-06 | 2020-11-12 | The Regents Of The University Of Michigan | Targeted therapy |
US11987791B2 (en) | 2019-09-23 | 2024-05-21 | Omega Therapeutics, Inc. | Compositions and methods for modulating hepatocyte nuclear factor 4-alpha (HNF4α) gene expression |
CN111700034B (en) * | 2020-05-22 | 2021-12-14 | 中国人民解放军空军军医大学 | Construction method and application of schizophrenia animal model based on central nervous system myelin sheath function change |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102625655B (en) * | 2008-12-04 | 2016-07-06 | 桑格摩生物科学股份有限公司 | Zinc finger nuclease is used to carry out genome editor in rats |
DK2800811T3 (en) * | 2012-05-25 | 2017-07-17 | Univ Vienna | METHODS AND COMPOSITIONS FOR RNA DIRECTIVE TARGET DNA MODIFICATION AND FOR RNA DIRECTIVE MODULATION OF TRANSCRIPTION |
SG11201503059XA (en) * | 2012-10-23 | 2015-06-29 | Toolgen Inc | Composition for cleaving a target dna comprising a guide rna specific for the target dna and cas protein-encoding nucleic acid or cas protein, and use thereof |
KR102243092B1 (en) * | 2012-12-06 | 2021-04-22 | 시그마-알드리치 컴퍼니., 엘엘씨 | Crispr-based genome modification and regulation |
DK3064585T3 (en) * | 2012-12-12 | 2020-04-27 | Broad Inst Inc | DESIGN AND OPTIMIZATION OF IMPROVED SYSTEMS, PROCEDURES AND ENZYME COMPOSITIONS FOR SEQUENCE MANIPULATION |
US8697359B1 (en) * | 2012-12-12 | 2014-04-15 | The Broad Institute, Inc. | CRISPR-Cas systems and methods for altering expression of gene products |
2014
- 2014-04-16: US application US14/785,031 (US20160186208A1), status: not active, abandoned
- 2014-04-16: WO application PCT/US2014/034387 (WO2014172470A2), status: active, application filing
Cited By (137)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10323236B2 (en) | 2011-07-22 | 2019-06-18 | President And Fellows Of Harvard College | Evaluation and improvement of nuclease cleavage specificity |
US9322006B2 (en) | 2011-07-22 | 2016-04-26 | President And Fellows Of Harvard College | Evaluation and improvement of nuclease cleavage specificity |
US12006520B2 (en) | 2011-07-22 | 2024-06-11 | President And Fellows Of Harvard College | Evaluation and improvement of nuclease cleavage specificity |
US10301646B2 (en) | 2012-04-25 | 2019-05-28 | Regeneron Pharmaceuticals, Inc. | Nuclease-mediated targeting with large targeting vectors |
US9834786B2 (en) | 2012-04-25 | 2017-12-05 | Regeneron Pharmaceuticals, Inc. | Nuclease-mediated targeting with large targeting vectors |
US10544405B2 (en) | 2013-01-16 | 2020-01-28 | Emory University | Cas9-nucleic acid complexes and uses related thereto |
US11312945B2 (en) | 2013-01-16 | 2022-04-26 | Emory University | CAS9-nucleic acid complexes and uses related thereto |
US12037596B2 (en) | 2013-04-16 | 2024-07-16 | Regeneron Pharmaceuticals, Inc. | Targeted modification of rat genome |
US10975390B2 (en) | 2013-04-16 | 2021-04-13 | Regeneron Pharmaceuticals, Inc. | Targeted modification of rat genome |
US10385359B2 (en) | 2013-04-16 | 2019-08-20 | Regeneron Pharmaceuticals, Inc. | Targeted modification of rat genome |
US9663782B2 (en) | 2013-07-19 | 2017-05-30 | Larix Bioscience Llc | Methods and compositions for producing double allele knock outs |
US10508298B2 (en) | 2013-08-09 | 2019-12-17 | President And Fellows Of Harvard College | Methods for identifying a target site of a CAS9 nuclease |
US11920181B2 (en) | 2013-08-09 | 2024-03-05 | President And Fellows Of Harvard College | Nuclease profiling system |
US10954548B2 (en) | 2013-08-09 | 2021-03-23 | President And Fellows Of Harvard College | Nuclease profiling system |
US9163284B2 (en) | 2013-08-09 | 2015-10-20 | President And Fellows Of Harvard College | Methods for identifying a target site of a Cas9 nuclease |
US11046948B2 (en) | 2013-08-22 | 2021-06-29 | President And Fellows Of Harvard College | Engineered transcription activator-like effector (TALE) domains and uses thereof |
US10227581B2 (en) | 2013-08-22 | 2019-03-12 | President And Fellows Of Harvard College | Engineered transcription activator-like effector (TALE) domains and uses thereof |
US9359599B2 (en) | 2013-08-22 | 2016-06-07 | President And Fellows Of Harvard College | Engineered transcription activator-like effector (TALE) domains and uses thereof |
US9228207B2 (en) | 2013-09-06 | 2016-01-05 | President And Fellows Of Harvard College | Switchable gRNAs comprising aptamers |
US9322037B2 (en) | 2013-09-06 | 2016-04-26 | President And Fellows Of Harvard College | Cas9-FokI fusion proteins and uses thereof |
US9526784B2 (en) | 2013-09-06 | 2016-12-27 | President And Fellows Of Harvard College | Delivery system for functional nucleases |
US10912833B2 (en) | 2013-09-06 | 2021-02-09 | President And Fellows Of Harvard College | Delivery of negatively charged proteins using cationic lipids |
US10858639B2 (en) | 2013-09-06 | 2020-12-08 | President And Fellows Of Harvard College | CAS9 variants and uses thereof |
US9340799B2 (en) | 2013-09-06 | 2016-05-17 | President And Fellows Of Harvard College | MRNA-sensing switchable gRNAs |
US10682410B2 (en) | 2013-09-06 | 2020-06-16 | President And Fellows Of Harvard College | Delivery system for functional nucleases |
US9999671B2 (en) | 2013-09-06 | 2018-06-19 | President And Fellows Of Harvard College | Delivery of negatively charged proteins using cationic lipids |
US9340800B2 (en) | 2013-09-06 | 2016-05-17 | President And Fellows Of Harvard College | Extended DNA-sensing GRNAS |
US10597679B2 (en) | 2013-09-06 | 2020-03-24 | President And Fellows Of Harvard College | Switchable Cas9 nucleases and uses thereof |
US9737604B2 (en) | 2013-09-06 | 2017-08-22 | President And Fellows Of Harvard College | Use of cationic lipids to deliver CAS9 |
US9388430B2 (en) | 2013-09-06 | 2016-07-12 | President And Fellows Of Harvard College | Cas9-recombinase fusion proteins and uses thereof |
US11299755B2 (en) | 2013-09-06 | 2022-04-12 | President And Fellows Of Harvard College | Switchable CAS9 nucleases and uses thereof |
US11920128B2 (en) | 2013-09-18 | 2024-03-05 | Kymab Limited | Methods, cells and organisms |
US11390887B2 (en) | 2013-11-07 | 2022-07-19 | Editas Medicine, Inc. | CRISPR-related methods and compositions with governing gRNAS |
US10190137B2 (en) | 2013-11-07 | 2019-01-29 | Editas Medicine, Inc. | CRISPR-related methods and compositions with governing gRNAS |
US9834791B2 (en) | 2013-11-07 | 2017-12-05 | Editas Medicine, Inc. | CRISPR-related methods and compositions with governing gRNAS |
US10640788B2 (en) | 2013-11-07 | 2020-05-05 | Editas Medicine, Inc. | CRISPR-related methods and compositions with governing gRNAs |
US9546384B2 (en) | 2013-12-11 | 2017-01-17 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for the targeted modification of a mouse genome |
US9228208B2 (en) | 2013-12-11 | 2016-01-05 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for the targeted modification of a genome |
US10711280B2 (en) | 2013-12-11 | 2020-07-14 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for the targeted modification of a mouse ES cell genome |
US10208317B2 (en) | 2013-12-11 | 2019-02-19 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for the targeted modification of a mouse embryonic stem cell genome |
US11820997B2 (en) | 2013-12-11 | 2023-11-21 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for the targeted modification of a genome |
US9840699B2 (en) | 2013-12-12 | 2017-12-12 | President And Fellows Of Harvard College | Methods for nucleic acid editing |
US9068179B1 (en) | 2013-12-12 | 2015-06-30 | President And Fellows Of Harvard College | Methods for correcting presenilin point mutations |
US11053481B2 (en) | 2013-12-12 | 2021-07-06 | President And Fellows Of Harvard College | Fusions of Cas9 domains and nucleic acid-editing domains |
US11124782B2 (en) | 2013-12-12 | 2021-09-21 | President And Fellows Of Harvard College | Cas variants for gene editing |
US10465176B2 (en) | 2013-12-12 | 2019-11-05 | President And Fellows Of Harvard College | Cas variants for gene editing |
US10066241B2 (en) | 2014-05-30 | 2018-09-04 | The Board Of Trustees Of The Leland Stanford Junior University | Compositions and methods of delivering treatments for latent viral infections |
US9487802B2 (en) | 2014-05-30 | 2016-11-08 | The Board Of Trustees Of The Leland Stanford Junior University | Compositions and methods to treat latent viral infections |
US10793874B2 (en) | 2014-06-26 | 2020-10-06 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for targeted genetic modifications and methods of use |
US9902971B2 (en) | 2014-06-26 | 2018-02-27 | Regeneron Pharmaceuticals, Inc. | Methods for producing a mouse XY embryonic (ES) cell line capable of producing a fertile XY female mouse in an F0 generation |
US11578343B2 (en) | 2014-07-30 | 2023-02-14 | President And Fellows Of Harvard College | CAS9 proteins including ligand-dependent inteins |
US10077453B2 (en) | 2014-07-30 | 2018-09-18 | President And Fellows Of Harvard College | CAS9 proteins including ligand-dependent inteins |
US10704062B2 (en) | 2014-07-30 | 2020-07-07 | President And Fellows Of Harvard College | CAS9 proteins including ligand-dependent inteins |
CN104388560A (en) * | 2014-11-14 | 2015-03-04 | 中国农业大学 | Method for marking Y chromosome and application thereof |
WO2016080399A1 (en) * | 2014-11-20 | 2016-05-26 | 国立大学法人京都大学 | Method for knock-in of dna into target region of mammalian genome, and cell |
JPWO2016080399A1 (en) * | 2014-11-20 | 2017-08-31 | 国立大学法人京都大学 | Method and cell for knocking in DNA into a mammalian target genomic region |
US10362771B2 (en) | 2014-11-20 | 2019-07-30 | Kyoto University | Method for knock-in of DNA into target region of mammalian genome, and cell |
US11697828B2 (en) | 2014-11-21 | 2023-07-11 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for targeted genetic modification using paired guide RNAs |
US10457960B2 (en) | 2014-11-21 | 2019-10-29 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for targeted genetic modification using paired guide RNAs |
US10337001B2 (en) | 2014-12-03 | 2019-07-02 | Agilent Technologies, Inc. | Guide RNA with chemical modifications |
US10900034B2 (en) | 2014-12-03 | 2021-01-26 | Agilent Technologies, Inc. | Guide RNA with chemical modifications |
EP3230460B1 (en) * | 2014-12-12 | 2020-10-07 | James Zhu | Methods and compositions for selectively eliminating cells of interest |
EP3230460B2 (en) † | 2014-12-12 | 2023-11-29 | James Zhu | Methods and compositions for selectively eliminating cells of interest |
US10863730B2 (en) | 2014-12-26 | 2020-12-15 | Riken | Gene knockout method |
JPWO2016104716A1 (en) * | 2014-12-26 | 2017-10-05 | 国立研究開発法人理化学研究所 | Gene knockout method |
US11155796B2 (en) | 2015-02-09 | 2021-10-26 | Duke University | Compositions and methods for epigenome editing |
US10676726B2 (en) * | 2015-02-09 | 2020-06-09 | Duke University | Compositions and methods for epigenome editing |
US20180023064A1 (en) * | 2015-02-09 | 2018-01-25 | Duke University | Compositions and methods for epigenome editing |
US11535846B2 (en) | 2015-04-06 | 2022-12-27 | The Board Of Trustees Of The Leland Stanford Junior University | Chemically modified guide RNAS for CRISPR/Cas-mediated gene regulation |
US11851652B2 (en) | 2015-04-06 | 2023-12-26 | The Board Of Trustees Of The Leland Stanford Junior University | Compositions comprising chemically modified guide RNAs for CRISPR/Cas-mediated editing of HBB |
US11306309B2 (en) | 2015-04-06 | 2022-04-19 | The Board Of Trustees Of The Leland Stanford Junior University | Chemically modified guide RNAs for CRISPR/CAS-mediated gene regulation |
US11390860B2 (en) | 2015-04-13 | 2022-07-19 | The University Of Tokyo | Set of polypeptides exhibiting nuclease activity or nickase activity with dependence on light or in presence of drug or suppressing or activating expression of target gene |
EP3284749A4 (en) * | 2015-04-13 | 2018-11-14 | The University of Tokyo | Set of polypeptides exhibiting nuclease activity or nickase activity with dependence on light or in presence of drug or suppressing or activating expression of target gene |
US10117911B2 (en) | 2015-05-29 | 2018-11-06 | Agenovir Corporation | Compositions and methods to treat herpes simplex virus infections |
JP2018521642A (en) * | 2015-06-12 | 2018-08-09 | ロンザ ウォカーズビル インコーポレーティッド | A method for nuclear reprogramming using synthetic transcription factors |
WO2016201399A1 (en) * | 2015-06-12 | 2016-12-15 | Lonza Walkersville, Inc. | Methods for nuclear reprogramming using synthetic transcription factors |
US11655481B2 (en) | 2015-06-12 | 2023-05-23 | Lonza Walkersville, Inc. | Methods for nuclear reprogramming using synthetic transcription factors |
IL255536B2 (en) * | 2015-06-12 | 2023-05-01 | Lonza Walkersville Inc | Methods for nuclear reprogramming using synthetic transcription factors |
IL298524B1 (en) * | 2015-06-12 | 2023-11-01 | Lonza Walkersville Inc | Methods for nuclear reprogramming using synthetic transcription factors |
IL255536A (en) * | 2015-06-12 | 2018-01-31 | Lonza Walkersville Inc | Methods for nuclear reprogramming using synthetic transcription factors |
IL298524B2 (en) * | 2015-06-12 | 2024-03-01 | Lonza Walkersville Inc | Methods for nuclear reprogramming using synthetic transcription factors |
JP2021166550A (en) * | 2015-06-12 | 2021-10-21 | ロンザ ウォカーズビル インコーポレーティッド | Method for nuclear reprogramming using synthetic transcription factor |
CN106282118A (en) * | 2015-06-24 | 2017-01-04 | 武汉荣实医药科技有限公司 | Genetically engineered cell line for the key Alzheimer's disease pathogenic factor APP, and drug screening model |
WO2017015637A1 (en) * | 2015-07-22 | 2017-01-26 | Duke University | High-throughput screening of regulatory element function with epigenome editing technologies |
US10676735B2 (en) | 2015-07-22 | 2020-06-09 | Duke University | High-throughput screening of regulatory element function with epigenome editing technologies |
WO2017031370A1 (en) * | 2015-08-18 | 2017-02-23 | The Broad Institute, Inc. | Methods and compositions for altering function and structure of chromatin loops and/or domains |
US11214800B2 (en) | 2015-08-18 | 2022-01-04 | The Broad Institute, Inc. | Methods and compositions for altering function and structure of chromatin loops and/or domains |
US11427817B2 (en) | 2015-08-25 | 2022-08-30 | Duke University | Compositions and methods of improving specificity in genomic engineering using RNA-guided endonucleases |
US11421251B2 (en) | 2015-10-13 | 2022-08-23 | Duke University | Genome engineering with type I CRISPR systems in eukaryotic cells |
US10167457B2 (en) | 2015-10-23 | 2019-01-01 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
US12043852B2 (en) | 2015-10-23 | 2024-07-23 | President And Fellows Of Harvard College | Evolved Cas9 proteins for gene editing |
US11214780B2 (en) | 2015-10-23 | 2022-01-04 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
US11492670B2 (en) | 2015-10-27 | 2022-11-08 | The Broad Institute Inc. | Compositions and methods for targeting cancer-specific sequence variations |
EP3219799A1 (en) | 2016-03-17 | 2017-09-20 | IMBA-Institut für Molekulare Biotechnologie GmbH | Conditional crispr sgrna expression |
WO2017158153A1 (en) | 2016-03-17 | 2017-09-21 | Imba - Institut Für Molekulare Biotechnologie Gmbh | Conditional crispr sgrna expression |
US10767175B2 (en) | 2016-06-08 | 2020-09-08 | Agilent Technologies, Inc. | High specificity genome editing using chemically modified guide RNAs |
US11702651B2 (en) | 2016-08-03 | 2023-07-18 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US10947530B2 (en) | 2016-08-03 | 2021-03-16 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US11999947B2 (en) | 2016-08-03 | 2024-06-04 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US10113163B2 (en) | 2016-08-03 | 2018-10-30 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US11661590B2 (en) | 2016-08-09 | 2023-05-30 | President And Fellows Of Harvard College | Programmable CAS9-recombinase fusion proteins and uses thereof |
US11434476B2 (en) | 2016-08-19 | 2022-09-06 | Whitehead Institute For Biomedical Research | Methods of editing DNA methylation |
US12060588B2 (en) | 2016-08-19 | 2024-08-13 | Whitehead Institute For Biomedical Research | Methods of editing DNA methylation |
WO2018035495A1 (en) | 2016-08-19 | 2018-02-22 | Whitehead Institute For Biomedical Research | Methods of editing dna methylation |
US11542509B2 (en) | 2016-08-24 | 2023-01-03 | President And Fellows Of Harvard College | Incorporation of unnatural amino acids into proteins using base editing |
US12084663B2 (en) | 2016-08-24 | 2024-09-10 | President And Fellows Of Harvard College | Incorporation of unnatural amino acids into proteins using base editing |
US11306324B2 (en) | 2016-10-14 | 2022-04-19 | President And Fellows Of Harvard College | AAV delivery of nucleobase editors |
US11820969B2 (en) | 2016-12-23 | 2023-11-21 | President And Fellows Of Harvard College | Editing of CCR2 receptor gene to protect against HIV infection |
US10745677B2 (en) | 2016-12-23 | 2020-08-18 | President And Fellows Of Harvard College | Editing of CCR5 receptor gene to protect against HIV infection |
WO2018129544A1 (en) | 2017-01-09 | 2018-07-12 | Whitehead Institute For Biomedical Research | Methods of altering gene expression by perturbing transcription factor multimers that structure regulatory loops |
EP4249501A2 (en) | 2017-01-09 | 2023-09-27 | Whitehead Institute for Biomedical Research | Methods of altering gene expression by perturbing transcription factor multimers that structure regulatory loops |
US11873496B2 (en) | 2017-01-09 | 2024-01-16 | Whitehead Institute For Biomedical Research | Methods of altering gene expression by perturbing transcription factor multimers that structure regulatory loops |
US11466271B2 (en) | 2017-02-06 | 2022-10-11 | Novartis Ag | Compositions and methods for the treatment of hemoglobinopathies |
US11898179B2 (en) | 2017-03-09 | 2024-02-13 | President And Fellows Of Harvard College | Suppression of pain by gene editing |
US11542496B2 (en) | 2017-03-10 | 2023-01-03 | President And Fellows Of Harvard College | Cytosine to guanine base editor |
US11268082B2 (en) | 2017-03-23 | 2022-03-08 | President And Fellows Of Harvard College | Nucleobase editors comprising nucleic acid programmable DNA binding proteins |
US11560566B2 (en) | 2017-05-12 | 2023-01-24 | President And Fellows Of Harvard College | Aptazyme-embedded guide RNAs for use with CRISPR-Cas9 in genome editing and transcriptional activation |
EP3655530A4 (en) * | 2017-07-17 | 2021-07-28 | The Broad Institute, Inc. | Novel type vi crispr orthologs and systems |
US20200231975A1 (en) * | 2017-07-17 | 2020-07-23 | The Broad Institute, Inc. | Novel type vi crispr orthologs and systems |
US11732274B2 (en) | 2017-07-28 | 2023-08-22 | President And Fellows Of Harvard College | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) |
US11319532B2 (en) | 2017-08-30 | 2022-05-03 | President And Fellows Of Harvard College | High efficiency base editors comprising Gam |
US11932884B2 (en) | 2017-08-30 | 2024-03-19 | President And Fellows Of Harvard College | High efficiency base editors comprising Gam |
US11795443B2 (en) | 2017-10-16 | 2023-10-24 | The Broad Institute, Inc. | Uses of adenosine base editors |
WO2019126799A1 (en) * | 2017-12-22 | 2019-06-27 | Distributed Bio, Inc. | Major histocompatibility complex (mhc) compositions and methods of use thereof |
JP2021509577A (en) * | 2017-12-28 | 2021-04-01 | ザ ジェイ. デビッド グラッドストーン インスティテューツ、 ア テスタメンタリー トラスト エスタブリッシュド アンダー ザ ウィル オブ ジェイ. デビッド グラッドストーン | Generation of induced pluripotent cells by CRISPR activation |
JP7344877B2 (en) | 2017-12-28 | 2023-09-14 | ザ ジェイ. デビッド グラッドストーン インスティテューツ、 ア テスタメンタリー トラスト エスタブリッシュド アンダー ザ ウィル オブ ジェイ. デビッド グラッドストーン | Generation of induced pluripotent cells by CRISPR activation |
CN109022435A (en) * | 2018-07-19 | 2018-12-18 | 佛山科学技术学院 | Conditional induction mouse spermatogonium Tet3 gene knockout cell line and construction method thereof |
CN109022435B (en) * | 2018-07-19 | 2022-05-06 | 佛山科学技术学院 | Conditional induction mouse spermatogonium Tet3 gene knockout cell line and construction method thereof |
WO2020061591A1 (en) | 2018-09-21 | 2020-03-26 | President And Fellows Of Harvard College | Methods and compositions for treating diabetes, and methods for enriching mrna coding for secreted proteins |
US11447770B1 (en) | 2019-03-19 | 2022-09-20 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
US11643652B2 (en) | 2019-03-19 | 2023-05-09 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
US11795452B2 (en) | 2019-03-19 | 2023-10-24 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
US11912985B2 (en) | 2020-05-08 | 2024-02-27 | The Broad Institute, Inc. | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence |
US12031126B2 (en) | 2020-05-08 | 2024-07-09 | The Broad Institute, Inc. | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence |
CN112522312A (en) * | 2020-11-23 | 2021-03-19 | 福建省立医院 | WKH rat model construction method |
US11884915B2 (en) | 2021-09-10 | 2024-01-30 | Agilent Technologies, Inc. | Guide RNAs with chemical modification for prime editing |
US12098399B2 (en) | 2022-06-24 | 2024-09-24 | Tune Therapeutics, Inc. | Compositions, systems, and methods for epigenetic regulation of proprotein convertase subtilisin/kexin type 9 (PCSK9) gene expression |
Also Published As
Publication number | Publication date |
---|---|
US20160186208A1 (en) | 2016-06-30 |
WO2014172470A3 (en) | 2015-10-29 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US20160186208A1 (en) | Methods of Mutating, Modifying or Modulating Nucleic Acid in a Cell or Nonhuman Mammal | |
US12060588B2 (en) | Methods of editing DNA methylation | |
JP7083364B2 (en) | Optimized CRISPR-Cas dual nickase system, method and composition for sequence manipulation | |
US10023922B2 (en) | Reporter of genomic methylation and uses thereof | |
JP6700306B2 (en) | Pre-fertilization egg cell, fertilized egg, and method for modifying target gene | |
US20190134227A1 (en) | Generation of genetically engineered animals by crispr/cas9 genome editing in spermatogonial stem cells | |
KR101773782B1 (en) | Methods and compositions for the targeted modification of a genome | |
Ghanta et al. | 5′-Modifications improve potency and efficacy of DNA donors for precision genome editing | |
Zhang et al. | A high-throughput small molecule screen identifies farrerol as a potentiator of CRISPR/Cas9-mediated genome editing | |
JP2015500637A (en) | Haploid cells | |
JP2022549120A (en) | Highly Efficient RNA-Aptamer Recruitment-Mediated DNA Base Editors and Their Use for Targeted Genome Modification | |
US20210185990A1 (en) | Non-meiotic allele introgression | |
CA3012042A1 (en) | Systems and methods for in vivo dual recombinase-mediated cassette exchange (drmce) and disease models thereof | |
US11913015B2 (en) | Embryonic cell cultures and methods of using the same | |
JP2019537445A (en) | DNA plasmids for rapid generation of homologous recombination vectors for cell line development | |
Bona et al. | Fanconi anemia DNA crosslink repair factors protect against LINE-1 retrotransposition during mouse development | |
Tennant et al. | Fluorescent in vivo editing reporter (FIVER): a novel multispectral reporter of in vivo genome editing | |
Jones et al. | A tamoxifen inducible knock-in allele for investigation of E2A function | |
JP2024150691A (en) | Methods for editing DNA methylation | |
TW202332770A (en) | Methods for large-size chromosomal transfer and modified chromosomes and organisims using same | |
Anuar | Using TALENs to Knockout H2A. Lap1 Function in Mice | |
WO2023150503A2 (en) | Gene-editing methods for embryonic stem cells | |
KR20240011831A (en) | Methods for preventing rapid silencing of genes in pluripotent stem cells | |
Geula | Deciphering the molecular role of N6-Methyladenosine mRNA modification in development of mammalian stem cells | |
Leuchs | Conditional gene targeting of TP53 in pig - a model for Li-Fraumeni disease and gastric cancer | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 14785488; Country of ref document: EP; Kind code of ref document: A2 |
| | WWE | Wipo information: entry into national phase | Ref document number: 14785031; Country of ref document: US |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 14785488; Country of ref document: EP; Kind code of ref document: A2 |