WO2015048690A1 - Optimized small guide RNAs and methods of use
- Publication number
- WO2015048690A1 (PCT/US2014/058133)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- small guide
- nuclease
- nucleic acid
- guide rna
- cell
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 71
- 108091032973 (ribonucleotides)n+m Proteins 0.000 title description 2
- 102000040650 (ribonucleotides)n+m Human genes 0.000 title description 2
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 153
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 111
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 111
- 230000001404 mediated effect Effects 0.000 claims abstract description 64
- 239000000203 mixture Substances 0.000 claims abstract description 34
- 238000001514 detection method Methods 0.000 claims abstract description 25
- 230000004048 modification Effects 0.000 claims abstract description 12
- 238000012986 modification Methods 0.000 claims abstract description 12
- 108020005004 Guide RNA Proteins 0.000 claims description 263
- 210000004027 cell Anatomy 0.000 claims description 199
- 101710163270 Nuclease Proteins 0.000 claims description 158
- 108091033409 CRISPR Proteins 0.000 claims description 105
- 230000014509 gene expression Effects 0.000 claims description 70
- 125000003729 nucleotide group Chemical group 0.000 claims description 63
- 239000002773 nucleotide Substances 0.000 claims description 60
- 108091027544 Subgenomic mRNA Proteins 0.000 claims description 54
- 108090000623 proteins and genes Proteins 0.000 claims description 49
- 230000027455 binding Effects 0.000 claims description 45
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 41
- 230000000694 effects Effects 0.000 claims description 36
- 230000002950 deficient Effects 0.000 claims description 34
- 102000004169 proteins and genes Human genes 0.000 claims description 33
- 230000000295 complement effect Effects 0.000 claims description 28
- 230000035772 mutation Effects 0.000 claims description 28
- 210000000349 chromosome Anatomy 0.000 claims description 22
- 230000001965 increasing effect Effects 0.000 claims description 16
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical class O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 claims description 12
- 108091006047 fluorescent proteins Proteins 0.000 claims description 10
- 102000034287 fluorescent proteins Human genes 0.000 claims description 10
- 210000004940 nucleus Anatomy 0.000 claims description 9
- 230000030648 nucleus localization Effects 0.000 claims description 8
- 230000006780 non-homologous end joining Effects 0.000 claims description 6
- 150000003384 small molecules Chemical class 0.000 claims description 6
- 102000014450 RNA Polymerase III Human genes 0.000 claims description 5
- 108010078067 RNA Polymerase III Proteins 0.000 claims description 5
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 claims description 4
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 claims description 4
- 230000033616 DNA repair Effects 0.000 claims description 3
- 102100028251 Phosphoglycerate kinase 1 Human genes 0.000 claims description 3
- 101710139464 Phosphoglycerate kinase 1 Proteins 0.000 claims description 3
- 102100037935 Polyubiquitin-C Human genes 0.000 claims description 3
- 108010056354 Ubiquitin C Proteins 0.000 claims description 3
- 230000005030 transcription termination Effects 0.000 claims description 3
- 102220605874 Cytosolic arginine sensor for mTORC1 subunit 2_D10A_mutation Human genes 0.000 claims description 2
- 102000010292 Peptide Elongation Factor 1 Human genes 0.000 claims description 2
- 108010077524 Peptide Elongation Factor 1 Proteins 0.000 claims description 2
- 241000713880 Spleen focus-forming virus Species 0.000 claims description 2
- 239000000411 inducer Substances 0.000 claims description 2
- 239000012096 transfection reagent Substances 0.000 claims description 2
- 102000055501 telomere Human genes 0.000 description 56
- 108091035539 telomere Proteins 0.000 description 56
- 210000003411 telomere Anatomy 0.000 description 53
- 101000972286 Homo sapiens Mucin-4 Proteins 0.000 description 44
- 102100022693 Mucin-4 Human genes 0.000 description 43
- 238000003384 imaging method Methods 0.000 description 40
- 238000002372 labelling Methods 0.000 description 39
- 235000001014 amino acid Nutrition 0.000 description 31
- 229940024606 amino acid Drugs 0.000 description 31
- 150000001413 amino acids Chemical class 0.000 description 31
- 235000018102 proteins Nutrition 0.000 description 31
- 108090000765 processed proteins & peptides Proteins 0.000 description 28
- 238000013461 design Methods 0.000 description 22
- 102000004196 processed proteins & peptides Human genes 0.000 description 22
- 210000003583 retinal pigment epithelium Anatomy 0.000 description 20
- 230000003252 repetitive effect Effects 0.000 description 17
- 229920002477 rna polymer Polymers 0.000 description 17
- 238000006467 substitution reaction Methods 0.000 description 17
- 229920001184 polypeptide Polymers 0.000 description 16
- 230000008685 targeting Effects 0.000 description 16
- 230000002068 genetic effect Effects 0.000 description 15
- 239000013603 viral vector Substances 0.000 description 15
- 230000037430 deletion Effects 0.000 description 13
- 238000012217 deletion Methods 0.000 description 13
- 230000033001 locomotion Effects 0.000 description 13
- 108020004705 Codon Proteins 0.000 description 12
- 238000004458 analytical method Methods 0.000 description 12
- 125000003275 alpha amino acid group Chemical group 0.000 description 11
- 238000009792 diffusion process Methods 0.000 description 11
- 239000013598 vector Substances 0.000 description 11
- 102000053602 DNA Human genes 0.000 description 10
- 108020004414 DNA Proteins 0.000 description 10
- 241000713666 Lentivirus Species 0.000 description 10
- 108010077850 Nuclear Localization Signals Proteins 0.000 description 10
- 230000008045 co-localization Effects 0.000 description 10
- 238000010453 CRISPR/Cas method Methods 0.000 description 9
- 101001133056 Homo sapiens Mucin-1 Proteins 0.000 description 9
- 102100034256 Mucin-1 Human genes 0.000 description 9
- 239000000523 sample Substances 0.000 description 9
- 238000013518 transcription Methods 0.000 description 9
- 230000035897 transcription Effects 0.000 description 9
- 238000003776 cleavage reaction Methods 0.000 description 8
- 108010048367 enhanced green fluorescent protein Proteins 0.000 description 8
- 238000010362 genome editing Methods 0.000 description 8
- 230000007017 scission Effects 0.000 description 8
- 238000001890 transfection Methods 0.000 description 8
- 101150059949 MUC4 gene Proteins 0.000 description 7
- 208000009869 Neu-Laxova syndrome Diseases 0.000 description 7
- 238000007792 addition Methods 0.000 description 7
- 230000006870 function Effects 0.000 description 7
- 238000012360 testing method Methods 0.000 description 7
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 6
- 102000004190 Enzymes Human genes 0.000 description 6
- 108090000790 Enzymes Proteins 0.000 description 6
- 108091092195 Intron Proteins 0.000 description 6
- 108091093037 Peptide nucleic acid Proteins 0.000 description 6
- 239000003153 chemical reaction reagent Substances 0.000 description 6
- 230000002759 chromosomal effect Effects 0.000 description 6
- 229940088598 enzyme Drugs 0.000 description 6
- 239000005090 green fluorescent protein Substances 0.000 description 6
- 230000003993 interaction Effects 0.000 description 6
- 239000013612 plasmid Substances 0.000 description 6
- 230000005945 translocation Effects 0.000 description 6
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 5
- 108700008625 Reporter Genes Proteins 0.000 description 5
- 108091027967 Small hairpin RNA Proteins 0.000 description 5
- 241000193996 Streptococcus pyogenes Species 0.000 description 5
- 235000004279 alanine Nutrition 0.000 description 5
- 125000000539 amino acid group Chemical group 0.000 description 5
- 210000000170 cell membrane Anatomy 0.000 description 5
- 230000001186 cumulative effect Effects 0.000 description 5
- 210000005260 human cell Anatomy 0.000 description 5
- 208000015181 infectious disease Diseases 0.000 description 5
- 230000035515 penetration Effects 0.000 description 5
- 229920002401 polyacrylamide Polymers 0.000 description 5
- 229920000642 polymer Polymers 0.000 description 5
- 102000040430 polynucleotide Human genes 0.000 description 5
- 108091033319 polynucleotide Proteins 0.000 description 5
- 239000002157 polynucleotide Substances 0.000 description 5
- 239000004055 small Interfering RNA Substances 0.000 description 5
- 239000000126 substance Substances 0.000 description 5
- 108010077544 Chromatin Proteins 0.000 description 4
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 4
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 4
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical group CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 4
- 108091023040 Transcription factor Proteins 0.000 description 4
- 102000040945 Transcription factor Human genes 0.000 description 4
- 241000700605 Viruses Species 0.000 description 4
- 230000015572 biosynthetic process Effects 0.000 description 4
- 230000015556 catabolic process Effects 0.000 description 4
- 210000003483 chromatin Anatomy 0.000 description 4
- 150000001875 compounds Chemical class 0.000 description 4
- 238000006731 degradation reaction Methods 0.000 description 4
- LOKCTEFSRHRXRJ-UHFFFAOYSA-I dipotassium trisodium dihydrogen phosphate hydrogen phosphate dichloride Chemical compound P(=O)(O)(O)[O-].[K+].P(=O)(O)([O-])[O-].[Na+].[Na+].[Cl-].[K+].[Cl-].[Na+] LOKCTEFSRHRXRJ-UHFFFAOYSA-I 0.000 description 4
- 239000013604 expression vector Substances 0.000 description 4
- 108020001507 fusion proteins Proteins 0.000 description 4
- 102000037865 fusion proteins Human genes 0.000 description 4
- 238000001727 in vivo Methods 0.000 description 4
- 238000004519 manufacturing process Methods 0.000 description 4
- 229930182817 methionine Chemical group 0.000 description 4
- 239000002953 phosphate buffered saline Substances 0.000 description 4
- FWBHETKCLVMNFS-UHFFFAOYSA-N 4',6-Diamino-2-phenylindol Chemical compound C1=CC(C(=N)N)=CC=C1C1=CC2=CC=C(C(N)=N)C=C2N1 FWBHETKCLVMNFS-UHFFFAOYSA-N 0.000 description 3
- 108091026890 Coding region Proteins 0.000 description 3
- 239000006144 Dulbecco’s modified Eagle's medium Substances 0.000 description 3
- 108010043121 Green Fluorescent Proteins Proteins 0.000 description 3
- 102000004144 Green Fluorescent Proteins Human genes 0.000 description 3
- 101000800312 Homo sapiens TERF1-interacting nuclear factor 2 Proteins 0.000 description 3
- 101100346932 Mus musculus Muc1 gene Proteins 0.000 description 3
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 3
- 206010039491 Sarcoma Diseases 0.000 description 3
- 102100033085 TERF1-interacting nuclear factor 2 Human genes 0.000 description 3
- 102000007316 Telomeric Repeat Binding Protein 2 Human genes 0.000 description 3
- 108010033710 Telomeric Repeat Binding Protein 2 Proteins 0.000 description 3
- 230000002776 aggregation Effects 0.000 description 3
- 238000004220 aggregation Methods 0.000 description 3
- 230000033228 biological regulation Effects 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 238000002073 fluorescence micrograph Methods 0.000 description 3
- 238000010166 immunofluorescence Methods 0.000 description 3
- 238000007901 in situ hybridization Methods 0.000 description 3
- 238000000338 in vitro Methods 0.000 description 3
- 238000010348 incorporation Methods 0.000 description 3
- 230000037431 insertion Effects 0.000 description 3
- 238000003780 insertion Methods 0.000 description 3
- 230000004807 localization Effects 0.000 description 3
- 239000002853 nucleic acid probe Substances 0.000 description 3
- 230000005257 nucleotidylation Effects 0.000 description 3
- 230000008488 polyadenylation Effects 0.000 description 3
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 3
- 125000006850 spacer group Chemical group 0.000 description 3
- 238000010186 staining Methods 0.000 description 3
- 238000010361 transduction Methods 0.000 description 3
- 230000026683 transduction Effects 0.000 description 3
- 229940035893 uracil Drugs 0.000 description 3
- 230000003612 virological effect Effects 0.000 description 3
- 238000011179 visual inspection Methods 0.000 description 3
- 229930024421 Adenine Natural products 0.000 description 2
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 2
- 108700028369 Alleles Proteins 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- 108091003079 Bovine Serum Albumin Proteins 0.000 description 2
- 238000010446 CRISPR interference Methods 0.000 description 2
- 102000014669 Chromo shadow domains Human genes 0.000 description 2
- 108050005011 Chromo shadow domains Proteins 0.000 description 2
- 241001112695 Clostridiales Species 0.000 description 2
- 101710096438 DNA-binding protein Proteins 0.000 description 2
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 2
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 2
- 208000026350 Inborn Genetic disease Diseases 0.000 description 2
- LRQKBLKVPFOOQJ-YFKPBYRVSA-N L-norleucine Chemical group CCCC[C@H]([NH3+])C([O-])=O LRQKBLKVPFOOQJ-YFKPBYRVSA-N 0.000 description 2
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 2
- 102100025169 Max-binding protein MNT Human genes 0.000 description 2
- 108060004795 Methyltransferase Proteins 0.000 description 2
- 241000713869 Moloney murine leukemia virus Species 0.000 description 2
- 108010063954 Mucins Proteins 0.000 description 2
- 241001529936 Murinae Species 0.000 description 2
- 229930040373 Paraformaldehyde Natural products 0.000 description 2
- 108010076504 Protein Sorting Signals Proteins 0.000 description 2
- 208000037280 Trisomy Diseases 0.000 description 2
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 2
- DRTQHJPVMGBUCF-XVFCMESISA-N Uridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-XVFCMESISA-N 0.000 description 2
- 108010017070 Zinc Finger Nucleases Proteins 0.000 description 2
- 230000009056 active transport Effects 0.000 description 2
- 229960000643 adenine Drugs 0.000 description 2
- 230000001580 bacterial effect Effects 0.000 description 2
- 229940098773 bovine serum albumin Drugs 0.000 description 2
- 239000000872 buffer Substances 0.000 description 2
- 229910052799 carbon Inorganic materials 0.000 description 2
- 239000013592 cell lysate Substances 0.000 description 2
- 230000011088 chloroplast localization Effects 0.000 description 2
- 239000006059 cover glass Substances 0.000 description 2
- 201000010099 disease Diseases 0.000 description 2
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 2
- 238000006073 displacement reaction Methods 0.000 description 2
- 238000009826 distribution Methods 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 210000001723 extracellular space Anatomy 0.000 description 2
- 239000012634 fragment Substances 0.000 description 2
- 230000004927 fusion Effects 0.000 description 2
- 208000016361 genetic disease Diseases 0.000 description 2
- 239000005556 hormone Substances 0.000 description 2
- 229940088597 hormone Drugs 0.000 description 2
- 229910052739 hydrogen Inorganic materials 0.000 description 2
- 239000001257 hydrogen Substances 0.000 description 2
- 230000036039 immunity Effects 0.000 description 2
- 238000011065 in-situ storage Methods 0.000 description 2
- 238000011534 incubation Methods 0.000 description 2
- 230000001939 inductive effect Effects 0.000 description 2
- 238000010859 live-cell imaging Methods 0.000 description 2
- 210000004962 mammalian cell Anatomy 0.000 description 2
- 239000011159 matrix material Substances 0.000 description 2
- 230000007246 mechanism Effects 0.000 description 2
- 230000025608 mitochondrion localization Effects 0.000 description 2
- 230000009149 molecular binding Effects 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 230000008520 organization Effects 0.000 description 2
- 229920002866 paraformaldehyde Polymers 0.000 description 2
- 239000000047 product Substances 0.000 description 2
- 238000011002 quantification Methods 0.000 description 2
- 230000008707 rearrangement Effects 0.000 description 2
- 230000001718 repressive effect Effects 0.000 description 2
- 230000028617 response to DNA damage stimulus Effects 0.000 description 2
- 230000009870 specific binding Effects 0.000 description 2
- 230000006641 stabilisation Effects 0.000 description 2
- 238000011105 stabilization Methods 0.000 description 2
- 239000000758 substrate Substances 0.000 description 2
- 108010057210 telomerase RNA Proteins 0.000 description 2
- 108091006107 transcriptional repressors Proteins 0.000 description 2
- UKAUYVFTDYCKQA-UHFFFAOYSA-N -2-Amino-4-hydroxybutanoic acid Natural products OC(=O)C(N)CCO UKAUYVFTDYCKQA-UHFFFAOYSA-N 0.000 description 1
- VGONTNSXDCQUGY-RRKCRQDMSA-N 2'-deoxyinosine Chemical group C1[C@H](O)[C@@H](CO)O[C@H]1N1C(N=CNC2=O)=C2N=C1 VGONTNSXDCQUGY-RRKCRQDMSA-N 0.000 description 1
- BFSVOASYOCHEOV-UHFFFAOYSA-N 2-diethylaminoethanol Chemical compound CCN(CC)CCO BFSVOASYOCHEOV-UHFFFAOYSA-N 0.000 description 1
- 241001156739 Actinobacteria <phylum> Species 0.000 description 1
- 241000251468 Actinopterygii Species 0.000 description 1
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 1
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 1
- 241001142141 Aquificae <phylum> Species 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 241000972773 Aulopiformes Species 0.000 description 1
- 238000012935 Averaging Methods 0.000 description 1
- 241000271566 Aves Species 0.000 description 1
- 102100026189 Beta-galactosidase Human genes 0.000 description 1
- 241000167854 Bourreria succulenta Species 0.000 description 1
- 206010006187 Breast cancer Diseases 0.000 description 1
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 1
- 102000014914 Carrier Proteins Human genes 0.000 description 1
- 108010035563 Chloramphenicol O-acetyltransferase Proteins 0.000 description 1
- 241000191368 Chlorobi Species 0.000 description 1
- 208000003322 Coinfection Diseases 0.000 description 1
- 241000192700 Cyanobacteria Species 0.000 description 1
- 230000005778 DNA damage Effects 0.000 description 1
- 231100000277 DNA damage Toxicity 0.000 description 1
- 230000004568 DNA-binding Effects 0.000 description 1
- 241000588724 Escherichia coli Species 0.000 description 1
- 108700024394 Exon Proteins 0.000 description 1
- 241000282324 Felis Species 0.000 description 1
- 241000192125 Firmicutes Species 0.000 description 1
- 108010010803 Gelatin Proteins 0.000 description 1
- 108700007698 Genetic Terminator Regions Proteins 0.000 description 1
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 1
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 108090000288 Glycoproteins Proteins 0.000 description 1
- 102000003886 Glycoproteins Human genes 0.000 description 1
- HVLSXIKZNLPZJJ-TXZCQADKSA-N HA peptide Chemical compound C([C@@H](C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@H]1N(CCC1)C(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HVLSXIKZNLPZJJ-TXZCQADKSA-N 0.000 description 1
- 208000009889 Herpes Simplex Diseases 0.000 description 1
- 108010068250 Herpes Simplex Virus Protein Vmw65 Proteins 0.000 description 1
- 229920000209 Hexadimethrine bromide Polymers 0.000 description 1
- 101000670189 Homo sapiens Ribulose-phosphate 3-epimerase Proteins 0.000 description 1
- 241000700588 Human alphaherpesvirus 1 Species 0.000 description 1
- 102000004157 Hydrolases Human genes 0.000 description 1
- 108090000604 Hydrolases Proteins 0.000 description 1
- PMMYEEVYMWASQN-DMTCNVIQSA-N Hydroxyproline Chemical compound O[C@H]1CN[C@H](C(O)=O)C1 PMMYEEVYMWASQN-DMTCNVIQSA-N 0.000 description 1
- 206010061598 Immunodeficiency Diseases 0.000 description 1
- 208000029462 Immunodeficiency disease Diseases 0.000 description 1
- 241000726306 Irus Species 0.000 description 1
- 208000008839 Kidney Neoplasms Diseases 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 1
- UKAUYVFTDYCKQA-VKHMYHEASA-N L-homoserine Chemical group OC(=O)[C@@H](N)CCO UKAUYVFTDYCKQA-VKHMYHEASA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- QEFRNWWLZKMPFJ-ZXPFJRLXSA-N L-methionine (R)-S-oxide Chemical group C[S@@](=O)CC[C@H]([NH3+])C([O-])=O QEFRNWWLZKMPFJ-ZXPFJRLXSA-N 0.000 description 1
- QEFRNWWLZKMPFJ-UHFFFAOYSA-N L-methionine sulphoxide Chemical group CS(=O)CCC(N)C(O)=O QEFRNWWLZKMPFJ-UHFFFAOYSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 102000004882 Lipase Human genes 0.000 description 1
- 108090001060 Lipase Proteins 0.000 description 1
- 239000004367 Lipase Substances 0.000 description 1
- 108060001084 Luciferase Proteins 0.000 description 1
- 239000005089 Luciferase Substances 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 1
- 101150114927 MUC1 gene Proteins 0.000 description 1
- 208000035032 Multiple sulfatase deficiency Diseases 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 102000007999 Nuclear Proteins Human genes 0.000 description 1
- 108010089610 Nuclear Proteins Proteins 0.000 description 1
- 108091005461 Nucleic proteins Proteins 0.000 description 1
- 108020002230 Pancreatic Ribonuclease Proteins 0.000 description 1
- 102000005891 Pancreatic ribonuclease Human genes 0.000 description 1
- 102000057297 Pepsin A Human genes 0.000 description 1
- 108090000284 Pepsin A Proteins 0.000 description 1
- 102000035195 Peptidases Human genes 0.000 description 1
- 108091005804 Peptidases Proteins 0.000 description 1
- 102000002508 Peptide Elongation Factors Human genes 0.000 description 1
- 108010068204 Peptide Elongation Factors Proteins 0.000 description 1
- BELBBZDIHDAJOR-UHFFFAOYSA-N Phenolsulfonephthalein Chemical compound C1=CC(O)=CC=C1C1(C=2C=CC(O)=CC=2)C2=CC=CC=C2S(=O)(=O)O1 BELBBZDIHDAJOR-UHFFFAOYSA-N 0.000 description 1
- 102000004160 Phosphoric Monoester Hydrolases Human genes 0.000 description 1
- 108090000608 Phosphoric Monoester Hydrolases Proteins 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 241000192142 Proteobacteria Species 0.000 description 1
- 206010038389 Renal cancer Diseases 0.000 description 1
- 108020004422 Riboswitch Proteins 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 241001180364 Spirochaetes Species 0.000 description 1
- 241000194020 Streptococcus thermophilus Species 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 238000010459 TALEN Methods 0.000 description 1
- 101710192266 Tegument protein VP22 Proteins 0.000 description 1
- 102000010823 Telomere-Binding Proteins Human genes 0.000 description 1
- 108010038599 Telomere-Binding Proteins Proteins 0.000 description 1
- 102000007315 Telomeric Repeat Binding Protein 1 Human genes 0.000 description 1
- 108010033711 Telomeric Repeat Binding Protein 1 Proteins 0.000 description 1
- 241001143310 Thermotogae <phylum> Species 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 102100040396 Transcobalamin-1 Human genes 0.000 description 1
- 101710124861 Transcobalamin-1 Proteins 0.000 description 1
- 108010043645 Transcription Activator-Like Effector Nucleases Proteins 0.000 description 1
- 229920004890 Triton X-100 Polymers 0.000 description 1
- 239000013504 Triton X-100 Substances 0.000 description 1
- 206010046865 Vaccinia virus infection Diseases 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- 210000001766 X chromosome Anatomy 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 230000035508 accumulation Effects 0.000 description 1
- 238000009825 accumulation Methods 0.000 description 1
- 230000003213 activating effect Effects 0.000 description 1
- 239000012190 activator Substances 0.000 description 1
- 238000007605 air drying Methods 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 150000003862 amino acid derivatives Chemical class 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 108010005774 beta-Galactosidase Proteins 0.000 description 1
- DRTQHJPVMGBUCF-PSQAKQOGSA-N beta-L-uridine Natural products O[C@H]1[C@@H](O)[C@H](CO)O[C@@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-PSQAKQOGSA-N 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 108091008324 binding proteins Proteins 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 230000008499 blood brain barrier function Effects 0.000 description 1
- 210000001218 blood-brain barrier Anatomy 0.000 description 1
- 108091005948 blue fluorescent proteins Proteins 0.000 description 1
- 210000004899 c-terminal region Anatomy 0.000 description 1
- 229910000389 calcium phosphate Inorganic materials 0.000 description 1
- 239000001506 calcium phosphate Substances 0.000 description 1
- 235000011010 calcium phosphates Nutrition 0.000 description 1
- 201000011510 cancer Diseases 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- UHBYWPGGCSDKFX-UHFFFAOYSA-N carboxyglutamic acid Chemical compound OC(=O)C(N)CC(C(O)=O)C(O)=O UHBYWPGGCSDKFX-UHFFFAOYSA-N 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000032823 cell division Effects 0.000 description 1
- 210000003855 cell nucleus Anatomy 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 235000019693 cherries Nutrition 0.000 description 1
- 239000013611 chromosomal DNA Substances 0.000 description 1
- 230000008711 chromosomal rearrangement Effects 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 239000002299 complementary DNA Substances 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 235000018417 cysteine Nutrition 0.000 description 1
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 1
- 230000002559 cytogenic effect Effects 0.000 description 1
- 230000003013 cytotoxicity Effects 0.000 description 1
- 231100000135 cytotoxicity Toxicity 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 230000007123 defense Effects 0.000 description 1
- 230000018044 dehydration Effects 0.000 description 1
- 238000006297 dehydration reaction Methods 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 230000000368 destabilizing effect Effects 0.000 description 1
- 229960000633 dextran sulfate Drugs 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- PMMYEEVYMWASQN-UHFFFAOYSA-N dl-hydroxyproline Natural products OC1C[NH2+]C(C([O-])=O)C1 PMMYEEVYMWASQN-UHFFFAOYSA-N 0.000 description 1
- 229960003722 doxycycline Drugs 0.000 description 1
- XQTWDDCIUJNLTR-CVHRZJFOSA-N doxycycline monohydrate Chemical compound O.O=C1C2=C(O)C=CC=C2[C@H](C)[C@@H]2C1=C(O)[C@]1(O)C(=O)C(C(N)=O)=C(O)[C@@H](N(C)C)[C@@H]1[C@H]2O XQTWDDCIUJNLTR-CVHRZJFOSA-N 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 239000000975 dye Substances 0.000 description 1
- 238000004520 electroporation Methods 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 230000006718 epigenetic regulation Effects 0.000 description 1
- 210000000981 epithelium Anatomy 0.000 description 1
- 239000013613 expression plasmid Substances 0.000 description 1
- 235000019688 fish Nutrition 0.000 description 1
- 238000000684 flow cytometry Methods 0.000 description 1
- 239000007850 fluorescent dye Substances 0.000 description 1
- 229920000159 gelatin Polymers 0.000 description 1
- 239000008273 gelatin Substances 0.000 description 1
- 235000019322 gelatine Nutrition 0.000 description 1
- 235000011852 gelatine desserts Nutrition 0.000 description 1
- 230000004545 gene duplication Effects 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 239000003102 growth factor Substances 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 230000036541 health Effects 0.000 description 1
- 102000056048 human MUC4 Human genes 0.000 description 1
- 238000009396 hybridization Methods 0.000 description 1
- 150000002431 hydrogen Chemical group 0.000 description 1
- 125000004435 hydrogen atom Chemical group [H]* 0.000 description 1
- 229960002591 hydroxyproline Drugs 0.000 description 1
- 210000000987 immune system Anatomy 0.000 description 1
- 230000007813 immunodeficiency Effects 0.000 description 1
- 238000012744 immunostaining Methods 0.000 description 1
- 230000001976 improved effect Effects 0.000 description 1
- 230000000415 inactivating effect Effects 0.000 description 1
- 230000006698 induction Effects 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 230000016507 interphase Effects 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 210000003734 kidney Anatomy 0.000 description 1
- 201000010982 kidney cancer Diseases 0.000 description 1
- 208000021601 lentivirus infection Diseases 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 235000019421 lipase Nutrition 0.000 description 1
- 239000002502 liposome Substances 0.000 description 1
- 238000012423 maintenance Methods 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 239000002609 medium Substances 0.000 description 1
- 108020004999 messenger RNA Proteins 0.000 description 1
- 239000002207 metabolite Substances 0.000 description 1
- 230000031864 metaphase Effects 0.000 description 1
- LSDPWZHWYPCBBB-UHFFFAOYSA-O methylsulfide anion Chemical compound [SH2+]C LSDPWZHWYPCBBB-UHFFFAOYSA-O 0.000 description 1
- 238000000520 microinjection Methods 0.000 description 1
- 238000002715 modification method Methods 0.000 description 1
- 108091005601 modified peptides Proteins 0.000 description 1
- 201000006033 mucosulfatidosis Diseases 0.000 description 1
- 210000003097 mucus Anatomy 0.000 description 1
- 230000002071 myeloproliferative effect Effects 0.000 description 1
- 239000002105 nanoparticle Substances 0.000 description 1
- 230000017074 necrotic cell death Effects 0.000 description 1
- 239000013642 negative control Substances 0.000 description 1
- 210000000633 nuclear envelope Anatomy 0.000 description 1
- 230000009437 off-target effect Effects 0.000 description 1
- 238000012235 off-target genome editing Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000002018 overexpression Effects 0.000 description 1
- 238000004806 packaging method and process Methods 0.000 description 1
- 239000002245 particle Substances 0.000 description 1
- 108010043655 penetratin Proteins 0.000 description 1
- 229940111202 pepsin Drugs 0.000 description 1
- 230000008823 permeabilization Effects 0.000 description 1
- 229960003531 phenolsulfonphthalein Drugs 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- BZQFBWGGLXLEPQ-REOHCLBHSA-N phosphoserine Chemical compound OC(=O)[C@@H](N)COP(O)(O)=O BZQFBWGGLXLEPQ-REOHCLBHSA-N 0.000 description 1
- 229920001223 polyethylene glycol Polymers 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 230000026447 protein localization Effects 0.000 description 1
- 238000000746 purification Methods 0.000 description 1
- 239000001397 quillaja saponaria molina bark Substances 0.000 description 1
- 230000037425 regulation of transcription Effects 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000008439 repair process Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 238000012827 research and development Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 235000019515 salmon Nutrition 0.000 description 1
- 229930182490 saponin Natural products 0.000 description 1
- 150000007949 saponins Chemical class 0.000 description 1
- 238000005204 segregation Methods 0.000 description 1
- 230000035945 sensitivity Effects 0.000 description 1
- 239000000243 solution Substances 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 210000000952 spleen Anatomy 0.000 description 1
- 230000000087 stabilizing effect Effects 0.000 description 1
- 230000010473 stable expression Effects 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 230000004960 subcellular localization Effects 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 239000013589 supplement Substances 0.000 description 1
- 230000000153 supplemental effect Effects 0.000 description 1
- 238000002287 time-lapse microscopy Methods 0.000 description 1
- 210000001519 tissue Anatomy 0.000 description 1
- FGMPLJWBKKVCDB-UHFFFAOYSA-N trans-L-hydroxy-proline Natural products ON1CCCC1C(O)=O FGMPLJWBKKVCDB-UHFFFAOYSA-N 0.000 description 1
- 108091006106 transcriptional activators Proteins 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 238000003151 transfection method Methods 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 1
- 230000005740 tumor formation Effects 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- DRTQHJPVMGBUCF-UHFFFAOYSA-N uracil arabinoside Natural products OC1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-UHFFFAOYSA-N 0.000 description 1
- 229940045145 uridine Drugs 0.000 description 1
- 208000007089 vaccinia Diseases 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 230000009385 viral infection Effects 0.000 description 1
- 238000012800 visualization Methods 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
- 108091005957 yellow fluorescent proteins Proteins 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/111—General methods applicable to biologically active non-coding nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/50—Physical structure
- C12N2310/53—Physical structure partially self-complementary or closed
- C12N2310/531—Stem-loop; Hairpin
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/80—Vectors containing sites for inducing double-stranded breaks, e.g. meganuclease restriction sites
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Y—ENZYMES
- C12Y301/00—Hydrolases acting on ester bonds (3.1)
Definitions
- CRISPR Clustered, regularly interspaced short palindromic repeat
- Cas CRISPR-associated genes
- the mechanism underlying this immunity is based on sequence specific cleavage of foreign nucleic acids by a CRISPR:Cas complex that contains the Cas nuclease and a guide RNA derived from the CRISPR sequences that provides target sequence specificity through a single stranded binding region. Binding of the CRISPR:Cas complex to the target sequence results in double stranded cleavage of the target sequence.
- the CRISPR/Cas system has been modified for use in prokaryotic and eukaryotic systems for genome editing and transcriptional regulation.
- this invention provides a small guide RNA molecule comprising from 5' to 3': a binding region, comprising between about 5 and about 50 nucleotides; a 5' hairpin region, comprising: fewer than four consecutive uracil nucleotides; or a length of at least 31 nucleotides; and a 3' hairpin region; and a transcription termination sequence, wherein the small guide RNA is configured to form a complex with a small guide RNA-mediated nuclease, the complex having increased stability or activity relative to a complex containing a small guide RNA-mediated nuclease and a small guide RNA comprising at least 95% identity to SEQ ID NO:1 or a complement thereof.
- the 5' hairpin region of the small guide RNA molecule comprises fewer than four consecutive uracil nucleotides and a length of at least 31 nucleotides, a length of at least 35 nucleotides, or a length of at least 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides.
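The two 5' hairpin criteria stated above (fewer than four consecutive uracil nucleotides, and a length of at least 31 nucleotides) can be checked mechanically. The following is a minimal sketch; the function names and the toy sequences in the usage note are hypothetical and are not sequences from this document.

```python
import re


def longest_uracil_run(rna: str) -> int:
    """Return the length of the longest run of consecutive U in an RNA string."""
    runs = re.findall(r"U+", rna.upper())
    return max((len(run) for run in runs), default=0)


def hairpin_meets_criteria(hairpin: str, min_length: int = 31, max_u_run: int = 3) -> dict:
    """Evaluate the two 5' hairpin criteria separately.

    A DNA-style input (with T) is converted to RNA (with U) first.
    The thresholds default to the values quoted in the text:
    at most 3 consecutive U, and at least 31 nucleotides.
    """
    seq = hairpin.upper().replace("T", "U")
    return {
        "fewer_than_four_consecutive_U": longest_uracil_run(seq) <= max_u_run,
        "length_at_least_min": len(seq) >= min_length,
    }
```

For example, `hairpin_meets_criteria("GUUUUAGAGC")` reports that this short toy string fails both criteria, since it contains a run of four U and is only 10 nucleotides long.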
- the small guide RNA molecule further comprises an additional hairpin region designed to interact with a protein or small-molecule to
- the small guide RNA molecule comprises at least 95% identity to SEQ ID NOs:2, 3, or 4, or a complement thereof.
- the invention provides a composition for nucleic acid modification or detection comprising any of the foregoing small guide RNA molecules.
- the composition further comprises a small guide RNA-mediated nuclease, wherein the small guide RNA and the small guide RNA-mediated nuclease form a complex having increased stability or activity relative to a complex containing a small guide RNA comprising at least 95% identity to SEQ ID NO:1 or a complement thereof.
- the composition is nuclease defective, thereby forming a complex configured to bind to, but not cleave or nick, a target nucleic acid substantially complementary to the binding region of the small guide RNA.
- the nuclease defective composition comprises a Cas9 protein containing a mutation at one or more of the following residues: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987.
- the nuclease defective composition comprises a Cas9 protein containing a mutation at two or more of the following residues: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, and A987.
- the nuclease defective composition comprises a Cas9 protein containing a D10A and a H840A mutation.
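As an illustration of how a set of Cas9 point mutations could be compared against the residue list given above, here is a small sketch. It assumes mutations are written in the common "wild-type residue, position, new residue" form (for example D10A); the function names are hypothetical, and the code says nothing about the functional effect of any mutation.

```python
import re
from typing import Iterable, Set, Tuple

# Residues listed in the text as candidate mutation sites in Cas9.
LISTED_RESIDUES = {
    "D10", "G12", "G17", "E762", "H840", "N854",
    "N863", "H982", "H983", "A984", "D986", "A987",
}


def parse_mutation(mutation: str) -> Tuple[str, str]:
    """Split a mutation such as 'D10A' into ('D10', 'A')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation.strip().upper())
    if match is None:
        raise ValueError(f"Unrecognized mutation format: {mutation!r}")
    wild_type, position, replacement = match.groups()
    return wild_type + position, replacement


def listed_residues_mutated(mutations: Iterable[str]) -> Set[str]:
    """Return the subset of listed residues hit by the given mutations."""
    hits = set()
    for mutation in mutations:
        residue, _ = parse_mutation(mutation)
        if residue in LISTED_RESIDUES:
            hits.add(residue)
    return hits
```

Calling `listed_residues_mutated(["D10A", "H840A"])` returns both residues, corresponding to the "two or more" case described above.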
- the nuclease defective composition comprises a labeled Cas9 protein.
- the labeled Cas9 protein comprises a fluorophore.
- the fluorophore is a fluorescent protein.
- the nuclease defective composition comprises a Cas9 protein that comprises a polypeptide that modulates (e.g., activates or represses) transcription.
- the Cas9 protein can be a Cas9 fused to a transcriptional repressor, including but not limited to, a chromoshadow domain (CSD).
- the Cas9 protein can be a Cas9 fused to a transcriptional repressor, including but not limited to, a Krüppel associated box (KRAB) domain.
- the Cas9 protein can be a Cas9 fused to a transcriptional activator, including but not limited to, VP8, VP16, or VP64.
- the composition has nuclease activity, thereby forming a complex configured to bind and cleave a target nucleic acid sequence substantially complementary to the binding region of the small guide RNA.
- the small guide RNA-mediated nuclease has nicking activity, but is substantially defective at catalyzing double stranded breaks in the target sequence.
- the small guide RNA-mediated nuclease comprises a Cas9 protein containing a mutation at one or more of the following residues D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987.
- the small guide RNA-mediated nuclease comprises a Cas9 protein with nicking activity containing a mutation at one or more of the following residues D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987.
- the invention provides an expression cassette comprising a promoter operably linked to a nucleic acid encoding any of the small guide RNAs of claims 1-4.
- the promoter of the expression cassette is an RNA polymerase III promoter.
- the RNA polymerase III promoter is a U6 or H1 promoter, preferably a U6 promoter.
- the promoter of the expression cassette is an SFFV promoter.
- the invention provides an expression cassette comprising a promoter operably linked to a nucleic acid encoding a small guide RNA-mediated nuclease.
- the promoter is a weak mammalian promoter as compared to the human elongation factor 1 promoter (EF1A).
- the weak mammalian promoter is a ubiquitin C promoter or a phosphoglycerate kinase 1 promoter (PGK).
- the weak mammalian promoter is a TetOn promoter in the absence of an inducer.
- the nucleic acid encoding a small guide RNA-mediated nuclease of the expression cassette further encodes one or two nuclear localization sequences.
- the present invention provides a host cell comprising any one of the foregoing small guide RNAs.
- the cell further comprises a small guide RNA-mediated nuclease.
- the small guide RNA-mediated nuclease is labeled, such as with a fluorophore, such as a fluorescent protein.
- the present invention provides a method of detecting a target nucleic acid sequence in a cell, the method comprising: (i) introducing into the cell: (a) one or more small guide RNAs, each small guide RNA specific for the target nucleic acid sequence; and (b) a labeled nuclease-deficient small guide RNA-mediated nuclease, thereby forming a labeled RNA:nuclease complex; and (ii) incubating the cell to allow the labeled RNA:nuclease complex to localize to the target nucleic acid sequence; and (iii) detecting the presence, absence, or quantity of labeled complex in the nucleus of the cell, thereby detecting the target nucleic acid.
- the method further comprises introducing at least 2, 3, 4, 5, 6, 7, 8, 10, 15, 20 (e.g., 3-20, 4-10, etc.) or more different small guide RNAs, each specific for a different portion of the target nucleic acid sequence.
- the one or more small guide RNAs are specific for a repeated target sequence.
- the repeated target sequence comprises at least 5, 10, 15, 20 or more contiguous repeats, each repeat of at least 5, 10, 15, 20, or more nucleotides in length.
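Whether a candidate region qualifies as a repeated target in the sense above (a minimum number of contiguous repeats, each of a minimum length) can be tested with a simple scan. This is a sketch under stated assumptions: the function names are hypothetical, matching is exact (no mismatches tolerated), and the example unit TTAGGG is the well-known human telomeric repeat, used here only as a familiar illustration.

```python
def max_contiguous_repeats(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back exact copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    seq, unit = sequence.upper(), unit.upper()
    best = 0
    for start in range(len(seq)):
        count = 0
        pos = start
        # Count copies of the unit that follow one another without gaps.
        while seq.startswith(unit, pos):
            count += 1
            pos += len(unit)
        best = max(best, count)
    return best


def is_repeated_target(sequence: str, unit: str,
                       min_repeats: int = 5, min_unit_length: int = 5) -> bool:
    """Apply the lowest thresholds quoted in the text: >= 5 repeats of a unit >= 5 nt."""
    return (len(unit) >= min_unit_length
            and max_contiguous_repeats(sequence, unit) >= min_repeats)
```

For instance, `is_repeated_target("TTAGGG" * 6, "TTAGGG")` is true, while a region with only two adjacent copies of the unit is not.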
- the small guide RNA-mediated nuclease contains a mutation at one or more of the following residues D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987. In some cases, the small guide RNA-mediated nuclease contains a mutation at two or more of the following residues D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987.
- the introducing further comprises: (a) forming a complex between: (1) a first labeled nuclease-deficient small guide RNA-mediated nuclease and one or more small guide RNAs, to form a first labeled complex; and (2) a second labeled nuclease-deficient small guide RNA-mediated nuclease and one or more small guide RNAs, to form a second labeled complex; and (b) contacting the cell with the first and second complexes.
- the method further comprises forming a third and fourth labeled complex and contacting the cell with the four labeled complexes.
- each labeled complex is specific for a different target nucleic acid sequence or specific for a different region of a chromosome. In some cases, each labeled complex is labeled with a different label, and the method comprises detecting the presence, absence, or quantity of each labeled complex in the nucleus of the cell.
- the present invention provides a method of modifying a target nucleic acid sequence in a cell, the method comprising: (i) introducing into the cell: (a) any of the foregoing small guide RNAs, the small guide RNA specific for the target nucleic acid sequence; and (b) a small guide RNA-mediated nuclease, thereby forming a small guide RNA:nuclease complex; and (ii) incubating the cell to allow the small guide RNA:nuclease complex to bind to and cleave or nick the target nucleic acid, thereby modifying the target nucleic acid sequence in the cell.
- the small guide RNA-mediated nuclease is capable of catalyzing double stranded breaks in the target nucleic acid, and the method further comprises cleaving the target nucleic acid. In some cases, the small guide RNA-mediated nuclease is capable of nicking the target nucleic acid but not catalyzing double stranded breaks in the target nucleic acid, and the method further comprises nicking the target nucleic acid.
- the method further comprises (i) introducing into the cell: (a) a pair of any of the foregoing small guide RNAs, each small guide RNA specific for a target nucleic acid, wherein the pair of small guide RNAs bind to target nucleic acids on a chromosome and flank a nucleic acid region of interest; and (b) a small guide RNA-mediated nuclease, thereby forming a pair of small guide RNA:nuclease complexes; and (ii) incubating the cell to allow the small guide RNA:nuclease complexes to localize to the target nucleic acid sequence, thereby creating nicks or double stranded breaks that flank the nucleic acid region of interest; and (iii) incubating the cell to allow non-homologous end joining (NHEJ) or homologous DNA repair (HDR) to occur, thereby reducing heterozygosity in the cell or deleting at least a portion of the nucle
- the method further comprises introducing into the cell a heterologous nucleic acid that contains regions of substantial homology to the nucleic acid region of interest, thereby incorporating at least a portion of the heterologous nucleic acid into the nucleic acid region of interest.
- the present invention provides a kit comprising an sgRNA and a labeled nuclease defective sgRNA-mediated nuclease.
- the kit further comprises a second sgRNA or a second labeled nuclease defective sgRNA-mediated nuclease.
- the kit further comprises a cell transfection reagent.
- Fig. 1 depicts an optimized CRISPR system for visualizing sequence-specific genomic elements in living human cells.
- a dCas9-EGFP fusion protein and sgRNAs allow local enrichment of fluorescence signals at specific genomic sites in living cells.
- B The CRISPR imaging system consists of a doxycycline-inducible expression system with low dCas9-EGFP expression level, a Tet-on 3G trans-activator, and custom designed sgRNAs expressed from a murine U6 promoter.
- C Original and optimized sgRNA designs.
- the optimized sgRNA contains an A-U pair flip (underlined) and a 5-bp extension (underlined) at the top of the hairpin (boxed).
- D CRISPR imaging of the human telomeric DNA sequence in RPE cells. The sgRNA target site is indicated by a gray line and the adjacent PAM is shown.
- the optimized sgRNA (F+E) shows a much higher labeling efficiency with lower background fluorescence signals. sgGAL4 is the negative control without cognate binding sites in the genome. Scale bar, 5 µm.
- E Histograms of telomere counts and distribution of telomere fluorescence intensity.
- telomeres show greatly enhanced labeling efficiency of telomeres.
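The Fig. 1 legend refers to an sgRNA target site and its adjacent PAM. As a rough sketch of how such sites might be located in a DNA string, the code below scans the given strand for a protospacer followed immediately by a PAM. It assumes the NGG PAM commonly associated with Streptococcus pyogenes Cas9 and a 20-nucleotide protospacer; neither value is stated in this section, the function name is hypothetical, and the reverse strand is not searched.

```python
from typing import List, Tuple


def find_target_sites(dna: str, protospacer_length: int = 20) -> List[Tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for each site followed by an NGG PAM.

    Only the strand as given is scanned. `start` is the 0-based index of the
    first protospacer nucleotide.
    """
    seq = dna.upper()
    sites = []
    # pam_start is the index of the 'N' in NGG; the PAM must fit in the sequence.
    for pam_start in range(protospacer_length, len(seq) - 2):
        pam = seq[pam_start:pam_start + 3]
        if pam[1:] == "GG":  # assumed NGG PAM
            start = pam_start - protospacer_length
            sites.append((start, seq[start:pam_start], pam))
    return sites
```

The `protospacer_length` argument can be changed to explore shorter or longer target sites, such as the 13 bp and 23 bp protospacers mentioned for Fig. 8.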
- Fig. 2 shows that different dCas9 and nuclear localization signal (NLS) fusion proteins vary in their efficiency of nuclear localization. Two copies of the NLS were fused to dCas9-EGFP at different positions. Version 1, containing two tandem NLSs between dCas9 and EGFP, shows only 20% nuclear localization, which can be enhanced by insertion of an HA tag as shown in Version 2. Versions 3 and 4, with an N-terminal NLS, show almost 100% nuclear localization of dCas9-EGFP. The nucleus is labeled with DAPI. Scale bar, 5 µm.
- Fig. 3 depicts a comparison of different sgRNA designs for CRISPR-mediated gene repression.
- Different sgRNAs were transiently transfected into a GFP+ HEK293 reporter cell line harboring a genomic integrated dCas9-KRAB gene. The fluorescence was assayed using flow cytometry. Modifications (underlined) of sgRNA sequence with a polymerase III SINE element polyadenylation signal sequence (# 2), A-U flip (# 3, 4, 5 and 6), hairpin extension (# 7 and 8) and combined changes were hypothesized to increase sgRNA expression level, stability or its association with dCas9 protein.
- Repression fold is calculated by dividing the fluorescence measured with sgGAL4 by the fluorescence measured with each sgRNA design.
- the fold increase in repression is calculated by dividing the fluorescence of Design #1 by the fluorescence of each of the other designs.
- the unmodified hairpin nucleotides are boxed.
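The two ratios described for Fig. 3 can be written out explicitly. The sketch below uses hypothetical function names and made-up fluorescence values purely for illustration; it assumes each fluorescence value is a single summary number (for example a mean from flow cytometry) in the same arbitrary units.

```python
def repression_fold(control_fluorescence: float, design_fluorescence: float) -> float:
    """Fluorescence with the sgGAL4 negative control divided by fluorescence with a design."""
    if design_fluorescence <= 0:
        raise ValueError("design fluorescence must be positive")
    return control_fluorescence / design_fluorescence


def fold_increase_in_repression(design1_fluorescence: float, design_fluorescence: float) -> float:
    """Fluorescence with Design #1 divided by fluorescence with another design."""
    if design_fluorescence <= 0:
        raise ValueError("design fluorescence must be positive")
    return design1_fluorescence / design_fluorescence


# Illustrative numbers only (not data from this document).
control, design1, design10 = 1000.0, 100.0, 50.0
ratio_of_folds = repression_fold(control, design10) / repression_fold(control, design1)
```

With these toy values the repression fold is 10 for Design #1 and 20 for the second design, and the fold increase in repression is 2. The same value results from dividing the two repression folds, because the control fluorescence cancels.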
- Fig. 4 Optimized sgRNA designs enhance the labeling efficiency of telomeres.
- A Both designs of A-U flip (# 6) and hairpin extension (# 8) increased the telomere labeling efficiency compared to the original sgRNA design (# 1), as more dCas9-EGFP puncta were observed. Combining both modifications of A-U flip and hairpin extension (# 10) further enhanced the labeling efficiency and reduced background fluorescence signal.
- This design was used for later imaging and annotated as (F+E).
- Each conventional fluorescence image shows one projected three-dimensional (3-D) stack 3 µm deep (an entire RPE nucleus) with 0.4 µm focal spacing. Scale bar, 5 µm.
- Fig. 5 Labeling efficiency of telomeres depends on the dosage of the sgRNA lentivirus. With the best sgRNA design (F+E), telomeres were labeled when infected with 50 µL lentivirus, but there were some nucleolus-like structures highlighted in the background.
- Fig. 6 depicts the co-localization of CRISPR labeling with telomere markers.
- CRISPR can effectively detect telomere length changes in living cells.
- A Co-localization of dCas9-EGFP and telomeres as labeled by Oligo FISH (top) or antibody to TRF2 (bottom).
- B-C Telomere elongation was induced in UMUC3 cancer cells (starting telomere length 2-5 kb) by overexpression of human telomerase RNA (hTR).
- B Visualization of telomeres in UMUC3 cells by CRISPR labeling.
- C Telomere fluorescence intensity in CRISPR-labeled cells increases after elongation of telomeres by hTR. Scale bar, 5 µm.
- Fig. 7 illustrates that no DNA damage response was detected at the telomeres labeled by CRISPR.
- A Co-localization of PNA FISH and 53BP1 staining in RPE cells, dCas9-EGFP-labeled RPE cells, and TIN2 shRNA-treated RPE cells. Infection of RPE cells with TIN2 shRNA was used to induce a telomeric DNA damage response, revealed by 53BP1 antibody staining. The 53BP1 signal was enriched at the telomeres indicated by the PNA probe in most cells infected with TIN2 shRNA. There was no obvious enrichment of 53BP1 at the telomeres labeled by PNA probes or dCas9-EGFP.
- B Histogram showing the quantification of 53BP1 signal enriched at the telomeres.
- Fig. 8 depicts the results of imaging of endogenous MUC4 gene by targeting the repetitive or non-repetitive regions via dCas9-EGFP.
- A Schematic of the human MUC4 locus containing two repeated regions in exon 2 and intron 3 as indicated. The sgRNA target sites are indicated by gray lines and the adjacent PAMs are shown.
- B Conventional fluorescence images of MUC4 loci (arrows) in RPE cells by targeting the exon 2 repeats with three different sgRNAs.
- C Two protospacers with different lengths, 13 bp and 23 bp, were chosen as targets in the intron repeats.
- MUC4 intron can only be labeled by using optimized sgRNA design when targeting the 23 bp protospacer.
- D Histograms of MUC4 loci counts by labeling exon 2 via CRISPR.
- E Histograms of MUC4 loci counts by labeling intron 3 via CRISPR.
- F Co-localization of dCas9-EGFP and Oligo FISH probes for both exon and intron labeling.
- G 73 protospacers in the first exon were selected for labeling the non-repetitive region of MUC4 gene. 1-3 spots (arrows) can be detected with 36 and 73 sgRNAs. Co-labeling of MUC4 using 73 sgRNAs and sgMUC4-E3 shows two proximal spots (arrowhead). The inset shows the magnification of the white box region.
- Fig. 9 depicts the results of a karyotype analysis of the RPE cell line. Cytogenetic analysis was performed on ten G-banded metaphase cells of the human RPE cell line. The chromosomes were stained with dyes that show a pattern of light and dark bands (called the banding pattern). The banding pattern for each chromosome was specific and consistent, allowing identification of each of the 24 chromosomes. This cell line demonstrated a hypertriploid karyotype (73 chromosomes in total) of female origin. There were extra copies of chromosomes 5, 7, 11, 12, 16, 19, and 20 that were present in eight or nine cells, except for chromosome 16 (six cells) and chromosome 19 (four cells). Chromosomes 10 and 22 were also lost in nine and eight cells, respectively. All ten cells had two copies of an abnormal X chromosome with addition of unidentifiable genetic material translocated to the long arm at Xq28.
- Fig. 10 depicts CRISPR imaging of MUC4 and telomeres in the monoclones of HeLa cell line.
- A In addition to RPE and UMUC3 cell lines, CRISPR can also detect MUC4 loci and telomeres in living HeLa cells. Three MUC4 loci (arrows) were detected by CRISPR via targeting the repetitive exon region.
- B Histograms of MUC4 loci counts by CRISPR labeling.
- C MUC4 locus is localized at 3q29 of chromosome 3. Two hundred interphase nuclei of HeLa cells were examined by FISH using two probes hybridized to 3q26.1 and 3q28-29, respectively. All two hundred cells demonstrated a FISH signal pattern of three copies of 3q28-29, suggesting trisomy chromosome 3 in HeLa cells.
- Fig. 11 depicts CRISPR imaging of MUC1 loci in living RPE cells.
- A Schematic depicting the structure of the MUC1 gene, which contains a 60 bp unit present in 20 to 140 copies in both exon 3 and intron 3. Four sgRNAs were designed to target the repeat region; their target sites are indicated with gray lines and the adjacent PAMs are also indicated.
- B MUC1 loci (arrows) visualized in RPE cells by targeting four different protospacers.
- C The labeling specificity of CRISPR was confirmed by oligo FISH labeling. The co-localization of dCas9-EGFP and oligo FISH is indicated by an arrow.
- Fig. 12 depicts a comparison of labeling efficiency between CRISPR and Oligo FISH.
- A Oligo FISH labeling of MUC4 exon, MUC4 intron, and MUC1 exon as indicated by arrows.
- B Histograms showing the statistics of observed spots for labeling MUC4 exon, MUC4 intron, and MUC1 exon by Oligo FISH or CRISPR.
- CRISPR shows a higher labeling efficiency in all cases.
- Fig. 13 depicts imaging of multiple elements via CRISPR.
- A The loci of MUC1 and MUC4 were co-labeled by infecting RPE cells with both sgMUC1-E1 and sgMUC4-E3. More spots (arrows) were observed compared to individual sgMUC1-E1 or sgMUC4-E3 labeling, while a similar number of spots was detected if infecting the cells with both sgMUC4-E3 and sgMUC4-I2(F+E) that target the same MUC4
- B Histograms showing the counts of mucin gene loci. 45% of cells targeting both MUC1 and MUC4 contain more than 6 spots, while only 10% of cells have 6 spots with co-infection of two sgMUC4s.
- Fig. 14 depicts single particle tracking and diffusion dynamics of MUC4 labeled by CRISPR.
- A Conventional fluorescence image of MUC4 (left) and three example traces of marked foci (right). All traces show confined diffusion, but focus 2 is actively transported in addition.
- B Averaged Mean Square Displacement (MSD) of 119 and 50 traces from 26 cells, categorized as confined diffusive (darker line) and actively transported (lighter line). The shaded areas represent the standard error of the mean.
- C Histogram of confinement characteristic length and
- D histogram of microscopic diffusion coefficient derived from MSDs.
- E Data illustrating individual MUC4 loci movement.
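The mean square displacement (MSD) analysis mentioned for Fig. 14 can be sketched as follows. This is a generic illustration, not the analysis pipeline used for the figure: it assumes two-dimensional tracks sampled at equal time intervals, computes a time-averaged MSD per track, and then averages across tracks. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def mean_square_displacement(track: Sequence[Point], max_lag: int) -> List[float]:
    """Time-averaged MSD of one 2-D track for lags 1..max_lag (in frames)."""
    if max_lag >= len(track):
        raise ValueError("max_lag must be smaller than the number of points")
    msd = []
    for lag in range(1, max_lag + 1):
        # Squared displacement between every pair of points `lag` frames apart.
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


def averaged_msd(tracks: Sequence[Sequence[Point]], max_lag: int) -> List[float]:
    """Average the per-track MSD curves across several tracks, lag by lag."""
    curves = [mean_square_displacement(track, max_lag) for track in tracks]
    return [sum(values) / len(values) for values in zip(*curves)]
```

An MSD curve that levels off at longer lags is the usual signature of confined diffusion, whereas a curve that keeps rising faster than linearly suggests directed motion. Fitting a diffusion coefficient or confinement length would require a model choice that is not specified here.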
- nucleic acid refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated.
- DNA deoxyribonucleic acids
- RNA ribonucleic acids
- degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)).
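The third-position substitution described above can be illustrated with a short sketch. It assumes an in-frame coding sequence, uses the IUPAC symbol N to stand for a mixed base and the letter I to stand for deoxyinosine (a notational assumption), and the function name is hypothetical. It performs only the string substitution and does not check whether the encoded amino acid is preserved.

```python
from typing import Iterable, Optional


def substitute_third_positions(coding_sequence: str,
                               codon_indices: Optional[Iterable[int]] = None,
                               substitute: str = "N") -> str:
    """Replace the third base of selected codons (all codons if none are selected).

    `codon_indices` are 0-based codon numbers; `substitute` is a one-letter
    symbol such as 'N' (mixed base) or 'I' (deoxyinosine).
    """
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    if len(substitute) != 1:
        raise ValueError("substitute must be a single character")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    selected = range(len(codons)) if codon_indices is None else codon_indices
    for index in selected:
        codons[index] = codons[index][:2] + substitute
    return "".join(codons)
```

For example, `substitute_third_positions("ATGGCCAAA", codon_indices=[1], substitute="I")` changes only the second codon, giving `ATGGCIAAA`.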
- nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene.
- the term "gene” means the segment of DNA involved in producing a polypeptide chain. It may include regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons).
- a “promoter” is defined as an array of nucleic acid control sequences that direct transcription of a nucleic acid.
- a promoter includes necessary nucleic acid sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element.
- a promoter also optionally includes distal enhancer or repressor elements, which can be located as much as several thousand base pairs from the start site of transcription.
- An "expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular polynucleotide sequence in a host cell.
- An expression cassette may be part of a plasmid, viral genome, or nucleic acid fragment.
- an expression cassette includes a polynucleotide to be transcribed, operably linked to a promoter.
- a "reporter gene” encodes proteins that are readily detectable due to their biochemical characteristics, such as enzymatic activity or chemifluorescent features.
- One specific example of such a reporter is green fluorescent protein. Fluorescence generated from this protein can be detected with various commercially-available fluorescent detection systems. Other reporters can be detected by staining.
- the reporter can also be an enzyme that generates a detectable signal when contacted with an appropriate substrate.
- the reporter can be an enzyme that catalyzes the formation of a detectable product. Suitable enzymes include, but are not limited to, proteases, nucleases, lipases, phosphatases and hydrolases.
- the reporter can encode an enzyme whose substrates are substantially impermeable to eukaryotic plasma membranes, thus making it possible to tightly control signal formation.
- suitable reporter genes that encode enzymes include, but are not limited to, CAT (chloramphenicol acetyl transferase; Alton and Vapnek (1979) Nature 282: 864-869); luciferase (lux); β-galactosidase (LacZ); β-glucuronidase; and alkaline phosphatase (Toh, et al. (1980) Eur. J. Biochem. 182: 231-238; and Hall et al. (1983) J. Mol. Appl. Gen. 2: 101), each of which is incorporated by reference herein in the entirety.
- Other suitable reporters include those that encode for a particular epitope that can be detected with a labeled antibody that specifically recognizes the epitope.
- amino acid refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids.
- Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
- Amino acid analogs refers to compounds that have modified R groups (e.g., norleucine) or modified peptide backbones but retain the same basic chemical structure as a naturally occurring amino acid.
- Amino acid mimetics refers to chemical compounds having a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
- “Polypeptide,” “peptide,” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. All three terms apply to amino acid polymers in which one or more amino acid residues is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. As used herein, the terms encompass amino acid chains of any length, including full-length proteins, wherein the amino acid residues are linked by covalent peptide bonds.
- “Conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, “conservatively modified variants” refers to those nucleic acids that encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide.
- Such nucleic acid variations are "silent variations," which are one species of conservatively modified variations. Every nucleic acid sequence herein that encodes a polypeptide also describes every possible silent variation of the nucleic acid.
- each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule.
- each silent variation of a nucleic acid that encodes a polypeptide is implicit in each described sequence.
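- By way of illustration only, the silent variations of a short coding sequence can be enumerated programmatically. The following Python sketch assumes a deliberately partial codon table (alanine, methionine, and tryptophan only, written in the RNA alphabet, so the tryptophan codon appears as UGG); the table contents beyond the alanine codons named above, the function name `silent_variants`, and the example sequence are hypothetical and are not part of the disclosure.

```python
from itertools import product

# Partial codon table for illustration only (RNA alphabet).
CODONS_FOR = {
    "A": ["GCA", "GCC", "GCG", "GCU"],  # alanine: four synonymous codons
    "M": ["AUG"],                       # methionine: ordinarily a single codon
    "W": ["UGG"],                       # tryptophan: ordinarily a single codon
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS_FOR.items() for codon in codons}


def silent_variants(rna):
    """Return every sequence that encodes the same peptide as `rna`."""
    if len(rna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    # For each position, list all codons synonymous with the one present.
    choices = [CODONS_FOR[CODON_TO_AA[c]] for c in codons]
    return ["".join(combo) for combo in product(*choices)]


variants = silent_variants("AUGGCAUGG")  # hypothetical Met-Ala-Trp sequence
print(len(variants))  # 4: only the alanine codon can vary
```

In this sketch the methionine and tryptophan positions contribute a single choice each, so the number of silent variations equals the number of alanine codons.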
- As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alter, add or delete a single amino acid or a small percentage of amino acids in the encoded sequence are "conservatively modified variants" where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention. In some cases, conservatively modified variants of Cas9 or sgRNA can have an increased stability, assembly, or activity as described herein.
- Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
- amino acid residues are numbered according to their relative positions from the leftmost residue, which is numbered 1, in an unmodified wild-type polypeptide sequence.
- the terms “identical” or percent “identity,” in the context of describing two or more polynucleotide or amino acid sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same.
- a core small guide RNA (sgRNA) sequence responsible for assembly and activity of a sgRNA:nuclease complex has at least 80% identity, preferably 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity, to a reference sequence (e.g., one of SEQ ID NOs:1-4), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection.
- a Cas9 sequence responsible for assembly and activity of a sgRNA:nuclease complex has at least 80% identity, preferably 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity, to a reference sequence (e.g., one of SEQ ID NOs:5-6), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Such sequences are then said to be “substantially identical.” With regard to polynucleotide sequences, this definition also refers to the complement of a test sequence. With regard to amino acid sequences, preferably, the identity exists over a region that is at least about 50 amino acids or nucleotides in length, or more preferably over a region that is 75-100 amino acids or nucleotides in length.
- sequence comparison typically one sequence acts as a reference sequence, to which test sequences are compared.
- test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated.
- sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.
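- As a minimal sketch of the percent identity calculation, the following Python function compares a test sequence to a reference sequence that have already been aligned to the same length. It assumes that gaps are written as "-", that columns in which both sequences carry a gap are ignored, and that the denominator is the number of remaining aligned columns; other conventions exist, and the function name and example sequences are hypothetical.

```python
def percent_identity(test, reference):
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(test) != len(reference):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap (assumed convention).
    columns = [(a, b) for a, b in zip(test, reference) if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical 10-nt sequences differing at a single position.
identity = percent_identity("GUUUAAGAGC", "GUUUUAGAGC")
print(f"{identity:.1f}%")  # 90.0%
```

A result of 90.0% would satisfy the "at least 80% identity" threshold and the 85% and 90% preferred levels recited above, but not the 91% level.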
- for sequence comparison of nucleic acids and proteins, the BLAST and BLAST 2.0 algorithms and the default parameters discussed below are used.
- a “comparison window”, as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned.
- Methods of alignment of sequences for comparison are well-known in the art.
- Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol.
- BLAST and BLAST 2.0 algorithms are described in Altschul et al., (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402, respectively.
- Software for performing BLAST analyses is publicly available at the National Center for Biotechnology Information website, ncbi.nlm.nih.gov.
- the algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence.
- HSPs high scoring sequence pairs
- T is referred to as the neighborhood word score threshold (Altschul et al, supra).
- These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them.
- the word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased.
- Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score.
- Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached.
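- The extension step and its three halting conditions can be sketched as follows. This Python example is a simplified, ungapped, one-directional illustration and is not the BLAST implementation; the function name `extend_right` and the values M=1, N=-3, and X=5 are assumptions chosen for the example rather than program defaults.

```python
def extend_right(query, subject, q_start, s_start, seed_score, M=1, N=-3, X=5):
    """Ungapped rightward extension of a word hit.

    Returns (best_score, length), where length is the number of extended
    positions that produced the best cumulative score.
    """
    score = best = seed_score
    best_len = 0
    i = 0
    # Halting condition 3: the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        score += M if query[q_start + i] == subject[s_start + i] else N
        i += 1
        if score > best:
            best, best_len = score, i
        # Halting conditions 1 and 2: drop of X from the maximum, or score <= 0.
        if best - score >= X or score <= 0:
            break
    return best, best_len


# A 3-letter word hit "ACG" (seed score 3) extended from position 3 onward.
result = extend_right("ACGTACGTTTTT", "ACGTACGAAAAA", 3, 3, seed_score=3)
print(result)  # (7, 4)
```

In the example, four further matches raise the score from 3 to 7, and two subsequent mismatches lower it to 1, a drop of 6 from the maximum, which exceeds X and halts the extension.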
- the BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
- the BLASTP program uses as defaults a word size (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
- the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)).
- One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance.
- P(N) the smallest sum probability
- a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001.
- An indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the antibodies raised against the polypeptide encoded by the second nucleic acid, as described below.
- a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions.
- Another indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent conditions, as described below.
- Yet another indication that two nucleic acid sequences are substantially identical is that the same primers can be used to amplify the sequence.
- Yet another indication that two polypeptides are substantially identical is that the two polypeptides retain identical or substantially similar activity.
- a "translocation sequence” or “transduction sequence” refers to a peptide or protein (or active fragment or domain thereof) sequence that directs the movement of a protein from one cellular compartment to another, or from the extracellular space through the cell or plasma membrane into the cell.
- Translocation sequences that direct the movement of a protein from the extracellular space through the cell or plasma membrane into the cell are "cell penetration peptides.”
- Translocation sequences that localize to the nucleus of a cell are termed “nuclear localization" sequences, signals, domains, peptides, or the like. Examples of translocation sequences include, without limitation, the TAT transduction domain (see, e.g., S. Schwarze et al, Science 285 (Sep.
- penetratins or penetratin peptides D. Derossi et al., Trends in Cell Biol. 8, 84-87
- Herpes simplex virus type 1 VP22 (A. Phelan et al., Nature Biotech. 16, 440-443 (1998)), and polycationic peptides (Cell Mol. Life Sci. 62 (2005) 1839-1849). Further translocation sequences are known in the art.
- Translocation peptides can be fused (e.g. at the amino or carboxy terminus), conjugated, or coupled to a compound of the present invention, to, among other things, produce a conjugate compound that may easily pass into target cells, or through the blood brain barrier and into target cells.
- CRISPR/Cas refers to a widespread class of bacterial systems for defense against foreign nucleic acid. CRISPR/Cas systems are found in a wide range of eubacterial and archaeal organisms. CRISPR/Cas systems include type I, II, and III subtypes.
- Wild-type type II CRISPR/Cas systems utilize the RNA-mediated nuclease, Cas9, in complex with guide and activating RNA to recognize and cleave foreign nucleic acid.
- Cas9 homologs are found in a wide variety of eubacteria, including, but not limited to bacteria of the following taxonomic groups: Actinobacteria, Aquificae, Bacteroidetes-Chlorobi, Chlamydiae-Verrucomicrobia, Chloroflexi, Cyanobacteria, Firmicutes,
- An exemplary Cas9 protein is the Streptococcus pyogenes Cas9 protein. Additional Cas9 proteins and homologs thereof are described in, e.g., Chylinski, et al., RNA Biol. 2013 May 1;10(5):726-737; Nat. Rev. Microbiol. 2011 June;9(6):467-477; Hou, et al., Proc Natl Acad Sci U S A. 2013 Sep 24;110(39):15644-9; Sampson et al., Nature. 2013 May 9;497(7448):254-7; and Jinek, et al., Science. 2012 Aug 17;337(6096):816-21.
- activity in the context of CRISPR/Cas activity, Cas9 activity, sgRNA activity, sgRNA:nuclease activity and the like refers to the ability to bind to a target sequence and/or label or cleave the target sequence.
- activity can be measured in a variety of ways as known in the art. For example, expression, activity, or level of a reporter gene can be measured, and sgRNA:nucleases targeting the reporter gene sequence can be assayed for their ability to reduce the expression, activity, or level of the reporter gene.
- a cell can be transfected with an expression cassette encoding a green fluorescent protein under the control of a constitutive promoter. The fluorescence intensity can be measured and compared to the intensity of the cell after transfection with Cas9 and candidate sgRNAs to identify optimized sgRNAs.
- the methods and compositions are based on an optimized CRISPR/Cas system that employs a small guide RNA (sgRNA) and an sgRNA-mediated nuclease.
- the sgRNA contains a binding region that provides specific binding to a nucleic acid target sequence that is substantially complementary to the sequence of the binding region.
- the sgRNA and the nuclease can form a complex that specifically binds to the nucleic acid target sequence.
- a small guide RNA (sgRNA) molecule is provided.
- sgRNAs contain a binding region that determines the sequence specificity of the sgRNA and the sgRNA:nuclease complex; a 5' stem-loop region that, at least in part, participates in assembly and interaction with a sgRNA-mediated nuclease; an intervening sequence; a 3' stem-loop region; and a termination sequence.
- the binding region can be between about 5 and 100 nucleotides long, or longer (e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100 nucleotides in length, or longer).
- the binding region is between about 15 and about 30 nucleotides in length (e.g., about 15-29, 15-26, 15-25; 16-30, 16-29, 16-26, 16-25; or about 18-30, 18-29, 18-26, or 18-25 nucleotides in length).
- the binding region is designed to complement or substantially complement the target nucleic acid sequence or sequences.
- the binding region can incorporate wobble or degenerate bases to bind multiple sequences.
- the binding region can be altered to increase stability. For example, non-natural nucleotides can be incorporated to increase RNA resistance to degradation.
- the binding region can be altered or designed to avoid or reduce secondary structure formation in the binding region.
- the binding region can be designed to optimize G-C content.
- G-C content is preferably between about 40% and about 60% (e.g., 40%, 45%, 50%, 55%, 60%).
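- A candidate binding region can be screened against the length and G-C content ranges given above with a short calculation. The following Python sketch assumes the preferred ranges of about 15 to about 30 nucleotides and about 40% to about 60% G-C as hard cut-offs, which is a simplification; the function names and the candidate sequence are hypothetical.

```python
def gc_content(seq):
    """Return the percentage of G and C bases in `seq`."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def binding_region_ok(seq, min_len=15, max_len=30, min_gc=40.0, max_gc=60.0):
    """Check length and G-C content against the assumed cut-offs."""
    return min_len <= len(seq) <= max_len and min_gc <= gc_content(seq) <= max_gc


candidate = "GACCUGAAGUUCAUCUGCAC"  # hypothetical 20-nt binding region
print(gc_content(candidate), binding_region_ok(candidate))  # 50.0 True
```

Because the text describes these ranges with the qualifier "about", a practical screen might relax the cut-offs or treat them as ranking criteria rather than strict filters.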
- the binding region can contain modified nucleotides such as, without limitation, methylated or phosphorylated nucleotides.
- the 5' stem-loop region can be between about 15 and about 50 nucleotides in length (e.g., about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or about 50 nucleotides in length). In some cases, the 5' stem-loop region is between about 30-45 nucleotides in length (e.g., about 31,
- the 5' stem-loop region is at least about 31 nucleotides in length (e.g., at least about 31, 32,
- the 5' stem-loop structure contains one or more loops or bulges, each loop or bulge of about 1, 2,
- the 5' stem-loop structure contains a stem of between about 10 and 30 complementary base pairs (e.g., 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 complementary base pairs).
- the 5' stem-loop structure can contain protein-binding, or small molecule-binding structures.
- the 5' stem-loop function, e.g., interacting or assembling with a sgRNA-mediated nuclease
- the 5' stem-loop structure can contain non-natural nucleotides.
- non-natural nucleotides can be incorporated to enhance protein-RNA interaction, or to increase the thermal stability or resistance to degradation of the sgRNA.
- the intervening sequence between the 5' and 3' stem-loop structures can be between about 10 and about 50 nucleotides in length (e.g., about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or about 50 nucleotides in length).
- the intervening sequence is designed to be linear, unstructured, substantially linear, or substantially unstructured.
- the intervening sequence can contain non-natural nucleotides.
- non-natural nucleotides can be incorporated to enhance protein-RNA interaction or to increase the activity of the sgRNA:nuclease complex.
- natural nucleotides can be incorporated to enhance the thermal stability or resistance to degradation of the sgRNA.
- the 3' stem-loop structure can contain an about 3, 4, 5, 6, 7, or 8 nucleotide loop and an about 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotide or longer stem.
- the 3' stem-loop can contain a protein-binding, small molecule-binding, hormone-binding, or metabolite-binding structure that can conditionally stabilize the secondary and/or tertiary structure of the sgRNA.
- the 3' stem-loop can contain non-natural nucleotides.
- non-natural nucleotides can be incorporated to enhance protein-RNA interaction or to increase the activity of the sgRNA:nuclease complex.
- natural nucleotides can be incorporated to enhance the thermal stability or resistance to degradation of the sgRNA.
- the sgRNA includes a termination structure at its 3' end.
- the sgRNA includes an additional 3' hairpin structure, e.g., before the termination structure, that can interact with proteins, small-molecules, hormones, etc., for stabilization or additional functionality, such as conditional stabilization or conditional regulation of sgRNA:nuclease assembly or activity.
- the sgRNA is optimized to enhance stability, assembly, and/or expression. In some cases, the sgRNA is optimized to enhance the activity of a sgRNA:nuclease complex as compared to previously known sgRNAs, such as an sgRNA encoded by:
- the optimized sgRNA provides enhanced activity as compared to a previously known sgRNA or an sgRNA substantially identical to a previously known sgRNA.
- identity of an sgRNA to another sgRNA is determined with reference to the identity to the nucleotide sequences outside of the binding region. For example, two sgRNAs with 0% identity inside the binding region and 100% identity outside the binding region are 100% identical to each other.
- number of substitutions, additions, or deletions of an sgRNA as compared to another is determined with reference to the nucleotide sequences outside of the binding region. For example, two sgRNAs with multiple additions, substitutions, and/or deletions inside the binding region and 100% identity outside the binding region are considered to contain 0 nucleotide substitutions, additions, or deletions.
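- The convention of measuring identity only outside the binding region can be sketched as follows. This Python example assumes that the binding region is the 5' prefix of each sgRNA, that its length is known, and that the remaining sequences are of equal length and compared position by position without insertions or deletions; the function name and the short scaffold fragment are placeholders and do not correspond to any SEQ ID NO.

```python
def scaffold_identity(sgrna_a, sgrna_b, binding_len_a, binding_len_b):
    """Percent identity of two sgRNAs, ignoring their 5' binding regions."""
    scaffold_a = sgrna_a[binding_len_a:]
    scaffold_b = sgrna_b[binding_len_b:]
    if not scaffold_a or len(scaffold_a) != len(scaffold_b):
        raise ValueError("this sketch requires non-empty scaffolds of equal length")
    matches = sum(a == b for a, b in zip(scaffold_a, scaffold_b))
    return 100.0 * matches / len(scaffold_a)


scaffold = "GUUUUAGAGCUA"           # placeholder fragment, for illustration only
sgrna_1 = "A" * 20 + scaffold       # hypothetical 20-nt binding region
sgrna_2 = "C" * 18 + scaffold       # unrelated 18-nt binding region
print(scaffold_identity(sgrna_1, sgrna_2, 20, 18))  # 100.0
```

The two example sgRNAs share no bases within their binding regions, yet are reported as 100% identical, consistent with the definition above.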
- the optimized sgRNAs described herein form an sgRNA:nuclease complex with enhanced activity as compared to SEQ ID NO:1, or an sgRNA 90, 95, 96, 97, 98, or 99% or more identical to SEQ ID NO:1.
- the optimized sgRNAs described herein form an sgRNA:nuclease complex with enhanced activity as compared to SEQ ID NO:1, or an sgRNA with fewer than 5, 4, 3, or 2 nucleotide substitutions, additions, or deletions of SEQ ID NO:1.
- the sgRNA can be optimized for expression by substituting, deleting, or adding one or more nucleotides.
- a nucleotide sequence that provides inefficient transcription from an encoding template nucleic acid can be deleted or substituted.
- the sgRNA is transcribed from a nucleic acid operably linked to an RNA polymerase III promoter.
- sgRNA sequences that result in inefficient transcription by RNA polymerase III such as those described in Nielsen et al., Science. 2013 Jun 28;340(6140):1577-80, can be deleted or substituted.
- one or more consecutive uracils can be deleted or substituted from the sgRNA sequence.
- the consecutive uracils are present in the stem portion of a stem-loop structure.
- one or more of the consecutive uracils can be substituted by exchanging the uracil and its complementary base.
- the sgRNA sequence can be altered to exchange the adenine and uracil.
- This "A-U flip" can retain the overall structure and function of the sgRNA molecule while improving expression by reducing the number of consecutive uracil nucleotides.
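- The effect of an A-U flip on the longest run of consecutive uracils can be illustrated with a toy hairpin. The following Python sketch uses a hypothetical 20-nucleotide stem-loop (an 8-base-pair stem closed by a GAAA loop) that is not any sequence of the disclosure; the paired positions are supplied by hand, and the function names are assumptions.

```python
import re


def longest_u_run(rna):
    """Length of the longest stretch of consecutive uracils."""
    return max((len(run) for run in re.findall(r"U+", rna)), default=0)


def flip_pair(rna, i, j):
    """Exchange the bases at paired positions i and j (0-based)."""
    bases = list(rna)
    bases[i], bases[j] = bases[j], bases[i]
    return "".join(bases)


# Hypothetical hairpin: 5' strand + loop + 3' strand; position 4 (U) pairs with 15 (A).
hairpin = "GUUUUAGC" + "GAAA" + "GCUAAAAC"
flipped = flip_pair(hairpin, 4, 15)
print(longest_u_run(hairpin), longest_u_run(flipped))  # 4 3
```

Exchanging the two bases keeps the pair intact (U-A becomes A-U), so the stem remains complementary while the run of four uracils is shortened to three.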
- the sgRNA containing an A-U flip is encoded by: SEQ ID NO:2 [N] 5 _
- the optimized sgRNA is at least 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% identical or more to SEQ ID NO:2, or contains fewer than 10, 9, 8, 7, 6, 5, 4, 3, or 2 nucleotide additions, deletions, or substitutions compared to SEQ ID NO:2.
- the A-U pair can be replaced by a G-C, C-G, A-C, or G-U pair.
- the sgRNA can be optimized for stability.
- Stability can be enhanced by optimizing the stability of the sgRNA:nuclease interaction, optimizing assembly of the sgRNA:nuclease complex, removing or altering RNA destabilizing sequence elements, or adding RNA stabilizing sequence elements.
- the sgRNA contains a 5' stem-loop structure proximal to, or adjacent to, the binding region that interacts with the sgRNA-mediated nuclease. Optimization of the 5' stem-loop structure can provide enhanced stability or assembly of the sgRNA:nuclease complex. In some cases, the 5' stem-loop structure is optimized by increasing the length of the stem portion of the stem-loop structure.
- An exemplary sgRNA containing an optimized 5' stem-loop structure is encoded by: SEQ ID NO:3 [N] 5 -
- the optimized sgRNA is at least 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% identical or more to SEQ ID NO:3, or contains fewer than 10, 9, 8, 7, 6, 5, 4, 3, or 2 nucleotide additions, deletions, or substitutions compared to SEQ ID NO:3.
- the 5' stem-loop optimization is combined with mutations for increased transcription to provide an optimized sgRNA.
- an A-U flip and an elongated stem loop can be combined to provide an optimized sgRNA.
- An exemplary sgRNA containing an A-U flip and an elongated 5' stem-loop is encoded by:
- the optimized sgRNA is at least 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% identical or more to SEQ ID NO:4, or contains fewer than 10, 9, 8, 7, 6, 5, 4, 3, or 2 nucleotide additions, deletions, or substitutions compared to SEQ ID NO:4.
- sgRNAs can be modified by methods known in the art.
- the modifications can include, but are not limited to, the addition of one or more of the following sequence elements: a 5' cap (e.g., a 7-methylguanylate cap); a 3' polyadenylated tail; a riboswitch sequence; a stability control sequence; a hairpin; a subcellular localization sequence; a detection sequence or label; or a binding site for one or more proteins.
- Modifications can also include the introduction of non-natural nucleotides including, but not limited to, one or more of the following: fluorescent nucleotides and methylated nucleotides.
- the sgRNA can contain from 5' to 3': (i) a binding region of between about 10 and about 50 nucleotides; (ii) a 5' hairpin region containing fewer than four consecutive uracil nucleotides, or a length of at least 31 nucleotides (e.g., from about 31 to about 41 nucleotides); (iii) a 3' hairpin region; and (iv) a transcription termination sequence, wherein the small guide RNA is configured to form a complex with a small guide RNA-mediated nuclease, the complex having increased stability or activity relative to a complex containing a small guide RNA-mediated nuclease and a small guide RNA comprising at least 95% identity to SEQ ID NO:1 or a complement thereof.
- an sgRNA-mediated nuclease is provided.
- the sgRNA-mediated nuclease is a Cas9 protein.
- the sgRNA-mediated nuclease can be a type I, II, or III Cas9 protein.
- the sgRNA-mediated nuclease can be a modified Cas9 protein.
- Cas9 proteins can be modified by any method known in the art.
- the Cas9 protein can be codon optimized for expression in a host cell or an in vitro expression system. Additionally, or alternatively, the Cas9 protein can be engineered for stability, enhanced target binding, or reduced aggregation.
- the Cas9 protein can be engineered to be nuclease deficient.
- the Cas9 protein generally catalyzes double-stranded cleavage of the target nucleic acid; however, certain Cas9 mutations can provide a nuclease that is able to nick the target nucleic acid but unable to catalyze double-stranded cleavage, or unable to substantially catalyze double-stranded cleavage.
- Exemplary mutations that reduce or eliminate double stranded cleavage but provide nicking activity include one or more mutations in the following locations: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987, or a mutation in a corresponding location in a Cas9 homologue or ortholog.
- the mutation(s) can include substitution with any natural (e.g., alanine) or non-natural amino acid, or deletion.
- Cas9 proteins that cleave or nick the target sequence can be utilized in combination with an sgRNA, such as one or more of the sgRNAs described herein, to form a complex that is useful for various nucleic acid modification methods, such as genome editing as further explained below.
- An exemplary nicking Cas9 protein is Cas9D10A or Cas9H840A (Jinek, et al, Science. 2012 Aug 17;337(6096):816-21).
- the Cas9D10A amino acid sequence is (D10A mutation underlined): SEQ ID NO:5
- certain Cas9 mutations can provide a nuclease that is nuclease defective.
- certain Cas9 mutations can provide a nuclease that does not cleave or nick, or does not substantially cleave or nick the target sequence.
- Exemplary mutations that reduce or eliminate nuclease activity include one or more mutations in the following locations: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987, or a mutation in a corresponding location in a Cas9 homologue or ortholog.
- the mutation(s) can include substitution with any natural (e.g.
- Cas9 proteins that do not cleave or nick the target sequence can be utilized in combination with an sgRNA, such as one or more of the sgRNAs described herein, to form a complex that is useful for detection of target nucleic acids as further explained below.
- An exemplary nuclease defective Cas9 protein is Cas9D10A&H840A (Jinek, et al., Science. 2012 Aug 17;337(6096):816-21; Qi, et al., Cell. 2013 Feb 28;152(5):1173-83).
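- The relationship among the exemplary variants named above (D10A or H840A alone as nicking proteins, and the D10A and H840A double mutant as a nuclease defective protein) can be summarized in a small lookup. This Python sketch considers only those two exemplary substitutions; it does not evaluate the other listed positions, and the function name and category labels are hypothetical.

```python
def classify_cas9(mutations):
    """Classify a Cas9 variant using only the exemplary D10A and H840A substitutions."""
    catalytic = {"D10A", "H840A"} & set(mutations)
    if len(catalytic) == 2:
        return "nuclease defective"   # e.g., Cas9D10A&H840A
    if len(catalytic) == 1:
        return "nickase"              # e.g., Cas9D10A or Cas9H840A
    return "double-strand nuclease"   # neither exemplary substitution present


for variant in ([], ["D10A"], ["H840A"], ["D10A", "H840A"]):
    print(variant, "->", classify_cas9(variant))
```

Under this simplified scheme, the nickase variants would be paired with an sgRNA for nucleic acid modification, and the nuclease defective variant for detection of target nucleic acids.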
- the Cas9 protein can be conjugated or fused to a detectable label.
- Cas9 is fused to a fluorescent protein.
- a fluorescent protein can be fused at the N and/or C-terminus of the Cas9 protein.
- the Cas9 protein is nuclease deficient and labeled.
- the labeled Cas9 protein can be combined with an sgRNA, such as an sgRNA provided herein, to form a complex useful as a detection reagent in a cell.
- the Cas9 protein can be conjugated or fused to a protein localization signal or a cell penetration peptide.
- the Cas9 protein can be fused to one or more nuclear localization signals, one or more mitochondrial localization signals, or one or more chloroplast localization signals.
- expression cassettes are provided for expression of sgRNAs and/or Cas9.
- the expression cassettes are configured to express sgRNAs and/or Cas9 in an in vivo system.
- the expression cassettes can be configured to express sgRNAs and/or Cas9 in an in vitro system.
- the expression cassette is configured to express one or more sgRNAs and/or Cas9 in the same host cell in which nucleic acid modification or detection is to be performed.
- Expression cassettes as described herein can include a promoter operably linked to a nucleotide encoding sgRNA or Cas9.
- the expression cassettes also include one or more nuclear localization signals, one or more mitochondrial localization signals, or one or more chloroplast localization signals.
- the nuclear localization signal can be fused to the N and/or C-terminus of a Cas9 coding sequence.
- additional sequences are included in the expression cassette, such as cell penetration peptides, purification tags, detection labels, or termination sequences. Termination sequences include stop codons, transcriptional termination signals, or polyadenylation signals.
- Detection labels include fluorescent proteins, such as green fluorescent protein, yellow fluorescent protein, blue fluorescent protein, mCherry, and the like.
- sgRNA expression cassettes can utilize a wide range of suitable promoters as known in the art.
- sgRNA expression cassettes operably link a strong promoter to the sgRNA encoding nucleic acid to provide a high level of expression in a specified expression system or host cell.
- an RNA polymerase III promoter such as H1 or U6
- H1 or U6 can be operably linked to an sgRNA encoding nucleic acid for in vivo expression in a mammalian cell.
- Other suitable mammalian promoters can also be utilized.
- Exemplary suitable promoters for sgRNA expression are provided in, e.g., www.addgene.org/CRISPR/.
- Cas9 expression cassettes can utilize a wide range of suitable promoters as known in the art.
- the Cas9, or modified Cas9, utilized in the host cell is prone to aggregation or mislocalization.
- Cas9 expression cassettes operably link a weak promoter to the Cas9 encoding nucleic acid to provide a low level of expression in the host cell.
- the weak expression level mitigates, ameliorates, or eliminates the aggregation and/or mislocalization of Cas9 in the host cell.
- the Cas9 is operably linked to a strong promoter to provide reagent quantities of Cas9 from an in vitro or in vivo expression system.
- Suitable strong promoters include promoters described in, e.g., www.addgene.org/CRISPR/.
- Suitable weak mammalian promoters include the ubiquitin C promoter, the phosphoglycerate kinase 1 promoter, and the un-induced TetOn promoter.
- the weak or strong promoters are respectively weaker or stronger than the elongation factor 1α (EF1α) promoter, which is known to provide a moderate level of expression in a broad range of mammalian cell types.
- EF1α elongation factor 1α
- the expression systems include an expression cassette containing a nucleic acid encoding an sgRNA.
- the expression systems include an expression cassette containing a nucleic acid encoding a Cas9 protein.
- the expression systems contain one or more of the foregoing expression cassettes and a host cell or cell-lysate for generating reagent quantities of sgRNA and/or Cas9.
- the expression cassettes can be introduced into the host cell or cell-lysate and incubated for production of sgRNA or Cas9.
- the sgRNA and/or Cas9 product(s) can be purified and used in the methods of the present invention.
- host cells containing nucleic acid detection or modification reagents.
- the host cells contain one or more sgRNAs, one or more sgRNA-mediated nucleases, or one or more sgRNA:nuclease complexes.
- the host cells are transfected with a suitable vector containing one or more expression cassettes encoding an sgRNA and/or Cas9 protein.
- suitable vectors include any vectors that are known in the art and capable of transferring nucleic acid to a host cell.
- Suitable vectors include, but are not limited to, one or more of the following viral vectors: an oncoviral vector, a foamy viral vector, a lentiviral vector, a feline immunodeficiency viral vector, a Moloney murine leukemia virus (MoMLV) vector, a vaccinia viral vector, a polioviral vector, an adenoviral vector, an adeno-associated viral vector, an SV40 viral vector, a herpes simplex viral vector, an HIV viral vector, a spleen necrosis viral vector, a Rous Sarcoma viral vector, a Harvey Sarcoma viral vector, an avian leucosis viral vector, a myeloproliferative sarcoma viral vector, or a mammary tumor viral vector.
- the sgRNA, Cas9, or sgRNA:nuclease complexes are introduced into a host cell. Suitable methods for introduction of the RNA, protein, or complex are known in the art and include, for example, electroporation; calcium phosphate precipitation; or PEI, PEG, DEAE, nanoparticle, or liposome mediated transformation. Other suitable transfection methods include direct micro-injection.
- the sgRNA and Cas9 are introduced separately and the sgRNA:nuclease complexes are formed in the cell.
- the sgRNA:nuclease complexes are formed and then introduced into the cell.
- multiple, differentially labeled, sgRNA:nuclease complexes, each directed to a different nucleic acid target or nucleic acid target region are formed and then introduced into the cell.
- the methods utilize a nuclease defective Cas9 protein.
- a complex between an sgRNA and a Cas9 protein e.g., a nuclease defective Cas9
- a nucleic acid containing the target sequence.
- the nucleic acid resides in a host cell, e.g., a living host cell.
- sgRNA and Cas9 can be expressed in, or otherwise introduced into, a host cell, and the sgRNA and Cas9 can form an sgRNA:nuclease complex and localize to the target nucleic acid sequence.
- sgRNA can be expressed in the host cell and Cas9 introduced using a method known in the art for protein transfection.
- Cas9 fused to a cell penetration peptide can be contacted with the sgRNA containing cell.
- Cas9 can be expressed in the cell and sgRNA introduced into the cell using any method known in the art for RNA transfection.
- sgRNA:nuclease complexes can be formed and introduced into the cell.
- sgRNA:nuclease complexes can localize to a target nucleic acid sequence that is complementary, or substantially complementary, to the binding region of the sgRNA.
- Such localized sgRNA:nuclease complexes can be detected to detect the target nucleic acid sequence, or a region containing the target nucleic acid.
- the nuclease and/or the sgRNA can be labeled and detection of the localized label thereby detects the target nucleic acid sequence.
- multiple different sgRNAs, each corresponding to a different target sequence in a nucleic acid region, can be used to amplify the signal.
- 5, 10, 15, 20, 25, 30, or more sgRNAs can be designed that each bind to a different sequence corresponding to a specified gene or chromosomal region.
- sgRNAs and a labeled Cas9 can be introduced into a host cell and their localization detected. The presence of a strong and localized nuclear signal indicates the presence of the target gene or chromosomal region.
- a small number of sgRNAs (e.g., 5, 4, 3, 2, or 1) can be designed to bind to a repetitive sequence located in a target gene or chromosomal region and, e.g., introduced into a cell to form an sgRNA:nuclease complex.
- the sgRNA:nuclease complex can localize to the repeat region and provide a strong signal.
- an sgRNA:nuclease complex can be used to enhance other chromosomal detection methods.
- an sgRNA targeted to a nucleic acid sequence or region of interest and a Cas9 protein can be introduced into a cell.
- the cell can be incubated to allow formation and localization of the sgRNA:nuclease complex.
- Helicase activity of the sgRNA:nuclease complex can then relax the target nucleic acid and surrounding regions.
- the cells can then be fixed and stained using standard fluorescence in situ hybridization (FISH) protocols. The relaxed regions can then be more readily and routinely detected by FISH.
- FISH fluorescence in situ hybridization
- the sgRNA:nuclease complexes used to enhance other detection methods such as FISH are labeled. In other cases, the sgRNA:nuclease complexes used to enhance other detection methods are not labeled. Labeled sgRNA:nuclease complexes can provide for co-localization detection or detection of multiple sequences or regions in addition to, or in combination with, the other detection method (e.g., FISH).
- an sgRNA:nuclease provided herein can be used to enhance any methods known in the art that rely on access to chromosomal DNA.
- genome editing with TALENs or Zinc Finger nucleases can be enhanced by targeting sgRNA:nucleases (e.g., nuclease active, nuclease deficient, or nuclease defective) to a region at or near the TALEN or Zinc Finger nuclease target site.
- activators or repressors can be targeted to regions at or near sgRNA:nuclease target sites.
- sgRNA:nuclease unwinds, or relaxes the target nucleic acid region, or creates a more open chromosomal structure providing access to other tools known in the art for genome editing, labeling, or transcriptional regulation.
- the methods described herein provide diagnostics for genetic diseases.
- the methods can be used to detect chromosomal number, chromosomal rearrangements, mutations, insertions, deletions, and the presence or absence of a gene or region of a gene.
- the methods utilize a nuclease deficient Cas9 protein.
- the methods utilize a Cas9 protein that catalyzes double stranded cleavage of the target nucleic acid.
- a complex between an sgRNA and a Cas9 protein, e.g., a nicking Cas9 protein or a Cas9 protein capable of double stranded cleavage
- a nucleic acid resides in a host cell, e.g., a living host cell.
- sgRNA and Cas9 can be expressed in a host cell, and the sgRNA and Cas9 can form an sgRNA:nuclease complex and localize to the target nucleic acid sequence.
- sgRNA can be expressed in the host cell and Cas9 introduced using a method known in the art for protein transfection.
- Cas9 fused to a cell penetration peptide can be contacted with the sgRNA containing cell.
- Cas9 can be expressed in the cell and sgRNA introduced into the cell using any method known in the art for RNA transfection.
- sgRNA:nuclease complexes can localize to a target nucleic acid sequence that is complementary, or substantially complementary to the binding region of the sgRNA.
- the localized sgRNA:nuclease complexes can then nick or cleave the target sequence.
- a pair of sgRNA:nuclease complexes are utilized that flank a nucleic acid region of interest.
- the nicking or cleaving of target nucleic acid sequences that flank the region of interest can then activate repair processes in the host cell.
- NHEJ non-homologous end joining
- HDR homologous DNA repair
- a heterologous nucleic acid can be introduced into the cell with regions of homology corresponding to at least a portion of the nucleic acid region of interest and/or one or more target nucleic acid sequences.
- HDR can incorporate at least a portion of the heterologous nucleic acid into the flanked region of interest.
- the heterologous nucleic acid can contain one or more selectable markers or detectable labels to assay for successful incorporation or select for cells that have incorporated the nucleic acid.
- the heterologous nucleic acid can encode a detectable or selectable protein, e.g., an enzyme, transcription factor, or binding protein.
- kits are described herein for modifying or detecting nucleic acids.
- the kits can be used for detecting nucleic acids in a living cell.
- the kits contain a labeled nuclease defective sgRNA and an sgRNA-mediated nuclease.
- the kits contain a nuclease deficient, or a nuclease competent sgRNA-mediated nuclease and an sgRNA.
- the kits contain multiple sgRNAs directed to different target nucleic acid sequences.
- the kits contain multiple labeled sgRNA-mediated nucleases.
- kits contain one or more expression cassettes containing nucleic acid encoding one or more of the sgRNAs and/or sgRNA-mediated nucleases described herein.
- Further uses of the methods, compositions, or kits described herein include one or more of the following: genome editing, transcriptional or epigenetic regulation, genome imaging, copy number analysis, analysis of living cells, detection of highly repetitive genome sequence or structure, detection of complex genome sequences or structures, detection of gene duplication or rearrangement, enhanced FISH labeling, unwinding of target nucleic acid, large scale diagnostics of diseases and genetic disorders related to genome deletion, duplication, and rearrangement, use of an RNA oligo chip with multiple unique sgRNAs for high-throughput imaging and/or diagnostics, multicolor differential detection of target sequences, identification or diagnosis of diseases of unknown cause or origin, and 4-dimensional (e.g., time-lapse) or 5-dimensional (e.g., multicolor time-lapse) imaging of cells (e.g., live cells).
- Example 1 Dynamic Imaging of the Genome in Living Human Cells Via CRISPR
- FISH requires chemical fixation of cells, precluding its use for live cell imaging.
- DNA-binding proteins fused with fluorescent proteins have enabled robust dynamic imaging of chromosomes in living cells (Robinett, et al., 1996; Wang, et al., 2008). Due to either fixed DNA sequence binding requirements or limited native DNA-binding proteins, however, it remains challenging to use DNA-binding proteins for imaging arbitrary genes and genetic elements. We argue that more powerful dynamic genome imaging approaches require methods that could leverage the ease of Watson-Crick base-pairing as seen in nucleic acid probes and the robustness of DNA-binding proteins.
- the bacterial genetic immune system CRISPR (clustered regularly interspaced short palindromic repeats) provides a natural example for DNA targeting using both a DNA-binding protein and a small base-pairing RNA (Barrangou, et al., 2007; Wiedenheft, et al.,
- the minimal type II CRISPR system derived from Streptococcus pyogenes recognizes specific DNA sequences via a small guide (sg) RNA-mediated nuclease, Cas9 (Deltcheva, et al., 2011; Jinek, et al., 2012). Upon DNA binding, the Cas9-sgRNA complex causes DNA double-stranded breaks (Jinek, et al., 2012). Harnessing this unique RNA-guided nuclease activity, recent work has demonstrated the use of CRISPR for genome editing in a broad range of organisms (Cong, et al., 2013; Mali, et al., 2013; Wang, et al.,
- dCas9 nuclease-deactivated Cas9
- dCas9 protein to Enhanced Green Fluorescent Protein (EGFP).
- EGFP Enhanced Green Fluorescent Protein
- complementary sgRNAs will direct dCas9-EGFP to the targeted genomic loci (Fig. 1A).
- To enrich the signal over the background of unbound dCas9-EGFP, we targeted multiple dCas9-EGFP proteins to a given locus.
- To reduce the levels of free dCas9-EGFP that contribute to background, we expressed dCas9-EGFP from the Tet-On 3G promoter without doxycycline induction (Fig. 1B).
- telomeres specialized chromatin structures composed of TTAGGG repeats 5 to 15 kilobase pairs (kb) in length (Moyzis, et al., 1988).
- sgRNA sgTelomere
- the sgRNA also contained a redesigned stem-loop hairpin structure and an S. pyogenes terminator sequence, following previous studies (Jinek, et al., 2012; Qi, et al., 2013) (Fig. 1C).
- a clonal RPE cell line for stable dCas9-EGFP expression using lentiviruses containing the dCas9-EGFP expression cassette.
- This cell line was subsequently infected by sgTelomere-containing lentivirus, and imaged for telomere detection 48 h post-infection. Most cells (80%) showed 10 to 40 puncta with bright nuclear areas resembling nucleoli (Fig. 1D). The observed number of puncta was substantially lower than numbers expected for telomeres in human cells, suggesting that the system was sub-optimal.
- To further check the integrity of telomeres with dCas9 binding, we used an antibody to image the localization of 53BP1, a protein recruited to damaged DNA sites (d'Adda di Fagagna, et al., 2003). We saw a very mild increase in DNA damage at telomeres, which was orders of magnitude less than major telomere disruptions such as disassembly of the shelterin complex by shRNA-mediated depletion of TIN2 (Fig. 7) (Kim, et al., 1999).
- telomere length can be conditionally elongated by transfection with a human telomerase RNA (hTR) gene (Vaziri, et al., 1998).
- hTR human telomerase RNA
- the MUC4 gene, which encodes a glycoprotein important for protecting mucus in diverse epithelial tissues and tumor formation.
- the MUC4 gene contains a region in the coding sequence with a variable number (>100) of 48-bp tandem repeats in the second exon (Fig. 8A) (Nollet, et al., 1998).
- sgMUC4-E1
- E2 E3 sgRNAs
- The best one, sgMUC4-E3, showed labeling of 2 or more puncta in 100% of cells (Fig. 8B). The labeling was highly efficient, and the original sgRNA design without modifications was sufficient for imaging.
- the MUC4 gene also contains an array of 15-bp tandem repeats in the third intron.
- sgRNA binding requires a minimal ~12-bp DNA complementary region
- two sgRNAs with different lengths of base pairing (23 bp and 12 bp)
- the F+E modifications of the sgRNA designs greatly enhanced the imaging efficiency (Fig. 8C).
- CRISPR imaging offers a unique non-invasive platform for tracking the dynamics of genetic elements in living cells.
- CRISPR-imaging analysis revealed similar movement speed and two dimensional displacement of telomeres (Fig. 14B-D).
- long telomeres, as defined by higher fluorescent surface area, showed slower movement speeds.
- We also tracked individual MUC4 loci movement (Fig. 14E).
- CRISPR sgRNAs for sequence-specifically visualizing genetic elements in living cells defines a new class of genome imaging tools.
- a nuclease-deactivated S. pyogenes Cas9 fused with fluorescent proteins can be used to flexibly image both repetitive and non-repetitive genetic elements, which can be potentially applied to image any genomic sequence of interest.
- Our technology relies on RNA-directed local enrichment of fluorescence signals, which robustly filters the off-target effects of the CRISPR system (Hsu, et al., 2013).
- orthogonal Cas9 proteins should further allow multiplexed detection of multiple genetic events.
- the method is capable of tracking the dynamic movement of an arbitrary genomic region in living cells, opening doors to a fuller understanding of how genomes are organized in vivo and how they are dynamically regulated in the cell nucleus.
- the CRISPR imaging technology offers a universal platform for using RNAs to modify, modulate and label human genomes.
- the sgRNA (old design) expression plasmids were cloned by inserting annealed primers into the vector digested by BstXI and XhoI.
- the optimized sgRNA expression constructs were directly ordered as gBlocks (IDT) and cloned into the lentiviral U6-based expression vector digested by XbaI/BamHI.
- the 73 sgRNAs targeting non-repetitive sequence of MUC4 were cloned into the same sgRNA expression vector by amplifying the insertions using the same reverse primer but different forward primers containing the unique spacer sequence right after the BstXI site, with the optimized sgRNA as the template.
- the PCR fragments were cloned into the vector by BstXI and XhoI.
- HEK Human embryonic kidney
- DMEM Dulbecco's modified Eagle medium
- RPE retinal pigment epithelium
- 293T cells were seeded into a T75 flask one day prior to transfection.
- 1 µg of pMD2.G plasmid (virus envelope plasmid)
- 8 µg of pCMV-dR8.91 virus packaging plasmid
- 9 µg of the lentivector Tet-on 3G, dCas9-EGFP, GFP-TRF1, sgRNA or TIN-2 shRNA
- Virus was harvested 48 hours post-transfection.
- culture cells were incubated with culture medium-diluted virus supernatant supplemented with 5 µg/ml polybrene for 12 hours.
- RPE, UMUC3 and HeLa cell lines stably expressing dCas9-EGFP were generated by infecting the cells with lentivirus that co-packaged the two expression vectors of dCas9-EGFP and transactivator protein.
- Clonal cells expressing dCas9-EGFP in the inducible system for each cell line were generated by picking a single cell colony.
- the clones with low basal level expression of dCas9-EGFP were selected for CRISPR imaging.
- telomeres and the repetitive regions of MUC1 or MUC4 the selected clonal cell lines were infected with ⁇ lentivirus with individual sgRNAs in each 8-well of chambered cover glass.
- sgRNA plasmids were co-packaged in the same lentivirus.
- the dCas9 expression cells were infected with a mixture of lentivirus including 16, 26, 36 or 73 sgRNAs.
- FISH Fluorescence in situ hybridization
- Oligo FISH was performed according to standard protocols. Briefly, cells were fixed with 4% paraformaldehyde, incubated with 0.7% Triton X-100, 0.1% Saponin in 2xSSC for 30 min at RT for permeabilization of the nuclear membrane, washed with 2xSSC twice, treated with RNase A at 37 °C for 1 h, washed again with 2xSSC and equilibrated with PBS for 5 min before dehydration by consecutive 5 min incubations in 70%, 85% and 100% ethanol. After air drying, cells were heated at 80 °C for 5 min in 70% formamide/2xSSC, and washed using an ethanol series (ice cold; 70, 80, 95%).
- Cy3- or Cy5-labeled Oligo FISH probe (around 20 bp; each oligo has one molecule of Cy3 or Cy5 conjugated to its 5' end) in hybridizing solution (10% dextran sulfate, 50% formamide, 500 ng/ml salmon sperm DNA in 2xSSC buffer) at a final concentration of 2 ng/µl was added to the sample and incubated overnight at 37 °C. After hybridization, cells were washed three times with 2xSSC and finally stained with DAPI.
- telomere PNA-FISH was performed as described in Diolaiti et al (beginning after the pepsin treatment).
- telomeres and Mucin genes were seeded onto 8-Well Lab-Tek II chambered cover glass 24 hours before lentivirus infection.
- the imaging of telomeres and Mucin genes was performed 48 hours post-infection of sgRNAs.
- the samples with fixation were imaged at a frame rate of 5 Hz with 50 frames.
- the final image was generated by averaging the 50 frames.
- 3 µm Z-stacks at 0.4 µm steps were acquired.
- the culture medium was replaced with medium without phenol red. Images were recorded at a frame rate of 5Hz. 600 frames were acquired for the dynamic analysis.
- the temperature was maintained at 37 °C with an environmental chamber. Imaging analysis
Abstract
Methods, compositions, and kits are provided herein for CRISPR/Cas-mediated nucleic acid detection or modification.
Description
Optimized Small Guide RNAs and Methods of Use
STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT
[0001] This invention was made with government support under Grant No. GM081879 awarded by the National Institutes of Health. The government has certain rights in the invention.
CROSS-REFERENCE TO RELATED APPLICATION
[0002] This application claims priority to U.S. Provisional Application No. 61/883,929, filed on September 27, 2013, the contents of which are hereby incorporated in their entirety for all purposes.
BACKGROUND OF THE INVENTION
[0003] Clustered, regularly interspaced short palindromic repeat (CRISPR) sequences are present in approximately 40% of eubacterial genomes and nearly all archaeal genomes sequenced to date, and consist of short (~24-48 nucleotide) direct repeats separated by similarly sized, unique spacers. They are generally flanked by a set of CRISPR-associated (Cas) genes that encode a nuclease that is important for CRISPR maintenance and function. In Streptococcus thermophilus and Escherichia coli, CRISPR/Cas loci have been demonstrated to confer immunity against bacteriophage infection by an interference mechanism that relies on the strict identity between CRISPR spacers and phage target sequences. The mechanism underlying this immunity is based on sequence specific cleavage of foreign nucleic acids by a CRISPR:Cas complex that contains the Cas nuclease and a guide RNA derived from the CRISPR sequences that provides target sequence specificity through a single stranded binding region. Binding of the CRISPR:Cas complex to the target sequence results in double stranded cleavage of the target sequence.
[0004] The CRISPR/Cas system has been modified for use in prokaryotic and eukaryotic systems for genome editing and transcriptional regulation. However, methods and compositions known in the art often fail to provide the activity and specificity necessary for routine use. For example, Cradick, et al., Nucleic Acids Res. Aug. 11, 2013; Pattanayak, et al., Nat Biotechnol. 2013 Sep;31(9):839-43; Mali, et al., Nat Biotechnol. 2013 Sep;31(9):833-8; and Hsu, et al., Nat Biotechnol. 2013 Sep;31(9):827-32, all report significant off-target genome editing and varied editing efficiency across different gene targets. Similar issues also exist when using known CRISPR/Cas systems for regulation of transcription.
BRIEF SUMMARY OF THE INVENTION
[0005] In one embodiment, this invention provides a small guide RNA molecule comprising from 5' to 3': a binding region, comprising between about 5 and about 50 nucleotides; a 5' hairpin region, comprising: fewer than four consecutive uracil nucleotides; or a length of at least 31 nucleotides; and a 3' hairpin region; and a transcription termination sequence, wherein the small guide RNA is configured to form a complex with a small guide RNA-mediated nuclease, the complex having increased stability or activity relative to a complex containing a small guide RNA-mediated nuclease and a small guide RNA comprising at least 95% identity to SEQ ID NO:1 or a complement thereof.
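The two structural criteria recited for the 5' hairpin region (fewer than four consecutive uracil nucleotides; a length of at least 31 nucleotides) can be checked mechanically. The following is a minimal sketch only: the function names and the toy sequences are hypothetical illustrations, are not any of the SEQ ID NOs referenced herein, and say nothing about whether a hairpin actually folds or binds the nuclease.

```python
import re


def max_uracil_run(rna: str) -> int:
    """Return the length of the longest run of consecutive U residues.

    DNA-style input is tolerated by treating T as U (an assumption made
    for convenience, not something the text requires).
    """
    normalized = rna.upper().replace("T", "U")
    runs = re.findall(r"U+", normalized)
    return max((len(run) for run in runs), default=0)


def check_5prime_hairpin(hairpin: str, min_length: int = 31) -> dict:
    """Evaluate the two recited criteria separately for one hairpin string."""
    return {
        "fewer_than_four_consecutive_u": max_uracil_run(hairpin) < 4,
        "length_at_least_min": len(hairpin) >= min_length,
    }


if __name__ == "__main__":
    # Hypothetical toy strings, chosen only to exercise the checks.
    toy_with_u_run = "GUUUUAGAGC"
    toy_without_u_run = "GUUUAAGAGC"
    print(check_5prime_hairpin(toy_with_u_run))
    print(check_5prime_hairpin(toy_without_u_run))
```

The criteria are reported separately because the paragraph recites them as alternatives in one place and in combination in another.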
[0006] In some cases, the 5' hairpin region of the small guide RNA molecule comprises fewer than four consecutive uracil nucleotides and a length of at least 31 nucleotides, a length of at least 35 nucleotides, or a length of at least 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides. In some cases, the small guide RNA molecule further comprises an additional hairpin region designed to interact with a protein or small-molecule to conditionally stabilize the secondary and/or tertiary structure of the small guide RNA molecule. In some cases, the small guide RNA molecule comprises at least 95% identity to SEQ ID NOs:2, 3, or 4, or a complement thereof.
[0007] In some embodiments, the invention provides a composition for nucleic acid modification or detection comprising any of the foregoing small guide RNA molecules. In some cases, the composition further comprises a small guide RNA-mediated nuclease, wherein the small guide RNA and the small guide RNA-mediated nuclease form a complex having increased stability or activity relative to a complex containing a small guide RNA comprising at least 95% identity to SEQ ID NO:1 or a complement thereof. In some cases, the composition is nuclease defective, thereby forming a complex configured to bind to, but not cleave or nick, a target nucleic acid substantially complementary to the binding region of the small guide RNA. In some cases, the nuclease defective composition comprises a Cas9 protein containing a mutation at one or more of the following residues: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987. In some cases, the nuclease defective composition comprises a Cas9 protein containing a mutation at two or more of the following residues: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, and A987. In some cases, the nuclease defective composition comprises a Cas9 protein containing a D10A and a H840A mutation.
[0008] In some cases, the nuclease defective composition comprises a labeled Cas9 protein. For example, in some cases, the labeled Cas9 protein comprises a fluorophore. In some cases, the fluorophore is a fluorescent protein.
[0009] In some cases, the nuclease defective composition comprises a Cas9 protein that comprises a polypeptide that modulates (e.g., activates or represses) transcription. For example, the Cas9 protein can be a Cas9 fused to a transcriptional repressor, including but not limited to, a chromoshadow domain (CSD). As another example, the Cas9 protein can be a Cas9 fused to a transcriptional repressor, including but not limited to, a Krüppel associated box (KRAB) domain. As yet another example, the Cas9 protein can be a Cas9 fused to a transcriptional activator, including but not limited to, VP8, VP16, or VP64.
[0010] In some embodiments, the composition has nuclease activity, thereby forming a complex configured to bind and cleave a target nucleic acid sequence substantially complementary to the binding region of the small guide RNA. In some cases, the small guide RNA-mediated nuclease has nicking activity, but is substantially defective at catalyzing double stranded breaks in the target sequence. In some cases, the small guide RNA-mediated nuclease comprises a Cas9 protein containing a mutation at one or more of the following residues D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987. In some cases, the small guide RNA-mediated nuclease comprises a Cas9 protein with nicking activity containing a mutation at one or more of the following residues D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987.
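The residue lists above lend themselves to a simple bookkeeping check when tracking Cas9 variants. The sketch below is illustrative only and rests on labeled assumptions: the text states that a Cas9 carrying both D10A and H840A is nuclease defective, whereas labeling a variant mutated at only one of D10 or H840 as a nickase is an assumption drawn from general knowledge of S. pyogenes Cas9 and is not a rule stated here. The function names and the mutation-string format (e.g., "D10A") are hypothetical.

```python
import re

# Residues recited in the text as candidate mutation sites.
LISTED_RESIDUES = {
    "D10", "G12", "G17", "E762", "H840", "N854",
    "N863", "H982", "H983", "A984", "D986", "A987",
}


def parse_site(mutation: str) -> str:
    """Turn a substitution such as 'D10A' into its site label 'D10'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild_type, position, _replacement = match.groups()
    return wild_type + position


def classify_variant(mutations):
    """Return (label, unlisted_sites) for a list of substitutions.

    'nuclease-defective' follows the D10A + H840A example in the text;
    'nickase (assumed)' is an illustrative assumption, not a stated rule.
    """
    sites = {parse_site(m) for m in mutations}
    unlisted = sorted(sites - LISTED_RESIDUES)
    if {"D10", "H840"} <= sites:
        label = "nuclease-defective"
    elif sites & {"D10", "H840"}:
        label = "nickase (assumed)"
    else:
        label = "unclassified"
    return label, unlisted


if __name__ == "__main__":
    print(classify_variant(["D10A", "H840A"]))
    print(classify_variant(["D10A"]))
```

Sites outside the recited list are reported rather than rejected, since the text does not say that other positions are excluded.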
[0011] In some embodiments, the invention provides an expression cassette comprising a promoter operably linked to a nucleic acid encoding any of the small guide RNAs of claims 1-4. In some cases, the promoter of the expression cassette is an RNA polymerase III promoter. In some cases, the RNA polymerase promoter is a U6 or H1 promoter, preferably a U6 promoter. In some cases, the promoter of the expression cassette is an SFFV promoter.
[0012] In some embodiments, the invention provides an expression cassette comprising a promoter operably linked to a nucleic acid encoding a small guide RNA-mediated nuclease. In some cases, the promoter is a weak mammalian promoter as compared to the human elongation factor 1 promoter (EF1A). In some cases, the weak mammalian promoter is a ubiquitin C promoter or a phosphoglycerate kinase 1 promoter (PGK). In some cases, the weak mammalian promoter is a TetOn promoter in the absence of an inducer. In some cases, the nucleic acid encoding a small guide RNA-mediated nuclease of the expression cassette further encodes one or two nuclear localization sequences.
[0013] In some embodiments, the present invention provides a host cell comprising any one of the foregoing small guide RNAs. In some cases, the cell further comprises a small guide RNA-mediated nuclease. In some cases, the small guide RNA-mediated nuclease is labeled, such as with a fluorophore, such as a fluorescent protein.
[0014] In some embodiments, the present invention provides a method of detecting a target nucleic acid sequence in a cell, the method comprising: (i) introducing into the cell: (a) one or more small guide RNAs, each small guide RNA specific for the target nucleic acid sequence; and (b) a labeled nuclease-deficient small guide RNA-mediated nuclease, thereby forming a labeled RNA:nuclease complex; and (ii) incubating the cell to allow the labeled RNA:nuclease complex to localize to the target nucleic acid sequence; and (iii) detecting the presence, absence, or quantity of labeled complex in the nucleus of the cell, thereby detecting the target nucleic acid.
[0015] In some cases, the method further comprises introducing at least 2, 3, 4, 5, 6, 7, 8, 10, 15, 20 (e.g., 3-20, 4-10, etc.) or more different small guide RNAs, each specific for a different portion of the target nucleic acid sequence. In some cases, the one or more small guide RNAs are specific for a repeated target sequence. In some cases, the repeated target sequence comprises at least 5, 10, 15, 20 or more contiguous repeats, each repeat of at least 5, 10, 15, 20, or more nucleotides in length. In some cases, the small guide RNA-mediated nuclease contains a mutation at one or more of the following residues D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987. In some cases, the small guide RNA-mediated nuclease contains a mutation at two or more of the following residues D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987.
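Whether a candidate region qualifies as a repeated target sequence in the sense described above (a minimum number of contiguous repeats, each of a minimum length) can be screened with a short script. This is a rough sketch under stated assumptions: the function names and default thresholds are hypothetical choices taken from the lowest values recited, the input sequence is a synthetic toy built from the TTAGGG telomere repeat unit, and only exact, uninterrupted copies of the unit are counted.

```python
def longest_tandem_run(sequence: str, unit: str) -> int:
    """Count the largest number of back-to-back exact copies of `unit`."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    sequence, unit = sequence.upper(), unit.upper()
    best = 0
    # Try every start position so that no reading frame is missed.
    for start in range(len(sequence)):
        copies = 0
        position = start
        while sequence.startswith(unit, position):
            copies += 1
            position += len(unit)
        best = max(best, copies)
    return best


def is_repeated_target(sequence: str, unit: str,
                       min_repeats: int = 5, min_unit_length: int = 5) -> bool:
    """Apply the 'at least N contiguous repeats of at least M nt' criterion."""
    if len(unit) < min_unit_length:
        return False
    return longest_tandem_run(sequence, unit) >= min_repeats


if __name__ == "__main__":
    # Synthetic toy region: seven copies of the telomere repeat unit.
    toy_region = "ACGT" + "TTAGGG" * 7 + "ACGT"
    print(longest_tandem_run(toy_region, "TTAGGG"))
    print(is_repeated_target(toy_region, "TTAGGG"))
```

The exhaustive scan is quadratic in the worst case, which is acceptable for locus-sized inputs; a genome-scale screen would call for a dedicated tandem-repeat finder.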
[0016] In some embodiments, the introducing further comprises:
· forming a complex between:
- a first labeled nuclease-deficient small guide RNA-mediated nuclease and one or more small guide RNAs, to form a first labeled complex; and
- a second labeled nuclease-deficient small guide RNA-mediated nuclease and one or more small guide RNAs, to form a second labeled complex; and
· contacting the cell with the first and second complexes.
In some cases, the method further comprises forming a third and fourth labeled complex and contacting the cell with the four labeled complexes. In some cases, each labeled complex is specific for a different target nucleic acid sequence or specific for a different region of a chromosome. In some cases, each labeled complex is labeled with a different label, and the method comprises detecting the presence, absence, or quantity of each labeled complex in the nucleus of the cell.
[0017] In some embodiments, the present invention provides a method of modifying a target nucleic acid sequence in a cell, the method comprising: (i) introducing into the cell: (a) any of the foregoing small guide RNAs, the small guide RNA specific for the target nucleic acid sequence; and (b) a small guide RNA-mediated nuclease, thereby forming a small guide RNA:nuclease complex; and (ii) incubating the cell to allow the small guide RNA:nuclease complex to bind to and cleave or nick the target nucleic acid, thereby modifying the target nucleic acid sequence in the cell.
[0018] In some cases, the small guide RNA-mediated nuclease is capable of catalyzing double stranded breaks in the target nucleic acid, and the method further comprises cleaving the target nucleic acid. In some cases, the small guide RNA-mediated nuclease is capable of nicking the target nucleic acid but not catalyzing double stranded breaks in the target nucleic acid, and the method further comprises nicking the target nucleic acid. In some cases, the method further comprises (i) introducing into the cell: (a) a pair of any of the foregoing small guide RNAs, each small guide RNA specific for a target nucleic acid, wherein the pair of small guide RNAs bind to target nucleic acids on a chromosome and flank a nucleic acid region of interest; and (b) a small guide RNA-mediated nuclease, thereby forming a pair of small guide RNA:nuclease complexes; and (ii) incubating the cell to allow the small guide RNA:nuclease complexes to localize to the target nucleic acid sequence, thereby creating nicks or double stranded breaks that flank the nucleic acid region of interest; and (iii) incubating the cell to allow non-homologous end joining (NHEJ) or homologous DNA repair (HDR) to occur, thereby reducing heterozygosity in the cell or deleting at least a portion of the nucleic acid region of interest. In some cases, the method further comprises introducing into the cell a heterologous nucleic acid that contains regions of substantial
homology to the nucleic acid region of interest, thereby incorporating at least a portion of the heterologous nucleic acid into the nucleic acid region of interest.
[0019] In some embodiments, the present invention provides a kit comprising an sgRNA and a labeled nuclease defective sgRNA-mediated nuclease. In some cases, the kit further comprises a second sgRNA or a second labeled nuclease defective sgRNA-mediated nuclease. In some cases, the kit further comprises a cell transfection reagent.
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] Fig. 1 depicts an optimized CRISPR system for visualizing sequence-specific genomic elements in living human cells. (A) Overview of using CRISPR for genome imaging. A dCas9-EGFP fusion protein and sgRNAs allow local enrichment of fluorescence signals at specific genomic sites in living cells. (B) The CRISPR imaging system consists of a doxycycline-inducible expression system with low dCas9-EGFP expression level, a Tet-on 3G trans-activator, and custom designed sgRNAs expressed from a murine U6 promoter. We also optimized the nuclear localization of the dCas9-EGFP fusion protein by modifying the fusion strategy with nuclear localization signals (NLS). (C) Original and optimized sgRNA designs. The optimized sgRNA contains an A-U pair flip (underlined) and 5-bp extension (underlined) at the top of the hairpin (boxed). (D) CRISPR imaging of the human telomeric DNA sequence in RPE cells. The sgRNA target site is indicated by a gray line and the adjacent PAM is shown. The optimized sgRNA (F+E) shows a much higher labeling efficiency with lower background fluorescence signals. SgGAL4 is the negative control without cognate binding sites in the genome. Scale bar, 5 μm. (E) Histograms of telomere counts and distribution of telomere fluorescence intensity. The optimized sgRNA shows greatly enhanced labeling efficiency of telomeres.
[0021] Fig. 2 shows that different dCas9 and nuclear localization signal (NLS) fusion proteins have varied efficiencies of nuclear localization. Two copies of NLSs were fused to dCas9-EGFP at different positions. Version 1, containing two tandem NLSs between dCas9 and EGFP, shows only 20% nuclear localization, which can be enhanced by insertion of an HA tag as shown in Version 2. Versions 3 and 4, with an N-terminal NLS, show almost 100% nuclear localization of dCas9-EGFP. The nucleus is labeled with DAPI. Scale bar, 5 μm.
[0022] Fig. 3 depicts a comparison of different sgRNA designs for CRISPR-mediated gene repression. Different sgRNAs were transiently transfected into a GFP+ HEK293 reporter cell
line harboring a genomic integrated dCas9-KRAB gene. The fluorescence was assayed using flow cytometry. Modifications (underlined) of sgRNA sequence with a polymerase III SINE element polyadenylation signal sequence (# 2), A-U flip (# 3, 4, 5 and 6), hairpin extension (# 7 and 8) and combined changes were hypothesized to increase sgRNA expression level, stability or its association with dCas9 protein. Repression fold is calculated by dividing the fluorescence with sgGAL4 by the fluorescence with each sgRNA design. The fold increase in repression is calculated by dividing the fluorescence of Design #1 by the fluorescence of each other design. The unmodified hairpin nucleotides are boxed.
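The two ratios described for Fig. 3 can be written out as a short calculation. The following is a minimal sketch only: the function names and the fluorescence values are hypothetical and are not data taken from the figure.

```python
def repression_fold(control_fluorescence, design_fluorescence):
    """Fluorescence with the sgGAL4 control divided by fluorescence with one sgRNA design."""
    if control_fluorescence <= 0 or design_fluorescence <= 0:
        raise ValueError("fluorescence values must be positive")
    return control_fluorescence / design_fluorescence


def fold_increase_in_repression(design1_fluorescence, other_design_fluorescence):
    """Fluorescence with Design #1 divided by fluorescence with another design."""
    if design1_fluorescence <= 0 or other_design_fluorescence <= 0:
        raise ValueError("fluorescence values must be positive")
    return design1_fluorescence / other_design_fluorescence


# Hypothetical mean GFP fluorescence values (arbitrary units), for illustration only.
example = {"sgGAL4": 1000.0, "design_1": 200.0, "design_10": 50.0}

fold_design_1 = repression_fold(example["sgGAL4"], example["design_1"])
fold_design_10 = repression_fold(example["sgGAL4"], example["design_10"])
gain_over_design_1 = fold_increase_in_repression(example["design_1"], example["design_10"])
```

With these illustrative numbers, a design that leaves less residual fluorescence than Design #1 gives a fold increase greater than 1.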
[0023] Fig. 4 shows that optimized sgRNA designs enhance the labeling efficiency of telomeres. (A) Both designs of A-U flip (# 6) and hairpin extension (# 8) increased the telomere labeling efficiency compared to the original sgRNA design (# 1), as more dCas9-EGFP puncta were observed. Combining both modifications of A-U flip and hairpin extension (# 10) further enhanced the labeling efficiency and reduced background fluorescence signal. This design was used for later imaging and annotated as (F+E). Each conventional fluorescence image shows one projected three dimensional (3-D) stack 3 μm deep (an entire RPE nucleus) with 0.4 μm focal spacing. Scale bar, 5 μm. (B) A plot of overall telomere intensity distribution comparing the telomere labeling efficiencies of different sgRNA designs. Each spot stands for one cell. The percentage of whole-cell GFP is calculated as the ratio of overall telomere intensity to overall GFP signal in the whole cell.
[0024] Fig. 5 shows that the labeling efficiency of telomeres depends on the dosage of sgRNA lentivirus. With the best sgRNA design (F+E), telomeres were labeled when infected with 50 μL lentivirus, but there were some nucleolus-like structures highlighted in the background. Infection with 200 μL lentivirus increased telomere labeling efficiency without nucleolus-like structures, possibly due to enhanced assembly between sgRNAs and the dCas9-EGFP protein. A good balance between labeling efficiency and cytotoxicity was achieved by infecting the cells with 100 μL sgRNA lentivirus.
[0025] Fig. 6 depicts the co-localization of CRISPR labeling with telomere markers.
CRISPR can effectively detect telomere length changes in living cells. (A) Co-localization of dCas9-EGFP and telomeres as labeled by Oligo FISH (top) or antibody to TRF2 (bottom). (B-C) Telomere elongation was induced in UMUC3 cancer cells (starting telomere length 2-5 kb) by overexpression of human telomerase RNA (hTR). (B) Visualization of telomeres in UMUC3 cells by CRISPR labeling. (C) Telomere fluorescence intensity in CRISPR-labeled cells increases after elongation of telomeres by hTR. Scale bar, 5 μm.
[0026] Fig. 7 illustrates that no DNA damage response was detected at the telomeres labeled by CRISPR. (A) Co-localization of PNA FISH and 53BP1 staining in RPE cells, dCas9-EGFP-labeled RPE cells, and TIN2 shRNA-treated RPE cells. Infection of the RPE cells with TIN2 shRNA was used to induce a telomeric DNA damage response, revealed by 53BP1 antibody staining. 53BP1 signal was enriched at the telomeres indicated by PNA probe in most cells when infected with TIN2 shRNA. There was no obvious enrichment of 53BP1 at the telomeres labeled by PNA probes or dCas9-EGFP. (B) Histogram showing the quantification of 53BP1 signal enriched at the telomeres.
[0027] Fig. 8 depicts the results of imaging of endogenous MUC4 gene by targeting the repetitive or non-repetitive regions via dCas9-EGFP. (A) Schematic of the human MUC4 locus containing two repeated regions in exon 2 and intron 3 as indicated. The sgRNA target sites are indicated by gray lines and the adjacent PAMs are shown. (B) Conventional fluorescence images of MUC4 loci (arrows) in RPE cells by targeting the exon 2 repeats with three different sgRNAs. (C) Two protospacers with different lengths, 13 bp and 23 bp, were chosen as targets in the intron repeats. MUC4 intron (arrows) can only be labeled by using optimized sgRNA design when targeting the 23 bp protospacer. (D) Histograms of MUC4 loci counts by labeling exon 2 via CRISPR. (E) Histograms of MUC4 loci counts by labeling intron 3 via CRISPR. (F) Co-localization of dCas9-EGFP and Oligo FISH probes for both exon and intron labeling. (G) 73 protospacers in the first exon were selected for labeling the non-repetitive region of MUC4 gene. 1-3 spots (arrows) can be detected with 36 and 73 sgRNAs. Co-labeling of MUC4 using 73 sgRNAs and sgMUC4-E3 shows two proximal spots (arrowhead). The inset shows the magnification of the white box region. (H) Histograms of MUC4 loci counts. 26 sgRNAs or more are required to detect the non-repetitive genome loci. Scale bar, 5 μm.
[0028] Fig. 9 depicts the results of a karyotype analysis of the RPE cell line. Cytogenetic analysis was performed on ten G-banded metaphase cells of human RPE cell line. The chromosomes were stained with dyes that show a pattern of light and dark bands (called the banding pattern). The banding pattern for each chromosome was specific and consistent allowing identification of each of the 24 chromosomes. 10 cells were analyzed. This cell line demonstrated a hypertriploid karyotype (73 chromosomes in total) with female origin.
There were extra copies of chromosome 5, 7, 11, 12, 16, 19, and 20 that were present in eight or nine cells except for chromosome 16 (six cells) and 19 (four cells). Chromosome 10 and 22 were also lost in nine and eight cells respectively. All ten cells had two copies of an abnormal X chromosome with addition of unidentifiable genetic material translocated to the long-arm at Xq28.
[0029] Fig. 10 depicts CRISPR imaging of MUC4 and telomeres in the monoclones of HeLa cell line. (A) In addition to RPE and UMUC3 cell lines, CRISPR can also detect MUC4 loci and telomeres in living HeLa cells. Three MUC4 loci (arrows) were detected by CRISPR via targeting the repetitive exon region. (B) Histograms of MUC4 loci counts by CRISPR labeling. (C) MUC4 locus is localized at 3q29 of chromosome 3. Two hundred interphase nuclei of HeLa cells were examined by FISH using two probes hybridized to 3q26.1 and 3q28-29, respectively. All two hundred cells demonstrated a FISH signal pattern of three copies of 3q28-29, suggesting trisomy of chromosome 3 in HeLa cells.
[0030] Fig. 11 depicts CRISPR imaging of MUC1 loci in living RPE cells. (A) Schematic depicting the structure of MUC1 gene, which contains a 60 bp unit repeated 20 to 140 copies in both exon 3 and intron 3. Four sgRNAs were designed to target the repeat region, and their target sites are indicated with gray lines and the adjacent PAM are also indicated. (B) MUC1 loci (arrows) visualized in RPE cells by targeting four different protospacers. (C) The labeling specificity of CRISPR was confirmed by oligo FISH labeling. The co-localization of dCas9-EGFP and oligo FISH is indicated by an arrow. (D) Histograms of the labeling efficiency of MUC1 loci by targeting different protospacers. 1-5 spots can be observed in 90% of the cells with sgMUC1-E1 and 95% of cells with sgMUC1-E3, but only in 45% and 20% of cells with sgMUC1-E2 and sgMUC1-E4, respectively. The sgRNA design (sgMUC1-E4(F+E)) increased the labeling efficiency compared to sgMUC1-E4 from 20% to 55%.
[0031] Fig. 12 depicts a comparison of labeling efficiency between CRISPR and Oligo FISH. (A) Oligo FISH labeling of MUC4 exon, MUC4 intron, and MUC1 exon as indicated by arrows. (B) Histograms showing the statistics of observed spots for labeling MUC4 exon, MUC4 intron, and MUC1 exon by Oligo FISH or CRISPR. CRISPR shows a higher labeling efficiency in all cases.
[0032] Fig. 13 depicts imaging of multiple elements via CRISPR. (A) The loci of MUC1 and MUC4 were co-labeled by infecting RPE cells with both sgMUC1-E1 and sgMUC4-E3. More spots (arrows) were observed compared to individual sgMUC1-E1 or sgMUC4-E3 labeling, while a similar number of spots was detected when infecting the cells with both sgMUC4-E3 and sgMUC4-I2(F+E), which target the same MUC4 gene. (B) Histograms showing the counts of Mucin gene loci. 45% of cells targeting both MUC1 and MUC4 contain more than 6 spots, while only 10% of cells have 6 spots with co-infection of two sgMUC4s.
[0033] Fig. 14 depicts single particle tracking and diffusion dynamics of MUC4 labeled by CRISPR. (A) Conventional fluorescence image of MUC4 (left) and three example traces of marked foci (right). All traces show confined diffusion, but focus 2 is actively transported in addition. (B) Averaged Mean Square Displacement (MSD) of 119 and 50 traces from 26 cells, categorized as confined diffusive (darker line) and actively transported (lighter line). The shaded areas represent the standard error of the mean. (C) Histogram of confinement characteristic length and (D) histogram of microscopic diffusion coefficient derived from MSDs. (E) Data illustrating individual MUC4 loci movement. (F) Analysis of MUC4 loci movement depicts two distinct movement modes: confined diffusion and active transport of confined diffusion. (G) 80% of detected MUC4 loci followed confined diffusion, suggesting the movement of local chromatin is modulated by nuclear factors. (H) The median speed was 0.011 μm2/s as defined by diffusion coefficient.
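For readers unfamiliar with the Mean Square Displacement (MSD) referred to in Fig. 14, the following is a minimal sketch of a time-averaged MSD for one two-dimensional track. It is an illustration under stated assumptions (positions sampled at a constant frame interval, hypothetical function name and input data) and is not presented as the analysis actually used for the figure.

```python
def mean_square_displacement(track, max_lag=None):
    """Time-averaged MSD of one 2-D track.

    track: list of (x, y) positions sampled at a constant frame interval.
    Returns a list whose element k-1 is the MSD at a lag of k frames.
    """
    n = len(track)
    if n < 2:
        raise ValueError("need at least two positions")
    # Never ask for a lag longer than the track allows.
    max_lag = n - 1 if max_lag is None else min(max_lag, n - 1)

    msd = []
    for lag in range(1, max_lag + 1):
        squared_steps = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(n - lag)
        ]
        msd.append(sum(squared_steps) / len(squared_steps))
    return msd


# Hypothetical track moving one unit per frame along x (not measured data).
example_track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
example_msd = mean_square_displacement(example_track)
```

An MSD curve that plateaus with increasing lag is the usual signature of confined motion, whereas a curve that keeps bending upward is the usual signature of directed motion.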
DETAILED DESCRIPTION OF THE INVENTION
I. Definitions
[0034] As used in this specification and the appended claims, the singular forms "a," "an," and "the" include plural reference unless the context clearly dictates otherwise.
[0035] The term "nucleic acid" or "polynucleotide" refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). The term nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene.
[0036] The term "gene" means the segment of DNA involved in producing a polypeptide chain. It may include regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons).
[0037] A "promoter" is defined as an array of nucleic acid control sequences that direct transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic acid sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor elements, which can be located as much as several thousand base pairs from the start site of transcription.
[0038] An "expression cassette" is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular polynucleotide sequence in a host cell. An expression cassette may be part of a plasmid, viral genome, or nucleic acid fragment. Typically, an expression cassette includes a polynucleotide to be transcribed, operably linked to a promoter.
[0039] A "reporter gene" encodes proteins that are readily detectable due to their biochemical characteristics, such as enzymatic activity or chemifluorescent features. One specific example of such a reporter is green fluorescent protein. Fluorescence generated from this protein can be detected with various commercially-available fluorescent detection systems. Other reporters can be detected by staining. The reporter can also be an enzyme that generates a detectable signal when contacted with an appropriate substrate. The reporter can be an enzyme that catalyzes the formation of a detectable product. Suitable enzymes include, but are not limited to, proteases, nucleases, lipases, phosphatases and hydrolases. The reporter can encode an enzyme whose substrates are substantially impermeable to eukaryotic plasma membranes, thus making it possible to tightly control signal formation. Specific examples of suitable reporter genes that encode enzymes include, but are not limited to, CAT (chloramphenicol acetyl transferase; Alton and Vapnek (1979) Nature 282: 864-869);
luciferase (lux); β-galactosidase; LacZ; β-glucuronidase; and alkaline phosphatase (Toh, et al. (1980) Eur. J. Biochem. 182: 231-238; and Hall et al. (1983) J. Mol. Appl. Gen. 2: 101),
each of which is incorporated by reference herein in the entirety. Other suitable reporters include those that encode for a particular epitope that can be detected with a labeled antibody that specifically recognizes the epitope.
[0040] The term "amino acid" refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. "Amino acid mimetics" refers to chemical compounds having a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
[0041] There are various known methods in the art that permit the incorporation of an unnatural amino acid derivative or analog into a polypeptide chain in a site-specific manner, see, e.g., WO 02/086075.
[0042] Amino acids may be referred to herein by either the commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
[0043] "Polypeptide," "peptide," and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. All three terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. As used herein, the terms encompass amino acid chains of any length, including full-length proteins, wherein the amino acid residues are linked by covalent peptide bonds.
[0044] "Conservatively modified variants" applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, "conservatively modified variants" refers to those nucleic acids that encode identical or essentially identical amino acid
sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are "silent variations," which are one species of conservatively modified variations. Every nucleic acid sequence herein that encodes a polypeptide also describes every possible silent variation of the nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule.
Accordingly, each silent variation of a nucleic acid that encodes a polypeptide is implicit in each described sequence.
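The notion of silent variation can be made concrete with the standard genetic code. The sketch below is illustrative only: the helper names and the example coding sequence are hypothetical, and the table is the standard code written with RNA bases. It lists the codons that encode the same amino acid as a given codon, reproducing the alanine example (GCA, GCC, GCG, GCU), and counts how many nucleic acid sequences encode the same polypeptide as a short coding sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon):
    """Return all codons (sorted) that encode the same amino acid as `codon`."""
    codon = codon.upper().replace("T", "U")
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_encoding_sequences(coding_sequence):
    """Number of distinct sequences (including the original) that encode the
    same polypeptide, i.e. the original plus all of its silent variations."""
    seq = coding_sequence.upper().replace("T", "U")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    total = 1
    for i in range(0, len(seq), 3):
        total *= len(synonymous_codons(seq[i:i + 3]))
    return total


alanine_codons = synonymous_codons("GCA")
# Hypothetical three-codon example: Met-Ala-Trp.
n_variants = count_encoding_sequences("AUGGCAUGG")
```

In the three-codon example, only the alanine codon can vary, because methionine and tryptophan each have a single codon in the standard code.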
[0045] As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a "conservatively modified variant" where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention. In some cases, conservatively modified variants of Cas9 or sgRNA can have an increased stability, assembly, or activity as described herein.
[0046] The following eight groups each contain amino acids that are conservative substitutions for one another:
1) Alanine (A), Glycine (G);
2) Aspartic acid (D), Glutamic acid (E);
3) Asparagine (N), Glutamine (Q);
4) Arginine (R), Lysine (K);
5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V);
6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W);
7) Serine (S), Threonine (T); and
8) Cysteine (C), Methionine (M)
(see, e.g., Creighton, Proteins, W. H. Freeman and Co., N.Y. (1984)).
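The eight groups listed above can be encoded directly as a lookup. The following sketch is illustrative; the names `CONSERVATIVE_GROUPS` and `is_conservative_substitution` are hypothetical. Methionine appears in two groups (5 and 8), so it is treated as conservative with respect to members of either group.

```python
# One-letter codes for the eight groups listed in paragraph [0046].
CONSERVATIVE_GROUPS = [
    {"A", "G"},            # 1) Alanine, Glycine
    {"D", "E"},            # 2) Aspartic acid, Glutamic acid
    {"N", "Q"},            # 3) Asparagine, Glutamine
    {"R", "K"},            # 4) Arginine, Lysine
    {"I", "L", "M", "V"},  # 5) Isoleucine, Leucine, Methionine, Valine
    {"F", "Y", "W"},       # 6) Phenylalanine, Tyrosine, Tryptophan
    {"S", "T"},            # 7) Serine, Threonine
    {"C", "M"},            # 8) Cysteine, Methionine
]


def is_conservative_substitution(original, replacement):
    """True if the two different residues share at least one of the eight groups."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # no substitution took place
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

For example, `is_conservative_substitution("D", "E")` is true, while `is_conservative_substitution("C", "L")` is false because cysteine and leucine share no group even though each shares one with methionine.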
[0047] Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
[0048] In the present application, amino acid residues are numbered according to their relative positions from the leftmost residue, which is numbered 1, in an unmodified wild-type polypeptide sequence.
[0049] As used herein, the terms "identical" or percent "identity," in the context of describing two or more polynucleotide or amino acid sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same. For example, a core small guide RNA (sgRNA) sequence responsible for assembly and activity of a sgRNA:nuclease complex has at least 80% identity, preferably 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity, to a reference sequence (e.g., one of SEQ ID NOs:1-4), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. As another example, a Cas9 sequence responsible for assembly and activity of a sgRNA:nuclease complex has at least 80% identity, preferably 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity, to a reference sequence (e.g., one of SEQ ID NOs:5-6), when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Such sequences are then said to be "substantially identical." With regard to polynucleotide sequences, this definition also refers to the complement of a test sequence. With regard to amino acid sequences, preferably, the identity exists over a region that is at least about 50 amino acids or nucleotides in length, or more preferably over a region that is 75-100 amino acids or nucleotides in length.
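As an illustration of a percent identity calculation, the sketch below scores two sequences that have already been aligned over a comparison window. It rests on stated assumptions: gaps are written as "-", identity is counted over all alignment columns (other conventions for the denominator exist), and the function names and example sequences are hypothetical and are not any of the SEQ ID NOs.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, gaps written as `gap`).
    A column containing a gap never counts as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


def is_substantially_identical(aligned_a, aligned_b, threshold=80.0):
    """Apply a percent identity cutoff such as the 80% figure given above."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-nucleotide window with a single mismatch.
identity = percent_identity("ACGUACGUAC", "ACGUACGAAC")
```

The alignment itself would come from one of the algorithms discussed in the following paragraphs or from manual alignment; this sketch only covers the counting step.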
[0050] For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters. For sequence comparison of nucleic acids and proteins, the BLAST and BLAST 2.0 algorithms and the default parameters discussed below are used.
[0051] A "comparison window", as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds. 1995 supplement)).
[0052] Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402, respectively. Software for performing BLAST analyses is publicly available at the National Center for Biotechnology Information website, ncbi.nlm.nih.gov. The algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a word size (W) of 28, an expectation (E) of 10, M=1, N=-2, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a word size (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
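The seed-and-extend idea described above can be sketched in a few lines. This is a deliberately simplified, ungapped illustration and not the BLAST implementation: it uses exact word matches only (no neighborhood threshold T), a single strand, and hypothetical function names, with a reward of M=1, a penalty of N=-2, and an assumed drop-off X of 5.

```python
def find_word_hits(query, subject, w):
    """Return (query_pos, subject_pos) pairs where a length-w word matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


def _extend(query, subject, qi, si, step, score, match, mismatch, x_drop):
    """Walk in one direction (step = +1 or -1); return (best_score, best_length)."""
    best, best_len, n = score, 0, 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        score += match if query[qi] == subject[si] else mismatch
        n += 1
        if score > best:
            best, best_len = score, n
        # Halt when the score has fallen by X from its maximum, or is zero or below.
        if best - score >= x_drop or score <= 0:
            break
        qi += step
        si += step
    return best, best_len


def extend_hit(query, subject, q_pos, s_pos, w, match=1, mismatch=-2, x_drop=5):
    """Extend one word hit in both directions.

    Returns (score, q_start, q_end) with q_end exclusive, describing the
    best-scoring ungapped segment of the query around the seed.
    """
    seed_score = w * match
    right_score, right_len = _extend(
        query, subject, q_pos + w, s_pos + w, +1, seed_score, match, mismatch, x_drop
    )
    left_score, left_len = _extend(
        query, subject, q_pos - 1, s_pos - 1, -1, right_score, match, mismatch, x_drop
    )
    return left_score, q_pos - left_len, q_pos + w + right_len


# Hypothetical sequences sharing the 8-residue segment "ACGTACGT".
query = "AAACGTACGTTT"
subject = "CCACGTACGTGG"
hits = find_word_hits(query, subject, 4)
segment = extend_hit(query, subject, 3, 3, 4)
```

Here the seed word "CGTA" is extended to the full shared segment, after which mismatches on both sides lower the running score without raising the recorded maximum.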
[0053] The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001.
[0054] An indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the antibodies raised against the polypeptide encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions. Another indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent conditions, as described below. Yet another indication that two nucleic acid sequences are substantially identical is that the same primers can be used to amplify the sequence. Yet another indication that two polypeptides are substantially identical is that the two polypeptides retain identical or substantially similar activity.
[0055] A "translocation sequence" or "transduction sequence" refers to a peptide or protein (or active fragment or domain thereof) sequence that directs the movement of a protein from one cellular compartment to another, or from the extracellular space through the cell or plasma membrane into the cell. Translocation sequences that direct the movement of a protein from the extracellular space through the cell or plasma membrane into the cell are "cell penetration peptides." Translocation sequences that localize to the nucleus of a cell are termed "nuclear localization" sequences, signals, domains, peptides, or the like.
Examples of translocation sequences include, without limitation, the TAT transduction domain (see, e.g., S. Schwarze et al., Science 285 (Sep. 3, 1999)); penetratins or penetratin peptides (D. Derossi et al., Trends in Cell Biol. 8, 84-87); Herpes simplex virus type 1 VP22 (A. Phelan et al., Nature Biotech. 16, 440-443 (1998)); and polycationic peptides (Cell Mol. Life Sci. 62 (2005) 1839-1849). Further translocation sequences are known in the art.
Translocation peptides can be fused (e.g., at the amino or carboxy terminus), conjugated, or coupled to a compound of the present invention, to, among other things, produce a conjugate compound that may easily pass into target cells, or through the blood brain barrier and into target cells.
[0056] The "CRISPR/Cas" system refers to a widespread class of bacterial systems for defense against foreign nucleic acid. CRISPR/Cas systems are found in a wide range of eubacterial and archaeal organisms. CRISPR/Cas systems include type I, II, and III subtypes. Wild-type type II CRISPR/Cas systems utilize the RNA-mediated nuclease, Cas9, in complex with guide and activating RNA to recognize and cleave foreign nucleic acid.
[0057] Cas9 homologs are found in a wide variety of eubacteria, including, but not limited to, bacteria of the following taxonomic groups: Actinobacteria, Aquificae, Bacteroidetes-Chlorobi, Chlamydiae-Verrucomicrobia, Chloroflexi, Cyanobacteria, Firmicutes, Proteobacteria, Spirochaetes, and Thermotogae. An exemplary Cas9 protein is the Streptococcus pyogenes Cas9 protein. Additional Cas9 proteins and homologs thereof are described in, e.g., Chylinski, et al., RNA Biol. 2013 May 1; 10(5): 726-737; Nat. Rev. Microbiol. 2011 June; 9(6): 467-477; Hou, et al., Proc Natl Acad Sci U S A. 2013 Sep 24;110(39):15644-9; Sampson et al., Nature. 2013 May 9;497(7448):254-7; and Jinek, et al., Science. 2012 Aug 17;337(6096):816-21.
[0058] As used herein, "activity" in the context of CRISPR/Cas activity, Cas9 activity, sgRNA activity, sgRNA:nuclease activity and the like refers to the ability to bind to a target sequence and/or label or cleave the target sequence. Such activity can be measured in a variety of ways as known in the art. For example, expression, activity, or level of a reporter gene can be measured, and sgRNA:nucleases targeting the reporter gene sequence can be assayed for their ability to reduce the expression, activity, or level of the reporter gene. For example, a cell can be transfected with an expression cassette encoding a green fluorescent protein under the control of a constitutive promoter. The fluorescence intensity can be measured and compared to the intensity of the cell after transfection with Cas9 and candidate sgRNAs to identify optimized sgRNAs.
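The reporter comparison described in [0058] can be sketched as a fractional reduction in fluorescence. This is a minimal illustration, assuming intensities are already measured in arbitrary units; the function names, the optional background subtraction, and the ranking step are assumptions rather than part of the described assay.

```python
def fractional_knockdown(baseline, treated, background=0.0):
    """Fraction of reporter signal lost after Cas9 + sgRNA transfection.

    baseline: intensity of reporter cells before transfection
    treated: intensity after transfection with Cas9 and a candidate sgRNA
    background: autofluorescence to subtract from both (assumed, optional)
    """
    signal_before = baseline - background
    signal_after = treated - background
    if signal_before <= 0:
        raise ValueError("baseline must exceed background")
    return 1.0 - signal_after / signal_before


def rank_candidates(baseline, treated_by_sgrna, background=0.0):
    """Return (sgRNA name, knockdown) pairs, strongest knockdown first."""
    scores = {
        name: fractional_knockdown(baseline, intensity, background)
        for name, intensity in treated_by_sgrna.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Under these assumptions, a candidate that lowers the signal from 1000 to 250 units scores 0.75, and the candidate at the top of the ranking is the one that would be carried forward as "optimized".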
II. Introduction
[0059] Described herein are methods and compositions for analyzing or modifying target nucleic acids. The methods and compositions are based on an optimized CRISPR/Cas system that employs a small guide RNA (sgRNA) and an sgRNA-mediated nuclease. The sgRNA contains a binding region that provides specific binding to a nucleic acid target sequence that is substantially complementary to the sequence of the binding region. The sgRNA and the nuclease can form a complex that specifically binds to the nucleic acid target sequence.
III. Compositions
[0060] In some embodiments, a small guide RNA (sgRNA) molecule is provided. sgRNAs contain a binding region that determines the sequence specificity of the sgRNA and the sgRNA:nuclease complex; a 5' stem-loop region that, at least in part, participates in assembly and interaction with an sgRNA-mediated nuclease; an intervening sequence; a 3' stem-loop region; and a termination sequence.
[0061] The binding region can be between about 5 and 100 nucleotides long, or longer (e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100 nucleotides in length, or longer). In some cases, the binding region is between about 15 and about 30 nucleotides in length (e.g., about 15-29, 15-26, 15-25; 16-30, 16-29, 16-26, 16-25; or about 18-30, 18-29, 18-26, or 18-25 nucleotides in length). Generally, the binding region is designed to complement or substantially complement the target nucleic acid sequence or sequences. In some cases, the binding region can incorporate wobble or degenerate bases to bind multiple sequences. In some cases, the binding region can be altered to increase stability. For example, non-natural nucleotides can be incorporated to increase RNA resistance to degradation. In some cases, the binding region can be altered or designed to avoid or reduce secondary structure formation in the binding region. In some cases, the binding region can be designed to optimize G-C content. In some cases, G-C content is preferably between about 40% and about 60% (e.g., 40%, 45%, 50%, 55%, 60%). In some cases, the binding region can contain modified nucleotides such as, without limitation, methylated or phosphorylated nucleotides.
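As an illustration of the length and G-C preferences in [0061], the sketch below screens a candidate binding region. The function names and the dictionary it returns are hypothetical; the default bounds (15 to 30 nucleotides, 40% to 60% G-C) are taken from the "in some cases" ranges above and are treated as hard cutoffs only for the purpose of the example.

```python
def gc_fraction(rna):
    """Fraction of G and C bases in an RNA sequence."""
    rna = rna.upper()
    if not rna:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in rna) / len(rna)


def check_binding_region(rna, min_len=15, max_len=30, min_gc=0.40, max_gc=0.60):
    """Report whether a binding region falls in the preferred ranges."""
    gc = gc_fraction(rna)
    return {
        "length_ok": min_len <= len(rna) <= max_len,
        "gc_ok": min_gc <= gc <= max_gc,
        "gc": gc,
    }
```

A screen of this kind does not address secondary structure or nucleotide modifications, which the paragraph lists as separate design considerations.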
[0062] The 5' stem-loop region can be between about 15 and about 50 nucleotides in length (e.g., about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or about 50 nucleotides in length). In some cases, the 5' stem-loop region is between about 30-45 nucleotides in length (e.g., about 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, or 45 nucleotides in length). In some cases, the 5' stem-loop region is at least about 31 nucleotides in length (e.g., at least about 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, or 45 nucleotides in length). In some cases, the 5' stem-loop structure contains one or more loops or bulges, each loop or bulge of about 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides. In some cases, the 5' stem-loop structure contains a stem of between about 10 and 30 complementary base pairs (e.g., 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 complementary base pairs).
[0063] In some embodiments, the 5' stem-loop structure can contain protein-binding or small molecule-binding structures. In some cases, the 5' stem-loop function (e.g., interacting or assembling with an sgRNA-mediated nuclease) can be conditionally activated by drugs, growth factors, small molecule ligands, or a protein that binds to the protein-binding structure of the 5' stem-loop. In some embodiments, the 5' stem-loop structure can contain non-natural nucleotides. For example, non-natural nucleotides can be incorporated to enhance protein-RNA interaction, or to increase the thermal stability or resistance to degradation of the sgRNA.
[0064] The intervening sequence between the 5' and 3' stem-loop structures can be between about 10 and about 50 nucleotides in length (e.g., about 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or about 50 nucleotides in length). In some cases, the intervening sequence is designed to be linear, unstructured, substantially linear, or substantially unstructured. In some embodiments, the intervening sequence can contain non-natural nucleotides. For example, non-natural nucleotides can be incorporated to enhance protein-RNA interaction or to increase the activity of the sgRNA:nuclease complex. As another example, natural nucleotides can be incorporated to enhance the thermal stability or resistance to degradation of the sgRNA.
[0065] The 3' stem-loop structure can contain an about 3, 4, 5, 6, 7, or 8 nucleotide loop and an about 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotide or longer stem. In some cases, the 3' stem-loop can contain a protein-binding, small molecule-binding, hormone-binding, or metabolite-binding structure that can conditionally stabilize the secondary and/or tertiary structure of the sgRNA. In some embodiments, the 3' stem-loop can contain non-natural nucleotides. For example, non-natural nucleotides can be incorporated to enhance protein-RNA interaction or to increase the activity of the sgRNA:nuclease complex. As another example, natural nucleotides can be incorporated to enhance the thermal stability or resistance to degradation of the sgRNA.
[0066] In some embodiments, the sgRNA includes a termination structure at its 3' end. In some cases, the sgRNA includes an additional 3' hairpin structure, e.g., before the termination structure, that can interact with proteins, small molecules, hormones, etc., for stabilization or additional functionality, such as conditional stabilization or conditional regulation of sgRNA:nuclease assembly or activity.
[0067] In some cases, the sgRNA is optimized to enhance stability, assembly, and/or expression. In some cases, the sgRNA is optimized to enhance the activity of an sgRNA:nuclease complex as compared to previously known sgRNAs, such as an sgRNA encoded by:
SEQ ID NO:1 [N]5-100GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU, where [N] represents a target specific binding region of between about 5-100 nucleotides (e.g., about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, or 90 nucleotides) that is complementary or substantially complementary to the target nucleic acid. In some cases, the optimized sgRNA provides enhanced activity as compared to a previously known sgRNA or an sgRNA substantially identical to a previously known sgRNA. As used herein, identity of an sgRNA to another sgRNA, such as an sgRNA to SEQ ID NO:1, is determined with reference to the identity to the nucleotide sequences outside of the binding region. For example, two sgRNAs with 0% identity inside the binding region and 100% identity outside the binding region are 100% identical to each other. Similarly, as used herein, the number of substitutions, additions, or deletions of an sgRNA as compared to another, such as an sgRNA compared to SEQ ID NO:1, is determined with reference to the nucleotide sequences outside of the binding region. For example, two sgRNAs with multiple additions, substitutions, and/or deletions inside the binding region and 100% identity outside the binding region are considered to contain 0 nucleotide substitutions, additions, or deletions.
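The comparison rule in [0067], under which only nucleotides outside the binding region count, can be sketched as follows. This is a minimal illustration, assuming the binding region is the 5'-most segment and its length is known for each sgRNA; the function names are hypothetical, a plain Levenshtein distance stands in for "substitutions, additions, or deletions", and the identity formula (one minus edits over the longer scaffold length) is a simplifying assumption rather than a definition from the text.

```python
# Portion of SEQ ID NO:1 outside the [N] binding region.
SCAFFOLD_SEQ1 = (
    "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCC"
    "GUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU"
)


def edit_distance(a, b):
    """Minimum number of substitutions, insertions, and deletions from a to b."""
    prev = list(range(len(b) + 1))
    for i, base_a in enumerate(a, start=1):
        curr = [i]
        for j, base_b in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + (base_a != base_b),  # match or substitution
            ))
        prev = curr
    return prev[-1]


def scaffold_edits(sgrna_a, binding_len_a, sgrna_b, binding_len_b):
    """Count edits between two sgRNAs, ignoring their binding regions."""
    return edit_distance(sgrna_a[binding_len_a:], sgrna_b[binding_len_b:])


def scaffold_identity(sgrna_a, binding_len_a, sgrna_b, binding_len_b):
    """Approximate identity outside the binding region (assumed formula)."""
    scaffold_a = sgrna_a[binding_len_a:]
    scaffold_b = sgrna_b[binding_len_b:]
    longest = max(len(scaffold_a), len(scaffold_b))
    return 1.0 - edit_distance(scaffold_a, scaffold_b) / longest
```

Two sgRNAs built on the same scaffold with unrelated binding regions of different lengths would give 0 edits and an identity of 1.0 under this sketch, matching the example in the paragraph.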
[0068] In some cases, the optimized sgRNAs described herein form an sgRNA:nuclease complex with enhanced activity as compared to SEQ ID NO:1, or an sgRNA 90, 95, 96, 97, 98, or 99% or more identical to SEQ ID NO:1. In some cases, the optimized sgRNAs described herein form an sgRNA:nuclease complex with enhanced activity as compared to SEQ ID NO:1, or an sgRNA with fewer than 5, 4, 3, or 2 nucleotide substitutions, additions, or deletions of SEQ ID NO:1.
[0069] In some embodiments, the sgRNA can be optimized for expression by substituting, deleting, or adding one or more nucleotides. In some cases, a nucleotide sequence that provides inefficient transcription from an encoding template nucleic acid can be deleted or substituted. For example, in some cases, the sgRNA is transcribed from a nucleic acid operably linked to an RNA polymerase III promoter. In such cases, sgRNA sequences that result in inefficient transcription by RNA polymerase III, such as those described in Nielsen et al., Science. 2013 Jun 28;340(6140):1577-80, can be deleted or substituted. For example, one or more consecutive uracils can be deleted or substituted from the sgRNA sequence. In some cases, the consecutive uracils are present in the stem portion of a stem-loop structure. In such cases, one or more of the consecutive uracils can be substituted by exchanging the uracil and its complementary base. For example, if the uracil is hydrogen bonded to a corresponding adenine, the sgRNA sequence can be altered to exchange the adenine and uracil. This "A-U flip" can retain the overall structure and function of the sgRNA molecule while improving expression by reducing the number of consecutive uracil nucleotides. In some cases, the sgRNA containing an A-U flip is encoded by:
SEQ ID NO:2 [N]5-100GUUUAAGAGCUAGAAAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU, where the A-U flipped nucleotides are underlined. In some cases, the optimized sgRNA is at least 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% identical or more to SEQ ID NO:2, or contains fewer than 10, 9, 8, 7, 6, 5, 4, 3, or 2 nucleotide additions, deletions, or substitutions compared to SEQ ID NO:2. Alternatively, the A-U pair can be replaced by a G-C, C-G, A-C, or G-U pair.
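The A-U flip in [0069] can be illustrated by swapping one paired U and A and measuring the longest uracil run before and after. This is a sketch under stated assumptions: the 30-nucleotide segments below are the 5'-most scaffold nucleotides of SEQ ID NO:1 and SEQ ID NO:2, the choice of that segment boundary is an assumption, the flipped positions (0-based indices 4 and 25) were found by comparing the two sequences, and the function names are hypothetical.

```python
import re

# First 30 scaffold nucleotides of SEQ ID NO:1 and SEQ ID NO:2 (assumed
# here to cover the 5' stem-loop; the boundary is an illustration only).
HAIRPIN_SEQ1 = "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAU"
HAIRPIN_SEQ2 = "GUUUAAGAGCUAGAAAUAGCAAGUUUAAAU"


def longest_u_run(rna):
    """Length of the longest stretch of consecutive uracils."""
    runs = re.findall(r"U+", rna.upper())
    return max((len(run) for run in runs), default=0)


def flip_pair(rna, i, j):
    """Exchange the bases at 0-based positions i and j of a base pair."""
    bases = list(rna)
    bases[i], bases[j] = bases[j], bases[i]
    return "".join(bases)


flipped = flip_pair(HAIRPIN_SEQ1, 4, 25)
```

With these inputs the flipped segment equals the SEQ ID NO:2 segment, and the longest uracil run drops from four to three, while the U and A still face each other across the stem.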
[0070] In some embodiments, the sgRNA can be optimized for stability. Stability can be enhanced by optimizing the stability of the sgRNA:nuclease interaction, optimizing assembly of the sgRNA:nuclease complex, removing or altering RNA destabilizing sequence elements, or adding RNA stabilizing sequence elements. In some embodiments, the sgRNA contains a 5' stem-loop structure proximal to, or adjacent to, the binding region that interacts with the sgRNA-mediated nuclease. Optimization of the 5' stem-loop structure can provide enhanced stability or assembly of the sgRNA:nuclease complex. In some cases, the 5' stem-loop structure is optimized by increasing the length of the stem portion of the stem-loop structure. An exemplary sgRNA containing an optimized 5' stem-loop structure is encoded by:
SEQ ID NO:3 [N]5-100GUUUUAGAGCUAUGCUGGAAACAGCAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU, where the nucleotides contributing to the elongated stem portion of the 5' stem-loop structure are underlined. In some cases, the optimized sgRNA is at least 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% identical or more to SEQ ID NO:3, or contains fewer than 10, 9, 8, 7, 6, 5, 4, 3, or 2 nucleotide additions, deletions, or substitutions compared to SEQ ID NO:3.
[0071] In some embodiments, the 5' stem-loop optimization is combined with mutations for increased transcription to provide an optimized sgRNA. For example, an A-U flip and an elongated stem loop can be combined to provide an optimized sgRNA. An exemplary sgRNA containing an A-U flip and an elongated 5' stem-loop is encoded by:
SEQ ID NO:4 [N]5-100GUUUAAGAGCUAUGCUGGAAACAGCAUAGCAAGUUUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU, where the A-U flipped nucleotides and the nucleotides contributing to the elongated stem portion of the 5' stem-loop structure are underlined. In some cases, the optimized sgRNA is at least 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% identical or more to SEQ ID NO:4, or contains fewer than 10, 9, 8, 7, 6, 5, 4, 3, or 2 nucleotide additions, deletions, or substitutions compared to SEQ ID NO:4.
[0072] sgRNAs can be modified by methods known in the art. In some cases, the modifications can include, but are not limited to, the addition of one or more of the following sequence elements: a 5' cap (e.g., a 7-methylguanylate cap); a 3' polyadenylated tail; a riboswitch sequence; a stability control sequence; a hairpin; a subcellular localization sequence; a detection sequence or label; or a binding site for one or more proteins. Modifications can also include the introduction of non-natural nucleotides including, but not limited to, one or more of the following: fluorescent nucleotides and methylated nucleotides.
[0073] In some embodiments, the sgRNA can contain from 5' to 3': (i) a binding region of between about 10 and about 50 nucleotides; (ii) a 5' hairpin region containing fewer than four consecutive uracil nucleotides, or a length of at least 31 nucleotides (e.g., from about 31 to about 41 nucleotides); (iii) a 3' hairpin region; and (iv) a transcription termination sequence, wherein the small guide RNA is configured to form a complex with a small guide RNA-mediated nuclease, the complex having increased stability or activity relative to a complex containing a small guide RNA-mediated nuclease and a small guide RNA comprising at least 95% identity to SEQ ID NO:1 or a complement thereof.
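Element (ii) of [0073] combines two alternative conditions on the 5' hairpin. The sketch below tests them on hairpin segments taken from SEQ ID NOs: 1 to 3, assuming the hairpin spans the scaffold nucleotides up to and including the first "AAAAU"; that boundary, the constant names, and the function name are assumptions made only for illustration.

```python
import re

# Assumed 5' hairpin segments (see the lead-in for the boundary assumption).
HAIRPIN_ORIGINAL = "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAU"             # from SEQ ID NO:1
HAIRPIN_AU_FLIP = "GUUUAAGAGCUAGAAAUAGCAAGUUUAAAU"              # from SEQ ID NO:2
HAIRPIN_ELONGATED = "GUUUUAGAGCUAUGCUGGAAACAGCAUAGCAAGUUAAAAU"  # from SEQ ID NO:3


def hairpin_meets_criteria(hairpin, max_u_run=3, min_length=31):
    """True if the hairpin has fewer than four consecutive U or is >= 31 nt."""
    runs = re.findall(r"U+", hairpin.upper())
    longest = max((len(run) for run in runs), default=0)
    return longest <= max_u_run or len(hairpin) >= min_length
```

Under these assumptions the original hairpin fails both conditions, the A-U flipped hairpin passes on the uracil condition, and the elongated hairpin passes on length.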
[0074] In some embodiments, an sgRNA-mediated nuclease is provided. In some cases, the sgRNA-mediated nuclease is a Cas9 protein. For example, the sgRNA-mediated nuclease can be a type I, II, or III Cas9 protein. In some cases, the sgRNA-mediated nuclease can be a modified Cas9 protein. Cas9 proteins can be modified by any method known in the art. For example, the Cas9 protein can be codon optimized for expression in a host cell or an in vitro expression system. Additionally, or alternatively, the Cas9 protein can be engineered for stability, enhanced target binding, or reduced aggregation.
[0075] In some cases, the Cas9 protein can be engineered to be nuclease deficient. For example, the Cas9 protein generally catalyzes double-stranded cleavage of the target nucleic acid; however, certain Cas9 mutations can provide a nuclease that is able to nick the target nucleic acid but unable to catalyze double-stranded cleavage, or unable to substantially catalyze double-stranded cleavage. Exemplary mutations that reduce or eliminate double-stranded cleavage but provide nicking activity include one or more mutations in the following locations: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987, or a mutation in a corresponding location in a Cas9 homologue or ortholog. The mutation(s) can include substitution with any natural (e.g., alanine) or non-natural amino acid, or deletion. Cas9 proteins that cleave or nick the target sequence can be utilized in combination with an sgRNA, such as one or more of the sgRNAs described herein, to form a complex that is useful for various nucleic acid modification methods, such as genome editing as further explained below. An exemplary nicking Cas9 protein is Cas9D10A or Cas9H840A (Jinek, et al., Science. 2012 Aug 17;337(6096):816-21).
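The mutation notation used in [0075] (wild-type residue, 1-based position, replacement residue) can be applied mechanically to a protein sequence. The following is a minimal sketch, assuming single-letter amino acid codes; the function name is hypothetical, and the short N-terminal fragment is copied from the start of the sequence listings only so the example stays small.

```python
import re

# N-terminal fragment with aspartate at position 10 (illustrative input only).
N_TERMINUS = "MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSI"


def apply_point_mutation(protein, mutation):
    """Apply a substitution written like 'D10A' to a protein sequence.

    Raises ValueError if the notation is malformed or the residue at the
    stated 1-based position is not the stated wild-type residue.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1
    if index < 0 or index >= len(protein) or protein[index] != wild_type:
        raise ValueError(f"{wild_type} not found at position {position}")
    return protein[:index] + replacement + protein[index + 1:]


nickase_fragment = apply_point_mutation(N_TERMINUS, "D10A")
```

Checking the wild-type residue before substituting guards against off-by-one numbering, which matters when the same positions are mapped onto a Cas9 homologue or ortholog.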
[0076] The Cas9D10A amino acid sequence is (D10A mutation underlined):
SEQ ID NO:5
MDKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSII KNLIGA LLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVE EDKXHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDi ADLRLIYLALAHMIKFR GHFLIEGDLNPDNSDVDKLFIQLVQTY QLFEENPINASGVDAI AILSARLSKSRRLE NLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLA QIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKAL VRQQLPEKYKEIFFDQSK GYAGYIDGGASQEEFYKFII PILEI MDGTEELLVKXNRE DLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLi DNREKIEKILTFRIPYYVGPLA RGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSL LYEYFTVY ELTKVKYVTEGMRKP AFL S GEQKKAI VDLLFKTNRKVT VKQLKED YF KKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDR EMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSG TILDFLKSD GFANRNFMQLIHDDSLTFi EDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMi RIEEGIi ELGSQILKEHP VENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDHIVPQSFLi DDSIDNKV LTRSDK RGKSDNVPSEEVVKKMK YWRQLLNAKLITQRKFDNLTKAERGGLSELD KAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKXVSDFRKDF QFYKVREIN YHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIA SE QEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVR KVLSMPQWIVK TEVQTGGFSKJiSILPKRNSD LIARKKDWDPKKYGGFDSPTVAY SVLVVAKVE GKSKKLKSVKELLGITIMERSSFEK PIDFLEAKGYKEVKKDLIIKLP KYSLFELENGRKRMLASAGELQKGNELALPS YV FLYLASHYEKL GSPEDNEQK QLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAY KHRDKPIREQAENIIHLFT LTNLGAPAAF YFDTTIDRKRYTSTi EVLDATLIHQSITGLYETRIDLSQLGGD.
[0077] The Cas9H840A amino acid sequence is (H840A mutation underlined):
SEQ ID NO:6
MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIi NLIGA
LLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMA VDDSFFHRLEESFLVE EDKXHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDi ADLRLIYLALAHMIKFR GHFLIEGDLNPDNSDVDKLFIQLVQTY QLFEENPINASGVDAi AILSARLSKSRRLE
NLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLA QIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKAL VRQQLPEKYKEIFFDQSK GYAGYIDGGASQEEFYKFIB PILEB MDGTEELLVKXNRE DLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLi DNREKIEKILTFRIPYYVGPLA RGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSL LYEYFTVYNELTKVKYVTEGMRKP AFL S GEQKKAI VDLLFKTNRKVT VKQLKED YF KKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDR EMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSG TILDFLKSD GFANRNFMQLIHDDSLTFB EDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKV VDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMi RIEEGIi ELGSQILKEHP VENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLi DDSIDNKV LTRSDKNRGKSDNVPSEEVVKKMK YWRQLLNAKLITQRKFDNLTKAERGGLSELD KAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKXVSDFRKDF QFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIA SE QEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVR KVLSMPQV IVK TEVQTGGFSi ESILPKRNSD LIARKKDWDPKKYGGFDSPTVAY SVLVVAKVE GKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLP KYSLFELENGRKRMLASAGELQKGNELALPS YV FLYLASHYEKL GSPEDNEQK QLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAY KHRDKPIREQAENIIHLFT LTNLGAPAAF YFDTTIDRKRYTSTi EVLDATLIHQSITGLYETRIDLSQLGGD.
[0078] As another example, certain Cas9 mutations can provide a nuclease that is nuclease defective. For example, certain Cas9 mutations can provide a nuclease that does not cleave or nick, or does not substantially cleave or nick the target sequence. Exemplary mutations that reduce or eliminate nuclease activity include one or more mutations in the following locations: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987, or a mutation in a corresponding location in a Cas9 homologue or ortholog. The mutation(s) can include substitution with any natural (e.g., alanine) or non-natural amino acid, or deletion. Cas9 proteins that do not cleave or nick the target sequence can be utilized in combination with an sgRNA, such as one or more of the sgRNAs described herein, to form a complex that is useful for detection of target nucleic acids as further explained below. An exemplary nuclease defective Cas9 protein is Cas9D10A&H840A (Jinek, et al., Science. 2012 Aug 17;337(6096):816-21; Qi, et al., Cell. 2013 Feb 28;152(5):1173-83).
[0079] In some embodiments, the Cas9 protein can be conjugated or fused to a detectable label. In some cases, Cas9 is fused to a fluorescent protein. For example, a fluorescent protein can be fused at the N and/or C-terminus of the Cas9 protein. In some cases, the Cas9 protein is nuclease deficient and labeled. In such cases, the labeled Cas9 protein can be combined with an sgRNA, such as an sgRNA provided herein, to form a complex useful as a detection reagent in a cell.
[0080] In some embodiments, the Cas9 protein can be conjugated or fused to a protein localization signal or a cell penetration peptide. For example, the Cas9 protein can be fused to one or more nuclear localization signals, one or more mitochondrial localization signals, or one or more chloroplast localization signals.
[0081] In some embodiments, expression cassettes are provided for expression of sgRNAs and/or Cas9. In some cases, the expression cassettes are configured to express sgRNAs and/or Cas9 in an in vivo system. Alternatively, the expression cassettes can be configured to express sgRNAs and/or Cas9 in an in vitro system. In some cases, the expression cassette is configured to express one or more sgRNAs and/or Cas9 in the same host cell in which nucleic acid modification or detection is to be performed.
[0082] Expression cassettes as described herein can include a promoter operably linked to a nucleotide encoding sgRNA or Cas9. In some embodiments the expression cassettes also include one or more nuclear localization signals, one or more mitochondrial localization signals, or one or more chloroplast localization signals. The nuclear localization signal can be fused to the N and/or C-terminus of a Cas9 coding sequence. In some cases, additional sequences are included in the expression cassette, such as cell penetration peptides, purification tags, detection labels, or termination sequences. Termination sequences include stop codons, transcriptional termination signals, or polyadenylation signals. Detection labels include fluorescent proteins, such as green fluorescent protein, yellow fluorescent protein, blue fluorescent protein, mCherry, and the like.
[0083] sgRNA expression cassettes can utilize a wide range of suitable promoters as known in the art. In some cases, sgRNA expression cassettes operably link a strong promoter to the sgRNA encoding nucleic acid to provide a high level of expression in a specified expression system or host cell. For example, an RNA polymerase III promoter, such as H1 or U6, can be operably linked to an sgRNA encoding nucleic acid for in vivo expression in a mammalian cell. Other suitable mammalian promoters can also be utilized. Exemplary suitable promoters for sgRNA expression are provided in, e.g., www.addgene.org/CRISPR/.
[0084] Cas9 expression cassettes can utilize a wide range of suitable promoters as known in the art. In some cases, the Cas9, or modified Cas9, utilized in the host cell is prone to aggregation or mislocalization. In some cases, Cas9 expression cassettes operably link a weak promoter to the Cas9 encoding nucleic acid to provide a low level of expression in the host cell. In some cases, the weak expression level mitigates, ameliorates, or eliminates the aggregation and/or mislocalization of Cas9 in the host cell. Alternatively, the Cas9 is operably linked to a strong promoter to provide reagent quantities of Cas9 from an in vitro or in vivo expression system. Reagent quantities of Cas9 can then be purified from the expression system for subsequent use in a cell as described below. Suitable strong promoters include promoters described in, e.g., www.addgene.org/CRISPR/. Suitable weak mammalian promoters include the ubiquitin C promoter, the phosphoglycerate kinase 1 promoter, and the un-induced TetOn promoter. In some cases, the weak or strong promoters are respectively weaker or stronger than the elongation factor 1α (EF1α) promoter, which is known to provide a moderate level of expression in a broad range of mammalian cell types.
[0085] Also provided are expression systems for expression of reagent quantities of sgRNA and/or Cas9. In some cases, the expression systems include an expression cassette containing a nucleic acid encoding an sgRNA. In some cases, the expression systems include an expression cassette containing a nucleic acid encoding a Cas9 protein. In some cases, the expression systems contain one or more of the foregoing expression cassettes and a host cell or cell-lysate for generating reagent quantities of sgRNA and/or Cas9. The expression cassettes can be introduced into the host cell or cell-lysate and incubated for production of sgRNA or Cas9. The sgRNA and/or Cas9 product(s) can be purified and used in the methods of the present invention.
[0086] Also provided are host cells containing nucleic acid detection or modification reagents. In some cases, the host cells contain one or more sgRNAs, one or more sgRNA-mediated nucleases, or one or more sgRNA:nuclease complexes. In some cases, the host cells are transfected with a suitable vector containing one or more expression cassettes encoding an sgRNA and/or Cas9 protein. Suitable vectors include any vectors that are known in the art and capable of transferring nucleic acid to a host cell. Suitable vectors include, but are not limited to, one or more of the following viral vectors: an oncoviral vector, a foamy viral vector, a lentiviral vector, a feline immunodeficiency viral vector, a Moloney murine leukemia virus (MoMLV) vector, a vaccinia viral vector, a polioviral vector, an adenoviral vector, an adeno-associated viral vector, an SV40 viral vector, a herpes simplex viral vector, an HIV viral vector, a spleen necrosis viral vector, a Rous Sarcoma viral vector, a Harvey Sarcoma viral vector, an avian leucosis viral vector, a myeloproliferative sarcoma viral vector, or a mammary tumor viral vector.
[0087] In some cases, the sgRNA, Cas9, or sgRNA:nuclease complexes are introduced into a host cell. Suitable methods for introduction of the RNA, protein, or complex are known in the art and include, for example, electroporation; calcium phosphate precipitation; or PEI, PEG, DEAE, nanoparticle, or liposome mediated transformation. Other suitable transfection methods include direct micro-injection. In some cases, the sgRNA and Cas9 are introduced separately and the sgRNA:nuclease complexes are formed in the cell. In other cases, the sgRNA:nuclease complexes are formed and then introduced into the cell. In some cases, multiple, differentially labeled, sgRNA:nuclease complexes, each directed to a different nucleic acid target or nucleic acid target region, are formed and then introduced into the cell.
IV. Methods
[0088] Methods for detecting a target nucleic acid sequence are described herein. In some embodiments, the methods utilize a nuclease defective Cas9 protein. For example, a complex between an sgRNA and a Cas9 protein (e.g., a nuclease defective Cas9) can be formed and contacted with a nucleic acid containing the target sequence. In some cases, the nucleic acid resides in a host cell, e.g., a living host cell. For example, sgRNA and Cas9 can be expressed in, or otherwise introduced into, a host cell, and the sgRNA and Cas9 can form an sgRNA:nuclease complex and localize to the target nucleic acid sequence. Alternatively, sgRNA can be expressed in the host cell and Cas9 introduced using a method known in the art for protein transfection. For example, Cas9 fused to a cell penetration peptide can be contacted with the sgRNA containing cell. As yet another alternative, Cas9 can be expressed in the cell and sgRNA introduced into the cell using any method known in the art for RNA transfection. As yet another alternative, sgRNA:nuclease complexes can be formed and introduced into the cell.
[0089] sgRNA:nuclease complexes can localize to a target nucleic acid sequence that is complementary, or substantially complementary, to the binding region of the sgRNA. Such localized sgRNA:nuclease complexes can be detected to detect the target nucleic acid sequence, or a region containing the target nucleic acid. For example, the nuclease and/or the sgRNA can be labeled and detection of the localized label thereby detects the target nucleic acid sequence. In some cases, multiple different sgRNAs, each corresponding to a different target sequence in a nucleic acid region, can be used to amplify the signal. For example, 5, 10, 15, 20, 25, 30, or more sgRNAs can be designed that each bind to a different sequence corresponding to a specified gene or chromosomal region. These sgRNAs and a labeled Cas9 can be introduced into a host cell and their localization detected. The presence of a strong and localized nuclear signal indicates the presence of the target gene or chromosomal region. Alternatively, a small number of sgRNAs (e.g., 5, 4, 3, 2, or 1) can be designed to bind to a repetitive sequence located in a target gene or chromosomal region and, e.g., introduced into a cell to form an sgRNA:nuclease complex. The sgRNA:nuclease complex can localize to the repeat region and provide a strong signal.
[0090] In some embodiments, an sgRNA:nuclease complex can be used to enhance other chromosomal detection methods. For example, an sgRNA targeted to a nucleic acid sequence or region of interest and a Cas9 protein can be introduced into a cell. The cell can be incubated to allow formation and localization of the sgRNA:nuclease complex. Helicase activity of the sgRNA:nuclease complex can then relax the target nucleic acid and surrounding regions. The cells can then be fixed and stained using standard fluorescence in situ hybridization (FISH) protocols. The relaxed regions can then be more readily and routinely detected by FISH. In some cases, the sgRNA:nuclease complexes used to enhance other detection methods such as FISH are labeled. In other cases, the sgRNA:nuclease complexes used to enhance other detection methods are not labeled. Labeled sgRNA:nuclease complexes can provide for co-localization detection or detection of multiple sequences or regions in addition to, or in combination with, the other detection method (e.g., FISH).
[0091] Similarly, an sgRNA:nuclease provided herein can be used to enhance any methods known in the art that rely on access to chromosomal DNA. For example, genome editing with TALENs or Zinc Finger nucleases can be enhanced by targeting sgRNA:nucleases (e.g., nuclease active, nuclease deficient, or nuclease defective) to a region at or near the TALEN or Zinc Finger nuclease target site. Similarly, activators or repressors can be targeted to regions at or near sgRNA:nuclease target sites. In some cases, the helicase activity of the sgRNA:nuclease unwinds or relaxes the target nucleic acid region, or creates a more open chromosomal structure, providing access to other tools known in the art for genome editing, labeling, or transcriptional regulation.
[0092] In some embodiments, the methods described herein provide diagnostics for genetic diseases. For example, the methods can be used to detect chromosomal number, chromosomal rearrangements, mutations, insertions, deletions, and the presence or absence of a gene or region of a gene.
[0093] Methods for modifying a target nucleic acid sequence are described herein. In some embodiments, the methods utilize a nuclease deficient Cas9 protein. In other embodiments, the methods utilize a Cas9 protein that catalyzes double stranded cleavage of the target nucleic acid. For example, a complex between an sgRNA and a Cas9 protein (e.g., a nicking Cas9 protein or a Cas9 protein capable of double stranded cleavage) can be formed and contacted with a nucleic acid containing the target sequence. In some cases, the nucleic acid resides in a host cell, e.g., a living host cell. For example, sgRNA and Cas9 can be expressed in a host cell, and the sgRNA and Cas9 can form an sgRNA:nuclease complex and localize to the target nucleic acid sequence. Alternatively, sgRNA can be expressed in the host cell and Cas9 introduced using a method known in the art for protein transfection. For example, Cas9 fused to a cell penetration peptide can be contacted with the sgRNA-containing cell. As yet another alternative, Cas9 can be expressed in the cell and sgRNA introduced into the cell using any method known in the art for RNA transfection.
[0094] sgRNA:nuclease complexes can localize to a target nucleic acid sequence that is complementary, or substantially complementary, to the binding region of the sgRNA. The localized sgRNA:nuclease complexes can then nick or cleave the target sequence. In some cases, a pair of sgRNA:nuclease complexes are utilized that flank a nucleic acid region of interest. The nicking or cleaving of target nucleic acid sequences that flank the region of interest can then activate repair processes in the host cell. For example, non-homologous end joining (NHEJ) of flanking double stranded breaks can cause deletion of at least a portion of the flanked region of interest. Alternatively, homologous DNA repair (HDR) can cause incorporation of a homologous region and lead to loss of heterozygosity in the host cell.
[0095] As yet another alternative, a heterologous nucleic acid can be introduced into the cell with regions of homology corresponding to at least a portion of the nucleic acid region of interest and/or one or more target nucleic acid sequences. Thus, HDR can incorporate at least a portion of the heterologous nucleic acid into the flanked region of interest. In some cases, the heterologous nucleic acid can contain one or more selectable markers or detectable labels to assay for successful incorporation or select for cells that have incorporated the nucleic acid. For example, the heterologous nucleic acid can encode a detectable or selectable protein, e.g., an enzyme, transcription factor, or binding protein.
V. Kits
[0096] Kits are described herein for modifying or detecting nucleic acids. For example, the kits can be used for detecting nucleic acids in a living cell. In some embodiments, the kits contain a labeled nuclease defective sgRNA and an sgRNA-mediated nuclease. In other embodiments, the kits contain a nuclease deficient, or a nuclease competent sgRNA-mediated nuclease and an sgRNA. In some cases, the kits contain multiple sgRNAs directed to different target nucleic acid sequences. In some cases, the kits contain multiple labeled sgRNA-mediated nucleases. In some embodiments, the kits contain one or more expression cassettes containing nucleic acid encoding one or more of the sgRNAs and/or sgRNA-mediated nucleases described herein.
[0097] Further uses of the methods, compositions, or kits described herein include one or more of the following: genome editing, transcriptional or epigenetic regulation, genome imaging, copy number analysis, analysis of living cells, detection of highly repetitive genome sequence or structure, detection of complex genome sequences or structures, detection of gene duplication or rearrangement, enhanced FISH labeling, unwinding of target nucleic acid, large scale diagnostics of diseases and genetic disorders related to genome deletion, duplication, and rearrangement, use of an RNA oligo chip with multiple unique sgRNAs for high-throughput imaging and/or diagnostics, multicolor differential detection of target sequences, identification or diagnosis of diseases of unknown cause or origin, and 4-dimensional (e.g., time-lapse) or 5-dimensional (e.g., multicolor time-lapse) imaging of cells (e.g., live cells), tissues, or organisms.
EXAMPLES
[0098] The following examples are provided by way of illustration only and not by way of limitation. Those of skill in the art will readily recognize a variety of non-critical parameters that could be changed or modified to yield essentially the same or similar results.
Example 1: Dynamic Imaging of the Genome in Living Human Cells Via CRISPR
[0099] Introduction
[0100] Understanding the dynamic organization and function of genomes requires methods for sequence-specific imaging of genetic elements in living cells. In human cells, the functional output of a genome is strongly determined by its dynamic spatial conformation and interaction with other RNA and protein factors (Misteli, 2007; Misteli, 2013), yet methods for robustly detecting such dynamic information are missing. Artificially synthesized nucleic acid probes and engineered DNA-binding proteins have allowed in situ detection of genetic elements such as telomeres. While fluorescence in situ hybridization (FISH) using fluorescent probes is a rapid method for detecting specific DNA sequences on chromosomes based on Watson-Crick complementarity (Pardue, et al., 1969), FISH requires chemical fixation of cells, precluding its use for live cell imaging. Unlike nucleic acid probes, DNA-binding proteins fused with fluorescent proteins have enabled robust dynamic imaging of chromosomes in living cells (Robinett, et al., 1996; Wang, et al., 2008). Due to either fixed DNA sequence binding requirements or limited native DNA-binding proteins, however, it remains challenging to use DNA-binding proteins for imaging arbitrary genes and genetic elements. We argue that more powerful dynamic genome imaging approaches require methods that could leverage the ease of Watson-Crick base-pairing as seen in nucleic acid probes and the robustness of DNA-binding proteins.
[0101] The bacterial genetic immune system CRISPR (clustered regularly interspaced short palindromic repeats) provides a natural example for DNA targeting using both a DNA-binding protein and a small base-pairing RNA (Barrangou, et al., 2007; Wiedenheft, et al., 2012). The minimal type II CRISPR system derived from Streptococcus pyogenes recognizes specific DNA sequences via a small guide (sg) RNA-mediated nuclease, Cas9 (Deltcheva, et al., 2011; Jinek, et al., 2012). Upon DNA binding, the Cas9-sgRNA complex causes DNA double-stranded breaks (Jinek, et al., 2012). Harnessing this unique RNA-guided nuclease activity, recent work has demonstrated the use of CRISPR for genome editing in a broad range of organisms (Cong, et al., 2013; Mali, et al., 2013; Wang, et al., 2013). Furthermore, a repurposed nuclease-deactivated Cas9 (dCas9) protein has been used to regulate endogenous gene expression by controlling the RNA polymerase activity or by modulating promoter accessibility when fused with transcription factors (Qi, et al., 2013; Gilbert, et al., 2013). Beyond using CRISPR for gene editing or regulation, we hypothesized that the system could offer a promising platform for in situ dynamic imaging of genetic elements in living cells.
[0102] Results and Discussion
[0103] To engineer the CRISPR system for imaging endogenous genetic elements, we fused the dCas9 protein to Enhanced Green Fluorescent Protein (EGFP). In this way, complementary sgRNAs will direct dCas9-EGFP to the targeted genomic loci (Fig. 1A). To enrich the signal over the background of unbound dCas9-EGFP, we targeted multiple dCas9-EGFP proteins to a given locus. To reduce the levels of free dCas9-EGFP that contribute to background, we expressed dCas9-EGFP from the Tet-On 3G promoter without doxycycline induction (Fig. 1B). To better localize the dCas9-EGFP protein to the nucleus for genome targeting, we tested different fusions carrying two copies of a nuclear localization signal (NLS) sequence (Fig. 2), and used a resulting fully nuclear localized version (# 4) for subsequent imaging experiments.
[0104] To test if the CRISPR system can detect non-coding genetic elements, we imaged human telomeres, specialized chromatin structures composed of TTAGGG repeats 5 to 15 kilobase pairs (kb) in length (Moyzis, et al., 1988). We designed an sgRNA (sgTelomere) that contains a 22-nt region complementary to the telomere sequence, expressed from a murine polymerase III U6 promoter (Fig. 1B & Table 1). The sgRNA also contained a redesigned stem-loop hairpin structure and a S. pyogenes terminator sequence, following previous studies (Jinek, et al., 2012; Qi, et al., 2013) (Fig. 1C). We created a clonal RPE cell line for stable dCas9-EGFP expression using lentiviruses containing the dCas9-EGFP expression cassette. This cell line was subsequently infected by sgTelomere-containing lentivirus, and imaged for telomere detection 48 h post-infection. Most cells (80%) showed 10 to 40 puncta with bright nuclear areas resembling nucleoli (Fig. 1D). The observed number of puncta was substantially lower than numbers expected for telomeres in human cells, suggesting that the system was sub-optimal.
Table 1: Protospacer sequences of telomeres, MUC1 and MUC4 (shaded: PAM)
MUC4 exon cctcagcatccacaggtcacgccac
MUC4 intron gaaggtatgggtgtggaaggtatlii
MUC4 intron gtgtggaaggtatggg
[0105] Previous work has suggested that expression levels of the chimeric guide RNA limit CRISPR/Cas9 function in human cells. To improve the system for more effective genome imaging, we modified the sgRNA design to increase its expression level and assembly with the dCas9 protein. It has been shown that 4 or more consecutive uridine (U) residues could pause or terminate Pol-III transcription (Nielsen, et al., 2013). To improve the sgRNA expression levels, we removed the consecutive 4 U's by A-U base pair flipping or added a polymerase III SINE element polyadenylation signal sequence at the 3' end (Fig. 3). We first tested these new sgRNAs using the CRISPR interference (CRISPRi) method (Qi, et al., 2013). We transfected each modified sgRNA targeted to an EGFP reporter into cells stably expressing EGFP and dCas9-KRAB as previously described. One A-U flip (# 6) enhanced repressive activity. To improve the assembly of sgRNA and dCas9, we extended the stem-loop hairpin structure that binds to dCas9. A 5-bp extension at the top of the hairpin (# 8) increased repressive activity. Interestingly, a design combining both modifications (# 10) further increased the repression (Fig. 3).
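By way of illustration only, the sequence criterion discussed above can be checked in silico. The following Python sketch locates runs of four or more consecutive uridines, which are described above as able to pause or terminate Pol-III transcription, and swaps an A-U base pair to break such a run. The function names, the 16-nt toy hairpin, and the paired positions (4 and 11) are hypothetical and chosen for the example; they are not taken from the sequences of this disclosure.

```python
import re


def poly_u_runs(rna, min_run=4):
    """Return (start, length) for each run of >= min_run consecutive U's."""
    seq = rna.upper().replace("T", "U")  # accept DNA-style input as well
    pattern = "U{%d,}" % min_run
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, seq)]


def flip_au_pair(rna, i, j):
    """Swap the bases at paired positions i and j, which must hold an A-U pair.

    Swapping both partners keeps the base pair (and so the stem) intact
    while changing which strand of the stem carries the U.
    """
    seq = list(rna.upper().replace("T", "U"))
    if {seq[i], seq[j]} != {"A", "U"}:
        raise ValueError("positions i and j must hold an A-U base pair")
    seq[i], seq[j] = seq[j], seq[i]
    return "".join(seq)


# Hypothetical toy hairpin: stem positions 0-5 pair with 15-10, loop is GAAA.
toy_hairpin = "GUUUUAGAAAUAAAAC"
flipped = flip_au_pair(toy_hairpin, 4, 11)

print(poly_u_runs(toy_hairpin))  # one U run of length 4 starting at index 1
print(flipped, poly_u_runs(flipped))  # no run of 4 U's remains after the flip
```

In this sketch the paired positions are supplied by hand; choosing them for a real sgRNA would require knowledge of its secondary structure.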
[0106] We tested three improved sgRNA designs for telomere imaging. Consistently, these sgRNA designs increased puncta numbers and decreased background and nucleolar signals (Fig. 4). The best design (F+E) combined both A-U flip and hairpin extension (Fig. 1C) and doubled the observable telomere numbers and increased the signal-to-background ratio by 5-fold (Fig. 1D & 1E). This sgRNA design also enabled robust imaging at lower lentiviral titers (Fig. 5).
[0107] To verify that the observed puncta were indeed telomeres, we performed FISH using telomere-specific Cy5-labeled telomeric repeat oligo probes in CRISPR-labeled cells and observed spatial co-localization (Fig. 6A). We also used an antibody to stain endogenous TRF2, a protein in the shelterin complex that binds to the telomeric DNA repeats (Griffith, et al., 1999), in CRISPR-labeled cells. In all cells, TRF2 puncta co-localized with CRISPR labeling. The co-localization of TRF2 and CRISPR puncta also suggests that dCas9 binding does not disrupt the telomere structure. To further check the integrity of telomeres with dCas9 binding, we used an antibody to image the localization of 53BP1, a protein recruited to damaged DNA sites (d'Adda di Fagagna, et al., 2003). We saw a very mild increase in DNA damage at telomeres, which was orders of magnitude less than major telomere disruptions such as disassembly of the shelterin complex by shRNA-mediated depletion of TIN2 (Fig. 7) (Kim, et al., 1999).
[0108] To examine whether the puncta intensity in CRISPR imaging correlates with telomere length, we imaged telomeres in the UMUC3 human bladder cancer cell line, wherein telomere length can be conditionally elongated by transfection with a human telomerase RNA (hTR) gene (Vaziri, et al., 1998). Six days after hTR lentiviral infection, we compared UMUC3 cells with and without transduction. CRISPR imaging showed brighter signals with hTR, consistent with the expected increase of the telomere length (Fig. 6B & 6C). Our results suggest that CRISPR imaging can detect telomere length changes by measuring fluorescence intensity.
[0109] Next, we used CRISPR to image endogenous protein-coding genes. Specifically, we chose the MUC4 gene, which encodes a glycoprotein important for protecting mucus in diverse epithelial tissues and tumor formation (Hollingsworth, et al., 2004). The MUC4 gene contains a region in the coding sequence with a variable number (> 100) of 48-bp tandem repeats in the second exon (Fig. 8A) (Nollet, et al., 1998). To image the MUC4 gene, we designed three sgRNAs (Table 1) targeting this repeat sequence (sgMUC4-E1, E2, E3). We observed that the labeling efficiency depended on the target site. The best one, sgMUC4-E3, showed labeling of 2 or more puncta in 100% of cells (Fig. 8B). The labeling was highly efficient, and the original sgRNA design without modifications was sufficient for imaging. The MUC4 gene also contains an array of 15-bp tandem repeats in the third intron. As previous studies have shown that sgRNA binding requires a minimal ~12-bp DNA complementary region, we designed two sgRNAs with different lengths of base pairing (23 bp and 12 bp) to the repeat sequence (Table 1). In this case, the F+E modifications of the sgRNA designs greatly enhanced the imaging efficiency (Fig. 8C). Surprisingly, although the binding affinity of the shorter 12-bp sgMUC4-I2 is predicted to be lower than that of the longer 23-bp sgRNA (Qi, et al., 2013), sgMUC4-I2 showed a higher imaging efficiency. This implies that the copy number of bound sgRNA-dCas9-EGFP is more important than the binding affinity of individual sgRNAs.
[0110] Interestingly, we saw 3 labeled MUC4 loci in the majority of RPE cells using both methods. As the MUC4 gene is located on chromosome 3, we measured the ploidy of our cells using FISH targeting two different regions on chromosome 3 as well as whole-cell karyotype analysis (Fig. 9). Our results confirmed that the RPE cell line that we used is trisomic for chromosome 3, suggesting that CRISPR imaging is capable of detecting gene copy numbers in living cells. While we saw 1 to 2 puncta using oligo FISH, CRISPR imaging using sgRNAs targeting the MUC4 exon showed statistically more puncta. The best one, sgMUC4-E3, showed 3 or 6 labeled spots in 90% of the cells (Fig. 8D), implying all possible sites were labeled. The observation of 6 spots is likely due to chromosome replication. CRISPR labeling with two sgRNAs targeting the MUC4 intron gave similar results (Fig. 8E). To confirm that the observed puncta labeled with sgMUC4-E3 were indeed MUC4 loci on different chromosomes, we performed Cy5-labeled Oligo FISH labeling and observed that all three spots were co-labeled (Fig. 8F & Table 2).
Table 2: Oligo FISH probes
[0111] To verify the generality of CRISPR imaging, we used sgTelomere and sgMUC4-E3 to image telomeres and MUC4 loci in HeLa cells. In both cases, we observed strong labeling of the target genomic loci (Fig. 10). We similarly observed three copies of MUC4 in our HeLa cells, and FISH experiments further confirmed that the cells contain a trisomy of chromosome 3. We also designed sgRNAs to target the repetitive elements in the MUC1 gene (Fig. 11 & Table 1) (Gendler, et al., 1990). We observed distinct multiple MUC1 loci in the RPE cells, which was further confirmed by co-labeling with FISH. It is worth noting that CRISPR generally exhibits a higher labeling efficiency than Oligo FISH (Fig. 12).
[0112] Most genes and genetic elements in the human genome contain non-repetitive sequences. To test if CRISPR labeling could detect non-repetitive genetic elements, we designed 73 sgRNAs that target both DNA strands spanning a 5-kb region in the first intron of the MUC4 gene (Table 3). We produced lentiviral cocktails, each containing 5 to 6 sgRNAs, and infected different numbers of sgRNAs (16, 26, 36 or 73) into RPE cells while maintaining the same total virus dosage. We observed 1 to 3 MUC4 loci using 36 or 73 sgRNAs, but no detectable puncta when using 16 sgRNAs (Fig. 8G & 8H). This suggests that to detect a non-repetitive genome sequence locus using CRISPR, currently 30 or more sgRNAs are required.
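As a non-limiting sketch of how candidate target sites on both DNA strands of a region might be enumerated, the Python code below scans a sequence and its reverse complement for protospacers immediately followed by an NGG PAM, the motif recognized by S. pyogenes Cas9. The function names, the default 20-nt protospacer length, and the short example sequence are assumptions made for illustration; the code does not reproduce how the 73 sgRNAs described above were actually selected.

```python
def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T/N DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(dna.upper()))


def find_protospacers(dna, spacer_len=20):
    """List (strand, start, protospacer, pam) for each site followed by NGG.

    `start` is a 0-based offset on the strand that was scanned; for the
    "-" strand it therefore refers to the reverse-complemented sequence.
    """
    hits = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        # The last usable start leaves room for the protospacer plus 3-nt PAM.
        for i in range(len(seq) - spacer_len - 2):
            pam = seq[i + spacer_len:i + spacer_len + 3]
            if pam[1:] == "GG":  # first PAM base (N) may be any nucleotide
                hits.append((strand, i, seq[i:i + spacer_len], pam))
    return hits


# Toy example with a shortened protospacer so the input stays readable.
for hit in find_protospacers("ACGTACGGTT", spacer_len=5):
    print(hit)
```

A practical design workflow would additionally filter candidates, for example for uniqueness in the genome, which this sketch does not attempt.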
Table 3: Protospacer sequences in the MUC4 non-repetitive region for CRISPR imaging, shaded background, PAM
[0113] We also co-labeled the MUC4 locus using 73 intron-targeting sgRNAs and the exon-targeting sgMUC4-E3. The distance between the two labeling sites is 15 kb. In 20% of cells, we observed two proximal spots co-localized together (Fig. 8G), whose distance likely reflects the local organization of the chromatin structure. Furthermore, we co-labeled both MUC1 and MUC4 using two sgRNAs, and observed 45% of cells showing more than 6 puncta compared to 20% of cells showing 6 puncta using only sgMUC4 (Fig. 13). These results suggest that it should be possible to perform multicolor imaging of multiple genomic loci using orthogonal Cas9 proteins fused to distinct fluorescent proteins.
[0114] CRISPR imaging offers a unique non-invasive platform for tracking the dynamics of genetic elements in living cells. We performed high-frequency (0.2 s per frame) time-lapse microscopy to track telomere dynamics in living cells over 40 s (Fig. 14A). Consistent with previous results using GFP-fused TRF1 (Wang, et al., 2008), a native telomere binding protein, CRISPR-imaging analysis revealed similar movement speed and two-dimensional displacement of telomeres (Fig. 14B-D). Furthermore, long telomeres, as defined by higher fluorescent surface area, showed slower movement speeds. We also tracked individual MUC4 loci movement (Fig. 14E). Our analysis discerned two distinct movement modes: confined diffusion and active transport of confined diffusion (Fig. 14F). 80% of detected MUC4 loci followed confined diffusion, suggesting the movement of local chromatin is modulated by nuclear factors (Fig. 14G). The median speed was 0.011 μm²/s as defined by diffusion coefficient (Fig. 14H). This demonstrates the power of using CRISPR to directly visualize the local movement and the segregation process of chromosomes during cell division.
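For illustration, one common way to summarize such trajectories is the mean squared displacement (MSD). The Python sketch below computes a time-averaged MSD from a list of (x, y) positions and estimates an apparent diffusion coefficient from the first lag, assuming the free two-dimensional diffusion relation MSD = 4·D·Δt. The function names and the toy track are hypothetical, the 0.2 s frame interval is taken from the acquisition described above, and the analysis actually used for Fig. 14 is not specified here and may differ.

```python
def mean_squared_displacement(track, max_lag):
    """Time-averaged MSD of a 2D track for lags 1..max_lag (in frames).

    `track` is a list of (x, y) positions in micrometers, one per frame;
    max_lag must be smaller than the number of frames.
    """
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


def apparent_diffusion_coefficient(msd_first_lag, frame_interval_s):
    """Estimate D (um^2/s) from the first-lag MSD, assuming MSD = 4 * D * dt."""
    return msd_first_lag / (4.0 * frame_interval_s)


# Hypothetical toy track (micrometers), imaged at 0.2 s per frame.
toy_track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
msd = mean_squared_displacement(toy_track, max_lag=2)
d_app = apparent_diffusion_coefficient(msd[0], frame_interval_s=0.2)
print(msd, d_app)
```

An MSD curve that plateaus at longer lags would be consistent with confined diffusion, whereas one that keeps growing faster than linearly would point to directed motion, as in the deliberately linear toy track.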
[0115] The ability to use CRISPR sgRNAs for sequence-specifically visualizing genetic elements in living cells defines a new class of genome imaging tools. In this study, we have shown that a nuclease-deactivated S. pyogenes Cas9 fused with fluorescent proteins can be used to flexibly image both repetitive and non-repetitive genetic elements, which can be potentially applied to image any genomic sequence of interest. Our technology relies on RNA-directed local enrichment of fluorescence signals, which robustly filters the off-target effects of the CRISPR system (Hsu, et al., 2013). The use of orthogonal Cas9 proteins should further allow multiplexed detection of multiple genetic events. The method is capable of tracking the dynamic movement of an arbitrary genomic region in living cells, opening doors to a fuller understanding of how genomes are organized in vivo and how they are dynamically regulated in the cell nucleus. Combined with technologies using CRISPR for gene editing and regulation, the CRISPR imaging technology here offers a universal platform for using RNAs to modify, modulate and label human genomes.
[0116] Supplemental Methods
Plasmid construction
[0117] The DNA sequence encoding the Cas9 nuclease harboring inactivating D10A and H840A substitutions (dCas9), derived from Streptococcus pyogenes, was fused with EGFP and two SV40 nuclear localization sequences (NLS) at different positions. Using standard ligation-independent cloning, we cloned these three fusion proteins into the TRE response vector with a PTRE3G promoter (Tet-on 3G inducible expression system, Clontech). sgRNAs were cloned into a lentiviral U6-based expression vector derived from pSico, which coexpresses mCherry from a CMV promoter. The sgRNA (old design) expression plasmids were cloned by inserting annealed primers into the vector digested by BstXI and XhoI. The optimized sgRNA expression constructs were directly ordered as gBlocks (IDT) and cloned into the lentiviral U6-based expression vector digested by XbaI/BamHI.
[0118] The 73 sgRNAs targeting the non-repetitive sequence of MUC4 were cloned into the same sgRNA expression vector by amplifying the insertions using the same reverse primer but different forward primers containing the unique spacer sequence right after the BstXI site, with the optimized sgRNA as the template. The PCR fragments were cloned into the vector by BstXI and XhoI.
Cell culture
[0119] Human embryonic kidney (HEK) cell line 293T, human bladder cancer cell line UMUC3 and HeLa cells were maintained in Dulbecco's modified Eagle medium (DMEM) with high glucose in 10% FBS. Human retinal pigment epithelium (RPE) cells were maintained in DMEM with GlutaMAX1 in 10% FBS. All cells were maintained at 37 °C and 5% CO2 in a humidified incubator.
Viral production and stable expression of dCas9 and sgRNA
[0120] For viral production, 293T cells were seeded into a T75 flask one day prior to transfection. 1 μg of pMD2.G plasmid (virus envelope plasmid), 8 μg of pCMV-dR8.91 (virus packaging plasmid) and 9 μg of the lentivector (Tet-on 3G, dCas9-EGFP, GFP-TRF1, sgRNA or TIN-2 shRNA) were cotransfected into 293T cells using FuGENE (Promega) following the manufacturer's recommended protocol. Virus was harvested 48 hours post-transfection. For virus infection, cultured cells were incubated with culture medium-diluted virus supernatant supplemented with 5 μg/ml polybrene for 12 hours.
[0121] RPE, UMUC3 and HeLa cell lines stably expressing dCas9-EGFP were generated by infecting the cells with lentivirus that co-packaged the two expression vectors of dCas9-EGFP and transactivator protein. Clonal cells expressing dCas9-EGFP in the inducible system for each cell line were generated by picking a single cell colony. The clones with low basal level expression of dCas9-EGFP were selected for CRISPR imaging. To label telomeres and the repetitive regions of MUC1 or MUC4, the selected clonal cell lines were infected with 100 μl lentivirus with individual sgRNAs in each well of an 8-well chambered cover glass. To label the non-repetitive region of MUC4, five or six sgRNA plasmids were co-packaged in the same lentivirus. The dCas9-expressing cells were infected with a mixture of lentivirus including 16, 26, 36 or 73 sgRNAs.
Immunostaining
[0122] Cells were fixed in 4% paraformaldehyde, permeabilized with 0.5% NP-40 in phosphate buffered saline (PBS) for 10 minutes, washed with PBS for 5 minutes, blocked in 0.2% cold water fish gelatin and 0.5% bovine serum albumin (BSA) for 20 minutes, incubated with the primary antibody in blocking buffer at 4 °C overnight, washed three times and then incubated with Alexa647-conjugated secondary antibody at room temperature for 1 hour, washed again and stained with DAPI. Primary antibodies used in this study were anti-TRF2 (E-20, sc-32106, Santa Cruz Biotechnology) and anti-53BP1.
Fluorescence in situ hybridization (FISH) and IF with PNA FISH
[0123] Oligo FISH was performed according to standard protocols. Briefly, cells were fixed with 4% paraformaldehyde, incubated with 0.7% Triton X-100, 0.1% Saponin in 2×SSC for 30 min at RT for permeabilization of the nuclear membrane, washed with 2×SSC twice, treated with RNase A at 37 °C for 1 h, washed again with 2×SSC and equilibrated with PBS for 5 min before dehydration by consecutive 5 min incubations in 70%, 85% and 100% ethanol. After air drying, cells were heated at 80 °C for 5 min in 70% formamide/2×SSC, then washed using an ethanol series (ice cold; 70, 80, 95%). After air drying, Cy3- or Cy5-labeled Oligo FISH probe (around 20 bp; each oligo has one molecule of Cy3 or Cy5 conjugated to its 5' end) in hybridizing solution (10% dextran sulfate, 50% formamide, 500 ng/ml Salmon sperm DNA in 2×SSC buffer), at a final concentration of 2 ng/μl, was added to the sample and incubated overnight at 37 °C. After hybridization, cells were washed three times with 2×SSC and finally stained with DAPI.
[0124] For immunofluorescence (IF) with peptide nucleic acid (PNA) FISH, after incubation with primary and secondary antibodies, telomere PNA-FISH was performed as described in Diolaiti et al (beginning after the pepsin treatment).
Optical setup and image acquisition
Cells were seeded onto 8-well Lab-Tek II chambered cover glass 24 hours before lentivirus infection. The imaging of telomeres and Mucin genes was performed 48 hours post-infection of sgRNAs. Fixed samples were imaged at a frame rate of 5 Hz with 50 frames. The final image was generated by averaging the 50 frames. To make a projection image, a 3 μm Z-stack at 0.4 μm steps was acquired. Right before live cell imaging, the culture medium was replaced with medium without phenol red. Images were recorded at a frame rate of 5 Hz. 600 frames were acquired for the dynamic analysis. During the imaging session, the temperature was maintained at 37 °C with an environmental chamber.
Imaging analysis
To quantify the colocalization of 53BP1 and PNA-FISH at the telomeres, images were taken on a DeltaVision deconvolution microscope (Applied Precision/GE) with a 100x 1.40 NA Plan Apo objective (Olympus). Identification, measurement of areas and intensity, and colocalization analysis of telomeres and 53BP1 foci were performed in CellProfiler (Carpenter 2006, specific pipelines available upon request); quantification and statistical comparison were carried out using custom Python software.
[0125] References
1. T. Misteli, Cell 128, 787 (2007).
2. T. Misteli, Cell 152, 1209 (2013).
3. M. L. Pardue, J. G. Gall, Proceedings of the National Academy of Sciences of the United States of America 64, 600 (1969).
4. C. C. Robinett et al., The Journal of cell biology 135, 1685 (1996).
5. X. Wang et al., Epigenetics & chromatin 1, 4 (2008).
6. R. Barrangou et al., Science 315, 1709 (2007).
7. B. Wiedenheft, S. H. Sternberg, J. A. Doudna, Nature 482, 331 (2012).
8. E. Deltcheva et al., Nature 471, 602 (2011).
9. M. Jinek et al., Science 337, 816 (2012).
10. L. Cong et al., Science 339, 819 (2013).
11. P. Mali et al., Science 339, 823 (2013).
12. H. Wang et al., Cell 153, 910 (2013).
13. L. S. Qi et al., Cell 152, 1173 (2013).
14. L. A. Gilbert et al., Cell 154, 442 (2013).
15. R. K. Moyzis et al., Proceedings of the National Academy of Sciences of the United States of America 85, 6622 (1988).
16. S. Nielsen, Y. Yuzenkova, N. Zenkin, Science 340, 1577 (2013).
17. J. D. Griffith et al., Cell 97, 503 (1999).
18. F. d'Adda di Fagagna et al., Nature 426, 194 (2003).
19. S. H. Kim, P. Kaminker, J. Campisi, Nature genetics 23, 405 (1999).
20. H. Vaziri, S. Benchimol, Current biology : CB 8, 279 (1998).
21. M. A. Hollingsworth, B. J. Swanson, Nature reviews. Cancer 4, 45 (2004).
22. S. Nollet et al., The Biochemical journal 332 (Pt 3), 739 (1998).
23. S. J. Gendler et al., The Journal of biological chemistry 265, 15286 (1990).
24. P. D. Hsu et al., Nature biotechnology 31, 827 (2013).
[0126] All patents, patent applications, and other publications, including GenBank Accession Numbers, cited in this application are incorporated by reference in the entirety for all purposes.
Claims
WHAT IS CLAIMED IS:
1. A small guide RNA molecule comprising from 5' to 3':
· a binding region, comprising between about 5 and about 50 nucleotides;
· a 5' hairpin region, comprising:
o fewer than four consecutive uracil nucleotides; or
o a length of at least 31 nucleotides; and
· a 3' hairpin region; and
· a transcription termination sequence,
wherein the small guide RNA is configured to form a complex with a small guide RNA-mediated nuclease, the complex having increased stability or activity relative to a complex containing a small guide RNA-mediated nuclease and a small guide RNA comprising at least 95% identity to SEQ ID NO: 1 or a complement thereof.
2. The small guide RNA molecule of claim 1, wherein the 5' hairpin region comprises fewer than four consecutive uracil nucleotides and a length of at least 31 nucleotides.
3. The small guide RNA molecule of claim 1, wherein the small guide RNA molecule further comprises an additional hairpin region designed to interact with a protein or small-molecule to conditionally stabilize the secondary and/or tertiary structure of the small guide RNA molecule.
4. The small guide RNA molecule of claim 1, comprising at least 95% identity to SEQ ID NOs:2, 3, or 4, or a complement thereof.
5. A composition for nucleic acid modification or detection comprising any of the small guide RNA molecules of claims 1-4, the composition further comprising a small guide RNA-mediated nuclease, wherein the small guide RNA and the small guide RNA-mediated nuclease form a complex having increased stability or activity relative to a complex containing a small guide RNA comprising at least 95% identity to SEQ ID NO: 1 or a complement thereof.
6. The composition of claim 5, wherein the composition is nuclease defective, thereby forming a complex configured to bind to, but not cleave or nick, a target nucleic acid substantially complementary to the binding region of the small guide RNA.
7. The composition of claim 6, wherein the nuclease defective composition comprises a Cas9 protein containing a mutation at one or more of the following residues: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987.
8. The composition of claim 7, wherein the nuclease defective composition comprises a Cas9 protein containing a mutation at two or more of the following residues: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, and A987.
9. The composition of claim 8, wherein the nuclease defective composition comprises a Cas9 protein containing a D10A and a H840A mutation.
10. The composition of any one of claims 6-9 wherein the nuclease defective composition comprises a labeled Cas9 protein.
11. The composition of claim 10, wherein the labeled Cas9 protein comprises a fluorophore.
12. The composition of claim 11, wherein the fluorophore is a fluorescent protein.
13. The composition of claim 5, wherein the composition has nuclease activity, thereby forming a complex configured to bind and cleave a target nucleic acid sequence substantially complementary to the binding region of the small guide RNA.
14. The composition of claim 13, wherein the small guide RNA-mediated nuclease has nicking activity, but is substantially defective at catalyzing double stranded breaks in the target sequence.
15. The composition of claim 13, wherein the small guide RNA-mediated nuclease comprises a Cas9 protein containing a mutation at one or more of the following residues D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987.
16. An expression cassette comprising a promoter operably linked to a nucleic acid encoding any of the small guide RNAs of claims 1-4.
17. The expression cassette of claim 16, wherein the promoter is an RNA polymerase III promoter.
18. The expression cassette of claim 17, wherein the RNA polymerase promoter is a U6, CMV, SFFV, or H1 promoter.
19. The expression cassette of claim 17, wherein the RNA polymerase promoter is a U6 promoter.
20. An expression cassette comprising a promoter operably linked to a nucleic acid encoding a small guide RNA-mediated nuclease.
21. The expression cassette of claim 20, wherein the promoter is a weak mammalian promoter as compared to the human elongation factor 1 promoter (EF1A).
22. The expression cassette of claim 21, wherein the weak mammalian promoter is a ubiquitin C promoter or a phosphoglycerate kinase 1 promoter (PGK).
23. The expression cassette of claim 21, wherein the weak mammalian promoter is a TetOn promoter in the absence of an inducer.
24. The expression cassette of any one of claims 20-23, wherein the nucleic acid further encodes one or two nuclear localization sequences.
25. A host cell comprising any one of the small guide RNAs of claims 1-4.
26. The host cell of claim 25, wherein the cell further comprises a small guide RNA-mediated nuclease.
27. The host cell of claim 26, wherein the small guide RNA-mediated nuclease is labeled.
28. The host cell of claim 27, wherein the small guide RNA-mediated nuclease is labeled with a fluorophore.
29. The host cell of claim 28, wherein the fluorophore is a fluorescent protein.
30. A method of detecting a target nucleic acid sequence in a cell, the method comprising:
(i) introducing into the cell:
(a) one or more small guide RNAs, each small guide RNA specific for the target nucleic acid sequence; and
(b) a labeled nuclease-deficient small guide RNA-mediated nuclease, thereby forming a labeled RNA:nuclease complex; and
(ii) incubating the cell to allow the labeled RNA:nuclease complex to localize to the target nucleic acid sequence; and
(iii) detecting the presence, absence, or quantity of labeled complex in the nucleus of the cell, thereby detecting the target nucleic acid.
31. The method of claim 30, the method further comprising introducing at least 5, 10, 15, 20 or more different small guide RNAs, each specific for a different portion of the target nucleic acid sequence.
32. The method of claim 30, wherein the one or more small guide RNAs are specific for a repeated target sequence.
33. The method of claim 32, wherein the repeated target sequence comprises at least 5, 10, 15, 20 or more contiguous repeats, each repeat of at least 4, 5, 6, 7, 8, 9, 10, 15, 20, or more nucleotides in length.
34. The method of any one of claims 30-33, wherein the small guide RNA-mediated nuclease contains a mutation at one or more of the following residues: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987.
35. The method of claim 34, wherein the small guide RNA-mediated nuclease contains a mutation at two or more of the following residues: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, or A987.
36. The method of claim 30, wherein the introducing comprises:
forming a complex between:
- a first labeled nuclease-deficient small guide RNA-mediated nuclease and one or more small guide RNAs, to form a first labeled complex; and
- a second labeled nuclease-deficient small guide RNA-mediated nuclease and one or more small guide RNAs, to form a second labeled complex; and
contacting the cell with the first and second complexes.
37. The method of claim 36, wherein the method further comprises forming a third and fourth labeled complex and contacting the cell with the four labeled complexes.
38. The method of claim 36 or 37, wherein each labeled complex is specific for a different target nucleic acid sequence or specific for a different region of a chromosome.
39. The method of claim 36 or 37, wherein each labeled complex is labeled with a different label, and the method comprises detecting the presence, absence, or quantity of each labeled complex in the nucleus of the cell.
40. A method of modifying a target nucleic acid sequence in a cell, the method comprising:
(i) introducing into the cell:
(a) a small guide RNA of claims 1-4, the small guide RNA specific for the target nucleic acid sequence; and
(b) a small guide RNA-mediated nuclease, thereby forming a small guide RNA:nuclease complex; and
(ii) incubating the cell to allow the small guide RNA:nuclease complex to bind to and cleave or nick the target nucleic acid, thereby modifying the target nucleic acid sequence in the cell.
41. The method of claim 40, wherein the small guide RNA-mediated nuclease is capable of catalyzing double stranded breaks in the target nucleic acid, and the method further comprises cleaving the target nucleic acid.
42. The method of claim 40, wherein the small guide RNA-mediated nuclease is capable of nicking the target nucleic acid but not catalyzing double stranded breaks in the target nucleic acid, and the method further comprises nicking the target nucleic acid.
43. The method of claim 41 or 42, wherein the method further comprises:
(i) introducing into the cell:
(a) a pair of small guide RNAs of claims 1-4, each small guide RNA specific for a target nucleic acid, wherein the pair of small guide RNAs bind to target nucleic acids on a chromosome and flank a nucleic acid region of interest; and
(b) a small guide RNA-mediated nuclease, thereby forming a pair of small guide RNA:nuclease complexes; and
(ii) incubating the cell to allow the small guide RNA:nuclease complexes to localize to the target nucleic acid sequence, thereby creating nicks or double stranded breaks that flank the nucleic acid region of interest; and
(iii) incubating the cell to allow non homologous end joining (NHEJ) or homologous DNA repair (HDR) to occur, thereby reducing heterozygosity in the cell or deleting at least a portion of the nucleic acid region of interest.
44. The method of claim 43, wherein the method further comprises introducing into the cell a heterologous nucleic acid that contains regions of substantial homology to the nucleic acid region of interest, thereby incorporating at least a portion of the heterologous nucleic acid into the nucleic acid region of interest.
45. A kit comprising an sgRNA and a labeled nuclease defective sgRNA-mediated nuclease.
46. The kit of claim 45, wherein the kit further comprises a second sgRNA or a second labeled nuclease defective sgRNA-mediated nuclease.
47. The kit of claim 45, wherein the kit further comprises a cell transfection reagent.
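Some of the sequence constraints recited in the claims above can be checked mechanically. The following is a minimal illustrative sketch in Python, not part of the claimed subject matter: it tests a candidate 5' hairpin sequence against the two conditions of claim 2 (fewer than four consecutive uracils, at least 31 nucleotides) and counts contiguous repeats of a unit in a target sequence, as in claim 33. The function names, the default thresholds as parameters, and the convenience of accepting DNA-style `T` in place of `U` are all assumptions made for illustration.

```python
import re


def longest_uracil_run(rna: str) -> int:
    """Return the length of the longest run of consecutive U residues.

    Assumption: DNA-style input is tolerated by treating T as U.
    """
    normalized = rna.upper().replace("T", "U")
    runs = re.findall(r"U+", normalized)
    return max((len(run) for run in runs), default=0)


def satisfies_claim_2(hairpin: str, min_length: int = 31, max_u_run: int = 3) -> bool:
    """Check the two conditions recited in claim 2 for a 5' hairpin region.

    "Fewer than four consecutive uracil nucleotides" is read here as a
    longest U run of at most three.
    """
    return len(hairpin) >= min_length and longest_uracil_run(hairpin) <= max_u_run


def max_contiguous_repeats(seq: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `seq`.

    A lookahead is used so that runs starting at every position are
    considered, including runs that overlap an earlier partial match.
    """
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    pattern = re.compile("(?=((?:" + re.escape(unit) + ")+))")
    return max(
        (len(m.group(1)) // len(unit) for m in pattern.finditer(seq)),
        default=0,
    )
```

For example, `satisfies_claim_2(candidate)` could be used to screen candidate hairpin designs, and `max_contiguous_repeats(target, unit) >= 5` corresponds to the lowest repeat threshold listed in claim 33. Identity comparisons such as the 95% threshold of claim 4 require a sequence alignment and are not covered by this sketch.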
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/025,217 US10822606B2 (en) | 2013-09-27 | 2014-09-29 | Optimized small guide RNAs and methods of use |
US17/033,255 US12049624B2 (en) | 2013-09-27 | 2020-09-25 | Optimized small guide RNAs and methods of use |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361883929P | 2013-09-27 | 2013-09-27 | |
US61/883,929 | 2013-09-27 |
Related Child Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/025,217 A-371-Of-International US10822606B2 (en) | 2013-09-27 | 2014-09-29 | Optimized small guide RNAs and methods of use |
US17/033,255 Division US12049624B2 (en) | 2013-09-27 | 2020-09-25 | Optimized small guide RNAs and methods of use |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015048690A1 (en) | 2015-04-02 |
Family ID: 52744581
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2014/058133 WO2015048690A1 (en) | 2013-09-27 | 2014-09-29 | Optimized small guide rnas and methods of use |
Country Status (2)
Country | Link |
---|---|
US (2) | US10822606B2 (en) |
WO (1) | WO2015048690A1 (en) |
Cited By (74)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9228207B2 (en) | 2013-09-06 | 2016-01-05 | President And Fellows Of Harvard College | Switchable gRNAs comprising aptamers |
US9322006B2 (en) | 2011-07-22 | 2016-04-26 | President And Fellows Of Harvard College | Evaluation and improvement of nuclease cleavage specificity |
US9359599B2 (en) | 2013-08-22 | 2016-06-07 | President And Fellows Of Harvard College | Engineered transcription activator-like effector (TALE) domains and uses thereof |
CN105647885A (en) * | 2016-01-20 | 2016-06-08 | 广州元曦生物科技有限公司 | Cas9 fusion protein and coding sequence thereof |
US9388430B2 (en) | 2013-09-06 | 2016-07-12 | President And Fellows Of Harvard College | Cas9-recombinase fusion proteins and uses thereof |
WO2016028843A3 (en) * | 2014-08-19 | 2016-07-14 | President And Fellows Of Harvard College | Rna-guided systems for probing and mapping of nucleic acids |
WO2016172727A1 (en) * | 2015-04-24 | 2016-10-27 | Editas Medicine, Inc. | Evaluation of cas9 molecule/guide rna molecule complexes |
US9526784B2 (en) | 2013-09-06 | 2016-12-27 | President And Fellows Of Harvard College | Delivery system for functional nucleases |
US9546384B2 (en) | 2013-12-11 | 2017-01-17 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for the targeted modification of a mouse genome |
WO2017035416A2 (en) | 2015-08-25 | 2017-03-02 | Duke University | Compositions and methods of improving specificity in genomic engineering using rna-guided endonucleases |
WO2017053762A1 (en) * | 2015-09-24 | 2017-03-30 | Sigma-Aldrich Co. Llc | Methods and reagents for molecular proximity detection using rna-guided nucleic acid binding proteins |
WO2017040813A3 (en) * | 2015-09-02 | 2017-05-04 | University Of Massachusetts | Detection of gene loci with crispr arrayed repeats and/or polychromatic single guide ribonucleic acids |
WO2017091630A1 (en) * | 2015-11-23 | 2017-06-01 | The Regents Of The University Of California | Tracking and manipulating cellular rna via nuclear delivery of crispr/cas9 |
WO2017123556A1 (en) * | 2016-01-11 | 2017-07-20 | The Board Of Trustees Of The Leland Stanford Junior University | Chimeric proteins and methods of immunotherapy |
US9834791B2 (en) | 2013-11-07 | 2017-12-05 | Editas Medicine, Inc. | CRISPR-related methods and compositions with governing gRNAS |
US9840699B2 (en) | 2013-12-12 | 2017-12-12 | President And Fellows Of Harvard College | Methods for nucleic acid editing |
US9856497B2 (en) | 2016-01-11 | 2018-01-02 | The Board Of Trustee Of The Leland Stanford Junior University | Chimeric proteins and methods of regulating gene expression |
US9982278B2 (en) | 2014-02-11 | 2018-05-29 | The Regents Of The University Of Colorado, A Body Corporate | CRISPR enabled multiplexed genome engineering |
US9982279B1 (en) | 2017-06-23 | 2018-05-29 | Inscripta, Inc. | Nucleic acid-guided nucleases |
CN108103090A (en) * | 2017-12-12 | 2018-06-01 | 中山大学附属第医院 | RNA Cas9-m6A modified vector system for targeting RNA methylation, and construction method and application thereof |
US10011849B1 (en) | 2017-06-23 | 2018-07-03 | Inscripta, Inc. | Nucleic acid-guided nucleases |
US10017760B2 (en) | 2016-06-24 | 2018-07-10 | Inscripta, Inc. | Methods for generating barcoded combinatorial libraries |
US10077453B2 (en) | 2014-07-30 | 2018-09-18 | President And Fellows Of Harvard College | CAS9 proteins including ligand-dependent inteins |
US10113163B2 (en) | 2016-08-03 | 2018-10-30 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US10166255B2 (en) | 2015-07-31 | 2019-01-01 | Regents Of The University Of Minnesota | Intracellular genomic transplant and methods of therapy |
US10167457B2 (en) | 2015-10-23 | 2019-01-01 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
US20190085325A1 (en) * | 2016-03-17 | 2019-03-21 | Imba - Institut Für Molekulare Biotechnologie Gmbh | Conditional crispr sgrna expression |
WO2019147743A1 (en) | 2018-01-26 | 2019-08-01 | Massachusetts Institute Of Technology | Structure-guided chemical modification of guide rna and its applications |
US10385359B2 (en) | 2013-04-16 | 2019-08-20 | Regeneron Pharmaceuticals, Inc. | Targeted modification of rat genome |
US10457960B2 (en) | 2014-11-21 | 2019-10-29 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for targeted genetic modification using paired guide RNAs |
US10508298B2 (en) | 2013-08-09 | 2019-12-17 | President And Fellows Of Harvard College | Methods for identifying a target site of a CAS9 nuclease |
US10745677B2 (en) | 2016-12-23 | 2020-08-18 | President And Fellows Of Harvard College | Editing of CCR5 receptor gene to protect against HIV infection |
US10912797B2 (en) | 2016-10-18 | 2021-02-09 | Intima Bioscience, Inc. | Tumor infiltrating lymphocytes and methods of therapy |
US20210040460A1 (en) | 2012-04-27 | 2021-02-11 | Duke University | Genetic correction of mutated genes |
US11033584B2 (en) | 2017-10-27 | 2021-06-15 | The Regents Of The University Of California | Targeted replacement of endogenous T cell receptors |
US11098325B2 (en) | 2017-06-30 | 2021-08-24 | Intima Bioscience, Inc. | Adeno-associated viral vectors for gene therapy |
EP3871695A1 (en) * | 2015-12-07 | 2021-09-01 | Arc Bio, LLC | Methods and compositions for the making and using of guide nucleic acids |
US11141493B2 (en) | 2014-03-10 | 2021-10-12 | Editas Medicine, Inc. | Compositions and methods for treating CEP290-associated disease |
WO2021248102A1 (en) * | 2020-06-05 | 2021-12-09 | Flagship Pioneering Innovations Vi, Llc | Template guide rna molecules |
US11268086B2 (en) | 2014-03-10 | 2022-03-08 | Editas Medicine, Inc. | CRISPR/CAS-related methods and compositions for treating Leber's Congenital Amaurosis 10 (LCA10) |
US11268082B2 (en) | 2017-03-23 | 2022-03-08 | President And Fellows Of Harvard College | Nucleobase editors comprising nucleic acid programmable DNA binding proteins |
EP3929287A3 (en) * | 2015-06-18 | 2022-04-13 | The Broad Institute, Inc. | Crispr enzyme mutations reducing off-target effects |
US11306324B2 (en) | 2016-10-14 | 2022-04-19 | President And Fellows Of Harvard College | AAV delivery of nucleobase editors |
US11319532B2 (en) | 2017-08-30 | 2022-05-03 | President And Fellows Of Harvard College | High efficiency base editors comprising Gam |
US11407985B2 (en) | 2013-12-12 | 2022-08-09 | The Broad Institute, Inc. | Delivery, use and therapeutic applications of the CRISPR-Cas systems and compositions for genome editing |
US11447770B1 (en) | 2019-03-19 | 2022-09-20 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
US11453891B2 (en) | 2017-05-10 | 2022-09-27 | The Regents Of The University Of California | Directed editing of cellular RNA via nuclear delivery of CRISPR/CAS9 |
US11466271B2 (en) | 2017-02-06 | 2022-10-11 | Novartis Ag | Compositions and methods for the treatment of hemoglobinopathies |
WO2022243540A1 (en) * | 2021-05-21 | 2022-11-24 | Aarhus Universitet | Lentivirus-derived nanoparticles comprising crispr/cas9 ribonucleoprotein complexes |
US11542496B2 (en) | 2017-03-10 | 2023-01-03 | President And Fellows Of Harvard College | Cytosine to guanine base editor |
US11542509B2 (en) | 2016-08-24 | 2023-01-03 | President And Fellows Of Harvard College | Incorporation of unnatural amino acids into proteins using base editing |
US11560566B2 (en) | 2017-05-12 | 2023-01-24 | President And Fellows Of Harvard College | Aptazyme-embedded guide RNAs for use with CRISPR-Cas9 in genome editing and transcriptional activation |
US11566263B2 (en) | 2016-08-02 | 2023-01-31 | Editas Medicine, Inc. | Compositions and methods for treating CEP290 associated disease |
US11578312B2 (en) | 2015-06-18 | 2023-02-14 | The Broad Institute Inc. | Engineering and optimization of systems, methods, enzymes and guide scaffolds of CAS9 orthologs and variants for sequence manipulation |
US11591581B2 (en) | 2013-12-12 | 2023-02-28 | The Broad Institute, Inc. | Compositions and methods of use of CRISPR-Cas systems in nucleotide repeat disorders |
US11597919B2 (en) | 2013-12-12 | 2023-03-07 | The Broad Institute Inc. | Systems, methods and compositions for sequence manipulation with optimized functional CRISPR-Cas systems |
US11597949B2 (en) | 2013-06-17 | 2023-03-07 | The Broad Institute, Inc. | Optimized CRISPR-Cas double nickase systems, methods and compositions for sequence manipulation |
US11624078B2 (en) | 2014-12-12 | 2023-04-11 | The Broad Institute, Inc. | Protected guide RNAS (pgRNAS) |
US11661590B2 (en) | 2016-08-09 | 2023-05-30 | President And Fellows Of Harvard College | Programmable CAS9-recombinase fusion proteins and uses thereof |
US11680268B2 (en) | 2014-11-07 | 2023-06-20 | Editas Medicine, Inc. | Methods for improving CRISPR/Cas-mediated genome-editing |
US11732274B2 (en) | 2017-07-28 | 2023-08-22 | President And Fellows Of Harvard College | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) |
US11795443B2 (en) | 2017-10-16 | 2023-10-24 | The Broad Institute, Inc. | Uses of adenosine base editors |
US11814624B2 (en) | 2017-06-15 | 2023-11-14 | The Regents Of The University Of California | Targeted non-viral DNA insertions |
US11851690B2 (en) | 2017-03-14 | 2023-12-26 | Editas Medicine, Inc. | Systems and methods for the treatment of hemoglobinopathies |
US11866726B2 (en) | 2017-07-14 | 2024-01-09 | Editas Medicine, Inc. | Systems and methods for targeted integration and genome editing and detection thereof using integrated priming sites |
US11898179B2 (en) | 2017-03-09 | 2024-02-13 | President And Fellows Of Harvard College | Suppression of pain by gene editing |
US11912985B2 (en) | 2020-05-08 | 2024-02-27 | The Broad Institute, Inc. | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence |
US11963982B2 (en) | 2017-05-10 | 2024-04-23 | Editas Medicine, Inc. | CRISPR/RNA-guided nuclease systems and methods |
US11970710B2 (en) | 2015-10-13 | 2024-04-30 | Duke University | Genome engineering with Type I CRISPR systems in eukaryotic cells |
US12018275B2 (en) | 2013-06-17 | 2024-06-25 | The Broad Institute, Inc. | Delivery and use of the CRISPR-CAS systems, vectors and compositions for hepatic targeting and therapy |
US12031132B2 (en) | 2018-03-14 | 2024-07-09 | Editas Medicine, Inc. | Systems and methods for the treatment of hemoglobinopathies |
US12037407B2 (en) | 2021-10-14 | 2024-07-16 | Arsenal Biosciences, Inc. | Immune cells having co-expressed shRNAS and logic gate systems |
US12098399B2 (en) | 2022-06-24 | 2024-09-24 | Tune Therapeutics, Inc. | Compositions, systems, and methods for epigenetic regulation of proprotein convertase subtilisin/kexin type 9 (PCSK9) gene expression |
US12123032B2 (en) | 2019-11-26 | 2024-10-22 | The Broad Institute, Inc. | CRISPR enzyme mutations reducing off-target effects |
Families Citing this family (41)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150191744A1 (en) * | 2013-12-17 | 2015-07-09 | University Of Massachusetts | Cas9 effector-mediated regulation of transcription, differentiation and gene editing/labeling |
EP3169776A4 (en) | 2014-07-14 | 2018-07-04 | The Regents of The University of California | Crispr/cas transcriptional modulation |
US11293021B1 (en) | 2016-06-23 | 2022-04-05 | Inscripta, Inc. | Automated cell processing methods, modules, instruments, and systems |
US10253316B2 (en) | 2017-06-30 | 2019-04-09 | Inscripta, Inc. | Automated cell processing methods, modules, instruments, and systems |
US10738327B2 (en) | 2017-08-28 | 2020-08-11 | Inscripta, Inc. | Electroporation cuvettes for automation |
US10435713B2 (en) | 2017-09-30 | 2019-10-08 | Inscripta, Inc. | Flow through electroporation instrumentation |
AU2019241967A1 (en) | 2018-03-29 | 2020-11-19 | Inscripta, Inc. | Automated control of cell growth rates for induction and transformation |
WO2019200004A1 (en) | 2018-04-13 | 2019-10-17 | Inscripta, Inc. | Automated cell processing instruments comprising reagent cartridges |
WO2019204304A2 (en) * | 2018-04-16 | 2019-10-24 | The Children's Hospital Of Philadelphia | Mitochondrial rna import for treating mitochondrial disease |
US10858761B2 (en) | 2018-04-24 | 2020-12-08 | Inscripta, Inc. | Nucleic acid-guided editing of exogenous polynucleotides in heterologous cells |
US10508273B2 (en) | 2018-04-24 | 2019-12-17 | Inscripta, Inc. | Methods for identifying selective binding pairs |
US10557216B2 (en) | 2018-04-24 | 2020-02-11 | Inscripta, Inc. | Automated instrumentation for production of T-cell receptor peptide libraries |
CA3108767A1 (en) | 2018-06-30 | 2020-01-02 | Inscripta, Inc. | Instruments, modules, and methods for improved detection of edited sequences in live cells |
US11142740B2 (en) | 2018-08-14 | 2021-10-12 | Inscripta, Inc. | Detection of nuclease edited sequences in automated modules and instruments |
US10532324B1 (en) | 2018-08-14 | 2020-01-14 | Inscripta, Inc. | Instruments, modules, and methods for improved detection of edited sequences in live cells |
US10752874B2 (en) | 2018-08-14 | 2020-08-25 | Inscripta, Inc. | Instruments, modules, and methods for improved detection of edited sequences in live cells |
WO2020081149A2 (en) | 2018-08-30 | 2020-04-23 | Inscripta, Inc. | Improved detection of nuclease edited sequences in automated modules and instruments |
EP3870697A4 (en) | 2018-10-22 | 2022-11-09 | Inscripta, Inc. | Engineered enzymes |
US11214781B2 (en) | 2018-10-22 | 2022-01-04 | Inscripta, Inc. | Engineered enzyme |
AU2020247900A1 (en) | 2019-03-25 | 2021-11-04 | Inscripta, Inc. | Simultaneous multiplex genome editing in yeast |
US11001831B2 (en) | 2019-03-25 | 2021-05-11 | Inscripta, Inc. | Simultaneous multiplex genome editing in yeast |
AU2020288623A1 (en) | 2019-06-06 | 2022-01-06 | Inscripta, Inc. | Curing for recursive nucleic acid-guided cell editing |
US10907125B2 (en) | 2019-06-20 | 2021-02-02 | Inscripta, Inc. | Flow through electroporation modules and instrumentation |
EP3986909A4 (en) | 2019-06-21 | 2023-08-02 | Inscripta, Inc. | Genome-wide rationally-designed mutations leading to enhanced lysine production in e. coli |
US10927385B2 (en) | 2019-06-25 | 2021-02-23 | Inscripta, Inc. | Increased nucleic-acid guided cell editing in yeast |
WO2021102059A1 (en) | 2019-11-19 | 2021-05-27 | Inscripta, Inc. | Methods for increasing observed editing in bacteria |
EP4069837A4 (en) | 2019-12-10 | 2024-03-13 | Inscripta, Inc. | Novel mad nucleases |
US10704033B1 (en) | 2019-12-13 | 2020-07-07 | Inscripta, Inc. | Nucleic acid-guided nucleases |
US11008557B1 (en) | 2019-12-18 | 2021-05-18 | Inscripta, Inc. | Cascade/dCas3 complementation assays for in vivo detection of nucleic acid-guided nuclease edited cells |
US10689669B1 (en) | 2020-01-11 | 2020-06-23 | Inscripta, Inc. | Automated multi-module cell processing methods, instruments, and systems |
CA3157061A1 (en) | 2020-01-27 | 2021-08-05 | Christian SILTANEN | Electroporation modules and instrumentation |
US20210332388A1 (en) | 2020-04-24 | 2021-10-28 | Inscripta, Inc. | Compositions, methods, modules and instruments for automated nucleic acid-guided nuclease editing in mammalian cells |
US11787841B2 (en) | 2020-05-19 | 2023-10-17 | Inscripta, Inc. | Rationally-designed mutations to the thrA gene for enhanced lysine production in E. coli |
EP4214314A4 (en) | 2020-09-15 | 2024-10-16 | Inscripta Inc | Crispr editing to embed nucleic acid landing pads into genomes of live cells |
US11512297B2 (en) | 2020-11-09 | 2022-11-29 | Inscripta, Inc. | Affinity tag for recombination protein recruitment |
EP4271802A1 (en) | 2021-01-04 | 2023-11-08 | Inscripta, Inc. | Mad nucleases |
US11332742B1 (en) | 2021-01-07 | 2022-05-17 | Inscripta, Inc. | Mad nucleases |
US11884924B2 (en) | 2021-02-16 | 2024-01-30 | Inscripta, Inc. | Dual strand nucleic acid-guided nickase editing |
WO2022251644A1 (en) | 2021-05-28 | 2022-12-01 | Lyell Immunopharma, Inc. | Nr4a3-deficient immune cells and uses thereof |
EP4347826A1 (en) | 2021-06-02 | 2024-04-10 | Lyell Immunopharma, Inc. | Nr4a3-deficient immune cells and uses thereof |
WO2023225665A1 (en) | 2022-05-19 | 2023-11-23 | Lyell Immunopharma, Inc. | Polynucleotides targeting nr4a3 and uses thereof |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070025970A1 (en) * | 2000-10-06 | 2007-02-01 | Oxford Biomedica (Uk) Limited | Vector system |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4804467B2 (en) * | 2004-08-23 | 2011-11-02 | アルナイラム ファーマシューティカルズ, インコーポレイテッド | Multiple RNA polymerase III promoter expression construct |
US8697359B1 (en) * | 2012-12-12 | 2014-04-15 | The Broad Institute, Inc. | CRISPR-Cas systems and methods for altering expression of gene products |
US20140364333A1 (en) * | 2013-03-15 | 2014-12-11 | President And Fellows Of Harvard College | Methods for Live Imaging of Cells |
- 2014-09-29 | WO | PCT/US2014/058133 | WO2015048690A1 | Application Filing
- 2014-09-29 | US | US15/025,217 | US10822606B2 | Active
- 2020-09-25 | US | US17/033,255 | US12049624B2 | Active
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070025970A1 (en) * | 2000-10-06 | 2007-02-01 | Oxford Biomedica (Uk) Limited | Vector system |
Non-Patent Citations (3)
Title |
---|
CHEN ET AL.: "Dynamic Imaging of Genomic Loci in Living Human Cells by an Optimized CRISPR/Cas System", CELL, vol. 155, 19 December 2013 (2013-12-19), pages 1479 - 1491, XP055181416, DOI: doi:10.1016/j.cell.2013.12.001 * |
MALI ET AL.: "CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering", NATURE BIOTECHNOLOGY, vol. 31, no. 9, 1 August 2013 (2013-08-01), pages 833 - 40 * |
QI ET AL.: "Repurposing CRISPR as an RNA-Guided Platform for Sequence-Specific Control of Gene Expression.", CELL, vol. 152, 28 February 2013 (2013-02-28), pages 1173 - 1183, XP055299671, DOI: doi:10.1016/j.cell.2013.02.022 * |
Cited By (171)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9322006B2 (en) | 2011-07-22 | 2016-04-26 | President And Fellows Of Harvard College | Evaluation and improvement of nuclease cleavage specificity |
US12006520B2 (en) | 2011-07-22 | 2024-06-11 | President And Fellows Of Harvard College | Evaluation and improvement of nuclease cleavage specificity |
US10323236B2 (en) | 2011-07-22 | 2019-06-18 | President And Fellows Of Harvard College | Evaluation and improvement of nuclease cleavage specificity |
US20210040460A1 (en) | 2012-04-27 | 2021-02-11 | Duke University | Genetic correction of mutated genes |
US11976307B2 (en) | 2012-04-27 | 2024-05-07 | Duke University | Genetic correction of mutated genes |
US10385359B2 (en) | 2013-04-16 | 2019-08-20 | Regeneron Pharmaceuticals, Inc. | Targeted modification of rat genome |
US12037596B2 (en) | 2013-04-16 | 2024-07-16 | Regeneron Pharmaceuticals, Inc. | Targeted modification of rat genome |
US10975390B2 (en) | 2013-04-16 | 2021-04-13 | Regeneron Pharmaceuticals, Inc. | Targeted modification of rat genome |
US12018275B2 (en) | 2013-06-17 | 2024-06-25 | The Broad Institute, Inc. | Delivery and use of the CRISPR-CAS systems, vectors and compositions for hepatic targeting and therapy |
US11597949B2 (en) | 2013-06-17 | 2023-03-07 | The Broad Institute, Inc. | Optimized CRISPR-Cas double nickase systems, methods and compositions for sequence manipulation |
US10508298B2 (en) | 2013-08-09 | 2019-12-17 | President And Fellows Of Harvard College | Methods for identifying a target site of a CAS9 nuclease |
US10954548B2 (en) | 2013-08-09 | 2021-03-23 | President And Fellows Of Harvard College | Nuclease profiling system |
US11920181B2 (en) | 2013-08-09 | 2024-03-05 | President And Fellows Of Harvard College | Nuclease profiling system |
US11046948B2 (en) | 2013-08-22 | 2021-06-29 | President And Fellows Of Harvard College | Engineered transcription activator-like effector (TALE) domains and uses thereof |
US9359599B2 (en) | 2013-08-22 | 2016-06-07 | President And Fellows Of Harvard College | Engineered transcription activator-like effector (TALE) domains and uses thereof |
US10227581B2 (en) | 2013-08-22 | 2019-03-12 | President And Fellows Of Harvard College | Engineered transcription activator-like effector (TALE) domains and uses thereof |
US11299755B2 (en) | 2013-09-06 | 2022-04-12 | President And Fellows Of Harvard College | Switchable CAS9 nucleases and uses thereof |
US9526784B2 (en) | 2013-09-06 | 2016-12-27 | President And Fellows Of Harvard College | Delivery system for functional nucleases |
US9340800B2 (en) | 2013-09-06 | 2016-05-17 | President And Fellows Of Harvard College | Extended DNA-sensing GRNAS |
US10912833B2 (en) | 2013-09-06 | 2021-02-09 | President And Fellows Of Harvard College | Delivery of negatively charged proteins using cationic lipids |
US10858639B2 (en) | 2013-09-06 | 2020-12-08 | President And Fellows Of Harvard College | CAS9 variants and uses thereof |
US9340799B2 (en) | 2013-09-06 | 2016-05-17 | President And Fellows Of Harvard College | MRNA-sensing switchable gRNAs |
US10682410B2 (en) | 2013-09-06 | 2020-06-16 | President And Fellows Of Harvard College | Delivery system for functional nucleases |
US9388430B2 (en) | 2013-09-06 | 2016-07-12 | President And Fellows Of Harvard College | Cas9-recombinase fusion proteins and uses thereof |
US10597679B2 (en) | 2013-09-06 | 2020-03-24 | President And Fellows Of Harvard College | Switchable Cas9 nucleases and uses thereof |
US9999671B2 (en) | 2013-09-06 | 2018-06-19 | President And Fellows Of Harvard College | Delivery of negatively charged proteins using cationic lipids |
US9737604B2 (en) | 2013-09-06 | 2017-08-22 | President And Fellows Of Harvard College | Use of cationic lipids to deliver CAS9 |
US9228207B2 (en) | 2013-09-06 | 2016-01-05 | President And Fellows Of Harvard College | Switchable gRNAs comprising aptamers |
US10190137B2 (en) | 2013-11-07 | 2019-01-29 | Editas Medicine, Inc. | CRISPR-related methods and compositions with governing gRNAS |
US10640788B2 (en) | 2013-11-07 | 2020-05-05 | Editas Medicine, Inc. | CRISPR-related methods and compositions with governing gRNAs |
US11390887B2 (en) | 2013-11-07 | 2022-07-19 | Editas Medicine, Inc. | CRISPR-related methods and compositions with governing gRNAS |
US9834791B2 (en) | 2013-11-07 | 2017-12-05 | Editas Medicine, Inc. | CRISPR-related methods and compositions with governing gRNAS |
US10711280B2 (en) | 2013-12-11 | 2020-07-14 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for the targeted modification of a mouse ES cell genome |
US10208317B2 (en) | 2013-12-11 | 2019-02-19 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for the targeted modification of a mouse embryonic stem cell genome |
US9546384B2 (en) | 2013-12-11 | 2017-01-17 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for the targeted modification of a mouse genome |
US11820997B2 (en) | 2013-12-11 | 2023-11-21 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for the targeted modification of a genome |
US11597919B2 (en) | 2013-12-12 | 2023-03-07 | The Broad Institute Inc. | Systems, methods and compositions for sequence manipulation with optimized functional CRISPR-Cas systems |
US11053481B2 (en) | 2013-12-12 | 2021-07-06 | President And Fellows Of Harvard College | Fusions of Cas9 domains and nucleic acid-editing domains |
US9840699B2 (en) | 2013-12-12 | 2017-12-12 | President And Fellows Of Harvard College | Methods for nucleic acid editing |
US11124782B2 (en) | 2013-12-12 | 2021-09-21 | President And Fellows Of Harvard College | Cas variants for gene editing |
US11407985B2 (en) | 2013-12-12 | 2022-08-09 | The Broad Institute, Inc. | Delivery, use and therapeutic applications of the CRISPR-Cas systems and compositions for genome editing |
US10465176B2 (en) | 2013-12-12 | 2019-11-05 | President And Fellows Of Harvard College | Cas variants for gene editing |
US11591581B2 (en) | 2013-12-12 | 2023-02-28 | The Broad Institute, Inc. | Compositions and methods of use of CRISPR-Cas systems in nucleotide repeat disorders |
US11702677B2 (en) | 2014-02-11 | 2023-07-18 | The Regents Of The University Of Colorado | CRISPR enabled multiplexed genome engineering |
US11639511B2 (en) | 2014-02-11 | 2023-05-02 | The Regents Of The University Of Colorado, A Body Corporate | CRISPR enabled multiplexed genome engineering |
US10711284B2 (en) | 2014-02-11 | 2020-07-14 | The Regents Of The University Of Colorado | CRISPR enabled multiplexed genome engineering |
US10351877B2 (en) | 2014-02-11 | 2019-07-16 | The Regents Of The University Of Colorado, A Body Corporate | CRISPR enabled multiplexed genome engineering |
US10364442B2 (en) | 2014-02-11 | 2019-07-30 | The Regents Of The University Of Colorado, A Body Corporate | CRISPR enabled multiplexed genome engineering |
US10240167B2 (en) | 2014-02-11 | 2019-03-26 | Inscripta, Inc. | CRISPR enabled multiplexed genome engineering |
US11078498B2 (en) | 2014-02-11 | 2021-08-03 | The Regents Of The University Of Colorado, A Body Corporate | CRISPR enabled multiplexed genome engineering |
US9982278B2 (en) | 2014-02-11 | 2018-05-29 | The Regents Of The University Of Colorado, A Body Corporate | CRISPR enabled multiplexed genome engineering |
US10435715B2 (en) | 2014-02-11 | 2019-10-08 | The Regents Of The University Of Colorado, A Body Corporate | CRISPR enabled multiplexed genome engineering |
US10669559B2 (en) | 2014-02-11 | 2020-06-02 | The Regents Of The University Of Colorado, A Body Corporate | CRISPR enabled multiplexed genome engineering |
US10266849B2 (en) | 2014-02-11 | 2019-04-23 | The Regents Of The University Of Colorado, A Body Corporate | CRISPR enabled multiplexed genome engineering |
US10731180B2 (en) | 2014-02-11 | 2020-08-04 | The Regents Of The University Of Colorado | CRISPR enabled multiplexed genome engineering |
US10465207B2 (en) | 2014-02-11 | 2019-11-05 | The Regents Of The University Of Colorado, A Body Corporate | CRISPR enabled multiplexed genome engineering |
US11795479B2 (en) | 2014-02-11 | 2023-10-24 | The Regents Of The University Of Colorado | CRISPR enabled multiplexed genome engineering |
US11345933B2 (en) | 2014-02-11 | 2022-05-31 | The Regents Of The University Of Colorado | CRISPR enabled multiplexed genome engineering |
US11268086B2 (en) | 2014-03-10 | 2022-03-08 | Editas Medicine, Inc. | CRISPR/CAS-related methods and compositions for treating Leber's Congenital Amaurosis 10 (LCA10) |
US11141493B2 (en) | 2014-03-10 | 2021-10-12 | Editas Medicine, Inc. | Compositions and methods for treating CEP290-associated disease |
US10077453B2 (en) | 2014-07-30 | 2018-09-18 | President And Fellows Of Harvard College | CAS9 proteins including ligand-dependent inteins |
US11578343B2 (en) | 2014-07-30 | 2023-02-14 | President And Fellows Of Harvard College | CAS9 proteins including ligand-dependent inteins |
US10704062B2 (en) | 2014-07-30 | 2020-07-07 | President And Fellows Of Harvard College | CAS9 proteins including ligand-dependent inteins |
US12018321B2 (en) | 2014-08-19 | 2024-06-25 | President And Fellows Of Harvard College | RNA-guided systems for probing and mapping of nucleic acids |
WO2016028843A3 (en) * | 2014-08-19 | 2016-07-14 | President And Fellows Of Harvard College | Rna-guided systems for probing and mapping of nucleic acids |
US11680268B2 (en) | 2014-11-07 | 2023-06-20 | Editas Medicine, Inc. | Methods for improving CRISPR/Cas-mediated genome-editing |
US10457960B2 (en) | 2014-11-21 | 2019-10-29 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for targeted genetic modification using paired guide RNAs |
US11697828B2 (en) | 2014-11-21 | 2023-07-11 | Regeneron Pharmaceuticals, Inc. | Methods and compositions for targeted genetic modification using paired guide RNAs |
US11624078B2 (en) | 2014-12-12 | 2023-04-11 | The Broad Institute, Inc. | Protected guide RNAS (pgRNAS) |
KR20180037139A (en) * | 2015-04-24 | 2018-04-11 | 에디타스 메디신, 인코포레이티드 | Evaluation of CAS9 molecule / guide RNA molecule complex |
KR102535217B1 (en) | 2015-04-24 | 2023-05-19 | 에디타스 메디신, 인코포레이티드 | Assessment of CAS9 Molecule/Guide RNA Molecule Complexes |
EP4019975A1 (en) * | 2015-04-24 | 2022-06-29 | Editas Medicine, Inc. | Evaluation of cas9 molecule/guide rna molecule complexes |
WO2016172727A1 (en) * | 2015-04-24 | 2016-10-27 | Editas Medicine, Inc. | Evaluation of cas9 molecule/guide rna molecule complexes |
EP3929287A3 (en) * | 2015-06-18 | 2022-04-13 | The Broad Institute, Inc. | Crispr enzyme mutations reducing off-target effects |
US11578312B2 (en) | 2015-06-18 | 2023-02-14 | The Broad Institute Inc. | Engineering and optimization of systems, methods, enzymes and guide scaffolds of CAS9 orthologs and variants for sequence manipulation |
US11642374B2 (en) | 2015-07-31 | 2023-05-09 | Intima Bioscience, Inc. | Intracellular genomic transplant and methods of therapy |
US11642375B2 (en) | 2015-07-31 | 2023-05-09 | Intima Bioscience, Inc. | Intracellular genomic transplant and methods of therapy |
US10406177B2 (en) | 2015-07-31 | 2019-09-10 | Regents Of The University Of Minnesota | Modified cells and methods of therapy |
US10166255B2 (en) | 2015-07-31 | 2019-01-01 | Regents Of The University Of Minnesota | Intracellular genomic transplant and methods of therapy |
US11925664B2 (en) | 2015-07-31 | 2024-03-12 | Intima Bioscience, Inc. | Intracellular genomic transplant and methods of therapy |
US11583556B2 (en) | 2015-07-31 | 2023-02-21 | Regents Of The University Of Minnesota | Modified cells and methods of therapy |
US11266692B2 (en) | 2015-07-31 | 2022-03-08 | Regents Of The University Of Minnesota | Intracellular genomic transplant and methods of therapy |
US11147837B2 (en) | 2015-07-31 | 2021-10-19 | Regents Of The University Of Minnesota | Modified cells and methods of therapy |
US11903966B2 (en) | 2015-07-31 | 2024-02-20 | Regents Of The University Of Minnesota | Intracellular genomic transplant and methods of therapy |
EP3341727A4 (en) * | 2015-08-25 | 2019-06-26 | Duke University | Compositions and methods of improving specificity in genomic engineering using rna-guided endonucleases |
US11427817B2 (en) | 2015-08-25 | 2022-08-30 | Duke University | Compositions and methods of improving specificity in genomic engineering using RNA-guided endonucleases |
JP2018525016A (en) * | 2015-08-25 | 2018-09-06 | デューク ユニバーシティ | Compositions and methods for improving specificity in genomic engineering using RNA-guided endonucleases |
CN108351350A (en) * | 2015-08-25 | 2018-07-31 | 杜克大学 | Compositions and methods for improving genome engineering specificity using RNA-guided endonucleases |
JP2021166523A (en) * | 2015-08-25 | 2021-10-21 | デューク ユニバーシティ | Compositions and methods of improving specificity in genomic engineering using rna-guided endonucleases |
EP4177346A3 (en) * | 2015-08-25 | 2023-07-26 | Duke University | Compositions and methods of improving specificity in genomic engineering using rna-guided endonucleases |
CN114634930A (en) * | 2015-08-25 | 2022-06-17 | 杜克大学 | Compositions and methods for improving genome engineering specificity using RNA-guided endonucleases |
WO2017035416A2 (en) | 2015-08-25 | 2017-03-02 | Duke University | Compositions and methods of improving specificity in genomic engineering using rna-guided endonucleases |
EP4345454A3 (en) * | 2015-08-25 | 2024-07-17 | Duke University | Compositions and methods of improving specificity in genomic engineering using rna-guided endonucleases |
US11390908B2 (en) | 2015-09-02 | 2022-07-19 | University Of Massachusetts | Detection of gene loci with CRISPR arrayed repeats and/or polychromatic single guide ribonucleic acids |
WO2017040813A3 (en) * | 2015-09-02 | 2017-05-04 | University Of Massachusetts | Detection of gene loci with crispr arrayed repeats and/or polychromatic single guide ribonucleic acids |
US11268144B2 (en) | 2015-09-24 | 2022-03-08 | Sigma-Aldrich Co. Llc | Methods and reagents for molecular proximity detection using RNA-guided nucleic acid binding proteins |
WO2017053762A1 (en) * | 2015-09-24 | 2017-03-30 | Sigma-Aldrich Co. Llc | Methods and reagents for molecular proximity detection using rna-guided nucleic acid binding proteins |
US11970710B2 (en) | 2015-10-13 | 2024-04-30 | Duke University | Genome engineering with Type I CRISPR systems in eukaryotic cells |
US12043852B2 (en) | 2015-10-23 | 2024-07-23 | President And Fellows Of Harvard College | Evolved Cas9 proteins for gene editing |
US11214780B2 (en) | 2015-10-23 | 2022-01-04 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
US10167457B2 (en) | 2015-10-23 | 2019-01-01 | President And Fellows Of Harvard College | Nucleobase editors and uses thereof |
US11667903B2 (en) | 2015-11-23 | 2023-06-06 | The Regents Of The University Of California | Tracking and manipulating cellular RNA via nuclear delivery of CRISPR/CAS9 |
WO2017091630A1 (en) * | 2015-11-23 | 2017-06-01 | The Regents Of The University Of California | Tracking and manipulating cellular rna via nuclear delivery of crispr/cas9 |
EP3871695A1 (en) * | 2015-12-07 | 2021-09-01 | Arc Bio, LLC | Methods and compositions for the making and using of guide nucleic acids |
JP2021141904A (en) * | 2015-12-07 | 2021-09-24 | アーク バイオ, エルエルシー | Methods and Compositions for Making and Using Guide Nucleic Acids |
US10457961B2 (en) | 2016-01-11 | 2019-10-29 | The Board Of Trustees Of The Leland Stanford Junior University | Chimeric proteins and methods of regulating gene expression |
US10336807B2 (en) | 2016-01-11 | 2019-07-02 | The Board Of Trustees Of The Leland Stanford Junior University | Chimeric proteins and methods of immunotherapy |
US11111287B2 (en) | 2016-01-11 | 2021-09-07 | The Board Of Trustees Of The Leland Stanford Junior University | Chimeric proteins and methods of immunotherapy |
WO2017123556A1 (en) * | 2016-01-11 | 2017-07-20 | The Board Of Trustees Of The Leland Stanford Junior University | Chimeric proteins and methods of immunotherapy |
US9856497B2 (en) | 2016-01-11 | 2018-01-02 | The Board Of Trustees Of The Leland Stanford Junior University | Chimeric proteins and methods of regulating gene expression |
US11773411B2 (en) | 2016-01-11 | 2023-10-03 | The Board Of Trustees Of The Leland Stanford Junior University | Chimeric proteins and methods of regulating gene expression |
CN105647885B (en) * | 2016-01-20 | 2017-08-18 | 广州元曦生物科技有限公司 | Cas9 fusion protein and coding sequence thereof |
CN105647885A (en) * | 2016-01-20 | 2016-06-08 | 广州元曦生物科技有限公司 | Cas9 fusion protein and coding sequence thereof |
US20190085325A1 (en) * | 2016-03-17 | 2019-03-21 | Imba - Institut Für Molekulare Biotechnologie Gmbh | Conditional crispr sgrna expression |
US11884917B2 (en) * | 2016-03-17 | 2024-01-30 | Imba—Institut Für Molekulare Biotechnologie Gmbh | Conditional CRISPR sgRNA expression |
US10017760B2 (en) | 2016-06-24 | 2018-07-10 | Inscripta, Inc. | Methods for generating barcoded combinatorial libraries |
US11584928B2 (en) | 2016-06-24 | 2023-02-21 | The Regents Of The University Of Colorado, A Body Corporate | Methods for generating barcoded combinatorial libraries |
US10294473B2 (en) | 2016-06-24 | 2019-05-21 | The Regents Of The University Of Colorado, A Body Corporate | Methods for generating barcoded combinatorial libraries |
US10287575B2 (en) | 2016-06-24 | 2019-05-14 | The Regents Of The University Of Colorado, A Body Corporate | Methods for generating barcoded combinatorial libraries |
US11566263B2 (en) | 2016-08-02 | 2023-01-31 | Editas Medicine, Inc. | Compositions and methods for treating CEP290 associated disease |
US10113163B2 (en) | 2016-08-03 | 2018-10-30 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US11702651B2 (en) | 2016-08-03 | 2023-07-18 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US11999947B2 (en) | 2016-08-03 | 2024-06-04 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US10947530B2 (en) | 2016-08-03 | 2021-03-16 | President And Fellows Of Harvard College | Adenosine nucleobase editors and uses thereof |
US11661590B2 (en) | 2016-08-09 | 2023-05-30 | President And Fellows Of Harvard College | Programmable CAS9-recombinase fusion proteins and uses thereof |
US12084663B2 (en) | 2016-08-24 | 2024-09-10 | President And Fellows Of Harvard College | Incorporation of unnatural amino acids into proteins using base editing |
US11542509B2 (en) | 2016-08-24 | 2023-01-03 | President And Fellows Of Harvard College | Incorporation of unnatural amino acids into proteins using base editing |
US11306324B2 (en) | 2016-10-14 | 2022-04-19 | President And Fellows Of Harvard College | AAV delivery of nucleobase editors |
US10912797B2 (en) | 2016-10-18 | 2021-02-09 | Intima Bioscience, Inc. | Tumor infiltrating lymphocytes and methods of therapy |
US11154574B2 (en) | 2016-10-18 | 2021-10-26 | Regents Of The University Of Minnesota | Tumor infiltrating lymphocytes and methods of therapy |
US10745677B2 (en) | 2016-12-23 | 2020-08-18 | President And Fellows Of Harvard College | Editing of CCR5 receptor gene to protect against HIV infection |
US11820969B2 (en) | 2016-12-23 | 2023-11-21 | President And Fellows Of Harvard College | Editing of CCR2 receptor gene to protect against HIV infection |
US11466271B2 (en) | 2017-02-06 | 2022-10-11 | Novartis Ag | Compositions and methods for the treatment of hemoglobinopathies |
US11898179B2 (en) | 2017-03-09 | 2024-02-13 | President And Fellows Of Harvard College | Suppression of pain by gene editing |
US11542496B2 (en) | 2017-03-10 | 2023-01-03 | President And Fellows Of Harvard College | Cytosine to guanine base editor |
US11851690B2 (en) | 2017-03-14 | 2023-12-26 | Editas Medicine, Inc. | Systems and methods for the treatment of hemoglobinopathies |
US11268082B2 (en) | 2017-03-23 | 2022-03-08 | President And Fellows Of Harvard College | Nucleobase editors comprising nucleic acid programmable DNA binding proteins |
US11963982B2 (en) | 2017-05-10 | 2024-04-23 | Editas Medicine, Inc. | CRISPR/RNA-guided nuclease systems and methods |
US11453891B2 (en) | 2017-05-10 | 2022-09-27 | The Regents Of The University Of California | Directed editing of cellular RNA via nuclear delivery of CRISPR/CAS9 |
US11560566B2 (en) | 2017-05-12 | 2023-01-24 | President And Fellows Of Harvard College | Aptazyme-embedded guide RNAs for use with CRISPR-Cas9 in genome editing and transcriptional activation |
US11814624B2 (en) | 2017-06-15 | 2023-11-14 | The Regents Of The University Of California | Targeted non-viral DNA insertions |
US10011849B1 (en) | 2017-06-23 | 2018-07-03 | Inscripta, Inc. | Nucleic acid-guided nucleases |
US10337028B2 (en) | 2017-06-23 | 2019-07-02 | Inscripta, Inc. | Nucleic acid-guided nucleases |
US10626416B2 (en) | 2017-06-23 | 2020-04-21 | Inscripta, Inc. | Nucleic acid-guided nucleases |
US9982279B1 (en) | 2017-06-23 | 2018-05-29 | Inscripta, Inc. | Nucleic acid-guided nucleases |
US11697826B2 (en) | 2017-06-23 | 2023-07-11 | Inscripta, Inc. | Nucleic acid-guided nucleases |
US10435714B2 (en) | 2017-06-23 | 2019-10-08 | Inscripta, Inc. | Nucleic acid-guided nucleases |
US11098325B2 (en) | 2017-06-30 | 2021-08-24 | Intima Bioscience, Inc. | Adeno-associated viral vectors for gene therapy |
US11866726B2 (en) | 2017-07-14 | 2024-01-09 | Editas Medicine, Inc. | Systems and methods for targeted integration and genome editing and detection thereof using integrated priming sites |
US11732274B2 (en) | 2017-07-28 | 2023-08-22 | President And Fellows Of Harvard College | Methods and compositions for evolving base editors using phage-assisted continuous evolution (PACE) |
US11319532B2 (en) | 2017-08-30 | 2022-05-03 | President And Fellows Of Harvard College | High efficiency base editors comprising Gam |
US11932884B2 (en) | 2017-08-30 | 2024-03-19 | President And Fellows Of Harvard College | High efficiency base editors comprising Gam |
US11795443B2 (en) | 2017-10-16 | 2023-10-24 | The Broad Institute, Inc. | Uses of adenosine base editors |
US11083753B1 (en) | 2017-10-27 | 2021-08-10 | The Regents Of The University Of California | Targeted replacement of endogenous T cell receptors |
US11033584B2 (en) | 2017-10-27 | 2021-06-15 | The Regents Of The University Of California | Targeted replacement of endogenous T cell receptors |
US11590171B2 (en) | 2017-10-27 | 2023-02-28 | The Regents Of The University Of California | Targeted replacement of endogenous T cell receptors |
US11331346B2 (en) | 2017-10-27 | 2022-05-17 | The Regents Of The University Of California | Targeted replacement of endogenous T cell receptors |
CN108103090A (en) * | 2017-12-12 | 2018-06-01 | 中山大学附属第一医院 | RNA Cas9-m6A modified vector system for targeting RNA methylation, and construction method and application thereof |
CN108103090B (en) * | 2017-12-12 | 2021-06-15 | 中山大学附属第一医院 | RNA Cas9-m6A modified vector system for targeting RNA methylation, and construction method and application thereof |
WO2019147743A1 (en) | 2018-01-26 | 2019-08-01 | Massachusetts Institute Of Technology | Structure-guided chemical modification of guide rna and its applications |
US12031132B2 (en) | 2018-03-14 | 2024-07-09 | Editas Medicine, Inc. | Systems and methods for the treatment of hemoglobinopathies |
US11447770B1 (en) | 2019-03-19 | 2022-09-20 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
US11795452B2 (en) | 2019-03-19 | 2023-10-24 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
US11643652B2 (en) | 2019-03-19 | 2023-05-09 | The Broad Institute, Inc. | Methods and compositions for prime editing nucleotide sequences |
US12123032B2 (en) | 2019-11-26 | 2024-10-22 | The Broad Institute, Inc. | CRISPR enzyme mutations reducing off-target effects |
US11912985B2 (en) | 2020-05-08 | 2024-02-27 | The Broad Institute, Inc. | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence |
US12031126B2 (en) | 2020-05-08 | 2024-07-09 | The Broad Institute, Inc. | Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence |
WO2021248102A1 (en) * | 2020-06-05 | 2021-12-09 | Flagship Pioneering Innovations Vi, Llc | Template guide rna molecules |
WO2022243540A1 (en) * | 2021-05-21 | 2022-11-24 | Aarhus Universitet | Lentivirus-derived nanoparticles comprising crispr/cas9 ribonucleoprotein complexes |
US12037407B2 (en) | 2021-10-14 | 2024-07-16 | Arsenal Biosciences, Inc. | Immune cells having co-expressed shRNAS and logic gate systems |
US12098399B2 (en) | 2022-06-24 | 2024-09-24 | Tune Therapeutics, Inc. | Compositions, systems, and methods for epigenetic regulation of proprotein convertase subtilisin/kexin type 9 (PCSK9) gene expression |
Also Published As
Publication number | Publication date |
---|---|
US10822606B2 (en) | 2020-11-03 |
US20210123046A1 (en) | 2021-04-29 |
US20160289673A1 (en) | 2016-10-06 |
US12049624B2 (en) | 2024-07-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US12049624B2 (en) | | Optimized small guide RNAs and methods of use |
US10781432B1 (en) | | Engineered cascade components and cascade complexes |
US20210403861A1 (en) | | Nucleotide-specific recognition sequences for designer tal effectors |
US20220220508A1 (en) | | Engineered casx systems |
US20170219596A1 (en) | | A protein tagging system for in vivo single molecule imaging and control of gene transcription |
US20240247286A1 (en) | | Methods for improved homologous recombination and compositions thereof |
AU2016316845B2 (en) | | Engineered CRISPR-Cas9 nucleases |
US11530421B2 (en) | | Self-inactivating endonuclease-encoding nucleic acids and methods of using the same |
EP2539445B1 (en) | | Use of endonucleases for inserting transgenes into safe harbor loci |
WO2018208755A1 (en) | | Compositions and methods for tagging target proteins in proximity to a nucleotide sequence of interest |
EP3390624A1 (en) | | Modified site-directed modifying polypeptides and methods of use thereof |
CN113373130A (en) | | Cas12 protein, gene editing system containing Cas12 protein and application |
WO2017107898A2 (en) | | Compositions and methods for gene editing |
WO2019041344A1 (en) | | Methods and compositions for single-stranded dna transfection |
WO2024198961A1 (en) | | Cas protein and mutant thereof, and corresponding gene editing system and use thereof |
US20230045187A1 (en) | | Compositions comprising a nuclease and uses thereof |
CN110499335B (en) | | CRISPR/SauriCas9 gene editing system and application thereof |
CN113039276A (en) | | Nuclease-mediated modification of nucleic acids |
WO2023086938A2 (en) | | Type v nucleases |
JP6956995B2 (en) | | Genome editing method |
WO2022187278A1 (en) | | Nucleic acid detection and analysis systems |
CN114380918B (en) | | System and method for single base editing of target RNA |
CN116622678A (en) | | Gene editing protein, corresponding gene editing system and application |
Chapter et al. | | CHAPTER II A-to-I RNA editing by using ADAR1 artificial deaminase system for restoration of genetic code in Ochre (UAA) stop codon |
CA3189662A1 (en) | | Compositions comprising a nuclease and uses thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 14848510; Country of ref document: EP; Kind code of ref document: A1 |
 | WWE | WIPO information: entry into national phase | Ref document number: 15025217; Country of ref document: US |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | EP: PCT application non-entry in European phase | Ref document number: 14848510; Country of ref document: EP; Kind code of ref document: A1 |