WO2023004409A1 - Guide RNAs for CRISPR/Cas editing systems

Info

Publication number
WO2023004409A1
Authority
WO
WIPO (PCT)
Prior art keywords
grna
adenosine deaminase
nls
composition
amino acid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2022/074041
Other languages
English (en)
French (fr)
Inventor
Brian CAFFERTY
Michael Packer
Yvonne ARATYN-SCHAUS
Lo-I CHENG
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beam Therapeutics Inc
Original Assignee
Beam Therapeutics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beam Therapeutics Inc
Priority to CN202280060813.2A (patent CN117916373A, zh)
Priority to KR1020247005580A (patent KR20240037299A, ko)
Priority to AU2022313315A (patent AU2022313315A1, en)
Priority to JP2024504461A (patent JP2024529425A, ja)
Priority to EP22761858.4A (patent EP4373931A1, en)
Priority to CA3226664A (patent CA3226664A1, en)
Publication of WO2023004409A1
Priority to US18/418,751 (patent US20240301405A1, en)

Classifications

    • C: CHEMISTRY; METALLURGY
      • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
        • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
          • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
            • C12N15/09: Recombinant DNA-technology
              • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
                • C12N15/102: Mutagenizing nucleic acids
              • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
                • C12N15/111: General methods applicable to biologically active non-coding nucleic acids
                • C12N15/113: Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
          • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
            • C12N9/14: Hydrolases (3)
              • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
                • C12N9/22: Ribonucleases [RNase]; Deoxyribonucleases [DNase]
          • C12N2310/00: Structure or type of the nucleic acid
            • C12N2310/10: Type of nucleic acid
              • C12N2310/20: Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
            • C12N2310/30: Chemical structure
              • C12N2310/31: Chemical structure of the backbone
                • C12N2310/315: Phosphorothioates
              • C12N2310/34: Spatial arrangement of the modifications
                • C12N2310/344: Position-specific modifications, e.g. on every purine, at the 3'-end
              • C12N2310/35: Nature of the modification
                • C12N2310/351: Conjugate
                  • C12N2310/3513: Protein; Peptide
          • C12N2320/00: Applications; Uses
            • C12N2320/30: Special therapeutic applications
              • C12N2320/34: Allele or polymorphism specific uses
      • C07: ORGANIC CHEMISTRY
        • C07K: PEPTIDES
          • C07K2319/00: Fusion polypeptide
            • C07K2319/01: Fusion polypeptide containing a localisation/targetting motif
              • C07K2319/09: Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal
    • A: HUMAN NECESSITIES
      • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
        • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
          • A61K31/00: Medicinal preparations containing organic active ingredients
            • A61K31/70: Carbohydrates; Sugars; Derivatives thereof
              • A61K31/7088: Compounds having three or more nucleosides or nucleotides
          • A61K48/00: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
            • A61K48/0008: Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition
        • A61P: SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
          • A61P3/00: Drugs for disorders of the metabolism

Definitions

  • CRISPR/Cas editing systems include the use of guide RNA molecules (gRNA) in association with Cas endonucleases, and related enzymes, for applications in gene editing as well as related systems, including base editing.
  • one or more gRNA molecules assemble with a Cas protein in a complex and guide the ribonucleic acid complex (RNP) to specific DNA (for example, in Cas9 and Cas12 systems) and/or RNA (for example, in Cas13 systems) sequences.
  • the invention provides, in some aspects, methods to produce gRNA conjugated to an NLS sequence (NLS-gRNA) that has increased potency for use in a CRISPR-Cas system, for example, increased frequency of successful editing events.
  • NLS-gRNA of the present invention can provide better trafficking of the gRNA to the nucleus, which protects it from cytosolic RNases and provides a higher local concentration of gRNA for formation of RNP.
  • NLS-gRNA of the present invention has significantly higher potency as compared to a counterpart gRNA without the NLS sequence and also shows a higher potency as compared to highly modified gRNAs.
  • the present invention provides, among other things, a guide RNA comprising a nuclear localization signal (NLS) linked to the gRNA through a linker, wherein the linker comprises a cysteine residue conjugated to the 3' end of the gRNA.
  • the linker comprises a cysteine residue at the N-terminus. In some embodiments, the linker comprises a cysteine residue at the C-terminus.
  • the linker comprises a cysteine residue at an internal site in the linker.
  • the linker is conjugated to the 3' end of the gRNA. In some embodiments, the linker is conjugated to the 5' end of the gRNA. In some embodiments, the linker is conjugated to an internal region in the gRNA. In some embodiments, the linker is conjugated to a first hairpin region in the gRNA. In some embodiments, the linker is conjugated to a second hairpin region in the gRNA. In some embodiments, the linker is conjugated to a bulge region in the gRNA. In some embodiments, the gRNA comprises one or more modifications. In some embodiments, one or more modifications comprise 2'-O-methyl (2'-OMe) modifications. In some embodiments, one or more modifications comprise 2'-fluoro modifications. In some embodiments, one or more modifications comprise phosphorothioate linkages.
  • gRNA does not comprise a backbone modification.
  • one or more modifications occur at 1, 2, 3, 4, 5, 6, 7, 8, and 9 nucleotides from the 3' end of the gRNA.
  • one or more modifications occur at 1, 2, 3, 4, 5, 6, 7, and 8 nucleotides from the 3' end of the gRNA.
  • one or more modifications occur at 1, 2, 3, 4, 5, 6, and 7 nucleotides from the 3' end of the gRNA.
  • one or more modifications occur at 1, 2, 3, 4, 5, and 6 nucleotides from the 3' end of the gRNA. In some embodiments, one or more modifications occur at 1, 2, 3, 4, and 5 nucleotides from the 3' end of the gRNA. In some embodiments, one or more modifications occur at 1, 2, 3, and 4 nucleotides from the 3' end of the gRNA. In some embodiments, one or more modifications occur at 1, 2, and 3 nucleotides from the 3' end of the gRNA. In some embodiments, one or more modifications occur at 1 and 2 nucleotides from the 3' end of the gRNA. In some embodiments, one or more modifications occur at 1 nucleotide from the 3' end of the gRNA.
  • one or more modifications occur at 1, 2, 3, 4, 5, 6, 7, 8, and 9 nucleotides from the 5' end of the gRNA. In some embodiments, one or more modifications occur at 1, 2, 3, 4, 5, 6, 7, and 8 nucleotides from the 5' end of the gRNA. In some embodiments, one or more modifications occur at 1, 2, 3, 4, 5, 6, and 7 nucleotides from the 5' end of the gRNA. In some embodiments, one or more modifications occur at 1, 2, 3, 4, 5, and 6 nucleotides from the 5' end of the gRNA. In some embodiments, one or more modifications occur at 1, 2, 3, 4, and 5 nucleotides from the 5' end of the gRNA.
  • one or more modifications occur at 1, 2, 3, and 4 nucleotides from the 5' end of the gRNA. In some embodiments, one or more modifications occur at 1, 2, and 3 nucleotides from the 5' end of the gRNA. In some embodiments, one or more modifications occur at 1 and 2 nucleotides from the 5' end of the gRNA. In some embodiments, one or more modifications occur at 1 nucleotide from the 5' end of the gRNA.
  • more than 10% of the gRNA is modified. In some embodiments, more than 20% of the gRNA is modified. In some embodiments, more than 30% of the gRNA is modified. In some embodiments, more than 35% of the gRNA is modified. In some embodiments, more than 40% of the gRNA is modified. In some embodiments, more than 45% of the gRNA is modified. In some embodiments, more than 50% of the gRNA is modified. In some embodiments, more than 55% of the gRNA is modified. In some embodiments, more than 60% of the gRNA is modified. In some embodiments, more than 65% of the gRNA is modified. In some embodiments, more than 70% of the gRNA is modified.
  • more than 75% of the gRNA is modified. In some embodiments, more than 80% of the gRNA is modified. In some embodiments, more than 85% of the gRNA is modified. In some embodiments, more than 88% of the gRNA is modified. In some embodiments, more than 90% of the gRNA is modified. In some embodiments, more than 95% of the gRNA is modified.
  • less than 10% of the gRNA is modified. In some embodiments, less than 20% of the gRNA is modified. In some embodiments, less than 30% of the gRNA is modified. In some embodiments, less than 35% of the gRNA is modified. In some embodiments, less than 40% of the gRNA is modified. In some embodiments, less than 45% of the gRNA is modified. In some embodiments, less than 50% of the gRNA is modified. In some embodiments, less than 55% of the gRNA is modified. In some embodiments, less than 60% of the gRNA is modified. In some embodiments, less than 65% of the gRNA is modified. In some embodiments, less than 70% of the gRNA is modified.
  • less than 75% of the gRNA is modified. In some embodiments, less than 80% of the gRNA is modified. In some embodiments, less than 85% of the gRNA is modified. In some embodiments, less than 88% of the gRNA is modified. In some embodiments, less than 90% of the gRNA is modified. In some embodiments, less than 95% of the gRNA is modified.
  • the gRNA is conjugated to one or more NLS sequences.
  • the gRNA may comprise about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the 3' end, about or more than about 1, 2, 3, 4, 5,
  • NLSs at or near the 5' end or a combination of these (e.g. one or more NLS at the 3' end and one or more NLS at the 5' end).
  • NLSs at the 3' end may be selected independently of the others, such that a single NLS may be present in more than one copy and/or in combination with one or more other NLSs present in one or more copies.
  • Non-limiting examples of NLSs include an NLS sequence derived from: the NLS of the SV40 virus large T-antigen having the amino acid sequence PKKKRKV (SEQ ID NO: 41); the NLS from nucleoplasmin (e.g. the nucleoplasmin bipartite NLS with the sequence KRPAATKKAGQAKKKK (SEQ ID NO: 42)); the c-myc NLS having the amino acid sequence PAAKRVKLD (SEQ ID NO: 43) or RQRRNELKRSP (SEQ ID NO: 44); the hRNPA1 M9 NLS having the sequence
  • the NLS is derived from simian virus 40 (SV40).
  • the NLS comprises an amino acid sequence of KKKRKV (SEQ ID NO: 57).
  • the NLS comprises a bipartite NLS.
  • the NLS comprises a bipartite NLS with SV40 NLS.
  • the linker further comprises a peptide spacer.
  • the peptide spacer comprises more than 2 amino acids. In some embodiments, the peptide spacer comprises more than 3 amino acids. In some embodiments, the peptide spacer comprises more than 4 amino acids. In some embodiments, the peptide spacer comprises more than 5 amino acids. In some embodiments, the peptide spacer comprises more than 6 amino acids. In some embodiments, the peptide spacer comprises more than 7 amino acids. In some embodiments, the peptide spacer comprises more than 8 amino acids. In some embodiments, the peptide spacer comprises more than 9 amino acids. In some embodiments, the peptide spacer comprises more than 10 amino acids.
  • the peptide spacer comprises more than 12 amino acids. In some embodiments, the peptide spacer comprises more than 15 amino acids. In some embodiments, the peptide spacer comprises more than 18 amino acids. In some embodiments, the peptide spacer comprises more than 20 amino acids. In some embodiments, the peptide spacer comprises more than 25 amino acids. In some embodiments, the peptide spacer comprises more than 30 amino acids.
  • the peptide spacer comprises 2-30 amino acids. In some embodiments, the peptide spacer comprises 5-25 amino acids. In some embodiments, the peptide spacer comprises 7-20 amino acids. In some embodiments, the peptide spacer comprises 7-15 amino acids. In some embodiments, the peptide spacer comprises 7-12 amino acids.
  • the peptide spacer comprises about 5 amino acids. In some embodiments, the peptide spacer comprises about 7 amino acids. In some embodiments, the peptide spacer comprises about 8 amino acids. In some embodiments, the peptide spacer comprises about 9 amino acids. In some embodiments, the peptide spacer comprises about 10 amino acids. In some embodiments, the peptide spacer comprises about 11 amino acids. In some embodiments, the peptide spacer comprises about 12 amino acids.
  • the peptide spacer comprises about 13 amino acids. In some embodiments, the peptide spacer comprises about 14 amino acids. In some embodiments, the peptide spacer comprises about 15 amino acids. In some embodiments, the peptide spacer comprises an amino acid sequence of KRTADGSEFESP (SEQ ID NO: 58). In some embodiments, the peptide spacer is 70% identical to amino acid sequence of KRTADGSEFESP. In some embodiments, the peptide spacer is 75% identical to amino acid sequence of KRTADGSEFESP. In some embodiments, the peptide spacer is 80% identical to amino acid sequence of KRTADGSEFESP.
  • the peptide spacer is 85% identical to amino acid sequence of KRTADGSEFESP. In some embodiments, the peptide spacer is 90% identical to amino acid sequence of KRTADGSEFESP. In some embodiments, the peptide spacer is 92% identical to amino acid sequence of KRTADGSEFESP. In some embodiments, the peptide spacer is 95% identical to amino acid sequence of KRTADGSEFESP. In some embodiments, the peptide spacer is 97% identical to amino acid sequence of KRTADGSEFESP. In some embodiments, the peptide spacer is 99% identical to amino acid sequence of KRTADGSEFESP.
  • the linker further comprises a chemical moiety that conjugates gRNA to the peptide spacer or to the NLS.
  • gRNA is conjugated to NLS via a linker.
  • said linker comprises a chemical moiety (e.g., L) and/or a peptidic moiety (e.g., a peptide spacer).
  • gRNA is conjugated to NLS directly via a chemical moiety (e.g., L).
  • a chemical moiety is non-peptidic.
  • gRNA is conjugated to NLS via a peptidic moiety (e.g., a peptide spacer).
  • gRNA is conjugated to NLS via a linker comprising both a chemical moiety (e.g., L) and a peptidic moiety (e.g., a peptide spacer).
  • such conjugates can have a structure according to Formula (I), where a chemical moiety L (e.g., a non-peptidic chemical moiety) is covalently attached to gRNA and a peptide spacer, and wherein the peptide spacer is covalently attached to NLS.
  • gRNA is conjugated to NLS via a chemical moiety (e.g., L) covalently attached to the C-terminus of the peptide spacer or the NLS amino acid sequence.
  • gRNA is conjugated to NLS via a chemical moiety (e.g., L) covalently attached to the N-terminus of the peptide spacer or the NLS amino acid sequence.
  • gRNA is conjugated to the peptide spacer or the NLS via a chemical moiety (e.g., L) covalently attached to the 3' end of the gRNA.
  • gRNA is conjugated to the peptide spacer or the NLS via a chemical moiety (e.g., L) covalently attached to the 5' end of the gRNA.
  • a thiol-containing residue (e.g., a cysteine residue)
  • a selenium-containing residue (e.g., a selenocysteine residue)
  • an amino-containing residue (e.g., a lysine residue)
  • a phenol-containing residue (e.g., a tyrosine residue)
  • amino acid residues used for formation of a linker comprise chemical modifications.
  • the guide RNA further comprises a nucleic acid linker sequence.
  • the nucleic acid linker sequence is an RNA sequence.
  • the nucleic acid linker sequence is positioned at the 5' end and/or 3' end of the guide RNA sequence.
  • the nucleic acid linker comprises about 1-50 nucleotides. In some embodiments, the nucleic acid linker comprises about 1-45 nucleotides. In some embodiments, the nucleic acid linker comprises about 1-40 nucleotides. In some embodiments, the nucleic acid linker comprises about 1-35 nucleotides. In some embodiments, the nucleic acid linker comprises about 1-30 nucleotides. In some embodiments, the nucleic acid linker comprises about 1-25 nucleotides. In some embodiments, the nucleic acid linker comprises about 1-20 nucleotides. In some embodiments, the nucleic acid linker comprises about 1-15 nucleotides. In some embodiments, the nucleic acid linker comprises about 1-10 nucleotides. In some embodiments, the nucleic acid linker comprises about 1-5 nucleotides.
  • the nucleic acid linker comprises about 5 nucleotides, about 10 nucleotides, about 15 nucleotides, about 20 nucleotides, about 25 nucleotides, about 30 nucleotides, about 35 nucleotides, about 40 nucleotides, about 45 nucleotides, or about 50 nucleotides.
  • the guide RNA does not comprise a nucleic acid linker.
  • the nucleic acid linker comprises about one nucleotide. In some embodiments, the nucleic acid linker comprises about 2 nucleotides. In some embodiments, the nucleic acid linker comprises about 3 nucleotides. In some embodiments, the nucleic acid linker comprises about 4 nucleotides. In some embodiments, the nucleic acid linker comprises about 5 nucleotides. In some embodiments, the nucleic acid linker comprises about 6 nucleotides. In some embodiments, the nucleic acid linker comprises about 7 nucleotides.
  • the nucleic acid linker comprises about 8 nucleotides. In some embodiments, the nucleic acid linker comprises about 9 nucleotides. In some embodiments, the nucleic acid linker comprises about 10 nucleotides. In some embodiments, the nucleic acid linker comprises about 11 nucleotides. In some embodiments, the nucleic acid linker comprises about 12 nucleotides. In some embodiments, the nucleic acid linker comprises about 13 nucleotides. In some embodiments, the nucleic acid linker comprises about 14 nucleotides. In some embodiments, the nucleic acid linker comprises about 15 nucleotides. In some embodiments, the nucleic acid linker comprises about 16 nucleotides.
  • the nucleic acid linker comprises about 17 nucleotides. In some embodiments, the nucleic acid linker comprises about 18 nucleotides. In some embodiments, the nucleic acid linker comprises about 19 nucleotides. In some embodiments, the nucleic acid linker comprises about 20 nucleotides. In some embodiments, the nucleic acid linker comprises about 21 nucleotides. In some embodiments, the nucleic acid linker comprises about 22 nucleotides. In some embodiments, the nucleic acid linker comprises about 23 nucleotides. In some embodiments, the nucleic acid linker comprises about 24 nucleotides. In some embodiments, the nucleic acid linker comprises about 25 nucleotides.
  • the nucleic acid linker comprises between about 50-
  • the nucleic acid linker comprises between about 100-150 nucleotides. In some embodiments, the nucleic acid linker comprises between about 150-200 nucleotides. In some embodiments, the nucleic acid linker comprises between about 200-500 nucleotides.
  • the nucleic acid linker sequence is a linear linker sequence. In some embodiments, the linker sequence is a non-linear sequence. In some embodiments, the linker sequence comprises RNA secondary structures.
  • the nucleic acid linker sequence is placed at the 3' end and/or the 5' end of the guide RNA sequence.
  • the gRNA comprising the NLS improves base editing efficiency as compared to a gRNA without the NLS. In some embodiments, the gRNA comprising the NLS improves base editing efficiency by at least 1.5-fold as compared to a gRNA without the NLS. In some embodiments, the gRNA comprising the NLS improves base editing efficiency by at least 2-fold as compared to a gRNA without the NLS. In some embodiments, the gRNA comprising the NLS improves base editing efficiency by at least 2.5-fold as compared to a gRNA without the NLS.
  • the gRNA comprising the NLS improves base editing efficiency by at least 3-fold as compared to a gRNA without the NLS. In some embodiments, the gRNA comprising the NLS improves base editing efficiency by at least 4-fold as compared to a gRNA without the NLS. In some embodiments, the gRNA comprising the NLS improves base editing efficiency by at least 5-fold as compared to a gRNA without the NLS.
  • the guide RNA further comprises a direct repeat sequence found in natural CRISPR systems.
  • the gRNA is a single guide RNA (sgRNA). In some embodiments, the gRNA is a tracrRNA. In some embodiments, the gRNA is a crRNA.
  • the guide RNA comprises a clustered regularly interspaced short palindromic repeats (CRISPR) RNA (crRNA). In some embodiments, the guide RNA further comprises a trans-activating RNA (tracrRNA).
  • the crRNA is modified. In some embodiments, the tracrRNA is modified. In some embodiments, the crRNA and/or tracrRNA comprise chemically modified nucleotides. In some embodiments, the tracrRNA comprises additional sequences that maintain folding. In some embodiments, the linker comprises chemically modified nucleotides. In some embodiments, the modifications to the crRNA, tracrRNA, and/or linker comprise one or more of 1) chemical modifications; 2) any nucleotide substitutions that preserve secondary structure; 3) alterations of the GC content; 4) addition of sequence to maintain predicted folding of tracrRNA.
  • the NLS-gRNA is an extended guide RNA, or a Cas9 guide RNA, or a Cas13 guide RNA, or a Cas12 guide RNA such as Cas12a guide RNA, Cas12b guide RNA, Cas12c guide RNA, Cas12d guide RNA, Cas12e guide RNA, Cas12f guide RNA, Cas12g guide RNA, Cas12h guide RNA, Cas12i guide RNA, Cas12j guide RNA, Cas12k guide RNA.
  • the NLS-gRNA is an extended guide RNA.
  • the NLS-gRNA is a Cas9 guide RNA. In some embodiments, the NLS-gRNA is a Cas13 guide RNA. In some embodiments, the NLS-gRNA is a Cas12 guide RNA. In some embodiments, the NLS-gRNA is a Cas12a guide RNA. In some embodiments, the NLS-gRNA is a Cas12b guide RNA. In some embodiments, the NLS-gRNA is a Cas12c guide RNA. In some embodiments, the NLS-gRNA is a Cas12d guide RNA. In some embodiments, the NLS-gRNA is a Cas12e guide RNA.
  • the NLS-gRNA is a Cas12f guide RNA. In some embodiments, the NLS-gRNA is a Cas12g guide RNA. In some embodiments, the NLS-gRNA is a Cas12h guide RNA. In some embodiments, the NLS-gRNA is a Cas12i guide RNA. In some embodiments, the NLS-gRNA is a Cas12j guide RNA. In some embodiments, the NLS-gRNA is a Cas12k guide RNA.
  • the NLS-gRNA comprises one or more of the following: a spacer, a lower stem, a bulge, an upper stem, a nexus and a hairpin.
  • the stem loop comprises GC base pairs.
  • the NLS-gRNA is produced at a yield of about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or more. Accordingly, in some embodiments, the NLS-gRNA is produced at a yield of about 50%. In some embodiments, the NLS-gRNA is produced at a yield of about 55%. In some embodiments, the NLS-gRNA is produced at a yield of about 60%. In some embodiments, the NLS-gRNA is produced at a yield of about 65%. In some embodiments, the NLS-gRNA is produced at a yield of about 70%.
  • the NLS-gRNA is produced at a yield of about 75%. In some embodiments, the NLS-gRNA is produced at a yield of about 80%. In some embodiments, the NLS-gRNA is produced at a yield of about 85%. In some embodiments, the NLS-gRNA is produced at a yield of about 90%. In some embodiments, the NLS-gRNA is produced at a yield of about 95%. In some embodiments, the NLS-gRNA is produced at a yield of more than 99%.
  • the NLS-gRNA is produced at 50%, 55%, 60%, 65%,
  • the NLS-gRNA is produced at 50% improvement in yield as compared to conventional synthetic methods. In some embodiments, the NLS-gRNA is produced at 55% improvement in yield as compared to conventional synthetic methods. In some embodiments, the NLS-gRNA is produced at 60% improvement in yield as compared to conventional synthetic methods. In some embodiments, the NLS-gRNA is produced at 65% improvement in yield as compared to conventional synthetic methods. In some embodiments, the NLS-gRNA is produced at 70% improvement in yield as compared to conventional synthetic methods.
  • the NLS-gRNA is produced at 75% improvement in yield as compared to conventional synthetic methods. In some embodiments, the NLS-gRNA is produced at 80% improvement in yield as compared to conventional synthetic methods. In some embodiments, the NLS-gRNA is produced at 85% improvement in yield as compared to conventional synthetic methods. In some embodiments, the NLS-gRNA is produced at 90% improvement in yield as compared to conventional synthetic methods. In some embodiments, the NLS-gRNA is produced at 95% improvement in yield as compared to conventional synthetic methods. In some embodiments, the NLS-gRNA is produced at 99% improvement in yield as compared to conventional synthetic methods. In some embodiments, the NLS-gRNA is produced at more than 99% improvement in yield as compared to conventional synthetic methods.
  • the NLS-gRNA has a length of about 40 nucleotides, about 100 nucleotides, about 125 nucleotides, about 150 nucleotides, about 175 nucleotides, about 200 nucleotides, or greater than about 200 nucleotides. Accordingly, in some embodiments, the NLS-gRNA has a length of about 40 nucleotides. In some embodiments, the NLS-gRNA has a length of about 100 nucleotides. In some embodiments, the NLS-gRNA has a length of about 125 nucleotides. In some embodiments, the NLS-gRNA has a length of about 150 nucleotides.
  • the NLS-gRNA has a length of about 175 nucleotides. In some embodiments, the NLS-gRNA has a length of about 200 nucleotides. In some embodiments, the NLS-gRNA has a length of greater than about 200 nucleotides. [0054] In some embodiments, the NLS-gRNA length is Cas dependent. For example, in some embodiments, the NLS-gRNA length for Cas12a is greater than 40 nucleotides. In some embodiments, the NLS-gRNA length for Cas9 is greater than 123 nucleotides. In some embodiments, the NLS-gRNA length for Cas9 is between 125-200 nucleotides.
  • the NLS-gRNA length for Cas9 is between 125-250 nucleotides. In some embodiments, the NLS-gRNA length for Cas9 is between 125-300 nucleotides. In some embodiments, the NLS-gRNA length for Cas9 is between 125-350 nucleotides. In some embodiments, the NLS-gRNA length for Cas9 is between 125-400 nucleotides. In some embodiments, the NLS-gRNA length for Cas9 is between 125-450 nucleotides. In some embodiments, the NLS-gRNA length for Cas9 is between 125-500 nucleotides.
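The Cas-dependent length embodiments above can be read as simple range checks. The following is a minimal sketch and not part of the disclosure: the function name, the inclusive treatment of the Cas9 bounds, and the default upper bound of 500 nucleotides are assumptions made for illustration. The lower Cas9 bound of 125 is taken from the "between 125-..." ranges rather than from the separate "greater than 123" embodiment.

```python
def grna_length_in_range(cas: str, length_nt: int, cas9_max: int = 500) -> bool:
    """Check an NLS-gRNA length against the Cas-dependent ranges in the text.

    Hypothetical helper: bounds for Cas9 are treated as inclusive, and
    cas9_max selects which of the stated upper limits (200 to 500) applies.
    """
    if cas == "Cas12a":
        # "greater than 40 nucleotides"
        return length_nt > 40
    if cas == "Cas9":
        # "between 125-<cas9_max> nucleotides"
        return 125 <= length_nt <= cas9_max
    raise ValueError(f"no length range stated for {cas!r}")


if __name__ == "__main__":
    print(grna_length_in_range("Cas12a", 41))
    print(grna_length_in_range("Cas9", 300, cas9_max=250))
```

Passing a different `cas9_max` (200, 250, 300, 350, 400, 450, or 500) selects one of the alternative Cas9 embodiments.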
  • the NLS-gRNA comprises one or more backbone modifications.
  • the one or more backbone modifications comprises a 2'
  • the one or more backbone modifications comprises a 2' O-methyl modification.
  • the one or more backbone modifications comprises a phosphorothioate modification.
  • the one or more backbone modifications is selected from 2'-O-methyl 3'-phosphorothioate, 2'-O-methyl, 2'-ribo 3'-phosphorothioate, 2'-fluoro, 2'-O-methoxyethyl morpholino (PMO), locked nucleic acid (LNA), deoxy, or 5' phosphate modification.
  • the one or more backbone modifications comprises a 2'-O-methyl 3'-phosphorothioate modification.
  • the one or more backbone modifications comprises a 2'-O-methyl modification.
  • the one or more modifications comprises a 2'-ribo 3'-phosphorothioate modification.
  • the one or more modifications comprises a 2'-fluoro modification. In some embodiments, the one or more modifications comprises a 2'-O-methoxyethyl morpholino (PMO). In some embodiments, the one or more modifications comprises a locked nucleic acid (LNA). In some embodiments, the one or more modifications comprises a deoxy modification. In some embodiments, the one or more modifications comprises a 5' phosphate modification.
  • RNA bases include, for example, 2'-(2-aminoethyl)-
  • O-methoxy-ethyl bases such as 2-MethoxyEthoxy A, 2-MethoxyEthoxy MeC, 2-MethoxyEthoxy G, 2-MethoxyEthoxy T.
  • modified bases include, for example, 2'-O-Methyl RNA bases and fluoro bases.
  • fluoro bases are known, and include for example, fluoro C, fluoro U, fluoro A, fluoro G bases.
  • Various 2'-OMethyl modifications can also be used with the methods described herein.
  • RNA comprising one or more of the following 2'-OMethyl modifications
  • the RNA comprises one or more of the following modifications: phosphorothioates, 2'-O-methyls, 2' fluoro (2'F), deoxy.
  • the RNA comprises 2'OMe modifications at the 3' end.
  • the RNA comprises 2'OMe modifications at the 5' end.
  • the RNA comprises 2'OMe modifications at the 3' end and 5' end.
  • the RNA comprises one or more of the following modifications: 2'-O-2-Methoxyethyl (MOE), locked nucleic acids, bridged nucleic acids, unlocked nucleic acids, peptide nucleic acids, morpholino nucleic acids.
  • MOE 2'-O-2-Methoxyethyl
  • the RNA comprises one or more of the following base modifications: 2,6-diaminopurine, 2-aminopurine, pseudouracil, N1-methyl-pseudouracil, 5' methyl cytosine, 2' pyrimidinone (zebularine), thymine.
  • modified bases include for example, 2-Aminopurine, 5-Bromo dU, deoxyUridine, 2,6-Diaminopurine (2-Amino-dA), Dideoxy-C, deoxyInosine, Hydroxymethyl dC, Inverted dT, Iso-dG, Iso-dC, Inverted Dideoxy-T, 5-Methyl dC, 5-Nitroindole, Super T®, 2'-F-r(C,U), 2'-NH2-r(C,U), 2,2'-Anhydro-U, 3'-Deoxy-r(A,C,G,U), 3'-O-Methyl-r(A,C,G,U), rT, rI, 5-Methyl-rC, 2-Amino-rA, rSpacer (Abasic), 7-Deaza-rG, 7-Deaza-rA, 8-Oxo-rG, 5-H
  • RNA can comprise a modified base such as, for example, 5' Int, 3' Azide (NHS Ester); 5' Hexynyl; 5' Int, 3' 5-Octadiynyl dU; 5', Int Biotin (Azide); 5', Int 6-FAM (Azide); and 5', Int 5-TAMRA (Azide).
  • modified base such as, for example, 5' Int, 3' Azide (NHS Ester); 5' Hexynyl; 5' Int, 3' 5-Octadiynyl dU; 5', Int Biotin (Azide); 5', Int 6-FAM (Azide); and 5', Int 5-TAMRA (Azide).
  • Other examples of RNA nucleotide modifications that can be used with the methods described herein include, for example, phosphorylation modifications, such as 5'-phosphorylation and 3'-phosphorylation.
  • the RNA can also have one or more of the following modifications: an amino modification, biotinylation, thiol modification, alkyne modifier, adenylation, Azide (NHS Ester), Cholesterol-TEG, and Digoxigenin (NHS Ester).
  • the method produces NLS-gRNA at a purity of about
  • the method produces NLS-gRNA at a purity of about 50%. In some embodiments, the method produces NLS-gRNA at a purity of about 60%. In some embodiments, the method produces NLS-gRNA at a purity of about 70%. In some embodiments, the method produces NLS-gRNA at a purity of about 80%. In some embodiments, the method produces NLS-gRNA at a purity of about 90%. In some embodiments, the method produces NLS-gRNA at a purity of about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more than 99%.
  • the method produces NLS-gRNA at a purity of about 91%. In some embodiments, the method produces NLS-gRNA at a purity of about 92%. In some embodiments, the method produces NLS-gRNA at a purity of about 93%. In some embodiments, the method produces NLS-gRNA at a purity of about 94%. In some embodiments, the method produces NLS-gRNA at a purity of about 95%. In some embodiments, the method produces NLS-gRNA at a purity of about 96%. In some embodiments, the method produces NLS-gRNA at a purity of about 97%. In some embodiments, the method produces NLS-gRNA at a purity of about 98%. In some embodiments, the method produces NLS-gRNA at a purity of about 99%. In some embodiments, the method produces NLS-gRNA at a purity of greater than about 99%.
  • the present invention provides, among other things, a composition comprising a guide RNA (gRNA) comprising a nuclear localization signal (NLS) linked to the gRNA through a linker, wherein the linker comprises a cysteine residue conjugated to the 3' end of the gRNA, wherein the NLS-guide RNA is encapsulated in a lipid nanoparticle (LNP).
  • gRNA guide RNA
  • NLS nuclear localization signal
  • the present invention provides, among other things, a composition comprising a guide RNA (gRNA) comprising a nuclear localization signal (NLS) linked to the gRNA through a linker, wherein the linker comprises a cysteine residue conjugated to the 3' end of the gRNA, wherein the NLS-guide RNA is associated with a lipid nanoparticle (LNP).
  • the composition comprises a nuclease. In some embodiments, the composition comprises a nucleic acid encoding a nuclease. In some embodiments, the composition comprises an mRNA encoding a nuclease.
  • the nuclease is conjugated to a NLS.
  • the Cas protein is conjugated to a NLS.
  • the Cas protein does not comprise a NLS.
  • the Cas protein is not conjugated to a NLS.
  • the Cas9 protein does not comprise a NLS.
  • the Cas9 protein is not conjugated to a NLS.
  • the composition comprises a NLS-gRNA and an mRNA encoding a nuclease.
  • the composition comprises a NLS-gRNA and an mRNA encoding a nuclease at 1:1 weight ratio.
  • the composition comprises a NLS-gRNA and an mRNA encoding a nuclease at 2:1 weight ratio. In some embodiments, the composition comprises a NLS-gRNA and an mRNA encoding a nuclease at 3:1 weight ratio. In some embodiments, the composition comprises a NLS-gRNA and an mRNA encoding a nuclease at 4:1 weight ratio. In some embodiments, the composition comprises a NLS-gRNA and an mRNA encoding a nuclease at 5:1 weight ratio. In some embodiments, the composition comprises a NLS-gRNA and an mRNA encoding a nuclease at 6:1 weight ratio.
  • the composition comprises a NLS-gRNA and an mRNA encoding a nuclease at 7:1 weight ratio. In some embodiments, the composition comprises a NLS-gRNA and an mRNA encoding a nuclease at 8:1 weight ratio. In some embodiments, the composition comprises a NLS-gRNA and an mRNA encoding a nuclease at 9:1 weight ratio. In some embodiments, the composition comprises a NLS-gRNA and an mRNA encoding a nuclease at 10:1 weight ratio. In some embodiments, the composition comprises a NLS-gRNA and an mRNA encoding a nuclease at 12:1 weight ratio. In some embodiments, the composition comprises a NLS-gRNA and an mRNA encoding a nuclease at 15:1 weight ratio.
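As a worked illustration of the weight ratios listed above, the sketch below splits a total RNA mass into NLS-gRNA and mRNA portions. It is not taken from the disclosure: the function name and the microgram unit are hypothetical, and it assumes each ratio is written as NLS-gRNA:mRNA, which the text implies by word order but does not state explicitly.

```python
def split_by_weight_ratio(total_ug: float, grna_parts: float, mrna_parts: float = 1.0):
    """Split a total RNA mass into (NLS-gRNA, mRNA) masses for a given weight ratio.

    Assumes the ratio is expressed as NLS-gRNA : mRNA (an assumption, see lead-in).
    """
    if total_ug < 0 or grna_parts <= 0 or mrna_parts <= 0:
        raise ValueError("mass must be non-negative and ratio parts positive")
    total_parts = grna_parts + mrna_parts
    grna_ug = total_ug * grna_parts / total_parts
    mrna_ug = total_ug - grna_ug
    return grna_ug, mrna_ug


if __name__ == "__main__":
    # 100 ug total RNA at a 3:1 ratio
    print(split_by_weight_ratio(100.0, 3))
```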
  • the nuclease is a CRISPR class 2 type II enzyme. In some embodiments, the nuclease is a CRISPR class 2 type V enzyme. In some embodiments, the nuclease is a CRISPR class 2 type VI enzyme. In some embodiments, the nuclease is a Cas9, Cpf1, SaCas9, Cas12, Cas13, or modified versions thereof. Accordingly, in some embodiments, the nuclease is a Cas9, or modified versions thereof. In some embodiments, the nuclease is a Cpf1, or modified versions thereof.
  • the nuclease is a Staphylococcus aureus Cas9 (SaCas9), or modified versions thereof. In some embodiments, the nuclease is a Streptococcus thermophilus 1 Cas9 (St1Cas9), or modified versions thereof. In some embodiments, the nuclease is a Streptococcus pyogenes Cas9 (SpCas9), or modified versions thereof. In some embodiments, the nuclease is a Cas12, or modified versions thereof.
  • the nuclease is a Cas13, or modified versions thereof.
  • the Cas9 comprises a nuclease dead Cas9 (dCas9). In some embodiments, the Cas9 comprises a Cas9 nickase (nCas9). In some embodiments, the Cas9 comprises a nuclease active Cas9. [0068] In some embodiments, the nuclease domain is fused to a heterologous polypeptide. In some embodiments the heterologous polypeptide includes an effector domain that is capable of making a modification to a nucleic acid (e.g., DNA).
  • a nucleic acid e.g., DNA
  • the DNA effector domain may be a deaminase domain, such as a cytidine deaminase domain, cytosine deaminase domain or an adenosine deaminase domain.
  • the deaminase domain is a cytidine deaminase domain, such as an APOBEC or AID cytidine deaminase.
  • the cytidine deaminase can be a deaminase from the apolipoprotein B mRNA-editing complex (APOBEC) family.
  • APOBEC apolipoprotein B mRNA-editing complex
  • the heterologous polypeptide is a cytidine or cytosine deaminase domain.
  • the heterologous polypeptide is a cytosine deaminase domain.
  • the heterologous polypeptide is a cytidine deaminase domain.
  • the heterologous polypeptide is an adenosine or adenine deaminase domain. In some embodiments, the heterologous polypeptide is an adenosine deaminase domain. In some embodiments, the heterologous polypeptide is an adenine deaminase domain.
  • a heterologous polypeptide is an adenosine deaminase variant domain.
  • the adenosine deaminase variant domain comprises one or more mutations with reference to SEQ ID NO: 3.
  • the adenosine deaminase variant domain comprises V82G.
  • the adenosine deaminase variant domain comprises Y147T/D.
  • the adenosine deaminase variant domain comprises Q154S.
  • the adenosine deaminase variant domain comprises L36H.
  • the adenosine deaminase variant domain comprises I76Y. In some embodiments, the adenosine deaminase variant domain comprises F149Y. In some embodiments, the adenosine deaminase variant domain comprises N157K. In some embodiments, the adenosine deaminase variant domain comprises V82G, Y147T/D and Q154S. In some embodiments, the adenosine deaminase variant domain comprises V82G, Y147T/D, Q154S, and L36H.
  • the adenosine deaminase variant domain comprises V82G, Y147T/D, Q154S, and I76Y. In some embodiments, the adenosine deaminase variant domain comprises V82G, Y147T/D, Q154S, and F149Y. In some embodiments, the adenosine deaminase variant domain comprises V82G, Y147T/D, Q154S, and N157K. In some embodiments, the adenosine deaminase variant domain comprises V82G, Y147T/D, Q154S, and D167N.
  • the adenosine deaminase variant domain comprises V82G, Y147T/D, Q154S, and one or more of L36H, I76Y, F149Y, N157K, and D167N.
  • the adenosine deaminase domain comprises mutations I76Y, V82G, Y147T, and Q154S.
  • the adenosine deaminase domain comprises mutations L36H, V82G, Y147T, Q154S, and N157K.
  • the adenosine deaminase domain comprises mutations V82G, Y147D, F149Y, Q154S, and D167N.
  • the adenosine deaminase domain comprises mutations L36H, V82G, Y147D, F149Y, Q154S, N157K, and D167N. In some embodiments, the adenosine deaminase domain comprises mutations L36H, I76Y, V82G, Y147T, Q154S, and N157K. In some embodiments, the adenosine deaminase domain comprises mutations I76Y, V82G, Y147D, F149Y, Q154S, and D167N. In some embodiments, the adenosine deaminase domain comprises mutations Y147D, F149Y, and D167N.
  • the adenosine deaminase domain comprises mutations L36H, I76Y, V82G, Q154S, and N157K. In some embodiments, the adenosine deaminase domain comprises mutations I76Y, V82G, and Q154S. In some embodiments, the adenosine deaminase domain comprises mutations L36H, I76Y, V82G, Y147D, F149Y, Q154S,
  • a heterologous polypeptide is fused to the N-terminus of a nuclease domain. In some embodiments, a heterologous polypeptide is fused to the C-terminus of a nuclease domain. In some embodiments, a heterologous polypeptide is internal to a nuclease domain. In some embodiments, a heterologous polypeptide is fused to the N-terminus of Cas9. In some embodiments, a heterologous polypeptide is fused to the C-terminus of Cas9. In some embodiments, a heterologous polypeptide is internal to Cas9.
  • an adenosine deaminase variant is fused to the N-terminus of Cas9. In some embodiments, an adenosine deaminase variant is fused to the C-terminus of Cas9. In some embodiments, an adenosine deaminase variant is internal to Cas9.
  • the NLS-gRNA is suitable for use with CRISPR/Cas systems. In some embodiments, the NLS-gRNA is suitable for use with CRISPR class 2 type II enzymes. In some embodiments, the NLS-gRNA is suitable for use with CRISPR class 2 type V enzymes. In some embodiments, the NLS-gRNA is suitable for use with CRISPR class 2 type VI enzymes. In some embodiments, the NLS-gRNA is suitable for use with Cas9, Cpf1, SaCas9, Cas12, Cas13, or modified versions thereof. Accordingly, in some embodiments, the NLS-gRNA is suitable for use with Cas9, or modified versions thereof.
  • the NLS-gRNA is suitable for use with Cpf1, or modified versions thereof. In some embodiments, the NLS-gRNA is suitable for use with SaCas9, or modified versions thereof. In some embodiments, the NLS-gRNA is suitable for use with Cas12, or modified versions thereof. In some embodiments, the NLS-gRNA is suitable for use with Cas13, or modified versions thereof. In some embodiments, the NLS-gRNA is in complex with the Cas enzyme.
  • RNA sequences are included that will be cleaved by the endonuclease activity of some Cas enzymes (e.g., Cas12a and Cas13) to linearize the gRNA prior to or during assembly with the Cas protein.
  • the NLS-gRNA provides increased stability and resistance to cellular exonucleases in comparison to gRNA without the NLS sequence. In some embodiments, the NLS-gRNA provides increased editing events in target cells using a CRISPR/Cas editing system.
  • the NLS-gRNA is in a complex with a CRISPR class 2 type II enzyme. In some embodiments, the NLS-gRNA is in a complex with a CRISPR class 2 type V enzyme. In some embodiments, the NLS-gRNA is in a complex with a CRISPR class 2 type VI enzyme. In some embodiments, the NLS-gRNA is in a complex with Cas9, Cpf1, SaCas9, Cas12, Cas13, or modified versions thereof.
  • a Cas protein complex comprising a Cas nuclease and a NLS-gRNA.
  • the Cas nuclease is a CRISPR class 2 type II enzyme.
  • the Cas nuclease is a CRISPR class 2 type V enzyme. In some embodiments, the Cas nuclease is a CRISPR class 2 type VI enzyme. In some embodiments, the Cas nuclease is selected from Cas9, Cpf1, SaCas9, Cas12, Cas13, or modified versions thereof.
  • a method for targeted transcription activation, targeted transcription repression, targeted epigenome modification, or targeted genome modification comprising introducing into a eukaryotic cell: (a) a NLS-conjugated guide RNA (NLS-gRNA), (b) at least one CRISPR/Cas protein or a nucleic acid encoding at least one CRISPR/Cas protein, wherein interactions between (a) and (b) and a target sequence in chromosomal DNA lead to targeted transcription activation, targeted transcription repression, targeted epigenome modification, or targeted genome modification.
  • NLS-gRNA NLS-conjugated guide RNA
  • RNA modification comprising introducing into a eukaryotic cell: (a) a NLS-conjugated guide RNA (NLS-gRNA) and (b) at least one CRISPR/Cas protein or a nucleic acid encoding the at least one CRISPR/Cas protein, wherein interactions between (a) and (b) and an RNA expressed by chromosomal DNA leads to a modification of the RNA expressed by the chromosomal DNA.
  • NLS-gRNA NLS-conjugated guide RNA
  • CRISPR/Cas protein or a nucleic acid encoding the at least one CRISPR/Cas protein
  • the RNA expressed by the chromosomal DNA is a messenger RNA (mRNA).
  • mRNA messenger RNA
  • the present invention provides a pharmaceutical composition comprising the NLS-gRNA of the present invention and a pharmaceutically acceptable carrier.
  • the present invention provides, among other things, a composition comprising an engineered or non-naturally occurring CRISPR associated Cas (CRISPR-Cas) system comprising: a Cas protein, a gRNA comprising a nuclear localization signal (NLS) linked to the gRNA through a linker, wherein the linker comprises a cysteine residue conjugated to the 3' end of the gRNA; and wherein the gRNA is capable of forming a complex with a Cas protein and targeting the Cas9 protein to a target DNA.
  • CRISPR-Cas CRISPR-Cas
  • the gRNA comprises a nucleic acid sequence: 5'-CAGUAUGGACACUGUCCAAA-3' (SEQ ID NO: 2).
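The SEQ ID NO: 2 sequence above can be sanity-checked programmatically. The sketch below is illustrative only and assumes that any whitespace inside the printed sequence is an extraction artifact rather than part of the sequence. The helper names are hypothetical, and the DNA form is simply the U-to-T transcription equivalent, not a statement about which genomic strand is targeted.

```python
def normalize_rna(raw: str) -> str:
    """Strip whitespace, upper-case, and verify that only A, C, G, U remain."""
    seq = "".join(raw.split()).upper()
    unexpected = set(seq) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


def rna_to_dna(seq: str) -> str:
    """Return the DNA-alphabet equivalent of an RNA sequence (U -> T)."""
    return seq.replace("U", "T")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C residues in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    spacer = normalize_rna("CAGUAUGGACACU GU CC AAA")
    print(spacer, len(spacer), rna_to_dna(spacer), gc_fraction(spacer))
```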
  • the present invention provides, among other things, a composition comprising an engineered or non-naturally occurring CRISPR associated Cas (CRISPR-Cas) system comprising: (a) a saCas9 protein; (b) an adenosine deaminase variant fused to the Cas9 protein; and (c) a gRNA comprising a nuclear localization signal (NLS) linked to the gRNA through a linker; wherein the linker comprises a cysteine residue conjugated to the 3' end of the gRNA; and wherein the gRNA is capable of forming a complex with a saCas9 protein and targeting the saCas9 protein to a target DNA; wherein the adenosine deaminase variant comprises V82G, Y147T/D, Q154S, and one or more of L36H, I76Y, F149Y, N157K, and D167N with reference to SEQ ID NO
  • the present invention provides, among other things, a method of treating a genetic disease in a subject in need thereof by administering to the subject the composition of the present invention (e.g., NLS-gRNA).
  • the present invention provides, among other things, a method of treating Glycogen Storage Disease Type 1a (GSD1a), the method comprising administering to the subject the composition of the present invention (e.g., NLS-gRNA).
  • GSD1a Glycogen Storage Disease Type 1a
  • a composition comprising gRNA conjugated to NLS wherein the nuclear delivery of the composition is increased by about 2 to 5 fold relative to a composition comprising gRNA without NLS. In some embodiments, the nuclear delivery of the composition is increased by about 2 fold relative to a composition comprising gRNA without NLS. In some embodiments, the nuclear delivery of the composition is increased by about 3 fold relative to a composition comprising gRNA without NLS. In some embodiments, the nuclear delivery is increased by about 4 fold relative to a composition comprising gRNA without NLS. In some embodiments, the nuclear delivery is increased by about 5 fold relative to a composition comprising gRNA without NLS.
  • the nuclear delivery is increased by greater than about 2 fold relative to a composition comprising gRNA without NLS. In some embodiments, the nuclear delivery is increased by 1.5 to 10 fold relative to a composition comprising gRNA without NLS. In some embodiments, the nuclear delivery is increased by greater than about 10 fold relative to a composition comprising gRNA without NLS.
  • the gRNA comprises a sequence with 70%, 80%, 90%,
  • the gRNA comprises a sequence with 70% identity to any one of sequences in Table 8. In some embodiments, the gRNA comprises a sequence with 75% identity to any one of sequences in Table 8. In some embodiments, the gRNA comprises a sequence with 80% identity to any one of sequences in Table 8. In some embodiments, the gRNA comprises a sequence with 85% identity to any one of sequences in Table 8. In some embodiments, the gRNA comprises a sequence with 90% identity to any one of sequences in Table 8. In some embodiments, the gRNA comprises a sequence with 95% identity to any one of sequences in Table 8. In some embodiments, the gRNA comprises a sequence with 99% identity to any one of sequences in Table 8. In some embodiments, the gRNA comprises a sequence with 100% identity to any one of sequences in Table 8.
  • a composition comprising gRNA conjugated to NLS, wherein gene editing efficiency is increased by about 2 to 5 fold relative to gRNA without NLS. In some embodiments, the gene editing efficiency is increased by about 2 fold relative to gRNA without NLS. In some embodiments, the gene editing efficiency is increased by about 3 fold relative to gRNA without NLS. In some embodiments, the gene editing efficiency is increased by about 4 fold relative to gRNA without NLS. In some embodiments, the gene editing efficiency is increased by about 5 fold relative to gRNA without NLS. In some embodiments, the gene editing efficiency is increased by about 1.5 to 10 fold relative to gRNA without NLS.
  • the gRNA target sequence has 70%, 80%, 90%, 95%,
  • the gRNA target sequence has 70% identity to SEQ ID NO: 17. In some embodiments, the gRNA target sequence has 75% identity to SEQ ID NO: 17. In some embodiments, the gRNA target sequence has 80% identity to SEQ ID NO: 17. In some embodiments, the gRNA target sequence has 85% identity to SEQ ID NO: 17. In some embodiments, the gRNA target sequence has 90% identity to SEQ ID NO: 17. In some embodiments, the gRNA target sequence has 95% identity to SEQ ID NO: 17. In some embodiments, the gRNA target sequence has 100% identity to SEQ ID NO: 17.
  • the gRNA targets one or more organs selected from liver, kidney, brain and heart. In some embodiments, the gRNA targets liver.
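The percent-identity thresholds above can be illustrated with a simple position-by-position comparison. This is a minimal sketch under a strong assumption: the two sequences are already aligned and of equal length. The text here does not specify an alignment method, and the example sequences are made up for illustration; they are not SEQ ID NO: 17 or entries from Table 8.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length (pre-aligned)")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Hypothetical 10-nt sequences differing at one position
    print(percent_identity("ACGUACGUAC", "ACGUACGUAA"))
```

For sequences of different lengths, a pairwise alignment step would be needed first; that choice is left open here.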
  • “A” or “an”: The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article.
  • an element means one element or more than one element.
  • Two events or entities are “associated” with one another, as that term is used herein, if the presence, level and/or form of one is correlated with that of the other.
  • a particular entity e.g., polypeptide
  • two or more entities are physically “associated” with one another if they interact, directly or indirectly, so that they are and remain in physical proximity with one another.
  • two or more entities that are physically associated with one another are covalently linked to one another; in some embodiments, two or more entities that are physically associated with one another are not covalently linked to one another but are non-covalently associated, for example by means of hydrogen bonds, van der Waals interaction, hydrophobic interactions, magnetism, and combinations thereof.
  • By “adenosine deaminase” or “adenine deaminase” is meant a polypeptide or fragment thereof capable of catalyzing the hydrolytic deamination of adenine or adenosine.
  • the deaminase or deaminase domain is an adenosine deaminase catalyzing the hydrolytic deamination of adenosine to inosine or deoxyadenosine to deoxyinosine.
  • the adenosine deaminase catalyzes the hydrolytic deamination of adenine or adenosine in deoxyribonucleic acid (DNA).
  • the adenosine deaminases (e.g., engineered adenosine deaminases, evolved adenosine deaminases) may be from any organism (e.g., eukaryotic, prokaryotic), including but not limited to algae, bacteria, fungi, plants, invertebrates (e.g., insects), and vertebrates (e.g., amphibians, mammals).
  • the adenosine deaminase is an adenosine deaminase variant with one or more alterations and is capable of deaminating both adenine and cytosine in a target polynucleotide (e.g., DNA, RNA).
  • the target polynucleotide is single- or double-stranded.
  • the adenosine deaminase variant is capable of deaminating both adenine and cytosine in DNA.
  • the adenosine deaminase variant is capable of deaminating both adenine and cytosine in single-stranded DNA.
  • the adenosine deaminase variant is capable of deaminating both adenine and cytosine in RNA.
  • By “adenosine deaminase activity” is meant catalyzing the deamination of adenine or adenosine to guanine in a polynucleotide.
  • an adenosine deaminase variant as provided herein maintains adenosine deaminase activity (e.g., at least about 30%, 40%, 50%, 60%, 70%, 80%, 90% or more of the activity of a reference adenosine deaminase (e.g., TadA*8.20 or TadA*8.19)).
  • ABE Adenosine Base Editor
  • By “Adenosine Base Editor 8 polypeptide” or “ABE8” is meant a base editor as defined herein comprising an adenosine deaminase variant comprising an alteration at amino acid position 82 and/or 166 of the following reference sequence:
  • ABE8 comprises further alterations, as described herein, relative to the reference sequence.
  • By “ABE8 polynucleotide” is meant a polynucleotide encoding an ABE8 polypeptide.
  • By “adenosine deaminase polynucleotide” is meant a polynucleotide encoding an adenosine deaminase polypeptide.
  • the adenosine deaminase polynucleotide encodes an adenosine deaminase variant comprising V82G, Y147T/D,
  • the adenosine deaminase polynucleotide encodes an adenosine deaminase variant comprising one of the following combinations of alterations: V82G + Y147T + Q154S; I76Y + V82G + Y147T + Q154S; L36H + V82G + Y147T + Q154S + N157K; V82G + Y147D + F149Y + Q154S + D167N; L36H + V82G + Y147D + F149Y + Q154S + N157K + D167N; L36H + I76Y + V82G + Y147T + Q154S + N157K; I76Y + V82G + Y147D + F149Y + Q154S + D167N; or L36H + I76Y + V82G + Y147D + F149Y + Q154S + N157K + D167N.
  • the deaminase or deaminase domain is a variant of a naturally occurring deaminase from an organism, such as a human, chimpanzee, gorilla, monkey, cow, dog, rat, or mouse. In some embodiments, the deaminase or deaminase domain does not occur in nature.
  • the deaminase or deaminase domain is at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.1%, at least 99.2%, at least 99.3%, at least 99.4%, at least 99.5%, at least 99.6%, at least 99.7%, at least 99.8%, or at least 99.9% identical to a naturally occurring deaminase.
  • the adenosine deaminase is from a bacterium, such as E. coli, S. aureus, B. subtilis, S. typhi, S. putrefaciens, H. influenzae, C. crescentus, or G. sulfurreducens.
  • a bacterium such as E. coli, S. aureus, B. subtilis, S. typhi, S. putrefaciens, H. influenzae, C. crescentus, or G. sulfurreducens.
  • the adenosine deaminase is a TadA deaminase.
  • the TadA deaminase is an E. coli TadA (ecTadA) deaminase or a fragment thereof.
  • the ecTadA deaminase is truncated ecTadA.
  • the truncated ecTadA may be missing one or more N-terminal amino acids relative to a full-length ecTadA.
  • the truncated ecTadA may be missing 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 N-terminal amino acid residues relative to the full length ecTadA. In some embodiments, the truncated ecTadA may be missing 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 C-terminal amino acid residues relative to the full length ecTadA. In some embodiments, the ecTadA deaminase does not comprise an N-terminal methionine. In some embodiments, the TadA deaminase is an N-terminal truncated TadA. In particular embodiments, the TadA is any one of the TadAs described in PCT/US2017/045381, which is incorporated herein by reference in its entirety.
  • the TadA deaminase is a TadA variant.
  • the TadA variant is TadA*7.10 comprising V82G, Y147T/D, Q154S, and one or more of L36H, I76Y, F149Y, N157K, and D167N.
  • the TadA variant is TadA*7.10 comprising a combination of alterations selected from among the following: V82G + Y147T + Q154S; I76Y + V82G + Y147T + Q154S; L36H + V82G + Y147T + Q154S + N157K; V82G + Y147D + F149Y + Q154S + D167N; L36H + V82G + Y147D + F149Y + Q154S + N157K + D167N; L36H + I76Y + V82G + Y147T + Q154S + N157K; I76Y + V82G + Y147D + F149Y + Q154S + D167N; or L36H + I76Y + V82G + Y147D + F149Y + Q154S + N157K + D167N.
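The alteration notation used above (for example V82G, or Y147T/D for alternative substitutions at one position) can be parsed mechanically. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the parser only handles single-residue substitutions of the form shown in the text, and the reference sequence in the test is a toy string, not SEQ ID NO: 3 or TadA*7.10.

```python
import re
from itertools import product

# <reference residue><1-based position><replacement>[/<alternative replacement>...]
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")


def parse_mutation(token: str):
    """Parse 'Y147T/D' into ('Y', 147, ['T', 'D'])."""
    match = _MUTATION.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognized alteration: {token!r}")
    ref, pos, alts = match.groups()
    return ref, int(pos), alts.split("/")


def expand_combination(combo: str):
    """Expand 'V82G + Y147T/D + Q154S' into every concrete combination."""
    parsed = [parse_mutation(tok) for tok in combo.split("+")]
    options = [[f"{ref}{pos}{alt}" for alt in alts] for ref, pos, alts in parsed]
    return [list(choice) for choice in product(*options)]


def apply_mutations(reference: str, mutations):
    """Apply concrete substitutions, checking the reference residue at each position."""
    residues = list(reference)
    for mutation in mutations:
        ref, pos, alts = parse_mutation(mutation)
        if len(alts) != 1:
            raise ValueError(f"ambiguous alteration: {mutation!r}")
        if pos > len(residues) or residues[pos - 1] != ref:
            raise ValueError(f"reference mismatch for {mutation!r}")
        residues[pos - 1] = alts[0]
    return "".join(residues)


if __name__ == "__main__":
    print(expand_combination("V82G + Y147T/D + Q154S"))
```

Checking the reference residue before substituting guards against numbering a variant against the wrong reference sequence.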
  • the TadA variant is MSP605, MSP680, MSP823, MSP824, MSP825, MSP827, MSP
  • By “base editor (BE)” or “nucleobase editor (NBE)” is meant an agent that binds a polynucleotide and has nucleobase modifying activity.
  • the base editor comprises a nucleobase modifying polypeptide (e.g., a deaminase) and a polynucleotide programmable nucleotide binding domain in conjunction with a guide polynucleotide (e.g., guide RNA).
  • a nucleobase modifying polypeptide e.g., a deaminase
  • a guide polynucleotide e.g., guide RNA
  • the agent is a biomolecular complex comprising a protein domain having base editing activity, i.e., a domain capable of modifying a base (e.g., A, T, C, G, or U) within a nucleic acid molecule (e.g., DNA).
  • a protein domain having base editing activity i.e., a domain capable of modifying a base (e.g., A, T, C, G, or U) within a nucleic acid molecule (e.g., DNA).
  • the polynucleotide programmable DNA binding domain is fused or linked to a deaminase domain.
  • the agent is a fusion protein comprising one or more domains having base editing activity.
  • the protein domains having base editing activity are linked to the guide RNA (e.g., via an RNA binding motif on the guide RNA and an RNA binding domain fused to the deaminase).
  • the domains having base editing activity are capable of deaminating a base within a nucleic acid molecule.
  • the base editor is capable of deaminating one or more bases within a DNA molecule.
  • the base editor is capable of deaminating a nitrogenous base within DNA.
  • the base editor is capable of deaminating a nitrogenous base within RNA.
  • the base editor is capable of deaminating a ribonucleoside.
  • the base editor is capable of deaminating a deoxyribonucleoside.
  • the base editor is capable of deaminating a cytosine.
  • the base editor is capable of deaminating a cytidine. In some embodiments, the base editor is capable of deaminating an adenosine. In some embodiments, the base editor is capable of deaminating a cytosine (C) or an adenosine (A) within DNA. In some embodiments, the base editor is capable of deaminating a cytosine (C) and an adenosine (A) within DNA. In some embodiments, the base editor is a cytidine base editor (CBE). In some embodiments, the base editor is an adenosine base editor (ABE).
  • the base editor is an adenosine base editor (ABE) and a cytidine base editor (CBE).
  • the base editor is a nuclease-inactive Cas9 (dCas9) fused to an adenosine deaminase.
  • the base editor is fused to an inhibitor of base excision repair, for example, a UGI domain, or a dISN domain.
  • the fusion protein comprises a Cas9 nickase fused to a deaminase and an inhibitor of base excision repair, such as a UGI or dISN domain.
  • the base editor is an abasic base editor.
  • Base editing activity is meant acting to chemically alter a base within a polynucleotide (e.g., by deaminating the base).
  • a first base is converted to a second base.
  • the base editing activity is cytidine deaminase activity, e.g., converting target C•G to T•A.
  • the base editing activity is adenosine or adenine deaminase activity, e.g., converting A•T to G•C.
  • the base editing activity is cytidine deaminase activity, e.g., converting target C•G to T•A, and adenosine or adenine deaminase activity, e.g., converting A•T to G•C.
  • base editor system refers to a system for editing a nucleobase of a target nucleotide sequence.
  • the base editor (BE) system comprises (1) a polynucleotide programmable nucleotide binding domain (e.g., Cas9), a deaminase domain and a cytidine deaminase domain for deaminating nucleobases in the target nucleotide sequence; and (2) one or more guide polynucleotides (e.g., guide RNA) in conjunction with the polynucleotide programmable nucleotide binding domain.
  • the base editor (BE) system comprises a nucleobase editor domain selected from an adenosine deaminase or a cytidine deaminase, and a domain having nucleic acid sequence specific binding activity.
  • the base editor system comprises (1) a base editor (BE) comprising a polynucleotide programmable DNA binding domain and a deaminase domain for deaminating one or more nucleobases in a target nucleotide sequence; and (2) one or more guide RNAs in conjunction with the polynucleotide programmable DNA binding domain.
  • the polynucleotide programmable nucleotide binding domain is a polynucleotide programmable DNA binding domain.
  • the base editor is a cytidine base editor (CBE). In some embodiments, the base editor is an adenine or adenosine base editor (ABE). In some embodiments, the base editor is an adenine or adenosine base editor (ABE) or a cytidine base editor (CBE).
  • biologically active refers to a characteristic of any agent that has activity in a biological system, and particularly in an organism. For instance, an agent that, when administered to an organism, has a biological effect on that organism, is considered to be biologically active.
  • a portion of that peptide that shares at least one biological activity of the peptide is typically referred to as a “biologically active” portion.
  • cleavage refers to a break in a target nucleic acid created by a nuclease of a CRISPR system described herein.
  • the cleavage event is a double-stranded DNA break.
  • the cleavage event is a single-stranded DNA break.
  • the cleavage event is a single-stranded RNA break.
  • the cleavage event is a double-stranded RNA break.
  • Complementary By “complementary” or “complementarity” is meant that a nucleic acid can form hydrogen bond(s) with another nucleic acid sequence by either traditional Watson-Crick or Hoogsteen base pairing.
  • Complementary base pairing includes not only G-C and A-T base pairing, but also includes base pairing involving universal bases, such as inosine.
  • a percent complementarity indicates the percentage of contiguous residues in a nucleic acid molecule that can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, or 10 nucleotides out of a total of 10 nucleotides in the first oligonucleotide being base paired to a second nucleic acid sequence having 10 nucleotides represents 50%, 60%, 70%, 80%, 90%, and 100% complementarity respectively).
  • the percentage of contiguous residues in a nucleic acid molecule that can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence is calculated and rounded to the nearest whole number (e.g., 12, 13, 14, 15, 16, or 17 nucleotides out of a total of 23 nucleotides in the first oligonucleotide being base paired to a second nucleic acid sequence having 23 nucleotides represents 52%, 57%, 61%, 65%, 70%, and 74%, respectively; and has at least 50%, 50%, 60%, 60%, 70%, and 70% complementarity, respectively).
  • substantially complementary refers to complementarity between the strands such that they are capable of hybridizing under biological conditions. Substantially complementary sequences have 60%, 70%, 80%, 90%, 95%, or even 100% complementarity. Additionally, techniques to determine if two strands are capable of hybridizing under biological conditions by examining their nucleotide sequences are well known in the art.
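The percent-complementarity arithmetic in the examples above can be sketched as follows. This is a minimal illustration, not part of the disclosure: the function names are hypothetical, and the sketch takes the number of base-paired residues as an input rather than deriving it from sequences.

```python
def percent_complementarity(paired: int, total: int) -> int:
    """Percent of residues that base pair, rounded to the nearest whole number."""
    if total <= 0 or not 0 <= paired <= total:
        raise ValueError("need total > 0 and 0 <= paired <= total")
    # Integer arithmetic (round half up) avoids floating-point rounding surprises.
    return (200 * paired + total) // (2 * total)


def at_least_decile(percent: int) -> int:
    """Largest multiple of 10 not exceeding the percentage ("at least N%")."""
    return (percent // 10) * 10


# The 23-nucleotide example from the text: 12 to 17 paired residues.
percents = [percent_complementarity(n, 23) for n in range(12, 18)]
print(percents)                                # [52, 57, 61, 65, 70, 74]
print([at_least_decile(p) for p in percents])  # [50, 50, 60, 60, 70, 70]
```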
  • CRISPR-Cas9 system refers to nucleic acids and/or proteins involved in the expression of, or directing the activity of, CRISPR-effectors, including sequences encoding CRISPR effectors, RNA guides, and other sequences and transcripts from a CRISPR locus.
  • the CRISPR system is an engineered, non-naturally occurring CRISPR system.
  • the components of a CRISPR system may include a nucleic acid(s) (e.g., a vector) encoding one or more components of the system, a component(s) in protein form, or a combination thereof.
  • CRISPR array refers to the nucleic acid (e.g., DNA) segment that includes CRISPR repeats and spacers.
  • the CRISPR array includes CRISPR repeats and spacers, starting with the first nucleotide of the first CRISPR repeat and ending with the last nucleotide of the last (terminal) CRISPR repeat.
  • each spacer in a CRISPR array is located between two repeats.
  • CRISPR repeat or “CRISPR direct repeat,” or “direct repeat,” as used herein, refer to multiple short direct repeating sequences, which show very little or no sequence variation within a CRISPR array.
  • CRISPR-associated protein (Cas): The term "CRISPR-associated protein,”
  • CRISPR effector refers to a protein that carries out an enzymatic activity and/or that binds to a target site on a nucleic acid specified by a RNA guide.
  • a CRISPR effector has endonuclease activity, nickase activity, exonuclease activity, transposase activity, and/or excision activity.
  • the CRISPR effector is nuclease inactive.
  • crRNA The term “CRISPR RNA” or “crRNA,” as used herein, refers to an RNA molecule including a guide sequence used by a CRISPR effector to target a specific nucleic acid sequence.
  • crRNAs typically contain a sequence that mediates target recognition and a sequence that forms a duplex with a tracrRNA.
  • the crRNA:tracrRNA duplex binds to a CRISPR effector.
  • duplex refers to a double helical structure formed by the interaction of two single stranded nucleic acids.
  • a duplex is typically formed by the pairwise hydrogen bonding of bases, i.e., "base pairing", between two single stranded nucleic acids which are oriented antiparallel with respect to each other.
  • Base pairing in duplexes generally occurs by Watson-Crick base pairing, e.g., guanine (G) forms a base pair with cytosine (C) in DNA and RNA, adenine (A) forms a base pair with thymine (T) in DNA, and adenine (A) forms a base pair with uracil (U) in RNA.
  • duplexes are stabilized by stacking interactions between adjacent nucleotides.
  • a duplex may be established or maintained by base pairing or by stacking interactions.
  • a duplex is formed by two complementary nucleic acid strands, which may be substantially complementary or fully complementary. Single-stranded nucleic acids that base pair over a number of bases are said to "hybridize.”
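The antiparallel Watson-Crick pairing described above can be illustrated with a short sketch. It assumes both strands are written 5' to 3', considers only canonical pairs (no Hoogsteen pairing, universal bases, or stacking effects), and uses hypothetical function names.

```python
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str, pairs: dict = DNA_PAIRS) -> str:
    """Return the strand that pairs antiparallel with seq, written 5'->3'."""
    return "".join(pairs[base] for base in reversed(seq.upper()))


def is_fully_complementary(strand_a: str, strand_b: str, pairs: dict = DNA_PAIRS) -> bool:
    """True if two 5'->3' strands can form a duplex with every base paired."""
    return strand_b.upper() == reverse_complement(strand_a, pairs)


print(reverse_complement("ATGC"))              # GCAT
print(is_fully_complementary("ATGC", "GCAT"))  # True
```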
  • ex vivo refers to events that occur in cells or tissues, grown outside rather than within a multi-cellular organism.
  • Functional equivalent or analog denotes, in the context of a functional derivative of an amino acid sequence, a molecule that retains a biological activity (either functional or structural) that is substantially similar to that of the original sequence.
  • a functional derivative or equivalent may be a natural derivative or may be prepared synthetically.
  • Exemplary functional derivatives include amino acid sequences having substitutions, deletions, or additions of one or more amino acids, provided that the biological activity of the protein is conserved.
  • the substituting amino acid desirably has chemico-physical properties that are similar to those of the substituted amino acid. Desirable similar chemico-physical properties include similarities in charge, bulkiness, hydrophobicity, hydrophilicity, and the like.
  • Half-Life As used herein, the term “half-life” is the time required for a quantity such as protein concentration or activity to fall to half of its value as measured at the beginning of a time period.
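As an illustration of the half-life definition, the sketch below assumes first-order (exponential) decay. That assumption is not stated in the definition itself, which only requires the quantity to fall to half of its starting value. The function names are hypothetical.

```python
import math


def remaining_fraction(elapsed: float, half_life: float) -> float:
    """Fraction of the starting quantity left after `elapsed` time units."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (elapsed / half_life)


def half_life_from_rate(k: float) -> float:
    """Half-life for a first-order decay constant k (same time units as 1/k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return math.log(2) / k


print(remaining_fraction(10, 5))  # 0.25 after two half-lives
```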
  • Hybridize is meant to form a double-stranded molecule between complementary polynucleotide sequences (e.g., a gene described herein), or portions thereof, under various conditions of stringency.
  • Hybridization occurs by hydrogen bonding, which may be Watson-Crick, Hoogsteen or reversed Hoogsteen hydrogen bonding, between complementary nucleobases.
  • adenine and thymine are complementary nucleobases that pair through the formation of hydrogen bonds.
  • improve As used herein, the terms “improve,” “increase” or “reduce,” or grammatical equivalents, indicate values that are relative to a baseline measurement, such as a measurement in the same individual prior to initiation of the treatment described herein, or a measurement in a control subject (or multiple control subjects) in the absence of the treatment described herein.
  • a “control subject” is a subject afflicted with the same form of disease as the subject being treated, who is about the same age as the subject being treated.
  • Indel refers to insertion or deletion of bases in a nucleic acid sequence. It commonly results in mutations and is a common form of genetic variation.
  • inhibiting a protein or a gene refers to processes or methods of decreasing or reducing activity and/or expression of a protein or a gene of interest.
  • inhibiting a protein or a gene refers to reducing expression or a relevant activity of the protein or gene by at least 10% or more, for example, 20%, 30%, 40%, or 50%, 60%, 70%, 80%, 90% or more, or a decrease in expression or the relevant activity of greater than 1-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, 50-fold, 100-fold or more as measured by one or more methods described herein or recognized in the art.
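The percentage thresholds in the definition of inhibition can be made concrete with a small sketch. The function names and the default 10% threshold argument are illustrative choices, and the input values stand for any expression or activity measurement taken by a method of the reader's choosing.

```python
def percent_reduction(baseline: float, treated: float) -> float:
    """Percent decrease of `treated` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - treated) / baseline


def is_inhibited(baseline: float, treated: float, threshold: float = 10.0) -> bool:
    """True if the reduction meets the chosen percentage threshold."""
    return percent_reduction(baseline, treated) >= threshold


print(percent_reduction(200, 100))  # 50.0
print(is_inhibited(100, 95))        # False: only a 5% reduction
```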
  • in vitro refers to events that occur in an artificial environment, e.g., in a test tube or reaction vessel, in cell culture, etc., rather than within a multi-cellular organism.
  • in vivo refers to events that occur within a multi-cellular organism, such as a human and a non-human animal. In the context of cell-based systems, the term may be used to refer to events that occur within a living cell (as opposed to, for example, in vitro systems).
  • the linker or spacer is a nucleotide or amino acid sequence that physically separates the terminal positions of the gRNA sequence from the NLS sequence to enable Cas binding and function of the gRNA.
  • the linker is RNA.
  • the linker is a chemical moiety.
  • the linker is a peptide.
  • the linker is DNA.
  • the linker is a chemical linker, for example, PEG9/18.
  • the linker is a DNA linker.
  • Oligonucleotide As used herein, the term “oligonucleotide” generally refers to polynucleotides of between about 5 and about 100 nucleotides of single- or double-stranded DNA. Oligonucleotides are also known as “oligomers” or “oligos” and may be isolated from genes, or chemically synthesized.
  • PAM: The term “PAM” or “Protospacer Adjacent Motif” refers to a short nucleic acid sequence (usually 2-6 base pairs in length) that follows the nucleic acid region targeted for cleavage by the CRISPR system, such as CRISPR-Cas9. A PAM may be required for a Cas nuclease to cut and is generally found 3-4 nucleotides downstream from the cut site.
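The relationship between a PAM and the cut site can be sketched as a simple scan. The sketch assumes the commonly cited SpCas9 convention (an NGG PAM immediately 3' of a 20-nucleotide protospacer, with the cut 3 nucleotides upstream of the PAM); other Cas proteins use different PAMs and spacings. It scans only the strand given, and the function name is hypothetical.

```python
import re


def find_ngg_sites(seq: str, protospacer_len: int = 20):
    """Return (protospacer, pam, cut_index) for each NGG PAM on the given strand.

    cut_index is the 0-based index of the first base 3' of the assumed cut,
    i.e. 3 nucleotides upstream of the PAM (SpCas9 convention, an assumption).
    """
    seq = seq.upper()
    sites = []
    # A lookahead reports overlapping PAM candidates as well.
    for match in re.finditer(r"(?=[ACGT]GG)", seq):
        pam_start = match.start()
        if pam_start >= protospacer_len:
            protospacer = seq[pam_start - protospacer_len:pam_start]
            sites.append((protospacer, seq[pam_start:pam_start + 3], pam_start - 3))
    return sites


print(find_ngg_sites("A" * 20 + "TGG"))
# [('AAAAAAAAAAAAAAAAAAAA', 'TGG', 17)]
```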
  • Polypeptide refers to a sequential chain of amino acids linked together via peptide bonds. The term is used to refer to an amino acid chain of any length, but one of ordinary skill in the art will understand that the term is not limited to lengthy chains and can refer to a minimal chain comprising two amino acids linked together via a peptide bond. As is known to those skilled in the art, polypeptides may be processed and/or modified. As used herein, the terms “polypeptide” and “peptide” are used interchangeably.
  • Prevent As used herein, the term “prevent” or “prevention”, when used in connection with the occurrence of a disease, disorder, and/or condition, refers to reducing the risk of developing the disease, disorder and/or condition.
  • Protein refers to one or more polypeptides that function as a discrete unit. If a single polypeptide is the discrete functioning unit and does not require permanent or temporary physical association with other polypeptides in order to form the discrete functioning unit, the terms “polypeptide” and “protein” may be used interchangeably. If the discrete functional unit is comprised of more than one polypeptide that physically associate with one another, the term “protein” refers to the multiple polypeptides that are physically coupled and function together as the discrete unit.
  • references: A “reference” entity, system, amount, set of conditions, etc., is one against which a test entity, system, amount, set of conditions, etc. is compared as described herein.
  • a “reference” antibody is a control antibody that is not engineered as described herein.
  • RNA guide refers to an RNA molecule that facilitates the targeting of a protein described herein to a target nucleic acid.
  • exemplary “RNA guides” or “guide RNAs” include, but are not limited to, crRNAs or crRNAs in combination with cognate tracrRNAs. The latter may be independent RNAs or fused as a single RNA using a linker (sgRNAs).
  • the RNA guide is engineered to include a chemical or biochemical modification. In some embodiments, an RNA guide may include one or more nucleotides.
  • RNA guide or “guide RNA” also refers to NLS-gRNA.
  • Single Strand Ligase means a ligase that does not require an oligonucleotide splint or a template for its ligating activity.
  • Splint or Oligonucleotide Splint refers to a single stranded RNA or DNA or other polymer that is capable of hybridizing with at least two, three or more single stranded RNA nucleotides.
  • the splint can refer to an oligonucleotide splint.
  • Subject means any subject for whom diagnosis, prognosis, or therapy is desired.
  • a subject can be a mammal, e.g., a human or non-human primate (such as an ape, monkey, orangutan, or chimpanzee), a dog, cat, guinea pig, rabbit, rat, mouse, horse, cattle, or cow.
  • sgRNA The term “sgRNA,” “single guide RNA,” or “guide RNA” refers to a single guide RNA containing (i) a guide sequence (crRNA sequence) and (ii) a Cas9 nuclease-recruiting sequence (tracrRNA).
  • Substantial identity is used herein to refer to a comparison between amino acid or nucleic acid sequences. As will be appreciated by those of ordinary skill in the art, two sequences are generally considered to be “substantially identical” if they contain identical residues in corresponding positions. As is well known in this art, amino acid or nucleic acid sequences may be compared using any of a variety of algorithms, including those available in commercial computer programs such as BLASTN for nucleotide sequences and BLASTP, gapped BLAST, and PSI-BLAST for amino acid sequences. Exemplary such programs are described in Altschul, et al., Basic local alignment search tool, J. Mol.
  • two sequences are considered to be substantially identical if at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more of their corresponding residues are identical over a relevant stretch of residues.
  • the relevant stretch is a complete sequence.
  • the relevant stretch is at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500 or more residues.
  • Target nucleic acid refers to nucleotides of any length (oligonucleotides or polynucleotides) to which the CRISPR-Cas9 system binds, either deoxyribonucleotides, ribonucleotides, or analogs thereof.
  • Target nucleic acids may have three-dimensional structure, may include coding or non-coding regions, may include exons, introns, mRNA, tRNA, rRNA, siRNA, shRNA, miRNA, ribozymes, cDNA, plasmids, vectors, exogenous sequences, endogenous sequences.
  • a target nucleic acid can comprise modified nucleotides, including methylated nucleotides, or nucleotide analogs.
  • a target nucleic acid may be interspersed with non-nucleic acid components.
  • a target nucleic acid includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.
  • therapeutically effective amount refers to an amount of a therapeutic molecule (e.g., an engineered antibody described herein) which confers a therapeutic effect on a treated subject, at a reasonable benefit/risk ratio applicable to any medical treatment.
  • the therapeutic effect may be objective (i.e., measurable by some test or marker) or subjective (i.e., subject gives an indication of or feels an effect).
  • the “therapeutically effective amount” refers to an amount of a therapeutic molecule or composition effective to treat, ameliorate, or prevent a particular disease or condition, or to exhibit a detectable therapeutic or preventative effect, such as by ameliorating symptoms associated with the disease, preventing or delaying the onset of the disease, and/or also lessening the severity or frequency of symptoms of the disease.
  • a therapeutically effective amount can be administered in a dosing regimen that may comprise multiple unit doses.
  • a therapeutically effective amount (and/or an appropriate unit dose within an effective dosing regimen) may vary, for example, depending on route of administration, or combination with other pharmaceutical agents.
  • the specific therapeutically effective amount (and/or unit dose) for any particular subject may depend upon a variety of factors including the disorder being treated and the severity of the disorder; the activity of the specific pharmaceutical agent employed; the specific composition employed; the age, body weight, general health, sex and diet of the subject; the time of administration, route of administration, and/or rate of excretion or metabolism of the specific therapeutic molecule employed; the duration of the treatment; and like factors as is well known in the medical arts.
  • tracrRNA refers to an RNA including a sequence that forms a structure required for a CRISPR-associated protein to bind to a specified target nucleic acid.
  • treatment refers to any administration of a therapeutic molecule (e.g., a CRISPR-Cas therapeutic protein or system described herein) that partially or completely alleviates, ameliorates, relieves, inhibits, delays onset of, reduces severity of and/or reduces incidence of one or more symptoms or features of a particular disease, disorder, and/or condition.
  • Such treatment may be of a subject who does not exhibit signs of the relevant disease, disorder and/or condition and/or of a subject who exhibits only early signs of the disease, disorder, and/or condition.
  • such treatment may be of a subject who exhibits one or more established signs of the relevant disease, disorder and/or condition.
  • FIG. 1 is an exemplary schematic of gRNA conjugated to an NLS sequence.
  • the 3' end of the gRNA is conjugated to the N-terminus of a peptide spacer followed by an NLS sequence derived from SV40.
  • FIG. 2 is an exemplary graph that shows results of adenine to guanine base (A-to-G) conversion percentage achieved with a base editor comprising an adenine deaminase fused to the N-terminus of a spCas9.
  • A-to-G conversion percentage (y-axis) is plotted for various guide RNAs with or without NLS at various ratios of mRNA encoding a base editor (1:1, 1:3, and 1:9).
  • “Lipo Control” comprises an mRNA encoding a base editor and a gRNA (without NLS) in lipofectamine.
  • “Lipo Control” was formulated to serve as a transfection control against the LNP group.
  • FIG. 3A is an exemplary schematic of gRNA with different modifications.
  • EM (end-modified) gRNAs have 3 nucleotides at both the 3' and 5' ends with 2'OMe modifications.
  • HM1 (heavy modified 1) has 47% of gRNA modified with 2'OMe modification.
  • HM2 (heavy modified 2) has 60% of gRNA modified with 2'OMe modification.
  • HM3 (heavy modified 3) has 88% of gRNA modified with 2'OMe and 2'F modifications.
  • the NLS-gRNA used in Example 2 comprises end-modifications.
  • FIG. 3B is an exemplary graph that shows results of adenine to guanine base (A-to-G) conversion percentage achieved in mice with a base editor comprising an adenine deaminase fused to the N-terminus of a spCas9.
  • A-to-G conversion percentage (y-axis) plotted for various guide RNAs with or without NLS, and with or without various modifications in gRNA.
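The modification levels described for FIG. 3A are percentages of modified nucleotides in a gRNA. A minimal sketch of that bookkeeping follows. The 100-nucleotide guide length is a hypothetical value chosen for easy arithmetic and is not taken from the disclosure, and the function names are illustrative.

```python
def end_modified_positions(length: int, n_each_end: int = 3) -> set:
    """0-based positions modified when n nucleotides at each end carry 2'OMe."""
    if length < 2 * n_each_end:
        raise ValueError("guide is too short for the requested end modifications")
    return set(range(n_each_end)) | set(range(length - n_each_end, length))


def percent_modified(length: int, modified_positions) -> float:
    """Percent of nucleotides in the guide that carry a modification."""
    positions = set(modified_positions)
    if any(p < 0 or p >= length for p in positions):
        raise ValueError("position outside the guide")
    return 100.0 * len(positions) / length


# Hypothetical 100-nt guide with 3 modified nucleotides at each end.
print(percent_modified(100, end_modified_positions(100)))  # 6.0
```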
  • FIG. 4A is an exemplary graph that shows results of base editing efficiency achieved in non-human primates (NHPs) with a base editor comprising an adenine deaminase fused to the N-terminus of a spCas9.
  • Base editing efficiency in liver (y-axis) is plotted for various guide RNAs with or without NLS, and with or without various modifications in gRNA.
  • FIG. 4B is a series of exemplary graphs that shows toxicology results. AST and ALT levels were measured 24 hours post-administration, and the fold change relative to AST/ALT levels prior to administration is shown for formulations comprising different gRNAs.
  • FIG. 5 is an exemplary graph that shows results of adenine to guanine base (A-to-G) conversion percentage achieved in mice with a base editor comprising an adenine deaminase fused to the N-terminus of a saCas9.
  • A-to-G conversion percentage (y-axis) for both on-target and bystander editing was plotted for various guide RNAs with various purity and modifications.
  • FIGs. 6A and 6B depict in vivo correction of GSDIa mutations in liver extracts of transgenic mouse models heterozygous for huG6PC-R83C.
  • FIG. 6A is a schematic depicting in vivo workflow. Lipid nanoparticles (LNP) carrying base editor mRNA and gRNA were dosed via IV injection in transgenic mice heterozygous for huG6PC (huR83C HET), harboring the R83C mutation.
  • FIG. 6B is a bar graph depicting A-to-G base editing efficiency of the GSDIa R83C mutation using MSP828 comparing on-target to bystander editing.
  • FIG. 6C is a bar graph depicting correction of the GSDIa R83C mutation in a transgenic mouse model heterozygous for huG6PC, harboring the R83C mutation, using TadA adenosine deaminase variants MSP605, MSP824, MSP825, MSP680, MSP828, and MSP820. In vitro screens were run to select desirable base-editors for R83C correction.
  • LNP co-formulations of gRNA and representative base-editors were dosed (at a sub-saturating dose of 1 mpk), in vivo, in transgenic mice heterozygous for huG6PC-R83C.
  • the base-editing potency of the variants for the R83C correction in livers of the LNP-treated, huG6PC-R83C heterozygote, transgenic animals is shown in FIG. 6C.
  • Variant MSP828 yielded a high level of on-target activity under these conditions.
  • A-to-G base editing efficiency is shown for on-target and bystander editing.
  • FIG. 7 shows schematics depicting normal and loss-of-function g6pc function and related outcomes.
  • GSD-Ia (or GSDIa herein) is an autosomal recessive disorder caused by mutations in the g6pc gene.
  • R83C located in the active site of the enzyme, is the most prevalent pathogenic mutation identified in Caucasian GSD-Ia patients and is associated with inactivation of G6Pase.
  • a loss of G6Pase function can result in life-threatening hypoglycemia, seizures and even death.
  • patients must maintain strict and frequent adherence to glucose supplementation through day and night, by way of a slow glucose release formula.
  • One missed or delayed dose can result in emergency hypoglycemia.
  • enlarged liver, accumulation of uric acid, lactate, and lipids are common in GSD-Ia patients.
  • FIG. 8 shows a schematic illustrating that base editors as described herein generate permanent, predicted nucleotide substitutions in an editing window.
  • the R83C mutation introduces a single G>A conversion in the g6pc gene.
  • Adenine base editors (ABEs) enable the programmable conversion of A to G in genomic DNA and thus may be used to correct this mutation.
  • FIG. 8 depicts the utility of ABEs and base editing as described herein.
  • ABE binds to target DNA that is complementary to the guide-RNA and exposes a stretch of single-stranded DNA.
  • the deaminase converts the target adenine into inosine, and the Cas enzyme nicks the opposite strand, which is then repaired, completing the base pair conversion.
  • the direct repair of a point mutation has the potential for restoration of gene function.
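The A-to-G conversion within an editing window described for FIG. 8 can be sketched in a few lines. The protospacer sequence and the window bounds below are hypothetical values chosen for illustration, not values from the disclosure, and the sketch models only the final sequence outcome (not the inosine intermediate or strand nicking).

```python
def adenines_in_window(protospacer: str, window: tuple) -> list:
    """1-based positions of adenines inside the inclusive editing window."""
    start, end = window
    return [
        pos
        for pos, base in enumerate(protospacer.upper(), start=1)
        if base == "A" and start <= pos <= end
    ]


def apply_a_to_g(protospacer: str, positions) -> str:
    """Return the sequence after A-to-G conversion at the given 1-based positions."""
    bases = list(protospacer.upper())
    for pos in positions:
        if bases[pos - 1] != "A":
            raise ValueError(f"position {pos} is not an adenine")
        bases[pos - 1] = "G"
    return "".join(bases)


guide_target = "GCTAGCATCGGCTAGCTAGC"  # hypothetical protospacer
candidates = adenines_in_window(guide_target, window=(4, 8))  # hypothetical window
print(candidates)                       # [4, 7]: one target, one potential bystander
print(apply_a_to_g(guide_target, [7]))  # GCTAGCGTCGGCTAGCTAGC
```

Any adenine in the window other than the intended one is a potential bystander edit, which is why the figures report on-target and bystander editing separately.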
  • FIGs. 9A and 9B provide a depiction of the target nucleotide site, and bystander and PAM nucleotides and a bar graph showing that ABEs used in immortalized HEK293 cells yield a significant rate of precise correction of R83C.
  • Base-editors for A to G conversion in the g6pc gene were optimized for correction of R83C.
  • Shown in FIG. 9A is the target DNA sequence (cCACCAGTATGGACACTGTCCAAAGAGAAT (SEQ ID NO: 17)) and underlying amino acid translation (WWYPCQGFLI; SEQ ID NO: 18) for the GSD-Ia R83C mutation.
  • the target edit is shown by double-underlining, at position 12.
  • the editing window also includes a possible bystander, shown by single-underlining at position 6, and an edit that may result in a synonymous conversion is shown at position 10.
  • a HEK293 cell line was generated to express the g6pc transgene harboring the R83C mutation and was transfected with base-editor mRNA and gRNA. Allele frequencies were assessed by high-throughput targeted amplicon Next-Generation Sequencing (NGS).
  • Variants 1-5 represent a combination of gRNA and base-editor RNA, engineered for optimized target correction.
  • Variant 5 yielded approximately 60% targeted base-editing efficiency for R83C correction with limited bystander editing (FIG. 9B).
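Editing efficiencies of the kind reported for FIG. 9B are allele frequencies computed from sequencing reads. The sketch below shows the arithmetic on toy data. It assumes reads that are already aligned to the amplicon and trimmed to equal length, uses hypothetical 0-based positions, and does not reproduce the analysis pipeline used in the disclosure.

```python
def editing_rates(reads, target_pos: int, bystander_pos: int) -> dict:
    """Percent of reads with G at the target and/or bystander position."""
    total = len(reads)
    if total == 0:
        raise ValueError("no reads supplied")
    target = sum(read[target_pos] == "G" for read in reads)
    bystander = sum(read[bystander_pos] == "G" for read in reads)
    # "Precise" here means the target is edited and the bystander is untouched.
    precise = sum(
        read[target_pos] == "G" and read[bystander_pos] != "G" for read in reads
    )
    return {
        "target": 100.0 * target / total,
        "bystander": 100.0 * bystander / total,
        "precise": 100.0 * precise / total,
    }


toy_reads = ["AAT", "GAT", "GGT", "AAT"]  # toy data, not experimental results
print(editing_rates(toy_reads, target_pos=0, bystander_pos=1))
# {'target': 50.0, 'bystander': 25.0, 'precise': 25.0}
```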
  • FIG. 10 presents a photographic image and bar graphs demonstrating that 3-week-old homozygous huR83C (Hom huR83C) mice exhibited expected growth impairment and metabolic defects characteristic of GSD-Ia.
  • a GSD-Ia mouse that expresses the human G6PC-R83C transgene in place of mouse G6PC was generated to validate base-editing in vivo.
  • the results shown confirmed that mice homozygous for huR83C exhibited postnatal lethality — they were either stillborn or died within 24 hours.
  • FIGs. 11A and 11B show dot plots of in vivo correction achieved by the base editors (ABEs) described herein.
  • FIG. 11A illustrates efficient lipid nanoparticle (LNP)-mediated base editing (huG6PC-R83C correction) in livers of adult and newborn heterozygous huR83C mice.
  • LNP-mediated delivery was first optimized in less fragile transgenic mice heterozygous for huR83C.
  • the schematic in FIG. 6A depicts in vivo workflow for these experiments, with lipid nanoparticle (LNP), or LNP co-formulations of base-editor mRNA and gRNA dosed via IV injection.
  • LNP dosing was employed via the temporal vein of heterozygous huR83C mice shortly post birth, and activity was compared to that seen in adult heterozygous huR83C mice that had received LNP administered via the tail vein.
  • NGS analysis of whole liver extracts revealed approximately 40% base-editing efficiency in adults and up to ~60% efficiency in newborns, with a broader range in efficiencies. Bystander editing remained low in adults and newborns.
  • FIG. 11B shows that LNP-mediated R83C correction in livers is associated with survival of newborn homozygous huR83C mice and littermate heterozygous huR83C mice.
  • mice homozygous for huR83C were treated with LNP containing guide RNA and mRNA encoding ABE.
  • the treated mice grew normally to 3 weeks of age, without hypoglycemia-induced seizures, in the absence of glucose therapy.
  • the treated homozygous huR83C mice displayed editing efficiencies up to ~60% in total liver extracts (i.e., ~60% R83C correction), consistent with littermate controls that were heterozygous for huR83C.
  • FIGs. 12A and 12B show bar graphs and immunohistochemical staining images demonstrating that base editing as described herein in mice homozygous for huG6PC-R83C restores near-normal metabolic function to reverse GSD-Ia pathology.
  • the treated homozygous huR83C mice displayed proper metabolic function, with restoration of near-normal serum metabolite markers, including glucose, triglycerides, cholesterol, lactate, and uric acid, as shown by the darkest bars in the graph in FIG. 12A.
  • FIG. 13 shows a bar graph demonstrating that a single LNP dose administration in homozygous huG6PC-R83C mice maintained euglycemia during a 24-hour fasting challenge via base-editing as described herein.
  • FIG. 14 shows Kaplan-Meier survival curves generated to estimate survival of newborn transgenic mice homozygous for huG6PC-R83C, either post base-editing via ABE mRNA or untreated.
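For readers unfamiliar with the Kaplan-Meier estimate referred to for FIG. 14, a minimal sketch of the product-limit calculation follows. The function name and the example durations are hypothetical. The example uses toy values, not data from the disclosure, and the sketch omits confidence intervals and group comparison.

```python
def kaplan_meier(durations, observed):
    """Product-limit survival estimate.

    durations: follow-up time for each subject.
    observed:  True if the event (death) was seen, False if censored.
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects that share this time point.
        while i < len(data) and data[i][0] == time:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= (at_risk - deaths) / at_risk
            curve.append((time, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return curve


# Toy data: four subjects, the second one censored at time 2.
print(kaplan_meier([1, 2, 3, 4], [True, False, True, True]))
# [(1, 0.75), (3, 0.375), (4, 0.0)]
```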
  • Newborn mice were genotyped via PCR analysis on genomic tail DNA using the following primers: a universal forward primer (5'-ACCTACTGATGATGCACCTTTGATCAATAGAT-3' (SEQ ID NO: 61)), a mouse-specific reverse primer (5'-CATCACCCCTCGGGATGGTTCTT-3' (SEQ ID NO: 62)), a human-specific reverse primer 1 (5'-CAGCCCAGAATCCCAACCACAAAAT-3' (SEQ ID NO: 63)), and a human-specific reverse primer 2 (5'-AGACCAGCTCGACTTGGGATGG-3' (SEQ ID NO: 64)).
  • FIG. 15A is a schematic of gRNA fluorescently tagged with Cy5 dye.
  • FIG. 15B is a schematic of gRNA conjugated to NLS fluorescently tagged with Cy5 dye.
  • FIG. 15C shows nuclear staining with Nuc Blue.
  • FIG. 15D shows nuclear staining and ALAS1/sg23 gRNA localization with Cy5.
  • FIG. 15E shows enhanced nuclear localization of NLS-gRNA.
  • FIG. 16 is a model of NLS conjugates bound to saCas9 effectors at the 3' end.
  • FIG. 17A provides sequences of exemplary 5% end modified gRNA and exemplary 25% heavy modified saHM03 gRNA.
  • FIG. 17B is a graph that shows results of A-to-G base editing efficiency of exemplary NLS conjugated gRNA relative to end modified gRNA and heavy modified saHM03 gRNA.
  • the invention provides, in some aspects, methods to produce gRNA conjugated to an NLS sequence (NLS-gRNA) that has increased potency for use in a CRISPR-Cas system, increasing the frequency of successful editing events.
  • NLS-gRNA of the present invention can provide better trafficking of the gRNA to the nucleus, protecting it from cytosolic RNases and providing a higher local concentration of gRNA for formation of RNP.
  • NLS-gRNA of the present invention has significantly higher potency as compared to a counterpart gRNA without the NLS sequence and also shows a higher potency as compared to highly modified gRNAs.
  • gRNAs conjugated to a NLS sequence have numerous potential advantages, including, for example, increased potency.
  • the NLS-gRNA of the present invention provides a significantly higher base editing efficiency relative to its counterpart gRNA without a NLS sequence.
  • the NLS-gRNA with end modifications (e.g., comprising 2'OMe modifications at the 3' end and/or at the 5' end) provides a higher potency as compared to a gRNA that is highly modified (e.g., greater than 40%, greater than 60%, or greater than 88% modified).
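The "percent modified" figures used to describe end modified and heavy modified gRNAs can be read as the share of nucleotide positions that carry a chemical modification. The sketch below assumes that reading and a hypothetical per-position representation (a list with `None` for unmodified positions); neither the representation nor the function name `percent_modified` comes from the source.

```python
from typing import Optional, Sequence


def percent_modified(positions: Sequence[Optional[str]]) -> float:
    """Percentage of positions carrying any modification label.

    `positions` holds one entry per nucleotide: a label such as "2OMe"
    for a modified position, or None for an unmodified one. This
    representation is an assumption made for illustration.
    """
    if not positions:
        raise ValueError("guide must contain at least one position")
    modified = sum(1 for label in positions if label is not None)
    return 100.0 * modified / len(positions)


if __name__ == "__main__":
    # Hypothetical 100-nt guide with three 2'OMe positions at each end.
    guide = ["2OMe"] * 3 + [None] * 94 + ["2OMe"] * 3
    print(f"{percent_modified(guide):.1f}% modified")
```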
  • gRNA Guide RNA
  • guide RNA also refers to guide RNA conjugated to a
  • a gRNA comprises a polynucleotide sequence complementary to a target sequence.
  • the gRNA hybridizes with the target nucleic acid sequence and directs sequence-specific binding of a CRISPR complex to the target nucleic acid.
  • an RNA guide has 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or 100% complementarity to a target nucleic acid sequence.
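The complementarity percentages above can be computed position by position. The following is a minimal sketch assuming Watson-Crick pairing only, an equal-length guide and target, and a target strand written 3' to 5' so that paired positions share an index; the function name `percent_complementarity` is illustrative.

```python
# Watson-Crick partner in DNA for each RNA base of the guide.
RNA_TO_DNA_PARTNER = {"A": "T", "U": "A", "G": "C", "C": "G"}


def percent_complementarity(guide_rna: str, target_3to5: str) -> float:
    """Percent of guide positions that pair with the target strand.

    `guide_rna` is written 5' to 3'; `target_3to5` is the DNA strand the
    guide hybridizes to, written 3' to 5', so index i pairs with index i.
    """
    if not guide_rna or len(guide_rna) != len(target_3to5):
        raise ValueError("guide and target must be non-empty and equal in length")
    pairs = zip(guide_rna.upper(), target_3to5.upper())
    matches = sum(1 for g, t in pairs if RNA_TO_DNA_PARTNER.get(g) == t)
    return 100.0 * matches / len(guide_rna)


if __name__ == "__main__":
    print(percent_complementarity("ACGU", "TGCA"))  # fully complementary toy pair
    print(percent_complementarity("ACGU", "TGCT"))  # one mismatch in four
```

This count ignores wobble pairs and gaps; an alignment-based measure would be needed if the guide and target differ in length.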
  • the gRNA is between about 50 nucleotides and 250 nucleotides. In some embodiments, the gRNA is between about 50 nucleotides and 500 nucleotides. In some embodiments, the gRNA is between about 50 nucleotides and 1,000 nucleotides. In some embodiments, the gRNA is about 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180, 185,
  • the gRNA is between about 50 and 75 nucleotides long. In some embodiments, the gRNA is between about 75 and 100 nucleotides long. In some embodiments, the gRNA is between about 100 and 125 nucleotides long. In some embodiments, the gRNA is between about 125 and 150 nucleotides long. In some embodiments, the gRNA is between about 150 and 175 nucleotides long. In some embodiments, the gRNA is between about 175 and 200 nucleotides long. In some embodiments, the gRNA is between about 200 and 225 nucleotides long. In some embodiments, the gRNA is between about 225 and 250 nucleotides long.
  • the gRNA comprises a ligated crRNA and a tracrRNA.
  • crRNA and tracrRNA sequences are known in the art, for example those associated with several type II CRISPR-Cas9 systems (e.g., WO2013/176772), Cpf1, SaCas9, Cas12, among others.
  • a gRNA can be designed to target any target sequence.
  • Optimal alignment is determined using any algorithm for aligning sequences, including the Needleman-Wunsch algorithm, Smith-Waterman algorithm, Burrows-Wheeler algorithm, ClustalW, ClustalX, BLAST, Novoalign, SOAP, Maq, and ELAND.
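Of the alignment algorithms listed, Needleman-Wunsch is the simplest to illustrate. The following is a minimal score-only sketch of global alignment by dynamic programming; the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration and are not specified in the source.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Optimal global alignment score of sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming table, so memory is O(len(b)).
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of `a` against empty `b`
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in `b`
            left = curr[j - 1] + gap  # gap inserted in `a`
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("GATTACA", "GATTACA"))
    print(needleman_wunsch_score("ACG", "AG"))
```

Production tools such as those named above use affine gap penalties, heuristics, or indexing for speed; this sketch only shows the underlying recurrence.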
  • a gRNA is designed to target to a unique target sequence within the genome of a cell.
  • a gRNA is designed to lack a PAM sequence.
  • a gRNA sequence is designed to have optimal secondary structure using a folding algorithm including mFold or Geneious.
  • expression of gRNAs may be under an inducible promoter, e.g. hormone inducible, tetracycline or doxycycline inducible, arabinose inducible, or light inducible.
  • the gRNA sequence is a "dead crRNA," "dead guide," or "dead guide sequence" that can form a complex with a CRISPR-associated protein and bind specific targets without any substantial nuclease activity.
  • the gRNA is chemically modified in the sugar phosphate backbone or base.
  • the gRNA has one or more of the following modifications: 2'-O-methyl, 2'-F, or locked nucleic acids to improve nuclease resistance or base pairing.
  • the gRNA may contain modified bases such as 2-thiouridine or N6-methyladenosine.
  • the gRNA is conjugated with other oligonucleotides, peptides, proteins, tags, dyes, or polyethylene glycol.
  • the gRNA includes an aptamer or riboswitch sequence that binds specific target molecules due to their three-dimensional structure.
  • gRNA has two, three, four or five hairpins.
  • gRNA includes a transcription termination sequence, which includes a polyT sequence comprising six nucleotides.
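A polyT termination sequence of six nucleotides can be located in a DNA template with a regular expression. This is a minimal sketch; the function name `find_polyt_runs` and the choice to report runs of six or more thymines are assumptions for illustration.

```python
import re
from typing import List, Tuple


def find_polyt_runs(dna_template: str, min_length: int = 6) -> List[Tuple[int, int]]:
    """Return (start index, run length) for each run of T of at least min_length.

    Indices are 0-based positions in `dna_template`.
    """
    pattern = re.compile("T{%d,}" % min_length)
    return [(m.start(), len(m.group())) for m in pattern.finditer(dna_template.upper())]


if __name__ == "__main__":
    print(find_polyt_runs("ACGTTTTTTGCA"))  # one run of six T
    print(find_polyt_runs("ACGTTTTTGCA"))   # five T only, no match
```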
  • the present invention provides a gRNA conjugated to a NLS sequence through 3' end of gRNA. In one aspect, the present invention provides a gRNA conjugated to a NLS sequence through 5' end of gRNA. In one aspect, the present invention provides a gRNA conjugated to a NLS sequence through an internal site of gRNA.
  • gRNA is conjugated to NLS via a linker.
  • said linker comprises a chemical moiety (e.g., L) and/or a peptidic moiety (e.g., a peptide spacer).
  • gRNA is conjugated to NLS directly via a chemical moiety (e.g., L).
  • a chemical moiety is non-peptidic.
  • gRNA is conjugated to NLS via a peptidic moiety (e.g., a peptide spacer).
  • gRNA is conjugated to NLS via a linker comprising both a chemical moiety (e.g., L) and a peptidic moiety (e.g., a peptide spacer).
  • such conjugates can have a structure according to Formula (I), where a chemical moiety L (e.g., a non-peptidic chemical moiety) is covalently attached to gRNA and a peptide spacer, and wherein the peptide spacer is covalently attached to NLS.
  • the N-terminus of NLS sequence is conjugated to the 3' end of the gRNA via a linker comprising both a chemical moiety (e.g., L) and a peptide moiety (e.g., a peptide spacer).
  • the C-terminus of NLS sequence is conjugated to the 5' end of the gRNA via a linker comprising both a chemical moiety (e.g., L) and a peptide moiety (e.g., a peptide spacer).
  • an internal amino acid in the NLS sequence is conjugated to the 3' end of the gRNA via a linker comprising both a chemical moiety (e.g., L) and a peptide moiety (e.g., a peptide spacer).
  • an internal amino acid in the NLS sequence is conjugated to the 5' end of the gRNA via a linker comprising both a chemical moiety (e.g., L) and a peptide moiety (e.g., a peptide spacer).
  • an internal amino acid in the NLS sequence is conjugated to an internal nucleotide of the gRNA via a linker comprising both a chemical moiety (e.g., L) and a peptide moiety (e.g., a peptide spacer).
  • gRNA is conjugated to NLS via a chemical moiety (e.g., L) covalently attached to the C-terminus of the peptide spacer or the NLS amino acid sequence.
  • gRNA is conjugated to NLS via a chemical moiety (e.g., L) covalently attached to the N-terminus of the peptide spacer or the NLS amino acid sequence.
  • gRNA is conjugated to the peptide spacer or the NLS via a chemical moiety (e.g., L) covalently attached to the 3' end of the gRNA.
  • gRNA is conjugated to the peptide spacer or the NLS via a chemical moiety (e.g., L) covalently attached to the 5' end of the gRNA.
  • a chemical moiety e.g., L
  • a chemical moiety e.g., L
  • a thiol- containing residue e.g., a cysteine residue
  • a chemical moiety e.g., L
  • a selenium-containing residue e.g., a selenocysteine residue
  • a chemical moiety e.g., L
  • an amino-containing residue e.g., a lysine residue
  • a chemical moiety e.g., L
  • a phenol-containing residue e.g., a tyrosine residue
  • amino acid residues used for formation of a linker e.g., a thiol-, selenium-, amino-, or phenol-containing residue as described herein
  • a gRNA is conjugated to a NLS via reductive amination. In some embodiments, a gRNA is conjugated to a NLS via native chemical ligation. In some embodiments, a gRNA is conjugated to a NLS via thiol-ene click.
  • Chemical moieties described herein may further include substructures L1 and/or L2, where L1 and L2 are each independently an optionally substituted group that is C1-12 alkylene or C2-12 heteroalkylene.
  • a chemical moiety (e.g., L) comprises a maleimide-thiol adduct.
  • gRNA is conjugated to NLS using an addition reaction between a maleimide group and a thiol group or a thiol-ene click reaction.
  • a maleimide-thiol adduct containing moiety is formed from a gRNA comprising a thiol group, and a NLS (or a peptide spacer) comprising a maleimide group.
  • a maleimide-thiol adduct containing moiety is formed from a gRNA comprising a maleimide group, and a NLS (or a peptide spacer) comprising a thiol group.
  • a chemical moiety (e.g., L) comprises a maleimide-selenol adduct.
  • gRNA is conjugated to NLS using an addition reaction between a maleimide group and a selenol group.
  • a maleimide-selenol adduct containing moiety is formed from a gRNA comprising a selenol group, and a NLS (or a peptide spacer) comprising a maleimide group.
  • a maleimide-selenol adduct containing moiety is formed from a gRNA comprising a maleimide group, and a NLS (or a peptide spacer) comprising a selenol group.
  • a chemical moiety (e.g., L) comprises [structure not shown], wherein Y is S or Se.
  • Y is S.
  • a chemical moiety (e.g., L) comprises [structure not shown].
  • the moiety is formed from a gRNA comprising a thiol group, and a NLS (or a peptide spacer) comprising a maleimide group.
  • the maleimide-thiol adduct containing moiety is formed from a gRNA comprising a maleimide group, and a NLS (or a peptide spacer) comprising a thiol group.
  • Y is Se.
  • a chemical moiety (e.g., L) comprises [structure not shown].
  • the moiety is formed from a gRNA comprising a selenol group, and a NLS (or a peptide spacer) comprising a maleimide group.
  • the maleimide-selenol adduct containing moiety is formed from a gRNA comprising a maleimide group, and a NLS (or a peptide spacer) comprising a selenol group.
  • a chemical moiety L has the following structure (A), where
  • Y is S or Se
  • Y is S.
  • * represents covalent attachment to gRNA.
  • ** represents covalent attachment to a peptide spacer or NLS.
  • ** represents covalent attachment to a peptide spacer.
  • a chemical moiety (e.g., L) comprises a thioether group.
  • gRNA is conjugated to NLS using a conjugation reaction between an iodoacetamide group and a thiol group.
  • a thioether-containing moiety is formed from a gRNA comprising a thiol group, and a NLS (or a peptide spacer) comprising an iodoacetamide group.
  • a thioether-containing moiety is formed from a gRNA comprising an iodoacetamide group, and a NLS (or a peptide spacer) comprising a thiol group.
  • a chemical moiety (e.g., L) comprises a selenoether moiety.
  • gRNA is conjugated to NLS using a conjugation reaction between an iodoacetamide group and a selenol group.
  • a selenoether-containing moiety is formed from a gRNA comprising a selenol group, and a NLS (or a peptide spacer) comprising an iodoacetamide group.
  • a selenoether-containing moiety is formed from a gRNA comprising an iodoacetamide group, and a NLS (or a peptide spacer) comprising a selenol group.
  • a chemical moiety (e.g., L) comprises [structure not shown], wherein Y is S or Se.
  • Y is S.
  • a chemical moiety e.g., L
  • the moiety is formed from a gRNA comprising a thiol group, and a NLS (or a peptide spacer) comprising
  • the moiety is formed from a gRNA comprising an iodoacetamide group, and a NLS (or a peptide spacer) comprising a thiol group.
  • Y is Se.
  • a chemical moiety e.g., L
  • the moiety is formed from a gRNA comprising a selenol group, and a NLS (or a peptide spacer) comprising an iodoacetamide group.
  • the moiety is formed from a gRNA comprising an iodoacetamide group, and a NLS (or a peptide spacer) comprising a selenol group.
  • a chemical moiety (e.g., L) comprises a disulfide group.
  • gRNA is conjugated to NLS using a thiol-disulfide exchange reaction between a disulfide-containing group and a thiol group.
  • the disulfide-containing moiety is formed from a gRNA comprising a thiol group, and a NLS (or a peptide spacer) comprising a disulfide group.
  • the disulfide-containing moiety is formed from a gRNA comprising a disulfide group, and a NLS (or a peptide spacer) comprising a thiol group.
  • a chemical moiety (e.g., L) comprises [structure not shown].
  • the moiety is formed from a gRNA comprising a thiol group, and a NLS (or a peptide spacer) comprising a disulfide group.
  • the moiety is formed from a gRNA comprising a disulfide group, and a NLS (or a peptide spacer) comprising a thiol group.
  • a chemical moiety (e.g., L) comprises an oxadiazole thioether group.
  • gRNA is conjugated to NLS using a reaction between a thiol group and a sulfonyloxadiazole group.
  • an oxadiazole thioether-containing moiety is formed from a gRNA comprising a sulfonyloxadiazole group, and a NLS (or a peptide spacer) comprising a thiol group.
  • an oxadiazole thioether-containing moiety is formed from a gRNA comprising a thiol group, and a NLS (or a peptide spacer) comprising a sulfonyloxadiazole group.
  • a chemical moiety (e.g., L) comprises [structure not shown].
  • the moiety is formed from a gRNA comprising a sulfonyloxadiazole group, and a NLS (or a peptide spacer) comprising a thiol group.
  • the moiety is formed from a gRNA comprising a thiol group, and a NLS (or a peptide spacer) comprising a sulfonyloxadiazole group.
  • a chemical moiety (e.g., L) comprises a urea group.
  • gRNA is conjugated to NLS using a reaction between an amino (e.g., primary amine) group and an isocyanate group.
  • a urea-containing moiety is formed from a gRNA comprising an amino (e.g., primary amine) group, and a NLS (or a peptide spacer) comprising an isocyanate group.
  • a urea-containing moiety is formed from a gRNA comprising an isocyanate group, and a NLS (or a peptide spacer) comprising an amino (e.g., primary amine) group.
  • a chemical moiety (e.g., L) comprises a thiourea group.
  • gRNA is conjugated to NLS using a reaction between an amino (e.g., primary amine) group and an isothiocyanate group.
  • a thiourea-containing moiety is formed from a gRNA comprising an amino (e.g., primary amine) group, and a NLS (or a peptide spacer) comprising an isothiocyanate group.
  • a thiourea-containing moiety is formed from a gRNA comprising an isothiocyanate group, and a NLS (or a peptide spacer) comprising an amino (e.g., primary amine) group.
  • a chemical moiety (e.g., L) comprises [structure not shown], wherein X is S or O. In embodiments, X is O. In embodiments, a chemical moiety (e.g., L) comprises [structure not shown]. In embodiments, the moiety is formed from a gRNA comprising an amino (e.g., primary amine) group, and a NLS (or a peptide
  • the moiety is formed from a gRNA comprising an isocyanate group, and a NLS (or a peptide spacer) comprising an amino (e.g., primary amine) group.
  • X is S.
  • a chemical moiety e.g., L
  • the moiety is formed from a gRNA comprising an amino (e.g., primary amine) group, and a NLS (or a peptide
  • the moiety is formed from a gRNA comprising an isothiocyanate group, and a NLS (or a peptide spacer) comprising an amino (e.g., primary amine) group.
  • a chemical moiety (e.g., L) comprises a dithiocarbamate group.
  • gRNA is conjugated to NLS using a reaction between a thiol group and an isothiocyanate group.
  • a dithiocarbamate-containing moiety is formed from a gRNA comprising a thiol group, and a NLS (or a peptide spacer) comprising an isothiocyanate group.
  • a dithiocarbamate-containing moiety is formed from a gRNA comprising an isothiocyanate group, and a NLS (or a peptide spacer) comprising a thiol group.
  • a chemical moiety (e.g., L) comprises
  • the moiety is formed from a gRNA comprising a thiol group, and a NLS (or a peptide spacer) comprising an isothiocyanate group. In embodiments, the moiety is formed from a gRNA comprising an isothiocyanate group, and a NLS (or a peptide spacer) comprising a thiol group.
  • a chemical moiety (e.g., L) comprises a diazenylphenol group.
  • gRNA is conjugated to NLS using a reaction between a phenol group and a diazonium group.
  • a diazenylphenol-containing moiety is formed from a gRNA comprising a phenol group, and a NLS (or a peptide spacer) comprising a diazonium group.
  • a diazenylphenol-containing moiety is formed from a gRNA comprising a diazonium group, and a NLS (or a peptide spacer) comprising a phenol group.
  • a chemical moiety comprises [structure not shown].
  • the moiety is formed from a gRNA comprising a phenol group, and a NLS (or a peptide spacer) comprising a diazonium group.
  • the moiety is formed from a gRNA comprising a diazonium group, and a NLS (or a peptide spacer) comprising a phenol group.
  • a chemical moiety (e.g., L) comprises a triazolidinedionylphenol group.
  • gRNA is conjugated to NLS using a reaction between a phenol group and a cyclic diazodicarboxamide group.
  • a triazolidinedionylphenol-containing moiety is formed from a gRNA comprising a phenol group, and a NLS (or a peptide spacer) comprising a cyclic diazodicarboxamide group.
  • a triazolidinedionylphenol-containing moiety is formed from a gRNA comprising a cyclic diazodicarboxamide group, and a NLS (or a peptide spacer) comprising a phenol group.
  • a chemical moiety (e.g., L) comprises
  • the moiety is formed from a gRNA comprising a phenol group, and a NLS (or a peptide spacer) comprising a cyclic diazodicarboxamide group.
  • the moiety is formed from a gRNA comprising a cyclic diazodicarboxamide group, and a NLS (or a peptide spacer) comprising a phenol group.
  • a chemical moiety (e.g., L) comprises a triazole group.
  • gRNA is conjugated to NLS using a 1,3-dipolar cycloaddition between an alkyne group and an azide group.
  • a triazole-containing moiety is formed from a gRNA comprising an alkyne group and a NLS (or a peptide spacer) comprising an azide group. In embodiments, a triazole-containing moiety is formed from a gRNA comprising an azide group and a NLS (or a peptide spacer) comprising an alkyne group. In embodiments, a 1,3-dipolar cycloaddition is copper-catalyzed cycloaddition. In embodiments, a 1,3-dipolar cycloaddition is strain-promoted cycloaddition. In embodiments, a chemical moiety (e.g., L) comprises [structure not shown].
  • the moiety is formed from a gRNA comprising an alkyne group and
  • the moiety is formed from a gRNA comprising an azide group and a NLS (or a peptide spacer) comprising an alkyne group.
  • a chemical moiety (e.g., L) comprises wherein each of ring A and ring B are optionally substituted aryl groups.
  • ring A is present.
  • ring A is not present.
  • ring B is present.
  • ring B is not present. In embodiments, both ring A and ring B are present. In embodiments, both ring A and ring B are not present.
  • the moiety is formed from a gRNA comprising an alkyne group and a NLS (or a peptide spacer) comprising an azide group. In embodiments, the moiety is formed from a gRNA comprising an azide group and a NLS (or a peptide spacer) comprising an alkyne group.
  • a chemical moiety (e.g., L) comprises [structure not shown], wherein each of ring A and ring B are optionally substituted aryl groups. In embodiments, ring A is present. In embodiments, ring A is not present. In embodiments, ring B is present.
  • ring B is not present. In embodiments, both ring A and ring B are present. In embodiments, both ring A and ring B are not present.
  • the moiety is formed from a gRNA comprising an alkyne group and a NLS (or a peptide spacer) comprising an azide group. In embodiments, the moiety is formed from a gRNA comprising an azide group and a NLS (or a peptide spacer) comprising an alkyne group.
  • Diazanorcaradiene
  • a chemical moiety (e.g., L) comprises a diazanorcaradiene group.
  • gRNA is conjugated to NLS using a Diels-Alder reaction between a cyclopropene group and a tetrazine group.
  • a diazanorcaradiene-containing moiety is formed from a gRNA comprising a cyclopropene group and a NLS (or a peptide spacer) comprising a tetrazine group.
  • a diazanorcaradiene-containing moiety is formed from a gRNA comprising a tetrazine group and a NLS (or a peptide spacer) comprising a cyclopropene group.
  • a chemical moiety (e.g., L) comprises [structure not shown], wherein R is a C1-6 alkyl.
  • the moiety is formed from a gRNA comprising a cyclopropene group and a NLS (or a peptide spacer) comprising a tetrazine group.
  • the moiety is formed from a gRNA comprising a tetrazine group and a NLS (or a peptide spacer) comprising a cyclopropene group.
  • a chemical moiety (e.g., L) comprises an amide group.
  • gRNA is conjugated to NLS using a conjugation reaction between a carboxyl group and an amino group (e.g., primary amine).
  • an amide-containing moiety is formed from a gRNA comprising a carboxyl group and a NLS (or a peptide spacer) comprising an amino (e.g., primary amine) group.
  • an amide-containing moiety is formed from a gRNA comprising an amino group (e.g., primary amine) and a NLS (or a peptide spacer) comprising a carboxyl group.
  • a carboxyl group is an activated carboxyl group.
  • the carboxyl group is activated by carbodiimides such as 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC) or dicyclohexylcarbodiimide (DCC).
  • the carboxyl group is activated by N-hydroxysuccinimide (NHS) derivatives (e.g., sulfo-NHS).
  • a chemical moiety (e.g., L) comprises [structure not shown].
  • the moiety is formed from a gRNA comprising a carboxyl group and a NLS (or a peptide spacer) comprising an amino (e.g., primary amine) group.
  • the moiety is formed from a gRNA comprising an amino group (e.g., primary amine) and a NLS (or a peptide spacer) comprising a carboxyl group.
  • a chemical moiety (e.g., L) comprises a sulfonamide group.
  • gRNA is conjugated to NLS using a conjugation reaction between a sulfonyl group and an amino (e.g., primary amine) group.
  • a sulfonamide-containing moiety is formed from a gRNA comprising a sulfonyl group and a NLS (or a peptide spacer) comprising an amino (e.g., primary amine) group.
  • a sulfonamide-containing moiety is formed from a gRNA comprising an amino (e.g., primary amine) group and a NLS (or a peptide spacer) comprising a sulfonyl group.
  • a chemical moiety (e.g., L) comprises [structure not shown].
  • the moiety is formed from a gRNA comprising a sulfonyl group and a NLS (or a peptide spacer) comprising an amino (e.g., primary amine) group.
  • the moiety is formed from a gRNA comprising an amino
  • a chemical moiety (e.g., L) comprises an amino group.
  • gRNA is conjugated to NLS using a conjugation reaction between an amino group (e.g., primary amine) and an aldehyde group followed by a reduction reaction to form an amine-containing moiety.
  • an amine-containing moiety is formed from a gRNA comprising an amino group (e.g., primary amine), and a NLS (or a peptide spacer) comprising an aldehyde group.
  • an amine-containing moiety is formed from a gRNA comprising an aldehyde group, and a NLS (or a peptide spacer) comprising an amino group (e.g., primary amine).
  • an amine-containing moiety is formed from a bifunctional cross-linking reagent (e.g., a dialdehyde such as glutaraldehyde).
  • an amine-containing moiety is formed from a gRNA comprising an amino group (e.g., primary amine), a NLS (or a peptide spacer) comprising an amino group (e.g., primary amine), and a dialdehyde (e.g., glutaraldehyde).
  • an amine-containing moiety is formed from a gRNA comprising an aldehyde group, a NLS (or a peptide spacer) comprising an aldehyde group, and a diaminoalkane.
  • a chemical moiety (e.g., L) comprises [structure not shown].
  • the moiety is formed from a gRNA comprising an amino group (e.g., primary amine), a NLS (or a peptide spacer) comprising an amino group (e.g., primary amine), and a dialdehyde (e.g., glutaraldehyde).
  • the moiety is formed from a gRNA comprising an aldehyde group, a NLS (or a peptide spacer) comprising an aldehyde group, and a diaminoalkane.
  • a chemical moiety (e.g., L) comprises an amino group.
  • gRNA is conjugated to NLS using a conjugation reaction between an amino (e.g., a primary amine) group and a tresyl (2,2,2-trifluoroethanesulfonyl) group.
  • an amine-containing moiety is formed from a gRNA comprising an amino (e.g., a primary amine) group, and a NLS (or a peptide spacer) comprising a tresyl (2,2,2-trifluoroethanesulfonyl) group.
  • an amine-containing moiety is formed from a gRNA comprising a tresyl (2,2,2-trifluoroethanesulfonyl) group and a NLS (or a peptide spacer) comprising an amino (e.g., a primary amine) group.
  • a chemical moiety (e.g., L) comprises [structure not shown].
  • the moiety is formed from a gRNA comprising an amino (e.g., a primary amine) group, and a NLS (or a peptide spacer) comprising a tresyl (2,2,2-trifluoroethanesulfonyl) group.
  • the moiety is formed from a gRNA comprising a tresyl (2,2,2-trifluoroethanesulfonyl) group and a NLS (or a peptide spacer) comprising an amino (e.g., a primary amine) group.
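The conjugation chemistries described above pair a reactive group on the gRNA with a complementary group on the NLS (or peptide spacer), and the text allows either partner to carry either group. A lookup keyed on the unordered pair captures that symmetry. The sketch below takes the pairings from the preceding paragraphs; the short group labels and the function name `linkage_for` are illustrative assumptions.

```python
from typing import Optional

# Reactive-group pairs and the linkage each forms, as described in the text.
# Keys are unordered because either partner may sit on the gRNA or the NLS.
LINKAGES = {
    frozenset({"maleimide", "thiol"}): "maleimide-thiol adduct",
    frozenset({"maleimide", "selenol"}): "maleimide-selenol adduct",
    frozenset({"iodoacetamide", "thiol"}): "thioether",
    frozenset({"iodoacetamide", "selenol"}): "selenoether",
    frozenset({"disulfide", "thiol"}): "disulfide",
    frozenset({"sulfonyloxadiazole", "thiol"}): "oxadiazole thioether",
    frozenset({"amine", "isocyanate"}): "urea",
    frozenset({"amine", "isothiocyanate"}): "thiourea",
    frozenset({"thiol", "isothiocyanate"}): "dithiocarbamate",
    frozenset({"phenol", "diazonium"}): "diazenylphenol",
    frozenset({"phenol", "cyclic diazodicarboxamide"}): "triazolidinedionylphenol",
    frozenset({"alkyne", "azide"}): "triazole",
    frozenset({"cyclopropene", "tetrazine"}): "diazanorcaradiene",
    frozenset({"carboxyl", "amine"}): "amide",
    frozenset({"sulfonyl", "amine"}): "sulfonamide",
    frozenset({"aldehyde", "amine"}): "amine (after reduction)",
    frozenset({"tresyl", "amine"}): "amine",
}


def linkage_for(group_a: str, group_b: str) -> Optional[str]:
    """Linkage formed by two reactive groups, or None if the pair is not listed."""
    return LINKAGES.get(frozenset({group_a.lower(), group_b.lower()}))


if __name__ == "__main__":
    print(linkage_for("thiol", "maleimide"))
    print(linkage_for("azide", "alkyne"))
    print(linkage_for("azide", "thiol"))
```

The table omits the three-component cases (e.g., two amines bridged by a dialdehyde), which would need a different key structure.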
  • the NLS-gRNA comprises a crRNA. In some embodiments, the NLS-gRNA comprises a tracrRNA. In some embodiments, the NLS-gRNA comprises a crRNA and a tracrRNA.
  • a linear guide RNA is first synthesized. In this approach, two or more separate RNAs are ligated together.
  • a first RNA comprises a trans-activating RNA (tracrRNA), and a second RNA comprises a clustered regularly interspaced short palindromic repeats (CRISPR) RNA (crRNA).
  • the RNA comprising the tracrRNA sequence is synthesized such that a portion of the tracrRNA contains a phosphate at the 5'-terminus.
  • Two forms of ligation are possible with this approach, both of which are found within the stem loop region.
  • the first form of ligation occurs within the terminal loop of the hairpin, which is a natural site of T4 RNA Ligase 1.
  • the second form of ligation occurs within the duplex, which is a natural site of T4 RNA Ligase 2 and DNA ligases.
  • One of the advantages of this form of ligation is that fragment impurities are readily removable because of the marked differences in elution time between the fused gRNA and the fragment impurities.
  • the first end of the guide RNA and/or the second end of the guide RNA comprises a chemical modification to its backbone or to one or more of its bases.
  • chemical synthesis can be used to install highly modified monomers, including modified sugars, bases, backbones, or functional groups that do not resemble natural nucleotides.
  • the first end of the guide RNA and/or the second end of the guide RNA comprises a modified base.
  • the modified RNA includes one or more of the following 2'-O-methoxy-ethyl bases (2'-MOE), such as 2-MethoxyEthoxy A, 2-MethoxyEthoxy MeC, 2-MethoxyEthoxy G, 2-MethoxyEthoxy T.
  • Other modified bases include, for example, 2'-O-Methyl RNA bases and fluoro bases.
  • fluoro bases are known, and include for example, Fluoro C, Fluoro U, Fluoro A, Fluoro G bases.
  • RNA comprising one or more of the following 2'OMethyl modifications can be used with the methods described: 2'-OMe-5-Methyl-rC, 2'-OMe-rT, 2'-OMe-rI, 2'-OMe-2-Amino-rA, Aminolinker-C6-rC, Aminolinker-C6-rU, 2'-OMe-5-Br-rU, 2'-OMe-5-I-rU, 2'-OMe-7-Deaza-rG.
  • the first end of the guide RNA and/or second end of the guide RNA comprises one or more of the following modifications: phosphorothioates, 2'-O-methyl, 2' fluoro (2'F), DNA.
  • the first end of the guide RNA and/or the second end of the guide RNA comprises 2'OMe modifications at the 3' and 5'-ends.
  • the first end of the guide RNA and/or second end of the guide RNA comprises one or more of the following modifications: 2'-O-2-methoxyethyl (MOE), locked nucleic acids, bridged nucleic acids, unlocked nucleic acids, peptide nucleic acids, morpholino nucleic acids.
  • the first end of the guide RNA and/or second end of the guide RNA comprises one or more of the following base modifications: 2,6-diaminopurine, 2-aminopurine, pseudouracil, N1-methyl-pseudouracil, 5' methyl cytosine, 2'pyrimidinone (zebularine), thymine.
  • modified bases include, for example, 2-Aminopurine, 5-Bromo dU, deoxyUridine, 2,6-Diaminopurine (2-Amino-dA), Dideoxy-C, deoxyInosine, Hydroxymethyl dC, Inverted dT, Iso-dG, Iso-dC, Inverted Dideoxy-T, 5-Methyl dC, 5-Nitroindole, Super T®, 2'-F-r(C,U), 2'-NH2-r(C,U), 2,2'-Anhydro-U, 3'-Desoxy-r(A,C,G,U), 3'-O-Methyl-r(A,C,G,U), rT, rI, 5-Methyl-rC, 2-Amino-rA, rSpacer (Abasic), 7-Deaza-rG, 7-Deaza-rA, 8-Oxo-rG, 5-Halogenated-rU, N-Alkylated-rN.
  • Other chemically modified RNA can be used herein.
  • the first end of the guide RNA and/or second end of the guide RNA can comprise a modified base such as, for example, 5', Int, 3' Azide (NHS Ester); 5' Hexynyl; 5', Int, 3' 5-Octadiynyl dU;
  • RNA nucleotide modifications that can be used with the methods described herein include, for example, phosphorylation modifications, such as 5'-phosphorylation and 3'-phosphorylation.
  • the RNA can also have one or more of the following modifications: an amino modification, biotinylation, thiol modification, alkyne modifier, adenylation, Azide (NHS Ester), Cholesterol-TEG, and Digoxigenin (NHS Ester).
  • nucleobase editors that edit, modify or alter a target nucleotide sequence of a polynucleotide.
  • Nucleobase editors described herein typically include a polynucleotide programmable nucleotide binding domain and a nucleobase editing domain (e.g., adenosine deaminase or cytidine deaminase).
  • a polynucleotide programmable nucleotide binding domain, when in conjunction with a bound guide polynucleotide (e.g., gRNA), can specifically bind to a target polynucleotide sequence and thereby localize the base editor to the target nucleic acid sequence desired to be edited.
  • the nucleobase editors provided herein comprise one or more features that improve base editing activity.
  • any of the nucleobase editors provided herein may comprise a Cas9 domain that has reduced nuclease activity.
  • any of the nucleobase editors provided herein may have a Cas9 domain that does not have nuclease activity (dCas9), or a Cas9 domain that cuts one strand of a duplexed DNA molecule, referred to as a Cas9 nickase (nCas9).
  • the presence of the catalytic residue (e.g., H840) maintains the activity of the Cas9 to cleave the non-edited (e.g., non-deaminated) strand opposite the targeted nucleobase.
  • Mutation of the catalytic residue (e.g., D10 to A10) prevents cleavage of the edited (e.g., deaminated) strand containing the targeted residue (e.g., A or C).
  • Such Cas9 variants can generate a single-strand DNA break (nick) at a specific location based on the gRNA-defined target sequence, leading to repair of the non-edited strand, ultimately resulting in a nucleobase change on the non-edited strand.
  • Polynucleotide programmable nucleotide binding domains bind polynucleotides (e.g., RNA, DNA).
  • a polynucleotide programmable nucleotide binding domain of a base editor can itself comprise one or more domains (e.g., one or more nuclease domains).
  • the nuclease domain of a polynucleotide programmable nucleotide binding domain can comprise an endonuclease or an exonuclease.
  • An endonuclease can cleave a single strand of a double-stranded nucleic acid or both strands of a double-stranded nucleic acid molecule.
  • a nuclease domain of a polynucleotide programmable nucleotide binding domain can cut zero, one, or two strands of a target polynucleotide.
  • fusion proteins comprising a heterologous polypeptide fused to a nucleic acid programmable nucleic acid binding protein, for example, a nucleic acid programmable DNA binding protein (napDNAbp).
  • a heterologous polypeptide can be a polypeptide that is not found in the native or wild-type napDNAbp polypeptide sequence.
  • the heterologous polypeptide can be fused to the napDNAbp at a C-terminal end of the napDNAbp, an N-terminal end of the napDNAbp, or inserted at an internal location of the napDNAbp.
  • the heterologous polypeptide is inserted at an internal location of the napDNAbp.
  • the heterologous polypeptide is a deaminase (e.g., adenosine deaminase) or a functional fragment thereof.
  • a fusion protein can comprise a deaminase (e.g., adenosine deaminase) flanked by an N-terminal fragment and a C-terminal fragment of a Cas9 polypeptide.
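The three fusion arrangements described above (N-terminal fusion, C-terminal fusion, and internal insertion flanked by N- and C-terminal fragments) can be illustrated with a minimal sketch. The function name `insert_domain`, the optional `linker` argument, and the toy sequences are hypothetical placeholders introduced only for illustration; they are not sequences or architectures taken from this disclosure.

```python
def insert_domain(host: str, domain: str, after_residue: int, linker: str = "") -> str:
    """Insert `domain` into `host` after 1-based residue `after_residue`.

    after_residue == 0 gives an N-terminal fusion, after_residue == len(host)
    gives a C-terminal fusion, and any value in between gives an internal
    insertion flanked by N- and C-terminal fragments of the host.
    """
    if not 0 <= after_residue <= len(host):
        raise ValueError("insertion site lies outside the host sequence")
    n_fragment, c_fragment = host[:after_residue], host[after_residue:]
    # A linker is placed only where the domain actually abuts host sequence.
    left = linker if n_fragment else ""
    right = linker if c_fragment else ""
    return n_fragment + left + domain + right + c_fragment


# Toy placeholder sequences (hypothetical; not real Cas9 or deaminase sequences).
HOST = "AAAAACCCCC"
DOMAIN = "dddd"

n_terminal = insert_domain(HOST, DOMAIN, 0)
c_terminal = insert_domain(HOST, DOMAIN, len(HOST))
internal = insert_domain(HOST, DOMAIN, 5, linker="gg")
```

In this sketch the insertion site is expressed as the last host residue preceding the inserted domain; other numbering conventions are equally possible.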
  • the deaminase in a fusion protein can be an adenosine deaminase.
  • the adenosine deaminase is a TadA (e.g., TadA*7.10 or a variant thereof).
  • the fusion protein comprises the structure:
  • the deaminase can be a circular permutant deaminase.
  • the deaminase can be a circular permutant adenosine deaminase.
  • the deaminase is a circular permutant TadA, circularly permutated at amino acid residue 116 as numbered in the TadA reference sequence.
  • the deaminase is a circular permutant TadA, circularly permutated at amino acid residue 136 as numbered in the TadA reference sequence. In some embodiments, the deaminase is a circular permutant TadA, circularly permutated at amino acid residue 65 as numbered in the TadA reference sequence.
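A circular permutant can be modeled as a rotation of the amino acid sequence. The sketch below assumes, purely for illustration, that "circularly permutated at residue N" means that residue N of the reference numbering becomes the new N-terminal residue and that the original termini are joined directly without a linker; the disclosure does not fix this convention here, and the function name `circular_permutant` is hypothetical.

```python
def circular_permutant(seq: str, new_start: int) -> str:
    """Rotate `seq` so that 1-based residue `new_start` becomes residue 1.

    Assumption: the original C-terminus is joined directly to the original
    N-terminus (no linker), so the residue composition is unchanged.
    """
    if not 1 <= new_start <= len(seq):
        raise ValueError("new_start must be a valid 1-based residue number")
    i = new_start - 1
    return seq[i:] + seq[:i]


# Toy sequence used only to show the rotation.
toy = "ABCDEFGH"
permuted = circular_permutant(toy, 4)
```

Under the same assumption, the function could be called with a new start of 116, 136, or 65 on a sequence numbered as in the TadA reference sequence.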
  • the fusion protein can comprise more than one deaminase.
  • the fusion protein can comprise, for example, 1, 2, 3, 4, 5 or more deaminases.
  • the fusion protein comprises one deaminase.
  • the fusion protein comprises two deaminases.
  • the two or more deaminases in a fusion protein can be an adenosine deaminase, cytidine deaminase, or a combination thereof.
  • the two or more deaminases can be homodimers.
  • the two or more deaminases can be heterodimers.
  • the two or more deaminases can be inserted in tandem in the napDNAbp. In some embodiments, the two or more deaminases may not be in tandem in the napDNAbp.
  • the napDNAbp in the fusion protein is a Cas9 polypeptide or a fragment thereof.
  • the Cas9 polypeptide can be a variant Cas9 polypeptide.
  • the Cas9 polypeptide is a Cas9 nickase (nCas9) polypeptide or a fragment thereof. In some embodiments, the Cas9 polypeptide is a nuclease dead Cas9 (dCas9) polypeptide or a fragment thereof.
  • the Cas9 polypeptide in a fusion protein can be a full-length Cas9 polypeptide. In some cases, the Cas9 polypeptide in a fusion protein may not be a full length Cas9 polypeptide.
  • the Cas9 polypeptide can be truncated, for example, at an N-terminal or C-terminal end relative to a naturally-occurring Cas9 protein.
  • the Cas9 polypeptide can be a circularly permuted Cas9 protein.
  • the Cas9 polypeptide can be a fragment, a portion, or a domain of a Cas9 polypeptide, that is still capable of binding the target polynucleotide and a guide nucleic acid sequence.
  • the Cas9 polypeptide is a Streptococcus pyogenes Cas9 (SpCas9), a Staphylococcus aureus Cas9 (SaCas9), or a Streptococcus thermophilus 1 Cas9 (St1Cas9).
  • Fusion proteins comprising a heterologous catalytic domain flanked by N- and C-terminal fragments of a Cas9 polypeptide are also useful for base editing in the methods as described herein.
  • Fusion proteins comprising Cas9 and one or more deaminase domains, e.g., adenosine deaminase, or comprising an adenosine deaminase domain flanked by Cas9 sequences are also useful for highly specific and efficient base editing of target sequences.
  • a chimeric Cas9 fusion protein contains a heterologous catalytic domain (e.g., adenosine deaminase, cytidine deaminase, or adenosine deaminase and cytidine deaminase) inserted within a Cas9 polypeptide.
  • the fusion protein comprises an adenosine deaminase domain and a cytidine deaminase domain inserted within a Cas9.
  • an adenosine deaminase is fused within a Cas9 and a cytidine deaminase is fused to the C-terminus.
  • an adenosine deaminase is fused within Cas9 and a cytidine deaminase is fused to the N-terminus.
  • a cytidine deaminase is fused within Cas9 and an adenosine deaminase is fused to the C-terminus.
  • a cytidine deaminase is fused within Cas9 and an adenosine deaminase is fused to the N-terminus.
  • Exemplary structures of a fusion protein with an adenosine deaminase and a cytidine deaminase and a Cas9 are provided as follows:
  • the used in the general architecture above indicates the presence of an optional linker.
  • the catalytic domain has DNA modifying activity
  • the adenosine deaminase is a TadA (e.g., TadA*7.10).
  • the TadA is a TadA variant.
  • a TadA variant is fused within Cas9 and a cytidine deaminase is fused to the C-terminus.
  • a TadA variant is fused within Cas9 and a cytidine deaminase is fused to the N-terminus.
  • a cytidine deaminase is fused within Cas9 and a TadA variant is fused to the C-terminus. In some embodiments, a cytidine deaminase is fused within Cas9 and a TadA variant is fused to the N-terminus.
  • Exemplary structures of a fusion protein with a TadA variant and a cytidine deaminase and a Cas9 are provided as follows:
  • the used in the general architecture above indicates the presence of an optional linker.
  • the fusion protein contains a nuclear localization signal
  • the nuclear localization signal is a bipartite nuclear localization signal.
  • the amino acid sequence of the nuclear localization signal is MAPKKKRKVGIHGVPAA (SEQ ID NO: 4).
  • the nuclear localization signal is encoded by the following sequence:
  • the Cas12b polypeptide contains a mutation that silences the catalytic activity of a RuvC domain.
  • the Cas12b polypeptide contains D574A, D829A and/or D952A mutations.
  • the fusion protein further contains a tag (e.g., an influenza hemagglutinin tag).
  • the fusion protein comprises a napDNAbp domain
  • the napDNAbp is a Cas12b.
  • an adenosine deaminase (e.g., TadA*8.13) may be inserted into a BhCas12b to produce a fusion protein (e.g., TadA*8.13-BhCas12b) that effectively edits a nucleic acid sequence.
  • the NLS-gRNA described herein can be used with a suitable gene editing system for targeted gene editing which can result in a gene silencing event, or an alteration (e.g., an increase or a decrease) in the expression of a desired target gene.
  • the NLS-gRNA described herein can be used in a method for targeted transcription activation, targeted transcription repression, targeted epigenome modification, or targeted genome modification, the method comprising introducing into a eukaryotic cell: (a) a NLS-gRNA as defined herein; (b) at least one CRISPR/Cas protein or a nucleic acid encoding the at least one CRISPR/Cas protein; wherein interactions between (a) and (b) and a target sequence in chromosomal DNA leads to targeted transcription activation, targeted transcription repression, targeted epigenome modification, or targeted genome modification.
  • the NLS-gRNA described herein can be used in a gene editing system comprising: the NLS-gRNA described herein, wherein the RNA guide comprises a direct repeat sequence and a spacer sequence capable of hybridizing to a target nucleic acid; and a gene editing protein; wherein the gene editing protein is capable of binding to the RNA guide and of causing a break in the target nucleic acid sequence complementary to the RNA guide.
  • the NLS-gRNA described herein can be used in a gene editing system comprising: the NLS-gRNA described herein, wherein the RNA guide comprises a direct repeat sequence and a spacer sequence capable of hybridizing to a target nucleic acid; and a gene editing protein; wherein the gene editing protein is fused to a deaminase, and wherein the gene editing protein fusion is capable of binding to the RNA guide and of editing the target nucleic acid sequence complementary to the RNA guide.
  • the invention provides a method of altering expression of a target nucleic acid in a eukaryotic cell comprising: contacting the cell with a gene editing protein, and the NLS-gRNA described herein, wherein the NLS-gRNA comprises a direct repeat sequence and a spacer sequence capable of hybridizing to the target nucleic acid, and wherein the gene editing protein is capable of binding to the NLS-gRNA and of causing a break in the target nucleic acid sequence complementary to the NLS-gRNA.
  • the invention provides a method of altering expression of a target nucleic acid in a eukaryotic cell comprising: contacting the cell with a gene editing protein, and the synthetic NLS-gRNA described herein, wherein the NLS-gRNA comprises a direct repeat sequence and a spacer sequence capable of hybridizing to the target nucleic acid, and wherein the gene editing protein is capable of binding to the NLS-gRNA and editing the target nucleic acid sequence complementary to the NLS-gRNA.
  • the invention provides a method of modifying a target nucleic acid in a eukaryotic cell comprising: contacting the cell with a gene editing protein, and the NLS-gRNA described herein, wherein the NLS-gRNA comprises a direct repeat sequence and a spacer sequence capable of hybridizing to the target nucleic acid, and wherein the gene editing protein is capable of binding to the NLS-gRNA and editing the target nucleic acid sequence complementary to the NLS-gRNA.
  • the gene editing method or system comprises a fusion protein with an effector that modifies target DNA in a site-specific manner, where the modifying activity includes methyltransferase activity, demethylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity, demyristoylation activity, integrase activity, transposase activity, recombinase activity, polymerase activity, ligase activity, helicase activity, or nuclease activity, any of which can modify DNA or a DNA-associated polypeptide (e.g., a histone or DNA binding protein).
  • the gene editing method or system comprises a fusion protein with enzymes that can edit DNA sequences by chemically modifying nucleotide bases, including deaminase enzymes that can modify adenosine or cytosine bases and function as site-specific base editors.
  • APOBEC1 cytidine deaminase, which usually uses RNA as a substrate, can be targeted to single-stranded and double-stranded DNA when it is fused to Cas9, converting cytidine to uridine directly; ADAR enzymes deaminate adenosine to inosine.
  • 'base editing' using deaminases enables programmable conversion of one target DNA base into another.
  • base editing results in the introduction of stop codons to silence genes. In some embodiments, base editing results in altered protein function by altering amino acid sequences.
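The introduction of stop codons by base editing can be illustrated with a short sketch. It assumes, for illustration only, a cytidine deaminase editor whose net outcome is a C-to-T change on the coding strand, and it scans an in-frame coding sequence for codons that a single such change converts into a stop codon (TAA, TAG, or TGA). The function name `stop_creating_edits` and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_creating_edits(cds: str):
    """List single C->T changes on the coding strand that create a stop codon.

    Returns tuples of (codon_index, offset_in_codon, codon, edited_codon),
    with both indices 0-based. `cds` is assumed to start in frame.
    """
    cds = cds.upper()
    hits = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for start in range(0, usable, 3):
        codon = cds[start:start + 3]
        if codon in STOP_CODONS:
            continue  # already a stop codon
        for offset, base in enumerate(codon):
            if base != "C":
                continue
            edited = codon[:offset] + "T" + codon[offset + 1:]
            if edited in STOP_CODONS:
                hits.append((start // 3, offset, codon, edited))
    return hits


# Hypothetical in-frame sequence: ATG CAG CGA TGG CAA
example_hits = stop_creating_edits("ATGCAGCGATGGCAA")
```

Only coding-strand C-to-T changes are considered in this sketch; edits on the opposite strand and the editing window of any particular base editor are outside its scope.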
  • the NLS-gRNA described herein can be used in a gene editing method or system to modulate transcription of target DNA.
  • the NLS-gRNA can be used in a gene editing method or system to modulate the expression of a target non-coding RNA, including tRNA, rRNA, snoRNA, siRNA, miRNA, and long ncRNA.
  • the NLS-gRNA described herein is used for targeted engineering of chromatin loop structures using a suitable gene editing system. Targeted engineering of chromatin loops between regulatory genomic regions provides a means to manipulate endogenous chromatin structures and enable the formation of new enhancer-promoter connections to overcome genetic deficiencies or inhibit aberrant enhancer-promoter connections.
  • the NLS-gRNA described herein is used in conjunction with a gene editing system for correction of pathogenic mutations by insertion of beneficial clinical variants or suppressor mutations.
  • a base editor described herein comprises an adenosine deaminase domain.
  • Such an adenosine deaminase domain of a base editor can facilitate the editing of an adenine (A) nucleobase to a guanine (G) nucleobase by deaminating the A to form inosine (I), which exhibits base pairing properties of G.
  • Adenosine deaminase is capable of deaminating (i.e., removing an amine group) adenine of a deoxyadenosine residue in deoxyribonucleic acid (DNA).
  • an A-to-G base editor further comprises an inhibitor of inosine base excision repair, for example, a uracil glycosylase inhibitor (UGI) domain or a catalytically inactive inosine specific nuclease.
  • a catalytically inactive inosine specific nuclease can inhibit or prevent base excision repair of a deaminated adenosine residue (e.g., inosine), which can improve the activity or efficiency of the base editor.
  • a base editor comprising an adenosine deaminase can act on any polynucleotide, including DNA, RNA and DNA-RNA hybrids.
  • a base editor comprising an adenosine deaminase can deaminate a target A of a polynucleotide comprising RNA.
  • the base editor can comprise an adenosine deaminase domain capable of deaminating a target A of an RNA polynucleotide and/or a DNA-RNA hybrid polynucleotide.
  • an adenosine deaminase incorporated into a base editor comprises all or a portion of adenosine deaminase acting on RNA (ADAR, e.g., ADAR1 or ADAR2) or tRNA (ADAT).
  • a base editor comprising an adenosine deaminase domain can also be capable of deaminating an A nucleobase of a DNA polynucleotide.
  • an adenosine deaminase domain of a base editor comprises all or a portion of an ADAT comprising one or more mutations which permit the ADAT to deaminate a target A in DNA.
  • the base editor can comprise all or a portion of an ADAT from Escherichia coli (EcTadA) comprising one or more of the following mutations: D108N, A106V, D147Y, E155V, L84F, H123Y, I156F, or a corresponding mutation in another adenosine deaminase.
  • a base editor described herein comprises a fusion protein comprising an adenosine deaminase domain (e.g., adenosine deaminase variant domain).
  • an adenosine deaminase variant domain contains a combination of alterations in a TadA*7.10 amino acid sequence, where the combinations are V82G, Y147T/D, Q154S, and one or more of L36H, I76Y, F149Y, N157K, and D167N.
  • the combinations of alterations in a TadA*7.10 amino acid sequence are V82G + Y147T + Q154S; I76Y + V82G + Y147T + Q154S; L36H + V82G + Y147T + Q154S + N157K; V82G + Y147D + F149Y + Q154S + D167N; L36H + V82G + Y147D + F149Y + Q154S + N157K + D167N; L36H + I76Y + V82G + Y147T + Q154S + N157K; I76Y + V82G + Y147D + F149Y + Q154S + D167N; or L36H + I76Y + V82G + Y147D + F149Y + Q154S + N157K + D167N or a corresponding alteration in another adenosine deaminase.
  • the nucleobase editors provided herein can be made by fusing together one or more protein domains, thereby generating a fusion protein.
  • the fusion proteins provided herein comprise one or more features that improve the base editing activity (e.g., efficiency, selectivity, and specificity) of the fusion proteins.
  • the fusion proteins provided herein can comprise a Cas9 domain that has reduced nuclease activity.
  • the fusion proteins provided herein can have a Cas9 domain that does not have nuclease activity (dCas9), or a Cas9 domain that cuts one strand of a duplexed DNA molecule, referred to as a Cas9 nickase (nCas9).
  • H840 maintains the activity of the Cas9 to cleave the non-edited (e.g., non-deaminated) strand containing a T opposite the targeted A.
  • Mutation of the catalytic residue (e.g., D10 to A10) of Cas9 prevents cleavage of the edited strand containing the targeted A residue.
  • Such Cas9 variants are able to generate a single-strand DNA break (nick) at a specific location based on the gRNA-defined target sequence, leading to repair of the non-edited strand, ultimately resulting in a T to C change on the non-edited strand.
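The relationship between deamination of a target A and the resulting T to C change on the non-edited strand can be traced with a small model. The sketch below assumes an idealized outcome in which the inosine (I) formed from the target A is read as G and the opposite strand is repaired to match; the function names and the five-base example are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the reverse complement of a DNA strand written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def a_to_g_edit(edited_strand: str, position: int):
    """Model an A-to-G edit at 1-based `position` of the deaminated strand.

    Returns (strand with inosine, final edited strand, final opposite strand).
    Inosine is written as "I" and is treated as pairing like G.
    """
    i = position - 1
    if edited_strand[i] != "A":
        raise ValueError("target base is not A")
    with_inosine = edited_strand[:i] + "I" + edited_strand[i + 1:]
    final_edited = edited_strand[:i] + "G" + edited_strand[i + 1:]
    final_opposite = reverse_complement(final_edited)
    return with_inosine, final_edited, final_opposite


original_opposite = reverse_complement("GCATG")
with_inosine, final_edited, final_opposite = a_to_g_edit("GCATG", 3)
```

Comparing `original_opposite` with `final_opposite` shows the single T to C difference on the non-edited strand that accompanies the A to G change on the edited strand.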
  • an adenosine deaminase incorporated into a base editor comprises all or a portion of adenosine deaminase acting on RNA (ADAR, e.g., ADAR1 or ADAR2).
  • an adenosine deaminase incorporated into a base editor comprises all or a portion of adenosine deaminase acting on tRNA (ADAT).
  • a base editor comprising an adenosine deaminase domain can also be capable of deaminating an A nucleobase of a DNA polynucleotide.
  • an adenosine deaminase domain of a base editor comprises all or a portion of an ADAT comprising one or more mutations which permit the ADAT to deaminate a target A in DNA.
  • the base editor can comprise all or a portion of an ADAT from Escherichia coli (EcTadA) comprising one or more of the following mutations: D108N, A106V, D147Y, E155V, L84F, H123Y, I156F, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase can be derived from any suitable organism (e.g., E. coli). In some embodiments, the adenosine deaminase is from a prokaryote. In some embodiments, the adenosine deaminase is from a bacterium. In some embodiments, the adenosine deaminase is from Escherichia coli, Staphylococcus aureus, Salmonella typhi, Shewanella putrefaciens, Haemophilus influenzae, Caulobacter crescentus, or Bacillus subtilis. In some embodiments, the adenosine deaminase is from E. coli.
  • the adenine deaminase is a naturally-occurring adenosine deaminase that includes one or more mutations corresponding to any of the mutations provided herein (e.g., mutations in ecTadA).
  • the corresponding residue in any homologous protein can be identified by e.g., sequence alignment and determination of homologous residues.
  • mutations corresponding to any of the mutations identified in ecTadA can accordingly be generated in any naturally-occurring adenosine deaminase (e.g., one having homology to ecTadA).
  • the fusion proteins as described herein comprise one or more adenosine deaminase domains.
  • the adenosine deaminases provided herein are capable of deaminating adenine.
  • the adenosine deaminases provided herein are capable of deaminating adenine in a deoxyadenosine residue of DNA.
  • the adenosine deaminase may be derived from any suitable organism (e.g., E. coli).
  • the adenine deaminase is a naturally-occurring adenosine deaminase that includes one or more mutations corresponding to any of the mutations provided herein (e.g., mutations in ecTadA).
  • One of skill in the art will be able to identify the corresponding residue in any homologous protein, e.g., by sequence alignment and determination of homologous residues.
  • the adenosine deaminase is from a prokaryote.
  • the adenosine deaminase is from a bacterium.
  • the adenosine deaminase is from Escherichia coli, Staphylococcus aureus, Salmonella typhi, Shewanella putrefaciens, Haemophilus influenzae, Caulobacter crescentus, or Bacillus subtilis. In some embodiments, the adenosine deaminase is from E. coli.
  • adenosine deaminase variants that have increased efficiency (>50-60%) and specificity.
  • the adenosine deaminase variants described herein are more likely to edit a desired base within a polynucleotide, and are less likely to edit bases that are not intended to be altered (i.e., “bystanders”).
  • the adenosine deaminase is a TadA deaminase.
  • the TadA is any one of the TadA described in PCT/US2017/045381 (WO 2018/027078), which is incorporated herein by reference in its entirety.
  • a wild type TadA(wt) adenosine deaminase has the following sequence (also termed TadA reference sequence): MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIGRHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGARDAKTGAAGSLMDVLHHPGMNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKAQSSTD (SEQ ID NO: 6)
  • the adenosine deaminase is a full-length E. coli TadA deaminase.
  • the adenosine deaminase comprises the amino acid sequence:
  • the adenosine deaminase is from a prokaryote. In some embodiments, the adenosine deaminase is from a bacterium. In some embodiments, the adenosine deaminase is from Escherichia coli (E. coli), Staphylococcus aureus (S. aureus), Salmonella typhimurium (S. typhimurium), Shewanella putrefaciens (S. putrefaciens), Haemophilus influenzae (H. influenzae), Caulobacter crescentus (C. crescentus), Geobacter sulfurreducens (G. sulfurreducens), or Bacillus subtilis. In some embodiments, the adenosine deaminase is from E. coli.
  • adenosine deaminases useful in the present application would be apparent to the skilled artisan and are within the scope of this disclosure.
  • the adenosine deaminase may be a homolog of adenosine deaminase acting on tRNA (ADAT).
  • amino acid sequences of exemplary ADAT homologs include the following:
  • An embodiment of E. coli TadA includes the following:
  • the adenosine deaminase comprises an amino acid sequence that is at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to any one of the amino acid sequences set forth in any of the adenosine deaminases provided herein. It should be appreciated that adenosine deaminases provided herein may include one or more mutations (e.g., any of the mutations provided herein).
  • the disclosure provides any deaminase domains with a certain percent identity plus any of the mutations or combinations thereof described herein.
  • the adenosine deaminase comprises an amino acid sequence that has 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
  • the adenosine deaminase comprises an amino acid sequence that has at least 5, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 110, at least 120, at least 130, at least 140, at least 150, at least 160, or at least 170 identical contiguous amino acid residues as compared to any one of the amino acid sequences known in the art or described herein.
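The two sequence comparisons referred to above (percent identity and the number of identical contiguous amino acid residues) can be sketched as follows. The sketch assumes, as a simplification, two sequences of equal length that are compared position by position without gaps; in practice percent identity is normally computed from a sequence alignment. The function names are hypothetical, and the ten-residue example uses the first ten residues of the TadA reference sequence with an H8Y substitution.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def longest_identical_run(a: str, b: str) -> int:
    """Length of the longest stretch of identical contiguous residues."""
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


reference_fragment = "MSEVEFSHEY"   # residues 1-10 of the TadA reference sequence
variant_fragment = "MSEVEFSYEY"     # same fragment carrying H8Y

identity = percent_identity(reference_fragment, variant_fragment)
run_length = longest_identical_run(reference_fragment, variant_fragment)
```

A single substitution in this ten-residue fragment lowers the identity to 90% and limits the longest identical run to the seven residues preceding position 8.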
  • any of the mutations provided herein can be introduced into other adenosine deaminases, such as E. coli TadA (ecTadA), S. aureus TadA (saTadA), or other adenosine deaminases (e.g., bacterial adenosine deaminases). It would be apparent to the skilled artisan that additional deaminases may similarly be aligned to identify homologous amino acid residues that can be mutated as provided herein.
  • any of the mutations identified in the TadA reference sequence can be made in other adenosine deaminases (e.g., ecTadA) that have homologous amino acid residues. It should also be appreciated that any of the mutations provided herein can be made individually or in any combination in the TadA reference sequence or another adenosine deaminase.
  • the adenosine deaminase comprises a D108X mutation in the TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
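The "X" notation (e.g., D108X, where X is any amino acid other than the wild-type residue) can be expanded into its concrete substitutions. The sketch below assumes the 20 standard amino acids in one-letter code; the function name `expand_mutation` is a hypothetical helper, not part of this disclosure.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, one-letter code
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def expand_mutation(code: str) -> list:
    """Expand a code such as "D108X" into explicit substitutions.

    "X" as the new residue means any standard amino acid other than the
    wild-type residue; a concrete code such as "D108N" is returned as-is.
    """
    match = MUTATION.match(code)
    if match is None:
        raise ValueError(f"unrecognized mutation code: {code!r}")
    wild_type, position, new = match.groups()
    if wild_type not in AMINO_ACIDS:
        raise ValueError(f"unknown wild-type residue in {code!r}")
    if new == "X":
        return [f"{wild_type}{position}{aa}" for aa in AMINO_ACIDS if aa != wild_type]
    if new not in AMINO_ACIDS or new == wild_type:
        raise ValueError(f"invalid substitution in {code!r}")
    return [code]


d108_options = expand_mutation("D108X")
```

Each D108X embodiment therefore corresponds to one of 19 explicit substitutions, of which D108G, D108N, D108V, D108A, and D108Y are the ones named in the text.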
  • the adenosine deaminase comprises a D108G, D108N, D108V, D108A, or D108Y mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase. It should be appreciated, however, that additional deaminases may similarly be aligned to identify homologous amino acid residues that can be mutated as provided herein.
  • the adenosine deaminase comprises an A106X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an A106V mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises an E155X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where the presence of X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an E155D, E155G, or E155V mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises a D147X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where the presence of X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises a D147Y mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises an A106X, E155X, or D147X mutation in the TadA reference sequence, or a corresponding mutation in another adenosine deaminase (e.g., ecTadA), where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an E155D, E155G, or E155V mutation.
  • the adenosine deaminase comprises a D147Y mutation.
  • any of the mutations provided herein may be introduced into other adenosine deaminases, such as S. aureus TadA (saTadA), or other adenosine deaminases (e.g., bacterial adenosine deaminases). It would be apparent to the skilled artisan how to identify amino acid residues in other adenosine deaminases that are homologous to the mutated residues in ecTadA. Thus, any of the mutations identified in ecTadA may be made in other adenosine deaminases that have homologous amino acid residues. It should also be appreciated that any of the mutations provided herein may be made individually or in any combination in ecTadA or another adenosine deaminase.
  • an adenosine deaminase contains a combination of mutations (e.g., V82G + Y147T + Q154S; I76Y + V82G + Y147T + Q154S; L36H + V82G + Y147T + Q154S + N157K; V82G + Y147D + F149Y + Q154S + D167N; L36H + V82G + Y147D + F149Y + Q154S + N157K + D167N; L36H + I76Y + V82G + Y147T + Q154S + N157K; I76Y + V82G + Y147D + F149Y + Q154S + D167N; or L36H + I76Y + V82G + Y147D + F149Y + Q154S + N157K + D167N), and may contain one or more additional mutations.
  • Additional mutations include, for example, a D108N, an A106V, an E155V, and/or a D147Y mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase (e.g., ecTadA).
  • an adenosine deaminase comprises the following group of mutations (groups of mutations are separated by a ";") in TadA reference sequence, or corresponding mutations in another adenosine deaminase: D108N and A106V; D108N and E155V; D108N and D147Y; A106V and E155V; A106V and D147Y; E155V and D147Y; D108N, A106V, and E155V; D108N, A106V, and D147Y; D108N, E155V, and D147Y; A106V, E155V, and D147Y; and D108N, A106V, E155V, and D147Y. It should be appreciated, however, that any combination of corresponding mutations provided herein may be made in an adenosine deaminase (e.g., ecTadA).
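A group of mutations such as "D108N, A106V, E155V, and D147Y" can be applied to the TadA reference sequence programmatically, which also checks that each named wild-type residue agrees with the reference numbering. The sketch below uses the TadA reference sequence (SEQ ID NO: 6) as given in the text with whitespace removed; the function name `apply_mutations` is a hypothetical helper introduced for illustration.

```python
import re

# TadA reference sequence (SEQ ID NO: 6) as given in the text, whitespace removed.
TADA_REFERENCE = (
    "MSEVEFSHEYWMRHALTLAKRAWDEREVPVGAVLVHNNRVIGEGWNRPIG"
    "RHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTLEPCVMCAGAMIHSRIGRVVFGA"
    "RDAKTGAAGSLMDVLHHPGMNHRVEITEGILADECAALLSDFFRMRRQEIKAQKKA"
    "QSSTD"
)


def apply_mutations(seq: str, mutations) -> str:
    """Apply substitutions such as "D108N" (1-based numbering) to `seq`.

    Raises ValueError if a code is malformed, out of range, or names a
    wild-type residue that differs from the residue found in `seq`.
    """
    residues = list(seq)
    for code in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
        if match is None:
            raise ValueError(f"unrecognized mutation code: {code!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(seq):
            raise ValueError(f"{code}: position outside the sequence")
        if seq[position - 1] != wild_type:
            raise ValueError(f"{code}: sequence has {seq[position - 1]} at {position}")
        residues[position - 1] = new
    return "".join(residues)


variant = apply_mutations(TADA_REFERENCE, ["D108N", "A106V", "E155V", "D147Y"])
```

The same helper could be pointed at another adenosine deaminase only after the corresponding residues have been identified (e.g., by sequence alignment), since the position numbers used here are those of the TadA reference sequence.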
  • the adenosine deaminase comprises one or more of an H8Y, T17S, L18E, W23L, L34S, W45L, R51H, A56E or A56S, E59G, E85K or E85G, M94L, I95L, V102A, F104L, A106V, R107C or R107H or R107P, D108G or D108N or D108V or D108A or D108Y, K110I, M118K, N127S, A138V, F149Y, M151V, R153C, Q154L, I156D, and/or K157R mutation in TadA reference sequence, or one or more corresponding mutations in another adenosine deaminase.
  • the adenosine deaminase comprises one or more of an H8Y, D108N, and/or N127S mutation in TadA reference sequence, or one or more corresponding mutations in another adenosine deaminase.
  • the adenosine deaminase comprises one or more of an H8Y, R26W, M61I, L68Q, M70V, A106T, D108N, A109T, N127S, D147Y, R152C, Q154H or Q154R, E155G or E155V or E155D, K161Q, Q163H, and/or T166P mutation in TadA reference sequence, or one or more corresponding mutations in another adenosine deaminase.
  • the adenosine deaminase comprises one, two, three, four, five, or six mutations selected from the group consisting of H8X, D108X, N127X, D147X, R152X, and Q154X in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase (e.g., ecTadA), where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises one, two, three, four, five, six, seven, or eight mutations selected from the group consisting of H8X, M61X, M70X, D108X, N127X, Q154X, E155X, and Q163X in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase (e.g., ecTadA), where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises one, two, three, four, or five mutations selected from the group consisting of H8X, D108X, N127X, E155X, and T166X in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase (e.g., ecTadA), where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises one, two, or three mutations selected from the group consisting of H8X, A106X, and D108X, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises one, two, three, four, five, six, seven, or eight mutations selected from the group consisting of H8X, R26X, L68X, D108X, N127X, D147X, and E155X, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises one, two, three, four, five, six, or seven mutations selected from the group consisting of H8X, R126X, L68X, D108X, N127X, D147X, and E155X in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises one, two, three, four, or five mutations selected from the group consisting of H8X, D108X, A109X, N127X, and E155X in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises one, two, three, four, five, or six mutations selected from the group consisting of H8Y, D108N, N127S, D147Y, R152C, and Q154H in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises one, two, three, four, five, six, seven, or eight mutations selected from the group consisting of H8Y, M61I, M70V, D108N, N127S, Q154R, E155G and Q163H in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises one, two, three, four, or five, mutations selected from the group consisting of H8Y, D108N, N127S, E155V, and T166P in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises one, two, three, four, five, or six mutations selected from the group consisting of H8Y, A106T, D108N, N127S, E155D, and K161Q in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises one, two, three, four, five, six, seven, or eight mutations selected from the group consisting of H8Y, R26W, L68Q, D108N, N127S, D147Y, and E155V in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises one, two, three, four, or five, mutations selected from the group consisting of H8Y, D108N, A109T, N127S, and E155G in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises one or more of the or one or more corresponding mutations in another adenosine deaminase.
  • the adenosine deaminase comprises a D108N, D108G, or D108V mutation in TadA reference sequence, or corresponding mutations in another adenosine deaminase.
  • the adenosine deaminase comprises A106V and D108N mutations in TadA reference sequence, or corresponding mutations in another adenosine deaminase.
  • the adenosine deaminase comprises R107C and D108N mutations in TadA reference sequence, or corresponding mutations in another adenosine deaminase. In some embodiments, the adenosine deaminase comprises H8Y, D108N, N127S, D147Y, and Q154H mutations in TadA reference sequence, or corresponding mutations in another adenosine deaminase.
  • the adenosine deaminase comprises a H8Y, D108N, N127S, D147Y, and E155V mutation in TadA reference sequence, or corresponding mutations in another adenosine deaminase. In some embodiments, the adenosine deaminase comprises a D108N, D147Y, and E155V mutation in TadA reference sequence, or corresponding mutations in another adenosine deaminase. In some embodiments, the adenosine deaminase comprises a H8Y, D108N, and N127S mutation in TadA reference sequence, or corresponding mutations in another adenosine deaminase.
  • the adenosine deaminase comprises a A106V, D108N, D147Y, and E155V mutation in TadA reference sequence, or corresponding mutations in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises one or more of S2A, H8Y, I49F, L84F, H123Y, N127S, I156F, and/or K160S mutation in TadA reference sequence, or one or more corresponding mutations in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises an L84X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an L84F mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises an H123X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an H123Y mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase comprises an I156X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an I156F mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase comprises one, two, three, four, five, six, or seven mutations selected from the group consisting of L84X, A106X, D108X, H123X, D147X, E155X, and I156X in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises one, two, three, four, five, or six mutations selected from the group consisting of S2X, I49X, A106X, D108X, D147X, and E155X in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises one, two, three, four, or five mutations selected from the group consisting of H8X, A106X, D108X, N127X, and K160X in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase, where X indicates the presence of any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises one, two, three, four, five, six, or seven mutations selected from the group consisting of L84F, A106V, D108N, H123Y, D147Y, E155V, and I156F in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase.
  • the adenosine deaminase comprises one, two, three, four, five, or six mutations selected from the group consisting of S2A, I49F, A106V, D108N, D147Y, and E155V in TadA reference sequence.
  • the adenosine deaminase comprises one, two, three, four, or five mutations selected from the group consisting of H8Y, A106T, D108N, N127S, and K160S in TadA reference sequence, or a corresponding mutation or mutations in another adenosine deaminase.
  • the adenosine deaminase comprises one or more of E25M, E25D, E25A, E25R, E25V, E25S, E25Y, R26G, R26N, R26Q, R26C, R26L, R26K, R107P, R107K, R107A, R107N, R107W, R107H, R107S, A142N, A142D, A142G, A143D, A143G, A143E, A143L, A143W, A143M, A143S, A143Q, and/or A143R mutation in TadA reference sequence, or one or more corresponding mutations in another adenosine deaminase.
  • the adenosine deaminase comprises one or more of the mutations described herein corresponding to TadA reference sequence, or one or more corresponding mutations in another adenosine deaminase.
  • the adenosine deaminase comprises an E25X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an E25M, E25D, E25A, E25R, E25V, E25S, or E25Y mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises an R26X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an R26G, R26N, R26Q, R26C, R26L, or R26K mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises an R107X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an R107P, R107K, R107A, R107N, R107W, R107H, or R107S mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises an A142X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an A142N, A142D, or A142G mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises an A143X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an A143D, A143G, A143E, A143L, A143W, A143M, A143S, A143Q, and/or A143R mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises one or more of the H36L, N37T, N37S, P48T, P48L, I49V, R51H, R51L, M70L, N72S, D77G, E134G, S146R, S146C, Q154H, K157N, and/or K161T mutations in TadA reference sequence, or one or more corresponding mutations in another adenosine deaminase (e.g., ecTadA).
  • the adenosine deaminase comprises an H36X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an H36L mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase comprises an N37X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an N37T or N37S mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase comprises a P48X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises a P48T or P48L mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase comprises an R51X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an R51H or R51L mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase comprises an S146X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an S146R or S146C mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase comprises a K157X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises a K157N mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase comprises a P48X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises a P48S, P48T, or P48A mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase comprises an A142X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an A142N mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase comprises a W23X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises a W23R or W23L mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase comprises an R152X mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase, where X indicates any amino acid other than the corresponding amino acid in the wild-type adenosine deaminase.
  • the adenosine deaminase comprises an R152P or R152H mutation in TadA reference sequence, or a corresponding mutation in another adenosine deaminase.
  • the adenosine deaminase may comprise the mutations
  • the adenosine deaminase comprises the following combination of mutations relative to TadA reference sequence, where each mutation of a combination is separated by a and each combination of mutations is between parentheses:
  • the TadA deaminase is a TadA variant.
  • the TadA variant is TadA*7.10.
  • the fusion proteins comprise a single TadA*7.10 domain (e.g., provided as a monomer).
  • the fusion protein comprises TadA*7.10 and TadA(wt), which are capable of forming heterodimers.
  • a fusion protein as described herein comprises a wild-type TadA linked to TadA*7.10, which is linked to Cas9 nickase.
  • TadA*7.10 comprises at least one alteration.
  • the adenosine deaminase comprises an alteration in the following sequence: TadA*7.10
  • TadA*7.10 comprises an alteration at amino acid 82 and/or 166.
  • TadA*7.10 comprises one or more of the following alterations: Y147T, Y147R, Q154S, Y123H, V82S, T166R, and/or Q154R.
  • a variant of TadA*7.10 comprises a combination of alterations selected from the group of: Y147T + Q154R; Y147T + Q154S; Y147R + Q154S; V82S + Q154S; V82S + Y147R; V82S + Q154R; V82S + Y123H; I76Y + V82S; V82S + Y123H + Y147T; V82S + Y123H + Y147R; V82S + Y123H + Q154R; Y147R + Q154R +Y123H; Y147R + Q154R + I76Y; Y147R + Q154R + T166R; Y123H + Y147R + Q154R + I76Y; V82S + Y123H + Y147R + Q154R; and I76Y + V82S + Y123H + Y147R + Q154R.
  • a variant of TadA*7.10 comprises one or more of alterations selected from the group of F36H, I76Y, V82G, Y147T, Y147D, F149Y, Q154S, N157K, and/or D167N.
  • a variant of TadA*7.10 comprises V82G, Y147T/D, Q154S, and one or more of F36H, I76Y, F149Y, N157K, and D167N.
  • a variant of TadA*7.10 comprises a combination of alterations selected from the group of: V82G + Y147T + Q154S; I76Y + V82G + Y147T + Q154S; F36H + V82G + Y147T + Q154S + N157K; V82G + Y147D + F149Y + Q154S + D167N; F36H + V82G + Y147D + F149Y + Q154S + N157K + D167N; F36H + I76Y + V82G + Y147T + Q154S + N157K; I76Y + V82G + Y147D + F149Y + Q154S + D167N; F36H + I76Y + V82G + Y147D + F149Y + Q154S + D167N; F36H + I76Y + V82G + Y147D + F149Y + Q154S + N157K + D167N.
  • an adenosine deaminase variant (e.g., TadA variant) comprises a deletion.
  • an adenosine deaminase variant comprises a deletion of the C terminus.
  • an adenosine deaminase variant comprises a deletion of the C terminus beginning at residue 149, 150, 151, 152, 153, 154, 155, 156, or 157, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • an adenosine deaminase variant (e.g., TadA* 8) is a monomer comprising one or more of the following alterations: Y147T, Y147R, Q154S, Y123H, V82S, T166R, and/or Q154R, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • the adenosine deaminase variant (TadA* 8) is a monomer comprising a combination of alterations selected from the group of: Y147T + Q154R; Y147T + Q154S; Y147R + Q154S; V82S + Q154S; V82S + Y147R; V82S + Q154R; V82S + Y123H; I76Y + V82S; V82S + Y123H + Y147T; V82S + Y123H + Y147R; V82S + Y123H + Q154R; Y147R + Q154R +Y123H; Y147R + Q154R + I76Y; Y147R + Q154R + T166R; Y123H + Y147R + Q154R + I76Y; V82S + Y123H + Y147R + Q154R; and I76Y + V82S + Y123H + Y147R + Q154R, relative
  • a base editor of the disclosure comprises an adenosine deaminase variant (e.g., TadA*8) monomer comprising one or more of the following alterations: R26C, V88A, A109S, T111R, D119N, H122N, Y147D, F149Y, T166I and/or D167N, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • the adenosine deaminase variant (TadA* 8) monomer comprises a combination of alterations selected from the group of: R26C + A109S + T111R + D119N + H122N + Y147D + F149Y + T166I + D167N; V88A + A109S + T111R +
  • an adenosine deaminase variant (e.g., MSP828) is a monomer comprising one or more of the following alterations L36H, I76Y, V82G, Y147T, Y147D, F149Y, Q154S, N157K, and/or D167N, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • an adenosine deaminase variant (e.g., MSP828) is a monomer comprising V82G, Y147T/D, Q154S, and one or more of L36H, I76Y, F149Y, N157K, and D167N, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • the adenosine deaminase variant is a monomer comprising a combination of alterations selected from the group of: V82G + Y147T + Q154S; I76Y + V82G + Y147T + Q154S; L36H + V82G + Y147T + Q154S + N157K; V82G + Y147D + F149Y + Q154S + D167N; L36H + V82G + Y147D + F149Y + Q154S + N157K + D167N; L36H + I76Y + V82G + Y147T + Q154S + N157K; I76Y + V82G + Y147D + F149Y + Q154S + D167N; L36H + I76Y + V82G + Y147D + F149Y + Q154S + D167N; L36H + I76Y + V82G + Y147D + F149Y + Q154S + D167N; L36H + I76Y + V82G +
  • the adenosine deaminase variant is a homodimer comprising two adenosine deaminase domains (e.g., TadA* 8) each having one or more of the following alterations Y147T, Y147R, Q154S, Y123H, V82S, T166R, and/or Q154R, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • the adenosine deaminase variant is a homodimer comprising two adenosine deaminase domains (e.g., TadA* 8) each having a combination of alterations selected from the group of: Y147T + Q154R; Y147T + Q154S; Y147R + Q154S; V82S + Q154S; V82S + Y147R; V82S + Q154R; V82S + Y123H; I76Y + V82S; V82S + Y123H + Y147T; V82S + Y123H + Y147R; V82S + Y123H + Q154R; Y147R + Q154R +Y123H; Y147R + Q154R + I76Y; Y147R + Q154R + T166R; Y123H + Y147R + Q154R + I76Y; V82S + Y123H + Y147R + Q154R; and
  • a base editor of the disclosure comprising an adenosine deaminase variant (e.g., TadA* 8) homodimer comprising two adenosine deaminase domains (e.g., TadA* 8) each having one or more of the following alterations R26C, V88A, A109S,
  • the adenosine deaminase variant is a homodimer comprising two adenosine deaminase domains (e.g., TadA* 8) each having a combination of alterations selected from the group of: R26C + A109S + T111R + D119N + H122N + Y147D + F149Y + T166I + D167N; V88A + A109S + T111R + D119N + H122N + F149Y + T166I + D167N; R26C + A109S + T111R + D119N + H122N + F149Y + T166I + D167N; R26C + A109S + T111R + D119N + H122N + F149Y + T166I + D167N; V88A + T111R + D119N + F149Y + T166I + D167N; and A109S + T111R + D
  • an adenosine deaminase variant is a homodimer comprising two adenosine deaminase domains (e.g., TadA* 7.10) each having one or more of the following alterations L36H, I76Y, V82G, Y147T, Y147D, F149Y, Q154S, N157K, and/or D167N, relative to TadA* 7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • an adenosine deaminase variant is a homodimer comprising two adenosine deaminase variant domains (e.g., MSP828) each having the following alterations V82G, Y147T/D, Q154S, and one or more of L36H, I76Y, F149Y, N157K, and D167N, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • the adenosine deaminase variant is a homodimer comprising two adenosine deaminase domains (e.g.
  • the adenosine deaminase variant is a heterodimer of a wild-type adenosine deaminase domain and an adenosine deaminase variant domain (e.g., TadA*8) comprising one or more of the following alterations Y147T, Y147R, Q154S, Y123H, V82S, T166R, and/or Q154R, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • the adenosine deaminase variant is a heterodimer of a wild-type adenosine deaminase domain and an adenosine deaminase variant domain (e.g., TadA* 8) comprising a combination of alterations selected from the group of: Y147T + Q154R; Y147T + Q154S; Y147R + Q154S; V82S + Q154S; V82S + Y147R; V82S + Q154R; V82S + Y123H; I76Y + V82S; V82S + Y123H + Y147T; V82S + Y123H + Y147R; V82S + Y123H + Q154R; Y147R + Q154R +Y123H; Y147R + Q154R +Y123H; Y147R + Q154R + I76Y; Y147R + Q154R + T166R; Y123H + Y147R +
  • a base editor comprises a heterodimer of a wild-type adenosine deaminase domain and an adenosine deaminase variant domain (e.g., TadA*8) comprising one or more of the following alterations R26C, V88A, A109S, T111R, D119N, H122N, Y147D, F149Y, T166I and/or D167N, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • the base editor comprises a heterodimer of a wild-type adenosine deaminase domain and an adenosine deaminase variant domain (e.g., TadA* 8) comprising a combination of alterations selected from the group of: R26C + A109S + T111R + D119N + H122N + Y147D + F149Y + T166I + D167N; V88A + A109S + T111R + D119N + H122N + F149Y + T166I + D167N; R26C + A109S + T111R + D119N + H122N + F149Y + T166I + D167N; V88A + T111R + D119N + F149Y; and A109S + T111R + D119N + H122N + Y147D + F149Y + T166I + D167N, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • the adenosine deaminase variant is a heterodimer of a wild-type adenosine deaminase domain and an adenosine deaminase variant domain (e.g., TadA*7.10) comprising one or more of the following alterations L36H, I76Y, V82G, Y147T, Y147D, F149Y, Q154S, N157K, and/or D167N, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • an adenosine deaminase variant is a heterodimer comprising a wild-type adenosine deaminase domain and an adenosine deaminase variant domain (e.g., MSP828) having the following alterations V82G, Y147T/D, Q154S, and one or more of L36H, I76Y, F149Y, N157K, and D167N, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • the adenosine deaminase variant is a heterodimer of a wild-type adenosine deaminase domain and an adenosine deaminase variant domain (e.g., TadA*7.10) comprising a combination of alterations selected from the group of: V82G + Y147T + Q154S; I76Y + V82G + Y147T + Q154S; L36H + V82G + Y147T + Q154S + N157K; V82G + Y147D + F149Y + Q154S + D167N; L36H + V82G + Y147D + F149Y + Q154S + N157K + D167N; L36H + I76Y + V82G + Y147T + Q154S + N157K; I76Y + V82G + Y147D + F149Y + Q154S + D167N; L36H + I76Y + V82G + Y147D + F149Y + Q154S + D167N; L36H + I76Y + V82G + Y147D + F149Y + Q154S + N157K + D167N, relative to TadA*7.10, the TadA reference sequence, or a corresponding
  • the adenosine deaminase variant is a heterodimer of a TadA*7.10 domain and an adenosine deaminase variant domain comprising one or more of the following alterations Y147T, Y147R, Q154S, Y123H, V82S, T166R, and/or Q154R, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • the adenosine deaminase variant is a heterodimer of a TadA*7.10 domain and an adenosine deaminase variant domain (e.g., TadA*8) comprising a combination of alterations selected from the group of: Y147T + Q154R; Y147T + Q154S; Y147R + Q154S; V82S + Q154S; V82S + Y147R; V82S +
  • a base editor comprises a heterodimer of a TadA* 7.10 domain and an adenosine deaminase variant domain (e.g., TadA* 8) comprising one or more of the following alterations R26C, V88A, A109S, T111R, D119N, H122N, Y147D, F149Y, T166I and/or D167N, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • the base editor comprises a heterodimer of a TadA*7.10 domain and an adenosine deaminase variant domain (e.g., TadA*8) comprising a combination of alterations selected from the group of: R26C + A109S + T111R + D119N + H122N + Y147D + F149Y + T166I + D167N; V88A + A109S +
  • the adenosine deaminase variant is a heterodimer of a TadA*7.10 domain and an adenosine deaminase variant domain comprising one or more of the following alterations L36H, I76Y, V82G, Y147T, Y147D, F149Y, Q154S, N157K, and/or D167N, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • an adenosine deaminase variant is a heterodimer comprising a TadA* 7.10 domain and an adenosine deaminase variant domain (e.g., MSP828) having the following alterations V82G, Y147T/D, Q154S, and one or more of L36H, I76Y, F149Y, N157K, and D167N, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • the adenosine deaminase variant is a heterodimer of a TadA*7.10 domain and an adenosine deaminase variant domain (e.g., TadA* 7.10) comprising a combination of alterations selected from the group of: V82G + Y147T + Q154S; I76Y + V82G + Y147T + Q154S; L36H + V82G + Y147T + Q154S + N157K; V82G + Y147D + F149Y + Q154S + D167N; L36H + V82G + Y147D + F149Y + Q154S + N157K + D167N; L36H + I76Y + V82G + Y147T + Q154S + N157K; I76Y + V82G + Y147T + Q154S + N157K; I76Y + V82G + Y147D + F149Y + Q154S + D167N; L36H + I76Y + V
  • the TadA*8 is a variant as shown in Tables 8A, 10, 11, or 13.
  • Tables 8A, 10, 11, and 13 show certain amino acid position numbers in the TadA amino acid sequence and the amino acids present in those positions in the TadA-7.10 adenosine deaminase.
  • Tables 8A, 10, 11, and 13 also show amino acid changes in TadA variants relative to TadA-7.10 following phage-assisted non-continuous evolution (PANCE) and phage-assisted continuous evolution (PACE), as described in M.
  • the TadA*8 is TadA*8a, TadA*8b, TadA*8c, TadA*8d, or TadA*8e. In some embodiments, the TadA*8 is TadA*8e.
  • an adenosine deaminase heterodimer can comprise a TadA*8 domain and an adenosine deaminase domain selected from Staphylococcus aureus (S. aureus) TadA, Bacillus subtilis (B. subtilis) TadA, Salmonella typhimurium (S. typhimurium) TadA, Shewanella putrefaciens (S. putrefaciens) TadA, Haemophilus influenzae F3031 (H. influenzae) TadA, Caulobacter crescentus (C. crescentus) TadA, Geobacter sulfurreducens (G. sulfurreducens) TadA, or TadA*7.10.
  • an adenosine deaminase is a TadA*8.
  • an adenosine deaminase is a TadA* 8 that comprises or consists essentially of the following sequence or a fragment thereof having adenosine deaminase activity:
  • the TadA*8 is truncated. In some embodiments, the truncated TadA*8 is missing 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 N-terminal amino acid residues relative to the full-length TadA*8. In some embodiments, the truncated TadA*8 is missing 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 C-terminal amino acid residues relative to the full-length TadA*8. In some embodiments the adenosine deaminase variant is a full-length TadA*8.
  • a fusion protein as described and/or exemplified herein comprises a wild-type TadA linked to an adenosine deaminase variant described herein (e.g., TadA*8), which is linked to Cas9 nickase.
  • the fusion proteins comprise a single TadA*8 domain (e.g., provided as a monomer).
  • the base editor comprises TadA*8 and TadA(wt), which are capable of forming heterodimers.
  • the TadA*8 is TadA*8.1, TadA*8.2, TadA*8.3,
  • the TadA variant is a variant as shown in Table 6.
  • Table 6 shows certain amino acid position numbers in the TadA amino acid sequence and the amino acids present in those positions in the TadA*7.10 adenosine deaminase.
  • the TadA variant is MSP605, MSP680, MSP823, MSP824, MSP825, MSP827, MSP828, or MSP829.
  • the TadA variant is MSP828.
  • the TadA variant is MSP829.
  • a fusion protein as described herein comprises a wild-type TadA linked to an adenosine deaminase variant described herein, which is linked to Cas9 nickase.
  • the fusion proteins comprise a single variant TadA domain (e.g., provided as a monomer).
  • the fusion protein comprises a variant TadA and TadA(wt), which are capable of forming heterodimers.
  • the TadA variant is truncated.
  • the truncated TadA variant is missing 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 N-terminal amino acid residues relative to the full-length TadA variant.
  • the truncated TadA variant is missing 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 C-terminal amino acid residues relative to the full-length TadA variant.
  • the adenosine deaminase variant is a full-length TadA variant.
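  • The N-terminal and C-terminal truncations described above can be illustrated with a minimal sketch. The function name `truncate_termini` and the placeholder sequence are assumptions made for illustration only; the placeholder is not a TadA sequence.

```python
def truncate_termini(seq: str, n_term: int = 0, c_term: int = 0) -> str:
    """Remove n_term residues from the N-terminus and c_term residues
    from the C-terminus of a one-letter amino acid sequence."""
    # The embodiments above describe truncations of 1 to 20 residues;
    # 0 is allowed here to represent the full-length variant.
    if not (0 <= n_term <= 20 and 0 <= c_term <= 20):
        raise ValueError("this sketch only covers truncations of 0 to 20 residues")
    if n_term + c_term >= len(seq):
        raise ValueError("truncation would remove the whole sequence")
    return seq[n_term:len(seq) - c_term]


# Hypothetical placeholder sequence (the 20 standard residues in order).
toy_sequence = "ACDEFGHIKLMNPQRSTVWY"
truncated = truncate_termini(toy_sequence, n_term=2, c_term=3)
print(truncated)
```

Passing `n_term=0` and `c_term=0` returns the sequence unchanged, which corresponds to the full-length case.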
  • a TadA*8 comprises one or more mutations at any of the following positions shown in bold. In other embodiments, a TadA*8 comprises one or more mutations at any of the positions shown with underlining:
  • the TadA*8 comprises alterations at amino acid position 82 and/or 166 (e.g., V82S, T166R) alone or in combination with any one or more of the following: Y147T, Y147R, Q154S, Y123H, and/or Q154R, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another TadA.
  • a combination of alterations is selected from the group of: Y147T + Q154R; Y147T + Q154S; Y147R + Q154S; V82S + Q154S; V82S + Y147R; V82S + Q154R; V82S + Y123H; I76Y + V82S; V82S + Y123H + Y147T; V82S + Y123H + Y147R; V82S + Y123H + Q154R; Y147R + Q154R +Y123H; Y147R + Q154R + I76Y; Y147R + Q154R + T166R; Y123H + Y147R + Q154R + I76Y; V82S + Y123H + Y147R + Q154R; and I76Y + V82S + Y123H + Y147R + Q154R, relative to TadA*7.10, the TadA reference sequence, or a corresponding mutation in another Ta
  • an adenosine deaminase comprises one or more of the following alterations: R21N, R23H, E25F, N38G, L51W, P54C, M70V, Q71M, N72K, Y73S, V82T, M94V, P124W, T133K, D139L, D139M, C146R, and A158K.
  • the one or more alterations are shown in the sequence above in underlining and bold font.
  • an adenosine deaminase comprises one or more of the following combinations of alterations: V82S + Q154R + Y147R; V82S + Q154R + Y123H; V82S + Q154R + Y147R+ Y123H; Q154R + Y147R + Y123H + I76Y+ V82S; V82S + I76Y; V82S + Y147R; V82S + Y147R + Y123H; V82S + Q154R + Y123H; Q154R + Y147R + Y123H + I76Y; V82S + Y147R; V82S + Y147R + Y123H; V82S + Q154R + Y147R; V82S + Q154R + Y147R; V82S + Q154R + Y147R; V82S + Q154R + Y147R; Q154R + Y147R; Q154R + Y147R; Q154R + Y147R
  • an adenosine deaminase comprises one or more of the following combinations of alterations: E25F + V82S + Y123H, T133K + Y147R + Q154R; E25F + V82S + Y123H + Y147R + Q154R; L51W + V82S + Y123H + C146R + Y147R + Q154R; Y73S + V82S + Y123H + Y147R + Q154R; P54C + V82S + Y123H + Y147R + Q154R; N38G + V82T + Y123H + Y147R + Q154R; N72K + V82S + Y123H + D139L + Y147R + Q154R; E25F + V82S + Y123H + D139M + Y147R + Q154R; Q71M + V82S + Y123H + Y147R + Q154R; E25F + V82S + Y123H + D
  • an adenosine deaminase comprises one or more of the following combinations of alterations: Q71M + V82S + Y123H + Y147R+ Q154R; E25F + I76Y+ V82S + Y123H + Y147R + Q154R; I76Y + V82T + Y123H + Y147R + Q154R; N38G + I76Y + V82S + Y123H + Y147R + Q154R; R23H + I76Y + V82S + Y123H + Y147R + Q154R; P54C + I76Y + V82S + Y123H + Y147R + Q154R; R21N + I76Y + V82S + Y123H + Y147R + Q154R; I76Y + V82S + Y123H + D139M + Y147R + Q154R; Y73S + I76Y + V82S + Y123H + Y147
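  • The combinations of alterations listed above use the notation reference residue, position, new residue (e.g., V82S), joined with "+". A minimal sketch of parsing such a combination and applying it to a sequence follows. The helper names and the toy sequence are hypothetical; the toy sequence is a placeholder that merely carries V, Y, and Q at positions 82, 147, and 154, and is not a TadA sequence.

```python
import re

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_combination(text):
    """Turn 'V82S + Y147R' into [('V', 82, 'S'), ('Y', 147, 'R')]."""
    substitutions = []
    for token in text.split("+"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"not a substitution: {token!r}")
        substitutions.append((match.group(1), int(match.group(2)), match.group(3)))
    return substitutions


def apply_combination(seq, text):
    """Apply each substitution (1-based positions) after checking that the
    reference residue is actually present at that position."""
    residues = list(seq)
    for ref, pos, alt in parse_combination(text):
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != ref:
            raise ValueError(f"expected {ref} at {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = alt
    return "".join(residues)


# Hypothetical placeholder: 166 alanines with V82, Y147 and Q154 set.
toy = list("A" * 166)
toy[81], toy[146], toy[153] = "V", "Y", "Q"
toy = "".join(toy)

mutant = apply_combination(toy, "V82S + Y147R + Q154R")
print(mutant[81], mutant[146], mutant[153])
```

Checking the reference residue before substituting guards against applying a combination to a sequence that uses a different numbering.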
  • the adenosine deaminase is expressed as a monomer. In other embodiments, the adenosine deaminase is expressed as a heterodimer. In some embodiments, the deaminase or other polypeptide sequence lacks a methionine, for example when included as a component of a fusion protein. This can alter the numbering of positions. However, the skilled person will understand that such corresponding mutations refer to the same mutation, e.g., Y73S and Y72S and D139M and D138M.
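  • The renumbering described above, in which a sequence lacking the initial methionine shifts every position down by one (Y73S becoming Y72S, D139M becoming D138M), can be sketched as follows. The function name is an assumption for illustration.

```python
import re


def shift_for_missing_met(mutation: str) -> str:
    """Renumber a substitution such as 'Y73S' for a sequence that lacks
    the initial methionine, so every position drops by one."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"not a substitution: {mutation!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if pos < 2:
        raise ValueError("position 1 is the removed methionine")
    return f"{ref}{pos - 1}{alt}"


for name in ("Y73S", "D139M"):
    print(name, "corresponds to", shift_for_missing_met(name))
```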
  • the TadA*9 variant is a monomer. In some embodiments, the TadA*9 variant is a heterodimer with a wild-type TadA adenosine deaminase. In some embodiments, the TadA*9 variant is a heterodimer with another TadA variant (e.g., TadA*8, TadA*9). Additional details of TadA*9 adenosine deaminases are described in International PCT Application No. PCT/2020/049975, which is incorporated herein by reference in its entirety.
  • a fusion protein as described herein comprises a wild-type TadA linked to an adenosine deaminase variant described herein (e.g., TadA variant), which is linked to Cas9 nickase.
  • the fusion proteins comprise a single TadA variant domain (e.g., provided as a monomer).
  • the base editor comprises TadA*8 and TadA(wt), which are capable of forming heterodimers.
  • the fusion proteins comprise a single (e.g., provided as a monomer) TadA variant domain.
  • the TadA variant is linked to a Cas9 nickase.
  • the fusion proteins described herein comprise a heterodimer of a wild-type TadA (TadA(wt)) linked to a TadA variant.
  • the fusion proteins described herein comprise a heterodimer of a TadA*7.10 linked to a TadA variant.
  • the fusion protein comprises a TadA variant monomer.
  • the fusion protein comprises a heterodimer of a TadA variant and a TadA(wt). In some embodiments, the fusion protein comprises a heterodimer of a TadA variant and TadA*7.10. In some embodiments, the fusion protein comprises a heterodimer of two TadA variants. In some embodiments, the TadA variant is selected from Table 5, 6, infra or any other TadA variant provided herein.
  • the deaminase or other polypeptide sequence lacks a methionine, for example when included as a component of a fusion protein. This can alter the numbering of positions. However, the skilled person will understand that such corresponding mutations refer to the same mutation.
  • any of the mutations provided herein and any additional mutations can be introduced into any other adenosine deaminases.
  • Any of the mutations provided herein can be made individually or in any combination in TadA reference sequence or another adenosine deaminase (e.g., ecTadA).
  • next generation sequencing adapters and barcodes for example Illumina multiplex adapters and indexes
  • high throughput sequencing for example on an Illumina MiSeq
  • the nucleobase editors are used to target polynucleotides of interest.
  • a nucleobase editor as described herein is delivered to cells (e.g., hepatocytes) in conjunction with a guide RNA that is used to target a nucleic acid sequence, e.g., a G6PC polynucleotide harboring GSD Ia-associated mutations, thereby altering the target gene, i.e., G6PC.
  • a base editor is targeted by a guide RNA to introduce one or more edits to the sequence of a gene of interest (e.g. G6PC).
  • the one or more alterations are introduced into the glucose-6-phosphatase (G6PC) gene.
  • the one or more alterations is R83C.
  • the one or more alterations is Q347X.
  • the alteration is introduced into a representative Homo sapiens G6PC protein, found under NCBI Reference Sequence No. AAA16222.1.
  • the alteration is introduced into a representative Homo sapiens G6PC nucleic acid sequence, found under GenBank Reference Sequence No. U01120.1.
  • the NLS-gRNA described herein can be used in a gene editing system for various therapeutic applications. Accordingly, in some embodiments, a method of treating a disorder or a disease in a subject in need thereof is provided, the method comprising administering to the subject a NLS-gRNA described herein with a gene editing system.
  • a gene editing system includes, for example, CRISPR-Cas9, Cpf1, SaCas9, or Cas12.
  • the NLS-gRNA described herein can be used with any gene editing system.
  • Cas protein is from an organism from a genus comprising Streptococcus, Campylobacter, Nitratifractor, Staphylococcus, Parvibaculum, Roseburia, Neisseria, Gluconacetobacter, Azospirillum, Sphaerochaeta, Lactobacillus, Eubacterium, Corynebacter, Carnobacterium, Rhodobacter, Listeria, Paludibacter, Clostridium, Lachnospira, Lachnospiraceae, Clostridiaridium, Leptotrichia, Francisella, Legionella, Alicyclobacillus, Methanomethyophilus, Porphyromonas, Prevotella, Bacteroidetes, Helcococcus, Leptospira, Desulfovibrio, Desulfonatronum, Opitutaceae, Tuberibacillus, Bacillus, Brevibacilus, Methylobacterium, Buty
  • the Cpf1 effector protein is selected from an organism from a genus selected from Eubacterium, Lachnospiraceae, Leptotrichia, Francisella, Methanomethyophilus, Porphyromonas, Prevotella, Leptospira, Butyvibrio, Perigrinibacterium, Pareubacterium, Moraxella, Thiomicrospira or Acidaminococcus.
  • Non-limiting examples of Cas species include Streptococcus pyogenes, Streptococcus thermophilus, Staphylococcus aureus, Neisseria meningitidis, Treponema denticola, Francisella tularensis, Campylobacter jejuni, Corynebacterium ulcerans, Corynebacterium diphtheriae, Spiroplasma syrphidicola, Prevotella intermedia, Spiroplasma taiwanense, Streptococcus iniae, Belliella baltica, Psychroflexus torquis, Listeria innocua, Geobacillus stearothermophilus, Streptococcus constellatus, Sharpea spp. isolate RUG017, Veillonella parvula, Ezakiella peruensis, Lactobacillus fermentum strain AF15-40LB and Peptoniphilus sp. Marseille-P3761.
  • the NLS-gRNA described herein can be used in conjunction with a gene editing system to treat various diseases and disorders, e.g., genetic disorders (e.g., monogenetic diseases), diseases that can be treated by nuclease activity, and various cancers, etc.
  • the NLS-gRNA described herein can be used in conjunction with a gene editing system to edit a target nucleic acid to modify the target nucleic acid (e.g., by inserting, deleting, or mutating one or more nucleic acid residues).
  • a CRISPR system is used with the NLS-gRNA described herein and comprises an exogenous donor template nucleic acid (e.g., a DNA molecule or an RNA molecule), which comprises a desirable nucleic acid sequence.
  • Upon resolution of a cleavage event induced with the CRISPR system, the molecular machinery of the cell will utilize the exogenous donor template nucleic acid in repairing and/or resolving the cleavage event. Alternatively, the molecular machinery of the cell can utilize an endogenous template in repairing and/or resolving the cleavage event.
  • the NLS-gRNA described herein is used in conjunction with a gene editing system to alter a target nucleic acid, resulting in an insertion, a deletion, and/or a point mutation.
  • the insertion is a scarless insertion (i.e., the insertion of an intended nucleic acid sequence into a target nucleic acid resulting in no additional unintended nucleic acid sequence upon resolution of the cleavage event).
  • Donor template nucleic acids may be double stranded or single stranded nucleic acid molecules (e.g., DNA or RNA).
  • NLS-gRNA described herein can be used in conjunction with a gene editing system for treating a disease caused by overexpression of RNAs, toxic RNAs, and/or mutated RNAs (e.g., splicing defects or truncations).
  • the NLS-gRNA described herein can be used in conjunction with a gene editing system to target trans-acting mutations affecting RNA-dependent functions that cause various diseases.
  • the NLS-gRNA described herein can be used in conjunction with a gene editing system to target mutations disrupting the cis-acting splicing codes that can cause splicing defects and diseases.
  • the NLS-gRNA described herein can be used in conjunction with a gene editing system for antiviral activity, in particular against RNA viruses.
  • suitable NLS-gRNA selected to target viral RNA sequences.
  • the NLS-gRNA described herein can be used in conjunction with a gene editing system to treat a cancer in a subject (e.g., a human subject), for example, by targeting an RNA molecule that is aberrant (e.g., comprises a point mutation or is alternatively spliced) and found in cancer cells to induce cell death in the cancer cells (e.g., via apoptosis).
  • the NLS-gRNA described herein can be used in conjunction with a gene editing system to treat an infectious disease in a subject, for example, through targeting an RNA molecule expressed by an infectious agent (e.g., a bacterium, a virus, a parasite or a protozoan) in order to target and induce cell death in the infectious agent cell.
  • the synthetic guide RNA described herein can be used in conjunction with a gene editing system to treat diseases where an intracellular infectious agent infects the cells of a host subject.
  • a polynucleotide comprising a donor sequence to be inserted is also provided to the cell.
  • By a "donor sequence" or "donor polynucleotide" it is meant a nucleic acid sequence to be inserted at the cleavage site induced by a site-directed modifying polypeptide.
  • the donor polynucleotide will contain sufficient homology to a genomic sequence at the cleavage site, e.g. 70%, 80%, 85%, 90%, 95%, or 100% homology with the nucleotide sequences flanking the cleavage site, e.g.
  • Donor sequences can be of any length, e.g. 10 nucleotides or more, 50 nucleotides or more, 100 nucleotides or more, 250 nucleotides or more, 500 nucleotides or more, 1000 nucleotides or more, 5000 nucleotides or more, etc.
  • the donor sequence is typically not identical to the genomic sequence that it replaces. Rather, the donor sequence may contain at least one or more single base changes, insertions, deletions, inversions or rearrangements with respect to the genomic sequence, so long as sufficient homology is present to support homology-directed repair.
  • the donor sequence comprises a non-homologous sequence flanked by two regions of homology, such that homology-directed repair between the target DNA region and the two flanking sequences results in insertion of the non-homologous sequence at the target region.
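  • The donor layout described above, a non-homologous sequence flanked by two regions of homology, can be sketched on a toy string. The function names, the toy genome, the cut position, and the arm length are all hypothetical, and `simulate_insertion` only shows the expected outcome rather than modeling the repair process itself.

```python
def build_donor(genome: str, cut: int, arm_len: int, insert: str) -> str:
    """Return left homology arm + insert + right homology arm, where the
    arms are copied from the genome on either side of the cut position."""
    if cut - arm_len < 0 or cut + arm_len > len(genome):
        raise ValueError("homology arms would run past the ends of the genome")
    left_arm = genome[cut - arm_len:cut]
    right_arm = genome[cut:cut + arm_len]
    return left_arm + insert + right_arm


def simulate_insertion(genome: str, cut: int, insert: str) -> str:
    """Expected sequence after the insert is placed at the cut position
    with no additional unintended sequence (a scarless insertion)."""
    return genome[:cut] + insert + genome[cut:]


# Hypothetical toy values, chosen only so the result is easy to read.
toy_genome = "AAAACCCCGGGGTTTT"
donor = build_donor(toy_genome, cut=8, arm_len=4, insert="ATG")
edited = simulate_insertion(toy_genome, cut=8, insert="ATG")
print(donor)
print(edited)
```

In this sketch the whole donor appears as a substring of the edited sequence, which is one simple check that the insert landed between the two homologous regions.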
  • Donor sequences may also comprise a vector backbone containing sequences that are not homologous to the DNA region of interest and that are not intended for insertion into the DNA region of interest.
  • the homologous region(s) of a donor sequence will have at least 50% sequence identity to a genomic sequence with which recombination is desired. In certain embodiments, 60%, 70%, 80%, 90%, 95%, 98%, 99%, or 99.9% sequence identity is present. Any value between 1% and 100% sequence identity can be present, depending upon the length of the donor polynucleotide.
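  • As a rough illustration of the sequence identity thresholds above, the sketch below computes percent identity between a homology region and a genomic sequence. It assumes the two sequences are already aligned, ungapped, and of equal length, which is a simplification; the function names and toy sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions with the same base in two pre-aligned,
    equal-length, ungapped sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def meets_identity(a: str, b: str, minimum: float = 50.0) -> bool:
    """True when identity is at least the chosen minimum (50% by default)."""
    return percent_identity(a, b) >= minimum


homology_region = "ACGTACGTAC"   # hypothetical donor homology region
genomic_region = "ACGTACGTAA"    # hypothetical genomic sequence
print(percent_identity(homology_region, genomic_region))
print(meets_identity(homology_region, genomic_region, minimum=95.0))
```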
  • the donor sequence may comprise certain sequence differences as compared to the genomic sequence, e.g. restriction sites, nucleotide polymorphisms, selectable markers (e.g., drug resistance genes, fluorescent proteins, enzymes etc.), etc., which may be used to assess for successful insertion of the donor sequence at the cleavage site or in some cases may be used for other purposes (e.g., to signify expression at the targeted genomic locus).
  • sequence differences may include flanking recombination sequences such as FLPs, loxP sequences, or the like, that can be activated at a later time for removal of the marker sequence.
  • the donor sequence may be provided to the cell as single-stranded DNA, single-stranded RNA, double-stranded DNA, or double-stranded RNA. It may be introduced into a cell in linear or circular form. If introduced in linear form, the ends of the donor sequence may be protected (e.g., from exonucleolytic degradation) by methods known to those of skill in the art. For example, one or more dideoxynucleotide residues are added to the 3' terminus of a linear molecule and/or self-complementary oligonucleotides are ligated to one or both ends.
  • Additional methods for protecting exogenous polynucleotides from degradation include, but are not limited to, addition of terminal amino group(s) and the use of modified internucleotide linkages such as, for example, phosphorothioates, phosphoramidates, and O-methyl ribose or deoxyribose residues.
  • additional lengths of sequence may be included outside of the regions of homology that can be degraded without impacting recombination.
  • a donor sequence can be introduced into a cell as part of a vector molecule having additional sequences such as, for example, replication origins, promoters and genes encoding antibiotic resistance.
  • donor sequences can be introduced as naked nucleic acid, as nucleic acid complexed with an agent such as a liposome or poloxamer, or can be delivered by viruses (e.g., adenovirus, AAV), as described above for nucleic acids encoding a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide.
  • a DNA region of interest may be cleaved and modified, i.e. "genetically modified", ex vivo.
  • the population of cells may be enriched for those comprising the genetic modification by separating the genetically modified cells from the remaining population.
  • the "genetically modified" cells may make up only about 1% or more (e.g., 2% or more, 3% or more, 4% or more, 5% or more, 6% or more, 7% or more, 8% or more, 9% or more, 10% or more, 15% or more, or 20% or more) of the cellular population.
  • Separation of "genetically modified" cells may be achieved by any convenient separation technique appropriate for the selectable marker used. For example, if a fluorescent marker has been inserted, cells may be separated by fluorescence activated cell sorting, whereas if a cell surface marker has been inserted, cells may be separated from the heterogeneous population by affinity separation techniques, e.g. magnetic separation, affinity chromatography, "panning" with an affinity reagent attached to a solid matrix, or other convenient technique.
  • Techniques providing accurate separation include fluorescence activated cell sorters, which can have varying degrees of sophistication, such as multiple color channels, low angle and obtuse light scattering detecting channels, impedance channels, etc.
  • the cells may be selected against dead cells by employing dyes associated with dead cells (e.g. propidium iodide). Any technique may be employed which is not unduly detrimental to the viability of the genetically modified cells.
  • Cell compositions that are highly enriched for cells comprising modified DNA are achieved in this manner.
  • By "highly enriched" it is meant that the genetically modified cells will be 70% or more, 75% or more, 80% or more, 85% or more, 90% or more of the cell composition, for example, about 95% or more, or 98% or more of the cell composition.
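  • The enrichment figures above (about 1% or more modified cells before separation, 70% or more afterwards) amount to a simple fraction. A minimal sketch follows, with hypothetical function names and made-up cell counts used purely as an example.

```python
def modified_fraction(modified_cells: int, total_cells: int) -> float:
    """Fraction of the cell composition that is genetically modified."""
    if total_cells <= 0 or not 0 <= modified_cells <= total_cells:
        raise ValueError("counts must satisfy 0 <= modified <= total and total > 0")
    return modified_cells / total_cells


def is_highly_enriched(modified_cells: int, total_cells: int,
                       threshold: float = 0.70) -> bool:
    """True when the modified fraction reaches the chosen threshold;
    0.70 is the lowest 'highly enriched' value named in the text."""
    return modified_fraction(modified_cells, total_cells) >= threshold


# Made-up counts for illustration only.
print(is_highly_enriched(1_000, 100_000))    # before separation
print(is_highly_enriched(95_000, 100_000))   # after separation
```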
  • the composition may be a substantially pure composition of genetically modified cells.
  • Genetically modified cells produced by the methods described herein may be used immediately.
  • the cells may be frozen at liquid nitrogen temperatures and stored for long periods of time, being thawed and capable of being reused.
  • the cells will usually be frozen in 10% dimethylsulfoxide (DMSO), 50% serum, 40% buffered medium, or some other such solution as is commonly used in the art to preserve cells at such freezing temperatures, and thawed in a manner as commonly known in the art for thawing frozen cultured cells.
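  • For the freezing solution above (10% DMSO, 50% serum, 40% buffered medium), component volumes for a given batch can be worked out as follows. This sketch assumes the percentages are by volume, which the text does not state explicitly; the function name is hypothetical.

```python
# Percentages taken from the text; treated here as volume/volume (assumption).
FREEZING_MIX_PERCENT = {"DMSO": 10, "serum": 50, "buffered medium": 40}


def freezing_mix_volumes(total_ml: float) -> dict:
    """Volume in mL of each component for a batch of total_ml."""
    if total_ml <= 0:
        raise ValueError("total volume must be positive")
    return {name: total_ml * pct / 100 for name, pct in FREEZING_MIX_PERCENT.items()}


for component, volume in freezing_mix_volumes(10).items():
    print(f"{component}: {volume} mL")
```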
  • the genetically modified cells may be cultured in vitro under various culture conditions.
  • the cells may be expanded in culture, i.e. grown under conditions that promote their proliferation.
  • Culture medium may be liquid or semi-solid, e.g. containing agar, methylcellulose, etc.
  • the cell population may be suspended in an appropriate nutrient medium, such as Iscove's modified DMEM or RPMI 1640, normally supplemented with fetal calf serum (about 5-10%), L-glutamine, a thiol, particularly 2-mercaptoethanol, and antibiotics, e.g. penicillin and streptomycin.
  • the culture may contain growth factors to which the regulatory T cells are responsive.
  • Growth factors are molecules capable of promoting survival, growth and/or differentiation of cells, either in culture or in the intact tissue, through specific effects on a transmembrane receptor. Growth factors include polypeptides and non polypeptide factors.
  • Cells that have been genetically modified in this way may be transplanted to a subject for purposes such as gene therapy, e.g. to treat a disease or as an antiviral, antipathogenic, or anticancer therapeutic, for the production of genetically modified organisms in agriculture, or for biological research.
  • the subject may be a neonate, a juvenile, or an adult.
  • Mammalian species that may be treated with the present methods include canines and felines; equines; bovines; ovines; etc. and primates, particularly humans.
  • Animal models, particularly small mammals e.g. mouse, rat, guinea pig, hamster, lagomorpha (e.g., rabbit), etc.
  • Cells may be provided to the subject alone or with a suitable substrate or matrix, e.g. to support their growth and/or organization in the tissue to which they are being transplanted. Usually, at least 1x10^3 cells will be administered, for example 5x10^3 cells, 1x10^4 cells, 5x10^4 cells, 1x10^5 cells, 1x10^6 cells or more.
  • the cells may be introduced to the subject via any of the following routes: parenteral, subcutaneous, intravenous, intracranial, intraspinal, intraocular, or into spinal fluid.
  • the cells may be introduced by injection, catheter, or the like. Cells may also be introduced into an embryo (e.g., a blastocyst) for the purpose of generating a transgenic animal (e.g., a transgenic mouse).
  • the number of administrations of treatment to a subject may vary. Introducing the genetically modified cells into the subject may be a one-time event; but in certain situations, such treatment may elicit improvement for a limited period of time and require an on-going series of repeated treatments. In other situations, multiple administrations of the genetically modified cells may be required before an effect is observed.
  • the exact protocols depend upon the disease or condition, the stage of the disease and parameters of the individual subject being treated.
  • the DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide are employed to modify cellular DNA in vivo, again for purposes such as gene therapy, e.g. to treat a disease or as an antiviral, antipathogenic, or anticancer therapeutic, for the production of genetically modified organisms in agriculture, or for biological research.
  • a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide are administered directly to the individual.
  • a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide may be administered by any of a number of well-known methods in the art for the administration of peptides, small molecules and nucleic acids to a subject.
  • a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide can be incorporated into a variety of formulations. More particularly, a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide of the present invention can be formulated into pharmaceutical compositions by combination with appropriate pharmaceutically acceptable carriers or diluents.
  • Pharmaceutical preparations are compositions that include one or more of a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide present in a pharmaceutically acceptable vehicle.
  • “Pharmaceutically acceptable vehicles” may be vehicles approved by a regulatory agency of the Federal or a state government or listed in the U.S.
  • lipids e.g. liposomes, e.g. liposome dendrimers
  • liquids such as water and oils, including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral oil, sesame oil and the like, saline; gum acacia, gelatin, starch paste, talc, keratin, colloidal silica, urea, and the like.
  • compositions may be formulated into preparations in solid, semisolid, liquid or gaseous forms, such as tablets, capsules, powders, granules, ointments, solutions, suppositories, injections, inhalants, gels, microspheres, and aerosols.
  • administration of a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide can be achieved in various ways, including oral, buccal, rectal, parenteral, intraperitoneal, intradermal, transdermal, intratracheal, intraocular, etc., administration.
  • the active agent may be systemic after administration or may be localized by the use of regional administration, intramural administration, or use of an implant that acts to retain the active dose at the site of implantation.
  • the active agent may be formulated for immediate activity or it may be formulated for sustained release.
  • BBB blood-brain barrier
  • osmotic means such as mannitol or leukotrienes
  • vasoactive substances such as bradykinin.
  • a BBB disrupting agent can be co-administered with the therapeutic compositions of the invention when the compositions are administered by intravascular injection.
  • Endogenous transport systems including Caveolin-1 mediated transcytosis, carrier-mediated transporters such as glucose and amino acid carriers, receptor-mediated transcytosis for insulin or transferrin, and active efflux transporters such as p-glycoprotein.
  • Active transport moieties may also be conjugated to the therapeutic compounds for use in the invention to facilitate transport across the endothelial wall of the blood vessel.
  • drug delivery of therapeutic agents behind the BBB may be by local delivery, for example by intrathecal delivery.
  • an effective amount of a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide is provided.
  • an effective amount or effective dose of a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide in vivo is the amount to induce a 2-fold increase or more in the amount of recombination observed between two homologous sequences relative to a negative control, e.g. a cell contacted with an empty vector or irrelevant polypeptide.
  • the amount of recombination may be measured by any convenient method, e.g. as described above and known in the art.
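  • The 2-fold criterion above compares recombination in treated cells against a negative control. A minimal sketch of that comparison follows; the function names are hypothetical, and the inputs stand for whatever recombination measurement is used, expressed in the same units for both samples.

```python
def fold_increase(treated: float, negative_control: float) -> float:
    """Ratio of recombination in the treated sample to the negative control."""
    if negative_control <= 0:
        raise ValueError("negative control value must be positive")
    return treated / negative_control


def meets_effective_amount(treated: float, negative_control: float,
                           min_fold: float = 2.0) -> bool:
    """True when the increase over the negative control is 2-fold or more."""
    return fold_increase(treated, negative_control) >= min_fold


# Illustrative numbers only; not data from the source.
print(meets_effective_amount(treated=6, negative_control=3))
print(meets_effective_amount(treated=5, negative_control=3))
```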
  • the calculation of the effective amount or effective dose of a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide to be administered is within the skill of one of ordinary skill in the art, and will be routine to those persons skilled in the art.
  • the final amount to be administered will be dependent upon the route of administration and upon the nature of the disorder or condition that is to be treated. In some embodiments, an exemplary dose of between about 0.01 to 1 mpk is used.
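  • The exemplary dose range above is given in mpk, read here as milligrams per kilogram of body weight. Under that reading, the absolute amount for a subject scales with body weight, as sketched below. The function name and the 70 kg body weight are assumptions for illustration, not values from the source.

```python
def dose_mg(dose_mpk: float, body_weight_kg: float) -> float:
    """Absolute dose in mg for a dose expressed in mg per kg body weight."""
    if dose_mpk < 0 or body_weight_kg <= 0:
        raise ValueError("dose must be non-negative and body weight positive")
    return dose_mpk * body_weight_kg


# Hypothetical 70 kg subject at the two ends of the exemplary range.
low, high = dose_mg(0.01, 70), dose_mg(1, 70)
print(f"{low:.2f} mg to {high:.2f} mg")
```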
  • the effective amount given to a particular patient will depend on a variety of factors, several of which will differ from patient to patient.
  • a competent clinician will be able to determine an effective amount of a therapeutic agent to administer to a patient to halt or reverse the progression of the disease condition as required.
  • a clinician can determine the maximum safe dose for an individual, depending on the route of administration. For instance, an intravenously administered dose may be more than an intrathecally administered dose, given the greater body of fluid into which the therapeutic composition is being administered. Similarly, compositions which are rapidly cleared from the body may be administered at higher doses, or in repeated doses, in order to maintain a therapeutic concentration.
  • a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide may be obtained from a suitable commercial source.
  • the total pharmaceutically effective amount of a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide administered parenterally per dose will be in a range that can be measured by a dose response curve.
  • Therapies based on a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide, i.e. preparations of a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide to be used for therapeutic administration, must be sterile. Sterility is readily accomplished by filtration through sterile filtration membranes (e.g., 0.2 μm membranes).
  • Therapeutic compositions generally are placed into a container having a sterile access port, for example, an intravenous solution bag or vial having a stopper pierceable by a hypodermic injection needle.
  • the therapies based on a DNA-targeting RNA and/or site-directed modifying polypeptide and/or donor polynucleotide may be stored in unit or multi-dose containers, for example, sealed ampules or vials, as an aqueous solution or as a lyophilized formulation for reconstitution.
  • As an example of a lyophilized formulation, 10-mL vials are filled with 5 mL of sterile-filtered 1% (w/v) aqueous solution of compound, and the resulting mixture is lyophilized.
  • the infusion solution is prepared by reconstituting the lyophilized compound using bacteriostatic Water-for-Injection.
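  • The lyophilization example above implies a fixed mass of compound per vial: a 1% (w/v) solution contains 1 g per 100 mL, i.e. 10 mg per mL. A short worked calculation follows; the function name is hypothetical.

```python
def compound_mass_mg(percent_wv: float, fill_volume_ml: float) -> float:
    """Mass of compound in mg for a solution of percent_wv % (w/v).
    1% (w/v) is 1 g per 100 mL, which equals 10 mg per mL."""
    if percent_wv < 0 or fill_volume_ml < 0:
        raise ValueError("inputs must be non-negative")
    mg_per_ml = percent_wv * 10.0
    return mg_per_ml * fill_volume_ml


# 5 mL of a 1% (w/v) solution per vial, as in the example above.
print(compound_mass_mg(1.0, 5.0), "mg of compound per vial")
```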
  • compositions can include, depending on the formulation desired, pharmaceutically acceptable, non-toxic carriers or diluents, which are defined as vehicles commonly used to formulate pharmaceutical compositions for animal or human administration.
  • diluents are selected so as not to affect the biological activity of the combination. Examples of such diluents are distilled water, buffered water, physiological saline, PBS, Ringer's solution, dextrose solution, and Hank's solution.
  • the pharmaceutical composition or formulation can include other carriers, adjuvants, or non-toxic, nontherapeutic, nonimmunogenic stabilizers, excipients and the like.
  • the compositions can also include additional substances to approximate physiological conditions, such as pH adjusting and buffering agents, toxicity adjusting agents, wetting agents and detergents.
  • the composition can also include any of a variety of stabilizing agents, such as an antioxidant for example.
  • the polypeptide can be complexed with various well-known compounds that enhance the in vivo stability of the polypeptide, or otherwise enhance its pharmacological properties (e.g., increase the half-life of the polypeptide, reduce its toxicity, and enhance solubility or uptake). Examples of such modifications or complexing agents include sulfate, gluconate, citrate and phosphate.
  • the nucleic acids or polypeptides of a composition can also be complexed with molecules that enhance their in vivo attributes. Such molecules include, for example, carbohydrates, polyamines, amino acids, other peptides, ions (e.g., sodium, potassium, calcium, magnesium, manganese), and lipids.
  • the pharmaceutical compositions can be administered for prophylactic and/or therapeutic treatments.
  • Toxicity and therapeutic efficacy of the active ingredient can be determined according to standard pharmaceutical procedures in cell cultures and/or experimental animals, including, for example, determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population).
  • the dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Therapies that exhibit large therapeutic indices are preferred.
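  • The ratio described above can be written out as a small calculation. This is a minimal sketch: the function name and the example doses are hypothetical and are not taken from the disclosure.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive doses")
    return ld50 / ed50


# Hypothetical doses in mg/kg, chosen only to illustrate the arithmetic.
example_ti = therapeutic_index(ld50=50.0, ed50=2.0)
print(example_ti)  # 25.0
```

  • A larger value means a wider margin between the effective and the lethal dose, which is why large therapeutic indices are preferred.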
  • the data obtained from cell culture and/or animal studies can be used in formulating a range of dosages for humans.
  • the dosage of the active ingredient typically lies within a range of circulating concentrations that include the ED50 with low toxicity.
  • the dosage can vary within this range depending upon the dosage form employed and the route of administration utilized.
  • compositions intended for in vivo use are usually sterile. To the extent that a given compound must be synthesized prior to use, the resulting product is typically substantially free of any potentially toxic agents, particularly any endotoxins, which may be present during the synthesis or purification process.
  • compositions for parenteral administration are also sterile, substantially isotonic and made under GMP conditions.
  • NLS-gRNA described herein can be delivered to a cell of interest by various delivery systems such as vectors, carriers, e.g., lipid nanoparticles.
  • the NLS-gRNA described herein can be delivered by nanoparticles, which can be organic or inorganic. Nanoparticles are well known in the art. Any suitable nanoparticle design can be used to deliver genome editing system components or nucleic acids encoding such components. For instance, organic (e.g. lipid and/or polymer) nanoparticles can be suitable for use as delivery vehicles in certain embodiments of this disclosure. Exemplary lipids for use in nanoparticle formulations, and/or gene transfer are shown in Table 2 (below).
  • Table 3 lists exemplary polymers for use in gene transfer and/or nanoparticle formulations.
  • Table 4 summarizes delivery methods for a polynucleotide encoding a Cas9 described herein.
  • AAV Virus
  • the delivery of a genome editing system including the NLS-gRNA described herein may be accomplished by delivering a ribonucleoprotein (RNP) to cells.
  • RNP comprises the nucleic acid binding protein, e.g., Cas9, in complex with the targeting gRNA.
  • RNPs may be delivered to cells using known methods, such as electroporation, nucleofection, or cationic lipid-mediated methods, for example, as reported by Zuris, J.A. et al., 2015, Nat. Biotechnology, 33(1):73-80.
  • RNPs are advantageous for use in CRISPR base editing systems, particularly for cells that are difficult to transfect, such as primary cells.
  • RNPs can also alleviate difficulties that may occur with protein expression in cells, especially when eukaryotic promoters, e.g., CMV or EF1A, which may be used in CRISPR plasmids, are not well-expressed.
  • the use of RNPs does not require the delivery of foreign DNA into cells.
  • because an RNP comprising a nucleic acid binding protein and gRNA complex is degraded over time, the use of RNPs has the potential to limit off-target effects.
  • RNPs can be used to deliver binding protein (e.g., Cas9 variants) and to direct homology directed repair (HDR).
  • a promoter used to drive the CRISPR system can include AAV ITR. This can be advantageous for eliminating the need for an additional promoter element, which can take up space in the vector. The additional space freed up can be used to drive the expression of additional elements, such as a guide nucleic acid or a selectable marker. ITR activity is relatively weak, so it can be used to reduce potential toxicity due to over expression of the chosen nuclease.
  • any suitable promoter can be used to drive expression of the Cas9 and, where appropriate, the guide nucleic acid.
  • promoters that can be used include CMV, CAG, CBh, PGK, SV40, Ferritin heavy or light chains, etc.
  • suitable promoters can include: Synapsin I for all neurons, CaMKIIalpha for excitatory neurons, GAD67 or GAD65 or VGAT for GABAergic neurons, etc.
  • suitable promoters include the Albumin promoter.
  • suitable promoters can include SP-B.
  • suitable promoters can include ICAM.
  • suitable promoters can include IFNbeta or CD45.
  • suitable promoters can include OG-2.
  • a vector or viral vector can comprise a first promoter operably linked to a nucleic acid encoding the base editor and a second promoter operably linked to the guide nucleic acid.
  • the promoter used to drive expression of a guide nucleic acid can include: Pol
  • AAV gRNA Adeno Associated Virus
  • a Cas9 can be delivered using adeno associated virus (AAV), lentivirus, adenovirus or other plasmid or viral vector types, in particular, using formulations and doses from, for example, U.S. Patent No. 8,454,972 (formulations, doses for adenovirus), U.S. Patent No. 8,404,658 (formulations, doses for AAV) and U.S. Patent No. 5,846,946 (formulations, doses for DNA plasmids) and from clinical trials and publications regarding the clinical trials involving lentivirus, AAV and adenovirus.
  • AAV the route of administration, formulation and dose can be as in U.S. Patent No.
  • the route of administration, formulation and dose can be as in U.S. Patent No. 8,404,658 and as in clinical trials involving adenovirus.
  • the route of administration, formulation and dose can be as in U.S. Patent No. 5,846,946 and as in clinical studies involving plasmids.
  • Doses can be based on or extrapolated to an average 70 kg individual (e.g. a male adult human), and can be adjusted for patients, subjects, mammals of different weight and species.
  • Frequency of administration is within the ambit of the medical or veterinary practitioner (e.g., physician, veterinarian), depending on usual factors including the age, sex, general health, other conditions of the patient or subject and the particular condition or symptoms being addressed.
  • the viral vectors can be injected into the tissue of interest.
  • the expression of the base editor and optional guide nucleic acid can be driven by a cell-type specific promoter.
  • AAV can be advantageous over other viral vectors.
  • AAV allows low toxicity, which can be due to the purification method not requiring ultra-centrifugation of cell particles that can activate the immune response.
  • AAV allows low probability of causing insertional mutagenesis because it doesn't integrate into the host genome.
  • AAV has a packaging limit of 4.5 or 4.75 Kb. Constructs larger than 4.5 or
  • embodiments of the present disclosure include utilizing a disclosed Cas9 which is shorter in length than conventional Cas9.
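  • The packaging constraint above amounts to a length budget. The following sketch checks whether a set of cassette elements fits under the 4.5 kb figure quoted in the text; the element names and sizes are hypothetical placeholders, not values from the disclosure.

```python
AAV_PACKAGING_LIMIT_BP = 4500  # lower of the two limits quoted in the text (4.5 kb)


def fits_in_aav(element_lengths_bp: dict[str, int],
                limit_bp: int = AAV_PACKAGING_LIMIT_BP) -> tuple[int, bool]:
    """Return the summed construct length and whether it is within the limit."""
    total = sum(element_lengths_bp.values())
    return total, total <= limit_bp


# Hypothetical element sizes in base pairs, for illustration only.
construct = {
    "promoter": 500,
    "nuclease_orf": 3200,
    "polyA_signal": 200,
    "guide_cassette": 350,
}
total_bp, fits = fits_in_aav(construct)
print(total_bp, fits)  # 4250 True
```

  • A shorter Cas9 leaves more of this budget for promoters, guide cassettes, or other elements, which is the motivation stated above.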
  • An AAV can be AAV1, AAV2, AAV5 or any combination thereof.
  • AAV8 is useful for delivery to the liver.
  • a tabulation of certain AAV serotypes as to these cells can be found in Grimm, D. et al., J. Virol. 82: 5887-5911 (2008).
  • Lentiviruses are complex retroviruses that have the ability to infect and express their genes in both mitotic and post-mitotic cells.
  • the most commonly known lentivirus is the human immunodeficiency virus (HIV), which uses the envelope glycoproteins of other viruses to target a broad range of cell types.
  • HIV human immunodeficiency virus
  • pCasES10 which contains a lentiviral transfer plasmid backbone
  • Cells are transfected with 10 µg of lentiviral transfer plasmid (pCasES10) and the following packaging plasmids: 5 µg of pMD2.G (VSV-g pseudotype), and 7.5 µg of psPAX2 (gag/pol/rev/tat).
  • Transfection can be done in 4 mL OptiMEM with a cationic lipid delivery agent (50 µL Lipofectamine 2000 and 100 µL Plus reagent). After 6 hours, the media is changed to antibiotic-free DMEM with 10% fetal bovine serum. These methods use serum during cell culture, but serum-free methods are preferred.
  • Lentivirus can be purified as follows. Viral supernatants are harvested after
  • minimal non-primate lentiviral vectors based on the equine infectious anemia virus are also contemplated.
  • EIAV equine infectious anemia virus
  • RetinoStat® an equine infectious anemia virus-based lentiviral gene therapy vector that expresses angiostatic proteins endostatin and angiostatin that is contemplated to be delivered via a subretinal injection.
  • use of self-inactivating lentiviral vectors is contemplated.
  • RNA of the systems can be delivered in the form of RNA.
  • Cas9 encoding mRNA can be generated using in vitro transcription.
  • Cas9 mRNA can be synthesized using a PCR cassette containing the following elements: T7 promoter, optional Kozak sequence (GCCACC), nuclease sequence, and 3' UTR such as a 3' UTR from beta globin-polyA tail.
  • the cassette can be used for transcription by T7 polymerase.
  • Guide polynucleotides e.g., gRNA
  • the Cas9 sequence and/or the guide nucleic acid can be modified to include one or more modified nucleosides, e.g., using pseudo-U or 5-Methyl-C.
  • the disclosure in some embodiments comprehends a method of modifying a cell or organism.
  • the cell can be a prokaryotic cell or a eukaryotic cell.
  • the cell can be a mammalian cell.
  • the mammalian cell may be a non-human primate, bovine, porcine, rodent or mouse cell.
  • the modification introduced to the cell by the base editors, compositions and methods of the present disclosure can be such that the cell and progeny of the cell are altered for improved production of biologic products such as an antibody, starch, alcohol or other desired cellular output.
  • the modification introduced to the cell by the methods of the present disclosure can be such that the cell and progeny of the cell include an alteration that changes the biologic product produced.
  • the system can comprise one or more different vectors.
  • the Cas9 is codon optimized for expression in the desired cell type, preferentially a eukaryotic cell, preferably a mammalian cell or a human cell.
  • codon optimization refers to a process of modifying a nucleic acid sequence for enhanced expression in the host cells of interest by replacing at least one codon (e.g. about or more than about 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more codons) of the native sequence with codons that are more frequently or most frequently used in the genes of that host cell while maintaining the native amino acid sequence.
  • codon bias differs in codon usage between organisms
  • mRNA messenger RNA
  • tRNA transfer RNA
  • Codon usage tables are readily available, for example, at the “Codon Usage Database” available at www.kazusa.or.jp/codon/ (visited Jul. 9, 2002), and these tables can be adapted in a number of ways. See, Nakamura, Y., et al. "Codon usage tabulated from the international DNA sequence databases: status for the year 2000" Nucl. Acids Res. 28:292 (2000).
  • algorithms for codon optimizing a particular sequence for expression in a particular host cell, such as Gene Forge (Aptagen; Jacobus, Pa.), are also available.
  • one or more codons e.g. 1, 2, 3, 4, 5, 10, 15, 20, 25, 50, or more, or all codons
  • one or more codons in a sequence encoding an engineered nuclease correspond to the most frequently used codon for a particular amino acid.
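  • Codon optimization as defined above can be sketched as a synonymous codon swap that leaves the encoded protein unchanged. In this minimal example the preferred-codon map and the input sequence are hypothetical toy values; a real workflow would derive preferences from a codon usage table for the host of interest.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDDDEEEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna: str, preferred: dict[str, str]) -> str:
    """Swap each codon for the preferred synonymous codon, where one is given."""
    for aa, codon in preferred.items():
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    protein = translate(dna)
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(preferred.get(aa, codon) for aa, codon in zip(protein, codons))


# Hypothetical host preferences and a toy native sequence (Met-Leu-Lys-Arg).
toy_preferences = {"L": "CTG", "K": "AAG", "R": "AGA"}
native = "ATGTTAAAACGT"
optimized = codon_optimize(native, toy_preferences)
print(optimized)                                   # ATGCTGAAGAGA
print(translate(native) == translate(optimized))   # True
```

  • Amino acids without an entry in the preference map keep their native codon, so the function can be applied to any number of codons, from one to all.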
  • Packaging cells are typically used to form virus particles that are capable of infecting a host cell. Such cells include 293 cells, which package adenovirus, and psi.2 cells or PA317 cells, which package retrovirus. Viral vectors used in gene therapy are usually generated by producing a cell line that packages a nucleic acid vector into a viral particle.
  • the vectors typically contain the minimal viral sequences required for packaging and subsequent integration into a host, other viral sequences being replaced by an expression cassette for the polynucleotide (s) to be expressed.
  • the missing viral functions are typically supplied in trans by the packaging cell line.
  • AAV vectors used in gene therapy typically only possess ITR sequences from the AAV genome which are required for packaging and integration into the host genome.
  • Viral DNA can be packaged in a cell line, which contains a helper plasmid encoding the other AAV genes, namely rep and cap, but lacking ITR sequences.
  • the cell line can also be infected with adenovirus as a helper.
  • the helper virus can promote replication of the AAV vector and expression of AAV genes from the helper plasmid.
  • the helper plasmid in some cases is not packaged in significant amounts due to a lack of ITR sequences. Contamination with adenovirus can be reduced by, e.g., heat treatment to which adenovirus is more sensitive than AAV.
  • compositions comprising gene editing system (e.g., including the NLS-gRNA described herein).
  • pharmaceutical composition refers to a composition formulated for pharmaceutical use.
  • the pharmaceutical composition further comprises a pharmaceutically acceptable carrier.
  • the pharmaceutical composition comprises additional agents (e.g., for specific delivery, increasing half-life, or other therapeutic compounds).
  • the term “pharmaceutically-acceptable carrier” means a pharmaceutically-acceptable material, composition or vehicle, such as a liquid or solid filler, diluent, excipient, manufacturing aid (e.g., lubricant, talc magnesium, calcium or zinc stearate, or stearic acid), or solvent encapsulating material, involved in carrying or transporting the compound from one site (e.g., the delivery site) of the body, to another site (e.g., organ, tissue or portion of the body).
  • a pharmaceutically acceptable carrier is “acceptable” in the sense of being compatible with the other ingredients of the formulation and not injurious to the tissue of the subject (e.g., physiologically compatible, sterile, physiologic pH, etc.).
  • Some nonlimiting examples of materials which can serve as pharmaceutically-acceptable carriers include: (1) sugars, such as lactose, glucose and sucrose; (2) starches, such as corn starch and potato starch; (3) cellulose, and its derivatives, such as sodium carboxymethyl cellulose, methylcellulose, ethyl cellulose, microcrystalline cellulose and cellulose acetate; (4) powdered tragacanth; (5) malt; (6) gelatin; (7) lubricating agents, such as magnesium stearate, sodium lauryl sulfate and talc; (8) excipients, such as cocoa butter and suppository waxes; (9) oils, such as peanut oil, cottonseed oil, safflower oil, sesame oil, olive oil, corn oil and soybean oil; (10) glycols, such as propylene glycol; (11) polyols, such as glycerin, sorbitol, mannitol and polyethylene glycol (PEG); (12) esters,
  • wetting agents, coloring agents, release agents, coating agents, sweetening agents, flavoring agents, perfuming agents, preservative and antioxidants can also be present in the formulation.
  • excipient e.g., pharmaceutically acceptable carrier, “vehicle,” or the like are used interchangeably herein.
  • compositions can comprise one or more pH buffering compounds to maintain the pH of the formulation at a predetermined level that reflects physiological pH, such as in the range of about 5.0 to about 8.0.
  • the pH buffering compound used in the aqueous liquid formulation can be an amino acid or mixture of amino acids, such as histidine or a mixture of amino acids such as histidine and glycine.
  • the pH buffering compound is preferably an agent which maintains the pH of the formulation at a predetermined level, such as in the range of about 5.0 to about 8.0, and which does not chelate calcium ions.
  • Illustrative examples of such pH buffering compounds include, but are not limited to, imidazole and acetate ions.
  • the pH buffering compound may be present in any amount suitable to maintain the pH of the formulation at a predetermined level.
  • compositions can also contain one or more osmotic modulating agents, i.e., a compound that modulates the osmotic properties (e.g., tonicity, osmolality, and/or osmotic pressure) of the formulation to a level that is acceptable to the blood stream and blood cells of recipient individuals.
  • the osmotic modulating agent can be an agent that does not chelate calcium ions.
  • the osmotic modulating agent can be any compound known or available to those skilled in the art that modulates the osmotic properties of the formulation. One skilled in the art may empirically determine the suitability of a given osmotic modulating agent for use in the inventive formulation.
  • osmotic modulating agents include, but are not limited to: salts, such as sodium chloride and sodium acetate; sugars, such as sucrose, dextrose, and mannitol; amino acids, such as glycine; and mixtures of one or more of these agents and/or types of agents.
  • the osmotic modulating agent(s) may be present in any concentration sufficient to modulate the osmotic properties of the formulation.
  • the pharmaceutical composition is formulated for delivery to a subject, e.g., for gene editing.
  • Suitable routes of administering the pharmaceutical composition described herein include, without limitation: topical, subcutaneous, transdermal, intradermal, intralesional, intraarticular, intraperitoneal, intravesical, transmucosal, gingival, intradental, intracochlear, transtympanic, intraorgan, epidural, intrathecal, intramuscular, intravenous, intravascular, intraosseous, periocular, intratumoral, intracerebral, and intracerebroventricular administration.
  • the pharmaceutical composition described herein is administered locally to a diseased site.
  • the pharmaceutical composition described herein is administered to a subject by injection, by means of a catheter, by means of a suppository, or by means of an implant, the implant being of a porous, non-porous, or gelatinous material, including a membrane, such as a silastic membrane, or a fiber.
  • the pharmaceutical composition described herein is delivered in a controlled release system.
  • a pump can be used (See, e.g., Langer, 1990, Science 249: 1527-1533; Sefton, 1989, CRC Crit. Ref. Biomed. Eng. 14:201; Buchwald et al., 1980, Surgery 88:507; Saudek et al., 1989, N. Engl. J. Med. 321:574).
  • polymeric materials can be used.
  • the pharmaceutical composition is formulated in accordance with routine procedures as a composition adapted for intravenous or subcutaneous administration to a subject, e.g., a human.
  • pharmaceutical compositions for administration by injection are typically solutions in a sterile isotonic vehicle and can include, where necessary, a solubilizing agent and a local anesthetic such as lignocaine to ease pain at the site of the injection.
  • the ingredients are supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water free concentrate in a hermetically sealed container such as an ampoule or sachette indicating the quantity of active agent.
  • where the pharmaceutical is to be administered by infusion, it can be dispensed with an infusion bottle containing sterile pharmaceutical grade water or saline.
  • an ampoule of sterile water for injection or saline can be provided so that the ingredients can be mixed prior to administration.
  • a pharmaceutical composition for systemic administration can be a liquid, e.g., sterile saline, lactated Ringer's or Hank's solution.
  • the pharmaceutical composition can be in solid forms and re-dissolved or suspended immediately prior to use. Lyophilized forms are also contemplated.
  • the pharmaceutical composition can be contained within a lipid particle or vesicle, such as a liposome or microcrystal, which is also suitable for parenteral administration.
  • the particles can be of any suitable structure, such as unilamellar or plurilamellar, so long as compositions are contained therein.
  • SPLP stabilized plasmid-lipid particles
  • DOPE fusogenic lipid dioleoylphosphatidylethanolamine
  • PEG polyethyleneglycol
  • Positively charged lipids such as N-[1-(2,3-dioleoyloxy)propyl]-N,N,N-trimethylammonium methylsulfate, or “DOTAP,” are particularly preferred for such particles and vesicles.
  • DOTAP N-[1-(2,3-dioleoyloxy)propyl]-N,N,N-trimethylammonium methylsulfate
  • the pharmaceutical composition described herein can be administered or packaged as a unit dose, for example.
  • unit dose when used in reference to a pharmaceutical composition of the present disclosure refers to physically discrete units suitable as unitary dosage for the subject, each unit containing a predetermined quantity of active material calculated to produce the desired therapeutic effect in association with the required diluent; i.e., carrier, or vehicle.
  • the pharmaceutical composition can be provided as a pharmaceutical kit comprising (a) a container containing a compound of the invention in lyophilized form and (b) a second container containing a pharmaceutically acceptable diluent (e.g., sterile used for reconstitution or dilution of the lyophilized compound of the invention.
  • an article of manufacture containing materials useful for the treatment of the diseases described above is included.
  • the article of manufacture comprises a container and a label.
  • Suitable containers include, for example, bottles, vials, syringes, and test tubes.
  • the containers can be formed from a variety of materials such as glass or plastic.
  • the container holds a composition that is effective for treating a disease described herein and can have a sterile access port.
  • the container can be an intravenous solution bag or a vial having a stopper pierceable by a hypodermic injection needle.
  • the active agent in the composition is a compound of the invention.
  • the label on or associated with the container indicates that the composition is used for treating the disease of choice.
  • the article of manufacture can further comprise a second container comprising a pharmaceutically-acceptable buffer, such as phosphate-buffered saline, Ringer's solution, or dextrose solution. It can further include other materials desirable from a commercial and user standpoint, including other buffers, diluents, filters, needles, syringes, and package inserts with instructions for use.
  • the CRISPR system (e.g., including the Cas9 described herein) is provided as part of a pharmaceutical composition.
  • the pharmaceutical composition comprises any of the fusion proteins provided herein (e.g., including the nucleobase editor described herein comprising LubCas9).
  • the pharmaceutical composition comprises any of the complexes provided herein.
  • the pharmaceutical composition comprises a ribonucleoprotein complex comprising an RNA-guided nuclease (e.g., Cas9) that forms a complex with a gRNA and a cationic lipid.
  • pharmaceutical composition comprises a gRNA, a nucleic acid programmable DNA binding protein, a cationic lipid, and a pharmaceutically acceptable excipient.
  • Pharmaceutical compositions can optionally comprise one or more additional therapeutically active substances.
  • Kits
  • the NLS-gRNA described herein can be provided and/or produced by a kit containing any one or more of the elements disclosed in the above methods and compositions.
  • a kit may include a NLS-gRNA, a ligase, and suitable buffering reagents.
  • the kit further comprises a nucleobase editor.
  • a kit comprises one or more reagents for use in a process utilizing one or more of the elements described herein.
  • Reagents may be provided in any suitable container.
  • a kit may provide one or more reaction or storage buffers.
  • Reagents may be provided in a form that is usable in a particular assay, or in a form that requires addition of one or more other components before use (e.g. in concentrate or lyophilized form).
  • a buffer can be any buffer, including but not limited to a sodium carbonate buffer, a sodium bicarbonate buffer, a borate buffer, a Tris buffer, a MOPS buffer, a HEPES buffer, and combinations thereof.
  • the buffer is alkaline.
  • the buffer has a pH from about 7 to about 10.
  • the kit comprises one or more oligonucleotides corresponding to a guide sequence for insertion into a vector so as to operably link the guide sequence and a regulatory element.
  • the kit comprises a homologous recombination template polynucleotide.
  • This example describes an exemplary gRNA conjugated to NLS (NLS-gRNA) of the present invention and its efficacy ex vivo.
  • NLS-gRNA gRNA conjugated to NLS
  • a peptide comprising the NLS sequence and a peptide spacer was synthesized by solid-phase peptide synthesis.
  • the synthesized peptide was conjugated to the 3' end of the gRNA via thiol group, as shown in FIG. 1.
  • the linker and the peptide spacer can be modified in the practice of the present invention. Additionally, the sequence of the NLS, gRNA, and/or linker can be modified.
  • NLS-sgRNA was prepared and formulated in lipid nanoparticles with mRNA encoding a CRISPR-Cas9 based editor. The formulation was delivered to hepatocytes at three different ratios of mRNA:sgRNA (1:1, 3:1, and 9:1). As shown in FIG. 2, NLS-sgRNA showed a significantly higher base editing efficiency as compared to gRNA without the NLS sequence.
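  • To make the three mRNA:sgRNA ratios concrete, the sketch below splits a fixed total amount of RNA between the two components. It assumes the ratios are by mass, which the text does not state, and the 10 µg total is a hypothetical figure.

```python
def split_rna_mass(total_ug: float, mrna_parts: float, sgrna_parts: float) -> tuple[float, float]:
    """Split a total RNA mass into (mRNA, sgRNA) for an mRNA:sgRNA ratio."""
    whole = mrna_parts + sgrna_parts
    return total_ug * mrna_parts / whole, total_ug * sgrna_parts / whole


# ASSUMPTION: ratios are mass ratios; total of 10 ug is illustrative only.
for mrna_parts, sgrna_parts in [(1, 1), (3, 1), (9, 1)]:
    mrna_ug, sgrna_ug = split_rna_mass(10.0, mrna_parts, sgrna_parts)
    print(f"{mrna_parts}:{sgrna_parts} -> mRNA {mrna_ug} ug, sgRNA {sgrna_ug} ug")
```

  • At a fixed total payload, moving from 1:1 to 9:1 reduces the guide fraction from one half to one tenth, so the higher ratios test editing under limiting amounts of gRNA.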
  • CRISPR-Cas system e.g., base editing
  • a gRNA that is conjugated to a NLS sequence.
  • the improvement in CRISPR-Cas system may be due in part to better trafficking of the NLS-gRNA to the nucleus which protects gRNA from cytosolic RNases, increased local concentration of gRNA and therefore ribonucleoprotein complex (RNP) formation, and higher rate of import to the nucleus.
  • the cationic NLS sequence may act in part by promoting endosomal escape.
  • NLS-gRNA significantly improves base editing in vivo, even as compared to highly modified gRNA.
  • spCas9 gRNAs were used with an adenine base editor (ABE) comprising an spCas9 nickase and adenosine deaminase.
  • ABE adenine base editor
  • gRNAs with various modifications were prepared. As shown in FIG. 3A, an end-modified (EM) gRNA comprises 6% modifications, a heavy mod1 (HM1) gRNA comprises 47% modification, a heavy mod2 (HM2) gRNA comprises 60% modification, and a heavy mod3 (HM3) gRNA comprises 88% modification. The NLS-gRNA comprises an NLS sequence conjugated to the 3' end of the gRNA and 6% modification. Two different mRNAs, both encoding the same base editor, were prepared. Compared to mRNA2, mRNA1 is codon-optimized, with 3' and 5' UTR sequences. Various combinations of the gRNAs with either mRNA1 or mRNA2 were formulated in LNPs and delivered to mice at a sub-saturating dose of 0.03 mpk or 0.01 mpk, as shown in FIG. 3B.
  • NLS-gRNA exhibited higher base editing efficacy as compared to all EM, HM1, HM2, or HM3 gRNAs. Particularly, even at ultra-low doses (0.01 mpk), base editing was visible for NLS-gRNA, and was significantly higher than heavily modified (HM1, HM2, and HM3) gRNAs. Additionally, combining NLS-gRNA with less potent mRNA (mRNA2) compensated for the quality of mRNA - while the base editing efficiency of mRNA2 with end-modified gRNA was about 5%, substituting the gRNA with NLS-gRNA increased the base editing efficiency to greater than 30%.
  • mRNA2 less potent mRNA
  • This example illustrates that the improvement in base editing efficiency by using NLS-gRNA is also observed in NHPs.
  • spCas9 gRNAs were used with an spCas9-based adenine base editor (ABE).
  • gRNAs and mRNA encoding a base editor were formulated in lipid nanoparticles as shown in FIG. 4A.
  • the formulations were delivered to NHPs at 1.0 mpk, and base editing efficiency was determined in liver.
  • the results show that NLS-gRNA with mRNA1 (g5-BVN) and HM3 gRNA with mRNA1 (g4-BVB) exhibited the highest base editing efficiency, followed by g2-BVI, g3-BVV, and g1-BVE.
  • NLS-gRNA with end modifications showed more than two-fold base editing efficiency as compared to respective end-modified gRNA without NLS (compare to g1-BVE and g6-BG3IE, respectively).
  • ALT alanine aminotransferase
  • AST aspartate aminotransferase
  • As shown in FIG. 4B, minimal to mild increases in AST and/or ALT were observed 24 hr post-dose for all test articles.
  • g5-BVN which comprises NLS-gRNA with end modification showed the lowest AST and ALT increases. Additionally, no other significant changes in clinical pathology parameters were observed.
  • Cas system e.g., base editing efficiency in NHPs, with decreased toxicity.
  • NLS-gRNA can be applied to various Cas proteins.
  • a Staphylococcus aureus Cas9 saCas9
  • saCas9 requires a unique guide that is not compatible with spCas9 editing shown in previous examples.
  • Glycogen storage disease type 1a is caused by a mutation in the glucose-6-phosphatase (G6PC) gene, which affects about 80% of patients with GSD1a.
  • G6PC glucose-6-phosphatase
  • the R83C mutation affects about 900 US patients annually diagnosed with Glycogen storage disease type 1a (GSD1a).
  • This mutation is a single base substitution that introduces a cysteine at position 83 (R83C) of the G6PC protein. A precise correction of R83C will likely restore expression of G6PC and normalize glucose metabolism.
  • gRNAs were prepared and their purity was determined. gRNAs with two different backbone chemistries were used in the study (sg029 vs. sg093). The sg093 guides have end modifications with 2'-OMe and phosphorothioate modifications. Various gRNAs and mRNA encoding a base editor were formulated in LNPs at a 1:1 ratio of gRNA:mRNA.
  • Rati mice heterozygous for huG6PC-R83C were administered LNP formulations at a sub-saturating dose of 1 mpk.
  • FIG. 5 shows a correlation between base editing efficiency and purity of gRNA, with 80% purity yielding maximum base editing levels. Additionally, NLS-gRNA showed an improvement in potency with spCas9 protein relative to other sg093 guides without NLS sequence, illustrating that NLS-gRNA of the present invention can be applied across multiple Cas proteins.
  • Example 5 In vivo base editing correction of metabolic defects in GSD1a R83C mice using NLS-gRNA
  • ABEs Adenine base editors
  • the G6PC gRNA sequence hybridizes to the complement of the G6PC target sequence shown below:
  • the NNGRRT PAM sequence (i.e., Staphylococcus aureus Cas9 (saCas9)) is underlined above.
  • the gRNA sequence is as follows: CAGUAUGGACACUGUCCAAA (SEQ ID NO: 2)
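  • The relationship between the guide and the NNGRRT PAM can be checked with a short script. This is a sketch under one assumption: the target sequence is taken from SEQ ID NO: 17 as quoted elsewhere in this document, and the helper name is hypothetical.

```python
import re

# ASSUMPTION: target DNA is SEQ ID NO: 17 as quoted elsewhere in this document.
TARGET_DNA = "CCACCAGTATGGACACTGTCCAAAGAGAAT"
SPACER_RNA = "CAGUAUGGACACUGUCCAAA"  # SEQ ID NO: 2

# IUPAC codes for the saCas9 PAM: N = any base, R = A or G.
PAM_NNGRRT = re.compile(r"[ACGT]{2}G[AG]{2}T")


def find_protospacer_with_pam(target: str, spacer_rna: str):
    """Return (0-based start, PAM) if the spacer is followed by an NNGRRT PAM."""
    spacer_dna = spacer_rna.replace("U", "T")
    start = target.find(spacer_dna)
    if start < 0:
        return None
    pam = target[start + len(spacer_dna): start + len(spacer_dna) + 6]
    return (start, pam) if PAM_NNGRRT.fullmatch(pam) else None


print(find_protospacer_with_pam(TARGET_DNA, SPACER_RNA))  # (4, 'GAGAAT')
```

  • The spacer reads the same as the target strand (with U in place of T), consistent with the statement that the gRNA hybridizes to the complement of the target sequence, and the six bases immediately 3' of it, GAGAAT, satisfy NNGRRT.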
  • TadA variants MSP605, MSP824, MSP825, MSP680, MSP828, and MSP829 were evaluated in vivo using a transgenic mouse model heterozygous for huG6PC, harboring the R83C mutation for Glycogen storage disease type 1a (GSD1a) (FIGs. 6B and 6C).
  • The use of saCas9 for efficient in vivo genome editing and exemplification of an saCas9 sgRNA scaffold are described in F. A. Ran et al. (2015, Nature, Vol. 520, pages 186-191).
  • FIG. 6A depicts the in vivo workflow used to introduce the base editors into the transgenic mice.
  • Lipid nanoparticles (LNP) carrying base editor mRNA and NLS-gRNA were dosed via intravenous (IV) injection into the transgenic mice at a dose of 1 mg/kg.
  • IV intravenous
  • Next-generation sequencing data from whole-liver extracts revealed significant correction for R83C (FIGs. 6B and 6C).
  • TadA variant MSP828 demonstrated about 40% precise correction of the R83C mutation, with low bystander editing. This level of mutation correction is expected to restore glucose homeostasis.
  • GSD1a is an autosomal recessive disorder caused by mutations in the G6PC gene.
  • the most prevalent pathogenic mutation identified in Caucasian GSD1a patients is R83C, located in the active site of the enzyme and associated with inactivation of G6Pase.
  • a loss of G6Pase function can result in life-threatening hypoglycemia, seizures and even death.
  • patients must maintain strict and frequent adherence to glucose supplementation through day and night, by way of a slow glucose release formula.
  • One missed or delayed dose can result in emergency hypoglycemia.
  • enlarged liver, accumulation of uric acid, lactate, and lipids are common in GSDla patients.
  • the R83C mutation introduces a single G>A conversion in the g6pc gene.
  • Adenine base editors as described herein effect the programmable conversion of A to G in genomic DNA, thus supporting their utility to correct this mutation.
  • the adenine base editor is a fusion protein containing an evolved TadA deaminase connected to CRISPR-Cas enzyme.
  • the base editor binds to target DNA that is complementary to the guide-RNA (superimposed on the CRISPR-Cas9 enzyme) and exposes a stretch of single -stranded DNA.
  • the deaminase converts the target adenine into inosine, and the Cas enzyme nicks the opposite strand, which is then repaired, completing the base pair conversion.
  • the direct repair of a point mutation has the potential for restoration of gene function.
  • Shown in FIG. 9A is the target DNA sequence (CCACCAGTATGGACACTGTCCAAAGAGAAT (SEQ ID NO: 17)) and the underlying amino acid translation for the GSDIa R83C mutation (WWYPCQGFLI; SEQ ID NO: 18).
  • The target nucleobase to be edited is represented by double underlining, at position 12.
  • The editing window also includes a possible bystander, represented by single underlining at position 6.
  • An edit that may result in a synonymous conversion is shown at position 10.
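  • The effect of an A-to-G edit on the encoded protein can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the disclosure: the helper functions are hypothetical, the standard genetic code is used, and the edited adenine (0-based index 14 of SEQ ID NO: 17) is chosen here because it pairs with the first base of the Cys codon on the complementary strand. The figure's own position numbering is not reproduced. The translation of the complementary strand, read in the orientation of the target strand, matches the peptide listed as SEQ ID NO: 18.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons of a DNA string, starting at its first base."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def a_to_g(dna, index):
    """Model the net outcome of adenine base editing at one position."""
    if dna[index] != "A":
        raise ValueError("base at the chosen index is not an adenine")
    return dna[:index] + "G" + dna[index + 1:]


# Target strand as given for SEQ ID NO: 17.
TARGET_DNA = "CCACCAGTATGGACACTGTCCAAAGAGAAT"
# Assumption: the adenine opposite the first base of the Cys codon.
EDIT_INDEX = 14

before = translate(reverse_complement(TARGET_DNA))
after = translate(reverse_complement(a_to_g(TARGET_DNA, EDIT_INDEX)))

print(before)        # ILFGQCPYWW (Cys at the mutated residue)
print(before[::-1])  # WWYPCQGFLI, as listed for SEQ ID NO: 18
print(after)         # ILFGQRPYWW (Cys restored to Arg)
```

  • In this sketch the single A-to-G change on the target strand converts the TGT (Cys) codon on the complementary strand to CGT (Arg), which corresponds to reverting R83C.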
  • A HEK293 cell line expressing the G6PC transgene harboring the R83C mutation was generated and transfected with base-editor mRNA and gRNA. Allele frequencies were assessed by high-throughput targeted amplicon next-generation sequencing. Variants 1-5 represent combinations of gRNA and base-editor RNA engineered for optimized target correction. Variant 5 yielded approximately 60% targeted base-editing efficiency for R83C correction and limited bystander editing (FIG. 9B).
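  • The allele-frequency readout described above can be illustrated with a short sketch. It is not the analysis pipeline used in the disclosure: the function name, the read set, and the two indices are hypothetical, and the toy reads are invented solely to show the arithmetic. Each aligned read is assigned to one of four classes according to whether the target adenine and a bystander adenine read as G, and each class is reported as a fraction of all reads.

```python
from collections import Counter


def editing_summary(reads, target_idx, bystander_idx, edited_base="G"):
    """Classify aligned reads by editing outcome and return class fractions."""
    if not reads:
        raise ValueError("no reads supplied")
    counts = Counter()
    for read in reads:
        target_edited = read[target_idx] == edited_base
        bystander_edited = read[bystander_idx] == edited_base
        if target_edited and not bystander_edited:
            counts["precise"] += 1
        elif target_edited and bystander_edited:
            counts["target_plus_bystander"] += 1
        elif bystander_edited:
            counts["bystander_only"] += 1
        else:
            counts["unedited"] += 1
    classes = ("precise", "target_plus_bystander", "bystander_only", "unedited")
    return {name: counts[name] / len(reads) for name in classes}


# Hypothetical toy data: the unedited read is the protospacer of SEQ ID NO: 2
# written as DNA; the indices below are assumptions made for illustration.
UNEDITED = "CAGTATGGACACTGTCCAAA"
TARGET_IDX = 10     # assumed target adenine (0-based)
BYSTANDER_IDX = 4   # assumed bystander adenine (0-based)

precise = UNEDITED[:TARGET_IDX] + "G" + UNEDITED[TARGET_IDX + 1:]
both = precise[:BYSTANDER_IDX] + "G" + precise[BYSTANDER_IDX + 1:]

reads = [UNEDITED] * 6 + [precise] * 3 + [both] * 1
summary = editing_summary(reads, TARGET_IDX, BYSTANDER_IDX)
print(summary)
```

  • With these invented reads, 3 of 10 fall in the precise class and 1 of 10 carries both edits; real analyses would additionally need alignment, quality filtering, and handling of indels, none of which is modeled here.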
  • A GSDIa mouse that expresses the human G6PC-R83C transgene in place of mouse G6pc was generated. It was confirmed that mice homozygous for huR83C exhibited postnatal lethality and rarely survived to weaning (21 days). On glucose supplementation therapy, the animals survived to at least 3 weeks of age and showed characteristic pathological signatures of GSDIa, such as reduced body weight, enlarged livers, significant G6Pase inhibition, and abnormal serum metabolites compared to littermate controls (FIG. 7). This phenotype is consistent with published and clinical reports in humans.
  • FIG. 6A depicts the in vivo workflow, with lipid nanoparticle (LNP) co-formulations of base-editor mRNA and gRNA dosed via IV injection.
  • LNP dosing was administered via the temporal vein shortly after birth, and activity was compared with that in adult mice.
  • Next Generation Sequencing (NGS) analysis of whole liver extracts revealed approximately 40% base-editing efficiency in adults and up to approximately 60% efficiency in newborns, with a broader range in efficiencies. Bystander editing remained low in adults and newborns (FIG. 11A).
  • LNP-mediated R83C correction was associated with the survival of the homozygous huR83C mice.
  • Hepatomegaly is another clinical presentation of GSDIa and is primarily caused by excess glycogen and lipid deposition in the liver.
  • Liver sections were collected from 3-week-old newborn mice, and immunohistochemical analysis was conducted via hematoxylin and eosin (H&E) and Oil Red O staining (FIG. 12B).
  • Single LNP dose administration maintains euglycemia during a 24-hour fasting challenge via base editing
  • A hallmark symptom of GSD-Ia pathology is fasting hypoglycemia, with a precipitous decline in blood glucose levels within minutes.
  • A full proof-of-concept study was conducted in GSD-Ia transgenic mice, homozygous for huG6PC-R83C, to test whether the animals could sustain a 24-hour (hr) fast after base-editing treatment as described herein. In this study, 100% animal survival was achieved after the 24-hr fasting period in LNP-treated (1.5 mg/kg) GSD-Ia animals and in healthy controls.
  • Normal fasting glucose levels were measured in control mice and in treated mice before and after the 24-hr fast, and these levels remained above the hypoglycemic therapeutic threshold (>60 mg/dL) (FIG. 13).
  • G6PC target sequences that can be used in conjunction with the base editors to effect base editing to correct the R83C mutation as described herein include those shown in Table 7.
  • The target sequences include the types of PAMs and base editors, such as IBEs as described herein, suitable for use.
  • The position of the targeted “A” nucleotide, i.e., A8-A15
  • G6PC gRNA sequences hybridize to the complement of the G6PC target sequence shown in Table 7.
  • The PAM sequences (e.g., for SpCas9) are underlined in Table 7.
  • Inlaid base editors (IBEs) noted in Table 7 refer to structures of Cas9 and TadA having an architecture in which the deaminase domains are internal to (embedded inside) a CRISPR-Cas protein, e.g., Cas9.
  • The IBE architecture allows for a greater breadth of potential base editing targets compared with other base editors and is not limited by the requirement of a suitably positioned Cas9 protospacer adjacent motif sequence.
  • Such IBEs exhibited shifted editing windows and greater editing efficiency, thus allowing for the editing of targets outside the canonical editing window with reduced DNA and RNA off-target editing frequency. Accordingly, IBEs expand the breadth of potential base editing targets by extending the range of editing windows that can be created for any given CRISPR-Cas protein used to target the DNA.
  • The active site of the deaminase can be repositioned, making IBEs capable of editing outside the traditional editing window.
  • IBE architectures are described hereinabove and in S. Haihua Chu et al., The CRISPR Journal, Vol. 4, No. 2; published online 20 April 2021 (DOI: 10.1089/crispr.2020.0144).
  • gRNA sequences which hybridize to the complement of the G6PC target sequence in Table 7 are as follows (5' to 3'): CCACCAGUAUGGACACUGUC (SEQ ID NO:

PCT/US2022/074041 2021-07-23 2022-07-22 Guide rnas for crispr/cas editing systems Ceased WO2023004409A1 (en)

Priority Applications (7)

Application Number Priority Date Filing Date Title
CN202280060813.2A CN117916373A (zh) 2021-07-23 2022-07-22 用于crispr/cas编辑系统的引导rna
KR1020247005580A KR20240037299A (ko) 2021-07-23 2022-07-22 Crispr/cas 편집 시스템용 가이드 rna
AU2022313315A AU2022313315A1 (en) 2021-07-23 2022-07-22 Guide rnas for crispr/cas editing systems
JP2024504461A JP2024529425A (ja) 2021-07-23 2022-07-22 CRISPR/Cas編集系のためのガイドRNA
EP22761858.4A EP4373931A1 (en) 2021-07-23 2022-07-22 Guide rnas for crispr/cas editing systems
CA3226664A CA3226664A1 (en) 2021-07-23 2022-07-22 Guide rnas for crispr/cas editing systems
US18/418,751 US20240301405A1 (en) 2021-07-23 2024-01-22 Guide rnas for crispr/cas editing systems

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163225322P 2021-07-23 2021-07-23
US63/225,322 2021-07-23
US202163255927P 2021-10-14 2021-10-14
US63/255,927 2021-10-14

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/418,751 Continuation US20240301405A1 (en) 2021-07-23 2024-01-22 Guide rnas for crispr/cas editing systems

Publications (1)

Publication Number Publication Date
WO2023004409A1 (en) 2023-01-26

Family

ID=83149172

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/074041 Ceased WO2023004409A1 (en) 2021-07-23 2022-07-22 Guide rnas for crispr/cas editing systems

Country Status (7)

Country Link
US (1) US20240301405A1
EP (1) EP4373931A1
JP (1) JP2024529425A
KR (1) KR20240037299A
AU (1) AU2022313315A1
CA (1) CA3226664A1
WO (1) WO2023004409A1

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024220980A3 (en) * 2023-04-20 2024-12-05 The Schepens Eye Research Institute, Inc. Dscam inhibitory nucleic acids and methods of use thereof
WO2025087411A1 (zh) * 2023-10-25 2025-05-01 尧唐(上海)生物科技有限公司 一种脱氨酶、包含其的碱基编辑器及其应用
EP4229195A4 (en) * 2020-10-14 2025-08-13 Beam Therapeutics Inc COMPOSITIONS AND METHODS FOR TREATING GLYCOGEN STORAGE DISEASE TYPE 1A
US12390538B2 (en) 2023-05-15 2025-08-19 Nchroma Bio, Inc. Compositions and methods for epigenetic regulation of HBV gene expression

Patent Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4880635B1 (en) 1984-08-08 1996-07-02 Liposome Company Dehydrated liposomes
US4880635A (en) 1984-08-08 1989-11-14 The Liposome Company, Inc. Dehydrated liposomes
US4921757A (en) 1985-04-26 1990-05-01 Massachusetts Institute Of Technology System for delayed and pulsed release of biologically active substances
US4920016A (en) 1986-12-24 1990-04-24 Linear Technology, Inc. Liposomes with enhanced circulation time
US4906477A (en) 1987-02-09 1990-03-06 Kabushiki Kaisha Vitamin Kenkyusyo Antineoplastic agent-entrapping liposomes
US4911928A (en) 1987-03-13 1990-03-27 Micro-Pak, Inc. Paucilamellar lipid vesicles
US4917951A (en) 1987-07-28 1990-04-17 Micro-Pak, Inc. Lipid vesicles formed of surfactants and steroids
US5846946A (en) 1996-06-14 1998-12-08 Pasteur Merieux Serums Et Vaccins Compositions and methods for administering Borrelia DNA
US8454972B2 (en) 2004-07-16 2013-06-04 The United States Of America, As Represented By The Secretary, Department Of Health And Human Services Method for inducing a multiclade immune response against HIV utilizing a multigene and multiclade immunogen
US8404658B2 (en) 2007-12-31 2013-03-26 Nanocor Therapeutics, Inc. RNA interference for the treatment of heart failure
WO2013176772A1 (en) 2012-05-25 2013-11-28 The Regents Of The University Of California Methods and compositions for rna-directed target dna modification and for rna-directed modulation of transcription
WO2017045381A1 (zh) 2015-09-18 2017-03-23 京东方科技集团股份有限公司 一种拼接屏
WO2017058751A1 (en) * 2015-09-28 2017-04-06 North Carolina State University Methods and compositions for sequence specific antimicrobials
WO2017070632A2 (en) 2015-10-23 2017-04-27 President And Fellows Of Harvard College Nucleobase editors and uses thereof
WO2018027078A1 (en) 2016-08-03 2018-02-08 President And Fellows Of Harvard College Adenosine nucleobase editors and uses thereof
WO2020028254A1 (en) * 2018-07-30 2020-02-06 Sarepta Therapeutics, Inc. Trimeric peptides for antisense delivery
WO2020049975A1 (ja) 2018-09-03 2020-03-12 株式会社オートネットワーク技術研究所 回路構造体及び回路構造体の製造方法
WO2021061636A1 (en) * 2019-09-23 2021-04-01 Flagship Pioneering Innovations V, Inc. Modulating genomic complexes
WO2021127650A1 (en) * 2019-12-19 2021-06-24 Entrada Therapeutics, Inc. Compositions for delivery of antisense compounds

Non-Patent Citations (34)

* Cited by examiner, † Cited by third party
Title
"Controlled Drug Bioavailability, Drug Product Design and Performance", 1984, WILEY
"GenBank", Database accession no. U01120.1
"Medical Applications of Controlled Release", 1974, CRC PRESS
"NCBI", Database accession no. AAA16222.1
A. RAN ET AL., NATURE, vol. 520, 2015, pages 186 - 191
ALTSCHUL ET AL., METHODS IN ENZYMOLOGY
ALTSCHUL ET AL., NUCLEIC ACIDS RES., vol. 25, 1997, pages 3389 - 3402
ALTSCHUL ET AL.: "Basic local alignment search tool", J. MOL. BIOL., vol. 215, no. 3, 1990, pages 403 - 410, XP002949123, DOI: 10.1006/jmbi.1990.9999
ARNAOTOVA ET AL., MOL. THERAPY, vol. 29, no. 4, 2021
BAXEVANIS ET AL.: "Bioinformatics : A Practical Guide to the Analysis of Genes and Proteins", 1998, WILEY
BUCHWALD ET AL., SURGERY, vol. 88, 1980, pages 507
DURING ET AL., ANN. NEUROL., vol. 25, 1989, pages 351
GAUDELLI, N.M. ET AL.: "Programmable base editing of A•T to G•C in genomic DNA without DNA cleavage", NATURE, vol. 551, 2017, pages 464 - 471
GRIMM, D. ET AL., J. VIROL., vol. 82, 2008, pages 5887 - 5911
HOWARD, J. NEUROSURG., vol. 71, 1989, pages 105
KOMOR, A.C. ET AL.: "Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage", NATURE, vol. 533, 2016, pages 420 - 424, XP055551781, DOI: 10.1038/nature17946
KOMOR, A.C. ET AL.: "Science Advances", vol. 3, 2017, article "Improved base excision repair inhibition and bacteriophage Mu Gam protein yields C:G-to-T:A base editors with higher efficiency and product purity", pages: eaao4774
LANGER, SCIENCE, vol. 249, 1990, pages 1527 - 1533
LEE, Y.M., JUN, H.S., PAN, C.-J., LIN, S.R., WILSON, L.H., MANSFIELD, B.C., AND CHOU, J.Y.: "Prevention of hepatocellular adenoma and correction of metabolic abnormalities in murine glycogen storage disease type Ia by gene therapy", HEPATOLOGY, vol. 56, 2012, pages 1719 - 1729
LEI, K.-J. ET AL., NATURE GENETICS, vol. 13, no. 2, 1996, pages 203 - 9
LEVY ET AL., SCIENCE, vol. 228, 1985, pages 190
M. RICHTER ET AL., NATURE BIOTECHNOLOGY, 2020
NAKAMURA, Y. ET AL.: "Codon usage tabulated from the international DNA sequence databases: status for the year 2000", NUCL. ACIDS RES., vol. 28, 2000, pages 292, XP002941557, DOI: 10.1093/nar/28.1.292
NAT. GENET., vol. 13, pages 203 - 209
RANGER, PEPPAS, MACROMOL. SCI. REV. MACROMOL. CHEM., vol. 23, 1983, pages 61
REES, H.A. ET AL.: "Base editing: precision chemistry on the genome and transcriptome of living cells", NAT REV GENET., vol. 19, no. 12, December 2018 (2018-12-01), pages 770 - 788
REES, LIU, NATURE REVIEWS GENETICS, vol. 19, no. 12, 2018, pages 770 - 788
S. HAIHUA CHU ET AL., THE CRISPR JOURNAL, vol. 4, no. 2, 20 April 2021 (2021-04-20)
SAUDEK ET AL., N. ENGL. J. MED., vol. 321, 1989, pages 574
SEFTON, CRC CRIT. REF. BIOMED. ENG., vol. 14, 1989, pages 201
WAHL, G. M., BERGER, S. L., METHODS ENZYMOL., vol. 152, 1987, pages 507
ZHANG Y. P., GENE THER., vol. 132, 1999, pages 1438 - 47
ZURIS, J.A. ET AL., NAT. BIOTECHNOLOGY, vol. 33, no. 1, 2015, pages 73 - 80

Also Published As

Publication number Publication date
AU2022313315A1 (en) 2024-02-08
US20240301405A1 (en) 2024-09-12
CA3226664A1 (en) 2023-01-26
KR20240037299A (ko) 2024-03-21
JP2024529425A (ja) 2024-08-06
EP4373931A1 (en) 2024-05-29

Similar Documents

Publication Publication Date Title
JP7826199B2 (ja) 合成ガイドrna、組成物、方法、およびそれらの使用
US20240301405A1 (en) Guide rnas for crispr/cas editing systems
US20240167008A1 (en) Novel crispr enzymes, methods, systems and uses thereof
CA3110998A1 (en) Rna and dna base editing via engneered adar recruitment
US20230383277A1 (en) Compositions and methods for treating glycogen storage disease type 1a
US20240376468A1 (en) CIRCULAR GUIDE RNAs FOR CRISPR/CAS EDITING SYSTEMS
US20230279373A1 (en) Novel crispr enzymes, methods, systems and uses thereof
US20240327813A1 (en) Crispr enzymes, methods, systems and uses thereof
US20240252550A1 (en) Genetic modification of hepatocytes
CN117916373A (zh) 用于crispr/cas编辑系统的引导rna
US20250288690A1 (en) Rna base editing compositions, systems, methods and uses thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 22761858, Country of ref document: EP, Kind code of ref document: A1)
WWE Wipo information: entry into national phase (Ref document number: 2022313315, Country of ref document: AU; Ref document number: 3226664, Country of ref document: CA; Ref document number: AU2022313315, Country of ref document: AU)
ENP Entry into the national phase (Ref document number: 2024504461, Country of ref document: JP, Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2022313315, Country of ref document: AU, Date of ref document: 20220722, Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 20247005580, Country of ref document: KR, Kind code of ref document: A)
WWE Wipo information: entry into national phase (Ref document number: 1020247005580, Country of ref document: KR)
WWE Wipo information: entry into national phase (Ref document number: 2022761858, Country of ref document: EP)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2022761858, Country of ref document: EP, Effective date: 20240223)
WWE Wipo information: entry into national phase (Ref document number: 202280060813.2, Country of ref document: CN)