WO2021189110A1 - DNA altering proteins and uses therefor

DNA altering proteins and uses therefor

Publication number
WO2021189110A1
Authority
WO
WIPO (PCT)
Prior art keywords
deaminase
domain
fusion protein
sequence
amino acid
Application number
PCT/AU2021/050269
Other languages
French (fr)
Inventor
Alex William Hewitt
Minh Thuan Nguyen TRAN
Original Assignee
University Of Tasmania
Priority claimed from AU2020900913A
Application filed by University Of Tasmania
Publication of WO2021189110A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62 DNA sequences coding for fusion proteins
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/305 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Micrococcaceae (F)
    • C07K14/31 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Micrococcaceae (F) from Staphylococcus (G)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • C12N15/864 Parvoviral vectors, e.g. parvovirus, densovirus
    • C12N15/8645 Adeno-associated virus
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y301/00 Hydrolases acting on ester bonds (3.1)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y305/00 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y305/00 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C12Y305/04002 Adenine deaminase (3.5.4.2)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y305/00 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C12Y305/04004 Adenosine deaminase (3.5.4.4)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y305/00 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C12Y305/04005 Cytidine deaminase (3.5.4.5)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P27/00 Drugs for disorders of the senses
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
    • C12N2750/00011 Details
    • C12N2750/14011 Parvoviridae
    • C12N2750/14111 Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141 Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector

Definitions

  • the present disclosure relates generally to base editing proteins comprising a Cas9 domain and a deaminase domain, and uses therefor.
  • the proteins of the present disclosure are adenosine base editors (ABEs) and cytidine base editors (CBEs) with improved on-target efficiency and precision, useful for clinical applications.
  • ABEs adenosine base editors
  • CBEs cytidine base editors
  • CRISPR/Cas Clustered regularly interspaced short palindromic repeats/CRISPR-associated protein
  • This specific and adaptable method for genome engineering typically utilizes a two-component system consisting of a Cas endonuclease and guide RNA (gRNA), which can be designed to target essentially any genomic locus and generate double-strand breaks (DSBs). These DSBs are subsequently repaired via the non-homologous end-joining (NHEJ) pathway or the homology-directed repair (HDR) pathway, thereby editing the genome.
  • NHEJ non-homologous end-joining
  • HDR homology-directed repair
  • CBEs direct cytidine-to-thymidine nucleotide conversions at a user-defined guide sequence (i.e., sgRNA) and comprise a cytidine deaminase derived from vertebrate or invertebrate systems
  • sgRNA user-defined guide sequence
  • ABEmax current generation ABEs
  • ABEmax employs a dimerized, codon-optimized variant of laboratory-evolved ecTadA to direct site-specific adenosine-to-guanosine nucleotide conversions in a diverse array of systems.
  • ABEs have a significant off-target footprint on the transcriptome and give rise to missense and nonsense mutations.
  • the present disclosure is predicated, in part, on the surprising finding that the activity profile of base editing proteins can be improved by manipulating the secondary structure of Cas9.
  • the strategic use of circular permutation and protein-domain insertion can be used to calibrate the DNA and RNA footprint of base editing proteins based on a model of “best-fit” between the overall reach of the deaminating catalytic pocket and the nucleotide substrate.
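As a rough illustration of the two rearrangements named above, the following Python sketch treats a protein as a plain string of one-letter residue codes. The function names, the toy sequences, the optional linkers, and the convention that an insert is placed immediately after the named residue are assumptions made for illustration only; they are not taken from the disclosure.

```python
def circular_permutation(seq: str, new_start: int, linker: str = "") -> str:
    """Re-order `seq` so that 1-based residue `new_start` becomes the new N-terminus.

    The original C-terminus is joined to the original N-terminus through `linker`.
    """
    if not 1 <= new_start <= len(seq):
        raise ValueError("new_start must be a 1-based position within seq")
    i = new_start - 1
    return seq[i:] + linker + seq[:i]


def intradomain_insertion(host: str, insert: str, after_residue: int,
                          linker_n: str = "", linker_c: str = "") -> str:
    """Place `insert` inside `host`, directly after 1-based residue `after_residue`."""
    if not 1 <= after_residue < len(host):
        raise ValueError("after_residue must leave host residues on both sides")
    return (host[:after_residue] + linker_n + insert + linker_c
            + host[after_residue:])


# Toy sequences (hypothetical, not real protein sequences).
host = "ABCDEFGH"
permuted = circular_permutation(host, new_start=4, linker="gg")
inserted = intradomain_insertion(host, insert="xyz", after_residue=3)
```

In this sketch a circular permutation keeps every host residue but changes which one comes first, whereas an intradomain insertion keeps the host termini and splits the host chain around the inserted domain.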
  • microABE I744 a Staphylococcus aureus Cas9 nickase (SaCas9n)-intradomain variant with an insertion of the miniABEmax (V82G) base editor at residue I744
  • microAIDx I744 a Staphylococcus aureus Cas9 nickase (SaCas9n)-intradomain variant with an insertion of the AIDx base editor at residue I744; both microABE I744 and microAIDx I744 exhibit robust on-target editing and a reduced RNA signature on the transcriptome.
  • microABE I744 has also been demonstrated to effectively correct the highly penetrant PCDH15 Arg245Ter variant, which causes type 1 Usher syndrome.
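The logic of correcting the PCDH15 Arg245Ter variant with an adenosine base editor can be checked with a few lines of Python. This is a minimal sketch that relies only on the variant notation used in this document (c.733C>T, Arg245Ter) and the standard genetic code; the conclusion that the affected codon is CGA is deduced here rather than stated in the disclosure, and the helper names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def codon_position(cdna_pos: int) -> tuple:
    """Map a 1-based coding-sequence position to (codon number, 0-based offset)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


codon_number, offset = codon_position(733)

# Arginine codons that turn into a stop codon when the base at `offset` changes C>T.
candidates = sorted(
    codon for codon in ARG_CODONS
    if codon[offset] == "C"
    and (codon[:offset] + "T" + codon[offset + 1:]) in STOP_CODONS
)

# The mutant T on the sense strand pairs with an A on the antisense strand.
mutant_sense = "TGA"
mutant_antisense = reverse_complement(mutant_sense)
# Adenosine deamination of that A (read as G) restores the arginine codon.
edited_antisense = mutant_antisense[:2] + "G"
restored_sense = reverse_complement(edited_antisense)
```

Position 733 is the first base of codon 245, and CGA is the only arginine codon that a C>T change at that base converts into a stop codon, so an A-to-G edit on the opposite strand reverts the stop codon to arginine.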
  • the present disclosure provides an isolated fusion protein comprising: a. a Cas9 domain comprising the amino acid sequence of SEQ ID NO: 1, or an amino acid sequence which is at least 80% identical to the amino acid sequence of SEQ ID NO: 1, wherein the Cas9 domain when in conjunction with a bound guide RNA (gRNA) specifically binds to a target nucleic acid sequence; and b. a deaminase domain, wherein the deaminase domain deaminates a nucleotide base in a single-stranded portion of the target nucleic acid sequence when in conjunction with the Cas9 domain and the gRNA, wherein the deaminase domain is positioned between amino acid residues 119 and 132 or 730 and 745 of the amino acid sequence of SEQ ID NO: 1.
  • gRNA guide RNA
  • the present disclosure provides an isolated polynucleotide encoding the fusion protein as disclosed herein.
  • the present disclosure provides a vector comprising the polynucleotide as disclosed herein.
  • the present disclosure provides a complex comprising the fusion protein as disclosed herein and a gRNA bound to the Cas9 domain of the fusion protein.
  • the present disclosure provides a method to deaminate an adenosine nucleotide to an inosine nucleotide comprising contacting a nucleic acid molecule with the fusion protein as disclosed herein and a gRNA, wherein the gRNA comprises a sequence of at least 10 contiguous nucleotides that are complementary to a target sequence in the genome of an organism comprising one or more adenosine or thymidine nucleotides.
  • the present disclosure provides a method to deaminate a cytidine nucleotide to a uracil nucleotide comprising contacting a nucleic acid molecule with the fusion protein as disclosed herein and a gRNA, wherein the gRNA comprises a sequence of at least 10 contiguous nucleotides that are complementary to a target sequence in the genome of an organism comprising one or more cytidine or guanosine nucleotides.
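The sequence-level outcome of the two deamination methods above can be sketched as follows. Adenosine deamination yields inosine, which is read as guanosine, and cytidine deamination yields uracil, which is read as thymidine. The function name, the example strand, and the editing window boundaries are hypothetical; the disclosure does not specify them here.

```python
# Read-out of each deamination product after replication or sequencing.
CONVERSIONS = {"A": "G", "C": "T"}


def deaminate(strand: str, base: str, window: tuple) -> str:
    """Convert every `base` inside the 1-based inclusive `window` of `strand`.

    `window` is an assumed editing window; positions outside it are untouched.
    """
    if base not in CONVERSIONS:
        raise ValueError("base must be 'A' (adenosine) or 'C' (cytidine)")
    start, end = window
    product = CONVERSIONS[base]
    return "".join(
        product if (b == base and start <= pos <= end) else b
        for pos, b in enumerate(strand, start=1)
    )


exposed_strand = "GACCATGACT"          # hypothetical single-stranded stretch
abe_result = deaminate(exposed_strand, "A", window=(4, 8))
cbe_result = deaminate(exposed_strand, "C", window=(1, 4))
```

Only bases of the targeted type that fall inside the window change, which is why the breadth of the window matters for both on-target efficiency and unwanted bystander edits.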
  • Figure 1 is a schematic representation of the SpCas9 constructs.
  • Figure 2 is a schematic representation of SpCas9n with key residues for permutant variants.
  • S. pyogenes Cas9 (PDB: 4OO8) is shown complexed with target DNA (red) and gRNA (orange).
  • Key residues for circular permutation (1010, magenta; 1029, cyan; 1058, green) or intradomain insertion (202, yellow; 208, navy; 468, black; 1058, green) of the hAIDx protein domain are shown. Numbering is considered from the starting methionine at position 1. BH, bridge helix.
  • Figure 3 is a graphical representation of the cytosine conversion efficiency (%, y-axis) of TAM-AIDx after circular permutation (CP) or intradomain (ID) insertion into SpCas9 constructs targeting the YFP locus (x-axis).
  • Figure 4 shows that SpCas9 is amenable to intradomain insertion of hAIDx protein at residue 1058.
  • (A) A graphical representation of the total cytosine conversion efficiency (%, y-axis) of TAM-AIDx after CP or ID insertion into SpCas9 constructs targeting the YFP locus (x-axis);
  • (B) A schematic representation of a nucleotide quilt of the N-terminal linked BE3 or TAM-AIDx and after CP or ID insertion into SpCas9 constructs targeting the YFP locus.
  • Figure 5 is a schematic representation of YFP locus tables from genomic DNA isolated from HEK293A-YFP cells after editing with SpCas9n, where the hAIDx has been linked to the C-terminal, inserted in the PI domain (at residue 1058) or linked to a circularly permuted SpCas9n (at residue 1058).
  • Figure 6 is a graphical representation of the on-target activity of various permutations of CBEs targeting the YFP locus as shown by reference to total cytosine conversion efficiency (%, y-axis) and nucleotide position (x-axis).
  • Figure 7 is a graphical representation of the on-target activity of various permutations of ABEs targeting the (A) YFP and (B) ABE16 loci as shown by reference to adenosine-to-guanine conversion efficiency (%, y-axis) and nucleotide position (x-axis).
  • Figure 8 is a series of graphical representations of the on-target activity of various permutations of ABEs (x-axis) targeting the (A) DNAJB, (B) MTA2, (C) PTBP2, (D) SAP30BP, (E) LCMT1, and (F) SCAP loci as shown by reference to adenosine-to-inosine conversion efficiency (%, y-axis).
  • Figure 9 is a heat map showing the localized, off-target profile of SpCas9 ABEmax permutants across promiscuous RNA transcripts. Average editing across the transcript was considered for adenosine-to-inosine, including that of the position of the highest-edited adenosine in the amplicon.
  • Figure 10 is a graphical representation of on-target editing efficiencies for various SpCas9 ABEmax permutants with sgRNA targeting the (A) ABE16 and (B) YFP loci.
  • Figure 11 is a heat map showing the off-target profile of different permutants of SpCas9 ABEs. Average editing across the transcript was considered for adenosine-to-inosine, including that of the position of the highest-edited adenosine in the amplicon.
  • Figure 12 is a schematic representation of S. aureus Cas9 (PDB: 5CZZ) with the corresponding target DNA (red) and gRNA (orange) complex. Residues marked in yellow or green demarcate amino acid positions 119-132 or 730-745 of SEQ ID NO: 1, respectively.
  • Figure 13 is a graphical representation of the base editing activity for each intradomain-inserted miniABEmax (V82G) SaCas9.
  • Figure 14 is a series of graphical representations of the base editing activity for each intradomain-inserted hAIDx SaCas9n.
  • (A) sgRNA targeting HEKsite4;
  • (B) sgRNA targeting ABE5;
  • (C) sgRNA targeting ABE9. Heatmap data points are presented as the average over three technical replicates.
  • Figure 16 is a graphical representation of the on-target editing efficiencies of intradomain SaCas9 constructs as shown by reference to adenine-to-guanine conversion efficiency (%, y-axis) and nucleotide position (x-axis) at various example loci.
  • Figure 17 is a heat map showing the off-target profile of SaCas9n constructs. Average editing across the transcript was considered for adenosine-to-inosine, including that of the position of the highest-edited adenosine in the amplicon.
  • Figure 18 is a series of graphical representations of the off-target activity of Sa-ABEmax, Sa-miniABEmax (V82G), G129, N730 and I744 (microABE I744) (x-axis) targeting the (A) DNAJB, (B) MTA2, (C) PTBP2, (D) SAP30BP, (E) LCMT1, and (F) SCAP loci as shown by reference to adenine-to-inosine conversion efficiency (%, y-axis).
  • Figure 19 is a graphical representation of the transcriptomic profiling of microABE I744 (ID744 miniABEmax [V82G]), Sa-ABEmax (N-terminal ABEmax), and Sa-miniABEmax (V82G) (N-terminal miniABEmax [V82G]) showing the number of reads containing adenosine-to-inosine editing (y-axis) and chromosomal position (x-axis) in HEK293A-YFP cells.
  • Figure 20 is a heat map showing the localized, off-target profile of SaCas9 ABE permutants across promiscuous RNA transcripts. Average editing across the transcript was considered for adenosine-to-inosine, including that of the position of the highest-edited adenosine in the amplicon.
  • Figure 21 shows that microABE I744 has high editing efficiency and lower mRNA off-target effects as compared to the SaCas9-variant of miniABEmax (V82G).
  • (A) A graphical representation of the editing profile of miniABEmax (V82G) compared to microABE with a sgRNA targeting the PCDH15 Arg245Ter variant (NM_033056.4:c.733C>T) or LacZ;
  • (B) A heat map showing the off-target events at several promiscuous RNA transcripts for constructs targeting the PCDH15 Arg245Ter variant.
  • Figure 22 shows AAV-mediated delivery of a single construct containing Intradomain-SaCas9n I744 (microABE) and a sgRNA targeting either ABE site 11 or LacZ. Proportions of A-to-G conversions are presented as an average of three technical replicates, for either the AAV-7m8 or AAV-DJ serotypes, in HEK293A-YFP cells.
  • Figure 23 shows the annotated amino acid sequence of microABE I744.
  • Figure 24 shows the annotated amino acid sequence of microAIDx I744.
  • a distinct advantage of the fusion proteins described herein is that the polynucleotide sequences encoding the fusion proteins are small enough to fit comfortably within vectors that are typically used in gene therapy (e.g., AAV vector). Further, as a result of the broad editing window, robust on-target editing, and reduced off-target signature, the fusion proteins described herein are also suitable for therapeutic applications.
  • base editing refers to a genome editing method that enables direct, irreversible conversion of one nucleotide to another at a target genomic locus without requiring double-stranded breaks (DSBs), homology-directed repair (HDR) processes, or donor DNA templates.
  • DSBs double-stranded breaks
  • ABE adenosine base editor
  • A-T to G-C the adenosine-to-guanosine base pair conversion mediated by an ABE
  • ABEs can be extended to mediate the conversion of thymidine-to-cytidine (i.e., T-A to C-G) on the non-targeted strand (i.e., sense strand) of the target nucleic acid sequence.
  • CBE cytidine base editor
  • nucleotide refers to the nucleotides adenosine, guanosine, cytidine, thymidine and uridine, each of which comprises a nucleotide base attached to a ribose ring.
  • the terms “adenosine”, “guanosine”, “cytidine”, “thymidine” and “uridine” may be used interchangeably herein with the terms “adenine” or “A”, “guanine” or “G”, “cytosine” or “C”, “thymine” or “T” and “uracil” or “U”, respectively, which refer to the nucleotide base comprised by the nucleotides.
  • Base editing proteins are used in conjunction with components of clustered regularly interspaced short palindromic repeats/CRISPR-associated protein (CRISPR/Cas) systems to directly introduce point mutations into cellular DNA without making DSBs.
  • CRISPR Clustered regularly interspaced short palindromic repeat
  • Cas CRISPR-associated protein
  • RNA is transcribed from a portion of the CRISPR locus that includes the viral sequence. That RNA, which contains sequence complementarity to the viral genome, mediates targeting of a Cas endonuclease to the sequence in the viral genome. The Cas endonuclease cleaves the viral target sequence to prevent integration or expression of the viral sequence.
  • the present disclosure provides an isolated fusion protein comprising: a. a Cas9 domain comprising the amino acid sequence of SEQ ID NO: 1, or an amino acid sequence which is at least 80% identical to the amino acid sequence of SEQ ID NO: 1, wherein the Cas9 domain when in conjunction with a bound guide RNA (gRNA) specifically binds to a target nucleic acid sequence; and b. a deaminase domain, wherein the deaminase domain deaminates a nucleotide base in a single-stranded portion of the target nucleic acid sequence when in conjunction with the Cas9 domain and the gRNA, wherein the deaminase domain is positioned between amino acid residues 119 and 132 or 730 and 745 of the amino acid sequence of SEQ ID NO: 1.
  • isolated with reference to a protein, means that the protein is substantially free of cellular material or other contaminating proteins from the cells from which the protein is derived (and thus altered from its natural state), or substantially free from chemical precursors or other chemicals when chemically synthesized, and thus altered from its natural state.
  • the terms “protein”, “peptide” and “polypeptide” are used interchangeably herein to refer to a polymer of amino acid residues linked together by peptide (amide) bonds.
  • the terms refer to a protein, peptide, or polypeptide of any size, structure or function.
  • fusion protein as used herein relates to a protein comprising two or more heterologous regions or domains not found operably linked in nature.
  • Cas9 refers to an RNA-guided nuclease comprising a Cas9 protein, or a fragment thereof (e.g., a protein comprising an active, inactive, or partially active DNA cleavage domain of Cas9 and/or the gRNA binding domain of Cas9).
  • Cas9 nuclease sequences would be known to persons skilled in the art, illustrative examples of which are described by, for example, Ferretti et al. (2001, Proceedings of the National Academy of Sciences U.S.A., 98: 4658-4663), Deltcheva et al. (2011, Nature, 471: 602-607), and Jinek et al. (2012, Science, 337: 816-821).
  • the Cas9 domain comprises an amino acid sequence of the Staphylococcus aureus Cas9 (SaCas9).
  • the Cas9 domain comprises a catalytically impaired Cas9 nuclease.
  • Methods for generating catalytically impaired Cas9 proteins or fragments thereof would be known to persons skilled in the art, an illustrative example of which includes the introduction of mutations in the HNH nuclease subdomain or the RuvC1 subdomain of the DNA cleavage domain of Cas9, as described by Jinek et al. (2012, supra).
  • the Cas9 domain comprises a Cas9 nickase.
  • the Cas9 domain comprises a Staphylococcus aureus Cas9 nickase (SaCas9n).
  • the Cas9 domain comprises the amino acid sequence of SEQ ID NO: 1, or an amino acid sequence which is at least 80% identical to the amino acid sequence of SEQ ID NO: 1. Accordingly, the sequence may be at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 1.
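To make the "at least 80% identical" criterion concrete, the sketch below computes percent identity over two sequences that have already been aligned to equal length, with "-" marking gaps. The disclosure does not state which alignment method or denominator is intended, so the choice made here (identical columns divided by all alignment columns) and the function names are assumptions for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Inputs must be pre-aligned (equal length, '-' for gaps). Columns that are
    gaps in both sequences are ignored; a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no comparable alignment columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True when the percent identity is at or above `threshold`."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned sequences (hypothetical, not SEQ ID NO: 1).
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKV"
identity = percent_identity(reference, variant)
```

Other conventions, such as dividing by the length of the shorter sequence, give different values for gapped alignments, so the convention should be fixed before comparing a candidate against a threshold.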
  • gRNA guide RNA
  • gRNA refers to an RNA sequence that is complementary to a target DNA and directs a CRISPR endonuclease to the target nucleic acid sequence.
  • gRNA comprises a CRISPR RNA (crRNA) and a tracr RNA (tracrRNA).
  • crRNA is a 17-20 nucleotide sequence that is complementary to the target nucleic acid sequence, while the tracrRNA provides a binding scaffold for the endonuclease.
  • crRNA and tracrRNA exist in nature as two separate RNA molecules, which have been adapted for molecular biology techniques using, for example, 2-piece gRNAs such as CRISPR tracer RNAs (cr:tracrRNAs).
  • single-guide RNA or “sgRNA” refer to a single RNA sequence that comprises the crRNA fused to the tracrRNA.
  • gRNA describes all CRISPR guide formats, including two separate RNA molecules or a single RNA molecule.
  • sgRNA will be understood to refer to single RNA molecules combining the crRNA and tracrRNA elements into a single nucleotide sequence.
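The relationship between the crRNA, the tracrRNA and an sgRNA described above can be sketched in Python. The 17-20 nucleotide length check comes from the text; the placement of the crRNA at the 5' end, the function names, and the scaffold string (a run of "N" placeholders rather than a real tracrRNA sequence) are assumptions for illustration.

```python
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Placeholder only: a real tracrRNA-derived scaffold sequence is not given here.
PLACEHOLDER_SCAFFOLD = "N" * 10


def crrna_for_target(target_strand: str) -> str:
    """Return the RNA (5'->3') complementary to a DNA strand written 5'->3'."""
    return "".join(DNA_TO_RNA_COMPLEMENT[b] for b in reversed(target_strand))


def build_sgrna(crrna: str, scaffold: str = PLACEHOLDER_SCAFFOLD) -> str:
    """Fuse a crRNA to a scaffold, enforcing the 17-20 nt length from the text."""
    if not 17 <= len(crrna) <= 20:
        raise ValueError("crRNA must be 17-20 nucleotides long")
    return crrna + scaffold


target = "ATGCATGCATGCATGCATGC"       # hypothetical 20-nt target strand
crrna = crrna_for_target(target)
sgrna = build_sgrna(crrna)
```

A two-piece gRNA would keep `crrna` and the scaffold as separate molecules, while the sgRNA format joins them into one RNA.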
  • the gRNA is a single-guide RNA (sgRNA).
  • deaminase domain refers to a protein that deaminates a nucleotide base in a single-stranded portion of the target nucleic acid sequence when in conjunction with the Cas9 domain and the gRNA.
  • the deaminase domains described herein may be a naturally occurring deaminase or a synthetic (i.e., engineered or laboratory-evolved) deaminase. In some embodiments, the deaminase domain is a variant of a naturally occurring deaminase that does not occur in nature.
  • deaminase domain as used herein is not intended to be limiting and encompasses genetically engineered deaminases that may comprise genetic modifications (e.g., one or more mutations) that result in a variant deaminase having an amino acid sequence comprising one or more changes relative to a wild-type counterpart deaminase.
  • the deaminase domain is an adenosine deaminase domain.
  • Suitable adenosine deaminase domains would be known to persons skilled in the art, illustrative examples of which include the adenosine deaminase enzymes described by Gaudelli et al. (2017, Nature, 551: 464-471).
  • the adenosine deaminase domain is a tRNA-specific adenosine deaminase (TadA) monomer.
  • the TadA monomer is selected from the group consisting of an evolved TadA monomer and a wild-type TadA monomer.
  • the term “evolved” or “laboratory-evolved” as used herein refers to the directed protein evolution of protein domains (e.g., adenosine deaminase domains) to introduce new protein functions into an existing protein (i.e., an enzyme).
  • Methods for protein evolution would be known to persons skilled in the art, illustrative examples of which include the phage-assisted continuous evolution (PACE) system as described in WO 2019/023680.
  • the TadA monomer is from Escherichia coli (ecTadA).
  • the adenosine deaminase domain is an evolved TadA monomer from Escherichia coli (ecTadA).
  • the evolved ecTadA monomer comprises an amino acid sequence that is at least 80% identical to the amino acid sequence of SEQ ID NO: 2.
  • the deaminase domain is a cytidine deaminase domain.
  • Suitable cytidine deaminase domains would be known to persons skilled in the art, illustrative examples of which include rat APOBEC1 (rAPOBEC1) as described by Komor et al. (2016, Nature, 533: 420-424), human activation-induced deaminase (hAID) as described by Hess et al. (2016, Nature Methods, 13: 1036-104) and Ma et al. (2016, Nature Methods, 13: 1029-1035), apolipoprotein B mRNA-editing catalytic polypeptide-like 3A (APOBEC3A) as described by Gehrke et al.
  • rAPOBEC1 rat APOBEC1
  • hAID human activation-induced deaminase
  • APOBEC3A apolipoprotein B mRNA-editing catalytic polypeptide-like 3A
  • the cytidine deaminase domain is an apolipoprotein B mRNA-editing complex (APOBEC) family deaminase.
  • APOBEC apolipoprotein B mRNA-editing complex
  • the APOBEC family deaminase is selected from the group consisting of APOBEC1 deaminase, APOBEC2 deaminase, APOBEC3A deaminase, APOBEC3B deaminase, APOBEC3C deaminase, APOBEC3D deaminase, APOBEC3F deaminase, APOBEC3G deaminase, APOBEC3H deaminase and activation-induced deaminase variant (AIDx).
  • the APOBEC family deaminase is APOBEC1 deaminase.
  • the APOBEC family deaminase is a human AIDx (hAIDx).
  • a fusion protein of the present disclosure may further comprise one or more additional domains or moieties.
  • the fusion protein may comprise cell recognition or targeting domains, nuclear localization signals (NLS), and/or antibiotic selection domains (e.g., blasticidin-S-deaminase).
  • NLS nuclear localization signals
  • the U-G mismatch is recognized by uracil N-glycosylase (UNG), which cleaves the glycosidic bond between uracil and the deoxyribose backbone of DNA to revert the U-G intermediate created by the cytidine deaminase back to a C-G base pair.
  • UNG uracil N-glycosylase
  • the activity of UNG is effectively inhibited by the uracil DNA glycosylase inhibitor (UGI) as described by Mol et al. (1995, Cell, 82: 701-708).
  • an isolated fusion protein comprising a cytidine deaminase domain further comprises a UGI domain, wherein the UGI domain inhibits a uracil-DNA glycosylase.
  • deaminase domain will define the isolated fusion protein as either an ABE or CBE.
  • an isolated fusion protein comprising an adenosine deaminase domain is an ABE.
  • an isolated fusion protein comprising a cytidine deaminase domain is a CBE.
  • the intradomain insertion of the deaminase domain between amino acid residues 119 and 132 or 730 and 745 of the amino acid sequence of SEQ ID NO: 1 unexpectedly improves the activity profile of base editing proteins.
  • the deaminase domain is positioned at an amino acid position selected from the group consisting of N730, Q731, M732, F733, E735, K736, Q737, E739, S740, M741, P742, E743, I744 and E745.
  • the deaminase domain is positioned at amino acid residue I744.
  • the deaminase domain is positioned at an amino acid selected from the group consisting of H119, N120, E125, D126, D127, T128, G129, N130, E131 and L132.
  • the deaminase domain is positioned at amino acid residue G129.
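The position labels used above combine a one-letter residue code with a 1-based residue number, for example G129. The following sketch parses such labels, checks them against the two windows named in the disclosure (residues 119 to 132 and 730 to 745, treated here as inclusive because the listed positions include both endpoints), and verifies a label against a sequence. The function names and the toy sequence are hypothetical; SEQ ID NO: 1 itself is not reproduced.

```python
import re

# Insertion windows from the text, as 1-based inclusive residue ranges.
WINDOWS = ((119, 132), (730, 745))


def parse_label(label: str) -> tuple:
    """Split a label such as 'G129' into ('G', 129)."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError(f"not a residue label: {label!r}")
    return match.group(1), int(match.group(2))


def in_claimed_window(position: int) -> bool:
    """True when `position` falls inside either insertion window."""
    return any(low <= position <= high for low, high in WINDOWS)


def label_matches_sequence(label: str, sequence: str) -> bool:
    """True when `sequence` carries the labelled residue at the labelled position."""
    residue, position = parse_label(label)
    return 1 <= position <= len(sequence) and sequence[position - 1] == residue


labels = ["G129", "N730", "E745"]
positions_ok = [in_claimed_window(parse_label(label)[1]) for label in labels]
```

Checking the letter against the sequence guards against numbering drift, for instance when a tag or altered start codon shifts every residue number.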
  • components (a) and (b) of the fusion protein described herein may be directly attached to one another or may be attached via a linker.
  • linker refers to any molecule or group of molecules that binds the two components. Linkers may provide for optimal spacing of the two components. Suitable linkers include, by way of example only, amino acids such as aminohexanoic acid, glycine and serine and stretches of two or more amino acids such as glycine and serine. Linkers may further supply a labile linkage that allows the two components to be separated from each other. Labile linkages include photocleavable groups, acid-labile moieties, base-labile moieties and enzyme-cleavable groups.
  • Embodiments of the disclosure contemplate derivatives of the fusion proteins disclosed herein.
  • the term "derivative” is intended to encompass chemical modification to a protein or one or more amino acid residues of a protein, including chemical modification in vitro, for example, by introducing a group in a side chain in one or more positions of a peptide, such as a nitro group in a tyrosine residue or iodine in a tyrosine residue, by conversion of a free carboxylic group to an ester group or to an amide group, by converting an amino group to an amide by acylation, by acylating a hydroxy group rendering an ester, by alkylation of a primary amine rendering a secondary amine, or linkage of a hydrophilic moiety to an amino acid side chain.
  • Modification of an amino acid may also include derivation of an amino acid by the addition and/or removal of chemical groups to/from the amino acid, and may include substitution of an amino acid with an amino acid analog (e.g., a phosphorylated or glycosylated amino acid) or a non-naturally occurring amino acid such as an N-alkylated amino acid (e.g., N-methyl amino acid), D-amino acid, β-amino acid or γ-amino acid.
  • conservative variants of the fusion proteins disclosed herein comprise one or more conservative amino acid substitutions.
  • a "conservative amino acid substitution” is one in which an amino acid residue is replaced with another residue having a chemically similar or derivatised side chain. Families of amino acid residues having similar side chains, for example, have been defined in the art.
  • amino acids with basic side chains e.g., lysine, arginine, histidine
  • acidic side chains e.g., aspartic acid, glutamic acid
  • uncharged polar side chains e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine
  • nonpolar side chains e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan
  • beta-branched side chains e.g., threonine, valine, isoleucine
  • aromatic side chains e.g., tyrosine, phenylalanine, tryptophan, histidine
  • the substitution of the neutral amino acid serine (S) for the similarly neutral amino acid threonine (T) would be a conservative amino acid substitution.
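The side-chain families listed above lend themselves to a lookup table. The sketch below is illustrative only: it treats a substitution as conservative when the two residues share at least one of the listed families, which is one reasonable reading of the definition rather than a rule stated in the disclosure. The names `FAMILIES` and `is_conservative` are hypothetical.

```python
# One-letter codes grouped by the side-chain families named in the text.
FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged_polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the two residues differ but share a listed family."""
    if original == replacement:
        return False  # not a substitution at all
    return any(
        original in members and replacement in members
        for members in FAMILIES.values()
    )


serine_to_threonine = is_conservative("S", "T")   # both uncharged polar
lysine_to_aspartate = is_conservative("K", "D")   # basic vs. acidic
```

Because some residues appear in more than one family (threonine is both uncharged polar and beta-branched, for example), the check uses "any shared family" rather than assigning each residue a single class.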
  • the variant will possess at least about 80% identity to the sequence of which it is a variant.
  • the sequence may be about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical to the sequence of which it is a variant.
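The identity thresholds above can be made concrete with a small calculation. This is a minimal sketch assuming the two sequences have already been aligned to equal length with `-` marking gaps, and assuming identity is counted as matching columns divided by alignment columns; the disclosure does not specify an alignment tool or a denominator, so both are assumptions, and `percent_identity` is a hypothetical helper.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def is_variant(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if identity meets the threshold (80% by default, as in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example: one mismatch in ten aligned residues.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
```

Different tools report identity over different denominators (alignment length, shorter sequence, or ungapped columns), so a value near a threshold should be read together with the method used to compute it.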
  • the fusion proteins of the present disclosure may be produced using any method known in the art, including synthetically or by recombinant techniques such as expression of polynucleotide constructs encoding the components.
  • a protein may be synthesized using the Fmoc-polyamide mode of solid-phase peptide synthesis.
  • Other synthesis methods include solid phase t-Boc synthesis and liquid phase synthesis.
  • Purification can be performed by any one of, or a combination of, techniques such as recrystallization, size exclusion chromatography, ion-exchange chromatography, hydrophobic interaction chromatography and reverse-phase high performance liquid chromatography using, for example, acetonitrile/water gradient separation.
  • a fusion protein of the present disclosure may be produced when two or more heterologous nucleotide sequences encoding each component of the fusion protein, optionally including nucleotide sequences encoding a linker or spacer amino acid(s), are fused together in the correct translational reading frame and are expressed.
  • the present disclosure also provides an isolated polynucleotide encoding the fusion protein and components thereof as described herein.
  • nucleic acid sequence mean a single- or double-stranded polymer of deoxyribonucleotide, ribonucleotide bases or known analogues of natural nucleotides, or mixtures thereof, and include coding and non-coding sequences of a gene, sense and antisense sequences complements, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polypeptides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers and fragments.
  • encode refers to the capacity of a nucleic acid to provide for another nucleic acid or a polypeptide.
  • a nucleic acid sequence is said to "encode" a polypeptide if it can be transcribed and/or translated to produce the polypeptide or if it can be processed into a form that can be transcribed and/or translated to produce the polypeptide.
  • Such a nucleic acid sequence may include a coding sequence or both a coding sequence and a non-coding sequence.
  • the terms "encode,” "encoding” and the like include an RNA product resulting from transcription of a DNA molecule, a protein resulting from translation of an RNA molecule, a protein resulting from transcription of a DNA molecule to form an RNA product and the subsequent translation of the RNA product, or a protein resulting from transcription of a DNA molecule to provide an RNA product, processing of the RNA product to provide a processed RNA product (e.g., mRNA) and the subsequent translation of the processed RNA product.
  • the present disclosure also provides vectors comprising a polynucleotide sequence(s) encoding the fusion protein and components thereof as described herein.
  • the polynucleotide sequence(s) is operably linked to a promoter to allow for expression of the fusion peptide or components thereof.
  • the vector further comprises a polynucleotide encoding a gRNA.
  • the vectors can be episomal vectors (i.e., that do not integrate into the genome of a host cell), or can be vectors that integrate into a host cell genome.
  • Vectors may be replication competent or replication-deficient.
  • Exemplary vectors include, but are not limited to, plasmids, cosmids, and viral vectors, such as adeno-associated virus (AAV) vectors, lentiviral, retroviral, adenoviral, herpesviral, parvoviral and hepatitis viral vectors.
  • the choice and design of an appropriate vector is within the ability and discretion of one of ordinary skill in the art.
  • the vector is suitable for use in gene therapy.
  • Vectors suitable for use in gene therapy would be known to persons skilled in the art, illustrative examples of which include viral vectors derived from adenovirus, adeno-associated virus (AAV), herpes simplex virus (HSV), retrovirus, lentivirus, self-amplifying single-strand RNA (ssRNA) viruses such as alphavirus (e.g., Semliki Forest virus, Sindbis virus, Venezuelan equine encephalitis, M1), and flavivirus (e.g., Kunjin virus, West Nile virus, Dengue virus), rhabdovirus (e.g., rabies, vesicular stomatitis virus), measles virus, Newcastle Disease virus (NDV) and poxvirus as described by, for example, Lundstrom (2019, Diseases, 6: 42).
  • the vector is an adeno-associated virus (AAV) vector.
  • AAV vectors include, without limitation, those derived from serotypes AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12 or AAV13, or using synthetic or modified AAV capsid proteins such as those optimized for efficient in vivo transduction.
  • a recombinant AAV vector describes replication-defective virus that includes an AAV capsid shell encapsidating an AAV genome.
  • one or more of the wild-type AAV genes have been deleted from the genome in whole or part, preferably the rep and/or cap genes.
  • the present disclosure also provides non-viral delivery methods of the fusion protein and components thereof as described herein.
  • Suitable non-viral delivery methods will be known to persons skilled in the art, illustrative examples of which include using lipids, lipid-like materials or polymeric materials, as described, for example, by Rui et al. (2019, Trends in Biotechnology, 37(3): 281-293), and nanoparticles, as described, for example, by Nguyen et al. (2020, Nature Biotechnology, 38: 44-49).
  • the present disclosure provides a method to deaminate an adenosine nucleotide to an inosine nucleotide comprising contacting a nucleic acid molecule with the fusion protein as described herein and a gRNA, wherein the gRNA comprises a sequence of at least 10 contiguous nucleotides that are complementary to a target sequence in the genome of an organism comprising one or more adenosine or thymidine nucleotides.
  • the target sequence is present on the non-coding (i.e., antisense) strand of a nucleic acid duplex.
  • the term “antisense” refers to a nucleotide sequence whose sequence of nucleotide residues is in reverse 5’ to 3’ orientation in relation to the sequence of deoxynucleotide residues in a sense strand of a nucleic acid (e.g., DNA or RNA) duplex.
  • a “sense strand” of a DNA duplex refers to a strand in a DNA duplex which is transcribed by a cell in its natural state into a mRNA (i.e., coding and/or regulatory DNA molecules).
  • the target sequence comprises a guanosine to adenosine point mutation associated with a disease or disorder, and wherein the deamination of the mutant adenosine nucleotide results in a sequence that is not associated with a disease or disorder.
  • the target sequence comprises a cytidine to thymidine point mutation associated with a disease or disorder, and wherein the deamination of the mutant cytidine nucleotide results in a sequence that is not associated with a disease or disorder.
  • the present disclosure provides a method to deaminate a cytidine nucleotide to a uridine nucleotide comprising contacting a nucleic acid molecule with the fusion protein as described herein and a gRNA, wherein the gRNA comprises a sequence of at least 10 contiguous nucleotides that are complementary to a target sequence in the genome of an organism comprising one or more cytidine or guanosine nucleotides.
  • the target sequence is present on the non-coding (i.e., antisense) strand of a nucleic acid duplex.
  • the target sequence comprises an adenosine to guanosine point mutation associated with a disease or disorder, and wherein the deamination of the mutant guanosine nucleotide results in a sequence that is not associated with a disease or disorder.
  • the target sequence comprises a thymidine to cytidine point mutation associated with a disease or disorder, and wherein the deamination of the mutant cytosine nucleotide results in a sequence that is not associated with a disease or disorder.
  • the deamination methods described herein may be adapted to methods for the treatment of diseases or disorders that are characterized by guanosine to adenosine point mutations, cytidine to thymidine point mutations, adenosine to guanosine point mutations, or thymidine to cytidine point mutations.
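The read-out of the two deamination chemistries described above can be sketched in a few lines: adenosine deamination yields inosine, which is read as guanosine, and cytidine deamination yields uridine, which is read as thymidine. The example below is a toy model under stated assumptions, not a predictor of editing outcomes: it converts every target base inside a user-supplied window on the deaminated strand, whereas real editing is partial and position dependent. `EDIT_RULES` and `simulate_edit` are hypothetical names.

```python
# Maps each editor class to (substrate base, base read after deamination).
EDIT_RULES = {
    "ABE": ("A", "G"),  # adenosine -> inosine, read as guanosine
    "CBE": ("C", "T"),  # cytidine -> uridine, read as thymidine
}


def simulate_edit(protospacer: str, editor: str, window: tuple[int, int]) -> str:
    """Return the sequence read if every substrate base in `window` is edited.

    `window` is 1-based and inclusive, counted from the 5' end of the
    protospacer on the strand exposed to the deaminase.
    """
    substrate, product = EDIT_RULES[editor]
    start, end = window
    edited = []
    for position, base in enumerate(protospacer.upper(), start=1):
        if start <= position <= end and base == substrate:
            edited.append(product)
        else:
            edited.append(base)
    return "".join(edited)


# Toy 8-nt protospacer; the windows are arbitrary values for illustration.
abe_result = simulate_edit("ACGTACGT", "ABE", (1, 4))
cbe_result = simulate_edit("ACGTACGT", "CBE", (1, 8))
```

A model like this is mainly useful for listing which bystander bases fall inside a candidate window before choosing a guide; it says nothing about efficiency at each position.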
  • a fusion protein comprising SaCas9n with an intradomain insertion of miniABEmax (V82G) base editor at residue I744 (i.e., microABE I744), effectively and efficiently corrects the highly penetrant PCDH15 Arg245Ter variant, which causes type 1 Usher syndrome.
  • the novel fusion proteins described herein may also be useful in correcting single-base variants that are causative of other human genetic diseases or disorders.
  • Plasmids expressing the U6-sgRNA scaffold with mCherry fluorophore reporter were cloned into either the pX552-CMV-mCherry-U6-SpCas9_sgRNA scaffold (SEQ ID NO: 121; Addgene #107051) or pX552-CMV-mCherry-U6-SaCas9_sgRNA scaffold (SEQ ID NO: 122; Addgene #107053) via SapI (NEB) digest sites using oligonucleotides corresponding to the target spacer (Tables 2-4).
  • HEK293A cells expressing yellow fluorescent protein were cultured in Dulbecco’s Modified Eagle Medium (DMEM) with high glucose (Life Technologies). Culture media was supplemented with 10% (vol/vol) Fetal Bovine Serum (Life Technologies) and 1% (vol/vol) antibiotic-antimycotic (Thermo Fisher Scientific). HEK293A-YFP cells were maintained in the aforementioned media at 37°C with 5% CO2 for cell culture experiments. Cell culture was performed on HEK293A-YFP cells that were subject to no more than 20 passages.
  • Cells carrying the full PCDH15 cDNA sequence with the Arg245Ter (NM_033056.4:c.733C>T) variant were generated using the Flp-In T-Rex core kit on a Flp-In T-Rex cell background (Thermo Fisher Scientific) as per manufacturer's instructions and maintained similarly to HEK293A lines.
  • Mycoplasma testing was performed on a biweekly basis using PCR Mycoplasma Test Kit I/C (Banksia Scientific).
  • HEK293A-YFP cells were seeded at a density of 50,000 cells per well in a 24-well, tissue culture-treated plate (In Vitro Technologies). Subsequently, 8 µL ViaFect Transfection reagent (Promega) with 1 µg CRISPR base editor plasmid and 1 µg sgRNA-expressing plasmid was transfected into cells 20-24 h after plating. Fresh media containing 20 µg/mL Blasticidine (Sigma Aldrich) was exchanged 18-22 h after transfection to select for cells expressing the base editor construct.
  • RNA and DNA were simultaneously harvested using 350 µL Buffer RLT Plus as part of the Allprep DNA/RNA Mini Kit (QIAGEN) following the manufacturer's protocol.
  • PCDH15 Arg245Ter Flp-In T-Rex lines were transfected with 1 µg base editor construct and 0.45 µg sgRNA plasmid (FugeneHD™, Promega), and selected with 1 µg/mL puromycin for five days.
  • RNA samples were eluted in 30 µL Buffer EB and RNase-free water, respectively, with 1.5 µL RNaseOUT Recombinant Ribonuclease Inhibitor (Life Technologies) added to the eluted RNA sample.
  • HEK293A-YFP cells were plated at a density of 50,000 cells per well, 24 hours prior to transduction at a multiplicity-of-infection (MOI) of 2 × 10^6 viral genomes/cell. After 72 hours of culture, cells were washed twice with PBS and harvested.
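The transduction dose implied by the seeding density and MOI above is a one-line calculation, sketched here for convenience. The cell number and MOI come from the text; the stock titer is a hypothetical value chosen only to show the volume conversion, and the calculation assumes the dose is based on the number of cells seeded rather than the number present 24 hours later.

```python
def viral_genomes_needed(cells_per_well: int, moi_vg_per_cell: float) -> float:
    """Total viral genomes per well for a given cell number and MOI."""
    return cells_per_well * moi_vg_per_cell


def stock_volume_ul(total_vg: float, titer_vg_per_ml: float) -> float:
    """Volume of viral stock (in microlitres) that contains `total_vg`."""
    return total_vg / titer_vg_per_ml * 1000.0


# Values from the text: 50,000 cells per well, 2 x 10^6 viral genomes per cell.
dose_vg = viral_genomes_needed(50_000, 2e6)

# Hypothetical stock titer of 1 x 10^13 vg/mL (not stated in the source).
volume_ul = stock_volume_ul(dose_vg, 1e13)
```

With these inputs the dose works out to 1 × 10^11 viral genomes per well; the volume scales inversely with whatever titer the actual preparation has.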
  • RNA samples were paired with their counterpart gDNA samples for targeted amplification.
  • the cDNA samples were diluted 1:10 and 2 µL of the diluted cDNA was used as input for the first-round PCR amplification of either RNA off-target sites or undiluted gDNA for those experiments involving DNA on-target sites (Table 2).
  • PCR reactions were made up to 25 µL comprising 12.5 µL Q5 Hot Start High-Fidelity 2X Master Mix (NEB), 1.25 µL of forward and reverse primers containing 5’ flanking Illumina-style adapter overhangs, and diluted cDNA or 50-100 ng of gDNA under thermocycling conditions of 98°C initial denaturation for 30 s, and 30 cycles of 98°C denaturation for 10 s, 65°C annealing for 30 s, and 72°C extension for 12 s with a 72°C final extension for 2 min.
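The thermocycling conditions above can be written down as a small data structure, which makes the programmed hold time easy to check. This is a sketch for bookkeeping only: the temperatures, durations and cycle count are taken from the text, the class and function names are hypothetical, and ramp times between steps are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# First-round PCR program as described in the text.
INITIAL = Step("initial denaturation", 98, 30)
CYCLE = (
    Step("denaturation", 98, 10),
    Step("annealing", 65, 30),
    Step("extension", 72, 12),
)
FINAL = Step("final extension", 72, 120)
N_CYCLES = 30


def total_hold_seconds() -> int:
    """Sum of programmed hold times, excluding ramping between temperatures."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL.seconds


hold_minutes = total_hold_seconds() / 60
```

The programmed holds add up to 1710 s (28.5 min); actual run time on an instrument will be longer because of ramping.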
  • PCR amplification was validated using electrophoresis using 1.5% agarose gel and cleaned using Agencourt AMPure XP (Beckman Coulter) 1.8X paramagnetic bead clean-up.
  • Sequencing libraries were prepared using NEBNext® Ultra™ RNA Library Prep Kit for Illumina® and sequencing was carried out on HiSeq X Ten using a 2 × 150 bp paired-end configuration at Genewiz (Suzhou, China). Libraries were downsampled to 120 million reads using seqtk v.1.3 (r106) (https://github.com/lh3/seqtk). The downsampled libraries were processed according to GATK best practices for RNA-seq variant calling (Grünewald et al., 2019, Nature Biotechnology, 37: 1041-1048). Briefly, raw sequencing reads were aligned to the human hg38 reference genome using STAR (v.2.7.2b).
  • a second round barcoding PCR was performed using between 20 and 150 ng of the purified first round PCR products.
  • the barcoding PCR added unique dual i5/i7 indices using the Nextera XT index kit v2 (Illumina).
  • Q5 Hot Start High-Fidelity 2X Master Mix was used following the sample preparation and thermocycling conditions described by Gaudelli et al. (2017, Nature, 551: 464-471).
  • the second-round PCR products were purified using 0.7x paramagnetic bead clean-up and quantified using Qubit™ dsDNA BR Assay Kit (Life Technologies). Each sample was then normalized to 4 nM and 5 µL of each library member was pooled into a final library that was validated using High Sensitivity D1000 ScreenTape (Agilent Technologies). The final library was paired-end sequenced (2 x 251) on the Illumina MiSeq machine using 600-cycle MiSeq Reagent Kit v3.
  • Amplicon sequencing analysis
  • Paired-end fastq files were joined and trimmed (Bolger et al., 2014, Bioinformatics, 30: 2114-2120), before being processed using the CRISPResso2 (v.2.0.29) workflow (Clement et al., 2019, Nature Biotechnology, 37: 224-226).
  • Example 1 Functional screen of intradomain and circularly permuted base editors
  • the hAIDx domain is tethered to nickase SpCas9 via an N-terminal linker (Ma et al, 2016, Nature Methods, 13: 1029-1035). Therefore, for this analysis, the previously characterized 44-amino acid N-terminal linker was conserved and appended to a floppy glycine-serine-rich linker at its C-terminus to bridge the SpCas9n and hAIDx protein domains.
  • the miniABEmax has less spurious RNA off-target activity as it harbors a single evolved ecTadA monomer and has been engineered for reduced non-specific RNA contact (Grünewald et al., 2019, Nature Biotechnology, 37: 1041-1048). Circular permutation of the miniABEmax, however, showed no appreciable difference in on-target DNA editing. Nonetheless, unlike the ABEmax variant, there was no significant bearing on the incidence of off-target events from its circular permutation.
  • the insertion of the hAIDx domain in SaCas9n was also consistent with the higher on-target editing efficiencies observed by ABEs for position I744 (microAIDx I744), as compared to G129 and N730 (Figure 14).
  • the microABE I744 had a broader activity window with improved overall on-target editing efficiencies compared to its Sa-ABEmax (SaCas9n variant of ABEmax) and Sa-miniABEmax (SaCas9 miniABEmax (V82G) variant) counterparts (Figure 15).
  • We observed up to a 2.28- and 1.78-fold increase in editing efficiency at the A7 position compared to Sa-ABEmax and Sa-miniABEmax, respectively (Figure 15).
  • microABE I744 outperformed the Sa-ABEmax and Sa-miniABEmax by up to 3.63- and 3.09-fold, respectively.
  • microABE I744 vastly augments the editing scope of targeted adenines within a 21-nucleotide spacer, displaying a characteristic bi-lobed activity window spanning from adenine position four to 16 (Figure 15).
  • Compared to the ‘wild-type’ SaCas9 miniABEmax (V82G) variant, the microABE I744 showed a 22-fold increase for on-target nucleotide modification at ABE16 (0.42-0.93% and 10.79-11.31%, respectively). Further, microABE I744 attenuated the incidence of off-target events for at least three of the six commonly deaminated RNA off-target transcripts compared to the Sa-miniABEmax (V82G) and Sa-ABEmax (Figure 17; Table 6). The microABE I744 also had a significantly reduced, local RNA off-target profile compared to its counterparts at residue G129 and N730 by up to 2- and 1.8-fold, respectively. Base editor insertion at residue G129 dramatically increased the incidence of RNA off-target events relative to its other intradomain counterparts and the Sa-miniABEmax (V82G) (Figures 17 to 18).
  • RNAseq was used to characterize the molecular footprint of the microABE I744, Sa-ABEmax and Sa-miniABEmax (V82G) on the transcriptome.
  • the microABE I744 dramatically lowered the incidence of aberrant mRNA off-target events compared to both the Sa-ABEmax and Sa-miniABEmax (V82G) (2243 reads containing adenosine-to-inosine editing for microABE I744 as compared to 4425 and 52,030 reads for Sa-miniABEmax [V82G] and Sa-ABEmax, respectively).
  • the domain-inlaid base editor resulted in a 6-fold reduction in the number of mRNA off-target edits (Figure 19) as compared to its non-inlaid permutant (V82G) (81 vs. 544 reads containing A-to-I editing, respectively, for transcripts mapped to chromosome 19).
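The fold differences quoted above follow directly from the read counts given in the text. The short sketch below recomputes them as simple ratios; the counts are copied from the source, while `fold_change` and the dictionary layout are hypothetical conveniences. No normalization for sequencing depth is applied beyond the downsampling already described.

```python
def fold_change(reference: float, test: float) -> float:
    """Ratio of reference to test counts (values above 1 mean fewer in test)."""
    if test <= 0:
        raise ValueError("test count must be positive")
    return reference / test


# Transcriptome-wide reads containing A-to-I edits, as reported in the text.
READS = {
    "microABE I744": 2243,
    "Sa-miniABEmax (V82G)": 4425,
    "Sa-ABEmax": 52030,
}

vs_mini = fold_change(READS["Sa-miniABEmax (V82G)"], READS["microABE I744"])
vs_abemax = fold_change(READS["Sa-ABEmax"], READS["microABE I744"])

# Chromosome 19 subset: 544 reads (non-inlaid) vs. 81 reads (domain-inlaid).
chr19_reduction = fold_change(544, 81)
```

On these numbers the chromosome 19 ratio is about 6.7, in line with the 6-fold reduction reported, and the transcriptome-wide ratios are roughly 2 and 23 relative to Sa-miniABEmax (V82G) and Sa-ABEmax, respectively.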
  • the microABE I744 was directed to correct the highly penetrant PCDH15 Arg245Ter variant, which causes type 1 Usher syndrome, whereby homozygous carriers have congenital deafness and develop retinitis pigmentosa (Ben-Yosef et al, The New England Journal of Medicine, 348: 1664-1670).
  • ABE11 the SaCas9n-intradomain ABE constructs (i.e., an all-in-one vector) could be packaged as AAV-7m8 and AAV-DJ serotypes with an SCP1 promoter (Juven-Gershon et al, 2006, Nature Methods, 3: 917-922), to drive the expression of the microABE I744, and a hU6 promoter expressing an sgRNA.
  • HEK293A-YFP cells were transduced, and editing was observed without selection or enrichment after three days of culture with either the AAV-7m8 or AAV-DJ capsid derivative (Figure 22). These data demonstrate that the microABE I744 can be packaged as an all-in-one vector for AAV delivery suitable for gene therapy (Westhaus et al, 2020, Human Gene Therapy, published online 26 February 2020, ahead of print).

Abstract

This disclosure relates generally to base editing proteins comprising a Cas9 domain and a deaminase domain, and uses therefor. In particular, the proteins of the present disclosure are adenosine base editors (ABEs) and cytidine base editors (CBEs) with improved on-target efficiency and precision, useful for clinical applications.

Description

DNA ALTERING PROTEINS AND USES THEREFOR
Related Applications
[0001] This application claims priority from Australian Provisional Patent Application No. 2020900913, filed on 25 March 2020, the entire contents of which are hereby incorporated by reference.
Field of the Art
[0002] The present disclosure relates generally to base editing proteins comprising a Cas9 domain and a deaminase domain, and uses therefor. In particular, the proteins of the present disclosure are adenosine base editors (ABEs) and cytidine base editors (CBEs) with improved on-target efficiency and precision, useful for clinical applications.
Background
[0003] Precision genome engineering via the clustered regularly interspaced short palindromic repeats/CRISPR-associated protein (CRISPR/Cas) system has revolutionized molecular biology. This specific and adaptable method for genome engineering typically utilizes a two-component system consisting of a Cas endonuclease and guide RNA (gRNA), which can be designed to target essentially any genomic locus and generate double-strand breaks (DSBs). These DSBs are subsequently repaired via the non-homologous end-joining (NHEJ) pathway or the homology-directed repair (HDR) pathway, thereby editing the genome. It is generally recognized that while NHEJ is effective for the generation of gene knockouts, HDR is more effective in the precise editing of DNA sequences. However, as HDR is very inefficient in eukaryotic cells (Song & Stieger, 2017, Molecular Therapy - Nucleic Acids, 7: 53-60), the use of CRISPR/Cas systems for making precise single-base changes or substitutions (i.e., base editing) has been limited. Given that two-thirds of human genetic diseases are associated with single-base changes (Shalem et al., 2014, Science, 343: 84-87), this limitation has necessarily impeded the development of CRISPR/Cas systems as gene therapies.
[0004] The recent development of base editing systems, including cytidine base editors (CBEs) and adenosine base editors (ABEs), has allowed for the site-specific modification of DNA independent from the DSB repair pathways. CBEs direct cytidine-to-thymidine nucleotide conversions at a user-defined guide sequence (i.e., sgRNA) and comprise a cytidine deaminase derived from vertebrate or invertebrate systems, while current generation ABEs (ABEmax) employ a dimerized, codon optimized variant of laboratory-evolved ecTadA to direct site-specific adenosine-to-guanosine nucleotide conversions in a diverse array of systems. Despite their broad scope for robust on-target editing, ABEs have a significant off-target footprint on the transcriptome and effect incidences of missense and nonsense mutations.
[0005] Efforts to minimize the occurrence of promiscuous editing have largely focused on improving the fidelity of existing ABEs by installing various inactivating mutations in the wild-type domain of the ecTadA monomer, or the use of truncated variants of ABEmax with amino acid substitutions to reduce non-specific contact with RNA as recently described, for example, by Grünewald et al. (2019, Nature Biotechnology, 37: 1041-1048). However, while these strategies are effective at improving the biosafety of ABEs, they represent a Cas9-independent solution towards minimizing aberrant editing. Accordingly, there is a need to develop new base editing proteins that both improve on-target efficiency and reduce off-target frequency.
Summary of the Disclosure
[0006] The present disclosure is predicated, in part, on the surprising finding that the activity profile of base editing proteins can be improved by manipulating the secondary structure of Cas9. In particular, the strategic use of circular permutation and protein-domain insertion can be used to calibrate the DNA and RNA footprint of base editing proteins based on a model of “best-fit” between the overall reach of the deaminating catalytic pocket and the nucleotide substrate. These findings have been reduced to practice in the generation of new base editing proteins being microABE I744, a Staphylococcus aureus Cas9 nickase (SaCas9n)-intradomain variant with an insertion of the miniABEmax (V82G) base editor at residue I744, and microAIDx I744, a Staphylococcus aureus Cas9 nickase (SaCas9n)-intradomain variant with an insertion of the AIDx base editor at residue I744, both of which exhibit robust on-target editing and reduced RNA signature on the transcriptome. Further, microABE I744 has also been demonstrated to effectively correct the highly penetrant PCDH15 Arg245Ter variant, which causes type 1 Usher syndrome.
[0007] Accordingly, in an aspect, the present disclosure provides an isolated fusion protein comprising: a. a Cas9 domain comprising the amino acid sequence of SEQ ID NO: 1, or an amino acid sequence which is at least 80% identical to the amino acid sequence of SEQ ID NO: 1, wherein the Cas9 domain when in conjunction with a bound guide RNA (gRNA) specifically binds to a target nucleic acid sequence; and b. a deaminase domain, wherein the deaminase domain deaminates a nucleotide base in a single-stranded portion of the target nucleic acid sequence when in conjunction with the Cas9 domain and the gRNA, wherein the deaminase domain is positioned between amino acid residues 119 and 132 or 730 and 745 of the amino acid sequence of SEQ ID NO: 1.
[0008] In another aspect, the present disclosure provides an isolated polynucleotide encoding the fusion protein as disclosed herein.
[0009] In another aspect, the present disclosure provides a vector comprising the polynucleotide as disclosed herein.
[0010] In another aspect, the present disclosure provides a complex comprising the fusion protein as disclosed herein and a gRNA bound to the Cas9 domain of the fusion protein.
[0011] In another aspect, the present disclosure provides a method to deaminate an adenosine nucleotide to an inosine nucleotide comprising contacting a nucleic acid molecule with the fusion protein as disclosed herein and a gRNA, wherein the gRNA comprises a sequence of at least 10 contiguous nucleotides that are complementary to a target sequence in the genome of an organism comprising one or more adenosine or thymidine nucleotides.
[0012] In another aspect, the present disclosure provides a method to deaminate a cytidine nucleotide to a uracil nucleotide comprising contacting a nucleic acid molecule with the fusion protein as disclosed herein and a gRNA, wherein the gRNA comprises a sequence of at least 10 contiguous nucleotides that are complementary to a target sequence in the genome of an organism comprising one or more cytidine or guanosine nucleotides.
Brief Description of the Drawings
[0013] Embodiments of the disclosure are described herein, by way of non-limiting example only, with reference to the accompanying drawings.
[0014] Figure 1 is a schematic representation of the SpCas9 constructs.
[0015] Figure 2 is a schematic representation of SpCas9n with key residues for permutant variants. S. pyogenes Cas9 (PDB: 4OO8) is shown complexed with target DNA (red) and gRNA (orange). Key residues for circular permutation (1010, magenta; 1029, cyan; 1058, green) or intradomain insertion (202, yellow; 208, navy; 468, black; 1058, green) of the hAIDx protein domain are shown. Numbering is considered from the starting methionine at position 1. BH, bridge helix.
[0016] Figure 3 is a graphical representation of the cytosine conversion efficiency (%, y-axis) of TAM-AIDx after circular permutation (CP) or intradomain (ID) insertion into SpCas9 constructs targeting the YFP locus (x-axis).
[0017] Figure 4 shows that SpCas9 is amenable to intradomain insertion of hAIDx protein at residue 1058. (A) A graphical representation of the total cytosine conversion efficiency (%, y-axis) of TAM-AIDx after CP or ID insertion into SpCas9 constructs targeting the YFP locus (x-axis); (B) A schematic representation of a nucleotide quilt of the N-terminal linked BE3 or TAM-AIDx and after CP or ID insertion into SpCas9 constructs targeting the YFP locus.
[0018] Figure 5 is a schematic representation of YFP locus tables from genomic DNA isolated from HEK293A-YFP cells after editing with SpCas9n, where the hAIDx has been linked to the C-terminal, inserted in the PI domain (at residue 1058) or linked to a circularly permuted SpCas9n (at residue 1058).
[0019] Figure 6 is a graphical representation of the on-target activity of various permutations of CBEs targeting the YFP locus as shown by reference to total cytosine conversion efficiency (%, y-axis) and nucleotide position (x-axis).
[0020] Figure 7 is a graphical representation of the on-target activity of various permutations of ABEs targeting the (A) YFP and (B) ABE16 loci as shown by reference to adenosine-to-guanine conversion efficiency (%, y-axis) and nucleotide position (x-axis).
[0021] Figure 8 is a series of graphical representations of the on-target activity of various permutations of ABEs (x-axis) targeting the (A) DNAJB, (B) MTA2, (C) PTBP2, (D) SAP30BP, (E) LCMT1, and (F) SCAP loci as shown by reference to adenosine-to-inosine conversion efficiency (%, y-axis).
[0022] Figure 9 is a heat map showing the localized, off-target profile of SpCas9 ABEmax permutants across promiscuous RNA transcripts. Average editing across the transcript was considered for adenosine-to-inosine, including that of the position of the highest-edited adenosine in the amplicon.
[0023] Figure 10 is a graphical representation of on-target editing efficiencies for various SpCas9 ABEmax permutants with sgRNA targeting the (A) ABE16 and (B) YFP loci.
[0024] Figure 11 is a heat map showing the off-target profile of different permutants of SpCas9 ABEs. Average editing across the transcript was considered for adenosine-to-inosine, including that of the position of the highest-edited adenosine in the amplicon.
[0025] Figure 12 is a schematic representation of S. aureus Cas9 (PDB: 5CZZ) with the corresponding target DNA (red) and gRNA (orange) complex. Residues marked in yellow or green demarcate amino acid positions 119-132 or 730-745 of SEQ ID NO: 1, respectively.
[0026] Figure 13 is a graphical representation of the base editing activity for each intradomain-inserted miniABEmax (V82G) SaCas9.
[0027] Figure 14 is a series of graphical representations of the base editing activity for each intradomain-inserted hAIDx SaCas9n. (A) sgRNA targeting HEKsite4; (B) sgRNA targeting ABE5; (C) sgRNA targeting ABE9. Heatmap data points are presented as the average over three technical replicates.
[0028] Figure 15 is a series of graphical representations of editing efficiency (adenine-to-guanine conversion efficiency, %; y-axis) at various adenine positions in a 21-nucleotide activity window (nucleotide position; x-axis). Values and error bars reflect mean ± SEM of n=3 independent biological replicates across 13 different target sites.
[0029] Figure 16 is a graphical representation of the on-target editing efficiencies of intradomain SaCas9 constructs as shown by reference to adenine-to-guanine conversion efficiency (%, y-axis) and nucleotide position (x-axis) at various example loci.
[0030] Figure 17 is a heat map showing the off-target profile of SaCas9n constructs. Average editing across the transcript was considered for adenosine-to-inosine, including that of the position of the highest-edited adenosine in the amplicon.
[0031] Figure 18 is a series of graphical representations of the off-target activity of Sa-ABEmax, Sa-miniABEmax (V82G), G129, N730 and I744 (microABE I744) (x-axis) targeting the (A) DNAJB, (B) MTA2, (C) PTBP2, (D) SAP30BP, (E) LCMT1, and (F) SCAP loci as shown by reference to adenine-to-inosine conversion efficiency (%, y-axis).
[0032] Figure 19 is a graphical representation of the transcriptomic profiling of microABE I744 (ID744 miniABEmax [V82G]), Sa-ABEmax (N-terminal ABEmax), and Sa-miniABEmax (V82G) (N-terminal miniABEmax [V82G]) showing the number of reads containing adenosine-to-inosine editing (y-axis) and chromosomal position (x-axis) in HEK293A-YFP cells.
[0033] Figure 20 is a heat map showing the localized, off-target profile of SaCas9 ABE permutants across promiscuous RNA transcripts. Average editing across the transcript was considered for adenosine-to-inosine, including that of the position of the highest-edited adenosine in the amplicon.
[0034] Figure 21 shows that microABE I744 has high editing efficiency and lower mRNA off-target effects as compared to the SaCas9-variant of miniABEmax (V82G). (A) A graphical representation of the editing profile of miniABEmax (V82G) compared to microABE with a sgRNA targeting the PCDH15 Arg245ter variant (NM_033056.4:c.733C>T) or LacZ; (B) A heat map showing the off-target events at several promiscuous RNA transcripts for constructs targeting the PCDH15 Arg245ter variant.
[0035] Figure 22 shows AAV-mediated delivery of a single construct containing Intradomain-SaCas9n I744 (microABE) and a sgRNA targeting either ABE site 11 or LacZ. Proportions of A-to-G conversions are presented as an average of three technical replicates, for either the AAV-7m8 or AAV-DJ serotypes, in HEK293A-YFP cells.
[0036] Figure 23 shows the annotated amino acid sequence of microABE I744.
[0037] Figure 24 shows the annotated amino acid sequence of microAIDx I744.
[0038] Nucleic acid sequences are referred to by a sequence identifier number (SEQ ID NO). Sequences are provided in the Sequence Listing.
Detailed Description
[0039] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which the invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, preferred methods and materials are described. All patents, patent applications, published applications and publications, databases, websites and other published materials referred to throughout the entire disclosure, unless noted otherwise, are incorporated by reference in their entirety. In the event that there is a plurality of definitions for terms, those in this section prevail. Where reference is made to a URL or other such identifier or address, it is understood that such identifiers can change and particular information on the internet can come and go, but equivalent information can be found by searching the internet. Reference to the identifier evidences the availability and public dissemination of such information.
[0040] The articles "a", "an" and "the" include plural aspects unless the context clearly dictates otherwise. Thus, for example, reference to "an allele" includes a single allele, as well as two or more alleles; reference to "a treatment" includes a single treatment, as well as two or more treatments; and so forth.
[0041] In the context of this specification, the term “about” is understood to refer to a range of numbers that a person of skill in the art would consider equivalent to the recited value in the context of achieving the same function or result.
[0042] Throughout this specification, unless the context requires otherwise, the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element or integer or group of elements or integers but not the exclusion of any other element or integer or group of elements or integers.
[0043] The term “optionally” is used herein to mean that the subsequently described feature may or may not be present or that the subsequently described event or circumstance may or may not occur. Hence, the specification will be understood to include and encompass embodiments in which the feature is present and embodiments in which the feature is not present, and embodiments in which the event or circumstance occurs as well as embodiments in which it does not.
[0044] Currently, there are few gene editing systems that mediate precise single-base changes or substitutions (i.e., base editing). For the base editing proteins that have been previously described, significant off-target effects and incidences of missense and nonsense mutations have limited their development as gene therapies. Therefore, the generation of base editing proteins with improved on-target efficiency and reduced off-target effects is essential to the development of gene therapies that can correct single-base changes or substitutions, which are characteristic of a large majority of human genetic diseases. As described herein, novel base editing fusion proteins have been generated. In particular, the intradomain insertion of a deaminase domain within an optimal amino acid region of Staphylococcus aureus Cas9 nickase (SaCas9n) unexpectedly improves the activity profile of the resulting fusion protein. A distinct advantage of the fusion proteins described herein is that the polynucleotide sequences encoding the fusion proteins are small enough to fit comfortably within vectors that are typically used in gene therapy (e.g., AAV vectors). Further, as a result of the broad editing window, robust on-target editing, and reduced off-target signature, the fusion proteins described herein are also suitable for therapeutic applications.
Base editing
[0045] The term “base editing” as used herein refers to a genome editing method that enables direct, irreversible conversion of one nucleotide to another at a target genomic locus without requiring double-stranded breaks (DSBs), homology-directed repair (HDR) processes, or donor DNA templates.
[0046] The terms “adenosine base editor” or “ABE” as used herein refer to a base editor that mediates conversion of adenosine-to-guanosine (i.e., A·T to G·C) via an inosine intermediate. The person skilled in the art would appreciate that the activity of ABEs can be extended to mediate the conversion of thymidine-to-cytidine (i.e., T·A to C·G) on the non-targeted strand (i.e., sense strand) of the target nucleic acid sequence.
[0047] The terms “cytidine base editor” or “CBE” as used herein refer to a base editor that mediates conversion of cytidine-to-thymidine (i.e., C·G to T·A) via a uridine intermediate. The person skilled in the art would appreciate that the activity of CBEs can be extended to mediate the conversion of guanosine-to-adenosine (i.e., G·C to A·T) on the non-targeted strand (i.e., sense strand) of the target nucleic acid sequence.
[0048] The term “nucleotide” as used herein refers to the nucleotides adenosine, guanosine, cytidine, thymidine and uridine, each of which comprises a nucleotide base attached to a ribose ring. A person skilled in the art will appreciate that the terms “adenosine”, “guanosine”, “cytidine”, “thymidine” and “uridine” may be used interchangeably herein with the terms “adenine” or “A”, “guanine” or “G”, “cytosine” or “C”, “thymine” or “T” and “uracil” or “U”, respectively, which refer to the nucleotide base comprised by the nucleotides.
[0049] Base editing proteins are used in conjunction with components of clustered regularly interspaced short palindromic repeats/CRISPR-associated protein (CRISPR/Cas) systems to directly introduce point mutations into cellular DNA without making DSBs.
[0050] The “clustered regularly interspaced short palindromic repeat” (CRISPR) / “CRISPR-associated protein” (Cas) system (CRISPR/Cas system) evolved in bacteria and archaea as an adaptive immune system to defend against viral attack. Upon exposure to a virus, short segments of viral DNA are integrated in the clustered regularly interspaced short palindromic repeats (i.e., CRISPR) locus. RNA is transcribed from a portion of the CRISPR locus that includes the viral sequence. That RNA, which contains sequence complementarity to the viral genome, mediates targeting of a Cas endonuclease to the sequence in the viral genome. The Cas endonuclease cleaves the viral target sequence to prevent integration or expression of the viral sequence.
[0051] The mechanisms of CRISPR-mediated gene editing would be known to persons skilled in the art and have been described, for example, by Doudna et al., (2014, Methods in Enzymology, 546).
Fusion proteins
[0052] In an aspect, the present disclosure provides an isolated fusion protein comprising: a. a Cas9 domain comprising the amino acid sequence of SEQ ID NO: 1, or an amino acid sequence which is at least 80% identical to the amino acid sequence of SEQ ID NO: 1, wherein the Cas9 domain when in conjunction with a bound guide RNA (gRNA) specifically binds to a target nucleic acid sequence; and b. a deaminase domain, wherein the deaminase domain deaminates a nucleotide base in a single-stranded portion of the target nucleic acid sequence when in conjunction with the Cas9 domain and the gRNA, wherein the deaminase domain is positioned between amino acid residues 119 and 132 or 730 and 745 of the amino acid sequence of SEQ ID NO: 1.
[0053] As used herein, "isolated" with reference to a protein, means that the protein is substantially free of cellular material or other contaminating proteins from the cells from which the protein is derived (and thus altered from its natural state), or substantially free from chemical precursors or other chemicals when chemically synthesized, and thus altered from its natural state.
[0054] The terms “protein”, “peptide” and “polypeptide” are used interchangeably herein to refer to a polymer of amino acid residues linked together by peptide (amide) bonds. The terms refer to a protein, peptide, or polypeptide of any size, structure or function.
[0055] The term “fusion protein” as used herein relates to a protein comprising two or more heterologous regions or domains not found operably linked in nature.
[0056] The terms “Cas9”, “Cas9 nuclease”, or “Cas9 endonuclease” as used herein refer to an RNA-guided nuclease comprising a Cas9 protein, or a fragment thereof (e.g., a protein comprising an active, inactive, or partially active DNA cleavage domain of Cas9 and/or the gRNA binding domain of Cas9).
[0057] Cas9 nuclease sequences would be known to persons skilled in the art, illustrative examples of which are described by, for example, Ferretti et al. (2001, Proceedings of the National Academy of Sciences U.S.A., 98: 4658-4663), Deltcheva et al. (2011, Nature, 471: 602-607), and Jinek et al. (2012, Science, 337: 816-821).
[0058] In an embodiment, the Cas9 domain comprises an amino acid sequence of the Staphylococcus aureus Cas9 (SaCas9).
[0059] In an embodiment, the Cas9 domain comprises a catalytically impaired Cas9 nuclease. Methods for generating catalytically impaired Cas9 proteins or fragments thereof would be known to persons skilled in the art, an illustrative example of which includes the introduction of mutations in the HNH nuclease subdomain or the RuvC1 subdomain of the DNA cleavage domain of Cas9, as described by Jinek et al. (2012, supra).
[0060] In an embodiment, the Cas9 domain comprises a Cas9 nickase. In a preferred embodiment, the Cas9 domain comprises a Staphylococcus aureus Cas9 nickase (SaCas9n).
[0061] In an exemplary embodiment, the Cas9 domain comprises the amino acid sequence of SEQ ID NO: 1, or an amino acid sequence which is at least 80% identical to the amino acid sequence of SEQ ID NO: 1. Accordingly, the sequence may be at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the amino acid sequence of SEQ ID NO: 1. Methods for the determination of sequence identity would be known to persons skilled in the art, illustrative examples of which include computer programs that employ algorithms such as protein BLAST (Altschul et al., 1997, Nucleic Acids Research, 25: 3389-3402).
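By way of illustration only, the percent identity threshold recited above can be sketched in Python as follows. This is a minimal sketch that assumes the two sequences have already been aligned (for example by a tool such as protein BLAST) and are supplied as equal-length strings with "-" marking gaps. The function names, the choice of denominator (all aligned columns) and the toy sequences are assumptions made for this example; alignment programs may define identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Columns in which both sequences carry a gap are ignored; a column with a
    gap in only one sequence counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy sequences (hypothetical, not taken from the Sequence Listing).
reference = "ACDEFGHIKL"
candidate = "ACDEFGHIKV"
print(percent_identity(reference, candidate))
print(meets_identity_threshold(reference, candidate))
```

In this toy pair, nine of ten aligned positions match, so the candidate would satisfy an "at least 80% identical" criterion under the stated assumptions.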
[0062] The terms “guide RNA” or “gRNA” refer to an RNA sequence that is complementary to a target DNA and directs a CRISPR endonuclease to the target nucleic acid sequence. gRNA comprises a CRISPR RNA (crRNA) and a tracr RNA (tracrRNA). crRNA is a 17-20 nucleotide sequence that is complementary to the target nucleic acid sequence, while the tracrRNA provides a binding scaffold for the endonuclease. crRNA and tracrRNA exist in nature as two separate RNA molecules, which have been adapted for molecular biology techniques using, for example, 2-piece gRNAs such as CRISPR tracer RNAs (cr:tracrRNAs).
[0063] The terms “single-guide RNA” or “sgRNA” refer to a single RNA sequence that comprises the crRNA fused to the tracrRNA.
[0064] Accordingly, the skilled person would understand that the term “gRNA” describes all CRISPR guide formats, including two separate RNA molecules or a single RNA molecule. By contrast, the term “sgRNA” will be understood to refer to single RNA molecules combining the crRNA and tracrRNA elements into a single nucleotide sequence.
[0065] In an embodiment, the gRNA is a single-guide RNA (sgRNA).
[0066] The term “deaminase domain” as used herein refers to a protein that deaminates a nucleotide base in a single-stranded portion of the target nucleic acid sequence when in conjunction with the Cas9 domain and the gRNA. The deaminase domains described herein may be a naturally occurring deaminase or a synthetic (i.e., engineered or laboratory-evolved) deaminase. In some embodiments, the deaminase domain is a variant of a naturally occurring deaminase that does not occur in nature. Accordingly, it is to be understood that the term “deaminase domain” as used herein is not intended to be limiting and encompasses a genetically engineered deaminase that may comprise genetic modifications (e.g., one or more mutations) that result in a variant deaminase having an amino acid sequence comprising one or more changes relative to a wild-type counterpart deaminase.
[0067] In an embodiment, the deaminase domain is an adenosine deaminase domain.
[0068] Suitable adenosine deaminase domains would be known to persons skilled in the art, illustrative examples of which include the adenosine deaminase enzymes described by Gaudelli et al. (2017, Nature, 551: 464-471).
[0069] In an embodiment, the adenosine deaminase domain is a tRNA-specific adenosine deaminase (TadA) monomer. In another embodiment, the TadA monomer is selected from the group consisting of an evolved TadA monomer and a wild-type TadA monomer.
[0070] The term “evolved” or “laboratory-evolved” as used herein refers to the directed protein evolution of protein domains (e.g., adenosine deaminase domains) to introduce new protein functions into an existing protein (i.e., an enzyme). Methods for protein evolution would be known to persons skilled in the art, illustrative examples of which include the phage-assisted continuous evolution (PACE) system as described in WO 2019/023680.
[0071] In an embodiment, the TadA monomer is from Escherichia coli (ecTadA).
[0072] In an exemplary embodiment, the adenosine deaminase domain is an evolved TadA monomer from Escherichia coli (ecTadA).
[0073] In a further exemplary embodiment, the evolved ecTadA monomer comprises an amino acid sequence that is at least 80% identical to the amino acid sequence of SEQ ID NO: 2.
[0074] In an embodiment, the deaminase domain is a cytidine deaminase domain.
[0075] Suitable cytidine deaminase domains would be known to persons skilled in the art, illustrative examples of which include rat APOBEC1 (rAPOBEC1) as described by Komor et al. (2016, Nature, 533: 420-424), human activation-induced deaminase (hAID) as described by Hess et al. (2016, Nature Methods, 13: 1036-104) and Ma et al. (2016, Nature Methods, 13: 1029-1035), apolipoprotein B mRNA-editing catalytic polypeptide-like 3A (APOBEC3A) as described by Gehrke et al. (2018, Nature Biotechnology, 36: 977-982), Petromyzon marinus cytidine deaminase 1 (PmCDA1) as described by Nishida et al. (2016, Science, 353: 8) and Komor et al. (2017, Science Advances, 3: eaao4774) and functional CDA-like (CDAL) deaminases as described by Cheng et al. (2019, Nature Communications, 10: 3612).
[0076] In an embodiment, the cytidine deaminase domain is a deaminase from an apolipoprotein B mRNA-editing complex (APOBEC) family deaminase.
[0077] In another embodiment, the APOBEC family deaminase is selected from the group consisting of APOBEC1 deaminase, APOBEC2 deaminase, APOBEC3A deaminase, APOBEC3B deaminase, APOBEC3C deaminase, APOBEC3D deaminase, APOBEC3F deaminase, APOBEC3G deaminase, APOBEC3H deaminase and activation-induced deaminase variant (AIDx).
[0078] In an exemplary embodiment, the APOBEC family deaminase is APOBEC1 deaminase.
[0079] In a further exemplary embodiment, the APOBEC family deaminase is a human AIDx (hAIDx).
[0080] A fusion protein of the present disclosure may further comprise one or more additional domains or moieties. For example, the fusion protein may comprise cell recognition or targeting domains, nuclear localization signals (NLS), and/or antibiotic selection domains (e.g., blasticidin-S-deaminase).
[0081] One of the challenges associated with the use of base editing proteins in mammalian cells is overcoming the function of DNA repair processes that operate to oppose base pair conversion. For example, the use of some base editing proteins comprising cytidine deaminase domains to edit a C-G base pair to a T-A base pair through a U-G intermediate has been shown to activate base excision repair (BER) of the U-G intermediate in DNA (Kunz et al., 2009, Cellular and Molecular Life Sciences, 66: 1021-1038). In particular, the U-G mismatch is recognized by uracil N-glycosylase (UNG), which cleaves the glycosidic bond between uracil and the deoxyribose backbone of DNA to revert the U-G intermediate created by the cytidine deaminase back to a C-G base pair. The activity of UNG is effectively inhibited by the uracil DNA glycosylase inhibitor (UGI) as described by Mol et al. (1995, Cell, 82: 701-708).
[0082] Accordingly, in an embodiment, an isolated fusion protein comprising a cytidine deaminase domain further comprises a UGI domain, wherein the UGI domain inhibits a uracil-DNA glycosylase.
[0083] A person skilled in the art would understand that the deaminase domain will define the isolated fusion protein as either an ABE or CBE. Specifically, an isolated fusion protein comprising an adenosine deaminase domain is an ABE. Conversely, an isolated fusion protein comprising a cytidine deaminase domain is a CBE.
[0084] As described elsewhere herein, the intradomain insertion of the deaminase domain between amino acid residues 119 and 132 or 730 and 745 of the amino acid sequence of SEQ ID NO: 1 unexpectedly improves the activity profile of base editing proteins.
[0085] In an embodiment, the deaminase domain is positioned at an amino acid position selected from the group consisting of N730, Q731, M732, F733, E735, K736, Q737, E739, S740, M741, P742, E743, I744 and E745.
[0086] In an exemplary embodiment, the deaminase domain is positioned at amino acid residue I744.
[0087] In another embodiment, the deaminase domain is positioned at an amino acid selected from the group consisting of H119, N120, E125, D126, D127, T128, G129, N130, E131 and L132.
[0088] In an exemplary embodiment, the deaminase domain is positioned at amino acid residue G129.
[0089] In particular embodiments, components (a) and (b) of the fusion protein described herein may be directly attached to one another or may be attached via a linker. As used herein, the term "linker" refers to any molecule or group of molecules that binds the two components. Linkers may provide for optimal spacing of the two components. Suitable linkers include, by way of example only, amino acids such as aminohexanoic acid, glycine and serine and stretches of two or more amino acids such as glycine and serine. Linkers may further supply a labile linkage that allows the two components to be separated from each other. Labile linkages include photocleavable groups, acid-labile moieties, base-labile moieties and enzyme-cleavable groups.
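By way of illustration only, the intradomain arrangement described above (a deaminase domain positioned within residues 119 to 132 or 730 to 745 of the Cas9 domain, optionally flanked by linkers) can be sketched at the sequence level in Python. The function names, the convention of inserting immediately after the named residue, the glycine-serine linkers and the placeholder sequences are all assumptions made for this example and are not taken from the Sequence Listing.

```python
# Residue windows recited in the text (1-based, inclusive).
ALLOWED_WINDOWS = ((119, 132), (730, 745))


def in_allowed_window(position: int) -> bool:
    """True when a 1-based residue number falls in one of the recited windows."""
    return any(low <= position <= high for low, high in ALLOWED_WINDOWS)


def insert_domain(cas9: str, deaminase: str, position: int,
                  n_linker: str = "", c_linker: str = "") -> str:
    """Return the fusion sequence with the deaminase placed after `position`.

    Assumption of this sketch: "positioned at residue N" is modelled as an
    insertion immediately after 1-based residue N of the Cas9 sequence.
    """
    if not 1 <= position <= len(cas9):
        raise ValueError("position lies outside the Cas9 sequence")
    if not in_allowed_window(position):
        raise ValueError("position lies outside the recited residue windows")
    return cas9[:position] + n_linker + deaminase + c_linker + cas9[position:]


# Placeholder sequences (hypothetical): 800 alanines stand in for the Cas9
# domain and "DDD" stands in for a deaminase domain.
toy_cas9 = "A" * 800
fusion = insert_domain(toy_cas9, "DDD", 744, n_linker="GS", c_linker="GS")
print(len(fusion), fusion[744:751])
```

Passing empty linker strings models the directly attached case; a real construct would use the actual SEQ ID NO: 1 sequence and a chosen deaminase sequence in place of the placeholders.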
[0090] Embodiments of the disclosure contemplate derivatives of the fusion proteins disclosed herein. As used herein the term "derivative" is intended to encompass chemical modification to a protein or one or more amino acid residues of a protein, including chemical modification in vitro, for example, by introducing a group in a side chain in one or more positions of a peptide, such as a nitro group in a tyrosine residue or iodine in a tyrosine residue, by conversion of a free carboxylic group to an ester group or to an amide group, by converting an amino group to an amide by acylation, by acylating a hydroxy group rendering an ester, by alkylation of a primary amine rendering a secondary amine, or linkage of a hydrophilic moiety to an amino acid side chain. Other derivatives may be obtained by oxidation or reduction of the side-chains of the amino acid residues in the protein. Modification of an amino acid may also include derivation of an amino acid by the addition and/or removal of chemical groups to/from the amino acid, and may include substitution of an amino acid with an amino acid analog (e.g., a phosphorylated or glycosylated amino acid) or a non-naturally occurring amino acid such as an N-alkylated amino acid (e.g., N-methyl amino acid), D-amino acid, β-amino acid or γ-amino acid.
[0091] Also contemplated herein are conservative variants of the fusion proteins disclosed herein. Conservative variants comprise one or more conservative amino acid substitutions. A "conservative amino acid substitution" is one in which an amino acid residue is replaced with another residue having a chemically similar or derivatised side chain. Families of amino acid residues having similar side chains, for example, have been defined in the art. These families include, for example, amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). For example, the substitution of the neutral amino acid serine (S) for the similarly neutral amino acid threonine (T) would be a conservative amino acid substitution. Those skilled in the art will be able to determine suitable conservative amino acid substitutions that do not eliminate the functional properties of the fusion proteins described herein.
[0092] Variants of the fusion proteins defined herein are contemplated. In particular embodiments, the variant will possess at least about 80% identity to the sequence of which it is a variant. The sequence may be about 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical to the sequence of which it is a variant.
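By way of illustration only, the side-chain families listed above for conservative amino acid substitutions can be encoded as a simple lookup in Python. This is a minimal sketch: the rule that a substitution is treated as conservative when both residues share at least one listed family, and the function and variable names, are assumptions made for this example rather than a definition given in the text.

```python
# Side-chain families as listed in the text, using one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(residue_a: str, residue_b: str) -> list:
    """Names of the listed families that contain both residues."""
    a, b = residue_a.upper(), residue_b.upper()
    return [name for name, members in SIDE_CHAIN_FAMILIES.items()
            if a in members and b in members]


def is_conservative(residue_a: str, residue_b: str) -> bool:
    """Assumed rule: conservative when both residues share a listed family."""
    return bool(shared_families(residue_a, residue_b))


print(is_conservative("S", "T"), shared_families("S", "T"))
print(is_conservative("K", "D"), shared_families("K", "D"))
```

Under this rule the serine-to-threonine example given in the text is classified as conservative, whereas a lysine-to-aspartic acid change is not.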
[0093] The fusion proteins of the present disclosure may be produced using any method known in the art, including synthetically or by recombinant techniques such as expression of polynucleotide constructs encoding the components. For example, a protein may be synthesized using the Fmoc-polyamide mode of solid-phase peptide synthesis. Other synthesis methods include solid phase t-Boc synthesis and liquid phase synthesis. Purification can be performed by any one of, or a combination of, techniques such as recrystallization, size exclusion chromatography, ion-exchange chromatography, hydrophobic interaction chromatography and reverse-phase high performance liquid chromatography using, for example, acetonitrile/water gradient separation.
[0094] A fusion protein of the present disclosure may be produced when two or more heterologous nucleotide sequences encoding each component of the fusion protein, optionally including nucleotide sequences encoding a linker or spacer amino acid(s), are fused together in the correct translational reading frame and are expressed.
[0095] Accordingly, the present disclosure also provides an isolated polynucleotide encoding the fusion protein and components thereof as described herein.
[0096] As used herein the terms “polynucleotide”, “nucleotide sequence” or “nucleic acid sequence” mean a single- or double-stranded polymer of deoxyribonucleotide, ribonucleotide bases or known analogues of natural nucleotides, or mixtures thereof, and include coding and non-coding sequences of a gene, sense and antisense sequences complements, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polypeptides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers and fragments.
[0097] As used herein, the terms “encode,” “encoding” and the like refer to the capacity of a nucleic acid to provide for another nucleic acid or a polypeptide. For example, a nucleic acid sequence is said to "encode" a polypeptide if it can be transcribed and/or translated to produce the polypeptide or if it can be processed into a form that can be transcribed and/or translated to produce the polypeptide. Such a nucleic acid sequence may include a coding sequence or both a coding sequence and a non-coding sequence. Thus, the terms "encode," "encoding" and the like include an RNA product resulting from transcription of a DNA molecule, a protein resulting from translation of an RNA molecule, a protein resulting from transcription of a DNA molecule to form an RNA product and the subsequent translation of the RNA product, or a protein resulting from transcription of a DNA molecule to provide an RNA product, processing of the RNA product to provide a processed RNA product (e.g., mRNA) and the subsequent translation of the processed RNA product.
[0098] The present disclosure also provides vectors comprising a polynucleotide sequence(s) encoding the fusion protein and components thereof as described herein. Typically, the polynucleotide sequence(s) is operably linked to a promoter to allow for expression of the fusion peptide or components thereof.
[0099] In an embodiment, the vector further comprises a polynucleotide encoding a gRNA.
[0100] The vectors can be episomal vectors (i.e., that do not integrate into the genome of a host cell), or can be vectors that integrate into a host cell genome. Vectors may be replication-competent or replication-deficient. Exemplary vectors include, but are not limited to, plasmids, cosmids, and viral vectors, such as adeno-associated virus (AAV) vectors, lentiviral, retroviral, adenoviral, herpesviral, parvoviral and hepatitis viral vectors. The choice and design of an appropriate vector is within the ability and discretion of one of ordinary skill in the art. Preferably, however, the vector is suitable for use in gene therapy.
[0101] Vectors suitable for use in gene therapy would be known to persons skilled in the art, illustrative examples of which include viral vectors derived from adenovirus, adeno-associated virus (AAV), herpes simplex virus (HSV), retrovirus, lentivirus, self-amplifying single-strand RNA (ssRNA) viruses such as alphavirus (e.g., Semliki Forest virus, Sindbis virus, Venezuelan equine encephalitis, M1), and flavivirus (e.g., Kunjin virus, West Nile virus, Dengue virus), rhabdovirus (e.g., rabies, vesicular stomatitis virus), measles virus, Newcastle Disease virus (NDV) and poxvirus as described by, for example, Lundstrom (2019, Diseases, 6: 42).
[0102] In an exemplary embodiment, the vector is an adeno-associated virus (AAV) vector. Exemplary AAV vectors include, without limitation, those derived from serotypes AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12 or AAV13, or using synthetic or modified AAV capsid proteins such as those optimized for efficient in vivo transduction. A recombinant AAV vector describes a replication-defective virus that includes an AAV capsid shell encapsidating an AAV genome. Typically, one or more of the wild-type AAV genes have been deleted from the genome in whole or part, preferably the rep and/or cap genes.
[0103] The present disclosure also provides non-viral delivery methods of the fusion protein and components thereof as described herein. Suitable non-viral delivery methods will be known to persons skilled in the art, illustrative examples of which include using lipids, lipid-like materials or polymeric materials, as described, for example, by Rui et al. (2019, Trends in Biotechnology, 37(3): 281-293), and nanoparticles, as described, for example, by Nguyen et al. (2020, Nature Biotechnology, 38: 44-49).
Deamination methods
[0104] In another aspect, the present disclosure provides a method to deaminate an adenosine nucleotide to an inosine nucleotide comprising contacting a nucleic acid molecule with the fusion protein as described herein and a gRNA, wherein the gRNA comprises a sequence of at least 10 contiguous nucleotides that are complementary to a target sequence in the genome of an organism comprising one or more adenosine or thymidine nucleotides.
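By way of illustration only, the requirement that the gRNA comprise at least 10 contiguous nucleotides complementary to the target sequence can be checked computationally. The following Python sketch assumes Watson-Crick pairing only, sequences written 5' to 3', and an antiparallel duplex between the guide and the target strand; the function names and the toy sequences are hypothetical.

```python
# Watson-Crick partner in DNA for each RNA base of the guide.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def paired_dna(guide_rna: str) -> str:
    """DNA sequence (5'->3') that would base-pair with the guide (5'->3')."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(guide_rna.upper()))


def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous stretch present in both strings."""
    best = 0
    previous = [0] * (len(b) + 1)
    for char_a in a:
        current = [0] * (len(b) + 1)
        for j, char_b in enumerate(b, start=1):
            if char_a == char_b:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def has_min_complementarity(guide_rna: str, target_dna: str, min_run: int = 10) -> bool:
    """True when the guide has at least `min_run` contiguous complementary bases."""
    return longest_shared_run(paired_dna(guide_rna), target_dna.upper()) >= min_run


# Toy guide and target strand (hypothetical sequences).
guide = "GACUGACUGACU"
target_strand = "TTT" + "AGTCAGTCAGTC" + "GGG"
print(paired_dna(guide), has_min_complementarity(guide, target_strand))
```

The check is purely sequence-based; it does not model protospacer adjacent motif requirements or mismatch tolerance, which fall outside this sketch.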
[0105] In an embodiment, the target sequence is present on the non-coding (i.e., antisense) strand of a nucleic acid duplex.
[0106] As used herein the term “antisense” refers to a nucleotide sequence whose sequence of nucleotide residues is in reverse 5’ to 3’ orientation in relation to the sequence of deoxynucleotide residues in a sense strand of a nucleic acid (e.g., DNA or RNA) duplex. A “sense strand” of a DNA duplex refers to a strand in a DNA duplex which is transcribed by a cell in its natural state into a mRNA (i.e., coding and/or regulatory DNA molecules). Thus an “antisense” sequence is typically substantially complementary to the coding strand in a DNA duplex and has homology to the non-coding strand in a DNA duplex.
[0107] A person skilled in the art would appreciate that the deamination of an adenosine nucleotide to an inosine nucleotide (i.e., resulting in A → G) on the antisense strand of a nucleic acid duplex will result in the corresponding changes being made to the sense strand in the next replication cycle (i.e., T → C).
[0108] Accordingly, in an embodiment, the target sequence comprises a guanosine to adenosine point mutation associated with a disease or disorder, and wherein the deamination of the mutant adenosine nucleotide results in a sequence that is not associated with a disease or disorder. In another embodiment, the target sequence comprises a cytidine to thymidine point mutation associated with a disease or disorder, and wherein the deamination of the mutant cytidine nucleotide results in a sequence that is not associated with a disease or disorder.
[0109] In another aspect, the present disclosure provides a method to deaminate a cytidine nucleotide to a uridine nucleotide comprising contacting a nucleic acid molecule with the fusion protein as described herein and a gRNA, wherein the gRNA comprises a sequence of at least 10 contiguous nucleotides that are complementary to a target sequence in the genome of an organism comprising one or more cytidine or guanosine nucleotides.
[0110] In an embodiment, the target sequence is present on the non-coding (i.e., antisense) strand of a nucleic acid duplex.
[0111] A person skilled in the art would appreciate that the deamination of a cytidine nucleotide to a uridine nucleotide (i.e., resulting in C → T) on the antisense strand of a nucleic acid duplex will result in the corresponding changes being made to the sense strand in the next replication cycle (i.e., G → A).
[0112] Accordingly, in an embodiment, the target sequence comprises an adenosine to guanosine point mutation associated with a disease or disorder, and wherein the deamination of the mutant guanosine nucleotide results in a sequence that is not associated with a disease or disorder. In another embodiment, the target sequence comprises a thymidine to cytidine point mutation associated with a disease or disorder, and wherein the deamination of the mutant cytosine nucleotide results in a sequence that is not associated with a disease or disorder.
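By way of illustration only, the strand relationships set out in paragraphs [0107] and [0111] can be sketched in Python as follows. The function and dictionary names are hypothetical and are introduced solely for this sketch. It assumes simple Watson-Crick pairing and that inosine is read as guanosine and uridine as thymidine after replication.

```python
# Watson-Crick partners used to propagate an edit to the opposite strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Assumed net outcome of each deamination once the duplex has replicated:
# adenosine -> inosine is read as G; cytidine -> uridine is read as T.
DEAMINATION_OUTCOME = {"A": "G", "C": "T"}


def sense_strand_change(edited_base: str) -> tuple[str, str]:
    """Return (original, new) base on the sense strand when `edited_base`
    is deaminated on the antisense strand."""
    new_antisense = DEAMINATION_OUTCOME[edited_base]
    return COMPLEMENT[edited_base], COMPLEMENT[new_antisense]


for base in ("A", "C"):
    before, after = sense_strand_change(base)
    print(f"antisense {base}->{DEAMINATION_OUTCOME[base]} "
          f"corresponds to sense {before}->{after}")
```

Under these assumptions, an antisense A → G edit corresponds to T → C on the sense strand, and an antisense C → T edit corresponds to G → A on the sense strand.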
[0113] It is further contemplated that the deamination methods described herein may be adapted to methods for the treatment of diseases or disorders that are characterized by guanosine to adenosine point mutations, cytidine to thymidine point mutations, adenosine to guanosine point mutations, or thymidine to cytidine point mutations. For example, it has been exemplified herein that a fusion protein comprising SaCas9n with an intradomain insertion of miniABEmax (V82G) base editor at residue I744 (i.e., microABE I744), effectively and efficiently corrects the highly penetrant PCDH15 Arg245Ter variant, which causes type 1 Usher syndrome. On this basis, it is reasonable to expect that the novel fusion proteins described herein may also be useful in correcting single-base variants that are causative of other human genetic diseases or disorders.
[0114] All publications mentioned in this specification are herein incorporated by reference. The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as an acknowledgment or admission or any form of suggestion that that prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavor to which this specification relates.
[0115] It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the present disclosure without departing from the spirit or scope of the disclosure as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.
[0116] The present disclosure will now be further described in greater detail by reference to the following specific examples, which should not be construed as in any way limiting the scope of the disclosure.
Examples
General Methods
PyMOL analysis and I-TASSER alignment of SpCas9 and SaCas9
[0117] Crystal structures of S. pyogenes Cas9 (PDB accession: 4OO8) and S. aureus Cas9 (PDB accession: 5CZZ) were downloaded from the Protein Data Bank and visualized using PyMOL v2.3.1 (Burley et al., 2019, Nucleic Acids Research, 47: D464-D474). Given that residues 731-741 in SaCas9 were not crystalized, I-TASSER was used to generate a predictive crystal structure that was then superimposed with that of S. pyogenes Cas9 using the ‘super’ command in PyMOL (Zhang & Skolnick, 2005, Nucleic Acids Research, 33: 2302-2309). The “complete” SaCas9 structure was then aligned structurally using the TM-align webtool from I-TASSER to determine the structural homology between the two proteins (Zhang & Skolnick, 2005, supra).
Plasmid Construction and Cloning
[0118] Plasmids were generated and Sanger sequence verified by Genscript (Piscataway) (Table 1). Plasmids expressing the U6-sgRNA scaffold with mCherry fluorophore reporter were cloned into either the pX552-CMV-mCherry-U6-SpCas9_sgRNA scaffold (SEQ ID NO: 121; Addgene #107051) or pX552-CMV-mCherry-U6-SaCas9_sgRNA scaffold (SEQ ID NO: 122; Addgene #107053) via SapI (NEB) digest sites using oligonucleotides corresponding to the target spacer (Tables 2-4).
Table 1. Plasmid sequences and description [table reproduced as an image in the original publication]
Table 2. Amplicon primer sequences [table reproduced as an image in the original publication]
Table 3. sgRNA sequences [table reproduced as an image in the original publication]
Table 4. On-Target sequence used for alignment of amplified sequences [table reproduced as an image in the original publication]
Table 5. Off-Target sequence used for alignment of amplified sequences [table reproduced as an image in the original publication]
Cell Culture
[0119] HEK293A cells expressing yellow fluorescent protein (HEK293A-YFP; Huang et al., 2016, Investigative Ophthalmology & Visual Science, 57: 3470-3476) were cultured in Dulbecco’s Modified Eagle Medium (DMEM) with high glucose (Life Technologies). Culture media was supplemented with 10% (vol/vol) Fetal Bovine Serum (Life Technologies) and 1% (vol/vol) antibiotic-antimycotic (Thermo Fisher Scientific). HEK293A-YFP cells were maintained in the aforementioned media at 37°C with 5% CO2 for cell culture experiments. Cell culture was performed on HEK293A-YFP cells that were subject to no more than 20 passages. Cells carrying the full PCDH15 cDNA sequence with the Arg245Ter (NM_033056.4:c.733C>T) variant were generated using the Flp-In T-Rex core kit on a Flp-In T-Rex cell background (Thermo Fisher Scientific) as per manufacturer's instructions and maintained similarly to HEK293A lines. Mycoplasma testing was performed on a biweekly basis using PCR Mycoplasma Test Kit I/C (Banksia Scientific).
Transfections and DNA/RNA extractions
[0120] For on-target DNA and off-target RNA characterization, HEK293A-YFP cells were seeded at a density of 50,000 cells per well in a 24-well, tissue culture-treated plate (In Vitro Technologies). Subsequently, 8 μL ViaFect Transfection reagent (Promega) with 1 μg CRISPR base editor plasmid and 1 μg sgRNA-expressing plasmid was transfected into cells 20-24 h after plating. Fresh media containing 20 μg/mL Blasticidine (Sigma Aldrich) was exchanged 18-22 h after transfection to select for cells expressing the base editor construct. Further enrichment was performed 18-22 h following the first selection round with the replacement of media containing 30 μg/mL Blasticidine. Overall, cells were cultured for no longer than 72 h after initial transfection before washing with 1x PBS (ThermoFisher Scientific) due to the loss of RNA A-to-I edits in the transcriptome over time. For the initial on-target gDNA editing screen of SaCas9n intradomain constructs, total culturing time was five days to ensure maximum selection with Blasticidine, and cells were extracted for DNA only. For those experiments involving three days of culturing, RNA and DNA were simultaneously harvested using 350 μL Buffer RLT Plus as part of the Allprep DNA/RNA Mini Kit (QIAGEN) following the manufacturer's protocol. PCDH15 Arg245Ter Flp-In T-Rex lines were transfected with 1 μg base editor construct and 0.45 μg sgRNA plasmid (FugeneHD™, Promega), and selected with 1 μg/mL puromycin for five days. DNA and RNA samples were eluted in 30 μL Buffer EB and RNase-free water, respectively, with 1.5 μL RNaseOUT Recombinant Ribonuclease Inhibitor (Life Technologies) added to the eluted RNA sample. For experiments involving AAV-transduction, HEK293A-YFP cells were plated at a density of 50,000 cells per well, 24 hours prior to transduction at a multiplicity-of-infection (MOI) of 2 × 10^6 viral genomes/cell. After 72 hours of culture, cells were washed twice with PBS and harvested.
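By way of illustration only, the viral dose implied by the AAV transduction conditions above can be worked through in Python. The sketch reads the stated MOI as 2 × 10^6 viral genomes per cell. The stock titer of 1 × 10^13 viral genomes/mL is a hypothetical value chosen for the example and is not taken from this specification, as are the function names.

```python
def total_viral_genomes(cells: int, moi_vg_per_cell: float) -> float:
    """Viral genomes needed to reach the stated MOI for one well."""
    return cells * moi_vg_per_cell


def stock_volume_ul(cells: int, moi_vg_per_cell: float,
                    titer_vg_per_ml: float) -> float:
    """Volume of AAV stock (in microlitres) that supplies that dose."""
    return total_viral_genomes(cells, moi_vg_per_cell) / titer_vg_per_ml * 1000.0


cells_per_well = 50_000          # seeding density stated above
moi = 2e6                        # viral genomes per cell, as stated above
hypothetical_titer = 1e13        # vg/mL; assumption for illustration only

print(f"dose per well: {total_viral_genomes(cells_per_well, moi):.1e} vg")
print(f"stock volume:  {stock_volume_ul(cells_per_well, moi, hypothetical_titer):.1f} uL")
```

Under these assumptions each well receives 1 × 10^11 viral genomes. The stock volume scales inversely with whatever titer is actually measured.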
RNA reverse transcription and targeted PCR amplification
[0121] Between 200 and 400 ng of RNA was reverse transcribed using the High-Capacity RNA-to-cDNA™ Kit (Life Technologies) following the manufacturer's instructions. RNA samples were paired with their counterpart gDNA samples for targeted amplification. The cDNA samples were diluted 1:10 and 2 μL of the diluted cDNA was used as input for the first-round PCR amplification of either RNA off-target sites or undiluted gDNA for those experiments involving DNA on-target sites (Table 2). Briefly, PCR reactions were made up to 25 μL comprising 12.5 μL Q5 Hot Start High-Fidelity 2X Master Mix (NEB), 1.25 μL of forward and reverse primers containing 5’ flanking Illumina-style adapter overhangs, and diluted cDNA or 50-100 ng of gDNA under thermocycling conditions of 98°C initial denaturation for 30 s, and 30 cycles of 98°C denaturation for 10 s, 65°C annealing for 30 s, and 72°C extension for 12 s with a 72°C final extension for 2 min. PCR amplification was validated by electrophoresis on a 1.5% agarose gel and cleaned using Agencourt AMPure XP (Beckman Coulter) 1.8X paramagnetic bead clean-up.
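By way of illustration only, the make-up of a single first-round PCR reaction can be sketched as below. The sketch assumes that 1.25 μL of each primer (forward and reverse) is used and that the template is the 2 μL of diluted cDNA mentioned above. Neither the balancing water volume nor the function name is stated in this specification.

```python
def water_volume_ul(total: float = 25.0, master_mix: float = 12.5,
                    primer_each: float = 1.25, template: float = 2.0) -> float:
    """Nuclease-free water needed to bring one reaction to `total` microlitres.

    Assumes two primers at `primer_each` microlitres apiece (an assumption).
    """
    water = total - master_mix - 2 * primer_each - template
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    return water


print(f"water per 25 uL reaction: {water_volume_ul():.2f} uL")
```

For gDNA templates the template volume depends on the measured concentration of each sample, so the `template` argument would be adjusted accordingly.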
RNA-seq analysis
[0122] Sequencing libraries were prepared using NEBNext® Ultra™ RNA Library Prep Kit for Illumina® and sequencing was carried out on HiSeq X Ten using a 2 × 150-bp paired-end configuration at Genewiz (Suzhou, China). Libraries were downsampled to 120 million reads using seqtk v.1.3 (r106) (https://github.com/lh3/seqtk). The downsampled libraries were processed according to GATK best practices for RNA-seq variant calling (Grünewald et al., 2019, Nature Biotechnology, 37: 1041-1048). Briefly, raw sequencing reads were aligned to the human hg38 reference genome using STAR (v.2.7.2b). Next, tools from GATK (v.4.1.3.0) that include MarkDuplicates, SplitNCigarReads, BaseRecalibrator and ApplyBQSR were used to process the aligned reads. Known variants in dbSNP build 138 were used for base quality recalibration. Finally, 'analysis-ready' BAM files were subjected to bam-readcount and HaplotypeCaller to estimate per-library nucleotide abundances per position and to identify RNA base-editing variants, respectively. Total A-to-I edits per library were calculated as the sum of A-to-I edits on the positive strand and T-to-C edits on the negative strand.
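By way of illustration only, the final tallying rule described above (A-to-I edits on the positive strand plus T-to-C edits on the negative strand) can be sketched in Python. The record layout of (strand, reference base, alternate base) and the function name are assumptions made for this sketch and do not reflect the actual output format of HaplotypeCaller or bam-readcount. A-to-I edits are assumed to be observed as A-to-G calls.

```python
from collections import Counter
from typing import Iterable, Tuple

Variant = Tuple[str, str, str]  # (strand, reference base, alternate base)


def count_a_to_i(variants: Iterable[Variant]) -> int:
    """Total A-to-I edits: A>G calls on '+' plus T>C calls on '-'."""
    tally = Counter()
    for strand, ref, alt in variants:
        if (strand, ref, alt) == ("+", "A", "G"):
            tally["plus_A>G"] += 1
        elif (strand, ref, alt) == ("-", "T", "C"):
            tally["minus_T>C"] += 1
    return tally["plus_A>G"] + tally["minus_T>C"]


# Toy records for illustration; these are not data from the disclosure.
toy_calls = [("+", "A", "G"), ("-", "T", "C"), ("+", "C", "T"),
             ("-", "A", "G"), ("+", "A", "G")]
print(count_a_to_i(toy_calls))
```

Calls that do not match either pattern, such as the C>T call and the A>G call on the negative strand in the toy list, are ignored by this rule.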
Library preparation for targeted amplicon sequencing
[0123] Following the first-round PCR amplification and clean-up of amplicons containing on-target sites or RNA off-target sites, a second round barcoding PCR was performed using between 20 and 150 ng of the purified first round PCR products. The barcoding PCR added unique dual i5/i7 indices using the Nextera XT index kit v2 (Illumina). Q5 Hot Start High-Fidelity 2X Master Mix was used following the sample preparation and thermocycling conditions described by Gaudelli et al. (2017, Nature, 551: 464-471). Subsequently, the second-round PCR products were purified using 0.7x paramagnetic bead clean-up and quantified using Qubit™ dsDNA BR Assay Kit (Life Technologies). Each sample was then normalized to 4 nM and 5 μL of each library member was pooled into a final library that was validated using High Sensitivity D1000 ScreenTape (Agilent Technologies). The final library was paired-end sequenced (2 x 251) on the Illumina MiSeq machine using 600-cycle MiSeq Reagent Kit v3.
Amplicon sequencing analysis
[0124] Paired-end fastq files were joined and trimmed (Bolger et al., 2014, Bioinformatics, 30: 2114-2120), before being processed using the CRISPResso2 (V.2.0.29) workflow (Clement et al, 2019, Nature Biotechnology, 37: 224-226).
Statistical analysis
[0125] The average nucleotide modification percentage outputs from CRISPResso2 (V.2.0.29) were pooled across independent biological and technical replicates for each nucleotide position in the amplicon. Welch Two-sample t-tests were performed to compare differences in editing efficiencies, and a p value of less than 0.005 was considered statistically significant. Specifically, for comparative analyses of the six RNA off-target transcripts (DNAJB, MTA2, PTBP2, SAP30BP, LCMT1, and SCAP), both average adenine-to-guanine (inosine) editing across the length of the amplicon, and also the highest edited position of the amplicon were considered (Grünewald et al., 2019, supra). The average of the amplicon was considered as multiple off-target events were observed relative to the non-transfected mock control.
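By way of illustration only, the Welch two-sample comparison referred to above can be sketched with the Python standard library. The sketch computes the test statistic and the Welch-Satterthwaite degrees of freedom only. The function name and the replicate values are hypothetical, and the software actually used for the statistical analysis is not stated in this specification.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    va = variance(a) / len(a)   # squared standard error of group a
    vb = variance(b) / len(b)   # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical editing percentages from three replicates per construct.
construct_a = [1.0, 2.0, 3.0]
construct_b = [2.0, 3.0, 4.0]
t_stat, dof = welch_t(construct_a, construct_b)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A p value would then be obtained from the t distribution with the computed degrees of freedom, for example with `scipy.stats.ttest_ind(a, b, equal_var=False)`, and compared against the 0.005 threshold stated above.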
Data Availability
[0126] All raw sequencing reads have been uploaded to the European Nucleotide Archive accession: PRJEB35675.
[0127] All sequence identifiers (e.g., GenBank ID, EMBL-Bank ID, DNA Data Bank of Japan (DDBJ) ID, etc.), Addgene identifiers, and Protein Data Bank (PDB) identifiers provided herein were current at the filing date.
Example 1 - Functional screen of intradomain and circularly permuted base editors
[0128] The compact size and unique chemistry of the hAIDx protein (TAM-AIDx; 182 residues) made it an ideal candidate for a functional screen of intradomain and circularly permuted base editors (Figure 1). First, several sites of interest in the REC2, REC1, and RuvC-III domains of SpCas9 were selected for analysis (Figure 2), each of which were previously shown to be highly amenable to protein domain insertion without resulting in loss-of-function to SpCas9 (Oakes et al, 2016, Nature Biotechnology, 34: 646-651). Ordinarily, the hAIDx domain is tethered to nickase SpCas9 via an N-terminal linker (Ma et al, 2016, Nature Methods, 13: 1029-1035). Therefore, for this analysis, the previously characterized 44-amino acid N-terminal linker was conserved and appended to a floppy glycine-serine-rich linker at its C-terminus to bridge the SpCas9n and hAIDx protein domains.
[0129] To broadly survey the effects of protein-domain alterations on base editing activity, three circularly permuted SpCas9n constructs of interest were used for comparison (Figure 3; Oakes et al., 2019, Cell, 176: 254-267). Circular permutant-variants of hAIDx at residues 1010, 1029, and 1058 - all residing within the PI domain of SpCas9n (Figure 2) - were selected for a direct comparison of on-target editing using a cell line expressing yellow fluorescent protein (YFP), which has no homologous analog in the human genome. These data show that the intradomain insertion of the hAIDx protein into SpCas9 maintains a consistent on-target DNA signature (i.e., characterized by cytosine-to-guanine transversions at position 9 of the sgRNA) compared to its C-terminal variant, and that SpCas9n domain-interruptions are most amenable at residue 1058 of the positions assayed (Figures 3 and 4). Similarly, the highest on-target editing was observed with the circular permutation of SpCas9n at residue 1029 compared to other circular permutant variants of hAIDx (Figure 3).
[0130] Further, intradomain insertion of rAPOBEC1 (BE3) at residue 1058 maintains on-target cytosine-to-thymine activity, despite a 2.2-fold average reduction in editing efficiency at the YFP locus (26.9% and 12.1%, respectively). Although the C-terminal appendage of a uracil DNA glycosylase inhibitor (UGI) directs product fate towards a cytosine-to-thymine base transition, comparison of CBEs demonstrates that BE3 had the poorest product purity for a construct bearing a UGI (Figures 5 and 6). These data enable the inlaying of CBEs at residue 1058, particularly for the insertion of different varieties of base editors as the CBEs disclosed herein are sufficiently plastic for dramatic structural variations without deleterious effects (Figure 6).
Example 2 - Generation and characterization of conformational variants of ABEmax
[0131] Several conformational variants of ABEmax were then generated to profile on-target DNA and off-target RNA editing efficiencies (Figure 1). Initially, the circularly permuted SpCas9n designs described elsewhere herein, which used hAIDx and rAPOBEC1 insertions at position 1029, were adapted to an N-terminal, C-terminal and a de-coupled ecTadA dimer variant of ABEmax. On average, the N-terminal variant of ABEmax (ABEmax, also referred to as “wild-type”) severely impeded editing at the YFP locus compared to its wild-type counterpart (4.2% vs. 40.5%, averaged across three independent technical replicates; Figure 7). However, there was a modest, four-fold improvement in editing efficiency at a previously well-characterized locus (ABE site 16; ABE16; Gaudelli et al., 2017, Nature, 551: 464-471), compared to the YFP locus. Interestingly, the C-terminal construct had a ten- and four-fold reduction for on-target editing at the YFP and ABE16 loci, respectively, but significantly increased the incidence of localized, RNA off-target events at two promiscuous transcripts (Figure 8).
[0132] Next, the ecTadA dimer of ABEmax was decoupled (Figure 1) to determine whether decoupling of the ecTadA monomers influences the on-target editing efficiency of circularly permuted ABEs. The unevolved ecTadA monomer of ABEmax was placed at the C-terminus of the circularly permuted SpCas9n construct while its evolved monomer was shifted to the N-terminus. De-coupling of ABEmax did not significantly affect the on-target activity of the N-terminal circular permutant of ABEmax (Figure 7). Surprisingly, however, we found that there was an increase in the incidence of localized RNA off-target events at the DNAJB transcript, which had only been previously observed by circularly permutating ABEmax at its C-terminus (Tables 5 and 6; Figure 9).
Table 6. On-target and off-target editing efficiencies of SpCas9 permutants of ABEmax [table reproduced as an image in the original publication]
Table 7. On-target and off-target editing efficiencies of SaCas9 permutants of microABE and miniABEmax [table reproduced as an image in the original publication]
[0133] In comparison to ABEmax, the miniABEmax has less spurious RNA off-target activity as it harbors a single evolved ecTadA monomer and has been engineered for reduced non-specific RNA contact (Grünewald et al., 2019, Nature Biotechnology, 37: 1041-1048). Circular permutation of the miniABEmax however showed no appreciable difference in on-target DNA editing. Nonetheless, unlike the ABEmax variant, there was no significant bearing on the incidence of off-target events from its circular permutation.
[0134] The effects of inlaying both the ABEmax and miniABEmax (V82G) variants at the intradomain site of residue 1058 in SpCas9n described elsewhere herein were also compared (Figure 1). Overall, a 3.5- and 1.7-fold average reduction to on-target editing at the YFP and ABE16 loci was shown, respectively, upon intradomain insertion of ABEmax in SpCas9n (Figure 10). For the miniABEmax (V82G) variant, on-target DNA editing was abrogated when the evolved ecTadA monomer was inlaid, as compared to its native conformation, but no difference to the RNA editing profile was observed (15.5- and 8.5-fold average reductions for ABE16 and YFP loci, respectively). Altogether, these data show that both the DNA and RNA activity profiles of ABEs can be altered based upon their domain positioning in SpCas9n relative to their cognate nucleotide substrates.
Example 3 - Generation of Staphylococcus aureus nickase Cas9 (SaCas9n)-intradomain ABE constructs
[0135] Although the alignment between SaCas9 and SpCas9 crystal structures revealed poor structural homology between the two proteins (Jinek et al., 2014, Science, 343: 1247997; Nishimasu et al., 2015, Cell, 162: 1113-1126), residue 1058 in SpCas9 was conformationally analogous to the poorly crystalized protein loop of residue 745 in SaCas9. On this basis, the length of the uncharacterized protein loop between residue 730 and 745 was assayed within the constraints of the adjacent alpha helices of the local domain. To further elucidate the apparent positional dependency of base editing activity and protein structure, residues 119 to 132 in SaCas9n were assayed (Figure 12). These residues were positional analogs to the topographically equivalent residue of 468 in SpCas9, which were assayed in the preliminary screen of intradomain insertions in the REC lobe of SpCas9n using hAIDx described elsewhere herein (Figure 4).
[0136] Interestingly, the insertion of a base editing domain within the REC lobe of SaCas9n significantly impeded the on-target activity of the miniABEmax (V82G) (between 0.00 and 5.47% across residues 119 to 132), whereas on-target activity was dramatically improved when inserted into the RuvC-III domain of SaCas9n (between 5.39 and 17.7% across residues 730 to 745). Moreover, a gradated, topographical ‘hotspot’ was revealed by shifting the base editor domain from one residue to another in the RuvC-III domain (Figure 13), until a local ‘maximum’ was achieved with the highest on-target editing efficiency being observed at residue I744 (13.6-17.7% across three independent technical replicates). Here, the insertion of the miniABEmax (V82G) base editor at residue I744 (i.e., ‘microABE I744’) showed significantly superior on-target activity at the ABE16 locus compared to SaCas9n variants of miniABEmax (V82G) and ABEmax (15.96% vs. 1.19% and 1.03%, respectively). The insertion of the hAIDx domain in SaCas9n was also consistent with the higher on-target editing efficiencies observed by ABEs for position I744 (microAIDx I744), as compared to G129 and N730 (Figure 14).
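By way of illustration only, the notion of an intradomain insertion at a given residue can be sketched as a string operation on amino acid sequences. The sketch assumes that the inserted domain, with optional flanking linkers, is placed immediately after the named residue. The function name, the toy sequences and the linkers are hypothetical placeholders and are not SEQ ID NO: 1, SEQ ID NO: 2 or the linkers of the disclosed constructs.

```python
def inlay_domain(host: str, domain: str, residue: int,
                 n_linker: str = "", c_linker: str = "") -> str:
    """Insert `domain` (with optional linkers) immediately after the
    1-based `residue` of `host` and return the fused sequence."""
    if not 1 <= residue <= len(host):
        raise ValueError("residue lies outside the host sequence")
    return host[:residue] + n_linker + domain + c_linker + host[residue:]


# Toy sequences only; a real construct would use the Cas9 and deaminase
# sequences of the disclosure and residue 744 as the insertion point.
toy_host = "MAAAAGGGGG"
toy_domain = "WWW"
print(inlay_domain(toy_host, toy_domain, residue=5, n_linker="GS", c_linker="GS"))
```

Scanning `residue` across a range (for example 730 to 745) with the same helper would generate the family of positional variants of the kind assayed above.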
Example 4 - Characterization of SaCas9n-intradomain ABE constructs
[0137] The microABE I744 had a broader activity window with improved, overall on-target editing efficiencies compared to its Sa-ABEmax (SaCas9n variant of ABEmax) and Sa-miniABEmax (SaCas9 miniABEmax (V82G) variant) counterparts (Figure 15). We observed up to a 2.28- and 1.78-fold increase in editing efficiency at the A7 position compared to Sa-ABEmax and Sa-miniABEmax, respectively. At the A10 position, microABE I744 outperformed the Sa-ABEmax and Sa-miniABEmax by up to 3.63- and 3.09-fold, respectively. Collectively, these data demonstrate that the microABE I744 vastly augments the editing scope of targeted adenines within a 21-nucleotide spacer, displaying a characteristic bi-lobed activity window spanning from adenine position four to 16 (Figure 15).
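By way of illustration only, the adenines of a spacer that fall within the activity window described above (positions four to 16) can be listed with a short Python helper. The sketch assumes positions are counted from 1 at the 5’ end of the spacer. The function name and the 21-nucleotide spacer are hypothetical and do not correspond to any sgRNA of Table 3.

```python
def adenines_in_window(spacer: str, first: int = 4, last: int = 16) -> list[int]:
    """1-based positions of adenines within [first, last] of the spacer."""
    return [pos for pos, base in enumerate(spacer.upper(), start=1)
            if base == "A" and first <= pos <= last]


toy_spacer = "ACGTACGTACGTACGTACGTA"   # 21 nt, for illustration only
print(adenines_in_window(toy_spacer))
```

For this toy spacer the helper reports positions 5, 9 and 13. The adenines at positions 1, 17 and 21 fall outside the window.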
[0138] The effects of SaCas9n intradomain base editor insertion on RNA activity were also characterized. In addition to assaying microABE I744, other intradomain base editors at residues G129 and N730 were challenged against an adenine-rich RNF2 locus, a previously validated sgRNA against ABE16, and a non-targeting sgRNA against LacZ. Base editor insertion at SaCas9n residue I744 consistently showed the highest on-target editing efficiencies across both loci (11.06%, averaged across three independent technical replicates for ABE16 locus, and 11.01%, averaged across three independent technical replicates for RNF2 locus) (Figure 16). Compared to the ‘wild-type’ SaCas9 miniABEmax (V82G) variant, the microABE I744 showed a 22-fold increase for on-target nucleotide modification at ABE16 (0.42-0.93% and 10.79-11.31%, respectively). Further, microABE I744 attenuated the incidence of off-target events for at least three of the six commonly deaminated RNA off-target transcripts compared to the Sa-miniABEmax (V82G) and Sa-ABEmax (Figure 17; Table 6). The microABE I744 also had a significantly reduced, local RNA off-target profile compared to its counterparts at residue G129 and N730 by up to 2- and 1.8-fold, respectively. Base editor insertion at residue G129 dramatically increased the incidence of RNA off-target events relative to its other intradomain counterparts and the Sa-miniABEmax (V82G) (Figures 17 to 18).
[0139] RNAseq was used to characterize the molecular footprint of the microABE I744, Sa-ABEmax and Sa-miniABEmax (V82G) on the transcriptome. The microABE I744 dramatically lowered the incidence of aberrant mRNA off-target events compared to both the Sa-ABEmax and Sa-miniABEmax (V82G) (2243 reads containing adenosine-to-inosine editing for microABE I744 as compared to 4425 and 52,030 reads for Sa-miniABEmax [V82G] and Sa-ABEmax, respectively). In some instances, the domain-inlaid base editor resulted in a 6-fold reduction in the number of mRNA off-target edits (Figure 19) as compared to its non-inlaid permutant (V82G) (81 vs. 544 reads containing A-to-I editing, respectively, for transcripts mapped to chromosome 19). To further assess if domain-inlaid ABEs adversely affected their DNA-editing fidelity, we selected the top 28 predicted gDNA off-target sites based on the sgRNA-target homology for further analysis (Stemmer et al., 2015, PLoS ONE, 10(4): e0124633). Of the top three edited sites (ABE site 11; ABE11, ABE site 8; ABE8, and ABE site 1; ABE1), we found that, overall, there was no apparent change in the off-target gDNA editing breadth of the microABE I744 as compared to its existing counterparts at the predicted off-target sites.
[0140] Thereafter, the microABE I744 was directed to correct the highly penetrant PCDH15 Arg245Ter variant, which causes type 1 Usher syndrome, whereby homozygous carriers have congenital deafness and develop retinitis pigmentosa (Ben-Yosef et al, The New England Journal of Medicine, 348: 1664-1670). Overall, we observed a 10-fold increase in editing efficiency and dramatically lower mRNA off-target effects as compared to the SaCas9-variant of miniABEmax (V82G) (Table 6; Figure 21).
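By way of illustration only, the fold reductions implied by the read counts reported in paragraph [0139] can be recomputed as simple ratios. The function name and the dictionary labels are introduced for this sketch. The read counts are those stated in paragraph [0139].

```python
def fold_reduction(reference_reads: int, test_reads: int) -> float:
    """Ratio of edited reads in the reference construct to the test construct."""
    if test_reads <= 0:
        raise ValueError("test_reads must be positive")
    return reference_reads / test_reads


# (reference reads, domain-inlaid construct reads), as reported in [0139].
comparisons = {
    "Sa-miniABEmax (V82G) vs. inlaid editor, transcriptome-wide": (4425, 2243),
    "Sa-ABEmax vs. inlaid editor, transcriptome-wide": (52030, 2243),
    "non-inlaid (V82G) vs. inlaid editor, chromosome 19": (544, 81),
}
for label, (reference, test) in comparisons.items():
    print(f"{label}: {fold_reduction(reference, test):.1f}-fold")
```

The chromosome 19 ratio evaluates to approximately 6.7, which is consistent with the reduction of about 6-fold stated above.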
Example 5 - Packaging of SaCas9n-intradomain ABE constructs
[0141] Current generation AAV-mediated delivery platforms for base editors employ a dual-vector system, which is largely reliant on the use of intein trans-splicing for the reconstitution of full-length CBE or ABE (Levy et al., 2020, Nature Biomedical Engineering, 4: 97-110), which can hamper on-target editing efficiencies due to the need for co-delivery and co-transduction of the payload. To resolve these issues, we targeted the previously well-characterized locus, ABE11, to demonstrate that the SaCas9n-intradomain ABE constructs (i.e., an all-in-one vector) could be packaged as AAV-7m8 and AAV-DJ serotypes with an SCP1 promoter (Juven-Gershon et al., 2006, Nature Methods, 3: 917-922), to drive the expression of the microABE I744, and a hU6 promoter expressing an sgRNA.
[0142] HEK293A-YFP cells were transduced, and editing was observed with no selection or enrichment after three days of culturing with either the AAV-7m8 or the AAV-DJ capsid derivative (Figure 22). These data demonstrate that the microABE I744 can be packaged as an all-in-one vector for AAV delivery suitable for gene therapy (Westhaus et al., 2020, Human Gene Therapy, published online 26 February 2020, ahead of print).
Summary
[0143] Collectively, these data demonstrate that the activity profiles of CBEs and ABEs can be improved for their on-target efficiency and precision by manipulating the structure of Cas9. By refining the endonuclease-base editor stereochemistry, it has been demonstrated herein that, for example, the same variant of ABE can have different DNA and RNA editing profiles arising from alterations to their secondary structure. The strategic use of circular permutation and protein-domain insertion enables the calibration of both the DNA and RNA footprint, based upon a model of “best-fit” between the overall reach of the deaminating catalytic pocket and the nucleotide substrate.
[0144] These data enable the use of the disclosed CBEs and ABEs, in particular the microABE I744 with surprisingly increased on-target and reduced off-target capabilities, in a therapeutic context.

Claims

1. An isolated fusion protein comprising: a. a Cas9 domain comprising the amino acid sequence of SEQ ID NO: 1, or an amino acid sequence which is at least 80% identical to the amino acid sequence of SEQ ID NO: 1, wherein the Cas9 domain when in conjunction with a bound guide RNA (gRNA) specifically binds to a target nucleic acid sequence; and b. a deaminase domain, wherein the deaminase domain deaminates a nucleotide base in a single-stranded portion of the target nucleic acid sequence when in conjunction with the Cas9 domain and the gRNA, wherein the deaminase domain is positioned between amino acid residues 119 and 132 or 730 and 745 of the amino acid sequence of SEQ ID NO: 1.
2. The fusion protein of claim 1, wherein the deaminase domain is positioned at an amino acid position selected from the group consisting of N730, Q731, M732, F733, E735, K736, Q737, E739, S740, M741, P742, E743, I744, E745.
3. The fusion protein of claim 1 or claim 2, wherein the deaminase domain is positioned at amino acid residue I744.
4. The fusion protein of claim 1, wherein the deaminase domain is positioned at an amino acid position selected from the group consisting of H119, N120, E125, D126, D127, T128, G129, N130, E131 and L132.
5. The fusion protein of claim 1 or claim 4, wherein the deaminase domain is positioned at amino acid residue G129.
6. The fusion protein of any one of claims 1 to 5, wherein the deaminase domain is an adenosine deaminase domain.
7. The fusion protein of claim 6, wherein the adenosine deaminase domain is an evolved tRNA-specific adenosine deaminase (TadA) monomer from Escherichia coli (ecTadA).
8. The fusion protein of claim 7, wherein the evolved ecTadA monomer comprises an amino acid sequence that is at least 80% identical to the amino acid sequence of SEQ ID NO: 2.
9. The fusion protein of any one of claims 1 to 5, wherein the deaminase domain is a cytidine deaminase domain.
10. The fusion protein of claim 9, wherein the cytidine deaminase domain is a deaminase from an apolipoprotein B mRNA-editing complex (APOBEC) family deaminase.
11. The fusion protein of claim 10, wherein the APOBEC family deaminase is selected from the group consisting of APOBEC1 deaminase, APOBEC2 deaminase, APOBEC3A deaminase, APOBEC3B deaminase, APOBEC3C deaminase, APOBEC3D deaminase, APOBEC3F deaminase, APOBEC3G deaminase, APOBEC3H deaminase and activation-induced deaminase variant (AIDx).
12. The fusion protein of claim 11, wherein the APOBEC family deaminase is AIDx.
13. The fusion protein of any one of claims 9 to 12, further comprising a uracil glycosylase inhibitor (UGI) domain, wherein the UGI domain inhibits a uracil-DNA glycosylase.
14. An isolated polynucleotide encoding the fusion protein of any one of claims 1 to 13.
15. A vector comprising the polynucleotide of claim 14.
16. The vector of claim 15, further comprising a nucleic acid sequence encoding a gRNA.
17. The vector of claim 15 or claim 16, wherein the vector is an adeno-associated virus (AAV) vector.
18. A complex comprising the fusion protein of any one of claims 1 to 13 and a gRNA bound to the Cas9 domain of the fusion protein.
19. A method to deaminate an adenosine nucleotide to an inosine nucleotide comprising contacting a nucleic acid molecule with the fusion protein of any one of claims 6 to 8 and a gRNA, wherein the gRNA comprises a sequence of at least 10 contiguous nucleotides that are complementary to a target sequence in the genome of an organism comprising one or more adenosine or thymidine nucleotides.
20. The method of claim 19, wherein the target sequence comprises a guanosine to adenosine point mutation associated with a disease or disorder, and wherein the deamination of the mutant adenosine nucleotide results in a sequence that is not associated with a disease or disorder.
21. The method of claim 19, wherein the target sequence comprises a cytidine to thymidine point mutation associated with a disease or disorder, and wherein the deamination of the mutant cytidine nucleotide results in a sequence that is not associated with a disease or disorder.
22. A method to deaminate a cytidine nucleotide to a uracil nucleotide comprising contacting a nucleic acid molecule with the fusion protein of any one of claims 9 to 13 and a gRNA, wherein the gRNA comprises a sequence of at least 10 contiguous nucleotides that are complementary to a target sequence in the genome of an organism comprising one or more cytidine or guanosine nucleotides.
23. The method of claim 22, wherein the target sequence comprises an adenosine to guanosine point mutation associated with a disease or disorder, and wherein the deamination of the mutant guanosine nucleotide results in a sequence that is not associated with a disease or disorder.
24. The method of claim 22, wherein the target sequence comprises a thymidine to cytidine point mutation associated with a disease or disorder, and wherein the deamination of the mutant cytidine nucleotide results in a sequence that is not associated with a disease or disorder.
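
The net sequence change described in claims 19, 20, 22 and 24 can be sketched in a few lines of Python. This is an illustrative model only and not part of the claimed subject matter: the function name `deaminate`, the lookup table `READ_AS` and the seven-base sequences are hypothetical. The sketch assumes the commonly described outcome that inosine is read as guanosine and uracil is read as thymidine during replication or repair, and it ignores gRNA targeting, strand choice and the editing window.

```python
# Hypothetical model of the net base change after deamination.
# Assumption: A -> inosine is read as G; C -> uracil is read as T.
READ_AS = {"A": "G", "C": "T"}


def deaminate(seq: str, index: int, target: str) -> str:
    """Return seq with the target base at index replaced by the base
    that its deaminated form is read as."""
    if target not in READ_AS:
        raise ValueError("target must be 'A' (adenosine) or 'C' (cytidine)")
    if seq[index] != target:
        raise ValueError(f"expected {target} at index {index}, found {seq[index]}")
    return seq[:index] + READ_AS[target] + seq[index + 1:]


# Made-up example for the adenosine deaminase case (claims 19 and 20):
# a G-to-A point mutation at index 4 is reverted by editing the mutant A.
reference_1 = "GATTGCA"
mutant_g_to_a = "GATTACA"
corrected_1 = deaminate(mutant_g_to_a, 4, "A")

# Made-up example for the cytidine deaminase case (claims 22 and 24):
# a T-to-C point mutation at index 2 is reverted by editing the mutant C.
reference_2 = "GATTACA"
mutant_t_to_c = "GACTACA"
corrected_2 = deaminate(mutant_t_to_c, 2, "C")

print(corrected_1 == reference_1, corrected_2 == reference_2)
```

In both made-up cases the edited sequence matches the reference sequence, which is the sense in which the deamination "results in a sequence that is not associated with a disease or disorder".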
PCT/AU2021/050269 2020-03-25 2021-03-25 Dna altering proteins and uses therefor WO2021189110A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2020900913A AU2020900913A0 (en) 2020-03-25 DNA altering proteins and uses therefor
AU2020900913 2020-03-25

Publications (1)

Publication Number Publication Date
WO2021189110A1 (en) 2021-09-30

Family

ID=77889869

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2021/050269 WO2021189110A1 (en) 2020-03-25 2021-03-25 Dna altering proteins and uses therefor

Country Status (1)

Country Link
WO (1) WO2021189110A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018089664A1 (en) * 2016-11-11 2018-05-17 The Regents Of The University Of California Variant rna-guided polypeptides and methods of use
US20190017055A1 (en) * 2017-07-13 2019-01-17 Regents Of The University Of Minnesota Real-time reporter systems for monitoring base editing
WO2020168135A1 (en) * 2019-02-13 2020-08-20 Beam Therapeutics Inc. Compositions and methods for treating alpha-1 antitrypsin deficiency
WO2020168133A1 (en) * 2019-02-13 2020-08-20 Beam Therapeutics Inc. Compositions and methods for treating hemoglobinopathies

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HAN, J. JIA, Y. JIN, S. JING, Y. TILLARD, M. BELIN, C.: "Morphology and electrochemistry of spinel Li-Mn-O optimized by composite technology", ENERGY, ELSEVIER, AMSTERDAM, NL, vol. 31, no. 12, 1 September 2006 (2006-09-01), AMSTERDAM, NL , pages 1752 - 1757, XP005550876, ISSN: 0360-5442 *

Similar Documents

Publication Publication Date Title
US20230108687A1 (en) Gene editing methods for treating spinal muscular atrophy
US11608503B2 (en) RNA targeting of mutations via suppressor tRNAs and deaminases
US20220315906A1 (en) Base editors with diversified targeting scope
EP3178935B1 (en) Genome editing using campylobacter jejuni crispr/cas system-derived rgen
WO2019217943A1 (en) Methods of editing single nucleotide polymorphism using programmable base editor systems
AU2020223060B2 (en) Compositions and methods for treating hemoglobinopathies
US20220401530A1 (en) Methods of substituting pathogenic amino acids using programmable base editor systems
JP2022546608A (en) A novel nucleobase editor and method of use thereof
CN114072509A (en) Nucleobase editor with reduced off-target of deamination and method of modifying nucleobase target sequence using same
CN114072180A (en) Compositions and methods for treating alpha 1-antitrypsin deficiency
AU2020336953A1 (en) Compositions and methods for editing a mutation to permit transcription or expression
WO2022261509A1 (en) Improved cytosine to guanine base editors
CN114026237A (en) Compositions and methods for treating glycogen storage disease type 1a
WO2021189110A1 (en) Dna altering proteins and uses therefor
US20240132868A1 (en) Compositions and methods for the self-inactivation of base editors
WO2024077247A1 (en) Base editing methods and compositions for treating triplet repeat disorders
WO2023086953A1 (en) Compositions and methods for the treatment of hereditary angioedema (hae)
WO2024026478A1 (en) Compositions and methods for treating a congenital eye disease
BR112021013605B1 (en) BASE EDITING SYSTEMS, CELL OR A PROGENITOR THEREOF, CELL POPULATION, PHARMACEUTICAL COMPOSITION, AND METHODS FOR EDITING A BETA GLOBIN POLYNUCLEOTIDE (HBB) ASSOCIATED WITH SICKLE CELL ANEMIA AND FOR PRODUCING A RED BLOOD CELL OR PROGENITOR THEREOF
BR122023002401B1 (en) BASE EDITING SYSTEMS, CELLS AND THEIR USES, PHARMACEUTICAL COMPOSITIONS, KITS, USES OF A FUSION PROTEIN AND AN ADENOSINE 8 (ABE8) BASE EDITOR, AS WELL AS METHODS FOR EDITING A BETA GLOBIN POLYNUCLEOTIDE (HBB) COMPRISING A SINGLE NUCLEOTIDE POLYMORPHISM (SNP) ASSOCIATED WITH SICKLE CELL ANEMIA AND FOR THE PRODUCTION OF A RED BLOOD CELL

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 21774111; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE

122 EP: PCT application non-entry in European phase
Ref document number: 21774111; Country of ref document: EP; Kind code of ref document: A1