WO2024077267A1 - Prime editing methods and compositions for treating triplet repeat disorders - Google Patents

Prime editing methods and compositions for treating triplet repeat disorders

Info

Publication number
WO2024077267A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
pegrna
seq
composition
nucleotides
Prior art date
Application number
PCT/US2023/076282
Other languages
English (en)
Inventor
David R. Liu
Zaneta MATUSZEK
Mandana ARBAB
Original Assignee
The Broad Institute, Inc.
President And Fellows Of Harvard College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Broad Institute, Inc. and President And Fellows Of Harvard College
Publication of WO2024077267A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P25/00 Drugs for disorders of the nervous system
    • A61P25/28 Drugs for disorders of the nervous system for treating neurodegenerative disorders of the central nervous system, e.g. nootropic agents, cognition enhancers, drugs for treating Alzheimer's disease or other forms of dementia
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/102 Mutagenizing nucleic acids
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62 DNA sequences coding for fusion proteins
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241 Nucleotidyltransferases (2.7.7)
    • C12N9/1276 RNA-directed DNA polymerase (2.7.7.49), i.e. reverse transcriptase or telomerase
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/80 Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
    • C12N2750/00011 Details
    • C12N2750/14011 Parvoviridae
    • C12N2750/14111 Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141 Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector

Definitions

  • Triplet repeat disorders, including Huntington’s disease (HD) and Friedreich’s ataxia (FRDA), are complex, progressive disorders that involve developmental neurobiology and often affect cognition as well as sensory-motor functions.
  • the disorders show genetic anticipation (i.e., increased severity with each generation), and the DNA expansions or contractions usually happen meiotically (i.e., during the time of gametogenesis, or early in embryonic development) and often have sex-bias, meaning that some genes expand only when inherited through the female and others only through the male.
  • the DNA expansions and contractions can also happen somatically (i.e., during an individual’s development or lifetime).
  • trinucleotide repeat expansion disorders can cause gene silencing at either the transcriptional or translational level, which essentially knocks out gene function.
  • trinucleotide repeat expansion disorders can cause altered proteins generated with large repetitive amino acid sequences that either abrogate or change protein function, often in a dominant-negative manner (e.g., poly-glutamine diseases).
  • Huntington’s Disease is an autosomal dominant disorder characterized by the loss of striatal neurons in the central nervous system and is associated with progressive unwanted choreatic movements, behavioral and psychiatric disturbances, and dementia.
  • HD is caused by CAG triplet repeat expansions in the first exon of the HTT gene, which codes for huntingtin protein, resulting in an expanded stretch of glutamines (polyQ).
  • polyQ glutamines
  • CAG repeat lengths range from 9 to 35 in the general population, while HD patients typically carry 40-50 repeats from birth. Individuals with an intermediate range of 36 to 39 CAGs may develop HD at later stages, with lower penetrance and variable clinical manifestation. Notably, CAG repeat length correlates with repeat instability, and long, unstable CAG repeats undergo somatic expansion in some tissues throughout a patient’s life, particularly in tissues of the central nervous system (CNS).
  • CNS central nervous system
  • Friedreich’s ataxia is an autosomal recessive disorder characterized by progressive ataxia and damage to the nervous system and is often associated with muscle weakness, spasticity, cardiomyopathy, and diabetes mellitus.
  • FRDA is the most common hereditary ataxia in the United States, Europe, the Middle East, South Asia (Indian subcontinent), and North Africa, with a carrier frequency between 1:60 and 1:100 individuals, though it is rarely identified in other populations.
  • FRDA is typically caused by the expansion of a GAA-triplet repeat in intron 1 of the FXN gene, resulting in transcriptional silencing and deficiency in frataxin (FXN) protein levels to below 30% of normal.
  • the age of FRDA onset is inversely correlated, and the loss of FXN protein and the severity of symptoms are directly correlated, with the GAA repeat length of the shortest FXN allele.
  • the length of FXN GAA-repeats in the general population ranges from ~5-60, while FRDA patients may present with 66 to well over 1200 repeats, typically ranging from 600 to 900 repeats.
  • GAA repeat length correlates with repeat instability, and long, unstable GAA repeats undergo somatic expansion throughout a patient’s life in tissues that are particularly affected in FRDA, including the dorsal root ganglia (DRGs), spinal cord, cerebellum, heart, and pancreas, which subsequently experience greater loss of FXN protein expression.
  • DRGs dorsal root ganglia
  • nuclease and nicking activity within or flanking repeat loci does not enable reliable correction of FXN expression, and the biological consequences of unintended nuclease and nicking activity are not entirely known and may be deleterious.
  • a more precise gene-based therapy is needed to convert pathogenic HTT and FXN alleles to wild type alleles.
  • the present disclosure describes the use of prime editing to reduce the size of CAG repeat tracts to a normal polyQ length in cell and animal models (for example, by AAV delivery), as well as in subjects being treated, that contain pathogenic HTT alleles, without further changes to the flanking coding sequence.
  • a CAG repeat sequence is contracted to approximately four CAG repeats in length (e.g., approximately 10 CAG repeats in length, approximately 9 CAG repeats in length, approximately 8 CAG repeats in length, approximately 7 CAG repeats in length, approximately 6 CAG repeats in length, approximately 5 CAG repeats in length, approximately 4 CAG repeats in length, or approximately 3 CAG repeats in length).
  • a CAG repeat sequence is replaced with a CAG repeat sequence of approximately four CAG repeats in length (e.g., approximately 10 CAG repeats in length, approximately 9 CAG repeats in length, approximately 8 CAG repeats in length, approximately 7 CAG repeats in length, approximately 6 CAG repeats in length, approximately 5 CAG repeats in length, approximately 4 CAG repeats in length, or approximately 3 CAG repeats in length).
  • the present disclosure also describes the use of prime editing to remove long GAA repeats at FXN alleles in cell and animal models that contain pathogenic FXN alleles, with minimal loss of the surrounding FXN regulatory region in intron 1.
  • a GAA sequence of approximately 65 GAA repeats in length is deleted (e.g., approximately 60 GAA repeats in length, approximately 61 GAA repeats in length, approximately 62 GAA repeats in length, approximately 63 GAA repeats in length, approximately 64 GAA repeats in length, approximately 65 GAA repeats in length, approximately 66 GAA repeats in length, approximately 67 GAA repeats in length, approximately 68 GAA repeats in length, approximately 69 GAA repeats in length, or approximately 70 GAA repeats in length).
  • the present disclosure provides pegRNAs, complexes, polynucleotides, vectors, cells, compositions, and kits useful in treating Huntington’s disease and Friedreich’s ataxia, and methods of using the same.
  • the present disclosure provides pegRNAs for contracting or replacing trinucleotide repeat sequences in the HTT gene.
  • the present disclosure provides pegRNAs comprising a spacer sequence comprising the nucleic acid sequence: GACCCTGGAAAAGCTGATGA (SEQ ID NO: 381); GCTGCTGCTGGAAGGACTTG (SEQ ID NO: 382); GCTGCTGCTGCTGCTGCTGGA (SEQ ID NO: 383); GCTGCTGCTGCTGCTGCTGC (SEQ ID NO: 384); GGCGGCGGCGGCGGCGGTGG (SEQ ID NO: 385); TGAGGAAGCTGAGGAGGCGG (SEQ ID NO: 386); or GGCGGCTGAGGAAGCTGAGG (SEQ ID NO: 387).
  • the pegRNA comprises the sequence of any one of SEQ ID NOs: 454-815, or a sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the sequence of any one of SEQ ID NOs: 454-815.
  • the present disclosure provides pegRNAs for deleting trinucleotide repeat sequences in the FXN gene.
  • the present disclosure provides pegRNAs comprising a spacer sequence comprising the nucleic acid sequence: GCAAGACTAACCTGGCCAACA (SEQ ID NO: 388); GTCCGGAGTTCAAGACTAACC (SEQ ID NO: 389); GAAGGTGGATCACCTGAGGTC (SEQ ID NO: 390); GTCTGGAGTAGCTGGGATTAC (SEQ ID NO: 391); or GCAGGCGCGCGACACCACGCC (SEQ ID NO: 392).
  • the pegRNA comprises the sequence of any one of SEQ ID NOs: 816-867, or a sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the sequence of any one of SEQ ID NOs: 816-867.
  • the present disclosure provides compositions comprising a prime editor and any of the pegRNAs disclosed herein. Such compositions may be useful, for example, for editing the HTT and FXN genes, and/or for contracting or deleting pathogenic trinucleotide repeat sequences from the HTT and FXN genes.
  • a composition further comprises any of the nicking guide RNAs (ngRNAs) disclosed herein.
  • ngRNAs nicking guide RNAs
  • the present disclosure provides methods of treating Huntington’s disease by prime editing comprising contacting a target nucleotide sequence with any of the prime editor-pegRNA complexes or compositions provided herein.
  • the contacting is performed in a cell (e.g., a cell in vitro, or a cell in a human).
  • the cell is a cell of the central nervous system (e.g., a neuron).
  • the present disclosure provides methods of treating Friedreich’s ataxia by prime editing comprising contacting a target nucleotide sequence with any of the prime editor-pegRNA complexes or compositions provided herein. In some embodiments, the contacting is performed in a cell (e.g., a cell in vitro, or a cell in a human).
  • the present disclosure provides polynucleotides encoding any of the pegRNAs and/or prime editor-pegRNA complexes or compositions provided herein. In some embodiments, the present disclosure provides one or more polynucleotides encoding the pegRNA and the prime editor of any of the complexes provided herein.
  • the present disclosure provides vectors comprising any of the pegRNA- and/or prime editor-encoding polynucleotides provided herein.
  • the present disclosure provides adeno-associated virus (AAV) particles comprising any of the polynucleotides or vectors provided herein.
  • AAV adeno-associated virus
  • the present disclosure provides pharmaceutical compositions comprising any of the pegRNAs, compositions, polynucleotides, vectors, and/or AAV particles provided herein.
  • the present disclosure provides cells comprising any of the pegRNAs, compositions, polynucleotides, vectors, and/or AAV particles provided herein.
  • the present disclosure provides kits comprising any of the pegRNAs, compositions, polynucleotides, vectors, and/or AAV particles provided herein.
  • the present disclosure provides uses of any of the pegRNAs, compositions, polynucleotides, vectors, AAV particles, and/or pharmaceutical compositions provided herein in the treatment of Huntington’s disease (including, for example, adult-onset Huntington’s disease or juvenile Huntington’s disease) or Friedreich’s ataxia.
  • the present disclosure provides uses of any of the pegRNAs, compositions, polynucleotides, vectors, AAV particles, and/or pharmaceutical compositions provided herein in the manufacture of a medicament for the treatment of Huntington’s disease or Friedreich’s ataxia.
  • the present disclosure provides methods of using the prime editors, compositions, polynucleotides, or vectors provided herein in veterinary uses.
  • the present disclosure provides methods of using the prime editors, compositions, polynucleotides, or vectors provided herein in agricultural uses.
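As an illustration of the repeat-length ranges quoted above for HTT (9-35 CAGs in the general population, 36-39 intermediate, 40 or more typical of HD) and FXN (about 5-60 GAAs in the general population, 66 or more in FRDA), the following minimal Python sketch counts the longest uninterrupted run of a triplet and assigns the count to one of those ranges. The function names, the category labels, and the handling of values the text does not cover (for example, 61-65 GAAs) are assumptions made for illustration only; the sketch is not a diagnostic method and is not taken from the disclosure.

```python
import re


def longest_repeat_run(seq: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` found in `seq`."""
    pattern = f"(?:{re.escape(unit.upper())})+"
    runs = re.findall(pattern, seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def classify_htt_cag(repeats: int) -> str:
    """Map a CAG count onto the HTT ranges quoted in the text (labels are illustrative)."""
    if repeats < 9:
        return "below quoted range"
    if repeats <= 35:
        return "general population"
    if repeats <= 39:
        return "intermediate"
    return "HD range"


def classify_fxn_gaa(repeats: int) -> str:
    """Map a GAA count onto the FXN ranges quoted in the text (labels are illustrative)."""
    if repeats <= 60:
        return "general population"
    if repeats >= 66:
        return "FRDA range"
    return "not specified in the text"  # 61-65 repeats fall between the quoted ranges


if __name__ == "__main__":
    # Toy tract: 17 uninterrupted CAGs followed by CAA CAG; flanking bases are hypothetical.
    toy = "TTC" + "CAG" * 17 + "CAACAG" + "CCG"
    n = longest_repeat_run(toy, "CAG")
    print(n, classify_htt_cag(n))
    print(classify_htt_cag(72), classify_fxn_gaa(700))
```

The counts 17 and 72 mirror the HEK293T and mESC repeat lengths mentioned in the figure descriptions; any real analysis would need to handle interrupted repeats and sequencing error, which this sketch ignores.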
  • FIG.1 shows the rationale for reducing CAG expansions in HTT with prime editing.
  • Huntington’s disease (HD) patients carry >35 CAG repeats.
  • Excision of CAG repeats alleviates HD pathology in cell and animal models of HD. This strategy is more precise and safer than other known nuclease-mediated genome editing approaches.
  • FIG.2 provides a schematic of prime editing strategies to reduce CAG repeats. Optimization of the pegRNA spacer is shown. Protospacer sequences upstream and downstream of the CAG tract, PBS lengths of 8-16 nucleotides, and RT templates comprising a CAG insertion + 16-40 nucleotides of homology were tested. Spacer 1 showed the highest editing efficiency above background.
  • FIGs.3A-3B show optimization of the nicking guide for PE3.
  • PegRNA conditions tested include: a protospacer upstream of the CAG tract; an arbitrary PBS length; and RT templates comprising 26 nucleotides of exon 1 of HTT + a 4X CAG insertion + an arbitrary homology length. Of all nicking conditions tested, nick N5 resulted in the highest editing efficiency and the best editing:indel ratio.
  • FIGs.4A-4B show optimization of the PBS and RTT in the pegRNA.
  • PegRNA conditions tested include: the spacer identified with the highest editing efficiency in FIG.2; a silent PAM mutation; PE3 with nicking gRNA N5 identified in FIG.3A; PBS lengths of 8-14 nucleotides; and RT templates comprising a 4X CAG insertion + 16-35 nucleotides homology. Of all conditions tested, a PBS of >8 nucleotides and an RT template of at least 25 nucleotides provided the highest editing efficiency.
  • FIGs.5A-5B show testing of various pegRNA structural motifs.
  • PegRNA conditions tested include: the spacer identified with the highest editing efficiency in FIG.2; a silent PAM mutation; PE3 with nicking gRNA N5 identified in FIG.3A; a PBS length of 10 nucleotides; an RT template comprising a 4X CAG insertion + 31 nucleotides homology; and evopreq1 and mpknot 3′ pegRNA motifs.
  • pegRNAs comprising evopreq1 with a rationally designed linker worked best.
  • FIG.6 shows testing of various pegRNA RT template lengths.
  • PegRNA conditions tested include: the spacer identified with the highest editing efficiency in FIG.2; a silent PAM mutation; PE3 with nicking gRNA N5 identified in FIG.3A; a PBS length of 10 nucleotides; RT templates comprising a 4X CAG insertion + 25-40 nucleotides homology; and an evopreq1 motif.
  • pegRNAs comprising RT templates of 31 nucleotides and 40 nucleotides long provided the best editing efficiencies and the best editing:indel ratios.
  • FIG.7 shows a comparison of editing efficiencies between prime editors with and without PEmax architecture.
  • PegRNA conditions tested include: the spacer identified with the highest editing efficiency in FIG.2; a silent PAM mutation; PE3 with nicking gRNA N5 identified in FIG.3A; a PBS length of 10 nucleotides; RT templates comprising a 4X CAG insertion + 31 or 40 nucleotides homology; and an evopreq1 motif. PEmax outperformed PE2 for epegRNAs with homology of 31 and 40 nucleotides.
  • FIG.8 shows the effect of MLH1dn (PE4 and PE5) on HTT editing. PegRNA conditions tested include those identified in FIG.7. MLH1dn overexpression was observed to improve PE2 editing efficiency.
  • FIG.9 shows HTT editing using dual pegRNAs (i.e., twin prime editing).
  • PegRNA conditions tested include those identified in FIG.7. Single flap prime editing was found to outperform twin prime editing.
  • FIGs.10A-10B show testing increasing sizes of the CAG insertion.
  • PegRNA conditions tested include: the spacer identified with the highest editing efficiency in FIG.2; a silent PAM mutation; PE3 with nicking gRNA N5 identified in FIG.3A; a PBS length of 10 nucleotides; RT templates comprising 2-9X CAG insertions + 31 nucleotides homology; and an evopreq1 motif. Replacement with a smaller number of CAG repeats performed best.
  • FIG.11 shows testing of additional nicking guide options. PegRNA conditions tested include those identified in FIG.7, with various nicking guide RNAs. Nicking guide RNA N5 was still observed to yield the highest editing efficiency. Nicking guide RNA 23NGA3b was observed to yield the best editing:indel ratio, with high editing efficiency.
  • FIGs.12A-12B show improvement of HTT editing through pegRNA scaffold modifications.
  • PegRNA conditions tested include: the spacer identified with the highest editing efficiency in FIG.2; a silent PAM mutation; a PBS length of 10 nucleotides; RT templates comprising 4X or 9X CAG insertions + 31-40 nucleotides homology; and an evopreq1 motif.
  • PegRNAs comprising the U-A flip shown in FIG.12A showed slightly improved editing efficiency.
  • SEQ ID NOs: 882 (top) and 883 (bottom) are shown.
  • FIGs.13A-13B show screening of prime editors with various reverse transcriptase variants for 4X CAG replacement.
  • PegRNA conditions tested include: the spacer identified with the highest editing efficiency in FIG.2; a silent PAM mutation; PE3 with nicking gRNA N5 identified in FIG.3A; a PBS length of 10 nucleotides; an RT template comprising a 4X CAG insertion + 40 nucleotides homology; and an evopreq1 motif.
  • the prime editor comprising the V223Y MMLV reverse transcriptase variant (PEmax Rhdelta with a V223Y substitution) and pRT-5.800max outperformed PEmax in a PE3 system with the N5 nicking gRNA. Both editors are smaller than PEmax.
  • FIG.14 shows screening of prime editors with various reverse transcriptase variants for 9X CAG replacement.
  • PegRNA conditions tested include: the spacer identified with the highest editing efficiency in FIG.2; a silent PAM mutation; PE3 with nicking gRNA N5 identified in FIG.3A; a PBS length of 10 nucleotides; an RT template comprising a 9X CAG insertion + 40 nucleotides homology; and an evopreq1 motif.
  • the prime editor comprising the V223Y MMLV reverse transcriptase variant (PEmax Rhdelta with a V223Y substitution) and pRT-5.800max outperformed PEmax in a PE3 system with the N5 nicking gRNA.
  • FIGs.15A-15B show improvement of HTT editing by rational design of pegRNA sequences.
  • PegRNA conditions tested include: the spacer identified with the highest editing efficiency in FIG.2; a silent PAM mutation; PE3 with nicking gRNA N5 identified in FIG. 3A; a PBS length of 10 nucleotides; RT templates comprising a 9X CAG insertion + 31-40 nucleotides homology; and an evopreq1 motif.
  • PegRNAs with silent mutations in the synthesis template of the RT template sequence generally improve editing efficiency.
  • FIGs.16A-16C show improvement of HTT editing by modifying the CAG insertion sequence.
  • FIG.16A SEQ ID NOs: 884 (nt-top), 885 (aa-middle) and 886 (nt-bottom) are shown.
  • FIG.16B SEQ ID NOs: 887-891 (top-bottom) are shown.
  • FIGs.17A-17B show further improvement of HTT editing by rational design of pegRNA sequences. PegRNA conditions tested include those used in FIGs.15A-15B. PegRNAs with silent mutations in the synthesis template of the RT template sequence generally improve editing efficiency.
  • SEQ ID NOs: 892 (nt-top), 893 (aa-middle) and 894 (nt-bottom) are shown.
  • FIGs.18A-18B show an optimized pegRNA strategy for 6Q (six repeats of the sequence CAG) replacement.
  • Optimized pegRNAs include: the spacer identified with the highest editing efficiency in FIG.2; a silent PAM mutation; PE3 with nicking gRNA N5 identified in FIG.3A; a PBS length of 10 nucleotides; an RT template comprising a 4X CAG insertion + 31 nucleotides homology; and an evopreq1 motif.
  • SEQ ID NOs: 892 (nt-top), 893 (aa-middle) and 894 (nt-bottom) are shown.
  • FIGs.19A-19B show an optimized pegRNA strategy for 11Q replacement.
  • Optimized pegRNAs include: the spacer identified with the highest editing efficiency in FIG. 2; a silent PAM mutation; PE3 with nicking gRNA N5 identified in FIG.3A; a PBS length of 10 nucleotides; an RT template comprising a 9X CAG insertion + 31 or 40 nucleotides homology; and an evopreq1 motif.
  • SEQ ID NOs: 892 (nt-top), 893 (aa-middle) and 894 (nt-bottom) are shown.
  • FIGs.20A-20C show optimized conditions for using prime editing to reduce CAG repeats.
  • Optimal pegRNA designs include: 1) a protospacer upstream of the CAG tract; 2) a PBS of 10 nucleotides in length; 3) an RT template comprising a 26-nucleotide constant region with a PAM edit and silent mutations, CAG insertions of 4X CAG or 4CAG + 3CAA, and 31 or 40 nucleotides of homology; 4) a tevopreq1 motif; and 5) a U-A flip in the pegRNA scaffold as shown in FIG.12A.
  • Optimal PE2max editor designs include a truncated MMLV (RNAseHdelta) with a V223Y mutation.
  • Optimal PE3 system designs include the N5 nicking guide RNA and PE3b editor with an edit-specific nicking guide (NGA PAM).
  • SEQ ID NOs: 895-899 (top-bottom) are shown.
  • FIG.21 shows additional screens of PE variants using the PE3 4CAG strategy.
  • PE variant V223Y outperformed all other variants in the PE3 system with the N5 nicking guide for 4CAG replacement.
  • FIG.22 shows additional screens of PE variants using the PE3 4CAG strategy with limited PE to enhance differences.
  • PE variant V223Y with eeCas9-8 outperformed PE variant V223Y and all other variants in the PE3 system with the N5 nicking guide for 4CAG replacement, in the context of limited PE plasmid.
  • FIG.23 shows additional screens of PE variants using the PE3 9CAG strategy.
  • PE variant V223Y outperformed all other variants in the PE3 system with the N5 nicking guide for 9CAG replacement.
  • FIG.24 shows additional screens of PE variants using the PE3 9CAG strategy with limited PE to enhance differences.
  • FIG.25 shows additional screens of PE variants using the PE3b 4CAG strategy.
  • PE variant V223Y eeCas9-8 outperformed all other variants in the PE3b system with the 23b nicking guide for 4CAG replacement.
  • FIG.26 shows additional screens of PE variants using the PE3b 4CAG strategy with limited PE to enhance differences.
  • FIG.27 shows additional screens of PE variants using the PE3b 9CAG strategy.
  • PE variant V223Y eeCas9-8 outperformed all other variants in the PE3b system with the 23b nicking guide for 9CAG replacement.
  • FIG.28 shows additional screens of PE variants using the PE3b 9CAG strategy with limited PE to enhance differences.
  • FIGs.29A-29B show a second-generation PE strategy to reduce CAG repeats.
  • PegRNA design includes: a protospacer upstream of the CAG tract; a PBS of 10 nucleotides in length; an RT template comprising a 26 nucleotide constant region with PAM edit and silent mutations, a CAG insertion of 4CAG or 9CAR (4CAG + 3CAA), and 31 nucleotides or 40 nucleotides homology; a tevopreq1 motif; and a UA flip in the pegRNA scaffold.
  • PE2max editor is used in the PE3b system with the N5 nicking gRNA and NGA PAM.
  • FIG.30 shows the construct of HTT mESC cells with 21 or 72 CAG repeats. SEQ ID NOs: 900-902 (top-bottom) are shown.
  • FIGs.31A-31B show prime editing of the CAG expansion in HTT mESC cells.
  • FIGs.32A-32B show prime editing of the CAG expansion in HTT mESC cells.
  • FIG.32A shows that pegRNAs with homology of 40 nucleotides show better editing efficiency in HTT mESC cells.
  • An extra nicking guide RNA did not improve editing efficiency. Additionally, editing of longer alleles is more efficient.
  • FIG.32B shows that in contrast to HEK293T cells with only 17 CAGs or mESC with 21 CAGs, a replacement of a long CAG tract (mESC with 72 CAGs) is nearly as efficient as a replacement with a short repeat sequence.
  • FIG.33 provides a schematic for editing of CAG repeats in vivo.
  • FIGs.34A-34B show improvement of prime editing of CAG repeats in vivo from the first strategy developed (htt-v1) to the second strategy developed (htt-v2). Compared to htt-v1, the htt-v2 strategy employs PE V223Y with a further optimized pegRNA, an additional nicking guide (PE3), and a stronger promoter in the AAV architecture.
  • FIGs.35A-35C show prime editing of CAG repeats in vivo using the htt-v2 strategy. The htt-v2 strategy yielded some editing activity in the brain and liver.
  • FIGs.36A-36C show prime editing of CAG repeats in vivo using longer treatment time and improved transduction efficiency. A treatment length of 8 weeks was used. The htt-v2 strategy yielded good transduction efficiency.
  • FIG.37 provides a schematic for a third-generation strategy for prime editing of CAG repeats in vivo that employs the use of the PE3b system (htt-v3).
  • FIGs.38A-38C show analysis of impure editing results during prime editing of CAG repeats in vivo.
  • FIG.38A SEQ ID NOs: 926-927 (top-bottom) are shown.
  • FIGs.39A-39D show comparison of the htt-v3 strategy to the htt-v2 strategy. The htt-v3 strategy yielded better editing efficiency and distribution of correct edits.
  • FIGs.40A-40B show testing of additional prime editor variants to develop a further improved strategy for prime editing of CAG repeats in vivo (htt-v4).
  • Htt-4a: truncated MMLV reverse transcriptase V223Y + eeCas9-1.
  • Htt-4b: truncated MMLV reverse transcriptase V223Y + eeCas9-8.
  • FIG.41 shows the rationale for removal of GAA expansions in FXN with prime editing. Friedreich’s ataxia (FRDA or FA) patients carry >65 GAA repeats. Long GAA repeats reduce frataxin mRNA and protein.
  • FIG.42 shows optimization of the spacer sequence for removal of GAA repeats by prime editing.
  • PegRNA conditions tested include: protospacers upstream and downstream of the GAA tract; PBS lengths of 8-14 nucleotides; and RT template lengths of 8-40 nucleotides. The pegRNAs were used for deletion of the GAA region and flanking sequence.
  • FIGs.43A-43B show optimization of the nicking guide for PE3.
  • PegRNA conditions tested include: the spacer forward 1 sequence identified in FIG.42; arbitrary PBS lengths; and RT templates comprising arbitrary lengths of homology.
  • nick C resulted in the highest editing efficiency (equivalent to PE2).
  • SEQ ID NOs: 905, 908-909 (top-bottom) are shown.
  • FIGs.44A-44B show optimization of the PBS and RT template in the pegRNA for FXN editing.
  • PegRNA conditions tested include: the spacer forward 1 sequence identified in FIG.42; PE2 or PE3 with nick C; PBS lengths of 8-14 nucleotides; and RT template lengths of 32-40 nucleotides. Of all conditions tested, an RT template of 40 nucleotides paired with a PBS of 10 nucleotides resulted in the highest editing efficiency.
  • SEQ ID NOs: 910 (top) and 911 (bottom) are shown.
  • FIGs.45A-45B show optimization of pegRNA structural motifs for FXN editing.
  • PegRNA conditions tested include: the spacer forward 1 sequence identified in FIG.42; PE2 or PE3 with nick C; a PBS length of 10 nucleotides; an RT template length of 40 nucleotides; and an evopreq1 motif.
  • a 3′ evopreq1 motif improved editing efficiency in both PE2 and PE3 systems.
  • Use of a linker generated by pegLIT resulted in slightly higher editing efficiency compared to a rationally designed linker.
  • FIG.46 shows a comparison of prime editing FXN with and without PEmax architecture.
  • PegRNA conditions tested include those identified in FIGs.45A-45B, with a pegLIT-designed linker joining the evopreq1 motif to the pegRNA.
  • FIG.47 shows the effect of MLH1dn (PE4 and PE5) on FXN editing.
  • PegRNA conditions tested include those utilized in FIG.46.
  • MLH1dn overexpression slightly improved PE2 editing efficiency.
  • FIG.48 shows FXN editing using dual pegRNAs (twin prime editing).
  • PegRNA conditions tested include those utilized in FIG.46.
  • FIGs.49A-49B show a comparison of PE3 and PE3b for FXN editing. PegRNA conditions tested include those utilized in FIG.46. Use of PE3b resulted in higher editing efficiency than PE3.
  • FIGs.50A-50B show improvement of FXN editing using pegRNA scaffold modifications.
  • PegRNA conditions tested include: the spacer forward 1 sequence identified in FIG.42; PE2 or PE3b with nick 3b; a PBS length of 10 nucleotides; an RT template length of 40 nucleotides; and an evopreq1 motif with a pegLIT-designed linker.
  • the U-A flip in the pegRNA scaffold, as shown in FIG.50A, improves editing efficiency in both the PE2 and PE3b systems.
  • SEQ ID NOs: 882 (top) and 883 (bottom) are shown.
  • FIGs.51A-51B show screening of prime editors with reverse transcriptase variants.
  • Prime editor conditions tested include those used in FIGs.50A-50B, including the U-A flip in the pegRNA scaffold as shown in FIG.50A.
  • MMLV reverse transcriptase comprising a V223Y mutation (PEmax Rhdelta with V223Y) and pRT-5.800max outperform PEmax in the PE3b system with a 3b nicking guide. Both editors are smaller than PEmax.
  • FIGs.52A-52B show optimized conditions for using prime editing to excise GAA repeats.
  • Optimal pegRNA conditions include: 1) a protospacer upstream of the GAA tract; 2) a PBS of 10 nucleotides in length; 3) an RT template comprising a 40-nucleotide region of homology; 4) a tevopreq1 motif; and 5) a U-A flip in the pegRNA scaffold as shown in FIG. 50A.
  • Optimal conditions for the prime editor include use of a PE2max editor comprising truncated MMLV (RNAseHdelta) with a V223Y mutation and the typical Cas9 nickase used in PE2max, eeCas9-1, or eeCas9-8, or a PE3b editor with an edit-specific nicking guide.
  • In FIG.52B, SEQ ID NOs: 912 (top) and 913 (bottom) are shown.
  • FIG.53 shows construction of FXN mESC cells with 30 GAA repeats. SEQ ID NOs: 914, 915, 916, 915, and 917 (top-bottom) are shown.
  • FIG.54 shows prime editing of the GAA expansion in FXN mESC cells. Conditions tested include the use of PE2max, epegRNA h40p10, and nick C and nick 3b.
  • FIG.55 shows results of prime editing of the GAA expansion in FXN mESC cells. pegRNAs with homology of 40 nucleotides worked well in FXN mESC cells, and extra nicking did not improve editing efficiency. Additionally, editing of longer alleles (30 GAAs) was more efficient than editing in HEK293T cells (9 GAAs). PEmax V223Y Rhdelta performs better than or similar to PE2max and has the advantage of being of a smaller size.
  • FIG.56 shows screening of prime editors comprising Cas9 variants for FXN editing.
  • the editing strategy includes the following: 1) the protospacer of spacer 1 from FIG.42; 2) PE3b prime editor with nicking guide RNA; 3) a PBS of 10 nucleotides; 4) a reverse transcriptase template of 40 nucleotides; 5) an evopreq1 motif and pegLIT linker in the pegRNA; 6) a UA scaffold flip in the pegRNA; and 7) prime editor architecture of PE2max comprising truncated MMLV (RNaseHdelta) with a V223Y mutation.
  • FIG.57 shows construction of FXN mESCs with 30, 60, and 200 GAA repeats.
  • FIGs.58A-58D show prime editing of GAA repeats in vitro.
  • FIG.59 shows prime editing of the GAA expansion in FXN mESC cells.
  • In FIG.58A, SEQ ID NOs: 914-917 (top-bottom) are shown.
  • FIGs.60A-60B show prime editing of GAA repeats in FRDA fibroblasts.
  • FIGs.61A-61C show further data for prime editing of GAA repeats in FRDA fibroblasts.
  • FIG.61A shows Cas9 nuclease editing of GAA repeats.
  • FIG.61B shows prime editing of GAA repeats in FRDA fibroblasts.
  • FIG.61C shows FXN expression in FRDA fibroblasts.
  • FIGs.62A-62B show delivery approach for prime editing of GAA repeats in vivo.
  • FIGs.63A-63C show optimization of strategy for prime editing of GAA repeats in vivo.
  • FIGs.64A-64C show further data for optimization of prime editing of GAA repeats in vivo.
  • FIG.64A provides a schematic for ICV injection of mice for prime editing of GAA repeats.
  • FIG.64B shows FXN prime editing in YG8 mice via injection of AAV9 delivery system.
  • FIG.64C shows AAV9 transduction efficiency in the cortex of YG8 mice.
  • FIGs.65A-65C show FXN expression in prime edited YG8 mice.
  • FIG.65A shows FXN prime editing in the liver of YG8s mice.
  • FIG.65B shows FXN expression in the liver of YG8.GAA300 mice.
  • FIG.65C shows FXN expression in the liver of YG8.GAA800 mice.
  • DEFINITIONS [0093] Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by a person skilled in the art to which this invention belongs.
  • Adeno-Associated Virus [0094] An “adeno-associated virus” or “AAV” is a virus that infects humans and some other primate species.
  • the wild-type AAV genome is a single-stranded deoxyribonucleic acid (ssDNA), either positive- or negative-sensed.
  • the genome comprises two inverted terminal repeats (ITRs), one at each end of the DNA strand, and two open reading frames (ORFs): rep and cap between the ITRs.
  • the rep ORF comprises four overlapping genes encoding Rep proteins required for the AAV life cycle.
  • the cap ORF comprises overlapping genes encoding capsid proteins: VP1, VP2, and VP3, which interact together to form the viral capsid.
  • VP1, VP2, and VP3 are translated from one mRNA transcript, which can be spliced in two different manners: either a longer or shorter intron can be excised resulting in the formation of two isoforms of mRNAs: a ~2.3 kb- and a ~2.6 kb-long mRNA isoform.
  • the capsid forms a supramolecular assembly of approximately 60 individual capsid protein subunits into a non-enveloped, T=1 icosahedral lattice capable of protecting the AAV genome.
  • Recombinant AAV (rAAV) particles may comprise a nucleic acid vector (e.g., a recombinant genome), which may comprise at a minimum: (a) one or more heterologous nucleic acid regions comprising a sequence encoding a protein or polypeptide of interest (e.g., a split prime editor) or an RNA of interest (e.g., a gRNA), or one or more nucleic acid regions comprising a sequence encoding a Rep protein; and (b) one or more regions comprising inverted terminal repeat (ITR) sequences (e.g., wild-type ITR sequences or engineered ITR sequences) flanking the one or more nucleic acid regions (e.g., heterologous nucleic acid regions).
  • the nucleic acid vector is between 4 kb and 5 kb in size (e.g., 4.2 to 4.7 kb in size). In some embodiments, the nucleic acid vector further comprises a region encoding a Rep protein. In some embodiments, the nucleic acid vector is circular. In some embodiments, the nucleic acid vector is single-stranded. In some embodiments, the nucleic acid vector is double-stranded. In some embodiments, a double-stranded nucleic acid vector may be, for example, a self-complementary vector that contains a region of the nucleic acid vector that is complementary to another region of the nucleic acid vector, initiating the formation of the double-stranded nucleic acid vector.
  • Cas9 refers to an RNA-guided nuclease comprising a Cas9 domain, or a fragment thereof (e.g., a protein comprising an active or inactive DNA cleavage domain of Cas9, and/or the gRNA binding domain of Cas9).
  • a “Cas9 domain,” as used herein, is a protein fragment comprising an active or fully or partly inactive cleavage domain of Cas9 and/or the gRNA binding domain of Cas9.
  • a “Cas9 protein” is a full length Cas9 protein.
  • a Cas9 nuclease is also referred to sometimes as a casn1 nuclease or a CRISPR (Clustered Regularly Interspaced Short Palindromic Repeat)-associated nuclease.
  • CRISPR is an adaptive immune system that provides protection against mobile genetic elements (viruses, transposable elements, and conjugative plasmids).
  • CRISPR clusters contain spacers, sequences complementary to antecedent mobile elements, and target invading nucleic acids.
  • CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA).
  • correct processing of pre-crRNA requires a trans-encoded small RNA (tracrRNA), endogenous ribonuclease 3 (rnc), and a Cas9 domain. The tracrRNA serves as a guide for ribonuclease 3-aided processing of pre-crRNA.
  • Cas9/crRNA/tracrRNA endonucleolytically cleaves a linear or circular dsDNA target complementary to the spacer.
  • the strand in the target DNA not complementary to crRNA is first cut endonucleolytically, then trimmed 3′-5′ exonucleolytically.
  • DNA-binding and cleavage typically requires protein and both RNAs.
  • single guide RNAs (“sgRNA”, or simply “gRNA”) can be engineered so as to incorporate aspects of both the crRNA and tracrRNA into a single RNA species. See, e.g., Jinek M., Chylinski K., Fonfara I., Hauer M., Doudna J.A., Charpentier E. Science 337:816-821(2012), the contents of which are incorporated herein by reference.
  • Cas9 recognizes a short motif in the CRISPR repeat sequences (the PAM or protospacer adjacent motif) to help distinguish self versus non-self.
  • Cas9 nuclease sequences and structures are well known to those of skill in the art (see, e.g., “Complete genome sequence of an M1 strain of Streptococcus pyogenes.” Ferretti et al., J.J., McShan W.M., Ajdic D.J., Savic D.J., Savic G., Lyon K., Primeaux C., Sezate S., Suvorov A.N., Kenton S., Lai H.S., Lin S.P., Qian Y., Jia H.G., Najar F.Z., Ren Q., Zhu H., Song L., White J., Yuan X., Clifton S.W., Roe B.A., McLaughlin R.E., Proc.
  • Cas9 orthologs have been described in various species, including, but not limited to, S. pyogenes and S. thermophilus. Additional suitable Cas9 nucleases and sequences will be apparent to those of skill in the art based on this disclosure, and such Cas9 nucleases and sequences include Cas9 sequences from the organisms and loci disclosed in Chylinski, Rhun, and Charpentier, “The tracrRNA and Cas9 families of type II CRISPR-Cas immunity systems” (2013) RNA Biology 10:5, 726-737; the entire contents of which are incorporated herein by reference.
  • a Cas9 nuclease comprises one or more mutations that partially impair or inactivate the DNA cleavage domain.
  • a nuclease-inactivated Cas9 domain may interchangeably be referred to as a “dCas9” protein (for nuclease-“dead” Cas9).
  • Methods for generating a Cas9 domain (or a fragment thereof) having an inactive DNA cleavage domain are known (see, e.g., Jinek et al., Science.
  • the DNA cleavage domain of Cas9 is known to include two subdomains, the HNH nuclease subdomain and the RuvC1 subdomain.
  • the HNH subdomain cleaves the strand complementary to the gRNA, whereas the RuvC1 subdomain cleaves the non-complementary strand. Mutations within these subdomains can silence the nuclease activity of Cas9.
  • proteins comprising fragments of a Cas9 protein are provided.
  • a protein comprises one of two Cas9 domains: (1) the gRNA binding domain of Cas9; or (2) the DNA cleavage domain of Cas9.
  • proteins comprising Cas9, or fragments thereof are referred to as “Cas9 variants.”
  • a Cas9 variant shares homology to Cas9, or a fragment thereof.
  • a Cas9 variant is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, at least about 99.8% identical, or at least about 99.9% identical to wild type Cas9 (e.g., SpCas9 of SEQ ID NO: 6).
  • the Cas9 variant may have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, or more amino acid changes compared to wild type Cas9 (e.g., SpCas9 of SEQ ID NO: 6).
  • the Cas9 variant comprises a fragment of SEQ ID NO: 6 Cas9 (e.g., a gRNA binding domain or a DNA-cleavage domain), such that the fragment is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to the corresponding fragment of wild type Cas9 (e.g., SpCas9 of SEQ ID NO: 6).
  • the fragment is at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% of the amino acid length of a corresponding wild type Cas9 (e.g., SpCas9 of SEQ ID NO: 6).
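  • The percent-identity thresholds recited above for Cas9 variants can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (gaps written as "-") and counts identity only over ungapped columns; other conventions for the denominator exist, and the specification does not prescribe one. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NO: 6.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: inputs are pre-aligned, equal-length strings with '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy five-residue example with one substitution (illustrative only).
toy_reference = "MKRTA"
toy_variant = "MKQTA"
identity = percent_identity(toy_reference, toy_variant)
```

  • In practice the alignment itself would come from a dedicated alignment tool; this sketch covers only the final counting step.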
  • CRISPR is a family of DNA sequences (i.e., CRISPR clusters) in bacteria and archaea that represent snippets of prior infections by viruses that have invaded the prokaryote.
  • the snippets of DNA are used by the prokaryotic cell to detect and destroy DNA from subsequent attacks by similar viruses and effectively compose, along with an array of CRISPR-associated proteins (including Cas9 and homologs thereof) and CRISPR-associated RNA, a prokaryotic immune defense system.
  • CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA).
  • correct processing of pre-crRNA requires a trans-encoded small RNA (tracrRNA), endogenous ribonuclease 3 (rnc), and a Cas9 protein.
  • the tracrRNA serves as a guide for ribonuclease 3-aided processing of pre-crRNA.
  • Cas9/crRNA/tracrRNA endonucleolytically cleaves a linear or circular dsDNA target complementary to the RNA. Specifically, the DNA strand in the target that is not complementary to crRNA is first cut endonucleolytically, then trimmed 3′-5′ exonucleolytically.
  • DNA-binding and cleavage typically requires protein and both RNAs.
  • single guide RNAs (“sgRNA”, or simply “gRNA”) can be engineered so as to incorporate aspects of both the crRNA and tracrRNA into a single RNA species – the guide RNA.
  • Cas9 recognizes a short motif in the CRISPR repeat sequences (the PAM or protospacer adjacent motif) to help distinguish self versus non-self.
  • Cas9 orthologs have been described in various species, including, but not limited to, S. pyogenes and S. thermophilus. Additional suitable Cas9 nucleases and sequences will be apparent to those of skill in the art based on this disclosure, and such Cas9 nucleases and sequences include Cas9 sequences from the organisms and loci disclosed in Chylinski, Rhun, and Charpentier, “The tracrRNA and Cas9 families of type II CRISPR-Cas immunity systems” (2013) RNA Biology 10:5, 726-737; the entire contents of which are incorporated herein by reference.
  • a “CRISPR system” refers collectively to transcripts and other elements involved in the expression of or directing the activity of CRISPR-associated (“Cas”) genes, including sequences encoding a Cas gene, a tracr (trans-activating CRISPR) sequence (e.g., tracrRNA or an active partial tracrRNA), a tracr mate sequence (encompassing a “direct repeat” and a tracrRNA-processed partial direct repeat in the context of an endogenous CRISPR system), a guide sequence (also referred to as a “spacer”), or other sequences and transcripts from a CRISPR locus.
  • DNA synthesis template or Reverse Transcriptase Template (RTT)
  • “DNA synthesis template” and “reverse transcriptase template (RTT)” refer to the region or portion of the extension arm of a PEgRNA that is utilized as a template by a polymerase of a prime editor to encode a 3′ single-strand DNA flap that contains the desired edit and which then, through the mechanism of prime editing, replaces the corresponding endogenous strand of DNA at the target site.
  • the extension arm including the DNA synthesis template, may be comprised of DNA or RNA.
  • the polymerase of the prime editor can be an RNA-dependent DNA polymerase (e.g., a reverse transcriptase).
  • the polymerase of the prime editor can be a DNA-dependent DNA polymerase.
  • the DNA synthesis template may comprise the “edit template” and the “homology arm”, and all or a portion of an optional 5′ end modifier region and/or an optional 3′ end modifier region.
  • the DNA synthesis template can include the portion of the extension arm that spans from the 5′ end of the primer binding site (PBS) to 3′ end of the gRNA core that may operate as a template for the synthesis of a single-strand of DNA by a polymerase (e.g., a reverse transcriptase).
  • the DNA synthesis template can include the portion of the extension arm that spans from the 5′ end of the PEgRNA molecule to the 5′ end of the PBS.
  • an RT template may be used to refer to a template polynucleotide for reverse transcription, e.g., in a prime editing system, complex, or method using a prime editor having a polymerase that is a reverse transcriptase.
  • a DNA synthesis template may be used to refer to a template polynucleotide for DNA polymerization, e.g., RNA-dependent DNA polymerization or DNA-dependent polymerization, e.g., in a prime editing system, complex, or method using a prime editor having a polymerase that is an RNA-dependent DNA polymerase or a DNA-dependent DNA polymerase.
  • the term “edit template” refers to a portion of the extension arm that encodes the desired edit in the single strand 3′ DNA flap that is synthesized by the polymerase, e.g., a DNA-dependent DNA polymerase or an RNA-dependent DNA polymerase (e.g., a reverse transcriptase).
  • DNA synthesis template refers to the region or portion of the extension arm of a pegRNA that is utilized as a template strand by a polymerase of a prime editor to encode a 3′ single-strand DNA flap that contains the desired edit and which then, through the mechanism of prime editing, replaces the corresponding endogenous strand of DNA at the target site.
  • the extension arm including the DNA synthesis template, may be comprised of DNA or RNA.
  • the polymerase of the prime editor can be an RNA-dependent DNA polymerase (e.g., a reverse transcriptase).
  • the polymerase of the prime editor can be a DNA-dependent DNA polymerase.
  • the DNA synthesis template comprises an “edit template” and a “homology arm.”
  • the DNA synthesis template may comprise the “edit template” and a “homology arm”, and all or a portion of the optional 5′ end modifier region, e2. That is, depending on the nature of the e2 region (e.g., whether it includes a hairpin, toeloop, or stem/loop secondary structure), the polymerase may encode none, some, or all of the e2 region, as well.
  • the DNA synthesis template can include the portion of the extension arm that spans from the 5′ end of the primer binding site (PBS) to 3′ end of the gRNA core that may operate as a template for the synthesis of a single-strand of DNA by a polymerase (e.g., a reverse transcriptase).
  • the DNA synthesis template can include the portion of the extension arm that spans from the 5′ end of the pegRNA molecule to the 3′ end of the edit template.
  • the DNA synthesis template excludes the primer binding site (PBS) of pegRNAs either having a 3′ extension arm or a 5′ extension arm.
  • an RT template is inclusive of the edit template and the homology arm, i.e., the sequence of the pegRNA extension arm which is actually used as a template during DNA synthesis.
  • the term “RT template” is equivalent to the term “DNA synthesis template.”
  • an RT template may be used to refer to a template polynucleotide for reverse transcription, e.g., in a prime editing system, complex or method using a prime editor having a polymerase that is a reverse transcriptase.
  • a DNA synthesis template may be used to refer to a template polynucleotide for DNA polymerization, e.g., RNA-dependent DNA polymerization or DNA-dependent polymerization, e.g., in a prime editing system, complex, or method using a prime editor having a polymerase that is an RNA-dependent DNA polymerase or a DNA-dependent DNA polymerase.
  • the DNA synthesis template is a single-stranded portion of the PEgRNA that is 5′ of the PBS and comprises a region of complementarity to the PAM strand (i.e., the non-target strand or the edit strand), and comprises one or more nucleotide edits compared to the endogenous sequence of the double stranded target DNA.
  • the DNA synthesis template is complementary or substantially complementary to a sequence on the non-target strand that is downstream of a nick site, except for one or more non-complementary nucleotides at the intended nucleotide edit positions.
  • the DNA synthesis template is complementary or substantially complementary to a sequence on the non-target strand that is immediately downstream (i.e., directly downstream) of a nick site, except for one or more non-complementary nucleotides at the intended nucleotide edit positions. In some embodiments, one or more of the non- complementary nucleotides at the intended nucleotide edit positions are immediately downstream of a nick site. In some embodiments, the DNA synthesis template comprises one or more nucleotide edits relative to the double-stranded target DNA sequence. In some embodiments, the DNA synthesis template comprises one or more nucleotide edits relative to the non-target strand of the double-stranded target DNA sequence.
  • a nick site is characteristic of the particular napDNAbp with which the gRNA core of the PEgRNA associates, and is characteristic of the particular PAM required for recognition and function of the napDNAbp.
  • the nick site is in the phosphodiester bond between bases three (“-3” position relative to position 1 of the PAM sequence) and four (“-4” position relative to position 1 of the PAM sequence).
  • the DNA synthesis template and the primer binding site are immediately adjacent to each other.
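  • The nick-site position described above (between the “-3” and “-4” bases relative to position 1 of the PAM) can be sketched in Python. This is an illustration under stated assumptions, not a method from the disclosure: it assumes an NGG PAM and a 20-nucleotide protospacer, neither of which is specified in the passage above, and it scans only the strand supplied by the caller. The function name `find_nick_sites` and the example sequence are hypothetical.

```python
import re


def find_nick_sites(pam_strand: str, spacer_len: int = 20):
    """Locate candidate nick sites on the PAM-containing strand (5'->3').

    Assumptions (not from the source): NGG PAM, `spacer_len`-nt protospacer.
    Returns a list of (protospacer, pam, nick_index) tuples, where the nick
    falls between pam_strand[nick_index - 1] (position -4) and
    pam_strand[nick_index] (position -3).
    """
    seq = pam_strand.upper()
    sites = []
    # Zero-width lookahead so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough upstream sequence for a full protospacer
        protospacer = seq[pam_start - spacer_len:pam_start]
        nick_index = pam_start - 3  # three bases upstream of PAM position 1
        sites.append((protospacer, match.group(1), nick_index))
    return sites


# Hypothetical target: a 20-nt protospacer, a TGG PAM, then flanking sequence.
example_strand = "ACGTACGTACGTACGTACGT" + "TGG" + "ATAT"
example_sites = find_nick_sites(example_strand)
```

  • The bases upstream of `nick_index` end in the free 3′ end that serves as the primer, and the bases from `nick_index` onward are the region the newly synthesized flap would replace.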
  • nucleotide edit refers to a specific nucleotide edit, e.g., a specific deletion of one or more nucleotides, a specific insertion of one or more nucleotides, a specific substitution(s) of one or more nucleotides, or a combination thereof, at a specific position in a DNA synthesis template of a PEgRNA to be incorporated in a target DNA sequence.
  • the DNA synthesis template comprises more than one nucleotide edit relative to the double-stranded target DNA sequence.
  • each nucleotide edit is a specific nucleotide edit at a specific position in the DNA synthesis template, each nucleotide edit is at a different specific position relative to any of the other nucleotide edits in the DNA synthesis template, and each nucleotide edit is independently selected from a specific deletion of one or more nucleotides, a specific insertion of one or more nucleotides, a specific substitution(s) of one or more nucleotides, or a combination thereof.
  • a nucleotide edit may refer to the edit on the DNA synthesis template as compared to the sequence on the target strand of the double stranded target DNA, or may refer to the edit encoded by the DNA synthesis template on the newly synthesized single stranded DNA that replaces the endogenous target DNA sequence on the non-target strand.
  • Edit strand and non-edit strand [0103]
  • the terms “edit strand” and “non-edit strand” are terms that may be used when describing the mechanism of action of a prime editing system on a double-stranded DNA substrate.
  • the “edit strand” refers to the strand of DNA which is nicked by the prime editor complex to form a 3′ end, which is then extended as a newly synthesized single stranded DNA (also referred herein as the newly synthesized 3′ DNA flap), which comprises a desired edit and ultimately displaces and replaces the single strand region of DNA just downstream of the nick, thereby installing the 3′ DNA flap containing the desired edit downstream of the nick on the “edit strand.”
  • the newly synthesized 3′ DNA flap comprising the nucleotide edit is paired in a heteroduplex with the non-edit strand that does not comprise the nucleotide edit, thereby creating a mismatch.
  • the mismatch is recognized by DNA repair machinery, and/or replication machinery, e.g., an endogenous DNA repair machinery.
  • the intended nucleotide edit is incorporated into both strands of the target double-stranded DNA substrate.
  • the application may also refer to the “edit strand” as the “protospacer strand” or the “PAM strand” since these elements are present on that strand.
  • the “edit strand” may also be called the “non-target strand” since the edit strand is not the strand that becomes annealed to the spacer of the PEgRNA molecule, but rather is the complement of the strand that is annealed by the spacer of the PEgRNA.
  • extension arm refers to a nucleotide sequence component of a PEgRNA which comprises a primer binding site (PBS) and a DNA synthesis template for a polymerase (e.g., an RT template for reverse transcriptase).
  • the extension arm is located at the 3′ end of the guide RNA. In other embodiments, the extension arm is located at the 5′ end of the guide RNA. In some embodiments, the extension arm comprises a DNA synthesis template and a primer binding site. In some embodiments, the extension arm comprises the following components in a 5′ to 3′ direction: the DNA synthesis template, and the primer binding site. In some embodiments, the extension arm also includes a homology arm. In various embodiments, the extension arm comprises the following components in a 5′ to 3′ direction: the homology arm, the edit template, and the primer binding site.
  • the extension arm may be described as comprising generally two regions: a primer binding site (PBS) and a DNA synthesis template, for instance.
  • the primer binding site binds to a primer sequence, for example, a single stranded primer sequence containing a free 3′ end at the nick site that is formed from the endogenous DNA strand of the target site when it becomes nicked by the prime editor complex, thereby exposing a 3′ end on the endogenous nicked strand.
  • the binding of the primer sequence to the primer binding site on the extension arm of the PEgRNA creates a duplex region with an exposed 3′ end (i.e., the 3′ of the primer sequence), which then provides a substrate for a polymerase to begin polymerizing a single strand of DNA from the exposed 3′ end along the length of the DNA synthesis template.
  • the sequence of the single strand DNA product is the complement of the DNA synthesis template.
  • Polymerization continues towards the 5′ of the DNA synthesis template (or extension arm) until polymerization terminates.
  • the DNA synthesis template represents the portion of the extension arm that is encoded into a single strand DNA product (i.e., the 3′ single strand DNA flap containing the desired nucleotide edit) by the polymerase of the prime editor complex and that ultimately replaces the corresponding endogenous DNA strand of the target site that sits immediately downstream of the PE-induced nick site.
  • polymerization of the DNA synthesis template continues towards the 5′ end of the extension arm until a termination event.
  • Polymerization may terminate in a variety of ways, including, but not limited to (a) reaching a 5′ terminus of the PEgRNA (e.g., in the case of the 5′ extension arm wherein the DNA polymerase simply runs out of template), (b) reaching an impassable RNA secondary structure (e.g., hairpin or stem/loop), or (c) reaching a replication termination signal, e.g., a specific nucleotide sequence that blocks or inhibits the polymerase, or a nucleic acid topological signal, such as supercoiled DNA or RNA.
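  • The relationship between the primer, the primer binding site, the DNA synthesis template, and the synthesized flap can be sketched in Python. The sketch assumes a 3′ extension arm made of RNA, ordered 5′ to 3′ as DNA synthesis template followed by PBS, and a reverse transcriptase that copies the whole template; termination events are not modeled. All function names and the short sequences are hypothetical.

```python
# Complement tables for reading an RNA template into DNA and vice versa.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}
DNA_TO_RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def pbs_for_primer(primer_dna: str) -> str:
    """RNA primer binding site (5'->3') that anneals to a DNA primer (5'->3')."""
    return "".join(DNA_TO_RNA_COMPLEMENT[b] for b in reversed(primer_dna.upper()))


def build_extension_arm(template_rna: str, pbs_rna: str) -> str:
    """3' extension arm, 5'->3': DNA synthesis template, then PBS."""
    return template_rna.upper() + pbs_rna.upper()


def synthesized_flap(template_rna: str) -> str:
    """DNA flap (5'->3') produced by copying the full template.

    The flap is the complement of the template, so written 5'->3' it is the
    reverse complement of the template sequence.
    """
    return "".join(RNA_TO_DNA_COMPLEMENT[b] for b in reversed(template_rna.upper()))


# Toy inputs (illustrative only): a 4-nt primer and a 4-nt template.
toy_primer = "ACGT"       # 3' end of the nicked edit strand
toy_template = "GCUA"     # DNA synthesis template, RNA
toy_pbs = pbs_for_primer(toy_primer)
toy_arm = build_extension_arm(toy_template, toy_pbs)
toy_flap = synthesized_flap(toy_template)
```

  • Because synthesis starts at the primer's 3′ end and proceeds toward the 5′ end of the template, the first flap nucleotide pairs with the template base immediately 5′ of the PBS.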
  • a DNA synthesis template (e.g., a reverse transcription template) comprises or encodes a (CAG)m repeat sequence, wherein m is no more than 35.
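  • A check of the (CAG)m limit recited above can be sketched in Python. The sketch assumes the input is the DNA sequence encoded by the template (that is, the synthesized strand), written 5′ to 3′, and it counts only the longest uninterrupted run of the repeat unit; how interrupted repeats should be counted is not addressed here. The function names are hypothetical.

```python
import re


def longest_repeat_run(seq: str, unit: str = "CAG") -> int:
    """Number of units in the longest uninterrupted tandem run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def within_repeat_limit(encoded_dna: str, max_repeats: int = 35,
                        unit: str = "CAG") -> bool:
    """True if the longest run of `unit` has no more than `max_repeats` units."""
    return longest_repeat_run(encoded_dna, unit) <= max_repeats


# Toy sequence: 21 uninterrupted CAG units, then an interrupted extra unit.
toy_encoded = "TT" + "CAG" * 21 + "CAACAG"
toy_run = longest_repeat_run(toy_encoded)
```

  • The same helper can be pointed at a different unit, for example `unit="GAA"` for the FXN repeat tract discussed elsewhere in this disclosure.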
  • Fusion protein refers to a hybrid polypeptide which comprises protein domains from at least two different proteins. One protein may be located at the amino-terminal (N-terminal) portion of the fusion protein or at the carboxy-terminal (C-terminal) portion, thus forming an “amino-terminal fusion protein” or a “carboxy-terminal fusion protein,” respectively.
  • a protein may comprise different domains, for example, a nucleic acid binding domain (e.g., the gRNA binding domain of Cas9 that directs the binding of the protein to a target site) and a nucleic acid cleavage domain or a catalytic domain of a nucleic-acid editing protein.
  • Another example includes fusion of a Cas9 or equivalent thereof to a reverse transcriptase (i.e., a prime editor).
  • Any of the proteins provided herein may be produced by any method known in the art.
  • the proteins provided herein may be produced via recombinant protein expression and pur
  • Nucleic acids of the present disclosure may include one or more genetic elements.
  • a “genetic element” refers to a particular nucleotide sequence that has a role in nucleic acid expression (e.g., promoter, enhancer, terminator) or encodes a discrete product of an engineered nucleic acid (e.g., a nucleotide sequence encoding a guide RNA and/or a protein).
  • a “promoter” refers to a control region of a nucleic acid sequence at which initiation and rate of transcription of the remainder of a nucleic acid sequence are controlled.
  • a promoter may also contain sub-regions at which regulatory proteins and molecules may bind, such as an RNA polymerase and other transcription factors.
  • Promoters may be constitutive, inducible, activatable, repressible, tissue-specific, or any combination thereof.
  • a promoter drives expression or drives transcription of the nucleic acid sequence that it regulates.
  • a promoter is considered to be “operably linked” when it is in a correct functional location and orientation in relation to a nucleic acid sequence it regulates to control (“drive”) transcriptional initiation and/or expression of that sequence.
  • a promoter may be one naturally associated with a gene or sequence, as may be obtained by isolating the 5' non-coding sequences located upstream of the coding segment of a given gene or sequence.
  • Such a promoter is referred to as an “endogenous promoter.”
  • a coding nucleic acid sequence may be positioned under the control of a recombinant or heterologous promoter, which refers to a promoter that is not normally associated with the encoded sequence in its natural environment.
  • promoters may include promoters of other genes; promoters isolated from any other cell; and synthetic promoters or enhancers that are not “naturally occurring” such as, for example, those that contain different elements of different transcriptional regulatory regions and/or mutations that alter expression through methods of genetic engineering that are known in the art.
  • promoters used in accordance with the present disclosure are “inducible promoters,” which are promoters that are characterized by regulating (e.g., initiating or activating) transcriptional activity when in the presence of, influenced by or contacted by an inducer signal.
  • An inducer signal may be endogenous or a normally exogenous condition (e.g., light), compound (e.g., chemical or non-chemical compound), or protein that contacts an inducible promoter in such a way as to be active in regulating transcriptional activity from the inducible promoter.
  • a “signal that regulates transcription” of a nucleic acid refers to an inducer signal that acts on an inducible promoter.
  • a signal that regulates transcription may activate or inactivate transcription, depending on the regulatory system used. Activation of transcription may involve directly acting on a promoter to drive transcription or indirectly acting on a promoter by inactivation of a repressor that is preventing the promoter from driving transcription.
  • a “transcriptional terminator” is a nucleic acid sequence that causes transcription to stop.
  • a transcriptional terminator may be unidirectional or bidirectional. It is comprised of a DNA sequence involved in specific termination of an RNA transcript by an RNA polymerase.
  • a transcriptional terminator sequence prevents transcriptional activation of downstream nucleic acid sequences by upstream promoters.
  • a transcriptional terminator may be necessary in vivo to achieve desirable expression levels or to avoid transcription of certain sequences.
  • a transcriptional terminator is considered to be “operably linked to” a nucleotide sequence when it is able to terminate the transcription of the sequence it is linked to.
  • the most commonly used type of terminator is a forward terminator. When placed downstream of a nucleic acid sequence that is usually transcribed, a forward transcriptional terminator will cause transcription to abort.
  • bidirectional transcriptional terminators are provided, which usually cause transcription to terminate on both the forward and reverse strand.
  • reverse transcriptional terminators are provided, which usually terminate transcription on the reverse strand only.
  • In prokaryotic systems, terminators usually fall into two categories: (1) rho-independent terminators and (2) rho-dependent terminators.
  • Rho-independent terminators are generally composed of a palindromic sequence that forms a stem loop rich in G-C base pairs followed by several T bases.
  • the conventional model of transcriptional termination is that the stem loop causes RNA polymerase to pause, and transcription of the poly-A tail causes the RNA:DNA duplex to unwind and dissociate from RNA polymerase.
  • the terminator region may comprise specific DNA sequences that permit site-specific cleavage of the new transcript so as to expose a polyadenylation site. This signals a specialized endogenous polymerase to add a stretch of about 200 A residues (polyA) to the 3' end of the transcript.
  • RNA molecules modified with this polyA tail appear to be more stable and are translated more efficiently.
  • a terminator may comprise a signal for the cleavage of the RNA.
  • the terminator signal promotes polyadenylation of the message.
  • the terminator and/or polyadenylation site elements may serve to enhance output nucleic acid levels and/or to minimize read through between nucleic acids.
  • terminators include, without limitation, the termination sequences of genes such as, for example, the bovine growth hormone terminator, and viral termination sequences such as, for example, the SV40 terminator, spy, yejM, secG-leuU, thrLABC, rrnB T1, hisLGDCBHAFI, metZWV, rrnC, xapR, aspA, and arcA terminator.
  • the termination signal may be a sequence that cannot be transcribed or translated, such as those resulting from a sequence truncation.
  • guide RNA is a particular type of guide nucleic acid which is most commonly associated with a Cas protein of a CRISPR-Cas9 system and which associates with Cas9, directing the Cas9 protein to a specific sequence in a DNA molecule that includes complementarity to the spacer sequence of the guide RNA.
  • this term also embraces the equivalent guide nucleic acid molecules that associate with Cas9 equivalents, homologs, orthologs, or paralogs, whether naturally occurring or non-naturally occurring (e.g., engineered or recombinant), and which otherwise program the Cas9 equivalent to localize to a specific target nucleotide sequence.
  • the Cas9 equivalents may include other napDNAbp from any type of CRISPR system (e.g., type II, V, VI), including Cpf1 (a type V CRISPR-Cas system), C2c1 (a type V CRISPR-Cas system), C2c2 (a type VI CRISPR-Cas system), and C2c3 (a type V CRISPR-Cas system).
  • guide RNA may also be referred to as a “traditional guide RNA” to contrast it with the modified forms of guide RNA termed “prime editing guide RNAs” (or “PEgRNAs”) and “engineered PEgRNAs” (or “epegRNAs”).
  • Guide RNAs or PEgRNAs/epegRNAs may comprise various structural elements that include, but are not limited to:
  • Spacer sequence: the sequence in the guide RNA or PEgRNA/epegRNA (about 20 nucleotides in length) that has the same sequence as the protospacer in the target DNA, except that the guide RNA or PEgRNA/epegRNA comprises uracil where the target protospacer contains thymine.
  • gRNA core (or gRNA scaffold or backbone sequence): the sequence within the gRNA that is responsible for binding with a nucleic acid programmable DNA binding protein, e.g., a Cas9.
  • a gRNA core sequence (including a gRNA core sequence in a pegRNA) is capable of complexing with a Cas9 protein.
  • Transcription terminator: the guide RNA or PEgRNA may comprise a transcriptional termination sequence at the 3′ end of the molecule.
  • a pegRNA or epegRNA may also comprise an extension arm: a single-strand extension at the 3′ end or the 5′ end of the PEgRNA which comprises a primer binding site and a DNA synthesis template sequence that encodes via a polymerase (e.g., a reverse transcriptase) a single-stranded DNA flap containing the desired nucleotide change, which then integrates into the endogenous DNA by replacing the corresponding endogenous strand, thereby installing the desired nucleotide change.
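The relationship between the spacer and the protospacer described in this list (same sequence, with uracil in place of thymine) can be illustrated with a short sketch. The function name and the 20-nt toy protospacer are hypothetical; this is an illustration of the stated definition, not a guide design tool.

```python
def spacer_from_protospacer(protospacer_dna: str) -> str:
    """Convert a DNA protospacer into the matching RNA spacer (T -> U)."""
    seq = protospacer_dna.upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in protospacer: {sorted(unexpected)}")
    return seq.replace("T", "U")

# Toy 20-nt protospacer for illustration only.
toy_protospacer = "GATTACAGATTACAGATTAC"
toy_spacer = spacer_from_protospacer(toy_protospacer)
```

The resulting spacer has the same length as the protospacer and differs only at thymine positions.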
  • an “intein” is a segment of a protein that is able to excise itself and join the remaining portions (the exteins) with a peptide bond in a process known as protein splicing. Inteins are also referred to as “protein introns.” The process of an intein excising itself and joining the remaining portions of the protein is herein termed “protein splicing” or “intein-mediated protein splicing.” In some embodiments, an intein of a precursor protein (an intein containing protein prior to intein-mediated protein splicing) comes from two genes. Such an intein is referred to herein as a split intein.
  • In cyanobacteria, for example, DnaE, the catalytic subunit α of DNA polymerase III, is encoded by two separate genes, dnaE-n and dnaE-c.
  • the intein encoded by the dnaE-n gene is referred to herein as “intein-N.”
  • the intein encoded by the dnaE-c gene is referred to herein as “intein-C.”
  • Other intein systems may also be used.
  • a synthetic intein based on the dnaE intein, the Cfa-N and Cfa-C intein pair has been described (e.g., in Stevens et al., J. Am. Chem.
  • Non-limiting examples of intein pairs that may be used in accordance with the present disclosure include: Cfa DnaE intein, Ssp GyrB intein, Ssp DnaX intein, Ter DnaE3 intein, Ter ThyX intein, Rma DnaB intein and Cne Prp8 intein (e.g., as described in US Patent No. 8,394,604, incorporated herein by reference).
  • Intein-N and intein-C may be fused to the N-terminal portion of the split prime editor and the C-terminal portion of the split prime editor, respectively, for the joining of the N-terminal portion of the split prime editor and the C-terminal portion of the split prime editor.
  • an intein-N is fused to the C-terminus of the N-terminal portion of the split prime editor, i.e., to form a structure of N-[N-terminal portion of the split prime editor]-[intein-N]-C.
  • an intein-C is fused to the N-terminus of the C-terminal portion of the split prime editor, i.e., to form a structure of N-[intein-C]-[C-terminal portion of the split prime editor]-C.
  • the mechanism of intein-mediated protein splicing for joining the proteins the inteins are fused to is known in the art, e.g., as described in Shah et al., Chem. Sci. 2014; 5(1):446–461, incorporated herein by reference.
  • the split site is within a Cas9 protein of a prime editor.
  • the split site is between amino acid residues 844 and 845 of a Cas9 protein within a prime editor (e.g., a Cas9 protein of SEQ ID NO: 6). In certain embodiments, the split site is between amino acid residues 1024 and 1025 of a Cas9 protein within a prime editor (e.g., a Cas9 protein of SEQ ID NO: 6).
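A split site "between residues 844 and 845" maps onto a simple slice of the protein sequence when residues are numbered from 1. The sketch below assumes that convention; the helper name is hypothetical and the placeholder string stands in for a Cas9 sequence (it is not SEQ ID NO: 6).

```python
def split_at(protein: str, last_n_terminal_residue: int) -> tuple[str, str]:
    """Split a protein between residue k and k+1 (1-based numbering)."""
    if not 1 <= last_n_terminal_residue < len(protein):
        raise ValueError("split site must fall inside the sequence")
    return protein[:last_n_terminal_residue], protein[last_n_terminal_residue:]

# Placeholder of 1368 residues standing in for a Cas9 sequence.
toy_cas9 = "M" + "A" * 1367
n_half, c_half = split_at(toy_cas9, 844)
```

Under the fusion layouts given above, intein-N would then follow `n_half` and intein-C would precede `c_half`.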
  • Linker refers to a molecule linking two other molecules or moieties. The linker can be an amino acid sequence in the case of a peptide linker joining two domains of a fusion protein.
  • a napDNAbp (e.g., Cas9) can be fused to a reverse transcriptase by an amino acid linker sequence.
  • the linker can also be a nucleotide sequence in the case of joining two nucleotide sequences together (e.g., in a gRNA).
  • the traditional guide RNA is linked via a spacer or linker nucleotide sequence to the RNA extension of a prime editing guide RNA which may comprise an RT template sequence and an RT primer binding site.
  • the linker is an organic molecule, group, polymer, or chemical moiety.
  • the linker is 5-200 amino acids in length, for example, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 30-35, 35-40, 40-45, 45-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-150, or 150-200 amino acids in length. Longer or shorter linkers are also contemplated.
  • As used herein, the term “nucleic acid programmable DNA binding protein” or “napDNAbp,” of which Cas9 is an example, refers to a protein that uses RNA:DNA hybridization to target and bind to specific sequences in a DNA molecule.
  • Each napDNAbp is associated with at least one guide nucleic acid (e.g., guide RNA), which localizes the napDNAbp to a DNA sequence that comprises a DNA strand (i.e., a target strand) that is complementary to the guide nucleic acid, or a portion thereof (e.g., the protospacer of a guide RNA).
  • the guide nucleic-acid “programs” the napDNAbp (e.g., Cas9 or equivalent) to localize and bind to a complementary sequence.
  • the binding mechanism of a napDNAbp–guide RNA complex includes the step of forming an R-loop whereby the napDNAbp induces the unwinding of a double-strand DNA target, thereby separating the strands in the region bound by the napDNAbp.
  • the guide RNA protospacer then hybridizes to the “target strand.” This displaces a “non-target strand” that is complementary to the target strand, which forms the single strand region of the R-loop.
  • the napDNAbp includes one or more nuclease activities, which then cut the DNA, leaving various types of lesions.
  • the napDNAbp may comprise a nuclease activity that cuts the non-target strand at a first location, and/or cuts the target strand at a second location.
  • the target DNA can be cut to form a “double-stranded break” whereby both strands are cut.
  • the target DNA can be cut at only a single site, i.e., the DNA is “nicked” on one strand.
  • Exemplary napDNAbp with different nuclease activities include “Cas9 nickase” (“nCas9”) and a deactivated Cas9 having no nuclease activities (“dead Cas9” or “dCas9”).
  • nickase refers to a napDNAbp (e.g., a Cas protein) which is capable of cleaving only one of the two complementary strands of a double-stranded target DNA sequence, thereby generating a nick in that strand.
  • the nickase cleaves a non-target strand of a double stranded target DNA sequence.
  • the nickase comprises an amino acid sequence with one or more mutations in a catalytic domain of a canonical napDNAbp (e.g., a Cas protein), wherein the one or more mutations reduces or abolishes nuclease activity of the catalytic domain.
  • the nickase is a Cas9 that comprises one or more mutations in a RuvC-like domain relative to a wild type Cas9 sequence or to an equivalent amino acid position in other Cas9 variants or Cas9 equivalents.
  • the nickase is a Cas9 that comprises one or more mutations in an HNH-like domain relative to a wild type Cas9 sequence or to an equivalent amino acid position in other Cas9 variants or Cas9 equivalents.
  • the nickase is a Cas9 that comprises an aspartate-to-alanine substitution (D10A) in the RuvC I catalytic domain of Cas9 relative to a canonical SpCas9 sequence or to an equivalent amino acid position in other Cas9 variants or Cas9 equivalents.
  • the nickase is a Cas9 that comprises an H840A, N854A, and/or N863A mutation relative to a canonical SpCas9 sequence, or to an equivalent amino acid position in other Cas9 variants or Cas9 equivalents.
  • the term “Cas9 nickase” refers to a Cas9 with one of the two nuclease domains inactivated. This enzyme is capable of cleaving only one strand of a target DNA.
  • the nickase is a Cas protein that is not a Cas9 nickase.
  • the napDNAbp of the prime editing complex comprises an endonuclease having nucleic acid programmable DNA binding ability.
  • the napDNAbp comprises an active endonuclease capable of cleaving both strands of a double stranded target DNA.
  • the napDNAbp is a nuclease active endonuclease, e.g., a nuclease active Cas protein, that can cleave both strands of a double stranded target DNA by generating a nick on each strand.
  • a nuclease active Cas protein can generate a cleavage (a nick) on each strand of a double stranded target DNA.
  • the two nicks on both strands are staggered nicks, for example, generated by a napDNAbp comprising a Cas12a or Cas12b1.
  • the two nicks on both strands are at the same genomic position, for example, generated by a napDNAbp comprising a nuclease active Cas9.
  • the napDNAbp comprises an endonuclease that is a nickase.
  • the napDNAbp comprises an endonuclease comprising one or more mutations that reduce nuclease activity of the endonuclease, rendering it a nickase.
  • the napDNAbp comprises an inactive endonuclease, for example, in some embodiments, the napDNAbp comprises an endonuclease comprising one or more mutations that abolish the nuclease activity.
  • the napDNAbp is a Cas9 protein or variant thereof.
  • the napDNAbp can also be a nuclease active Cas9, a nuclease inactive Cas9 (dCas9), or a Cas9 nickase (nCas9).
  • the napDNAbp is Cas9 nickase (nCas9) that nicks only a single strand.
  • the napDNAbp can be selected from the group consisting of: Cas9, Cas12e, Cas12d, Cas12a, Cas12b1, Cas12b2, Cas13a, Cas12c, Cas12d, Cas12e, Cas12h, Cas12i, Cas12g, Cas12f (Cas14), Cas12f1, Cas12j (CasΦ), and Argonaute and optionally has a nickase activity such that only one strand is cut.
  • the napDNAbp is selected from Cas9, Cas12e, Cas12d, Cas12a, Cas12b1, Cas12b2, Cas13a, Cas12c, Cas12d, Cas12e, Cas12h, Cas12i, Cas12g, Cas12f (Cas14), Cas12f1, Cas12j (CasΦ), and Argonaute and optionally has a nickase activity such that one DNA strand is cut preferentially to the other DNA strand.
  • NLS Nuclear localization sequence
  • nucleic acid refers to a polymer of nucleotides.
  • the polymer may include natural nucleosides (i.e., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine), nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C5 bromouridine, C5 fluorouridine, C5 iodouridine, C5 propynyl uridine, C5 propynyl cytidine, C5 methylcytidine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadeno
  • the terms “prime editing guide RNA” or “PEgRNA” or “pegRNA” or “extended guide RNA” refer to a specialized form of a guide RNA that has been modified to include one or more additional sequences for implementing the prime editing methods and compositions described herein.
  • the prime editing guide RNAs comprise one or more “extended regions,” also referred to herein as “extension arms,” of nucleic acid sequence.
  • the extended regions may comprise, but are not limited to, single-stranded RNA or DNA. Further, the extended regions may occur at the 3′ end of a traditional guide RNA. In other arrangements, the extended regions may occur at the 5′ end of a traditional guide RNA.
  • the extended region may occur at an intramolecular region of the traditional guide RNA, for example, in the gRNA core region which associates and/or binds to the napDNAbp.
  • the extended region comprises a “DNA synthesis template” which encodes (by the polymerase of the prime editor) a single-stranded DNA which, in turn, has been designed to be (a) homologous with the endogenous target DNA to be edited, and (b) which comprises at least one desired nucleotide change (e.g., a transition, a transversion, a deletion, or an insertion) to be introduced or integrated into the endogenous target DNA.
  • the extended region may also comprise other functional sequence elements, such as, but not limited to, a “primer binding site” and a “linker” sequence, or other structural elements, such as, but not limited to, aptamers, stem loops, hairpins, toe loops (e.g., a 3′ toeloop), or an RNA-protein recruitment domain (e.g., MS2 hairpin).
  • the “primer binding site” comprises a sequence that hybridizes to a single-strand DNA sequence having a 3′ end generated from the nicked DNA of the R-loop.
  • the PEgRNAs have a 3′ extension arm, a spacer, and a gRNA core.
  • the 3′ extension arm further comprises in the 5′ to 3′ direction a reverse transcriptase template, a primer binding site, and a linker.
  • the reverse transcriptase template may also be referred to more broadly as the “DNA synthesis template” where the polymerase of a prime editor described herein is not an RT, but another type of polymerase.
  • the PEgRNAs have a 5′ extension arm, a spacer, and a gRNA core.
  • the 5′ extension arm further comprises in the 5′ to 3′ direction a reverse transcriptase template, a primer binding site, and a linker.
  • the reverse transcriptase template may also be referred to more broadly as the “DNA synthesis template” where the polymerase of a prime editor described herein is not an RT, but another type of polymerase.
  • the PEgRNAs have in the 5′ to 3′ direction a spacer (1), a gRNA core (2), and an extension arm (3).
  • the extension arm (3) is at the 3′ end of the PEgRNA.
  • the extension arm (3) further comprises in the 5′ to 3′ direction a homology arm, an edit template, and a primer binding site.
  • the extension arm (3) may also comprise an optional modifier region at the 3′ and 5′ ends, which may be the same sequences or different sequences.
  • the 3′ end of the PEgRNA may comprise a transcriptional terminator sequence.
  • these sequence elements of the PEgRNAs are further described and defined herein.
  • the PEgRNAs have in the 5′ to 3′ direction an extension arm (3), a spacer (1), and a gRNA core (2).
  • the extension arm (3) is at the 5′ end of the PEgRNA.
  • the extension arm (3) further comprises in the 3′ to 5′ direction a primer binding site, an edit template, and a homology arm.
  • the extension arm (3) may also comprise an optional modifier region at the 3′ and 5′ ends, which may be the same sequences or different sequences.
  • the PEgRNAs may also comprise a transcriptional terminator sequence at the 3′ end.
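The two PEgRNA layouts described above differ only in where the extension arm sits. The sketch below concatenates the components in 5′ to 3′ order for either layout; the class and function names are hypothetical, optional modifier regions are omitted, and the sequences are toy placeholders rather than sequences from the disclosure. Note that the arm's internal order (homology arm, edit template, primer binding site, read 5′ to 3′) is the same in both layouts, since the 5′ arm is described in the 3′ to 5′ direction as primer binding site, edit template, homology arm.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExtensionArm:
    homology_arm: str
    edit_template: str
    primer_binding_site: str

    def five_to_three(self) -> str:
        """Arm components in 5' to 3' order."""
        return self.homology_arm + self.edit_template + self.primer_binding_site

def assemble_pegrna(spacer: str, grna_core: str, arm: ExtensionArm,
                    arm_at_3_prime: bool = True, terminator: str = "") -> str:
    """Concatenate PEgRNA components in 5' to 3' order."""
    if arm_at_3_prime:
        parts = [spacer, grna_core, arm.five_to_three()]
    else:
        parts = [arm.five_to_three(), spacer, grna_core]
    # An optional transcriptional terminator goes at the 3' end.
    return "".join(parts) + terminator

# Toy placeholder sequences for illustration only.
toy_spacer = "GACU" * 5
toy_core = "GUUUUAGAGC"
toy_arm = ExtensionArm(homology_arm="AAAAA", edit_template="CC",
                       primer_binding_site="UUUUUUUU")
peg_3_prime = assemble_pegrna(toy_spacer, toy_core, toy_arm)
peg_5_prime = assemble_pegrna(toy_spacer, toy_core, toy_arm, arm_at_3_prime=False)
```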
  • the spacer sequence of the pegRNA is about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, or about 25 nucleotides in length. In certain embodiments, the spacer sequence of the pegRNA is about 20 nucleotides in length. In some embodiments, the primer binding site is about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, or about 17 nucleotides in length.
  • the homology arm of the pegRNA is about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, or about 20 nucleotides in length.
  • the DNA synthesis template is from about 5 to about 58 nucleotides in length, about 10 to about 16 nucleotides in length, or about 12 to about 17 nucleotides in length. In certain embodiments, the DNA synthesis template is less than 15 nucleotides in length.
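The length ranges recited above lend themselves to a simple validation step. This is a minimal sketch that treats the "about" bounds as hard limits purely for illustration; the dictionary keys and the function name are hypothetical.

```python
# Inclusive nucleotide ranges taken from the embodiments above; "about" is
# treated as a hard bound here only for the sake of the example.
LENGTH_RANGES = {
    "spacer": (10, 25),
    "primer_binding_site": (4, 17),
    "homology_arm": (5, 20),
    "dna_synthesis_template": (5, 58),
}

def length_problems(components: dict[str, str]) -> list[str]:
    """Return one message per component whose length is out of range."""
    problems = []
    for name, seq in components.items():
        low, high = LENGTH_RANGES[name]
        if not low <= len(seq) <= high:
            problems.append(f"{name}: {len(seq)} nt is outside {low}-{high} nt")
    return problems

ok_design = {"spacer": "G" * 20, "primer_binding_site": "A" * 13}
bad_design = {"spacer": "G" * 30}
```

An empty list from `length_problems` means every supplied component falls inside its range.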
  • a pegRNA is an “engineered pegRNA” (“epegRNA”). Relative to a pegRNA, an epegRNA comprises an additional structured motif, for example, attached to its 3′ end.
  • Such additional structured motifs may stabilize the pegRNA or otherwise prevent it from being degraded.
  • Suitable structured motifs include, but are not limited to, toe-loops, hairpins, stem-loops, pseudoknots, aptamers, G-quadruplexes, tRNAs, riboswitches, and ribozymes.
  • a 3′ structured motif comprises evopreq1.
  • PCT/US2021/031439 filed May 7, 2021, which published as WO 2021/226558; International Patent Application No. PCT/2021/052097, filed September 24, 2021, which published as WO 2022/067130; International Patent Application No. PCT/US2022/012054, filed January 11, 2022, which published as WO 2022/150790; International Patent Application No. PCT/US2022/078655, filed October 25, 2022, which published as WO 2023/076898; and International Patent Application No. PCT/US2022/074628, filed August 5, 2022, which published as WO 2023/015309; the contents of each of which is incorporated by reference herein.
  • PE1 refers to a prime editing composition comprising 1) a fusion protein comprising a Cas9 protein variant Cas9(H840A) and a wild type MMLV RT having the following structure: [NLS]-[Cas9(H840A)]-[linker]-[MMLV_RT(wt)]-NLS and 2) a desired PEgRNA, wherein the fusion protein (referred to as the PE1 protein) has the amino acid sequence of SEQ ID NO: 3, which is shown as follows.
  • PE2 refers to a prime editing composition comprising 1) a fusion protein comprising a Cas9 protein variant Cas9(H840A) and a variant MMLV RT having the following structure: [NLS]-[Cas9(H840A)]-[linker]-[MMLV_RT(D200N)(T330P)(L603W)(T306K)(W313F)]-NLS and 2) a desired PEgRNA, wherein the fusion protein (referred to as the PE2 protein) has the amino acid sequence of SEQ ID NO: 4, which is shown as follows: KEY: NUCLEAR LOCALIZATION SEQUENCE (NLS) TOP: (SEQ ID NO: 95), BOTTOM: (SEQ ID NO: 96); CAS9(H840A) (SEQ ID NO: 10); 33-AMINO ACID LINKER (SEQ ID NO: 80); M-MLV reverse transcriptase (SEQ ID NO: 34).
  • PE3 refers to a prime editing composition comprising a PE2 and further comprising a second-strand nicking guide RNA that complexes with the PE2 and introduces a nick in the non-edit DNA strand in order to induce preferential replacement of the edit strand.
  • PE3b refers to a prime editing composition comprising PE2 and further comprising a second-strand nicking guide RNA that complexes with PE2 and introduces a nick in the non-edit DNA strand, wherein the second-strand nicking guide RNA is designed for temporal control such that the second strand nick is not introduced until after the installation of the desired edit.
  • This is achieved by designing the second-strand nicking guide RNA with a spacer sequence that comprises complementarity to, and hybridizes with, only the edited strand after installation of the desired nucleotide edit(s), but not the endogenous target DNA sequence.
  • mismatches between the nicking guide RNA spacer and the unedited target DNA should disfavor nicking by the sgRNA until after the editing event on the PAM strand takes place.
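The PE3b design rule (the nicking spacer matches the edited sequence but mismatches the unedited sequence) can be expressed as a mismatch count. The sketch below assumes all three sequences are given as DNA strings of equal length in the same orientation; the function names and toy sequences are hypothetical, and real nicking behavior depends on factors this comparison does not capture.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def is_edit_specific(nicking_spacer: str, unedited_site: str, edited_site: str) -> bool:
    """True if the spacer matches the edited site exactly but not the unedited site."""
    return (mismatches(nicking_spacer, edited_site) == 0
            and mismatches(nicking_spacer, unedited_site) > 0)

# Toy 20-nt sites differing at a single position (illustration only).
unedited = "ACGTACGTACGTACGTACGT"
edited = "ACGTACGTACTTACGTACGT"
```

A spacer equal to `edited` passes the check, whereas a spacer equal to `unedited` would also target the site before editing and therefore fails it.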
  • PE4 refers to a prime editing composition comprising a PE2 and further comprising an MLH1 dominant negative protein variant (i.e., wild-type MLH1 with amino acids 754-756 truncated, which may be referred to herein as “MLH1 Δ754-756” or “MLH1dn”).
  • the MLH1 dominant negative protein variant may be expressed in trans in some embodiments.
  • a PE4 system comprises a fusion protein comprising a PE2 protein and an MLH1 dominant negative protein joined via an optional linker.
  • PE5 refers to a prime editing composition comprising a PE3 and further comprising an MLH1 dominant negative protein variant (i.e., wild-type MLH1 with amino acids 754-756 truncated, which may be referred to as “MLH1 Δ754-756” or “MLH1dn”).
  • the MLH1 dominant negative variant may be expressed in trans in some embodiments.
  • a PE5 system comprises a fusion protein comprising a PE2 protein and an MLH1 dominant negative protein joined via an optional linker.
  • PE5b refers to a prime editing composition comprising a PE3 and an MLH1 dominant negative protein, wherein the second-strand nicking guide RNA is designed for temporal control such that the second strand nick is not introduced until after the installation of the desired edit. This is achieved by designing the second strand nicking guide RNA with a spacer sequence that comprises complementarity to, and hybridizes with, only the edited strand after installation of the desired nucleotide edit(s), but not the endogenous target DNA sequence.
  • PEmax refers to a prime editing composition comprising 1) a fusion protein comprising a Cas9 protein variant Cas9(R221K N394K H840A) and a variant MMLV RT having the following structure: [bipartite NLS]-[Cas9(R221K)(N394K)(H840A)]-[linker]-[MMLV_RT(D200N)(T330P)(L603W)]-[bipartite NLS]-[NLS] and 2) a desired PEgRNA, wherein the fusion protein (referred to as the PEmax protein) has the amino acid sequence of SEQ ID NO: 5, which is shown as follows:
  • PE3max refers to a prime editing composition comprising a PEmax protein, a desired pegRNA, and a second strand nicking guide RNA.
  • PE3max can be considered as PE3 except wherein the PE2 component is substituted with PEmax.
  • PE3bmax refers to a prime editing composition comprising a PEmax protein, a desired pegRNA, and a second strand nicking guide RNA, wherein the second-strand nicking guide RNA is designed for temporal control such that the second strand nick is not introduced until after the installation of the desired edit. This is achieved by designing the second strand nicking guide RNA with a spacer sequence that comprises complementarity to, and hybridizes with, only the edited strand after installation of the desired nucleotide edit(s), but not the endogenous target DNA sequence.
  • PE4max refers to PE4 but the PE2 component is substituted with PEmax.
  • As used herein, “PE5max” refers to PE5, but the PE2 component of PE3 is substituted with PEmax.
  • PE5bmax refers to PE5b, but the PE2 component of PE3 is substituted with PEmax.
  • The term “PE6” refers to a suite of next-generation prime editors described herein (PE6a, PE6b, PE6c, PE6d, PE6e, PE6f, and PE6g) comprising improved reverse transcriptase and/or Cas9 variants.
  • A PE6 prime editor comprising an improved reverse transcriptase variant of PE6a and an improved Cas9 variant of PE6e is referred to herein as the prime editor “PE6a-e” (or “PE6e-a”).
  • PE6a comprises a reverse transcriptase variant comprising the amino acid substitutions E60K, K87E, E165D, D243N, R267I, E279K, K318E, and K343N relative to an Ec48 reverse transcriptase (SEQ ID NO: 59).
  • PE6b comprises a reverse transcriptase variant comprising the amino acid substitutions P70T, G72V, S87G, M102I, K106R, K118R, I128V, L158Q, F269L, A363V, K413E, and S492N relative to a Tf1 reverse transcriptase (SEQ ID NO: 55).
  • PE6c comprises a reverse transcriptase variant comprising the amino acid substitutions P70T, G72V, S87G, M102I, K106R, K118R, I128V, L158Q, S188K, I260L, F269L, R288Q, S297Q, A363V, K413E, and S492N relative to a Tf1 reverse transcriptase (SEQ ID NO: 55).
  • PE6d comprises a reverse transcriptase variant comprising the amino acid substitutions T128N, D200C, and V223Y (and the substitutions T306K, W313F, and T330P used in the MMLV reverse transcriptase of PE2 and PEmax) relative to a MMLV reverse transcriptase (SEQ ID NO: 33) with a truncation of the C-terminal RNase H domain (e.g., between D497 and I498 of SEQ ID NO: 33).
  • PE6e comprises a Cas9 variant comprising the amino acid substitutions K775R and K918A relative to wild type Streptococcus pyogenes Cas9 or Streptococcus pyogenes Cas9 nickase (SEQ ID NO: 7).
  • PE6f comprises a Cas9 variant comprising the amino acid substitutions H99R, E471K, I632V, D645N, H721Y, and K918A relative to wild type Streptococcus pyogenes Cas9 or Streptococcus pyogenes Cas9 nickase (SEQ ID NO: 7).
  • PE6g comprises a Cas9 variant comprising the amino acid substitutions H99R, E471K, I632V, D645N, R654C, and H721Y relative to wild type Streptococcus pyogenes Cas9 or Streptococcus pyogenes Cas9 nickase (SEQ ID NO: 7).
  • Any of the PE6 prime editors provided herein may also comprise the architecture of the PEmax protein as provided below. In some embodiments, any of the PE6 prime editors provided herein may further comprise additional amino acid mutations, e.g., any of those included in PEmax as provided below.
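The substitution notation used for the PE6 variants (e.g., K775R: wild-type residue, 1-based position, new residue) can be applied to a sequence programmatically. The following is a minimal sketch with a hypothetical helper and a toy peptide; it checks that the stated wild-type residue is actually present before substituting, which catches numbering mismatches against the wrong reference sequence.

```python
import re

def apply_substitutions(protein: str, substitutions: list[str]) -> str:
    """Apply substitutions written as <wild-type><position><new>, e.g. 'K775R'."""
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{sub}: position is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{sub}: found {residues[position - 1]} at position {position}")
        residues[position - 1] = new
    return "".join(residues)

# Toy peptide for illustration only (not a Cas9 or reverse transcriptase sequence).
toy_peptide = "MKTAYIAKQR"
toy_variant = apply_substitutions(toy_peptide, ["K2R", "A4V"])
```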
  • The term “PE7” refers to the PE6 prime editors plus a second strand nicking guide RNA.
  • PE7a refers to the PE6a prime editor as provided herein, plus a second strand nicking guide RNA.
  • polymerase refers to an enzyme that synthesizes a nucleotide strand and that may be used in connection with the prime editor delivery systems described herein.
  • the polymerase can be a “template-dependent” polymerase (i.e., a polymerase that synthesizes a nucleotide strand based on the order of nucleotide bases of a template strand).
  • the polymerase can also be a “template-independent” polymerase (i.e., a polymerase that synthesizes a nucleotide strand without the requirement of a template strand).
  • a polymerase may also be further categorized as a “DNA polymerase” or an “RNA polymerase.”
  • the prime editor system comprises a DNA polymerase.
  • the DNA polymerase can be a “DNA-dependent DNA polymerase” (i.e., whereby the template molecule is a strand of DNA).
  • the DNA template molecule can be a PEgRNA, wherein the extension arm comprises a strand of DNA.
  • the PEgRNA may be referred to as a chimeric or hybrid PEgRNA which comprises an RNA portion (i.e., the guide RNA components, including the spacer and the gRNA core) and a DNA portion (i.e., the extension arm).
  • the DNA polymerase can be an “RNA-dependent DNA polymerase” (i.e., whereby the template molecule is a strand of RNA).
  • the PEgRNA is RNA, i.e., including an RNA extension.
  • the term “polymerase” may also refer to an enzyme that catalyzes the polymerization of nucleotides (i.e., the polymerase activity).
  • the enzyme will initiate synthesis at the 3′-end of a primer annealed to a polynucleotide template sequence (e.g., such as a primer sequence annealed to the primer binding site of a PEgRNA) and will proceed toward the 5′ end of the template strand.
  • a “DNA polymerase” catalyzes the polymerization of deoxynucleotides.
  • DNA polymerase includes a “functional fragment thereof.”
  • a “functional fragment thereof” refers to any portion of a wild-type or mutant DNA polymerase that encompasses less than the entire amino acid sequence of the polymerase and which retains the ability, under at least one set of conditions, to catalyze the polymerization of a polynucleotide.
  • Such a functional fragment may exist as a separate entity, or it may be a constituent of a larger polypeptide, such as a fusion protein.
  • Prime editing refers to an approach for gene editing using napDNAbps, a polymerase (e.g., a reverse transcriptase), and specialized guide RNAs that include a primer binding site and a DNA synthesis template for encoding desired new genetic information (or deleting genetic information) that is then incorporated into a target DNA sequence.
  • Prime editing is described in Anzalone, A. V. et al., Search-and-replace genome editing without double-strand breaks or donor DNA. Nature 576, 149–157 (2019), which is incorporated herein by reference.
  • Prime editing represents a platform for genome editing that is a versatile and precise method to directly write new genetic information into a specified DNA site using a nucleic acid programmable DNA binding protein (“napDNAbp”) working in association with a polymerase (i.e., in the form of a fusion protein or otherwise provided in trans with the napDNAbp), wherein the prime editing system is programmed with a prime editing (PE) guide RNA (“PEgRNA”) that both specifies the target site and templates the synthesis of the desired edit in the form of a replacement DNA strand by way of an extension (either DNA or RNA) engineered onto a guide RNA (e.g., at the 5′ or 3′ end, or at an internal portion of a guide RNA).
  • the replacement strand containing the desired edit (e.g., a single nucleobase substitution) shares the same sequence as the endogenous strand (or is homologous to it) immediately downstream of the nick site of the target site to be edited (with the exception that it includes the desired edit).
  • the endogenous strand downstream of the nick site is replaced by the newly synthesized replacement strand containing the desired edit.
  • prime editing may be thought of as a “search-and-replace” genome editing technology since the prime editors, as described herein, not only search and locate the desired target site to be edited, but at the same time, encode a replacement strand containing a desired edit that is installed in place of the corresponding target site endogenous DNA strand.
  • the prime editors of the present disclosure relate, in part, to the discovery that the mechanism of target-primed reverse transcription (TPRT) or “prime editing” can be leveraged or adapted for conducting precision CRISPR/Cas-based genome editing with high efficiency and genetic flexibility.
  • TPRT is naturally used by mobile DNA elements, such as mammalian non-LTR retrotransposons and bacterial Group II introns.
  • Cas protein-reverse transcriptase fusions or related systems are used to target a specific DNA sequence with a guide RNA, generate a single strand nick at the target site, and use the nicked DNA as a primer for reverse transcription of an engineered reverse transcriptase template that is integrated with the guide RNA.
  • prime editors that use reverse transcriptase as the DNA polymerase component
  • the prime editors described herein are not limited to reverse transcriptases but may include the use of virtually any DNA polymerase. Indeed, while the application throughout may refer to prime editors with “reverse transcriptases,” it is set forth here that reverse transcriptases are only one type of DNA polymerase that may work with prime editing. Thus, wherever the specification mentions a “reverse transcriptase,” the person having ordinary skill in the art should appreciate that any suitable DNA polymerase may be used in place of the reverse transcriptase.
  • the prime editors may comprise Cas9 (or an equivalent napDNAbp), which is programmed to target a DNA sequence by associating it with a specialized guide RNA (i.e., PEgRNA) containing a spacer sequence that anneals to a complementary sequence (the complementary sequence to an endogenous protospacer sequence) in the target DNA.
  • PEgRNA also contains new genetic information in the form of an extension that encodes a replacement strand of DNA containing a desired nucleotide change which is used to replace a corresponding endogenous DNA strand at the target site.
  • the mechanism of prime editing involves nicking the target site in one strand of the DNA to expose a 3′-hydroxyl group.
  • the exposed 3′-hydroxyl group can then be used to prime the DNA polymerization of the edit-encoding extension on PEgRNA directly into the target site.
  • the extension, which provides the template for polymerization of the replacement strand containing the edit, can be formed from RNA or DNA.
  • the polymerase of the prime editor can be an RNA-dependent DNA polymerase (such as a reverse transcriptase).
  • the polymerase of the prime editor may be a DNA-dependent DNA polymerase.
  • the newly synthesized (or replacement) strand of DNA may also be referred to as a single strand DNA flap, which would compete for hybridization with the complementary homologous endogenous DNA strand, thereby displacing the corresponding endogenous strand.
  • Resolution of the hybridized intermediate (also referred to as a heteroduplex, comprising the single strand DNA flap synthesized by the reverse transcriptase hybridized to the endogenous DNA strand with the exception of mismatches at positions where desired nucleotide edits are installed in the edit strand) can include removal of the resulting displaced flap of endogenous DNA (e.g., with a 5′ end DNA flap endonuclease, FEN1), ligation of the synthesized single strand DNA flap to the target DNA, and assimilation of the desired nucleotide changes as a result of cellular DNA repair and/or replication processes.
  • the system can be combined with the use of an error-prone reverse transcriptase enzyme (e.g., provided as a fusion protein with the Cas9 domain, or provided in trans to the Cas9 domain).
  • the error-prone reverse transcriptase enzyme can introduce alterations during synthesis of the single strand DNA flap.
  • error-prone reverse transcriptase can be utilized to introduce nucleotide changes to the target DNA.
  • prime editing operates by contacting a target DNA molecule (for which a change in the nucleotide sequence is desired to be introduced) with a nucleic acid programmable DNA binding protein (napDNAbp) complexed with a prime editing guide RNA (PEgRNA).
  • the prime editing guide RNA comprises an extension at the 3′ or 5′ end of the guide RNA, or at an intramolecular location in the guide RNA, and encodes the desired nucleotide change (e.g., single nucleotide substitution, insertion, or deletion).
  • the napDNAbp/extended gRNA complex contacts the DNA molecule, and the extended gRNA guides the napDNAbp to bind to a target locus.
  • a nick in one of the strands of DNA of the target locus is introduced (e.g., by a nuclease or chemical agent), thereby creating an available 3′ end in one of the strands of the target locus.
  • the nick is created in the strand of DNA that corresponds to the R-loop strand, i.e., the strand that is not hybridized to the guide RNA sequence, i.e., the “non-target strand.”
  • the nick could be introduced in either of the strands.
  • the nick could be introduced into the R-loop “target strand” (i.e., the strand hybridized to the protospacer of the extended gRNA) or the “non-target strand” (i.e., the strand forming the single-stranded portion of the R-loop and which is complementary to the target strand).
  • the 3′ end of the DNA strand formed by the nick interacts with the extended portion of the guide RNA in order to prime reverse transcription (i.e., “target-primed RT”).
  • the 3′ end DNA strand hybridizes to a specific RT priming sequence on the extended portion of the guide RNA, i.e., the “reverse transcriptase priming sequence” or “primer binding site” on the PEgRNA.
  • a reverse transcriptase (or other suitable DNA polymerase) is introduced that synthesizes a single strand of DNA from the 3′ end of the primed site towards the 5′ end of the prime editing guide RNA.
  • the napDNAbp and guide RNA are released.
  • the final two steps relate to the resolution of the single strand DNA flap such that the desired nucleotide change becomes incorporated into the target locus. This process can be driven towards the desired product formation by removing the corresponding 5′ endogenous DNA flap that forms once the 3′ single strand DNA flap invades and hybridizes to the endogenous DNA sequence.
  • the cell’s endogenous DNA repair and replication processes resolve the mismatched DNA to incorporate the nucleotide change(s) to form the desired altered product.
  • the process can also be driven towards product formation with “second strand nicking.” This process may introduce at least one or more of the following genetic changes: transversions, transitions, deletions, and insertions.
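The sequence of steps above (target recognition, nicking, priming at the primer binding site, templated synthesis of the 3′ flap, and flap incorporation) can be sketched as a toy string simulation. This is a minimal illustration under stated assumptions, not the method of the disclosure: the function names, the 35-nucleotide toy target, the 10-nucleotide primer binding site, and the placement of the nick 3 nucleotides 5′ of an NGG PAM (typical of SpCas9) are all assumptions, and flap resolution and DNA repair are reduced to a simple splice.

```python
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")

def revcomp_dna(seq: str) -> str:
    """Reverse complement of a DNA or RNA string, returned in the DNA alphabet."""
    return seq.translate(COMPLEMENT)[::-1]

def find_nick(strand: str, protospacer: str) -> int:
    """Index of the nick on the PAM-containing (non-target) strand.

    Assumes an NGG PAM directly 3' of the protospacer and a nick 3 nt 5' of the PAM.
    """
    start = strand.find(protospacer)
    if start < 0:
        raise ValueError("protospacer not found")
    pam = strand[start + len(protospacer):start + len(protospacer) + 3]
    if len(pam) != 3 or pam[1:] != "GG":
        raise ValueError("no NGG PAM after protospacer")
    return start + len(protospacer) - 3

def design_extension(strand: str, nick: int, pbs_len: int, edited_downstream: str) -> str:
    """3' extension (RNA, 5'->3'): DNA synthesis template followed by the PBS."""
    pbs = revcomp_dna(strand[nick - pbs_len:nick])
    template = revcomp_dna(edited_downstream)
    return (template + pbs).replace("T", "U")

def prime_edit(strand: str, nick: int, extension: str, pbs_len: int) -> str:
    """Copy the template into a 3' flap and splice it in place of the original."""
    primer = strand[nick - pbs_len:nick]
    if revcomp_dna(extension[-pbs_len:]) != primer:
        raise ValueError("PBS does not anneal to the nicked 3' end")
    flap = revcomp_dna(extension[:-pbs_len])  # newly synthesized DNA
    return strand[:nick] + flap + strand[nick + len(flap):]

# Toy target: 6 nt flank + 20 nt protospacer + TGG PAM + 6 nt flank (assumed).
protospacer = "GCATGCATCGATCGTAGCTA"
strand = "TTGACC" + protospacer + "TGG" + "ACGTAC"
nick = find_nick(strand, protospacer)
# Desired edit: T -> A at the second position downstream of the nick.
extension = design_extension(strand, nick, pbs_len=10, edited_downstream="CAATGGAC")
edited = prime_edit(strand, nick, extension, pbs_len=10)
print(nick, extension, edited)
```

In this sketch the edited strand differs from the original at a single position; in a cell, whether the edit is retained depends on the flap equilibration and repair steps described above, which the code does not model.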
  • “PE system” or “prime editor (PE)” or “PE editing system” refers to the compositions involved in the method of genome editing using target-primed reverse transcription (TPRT) described herein, including, but not limited to, the napDNAbps, reverse transcriptases, fusion proteins (e.g., comprising napDNAbps and reverse transcriptases), prime editing guide RNAs, and complexes comprising fusion proteins and prime editing guide RNAs, as well as accessory elements, such as second strand nicking components (e.g., second strand nicking sgRNAs) and 5′ endogenous DNA flap removal endonucleases (e.g., FEN1) for helping to drive the prime editing process towards the edited product formation.
  • the PEgRNA constitutes a single molecule comprising a guide RNA (which itself comprises a spacer sequence and a gRNA core or scaffold) and a 5′ or 3′ extension arm comprising the primer binding site and a DNA synthesis template.
  • the PEgRNA may also take the form of two individual molecules.
  • a PEgRNA may comprise a guide RNA and a trans prime editor RNA template (tPERT), which essentially houses the extension arm (including, in particular, the primer binding site and the DNA synthesis domain) and an RNA-protein recruitment domain (e.g., MS2 aptamer or hairpin) in the same molecule which becomes co-localized or recruited to a modified prime editor complex that comprises a tPERT recruiting protein (e.g., MS2cp protein, which binds to the MS2 aptamer).
  • Prime editor refers to the polypeptide or polypeptide components involved in prime editing as described herein.
  • a prime editor comprises a fusion construct comprising a napDNAbp (e.g., Cas9 nickase) and a reverse transcriptase.
  • a prime editor is capable of carrying out prime editing on a target nucleotide sequence in the presence of a PEgRNA (or “extended guide RNA”).
  • a prime editor comprises a napDNAbp (e.g., Cas9 nickase) and a reverse transcriptase provided in trans, i.e., the napDNAbp and the reverse transcriptase are not fused.
  • the in trans napDNAbp and the reverse transcriptase may be tethered via a non-peptide linkage, e.g., a MS2 RNA-protein binding RNA sequence and a MS2 coat protein fused to either the napDNAbp or the reverse transcriptase, or may be unlinked to each other and simply recruited by the pegRNA.
  • a prime editor composition, system, or complex provided herein comprises a fusion protein or a fusion protein complexed with a PEgRNA, and/or further complexed with a second-strand nicking sgRNA.
  • the prime editor system may also refer to the complex comprising a fusion protein (reverse transcriptase fused to a napDNAbp), a PEgRNA, and a regular guide RNA capable of directing the second-site nicking step of the non-edited strand as described herein.
  • Primer binding site refers to the portion of a PEgRNA as a component of the extension arm (e.g., at the 3′ end of the extension arm), and is a single-stranded portion of the PEgRNA as a component of the extension arm that comprises a region of complementarity to a sequence on the non-target strand of a double stranded target DNA.
  • the primer binding site is complementary to a region upstream of a nick site in a non-target strand.
  • the primer binding site is complementary to a region immediately upstream of a nick site in the non-target strand.
  • the primer binding site is capable of binding to the primer sequence that is formed after nicking of the edit strand (the non-target strand) of the target DNA sequence by the prime editor.
  • when the prime editor (e.g., by a Cas9 nickase component of a prime editor) nicks the edit strand of the target DNA sequence, a free 3′ end is formed in the edit strand, which serves as a primer sequence that anneals to the primer binding site on the PEgRNA to prime reverse transcription.
  • the PBS is complementary to or substantially complementary to and can anneal to a free 3′ end on the non-target strand of the double stranded target DNA at the nick site.
  • a PBS comprises complementarity to nucleotides 1 to (n-3) of a spacer sequence.
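The relationship between the spacer and the primer binding site can be sketched as follows. Because the spacer shares its sequence with the protospacer on the non-target (edit) strand, a PBS that anneals to the free 3′ end upstream of the nick is the reverse complement of spacer nucleotides ending at position n-3. The function names and the toy spacer are assumptions for illustration, and the n-3 nick position assumes an SpCas9-like nickase.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")

def revcomp_rna(seq: str) -> str:
    """Reverse complement of an RNA string."""
    return seq.translate(RNA_COMPLEMENT)[::-1]

def pbs_from_spacer(spacer_rna: str, pbs_len: int) -> str:
    """PBS (RNA, 5'->3') complementary to the spacer nucleotides just upstream of the nick.

    Assumes the nick falls 3 nt from the 3' end of the protospacer, so only
    spacer nucleotides 1..(n-3) are available for the PBS to pair with.
    """
    upstream = spacer_rna[:len(spacer_rna) - 3]
    if not 1 <= pbs_len <= len(upstream):
        raise ValueError("pbs_len must be between 1 and n-3")
    return revcomp_rna(upstream[-pbs_len:])

# Toy 20-nt spacer (an assumption, not a sequence from this disclosure).
spacer = "GCAUGCAUCGAUCGUAGCUA"
pbs = pbs_from_spacer(spacer, pbs_len=8)
print(pbs)
```

The PBS length is left as a parameter because the text above does not fix it; suitable lengths are typically determined empirically for each target.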
  • protein, peptide, and polypeptide are used interchangeably herein and refer to a polymer of amino acid residues linked together by peptide (amide) bonds. The terms refer to a protein, peptide, or polypeptide of any size, structure, or function. Typically, a protein, peptide, or polypeptide will be at least three amino acids long.
  • a protein, peptide, or polypeptide may refer to an individual protein or a collection of proteins.
  • One or more of the amino acids in a protein, peptide, or polypeptide may be modified, for example, by the addition of a chemical entity such as a carbohydrate group, a hydroxyl group, a phosphate group, a farnesyl group, an isofarnesyl group, a fatty acid group, a linker for conjugation, functionalization, or other modification, etc.
  • a protein, peptide, or polypeptide may also be a single molecule or may be a multi-molecular complex.
  • a protein, peptide, or polypeptide may be just a fragment of a naturally occurring protein or peptide.
  • a protein, peptide, or polypeptide may be naturally occurring, recombinant, or synthetic, or any combination thereof. Any of the proteins provided herein may be produced by any method known in the art. For example, the proteins provided herein may be produced via recombinant protein expression and purification, which is especially suited for fusion proteins comprising a peptide linker. Methods for recombinant protein expression and purification are well known, and include those described by Green and Sambrook, Molecular Cloning: A Laboratory Manual (4th ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2012)), the contents of which are incorporated herein by reference.
  • Protospacer refers to the sequence (e.g., of ~20 bp) in DNA adjacent to the PAM (protospacer adjacent motif) sequence.
  • the protospacer shares the same sequence as the spacer sequence of the guide RNA (except that a protospacer contains Thymine and the spacer sequence contains Uracil).
  • the guide RNA anneals to the complement of the protospacer sequence on the target DNA (specifically, one strand thereof, i.e., the “target strand” versus the “non-target strand” of the target DNA sequence).
  • in order for a Cas nickase component of a prime editor to function, it also requires a specific protospacer adjacent motif (PAM) that varies depending on the Cas protein component itself, e.g., the type of Cas protein and the bacterial species from which it is derived.
  • Protospacer adjacent motif refers to a DNA sequence (e.g., an approximately 2-6 nucleotide sequence) that is an important targeting component of a Cas nuclease, e.g., a Cas9.
  • the PAM sequence is on either strand and is downstream in the 5′ to 3′ direction of the Cas9 cut site.
  • the canonical PAM sequence (i.e., the PAM sequence that is associated with the Cas9 nuclease of Streptococcus pyogenes, or SpCas9) is 5′-NGG-3′, wherein “N” is any nucleobase followed by two guanine (“G”) nucleobases.
  • SpCas9 can also recognize additional non-canonical PAMs (e.g., NAG and NGA).
  • Different PAM sequences can be associated with different Cas9 nucleases or equivalent proteins from different organisms.
  • any given Cas9 nuclease may be modified to alter the PAM specificity of the nuclease such that the nuclease recognizes an alternative PAM sequence.
  • the PAM sequence can be modified by introducing one or more mutations, including (a) D1135V, R1335Q, and T1337R “the VQR variant,” which alters the PAM specificity to NGAN or NGNG, (b) D1135E, R1335Q, and T1337R “the EQR variant,” which alters the PAM specificity to NGAG, and (c) D1135V, G1218R, R1335E, and T1337R “the VRER variant,” which alters the PAM specificity to NGCG.
  • Cas9 enzymes from different bacterial species can have varying PAM specificities.
  • Cas9 from Staphylococcus aureus (SaCas9) recognizes NNGRRT or NNGRRN.
  • Cas9 from Neisseria meningitidis (NmCas) recognizes NNNNGATT.
  • Cas9 from Streptococcus thermophilus (StCas9) recognizes NNAGAAW.
  • Cas9 from Treponema denticola (TdCas) recognizes NAAAAC.
  • non-SpCas9s bind a variety of PAM sequences, which makes them useful when no suitable SpCas9 PAM sequence is present at the desired target cut site.
  • non-SpCas9s may have other characteristics that make them more useful than SpCas9.
  • Cas9 from Staphylococcus aureus (SaCas9) is about 1 kilobase smaller than SpCas9, so it can be packaged into adeno-associated virus (AAV).
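PAM sequences such as those listed above are written with IUPAC degenerate codes (N = any base, R = A or G, W = A or T). A minimal sketch of how such motifs could be located in a DNA string is given below. The dictionary keys, the function name `find_pam_sites`, and the toy sequence are assumptions for illustration; only a subset of the PAMs named above is included, and only the given strand is scanned.

```python
import re

# IUPAC codes needed for the PAMs used here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "R": "[AG]", "W": "[AT]"}

# Subset of the PAMs named in the text; the keys are labels chosen for this sketch.
PAMS = {
    "SpCas9": "NGG",
    "SpCas9 EQR variant": "NGAG",
    "SpCas9 VRER variant": "NGCG",
    "NmCas": "NNNNGATT",
    "StCas9": "NNAGAAW",
    "TdCas": "NAAAAC",
}

def find_pam_sites(seq: str, pam: str) -> list[int]:
    """0-based start positions of every (possibly overlapping) PAM match in seq."""
    body = "".join(IUPAC[base] for base in pam)
    # A lookahead lets overlapping matches be reported.
    pattern = re.compile("(?=" + body + ")")
    return [m.start() for m in pattern.finditer(seq)]

toy = "ACGGTGGAAAAC"  # toy sequence (assumption)
spcas9_sites = find_pam_sites(toy, PAMS["SpCas9"])
tdcas_sites = find_pam_sites(toy, PAMS["TdCas"])
print(spcas9_sites, tdcas_sites)
```

To cover PAMs on the opposite strand, the same search could be run on the reverse complement of the sequence, with the reported positions mapped back accordingly.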
  • Reverse transcriptase describes a class of polymerases characterized as RNA-dependent DNA polymerases. All known reverse transcriptases require a primer to synthesize a DNA transcript from an RNA template. Historically, reverse transcriptase has been used primarily to transcribe mRNA into cDNA, which can then be cloned into a vector for further manipulation.
  • Avian myeloblastosis virus (AMV) reverse transcriptase was the first widely used RNA-dependent DNA polymerase (Verma, Biochim. Biophys. Acta 473:1 (1977)).
  • the enzyme has 5′-3′ RNA-directed DNA polymerase activity, 5′-3′ DNA-directed DNA polymerase activity, and RNase H activity.
  • RNase H is a processive 5′ and 3′ ribonuclease specific for the RNA strand of RNA-DNA hybrids (Perbal, A Practical Guide to Molecular Cloning, New York: Wiley & Sons (1984)).
  • M-MLV reverse transcriptase substantially lacking in RNase H activity has also been described. See, e.g., U.S. Pat. No. 5,244,797.
  • the invention contemplates the use of any such reverse transcriptases, or variants or mutants thereof.
  • the invention contemplates the use of reverse transcriptases that are error-prone, i.e., that may be referred to as error-prone reverse transcriptases or reverse transcriptases that do not support high fidelity incorporation of nucleotides during polymerization.
  • the error-prone reverse transcriptase can introduce one or more nucleotides that are mismatched with the RT template sequence, thereby introducing changes to the nucleotide sequence through erroneous polymerization of the single-strand DNA flap.
  • These errors introduced during synthesis of the single strand DNA flap then become integrated into the double strand molecule through hybridization to the corresponding endogenous target strand, removal of the endogenous displaced strand, ligation, and then through one more round of endogenous DNA repair and/or replication processes.
  • the prime editors used in the complexes and methods provided herein comprise MMLV RT.
  • Reverse transcription indicates the capability of an enzyme to synthesize a DNA strand (that is, complementary DNA or cDNA) using RNA as a template.
  • the reverse transcription can be “error-prone reverse transcription,” which refers to the properties of certain reverse transcriptase enzymes that are error-prone in their DNA polymerization activity.
  • Prime editing typically involves the resolution of heteroduplex DNA (i.e., containing one edited and one non-edited strand) formed as a result of installation of one or more desired nucleotide changes in the edit strand but not (yet) in the non-edit strand of the target DNA sequence.
  • Second-strand nicking can be used herein to help drive the resolution of heteroduplex DNA in favor of permanent integration of the edited strand into the DNA molecule.
  • the concept of “second-strand nicking” refers to the introduction of a second nick on the unedited strand.
  • a second nick is introduced at a location on the non-edit strand corresponding to a position downstream of the first nick (i.e., the initial nick site that provides the free 3′ end for use in priming of the reverse transcriptase on the extended portion of the guide RNA) on the edit strand.
  • the first nick (introduced by the prime editor in combination with the PEgRNA) and the second nick (introduced by the prime editor and a second-strand nicking guide RNA) are on opposite strands.
  • the first nick is on the non-target strand (i.e., the strand that forms the single strand portion of the R-loop), and the second nick is on the target strand.
  • the first nick (introduced by the prime editor in combination with the PEgRNA) is on the edit strand, and the second nick is on the non-edit strand.
  • the second nick can be introduced in the non-edit strand at a position that is opposite at least 1, 2, 3, 4, or 5 nucleotides downstream or upstream of the first nick of the edit strand, or that is opposite at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, or 150 or more nucleotides downstream or upstream of the first nick of the edit strand.
  • the second nick can also be introduced in the non-edit strand at a position that is opposite at least 1, 2, 3, 4, or 5 nucleotides downstream or upstream of the edit site of the edit strand, or that is opposite at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, or 150 or more nucleotides downstream or upstream of the edit site of the edit strand.
  • the second nick, in certain embodiments, can be introduced in the non-edit strand at a position that is opposite about 1-150 nucleotides downstream or upstream of the first nick of the edit strand, or that is opposite about 1-140, or about 1-130, or about 1-120, or about 1-110, or about 1-100, or about 1-90, or about 1-80, or about 1-70, or about 1-60, or about 1-50, or about 1-40, or about 1-30, or about 1-20, or about 1-10 nucleotides downstream or upstream of the first nick of the edit strand.
  • the second nick induces the cell’s endogenous DNA repair and replication processes towards replacement of the non-edit strand, thereby permanently installing the edited sequence on both strands of the target DNA and resolving the heteroduplex that is formed as a result of PE.
  • the second strand nicking guide RNA (also referred to herein as the nicking guide RNA, ngRNA, secondary nicking RNA, or second strand nicking sgRNA) may include a spacer sequence that preferentially and/or selectively only anneals to the edit strand after the desired nucleotide edit(s) are installed but not to the original strand of DNA that becomes replaced by the edited strand (i.e., the 5′ single-strand DNA flap that is displaced and ultimately removed during heteroduplex resolution).
  • This can be referred to as “temporal second-strand nicking” because the second strand nicking occurs only after prime editing has generated the new 3′ DNA flap containing the desired edit.
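The selectivity requirement for temporal second-strand nicking can be expressed as a simple check: the nicking guide's spacer should anneal to the edit strand only once the edit is present. The sketch below is an illustration under stated assumptions: the function names and toy sequences are hypothetical, annealing is modeled as perfect complementarity (real Cas9 targeting tolerates some mismatches), and PAM availability on the non-edit strand is not checked.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def anneals_to(strand: str, spacer: str) -> bool:
    """True if the spacer (DNA alphabet) is perfectly complementary to part of strand."""
    return revcomp(spacer) in strand

def is_temporal_nicking_spacer(original_edit_strand: str,
                               edited_edit_strand: str,
                               spacer: str) -> bool:
    """True if the spacer anneals to the edited edit strand but not the original one."""
    return (anneals_to(edited_edit_strand, spacer)
            and not anneals_to(original_edit_strand, spacer))

# Toy edit strand before and after a single T -> A edit at index 24 (assumed).
original = "TTGACCGCATGCATCGATCGTAGCTATGGACGTAC"
edited = original[:24] + "A" + original[25:]

spanning = revcomp(edited[15:35])      # spacer that overlaps the edited position
non_spanning = revcomp(edited[0:20])   # spacer that lies entirely outside the edit

selective = is_temporal_nicking_spacer(original, edited, spanning)
unselective = is_temporal_nicking_spacer(original, edited, non_spanning)
print(selective, unselective)
```

A spacer that does not overlap the edit anneals equally well before and after editing, so it cannot provide the temporal selectivity described above.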
  • spacer sequence in connection with a guide RNA or a PEgRNA refers to the portion of the guide RNA or PEgRNA of about 20 nucleotides that contains a nucleotide sequence that shares the same sequence as the protospacer sequence in the target DNA sequence.
  • split Cas9 refers to a fusion protein with a split site located within the Cas9 protein that is provided as an N-terminal portion (also referred to as an N-terminal half) and a C-terminal portion (also referred to as a C-terminal half) encoded by two separate nucleotide sequences.
  • the polypeptides corresponding to the N-terminal portion and the C-terminal portion of the Cas9 protein may be combined (joined) to form a complete Cas9 protein.
  • a Cas9 protein is known to consist of a bi-lobed structure linked by a disordered linker (e.g., as described in Nishimasu et al., Cell, Volume 156, Issue 5, pp. 935–949, 2014, incorporated herein by reference).
  • the “split” occurs between the two lobes, generating two portions of a Cas9 protein, each containing one lobe.
  • Although inteins are most frequently found as a contiguous domain, some exist in a naturally split form (split inteins).
  • An exemplary split intein is the Ssp DnaE intein, which comprises two subunits, namely, DnaE-N and DnaE-C.
  • the two different subunits are encoded by separate genes, namely dnaE-n and dnaE-c, which encode the DnaE-N and DnaE-C subunits, respectively.
  • DnaE is a naturally occurring split intein in Synechocystis sp. PCC6803 and is capable of directing trans-splicing of two separate proteins, each comprising a fusion with either DnaE-N or DnaE-C.
  • Additional naturally occurring or engineered split-intein sequences are known in the art or can be made from whole-intein sequences described herein or those available in the art.
  • split-intein sequences can be found in Stevens et al., “A promiscuous split intein with expanded protein engineering applications,” PNAS, 2017, Vol. 114: 8538-8543; and Iwai et al., “Highly efficient protein trans-splicing by a naturally split DnaE intein from Nostoc punctiforme,” FEBS Lett, 580: 1853-1858, each of which is incorporated herein by reference. Additional split intein sequences can be found, for example, in WO 2013/045632, WO 2014/055782, WO 2016/069774, and EP2877490, the contents of each of which are incorporated herein by reference.
  • a “split prime editor” refers to a prime editor that is provided as an N-terminal portion (also referred to as a N-terminal half) and a C-terminal portion (also referred to as a C-terminal half) encoded by two separate nucleic acids.
  • the polypeptides corresponding to the N-terminal portion and the C-terminal portion of the prime editor may be combined to form a complete prime editor.
  • the “split” is located in the dCas9 or nCas9 domain, at positions as described herein in the split prime editor.
  • the N-terminal portion of the prime editor contains the N-terminal portion of the split prime editor
  • the C-terminal portion of the prime editor contains the C-terminal portion of the split prime editor.
  • intein-N or intein-C may be fused to the N-terminal portion or the C-terminal portion of the prime editor, respectively, for the joining of the N- and C-terminal portions of the prime editor to form a complete prime editor.
  • the split site is within a Cas9 protein of a prime editor.
  • the split site is between amino acid residues 844 and 845 of a Cas9 protein within a prime editor (e.g., a Cas9 protein of SEQ ID NO: 6).
  • the split site is between amino acid residues 1024 and 1025 of a Cas9 protein within a prime editor (e.g., a Cas9 protein of SEQ ID NO: 6).
  • Silent mutation refers to a mutation in a nucleic acid molecule that does not have an effect on the phenotype of the nucleic acid molecule, or the protein it produces if it encodes a protein. Silent mutations can be introduced into coding regions of a nucleic acid (i.e., segments of a gene that encode for a protein), or they can be introduced in non-coding regions of a nucleic acid.
  • a silent mutation in a nucleic acid sequence may be a nucleotide alteration that does not alter the expression or function of the amino acid sequence encoded by the nucleic acid sequence, or other functional features of the target nucleic acid sequence.
  • where silent mutations are present in a coding region, they may be synonymous mutations.
  • Synonymous mutations refer to substitutions of one base for another in a gene such that the corresponding amino acid residue of the protein produced by the gene is not modified. This is due to the redundancy of the genetic code, allowing for multiple different codons to encode for the same amino acid in a particular organism.
  • when a silent mutation is in a noncoding region or a junction of a coding region and a non-coding region (e.g., an intron/exon junction), it may be in a region that does not impact any biological properties of the nucleic acid molecule (e.g., splicing, gene regulation, RNA lifetime, etc.).
  • a silent mutation may also be a “benign” mutation, for example, where a nucleotide substitution results in one or more alterations in the amino acid sequence encoded, but does not result in detrimental impact on the expression or function of the polypeptide.
  • Silent mutations may be useful, for example, for increasing the length of contiguous changes in a desired nucleotide edit or the number of nucleotide edits made to a target nucleotide sequence using prime editing to evade correction of the edit by the MMR pathway.
  • the number of silent mutations installed may be one, or two, or three, or four, or five, or six, or seven, or eight, or nine, or ten, or more.
  • the silent mutations may be installed within one, or two, or three, or four, or five, or six, or seven, or eight, or nine, or ten, or 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides from the intended edit site.
  • silent mutations are installed in order to alter or optimize the secondary structure that a particular pegRNA will form in a cell.
  • changing some bases of a pegRNA with silent mutations results in changes to the secondary structure of the pegRNA that can improve editing efficiency.
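  • The definition of a synonymous mutation given above can be checked mechanically. The following is a minimal Python sketch, assuming the standard genetic code (NCBI translation table 1); the helper name `is_synonymous` is hypothetical and does not appear in the disclosure.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(original_codon: str, edited_codon: str) -> bool:
    """Return True if both DNA codons encode the same amino acid (hypothetical helper)."""
    return CODON_TABLE[original_codon.upper()] == CODON_TABLE[edited_codon.upper()]


# CAG and CAA both encode glutamine, so this substitution is synonymous.
print(is_synonymous("CAG", "CAA"))
# CAG (glutamine) to CCG (proline) changes the encoded residue.
print(is_synonymous("CAG", "CCG"))
```

  • The sketch only addresses the coding-region case; whether a change in a non-coding region is silent depends on splicing, regulation, and similar properties that a codon table cannot capture.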
  • subject refers to an individual organism, for example, an individual mammal. In some embodiments, the subject is a human.
  • the subject is a non-human mammal. In some embodiments, the subject is a non-human primate. In some embodiments, the subject is a rodent. In some embodiments, the subject is a sheep, a goat, a cattle, a cat, or a dog. In some embodiments, the subject is a vertebrate, an amphibian, a reptile, a fish, an insect, a fly, or a nematode. In some embodiments, the subject is a research animal. In some embodiments, the subject is genetically engineered, e.g., a genetically engineered non-human subject. The subject may be of either sex and may be at any stage of development.
  • a subject has or is suspected of having a triplet repeat disorder. In some embodiments, a subject has or is suspected of having Huntington’s disease. In some embodiments, a subject has or is suspected of having Friedreich’s ataxia.
  • Target site refers to a sequence within a nucleic acid molecule that is edited by a prime editor (PE) disclosed herein. The target site further refers to the sequence within a nucleic acid molecule to which a complex of the prime editor (PE) and gRNA binds. In some embodiments, a target site comprises a trinucleotide repeat sequence. In some embodiments, a target site comprises part of the HTT gene.
  • a target site comprises part of the FXN gene.
  • Temporal second-strand nicking refers to a variant of second strand nicking whereby the installation of the second-strand nick in the unedited strand occurs only after the desired edit is installed in the edited strand by the PE complexed with the PEgRNA. Without being bound by theory, the second-strand nick in the unedited strand induces the cell’s endogenous DNA repair and replication processes towards replacement of the unedited strand, thereby permanently installing the edited sequence on both strands and resolving the heteroduplex that is formed as a result of PE.
  • a prime editor system comprising a second strand nicking guide RNA designed with the temporal second strand nicking strategy, which can avoid concurrent nicks on both strands that could lead to double-stranded DNA breaks.
  • the second-strand nicking guide RNA is designed for temporal control such that the second strand nick is not introduced until after the installation of the desired edit. This is achieved by designing a gRNA with a spacer sequence that matches only the edited strand, but not the original allele. Using this strategy, mismatches between the spacer of the second-strand nicking guide RNA and the unedited allele should disfavor second-strand nicking until after the editing event on the PAM strand takes place.
  • the second strand nicking guide RNA may include a spacer sequence that preferentially and/or selectively only anneals to the edited strand (i.e., after PE synthesizes the edit), but not to the original strand of DNA that becomes replaced by the edited strand (i.e., the 5′ single-strand DNA flap that is displaced and ultimately removed during heteroduplex resolution).
  • the second strand nicking guide RNA can operate by designing the second strand nicking guide RNA to comprise a spacer sequence that anneals only to the edited region of the edited strand (and thus, wherein the spacer of the second strand nicking guide RNA comprises a nucleotide sequence that is the complement of the edited sequence or region thereof and includes the complement of the edit) and thus, can discriminate between the edited strand and the original strand of the displaced 5′ single-strand DNA flap that is immediately downstream of the cut site of the edited strand.
  • a prime editor system (e.g., a PE3b system or a PE5b system) comprises components that improve temporal second-strand nicking by including PE-based installation of one or more silent mutations around an edit site (e.g., introducing one or more silent mutations located upstream and/or downstream of a non-silent, desired nucleotide edit or adjacent to the non-silent nucleotide edit).
  • a prime editor system comprises a pegRNA, the DNA synthesis template of which comprises one or more non-silent nucleotide edits and further comprises one or more silent mutations compared to the endogenous sequence of the target strand (and accordingly encodes a single stranded DNA comprising the one or more non-silent nucleotide edits and the silent mutations compared to the endogenous sequence of the edit strand).
  • the one or more silent mutations are adjacent to or immediately adjacent to a non-silent nucleotide edit in the DNA synthesis template.
  • the one or more silent mutations are within 5 nucleotides upstream of the non-silent nucleotide edit.
  • the one or more silent mutations are within 5 nucleotides downstream of the non-silent nucleotide edit. In some embodiments, the one or more silent mutations are immediately adjacent to the non-silent nucleotide edit, such that the DNA synthesis template contains at least 3 contiguous nucleotides that are not complementary to the corresponding endogenous sequence downstream of the nick site on the edit strand of the target DNA sequence. Without wishing to be bound by a particular theory, such silent mutations may improve prime editing efficiency by evading the cellular mismatch repair pathway, thereby avoiding reversion of the PE-installed edit on the edit strand back to the pre-edited sequence.
  • Such silent mutations may also improve prime editing efficiency by altering or optimizing the secondary structure of the pegRNA.
  • a prime editor system comprising a pegRNA with the one or more silent mutations in addition to the non-silent mutation in the DNA synthesis template can result in improved editing efficiency of the target DNA, as compared to a control prime editor system comprising a pegRNA that only contains the non-silent mutation and not the one or more silent mutations in the DNA synthesis template.
  • combining PE3b designs with the silent mutations can further improve prime editing efficiency and/or reduce the indel frequency resulting from editing.
  • the single-strand nicking guide RNA, which comprises a spacer sequence that is complementary to the PE-edited strand, can discriminate between the edited strand and the original strand, which corresponds to the displaced 5′ single-strand DNA flap that is immediately downstream of the first nick site of the edited strand.
  • the silent mutations may be installed in coding regions of the target nucleic acid molecule or in non-coding regions of the target nucleic acid molecule. When the silent mutations are installed in a coding region, they introduce into the nucleic acid molecule one or more alternate codons encoding the same amino acid as the unedited nucleic acid molecule.
  • when installed in a non-coding region, the silent mutations may be present in a region of the nucleic acid molecule that does not influence splicing, gene regulation, RNA lifetime, or other biological properties of the target site on the nucleic acid molecule.
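  • The temporal second-strand nicking strategy described above relies on mismatches between the nicking spacer and the unedited allele. The following Python sketch illustrates that idea under simplifying assumptions: all sequences are invented 20-nt toy strings, the comparison is made on a single strand, and PAM requirements are ignored. The function name `count_mismatches` is hypothetical.

```python
def count_mismatches(spacer: str, target: str) -> int:
    """Count position-wise mismatches between two equal-length sequences."""
    if len(spacer) != len(target):
        raise ValueError("spacer and target must have the same length")
    return sum(a != b for a, b in zip(spacer.upper(), target.upper()))


# Toy sequences, invented for illustration only.
unedited = "ACGTTGCAAGCTTAGGCTAA"
edited = "ACGTTGCAAGCATAGGCTAA"              # one desired edit at position 12
edited_plus_silent = "ACGTTGCAAGCATCGGCTAA"  # plus one nearby change, treated as silent here

# A nicking spacer designed against the edited sequence matches it perfectly
# and is mismatched against the unedited allele.
print(count_mismatches(edited, edited))
print(count_mismatches(edited, unedited))

# Adding a silent mutation next to the edit increases the number of mismatches
# against the unedited allele, which is the basis for improved discrimination.
print(count_mismatches(edited_plus_silent, unedited))
```

  • In this toy case the spacer built from the edit alone differs from the unedited allele at one position, and the spacer built from the edit plus the additional change differs at two.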
  • treatment refers to a clinical intervention aimed to reverse, alleviate, delay the onset of, or inhibit the progress of a disease or disorder, or one or more symptoms thereof, as described herein.
  • treatment may be administered after one or more symptoms have developed and/or after a disease has been diagnosed.
  • treatment may be administered in the absence of symptoms, e.g., to prevent or delay onset of a symptom or inhibit onset or progression of a disease.
  • treatment may be administered to a susceptible individual prior to the onset of symptoms (e.g., in light of a history of symptoms and/or in light of genetic or other susceptibility factors).
  • Treatment may also be continued after symptoms have resolved, for example, to prevent or delay their recurrence.
  • a treatment is administered to treat a triplet repeat disorder.
  • a treatment is administered to treat Huntington’s disease.
  • a treatment is administered to treat Friedreich’s ataxia.
  • Triplet Repeat Disorder
  • [0186] The terms “triplet repeat disorder,” “trinucleotide repeat disorder,” “triplet repeat expansion disorder,” and “trinucleotide repeat expansion disorder” refer to a number of human diseases, including Huntington’s disease, Fragile X syndrome, and Friedreich’s ataxia, associated with expansion of particular trinucleotide repeat sequences.
  • Triplet repeat disorders include, for example, those associated with the following genes and numbers of pathogenic polyglutamine trinucleotide repeats:
  • [0188] Triplet repeat disorders include, for example, those associated with the following genes and numbers of pathogenic non-polyglutamine trinucleotide repeats:
  • Trinucleotide repeat expansion disorders are complex, progressive disorders that involve developmental neurobiology and often affect cognition as well as sensory-motor functions. The disorders show genetic anticipation (i.e., increased severity with each generation). The DNA expansions or contractions usually happen meiotically (i.e., during the time of gametogenesis, or early in embryonic development), and often have a sex bias, meaning that some genes expand only when inherited through the female, and others only through the male. In humans, trinucleotide repeat expansion disorders can cause gene silencing at either the transcriptional or translational level, which essentially knocks out gene function.
  • trinucleotide repeat expansion disorders can cause altered proteins generated with large repetitive amino acid sequences that either abrogate or change protein function, often in a dominant-negative manner (e.g., poly-glutamine diseases).
  • triplet expansion is caused by slippage during DNA replication or during DNA repair synthesis. Because the tandem repeats have identical sequence to one another, base pairing between two DNA strands can take place at multiple points along the sequence. This may lead to the formation of “loop out” structures during DNA replication or DNA repair synthesis. This may also lead to repeated copying of the repeated sequence, expanding the number of repeats. Additional mechanisms involving hybrid RNA:DNA intermediates have been proposed.
  • Trinucleotide repeat expansion proteins are a diverse set of proteins associated with susceptibility for developing a trinucleotide repeat expansion disorder, the presence of a trinucleotide repeat expansion disorder, the severity of a trinucleotide repeat expansion disorder, or any combination thereof. Trinucleotide repeat expansion disorders are divided into two categories determined by the type of repeat. The most common repeat is the triplet CAG, which, when present in the coding region of a gene, codes for the amino acid glutamine (Q).
  • polyglutamine disorders comprise the following diseases: Huntington’s Disease (HD); Spinobulbar Muscular Atrophy (SBMA); Spinocerebellar Ataxias (SCA types 1, 2, 3, 6, 7, and 17); and Dentatorubro-Pallidoluysian Atrophy (DRPLA).
  • the remaining trinucleotide repeat expansion disorders either do not involve the CAG triplet or the CAG triplet is not in the coding region of the gene. These disorders, therefore, are referred to as the non-polyglutamine disorders.
  • the non-polyglutamine disorders comprise Fragile X Syndrome (FRAXA); Fragile XE Mental Retardation (FRAXE); Friedreich’s Ataxia (FRDA); Myotonic Dystrophy (DM); and Spinocerebellar Ataxias (SCA types 8 and 12).
  • Non-limiting examples of proteins associated with trinucleotide repeat expansion disorders include AR (androgen receptor), FMR1 (fragile X mental retardation 1), HTT (huntingtin), DMPK (dystrophia myotonica-protein kinase), FXN (frataxin), ATXN2 (ataxin 2), ATN1 (atrophin 1), FEN1 (flap structure-specific endonuclease 1), TNRC6A (trinucleotide repeat containing 6A), PABPN1 (poly(A) binding protein, nuclear 1), JPH3 (junctophilin 3), MED15 (mediator complex subunit 15), ATXN1 (ataxin 1), ATXN3 (ataxin 3), TBP (TATA box binding protein), CACNA1A (calcium channel, voltage-dependent, P/Q type, alpha 1A subunit), ATXN8OS (ATXN8 opposite strand (non-protein coding)), PPP2R2
  • guanine nucleotide binding protein (G protein), beta polypeptide 2
  • ribosomal protein L14
  • ATXN8 (ataxin 8)
  • INSR (insulin receptor)
  • TTR (transthyretin)
  • EP400 (E1A binding protein p400)
  • GIGYF2 (GYF protein 2)
  • OGG1 (8-oxoguanine DNA glycosylase)
  • STC1 (stanniocalcin 1)
  • CNDP1 (carnosine dipeptidase 1 (metallopeptidase M20 family))
  • NCOA3 (nuclear receptor coactivator 3)
  • ERDA1 (expanded repeat domain, CAG/CTG 1)
  • TSC1 (tuberous sclerosis 1)
  • COMP (cartilage oligomeric matrix protein)
  • GCLC (glutamate-cysteine ligase, catalytic subunit)
  • RRAD (Ras-related associated with diabetes)
  • MSH3 (mutS homolog 3 (E. coli))
  • GLA (galactosidase, alpha)
  • CBL (Cas-Br-M (murine) ecotropic retroviral transforming sequence)
  • FTH1 (ferritin, heavy polypeptide 1)
  • IL12RB2 (interleukin 12 receptor, beta 2)
  • OTX2 (orthodenticle homeobox 2)
  • HOXA5 (homeobox A5)
  • POLG2 (polymerase (DNA directed), gamma 2, accessory subunit)
  • DLX2 (distal-less homeobox 2)
  • SIRPA (signal-regulatory protein alpha)
  • OTX1 (orthodenticle homeobox 1)
  • AHRR (aryl-hydrocarbon receptor repressor)
  • MANF (mesencephalic astrocyte-derived neurotrophic factor)
  • TMEM158 (transmembrane protein 158 (gene/pseudogene))
  • ENSG00000078687
  • the protein associated with a trinucleotide repeat disorder is the HTT protein. In certain embodiments, the protein associated with a trinucleotide repeat disorder is the FXN protein.
  • Variant should be taken to mean the exhibition of qualities that have a pattern that deviates from what occurs in nature, e.g., a variant Cas9 is a Cas9 comprising one or more changes in amino acid residues as compared to a wild type Cas9 amino acid sequence.
  • vector refers to a nucleic acid that can be modified to encode a gene of interest and that is able to enter a host cell, mutate, and replicate within the host cell, and then transfer a replicated form of the vector into another host cell.
  • Exemplary suitable vectors include viral vectors, such as retroviral vectors or bacteriophages and filamentous phage, and conjugative plasmids. Additional suitable vectors will be apparent to those of skill in the art based on the instant disclosure.
  • Wild type
  • As used herein, the term “wild type” is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene, or characteristic as it occurs in nature as distinguished from mutant or variant forms.
  • DETAILED DESCRIPTION
  • [0196] The present disclosure provides compositions, systems, and methods useful in the treatment of trinucleotide repeat disorders, including Huntington’s disease and Friedreich’s ataxia. The present disclosure also provides pegRNAs designed to target the HTT or FXN genes.
  • compositions comprising a prime editor and any of the pegRNAs disclosed herein are also provided by the present disclosure.
  • the present disclosure further provides polynucleotides, vectors, AAVs, cells, compositions, and kits. Methods of treating Huntington’s disease and Friedreich’s ataxia, as well as uses of the compositions, pegRNAs, and systems described herein, are also provided herein.
  • PEgRNAs
  • [0197] The prime editing complexes and methods described herein contemplate the use of any suitable PEgRNAs.
  • the present disclosure provides pegRNAs targeting the HTT and FXN genes as described herein.
  • pegRNAs may be useful, for example, in prime editing methods targeting pathogenic trinucleotide repeats in HTT and FXN for the treatment of Huntington’s disease and Friedreich’s ataxia.
  • a trinucleotide repeat is replaced with a trinucleotide repeat of a different length using the pegRNAs provided herein.
  • a trinucleotide repeat is contracted using the pegRNAs provided herein.
  • a trinucleotide repeat is deleted using the pegRNAs provided herein.
  • CAG repeats in the HTT gene are replaced, contracted, or deleted using the pegRNAs provided herein.
  • GAA repeats in the FXN gene (e.g., in intron 1 of the FXN gene) are replaced, contracted, or deleted using the pegRNAs provided herein.
  • the present disclosure provides pegRNAs targeting the HTT gene comprising a spacer sequence comprising the nucleic acid sequence: GACCCTGGAAAAGCTGATGA (SEQ ID NO: 381); GCTGCTGCTGGAAGGACTTG (SEQ ID NO: 382); GCTGCTGCTGCTGCTGCTGGA (SEQ ID NO: 383); GCTGCTGCTGCTGCTGCTGC (SEQ ID NO: 384); GGCGGCGGCGGCGGCGGTGG (SEQ ID NO: 385); TGAGGAAGCTGAGGAGGCGG (SEQ ID NO: 386); or GGCGGCTGAGGAAGCTGAGG (SEQ ID NO: 387), or a nucleic acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 381-387.
  • the pegRNA comprises the spacer sequence GACCCTGGAAAAGCTGATGA (SEQ ID NO: 381), or a spacer sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 381.
  • the pegRNA comprises a primer binding site (PBS) of about 8 to about 16 nucleotides in length.
  • the pegRNA comprises a PBS of about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, or about 16 nucleotides in length.
  • the pegRNA comprises a PBS of about 10 nucleotides in length.
  • the pegRNA comprises a reverse transcription template comprising a sequence of about 15 to about 35 nucleotides, wherein the sequence encodes a sequence that comprises one or more nucleotide edits compared to a sequence directly upstream of a CAG repeat in exon 1 of the HTT gene.
  • the one or more nucleotide edits comprises a PAM mutation that alters a PAM sequence.
  • the PAM sequence is NGG, wherein N is any one of nucleotides A, G, C, or T.
  • the pegRNA comprises a reverse transcription template comprising a sequence of about 26 nucleotides, wherein the sequence encodes a sequence that comprises one or more nucleotide edits compared to a sequence directly upstream of a CAG repeat in exon 1 of the HTT gene.
  • the pegRNA comprises a reverse transcription template comprising an edit template encoding or comprising one or more repeats of the trinucleotide sequence CAG (for example, 4-35 or 4-10 repeats).
  • the pegRNA comprises a reverse transcription template comprising an edit template encoding or comprising the nucleotide sequence CAGCAGCAGCAG (SEQ ID NO: 874), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 874.
  • the pegRNA comprises a reverse transcription template comprising an edit template comprising the nucleotide sequence CAGCAGCAGCAGCAGCAGCAGCAGCAG (SEQ ID NO: 875), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 875.
  • the edit template further comprises one or more nucleotides of the trinucleotide sequence CAA.
  • the pegRNA comprises a reverse transcription template comprising an edit template comprising the nucleotide sequence CAGCAGCAGCAGCAACAACAA (SEQ ID NO: 876), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 876.
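  • The edit template sequences recited above are tandem CAG units, optionally followed by CAA units. The following Python sketch shows one way to tally the trinucleotide units in such a sequence; the helper names `split_triplets` and `repeat_composition` are hypothetical, and the sketch assumes the sequence is read in frame from its first nucleotide.

```python
from collections import Counter


def split_triplets(seq: str) -> list[str]:
    """Split a sequence into consecutive in-frame trinucleotide units."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def repeat_composition(seq: str) -> dict[str, int]:
    """Count how many times each trinucleotide unit occurs in the sequence."""
    return dict(Counter(split_triplets(seq)))


# Four CAG units, as in the sequence recited for SEQ ID NO: 874.
print(repeat_composition("CAG" * 4))

# CAG units followed by CAA units, as in the sequence recited for SEQ ID NO: 876.
print(repeat_composition("CAGCAGCAGCAGCAACAACAA"))
```

  • Under the standard genetic code, CAG and CAA both encode glutamine, so a template that mixes the two units still encodes an uninterrupted glutamine tract.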
  • the pegRNA comprises a reverse transcription template comprising a homology arm of about 16 to about 50 nucleotides in length, wherein the homology arm is complementary to a sequence directly downstream of a CAG repeat in exon 1 of a wildtype HTT gene in the coding strand.
  • the pegRNA comprises a reverse transcription template comprising a homology arm of about 31 nucleotides in length. In certain embodiments, the pegRNA comprises a reverse transcription template comprising a homology arm of about 40 nucleotides in length.
  • the pegRNA for editing HTT is an engineered pegRNA (epegRNA). In some embodiments, the pegRNA comprises at its 3' end a structural motif that improves stability of the pegRNA. In some embodiments, the epegRNA comprises an evopreq1 motif.
  • the evopreq1 motif comprises the sequence of SEQ ID NO: 442, a sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the sequence of SEQ ID NO: 442, or a sequence having 1, 2, 3, 4, or 5 mutations relative to SEQ ID NO: 442.
  • the pegRNA comprises a UA flip in the pegRNA scaffold sequence.
  • the pegRNA comprises the sequence of any one of SEQ ID NOs: 454-815, or a sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the sequence of any one of SEQ ID NOs: 454-815.
  • the present disclosure provides pegRNAs targeting the FXN gene comprising a spacer sequence comprising the nucleic acid sequence: GCAAGACTAACCTGGCCAACA (SEQ ID NO: 388); GTCCGGAGTTCAAGACTAACC (SEQ ID NO: 389); GAAGGTGGATCACCTGAGGTC (SEQ ID NO: 390); GTCTGGAGTAGCTGGGATTAC (SEQ ID NO: 391); or GCAGGCGCGCGACACCACGCC (SEQ ID NO: 392), or a nucleic acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 388-392.
  • the pegRNA comprises the spacer sequence GCAAGACTAACCTGGCCAACA (SEQ ID NO: 388), or a nucleic acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 388.
  • the pegRNA comprises a primer binding site (PBS) of about 8 to about 14 nucleotides in length.
  • the pegRNA comprises a PBS of about 8, about 9, about 10, about 11, about 12, about 13, or about 14 nucleotides in length.
  • the pegRNA comprises a PBS of about 10 nucleotides in length.
  • the pegRNA comprises a reverse transcription template comprising a homology arm of about 8 to about 50 nucleotides in length, wherein the homology arm is complementary to a sequence directly downstream of a GAA repeat in a wildtype FXN gene. In certain embodiments, the pegRNA comprises a reverse transcription template comprising a homology arm of about 40 nucleotides in length.
  • the pegRNA for editing FXN is an engineered pegRNA (epegRNA). In some embodiments, the pegRNA comprises at its 3' end a structural motif that improves stability of the pegRNA. In some embodiments, the epegRNA comprises an evopreq1 motif.
  • the evopreq1 motif comprises the sequence of SEQ ID NO: 442, a sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the sequence of SEQ ID NO: 442, or a sequence having 1, 2, 3, 4, or 5 mutations relative to SEQ ID NO: 442.
  • the pegRNA comprises a UA flip in the pegRNA scaffold sequence.
  • the pegRNA for editing FXN comprises the sequence of any one of SEQ ID NOs: 816-867, or a sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the sequence of any one of SEQ ID NOs: 816-867.
  • the present disclosure also provides gRNAs for nicking the non-PAM-containing strand of the target nucleotide sequence.
  • the spacer of the nicking gRNA targets the HTT gene and comprises the nucleotide sequence: GGCGGCTGAGGAAGCTGAGG (SEQ ID NO: 387); GGCGGCGGCGGCGGCGGTGG (SEQ ID NO: 385); GCTGCTGCTGCTGCTGCTGC (SEQ ID NO: 384); GAAGGACTTGAGGGACTCGA (SEQ ID NO: 393); GTGAGGAAGCTGAGGAGGCGG (SEQ ID NO: 394); GCTGTTGCTGCTGCTGCTGC (SEQ ID NO: 395); GCTGCTGCTGCTGCTGCTGGA (SEQ ID NO: 383); GCTGCTGCTGGAAGGACTTG (SEQ ID NO: 382); GGCCTTCATCAGCTTTTCC (SEQ ID NO: 396); GGCTTTCATCAGCTTTTCC (SEQ ID NO: 397); GGGACTCGAAGGCCTTCAT (SEQ ID NO: 398); GGAAGGACTTGAGGGACTCG
  • the spacer of the nicking gRNA comprises the nucleotide sequence GCTGCTGCTGCTGCTGCTGC (SEQ ID NO: 384), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 384.
  • the spacer of the nicking gRNA comprises the nucleotide sequence GCTGCTGGAAGGACTTGAG (SEQ ID NO: 406), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 406.
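  • The percent-identity thresholds recited for the spacer sequences can be illustrated with a simple position-wise comparison. The following Python sketch assumes equal-length sequences compared without gaps; sequence identity is more generally determined with an alignment algorithm, so this is a simplification rather than the measure used in the disclosure. The function name `percent_identity` is hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped, position-wise percent identity of two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100 * matches / len(a)


seq_384 = "GCTGCTGCTGCTGCTGCTGC"  # spacer recited as SEQ ID NO: 384
seq_395 = "GCTGTTGCTGCTGCTGCTGC"  # spacer recited as SEQ ID NO: 395

# The two 20-nt spacers differ at a single position.
print(percent_identity(seq_384, seq_395))
```

  • For a 20-nt spacer, each mismatch lowers ungapped identity by 5 percentage points, so an "at least 80% identical" spacer tolerates up to four mismatches under this simplified measure.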
  • the spacer of the nicking gRNA targets the FXN gene and comprises the nucleotide sequence: GTCCCAAAGTGCTGAGATTAT (SEQ ID NO: 410); GTGTATTTTTTAGTAGATACT (SEQ ID NO: 411); GATTCTCCTGCCGCAGCCTC (SEQ ID NO: 412); or GCGACACCACGCCCGGCTAAC (SEQ ID NO: 413), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any of these.
  • the spacer of the nicking gRNA comprises the nucleotide sequence GTGTATTTTTTAGTAGATACT (SEQ ID NO: 411), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 411.
  • the spacer of the nicking gRNA comprises the nucleotide sequence GCGACACCACGCCCGGCTAAC (SEQ ID NO: 413), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 413.
  • various gRNA core sequences may be utilized in the pegRNAs provided herein.
  • a gRNA core that binds to SpCas9 is used in the pegRNAs provided herein.
  • a gRNA core that binds to SaCas9 is used in the pegRNAs provided herein.
  • a gRNA core that binds to any Cas9 protein provided herein, or any Cas9 protein known in the art is used in the pegRNAs provided herein.
  • PEgRNA architecture [0210]
  • an extended guide RNA, or pegRNA, used in the prime editing complexes and methods disclosed herein includes a spacer sequence (e.g., a ~20 nt spacer sequence) and a gRNA core region, which binds with the napDNAbp.
  • the pegRNA includes an extended RNA segment, i.e., an extension arm, at the 5′ end, i.e., a 5′ extension.
  • the 5′ extension includes a reverse transcription template sequence, a primer binding site, and an optional 5-20 nucleotide linker sequence. The RT primer binding site hybridizes to the free 3′ end that is formed after a nick is formed in the non-target strand of the R-loop, thereby priming reverse transcriptase for DNA polymerization in the 5′-3′ direction.
  • an extended guide RNA used in the prime editing complexes and methods provided herein includes a spacer sequence (e.g., a ~20 nt spacer sequence) and a gRNA core, which binds with the napDNAbp.
  • the pegRNA includes an extended RNA segment, i.e., an extension arm, at the 3′ end, i.e., a 3′ extension.
  • the 3′ extension includes a reverse transcription template sequence, and a reverse transcription primer binding site.
  • an extended guide RNA used in the prime editing complexes and methods provided herein includes a spacer sequence (e.g., a ~20 nt spacer sequence) and a gRNA core, which binds with the napDNAbp.
  • the pegRNA includes an extended RNA segment, i.e., an extension arm, at an intramolecular position within the gRNA core, i.e., an intramolecular extension.
  • the intramolecular extension includes a reverse transcription template sequence, and a reverse transcription primer binding site. The RT primer binding site hybridizes to the free 3′ end that is formed after a nick is formed in the non-target strand of the R-loop, thereby priming reverse transcriptase for DNA polymerization in the 5′-3′ direction.
  • the position of the intramolecular RNA extension is not in the spacer sequence of the guide RNA.
  • the position of the intramolecular RNA extension is in the gRNA core. In still another embodiment, the position of the intramolecular RNA extension is anywhere within the guide RNA molecule except within the spacer sequence, or at a position which disrupts the spacer sequence. In one embodiment, the intramolecular RNA extension is inserted downstream from the 3′ end of the spacer sequence.
  • the intramolecular RNA extension is inserted at least 1 nucleotide, at least 2 nucleotides, at least 3 nucleotides, at least 4 nucleotides, at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, at least 8 nucleotides, at least 9 nucleotides, at least 10 nucleotides, at least 11 nucleotides, at least 12 nucleotides, at least 13 nucleotides, at least 14 nucleotides, at least 15 nucleotides, at least 16 nucleotides, at least 17 nucleotides, at least 18 nucleotides, at least 19 nucleotides, at least 20 nucleotides, at least 21 nucleotides, at least 22 nucleotides, at least 23 nucleotides, at least 24 nucleotides, or at least 25 nucleotides downstream of the 3′ end of the spacer sequence.
  • the intramolecular RNA extension is inserted into the gRNA core, which refers to the portion of a traditional guide RNA corresponding or comprising the tracrRNA, which binds and/or interacts with the napDNAbp, e.g., a Cas9 protein or equivalent thereof (i.e., a different napDNAbp).
  • the insertion of the intramolecular RNA extension does not disrupt or minimally disrupts the interaction between the tracrRNA portion and the napDNAbp.
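  • The 3′ extension architecture described above (spacer, gRNA core, then an extension arm carrying the reverse transcription template and primer binding site) can be modeled as a small data structure. The following Python sketch is illustrative only: the class name `PegRNA`, the placeholder core and extension sequences, the ordering of the RT template before the PBS within the extension, and the 8-16 nt PBS bounds (borrowed from the HTT embodiments) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PegRNA:
    """Toy model of a pegRNA with a 3' extension (hypothetical class)."""

    spacer: str       # ~20 nt, directs the napDNAbp to the target site
    core: str         # gRNA core bound by the napDNAbp
    rt_template: str  # reverse transcription template
    pbs: str          # primer binding site

    def __post_init__(self) -> None:
        # Assumed bounds: PBS of about 8 to about 16 nt, as in the HTT embodiments.
        if not 8 <= len(self.pbs) <= 16:
            raise ValueError("PBS length outside the assumed 8-16 nt range")

    def extension(self) -> str:
        """3' extension; RT template followed by PBS is an assumed ordering."""
        return self.rt_template + self.pbs

    def sequence(self) -> str:
        """Full 5'->3' sequence: spacer, gRNA core, then the 3' extension."""
        return self.spacer + self.core + self.extension()


peg = PegRNA(
    spacer="GACCCTGGAAAAGCTGATGA".replace("T", "U"),  # SEQ ID NO: 381 written as RNA
    core="N" * 12,         # placeholder, not a real gRNA core sequence
    rt_template="N" * 26,  # placeholder of about 26 nt
    pbs="N" * 10,          # placeholder of about 10 nt
)
print(len(peg.sequence()))
```

  • A 5′ extension or an intramolecular extension would change only where `extension()` is concatenated relative to the spacer and core.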
  • the length of the RNA extension (which includes at least the RT template and primer binding site) can be any useful length.
  • the RNA extension is at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, at least 8 nucleotides, at least 9 nucleotides, at least 10 nucleotides, at least 11 nucleotides, at least 12 nucleotides, at least 13 nucleotides, at least 14 nucleotides, at least 15 nucleotides, at least 16 nucleotides, at least 17 nucleotides, at least 18 nucleotides, at least 19 nucleotides, at least 20 nucleotides, at least 21 nucleotides, at least 22 nucleotides, at least 23 nucleotides, at least 24 nucleotides, at least 25 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, at least 60 nucleotides, at least 70 nucleotides, at least 80 nucleotides, at least
  • the RT template sequence can also be any suitable length.
  • the RT template sequence can be at least 3 nucleotides, at least 4 nucleotides, at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, at least 8 nucleotides, at least 9 nucleotides, at least 10 nucleotides, at least 11 nucleotides, at least 12 nucleotides, at least 13 nucleotides, at least 14 nucleotides, at least 15 nucleotides, at least 16 nucleotides, at least 17 nucleotides, at least 18 nucleotides, at least 19 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, at least 60 nucleotides, at least 70 nucleotides, at least 80 nucleotides, at least 90 nucleotides, at least 100 nucleotides
  • the reverse transcription primer binding site sequence is at least 3 nucleotides, at least 4 nucleotides, at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, at least 8 nucleotides, at least 9 nucleotides, at least 10 nucleotides, at least 11 nucleotides, at least 12 nucleotides, at least 13 nucleotides, at least 14 nucleotides, at least 15 nucleotides, at least 16 nucleotides, at least 17 nucleotides, at least 18 nucleotides, at least 19 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, at least 60 nucleotides, at least 70 nucleotides, at least 80 nucleotides, at least 90 nucleotides, at least 100 nucleotides, at least 200
  • the optional linker or spacer sequence is at least 3 nucleotides, at least 4 nucleotides, at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, at least 8 nucleotides, at least 9 nucleotides, at least 10 nucleotides, at least 11 nucleotides, at least 12 nucleotides, at least 13 nucleotides, at least 14 nucleotides, at least 15 nucleotides, at least 16 nucleotides, at least 17 nucleotides, at least 18 nucleotides, at least 19 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, at least 60 nucleotides, at least 70 nucleotides, at least 80 nucleotides, at least 90 nucleotides, at least 100 nucleotides, at least 200
  • the RT template sequence encodes a single-stranded DNA molecule which is homologous to the non-target strand (and thus, complementary to the corresponding site of the target strand) but includes one or more nucleotide changes.
  • the one or more nucleotide changes may include one or more single-base nucleotide changes, one or more deletions, and/or one or more insertions.
  • the synthesized single-stranded DNA product of the RT template sequence is homologous to the non-target strand except that it contains one or more nucleotide changes.
  • the single-stranded DNA product of the RT template sequence hybridizes in equilibrium with the complementary target strand sequence, thereby displacing the homologous endogenous target strand sequence.
  • the displaced endogenous strand may be referred to in some embodiments as a 5′ endogenous DNA flap species.
  • This 5′ endogenous DNA flap species can be removed by a 5′ flap endonuclease (e.g., FEN1) and the single-stranded DNA product, now hybridized to the endogenous target strand, may be ligated, thereby creating a mismatch between the endogenous sequence and the newly synthesized strand.
  • the mismatch may be resolved by the cell’s innate DNA repair and/or replication processes.
  • the nucleotide sequence of the RT template sequence corresponds to the nucleotide sequence of the non-target strand that becomes displaced as the 5′ flap species and that overlaps with the site to be edited.
  • the reverse transcription template sequence may encode a single-strand DNA flap that is complementary to an endogenous DNA sequence adjacent to a nick site, wherein the single-strand DNA flap comprises a desired nucleotide change.
  • the single-stranded DNA flap may displace an endogenous single-strand DNA at the nick site.
  • the displaced endogenous single-strand DNA at the nick site can have a 5′ end and form an endogenous flap, which can be excised by the cell.
  • excision of the 5′ end endogenous flap can help drive product formation since removing the 5′ end endogenous flap encourages hybridization of the single-strand 3′ DNA flap to the corresponding complementary DNA strand, and the incorporation or assimilation of the desired nucleotide change carried by the single-strand 3′ DNA flap into the target DNA.
  • cleavage site refers to a specific position in between two nucleotides or two base pairs in the double-stranded target DNA sequence. In some embodiments, the position of a nick site is determined relative to the position of a specific PAM sequence.
  • the nick site is the particular position where a nick will occur when the double stranded target DNA is contacted with a napDNAbp, e.g., a nickase such as a Cas nickase, that recognizes a specific PAM sequence.
  • a nick site (e.g., the “first nick site” when referred to in the context of PE3, PE5 and similar approaches) is characteristic of the particular napDNAbp with which the gRNA core of the PEgRNA associates and is characteristic of the particular PAM required for recognition and function of the napDNAbp.
  • a nick site is in the phosphodiester bond between bases three (“-3” position relative to position 1 of the PAM sequence) and four (“-4” position relative to position 1 of the PAM sequence).
  • a nick site is in a target strand of the double-stranded target DNA sequence.
  • a nick site is in a non-target strand of the double-stranded target DNA sequence.
  • the nick site is in a protospacer sequence.
  • the nick site is adjacent to a protospacer sequence.
  • a nick site is downstream of a region, e.g., on a non-target strand, that is complementary to a primer binding site of a PEgRNA. In some embodiments, a nick site is downstream of a region, e.g., on a non-target strand, that binds to a primer binding site of a PEgRNA. In some embodiments, a nick site is immediately downstream of a region, e.g., on a non-target strand, that is complementary to a primer binding site of a PEgRNA.
  • the nick site is upstream of a specific PAM sequence on the non-target strand of the double stranded target DNA, wherein the PAM sequence is specific for recognition by a napDNAbp that associates with the gRNA core of a PEgRNA. In some embodiments, the nick site is downstream of a specific PAM sequence on the non-target strand of the double stranded target DNA, wherein the PAM sequence is specific for recognition by a napDNAbp that associates with the gRNA core of a PEgRNA.
  • the nick site is 3 nucleotides upstream of the PAM sequence, and the PAM sequence is recognized by a Streptococcus pyogenes Cas9 nickase, a P. lavamentivorans Cas9 nickase, a C. diphtheriae Cas9 nickase, a N. cinerea Cas9, a S. aureus Cas9, or a N. lari Cas9 nickase.
  • the nick site is 3 nucleotides upstream of the PAM sequence, and the PAM sequence is recognized by a Cas9 nickase, wherein the Cas9 nickase comprises a nuclease active HNH domain and a nuclease inactive RuvC domain.
  • the nick site is 2 base pairs upstream of the PAM sequence, and the PAM sequence is recognized by a S. thermophilus Cas9 nickase.
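The positional convention above can be illustrated with a minimal sketch. It assumes an NGG PAM read on the non-target (PAM-containing) strand and a nick a fixed number of nucleotides upstream of the PAM (3 by default, 2 for the S. thermophilus case). The function name `find_nick_sites`, the 0-based indexing, and the example sequence are illustrative assumptions, not part of the disclosure.

```python
import re


def find_nick_sites(non_target_strand: str, offset: int = 3) -> list[tuple[int, int]]:
    """Return (pam_start, nick_index) pairs for every NGG PAM found.

    Indices are 0-based. nick_index is the index of the first base 3' of the
    nick, so the nick lies between bases nick_index - 1 and nick_index.
    """
    seq = non_target_strand.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs (e.g., within "GGG") are all reported.
    for match in re.finditer(r"(?=[ACGT]GG)", seq):
        pam_start = match.start()
        nick_index = pam_start - offset
        if nick_index >= 1:  # require at least one base 5' of the nick
            sites.append((pam_start, nick_index))
    return sites


# Hypothetical 20-nt protospacer followed by an AGG PAM.
example = "ACGTACGTACGTACGTACGT" + "AGG"
sites = find_nick_sites(example)            # nick 3 nt upstream of the PAM
sites_2bp = find_nick_sites(example, 2)     # nick 2 nt upstream of the PAM
```

In this sketch the nick falls between the bases at the “-4” and “-3” positions relative to position 1 of the PAM when `offset` is 3.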
  • the cellular repair of the single-strand DNA flap results in installation of the desired nucleotide change, thereby forming a desired product.
  • the desired nucleotide change is installed in an editing window that is between about -5 to +5 of the nick site, or between about -10 to +10 of the nick site, or between about -20 to +20 of the nick site, or between about -30 to +30 of the nick site, or between about -40 to +40 of the nick site, or between about -50 to +50 of the nick site, or between about -60 to +60 of the nick site, or between about -70 to +70 of the nick site, or between about -80 to +80 of the nick site, or between about -90 to +90 of the nick site, or between about -100 to +100 of the nick site, or between about -200 to +200 of the nick site.
  • the desired nucleotide change is installed in an editing window that is between about +1 to +2 from the nick site, or about +1 to +3, +1 to +4, +1 to +5, +1 to +6, +1 to +7, +1 to +8, +1 to +9, +1 to +10, +1 to +11, +1 to +12, +1 to +13, +1 to +14, +1 to +15, +1 to +16, +1 to +17, +1 to +18, +1 to +19, +1 to +20, +1 to +21, +1 to +22, +1 to +23, +1 to +24, +1 to +25, +1 to +26, +1 to +27, +1 to +28, +1 to +29, +1 to +30, +1 to +31, +1 to +32, +1 to +33, +1 to +34, +1 to +35, +1 to +36, +1 to +37, +1 to +38, +1 to +
  • the desired nucleotide change is installed in an editing window that is between about +1 to +2 from the nick site, or about +1 to +5, +1 to +10, +1 to +15, +1 to +20, +1 to +25, +1 to +30, +1 to +35, +1 to +40, +1 to +45, +1 to +50, +1 to +55, +1 to +100, +1 to +105, +1 to +110, +1 to +115, +1 to +120, +1 to +125, +1 to +130, +1 to +135, +1 to +140, +1 to +145, +1 to +150, +1 to +155, +1 to +160, +1 to +165, +1 to +170, +1 to +175, +1 to +180, +1 to +185, +1 to +190, +1 to +195, or +1 to +200, from the nick site.
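As a hedged illustration of the editing-window language above, the following sketch converts an absolute edit position into a position relative to the nick site and tests it against a window such as +1 to +30. The numbering convention (no position 0; +1 is the first nucleotide 3′ of the nick, -1 the first nucleotide 5′ of it), the function names, and the default window are assumptions made for this example only.

```python
def relative_to_nick(edit_index: int, nick_index: int) -> int:
    """Map a 0-based edit index to a signed position relative to the nick.

    nick_index is the 0-based index of the first base 3' of the nick.
    Assumed convention: that base is +1, the base just 5' of the nick is -1,
    and there is no position 0.
    """
    offset = edit_index - nick_index
    return offset + 1 if offset >= 0 else offset


def in_editing_window(edit_index: int, nick_index: int,
                      start: int = 1, end: int = 30) -> bool:
    """Return True if the edit lies within [start, end] relative to the nick."""
    return start <= relative_to_nick(edit_index, nick_index) <= end
```

A symmetric window such as -10 to +10 can be checked by passing `start=-10, end=10`.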
  • the extended guide RNAs are modified versions of an extended guide RNA.
  • pegRNAs (i.e., extended guide RNAs) and ngRNAs may be expressed from an encoding nucleic acid, or synthesized chemically. Methods are well known in the art for obtaining or otherwise synthesizing guide RNAs, and for determining the appropriate sequence of the pegRNA, including the protospacer sequence, which interacts and hybridizes with the target strand of a genomic target site of interest.
  • the particular design aspects of a pegRNA sequence and ngRNA sequence will depend upon the nucleotide sequence of a genomic target site of interest (i.e., the desired site to be edited) and the type of napDNAbp (e.g., Cas9 protein) present in the prime editing systems utilized in the methods and compositions described herein, among other factors, such as PAM sequence locations, percent G/C content in the target sequence, the degree of microhomology regions, secondary structures, etc.
  • a spacer sequence (i.e., a guide sequence) of a pegRNA or ngRNA can be any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of a napDNAbp (e.g., a Cas9, Cas9 homolog, or Cas9 variant) to the target sequence.
  • the degree of complementarity between a spacer and its corresponding target sequence, when optimally aligned using a suitable alignment algorithm, is about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 95%, 97.5%, 99%, or more.
  • Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net).
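The degree of complementarity described above can be sketched for the simplest case. The example below assumes an ungapped, equal-length comparison between an RNA spacer and the target DNA strand, which is a simplified stand-in for the optimal alignment produced by the algorithms listed above. The function name and inputs are illustrative assumptions.

```python
# Watson-Crick pairs between an RNA spacer base and a DNA target base.
PAIRS = {("A", "T"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(spacer_rna: str, target_dna_5to3: str) -> float:
    """Percent of spacer positions that pair with the target strand.

    Both sequences are written 5'->3'. The target is reversed so that the
    two strands are compared in antiparallel orientation. No gaps are
    allowed in this simplified sketch.
    """
    spacer = spacer_rna.upper()
    target = target_dna_5to3.upper()[::-1]
    if len(spacer) != len(target):
        raise ValueError("this sketch only compares equal-length sequences")
    paired = sum((s, t) in PAIRS for s, t in zip(spacer, target))
    return 100.0 * paired / len(spacer)
```

A result at or above a chosen threshold (for example 90%) would correspond to one of the complementarity levels recited above; gapped alignment would require one of the named alignment algorithms instead.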
  • a spacer is about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length. In some embodiments, a spacer is less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length. The ability of a spacer to direct sequence-specific binding of a prime editor to a target sequence may be assessed by any suitable assay.
  • the components of a prime editor including the spacer to be tested, may be provided to a host cell having the corresponding target sequence, such as by transfection with vectors encoding the components of a prime editor disclosed herein, followed by an assessment of preferential cleavage within the target sequence, such as by Surveyor assay as described herein.
  • cleavage of a target polynucleotide sequence may be evaluated in a test tube by providing the target sequence, components of a prime editor, including the spacer to be tested and a control spacer different from the test spacer, and comparing binding or rate of cleavage at the target sequence between the test and control spacer reactions.
  • Other assays are possible, and will occur to those skilled in the art.
  • a spacer may be selected to target any target sequence.
  • the target sequence is a sequence within a genome of a cell.
  • Exemplary target sequences include those that are unique in the target genome.
  • a unique target sequence in a genome may include a Cas9 target site of the form MMMMMMMMNNNNNNNNNNNNXGG where NNNNNNNNNNXGG (N is A, G, T, or C; and X can be anything).
  • a unique target sequence in a genome may include an S.
  • a unique target sequence in a genome may include a Cas9 target site of the form MMMMMMMMNNNNNNNNNNNNXXAGAAW where NNNNNNNNNNXXAGAAW (N is A, G, T, or C; X can be anything; and W is A or T).
  • a unique target sequence in a genome may include an S.
  • a unique target sequence in a genome may include a Cas9 target site of the form MMMMMMMMNNNNNNNNNNNNXGGXG where NNNNNNNNNNXGGXG (N is A, G, T, or C; and X can be anything).
  • a unique target sequence in a genome may include an S. pyogenes Cas9 target site of the form MMMMMMMMMNNNNNNNNNNNNNXGGXG where NNNNNNNNNXGGXG (N is A, G, T, or C; and X can be anything).
  • M may be A, G, T, or C, and need not be considered in identifying a sequence as unique.
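A minimal sketch of how target sites of the form N…NXGG might be screened for uniqueness is given below. It assumes that "unique" means the seed followed by XGG occurs only once in the searched sequence, that only the supplied strand is scanned, and that the M positions are ignored as stated above. The function name, the default seed length of 12, and the toy sequence are assumptions for illustration.

```python
import re
from collections import Counter


def unique_seed_sites(genome: str, seed_len: int = 12) -> list[tuple[int, str]]:
    """Return (start_index, seed) for seeds followed by XGG that occur once.

    Only the given strand is scanned; a fuller search would also scan the
    reverse complement.
    """
    seq = genome.upper()
    # Lookahead keeps overlapping candidate sites; group 1 captures the seed.
    pattern = re.compile(rf"(?=([ACGT]{{{seed_len}}})[ACGT]GG)")
    hits = [(m.start(), m.group(1)) for m in pattern.finditer(seq)]
    counts = Counter(seed for _, seed in hits)
    return [(pos, seed) for pos, seed in hits if counts[seed] == 1]


# Toy sequence with a short seed so the result is easy to follow by eye.
toy = "ACGTAGGTTACGTCGGTTCATTGG"
unique = unique_seed_sites(toy, seed_len=4)
```

In the toy sequence the seed ACGT precedes a PAM twice and is therefore discarded, while TCAT precedes a PAM once and is retained.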
  • a target sequence is in the HTT gene.
  • CAG repeats in the HTT gene (e.g., at the 5′ end of the HTT gene) are targeted.
  • a target sequence is in the FXN gene.
  • GAA repeats in the FXN gene are targeted.
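To make the notion of a targeted triplet repeat concrete, the following sketch locates the longest uninterrupted tract of a repeat unit such as CAG or GAA in a DNA sequence. The function name and the short test sequences are hypothetical; they are not HTT or FXN sequences.

```python
import re


def longest_repeat_tract(seq: str, unit: str) -> tuple[int, int]:
    """Return (start_index, repeat_count) of the longest uninterrupted tract.

    Returns (-1, 0) if the unit does not occur. Indices are 0-based.
    """
    unit = unit.upper()
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    best = (-1, 0)
    for match in pattern.finditer(seq.upper()):
        count = (match.end() - match.start()) // len(unit)
        if count > best[1]:
            best = (match.start(), count)
    return best


# Hypothetical fragment: three CAG units, an interrupting CAA, then one CAG.
tract = longest_repeat_tract("ATGCAGCAGCAGCAACAGCCG", "CAG")
```

Interrupted tracts are reported separately in this sketch, so a CAA interruption ends the tract that precedes it.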
  • a spacer is selected to reduce the degree of secondary structure within the spacer. Secondary structure may be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimal Gibbs free energy. An example of one such algorithm is mFold, as described by Zuker and Stiegler (Nucleic Acids Res. 9 (1981), 133-148). Another example folding algorithm is the online webserver RNAfold, developed at the Institute for Theoretical Chemistry at the University of Vienna, using the centroid structure prediction algorithm (see, e.g., A. R.
  • the scaffold or gRNA core portion of a pegRNA comprises sequences corresponding to the tracr sequence and tracr mate sequence of a traditional guide RNA.
  • a tracr mate sequence includes any sequence that has sufficient complementarity with a tracr sequence to promote one or more of: (1) excision of a spacer flanked by tracr mate sequences in a cell containing the corresponding tracr sequence; and (2) formation of a complex at a target sequence, wherein the complex comprises the tracr mate sequence hybridized to the tracr sequence.
  • degree of complementarity is with reference to the optimal alignment of the tracr mate sequence and tracr sequence, along the length of the shorter of the two sequences.
  • Optimal alignment may be determined by any suitable alignment algorithm, and may further account for secondary structures, such as self-complementarity within either the tracr sequence or tracr mate sequence.
  • the degree of complementarity between the tracr sequence and tracr mate sequence along the length of the shorter of the two when optimally aligned is about or more than about 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 97.5%, 99%, or higher.
  • the tracr sequence is about or more than about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, or more nucleotides in length.
  • the tracr sequence and tracr mate sequence are contained within a single transcript, such that hybridization between the two produces a transcript having a secondary structure, such as a hairpin.
  • Preferred loop forming sequences for use in hairpin structures are four nucleotides in length, and most preferably have the sequence GAAA. However, longer or shorter loop sequences may be used, as may alternative sequences.
  • the sequences preferably include a nucleotide triplet (for example, AAA), and an additional nucleotide (for example C or G). Examples of loop forming sequences include CAAA and AAAG.
  • the transcript or transcribed polynucleotide sequence has at least two or more hairpins. In preferred embodiments, the transcript has two, three, four or five hairpins. In a further embodiment of the invention, the transcript has at most five hairpins.
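The hairpin description above can be sketched as a stem, a four-nucleotide loop, and the reverse complement of the stem. The example below uses the GAAA loop named above as the default; the function names and the stem sequence are illustrative assumptions, and no claim is made that the resulting sequence folds as intended in a cell.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence (A/U/G/C only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def make_hairpin(stem: str, loop: str = "GAAA") -> str:
    """Build 5'-stem + loop + reverse-complement(stem)-3'."""
    stem = stem.upper()
    return stem + loop.upper() + reverse_complement_rna(stem)


# Hypothetical 4-nt stem with the default GAAA loop, and with the CAAA loop.
hairpin = make_hairpin("GCAU")
hairpin_alt = make_hairpin("GCAU", loop="CAAA")
```

Alternative loops such as CAAA or AAAG can be supplied through the `loop` argument.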
  • the single transcript further includes a transcription termination sequence; preferably this is a polyT sequence, for example six T nucleotides.
  • single polynucleotides comprising a spacer, a tracr mate sequence, and a tracr sequence are as follows (listed 5′ to 3′), where “N” represents a base of a spacer, the first block of lower case letters represent the tracr mate sequence, and the second block of lower case letters represent the tracr sequence, and the final poly-T sequence represents the transcription terminator: (1)NNNNNNNNGTTTTTGTACTCTCAAGATTTAGAAATAAATCTTGCAGAAGCTACA AAGATAAGGCTTCATGCCGAAATCAACACCCTGTCATTTTATGGCAGGGTGTTTTC GTTATTTAATTTTTT (SEQ ID NO:
  • sequences (1) to (3) are used in combination with Cas9 from S. thermophilus CRISPR1.
  • sequences (4) to (6) are used in combination with Cas9 from S. pyogenes.
  • the tracr sequence is a separate transcript from a transcript comprising the tracr mate sequence.
  • a guide RNA typically comprises a tracrRNA framework allowing for Cas9 binding, and a spacer, which confers sequence specificity to the Cas9:nucleic acid editing enzyme/domain fusion protein.
  • a pegRNA comprises a structure 5′-[spacer]-GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUU (SEQ ID NO: 143)-extension arm-3′, wherein the spacer comprises a sequence that is complementary to the target sequence.
  • the spacer also referred to herein as the spacer sequence, is typically 20 nucleotides long.
  • RNA sequences typically comprise spacers that are complementary to a nucleic acid sequence within 50 nucleotides upstream or downstream of the target nucleotide to be edited.
  • Some exemplary guide RNA sequences suitable for targeting any of the provided fusion proteins to specific target sequences are provided herein. Additional spacers are well known in the art and can be used with the prime editors utilized in the methods and compositions described herein.
  • a PEgRNA comprises three main component elements ordered in the 5′ to 3′ direction, namely: a spacer, a gRNA core, and an extension arm at the 3′ end.
  • the extension arm may further be divided into the following structural elements in the 5′ to 3′ direction, namely: an edit template, a homology arm, and a primer binding site. In some embodiments, the extension arm may further be divided into the following structural elements in the 5′ to 3′ direction, namely: a homology arm, an edit template, and a primer binding site. In some embodiments, the extension arm may further be divided into the following structural elements in the 5′ to 3′ direction, namely: a DNA synthesis template (e.g., a RT template), and a primer binding site.
  • the PEgRNA may comprise an optional 3′ end modifier region and an optional 5′ end modifier region.
  • the PEgRNA may comprise a transcriptional termination signal at the 3′ end of the PEgRNA.
  • These structural elements are further defined herein. The depiction of the structure of the PEgRNA is not meant to be limiting and embraces variations in the arrangement of the elements. For example, the optional sequence modifiers could be positioned within or between any of the other regions shown, and are not limited to being located at the 3′ and 5′ ends.
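The 5′ to 3′ ordering described above (spacer, gRNA core, extension arm; with the extension arm comprising a DNA synthesis template followed by a primer binding site) can be sketched as a small data structure. The scaffold below is the sequence given above as SEQ ID NO: 143; the class name, field names, and the placeholder spacer, template, and primer binding site sequences are assumptions for illustration and do not correspond to any pegRNA of the disclosure.

```python
from dataclasses import dataclass

# gRNA core as recited above (SEQ ID NO: 143).
SCAFFOLD = (
    "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAAGGCUAGUCCGUUAUCAACU"
    "UGAAAAAGUGGCACCGAGUCGGUGCUUUUU"
)


@dataclass(frozen=True)
class PegRNA:
    spacer: str        # guide sequence, 5' end
    rt_template: str   # DNA synthesis (RT) template, 5' part of the extension arm
    pbs: str           # primer binding site, 3' part of the extension arm
    scaffold: str = SCAFFOLD

    @property
    def extension_arm(self) -> str:
        """Extension arm in 5'->3' order: RT template, then PBS."""
        return self.rt_template + self.pbs

    def sequence(self) -> str:
        """Full pegRNA in 5'->3' order: spacer, gRNA core, extension arm."""
        return self.spacer + self.scaffold + self.extension_arm


# Placeholder sequences only.
peg = PegRNA(
    spacer="GACUGACUGACUGACUGACU",
    rt_template="AUGCAUGCAUGC",
    pbs="CCGGAAUUCCGGA",
)
full_length = len(peg.sequence())
```

Optional 5′ or 3′ modifier regions and a termination signal are omitted here; they could be added as further optional fields under the same ordering.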
  • PEgRNA modifications
  • the PEgRNAs may also include additional design modifications that may alter the properties and/or characteristics of PEgRNAs, thereby improving the efficacy of prime editing.
  • these modifications may belong to one or more of a number of different categories, including but not limited to: (1) designs to enable efficient expression of functional PEgRNAs from non-polymerase III (pol III) promoters, which would enable the expression of longer PEgRNAs without burdensome sequence requirements; (2) modifications to the core, Cas9-binding PEgRNA scaffold, which could improve efficacy; (3) modifications to the PEgRNA to improve RT processivity, allowing the insertion of longer sequences at targeted genomic loci; and (4) addition of RNA motifs to the 5′ or 3′ termini of the PEgRNA that improve PEgRNA stability, enhance RT processivity, prevent misfolding of the PEgRNA, or recruit additional factors important for genome editing.
  • PEgRNA could be designed with pol III promoters to improve the expression of longer-length PEgRNA with larger extension arms.
  • sgRNAs are typically expressed from the U6 snRNA promoter. This promoter recruits pol III to express the associated RNA and is useful for expression of short RNAs that are retained within the nucleus.
  • pol III is not highly processive and is unable to express RNAs longer than a few hundred nucleotides in length at the levels required for efficient genome editing. Additionally, pol III can stall or terminate at stretches of U’s, potentially limiting the sequence diversity that could be inserted using a PEgRNA.
  • promoters that recruit polymerase II (such as pCMV) or polymerase I (such as the U1 snRNA promoter) have been examined for their ability to express longer sgRNAs.
  • these promoters are typically partially transcribed, which would result in extra sequence 5′ of the spacer in the expressed PEgRNA, which has been shown to result in markedly reduced Cas9:sgRNA activity in a site-dependent manner.
  • while pol III-transcribed PEgRNAs can simply terminate in a run of 6-7 U’s, PEgRNAs transcribed from pol II or pol I would require a different termination signal. Such signals often also result in polyadenylation, which would result in undesired transport of the PEgRNA from the nucleus.
  • RNAs expressed from pol II promoters such as pCMV are typically 5′-capped, also resulting in their nuclear export.
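Because pol III can stall or terminate at stretches of U’s, a candidate PEgRNA sequence might be screened for such runs before a pol III promoter is chosen. The sketch below reports every run of at least `min_run` consecutive U’s; the default of 6 is taken from the 6-7 U terminator length mentioned above, and whether shorter runs matter in practice is an assumption left to the user. The function name is hypothetical.

```python
import re


def find_u_runs(rna: str, min_run: int = 6) -> list[tuple[int, int]]:
    """Return (start_index, length) for each run of >= min_run U's.

    T is treated as U so that DNA templates can be screened as well.
    Indices are 0-based.
    """
    seq = rna.upper().replace("T", "U")
    pattern = re.compile(rf"U{{{min_run},}}")
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(seq)]


# Hypothetical sequence with one 6-U run and one 3-U run.
runs_default = find_u_runs("ACGUUUUUUACGUUUACG")
runs_sensitive = find_u_runs("ACGUUUUUUACGUUUACG", min_run=3)
```

A run found inside the spacer or extension arm, rather than at the intended 3′ terminus, would flag a sequence that may terminate prematurely under pol III.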
  • the complexes and methods of the present disclosure utilize next-generation modified pegRNAs (also referred to herein as “engineered pegRNAs” or “epegRNAs”) with improved properties, including but not limited to, increased stability and cellular lifespan, and improved binding affinity for a napDNAbp.
  • the modified pegRNAs include a nucleic acid moiety at the 3′ end of the pegRNA.
  • the 3′ end of the pegRNA is fused to the nucleic acid moiety through a nucleotide linker.
  • it will be appreciated that a wide variety of nucleotide sequences will work reasonably well for each genomic target site. Linker length can also be variable. In some cases, linkers ranging in length from 3-18 nucleotides will work.
  • the linker may be at least 3 nucleotides, at least 4 nucleotides, at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, at least 8 nucleotides, at least 9 nucleotides, at least 10 nucleotides, at least 11 nucleotides, at least 12 nucleotides, at least 13 nucleotides, at least 14 nucleotides, at least 15 nucleotides, at least 16 nucleotides, at least 17 nucleotides, at least 18 nucleotides, at least 19 nucleotides, at least 20 nucleotides, at least 21 nucleotides, at least 22 nucleotides, at least 23 nucleotides, at least 24 nucleotides, at least 25 nucleotides, at least 26 nucleotides, at least 27 nucleotides, at least 28 nucleotides, at least 29 nucleotides, or at least
  • the nucleic acid moieties that may be used to modify a pegRNA, for example, by attaching it to the 3′ end of a pegRNA may include any nucleic acid moiety, including, for instance, a nucleic acid molecule comprising or which forms a double-helix moiety, toeloop moiety, hairpin moiety, stem-loop moiety, pseudoknot moiety, aptamer moiety, G-quadruplex moiety, tRNA moiety, or a ribozyme moiety.
  • the nucleic acid moiety may be characterized as forming a secondary nucleic acid structure, a tertiary nucleic acid structure, or a quadruple nucleic acid structure.
  • the nucleic acid moiety may form any two-dimensional or three-dimensional structure known to be formed by such structures.
  • the nucleic acid moiety may be DNA or RNA.
  • the following are specific examples of nucleotide motifs that may be appended to the terminus of the extension arm of a pegRNA.
  • the nucleotide motif would be coupled, attached, or otherwise linked to the 3′ of the pegRNA, optionally via a linker.
  • the nucleotide motif would be coupled, attached, or otherwise linked to the 5′ end of the pegRNA, optionally via a linker.
  • linkers include, but are not limited to:
  • a linker will be designed and/or selected based on the genomic site being targeted by prime editing and the modified pegRNA.
  • it will be appreciated that a wide variety of nucleotide sequences will work reasonably well for each genomic target site. Linker length is also likely to be variable. In some cases, linkers ranging in length from 3-18 nucleotides will work.
  • the linker may be at least 3 nucleotides, at least 4 nucleotides, at least 5 nucleotides, at least 6 nucleotides, at least 7 nucleotides, at least 8 nucleotides, at least 9 nucleotides, at least 10 nucleotides, at least 11 nucleotides, at least 12 nucleotides, at least 13 nucleotides, at least 14 nucleotides, at least 15 nucleotides, at least 16 nucleotides, at least 17 nucleotides, at least 18 nucleotides, at least 19 nucleotides, at least 20 nucleotides, at least 21 nucleotides, at least 22 nucleotides, at least 23 nucleotides, at least 24 nucleotides, at least 25 nucleotides, at least 26 nucleotides, at least 27 nucleotides, at least 28 nucleotides, at least 29 nucleotides, or at least
  • the linker is 8 nucleotides in length.
  • the present disclosure also contemplates variants of the above nucleotide motifs and linkers that have at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.9% sequence identity with any of the above motif and linker sequences.
  • the pegRNAs may also include additional design improvements that may modify the properties and/or characteristics of pegRNAs thereby improving the efficacy of prime editing.
  • these improvements may belong to one or more of a number of different categories, including, but not limited to: (1) designs to enable efficient expression of functional pegRNAs from non-polymerase III (pol III) promoters, which would enable the expression of longer pegRNAs without burdensome sequence requirements; (2) improvements to the core, Cas9-binding pegRNA scaffold, which could improve efficacy; (3) modifications to the pegRNA to improve RT processivity, allowing the insertion of longer sequences at targeted genomic loci; and (4) addition of RNA motifs to the 5′ or 3′ termini of the pegRNA that improve pegRNA stability, enhance RT processivity, prevent misfolding of the pegRNA, or recruit additional factors important for genome editing.
  • pegRNA could be designed with pol III promoters to improve the expression of longer-length pegRNA with larger extension arms.
  • sgRNAs are typically expressed from the U6 snRNA promoter. This promoter recruits pol III to express the associated RNA and is useful for expression of short RNAs that are retained within the nucleus.
  • pol III is not highly processive and is unable to express RNAs longer than a few hundred nucleotides in length at the levels required for efficient genome editing. Additionally, pol III can stall or terminate at stretches of U’s, potentially limiting the sequence diversity that could be inserted using a pegRNA.
  • promoters that recruit polymerase II (such as pCMV) or polymerase I (such as the U1 snRNA promoter) have been examined for their ability to express longer sgRNAs.
  • these promoters are typically partially transcribed, which would result in extra sequence 5′ of the spacer in the expressed pegRNA, which has been shown to result in markedly reduced Cas9:sgRNA activity in a site-dependent manner.
  • while pol III-transcribed pegRNAs can simply terminate in a run of 6-7 U’s, pegRNAs transcribed from pol II or pol I would require a different termination signal. Often such signals also result in polyadenylation, which would result in undesired transport of the pegRNA from the nucleus.
  • RNAs expressed from pol II promoters such as pCMV are typically 5′-capped, also resulting in their nuclear export.
  • Exemplary U6 promoters include, but are not limited to: the U6 promoter, the U6v9 promoter, the U6v7 promoter, and the U6v4 promoter.
  • any of the U6 promoters could be trimmed at the 5′ end by removing up to 1, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 nucleotides from the 5′ end, i.e., approximately 30% of the promoter length.
  • up to 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, or 30% of the length of the promoter may be trimmed from the 5′ end.
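The 5′ trimming described above can be sketched as a helper that removes a whole-number percentage of a promoter sequence from its 5′ end and rejects values above 30%. The function name, the integer-percent interface, and the placeholder sequence are assumptions for this example; the placeholder is not a U6 promoter sequence.

```python
def trim_promoter_5prime(promoter: str, percent: int) -> str:
    """Remove `percent` of the promoter length from the 5' end.

    percent must be a whole number from 0 to 30. Integer arithmetic is used
    so that the number of trimmed nucleotides never exceeds the requested
    percentage of the promoter length.
    """
    if not 0 <= percent <= 30:
        raise ValueError("percent must be between 0 and 30")
    n_trim = len(promoter) * percent // 100
    return promoter[n_trim:]


# Placeholder 100-nt sequence: trimming 30% removes the first 30 nucleotides.
placeholder = "A" * 30 + "C" * 70
trimmed = trim_promoter_5prime(placeholder, 30)
```

Trimming by an absolute number of nucleotides (e.g., up to 100) could be handled the same way by slicing with that count directly.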
  • the MALAT1 ncRNA and PAN ENEs form triple helices protecting the polyA-tail. These constructs could also enhance RNA stability. It is contemplated that these expression systems will also allow the expression of longer PEgRNAs.
  • a series of methods have been designed for the cleavage of the portion of the pol II promoter that would be transcribed as part of the PEgRNA, adding either a self-cleaving ribozyme such as the hammerhead, pistol, hatchet, hairpin, VS, twister, or twister sister ribozymes, or other self-cleaving elements to process the transcribed guide, or a hairpin that is recognized by Csy4 and also leads to processing of the guide.
  • the PEgRNA may include various above elements, as exemplified by the following sequences.
  • the PEgRNA may be improved by introducing modifications to the scaffold or core sequences.
  • the core, Cas9-binding PEgRNA scaffold can likely be improved to enhance PE activity.
  • the first pairing element of the scaffold contains a GTTTT-AAAAC (SEQ ID NO: 146) pairing element.
  • Such runs of Ts have been shown to result in pol III pausing and premature termination of the RNA transcript.
  • Rational mutation of one of the T-A pairs to a G-C pair in this portion of P1 has been shown to enhance sgRNA activity, suggesting this approach would also be feasible for PEgRNAs.
  • increasing the length of P1 has also been shown to enhance sgRNA folding and lead to improved activity, suggesting it as another avenue for the modification of PEgRNA activity.
  • Example modifications to the core can include: a PEgRNA containing a 6 nt extension to P1, and a PEgRNA containing a T-A to G-C mutation within P1.
  • the PEgRNA may be modified at the edit template region. As the size of the insertion templated by the PEgRNA increases, it is more likely to be degraded by endonucleases, undergo spontaneous hydrolysis, or fold into secondary structures unable to be reverse-transcribed by the RT, or that disrupt folding of the PEgRNA scaffold and subsequent Cas9-RT binding.
  • modification to the template of the PEgRNA might be necessary to affect large insertions, such as the insertion of whole genes.
  • Some strategies to do so include the incorporation of modified nucleotides within a synthetic or semi-synthetic PEgRNA that render the RNA more resistant to degradation or hydrolysis or less likely to adopt inhibitory secondary structures.
  • modifications could include 8-aza-7-deazaguanosine, which would reduce RNA secondary structure in G-rich sequences; locked-nucleic acids (LNA) that reduce degradation and enhance certain kinds of RNA secondary structure; 2′-O-methyl, 2′-fluoro, or 2′-O-methoxyethoxy modifications that enhance RNA stability.
  • modifications could also be included elsewhere in the PEgRNA to enhance stability and activity.
  • the template of the PEgRNA could be designed such that it is also more likely to adopt simple secondary structures that are able to allow processing by the RT. Such simple structures would act as a thermodynamic sink, making it less likely that more complicated structures that would prevent reverse transcription would occur.
  • a prime editor, e.g., an nCas9-RT fusion protein, would be used to initiate transcription, and also to recruit a separate template RNA to the targeted site via an RNA-binding protein fused to Cas9 or an RNA recognition element on the PEgRNA itself such as the MS2 aptamer.
  • the RT could either directly bind to this separate template RNA, or initiate reverse transcription on the original PEgRNA before swapping to the second template.
  • Such an approach could allow long insertions by both preventing misfolding of the PEgRNA upon addition of the long template, and also by not requiring dissociation of Cas9 from the genome for long insertions to occur, which could possibly inhibit PE-based long insertions.
  • the PEgRNA may be modified by introducing additional RNA motifs at the 5′ and 3′ termini of the PEgRNAs, or even at positions therein between (e.g., in the gRNA core region, or the spacer).
  • motifs such as the PAN ENE from KSHV and the ENE from MALAT1 were discussed above as possible means to terminate expression of longer PEgRNAs from non-pol III promoters. These elements form RNA triple helices that engulf the polyA tail, resulting in their being retained within the nucleus. However, by forming complex structures at the 3′ terminus of the PEgRNA that occlude the terminal nucleotide, these structures would also likely help prevent exonuclease-mediated degradation of PEgRNAs. [0273] Other structural elements inserted at the 3′ terminus could also enhance RNA stability, albeit without allowing for termination from non-pol III promoters.
  • Such motifs could include hairpins or RNA quadruplexes that would occlude the 3′ terminus, or self-cleaving ribozymes such as HDV that would result in the formation of a 2′-3′-cyclic phosphate at the 3′ terminus, and also potentially render the PEgRNA less likely to be degraded by exonucleases. Inducing the PEgRNA to cyclize via incomplete splicing (to form a ciRNA) could also increase PEgRNA stability and result in the PEgRNA being retained within the nucleus. [0274] Additional RNA motifs could also improve RT processivity or enhance PEgRNA activity by enhancing RT binding to the DNA-RNA duplex.
  • Addition of the native sequence bound by the RT in its cognate retroviral genome could enhance RT activity. This could include the native primer binding site (PBS), polypurine tract (PPT), or kissing loops involved in retroviral genome dimerization and initiation of transcription.
  • Dimerization motifs such as kissing loops or a GNRA tetraloop/tetraloop receptor pair, at the 5′ and 3′ termini of the PEgRNA could also result in effective circularization of the PEgRNA, improving stability. Additionally, it is envisioned that addition of these motifs could allow the physical separation of the PEgRNA spacer and primer, preventing occlusion of the spacer, which would hinder PE activity.
  • Short 5′ extensions or 3′ extensions to the PEgRNA that form a small toehold hairpin in the spacer region or along the primer binding site could also compete favorably against the annealing of intracomplementary regions along the length of the PEgRNA, e.g., the interaction between the spacer and the primer binding site that can occur.
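The intracomplementarity between the spacer and the primer binding site mentioned above can be estimated with a simple in silico check. The following is a minimal sketch that reports the longest contiguous stretch of one RNA segment able to form Watson-Crick pairs with another segment. It ignores G-U wobble pairs and folding energetics, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def longest_complementary_stretch(a: str, b: str) -> int:
    """Length of the longest contiguous run of `a` that can pair with `b`.

    Pairing is antiparallel Watson-Crick only, so the problem reduces to the
    longest common substring of `a` and the reverse complement of `b`.
    """
    target = reverse_complement(b)
    best = 0
    prev = [0] * (len(target) + 1)
    for base_a in a:
        cur = [0] * (len(target) + 1)
        for j, base_t in enumerate(target, start=1):
            if base_a == base_t:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best
```

A dedicated RNA folding tool would give a fuller picture; this check only flags segments that are candidates for unwanted annealing.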
  • kissing loops could also be used to recruit other template RNAs to the genomic site and enable swapping of RT activity from one RNA to the other.
  • a number of secondary RNA structures may be engineered into any region of the PEgRNA, including in the terminal portions of the extension arm (i.e., e1 and e2), as shown.
  • Example modifications include, but are not limited to: [0277] PEgRNA-HDV fusion [0278] PEgRNA-MMLV kissing loop [0279] PEgRNA-VS ribozyme kissing loop [0280] PEgRNA-GNRA tetraloop/tetraloop receptor [0281] PEgRNA template switching secondary RNA-HDV fusion [0282] PEgRNA scaffolds can be further improved via directed evolution, in an analogous fashion to how SpCas9 and prime editors (PE) have been improved. Directed evolution can enhance PEgRNA recognition by Cas9 or evolved Cas9 variants.
  • different PEgRNA scaffold sequences are likely optimal at different genomic loci, either enhancing PE activity at the site in question, reducing off-target activities, or both.
  • evolution of PEgRNA scaffolds to which other RNA motifs have been added would almost certainly improve the activity of the fused PEgRNA relative to the unevolved, fusion RNA.
  • evolution of allosteric ribozymes composed of c-di-GMP-I aptamers and hammerhead ribozymes led to dramatically improved activity, suggesting that evolution would improve the activity of hammerhead-PEgRNA fusions as well.
  • scaffolds that have been shown to improve activity relative to canonical sgRNA scaffolds may be used in pegRNAs and epegRNAs as described herein. Such improvements may include, for example, those disclosed in Chen, B. et al., Dynamic Imaging of Genomic Loci in Living Human Cells by an Optimized CRISPR/Cas System. Cell 2013, 155(7), 1479-1491 and Jost, M.
  • Exemplary epegRNAs incorporating improved sgRNA scaffolds include, but are not limited to: [0285] HEK31-15del standard scaffold evopreQ1 [0286] HEK31-15del cr748 evopreQ1 [0287] HEK31-15del cr289 evopreQ1 [0288] HEK31-15del cr622 evopreQ1 [0289] HEK31-15del cr772 evopreQ1 [0290] HEK31-15del cr532 evopreQ1 [0291] HEK31-15del cr961 evopreQ1 [0292] HEK31-15del flip and extension scaffold evopreQ1 [0293] RNF21-15del cr748 evopreQ1 [0294] RNF21-15del cr289 evopreQ1 [0295] RNF21-15del cr622 evopreQ1 [0296] RNF21-15del cr2
  • consecutive series of T’s may limit the capacity of the PEgRNA to be transcribed.
  • strings of at least three consecutive T’s, at least four consecutive T’s, at least five consecutive T’s, at least six consecutive T’s, at least seven consecutive T’s, at least eight consecutive T’s, at least nine consecutive T’s, at least ten consecutive T’s, at least eleven consecutive T’s, at least twelve consecutive T’s, at least thirteen consecutive T’s, at least fourteen consecutive T’s, or at least fifteen consecutive T’s should be avoided when designing the PEgRNA, or should be at least removed from the final designed sequence.
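A designed PEgRNA-encoding sequence can be screened for consecutive T's with a short script. The sketch below is illustrative only: the function name `find_t_runs` is hypothetical, and the threshold is left as a parameter because the passage above lists run lengths from three to fifteen.

```python
import re


def find_t_runs(dna: str, min_len: int = 3) -> list[tuple[int, int]]:
    """Return (start, length) for every run of at least `min_len` T's.

    `start` is the 0-based index of the first T in the run. Each run is
    reported once at its full length.
    """
    pattern = re.compile(r"T{%d,}" % min_len)
    return [(m.start(), len(m.group())) for m in pattern.finditer(dna.upper())]
```

For example, calling `find_t_runs(design, min_len=4)` on a candidate design returns an empty list when no run of four or more T's is present.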
  • Methods of Treatment and Uses [0412] In some aspects, the present disclosure provides methods of treating trinucleotide repeat disorders, including Huntington’s disease and Friedreich’s ataxia, using prime editing. [0413] In one aspect, the present disclosure provides methods of treating Huntington’s disease by prime editing comprising contacting a target nucleotide sequence with any of the pegRNA-prime editor complexes or compositions disclosed herein. In some embodiments, the contacting results in the contraction or replacement of a CAG repeat sequence in the target nucleotide sequence.
  • the CAG repeat sequence is contracted from greater than 35 repeats to 35 or fewer repeats (e.g., 35 repeats, 34 repeats, 33 repeats, 32 repeats, 31 repeats, 30 repeats, 29 repeats, 28 repeats, 27 repeats, 26 repeats, 25 repeats, 24 repeats, 23 repeats, 22 repeats, 21 repeats, 20 repeats, 19 repeats, 18 repeats, 17 repeats, 16 repeats, 15 repeats, 14 repeats, 13 repeats, 12 repeats, 11 repeats, 10 repeats, 9 repeats, 8 repeats, 7 repeats, 6 repeats, 5 repeats, 4 repeats, or 3 repeats).
  • the CAG repeat sequence is greater than 35 repeats in length and is replaced with a CAG repeat sequence of 35 or fewer repeats in length (e.g., 35 or fewer repeats, 34 or fewer repeats, 33 or fewer repeats, 32 or fewer repeats, 31 or fewer repeats, 30 or fewer repeats, 29 or fewer repeats, 28 or fewer repeats, 27 or fewer repeats, 26 or fewer repeats, 25 or fewer repeats, 24 or fewer repeats, 23 or fewer repeats, 22 or fewer repeats, 21 or fewer repeats, 20 or fewer repeats, 19 or fewer repeats, 18 or fewer repeats, 17 or fewer repeats, 16 or fewer repeats, 15 or fewer repeats, 14 or fewer repeats, 13 or fewer repeats, 12 or fewer repeats, 11 or fewer repeats, 10 or fewer repeats, 9 or fewer repeats, 8 or fewer repeats, 7 or fewer repeats, 6 or fewer repeats, 5 or
  • the CAG repeat sequence is contracted to four repeats or replaced with a sequence comprising four CAG repeats.
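The repeat counts discussed above can be made concrete with a small string-level sketch. It locates the longest uninterrupted CAG tract in a sequence and rewrites it to a target number of repeats. It models only the sequence outcome, not the prime editing mechanism, and the function names and the toy flanking sequence in the usage note are hypothetical.

```python
import re


def longest_repeat_tract(seq: str, unit: str = "CAG") -> tuple[int, int]:
    """Return (start, n_repeats) of the longest uninterrupted tract of `unit`.

    Returns (-1, 0) when the unit does not occur.
    """
    best = (-1, 0)
    for m in re.finditer(f"(?:{re.escape(unit)})+", seq):
        n = len(m.group()) // len(unit)
        if n > best[1]:
            best = (m.start(), n)
    return best


def contract_tract(seq: str, target_repeats: int, unit: str = "CAG") -> str:
    """Rewrite the longest tract of `unit` to `target_repeats` copies.

    The sequence is returned unchanged when the tract is already at or
    below the target length.
    """
    start, n = longest_repeat_tract(seq, unit)
    if n <= target_repeats:
        return seq
    end = start + n * len(unit)
    return seq[:start] + unit * target_repeats + seq[end:]
```

With a toy allele such as `"ATG" + "CAG" * 40 + "CCGCCA"`, `contract_tract(allele, 4)` yields the same flanks around four CAG repeats.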
  • the methods further comprise nicking the non-PAM-containing strand of the target nucleotide sequence using a nicking gRNA.
  • the spacer of the nicking gRNA comprises the nucleotide sequence: GGCGGCTGAGGAAGCTGAGG (SEQ ID NO: 387); GGCGGCGGCGGCGGCGGTGG (SEQ ID NO: 385); GCTGCTGCTGCTGCTGCTGC (SEQ ID NO: 384); GAAGGACTTGAGGGACTCGA (SEQ ID NO: 393); GTGAGGAAGCTGAGGAGGCGG (SEQ ID NO: 394); GCTGTTGCTGCTGCTGCTGC (SEQ ID NO: 395); GCTGCTGCTGCTGCTGCTGGA (SEQ ID NO: 383); GCTGCTGCTGGAAGGACTTG (SEQ ID NO: 382); GGCCTTCATCAGCTTTTCC (SEQ ID NO: 396); GGCTTTCATCAGCTTTTCC (SEQ ID NO: 397); GGGACTCGAAGGCCTTCAT (SEQ ID NO: 398); GGAAGGACTTGAGGGACTCG (SEQ ID NO: 387);
  • the spacer of the nicking gRNA comprises the nucleotide sequence GCTGCTGCTGCTGCTGCTGC (SEQ ID NO: 384), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 384.
  • the spacer of the nicking gRNA comprises the nucleotide sequence GCTGCTGGAAGGACTTGAG (SEQ ID NO: 406), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 406.
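The percent identity thresholds recited for the nicking gRNA spacers can be illustrated with a simple calculation. This section does not state how identity is computed, so the sketch below assumes an ungapped, position-by-position comparison of two equal-length sequences; alignment-based tools are typically used when lengths differ. The function name is hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# SEQ ID NO: 384 compared with SEQ ID NO: 395 as listed above.
# The two 20-nt spacers differ at a single position.
identity = percent_identity("GCTGCTGCTGCTGCTGCTGC", "GCTGTTGCTGCTGCTGCTGC")
```

Under this assumption the two listed spacers are 95% identical, which would satisfy the "at least 95%" threshold but not "at least 96%".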
  • the contacting is performed in a cell.
  • the cell is a eukaryotic cell.
  • the cell is a human cell.
  • the cell is in vitro.
  • the cell is ex vivo.
  • the cell is in a subject.
  • the subject is a human.
  • the present disclosure provides methods of treating Friedreich’s ataxia by prime editing comprising contacting a target nucleotide sequence with any of the complexes provided herein. In some embodiments, the contacting results in the deletion of a GAA repeat sequence in the target nucleotide sequence.
  • the GAA repeat sequence deleted comprises greater than about 50 GAA repeats, greater than about 51 GAA repeats, greater than about 52 GAA repeats, greater than about 53 GAA repeats, greater than about 54 GAA repeats, greater than about 55 GAA repeats, greater than about 56 GAA repeats, greater than about 57 GAA repeats, greater than about 58 GAA repeats, greater than about 59 GAA repeats, greater than about 60 GAA repeats, greater than about 61 GAA repeats, greater than about 62 GAA repeats, greater than about 63 GAA repeats, greater than about 64 GAA repeats, greater than about 65 GAA repeats, greater than about 66 GAA repeats, greater than about 67 GAA repeats, greater than about 68 GAA repeats, greater than about 69 GAA repeats, greater than about 70 GAA repeats, greater than about 71 GAA repeats, greater than about 72 GAA repeats, greater than about 73 GAA repeats, greater than about 74 GAA repeats, greater than about 75 GAA
  • the GAA repeat sequence deleted comprises greater than 65 GAA repeats.
  • the methods further comprise nicking the non-PAM-containing strand of the target nucleotide sequence using a nicking gRNA.
  • the spacer of the nicking gRNA comprises the nucleotide sequence: GTCCCAAAGTGCTGAGATTAT (SEQ ID NO: 410); GTGTATTTTTTAGTAGATACT (SEQ ID NO: 411); GATTCTCCTGCCGCAGCCTC (SEQ ID NO: 412); or GCGACACCACGCCCGGCTAAC (SEQ ID NO: 413), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any of these.
  • the spacer of the nicking gRNA comprises the nucleotide sequence GTGTATTTTTTAGTAGATACT (SEQ ID NO: 411), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 411.
  • the spacer of the nicking gRNA comprises the nucleotide sequence GCGACACCACGCCCGGCTAAC (SEQ ID NO: 413), or a nucleotide sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 413.
  • the contacting is performed in a cell.
  • the cell is a eukaryotic cell.
  • the cell is a human cell.
  • the cell is a central nervous system cell.
  • the cell is in vitro.
  • the cell is ex vivo.
  • the cell is in a subject. In certain embodiments, the subject is a human. In certain embodiments, the central nervous system is targeted. [0419] In some embodiments, one or more polynucleotides encoding any of the complexes provided herein are delivered to a cell. In certain embodiments, two polynucleotides encoding any of the complexes provided herein are delivered to a cell. In certain embodiments, the two polynucleotides comprise two halves of a prime editor comprising a split intein capable of reassembling into a prime editor molecule.
  • the one or more polynucleotides encoding a complex provided herein are delivered to a cell in one or more adeno-associated virus (AAV) particles. Delivery of prime editor complexes has been described, for example, in U.S. Provisional Application, U.S.S.N. 63/426,336, filed November 17, 2022, U.S. Provisional Application, U.S.S.N. 63/491,013, filed March 17, 2023, and Davis, J. R., et al. Nat. Biotechnol. 2023, each of which is incorporated herein by reference. In certain embodiments, the one or more polynucleotides encoding the complex are delivered to the cell in two AAV particles.
  • one or both of the AAV particles comprise AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, or AAV9. In certain embodiments, one or both of the AAV particles comprise AAV9. In some embodiments, the AAV particles target central nervous system cells. [0420] In some embodiments, a first and second AAV particle are delivered to a cell. In certain embodiments, the first AAV particle comprises a polynucleotide comprising the structure 5′-[inverted terminal repeat (ITR) sequence]-[promoter]-[napDNAbp N-terminal fragment]-[N-intein]-[terminator sequence]-[ITR sequence]-3′.
  • the second AAV particle comprises a polynucleotide comprising the structure 5′-[ITR sequence]-[promoter]-[C-intein]-[napDNAbp C-terminal fragment]-[reverse transcriptase]-[terminator sequence]-[optional nicking gRNA]-[pegRNA]-[ITR]-3′.
  • the split site within the napDNAbp is between amino acid residues 844 and 845 of a Cas9 (e.g., a Cas9 protein of SEQ ID NO: 6).
  • the split site within the napDNAbp is between amino acid residues 1024 and 1025 of a Cas9 protein (e.g., a Cas9 protein of SEQ ID NO: 6).
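The two-particle layout and the split sites described above can be summarized in a short sketch. The element names follow the structures recited above. The helper names `layout` and `split_protein` are hypothetical, and any sequence passed to them in this illustration is a placeholder rather than SEQ ID NO: 6.

```python
# Element order of each AAV genome, 5' to 3', as recited above.
FIRST_AAV = [
    "ITR sequence", "promoter", "napDNAbp N-terminal fragment",
    "N-intein", "terminator sequence", "ITR sequence",
]
SECOND_AAV = [
    "ITR sequence", "promoter", "C-intein", "napDNAbp C-terminal fragment",
    "reverse transcriptase", "terminator sequence",
    "optional nicking gRNA", "pegRNA", "ITR",
]


def layout(elements: list[str]) -> str:
    """Render an element list in the bracketed 5'->3' notation."""
    return "5'-" + "-".join(f"[{e}]" for e in elements) + "-3'"


def split_protein(seq: str, last_n_residue: int) -> tuple[str, str]:
    """Split a protein between residues `last_n_residue` and the next one.

    Residue numbering is 1-based, so a split "between 844 and 845"
    corresponds to last_n_residue=844.
    """
    if not 0 < last_n_residue < len(seq):
        raise ValueError("split site must fall inside the sequence")
    return seq[:last_n_residue], seq[last_n_residue:]
```

For a split between residues 844 and 845, `split_protein(cas9, 844)` returns an N-terminal fragment of 844 residues and the remaining C-terminal fragment; the 1024/1025 split is obtained the same way with 1024.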
  • the prime editors utilized in the complexes and methods described herein comprise a nucleic acid programmable DNA binding protein (napDNAbp).
  • prime editors may include a napDNAbp domain having a wild type Cas9 sequence, including, for example the canonical Streptococcus pyogenes Cas9 sequence of SEQ ID NO: 6, shown as follows.
  • the prime editors may include a napDNAbp domain having a modified Cas9 sequence, including, for example the nickase variant of Streptococcus pyogenes Cas9 of SEQ ID NO: 7 having an H840A substitution relative to the wild type SpCas9 (of SEQ ID NO: 6), shown as follows: [0424]
  • the prime editors described herein may include any of the modified Cas9 sequences described above, or any variant thereof having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity thereto.
  • the prime editors used in the methods described herein include any of the following other wild type SpCas9 sequences, which may be modified with one or more of the mutations described herein at corresponding amino acid positions:
  • the prime editors used in the methods described herein may include any of the above SpCas9 sequences, or any variant thereof having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity thereto.
  • the Cas9 protein can be a wild type Cas9 ortholog from another bacterial species different from the canonical Cas9 from S. pyogenes.
  • modified versions of the following Cas9 orthologs can be used in connection with the prime editors described in this specification by making mutations at positions corresponding to H840A or any other amino acids of interest in wild type SpCas9.
  • any variant Cas9 orthologs having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity to any of the below orthologs may also be used with the prime editors.
  • the napDNAbp used in the prime editors described herein comprises one or more amino acid mutations relative to a wild type napDNAbp, for example, a wild type Cas9 protein.
  • a Cas9 protein comprises an inactivating mutation in an HNH domain.
  • the prime editors described herein comprise a Cas9 protein comprising one or more amino acid substitutions relative to the amino acid sequence of SEQ ID NO: 6, or an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 6.
  • the Cas9 protein comprises a K775R substitution and/or a K918A substitution relative to the amino acid sequence of SEQ ID NO: 6, or an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 6 (or corresponding mutations in a homologous Cas9 protein).
  • the Cas9 protein comprises a K775R substitution and a K918A substitution relative to the amino acid sequence of SEQ ID NO: 6, or an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 6 (or corresponding mutations in a homologous Cas9 protein).
  • the Cas9 protein comprises a D23G substitution and/or an H754R substitution relative to the amino acid sequence of SEQ ID NO: 6, or an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 6 (or corresponding mutations in a homologous Cas9 protein).
  • the Cas9 protein comprises a D23G substitution and an H754R substitution relative to the amino acid sequence of SEQ ID NO: 6, or an amino acid sequence at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 6 (or corresponding mutations in a homologous Cas9 protein).
  • the napDNAbp used in the prime editors described herein may include any suitable homologs and/or orthologs of naturally occurring enzymes, such as Cas9. Cas9 homologs and/or orthologs have been described in various species, including, but not limited to, S. pyogenes and S. thermophilus.
  • the Cas moiety may be configured (e.g., mutagenized, recombinantly engineered, or otherwise obtained from nature) as a nickase, i.e., capable of cleaving only a single strand of the target double-stranded DNA.
  • a Cas9 nuclease has an inactive (e.g., an inactivated) DNA cleavage domain; that is, the Cas9 is a nickase.
  • the Cas9 protein comprises an amino acid sequence that is at least 85%, at least 90%, at least 92%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to the amino acid sequence of a Cas9 protein as provided by any one of the Cas9 orthologs in the above tables.
  • Cas9 nucleases and sequences include Cas9 sequences from the organisms and loci disclosed in Chylinski, Rhun, and Charpentier, “The tracrRNA and Cas9 families of type II CRISPR-Cas immunity systems” (2013) RNA Biology 10:5, 726-737; the entire contents of which are incorporated herein by reference.
  • Additional exemplary Cas variants and homologs include, but are not limited to, Cas9 (e.g., dCas9 and nCas9), Cpf1, CasX, CasY, C2c1, C2c2, C2c3, GeoCas9, CjCas9, Cas12a, Cas12b, Cas12g, Cas12h, Cas12i, Cas13b, Cas13c, Cas13d, Cas14, Csn2, xCas9, SpCas9-NG, Nme2Cas9, circularly permuted Cas9, Argonaute (Ago), Cas9-KKH, SmacCas9, Spy-macCas9, SpCas9-VRQR, SpCas9-NRRH, SpaCas9-NRTH, SpCas9-NRCH, LbCas12a, AsCas12a, CeCas12a, MbC
  • the prime editors used in the complexes and methods described herein comprise a reverse transcriptase domain.
  • the reverse transcriptase domain is a wild type MMLV reverse transcriptase.
  • the reverse transcriptase domain is a variant of wild type MMLV reverse transcriptase having the amino acid sequence of SEQ ID NO: 34 (e.g., at least 85%, at least 90%, at least 92%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% identical to the amino acid sequence of SEQ ID NO: 34).
  • PE2 and PEmax comprise a variant reverse transcriptase domain of SEQ ID NO: 34, which is based on the wild type MMLV reverse transcriptase domain of SEQ ID NO: 33 (and, in particular, a Genscript codon optimized MMLV reverse transcriptase having the nucleotide sequence of SEQ ID NO: 33), and which comprises amino acid substitutions D200N, T306K, W313F, T330P, and L603W relative to the wild type MMLV RT of SEQ ID NO: 33.
  • the amino acid sequence of the variant RT domain of PE2 and PEmax is SEQ ID NO: 34.
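The substitution notation used for the PE2 and PEmax RT domain (for example, D200N) can be applied programmatically. The sketch below parses each substitution, checks that the stated wild-type residue is present at the 1-based position, and replaces it. The function name and the toy sequence in the usage note are hypothetical; the actual wild-type sequence of SEQ ID NO: 33 is not reproduced here.

```python
import re

# The five substitutions recited above for the PE2/PEmax RT domain.
PE2_SUBSTITUTIONS = ["D200N", "T306K", "W313F", "T330P", "L603W"]

MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(wild_type: str, mutations: list[str]) -> str:
    """Apply substitutions such as 'D200N' to a protein sequence.

    Positions are 1-based. A ValueError is raised when the notation is
    malformed or the wild-type residue does not match the sequence.
    """
    residues = list(wild_type)
    for mutation in mutations:
        match = MUTATION.match(mutation)
        if match is None:
            raise ValueError(f"cannot parse mutation {mutation!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if pos < 1 or pos > len(residues) or residues[pos - 1] != old:
            raise ValueError(f"{mutation}: wild-type residue mismatch")
        residues[pos - 1] = new
    return "".join(residues)
```

On a toy sequence, `apply_substitutions("MDTWTL", ["D2N", "T3K"])` returns `"MNKWTL"`. Checking the wild-type residue before replacing it guards against numbering offsets, for example when a sequence includes or omits an initiator methionine.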
  • Prime editors may also comprise other variant RT domains as well.
  • the prime editors used in the methods and systems described herein can include a variant RT comprising one or more of the following mutations: P51L, S67K, E69K, L139P, T197A, D200N, H204R, F209N, E302K, E302R, T306K, F309N, W313F, T330P, L345G, L435G, N454K, D524G, E562Q, D583N, H594Q, L603W, E607K, or D653N in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence.
  • exemplary reverse transcriptases that can be fused to napDNAbp proteins or provided as individual proteins according to various embodiments of this disclosure are provided below.
  • exemplary reverse transcriptases include variants with at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% sequence identity to the following wild-type enzymes or partial enzymes:
  • the prime editors utilized in the complexes and methods described herein can include a variant RT comprising one or more of the following mutations: P51X, S67X, E69X, L139X, T197X, D200X, H204X, F209X, E302X, T306X, F309X, W313X, T330X, L345X, L435X, N454X, D524X, E562X, D583X, H594X, L603X, E607X, or D65
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising a P51X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is L.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an S67X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is K.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an E69X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is K.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an L139X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is P.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising a T197X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is A.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising a D200X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid. In certain embodiments, X is N.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an H204X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is R.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an F209X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid. In certain embodiments, X is N.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an E302X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is K.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an E302X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is R.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising a T306X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is K.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an F309X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid. In certain embodiments, X is N.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising a W313X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid. In certain embodiments, X is F.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising a T330X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is P.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an L345X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is G.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an L435X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is G.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an N454X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is K.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising a D524X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is G.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an E562X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid. In certain embodiments, X is Q.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising a D583X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid. In certain embodiments, X is N.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an H594X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid. In certain embodiments, X is Q.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an L603X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid. In certain embodiments, X is W.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising an E607X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid.
  • X is K.
  • the prime editors used in the complexes and methods described herein can include a variant RT comprising a D653X mutation in the wild type M-MLV RT of SEQ ID NO: 33, or at a corresponding amino acid position in another wild type RT polypeptide sequence, wherein “X” can be any amino acid. In certain embodiments, X is N.
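The "X" notation used in the preceding embodiments can be expanded into explicit substitutions. The following sketch assumes that "any amino acid" means any of the 20 standard amino acids other than the wild-type residue, which is an interpretation rather than a statement from this disclosure. The function name is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def expand_x(mutation: str) -> list[str]:
    """Expand 'P51X' into explicit substitutions such as 'P51A', 'P51C', ...

    The wild-type residue is excluded (an assumption). A mutation that
    does not end in 'X' is returned unchanged as a single-item list.
    """
    if not mutation.endswith("X"):
        return [mutation]
    wild_type, position = mutation[0], mutation[1:-1]
    return [f"{wild_type}{position}{aa}" for aa in AMINO_ACIDS if aa != wild_type]
```

For instance, `expand_x("P51X")` includes `"P51L"`, the specific substitution named in the corresponding embodiment.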
  • Exemplary reverse transcriptases include variants with at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the wild-type enzymes or partial enzymes described in SEQ ID NOs: 33-78, 159-179, and 188-213.
  • the prime editor (PE) system described here contemplates any publicly available reverse transcriptase (and any variant thereof) described or disclosed in any of the following U.S. patents (each of which is incorporated by reference), U.S.
  • the following references describe reverse transcriptases known in the art. Each of their disclosures is incorporated herein by reference.
  • the prime editor proteins comprise an MMLV reverse transcriptase comprising one or more amino acid substitutions.
  • the wild-type MMLV reverse transcriptase is provided by the following sequence:
  • the reverse transcriptases used in the prime editors described herein may comprise one or more mutations relative to the wild-type amino acid sequence.
  • the reverse transcriptase is the MMLV pentamutant described above (i.e., comprising amino acid substitutions D200N, T306K, W313F, T330P, and L603W).
  • the present disclosure provides MMLV reverse transcriptase variants, and prime editors (e.g., fusion proteins and prime editors in which the napDNAbp and reverse transcriptase are provided in trans) comprising MMLV reverse transcriptase variants, wherein the variants comprise one or more mutations relative to SEQ ID NO: 33 selected from the group consisting of T13I, V19I, A32T, G38V, S60Y, P111L, K120R, H126Y, T128N, T128F, T128H, V129S, P132S, G138R, C157F, P175Q, P175S, D200S, D200Y, D200N, D200C, Y222F, V223A, V223M, V223T, V223W, V223Y, L234I, T246I, N249S, T287A, P292T, E302A, E302K, T306K,
  • an MMLV reverse transcriptase variant comprises two or more of these mutations, three or more of these mutations, four or more of these mutations, or five or more of these mutations.
  • the MMLV reverse transcriptase variants used in the prime editors provided herein comprise a single mutation relative to SEQ ID NO: 33.
  • the single mutation is selected from the group consisting of T13I, G38V, K120R, H126Y, T128N, T128F, T128H, V129S, P132S, P175Q, P175S, D200C, D200Y, V223M, V223T, V223W, V223Y, L234I, P292T, G316R, K373N, M457I, and V402A.
  • the MMLV reverse transcriptase variants used in the prime editors provided herein comprise any one of the following groups of mutations relative to the amino acid sequence of SEQ ID NO: 33: D200Y and E302A; D200Y, V223A, and M457I; V223M, T306K, and A462S; D200N and E302K; D200Y and E302K; T128N and V223A; V19I, A32T, and D200Y; D200S, V223A, E346K, and W388C; S60Y, V223A, and N249S; P111L, V223A, T287A, and G316R; S60Y, G138R, and V223A; S60Y, Y222F, V223A, and K445N; or S60Y, C157F, V223A, and T246I.
  • the MMLV reverse transcriptase variant used in the prime editors provided herein comprises the amino acid sequence of any one of SEQ ID NOs: 33-42, 63-78, and 172-179, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 33-42, 63-78, and 172-179, wherein the amino acid sequence comprises at least one of residues 13I, 19I, 32T, 38V, 60Y, 111L, 120R, 126Y, 128N, 128F, 128H, 129S, 132S, 138R, 157F, 175Q, 175S, 200S, 200Y, 200N, 200C, 222F, 223A, 223M, 223T, 223W, 223Y, 234I, 246I, 249S, 287
  • the proteins described herein may comprise an MMLV reverse transcriptase comprising one or more substitutions at amino acid positions V19, A32, S60, P111, T128, G138, C157, D200, Y222, V223, T246, N249, T287, G316, E346, W388, and/or K445.
  • the proteins described herein comprise an MMLV reverse transcriptase comprising one or more substitutions selected from the group consisting of V19I, A32T, S60Y, P111L, T128N, G138R, C157F, D200S, D200Y, Y222F, V223A, T246I, N249S, T287A, G316R, E346K, W388C, and K445N.
  • the proteins described herein comprise an MMLV reverse transcriptase comprising any one of the following groups of amino acid substitutions: T128N and V223A; V19I, A32T, and D200Y; D200S, V223A, E346K, and W388C; S60Y, V223A, and N249S; P111L, V223A, T287A, and G316R; S60Y, G138R, and V223A; S60Y, Y222F, V223A, and K445N; or S60Y, C157F, V223A, and T246I.
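The substitution shorthand used throughout these lists (wild-type residue, 1-based position, replacement residue, as in D200N) can be applied to a sequence mechanically. The following is a minimal sketch under stated assumptions: the helper `apply_substitutions` and the six-residue toy sequence are hypothetical, and the real SEQ ID NO: 33 sequence is not reproduced here.

```python
import re

# One substitution token: wild-type residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(wild_type: str, substitutions):
    """Return a copy of wild_type with each substitution (e.g. 'D200N') applied.

    Raises ValueError when a token is malformed, out of range, or names a
    wild-type residue that does not match the sequence (a numbering check).
    """
    residues = list(wild_type)
    for token in substitutions:
        match = _SUBSTITUTION.match(token)
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"{token}: position outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(f"{token}: found {residues[pos - 1]} at position {pos}")
        residues[pos - 1] = new
    return "".join(residues)


TOY_WILD_TYPE = "MDTWTL"  # hypothetical sequence, for illustration only
toy_variant = apply_substitutions(TOY_WILD_TYPE, ["D2N", "W4F"])
```

Checking the stated wild-type residue before substituting is a deliberate choice: it catches the case where a mutation list numbered against one reference is applied to a different one.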
  • Exemplary evolved reverse transcriptase enzymes are as follows:
  • the use of reverse transcriptase enzymes comprising an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity to any of the evolved variants described herein in the prime editors disclosed herein is also contemplated by the present disclosure, provided the RT sequence comprises one of the amino acid substitutions disclosed herein.
  • the disclosure also contemplates the use of any wild-type reverse transcriptase in the prime editors described herein. Exemplary wild-type reverse transcriptases which may be used include, but are not limited to, the following sequences, or any variant thereof having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity thereto:
  • the use of reverse transcriptase enzymes comprising an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, or at least 99% sequence identity to any of the enzymes above in the prime editor proteins disclosed herein is also contemplated by the present disclosure.
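The recurring two-part criterion (a minimum percent sequence identity to a reference, plus the presence of at least one listed residue) can be sketched as follows. This is an illustration only. It assumes a precomputed pairwise alignment and defines identity as matching columns divided by total alignment columns, which is one of several conventions in use; the disclosure does not specify one here. All function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Matching non-gap columns as a percentage of all alignment columns."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned rows must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def residue_at_ref_position(aligned_ref: str, aligned_cand: str, ref_pos: int):
    """Candidate residue aligned to the 1-based reference position, or None."""
    ref_i = 0
    for r, c in zip(aligned_ref, aligned_cand):
        if r != "-":
            ref_i += 1
            if ref_i == ref_pos:
                return c
    return None


def meets_criteria(aligned_ref, aligned_cand, min_identity, required_residues):
    """True if identity >= min_identity and at least one listed residue is present.

    required_residues is an iterable of (reference_position, residue) pairs,
    e.g. (199, 'N') for a residue written as 199N.
    """
    has_required = any(
        residue_at_ref_position(aligned_ref, aligned_cand, pos) == aa
        for pos, aa in required_residues
    )
    return percent_identity(aligned_ref, aligned_cand) >= min_identity and has_required


# Toy ungapped alignment (hypothetical): one difference in ten columns.
TOY_REF = "MDTWTLKRAG"
TOY_CAND = "MNTWTLKRAG"
toy_ok = meets_criteria(TOY_REF, TOY_CAND, 90.0, [(2, "N"), (4, "F")])
```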
  • the present disclosure provides reverse transcriptases, and prime editors (e.g.
  • AVIRE reverse transcriptase of SEQ ID NO: 47
  • AVIRE reverse transcriptase variant having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with SEQ ID NO: 47, wherein the AVIRE reverse transcriptase variant comprises one or more mutations selected from the group consisting of D199N, T305K, W312F, G329P, and L604W.
  • the AVIRE reverse transcriptase variant comprises two or more, three or more, four or more, or all five of these mutations. In some embodiments, the AVIRE reverse transcriptase variant comprises the mutation D199N. In some embodiments, the AVIRE reverse transcriptase variant comprises the mutation T305K. In some embodiments, the AVIRE reverse transcriptase variant comprises the mutation W312F. In some embodiments, the AVIRE reverse transcriptase variant comprises the mutation G329P. In some embodiments, the AVIRE reverse transcriptase variant comprises the mutation L604W.
  • the AVIRE reverse transcriptase variant comprises the amino acid sequence of any one of SEQ ID NOs: 214-219, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 214-219, wherein the amino acid sequence comprises at least one of the residues 199N, 305K, 312F, 329P, and 604W: AVIRE-RT (D199N); AVIRE-RT (T305K); AVIRE-RT (W312F); AVIRE-RT (G329P); AVIRE-RT (L604W).
  • the AVIRE reverse transcriptase variant comprises an amino acid sequence of SEQ ID NO: 219, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least
  • the reverse transcriptase is a KORV reverse transcriptase of SEQ ID NO: 50, or a KORV reverse transcriptase variant having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with SEQ ID NO: 50, wherein the KORV reverse transcriptase variant comprises one or more mutations selected from the group consisting of D197N, T303K, W310F, E327P, and L599W.
  • the KORV reverse transcriptase variant comprises two or more, three or more, four or more, or all five of these mutations. In some embodiments, the KORV reverse transcriptase variant comprises the mutation D197N. In some embodiments, the KORV reverse transcriptase variant comprises the mutation T303K. In some embodiments, the KORV reverse transcriptase variant comprises the mutation W310F. In some embodiments, the KORV reverse transcriptase variant comprises the mutation E327P. In some embodiments, the KORV reverse transcriptase variant comprises the mutation L599W.
  • the KORV reverse transcriptase variant comprises the amino acid sequence of any one of SEQ ID NOs: 220-225, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 220-225, wherein the amino acid sequence comprises at least one of the residues 197N, 303K, 310F, 327P, and 599W: KORV-RT D197N; KORV-RT T303K; KORV-RT W310F; KORV-RT E327P; KORV-RT L599W.
  • the KORV reverse transcriptase variant comprises an amino acid sequence of SEQ ID NO: 225, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at
  • a WMSV reverse transcriptase of SEQ ID NO: 54 or a WMSV reverse transcriptase variant having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with SEQ ID NO: 54, wherein the WMSV reverse transcriptase variant comprises one or more mutations selected from the group consisting of D197N, T303K, W311F, E327P, and L599W.
  • the WMSV reverse transcriptase variant comprises two or more, three or more, four or more, or all five of these mutations. In some embodiments, the WMSV reverse transcriptase variant comprises the mutation D197N. In some embodiments, the WMSV reverse transcriptase variant comprises the mutation T303K. In some embodiments, the WMSV reverse transcriptase variant comprises the mutation W311F. In some embodiments, the WMSV reverse transcriptase variant comprises the mutation E327P. In some embodiments, the WMSV reverse transcriptase variant comprises the mutation L599W.
  • the WMSV reverse transcriptase variant comprises the amino acid sequence of any one of SEQ ID NOs: 226-231, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 226-231, wherein the amino acid sequence comprises at least one of the residues 197N, 303K, 311F, 327P, and 599W: WMSV-RT D197N; WMSV-RT T303K; WMSV-RT W311F; WMSV-RT E327P; WMSV-RT L599W.
  • the WMSV reverse transcriptase variant comprises an amino acid sequence of SEQ ID NO: 231, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at
  • the prime editor proteins described herein may comprise a PERV reverse transcriptase comprising one or more mutations relative to the amino acid sequence of SEQ ID NO: 45.
  • the PERV reverse transcriptase comprises one or more mutations selected from the group consisting of D199N, T305K, W312F, E329P, and L602W relative to the amino acid sequence of SEQ ID NO: 45.
  • the PERV reverse transcriptase comprises the mutations D199N, T305K, W312F, E329P, and L602W relative to the amino acid sequence of SEQ ID NO: 45.
  • the present disclosure provides reverse transcriptases, and prime editors (e.g.
  • the reverse transcriptase is a PERV reverse transcriptase of SEQ ID NO: 45, or a PERV reverse transcriptase variant having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with SEQ ID NO: 45, wherein the PERV reverse transcriptase variant comprises one or more mutations selected from the group consisting of D199N, T305K, W312F, E329P, and L602W.
  • the PERV reverse transcriptase variant comprises two or more, three or more, four or more, or all five of these mutations. In some embodiments, the PERV reverse transcriptase variant comprises the mutation D199N. In some embodiments, the PERV reverse transcriptase variant comprises the mutation T305K. In some embodiments, the PERV reverse transcriptase variant comprises the mutation W312F. In some embodiments, the PERV reverse transcriptase variant comprises the mutation E329P. In some embodiments, the PERV reverse transcriptase variant comprises the mutation L602W.
  • the PERV reverse transcriptase variant comprises the amino acid sequence of any one of SEQ ID NOs: 232-238, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 232-238, wherein the amino acid sequence comprises at least one of the residues 199N, 305K, 312F, 329P, and 602W: PERV variant 21; PERV-RT D199N; PERV-RT T305K; PERV-RT W312F; PERV-RT E329P; PERV-RT L602W.
  • the PERV reverse transcriptase variant comprises an amino acid sequence of SEQ ID NO: 238, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%,
  • the prime editor proteins described herein may comprise a Tf1 reverse transcriptase comprising one or more mutations relative to the amino acid sequence of SEQ ID NO: 55.
  • the Tf1 reverse transcriptase comprises one or more mutations selected from the group consisting of V14A, E22K, P70T, G72V, M102I, K106R, K118R, A139T, L158Q, F269L, S297Q, K356E, A363V, K413E, I423V, and S492N relative to the amino acid sequence of SEQ ID NO: 55.
  • the Tf1 reverse transcriptase comprises any one of the following groups of amino acid substitutions relative to the amino acid sequence of SEQ ID NO: 55: K118R and S297Q; V14A, L158Q, F269L, and K356E; K106R, L158Q, F269L, A363V, and I423V; E22K, P70T, G72V, M102I, K106R, A139T, L158Q, F269L, A363V, K413E, and S492N; or P70T, G72V, M102I, K106R, L158Q, F269L, A363V, K413E, and S492N.
  • the present disclosure provides reverse transcriptases, and prime editors (e.g. fusion proteins or prime editors in which each component is provided in trans) comprising reverse transcriptases, wherein the reverse transcriptase is a Tf1 reverse transcriptase of SEQ ID NO: 171, or a Tf1 reverse transcriptase variant having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with SEQ ID NO: 171, wherein the Tf1 reverse transcriptase variant comprises one or more mutations selected from the group consisting of V14A, E22K, I64L, I64W, P70T, G72V, M102I, K106R, K118R, L133N, A139T, L158Q, S188K, I260L, F269L, E274R, R288Q, Q293K, S297Q, N316Q
  • the Tf1 reverse transcriptase variant comprises a single mutation, wherein the single mutation is an I64L mutation, an I64W mutation, a K118R mutation, an L133N mutation, an S188K mutation, an I260L mutation, an E274R mutation, an R288Q mutation, a Q293K mutation, an S297Q mutation, an N316Q mutation, or a K321R mutation.
  • the Tf1 reverse transcriptase variant comprises any one of the following groups of mutations relative to the amino acid sequence of SEQ ID NO: 171: K118R and S297Q; V14A, L158Q, F269L, and K356E; E22K, P70T, G72V, M102I, K106R, A139T, L158Q, F269L, A363V, K413E, and S492N; P70T, G72V, M102I, K106R, L158Q, F269L, A363V, K413E, and S492N; K106R, L158Q, F269L, A363V, and I423V; K118R, S297Q, S188K, I64L, I260L, and R288Q; E22K, P70T, G72V, M102I, K106R, A139T, L158Q, F269L, A363V, K413E
  • the Tf1 reverse transcriptase variant comprises the amino acid sequence of any one of SEQ ID NOs: 196-213 and 241-245, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 196-213 and 241-245, wherein the amino acid sequence comprises at least one of residues 14A, 22K, 64L, 64W, 70T, 72V, 102I, 106R, 118R, 133N, 139T, 158Q, 188K, 260L, 269L, 274R, 288Q, 293K, 297Q, 316Q, 321R, 356E, 363V, 413E, 423V, and 492N.
  • the reverse transcriptase comprises an Ec48 reverse transcriptase.
  • the prime editor proteins described herein may comprise an Ec48 reverse transcriptase comprising one or more mutations relative to the amino acid sequence of SEQ ID NO: 59.
  • the Ec48 reverse transcriptase comprises one or more mutations selected from the group consisting of A36V, E54K, K87E, R205K, V214L, D243N, R267I, S277F, E279K, N317S, K318E, H324Q, K326E, E328K, and R372K relative to the amino acid sequence of SEQ ID NO: 59.
  • the Ec48 reverse transcriptase comprises any one of the following groups of amino acid substitutions relative to the amino acid sequence of SEQ ID NO: 59: R267I, K318E, K326E, E328K, and R372K; K87E, R205K, V214L, D243N, R267I, N317S, K318E, H324Q, and K326E; E54K, K87E, D243N, R267I, E279K, and K318E; A36V, K87E, R205K, D243N, R267I, E279K, and K318E; E54K, K87E, D243N, R267I, E279K, and K318E; or E54K, K87E, D243N, R267I, S277F, E279K, and K318E.
  • the present disclosure provides reverse transcriptases, and prime editors (e.g. fusion proteins or prime editors in which each component is provided in trans) comprising reverse transcriptases, wherein the reverse transcriptase is an Ec48 reverse transcriptase of SEQ ID NO: 59, or an Ec48 reverse transcriptase variant having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with SEQ ID NO: 59, wherein the Ec48 reverse transcriptase variant comprises one or more mutations selected from the group consisting of A36V, E54K, E60K, K87E, S151T, E165D, L182N, T189N, R205K, V214L, D243N, R267I, S277F, E279K, V303M, K307R, R315K, N317S, K318E, H324
  • the Ec48 reverse transcriptase variant comprises a single mutation, wherein the single mutation is an L182N mutation, a T189N mutation, a K307R mutation, an R315K mutation, an R378K mutation, or a T385R mutation.
  • the Ec48 reverse transcriptase variant comprises any one of the following groups of mutations relative to the amino acid sequence of SEQ ID NO: 59: R267I, K318E, K326E, E328K, and R372K; K87E, R205K, V214L, D243N, R267I, N317S, K318E, H324Q, and K326E; E54K, K87E, D243N, R267I, E279K, and K318E; A36V, K87E, R205K, D243N, R267I, E279K, and K318E; E54K, K87E, D243N, R267I, E279K, and K318E; E54K, K87E, D243N, R267I, E279K, and K318E; E54K, K87E, D243N, R267I, S277F, E279K, and K318E; E60K,
  • the Ec48 reverse transcriptase variant comprises the amino acid sequence of any one of SEQ ID NOs: 188-195, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 188-195, wherein the amino acid sequence comprises at least one of residues 36V, 54K, 60K, 87E, 151T, 165D, 182N, 189N, 205K, 214L, 243N, 267I, 277F, 279K, 303M, 307R, 315K, 317S, 318E, 324Q, 326E, 328K, 343N, 372K, 378K, and 385R.
  • the present disclosure provides reverse transcriptases, and prime editors (e.g.
  • the reverse transcriptase is an Ne144 reverse transcriptase of SEQ ID NO: 64, or an Ne144 reverse transcriptase variant having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with SEQ ID NO: 64, wherein the Ne144 reverse transcriptase variant comprises one or more mutations selected from the group consisting of A157T, A165T, and G288V relative to SEQ ID NO: 64.
  • the Ne144 reverse transcriptase variant comprises the mutations A157T, A165T, and G288V.
  • the Ne144 reverse transcriptase variant comprises the amino acid sequence of SEQ ID NO: 239, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 239, wherein the amino acid sequence comprises at least one of residues 157T, 165T, and 288V.
  • the present disclosure provides reverse transcriptases, and prime editors (e.g.
  • the reverse transcriptase is a Vc95 reverse transcriptase of SEQ ID NO: 58, or a Vc95 reverse transcriptase variant having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with SEQ ID NO: 58, wherein the Vc95 reverse transcriptase variant comprises one or more mutations selected from the group consisting of L11M, S75A, V97M, N146D, and N245T relative to SEQ ID NO: 58.
  • the Vc95 reverse transcriptase variant comprises the mutations L11M, S75A, V97M, N146D, and N245T.
  • the Vc95 reverse transcriptase variant comprises the amino acid sequence of SEQ ID NO: 240, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 240, wherein the amino acid sequence comprises at least one of residues 11M, 75A, 97M, 146D, and 245T.
  • the present disclosure provides reverse transcriptases, and prime editors (e.g.
  • the reverse transcriptase is a Gs reverse transcriptase of SEQ ID NO: 60, or a Gs reverse transcriptase variant having at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with SEQ ID NO: 60, wherein the Gs reverse transcriptase variant comprises one or more mutations selected from the group consisting of N12D, A16E, A16V, L17P, V20G, L37R, L37P, R38H, Y40C, I41N, I41S, W45R, I67T, I67R, G72E, G73V, G78V, Q93R, A123V, Y126F, E129G, K162N, P190L, D206V, R233K, A234V, R263G,
  • the Gs reverse transcriptase variant comprises two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, nine or more, or ten or more of these mutations.
  • the Gs reverse transcriptase variant comprises any one of the following groups of mutations relative to the amino acid sequence of SEQ ID NO: 60: L17P and D206V; N12D, L37R, and G78V; A16E, L37P, and A123V; A16V, R38H, W45R, Y126F, and Q412H; A16V, R38H, W45R, and R291K; N12D, L37R, G72E, E129G, P264S, R344S, and R360S; N12D, Y40C, I67T, G73V, Q93R, R287I, and R358S; N12D, Y40C, I67T, G73V, Q93
  • the Gs reverse transcriptase variant comprises the amino acid sequence of any one of SEQ ID NOs: 159-171, or an amino acid sequence at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to any one of SEQ ID NOs: 159-171, wherein the amino acid sequence comprises at least one of residues 12D, 16E, 16V, 17P, 20G, 37R, 37P, 38H, 40C, 41N, 41S, 45R, 67T, 67R, 72E, 73V, 78V, 93R, 123V, 126F, 129G, 162N, 190L, 206V, 233K, 234V, 263G, 264S, 267M, 279E, 287I, 291K, 309T, 344S, 358S, 360S, 363G, 374A
  • the NLS examples above are non-limiting.
  • the prime editors used in the presently described complexes and methods may comprise any known NLS sequence, including any of those described in Cokol et al., “Finding nuclear localization signals,” EMBO Rep., 2000, 1(5): 411-415 and Freitas et al., “Mechanisms and Signals for the Nuclear Import of Proteins,” Current Genomics, 2009, 10(8): 550-7, each of which is incorporated herein by reference.
  • the fusion proteins used in the complexes and methods described herein further comprise one or more (and preferably at least two) nuclear localization sequences.
  • the fusion proteins comprise at least two NLSs. In embodiments with at least two NLSs, the NLSs can be the same NLSs or they can be different NLSs. In some embodiments, one or more of the NLSs are bipartite NLSs (“bpNLS”). In certain embodiments, the disclosed fusion proteins comprise two bipartite NLSs. In some embodiments, the disclosed fusion proteins comprise more than two bipartite NLSs.
  • the location of the NLS fusion can be at the N-terminus, the C-terminus, or within a sequence of a fusion protein (e.g., inserted between the encoded napDNAbp component (e.g., Cas9) and a polymerase domain (e.g., a reverse transcriptase)).
  • the NLSs may be any known NLS sequence in the art.
  • the NLSs may also be any future-discovered NLSs for nuclear localization.
  • the NLSs also may be any naturally occurring NLS, or any non-naturally occurring NLS (e.g., an NLS with one or more desired mutations).
  • nuclear localization sequence refers to an amino acid sequence that promotes import of a protein into the cell nucleus, for example, by nuclear transport.
  • Nuclear localization sequences are known in the art and would be apparent to the skilled artisan. For example, NLS sequences are described in Plank et al., International PCT application PCT/EP2000/011690, filed November 23, 2000, published as WO/2001/038547 on May 31, 2001, the contents of which are incorporated herein by reference.
  • an NLS comprises the amino acid sequence PKKKRKV (SEQ ID NO: 94), MDSLLMNRRKFLYQFKNVRWAKGRRETYLC (SEQ ID NO: 99), KRTADGSEFESPKKKRKV (SEQ ID NO: 97), or KRTADGSEFEPKKKRKV (SEQ ID NO: 106).
  • an NLS comprises the amino acid sequences NLSKRPAAIKKAGQAKKKK (SEQ ID NO: 107), PAAKRVKLD (SEQ ID NO: 98), RQRRNELKRSF (SEQ ID NO: 108), or NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY (SEQ ID NO: 109).
  • a prime editor or other fusion protein may be modified with one or more nuclear localization sequences (NLS), preferably at least two NLSs.
  • the fusion proteins are modified with two or more NLSs.
  • a representative nuclear localization sequence is a peptide sequence that directs the protein to the nucleus of the cell in which the sequence is expressed.
  • a nuclear localization signal is predominantly basic, can be positioned almost anywhere in a protein's amino acid sequence, generally comprises a short sequence of four amino acids (Autieri & Agrawal, (1998) J. Biol. Chem.
  • Nuclear localization sequences often comprise proline residues.
  • a variety of nuclear localization sequences have been identified and have been used to effect transport of biological molecules from the cytoplasm to the nucleus of a cell. See, e.g., Tinland et al., (1992) Proc. Natl. Acad. Sci. U.S.A. 89:7442-46; Moede et al., (1999) FEBS Lett. 461:229-34, each of which is incorporated herein by reference.
  • NLSs can be classified in three general groups: (i) a monopartite NLS exemplified by the SV40 large T antigen NLS (PKKKRKV (SEQ ID NO: 94)); (ii) a bipartite motif consisting of two basic domains separated by a variable number of spacer amino acids and exemplified by the Xenopus nucleoplasmin NLS (KRXXXXXXXXXXKKKL (SEQ ID NO: 110)); and (iii) noncanonical sequences such as M9 of the hnRNP A1 protein, the influenza virus nucleoprotein NLS, and the yeast Gal4 protein NLS (Dingwall and Laskey 1991).
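The first two NLS groups lend themselves to a simple pattern scan. The sketch below is an illustration only: it searches for the SV40 monopartite sequence and for the nucleoplasmin-style bipartite exemplar exactly as written (with X read as any residue), so it uses the exemplar's fixed spacer length rather than the variable spacer the text describes, and it does not attempt the noncanonical group. The function `find_nls` and the toy protein are hypothetical.

```python
import re

SV40_NLS = "PKKKRKV"                  # monopartite exemplar (SEQ ID NO: 94)
BIPARTITE_MOTIF = "KRXXXXXXXXXXKKKL"  # bipartite exemplar (SEQ ID NO: 110); X = any residue
# Assumption: each X is treated as exactly one arbitrary residue.
BIPARTITE_RE = re.compile(BIPARTITE_MOTIF.replace("X", "."))


def find_nls(protein: str):
    """Return (kind, 1-based start, matched sequence) tuples found in protein."""
    hits = []
    start = protein.find(SV40_NLS)
    while start != -1:
        hits.append(("monopartite", start + 1, SV40_NLS))
        start = protein.find(SV40_NLS, start + 1)
    for match in BIPARTITE_RE.finditer(protein):
        hits.append(("bipartite", match.start() + 1, match.group()))
    return hits


# Toy protein (hypothetical): both exemplars embedded in filler residues.
TOY_PROTEIN = "MA" + SV40_NLS + "GG" + BIPARTITE_MOTIF.replace("X", "A") + "E"
toy_hits = find_nls(TOY_PROTEIN)
```

A scan like this only flags literal matches to the two exemplars; real NLS prediction relies on broader consensus patterns and experimental validation.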
  • Nuclear localization sequences appear at various points in the amino acid sequences of proteins. NLSs have been identified at the N-terminus, the C-terminus, and in the central region of proteins. Thus, the disclosure provides fusion proteins that may be modified with one or more NLSs at the C-terminus and/or the N-terminus, as well as at internal regions of the fusion protein. The residues of a longer sequence that do not function as component NLS residues should be selected so as not to interfere, for example, ionically or sterically, with the nuclear localization signal itself. Therefore, although there are no strict limits on the composition of an NLS-comprising sequence, in practice, such a sequence can be functionally limited in length and composition.
  • the present disclosure contemplates any suitable means by which to modify a fusion protein to include one or more NLSs.
  • the fusion proteins may be engineered to express a fusion protein that is translationally fused at its N-terminus or its C-terminus (or both) to one or more NLSs, i.e., to form a prime editor-NLS fusion construct.
  • a fusion protein-encoding nucleotide sequence may be genetically modified to incorporate a reading frame that encodes one or more NLSs in an internal region of the encoded prime editor.
  • the NLSs may include various amino acid linkers or spacer regions encoded between the prime editor and the N-terminally, C-terminally, or internally attached NLS amino acid sequence.
  • the present disclosure also provides for nucleotide constructs, vectors, and host cells for expressing fusion proteins that comprise a prime editor and one or more NLSs, among other components.
  • the prime editors described herein may also comprise nuclear localization sequences that are linked to a prime editor through one or more linkers, e.g., a polymeric, amino acid, nucleic acid, polysaccharide, or chemical linker element.
  • linkers within the contemplated scope of the disclosure are not intended to have any limitations and can be any suitable type of molecule (e.g., polymer, amino acid, polysaccharide, nucleic acid, lipid, or any synthetic chemical linker domain) and can be joined to the prime editor by any suitable strategy that effectuates forming a bond (e.g., covalent linkage, hydrogen bonding) between the prime editor and the one or more NLSs.
  • Linkers
  • the prime editors used in the complexes and methods described herein may include one or more linkers.
  • linker refers to a chemical group or a molecule linking two molecules or moieties, e.g., a binding domain and a cleavage domain of a nuclease.
  • a linker joins a gRNA binding domain of an RNA-programmable nuclease and a polymerase (e.g., a reverse transcriptase).
  • a linker joins a Cas9 nickase and a reverse transcriptase.
  • the linker is positioned between, or flanked by, two groups, molecules, or other moieties and connected to each one via a covalent bond, thus connecting the two.
  • the linker is an amino acid or a plurality of amino acids (e.g., a peptide or protein).
  • the linker is an organic molecule, group, polymer, or chemical moiety.
  • the linker is 5-100 amino acids in length, for example, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 30-35, 35-40, 40-45, 45-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-150, or 150-200 amino acids in length. Longer or shorter linkers are also contemplated.
  • the linker may be as simple as a covalent bond, or it may be a polymeric linker many atoms in length.
  • the linker is a polypeptide, or amino acid-based. In other embodiments, the linker is not peptide-like.
  • the linker is a covalent bond (e.g., a carbon-carbon bond, disulfide bond, carbon-heteroatom bond, etc.).
  • the linker is a carbon-nitrogen bond of an amide linkage.
  • the linker is a cyclic or acyclic, substituted or unsubstituted, branched or unbranched, aliphatic or heteroaliphatic linker.
  • the linker is polymeric (e.g., polyethylene, polyethylene glycol, polyamide, polyester, etc.). In certain embodiments, the linker comprises a monomer, dimer, or polymer of aminoalkanoic acid. In certain embodiments, the linker comprises an aminoalkanoic acid (e.g., glycine, ethanoic acid, alanine, beta-alanine, 3-aminopropanoic acid, 4-aminobutanoic acid, 5-pentanoic acid, etc.). In certain embodiments, the linker comprises a monomer, dimer, or polymer of aminohexanoic acid (Ahx).
  • the linker is based on a carbocyclic moiety (e.g., cyclopentane, cyclohexane). In other embodiments, the linker comprises a polyethylene glycol moiety (PEG). In other embodiments, the linker comprises amino acids. In certain embodiments, the linker comprises a peptide. In certain embodiments, the linker comprises an aryl or heteroaryl moiety. In certain embodiments, the linker is based on a phenyl ring. The linker may include functionalized moieties to facilitate attachment of a nucleophile (e.g., thiol, amino) from the peptide to the linker. Any electrophile may be used as part of the linker.
  • the linker comprises the amino acid sequence (GGGGS)n (SEQ ID NO: 84), (G)n (SEQ ID NO: 85), (EAAAK)n (SEQ ID NO: 86), (GGS)n (SEQ ID NO: 87), (SGGS)n (SEQ ID NO: 81), (XP)n (SEQ ID NO: 88), or any combination thereof, wherein n is independently an integer between 1 and 30, and wherein X is any amino acid.
  • the linker comprises the amino acid sequence (GGS)n (SEQ ID NO: 87), wherein n is 1, 3, or 7. In some embodiments, the linker comprises the amino acid sequence SGSETPGTSESATPES (SEQ ID NO: 89). In some embodiments, the linker comprises the amino acid sequence SGGSSGGSSGSETPGTSESATPESSGGSSGGS (SEQ ID NO: 90). In some embodiments, the linker comprises the amino acid sequence SGGSGGSGGS (SEQ ID NO: 91). In some embodiments, the linker comprises the amino acid sequence SGGS (SEQ ID NO: 82).
  • the linker comprises the amino acid sequence SGGSSGGSSGSETPGTSESATPESAGSYPYDVPDYAGSAAPAAKKKKLDGSGSGGSSGGS (SEQ ID NO: 83, 60AA).
  • the linker comprises the amino acid sequence GGS, GGSGGS (SEQ ID NO: 92), GGSGGSGGS (SEQ ID NO: 93), SGGSSGGSSGSETPGTSESATPESSGGSSGGSS (SEQ ID NO: 80), SGSETPGTSESATPES (SEQ ID NO: 89), or SGGSSGGSSGSETPGTSESATPESAGSYPYDVPDYAGSAAPAAKKKKLDGSGSGGSSGGS (SEQ ID NO: 83).
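  • The repeat-unit notation used for the linkers above can be made concrete with a short sketch. It expands a unit such as (GGS)n into a plain amino acid string. The function and constant names are hypothetical and chosen only for illustration, and the range of n is assumed to include both 1 and 30.

```python
# Illustrative sketch only; the names below are hypothetical, not from the disclosure.
REPEAT_UNITS = ("GGGGS", "G", "EAAAK", "GGS", "SGGS")


def repeat_linker(unit: str, n: int) -> str:
    """Expand a repeat-unit linker such as (GGS)n into its amino acid sequence."""
    if unit not in REPEAT_UNITS:
        raise ValueError(f"unknown repeat unit: {unit!r}")
    # Assumption: "between 1 and 30" is read as inclusive of both ends.
    if not 1 <= n <= 30:
        raise ValueError("n must be an integer from 1 to 30")
    return unit * n


def xp_linker(residues: str) -> str:
    """Build an (XP)n linker; each X is taken in order from `residues`."""
    if not 1 <= len(residues) <= 30:
        raise ValueError("n must be an integer from 1 to 30")
    return "".join(x + "P" for x in residues)


# Repeat units combined with the 16-residue sequence SGSETPGTSESATPES.
composite = repeat_linker("SGGS", 2) + "SGSETPGTSESATPES" + repeat_linker("SGGS", 2)
```

  • For example, repeat_linker("GGS", 3) yields GGSGGSGGS, and the 32-residue composite string matches the linker listed above as SEQ ID NO: 90.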
  • linkers may be used to link any of the peptides or peptide domains or moieties of the invention (e.g., a napDNAbp linked or fused to a reverse transcriptase domain, and/or a napDNAbp linked to one or more NLS). Any of the domains of the fusion proteins used in the complexes and methods described herein may also be connected to one another through any of the presently described linkers.
  • Additional prime editor domains
  • A. Flap endonucleases (e.g., FEN1)
  • [0543] the prime editors described herein may comprise one or more flap endonucleases (e.g., FEN1), which refers to an enzyme that catalyzes the removal of 5′ single stranded DNA flaps (provided in trans or fused to the PE fusion proteins). These are naturally occurring enzymes that process the removal of 5′ flaps formed during cellular processes, including DNA replication.
  • the prime editors described herein may utilize endogenously supplied flap endonucleases or those provided in trans to remove the 5′ flap of endogenous DNA formed at the target site during prime editing.
  • Flap endonucleases are known in the art and are described, for example, in Patel et al., “Flap endonucleases pass 5′-flaps through a flexible arch using a disorder-thread-order mechanism to confer specificity for free 5′-ends,” Nucleic Acids Research, 2012, 40(10): 4507-4519 and Tsutakawa et al., “Human flap endonuclease structures, DNA double-base flipping, and a unified understanding of the FEN1 superfamily,” Cell, 2011, 145(2): 198-211 (each of which is incorporated herein by reference).
  • An exemplary flap endonuclease is FEN1, which can be represented by the following amino acid sequence:
  • the flap endonucleases may also include any FEN1 variant, mutant, or other flap endonuclease ortholog, homolog, or variant.
  • FEN1 variant examples are as follows:
  • the prime editors utilized in the complexes and methods contemplated herein may include any flap endonuclease variant of the above-disclosed sequences having an amino acid sequence that is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to any of the above sequences.
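  • The percent-identity thresholds above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and uses the alignment length as the denominator. Both choices are assumptions made for illustration, since alignment tools differ on these conventions, and the function names are hypothetical.

```python
# Illustrative sketch only; not the method used in the disclosure.
THRESHOLDS = (70.0, 80.0, 90.0, 95.0, 96.0, 97.0, 98.0, 99.0, 99.5, 99.9)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences ('-' = gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    # A column counts as a match only if both residues agree and neither is a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    # Assumption: the denominator is the full alignment length, gaps included.
    return 100.0 * matches / len(aligned_a)


def thresholds_met(identity: float) -> list[float]:
    """Return every listed threshold that the given identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]
```

  • With two toy ten-residue strings that differ at one position, percent_identity returns 90.0, which satisfies the 70%, 80%, and 90% thresholds but none of the stricter ones.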
  • endonucleases that may be utilized by the instant compositions and methods to facilitate removal of the 5′ end single strand DNA flap include, but are not limited to, (1) TREX2 and (2) EXO1 endonuclease (e.g., Keijzers et al., Biosci Rep. 2015, 35(3): e00206).
  • TREX2
  • [0546] Three prime (3′) repair exonuclease 2 (TREX2) – human Accession No. NM_080701
  • [0547] Three prime (3′) repair exonuclease 2 (TREX2) – mouse Accession No. NM_011907
  • [0548] Three prime (3′) repair exonuclease 2 (TREX2) – rat Accession No.
  • EXO1 Human exonuclease 1
  • MMR DNA mismatch repair
  • HR homologous recombination
  • Human EXO1 belongs to a family of eukaryotic nucleases, Rad2/XPG, which also includes FEN1 and GEN1.
  • the Rad2/XPG family is conserved in the nuclease domain through species from phage to human.
  • the EXO1 gene product exhibits both 5′ exonuclease and 5′ flap activity. Additionally, EXO1 contains an intrinsic 5′ RNase H activity.
  • Human EXO1 has a high affinity for processing double stranded DNA (dsDNA), nicks, gaps, and pseudo Y structures and can resolve Holliday junctions using its inherent flap activity. Human EXO1 is implicated in MMR and contains conserved binding domains interacting directly with MLH1 and MSH2. EXO1 nucleolytic activity is positively stimulated by PCNA, MutSα (MSH2/MSH6 complex), 14-3-3, MRN, and 9-1-1 complex.
  • [0550] Exonuclease 1 (EXO1) Accession No. NM_003686 (Homo sapiens exonuclease 1 (EXO1), transcript variant 3) – isoform A
  • Exonuclease 1 (EXO1) Accession No. NM_006027 (Homo sapiens exonuclease 1 (EXO1), transcript variant 3) – isoform B
  • B. Inteins and split-inteins
  • [0553] It will be understood that in some embodiments (e.g., delivery of a prime editor in vivo), it may be advantageous to split a polypeptide (e.g., a reverse transcriptase or a napDNAbp) or a fusion protein (e.g., a prime editor) into an N-terminal half and a C-terminal half, deliver them separately, and then allow their colocalization to reform the complete protein (or fusion protein as the case may be) within the cell.
  • Separate halves of a protein or a fusion protein may each comprise a split-intein tag to facilitate the reformation of the complete protein or fusion protein by the mechanism of protein trans splicing.
  • Protein trans-splicing, catalyzed by split inteins, provides an entirely enzymatic method for protein ligation.
  • a split-intein is essentially a contiguous intein (e.g., a mini-intein) split into two pieces named N-intein and C-intein, respectively.
  • the N-intein and C-intein of a split intein can associate non-covalently to form an active intein and catalyze the splicing reaction in essentially the same way as a contiguous intein does.
  • Split inteins have been found in nature and have also been engineered in laboratories.
  • the term “split intein” refers to any intein in which one or more peptide bond breaks exist between the N-terminal and C-terminal amino acid sequences such that the N-terminal and C-terminal sequences become separate molecules that can non-covalently reassociate, or reconstitute, into an intein that is functional for trans-splicing reactions.
  • Any catalytically active intein, or fragment thereof, may be used to derive a split intein for use in the methods of the invention.
  • the split intein may be derived from a eukaryotic intein.
  • the split intein may be derived from a bacterial intein.
  • the split intein may be derived from an archaeal intein.
  • the split intein so-derived will possess only the amino acid sequences essential for catalyzing trans-splicing reactions.
  • the “N-terminal split intein (In)” refers to any intein sequence that comprises an N-terminal amino acid sequence that is functional for trans-splicing reactions.
  • An In thus also comprises a sequence that is spliced out when trans-splicing occurs.
  • An In can comprise a sequence that is a modification of the N-terminal portion of a naturally occurring intein sequence.
  • an In can comprise additional amino acid residues and/or mutated residues, as long as the inclusion of such additional and/or mutated residues does not render the In non-functional in trans-splicing.
  • the inclusion of the additional and/or mutated residues improves or enhances the trans-splicing activity of the In.
  • the “C-terminal split intein (Ic)” refers to any intein sequence that comprises a C-terminal amino acid sequence that is functional for trans-splicing reactions.
  • the Ic comprises 4 to 7 contiguous amino acid residues, at least 4 amino acids of which are from the last β-strand of the intein from which it was derived.
  • An Ic thus also comprises a sequence that is spliced out when trans-splicing occurs.
  • An Ic can comprise a sequence that is a modification of the C-terminal portion of a naturally occurring intein sequence.
  • an Ic can comprise additional amino acid residues and/or mutated residues, as long as the inclusion of such additional and/or mutated residues does not render the Ic non-functional in trans-splicing.
  • the inclusion of the additional and/or mutated residues improves or enhances the trans-splicing activity of the Ic.
  • a peptide linked to an Ic or an In can comprise an additional chemical moiety including, among others, fluorescence groups, biotin, polyethylene glycol (PEG), amino acid analogs, unnatural amino acids, phosphate groups, glycosyl groups, radioisotope labels, and pharmaceutical molecules.
  • a peptide linked to an Ic can comprise one or more chemically reactive groups including, among others, ketones, aldehydes, Cys residues, and Lys residues.
  • the term “intein-splicing polypeptide (ISP)” refers to the portion of the amino acid sequence of a split intein that remains when the Ic, In, or both, are removed from the split intein.
  • the In comprises the ISP.
  • the Ic comprises the ISP.
  • the ISP is a separate peptide that is not covalently linked to In nor to Ic.
  • Split inteins may be created from contiguous inteins by engineering one or more split sites in the unstructured loop or intervening amino acid sequence between the ~12 conserved beta-strands found in the structure of mini-inteins. Some flexibility in the position of the split site within regions between the beta-strands may exist, provided that creation of the split will not disrupt the structure of the intein, the structured beta-strands in particular, to a sufficient degree that protein splicing activity is lost.
  • In protein trans-splicing, one precursor protein consists of an N-extein part followed by the N-intein, and another precursor protein consists of the C-intein followed by a C-extein part. A trans-splicing reaction catalyzed by the N- and C-inteins together excises the two intein sequences and links the two extein sequences with a peptide bond.
  • Protein trans-splicing, being an enzymatic reaction, can work with very low (e.g., micromolar) concentrations of proteins and can be carried out under physiological conditions.
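  • The arrangement of the two precursor proteins can be sketched as simple string bookkeeping. This is only a toy model of which segments end up joined and which are excised; it does not represent the splicing chemistry. The class names, function name, and placeholder sequences are hypothetical.

```python
# Toy model only: tracks segment order, not the chemistry of splicing.
from dataclasses import dataclass


@dataclass(frozen=True)
class NPrecursor:
    n_extein: str  # e.g. the N-terminal half of the protein to be delivered
    n_intein: str  # N-intein fused after the N-extein


@dataclass(frozen=True)
class CPrecursor:
    c_intein: str  # C-intein placed before the C-extein
    c_extein: str  # e.g. the C-terminal half of the protein to be delivered


def trans_splice(n: NPrecursor, c: CPrecursor) -> tuple[str, tuple[str, str]]:
    """Return (ligated extein product, excised intein fragments)."""
    product = n.n_extein + c.c_extein    # exteins joined by a new peptide bond
    excised = (n.n_intein, c.c_intein)   # both intein pieces are removed
    return product, excised


# Placeholder strings, not real protein sequences.
product, excised = trans_splice(
    NPrecursor(n_extein="NHALF", n_intein="INTN"),
    CPrecursor(c_intein="INTC", c_extein="CHALF"),
)
```

  • Here product is the concatenation of the two extein halves, and neither intein fragment appears in it, which is the outcome that makes split inteins useful for delivering a large fusion protein as two separate halves.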
  • Although inteins are most frequently found as a contiguous domain, some exist in a naturally split form. In this case, the two fragments are expressed as separate polypeptides and must associate before splicing takes place, a process known as protein trans-splicing.
  • An exemplary split intein is the Ssp DnaE intein, which comprises two subunits, namely, DnaE-N and DnaE-C. The two different subunits are encoded by separate genes, namely dnaE-n and dnaE-c, which encode the DnaE-N and DnaE-C subunits, respectively.
  • DnaE is a naturally occurring split intein in Synechocystis sp. PCC6803 and is capable of directing trans-splicing of two separate proteins, each comprising a fusion with either DnaE-N or DnaE-C.
  • Additional naturally occurring or engineered split-intein sequences are known in the art or can be made from whole-intein sequences described herein or those available in the art.
  • split-intein sequences can be found in Stevens et al., “A promiscuous split intein with expanded protein engineering applications,” PNAS, 2017, Vol. 114: 8538-8543; and Iwai et al., “Highly efficient protein trans-splicing by a naturally split DnaE intein from Nostoc punctiforme,” FEBS Lett, 580: 1853-1858, each of which is incorporated herein by reference. Additional split intein sequences can be found, for example, in WO 2013/045632, WO 2014/055782, WO 2016/069774, and EP2877490, the contents of each of which are incorporated herein by reference.
  • RNA-protein interaction domain [0565]
  • two separate protein domains may be colocalized to one another to form a functional complex (akin to the function of a fusion protein comprising the two separate protein domains) by using an “RNA-protein recruitment system,” such as the “MS2 tagging technique.”
  • Such systems generally tag one protein domain with an “RNA-protein interaction domain” (a.k.a. “RNA-protein recruitment domain”) and the other with an “RNA-binding protein” that specifically recognizes and binds to the RNA-protein interaction domain, e.g., a specific hairpin structure.
  • the MS2 tagging technique is based on the natural interaction of the MS2 bacteriophage coat protein (“MCP” or “MS2cp”) with a stem-loop or hairpin structure present in the genome of the phage, i.e., the “MS2 hairpin.” In the case of the MS2 hairpin, it is recognized and bound by the MS2 bacteriophage coat protein (MCP).
  • a reverse transcriptase-MS2 fusion can recruit a Cas9-MCP fusion.
  • RNA recognition by the MS2 phage coat protein Sem Virol., 1997, Vol.8(3): 176-185
  • Delebecque et al. “Organization of intracellular reactions with rationally designed RNA assemblies,” Science, 2011, Vol.333: 470-474
  • Mali et al. “Cas9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering,” Nat.
  • the amino acid sequence of the MCP or MS2cp is: GSASNFTQFVLVDNGGTGDVTVAPSNFANGVAEWISSNSRSQAYKVTCSVRQSSAQNRKYTIKVEVPKVATQTVGGEELPVAGWRSYLNMELTIPIFATNSDCELIVKAMQGLLKDGNPIPSAIAANSGIY (SEQ ID NO: 145).
  • Other PE elements
  • the prime editors utilized in the methods and complexes described herein may comprise an inhibitor of base repair.
  • the term “inhibitor of base repair” or “IBR” refers to a protein that is capable of inhibiting the activity of a nucleic acid repair enzyme, for example, a base excision repair enzyme.
  • the IBR is an inhibitor of OGG base excision repair. In some embodiments, the IBR is an inhibitor of base excision repair (“iBER”). Exemplary inhibitors of base excision repair include inhibitors of APE1, Endo III, Endo IV, Endo V, Endo VIII, Fpg, hOGG1, hNEIL1, T7 EndoI, T4PDG, UDG, hSMUG1, and hAAG. In some embodiments, the IBR is an inhibitor of Endo V or hAAG.
  • the IBR is an iBER that may be a catalytically inactive glycosylase or catalytically inactive dioxygenase or a small molecule or peptide inhibitor of an oxidase, or variants thereof.
  • the IBR is an iBER that may be a TDG inhibitor, an MBD4 inhibitor, or an inhibitor of an AlkBH enzyme.
  • the IBR is an iBER that comprises a catalytically inactive TDG or catalytically inactive MBD4.
  • An exemplary catalytically inactive TDG is an N140A mutant of SEQ ID NO: 136 (human TDG).
  • the catalytically inactivated variants of any of these glycosylase domains are iBERs that may be fused to the napDNAbp or polymerase domain of the prime editors utilized in the methods and compositions provided in this disclosure.
  • a fusion protein may comprise any additional protein sequence, and optionally a linker sequence between any two domains.
  • Other exemplary features that may be present are localization sequences, such as cytoplasmic localization sequences, export sequences, such as nuclear export sequences, or other localization sequences, as well as sequence tags that are useful for solubilization, purification, or detection of the fusion proteins.
  • protein domains that may be fused to a prime editor or component thereof (e.g., the napDNAbp domain, the polymerase domain, or the NLS domain) include, without limitation, epitope tags and reporter gene sequences.
  • Non-limiting examples of epitope tags include histidine (His) tags, V5 tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags, and thioredoxin (Trx) tags.
  • reporter genes include, but are not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, luciferase, green fluorescent protein (GFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and autofluorescent proteins including blue fluorescent protein (BFP).
  • a prime editor may be fused to a gene sequence encoding a protein or a fragment of a protein that binds DNA molecules or binds other cellular molecules, including, but not limited to, maltose binding protein (MBP), S-tag, Lex A DNA binding domain (DBD) fusions, GAL4 DNA binding domain fusions, and herpes simplex virus (HSV) VP16 protein fusions. Additional domains that may form part of a prime editor are described in US Patent Publication No. 2011/0059502, published March 10, 2011, and incorporated herein by reference.
  • a reporter gene that includes, but is not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, luciferase, green fluorescent protein (GFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and autofluorescent proteins including blue fluorescent protein (BFP), may be introduced into a cell to encode a gene product that serves as a marker by which to measure the alteration or modification of expression of the gene product.
  • Suitable protein tags include, but are not limited to, biotin carboxylase carrier protein (BCCP) tags, myc-tags, calmodulin-tags, FLAG-tags, hemagglutinin (HA)-tags, polyhistidine tags, also referred to as histidine tags or His-tags, maltose binding protein (MBP)-tags, nus-tags, glutathione-S-transferase (GST)-tags, green fluorescent protein (GFP)-tags, thioredoxin-tags, S-tags, Softags (e.g., Softag 1, Softag 3), strep-tags, biotin ligase tags, FlAsH tags, V5 tags, and SBP-tags.
  • the fusion protein comprises one or more His tags.
  • the activity of the prime editing system may be temporally regulated by adjusting the residence time, the amount, and/or the activity of the expressed components of the PE system.
  • the PE may be fused with a protein domain that is capable of modifying the intracellular half-life of the PE.
  • the activity of the PE system may be temporally regulated by controlling the timing in which the vectors are delivered.
  • a vector encoding the nuclease system may deliver the PE prior to the vector encoding the template.
  • the vector encoding the PEgRNA may deliver the guide prior to the vector encoding the PE system.
  • the vectors encoding the PE system and PEgRNA are delivered simultaneously.
  • the simultaneously delivered vectors temporally deliver, e.g., the PE, PEgRNA, and/or second strand guide RNA components.
  • the RNA (such as, e.g., the nuclease transcript) transcribed from the coding sequence on the vectors may further comprise at least one element that is capable of modifying the intracellular half-life of the RNA and/or modulating translational control.
  • the half-life of the RNA may be increased.
  • the half-life of the RNA may be decreased.
  • the element may be capable of increasing the stability of the RNA.
  • the element may be capable of decreasing the stability of the RNA.
  • the element may be within the 3′ UTR of the RNA.
  • the element may include a polyadenylation signal (PA).
  • the element may include a cap, e.g., an upstream mRNA or PEgRNA end.
  • the RNA may comprise no PA such that it is subject to quicker degradation in the cell after transcription.
  • the element may include at least one AU-rich element (ARE).
  • the AREs may be bound by ARE binding proteins (ARE-BPs) in a manner that is dependent upon tissue type, cell type, timing, cellular localization, and environment.
  • the destabilizing element may promote RNA decay, affect RNA stability, or activate translation.
  • the ARE may comprise 50 to 150 nucleotides in length.
  • the ARE may comprise at least one copy of the sequence AUUUA.
  • At least one ARE may be added to the 3′ UTR of the RNA.
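  • The ARE features mentioned above (a length of 50 to 150 nucleotides and at least one AUUUA copy) can be checked with a short sketch. Combining the two features into a single test is an assumption made for illustration, since the text presents them as separate options. The function names are hypothetical.

```python
# Illustrative sketch only; combines two separately stated ARE features.
import re


def count_auuua(rna: str) -> int:
    """Count AUUUA pentamers, including overlapping ones."""
    # Accept DNA-style input by converting T to U before scanning.
    seq = rna.upper().replace("T", "U")
    # A zero-width lookahead lets overlapping pentamers each be counted.
    return len(re.findall(r"(?=AUUUA)", seq))


def looks_like_are(rna: str, min_len: int = 50, max_len: int = 150) -> bool:
    """True if the candidate is 50-150 nt long and has at least one AUUUA."""
    return min_len <= len(rna) <= max_len and count_auuua(rna) >= 1
```

  • For example, the nine-nucleotide string AUUUAUUUA contains two overlapping AUUUA copies, but it would not pass looks_like_are because it is shorter than 50 nucleotides.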
  • the element may be a Woodchuck Hepatitis Virus (WHP) Posttranscriptional Regulatory Element (WPRE).
  • the element is a modified and/or truncated WPRE sequence that is capable of enhancing expression from the transcript, as described, for example in Zufferey et al., J. Virol., 73(4): 2886-92 (1999) and Flajolet et al., J. Virol., 72(7): 6175-80 (1998).
  • the WPRE or equivalent may be added to the 3′ UTR of the RNA.
  • the element may be selected from other RNA sequence motifs that are enriched in either fast- or slow-decaying transcripts.
  • the vector encoding the PE or the PEgRNA may be self-destroyed via cleavage of a target sequence present on the vector by the PE system. The cleavage may prevent continued transcription of a PE or a PEgRNA from the vector. Although transcription may occur on the linearized vector for some amount of time, the expressed transcripts or proteins subject to intracellular degradation will have less time to produce off-target effects without continued supply from expression of the encoding vectors.
  • the present disclosure also provides pharmaceutical compositions comprising any of the guide RNAs (including, e.g., PEgRNAs, ePEgRNAs, and second strand nicking gRNAs), fusion proteins, and polynucleotides described herein.
  • the term “pharmaceutical composition,” as used herein, refers to a composition formulated for pharmaceutical use.
  • the pharmaceutical composition further comprises a pharmaceutically acceptable carrier.
  • the pharmaceutical composition comprises additional agents (e.g., for specific delivery, increasing half-life, or other therapeutic compounds).
  • the term “pharmaceutically-acceptable carrier” means a pharmaceutically-acceptable material, composition or vehicle, such as a liquid or solid filler, diluent, excipient, manufacturing aid (e.g., lubricant, talc, magnesium, calcium or zinc stearate, or stearic acid), or solvent encapsulating material, involved in carrying or transporting the compound from one site (e.g., the delivery site) of the body, to another site (e.g., an organ, tissue, or other part of the body).
  • a pharmaceutically acceptable carrier is “acceptable” in the sense of being compatible with the other ingredients of the formulation and not injurious to the tissue of the subject (e.g., physiologically compatible, sterile, physiologic pH, etc.).
  • materials that can serve as pharmaceutically-acceptable carriers include: (1) sugars, such as lactose, glucose and sucrose; (2) starches, such as corn starch and potato starch; (3) cellulose, and its derivatives, such as sodium carboxymethyl cellulose, methylcellulose, ethyl cellulose, microcrystalline cellulose and cellulose acetate; (4) powdered tragacanth; (5) malt; (6) gelatin; (7) lubricating agents, such as magnesium stearate, sodium lauryl sulfate and talc; (8) excipients, such as cocoa butter and suppository waxes; (9) oils, such as peanut oil, cottonseed oil, safflower oil, sesame oil, olive oil, corn oil, and soybean oil; (10) glycol
  • the pharmaceutical composition is formulated for delivery to a subject, e.g., for gene editing.
  • Suitable routes of administering the pharmaceutical composition described herein include, without limitation: topical, subcutaneous, transdermal, intradermal, intralesional, intraarticular, intraperitoneal, intravesical, transmucosal, gingival, intradental, intracochlear, transtympanic, intraorgan, epidural, intrathecal, intramuscular, intravenous, intravascular, intraosseus, periocular, intratumoral, intracerebral, and intracerebroventricular administration.
  • the pharmaceutical composition described herein is administered locally to a diseased site (e.g., tumor site).
  • the pharmaceutical composition described herein is administered to a subject by injection, by means of a catheter, by means of a suppository, or by means of an implant, the implant being of a porous, non-porous, or gelatinous material, including a membrane, such as a silastic membrane, or a fiber.
  • the pharmaceutical composition described herein is delivered in a controlled release system.
  • a pump may be used (see, e.g., Langer, 1990, Science 249:1527-1533; Sefton, 1989, CRC Crit. Ref. Biomed.
  • polymeric materials can be used (see, e.g., Medical Applications of Controlled Release (Langer and Wise eds., CRC Press, Boca Raton, Fla., 1974); Controlled Drug Bioavailability, Drug Product Design and Performance (Smolen and Ball eds., Wiley, New York, 1984); Langer and Peppas, 1983, Macromol. Sci. Rev. Macromol. Chem. 23:61).
  • the pharmaceutical composition is formulated in accordance with routine procedures as a composition adapted for intravenous or subcutaneous administration to a subject, e.g., a human.
  • pharmaceutical compositions for administration by injection are solutions in sterile isotonic aqueous buffer.
  • the pharmaceutical composition can also include a solubilizing agent and a local anesthetic such as lignocaine to ease pain at the site of the injection.
  • the ingredients are supplied either separately or mixed together in unit dosage form, for example, as a dry lyophilized powder or water-free concentrate in a hermetically sealed container such as an ampoule or sachet indicating the quantity of active agent.
  • Where the pharmaceutical composition is to be administered by infusion, it can be dispensed with an infusion bottle containing sterile pharmaceutical grade water or saline.
  • an ampoule of sterile water for injection or saline can be provided so that the ingredients can be mixed prior to administration.
  • a pharmaceutical composition for systemic administration may be a liquid, e.g., sterile saline, lactated Ringer’s or Hank’s solution.
  • the pharmaceutical composition can be in solid forms and re-dissolved or suspended immediately prior to use. Lyophilized forms are also contemplated.
  • the pharmaceutical composition can be contained within a lipid particle or vesicle, such as a liposome or microcrystal, which is also suitable for parenteral administration.
  • the particles can be of any suitable structure, such as unilamellar or plurilamellar, so long as compositions are contained therein.
  • SPLP stabilized plasmid-lipid particles
  • DOPE fusogenic lipid dioleoylphosphatidylethanolamine
  • PEG polyethyleneglycol
  • Positively charged lipids such as N-[1-(2,3-dioleoyloxy)propyl]-N,N,N-trimethylammonium methylsulfate, or “DOTAP,” are particularly preferred for such particles and vesicles.
  • compositions described herein may be administered or packaged as a unit dose, for example.
  • unit dose when used in reference to a pharmaceutical composition of the present disclosure refers to physically discrete units suitable as unitary dosage for the subject, each unit containing a predetermined quantity of active material calculated to produce the desired therapeutic effect in association with the required diluent; i.e., carrier, or vehicle.
  • the pharmaceutical composition can be provided as a pharmaceutical kit comprising (a) a container containing a compound of the invention in lyophilized form and (b) a second container containing a pharmaceutically acceptable diluent (e.g., sterile water) for injection.
  • the pharmaceutically acceptable diluent can be used for reconstitution or dilution of the lyophilized compound of the invention.
  • Optionally associated with such container(s) can be a notice in the form prescribed by a governmental agency regulating the manufacture, use, or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use, or sale for human administration.
  • an article of manufacture containing materials useful for the treatment of the diseases described above is included.
  • the article of manufacture comprises a container and a label.
  • Suitable containers include, for example, bottles, vials, syringes, and test tubes.
  • the containers may be formed from a variety of materials such as glass or plastic.
  • the container holds a composition that is effective for treating a disease and may have a sterile access port.
  • the container may be an intravenous solution bag or a vial having a stopper pierce-able by a hypodermic injection needle.
  • the active agent in the composition is a compound of the invention.
  • the label on or associated with the container indicates that the composition is used for treating the disease of choice.
  • the article of manufacture may further comprise a second container comprising a pharmaceutically acceptable buffer, such as phosphate-buffered saline, Ringer’s solution, or dextrose solution. It may further include other materials desirable from a commercial and user standpoint, including other buffers, diluents, filters, needles, syringes, and package inserts with instructions for use.
  • Kits and Cells
  • [0593] The guide RNAs (including pegRNAs and epegRNAs), fusion proteins, and compositions of the present disclosure may be assembled into kits.
  • the kit comprises polynucleotides for expression of the prime editors and/or pegRNAs, epegRNAs, and second strand nicking gRNAs described herein.
  • the kit further comprises appropriate guide nucleotide sequences or nucleic acid vectors for the expression of such guide nucleotide sequences, to target the Cas9 protein of the prime editors to the desired target sequence (e.g., a gene associated with a triplet repeat disorder, such as the HTT or FXN genes).
  • the kits described herein may include one or more containers housing components for performing the methods described herein, and optionally instructions for use. Any of the kits described herein may further comprise components needed for performing the prime editing methods described herein. Each component of the kits, where applicable, may be provided in liquid form (e.g., in solution) or in solid form (e.g., a dry powder).
  • kits may optionally include instructions and/or promotion for use of the components provided.
  • instructions can define a component of instruction and/or promotion, and typically involve written instructions on or associated with packaging of the disclosure. Instructions also can include any oral or electronic instructions provided in any manner such that a user will clearly recognize that the instructions are to be associated with the kit, for example, audiovisual (e.g., videotape, DVD, etc.), Internet, and/or web-based communications, etc.
  • the written instructions may be in a form prescribed by a governmental agency regulating the manufacture, use, or sale of pharmaceuticals or biological products, which can also reflect approval by the agency of manufacture, use, or sale for animal administration.
  • “promoted” includes all methods of doing business including methods of education, hospital and other clinical instruction, scientific inquiry, drug discovery or development, academic research, pharmaceutical industry activity including pharmaceutical sales, and any advertising or other promotional activity including written, oral, and electronic communication of any form, associated with the disclosure.
  • the kits may include other components depending on the specific application, as described herein. [0596]
  • the kits may contain any one or more of the components described herein in one or more containers.
  • the components may be prepared sterilely, packaged in a syringe, and shipped refrigerated.
  • kits may be housed in a vial or other container for storage.
  • a second container may have other components prepared sterilely.
  • the kits may include the active agents premixed and shipped in a vial, tube, or other container.
  • the kits may have a variety of forms, such as a blister pouch, a shrink-wrapped pouch, a vacuum sealable pouch, a sealable thermoformed tray, or a similar pouch or tray form, with the accessories loosely packed within the pouch, one or more tubes, containers, a box, or a bag.
  • the kits may be sterilized after the accessories are added, thereby allowing the individual accessories in the container to be otherwise unwrapped.
  • kits can be sterilized using any appropriate sterilization techniques, such as radiation sterilization, heat sterilization, or other sterilization methods known in the art.
  • the kits may also include other components, depending on the specific application, for example, containers, cell media, salts, buffers, reagents, syringes, needles, a fabric, such as gauze, for applying or removing a disinfecting agent, disposable gloves, a support for the agents prior to administration, etc.
  • kits comprising a nucleic acid construct comprising a nucleotide sequence encoding the prime editor systems described herein, or various components thereof (e.g., including, but not limited to, the napDNAbps, reverse transcriptase domains, and pegRNAs/epegRNAs/second strand nicking gRNAs).
  • the nucleotide sequence(s) comprises a heterologous promoter (or more than a single promoter) that drives expression of the prime editor system components.
  • kits comprising one or more nucleic acid constructs encoding the various components of the prime editing system described herein.
  • the nucleotide sequence comprises a heterologous promoter that drives expression of the prime editing system components.
  • Cells that may contain any of the guide RNAs, fusion proteins, and compositions described herein include prokaryotic cells and eukaryotic cells. The methods described herein may be used to deliver a prime editor and/or guide RNA into a eukaryotic cell (e.g., a mammalian cell, such as a human cell).
  • the cell is in vitro (e.g., a cultured cell).
  • the cell is in vivo (e.g., in a subject, such as a human subject).
  • the cell is ex vivo (e.g., isolated from a subject and may be administered back to the same or a different subject).
  • Mammalian cells of the present disclosure include human cells, primate cells (e.g., vero cells), rat cells (e.g., GH3 cells, OC23 cells) or mouse cells (e.g., MC3T3 cells).
  • human cell lines including, without limitation, human embryonic kidney (HEK) cells, HeLa cells, cancer cells from the National Cancer Institute's 60 cancer cell lines (NCI60), DU145 (prostate cancer) cells, LNCaP (prostate cancer) cells, MCF-7 (breast cancer) cells, MDA-MB-438 (breast cancer) cells, PC3 (prostate cancer) cells, T47D (breast cancer) cells, THP-1 (acute myeloid leukemia) cells, U87 (glioblastoma) cells, SHSY5Y human neuroblastoma cells (cloned from a myeloma) and Saos-2 (bone cancer) cells.
  • prime editors and/or guide RNAs are delivered into human embryonic kidney (HEK) cells (e.g., HEK293 or HEK293T cells).
  • prime editors and/or guide RNAs are delivered into stem cells (e.g., human stem cells), such as, for example, pluripotent stem cells (e.g., human pluripotent stem cells including human induced pluripotent stem cells (hiPSCs)).
  • stem cell refers to a cell with the ability to divide for indefinite periods in culture and to give rise to specialized cells.
  • a pluripotent stem cell refers to a type of stem cell that is capable of differentiating into all tissues of an organism, but not alone capable of sustaining full organismal development.
  • a human induced pluripotent stem cell refers to a somatic (e.g., mature or adult) cell that has been reprogrammed to an embryonic stem cell-like state by being forced to express genes and factors important for maintaining the defining properties of embryonic stem cells (see, e.g., Takahashi and Yamanaka, Cell 126 (4): 663–76, 2006, incorporated by reference herein).
  • Human induced pluripotent stem cells express stem cell markers and are capable of generating cells characteristic of all three germ layers (ectoderm, endoderm, mesoderm).
  • a host cell is transiently or non-transiently transfected with one or more vectors described herein.
  • a cell is transfected as it naturally occurs in a subject.
  • a cell that is transfected is taken from a subject.
  • the cell is derived from cells taken from a subject, such as a cell line. A wide variety of cell lines for tissue culture are known in the art.
  • cell lines include, but are not limited to, C8161, CCRF-CEM, MOLT, mIMCD-3, NHDF, HeLa-S3, Huh1, Huh4, Huh7, HUVEC, HASMC, HEKn, HEKa, MiaPaCell, Panc1, PC-3, TF1, CTLL-2, C1R, Rat6, CV1, RPTE, A10, T24, J82, A375, ARH-77, Calu1, SW480, SW620, SKOV3, SK-UT, CaCo2, P388D1, SEM-K2, WEHI-231, HB56, TIB55, Jurkat, J45.01, LRMB, Bcl-1, BC-3, IC21, DLD2, Raw264.7, NRK, NRK-52E, MRC5, MEF, Hep G2, HeLa B, HeLa T4, COS, COS-1, COS-6, COS-M6A, BS-C-1 monkey kidney epithelial, BALB
  • Cell lines are available from a variety of sources known to those with skill in the art (see, e.g., the American Type Culture Collection (ATCC) (Manassas, VA)).
  • a cell transfected with one or more vectors described herein is used to establish a new cell line comprising one or more vector-derived sequences.
  • a cell transiently transfected with the components of a CRISPR system as described herein (such as by transient transfection of one or more vectors, or transfection with RNA), and modified through the activity of a CRISPR complex, is used to establish a new cell line comprising cells containing the modification but lacking any other exogenous sequence.
  • cells transiently or non-transiently transfected with one or more vectors described herein, or cell lines derived from such cells are used in assessing one or more test compounds.
  • PE2 prime editing strategy
  • pegRNA prime editing sgRNA
  • RTT homology + synthesis template sequence
  • pegRNAs with 3′ structural motifs (evopreq1 and mpknot) 26 were tested with linker sequences that were either rationally designed or computed by pegLIT 26 , and an approximately 2-fold increase in editing efficiency was observed after applying the evopreq1 motif with a rationally designed linker sequence (FIGs.5A-5B). All designs moving forward include the evopreq1 motif and are referred to as “epegRNA”. The length of the epegRNA homology sequence was then revisited, and it was determined that epegRNAs with homology of 31 and 40 nt are most efficient (FIG.6). It was also validated that the “PEmax” optimized editor architecture resulted in better prime editing at this locus (FIG.7).
  • the optimized epegRNA architecture was used in a PE3 system (N5 nick), and different variants of the sequence were screened, replacing CAG repeats (from 2 CAGs to 9 CAGs, either as pure CAG codons or in a combination with interrupting CAA codons – “CARs”, where R stands for “either A or G”). It was observed that replacement of the repeats with 4CAG units results in the highest editing efficiency (FIGs.10A-10B). To further maximize editing efficiency and maximize the editing:indel ratio in the PE3 system, a few more nicking guides were screened, either recognizing the canonical NGG or non-canonical NGA PAMs.
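The "CAR" notation used above (CAG or CAA at each position) can be illustrated with a short sketch. Both CAG and CAA encode glutamine in the standard genetic code, so any CAR tract of a given length encodes the same polyglutamine stretch. The function names below are hypothetical and are not taken from the disclosure; the sketch only enumerates candidate tracts and does not model editing efficiency.

```python
from itertools import product

# Standard genetic code entries for the two codons covered by "CAR".
CODON_TO_AA = {"CAG": "Q", "CAA": "Q"}


def car_variants(n_codons):
    """Yield every DNA tract of n_codons built from CAG and CAA codons."""
    for combo in product(("CAG", "CAA"), repeat=n_codons):
        yield "".join(combo)


def translate(tract):
    """Translate a CAR-only tract codon by codon (hypothetical helper)."""
    return "".join(CODON_TO_AA[tract[i:i + 3]] for i in range(0, len(tract), 3))


# The pure 4-CAG replacement is one of the 2**4 possible 4-codon CAR tracts.
pure_4cag = "CAG" * 4
variants = list(car_variants(4))
```

Every tract in `variants` translates to four glutamines, which is why CAA interruptions change the DNA sequence without changing the encoded protein.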
  • the N5 nicking guide offers the highest editing efficiency, but the N23b nicking guide can be a safer alternative (resulting in fewer indels, but also slightly decreased correct editing) if using a PE3b editing system is desired (FIG.11) 12 .
  • the efficiency of epegRNAs with a regular scaffold was compared to ones with a modified scaffold containing a UA flip. This improvement was implemented for all future applications (FIGs.12A-12B) 30 .
  • a screen of various different prime editors that employ evolved or rationally designed reverse transcriptase (RT) components was performed next.
  • RNAseH domain e.g., between D497 and I498 of SEQ ID NO: 33
  • V223Y
  • pRT-5.800max Tf1 RT
  • these two editors are much smaller in size, which is advantageous for in vivo applications. This trend was true for short and long CAG replacement variants (FIGs.13A-13B and 14).
  • epegRNA reducing CAG repeat size to 9CARs encoding additional GC2+CT2 mutations improves editing > 2-fold over the previous epegRNA design (homology 31nt or 40nt, PBS 10nt, epegRNA, UA scaffold flip) in the PE3 system (N5 nicking guide) (FIGs.19A-19B).
  • the final prime editing strategy to reduce the size of CAG repeats in the HTT locus employs the following improvements: PE3 or PE3b system 12 , PEmax 27 , epegRNA 26 , UA scaffold flip 30 , pegRNA RTT sequence modification, and PE RT variants.
  • FIGs.29A-29B provide an updated summary of the PE strategy to reduce CAG repeats in view of the additional Cas9 variants tested.
  • transgenic mouse embryonic stem cell lines containing the human HTT exon 1 were generated.
  • Using Tol2 transposase, two cell lines expressing HTT exon 1 with either 21 or 72 pure CAG repeats 31 , ending in a terminal ‘CAACAG’ sequence 32,33 , were generated.
  • These transgenic cell lines allowed rapid screening of genome editing at both non-pathogenic (21 CAG) and pathogenic (72 CAG) repeat lengths, both of which are larger than the CAG repeats at endogenous HTT alleles typically found in wild-type human cell lines (typically 16-18 CAG) (FIGs.30 and 31A-31B).
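As an illustration of the repeat structure described for these transgenic lines, the following minimal sketch builds a tract of pure CAG repeats followed by the terminal CAACAG and then recovers the length of the longest uninterrupted CAG run. It assumes the stated repeat count (21 or 72) refers to the pure run that precedes CAACAG; the text does not spell this out, and the helper names are hypothetical.

```python
import re


def build_tract(n_pure_cag):
    """Return n pure CAG repeats followed by the terminal CAACAG sequence."""
    return "CAG" * n_pure_cag + "CAACAG"


def count_pure_cag(seq):
    """Return the length, in repeats, of the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", seq)
    return max((len(run) // 3 for run in runs), default=0)


non_pathogenic = build_tract(21)
pathogenic = build_tract(72)
```

Under this assumption each tract encodes two more glutamines than its pure CAG count, because the terminal CAA and CAG codons also encode glutamine.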
  • an Htt.Q111 mouse model was used that carries a human pathogenic HTT allele with ~111 glutamines and exhibits somatic repeat instability in the CNS (including the striatum and cortex) and the liver 71–73 .
  • a dual-ssAAV delivery system was generated for the HTT-targeting prime editing strategy to replace the pathogenic number of CAG repeats in Htt.Q111 with either 6 or 11 glutamines 54,74,75 .
  • intracerebroventricular (ICV) injections were performed in Htt.Q111 neonates using ssAAV9 virus and the htt-v1 strategy (dual ssAAV9-PE-v1).
  • Four weeks post injection, no appreciable prime editing was observed above background (FIG.33).
  • the AAV architecture was reevaluated, and a new prime editing strategy (htt-v2) that utilizes additional improvements to the prime editor and pegRNA was used (FIGs.34A-34B).
  • Htt.Q111 neonates were injected by P0 ICV with a regular dose of dual ssAAV9-PE-v2 (4.0 × 10^10 vg total) 54 , and editing was evaluated four weeks post injection. Approximately 1% of alleles in the cortex and over 3% of alleles in the liver acquired the desired prime edit (FIGs.35A-35C).
  • the alternative PE3b prime editing strategy was next evaluated (htt-v3, described above), which yielded good editing efficiency with much lower indels (FIGs.38A-38C).
  • the goal was to reduce undesired editing outcomes and possibly increase editing efficiency in vivo (FIGs.38A-38C).
  • the mice were injected by P0 ICV with a high dose (1.6 × 10^11 vg total) of the dual ssAAV9-PE-v3 to compensate for the lower editing efficiency of the PE3b system compared to PE3 observed in vitro, and the editing outcomes were assessed 4 weeks post injection.
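For orientation, the two total doses quoted for the dual ssAAV9 injections can be compared directly. The sketch below reads the quoted values as 4.0 × 10^10 vg (regular dose) and 1.6 × 10^11 vg (high dose); the variable names are illustrative only.

```python
# Total vector genomes (vg) per injection, as read from the text.
regular_dose_vg = 4.0e10   # dual ssAAV9-PE-v2, regular dose
high_dose_vg = 1.6e11      # dual ssAAV9-PE-v3, high dose

# Fold increase of the high dose over the regular dose.
fold_increase = high_dose_vg / regular_dose_vg
```

On this reading, the high dose is four times the regular dose.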
  • This unexpected editing outcome may be caused by 1) DNA damage repair mechanisms resolving the transient DSB at the edited site before editing is complete; 2) unsuccessful flap resolution after the desired sequence (edit) gets incorporated into the DNA; or 3) inefficient prime editing in the transduced cells caused by a non-optimal strategy design and/or the complex character of the edit.
  • additional evaluation of the prime editing strategies in vitro was performed by testing new PE variants to further improve editing. It was found that some strategies (htt-v4a and htt-v4b), have the potential to perform better in the endogenous HTT context. These strategies were shown to yield slightly increased editing efficiency in vitro (FIGs.40A-40B).
  • the number of FXN GAA repeats in the general population ranges from ~5 to 60, while FRDA patients may present with 66 to well over 1200 repeats, typically ranging from 600 to 900 repeats 57 .
  • the age of FRDA onset in patients, loss of FXN protein, and severity of symptoms are inversely correlated with the GAA repeat length of the shortest FXN allele (FIG.41).
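The repeat ranges quoted above can be expressed as a simple lookup. This is a minimal sketch that uses only the approximate boundaries given in the text (about 5 to 60 repeats in the general population, 66 or more in FRDA patients); it is not a diagnostic rule, the lower bound is approximate, and the function name and labels are hypothetical.

```python
def classify_gaa_repeats(n_repeats):
    """Map a GAA repeat count onto the ranges stated in the text."""
    if n_repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if n_repeats >= 66:
        return "FRDA range"
    if 5 <= n_repeats <= 60:
        return "general-population range"
    # Counts below ~5 or between 61 and 65 are not assigned by the text.
    return "outside the ranges stated in the text"


# Repeat counts mentioned elsewhere in the examples.
labels = {n: classify_gaa_repeats(n) for n in (9, 30, 300, 800)}
```

Counts that fall between the two quoted ranges are deliberately left unclassified, since the text does not assign them to either group.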
  • the first intron in FXN plays a key role in transcriptional regulation, and GAA repeat expansion induces epigenetic changes to the chromatin state that result in silencing of gene expression 81-83 .
  • PE2 prime editing strategies
  • pegRNAs with 3′ structural motifs (evopreq1 and mpknot) 89 were tested with rationally designed linker sequences or linker sequences computed by pegLIT 89 , and a slight increase in editing efficiency was observed after adding an evopreq1 motif with a pegLIT-predicted linker sequence (FIGs.45A-45B).
  • pegRNA designs from here forward include the evopreq1 motif and are referred to as “epegRNA.”
  • epegRNA evopreq1 motif
  • It was then validated that use of the PEmax-optimized editor architecture results in slightly better prime editing at this locus (FIG.46), but no significant improvement in editing was observed when using it in combination with overexpression of dominant negative MLH1 protein (PE4 and PE5, without and with an additional nicking sgRNA, respectively) (FIG.47) 90 . From here forward, all conditions discussed were performed with PEmax editors.
  • V223Y RHdelta
  • a cell line expressing FXN intron 1 with 30 GAA repeats was generated.
  • This transgenic cell line facilitated rapid screening of genome editing at non-pathogenic yet larger GAA repeats than at endogenous FXN alleles typically found in wild type human cell lines (typically 8-18 GAAs) (FIGs.53 and 54).
  • Up to 60% and 55% average prime editing was achieved using all the improvements to the pegRNA (epegRNA, UA scaffold flip) and PE (PEmax, V223Y RHdelta) described above in the PE2 and PE3b system, respectively. This strategy also resulted in 70% precise deletion in transgenic cells (FIGs.54 and 55).
  • Cas9 comprising an N863A mutation abolished editing activity.
  • Cas9-1 and Cas9-8 were included in further optimizations in cell models with pathogenic GAA repeats.
  • PEmax utilizing V223Y RNaseH-truncated MMLV reverse transcriptase or pRT-5.800max (i.e., a prime editor comprising Tf1 RT)
  • pegRNA with homology of 40 nucleotides resulted in particularly good editing efficiency in FXN mESC cells, and use of an extra nicking guide did not improve editing efficiency further. Additionally, editing of longer alleles (30, 60, and 200 GAAs) was more efficient than editing in HEK293T cells (9GAAs) (FIG.59).
  • PEmax with V223Y RNaseH-truncated MMLV reverse transcriptase performed similarly to or better than standard PE2max but has the advantage of being smaller (FIG.58B).
  • the prime edited fibroblast cells were used to assess phenotypic rescue, as measured by FXN expression (RNA expression measured by ddPCR) (FIGs.61A-61C). FXN expression was found to be highly reduced in FRDA patient-derived cells (FIG.61C).
  • the prime editing result (FIG.61B) was compared to rescue in cells edited with a Cas9 nuclease strategy (FIG. 61A), previously published by Rocca et al., Methods and Clinical Development, 2020, 17, 1026-1036.
  • prime editing has therapeutic potential for rescuing the main phenotypic trait of FRDA (reduced FXN expression).
  • a YG8s mouse model was used (Virmouni et al., Dis Model Mech, 2015, 8(3), 225-235).
  • Before testing the V223Y RNaseH-truncated MMLV reverse transcriptase prime editor, PEmax together with the leading pegRNA design (and no nicking guide RNA) was packaged (FIG.62A) and tested in vivo in two mouse models of FRDA, with either 300 GAAs or 800 GAAs (FIG.62B).
  • the mice were treated by injecting dual-AAV on postnatal day 0 via ICV (intracerebroventricular injection) or on postnatal day 1 via FVI (facial vein injection). It was determined that the AAV9 FXN-v1 strategy shows promising results when delivered systemically. On average, 10% editing in the liver was achieved (FIG.62B).
  • the optimized prime editing strategy included the pegRNA design described herein, PE3b, and PEmax architecture with V223Y RNaseH-truncated MMLV reverse transcriptase.
  • the AAV architecture described in Davis et al., NBME, 2023 was also included to optimize delivery (FIG.63A). Mice were treated with dual-AAV9 by P0 ICV injection (FIG.63B), and editing was measured across the CNS and systemic tissues after 4 and 8 weeks (FIG.63C).
  • This improved prime editing AAV delivery strategy yielded prime editing efficiencies in the liver of approximately 15% and in the heart of approximately 10% and showed non-zero editing in the cortex (FIG.63C).
  • the transduction efficiency of the AAV9 FXN-v2 strategy was then evaluated in the cortex, and editing efficiency was evaluated across different tissues in transduced and bulk nuclei (FIGs.64A-64C). Additionally, the mice were treated with an increased dose of the AAV9 FXN-v2 to determine the effect of dose on prime editing.
  • the FXN-v2 strategy significantly increased transduction efficiency (FIG.64C).
  • the invention includes embodiments in which exactly one member of the group is present in, employed in, or otherwise relevant to a given product or process.
  • the invention includes embodiments in which more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process.
  • the invention encompasses all variations, combinations, and permutations in which one or more limitations, elements, clauses, and descriptive terms from one or more of the listed claims is introduced into another claim.
  • any claim that is dependent on another claim can be modified to include one or more limitations found in any other claim that is dependent on the same base claim.
  • elements are presented as lists, e.g., in Markush group format, each subgroup of the elements is also disclosed, and any element(s) can be removed from the group.

Abstract

The present disclosure provides compositions and methods useful in the treatment of trinucleotide repeat disorders, including Huntington's disease and Friedreich's ataxia. The present disclosure also provides pegRNAs designed to target the HTT or FXN genes. The present disclosure also provides complexes, compositions, and systems comprising a prime editor and any of the pegRNAs of the present disclosure. The present disclosure further provides polynucleotides, vectors, AAVs, cells, compositions, and kits. The disclosure further provides methods of treating Huntington's disease and Friedreich's ataxia, as well as uses of the compositions, pegRNAs, and systems described herein.
PCT/US2023/076282 2022-10-07 2023-10-06 Prime editing methods and compositions for treating triplet repeat disorders WO2024077267A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202263414364P 2022-10-07 2022-10-07
US63/414,364 2022-10-07
US202363508616P 2023-06-16 2023-06-16
US63/508,616 2023-06-16

Publications (1)

Publication Number Publication Date
WO2024077267A1 true WO2024077267A1 (fr) 2024-04-11

Family

ID=88731479

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2023/076282 WO2024077267A1 (fr) 2022-10-07 2023-10-06 Prime editing methods and compositions for treating triplet repeat disorders

Country Status (1)

Country Link
WO (1) WO2024077267A1 (fr)

Citations (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4880635A (en) 1984-08-08 1989-11-14 The Liposome Company, Inc. Dehydrated liposomes
US4906477A (en) 1987-02-09 1990-03-06 Kabushiki Kaisha Vitamin Kenkyusyo Antineoplastic agent-entrapping liposomes
US4911928A (en) 1987-03-13 1990-03-27 Micro-Pak, Inc. Paucilamellar lipid vesicles
US4917951A (en) 1987-07-28 1990-04-17 Micro-Pak, Inc. Lipid vesicles formed of surfactants and steroids
US4920016A (en) 1986-12-24 1990-04-24 Linear Technology, Inc. Liposomes with enhanced circulation time
US4921757A (en) 1985-04-26 1990-05-01 Massachusetts Institute Of Technology System for delayed and pulsed release of biologically active substances
US5244797A (en) 1988-01-13 1993-09-14 Life Technologies, Inc. Cloned genes encoding reverse transcriptase lacking RNase H activity
WO2001038547A2 (fr) 1999-11-24 2001-05-31 Mcs Micro Carrier Systems Gmbh Polypeptides comprising multimers of nuclear localization signals or of protein transduction domains and their use for transferring molecules into cells
US20110059502A1 (en) 2009-09-07 2011-03-10 Chalasani Sreekanth H Multiple domain proteins
US8394604B2 (en) 2008-04-30 2013-03-12 Paul Xiang-Qin Liu Protein splicing using short terminal split inteins
WO2013045632A1 (fr) 2011-09-28 2013-04-04 Era Biotech, S.A. Split inteins and uses thereof
WO2014055782A1 (fr) 2012-10-03 2014-04-10 Agrivida, Inc. Intein-modified proteases, their production and industrial applications
EP2877490A2 (fr) 2012-06-27 2015-06-03 The Trustees Of Princeton University Split inteins, conjugates and uses thereof
WO2016069774A1 (fr) 2014-10-28 2016-05-06 Agrivida, Inc. Methods and compositions for stabilizing trans-splicing intein-modified proteases
US9458484B2 (en) 2010-10-22 2016-10-04 Bio-Rad Laboratories, Inc. Reverse transcriptase mixtures with improved storage stability
US9534201B2 (en) 2007-04-26 2017-01-03 Ramot At Tel-Aviv University Ltd. Culture of pluripotent autologous stem cells from oral mucosa
US9580698B1 (en) 2016-09-23 2017-02-28 New England Biolabs, Inc. Mutant reverse transcriptase
US20170224843A1 (en) * 2014-08-04 2017-08-10 Centre Hospitalier Universitaire Vaudois (Chuv) Genome editing for the treatment of huntington's disease
US9783791B2 (en) 2005-08-10 2017-10-10 Agilent Technologies, Inc. Mutant reverse transcriptase and methods of use
WO2018098587A1 (fr) * 2016-12-01 2018-06-07 UNIVERSITé LAVAL CRISPR-based treatment of Friedreich ataxia
US10150955B2 (en) 2009-03-04 2018-12-11 Board Of Regents, The University Of Texas System Stabilized reverse transcriptase fusion proteins
US10189831B2 (en) 2012-10-08 2019-01-29 Merck Sharp & Dohme Corp. Non-nucleoside reverse transcriptase inhibitors
US10202658B2 (en) 2005-02-18 2019-02-12 Monogram Biosciences, Inc. Methods for determining hypersusceptibility of HIV-1 to non-nucleoside reverse transcriptase inhibitors
WO2020191239A1 (fr) 2019-03-19 2020-09-24 The Broad Institute, Inc. Methods and compositions for editing nucleotide sequences
US20200318116A1 (en) * 2006-01-26 2020-10-08 Ionis Pharmaceuticals, Inc. Compositions and their uses directed to huntingtin
WO2021030344A1 (fr) * 2019-08-12 2021-02-18 Lifeedit, Inc. RNA-guided nucleases and active fragments and variants thereof and methods of use
WO2021052097A1 (fr) 2019-09-19 2021-03-25 江苏大学 System and method for measuring an oxidation characteristic parameter of a liquid mixed fuel
WO2021226558A1 (fr) 2020-05-08 2021-11-11 The Broad Institute, Inc. Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence
WO2022067130A2 (fr) 2020-09-24 2022-03-31 The Broad Institute, Inc. Prime editing guide RNAs, compositions thereof, and methods of using the same
WO2022150790A2 (fr) 2021-01-11 2022-07-14 The Broad Institute, Inc. Prime editor variants, constructs, and methods for enhancing prime editing efficiency and precision
WO2022203905A1 (fr) * 2021-03-24 2022-09-29 University Of Massachusetts Prime editing-based simultaneous genomic deletion and insertion
AU2021232005A1 (en) * 2020-03-04 2022-09-29 Flagship Pioneering Innovations Vi, Llc Methods and compositions for modulating a genome
WO2022204543A1 (fr) * 2021-03-25 2022-09-29 The Regents Of The University Of California Methods and materials for treating Huntington's disease
WO2023015309A2 (fr) 2021-08-06 2023-02-09 The Broad Institute, Inc. Improved prime editors and methods of use
WO2023076898A1 (fr) 2021-10-25 2023-05-04 The Broad Institute, Inc. Methods and compositions for editing a genome with prime editing and a recombinase
WO2023081426A1 (fr) * 2021-11-05 2023-05-11 Prime Medicine, Inc. Genome editing compositions and methods for treatment of Friedreich's ataxia

Patent Citations (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4880635B1 (en) 1984-08-08 1996-07-02 Liposome Company Dehydrated liposomes
US4880635A (en) 1984-08-08 1989-11-14 The Liposome Company, Inc. Dehydrated liposomes
US4921757A (en) 1985-04-26 1990-05-01 Massachusetts Institute Of Technology System for delayed and pulsed release of biologically active substances
US4920016A (en) 1986-12-24 1990-04-24 Linear Technology, Inc. Liposomes with enhanced circulation time
US4906477A (en) 1987-02-09 1990-03-06 Kabushiki Kaisha Vitamin Kenkyusyo Antineoplastic agent-entrapping liposomes
US4911928A (en) 1987-03-13 1990-03-27 Micro-Pak, Inc. Paucilamellar lipid vesicles
US4917951A (en) 1987-07-28 1990-04-17 Micro-Pak, Inc. Lipid vesicles formed of surfactants and steroids
US5244797A (en) 1988-01-13 1993-09-14 Life Technologies, Inc. Cloned genes encoding reverse transcriptase lacking RNase H activity
US5244797B1 (en) 1988-01-13 1998-08-25 Life Technologies Inc Cloned genes encoding reverse transcriptase lacking rnase h activity
WO2001038547A2 (fr) 1999-11-24 2001-05-31 Mcs Micro Carrier Systems Gmbh Polypeptides comprising multimers of nuclear localization signals or of protein transduction domains and their use for transferring molecules into cells
US10202658B2 (en) 2005-02-18 2019-02-12 Monogram Biosciences, Inc. Methods for determining hypersusceptibility of HIV-1 to non-nucleoside reverse transcriptase inhibitors
US9783791B2 (en) 2005-08-10 2017-10-10 Agilent Technologies, Inc. Mutant reverse transcriptase and methods of use
US20200318116A1 (en) * 2006-01-26 2020-10-08 Ionis Pharmaceuticals, Inc. Compositions and their uses directed to huntingtin
US9534201B2 (en) 2007-04-26 2017-01-03 Ramot At Tel-Aviv University Ltd. Culture of pluripotent autologous stem cells from oral mucosa
US8394604B2 (en) 2008-04-30 2013-03-12 Paul Xiang-Qin Liu Protein splicing using short terminal split inteins
US10150955B2 (en) 2009-03-04 2018-12-11 Board Of Regents, The University Of Texas System Stabilized reverse transcriptase fusion proteins
US20110059502A1 (en) 2009-09-07 2011-03-10 Chalasani Sreekanth H Multiple domain proteins
US9458484B2 (en) 2010-10-22 2016-10-04 Bio-Rad Laboratories, Inc. Reverse transcriptase mixtures with improved storage stability
WO2013045632A1 (fr) 2011-09-28 2013-04-04 Era Biotech, S.A. Split inteins and uses thereof
EP2877490A2 (fr) 2012-06-27 2015-06-03 The Trustees Of Princeton University Split inteins, conjugates and uses thereof
WO2014055782A1 (fr) 2012-10-03 2014-04-10 Agrivida, Inc. Intein-modified proteases, their production and industrial applications
US10189831B2 (en) 2012-10-08 2019-01-29 Merck Sharp & Dohme Corp. Non-nucleoside reverse transcriptase inhibitors
US20170224843A1 (en) * 2014-08-04 2017-08-10 Centre Hospitalier Universitaire Vaudois (Chuv) Genome editing for the treatment of huntington's disease
WO2016069774A1 (fr) 2014-10-28 2016-05-06 Agrivida, Inc. Methods and compositions for stabilizing trans-splicing intein-modified proteases
US9932567B1 (en) 2016-09-23 2018-04-03 New England Biolabs, Inc. Mutant reverse transcriptase
US9580698B1 (en) 2016-09-23 2017-02-28 New England Biolabs, Inc. Mutant reverse transcriptase
WO2018098587A1 (fr) * 2016-12-01 2018-06-07 UNIVERSITé LAVAL CRISPR-based treatment of Friedreich ataxia
WO2020191239A1 (fr) 2019-03-19 2020-09-24 The Broad Institute, Inc. Methods and compositions for editing nucleotide sequences
WO2021030344A1 (fr) * 2019-08-12 2021-02-18 Lifeedit, Inc. RNA-guided nucleases and active fragments and variants thereof and methods of use
WO2021052097A1 (fr) 2019-09-19 2021-03-25 江苏大学 System and method for measuring an oxidation characteristic parameter of a liquid mixed fuel
AU2021232005A1 (en) * 2020-03-04 2022-09-29 Flagship Pioneering Innovations Vi, Llc Methods and compositions for modulating a genome
WO2021226558A1 (fr) 2020-05-08 2021-11-11 The Broad Institute, Inc. Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence
WO2022067130A2 (fr) 2020-09-24 2022-03-31 The Broad Institute, Inc. Prime editing guide RNAs, compositions thereof, and methods of using the same
WO2022150790A2 (fr) 2021-01-11 2022-07-14 The Broad Institute, Inc. Prime editor variants, constructs, and methods for enhancing prime editing efficiency and precision
WO2022203905A1 (fr) * 2021-03-24 2022-09-29 University Of Massachusetts Prime editing-based simultaneous genomic deletion and insertion
WO2022204543A1 (fr) * 2021-03-25 2022-09-29 The Regents Of The University Of California Methods and materials for treating Huntington's disease
WO2023015309A2 (fr) 2021-08-06 2023-02-09 The Broad Institute, Inc. Improved prime editors and methods of use
WO2023076898A1 (fr) 2021-10-25 2023-05-04 The Broad Institute, Inc. Methods and compositions for editing a genome with prime editing and a recombinase
WO2023081426A1 (fr) * 2021-11-05 2023-05-11 Prime Medicine, Inc. Genome editing compositions and methods for treatment of Friedreich's ataxia

Non-Patent Citations (97)

* Cited by examiner, † Cited by third party
Title
"Medical Applications of Controlled Release", 1974, CRC PRESS
A. R. GRUBER ET AL., CELL, vol. 106, no. 1, 2008, pages 23 - 24
ANZALONE, A. V. ET AL.: "Search-and-replace genome editing without double-strand breaks or donor DNA", NATURE, vol. 576, 2019, pages 149 - 157, XP055980447, DOI: 10.1038/s41586-019-1711-4
AREZI, B.; HOGREFE, H.: "Novel mutations in Moloney Murine Leukemia Virus reverse transcriptase increase thermostability through tighter binding to template-primer", NUCLEIC ACIDS RES, vol. 37, 2009, pages 473 - 481, XP002556110, DOI: 10.1093/nar/gkn952
AUTIERI; AGRAWAL, J. BIOL. CHEM., vol. 273, 1998, pages 14731 - 15890
AVIDAN, O.; MEER, M. E.; OZ, I.; HIZI, A.: "The processivity and fidelity of DNA synthesis exhibited by the reverse transcriptase of bovine leukemia virus", EUROPEAN JOURNAL OF BIOCHEMISTRY, vol. 269, 2002, pages 859 - 867
BARANAUSKAS, A. ET AL.: "Generation and characterization of new highly thermostable and processive M-MuLV reverse transcriptase variants", PROTEIN ENG DES SEL, vol. 25, 2012, pages 657 - 668, XP055071799, DOI: 10.1093/protein/gzs034
BERGER ET AL., BIOCHEMISTRY, vol. 22, 1983, pages 2365 - 2372
BERKHOUT, B.; JEBBINK, M.; ZSIROS, J.: "Identification of an Active Reverse Transcriptase Enzyme Encoded by a Human Endogenous HERV-K Retrovirus", JOURNAL OF VIROLOGY, vol. 73, 1999, pages 2365 - 2375, XP002361440
BLAIN, S. W.; GOFF, S. P.: "Nuclease activities of Moloney murine leukemia virus reverse transcriptase. Mutants with altered substrate specificities", J. BIOL. CHEM., vol. 268, 1993, pages 23585 - 23592, XP055491482
BUCHWALD ET AL., SURGERY, vol. 88, 1980, pages 507
CHEN, B. ET AL.: "Dynamic Imaging of Genomic Loci in Living Human Cells by an Optimized CRISPR/Cas System", CELL, vol. 155, no. 7, 2013, pages 1479 - 1491, XP028806611, DOI: 10.1016/j.cell.2013.12.001
CHYLINSKI; LE RHUN; CHARPENTIER: "The tracrRNA and Cas9 families of type II CRISPR-Cas immunity systems", RNA BIOLOGY, vol. 10, no. 5, 2013, pages 726 - 737, XP055116068, DOI: 10.4161/rna.24321
COKOL ET AL.: "Finding nuclear localization signals", EMBO REP., vol. 1, no. 5, 2000, pages 411 - 415, XP072230221, DOI: 10.1093/embo-reports/kvd092
DAS, D.; GEORGIADIS, M. M.: "The Crystal Structure of the Monomeric Reverse Transcriptase from Moloney Murine Leukemia Virus", STRUCTURE, vol. 12, 2004, pages 819 - 829, XP025941534, DOI: 10.1016/j.str.2004.02.032
DAVIS ET AL., NBME, 2023
DAVIS, J. R. ET AL., NAT. BIOTECHNOL., 2023
DELEBECQUE ET AL.: "Organization of intracellular reactions with rationally designed RNA assemblies", SCIENCE, vol. 333, 2011, pages 470 - 474
DELTCHEVA E.; CHYLINSKI K.; SHARMA C.M.; GONZALES K.; CHAO Y.; PIRZADA Z.A.; ECKERT M.R.; VOGEL J.; CHARPENTIER E.: "CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III", NATURE, vol. 471, 2011, pages 602 - 607, XP055308803, DOI: 10.1038/nature09886
DURING ET AL., ANN. NEUROL., vol. 25, 1989, pages 351
EVANS ET AL., J. BIOL. CHEM., vol. 275, 2000, pages 9091
FENG, Q.; MORAN, J. V.; KAZAZIAN, H. H.; BOEKE, J. D.: "Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition", CELL, vol. 87, 1996, pages 905 - 916
FERRETTI; MCSHAN W.M.; AJDIC D.J.; SAVIC D.J.; SAVIC G.; LYON K.; PRIMEAUX C.; SEZATE S.; SUVOROV A.N.; KENTON S.: "Complete genome sequence of an M1 strain of Streptococcus pyogenes", PROC. NATL. ACAD. SCI. U.S.A., vol. 98, 2001, pages 4658 - 4663
FIELDS ERIC ET AL: "Gene targeting techniques for Huntington's disease", AGEING RESEARCH REVIEWS, ELSEVIER, AMSTERDAM, NL, vol. 70, 5 June 2021 (2021-06-05), XP086737835, ISSN: 1568-1637, [retrieved on 20210605], DOI: 10.1016/J.ARR.2021.101385 *
FLAJOLET ET AL., J. VIROL., vol. 72, no. 7, 1998, pages 6175 - 80
FREITAS ET AL.: "Mechanisms and Signals for the Nuclear Import of Proteins", CURRENT GENOMICS, vol. 10, no. 8, 2009, pages 550 - 7, XP055502464
GERARD, G. F. ET AL.: "The role of template-primer in protection of reverse transcriptase from thermal inactivation", NUCLEIC ACIDS RES, vol. 30, 2002, pages 3118 - 3129, XP002556108, DOI: 10.1093/nar/gkf417
GERARD, G. F., DNA, vol. 5, 1986, pages 271 - 279
GRIFFITHS, D. J.: "Endogenous retroviruses in the human genome sequence", GENOME BIOL, vol. 2, 2001, pages 1017, XP002996132
HALE; MARHAM: "The Harper Collins Dictionary of Biology", 1991, SPRINGER VERLAG
HALVAS, E. K.; SVAROVSKAIA, E. S.; PATHAK, V. K.: "Role of Murine Leukemia Virus Reverse Transcriptase Deoxyribonucleoside Triphosphate-Binding Site in Retroviral Replication and In Vivo Fidelity", JOURNAL OF VIROLOGY, vol. 74, 2000, pages 10349 - 10358
HERSCHHORN, A.; HIZI, A.: "Retroviral reverse transcriptases", CELL. MOL. LIFE SCI., vol. 67, 2010, pages 2717 - 2747, XP019837855
HERZIG, E.; VORONIN, N.; KUCHERENKO, N.; HIZI, A.: "A Novel Leu92 Mutant of HIV-1 Reverse Transcriptase with a Selective Deficiency in Strand Transfer Causes a Loss of Viral Replication", J. VIROL., vol. 89, 2015, pages 8119 - 8129
HOWARD ET AL., J. NEUROSURG., vol. 71, 1989, pages 105
IWAI ET AL.: "Highly efficient protein trans-splicing by a naturally split DnaE intein from Nostoc punctiforme", FEBS LETT, vol. 580, pages 1853 - 1858
JINEK ET AL., SCIENCE, vol. 337, 2012, pages 816 - 821
JINEK M.; CHYLINSKI K.; FONFARA I.; HAUER M.; DOUDNA J.A.; CHARPENTIER E.: "A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity", SCIENCE, vol. 337, 2012, pages 816 - 821, XP055229606, DOI: 10.1126/science.1225829
JOHANSSON ET AL.: "RNA recognition by the MS2 phage coat protein", SEM VIROL., vol. 8, no. 3, 1997, pages 176 - 185
JOST, M. ET AL.: "Titrating expression using libraries of systematically attenuated CRISPR guide RNAs", NAT. BIOTECHNOL., vol. 38, 2020, pages 355 - 364, XP037055445, DOI: 10.1038/s41587-019-0387-5
KEIJZERS ET AL., BIOSCI REP, vol. 35, no. 3, 2015, pages e00206
KOTEWICZ, M. L. ET AL., GENE, vol. 35, 1985, pages 249 - 258
KOTEWICZ, M. L.; SAMPSON, C. M.; D'ALESSIO, J. M.; GERARD, G. F.: "Isolation of cloned Moloney murine leukemia virus reverse transcriptase lacking ribonuclease H activity", NUCLEIC ACIDS RES, vol. 16, 1988, pages 265 - 277
LANGER, SCIENCE, vol. 249, 1990, pages 1527 - 1533
LEVY ET AL., SCIENCE, vol. 228, 1985, pages 190
LIM, D. ET AL.: "Crystal structure of the moloney murine leukemia virus RNase H domain", J. VIROL., vol. 80, 2006, pages 8379 - 8389
LIU, M. ET AL.: "Reverse Transcriptase-Mediated Tropism Switching in Bordetella Bacteriophage", SCIENCE, vol. 295, 2002, pages 2091 - 2094, XP002384941, DOI: 10.1126/science.1067467
LUAN, D. D.; KORMAN, M. H.; JAKUBCZAK, J. L.; EICKBUSH, T. H.: "Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition", CELL, vol. 72, 1993, pages 595 - 605, XP024245568, DOI: 10.1016/0092-8674(93)90078-5
MAGIN ET AL., VIROLOGY, vol. 274, 2000, pages 11 - 16
MAKAROVA ET AL.: "C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector", SCIENCE, vol. 353, no. 6299, 2016, XP055407082, DOI: 10.1126/science.aaf5573
MALI ET AL.: "Cas9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering", NAT. BIOTECHNOL., vol. 31, 2013, pages 833 - 838, XP055693153, DOI: 10.1038/nbt.2675
MILLS ET AL., PROC. NATL. ACAD. SCI. USA, vol. 95, 1998, pages 3543 - 3548
MOEDE ET AL., FEBS LETT., vol. 461, 1999, pages 229 - 34
MOHR, G. ET AL.: "A Reverse Transcriptase-Cas1 Fusion Protein Contains a Cas6 Domain Required for Both CRISPR RNA Biogenesis and RNA Spacer Acquisition", MOL. CELL, vol. 72, 2018, pages 700 - 714
MOHR, S. ET AL.: "Thermostable group II intron reverse transcriptase fusion proteins and their use in cDNA synthesis and next-generation RNA sequencing", RNA, vol. 19, 2013, pages 958 - 970, XP055149277, DOI: 10.1261/rna.039743.113
MONOT, C. ET AL.: "The Specificity and Flexibility of L1 Reverse Transcription Priming at Imperfect T-Tracts", PLOS GENETICS, vol. 9, 2013, pages e1003499
NISHIMASU ET AL., CELL, vol. 156, 2014, pages 935 - 949
NOTTINGHAM, R. M. ET AL.: "RNA-seq of human reference RNA samples using a thermostable group II intron reverse transcriptase", RNA, vol. 22, 2016, pages 597 - 613
NOWAK, E. ET AL.: "Structural analysis of monomeric retroviral reverse transcriptase in complex with an RNA/DNA hybrid", NUCLEIC ACIDS RES, vol. 41, 2013, pages 3874 - 3887
OSTERTAG, E. M.; KAZAZIAN JR, H. H.: "Biology of Mammalian L1 Retrotransposons", ANNUAL REVIEW OF GENETICS, vol. 35, 2001, pages 501 - 538, XP002474549
OTOMO ET AL., BIOCHEMISTRY, vol. 38, 1999, pages 16040 - 16044
OTOMO ET AL., J. BIOMOL. NMR, vol. 14, 1999, pages 105 - 114
P. A. CARR; G. M. CHURCH, NATURE BIOTECHNOLOGY, vol. 27, no. 12, 2009, pages 1151 - 62
PATEL ET AL.: "Flap endonucleases pass 5'-flaps through a flexible arch using a disorder-thread-order mechanism to confer specificity for free 5'-ends", NUCLEIC ACIDS RESEARCH, vol. 40, no. 10, 2012, pages 4507 - 4519
PERACH, M.; HIZI, A.: "Catalytic Features of the Recombinant Reverse Transcriptase of Bovine Leukemia Virus Expressed in Bacteria", VIROLOGY, vol. 259, 1999, pages 176 - 189, XP004450354, DOI: 10.1006/viro.1999.9761
PERBAL: "Controlled Drug Bioavailability, Drug Product Design and Performance", 1984, WILEY & SONS
QI ET AL.: "Repurposing CRISPR as an RNA-Guided Platform for Sequence-Specific Control of Gene Expression", CELL, vol. 152, no. 5, 2013, pages 1173 - 83, XP055346792, DOI: 10.1016/j.cell.2013.02.022
RANGER; PEPPAS, MACROMOL. SCI. REV. MACROMOL. CHEM., vol. 23, 1983, pages 61
ROCCA ET AL., METHODS AND CLINICAL DEVELOPMENT, vol. 17, 2020, pages 1026 - 1036
SAUDEK ET AL., N. ENGL. J. MED., vol. 321, 1989, pages 574
SAUNDERS; SAUNDERS: "Microbial Genetics Applied to Biotechnology", 1987, CROOM HELM
SCOTT ET AL., PROC. NATL. ACAD. SCI. USA, vol. 96, 1999, pages 13638 - 13643
SEFTON, CRC CRIT. REF. BIOMED. ENG., vol. 14, 1989, pages 201
SHAH ET AL., CHEM. SCI., vol. 5, no. 1, 2014, pages 446 - 461
SHAH ET AL.: "Protospacer recognition motifs: mixed identities and functional diversity", RNA BIOLOGY, vol. 10, no. 5, pages 891 - 899
SHINGLEDECKER ET AL., GENE, vol. 207, 1998, pages 187
SINGLETON ET AL.: "Dictionary of Microbiology and Molecular Biology", 1994
SOUTHWORTH ET AL., EMBO J., vol. 17, 1998, pages 918
STAMOS, J. L.; LENTZSCH, A. M.; LAMBOWITZ, A. M.: "Structure of a Thermostable Group II Intron Reverse Transcriptase with Template-Primer and Its Functional and Evolutionary Implications", MOLECULAR CELL, vol. 68, 2017, pages 926 - 939
STEVENS ET AL., J. AM. CHEM. SOC., vol. 138, no. 7, 24 February 2016 (2016-02-24), pages 2162 - 5
STEVENS ET AL.: "A promiscuous split intein with expanded protein engineering applications", PNAS, vol. 114, 2017, pages 8538 - 8543, XP055661453, DOI: 10.1073/pnas.1701083114
TAKAHASHI; YAMANAKA, CELL, vol. 126, no. 4, 2006, pages 663 - 76
TAUBE, R.; LOYA, S.; AVIDAN, O.; PERACH, M.; HIZI, A.: "Reverse transcriptase of mouse mammary tumour virus: expression in bacteria, purification and biochemical characterization", BIOCHEM. J., vol. 329, 1998, pages 579 - 587, XP055980374, DOI: 10.1042/bj3290579
TELESNITSKY, A.; GOFF, S. P.: "RNase H domain mutations affect the interaction between Moloney murine leukemia virus reverse transcriptase and its primer-template", PROC. NATL. ACAD. SCI. U.S.A., vol. 90, 1993, pages 1276 - 1280
TINLAND ET AL., PROC. NATL. ACAD. SCI. U.S.A., vol. 89, 1992, pages 7442 - 46
TSUTAKAWA ET AL.: "Human flap endonuclease structures, DNA double-base flipping, and a unified understanding of the FEN1 superfamily", CELL, vol. 145, no. 2, 2011, pages 198 - 211, XP028194588, DOI: 10.1016/j.cell.2011.03.004
VIRMOUNI ET AL., DIS MODEL MECH, vol. 8, no. 3, 2015, pages 225 - 235
WU ET AL., BIOCHIM. BIOPHYS. ACTA, vol. 35732, 1998, pages 1
XIONG, Y.; EICKBUSH, T. H.: "Origin and evolution of retroelements based upon their reverse transcriptase sequences", EMBO J, vol. 9, 1990, pages 3353 - 3362
YAMAZAKI ET AL., J. AM. CHEM. SOC., vol. 120, 1998, pages 5591
ZALATAN ET AL.: "Engineering complex synthetic transcriptional programs with CRISPR RNA scaffolds", CELL, vol. 160, 2015, pages 339 - 350, XP055278878, DOI: 10.1016/j.cell.2014.11.052
ZHANG Y. P. ET AL., GENE THER., vol. 6, 1999, pages 1438 - 47
ZHAO, C.; LIU, F.; PYLE, A. M.: "An ultraprocessive, accurate reverse transcriptase encoded by a metazoan group II intron", RNA, vol. 24, 2018, pages 183 - 195
ZHAO, C.; PYLE, A. M.: "Crystal structures of a group II intron maturase reveal a missing link in spliceosome evolution", NATURE STRUCTURAL & MOLECULAR BIOLOGY, vol. 23, 2016, pages 558 - 565, XP055556551, DOI: 10.1038/nsmb.3224
ZIMMERLY, S.; GUO, H.; PERLMAN, P. S.; LAMBOWITZ, A. M.: "Group II intron mobility occurs by target DNA-primed reverse transcription", CELL, vol. 82, 1995, pages 545 - 554
ZIMMERLY, S.; WU, L.: "An Unexplored Diversity of Reverse Transcriptases in Bacteria", MICROBIOL SPECTR, vol. 3, 2015
ZUFFEREY ET AL., J. VIROL., vol. 73, no. 4, 1999, pages 2886 - 92
ZUKER; STIEGLER, NUCLEIC ACIDS RES., vol. 9, 1981, pages 133 - 148

Similar Documents

Publication Publication Date Title
US20220315906A1 (en) Base editors with diversified targeting scope
US20230108687A1 (en) Gene editing methods for treating spinal muscular atrophy
US20230159913A1 (en) Targeted base editing of the ush2a gene
US20230021641A1 (en) Cas9 variants having non-canonical pam specificities and uses thereof
WO2021155065A1 (en) Base editors, compositions, and methods for modifying the mitochondrial genome
JP2022526908A (en) Methods and compositions for editing nucleotide sequences
JP2023525304A (en) Methods and compositions for simultaneous editing of both strands of a target double-stranded nucleotide sequence
US20240173430A1 (en) Base editing for treating Hutchinson-Gilford progeria syndrome
JP6793547B2 (en) Systems, methods and compositions for sequence manipulation with optimized functional CRISPR-Cas systems
JP2023543803A (en) Prime editing guide RNAs, compositions thereof, and methods of using the same
CA3100019A1 (en) Methods of substituting pathogenic amino acids using programmable base editor systems
US20220235347A1 (en) Compositions and methods for treating hemoglobinopathies
US20230127008A1 (en) Stat3-targeted base editor therapeutics for the treatment of melanoma and other cancers
JPWO2020191234A5 (fr)
JPWO2020191243A5 (fr)
JPWO2020191233A5 (fr)
WO2023076898A1 (en) Methods and compositions for editing a genome with prime editing and a recombinase
WO2022150790A2 (en) Prime editor variants, constructs, and methods for enhancing prime editing efficiency and precision
US20230059368A1 (en) Polynucleotide editors and methods of using the same
CA3227004A1 (en) Improved prime editors and methods of use
WO2024077267A1 (en) Prime editing methods and compositions for treating triplet repeat disorders
WO2023102538A1 (en) Self-assembling virus-like particles for delivery of prime editors and methods of making and using same
WO2023205687A1 (en) Improved prime editing methods and compositions
WO2024108092A1 (en) Prime editor delivery by AAV
WO2024077247A1 (en) Base editing methods and compositions for treating triplet repeat disorders

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23802069

Country of ref document: EP

Kind code of ref document: A1