EP4684023A2 - Modified template guide RNA molecules - Google Patents

Modified template guide RNA molecules

Info

Publication number
EP4684023A2
Authority
EP
European Patent Office
Prior art keywords
sequence
template rna
nucleotides
pbs
heterologous object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP24775687.7A
Other languages
English (en)
French (fr)
Inventor
Rahul Ravindran NAIR
Nikita BRODYAGIN
Luciano Henrique APPONI
Anne Helen Bothmer
Aamir MIR
John Frederick BRIONES
Cecilia Giovanna Silvia COTTA-RAMUSINO
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tessera Therapeutics Inc
Original Assignee
Tessera Therapeutics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tessera Therapeutics Inc
Publication of EP4684023A2

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K31/00 Medicinal preparations containing organic active ingredients
    • A61K31/70 Carbohydrates; Sugars; Derivatives thereof
    • A61K31/7088 Compounds having three or more nucleosides or nucleotides
    • A61K31/711 Natural deoxyribonucleic acids, i.e. containing only 2'-deoxyriboses attached to adenine, guanine, cytosine or thymine and having 3'-5' phosphodiester links
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases [RNase]; Deoxyribonucleases [DNase]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/30 Chemical structure
    • C12N2310/31 Chemical structure of the backbone
    • C12N2310/315 Phosphorothioates
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/30 Chemical structure
    • C12N2310/34 Spatial arrangement of the modifications
    • C12N2310/344 Position-specific modifications, e.g. on every purine, at the 3'-end
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2320/00 Applications; Uses
    • C12N2320/50 Methods for regulating/modulating their activity
    • C12N2320/51 Methods for regulating/modulating their activity modulating the chemical stability, e.g. nuclease-resistance

Definitions

  • compositions (e.g., proteins and nucleic acids)
  • methods for inserting, altering, or deleting sequences of interest in a genome
  • SUMMARY OF THE INVENTION
  • This disclosure relates to novel compositions, systems, and methods for altering a genome at one or more locations in a host cell, tissue or subject, in vivo or in vitro.
  • Features of the compositions or methods can include one or more of the following enumerated embodiments.
  • Enumerated Embodiments
  • a template RNA comprising, from 5’ to 3’: (i) a gRNA spacer, (ii) a gRNA scaffold, (iii) a heterologous object sequence comprising a region of at least 5 contiguous nucleotides comprising 2’-fluoro modifications on alternating nucleotides, and (iv) a primer binding site (PBS) sequence.
  • PBS primer binding site
  • the heterologous object sequence comprises 2’-fluoro modifications on alternating nucleotides starting from position +5 of the heterologous object sequence.
  • the heterologous object sequence further comprises a region between the region of at least 5 contiguous nucleotides and the PBS sequence that does not comprise 2’-fluoro modifications, e.g. comprises unmodified nucleotides.
  • the template RNA of any one of the preceding embodiments which comprises 2’-fluoro modified nucleotides at positions +5, +7, +9, and/or +11 of the heterologous object sequence.
  • the template RNA of any one of the preceding embodiments wherein the second nucleotide from the 5’ end of the heterologous object sequence comprises a 2’-fluoro modification (e.g., position +9 or +10 of the heterologous object sequence).
  • the template RNA of any one of the preceding embodiments which comprises 2’-fluoro modified nucleotides at positions +5, +7, and/or +9 of the heterologous object sequence.
  • the template RNA of any one of the preceding embodiments which comprises 2’-fluoro modified nucleotides at positions +4, +6, +8, and/or +10 of the heterologous object sequence.
  • the template RNA of embodiment 17, which comprises at the 3’ end, in 5’ to 3’ order, a 2’-fluoro modified nucleotide, a 2’-OMe modified nucleotide, and one or more (e.g., 1, 2, or 3) nucleotides comprising a phosphorothioate modification.
  • the template RNA of embodiment 17 or 18, wherein the one or more nucleotides comprising a phosphorothioate modification further comprise a 2’-OMe modification.
  • a template RNA comprising, from 5’ to 3’: (i) a gRNA spacer, (ii) a gRNA scaffold, (iii) a heterologous object sequence, and (iv) a primer binding site (PBS) sequence, wherein the template RNA comprises a region of at least 5 contiguous nucleotides comprising 2’-fluoro modifications on alternating nucleotides, optionally wherein the number of nucleotides between the 3’ most nucleotide comprising a 2’-fluoro modification in the region and the 3’ end of the template RNA is 8, 9, 10, 11, 12, 13, 14, or 15.
  • 11867955v1 Attorney Docket No.: 2017469-0019
  • a template RNA comprising, from 5’ to 3’: (i) a gRNA spacer, (ii) a gRNA scaffold, (iii) a heterologous object sequence, and (iv) a primer binding site (PBS) sequence comprising a 2’-fluoro modified nucleotide.
  • 25. The template RNA of any one of embodiments 21-24, wherein the template RNA does not comprise any unmodified nucleotides 3’ of the 2’-fluoro modified nucleotide.
  • 26. The template RNA of any one of embodiments 21-25, wherein the template RNA further comprises (e.g., at the 3’ end of the template RNA) one or more (e.g., 1, 2, or 3) nucleotides each comprising a phosphorothioate modification and a 2’-OMe modification.
  • the template RNA of any one of embodiments 21-26, wherein the number of nucleotides between the 2’-fluoro modified nucleotide and the 3’ end of the template RNA is 2, 3, 4, 5, 6, 7, 8, 9, or 10.
  • the template RNA of any one of embodiments 21-27, wherein the number of nucleotides between the 2’-fluoro modified nucleotide and the 3’ end of the template RNA is at least 4.
  • the template RNA of any one of embodiments 21-28, wherein the number of nucleotides between the 2’-fluoro modified nucleotide and the 3’ end of the template RNA is 4.
  • a template RNA comprising, from 5’ to 3’: (i) a gRNA spacer, (ii) a gRNA scaffold, (iii) a heterologous object sequence, and (iv) a primer binding site (PBS) sequence comprising one or more (e.g., 1, 2, or 3) 2’-OMe modified nucleotides.
  • 34. The template RNA of any of embodiments 30-33, wherein the template RNA does not comprise any unmodified nucleotides 3’ of the 2’-OMe modified nucleotides.
  • 35. The template RNA of any of embodiments 30-34, wherein the template RNA further comprises (e.g., at the 3’ end of the template RNA) one or more (e.g., 1, 2, or 3) nucleotides each comprising a phosphorothioate modification and a 2’-OMe modification.
  • a template RNA comprising, from 5’ to 3’: (i) a gRNA spacer, (ii) a gRNA scaffold, (iii) a heterologous object sequence comprising one or more (e.g., 1, 2, or 3) 2’-OMe modified nucleotides, and (iv) a primer binding site (PBS) sequence.
  • the heterologous object sequence comprises a plurality of 2’-OMe modified nucleotides (e.g., 2, 3, 4, or 5 2’-OMe modified nucleotides) positioned adjacent to each other.
  • the template RNA of embodiment 37 wherein the plurality of 2’-OMe modified nucleotides are at least 1, 2, 3, 4, or 5 nucleotides from the 3’ end of the heterologous object sequence.
  • 39. The template RNA of embodiment 37, wherein the plurality of 2’-OMe modified nucleotides are less than 10, 9, 8, 7, or 6 nucleotides from the 3’ end of the heterologous object sequence.
  • a template RNA comprising, from 5’ to 3’: (i) a gRNA spacer, (ii) a gRNA scaffold, (iii) a heterologous object sequence, and (iv) a primer binding site (PBS) sequence comprising one or more (e.g., 1, 2, or 3) 2’-OMe modified nucleotides.
  • the template RNA of embodiment 41 wherein the plurality of 2’-OMe modified nucleotides are at least 1, 2, 3, 4, or 5 nucleotides from the 5’ end of the PBS sequence.
  • 43. The template RNA of any one of embodiments 40-42, wherein the 2’-OMe modified nucleotides further comprise a phosphorothioate modification.
  • 44. The template RNA of embodiment 40, wherein nucleotides -4, -5, -6, -7, -8, -9, and -10 of the PBS sequence comprise a 2’-OMe modification and/or a phosphorothioate modification.
  • nucleotides -5, -6, -7, -8, -9, and -10 of the PBS sequence comprise a 2’-OMe modification and/or a phosphorothioate modification.
  • nucleotide -10 of the PBS sequence is at the 3’ end of the template RNA.
  • the template RNA of embodiment 47 which does not comprise a 2’-OMe modified nucleotide at positions -2 or -1 of the PBS sequence or at positions +1 or +2 of the heterologous object sequence.
  • 49. The template RNA of any one of embodiments 36-48, wherein the nucleotides at positions -1 of the PBS sequence and +1 of the heterologous object sequence are unmodified nucleotides.
  • 50. The template RNA of embodiment 47, wherein the nucleotides at positions -2 or -1 of the PBS sequence and +1 or +2 of the heterologous object sequence are unmodified nucleotides.
  • 52. The template RNA of any one of embodiments 36-51, wherein the heterologous object sequence and/or PBS sequence comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 unmodified nucleotides.
  • 53. The template RNA of any one of embodiments 36-52, wherein the heterologous object sequence and/or PBS sequence comprise 1-5, 5-10, 10-15, 15-20, 20-25, 25-30, 30-35, 35-40, 40-45, or 45-50 unmodified nucleotides.
  • a template RNA comprising, from 5’ to 3’: (i) a gRNA spacer, (ii) a gRNA scaffold, (iii) a heterologous object sequence comprising one or more (e.g., 1, 2, or 3) 2’-fluoro modified nucleotides, and (iv) a primer binding site (PBS) sequence.
  • the heterologous object sequence comprises a plurality of 2’-fluoro modified nucleotides (e.g., 2, 3, 4, or 5 2’-fluoro modified nucleotides) positioned adjacent to each other.
  • the template RNA of embodiment 55 wherein the plurality of 2’-fluoro modified nucleotides are at least 1, 2, 3, 4, or 5 nucleotides from the 3’ end of the heterologous object sequence.
  • the template RNA of embodiment 54, wherein the heterologous object sequence comprises a plurality of 2’-fluoro modified nucleotides (e.g., 2, 3, 4, or 5 2’-fluoro modified nucleotides) alternating with a plurality of nucleotides lacking a 2’-fluoro modification.
  • the template RNA of embodiment 58 which comprises one or more (e.g., 1, 2, 3, 4, or 5) 2’-fluoro modifications on alternating nucleotides after the 2’-fluoro modified nucleotide at the 5’ end of the heterologous object sequence.
  • a template RNA comprising, from 5’ to 3’: (i) a gRNA spacer, (ii) a gRNA scaffold, (iii) a heterologous object sequence, and (iv) a primer binding site (PBS) sequence comprising one or more (e.g., 1, 2, or 3) 2’-fluoro modified nucleotides.
  • a template RNA comprising, from 5’ to 3’: (i) a gRNA spacer, (ii) a gRNA scaffold, (iii) a heterologous object sequence, and (iv) a primer binding site (PBS) sequence, wherein the template RNA comprises one or more (e.g., 1, 2, or 3) 2’-fluoro modified nucleotides (e.g., one or more adjacent nucleotides comprising 2’-fluoro modifications), optionally wherein the number of nucleotides between one of the nucleotides comprising a 2’-fluoro modification and the 3’ end of the template RNA is 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, or 29.
  • the gRNA scaffold binds a gene modifying polypeptide (e.g., binds a Cas domain, e.g., a Cas9 domain of the gene modifying polypeptide).
  • the gRNA scaffold binds a gene modifying polypeptide (e.g., binds a Cas domain, e.g., a Cas9 domain of the gene modifying polypeptide) comprising the amino acid sequence of SEQ ID NO: 101, or an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity thereto.
  • the gRNA scaffold binds a gene modifying polypeptide (e.g., binds a Cas domain, e.g., a Cas9 domain of the gene modifying polypeptide) comprising the amino acid sequence of SEQ ID NO: 102, or an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity thereto.
  • the gRNA scaffold binds a gene modifying polypeptide (e.g., binds a Cas domain, e.g., a Cas9 domain of the gene modifying polypeptide) comprising the amino acid sequence of SEQ ID NO: 103, or an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity thereto.
  • the template RNA of any one of the preceding embodiments wherein the heterologous object sequence comprises a mutation region to introduce a mutation into (e.g., to correct a mutation in) a portion (e.g., a second portion) of the human PAH, FAH, HBB, TRAC4, B2M, or A1AT gene (wherein optionally the heterologous object sequence comprises, from 5’ to 3’, a post-edit homology region, a mutation region, and a pre-edit homology region).
  • the template RNA of any one of the preceding embodiments which comprises at least 5, 6, 7, or 8 bases with 100% identity to a third portion of the human PAH, FAH, HBB, TRAC4, B2M, or A1AT gene.
  • the template RNA of any one of the preceding embodiments which does not comprise a 2’-fluoro modified nucleotide at position -1 of the PBS sequence or +1 of the heterologous object sequence.
  • 80. The template RNA of any one of the preceding embodiments, which does not comprise a 2’-fluoro modified nucleotide at positions -2 or -1 of the PBS sequence or +1 or +2 of the heterologous object sequence.
  • 81. The template RNA of any one of the preceding embodiments, which does not comprise a 2’-OMe modified nucleotide at position -1 of the PBS sequence or +1 of the heterologous object sequence.
  • the template RNA of any one of the preceding embodiments wherein the nucleotides at positions -2 and -1 of the PBS sequence and/or +1 and +2 of the heterologous object sequence are unmodified nucleotides.
  • the 3’ end of the PBS sequence comprises one or more of: a 2’-fluoro modified nucleotide, a 2’-OMe modified nucleotide, and one or more (e.g., 1, 2, or 3) nucleotides comprising a phosphorothioate modification.
  • the template RNA of any one of the preceding embodiments, wherein one or more (e.g., 1, 2, 3, 4, or 5) of the nucleotides comprising a phosphorothioate modification further comprise a 2’-OMe modification.
  • the template RNA of any one of the preceding embodiments which comprises a polynucleotide as listed in column 3 of any of Tables 12-14 or column 2 of Table 15, or a nucleic acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity thereto.
  • the template RNA of embodiment 88 which comprises one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or all of) the nucleotide modifications for the polynucleotide as listed in column 3 of any of Tables 12-14 or column 2 of Table 15.
  • the template RNA of embodiment 88 or 89 which comprises one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or all of) the 2’-OMe modifications for the polynucleotide as listed in column 3 of any of Tables 12-14 or column 2 of Table 15.
  • the template RNA of embodiment 88 or 89 which comprises one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or all of) the 2’-fluoro modifications for the polynucleotide as listed in column 3 of any of Tables 12-14 or column 2 of Table 15.
  • the template RNA of any one of the preceding embodiments which is capable of introducing an alteration (e.g., a nucleic acid substitution, deletion, or insertion) into at least 20%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, or 90% of target nucleic acid molecules (e.g., genomic DNA) in a population of said target nucleic acid molecules (e.g., a population of cells comprising said genomic DNA).
  • a gene modifying system comprising: (i) a template RNA of any of the preceding embodiments, and (ii) a gene modifying polypeptide (e.g., as described herein), or a nucleic acid (e.g., RNA) encoding the gene modifying polypeptide.
  • a method for modifying a target site in a nucleic acid molecule (e.g., genomic DNA) in a cell, the method comprising contacting the cell with the gene modifying system of embodiment 94, or DNA encoding the same, thereby modifying the target site in the nucleic acid molecule in the cell.
  • the left hand diagram shows the gene modifying polypeptide, which comprises a Cas nickase domain (e.g., spCas9 N863A) and a reverse transcriptase domain (RT domain) which are linked by a linker.
  • the right hand diagram shows the template RNA which comprises, from 5’ to 3’, a gRNA spacer, a gRNA scaffold, a heterologous object sequence, and a primer binding site sequence (PBS sequence).
  • the heterologous object sequence can comprise a mutation region that comprises one or more sequence differences relative to the target site.
  • the heterologous object sequence can also comprise a pre-edit homology region and a post-edit homology region, which flank the mutation region.
  • the gRNA spacer of the template RNA binds to the second strand of a target site in the genome.
  • the gRNA scaffold of the template RNA binds to the gene modifying polypeptide, e.g., localizing the gene modifying polypeptide to the target site in the genome.
  • the Cas domain of the gene modifying polypeptide nicks the target site (e.g., the first strand of the target site), e.g., allowing the PBS sequence to bind to a sequence adjacent to the site to be altered on the first strand of the target site.
  • the RT domain of the gene modifying polypeptide uses the first strand of the target site that is bound to the complementary sequence comprising the PBS sequence of the template RNA as a primer and the heterologous object sequence of the template RNA as a template to, e.g., polymerize a sequence complementary to the heterologous object sequence.
  • reverse transcription can then proceed through the pre-edit homology region, then through the mutation region, and then through the post-edit homology region, thereby producing a DNA strand comprising a mutation specified by the heterologous object sequence.
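The order of reverse transcription described above can be illustrated with a minimal Python sketch. It assumes simple Watson-Crick pairing and ignores chemical modifications; the function name `reverse_transcribe` and the three short segment sequences are made up for illustration and are not taken from the disclosure.

```python
# Hypothetical illustration: the RT domain reads the heterologous object
# sequence from its 3' end, so the new DNA strand (written 5'->3') covers the
# pre-edit homology region first, then the mutation region, then the
# post-edit homology region.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(template_rna: str) -> str:
    """Return the DNA strand (5'->3') complementary to an RNA template (5'->3')."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(template_rna.upper()))


# Made-up segments, listed in the 5'->3' order they occupy in the template RNA.
post_edit, mutation, pre_edit = "GGAUC", "A", "CUUGA"
heterologous_object_sequence = post_edit + mutation + pre_edit

new_dna = reverse_transcribe(heterologous_object_sequence)
print(new_dna)
```

In this sketch the product begins with the complement of the pre-edit homology region and ends with the complement of the post-edit homology region, mirroring the order given in the text.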
  • FIG. 2 is a series of diagrams depicting the chemical structures of exemplary chemical modifications to nucleotides described herein.
  • Shown in the upper left is a nucleotide comprising a 2’- hydroxyl group (e.g., as in a ribonucleotide). Shown in the lower left is a 2’-fluoro (2’-F) modified nucleotide comprising a fluorine at the 2’ position of the sugar. Shown on the upper right is a nucleotide comprising both a 2’-O-methyl (OMe) modification and a phosphorothioate modification. Shown in the lower right is a nucleotide comprising just the 2’-OMe modification.
  • FIG. 3A is a series of diagrams showing the heterologous object sequences and PBS (priming) sequences of a series of FAH1 template RNA variants comprising 2’-OMe modifications (indicated by m) and/or phosphorothioate modifications (indicated by *) at varying positions in the heterologous object sequence and/or priming sequence, as indicated in the table.
  • These template RNA variants were tested for their capacity to rewrite the target nucleic acid sequence, with the resultant rewriting efficiency for each variant shown in the bar graph.
  • FIG. 3B is a series of diagrams showing the heterologous object sequences and PBS (primer) sequences of a series of variants of a GFP template RNA, comprising 2’-OMe modifications (indicated by m) and/or phosphorothioate modifications (indicated by *), as indicated in the table.
  • These template RNA variants were tested for their capacity to rewrite the target nucleic acid sequence, with the resultant rewriting efficiency for each variant shown in the bar graph.
  • FIG. 3C is a series of diagrams showing the heterologous object sequences and PBS (priming) sequences of variants of template RNAs targeting HEK3, FAH, or HBB.
  • The template RNA variants comprised 2’-OMe modifications (indicated by m) and/or phosphorothioate modifications (indicated by *), as indicated in the first table. These template RNA variants were tested for their capacity to rewrite the target nucleic acid sequence, with the resultant rewriting efficiency for each variant against each target relative to base variant A1 shown in the second table.
  • FIG. 4 is a series of diagrams showing the heterologous object sequences and PBS (priming) sequences of a series of GFP-targeting template RNA variants comprising 2’-fluoro modifications (as indicated by f) at varying positions in the heterologous object sequence and/or priming sequence, as indicated in the table.
  • These template RNA variants were tested for their capacity to rewrite the target nucleic acid sequence, with the resultant rewriting efficiency for each variant shown in the bar graph.
  • FIG. 5A is a series of diagrams showing the heterologous object sequences and PBS (priming) sequences of a series of variants of a template RNA, each comprising 2’-fluoro modifications as indicated in the table on the top in gray boxes.
  • Two of the variants included alternating patterns of 2’-fluoro modifications, in which every other nucleotide in a subsequence of the heterologous object sequence comprises a 2’-fluoro modification.
  • These variants further comprised, in 5’ to 3’ order, a 2’-fluoro modified nucleotide, three 2’-OMe modified nucleotides, and three nucleotides each comprising a 2’-OMe and a phosphorothioate modification, at the 3’ end of the priming region.
  • FIG. 5B is a series of diagrams showing the heterologous object sequences and PBS (priming) sequences of a series of variants of a template RNA, each comprising 2’-fluoro modifications as indicated in the table on the top in gray boxes.
  • Two of the variants included alternating patterns of 2’-fluoro modifications, in which every other nucleotide in a subsequence of the heterologous object sequence comprises a 2’-fluoro modification.
  • These variants further comprised, in 5’ to 3’ order, a 2’-fluoro modified nucleotide, three 2’-OMe modified nucleotides, and three nucleotides each comprising a 2’-OMe and/or a phosphorothioate modification, at the 3’ end of the priming region.
  • These template RNA variants were tested for their capacity to introduce alterations into the target nucleic acid sequence, with the resultant rewriting efficiency and percentage of indels for each variant shown in the graphs on the bottom left and bottom right, respectively.
  • FIG. 5C is a series of diagrams showing the heterologous object sequence and PBS (priming) sequence of a variant of a template RNA comprising 2’-fluoro modifications as indicated in the table on the top in gray boxes.
  • the RNACS6874 variant included an alternating pattern of 2’-fluoro modifications, in which every other nucleotide in a subsequence of the heterologous object sequence comprises a 2’-fluoro modification.
  • The variants further comprised, in 5’ to 3’ order, a 2’-fluoro modified nucleotide, a 2’-OMe modified nucleotide, and three nucleotides each comprising a 2’-OMe and/or a phosphorothioate modification, at the 3’ end of the priming region.
  • These template RNA variants were tested for their capacity to introduce alterations into the target nucleic acid sequence, with the resultant rewriting efficiency and percentage of indels for each variant shown in the graphs on the bottom left and bottom right, respectively.
  • FIG. 5E is a series of diagrams showing the heterologous object sequences and PBS (priming) sequences of a series of variants of a template RNA, each comprising 2’-fluoro modifications as indicated in the table on the top in gray boxes.
  • Two of the variants included alternating patterns of 2’-fluoro modifications, in which every other nucleotide in a subsequence of the heterologous object sequence comprises a 2’-fluoro modification.
  • These variants further comprised, in 5’ to 3’ order, a 2’-fluoro modified nucleotide, a 2’-OMe modified nucleotide, and three nucleotides each comprising a 2’-OMe and/or a phosphorothioate modification, at the 3’ end of the priming region.
  • These template RNA variants were tested for their capacity to introduce alterations into the target nucleic acid sequence, with the resultant rewriting efficiency and percentage of indels for each variant shown in the graphs on the bottom left and bottom right, respectively.
  • FIG. 6 is a diagram showing exemplary modifications for a template RNA as described herein.
  • the heterologous object sequence can, in some instances, comprise an alternating pattern of 2’-fluoro modified nucleotides and unmodified nucleotides (e.g., ribonucleotides comprising a 2’-hydroxyl group).
  • the region showing the alternating pattern can have a length, in some instances, between 0 nucleotides and the full length of the heterologous object sequence minus four nucleotides.
  • the region showing the alternating pattern can have a length, in some instances, between 0 nucleotides and the full length of the heterologous object sequence minus three nucleotides.
  • the 5’-most nucleotide of the heterologous object sequence comprises a 2’-fluoro modification and, in certain instances, is the first 2’-fluoro modified nucleotide of the alternating pattern.
  • the junction region connecting the heterologous object sequence to the priming sequence only comprises unmodified nucleotides (e.g., comprising the -1, -2, and/or -3, and/or the +1, +2, and/or +3 nucleotides, as numbered relative to the junction of the heterologous object sequence and the priming sequence, e.g., as described herein).
  • the 3’ end of the priming sequence comprises a motif comprising, in 5’ to 3’ order, a 2’-fluoro modified nucleotide, one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more) 2’-OMe modified nucleotides, and one or more (e.g., 1, 2, 3, 4, or 5) nucleotides each comprising both a 2’-OMe modification and a phosphorothioate modification.
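The FIG. 6 layout described above can be sketched as a small Python helper that assigns a modification label to each position. This is only one reading of the text, under stated assumptions: positions +1, +2, ... count from the junction toward the 5’ end of the heterologous object sequence, positions -1, -2, ... count from the junction toward the 3’ end of the priming sequence, three nucleotides on each side of the junction are left unmodified, and the 3’-end motif uses one 2’-OMe nucleotide and three 2’-OMe plus phosphorothioate nucleotides. The function name, label strings, and default values are hypothetical.

```python
from typing import List, Tuple


def fig6_like_pattern(hos_len: int, pbs_len: int, junction: int = 3,
                      n_ome: int = 1, n_ome_ps: int = 3) -> List[Tuple[int, str]]:
    """Return (position, modification) pairs in 5'->3' order.

    Positions +hos_len..+1 are the heterologous object sequence and
    -1..-pbs_len are the PBS, numbered outward from their junction
    (an assumed numbering convention).
    """
    tail = 1 + n_ome + n_ome_ps  # length of the 3'-end motif
    if pbs_len < junction + tail or hos_len <= junction:
        raise ValueError("sequence too short for this layout")
    out = []
    # Heterologous object sequence: 2'-F on every other nucleotide starting at
    # the 5'-most nucleotide, leaving `junction` nucleotides next to the PBS unmodified.
    for i, pos in enumerate(range(hos_len, 0, -1)):
        mod = "2F" if (i % 2 == 0 and pos > junction) else "unmodified"
        out.append((pos, mod))
    # PBS: unmodified except for the 3'-end motif 2'-F, 2'-OMe, 2'-OMe+PS.
    motif = ["2F"] + ["OMe"] * n_ome + ["OMe+PS"] * n_ome_ps
    for k in range(1, pbs_len + 1):
        from_end = pbs_len - k  # 0 for the 3'-most nucleotide
        mod = motif[tail - 1 - from_end] if from_end < tail else "unmodified"
        out.append((-k, mod))
    return out


pattern = dict(fig6_like_pattern(hos_len=10, pbs_len=10))
for position, modification in pattern.items():
    print(f"{position:+d}\t{modification}")
```

With a 10-nucleotide heterologous object sequence, this sketch places 2’-fluoro labels at +10, +8, +6, and +4, which happens to match one of the position sets listed in the enumerated embodiments; other lengths and defaults give other layouts.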
  • alternating nucleotides refers to a pattern of nucleotides wherein all of the odd nucleotides of that region have the same chemical modification and all the even nucleotides do not have that chemical modification, or the opposite: all of the even nucleotides of that region have the same chemical modification and all the odd nucleotides do not have that chemical modification.
  • the first, third, and fifth positions of a region may all comprise 2’F chemical modifications
  • the second and fourth positions of a region may comprise unmodified nucleotides or a chemical modification other than 2’F.
  • the second and fourth positions may be the same or different.
  • one or more may comprise a second chemical modification.
  • in the region described above having 2’F chemical modifications at the first, third, and fifth positions, if only one of the nucleotides at those positions further comprises a backbone modification, the region still comprises alternating nucleotides with respect to the 2’F chemical modification.
  • the alternating nucleotides may be found in a region of a larger nucleic acid, wherein the larger nucleic acid comprises one or more other, non-alternating, regions.
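The definition of alternating nucleotides above can be sketched as a small check. This is a minimal illustration, not a method taken from the source: the function name `is_alternating`, the representation of each nucleotide as a set of modification labels, and the labels `"2'F"`, `"2'OMe"`, and `"PS"` are all assumptions made for the example.

```python
from typing import Sequence, Set


def is_alternating(region: Sequence[Set[str]], mod: str = "2'F") -> bool:
    """Check whether `mod` alternates across `region` (listed 5' to 3').

    Each element is the set of modification labels on one nucleotide.
    The pattern holds when every odd position carries `mod` and no even
    position does, or the reverse. Other modifications are ignored.
    """
    if len(region) < 2:
        return False  # assumption: a pattern needs at least two positions
    has_mod = [mod in nucleotide for nucleotide in region]
    odd = has_mod[0::2]   # 1-based positions 1, 3, 5, ...
    even = has_mod[1::2]  # 1-based positions 2, 4, 6, ...
    return (all(odd) and not any(even)) or (all(even) and not any(odd))


# Example mirroring the text: 2'F at positions 1, 3, and 5; one of those
# positions also carries a backbone ("PS") modification; positions 2 and 4
# differ from each other and neither carries 2'F.
region = [{"2'F"}, {"2'OMe"}, {"2'F", "PS"}, set(), {"2'F"}]
print(is_alternating(region))  # True
```

The extra backbone modification at position 3 does not affect the result, which matches the statement that the region still alternates with respect to the 2’F modification.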
  • expression cassette refers to a nucleic acid construct comprising nucleic acid elements sufficient for the expression of the nucleic acid molecule of the instant invention.
  • a “gRNA spacer”, as used herein, refers to a portion of a nucleic acid that has complementarity to a target nucleic acid and can, together with a gRNA scaffold, target a Cas protein to the target nucleic acid.
  • a “gRNA scaffold”, as used herein, refers to a portion of a nucleic acid that can bind a Cas protein and can, together with a gRNA spacer, target the Cas protein to the target nucleic acid.
  • the gRNA scaffold comprises a crRNA sequence, tetraloop, and tracrRNA sequence.
  • a “gene modifying polypeptide”, as used herein, refers to a polypeptide comprising a retroviral reverse transcriptase, or a polypeptide comprising an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% amino acid sequence identity to a retroviral reverse transcriptase, which is capable of integrating a nucleic acid sequence (e.g., a sequence provided on a template nucleic acid) into a target DNA molecule (e.g., in a mammalian host cell, such as a genomic DNA molecule in the host cell).
  • the gene modifying polypeptide is capable of integrating the sequence substantially without relying on host machinery.
  • the gene modifying polypeptide integrates a sequence into a specific target site.
  • a gene modifying polypeptide includes one or more domains that, collectively, facilitate 1) binding the template nucleic acid, 2) binding the target DNA molecule, and 3) integration of at least a portion of the sequence of the template nucleic acid into the target DNA.
  • Gene modifying polypeptides include both naturally occurring polypeptides as well as engineered variants of the foregoing, e.g., having one or more amino acid substitutions to the naturally occurring sequence.
  • Gene modifying polypeptides also include heterologous constructs, e.g., where one or more of the domains recited above are heterologous to each other, whether through a heterologous fusion (or other conjugate) of otherwise wild-type domains, as well as fusions of modified domains, e.g., by way of replacement or fusion of a heterologous sub-domain or other substituted domain.
  • a gene modifying polypeptide integrates a sequence into a gene.
  • a gene modifying polypeptide integrates a sequence into a sequence outside of a gene.
  • domain refers to a structure of a biomolecule that contributes to a specified function of the biomolecule.
  • a domain may comprise a contiguous region (e.g., a contiguous sequence) or distinct, non-contiguous regions (e.g., non-contiguous sequences) of a biomolecule.
  • protein domains include, but are not limited to, an endonuclease domain, a DNA binding domain, a reverse transcription domain; an example of a domain of a nucleic acid is a regulatory domain, such as a transcription factor binding domain.
  • a domain e.g., a Cas domain
  • a domain can comprise two or more smaller domains (e.g., a DNA binding domain and an endonuclease domain).
  • exogenous when used with reference to a biomolecule (such as a nucleic acid sequence or polypeptide) means that the biomolecule was introduced into a host genome, cell or organism by the hand of man.
  • a nucleic acid that is added into an existing genome, cell, tissue or subject using recombinant DNA techniques or other methods is exogenous to the existing nucleic acid sequence, cell, tissue or subject.
  • First and second strand designations do not describe the target site DNA strands in other respects; for example, in some embodiments the first and second strands are nicked by a polypeptide described herein, but the designations ‘first’ and ‘second’ strand have no bearing on the order in which such nicks occur.
  • the term “heterologous,” as used herein to describe a first element in reference to a second element means that the first element and second element do not exist in nature disposed as described.
  • a heterologous polypeptide, nucleic acid molecule, construct or sequence refers to (a) a polypeptide, nucleic acid molecule or portion of a polypeptide or nucleic acid molecule sequence that is not native to a cell in which it is expressed, (b) a polypeptide or nucleic acid molecule or portion of a polypeptide or nucleic acid molecule that has been altered or mutated relative to its native state, or (c) a polypeptide or nucleic acid molecule with an altered expression as compared to the native expression levels under similar conditions.
  • a heterologous nucleic acid molecule may exist in a native host cell genome, but may have an altered expression level or have a different sequence or both.
  • heterologous nucleic acid molecules may not be endogenous to a host cell or host genome but instead may have been introduced into a host cell by transformation (e.g., transfection, electroporation), wherein the added molecule may integrate into the host genome or can exist as extra-chromosomal genetic material either transiently (e.g., mRNA) or semi-stably for more than one generation (e.g., episomal viral vector, plasmid or other self-replicating vector).
  • insertion of a sequence into a target site refers to the net addition of DNA sequence at the target site, e.g., where there are new nucleotides in the heterologous object sequence with no cognate positions in the unedited target site.
  • a nucleotide alignment of the PBS sequence and heterologous object sequence to the target nucleic acid sequence would result in an alignment gap in the target nucleic acid sequence.
  • a “deletion” generated by a heterologous object sequence in a target site refers to the net deletion of DNA sequence at the target site, e.g., where there are nucleotides in the unedited target site with no cognate positions in the heterologous object sequence.
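The alignment-gap view of insertions and deletions described above can be illustrated with a short sketch. It assumes the unedited target and the edited sequence have already been aligned in the same orientation with `-` as the gap character; the function name `classify_edit` and the returned dictionary keys are hypothetical.

```python
def classify_edit(aligned_unedited: str, aligned_edited: str) -> dict:
    """Count alignment gaps to call a net insertion or net deletion.

    A gap in the unedited target marks nucleotides with no cognate position
    in the target (inserted). A gap in the edited sequence marks target
    nucleotides with no cognate position in the edit (deleted).
    """
    if len(aligned_unedited) != len(aligned_edited):
        raise ValueError("aligned sequences must have equal length")
    inserted = aligned_unedited.count("-")
    deleted = aligned_edited.count("-")
    net = inserted - deleted
    if net > 0:
        kind = "insertion"
    elif net < 0:
        kind = "deletion"
    else:
        kind = "no net change"
    return {"inserted": inserted, "deleted": deleted, "net": net, "kind": kind}


result = classify_edit("ACGT---ACGT", "ACGTTTTACGT")
print(result["kind"], result["net"])  # insertion 3
```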
  • Nucleic acid molecule refers to both RNA and DNA molecules including, without limitation, complementary DNA (“cDNA”), genomic DNA (“gDNA”), and messenger RNA (“mRNA”), and also includes synthetic nucleic acid molecules, such as those that are chemically synthesized or recombinantly produced, such as RNA templates, as described herein.
  • the nucleic acid molecule can be double-stranded or single-stranded, circular, or linear. If single-stranded, the nucleic acid molecule can be the sense strand or the antisense strand.
  • nucleic acid comprising SEQ ID NO:1 refers to a nucleic acid, at least a portion of which has either (i) the sequence of SEQ ID NO:1, or (ii) a sequence complementary to SEQ ID NO:1.
  • the choice between the two is dictated by the context in which SEQ ID NO:1 is used. For instance, if the nucleic acid is used as a probe, the choice between the two is dictated by the requirement that the probe be complementary to the desired target.
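As a rough illustration of the "sequence or its complement" reading above, the sketch below tests whether a DNA string contains either a query sequence or its reverse complement. Treating "complementary" as the reverse complement written 5' to 3', restricting input to uppercase A/C/G/T, and the names `reverse_complement` and `comprises` are assumptions for the example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(_COMPLEMENT)[::-1]


def comprises(nucleic_acid: str, query: str) -> bool:
    """True if `nucleic_acid` contains `query` or its reverse complement."""
    return query in nucleic_acid or reverse_complement(query) in nucleic_acid


print(reverse_complement("ATGC"))       # GCAT
print(comprises("TTGCATTT", "ATGC"))    # True (matches via the complement)
```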
  • nucleic acid elements of the systems provided by the invention can be provided in a variety of topologies, including single-stranded, double-stranded, circular, linear, linear with open ends, linear with closed ends, and particular versions of these, such as doggybone DNA (dbDNA) and closed-ended DNA (ceDNA).
  • a “gene expression unit” is a nucleic acid sequence comprising at least one regulatory nucleic acid sequence operably linked to at least one effector sequence.
  • a first nucleic acid sequence is operably linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence.
  • a promoter or enhancer is operably linked to a coding sequence if the promoter or enhancer affects the transcription or expression of the coding sequence.
  • Operably linked DNA sequences may be contiguous or non-contiguous. Where necessary to join two protein-coding regions, operably linked sequences may be in the same reading frame.
  • the terms “host genome” or “host cell”, as used herein, refer to a cell and/or its genome into which protein and/or genetic material has been introduced. It should be understood that such terms are intended to refer not only to the particular subject cell and/or genome, but to the progeny of such a cell and/or the genome of the progeny of such a cell.
  • a host genome or host cell may be an isolated cell or cell line grown in culture, or genomic material isolated from such a cell or cell line, or may be a host cell or host genome that composes living tissue or an organism.
  • a host cell may be an animal cell or a plant cell, e.g., as described herein.
  • a host cell may be a mammalian cell, a human cell, avian cell, reptilian cell, bovine cell, horse cell, pig cell, goat cell, sheep cell, chicken cell, or turkey cell.
  • a host cell may be a corn cell, soy cell, wheat cell, or rice cell.
  • operative association describes a functional relationship between two nucleic acid sequences, such as 1) a promoter and 2) a heterologous object sequence, and means, in such example, the promoter and heterologous object sequence (e.g., a gene of interest) are oriented such that, under suitable conditions, the promoter drives expression of the heterologous object sequence.
  • a template nucleic acid carrying a promoter and a heterologous object sequence may be single-stranded, e.g., either the (+) or (-) orientation.
  • An “operative association” between the promoter and the heterologous object sequence in this template means that, regardless of whether the template nucleic acid will be transcribed in a particular state, when it is in the suitable state (e.g., is in the (+) orientation, in the presence of required catalytic factors, and NTPs, etc.), it is accurately transcribed.
  • Operative association applies analogously to other pairs of nucleic acids, including other tissue-specific expression control sequences (such as enhancers, repressors and microRNA recognition sequences), IR/DR, ITRs, UTRs, or homology regions and heterologous object sequences or sequences encoding a retroviral RT domain.
  • a template RNA comprises a PBS sequence and a heterologous object sequence
  • the PBS sequence binds to a region comprised in a target nucleic acid sequence, allowing a reverse transcriptase domain to use that region as a primer for reverse transcription, and to use the heterologous object sequence as a template for reverse transcription.
  • a “stem-loop sequence” refers to a nucleic acid sequence (e.g., RNA sequence) with sufficient self-complementarity to form a stem-loop, e.g., having a stem comprising at least two (e.g., 3, 4, 5, 6, 7, 8, 9, or 10) base pairs, and a loop with at least three (e.g., four) base pairs.
  • the stem may comprise mismatches or bulges.
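A simplified check for the stem-loop definition above might look like the following. It is a sketch under stated assumptions: only Watson-Crick A-U and G-C pairs count, mismatches and bulges in the stem are not modeled (although the text allows them), the loop length is counted in unpaired nucleotides, and the function name `has_stem_loop` and its default thresholds are chosen for illustration.

```python
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def has_stem_loop(rna: str, min_stem: int = 2, min_loop: int = 3) -> bool:
    """Return True if `rna` can form a perfect hairpin with at least
    `min_stem` consecutive base pairs enclosing at least `min_loop`
    unpaired nucleotides."""
    n = len(rna)
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            stem = 0
            # Extend the stem inward from (i, j) while the bases pair and
            # the enclosed loop would still be long enough.
            while (j - i - 2 * stem - 1 >= min_loop
                   and (rna[i + stem], rna[j - stem]) in WC_PAIRS):
                stem += 1
            if stem >= min_stem:
                return True
    return False


print(has_stem_loop("GGGAAAACCC"))  # True
print(has_stem_loop("GCAAGC"))      # False (loop would be too short)
```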
  • the following numbering system will be adhered to for describing the position of nucleotides having chemical modifications in the PBS sequence and/or heterologous object sequence of a template RNA.
  • a gene modifying system described herein comprises: (A) a gene modifying polypeptide or a nucleic acid encoding the gene modifying polypeptide, wherein the gene modifying polypeptide comprises (i) a reverse transcriptase domain, and (x) an endonuclease domain that contains DNA binding functionality; and (B) a template RNA.
  • RNA template element of a gene modifying system is typically heterologous to the gene modifying polypeptide element and provides an object sequence to be inserted (reverse transcribed) into the host genome.
  • the gene modifying polypeptide is capable of target primed reverse transcription.
  • the gene modifying polypeptide is capable of second-strand synthesis.
  • a gene modifying polypeptide includes one or more domains that, collectively, facilitate 1) binding the template nucleic acid, 2) binding the target DNA molecule, and 3) integration of at least a portion of the sequence of the template nucleic acid into the target DNA.
  • a gene modifying system is capable of producing an insertion into the target site of at least 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 nucleotides (and optionally no more than 500, 400, 300, 200, or 100 nucleotides). In some embodiments, a gene modifying system is capable of producing an insertion into the target site of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 nucleotides (and optionally no more than 500, 400, 300, 200, or 100 nucleotides).
  • a gene modifying system is capable of producing an insertion into the target site of at least 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5 or 10 kilobases (and optionally no more than 1, 5, 10, or 20 kilobases).
  • a gene modifying system is capable of producing a deletion of at least 81, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 200 nucleotides (and optionally no more than 500, 400, 300, or 200 nucleotides).
  • a gene modifying system is capable of producing a deletion of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 200 nucleotides (and optionally no more than 500, 400, 300, or 200 nucleotides).
  • a gene modifying system is capable of producing a deletion of at least 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5 or 10 kilobases (and optionally no more than 1, 5, 10, or 20 kilobases).
  • a gene modifying system is capable of producing a substitution into the target site of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 or more nucleotides.
  • a gene modifying system is capable of producing a substitution in the target site of 1-2, 2-3, 3-4, 4-5, 5-10, 10-15, 15-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90, or 90-100 nucleotides.
  • the substitution is a transition mutation. In some embodiments, the substitution is a transversion mutation.
  • the substitution converts an adenine to a thymine, an adenine to a guanine, an adenine to a cytosine, a guanine to a thymine, a guanine to a cytosine, a guanine to an adenine, a thymine to a cytosine, a thymine to an adenine, a thymine to a guanine, a cytosine to an adenine, a cytosine to a guanine, or a cytosine to a thymine.
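The twelve single-base conversions listed above split into transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine, or the reverse). A minimal sketch of that classification, with the function name `classify_substitution` chosen for illustration:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base DNA substitution as transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # transition
print(classify_substitution("A", "T"))  # transversion
```

Of the twelve conversions, four are transitions (A to G, G to A, C to T, T to C) and the remaining eight are transversions.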
  • an insertion, deletion, substitution, or combination thereof increases or decreases expression (e.g. transcription or translation) of a gene.
  • an insertion, deletion, substitution, or combination thereof increases or decreases expression (e.g. transcription or translation) of a gene by altering, adding, or deleting sequences in a promoter or enhancer, e.g. sequences that bind transcription factors.
  • an insertion, deletion, substitution, or combination thereof alters translation of a gene (e.g. alters an amino acid sequence), inserts or deletes a start or stop codon, or alters or fixes the translation frame of a gene.
  • an insertion, deletion, substitution, or combination thereof alters splicing of a gene, e.g. by inserting, deleting, or altering a splice acceptor or donor site. In some embodiments, an insertion, deletion, substitution, or combination thereof alters transcript or protein half-life. In some embodiments, an insertion, deletion, substitution, or combination thereof alters protein localization in the cell (e.g. from the cytoplasm to a mitochondrion, or from the cytoplasm into the extracellular space, e.g. by adding a secretion tag). In some embodiments, an insertion, deletion, substitution, or combination thereof alters (e.g. improves) protein folding (e.g. to prevent accumulation of misfolded proteins).
  • an insertion, deletion, substitution, or combination thereof alters, increases, or decreases the activity of a gene, e.g. a protein encoded by the gene.
  • Polypeptide Components of Gene Modifying Systems: the gene modifying polypeptide possesses the functions of DNA target site binding, template nucleic acid (e.g., RNA) binding, DNA target site cleavage, and template nucleic acid (e.g., RNA) writing, e.g., reverse transcription.
  • each function is contained within a distinct domain.
  • a function may be attributed to two or more domains (e.g., two or more domains, together, exhibit the functionality).
  • two or more domains may have the same or similar function (e.g., two or more domains each independently have DNA-binding functionality, e.g., for two different DNA sequences).
  • one or more domains may be capable of enabling one or more functions, e.g., a Cas9 domain enabling both DNA binding and target site cleavage.
  • the domains are all located within a single polypeptide.
  • the gene modifying polypeptide comprises, in N-terminal to C-terminal order, one or more (e.g., 1, 2, 3, 4, 5, or all 6) of an N-terminal methionine residue, a first nuclear localization signal (NLS), a DNA binding domain, a linker, an RT domain, and/or a second NLS.
  • the gene modifying polypeptide further comprises an N-terminal methionine residue.
  • a nucleic acid encoding a gene modifying polypeptide encodes a T2A sequence, e.g., wherein the T2A sequence is situated between a region encoding the gene modifying polypeptide and a second region, wherein the second region optionally encodes a selectable marker, e.g., puromycin.
  • the gene modifying polypeptide further comprises a spacer sequence between the first NLS and the DNA binding domain.
  • the spacer sequence between the first NLS and the DNA binding domain comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids.
  • the spacer sequence between the first NLS and the DNA binding domain comprises the amino acid sequence GG.
  • the gene modifying polypeptide further comprises a spacer sequence between the DNA binding domain and the linker. In certain embodiments, the spacer sequence between the DNA binding domain and the linker comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids. In certain embodiments, the spacer sequence between the DNA binding domain and the linker comprises the amino acid sequence GG. In certain embodiments, the gene modifying polypeptide further comprises a spacer sequence between the linker and the RT domain. In certain embodiments, the spacer sequence between the linker and the RT domain comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids.
  • the spacer sequence between the linker and the RT domain comprises the amino acid sequence GG.
  • the gene modifying polypeptide further comprises a spacer sequence between the RT domain and the second NLS.
  • the spacer sequence between the RT domain and the second NLS comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids.
  • the spacer sequence between the RT domain and the second NLS comprises the amino acid sequence AG.
  • the gene modifying polypeptide further comprises a spacer sequence between the second NLS and the T2A sequence and/or puromycin sequence.
  • the spacer sequence between the second NLS and the T2A sequence and/or puromycin sequence comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acids.
  • the spacer sequence between the second NLS and the T2A sequence and/or puromycin sequence comprises the amino acid sequence GSG.
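The N-terminal to C-terminal ordering and the optional spacers described above can be laid out programmatically. The sketch below is only an illustration of one such arrangement: the component labels are placeholders rather than real sequences, the function name `layout` is hypothetical, and using every spacer at once is an assumption, since each spacer is described as optional.

```python
# Component labels are placeholders, not amino acid sequences.
ORDER = ["Met", "NLS1", "DBD", "linker", "RT", "NLS2"]

# Spacer amino acid sequences named in the text, keyed by the junction they fill.
SPACERS = {
    ("NLS1", "DBD"): "GG",
    ("DBD", "linker"): "GG",
    ("linker", "RT"): "GG",
    ("RT", "NLS2"): "AG",
    ("NLS2", "T2A"): "GSG",
}


def layout(order=ORDER, spacers=SPACERS):
    """Return components in N-to-C order with spacers at matching junctions."""
    parts = []
    for left, right in zip(order, order[1:]):
        parts.append(left)
        if (left, right) in spacers:
            parts.append(spacers[(left, right)])
    parts.append(order[-1])
    return parts


print("-".join(layout()))
# Met-NLS1-GG-DBD-GG-linker-GG-RT-AG-NLS2
print("-".join(layout(ORDER + ["T2A"])))
# Met-NLS1-GG-DBD-GG-linker-GG-RT-AG-NLS2-GSG-T2A
```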
  • RT Domains: In certain aspects of the present invention, the writing domain of the gene modifying system possesses reverse transcriptase activity and is also referred to as a reverse transcriptase domain (an RT domain). In some embodiments, the RT domain comprises an RT catalytic portion and an RNA-binding region (e.g., a region that binds the template RNA).
  • a nucleic acid encoding the reverse transcriptase is altered from its natural sequence to have altered codon usage, e.g. improved for human cells.
  • the reverse transcriptase domain is a heterologous reverse transcriptase from a retrovirus.
  • the RT domain of a gene modifying polypeptide has been mutated from its original amino acid sequence, e.g., has at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 substitutions.
  • the RT domain is derived from the RT of a retrovirus, e.g., HIV-1 RT, Moloney Murine Leukemia Virus (MMLV) RT, avian myeloblastosis virus (AMV) RT, or Rous Sarcoma Virus (RSV) RT.
  • the RT domain has a length of about 400-500, 500-600, 600-700, 700-800, 800-900, or 900-1000 amino acids.
  • the retroviral reverse transcriptase (RT) domain exhibits enhanced stringency of target-primed reverse transcription (TPRT) initiation, e.g., relative to an endogenous RT domain.
  • the RT domain initiates TPRT when the 3 nt in the target site immediately upstream of the first strand nick, e.g., the genomic DNA priming the RNA template, have at least 66% or 100% complementarity to the 3 nt of homology in the RNA template. In some embodiments, the RT domain initiates TPRT when there are less than 5 nt mismatched (e.g., less than 1, 2, 3, 4, or 5 nt mismatched) between the template RNA homology and the target DNA priming reverse transcription.
  • the RT domain is modified such that the stringency for mismatches in priming the TPRT reaction is increased, e.g., wherein the RT domain does not tolerate any mismatches or tolerates fewer mismatches in the priming region relative to a wild-type (e.g., unmodified) RT domain.
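The priming criterion above (for example, at least 66% or 100% complementarity over 3 nt) can be computed as a fraction of paired positions. The sketch below assumes both strands are written 5' to 3', that the DNA primer end and RNA homology have equal length and pair antiparallel, and that only canonical DNA:RNA pairs count; the name `priming_complementarity` is hypothetical.

```python
# Canonical pairs between a DNA base (first) and an RNA base (second).
DNA_RNA_PAIRS = {("A", "U"), ("T", "A"), ("G", "C"), ("C", "G")}


def priming_complementarity(dna_primer_end: str, rna_homology: str) -> float:
    """Fraction of positions that pair when the DNA 3' end anneals
    antiparallel to the RNA homology region (both given 5' to 3')."""
    if len(dna_primer_end) != len(rna_homology) or not dna_primer_end:
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = zip(dna_primer_end, reversed(rna_homology))
    matches = sum((d, r) in DNA_RNA_PAIRS for d, r in pairs)
    return matches / len(dna_primer_end)


print(priming_complementarity("ACG", "CGU"))           # 1.0
print(priming_complementarity("ACG", "CGA") >= 0.66)   # True (2 of 3 pair)
```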
  • the RT domain comprises an HIV-1 RT domain.
  • the HIV-1 RT domain initiates lower levels of synthesis even with three nucleotide mismatches relative to an alternative RT domain (e.g., as described by Jamburuthugoda and Eickbush J Mol Biol 407(5):661-672 (2011); incorporated herein by reference in its entirety).
  • the RT domain forms a dimer (e.g., a heterodimer or homodimer). In some embodiments, the RT domain is monomeric. In some embodiments, an RT domain naturally functions as a monomer or as a dimer (e.g., heterodimer or homodimer). In some embodiments, an RT domain naturally functions as a monomer, e.g., is derived from a virus wherein it functions as a monomer.
  • the RT domain is selected from an RT domain from murine leukemia virus (MLV; sometimes referred to as MoMLV) (e.g., P03355), porcine endogenous retrovirus (PERV) (e.g., UniProt Q4VFZ2), mouse mammary tumor virus (MMTV) (e.g., UniProt P03365), Avian reticuloendotheliosis virus (AVIRE) (e.g., UniProtKB accession: P03360); Feline leukemia virus (FLV or FeLV) (e.g., UniProtKB accession: P10273); Mason-Pfizer monkey virus (MPMV) (e.g., UniProt P07572), bovine leukemia virus (BLV) (e.g., UniProt P03361), human T-cell leukemia virus-1 (HTLV-1) (e.g., UniProt P03362), human foamy virus (HFV) (e.g., M
  • an RT domain is dimeric in its natural functioning.
  • the RT domain is derived from a virus wherein it functions as a dimer.
  • the RT domain is selected from an RT domain from avian sarcoma/leukemia virus (ASLV) (e.g., UniProt A0A142BKH1), Rous sarcoma virus (RSV) (e.g., UniProt P03354), avian myeloblastosis virus (AMV) (e.g., UniProt Q83133), human immunodeficiency virus type I (HIV-1) (e.g., UniProt P03369), human immunodeficiency virus type II (HIV-2) (e.g., UniProt P15833), simian immunodeficiency virus (SIV) (e.g., UniProt P05896), bovine immunodeficiency virus (BIV) (e.g., UniProt P19560
  • Naturally heterodimeric RT domains may, in some embodiments, also be functional as homodimers.
  • dimeric RT domains are expressed as fusion proteins, e.g., as homodimeric fusion proteins or heterodimeric fusion proteins.
  • the RT function of the system is fulfilled by multiple RT domains (e.g., as described herein).
  • the multiple RT domains are fused or separate, e.g., may be on the same polypeptide or on different polypeptides.
  • a gene modifying system described herein comprises an integrase domain, e.g., wherein the integrase domain may be part of the RT domain.
  • an RT domain (e.g., as described herein) comprises an integrase domain.
  • an RT domain (e.g., as described herein) lacks an integrase domain, or comprises an integrase domain that has been inactivated by mutation or deleted.
  • a gene modifying system described herein comprises an RNase H domain, e.g., wherein the RNase H domain may be part of the RT domain.
  • the RNase H domain is not part of the RT domain and is covalently linked via a flexible linker.
  • an RT domain (e.g., as described herein) comprises an RNase H domain, e.g., an endogenous RNase H domain or a heterologous RNase H domain.
  • an RT domain (e.g., as described herein) lacks an RNase H domain.
  • an RT domain (e.g., as described herein) comprises an RNase H domain that has been added, deleted, mutated, or swapped for a heterologous RNase H domain.
  • the polypeptide comprises an inactivated endogenous RNase H domain.
  • an endogenous RNase H domain from one of the other domains of the polypeptide is genetically removed such that it is not included in the polypeptide, e.g., the endogenous RNase H domain is partially or completely truncated from the comprising domain.
  • mutation of an RNase H domain yields a polypeptide exhibiting lower RNase activity, e.g., as determined by the methods described in Kotewicz et al. Nucleic Acids Res 16(1):265-277 (1988) (incorporated herein by reference in its entirety), e.g., lower by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% compared to an otherwise similar domain without the mutation.
  • RNase H activity is abolished.
  • an RT domain is mutated to increase fidelity compared to an otherwise similar domain without the mutation.
  • a YADD or YMDD motif in an RT domain (e.g., in a reverse transcriptase) is replaced with YVDD.
  • replacement of the YADD or YMDD or YVDD results in higher fidelity in retroviral reverse transcriptase activity (e.g., as described in Jamburuthugoda and Eickbush J Mol Biol 2011; incorporated herein by reference in its entirety).
  • a gene modifying polypeptide described herein comprises an RT domain having an amino acid sequence according to Table 1, or a sequence having at least 70%, 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity thereto.
  • a nucleic acid described herein encodes an RT domain having an amino acid sequence according to Table 1, or a sequence having at least 70%, 80%, 85%, 90%, 95%, 97%, 98%, or 99% identity thereto.
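The percent-identity thresholds above presuppose some way of scoring two sequences. As a rough sketch, the function below computes identity over a pre-computed pairwise alignment. The source does not specify an alignment method or identity definition, so the choices here (input already aligned with `-` gaps, identity = matching columns divided by all alignment columns) and the name `percent_identity` are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pairwise alignment.

    Columns where both sequences have a gap are ignored; a column counts as
    a match only when both residues are identical and neither is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(a == b and a != gap for a, b in columns)
    return 100.0 * matches / len(columns)


# Toy aligned peptides (not sequences from the source): 5 of 6 columns match.
identity = percent_identity("MKT-LV", "MKTALV")
print(round(identity, 1), identity >= 70.0)  # 83.3 True
```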
  • reverse transcriptase domains are engineered to have improved properties, e.g. SuperScript IV (SSIV) reverse transcriptase derived from the MMLV RT.
  • the reverse transcriptase domain may be engineered to have lower error rates, e.g., as described in WO2001068895, incorporated herein by reference.
  • the reverse transcriptase domain may be engineered to be more thermostable.
  • the reverse transcriptase domain may be engineered to be more processive.
  • the reverse transcriptase domain may be engineered to have tolerance to inhibitors.
  • the reverse transcriptase domain may be engineered to be faster.
  • the reverse transcriptase domain may be engineered to better tolerate modified nucleotides in the RNA template. In some embodiments, the reverse transcriptase domain may be engineered to insert modified DNA nucleotides. In some embodiments, the reverse transcriptase domain is engineered to bind a template RNA.
  • one or more mutations are chosen from D200N, L603W, T330P, D524G, E562Q, D583N, P51L, S67R, E67K, T197A, H204R, E302K, F309N, W313F, L435G, N454K, H594Q, L671P, E69K, H8Y, T306K, or D653N in the RT domain of murine leukemia virus reverse transcriptase or a corresponding mutation at a corresponding position of another RT domain.
  • a gene modifying polypeptide as described herein comprises a reverse transcriptase or RT domain (e.g., as described herein) that comprises a MoMLV RT sequence or variant thereof.
  • the MoMLV RT sequence comprises one or more mutations selected from D200N, L603W, T330P, T306K, W313F, D524G, E562Q, D583N, P51L, S67R, E67K, T197A, H204R, E302K, F309N, L435G, N454K, H594Q, D653N, R110S, and K103L.
  • the MoMLV RT sequence comprises a combination of mutations, such as D200N, L603W, and T330P, optionally further including T306K and/or W313F.
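Point mutations such as D200N are written as wild-type residue, 1-based position, and replacement residue. A minimal sketch of applying such mutations to a sequence follows; the function name `apply_mutations` is hypothetical, and the five-residue toy peptide stands in for a real RT sequence.

```python
import re

_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(sequence: str, mutations: list) -> str:
    """Apply point mutations (e.g. 'D200N') to an amino acid sequence.

    Raises ValueError if a mutation is malformed, out of range, or if the
    stated wild-type residue does not match the sequence at that position.
    """
    residues = list(sequence)
    for mutation in mutations:
        match = _MUTATION.match(mutation)
        if match is None:
            raise ValueError(f"malformed mutation: {mutation!r}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3))
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range: {mutation!r}")
        if residues[position - 1] != wild_type:
            raise ValueError(f"wild-type residue mismatch: {mutation!r}")
        residues[position - 1] = replacement
    return "".join(residues)


# Toy peptide, not an RT sequence: D at position 2, T at position 4.
print(apply_mutations("MDLTW", ["D2N", "T4K"]))  # MNLKW
```

Checking the wild-type residue before substituting guards against applying a mutation to a sequence that uses a different numbering than the mutation list assumes.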
  • a gene modifying polypeptide comprises the RT domain from a retroviral reverse transcriptase, e.g., a wild-type M-MLV RT, e.g., comprising the following sequence: M-MLV (WT): TLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEA RLGIKPHIQRLLDQGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLP PSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFDEALHRDLADF RIQHPDLILLQYVDDLLLAATSEL
  • a gene modifying polypeptide comprises the RT domain from a retroviral reverse transcriptase comprising the sequence of amino acids 659-1329 of NP_057933.
  • the gene modifying polypeptide further comprises one additional amino acid at the N-terminus of the sequence of amino acids 659-1329 of NP_057933, e.g., as shown below: TLNIEDEHRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEA RLGIKPHIQRLLDQGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLP PSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFDEALHRDLADF RIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVK
  • the gene modifying polypeptide comprises an RNaseH1 domain (e.g., amino acids 1178-1318 of NP_057933).
  • a retroviral reverse transcriptase domain e.g., M-MLV RT, may comprise one or more mutations from a wild-type sequence that may improve features of the RT, e.g., thermostability, processivity, and/or template binding.
  • an M-MLV RT domain comprises, relative to the M-MLV (WT) sequence above, one or more mutations, e.g., selected from D200N, L603W, T330P, T306K, W313F, D524G, E562Q, D583N, P51L, S67R, E67K, T197A, H204R, E302K, F309N, L435G, N454K, H594Q, D653N, R110S, K103L, e.g., a combination of mutations, such as D200N, L603W, and T330P, optionally further including T306K and W313F.
  • an M-MLV RT used herein comprises the mutations D200N, L603W, T330P, T306K and W313F.
  • the mutant M-MLV RT comprises the following amino acid sequence: M-MLV (PE2): TLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQAPLIIPLKATSTPVSIKQYPMSQEA RLGIKPHIQRLLDQGILVPCQSPWNTPLLPVKKPGTNDYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLP PSHQWYTVLDLKDAFFCLRLHPTSQPLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFNEALHRDLADF RIQHPDLILLQYVDDLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWL TEARKETVMGQPTPKTPRQLREFLGKAGFCRLFIPGFA
  • a gene modifying polypeptide comprises the amino acid sequence of an RT domain sequence from a family selected from: AVIRE, BAEVM, FFV, FLV, FOAMV, GALV, KORV, MLVAV, MLVBM, MLVCB, MLVFF, MLVMS, PERV, SFV1, SFV3L, WMSV, and XMRV6.
  • a gene modifying polypeptide comprises the amino acid sequence of an RT domain sequence from an MLVMS RT domain.
  • the amino acid sequence of an RT domain sequence comprises one or more point mutations as listed in column 1 of Table 2, or a point mutation corresponding thereto. In embodiments, the amino acid sequence of an RT domain sequence comprises one or more point mutations as listed in column 3 of Table 2 (Gen1 MLVMS), or a point mutation corresponding thereto. In embodiments, the amino acid sequence of an RT domain sequence comprises one or more point mutations at an amino acid position of the RT domain as listed in columns 1 and 2 of Table 3, or an amino acid position corresponding thereto. In certain embodiments, a gene modifying polypeptide comprises the amino acid sequence of an RT domain sequence from an AVIRE RT domain.
  • the amino acid sequence of an RT domain sequence comprises one or more point mutations as listed in column 2 of Table 2, or a point mutation corresponding thereto. In embodiments, the amino acid sequence of an RT domain sequence comprises one or more point mutations as listed in column 4 of Table 2 (Gen2 AVIRE), or a point mutation corresponding thereto. In embodiments, the amino acid sequence of an RT domain sequence comprises one or more point mutations at an amino acid position of the RT domain as listed in columns 3 and 4 of Table 3, or an amino acid position corresponding thereto. In certain embodiments, the RT domain comprises an IENSSP (e.g., at the C-terminus). Table 2.
  • a gene modifying polypeptide comprises a gamma retrovirus-derived RT domain.
  • the gamma retrovirus-derived RT domain of a gene modifying polypeptide comprises the amino acid sequence of an RT domain sequence from a family selected from: AVIRE, BAEVM, FFV, FLV, FOAMV, GALV, KORV, MLVAV, MLVBM, MLVCB, MLVFF, MLVMS, PERV, SFV1, SFV3L, WMSV, and XMRV6.
  • the gamma retrovirus-derived RT domain of a gene modifying polypeptide is not derived from PERV.
  • said RT includes one, two, three, four, five, six or more mutations shown in Table 4 and corresponding to mutations D200N, L603W, T330P, D524G, E562Q, D583N, P51L, S67R, E67K, T197A, H204R, E302K, F309N, W313F, L435G, N454K, H594Q, L671P, E69K, or D653N in the RT domain of murine leukemia virus reverse transcriptase.
  • the gene modifying polypeptide further comprises a linker having at least 99% or 100% identity to SEQ ID NO: 5217.
  • the RT domain comprises the amino acid sequence of an RT domain of an AVIRE RT (e.g., an AVIRE_P03360 sequence, e.g., SEQ ID NO: 8001), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of an AVIRE RT further comprising one, two, three, four, or five mutations selected from the group consisting of D200N, G330P, L605W, T306K, and W313F, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an AVIRE RT further comprising one, two, or three mutations selected from the group consisting of D200N, G330P, and L605W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a BAEVM RT (e.g., a BAEVM_P10272 sequence, e.g., SEQ ID NO: 8004), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a BAEVM RT further comprising one, two, three, four, or five mutations selected from the group consisting of D198N, E328P, L602W, T304K, and W311F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a BAEVM RT further comprising one, two, or three mutations selected from the group consisting of D198N, E328P, and L602W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of an FFV RT (e.g., an FFV_O93209 sequence, e.g., SEQ ID NO: 8012), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of an FFV RT further comprising one, two, three, or four mutations selected from the group consisting of D21N, T293N, T419P, and L393K, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an FFV RT further comprising one, two, or three mutations selected from the group consisting of D21N, T293N, and T419P, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an FFV RT further comprising the mutation D21N.
  • the RT domain comprises the amino acid sequence of an FFV RT further comprising one, two, or three mutations selected from the group consisting of T207N, T333P, and L307K, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an FFV RT further comprising one or two mutations selected from the group consisting of T207N and T333P, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of an FLV RT (e.g., an FLV_P10273 sequence, e.g., SEQ ID NO: 8019), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of an FLV RT further comprising one, two, three, or four mutations selected from the group consisting of D199N, L602W, T305K, and W312F, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an FLV RT further comprising one or two mutations selected from the group consisting of D199N and L602W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a FOAMV RT (e.g., an FOAMV_P14350 sequence, e.g., SEQ ID NO: 8021), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of an FOAMV RT further comprising one, two, three, or four mutations selected from the group consisting of D24N, T296N, S420P, and L396K, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an FOAMV RT further comprising one, two, or three mutations selected from the group consisting of D24N, T296N, and S420P, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an FOAMV RT further comprising the mutation D24N, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an FOAMV RT further comprising one, two, or three mutations selected from the group consisting of T207N, S331P, and L307K, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of an FOAMV RT further comprising one or two mutations selected from the group consisting of T207N and S331P, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a GALV RT (e.g., a GALV_P21414 sequence, e.g., SEQ ID NO: 8027), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a GALV RT further comprising one, two, three, four, or five mutations selected from the group consisting of D198N, E328P, L600W, T304K, and W311F, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a GALV RT further comprising one, two, or three mutations selected from the group consisting of D198N, E328P, and L600W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a KORV RT (e.g., a KORV_Q9TTC1 sequence, e.g., SEQ ID NO: 8047), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a GALV RT further comprising one, two, three, four, five, or six mutations selected from the group consisting of D32N, D322N, E452P, L274W, T428K, and W435F, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a GALV RT further comprising one, two, three, or four mutations selected from the group consisting of D32N, D322N, E452P, and L274W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a GALV RT further comprising the mutation D32N. In some embodiments, the RT domain comprises the amino acid sequence of a KORV RT further comprising one, two, three, four, or five mutations selected from the group consisting of D231N, E361P, L633W, T337K, and W344F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a KORV RT further comprising one, two, or three mutations selected from the group consisting of D231N, E361P, and L633W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a MLVAV RT (e.g., an MLVAV_P03356 sequence, e.g., SEQ ID NO: 8053), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a MLVAV RT further comprising one, two, three, four, or five mutations selected from the group consisting of D200N, T330P, L603W, T306K, and W313F, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a MLVAV RT further comprising one, two, or three mutations selected from the group consisting of D200N, T330P, and L603W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a MLVBM RT (e.g., an MLVBM_Q7SVK7 sequence, e.g., SEQ ID NO: 8056), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a MLVBM RT further comprising one, two, three, four, or five mutations selected from the group consisting of D199N, T329P, L602W, T305K, and W312F, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a MLVBM RT further comprising one, two, or three mutations selected from the group consisting of D200N, T330P, and L603W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a MLVCB RT (e.g., an MLVCB_P08361 sequence, e.g., SEQ ID NO: 8062), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a MLVCB RT further comprising one, two, three, four, or five mutations selected from the group consisting of D200N, T330P, L603W, T306K, and W313F, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a MLVCB RT further comprising one, two, or three mutations selected from the group consisting of D200N, T330P, and L603W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a MLVFF RT, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a MLVFF RT further comprising one, two, three, four, or five mutations selected from the group consisting of D200N, T330P, L603W, T306K, and W313F, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a MLVFF RT further comprising one, two, or three mutations selected from the group consisting of D200N, T330P, and L603W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a MLVMS RT (e.g., an MLVMS_reference sequence, e.g., SEQ ID NO: 8137; or an MLVMS_P03355 sequence, e.g., SEQ ID NO: 8070), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a MLVMS RT further comprising one, two, three, four, five, or six mutations selected from the group consisting of D200N, T330P, L603W, T306K, W313F, and H8Y, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a MLVMS RT further comprising one, two, three, four, or five mutations selected from the group consisting of D200N, T330P, L603W, T306K, and W313F, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a MLVMS RT further comprising one, two, or three mutations selected from the group consisting of D200N, T330P, and L603W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a PERV RT (e.g., a PERV_Q4VFZ2 sequence, e.g., SEQ ID NO: 8099), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a PERV RT further comprising one, two, three, four, or five mutations selected from the group consisting of D196N, E326P, L599W, T302K, and W309F, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a PERV RT further comprising one, two, or three mutations selected from the group consisting of D196N, E326P, and L599W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a SFV1 RT (e.g., an SFV1_P23074 sequence, e.g., SEQ ID NO: 8105), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a SFV1 RT further comprising one, two, three, or four mutations selected from the group consisting of D24N, T296N, N420P, and L396K, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a SFV1 RT further comprising one, two, or three mutations selected from the group consisting of D24N, T296N, and N420P, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a SFV1 RT further comprising the mutation D24N, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a SFV3L RT (e.g., an SFV3L_P27401 sequence, e.g., SEQ ID NO: 8111), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a SFV3L RT further comprising one, two, three, or four mutations selected from the group consisting of D24N, T296N, N422P, and L396K, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a SFV3L RT further comprising one, two, or three mutations selected from the group consisting of D24N, T296N, and N422P, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a SFV3L RT further comprising the mutation D24N, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a SFV3L RT further comprising one, two, or three mutations selected from the group consisting of T307N, N333P, and L307K, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a SFV3L RT further comprising one or two mutations selected from the group consisting of T307N and N333P, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a WMSV RT (e.g., a WMSV_P03359 sequence, e.g., SEQ ID NO: 8131), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a WMSV RT further comprising one, two, three, four, or five mutations selected from the group consisting of D198N, E328P, L600W, T304K, and W311F, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of a WMSV RT further comprising one, two, or three mutations selected from the group consisting of D198N, E328P, and L600W, or a corresponding position in a homologous RT domain.
  • the RT domain comprises the amino acid sequence of an RT domain of a XMRV6 RT (e.g., an XMRV6_A1Z651 sequence, e.g., SEQ ID NO: 8134), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the RT domain comprises the amino acid sequence of a XMRV6 RT further comprising one, two, three, four, or five mutations selected from the group consisting of D200N, T330P, L603W, T306K, and W313F, or a corresponding position in a homologous RT domain. In some embodiments, the RT domain comprises the amino acid sequence of a XMRV6 RT further comprising one, two, or three mutations selected from the group consisting of D200N, T330P, and L603W, or a corresponding position in a homologous RT domain.
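Several of the embodiments above refer to "a corresponding position in a homologous RT domain", and the same substitution is listed at shifted positions across families (for example, D200N for some families and D198N for others). A minimal sketch of how a position could be mapped through a pairwise alignment is given below. It assumes the two sequences have already been aligned by some external tool and that gaps are written as `-`. The function name and the toy alignment rows are hypothetical and are not sequences from this disclosure.

```python
def corresponding_position(aligned_ref, aligned_hom, ref_position):
    """Map a 1-based reference position to the homolog's 1-based position.

    `aligned_ref` and `aligned_hom` are the two rows of a pairwise
    alignment (equal length, '-' marks a gap). Returns None when the
    reference residue is aligned to a gap in the homolog.
    """
    if len(aligned_ref) != len(aligned_hom):
        raise ValueError("aligned rows must have equal length")
    ref_count = 0
    hom_count = 0
    for ref_char, hom_char in zip(aligned_ref, aligned_hom):
        if ref_char != "-":
            ref_count += 1
        if hom_char != "-":
            hom_count += 1
        if ref_char != "-" and ref_count == ref_position:
            return hom_count if hom_char != "-" else None
    raise ValueError("ref_position is beyond the reference sequence")


# Toy alignment: the homolog lacks the first two reference residues,
# so reference position 4 (D) corresponds to homolog position 2 (D).
reference_row = "MKTDLLWA"
homolog_row = "--TDLLWA"
print(corresponding_position(reference_row, homolog_row, 4))  # 2
```

In this toy case a two-residue offset at the N-terminus shifts every downstream position by two, which is the kind of shift that would turn a position 200 in one domain into a position 198 in another.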
  • the RT domain of a gene modifying polypeptide comprises the amino acid sequence of an RT domain of an AVIRE RT, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the gene modifying polypeptide further comprises a linker having at least 99% or 100% identity to SEQ ID NO: 5217.
  • the RT domain of a gene modifying polypeptide comprises the amino acid sequence of an RT domain of an MLVMS RT, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the gene modifying polypeptide further comprises a linker having at least 99% or 100% identity to SEQ ID NO: 5217.
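The identity thresholds recited above (at least 70%, 75%, 80%, 85%, 90%, 95%, or 99%) can be illustrated with a simple calculation. The sketch below assumes two sequences that have already been aligned to equal length and counts identical, non-gap columns over the alignment length. This is one common convention only. The definition of percent identity that governs this disclosure, including the alignment method, may differ. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both rows.

    Assumes pre-aligned rows of equal length with '-' for gaps; a column
    containing a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold_percent):
    """True when the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Toy ten-residue rows differing at one position: 9 of 10 columns match.
row_a = "ACDEFGHIKL"
row_b = "ACDEFGHIKV"
print(percent_identity(row_a, row_b))              # 90.0
print(meets_identity_threshold(row_a, row_b, 90))  # True
print(meets_identity_threshold(row_a, row_b, 95))  # False
```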
  • an RT domain (e.g., as listed in Table 1) comprises one or more mutations as listed in Table 4 below.
  • an RT domain as listed in Table 1 comprises one, two, three, four, five, or six of the mutations listed in the corresponding row of Table 4 below. Table 4.
  • RT domain mutations (relative to corresponding wild-type sequences as listed in the corresponding row of Table 1). Columns: RT Domain Name, Mutation(s). FFV_O93209_2mutA D21N T293N T419P L393K FFV_O93209-Pro F; MLVAV_P03356 MLVAV_P03356_3mut D200N T330P L603W; MMTVB_P03365-Pro MMTVB_P03365-Pro_2mut G309P; WMSV_P03359 WMSV_P03359_3mut D198N E328P L600W In some
  • the Cas domain can direct the gene modifying polypeptide to a target site specified by a gRNA spacer, thereby modifying a target nucleic acid sequence in “cis”.
  • the Cas domain comprises two or more smaller domains, e.g., a DNA binding domain and an endonuclease domain.
  • the Cas domain possesses the function of DNA target site cleavage via an endonuclease domain.
  • the Cas domain has DNA binding activity. It is understood that when a Cas domain is said to bind to a target nucleic acid sequence, in some embodiments, the binding is mediated by a gRNA spacer.
  • the Cas domain has RNA binding activity, e.g., the Cas domain may bind the gRNA scaffold region of the template RNA.
  • CRISPR endonucleases identified from various prokaryotic species have unique PAM sequence requirements, e.g., as listed for exemplary Cas enzymes in Table 5. An example of a PAM sequence is 5′-NGG (Streptococcus pyogenes).
  • Some endonucleases, e.g., Cas9 endonucleases, are associated with G-rich PAM sites, e.
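A PAM requirement such as 5′-NGG can be read as a short pattern in which N stands for any base. The sketch below scans one strand of a DNA string for matches to such a pattern using a small subset of the IUPAC ambiguity codes. The function name, the code table, and the example sequence are illustrative assumptions. The scan does not check the opposite strand or the presence of a protospacer.

```python
# Subset of IUPAC nucleotide codes sufficient for simple PAM patterns.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}


def find_pam_sites(dna, pam="NGG"):
    """Return 0-based start indices where `pam` matches `dna` (same strand)."""
    dna = dna.upper()
    hits = []
    for start in range(len(dna) - len(pam) + 1):
        window = dna[start:start + len(pam)]
        # Every base in the window must be allowed by the pattern code.
        if all(base in IUPAC[code] for base, code in zip(window, pam)):
            hits.append(start)
    return hits


# Toy sequence (hypothetical): TGG at index 1 and AGG at index 6 match NGG.
print(find_pam_sites("ATGGCTAGGCC"))  # [1, 6]
```

Passing a different pattern string, e.g., `pam="NGT"`, reuses the same scan for an enzyme with altered PAM specificity.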
  • a gene modifying polypeptide may comprise the amino acid sequence of SEQ ID NO: 4000 below, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identity thereto.
  • the amino acid sequence of SEQ ID NO: 4000 below, or the sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identity thereto is positioned at the N-terminal end of the gene modifying polypeptide.
  • the amino acid sequence of SEQ ID NO: 4000 below, or the sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identity thereto is positioned within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, or 30 amino acids of the N-terminal end of the gene modifying polypeptide.
  • the gene modifying polypeptide comprises a GG amino acid sequence between the Cas domain and the linker, an AG amino acid sequence between the RT domain and the second NLS, and/or a GG amino acid sequence between the linker and the RT domain.
  • N-terminal sequence comprising an NLS (bold) and an SpCas9 domain with N863A:
  • a gene modifying polypeptide may comprise the amino acid sequence of SEQ ID NO: 4001 below, or a sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identity thereto.
  • the amino acid sequence of SEQ ID NO: 4001 below, or the sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identity thereto is positioned at the C-terminal end of the gene modifying polypeptide.
  • the amino acid sequence of SEQ ID NO: 4001 below is positioned within 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, or 30 amino acids of the C-terminal end of the gene modifying polypeptide.
  • Exemplary C-terminal sequence comprising an NLS: AGKRTADGSEFEKRTADGSEFESPKKKAKVE (SEQ ID NO: 4001)
  • Exemplary benchmarking sequence MPAAKRVKLDGGDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETA EATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYH EKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEEN PINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKD TYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALV RQLPEK
  • a Cas protein comprises E1369R, E1449H, and R1556A mutations or analogous substitutions to the amino acids corresponding to said positions. In some embodiments, a Cas protein comprises E782K, N968K, and R1015H mutations or analogous substitutions to the amino acids corresponding to said positions. In some embodiments, a Cas protein comprises D1135V, R1335Q, and T1337R mutations or analogous substitutions to the amino acids corresponding to said positions. In some embodiments, a Cas protein comprises S542R and K607R mutations or analogous substitutions to the amino acids corresponding to said positions.
  • a Cas protein comprises S542R, K548V, and N552R mutations or analogous substitutions to the amino acids corresponding to said positions. Exemplary advances in the engineering of Cas enzymes to recognize altered PAM sequences are reviewed in Collias et al., Nature Communications 12:555 (2021), incorporated herein by reference in its entirety.
  • the Cas protein is catalytically active and cuts one or both strands of the target DNA site. In some embodiments, cutting the target DNA site is followed by formation of an alteration, e.g., an insertion or deletion, e.g., by the cellular repair machinery.
  • the Cas protein is modified to deactivate or partially deactivate the nuclease, e.g., nuclease-deficient Cas9.
  • wild-type Cas9 generates double-strand breaks (DSBs) at specific DNA sequences targeted by a gRNA
  • a number of CRISPR endonucleases having modified functionalities are available, for example: a “nickase” version of Cas9 that has been partially deactivated generates only a single-strand break; a catalytically inactive Cas9 (“dCas9”) does not cut target DNA.
  • the endonuclease domain has nickase activity and cleaves one strand of a target DNA.
  • nickase activity reduces the formation of double-stranded breaks at the target site.
  • the endonuclease domain has nickase activity and does not form double-stranded breaks.
  • the endonuclease domain forms single-stranded breaks at a higher frequency than double-stranded breaks, e.g., at least 90%, 95%, 96%, 97%, 98%, or 99% of the breaks are single- stranded breaks, or less than 10%, 5%, 4%, 3%, 2%, or 1% of the breaks are double-stranded breaks.
  • the endonuclease forms substantially no double-stranded breaks.
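The break-frequency thresholds above (e.g., at least 90%, 95%, or 99% of breaks being single-stranded) reduce to a simple proportion. The sketch below computes that proportion from counts of observed single-stranded and double-stranded breaks. How such counts would be measured is outside this sketch, and the function names and example counts are hypothetical.

```python
def single_strand_break_fraction(ssb_count, dsb_count):
    """Fraction of all observed breaks that are single-stranded."""
    total = ssb_count + dsb_count
    if total == 0:
        raise ValueError("no breaks observed")
    return ssb_count / total


def meets_nickase_threshold(ssb_count, dsb_count, min_fraction=0.90):
    """True when single-stranded breaks make up at least `min_fraction`."""
    return single_strand_break_fraction(ssb_count, dsb_count) >= min_fraction


# Hypothetical counts: 97 single-stranded and 3 double-stranded breaks.
print(single_strand_break_fraction(97, 3))      # 0.97
print(meets_nickase_threshold(97, 3, 0.95))     # True
print(meets_nickase_threshold(97, 3, 0.99))     # False
```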
  • a catalytically inactive or partially inactive CRISPR/Cas domain comprises a Cas protein comprising one or more mutations, e.g., one or more of the mutations listed in Table 5.
  • a Cas protein described on a given row of Table 5 comprises one, two, three, or all of the mutations listed in the same row of Table 5.
  • a Cas protein, e.g., not described in Table 5, comprises one, two, three, or all of the mutations listed in a row of Table 5 or a corresponding mutation at a corresponding site in that Cas protein.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D11 mutation (e.g., D11A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a H969 mutation (e.g., H969A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a N995 mutation (e.g., N995A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, comprises mutations at one, two, or three of positions D11, H969, and N995 (e.g., D11A, H969A, and N995A mutations) or analogous substitutions to the amino acids corresponding to said positions.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D10 mutation (e.g., a D10A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a H557 mutation (e.g., a H557A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein comprises a D10 mutation (e.g., a D10A mutation) and a H557 mutation (e.g., a H557A mutation) or analogous substitutions to the amino acids corresponding to said positions.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D839 mutation (e.g., a D839A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a H840 mutation (e.g., a H840A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a N863 mutation (e.g., a N863A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein comprises a D10 mutation (e.g., D10A), a D839 mutation (e.g., D839A), a H840 mutation (e.g., H840A), and a N863 mutation (e.g., N863A) or analogous substitutions to the amino acids corresponding to said positions.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a E993 mutation (e.g., a E993A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D917 mutation (e.g., a D917A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a E1006 mutation (e.g., a E1006A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D1255 mutation (e.g., a D1255A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, comprises a D917 mutation (e.g., D917A), a E1006 mutation (e.g., E1006A), and a D1255 mutation (e.g., D1255A) or analogous substitutions to the amino acids corresponding to said positions.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D16 mutation (e.g., a D16A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a D587 mutation (e.g., a D587A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a partially deactivated Cas domain has nickase activity.
  • a partially deactivated Cas9 domain is a Cas9 nickase domain.
  • the catalytically inactive Cas domain or dead Cas domain produces no detectable double strand break formation.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a H588 mutation (e.g., a H588A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, or partially deactivated Cas9 protein comprises a N611 mutation (e.g., a N611A mutation) or an analogous substitution to the amino acid corresponding to said position.
  • a catalytically inactive Cas9 protein, e.g., dCas9, comprises a D16 mutation (e.g., D16A), a D587 mutation (e.g., D587A), a H588 mutation (e.g., H588A), and a N611 mutation (e.g., N611A) or analogous substitutions to the amino acids corresponding to said positions.
  • an endonuclease domain or DNA binding domain comprises a Streptococcus pyogenes Cas9 (SpCas9) or a functional fragment or variant thereof.
  • the endonuclease domain or DNA binding domain comprises a modified SpCas9.
  • the modified SpCas9 comprises a modification that alters protospacer-adjacent motif (PAM) specificity.
  • the PAM has specificity for the nucleic acid sequence 5′-NGT-3′.
  • the modified SpCas9 comprises one or more amino acid substitutions, e.g., at one or more of positions L1111, D1135, G1218, E1219, A1322, or R1335, e.g., selected from L1111R, D1135V, G1218R, E1219F, A1322R, R1335V.
  • the modified SpCas9 comprises the amino acid substitution T1337R and one or more additional amino acid substitutions, e.g., selected from L1111, D1135L, S1136R, G1218S, E1219V, D1332A, D1332S, D1332T, D1332V, D1332L, D1332K, D1332R, R1335Q, T1337, T1337L, T1337Q, T1337I, T1337V, T1337F, T1337S, T1337N, T1337K, T1337H, T1337Q, and T1337M, or corresponding amino acid substitutions thereto.
  • the modified SpCas9 comprises: (i) one or more amino acid substitutions selected from D1135L, S1136R, G1218S, E1219V, A1322R, R1335Q, and T1337; and (ii) one or more amino acid substitutions selected from L1111R, G1218R, E1219F, D1332A, D1332S, D1332T, D1332V, D1332L, D1332K, D1332R, T1337L, T1337I, T1337V, T1337F, T1337S, T1337N, T1337K, T1337R, T1337H, T1337Q, and T1337M, or corresponding amino acid substitutions thereto.
  • the Cas9 comprises one or more substitutions, e.g., selected from H840A, D10A, P475A, W476A, N477A, D1125A, W1126A, and D1127A.
  • the Cas9 comprises one or more mutations at positions selected from: D10, G12, G17, E762, H840, N854, N863, H982, H983, A984, D986, and/or A987, e.g., one or more substitutions selected from D10A, G12A, G17A, E762A, H840A, N854A, N863A, H982A, H983A, A984A, and/or D986A.
  • the gene modifying polypeptide comprises spCas9, spCas9-VRQR, spCas9-VRER, spCas9-MQKSER, spCas9-LRKIQK, or spCas9-LRVSQL.
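The substitution labels recited above (e.g., D10A, H840A, L1111R) follow the usual convention of wild-type residue, 1-based position, and replacement residue. The following is a minimal Python sketch of parsing and applying such labels. The function names and the ten-residue toy sequence are hypothetical and chosen only for illustration; position-only labels such as T1337 are not handled.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'D10A' into (wild_type, position, replacement)."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_substitutions(protein, labels):
    """Apply each substitution, checking that the wild-type residue matches."""
    residues = list(protein)
    for label in labels:
        wild_type, position, replacement = parse_substitution(label)
        if position > len(residues) or residues[position - 1] != wild_type:
            raise ValueError(f"{label}: residue {position} is not {wild_type}")
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy sequence (not a sequence from this disclosure).
toy_protein = "MDKKYSIGLD"
print(parse_substitution("L1111R"))
print(apply_substitutions(toy_protein, ["D10A"]))
```

Checking the wild-type residue before substituting guards against numbering that is offset relative to the reference sequence, which is the situation the phrase "analogous substitutions to the amino acids corresponding to said positions" addresses.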
Linkers
  • In some embodiments, a gene modifying polypeptide may comprise a linker, e.g., a peptide linker, e.g., a linker as described in Table 7.
  • a gene modifying polypeptide comprises, in an N-terminal to C-terminal direction, a Cas domain (e.g., a Cas domain of Table 6), a linker of Table 7 (or a sequence having at least 70%, 80%, 85%, 90%, 95%, or 99% identity thereto), and an RT domain (e.g., an RT domain of Table 1).
  • a gene modifying polypeptide comprises a flexible linker between the endonuclease and the RT domain, e.g., a linker comprising the amino acid sequence SGGSSGGSSGSETPGTSESATPESSGGSSGGSS (SEQ ID NO: 11,002).
  • an RT domain of a gene modifying polypeptide may be located C-terminal to the Cas domain. In some embodiments, an RT domain of a gene modifying polypeptide may be located N-terminal to the Cas domain.
  • Table 7. Exemplary linker sequences
      Amino Acid Sequence | SEQ ID NO
      GGGGSGGGGSGGGGSGGGGS | 5110
      GGGGSGGGGSGGGGSGGGGSGGGGS | 5111
      GGGPAP | 5149
      PAPGGG | 5150
      GGSPAPEAAAK | 5188
      EAAAKGGSPAP | 5189
  • a linker of a gene modifying polypeptide comprises a motif chosen from: (SGGS) n (SEQ ID NO: 5025), (GGGS) n (SEQ ID NO: 50
  • a gene modifying polypeptide comprises an amino acid sequence as listed in Table 8, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • a gene modifying polypeptide comprises a linker comprising a linker sequence as listed in Table 8, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • a gene modifying polypeptide comprises an RT domain comprising an RT domain sequence as listed in Table 8, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • a gene modifying polypeptide comprises: (i) a linker comprising a linker sequence as listed in a row of Table 8, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto; and (ii) an RT domain comprising an RT domain sequence as listed in the same row of Table 8, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the gene modifying polypeptide comprises a Cas domain according to SEQ ID NO: 11,096, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto, together with an RT and linker of Table 8.
  • the gene modifying polypeptide comprises a Cas domain according to SEQ ID NO: 11,096 together with an RT and linker of Table 8. Table 8.
  • a gene modifying polypeptide comprises an amino acid sequence as listed in Table 9, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • a gene modifying polypeptide comprises a linker comprising a linker sequence as listed in Table 9, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • a gene modifying polypeptide comprises an RT domain comprising an RT domain sequence as listed in Table 9, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • a gene modifying polypeptide comprises: (i) a linker comprising a linker sequence as listed in a row of Table 9, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto; and (ii) an RT domain comprising an RT domain sequence as listed in the same row of Table 9, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the gene modifying polypeptide comprises a Cas domain according to SEQ ID NO: 11,096, or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto, together with an RT and linker of Table 9.
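The identity thresholds recited above (at least 70%, 75%, 80%, 85%, 90%, 95%, or 99%) can be illustrated with a short calculation. The sketch below is a simplification that assumes the two sequences are already aligned and of equal length; in practice percent identity is computed over a sequence alignment that may include gaps. The function names and the single-substitution variant are hypothetical.

```python
THRESHOLDS = (70, 75, 80, 85, 90, 95, 99)


def percent_identity(a, b):
    """Percent of positions that are identical in two pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def thresholds_met(a, b):
    """Return the recited identity thresholds that the pair satisfies."""
    identity = percent_identity(a, b)
    return [t for t in THRESHOLDS if identity >= t]


# Linker of SEQ ID NO: 5110 versus a hypothetical one-residue variant.
reference = "GGGGSGGGGSGGGGSGGGGS"
variant = "GGGGSGGGGSGGGGSGGGGA"
print(percent_identity(reference, variant))
print(thresholds_met(reference, variant))
```

For a 20-residue linker, a single substitution leaves 19 of 20 positions identical, so the variant meets every recited threshold up to 95% but not 99%.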
  • the gene modifying polypeptide comprises a Cas domain according to SEQ ID NO: 11,096 together with an RT and linker of Table 9.
  • Table 9. Exemplary gene modifying polypeptides
      Linker Sequence | SEQ ID NO of linker | RT name
      GGGGSGGGGSGGGGSGGGGS | 15,405 | MLVCB_P08361_3mutA
      EAAAKGGSPAP | 15,432 | PERV_Q4VFZ2_3mut
      EAAAKPAPGGS | 15,433 | MLVCB_P08361_3mutA
Systems
  • the disclosure relates to a system comprising a nucleic acid molecule encoding a gene modifying polypeptide (e.g., as described herein) and a template nucleic acid (e.g., a template RNA, e.g., as described herein).
  • the nucleic acid molecule encoding the gene modifying polypeptide comprises one or more silent mutations in the coding region (e.g., in the sequence encoding the RT domain) relative to a nucleic acid molecule as described herein.
  • the system further comprises a gRNA (e.g., a gRNA that binds to a polypeptide that induces a nick, e.g., in the opposite strand of the target DNA bound by the gene modifying polypeptide).
  • a gene editor system RNA further comprises an intracellular localization sequence, e.g., a nuclear localization sequence (NLS).
  • a gene modifying polypeptide comprises an NLS as comprised in SEQ ID NO: 4000 and/or SEQ ID NO: 4001, or an NLS having an amino acid sequence having at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity thereto.
  • the nuclear localization sequence may be an RNA sequence that promotes the import of the RNA into the nucleus.
  • the nuclear localization signal is located on the template RNA.
  • the gene modifying polypeptide is encoded on a first RNA
  • the template RNA is a second, separate, RNA
  • the nuclear localization signal is located on the template RNA and not on an RNA encoding the gene modifying polypeptide.
  • the RNA encoding the gene modifying polypeptide is targeted primarily to the cytoplasm to promote its translation, while the template RNA is targeted primarily to the nucleus to promote insertion into the genome.
  • the nuclear localization signal is at the 3′ end, 5′ end, or in an internal region of the template RNA.
  • the nuclear localization sequence is situated inside of an intron.
  • a plurality of the same or different nuclear localization signals are in the RNA, e.g., in the template RNA.
  • the nuclear localization signal is less than 5, 10, 25, 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900 or 1000 bp in length.
  • RNA nuclear localization sequences can be used. For example, Lubelsky and Ulitsky, Nature 555 (107-111), 2018 describe RNA sequences which drive RNA localization into the nucleus.
  • the nuclear localization signal is a SINE-derived nuclear RNA localization (SIRLOIN) signal.
  • the nuclear localization signal binds a nuclear-enriched protein.
  • the nuclear localization signal binds the HNRNPK protein.
  • the nuclear localization signal is rich in pyrimidines, e.g., is a C/T rich, C/U rich, C rich, T rich, or U rich region.
  • the nuclear localization signal is derived from a long non-coding RNA.
  • the nuclear localization signal is derived from MALAT1 long non-coding RNA or is the 600 nucleotide M region of MALAT1 (described in Miyagawa et al., RNA 18, (738-751), 2012).
  • the nuclear localization signal is derived from BORG long non-coding RNA or is an AGCCC motif (described in Zhang et al., Molecular and Cellular Biology 34, 2318-2329 (2014)). In some embodiments, the nuclear localization sequence is described in Shukla et al., The EMBO Journal e98452 (2016).
  • the nuclear localization signal is derived from a retrovirus.
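The pyrimidine-rich character of an RNA nuclear localization signal described above can be quantified as the fraction of C, T, or U bases. The sketch below only computes that fraction; the text does not state a cutoff for "rich", so none is assumed here. The function name is hypothetical.

```python
def pyrimidine_fraction(sequence):
    """Fraction of bases that are pyrimidines (C, T, or U)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must be non-empty")
    return sum(base in "CTU" for base in sequence) / len(sequence)


# The AGCCC motif mentioned above contains three pyrimidines out of five bases.
print(pyrimidine_fraction("AGCCC"))
```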
  • a polypeptide described herein comprises one or more (e.g., 2, 3, 4, 5) nuclear targeting sequences, for example a nuclear localization sequence (NLS).
  • the NLS is a bipartite NLS.
  • an NLS facilitates the import of a protein comprising an NLS into the cell nucleus.
  • the NLS is fused to the N-terminus of a gene modifying polypeptide as described herein.
  • the NLS is fused to the C-terminus of the gene modifying polypeptide.
  • the NLS is fused to the N-terminus or the C-terminus of a Cas domain.
  • a linker sequence is disposed between the NLS and the neighboring domain of the gene modifying polypeptide.
  • an NLS comprises the amino acid sequence PKKRKVEGADKRTADGSEFESPKKKRKV (SEQ ID NO: 5010), RKSGKIAAIWKRPRKPKKKRKV (SEQ ID NO: 5011), KRTADGSEFESPKKKRKV (SEQ ID NO: 5012), KKTELQTTNAENKTKKL (SEQ ID NO: 5013), KRGINDRNFWRGENGRKTR (SEQ ID NO: 5014), KRPAATKKAGQAKKKK (SEQ ID NO: 5015) or a functional fragment or variant thereof.
  • the first NLS comprises the amino acid sequence PAAKRVKLD (SEQ ID NO: 11,095), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • Exemplary NLS sequences are also described in PCT/EP2000/011690, the contents of which are incorporated herein by reference for their disclosure of exemplary nuclear localization sequences.
  • an NLS comprises an amino acid sequence as disclosed in Table 10.
  • An NLS of this table may be utilized with one or more copies in a polypeptide in one or more locations in a polypeptide, e.g., 1, 2, 3 or more copies of an NLS in an N-terminal domain, between peptide domains, in a C-terminal domain, or in a combination of locations, in order to improve subcellular localization to the nucleus.
  • Multiple unique sequences may be used within a single polypeptide. Sequences may be naturally monopartite or bipartite, e.g., having one or two stretches of basic amino acids, or may be used as chimeric bipartite sequences.
  • the NLS sequence (e.g., second NLS sequence) comprises a plurality of partial NLS sequences.
  • the NLS sequence e.g., the second NLS sequence, comprises a first partial NLS sequence, e.g., comprising the amino acid sequence KRTADGSEFE (SEQ ID NO: 5350), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the NLS sequence comprises a second partial NLS sequence.
  • the NLS sequence comprises an SV40A5 NLS, e.g., a bipartite SV40A5 NLS, e.g., comprising the amino acid sequence KRTADGSEFESPKKKAKVE (SEQ ID NO: 5351), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the NLS sequence e.g., the second NLS sequence, comprises the amino acid sequence KRTADGSEFEKRTADGSEFESPKKKAKVE (SEQ ID NO: 5349), or an amino acid sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • Table 10. Exemplary nuclear localization signals for use in gene modifying systems
      Sequence | Sequence References | SEQ ID No.
      AHFKISGEKRPSTDPGKK | |
      KKTGKNRKLKSKRVKTR | Q9Z301, O54943, Q8K3T2 | 5249
      KKVSIAGQSGKLWRWKR | Q6YUL8 | 5250
      PKKGDKYDKTD | Q45FA5 | 5279
      PKKKSRK | O35914, Q01954 | 5280
      | Q8QPH4, Q809M7, A8C8X1, Q2VNC5, Q38SQ0, O89749, Q6DNQ9, Q809L9, Q0A429, Q20NV3, P16509, P16505, P16506 |
  • a monopartite NLS typically lacks a spacer.
  • An example of a bipartite NLS is the nucleoplasmin NLS, having the sequence KR[PAATKKAGQA]KKKK (SEQ ID NO: 5015), wherein the spacer is bracketed.
  • Another exemplary bipartite NLS has the sequence PKKKRKVEGADKRTADGSEFESPKKKRKV (SEQ ID NO: 5016).
  • Exemplary NLSs are described in International Application WO2020051561, which is herein incorporated by reference in its entirety, including for its disclosures regarding nuclear localization sequences.
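The bracket notation used above for the nucleoplasmin NLS, KR[PAATKKAGQA]KKKK, marks the spacer between the two basic stretches of a bipartite NLS. The following Python sketch splits such an annotated string into its three parts. The function name is hypothetical, and the sketch assumes exactly one bracketed spacer.

```python
def split_bipartite_nls(annotated):
    """Split 'first[spacer]second' into (first, spacer, second)."""
    start = annotated.index("[")
    end = annotated.index("]")
    if end < start:
        raise ValueError("closing bracket precedes opening bracket")
    return annotated[:start], annotated[start + 1:end], annotated[end + 1:]


first, spacer, second = split_bipartite_nls("KR[PAATKKAGQA]KKKK")
print(first, spacer, second)
# Joining the parts recovers the unannotated sequence of SEQ ID NO: 5015.
print(first + spacer + second)
```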
  • an intein-N (intN) domain may be fused to the N-terminal portion of a first domain of a gene modifying polypeptide described herein, and an intein-C (intC) domain may be fused to the C-terminal portion of a second domain of a gene modifying polypeptide described herein for the joining of the N-terminal portion to the C-terminal portion, thereby joining the first and second domains.
  • Inteins can occur as a self-splicing protein intron (e.g., peptide), e.g., which ligates flanking N-terminal and C-terminal exteins (e.g., fragments to be joined).
  • An intein may, in some instances, comprise a fragment of a protein that is able to excise itself and join the remaining fragments (the exteins) with a peptide bond in a process known as protein splicing. Inteins are also referred to as “protein introns.” The process of an intein excising itself and joining the remaining portions of the protein is herein termed “protein splicing” or “intein-mediated protein splicing.”
Additional Domains
  • the gene modifying polypeptide can bind a target DNA sequence and template nucleic acid (e.g., template RNA), nick the target site, and write (e.g., reverse transcribe) the template into DNA, resulting in a modification of the target site.
  • additional domains may be added to the polypeptide to enhance the efficiency of the process.
  • the gene modifying polypeptide may contain an additional DNA ligation domain to join reverse transcribed DNA to the DNA of the target site.
  • the polypeptide may comprise a heterologous RNA-binding domain.
  • the polypeptide may comprise a domain having 5′ to 3′ exonuclease activity (e.g., wherein the 5′ to 3′ exonuclease activity increases repair of the alteration of the target site, e.g., in favor of alteration over the original genomic sequence).
  • the polypeptide may comprise a domain having 3′ to 5′ exonuclease activity, e.g., proof-reading activity.
  • the writing domain e.g., RT domain
  • the gene modifying systems described herein can modify a host target DNA site using a template nucleic acid sequence.
  • the gene modifying systems described herein transcribe an RNA sequence template into host target DNA sites by target-primed reverse transcription (TPRT).
  • the gene modifying system can insert an object sequence into a target genome without the need for exogenous DNA sequences to be introduced into the host cell (unlike, for example, CRISPR systems), as well as eliminate an exogenous DNA insertion step.
  • the gene modifying system can also delete a sequence from the target genome or introduce a substitution using an object sequence. Therefore, the gene modifying system provides a platform for the use of customized RNA sequence templates containing object sequences, e.g., sequences comprising heterologous gene coding and/or function information.
  • the template nucleic acid comprises one or more sequences (e.g., 2 sequences) that bind the gene modifying polypeptide.
  • a template RNA can comprise a gRNA sequence, e.g., to direct the gene modifying polypeptide to a target site of interest.
  • a template RNA comprises (e.g., from 5′ to 3′) (i) a gRNA spacer that binds a target site (e.g., a second strand of a site in a target genome), (ii) a gRNA scaffold that binds a polypeptide described herein (e.g., a Cas domain of a gene modifying polypeptide), (iii) a heterologous object sequence comprising a mutation region (optionally the heterologous object sequence comprises, from 5’ to 3’, a first homology region, a mutation region, and a second homology region), and (iv) a primer binding site (PBS) sequence.
  • the template RNA has a poly-A tail at the 3′ end. In some embodiments the template RNA does not have a poly-A tail at the 3′ end.
  • the template nucleic acid is a template RNA.
  • the template RNA comprises one or more modified nucleotides. For example, in some embodiments, the template RNA comprises one or more deoxyribonucleotides. In some embodiments, regions of the template RNA are replaced by DNA nucleotides, e.g., to enhance stability of the molecule.
  • the 3′ end of the template may comprise DNA nucleotides, while the rest of the template comprises RNA nucleotides that can be reverse transcribed.
  • the heterologous object sequence is primarily or wholly made up of RNA nucleotides (e.g., at least 90%, 95%, 98%, or 99% RNA nucleotides).
  • the PBS sequence is primarily or wholly made up of DNA nucleotides (e.g., at least 90%, 95%, 98%, or 99% DNA nucleotides).
  • a template RNA described herein may comprise, from 5’ to 3’: (1) a gRNA spacer; (2) a gRNA scaffold; (3) a heterologous object sequence; and (4) a primer binding site (PBS) sequence.
  • a template RNA described herein may comprise a gRNA spacer that directs the gene modifying system to a target nucleic acid, and a gRNA scaffold that promotes association of the template RNA with the Cas domain of the gene modifying polypeptide.
  • the systems described herein can also comprise a gRNA that is not part of a template nucleic acid.
  • a gRNA that comprises a gRNA spacer and gRNA scaffold, but not a heterologous object sequence or a PBS sequence can be used, e.g., to induce second strand nicking, e.g., as described in the section herein entitled “Second Strand Nicking”.
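The 5′ to 3′ component order of a template RNA described above (gRNA spacer, gRNA scaffold, heterologous object sequence, PBS sequence) can be sketched as a simple concatenation that also records where each component falls. All sequences below are hypothetical placeholders rather than sequences from this disclosure, and the class and method names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateRNA:
    """Components of a template RNA, listed in 5' to 3' order."""
    spacer: str
    scaffold: str
    heterologous_object: str
    pbs: str

    def components(self):
        return [
            ("spacer", self.spacer),
            ("scaffold", self.scaffold),
            ("heterologous_object", self.heterologous_object),
            ("pbs", self.pbs),
        ]

    def sequence(self):
        """Full 5' to 3' sequence."""
        return "".join(seq for _, seq in self.components())

    def spans(self):
        """Map each component name to its 0-based half-open (start, end) span."""
        result, offset = {}, 0
        for name, seq in self.components():
            result[name] = (offset, offset + len(seq))
            offset += len(seq)
        return result


# Placeholder sequences for illustration only.
template = TemplateRNA(
    spacer="GACU" * 5,
    scaffold="GUUUUAGAGC",
    heterologous_object="AACCGGUU",
    pbs="ACGUACGUACG",
)
print(template.sequence())
print(template.spans())
```

A gRNA used only for second strand nicking would carry a spacer and scaffold without the last two components.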
  • the gRNA is a short synthetic RNA composed of a scaffold sequence that participates in CRISPR-associated protein binding and a user-defined ~20 nucleotide targeting sequence for a genomic target. The structure of a complete gRNA was described by Nishimasu et al. Cell 156, P935-949 (2014).
  • the gRNA (also referred to as sgRNA for single-guide RNA) consists of crRNA- and tracrRNA-derived sequences connected by an artificial tetraloop.
  • the crRNA sequence can be divided into guide (20 nt) and repeat (12 nt) regions, whereas the tracrRNA sequence can be divided into anti-repeat (14 nt) and three tracrRNA stem loops (Nishimasu et al. Cell 156, P935-949 (2014)).
  • guide RNA sequences are generally designed to have a length of between 17 – 24 nucleotides (e.g., 19, 20, or 21 nucleotides) and be complementary to a targeted nucleic acid sequence.
  • Custom gRNA generators and algorithms are available commercially for use in the design of effective guide RNAs.
  • the gRNA comprises two RNA components from the native CRISPR system, e.g. crRNA and tracrRNA.
  • the gRNA may also comprise a chimeric, single guide RNA (sgRNA) containing sequence from both a tracrRNA (for binding the nuclease) and at least one crRNA (to guide the nuclease to the sequence targeted for editing/binding).
  • a gRNA spacer comprises a nucleic acid sequence that is complementary to a DNA sequence associated with a target gene.
  • the region of the template nucleic acid, e.g., template RNA, comprising the gRNA adopts an underwound ribbon-like structure of gRNA bound to target DNA (e.g., as described in Mulepati et al. Science 19 Sep 2014:Vol. 345, Issue 6203, pp. 1479-1484).
  • the region of the template nucleic acid, e.g., template RNA, comprising the gRNA may tolerate increased mismatching with the target site at some interval, e.g., every sixth base.
  • the region of the template nucleic acid, e.g., template RNA, comprising the gRNA comprising homology to the target site may possess wobble positions at a regular interval, e.g., every sixth base, that do not need to base pair with the target site.
  • the template nucleic acid (e.g., template RNA) has at least 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 bases of at least 80%, 85%, 90%, 95%, 99%, or 100% homology to the target site, e.g., at the 5’ end, e.g., comprising a gRNA spacer sequence of length appropriate to the Cas9 domain of the gene modifying polypeptide (Table 6).
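The tolerance for wobble positions at a regular interval, e.g., every sixth base, can be illustrated by ignoring those positions when comparing a guide region with its target. The sketch below makes two assumptions that are not stated in the text: positions are counted from the 5′ end of the guide starting at 1, and the target is supplied as the protospacer written in the same sense and alphabet as the guide, so that matching is checked by identity rather than by complementarity. The function name is hypothetical.

```python
def mismatches_outside_wobble(guide, protospacer, interval=6):
    """Return 1-based mismatch positions, skipping every `interval`-th base."""
    if len(guide) != len(protospacer):
        raise ValueError("guide and protospacer must have equal length")
    return [
        position
        for position, (g, t) in enumerate(zip(guide, protospacer), start=1)
        if g != t and position % interval != 0
    ]


guide = "ACGUACGUACGU"
# Differences only at positions 6 and 12 are tolerated as wobble positions.
print(mismatches_outside_wobble(guide, "ACGUAGGUACGA"))
# A difference at position 7 is reported.
print(mismatches_outside_wobble(guide, "ACGUACCUACGU"))
```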
  • Table 11 provides parameters to define components for designing gRNA and/or template RNAs to apply Cas variants listed in Table 6 for gene modifying.
  • the cut site indicates the validated or predicted protospacer adjacent motif (PAM) requirements, validated or predicted location of cut site (relative to the most upstream base of the PAM site).
  • the gRNA for a given enzyme can be assembled by concatenating the crRNA, Tetraloop, and tracrRNA sequences, and further adding a 5′ spacer of a length within Spacer (min) and Spacer (max) that matches a protospacer at a target site. Further, the predicted location of the ssDNA nick at the target is important for designing a PBS sequence of a Template RNA that can anneal to the sequence immediately 5′ of the nick in order to initiate target primed reverse transcription.
  • a gRNA scaffold described herein comprises a nucleic acid sequence comprising, in the 5’ to 3’ direction, a crRNA of Table 11, a tetraloop from the same row of Table 11, and a tracrRNA from the same row of Table 11, or a sequence having at least 70%, 80%, 85%, 90%, 95%, or 99% identity thereto.
  • the gRNA or template RNA comprising the scaffold further comprises a gRNA spacer having a length within the Spacer (min) and Spacer (max) indicated in the same row of Table 11.
  • the gRNA or template RNA having a sequence according to Table 11 is comprised by a system that further comprises a gene modifying polypeptide, wherein the gene modifying polypeptide comprises a Cas domain described in the same row of Table 11.
  • the RNA sequence may comprise U at every position shown as T in the sequence in Table 11. More specifically, the present disclosure provides an RNA sequence according to every gRNA scaffold sequence of Table 11, wherein the RNA sequence has a U in place of each T in the sequence in Table 11. Additionally, it is understood that terminal Us and Ts may optionally be added or removed from tracrRNA sequences and may be modified or unmodified when provided as RNA.
  • versions of gRNA scaffold sequences alternative to those exemplified in Table 11 may also function with the different Cas9 enzymes or derivatives thereof exemplified in Table 6, e.g., alternate gRNA scaffold sequences with nucleotide additions, substitutions, or deletions, e.g., sequences with stem-loop structures added or removed. It is contemplated herein that the gRNA scaffold sequences represent a component of gene modifying systems that can be similarly optimized for a given system, Cas-RT fusion polypeptide, indication, target mutation, template RNA, or delivery vehicle.
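The gRNA assembly described above (a 5′ spacer of a length between Spacer (min) and Spacer (max), followed by the crRNA, tetraloop, and tracrRNA of one table row, with U in place of each T) can be sketched as follows. The row values used here are hypothetical placeholders, not entries of Table 11, and the function name is an illustrative assumption.

```python
def build_grna(spacer, crrna, tetraloop, tracrrna, spacer_min, spacer_max):
    """Concatenate spacer + crRNA + tetraloop + tracrRNA and convert T to U."""
    if not spacer_min <= len(spacer) <= spacer_max:
        raise ValueError(
            f"spacer length {len(spacer)} outside {spacer_min}-{spacer_max}"
        )
    dna_style = spacer + crrna + tetraloop + tracrrna
    return dna_style.upper().replace("T", "U")


# Hypothetical row: short placeholder sequences and placeholder length bounds.
grna = build_grna(
    spacer="ACGT" * 5,
    crrna="GTTT",
    tetraloop="GAAA",
    tracrrna="AAAC",
    spacer_min=17,
    spacer_max=24,
)
print(grna)
```

Keeping the length check inside the assembly step means a spacer that is valid for one Cas variant is rejected when paired with a row whose bounds differ.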
  • a template RNA described herein may comprise a heterologous object sequence that the gene modifying polypeptide can use as a template for reverse transcription, to write a desired sequence into the target nucleic acid.
  • the heterologous object sequence comprises, from 5’ to 3’, a post-edit homology region, the mutation region, and a pre-edit homology region.
  • an RT performing reverse transcription on the template RNA first reverse transcribes the pre-edit homology region, then the mutation region, and then the post-edit homology region, thereby creating a DNA strand comprising the desired mutation with a homology region on either side.
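The order of reverse transcription described above follows from the RT reading the template RNA from its 3′ end: because the pre-edit homology region is the 3′-most part of the heterologous object sequence, it is copied first. The sketch below shows this with short hypothetical regions; the function name and sequences are illustrative assumptions.

```python
RNA_TO_DNA = {"A": "T", "C": "G", "G": "C", "U": "A"}


def reverse_transcribe(rna):
    """Return the cDNA (5' to 3') made by copying the RNA from its 3' end."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))


# Hypothetical regions, listed 5' to 3' as in the heterologous object sequence.
post_edit_homology = "GGAA"
mutation_region = "C"
pre_edit_homology = "UUAG"
heterologous_object = post_edit_homology + mutation_region + pre_edit_homology

cdna = reverse_transcribe(heterologous_object)
print(cdna)
# The cDNA begins with the copy of the pre-edit homology region, then the
# mutation region, then the post-edit homology region.
print(cdna.startswith(reverse_transcribe(pre_edit_homology)))
```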
  • the heterologous object sequence is at least 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 120, 140, 160, 180, 200, 500, or 1,000 nucleotides (nts) in length, or at least 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5, or 10 kilobases in length.
  • the heterologous object sequence is no more than 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 120, 140, 160, 180, 200, 500, 1,000, or 2000 nucleotides (nts) in length, or no more than 20, 15, 10, 9, 8, 7, 6, 5, 4, or 3 kilobases in length.
  • the heterologous object sequence is 30-1000, 40-1000, 50-1000, 60-1000, 70-1000, 74-1000, 75-1000, 76-1000, 77-1000, 78-1000, 79-1000, 80-1000, 85-1000, 90-1000, 100-1000, 120-1000, 140-1000, 160-1000, 180-1000, 200-1000, 500-1000, 30-500, 40-500, 50-500, 60-500, 70-500, 74-500, 75-500, 76-500, 77-500, 78-500, 79-500, 80-500, 85-500, 90-500, 100-500, 120-500, 140-500, 160-500, 180-500, 200-500, 30-200, 40-200, 50-200, 60-200, 70-200, 74-200, 75-200, 76-200, 77-200, 78-200, 79-200, 80-1000, 85-
  • the heterologous object sequence (e.g., of a system as described herein) is about 1-50, 50-100, 100-200, 200-300, 300-400, 400-500, 500-600, 600-700, 700-800, 800-900, 900-1000, or more, nucleotides in length.
  • the heterologous object sequence is 10-100, 10-90, 10-80, 10-70, 10-60, 10-50, 10-40, 10-30, or 10-20 nt in length, e.g., 10-80, 10-50, or 10-20 nt in length, e.g., about 10-20 nt in length.
  • the heterologous object sequence is 8-30, 9-25, 10-20, 11-16, or 12-15 nucleotides in length, e.g., is 11-16 nt in length.
  • a larger insertion size, larger region of editing e.g., the distance between a first edit/substitution and a second edit/substitution in the target region
  • greater number of desired edits e.g., mismatches of the heterologous object sequence to the target genome
  • the template nucleic acid comprises a customized RNA sequence template which can be identified, designed, engineered and constructed to contain sequences altering or specifying host genome function, for example by introducing a heterologous coding region into a genome; affecting or causing exon structure/alternative splicing, e.g., leading to exon skipping of one or more exons; causing disruption of an endogenous gene, e.g., creating a genetic knockout; causing transcriptional activation of an endogenous gene; causing epigenetic regulation of an endogenous DNA; causing up-regulation of one or more operably linked genes, e.g., leading to gene activation or overexpression; causing down-regulation of one or more operably linked genes, e.g., creating a genetic knock-down; etc.
  • a customized RNA sequence template can be engineered to contain sequences coding for exons and/or transgenes, provide binding sites for transcription factor activators, repressors, enhancers, etc., and combinations thereof.
  • a customized template can be engineered to encode a nucleic acid or peptide tag to be expressed in an endogenous RNA transcript or endogenous protein operably linked to the target site.
  • the coding sequence can be further customized with splice donor sites, splice acceptor sites, or poly-A tails.
  • the template nucleic acid (e.g., template RNA) of the system typically comprises an object sequence (e.g., a heterologous object sequence) for writing a desired sequence into a target DNA.
  • the object sequence may be coding or non-coding.
  • the template nucleic acid (e.g., template RNA) can be designed to result in insertions, mutations, or deletions at the target DNA locus.
  • the template nucleic acid may contain a heterologous sequence, wherein the reverse transcription will result in insertion of the heterologous sequence into the target DNA.
  • the RNA template may be designed to introduce a deletion into the target DNA.
  • the template nucleic acid may match the target DNA upstream and downstream of the desired deletion, wherein the reverse transcription will result in the copying of the upstream and downstream sequences from the template nucleic acid (e.g., template RNA) without the intervening sequence, e.g., causing deletion of the intervening sequence.
  • the template nucleic acid may be designed to introduce an edit into the target DNA.
  • the template RNA may match the target DNA sequence with the exception of one or more nucleotides, wherein the reverse transcription will result in the copying of these edits into the target DNA, e.g., resulting in mutations, e.g., transition or transversion mutations.
  • writing of an object sequence into a target site results in the substitution of nucleotides, e.g., where the full length of the object sequence corresponds to a matching length of the target site with one or more mismatched bases.
  • a heterologous object sequence may be designed such that a combination of sequence alterations may occur, e.g., a simultaneous addition and deletion, addition and substitution, or deletion and substitution.
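The deletion and substitution designs described above can be sketched at the sequence level: the desired product is written first, and a template is then derived such that reverse transcription yields that product. This is a simplified illustration under stated assumptions. It treats the target as a single DNA strand, ignores the choice of strand, homology lengths, and the PBS, and uses hypothetical function names and a toy target sequence.

```python
DNA_TO_RNA = {"A": "U", "C": "G", "G": "C", "T": "A"}
RNA_TO_DNA = {"A": "T", "C": "G", "G": "C", "U": "A"}


def deletion_product(target, start, end):
    """Target with the intervening bases target[start:end] removed."""
    return target[:start] + target[end:]


def substitution_product(target, index, base):
    """Target with a single base replaced at a 0-based index."""
    return target[:index] + base + target[index + 1:]


def template_for(written_dna):
    """RNA whose reverse transcription yields `written_dna`."""
    return "".join(DNA_TO_RNA[base] for base in reversed(written_dna))


def reverse_transcribe(rna):
    """cDNA (5' to 3') produced by copying the RNA from its 3' end."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))


target = "AACCGGTTAACC"  # toy target; the intervening sequence is GGTT
desired = deletion_product(target, 4, 8)
template = template_for(desired)
print(desired, template)
print(reverse_transcribe(template) == desired)
```

Because the template matches only the upstream and downstream sequences, the copy made from it lacks the intervening bases, which is how the deletion arises.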
  • the heterologous object sequence may contain an open reading frame or a fragment of an open reading frame.
  • the heterologous object sequence has a Kozak sequence.
  • the heterologous object sequence has an internal ribosome entry site.
  • the heterologous object sequence has a self-cleaving peptide such as a T2A or P2A site.
  • the heterologous object sequence has a start codon.
  • the template RNA has a splice acceptor site.
  • the template RNA has a splice donor site.
  • Exemplary splice acceptor and splice donor sites are described in WO2016044416, incorporated herein by reference in its entirety. Exemplary splice acceptor site sequences are known to those of skill in the art.
  • the template RNA has a microRNA binding site downstream of the stop codon.
  • the template RNA has a polyA tail downstream of the stop codon of an open reading frame.
  • the template RNA comprises one or more exons.
  • the template RNA comprises one or more introns.
  • the template RNA comprises a eukaryotic transcriptional terminator.
  • the template RNA comprises an enhanced translation element or a translation enhancing element.
  • the RNA comprises the human T-cell leukemia virus (HTLV-1) R region.
  • the RNA comprises a posttranscriptional regulatory element that enhances nuclear export, such as that of Hepatitis B Virus (HPRE) or Woodchuck Hepatitis Virus (WPRE).
  • the heterologous object sequence may contain a non-coding sequence.
  • the template nucleic acid (e.g., template RNA) comprises a tissue specific promoter or enhancer, each of which may be unidirectional or bidirectional.
  • the promoter is an RNA polymerase I promoter, RNA polymerase II promoter, or RNA polymerase III promoter.
  • the promoter comprises a TATA element.
  • the promoter comprises a B recognition element.
  • the promoter has one or more binding sites for transcription factors.
  • the template nucleic acid (e.g., template RNA) comprises a site that coordinates epigenetic modification.
  • the template nucleic acid (e.g., template RNA) comprises a CTCF site or a site targeted for DNA methylation.
  • the template nucleic acid (e.g., template RNA) comprises a gene expression unit composed of at least one regulatory region operably linked to an effector sequence.
  • the effector sequence may be a sequence that is transcribed into RNA (e.g., a coding sequence or a non-coding sequence such as a sequence encoding a micro RNA).
  • the heterologous object sequence of the template nucleic acid is inserted into a target genome in an endogenous intron.
  • the insertion of the heterologous object sequence into the target genome results in replacement of a natural exon or the skipping of a natural exon.
  • the template nucleic acid may be designed to cause an insertion in the target DNA.
  • the RNA template may be designed to write a deletion into the target DNA.
  • the template nucleic acid may match the target DNA upstream and downstream of the desired deletion, wherein the reverse transcription will result in the copying of the upstream and downstream sequences from the template nucleic acid (e.g., template RNA) without the intervening sequence, e.g., causing deletion of the intervening sequence.
  • the template nucleic acid may be designed to write an edit into the target DNA.
  • the template RNA may match the target DNA sequence with the exception of one or more nucleotides, wherein the reverse transcription will result in the copying of these edits into the target DNA, e.g., resulting in mutations, e.g., transition or transversion mutations.
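The deletion and substitution designs described in the preceding items can be illustrated with a minimal sketch. It assumes plain DNA-alphabet strings and 0-based, half-open coordinates; the function names `deletion_edit` and `classify_substitution` are hypothetical and are not taken from the specification. The sketch only models the sequence that would be written, not the template RNA layout or the reverse transcription chemistry.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def deletion_edit(target: str, del_start: int, del_end: int, flank: int) -> str:
    """Join the flanks on either side of target[del_start:del_end].

    The result matches the target upstream and downstream of the deletion
    but lacks the intervening sequence (hypothetical helper).
    """
    if not (0 <= del_start < del_end <= len(target)):
        raise ValueError("deletion coordinates must lie within the target")
    if flank < 1:
        raise ValueError("flank must be at least 1 nucleotide")
    upstream = target[max(0, del_start - flank):del_start]
    downstream = target[del_end:del_end + flank]
    return upstream + downstream


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-base change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different DNA bases")
    pair = {ref, alt}
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"
```

For example, `deletion_edit("AAACCCGGGTTT", 3, 6, 3)` joins the three bases before the deleted `CCC` to the three bases after it.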
  • the pre-edit homology domain comprises a nucleic acid sequence having 100% sequence identity with a nucleic acid sequence comprised in a target nucleic acid molecule.
  • the post-edit homology domain comprises a nucleic acid sequence having 100% sequence identity with a nucleic acid sequence comprised in a target nucleic acid molecule.
  • a template nucleic acid (e.g., template RNA) comprises a PBS sequence.
  • a PBS sequence is disposed 3′ of the heterologous object sequence and is complementary to a sequence adjacent to a site to be modified by a system described herein, or comprises no more than 1, 2, 3, 4, or 5 mismatches to a sequence complementary to the sequence adjacent to a site to be modified by the system/gene modifying polypeptide.
  • the PBS sequence binds within 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides of a nick site in the target nucleic acid molecule.
  • binding of the PBS sequence to the target nucleic acid molecule permits initiation of target-primed reverse transcription (TPRT), e.g., with the 3′ homology domain acting as a primer for TPRT.
  • the PBS sequence is 3-5, 5-10, 10-30, 10-25, 10-20, 10-19, 10-18, 10-17, 10-16, 10-15, 10-14, 10-13, 10-12, 10-11, 11-30, 11-25, 11-20, 11-19, 11-18, 11-17, 11-16, 11-15, 11-14, 11-13, 11-12, 12-30, 12-25, 12-20, 12-19, 12-18, 12-17, 12-16, 12-15, 12-14, 12-13, 13-30, 13-25, 13-20, 13-19, 13-18, 13-17, 13-16, 13-15, 13-14, 14-30, 14-25, 14-20, 14-19, 14-18, 14-17, 14-16, 14-15, 15-30, 15-25, 15-20, 15-19, 15-18, 15-17, 15-16, 16-30, 16-25, 16-20
  • the PBS sequence is 5-20, 8-16, 8-14, 8-13, 8-12, 9-13, 9-12, or 10-12 nucleotides in length, e.g., 9-12 nucleotides in length.
  • the template nucleic acid (e.g., template RNA) PBS sequence domain may serve as an annealing region to the target DNA, such that the target DNA is positioned to prime the reverse transcription of the template nucleic acid (e.g., template RNA).
  • the template nucleic acid (e.g., template RNA) has at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 175, 200 or more bases of exact homology to the target DNA at the 3′ end of the RNA.
  • the template nucleic acid (e.g., template RNA) has at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 175, 200 or more bases of at least 50%, 60%, 70%, 80%, 85%, 90%, 95%, 97%, 98%, 99% or 100% homology to the target DNA, e.g., at the 5′ end of the template nucleic acid (e.g., template RNA).
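The PBS criteria above (complementarity to the sequence adjacent to the site to be modified, a tolerated number of mismatches, and a length window) can be checked with a short sketch. It is illustrative only: sequences are written in the DNA alphabet (T in place of U), the helper names are hypothetical, and the default window of 8-14 nucleotides with at most 5 mismatches is simply one combination of the values listed above, not a recommended setting.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def pbs_mismatches(pbs: str, target_segment: str) -> int:
    """Count positions where the PBS differs from the perfect complement
    of the target segment it is meant to anneal to."""
    expected = reverse_complement(target_segment)
    if len(pbs) != len(expected):
        raise ValueError("PBS and target segment must have equal length")
    return sum(a != b for a, b in zip(pbs.upper(), expected))


def pbs_acceptable(pbs: str, target_segment: str,
                   min_len: int = 8, max_len: int = 14,
                   max_mismatches: int = 5) -> bool:
    """True if the PBS length and mismatch count fall inside the chosen limits
    (limits here are assumptions picked from the listed ranges)."""
    if not (min_len <= len(pbs) <= max_len):
        return False
    return pbs_mismatches(pbs, target_segment) <= max_mismatches
```

A caller would pass the target-strand segment next to the nick and the candidate PBS, both read 5′ to 3′.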
  • the template RNA sequences may be customized, e.g., depending on the cell being targeted.
  • RNAs described herein are designed to write a mutation (e.g., a substitution) into the PAM of the target site, such that upon editing, the PAM site will be mutated to a sequence no longer recognized by the gene modifying polypeptide.
  • a mutation region within the heterologous object sequence of the template RNA may comprise a PAM-kill sequence.
  • a PAM-kill sequence prevents re-engagement of the gene modifying polypeptide upon completion of a gene modification, or decreases re-engagement relative to a template RNA lacking a PAM-kill sequence.
  • a PAM-kill sequence does not alter the amino acid sequence encoded by a gene, e.g., the PAM-kill sequence results in a silent mutation. In other embodiments, it is desired to leave the PAM sequence intact (no PAM-kill).
  • RNAs described herein are designed to write a mutation (e.g., a substitution) into the portion of the target site corresponding to the first three nucleotides of the RT template sequence, such that upon editing, the target site will be mutated to a sequence with lower homology to the RT template sequence.
  • a mutation region within the heterologous object sequence of the template RNA may comprise a seed-kill sequence.
  • a seed-kill sequence prevents re-engagement of the gene modifying polypeptide upon completion of genetic modification, or decreases re-engagement relative to an otherwise similar template RNA lacking a seed-kill sequence.
  • a seed-kill sequence does not alter the amino acid sequence encoded by a gene, e.g., the seed-kill sequence results in a silent mutation.
  • multiple silent mutations may be introduced within the RT template sequence to evade the target cell’s mismatch repair or nucleotide repair pathways or to bias the target cell’s repair pathways toward preservation of the edited strand.
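Whether a candidate PAM-kill substitution is also silent can be checked mechanically. The sketch below is a simplified illustration under stated assumptions: the PAM is a three-nucleotide NGG motif (a common Cas9-style PAM, which the passage above does not specify), it lies on the coding strand, and both sequences are in-frame coding DNA of equal length. All function names are hypothetical.

```python
import re
from itertools import product

# Standard genetic code in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate an in-frame coding DNA sequence into one-letter amino acids."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def is_silent_pam_kill(original: str, edited: str, pam_start: int,
                       pam_regex: str = r"[ACGT]GG") -> bool:
    """True if the edit removes the PAM at pam_start without changing
    the encoded protein. The NGG default is an assumption."""
    if len(original) != len(edited):
        raise ValueError("this sketch only handles substitutions")
    window = slice(pam_start, pam_start + 3)
    had_pam = re.fullmatch(pam_regex, original[window].upper()) is not None
    still_pam = re.fullmatch(pam_regex, edited[window].upper()) is not None
    return had_pam and not still_pam and translate(original) == translate(edited)
```

In the toy sequence `CTGGAA` (Leu-Glu), changing the wobble base of the first codon (`CTG` to `CTA`) disrupts the `TGG` motif at index 1 while still encoding Leu-Glu, whereas changing the next base alters the second codon.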
  • gRNAs with Inducible Activity: a gRNA described herein (e.g., a gRNA that is part of a template RNA or a gRNA used for second strand nicking) has inducible activity.
  • Inducible activity may be achieved by the template nucleic acid, e.g., template RNA, further comprising (in addition to the gRNA) a blocking domain, wherein the sequence of a portion of or all of the blocking domain is at least partially complementary to a portion or all of the gRNA.
  • the gRNA that coordinates the second nick has inducible activity.
  • the gRNA that coordinates the second nick is induced after the template is reverse transcribed.
  • hybridization of the gRNA to the blocking domain can be disrupted using an opener molecule. Exemplary blocking domains, opener molecules, and uses thereof are described in PCT App. Publication WO2020044039A1, which is incorporated herein by reference in its entirety.
  • a gene modifying system comprises one or more circular RNAs (circRNAs).
  • a gene modifying system comprises one or more linear RNAs.
  • the circRNA comprises one or more ribozyme sequences.
  • the ribozyme sequence is activated for autocleavage, e.g., in a host cell, e.g., thereby resulting in linearization of the circRNA.
  • the target site surrounding the edited sequence contains a limited number of insertions or deletions, for example, in less than about 50% or 10% of editing events, e.g., as determined by long-read amplicon sequencing of the target site, e.g., as described in Karst et al. (2020) bioRxiv doi.org/10.1101/645903 (incorporated by reference herein in its entirety).
  • the target site does not show multiple consecutive editing events, e.g., head-to-tail or head-to-head duplications, e.g., as determined by long-read amplicon sequencing of the target site, e.g., as described in Karst et al.
  • the target site contains an integrated sequence corresponding to the template RNA.
  • the target site does not contain insertions resulting from endogenous RNA in more than about 1% or 10% of events, e.g., as determined by long-read amplicon sequencing of the target site, e.g., as described in Karst et al. bioRxiv doi.org/10.1101/645903 (2020) (incorporated herein by reference in its entirety).
  • the target site contains the integrated sequence corresponding to the template RNA.
  • the host DNA-binding site integrated into by the gene modifying system can be in a gene, in an intron, in an exon, an ORF, outside of a coding region of any gene, in a regulatory region of a gene, or outside of a regulatory region of a gene.
  • the polypeptide may bind to one or more than one host DNA sequence.
  • a gene modifying system is used to edit a target locus in multiple alleles.
  • a gene modifying system is designed to edit a specific allele.
  • a gene modifying polypeptide may be directed to a specific sequence that is only present on one allele, e.g., comprises a template RNA with homology to a target allele, e.g., a gRNA or annealing domain, but not to a second cognate allele.
  • a gene modifying system can alter a haplotype-specific allele.
  • a gene modifying system that targets a specific allele preferentially targets that allele, e.g., has at least a 2, 4, 6, 8, or 10-fold preference for a target allele.
  • a gene modifying system described herein comprises a nickase activity (e.g., in the gene modifying polypeptide) that nicks the first strand, and a nickase activity (e.g., in the gene modifying polypeptide or in a polypeptide separate from the gene modifying polypeptide) that nicks the second strand of target DNA.
  • nicking of the first strand of the target site DNA is thought to provide a 3′ OH that can be used by an RT domain to reverse transcribe a sequence of a template RNA, e.g., a heterologous object sequence.
  • a writing domain (e.g., RT domain) of a polypeptide described herein polymerizes (e.g., reverse transcribes) from the heterologous object sequence of a template nucleic acid (e.g., template RNA).
  • the cellular DNA repair machinery must repair the nick on the first DNA strand.
  • the target site DNA now contains two different sequences for the first DNA strand: one corresponding to the original genomic DNA (e.g., having a free 5′ end) and a second corresponding to that polymerized from the heterologous object sequence (e.g., having a free 3′ end).
  • the Cas domain is capable of nicking a first strand and a second strand.
  • the first and second strand nicks occur at the same position in the target site but on opposite strands.
  • the second strand nick occurs in a staggered location, e.g., upstream or downstream, from the first nick.
  • the endonuclease domain generates a target site deletion if the second strand nick is upstream of the first strand nick.
  • the endonuclease domain generates a target site duplication if the second strand nick is downstream of the first strand nick. In some embodiments, the endonuclease domain generates no duplication and/or deletion if the first and second strand nicks occur in the same position of the target site. In some embodiments, the Cas domain has altered activity depending on protein conformation or RNA-binding status, e.g., which promotes the nicking of the first or second strand (e.g., as described in Christensen et al. PNAS 2006; incorporated by reference herein in its entirety).
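The relationship between the relative nick positions and the resulting target site outcome, as stated in the preceding items, can be written as a small lookup. This is a sketch under an assumed convention: positions are integer coordinates along the target site, and a smaller coordinate means "upstream". The function name is hypothetical.

```python
def target_site_outcome(first_nick: int, second_nick: int) -> tuple[str, int]:
    """Return (outcome, size) for a pair of nick coordinates.

    Convention assumed here: a smaller coordinate is upstream.
    - second nick upstream of the first   -> target site deletion
    - second nick downstream of the first -> target site duplication
    - both nicks at the same position     -> neither
    """
    size = abs(second_nick - first_nick)
    if second_nick < first_nick:
        return ("deletion", size)
    if second_nick > first_nick:
        return ("duplication", size)
    return ("none", 0)
```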
  • the additional nick to the second strand is made by the same endonuclease domain (e.g., nickase domain) as the nick to the first strand.
  • the same gene modifying polypeptide performs both the nick to the first strand and the nick to the second strand.
  • the gene modifying polypeptide comprises a Cas domain and the additional nick to the second strand is directed by an additional nucleic acid, e.g., comprising a second gRNA directing the Cas domain to nick the second strand.
  • the additional second strand nick is made by a different endonuclease domain (e.g., nickase domain) than the nick to the first strand.
  • that different endonuclease domain is situated in an additional polypeptide (e.g., a system of the invention further comprises the additional polypeptide), separate from the gene modifying polypeptide.
  • the additional polypeptide comprises an endonuclease domain (e.g., nickase domain) described herein.
  • the additional polypeptide comprises a DNA binding domain, e.g., described herein.
  • second strand nicking may occur in two general orientations: inward nicks and outward nicks.
  • the RT domain polymerizes (e.g., using the template RNA (e.g., the heterologous object sequence)) away from the second strand nick.
  • the location of the nick to the first strand and the location of the nick to the second strand are positioned between the first PAM site and second PAM site (e.g., in a scenario wherein both nicks are made by a polypeptide (e.g., a gene modifying polypeptide) comprising a CRISPR/Cas domain).
  • this inward nick orientation can also be referred to as “PAM-out”.
  • the location of the nick to the first strand and the location of the nick to the second strand are between the sites where the polypeptide and the additional polypeptide bind to the target DNA.
  • the location of the nick to the second strand is positioned between the binding sites of the polypeptide and additional polypeptide, and the nick to the first strand is also located between the binding sites of the polypeptide and additional polypeptide.
  • the location of the nick to the first strand and the location of the nick to the second strand are positioned between the PAM site and the binding site of the second polypeptide which is at a distance from the target site.
  • An example of a gene modifying system that provides an inward nick orientation comprises a gene modifying polypeptide comprising a CRISPR/Cas domain, a template RNA comprising a gRNA that directs nicking of the target site DNA on the first strand, and an additional nucleic acid comprising an additional gRNA that directs nicking at a site a distance from the location of the first nick, wherein the location of the first nick and the location of the second nick are between the PAM sites of the sites to which the two gRNAs direct the gene modifying polypeptide.
  • the RT domain polymerizes (e.g., using the template RNA (e.g., the heterologous object sequence)) toward the second strand nick.
  • the first PAM site and second PAM site are positioned between the location of the nick to the first strand and the location of the nick to the second strand.
  • this outward nick orientation also can be referred to as “PAM-in”.
  • the polypeptide (e.g., the gene modifying polypeptide) and the additional polypeptide bind to sites on the target DNA between the location of the nick to the first strand and the location of the nick to the second strand.
  • the location of the nick to the second strand is positioned on the opposite side of the binding sites of the polypeptide and additional polypeptide relative to the location of the nick to the first strand.
  • the PAM site and the binding site of the second polypeptide which is at a distance from the target site are positioned between the location of the nick to the first strand and the location of the nick to the second strand.
  • An example of a gene modifying system that provides an outward nick orientation comprises a gene modifying polypeptide comprising a Cas domain, a template RNA comprising a gRNA that directs nicking of the target site DNA on the first strand, and an additional nucleic acid comprising an additional gRNA that directs nicking at a site a distance from the location of the first nick, wherein the location of the first nick and the location of the second nick are outside of the PAM sites of the sites to which the two gRNAs direct the gene modifying polypeptide (i.e., the PAM sites are between the location of the first nick and the location of the second nick).
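The inward ("PAM-out") and outward ("PAM-in") orientations described above can be distinguished by comparing four coordinates. The following sketch assumes that each PAM site and each nick is reduced to a single integer position on a shared coordinate axis, which is a simplification; the function name and the "undetermined" label are hypothetical.

```python
def nick_orientation(pam1: int, pam2: int, nick1: int, nick2: int) -> str:
    """Classify a dual-nick layout from PAM and nick coordinates.

    inward  (PAM-out): both nicks lie between the two PAM sites
    outward (PAM-in):  both PAM sites lie between the two nicks
    """
    pam_lo, pam_hi = sorted((pam1, pam2))
    nick_lo, nick_hi = sorted((nick1, nick2))
    if pam_lo < nick_lo and nick_hi < pam_hi:
        return "inward (PAM-out)"
    if nick_lo < pam_lo and pam_hi < nick_hi:
        return "outward (PAM-in)"
    # Interleaved or coincident layouts are not covered by either definition.
    return "undetermined"
```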
  • an outward nick orientation is preferred in some embodiments.
  • an inward nick may produce a higher number of double-strand breaks (DSBs) than an outward nick orientation.
  • DSBs may be recognized by the DSB repair pathways in the nucleus of a cell, which can result in undesired insertions and deletions.
  • An outward nick orientation may provide a decreased risk of DSB formation, and a corresponding lower amount of undesired insertions and deletions.
  • undesired insertions and deletions are insertions and deletions not encoded by the heterologous object sequence, e.g., an insertion or deletion produced by the double-strand break repair pathway unrelated to the modification encoded by the heterologous object sequence.
  • a desired gene modification comprises a change to the target DNA (e.g., a substitution, insertion, or deletion) encoded by the heterologous object sequence (e.g., and achieved by the gene modifying system writing the heterologous object sequence into the target site).
  • the first strand nick and the second strand nick are in an outward orientation.
  • the distance between the first strand nick and second strand nick may influence the extent to which one or more of: desired gene modifying system DNA modifications are obtained, undesired double-strand breaks (DSBs) occur, undesired insertions occur, or undesired deletions occur.
  • the benefit of the second strand nick, namely the biasing of DNA repair toward incorporation of the heterologous object sequence into the target DNA, increases as the distance between the first strand nick and second strand nick decreases.
  • the risk of DSB formation also increases as the distance between the first strand nick and second strand nick decreases.
  • the number of undesired insertions and/or deletions may increase as the distance between the first strand nick and second strand nick decreases.
  • the distance between the first strand nick and second strand nick is chosen to balance the benefit of biasing DNA repair toward incorporation of the heterologous object sequence into the target DNA and the risk of DSB formation and of undesired deletions and/or insertions.
  • a system where the first strand nick and the second strand nick are at least a threshold distance apart has an increased level of desired gene modifying system modification outcomes, a decreased level of undesired deletions, and/or a decreased level of undesired insertions relative to an otherwise similar inward nick orientation system where the first nick and the second nick are less than the threshold distance apart.
  • the threshold distance(s) is given below.
  • the first nick and the second nick are at least 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, or 200 nucleotides apart. In some embodiments, the first nick and the second nick are no more than 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, or 250 nucleotides apart.
  • the first nick and the second nick are 20-200, 30-200, 40-200, 50-200, 60-200, 70-200, 80-200, 90-200, 100-200, 110-200, 120-200, 130-200, 140-200, 150-200, 160-200, 170-200, 180-200, 190-200, 20-190, 30-190, 40-190, 50-190, 60-190, 70-190, 80-190, 90-190, 100-190, 110-190, 120-190, 130-190, 140-190, 150-190, 160-190, 170-190, 180-190, 20-180, 30-180, 40-180, 50-180, 60-180, 70-180, 80-180, 90-180, 100-180, 110-180, 120-180, 130-180, 140-180, 150-180, 160-180, 170-180, 20-170, 30-170, 40-170, 50-170, 60-170, 70-170, 80-170, 90-170, 100-170, 110-200, 120
  • the first nick and the second nick are 40-100 nucleotides apart.
  • the second nick is positioned at least 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135, 140, 145, or 150 nucleotides 5′ or 3′ of the target site modification (e.g., the insertion, deletion, or substitution) or to the nick on the first strand.
  • an inward nick orientation may produce a higher number of DSBs than an outward nick orientation, and may result in a higher amount of undesired insertions and deletions than an outward nick orientation, but increasing the distance between the nicks may mitigate that increase in DSBs, undesired deletions, and/or undesired insertions.
  • an inward nick orientation wherein the first nick and the second nick are at least a threshold distance apart has an increased level of desired gene modifying system modification outcomes, a decreased level of undesired deletions, and/or a decreased level of undesired insertions relative to an otherwise similar inward nick orientation system where the first nick and the second nick are less than the threshold distance apart.
  • the threshold distance is given below.
  • the first strand nick and the second strand nick are in an inward orientation.
  • the first strand nick and the second strand nick are in an inward orientation and the first strand nick and second strand nick are at least 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 220, 240, 260, 280, 300, 350, 400, 450, or 500 nucleotides apart, e.g., at least 100 nucleotides apart, (and optionally no more than 500, 400, 300, 200, 190, 180, 170, 160, 150, 140, 130, or 120 nucleotides apart).
  • the first strand nick and the second strand nick are in an inward orientation and the first strand nick and second strand nick are 100-200, 110-200, 120-200, 130-200, 140-200, 150-200, 160-200, 170-200, 180-200, 190-200, 100-190, 110-190, 120-190, 130-190, 140-190, 150-190, 160-190, 170-190, 180-190, 100-180, 110-180, 120-180, 130-180, 140-180, 150-180, 160-180, 170-180, 100-170, 110-170, 120-170, 130-170, 140-170, 150-170, 160-170, 100-160, 110-160, 120-160, 130-160, 140-160, 150-160, 100-150, 110-150, 120-150, 130-150, 140-150, 100-140, 110-140, 120-140, 130-140, 100-130, 110-130, 120-130, 100-120, 110-
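The nick spacing criteria above reduce to checking a distance against a lower bound and an optional upper bound. The sketch below is a minimal illustration; the helper name is hypothetical, and the example windows (40-100 nucleotides, and at least 100 nucleotides for an inward orientation) are taken from the values listed above purely as examples, without implying that any one window is preferred.

```python
from typing import Optional


def nick_spacing_ok(first_nick: int, second_nick: int,
                    min_nt: int, max_nt: Optional[int] = None) -> bool:
    """True if the distance between the two nicks is at least min_nt and,
    when max_nt is given, no more than max_nt."""
    distance = abs(second_nick - first_nick)
    if distance < min_nt:
        return False
    return max_nt is None or distance <= max_nt


# Example windows chosen from the listed values (assumptions for illustration).
in_general_window = nick_spacing_ok(1000, 1060, min_nt=40, max_nt=100)
meets_inward_threshold = nick_spacing_ok(1000, 1060, min_nt=100)
```

Here a 60-nucleotide spacing falls inside the 40-100 window but does not reach the 100-nucleotide lower bound given for inward nicks.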
  • a second gRNA associated with the system may help drive complete integration.
  • the second gRNA may target a location that is 0-200 nt away from the first-strand nick, e.g., 0-50, 50-100, 100-200 nt away from the first-strand nick.
  • the second gRNA can only bind its target sequence after the edit is made, e.g., the gRNA binds a sequence present in the heterologous object sequence, but not in the initial target sequence.
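The idea that a second gRNA recognizes only the edited site can be sketched as a string check: the spacer target should occur in the edited sequence but not in the original one, on either strand. This toy version assumes exact matching in the DNA alphabet and ignores PAM requirements and mismatch tolerance; the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def occurs_on_either_strand(site: str, locus: str) -> bool:
    """True if the site occurs in the locus or in its reverse complement."""
    site, locus = site.upper(), locus.upper()
    return site in locus or site in reverse_complement(locus)


def binds_only_after_edit(spacer_target: str, original: str, edited: str) -> bool:
    """True if the spacer target exists only once the edit has been written."""
    return (occurs_on_either_strand(spacer_target, edited)
            and not occurs_on_either_strand(spacer_target, original))
```

Real spacers are typically around 20 nucleotides; the short strings used when trying this out are for readability only.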
  • nucleic acid described herein can comprise unmodified or modified nucleobases.
  • Naturally occurring RNAs are synthesized from four basic ribonucleotides: ATP, CTP, UTP and GTP, but may contain post-transcriptionally modified nucleotides. Further, approximately one hundred different nucleoside modifications have been identified in RNA (Rozenski, J, Crain, P, and McCloskey, J. (1999). The RNA Modification Database: 1999 update. Nucl Acids Res 27: 196-197).
  • An RNA can also comprise wholly synthetic nucleotides that do not occur in nature.
  • the heterologous object sequence comprises one or more 2’-O-methyl (2’-OMe) modified nucleotides.
  • the heterologous object sequence comprises one or more 2’-fluoro modified nucleotides. In certain embodiments, the heterologous object sequence comprises a region having a pattern in which 2’-fluoro modified nucleotides alternate with unmodified nucleotides (e.g., unmodified ribonucleotides).
  • the pattern begins at the 5’ end of the heterologous object sequence (e.g., the 5’-most nucleotide of the heterologous object sequence comprises a 2’-fluoro modification). In other embodiments, the pattern begins at the second nucleotide from the 5’ end of the heterologous object sequence (e.g., such that the 5’-most nucleotide of the heterologous object sequence is unmodified and the next nucleotide comprises a 2’-fluoro modification).
  • the region having the pattern of alternating 2’-fluoro modified and unmodified nucleotides comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 2’-fluoro modified nucleotides.
  • the region having the pattern of alternating 2’-fluoro modified and unmodified nucleotides has a length of 0-5, 5-10, 10-15, 15-20, 20-25, 25-30, 30-35, 35-40, 40-45, 45-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-150, 150-200, 200-300, 300-400, 400-500, 500-600, 600-700, 700-800, 800-900, 900-1000, 1000-1500, 1500-2000, 2000-2500, 2500-3000, 3000-3500, 3500-4000, 4000-4500, or 4500-5000 nucleotides.
  • the region having the pattern of alternating 2’-fluoro modified and unmodified nucleotides has a length equal to the length of the heterologous object sequence minus 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides.
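The alternating 2’-fluoro pattern described above can be laid out programmatically. The sketch below is a hypothetical annotation helper: it labels each position of a heterologous object sequence as `"2F"` or `"unmod"`, lets the pattern start at the first or second nucleotide from the 5’ end, and optionally limits the patterned region to a given length. The labels and the function name are assumptions for illustration.

```python
from typing import List, Optional


def alternating_2f_pattern(seq: str, start: int = 0,
                           region_len: Optional[int] = None) -> List[str]:
    """Label every other nucleotide as 2'-fluoro, beginning at index `start`
    (0 = 5'-most nucleotide, 1 = second nucleotide) within the region."""
    if start not in (0, 1):
        raise ValueError("start must be 0 or 1")
    end = len(seq) if region_len is None else min(len(seq), start + region_len)
    return [
        "2F" if start <= i < end and (i - start) % 2 == 0 else "unmod"
        for i in range(len(seq))
    ]
```

For an 8-nucleotide sequence, `start=0` modifies positions 0, 2, 4, and 6; passing `region_len=len(seq) - 2` leaves the last two nucleotides outside the patterned region.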
  • the PBS sequence of the template RNA comprises one or more 2’-fluoro modified nucleotides.
  • the PBS sequence of the template RNA comprises one or more 2’-OMe modified nucleotides.
  • the PBS sequence of the template RNA comprises one or more nucleotides each comprising both a 2’-OMe modification and a phosphorothioate modification.
  • the 3’ end of the PBS sequence comprises, in 5’ to 3’ order, a 2’-fluoro modified nucleotide, one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10) 2’-OMe modified nucleotides, and/or one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10) nucleotides each comprising both a 2’-OMe modification and a phosphorothioate modification.
  • the nucleotides at the junction between the heterologous object sequence and the PBS sequence do not comprise a 2’-fluoro modification. In some embodiments, the nucleotides at the junction between the heterologous object sequence and the PBS sequence (e.g., one or more of the nucleotides at positions +3, +2, +1, -1, -2, and/or -3) do not comprise a 2’-OMe modification.
  • the nucleotides at the junction between the heterologous object sequence and the PBS sequence are unmodified nucleotides.
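The PBS 3’ end layout and the unmodified junction described above can be combined in a small sketch. It is illustrative only: the modification labels, the function names, and the default counts (one 2’-OMe nucleotide followed by three 2’-OMe plus phosphorothioate nucleotides) are assumptions, and the junction is modeled as the last three nucleotides of the heterologous object sequence plus the first three of the PBS, without committing to which side carries the positive position numbers.

```python
from typing import List


def pbs_three_prime_tail(pbs_len: int, n_ome: int = 1,
                         n_ome_ps: int = 3) -> List[str]:
    """Per-nucleotide labels for a PBS whose 3' end carries, in 5'-to-3' order,
    one 2'-fluoro, n_ome 2'-OMe, and n_ome_ps 2'-OMe+PS nucleotides."""
    tail = ["2F"] + ["2OMe"] * n_ome + ["2OMe+PS"] * n_ome_ps
    if len(tail) > pbs_len:
        raise ValueError("modified tail is longer than the PBS")
    return ["unmod"] * (pbs_len - len(tail)) + tail


def junction_unmodified(hos_mods: List[str], pbs_mods: List[str],
                        window: int = 3) -> bool:
    """True if the `window` nucleotides on each side of the junction between
    the heterologous object sequence and the PBS are all unmodified."""
    junction = hos_mods[-window:] + pbs_mods[:window]
    return all(label == "unmod" for label in junction)
```

With a 10-nucleotide PBS and the default counts, the first five PBS positions stay unmodified, so the junction check depends only on the labels at the 3’ end of the heterologous object sequence.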
  • the template RNA comprising modified nucleotides comprises a heterologous object sequence having a mutation region for introducing a mutation into a portion of a human PAH, FAH, HBB, TRAC4, B2M, or A1AT gene.
  • a series of exemplary template RNA sequences comprising 2’-OMe modifications at various positions, e.g., as tested in Example 1, are shown in Table 12 below.
  • a series of additional exemplary template RNA sequences comprising modifications at various positions are shown in Table 15 below.
  • the chemical modification is one provided in WO/2017/183482, US Pat. Pub. No. 20090286852, of International Application No.
  • incorporation of a chemically modified nucleotide into a polynucleotide can result in the modification being incorporated into a nucleobase, the backbone, or both, depending on the location of the modification in the nucleotide.
  • the backbone modification is one provided in EP 2813570, which is herein incorporated by reference in its entirety.
  • the modified cap is one provided in US Pat. Pub. No. 20050287539, which is herein incorporated by reference in its entirety.
  • the chemically modified nucleic acid comprises one or more of ARCA: anti-reverse cap analog (m27.3′-OGP3G), GP3G (Unmethylated Cap Analog), m7GP3G (Monomethylated Cap Analog), m32.2.7GP3G (Trimethylated Cap Analog), m5CTP (5′-methyl-cytidine triphosphate), m6ATP (N6-methyl-adenosine-5′-triphosphate), s2UTP (2-thio-uridine triphosphate), and Ψ (pseudouridine triphosphate).
  • the chemically modified nucleic acid comprises a 5′ cap, e.g.: a 7-methylguanosine cap (e.g., an O-Me-m7G cap); a hypermethylated cap analog; an NAD+-derived cap analog (e.g., as described in Kiledjian, Trends in Cell Biology 28, 454-464 (2018)); or a modified, e.g., biotinylated, cap analog (e.g., as described in Bednarek et al., Phil Trans R Soc B 373, 20180167 (2018)).
  • the chemically modified nucleic acid comprises a 3′ feature selected from one or more of: a polyA tail; a 16-nucleotide long stem-loop structure flanked by unpaired 5 nucleotides (e.g., as described by Mannironi et al., Nucleic Acids Research 17, 9113-9126 (1989)); a triple-helical structure (e.g., as described by Brown et al., PNAS 109, 19202-19207 (2012)); a tRNA, Y RNA, or vault RNA structure (e.g., as described by Labno et al., Biochimica et Biophysica Acta 1863, 3125-3147 (2016)); incorporation of one or more deoxyribonucleotide triphosphates (dNTPs), 2’O-Methylated NTPs, or phosphorothioate-NTPs;
  • the nucleic acid (e.g., template nucleic acid) comprises one or more modified nucleotides, e.g., selected from dihydrouridine, inosine, 7-methylguanosine, 5-methylcytidine (5mC), 5′ Phosphate ribothymidine, 2′-O-methyl ribothymidine, 2′-O-ethyl ribothymidine, 2′-fluoro ribothymidine, C-5 propynyl-deoxycytidine (pdC), C-5 propynyl-deoxyuridine (pdU), C-5 propynyl-cytidine (pC), C-5 propynyl-uridine (pU), 5-methyl cytidine, 5-methyl uridine, 5-methyl deoxycytidine, 5-methyl deoxyuridine methoxy, 2,6-diaminopurine, 5′-Dimethoxytrityl-N4
  • the nucleic acid comprises a backbone modification, e.g., a modification to a sugar or phosphate group in the backbone.
  • the nucleic acid comprises a nucleobase modification.
  • the nucleic acid comprises one or more chemically modified nucleotides of Table 16, one or more chemical backbone modifications of Table 17, one or more chemically modified caps of Table 18.
  • the nucleic acid comprises two or more (e.g., 3, 4, 5, 6, 7, 8, 9, or 10 or more) different types of chemical modifications.
  • the nucleic acid may comprise two or more (e.g., 3, 4, 5, 6, 7, 8, 9, or 10 or more) different types of modified nucleobases, e.g., as described herein, e.g., in Table 16.
  • the nucleic acid may comprise two or more (e.g., 3, 4, 5, 6, 7, 8, 9, or 10 or more) different types of backbone modifications, e.g., as described herein, e.g., in Table 17.
  • the nucleic acid may comprise one or more modified cap, e.g., as described herein, e.g., in Table 18.
  • the nucleic acid comprises one or more type of modified nucleobase and one or more type of backbone modification; one or more type of modified nucleobase and one or more modified cap; one or more type of modified cap and one or more type of backbone modification; or one or more type of modified nucleobase, one or more type of backbone modification, and one or more type of modified cap.
  • the nucleic acid comprises one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000, or more) modified nucleobases.
  • the nucleic acid is modified at one or more (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800, 900, 1000, or more) positions in the backbone. In some embodiments, all backbone positions of the nucleic acid are modified. Table 16.
  • the nucleotides comprising the template of the gene modifying system can be natural or modified bases, or a combination thereof.
  • the template may contain pseudouridine, dihydrouridine, inosine, 7-methylguanosine, or other modified bases.
  • the template may contain locked nucleic acid nucleotides.
  • the modified bases used in the template do not inhibit the reverse transcription of the template.
  • the modified bases used in the template may improve reverse transcription, e.g., specificity or fidelity.
  • an RNA component of the system (e.g., a template RNA or a gRNA) comprises one or more nucleotide modifications.
  • the modification pattern of a gRNA can significantly affect in vivo activity compared to unmodified or end-modified guides (e.g., as shown in Figure 1D from Finn et al. Cell Rep 22(9):2227-2235 (2016); incorporated herein by reference in its entirety). Without wishing to be bound by theory, this process may be due, at least in part, to a stabilization of the RNA conferred by the modifications.
  • Non-limiting examples of such modifications may include 2'-O-methyl (2'-O-Me), 2'-O-(2-methoxyethyl) (2'-O-MOE), 2'-fluoro (2'-F), phosphorothioate (PS) bond between nucleotides, G-C substitutions, and inverted abasic linkages between nucleotides and equivalents thereof.
  • the template RNA e.g., at the portion thereof that binds a target site
  • the guide RNA comprises a 5′ terminus region.
  • the template RNA or the guide RNA does not comprise a 5′ terminus region.
  • the 5′ terminus region comprises a gRNA spacer region, e.g., as described with respect to sgRNA in Briner AE et al, Molecular Cell 56: 333-339 (2014) (incorporated herein by reference in its entirety; applicable herein, e.g., to all guide RNAs).
  • the 5′ terminus region comprises a 5′ end modification.
  • a 5′ terminus region with or without a spacer region may be associated with a crRNA, trRNA, sgRNA and/or dgRNA.
  • the gRNA spacer region can, in some instances, comprise a guide region, guide domain, or targeting domain.
  • the composition may comprise this region or not.
  • a guide RNA comprises one or more of the modifications of any of the sequences shown in Table 4 of WO2018107028A1, e.g., as identified therein by a SEQ ID NO.
  • the nucleotides may be the same or different, and/or the modification pattern shown may be the same or similar to a modification pattern of a guide sequence as shown in Table 4 of WO2018107028A1.
  • a modification pattern includes the relative position and identity of modifications of the gRNA or a region of the gRNA (e.g. 5′ terminus region, lower stem region, bulge region, upper stem region, nexus region, hairpin 1 region, hairpin 2 region, 3′ terminus region).
  • the modification pattern contains at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% of the modifications of any one of the sequences shown in the sequence column of Table 4 of WO2018107028A1, and/or over one or more regions of the sequence.
  • the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the modification pattern of any one of the sequences shown in the sequence column of Table 4 of WO2018107028A1.
  • the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over one or more regions of the sequence shown in Table 4 of WO2018107028A1, e.g., in a 5′ terminus region, lower stem region, bulge region, upper stem region, nexus region, hairpin 1 region, hairpin 2 region, and/or 3′ terminus region.
  • the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical to the modification pattern of a sequence over the 5′ terminus region.
  • the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the lower stem. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the bulge. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the upper stem.
  • the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the nexus. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the hairpin 1. In some embodiments, the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the hairpin 2.
  • the modification pattern is at least 50%, 55%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 100% identical over the 3′ terminus.
  • the modification pattern differs from the modification pattern of a sequence of Table 4 of WO2018107028A1, or a region (e.g. 5′ terminus, lower stem, bulge, upper stem, nexus, hairpin 1, hairpin 2, 3′ terminus) of such a sequence, e.g., at 0, 1, 2, 3, 4, 5, 6, or more nucleotides.
  • the gRNA comprises modifications that differ from the modifications of a sequence of Table 4 of WO2018107028A1, e.g., at 0, 1, 2, 3, 4, 5, 6, or more nucleotides.
  • the gRNA comprises modifications that differ from modifications of a region (e.g. 5′ terminus, lower stem, bulge, upper stem, nexus, hairpin 1, hairpin 2, 3′ terminus) of a sequence of Table 4 of WO2018107028A1, e.g., at 0, 1, 2, 3, 4, 5, 6, or more nucleotides.
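The percent-identity comparisons described above can be illustrated with a minimal sketch. It assumes a modification pattern is represented as one label per nucleotide position (an empty string for an unmodified position) and that identity means the fraction of positions carrying the same label. The label names, the region boundaries, and the function names are hypothetical and are not taken from WO2018107028A1.

```python
from typing import Dict, Sequence, Tuple


def pattern_identity(query: Sequence[str], reference: Sequence[str]) -> float:
    """Percent of positions whose modification label matches the reference."""
    if len(query) != len(reference):
        raise ValueError("patterns must have the same length")
    if not reference:
        raise ValueError("patterns must be non-empty")
    matches = sum(q == r for q, r in zip(query, reference))
    return 100.0 * matches / len(reference)


def regional_identity(
    query: Sequence[str],
    reference: Sequence[str],
    regions: Dict[str, Tuple[int, int]],
) -> Dict[str, float]:
    """Percent identity per named region; regions are half-open (start, end) index pairs."""
    return {
        name: pattern_identity(query[start:end], reference[start:end])
        for name, (start, end) in regions.items()
    }


# Toy 8-nucleotide patterns (hypothetical labels, not real guide data).
reference = ["2OMe", "2OMe", "2OMe", "", "", "2F", "", "2OMe"]
query = ["2OMe", "2OMe", "", "", "", "2F", "", "2OMe"]
regions = {"5_prime_terminus": (0, 3), "body": (3, 8)}

overall = pattern_identity(query, reference)
by_region = regional_identity(query, reference, regions)
```

A threshold such as "at least 80% identical over the 5′ terminus region" would then be a simple comparison against the value returned for that region.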
  • the template RNAs e.g., at the portion thereof that binds a target site
  • the gRNA comprises a 2'-O-methyl (2'-O-Me) modified nucleotide.
  • the gRNA comprises a 2'-O-(2-methoxyethyl) (2'-O-MOE) modified nucleotide.
  • the gRNA comprises a 2'-fluoro (2'-F) modified nucleotide.
  • the gRNA comprises a phosphorothioate (PS) bond between nucleotides.
  • the gRNA comprises a 5′ end modification, a 3′ end modification, or 5′ and 3′ end modifications.
  • the 5′ end modification comprises a phosphorothioate (PS) bond between nucleotides.
  • the 5′ end modification comprises a 2'-O-methyl (2'-O-Me), 2'-O-(2-methoxyethyl) (2'-O-MOE), and/or 2'-fluoro (2'-F) modified nucleotide.
  • the 5′ end modification comprises at least one phosphorothioate (PS) bond and one or more of a 2'-O-methyl (2'-O-Me), 2'-O-(2-methoxyethyl) (2'-O-MOE), and/or 2'-fluoro (2'-F) modified nucleotide.
  • the end modification may comprise a phosphorothioate (PS), 2'-O-methyl (2'-O-Me), 2'-O-(2-methoxyethyl) (2'-O-MOE), and/or 2'-fluoro (2'-F) modification.
  • Equivalent end modifications are also encompassed by embodiments described herein.
  • the template RNA or gRNA comprises an end modification in combination with a modification of one or more regions of the template RNA or gRNA. Additional exemplary modifications and methods for protecting RNA, e.g., gRNA, and formulae thereof, are described in WO2018126176A1, which is incorporated herein by reference in its entirety.
  • a template RNA described herein comprises three phosphorothioate linkages at the 5’ end and three phosphorothioate linkages at the 3’ end.
  • a template RNA described herein comprises three 2’-O-methyl ribonucleotides at the 5’ end and three 2’-O-methyl ribonucleotides at the 3’ end.
  • the 5’ most three nucleotides of the template RNA are 2’-O-methyl ribonucleotides
  • the 5’ most three internucleotide linkages of the template RNA are phosphorothioate linkages
  • the 3’ most three nucleotides of the template RNA are 2’-O-methyl ribonucleotides
  • the 3’ most three internucleotide linkages of the template RNA are phosphorothioate linkages.
  • the template RNA comprises alternating blocks of ribonucleotides and 2’-O-methyl ribonucleotides, for instance, blocks of between 12 and 28 nucleotides in length.
  • the central portion of the template RNA comprises the alternating blocks and the 5’ and 3’ ends each comprise three 2’-O-methyl ribonucleotides and three phosphorothioate linkages.
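The end-modification pattern described above (three 2’-O-methyl ribonucleotides and three phosphorothioate linkages at each end) can be sketched as a small annotation routine. This is an illustrative sketch only: the `Nucleotide` record, the function names, the compact text notation, and the example sequence are assumptions made for the example and are not part of the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Nucleotide:
    base: str
    sugar: str = "ribo"        # "ribo" or "2'-O-Me"
    ps_to_next: bool = False   # True if the linkage to the next nucleotide is phosphorothioate


def apply_end_modifications(seq: str, n: int = 3) -> List[Nucleotide]:
    """Mark the n terminal nucleotides and n terminal linkages at each end."""
    if len(seq) < 2 * n + 1:
        raise ValueError("sequence too short for non-overlapping end modifications")
    nts = [Nucleotide(base) for base in seq]
    last = len(nts) - 1
    for i in range(n):
        nts[i].sugar = "2'-O-Me"             # 5'-most n nucleotides
        nts[last - i].sugar = "2'-O-Me"      # 3'-most n nucleotides
        nts[i].ps_to_next = True             # 5'-most n internucleotide linkages
        nts[last - 1 - i].ps_to_next = True  # 3'-most n internucleotide linkages
    return nts


def render(nts: List[Nucleotide]) -> str:
    """Compact text form: 'm' prefix = 2'-O-Me sugar, '*' = PS linkage to the next nucleotide."""
    return "".join(
        ("m" if nt.sugar == "2'-O-Me" else "") + nt.base + ("*" if nt.ps_to_next else "")
        for nt in nts
    )


annotated = apply_end_modifications("GGGAAAUAAGAGAG")
text_form = render(annotated)
```

Linkage i joins nucleotides i and i+1, so the three 3′-most linkages sit on the fourth-, third-, and second-to-last nucleotides; the final nucleotide carries no outgoing linkage.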
  • structure-guided and systematic approaches are used to introduce modifications (e.g., 2′-OMe-RNA, 2′-F-RNA, and PS modifications) to a template RNA or guide RNA, for example, as described in Mir et al. Nat Commun 9:2641 (2016) (incorporated by reference herein in its entirety).
  • the incorporation of 2′-F-RNAs increases thermal and nuclease stability of RNA:RNA or RNA:DNA duplexes, e.g., while minimally interfering with C3′-endo sugar puckering.
  • 2′-F may be better tolerated than 2′-OMe at positions where the 2′-OH is important for RNA:DNA duplex stability.
  • a crRNA comprises one or more modifications that do not reduce Cas9 activity, e.g., C10, C20, or C21 (fully modified), e.g., as described in Supplementary Table 1 of Mir et al. Nat Commun 9:2641 (2016), incorporated herein by reference in its entirety.
  • a tracrRNA comprises one or more modifications that do not reduce Cas9 activity, e.g., T2, T6, T7, or T8 (fully modified) of Supplementary Table 1 of Mir et al. Nat Commun 9:2641 (2016).
  • a crRNA comprising one or more modifications (e.g., as described herein) may be paired with a tracrRNA comprising one or more modifications, e.g., C20 and T2.
  • a gRNA comprises a chimera, e.g., of a crRNA and a tracrRNA (e.g., Jinek et al. Science 337(6096):816-821 (2012)).
  • modifications from the crRNA and tracrRNA are mapped onto the single-guide chimera, e.g., to produce a modified gRNA with enhanced stability.
  • gRNA molecules may be modified by the addition or subtraction of the naturally occurring structural components, e.g., hairpins.
  • a gRNA may comprise a gRNA with one or more 3′ hairpin elements deleted, e.g., as described in WO2018106727, incorporated herein by reference in its entirety.
  • a gRNA may contain an added hairpin structure, e.g., an added hairpin structure in the spacer region, which was shown to increase specificity of a CRISPR-Cas system in the teachings of Kocak et al. Nat Biotechnol 37(6):657-666 (2019). Additional modifications, including examples of shortened gRNA and specific modifications improving in vivo activity, can be found in US20190316121, incorporated herein by reference in its entirety.
  • structure-guided and systematic approaches are employed to find modifications for the template RNA.
  • the modifications are identified with the inclusion or exclusion of a guide region of the template RNA.
  • a structure of polypeptide bound to template RNA is used to determine non-protein-contacted nucleotides of the RNA that may then be selected for modifications, e.g., with lower risk of disrupting the association of the RNA with the polypeptide.
  • Secondary structures in a template RNA can also be predicted in silico by software tools, e.g., the RNAstructure tool available at rna.urmc.rochester.edu/RNAstructureWeb (Bellaousov et al. Nucleic Acids Res 41:W471-W474 (2013); incorporated by reference herein in its entirety), e.g., to determine secondary structures for selecting modifications, e.g., hairpins, stems, and/or bulges.
  • An mRNA encoding a gene modifying polypeptide may have a cap, 5′ UTR containing a Kozak, 3′ UTR, and polyA tail containing at least 60 As.
  • An mRNA encoding a gene modifying polypeptide may have a reduced Uridine content through codon selection/optimization.
  • An mRNA encoding a gene modifying polypeptide may have uridines that are about 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% substituted with 5-methoxy uridine.
  • An mRNA encoding a gene modifying polypeptide may have uridines that are about 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% substituted with N1-methyl-pseudouridine.
  • An mRNA encoding a gene modifying polypeptide may have cytosines that are about 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% substituted with 5-methylcytosine.
  • An mRNA encoding a gene modifying polypeptide may have a combination of about 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% substitution of cytosine with 5-methylcytosine and about 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% substitution of uridine with 5-methoxy uridine.
  • An mRNA encoding a gene modifying polypeptide may have a combination of about 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% substitution of cytosine with 5-methylcytosine and about 1%, 2%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% substitution of uridine with N1-methyl-pseudouridine.
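A few of the mRNA features listed above (a poly(A) tail of at least 60 adenosines, uridine content, and fractional substitution of a base) reduce to simple counting. The sketch below shows one way to compute them; the function names and the toy sequence are hypothetical, and the substitution count is only the expected number of residues at a nominal percentage, not a model of how substitution is achieved.

```python
def poly_a_tail_length(mrna: str) -> int:
    """Number of consecutive A residues at the 3' end of the sequence."""
    mrna = mrna.upper()
    return len(mrna) - len(mrna.rstrip("A"))


def uridine_fraction(mrna: str) -> float:
    """Fraction of residues that are U."""
    mrna = mrna.upper()
    return mrna.count("U") / len(mrna)


def substituted_count(mrna: str, base: str, percent: float) -> int:
    """Expected number of `base` residues replaced at a nominal percent substitution."""
    total = mrna.upper().count(base.upper())
    return round(total * percent / 100.0)


# Toy transcript: a short ORF ending in a UAG stop codon, followed by a 60-nt poly(A) tail.
orf = "AUGGCUUCUUAG"
transcript = orf + "A" * 60

tail_ok = poly_a_tail_length(transcript) >= 60
u_fraction = uridine_fraction(orf)
u_replaced_at_100 = substituted_count(orf, "U", 100)
c_replaced_at_50 = substituted_count(orf, "C", 50)
```

Counting trailing A residues overestimates the tail when the sequence upstream of the tail itself ends in A (for example, a UAA stop codon directly abutting the tail), so a real check would track the tail boundary explicitly.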
  • a guide RNA may be synthesized by T7 RNA polymerase.
  • a guide RNA may be chemically synthesized and contain modifications such as, e.g., 2′-O-methyl, 2′-Fluoro, and/or phosphorothioate.
  • the 3 most terminal nucleotides of a guide RNA may contain 2′-O-methyl modifications with 3 phosphorothioate linkages between the nucleotides.
  • a guide RNA may contain 2′-O-methyl modified nucleotides where there are cytosines and uridines, except at nucleotides found in the “seed” of the guide RNA where cytosines and uridines contain 2′-fluoro modifications.
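The pyrimidine rule above (2′-O-methyl at cytosines and uridines, except within the seed, where they carry 2′-fluoro) can be expressed as a per-position plan. This is a sketch under stated assumptions: the seed is passed in as an index range because the text does not fix its coordinates, and the function name, labels, and toy sequence are hypothetical.

```python
from typing import List


def pyrimidine_modification_plan(guide: str, seed: range) -> List[str]:
    """Return one modification label per position of the guide.

    C and U receive 2'-O-Me outside the seed and 2'-F inside it;
    purines are left unmodified by this particular rule.
    """
    plan = []
    for index, base in enumerate(guide.upper()):
        if base in ("C", "U"):
            plan.append("2'-F" if index in seed else "2'-O-Me")
        else:
            plan.append("unmodified")
    return plan


# Toy 8-nt guide with an assumed seed covering the last four positions (0-based indices 4-7).
plan = pyrimidine_modification_plan("GACUUCGA", seed=range(4, 8))
```

Terminal 2′-O-methyl and phosphorothioate modifications, as described for the three most terminal nucleotides, would be layered on top of this plan in a separate step.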
  • nucleic acid constructs and proteins or polypeptides are routine in the art.
  • recombinant methods may be used. See, in general, Smales & James (Eds.), Therapeutic Proteins: Methods and Protocols (Methods in Molecular Biology), Humana Press (2005); and Crommelin, Sindelar & Meibohm (Eds.), Pharmaceutical Biotechnology: Fundamentals and Applications, Springer (2013).
  • a gene modifying system as described herein can be used to modify a cell (e.g., an animal cell, plant cell, or fungal cell).
  • a gene modifying system as described herein can be used to modify a mammalian cell (e.g., a human cell).
  • a gene modifying system as described herein can be used to modify a cell from a livestock animal (e.g., a cow, horse, sheep, goat, pig, llama, alpaca, camel, yak, chicken, duck, goose, or ostrich).
  • a gene modifying system as described herein can be used as a laboratory tool or a research tool, or used in a laboratory method or research method, e.g., to modify an animal cell, e.g., a mammalian cell (e.g., a human cell), a plant cell, or a fungal cell.
  • compositions and systems described herein may be used in vitro or in vivo.
  • the system or components of the system are delivered to cells (e.g., mammalian cells, e.g., human cells), e.g., in vitro or in vivo.
  • the cells are eukaryotic cells, e.g., cells of a multicellular organism, e.g., an animal, e.g., a mammal (e.g., human, swine, bovine), a bird (e.g., poultry, such as chicken, turkey, or duck), or a fish.
  • the cells are non-human animal cells (e.g., a laboratory animal, a livestock animal, or a companion animal).
  • the cell is a stem cell (e.g., a hematopoietic stem cell), a fibroblast, or a T cell.
  • the cell is an immune cell, e.g., a T cell (e.g., a Treg, CD4, CD8, γδ, or memory T cell), B cell (e.g., memory B cell or plasma cell), or NK cell.
  • the cell is a non-dividing cell, e.g., a non-dividing fibroblast or non-dividing T cell.
  • the cell is an HSC and p53 is not upregulated or is upregulated by less than 10%, 5%, 2%, or 1%, e.g., as determined according to the method described in Example 30 of PCT/US2019/048607.
  • the components of the gene modifying system may be delivered in the form of polypeptide, nucleic acid (e.g., DNA, RNA), and combinations thereof.
  • the system and/or components of the system are delivered as nucleic acid.
  • the gene modifying polypeptide may be delivered in the form of a DNA or RNA encoding the polypeptide
  • the template RNA may be delivered in the form of RNA or its complementary DNA to be transcribed into RNA.
  • system or components of the system are delivered on 1, 2, 3, 4, or more distinct nucleic acid molecules.
  • system or components of the system are delivered as a combination of DNA and RNA.
  • system or components of the system are delivered as a combination of DNA and protein.
  • system or components of the system are delivered as a combination of RNA and protein.
  • the gene modifying polypeptide is delivered as a protein.
  • the system or components of the system are delivered to cells, e.g. mammalian cells or human cells, using a vector.
  • the vector may be, e.g., a plasmid or a virus.
  • delivery is in vivo, in vitro, ex vivo, or in situ.
  • the virus is an adeno associated virus (AAV), a lentivirus, or an adenovirus.
  • the system or components of the system are delivered to cells with a viral-like particle or a virosome.
  • the delivery uses more than one virus, viral-like particle or virosome.
  • the compositions and systems described herein can be formulated in liposomes or other similar vesicles.
  • Liposomes are spherical vesicle structures composed of a uni- or multilamellar lipid bilayer surrounding internal aqueous compartments and a relatively impermeable outer lipophilic phospholipid bilayer. Liposomes may be anionic, neutral or cationic. Liposomes are biocompatible, nontoxic, can deliver both hydrophilic and lipophilic drug molecules, protect their cargo from degradation by plasma enzymes, and transport their load across biological membranes and the blood brain barrier (BBB) (see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011).
  • Vesicles can be made from several different types of lipids; however, phospholipids are most commonly used to generate liposomes as drug carriers.
  • Methods for preparation of multilamellar vesicle lipids are known in the art (see for example U.S. Pat. No. 6,693,086, the teachings of which relating to multilamellar vesicle lipid preparation are incorporated herein by reference).
  • Although vesicle formation can be spontaneous when a lipid film is mixed with an aqueous solution, it can also be expedited by applying force in the form of shaking by using a homogenizer, sonicator, or an extrusion apparatus (see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011. doi:10.1155/2011/469679 for review).
  • Extruded lipids can be prepared by extruding through filters of decreasing size, as described in Templeton et al., Nature Biotech, 15:647-652, 1997, the teachings of which relating to extruded lipid preparation are incorporated herein by reference.
  • Nanoparticles can be used for delivery, such as a liposome, a lipid nanoparticle, a cationic lipid nanoparticle, an ionizable lipid nanoparticle, a polymeric nanoparticle, a gold nanoparticle, a dendrimer, a cyclodextrin nanoparticle, a micelle, or a combination of the foregoing.
  • Lipid nanoparticles are an example of a carrier that provides a biocompatible and biodegradable delivery system for the pharmaceutical compositions described herein.
  • Nanostructured lipid carriers are modified solid lipid nanoparticles (SLNs) that retain the characteristics of the SLN, improve drug stability and loading capacity, and prevent drug leakage.
  • PNPs Polymer nanoparticles
  • PPNs Lipid–polymer nanoparticles
  • a PLN is composed of a core–shell structure; the polymer core provides a stable structure, and the phospholipid shell offers good biocompatibility. As such, the two components increase the drug encapsulation efficiency rate, facilitate surface modification, and prevent leakage of water-soluble drugs.
  • Exosomes can also be used as drug delivery vehicles for the compositions and systems described herein. For a review, see Ha et al. July 2016. Acta Pharmaceutica Sinica B. Volume 6, Issue 4, Pages 287-296; doi.org/10.1016/j.apsb.2016.02.001. Fusosomes interact and fuse with target cells, and thus can be used as delivery vehicles for a variety of molecules.
  • the protein component(s) of the gene modifying system may be pre-associated with the template nucleic acid (e.g., template RNA).
  • the gene modifying polypeptide may be first combined with the template nucleic acid (e.g., template RNA) to form a ribonucleoprotein (RNP) complex.
  • the RNP may be delivered to cells via, e.g., transfection, nucleofection, virus, vesicle, LNP, exosome, fusosome.
  • a gene modifying system can be introduced into cells, tissues and multicellular organisms. In some embodiments the system or components of the system are delivered to the cells via mechanical means or physical means. Formulation of protein therapeutics is described in Meyer (Ed.), Therapeutic Protein Drug Products: Practical Approaches to formulation in the Laboratory, Manufacturing, and the Clinic, Woodhead Publishing Series (2012).
  • a system described herein can make use of one or more feature (e.g., a promoter or microRNA binding site) to limit activity in off-target cells or tissues.
  • a nucleic acid described herein e.g., a template RNA or a DNA encoding a template RNA
  • a tissue specific promoter sequence e.g., a tissue specific promoter sequence.
  • the tissue-specific promoter is used to increase the target-cell specificity of a gene modifying system.
  • the promoter can be chosen on the basis that it is active in a target cell type but not active in (or active at a lower level in) a non-target cell type.
  • a system having a tissue-specific promoter sequence in the template RNA may also be used in combination with a microRNA binding site, e.g., in the template RNA or a nucleic acid encoding a gene modifying protein, e.g., as described herein.
  • a system having a tissue-specific promoter sequence in the template RNA may also be used in combination with a DNA encoding a gene modifying polypeptide, driven by a tissue-specific promoter, e.g., to achieve higher levels of gene modifying protein in target cells than in non-target cells.
  • a tissue-specific promoter is selected from Table 3 of WO2020014209, incorporated herein by reference.
  • a nucleic acid described herein e.g., a template RNA or a DNA encoding a template RNA
  • the microRNA binding site is used to increase the target-cell specificity of a gene modifying system.
  • the microRNA binding site can be chosen on the basis that it is recognized by a miRNA that is present in a non-target cell type, but that is not present (or is present at a reduced level relative to the non-target cell) in a target cell type.
  • when the template RNA is present in a target cell, it would not be bound by the miRNA (or would be bound at reduced levels relative to the non-target cell).
  • binding of the miRNA to the template RNA may interfere with its activity, e.g., may interfere with insertion of the heterologous object sequence into the genome.
  • the system would edit the genome of target cells more efficiently than it edits the genome of non-target cells, e.g., the heterologous object sequence would be inserted into the genome of target cells more efficiently than into the genome of non-target cells, or an insertion or deletion is produced more efficiently in target cells than in non-target cells.
  • a system having a microRNA binding site in the template RNA (or DNA encoding it) may also be used in combination with a nucleic acid encoding a gene modifying polypeptide, wherein expression of the gene modifying polypeptide is regulated by a second microRNA binding site, e.g., as described herein.
  • a miRNA is selected from Table 4 of WO2020014209, incorporated herein by reference.
  • UTRs to Modify Protein Expression Levels: a nucleic acid component of a system provided by the invention is a sequence (e.g., encoding the polypeptide or comprising a heterologous object sequence) flanked by untranslated regions (UTRs) that modify protein expression levels.
  • Various 5′ and 3′ UTRs can affect protein expression.
  • the coding sequence may be preceded by a 5′ UTR that modifies RNA stability or protein translation.
  • the sequence may be followed by a 3′ UTR that modifies RNA stability or translation. In some embodiments, the sequence may be preceded by a 5′ UTR and followed by a 3′ UTR that modify RNA stability or translation.
  • the 5′ and/or 3′ UTR may be selected from the 5′ and 3′ UTRs of complement factor 3 (C3) (CACTCCTCCCCATCCTCTCCCTCTGTCCCTCTGTCCCTCTGACCCTGCACTGTCCCAGCACC; SEQ ID NO: 11,004) or orosomucoid 1 (ORM1) (CAGGACACAGCCTTGGATCAGGACAGAGACTTGGGGGCCATCCTGCCCCTCCAACCCGACATGTGTACCTCAGCTTTTTCCCTCACTTGCATCAATAAAGCTTCTGTGTTTGGAACAGCTAA; SEQ ID NO: 11,005) (Asrani et al. RNA Biology 2018).
  • the 5′ UTR is the 5′ UTR from C3 and the 3′ UTR is the 3′ UTR from ORM1.
  • a 5′ UTR and 3′ UTR for protein expression, e.g., mRNA (or DNA encoding the RNA) for a gene modifying polypeptide or heterologous object sequence, comprise optimized expression sequences.
  • the 5′ UTR comprises GGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACC (SEQ ID NO: 11,006) and/or the 3′ UTR comprises UGAUAAUAGGCUGGAGCCUCGGUGGCCAUGCUUCUUGCCCCUUGGGCCUCCCCCCAGCCCCUCCUCCCCUUCCUGCACCCGUACCCCCGUGGUCUUUGAAUAAAGUCUGA (SEQ ID NO: 11,007), e.g., as described in Richner et al. Cell 168(6): P1114-1125 (2017), the sequences of which are incorporated herein by reference.
  • a 5′ and/or 3′ UTR may be selected to enhance protein expression.
  • a 5′ and/or 3′ UTR may be selected to modify protein expression such that overproduction inhibition is minimized.
  • UTRs are around a coding sequence, e.g., outside the coding sequence and in other embodiments proximal to the coding sequence. In some embodiments, additional regulatory elements (e.g., miRNA binding sites, cis-regulatory sites) are included in the UTRs.
  • an open reading frame of a gene modifying system, e.g., an ORF of an mRNA (or DNA encoding an mRNA) encoding a gene modifying polypeptide or one or more ORFs of an mRNA (or DNA encoding an mRNA) of a heterologous object sequence, is flanked by a 5′ and/or 3′ untranslated region (UTR) that enhances the expression thereof.
  • the 5′ UTR of an mRNA component (or transcript produced from a DNA component) of the system comprises the sequence 5′-GGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACC-3′ (SEQ ID NO: 11,008).
  • the 3′ UTR of an mRNA component (or transcript produced from a DNA component) of the system comprises the sequence 5′-UGAUAAUAGGCUGGAGCCUCGGUGGCCAUGCUUCUUGCCCCUUGGGCCUCCCCCCAGCCCCUCCUCCCCUUCCUGCACCCGUACCCCCGUGGUCUUUGAAUAAAGUCUGA-3′ (SEQ ID NO: 11,009).
  • This combination of 5′ UTR and 3′ UTR has been shown to result in desirable expression of an operably linked ORF by Richner et al. Cell 168(6): P1114-1125 (2017), the teachings and sequences of which are incorporated herein by reference.
  • a system described herein comprises a DNA encoding a transcript, wherein the DNA comprises the corresponding 5′ UTR and 3′ UTR sequences, with T substituting for U in the above-listed sequences.
  • a DNA vector used to produce an RNA component of the system further comprises a promoter upstream of the 5′ UTR for initiating in vitro transcription, e.g., a T7, T3, or SP6 promoter.
  • the 5′ UTR above begins with GGG, which is a suitable start for optimizing transcription using T7 RNA polymerase.
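The relationship between the RNA form of a UTR and its DNA encoding (T substituting for U), and the observation that the 5′ UTR begins with GGG, can be checked with a short sketch. The 5′ UTR string is the one recited in the text; the function names are hypothetical, and no promoter sequence is included because the text names promoters (T7, T3, SP6) without reciting their sequences.

```python
# 5' UTR as recited in the text (RNA alphabet).
UTR5_RNA = "GGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACC"


def rna_to_dna(rna: str) -> str:
    """Sense-strand DNA encoding of an RNA sequence: T substitutes for U."""
    return rna.upper().replace("U", "T")


def starts_with_ggg(seq: str) -> bool:
    """True if the sequence begins with GGG, the start noted above for T7 transcription."""
    return seq.upper().startswith("GGG")


utr5_dna = rna_to_dna(UTR5_RNA)
ggg_start = starts_with_ggg(utr5_dna)
```

The conversion is length-preserving and reversible for sequences that contain no T to begin with, which gives a cheap sanity check when moving between the RNA listing and a DNA template design.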
  • the virus is selected from a Group I virus, e.g., is a DNA virus and packages dsDNA into virions.
  • the Group I virus is selected from, e.g., Adenoviruses, Herpesviruses, Poxviruses.
  • the virus is selected from a Group II virus, e.g., is a DNA virus and packages ssDNA into virions.
  • the Group II virus is selected from, e.g., Parvoviruses.
  • the parvovirus is a dependoparvovirus, e.g., an adeno-associated virus (AAV).
  • the virus is selected from a Group III virus, e.g., is an RNA virus and packages dsRNA into virions.
  • the Group III virus is selected from, e.g., Reoviruses.
  • the virus is selected from a Group IV virus, e.g., is an RNA virus and packages ssRNA(+) into virions.
  • the Group IV virus is selected from, e.g., Coronaviruses, Picornaviruses, Togaviruses.
  • the virus is selected from a Group V virus, e.g., is an RNA virus and packages ssRNA(-) into virions.
  • the Group V virus is selected from, e.g., Orthomyxoviruses, Rhabdoviruses.
  • the virus is selected from a Group VI virus, e.g., is a retrovirus and packages ssRNA(+) into virions.
  • the Group VI virus is selected from, e.g., retroviruses.
  • the retrovirus is a lentivirus, e.g., HIV-1, HIV-2, SIV, BIV.
  • the retrovirus is a spumavirus, e.g., a foamy virus, e.g., HFV, SFV, BFV.
  • the virus is selected from a Group VII virus, e.g., is a retrovirus and packages dsDNA into virions.
  • the Group VII virus is selected from, e.g., Hepadnaviruses.
  • a virion used as a delivery vehicle may comprise a commensal human virus.
  • a virion used as a delivery vehicle may comprise an anellovirus, the use of which is described in WO2018232017A1, which is incorporated herein by reference in its entirety.
  • AAV Administration: In some embodiments, an adeno-associated virus (AAV) is used in conjunction with the system, template nucleic acid, and/or polypeptide described herein. In some embodiments, an AAV is used to deliver, administer, or package the system, template nucleic acid, and/or polypeptide described herein.
  • the AAV is a recombinant AAV (rAAV).
  • a system described herein further comprises a first recombinant adeno-associated virus (rAAV) capsid protein; wherein the at least one of (a) or (b) is associated with the first rAAV capsid protein, wherein at least one of (a) or (b) is flanked by AAV inverted terminal repeats (ITRs).
  • an adenoviral vector is used to deliver DNA corresponding to the polypeptide or template component of the gene modifying system, or both are contained on separate or the same adenoviral vector.
  • nucleic acid e.g., encoding a polypeptide, or a template, or both
  • ceDNA is derived from the replicative form of the AAV genome (Li et al. PLoS One 2013).
  • Lipid Nanoparticles: The methods and systems provided herein may employ any suitable carrier or delivery modality, including, in certain embodiments, lipid nanoparticles (LNPs).
  • Lipid nanoparticles, in some embodiments, comprise one or more ionic lipids, such as non-cationic lipids (e.g., neutral or anionic, or zwitterionic lipids); one or more conjugated lipids (such as PEG-conjugated lipids or lipids conjugated to polymers described in Table 5 of WO2019217941; incorporated herein by reference in its entirety); one or more sterols (e.g., cholesterol); and, optionally, one or more targeting molecules (e.g., conjugated receptors, receptor ligands, antibodies); or combinations of the foregoing.
  • Lipids that can be used in nanoparticle formations include, for example those described in Table 4 of WO2019217941, which is incorporated by reference—e.g., a lipid-containing nanoparticle can comprise one or more of the lipids in Table 4 of WO2019217941.
  • Lipid nanoparticles can include additional elements, such as polymers, such as the polymers described in Table 5 of WO2019217941, incorporated by reference.
  • conjugated lipids, when present, can include one or more of PEG-diacylglycerol (DAG) (such as 1-(monomethoxy-polyethyleneglycol)-2,3-dimyristoylglycerol (PEG-DMG)), PEG-dialkyloxypropyl (DAA), PEG-phospholipid, PEG-ceramide (Cer), a pegylated phosphatidylethanolamine (PEG-PE), PEG succinate diacylglycerol (PEGS-DAG) (such as 4-O-(2',3'-di(tetradecanoyloxy)propyl-1-O-(ω-methoxy(polyethoxy)ethyl) butanedioate (PEG-S-DMG)), PEG dialkoxypropylcarbam, N-(carbonyl-methoxypolyethylene glycol 2000)-1,2-distearoyl-sn-
  • DAG P
  • sterols that can be incorporated into lipid nanoparticles include one or more of cholesterol or cholesterol derivatives, such as those in WO2009/127060 or US2010/0130588, which are incorporated by reference.
  • Additional exemplary sterols include phytosterols, including those described in Eygeris et al (2020), dx.doi.org/10.1021/acs.nanolett.0c01386, incorporated herein by reference.
  • the lipid particle comprises an ionizable lipid, a non-cationic lipid, a conjugated lipid that inhibits aggregation of particles, and a sterol.
  • the amounts of these components can be varied independently and to achieve desired properties.
  • the lipid nanoparticle comprises an ionizable lipid in an amount from about 20 mol % to about 90 mol % of the total lipids (in other embodiments it may be 20-70% (mol), 30-60% (mol) or 40-50% (mol); about 50 mol % to about 90 mol % of the total lipid present in the lipid nanoparticle), a non-cationic lipid in an amount from about 5 mol % to about 30 mol % of the total lipids, a conjugated lipid in an amount from about 0.5 mol % to about 20 mol % of the total lipids, and a sterol in an amount from about 20 mol % to about 50 mol % of the total lipids.
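The four mol % ranges above can be checked mechanically for a candidate composition. The following is a minimal sketch, not part of the disclosure: the dictionary keys, the function name `check_composition`, and the 0.01 mol % tolerance are illustrative assumptions, and only the broadest range quoted for each component is encoded.

```python
# Broadest mol % ranges quoted in the passage above (percent of total lipid).
# The key names are illustrative labels, not terms defined by the source.
RANGES = {
    "ionizable": (20.0, 90.0),
    "non_cationic": (5.0, 30.0),
    "conjugated": (0.5, 20.0),
    "sterol": (20.0, 50.0),
}


def check_composition(mol_percent, tol=0.01):
    """Return a list of problems; an empty list means the composition fits."""
    problems = []
    total = sum(mol_percent.values())
    if abs(total - 100.0) > tol:
        problems.append(f"components sum to {total:g} mol %, not 100")
    for name, (low, high) in RANGES.items():
        value = mol_percent.get(name, 0.0)  # a missing component counts as 0 mol %
        if not low <= value <= high:
            problems.append(f"{name} = {value:g} mol % is outside {low:g}-{high:g}")
    return problems
```

For example, a composition of 50 / 10 / 38.5 / 1.5 mol % (ionizable / non-cationic / sterol / conjugated) returns an empty list, while a composition with no sterol is reported as out of range.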
  • the ratio of total lipid to nucleic acid can be varied as desired.
  • the total lipid to nucleic acid (mass or weight) ratio can be from about 10:1 to about 30:1.
  • an ionizable lipid may be a cationic lipid, an ionizable cationic lipid, e.g., a cationic lipid that can exist in a positively charged or neutral form depending on pH, or an amine-containing lipid that can be readily protonated.
  • the cationic lipid is a lipid capable of being positively charged, e.g., under physiological conditions.
  • Exemplary cationic lipids include one or more amine group(s) which bear the positive charge.
  • the lipid particle comprises a cationic lipid in formulation with one or more of neutral lipids, ionizable amine-containing lipids, biodegradable alkyne lipids, steroids, phospholipids including polyunsaturated lipids, structural lipids (e.g., sterols), PEG, cholesterol and polymer conjugated lipids.
  • the cationic lipid may be an ionizable cationic lipid.
  • An exemplary cationic lipid as disclosed herein may have an effective pKa over 6.0.
  • a lipid nanoparticle may comprise a second cationic lipid having a different effective pKa (e.g., greater than the first effective pKa) than the first cationic lipid.
  • a lipid nanoparticle may comprise between 40 and 60 mol percent of a cationic lipid, a neutral lipid, a steroid, a polymer conjugated lipid, and a therapeutic agent, e.g., a nucleic acid (e.g., RNA) described herein (e.g., a template nucleic acid or a nucleic acid encoding a gene modifying polypeptide), encapsulated within or associated with the lipid nanoparticle.
  • the nucleic acid is co-formulated with the cationic lipid.
  • the nucleic acid may be adsorbed to the surface of an LNP, e.g., an LNP comprising a cationic lipid.
  • the nucleic acid may be encapsulated in an LNP, e.g., an LNP comprising a cationic lipid.
  • the lipid nanoparticle may comprise a targeting moiety, e.g., coated with a targeting agent.
  • the LNP formulation is biodegradable.
  • a lipid nanoparticle comprising one or more lipids described herein, e.g., Formula (i), (ii), (iii), (vii) and/or (ix), encapsulates at least 1%, at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 92%, at least 95%, at least 97%, at least 98% or 100% of an RNA molecule, e.g., template RNA and/or a mRNA encoding the gene modifying polypeptide.
  • the lipid to nucleic acid ratio (mass/mass ratio; w/w ratio) can be in the range of from about 1:1 to about 25:1, from about 10:1 to about 14:1, from about 3:1 to about 15:1, from about 4:1 to about 10:1, from about 5:1 to about 9:1, or about 6:1 to about 9:1.
  • the amounts of lipids and nucleic acid can be adjusted to provide a desired N/P ratio, for example, N/P ratio of 3, 4, 5, 6, 7, 8, 9, 10 or higher.
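As a rough illustration of how lipid and nucleic acid amounts relate to an N/P ratio (moles of ionizable amine nitrogen per mole of nucleic acid phosphate), the sketch below may help. It is not taken from the source: the function name, the default of one ionizable nitrogen per lipid molecule, the mean nucleotide mass of about 330 g/mol (one phosphate per nucleotide), and the lipid molecular weight used in the example are all assumptions.

```python
def np_ratio(ionizable_lipid_mg, lipid_mw, nucleic_acid_mg,
             amines_per_lipid=1, mean_nt_mass=330.0):
    """Estimate N/P from masses.

    Assumptions (not from the source): each lipid molecule carries
    `amines_per_lipid` ionizable nitrogens, and each nucleotide contributes
    one phosphate with an average mass of `mean_nt_mass` g/mol.
    """
    if min(ionizable_lipid_mg, lipid_mw, nucleic_acid_mg, mean_nt_mass) <= 0:
        raise ValueError("masses and molecular weights must be positive")
    nitrogen_mmol = ionizable_lipid_mg / lipid_mw * amines_per_lipid
    phosphate_mmol = nucleic_acid_mg / mean_nt_mass
    return nitrogen_mmol / phosphate_mmol


# Hypothetical example: 6.6 mg of a 660 g/mol ionizable lipid with 0.55 mg RNA.
example = np_ratio(6.6, 660.0, 0.55)
```

Under these assumptions the hypothetical example gives an N/P ratio of approximately 6; a different nucleotide mass or amine count changes the result proportionally.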
  • the lipid nanoparticle formulation’s overall lipid content can range from about 5 mg/mL to about 30 mg/mL.
  • Exemplary ionizable lipids that can be used in lipid nanoparticle formulations include, without limitation, those listed in Table 1 of WO2019051289, incorporated herein by reference. Additional exemplary lipids include, without limitation, one or more of the following formulae: X of US2016/0311759; I of US20150376115 or in US2016/0376224; I, II or III of US20160151284; I, IA, II, or IIA of US20170210967; I-c of US20150140070; A of US2013/0178541; I of US2013/0303587 or US2013/0123338; I of US2015/0141678; II, III, IV, or V of US2015/0239926; I of US2017/0119904; I or II of WO2017/117528; A of US2012/0149894; A of US2015/0057373; A of WO2013/116126; A of US2013/0090372; A of US2013/0274523
  • the ionizable lipid is MC3 ((6Z,9Z,28Z,31Z)-heptatriaconta-6,9,28,31-tetraen-19-yl-4-(dimethylamino)butanoate (DLin-MC3-DMA or MC3)), e.g., as described in Example 9 of WO2019051289A9 (incorporated by reference herein in its entirety).
  • the ionizable lipid is the lipid ATX-002, e.g., as described in Example 10 of WO2019051289A9 (incorporated by reference herein in its entirety).
  • the ionizable lipid is (13Z,16Z)-N,N-dimethyl-3-nonyldocosa-13,16-dien-1-amine (Compound 32), e.g., as described in Example 11 of WO2019051289A9 (incorporated by reference herein in its entirety).
  • the ionizable lipid is Compound 6 or Compound 22, e.g., as described in Example 12 of WO2019051289A9 (incorporated by reference herein in its entirety).
  • the ionizable lipid is heptadecan-9-yl 8-((2-hydroxyethyl)(6-oxo-6-(undecyloxy)hexyl)amino)octanoate (SM-102), e.g., as described in Example 1 of US9,867,888 (incorporated by reference herein in its entirety).
  • the ionizable lipid is (9Z,12Z)-3-((4,4-bis(octyloxy)butanoyl)oxy)-2-((((3-(diethylamino)propoxy)carbonyl)oxy)methyl)propyl octadeca-9,12-dienoate (LP01), e.g., as synthesized in Example 13 of WO2015/095340 (incorporated by reference herein in its entirety).
  • the ionizable lipid is Di((Z)-non-2-en-1-yl) 9-((4-(dimethylamino)butanoyl)oxy)heptadecanedioate (L319), e.g., as synthesized in Example 7, 8, or 9 of US2012/0027803 (incorporated by reference herein in its entirety).
  • the ionizable lipid is 1,1'-((2-(4-(2-((2-(Bis(2-hydroxydodecyl)amino)ethyl)(2-hydroxydodecyl)amino)ethyl)piperazin-1-yl)ethyl)azanediyl)bis(dodecan-2-ol) (C12-200), e.g., as synthesized in Examples 14 and 16 of WO2010/053572 (incorporated by reference herein in its entirety).
  • the ionizable lipid is imidazole cholesterol ester (ICE) lipid (3S,10R,13R,17R)-10,13-dimethyl-17-((R)-6-methylheptan-2-yl)-2,3,4,7,8,9,10,11,12,13,14,15,16,17-tetradecahydro-1H-cyclopenta[a]phenanthren-3-yl 3-(1H-imidazol-4-yl)propanoate, e.g., Structure (I) from WO2020/106946 (incorporated by reference herein in its entirety).
  • lipid compounds that may be used (e.g., in combination with other lipid components) to form lipid nanoparticles for the delivery of compositions described herein, e.g., nucleic acid (e.g., RNA) described herein (e.g., a template nucleic acid or a nucleic acid encoding a gene modifying polypeptide) include: (i) a gene modifying composition described herein to the liver and/or hepatocyte cells. In some embodiments an LNP comprising Formula (ii) is used to deliver a gene modifying composition described herein to the liver and/or hepatocyte cells.
  • X1 is O, NR1, or a direct bond
  • X2 is C2-5 alkylene
  • R1 is H or Me
  • R3 is C1-3 alkyl
  • R2 is C1-3 alkyl
  • R2 taken together with the nitrogen atom to which it is attached and 1-3 carbon atoms of X2 form a 4-, 5-, or 6-membered ring
  • X1 is NR1, R1 and R2 taken together with the nitrogen atoms to which they are attached form a 5- or 6-membered ring, or R2 taken together with R3 and the nitrogen atom to which they are attached form a 5-, 6-, or 7-membered ring
  • Y1 is C2-12 alkylene
  • Y2 is selected from (in either orientation), and , (in either orientation), n is 0 to 3, R4 is
  • an LNP comprising Formula (xii) is used to deliver a gene modifying composition described herein to the liver and/or hepatocyte cells.
  • composition described herein to the liver and/or hepatocyte cells of Formula (xiv).
  • an LNP comprising a formulation of Formula (xvi) is used to deliver a gene modifying composition described herein to the lung endothelial cells.
  • DSPC distearoylphosphatidylcholine
  • DOPC dioleoylphosphatidylcholine
  • DPPC dipalmitoylphosphatidylcholine
  • DPPG dipalmitoylphosphatidylglycerol
  • DOPE dioleoyl-phosphatidylethanolamine
  • DOPE 1,2-dioleoyl-sn-
  • acyl groups in these lipids are preferably acyl groups derived from fatty acids having C10-C24 carbon chains, e.g., lauroyl, myristoyl, palmitoyl, stearoyl, or oleoyl.
  • Additional exemplary lipids include, without limitation, those described in Kim et al. (2020) dx.doi.org/10.1021/acs.nanolett.0c01386, incorporated herein by reference.
  • lipids include, in some embodiments, plant lipids found to improve liver transfection with mRNA (e.g., DGTS).
  • the non-cationic lipid may have the following structure: [structure not reproduced]. Lipid nanoparticles include, without limitation, nonphosphorous lipids such as, e.g., stearylamine, dodecylamine, hexadecylamine, acetyl palmitate, glycerol ricinoleate, hexadecyl stearate, isopropyl myristate, amphoteric acrylic polymers, triethanolamine-lauryl sulfate, alkyl-aryl sulfate, polyethyloxylated fatty acid amides, dioctadecyl dimethyl ammonium bromide, ceramide, sphingomyelin, and the like.
  • non-cationic lipids are described in WO2017/099823 or US patent publication US2018/0028664, the contents of which are incorporated herein by reference in their entirety.
  • the non-cationic lipid is oleic acid or a compound of Formula I, II, or IV of US2018/0028664, incorporated herein by reference in its entirety.
  • the non-cationic lipid can comprise, for example, 0-30% (mol) of the total lipid present in the lipid nanoparticle.
  • the non-cationic lipid content is 5-20% (mol) or 10-15% (mol) of the total lipid present in the lipid nanoparticle.
  • the molar ratio of ionizable lipid to the neutral lipid ranges from about 2:1 to about 8:1 (e.g., about 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, or 8:1).
  • the lipid nanoparticles do not comprise any phospholipids.
  • the lipid nanoparticle can further comprise a component, such as a sterol, to provide membrane integrity.
  • a sterol that can be used in the lipid nanoparticle is cholesterol and derivatives thereof.
  • Non-limiting examples of cholesterol derivatives include polar analogues such as 5α-cholestanol, 5β-coprostanol, cholesteryl-(2'-hydroxy)-ethyl ether, cholesteryl-(4'-hydroxy)-butyl ether, and 6-ketocholestanol; non-polar analogues such as 5α-cholestane, cholestenone, 5α-cholestanone, 5β-cholestanone, and cholesteryl decanoate; and mixtures thereof.
  • the cholesterol derivative is a polar analogue, e.g., cholesteryl-(4'-hydroxy)-butyl ether.
  • the component providing membrane integrity can comprise 0-50% (mol) (e.g., 0-10%, 10-20%, 20-30%, 30-40%, or 40-50%) of the total lipid present in the lipid nanoparticle.
  • such a component is 20-50% (mol) or 30-40% (mol) of the total lipid content of the lipid nanoparticle.
  • the lipid nanoparticle can comprise a polyethylene glycol (PEG) or a conjugated lipid molecule.
  • conjugated lipids include, but are not limited to, PEG-lipid conjugates, polyoxazoline (POZ)-lipid conjugates, polyamide-lipid conjugates (such as ATTA-lipid conjugates), cationic-polymer lipid (CPL) conjugates, and mixtures thereof.
  • the conjugated lipid molecule is a PEG-lipid conjugate, for example, a (methoxy polyethylene glycol)-conjugated lipid.
  • PEG-lipid conjugates include, but are not limited to, PEG-diacylglycerol (DAG) (such as 1-(monomethoxy-polyethyleneglycol)-2,3-dimyristoylglycerol (PEG-DMG)), PEG-dialkyloxypropyl (DAA), PEG-phospholipid, PEG-ceramide (Cer), a pegylated phosphatidylethanolamine (PEG-PE), 1,2-dimyristoyl-sn-glycerol, methoxypolyethylene glycol (DMG-PEG-2K), PEG succinate diacylglycerol (PEGS-DAG) (such as 4-O-(2',3'-di(tetradecanoyloxy)propyl-1-O-(ω-methoxy(polyethoxy)ethyl) butanedioate (PEG-S-DMG)), PEG dialkoxypropylcarbam, N-
  • PEG-lipid conjugates are described, for example, in US5,885,613, US6,287,591, US2003/0077829, US2005/0175682, US2008/0020058, US2011/0117125, US2010/0130588, US2016/0376224, US2017/0119904, and WO2017/099823, the contents of all of which are incorporated herein by reference in their entirety.
  • a PEG-lipid is a compound of Formula III, III-a-1, III-a-2, III-b-1, III-b-2, or V of US2018/0028664, the content of which is incorporated herein by reference in its entirety.
  • a PEG-lipid is of Formula II of US20150376115 or US2016/0376224, the content of both of which is incorporated herein by reference in its entirety.
  • the PEG-DAA conjugate can be, for example, PEG-dilauryloxypropyl, PEG-dimyristyloxypropyl, PEG-dipalmityloxypropyl, or PEG-distearyloxypropyl.
  • the PEG-lipid can be one or more of PEG-DMG, PEG-dilaurylglycerol, PEG-dipalmitoylglycerol, PEG-distearylglycerol, PEG-dilaurylglycamide, PEG-dimyristylglycamide, PEG-dipalmitoylglycamide, PEG-distearylglycamide, PEG-cholesterol (1-[8'-(Cholest-5-en-3[beta]-oxy)carboxamido-3',6'-dioxaoctanyl] carbamoyl-[omega]-methyl-poly(ethylene glycol)), PEG-DMB (3,4-Ditetradecoxylbenzyl-[omega]-methyl-poly(ethylene glycol) ether), and 1,2-dimyristoyl-sn-glycero-3-phosphoethanolamine-N-[methoxy(polyethylene glycol)-2000].
  • the PEG-lipid comprises PEG-DMG, 1,2-dimyristoyl-sn-glycero-3-phosphoethanolamine-N-[methoxy(polyethylene glycol)-2000].
  • the PEG-lipid comprises a structure selected from: [structures not reproduced].
  • lipids conjugated with a molecule other than a PEG can also be used in place of PEG-lipid.
  • polyoxazoline (POZ)-lipid conjugates, polyamide-lipid conjugates (such as ATTA-lipid conjugates), and cationic-polymer lipid (CPL) conjugates can be used in place of or in addition to the PEG-lipid.
  • Exemplary conjugated lipids, i.e., PEG-lipids, (POZ)-lipid conjugates, ATTA-lipid conjugates and cationic polymer-lipids, are described in the PCT and US patent applications listed in Table 2 of WO2019051289A9 and in WO2020106946A1, the contents of all of which are incorporated herein by reference in their entirety.
  • an LNP comprises a compound of Formula (xix), a compound of Formula (xxi) and a compound of Formula (xxv).
  • an LNP comprising a formulation of Formula (xix), Formula (xxi) and Formula (xxv) is used to deliver a gene modifying composition described herein to the lung or pulmonary cells.
  • a lipid nanoparticle may comprise one or more cationic lipids selected from Formula (i), Formula (ii), Formula (iii), Formula (vii), and Formula (ix).
  • the LNP may further comprise one or more neutral lipids, e.g., DSPC, DPPC, DMPC, DOPC, POPC, DOPE, SM, a steroid, e.g., cholesterol, and/or one or more polymer conjugated lipids, e.g., a pegylated lipid, e.g., PEG-DAG, PEG-PE, PEG-S-DAG, PEG-cer or a PEG dialkoxypropylcarbamate.
  • the PEG or the conjugated lipid can comprise 0-20% (mol) of the total lipid present in the lipid nanoparticle.
  • PEG or the conjugated lipid content is 0.5-10% or 2-5% (mol) of the total lipid present in the lipid nanoparticle.
  • Molar ratios of the ionizable lipid, non-cationic-lipid, sterol, and PEG/conjugated lipid can be varied as needed.
  • the lipid particle can comprise 30-70% ionizable lipid by mole or by total weight of the composition, 0-60% cholesterol by mole or by total weight of the composition, 0-30% non-cationic-lipid by mole or by total weight of the composition and 1-10% conjugated lipid by mole or by total weight of the composition.
  • the composition comprises 30-40% ionizable lipid by mole or by total weight of the composition, 40-50% cholesterol by mole or by total weight of the composition, and 10-20% non-cationic-lipid by mole or by total weight of the composition.
  • the composition is 50-75% ionizable lipid by mole or by total weight of the composition, 20-40% cholesterol by mole or by total weight of the composition, and 5 to 10% non-cationic-lipid, by mole or by total weight of the composition and 1-10% conjugated lipid by mole or by total weight of the composition.
  • the composition may contain 60-70% ionizable lipid by mole or by total weight of the composition, 25-35% cholesterol by mole or by total weight of the composition, and 5-10% non-cationic-lipid by mole or by total weight of the composition.
  • the composition may also contain up to 90% ionizable lipid by mole or by total weight of the composition and 2 to 15% non-cationic lipid by mole or by total weight of the composition.
  • the formulation may also be a lipid nanoparticle formulation, for example comprising 8-30% ionizable lipid by mole or by total weight of the composition, 5-30% non-cationic lipid by mole or by total weight of the composition, and 0-20% cholesterol by mole or by total weight of the composition; 4-25% ionizable lipid by mole or by total weight of the composition, 4-25% non-cationic lipid by mole or by total weight of the composition, 2 to 25% cholesterol by mole or by total weight of the composition, 10 to 35% conjugate lipid by mole or by total weight of the composition, and 5% cholesterol by mole or by total weight of the composition; or 2-30% ionizable lipid by mole or by total weight of the composition, 2-30% non-cationic lipid by mole or by total weight of the composition, 1 to 15% cholesterol by mole or by total weight of the composition, 2 to 35% conjugate lipid
  • the lipid particle formulation comprises ionizable lipid, phospholipid, cholesterol and a PEG-ylated lipid in a molar ratio of 50:10:38.5:1.5. In some other embodiments, the lipid particle formulation comprises ionizable lipid, cholesterol and a PEG-ylated lipid in a molar ratio of 60:38.5:1.5. In some embodiments, the lipid particle comprises ionizable lipid, non-cationic lipid (e.g., phospholipid), a sterol (e.g., cholesterol) and a PEG-ylated lipid, where the molar ratio of lipids ranges from 20 to 70 mole percent for the ionizable lipid, with a target of 40-60, the mole percent of non-cationic lipid ranges from 0 to 30, with a target of 0 to 15, the mole percent of sterol ranges from 20 to 70, with a target of 30 to 50, and the mole percent of PEG-ylated lipid ranges from 1 to 6, with a target of 2 to 5.
  • the lipid particle comprises ionizable lipid / non-cationic-lipid / sterol / conjugated lipid at a molar ratio of 50:10:38.5:1.5.
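A molar ratio such as 50:10:38.5:1.5 can be normalized to mole percent so that ratios with different numbers of components are compared on the same scale. This is a minimal sketch under assumptions: the function name and the component labels are illustrative and do not come from the source.

```python
def ratio_to_mol_percent(ratio, names):
    """Convert a colon-separated molar ratio string into mole percent per component."""
    parts = [float(p) for p in ratio.replace(" ", "").split(":")]
    if len(parts) != len(names):
        raise ValueError("ratio and names must have the same number of entries")
    total = sum(parts)
    if total <= 0:
        raise ValueError("ratio must have a positive total")
    return {name: 100.0 * part / total for name, part in zip(names, parts)}


# Component labels are illustrative, chosen to mirror the ratios quoted in the text.
four_component = ratio_to_mol_percent(
    "50:10:38.5:1.5", ["ionizable", "non_cationic", "sterol", "peg_lipid"])
three_component = ratio_to_mol_percent(
    "60:38.5:1.5", ["ionizable", "sterol", "peg_lipid"])
```

Both quoted ratios already sum to 100, so the normalized values equal the ratio entries; a ratio written on another scale (for example 5:1:3.85:0.15) would normalize to the same mole percentages.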
  • the disclosure provides a lipid nanoparticle formulation comprising phospholipids, lecithin, phosphatidylcholine and phosphatidylethanolamine.
  • one or more additional compounds can also be included. Those compounds can be administered separately or the additional compounds can be included in the lipid nanoparticles of the invention.
  • the lipid nanoparticles can contain other compounds in addition to the nucleic acid or at least a second nucleic acid, different than the first.
  • additional compounds can be selected from the group consisting of small or large organic or inorganic molecules, monosaccharides, disaccharides, trisaccharides, oligosaccharides, polysaccharides, peptides, proteins, peptide analogs and derivatives thereof, peptidomimetics, nucleic acids, nucleic acid analogs and derivatives, an extract made from biological materials, or any combinations thereof.
  • a lipid nanoparticle (or a formulation comprising lipid nanoparticles) lacks reactive impurities (e.g., aldehydes or ketones), or comprises less than a preselected level of reactive impurities (e.g., aldehydes or ketones).
  • a lipid reagent is used to make a lipid nanoparticle formulation, and the lipid reagent may comprise a contaminating reactive impurity (e.g., an aldehyde or ketone).
  • a lipid reagent may be selected for manufacturing based on having less than a preselected level of reactive impurities (e.g., aldehydes or ketones).
  • aldehydes can cause modification and damage of RNA, e.g., cross-linking between bases and/or covalently conjugating lipid to RNA (e.g., forming lipid-RNA adducts). This may, in some instances, lead to failure of a reverse transcriptase reaction and/or incorporation of inappropriate bases, e.g., at the site(s) of lesion(s), e.g., a mutation in a newly synthesized target DNA.
  • a lipid nanoparticle formulation is produced using a lipid reagent comprising less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% total reactive impurity (e.g., aldehyde) content.
  • a lipid nanoparticle formulation is produced using a lipid reagent comprising less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% of any single reactive impurity (e.g., aldehyde) species.
  • a lipid nanoparticle formulation is produced using a lipid reagent comprising: (i) less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% total reactive impurity (e.g., aldehyde) content; and (ii) less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% of any single reactive impurity (e.g., aldehyde) species.
  • the lipid nanoparticle formulation is produced using a plurality of lipid reagents, and each lipid reagent of the plurality independently meets one or more criterion described in this paragraph. In some embodiments, each lipid reagent of the plurality meets the same criterion, e.g., a criterion of this paragraph. In some embodiments, the lipid nanoparticle formulation comprises less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% total reactive impurity (e.g., aldehyde) content.
  • the lipid nanoparticle formulation comprises less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% of any single reactive impurity (e.g., aldehyde) species.
  • the lipid nanoparticle formulation comprises: (i) less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% total reactive impurity (e.g., aldehyde) content; and (ii) less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% of any single reactive impurity (e.g., aldehyde) species.
  • one or more, or optionally all, of the lipid reagents used for a lipid nanoparticle as described herein or a formulation thereof comprise less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% total reactive impurity (e.g., aldehyde) content.
  • one or more, or optionally all, of the lipid reagents used for a lipid nanoparticle as described herein or a formulation thereof comprise less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% of any single reactive impurity (e.g., aldehyde) species.
  • one or more, or optionally all, of the lipid reagents used for a lipid nanoparticle as described herein or a formulation thereof comprise: (i) less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% total reactive impurity (e.g., aldehyde) content; and (ii) less than 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% of any single reactive impurity (e.g., aldehyde) species.
  • total aldehyde content and/or quantity of any single reactive impurity (e.g., aldehyde) species is determined by liquid chromatography (LC), e.g., coupled with tandem mass spectrometry (MS/MS), e.g., according to the method described in Example 40 of PCT/US21/20948.
  • reactive impurity (e.g., aldehyde) content and/or quantity of reactive impurity (e.g., aldehyde) species is determined by detecting one or more chemical modifications of a nucleic acid molecule (e.g., an RNA molecule, e.g., as described herein) associated with the presence of reactive impurities (e.g., aldehydes), e.g., in the lipid reagents.
  • reactive impurity (e.g., aldehyde) content and/or quantity of reactive impurity (e.g., aldehyde) species is determined by detecting one or more chemical modifications of a nucleotide or nucleoside (e.g., a ribonucleotide or ribonucleoside, e.g., comprised in or isolated from a template nucleic acid, e.g., as described herein) associated with the presence of reactive impurities (e.g., aldehydes), e.g., in the lipid reagents, e.g., according to the method described in Example 41 of PCT/US21/20948.
  • a nucleic acid has less than 50, 20, 10, 5, 2, or 1 aldehyde modifications per 1000 nucleotides, e.g., wherein a single cross-linking of two nucleotides is a single aldehyde modification.
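The per-1000-nucleotide figure above counts each cross-link between two nucleotides as a single modification. A minimal sketch of that arithmetic follows; the function and parameter names are hypothetical, and the example counts are invented for illustration.

```python
def aldehyde_mods_per_1000_nt(adduct_count, crosslink_count, length_nt):
    """Modifications per 1000 nucleotides.

    Each cross-link joins two nucleotides but is counted once, following the
    counting convention stated in the text. Names and inputs are illustrative.
    """
    if length_nt <= 0:
        raise ValueError("length_nt must be positive")
    if adduct_count < 0 or crosslink_count < 0:
        raise ValueError("counts must be non-negative")
    return 1000.0 * (adduct_count + crosslink_count) / length_nt


# Hypothetical example: 3 adducts and 1 cross-link in a 4000-nt RNA.
rate = aldehyde_mods_per_1000_nt(3, 1, 4000)
```

In the hypothetical example the RNA carries 1 modification per 1000 nucleotides, which satisfies the "less than 2" threshold in the list above but not the "less than 1" threshold.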
  • the aldehyde modification is an RNA adduct (e.g., a lipid-RNA adduct).
  • the aldehyde-modified nucleotide is cross-linking between bases.
  • a nucleic acid (e.g., RNA) described herein comprises less than 50, 20, 10, 5, 2, or 1 cross-links between nucleotides.
  • LNPs are directed to specific tissues by the addition of targeting domains.
  • biological ligands may be displayed on the surface of LNPs to enhance interaction with cells displaying cognate receptors, thus driving association with and cargo delivery to tissues wherein cells express the receptor.
  • the biological ligand may be a ligand that drives delivery to the liver, e.g., LNPs that display GalNAc result in delivery of nucleic acid cargo to hepatocytes that display asialoglycoprotein receptor (ASGPR).
  • Mol Ther 18(7):1357-1364 (2010) teaches the conjugation of a trivalent GalNAc ligand to a PEG-lipid (GalNAc-PEG-DSG) to yield LNPs dependent on ASGPR for observable LNP cargo effect (see, e.g., Figure 6 therein).
  • ligand-displaying LNP formulations e.g., incorporating folate, transferrin, or antibodies
  • WO2017223135 is incorporated herein by reference in its entirety, in addition to the references used therein, namely Kolhatkar et al., Curr Drug Discov Technol. 2011 8:197-206; Musacchio and Torchilin, Front Biosci. 2011 16:1388-1412; Yu et al., Mol Membr Biol. 2010 27:286-298; Patil et al., Crit Rev Ther Drug Carrier Syst.
  • LNPs are selected for tissue-specific activity by the addition of a Selective ORgan Targeting (SORT) molecule to a formulation comprising traditional components, such as ionizable cationic lipids, amphipathic phospholipids, cholesterol and poly(ethylene glycol) (PEG) lipids.
  • Nat Nanotechnol 15(4):313-320 demonstrate that the addition of a supplemental “SORT” component precisely alters the in vivo RNA delivery profile and mediates tissue-specific (e.g., lungs, liver, spleen) gene delivery and editing as a function of the percentage and biophysical property of the SORT molecule.
  • the LNPs comprise biodegradable, ionizable lipids.
  • the LNPs comprise (9Z,12Z)-3-((4,4-bis(octyloxy)butanoyl)oxy)-2-((((3-(diethylamino)propoxy)carbonyl)oxy)methyl)propyl octadeca-9,12-dienoate (also called 3-((4,4-bis(octyloxy)butanoyl)oxy)-2-((((3-(diethylamino)propoxy)carbonyl)oxy)methyl)propyl (9Z,12Z)-octadeca-9,12-dienoate) or another ionizable lipid.
  • an LNP described herein comprises a lipid described in Table 19.
  • multiple components of a gene modifying system may be prepared as a single LNP formulation, e.g., an LNP formulation comprises mRNA encoding for the gene modifying polypeptide and an RNA template. Ratios of nucleic acid components may be varied in order to maximize the properties of a therapeutic.
  • the ratio of RNA template to mRNA encoding a gene modifying polypeptide is about 1:1 to 100:1, e.g., about 1:1 to 20:1, about 20:1 to 40:1, about 40:1 to 60:1, about 60:1 to 80:1, or about 80:1 to 100:1, by molar ratio.
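Because a template RNA is typically much shorter than an mRNA, a given molar ratio corresponds to a quite different mass ratio. The sketch below converts between the two under the simplifying assumption, not stated in the source, that both RNAs have the same average mass per nucleotide, so mass scales with length; the function name and the example lengths are hypothetical.

```python
def template_to_mrna_mass_ratio(molar_ratio, template_nt, mrna_nt):
    """Mass ratio (template : mRNA) implied by a molar ratio (template : mRNA).

    Assumes equal average mass per nucleotide for both RNAs, so the mass of
    each species is proportional to (moles x length in nucleotides).
    """
    if molar_ratio <= 0 or template_nt <= 0 or mrna_nt <= 0:
        raise ValueError("all inputs must be positive")
    return molar_ratio * template_nt / mrna_nt


# Hypothetical example: 20:1 molar ratio, 200-nt template, 5000-nt mRNA.
mass_ratio = template_to_mrna_mass_ratio(20, 200, 5000)
```

In this hypothetical case a 20:1 molar excess of template corresponds to roughly 0.8 mass units of template per mass unit of mRNA.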
  • a system of multiple nucleic acids may be prepared by separate formulations, e.g., one LNP formulation comprising a template RNA and a second LNP formulation comprising an mRNA encoding a gene modifying polypeptide.
  • the system may comprise more than two nucleic acid components formulated into LNPs.
  • the system may comprise a protein, e.g., a gene modifying polypeptide, and a template RNA formulated into at least one LNP formulation.
  • the average LNP diameter of the LNP formulation may be between 10s of nm and 100s of nm, e.g., measured by dynamic light scattering (DLS).
  • the average LNP diameter of the LNP formulation may be from about 40 nm to about 150 nm, such as about 40 nm, 45 nm, 50 nm, 55 nm, 60 nm, 65 nm, 70 nm, 75 nm, 80 nm, 85 nm, 90 nm, 95 nm, 100 nm, 105 nm, 110 nm, 115 nm, 120 nm, 125 nm, 130 nm, 135 nm, 140 nm, 145 nm, or 150 nm.
  • the average LNP diameter of the LNP formulation may be from about 50 nm to about 100 nm, from about 50 nm to about 90 nm, from about 50 nm to about 80 nm, from about 50 nm to about 70 nm, from about 50 nm to about 60 nm, from about 60 nm to about 100 nm, from about 60 nm to about 90 nm, from about 60 nm to about 80 nm, from about 60 nm to about 70 nm, from about 70 nm to about 100 nm, from about 70 nm to about 90 nm, from about 70 nm to about 80 nm, from about 80 nm to about 100 nm, from about 80 nm to about 90 nm, or from about 90 nm to about 100 nm.
  • the average LNP diameter of the LNP formulation may be from about 70 nm to about 100 nm. In a particular embodiment, the average LNP diameter of the LNP formulation may be about 80 nm. In some embodiments, the average LNP diameter of the LNP formulation may be about 100 nm. In some embodiments, the average LNP diameter of the LNP formulation ranges from about 1 mm to about 500 mm, from about 5 mm to about 200 mm, from about 10 mm to about 100 mm, from about 20 mm to about 80 mm, from about 25 mm to about 60 mm, from about 30 mm to about 55 mm, from about 35 mm to about 50 mm, or from about 38 mm to about 42 mm.
  • An LNP may, in some instances, be relatively homogenous.
  • a polydispersity index may be used to indicate the homogeneity of an LNP, e.g., the particle size distribution of the lipid nanoparticles.
  • a small (e.g., less than 0.3) polydispersity index generally indicates a narrow particle size distribution.
  • An LNP may have a polydispersity index from about 0 to about 0.25, such as 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22, 0.23, 0.24, or 0.25.
  • the polydispersity index of an LNP may be from about 0.10 to about 0.20.
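The polydispersity index can be illustrated with a short sketch. It assumes the common approximation PDI ≈ (standard deviation / mean diameter)², applied to a list of particle diameters. DLS instruments typically derive the index from a cumulants fit of the autocorrelation function instead, so this is an illustration of the quantity, not the instrument's method. The function names and the example diameters are hypothetical.

```python
import statistics


def polydispersity_index(diameters_nm):
    """Approximate PDI as (population standard deviation / mean diameter) squared."""
    mean = statistics.fmean(diameters_nm)
    if mean <= 0:
        raise ValueError("mean diameter must be positive")
    sd = statistics.pstdev(diameters_nm)
    return (sd / mean) ** 2


def within_range(pdi, low=0.0, high=0.25):
    """Check a PDI against a range such as the 'about 0 to about 0.25' range above."""
    return low <= pdi <= high
```

A perfectly uniform sample gives an index of 0, and broader size distributions give larger values.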
  • the zeta potential of an LNP may be used to indicate the electrokinetic potential of the composition.
  • the zeta potential may describe the surface charge of an LNP. Lipid nanoparticles with relatively low charges, positive or negative, are generally desirable, as more highly charged species may interact undesirably with cells, tissues, and other elements in the body.
  • the zeta potential of an LNP may be from about -10 mV to about +20 mV, from about -10 mV to about +15 mV, from about -10 mV to about +10 mV, from about -10 mV to about +5 mV, from about -10 mV to about 0 mV, from about -10 mV to about -5 mV, from about -5 mV to about +20 mV, from about -5 mV to about +15 mV, from about -5 mV to about +10 mV, from about -5 mV to about +5 mV, from about -5 mV to about 0 mV, from about 0 mV to about +20 mV, from about 0 mV to about +15 mV, from about 0 mV to about +10 mV, from about 0 mV to about +5 mV, from about 0 mV to about +20 mV, from about
  • the efficiency of encapsulation of a protein and/or nucleic acid describes the amount of protein and/or nucleic acid that is encapsulated or otherwise associated with an LNP after preparation, relative to the initial amount provided.
  • the encapsulation efficiency is desirably high (e.g., close to 100%).
  • the encapsulation efficiency may be measured, for example, by comparing the amount of protein or nucleic acid in a solution containing the lipid nanoparticle before and after breaking up the lipid nanoparticle with one or more organic solvents or detergents.
  • an anion exchange resin may be used to measure the amount of free protein or nucleic acid (e.g., RNA) in a solution. Fluorescence may be used to measure the amount of free protein and/or nucleic acid (e.g., RNA) in a solution.
  • the encapsulation efficiency of a protein and/or nucleic acid may be at least 50%, for example 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100%. In some embodiments, the encapsulation efficiency may be at least 80%.
  • the encapsulation efficiency may be at least 90%. In some embodiments, the encapsulation efficiency may be at least 95%.
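The before/after comparison described above can be sketched as a simple calculation. This sketch assumes that the signal measured before disrupting the LNPs reflects free (unencapsulated) cargo and that the signal measured after disruption reflects total cargo. The function name and the example values are hypothetical.

```python
def encapsulation_efficiency(free_signal: float, total_signal: float) -> float:
    """Percent of cargo associated with the LNP.

    free_signal:  amount measured before disrupting the LNPs (free cargo).
    total_signal: amount measured after disrupting the LNPs (free + encapsulated).
    """
    if total_signal <= 0:
        raise ValueError("total_signal must be positive")
    if not 0 <= free_signal <= total_signal:
        raise ValueError("free_signal must lie between 0 and total_signal")
    return 100.0 * (total_signal - free_signal) / total_signal
```

With a hypothetical free signal of 50 units and a total signal of 1000 units, the function returns 95%, which would meet the "at least 95%" embodiment.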
  • An LNP may optionally comprise one or more coatings. In some embodiments, an LNP may be formulated in a capsule, film, or tablet having a coating. A capsule, film, or tablet including a composition described herein may have any useful size, tensile strength, hardness or density. Additional exemplary lipids, formulations, methods, and characterization of LNPs are taught by WO2020061457, which is incorporated herein by reference in its entirety.
  • in vitro or ex vivo cell lipofections are performed using Lipofectamine MessengerMax (Thermo Fisher) or TransIT-mRNA Transfection Reagent (Mirus Bio).
  • LNPs are formulated using the GenVoy_ILM ionizable lipid mix (Precision NanoSystems).
  • LNPs are formulated using 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane (DLin-KC2-DMA) or dilinoleylmethyl-4-dimethylaminobutyrate (DLin-MC3-DMA or MC3), the formulation and in vivo use of which are taught in Jayaraman et al. Angew Chem Int Ed Engl 51(34):8529-8533 (2012), incorporated herein by reference in its entirety.
  • LNP formulations optimized for the delivery of CRISPR-Cas systems e.g., Cas9-gRNA RNP, gRNA, Cas9 mRNA
  • Additional specific LNP formulations useful for delivery of nucleic acids are described in US8158601 and US8168775, both incorporated by reference, which include formulations used in patisiran, sold under the name ONPATTRO.
  • a gene modifying mRNA and a guide RNA may be co-formulated in an LNP as described herein. They may be separately formulated. They may be combined prior to injection. They may be combined at a molar ratio in the range of about 1:10 to 1:250 mRNA:gRNA.
  • mRNA and guide RNA may be injected 30-180 minutes apart where the mRNA LNPs are delivered first followed by the guide RNA LNPs.
  • They may be delivered about 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, or 180 minutes apart.
  • the mRNA and/or gRNA may be dosed at 0.01 – 6 mg/kg either separately or together as a total amount of RNA-LNP.
  • the RNA-LNPs may be injected IV bolus.
  • the RNA-LNPs may be infused over a period of 30-360 minutes.
  • the RNA-LNPs may be infused over a period of about 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330 or 360 minutes.
  • Formulations.
  • the gene modifying polypeptide mRNA and Template RNA are separately formulated as described above, combined prior to injection at a 1:20 RNA molar ratio, mRNA:Template RNA (and optionally mRNA:second-nick guide RNA), respectively.
  • the gene modifying polypeptide mRNA and Template RNA are separately formulated as described above, combined prior to injection at a 1:50 RNA molar ratio, mRNA:guide RNAs (and optionally mRNA:second-nick guide RNA), respectively.
  • the gene modifying polypeptide mRNA and Template RNA are separately formulated, combined prior to injection at ratio ranges from 1:10-1:250, mRNA:Template RNA (and optionally mRNA:second-nick guide RNA), respectively.
  • the mRNA and Template RNA are mixed together at a 1:10-1:250, mRNA:Template RNA (and optionally mRNA:second-nick guide RNA), and then formulated as described above, where the RNA concentration going into formulation is 0.1 mg/mL.
  • the mRNA and Template RNA are formulated separately and are injected 30-180 minutes apart, where the mRNA LNPs are delivered first followed by the Template RNA (and optional second-nick guide RNA) LNPs.
  • the ionizable lipid is LIPIDV005 from Table 19.
  • Dosing.
  • the gene modifying polypeptide mRNA and/or Template RNA are dosed at 0.01 – 6 mg/kg, either separately or together as a total amount of RNA-LNP.
  • the RNA-LNPs are injected as an IV bolus.
  • the RNA-LNPs are infused over a period of 30-360 minutes.
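The dosing arithmetic above can be sketched as follows. The sketch takes the 0.01 to 6 mg/kg range and the 30 to 360 minute infusion window from the text; the function names and the body weight in the usage note are hypothetical and serve only to illustrate the calculation.

```python
def total_rna_dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Total RNA-LNP payload (mg) for a weight-based dose."""
    if not 0.01 <= dose_mg_per_kg <= 6.0:
        raise ValueError("dose outside the 0.01 to 6 mg/kg range given in the text")
    return dose_mg_per_kg * body_weight_kg


def infusion_rate_mg_per_min(dose_mg: float, duration_min: float) -> float:
    """Average delivery rate (mg/min) for an infusion of the stated duration."""
    if not 30 <= duration_min <= 360:
        raise ValueError("duration outside the 30 to 360 minute window given in the text")
    return dose_mg / duration_min
```

For a hypothetical 70 kg subject dosed at 1 mg/kg, `total_rna_dose_mg(1.0, 70.0)` gives 70 mg, and infusing that amount over 140 minutes corresponds to an average of 0.5 mg/min.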
  • Kits, Articles of Manufacture, and Pharmaceutical Compositions
  • in an aspect, the disclosure provides a kit comprising a gene modifying polypeptide or a gene modifying system, e.g., as described herein.
  • the kit comprises a gene modifying polypeptide (or a nucleic acid encoding the polypeptide) and a template RNA (or DNA encoding the template RNA).
  • the kit further comprises a reagent for introducing the system into a cell, e.g., transfection reagent, LNP, and the like.
  • the kit is suitable for any of the methods described herein.
  • the kit comprises one or more elements, compositions (e.g., pharmaceutical compositions), gene modifying polypeptides, and/or gene modifying systems, or a functional fragment or component thereof, e.g., disposed in an article of manufacture.
  • the kit comprises instructions for use thereof.
  • the disclosure provides an article of manufacture, e.g., in which a kit as described herein, or a component thereof, is disposed.
  • the disclosure provides a pharmaceutical composition comprising a gene modifying polypeptide or a gene modifying system, e.g., as described herein.
  • the pharmaceutical composition further comprises a pharmaceutically acceptable carrier or excipient.
  • the pharmaceutical composition comprises a template RNA and/or an RNA encoding the polypeptide.
  • the pharmaceutical composition has one or more (e.g., 1, 2, 3, or 4) of the following characteristics: (a) less than 1% (e.g., less than 0.5%, 0.4%, 0.3%, 0.2%, or 0.1%) DNA template relative to the template RNA and/or the RNA encoding the polypeptide, e.g., on a molar basis; (b) less than 1% (e.g., less than 0.5%, 0.4%, 0.3%, 0.2%, or 0.1%) uncapped RNA relative to the template RNA and/or the RNA encoding the polypeptide, e.g., on a molar basis; (c) less than 1% (e.g., less than 0.5%, 0.4%, 0.3%, 0.2%, or 0.1%) partial length RNAs relative to the template RNA
  • a gene modifying system, polypeptide, and/or template nucleic acid (e.g., template RNA) conforms to certain quality standards.
  • a method described herein conforms to certain quality standards.
  • the disclosure is directed, in some aspects, to methods of manufacturing a gene modifying system, polypeptide, and/or template nucleic acid (e.g., template RNA) that conforms to certain quality standards, e.g., in which said quality standards are assayed.
  • the disclosure is also directed, in some aspects, to methods of assaying said quality standards in a gene modifying system, polypeptide, and/or template nucleic acid (e.g., template RNA).
  • a system or pharmaceutical composition described herein is endotoxin free.
  • the presence, absence, and/or level of one or more of a pyrogen, virus, fungus, bacterial pathogen, and/or host cell protein is determined.
  • Example 1 Evaluating the placement of 2’OMe modifications in the template RNAs to improve rewriting
  • This example describes the use of exemplary gene modifying systems containing a gene modifying polypeptide and a template RNA comprising a gRNA scaffold, a spacer, a heterologous object sequence and a PBS sequence, and different patterns of 2’-O-methyl chemical modifications in the PBS sequence and heterologous object sequence, to quantify the activity of template RNAs targeting the FAH, HEK3, HBB and GFP loci.
  • the template RNAs contained: • a gRNA spacer; • a gRNA scaffold; • a heterologous object sequence; and • a primer binding site (PBS) sequence.
  • a gene modifying polypeptide contained: • an endonuclease and/or DNA binding domain; • a peptide linker; and • a reverse transcriptase (RT) domain.
  • Exemplary template RNAs evaluated in this Example are listed in Table 12 above.
  • the exemplary gene modifying polypeptide used in this example had the following amino acid sequence (RNAV093): MKRTADGSEFESPKKKRKVDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALL FDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNI VDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTY NQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDA KLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDL TLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFY
  • the gene modifying system included an mRNA encoding the RNAV093 gene modifying polypeptide and a template RNA (tgRNA) as described above. Specifically, 75 ng of RNAV093 mRNA and 1 pmol tgRNA were diluted to 10 µl and mixed with 35 µl Opti-MEM containing 0.5 µl MessengerMax. The lipoplexes were mixed with the cells in suspension and plated into 96-well plates. After transfection, cells were grown at 37 °C, 5% CO2 in DMEM media supplemented with 10% serum for 3 days prior to cell lysis and genomic DNA extraction.
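When the transfection described above is set up for many wells, the per-well amounts are usually scaled into a master mix. The sketch below assumes that the quantities stated in the text are per well of a 96-well plate and that a fractional overage is added for pipetting loss. The overage and the dictionary keys are illustrative assumptions, not part of the described protocol.

```python
# Per-well quantities as stated in the text (assumed here to be per well).
PER_WELL = {
    "mRNA_ng": 75.0,          # RNAV093 mRNA
    "tgRNA_pmol": 1.0,        # template RNA
    "RNA_dilution_ul": 10.0,  # final volume of the diluted RNA
    "OptiMEM_ul": 35.0,       # Opti-MEM containing the lipid reagent
    "MessengerMax_ul": 0.5,   # Lipofectamine MessengerMax
}


def master_mix(n_wells: int, overage: float = 0.10) -> dict:
    """Scale per-well amounts to n_wells, adding a fractional overage (assumption)."""
    if n_wells <= 0:
        raise ValueError("n_wells must be positive")
    factor = n_wells * (1.0 + overage)
    return {name: amount * factor for name, amount in PER_WELL.items()}
```

Calling `master_mix(10, overage=0.0)` scales each component tenfold, e.g., 750 ng of mRNA and 5 µl of MessengerMax.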
  • Editing of the FAH, HBB, and HEK3 target nucleic acid sequences was assessed by amplicon sequencing (Amp-Seq) using primers flanking the loci. Editing at the GFP locus was assessed by flow cytometry. To prepare treated cells for flow cytometry, they were collected and stained with Zombie Red Dye (1:500). The stained cells were analyzed on the Novocyte flow cytometer. For GFP to BFP conversion analysis, cells were first gated based on FSC-A and SSC-A. Second, single cells were gated based on FSC-A and FSC-H.
  • FIG. 3A shows the effect of introducing three (3) consecutive 2’-O-methyl modifications in the priming sequence (PBS) and/or heterologous object sequence of the FAH1 template gRNA, which contained either unmodified scaffold or heavily modified scaffold (containing 2’-O-methyl modifications).
  • FIG. 3C shows the effect of different patterns of 2’-O-methyl and phosphorothioate modification in the PBS sequence and heterologous object sequence of different template RNAs targeting the HEK3 gene, FAH gene and HBB gene.
  • the HEK3 template RNAs were designed to insert either a CTT sequence or a 10 nt insertion, the FAH1 and FAH2 tgRNAs were designed to correct the FAH G>A mutation, and the HBB5 and HBB8 tgRNAs were designed to correct the mutation that causes sickle cell disease. Across all six tgRNAs, relative to the A1 design, the best-tolerated modification patterns appeared to be designs B12 and B12b. Both of these designs contained no modifications in the heterologous object sequence and contained 2’-O-methyl and phosphorothioate modifications throughout the PBS sequence region except at the -1 to -3 positions (B12) or the -1 to -4 positions (B12b).
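The B12 and B12b layouts can be pictured as a per-position mask over the PBS sequence. The sketch below assumes that PBS positions are numbered -1, -2, ... starting at the nucleotide adjacent to the heterologous object sequence. It records only whether a position is modified, not which modification (2’-O-methyl or phosphorothioate) is used. The function name and the PBS length in the usage note are hypothetical.

```python
def pbs_modification_mask(pbs_length: int, unmodified_through: int) -> list:
    """Return one boolean per PBS position (-1, -2, ..., -pbs_length).

    True means the position carries a modification. Positions -1 through
    -unmodified_through are left unmodified (3 for B12, 4 for B12b).
    ASSUMPTION: position -1 is the PBS nucleotide next to the heterologous
    object sequence.
    """
    if not 0 <= unmodified_through <= pbs_length:
        raise ValueError("unmodified_through must be between 0 and pbs_length")
    return [position > unmodified_through for position in range(1, pbs_length + 1)]
```

For a hypothetical 8 nt PBS, `pbs_modification_mask(8, 3)` leaves the first three positions unmodified (B12-like), while `pbs_modification_mask(8, 4)` leaves the first four unmodified (B12b-like).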
  • Example 2 Evaluating the placement of 2’-fluoro modifications in the template RNAs to improve rewriting
  • This example describes the use of exemplary gene modifying systems containing a gene modifying polypeptide and a template RNA comprising a gRNA scaffold, a spacer, a heterologous object sequence, a PBS sequence, and different patterns of 2’-fluoro chemical modifications in the PBS sequence and heterologous object sequence, to quantify the activity of template RNAs targeting GFP loci.
  • a template RNA contained: • a gRNA spacer; • a gRNA scaffold; • a heterologous object sequence; and • a primer binding site (PBS) sequence.
  • a gene modifying polypeptide contained: • an endonuclease and/or DNA binding domain; • a peptide linker; and • a reverse transcriptase (RT) domain.
  • the exemplary gene modifying polypeptide used in this example had the following amino acid sequence (RNAV093): MKRTADGSEFESPKKKRKVDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALL FDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNI VDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTY NQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDA KLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDL TLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFY
  • the gene modifying system included RNAV093 gene modifying polypeptide and a template RNA described above. Specifically, 75 ng of RNAV093 mRNA and 1 pmol tgRNA were diluted to 10 µl and mixed with 35 µl Opti-MEM containing 0.5 µl MessengerMax. The lipoplexes were mixed with the cells in suspension and plated into 96-well plates. After transfection, cells were grown at 37 °C, 5% CO2 in DMEM media supplemented with 10% serum for 3 days prior to cell lysis and genomic DNA extraction. Editing at the GFP locus was assessed by flow cytometry. To prepare treated cells for flow cytometry, they were collected and stained with Zombie Red Dye (1:500).
  • FIG. 4 shows the effect of introducing 2 consecutive 2’-fluoro modifications in the priming sequence (PBS sequence) and/or the heterologous object sequence of the GFP template gRNA.
  • Example 3 Evaluating the combination of 2’-fluoro and 2’OMe modification patterns in the template RNAs to improve gene rewriting
  • This example describes the use of exemplary gene modifying systems containing a gene modifying polypeptide, a nicking gRNA comprising a spacer and a gRNA scaffold, and template RNAs comprising different spacers, a gRNA scaffold, and different chemically modified heterologous object sequences and PBS sequences, to quantify the activity of template RNAs targeting hPAH and the mPah gene in mouse models.
  • this example describes the use of specific patterns of 2’-fluoro and 2’-O-methyl chemical modifications within the heterologous object sequences and the priming sequences that enable significantly higher gene rewriting efficacies in vitro and in vivo.
  • the specific patterns of 2’-fluoro and 2’-O-methyl modifications were designed based on the results of Examples 1 and 2 above. Using the 2’-O-methyl modification walk (see Example 1), it was determined that the -1 to -4 region of the priming sequence does not appear to tolerate 2’-O-methyl modifications; however, the region downstream of the -4 position was able to tolerate 2’-O-methyl modifications.
  • alternating 2’-fluoro modification patterns were designed by placing a 2’-fluoro at position +4 and then on alternating positions onwards towards the 5’ end to prevent overwinding of the RNA helix and to allow read-through by the reverse transcriptase.
  • the 2’-fluoro modifications may not only increase nuclease and thermodynamic stability, but may also modulate RNA-protein interactions (i.e., 2’-fluoro cannot participate in hydrogen bonds, but can still participate in electrostatic interactions), which together could result in favorable rewriting outcomes.
  • two alternating 2’-fluoro modification patterns were designed in the heterologous object sequence.
  • in design 1, the alternating pattern started from the most 5’ position of the heterologous object sequence (see, e.g., RNACS6874 as depicted in FIG. 5A), and in design 2, the alternating pattern started with the second position from the 5’ end of the heterologous object sequence (see, e.g., RNACS6785 as depicted in FIG. 5A). It was hypothesized that these chemical modification patterns would improve rewriting performance via increased nuclease stability, improved interaction with DNA and by modulating interaction with reverse transcriptase.
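The two alternating layouts can be sketched as index lists over the heterologous object sequence. The sketch assumes 0-based indexing from the 5’ end and models only the phase difference between design 1 and design 2. It does not model the anchoring at position +4 or the 2’-O-methyl placement in the PBS sequence. The function name and the sequence length in the usage note are hypothetical.

```python
def alternating_fluoro_positions(length_nt: int, design: int) -> list:
    """0-based indices (from the 5' end) that would carry a 2'-fluoro modification.

    design 1: alternation starts at the most 5' position (index 0).
    design 2: alternation starts at the second position from the 5' end (index 1).
    """
    if design not in (1, 2):
        raise ValueError("design must be 1 or 2")
    if length_nt < 0:
        raise ValueError("length_nt must be non-negative")
    start = design - 1
    return list(range(start, length_nt, 2))
```

For a hypothetical 6 nt heterologous object sequence, design 1 marks indices 0, 2 and 4 and design 2 marks indices 1, 3 and 5, so the two designs are complementary.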
  • a template RNA contained: • a gRNA spacer; • a gRNA scaffold; • a heterologous object sequence; and • a primer binding site (PBS) sequence.
  • a nicking RNA contained: • a gRNA spacer; and • a gRNA scaffold.
  • a gene modifying polypeptide contained: • an endonuclease and/or DNA binding domain; • a peptide linker; and • a reverse transcriptase (RT) domain.
  • Exemplary template RNAs evaluated in this experiment are listed in Table 14 above.
  • The exemplary gene modifying polypeptide used in this example had the following amino acid sequence (RNAIVT1044).
  • RNAIVT1044 MPAAKRVKLDGGDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETA EATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYH EKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEEN PINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKD TYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALV RQQLPEKYKEIFFD
  • RNAIVT1044 mRNA coding for the gene modifying polypeptide, a nicking guide RNA and template RNAs described above were formulated into LNPs and transfected into primary mouse hepatocytes from a transgenic hPAH model carrying the human R408W mutation knock-in in the mouse Pah gene locus.
  • the activity of the template RNAs was tested by quantifying the correction of R408W mutation (corresponding to a C>T base change) in hPAH transgenic mouse hepatocytes in vitro.
  • Hepatocytes were freshly isolated from hPAH transgenic mouse liver and plated at a density of 25,000 cells/well in a Collagen-I coated plate and left to attach at 37 °C, 5% CO2 in plating media containing 5% fetal bovine serum (FBS). At 18-20 hours post-plating, the plating media was switched out to maintenance media containing no FBS, prior to LNP treatment.
  • Different doses of LNPs were prepared in 10% mouse serum and incubated at room temperature for at least 10 minutes. 20 µL LNP-serum mixture was added to hepatocytes coated with maintenance media. The cells were allowed to incubate at 37 °C, 5% CO2 for 72 hours post-LNP treatment before lysis and genomic DNA extraction.
  • Genomic DNA samples were analyzed using Amp-Seq to determine % rewriting and % INDELs in treated hPAH mouse hepatocytes.
  • primers flanking the target insertion site locus were used to amplify across the locus in the genomic DNA of liver samples.
  • Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Conversion of a T nucleotide to a C nucleotide at hPAH locus indicated successful editing.
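The % rewriting and % INDEL readouts can be sketched as simple proportions of sequencing reads. The sketch assumes that reads have already been aligned and classified upstream as rewritten (carrying the intended T-to-C conversion), INDEL-containing, or other. The function name and the read counts in the usage note are hypothetical and are not data from this example.

```python
def editing_percentages(total_reads: int, rewritten_reads: int, indel_reads: int) -> tuple:
    """Return (% rewriting, % INDEL) from pre-classified amplicon read counts."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if rewritten_reads < 0 or indel_reads < 0:
        raise ValueError("read counts must be non-negative")
    if rewritten_reads + indel_reads > total_reads:
        raise ValueError("classified reads exceed total reads")
    pct_rewriting = 100.0 * rewritten_reads / total_reads
    pct_indel = 100.0 * indel_reads / total_reads
    return pct_rewriting, pct_indel
```

With hypothetical counts of 10,000 total reads, 4,000 rewritten reads and 150 INDEL reads, the function returns 40% rewriting and 1.5% INDELs.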
  • FIG. 5A shows graphs of percentage rewriting (left) and percentage INDEL (right) levels for each of the tested template RNAs.
  • RNACS6874 (design 1) showed the highest rewriting activity at 40% at 1 µg total LNP. Lower doses at 0.5 µg and 0.25 µg showed approximately 1.8-fold and approximately 1.9-fold higher rewriting, respectively, than the parent template RNA, RNACS4134.
  • RNACS6875 (design 2) was also able to enhance rewriting activity by approximately 1.5-fold at 0.5 µg total LNP and approximately 1.6-fold at 0.25 µg total LNP, relative to the parent template RNA, RNACS4134.
  • the INDEL% results showed that all tested template RNAs exhibited relatively low levels of INDEL generation in hPAH transgenic mouse hepatocytes, including RNACS6874.
  • RNAIVT1044 mRNA coding for the gene modifying polypeptide, a nicking guide RNA and template RNAs described above were formulated into LNPs and dosed in hPAH transgenic mice, carrying the human R408W mutation, in order to quantify the activity of the template RNAs for the correction of the R408W mutation (corresponding to a C>T base change) in vivo.
  • FIG. 5B shows graphs of percentage rewriting (left) and percentage INDEL (right) levels for each of the tested template RNAs.
  • RNACS6874 design 1
  • RNACS6875 (design 2) was also able to enhance rewriting activity approximately 1.3-fold relative to the parent template RNA, RNACS4134.
  • the INDEL% results show that all tested template RNAs exhibited low levels of INDEL generation in hPAH in mouse liver cells, including RNACS6874.
  • the results show that a combination of alternating the placement of 2’ fluoro and including 2’-O-methyl in the template RNA enhanced rewriting levels in hPAH in vivo in mice and corrected a clinically relevant mutation.
  • the dose response for hPAH template RNAs was tested in vivo in a mouse model for hPKU.
  • RNAIVT1044 mRNA coding for the gene modifying polypeptide, a nicking guide RNA and template RNAs described above were formulated into LNPs and dosed in hPAH transgenic mice, carrying the human R408W mutation, for a dose response study in order to validate the correction of R408W mutation (corresponding to a C>T base change) at varying doses in vivo.
  • Dosing was performed intravenously in 9 to 11-week-old, female hPAH mice in a 10 ml/kg bolus at doses of 0.15 mg/kg, 0.5 mg/kg and 1.5 mg/kg of total RNA payload.
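The dosing parameters above imply a dosing-solution concentration for each dose level, since concentration equals the dose (mg/kg) divided by the dose volume (ml/kg). The sketch below works this through. The function names and the mouse body weight in the usage note are assumptions for illustration.

```python
DOSE_VOLUME_ML_PER_KG = 10.0  # bolus volume stated in the text


def dosing_concentration_mg_per_ml(dose_mg_per_kg: float,
                                   volume_ml_per_kg: float = DOSE_VOLUME_ML_PER_KG) -> float:
    """RNA concentration (mg/ml) of the dosing solution for a given dose level."""
    if volume_ml_per_kg <= 0:
        raise ValueError("volume_ml_per_kg must be positive")
    return dose_mg_per_kg / volume_ml_per_kg


def injection_volume_ml(body_weight_g: float,
                        volume_ml_per_kg: float = DOSE_VOLUME_ML_PER_KG) -> float:
    """Bolus volume (ml) for an animal of the given body weight in grams."""
    return body_weight_g / 1000.0 * volume_ml_per_kg
```

The three dose levels of 0.15, 0.5 and 1.5 mg/kg correspond to dosing solutions of 0.015, 0.05 and 0.15 mg/ml, respectively, and a hypothetical 25 g mouse would receive a 0.25 ml bolus.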
  • FIG. 5C shows graphs of percentage rewriting (left) and percentage INDEL (right) levels for each of the tested template RNAs.
  • RNACS6874 (design 1) showed highest rewriting activity of 47% at the 1.5 mg/kg dose.
  • the INDEL% results showed relatively low levels of INDEL generation at doses of 0.15 mg/kg and 0.5 mg/kg in these mouse liver cells.
  • the activity of the template RNAs was tested by quantifying the correction of F263S mutation (corresponding to a T>C base change) in Pah enu2 F263S mutant mouse hepatocytes in vitro.
  • Hepatocytes were freshly isolated from Pah enu2 F263S mutant mouse liver and plated at a density of 25,000 cells/well in a Collagen-I coated plate and left to attach at 37 °C, 5% CO2 in plating media containing 5% fetal bovine serum (FBS). At 18-20 hours post-plating, the plating media was switched out to maintenance media containing no FBS, prior to LNP treatment.
  • Different doses of LNPs were prepared in 10% mouse serum and incubated at room temperature for at least 10 minutes.
  • FIG. 5D shows graphs of percentage rewriting (left) and percentage INDEL (right) levels for each of the tested template RNAs.
  • RNACS6872 (design 1) showed the highest rewriting activity of 47% at 1 µg total LNP. Lower doses at 0.5 µg and 0.25 µg each showed approximately 1.4-fold higher rewriting than the parent template RNA, RNACS1855. Rewriting activity was not enhanced for RNACS6873 (design 2) relative to the parent template RNA, RNACS1855, in Pah enu2 F263S mutant hepatocytes.
  • INDEL% results show that all tested template RNAs had relatively low levels of INDEL generation in these hepatocytes, including RNACS6872. These results showed that some combination of alternating the placement of 2’ fluoro and including 2’-O-methyl in the template RNA enhanced Pah rewriting levels in vitro in Pah enu2 F263S mutant mouse hepatocytes. In a further experiment, the murine PAH template RNAs were tested in vivo.
  • RNAIVT1044 mRNA coding for the gene modifying polypeptide, a nicking guide RNA and template RNAs described above were formulated into LNPs and dosed in Pah enu2 F263S mutant mice, carrying an F263S mutation in the mouse Pah gene locus, in order to quantify the activity of the template RNAs for the correction of F263S mutation (corresponding to a T>C base change).
  • liver samples were analyzed using Amp-Seq to determine % rewriting and % INDELs in target liver cells.
  • primers flanking the target insertion site locus were used to amplify across the locus in the genomic DNA of liver samples. Amplicons were analyzed via short read sequencing using an Illumina MiSeq. Conversion of a C nucleotide to a T nucleotide at position c.788 in the mouse Pah gene indicated successful editing.
  • FIG. 5E shows graphs of percentage rewriting (left) and percentage INDEL (right) levels for each of the tested template RNAs.
  • RNACS6872 design 1
  • RNACS6873 design 2
  • INDEL% results show that all tested template RNAs had relatively low levels of INDEL generation in these mouse liver cells, including RNACS6872.

EP24775687.7A 2023-03-21 2024-03-20 Modified template guide RNA molecules Pending EP4684023A2 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363453662P 2023-03-21 2023-03-21
PCT/US2024/020822 WO2024197103A2 (en) 2023-03-21 2024-03-20 Modified template guide rna molecules

Publications (1)

Publication Number Publication Date
EP4684023A2 true EP4684023A2 (de) 2026-01-28

Family

ID=92842443

Family Applications (1)

Application Number Title Priority Date Filing Date
EP24775687.7A Pending EP4684023A2 (de) 2023-03-21 2024-03-20 Modified template guide RNA molecules

Country Status (2)

Country Link
EP (1) EP4684023A2 (de)
WO (1) WO2024197103A2 (de)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4162044A4 (de) * 2020-06-05 2025-08-06 Flagship Pioneering Innovations Vi Llc Template guide RNA molecules
CA3221566A1 (en) * 2021-05-26 2022-12-01 Flagship Pioneering Innovations Vi, Llc Integrase compositions and methods
EP4399306A4 (de) * 2021-09-08 2026-01-21 Flagship Pioneering Innovations Vi Llc PAH-modulating compositions and methods
MX2024002928A (es) * 2021-09-08 2024-05-29 Flagship Pioneering Innovations Vi Llc Recruitment in trans of gene editing system components
JP2024533313A (ja) * 2021-09-08 2024-09-12 Flagship Pioneering Innovations Vi, LLC Compositions and methods for modulating HBB

Also Published As

Publication number Publication date
WO2024197103A3 (en) 2024-11-14
WO2024197103A2 (en) 2024-09-26

Similar Documents

Publication Publication Date Title
US12031162B2 (en) Methods and compositions for modulating a genome
EP4399309A2 (de) SERPINA-modulating compositions and methods
US12270029B2 (en) CFTR-modulating compositions and methods
WO2023039440A9 (en) Hbb-modulating compositions and methods
WO2023250492A2 (en) Fah-modulating compositions and methods
WO2024086586A2 (en) Improved gene editing systems utilizing trans recruiting components
US20250305002A1 (en) Recruitment in trans of gene editing system components
AU2024235161A1 (en) Serpina-modulating compositions and methods
WO2024197103A2 (en) Modified template guide rna molecules
WO2025194138A1 (en) St1cas9 compositions and methods for modulating a genome
WO2026060328A2 (en) Serpina-modulating systems and methods
WO2026039752A1 (en) Cftr-modulating compositions and methods
WO2025235506A2 (en) Atp7b-modulating compositions and methods
WO2024192270A2 (en) Pah-modulating systems and methods
WO2025194124A1 (en) Modified st1cas9 guide nucleic acids
WO2023225471A2 (en) Helitron compositions and methods

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20251013

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR