WO2023225670A2 - Ex vivo programmable gene insertion - Google Patents

Ex vivo programmable gene insertion

Info

Publication number
WO2023225670A2
Authority
WO
WIPO (PCT)
Prior art keywords
integration
recognition site
primary cell
site
atgrna
Prior art date
Application number
PCT/US2023/067265
Other languages
French (fr)
Other versions
WO2023225670A3 (en)
Inventor
Chong Luo
Patrick Mendes TAVARES
Jonathan Douglas FINN
Original Assignee
Tome Biosciences, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tome Biosciences, Inc.
Publication of WO2023225670A2
Publication of WO2023225670A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/87 Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90 Stable introduction of foreign DNA into chromosome
    • C12N15/902 Stable introduction of foreign DNA into chromosome using homologous recombination
    • C12N15/907 Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N5/00 Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
    • C12N5/06 Animal cells or tissues; Human cells or tissues
    • C12N5/0602 Vertebrate cells
    • C12N5/0634 Cells from the blood or the immune system
    • C12N5/0636 T lymphocytes
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/005 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/80 Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/10 Type of nucleic acid
    • C12N2310/20 Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPRs]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2310/00 Structure or type of the nucleic acid
    • C12N2310/30 Chemical structure
    • C12N2310/35 Nature of the modification
    • C12N2310/351 Conjugate
    • C12N2310/3519 Fusion with another nucleic acid
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2510/00 Genetically modified cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241 Nucleotidyltransferases (2.7.7)
    • C12N9/1276 RNA-directed DNA polymerase (2.7.7.49), i.e. reverse transcriptase or telomerase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses

Definitions

  • an integration target recognition site (i.e., an attB or attP site)
  • a gene editor protein or polynucleotide encoding a gene editor protein wherein the gene editor protein is capable of incorporating the integration target recognition site into the genome
  • one or more guide RNA wherein the one or more guide RNA encodes an integrase target recognition site (i.e., attachment site-containing guide RNA, atgRNA), and (iii) optionally, a nicking guide RNA (ngRNA).
  • ngRNA: nicking guide RNA
  • integrating a donor polynucleotide template into the human primary cell genome at the incorporated target recognition site by delivering into the cell: (i) the donor polynucleotide template, wherein the donor polynucleotide template is comprised of an integration target site, wherein the donor polynucleotide is integrated into the human primary cell at the incorporated genomic integration target recognition site by an integrase, thereby producing a genetically modified human primary cell.
  • the present disclosure provides nucleic acid compositions, methods, and an overall platform for site-specific genetic engineering using Programmable Addition via Site-Specific Targeting Elements (PASTE) (see Ioannidi et al.; doi: 10.1101/2021.11.01.466786; the entirety of which is incorporated herein by reference), transposon-mediated gene editing, or other suitable gene editing, or gene incorporation technology.
  • PASTE: Programmable Addition via Site-Specific Targeting Elements
  • this disclosure features a method of generating a primary cell comprising an integration recognition site, the method comprising: (a) site-specifically incorporating at least a first integration recognition site into a target sequence in the genome of the primary cell.
  • site-specifically incorporating the at least first integration recognition site is effected by introducing into the primary cell: (i) a gene editor polypeptide or a polynucleotide encoding a gene editor polypeptide, wherein the gene editor polypeptide comprises a DNA binding nickase domain linked to a reverse transcriptase domain and is capable of incorporating the integration recognition site into the genome of the primary cell at the target sequence; (ii) at least a first pair of guide RNAs, wherein the first paired guide RNAs have domains that are capable of guiding the gene editor polypeptide to nick the primary cell genome at sites respectively flanking the specific incorporation site, and at least one of the two paired guide RNAs is an atgRNA that
  • the at least first pair of guide RNAs comprise: (i) the first of the two paired guide RNAs is a first atgRNA that further includes a first RT template that comprises at least a portion of the first integration recognition site; and (ii) a second of the two paired guide RNAs is a second atgRNA that further includes a second RT template that comprises at least a portion of the first integration recognition site, wherein the first atgRNA and the second atgRNAs collectively encode the entirety of the first integration recognition site.
  • the at least first pair of guide RNAs comprise: (i) the first of the two paired guide RNAs is an atgRNA that further includes an RT template that comprises at least a portion of the first integration recognition site, wherein the atgRNA encodes the entirety of the first integration recognition site; and (ii) a second of the two paired guide RNAs is a nicking gRNA.
  • the method further comprises incorporating a plurality of integration recognition sites.
  • the method further comprises: (b) integrating at least a first donor polynucleotide template into the primary cell genome at the first incorporated integration recognition site, by introducing into the cell: (i) the first donor polynucleotide template, wherein the first donor polynucleotide template is comprised of one or more orthogonal/cognate integration recognition sites, and (ii) an integrase, whereby the donor polynucleotide is integrated into the primary cell at the at least one incorporated genomic integration recognition sites by the integrase; thereby producing a primary cell with a site-specifically integrated donor polynucleotide template.
  • steps (a) and (b) are performed concurrently.
  • step (a) is performed prior to step (b).
  • step (a) and step (b) are performed at least 1 day, at least 2 days, at least 3 days, at least 4 days, at least 5 days, at least 6 days, at least 7 days, at least 2 weeks, at least 3 weeks, at least 4 weeks, at least 5 weeks, at least 6 weeks, at least 7 weeks, or at least 8 weeks apart.
  • this disclosure features a method of generating an edited primary cell, the method comprising: integrating, into the genome of the primary cell at the first incorporated recognition site, at least a first donor polynucleotide template, by introducing into the cell: (i) the first donor polynucleotide template, wherein the first donor polynucleotide template is comprised of one or more orthogonal/cognate integration recognition sites, and (ii) an integrase, whereby the donor polynucleotide is integrated into the primary cell at the at least one incorporated genomic integration recognition sites by the integrase; thereby producing a primary cell with a site-specifically integrated donor polynucleotide template.
  • this disclosure features a method of site specifically integrating a donor polynucleotide template into a primary cell genome, the method comprising: incorporating an integration recognition site into the primary cell genome by delivering into the cell: a gene editor polypeptide or a polynucleotide encoding a gene editor polypeptide, wherein the gene editor polypeptide comprises a DNA binding nickase domain linked to a reverse transcriptase domain and is capable of incorporating the integration recognition site into the genome of the primary cell at a target sequence; an at least a first attachment site-containing guide RNA (atgRNA), wherein the one or more guide RNA encodes all or a portion of an at least first integration recognition site; and integrating a donor polynucleotide template into the human primary cell genome at the incorporated recognition site by delivering into the cell: the donor polynucleotide template, an integration enzyme or a polynucleotide encoding an integration enzyme, wherein the donor polynucle
  • the gene editor polypeptide or the polynucleotide encoding a gene editor polypeptide, the at least first atgRNA, the donor polynucleotide template, and the integration enzyme or the polynucleotide encoding an integration enzyme are concurrently delivered.
  • the gene editor protein or the polynucleotide encoding a gene editor protein, and the at least first atgRNA are delivered at a first time point and the donor polynucleotide template, and the integration enzyme or the polynucleotide encoding an integration enzyme are delivered at a second time point.
  • the first time point and the second time point are at least 1 day, at least 2 days, at least 3 days, at least 4 days, at least 5 days, at least 6 days, at least 7 days, at least 2 weeks, at least 3 weeks, at least 4 weeks, at least 5 weeks, at least 6 weeks, at least 7 weeks, or at least 8 weeks apart.
  • the gene editor polypeptide is configured such that the nickase is linked to the reverse transcriptase.
  • the nickase is linked to the reverse transcriptase by in-frame fusion.
  • the nickase is linked to the reverse transcriptase by a linker.
  • the linker is a peptide fused in-frame between the nickase and reverse transcriptase.
  • the gene editor polynucleotide further comprises a polynucleotide sequence encoding at least an integration enzyme or the gene editor polypeptide further comprises an integration enzyme.
  • the linked nickase-reverse transcriptase are further linked to the integration enzyme.
  • an at least first atgRNA comprises: (i) a domain that is capable of guiding the prime editor system to the target sequence; and (ii) a reverse transcriptase (RT) template that comprises at least a portion of an at least first integration recognition site.
  • the RT template comprises the entirety of the first integration recognition site.
  • site-specifically incorporating an integration recognition site into the primary cell genome further comprises delivering into the cell a second atgRNA.
  • the first atgRNA and the second atgRNA are an at least first pair of atgRNAs, wherein the at least first pair of atgRNAs have domains that are capable of guiding the prime editor system to a target sequence; the first atgRNA further includes a first RT template that comprises at least a portion of an at least first integration recognition site; the second atgRNA further includes a second RT template that comprises at least a portion of the first integration recognition site, and the first atgRNA and the second atgRNAs collectively encode the entirety of the first integration recognition site.
  • the first RT template encodes a first single-stranded DNA sequence and the second RT template encodes a second single-stranded DNA sequence.
  • the first single-stranded DNA sequence comprises a complementary region with the second single-stranded DNA sequence.
  • the first single-stranded DNA sequence and the second single-stranded DNA sequence form a duplex.
  • the complementary region is 5 or more consecutive bases. In some embodiments, the complementary region is 10 or more consecutive bases. In some embodiments, the complementary region is 20 or more consecutive bases. In some embodiments, the complementary region is 30 or more consecutive bases.
  • incorporating an integration recognition site into the human primary cell genome further comprises delivering into the cell a nicking guide RNA (ngRNA).
  • the first integration recognition site is an attB or attP site.
  • the integration recognition site is a modified attB or attP site.
  • the integration recognition site is specific for BxB1 or a modified BxB1.
  • the integration recognition site is comprised of 38 or 46 nucleotides.
  • the donor polynucleotide template is a minicircle.
  • the primary cell is a human primary cell.
  • the primary cell is a T cell.
  • the T cell is a CD4+ T cell, a CD8+ T cell, or a combination thereof.
  • the T cell is an autologous T cell or allogeneic T cell.
  • the target sequence is located in the TRAC locus, the CXCR4 locus, or the IL2RB locus, whereby the integration recognition site is incorporated into the TRAC locus, the CXCR4 locus, or the IL2RB locus.
  • the introducing is performed by electroporation.
  • the introducing is achieved using one or more of a recombinant adenovirus, helper dependent adenovirus, AAV, lentivirus, HSV, anellovirus, retrovirus, Doggybone DNA, minicircle, plasmid, miniDNA, nanoplasmid, exosome, fusosome, mRNA, RNP, or lipid nanoparticle (LNP), or a combination thereof.
  • the donor polynucleotide template encodes for a chimeric antigen receptor or a T cell receptor.
  • the primary cell is activated ex vivo or in vitro prior to step (a).
  • the edited primary cell is expanded to generate an expanded edited primary cell composition.
  • the expanded modified human primary cell composition is isolated.
  • this disclosure features a primary cell generated by any of the methods described herein.
  • this disclosure features a population of primary cells generated by any of the methods described herein.
  • this disclosure features a method of treating a blood disease or disorder in a subject in need thereof, the method comprising: administering to the subject the primary cell or the population of cells of any one of the preceding embodiments.
  • this disclosure features a primary cell, comprising: an at least first integration recognition site site-specifically incorporated into the primary cell genome.
  • the at least first integration recognition site is incorporated into the primary cell genome at a TRAC locus, a CXCR4 locus, or an IL2RB locus.
  • the at least first integration recognition site is specific for a serine integrase.
  • the at least first integration recognition site is an attB or attP site.
  • the at least first integration recognition site is a modified attB or attP site.
  • the at least first integration recognition site is specific for BxB1 or a modified BxB1.
  • the at least first integration recognition site is comprised of 38 or 46 nucleotides.
  • this disclosure features a primary cell comprising: a donor polynucleotide template integrated into the TRAC locus, CXCR4 locus, or the IL2RB locus, wherein the donor polynucleotide comprises a residual integration recognition site (e.g., an AttR and/or an AttL site).
  • the at least first integration recognition site is incorporated into the primary cell genome at a TRAC locus, a CXCR4 locus, or an IL2RB locus.
  • the donor polynucleotide template is a minicircle.
  • the donor polynucleotide template encodes a chimeric antigen receptor or a T cell receptor.
  • this disclosure features a system for site-specifically integrating a donor polynucleotide template into a primary cell genome, the system comprising: a first attachment site-containing guide RNA comprising a sequence selected from Tables 12 or 14; a second attachment site-containing guide RNA comprising a sequence selected from Tables 12 or 14, wherein the second atgRNA comprises a different spacer sequence from the first atgRNA; and a gene editor polypeptide or a polynucleotide encoding a gene editor polypeptide, wherein the gene editor polypeptide comprises a DNA binding nickase domain linked to a reverse transcriptase domain and is capable of incorporating the integration recognition site into the genome of the primary cell at a target sequence.
  • the system further comprises: a donor polynucleotide template; and an integration enzyme or a polynucleotide encoding an integration enzyme, wherein the donor polynucleotide template is linked to a sequence that is an integration cognate of the integration recognition site present in the atgRNA, and wherein the donor polynucleotide template is integrated into the genome at the incorporated genomic integration recognition site by the integration enzyme.
  • this disclosure features an attachment site-containing guide RNA (atgRNA) comprising: a sequence selected from Tables 12 or 14.
  • this disclosure features a pair of attachment site-containing guide RNAs (atgRNAs) comprising: a first attachment site-containing guide RNA comprising a sequence selected from Tables 12 or 14; and a second attachment site-containing guide RNA comprising a sequence selected from Tables 12 or 14, wherein the second atgRNA comprises a different spacer sequence from the first atgRNA.
  • atgRNA: attachment site-containing guide RNA
  • FIG.1 illustrates reference CXCR4 and IL2RB prime editing guide RNAs (pegRNAs) to be modified into attachment site-containing guide RNAs (atgRNAs) via integrase recognition site incorporation into the reverse transcriptase template segment of the guide RNA.
  • FIG.2 illustrates modification of pegRNAs into atgRNAs capable of beacon placement into a target genome site. Exemplary 38 or 46 base attB insertion is shown.
  • FIG. 3 illustrates beacon placement within T cells at a high dose and low dose of reagent conditions, wherein reagents are delivered via electroporation (on Day 2 of experimental workflow).
  • FIGs. 4A-4C illustrate beacon placement efficiency within T cells at a high dose and low dose of reagent conditions, wherein reagents are delivered via electroporation (on Day 2 of experimental workflow) (see Example 1).
  • FIG.4A shows prime editing efficiency of the reference pegRNA as described in FIG. 1.
  • FIG. 4B shows percent beacon (integration recognition site) placement using low and high dose CXCR4 atgRNA comprising either a 38bp or a 46bp AttB integration recognition site.
  • FIG. 4C shows percent beacon (integration recognition site) placement using low and high dose IL2RB atgRNA comprising either a 38bp or a 46bp AttB integration recognition site.
  • FIG. 5 illustrates single atgRNA (with a paired nicking guide) optimization screen conditions for T cell beacon placement at the CXCR4 and IL2RB loci, respectively.
  • FIG.6A illustrates two-color beacon placement quantification (top).
  • FIGs. 6B-6C show percent beacon placement efficiency for the CXCR4 and IL2RB loci, respectively, as detected by the assay described in FIG. 6A.
  • FIG. 6B shows percent beacon placement efficiency for CXCR4.
  • FIG. 6C shows percent beacon placement efficiency for IL2RB.
  • FIGs. 7A-7B illustrate beacon placement efficiency for the CXCR4 and IL2RB loci determined by amplicon sequencing (amp-seq) or digital droplet PCR (ddPCR).
  • FIG.7A shows percent beacon placement for CXCR4 as detected by amp-seq and ddPCR.
  • FIG.7B shows percent beacon placement for IL2RB as detected by amp-seq and ddPCR.
  • FIG.8 illustrates dual atgRNA method for genomic beacon placement.
  • FIG. 9 illustrates the dual atgRNA design, which includes the attB (beacon) site length and a 20 base pair overlap between the synthesized DNA strands (encoded within a segment of the atgRNA reverse transcription template).
  • FIG. 10 illustrates experimental workflow for a dual atgRNA mediated beacon placement. Top panel indicates the workflow timeline. Bottom panel indicates electroporation conditions.
  • FIGs. 11A-11B illustrate dual guide mediated beacon placement efficiency at the CXCR4 (FIG. 11A) and IL2RB (FIG. 11B) loci, respectively, as determined by amplicon sequencing (amp-seq) or digital droplet PCR (ddPCR).
  • FIG.12A illustrates a non-limiting exemplary chimeric antigen receptor (CAR) or T cell receptor (TCR) knock in (insertion) at the TRAC locus (FIG.12A) (see Eyquem et al, Nature, 2017; Roth et al.
  • CAR: chimeric antigen receptor
  • TCR: T cell receptor
  • FIG.12B shows guide RNAs (“MS”, “AM”, “TF”) and nicking guide RNAs targeting at or near TRAC exon 1 region.
  • FIG.13 illustrates modification of single guide RNAs (sgRNAs) identified in FIG.12 into single atgRNA (with nicking guide) or dual atgRNAs capable of beacon placement into the T cell TRAC locus. Exemplary 38 or 46 base attB insertion is shown.
  • FIGs. 14A-14B illustrate TCR protein disruption with TRAC specific guides (“MS”, “AM”, “TF”) complexed to a SpCas9 (expressed from mRNA or a protein form).
  • FIG.14A shows results using guides “gAM” and “gTF.”
  • FIG. 14A left panel shows percent live CD3+ T cells following introduction of the guides in a complex (e.g., RNP) with a Cas9 protein.
  • FIG.14A right panel shows percent INDELs following introduction of the guides in a complex (e.g., RNP) with a Cas9 protein.
  • FIG.14B shows results of testing of various forms of Cas9 in combination with guides “gMS”, “gAM”, or “gTF”.
  • FIG. 14B top panel shows flow cytometry plots of CD3+ T cells following transduction with a control “EP No Substrate” or a gAM guide/Cas9 RNP.
  • FIG. 14B bottom panel shows a summary of the live CD3+ T cells following transduction with the indicated conditions.
  • FIG. 15 illustrates single atgRNA (and associated ngRNA) and dual atgRNA experimental screen conditions for TRAC locus beacon placement. Top tables show conditions for electroporations. Bottom panel shows experimental workflow.
  • FIGs. 16A-16C illustrate single atgRNA mediated (with nicking guide RNA) beacon placement efficiency at the TRAC locus using AM, MS, and TF spacers identified in FIG. 12 and tested in FIG. 14.
  • FIG. 16A shows percent beacon placement at the TRAC locus when using an atgRNA having an AM spacer.
  • FIG. 16B shows percent beacon placement at the TRAC locus when using an atgRNA having an MS spacer.
  • FIG.16C shows percent beacon placement at the TRAC locus when using an atgRNA having a TF spacer.
  • FIGs. 17A-17C illustrate dual atgRNA mediated beacon placement efficiency at the TRAC locus.
  • FIG. 17A shows percent beacon placement at the TRAC locus when using a first atgRNA having an AM spacer and a second atgRNA having a spacer from the ng4 nicking guide RNA.
  • FIG. 17B shows percent beacon placement at the TRAC locus when using a first atgRNA having an MS spacer and a second atgRNA having a spacer from the ng4 nicking guide RNA.
  • FIG. 17C shows percent beacon placement at the TRAC locus when using a first atgRNA having a TF spacer and a second atgRNA having a spacer from the ng12 nicking guide RNA.
  • FIG. 18 compares percent beacon placement for dual atgRNA-mediated and single atgRNA-mediated beacon placement across 3 genomic targets (CXCR4, IL2RB, and TRAC) and the five spacer pairs (CXCR4, IL2RB, TRAC-AM, TRAC-MS, and TRAC-TF).
  • FIGs. 19A-19E show analysis of AttP variants.
  • FIG. 19A shows a non-limiting schematic of AttP mutations tested for improving integration efficiency (SEQ ID NOS: 394 and 540-542, respectively, in order of appearance).
  • FIG. 19B shows integration efficiencies of wildtype and mutant AttP sites across a panel of AttB lengths.
  • FIG. 19C shows a non-limiting schematic of multiplexed integration of different cargo sets at specific genomic loci. Three fluorescent cargos (GFP, mCherry, and YFP) are inserted orthogonally at three different loci (ACTB, LMNB1, NOLC1) for in-frame gene tagging.
  • FIG. 19D shows orthogonality of top 4 AttB/AttP dinucleotide pairs evaluated for GFP integration with PASTE at the ACTB locus.
  • FIGs. 20A-20E show the experimental details used to test programmable gene insertion (PGI) in primary T cells as tested in Example 6.
  • FIG. 20A shows a schematic of the mRNA construct encoding BxB1 used for the Experiments in Example 6, which has a sequence of SEQ ID NO: 600.
  • FIG. 20B shows a schematic of the experimental workflow used in Example 6.
  • FIG. 20C shows a schematic of the experimental workflow in terms of days.
  • Day 0: T cells were thawed.
  • Day 2: T cells were transfected with a gene editor polynucleotide and dual attachment site-containing guide RNAs (atgRNAs).
  • Day 6: T cells were transfected with a polynucleotide encoding an integrase and a donor polynucleotide template.
  • FIG. 20D shows a table providing further details of the electroporation conditions tested in Example 6.
  • FIG. 20E shows a histogram of percent (%) viable cells following the electroporation at day 2 (“Post EP #1”) and following the second electroporation at day 6 (“Post EP #2”).
  • FIG. 21 shows a histogram of percent (%) beacon placement (BP) for the indicated conditions tested in Example 6.
  • FIG.22 shows a histogram of percent (%) programmable gene insertion (PGI) for the indicated conditions tested in Example 6.
  • FIG. 23 shows a histogram of percent (%) beacon conversion for the indicated conditions tested in Example 6.
  • BxB1 #1 comprises a sequence of SEQ ID NO: 600.
  • BxB1 #2 comprises a sequence of SEQ ID NO: 601.
  • BxB1 #3 comprises a sequence of SEQ ID NO: 602.
  • FIGs. 25A-25B show the experimental details used to test programmable gene insertion (PGI) in primary T cells as tested in Example 7.
  • FIG. 25A shows a schematic of the experimental workflow in terms of days.
  • Day 0: T cells were thawed.
  • Day 2: T cells were transfected with a gene editor polynucleotide and at least a first attachment site-containing guide RNA (atgRNA).
  • Day 6: T cells were transfected with a polynucleotide encoding an integrase.
  • Day 12: cells were harvested for assessment of PGI.
  • FIG. 25B shows a table providing further details of the transfection conditions tested in Example 7.
  • FIG. 26 shows a histogram of percent (%) viability for the indicated conditions (see Example 7).
  • Donor polynucleotide template was provided in the form of a minicircle at 3 μg or 6 μg for No BxB1, BxB1 #1, BxB1 #2, BxB1 #3 and EH-115 BxB1 #1.
  • FIG.27 shows a histogram of percent (%) beacon placement for the indicated conditions (see Example 7).
  • Donor polynucleotide template was provided in the form of a minicircle at 3 μg or 6 μg for No BxB1, BxB1 #1, BxB1 #2, BxB1 #3 and EH-115 BxB1 #1.
  • FIG. 28 shows a histogram of percent (%) beacon conversion for the indicated conditions (see Example 7).
  • Each condition on the x-axis was tested with DNA template provided as a minicircle at 3 μg or 6 μg.
  • Donor polynucleotide template was provided in the form of a minicircle at 3 μg or 6 μg for No BxB1, BxB1 #1, BxB1 #2, BxB1 #3 and EH-115 BxB1 #1. Conditions on x-axis: “BP only” – Beacon placement (cells were electroporated once with program EO-115 with nCas9-RT mRNA and guide RNAs); NS – No substrate in the electroporation reaction; “BP + NS” - Cells were electroporated with program EO-115 for beacon placement (BP), then electroporated a second time with program EO-115 but no substrate (DNA cargo or BxB1 mRNA); “No BxB1
  • FIG. 29 shows a histogram of percent (%) programmable gene insertion (PGI) for the indicated conditions (see Example 7).
  • Conditions on x-axis: “BP only” – Beacon placement (cells were electroporated once with program EO-115 with nCas9-RT mRNA and guide RNAs); “NS” – No substrate in the electroporation reaction; “BP + NS” – Cells were electroporated with program EO-115 for beacon placement (BP), then electroporated a second time with program EO-115 but no substrate (DNA cargo or BxB1 mRNA); “No BxB1 group” – Cells were electroporated with program EO-115 for beacon placement (BP), then electroporated a second time with program EO-115 with DNA cargo (but no BxB1 mRNA); and “EH-115” – An electroporation program for the Lonza 4D Nucleofector.
  • FIGs. 30A-30D show the experimental details used to test programmable gene insertion (PGI) in primary T cells as tested in Example 8.
  • FIG. 30A shows a schematic of the experimental workflow used in Example 8.
  • FIG. 30B shows a schematic of the mRNA construct encoding BxB1 used for the experiments in Example 8, which has a sequence of SEQ ID NO: 600.
  • FIG. 30C shows a schematic of the experimental workflow in terms of days. Day 0: T cells were thawed.
  • FIG. 30D shows a table providing further details of the electroporation conditions tested in Example 8.
  • FIG. 31 shows a histogram of percent (%) beacon placement (BP) for the indicated conditions tested in Example 8.
  • FIG. 32 shows a histogram of percent (%) programmable gene insertion (PGI) for the indicated conditions tested in Example 8.
  • This disclosure features systems, compositions, and methods for generating a primary cell comprising an integration recognition site site-specifically integrated into the genome of the primary cell.
  • this disclosure features a method of generating a primary cell comprising an integration recognition site, where the method includes (a) site-specifically incorporating at least a first integration recognition site into a target sequence in the genome of the primary cell.
  • site-specifically incorporating the at least first integration recognition site is effected by introducing into the primary cell: (i) a gene editor polypeptide or a polynucleotide encoding a gene editor polypeptide, wherein the gene editor polypeptide comprises a DNA binding nickase domain linked to a reverse transcriptase domain and is capable of incorporating the integration recognition site into the genome of the primary cell at the target sequence; (ii) at least a first pair of guide RNAs, wherein the first paired guide RNAs have domains that are capable of guiding the gene editor protein or prime editor fusion protein to nick the primary cell genome at sites respectively flanking the specific incorporation site, and at least one of the two paired guide RNAs is an atgRNA that further includes a reverse transcriptase (RT) template that comprises at least a portion of the first integration recognition site, and the atgRNAs collectively encode the entirety of the first integration recognition site, whereby the first integration recognition site is site-specifically incorporated into the RT
  • this disclosure features systems, compositions, and methods for generating a primary cell comprising an exogenous nucleic acid (e.g., a donor polynucleotide template) site-specifically integrated into the genome of the primary cell.
  • this disclosure features a method of incorporating an integration recognition site into the human primary cell genome by delivering into the cell: a gene editor polypeptide or a polynucleotide encoding a gene editor polypeptide, wherein the gene editor polypeptide comprises a DNA binding nickase domain linked to a reverse transcriptase domain and is capable of incorporating the integration recognition site into the genome of the primary cell at a target sequence; at least a first attachment site-containing guide RNA (atgRNA), wherein the one or more guide RNAs encode all or a portion of at least a first integration recognition site; and integrating a donor polynucleotide template into the human primary cell genome at the incorporated recognition site by delivering into the cell: the donor polynu
  • A gene editor is a protein that can be used to perform gene editing, gene modification, gene insertion, gene deletion, or gene inversion.
  • gene editor polynucleotide refers to a polynucleotide sequence encoding the gene editor polypeptide.
  • Such an enzyme or enzyme fusion may contain a DNA- or RNA-targetable nuclease protein (i.e., Cas protein, ADAR, or ADAT), wherein target specificity is mediated by a complexed nucleic acid (i.e., guide RNA).
  • Such an enzyme or enzyme fusion may be a DNA/RNA targetable protein, wherein target specificity is mediated by internal, conjugated, fused, or linked amino acids, such as within TALENs, ZFNs, or meganucleases.
  • a gene editor comprising a targetable protein may be fused or linked to one or more proteins or protein fragment motifs. Gene editors may be fused, linked, complexed, operate in cis or trans to one or more integrase, recombinase, polymerase, telomerase, reverse transcriptase, or invertase.
  • a gene editor can be a prime editor fusion protein or a gene writer fusion protein.
  • Prime editor fusion protein describes a protein that is used in prime editing.
  • Prime editor system as used herein describes the components used in prime editing.
  • Prime editing uses a CRISPR enzyme that nicks or cuts only a single strand of double-stranded DNA, i.e., a nickase; the nickase can occur either naturally or by mutation or modification of a nuclease that makes double-stranded cuts.
  • Such an enzyme can be a catalytically-impaired Cas9 endonuclease (a nickase).
  • Such an enzyme can be a Cas12a/b, MAD7, or variant thereof.
  • the nickase is fused to an engineered reverse transcriptase (RT).
  • the nickase is programmed (directed) with an attachment site-containing guide RNA (or a prime-editing guide RNA (pegRNA)).
  • the atgRNA both specifies the target site and encodes the desired edit. Described herein are attachment site-containing guide RNAs (atgRNAs) that both specify the target and encode the desired integrase target recognition site.
  • the nickase may be programmed (directed) with an atgRNA.
  • the nickase is a catalytically-impaired Cas9 endonuclease, a Cas9 nickase, that is fused to the reverse transcriptase.
  • the Cas9 nickase part of the protein is guided to the DNA target site by the atgRNA (pegRNA), whereby a nick or single stranded cut occurs.
  • the reverse transcriptase domain then uses the atgRNA to template reverse transcription of the desired edit, directly polymerizing DNA onto the nicked target DNA strand.
  • the edited DNA strand replaces the original DNA strand, creating a heteroduplex containing one edited strand and one unedited strand.
  • the prime editor (PE) guides resolution of the heteroduplex to favor copying the edit onto the unedited strand, completing the process (typically achieved with a nickase gRNA).
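The nick-then-reverse-transcribe steps described above can be sketched with plain strings. This is a deliberately simplified toy model, not the disclosed method: the sequences are hypothetical, the primer binding site, flap equilibration and heteroduplex resolution are ignored, and the names `rt_flap_from_template` and `install_edit` are illustrative. It assumes the RT template is written 5'→3' as RNA and that the newly synthesized DNA flap is its reverse complement.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna):
    """Reverse complement of a DNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def rt_flap_from_template(rt_template_rna):
    """DNA flap polymerized onto the nicked strand from an RNA template."""
    return reverse_complement(rt_template_rna.upper().replace("U", "T"))


def install_edit(strand, nick, rt_template_rna, homology_len):
    """Replace the sequence downstream of the nick with the templated flap.

    strand       -- nicked (edited) strand, 5'->3'
    nick         -- index at which the strand is nicked
    homology_len -- number of 3' flap bases that must match the target
                    immediately downstream of the nick
    """
    flap = rt_flap_from_template(rt_template_rna)
    homology = flap[len(flap) - homology_len:]
    if strand[nick:nick + homology_len] != homology:
        raise ValueError("flap 3' end does not match the target downstream of the nick")
    return strand[:nick] + flap + strand[nick + homology_len:]


# Hypothetical target and template: inserts GGCC at the nick.
edited = install_edit("ACGTACGTTTGACCA", 8, "UCAAGGCC", 4)
print(edited)  # ACGTACGTGGCCTTGACCA
```

In this model only one strand is rewritten; the unedited complementary strand, and the heteroduplex it forms with the edited strand, are left out on purpose.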
  • Other enzymes that can be used to nick or cut only a single strand of double-stranded DNA include a cleavase (e.g., cleavase I enzyme).
  • an additional agent or agents may be added that improve the efficiency and outcome purity of the prime edit.
  • the agent may be chemical or biological and disrupt DNA mismatch repair (MMR) processes at or near the edit site (i.e., PE4 and PE5 and PEmax architecture by Chen et al. Cell, 184, 1-18, October 28, 2021; Chen et al. is incorporated herein by reference).
  • the agent is a MMR-inhibiting protein.
  • the MMR-inhibiting protein is a dominant negative MMR protein.
  • the dominant negative MMR protein is MLH1dn.
  • the MMR-inhibiting agent is incorporated into the single nucleic acid construct design described herein.
  • the MMR-inhibiting agent is linked or fused to the prime editor protein fusion, which may or may not have a linked or fused integrase.
  • the MMR-inhibiting agent is linked or fused to the Gene Writer™ protein, which may or may not have a linked or fused integrase.
  • the prime editor or gene editor system can be used to achieve DNA deletion and replacement.
  • the DNA deletion and replacement is induced using a pair of atgRNAs (or pegRNAs) that target opposite DNA strands, programming not only the sites that are nicked but also the outcome of the repair (i.e., PrimeDel by Choi et al. Nat. Biotechnology, October 14, 2021; Choi et al. is incorporated herein by reference and TwinPE by Anzalone et al. BioRxiv, November 2, 2021; Anzalone et al. is incorporated herein by reference).
  • the DNA deletion is induced using a single atgRNA.
  • the DNA deletion and replacement is induced using a wild type Cas9 prime editor (PE-Cas9) system (i.e., PEDAR by Jiang et al. Nat. Biotechnology, October 14, 2021; Jiang et al. is incorporated herein by reference).
  • the DNA replacement is an integrase target recognition site or recombinase target recognition site.
  • the constructs and methods described herein may be utilized to incorporate the pair of pegRNAs used in PrimeDel, TwinPE (WO2021226558, which is hereby incorporated by reference in its entirety), or PEDAR, the prime editor fusion protein or Gene Writer protein, optionally a nickase guide RNA (ngRNA), an integrase, a nucleic acid cargo, and optionally a recombinase into a single nucleic acid construct described herein.
  • the integrase may be directly linked, for example by a peptide linker, to the prime editor fusion or gene writer protein.
  • the prime editors can refer to a retrovirus or lentivirus reverse transcriptase such as a Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase (RT) fused to a CRISPR enzyme nickase such as a Cas9 H840A nickase, i.e., a Cas9 nickase.
  • the prime editors can refer to a retrovirus or lentivirus reverse transcriptase such as a Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase (RT) fused to a cleavase.
  • the RT can be fused at, near or to the C-terminus of a Cas9 nickase, e.g., Cas9 H840A. Fusing the RT to the C-terminus region, e.g., to the C-terminus, of the Cas9 nickase may result in higher editing efficiency.
  • Such a complex is called PE1.
  • instead of being a Cas9(H840A), i.e., a Cas9 nickase, the CRISPR enzyme nickase can be a CRISPR enzyme that naturally is a nickase or cuts a single strand of double-stranded DNA; for instance, the CRISPR enzyme nickase can be Cas12a/b. Alternatively, the CRISPR enzyme nickase can be another mutation of Cas9, such as Cas9(D10A).
  • a CRISPR enzyme such as a CRISPR enzyme nickase, such as Cas9 (wild type), Cas9(H840A), Cas9(Dl0A) or Cas 12a/b nickase can be fused in some embodiments to a pentamutant of M-MLV RT (D200N/ L603W/ T330P/ T306K/ W313F), whereby there can be up to about 45-fold higher efficiency, and this is called PE2.
  • a CRISPR enzyme nickase such as Cas9 (wild type), Cas9(H840A), Cas9(Dl0A) or Cas 12a/b nickase
  • a pentamutant of M-MLV RT D200N/ L603W/ T330P/ T306K/ W313F
  • the M-MLV RT comprises one or more of the mutations Y8H, P51L, S56A, S67R, E69K, V129P, L139P, T197A, H204R, V223H, T246E, N249D, E286R, Q291I, E302K, E302R, F309N, M320L, P330E, L435G, L435R, N454K, D524A, D524G, D524N, E562Q, D583N, H594Q, E607K, D653N, and L671P. Specific M-MLV RT mutations are shown in Table 1.
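Substitution labels such as D200N follow the pattern wild-type residue, 1-based position, replacement residue. The sketch below parses such labels and applies them to a sequence after checking that the stated wild-type residue is actually present. It is illustrative only: the function names are hypothetical and the four-residue "protein" in the usage line is a toy string, not M-MLV RT.

```python
import re

# One uppercase residue, a 1-based position, one uppercase residue (e.g. D200N).
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# The pentamutant combination named for PE2 in the text above.
PE2_PENTAMUTANT = ["D200N", "L603W", "T330P", "T306K", "W313F"]


def parse_mutation(label):
    """Split a label such as 'D200N' into ('D', 200, 'N')."""
    match = MUTATION_RE.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutations(protein, labels):
    """Apply substitutions, verifying each wild-type residue first."""
    residues = list(protein)
    for label in labels:
        wild_type, position, replacement = parse_mutation(label)
        if position > len(residues) or residues[position - 1] != wild_type:
            raise ValueError(f"{label}: reference does not have {wild_type} at {position}")
        residues[position - 1] = replacement
    return "".join(residues)


print([parse_mutation(m) for m in PE2_PENTAMUTANT][0])  # ('D', 200, 'N')
print(apply_mutations("MDLT", ["D2N", "T4K"]))          # MNLK
```

Checking the wild-type residue before substituting catches numbering mismatches between a label and the reference sequence it is applied to.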
  • the reverse transcriptase can also be a wild-type or modified reverse transcription xenopolymerase (RTX), avian myeloblastosis virus reverse transcriptase (AMV RT), Feline Immunodeficiency Virus reverse transcriptase (FIV-RT), Feline leukemia virus reverse transcriptase (FeLV-RT), or Human Immunodeficiency Virus reverse transcriptase (HIV-RT).
  • the reverse transcriptase can be a fusion of MMuLV to the Sto7d DNA binding domain (see Ioannidi et al.;
  • PE3, PE3b, PE4, PE5, and/or PEmax which a skilled person can incorporate into the gene editor polypeptide, involves nicking the non-edited strand, potentially causing the cell to remake that strand using the edited strand as the template to induce HR.
  • the nicking of the non-edited strand can involve the use of a nicking guide RNA (ngRNA).
  • the skilled person can readily incorporate into a gene editor polypeptide described herein a prime editing or CRISPR system.
  • Prime editors can be found in the following: WO2020/191153, WO2020/191171, WO2020/191233, WO2020/191234, WO2020/191239, WO2020/191241, WO2020/191242, WO2020/191243, WO2020/191245, WO2020/191246, WO2020/191248, WO2020/191249, each of which is incorporated by reference herein in its entirety.
  • the skilled person can incorporate the selected CRISPR enzyme, as part of the gene editor composition described herein.
  • Additional gene editor polypeptides are as described in WO 2023/076898; WO 2023/015014; WO 2023/070062; WO2023288332; WO2023015318; WO2023004439; WO2023070110; WO2022256714; and WO2023283092, each of which is hereby incorporated by reference in its entirety.
  • Additional gene editor polypeptides are as described in U.S. Patent Pub. 2023/0059368, which is hereby incorporated by reference in its entirety.
  • the prime editor protein (1) site-specifically targets a genomic locus and (2) performs a catalytic cut or nick. These steps are typically performed by a CRISPR-Cas.
  • the Cas protein may be substituted by other nucleic acid programmable DNA binding proteins (napDNAbp) such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), or meganucleases.
  • a Gene Writer protein comprises: (A) a polypeptide or a nucleic acid encoding a polypeptide, wherein the polypeptide comprises (i) a reverse transcriptase domain, and either (x) an endonuclease domain that contains DNA binding functionality or (y) an endonuclease domain and separate DNA binding domain; and (B) a template RNA comprising (i) a sequence that binds the polypeptide and (ii) a heterologous insert sequence.
  • the prime editor or Gene Writer protein fusion or prime editor protein linked or fused to an integrase is expressed as a split construct.
  • the split construct is reconstituted in a cell.
  • the split construct can be fused or ligated via intein protein splicing.
  • the split construct can be reconstituted via protein-protein inter-molecular bonding and/or interactions.
  • the split construct can be reconstituted via chemical, biological, or environmental induced oligomerization.
  • the split construct can be reconstituted via nanobody binding ALFA-tagged proteins.
  • the split construct can be adapted into one or more single nucleic acid polynucleotides.
  • an integrase or recombinase is directly linked or fused, for example by a peptide linker, which may be cleavable or non-cleavable, to the prime editor fusion protein (i.e., fused Cas9 nickase-reverse transcriptase) or Gene Writer protein.
  • Suitable linkers for example between the Cas9, RT, and integrase, may be selected from Table 3:
  • SpCas9 Streptococcus pyogenes Cas9
  • REC recognition
  • NUC nuclease
  • the REC lobe can be divided into three regions: a long α helix referred to as the bridge helix (residues 60–93), the REC1 (residues 94–179 and 308–713) domain, and the REC2 (residues 180–307) domain.
  • the NUC lobe consists of the RuvC (residues 1–59, 718–769, and 909–1098), HNH (residues 775–908), and PAM-interacting (PI) (residues 1099–1368) domains.
  • the negatively charged sgRNA:target DNA heteroduplex is accommodated in a positively charged groove at the interface between the REC and NUC lobes.
  • the RuvC domain is assembled from the three split RuvC motifs (RuvC I–III) and interfaces with the PI domain to form a positively charged surface that interacts with the 3’ tail of the sgRNA.
  • the HNH domain lies between the RuvC II–III motifs and forms only a few contacts with the rest of the protein. Structural aspects of SpCas9 are described by Nishimasu et al., Crystal Structure of Cas9 in Complex with Guide RNA and Target DNA, Cell 156, 935-949, February 27, 2014.
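The residue ranges given above for the SpCas9 lobes and domains can be collected into a lookup table. The sketch below uses only the ranges stated in the text; the `domain_of` helper is a hypothetical name, and residues falling outside the listed ranges (for example the short stretches between the listed RuvC and HNH segments) are reported as unassigned rather than guessed.

```python
# Residue ranges (inclusive) as stated in the text for SpCas9.
SPCAS9_DOMAINS = {
    "RuvC": [(1, 59), (718, 769), (909, 1098)],
    "bridge helix": [(60, 93)],
    "REC1": [(94, 179), (308, 713)],
    "REC2": [(180, 307)],
    "HNH": [(775, 908)],
    "PI": [(1099, 1368)],
}


def domain_of(residue):
    """Return the domain containing a 1-based residue number, or None."""
    for name, ranges in SPCAS9_DOMAINS.items():
        if any(start <= residue <= end for start, end in ranges):
            return name
    return None


print(domain_of(10))   # RuvC
print(domain_of(840))  # HNH
print(domain_of(715))  # None
```

Positions 10 and 840 correspond to the D10A and H840A nickase mutations mentioned elsewhere in this document, which fall in the RuvC and HNH domains respectively.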
  • REC lobe: The REC lobe includes the REC1 and REC2 domains.
  • the REC2 domain does not contact the bound guide:target heteroduplex, indicating that truncation of the REC lobe may be tolerated by SpCas9.
  • An SpCas9 mutant lacking the REC2 domain (Δ175–307) retained ~50% of the wild-type Cas9 activity, indicating that the REC2 domain is not critical for DNA cleavage.
  • PAM-interacting domain: The NUC lobe contains the PAM-interacting (PI) domain that is positioned to recognize the PAM sequence on the noncomplementary DNA strand.
  • RuvC domain: The RuvC nucleases of SpCas9 have an RNase H fold and four catalytic residues, Asp10 (Ala), Glu762, His983, and Asp986, that are critical for the two-metal cleavage of the noncomplementary strand of the target DNA.
  • the Cas9 RuvC domain has other structural elements involved in interactions with the guide:target heteroduplex (an end-capping loop between α42 and α43) and the PI domain/stem loop 3 (β hairpin formed by β3 and β4).
  • HNH domain: SpCas9 HNH nucleases have three catalytic residues, Asp839, His840, and Asn863, and cleave the complementary strand of the target DNA through a single-metal mechanism.
  • the backbone phosphate groups of the guide region interact with the REC1 domain (Arg165, Gly166, Arg403, Asn407, Lys510, Tyr515, and Arg661) and the bridge helix (Arg63, Arg66, Arg70, Arg71, Arg74, and Arg78).
  • the 2’-hydroxyl groups of G1, C15, U16, and G19 hydrogen bond with Val1009, Tyr450, Arg447/Ile448, and Thr404, respectively.
  • RNA-guided DNA targeting: SpCas9 recognizes the guide:target heteroduplex in a sequence-independent manner.
  • the backbone phosphate groups of the target DNA interact with the REC1 (Asn497, Trp659, Arg661, and Gln695), RuvC (Gln926), and PI (Glu1108) domains.
  • the C2’ atoms of the target DNA form van der Waals interactions with the REC1 domain (Leu169, Tyr450, Met495, Met694, and His698) and the RuvC domain (Ala728).
  • the terminal base pair of the guide:target heteroduplex (G1:C20’) is recognized by the RuvC domain via end-capping interactions; the sgRNA G1 and target DNA C20’ nucleobases interact with the Tyr1013 and Val1015 side chains, respectively, whereas the 2’-hydroxyl and phosphate groups of sgRNA G1 interact with Val1009 and Gln926, respectively.
  • the nucleobase of the flipped U44 is sandwiched between Tyr325 and His328, with its N3 atom hydrogen bonded with Tyr325, whereas the nucleobase of the unpaired G43 stacks with Tyr359 and hydrogen bonds with Asp364.
  • the nucleobases of G21 and U50 in the G21:U50 wobble pair stack with the terminal C20:G1’ pair in the guide:target heteroduplex and Tyr72 on the bridge helix, respectively, with the U50 O4 atom hydrogen bonded with Arg75.
  • A51 adopts the syn conformation and is oriented in the direction opposite to U50.
  • Stem loop 1 is primarily recognized by the REC lobe, together with the PI domain.
  • the backbone phosphate groups of stem loop 1 interact with the REC1 domain (Leu455, Ser460, Arg467, Thr472, and Ile473), the PI domain (Lys1123 and Lys1124), and the bridge helix (Arg70 and Arg74), with the 2’-hydroxyl group of G58 hydrogen bonded with Leu455.
  • A52 interacts with Phe1105 through a face-to-edge π-π stacking interaction, and the flipped U59 nucleobase hydrogen bonds with Asn77.
  • the single-stranded linker and stem loops 2 and 3 are primarily recognized by the NUC lobe.
  • the backbone phosphate groups of the linker (nucleotides 63–65 and 67) interact with the RuvC domain (Glu57, Lys742, and Lys1097), the PI domain (Thr1102), and the bridge helix (Arg69), with the 2’-hydroxyl groups of U64 and A65 hydrogen bonded with Glu57 and His721, respectively.
  • the C67 nucleobase forms two hydrogen bonds with Val1100.
  • Stem loop 2 is recognized by Cas9 via the interactions between the NUC lobe and the non-Watson-Crick A68:G81 pair, which is formed by direct (between the A68 N6 and G81 O6 atoms) and water-mediated (between the A68 N1 and G81 N1 atoms) hydrogen-bonding interactions.
  • the A68 and G81 nucleobases contact Ser1351 and Tyr1356, respectively, whereas the A68:G81 pair interacts with Thr1358 via a water-mediated hydrogen bond.
  • Stem loop 3 interacts with the NUC lobe more extensively, as compared to stem loop 2.
  • the backbone phosphate group of G92 interacts with the RuvC domain (Arg40 and Lys44), whereas the G89 and U90 nucleobases hydrogen bond with Gln1272 and Glu1225/Ala1227, respectively.
  • the A88 and C91 nucleobases are recognized by Asn46 via multiple hydrogen-bonding interactions.
  • Cas9 proteins smaller than SpCas9 allow more efficient packaging of nucleic acids encoding CRISPR systems, e.g., Cas9 and sgRNA into one rAAV (“all-in-one-AAV”) particle.
  • Small Cas9 proteins can be advantageous for multidomain-Cas-nuclease-based systems for prime editing.
  • Well characterized smaller Cas9 proteins include Staphylococcus aureus (SauCas9, 1053 amino acid residues) and Campylobacter jejuni (CjCas9, 984 amino acid residues).
  • Schmidt et al. identified Staphylococcus lugdunensis (Slu) Cas9 as having genome-editing activity, provided homology mapping to SpCas9 and SauCas9 to facilitate generation of nickases and inactive (“dead”) enzymes, and engineered nucleases with higher cleavage activity by fragmenting and shuffling Cas9 DNAs (Schmidt et al., 2021, Improved CRISPR genome editing using small highly active and specific engineered RNA-guided nucleases. Nat Commun 12, 4219. doi.org/10.1038/s41467-021-24454-5).
  • the small Cas9s and nickases are useful in the instant disclosure.
  • the Cas9 proteins used herein may also include other “Cas9 variants” that are at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to any reference Cas9 protein, including any wild type Cas9, or mutant Cas9 (e.g., a dead Cas9 or Cas9 nickase), or fragment Cas9, or circular permutant Cas9, or other variant of Cas9 disclosed herein or known in the art.
  • a Cas9 variant may have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50 or more amino acid changes compared to a reference Cas9.
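The identity thresholds and amino acid change counts above can be made concrete with a short calculation. The sketch below is a simplification under a stated assumption: the two sequences are already aligned and of equal length, and identity is the fraction of matching positions. A real comparison would first run a sequence alignment tool; the function names and the ten-residue example strings are hypothetical.

```python
def count_changes(reference, variant):
    """Number of positions at which two pre-aligned sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(1 for ref, var in zip(reference, variant) if ref != var)


def percent_identity(reference, variant):
    """Percent of aligned positions that are identical."""
    matches = len(reference) - count_changes(reference, variant)
    return 100.0 * matches / len(reference)


def meets_threshold(reference, variant, threshold):
    """True if the variant is at least `threshold` percent identical."""
    return percent_identity(reference, variant) >= threshold


ref = "MKRNYILGLD"   # toy reference fragment
var = "MKRNYILGLA"   # one substitution at the last position
print(count_changes(ref, var))        # 1
print(percent_identity(ref, var))     # 90.0
print(meets_threshold(ref, var, 95))  # False
```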
  • the Cas9 variant comprises a fragment of a reference Cas9 (e.g., a gRNA binding domain or a DNA-cleavage domain), such that the fragment is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to the corresponding fragment of wild type Cas9.
  • the fragment is at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.5% of the amino acid length of a corresponding wild type Cas9.
  • the disclosure also may utilize Cas9 fragments that retain their functionality and that are fragments of any herein disclosed Cas9 protein.
  • the Cas9 fragment is at least 100 amino acids in length.
  • the fragment is at least 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, or at least 1300 amino acids in length.
  • the prime editors disclosed herein may comprise one of the Cas9 variants described as follows, or a Cas9 variant thereof that is at least about 70% identical, at least about 80% identical, at least about 90% identical, at least about 95% identical, at least about 96% identical, at least about 97% identical, at least about 98% identical, at least about 99% identical, at least about 99.5% identical, or at least about 99.9% identical to any reference Cas9 variant.
  • prime editors utilized herein comprise CRISPR-Cas system enzymes other than type II enzymes.
  • prime editors comprise type V or type VI CRISPR-Cas system enzymes. It will be appreciated that certain CRISPR enzymes exhibit promiscuous ssDNA cleavage activity and appropriate precautions should be considered.
  • prime editors comprise a nickase or a dead CRISPR with nuclease function comprised in a different component.
  • the nucleic acid programmable DNA binding proteins utilized herein include, without limitation, Cas9 (e.g., dCas9 and nCas9), Cas12a (Cpf1), Cas12b1 (C2c1), Cas12b2, Cas12c (C2c3), Cas12d (CasY), Cas12e (CasX), C2c4, C2c5, C2c8, C2c9, C2c10, Cas13a (C2c2), Cas13b (C2c6), Cas13c (C2c7), Cas13d, and Argonaute.
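Several of the proteins listed above are given with two names, for example Cas12a (Cpf1) and Cas13a (C2c2). A small alias table makes it easy to normalize either name to one form. The sketch below uses only the name pairs stated in the list; treating the Cas12/Cas13 name as the canonical one, and the `canonical_name` helper itself, are choices made here for illustration.

```python
# Alias -> name, taken from the parenthesized pairs in the list above.
ALIASES = {
    "Cpf1": "Cas12a",
    "C2c1": "Cas12b1",
    "C2c3": "Cas12c",
    "CasY": "Cas12d",
    "CasX": "Cas12e",
    "C2c2": "Cas13a",
    "C2c6": "Cas13b",
    "C2c7": "Cas13c",
}


def canonical_name(name):
    """Map an alias (case-insensitive) to its paired name; pass others through."""
    key = name.strip()
    for alias, canonical in ALIASES.items():
        if key.lower() == alias.lower():
            return canonical
    return key


print(canonical_name("Cpf1"))    # Cas12a
print(canonical_name(" c2c2 "))  # Cas13a
print(canonical_name("Cas9"))    # Cas9
```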
  • Cas-equivalents further include those described in Makarova et al., “C2c2 is a single-component programmable RNA-guided RNA-targeting CRISPR effector,” Science 2016; 353(6299) and Makarova et al., “Classification and Nomenclature of CRISPR-Cas Systems: Where from Here?,” The CRISPR Journal, Vol. 1, No. 5, 2018, the contents of which are incorporated herein by reference.
  • One example of a nucleic acid programmable DNA-binding protein that has different PAM specificity than Cas9 is Clustered Regularly Interspaced Short Palindromic Repeats from Prevotella and Francisella 1 (i.e., Cas12a (Cpf1)).
  • Cas12a (Cpf1) is also a Class 2 CRISPR effector, but it is a member of type V subgroup of enzymes, rather than the type II subgroup. It has been shown that Cas12a (Cpf1) mediates robust DNA interference with features distinct from Cas9.
  • Cas12a (Cpf1) is a single RNA-guided endonuclease lacking tracrRNA, and it utilizes a T-rich protospacer-adjacent motif (TTN, TTTN, or YTN). Moreover, Cpf1 cleaves DNA via a staggered DNA double-stranded break.
  • Among Cpf1-family proteins, two enzymes from Acidaminococcus and Lachnospiraceae have been shown to have efficient genome-editing activity in human cells.
  • Cpf1 proteins are known in the art and have been described previously, for example Yamano et al., “Crystal structure of Cpf1 in complex with guide RNA and target DNA.” Cell (165) 2016, p. 949-962; the entire contents of which are hereby incorporated by reference.
  • 6.3. Type V CRISPR proteins
  • [0143] prime editors used herein comprise a type V CRISPR protein. The type V CRISPR family includes Francisella novicida U112 Cpf1 (FnCpf1), also known as FnCas12a.
  • FnCpf1 adopts a bilobed architecture with the two lobes connected by the wedge (WED) domain.
  • the N-terminal REC lobe consists of two α-helical domains (REC1 and REC2) that have been shown to coordinate the crRNA-target DNA heteroduplex.
  • the C-terminal NUC lobe consists of the C-terminal RuvC and Nuc domains involved in target cleavage, the arginine-rich bridge helix (BH), and the PAM-interacting (PI) domain.
  • the repeat-derived segment of the crRNA forms a pseudoknot stabilized by intra-molecular base-pairing and hydrogen-bonding interactions.
  • the pseudoknot is coordinated by residues from the WED, RuvC, and REC2 domains, as well as by two hydrated magnesium cations.
  • nucleotides 1–5 of the crRNA are ordered in the central cavity of FnCas12a and adopt an A-form-like helical conformation. Conformational ordering of the seed sequence is facilitated by multiple interactions between the ribose and phosphate moieties of the crRNA backbone and FnCpf1 residues in the WED and REC1 domains. These include residues Thr16, Lys595, His804, and His881 from the WED domain and residues Tyr47, Lys51, Phe182, and Arg186 from the REC1 domain.
  • FnCas12a-crRNA complex further reveals that the bases of the seed sequence are solvent exposed and poised for hybridization with target DNA.
  • Structural aspects of FnCpf1 are described by Swarts et al., Structural Basis for Guide RNA Processing and Seed-Dependent DNA Targeting by CRISPR-Cas12a, Molecular Cell 66, 221-233, April 20, 2017.
  • [0144] Pre-crRNA processing: Essential residues for crRNA processing include His843, Lys852, and Lys869.
  • R-loop formation: The crRNA-target DNA strand heteroduplex is enclosed in the central cavity formed by the REC and NUC lobes and interacts extensively with the REC1 and REC2 domains.
  • the PAM-containing DNA duplex comprises target strand nucleotides dT0–dT8 and non-target strand nucleotides dA(8)*–dA0* and is contacted by the PI, WED, and REC1 domains.
  • the 5’-TTN-3’ PAM is recognized in FnCas12a by a mechanism combining the shape-specific recognition of a narrowed minor groove, with base-specific recognition of the PAM bases by two invariant residues, Lys671 and Lys613.
  • the duplex of the target DNA is disrupted by the side chain of residue Lys667, which is inserted between the DNA strands and forms a cation-π stacking interaction with the dA0–dT0* base pair.
  • the phosphate group linking target strand residues dT(-1) and dT0 is coordinated by hydrogen-bonding interactions with the side chain of Lys823 and the backbone amide of Gly826.
  • Target strand residue dT(-1) bends away from residue T0, allowing the target strand to interact with the seed sequence of the crRNA.
  • the non-target strand nucleotides dT1*–dT5* interact with the Arg692-Ser702 loop in FnCas12a through hydrogen-bonding and ionic interactions between backbone phosphate groups and side chains of Arg692, Asn700, Ser702, and Gln704, as well as main-chain amide groups of Lys699, Asn700, and Ser702.
  • Alanine substitution of Q704 or replacement of residues Thr698–Ser702 in FnCas12a with the sequence Ala-Gly3 (SEQ ID NO: 115) substantially reduced DNA cleavage activity, suggesting that these residues contribute to R-loop formation by stabilizing the displaced conformation of the nontarget DNA strand.
  • the crRNA-target strand heteroduplex is terminated by a stacking interaction with a conserved aromatic residue (Tyr410). This prevents base pairing between the crRNA and the target strand beyond nucleotides U20 and dA(-20), respectively. Beyond this point, the target DNA strand nucleotides re-engage the non-target DNA strand, forming a PAM-distal DNA duplex comprising nucleotides dC(-21)–dA(-27) and dG21*–dT27*, respectively. The duplex is confined between the REC2 and Nuc domains at the end of the central channel formed by the REC and NUC lobes.
  • Target DNA cleavage: FnCpf1 can independently accommodate both the target and non-target DNA strands in the catalytic pocket of the RuvC domain.
  • the RuvC active site contains three catalytic residues (D917, E1006, and D1255). Structural observations suggest that both the target and non-target DNA strands are cleaved by the same catalytic mechanism in a single active site in Cpf1/Cas12a enzymes.
  • nuclease comprises a Cas12f effector.
  • Small CRISPR-associated effector proteins belonging to the type V-F subtype have been identified through the mining of sequence databases and members classified into Cas12f1 (Cas14a and type V-U3), Cas12f2 (Cas14b) and Cas12f3 (Cas14c, type V-U2 and U4).
  • Exemplary CRISPR-Cas proteins and enzymes used in the Prime Editors herein include the following without limitation.
  • Protospacer Adjacent Motif refers to an approximately 2-6 base pair DNA sequence (or a 2-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 11-, 12-long nucleotide sequence) that is an important targeting component of a Cas9 nuclease.
  • the PAM sequence is on either strand, and is downstream in the 5' to 3' direction of the Cas9 cut site.
  • the canonical PAM sequence (i.e., the PAM sequence that is associated with the Cas9 nuclease of Streptococcus pyogenes or SpCas9) is 5'-NGG-3' wherein “N” is any nucleobase followed by two guanine (“G”) nucleobases.
  • Different PAM sequences can be associated with different Cas9 nucleases or equivalent proteins from different organisms.
  • any given Cas9 nuclease may be modified to alter the PAM specificity of the nuclease such that the nuclease recognizes an alternative PAM sequence.
  • the PAM specificity can be modified by introducing one or more mutations, including (a) D1135V, R1335Q, and T1337R “the VQR variant”, which alters the PAM specificity to NGAN or NGNG, (b) D1135E, R1335Q, and T1337R “the EQR variant”, which alters the PAM specificity to NGAG, and (c) D1135V, G1218R, R1335E, and T1337R “the VRER variant”, which alters the PAM specificity to NGCG.
  • Cas9 enzymes from different bacterial species can have varying PAM specificities and in some embodiments are therefore chosen based on the desired PAM recognition.
  • Cas9 from Staphylococcus aureus (SaCas9) recognizes NGRRT or NGRRN.
  • Cas9 from Neisseria meningitidis (NmCas) recognizes NNNNGATT.
  • Cas9 from Streptococcus thermophilus (StCas9) recognizes NNAGAAW.
  • Cas9 from Treponema denticola (TdCas) recognizes NAAAAC.
  • non-SpCas9s bind a variety of PAM sequences, which makes them useful to expand the range of sequences that can be targeted according to the invention.
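
The PAM motifs listed above use IUPAC degenerate codes (N, R, W), so candidate sites can be located with a simple pattern search. The following is a minimal sketch, not part of the disclosure: the dictionary labels and function names are hypothetical, the motifs are copied from the text as written, and only the strand passed in is scanned.

```python
import re

# IUPAC nucleotide codes that occur in the PAM motifs quoted in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "W": "[AT]", "N": "[ACGT]",
}

# PAM motifs (5'->3') as listed in the text; the keys are illustrative labels only.
PAMS = {
    "SpCas9": ["NGG"],
    "SpCas9-VQR": ["NGAN", "NGNG"],
    "SpCas9-EQR": ["NGAG"],
    "SpCas9-VRER": ["NGCG"],
    "SaCas9": ["NGRRT", "NGRRN"],
    "NmCas": ["NNNNGATT"],
    "StCas9": ["NNAGAAW"],
    "TdCas": ["NAAAAC"],
}


def pam_to_regex(pam):
    """Translate an IUPAC PAM motif into a regular-expression fragment."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_pam_sites(sequence, pam):
    """Return 0-based start positions of all (possibly overlapping) PAM matches."""
    # A lookahead is used so that overlapping matches are all reported.
    pattern = re.compile("(?=(" + pam_to_regex(pam) + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]
```

For example, `find_pam_sites("ACGGTGGA", "NGG")` reports the CGG and TGG occurrences. Scanning the opposite strand would require passing in the reverse complement of the sequence.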
  • non-SpCas9s may have other characteristics that make them more useful than SpCas9.
  • Cas9 from Staphylococcus aureus is about 1 kilobase smaller than SpCas9, so it can be packaged into adeno-associated virus (AAV).
  • Prime editor fusion protein describes a protein that is used in prime editing.
  • Prime editing uses a CRISPR enzyme that nicks or cuts only a single strand of double-stranded DNA, i.e., a nickase; and a nickase can occur either naturally or by mutation or modification of a nuclease that makes double-stranded cuts.
  • Such an enzyme can be a catalytically-impaired Cas9 endonuclease (a nickase).
  • Such an enzyme can be a Cas12a/b, MAD7, or variant thereof.
  • the nickase is fused to an engineered reverse transcriptase (RT).
  • the nickase is programmed (directed) with an attachment site-containing gRNA (atgRNA) or a prime-editing guide RNA (pegRNA).
  • the atgRNA both specifies the target site and encodes the desired edit.
  • the nickase is a catalytically-impaired Cas9 endonuclease, a Cas9 nickase, that is fused to the reverse transcriptase.
  • the Cas9 nickase part of the protein is guided to the DNA target site by the pegRNA, whereby a nick or single stranded cut occurs.
  • the reverse transcriptase domain then uses the pegRNA to template reverse transcription of the desired edit, directly polymerizing DNA onto the nicked target DNA strand.
  • PE1 refers to a PE complex comprising a fusion protein comprising Cas9(H840A) and a wild type MMLV RT having the following N-terminus to C-terminus structure: [NLS]-[Cas9(H840A)]- [linker]-[MMLV_RT(wt)] + a desired atgRNA.
  • the prime editor disclosed herein is comprised of PE1.
  • PE2 refers to a PE complex comprising a fusion protein comprising Cas9(H840A) and a variant MMLV RT having the following N-terminus to C-terminus structure: [NLS]-[Cas9(H840A)]- [linker]-[MMLV_RT(D200N)(T330P)(L603W)(T306K)(W313F)] + a desired PEgRNA.
  • the prime editor disclosed herein is comprised of PE2.
  • the prime editor disclosed herein is comprised of PE2 and co-expression of the MMR protein MLH1dn, that is, PE4.
  • PE3 refers to PE2 plus a second-strand nicking guide RNA that complexes with the PE2 and introduces a nick in the non-edited DNA strand. The induction of the second nick increases the chances of the unedited strand, rather than the edited strand, being repaired.
  • the prime editor disclosed herein is comprised of PE3.
  • the prime editor disclosed herein is comprised of PE3 and co-expression of the MMR protein MLH1dn, that is, PE5.
  • PE3b refers to PE3 but wherein the second-strand nicking guide RNA is designed for temporal control such that the second strand nick is not introduced until after the installation of the desired edit. This is achieved by designing a gRNA with a spacer sequence with mismatches to the unedited original allele that matches only the edited strand. Using this strategy, mismatches between the protospacer and the unedited allele should disfavor nicking by the sgRNA until after the editing event on the PAM strand takes place.
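
The PE1 through PE5 and PE3b definitions above differ along three axes: the reverse transcriptase variant, the presence and type of a second-strand nicking guide, and co-expression of MLH1dn. The following sketch tabulates those differences as stated in the text; the class and field names are hypothetical and serve only to summarize the definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimeEditorConfig:
    """Summary of one prime editor system as defined in the text."""
    name: str
    rt: str             # "wild-type" or "engineered" MMLV RT
    nicking_guide: str  # "none", "standard", or "edit-specific"
    mlh1dn: bool        # co-expression of the MMR protein MLH1dn


PE_SYSTEMS = {
    "PE1": PrimeEditorConfig("PE1", "wild-type", "none", False),
    "PE2": PrimeEditorConfig("PE2", "engineered", "none", False),
    "PE3": PrimeEditorConfig("PE3", "engineered", "standard", False),
    "PE3b": PrimeEditorConfig("PE3b", "engineered", "edit-specific", False),
    "PE4": PrimeEditorConfig("PE4", "engineered", "none", True),      # PE2 + MLH1dn
    "PE5": PrimeEditorConfig("PE5", "engineered", "standard", True),  # PE3 + MLH1dn
}


def uses_second_nick(name):
    """True when the named system includes a second-strand nicking guide."""
    return PE_SYSTEMS[name].nicking_guide != "none"
```

All six systems share the Cas9(H840A) nickase described above, which is why it is not a field in this summary.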
  • a prime editing complex consists of a type II CRISPR PE protein containing an RNA-guided DNA-nicking domain fused to a reverse transcriptase (RT) domain and complexed with a pegRNA.
  • the pegRNA comprises (5’ to 3’) a spacer that is complementary to the target sequence of a genomic DNA, a nickase (e.g. Cas9) binding site, a reverse transcriptase template including editing positions, and primer binding site (PBS).
  • the PE–pegRNA complex binds the target DNA and the CRISPR protein nicks the PAM-containing strand.
  • the resulting 3′ end of the nicked target hybridizes to the primer-binding site (PBS) of the pegRNA, then primes reverse transcription of new DNA containing the desired edit using the RT template of the pegRNA.
  • the overall structure of the pegRNA is like that of a typical type II sgRNA with a reverse transcriptase template/primer binding site appended to the 3’ end. The structure leaves the PBS at the 3’ end of the pegRNA free to bind to the nicked strand complementary to the target which forms the primer for reverse transcription.
  • Guide RNAs of CRISPRs differ in overall structure.
  • the spacer of a type II gRNA is located at the 5’ end.
  • the spacer of a type V gRNA is located towards the 3’ end, with the CRISPR protein (e.g. Cas12a) binding region located toward the 5’ end.
  • the regions of a type V pegRNA are rearranged compared to a type II pegRNA.
  • the pegRNA comprises (5’ to 3’) a CRISPR protein-binding region, a spacer which is complementary to the target sequence of a genomic DNA, a reverse transcriptase template including editing positions, and primer binding site (PBS).
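
The two 5’ to 3’ region orders described above (type II and type V) can be sketched as a small assembly helper. This is an illustrative sketch only: the function names are hypothetical, sequences are treated as plain RNA strings, and the PBS helper simply returns the RNA reverse complement of a user-supplied 3’ end of the nicked DNA strand, since the text states only that the PBS hybridizes to that end.

```python
_DNA_TO_RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def pbs_from_nicked_end(nicked_3prime_dna):
    """RNA primer binding site complementary to the 3' end of the nicked DNA strand."""
    return "".join(_DNA_TO_RNA_COMPLEMENT[b] for b in reversed(nicked_3prime_dna.upper()))


def assemble_pegrna(spacer, protein_binding_region, rt_template, pbs, crispr_type="II"):
    """Concatenate pegRNA regions 5'->3' in the order given for type II or type V systems."""
    if crispr_type == "II":
        # Type II: spacer at the 5' end, then the nickase-binding scaffold.
        parts = [spacer, protein_binding_region, rt_template, pbs]
    elif crispr_type == "V":
        # Type V: protein-binding region toward the 5' end, then the spacer.
        parts = [protein_binding_region, spacer, rt_template, pbs]
    else:
        raise ValueError("crispr_type must be 'II' or 'V'")
    return "".join(parts)
```

In both layouts the RT template and PBS sit at the 3’ end, which leaves the PBS free to anneal to the nicked strand.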
  • Attachment Site-Containing Guide RNA (atgRNA)
  • the term “attachment site-containing guide RNA” (atgRNA) and the like refer to an extended single guide RNA (sgRNA) comprising a primer binding site (PBS) and a reverse transcriptase (RT) template sequence, wherein the RT template encodes an integration recognition site or a recombinase recognition site that can be recognized by a recombinase, integrase, or transposase.
  • the RT template comprises a clamp sequence and an integration recognition site.
  • an atgRNA may be referred to as a guide RNA.
  • A pegRNA into which an integration target recognition site or recombinase target recognition site is incorporated is referred to as an attachment site-containing guide RNA (atgRNA).
  • the term “cognate integration recognition site” or “integration cognate” or “cognate pair” refers to a first integration recognition site (e.g., any of the integration recognition sites described herein) and a second integration recognition site (e.g., any of the integration recognition sites described herein) that can be recombined.
  • Recombination between a first integration recognition site (e.g., any of the integration recognition sites described herein) and a second recognition site (e.g., any of the integration recognition sites described herein) is mediated by functional symmetry between the two integration recognition sites and the central dinucleotide of each of the two integration recognition sites.
  • a non-limiting example of a cognate pair includes an attB site and an attP site, whereby a serine integrase mediates recombination between the attB site and the attP site.
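
The cognate-pair definition above can be illustrated with a simplified check. This sketch assumes, for illustration only, that two sites are cognate when one is an attB and the other an attP for the same integrase and their central dinucleotides match; the class, field names, and example dinucleotides are hypothetical, and real site compatibility also depends on the flanking sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationSite:
    kind: str                  # "attB" or "attP"
    integrase: str             # e.g. "Bxb1"
    central_dinucleotide: str  # two-base core at the crossover point


def is_cognate_pair(site_a, site_b):
    """Simplified test: attB/attP for one integrase with matching central dinucleotides."""
    return (
        {site_a.kind, site_b.kind} == {"attB", "attP"}
        and site_a.integrase == site_b.integrase
        and site_a.central_dinucleotide.upper() == site_b.central_dinucleotide.upper()
    )
```

Under this simplification, sites carrying different central dinucleotides are treated as orthogonal to one another, so several beacon and donor pairs could in principle be used side by side.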
  • FIGs. 1A-1E show optimization of the integration recognition site.
  • an atgRNA comprises a reverse transcriptase template that encodes, partially or in its entirety, an integration recognition site (also referred to as an integration target recognition site) or a recombinase recognition site (also referred to as a recombinase target recognition site).
  • the integration target recognition site which is to be placed at a desired location in the genome or intracellular nucleic acid, is referred to as a “beacon,” a “beacon” site or an “attachment site” or a “landing pad” or “landing site.”
  • an atgRNA comprises a reverse transcriptase template that encodes, partially or in its entirety, an integration target recognition site or recombinase target recognition site.
  • the integration target recognition site which is to be placed at a desired location in the genome, is referred to as a “beacon site” or an “attachment site” or a “landing pad” or “landing site.”
  • the primer binding site allows the 3' end of the nicked DNA strand to hybridize to the atgRNA, while the RT template serves as a template for the synthesis of edited genetic information.
  • the atgRNA is capable, for instance and without limitation, of (i) identifying the target nucleotide sequence to be edited and (ii) encoding new genetic information that replaces (or in some cases adds to) the targeted sequence.
  • the atgRNA is capable of (i) identifying the target nucleotide sequence to be edited and (ii) encoding an integration site that replaces (or inserts/deletes within) the targeted sequences.
  • the first atgRNA and the second atgRNA are an at least first pair of atgRNAs, where the at least first pair of atgRNAs have domains that are capable of guiding the gene editor polypeptide or prime editor fusion protein to a target sequence, the first atgRNA further includes a first RT template that comprises at least a portion of the first integration recognition site; and the second atgRNA further includes a second RT template that comprises at least a portion of the first integration recognition site, and the first atgRNA and the second atgRNAs collectively encode the entirety of the first integration recognition site.
  • the first atgRNA’s reverse transcriptase template encodes for a first single-stranded DNA sequence (i.e., a first DNA flap) that contains a complementary region to a second single-stranded DNA sequence (i.e., a second DNA flap) encoded by a second atgRNA comprising a second reverse transcriptase template.
  • the complementary region between the first and second single-stranded DNA sequences is comprised of more than 5 consecutive bases of an integrase target recognition site.
  • the complementary region between the first and second single-stranded DNA sequences is comprised of more than 10 consecutive bases of an integrase target recognition site.
  • the complementary region between the first and second single-stranded DNA sequences is comprised of more than 20 consecutive bases of an integrase target recognition site. In certain embodiments, the complementary region between the first and second single-stranded DNA sequences is comprised of more than 30 consecutive bases of an integrase target recognition site.
  • Two guide RNAs that are (or encode DNA that is) partially complementary to each other and comprised of consecutive bases of an integrase target recognition site are referred to as dual, paired, annealing, complementary, or twin attachment site-containing guide RNAs (atgRNAs).
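
The complementary-region thresholds above (more than 5, 10, 20, or 30 consecutive bases) can be checked computationally for a candidate pair of flaps. The sketch below assumes, as a simplification, that the two single-stranded DNA flaps anneal through their 3’ termini; the function names are hypothetical and the flaps are given as DNA strings written 5’ to 3’.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def flap_overlap(flap_a, flap_b):
    """Longest 3'-terminal stretch of flap_a that base-pairs with the 3' terminus of flap_b."""
    flap_a, flap_b = flap_a.upper(), flap_b.upper()
    for k in range(min(len(flap_a), len(flap_b)), 0, -1):
        if flap_a[-k:] == reverse_complement(flap_b[-k:]):
            return k
    return 0


def exceeds_threshold(flap_a, flap_b, min_bases):
    """True when the complementary region is more than min_bases consecutive bases."""
    return flap_overlap(flap_a, flap_b) > min_bases
```

The comparison is strict because the text specifies "more than" a given number of bases; the related embodiments phrased as "N or more" would use a non-strict comparison instead.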
  • the atgRNA are selected from Tables 12 or 14.
  • one or more of the atgRNA are selected from SEQ ID NO: 559, 560, 563, 564, 567-569, 570, 573, 574, 577, 578, or 579-598.
  • one or more of the atgRNA are selected from SEQ ID NO: 603 and 604.
  • AttP variants SEQ ID NOS: 394 and 540-542, respectively, in order of appearance
  • Table 9 includes atgRNAs, sgRNAs and nicking guides that can be used herein. Spacers are labeled in capital font (SPACER), RT regions in bold capital (RT REGION), AttB sites in bold lower case (attb site), and PBS in capital italics (PBS). Unless otherwise denoted, the AttB is for Bxb1.
  • a gene editor polypeptide described herein contains an integrase or recombinase.
  • the integrase is delivered as a protein or the integrase is encoded in a delivered polynucleotide.
  • the integration enzyme is selected from the group consisting of Dre, Vika, Bxb1, φC31, RDF, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by a Tc1/mariner family member including but not limited to retrotransposases encoded by L1, Tol2, Tc1, Tc3, Himar 1 (isolated from the horn fly, Haematobia irritans
  • Xu et al. describe methods for evaluating integrase activity in E. coli and mammalian cells and confirmed at least R4, φC31, φBT1, Bxb1, SPBc, TP901-1 and Wβ integrases to be active on substrates integrated into the genome of HT1080 cells (Xu et al., 2013, Accuracy and efficiency define Bxb1 integrase as the best of fifteen candidate serine recombinases for the integration of DNA into the human genome. BMC Biotechnol. 2013 Oct 20;13:87. doi: 10.1186/1472-6750-13-87).
  • Durrant describes new large serine recombinases (LSRs) divided into three classes distinguished from one another by efficiency and specificity, including landing pad LSRs which outperform wild-type Bxb1 in episomal and chromosomal integration efficiency, LSRs that achieve both efficient and site-specific integration without a landing pad, and multi-targeting LSRs with minimal site-specificity. Additionally, embodiments can include any serine recombinase such as BceINT, SSCINT, SACINT, and INT10 (see Ioannidi et al., 2021; Drag-and-drop genome insertion without DNA cleavage with CRISPR directed integrases.
  • the integration site can be selected from an attB site, an attP site, an attL site, an attR site, a lox71 site, a Vox site, or a FRT site.
  • desired activity of integrases, transposases and the like can depend on nuclear localization.
  • prokaryotic enzymes are adapted to modulate nuclear localization.
  • eukaryotic or vertebrate enzymes are adapted to modulate nuclear localization.
  • the disclosure provides fusion or hybrid proteins.
  • Such modulation can comprise addition or removal of one or more nuclear localization signal (NLS) and/or addition or removal of one or more nuclear export signal (NES).
  • This disclosure also features systems, compositions, and methods for generating a primary cell comprising an integration recognition site site-specifically integrated into the genome of the primary cell.
  • a method of generating a primary cell comprising an integration recognition site where the method includes (a) site-specifically incorporating at least a first integration recognition site into a target sequence in the genome of the primary cell.
  • site-specifically incorporating the at least first integration recognition site is effected by introducing into the primary cell: (i) a gene editor polypeptide or a polynucleotide encoding a gene editor polypeptide, wherein the gene editor polypeptide comprises a DNA binding nickase domain linked to a reverse transcriptase domain and is capable of incorporating the integration recognition site into the genome of the primary cell at the target sequence; (ii) at least a first pair of guide RNAs, wherein the first paired guide RNAs have domains that are capable of guiding the gene editor protein or prime editor fusion protein to nick the primary cell genome at sites respectively flanking the specific incorporation site, and at least one of the two paired guide RNAs is an atgRNA that further includes a reverse transcriptase (RT) template that comprises at least a portion of the first integration recognition site, and the atgRNAs collectively encode the entirety of the first integration recognition site, whereby the first integration recognition site is site-specifically incorporated into the genome
  • the at least first pair of guide RNAs comprise: (i) the first of the two paired guide RNAs is a first atgRNA that further includes a first RT template that comprises at least a portion of the first integration recognition site; and (ii) a second of the two paired guide RNAs is a second atgRNA that further includes a second RT template that comprises at least a portion of the first integration recognition site, wherein the first atgRNA and the second atgRNAs collectively encode the entirety of the first integration recognition site.
  • the at least first pair of guide RNAs comprise: (i) the first of the two paired guide RNAs is an atgRNA that further includes an RT template that comprises at least a portion of the first integration recognition site, wherein the atgRNA encodes the entirety of the first integration recognition site; and (ii) a second of the two paired guide RNAs is a nicking gRNA.
  • This disclosure also features systems, compositions, and methods for generating a primary cell comprising an exogenous nucleic acid (e.g., a donor polynucleotide template) site-specifically integrated into the genome of the primary cell.
  • the method of site-specifically integrating an exogenous nucleic acid (e.g., a donor polynucleotide template) into a primary cell genome includes: (b) integrating at least a first donor polynucleotide template into the primary cell genome at the first incorporated integration recognition site, by introducing into the cell: (i) the first donor polynucleotide template, wherein the first donor polynucleotide template is comprised of one or more orthogonal/cognate integration recognition sites, and (ii) an integrase, whereby the donor polynucleotide is integrated into the primary cell at the at least one incorporated genomic integration recognition sites by the integrase; thereby producing a primary cell with a site-specifically integrated donor polynucleotide template.
  • the method of site-specifically integrating an exogenous nucleic acid (e.g., a donor polynucleotide template) into a primary cell genome includes performing steps (a) and (b) concurrently. In some embodiments of the method of site-specifically incorporating the at least first integration recognition site into the genome of the primary cell, the method includes performing step (a) prior to performing step (b).
  • the method includes performing step (a) and step (b) at least 1 day, at least 2 days, at least 3 days, at least 4 days, at least 5 days, at least 6 days, at least 7 days, at least 2 weeks, at least 3 weeks, at least 4 weeks, at least 5 weeks, at least 6 weeks, at least 7 weeks, or at least 8 weeks apart.
  • the method of site-specifically integrating an exogenous nucleic acid (e.g., a donor polynucleotide template) into a primary cell genome includes integrating, into the genome of any of the primary cells described herein at the first incorporated recognition site, at least a first donor polynucleotide template, by introducing into the cell: (i) the first donor polynucleotide template, wherein the first donor polynucleotide template is comprised of one or more orthogonal/cognate integration recognition sites, and (ii) an integrase, whereby the donor polynucleotide is integrated into the primary cell at the at least one incorporated genomic integration recognition sites by the integrase; thereby producing a primary cell with a site-specifically integrated donor polynucleotide template.
  • the method of site-specifically integrating an exogenous nucleic acid (e.g., a donor polynucleotide template) into a primary cell genome comprising: incorporating an integration recognition site into the human primary cell genome by delivering into the cell: a gene editor polypeptide or a polynucleotide encoding a gene editor polypeptide, wherein the gene editor polypeptide comprises a DNA binding nickase domain linked to a reverse transcriptase domain and is capable of incorporating the integration recognition site into the genome of the primary cell at a target sequence; at least a first attachment site-containing guide RNA (atgRNA), wherein the one or more guide RNA encodes all or a portion of an at least first integration recognition site; and integrating a donor polynucleotide template into the human primary cell genome at the incorporated recognition site by delivering into the cell: the donor polynucleotide template, an integration enzyme or a polynucleotide encoding
  • the gene editor polypeptide or the polynucleotide encoding a gene editor polypeptide, the at least first atgRNA, the donor polynucleotide template, and the integration enzyme or the polynucleotide encoding an integration enzyme are concurrently delivered into the primary cell.
  • the gene editor protein or the polynucleotide encoding a gene editor protein, and the at least first atgRNA are delivered at a first time point and the donor polynucleotide template, and the integration enzyme or the polynucleotide encoding an integration enzyme are delivered at a second time point.
  • the first time point and the second time point are at least 1 day, at least 2 days, at least 3 days, at least 4 days, at least 5 days, at least 6 days, at least 7 days, at least 2 weeks, at least 3 weeks, at least 4 weeks, at least 5 weeks, at least 6 weeks, at least 7 weeks, or at least 8 weeks apart.
  • the gene editor polypeptide is configured such that the nickase is linked to the reverse transcriptase.
  • the nickase is linked to the reverse transcriptase by in-frame fusion.
  • the nickase is linked to the reverse transcriptase by a linker.
  • the linker is a peptide fused in-frame between the nickase and reverse transcriptase.
  • the gene editor polynucleotide further comprises a polynucleotide sequence encoding at least an integration enzyme or the gene editor polypeptide further comprises an integration enzyme.
  • the linked nickase-reverse transcriptase are further linked to the integration enzyme.
  • an at least first atgRNA comprises: (i) a domain that is capable of guiding the prime editor system to the target sequence; and (ii) a reverse transcriptase (RT) template that comprises at least a portion of an at least first integration recognition site.
  • the RT template comprises the entirety of the first integration recognition site.
  • site-specifically incorporating an integration recognition site into the primary cell genome further comprises delivering into the cell a second atgRNA.
  • the first atgRNA and the second atgRNA are an at least first pair of atgRNAs, wherein: the at least first pair of atgRNAs have domains that are capable of guiding the prime editor system to a target sequence; the first atgRNA further includes a first RT template that comprises at least a portion of an at least first integration recognition site; the second atgRNA further includes a second RT template that comprises at least a portion of the first integration recognition site, and the first atgRNA and the second atgRNAs collectively encode the entirety of the first integration recognition site.
  • the first RT template encodes a first single-stranded DNA sequence and the second RT template encodes a second single-stranded DNA sequence.
  • the first single-stranded DNA sequence comprises a complementary region with the second single-stranded DNA sequence.
  • the first single-stranded DNA sequence and the second single-stranded DNA sequence form a duplex.
  • the complementary region is 5 or more consecutive bases.
  • the complementary region is 10 or more consecutive bases.
  • the complementary region is 20 or more consecutive bases.
  • the complementary region is 30 or more consecutive bases.
  • incorporating an integration recognition site into the human primary cell genome further comprises delivering into the cell a nicking guide RNA (gRNA).
  • the first integration recognition site is an attB or attP site.
  • the integration recognition site is a modified attB or attP site.
  • the integration recognition site is specific for Bxb1 or a modified Bxb1.
  • the integration recognition site is comprised of 38 or 46 nucleotides.
  • a gene editor protein or polynucleotide encoding a gene editor protein, wherein the gene editor protein is capable of incorporating the integration target recognition site into the genome
  • the one or more guide RNA encodes an integrase target recognition site (i.e., atgRNA)
  • integrating a donor polynucleotide template into the primary cell genome (e.g., the human primary cell genome) at the incorporated target recognition site by delivering into the cell: (i) the donor polynucleotide template, wherein the donor polynucleotide template is comprised of an integration target site, wherein the donor polynucleotide is integrated into the primary cell (e.g., the human primary cell) at the incorporated genomic integration target recognition site by an integrase, thereby producing a genetically modified human primary cell.
  • one or more guide RNA complex wherein the one or more guide RNA complex is comprised of at least one polynucleotide that encodes an integrase target recognition site (i.e., atgRNA), and (iii) optionally, a nicking guide RNA (ngRNA).
  • integrating a donor polynucleotide template into the primary cell genome (e.g., the human primary cell genome) at the incorporated target recognition site by delivering into the cell: (i) the donor polynucleotide template, wherein the donor polynucleotide template is comprised of an integration target site, wherein the donor polynucleotide is integrated into the primary cell (e.g., the human primary cell) at the incorporated genomic integration target recognition site by an integrase, thereby producing a genetically modified primary cell (e.g., a modified human primary cell).
  • the one or more guide RNA complex is a split guide RNA complex or split guide RNA system.
  • the one or more guide RNA complex is a split atgRNA complex or split atgRNA system.
  • the guide RNA or the guide RNA complex reverse transcriptase template encodes for a first single-stranded DNA sequence (i.e., a first DNA flap) that contains a complementary region to a second single-stranded DNA sequence (i.e., a second DNA flap) encoded by a second guide RNA comprised of a reverse transcriptase template.
  • the complementary region between the first and second single-stranded DNA sequences is comprised of more than 10 consecutive bases of an integrase target recognition site.
  • the complementary region between the first and second single-stranded DNA sequences is comprised of more than 20 consecutive bases of an integrase target recognition site. In certain embodiments, the complementary region between the first and second single-stranded DNA sequences is comprised of more than 30 consecutive bases of an integrase target recognition site.
  • Two guide RNAs that are (or encode DNA that is) partially complementary to each other and comprised of consecutive bases of an integrase target recognition site are referred to as dual, paired, annealing, complementary, or twin attachment site-containing guide RNAs (atgRNAs).
  • an integration target recognition site is incorporated (referred to herein as beacon placement) into a primary cell (e.g., a human primary cell) genome using a single atgRNA and a single nicking guide RNA (ngRNA).
  • an integration target recognition site is incorporated into a primary cell genome (e.g., a human primary cell genome) using two atgRNAs (dual atgRNAs).
  • Beacon placement efficiency can be determined, for example, by a two-color digital droplet PCR assay that compares signal from no-insertion amplicons (i.e., wild type) with signal from beacon inserted/placed amplicons.
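
A two-color droplet assay of this kind is commonly reduced to a percentage by Poisson-correcting the droplet counts in each channel and taking the beacon fraction of all alleles. The sketch below is a hedged illustration of that arithmetic, not the assay protocol of the disclosure: the function names are hypothetical, and it assumes that the wild-type and beacon-placed amplicons are scored in separate channels over the same set of droplets.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean target copies per droplet from positive/total droplet counts."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1.0 - positive / total)


def beacon_placement_efficiency(beacon_positive, wildtype_positive, total_droplets):
    """Percent of alleles carrying the beacon: beacon / (beacon + wild type) * 100."""
    beacon = copies_per_droplet(beacon_positive, total_droplets)
    wildtype = copies_per_droplet(wildtype_positive, total_droplets)
    if beacon + wildtype == 0:
        return 0.0
    return 100.0 * beacon / (beacon + wildtype)
```

The Poisson correction matters when a large share of droplets is positive, because a single droplet may then hold more than one template copy.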
  • an integration target recognition site is incorporated into a primary cell genome (e.g., a human primary cell genome) at an efficiency of at least 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%
  • an integration target recognition site is incorporated into a primary cell genome (e.g., a human primary cell genome) at an efficiency of at least 60%. In some embodiments, an integration target recognition site is incorporated into a primary cell genome (e.g., a human primary cell genome) at an efficiency of at least 70%. In some embodiments, an integration target recognition site is incorporated into a primary cell genome (e.g., a human primary cell genome) at an efficiency of at least 80%. In some embodiments, an integration target recognition site is incorporated into a primary cell genome (e.g., a human primary cell genome) at an efficiency of at least 90%.
  • the integration target site is incorporated at one or more of the TRAC locus, CXCR4 locus, TRBC1 locus, TRBC2 locus, PDCD1 locus, B2M locus, CIITA locus, or IL2RB locus.
  • a donor polynucleotide is integrated at one or more incorporated integration target site.
  • an integration target recognition site is incorporated into a primary cell genome (e.g., a human primary cell genome) at the CXCR4 locus at an efficiency of at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%
  • an integration target recognition site is incorporated into a primary cell genome (e.g., a human primary cell genome) at the CXCR4 locus at an efficiency of at least 0.5% using a single atgRNA and a single ngRNA at a 1.5:1 or 3:1 molar ratio of atgRNA to ngRNA.
  • an integration target recognition site is incorporated into a primary cell genome (e.g., a human primary cell genome) at the CXCR4 locus at an efficiency of at least 0.5% using a single atgRNA and a single ngRNA at a 1.5:1 or 3:1 molar ratio of atgRNA to ngRNA at 900 picomoles total of atgRNA plus ngRNA.
  • an integration target recognition site is incorporated into a primary cell genome (e.g., a human primary cell genome) at the CXCR4 locus at an efficiency of at least 0.5% using a single atgRNA and a single ngRNA at a 1.5:1 or 3:1 molar ratio of atgRNA to ngRNA at 900 picomoles total of atgRNA plus ngRNA and 3 or 6 micrograms of mRNA encoding a prime editor protein.
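The molar ratios and total guide amounts recited above translate into per-guide amounts by simple arithmetic. The following is a minimal sketch of that calculation; the helper name is hypothetical, and the only inputs are the ratio and the total picomoles stated in the text.

```python
from typing import Tuple


def split_guide_amounts(total_pmol: float,
                        ratio_atgrna_to_ngrna: float) -> Tuple[float, float]:
    """Return (atgRNA pmol, ngRNA pmol) for a given total and molar ratio."""
    if total_pmol <= 0 or ratio_atgrna_to_ngrna <= 0:
        raise ValueError("total and ratio must be positive")
    ngrna = total_pmol / (ratio_atgrna_to_ngrna + 1.0)
    atgrna = total_pmol - ngrna
    return atgrna, ngrna


# 900 pmol total at the two recited ratios.
for ratio in (1.5, 3.0):
    atg, ng = split_guide_amounts(900.0, ratio)
    print(f"{ratio}:1 -> atgRNA {atg:.0f} pmol, ngRNA {ng:.0f} pmol")
```

At 900 picomoles total, a 3:1 ratio corresponds to 675 picomoles of atgRNA and 225 picomoles of ngRNA, and a 1.5:1 ratio corresponds to 540 and 360 picomoles. The same helper with a ratio of 1.0 splits a dual atgRNA total evenly between the two guides.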
  • an integration target recognition site is incorporated into a human primary cell genome at the IL2RB locus at an efficiency of at least 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%,
  • an integration target recognition site is incorporated into a human primary cell genome at the IL2RB locus at an efficiency of at least 12.5% using a single atgRNA and a single ngRNA. In some embodiments, an integration target recognition site is incorporated into a human primary cell genome at the IL2RB locus at an efficiency of at least 12.5% using a single atgRNA and a single ngRNA at a 3:1 molar ratio of atgRNA to ngRNA.
  • an integration target recognition site is incorporated into a human primary cell genome at the IL2RB locus at an efficiency of at least 12.5% using a single atgRNA and a single ngRNA at a 3:1 molar ratio of atgRNA to ngRNA at 900 picomoles total of atgRNA plus ngRNA.
  • an integration target recognition site is incorporated into a human primary cell genome at the IL2RB locus at an efficiency of at least 12.5% using a single atgRNA and a single ngRNA at a 3:1 molar ratio of atgRNA to ngRNA at 900 picomoles total of atgRNA plus ngRNA and 3 or 6 micrograms of mRNA encoding a prime editor protein.
  • an integration target recognition site is incorporated into a human primary cell genome at the CXCR4 locus at an efficiency of at least 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%,
  • an integration target recognition site is incorporated into a human primary cell genome at the CXCR4 locus at an efficiency of at least 57.2% using dual atgRNAs. In some embodiments, an integration target recognition site is incorporated into a human primary cell genome at the CXCR4 locus at an efficiency of at least 57.2% using dual atgRNAs at a 1:1 molar ratio of each atgRNA. In some embodiments, an integration target recognition site is incorporated into a human primary cell genome at the CXCR4 locus at an efficiency of at least 57.2% using dual atgRNAs at a 1:1 molar ratio of each atgRNA at 400 picomoles total of atgRNA.
  • an integration target recognition site is incorporated into a human primary cell genome at the CXCR4 locus at an efficiency of at least 57.2% using dual atgRNAs at a 1:1 molar ratio of each atgRNA at 400 picomoles total of atgRNA and 3 or 6 micrograms of mRNA encoding a prime editor protein.
  • an integration target recognition site is incorporated into a human primary cell genome at the IL2RB locus at an efficiency of at least 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or
  • an integration target recognition site is incorporated into a human primary cell genome at the IL2RB locus at an efficiency of at least 25% using dual atgRNAs at a 1:1 molar ratio of each atgRNA. In some embodiments, an integration target recognition site is incorporated into a human primary cell genome at the IL2RB locus at an efficiency of at least 25% using dual atgRNAs at a 1:1 molar ratio of each atgRNA at 400 or 200 picomoles total of atgRNA.
  • an integration target recognition site is incorporated into a human primary cell genome at the IL2RB locus at an efficiency of at least 25% using dual atgRNAs at a 1:1 molar ratio of each atgRNA at 400 or 200 picomoles total of atgRNA and 3 or 6 micrograms of mRNA encoding a prime editor protein.
  • an integration target recognition site is incorporated into a human primary cell genome at the TRAC locus at an efficiency of at least 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%,
  • an integration target recognition site is incorporated into a human primary cell genome at the TRAC locus at an efficiency of at least 10% using a single atgRNA and a single ngRNA at a 1.5:1 or 3:1 molar ratio of atgRNA to ngRNA. In some embodiments, an integration target recognition site is incorporated into a human primary cell genome at the TRAC locus at an efficiency of at least 10% using a single atgRNA and a single ngRNA at a 1.5:1 or 3:1 molar ratio of atgRNA to ngRNA at 900 picomoles total of atgRNA plus ngRNA.
  • an integration target recognition site is incorporated into a human primary cell genome at the TRAC locus at an efficiency of at least 10% using a single atgRNA and a single ngRNA at a 1.5:1 or 3:1 molar ratio of atgRNA to ngRNA at 900 picomoles total of atgRNA plus ngRNA and 3 or 6 micrograms of mRNA encoding a prime editor protein.
  • an integration target recognition site is incorporated into a human primary cell genome at the TRAC locus at an efficiency of at least 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%
  • an integration target recognition site is incorporated into a human primary cell genome at the TRAC locus at an efficiency of at least 20% using dual atgRNAs at a 1:1 molar ratio of each dual atgRNA. In some embodiments, an integration target recognition site is incorporated into a human primary cell genome at the TRAC locus at an efficiency of at least 20% using dual atgRNAs at a 1:1 molar ratio of each dual atgRNA at a total of 800 picomoles of dual atgRNA.
  • an integration target recognition site is incorporated into a human primary cell genome at the TRAC locus at an efficiency of at least 20% using dual atgRNAs at a 1:1 molar ratio of each dual atgRNA at a total of 800 picomoles of dual atgRNA and 3 or 6 micrograms of mRNA encoding a prime editor protein.
  • the integration target recognition site is specific for an integrase.
  • the integration target recognition site is specific for a recombinase or transposase.
  • the integration target recognition site is an attB or attP site.
  • the integration target recognition site is a modified or orthogonal attB or attP site.
  • the integrase target recognition site is a Bxb1 attB or attP sequence. In some embodiments, the att-site central dinucleotide is not GT or CA. [0214] In some embodiments, the integration target recognition site is comprised of 38 or 46 nucleotides. In some embodiments, the integration target recognition site is less than 46 nucleotides. In some embodiments, the integration target recognition site is less than 40 nucleotides. In some embodiments, the integration target recognition site is less than 35 nucleotides. In some embodiments, the integration target recognition site is less than 30 nucleotides. In some embodiments, the integration target recognition site is less than 25 nucleotides.
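The length and central-dinucleotide constraints stated above can be expressed as a simple check. The sketch below assumes the central dinucleotide sits at the exact midpoint of an even-length site, which is a simplification made here for illustration. The function names are hypothetical, and the example strings are placeholders rather than real att-site sequences.

```python
# Central dinucleotides excluded in the embodiment described above.
EXCLUDED_DINUCLEOTIDES = {"GT", "CA"}


def central_dinucleotide(site: str) -> str:
    """Return the two bases at the midpoint of an even-length site."""
    site = site.upper()
    if len(site) % 2 != 0:
        raise ValueError("midpoint dinucleotide is only defined here for even lengths")
    mid = len(site) // 2
    return site[mid - 1:mid + 1]


def meets_site_constraints(site: str, allowed_lengths=(38, 46)) -> bool:
    """True if the site has an allowed length and a non-excluded central dinucleotide."""
    if len(site) not in allowed_lengths:
        return False
    return central_dinucleotide(site) not in EXCLUDED_DINUCLEOTIDES


# Placeholder 38-nt strings (not real att sites).
print(meets_site_constraints("A" * 18 + "GT" + "A" * 18))
print(meets_site_constraints("A" * 18 + "GA" + "A" * 18))
```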
  • the one or more guide RNA or the one or more guide RNA complex comprises a chemical modification.
  • the chemical modification is selected from one or more of a 2’ O-methyl and phosphorothioate.
  • the guide RNA for beacon placement comprises, in 5’ to 3’ order, (i) a spacer complementary to a first target genomic site, (ii) a scaffold capable of binding to a DNA binding nuclease or nuclease with nickase activity, (iii) a reverse transcriptase template comprised of at least an integrase target recognition site, and (iv) a primer binding site.
  • the reverse transcriptase template encodes for a first single-stranded DNA sequence or first DNA flap sequence.
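The 5' to 3' component order recited above can be modeled as a small data structure. This is an illustrative sketch only: the class and field names are hypothetical, and the sequences are short placeholders, not functional spacer, scaffold, template, or primer binding site sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtgRNA:
    """Beacon-placement guide RNA components, stored in 5' to 3' order."""
    spacer: str               # (i) complementary to the first target genomic site
    scaffold: str             # (ii) binds the nuclease or nickase
    rt_template: str          # (iii) templates at least the integrase recognition site
    primer_binding_site: str  # (iv) anneals to the nicked genomic strand

    def sequence(self) -> str:
        """Concatenate the components in the recited 5' to 3' order."""
        return (self.spacer + self.scaffold
                + self.rt_template + self.primer_binding_site)

    def templates_site(self, site_template: str) -> bool:
        """True if the reverse transcriptase template contains the given segment."""
        return site_template in self.rt_template


# Placeholder sequences for illustration only.
guide = AtgRNA(spacer="GGAUCC", scaffold="GUUUUAGA",
               rt_template="ACGUACGUAA", primer_binding_site="CCAUGG")
print(guide.sequence())
print(guide.templates_site("CGUACG"))
```

Keeping the four parts as separate fields makes it straightforward to vary the template or primer binding site independently while preserving the recited order.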
  • the cellular integrated polynucleotide encodes for one or more of a chimeric antigen receptor, a T cell receptor, a cytokine receptor, a chemokine receptor, or a modified receptor.
  • the cellular integrated polynucleotide encodes for an exogenous T cell receptor.
  • the cellular integrated polynucleotide encodes for a secreted cytokine.
  • the cellular integrated polynucleotide encodes for a recombinant protein.
  • the cellular integrated polynucleotide further comprises one or more of a co-stimulatory receptor. Suitable co-stimulatory signaling regions are well known in the art and include members of the B7/CD28 family such as B7-1, B7-2, B7-H1, B7-H2, B7-H3, B7-H4, B7-H6, B7-H7, BTLA, CD28, CTLA-4, Gi24, ICOS, PD-1, PD-L2 or PDCD6; or ILT/CD85 family proteins such as LILRA3, LILRA4, LILRB1, LILRB2, LILRB3 or LILRB4; or tumor necrosis factor (TNF) superfamily members such as 4-1BB, BAFF, BAFF R, CD27, CD30, CD40, DR3, GITR, HVEM, LIGHT.
  • the cellular integrated polynucleotide i.e., donor polynucleotide template
  • the suicide switch is comprised of an inducible caspase 9 or HSV thymidine kinase.
  • the co-stimulatory signaling regions may be selected depending upon the particular use intended for the transformed cells.
  • co-stimulatory signaling regions that may work co-operatively or synergistically together may be selected from CD28, CD27, ICOS, 4-1BB, OX40, CD30, GITR, HVEM, DIM or CD40.
  • the encoded one or more chimeric antigen receptor, T cell receptor, cytokine receptor, chemokine receptor, or modified receptor further comprises one or more integrase target recognition site.
  • the encoded one or more chimeric antigen receptor, T cell receptor, cytokine receptor, chemokine receptor, or modified receptor further comprises at least two integrase target recognition sites.
  • the one or more integrase target recognition sites flank the encoded one or more chimeric antigen receptor, T cell receptor, cytokine receptor, chemokine receptor, or modified receptor.
  • the flanking integrase target recognition sites allow for excision of any inserted genetic sequence.
  • the human primary cell is isolated ex vivo or differentiated from isolated human primary cells.
  • the human primary cell is isolated from a human subject prior to incorporating an integration target recognition site into the human primary cell genome.
  • the human primary cell is autologous.
  • the human primary cell is allogeneic. “Allogeneic” as used herein, means cells from different individuals of the same species.
  • the human primary cell is activated ex vivo or in vitro prior to incorporating an integration target recognition site into the human primary cell genome.
  • the human cell is non-activated (i.e., non-stimulated) prior to incorporating an integration target recognition site into the human primary cell genome.
  • the genetically modified human primary cell is expanded in vitro or ex vivo after integrating a donor polynucleotide template.
  • the expanded modified human primary cell composition is isolated.
  • the isolated human primary cell composition is administered in an effective amount to a human subject.
  • the human subject is concurrently or subsequently administered a suitable dose of an anti-IL-6R antibody, such as tocilizumab, alone or in combination with a suitable corticosteroid to decrease or prevent cytokine release syndrome (CRS) and/or neurotoxicity in the subject, also referred to as immune effector cell-associated neurotoxicity syndrome (ICANS) (See, e.g., Front.
  • the present disclosure contemplates a split guide RNA comprised of two or more polynucleotides that are capable of forming a guide RNA complex (see Liu et al; doi: 10.1038/s41587-022-01255-9 the entirety of which is incorporated by reference herein).
  • the guide RNA may be a guide RNA complex, wherein the guide RNA complex is comprised of at least two polynucleotide components.
  • the guide RNA complex is comprised of a first polynucleotide component and a second polynucleotide component.
  • the first polynucleotide component comprises a spacer complementary to a target first genomic site and a scaffold capable of binding to a DNA binding nickase
  • the second polynucleotide component comprises a reverse transcriptase template comprised of at least an integrase target recognition site, a primer binding site and an RNA-protein recruitment domain.
  • the RNA-protein recruitment domain is a MS2 hairpin.
  • the guide RNA or the guide RNA complex further comprises one or more of an RNA-protein recruitment domain, RNA-RNA recruitment domain, a transcriptional termination signal, a reverse transcription termination signal, an RNA ribozyme, or a chemical linker.
  • one or more guide RNA complex is comprised in one or more RNA polynucleotides or DNA polynucleotides.
  • 6.11. Guide RNA Compositions for Dual Guide RNA Systems [0231]
  • the guide RNA or the guide RNA complex reverse transcriptase template encodes for a first single-stranded DNA sequence (i.e., a first DNA flap) that contains a complementary region to a second single-stranded DNA sequence (i.e., a second DNA flap) encoded by a second guide RNA comprised of a reverse transcriptase template.
  • the complementary region between the first and second single-stranded DNA sequences is comprised of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 consecutive bases.
  • the complementary region between the first and second single-stranded DNA sequences is comprised of more than 5 consecutive bases of an integrase target recognition site.
  • the complementary region between the first and second single-stranded DNA sequences is comprised of more than 10 consecutive bases of an integrase target recognition site. In certain embodiments, the complementary region between the first and second single-stranded DNA sequences is comprised of more than 20 consecutive bases of an integrase target recognition site. In certain embodiments, the complementary region between the first and second single-stranded DNA sequences is comprised of more than 30 consecutive bases of an integrase target recognition site. [0234] In some embodiments, the integrase target recognition site is a Bxb1 attB or attP sequence. In some embodiments, the att-site central dinucleotide is not GT or CA.
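The length of the complementary region between two DNA flaps can be measured computationally. The sketch below, offered under stated assumptions, takes both flaps as 5' to 3' DNA strings, reverse-complements the second, and finds the longest run of consecutive bases the two share. The function names and the example sequences are hypothetical placeholders, not sequences from the disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_run(flap1: str, flap2: str) -> int:
    """Longest stretch of consecutive bases in flap1 that can pair with flap2.

    Computed as the longest common substring of flap1 and the reverse
    complement of flap2 (dynamic programming over two rows).
    """
    a = flap1.upper()
    b = reverse_complement(flap2)
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


# Placeholder flaps sharing a 14-base complementary core.
core = "ACGTTGCAAGGCTA"
flap_one = "GGGG" + core
flap_two = "CCCC" + reverse_complement(core)
print(longest_complementary_run(flap_one, flap_two))
```

The returned length can then be compared against a chosen threshold, such as more than 5, 10, 20, or 30 consecutive bases.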
  • one or more guide RNA or the guide RNA complex is comprised in one or more RNA polynucleotides or DNA polynucleotides.
  • 6.12. Guide RNA-Gene Editor Complex [0236]
  • the guide RNA or guide RNA complex is capable of binding a DNA binding nickase selected from the group consisting of: Cas9-D10A, Cas9-H840A, Cas12a/b/c/d/e nickase, CasX nickase, SaCas9 nickase, and CasY nickase.
  • the nickase is linked or fused to one or more of a reverse transcriptase.
  • the nickase is linked or fused to one or more of a reverse transcriptase and integrase. In certain embodiments, the nickase is linked or fused to one or more of an integrase.
  • 6.13. Genes and Targets [0237]
  • This disclosure provides compositions, systems and methods for correcting or replacing genes or gene fragments (including introns or exons) or inserting genes in new locations. In certain embodiments, such a method comprises recombination or integration into a safe harbor site (SHS).
  • a frequently used human SHS is the AAVS1 site on chromosome 19q, initially identified as a site for recurrent adeno-associated virus insertion.
  • a method of the disclosure comprises recombining corrective gene fragments into a defective locus.
  • the methods and compositions can be used to target, without limitation, stem cells for example induced pluripotent stem cells (iPSCs), HSCs, HSPCs, mesenchymal stem cells, or neuronal stem cells and cells at various stages of differentiation.
  • methods and compositions of the disclosure are adapted to target organoids, including patient derived organoids.
  • methods and compositions of the disclosure are adapted to treat muscle cells, including but not limited to cardiomyocytes, for Duchenne Muscular Dystrophy (DMD).
  • the dystrophin gene is the largest gene in the human genome, spanning ~2.3 Mb of DNA. DMD is composed of 79 exons resulting in a 14-kb full-length mRNA. Common mutations include mutations that disrupt the reading frame or generate a premature stop codon.
  • An aspect of DMD that lends it to gene editing as a therapeutic approach is the modular structure of the dystrophin protein. Redundancy in the central rod domain permits the deletion of internal segments of the gene that may harbor loss-of-function mutations, thereby restoring the open reading frame (ORF).
  • the methods and systems described herein are used to treat DMD by site- specifically integrating in the genome a polynucleotide template that repairs or replaces all or a portion of the defective DMD gene.
  • the most common cystic fibrosis (CF) mutation F508del removes a single amino acid.
  • recombining human CFTR into an SHS of a cell that expresses CFTR F508del is a corrective treatment path.
  • the methods and systems described herein are used to treat CF by site-specifically integrating in the genome a polynucleotide template that corrects the mutation causing CF. Proposed validation is detection of persistent CFTR mRNA and protein expression in transduced cells.
  • Sickle cell disease (SCD) is caused by mutation of a specific amino acid: glutamic acid to valine at amino acid position 6.
  • SCD is corrected by recombination of the HBB gene into a safe harbor site (SHS) and by demonstrating correction in a proportion of target cells that is high enough to produce a substantial benefit.
  • the methods and systems described herein are used to treat sickle cell disease by site-specifically integrating in the genome a polynucleotide template that corrects the mutation causing the disease.
  • validation is detection of persistent HBB mRNA and protein expression in transduced cells.
  • Duchenne Muscular Dystrophy (DMD). The dystrophin gene is the largest gene in the human genome, spanning ~2.3 Mb of DNA.
  • DMD is composed of 79 exons resulting in a 14-kb full-length mRNA. Common mutations include mutations that disrupt the reading frame or generate a premature stop codon.
  • An aspect of DMD that lends it to gene editing as a therapeutic approach is the modular structure of the dystrophin protein. Redundancy in the central rod domain permits the deletion of internal segments of the gene that may harbor loss-of-function mutations, thereby restoring the open reading frame (ORF).
  • recombination will be into safe harbor sites (SHS).
  • a frequently used human SHS is the AAVS1 site on chromosome 19q, initially identified as a site for recurrent adeno-associated virus insertion.
  • the site is the human homolog of the murine Rosa26 locus (pubmed.ncbi.nlm.nih.gov/18037879). In some embodiments, the site is the human H11 locus on chromosome 22.
  • Proposed target cells for recombination include stem cells for example induced pluripotent stem cells (iPSCs) and cells at various stages of differentiation. In some cases, a complete gene may be prohibitively large and replacement of an entire gene impractical. In such instances, rescuing mutants by recombining in corrected gene fragments with the methods and systems described herein is a corrective option.
  • correcting mutations in exon 44 (or 51) by recombining in a corrective coding sequence downstream of exon 43 (or 50), using the methods and systems described herein is a corrective option.
  • Proposed validation is detection of persistent DMD mRNA and protein expression in transduced cells.
  • correcting factor VIII deficiency by recombining the FVIII gene into an SHS is a corrective path.
  • the methods and systems described herein are used to correct factor VIII deficiency by site-specifically integrating in the genome a polynucleotide template that corrects the mutation causing the factor VIII deficiency. Proposed validation is detection of persistent FVIII mRNA and protein expression in transduced cells.
  • Factor 9 (Factor IX). Hemophilia B, also called factor IX (FIX) deficiency, is a genetic disorder caused by missing or defective factor IX, a clotting protein.
  • the methods and systems described herein are used to correct factor IX deficiency by site-specifically integrating in the genome a polynucleotide template that corrects the mutation causing the FIX deficiency. Proposed validation is detection of persistent FIX mRNA and protein expression in transduced cells.
  • Ornithine transcarbamylase deficiency (OTCD)
  • Ornithine transcarbamylase deficiency is a rare genetic condition that causes ammonia to build up in the blood. The condition, more commonly called OTC deficiency, is more common in boys than girls and tends to be more severe when symptoms emerge shortly after birth.
  • the methods and systems described herein are used to correct OTC deficiency by site-specifically integrating in the genome a polynucleotide template that corrects the mutation causing the OTC deficiency or integrates a polynucleotide encoding a functional ornithine transcarbamylase enzyme.
  • Proposed validation is detection of persistent OTC mRNA and protein expression in transduced cells.
  • Phenylketonuria, also called PKU, is a rare inherited disorder that causes an amino acid called phenylalanine to build up in the body. PKU is caused by a change in the phenylalanine hydroxylase (PAH) gene.
  • the methods and systems described herein are used to correct PKU by site-specifically integrating in the genome a polynucleotide template that corrects the mutation causing the PKU deficiency or integrates a polynucleotide encoding a functional phenylalanine hydroxylase (PAH) gene. Proposed validation is detection of persistent PAH mRNA and protein expression in transduced cells.
  • Homocystinuria (HCU)
  • Homocystinuria is elevation of the amino acid homocysteine (a protein building block that comes from the diet) in the urine or blood.
  • Common causes of HCU include: problems with the enzyme cystathionine beta synthase (CBS), which converts homocysteine to the amino acid cystathionine (which then becomes cysteine) and needs vitamin B6 (pyridoxine); and problems with converting homocysteine to the amino acid methionine.
  • the methods and systems described herein are used to correct HCU by site-specifically integrating in the genome a polynucleotide template that corrects the mutation causing the HCU or integrates a polynucleotide encoding a functional copy of a gene (e.g., CBS) able to reduce or prevent buildup of homocysteine in the urine.
  • IgA Nephropathy (Berger’s disease). IgA nephropathy, also known as Berger's disease, is a kidney/autoimmune disease that occurs when an antibody called immunoglobulin A (IgA) builds up in the kidneys.
  • the methods and systems described herein are used to treat Berger’s disease by administering to a patient an iPSC-derived Natural Killer cell that includes a polynucleotide site-specifically integrated in the genome of the cell using the methods described herein.
  • Upon administering the iPSC-NK cell to the patient, the iPSC-NK cell is capable of removing native cells (e.g., B cells) that are responsible, at least in part, for the symptoms of Berger’s disease.
  • ANCA vasculitis is an autoimmune disease affecting small blood vessels in the body. It is caused by autoantibodies called ANCAs, or Anti-Neutrophilic Cytoplasmic Autoantibodies. ANCAs target and attack a certain kind of white blood cells called neutrophils.
  • the methods and systems described herein are used to treat ANCA vasculitis by administering to a patient an iPSC-derived Natural Killer cell that includes a polynucleotide site-specifically integrated in the genome of the cell using the methods described herein.
  • Upon administering the iPSC-NK cell to the patient, the iPSC-NK cell is capable of removing native cells (e.g., B cells) that are responsible, at least in part, for the symptoms of ANCA vasculitis.
  • Systemic Lupus Erythematosus (SLE)
  • Lupus Nephritis (LN)
  • Lupus is an autoimmune disease, a disorder in which the body’s immune system attacks the body’s own cells and organs.
  • the methods and systems described herein are used to treat SLE/LN by administering to a patient an iPSC-derived Natural Killer cell that includes a polynucleotide site-specifically integrated in the genome of the cell using the methods described herein.
  • Upon administering the iPSC-NK cell to the patient, the iPSC-NK cell is capable of removing native cells (e.g., B cells) that are responsible, at least in part, for the symptoms of SLE/LN.
  • Membranous Nephropathy (MN)
  • MN is a kidney disease that affects the filters (glomeruli) of the kidney and can cause protein in the urine, as well as decreased kidney function and swelling. It can sometimes be called membranous glomerulopathy as well (these terms can be used interchangeably and mean the same thing).
  • the methods and systems described herein are used to treat MN by administering to a patient an iPSC-derived Natural Killer cell that includes a polynucleotide site-specifically integrated in the genome of the cell using the methods described herein.
  • Upon administering the iPSC-NK cell to the patient, the iPSC-NK cell is capable of removing native cells (e.g., B cells) that are responsible, at least in part, for the symptoms of MN.
  • C3 glomerulonephritis (C3GN).
  • C3 glomerulopathy is a group of related conditions that cause the kidneys to malfunction.
  • the major features of C3 glomerulopathy include high levels of protein in the urine (proteinuria), blood in the urine (hematuria), reduced amounts of urine, low levels of protein in the blood, and swelling in many areas of the body. Affected individuals may have particularly low levels of a protein called complement component 3 (or C3) in the blood.
  • the methods and systems described herein are used to treat C3 glomerulopathy by administering to a patient an iPSC-derived Natural Killer cell that includes a polynucleotide site-specifically integrated in the genome of the cell using the methods described herein.
  • Upon administering the iPSC-NK cell to the patient, the iPSC-NK cell is capable of removing native cells (e.g., B cells) that are responsible, at least in part, for the symptoms of C3 glomerulopathy. 6.14.
  • Methods of treatment are presented. The method comprises administering an effective amount of the pharmaceutical composition comprising the nucleic acid construct or vectorized nucleic acid construct described above to a patient in need thereof.
  • the method of cell delivery used here occurs using electroporation.
  • the method of cell delivery used here occurs using a recombinant adenovirus, helper dependent adenovirus, AAV, lentivirus, HSV, anellovirus, retrovirus, Doggybone DNA, minicircle, plasmid, miniDNA, nanoplasmid, mRNA, RNP, or lipid nanoparticle.
  • Delivery of the nucleic acid construct can also be by fusosome or exosome (see, e.g., WO2019222403, which is incorporated by reference herein). Delivery of the nucleic acid construct can also be by VesiCas (see, e.g., US20210261957A1, which is incorporated by reference herein).
  • DNA or RNA viral vectors can be administered directly to patients (in vivo), or they can be used to treat cells in vitro, and the modified cells may optionally be administered to patients (ex vivo).
  • Conventional viral based systems to be used herein could include retroviral, lentivirus, adenoviral, adeno-associated and herpes simplex virus vectors for gene transfer. Integration in the host genome is possible with the retrovirus, lentivirus, and adeno-associated virus gene transfer methods, often resulting in long term expression of the inserted transgene. Additionally, high transduction efficiencies have been observed in many different cell types and target tissues.
  • Methods of non-viral delivery include lipofection, nucleofection, microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid:nucleic acid conjugates, naked DNA, artificial virions, and agent-enhanced uptake of DNA.
  • Lipofection is described in, e.g., U.S. Pat. Nos. 5,049,386; 4,946,787; and 4,897,355, and lipofection reagents are sold commercially (e.g., Transfectam™ and Lipofectin™).
  • Cationic and neutral lipids that are suitable for efficient receptor-recognition lipofection of polynucleotides include those of Felgner, WO 91/17424; WO 91/16024. Delivery can be to cells (e.g. in vitro or ex vivo administration) or target tissues (e.g. in vivo administration).
  • 6.14.1. Lipid Nanoparticle Delivery [0271]
  • the gene editor, guide RNA (i.e. atgRNA or atgRNAs), nicking guide RNA, and/or donor polynucleotide can be packaged in a LNP and administered intravenously.
  • the gene editor, guide RNA (i.e. atgRNA or atgRNAs), nicking guide RNA, and/or donor polynucleotide can be packaged in a LNP and administered intrathecally.
  • the gene editor, guide RNA (i.e. atgRNA or atgRNAs), nicking guide RNA, and/or donor polynucleotide can be packaged in a LNP and administered by intracerebral ventricular injection.
  • the gene editor, guide RNA (i.e. atgRNA or atgRNAs), nicking guide RNA, and/or donor polynucleotide can be packaged in a LNP and administered by intracisternal magna administration.
  • the gene editor, guide RNA (i.e. atgRNA or atgRNAs), nicking guide RNA, and/or donor polynucleotide are packaged in a LNP and administered by intravitreal injection.
  • the preparation of lipid:nucleic acid complexes, including targeted liposomes such as immunolipid complexes is well known to one of skill in the art (see, e.g., Crystal, Science 270:404-410 (1995); Blaese et al., Cancer Gene Ther.2:291-297 (1995); Behr et al., Bioconjugate Chem. 5:382-389 (1994); Remy et
  • LNP doses of about 0.01 to about 1 mg per kg of body weight administered intravenously are contemplated.
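  • As an illustration of the weight-based dose range above, the following minimal sketch converts a dose in mg per kg into an absolute dose in mg. The function name and the 70 kg body weight are hypothetical values chosen for the example and are not taken from the disclosure.

```python
def lnp_dose_mg(body_weight_kg: float, dose_mg_per_kg: float) -> float:
    """Return the absolute LNP dose in mg for a weight-based dose in mg/kg."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if dose_mg_per_kg < 0:
        raise ValueError("dose per kg must not be negative")
    return body_weight_kg * dose_mg_per_kg


# Bounds of the contemplated intravenous range (about 0.01 to about 1 mg/kg).
LOW_MG_PER_KG = 0.01
HIGH_MG_PER_KG = 1.0

if __name__ == "__main__":
    body_weight_kg = 70.0  # hypothetical body weight, for illustration only
    low = lnp_dose_mg(body_weight_kg, LOW_MG_PER_KG)
    high = lnp_dose_mg(body_weight_kg, HIGH_MG_PER_KG)
    print(f"Dose range for {body_weight_kg:.0f} kg: {low:.2f} mg to {high:.2f} mg")
```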
  • Medications to reduce the risk of infusion-related reactions, such as dexamethasone, acetaminophen, diphenhydramine or cetirizine, and ranitidine, are contemplated. Multiple doses of about 0.3 mg per kilogram every 4 weeks for five doses are also contemplated. [0274] The charge of the LNP must be taken into consideration, as cationic lipids combine with negatively charged lipids to induce nonbilayer structures that facilitate intracellular delivery.
  • ionizable cationic lipids with pKa values below 7 were developed (see, e.g., Rosin et al, Molecular Therapy, vol.19, no.12, pages 1286-2200, December 2011).
  • Negatively charged polymers such as RNA may be loaded into LNPs at low pH values (e.g., pH 4) where the ionizable lipids display a positive charge.
  • the LNPs exhibit a low surface charge compatible with longer circulation times.
  • Four species of ionizable cationic lipids have been focused upon, namely 1,2-dilinoleoyl-3-dimethylammonium-propane (DLinDAP), 1,2-dilinoleyloxy-3-N,N-dimethylaminopropane (DLinDMA), 1,2-dilinoleyloxy-keto-N,N-dimethyl-3-aminopropane (DLinKDMA), and 1,2-dilinoleyl-4-(2-dimethylaminoethyl)-[1,3]-dioxolane (DLinKC2-DMA).
  • the LNP siRNA systems containing these lipids exhibit remarkably different gene silencing properties in hepatocytes in vivo, with potencies varying according to the series DLinKC2-DMA>DLinKDMA>DLinDMA>>DLinDAP employing a Factor VII gene silencing model (see, e.g., Rosin et al, Molecular Therapy, vol.19, no.12, pages 1286-2200, December 2011).
  • a dosage of 1 μg/ml of LNP in or associated with the LNP may be contemplated, especially for a formulation containing DLinKC2-DMA.
  • the LNP composition comprises one or more ionizable lipids.
  • ionizable lipid has its ordinary meaning in the art and may refer to a lipid comprising one or more charged moieties. In some embodiments, an ionizable lipid may be positively charged or negatively charged. In principle, there are no specific limitations concerning the ionizable lipids of the LNP compositions disclosed herein.
  • the one or more ionizable lipids are selected from the group consisting of 3-(didodecylamino)-N1,N1,4-tridodecyl-1-piperazineethanamine (KL10), N1-[2-(didodecylamino)ethyl]-N1,N4,N4-tridodecyl-1,4-piperazinediethanamine (KL22), 14,25-ditridecyl-15,18,21,24-tetraaza-octatriacontane (KL25), 1,2-dilinoleyloxy-N,N-dimethylaminopropane (DLin-DMA), 2,2-dilinoleyl-4-dimethylaminomethyl-[1,3]-dioxolane (DLin-K-DMA), heptatriaconta-6,9,28,31-tetraen-19-yl 4-(di
  • the ionizable lipid may be selected from, but not limited to, an ionizable lipid described in International Publication Nos. WO2013086354 and WO2013116126.
  • the lipid nanoparticle may include one or more (e.g., 1, 2, 3, 4, 5, 6, 7, or 8) cationic and/or ionizable lipids.
  • Such cationic and/or ionizable lipids include, but are not limited to, 3-(didodecylamino)-N1,N1,4-tridodecyl-1-piperazineethanamine (KL10), N1-[2-(didodecylamino)ethyl]-N1,N4,N4-tridodecyl-1,4-piperazinediethanamine (KL22), 14,25-ditridecyl-15,18,21,24-tetraaza-octatriacontane (KL25), 1,2-dilinoleyloxy-N,N-dimethylaminopropane (DLin-DMA), 2,2-dilinoleyl-4-dimethylaminomethyl-[1,3]-dioxolane (DLin-K-DMA), heptatriaconta-6,9,28,31-tetraen-19-yl 4-(dimethylamino)but
  • LIPOFECTIN®, including DOTMA and DOPE, available from GIBCO/BRL
  • LIPOFECTAMINE®, including DOSPA and DOPE, available from GIBCO/BRL
  • KL10, KL22, and KL25 are described, for example, in U.S. Pat. No. 8,691,750.
  • the LNP composition comprises one or more amino lipids.
  • amino lipid and “cationic lipid” are used interchangeably herein to include those lipids and salts thereof having one, two, three, or more fatty acid or fatty alkyl chains and a pH-titratable amino head group (e.g., an alkylamino or dialkylamino head group).
  • In principle, there are no specific limitations concerning the amino lipids of the LNP compositions disclosed herein.
  • the cationic lipid is typically protonated (i.e., positively charged) at a pH below the pKa of the cationic lipid and is substantially neutral at a pH above the pKa.
  • the cationic lipids can also be termed titratable cationic lipids.
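  • The pH-dependent charge behavior described above can be illustrated with the Henderson-Hasselbalch relationship for a titratable amine, in which the protonated fraction is 1 / (1 + 10^(pH - pKa)). The following is a minimal sketch only; the pKa of 6.5 is a hypothetical value chosen because it lies below 7, and it is not a value stated in the disclosure for any particular lipid.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a titratable amine head group that is protonated at a given pH.

    Uses the Henderson-Hasselbalch relationship for a base:
    fraction = 1 / (1 + 10 ** (pH - pKa)).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    hypothetical_pka = 6.5  # assumption for illustration; not from the disclosure
    for label, ph in (("loading pH", 4.0), ("physiological pH", 7.4)):
        frac = fraction_protonated(ph, hypothetical_pka)
        print(f"{label} {ph}: about {frac:.1%} of head groups protonated")
```

  • Under this assumption, the lipid is almost fully protonated at pH 4, which is consistent with loading negatively charged polymers such as RNA at low pH, and is largely neutral at a pH above the pKa.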
  • the one or more cationic lipids include: a protonatable tertiary amine (e.g., pH-titratable) head group; alkyl chains, wherein each alkyl chain independently has 0 to 3 (e.g., 0, 1, 2, or 3) double bonds; and ether, ester, or ketal linkages between the head group and alkyl chains.
  • Such cationic lipids include, but are not limited to, DSDMA, DODMA, DOTMA, DLinDMA, DLenDMA, γ-DLenDMA, DLin-K-DMA, DLin-K-C2-DMA (also known as DLin-C2K-DMA, XTC2, and C2K), DLin-K-C3-DMA, DLin-K-C4-DMA, DLen-C2K-DMA, y-DLen-C2-DMA, C12-200, cKK-E12, cKK-A12, cKK-O12, DLin-MC2-DMA (also known as MC2), and DLin-MC3-DMA (also known as MC3).
  • Anionic lipids suitable for use in lipid nanoparticles include, but are not limited to, phosphatidylglycerol, cardiolipin, diacylphosphatidylserine, diacylphosphatidic acid, N-dodecanoyl phosphatidylethanolamine, N-succinyl phosphatidylethanolamine, N-glutaryl phosphatidylethanolamine, lysylphosphatidylglycerol, and other anionic modifying groups joined to neutral lipids.
  • Neutral lipids suitable for use in lipid nanoparticles include, but are not limited to, diacylphosphatidylcholine, diacylphosphatidylethanolamine, ceramide, sphingomyelin, dihydrosphingomyelin, cephalin, sterols (e.g., cholesterol) and cerebrosides.
  • the lipid nanoparticle comprises cholesterol.
  • Lipids having a variety of acyl chain groups of varying chain length and degree of saturation are available or may be isolated or synthesized by well-known techniques. Additionally, lipids having mixtures of saturated and unsaturated fatty acid chains and cyclic regions can be used.
  • the neutral lipids used in the disclosure are DOPE, DSPC, DPPC, POPC, or any related phosphatidylcholine.
  • the neutral lipid may be composed of sphingomyelin, dihydrosphingomyelin, or phospholipids with other head groups, such as serine and inositol.
  • amphipathic lipids are included in nanoparticles. Exemplary amphipathic lipids suitable for use in nanoparticles include, but are not limited to, sphingolipids, phospholipids, fatty acids, and amino lipids.
  • the lipid composition of the pharmaceutical composition may comprise one or more phospholipids, for example, one or more saturated or (poly)unsaturated phospholipids or a combination thereof.
  • phospholipids comprise a phospholipid moiety and one or more fatty acid moieties.
  • a phospholipid moiety can be selected, for example, from the non-limiting group consisting of phosphatidyl choline, phosphatidyl ethanolamine, phosphatidyl glycerol, phosphatidyl serine, phosphatidic acid, 2-lysophosphatidyl choline, and a sphingomyelin.
  • a fatty acid moiety can be selected, for example, from the non-limiting group consisting of lauric acid, myristic acid, myristoleic acid, palmitic acid, palmitoleic acid, stearic acid, oleic acid, linoleic acid, alpha-linolenic acid, erucic acid, phytanoic acid, arachidic acid, arachidonic acid, eicosapentaenoic acid, behenic acid, docosapentaenoic acid, and docosahexaenoic acid.
  • Particular amphipathic lipids can facilitate fusion to a membrane.
  • a cationic phospholipid can interact with one or more negatively charged phospholipids of a membrane (e.g., a cellular or intracellular membrane). Fusion of a phospholipid to a membrane can allow one or more elements (e.g., a therapeutic agent) of a lipid-containing composition (e.g., LNPs) to pass through the membrane permitting, e.g., delivery of the one or more elements to a target tissue.
  • Non-natural amphipathic lipid species including natural species with modifications and substitutions including branching, oxidation, cyclization, and alkynes are also contemplated.
  • a phospholipid can be functionalized with or cross-linked to one or more alkynes (e.g., an alkenyl group in which one or more double bonds is replaced with a triple bond).
  • an alkyne group can undergo a copper-catalyzed cycloaddition upon exposure to an azide.
  • Such reactions can be useful in functionalizing a lipid bilayer of a nanoparticle composition to facilitate membrane permeation or cellular recognition or in conjugating a nanoparticle composition to a useful component such as a targeting or imaging moiety (e.g., a dye).
  • Phospholipids include, but are not limited to, glycerophospholipids such as phosphatidylcholines, phosphatidylethanolamines, phosphatidylserines, phosphatidylinositols, phosphatidylglycerols, and phosphatidic acids. Phospholipids also include phosphosphingolipids, such as sphingomyelin. [0287] In some embodiments, the LNP composition comprises one or more phospholipids.
  • the phospholipid is selected from the group consisting of 1,2-dilinoleoyl-sn-glycero-3-phosphocholine (DLPC), 1,2-dimyristoyl-sn-glycero-phosphocholine (DMPC), 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC), 1,2-dipalmitoyl-sn-glycero-3-phosphocholine (DPPC), 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC), 1,2-diundecanoyl-sn-glycero-phosphocholine (DUPC), 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC), 1,2-di-O-octadecenyl-sn-glycero-3-phosphocholine (18:0 Diether PC), 1-oleoyl-2-cholesterylhemisuccino
  • the LNP composition comprises one or more helper lipids.
  • helper lipid refers to lipids that enhance transfection (e.g., transfection of an LNP comprising an mRNA that encodes a site-directed endonuclease, such as a SpCas9 polypeptide).
  • In principle, there are no specific limitations concerning the helper lipids of the LNP compositions disclosed herein. Without being bound to any particular theory, it is believed that the mechanism by which the helper lipid enhances transfection includes enhancing particle stability. In some embodiments, the helper lipid enhances membrane fusogenicity.
  • the helper lipid of the LNP compositions disclosed herein can be any helper lipid known in the art. Non-limiting examples of helper lipids suitable for the compositions and methods include steroids, sterols, and alkyl resorcinols.
  • helper lipids suitable for use in the present disclosure include, but are not limited to, saturated phosphatidylcholine (PC) such as distearoyl-PC (DSPC) and dipalmitoyl-PC (DPPC), dioleoylphosphatidylethanolamine (DOPE), 1,2-dilinoleoyl-sn-glycero-3-phosphocholine (DLPC), cholesterol, 5-heptadecylresorcinol, and cholesterol hemisuccinate.
  • the helper lipid of the LNP composition includes cholesterol.
  • the LNP composition comprises one or more structural lipids.
  • structural lipid refers to sterols and also to lipids containing sterol moieties. Without being bound to any particular theory, it is believed that the incorporation of structural lipids into the LNPs mitigates aggregation of other lipids in the particle.
  • Structural lipids can be selected from the group including but not limited to, cholesterol, fecosterol, sitosterol, ergosterol, campesterol, stigmasterol, brassicasterol, tomatidine, tomatine, ursolic acid, alpha-tocopherol, hopanoids, phytosterols, steroids, and mixtures thereof.
  • the structural lipid is a sterol.
  • sterols are a subgroup of steroids consisting of steroid alcohols.
  • the structural lipid is a steroid.
  • the structural lipid is cholesterol.
  • the structural lipid is an analog of cholesterol.
  • the lipid component of a lipid nanoparticle composition may include one or more molecules comprising polyethylene glycol, such as PEG or PEG-modified lipids.
  • the LNP composition disclosed herein comprises one or more polyethylene glycol (PEG) lipids.
  • PEG-lipid refers to polyethylene glycol (PEG)-modified lipids. Such lipids are also referred to as PEGylated lipids.
  • PEG-lipids include PEG-modified phosphatidylethanolamine and phosphatidic acid, PEG-ceramide conjugates (e.g., PEG-CerC14 or PEG-CerC20), PEG-modified dialkylamines and PEG-modified 1,2-diacyloxypropan-3-amines
  • a PEG lipid can be PEG-c-DOMG, PEG-DMG, PEG-DLPE, PEG-DMPE, PEG-DPPC, or a PEG-DSPE lipid.
  • the PEG-lipid includes, but is not limited to, 1,2-dimyristoyl-sn-glycerol methoxypolyethylene glycol (PEG-DMG), 1,2-distearoyl-sn-glycero-3-phosphoethanolamine-N-[amino(polyethylene glycol)] (PEG-DSPE), PEG-distearyl glycerol (PEG-DSG), PEG-dipalmitoleyl, PEG-dioleyl, PEG-distearyl, PEG-diacylglycamide (PEG-DAG), PEG-dipalmitoyl phosphatidylethanolamine (PEG-DPPE), or PEG-1,2-dimyristyloxypropyl-3-amine (PEG-c-DMA).
  • the PEG-lipid is selected from the group consisting of a PEG-modified phosphatidylethanolamine, a PEG-modified phosphatidic acid, a PEG-modified ceramide, a PEG-modified dialkylamine, a PEG-modified diacylglycerol, a PEG-modified dialkylglycerol, and mixtures thereof.
  • the lipid moiety of the PEG-lipids includes those having lengths of from about C14 to about C22, preferably from about C14 to about C16.
  • a PEG moiety, for example an mPEG-NH2, has a size of about 1000, 2000, 5000, 10,000, 15,000 or 20,000 daltons.
  • the PEG-lipid is PEG2k-DMG.
  • the one or more PEG lipids of the LNP composition comprises PEG-DMPE.
  • the one or more PEG lipids of the LNP composition comprises PEG-DMG.
  • the ratio between the lipid components and the nucleic acid molecules of the LNP composition is sufficient for (i) formation of LNPs with desired characteristics, e.g., size, charge, and (ii) delivery of a sufficient dose of nucleic acid at a dose of the lipid component(s) that is tolerable for in vivo administration as readily ascertained by one of skill in the art.
  • a nanoparticle may be targeted to a particular cell, tissue, and/or organ using a targeting moiety.
  • a nanoparticle comprises a targeting moiety.
  • targeting moieties include ligands, cell surface receptors, glycoproteins, vitamins (e.g., riboflavin) and antibodies (e.g., full-length antibodies, antibody fragments (e.g., Fv fragments, single chain Fv (scFv) fragments, Fab' fragments, or F(ab')2 fragments), single domain antibodies, camelid antibodies and fragments thereof, human antibodies and fragments thereof, monoclonal antibodies, and multispecific antibodies (e.g., bispecific antibodies)).
  • the targeting moiety may be a polypeptide.
  • the targeting moiety may include the entire polypeptide (e.g., peptide or protein) or fragments thereof.
  • a targeting moiety is typically positioned on the outer surface of the nanoparticle in such a manner that the targeting moiety is available for interaction with the target, for example, a cell surface receptor.
  • a variety of different targeting moieties and methods are known and available in the art, including those described, e.g., in Sapra et al., Prog. Lipid Res. 42(5):439-62, 2003 and Abra et al., J. Liposome Res. 12:1-3, 2002.
  • a lipid nanoparticle may include a surface coating of hydrophilic polymer chains, such as polyethylene glycol (PEG) chains (see, e.g., Allen et al., Biochimica et Biophysica Acta 1237: 99-108, 1995; DeFrees et al., Journal of the American Chemical Society 118: 6101-6104, 1996; Blume et al., Biochimica et Biophysica Acta 1149: 180-184, 1993; Klibanov et al., Journal of Liposome Research 2: 321-334, 1992; U.S. Pat. No.
  • a targeting moiety for targeting the lipid nanoparticle is linked to the polar head group of lipids forming the nanoparticle.
  • the targeting moiety is attached to the distal ends of the PEG chains forming the hydrophilic polymer coating (see, e.g., Klibanov et al., Journal of Liposome Research 2: 321-334, 1992; Kirpotin et al., FEBS Letters 388: 115-118, 1996).
  • Standard methods for coupling the targeting moiety or moieties may be used.
  • phosphatidylethanolamine which can be activated for attachment of targeting moieties, or derivatized lipophilic compounds, such as lipid-derivatized bleomycin, can be used.
  • Antibody-targeted liposomes can be constructed using, for instance, liposomes that incorporate protein A (see, e.g., Renneisen et al., J. Biol. Chem., 265:16337-16342, 1990 and Leonetti et al., Proc. Natl. Acad. Sci. (USA), 87:2448-2451, 1990).
  • Other examples of antibody conjugation are disclosed in U.S. Pat. No. 6,027,726.
  • targeting moieties can also include other polypeptides that are specific to cellular components, including antigens associated with neoplasms or tumors.
  • Polypeptides used as targeting moieties can be attached to the liposomes via covalent bonds (see, for example, Heath, Covalent Attachment of Proteins to Liposomes, 149 Methods in Enzymology 111-119 (Academic Press, Inc. 1987)).
  • Other targeting methods include the biotin-avidin system.
  • a lipid nanoparticle includes a targeting moiety that targets the lipid nanoparticle to a cell including, but not limited to, hepatocytes, colon cells, epithelial cells, hematopoietic cells, endothelial cells, lung cells, bone cells, stem cells, mesenchymal cells, neural cells, cardiac cells, adipocytes, vascular smooth muscle cells, cardiomyocytes, skeletal muscle cells, beta cells, pituitary cells, synovial lining cells, ovarian cells, testicular cells, fibroblasts, B cells, T cells, reticulocytes, leukocytes, granulocytes, and tumor cells (including primary tumor cells and metastatic tumor cells).
  • the targeting moiety targets the lipid nanoparticle to a hepatocyte.
  • the lipid nanoparticles described herein may be lipidoid-based. The synthesis of lipidoids has been extensively described and formulations containing these compounds are particularly suited for delivery of polynucleotides (see Mahon et al., Bioconjug Chem. 2010 21:1448-1454; Schroeder et al., J Intern Med. 2010 267:9-21; Akinc et al., Nat. Biotechnol. 2008 26:561-569; Love et al., Proc Natl Acad Sci USA.
  • lipidoid formulations for intramuscular or subcutaneous routes may vary significantly depending on the target cell type and the ability of formulations to diffuse through the extracellular matrix into the blood stream.
  • Use of lipidoid-formulated oligonucleotides to deliver the formulation to other cell types including, but not limited to, endothelial cells, myeloid cells, and muscle cells may not be similarly size-limited.
  • For effective delivery to myeloid cells, such as monocytes, lipidoid formulations may have a similar component molar ratio.
  • lipidoids and other components including, but not limited to, a neutral lipid (e.g., diacylphosphatidylcholine), cholesterol, a PEGylated lipid (e.g., PEG-DMPE), and a fatty acid (e.g., an omega-3 fatty acid) may be used to optimize the formulation of the mRNA or system for delivery to different cell types including, but not limited to, hepatocytes, myeloid cells, muscle cells, etc.
  • Exemplary lipidoids include, but are not limited to, DLin-DMA, DLin-K-DMA, DLin-KC2-DMA, 98N12-5, C12-200 (including variants and derivatives), DLin-MC3-DMA and analogs thereof.
  • lipidoid formulations for the localized delivery of nucleic acids to cells may also not require all of the formulation components which may be required for systemic delivery, and as such may comprise the lipidoid and the mRNA or system.
  • a system described herein may be formulated by mixing the mRNA or system, or individual components of the system, with the lipidoid at a set ratio prior to addition to cells.
  • In vivo formulations may require the addition of extra ingredients to facilitate circulation throughout the body.
  • a system or individual components of a system is added and allowed to integrate with the complex. The encapsulation efficiency is determined using a standard dye exclusion assay.
  • In vivo delivery of systems may be affected by many parameters, including, but not limited to, the formulation composition, nature of particle PEGylation, degree of loading, oligonucleotide to lipid ratio, and biophysical parameters such as particle size (Akinc et al., Mol Ther. 2009 17:872-879; herein incorporated by reference in its entirety).
  • small changes in the anchor chain length of poly(ethylene glycol) (PEG) lipids may result in significant effects on in vivo efficacy.
  • Formulations with the different lipidoids including, but not limited to, penta[3-(1-laurylaminopropionyl)]-triethylenetetramine hydrochloride (TETA-5LAP; aka 98N12-5, see Murugaiah et al., Analytical Biochemistry, 401:61 (2010)), C12-200 (including derivatives and variants), MD1, DLin-DMA, DLin-K-DMA, DLin-KC2-DMA and DLin-MC3-DMA can be tested for in vivo activity.
  • the lipidoid referred to herein as "98N12-5" is disclosed by Akinc et al., Mol Ther. 2009 17:872-879.
  • LNPs in which a nucleic acid is entrapped within the lipid portion of the particle and is protected from degradation can be formed by any method known in the art including, but not limited to, a continuous mixing method, a direct dilution process, and an in-line dilution process. Additional techniques and methods suitable for the preparation of the LNPs described herein include coacervation, microemulsions, supercritical fluid technologies, and phase-inversion temperature (PIT) techniques.
  • the LNPs used herein are produced via a continuous mixing method, e.g., a process that includes providing an aqueous solution comprising a nucleic acid described herein in a first reservoir, providing an organic lipid solution in a second reservoir (wherein the lipids present in the organic lipid solution are solubilized in an organic solvent, e.g., a lower alkanol such as ethanol), and mixing the aqueous solution with the organic lipid solution such that the organic lipid solution mixes with the aqueous solution so as to substantially instantaneously produce a lipid vesicle (e.g., liposome) encapsulating the nucleic acid molecule within the lipid vesicle.
  • the LNPs used herein are produced via a direct dilution process that includes forming a lipid vesicle (e.g., liposome) solution and immediately and directly introducing the lipid vesicle solution into a collection vessel containing a controlled amount of dilution buffer.
  • the collection vessel includes one or more elements configured to stir the contents of the collection vessel to facilitate dilution.
  • the amount of dilution buffer present in the collection vessel is substantially equal to the volume of lipid vesicle solution introduced thereto.
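  • The effect of a direct dilution step on the residual organic solvent can be sketched with simple volume arithmetic: when the dilution buffer volume is substantially equal to the volume of lipid vesicle solution, the solvent fraction is roughly halved. The function name and the 40% starting ethanol fraction below are hypothetical values used only for illustration, and the calculation assumes additive volumes.

```python
def diluted_fraction(initial_fraction: float,
                     vesicle_volume: float,
                     buffer_volume: float) -> float:
    """Solvent volume fraction after adding solvent-free dilution buffer.

    Assumes ideal (additive) volumes and that the buffer contains no solvent.
    """
    if vesicle_volume <= 0 or buffer_volume < 0:
        raise ValueError("volumes must be positive (buffer may be zero)")
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    return initial_fraction * vesicle_volume / (vesicle_volume + buffer_volume)


if __name__ == "__main__":
    initial_ethanol = 0.40   # hypothetical starting ethanol fraction (v/v)
    vesicle_ml = 10.0        # hypothetical lipid vesicle solution volume
    buffer_ml = 10.0         # buffer volume equal to the vesicle solution volume
    final = diluted_fraction(initial_ethanol, vesicle_ml, buffer_ml)
    print(f"Ethanol fraction after 1:1 dilution: {final:.0%}")
```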
  • the LNPs are produced via an in-line dilution process in which a third reservoir containing dilution buffer is fluidly coupled to a second mixing region.
  • the lipid vesicle (e.g., liposome) solution formed in a first mixing region is immediately and directly mixed with dilution buffer in the second mixing region.
  • a “vector” is a tool that allows or facilitates the transfer of an entity from one environment to another. It is a replicon, such as a plasmid, phage, or cosmid, into which another DNA segment may be inserted so as to bring about the replication of the inserted segment.
  • a vector is capable of replication when associated with the proper control elements.
  • the term “vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked.
  • Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends or no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art.
  • plasmid re