WO2019126589A1 - Micelles for complexation and delivery of proteins and nucleic acids - Google Patents

Micelles for complexation and delivery of proteins and nucleic acids

Info

Publication number
WO2019126589A1
Authority
WO
WIPO (PCT)
Prior art keywords
transposase
poly
seq
sequence
composition
Prior art date
Application number
PCT/US2018/066961
Other languages
French (fr)
Inventor
P. Peter Ghoroghchian
Gabriela Romero URIBE
Eric Ostertag
Original Assignee
Poseida Therapeutics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Poseida Therapeutics, Inc.
Publication of WO2019126589A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/87: Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/88: Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation using microencapsulation, e.g. using amphiphile liposome vesicle
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00: Medicinal preparations containing peptides
    • A61K38/16: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • A61K38/43: Enzymes; Proenzymes; Derivatives thereof
    • A61K38/45: Transferases (2)
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00: Medicinal preparations containing peptides
    • A61K38/16: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • A61K38/43: Enzymes; Proenzymes; Derivatives thereof
    • A61K38/46: Hydrolases (3)
    • A61K38/465: Hydrolases (3) acting on ester bonds (3.1), e.g. lipases, ribonucleases
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K47/00: Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient
    • A61K47/50: Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates
    • A61K47/69: Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the conjugate being characterised by physical or galenical forms, e.g. emulsion, particle, inclusion complex, stent or kit
    • A61K47/6905: Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the conjugate being characterised by physical or galenical forms, e.g. emulsion, particle, inclusion complex, stent or kit the form being a colloid or an emulsion
    • A61K47/6907: Medicinal preparations characterised by the non-active ingredients used, e.g. carriers or inert additives; Targeting or modifying agents chemically bound to the active ingredient the non-active ingredient being chemically bound to the active ingredient, e.g. polymer-drug conjugates the conjugate being characterised by physical or galenical forms, e.g. emulsion, particle, inclusion complex, stent or kit the form being a colloid or an emulsion the form being a microemulsion, nanoemulsion or micelle
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y: ENZYMES
    • C12Y207/00: Transferases transferring phosphorus-containing groups (2.7)
    • C12Y207/07: Nucleotidyltransferases (2.7.7)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y: ENZYMES
    • C12Y301/00: Hydrolases acting on ester bonds (3.1)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y: ENZYMES
    • C12Y301/00: Hydrolases acting on ester bonds (3.1)
    • C12Y301/21: Endodeoxyribonucleases producing 5'-phosphomonoesters (3.1.21)
    • C12Y301/21004: Type II site-specific deoxyribonuclease (3.1.21.4)
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K9/00: Medicinal preparations characterised by special physical form
    • A61K9/10: Dispersions; Emulsions
    • A61K9/107: Emulsions; Emulsion preconcentrates; Micelles
    • A61K9/1075: Microemulsions or submicron emulsions; Preconcentrates or solids thereof; Micelles, e.g. made of phospholipids or block copolymers
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K9/00: Medicinal preparations characterised by special physical form
    • A61K9/48: Preparations in capsules, e.g. of gelatin, of chocolate
    • A61K9/50: Microcapsules having a gas, liquid or semi-solid filling; Solid microparticles or pellets surrounded by a distinct coating layer, e.g. coated microspheres, coated drug crystals
    • A61K9/51: Nanocapsules; Nanoparticles
    • A61K9/5107: Excipients; Inactive ingredients
    • A61K9/513: Organic macromolecular compounds; Dendrimers
    • A61K9/5138: Organic macromolecular compounds; Dendrimers obtained by reactions only involving carbon-to-carbon unsaturated bonds, e.g. polyvinyl pyrrolidone, poly(meth)acrylates
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K9/00: Medicinal preparations characterised by special physical form
    • A61K9/48: Preparations in capsules, e.g. of gelatin, of chocolate
    • A61K9/50: Microcapsules having a gas, liquid or semi-solid filling; Solid microparticles or pellets surrounded by a distinct coating layer, e.g. coated microspheres, coated drug crystals
    • A61K9/51: Nanocapsules; Nanoparticles
    • A61K9/5107: Excipients; Inactive ingredients
    • A61K9/513: Organic macromolecular compounds; Dendrimers
    • A61K9/5161: Polysaccharides, e.g. alginate, chitosan, cellulose derivatives; Cyclodextrin

Definitions

  • the present invention is directed to compositions and methods for delivery of proteins and nucleic acids, for use in, for example, targeted gene modification.
  • the disclosure provides a non-viral composition for the delivery of at least one RNA or DNA sequence.
  • the RNA sequence is an mRNA sequence.
  • the RNA sequence is a regulatory RNA sequence.
  • the RNA sequence is an oligonucleotide (e.g., that may bind a complementary DNA or RNA in a cell).
  • the RNA sequence is an mRNA sequence that alters the biochemistry of the cell (e.g., an mRNA encoding protein that tamps down the response against DNA, such that administration of a nanoparticle/RNA may be followed with a second nanoparticle/DNA or electroporation/DNA without DNA toxicity).
  • the disclosure provides a non-viral composition for the delivery of at least one sequence encoding a therapeutic protein to a cell.
  • the at least one sequence encoding a therapeutic protein is an mRNA sequence.
  • the at least one sequence encoding a therapeutic protein is a cDNA sequence.
  • the at least one sequence encoding a therapeutic protein is a non-naturally occurring sequence, including, but not limited to a sequence comprising at least one modified nucleotide, a sequence comprising a recombinant sequence, a sequence comprising a chimeric sequence, a sequence comprising a sequence encoding a self-cleaving peptide, a sequence comprising a sequence encoding an inducible proapoptotic polypeptide, a sequence comprising a sequence encoding a selection marker (e.g., a sequence encoding a DHFR mutein), or any combination thereof.
  • Exemplary therapeutic proteins may be soluble, secreted, cell-surface linked, transmembrane or any combination thereof.
  • Exemplary therapeutic proteins may be non-naturally occurring.
  • Exemplary therapeutic proteins may be naturally occurring.
  • the disclosure provides a non-viral composition for the delivery of at least one gene editing molecule to a cell.
  • the disclosure provides a composition for the delivery of at least one gene editing molecule to a cell, the composition comprising: a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer has at least one hydrophobic block, at least one hydrophilic block, and at least one cationically-charged block, wherein: the at least one cationically-charged block complexes with the at least one gene editing molecule; and the at least one cationically-charged block is capable of intracellular delivery and release of the at least one gene editing molecule.
  • the cationically-charged block is constitutively positively charged at a physiological pH equal to or greater than 6.0. In some embodiments, the cationically-charged block is constitutively positively charged at a physiological pH of between 7.0 and 7.8. In some embodiments, the cationically-charged block comprises polyallylamine (PAA), polyamidoamine (PAMAM), polydimethylaminoethylmethacrylate (PDMAEMA), poly(2(diisopropylamino)ethyl methacrylate), polyethylenimine (PEI), branched polyethylenimine (bPEI), poly(l-lysine) (PLL), poly(l-arginine) (PLA),
  • PAA polyallylamine
  • PAMAM polyamidoamine
  • PDMAEMA polydimethylaminoethylmethacrylate
  • PEI polyethylenimine
  • bPEI branched polyethylenimine
  • PLA poly(l
  • POEAA poly(oligoethanamino)amide
  • chitosan succinated chitosan, 6-N,N,N-trimethyltriazole-chitosans, a polyphosphoramidate, a polyhydroxyalkanoate, poly(calixane), poly(cyclodextrin), poly(aspartamide) (pASP(DET)), a poly(aminoamide), Poly(methacrylic acid-g-ethylene glycol) (P(MAA-g-EG)), or Poly(N,N-dialkylaminoethylmethacrylates) (PDAAEMA).
  • P(MAA-g-EG) Poly(methacrylic acid-g-ethylene glycol)
  • PDAAEMA Poly(N,N-dialkylaminoethylmethacrylates)
  • the cationically-charged block has a neutral charge at a physiological pH of between 7.0 and 7.8 and a positive charge at a reduced pH of between 6.0 and 7.0. In some embodiments, the cationically-charged block has a neutral charge at a physiological pH of between 7.0 and 7.8 and a positive charge at a reduced pH of between 6.0 and 6.5. In some embodiments, the cationically-charged block has a neutral charge at a physiological pH of between 7.0 and 7.8 and a positive charge at a reduced pH of between 6.2 and 6.4.
  • the cationically-charged block comprises a substituted polyimidazole, a poly(L-histidine), a poly(beta)amino ester (PBAE), a poly(allylamine) hydrochloride (PAH), a poly(meth)acrylamide, or a poly(styrene-alt-maleic anhydride) (pSMA).
  • the at least one gene editing molecule comprises one or more of a protein or a nucleic acid encoding for a protein, wherein the protein is selected from the group comprising a transposase, a nuclease, and an integrase.
  • the nuclease is selected from the group comprising: a CRISPR associated protein 9 (Cas9); a type IIS restriction enzyme; a transcription activator-like effector nuclease (TALEN); and a zinc finger nuclease (ZFN).
  • the type IIS restriction enzyme comprises Clo051.
  • the Cas9 comprises a dCas9 or dSaCas9.
  • the at least one gene editing molecule comprises one or more transposable elements.
  • the one or more transposable elements comprises one or more of a piggyBac transposon, a piggyBac-like transposon, a Sleeping Beauty transposon, a Helraiser transposon, a Tol2 transposon or a LINE-1 (L1) transposon.
  • the one or more transposable elements comprise a piggyBac transposon.
  • the one or more transposable elements comprise a piggyBac-like transposon.
  • the one or more transposable elements comprise a Sleeping Beauty transposon. In some embodiments, the one or more transposable elements comprise a Helraiser transposon. In some embodiments, the one or more transposable elements comprise a Tol2 transposon.
  • the at least one gene editing molecule further comprises one or more transposase(s).
  • the one or more transposase(s) comprises a piggyBac transposase, a super piggyBac transposase (SPB), a piggyBac-like transposase, a Sleeping Beauty transposase, a hyperactive Sleeping Beauty transposase (SB100X), a Helitron transposase, a Tol2 transposase or a transposase capable of transposing a LINE-1 (L1) transposon.
  • the transposase comprises a piggyBac transposase or a super piggyBac transposase (SPB). In some embodiments, the transposase comprises a piggyBac-like transposase. In some embodiments, the transposase comprises a Sleeping Beauty transposase or a hyperactive Sleeping Beauty transposase (SB100X). In some embodiments, the transposase comprises a Helitron transposase. In some embodiments, the transposase comprises a Tol2 transposase.
  • the disclosure provides a kit, comprising: a pharmaceutical composition for delivering at least one gene editing molecule to a cell, the composition comprising: a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer has at least one hydrophobic block, at least one hydrophilic block, and at least one cationically-charged block, wherein: the at least one cationically-charged block complexes with the at least one gene editing molecule; and the at least one cationically-charged block is capable of intracellular delivery and release of the at least one gene editing molecule; and an implement for administering the pharmaceutical composition by a systemic route or by a local route.
  • the systemic route comprises intravenous delivery, inhalation, transmucosal delivery, rectal delivery, vaginal delivery, subcutaneous delivery,
  • the local route comprises topical delivery, transdermal delivery,
  • intracerebrospinal delivery intraspinal delivery, direct delivery to the central nervous system (CNS), intraocular delivery, intravitreal delivery, intramuscular delivery, or intraosseous delivery.
  • CNS central nervous system
  • the cationically-charged block is constitutively positively charged at a physiological pH equal to or greater than 6.0. In some embodiments, the cationically-charged block is constitutively positively charged at a physiological pH of between 7.0 and 7.8.
  • the cationically-charged block comprises polyallylamine (PAA), polyamidoamine (PAMAM), polydimethylaminoethylmethacrylate (PDMAEMA), poly(2(diisopropylamino)ethyl methacrylate), polyethylenimine (PEI), branched
  • PAA polyallylamine
  • PAMAM polyamidoamine
  • PDMAEMA polydimethylaminoethylmethacrylate
  • PEI polyethylenimine
  • bPEI branched polyethylenimine
  • PLA poly(l-arginine)
  • POEAA poly(oligoethanamino)amide
  • chitosan succinated chitosan, 6-N,N,N-trimethyltriazole-chitosans, a polyphosphoramidate, a polyhydroxyalkanoate, poly(calixane), poly(cyclodextrin), poly(aspartamide) (pASP(DET)), a poly(aminoamide), Poly(methacrylic acid-g-ethylene glycol) (P(MAA-g-EG)), or Poly(N,N-dialkylaminoethylmethacrylates) (PDAAEMA).
  • P(MAA-g-EG) Poly(methacrylic acid-g-ethylene glycol)
  • PDAAEMA Poly(N,N-dialkylaminoethylmethacrylates)
  • the cationically-charged block has a neutral charge at a physiological pH of between 7.0 and 7.8 and a positive charge at low pH of between 3.0 and 6.0.
  • the cationically-charged block comprises a substituted polyimidazole, a poly(L-histidine), a poly(beta)amino ester (PBAE), a poly(allylamine) hydrochloride (PAH), a poly(meth)acrylamide, or a poly(styrene-alt-maleic anhydride) (pSMA).
  • compositions of the disclosure for the modification of a target sequence, comprising contacting the composition and the target sequence under conditions suitable for nuclease activity.
  • a target cell comprises the target sequence.
  • the target sequence is a DNA sequence.
  • the DNA sequence is a genomic sequence.
  • the target sequence is an RNA sequence.
  • the target cell is ex vivo or in vivo.
  • compositions of the disclosure for the treatment of a disease or disorder, comprising administering a therapeutically-effective amount of the composition to a subject in need thereof.
  • the disclosure provides a method of modifying a target sequence, comprising contacting any one of the compositions of the disclosure and the target sequence under conditions suitable for nuclease activity.
  • the disclosure provides a method of modifying a target sequence, comprising contacting any one of the compositions of the disclosure and the target sequence under conditions suitable for transposase activity.
  • a target cell comprises the target sequence.
  • the target sequence is a DNA sequence.
  • the DNA sequence is a genomic sequence.
  • the target sequence is an RNA sequence.
  • the target cell is ex vivo or in vivo.
  • the disclosure provides a method of treating a disease or disorder, comprising administering a therapeutically-effective amount of any one of the compositions of the disclosure to a subject in need thereof.
  • Figure 1A is a table depicting PLA polymerization times, micelle formation techniques, and mean diameter sizes of nanoparticles in the diblock copolymer micelle model of Example 1. As shown, using the particular test combination of PLA polymerization for 6 hours (25 PLA units) and sonication of the copolymers in phosphate-buffered saline (PBS), the mean diameter of the resulting micelles was 247 nm.
  • PBS phosphate-buffered saline
  • Figure 1B is a graph depicting the size distribution for the PEO-b-PLA micelles generated using the same test combination (i.e., 6 hours PLA polymerization and sonication in PBS) shown in Figure 1A and Example 1.
  • Figure 1C is a graph showing the z-potential distribution of the PEO-b-PLA micelles generated using the same test combination (i.e., 6 hours PLA polymerization and sonication in PBS) shown in Figures 1A-B and Example 1. As demonstrated, the z-potential of the tested PEO-b-PLA micelle is about -12.20 mV.
  • Figure 2 is a graph depicting the absorbance of light at a wavelength of 560 nm by the micelles with different concentrations of the DIL dye in solution.
  • the graph may be used to quantify how much DIL dye can be bound to the hydrophobic portion of the micelles. Specifically, it was found that 1 mg of the PEO-b-PLA micelles was able to load around 4 mM of the DIL dye.
  • Figure 3A is a table depicting PHIS polymerization times, micelle formation techniques, and mean diameter sizes of the resulting nanoparticles of the diblock copolymer micelle model of Example 1.
  • TFR thin film rehydration
  • DCM dichloromethane
  • Figure 3B is a graph showing the size distribution (around 248 nm in diameter) for the PEO-b-PLA-b-PHIS micelles generated using the same preparation parameters (i.e., 6 hours PLA polymerization, 48 hours PHIS polymerization, and THS in DCM).
  • Figure 3C is a graph of the z-potential distribution of the PEO-b-PLA-b-PHIS micelles generated using the same preparation parameters (i.e., 6 hours PLA polymerization, 48 hours PHIS polymerization, and THS in DCM). As demonstrated, the z-potential of the tested PEO-b-PLA-b-PHIS micelle is about -18 mV.
  • Figure 4 is a table depicting the variation in properties of the PEO-b-PLA-b-PHIS micelles at different pH values. As shown, the micelles were the smallest at a pH of around 7, with a mean diameter size of around 316 nm. When the pH was substantially raised or lowered, the mean diameter size increased. At the lower pH, such increase is likely due to the micelle swelling based on poly(histidine) chains gaining positive charges and growing.
  • Figure 5 is a photograph of a gel electrophoresis depicting DNA + mRNA encapsulation and release from PEO-PLA-PHIS particles.
  • 1% agarose gel electrophoresis was used to demonstrate the encapsulation of DNA and mRNA into PEO-PLA-PHIS particles (well 1).
  • Exposure of particles to an acidic pH of 4.6 causes protonation of PHIS and disruption of particle conformation, resulting in plasmid release, as observed in the DNA band from well 2 in the gel image. Plasmid release can also be triggered by surfactant exposure from the loading dye containing SDS, as can be seen in well 3.
  • the DNA band from release was compared to the band resulting from running DNA alone in the gel (well 4).
  • Figure 6A is a graph of the average diameter of PEO-b-PLA-b-PHIS micelles complexed with BSA as a function of pH as discussed in Example 1.
  • Figure 6B is a graph of the amount of released BSA as a function of pH as discussed in Example 1.
  • Figure 7 is a series of photographs and FACS plots showing the transfection efficiency results from Example 1.
  • HepG2 cells were seeded overnight in 24-well plates at 50,000 cells/well. Cells were exposed to different formulations in Opti-MEM Media (DNA alone, Lipofectamine + DNA + mRNA and PEO-PLA-PHIS + DNA + mRNA) at a final concentration of 500 ng of DNA per well. At 48 hours post-incubation, cells were analyzed for GFP expression by microscopy and flow cytometry to determine the transfection efficiency for each condition.
  • Figure 8 is a graph depicting piggyBac delivery via polymeric micelles. Evaluation of complexation of PEO-b-PLA-b-PHIS micelles with a pEF-GFP DNA vector (GFP); a GFP-piggyBac transposon (GFP-Transposon), which was delivered with a second micelle that was complexed with piggyBac transposase mRNA; or a DNA vector containing luciferase on a Sleeping Beauty transposon as well as the Sleeping Beauty transposase. Micelles were purified on a GPC column and a second fraction was detected as micelles containing DNA. Molar ratio of polymer to DNA cargo was 20:1.
  • Figure 9 is a graph depicting piggyBac delivery via polymeric micelles. Evaluation of in vitro toxicity of PEO-b-PLA-b-PHIS micelles at different concentrations. Micelle toxicity in HepG2 cells was evaluated by an MTT assay. Empty micelles were incubated with cells over 3 days at the typical transfection concentration of DNA (1%) and at 10x the typical concentration (i.e., 10%).
  • Figure 10 is a graph depicting piggyBac delivery via polymeric micelles. Evaluation of transfection efficiency in HepG2 cells. HepG2 cells were incubated with plasmid or micelle formulation containing plasmid for 3 days. Flow cytometry was used to detect transfected cells.
  • nucleases may be used to generate DNA double strand breaks (DSBs) in precise genomic locations, and cellular repair machinery then exploited to silence or replace nucleotides and/or genes.
  • DSBs DNA double strand breaks
  • Targeted editing of nucleic acid sequences is a highly promising approach for the study of gene function and also has the potential to provide new therapies for human genetic diseases.
  • Current gene editing tools include, for example, various enzymes, such as endonucleases, and mobile genetic elements, such as transposons.
  • These tools provide the potential, for example, to remove, replace, or add nucleotide bases to native DNA in order to correct or induce a point mutation, as well as to change a nucleotide base in order to correct or induce a frame shift mutation. Further, such tools may enable removing, inserting or modifying pieces of DNA containing a plurality of codons as part of one or more gene(s).
  • AAV-based gene therapy vectors form episomal concatemers in the host cell nucleus. In non-dividing cells, these concatemers remain intact for the life of the host cell. In dividing cells, AAV DNA is lost through cell division, since the episomal DNA is not replicated along with the host cell DNA. Random integration of AAV DNA into the host genome is detectable but occurs at very low frequency. AAVs also present very low immunogenicity, seemingly restricted to generation of neutralizing antibodies, while they induce no clearly defined cytotoxic response. These features, along with the ability to infect quiescent cells, demonstrate that AAVs are dominant over adenoviruses as vectors for human gene therapy.
  • viral vectors including AAVs
  • AAVs a virus genome
  • kb kilobase
  • viruses to deliver gene editing tools may include targeting only dividing cells, random insertion into the host genome, risk of replication, and possible host immune reaction, as well as limitations on payload size imposed by the viral capsid.
  • non-viral vectors are typically easy to manufacture, less likely to produce immune reactions, and do not produce replication reactions compared to viral vectors;
  • nanocapsules in which a slurry of free DNA/RNA/protein is wrapped with a polymer or peptide; “bioconjugates” (e.g., lipids, synthetic macromolecules, etc.) that target the nucleic acid, including via binding to specific proteins expressed by target cells to enable cellular internalization; and “lipid-based vehicles” (e.g., liposomes, lipid-based nanoparticles, etc.) modified with cationic/ionizable amphiphilic polymers to self-assemble with the nucleic acids based on charge.
  • bioconjugates e.g., lipids, synthetic macromolecules, etc.
  • lipid-based vehicles e.g., liposomes, lipid-based nanoparticles, etc.
  • Polymeric micelles have been extensively studied for their potential applications in the drug delivery field.
  • Polymeric micelles are formed by amphiphilic block copolymers, which can self-assemble into nano-sized core/shell structures in an aqueous environment via hydrophobic or ion pair interactions between polymer segments.
  • Such micelles generally are able to solubilize the insoluble drugs, avoid non-selective uptake by the reticuloendothelial system (RES), and utilize the enhanced permeability and retention (EPR) effect for passive targeting. In this manner, a drug's solubility and pharmacokinetic profiles may be significantly improved through the use of micelles.
  • RES reticuloendothelial system
  • EPR enhanced permeability and retention
  • Polymeric micelles used for drug delivery have in some cases shown capabilities in attenuating nonspecific toxicities and enhancing drug delivery to desired sites resulting in improved therapeutic efficacy.
  • Synthetic amphiphilic copolymers may be beneficial tools for drug delivery because they are highly versatile in terms of composition and architecture.
  • micelles may be customized, for example, by modifying the hydrophilic block using functional groups.
  • Such functional groups may include, for example, targeting ligands, such as a monoclonal antibody, or intracellular drug delivery moieties, such as cell-penetrating peptides (CPPs), etc.
  • nanoparticles have been reported to accumulate preferentially in certain regions due to passive and/or active targeting, their inefficient drug release can be another barrier that may significantly lower a drug's efficacy.
  • surface PEO chains may inhibit the cellular uptake of long circulating nanoparticles following intracellular events. Therefore, quicker and more controllable payload release remains a target for nanoparticle systems such as micelles.
  • nucleic acids such as mRNA and/or large DNA plasmids
  • subject and “patient” are used interchangeably herein to refer to human patients, whereas the term “subject” may also refer to any animal. It should be understood that in various embodiments, the subject may be a mammal, a non-human animal, a canine and/or a vertebrate.
  • monomeric units is used herein to mean a unit of polymer molecule containing the same or similar number of atoms as one of the monomers.
  • Monomeric units as used in this specification, may be of a single type (homogeneous) or a variety of types (heterogeneous).
  • polymer is used according to its ordinary meaning of a macromolecule comprising connected monomeric molecules.
  • amphiphilic is used herein to mean a substance containing both polar (water-soluble) and hydrophobic (water-insoluble) groups.
  • an effective amount is used herein to refer to an amount of a compound, material, or composition effective to achieve a particular biological result such as, but not limited to, biological results disclosed, described, or exemplified herein. Such results may include, but are not limited to, the effective reduction of symptoms associated with any of the disease states mentioned herein, as determined by any means suitable in the art.
  • the effective amount of an agent e.g., a nuclease, an integrase, a transposase, a recombinase, a hybrid protein, a fusion protein, a protein dimer, a complex of a protein (or protein dimer) and a polynucleotide, or a polynucleotide, may vary depending on various factors as, for example, on the desired biological response, the specific allele, genome, target site, cell, or tissue being targeted, and the agent being used.
  • the term "membrane" is used herein to mean a spatially distinct collection of molecules that defines a two-dimensional surface in three-dimensional space, and thus separates one space from another in at least a local sense.
  • active agent is used herein to refer to any protein, peptide, sugar, saccharide, nucleoside, inorganic compound, lipid, nucleic acid, small synthetic chemical compound, or organic compound that appreciably alters or affects the biological system into which it is introduced.
  • vehicle is used herein to refer to agents with no inherent therapeutic benefit but when combined with an active agent for the purposes of delivery into a cell result in modification of the active agent's properties, including but not limited to its mechanism or mode of in vivo delivery, its concentration, bioavailability, absorption, distribution and elimination for the benefit of improving product efficacy and safety, as well as patient convenience and compliance.
  • carrier is used herein to describe a delivery vehicle that is used to incorporate a pharmaceutically active agent for the purposes of drug delivery.
  • homopolymer is used herein to refer to a polymer derived from one monomeric species of polymer.
  • copolymer is used herein to refer to a polymer derived from two (or more) monomeric species of polymer, as opposed to a homopolymer where only one monomer is used. Since a copolymer consists of at least two types of constituent units (also structural units), copolymers may be classified based on how these units are arranged along the chain.
  • block copolymers is used herein to refer to a copolymer that includes two or more homopolymer subunits linked by covalent bonds in which the union of the homopolymer subunits may require an intermediate non-repeating subunit, known as a junction block.
  • Block copolymers with two or three distinct blocks are referred to herein as “diblock copolymers” and “triblock copolymers,” respectively.
  • loading capacity is used herein to refer to the weight of a particular compound within a carrier divided by the total weight of carrier.
  • complexation efficiency and “loading efficiency” are interchangeably used herein to refer to the weight of a particular compound that is complexed with and/or incorporated within a carrier suspension divided by the weight of the original compound in solution prior to forming a complex (expressed as a %).
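The two definitions above reduce to simple ratios. The following is a minimal sketch of that arithmetic; the function names and the choice of units are illustrative assumptions, not part of the disclosure, and both weights passed to a function must be in the same unit.

```python
def loading_capacity(compound_weight: float, carrier_weight: float) -> float:
    """Weight of the compound within the carrier divided by the total carrier weight."""
    if carrier_weight <= 0:
        raise ValueError("carrier_weight must be positive")
    return compound_weight / carrier_weight


def loading_efficiency(complexed_weight: float, initial_weight: float) -> float:
    """Percent of the compound originally in solution that ends up complexed."""
    if initial_weight <= 0:
        raise ValueError("initial_weight must be positive")
    return 100.0 * complexed_weight / initial_weight


if __name__ == "__main__":
    # Hypothetical numbers: 2 mg of cargo in 10 mg of carrier,
    # and 8 mg complexed out of 10 mg initially in solution.
    print(loading_capacity(2.0, 10.0))
    print(loading_efficiency(8.0, 10.0))
```

Loading capacity is returned as a fraction and loading efficiency as a percentage, mirroring the wording of the two definitions.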
  • nucleic acid and “nucleic acid molecule” are used interchangeably herein to refer to a compound with a nucleobase and an acidic moiety, e.g., a nucleoside, a nucleotide, or a polymer of nucleotides.
  • polymeric nucleic acids, e.g., nucleic acid molecules comprising three or more nucleotides, are linear molecules in which adjacent nucleotides are linked to each other via a phosphodiester or a phosphorothioate linkage.
  • nucleic acid refers to individual nucleic acid residues (e.g. nucleotides and/or nucleosides).
  • nucleic acid refers to an oligonucleotide chain comprising three or more individual nucleotide residues.
  • oligonucleotide and “polynucleotide” can be used interchangeably to refer to a polymer of nucleotides (e.g., a string of at least three nucleotides).
  • nucleic acid encompasses RNA as well as single and/or double-stranded DNA. Nucleic acids may be naturally occurring, for example, in the context of a genome, a transcript, an mRNA, tRNA, rRNA, siRNA, snRNA, a plasmid, cosmid, chromosome, chromatid, or other naturally occurring nucleic acid molecule.
  • a nucleic acid molecule may be a non-naturally occurring molecule, e.g., a recombinant DNA or RNA, an artificial chromosome, an engineered genome, or fragment thereof, or a synthetic DNA, RNA, DNA/RNA hybrid, or including non-naturally occurring nucleotides or nucleosides.
  • nucleic acid examples include nucleic acid analogs, e.g., analogs having other than a phosphodiester backbone including a phosphorothioate linkage.
  • Nucleic acids can be purified from natural sources, produced using recombinant expression systems and optionally purified, chemically synthesized, etc. Where appropriate, e.g., in the case of chemically synthesized molecules, nucleic acids can comprise nucleoside analogs such as analogs having chemically modified bases or sugars, and backbone modifications. A nucleic acid sequence is presented in the 5' to 3' direction unless otherwise indicated.
  • a nucleic acid is or comprises natural nucleosides (e.g. adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine); nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoa
  • nuclease is used interchangeably herein to refer to an enzyme that forms a complex with (e.g., binds or associates with) one or more nucleic acid to provide a target for cleavage, or indirect guide to another site for cleavage.
  • treatment refers to a clinical intervention aimed to reverse, alleviate, delay the onset of, or inhibit the progress of a disease or disorder, or one or more symptoms thereof, as described herein.
  • treatment may be administered after one or more symptoms have developed and/or after a disease has been diagnosed.
  • treatment may be administered in the absence of symptoms, e.g., to prevent or delay onset of a symptom or inhibit onset or progression of a disease.
  • treatment may be administered to a susceptible individual prior to the onset of symptoms (e.g., in light of a history of symptoms and/or in light of genetic or other susceptibility factors). Treatment may also be continued after symptoms have resolved, for example, to prevent or delay their recurrence.
  • the disclosure provides a composition for delivering at least one gene editing molecule to a cell, the composition comprising: a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer having at least one hydrophobic block, at least one hydrophilic block, and at least one poly(L-histidine) block, wherein: the at least one poly(L-histidine) block complexes with the at least one gene editing molecule; and the at least one poly(L-histidine) block is capable of a pH dependent release of the at least one gene editing molecule.
  • the at least one gene editing molecule comprises one or more of a protein or a nucleic acid encoding for a protein.
  • the at least one gene editing molecule comprises a protein and the protein is selected from the group comprising a transposase, a nuclease, and an integrase.
  • the at least one gene editing molecule comprises one or more of a protein or a nucleic acid encoding for a protein, wherein the protein is selected from the group comprising a transposase, a nuclease, and an integrase.
  • the nuclease or the protein having nuclease activity is selected from the group comprising: a CRISPR associated protein 9 (Cas9); a type IIS restriction enzyme; a transcription activator-like effector nuclease (TALEN); and a zinc finger nuclease (ZFN).
  • the gene editing molecule comprises a DNA-binding domain and a nuclease.
  • the DNA-binding domain comprises a guide RNA.
  • the DNA-binding domain comprises a DNA-binding domain of a TALEN.
  • the DNA-binding domain comprises a DNA-binding domain of a zinc-finger nuclease.
  • the CRISPR associated protein 9 is an inactivated Cas9 (dCas9).
  • the CRISPR associated protein 9 (Cas9) is truncated or short Cas9.
  • the CRISPR associated protein 9 (Cas9) is a short and inactivated Cas9 (dSaCas9).
  • the dSaCas9 of the disclosure comprises the mutations D10A and N580A (underlined and bolded) which inactivate the catalytic site.
  • the dSaCas9 isolated or derived from Staphylococcus aureus
  • the Cas9 of the disclosure comprises a dCas9.
  • the Cas9 comprises a dCas9 isolated or derived from Streptococcus pyogenes.
  • the dCas9 comprises a substitution at position 10 and/or 840 of the amino acid sequence of the dCas9 which inactivate the catalytic site. In some embodiments, these substitutions are D10A and H840A.
  • the amino acid sequence of the dCas9 isolated or derived from Streptococcus pyogenes
  • the amino acid sequence of the dCas9 (isolated or derived from Streptococcus pyogenes) comprises the sequence of:
  • the nuclease domain may comprise, consist essentially of or consist of a dCas9 or a dSaCas9 and a type IIS endonuclease.
  • the nuclease domain may comprise, consist essentially of or consist of a dSaCas9 and a type IIS endonuclease, including, but not limited to, AciI, MnlI, AlwI, BbvI, BccI, BceAI, BsmAI, BsmFI, BspCNI, BsrI, BtsCI, HgaI, HphI, HpyAV, MboII, MlyI, PleI, SfaNI, AcuI, BciVI, BfuAI, BmgBI, BmrI, BpmI, BpuEI, BsaI, BseRI, BsgI, BsmI, BspMI, BsrBI, B
  • the nuclease domain may comprise, consist essentially of or consist of a dCas9 or dSaCas9 and Clo051.
  • exemplary Clo051 nuclease domain may comprise, consist essentially of or consist of, the amino acid sequence of:
  • An exemplary dCas9-Clo051 nuclease domain may comprise, consist essentially of or consist of, the amino acid sequence of (Clo051 sequence underlined, linker bold italics, dCas9 (isolated or derived from Streptococcus pyogenes) sequence in italics):
  • the type IIS restriction enzyme comprises one or more of AciI, MnlI, AlwI, BbvI, BccI, BceAI, BsmAI, BsmFI, BspCNI, BsrI, BtsCI, HgaI, HphI, HpyAV, MboII, MlyI, PleI, SfaNI, AcuI, BciVI, BfuAI, BmgBI, BmrI, BpmI, BpuEI, BsaI, BseRI, BsgI, BsmI, BspMI, BsrBI, BsrBI, BsrDI, BtgZI, BtsI, EarI, EciI, MmeI, NmeAIII, BbvCI, Bpu10I, BspQI, SapI, BaeI, BsaXI, CspCI, BfiI, MboII, Acc36I, FokI or Clo051.
  • Clo051 nuclease domain may comprise, consist essentially of or consist of, the amino acid sequence of:
  • the DNA binding domain or the nuclease comprises a sequence isolated or derived from a Ralstonia TALEN or from a Xanthomonas TALEN. In some embodiments, the DNA binding domain or the nuclease comprises a recombinant TALEN sequence derived from a Ralstonia TALEN, a Xanthomonas TALEN or a combination thereof.
  • the at least one gene editing molecule comprises one or more transposable element(s).
  • the one or more transposable element(s) comprise a circular DNA.
  • the one or more transposable element(s) comprise a plasmid vector or a minicircle DNA vector.
  • the one or more transposable element(s) comprise a linear DNA.
  • the linear recombinant and non-naturally occurring DNA sequence encoding a transposon may be produced in vitro.
  • Linear recombinant and non-naturally occurring DNA sequences of the disclosure may be a product of a restriction digest of a circular DNA.
  • the circular DNA is a plasmid vector or a minicircle DNA vector.
  • Linear recombinant and non-naturally occurring DNA sequences of the disclosure may be a product of a polymerase chain reaction (PCR).
  • Linear recombinant and non-naturally occurring DNA sequences of the disclosure may be a double-stranded Doggybone™ DNA sequence.
  • Doggybone™ DNA sequences of the disclosure may be produced by an enzymatic process that solely encodes an antigen expression cassette, comprising antigen, promoter, poly-A tail and telomeric ends.
  • the one or more transposable element(s) comprise a piggyBac transposon, a Sleeping Beauty transposon, a Helraiser transposon, a Tol2 transposon or a LINE-1 (L1) transposon.
  • the one or more transposable elements comprise a piggyBac transposon.
  • the one or more transposable elements comprise a Sleeping Beauty transposon.
  • the one or more transposable elements comprise a Helraiser transposon.
  • the one or more transposable elements comprise a Tol2 transposon.
  • the at least one gene editing molecule comprises one or more transposable element(s) and further comprises one or more transposase(s).
  • the one or more transposase(s) comprises a piggyBac transposase, a super piggyBac transposase (SPB), a Sleeping Beauty transposase, a hyperactive Sleeping Beauty transposase (SB100X), a Helitron transposase, a Tol2 transposase or a transposase capable of transposing a LINE-1 (L1) transposon.
  • the transposase comprises a piggyBac transposase or a super piggyBac transposase (SPB).
  • the transposase comprises a Sleeping Beauty transposon
  • the transposase comprises a Sleeping Beauty transposase or a hyperactive Sleeping Beauty transposase (SB100X).
  • the transposase comprises a Helitron transposase.
  • the transposase comprises a Tol2 transposase.
  • the transposon is a piggyBac transposon.
  • the transposase is a piggyBac™ or a Super piggyBac™ (SPB) transposase.
  • the transposon is a plasmid DNA transposon.
  • the sequence encoding the transposase is an mRNA sequence.
  • the transposase enzyme is a piggyBac™ (PB) transposase enzyme.
  • the piggyBac (PB) transposase enzyme may comprise or consist of an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
  • the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at one or more of positions 30, 165, 282, or 538 of the sequence:
  • the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at two or more of positions 30, 165, 282, or 538 of the sequence of SEQ ID NO:
  • the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at three or more of positions 30, 165, 282, or 538 of the sequence of SEQ ID NO: 5.
  • the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at each of the following positions 30, 165, 282, and 538 of the sequence of SEQ ID NO: 5.
  • the amino acid substitution at position 30 of the sequence of SEQ ID NO: 5 is a substitution of a valine (V) for an isoleucine (I).
  • the amino acid substitution at position 165 of the sequence of SEQ ID NO: 5 is a substitution of a serine (S) for a glycine (G).
  • the amino acid substitution at position 282 of the sequence of SEQ ID NO: 5 is a substitution of a valine (V) for a methionine (M).
  • the amino acid substitution at position 538 of the sequence of SEQ ID NO: 5 is a substitution of a lysine (K) for an asparagine (N).
  • the transposase enzyme is a Super piggyBac™ (SPB) transposase enzyme.
  • the Super piggyBac™ (SPB) transposase enzymes of the disclosure may comprise or consist of the amino acid sequence of the sequence of SEQ ID NO: 5 wherein the amino acid substitution at position 30 is a substitution of a valine (V) for an isoleucine (I), the amino acid substitution at position 165 is a substitution of a serine (S) for a glycine (G), the amino acid substitution at position 282 is a substitution of a valine (V) for a methionine (M), and the amino acid substitution at position 538 is a substitution of a lysine (K) for an asparagine (N).
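The four substitutions listed above can be read as point changes at 1-based positions, where "a substitution of a valine (V) for an isoleucine (I)" means the original isoleucine is replaced by valine. The sketch below applies them to a sequence and checks that the expected original residue is present first. SEQ ID NO: 5 is not reproduced in this section, so the code uses a hypothetical placeholder sequence; the function names, the placeholder length of 600 and the filler residue are all assumptions for illustration.

```python
# (position, original residue, replacement residue), positions are 1-based.
SPB_SUBSTITUTIONS = [
    (30, "I", "V"),
    (165, "G", "S"),
    (282, "M", "V"),
    (538, "N", "K"),
]


def apply_substitutions(sequence, substitutions):
    """Return a copy of `sequence` with each point substitution applied.

    Raises ValueError if a position is out of range or if the residue found
    at a position is not the expected original residue.
    """
    residues = list(sequence)
    for position, original, replacement in substitutions:
        index = position - 1  # convert 1-based position to 0-based index
        if index < 0 or index >= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != original:
            raise ValueError(
                f"expected {original} at position {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


def make_placeholder_sequence(length=600, filler="A"):
    """Build a hypothetical stand-in sequence (NOT SEQ ID NO: 5).

    Only the four positions of interest carry the original residues named
    in the text; every other position is an arbitrary filler residue.
    """
    residues = [filler] * length
    for position, original, _ in SPB_SUBSTITUTIONS:
        residues[position - 1] = original
    return "".join(residues)


if __name__ == "__main__":
    parent = make_placeholder_sequence()
    variant = apply_substitutions(parent, SPB_SUBSTITUTIONS)
    changed = [i + 1 for i, (a, b) in enumerate(zip(parent, variant)) if a != b]
    print(changed)
```

Checking the original residue before replacing it guards against off-by-one numbering errors, which are easy to make when a sequence listing and a substitution list come from different sources.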
  • the Super piggyBac™ (SPB) transposase enzyme may comprise or consist of an amino acid sequence at least 75%, 80%, 85%,
  • the piggyBac™ or Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at one or more of positions 3, 46, 82, 103, 119,
  • the piggyBac™ or Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at one or more of positions 46, 119, 125, 177, 180, 185, 187, 200, 207, 209, 226, 235, 240, 241, 243, 296, 298, 311, 315, 319, 327, 328, 340, 421, 436, 456, 470, 485,
  • the amino acid substitution at position 3 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an asparagine (N) for a serine (S).
  • the amino acid substitution at position 46 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a serine (S) for an alanine (A).
  • the amino acid substitution at position 46 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a threonine (T) for an alanine (A).
  • the amino acid substitution at position 82 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tryptophan (W) for an isoleucine (I).
  • the amino acid substitution at position 103 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a serine (S).
  • the amino acid substitution at position 119 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for an arginine (R).
  • the amino acid substitution at position 125 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for a cysteine (C). In some embodiments, the amino acid substitution at position 125 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a cysteine (C). In some embodiments, the amino acid substitution at position 177 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a tyrosine (Y).
  • the amino acid substitution at position 177 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a histidine (H) for a tyrosine (Y).
  • the amino acid substitution at position 180 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a phenylalanine (F).
  • the amino acid substitution at position 180 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a phenylalanine (F).
  • the amino acid substitution at position 180 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a phenylalanine (F).
  • the amino acid substitution at position 185 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a methionine (M).
  • the amino acid substitution at position 187 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for an alanine (A).
  • the amino acid substitution at position 200 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tryptophan (W) for a phenylalanine (F).
  • the amino acid substitution at position 207 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a valine (V).
  • the amino acid substitution at position 209 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a valine (V).
  • the amino acid substitution at position 226 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a methionine (M).
  • the amino acid substitution at position 235 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an arginine (R) for a leucine (L).
  • the amino acid substitution at position 240 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a valine (V).
  • the amino acid substitution at position 241 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a phenylalanine (F).
  • the amino acid substitution at position 243 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a proline (P).
  • the amino acid substitution at position 258 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a serine (S) for an asparagine (N).
  • the amino acid substitution at position 296 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tryptophan (W) for a leucine (L). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tyrosine (Y) for a leucine (L). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a leucine (L).
  • the amino acid substitution at position 298 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for a methionine (M). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a methionine (M).
  • the amino acid substitution at position 311 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a proline (P). In some embodiments, the amino acid substitution at position 311 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a proline (P).
  • the amino acid substitution at position 315 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for an arginine (R).
  • the amino acid substitution at position 319 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for a threonine (T).
  • the amino acid substitution at position 327 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an arginine (R) for a tyrosine (Y).
  • the amino acid substitution at position 328 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a tyrosine (Y).
  • the amino acid substitution at position 340 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for a cysteine (C).
  • the amino acid substitution at position 340 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a cysteine (C).
  • the amino acid substitution at position 421 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a histidine (H) for an aspartic acid (D).
  • the amino acid substitution at position 436 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a valine (V).
  • the amino acid substitution at position 456 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tyrosine (Y) for a methionine (M).
  • the amino acid substitution at position 470 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a leucine (L).
  • the amino acid substitution at position 485 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a serine (S).
  • the amino acid substitution at position 503 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a methionine (M).
  • the amino acid substitution at position 503 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a methionine (M).
  • the amino acid substitution at position 552 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a valine (V).
  • the amino acid substitution at position 570 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a threonine (T) for an alanine (A).
  • the amino acid substitution at position 591 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a glutamine (Q). In some embodiments, the amino acid substitution at position 591 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an arginine (R) for a glutamine (Q).
  • the piggyBac™ transposase enzyme may comprise or the Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at one or more of positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2.
  • the piggyBac™ transposase enzyme may comprise or the Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at two, three, four, five, six or more of positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2.
  • the piggyBac™ transposase enzyme may comprise or the Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2.
  • the amino acid substitution at position 103 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a serine (S).
  • the amino acid substitution at position 194 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a methionine (M).
  • the amino acid substitution at position 372 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for an arginine (R).
  • the amino acid substitution at position 375 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for a lysine (K).
  • the amino acid substitution at position 450 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an asparagine (N) for an aspartic acid (D).
  • the amino acid substitution at position 509 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for a serine (S).
  • the amino acid substitution at position 570 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a serine (S) for an asparagine (N).
  • the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5. In some embodiments, including those embodiments wherein the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5, the piggyBac™ transposase enzyme may further comprise an amino acid substitution at positions 372, 375 and 450 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2.
  • the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5, a substitution of an alanine (A) for an arginine (R) at position 372 of SEQ ID NO: 5, and a substitution of an alanine (A) for a lysine (K) at position 375 of SEQ ID NO: 5.
  • the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5, a substitution of an alanine (A) for an arginine (R) at position 372 of SEQ ID NO: 5, a substitution of an alanine (A) for a lysine (K) at position 375 of SEQ ID NO: 5 and a substitution of an asparagine (N) for an aspartic acid (D) at position 450 of SEQ ID NO: 5.
  • the transposon is a Sleeping Beauty transposon
  • the transposase is a Sleeping Beauty or Sleeping Beauty 100X (SB100X) transposase.
  • the Sleeping Beauty transposase enzyme comprises an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
  • the Sleeping Beauty transposase is a hyperactive Sleeping Beauty SB100X transposase
  • the Sleeping Beauty transposase enzyme comprises an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
  • the transposase is a Helitron transposase.
  • Helitron transposases mobilize the Helraiser transposon, an ancient element from the bat genome that was active about 30 to 36 million years ago.
  • An exemplary Helraiser transposon of the disclosure includes Helibat1, which comprises a nucleic acid sequence comprising:
  • the Helitron transposase comprises a RepHel motif made up of a replication initiator domain (Rep) and a DNA helicase domain.
  • the Rep domain is a nuclease domain of the HUH superfamily of nucleases.
  • An exemplary Helitron transposase of the disclosure comprises an amino acid sequence comprising
  • the Helitron transposase transposes the Helraiser transposable element in a Helitron transposition.
  • a hairpin close to the 3’ end of the Helraiser transposon functions as a terminator.
  • the transposase is flanked by left and right terminal sequences termed LTS and RTS. These sequences terminate with a conserved 5’-TC/CTAG-3’ motif.
  • the transposase is a Tol2 transposase.
  • Tol2 transposons may be isolated or derived from the genome of the medaka fish, and may be similar to transposons of the hAT family.
  • Exemplary Tol2 transposons of the disclosure are encoded by a sequence comprising about 4.7 kilobases and contain a gene encoding the Tol2 transposase, which contains four exons.
  • An exemplary Tol2 transposase of the disclosure comprises an amino acid sequence comprising the following:
  • An exemplary Tol2 transposon of the disclosure including inverted repeats, subterminal sequences and the Tol2 transposase, is encoded by a nucleic acid sequence comprising the following:
  • the disclosure provides a pharmaceutical composition for delivering at least one gene editing molecule to a cell, the composition comprising: a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer having at least one hydrophobic block, at least one hydrophilic block, and at least one poly(L-histidine) block, wherein: the at least one poly(L-histidine) block complexes with the at least one gene editing molecule; and the at least one poly(L-histidine) block is capable of a pH dependent release of the at least one gene editing molecule.
  • the disclosure provides a kit, comprising: a pharmaceutical composition for delivering at least one gene editing molecule to a cell, the composition comprising: a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer having at least one hydrophobic block, at least one hydrophilic block, and at least one poly(L-histidine) block, wherein: the at least one poly(L-histidine) block complexes with the at least one gene editing molecule; and the at least one poly(L-histidine) block is capable of a pH dependent release of the at least one gene editing molecule; and an implement for administering the pharmaceutical composition.
  • the pharmaceutical composition is administered systemically or locally.
  • the composition is administered systemically or locally.
  • the composition is administered intravenously, via inhalation, topically, per rectum, per the vagina, transdermally, subcutaneously, intraperitoneally, intrathecally, intramuscularly or orally.
  • the disclosure provides a kit, comprising: a pharmaceutical composition for delivering at least one gene editing molecule to a cell, the composition comprising: a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer having at least one hydrophobic block, at least one hydrophilic block, and at least one poly(L-histidine) block, wherein: the at least one poly(L-histidine) block complexes with the at least one gene editing molecule; and the at least one poly(L-histidine) block is capable of a pH dependent release of the at least one gene editing molecule; and an implement for administering the pharmaceutical composition intravenously, via inhalation, topically, per rectum, per the vagina, transdermally, subcutaneously, intraperitoneally, intrathecally, intramuscularly or orally.
  • compositions of the disclosure comprise a micelle structure comprising a triblock copolymer capable of complexing with at least one protein or nucleic acid, wherein the triblock copolymer comprises a hydrophilic block, a hydrophobic block, and a poly(L-histidine) block.
  • the hydrophilic block comprises poly(ethylene oxide) (PEO).
  • the hydrophilic block comprises at least one aliphatic polyester.
  • the hydrophilic block comprises a poly(lactic acid), a poly(glycolic acid) (PGA), a poly(lactic-co-glycolic acid) (PLGA), a poly(ε-caprolactone) (PCL), a poly(3-hydroxybutyrate) (PHB) or any combination thereof.
  • the hydrophilic block comprises a poly(lactic acid) having an average length of 25 units.
  • the hydrophobic block comprises a poly(ester), a poly(anhydride), a poly (peptide), an artificial poly(nucleic acid) or any combination thereof.
  • the poly(L-histidine) block enables pH-dependent release of the at least one protein or nucleic acid.
  • Exemplary poly(L-histidine) copolymers include, but are not limited to, non-degradable and degradable diblocks.
  • Exemplary degradable poly(L-histidine) copolymers include, but are not limited to, PEO(5000)-b-PCL(16300) (“P2350-EOCL”); PEO(2000)-b-PMCL(11900) (“OCL”); PEO(2000)-b-PMCL(8300) (“OMCL”); PEO(1100)-b-PTMC(5100) (“OTMC”); and PEO(2000)-b-PTMC/PCL(11200) (“OTCL”).
  • the compositions comprise a micelle structure comprising a copolymer comprising PEO-b-PLA-b-PHIS.
  • the PEO block may comprise at least 1 monomer, 5 monomers, 10 monomers, 100 monomers, 500 monomers, 1000 monomers, 2500 monomers, 5000 monomers, 10000 monomers, 15000 monomers or any number of monomers in between.
  • the PLA block may comprise at least 1 monomer, 5 monomers, 10 monomers, 100 monomers, 500 monomers, 1000 monomers, 2500 monomers, 5000 monomers, 10000 monomers, 15000 monomers or any number of monomers in between.
  • the PHIS block may comprise at least 1 monomer, 5 monomers, 10 monomers, 100 monomers, 500 monomers, 1000 monomers, 2500 monomers, 5000 monomers, 10000 monomers, 15000 monomers or any number of monomers in between.
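
As a rough illustration of how the monomer counts above translate into chain size, the following sketch estimates the mass of a PEO-b-PLA-b-PHIS chain from the repeat count of each block. The repeat-unit masses are standard approximate values; the function name `triblock_mass`, the neglect of end groups, and the example PEO and PHIS lengths are assumptions made for illustration and are not taken from the disclosure.

```python
# Approximate repeat-unit masses in g/mol (end groups are ignored).
REPEAT_UNIT_MASS = {
    "PEO": 44.05,    # ethylene oxide repeat unit, -CH2-CH2-O-
    "PLA": 72.06,    # lactic acid repeat unit, -O-CH(CH3)-CO-
    "PHIS": 137.14,  # histidine residue
}


def triblock_mass(n_peo: int, n_pla: int, n_phis: int) -> float:
    """Estimate chain mass (g/mol) from the repeat count of each block."""
    counts = {"PEO": n_peo, "PLA": n_pla, "PHIS": n_phis}
    if any(n < 1 for n in counts.values()):
        raise ValueError("each block needs at least 1 monomer")
    return sum(REPEAT_UNIT_MASS[block] * n for block, n in counts.items())


# Hypothetical block lengths; only the 25-unit PLA length appears in the text.
example_mass = triblock_mass(n_peo=100, n_pla=25, n_phis=20)
print(f"{example_mass:.1f} g/mol")
```

Keeping the per-block masses in one table makes it straightforward to swap in a different hydrophobic or charged block when exploring other block designs.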
  • compositions of the disclosure comprise a micelle structure comprising a copolymer comprising PEO-b-PLA-b-PHIS.
  • the molar ratio of polymer to cargo is 20:1, 15:1, 10:1, 5:1, or 2:1.
  • the cargo is at least one gene editing molecule of the disclosure.
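
The polymer-to-cargo molar ratios listed above can be converted into weighable amounts once molecular weights are known. The sketch below shows that arithmetic only; the function name `polymer_mass_ug` and both molecular weights are hypothetical placeholders, not values from the disclosure.

```python
def polymer_mass_ug(cargo_mass_ug: float, cargo_mw: float,
                    polymer_mw: float, ratio: float) -> float:
    """Mass of polymer (ug) that gives `ratio` mol polymer per mol cargo."""
    if min(cargo_mass_ug, cargo_mw, polymer_mw, ratio) <= 0:
        raise ValueError("all inputs must be positive")
    cargo_umol = cargo_mass_ug / cargo_mw    # ug / (g/mol) = umol
    return cargo_umol * ratio * polymer_mw   # umol * (g/mol) = ug


# Hypothetical molecular weights, for illustration only.
CARGO_MW = 160_000.0   # g/mol, placeholder for a protein cargo
POLYMER_MW = 9_000.0   # g/mol, placeholder for a triblock copolymer

for ratio in (20, 15, 10, 5, 2):   # molar ratios listed in the text
    mass = polymer_mass_ug(10.0, CARGO_MW, POLYMER_MW, ratio)
    print(f"{ratio}:1 -> {mass:.3f} ug polymer per 10 ug cargo")
```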
  • micellar systems with triggered release mechanisms may be developed that enable the delivery of drugs or other treatment agents in response to specific stimuli.
  • pH-sensitive polymeric micelles may be useful therapeutic agents since changes in pH occur in a variety of cellular processes and locations. For example, micelles enter cells via endocytosis, where the pH can drop as low as 5.5-6.0 in endosomes and 4.5-5.0 in lysosomes.
  • cationically-charged, pH-sensitive polymers maintain a neutral charge at a pH around physiological pH (7.0-7.8) and become positively charged at a reduced pH such as that which may be found in an endosome or lysosome.
  • cationically-charged, pH-sensitive polymers that maintain a neutral charge at a pH around physiological pH (7.0-7.8) become positively charged at a reduced pH of between 6.0 and 7.0.
  • cationically-charged, pH-sensitive polymers that maintain a neutral charge at a pH around physiological pH (7.0-7.8) become positively charged at a reduced pH of between 6.0 and 6.5.
  • cationically-charged, pH-sensitive polymers that maintain a neutral charge at a pH around physiological pH (7.0-7.8) become positively charged at a reduced pH of between 6.2 and 6.4.
  • Exemplary cationically-charged, pH-sensitive polymers are displayed in Table 1.
  • Table 1 Ionizable polymers that are cationically-charged dependent upon pH state.
  • cationically-charged polymers are constitutively positively charged at a pH around physiological pH (7.0-7.8).
  • Exemplary constitutively cationically-charged polymers are displayed in Table 2.
  • the various embodiments enable intracellular delivery of gene editing tools by complexing with ionizable and/or cationically-charged polymer-based micelles.
  • the various embodiments provide triblock copolymers made of a hydrophilic block, a hydrophobic block, and a charged block.
  • the hydrophilic block may be poly(ethylene oxide) (PEO)
  • the charged block may be selected from Table 1 or Table 2.
  • An example tri-block copolymer that may be used in various embodiments is a PEO-b-PLA-b-PHIS, with variable numbers of repeating units in each block varying by design.
  • the gene editing tools may be various molecules that are recognized as capable of modifying, repairing, adding and/or silencing genes in various cells.
  • double-strand breaks (DSBs)
  • Structural damage to DNA may occur randomly and unpredictably in the genome due to any of a number of intracellular factors (e.g., nucleases, reactive oxygen species, etc.) as well as external forces (e.g., ionizing radiation, ultraviolet (UV) radiation, etc.).
  • Genetic modification tools may therefore be composed of programmable, sequence-specific DNA-binding modules associated with a nonspecific DNA nuclease, introducing DSBs into the genome.
  • CRISPR loci, mostly found in bacteria, contain short direct repeats and are part of the acquired prokaryotic immune system, conferring resistance to exogenous sequences such as plasmids and phages.
  • RNA-guided endonucleases are programmable genetic engineering tools that are adapted from the CRISPR/CRISPR-associated protein 9 (Cas9) system, which is a component of prokaryotic adaptive immunity.
  • Diblock copolymers that may be used as intermediates for making triblock copolymers of the embodiment micelles may have hydrophilic biocompatible poly(ethylene oxide) (PEO), which is chemically synonymous with PEG, coupled to various hydrophobic aliphatic poly(anhydrides), poly(nucleic acids), poly(esters), poly(ortho esters),
  • PEO: poly(ethylene oxide)
  • PEG: poly(ethylene glycol)
  • Polymeric micelles comprised of 100% PEGylated surfaces possess improved in vitro chemical stability, augmented in vivo bioavailability, and prolonged blood circulatory half-lives.
  • aliphatic polyesters, constituting the polymeric micelle's membrane portions, are degraded by hydrolysis of their ester linkages in physiological conditions such as in the human body. Because of their biodegradable nature, aliphatic polyesters have received a great deal of attention for use as implantable biomaterials in drug delivery devices, bioresorbable sutures, adhesion barriers, and as scaffolds for injury repair via tissue engineering.
  • molecules required for gene editing may be delivered to cells using one or more micelles formed from self-assembled triblock copolymers containing cationically-charged polymer blocks.
  • gene editing refers to the insertion, deletion or replacement of nucleic acids in genomic DNA so as to add, disrupt or modify the function of the product that is encoded by a gene.
  • a cutting enzyme (e.g., a nuclease or recombinase)
  • Poly(histidine) (i.e., poly(L-histidine))
  • poly(L-histidine) is an ionizable polymer that becomes positively charged at lower pH (<7.0) because the imidazole ring provides an electron lone pair on the unsaturated nitrogen. That is, poly(histidine) has amphoteric properties through protonation-deprotonation.
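
The pH dependence described above can be illustrated with the Henderson-Hasselbalch relation, which gives the fraction of ionizable groups that carry a proton at a given pH. This is a minimal sketch: the default pKa of 6.0 is an assumed round value for an imidazole side chain, and the effective pKa of a poly(histidine) block in a micelle may differ. The function name is hypothetical.

```python
def fraction_protonated(ph: float, pka: float = 6.0) -> float:
    """Henderson-Hasselbalch fraction of groups carrying a proton.

    The default pKa is an assumed value for illustration only.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Physiological, mildly acidic, and endosome/lysosome-like pH values.
for ph in (7.4, 6.5, 6.0, 5.0):
    print(f"pH {ph}: {fraction_protonated(ph):.2f} protonated")
```

Under this assumption, half of the groups are protonated when the pH equals the pKa, and the protonated fraction rises as the pH falls.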
  • the various embodiments enable intracellular delivery of gene editing tools by complexing with poly(histidine)-based micelles.
  • the various embodiments provide triblock copolymers made of a hydrophilic block, a hydrophobic block, and a charged block.
  • the hydrophilic block may be poly(ethylene oxide) (PEO)
  • the charged block may be poly(L-histidine).
  • An example tri-block copolymer that may be used in various embodiments is a PEO-b-PLA-b-PHIS, with variable numbers of repeating units in each block varying by design.
  • molecules required for gene editing may be delivered to cells using one or more micelles formed from self-assembled triblock copolymers containing poly(histidine).
  • insertion tools (e.g., DNA template vectors or transposable elements such as transposons or retrotransposons) must be delivered to the cell in addition to the cutting enzyme (e.g., a nuclease, recombinase, integrase or transposase).
  • Examples of such insertion tools for a recombinase may include a DNA vector.
  • Other gene editing systems require the delivery of an integrase along with an insertion vector, a transposase along with a transposon/retrotransposon, etc.
  • an example recombinase that may be used as a cutting enzyme is the CRE recombinase.
  • example integrases that may be used in insertion tools include viral based enzymes taken from any of a number of viruses including, but not limited to, AAV, gamma retrovirus, and lentivirus.
  • Example transposons/retrotransposons that may be used in insertion tools include, but are not limited to, the piggyBac transposon, Sleeping Beauty transposon, Helraiser transposon, Tol2 transposon and the L1 retrotransposon.
  • nucleases that may be used as cutting enzymes include, but are not limited to, Cas9, transcription activator-like effector nucleases (TALENs) and zinc finger nucleases.
  • the Cas9 is a catalytically inactive or "inactivated" Cas9 (dCas9).
  • the Cas9 is a catalytically inactive or "inactivated" nuclease domain of Cas9.
  • the dCas9 is encoded by a shorter sequence that is derived from a full length, catalytically inactivated, Cas9, referred to herein as a “small” dCas9 or dSaCas9.
  • the inactivated, small Cas9 is operatively linked to an active nuclease.
  • the disclosure provides a fusion protein comprising, consisting essentially of or consisting of a DNA binding domain and a nuclease, wherein the nuclease comprises a small, inactivated Cas9 (dSaCas9).
  • the dSaCas9 of the disclosure is isolated or derived from Staphylococcus aureus and comprises the mutations D10A and N580A (underlined and bolded) which inactivate the catalytic site.
  • the dSaCas9 isolated or derived from Staphylococcus aureus of the disclosure comprises the amino acid sequence of:
  • the dCas9 of the disclosure comprises a dCas9 isolated or derived from Streptococcus pyogenes.
  • the dCas9 is isolated or derived from Streptococcus pyogenes and comprises a dCas9 with substitutions at positions 10 and 840 of the amino acid sequence of the dCas9 which inactivate the catalytic site. In some embodiments, these substitutions are D10A and
  • the amino acid sequence of the dCas9 (isolated or derived from Streptococcus pyogenes) comprises the sequence of:
  • the nuclease domain may comprise, consist essentially of or consist of a dCas9 or a dSaCas9 and a type IIS endonuclease.
  • the nuclease domain may comprise, consist essentially of or consist of a dSaCas9 and a type IIS endonuclease, including, but not limited to, AciI, MnlI, AlwI, BbvI, BccI, BceAI, BsmAI, BsmFI, BspCNI, BsrI, BtsCI, HgaI, HphI, HpyAV, MboII, MylI, PleI, SfaNI, AcuI, BciVI, BfuAI, BmgBI, BmrI, BpmI, BpuEI, BsaI, BseRI, BsgI, BsmI, BspMI, BsrBI, BsrBI, BsrDI, BtgZI, BtsI, EarI, EciI, MmeI, NmeAIII, BbvCI, Bpu10I, BspQI, SapI, BaeI, BsaXI
  • the Clo051 nuclease domain may comprise, consist essentially of or consist of, the amino acid sequence of:
  • An exemplary dCas9-Clo051 nuclease domain may comprise, consist essentially of or consist of, the amino acid sequence of (Clo051 sequence underlined, linker bold italics, dCas9 (isolated or derived from S. pyogenes) sequence in italics):
  • the nuclease may comprise, consist essentially of or consist of, a homodimer or a heterodimer.
  • Nuclease domains of the disclosure may comprise, consist essentially of or consist of a nuclease domain isolated, derived or recombined from a transcription-activator-like effector nuclease (TALEN).
  • TALENs are transcription factors with programmable DNA binding domains that provide a means to create designer proteins that bind to pre-determined DNA sequences or individual nucleic acids. Modular DNA binding domains have been identified in transcriptional activator-like (TAL) proteins, or, more specifically, transcriptional activator-like effector nucleases (TALENs), thereby allowing for the de novo creation of synthetic transcription factors that bind to DNA sequences of interest and, if desirable, also allowing a second domain present on the protein or polypeptide to perform an activity related to DNA.
  • TAL proteins have been derived from the organisms Xanthomonas and Ralstonia.
  • the nuclease domain may comprise, consist essentially of or consist of a nuclease domain isolated, derived or recombined from a TALEN and a type IIS endonuclease.
  • the type IIS endonuclease may comprise, consist essentially of or consist of AciI, MnlI, AlwI, BbvI, BccI, BceAI, BsmAI, BsmFI, BspCNI, BsrI, BtsCI, HgaI, HphI, HpyAV, MboII, MylI, PleI, SfaNI, AcuI, BciVI, BfuAI, BmgBI, BmrI, BpmI, BpuEI, BsaI, BseRI, BsgI, BsmI, BspMI, BsrBI, BsrBI, BsrDI, BtgZI, BtsI, EarI, EciI, MmeI, NmeAIII, BbvCI, Bpu10I, BspQI, SapI, BaeI, BsaXI, CspCI, BfiI, MboII, Acc36I, FokI or Clo051.
  • the nuclease domain may comprise, consist essentially of or consist of a nuclease domain isolated, derived or recombined from a zinc finger nuclease (ZFN) and a type IIS endonuclease.
  • ZFN zinc finger nuclease
  • the type IIS endonuclease may comprise, consist essentially of or consist of AciI, MnlI, AlwI, BbvI, BccI, BceAI, BsmAI, BsmFI, BspCNI, BsrI, BtsCI, HgaI, HphI, HpyAV, MboII, MylI, PleI, SfaNI, AcuI, BciVI, BfuAI, BmgBI, BmrI, BpmI, BpuEI, BsaI, BseRI, BsgI, BsmI, BspMI, BsrBI, BsrBI, BsrDI, BtgZI, BtsI, EarI, EciI, MmeI, NmeAIII, BbvCI, Bpu10I, BspQI, SapI, BaeI, BsaXI, CspCI, BfiI, MboII, Acc36I, FokI or Clo05
  • the DNA binding domain and the nuclease domain may be covalently linked.
  • a fusion protein may comprise the DNA binding domain and the nuclease domain.
  • the DNA binding domain and the nuclease domain may be operably linked through a non-covalent linkage.
  • the gene editing systems described herein may be complexed with nanoparticles that are poly(histidine)-based micelles.
  • poly(histidine)-containing triblock copolymers may assemble into a micelle with positively charged poly(histidine) units on the surface, thereby enabling complexing with the negatively-charged gene editing molecule(s).
  • Using these nanoparticles to bind and release proteins and/or nucleic acids in a pH-dependent manner may provide an efficient and selective mechanism to perform a desired gene modification.
  • this micelle-based delivery system provides substantial flexibility with respect to the charged materials, as well as a large payload capacity, and targeted release of the nanoparticle payload.
  • site-specific cleavage of the double stranded DNA may be enabled by delivery of a nuclease using the poly(histidine)-based micelles.
  • the hydrophobic blocks aggregate to form a core, leaving the hydrophilic blocks and poly(histidine) blocks on the ends to form one or more surrounding layers.
  • poly(histidine)-based micelles may be formed at a pH higher than the pKa of poly(histidine) (e.g., pH of about 7).
  • the amine groups of the poly(histidine) block may be protonated, imparting a positive charge and enabling the poly(histidine) block to complex with negatively charged molecules (e.g., proteins and nucleic acids).
  • when the pH is dropped substantially, such as to a pH of around 3-4, the bound protein and/or nucleic acid may be released due to protonation of the poly(histidine).
  • poly(histidine)-based micelles may exploit the controllable pH-dependent release of the payload molecules to target particular cells and/or pathways.
  • the gene editing systems described herein may be complexed with nanoparticles that are ionizable or constitutively cationically-charged and that are composed of polymer-based micelles.
  • cationically-charged polymer-containing triblock copolymers may assemble into a micelle with positively charged polymer units on the surface, thereby enabling complexing with the negatively-charged gene editing molecule(s).
  • Using these nanoparticles to bind and release proteins and/or nucleic acids in a pH-dependent manner may provide an efficient and selective mechanism to perform a desired gene modification.
  • this micelle-based delivery system provides substantial flexibility with respect to the charged materials, as well as a large payload capacity, and targeted release of the nanoparticle payload.
  • site-specific cleavage of the double stranded DNA may be enabled by delivery of a nuclease using the cationically-charged polymer-based micelles.
  • the various embodiments enable intracellular delivery of gene editing tools by complexing with ionizable or constitutively cationically-charged polymer-based micelles.
  • the hydrophobic blocks aggregate to form a core, leaving the hydrophilic blocks and positively-charged polymer blocks on the ends to form one or more surrounding layers.
  • cationically-charged polymer-based micelles may be formed from a triblock copolymer containing at least one cationic block comprised of the polymers in Table 1 or 2.
  • the amine groups of the ionizable polymer block may be protonated, imparting a positive charge and enabling the resultant cationically-charged polymer block to complex with negatively charged molecules (e.g., proteins and nucleic acids).
  • the resultant micelles are cationically charged above a pH of 6.0.
  • Various applications of the embodiment cationically-charged polymer-based micelles may exploit the controllable pH-dependent release of the payload molecules to target particular cells and/or pathways.
  • Additional applications of the embodiment micelles may include conjugating molecules to the hydrophilic block in order to target particular cell types.
  • Apolipoprotein E or N-Acetylgalactosamine (GalNAc) may be conjugated to a PEO block for specific targeting of the micelles to hepatocytes.
  • the particular methods of creating the block copolymers used in the various embodiments, as well as the techniques of forming the micelles, may be varied based on the composition.
  • these methods and techniques may be optimized to achieve the most desirable block and nanoparticle properties.
  • the polymerization times may be altered to change the molecular weight of a block, and therefore the overall nanoparticle size, as described in further detail in the examples below.
  • the hydrophobic block of the triblock copolymers used to form the micelles may be a polyester, a polyanhydride, a polypeptide, or an artificial polynucleic acid.
  • the hydrophobic block may be an aliphatic polyester, including, but not limited to, poly(lactic acid), poly(glycolic acid) (PGA), poly(lactic-co-glycolic acid) (PLGA), poly(ε-caprolactone) (PCL), and/or poly(3-hydroxybutyrate) (PHB).
  • Various embodiments may be DNA-based systems that are complexed with the poly(histidine)-based micelles.
  • an expression vector that expresses a nuclease or other protein may be complexed with poly(histidine)-based micelles.
  • the expression vector may be, for example, a plasmid constructed to contain DNA encoding a nuclease as well as a promoter region. Once inside the target cell, the DNA encoding the nuclease may be transcribed and translated to create the enzyme.
  • Various embodiment systems may also be designed to integrate DNA into the genome of a target cell using a transposon provided on a vector, such as an artificially constructed plasmid.
  • Applications of such systems may include introducing (i.e., "knocking in") a new gene to perform a particular function through the inserted DNA, or inactivating (i.e., "knocking out") a mutated gene that is functioning improperly through interruption in the target DNA.
  • the DNA may be a transposon that is directly transposed between vectors and chromosomes via a "cut and paste" mechanism.
  • the transposon may be a retrotransposon, e.g., a DNA that is first transcribed into an RNA intermediate, followed by reverse transcription into the DNA that is transposed.
  • the cationically-charged polymer-based micelles may complex with a vector that includes the transposon, as well as a transposase that catalyzes the integration of the transposon into specific sites in the target genome.
  • the poly(histidine)-based micelles may complex with a vector that includes the transposon, as well as a transposase that catalyzes the integration of the transposon into specific sites in the target genome.
  • the transposase that is used is specific to the particular transposon that is selected, each of which may have particular properties that are desirable for use in various embodiments.
  • One example transposon is the piggyBac transposon, which is transposed into a target genome by the piggyBac transposase.
  • the piggyBac transposase recognizes transposon-specific inverted terminal repeat sequences (ITRs) on the ends of the transposon, and moves the contents between the ITRs into TTAA chromosomal sites.
  • the piggyBac transposon system has no payload limit for the genes of interest that can be included between the ITRs.
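
Because the piggyBac transposase integrates at TTAA sites, candidate integration positions in a sequence can be listed with a simple scan. The sketch below is illustrative only: the function name `find_ttaa_sites` and the toy sequence are hypothetical, and the presence of a TTAA motif does not by itself predict where integration will occur. TTAA is its own reverse complement, so scanning one strand finds the sites on both.

```python
def find_ttaa_sites(dna: str) -> list[int]:
    """Return 0-based start positions of every TTAA motif in `dna`."""
    seq = dna.upper()
    sites = []
    start = seq.find("TTAA")
    while start != -1:
        sites.append(start)
        start = seq.find("TTAA", start + 1)  # continue after this hit
    return sites


toy_sequence = "GGTTAACCATTAATTAAGC"  # made-up sequence for illustration
print(find_ttaa_sites(toy_sequence))
```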
  • the transposase is a piggyBac™ or a Super piggyBac™ (SPB) transposase.
  • the sequence encoding the transposase is an mRNA sequence.
  • the transposase enzyme is a piggyBac™ (PB) transposase enzyme.
  • the piggyBac (PB) transposase enzyme may comprise or consist of an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
  • the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at one or more of positions 30, 165, 282, or 538 of the sequence: 1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
  • the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at two or more of positions 30, 165, 282, or 538 of the sequence of SEQ ID NO: 5.
  • the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at three or more of positions 30, 165, 282, or 538 of the sequence of SEQ ID NO: 5.
  • the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at each of the following positions 30, 165, 282, and 538 of the sequence of SEQ ID NO: 5.
  • the amino acid substitution at position 30 of the sequence of SEQ ID NO: 5 is a substitution of a valine (V) for an isoleucine (I).
  • the amino acid substitution at position 165 of the sequence of SEQ ID NO: 5 is a substitution of a serine (S) for a glycine (G).
  • the amino acid substitution at position 282 of the sequence of SEQ ID NO: 5 is a substitution of a valine (V) for a methionine (M). In some embodiments, the amino acid substitution at position 538 of the sequence of SEQ ID NO: 5 is a substitution of a lysine (K) for an asparagine (N).
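
The position-specific substitutions described above use 1-based residue numbering. The sketch below applies one of them, the valine-for-isoleucine substitution at position 30, to the 60-residue N-terminal fragment quoted earlier in the text. The helper name `substitute` is hypothetical, and positions 165, 282 and 538 cannot be demonstrated here because the fragment shown does not extend that far.

```python
# First 60 residues as quoted in the text; later positions are not shown here.
PB_FRAGMENT = (
    "MGSSLDDEHI" "LSALLQSDDE" "LVGEDSDSEI"
    "SDHVSEDDVQ" "SDTEEAFIDE" "VHEVQPTSSG"
)


def substitute(seq: str, position: int, old: str, new: str) -> str:
    """Replace residue `old` with `new` at a 1-based position."""
    if not 1 <= position <= len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    found = seq[position - 1]
    if found != old:
        raise ValueError(f"expected {old} at {position}, found {found}")
    return seq[:position - 1] + new + seq[position:]


mutant = substitute(PB_FRAGMENT, 30, "I", "V")  # valine for isoleucine at 30
print(mutant[20:30])  # residues 21-30 after the substitution
```

Checking the original residue before replacing it guards against off-by-one numbering mistakes, which are easy to make when switching between 0-based and 1-based positions.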
  • the transposase enzyme is a Super piggyBac™ (SPB) transposase enzyme.
  • the Super piggyBac™ (SPB) transposase enzymes of the disclosure may comprise or consist of the amino acid sequence of the sequence of SEQ ID NO: 5 wherein the amino acid substitution at position 30 is a substitution of a valine (V) for an isoleucine (I), the amino acid substitution at position
  • the Super piggyBac™ (SPB) transposase enzyme may comprise or consist of an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
  • the piggyBac™ or Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at one or more of positions 3, 46, 82, 103, 119, 125, 177, 180, 185, 187, 200, 207, 209, 226, 235, 240, 241, 243, 258, 296, 298, 311, 315, 319, 327, 328, 340, 421, 436, 456, 470, 486, 503, 552, 570 and 591 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2.
  • the piggyBac™ or Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at one or more of positions 46, 119, 125, 177, 180, 185, 187, 200, 207, 209, 226, 235, 240, 241, 243, 296, 298, 311, 315, 319, 327, 328, 340, 421, 436, 456, 470, 485,
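
A percent-identity threshold such as those listed above can be checked once two sequences have been aligned. The sketch below assumes the sequences are already aligned to equal length (with "-" marking gaps) and divides matches by the alignment length; other conventions for the denominator exist, and the disclosure does not state which is intended. The function name and the example variant are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns holding identical residues ('-' is a gap)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return 100.0 * matches / len(a)


reference = "MGSSLDDEHI"  # first ten residues quoted in the text
variant = "MGNSLDDEHI"    # hypothetical variant with one substitution
print(percent_identity(reference, variant))
```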
  • the amino acid substitution at position 3 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an asparagine (N) for a serine (S).
  • the amino acid substitution at position 46 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a serine (S) for an alanine (A).
  • the amino acid substitution at position 46 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a threonine (T) for an alanine (A).
  • the amino acid substitution at position 82 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tryptophan (W) for an isoleucine (I).
  • the amino acid substitution at position 103 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a serine (S).
  • the amino acid substitution at position 119 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for an arginine (R).
  • the amino acid substitution at position 125 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for a cysteine (C). In some embodiments, the amino acid substitution at position 125 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a cysteine (C). In some embodiments, the amino acid substitution at position 177 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a tyrosine (Y).
  • the amino acid substitution at position 177 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a histidine (H) for a tyrosine (Y).
  • the amino acid substitution at position 180 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a phenylalanine (F).
  • the amino acid substitution at position 180 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a phenylalanine (F).
  • the amino acid substitution at position 180 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a phenylalanine (F).
  • the amino acid substitution at position 185 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a methionine (M).
  • the amino acid substitution at position 187 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for an alanine (A).
  • the amino acid substitution at position 200 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tryptophan (W) for a phenylalanine (F).
  • the amino acid substitution at position 207 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a valine (V).
  • the amino acid substitution at position 209 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a valine (V).
  • the amino acid substitution at position 226 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a methionine (M).
  • the amino acid substitution at position 235 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an arginine (R) for a leucine (L).
  • the amino acid substitution at position 240 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a valine (V).
  • the amino acid substitution at position 241 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a phenylalanine (F).
  • the amino acid substitution at position 243 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a proline (P).
  • the amino acid substitution at position 258 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a serine (S) for an asparagine (N).
  • the amino acid substitution at position 296 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tryptophan (W) for a leucine (L). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tyrosine (Y) for a leucine (L). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a leucine (L).
  • the amino acid substitution at position 298 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for a methionine (M). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a methionine (M).
  • the amino acid substitution at position 311 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a proline (P). In some embodiments, the amino acid substitution at position 311 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a proline (P).
  • the amino acid substitution at position 315 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for an arginine (R).
  • the amino acid substitution at position 319 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for a threonine (T).
  • the amino acid substitution at position 327 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an arginine (R) for a tyrosine (Y).
  • the amino acid substitution at position 328 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a tyrosine (Y).
  • the amino acid substitution at position 340 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for a cysteine (C).
  • the amino acid substitution at position 340 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a cysteine (C).
  • the amino acid substitution at position 421 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a histidine (H) for an aspartic acid (D).
  • the amino acid substitution at position 436 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a valine (V).
  • the amino acid substitution at position 456 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tyrosine (Y) for a methionine (M).
  • the amino acid substitution at position 470 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a leucine (L).
  • the amino acid substitution at position 485 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a serine (S).
  • the amino acid substitution at position 503 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a methionine (M).
  • the amino acid substitution at position 503 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a methionine (M).
  • the amino acid substitution at position 552 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a valine (V).
  • the amino acid substitution at position 570 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a threonine (T) for an alanine (A).
  • the amino acid substitution at position 591 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a glutamine (Q). In some embodiments, the amino acid substitution at position 591 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an arginine (R) for a glutamine (Q).
  • the piggyBac™ transposase enzyme may comprise or the Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at one or more of positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2.
  • the piggyBac™ transposase enzyme may comprise or the Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at two, three, four, five, six or more of positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2.
  • the piggyBac™ transposase enzyme may comprise or the Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2.
  • the amino acid substitution at position 103 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a serine (S).
  • the amino acid substitution at position 194 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a methionine (M).
  • the amino acid substitution at position 372 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for an arginine (R).
  • the amino acid substitution at position 375 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for a lysine (K).
  • the amino acid substitution at position 450 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an asparagine (N) for an aspartic acid (D).
  • the amino acid substitution at position 509 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for a serine (S).
  • the amino acid substitution at position 570 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a serine (S) for an asparagine (N).
  • the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5.
  • the piggyBac™ transposase enzyme may further comprise an amino acid substitution at positions 372, 375 and 450 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2.
  • the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5, a substitution of an alanine (A) for an arginine (R) at position 372 of SEQ ID NO: 5, and a substitution of an alanine (A) for a lysine (K) at position 375 of SEQ ID NO: 5.
  • the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5, a substitution of an alanine (A) for an arginine (R) at position 372 of SEQ ID NO: 5, a substitution of an alanine (A) for a lysine (K) at position 375 of SEQ ID NO: 5 and a substitution of an asparagine (N) for an aspartic acid (D) at position 450 of SEQ ID NO: 5.
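
The substitution wording used in the items above ("a substitution of a valine (V) for a methionine (M) at position 194") means that the methionine found at that position is replaced by a valine, often abbreviated M194V. The following Python sketch illustrates that convention only. The function names are illustrative, and because the sequence of SEQ ID NO: 5 is not reproduced in this excerpt, the example runs on a hypothetical 12-residue toy sequence with toy positions.

```python
def parse_substitution(label):
    """Parse a label such as 'M194V' into (original, position, replacement)."""
    return label[0], int(label[1:-1]), label[-1]


def apply_substitutions(sequence, substitutions):
    """Return a copy of `sequence` with the point substitutions applied.

    Positions are 1-based, as in the text. Each substitution is checked
    against the residue actually present, so a mismatch raises an error
    instead of silently producing the wrong variant.
    """
    residues = list(sequence)
    for original, position, replacement in substitutions:
        index = position - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != original:
            raise ValueError(
                f"expected {original} at position {position}, "
                f"found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Hypothetical toy sequence (NOT SEQ ID NO: 5); positions are illustrative.
toy_sequence = "MSRKDLMARKDE"
labels = ["M7V", "R9A", "K10A", "D11N"]
variant = apply_substitutions(
    toy_sequence, [parse_substitution(label) for label in labels]
)
print(variant)  # MSRKDLVAAANE
```

Checking the expected wild-type residue before replacing it is a deliberate choice: it catches off-by-one numbering errors, which are easy to make when a position is counted against a different reference sequence.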
  • Another example transposon system is the Sleeping Beauty transposon, which is transposed into the target genome by the Sleeping Beauty transposase, which recognizes the ITRs and moves the contents between the ITRs into TA chromosomal sites.
  • SB transposon-mediated gene transfer, or gene transfer using any of a number of similar transposons, may be used for long-term expression of a therapeutic gene.
  • the transposase is a Sleeping Beauty transposase or a hyperactive Sleeping Beauty transposase (SB100X).
  • the Sleeping Beauty transposase enzyme comprises an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
  • the hyperactive Sleeping Beauty (SB100X) transposase enzyme comprises an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
  • Another example transposon system is the Helraiser/Helitron transposon system.
  • the Helraiser transposon is transposed by the Helitron transposase.
  • Helitron transposases mobilize the Helraiser transposon, an ancient element from the bat genome that was active about 30 to 36 million years ago.
  • An exemplary Helraiser transposon of the disclosure includes Helibat1, which comprises a nucleic acid sequence comprising:
  • the Helitron transposase does not contain an RNase-H like catalytic domain, but instead comprises a RepHel motif made up of a replication initiator domain (Rep) and a DNA helicase domain.
  • the Rep domain is a nuclease domain of the HUH superfamily of nucleases.
  • An exemplary Helitron transposase of the disclosure comprises an amino acid sequence comprising:
  • a hairpin close to the 3’ end of the transposon functions as a terminator.
  • this hairpin can be bypassed by the transposase, resulting in the transduction of flanking sequences.
  • Helraiser transposition generates covalently closed circular intermediates.
  • Helitron transpositions can lack target site duplications.
  • the transposase is flanked by left and right terminal sequences termed LTS and RTS. These sequences terminate with a conserved 5’-TC/CTAG-3’ motif.
  • a 19 bp palindromic sequence with the potential to form the hairpin termination structure is located 11 nucleotides upstream of the RTS and consists of the sequence
  • Tol2 transposons may be isolated or derived from the genome of the medaka fish, and may be similar to transposons of the hAT family.
  • Exemplary Tol2 transposons of the disclosure are encoded by a sequence comprising about 4.7 kilobases and contain a gene encoding the Tol2 transposase, which contains four exons.
  • An exemplary Tol2 transposase of the disclosure comprises an amino acid sequence comprising the following:
  • An exemplary Tol2 transposon of the disclosure including inverted repeats, subterminal sequences and the Tol2 transposase, is encoded by a nucleic acid sequence comprising the following:
  • poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles may complex with the transposase in its native protein form, as mRNA that is translated into protein in the target cell, or as an expression vector containing DNA to express the transposase protein.
  • genes encoding the transposase may be provided in the same vector as the transposon itself, or on a different vector.
  • Various embodiments may further enable complexing a nuclease and a transposon system in a poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelle for delivery to a target cell.
  • Such micelle systems may be used, for example, to replace a mutated gene that causes disease with a healthy copy of the gene that is inserted at a specific site dictated by the activity of the nuclease.
  • a transposon may be created that includes one or more genes to be inserted, surrounded by the ITRs for recognition by the transposase.
  • the transposon and ITRs may be provided on a vector that contains homology arms on each end of the ITRs.
  • the transposon system i.e., the transposon vector and corresponding transposase
  • the transposon when delivered with the nuclease, may serve the function of the DNA repair template used in HDR. That is, following the creation of one or more DSB by the nuclease, the transposon may be inserted into the target DNA based on the homology arms. In some embodiments, the transposon insertion may occur between the two ends generated by a DSB. In other embodiments, the transposon may be inserted between one arm of a first DSB and the other arm at a second DSB in the target DNA (i.e., replacing the sequence between two DSBs).
  • each complexing system may include common characteristics in order to be effective.
  • nucleic acids may be complexed with a poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelle with at least 40% efficiency. Such minimum efficiency ensures delivery of enough active molecule to achieve efficient DNA cleavage and/or other modification, and that the product can be reproducibly generated at a low cost.
  • the poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles may be designed to be stable, yet to provide facile release of the complexed payload once the micelle has been taken up intracellularly, thereby avoiding endosomal retrafficking and ensuring release of the nucleic acids.
  • the vector i.e., transposon
  • the vector may be designed to provide stable expression.
  • the gene editing tools provided in poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelle described herein may be beneficial for a number of in vivo applications.
  • the embodiment materials may be delivered to various cell types in order to cut or to repair gene defects.
  • Such cells include, but are not limited to, hepatocytes, hepatic endothelial cells, immune cells, neurons, etc.
  • poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles may also be delivered to various cell types in order to silence defective genes that cause diseases (for example, delivery to retinal cells to silence mutations underlying Leber's Congenital Amaurosis).
  • Various methods may be used to generate the poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles and/or complexation of micelles and proteins and/or nucleic acids described herein.
  • conventional preparation techniques such as thin-film rehydration, direct-hydration, and electro-formation may be used to form polymeric micelles that complex with nucleic acids and/or proteins with gene editing functions into various degradable and non-degradable micelles.
  • model proteins having various sizes provides a range of sizes of functional proteins that may be used in various embodiments.
  • various DNA plasmids may be used as model nucleic acids for poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles, such as plasmid DNA encoding the mammalian expression vector for expression of green fluorescent protein (GFP) using the elongation factor 1 alpha (EF1a) promoter (i.e., pEF-GFP DNA).
  • GFP green fluorescent protein
  • EF1a elongation factor 1 alpha
  • the pEF-GFP DNA is about 5000 base-pairs, and has a molecular weight of about 3283 kDa.
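
The two figures quoted for pEF-GFP DNA (about 5000 base pairs, about 3283 kDa) imply an average mass of roughly 657 Da per base pair. The short Python sketch below shows that arithmetic. The per-base-pair value is back-calculated from the two numbers in the text and is an assumption, not a figure stated in the source; commonly quoted rules of thumb for double-stranded DNA fall in the range of about 650 to 660 Da per base pair. The function name is illustrative.

```python
# Average mass per base pair implied by the text: 3283 kDa / 5000 bp.
# This is a back-calculated assumption, not a value given in the source.
IMPLIED_DA_PER_BP = 3283.0 * 1000.0 / 5000.0  # 656.6 Da per base pair


def dsdna_mass_kda(base_pairs, da_per_bp=IMPLIED_DA_PER_BP):
    """Estimate the molecular weight of double-stranded DNA in kDa."""
    if base_pairs < 0:
        raise ValueError("base_pairs must be non-negative")
    return base_pairs * da_per_bp / 1000.0


# Estimate for a plasmid of about 5000 base pairs.
print(round(dsdna_mass_kda(5000)))  # 3283
```

The same function can be called with a different `da_per_bp` (for example 650.0) to see how sensitive the estimate is to the chosen average.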
  • the hydrophobic blocks may aggregate to form a core, leaving the hydrophilic blocks and poly(histidine), other ionizable polymers, or cationically-charged polymer blocks on the ends to form one or more surrounding layers.
  • the disclosure provides a composition comprising a guide RNA and a fusion protein or a sequence encoding the fusion protein wherein the fusion protein comprises a dCas9 and a Clo051 endonuclease or a nuclease domain thereof.
  • SaCas9 Small Cas9
  • compositions comprising a small Cas9 (SaCas9) operatively-linked to an effector.
  • the disclosure provides a fusion protein comprising, consisting essentially of or consisting of a DNA localization component and an effector molecule, wherein the effector comprises a small Cas9 (SaCas9).
  • a small Cas9 construct of the disclosure may comprise an effector comprising a type IIS endonuclease.
  • compositions comprising an inactivated, small Cas9 (dSaCas9) operatively-linked to an effector.
  • the disclosure provides a fusion protein comprising, consisting essentially of or consisting of a DNA localization component and an effector molecule, wherein the effector comprises a small, inactivated Cas9 (dSaCas9).
  • a small, inactivated Cas9 (dSaCas9) construct of the disclosure may comprise an effector comprising a type IIS endonuclease.
  • dSaCas9 Sequence: D10A and N580A mutations (bold, capitalized, and underlined) inactivate the catalytic site.
  • compositions comprising an inactivated Cas9 (dCas9) operatively-linked to an effector.
  • the disclosure provides a fusion protein comprising, consisting essentially of or consisting of a DNA localization component and an effector molecule, wherein the effector comprises an inactivated Cas9 (dCas9).
  • an inactivated Cas9 (dCas9) construct of the disclosure may comprise an effector comprising a type IIS endonuclease.
  • the dCas9 of the disclosure comprises a dCas9 isolated or derived from Streptococcus pyogenes.
  • the dCas9 comprises a dCas9 with substitutions at positions 10 and 840 of the amino acid sequence of the dCas9 which inactivate the catalytic site. In some embodiments, these substitutions are D10A and H840A.
  • amino acid sequence of the dCas9 comprises the sequence of:
  • amino acid sequence of the dCas9 comprises the sequence of:
  • Clo051 nuclease domain may comprise, consist essentially of or consist of, the amino acid sequence of:
  • an exemplary dCas9-Clo051 fusion protein may comprise, consist essentially of or consist of, the amino acid sequence of (Clo051 sequence underlined, linker bold italics, dCas9 sequence (Streptococcus pyogenes) in italics):
  • MAPKKKRKVEGIKSNI SLLKDELRGQISHI SHEYLSLIDIAFDSKQNRLFEMKVLELLVNEYGFKGRHLGGSRKP DGIVYSTTLEDNFGIIVDTKAYSEGYSLPI SQADEMERYVRENSNRDEEV PNKWWENFSEEVKKYYFVFISGSF KGKFEEQLRRLSMTTGVNGSAVNWNLLLGAEKIRSGEMTIEELERAMFNNSEFILKYGGGGSDKKYSIGLAIGT NSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSN EMAKVDDSFFHRLEESELVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIK FRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGE
  • an exemplary dCas9-Clo051 fusion protein may comprise, consist essentially of or consist of, the nucleic acid sequence of (dCas9 sequence derived from Streptococcus pyogenes):
  • the nucleic acid sequence encoding a dCas9-Clo051 fusion protein (embodiment 1) of the disclosure may comprise a DNA. In some embodiments, the nucleic acid sequence encoding a dCas9-Clo051 fusion protein (embodiment 1) of the disclosure may comprise an RNA.
  • an exemplary dCas9-Clo051 fusion protein may comprise, consist essentially of or consist of, the amino acid sequence of (Clo051 sequence underlined, linker bold italics, dCas9 sequence (Streptococcus pyogenes) in italics):
  • an exemplary dCas9-Clo051 fusion protein may comprise, consist essentially of or consist of, the nucleic acid sequence of (dCas9 sequence derived from Streptococcus pyogenes):
  • the nucleic acid sequence encoding a dCas9-Clo051 fusion protein (embodiment 2) of the disclosure may comprise a DNA. In some embodiments, the nucleic acid sequence encoding a dCas9-Clo051 fusion protein (embodiment 2) of the disclosure may comprise an RNA.
  • the disclosure provides a nanotransposon comprising: (a) a sequence encoding a transposon insert, comprising a sequence encoding a first inverted terminal repeat (ITR), a sequence encoding a second inverted terminal repeat (ITR), and an intra-ITR sequence; (b) a sequence encoding a backbone, wherein the sequence encoding the backbone comprises a sequence encoding an origin of replication having between 1 and 450 nucleotides, inclusive of the endpoints, and a sequence encoding a selectable marker having between 1 and 200 nucleotides, inclusive of the endpoints, and (c) an inter-ITR sequence.
  • the inter-ITR sequence of (c) comprises the sequence of (b).
  • the intra-ITR sequence of (a) comprises the sequence of (b).
  • the sequence encoding the backbone comprises between 1 and 600 nucleotides, inclusive of the endpoints. In some embodiments, the sequence encoding the backbone consists of between 1 and 50 nucleotides, between 50 and 100 nucleotides, between 100 and 150 nucleotides, between 150 and 200 nucleotides, between 200 and 250 nucleotides, between 250 and 300 nucleotides, between 300 and 350 nucleotides, between 350 and 400 nucleotides, between 400 and 450 nucleotides, between 450 and 500 nucleotides, between 500 and 550 nucleotides, between 550 and 600 nucleotides, each range inclusive of the endpoints.
  • the inter-ITR sequence comprises between 1 and 1000 nucleotides, inclusive of the endpoints. In some embodiments, the inter-ITR sequence consists of between 1 and 50 nucleotides, between 50 and 100 nucleotides, between 100 and 150 nucleotides, between 150 and 200 nucleotides, between 200 and 250 nucleotides, between 250 and 300 nucleotides, between 300 and 350 nucleotides, between 350 and 400 nucleotides, between 400 and 450 nucleotides, between 450 and 500 nucleotides, between 500 and 550 nucleotides, between 550 and 600 nucleotides, between 600 and 650 nucleotides, between 650 and 700 nucleotides, between 700 and 750 nucleotides, between 750 and 800 nucleotides, between 800 and 850 nucleotides, between 850 and 900 nucleotides, between 900 and 950 nucleotides, or between 950 and 1000 nucleotides, each range inclusive of the endpoints.
  • the inter-ITR sequence comprises between 1 and 200 nucleotides, inclusive of the endpoints.
  • the inter-ITR sequence consists of between 1 and 10 nucleotides, between 10 and 20 nucleotides, between 20 and 30 nucleotides, between 30 and 40 nucleotides, between 40 and 50 nucleotides, between 50 and 60 nucleotides, between 60 and 70 nucleotides, between 70 and 80 nucleotides, between 80 and 90 nucleotides, or between 90 and 100 nucleotides, each range inclusive of the endpoints.
  • the selectable marker having between 1 and 200 nucleotides, inclusive of the endpoints comprises a sequence encoding a sucrose-selectable marker.
  • the sequence encoding a sucrose-selectable marker comprises a sequence encoding an RNA-OUT sequence.
  • the sequence encoding an RNA-OUT sequence comprises or consists of 137 base pairs (bp).
  • the selectable marker having between 1 and 200 nucleotides, inclusive of the endpoints comprises a sequence encoding a fluorescent marker.
  • the selectable marker having between 1 and 200 nucleotides, inclusive of the endpoints comprises a sequence encoding a cell surface marker.
  • the sequence encoding an origin of replication having between 1 and 450 nucleotides, inclusive of the endpoints comprises a sequence encoding a mini origin of replication.
  • the sequence encoding an origin of replication having between 1 and 450 nucleotides, inclusive of the endpoints comprises a sequence encoding an R6K origin of replication.
  • the R6K origin of replication comprises an R6K gamma origin of replication.
  • the R6K origin of replication comprises an R6K mini origin of replication.
  • the R6K origin of replication comprises an R6K gamma mini origin of replication.
  • the R6K gamma mini origin of replication comprises or consists of 281 base pairs (bp).
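
The size limits stated above for the backbone can be checked mechanically. The Python sketch below is a minimal illustration using only the numbers given in the text: an origin of replication of 1 to 450 nucleotides, a selectable marker of 1 to 200 nucleotides, and, in some embodiments, a backbone of 1 to 600 nucleotides, with the 281 bp R6K gamma mini origin and the 137 bp RNA-OUT sequence as the worked example. The class, field, and function names are hypothetical, and the `other_nt` field (any additional backbone sequence) is an assumption added for illustration.

```python
from dataclasses import dataclass

# Inclusive size ranges (in nucleotides) taken from the text.
ORIGIN_RANGE = (1, 450)
MARKER_RANGE = (1, 200)
BACKBONE_RANGE = (1, 600)


@dataclass(frozen=True)
class Backbone:
    origin_nt: int
    marker_nt: int
    other_nt: int = 0  # hypothetical extra backbone sequence, if any

    @property
    def total_nt(self):
        return self.origin_nt + self.marker_nt + self.other_nt


def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high


def check_backbone(backbone):
    """Return a list of the size constraints that the backbone violates."""
    problems = []
    if not in_range(backbone.origin_nt, ORIGIN_RANGE):
        problems.append("origin of replication outside 1-450 nt")
    if not in_range(backbone.marker_nt, MARKER_RANGE):
        problems.append("selectable marker outside 1-200 nt")
    if not in_range(backbone.total_nt, BACKBONE_RANGE):
        problems.append("backbone outside 1-600 nt")
    return problems


# R6K gamma mini origin (281 bp) plus RNA-OUT marker (137 bp).
example = Backbone(origin_nt=281, marker_nt=137)
print(example.total_nt)         # 418
print(check_backbone(example))  # []
```

With these two elements alone the backbone totals 418 nucleotides, which sits inside every range listed above.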
  • the sequence encoding the backbone does not comprise a recombination site, an excision site, a ligation site or a combination thereof. In some embodiments, neither the nanotransposon nor the sequence encoding the backbone comprises a product of a recombination site, an excision site, a ligation site or a combination thereof. In some embodiments, neither the nanotransposon nor the sequence encoding the backbone is derived from a recombination site, an excision site, a ligation site or a combination thereof.
  • a recombination site comprises a sequence resulting from a recombination event.
  • a recombination site comprises a sequence that is a product of a recombination event.
  • the recombination event comprises an activity of a recombinase (e.g., a recombinase site).
  • the sequence encoding the backbone does not further comprise a sequence encoding foreign DNA.
  • the inter-ITR sequence does not comprise a recombination site, an excision site, a ligation site or a combination thereof. In some embodiments, the inter-ITR sequence does not comprise a product of a recombination event, an excision event, a ligation event or a combination thereof. In some embodiments, the inter-ITR sequence is not derived from a recombination event, an excision event, a ligation event or a combination thereof.
  • the inter-ITR sequence comprises a sequence encoding foreign DNA.
  • the intra-ITR sequence comprises at least one sequence encoding an insulator and a sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell.
  • the mammalian cell is a human cell.
  • the intra-ITR sequence comprises a first sequence encoding an insulator, a sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell and a second sequence encoding an insulator.
  • the intra-ITR sequence comprises a first sequence encoding an insulator, a sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell, a polyadenosine (poly A) sequence and a second sequence encoding an insulator.
  • the intra-ITR sequence comprises a first sequence encoding an insulator, a sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell, at least one exogenous sequence, a polyadenosine (poly A) sequence and a second sequence encoding an insulator.
  • the sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell is capable of expressing an exogenous sequence in a human cell.
  • the sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell comprises a sequence encoding a constitutive promoter.
  • the sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell comprises a sequence encoding an inducible promoter.
  • the intra-ITR sequence comprises a first sequence encoding a first promoter capable of expressing an exogenous sequence in a mammalian cell and a second sequence encoding a second promoter capable of expressing an exogenous sequence in a mammalian cell, wherein the first promoter is a constitutive promoter, wherein the second promoter is an inducible promoter, and wherein the first sequence encoding the first promoter and the second sequence encoding the second promoter are oriented in opposite directions.
  • the sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell comprises a sequence encoding a cell-type or tissue-type specific promoter.
  • the sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell comprises a sequence encoding an EF1a promoter, a sequence encoding a CMV promoter, a sequence encoding an MND promoter, a sequence encoding an SV40 promoter, a sequence encoding a PGK1 promoter, a sequence encoding a Ubc promoter, a sequence encoding a CAG promoter, a sequence encoding an H1 promoter, or a sequence encoding a U6 promoter.
  • the polyadenosine (polyA) sequence is isolated or derived from a viral polyA sequence. In some embodiments, the polyadenosine (polyA) sequence is isolated or derived from an (SV40) polyA sequence.
  • the at least one exogenous sequence comprises an inducible proapoptotic polypeptide. In some embodiments, the inducible caspase polypeptide comprises (a) a ligand binding region, (b) a linker, and (c) a caspase polypeptide, wherein the inducible proapoptotic polypeptide does not comprise a non-human sequence.
  • the inducible caspase polypeptide comprises (a) a ligand binding region, (b) a linker, and (c) a truncated caspase 9 polypeptide, wherein the inducible proapoptotic polypeptide does not comprise a non-human sequence.
  • the ligand binding region comprises a FK506 binding protein 12 (FKBP12) polypeptide.
  • the amino acid sequence of the ligand binding region comprises a FK506 binding protein 12 (FKBP12) polypeptide.
  • the FK506 binding protein 12 (FKBP12) polypeptide comprises a modification at position 36 of the sequence.
  • the modification comprises a substitution of valine (V) for phenylalanine (F) at position 36 (F36V).
  • the FKBP12 polypeptide is encoded by an amino acid sequence comprising
  • the FKBP12 polypeptide is encoded by a nucleic acid sequence comprising
  • the linker region is encoded by an amino acid comprising GGGGS (SEQ ID NO: 27) or a nucleic acid sequence comprising GGAGGAGGAGGATCC (SEQ ID NO: 28).
  • the nucleic acid sequence encoding the linker does not comprise a restriction site.
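
The correspondence between the linker's nucleic acid sequence (SEQ ID NO: 28) and its amino acid sequence (SEQ ID NO: 27) can be confirmed by translating the DNA with the standard genetic code. The Python sketch below is one way to do that. It builds the standard codon table from the conventional T, C, A, G ordering, and the `translate` helper is an illustrative name rather than anything defined in the source.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, read in frame from its first base."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


LINKER_DNA = "GGAGGAGGAGGATCC"  # SEQ ID NO: 28
print(translate(LINKER_DNA))    # GGGGS, matching SEQ ID NO: 27
```

Four GGA codons encode the four glycines and the final TCC codon encodes the serine.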
  • the truncated caspase 9 polypeptide is encoded by an amino acid sequence that does not comprise an arginine (R) at position 87 of the sequence. In some embodiments, the truncated caspase 9 polypeptide is encoded by an amino acid sequence that does not comprise an alanine (A) at position 282 of the sequence. In some embodiments, the truncated caspase 9 polypeptide is encoded by an amino acid comprising
  • the truncated caspase 9 polypeptide is encoded by a nucleic acid sequence comprising
  • the inducible proapoptotic polypeptide is encoded by an amino acid sequence comprising GVQVETISPGDGRTFPKRGQTCVVHYTGMLEDGKKVDSSRDRNKPFKFMLGKQEVI RGWEEGVAQMSVGQRAKLTISPDYAYGATGHPGIIPPHATLVFDVELLKLEGGGGS GFGDVGALESLRGNADLAYILSMEPCGHCLIINNVNFCRESGLRTRTGSNIDCEKLRR RFSSLHFMVEVKGDLTAKKMVLALLELAQQDHGALDCCVVVILSHGCQASHLQFPG AVYGTDGCPVSVEKIVNIFNGTSCPSLGGKPKLFFIQACGGEQKDHGFEVASTSPEDE SPGSNPEPDATPFQEGLRTFDQLDAISSLPTPSDIFVSYSTFPGFVSWRDPKSGSW
  • the exogenous sequence further comprises a sequence encoding a selectable marker.
  • the sequence encoding the selectable marker comprises a sequence encoding a detectable marker.
  • the detectable marker comprises a fluorescent marker or a cell-surface marker.
  • the sequence encoding the selectable marker comprises a sequence encoding a protein that is active in dividing cells and not active in non-dividing cells.
  • the sequence encoding the selectable marker comprises a sequence encoding a metabolic marker.
  • the sequence encoding the selectable marker comprises a sequence encoding a dihydrofolate reductase (DHFR) mutein enzyme.
  • the DHFR mutein enzyme comprises or consists of the amino acid sequence of:
  • the DHFR mutein enzyme is encoded by a nucleic acid sequence comprising or consisting of
  • the amino acid sequence of the DHFR mutein enzyme further comprises a mutation at one or more of positions 80, 113, or 153.
  • the amino acid sequence of the DHFR mutein enzyme comprises one or more of a substitution of a Phenylalanine (F) or a Leucine (L) at position 80, a substitution of a Leucine (L) or a Valine (V) at position 113, and a substitution of a Valine (V) or an Aspartic Acid (D) at position 153.
  • the exogenous sequence further comprises a sequence encoding a non-naturally occurring antigen receptor, and/or a sequence encoding a therapeutic polypeptide.
  • the non-naturally occurring antigen receptor comprises a T cell Receptor (TCR).
  • TCR T cell Receptor
  • a sequence encoding the TCR comprises one or more of an insertion, a deletion, a substitution, an inversion, a transposition or a frameshift compared to a corresponding wild type sequence.
  • a sequence encoding the TCR comprises a chimeric or recombinant sequence.
  • the non-naturally occurring antigen receptor comprises a chimeric antigen receptor (CAR).
  • the CAR comprises: (a) an ectodomain comprising an antigen recognition region, (b) a transmembrane domain, and (c) an endodomain comprising at least one costimulatory domain.
  • the ectodomain of (a) of the CAR further comprises a signal peptide.
  • the ectodomain of (a) of the CAR further comprises a hinge between the antigen recognition region and the transmembrane domain.
  • the endodomain comprises a human CD3ζ endodomain.
  • the at least one costimulatory domain comprises a human 4-1BB, CD28,
  • the at least one costimulatory domain comprises a human CD28 and/or a 4-1BB costimulatory domain.
  • the antigen recognition region comprises one or more of a scFv, a VHH, a VH, and a Centyrin.
  • the exogenous sequence comprises an inducible proapoptotic polypeptide and/or the exogenous sequence comprises a sequence encoding a selectable marker
  • the exogenous sequence further comprises a sequence encoding a transposase.
  • the intra-ITR sequence comprises a sequence encoding a selectable marker, an exogenous sequence, a sequence encoding an inducible caspase polypeptide, and at least one sequence encoding a self-cleaving peptide.
  • the at least one sequence encoding a self-cleaving peptide is positioned between one or more of: (a) the sequence encoding a selectable marker and the exogenous sequence, (b) the sequence encoding a selectable marker and the inducible caspase polypeptide, and (c) the exogenous sequence and the inducible caspase polypeptide.
  • a first sequence encoding a self-cleaving peptide is positioned between the sequence encoding a selectable marker and the exogenous sequence and a second sequence encoding a self-cleaving peptide is positioned between the exogenous sequence and the inducible caspase polypeptide.
  • the at least one self-cleaving peptide comprises a T2A peptide, a GSG-T2A peptide, an E2A peptide, a GSG-E2A peptide, an F2A peptide, a GSG-F2A peptide, a P2A peptide, or a GSG-P2A peptide.
  • the T2A peptide comprises an amino acid sequence comprising
  • the GSG-T2A peptide comprises an amino acid sequence comprising GSGEGRGSLLTCGDVEENPGP (SEQ ID NO: 26).
  • the E2A peptide comprises an amino acid sequence comprising QCTNYALLKLAGDVESNPGP (SEQ ID NO: 35).
  • the GSG-E2A peptide comprises an amino acid sequence comprising
  • the F2A peptide comprises an amino acid sequence comprising VKQTLNFDLLKLAGDVESNPGP (SEQ ID NO: 37). In some embodiments, the GSG-F2A peptide comprises an amino acid sequence comprising GSGVKQTLNFDLLKLAGDVESNPGP (SEQ ID NO: 38). In some embodiments, the P2A peptide comprises an amino acid sequence comprising
  • the GSG-P2A peptide comprises an amino acid sequence comprising GSGATNFSLLKQAGDVEENPGP (SEQ ID NO: 40).
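
The naming of the self-cleaving peptides above follows a simple pattern: a "GSG-" variant is the base peptide with a Gly-Ser-Gly prefix. The Python sketch below checks that pattern against the sequences that appear in full in this excerpt. Sequences that are cut off in this excerpt (T2A, GSG-E2A and P2A) are deliberately left out rather than reconstructed. The dictionary and helper names are illustrative, and the shared C-terminal "NPGP" is an observation about the listed sequences, not a statement made in the source.

```python
# 2A peptide sequences exactly as given in the text (SEQ ID NOs in comments).
PEPTIDES = {
    "GSG-T2A": "GSGEGRGSLLTCGDVEENPGP",     # SEQ ID NO: 26
    "E2A": "QCTNYALLKLAGDVESNPGP",          # SEQ ID NO: 35
    "F2A": "VKQTLNFDLLKLAGDVESNPGP",        # SEQ ID NO: 37
    "GSG-F2A": "GSGVKQTLNFDLLKLAGDVESNPGP",  # SEQ ID NO: 38
    "GSG-P2A": "GSGATNFSLLKQAGDVEENPGP",    # SEQ ID NO: 40
}


def with_gsg(peptide):
    """Prepend the Gly-Ser-Gly prefix to a 2A peptide sequence."""
    return "GSG" + peptide


# The GSG-F2A sequence is the F2A sequence with the GSG prefix added.
print(with_gsg(PEPTIDES["F2A"]) == PEPTIDES["GSG-F2A"])  # True

# Every listed sequence ends with the same four residues.
print(all(seq.endswith("NPGP") for seq in PEPTIDES.values()))  # True
```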
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a piggyBac transposase or a piggyBac-like transposase.
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a piggyBac transposase.
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a piggyBac-like transposase.
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprise a TTAA, a TTAT or a TTAX recognition sequence.
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprise a TTAA, a TTAT or a TTAX recognition sequence and a sequence having at least 50% identity to a sequence isolated or derived from a piggyBac transposase or a piggyBac-like transposase.
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprise at least 2 nucleotides (nts), 3 nts, 4 nts, 5 nts, 6 nts, 7 nts, 8 nts, 9 nts, 10 nts, 11 nts, 12 nts, 13 nts, 14 nts, 15 nts, 16 nts, 17 nts, 18 nts, 19 nts, or 20 nts.
  • nts nucleotides
  • the at least one exogenous sequence comprises one or more of an inducible proapoptotic polypeptide, a sequence encoding a selectable marker, and an exogenous sequence
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a piggyBac transposase or a piggyBac-like transposase.
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprises the sequence of CCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG (SEQ ID NO: 41) or a sequence having at least 70% identity to the sequence of CCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG (SEQ ID NO: 41).
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprises the sequence of
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprises the sequence of CCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG (SEQ ID NO: 41) and comprises the sequence of
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprises the sequence of CCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG (SEQ ID NO: 41) and comprises the sequence of
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprises the sequence of CCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG (SEQ ID NO: 41) and comprises the sequence of
  • the at least one exogenous sequence comprises one or more of an inducible proapoptotic polypeptide, a sequence encoding a selectable marker, and an exogenous sequence
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a piggyBac transposase or a piggyBac-like transposase.
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) is recognized by a piggyBac transposase having an amino acid sequence of at least 20% identity to the amino acid sequence of
  • ITR inverted terminal repeat
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) is recognized by a piggyBac transposase having an amino acid sequence of at least 20% identity to the amino acid sequence of
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) is recognized by a piggyBac transposase having the amino acid sequence of
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a Sleeping Beauty transposase.
  • the Sleeping Beauty transposase is a hyperactive Sleeping Beauty transposase (SB100X).
  • the at least one exogenous sequence comprises one or more of an inducible proapoptotic polypeptide, a sequence encoding a selectable marker, and an exogenous sequence
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a Helitron transposase.
  • the at least one exogenous sequence comprises one or more of an inducible proapoptotic polypeptide, a sequence encoding a selectable marker, and an exogenous sequence
  • the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a Tol2 transposase.
  • the disclosure provides a cell comprising a nanotransposon of the disclosure.
  • the cell further comprises a transposase composition.
  • the transposase composition comprises a transposase or a sequence encoding the transposase that is capable of recognizing the first ITR or the second ITR of the nanotransposon.
  • the transposase composition comprises a
  • the cell comprises a first nanotransposon comprising an exogenous sequence and a second nanotransposon comprising a sequence encoding a transposase. In some embodiments, the cell is an allogeneic cell.
  • the disclosure provides a composition comprising the nanotransposon of the disclosure.
  • the disclosure provides a composition comprising the cell of the disclosure.
  • the cell comprises a nanotransposon of the disclosure.
  • the cell is not further modified.
  • the cell is allogeneic.
  • the disclosure provides a composition comprising the cell of the disclosure.
  • the cell comprises a nanotransposon of the disclosure.
  • the cell is not further modified.
  • the cell is autologous.
  • the disclosure provides a composition comprising a plurality of cells of the disclosure.
  • at least one cell of the plurality of cells comprises a nanotransposon of the disclosure.
  • a portion of the plurality of cells comprises a nanotransposon of the disclosure.
  • the portion comprises at least 1%, 2%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%,
  • each cell of the plurality of cells comprises a nanotransposon of the disclosure.
  • the plurality of cells does not comprise a modified cell of the disclosure.
  • at least one cell of the plurality of cells is not further modified.
  • none of the plurality of cells is not further modified.
  • the plurality of cells is allogeneic.
  • an allogeneic plurality of cells are produced according to the methods of the disclosure.
  • the plurality of cells is autologous. In some embodiments, an autologous plurality of cells are produced according to the methods of the disclosure.
  • the disclosure provides a modified cell comprising: (a) a nanotransposon of the disclosure; (b) a sequence encoding an inducible proapoptotic polypeptide; and wherein the cell is a T cell, (c) a modification of an endogenous sequence encoding a T cell Receptor (TCR), wherein the modification reduces or eliminates a level of expression or activity of the TCR.
  • a modified cell comprising: (a) a nanotransposon of the disclosure; (b) a sequence encoding an inducible proapoptotic polypeptide; and wherein the cell is a T cell, (c) a modification of an endogenous sequence encoding a T cell Receptor (TCR), wherein the modification reduces or eliminates a level of expression or activity of the TCR.
  • TCR T cell Receptor
  • the cell further comprises: (d) a non-naturally occurring sequence comprising an HLA class I histocompatibility antigen, alpha chain E (HLA-E), and (e) a modification of an endogenous sequence encoding Beta-2-Microglobulin (B2M), wherein the modification reduces or eliminates a level of expression or activity of a major histocompatibility complex (MHC) class I (MHC-I).
  • HLA-E alpha chain E
  • B2M Beta-2-Microglobulin
  • the disclosure provides a modified cell comprising: (a) a nanotransposon of the disclosure; (b) a sequence encoding an inducible proapoptotic polypeptide; (c) a non-naturally occurring sequence comprising an HLA class I histocompatibility antigen, alpha chain E (HLA-E), and (e) a modification of an endogenous sequence encoding Beta-2-Microglobulin (B2M), wherein the modification reduces or eliminates a level of expression or activity of a major histocompatibility complex (MHC) class I (MHC-I).
  • MHC major histocompatibility complex
  • the non-naturally occurring sequence comprising a HLA-E further comprises a sequence encoding a B2M signal peptide.
  • the non-naturally occurring sequence comprising an HLA-E further comprises a linker, wherein the linker is positioned between the sequence encoding a B2M polypeptide and the sequence encoding the HLA-E.
  • the non-naturally occurring sequence comprising an HLA-E further comprises a sequence encoding a peptide and a sequence encoding a B2M polypeptide.
  • the non-naturally occurring sequence comprising an HLA-E further comprises a first linker positioned between the sequence encoding the B2M signal peptide and the sequence encoding the peptide, and a second linker positioned between the sequence encoding the B2M polypeptide and the sequence encoding the HLA-E.
  • the cell is a mammalian cell.
  • the cell is a human cell.
  • the cell is a stem cell.
  • the cell is a differentiated cell.
  • the cell is a somatic cell.
  • the cell is an immune cell or an immune cell precursor.
  • the immune cell is a lymphoid progenitor cell, a natural killer (NK) cell, a cytokine induced killer (CIK) cell, a T lymphocyte (T cell), a B lymphocyte (B-cell) or an antigen presenting cell (APC).
  • the immune cell is a T cell, an early memory T cell, a stem cell-like T cell, a stem memory T cell (Tscm), or a central memory T cell (Tcm).
  • the immune cell precursor is a hematopoietic stem cell (HSC).
  • the cell is an antigen presenting cell (APC).
  • the cell further comprises a gene editing composition.
  • the gene editing composition comprises a sequence encoding a DNA binding domain and a sequence encoding a nuclease protein or a nuclease domain thereof.
  • the gene editing composition comprises a sequence encoding a nuclease protein or a sequence encoding a nuclease domain thereof.
  • the sequence encoding a nuclease protein or the sequence encoding a nuclease domain thereof comprises a DNA sequence, an RNA sequence, or a combination thereof.
  • the nuclease or the nuclease domain thereof comprises one or more of a CRISPR/Cas protein, a Transcription Activator-Like Effector Nuclease (TALEN), a Zinc Finger Nuclease (ZFN), and an endonuclease.
  • the CRISPR/Cas protein comprises a nuclease-inactivated Cas (dCas) protein.
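
As an illustrative aside on the self-cleaving peptide embodiments listed above, the following minimal Python sketch compares only the 2A sequences that are spelled out in full in that list (SEQ ID NOs: 26, 35, 37, 38 and 40). It checks which sequences carry the N-terminal GSG linker and computes the C-terminal residues that all of them share. The helper names are hypothetical and are not part of the disclosure.

```python
# Sequences copied from the list above; truncated entries are deliberately omitted.
PEPTIDES = {
    "GSG-T2A": "GSGEGRGSLLTCGDVEENPGP",      # SEQ ID NO: 26
    "E2A": "QCTNYALLKLAGDVESNPGP",           # SEQ ID NO: 35
    "F2A": "VKQTLNFDLLKLAGDVESNPGP",         # SEQ ID NO: 37
    "GSG-F2A": "GSGVKQTLNFDLLKLAGDVESNPGP",  # SEQ ID NO: 38
    "GSG-P2A": "GSGATNFSLLKQAGDVEENPGP",     # SEQ ID NO: 40
}


def has_gsg_linker(seq):
    """Return True if the peptide starts with the GSG linker."""
    return seq.startswith("GSG")


def strip_gsg(seq):
    """Remove a leading GSG linker, if present."""
    return seq[3:] if has_gsg_linker(seq) else seq


def common_suffix(seqs):
    """Return the longest C-terminal stretch shared by all sequences."""
    suffix = []
    # Walk the sequences from their C-termini, one aligned residue at a time.
    for column in zip(*(reversed(s) for s in seqs)):
        if len(set(column)) != 1:
            break
        suffix.append(column[0])
    return "".join(reversed(suffix))


if __name__ == "__main__":
    for name, seq in PEPTIDES.items():
        print(f"{name:8s} GSG linker: {has_gsg_linker(seq)}")
    print("Shared C-terminal residues:", common_suffix(PEPTIDES.values()))
```

For the sequences given, removing the GSG linker from SEQ ID NO: 38 yields SEQ ID NO: 37, and all five sequences end in the residues NPGP.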

Landscapes

  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • General Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biochemistry (AREA)
  • Animal Behavior & Ethology (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Epidemiology (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Medicinal Chemistry (AREA)
  • Biotechnology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Immunology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Biomedical Technology (AREA)
  • Plant Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Microbiology (AREA)
  • Dispersion Chemistry (AREA)
  • Nanotechnology (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Pharmaceuticals Containing Other Organic And Inorganic Compounds (AREA)

Abstract

Disclosed are compositions for delivering gene editing molecules to a cell. Exemplary compositions comprise a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer has at least one hydrophobic block, at least one hydrophilic block, and at least one poly(L-histidine) block, wherein: the at least one poly(L-histidine) block complexes with the at least one gene editing molecule; and the at least one poly(L-histidine) block is capable of a pH-dependent release of the at least one gene editing molecule.

Description

MICELLES FOR COMPLEXATION AND DELIVERY OF PROTEINS AND
NUCLEIC ACIDS
RELATED APPLICATIONS
[01] This application claims the benefit of provisional application USSN 62/608,518, filed December 20, 2017, the contents of which are herein incorporated by reference in their entirety.
INCORPORATION OF SEQUENCE LISTING
[02] The contents of the text file named “POTH-031/001 WO_SeqList.txt,” which was created on December 20, 2018 and is 588 KB in size, are hereby incorporated by reference in their entirety.
FIELD OF THE DISCLOSURE
[03] The present invention is directed to compositions and methods for delivery of proteins and nucleic acids, for use in, for example, targeted gene modification.
BACKGROUND
[04] Current mechanisms for delivering proteins and/or nucleic acids to target cells for gene modification, including, for example, viral-based gene delivery, have limitations including toxicity, immunogenicity, mutagenicity, payload size limits, and difficulties with large-scale production, including costs and time. Despite a long-felt need in the art, there remains a need for a method of delivering proteins and/or nucleic acids for use in gene modification that overcomes the limitations of the current technology. The disclosure provides compositions and methods for non-viral delivery that overcome the limitations of existing technologies.
SUMMARY
[05] The disclosure provides a non-viral composition for the delivery of at least one RNA or DNA sequence. In some embodiments, the RNA sequence is an mRNA sequence. In some embodiments, the RNA sequence is a regulatory RNA sequence. In some embodiments, the RNA sequence is an oligonucleotide (e.g., that may bind a complementary DNA or RNA in a cell). In some embodiments, the RNA sequence is an mRNA sequence that alters the biochemistry of the cell (e.g., an mRNA encoding protein that tamps down the response against DNA, such that administration of a nanoparticle/RNA may be followed with a second nanoparticle/DNA or electroporation/DNA without DNA toxicity).
[06] The disclosure provides a non-viral composition for the delivery of at least one sequence encoding a therapeutic protein to a cell. In some embodiments, the at least one sequence encoding a therapeutic protein is an mRNA sequence. In some embodiments, the at least one sequence encoding a therapeutic protein is a cDNA sequence. In some embodiments, the at least one sequence encoding a therapeutic protein is a non-naturally occurring sequence, including, but not limited to a sequence comprising at least one modified nucleotide, a sequence comprising a recombinant sequence, a sequence comprising a chimeric sequence, a sequence comprising a sequence encoding a self-cleaving peptide, a sequence comprising a sequence encoding an inducible proapoptotic polypeptide, a sequence comprising a sequence encoding a selection marker (e.g., a sequence encoding a DHFR mutein), or any combination thereof. Exemplary therapeutic proteins may be soluble, secreted, cell-surface linked, transmembrane or any combination thereof. Exemplary therapeutic proteins may be non-naturally occurring. Exemplary therapeutic proteins may be naturally occurring.
[07] The disclosure provides a non-viral composition for the delivery of at least one gene editing molecule to a cell.
[08] The disclosure provides a composition for the delivery of at least one gene editing molecule to a cell, the composition comprising: a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer has at least one hydrophobic block, at least one hydrophilic block, and at least one cationically-charged block, wherein: the at least one cationically-charged block complexes with the at least one gene editing molecule; and the at least one cationically-charged block is capable of intracellular delivery and release of the at least one gene editing molecule.
[09] In some embodiments, the cationically-charged block is constitutively positively charged at a physiological pH equal to or greater than 6.0. In some embodiments, the cationically-charged block is constitutively positively charged at a physiological pH of between 7.0 and 7.8. In some embodiments, the cationically-charged block comprises polyallylamine (PAA), polyamidoamine (PAMAM), polydimethylaminoethylmethacrylate (PDMAEMA), poly(2(diisopropylamino)ethyl methacrylate), polyethylenimine (PEI), branched polyethylenimine (bPEI), poly(L-lysine) (PLL), poly(L-arginine) (PLA), poly(oligoethanamino)amide) (POEAA), chitosan, succinated chitosan, 6-N,N,N-trimethyltriazole-chitosans, a polyphosphoramidate, a polyhydroxyalkanoate, poly(calixane), poly(cyclodextrin), poly(aspartamide) (pASP(DET)), a poly(aminoamide), Poly(methacrylic acid-g-ethylene glycol) (P(MAA-g-EG)), or Poly(N,N-dialkylaminoethylmethacrylates) (PDAAEMA).
[010] In some embodiments, the cationically-charged block has a neutral charge at a physiological pH of between 7.0 and 7.8 and a positive charge at a reduced pH of between 6.0 and 7.0. In some embodiments, the cationically-charged block has a neutral charge at a physiological pH of between 7.0 and 7.8 and a positive charge at a reduced pH of between 6.0 and 6.5. In some embodiments, the cationically-charged block has a neutral charge at a physiological pH of between 7.0 and 7.8 and a positive charge at a reduced pH of between 6.2 and 6.4. In some embodiments, the cationically-charged block comprises a substituted polyimidazole, a poly(L-histidine), a poly(beta)amino ester (PBAE), a poly(allylamine) hydrochloride (PAH), a poly(meth)acrylamide, or a poly(styrene-alt-maleic anhydride) (pSMA).
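
The pH-responsive behavior described in paragraph [010] can be illustrated with the Henderson-Hasselbalch relation. The sketch below is not taken from the disclosure. It assumes a single effective pKa of 6.0 for the ionizable side chain, which is approximately the textbook value for the imidazole group of free histidine; the effective pKa within a poly(L-histidine) block is not stated here and may differ. The function name is hypothetical.

```python
def protonated_fraction(ph, pka=6.0):
    """Fraction of ionizable groups carrying a positive charge at a given pH.

    Henderson-Hasselbalch for a weak base: [BH+] / ([BH+] + [B]).
    pka=6.0 is an assumed value, not one reported in the disclosure.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Compare physiological pH with the reduced pH ranges named in the text.
    for ph in (7.8, 7.4, 7.0, 6.5, 6.2, 6.0, 4.6):
        print(f"pH {ph:>3}: {protonated_fraction(ph):.1%} protonated")
```

Under this assumption the block is mostly uncharged near pH 7.4 and becomes half protonated when the pH equals the assumed pKa, which is the qualitative trend the paragraph describes.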
[011] In some embodiments, the at least one gene editing molecule comprises one or more of a protein or a nucleic acid encoding for a protein, wherein the protein is selected from the group comprising a transposase, a nuclease, and an integrase. In some embodiments, the nuclease is selected from the group comprising: a CRISPR associated protein 9 (Cas9); a type IIS restriction enzyme; a transcription activator-like effector nuclease (TALEN); and a zinc finger nuclease (ZFN).
[012] In some embodiments, the type IIS restriction enzyme comprises Clo051. In some embodiments, the Cas9 comprises a dCas9 or dSaCas9. In some embodiments, the at least one gene editing molecule comprises one or more transposable elements.
[013] In some embodiments, the one or more transposable elements comprises one or more of a piggyBac transposon, a piggyBac-like transposon, a Sleeping Beauty transposon, a Helraiser transposon, a Tol2 transposon or a LINE-1 (L1) transposon. In some embodiments, the one or more transposable elements comprise a piggyBac transposon. In some embodiments, the one or more transposable elements comprise a piggyBac-like transposon. In some embodiments, the one or more transposable elements comprise a Sleeping Beauty transposon. In some embodiments, the one or more transposable elements comprise a Helraiser transposon. In some embodiments, the one or more transposable elements comprise a Tol2 transposon.
[014] In some embodiments, the at least one gene editing molecule further comprises one or more transposase(s). In some embodiments, the one or more transposase(s) comprises a piggyBac transposase, a super piggyBac transposase (SPB), a piggyBac-like transposase, a Sleeping Beauty transposase, a hyperactive Sleeping Beauty transposase (SB100X), a Helitron transposase, a Tol2 transposase or a transposase capable of transposing a LINE-1 (L1) transposon. In some embodiments, the transposase comprises a piggyBac transposase or a super piggyBac transposase (SPB). In some embodiments, the transposase comprises a piggyBac-like transposase. In some embodiments, the transposase comprises a Sleeping Beauty transposase or a hyperactive Sleeping Beauty transposase (SB100X). In some embodiments, the transposase comprises a Helitron transposase. In some embodiments, the transposase comprises a Tol2 transposase.
[015] The disclosure provides a kit, comprising: a pharmaceutical composition for delivering at least one gene editing molecule to a cell, the composition comprising: a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer has at least one hydrophobic block, at least one hydrophilic block, and at least one cationically-charged block, wherein: the at least one cationically-charged block complexes with the at least one gene editing molecule; and the at least one cationically-charged block is capable of intracellular delivery and release of the at least one gene editing molecule; and an implement for administering the pharmaceutical composition by a systemic route or by a local route.
[016] In some embodiments, the systemic route comprises intravenous delivery, inhalation, transmucosal delivery, rectal delivery, vaginal delivery, subcutaneous delivery, intraperitoneal delivery, intrathecal delivery, intramuscular delivery or oral delivery. In some embodiments, the local route comprises topical delivery, transdermal delivery, intracerebrospinal delivery, intraspinal delivery, direct delivery to the central nervous system (CNS), intraocular delivery, intravitreal delivery, intramuscular delivery, or intraosseous delivery.
[017] In some embodiments, the cationically-charged block is constitutively positively charged at a physiological pH equal to or greater than 6.0. In some embodiments, the cationically-charged block is constitutively positively charged at a physiological pH of between 7.0 and 7.8.
[018] In some embodiments, the cationically-charged block comprises polyallylamine (PAA), polyamidoamine (PAMAM), polydimethylaminoethylmethacrylate (PDMAEMA), poly(2(diisopropylamino)ethyl methacrylate), polyethylenimine (PEI), branched polyethylenimine (bPEI), poly(L-lysine) (PLL), poly(L-arginine) (PLA), poly(oligoethanamino)amide) (POEAA), chitosan, succinated chitosan, 6-N,N,N-trimethyltriazole-chitosans, a polyphosphoramidate, a polyhydroxyalkanoate, poly(calixane), poly(cyclodextrin), poly(aspartamide) (pASP(DET)), a poly(aminoamide), Poly(methacrylic acid-g-ethylene glycol) (P(MAA-g-EG)), or Poly(N,N-dialkylaminoethylmethacrylates) (PDAAEMA).
[019] In some embodiments, the cationically-charged block has a neutral charge at a physiological pH of between 7.0 and 7.8 and a positive charge at low pH of between 3.0 and 6.0. In some embodiments, the cationically-charged block comprises a substituted polyimidazole, a poly(L-histidine), a poly(beta)amino ester (PBAE), a poly(allylamine) hydrochloride (PAH), a poly(meth)acrylamide, or a poly(styrene-alt-maleic anhydride) (pSMA).
[020] In some embodiments, the disclosure provides the use of the compositions of the disclosure for the modification of a target sequence, comprising contacting the composition and the target sequence under conditions suitable for nuclease activity.
[021] In some embodiments, the disclosure provides the use of the compositions of the disclosure for the modification of a target sequence, comprising contacting the composition and the target sequence under conditions suitable for transposase activity. In some embodiments, a target cell comprises the target sequence. In some embodiments, the target sequence is a DNA sequence. In some embodiments, the DNA sequence is a genomic sequence. In some embodiments, the target sequence is an RNA sequence. In some embodiments, the target cell is ex vivo or in vivo.
[022] In some embodiments, the disclosure provides the use of the compositions of the disclosure for the treatment of a disease or disorder, comprising administering a therapeutically-effective amount of the composition to a subject in need thereof.
[023] The disclosure provides a method of modifying a target sequence, comprising contacting any one of the compositions of the disclosure and the target sequence under conditions suitable for nuclease activity.
[024] The disclosure provides a method of modifying a target sequence, comprising contacting any one of the compositions of the disclosure and the target sequence under conditions suitable for transposase activity.
[025] In some embodiments, a target cell comprises the target sequence. In some embodiments, the target sequence is a DNA sequence. In some embodiments, the DNA sequence is a genomic sequence. In some embodiments, the target sequence is an RNA sequence. In some embodiments, the target cell is ex vivo or in vivo.
[026] The disclosure provides a method of treating a disease or disorder, comprising administering a therapeutically-effective amount of any one of the compositions of the disclosure to a subject in need thereof.
BRIEF DESCRIPTION OF THE DRAWINGS
[027] Figure 1A is a table depicting PLA polymerization times, micelle formation techniques, and mean diameter sizes of nanoparticles in the diblock copolymer micelle model of Example 1. As shown, using the particular test combination of PLA polymerization for 6 hours (25 PLA units) and sonication of the copolymers in phosphate-buffered saline (PBS), the mean diameter of the resulting micelles was 247 nm.
[028] Figure 1B is a graph depicting the size distribution for the PEO-b-PLA micelles generated using the same test combination (i.e., 6 hours PLA polymerization and sonication in PBS) shown in Figure 1A and Example 1.
[029] Figure 1C is a graph showing the z-potential distribution of the PEO-b-PLA micelles generated using the same test combination (i.e., 6 hours PLA polymerization and sonication in PBS) shown in Figures 1A-B and Example 1. As demonstrated, the z-potential of the tested PEO-b-PLA micelle is about -12.20 mV.
[030] Figure 2 is a graph depicting the absorbance of light at a wavelength of 560 nm by the micelles with different concentrations of the DIL dye in solution. In particular, the graph may be used to quantify how much DIL dye can be bound to the hydrophobic portion of the micelles. Specifically, it was found that 1 mg of the PEO-b-PLA micelles was able to load around 4 mM of the DIL dye.
[031] Figure 3A is a table depicting PHIS polymerization times, micelle formation techniques, and mean diameter sizes of the resulting nanoparticles of the diblock copolymer micelle model of Example 1. Using the particular combination of PHIS polymerization for 48 hours and thin film rehydration (TFR) of the block copolymers in dichloromethane (DCM) of the copolymers in PBS, the mean diameter of the resulting micelles was 248 nm.
[032] Figure 3B is a graph showing the size distribution (around 248 nm in diameter) for the PEO-b-PLA-b-PHIS micelles generated using the same preparation parameters (i.e., 6 hours PLA polymerization, 48 hours PHIS polymerization, and THS in DCM).
[033] Figure 3C is a graph of the z-potential distribution of the PEO-b-PLA-b-PHIS micelles generated using the same preparation parameters (i.e., 6 hours PLA polymerization, 48 hours PHIS polymerization, and THS in DCM). As demonstrated, the z-potential of the tested PEO-b-PLA-b-PHIS micelle is about -18 mV.
[034] Figure 4 is a table depicting the variation in properties of the PEO-b-PLA-b-PHIS micelles at different pH values. As shown, the micelles were the smallest at a pH of around 7, with a mean diameter size of around 316 nm. When the pH was substantially raised or lowered, the mean diameter size increased. At the lower pH, such increase is likely due to the micelle swelling based on poly(histidine) chains gaining positive charges and growing.
[035] Figure 5 is a photograph of a gel electrophoresis depicting DNA + mRNA encapsulation and release from PEO-PLA-PHIS particles. 1% agarose gel electrophoresis was used to demonstrate the encapsulation of DNA and mRNA into PEO-PLA-PHIS particles (well 1). Exposure of particles to an acidic pH of 4.6 causes protonation of PHIS and disruption of particle conformation, resulting in plasmid release as observed in the DNA band from well 2 in the gel image. Plasmid release can also be triggered by surfactant exposure from the loading dye containing SDS, as can be seen in well 3. The DNA band from release was compared to the band resulting from running DNA alone in the gel (well 4).
[036] Figure 6A is a graph of the average diameter of PEO-b-PLA-b-PHIS micelles complexed with BSA as a function of pH as discussed in Example 1.
[037] Figure 6B is a graph of the amount of released BSA as a function of pH as discussed in Example 1.
[038] Figure 7 is a series of photographs and FACS plots showing the transfection efficiency results from Example 1. HepG2 cells were seeded overnight in 24-well plates at 50,000 cells/well. Cells were exposed to different formulations in Opti-MEM Media (DNA alone, Lipofectamine + DNA + mRNA and PEO-PLA-PHIS + DNA + mRNA) at a final concentration of 500 ng of DNA per well. At 48 hours post-incubation, cells were analyzed for GFP expression by microscopy and flow cytometry to determine the transfection efficiency for each condition.
[039] Figure 8 is a graph depicting piggyBac delivery via polymeric micelles. Evaluation of complexation of PEO-b-PLA-b-PHIS micelles with a pEF-GFP DNA vector (GFP), GFP-piggyBac transposon (GFP-Transposon), which was delivered with a second micelle that was complexed with piggyBac transposase mRNA or a DNA vector containing luciferase on a sleeping beauty transposon as well as the sleeping beauty transposase. Micelles were purified on a GPC column and a second fraction was detected as micelles containing DNA. Molar ratio of polymer to DNA cargo was 20:1.
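
To make the 20:1 polymer-to-DNA molar ratio concrete, the following sketch converts a DNA mass into the polymer mass needed to reach that ratio. It is only an illustration under stated assumptions: the plasmid length and the polymer molecular weight in the example are hypothetical placeholders, not values from this disclosure, and the figure of roughly 650 g/mol per base pair is a commonly used approximation for double-stranded DNA.

```python
AVG_BP_MW = 650.0  # g/mol per base pair of dsDNA (common approximation)


def polymer_mass_ng(dna_mass_ng, plasmid_bp, polymer_mw, molar_ratio=20.0):
    """Polymer mass (ng) giving the requested polymer:DNA molar ratio.

    plasmid_bp and polymer_mw are user-supplied; the disclosure does not
    state them for Figure 8.
    """
    dna_mol = dna_mass_ng * 1e-9 / (plasmid_bp * AVG_BP_MW)  # mol of plasmid
    polymer_mol = dna_mol * molar_ratio                      # mol of polymer
    return polymer_mol * polymer_mw * 1e9                    # back to ng


if __name__ == "__main__":
    # Hypothetical example: 1000 ng of a 5000 bp plasmid, 13,000 g/mol polymer.
    mass = polymer_mass_ng(1000.0, plasmid_bp=5000, polymer_mw=13000.0)
    print(f"Polymer required: {mass:.1f} ng")
```

Because a plasmid is far heavier per molecule than a single copolymer chain, a 20:1 molar excess of polymer can still correspond to a smaller polymer mass than DNA mass, as in this hypothetical example.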
[040] Figure 9 is a graph depicting piggyBac delivery via polymeric micelles. Evaluation of in vitro toxicity of PEO-b-PLA-b-PHIS micelles at different concentrations. Micelle toxicity in HepG2 cells was evaluated by an MTT assay. Empty micelles were incubated with cells over 3 days at the typical transfection concentration of DNA (1%) and at 10x the typical concentration (i.e., 10%).
[041] Figure 10 is a graph depicting piggyBac delivery via polymeric micelles. Evaluation of transfection efficiency in HepG2 cells. HepG2 cells were incubated with plasmid or micelle formulation containing plasmid for 3 days. Flow cytometry was used to detect transfected cells.
DETAILED DESCRIPTION
[042] A new era for genome editing technologies has recently emerged based on the development of sequence-specific nucleases. In particular, such nucleases may be used to generate DNA double strand breaks (DSBs) in precise genomic locations, and cellular repair machinery then exploited to silence or replace nucleotides and/or genes. Targeted editing of nucleic acid sequences is a highly promising approach for the study of gene function and also has the potential to provide new therapies for human genetic diseases.
[043] Current gene editing tools include, for example, various enzymes, such as endonucleases, and mobile genetic elements, such as transposons.
[044] These tools provide the potential, for example, to remove, replace, or add nucleotide bases to native DNA in order to correct or induce a point mutation, as well as to change a nucleotide base in order to correct or induce a frame shift mutation. Further, such tools may enable removing, inserting or modifying pieces of DNA containing a plurality of codons as part of one or more gene(s).
[045] Currently, mechanisms for delivering proteins and/or nucleic acids to target cells include using viral vectors. However, viral-based gene delivery has limitations including toxicity, immunogenicity, mutagenicity, payload size limitations, and difficulties with large-scale production, including costs and time.
[046] Progress has been made in the delivery of functional nucleic acids, using both viral vectors (e.g., retrovirus, adenovirus, etc.) and non-viral vectors. For example, wild-type AAV has attracted considerable interest from gene therapy researchers due to a number of features, such as the virus's apparent lack of pathogenicity. It can also infect non-dividing cells and has the ability to stably integrate into the host cell genome at a specific site (designated AAVS1) in human chromosome 19. This feature makes it somewhat more predictable than retroviruses, which present the threat of a random insertion and of mutagenesis, which is sometimes followed by development of a cancer. AAV-based gene therapy vectors form episomal concatemers in the host cell nucleus. In non-dividing cells, these concatemers remain intact for the life of the host cell. In dividing cells, AAV DNA is lost through cell division, since the episomal DNA is not replicated along with the host cell DNA. Random integration of AAV DNA into the host genome is detectable but occurs at very low frequency. AAVs also present very low immunogenicity, seemingly restricted to generation of neutralizing antibodies, while they induce no clearly defined cytotoxic response. These features, along with the ability to infect quiescent cells, demonstrate that AAVs are dominant over adenoviruses as vectors for human gene therapy. However, the use of viral vectors (including AAVs) is also associated with some disadvantages, in particular the limited size of viral genomes. For example, the AAV genome is only 4.8 kilobases (kb), and therefore cannot be used for single-vehicle delivery of the multitude of gene editing tools of the various embodiments.
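
The payload constraint noted at the end of paragraph [046] can be expressed as a simple size check. The sketch below treats the 4.8 kb AAV genome size stated above as an upper bound on total cargo, which is an optimistic simplification because some of that length is occupied by the viral terminal repeats. The component names and lengths in the example are hypothetical placeholders chosen for illustration, not values from this disclosure.

```python
AAV_GENOME_LIMIT_BP = 4800  # 4.8 kb, as stated in the paragraph above


def fits_single_vector(cargo_bp, limit_bp=AAV_GENOME_LIMIT_BP):
    """Return (fits, total_bp) for a mapping of component name -> length in bp."""
    total = sum(cargo_bp.values())
    return total <= limit_bp, total


if __name__ == "__main__":
    # Hypothetical cargo for a single-vehicle gene editing payload.
    cargo = {
        "nuclease_coding_sequence": 4100,
        "promoter": 500,
        "guide_expression_cassette": 400,
        "polyadenylation_signal": 200,
    }
    fits, total = fits_single_vector(cargo)
    print(f"Total cargo: {total} bp; fits within {AAV_GENOME_LIMIT_BP} bp: {fits}")
```

With these placeholder lengths the combined cargo exceeds the limit, which illustrates why a single AAV particle may be unable to carry several gene editing components at once.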
[047] Further drawbacks to the use of viruses to deliver gene editing tools may include targeting only dividing cells, random insertion into the host genome, risk of replication, and possible host immune reaction, as well as limitations on payload size imposed by the viral capsid.
[048] In general, non-viral vectors are typically easy to manufacture, less likely to produce immune reactions, and do not produce replication reactions compared to viral vectors; however, existing methods are generally ineffective for in vivo introduction of genetic material into cells and have resulted in relatively low gene expression. Specifically, a number of existing non-viral systems have been recently explored for delivery of gene editing tools in the form of proteins and/or nucleic acids to cells. Such systems may be broadly classified as:
"nanocapsules" in which a slurry of free DNA/RNA/protein is wrapped with a polymer or peptide; "bioconjugates" (e.g., lipids, synthetic macromolecules, etc.) that target the nucleic acid, including via binding to specific proteins expressed by target cells to enable cellular internalization; and "lipid-based vehicles" (e.g., liposomes, lipid-based nanoparticles, etc.) modified with cationic/ionizable amphiphilic polymers to self-assemble with the nucleic acids based on charge. Each of these non-viral systems presents its own set of issues with respect to encapsulating either single or a multitude of gene editing tools in a single delivery vehicle. For example, in a nanocapsule system, the structure is highly unstable and may leak its contents into the vasculature after intravenous administration. As such, the capability to achieve intracellular delivery and release of a sufficient quantity of material components necessary for effective gene editing is unlikely.
[049] In a bioconjugate system, the use of a vector of sufficient size will expose the protein or nucleic acid directly to nucleases in the blood stream/cytosol and can cause fragmentation and destruction of the payload. In lipid-based vehicles, the charged delivery systems have demonstrated poor loading capacity and difficult release of encapsulated payload, which is especially true for larger nucleic acids such as mRNA and DNA.
[050] Polymeric micelles have been extensively studied for their potential applications in the drug delivery field. Polymeric micelles are formed by amphiphilic block copolymers, which can self-assemble into nano-sized core/shell structures in an aqueous environment via hydrophobic or ion pair interactions between polymer segments. Such micelles generally are able to solubilize the insoluble drugs, avoid non-selective uptake by the reticuloendothelial system (RES), and utilize the enhanced permeability and retention (EPR) effect for passive targeting. In this manner, a drug's solubility and pharmacokinetic profiles may be significantly improved through the use of micelles.
[051] Polymeric micelles used for drug delivery have in some cases shown capabilities in attenuating nonspecific toxicities and enhancing drug delivery to desired sites, resulting in improved therapeutic efficacy. Synthetic amphiphilic copolymers may be beneficial tools for drug delivery because they are highly versatile in terms of composition and architecture. Further, micelles may be customized, for example, by modifying the hydrophilic block using functional groups. Such functional groups may include, for example, targeting ligands, such as monoclonal antibodies, or intracellular drug delivery moieties, such as cell-penetrating peptides (CPPs), etc.
[052] While nanoparticles have been reported to accumulate preferentially in certain regions due to passive and/or active targeting, their inefficient drug release can be another barrier that may significantly lower a drug's efficacy. For example, surface PEO chains may inhibit the cellular uptake of long circulating nanoparticles following intracellular events. Therefore, quicker and more controllable payload release remains a target for nanoparticle systems such as micelles.
[053] Therefore, an effective vehicle for delivering nucleic acids, such as mRNA and/or large DNA plasmids, to target cells is needed.
[054] The various embodiments will be described in detail with reference to the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. References made to particular examples and implementations are for illustrative purposes, and are not intended to limit the scope of the claims.
[055] It is to be appreciated that certain features that are, for clarity, described herein in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features that are, for brevity, described in the context of a single embodiment, may also be provided separately or in any sub-combination. Further, reference to values stated in ranges includes each and every value within that range.
[056] As used in this specification and the appended claims, the singular forms "a," "an," and "the" include plural referents unless the content clearly dictates otherwise.
[057] The word "plurality" is used herein to mean more than one. When a range of values is expressed, another embodiment includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by use of the antecedent "about," it will be understood that the particular value forms another embodiment. All ranges are inclusive and combinable.
[058] The terms "subject" and "patient" are used interchangeably herein to refer to human patients, whereas the term "subject" may also refer to any animal. It should be understood that in various embodiments, the subject may be a mammal, a non-human animal, a canine and/or a vertebrate.
[059] The term "monomeric units" is used herein to mean a unit of polymer molecule containing the same or similar number of atoms as one of the monomers. Monomeric units, as used in this specification, may be of a single type (homogeneous) or a variety of types (heterogeneous).
[060] The term "polymer" is used according to its ordinary meaning of a macromolecule comprising connected monomeric molecules.
[061] The term "amphiphilic" is used herein to mean a substance containing both polar (water-soluble) and hydrophobic (water-insoluble) groups.
[062] The term "an effective amount" is used herein to refer to an amount of a compound, material, or composition effective to achieve a particular biological result such as, but not limited to, biological results disclosed, described, or exemplified herein. Such results may include, but are not limited to, the effective reduction of symptoms associated with any of the disease states mentioned herein, as determined by any means suitable in the art. As recognized by those of ordinary skill in the art, the effective amount of an agent, e.g., a nuclease, an integrase, a transposase, a recombinase, a hybrid protein, a fusion protein, a protein dimer, a complex of a protein (or protein dimer) and a polynucleotide, or a polynucleotide, may vary depending on various factors as, for example, on the desired biological response, the specific allele, genome, target site, cell, or tissue being targeted, and the agent being used.
[063] The term "membrane" is used herein to mean a spatially distinct collection of molecules that defines a two-dimensional surface in three-dimensional space, and thus separates one space from another in at least a local sense.
[064] The term "active agent" is used herein to refer to any protein, peptide, sugar, saccharide, nucleoside, inorganic compound, lipid, nucleic acid, small synthetic chemical compound, or organic compound that appreciably alters or affects the biological system into which it is introduced.
[065] The term "vehicle" is used herein to refer to agents with no inherent therapeutic benefit but when combined with an active agent for the purposes of delivery into a cell result in modification of the active agent's properties, including but not limited to its mechanism or mode of in vivo delivery, its concentration, bioavailability, absorption, distribution and elimination for the benefit of improving product efficacy and safety, as well as patient convenience and compliance.
[066] The term "carrier" is used herein to describe a delivery vehicle that is used to incorporate a pharmaceutically active agent for the purposes of drug delivery.
[067] The term "homopolymer" is used herein to refer to a polymer derived from one monomeric species of polymer.
[068] The term "copolymer" is used herein to refer to a polymer derived from two (or more) monomeric species of polymer, as opposed to a homopolymer where only one monomer is used. Since a copolymer consists of at least two types of constituent units (also structural units), copolymers may be classified based on how these units are arranged along the chain.
[069] The term "block copolymers" is used herein to refer to a copolymer that includes two or more homopolymer subunits linked by covalent bonds in which the union of the homopolymer subunits may require an intermediate non-repeating subunit, known as a junction block. Block copolymers with two or three distinct blocks are referred to herein as "diblock copolymers" and "triblock copolymers," respectively.
[070] The term "loading capacity" is used herein to refer to the weight of a particular compound within a carrier divided by the total weight of carrier. The terms "complexation efficiency" and "loading efficiency" are interchangeably used herein to refer to the weight of a particular compound that is complexed with and/or incorporated within a carrier suspension divided by the weight of the original compound in solution prior to forming a complex (expressed as a %).
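By way of illustration only, the two ratios defined in the preceding paragraph may be sketched as follows. The function names and the numerical weights are hypothetical and are not taken from the disclosure; the sketch assumes that all weights are expressed in the same unit.

```python
def loading_capacity(compound_weight: float, carrier_weight: float) -> float:
    """Weight of compound within the carrier divided by total carrier weight."""
    if carrier_weight <= 0:
        raise ValueError("carrier_weight must be positive")
    return compound_weight / carrier_weight


def complexation_efficiency(complexed_weight: float, initial_weight: float) -> float:
    """Percent of the compound originally in solution that was complexed."""
    if initial_weight <= 0:
        raise ValueError("initial_weight must be positive")
    return 100.0 * complexed_weight / initial_weight


# Hypothetical example values (same weight unit throughout, e.g. mg).
lc = loading_capacity(compound_weight=1.0, carrier_weight=10.0)
ce = complexation_efficiency(complexed_weight=8.0, initial_weight=10.0)
print(f"loading capacity = {lc:.2f}, complexation efficiency = {ce:.1f}%")
```

Loading capacity is left as a plain ratio, while complexation efficiency is scaled to a percentage, mirroring the "(expressed as a %)" qualifier in the definition.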
[071] The terms "nucleic acid" and "nucleic acid molecule" are used interchangeably herein to refer to a compound with a nucleobase and an acidic moiety, e.g., a nucleoside, a nucleotide, or a polymer of nucleotides. Typically, polymeric nucleic acids, e.g., nucleic acid molecules comprising three or more nucleotides are linear molecules, in which adjacent nucleotides are linked to each other via a phosphodiester or a phosphorothioate linkage. In some embodiments, "nucleic acid" refers to individual nucleic acid residues (e.g. nucleotides and/or nucleosides). In some embodiments, "nucleic acid" refers to an oligonucleotide chain comprising three or more individual nucleotide residues. As used herein, the terms "oligonucleotide" and "polynucleotide" can be used interchangeably to refer to a polymer of nucleotides (e.g., a string of at least three nucleotides). In some embodiments, "nucleic acid" encompasses RNA as well as single and/or double-stranded DNA. Nucleic acids may be naturally occurring, for example, in the context of a genome, a transcript, an mRNA, tRNA, rRNA, siRNA, snRNA, a plasmid, cosmid, chromosome, chromatid, or other naturally occurring nucleic acid molecule. On the other hand, a nucleic acid molecule may be a non-naturally occurring molecule, e.g., a recombinant DNA or RNA, an artificial chromosome, an engineered genome, or fragment thereof, or a synthetic DNA, RNA, DNA/RNA hybrid, or including non-naturally occurring nucleotides or nucleosides.
[072] Furthermore, the terms "nucleic acid," "DNA," "RNA," and/or similar terms include nucleic acid analogs, e.g., analogs having other than a phosphodiester backbone including a phosphorothioate linkage. Nucleic acids can be purified from natural sources, produced using recombinant expression systems and optionally purified, chemically synthesized, etc. Where appropriate, e.g., in the case of chemically synthesized molecules, nucleic acids can comprise nucleoside analogs such as analogs having chemically modified bases or sugars, and backbone modifications. A nucleic acid sequence is presented in the 5' to 3' direction unless otherwise indicated. In some embodiments, a nucleic acid is or comprises natural nucleosides (e.g. adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine); nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and 2-thiocytidine); chemically modified bases; biologically modified bases (e.g., methylated bases); intercalated bases; modified sugars (e.g., 2'-fluororibose, ribose, 2'-deoxyribose, arabinose, and hexose); and/or modified phosphate groups (e.g., phosphorothioates and 5'-N-phosphoramidite linkages).
[073] The term "nuclease" is used herein to refer to an enzyme that forms a complex with (e.g., binds or associates with) one or more nucleic acids to provide a target for cleavage, or indirect guide to another site for cleavage.
[074] The terms "treatment," "treat," and "treating," refer to a clinical intervention aimed to reverse, alleviate, delay the onset of, or inhibit the progress of a disease or disorder, or one or more symptoms thereof, as described herein. In some embodiments, treatment may be administered after one or more symptoms have developed and/or after a disease has been diagnosed. In other embodiments, treatment may be administered in the absence of symptoms, e.g., to prevent or delay onset of a symptom or inhibit onset or progression of a disease. For example, treatment may be administered to a susceptible individual prior to the onset of symptoms (e.g., in light of a history of symptoms and/or in light of genetic or other susceptibility factors). Treatment may also be continued after symptoms have resolved, for example, to prevent or delay their recurrence.
[075] The disclosure provides a composition for delivering at least one gene editing molecule to a cell, the composition comprising: a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer has at least one hydrophobic block, at least one hydrophilic block, and at least one poly(L-histidine) block, wherein: the at least one poly(L-histidine) block complexes with the at least one gene editing molecule; and the at least one poly(L-histidine) block is capable of a pH dependent release of the at least one gene editing molecule. In certain embodiments of this composition, the at least one gene editing molecule comprises one or more of a protein or a nucleic acid encoding for a protein. In certain embodiments of this composition, the at least one gene editing molecule comprises a protein and the protein is selected from the group comprising a transposase, a nuclease, and an integrase. In certain embodiments of this composition, the at least one gene editing molecule comprises one or more of a protein or a nucleic acid encoding for a protein, wherein the protein is selected from the group comprising a transposase, a nuclease, and an integrase. In certain embodiments of this composition, the nuclease or the protein having nuclease activity is selected from the group comprising: a CRISPR associated protein 9 (Cas9); a type IIS restriction enzyme; a transcription activator-like effector nuclease (TALEN); and a zinc finger nuclease (ZFN).
[076] In certain embodiments of the compositions of the disclosure, the gene editing molecule comprises a DNA-binding domain and a nuclease. In some embodiments, the DNA-binding domain comprises a guide RNA. In some embodiments, the DNA-binding domain comprises a DNA-binding domain of a TALEN. In some embodiments, the DNA-binding domain comprises a DNA-binding domain of a zinc-finger nuclease.
[077] In certain embodiments of the compositions of the disclosure, the CRISPR associated protein 9 (Cas9) is an inactivated Cas9 (dCas9). In some embodiments, the CRISPR associated protein 9 (Cas9) is truncated or short Cas9. In some embodiments, the CRISPR associated protein 9 (Cas9) is a short and inactivated Cas9 (dSaCas9). In some embodiments, the dSaCas9 of the disclosure comprises the mutations D10A and N580A (underlined and bolded) which inactivate the catalytic site. In some embodiments, the dSaCas9 (isolated or derived from Staphylococcus aureus) comprises the amino acid sequence of:
1 MKRNYILGLA IGITSVGYGI IDYETRDVID AGVRLFKEAN VENNEGRRSK RGARRLKRRR
61 RHRIQRVKKL LFDYNLLTDH SELSGINPYE ARVKGLSQKL SEEEFSAALL HIAKRRGVHN
121 VNEVEEDTGN ELSTKEQISR NSKALEEKYV AELQLERLKK DGEVRGSINR FKTSDYVKEA
181 KQLLKVQKAY HQLDQSFIDT YIDLLETRRT YYEGPGEGSP FGWKDIKEWY EMLMGHCTYF
241 PEELRSVKYA YNADLYNALN DLNNLVITRD ENEKLEYYEK FQIIENVFKQ KKKPTLKQIA
301 KEILVNEEDI KGYRVTSTGK PEFTNLKVYH DIKDITARKE IIENAELLDQ IAKILTIYQS
361 SEDIQEELTN LNSELTQEEI EQISNLKGYT GTHNLSLKAI NLILDELWHT NDNQIAIFNR
421 LKLVPKKVDL SQQKEIPTTL VDDFILSPW KRSFIQSIKV INAIIKKYGL PNDIIIELAR
481 EKNSKDAQKM INEMQKRNRQ TNERIEEIIR TTGKENAKYL IEKIKLHDMQ EGKCLYSLEA
541 IPLEDLLNNP FNYEVDHIIP RSVSFDNSFN NKVLVKQEEA SKKGNRTPFQ YLSSSDSKIS
601 YETFKKHILN LAKGKGRISK TKKEYLLEER DINRFSVQKD FINRNLVDTR YATRGLMNLL
661 RSYFRVNNLD VKVKSINGGF TSFLRRKWKF KKERNKGYKH HAEDALIIAN ADFIFKEWKK
721 LDKAKKVMEN QMFEEKQAES MPEIETEQEY KEIFITPHQI KHIKDFKDYK YSHRVDKKPN
781 RELINDTLYS TRKDDKGNTL IVNNLNGLYD KDNDKLKKLI NKSPEKLLMY HHDPQTYQKL
841 KLIMEQYGDE KNPLYKYYEE TGNYLTKYSK KDNGPVIKKI KYYGNKLNAH LDITDDYPNS
901 RNKWKLSLK PYRFDVYLDN GVYKFVTVKN LDVIKKENYY EVNSKCYEEA KKLKKISNQA
961 EFIASFYNND LIKINGELYR VIGVNNDLLN RIEVNMIDIT YREYLENMND KRPPRIIKTI
1021 ASKTQSIKKY STDILGNLYE VKSKKHPQII KKG (SEQ ID NO: 1)
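As a non-limiting sketch of how a numbered listing such as the one above may be read programmatically, the following removes the position numbers and spacing and then looks up a residue by its 1-based position. The helper names are hypothetical. Only the first row of SEQ ID NO: 1 is used here, which is enough to inspect position 10, one of the two positions named in the D10A and N580A mutations.

```python
import re


def parse_numbered_listing(text: str) -> str:
    """Keep residue letters only, dropping position numbers and whitespace."""
    return "".join(re.findall(r"[A-Za-z]+", text)).upper()


def residue_at(sequence: str, position: int) -> str:
    """Return the residue at a 1-based position, as positions are cited in the text."""
    if not 1 <= position <= len(sequence):
        raise IndexError(f"position {position} is outside 1..{len(sequence)}")
    return sequence[position - 1]


# First row of SEQ ID NO: 1 as printed above (positions 1 to 60).
first_row = (
    "1 MKRNYILGLA IGITSVGYGI IDYETRDVID AGVRLFKEAN VENNEGRRSK RGARRLKRRR"
)
seq = parse_numbered_listing(first_row)
print(len(seq), residue_at(seq, 10))
```

In this excerpt position 10 reads as alanine (A), which is consistent with the D10A mutation described for the dSaCas9.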
[078] In some embodiments, the Cas9 of the disclosure comprises a dCas9. In some embodiments, the Cas9 comprises a dCas9 isolated or derived from Streptococcus pyogenes. In some embodiments, the dCas9 comprises a substitution at position 10 and/or 840 of the amino acid sequence of the dCas9 which inactivate the catalytic site. In some embodiments, these substitutions are D10A and H840A. In some embodiments, the amino acid sequence of the dCas9 (isolated or derived from Streptococcus pyogenes) comprises the sequence of:
1 XDKKYSIGLA IGTNSVGWAV ITDEYKVPSK KFKVLGNTDR HSIKKNLIGA LLFDSGETAE
61 ATRLKRTARR RYTRRKNRIC YLQEIFSNEM AKVDDSFFHR LEESFLVEED KKHERHPIFG
121 NIVDEVAYHE KYPTIYHLRK KLVDSTDKAD LRLIYLALAH MIKFRGHFLI EGDLNPDNSD
181 VDKLFIQLVQ TYNQLFEENP INASGVDAKA ILSARLSKSR RLENLIAQLP GEKKNGLFGN
241 LIALSLGLTP NFKSNFDLAE DAKLQLSKDT YDDDLDNLLA QIGDQYADLF LAAKNLSDAI
301 LLSDILRVNT EITKAPLSAS MIKRYDEHHQ DLTLLKALVR QQLPEKYKEI FFDQSKNGYA
361 GYIDGGASQE EFYKFIKPIL EKMDGTEELL VKLNREDLLR KQRTFDNGSI PHQIHLGELH
421 AILRRQEDFY PFLKDNREKI EKILTFRIPY YVGPLARGNS RFAWMTRKSE ETITPWNFEE
481 WDKGASAQS FIERMTNFDK NLPNEKVLPK HSLLYEYFTV YNELTKVKYV TEGMRKPAFL
541 SGEQKKAIVD LLFKTNRKVT VKQLKEDYFK KIECFDSVEI SGVEDRFNAS LGTYHDLLKI
601 IKDKDFLDNE ENEDILEDIV LTLTLFEDRE MIEERLKTYA HLFDDKVMKQ LKRRRYTGWG
661 RLSRKLINGI RDKQSGKTIL DFLKSDGFAN RNFMQLIHDD SLTFKEDIQK AQVSGQGDSL
721 HEHIANLAGS PAIKKGILQT VKWDELVKV MGRHKPENIV IEMARENQTT QKGQKNSRER
781 MKRIEEGIKE LGSQILKEHP VENTQLQNEK LYLYYLQNGR DMYVDQELDI NRLSDYDVDA
841 IVPQSFLKDD SIDNKVLTRS DKNRGKSDNV PSEEWKKMK NYWRQLLNAK LITQRKFDNL
901 TKAERGGLSE LDKAGFIKRQ LVETRQITKH VAQILDSRMN TKYDENDKLI REVKVITLKS
961 KLVSDFRKDF QFYKVREINN YHHAHDAYLN AWGTALIKK YPKLESEFVY GDYKVYDVRK
1021 MIAKSEQEIG KATAKYFFYS NIMNFFKTEI TIANGEIRKR PLIETNGETG EIVWDKGRDF
1081 ATVRKVLSMP QVNIVKKTEV QTGGFSKESI LPKRNSDKLI ARKKDWDPKK YGGFDSPTVA
1141 YSVLWAKVE KGKSKKLKSV KELLGITIME RSSFEKNPID FLEAKGYKEV KKDLIIKLPK
1201 YSLFELENGR KRMLASAGEL QKGNELALPS KYVNFLYLAS HYEKLKGSPE DNEQKQLFVE
1261 QHKHYLDEII EQISEFSKRV ILADANLDKV LSAYNKHRDK PIREQAENII HLFTLTNLGA
1321 PAAFKYFDTT IDRKRYTSTK EVLDATLIHQ SITGLYETRI DLSQLGGD (SEQ ID NO: 6)
[079] In some embodiments, the amino acid sequence of the dCas9 (isolated or derived from Streptococcus pyogenes) comprises the sequence of:
1 MDKKYSIGLA IGTNSVGWAV ITDEYKVPSK KFKVLGNTDR HSIKKNLIGA LLFDSGETAE
61 ATRLKRTARR RYTRRKNRIC YLQEIFSNEM AKVDDSFFHR LEESFLVEED KKHERHPIFG
121 NIVDEVAYHE KYPTIYHLRK KLVDSTDKAD LRLIYIAIAH MIKFRGHFLI EGDLNPDNSD
181 VDKLFIQLVQ TYNQLFEENP INASGVDAKA ILSARLSKSR RLENLIAQLP GEKKNGLFGN
241 LIALSLGLTP NFKSNFDLAE DAKLQLSKDT YDDDLDNLLA QIGDQYADLF LAAKNLSDAI
301 LLSDILRVNT EITKAPLSAS MIKRYDEHHQ DLTLLKALVR QQLPEKYKEI FFDQSKNGYA
361 GYIDGGASQE EFYKFIKPIL EKMDGTEELL VKLNREDLLR KQRTFDNGSI PHQIHLGELH
421 AILRRQEDFY PFLKDNREKI EKILTFRIPY YVGPLARGNS RFAWMTRKSE ETITPWNFEE
481 WDKGASAQS FIERMTNFDK NLPNEKVLPK HSLLYEYFTV YNELTKVKYV TEGMRKPAFL
541 SGEQKKAIVD LLFKTNRKVT VKQLKEDYFK KIECFDSVEI SGVEDRFNAS LGTYHDLLKI
601 IKDKDFLDNE ENEDILEDIV LTLTLFEDRE MIEERLKTYA HLFDDKVMKQ LKRRRYTGWG
661 RLSRKLINGI RDKQSGKTIL DFLKSDGFAN RNFMQLIHDD SLTFKEDIQK AQVSGQGDSL
721 HEHIANLAGS PAIKKGILQT VKWDELVKV MGRHKPENIV IEMARENQTT QKGQKNSRER
781 MKRIEEGIKE LGSQILKEHP VENTQLQNEK LYLYYLQNGR DMYVDQELDI NRLSDYDVDA
841 IVPQSFLKDD SIDNKVLTRS DKNRGKSDNV PSEEWKKMK NYWRQLLNAK LITQRKFDNL
901 TKAERGGLSE LDKAGFIKRQ LVETRQITKH VAQILDSRMN TKYDENDKLI REVKVITLKS
961 KLVSDFRKDF QFYKVREINN YHHAHDAYLN AWGTALIKK YPKLESEFVY GDYKVYDVRK
1021 MIAKSEQEIG KATAKYFFYS NIMNFFKTEI TIANGEIRKR PLIETNGETG EIVWDKGRDF
1081 ATVRKVLSMP QVNIVKKTEV QTGGFSKESI LPKRNSDKLI ARKKDWDPKK YGGFDSPTVA
1141 YSVLWAKVE KGKSKKLKSV KELLGITIME RSSFEKNPID FLEAKGYKEV KKDLIIKLPK
1201 YSLFELENGR KRMLASAGEL QKGNELALPS KYVNFLYIAS HYEKLKGSPE DNEQKQLFVE
1261 QHKHYLDEII EQISEFSKRV ILADANLDKV LSAYNKHRDK PIREQAENII HLFTLTNLGA
1321 PAAFKYFDTT IDRKRYTSTK EVLDATLIHQ SITGLYETRI DLSQLGGD (SEQ ID NO: 6)
[080] In certain embodiments of the disclosure, the nuclease domain may comprise, consist essentially of or consist of a dCas9 or a dSaCas9 and a type IIS endonuclease. In certain embodiments of the disclosure, the nuclease domain may comprise, consist essentially of or consist of a dSaCas9 and a type IIS endonuclease, including, but not limited to, AciI, MnlI, AlwI, BbvI, BccI, BceAI, BsmAI, BsmFI, BspCNI, BsrI, BtsCI, HgaI, HphI, HpyAV, MboII, MlyI, PleI, SfaNI, AcuI, BciVI, BfuAI, BmgBI, BmrI, BpmI, BpuEI, BsaI, BseRI, BsgI, BsmI, BspMI, BsrBI, BsrBI, BsrDI, BtgZI, BtsI, EarI, EciI, MmeI, NmeAIII, BbvCI, Bpu10I, BspQI, SapI, BaeI, BsaXI, CspCI, BfiI, MboII, Acc36I, FokI or Clo051.
[081] In certain embodiments of the disclosure, the nuclease domain may comprise, consist essentially of or consist of a dCas9 or dSaCas9 and Clo051. An exemplary Clo051 nuclease domain may comprise, consist essentially of or consist of, the amino acid sequence of:
EGIKSNISLLKDELRGQISHISHEYLSLIDLAFDSKQNRLFEMKVLELLVNEYGFKGRHLGGSRKPDGIVYSTTLEDNFGIIVDTKAYSEGYSLPISQADEMERYVRENSNRDEEVNPNKWWENFSEEVKKYYFVFISGSFKGKFEEQLRRLSMTTGVNGSAVNVVNLLLGAEKIRSGEMTIEELERAMFNNSEFILKY (SEQ ID NO: 7).
[082] An exemplary dCas9-Clo051 nuclease domain may comprise, consist essentially of or consist of, the amino acid sequence of (Clo051 sequence underlined, linker bold italics, dCas9 (isolated or derived from Streptococcus pyogenes) sequence in italics):
MAPKKKRKVEGIKSNISLLKDELRGQISHISHEYLSLIDLAFDSKQNRLFEMKVLELLVNEYGFKGRHLGGSRKPDGIV
YSTTLEDNFGIIVDTKAYSEGYSLPISQADEMERYVRENSNRDEEVNPNKWWENFSEEVKKYYFVFISGSFKGKFEE
QLRRLSMTTGVNGSAVNVVNLLLGAEKIRSGEMTIEELERAMFNNSEFILKYGGGGSDKKYSIGLAIGTNSVGWAVI
TDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHR
LEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSD
VDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAE
DAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLLKALV
RQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGE
LHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMT
NFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECF
DSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRR
RYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSP
AIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQ
NEKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNY
WRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKS
KLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFF
YSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNS
DKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLII
KLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISE
FSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLY
ETRIDLSQLGGDGSPKKKRKVSS (SEQ ID NO: 8).
[083] In certain embodiments of the compositions of the disclosure, the type IIS restriction enzyme comprises one or more of AciI, MnlI, AlwI, BbvI, BccI, BceAI, BsmAI, BsmFI, BspCNI, BsrI, BtsCI, HgaI, HphI, HpyAV, MboII, MlyI, PleI, SfaNI, AcuI, BciVI, BfuAI, BmgBI, BmrI, BpmI, BpuEI, BsaI, BseRI, BsgI, BsmI, BspMI, BsrBI, BsrBI, BsrDI, BtgZI, BtsI, EarI, EciI, MmeI, NmeAIII, BbvCI, Bpu10I, BspQI, SapI, BaeI, BsaXI, CspCI, BfiI, MboII, Acc36I, FokI or Clo051. In some embodiments, the type IIS restriction enzyme comprises Clo051.
[084] An exemplary Clo051 nuclease domain may comprise, consist essentially of or consist of, the amino acid sequence of:
EGIKSNISLLKDELRGQISHISHEYLSLIDLAFDSKQNRLFEMKVLELLVNEYGFKGRHLGGSRKPDGIVYSTTLEDNFGIIVDTKAYSEGYSLPISQADEMERYVRENSNRDEEVNPNKWWENFSEEVKKYYFVFISGSFKGKFEEQLRRLSMTTGVNGSAVNVVNLLLGAEKIRSGEMTIEELERAMFNNSEFILKY (SEQ ID NO: 7).
[085] In certain embodiments of the compositions of the disclosure, the DNA binding domain or the nuclease comprises a sequence isolated or derived from a Ralstonia TALEN or from a Xanthomonas TALEN. In some embodiments, the DNA binding domain or the nuclease comprises a recombinant TALEN sequence derived from a Ralstonia TALEN, a Xanthomonas TALEN or a combination thereof.
[086] In certain embodiments of the compositions of the disclosure, the at least one gene editing molecule comprises one or more transposable element(s). In some embodiments, the one or more transposable element(s) comprise a circular DNA. In some embodiments, the one or more transposable element(s) comprise a plasmid vector or a minicircle DNA vector.
[087] In certain embodiments of the compositions of the disclosure, the at least one gene editing molecule comprises one or more transposable element(s). In some embodiments, the one or more transposable element(s) comprise a linear DNA. The linear recombinant and non-naturally occurring DNA sequence encoding a transposon may be produced in vitro. Linear recombinant and non-naturally occurring DNA sequences of the disclosure may be a product of a restriction digest of a circular DNA. In some embodiments, the circular DNA is a plasmid vector or a minicircle DNA vector. Linear recombinant and non-naturally occurring DNA sequences of the disclosure may be a product of a polymerase chain reaction (PCR). Linear recombinant and non-naturally occurring DNA sequences of the disclosure may be a double-stranded Doggybone™ DNA sequence. Doggybone™ DNA sequences of the disclosure may be produced by an enzymatic process that solely encodes an antigen expression cassette, comprising antigen, promoter, poly-A tail and telomeric ends.
[088] In certain embodiments of the compositions of the disclosure, the at least one gene editing molecule comprises one or more transposable element(s). In some embodiments, the one or more transposable element(s) comprise a piggyBac transposon, a Sleeping Beauty transposon, a Helraiser transposon, a Tol2 transposon or a LINE-1 (L1) transposon. In some embodiments, the one or more transposable elements comprise a piggyBac transposon. In some embodiments, the one or more transposable elements comprise a Sleeping Beauty transposon. In some embodiments, the one or more transposable elements comprise a Helraiser transposon. In some embodiments, the one or more transposable elements comprise a Tol2 transposon.
[089] In certain embodiments of the compositions of the disclosure, including those embodiments wherein the at least one gene editing molecule comprises one or more transposable element(s), the at least one gene editing molecule further comprises one or more transposase(s). In some embodiments, the one or more transposase(s) comprises a piggyBac transposase, a super piggyBac transposase (SPB), a Sleeping Beauty transposase, a hyperactive Sleeping Beauty transposase (SB100X), a Helitron transposase, a Tol2 transposase or a transposase capable of transposing a LINE-1 (L1) transposon. In some embodiments, including those embodiments wherein the transposon comprises a piggyBac transposon, the transposase comprises a piggyBac transposase or a super piggyBac transposase (SPB). In some embodiments, including those embodiments wherein the transposon comprises a Sleeping Beauty transposon, the transposase comprises a Sleeping Beauty transposase or a hyperactive Sleeping Beauty transposase (SB100X). In some embodiments, including those embodiments wherein the transposon comprises a Helraiser transposon, the transposase comprises a Helitron transposase. In some embodiments, including those embodiments wherein the transposon comprises a Tol2 transposon, the transposase comprises a Tol2 transposase.
[090] In certain embodiments of the compositions of the disclosure, including those embodiments wherein the at least one gene editing molecule comprises one or more transposable element(s), the at least one gene editing molecule further comprises one or more transposase(s). In some embodiments, the transposon is a piggyBac transposon. In some embodiments, and, in particular, those embodiments wherein the transposon is a piggyBac transposon, the transposase is a piggyBac™ or a Super piggyBac™ (SPB) transposase.
[091] In certain embodiments of the compositions of the disclosure, the transposon is a plasmid DNA transposon. In some embodiments, the transposon is a piggyBac transposon. In some embodiments, and, in particular, those embodiments wherein the transposon is a piggyBac transposon, the transposase is a piggyBac™ or a Super piggyBac™ (SPB) transposase. In some embodiments, and, in particular, those embodiments wherein the transposase is a Super piggyBac™ (SPB) transposase, the sequence encoding the transposase is an mRNA sequence.
[092] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac™ (PB) transposase enzyme. The piggyBac (PB) transposase enzyme may comprise or consist of an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTGATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RMYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPNEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 5).
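The "at least 75% ... identical" language above refers to percent sequence identity. The following is a minimal sketch of that calculation under a simplifying assumption: the two sequences are already aligned and of equal length, so identity is the number of matching positions divided by the aligned length. In practice an alignment tool would be used first. The function name and the variant sequence are hypothetical; the reference is the first ten residues of SEQ ID NO: 5 as printed above.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences."""
    if not reference or len(reference) != len(candidate):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(reference, candidate))
    return 100.0 * matches / len(reference)


reference = "MGSSLDDEHI"   # residues 1 to 10 of SEQ ID NO: 5
candidate = "MGSSLDDEHV"   # hypothetical variant with one mismatch
identity = percent_identity(reference, candidate)
print(f"{identity:.1f}% identical")
```

A threshold test such as `identity >= 75.0` would then express the "at least 75%" condition for this toy pair.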
[093] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at one or more of positions 30, 165, 282, or 538 of the sequence:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTGATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RMYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPNEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 5).
[094] In some embodiments, the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at two or more of positions 30, 165, 282, or 538 of the sequence of SEQ ID NO: 5. In some embodiments, the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at three or more of positions 30, 165, 282, or 538 of the sequence of SEQ ID NO: 5. In some embodiments, the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at each of the following positions 30, 165, 282, and 538 of the sequence of SEQ ID NO: 5. In some embodiments, the amino acid substitution at position 30 of the sequence of SEQ ID NO: 5 is a substitution of a valine (V) for an isoleucine (I). In some embodiments, the amino acid substitution at position 165 of the sequence of SEQ ID NO: 5 is a substitution of a serine (S) for a glycine (G). In some embodiments, the amino acid substitution at position 282 of the sequence of SEQ ID NO: 5 is a substitution of a valine (V) for a methionine (M). In some embodiments, the amino acid substitution at position 538 of the sequence of SEQ ID NO: 5 is a substitution of a lysine (K) for an asparagine (N).
[095] In certain embodiments of the methods of the disclosure, the transposase enzyme is a Super piggyBac™ (SPB) transposase enzyme. In some embodiments, the Super piggyBac™ (SPB) transposase enzymes of the disclosure may comprise or consist of the amino acid sequence of the sequence of SEQ ID NO: 5 wherein the amino acid substitution at position 30 is a substitution of a valine (V) for an isoleucine (I), the amino acid substitution at position 165 is a substitution of a serine (S) for a glycine (G), the amino acid substitution at position 282 is a substitution of a valine (V) for a methionine (M), and the amino acid substitution at position 538 is a substitution of a lysine (K) for an asparagine (N). In some embodiments, the Super piggyBac™ (SPB) transposase enzyme may comprise or consist of an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEV SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTSATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RVYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPKEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 2).
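The relationship between the piggyBac™ sequence of SEQ ID NO: 5 and the four recited substitutions (positions 30, 165, 282 and 538) can be illustrated with a short sketch. It assumes the compact notation `I30V` (original residue, 1-based position, new residue), which is a convention adopted here for illustration and is not used in the disclosure; `apply_substitutions` is likewise a hypothetical helper. The function refuses to apply a substitution when the residue found at the stated position is not the expected one, which guards against off-by-one numbering errors.

```python
import re

# SEQ ID NO: 5 as listed, with position numbers omitted and spacing removed.
PB = "".join("""
MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTGATFRD TNEDEIYAFF
GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RMYIPNKPSK YGIKILMMCD
SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPNEV
PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF
""".split())


def apply_substitutions(seq: str, subs: list[str]) -> str:
    """Apply substitutions written as '<old><1-based position><new>'."""
    residues = list(seq)
    for sub in subs:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(
                f"expected {old} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# The four substitutions recited for the Super piggyBac (SPB) enzyme.
SPB_SUBSTITUTIONS = ["I30V", "G165S", "M282V", "N538K"]
spb = apply_substitutions(PB, SPB_SUBSTITUTIONS)
changed_positions = [i + 1 for i, (a, b) in enumerate(zip(PB, spb)) if a != b]
```

Under these assumptions `changed_positions` lists exactly the four recited positions, and the rest of the sequence is left untouched.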
[096] In certain embodiments of the methods of the disclosure, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac™ or Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at one or more of positions 3, 46, 82, 103, 119,
125, 177, 180, 185, 187, 200, 207, 209, 226, 235, 240, 241, 243, 258, 296, 298, 311, 315, 319, 327, 328, 340, 421, 436, 456, 470, 486, 503, 552, 570 and 591 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2. In some embodiments, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac™ or Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at one or more of positions 46, 119, 125, 177, 180, 185, 187, 200, 207, 209, 226, 235, 240, 241, 243, 296, 298, 311, 315, 319, 327, 328, 340, 421, 436, 456, 470, 485,
503, 552 and 570. In some embodiments, the amino acid substitution at position 3 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an asparagine (N) for a serine (S). In some embodiments, the amino acid substitution at position 46 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a serine (S) for an alanine (A). In some embodiments, the amino acid substitution at position 46 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a threonine (T) for an alanine (A). In some embodiments, the amino acid substitution at position 82 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tryptophan (W) for an isoleucine (I). In some embodiments, the amino acid substitution at position 103 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a serine (S). In some embodiments, the amino acid substitution at position 119 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for an arginine (R). In some embodiments, the amino acid substitution at position 125 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for a cysteine (C). In some embodiments, the amino acid substitution at position 125 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a cysteine (C). In some embodiments, the amino acid substitution at position 177 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 177 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a histidine (H) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 180 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 180 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a phenylalanine (F).
In some embodiments, the amino acid substitution at position 180 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 185 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 187 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for an alanine (A). In some embodiments, the amino acid substitution at position 200 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tryptophan (W) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 207 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a valine (V). In some embodiments, the amino acid substitution at position 209 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a valine (V). In some embodiments, the amino acid substitution at position 226 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a methionine (M). In some embodiments, the amino acid substitution at position 235 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an arginine (R) for a leucine (L). In some embodiments, the amino acid substitution at position 240 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a valine (V). In some embodiments, the amino acid substitution at position 241 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 243 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a proline (P). In some embodiments, the amino acid substitution at position 258 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a serine (S) for an asparagine (N).
In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tryptophan (W) for a leucine (L). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tyrosine (Y) for a leucine (L). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a leucine (L). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for a methionine (M). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a methionine (M). In some embodiments, the amino acid substitution at position 311 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a proline (P). In some embodiments, the amino acid substitution at position 311 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a proline (P). In some embodiments, the amino acid substitution at position 315 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for an arginine (R). In some embodiments, the amino acid substitution at position 319 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for a threonine (T). In some embodiments, the amino acid substitution at position 327 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an arginine (R) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 328 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 340 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for a cysteine (C).
In some embodiments, the amino acid substitution at position 340 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a cysteine (C). In some embodiments, the amino acid substitution at position 421 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a histidine (H) for the aspartic acid (D). In some embodiments, the amino acid substitution at position 436 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a valine (V). In some embodiments, the amino acid substitution at position 456 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tyrosine (Y) for a methionine (M). In some embodiments, the amino acid substitution at position 470 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a leucine (L). In some embodiments, the amino acid substitution at position 485 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a serine (S). In some embodiments, the amino acid substitution at position 503 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 503 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a methionine (M). In some embodiments, the amino acid substitution at position 552 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a valine (V). In some embodiments, the amino acid substitution at position 570 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a threonine (T) for an alanine (A). In some embodiments, the amino acid substitution at position 591 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a glutamine (Q). In some embodiments, the amino acid substitution at position 591 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an arginine (R) for a glutamine (Q).
[097] In certain embodiments of the methods of the disclosure, including those
embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac™ transposase enzyme may comprise or the Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at one or more of positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2. In certain embodiments of the methods of the disclosure, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac™ transposase enzyme may comprise or the Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at two, three, four, five, six or more of positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2. In some embodiments, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac™ transposase enzyme may comprise or the Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2. In some embodiments, the amino acid substitution at position 103 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a serine (S). In some embodiments, the amino acid substitution at position 194 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a methionine (M). In some embodiments, the amino acid substitution at position 372 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for an arginine (R). In some embodiments, the amino acid substitution at position 375 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for a lysine (K). In some embodiments, the amino acid substitution at position 450 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an asparagine (N) for an aspartic acid (D). 
In some embodiments, the amino acid substitution at position 509 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for a serine (S). In some embodiments, the amino acid substitution at position 570 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a serine (S) for an asparagine (N). In some embodiments, the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5. In some embodiments, including those embodiments wherein the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5, the piggyBac™
transposase enzyme may further comprise an amino acid substitution at positions 372, 375 and 450 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2. In some embodiments, the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5, a substitution of an alanine (A) for an arginine (R) at position 372 of SEQ ID NO: 5, and a substitution of an alanine (A) for a lysine (K) at position 375 of SEQ ID NO: 5. In some embodiments, the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5, a substitution of an alanine (A) for an arginine (R) at position 372 of SEQ ID NO: 5, a substitution of an alanine (A) for a lysine (K) at position 375 of SEQ ID NO: 5 and a substitution of an asparagine (N) for an aspartic acid (D) at position 450 of SEQ ID NO: 5.
[098] In certain embodiments of the compositions of the disclosure, including those embodiments wherein the at least one gene editing molecule comprises one or more transposable element(s), the at least one gene editing molecule further comprises one or more transposase(s). In some embodiments, including those embodiments wherein the transposon is a Sleeping Beauty transposon, the transposase is a Sleeping Beauty or Sleeping Beauty 100X (SB100X) transposase. In certain embodiments of the methods of the disclosure, the Sleeping Beauty transposase enzyme comprises an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
MGKSKEISQDLRKKIVDLHKSGSSLGAISKRLKVPRSSVQTIVRKYKHHGTTQPSYRSGRRRYLSPRDERTLVRKVQINPRTTAKDLVKMLEETGTKVSISTVKRVLYRHNLKGRSARKKPLLQNRHKKARLRFATAHGDKDRTFWRNVLWSDETKIELFGHNDHRYVWRKKGEACKPKNTIPTVKHGGGSIMLWGCFAAGGTGALHKIDGIMRKENYVDILKQHLKTSVRKLKLGRKWVFQMDNDPKHTSKVVAKWLKDNKVKVLEWPSQSPDLNPIENLWAELKKRVRARRPTNLTQLHQLCQEEWAKIHPTYCGKLVEGYPKRLTQVKQFKGNATKY (SEQ ID NO: 3). In some embodiments, including those wherein the Sleeping Beauty transposase is a hyperactive Sleeping Beauty SB100X transposase, the Sleeping Beauty transposase enzyme comprises an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
MGKSKEISQDLRKRIVDLHKSGSSLGAISKRLAVPRSSVQTIVRKYKHHGTTQPSYRSGRRRYLSPRDERTLVRKVQINPRTTAKDLVKMLEETGTKVSISTVKRVLYRHNLKGHSARKKPLLQNRHKKARLRFATAHGDKDRTFWRNVLWSDETKIELFGHNDHRYVWRKKGEACKPKNTIPTVKHGGGSIMLWGCFAAGGTGALHKIDGIMDAVQYVDILKQHLKTSVRKLKLGRKWVFQHDNDPKHTSKVVAKWLKDNKVKVLEWPSQSPDLNPIENLWAELKKRVRARRPTNLTQLHQLCQEEWAKIHPNYCGKLVEGYPKRLTQVKQFKGNATKY (SEQ ID NO: 4).
[099] In certain embodiments of the compositions of the disclosure, including those embodiments wherein the at least one gene editing molecule comprises one or more transposable element(s), the at least one gene editing molecule further comprises one or more transposase(s). In some embodiments, including those embodiments wherein the transposon is a Helraiser transposon, the transposase is a Helitron transposase. Helitron transposases mobilize the Helraiser transposon, an ancient element from the bat genome that was active about 30 to 36 million years ago. An exemplary Helraiser transposon of the disclosure includes Helibat1, which comprises a nucleic acid sequence comprising:
1 TCCTATATAA TAAAAGAGAA ACATGCAAAT TGACCATCCC TCCGCTACGC TCAAGCCACG
61 CCCACCAGCC AATCAGAAGT GACTATGCAA ATTAACCCAA CAAAGATGGC AGTTAAATTT
121 GCATACGCAG GTGTCAAGCG CCCCAGGAGG CAACGGCGGC CGCGGGCTCC CAGGACCTTC
181 GCTGGCCCCG GGAGGCGAGG CCGGCCGCGC CTAGCCACAC CCGCGGGCTC CCGGGACCTT
241 CGCCAGCAGA GAGCAGAGCG GGAGAGCGGG CGGAGAGCGG GAGGTTTGGA GGACTTGGCA
301 GAGCAGGAGG CCGCTGGACA TAGAGCAGAG CGAGAGAGAG GGTGGCTTGG AGGGCGTGGC
361 TCCCTCTGTC ACCCCAGCTT CCTCATCACA GCTGTGGAAA CTGACAGCAG GGAGGAGGAA
421 GTCCCACCCC CACAGAATCA GCCAGAATCA GCCGTTGGTC AGACAGCTCT CAGCGGCCTG
481 ACAGCCAGGA CTCTCATTCA CCTGCATCTC AGACCGTGAC AGTAGAGAGG TGGGACTATG
541 TCTAAAGAAC AACTGTTGAT ACAACGTAGC TCTGCAGCCG AAAGATGCCG GCGTTATCGA
601 CAGAAAATGT CTGCAGAGCA ACGTGCGTCT GATCTTGAAA GAAGGCGGCG CCTGCAACAG
661 AATGTATCTG AAGAGCAGCT ACTGGAAAAA CGTCGCTCTG AAGCCGAAAA ACAGCGGCGT
721 CATCGACAGA AAATGTCTAA AGACCAACGT GCCTTTGAAG TTGAAAGAAG GCGGTGGCGA
781 CGACAGAATA TGTCTAGAGA ACAGTCATCA ACAAGTACTA CCAATACCGG TAGGAACTGC
841 CTTCTCAGCA AAAATGGAGT ACATGAGGAT GCAATTCTCG AACATAGTTG TGGTGGAATG
901 ACTGTTCGAT GTGAATTTTG CCTATCACTA AATTTCTCTG ATGAAAAACC ATCCGATGGG
961 AAATTTACTC GATGTTGTAG CAAAGGGAAA GTCTGTCCAA ATGATATACA TTTTCCAGAT
1021 TACCCGGCAT ATTTAAAAAG ATTAATGACA AACGAAGATT CTGACAGTAA AAATTTCATG
1081 GAAAATATTC GTTCCATAAA TAGTTCTTTT GCTTTTGCTT CCATGGGTGC AAATATTGCA
1141 TCGCCATCAG GATATGGGCC ATACTGTTTT AGAATACACG GACAAGTTTA TCACCGTACT
1201 GGAACTTTAC ATCCTTCGGA TGGTGTTTCT CGGAAGTTTG CTCAACTCTA TATTTTGGAT
1261 ACAGCCGAAG CTACAAGTAA AAGATTAGCA ATGCCAGAAA ACCAGGGCTG CTCAGAAAGA
1321 CTCATGATCA ACATCAACAA CCTCATGCAT GAAATAAATG AATTAACAAA ATCGTACAAG
1381 ATGCTACATG AGGTAGAAAA GGAAGCCCAA TCTGAAGCAG CAGCAAAAGG TATTGCTCCC
1441 ACAGAAGTAA CAATGGCGAT TAAATACGAT CGTAACAGTG ACCCAGGTAG ATATAATTCT
1501 CCCCGTGTAA CCGAGGTTGC TGTCATATTC AGAAACGAAG ATGGAGAACC TCCTTTTGAA
1561 AGGGACTTGC TCATTCATTG TAAACCAGAT CCCAATAATC CAAATGCCAC TAAAATGAAA
1621 CAAATCAGTA TCCTGTTTCC TACATTAGAT GCAATGACAT ATCCTATTCT TTTTCCACAT
1681 GGTGAAAAAG GCTGGGGAAC AGATATTGCA TTAAGACTCA GAGACAACAG TGTAATCGAC
1741 AATAATACTA GACAAAATGT AAGGACACGA GTCACACAAA TGCAGTATTA TGGATTTCAT
1801 CTCTCTGTGC GGGACACGTT CAATCCTATT TTAAATGCAG GAAAATTAAC TCAACAGTTT
1861 ATTGTGGATT CATATTCAAA AATGGAGGCC AATCGGATAA ATTTCATCAA AGCAAACCAA
1921 TCTAAGTTGA GAGTTGAAAA ATATAGTGGT TTGATGGATT ATCTCAAATC TAGATCTGAA
1981 AATGACAATG TGCCGATTGG TAAAATGATA ATACTTCCAT CATCTTTTGA GGGTAGTCCC
2041 AGAAATATGC AGCAGCGATA TCAGGATGCT ATGGCAATTG TAACGAAGTA TGGCAAGCCC
2101 GATTTATTCA TAACCATGAC ATGCAACCCC AAATGGGCAG ATATTACAAA CAATTTACAA
2161 CGCTGGCAAA AAGTTGAAAA CAGACCTGAC TTGGTAGCCA GAGTTTTTAA TATTAAGCTG
2221 AATGCTCTTT TAAATGATAT ATGTAAATTC CATTTATTTG GCAAAGTAAT AGCTAAAATT
2281 CATGTCATTG AATTTCAGAA ACGCGGACTG CCTCACGCTC ACATATTATT GATATTAGAT
2341 AGTGAGTCCA AATTACGTTC AGAAGATGAC ATTGACCGTA TAGTTAAGGC AGAAATTCCA
2401 GATGAAGACC AGTGTCCTCG ACTTTTTCAA ATTGTAAAAT CAAATATGGT ACATGGACCA
2461 TGTGGAATAC AAAATCCAAA TAGTCCATGT ATGGAAAATG GAAAATGTTC AAAGGGATAT
2521 CCAAAAGAAT TTCAAAATGC GACCATTGGA AATATTGATG GATATCCCAA ATACAAACGA
2581 AGATCTGGTA GCACCATGTC TATTGGAAAT AAAGTTGTCG ATAACACTTG GATTGTCCCT
2641 TATAACCCGT ATTTGTGCCT TAAATATAAC TGTCATATAA ATGTTGAAGT CTGTGCATCA
2701 ATTAAAAGTG TCAAATATTT ATTTAAATAC ATCTATAAAG GGCACGATTG TGCAAATATT
2761 CAAATTTCTG AAAAAAATAT TATCAATCAT GACGAAGTAC AGGACTTCAT TGACTCCAGG
2821 TATGTGAGCG CTCCTGAGGC TGTTTGGAGA CTTTTTGCAA TGCGAATGCA TGACCAATCT
2881 CATGCAATCA CAAGATTAGC TATTCATTTG CCAAATGATC AGAATTTGTA TTTTCATACC
2941 GATGATTTTG CTGAAGTTTT AGATAGGGCT AAAAGGCATA ACTCGACTTT GATGGCTTGG
3001 TTCTTATTGA ATAGAGAAGA TTCTGATGCA CGTAATTATT ATTATTGGGA GATTCCACAG
3061 CATTATGTGT TTAATAATTC TTTGTGGACA AAACGCCGAA AGGGTGGGAA TAAAGTATTA
3121 GGTAGACTGT TCACTGTGAG CTTTAGAGAA CCAGAACGAT ATTACCTTAG ACTTTTGCTT
3181 CTGCATGTAA AAGGTGCGAT AAGTTTTGAG GATCTGCGAA CTGTAGGAGG TGTAACTTAT
3241 GATACATTTC ATGAAGCTGC TAAACACCGA GGATTATTAC TTGATGACAC TATCTGGAAA
3301 GATACGATTG ACGATGCAAT CATCCTTAAT ATGCCCAAAC AACTACGGCA ACTTTTTGCA
3361 TATATATGTG TGTTTGGATG TCCTTCTGCT GCAGACAAAT TATGGGATGA GAATAAATCT
3421 CATTTTATTG AAGATTTCTG TTGGAAATTA CACCGAAGAG AAGGTGCCTG TGTGAACTGT
3481 GAAATGCATG CCCTTAACGA AATTCAGGAG GTATTCACAT TGCATGGAAT GAAATGTTCA
3541 CATTTCAAAC TTCCGGACTA TCCTTTATTA ATGAATGCAA ATACATGTGA TCAATTGTAC
3601 GAGCAACAAC AGGCAGAGGT TTTGATAAAT TCTCTGAATG ATGAACAGTT GGCAGCCTTT
3661 CAGACTATAA CTTCAGCCAT CGAAGATCAA ACTGTACACC CCAAATGCTT TTTCTTGGAT
3721 GGTCCAGGTG GTAGTGGAAA AACATATCTG TATAAAGTTT TAACACATTA TATTAGAGGT
3781 CGTGGTGGTA CTGTTTTACC CACAGCATCT ACAGGAATTG CTGCAAATTT ACTTCTTGGT
3841 GGAAGAACCT TTCATTCCCA ATATAAATTA CCAATTCCAT TAAATGAAAC TTCAATTTCT
3901 AGACTCGATA TAAAGAGTGA AGTTGCTAAA ACCATTAAAA AGGCCCAACT TCTCATTATT
3961 GATGAATGCA CCATGGCATC CAGTCATGCT ATAAACGCCA TAGATAGATT ACTAAGAGAA
4021 ATTATGAATT TGAATGTTGC ATTTGGTGGG AAAGTTCTCC TTCTCGGAGG GGATTTTCGA
4081 CAATGTCTCA GTATTGTACC ACATGCTATG CGATCGGCCA TAGTACAAAC GAGTTTAAAG
4141 TACTGTAATG TTTGGGGATG TTTCAGAAAG TTGTCTCTTA AAACAAATAT GAGATCAGAG
4201 GATTCTGCTT ATAGTGAATG GTTAGTAAAA CTTGGAGATG GCAAACTTGA TAGCAGTTTT
4261 CATTTAGGAA TGGATATTAT TGAAATCCCC CATGAAATGA TTTGTAACGG ATCTATTATT
4321 GAAGCTACCT TTGGAAATAG TATATCTATA GATAATATTA AAAATATATC TAAACGTGCA
4381 ATTCTTTGTC CAAAAAATGA GCATGTTCAA AAATTAAATG AAGAAATTTT GGATATACTT
4441 GATGGAGATT TTCACACATA TTTGAGTGAT GATTCCATTG ATTCAACAGA TGATGCTGAA
4501 AAGGAAAATT TTCCCATCGA ATTTCTTAAT AGTATTACTC CTTCGGGAAT GCCGTGTCAT
4561 AAATTAAAAT TGAAAGTGGG TGCAATCATC ATGCTATTGA GAAATCTTAA TAGTAAATGG
4621 GGTCTTTGTA ATGGTACTAG ATTTATTATC AAAAGATTAC GACCTAACAT TATCGAAGCT
4681 GAAGTATTAA CAGGATCTGC AGAGGGAGAG GTTGTTCTGA TTCCAAGAAT TGATTTGTCC
4741 CCATCTGACA CTGGCCTCCC ATTTAAATTA ATTCGAAGAC AGTTTCCCGT GATGCCAGCA
4801 TTTGCGATGA CTATTAATAA ATCACAAGGA CAAACTCTAG ACAGAGTAGG AATATTCCTA
4861 CCTGAACCCG TTTTCGCACA TGGTCAGTTA TATGTTGCTT TCTCTCGAGT TCGAAGAGCA
4921 TGTGACGTTA AAGTTAAAGT TGTAAATACT TCATCACAAG GGAAATTAGT CAAGCACTCT
4981 GAAAGTGTTT TTACTCTTAA TGTGGTATAC AGGGAGATAT TAGAATAAGT TTAATCACTT
5041 TATCAGTCAT TGTTTGCATC AATGTTGTTT TTATATCATG TTTTTGTTGT TTTTATATCA
5101 TGTCTTTGTT GTTGTTATAT CATGTTGTTA TTGTTTATTT ATTAATAAAT TTATGTATTA
5161 TTTTCATATA CATTTTACTC ATTTCCTTTC ATCTCTCACA CTTCTATTAT AGAGAAAGGG
5221 CAAATAGCAA TATTAAAATA TTTCCTCTAA TTAATTCCCT TTCAATGTGC ACGAATTTCG
5281 TGCACCGGGC CACTAG (SEQ ID NO: 9)
[0100] The Helitron transposase comprises a RepHel motif made up of a replication initiator domain (Rep) and a DNA helicase domain. The Rep domain is a nuclease domain of the HUH superfamily of nucleases. An exemplary Helitron transposase of the disclosure comprises an amino acid sequence comprising
1 MSKEQLLIQR SSAAERCRRY RQKMSAEQRA SDLERRRRLQ QNVSEEQLLE KRRSEAEKQR
61 RHRQKMSKDQ RAFEVERRRW RRQNMSREQS STSTTNTGRN CLLSKNGVHE DAILEHSCGG
121 MTVRCEFCLS LNFSDEKPSD GKFTRCCSKG KVCPNDIHFP DYPAYLKRLM TNEDSDSKNF
181 MENIRSINSS FAFASMGANI ASPSGYGPYC FRIHGQVYHR TGTLHPSDGV SRKFAQLYIL
241 DTAEATSKRL AMPENQGCSE RLMININNLM HEINELTKSY KMLHEVEKEA QSEAAAKGIA
301 PTEVTMAIKY DRNSDPGRYN SPRVTEVAVI FRNEDGEPPF ERDLLIHCKP DPNNPNATKM
361 KQISILFPTL DAMTYPILFP HGEKGWGTDI ALRLRDNSVI DNNTRQNVRT RVTQMQYYGF
421 HLSVRDTFNP ILNAGKLTQQ FIVDSYSKME ANRINFIKAN QSKLRVEKYS GLMDYLKSRS
481 ENDNVPIGKM IILPSSFEGS PRNMQQRYQD AMAIVTKYGK PDLFITMTCN PKWADITNNL
541 QRWQKVENRP DLVARVFNIK LNALLNDICK FHLFGKVIAK IHVIEFQKRG LPHAHILLIL
601 DSESKLRSED DIDRIVKAEI PDEDQCPRLF QIVKSNMVHG PCGIQNPNSP CMENGKCSKG
661 YPKEFQNATI GNIDGYPKYK RRSGSTMSIG NKVVDNTWIV PYNPYLCLKY NCHINVEVCA
721 SIKSVKYLFK YIYKGHDCAN IQISEKNIIN HDEVQDFIDS RYVSAPEAVW RLFAMRMHDQ
781 SHAITRLAIH LPNDQNLYFH TDDFAEVLDR AKRHNSTLMA WFLLNREDSD ARNYYYWEIP
841 QHYVFNNSLW TKRRKGGNKV LGRLFTVSFR EPERYYLRLL LLHVKGAISF EDLRTVGGVT
901 YDTFHEAAKH RGLLLDDTIW KDTIDDAIIL NMPKQLRQLF AYICVFGCPS AADKLWDENK
961 SHFIEDFCWK LHRREGACVN CEMHALNEIQ EVFTLHGMKC SHFKLPDYPL LMNANTCDQL
1021 YEQQQAEVLI NSLNDEQLAA FQTITSAIED QTVHPKCFFL DGPGGSGKTY LYKVLTHYIR
1081 GRGGTVLPTA STGIAANLLL GGRTFHSQYK LPIPLNETSI SRLDIKSEVA KTIKKAQLLI
1141 IDECTMASSH AINAIDRLLR EIMNLNVAFG GKVLLLGGDF RQCLSIVPHA MRSAIVQTSL
1201 KYCNVWGCFR KLSLKTNMRS EDSAYSEWLV KLGDGKLDSS FHLGMDIIEI PHEMICNGSI
1261 IEATFGNSIS IDNIKNISKR AILCPKNEHV QKLNEEILDI LDGDFHTYLS DDSIDSTDDA
1321 EKENFPIEFL NSITPSGMPC HKLKLKVGAI IMLLRNLNSK WGLCNGTRFI IKRLRPNIIE
1381 AEVLTGSAEG EVVLIPRIDL SPSDTGLPFK LIRRQFPVMP AFAMTINKSQ GQTLDRVGIF
1441 LPEPVFAHGQ LYVAFSRVRR ACDVKVKVVN TSSQGKLVKH SESVFTLNVV YREILE (SEQ ID NO: 10).
[0101] In some embodiments, the Helitron transposase transposes the Helraiser transposable element in a Helitron transposition. In some embodiments, a hairpin close to the 3’ end of the Helraiser transposon functions as a terminator. In certain embodiments of the Helraiser sequence, the transposase is flanked by left and right terminal sequences termed LTS and RTS. These sequences terminate with a conserved 5’-TC/CTAG-3’ motif. In some embodiments, a 19 bp palindromic sequence with the potential to form the hairpin termination structure is located 11 nucleotides upstream of the RTS and consists of the sequence GTGCACGAATTTCGTGCACCGGGCCACTAG (SEQ ID NO: 11).
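One reading of the paragraph above is that the 30-nucleotide sequence of SEQ ID NO: 11 comprises the 19-nucleotide palindromic stretch followed by the 11 nucleotides that close the RTS. The sketch below works under that assumption (the split at position 19 is an interpretation, and the helper and variable names are hypothetical). It checks that the two 9-nucleotide arms of the 19-mer are reverse complements of each other, leaving a single unpaired central nucleotide, and that the sequence ends with the CTAG motif.

```python
SEQ_ID_NO_11 = "GTGCACGAATTTCGTGCACCGGGCCACTAG"

# Translation table mapping each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


# Assumed split: 19-nt palindromic stretch, then the remaining 3' nucleotides.
palindrome = SEQ_ID_NO_11[:19]
tail = SEQ_ID_NO_11[19:]

arm_length = len(palindrome) // 2            # 9 nt on each side
left_arm = palindrome[:arm_length]
loop = palindrome[arm_length:len(palindrome) - arm_length]
right_arm = palindrome[len(palindrome) - arm_length:]

# A hairpin stem can form if the left arm pairs with the right arm.
arms_can_pair = left_arm == reverse_complement(right_arm)
ends_with_ctag = SEQ_ID_NO_11.endswith("CTAG")
```

The 19-mer is not identical to its own full reverse complement because of the unpaired central nucleotide, which is why the arms are compared separately rather than testing the whole stretch at once.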
[0102] In certain embodiments of the compositions of the disclosure, including those embodiments wherein the at least one gene editing molecule comprises one or more transposable element(s), the at least one gene editing molecule further comprises one or more transposase(s). In some embodiments, including those embodiments wherein the transposon is a Tol2 transposon, the transposase is a Tol2 transposase. Tol2 transposons may be isolated or derived from the genome of the medaka fish, and may be similar to transposons of the hAT family. Exemplary Tol2 transposons of the disclosure are encoded by a sequence comprising about 4.7 kilobases and contain a gene encoding the Tol2 transposase, which contains four exons. An exemplary Tol2 transposase of the disclosure comprises an amino acid sequence comprising the following:
1 MEEVCDSSAA ASSTVQNQPQ DQEHPWPYLR EFFSLSGVNK DSFKMKCVLC LPLNKEISAF
61 KSSPSNLRKH IERMHPNYLK NYSKLTAQKR KIGTSTHASS SKQLKVDSVF PVKHVSPVTV
121 NKAILRYIIQ GLHPFSTVDL PSFKELISTL QPGISVITRP TLRSKIAEAA LIMKQKVTAA
181 MSEVEWIATT TDCWTARRKS FIGVTAHWIN PGSLERHSAA LACKRLMGSH TFEVLASAMN
241 DIHSEYEIRD KVVCTTTDSG SNFMKAFRVF GVENNDIETE ARRCESDDTD SEGCGEGSDG
301 VEFQDASRVL DQDDGFEFQL PKHQKCACHL LNLVSSVDAQ KALSNEHYKK LYRSVFGKCQ
361 ALWNKSSRSA LAAEAVESES RLQLLRPNQT RWNSTFMAVD RILQICKEAG EGALRNICTS
421 LEVPMFNPAE MLFLTEWANT MRPVAKVLDI LQAETNTQLG WLLPSVHQLS LKLQRLHHSL
481 RYCDPLVDAL QQGIQTRFKH MFEDPEIIAA AILLPKFRTS WTNDETIIKR GMDYIRVHLE
541 PLDHKKELAN SSSDDEDFFA SLKPTTHEAS KELDGYLACV SDTRESLLTF PAICSLSIKT
601 NTPLPASAAC ERLFSTAGLL FSPKRARLDT NNFENQLLLK LNLRFYNFE (SEQ ID NO: 12).
[0103] An exemplary Tol2 transposon of the disclosure, including inverted repeats, subterminal sequences and the Tol2 transposase, is encoded by a nucleic acid sequence comprising the following:
1 CAGAGGTGTA AAGTACTTGA GTAATTTTAC TTGATTACTG TACTTAAGTA TTATTTTTGG
61 GGATTTTTAC TTTACTTGAG TACAATTAAA AATCAATACT TTTACTTTTA CTTAATTACA
121 TTTTTTTAGA AAAAAAAGTA CTTTTTACTC CTTACAATTT TATTTACAGT CAAAAAGTAC
181 TTATTTTTTG GAGATCACTT CATTCTATTT TCCCTTGCTA TTACCAAACC AATTGAATTG
241 CGCTGATGCC CAGTTTAATT TAAATGTTAT TTATTCTGCC TATGAAAATC GTTTTCACAT
301 TATATGAAAT TGGTCAGACA TGTTCATTGG TCCTTTGGAA GTGACGTCAT GTCACATCTA
361 TTACCACAAT GCACAGCACC TTGACCTGGA AATTAGGGAA ATTATAACAG TCAATCAGTG
421 GAAGAAAATG GAGGAAGTAT GTGATTCATC AGCAGCTGCG AGCAGCACAG TCCAAAATCA
481 GCCACAGGAT CAAGAGCACC CGTGGCCGTA TCTTCGCGAA TTCTTTTCTT TAAGTGGTGT
541 AAATAAAGAT TCATTCAAGA TGAAATGTGT CCTCTGTCTC CCGCTTAATA AAGAAATATC
601 GGCCTTCAAA AGTTCGCCAT CAAACCTAAG GAAGCATATT GAGGTAAGTA CATTAAGTAT
661 TTTGTTTTAC TGATAGTTTT TTTTTTTTTT TTTTTTTTTT TTTTTGGGTG TGCATGTTTT
721 GACGTTGATG GCGCGCCTTT TATATGTGTA GTAGGCCTAT TTTCACTAAT GCATGCGATT
781 GACAATATAA GGCTCACGTA ATAAAATGCT AAAATGCATT TGTAATTGGT AACGTTAGGT
841 CCACGGGAAA TTTGGCGCCT ATTGCAGCTT TGAATAATCA TTATCATTCC GTGCTCTCAT
901 TGTGTTTGAA TTCATGCAAA ACACAAGAAA ACCAAGCGAG AAATTTTTTT CCAAACATGT
961 TGTATTGTCA AAACGGTAAC ACTTTACAAT GAGGTTGATT AGTTCATGTA TTAACTAACA
1021 TTAAATAACC ATGAGCAATA CATTTGTTAC TGTATCTGTT AATCTTTGTT AACGTTAGTT
1081 AATAGAAATA CAGATGTTCA TTGTTTGTTC ATGTTAGTTC ACAGTGCATT AACTAATGTT
1141 AACAAGATAT AAAGTATTAG TAAATGTTGA AATTAACATG TATACGTGCA GTTCATTATT
1201 AGTTCATGTT AACTAATGTA GTTAACTAAC GAACCTTATT GTAAAAGTGT TACCATCAAA
1261 ACTAATGTAA TGAAATCAAT TCACCCTGTC ATGTCAGCCT TACAGTCCTG TGTTTTTGTC
1321 AATATAATCA GAAATAAAAT TAATGTTTGA TTGTCACTAA ATGCTACTGT ATTTCTAAAA
1381 TCAACAAGTA TTTAACATTA TAAAGTGTGC AATTGGCTGC AAATGTCAGT TTTATTAAAG
1441 GGTTAGTTCA CCCAAAAATG AAAATAATGT CATTAATGAC TCGCCCTCAT GTCGTTCCAA
1501 GCCCGTAAGA CCTCCGTTCA TCTTCAGAAC ACAGTTTAAG ATATTTTAGA TTTAGTCCGA
1561 GAGCTTTCTG TGCCTCCATT GAGAATGTAT GTACGGTATA CTGTCCATGT CCAGAAAGGT
1621 AATAAAAACA TCAAAGTAGT CCATGTGACA TCAGTGGGTT AGTTAGAATT TTTTGAAGCA
1681 TCGAATACAT TTTGGTCCAA AAATAACAAA ACCTACGACT TTATTCGGCA TTGTATTCTC
1741 TTCCGGGTCT GTTGTCAATC CGCGTTCACG ACTTCGCAGT GACGCTACAA TGCTGAATAA
1801 AGTCGTAGGT TTTGTTATTT TTGGACCAAA ATGTATTTTC GATGCTTCAA ATAATTCTAC
1861 CTAACCCACT GATGTCACAT GGACTACTTT GATGTTTTTA TTACCTTTCT GGACATGGAC
1921 AGTATACCGT ACATACATTT TCAGTGGAGG GACAGAAAGC TCTCGGACTA AATCTAAAAT
1981 ATCTTAAACT GTGTTCCGAA GATGAACGGA GGTGTTACGG GCTTGGAACG ACATGAGGGT
2041 GAGTCATTAA TGACATCTTT TCATTTTTGG GTGAACTAAC CCTTTAATGC TGTAATCAGA
2101 GAGTGTATGT GTAATTGTTA CATTTATTGC ATACAATATA AATATTTATT TGTTGTTTTT
2161 ACAGAGAATG CACCCAAATT ACCTCAAAAA CTACTCTAAA TTGACAGCAC AGAAGAGAAA
2221 GATCGGGACC TCCACCCATG CTTCCAGCAG TAAGCAACTG AAAGTTGACT CAGTTTTCCC
2281 AGTCAAACAT GTGTCTCCAG TCACTGTGAA CAAAGCTATA TTAAGGTACA TCATTCAAGG
2341 ACTTCATCCT TTCAGCACTG TTGATCTGCC ATCATTTAAA GAGCTGATTA GTACACTGCA
2401 GCCTGGCATT TCTGTCATTA CAAGGCCTAC TTTACGCTCC AAGATAGCTG AAGCTGCTCT
2461 GATCATGAAA CAGAAAGTGA CTGCTGCCAT GAGTGAAGTT GAATGGATTG CAACCACAAC
2521 GGATTGTTGG ACTGCACGTA GAAAGTCATT CATTGGT GTA ACTGCTCACT GGATCAACCC
2581 TGGAAGTCTT GAAAGACATT CCGCTGCACT TGCCTGCAAA AGATTAATGG GCTCTCATAC
2641 TTTTGAGGTA CTGGCCAGTG CCATGAATGA TATCCACTCA GAGTATGAAA TACGTGACAA
2701 GGTTGTTTGC ACAACCACAG ACAGTGGTTC CAACTTTATG AAGGCTTTCA GAGTTTTTGG
2761 TGTGGAAAAC AATGATATCG AGACTGAGGC AAGAAGGTGT GAAAGTGATG ACACTGATTC
2821 TGAAGGCTGT GGTGAGGGAA GTGATGGT GT GGAATTCCAA GATGCCTCAC GAGTCCTGGA
2881 CCAAGACGAT GGCTTCGAAT TCCAGCTACC AAAACAT CAA AAGTGTGCCT GTCACTTACT
2941 TAACCTAGTC TCAAGCGTTG ATGCCCAAAA AGCTCTCTCA AATGAACACT ACAAGAAACT
3001 CTACAGATCT GTCTTTGGCA AATGCCAAGC TTTATGGAAT AAAAGCAGCC GATCGGCTCT
3061 AGCAGCTGAA GCTGTTGAAT CAGAAAGCCG GCTTCAGCTT TTAAGGCCAA ACCAAACGCG
3121 GTGGAATTCA ACTTTTATGG CTGTTGACAG AATTCTTCAA ATTTGCAAAG AAGCAGGAGA
3181 AGGCGCACTT CGGAATATAT GCACCTCTCT TGAGGTTCCA ATGTAAGTGT TTTTCCCCTC
3241 TATCGATGTA AACAAATGTG GGTTGTTTTT GTTTAATACT CTTTGATTAT GCTGATTTCT
3301 CCTGTAGGTT TAATCCAGCA GAAATGCTGT TCTTGACAGA GTGGGCCAAC ACAATGCGTC
3361 CAGTTGCAAA AGTACTCGAC ATCTTGCAAG CGGAAACGAA TACACAGCTG GGGTGGCTGC
3421 TGCCTAGTGT CCATCAGTTA AGCTTGAAAC TTCAGCGACT CCACCATTCT CTCAGGTACT
3481 GTGACCCACT TGTGGATGCC CTACAACAAG GAATCCAAAC ACGATTCAAG CATATGTTTG
3541 AAGATCCTGA GATCATAGCA GCTGCCATCC TTCTCCCTAA ATTTCGGACC TCTTGGACAA
3601 ATGATGAAAC CAT C AT AAAA CGAGGTAAAT GAATGCAAGC AACATACACT TGACGAATTC
3661 TAATCTGGGC AACCTTTGAG CCATACCAAA ATTATTCTTT TATTTATTTA TTTTTGCACT
3721 TTTTAGGAAT GTTATATCCC ATCTTTGGCT GTGATCTCAA TATGAATATT GATGTAAAGT
3781 ATTCTTGCAG CAGGTTGTAG TTATCCCTCA GTGTTTCTTG AAACCAAACT CATATGTATC
3841 ATATGTGGTT TGGAAATGCA GTTAGATTTT ATGCTAAAAT AAGGGATTTG CATGATTTTA
3901 GATGTAGATG ACTGCACGTA AATGTAGTTA AT GACAAAAT CCATAAAATT TGTTCCCAGT
3961 CAGAAGCCCC TCAACCAAAC TTTTCTTTGT GTCTGCTCAC TGTGCTTGTA GGCATGGACT
4021 ACATCAGAGT GCATCTGGAG CCTTTGGACC ACAAGAAGGA ATTGGCCAAC AGTTCATCTG
4081 ATGATGAAGA TTTTTTCGCT TCTTTGAAAC CGACAACACA TGAAGCCAGC AAAGAGTTGG
4141 ATGGATATCT GGCCTGTGTT TCAGACACCA GGGAGTCTCT GCTCACGTTT CCTGCTATTT
4201 GCAGCCTCTC TATCAAGACT AATACACCTC TTCCCGCATC GGCTGCCTGT GAGAGGCTTT
4261 TCAGCACTGC AGGATTGCTT TTCAGCCCCA AAAGAGCTAG GCTTGACACT AACAATTTTG
4321 AGAATCAGCT TCTACTGAAG TTAAATCTGA GGTTTTACAA CTTTGAGTAG CGT GTACTGG
4381 CATTAGATTG TCTGTCTTAT AGTTTGATAA T TAAATACAA ACAGTTCTAA AGCAGGATAA
4441 AACCTTGTAT GCATTTCATT TAAT GTTTTT TGAGATTAAA AGCTTAAACA AGAATCTCTA
4501 GTTTTCTTTC TTGCTTTTAC TTTTACTTCC TTAATACTCA AGTACAATTT TAATGGAGTA
4561 CTTTTTTACT TTTACTCAAG TAAGATTCTA GCCAGATACT TTTACTTTTA ATTGAGTAAA 4621 ATTTTCCCTA AGTACTTGTA CTTTCACTTG AGTAAAATTT TTGAGTACTT TTTACACCTC 4681 TG (SEQ ID NO: 13).
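By way of non-limiting illustration, a numbered sequence listing such as the one above may be reduced to a plain nucleotide string, and its two termini compared, with a short script. The following is a minimal sketch only: the function names are hypothetical, and the toy input reuses just the first and last groups of the listing as printed here, so the result says nothing about the full sequence.

```python
import re

# Complement table for unambiguous DNA bases plus N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def clean_listing(text: str) -> str:
    """Drop the '(SEQ ID NO: n)' tag, position numbers, and whitespace."""
    # Remove the tag first; it contains letters (e.g. N) that would
    # otherwise be mistaken for bases. Stray spaces in the tag are tolerated.
    text = re.sub(r"\(SEQ\s*I\s*D\s*NO:?\s*\d+\)\s*\.?", "", text)
    return re.sub(r"[^ACGTNacgtn]", "", text).upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a cleaned DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(seq: str) -> int:
    """Count how many bases from the 5' end pair perfectly with the 3' end."""
    length = 0
    for i in range(len(seq) // 2):
        if seq[i].translate(_COMPLEMENT) != seq[-1 - i]:
            break
        length += 1
    return length


if __name__ == "__main__":
    toy = "1 CAGAGGT GTA AAGTACTTGA 21 TTTACACCTC TG (SEQ I D NO: 13)."
    cleaned = clean_listing(toy)
    print(cleaned, terminal_inverted_repeat_length(cleaned))
```

Extraction gaps inside a listing (dropped or split characters) are not repaired by this cleaning step, so any coordinates derived from a cleaned string should be checked against the official sequence listing.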
[0104] The disclosure provides a pharmaceutical composition for delivering at least one gene editing molecule to a cell, the composition comprising: a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer has at least one hydrophobic block, at least one hydrophilic block, and at least one poly(L-histidine) block, wherein: the at least one poly(L-histidine) block complexes with the at least one gene editing molecule; and the at least one poly(L-histidine) block is capable of a pH-dependent release of the at least one gene editing molecule.
[0105] The disclosure provides a kit, comprising: a pharmaceutical composition for delivering at least one gene editing molecule to a cell, the composition comprising: a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer has at least one hydrophobic block, at least one hydrophilic block, and at least one poly(L-histidine) block, wherein: the at least one poly(L-histidine) block complexes with the at least one gene editing molecule; and the at least one poly(L-histidine) block is capable of a pH-dependent release of the at least one gene editing molecule; and an implement for administering the pharmaceutical composition. In some embodiments, the pharmaceutical composition is administered systemically or locally. In some embodiments, the pharmaceutical composition is administered intravenously, via inhalation, topically, per rectum, per the vagina, transdermally, subcutaneously, intraperitoneally, intrathecally, intramuscularly or orally.
[0106] The disclosure provides a kit, comprising: a pharmaceutical composition for delivering at least one gene editing molecule to a cell, the composition comprising: a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer has at least one hydrophobic block, at least one hydrophilic block, and at least one poly(L-histidine) block, wherein: the at least one poly(L-histidine) block complexes with the at least one gene editing molecule; and the at least one poly(L-histidine) block is capable of a pH-dependent release of the at least one gene editing molecule; and an implement for administering the pharmaceutical composition intravenously, via inhalation, topically, per rectum, per the vagina, transdermally, subcutaneously, intraperitoneally, intrathecally, intramuscularly or orally.
[0107] In certain embodiments of the compositions of the disclosure, including pharmaceutical compositions of the disclosure, the compositions comprise a micelle structure comprising a triblock copolymer capable of complexing with at least one protein or nucleic acid, wherein the triblock copolymer comprises a hydrophilic block, a hydrophobic block, and a poly(L-histidine) block. In certain embodiments of the triblock copolymer, the hydrophilic block comprises poly(ethylene oxide) (PEO). In certain embodiments of the triblock copolymer, the hydrophilic block comprises at least one aliphatic polyester. In certain embodiments of the triblock copolymer, the hydrophilic block comprises a poly(lactic acid), a poly(glycolic acid) (PGA), a poly(lactic-co-glycolic acid) (PLGA), a poly(ε-caprolactone) (PCL), a poly(3-hydroxybutyrate) (PHB) or any combination thereof. In certain embodiments of the triblock copolymer, the hydrophilic block comprises a poly(lactic acid) having an average length of 25 units.
[0108] In certain embodiments of the compositions of the disclosure, including pharmaceutical compositions of the disclosure, the compositions comprise a micelle structure comprising a triblock copolymer capable of complexing with at least one protein or nucleic acid, wherein the triblock copolymer comprises a hydrophilic block, a hydrophobic block, and a poly(L-histidine) block. In certain embodiments of the triblock copolymer, the hydrophobic block comprises a poly(ester), a poly(anhydride), a poly(peptide), an artificial poly(nucleic acid) or any combination thereof.
[0109] In certain embodiments of the compositions of the disclosure, including pharmaceutical compositions of the disclosure, the compositions comprise a micelle structure comprising a triblock copolymer capable of complexing with at least one protein or nucleic acid, wherein the triblock copolymer comprises a hydrophilic block, a hydrophobic block, and a poly(L-histidine) block. In certain embodiments of the triblock copolymer, the poly(L-histidine) block enables pH-dependent release of the at least one protein or nucleic acid. Exemplary poly(L-histidine) copolymers include, but are not limited to, non-degradable and degradable diblocks. Exemplary degradable poly(L-histidine) copolymers include, but are not limited to, PEO(5000)-b-PCL(16300) ("P2350-EOCL"); PEO(2000)-b-PMCL(11900) ("OCL"); PEO(2000)-b-PMCL(8300) ("OMCL"); PEO(1100)-b-PTMC(5100) ("OTMC"); and PEO(2000)-b-PTMC/PCL(11200) ("OTCL").
[0110] In certain embodiments of the compositions of the disclosure, including pharmaceutical compositions of the disclosure, the compositions comprise a micelle structure comprising a copolymer comprising PEO-b-PLA-b-PHIS. In some embodiments, the PEO block may comprise at least 1 monomer, 5 monomers, 10 monomers, 100 monomers, 500 monomers, 1000 monomers, 2500 monomers, 5000 monomers, 10000 monomers, 15000 monomers or any number of monomers in between. In some embodiments, the PLA block may comprise at least 1 monomer, 5 monomers, 10 monomers, 100 monomers, 500 monomers, 1000 monomers, 2500 monomers, 5000 monomers, 10000 monomers, 15000 monomers or any number of monomers in between. In some embodiments, the PHIS block may comprise at least 1 monomer, 5 monomers, 10 monomers, 100 monomers, 500 monomers, 1000 monomers, 2500 monomers, 5000 monomers, 10000 monomers, 15000 monomers or any number of monomers in between.
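By way of non-limiting illustration, the approximate molar mass contributed by each block of a PEO-b-PLA-b-PHIS copolymer may be estimated from the monomer counts. The sketch below is offered under stated assumptions: end groups are ignored, a PLA "unit" is taken to be one lactic acid repeat unit (a lactide dimer would weigh twice as much), and the monomer counts in the usage example are hypothetical values chosen only for arithmetic convenience.

```python
# Average repeat-unit masses in g/mol (end groups ignored).
REPEAT_UNIT_MASS = {
    "PEO": 44.05,    # ethylene oxide unit, C2H4O
    "PLA": 72.06,    # lactic acid unit, C3H4O2 (assumption: not the lactide dimer)
    "PHIS": 137.14,  # histidine residue, C6H7N3O
}


def block_masses(units: dict) -> dict:
    """Map each block name to its approximate molar mass in g/mol."""
    return {name: count * REPEAT_UNIT_MASS[name] for name, count in units.items()}


def total_mass(units: dict) -> float:
    """Approximate molar mass of the whole copolymer chain in g/mol."""
    return sum(block_masses(units).values())


if __name__ == "__main__":
    # Hypothetical block lengths, for illustration only.
    example = {"PEO": 100, "PLA": 25, "PHIS": 10}
    for name, mass in block_masses(example).items():
        print(f"{name}: {mass:.1f} g/mol")
    print(f"total: {total_mass(example):.1f} g/mol")
```

Such an estimate may be useful when converting between the monomer-count description used in this paragraph and the molecular-weight notation, e.g., PEO(5000), used elsewhere in the disclosure.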
[0111] In certain embodiments of the compositions of the disclosure, including pharmaceutical compositions of the disclosure, the compositions comprise a micelle structure comprising a copolymer comprising PEO-b-PLA-b-PHIS. In some embodiments, the molar ratio of polymer to cargo is 20:1, 15:1, 10:1, 5:1, or 2:1. In some embodiments, the cargo is at least one gene editing molecule of the disclosure.
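By way of non-limiting illustration, the mass of polymer corresponding to a given polymer-to-cargo molar ratio may be computed as follows. This is a minimal sketch: the function name is hypothetical, and the molecular weights and cargo mass in the usage example are placeholder values that are not taken from the disclosure.

```python
# Molar ratios of polymer to cargo recited in the paragraph above.
MOLAR_RATIOS = (20, 15, 10, 5, 2)


def polymer_mass_for_ratio(cargo_mass_ug: float, cargo_mw: float,
                           polymer_mw: float, ratio: float) -> float:
    """Return the polymer mass (micrograms) giving `ratio` mol polymer per mol cargo.

    Molecular weights are in g/mol, so micrograms / (g/mol) yields micromoles.
    """
    if min(cargo_mass_ug, cargo_mw, polymer_mw, ratio) <= 0:
        raise ValueError("all inputs must be positive")
    cargo_umol = cargo_mass_ug / cargo_mw
    return ratio * cargo_umol * polymer_mw


if __name__ == "__main__":
    # Placeholder values: 10 ug of a 160 kDa cargo and an 8 kDa polymer.
    for ratio in MOLAR_RATIOS:
        mass = polymer_mass_for_ratio(10.0, 160_000.0, 8_000.0, ratio)
        print(f"{ratio}:1 -> {mass:.2f} ug polymer")
```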
[0112] In order to develop nanoparticles with controllable release, micellar systems with triggered release mechanisms may be developed that enable the delivery of drugs or other treatment agents in response to specific stimuli. In particular, pH-sensitive polymeric micelles may be useful therapeutic agents since changes in pH occur in a variety of cellular processes and locations. For example, once a micelle enters a cell via endocytosis, the pH can drop as low as 5.5-6.0 in endosomes and 4.5-5.0 in lysosomes.
[0113] In some embodiments, cationically-charged, pH-sensitive polymers maintain a neutral charge at a pH around physiological pH (7.0-7.8) and become positively charged at a reduced pH such as that which may be found in an endosome or lysosome. In some embodiments, cationically-charged, pH-sensitive polymers that maintain a neutral charge at a pH around physiological pH (7.0-7.8), become positively charged at a reduced pH of between 6.0 and 7.0. In some embodiments, cationically-charged, pH-sensitive polymers that maintain a neutral charge at a pH around physiological pH (7.0-7.8), become positively charged at a reduced pH of between 6.0 and 6.5. In some embodiments, cationically-charged, pH-sensitive polymers that maintain a neutral charge at a pH around physiological pH (7.0-7.8), become positively charged at a reduced pH of between 6.2 and 6.4. Exemplary cationically-charged, pH-sensitive polymers are displayed in Table 1.
[0114] Table 1: Ionizable polymers that are cationically-charged dependent upon pH state.
Figure imgf000037_0001
Figure imgf000038_0002
[0115] In some embodiments, cationically-charged polymers are constitutively positively charged at a pH around physiological pH (7.0-7.8). Exemplary constitutively cationically-charged polymers are displayed in Table 2.
[0116] Table 2: Stable cationically-charged polymer
Figure imgf000038_0001
[0117] The various embodiments enable intracellular delivery of gene editing tools by complexing with ionizable and/or cationically-charged polymer-based micelles. In particular, the various embodiments provide triblock copolymers made of a hydrophilic block, a hydrophobic block, and a charged block. In some embodiments, the hydrophilic block may be poly(ethylene oxide) (PEO), and the charged block may be selected from Table 1 or Table 2. An example triblock copolymer that may be used in various embodiments is PEO-b-PLA-b-PHIS, with the number of repeating units in each block varying by design. The gene editing tools may be various molecules that are recognized as capable of modifying, repairing, adding and/or silencing genes in various cells.
[0118] The correct and efficient repair of double-strand breaks (DSBs) in DNA is critical to maintaining genome stability in cells. Structural damage to DNA may occur randomly and unpredictably in the genome due to any of a number of intracellular factors (e.g., nucleases, reactive oxygen species, etc.) as well as external forces (e.g., ionizing radiation, ultraviolet (UV) radiation, etc.). Accordingly, cells naturally possess a number of DNA repair mechanisms, which can be leveraged to alter DNA sequences through controlled DSBs at specific sites. Genetic modification tools may therefore be composed of programmable, sequence-specific DNA-binding modules associated with a nonspecific DNA nuclease, introducing DSBs into the genome. For example, CRISPR loci, mostly found in bacteria, contain short direct repeats and are part of the acquired prokaryotic immune system, conferring resistance to exogenous sequences such as plasmids and phages. RNA-guided endonucleases are programmable genetic engineering tools that are adapted from the CRISPR/CRISPR-associated protein 9 (Cas9) system, which is a component of prokaryotic adaptive immunity.
[0119] Diblock copolymers that may be used as intermediates for making triblock copolymers of the embodiment micelles may have hydrophilic biocompatible poly(ethylene oxide) (PEO), which is chemically synonymous with PEG, coupled to various hydrophobic aliphatic poly(anhydrides), poly(nucleic acids), poly(esters), poly(ortho esters), poly(peptides), poly(phosphazenes) and poly(saccharides), including but not limited to poly(lactide) (PLA), poly(glycolide) (PGA), poly(lactic-co-glycolic acid) (PLGA), poly(ε-caprolactone) (PCL), and poly(trimethylene carbonate) (PTMC). Polymeric micelles comprised of 100% PEGylated surfaces possess improved in vitro chemical stability, augmented in vivo bioavailability, and prolonged blood circulatory half-lives. For example, aliphatic polyesters, constituting the polymeric micelle's membrane portions, are degraded by hydrolysis of their ester linkages in physiological conditions such as in the human body. Because of their biodegradable nature, aliphatic polyesters have received a great deal of attention for use as implantable biomaterials in drug delivery devices, bioresorbable sutures, adhesion barriers, and as scaffolds for injury repair via tissue engineering.
[0120] In various embodiments, molecules required for gene editing (i.e., gene editing tools) may be delivered to cells using one or more micelles formed from self-assembled triblock copolymers containing cationically-charged polymer blocks. The term "gene editing" as used herein refers to the insertion, deletion or replacement of nucleic acids in genomic DNA so as to add, disrupt or modify the function of the product that is encoded by a gene. Various gene editing systems require, at a minimum, the introduction of a cutting enzyme (e.g., a nuclease or recombinase) that cuts genomic DNA to disrupt or activate gene function.
[0121] Poly(histidine) (i.e., poly(L-histidine)) is an ionizable polymer that becomes positively charged at lower pH (< 7.0) because the unsaturated nitrogen of the imidazole ring provides an electron lone pair that can accept a proton. That is, poly(histidine) has amphoteric properties through protonation-deprotonation.
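By way of non-limiting illustration, the pH dependence of imidazole protonation may be approximated with the Henderson-Hasselbalch relationship. The sketch below treats each imidazole group as an independent ionizable site with a single pKa; the default pKa of 6.0 is an assumed, typical value for a histidine side chain and is not a value stated in the disclosure. The effective pKa of a poly(histidine) block within a micelle may differ.

```python
def fraction_protonated(ph: float, pka: float = 6.0) -> float:
    """Fraction of ionizable groups carrying a proton (positive charge) at `ph`.

    Henderson-Hasselbalch for a base B with conjugate acid BH+:
        [BH+] / ([BH+] + [B]) = 1 / (1 + 10 ** (pH - pKa))
    The default pKa is an assumption for illustration only.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Representative pH values: physiological, endosomal, lysosomal.
    for ph in (7.4, 6.0, 5.0):
        print(f"pH {ph:.1f}: {fraction_protonated(ph):.3f} protonated")
```

Under these assumptions, the protonated fraction rises steeply as the pH falls from physiological values toward endosomal and lysosomal values, which is the behavior the pH-sensitive micelles are intended to exploit.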
[0122] The various embodiments enable intracellular delivery of gene editing tools by complexing with poly(histidine)-based micelles. In particular, the various embodiments provide triblock copolymers made of a hydrophilic block, a hydrophobic block, and a charged block. In some embodiments, the hydrophilic block may be poly(ethylene oxide) (PEO), and the charged block may be poly(L-histidine). An example triblock copolymer that may be used in various embodiments is PEO-b-PLA-b-PHIS, with the number of repeating units in each block varying by design. The gene editing tools may be various molecules that are recognized as capable of modifying, repairing, adding and/or silencing genes in various cells.
[0123] The correct and efficient repair of double-strand breaks (DSBs) in DNA is critical to maintaining genome stability in cells. Structural damage to DNA may occur randomly and unpredictably in the genome due to any of a number of intracellular factors (e.g., nucleases, reactive oxygen species, etc.) as well as external forces (e.g., ionizing radiation, ultraviolet (UV) radiation, etc.). Accordingly, cells naturally possess a number of DNA repair mechanisms, which can be leveraged to alter DNA sequences through controlled DSBs at specific sites. Genetic modification tools may therefore be composed of programmable, sequence-specific DNA-binding modules associated with a nonspecific DNA nuclease, introducing DSBs into the genome. For example, CRISPR loci, mostly found in bacteria, contain short direct repeats and are part of the acquired prokaryotic immune system, conferring resistance to exogenous sequences such as plasmids and phages. RNA-guided endonucleases are programmable genetic engineering tools that are adapted from the CRISPR/CRISPR-associated protein 9 (Cas9) system, which is a component of prokaryotic adaptive immunity.
[0124] Diblock copolymers that may be used as intermediates for making triblock copolymers of the embodiment micelles may have hydrophilic biocompatible poly(ethylene oxide) (PEO), which is chemically synonymous with PEG, coupled to various hydrophobic aliphatic poly(anhydrides), poly(nucleic acids), poly(esters), poly(ortho esters), poly(peptides), poly(phosphazenes) and poly(saccharides), including but not limited to poly(lactide) (PLA), poly(glycolide) (PGA), poly(lactic-co-glycolic acid) (PLGA), poly(ε-caprolactone) (PCL), and poly(trimethylene carbonate) (PTMC). Polymeric micelles comprised of 100% PEGylated surfaces possess improved in vitro chemical stability, augmented in vivo bioavailability, and prolonged blood circulatory half-lives. For example, aliphatic polyesters, constituting the polymeric micelle's membrane portions, are degraded by hydrolysis of their ester linkages in physiological conditions such as in the human body. Because of their biodegradable nature, aliphatic polyesters have received a great deal of attention for use as implantable biomaterials in drug delivery devices, bioresorbable sutures, adhesion barriers, and as scaffolds for injury repair via tissue engineering.
[0125] In various embodiments, molecules required for gene editing (i.e., gene editing tools) may be delivered to cells using one or more micelles formed from self-assembled triblock copolymers containing poly(histidine). The term "gene editing" as used herein refers to the insertion, deletion or replacement of nucleic acids in genomic DNA so as to add, disrupt or modify the function of the product that is encoded by a gene. Various gene editing systems require, at a minimum, the introduction of a cutting enzyme (e.g., a nuclease or recombinase) that cuts genomic DNA to disrupt or activate gene function.
[0126] Further, in gene editing systems that involve inserting new or existing nucleotides/nucleic acids, insertion tools (e.g., DNA template vectors or transposable elements (transposons or retrotransposons)) must be delivered to the cell in addition to the cutting enzyme (e.g., a nuclease, recombinase, integrase or transposase). Examples of such insertion tools for a recombinase may include a DNA vector. Other gene editing systems require the delivery of an integrase along with an insertion vector, a transposase along with a transposon/retrotransposon, etc. In some embodiments, an example recombinase that may be used as a cutting enzyme is the CRE recombinase. In various embodiments, example integrases that may be used in insertion tools include viral based enzymes taken from any of a number of viruses including, but not limited to, AAV, gamma retrovirus, and lentivirus. Example transposons/retrotransposons that may be used in insertion tools include, but are not limited to, the piggyBac transposon, Sleeping Beauty transposon, Helraiser transposon, Tol2 transposon and the L1 retrotransposon.
[0127] In various embodiments, nucleases that may be used as cutting enzymes include, but are not limited to, Cas9, transcription activator-like effector nucleases (TALENs) and zinc finger nucleases. In some embodiments, the Cas9 is a catalytically inactive or “inactivated” Cas9 (dCas9). In some embodiments, the Cas9 is a catalytically inactive or “inactivated” nuclease domain of Cas9. In some embodiments, the dCas9 is encoded by a shorter sequence that is derived from a full length, catalytically inactivated, Cas9, referred to herein as a “small” dCas9 or dSaCas9.
[0128] In some embodiments, the inactivated, small Cas9 (dSaCas9) is operatively linked to an active nuclease. In some embodiments, the disclosure provides a fusion protein comprising, consisting essentially of or consisting of a DNA binding domain and a nuclease, wherein the nuclease comprises a small, inactivated Cas9 (dSaCas9). In some embodiments, the dSaCas9 of the disclosure is isolated or derived from Staphylococcus aureus and comprises the mutations D10A and N580A (underlined and bolded), which inactivate the catalytic site. In some embodiments, the dSaCas9 (isolated or derived from Staphylococcus aureus) of the disclosure comprises the amino acid sequence of:
1 MKRNYILGLA IGITSVGYGI IDYETRDVID AGVRLFKEAN VENNEGRRSK RGARRLKRRR 61 RHRIQRVKKL LFDYNLLTDH SELSGINPYE ARVKGLSQKL SEEEFSAALL HLAKRRGVHN 121 VNEVEEDTGN ELSTKEQI SR NSKALEEKYV AELQLERLKK DGEVRGSINR FKTSDYVKEA 181 KQLLKVQKAY HQLDQSFIDT YIDLLETRRT YYEGPGEGSP FGWKDIKEWY EMLMGHCTYF 241 PEELRSVKYA YNADLYNALN DLNNLVITRD ENEKLEYYEK FQIIENVFKQ KKKPTLKQIA 301 KEILVNEEDI KGYRVTSTGK PEFTNLKVYH DIKDITARKE IIENAELLDQ IAKILTIYQS 361 SEDIQEELTN LNSELTQEEI EQI SNLKGYT GTHNLSLKAI NLILDELWHT NDNQIAIFNR 421 LKLVPKKVDL SQQKEIPTTL VDDFILSPW KRSFIQSIKV INAIIKKYGL PNDIIIELAR 481 EKNSKDAQKM INEMQKRNRQ TNERIEEI IR TTGKENAKYL IEKIKLHDMQ EGKCLYSLEA 541 I PLEDLLNNP FNYEVDHIIP RSVSFDNSFN NKVLVKQEEA SKKGNRTPFQ YLSSSDSKIS 601 YETFKKHILN LAKGKGRISK TKKEYLLEER DINRFSVQKD FINRNLVDTR YATRGLMNLL 661 RSYFRV NLD VKVKSINGGF TSFLRRKWKF KKERNKGYKH HAEDALI IAN ADFIFKEWKK 721 LDKAKKVMEN QMFEEKQAES MPEIETEQEY KEI FITPHQI KHIKDFKDYK YSHRVDKKPN 781 RELINDTLYS TRKDDKGNTL IV NLNGLYD KDNDKLKKLI NKSPEKLLMY HHDPQTYQKL 841 KLIMEQYGDE KNPLYKYYEE TGNYLTKYSK KDNGPVIKKI KYYGNKLNAH LDITDDYPNS 901 RNKWKLSLK PYRFDVYLDN GVYKFVTVKN LDVIKKENYY EVNSKCYEEA KKLKKI SNQA 961 EFIASFYNND LIKINGELYR VIGVNNDLLN RIEV MIDIT YREYLENMND KRPPRIIKTI 1021 ASKTQSIKKY STDILGNLYE VKSKKHPQII KKG (SEQ ID NO: 1).
[0129] In certain embodiments of the gene editing systems of the disclosure, the dCas9 of the disclosure comprises a dCas9 isolated or derived from Streptococcus pyogenes. In some embodiments, the dCas9 is isolated or derived from Streptococcus pyogenes and comprises substitutions at positions 10 and 840 of the amino acid sequence of the dCas9 which inactivate the catalytic site. In some embodiments, these substitutions are D10A and H840A. In some embodiments, the amino acid sequence of the dCas9 (isolated or derived from Streptococcus pyogenes) comprises the sequence of:
1 XDKKYSIGLA IGTNSVGWAV ITDEYKVPSK KFKVLGNTDR HSIKKNLIGA LLFDSGETAE 61 ATRLKRTARR RYTRRKNRIC YLQEIFSNEM AKVDDSFFHR LEESFLVEED KKHERHPI FG 121 NIVDEVAYHE KYPTIYHLRK KLVDSTDKAD LRLIYLALAH MIKFRGHFLI EGDLNPDNSD 181 VDKLFIQLVQ TYNQLFEENP INASGVDAKA ILSARLSKSR RLENLIAQLP GEKKNGLFGN 241 LIALSLGLTP NFKSNFDLAE DAKLQLSKDT YDDDLDNLLA QIGDQYADLF LAAKNLSDAI 301 LLSDILRVNT EITKAPLSAS MIKRYDEHHQ DLTLLKALVR QQLPEKYKEI FFDQSKNGYA 361 GYIDGGASQE EFYKFIKPIL EKMDGTEELL VKLNREDLLR KQRTFDNGSI PHQIHLGELH 421 AILRRQEDFY PFLKDNREKI EKILTFRIPY YVGPLARGNS RFAWMTRKSE ETITPWNFEE 481 WDKGASAQS FIERMTNFDK NLPNEKVLPK HSLLYEYFTV YNELTKVKYV TEGMRKPAFL 541 SGEQKKAIVD LLFKTNRKVT VKQLKEDYFK KIECFDSVEI SGVEDRFNAS LGTYHDLLKI 601 IKDKDFLDNE ENEDILEDIV LTLTLFEDRE MIEERLKTYA HLFDDKVMKQ LKRRRYTGWG 661 RLSRKLINGI RDKQSGKTIL DFLKSDGFAN RNFMQLIHDD SLTFKEDIQK AQVSGQGDSL 721 HEHIANLAGS PAIKKGILQT VKWDELVKV MGRHKPENIV IEMARENQTT QKGQKNSRER 781 MKRIEEGIKE LGSQILKEHP VENTQLQNEK LYLYYLQNGR DMYVDQELDI NRLSDYDVDA 841 IVPQSFLKDD SIDNKVLTRS DKNRGKSDNV PSEEWKKMK NYWRQLLNAK LITQRKFDNL 901 TKAERGGLSE LDKAGFIKRQ LVETRQITKH VAQILDSRMN TKYDENDKLI REVKVITLKS 961 KLVSDFRKDF QFYKVREINN YHHAHDAYLN AWGTALIKK YPKLESEFVY GDYKVYDVRK 1021 MIAKSEQEIG KATAKYFFYS NIMNFFKTEI TLANGEIRKR PLIETNGETG EIVWDKGRDF 1081 ATVRKVLSMP QVNIVKKTEV QTGGFSKESI LPKRNSDKLI ARKKDWDPKK YGGFDSPTVA 1141 YSVLWAKVE KGKSKKLKSV KELLGITIME RSSFEKNPID FLEAKGYKEV KKDLIIKLPK 1201 YSLFELENGR KRMLASAGEL QKGNELALPS KYVNFLYLAS HYEKLKGSPE DNEQKQLFVE 1261 QHKHYLDEI I EQISEFSKRV ILADANLDKV LSAYNKHRDK PIREQAENII HLFTLTNLGA 1321 PAAFKYFDTT IDRKRYTSTK EVLDATLIHQ SITGLYETRI DLSQLGGD (SEQ ID NO: 6).
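By way of non-limiting illustration, substitution notation such as D10A (original residue, 1-based position, replacement residue) may be parsed, applied, and checked programmatically. The following is a minimal sketch with hypothetical function names. The ten-residue strings in the usage example are the first ten residues of the two listings as printed above, plus one toy string carrying D at position 10. Checks at later positions (e.g., 580 or 840) would not be reliable against the listings as reproduced here, which appear to contain extraction gaps.

```python
import re


def parse_substitution(code: str):
    """Split a code like 'D10A' into (original, 1-based position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None or int(match.group(2)) < 1:
        raise ValueError(f"not a valid substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitution(seq: str, code: str) -> str:
    """Return `seq` with the substitution applied, verifying the original residue."""
    original, position, replacement = parse_substitution(code)
    if position > len(seq) or seq[position - 1] != original:
        raise ValueError(f"{code}: residue {original} not found at position {position}")
    return seq[:position - 1] + replacement + seq[position:]


def carries_substitution(seq: str, code: str) -> bool:
    """True if `seq` already has the replacement residue at the coded position."""
    _, position, replacement = parse_substitution(code)
    return position <= len(seq) and seq[position - 1] == replacement


if __name__ == "__main__":
    toy = "MDKKYSIGLD"  # toy string with D at position 10
    print(apply_substitution(toy, "D10A"))
    print(carries_substitution("XDKKYSIGLA", "D10A"))  # start of SEQ ID NO: 6 as printed
    print(carries_substitution("MKRNYILGLA", "D10A"))  # start of SEQ ID NO: 1 as printed
```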
[0130] In certain embodiments of the gene editing systems of the disclosure, the nuclease domain may comprise, consist essentially of or consist of a dCas9 or a dSaCas9 and a type IIS endonuclease. In certain embodiments of the disclosure, the nuclease domain may comprise, consist essentially of or consist of a dSaCas9 and a type IIS endonuclease, including, but not limited to, AciI, MnlI, AlwI, BbvI, BccI, BceAI, BsmAI, BsmFI, BspCNI, BsrI, BtsCI, HgaI, HphI, HpyAV, MboII, MlyI, PleI, SfaNI, AcuI, BciVI, BfuAI, BmgBI, BmrI, BpmI, BpuEI, BsaI, BseRI, BsgI, BsmI, BspMI, BsrBI, BsrBI, BsrDI, BtgZI, BtsI, EarI, EciI, MmeI, NmeAIII, BbvCI, Bpu10I, BspQI, SapI, BaeI, BsaXI, CspCI, BfiI, MboII, Acc36I, FokI or Clo051. In certain embodiments of the disclosure, the nuclease domain may comprise, consist essentially of or consist of a dSaCas9 and Clo051.
[0131] An exemplary Clo051 nuclease domain may comprise, consist essentially of or consist of, the amino acid sequence of:
EGIKSNISLLKDELRGQISHISHEYLSLIDLAFDSKQNRLFEMKVLELLVNEYGFKGRHLGGSRKPDG IVYSTTLEDNFGI IVDTKAYSEGYSLPISQADEMERYVRENSNRDEEVNPNKWWENFSEEVKKYYFVF ISGSFKGKFEEQLRRLSMTTGVNGSAVNWNLLLGAEKIRSGEMTIEELERAMFNNSEFILKY (SEQ ID NO: 7) .
[0132] An exemplary dCas9-Clo051 nuclease domain may comprise, consist essentially of or consist of, the amino acid sequence of (Clo051 sequence underlined, linker bold italics, dCas9 (isolated or derived from S. pyogenes) sequence in italics):
MAPKKKRKVEGIKSNISLLKDELRGQISHISHEYLSLIDLAFDSKQNRLFEMKVLELLVNEYGFKGRH LGGSRKPDGIVYSTTLEDNFGI IVDTKAYSEGYSLPISQADEMERYVRENSNRDEEVNPNKWWENFSE EVKKYYFVFISGSFKGKFEEQLRRLSMTTGVNGSAVNWNLLLGAEKIRSGEMTIEELERAMFNNSEF ILKYGGGGSDKKYSIGLAIGTNSVGWAVIIDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAE ATRLKRTARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAY HEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLF EENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPNFKSNFDLAEDAKL QLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDL TLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLR KQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRK SEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRK PAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDK DFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRD KQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQT VKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQN EKLYLYYLQNGRDMYVDQELDINRLSDYDVDAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVV KKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYD ENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALIKKYPKLESEFVYGD YKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDF ATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAK VEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLAS AGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILAD ANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSI TGIYETRIDLSQLGGDGSPKKKRKVSS (SEQ ID NO: 8).
[0133] In certain embodiments of the gene editing systems described herein, the nuclease may comprise, consist essentially of or consist of, a homodimer or a heterodimer. Nuclease domains of the disclosure may comprise, consist essentially of or consist of a nuclease domain isolated, derived or recombined from a transcription-activator-like effector nuclease (TALEN). TALENs are transcription factors with programmable DNA binding domains that provide a means to create designer proteins that bind to pre-determined DNA sequences or individual nucleic acids. Modular DNA binding domains have been identified in transcriptional activator-like (TAL) proteins, or, more specifically, transcriptional activator-like effector nucleases (TALENs), thereby allowing for the de novo creation of synthetic transcription factors that bind to DNA sequences of interest and, if desirable, also allowing a second domain present on the protein or polypeptide to perform an activity related to DNA. TAL proteins have been derived from the organisms Xanthomonas and Ralstonia.
[0134] In certain embodiments of the gene editing systems described herein, the nuclease domain may comprise, consist essentially of or consist of a nuclease domain isolated, derived or recombined from a TALEN and a type IIS endonuclease. In certain embodiments of the disclosure, the type IIS endonuclease may comprise, consist essentially of or consist of AciI, MnlI, AlwI, BbvI, BccI, BceAI, BsmAI, BsmFI, BspCNI, BsrI, BtsCI, HgaI, HphI, HpyAV, MboII, MlyI, PleI, SfaNI, AcuI, BciVI, BfuAI, BmgBI, BmrI, BpmI, BpuEI, BsaI, BseRI, BsgI, BsmI, BspMI, BsrBI, BsrBI, BsrDI, BtgZI, BtsI, EarI, EciI, MmeI, NmeAIII, BbvCI, Bpu10I, BspQI, SapI, BaeI, BsaXI, CspCI, BfiI, MboII, Acc36I, FokI or Clo051. In certain embodiments of the disclosure, the type IIS endonuclease may comprise, consist essentially of or consist of Clo051 (SEQ ID NO: 7).
[0135] In certain embodiments of the gene editing systems described herein, the nuclease domain may comprise, consist essentially of or consist of a nuclease domain isolated, derived or recombined from a zinc finger nuclease (ZFN) and a type IIS endonuclease. In certain embodiments of the disclosure, the type IIS endonuclease may comprise, consist essentially of or consist of AciI, MnlI, AlwI, BbvI, BccI, BceAI, BsmAI, BsmFI, BspCNI, BsrI, BtsCI, HgaI, HphI, HpyAV, MboII, MlyI, PleI, SfaNI, AcuI, BciVI, BfuAI, BmgBI, BmrI, BpmI, BpuEI, BsaI, BseRI, BsgI, BsmI, BspMI, BsrBI, BsrBI, BsrDI, BtgZI, BtsI, EarI, EciI, MmeI, NmeAIII, BbvCI, Bpu10I, BspQI, SapI, BaeI, BsaXI, CspCI, BfiI, MboII, Acc36I, FokI or Clo051. In certain embodiments of the disclosure, the type IIS endonuclease may comprise, consist essentially of or consist of Clo051 (SEQ ID NO: 7).
[0136] In certain embodiments of the gene editing systems described herein, the DNA binding domain and the nuclease domain may be covalently linked. For example, a fusion protein may comprise the DNA binding domain and the nuclease domain. In certain embodiments of the genomic editing compositions or constructs of the disclosure, the DNA binding domain and the nuclease domain may be operably linked through a non-covalent linkage.
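By way of non-limiting illustration of the covalent (fusion protein) arrangement, a fusion sequence may be assembled from named segments while recording where each segment falls. The sketch below is offered under stated assumptions: the function name is hypothetical, the segment strings are short placeholders (truncated beginnings of sequences printed in this disclosure, plus a GGGGS linker and a PKKKRKV signal as they appear in SEQ ID NO: 8 as printed), and the segment boundaries are inferred rather than confirmed, since the typographic marking of the domains is not preserved in this text.

```python
from typing import List, Tuple


def assemble_fusion(parts: List[Tuple[str, str]]) -> Tuple[str, List[Tuple[str, int, int]]]:
    """Concatenate (name, sequence) parts in order.

    Returns the fusion sequence and, for each part, its name with 1-based
    inclusive start and end coordinates in the fusion.
    """
    fusion = ""
    coordinates = []
    for name, sequence in parts:
        if not sequence:
            raise ValueError(f"segment {name!r} is empty")
        start = len(fusion) + 1
        fusion += sequence
        coordinates.append((name, start, len(fusion)))
    return fusion, coordinates


if __name__ == "__main__":
    # Placeholder segments only; these are NOT full-length domains.
    parts = [
        ("localization_signal", "PKKKRKV"),
        ("nuclease_domain", "EGIKSN"),
        ("linker", "GGGGS"),
        ("dna_binding_domain", "DKKYSIGLA"),
    ]
    fusion, coordinates = assemble_fusion(parts)
    print(fusion)
    for name, start, end in coordinates:
        print(f"{name}: {start}-{end}")
```

Recording coordinates at assembly time avoids having to recover domain boundaries later by searching the fusion sequence, which can be ambiguous when a short motif such as a linker occurs more than once.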
[0137] In various embodiments, the gene editing systems described herein, particularly proteins and/or nucleic acids, may be complexed with nanoparticles that are poly(histidine)-based micelles. In particular, at certain pHs, poly(histidine)-containing triblock copolymers may assemble into a micelle with positively charged poly(histidine) units on the surface, thereby enabling complexing with the negatively-charged gene editing molecule(s). Using these nanoparticles to bind and release proteins and/or nucleic acids in a pH-dependent manner may provide an efficient and selective mechanism to perform a desired gene modification.
[0138] In particular, this micelle-based delivery system provides substantial flexibility with respect to the charged materials, as well as a large payload capacity, and targeted release of the nanoparticle payload. In one example, site-specific cleavage of the double stranded DNA may be enabled by delivery of a nuclease using the poly(histidine)-based micelles.
[0139] The various embodiments enable intracellular delivery of gene editing tools by complexing with poly(histidine)-based micelles. In particular, the various embodiments provide triblock copolymers made of a hydrophilic block, a hydrophobic block, and a charged block. In some embodiments, the hydrophilic block may be poly(ethylene oxide) (PEO), and the charged block may be poly(L-histidine). An example triblock copolymer that may be used in various embodiments is PEO-b-PLA-b-PHIS, with the number of repeating units in each block varying by design. Without wishing to be bound by a particular theory, it is believed that in the micelles that are formed by the various embodiment triblock copolymers, the hydrophobic blocks aggregate to form a core, leaving the hydrophilic blocks and poly(histidine) blocks on the ends to form one or more surrounding layers.
[0140] In various embodiments, poly(histidine)-based micelles may be formed at a pH higher than the pKa of poly(histidine) (e.g., pH of about 7). At a pH of around 6, the amine groups of the poly(histidine) block may be protonated, imparting a positive charge and enabling the poly(histidine) block to complex with negatively charged molecules (e.g., proteins and nucleic acids). If the pH is dropped substantially, such as a pH of around 3-4, the bound protein and/or nucleic acid may be released due to protonation of the poly(histidine). Various applications of the embodiment poly(histidine)-based micelles may exploit the controllable pH-dependent release of the payload molecules to target particular cells and/or pathways.
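The pH dependence described above can be illustrated with a minimal sketch based on the Henderson-Hasselbalch relation. The function name `protonated_fraction` and the default pKa of 6.0 are illustrative assumptions and are not taken from the disclosure; the sketch also treats each ionizable group as an independent site, which a real poly(histidine) block may not satisfy.

```python
def protonated_fraction(ph: float, pka: float = 6.0) -> float:
    """Estimate the fraction of ionizable groups that carry a proton.

    Applies the Henderson-Hasselbalch relation to a single, independent
    ionizable site. The default pKa of 6.0 is an assumed, illustrative
    value, not a measured property of any polymer in the disclosure.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Print the estimated protonated fraction across a range of pH values.
    for ph in (7.4, 7.0, 6.0, 5.0, 4.0):
        print(f"pH {ph:.1f}: {protonated_fraction(ph):.3f} protonated")
```

Under these assumptions, half of the groups are protonated when the pH equals the pKa, and the protonated fraction rises as the pH falls.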
[0141] In various embodiments, the gene editing systems described herein, particularly proteins and/or nucleic acids, may be complexed with nanoparticles that are ionizable or constitutively cationically-charged and that are composed of polymer-based micelles. In particular, at certain pHs, cationically-charged polymer-containing triblock copolymers may assemble into a micelle with positively charged polymer units on the surface, thereby enabling complexing with the negatively-charged gene editing molecule(s). Using these nanoparticles to bind and release proteins and/or nucleic acids in a pH-dependent manner may provide an efficient and selective mechanism to perform a desired gene modification.
[0142] In particular, this micelle-based delivery system provides substantial flexibility with respect to the charged materials, as well as a large payload capacity, and targeted release of the nanoparticle payload. In one example, site-specific cleavage of the double stranded DNA may be enabled by delivery of a nuclease using the cationically-charged polymer-based micelles.
[0143] The various embodiments enable intracellular delivery of gene editing tools by complexing with ionizable or constitutively cationically-charged polymer-based micelles. In particular, the various embodiments provide triblock copolymers made of a hydrophilic block, a hydrophobic block, and a charged block. In some embodiments, the hydrophilic block may be poly(ethylene oxide) (PEO), and the charged block may be selected from Table 1 or Table 2. An example triblock copolymer that may be used in various embodiments is a PEO-b-PLA-b-PHIS, with the number of repeating units in each block varying by design. Without wishing to be bound by a particular theory, it is believed that in the micelles that are formed by the various embodiment triblock copolymers, the hydrophobic blocks aggregate to form a core, leaving the hydrophilic blocks and positively-charged polymer blocks on the ends to form one or more surrounding layers.
[0144] In various embodiments, cationically-charged polymer-based micelles may be formed from a triblock copolymer containing at least one cationic block comprised of the polymers in Table 1 or 2. For the micelles formed from the ionizable polymers found in Table 1, at a pH of around 6-7, the amine groups of the ionizable polymer block may be protonated, imparting a positive charge and enabling the resultant cationically-charged polymer block to complex with negatively charged molecules (e.g., proteins and nucleic acids). In other compositions formed from the polymers in Table 2, the resultant micelles are cationically charged above a pH of 6.0. Various applications of the embodiment cationically-charged polymer-based micelles may exploit the controllable pH-dependent release of the payload molecules to target particular cells and/or pathways.
[0145] Additional applications of the embodiment micelles may include conjugating molecules to the hydrophilic block in order to target particular cell types. For example, Apolipoprotein E or N-Acetylgalactosamine (GalNAc) may be conjugated to a PEO block for specific targeting of the micelles to hepatocytes.
[0146] The particular methods of creating the block copolymers used in the various embodiments, as well as the techniques of forming the micelles, may be varied based on the composition. In particular, these methods and techniques may be optimized to achieve the most desirable block and nanoparticle properties. For example, the polymerization times may be altered to change the molecular weight of a block, and therefore the overall nanoparticle size, as described in further detail in the examples below.
[0147] In various embodiments, the hydrophobic block of the triblock copolymers used to form the micelles may be a polyester, a polyanhydride, a polypeptide, or an artificial polynucleic acid. For example, the hydrophobic block may be an aliphatic polyester, including, but not limited to, poly(lactic acid), poly(glycolic acid) (PGA), poly(lactic-co-glycolic acid) (PLGA), poly(ε-caprolactone) (PCL), and/or poly(3-hydroxybutyrate) (PHB).
[0148] Various embodiments may be DNA-based systems that are complexed with the poly(histidine)-based micelles. In some embodiments, an expression vector that expresses a nuclease or other protein may be complexed with poly(histidine)-based micelles. The expression vector may be, for example, a plasmid constructed to contain DNA encoding the nuclease as well as a promoter region. Once inside the target cell, the DNA encoding the nuclease may be transcribed and translated to create the enzyme.
[0149] Various embodiment systems may also be designed to integrate DNA into the genome of a target cell using a transposon provided on a vector, such as an artificially constructed plasmid. Applications of such systems may include introducing (i.e., "knocking in") a new gene to perform a particular function through the inserted DNA, or inactivating (i.e., "knocking out") a mutated gene that is functioning improperly through interruption in the target DNA.
[0150] In some embodiments, the DNA may be a transposon that is directly transposed between vectors and chromosomes via a "cut and paste" mechanism. In some embodiments, the transposon may be a retrotransposon, e.g., a DNA that is first transcribed into an RNA intermediate, followed by reverse transcription into the DNA that is transposed.
[0151] In various embodiments, the cationically-charged polymer-based micelles may complex with a vector that includes the transposon, as well as a transposase that catalyzes the integration of the transposon into specific sites in the target genome. In various embodiments, the poly(histidine)-based micelles may complex with a vector that includes the transposon, as well as a transposase that catalyzes the integration of the transposon into specific sites in the target genome. The transposase that is used is specific to the particular transposon that is selected, each of which may have particular properties that are desirable for use in various embodiments. One example transposon is the piggyBac transposon, which is transposed into a target genome by the piggyBac transposase. Specifically, the piggyBac transposase recognizes transposon-specific inverted terminal repeat sequences (ITRs) on the ends of the transposon, and moves the contents between the ITRs into TTAA chromosomal sites. The piggyBac transposon system has no payload limit for the genes of interest that can be included between the ITRs. In some embodiments, and, in particular, those embodiments wherein the transposon is a piggyBac transposon, the transposase is a piggyBac™ or a Super piggyBac™ (SPB) transposase. In some embodiments, and, in particular, those embodiments wherein the transposase is a Super piggyBac™ (SPB) transposase, the sequence encoding the transposase is an mRNA sequence.
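As a minimal sketch of how candidate TTAA integration sites might be located in a target sequence, the following scans a DNA string for every occurrence of a motif. The function name `find_target_sites` and the example sequence are hypothetical and are used only for illustration; the scan covers one strand only, which is sufficient for TTAA because that motif is its own reverse complement.

```python
from typing import List


def find_target_sites(dna: str, motif: str = "TTAA") -> List[int]:
    """Return the 1-based start position of every occurrence of the motif.

    Overlapping occurrences are reported, because the search resumes one
    base after each hit rather than after the end of the hit.
    """
    dna = dna.upper()
    motif = motif.upper()
    positions = []
    start = dna.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = dna.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    example = "GGCTTAACCGTATTAATTAAGC"  # hypothetical sequence, not from the disclosure
    print(find_target_sites(example))  # → [4, 13, 17]
```

The motif is left as a parameter so that the same scan can be reused for other recognition sequences.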
[0152] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac™ (PB) transposase enzyme. The piggyBac (PB) transposase enzyme may comprise or consist of an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTGATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RMYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPNEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 5).
[0153] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at one or more of positions 30, 165, 282, or 538 of the sequence:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTGATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RMYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPIAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPNEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO:
[0154] In some embodiments, the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at two or more of positions 30, 165, 282, or 538 of the sequence of SEQ ID NO: 5. In some embodiments, the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at three or more of positions 30, 165, 282, or 538 of the sequence of SEQ ID NO: 5. In some embodiments, the transposase enzyme is a piggyBac™ (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at each of the following positions 30, 165, 282, and 538 of the sequence of SEQ ID NO: 5. In some embodiments, the amino acid substitution at position 30 of the sequence of SEQ ID NO: 5 is a substitution of a valine (V) for an isoleucine (I). In some embodiments, the amino acid substitution at position 165 of the sequence of SEQ ID NO: 5 is a substitution of a serine (S) for a glycine (G). In some embodiments, the amino acid substitution at position 282 of the sequence of SEQ ID NO: 5 is a substitution of a valine (V) for a methionine (M). In some embodiments, the amino acid substitution at position 538 of the sequence of SEQ ID NO: 5 is a substitution of a lysine (K) for an asparagine (N).
[0155] In certain embodiments of the methods of the disclosure, the transposase enzyme is a Super piggyBac™ (SPB) transposase enzyme. In some embodiments, the Super piggyBac™ (SPB) transposase enzymes of the disclosure may comprise or consist of the amino acid sequence of the sequence of SEQ ID NO: 5 wherein the amino acid substitution at position 30 is a substitution of a valine (V) for an isoleucine (I), the amino acid substitution at position 165 is a substitution of a serine (S) for a glycine (G), the amino acid substitution at position 282 is a substitution of a valine (V) for a methionine (M), and the amino acid substitution at position 538 is a substitution of a lysine (K) for an asparagine (N). In some embodiments, the Super piggyBac™ (SPB) transposase enzyme may comprise or consist of an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEV SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTSATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RVYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPKEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 2).
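The relationship between the piggyBac™ sequence of SEQ ID NO: 5 and the four Super piggyBac™ substitutions can be sketched as follows. The helper names `apply_substitutions` and `percent_identity` are hypothetical, and the identity calculation is a simple ungapped, position-by-position comparison, which is an assumption; the disclosure does not specify how percent identity is computed.

```python
# Amino acid sequence of SEQ ID NO: 5 as listed above, in blocks of ten residues.
_PB_BLOCKS = """
MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTGATFRD TNEDEIYAFF
GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RMYIPNKPSK YGIKILMMCD
SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPNEV
PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF
"""
PB_SEQ = "".join(_PB_BLOCKS.split())  # strip spaces and line breaks

# (original residue, 1-based position, replacement residue)
SPB_SUBSTITUTIONS = [("I", 30, "V"), ("G", 165, "S"), ("M", 282, "V"), ("N", 538, "K")]


def apply_substitutions(seq, substitutions):
    """Return a copy of seq with each substitution applied.

    Raises ValueError if the residue found at a position does not match
    the original residue stated for that substitution.
    """
    residues = list(seq)
    for original, position, replacement in substitutions:
        found = residues[position - 1]
        if found != original:
            raise ValueError(f"position {position}: expected {original}, found {found}")
        residues[position - 1] = replacement
    return "".join(residues)


def percent_identity(seq_a, seq_b):
    """Ungapped percent identity between two sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    spb_seq = apply_substitutions(PB_SEQ, SPB_SUBSTITUTIONS)
    print(len(PB_SEQ), f"{percent_identity(PB_SEQ, spb_seq):.2f}")
```

Under this ungapped comparison, a variant carrying all four substitutions differs from SEQ ID NO: 5 at 4 of 594 positions, or about 99.3% identity.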
[0156] In certain embodiments of the methods of the disclosure, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac™ or Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at one or more of positions 3, 46, 82, 103, 119, 125, 177, 180, 185, 187, 200, 207, 209, 226, 235, 240, 241, 243, 258, 296, 298, 311, 315, 319, 327, 328, 340, 421, 436, 456, 470, 486, 503, 552, 570 and 591 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2. In some embodiments, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac™ or Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at one or more of positions 46, 119, 125, 177, 180, 185, 187, 200, 207, 209, 226, 235, 240, 241, 243, 296, 298, 311, 315, 319, 327, 328, 340, 421, 436, 456, 470, 485, 503, 552 and 570. In some embodiments, the amino acid substitution at position 3 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an asparagine (N) for a serine (S). In some embodiments, the amino acid substitution at position 46 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a serine (S) for an alanine (A). In some embodiments, the amino acid substitution at position 46 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a threonine (T) for an alanine (A). In some embodiments, the amino acid substitution at position 82 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tryptophan (W) for an isoleucine (I). In some embodiments, the amino acid substitution at position 103 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a serine (S). In some embodiments, the amino acid substitution at position 119 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for an arginine (R). In some embodiments, the amino acid substitution at position 125 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for a cysteine (C). In some embodiments, the amino acid substitution at position 125 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a cysteine (C). In some embodiments, the amino acid substitution at position 177 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 177 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a histidine (H) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 180 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 180 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a phenylalanine (F).
In some embodiments, the amino acid substitution at position 180 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 185 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 187 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for an alanine (A). In some embodiments, the amino acid substitution at position 200 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tryptophan (W) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 207 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a valine (V). In some embodiments, the amino acid substitution at position 209 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a valine (V). In some embodiments, the amino acid substitution at position 226 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a methionine (M). In some embodiments, the amino acid substitution at position 235 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an arginine (R) for a leucine (L). In some embodiments, the amino acid substitution at position 240 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a valine (V). In some embodiments, the amino acid substitution at position 241 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 243 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a proline (P). In some embodiments, the amino acid substitution at position 258 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a serine (S) for an asparagine (N).
In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tryptophan (W) for a leucine (L). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tyrosine (Y) for a leucine (L). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a leucine (L). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for a methionine (M). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a methionine (M). In some embodiments, the amino acid substitution at position 311 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a proline (P). In some embodiments, the amino acid substitution at position 311 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a proline (P). In some embodiments, the amino acid substitution at position 315 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for an arginine (R). In some embodiments, the amino acid substitution at position 319 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for a threonine (T). In some embodiments, the amino acid substitution at position 327 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an arginine (R) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 328 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 340 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for a cysteine (C).
In some embodiments, the amino acid substitution at position 340 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a cysteine (C). In some embodiments, the amino acid substitution at position 421 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a histidine (H) for the aspartic acid (D). In some embodiments, the amino acid substitution at position 436 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a valine (V). In some embodiments, the amino acid substitution at position 456 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a tyrosine (Y) for a methionine (M). In some embodiments, the amino acid substitution at position 470 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a phenylalanine (F) for a leucine (L). In some embodiments, the amino acid substitution at position 485 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a serine (S). In some embodiments, the amino acid substitution at position 503 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 503 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an isoleucine (I) for a methionine (M). In some embodiments, the amino acid substitution at position 552 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a lysine (K) for a valine (V). In some embodiments, the amino acid substitution at position 570 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a threonine (T) for an alanine (A). In some embodiments, the amino acid substitution at position 591 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a glutamine (Q). In some embodiments, the amino acid substitution at position 591 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an arginine (R) for a glutamine (Q).
[0157] In certain embodiments of the methods of the disclosure, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac™ transposase enzyme may comprise or the Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at one or more of positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2. In certain embodiments of the methods of the disclosure, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac™ transposase enzyme may comprise or the Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at two, three, four, five, six or more of positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2. In some embodiments, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac™ transposase enzyme may comprise or the Super piggyBac™ transposase enzyme may further comprise an amino acid substitution at positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2. In some embodiments, the amino acid substitution at position 103 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a proline (P) for a serine (S). In some embodiments, the amino acid substitution at position 194 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a valine (V) for a methionine (M). In some embodiments, the amino acid substitution at position 372 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for an arginine (R). In some embodiments, the amino acid substitution at position 375 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an alanine (A) for a lysine (K). In some embodiments, the amino acid substitution at position 450 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of an asparagine (N) for an aspartic acid (D).
In some embodiments, the amino acid substitution at position 509 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a glycine (G) for a serine (S). In some embodiments, the amino acid substitution at position 570 of SEQ ID NO: 5 or SEQ ID NO: 2 is a substitution of a serine (S) for an asparagine (N). In some embodiments, the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5. In some embodiments, including those embodiments wherein the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5, the piggyBac™ transposase enzyme may further comprise an amino acid substitution at positions 372, 375 and 450 of the sequence of SEQ ID NO: 5 or SEQ ID NO: 2. In some embodiments, the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5, a substitution of an alanine (A) for an arginine (R) at position 372 of SEQ ID NO: 5, and a substitution of an alanine (A) for a lysine (K) at position 375 of SEQ ID NO: 5. In some embodiments, the piggyBac™ transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 5, a substitution of an alanine (A) for an arginine (R) at position 372 of SEQ ID NO: 5, a substitution of an alanine (A) for a lysine (K) at position 375 of SEQ ID NO: 5 and a substitution of an asparagine (N) for an aspartic acid (D) at position 450 of SEQ ID NO: 5.
[0158] Another example transposon system is the Sleeping Beauty transposon, which is transposed into the target genome by the Sleeping Beauty transposase that recognizes ITRs, and moves the contents between the ITRs into TA chromosomal sites. In various embodiments, SB transposon-mediated gene transfer, or gene transfer using any of a number of similar transposons, may be used for long-term expression of a therapeutic gene.
[0159] In some embodiments, and, in particular, those embodiments wherein the transposon is a Sleeping Beauty transposon, the transposase is a Sleeping Beauty transposase or a hyperactive Sleeping Beauty transposase (SB100X).
[0160] In certain embodiments of the methods of the disclosure, the Sleeping Beauty transposase enzyme comprises an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGKSKEISQD LRKKIVDLHK SGSSLGAISK RLKVPRSSVQ TIVRKYKHHG TTQPSYRSGR
61 RRYLSPRDER TLVRKVQINP RTTAKDLVKM LEETGTKVSI STVKRVLYRH NLKGRSARKK
121 PLLQNRHKKA RLRFATAHGD KDRTFWRNVL WSDETKIELF GHNDHRYVWR KKGEACKPKN
181 TIPTVKHGGG SIMLWGCFAA GGTGALHKID GIMRKENYVD ILKQHLKTSV RKLKLGRKWV
241 FQMDNDPKHT SKVVAKWLKD NKVKVLEWPS QSPDLNPIEN LWAELKKRVR ARRPTNLTQL
301 HQLCQEEWAK IHPTYCGKLV EGYPKRLTQV KQFKGNATKY (SEQ ID NO: 3) .
[0161] In certain embodiments of the methods of the disclosure, the hyperactive Sleeping Beauty (SB100X) transposase enzyme comprises an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGKSKEISQD LRKRIVDLHK SGSSLGAISK RLAVPRSSVQ TIVRKYKHHG TTQPSYRSGR
61 RRYLSPRDER TLVRKVQINP RTTAKDLVKM LEETGTKVSI STVKRVLYRH NLKGHSARKK
121 PLLQNRHKKA RLRFATAHGD KDRTFWRNVL WSDETKIELF GHNDHRYVWR KKGEACKPKN
181 TIPTVKHGGG SIMLWGCFAA GGTGALHKID GIMDAVQYVD ILKQHLKTSV RKLKLGRKWV
241 FQHDNDPKHT SKVVAKWLKD NKVKVLEWPS QSPDLNPIEN LWAELKKRVR ARRPTNLTQL
301 HQLCQEEWAK IHPNYCGKLV EGYPKRLTQV KQFKGNATKY (SEQ ID NO: 4).
[0162] Another example transposon system is the Helraiser/Helitron transposon system. The Helraiser transposon is transposed by the Helitron transposase. Helitron transposases mobilize the Helraiser transposon, an ancient element from the bat genome that was active about 30 to 36 million years ago. An exemplary Helraiser transposon of the disclosure includes Helibat1, which comprises a nucleic acid sequence comprising:
1 TCCTATATAA TAAAAGAGAA ACATGCAAAT TGACCATCCC TCCGCTACGC TCAAGCCACG
61 CCCACCAGCC AATCAGAAGT GACTATGCAA ATTAACCCAA CAAAGATGGC AGTTAAATTT
121 GCATACGCAG GTGTCAAGCG CCCCAGGAGG CAACGGCGGC CGCGGGCTCC CAGGACCTTC
181 GCTGGCCCCG GGAGGCGAGG CCGGCCGCGC CTAGCCACAC CCGCGGGCTC CCGGGACCTT
241 CGCCAGCAGA GAGCAGAGCG GGAGAGCGGG CGGAGAGCGG GAGGTTTGGA GGACTTGGCA
301 GAGCAGGAGG CCGCTGGACA TAGAGCAGAG CGAGAGAGAG GGTGGCTTGG AGGGCGTGGC
361 TCCCTCTGTC ACCCCAGCTT CCTCATCACA GCTGTGGAAA CTGACAGCAG GGAGGAGGAA
421 GTCCCACCCC CACAGAATCA GCCAGAATCA GCCGTTGGTC AGACAGCTCT CAGCGGCCTG
481 ACAGCCAGGA CTCTCATTCA CCTGCATCTC AGACCGTGAC AGTAGAGAGG TGGGACTATG
541 TCTAAAGAAC AACTGTTGAT ACAACGTAGC TCTGCAGCCG AAAGATGCCG GCGTTATCGA
601 CAGAAAATGT CTGCAGAGCA ACGTGCGTCT GATCTTGAAA GAAGGCGGCG CCTGCAACAG
661 AATGTATCTG AAGAGCAGCT ACTGGAAAAA CGTCGCTCTG AAGCCGAAAA ACAGCGGCGT
721 CATCGACAGA AAATGTCTAA AGACCAACGT GCCTTTGAAG TTGAAAGAAG GCGGTGGCGA
781 CGACAGAATA TGTCTAGAGA ACAGTCATCA ACAAGTACTA CCAATACCGG TAGGAACTGC
841 CTTCTCAGCA AAAATGGAGT ACATGAGGAT GCAATTCTCG AACATAGTTG TGGTGGAATG
901 ACTGTTCGAT GTGAATTTTG CCTATCACTA AATTTCTCTG ATGAAAAACC ATCCGATGGG
961 AAATTTACTC GATGTTGTAG CAAAGGGAAA GTCTGTCCAA ATGATATACA TTTTCCAGAT
1021 TACCCGGCAT ATTTAAAAAG ATTAATGACA AACGAAGATT CTGACAGTAA AAATTTCATG
1081 GAAAATATTC GTTCCATAAA TAGTTCTTTT GCTTTTGCTT CCATGGGTGC AAATATTGCA
1141 TCGCCATCAG GATATGGGCC ATACTGTTTT AGAATACACG GACAAGTTTA TCACCGTACT
1201 GGAACTTTAC ATCCTTCGGA TGGTGTTTCT CGGAAGTTTG CTCAACTCTA TATTTTGGAT
1261 ACAGCCGAAG CTACAAGTAA AAGATTAGCA ATGCCAGAAA ACCAGGGCTG CTCAGAAAGA
1321 CTCATGATCA ACATCAACAA CCTCATGCAT GAAATAAATG AATTAACAAA ATCGTACAAG
1381 ATGCTACATG AGGTAGAAAA GGAAGCCCAA TCTGAAGCAG CAGCAAAAGG TATTGCTCCC
1441 ACAGAAGTAA CAATGGCGAT TAAATACGAT CGTAACAGTG ACCCAGGTAG ATATAATTCT
1501 CCCCGTGTAA CCGAGGTTGC TGTCATATTC AGAAACGAAG ATGGAGAACC TCCTTTTGAA
1561 AGGGACTTGC TCATTCATTG TAAACCAGAT CCCAATAATC CAAATGCCAC TAAAATGAAA
1621 CAAATCAGTA TCCTGTTTCC TACATTAGAT GCAATGACAT ATCCTATTCT TTTTCCACAT
1681 GGTGAAAAAG GCTGGGGAAC AGATATTGCA TTAAGACTCA GAGACAACAG TGTAATCGAC
1741 AATAATACTA GACAAAATGT AAGGACACGA GTCACACAAA TGCAGTATTA TGGATTTCAT
1801 CTCTCTGTGC GGGACACGTT CAATCCTATT TTAAATGCAG GAAAATTAAC TCAACAGTTT
1861 ATTGTGGATT CATATTCAAA AATGGAGGCC AATCGGATAA ATTTCATCAA AGCAAACCAA
1921 TCTAAGTTGA GAGTTGAAAA ATATAGTGGT TTGATGGATT ATCTCAAATC TAGATCTGAA
1981 AATGACAATG TGCCGATTGG TAAAATGATA ATACTTCCAT CATCTTTTGA GGGTAGTCCC
2041 AGAAATATGC AGCAGCGATA TCAGGATGCT ATGGCAATTG TAACGAAGTA TGGCAAGCCC
2101 GATTTATTCA TAACCATGAC ATGCAACCCC AAATGGGCAG ATATTACAAA CAATTTACAA
2161 CGCTGGCAAA AAGTTGAAAA CAGACCTGAC TTGGTAGCCA GAGTTTTTAA TATTAAGCTG
2221 AATGCTCTTT TAAATGATAT ATGTAAATTC CATTTATTTG GCAAAGTAAT AGCTAAAATT
2281 CATGTCATTG AATTTCAGAA ACGCGGACTG CCTCACGCTC ACATATTATT GATATTAGAT
2341 AGTGAGTCCA AATTACGTTC AGAAGATGAC ATTGACCGTA TAGTTAAGGC AGAAATTCCA
2401 GATGAAGACC AGTGTCCTCG ACTTTTTCAA ATTGTAAAAT CAAATATGGT ACATGGACCA
2461 TGTGGAATAC AAAATCCAAA TAGTCCATGT ATGGAAAATG GAAAATGTTC AAAGGGATAT
2521 CCAAAAGAAT TTCAAAATGC GACCATTGGA AATATTGATG GATATCCCAA ATACAAACGA
2581 AGATCTGGTA GCACCATGTC TATTGGAAAT AAAGTTGTCG ATAACACTTG GATTGTCCCT
2641 TATAACCCGT ATTTGTGCCT TAAATATAAC TGTCATATAA ATGTTGAAGT CTGTGCATCA
2701 ATTAAAAGTG TCAAATATTT ATTTAAATAC ATCTATAAAG GGCACGATTG TGCAAATATT
2761 CAAATTTCTG AAAAAAATAT TATCAATCAT GACGAAGTAC AGGACTTCAT TGACTCCAGG
2821 TATGTGAGCG CTCCTGAGGC TGTTTGGAGA CTTTTTGCAA TGCGAATGCA TGACCAATCT
2881 CATGCAATCA CAAGATTAGC TATTCATTTG CCAAATGATC AGAATTTGTA TTTTCATACC
2941 GATGATTTTG CTGAAGTTTT AGATAGGGCT AAAAGGCATA ACTCGACTTT GATGGCTTGG
3001 TTCTTATTGA ATAGAGAAGA TTCTGATGCA CGTAATTATT ATTATTGGGA GATTCCACAG
3061 CATTATGTGT TTAATAATTC TTTGTGGACA AAACGCCGAA AGGGTGGGAA TAAAGTATTA
3121 GGTAGACTGT TCACTGTGAG CTTTAGAGAA CCAGAACGAT ATTACCTTAG ACTTTTGCTT
3181 CTGCATGTAA AAGGTGCGAT AAGTTTTGAG GATCTGCGAA CTGTAGGAGG TGTAACTTAT
3241 GATACATTTC ATGAAGCTGC TAAACACCGA GGATTATTAC TTGATGACAC TATCTGGAAA
3301 GATACGATTG ACGATGCAAT CATCCTTAAT ATGCCCAAAC AACTACGGCA ACTTTTTGCA
3361 TATATATGTG TGTTTGGATG TCCTTCTGCT GCAGACAAAT TATGGGATGA GAATAAATCT
3421 CATTTTATTG AAGATTTCTG TTGGAAATTA CACCGAAGAG AAGGTGCCTG TGTGAACTGT
3481 GAAATGCATG CCCTTAACGA AATTCAGGAG GTATTCACAT TGCATGGAAT GAAATGTTCA
3541 CATTTCAAAC TTCCGGACTA TCCTTTATTA ATGAATGCAA ATACATGTGA TCAATTGTAC
3601 GAGCAACAAC AGGCAGAGGT TTTGATAAAT TCTCTGAATG ATGAACAGTT GGCAGCCTTT
3661 CAGACTATAA CTTCAGCCAT CGAAGATCAA ACTGTACACC CCAAATGCTT TTTCTTGGAT
3721 GGTCCAGGTG GTAGTGGAAA AACATATCTG TATAAAGTTT TAACACATTA TATTAGAGGT
3781 CGTGGTGGTA CTGTTTTACC CACAGCATCT ACAGGAATTG CTGCAAATTT ACTTCTTGGT
3841 GGAAGAACCT TTCATTCCCA ATATAAATTA CCAATTCCAT TAAATGAAAC TTCAATTTCT
3901 AGACTCGATA TAAAGAGTGA AGTTGCTAAA ACCATTAAAA AGGCCCAACT TCTCATTATT
3961 GATGAATGCA CCATGGCATC CAGTCATGCT ATAAACGCCA TAGATAGATT ACTAAGAGAA
4021 ATTATGAATT TGAATGTTGC ATTTGGTGGG AAAGTTCTCC TTCTCGGAGG GGATTTTCGA
4081 CAATGTCTCA GTATTGTACC ACATGCTATG CGATCGGCCA TAGTACAAAC GAGTTTAAAG
4141 TACTGTAATG TTTGGGGATG TTTCAGAAAG TTGTCTCTTA AAACAAATAT GAGATCAGAG
4201 GATTCTGCTT ATAGTGAATG GTTAGTAAAA CTTGGAGATG GCAAACTTGA TAGCAGTTTT
4261 CATTTAGGAA TGGATATTAT TGAAATCCCC CATGAAATGA TTTGTAACGG ATCTATTATT
4321 GAAGCTACCT TTGGAAATAG TATATCTATA GATAATATTA AAAATATATC TAAACGTGCA
4381 ATTCTTTGTC CAAAAAATGA GCATGTTCAA AAATTAAATG AAGAAATTTT GGATATACTT
4441 GATGGAGATT TTCACACATA TTTGAGTGAT GATTCCATTG ATTCAACAGA TGATGCTGAA
4501 AAGGAAAATT TTCCCATCGA ATTTCTTAAT AGTATTACTC CTTCGGGAAT GCCGTGTCAT
4561 AAATTAAAAT TGAAAGTGGG TGCAATCATC ATGCTATTGA GAAATCTTAA TAGTAAATGG
4621 GGTCTTTGTA ATGGTACTAG ATTTATTATC AAAAGATTAC GACCTAACAT TATCGAAGCT
4681 GAAGTATTAA CAGGATCTGC AGAGGGAGAG GTTGTTCTGA TTCCAAGAAT TGATTTGTCC
4741 CCATCTGACA CTGGCCTCCC ATTTAAATTA ATTCGAAGAC AGTTTCCCGT GATGCCAGCA
4801 TTTGCGATGA CTATTAATAA ATCACAAGGA CAAACTCTAG ACAGAGTAGG AATATTCCTA
4861 CCTGAACCCG TTTTCGCACA TGGTCAGTTA TATGTTGCTT TCTCTCGAGT TCGAAGAGCA
4921 TGTGACGTTA AAGTTAAAGT TGTAAATACT TCATCACAAG GGAAATTAGT CAAGCACTCT
4981 GAAAGTGTTT TTACTCTTAA TGTGGTATAC AGGGAGATAT TAGAATAAGT TTAATCACTT
5041 TATCAGTCAT TGTTTGCATC AATGTTGTTT TTATATCATG TTTTTGTTGT TTTTATATCA
5101 TGTCTTTGTT GTTGTTATAT CATGTTGTTA TTGTTTATTT ATTAATAAAT TTATGTATTA
5161 TTTTCATATA CATTTTACTC ATTTCCTTTC ATCTCTCACA CTTCTATTAT AGAGAAAGGG
5221 CAAATAGCAA TATTAAAATA TTTCCTCTAA TTAATTCCCT TTCAATGTGC ACGAATTTCG
5281 TGCACCGGGC CACTAG (SEQ ID NO: 9)
[0163] Unlike other transposases, the Helitron transposase does not contain an RNase-H like catalytic domain, but instead comprises a RepHel motif made up of a replication initiator domain (Rep) and a DNA helicase domain. The Rep domain is a nuclease domain of the HUH superfamily of nucleases.
[0164] An exemplary Helitron transposase of the disclosure comprises an amino acid sequence comprising:
1 MSKEQLLIQR SSAAERCRRY RQKMSAEQRA SDLERRRRLQ QNVSEEQLLE KRRSEAEKQR
61 RHRQKMSKDQ RAFEVERRRW RRQNMSREQS STSTTNTGRN CLLSKNGVHE DAILEHSCGG
121 MTVRCEFCLS LNFSDEKPSD GKFTRCCSKG KVCPNDIHFP DYPAYLKRLM TNEDSDSKNF
181 MENIRSINSS FAFASMGANI ASPSGYGPYC FRIHGQVYHR TGTLHPSDGV SRKFAQLYIL
241 DTAEATSKRL AMPENQGCSE RLMININNLM HEINELTKSY KMLHEVEKEA QSEAAAKGIA
301 PTEVTMAIKY DRNSDPGRYN SPRVTEVAVI FRNEDGEPPF ERDLLIHCKP DPNNPNATKM
361 KQISILFPTL DAMTYPILFP HGEKGWGTDI ALRLRDNSVI DNNTRQNVRT RVTQMQYYGF
421 HLSVRDTFNP ILNAGKLTQQ FIVDSYSKME ANRINFIKAN QSKLRVEKYS GLMDYLKSRS
481 ENDNVPIGKM IILPSSFEGS PRNMQQRYQD AMAIVTKYGK PDLFITMTCN PKWADITNNL
541 QRWQKVENRP DLVARVFNIK LNALLNDICK FHLFGKVIAK IHVIEFQKRG LPHAHILLIL
601 DSESKLRSED DIDRIVKAEI PDEDQCPRLF QIVKSNMVHG PCGIQNPNSP CMENGKCSKG
661 YPKEFQNATI GNIDGYPKYK RRSGSTMSIG NKVVDNTWIV PYNPYLCLKY NCHINVEVCA
721 SIKSVKYLFK YIYKGHDCAN IQISEKNIIN HDEVQDFIDS RYVSAPEAVW RLFAMRMHDQ
781 SHAITRLAIH LPNDQNLYFH TDDFAEVLDR AKRHNSTLMA WFLLNREDSD ARNYYYWEIP
841 QHYVFNNSLW TKRRKGGNKV LGRLFTVSFR EPERYYLRLL LLHVKGAISF EDLRTVGGVT
901 YDTFHEAAKH RGLLLDDTIW KDTIDDAIIL NMPKQLRQLF AYICVFGCPS AADKLWDENK
961 SHFIEDFCWK LHRREGACVN CEMHALNEIQ EVFTLHGMKC SHFKLPDYPL LMNANTCDQL
1021 YEQQQAEVLI NSLNDEQLAA FQTITSAIED QTVHPKCFFL DGPGGSGKTY LYKVLTHYIR
1081 GRGGTVLPTA STGIAANLLL GGRTFHSQYK LPIPLNETSI SRLDIKSEVA KTIKKAQLLI
1141 IDECTMASSH AINAIDRLLR EIMNLNVAFG GKVLLLGGDF RQCLSIVPHA MRSAIVQTSL
1201 KYCNVWGCFR KLSLKTNMRS EDSAYSEWLV KLGDGKLDSS FHLGMDIIEI PHEMICNGSI
1261 IEATFGNSIS IDNIKNISKR AILCPKNEHV QKLNEEILDI LDGDFHTYLS DDSIDSTDDA
1321 EKENFPIEFL NSITPSGMPC HKLKLKVGAI IMLLRNLNSK WGLCNGTRFI IKRLRPNIIE
1381 AEVLTGSAEG EVVLIPRIDL SPSDTGLPFK LIRRQFPVMP AFAMTINKSQ GQTLDRVGIF
1441 LPEPVFAHGQ LYVAFSRVRR ACDVKVKVVN TSSQGKLVKH SESVFTLNVV YREILE (SEQ ID NO: 10).
[0165] In Helitron transpositions, a hairpin close to the 3’ end of the transposon functions as a terminator. However, this hairpin can be bypassed by the transposase, resulting in the transduction of flanking sequences. In addition, Helraiser transposition generates covalently closed circular intermediates. Furthermore, Helitron transpositions can lack target site duplications. In the Helraiser sequence, the transposase is flanked by left and right terminal sequences termed LTS and RTS. These sequences terminate with a conserved 5’-TC/CTAG-3’ motif. A 19 bp palindromic sequence with the potential to form the hairpin termination structure is located 11 nucleotides upstream of the RTS and consists of the sequence
GTGCACGAATTTCGTGCACCGGGCCACTAG (SEQ ID NO: 11).
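The positional relationship described above can be checked directly on SEQ ID NO: 11. The following is a minimal Python sketch and is not part of the disclosure: the function names are hypothetical, and it assumes that the palindrome occupies the first 19 nucleotides of the listed sequence, with the remaining nucleotides running to the CTAG end.

```python
# Illustrative check of the hairpin-forming palindrome in SEQ ID NO: 11.
# Function names are hypothetical; only the sequence comes from the text.

SEQ_ID_NO_11 = "GTGCACGAATTTCGTGCACCGGGCCACTAG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def stem_length(seq: str) -> int:
    """Count bases, from the outside in, that pair with the opposite end."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    n = 0
    while n < len(seq) // 2 and seq[n] == rc[n]:
        n += 1
    return n


# ASSUMPTION: the palindrome is the first 19 nt of the listed sequence.
palindrome = SEQ_ID_NO_11[:19]
downstream = SEQ_ID_NO_11[19:]

print(palindrome, stem_length(palindrome))   # palindrome and its stem length
print(downstream, len(downstream))           # nucleotides after the palindrome
print(SEQ_ID_NO_11.endswith("CTAG"))         # terminal motif check
```

Under that assumption, the outer nine bases on each side of the 19-nucleotide segment are mutually complementary around a single central base, eleven nucleotides remain downstream of it, and the listed sequence ends in CTAG.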
[0166] Another example transposon system is the Tol2 transposon system. Tol2 transposons may be isolated or derived from the genome of the medaka fish, and may be similar to transposons of the hAT family. Exemplary Tol2 transposons of the disclosure are encoded by a sequence comprising about 4.7 kilobases and contain a gene encoding the Tol2 transposase, which contains four exons. An exemplary Tol2 transposase of the disclosure comprises an amino acid sequence comprising the following:
1 MEEVCDSSAA ASSTVQNQPQ DQEHPWPYLR EFFSLSGVNK DSFKMKCVLC LPLNKEISAF
61 KSSPSNLRKH IERMHPNYLK NYSKLTAQKR KIGTSTHASS SKQLKVDSVF PVKHVSPVTV
121 NKAILRYIIQ GLHPFSTVDL PSFKELISTL QPGISVITRP TLRSKIAEAA LIMKQKVTAA
181 MSEVEWIATT TDCWTARRKS FIGVTAHWIN PGSLERHSAA LACKRLMGSH TFEVLASAMN
241 DIHSEYEIRD KWCTTTDSG SNFMKAFRVF GVENNDIETE ARRCESDDTD SEGCGEGSDG
301 VEFQDASRVL DQDDGFEFQL PKHQKCACHL LNLVSSVDAQ KALSNEHYKK LYRSVFGKCQ
361 ALWNKSSRSA LAAEAVESES RLQLLRPNQT RWNSTFMAVD RILQICKEAG EGALRNICTS
421 LEVPMFNPAE MLFLTEWANT MRPVAKVLDI LQAETNTQLG WLLPSVHQLS LKLQRLHHSL
481 RYCDPLVDAL QQGIQTRFKH MFEDPEIIAA AILLPKFRTS WTNDETIIKR GMDYIRVHLE
541 PLDHKKELAN SSSDDEDFFA SLKPTTHEAS KELDGYLACV SDTRESLLTF PAICSLSIKT
601 NTPLPASAAC ERLFSTAGLL FSPKRARLDT NNFENQLLLK LNLRFYNFE (SEQ ID NO: 12).
[0167] An exemplary Tol2 transposon of the disclosure, including inverted repeats, subterminal sequences and the Tol2 transposase, is encoded by a nucleic acid sequence comprising the following:
1 CAGAGGTGTA AAGTACTTGA GTAATTTTAC TTGATTACTG TACTTAAGTA TTATTTTTGG
61 GGATTTTTAC TTTACTTGAG TACAATTAAA AATCAATACT TTTACTTTTA CTTAATTACA
121 TTTTTTTAGA AAAAAAAGTA CTTTTTACTC CTTACAATTT TATTTACAGT CAAAAAGTAC
181 TTATTTTTTG GAGATCACTT CATTCTATTT TCCCTTGCTA TTACCAAACC AATTGAATTG
241 CGCTGATGCC CAGTTTAATT TAAATGTTAT TTATTCTGCC TATGAAAATC GTTTTCACAT
301 TATATGAAAT TGGTCAGACA TGTTCATTGG TCCTTTGGAA GTGACGTCAT GTCACATCTA
361 TTACCACAAT GCACAGCACC TTGACCTGGA AATTAGGGAA ATTATAACAG TCAATCAGTG
421 GAAGAAAATG GAGGAAGTAT GTGATTCATC AGCAGCTGCG AGCAGCACAG TCCAAAATCA
481 GCCACAGGAT CAAGAGCACC CGTGGCCGTA TCTTCGCGAA TTCTTTTCTT TAAGTGGTGT
541 AAATAAAGAT TCATTCAAGA TGAAATGTGT CCTCTGTCTC CCGCTTAATA AAGAAATATC
601 GGCCTTCAAA AGTTCGCCAT CAAACCTAAG GAAGCATATT GAGGTAAGTA CATTAAGTAT
661 TTTGTTTTAC TGATAGTTTT TTTTTTTTTT TTTTTTTTTT TTTTTGGGTG TGCATGTTTT
721 GACGTTGATG GCGCGCCTTT TATATGTGTA GTAGGCCTAT TTTCACTAAT GCATGCGATT
781 GACAATATAA GGCTCACGTA ATAAAATGCT AAAATGCATT TGTAATTGGT AACGTTAGGT
841 CCACGGGAAA TTTGGCGCCT ATTGCAGCTT TGAATAATCA TTATCATTCC GTGCTCTCAT
901 TGTGTTTGAA TTCATGCAAA ACACAAGAAA ACCAAGCGAG AAATTTTTTT CCAAACATGT
961 TGTATTGTCA AAACGGTAAC ACTTTACAAT GAGGTTGATT AGTTCATGTA TTAACTAACA
1021 TTAAATAACC ATGAGCAATA CATTTGTTAC TGTATCTGTT AATCTTTGTT AACGTTAGTT
1081 AATAGAAATA CAGATGTTCA TTGTTTGTTC ATGTTAGTTC ACAGTGCATT AACTAATGTT
1141 AACAAGATAT AAAGTATTAG TAAATGTTGA AATTAACATG TATACGTGCA GTTCATTATT
1201 AGTTCATGTT AACTAATGTA GTTAACTAAC GAACCTTATT GTAAAAGTGT TACCATCAAA
1261 ACTAATGTAA TGAAATCAAT TCACCCTGTC ATGTCAGCCT TACAGTCCTG TGTTTTTGTC
1321 AATATAATCA GAAATAAAAT TAATGTTTGA TTGTCACTAA ATGCTACTGT ATTTCTAAAA
1381 TCAACAAGTA TTTAACATTA TAAAGTGTGC AATTGGCTGC AAATGTCAGT TTTATTAAAG
1441 GGTTAGTTCA CCCAAAAATG AAAATAATGT CATTAATGAC TCGCCCTCAT GTCGTTCCAA
1501 GCCCGTAAGA CCTCCGTTCA TCTTCAGAAC ACAGTTTAAG ATATTTTAGA TTTAGTCCGA
1561 GAGCTTTCTG TGCCTCCATT GAGAATGTAT GTACGGTATA CTGTCCATGT CCAGAAAGGT
1621 AATAAAAACA TCAAAGTAGT CCATGTGACA TCAGTGGGTT AGTTAGAATT TTTTGAAGCA
1681 TCGAATACAT TTTGGTCCAA AAATAACAAA ACCTACGACT TTATTCGGCA TTGTATTCTC
1741 TTCCGGGTCT GTTGTCAATC CGCGTTCACG ACTTCGCAGT GACGCTACAA TGCTGAATAA
1801 AGTCGTAGGT TTTGTTATTT TTGGACCAAA ATGTATTTTC GATGCTTCAA ATAATTCTAC
1861 CTAACCCACT GATGTCACAT GGACTACTTT GATGTTTTTA TTACCTTTCT GGACATGGAC
1921 AGTATACCGT ACATACATTT TCAGTGGAGG GACAGAAAGC TCTCGGACTA AATCTAAAAT
1981 ATCTTAAACT GTGTTCCGAA GATGAACGGA GGTGTTACGG GCTTGGAACG ACATGAGGGT
2041 GAGTCATTAA TGACATCTTT TCATTTTTGG GTGAACTAAC CCTTTAATGC TGTAATCAGA
2101 GAGTGTATGT GTAATTGTTA CATTTATTGC ATACAATATA AATATTTATT TGTTGTTTTT
2161 ACAGAGAATG CACCCAAATT ACCTCAAAAA CTACTCTAAA TTGACAGCAC AGAAGAGAAA
2221 GATCGGGACC TCCACCCATG CTTCCAGCAG TAAGCAACTG AAAGTTGACT CAGTTTTCCC
2281 AGTCAAACAT GTGTCTCCAG TCACTGTGAA CAAAGCTATA TTAAGGTACA TCATTCAAGG
2341 ACTTCATCCT TTCAGCACTG TTGATCTGCC ATCATTTAAA GAGCTGATTA GTACACTGCA
2401 GCCTGGCATT TCTGTCATTA CAAGGCCTAC TTTACGCTCC AAGATAGCTG AAGCTGCTCT
2461 GATCATGAAA CAGAAAGTGA CTGCTGCCAT GAGTGAAGTT GAATGGATTG CAACCACAAC
2521 GGATTGTTGG ACTGCACGTA GAAAGTCATT CATTGGTGTA ACTGCTCACT GGATCAACCC
2581 TGGAAGTCTT GAAAGACATT CCGCTGCACT TGCCTGCAAA AGATTAATGG GCTCTCATAC
2641 TTTTGAGGTA CTGGCCAGTG CCATGAATGA TATCCACTCA GAGTATGAAA TACGTGACAA
2701 GGTTGTTTGC ACAACCACAG ACAGTGGTTC CAACTTTATG AAGGCTTTCA GAGTTTTTGG
2761 TGTGGAAAAC AATGATATCG AGACTGAGGC AAGAAGGTGT GAAAGTGATG ACACTGATTC
2821 TGAAGGCTGT GGTGAGGGAA GTGATGGTGT GGAATTCCAA GATGCCTCAC GAGTCCTGGA
2881 CCAAGACGAT GGCTTCGAAT TCCAGCTACC AAAACATCAA AAGTGTGCCT GTCACTTACT
2941 TAACCTAGTC TCAAGCGTTG ATGCCCAAAA AGCTCTCTCA AATGAACACT ACAAGAAACT
3001 CTACAGATCT GTCTTTGGCA AATGCCAAGC TTTATGGAAT AAAAGCAGCC GATCGGCTCT
3061 AGCAGCTGAA GCTGTTGAAT CAGAAAGCCG GCTTCAGCTT TTAAGGCCAA ACCAAACGCG
3121 GTGGAATTCA ACTTTTATGG CTGTTGACAG AATTCTTCAA ATTTGCAAAG AAGCAGGAGA
3181 AGGCGCACTT CGGAATATAT GCACCTCTCT TGAGGTTCCA ATGTAAGTGT TTTTCCCCTC
3241 TATCGATGTA AACAAATGTG GGTTGTTTTT GTTTAATACT CTTTGATTAT GCTGATTTCT
3301 CCTGTAGGTT TAATCCAGCA GAAATGCTGT TCTTGACAGA GTGGGCCAAC ACAATGCGTC
3361 CAGTTGCAAA AGTACTCGAC ATCTTGCAAG CGGAAACGAA TACACAGCTG GGGTGGCTGC
3421 TGCCTAGTGT CCATCAGTTA AGCTTGAAAC TTCAGCGACT CCACCATTCT CTCAGGTACT
3481 GTGACCCACT TGTGGATGCC CTACAACAAG GAATCCAAAC ACGATTCAAG CATATGTTTG
3541 AAGATCCTGA GATCATAGCA GCTGCCATCC TTCTCCCTAA ATTTCGGACC TCTTGGACAA
3601 ATGATGAAAC CATCATAAAA CGAGGTAAAT GAATGCAAGC AACATACACT TGACGAATTC
3661 TAATCTGGGC AACCTTTGAG CCATACCAAA ATTATTCTTT TATTTATTTA TTTTTGCACT
3721 TTTTAGGAAT GTTATATCCC ATCTTTGGCT GTGATCTCAA TATGAATATT GATGTAAAGT
3781 ATTCTTGCAG CAGGTTGTAG TTATCCCTCA GTGTTTCTTG AAACCAAACT CATATGTATC
3841 ATATGTGGTT TGGAAATGCA GTTAGATTTT ATGCTAAAAT AAGGGATTTG CATGATTTTA
3901 GATGTAGATG ACTGCACGTA AATGTAGTTA ATGACAAAAT CCATAAAATT TGTTCCCAGT
3961 CAGAAGCCCC TCAACCAAAC TTTTCTTTGT GTCTGCTCAC TGTGCTTGTA GGCATGGACT
4021 ACATCAGAGT GCATCTGGAG CCTTTGGACC ACAAGAAGGA ATTGGCCAAC AGTTCATCTG
4081 ATGATGAAGA TTTTTTCGCT TCTTTGAAAC CGACAACACA TGAAGCCAGC AAAGAGTTGG
4141 ATGGATATCT GGCCTGTGTT TCAGACACCA GGGAGTCTCT GCTCACGTTT CCTGCTATTT
4201 GCAGCCTCTC TATCAAGACT AATACACCTC TTCCCGCATC GGCTGCCTGT GAGAGGCTTT
4261 TCAGCACTGC AGGATTGCTT TTCAGCCCCA AAAGAGCTAG GCTTGACACT AACAATTTTG
4321 AGAATCAGCT TCTACTGAAG TTAAATCTGA GGTTTTACAA CTTTGAGTAG CGTGTACTGG
4381 CATTAGATTG TCTGTCTTAT AGTTTGATAA TTAAATACAA ACAGTTCTAA AGCAGGATAA
4441 AACCTTGTAT GCATTTCATT TAATGTTTTT TGAGATTAAA AGCTTAAACA AGAATCTCTA
4501 GTTTTCTTTC TTGCTTTTAC TTTTACTTCC TTAATACTCA AGTACAATTT TAATGGAGTA
4561 CTTTTTTACT TTTACTCAAG TAAGATTCTA GCCAGATACT TTTACTTTTA ATTGAGTAAA
4621 ATTTTCCCTA AGTACTTGTA CTTTCACTTG AGTAAAATTT TTGAGTACTT TTTACACCTC
4681 TG (SEQ ID NO: 13).
[0168] Similar to the RNA-directed nucleases discussed here, poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles may complex with the transposase in its native protein form, as mRNA that is translated into protein in the target cell, or as an expression vector containing DNA to express the transposase protein. For example, genes encoding the transposase may be provided in the same vector as the transposon itself, or on a different vector.
[0169] Various embodiments may further enable complexing a nuclease and a transposon system in poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles for delivery to a target cell. Such micelle systems may be used, for example, to replace a mutated gene that causes disease with a healthy copy of the gene that is inserted at a specific site dictated by the activity of the nuclease. Specifically, a transposon may be created that includes one or more genes to be inserted, which are surrounded by the ITRs for recognition by the transposase. The transposon and ITRs may be provided on a vector that contains homology arms on each end of the ITRs. The transposon system (i.e., the transposon vector and corresponding transposase), when delivered with the nuclease, may serve the function of the DNA repair template used in HDR. That is, following the creation of one or more DSBs by the nuclease, the transposon may be inserted into the target DNA based on the homology arms. In some embodiments, the transposon insertion may occur between the two ends generated by a DSB. In other embodiments, the transposon may be inserted between one arm of a first DSB and the other arm at a second DSB in the target DNA (i.e., replacing the sequence between two DSBs).
[0170] While a variety of poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelle formulations that complex with proteins and/or nucleic acids may be designed for different uses, each complexing system may include common characteristics in order to be effective. For example, nucleic acids may be complexed with a poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelle with at least 40% efficiency. Such minimum efficiency ensures delivery of enough active molecule to achieve efficient DNA cleavage and/or other modification, and that the product can be reproducibly generated at a low cost. In another example, the poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles may be designed to be stable, yet to provide facile release of the complexed payload once the micelle has been taken up intracellularly, thereby avoiding endosomal retrafficking and ensuring release of the nucleic acids. Moreover, in various gene therapy systems, the vector (i.e., transposon) may be designed to provide stable expression.
[0171] The gene editing tools provided in the poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles described herein may be beneficial for a number of in vivo applications. For example, the embodiment materials may be delivered to various cell types in order to cut or to repair gene defects. Such cells include, but are not limited to, hepatocytes, hepatic endothelial cells, immune cells, neurons, etc. The embodiment poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles may also be delivered to various cell types in order to silence defective genes that cause diseases (for example, delivery to retinal cells to silence mutations underlying Leber's Congenital Amaurosis).
[0172] Various methods may be used to generate the poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles and/or complexation of micelles and proteins and/or nucleic acids described herein. In some embodiments, conventional preparation techniques such as thin-film rehydration, direct-hydration, and electro-formation may be used to form polymeric micelles that complex with nucleic acids and/or proteins with gene editing functions into various degradable and non-degradable micelles.
[0173] Various poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles complexed with model proteins and model nucleic acids may be created using conventional techniques. For example, bovine serum albumin (BSA; Mw = about 66 kDa), which has a size and thermal stability (i.e., denaturation above 60 °C) comparable to other medium size proteins with therapeutic potential, was used as a model protein. Other model proteins that may be used in such compositions are myoglobin (Mb; Mw = about 17 kDa) and catalase (Mw = about 250 kDa). The complexing of model proteins having various sizes provides a range of sizes of functional proteins that may be used in various embodiments. Further, various DNA plasmids may be used as model nucleic acids for poly(histidine)-based, or another ionizable or constitutively cationically-charged, polymer-based micelles, such as plasmid DNA encoding the mammalian expression vector for expression of green fluorescent protein (GFP) using the elongation factor 1 alpha (EF1a) promoter (i.e., pEF-GFP DNA). The pEF-GFP DNA is about 5000 base-pairs, and has a molecular weight of about 3283 kDa.
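The relationship between plasmid length and molecular weight stated above can be illustrated with a short calculation. This is a rough sketch only, not a method from the disclosure: the function names are hypothetical, and the average mass of about 650 Da per base pair is an assumed rule-of-thumb value that the text does not state.

```python
# Rough estimate of double-stranded DNA mass from its length.
# ASSUMPTION: ~650 Da average mass per base pair (a common rule of thumb,
# not a value given in the disclosure). Function names are hypothetical.

AVG_DA_PER_BP = 650.0


def dsdna_mass_kda(length_bp: int, da_per_bp: float = AVG_DA_PER_BP) -> float:
    """Approximate mass in kDa of a double-stranded DNA of length_bp."""
    return length_bp * da_per_bp / 1000.0


def implied_da_per_bp(mass_kda: float, length_bp: int) -> float:
    """Average per-base-pair mass (Da) implied by a stated mass and length."""
    return mass_kda * 1000.0 / length_bp


# Estimate for a plasmid of about 5000 base pairs.
print(dsdna_mass_kda(5000))

# Per-base-pair mass implied by the figures given for pEF-GFP DNA.
print(implied_da_per_bp(3283, 5000))
```

With the assumed average, a 5000 base-pair plasmid comes to roughly 3250 kDa, close to the stated value of about 3283 kDa, which corresponds to an average of roughly 657 Da per base pair.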
[0174] In the micelles that are formed by the various embodiment triblock copolymers, the hydrophobic blocks may aggregate to form a core, leaving the hydrophilic blocks and poly(histidine), other ionizable polymers, or cationically-charged polymer blocks on the ends to form one or more surrounding layers.
Cas-CLOVER
[0175] The disclosure provides a composition comprising a guide RNA and a fusion protein or a sequence encoding the fusion protein, wherein the fusion protein comprises a dCas9 and a Clo051 endonuclease or a nuclease domain thereof.
Small Cas9 (SaCas9)
[0176] The disclosure provides compositions comprising a small Cas9 (SaCas9) operatively-linked to an effector. In some embodiments, the disclosure provides a fusion protein comprising, consisting essentially of or consisting of a DNA localization component and an effector molecule, wherein the effector comprises a small Cas9 (SaCas9). In some embodiments, a small Cas9 construct of the disclosure may comprise an effector comprising a type IIS endonuclease.
[0177] Amino acid sequence of Staphylococcus aureus Cas9 with an active catalytic site.
1 mkrnyilgld igitsvgygi idyetrdvid agvrlfkean vennegrrsk rgarrlkrrr
61 rhriqrvkkl lfdynlltdh selsginpye arvkglsqkl seeefsaall hlakrrgvhn
121 vneveedtgn elstkeqisr nskaleekyv aelqlerlkk dgevrgsinr fktsdyvkea
181 kqllkvqkay hqldqsfidt yidlletrrt yyegpgegsp fgwkdikewy emlmghctyf
241 peelrsvkya ynadlynaln dlnnlvitrd enekleyyek fqiienvfkq kkkptlkqia
301 keilvneedi kgyrvtstgk peftnlkvyh dikditarke iienaelldq iakiltiyqs
361 sediqeeltn lnseltqeei eqisnlkgyt gthnlslkai nlildelwht ndnqiaifnr
421 lklvpkkvdl sqqkeipttl vddfilspvv krsfiqsikv inaiikkygl pndiiielar
481 eknskdaqkm inemqkrnrq tnerieeiir ttgkenakyl iekiklhdmq egkclyslea
541 ipledllnnp fnyevdhiip rsvsfdnsfn nkvlvkqeen skkgnrtpfq ylsssdskis
601 yetfkkhiln lakgkgrisk tkkeylleer dinrfsvqkd finrnlvdtr yatrglmnll
661 rsyfrvnnld vkvksinggf tsflrrkwkf kkernkgykh haedaliian adfifkewkk
721 ldkakkvmen qmfeekqaes mpeieteqey keifitphqi khikdfkdyk yshrvdkkpn
781 relindtlys trkddkgntl ivnnlnglyd kdndklkkli nkspekllmy hhdpqtyqkl
841 klimeqygde knplykyyee tgnyltkysk kdngpvikki kyygnklnah lditddypns
901 rnkvvklslk pyrfdvyldn gvykfvtvkn ldvikkenyy evnskcyeea kklkkisnqa
961 efiasfynnd likingelyr vigvnndlln rievnmidit yreylenmnd krppriikti
1021 asktqsikky stdilgnlye vkskkhpqii kkg (SEQ ID NO: 14)
Inactivated, small Cas9 (dSaCas9)
[0178] The disclosure provides compositions comprising an inactivated, small Cas9 (dSaCas9) operatively-linked to an effector. In some embodiments, the disclosure provides a fusion protein comprising, consisting essentially of or consisting of a DNA localization component and an effector molecule, wherein the effector comprises a small, inactivated Cas9 (dSaCas9). In some embodiments, a small, inactivated Cas9 (dSaCas9) construct of the disclosure may comprise an effector comprising a type IIS endonuclease.
[0179] dSaCas9 Sequence: D10A and N580A mutations (bold, capitalized, and underlined) inactivate the catalytic site.
1 mkrnyilglA igitsvgygi idyetrdvid agvrlfkean vennegrrsk rgarrlkrrr
61 rhriqrvkkl lfdynlltdh selsginpye arvkglsqkl seeefsaall hlakrrgvhn
121 vneveedtgn elstkeqisr nskaleekyv aelqlerlkk dgevrgsinr fktsdyvkea
181 kqllkvqkay hqldqsfidt yidlletrrt yyegpgegsp fgwkdikewy emlmghctyf
241 peelrsvkya ynadlynaln dlnnlvitrd enekleyyek fqiienvfkq kkkptlkqia
301 keilvneedi kgyrvtstgk peftnlkvyh dikditarke iienaelldq iakiltiyqs
361 sediqeeltn lnseltqeei eqisnlkgyt gthnlslkai nlildelwht ndnqiaifnr
421 lklvpkkvdl sqqkeipttl vddfilspvv krsfiqsikv inaiikkygl pndiiielar
481 eknskdaqkm inemqkrnrq tnerieeiir ttgkenakyl iekiklhdmq egkclyslea
541 ipledllnnp fnyevdhiip rsvsfdnsfn nkvlvkqeeA skkgnrtpfq ylsssdskis
601 yetfkkhiln lakgkgrisk tkkeylleer dinrfsvqkd finrnlvdtr yatrglmnll
661 rsyfrvnnld vkvksinggf tsflrrkwkf kkernkgykh haedaliian adfifkewkk
721 ldkakkvmen qmfeekqaes mpeieteqey keifitphqi khikdfkdyk yshrvdkkpn
781 relindtlys trkddkgntl ivnnlnglyd kdndklkkli nkspekllmy hhdpqtyqkl
841 klimeqygde knplykyyee tgnyltkysk kdngpvikki kyygnklnah lditddypns
901 rnkvvklslk pyrfdvyldn gvykfvtvkn ldvikkenyy evnskcyeea kklkkisnqa
961 efiasfynnd likingelyr vigvnndlln rievnmidit yreylenmnd krppriikti
1021 asktqsikky stdilgnlye vkskkhpqii kkg (SEQ ID NO: 15)
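The two inactivating substitutions can be expressed as simple position-checked edits to the active sequence. The sketch below is illustrative only: the helper names are hypothetical, and it operates on two short excerpts copied from the listings above (residues 1-10 and 571-580) rather than on the full-length protein.

```python
import re

# Illustrative helpers for point substitutions such as D10A and N580A.
# The helper names are hypothetical and not part of the disclosure.


def apply_substitution(seq: str, mutation: str, start: int = 1) -> str:
    """Apply a substitution like 'D10A' to seq, whose first residue is numbered `start`."""
    match = re.fullmatch(r"([A-Za-z])(\d+)([A-Za-z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
    index = position - start
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    if seq[index].upper() != wild_type.upper():
        raise ValueError(f"expected {wild_type} at {position}, found {seq[index]}")
    # Capitalize the substituted residue, as in the dSaCas9 listing.
    return seq[:index] + new.upper() + seq[index + 1:]


def list_differences(a: str, b: str, start: int = 1) -> list:
    """Name every position at which two equal-length sequences differ."""
    return [
        f"{x.upper()}{i}{y.upper()}"
        for i, (x, y) in enumerate(zip(a, b), start=start)
        if x.upper() != y.upper()
    ]


# Residues 1-10 and 571-580 of the active SaCas9 listing (SEQ ID NO: 14).
active_1_10 = "mkrnyilgld"
active_571_580 = "nkvlvkqeen"

print(apply_substitution(active_1_10, "D10A"))
print(apply_substitution(active_571_580, "N580A", start=571))
print(list_differences(active_571_580, "nkvlvkqeeA", start=571))
```

Checking the wild-type residue before substituting guards against an off-by-one numbering error; the same comparison helper could be run over the two full listings to confirm that they differ only at the named positions.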
Inactivated Cas9 (dCas9)
[0180] The disclosure provides compositions comprising an inactivated Cas9 (dCas9) operatively-linked to an effector. In some embodiments, the disclosure provides a fusion protein comprising, consisting essentially of or consisting of a DNA localization component and an effector molecule, wherein the effector comprises an inactivated Cas9 (dCas9). In some embodiments, an inactivated Cas9 (dCas9) construct of the disclosure may comprise an effector comprising a type IIS endonuclease.
[0181] In some embodiments, the dCas9 of the disclosure comprises a dCas9 isolated or derived from Streptococcus pyogenes. In some embodiments, the dCas9 comprises a dCas9 with substitutions at positions 10 and 840 of the amino acid sequence of the dCas9 which inactivate the catalytic site. In some embodiments, these substitutions are D10A and H840A.
In some embodiments, the amino acid sequence of the dCas9 comprises the sequence of:
1 XDKKYSIGLA IGTNSVGWAV ITDEYKVPSK KFKVLGNTDR HSIKKNLIGA LLFDSGETAE
61 ATRLKRTARR RYTRRKNRIC YLQEIFSNEM AKVDDSFFHR LEESFLVEED KKHERHPIFG
121 NIVDEVAYHE KYPTIYHLRK KLVDSTDKAD LRLIYIAIAH MIKFRGHFLI EGDLNPDNSD
181 VDKLFIQLVQ TYNQLFEENP INASGVDAKA ILSARLSKSR RLENLIAQLP GEKKNGLFGN
241 LIALSLGLTP NFKSNFDLAE DAKLQLSKDT YDDDLDNLLA QIGDQYADLF LAAKNLSDAI
301 LLSDILRVNT EITKAPLSAS MIKRYDEHHQ DLTLLKALVR QQLPEKYKEI FFDQSKNGYA
361 GYIDGGASQE EFYKFIKPIL EKMDGTEELL VKLNREDLLR KQRTFDNGSI PHQIHLGELH
421 AILRRQEDFY PFLKDNREKI EKILTFRIPY YVGPLARGNS RFAWMTRKSE ETITPWNFEE
481 VVDKGASAQS FIERMTNFDK NLPNEKVLPK HSLLYEYFTV YNELTKVKYV TEGMRKPAFL
541 SGEQKKAIVD LLFKTNRKVT VKQLKEDYFK KIECFDSVEI SGVEDRFNAS LGTYHDLLKI
601 IKDKDFLDNE ENEDILEDIV LTLTLFEDRE MIEERLKTYA HLFDDKVMKQ LKRRRYTGWG
661 RLSRKLINGI RDKQSGKTIL DFLKSDGFAN RNFMQLIHDD SLTFKEDIQK AQVSGQGDSL
721 HEHIANLAGS PAIKKGILQT VKVVDELVKV MGRHKPENIV IEMARENQTT QKGQKNSRER
781 MKRIEEGIKE LGSQILKEHP VENTQLQNEK LYLYYLQNGR DMYVDQELDI NRLSDYDVDA
841 IVPQSFLKDD SIDNKVLTRS DKNRGKSDNV PSEEVVKKMK NYWRQLLNAK LITQRKFDNL
901 TKAERGGLSE LDKAGFIKRQ LVETRQITKH VAQILDSRMN TKYDENDKLI REVKVITLKS
961 KLVSDFRKDF QFYKVREINN YHHAHDAYLN AVVGTALIKK YPKLESEFVY GDYKVYDVRK
1021 MIAKSEQEIG KATAKYFFYS NIMNFFKTEI TIANGEIRKR PLIETNGETG EIVWDKGRDF
1081 ATVRKVLSMP QVNIVKKTEV QTGGFSKESI LPKRNSDKLI ARKKDWDPKK YGGFDSPTVA
1141 YSVLVVAKVE KGKSKKLKSV KELLGITIME RSSFEKNPID FLEAKGYKEV KKDLIIKLPK
1201 YSLFELENGR KRMLASAGEL QKGNELALPS KYVNFLYLAS HYEKLKGSPE DNEQKQLFVE
1261 QHKHYLDEII EQISEFSKRV ILADANLDKV LSAYNKHRDK PIREQAENII HLFTLTNLGA
1321 PAAFKYFDTT IDRKRYTSTK EVLDATLIHQ SITGLYETRI DLSQLGGD (SEQ ID NO: 16)
[0182] In some embodiments, the amino acid sequence of the dCas9 comprises the sequence of:
1 MDKKYSIGLA IGTNSVGWAV ITDEYKVPSK KFKVLGNTDR HSIKKNLIGA LLFDSGETAE
61 ATRLKRTARR RYTRRKNRIC YLQEIFSNEM AKVDDSFFHR LEESFLVEED KKHERHPIFG
121 NIVDEVAYHE KYPTIYHLRK KLVDSTDKAD LRLIYIAIAH MIKFRGHFLI EGDLNPDNSD
181 VDKLFIQLVQ TYNQLFEENP INASGVDAKA ILSARLSKSR RLENLIAQLP GEKKNGLFGN
241 LIALSLGLTP NFKSNFDLAE DAKLQLSKDT YDDDLDNLLA QIGDQYADLF LAAKNLSDAI
301 LLSDILRVNT EITKAPLSAS MIKRYDEHHQ DLTLLKALVR QQLPEKYKEI FFDQSKNGYA
361 GYIDGGASQE EFYKFIKPIL EKMDGTEELL VKLNREDLLR KQRTFDNGSI PHQIHLGELH
421 AILRRQEDFY PFLKDNREKI EKILTFRIPY YVGPLARGNS RFAWMTRKSE ETITPWNFEE
481 VVDKGASAQS FIERMTNFDK NLPNEKVLPK HSLLYEYFTV YNELTKVKYV TEGMRKPAFL
541 SGEQKKAIVD LLFKTNRKVT VKQLKEDYFK KIECFDSVEI SGVEDRFNAS LGTYHDLLKI
601 IKDKDFLDNE ENEDILEDIV LTLTLFEDRE MIEERLKTYA HLFDDKVMKQ LKRRRYTGWG
661 RLSRKLINGI RDKQSGKTIL DFLKSDGFAN RNFMQLIHDD SLTFKEDIQK AQVSGQGDSL
721 HEHIANLAGS PAIKKGILQT VKVVDELVKV MGRHKPENIV IEMARENQTT QKGQKNSRER
781 MKRIEEGIKE LGSQILKEHP VENTQLQNEK LYLYYLQNGR DMYVDQELDI NRLSDYDVDA
841 IVPQSFLKDD SIDNKVLTRS DKNRGKSDNV PSEEVVKKMK NYWRQLLNAK LITQRKFDNL
901 TKAERGGLSE LDKAGFIKRQ LVETRQITKH VAQILDSRMN TKYDENDKLI REVKVITLKS
961 KLVSDFRKDF QFYKVREINN YHHAHDAYLN AVVGTALIKE YPKLESEFVY GDYKVYDVRK
1021 MIAKSEQEIG KATAKYFFYS NIMNFFKTEI TIANGEIRKR PLIETNGETG EIVWDKGRDF
1081 ATVRKVLSMP QVNIVKKTEV QTGGFSKESI LPKRNSDKLI ARKKDWDPKK YGGFDSPTVA
1141 YSVLVVAKVE KGKSKKLKSV KELLGITIME RSSFEKNPID FLEAKGYKEV KKDLIIKLPK
1201 YSLFELENGR KRMLASAGEL QKGNELALPS KYVNFLYLAS HYEKLKGSPE DNEQKQLFVE
1261 QHKHYLDEII EQISEFSKRV ILADANLDKV LSAYNKHRDK PIREQAENII HLFTLTNLGA
1321 PAAFKYFDTT IDRKRYTSTK EVLDATLIHQ SITGLYETRI DLSQLGGD (SEQ ID NO: 17).
Clo051 Endonuclease
[0183] An exemplary Clo051 nuclease domain may comprise, consist essentially of or consist of, the amino acid sequence of:
EGIKSNISLLKDELRGQISHISHEYLSLIDLAFDSKQNRLFEMKVLELLVNEYGFKGRHLGGSRKPDGIVYSTTLEDNFGIIVDTKAYSEGYSLPISQADEMERYVRENSNRDEEVNPNKWWENFSEEVKKYYFVFISGSFKGKFEEQLRRLSMTTGVNGSAVNVVNLLLGAEKIRSGEMTIEELERAMFNNSEFILKY (SEQ ID NO: 18).
Cas-CLOVER Fusion Protein
[0184] In some embodiments, an exemplary dCas9-Clo051 fusion protein (embodiment 1) may comprise, consist essentially of or consist of, the amino acid sequence of (Clo051 sequence underlined, linker bold italics, dCas9 sequence (Streptococcus pyogenes) in italics):
MAPKKKRKVEGIKSNI SLLKDELRGQISHI SHEYLSLIDIAFDSKQNRLFEMKVLELLVNEYGFKGRHLGGSRKP DGIVYSTTLEDNFGIIVDTKAYSEGYSLPI SQADEMERYVRENSNRDEEV PNKWWENFSEEVKKYYFVFISGSF KGKFEEQLRRLSMTTGVNGSAVNWNLLLGAEKIRSGEMTIEELERAMFNNSEFILKYGGGGSDKKYSIGLAIGT NSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEIFSN EMAKVDDSFFHRLEESELVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIK FRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLF GNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEIT KAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEE LLVKLNREDLLRKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFA WMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPA ELSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENE DILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGF ANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKWDELVKVMGRHKPENIVIEM ARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDV DAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDK AGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAY LNAWGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPT VAYSVLWAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRM
LASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLD KVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLS QLGGDGSPKKKRKVSS (SEQ ID NO: 19).
[0185] In some embodiments, an exemplary dCas9-Clo051 fusion protein (embodiment 1) may comprise, consist essentially of or consist of, the nucleic acid sequence of (dCas9 sequence derived from Streptococcus pyogenes):
1 atggcaccaa agaagaaaag aaaagtggag ggcatcaagt caaacatcag cctgctgaaa
61 gacgaactgc ggggacagat tagtcacatc agtcacgagt acctgtcact gattgatctg
121 gccttcgaca gcaagcagaa tagactgttt gagatgaaag tgctggaact gctggtcaac
181 gagtatggct tcaagggcag acatctgggc gggtctagga aacctgacgg catcgtgtac
241 agtaccacac tggaagacaa cttcggaatc attgtcgata ccaaggctta ttccgagggc
301 tactctctgc caattagtca ggcagatgag atggaaaggt acgtgcgcga aaactcaaat
361 agggacgagg aagtcaaccc caataagtgg tgggagaatt tcagcgagga agtgaagaaa
421 tactacttcg tctttatctc aggcagcttc aaagggaagt ttgaggaaca gctgcggaga
481 ctgtccatga ctaccggggt gaacggatct gctgtcaacg tggtcaatct gctgctgggc
541 gcagaaaaga tcaggtccgg ggagatgaca attgaggaac tggaacgcgc catgttcaac
601 aattctgagt ttatcctgaa gtatggaggc gggggaagcg ataagaaata ctccatcgga
661 ctggccattg gcaccaattc cgtgggctgg gctgtcatca cagacgagta caaggtgcca
721 agcaagaagt tcaaggtcct ggggaacacc gatcgccaca gtatcaagaa aaatctgatt
781 ggagccctgc tgttcgactc aggcgagact gctgaagcaa cccgactgaa gcggactgct
841 aggcgccgat atacccggag aaaaaatcgg atctgctacc tgcaggaaat tttcagcaac
901 gagatggcca aggtggacga tagtttcttt caccgcctgg aggaatcatt cctggtggag
961 gaagataaga aacacgagcg gcatcccatc tttggcaaca ttgtggacga agtcgcttat
1021 cacgagaagt accctactat ctatcatctg aggaagaaac tggtggactc caccgataag
1081 gcagacctgc gcctgatcta tctggccctg gctcacatga tcaagttccg ggggcatttt
1141 ctgatcgagg gagatctgaa ccctgacaat tctgatgtgg acaagctgtt catccagctg
1201 gtccagacat acaatcagct gtttgaggaa aacccaatta atgcctcagg cgtggacgca
1261 aaggccatcc tgagcgccag actgtccaaa tctaggcgcc tggaaaacct gatcgctcag
1321 ctgccaggag agaagaaaaa cggcctgttt gggaatctga ttgcactgtc cctgggcctg
1381 acacccaact tcaagtctaa ttttgatctg gccgaggacg ctaagctgca gctgtccaaa
1441 gacacttatg acgatgacct ggataacctg ctggctcaga tcggcgatca gtacgcagac
1501 ctgttcctgg ccgctaagaa tctgagtgac gccatcctgc tgtcagatat tctgcgcgtg
1561 aacacagaga ttactaaggc cccactgagt gcttcaatga tcaaaagata tgacgagcac
1621 catcaggatc tgaccctgct gaaggctctg gtgaggcagc agctgcccga gaaatacaag
1681 gaaatcttct ttgatcagag caagaatgga tacgccggct atattgacgg cggggcttcc
1741 caggaggagt tctacaagtt catcaagccc attctggaaa agatggacgg caccgaggaa
1801 ctgctggtga agctgaatcg ggaggacctg ctgagaaaac agaggacatt tgataacgga
1861 agcatccctc accagattca tctgggcgaa ctgcacgcca tcctgcgacg gcaggaggac
1921 ttctacccat ttctgaagga taaccgcgag aaaatcgaaa agatcctgac cttcagaatc
1981 ccctactatg tggggcctct ggcacgggga aatagtagat ttgcctggat gacaagaaag
2041 tcagaggaaa ctatcacccc ctggaacttc gaggaagtgg tcgataaagg cgctagcgca
2101 cagtccttca ttgaaaggat gacaaatttt gacaagaacc tgccaaatga gaaggtgctg
2161 cccaaacaca gcctgctgta cgaatatttc acagtgtata acgagctgac taaagtgaag
2221 tacgtcaccg aagggatgcg caagcccgca ttcctgtccg gagagcagaa gaaagccatc
2281 gtggacctgc tgtttaagac aaatcggaaa gtgactgtca aacagctgaa ggaagactat
2341 ttcaagaaaa ttgagtgttt cgattcagtg gaaatcagcg gcgtcgagga caggtttaac
2401 gcctccctgg ggacctacca cgatctgctg aagatcatca aggataagga cttcctggac
2461 aacgaggaaa atgaggacat cctggaggac attgtgctga cactgactct gtttgaggat
2521 cgcgaaatga tcgaggaacg actgaagact tatgcccatc tgttcgatga caaagtgatg
2581 aagcagctga aaagaaggcg ctacaccgga tggggacgcc tgagccgaaa actgatcaat
2641 gggattagag acaagcagag cggaaaaact atcctggact ttctgaagtc cgatggcttc
2701 gccaacagga acttcatgca gctgattcac gatgactctc tgaccttcaa ggaggacatc
2761 cagaaagcac aggtgtctgg ccagggggac agtctgcacg agcatatcgc aaacctggcc
2821 ggcagccccg ccatcaagaa agggattctg cagaccgtga aggtggtgga cgaactggtc
2881 aaggtcatgg gacgacacaa acctgagaac atcgtgattg agatggcccg cgaaaatcag
2941 acaactcaga agggccagaa aaacagtcga gaacggatga agagaatcga ggaaggcatc
3001 aaggagctgg ggtcacagat cctgaaggag catcctgtgg aaaacactca gctgcagaat
3061 gagaaactgt atctgtacta tctgcagaat ggacgggata tgtacgtgga ccaggagctg
3121 gatattaaca gactgagtga ttatgacgtg gatgccatcg tccctcagag cttcctgaag
3181 gatgactcca ttgacaacaa ggtgctgacc aggtccgaca agaaccgcgg caaatcagat
3241 aatgtgccaa gcgaggaagt ggtcaagaaa atgaagaact actggaggca gctgctgaat
3301 gccaagctga tcacacagcg gaaatttgat aacctgacta aggcagaaag aggaggcctg
3361 tctgagctgg acaaggccgg cttcatcaag cggcagctgg tggagacaag acagatcact
3421 aagcacgtcg ctcagattct ggatagcaga atgaacacaa agtacgatga aaacgacaag
3481 ctgatcaggg aggtgaaagt cattactctg aaatccaagc tggtgtctga ctttagaaag
3541 gatttccagt tttataaagt cagggagatc aacaactacc accatgctca tgacgcatac
3601 ctgaacgcag tggtcgggac cgccctgatt aagaaatacc ccaagctgga gtccgagttc
3661 gtgtacggag actataaagt gtacgatgtc cggaagatga tcgccaaatc tgagcaggaa
3721 attggcaagg ccaccgctaa gtatttcttt tacagtaaca tcatgaattt ctttaagacc
3781 gaaatcacac tggcaaatgg ggagatcaga aaaaggcctc tgattgagac caacggggag
3841 acaggagaaa tcgtgtggga caagggaagg gattttgcta ccgtgcgcaa agtcctgtcc
3901 atgccccaag tgaatattgt caagaaaact gaagtgcaga ccgggggatt ctctaaggag
3961 agtattctgc ctaagcgaaa ctctgataaa ctgatcgccc ggaagaaaga ctgggacccc
4021 aagaagtatg gcgggttcga ctctccaaca gtggcttaca gtgtcctggt ggtcgcaaag
4081 gtggaaaagg ggaagtccaa gaaactgaag tctgtcaaag agctgctggg aatcactatt
4141 atggaacgca gctccttcga gaagaatcct atcgattttc tggaagccaa gggctataaa
4201 gaggtgaaga aagacctgat cattaagctg ccaaaatact cactgtttga gctggaaaac
4261 ggacgaaagc gaatgctggc aagcgccgga gaactgcaga agggcaatga gctggccctg
4321 ccctccaaat acgtgaactt cctgtatctg gctagccact acgagaaact gaaggggtcc
4381 cctgaggata acgaacagaa gcagctgttt gtggagcagc acaaacatta tctggacgag
4441 atcattgaac agatttcaga gttcagcaag agagtgatcc tggctgacgc aaatctggat
4501 aaagtcctga gcgcatacaa caagcaccga gacaaaccaa tccgggagca ggccgaaaat
4561 atcattcatc tgttcaccct gacaaacctg ggcgcccctg cagccttcaa gtattttgac
4621 accacaatcg atcggaagag atacacttct accaaagagg tgctggatgc taccctgatc
4681 caccagagta ttaccggcct gtatgagaca cgcatcgacc tgtcacagct gggaggcgat
4741 gggagcccca agaaaaagcg gaaggtgtct agttaa (SEQ ID NO: 20).
[0186] In some embodiments, the nucleic acid sequence encoding a dCas9-Clo051 fusion protein (embodiment 1) of the disclosure may comprise a DNA. In some embodiments, the nucleic acid sequence encoding a dCas9-Clo051 fusion protein (embodiment 1) of the disclosure may comprise an RNA.
[0187] In some embodiments, an exemplary dCas9-Clo051 fusion protein (embodiment 2) may comprise, consist essentially of or consist of, the amino acid sequence of (Clo051 sequence underlined, linker bold italics, dCas9 sequence (Streptococcus pyogenes) in italics):
1 MPKKKRKVEG IKSNISLLKD ELRGQISHIS HEYLSLIDLA FDSKQNRLFE MKVLELLVNE
61 YGFKGRHLGG SRKPDGIVYS TTLEDNFGII VDTKAYSEGY SLPISQADEM ERYVRENSNR
121 DEEVNPNKWW ENFSEEVKKY YFVFISGSFK GKFEEQLRRL SMTTGVNGSA VNVVNLLLGA
181 EKIRSGEMTI EELERAMFNN SEFILKYGGG GSDKKYSIGL AIGTNSVGWA VITDEYKVPS
241 KKFKVLGNTD RHSIKKNLIG ALLFDSGETA EATRLKRTAR RRYTRRKNRI CYLQEIFSNE
301 MAKVDDSFFH RLEESFLVEE DKKHERHPIF GNIVDEVAYH EKYPTIYHLR KKLVDSTDKA
361 DLRLIYLALA HMIKFRGHFL IEGDLNPDNS DVDKLFIQLV QTYNQLFEEN PINASGVDAK
421 AILSARLSKS RRLENLIAQL PGEKKNGLFG NLIALSLGLT PNFKSNFDLA EDAKLQLSKD
481 TYDDDLDNLL AQIGDQYADL FLAAKNLSDA ILLSDILRVN TEITKAPLSA SMIKRYDEHH
541 QDLTLLKALV RQQLPEKYKE IFFDQSKNGY AGYIDGGASQ EEFYKFIKPI LEKMDGTEEL
601 LVKLNREDLL RKQRTFDNGS IPHQIHLGEL HAILRRQEDF YPFLKDNREK IEKILTFRIP
661 YYVGPLARGN SRFAWMTRKS EETITPWNFE EVVDKGASAQ SFIERMTNFD KNLPNEKVLP
721 KHSLLYEYFT VYNELTKVKY VTEGMRKPAF LSGEQKKAIV DLLFKTNRKV TVKQLKEDYF
781 KKIECFDSVE ISGVEDRFNA SLGTYHDLLK IIKDKDFLDN EENEDILEDI VLTLTLFEDR
841 EMIEERLKTY AHLFDDKVMK QLKRRRYTGW GRLSRKLING IRDKQSGKTI LDFLKSDGFA
901 NRNFMQLIHD DSLTFKEDIQ KAQVSGQGDS LHEHIANLAG SPAIKKGILQ TVKVVDELVK
961 VMGRHKPENI VIEMARENQT TQKGQKNSRE RMKRIEEGIK ELGSQILKEH PVENTQLQNE
1021 KLYLYYLQNG RDMYVDQELD INRLSDYDVD AIVPQSFLKD DSIDNKVLTR SDKNRGKSDN
1081 VPSEEVVKKM KNYWRQLLNA KLITQRKFDN LTKAERGGLS ELDKAGFIKR QLVETRQITK
1141 HVAQILDSRM NTKYDENDKL IREVKVITLK SKLVSDFRKD FQFYKVREIN NYHHAHDAYL
1201 NAVVGTALIK KYPKLESEFV YGDYKVYDVR KMIAKSEQEI GKATAKYFFY SNIMNFFKTE
1261 ITLANGEIRK RPLIETNGET GEIVWDKGRD FATVRKVLSM PQVNIVKKTE VQTGGFSKES
1321 ILPKRNSDKL IARKKDWDPK KYGGFDSPTV AYSVLVVAKV EKGKSKKLKS VKELLGITIM
1381 ERSSFEKNPI DFLEAKGYKE VKKDLIIKLP KYSLFELENG RKRMLASAGE LQKGNELALP
1441 SKYVNFLYLA SHYEKLKGSP EDNEQKQLFV EQHKHYLDEI IEQISEFSKR VILADANLDK
1501 VLSAYNKHRD KPIREQAENI IHLFTLTNLG APAAFKYFDT TIDRKRYTST KEVLDATLIH
1561 QSITGLYETR IDLSQLGGDG SPKKKRKV (SEQ ID NO: 21).
[0188] In some embodiments, an exemplary dCas9-Clo051 fusion protein (embodiment 2) may comprise, consist essentially of or consist of, the nucleic acid sequence of (dCas9 sequence derived from Streptococcus pyogenes):
1 atgcctaaga agaagcggaa ggtggaaggc atcaaaagca acatctccct cctgaaagac
61 gaactccggg ggcagattag ccacattagt cacgaatacc tctccctcat cgacctggct
121 ttcgatagca agcagaacag gctctttgag atgaaagtgc tggaactgct cgtcaatgag
181 tacgggttca agggtcgaca cctcggcgga tctaggaaac cagacggcat cgtgtatagt
241 accacactgg aagacaactt tgggatcatt gtggatacca aggcatactc tgagggttat
301 agtctgccca tttcacaggc cgacgagatg gaacggtacg tgcgcgagaa ctcaaataga
361 gatgaggaag tcaaccctaa caagtggtgg gagaacttct ctgaggaagt gaagaaatac
421 tacttcgtct ttatcagcgg gtccttcaag ggtaaatttg aggaacagct caggagactg
481 agcatgacta ccggcgtgaa tggcagcgcc gtcaacgtgg tcaatctgct cctgggcgct
541 gaaaagattc ggagcggaga gatgaccatc gaagagctgg agagggcaat gtttaataat
601 agcgagttta tcctgaaata cggtggcggt ggatccgata aaaagtattc tattggttta
661 gccatcggca ctaattccgt tggatgggct gtcataaccg atgaatacaa agtaccttca
721 aagaaattta aggtgttggg gaacacagac cgtcattcga ttaaaaagaa tcttatcggt
781 gccctcctat tcgatagtgg cgaaacggca gaggcgactc gcctgaaacg aaccgctcgg
841 agaaggtata cacgtcgcaa gaaccgaata tgttacttac aagaaatttt tagcaatgag
901 atggccaaag ttgacgattc tttctttcac cgtttggaag agtccttcct tgtcgaagag
961 gacaagaaac atgaacggca ccccatcttt ggaaacatag tagatgaggt ggcatatcat
1021 gaaaagtacc caacgattta tcacctcaga aaaaagctag ttgactcaac tgataaagcg
1081 gacctgaggt taatctactt ggctcttgcc catatgataa agttccgtgg gcactttctc
1141 attgagggtg atctaaatcc ggacaactcg gatgtcgaca aactgttcat ccagttagta
1201 caaacctata atcagttgtt tgaagagaac cctataaatg caagtggcgt ggatgcgaag
1261 gctattctta gcgcccgcct ctctaaatcc cgacggctag aaaacctgat cgcacaatta
1321 cccggagaga agaaaaatgg gttgttcggt aaccttatag cgctctcact aggcctgaca
1381 ccaaatttta agtcgaactt cgacttagct gaagatgcca aattgcagct tagtaaggac
1441 acgtacgatg acgatctcga caatctactg gcacaaattg gagatcagta tgcggactta
1501 tttttggctg ccaaaaacct tagcgatgca atcctcctat ctgacatact gagagttaat
1561 actgagatta ccaaggcgcc gttatccgct tcaatgatca aaaggtacga tgaacatcac
1621 caagacttga cacttctcaa ggccctagtc cgtcagcaac tgcctgagaa atataaggaa
1681 atattctttg atcagtcgaa aaacgggtac gcaggttata ttgacggcgg agcgagtcaa
1741 gaggaattct acaagtttat caaacccata ttagagaaga tggatgggac ggaagagttg
1801 cttgtaaaac tcaatcgcga agatctactg cgaaagcagc ggactttcga caacggtagc
1861 attccacatc aaatccactt aggcgaattg catgctatac ttagaaggca ggaggatttt
1921 tatccgttcc tcaaagacaa tcgtgaaaag attgagaaaa tcctaacctt tcgcatacct
1981 tactatgtgg gacccctggc ccgagggaac tctcggttcg catggatgac aagaaagtcc
2041 gaagaaacga ttactccatg gaattttgag gaagttgtcg ataaaggtgc gtcagctcaa
2101 tcgttcatcg agaggatgac caactttgac aagaatttac cgaacgaaaa agtattgcct
2161 aagcacagtt tactttacga gtatttcaca gtgtacaatg aactcacgaa agttaagtat
2221 gtcactgagg gcatgcgtaa acccgccttt ctaagcggag aacagaagaa agcaatagta
2281 gatctgttat tcaagaccaa ccgcaaagtg acagttaagc aattgaaaga ggactacttt
2341 aagaaaattg aatgcttcga ttctgtcgag atctccgggg tagaagatcg atttaatgcg
2401 tcacttggta cgtatcatga cctcctaaag ataattaaag ataaggactt cctggataac
2461 gaagagaatg aagatatctt agaagatata gtgttgactc ttaccctctt tgaagatcgg
2521 gaaatgattg aggaaagact aaaaacatac gctcacctgt tcgacgataa ggttatgaaa
2581 cagttaaaga ggcgtcgcta tacgggctgg ggacgattgt cgcggaaact tatcaacggg
2641 ataagagaca agcaaagtgg taaaactatt ctcgattttc taaagagcga cggcttcgcc
2701 aataggaact ttatgcagct gatccatgat gactctttaa ccttcaaaga ggatatacaa
2761 aaggcacagg tttccggaca aggggactca ttgcacgaac atattgcgaa tcttgctggt
2821 tcgccagcca tcaaaaaggg catactccag acagtcaaag tagtggatga gctagttaag
2881 gtcatgggac gtcacaaacc ggaaaacatt gtaatcgaga tggcacgcga aaatcaaacg
2941 actcagaagg ggcaaaaaaa cagtcgagag cggatgaaga gaatagaaga gggtattaaa
3001 gaactgggca gccagatctt aaaggagcat cctgtggaaa atacccaatt gcagaacgag
3061 aaactttacc tctattacct acaaaatgga agggacatgt atgttgatca ggaactggac
3121 ataaaccgtt tatctgatta cgacgtcgat gccattgtac cccaatcctt tttgaaggac
3181 gattcaatcg acaataaagt gcttacacgc tcggataaga accgagggaa aagtgacaat
3241 gttccaagcg aggaagtcgt aaagaaaatg aagaactatt ggcggcagct cctaaatgcg
3301 aaactgataa cgcaaagaaa gttcgataac ttaactaaag ctgagagggg tggcttgtct
3361 gaacttgaca aggccggatt tattaaacgt cagctcgtgg aaacccgcca aatcacaaag
3421 catgttgcac agatactaga ttcccgaatg aatacgaaat acgacgagaa cgataagctg
3481 attcgggaag tcaaagtaat cactttaaag tcaaaattgg tgtcggactt cagaaaggat
3541 tttcaattct ataaagttag ggagataaat aactaccacc atgcgcacga cgcttatctt
3601 aatgccgtcg tagggaccgc actcattaag aaatacccga agctagaaag tgagtttgtg
3661 tatggtgatt acaaagttta tgacgtccgt aagatgatcg cgaaaagcga acaggagata
3721 ggcaaggcta cagccaaata cttcttttat tctaacatta tgaatttctt taagacggaa
3781 atcactctgg caaacggaga gatacgcaaa cgacctttaa ttgaaaccaa tggggagaca
3841 ggtgaaatcg tatgggataa gggccgggac ttcgcgacgg tgagaaaagt tttgtccatg
3901 ccccaagtca acatagtaaa gaaaactgag gtgcagaccg gagggttttc aaaggaatcg
3961 attcttccaa aaaggaatag tgataagctc atcgctcgta aaaaggactg ggacccgaaa
4021 aagtacggtg gcttcgatag ccctacagtt gcctattctg tcctagtagt ggcaaaagtt
4081 gagaagggaa aatccaagaa actgaagtca gtcaaagaat tattggggat aacgattatg
4141 gagcgctcgt cttttgaaaa gaaccccatc gacttccttg aggcgaaagg ttacaaggaa
4201 gtaaaaaagg atctcataat taaactacca aagtatagtc tgtttgagtt agaaaatggc
4261 cgaaaacgga tgttggctag cgccggagag cttcaaaagg ggaacgaact cgcactaccg
4321 tctaaatacg tgaatttcct gtatttagcg tcccattacg agaagttgaa aggttcacct
4381 gaagataacg aacagaagca actttttgtt gagcagcaca aacattatct cgacgaaatc
4441 atagagcaaa tttcggaatt cagtaagaga gtcatcctag ctgatgccaa tctggacaaa
4501 gtattaagcg catacaacaa gcacagggat aaacccatac gtgagcaggc ggaaaatatt
4561 atccatttgt ttactcttac caacctcggc gctccagccg cattcaagta ttttgacaca
4621 acgatagatc gcaaacgata cacttctacc aaggaggtgc tagacgcgac actgattcac
4681 caatccatca cgggattata tgaaactcgg atagatttgt cacagcttgg gggtgacgga
4741 tcccccaaga agaagaggaa agtctga (SEQ ID NO: 22).
[0189] In some embodiments, the nucleic acid sequence encoding a dCas9-Clo051 fusion protein (embodiment 2) of the disclosure may comprise a DNA. In some embodiments, the nucleic acid sequence encoding a dCas9-Clo051 fusion protein (embodiment 2) of the disclosure may comprise an RNA.
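As a quick consistency check on the embodiment 2 listings, the length of the nucleic acid sequence (SEQ ID NO: 22) can be compared with the length of the amino acid sequence (SEQ ID NO: 21). The sketch below is illustrative only: it assumes the position labels printed with the listings are correct, counts the residues on the final line of each listing, and assumes a single three-nucleotide stop codon. The helper name `last_position` is hypothetical.

```python
# Illustrative sanity check: does the coding sequence length match the protein length?
# Inputs are read off the position labels in the listings (an assumption, not new data).

def last_position(last_label: int, residues_on_last_line: int) -> int:
    """Total sequence length, given the label of the final line and its residue count."""
    return last_label + residues_on_last_line - 1

STOP_CODON_NT = 3  # assumes one terminal stop codon

protein_len = last_position(1561, 28)  # SEQ ID NO: 21: final line starts at 1561, 28 residues
dna_len = last_position(4741, 27)      # SEQ ID NO: 22: final line starts at 4741, 27 nucleotides

coding_codons, remainder = divmod(dna_len - STOP_CODON_NT, 3)
consistent = remainder == 0 and coding_codons == protein_len

print(f"protein: {protein_len} aa, DNA: {dna_len} nt, codons before stop: {coding_codons}")
print("lengths consistent:", consistent)
```

Under these assumptions the two listings agree in length, which is a necessary (not sufficient) condition for SEQ ID NO: 22 to encode SEQ ID NO: 21.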
Nanotransposons
[0190] The disclosure provides a nanotransposon comprising: (a) a sequence encoding a transposon insert, comprising a sequence encoding a first inverted terminal repeat (ITR), a sequence encoding a second inverted terminal repeat (ITR), and an intra-ITR sequence; (b) a sequence encoding a backbone, wherein the sequence encoding the backbone comprises a sequence encoding an origin of replication having between 1 and 450 nucleotides, inclusive of the endpoints, and a sequence encoding a selectable marker having between 1 and 200 nucleotides, inclusive of the endpoints, and (c) an inter-ITR sequence. In some embodiments, the inter-ITR sequence of (c) comprises the sequence of (b). In some embodiments, the intra-ITR sequence of (a) comprises the sequence of (b).
[0191] In some embodiments of the nanotransposons of the disclosure, the sequence encoding the backbone comprises between 1 and 600 nucleotides, inclusive of the endpoints. In some embodiments, the sequence encoding the backbone consists of between 1 and 50 nucleotides, between 50 and 100 nucleotides, between 100 and 150 nucleotides, between 150 and 200 nucleotides, between 200 and 250 nucleotides, between 250 and 300 nucleotides, between 300 and 350 nucleotides, between 350 and 400 nucleotides, between 400 and 450 nucleotides, between 450 and 500 nucleotides, between 500 and 550 nucleotides, or between 550 and 600 nucleotides, each range inclusive of the endpoints.
[0192] In some embodiments of the nanotransposons of the disclosure, the inter-ITR sequence comprises between 1 and 1000 nucleotides, inclusive of the endpoints. In some embodiments, the inter-ITR sequence consists of between 1 and 50 nucleotides, between 50 and 100 nucleotides, between 100 and 150 nucleotides, between 150 and 200 nucleotides, between 200 and 250 nucleotides, between 250 and 300 nucleotides, between 300 and 350 nucleotides, between 350 and 400 nucleotides, between 400 and 450 nucleotides, between 450 and 500 nucleotides, between 500 and 550 nucleotides, between 550 and 600 nucleotides, between 600 and 650 nucleotides, between 650 and 700 nucleotides, between 700 and 750 nucleotides, between 750 and 800 nucleotides, between 800 and 850 nucleotides, between 850 and 900 nucleotides, between 900 and 950 nucleotides, or between 950 and 1000 nucleotides, each range inclusive of the endpoints.
[0193] In some embodiments of the nanotransposons of the disclosure, including the short nanotransposons (SNTs) of the disclosure, the inter-ITR sequence comprises between 1 and 200 nucleotides, inclusive of the endpoints. In some embodiments, the inter-ITR sequence consists of between 1 and 10 nucleotides, between 10 and 20 nucleotides, between 20 and 30 nucleotides, between 30 and 40 nucleotides, between 40 and 50 nucleotides, between 50 and 60 nucleotides, between 60 and 70 nucleotides, between 70 and 80 nucleotides, between 80 and 90 nucleotides, or between 90 and 100 nucleotides, each range inclusive of the endpoints.
[0194] In some embodiments of the nanotransposons of the disclosure, the selectable marker having between 1 and 200 nucleotides, inclusive of the endpoints, comprises a sequence encoding a sucrose-selectable marker. In some embodiments, the sequence encoding a sucrose-selectable marker comprises a sequence encoding an RNA-OUT sequence. In some embodiments, the sequence encoding an RNA-OUT sequence comprises or consists of 137 base pairs (bp). In some embodiments, the selectable marker having between 1 and 200 nucleotides, inclusive of the endpoints, comprises a sequence encoding a fluorescent marker. In some embodiments, the selectable marker having between 1 and 200 nucleotides, inclusive of the endpoints, comprises a sequence encoding a cell surface marker.
[0195] In some embodiments of the nanotransposons of the disclosure, the sequence encoding an origin of replication having between 1 and 450 nucleotides, inclusive of the endpoints, comprises a sequence encoding a mini origin of replication. In some embodiments, the sequence encoding an origin of replication having between 1 and 450 nucleotides, inclusive of the endpoints, comprises a sequence encoding an R6K origin of replication. In some embodiments, the R6K origin of replication comprises an R6K gamma origin of replication. In some embodiments, the R6K origin of replication comprises an R6K mini origin of replication. In some embodiments, the R6K origin of replication comprises an R6K gamma mini origin of replication. In some embodiments, the R6K gamma mini origin of replication comprises or consists of 281 base pairs (bp).
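The size limits stated above for the backbone and its components can be expressed as a simple validity check. The following is a minimal sketch, not a method from the disclosure: the `Backbone` class, the `violations` helper, and the `other_nt` field (any backbone nucleotides outside the origin and marker) are hypothetical names introduced for illustration. The limits (450, 200 and 600 nucleotides) and the example lengths (281 bp for the R6K gamma mini origin, 137 bp for RNA-OUT) are taken from the text.

```python
from dataclasses import dataclass
from typing import List

# Upper bounds stated in the text (each range is inclusive of the endpoints).
MAX_ORIGIN_NT = 450
MAX_MARKER_NT = 200
MAX_BACKBONE_NT = 600


@dataclass(frozen=True)
class Backbone:
    origin_nt: int     # length of the origin of replication
    marker_nt: int     # length of the selectable marker
    other_nt: int = 0  # hypothetical: any remaining backbone nucleotides

    @property
    def total_nt(self) -> int:
        return self.origin_nt + self.marker_nt + self.other_nt


def violations(backbone: Backbone) -> List[str]:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    if not 1 <= backbone.origin_nt <= MAX_ORIGIN_NT:
        problems.append(f"origin length {backbone.origin_nt} outside 1-{MAX_ORIGIN_NT}")
    if not 1 <= backbone.marker_nt <= MAX_MARKER_NT:
        problems.append(f"marker length {backbone.marker_nt} outside 1-{MAX_MARKER_NT}")
    if not 1 <= backbone.total_nt <= MAX_BACKBONE_NT:
        problems.append(f"backbone length {backbone.total_nt} outside 1-{MAX_BACKBONE_NT}")
    return problems


# Example: R6K gamma mini origin (281 bp) plus RNA-OUT marker (137 bp).
example = Backbone(origin_nt=281, marker_nt=137)
print(example.total_nt, violations(example))
```

With the two example components alone, the backbone stays within all three limits; adding more than 182 further nucleotides would exceed the 600-nucleotide bound.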
[0196] In some embodiments of the nanotransposons of the disclosure, the sequence encoding the backbone does not comprise a recombination site, an excision site, a ligation site or a combination thereof. In some embodiments, neither the nanotransposon nor the sequence encoding the backbone comprises a product of a recombination site, an excision site, a ligation site or a combination thereof. In some embodiments, neither the nanotransposon nor the sequence encoding the backbone is derived from a recombination site, an excision site, a ligation site or a combination thereof.
[0197] In some embodiments of the nanotransposons of the disclosure, a recombination site comprises a sequence resulting from a recombination event. In some embodiments, a recombination site comprises a sequence that is a product of a recombination event. In some embodiments, the recombination event comprises an activity of a recombinase (e.g., a recombinase site).
[0198] In some embodiments of the nanotransposons of the disclosure, the sequence encoding the backbone does not further comprise a sequence encoding foreign DNA.
[0199] In some embodiments of the nanotransposons of the disclosure, the inter-ITR sequence does not comprise a recombination site, an excision site, a ligation site or a combination thereof. In some embodiments, the inter-ITR sequence does not comprise a product of a recombination event, an excision event, a ligation event or a combination thereof. In some embodiments, the inter-ITR sequence is not derived from a recombination event, an excision event, a ligation event or a combination thereof.
[0200] In some embodiments of the nanotransposons of the disclosure, the inter-ITR sequence comprises a sequence encoding foreign DNA.
[0201] In some embodiments of the nanotransposons of the disclosure, the intra-ITR sequence comprises at least one sequence encoding an insulator and a sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell. In some embodiments, the mammalian cell is a human cell.
[0202] In some embodiments of the nanotransposons of the disclosure, the intra-ITR sequence comprises a first sequence encoding an insulator, a sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell and a second sequence encoding an insulator.
[0203] In some embodiments of the nanotransposons of the disclosure, the intra-ITR sequence comprises a first sequence encoding an insulator, a sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell, a polyadenosine (poly A) sequence and a second sequence encoding an insulator.
[0204] In some embodiments of the nanotransposons of the disclosure, the intra-ITR sequence comprises a first sequence encoding an insulator, a sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell, at least one exogenous sequence, a polyadenosine (poly A) sequence and a second sequence encoding an insulator.
[0205] In some embodiments of the nanotransposons of the disclosure, the sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell is capable of expressing an exogenous sequence in a human cell. In some embodiments, the sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell comprises a sequence encoding a constitutive promoter. In some embodiments, the sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell comprises a sequence encoding an inducible promoter. In some embodiments, the intra-ITR sequence comprises a first sequence encoding a first promoter capable of expressing an exogenous sequence in a mammalian cell and a second sequence encoding a second promoter capable of expressing an exogenous sequence in a mammalian cell, wherein the first promoter is a constitutive promoter, wherein the second promoter is an inducible promoter, and wherein the first sequence encoding the first promoter and the second sequence encoding the second promoter are oriented in opposite directions. In some embodiments, the sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell comprises a sequence encoding a cell-type or tissue-type specific promoter. In some embodiments, the sequence encoding a promoter capable of expressing an exogenous sequence in a mammalian cell comprises a sequence encoding an EF1a promoter, a sequence encoding a CMV promoter, a sequence encoding an MND promoter, a sequence encoding an SV40 promoter, a sequence encoding a PGK1 promoter, a sequence encoding a Ubc promoter, a sequence encoding a CAG promoter, a sequence encoding an H1 promoter, or a sequence encoding a U6 promoter.
[0206] In some embodiments of the nanotransposons of the disclosure, the polyadenosine (poly A) sequence is isolated or derived from a viral polyA sequence. In some embodiments, the polyadenosine (polyA) sequence is isolated or derived from an SV40 polyA sequence.
[0207] In some embodiments of the nanotransposons of the disclosure, the at least one exogenous sequence comprises an inducible proapoptotic polypeptide. In some embodiments, the inducible caspase polypeptide comprises (a) a ligand binding region, (b) a linker, and (c) a caspase polypeptide, wherein the inducible proapoptotic polypeptide does not comprise a non-human sequence. In some embodiments, the inducible caspase polypeptide comprises (a) a ligand binding region, (b) a linker, and (c) a truncated caspase 9 polypeptide, wherein the inducible proapoptotic polypeptide does not comprise a non-human sequence.
[0208] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises an inducible proapoptotic polypeptide, the ligand binding region comprises a FK506 binding protein 12 (FKBP12) polypeptide. In some embodiments, the amino acid sequence of the ligand binding region comprises a FK506 binding protein 12 (FKBP12) polypeptide. In some embodiments, the FK506 binding protein 12 (FKBP12) polypeptide comprises a modification at position 36 of the sequence. In some embodiments, the modification comprises a substitution of valine (V) for phenylalanine (F) at position 36 (F36V). In some embodiments, the FKBP12 polypeptide is encoded by an amino acid sequence comprising
GVQVETISPGDGRTFPKRGQTCVVHYTGMLEDGKKVDSSRDRNKPFKFMLGKQEVI RGWEEGVAQMSVGQRAKLTISPDYAYGATGHPGIIPPHATLVFDVELLKLE (SEQ ID NO: 23). In some embodiments, the FKBP12 polypeptide is encoded by a nucleic acid sequence comprising
GGGGTCCAGGTCGAGACTATTTCACCAGGGGATGGGCGAACATTTCCAAAAAGG
GGCCAGACTTGCGTCGTGCATTACACCGGGATGCTGGAGGACGGGAAGAAAGTG
GACAGCTCCAGGGATCGCAACAAGCCCTTCAAGTTCATGCTGGGAAAGCAGGAA
GTGATCCGAGGATGGGAGGAAGGCGTGGCACAGATGTCAGTCGGCCAGCGGGCC
AAACTGACCATTAGCCCTGACTACGCTTATGGAGCAACAGGCCACCCAGGGATC
ATTCCCCCTCATGCCACCCTGGTCTTCGATGTGGAACTGCTGAAGCTGGAG (SEQ ID NO: 24).
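The relationship between the FKBP12 nucleic acid sequence and the amino acid sequence above, including the valine at position 36, can be checked by translating the first 36 codons. The sketch below is illustrative: it assumes the standard genetic code, a reading frame starting at the first nucleotide, and 1-based residue numbering starting at the first listed residue (G). The `translate` helper and constant names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1; whitespace is ignored, '*' marks a stop."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 108 nucleotides (36 codons) of the FKBP12 coding sequence listed above.
FKBP12_NT_PREFIX = (
    "GGGGTCCAGGTCGAGACTATTTCACCAGGGGATGGGCGAACATTTCCAAAAAGG"
    "GGCCAGACTTGCGTCGTGCATTACACCGGGATGCTGGAGGACGGGAAGAAAGTG"
)
# First 36 residues of the FKBP12 amino acid sequence listed above.
FKBP12_AA_PREFIX = "GVQVETISPGDGRTFPKRGQTCVVHYTGMLEDGKKV"

protein_prefix = translate(FKBP12_NT_PREFIX)
print(protein_prefix == FKBP12_AA_PREFIX)        # do the two listings agree?
print("residue 36:", protein_prefix[36 - 1])     # 1-based position 36
```

The same helper can be applied to the other coding sequences in this section, provided the reading frame starts at the first nucleotide.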
[0209] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises an inducible proapoptotic polypeptide, the linker region is encoded by an amino acid sequence comprising GGGGS (SEQ ID NO: 27) or a nucleic acid sequence comprising GGAGGAGGAGGATCC (SEQ ID NO: 28). In some embodiments, the nucleic acid sequence encoding the linker does not comprise a restriction site.
[0210] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises an inducible proapoptotic polypeptide, the truncated caspase 9 polypeptide is encoded by an amino acid sequence that does not comprise an arginine (R) at position 87 of the sequence. In some embodiments, the truncated caspase 9 polypeptide is encoded by an amino acid sequence that does not comprise an alanine (A) at position 282 of the sequence. In some embodiments, the truncated caspase 9 polypeptide is encoded by an amino acid sequence comprising
GFGDVGALESLRGNADLAYILSMEPCGHCLIINNVNFCRESGLRTRTGSNIDCEKLRR RFSSLHFMVEVKGDLTAKKMVLALLELAQQDHGALDCCVVVILSHGCQASHLQFPG AVYGTDGCPVSVEKIVNIFNGTSCPSLGGKPKLFFIQACGGEQKDHGFEVASTSPEDE SPGSNPEPDATPFQEGLRTFDQLDAISSLPTPSDIFVSYSTFPGFVSWRDPKSGSWYVE TLDDIFEQWAHSEDLQSLLLRVANAVSVKGIYKQMPGCFNFLRKKLFFKTS (SEQ ID NO: 29). In some embodiments, the truncated caspase 9 polypeptide is encoded by a nucleic acid sequence comprising
GGATTTGGGGACGTGGGGGCCCTGGAGTCTCTGCGAGGAAATGCCGATCTGGCT
TACATCCTGAGCATGGAACCCTGCGGCCACTGTCTGATCATTAACAATGTGAACT
TCTGCAGAGAAAGCGGACTGCGAACACGGACTGGCTCCAATATTGACTGTGAGA
AGCTGCGGAGAAGGTTCTCTAGTCTGCACTTTATGGTCGAAGTGAAAGGGGATCT
GACCGCCAAGAAAATGGTGCTGGCCCTGCTGGAGCTGGCTCAGCAGGACCATGG
AGCTCTGGATTGCTGCGTGGTCGTGATCCTGTCCCACGGGTGCCAGGCTTCTCAT
CTGCAGTTCCCCGGAGCAGTGTACGGAACAGACGGCTGTCCTGTCAGCGTGGAG
AAGATCGTCAACATCTTCAACGGCACTTCTTGCCCTAGTCTGGGGGGAAAGCCAA
AACTGTTCTTTATCCAGGCCTGTGGCGGGGAACAGAAAGATCACGGCTTCGAGG
TGGCCAGCACCAGCCCTGAGGACGAATCACCAGGGAGCAACCCTGAACCAGATG
CAACTCCATTCCAGGAGGGACTGAGGACCTTTGACCAGCTGGATGCTATCTCAAG
CCTGCCCACTCCTAGTGACATTTTCGTGTCTTACAGTACCTTCCCAGGCTTTGTCT
CATGGCGCGATCCCAAGTCAGGGAGCTGGTACGTGGAGACACTGGACGACATCT
TTGAACAGTGGGCCCATTCAGAGGACCTGCAGAGCCTGCTGCTGCGAGTGGCAA
ACGCTGTCTCTGTGAAGGGCATCTACAAACAGATGCCCGGGTGCTTCAATTTTCT
GAGAAAGAAACTGTTCTTTAAGACTTCC (SEQ ID NO: 30).
[0211] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises an inducible proapoptotic polypeptide, the inducible proapoptotic polypeptide is encoded by an amino acid sequence comprising GVQVETISPGDGRTFPKRGQTCVVHYTGMLEDGKKVDSSRDRNKPFKFMLGKQEVI RGWEEGVAQMSVGQRAKLTISPDYAYGATGHPGIIPPHATLVFDVELLKLEGGGGS GFGDVGALESLRGNADLAYILSMEPCGHCLIINNVNFCRESGLRTRTGSNIDCEKLRR RFSSLHFMVEVKGDLTAKKMVLALLELAQQDHGALDCCVVVILSHGCQASHLQFPG AVYGTDGCPVSVEKIVNIFNGTSCPSLGGKPKLFFIQACGGEQKDHGFEVASTSPEDE SPGSNPEPDATPFQEGLRTFDQLDAISSLPTPSDIFVSYSTFPGFVSWRDPKSGSWYVE TLDDIFEQWAHSEDLQSLLLRVANAVSVKGIYKQMPGCFNFLRKKLFFKTS (SEQ ID NO: 31). In some embodiments, the inducible proapoptotic polypeptide is encoded by a nucleic acid sequence comprising
ggggtccaggtcgagactatttcaccaggggatgggcgaacatttccaaaaaggggccagacttgcgtcgtgcattacaccgggatg ctggaggacgggaagaaagtggacagctccagggatcgcaacaagcccttcaagttcatgctgggaaagcaggaagtgatccgag gatgggaggaaggcgtggcacagatgtcagtcggccagcgggccaaactgaccattagccctgactacgcttatggagcaacagg ccacccagggatcattccccctcatgccaccctggtcttcgatgtggaactgctgaagctggagggaggaggaggatccggatttgg ggacgtgggggccctggagtctctgcgaggaaatgccgatctggcttacatcctgagcatggaaccctgcggccactgtctgatcatt aacaatgtgaacttctgcagagaaagcggactgcgaacacggactggctccaatattgactgtgagaagctgcggagaaggttctcta gtctgcactttatggtcgaagtgaaaggggatctgaccgccaagaaaatggtgctggccctgctggagctggctcagcaggaccatg gagctctggattgctgcgtggtcgtgatcctgtcccacgggtgccaggcttctcatctgcagttccccggagcagtgtacggaacagac ggctgtcctgtcagcgtggagaagatcgtcaacatcttcaacggcacttcttgccctagtctggggggaaagccaaaactgttctttatc caggcctgtggcggggaacagaaagatcacggcttcgaggtggccagcaccagccctgaggacgaatcaccagggagcaaccct gaaccagatgcaactccattccaggagggactgaggacctttgaccagctggatgctatctcaagcctgcccactcctagtgacattttc gtgtcttacagtaccttcccaggctttgtctcatggcgcgatcccaagtcagggagctggtacgtggagacactggacgacatctttgaa cagtgggcccattcagaggacctgcagagcctgctgctgcgagtggcaaacgctgtctctgtgaagggcatctacaaacagatgccc gggtgcttcaattttctgagaaagaaactgttctttaagacttcc (SEQ ID NO: 32).
[0212] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises an inducible proapoptotic polypeptide, the exogenous sequence further comprises a sequence encoding a selectable marker. In some embodiments, the sequence encoding the selectable marker comprises a sequence encoding a detectable marker. In some embodiments, the detectable marker comprises a fluorescent marker or a cell-surface marker. In some embodiments, the sequence encoding the selectable marker comprises a sequence encoding a protein that is active in dividing cells and not active in non-dividing cells. In some embodiments, the sequence encoding the selectable marker comprises a sequence encoding a metabolic marker. In some embodiments, the sequence encoding the selectable marker comprises a sequence encoding a dihydrofolate reductase (DHFR) mutein enzyme. In some embodiments, the DHFR mutein enzyme comprises or consists of the amino acid sequence of:
1 MVGSLNCIVA VSQNMGIGKN GDFPWPPLRN ESRYFQRMTT TSSVEGKQNL
51 VIMGKKTWFS IPEKNRPLKG RINLVLSREL KEPPQGAHFL SRSLDDALKL
101 TEQPELANKV DMVWIVGGSS VYKEAMNHPG HLKLFVTRIM QDFESDTFFP
151 EIDLEKYKLL PEYPGVLSDV QEEKGIKYKF EVYEKND (SEQ ID NO: 33). In some embodiments, the DHFR mutein enzyme is encoded by the nucleic acid sequence comprising or consisting of
atggtcgggtctctgaattgtatcgtcgccgtgagtcagaacatgggcattgggaagaatggcgatttcccatggccacctctgcgcaa cgagtcccgatactttcagcggatgacaactacctcctctgtggaagggaaacagaatctggtcatcatgggaaagaaaacttggttca gcattccagagaagaaccggcccctgaaaggcagaatcaatctggtgctgtcccgagaactgaaggagccaccacagggagctca ctttctgagccggtccctggacgatgcactgaagctgacagaacagcctgagctggccaacaaagtcgatatggtgtggatcgtcgg gggaagttcagtgtataaggaggccatgaatcaccccggccatctgaaactgttcgtcacacggatcatgcaggactttgagagcgat actttctttcctgaaattgacctggagaagtacaaactgctgcccgaatatcctggcgtgctgtccgatgtccaggaagagaaaggcatc aaatacaagttcgaggtctatgagaagaatgac (SEQ ID NO: 34). In some embodiments, the amino acid sequence of the DHFR mutein enzyme further comprises a mutation at one or more of positions 80, 113, or 153. In some embodiments, the amino acid sequence of the DHFR mutein enzyme comprises one or more of a substitution of a Phenylalanine (F) or a Leucine (L) at position 80, a substitution of a Leucine (L) or a Valine (V) at position 113, and a substitution of a Valine (V) or an Aspartic Acid (D) at position 153.
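To make the position references concrete, the residues occupying positions 80, 113 and 153 of the listed DHFR mutein sequence can be read out directly. This is an illustrative sketch only: it assumes 1-based numbering that counts the initial methionine as position 1, which the text does not state explicitly, and the helper names `residue_at` and `substitute` are hypothetical. It reports what the listing contains and does not decide which substitutions are intended.

```python
# DHFR mutein amino acid sequence as listed above (SEQ ID NO: 33), whitespace removed.
DHFR_MUTEIN = (
    "MVGSLNCIVAVSQNMGIGKNGDFPWPPLRNESRYFQRMTTTSSVEGKQNL"
    "VIMGKKTWFSIPEKNRPLKGRINLVLSRELKEPPQGAHFLSRSLDDALKL"
    "TEQPELANKVDMVWIVGGSSVYKEAMNHPGHLKLFVTRIMQDFESDTFFP"
    "EIDLEKYKLLPEYPGVLSDVQEEKGIKYKFEVYEKND"
)


def residue_at(sequence: str, position: int) -> str:
    """Return the residue at a 1-based position (assumes M is position 1)."""
    if not 1 <= position <= len(sequence):
        raise IndexError(f"position {position} outside 1-{len(sequence)}")
    return sequence[position - 1]


def substitute(sequence: str, position: int, new_residue: str) -> str:
    """Return a copy of the sequence with one residue replaced at a 1-based position."""
    residue_at(sequence, position)  # validates the position
    return sequence[:position - 1] + new_residue + sequence[position:]


print("length:", len(DHFR_MUTEIN))
for pos in (80, 113, 153):
    print(pos, residue_at(DHFR_MUTEIN, pos))

# Example of building a variant, e.g. phenylalanine at position 80.
variant = substitute(DHFR_MUTEIN, 80, "F")
```

If the intended numbering omits the initial methionine, each position shifts by one residue, so the numbering convention should be confirmed before relying on this readout.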
[0213] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises an inducible proapoptotic polypeptide and/or the exogenous sequence comprises a sequence encoding a selectable marker, the exogenous sequence further comprises a sequence encoding a non-naturally occurring antigen receptor, and/or a sequence encoding a therapeutic polypeptide. In some embodiments, the non-naturally occurring antigen receptor comprises a T cell Receptor (TCR). In some embodiments, a sequence encoding the TCR comprises one or more of an insertion, a deletion, a substitution, an inversion, a transposition or a frameshift compared to a corresponding wild type sequence. In some embodiments, a sequence encoding the TCR comprises a chimeric or recombinant sequence. In some embodiments, the non-naturally occurring antigen receptor comprises a chimeric antigen receptor (CAR). In some embodiments, the CAR comprises: (a) an ectodomain comprising an antigen recognition region, (b) a transmembrane domain, and (c) an endodomain comprising at least one costimulatory domain. In some embodiments, the ectodomain of (a) of the CAR further comprises a signal peptide. In some embodiments, the ectodomain of (a) of the CAR further comprises a hinge between the antigen recognition region and the transmembrane domain. In some embodiments, the endodomain comprises a human CD3ζ endodomain. In some embodiments, the at least one costimulatory domain comprises a human 4-1BB, CD28, CD40, ICOS, MyD88, OX-40 intracellular segment, or any combination thereof. In some embodiments, the at least one costimulatory domain comprises a human CD28 and/or a 4-1BB costimulatory domain. In some embodiments, the antigen recognition region comprises one or more of a scFv, a VHH, a VH, and a Centyrin.
[0214] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises an inducible proapoptotic polypeptide and/or the exogenous sequence comprises a sequence encoding a selectable marker, the exogenous sequence further comprises a sequence encoding a transposase.
[0215] In some embodiments of the nanotransposons of the disclosure, the intra-ITR sequence comprises a sequence encoding a selectable marker, an exogenous sequence, a sequence encoding an inducible caspase polypeptide, and at least one sequence encoding a self-cleaving peptide. In some embodiments, the at least one sequence encoding a self-cleaving peptide is positioned between one or more of: (a) the sequence encoding a selectable marker and the exogenous sequence, (b) the sequence encoding a selectable marker and the inducible caspase polypeptide, and (c) the exogenous sequence and the inducible caspase polypeptide. In some embodiments, a first sequence encoding a self-cleaving peptide is positioned between the sequence encoding a selectable marker and the exogenous sequence and a second sequence encoding a self-cleaving peptide is positioned between the exogenous sequence and the inducible caspase polypeptide. In some embodiments, the at least one self-cleaving peptide comprises a T2A peptide, a GSG-T2A peptide, an E2A peptide, a GSG-E2A peptide, an F2A peptide, a GSG-F2A peptide, a P2A peptide, or a GSG-P2A peptide. In some embodiments, the T2A peptide comprises an amino acid sequence comprising
EGRGSLLTCGDVEENPGP (SEQ ID NO: 25). In some embodiments, the GSG-T2A peptide comprises an amino acid sequence comprising GSGEGRGSLLTCGDVEENPGP (SEQ ID NO: 26). In some embodiments, the E2A peptide comprises an amino acid sequence comprising QCTNYALLKLAGDVESNPGP (SEQ ID NO: 35). In some embodiments, the GSG-E2A peptide comprises an amino acid sequence comprising
GSGQCTNYALLKLAGDVESNPGP (SEQ ID NO: 36). In some embodiments, the F2A peptide comprises an amino acid sequence comprising VKQTLNFDLLKLAGDVESNPGP (SEQ ID NO: 37). In some embodiments, the GSG-F2A peptide comprises an amino acid sequence comprising GSGVKQTLNFDLLKLAGDVESNPGP (SEQ ID NO: 38). In some embodiments, the P2A peptide comprises an amino acid sequence comprising
ATNFSLLKQAGDVEENPGP (SEQ ID NO: 39). In some embodiments, the GSG-P2A peptide comprises an amino acid sequence comprising GSGATNFSLLKQAGDVEENPGP (SEQ ID NO: 40).
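The relationship between each 2A peptide and its GSG-prefixed variant can be checked mechanically. The following Python sketch is illustrative only: the dictionary and function names are assumptions introduced here, and the sequences are those recited for SEQ ID NOs: 25, 26 and 35-40, written without internal spaces.

```python
# 2A self-cleaving peptide sequences as recited in the text
# (SEQ ID NOs: 25, 35, 37 and 39). Names are illustrative only.
PEPTIDES_2A = {
    "T2A": "EGRGSLLTCGDVEENPGP",
    "E2A": "QCTNYALLKLAGDVESNPGP",
    "F2A": "VKQTLNFDLLKLAGDVESNPGP",
    "P2A": "ATNFSLLKQAGDVEENPGP",
}

# GSG-linked variants as recited in the text (SEQ ID NOs: 26, 36, 38 and 40).
GSG_PEPTIDES_2A = {
    "GSG-T2A": "GSGEGRGSLLTCGDVEENPGP",
    "GSG-E2A": "GSGQCTNYALLKLAGDVESNPGP",
    "GSG-F2A": "GSGVKQTLNFDLLKLAGDVESNPGP",
    "GSG-P2A": "GSGATNFSLLKQAGDVEENPGP",
}


def add_gsg_linker(peptide: str) -> str:
    """Prepend a Gly-Ser-Gly linker to a 2A peptide sequence."""
    return "GSG" + peptide


def all_gsg_variants_consistent() -> bool:
    """Check that every GSG variant equals 'GSG' plus its base 2A peptide."""
    return all(
        GSG_PEPTIDES_2A["GSG-" + name] == add_gsg_linker(sequence)
        for name, sequence in PEPTIDES_2A.items()
    )
```

Under these assumptions, calling `all_gsg_variants_consistent()` confirms that each GSG variant differs from its base peptide only by the three-residue N-terminal linker.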
[0216] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises one or more of an inducible proapoptotic polypeptide, a sequence encoding a selectable marker, and an exogenous sequence, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a piggyBac transposase or a piggyBac-like transposase. In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a piggyBac transposase. In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a piggyBac-like transposase. In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprise a TTAA, a TTAT or a TTAX recognition sequence. In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprise a TTAA, a TTAT or a TTAX recognition sequence and a sequence having at least 50% identity to a sequence isolated or derived from a piggyBac transposase or a piggyBac-like transposase. In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprise at least 2 nucleotides (nts), 3 nts, 4 nts, 5 nts, 6 nts, 7 nts, 8 nts, 9 nts, 10 nts, 11 nts, 12 nts, 13 nts, 14 nts, 15 nts, 16 nts, 17 nts, 18 nts, 19 nts, or 20 nts.
[0217] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises one or more of an inducible proapoptotic polypeptide, a sequence encoding a selectable marker, and an exogenous sequence, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a piggyBac transposase or a piggyBac-like transposase. In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprises the sequence of CCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG (SEQ ID NO: 41) or a sequence having at least 70% identity to the sequence of CCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG (SEQ ID NO: 41). In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprises the sequence of
CCCTAGAAAGATAATCATATTGTGACGTACGTTAAAGATAATCATGCGTAAAATTGACGCATG (SEQ ID NO: 42). In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprises the sequence of CCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG (SEQ ID NO: 41) and comprises the sequence of
CCCTAGAAAGATAATCATATTGTGACGTACGTTAAAGATAATCATGCGTAAAATTGACGCATG (SEQ ID NO: 42). In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprises the sequence of CCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG (SEQ ID NO: 41) and comprises the sequence of
CCCTAGAAAGATAATCATATTGTGACGTACGTTAAAGATAATCATGTGTAAAATTGACGCATG (SEQ ID NO: 43). In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) comprises the sequence of CCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG (SEQ ID NO: 41) and comprises the sequence of
TTAACCCTAGAAAGATAATCATATTGTGACGTACGTTAAAGATAATCATGTGTAAAATTGACGCATGTGTTTTATCGGTCTGTATATCGAGGTTTATTTATTAATTTGAATAGATATTAAGTTTTATTATATTTACACTTACATACTAATAATAAATTCAACAAACAATTTATTTATGTTTATTTATTTATTAAAAAAAACAAAAACTCAAAATTTCTTCTATAAAGTAACAAAACTTTTA (SEQ ID NO: 44).
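The identity thresholds recited for the ITR sequences (for example, at least 50% or at least 70% identity) can be illustrated with a simple position-by-position comparison. The Python sketch below is a simplification, not a method taken from the disclosure: it assumes two ungapped sequences of equal length, whereas identity is usually computed from an alignment, and the function names are hypothetical. The two constants are the sequences recited for SEQ ID NO: 42 and SEQ ID NO: 43.

```python
# Sequences recited in the text for SEQ ID NO: 42 and SEQ ID NO: 43.
SEQ_ID_42 = "CCCTAGAAAGATAATCATATTGTGACGTACGTTAAAGATAATCATGCGTAAAATTGACGCATG"
SEQ_ID_43 = "CCCTAGAAAGATAATCATATTGTGACGTACGTTAAAGATAATCATGTGTAAAATTGACGCATG"


def mismatch_positions(first: str, second: str) -> list[int]:
    """Return 0-based positions at which two equal-length sequences differ."""
    if len(first) != len(second):
        raise ValueError("ungapped comparison requires equal-length sequences")
    return [i for i, (a, b) in enumerate(zip(first, second)) if a != b]


def percent_identity(first: str, second: str) -> float:
    """Ungapped percent identity: identical positions over total length."""
    mismatches = len(mismatch_positions(first, second))
    return 100.0 * (len(first) - mismatches) / len(first)


def meets_identity_threshold(first: str, second: str, threshold: float) -> bool:
    """True if the ungapped percent identity is at least the threshold."""
    return percent_identity(first, second) >= threshold
```

For instance, `meets_identity_threshold(SEQ_ID_42, SEQ_ID_43, 70.0)` evaluates the two recited ITR variants against a 70% threshold; sequences of different lengths would first need an alignment step that this sketch omits.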
[0218] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises one or more of an inducible proapoptotic polypeptide, a sequence encoding a selectable marker, and an exogenous sequence, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a piggyBac transposase or a piggyBac-like transposase. In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) is recognized by a piggyBac transposase having an amino acid sequence of at least 20% identity to the amino acid sequence of
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTGATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RMYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPNEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 45). In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) is recognized by a piggyBac transposase having the amino acid sequence of
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTGATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RMYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPNEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 45). In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) is recognized by a piggyBac transposase having an amino acid sequence of at least 20% identity to the amino acid sequence of
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEV SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTSATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RVYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPKEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 46). In some embodiments, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) is recognized by a piggyBac transposase having the amino acid sequence of
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEV SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTSATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RVYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPKEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 46).
[0219] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises one or more of an inducible proapoptotic polypeptide, a sequence encoding a selectable marker, and an exogenous sequence, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a Sleeping Beauty transposase. In some embodiments, the Sleeping Beauty transposase is a hyperactive Sleeping Beauty transposase (SB100X).
[0220] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises one or more of an inducible proapoptotic polypeptide, a sequence encoding a selectable marker, and an exogenous sequence, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a Helitron transposase.
[0221] In some embodiments of the nanotransposons of the disclosure, including those wherein the at least one exogenous sequence comprises one or more of an inducible proapoptotic polypeptide, a sequence encoding a selectable marker, and an exogenous sequence, the sequence encoding a first inverted terminal repeat (ITR) or the sequence encoding a second inverted terminal repeat (ITR) are recognized by a Tol2 transposase.
[0222] The disclosure provides a cell comprising a nanotransposon of the disclosure. In some embodiments, the cell further comprises a transposase composition. In some embodiments, the transposase composition comprises a transposase or a sequence encoding the transposase that is capable of recognizing the first ITR or the second ITR of the nanotransposon. In some embodiments, the transposase composition comprises a
nanotransposon comprising the sequence encoding the transposase. In some embodiments, the cell comprises a first nanotransposon comprising an exogenous sequence and a second nanotransposon comprising a sequence encoding a transposase. In some embodiments, the cell is an allogeneic cell.
[0223] The disclosure provides a composition comprising the nanotransposon of the disclosure.
[0224] The disclosure provides a composition comprising the cell of the disclosure. In some embodiments, the cell comprises a nanotransposon of the disclosure. In some embodiments, the cell is not further modified. In some embodiments, the cell is allogeneic.
[0225] The disclosure provides a composition comprising the cell of the disclosure. In some embodiments, the cell comprises a nanotransposon of the disclosure. In some embodiments, the cell is not further modified. In some embodiments, the cell is autologous.
[0226] The disclosure provides a composition comprising a plurality of cells of the disclosure. In some embodiments, at least one cell of the plurality of cells comprises a nanotransposon of the disclosure. In some embodiments, a portion of the plurality of cells comprises a nanotransposon of the disclosure. In some embodiments, the portion comprises at least 1%, 2%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%,
70%, 75%, 80%, 85%, 90%, 95%, 97%, 99% or any percentage in between of the plurality of cells. In some embodiments, each cell of the plurality of cells comprises a nanotransposon of the disclosure. In some embodiments, the plurality of cells does not comprise a modified cell of the disclosure. In some embodiments, at least one cell of the plurality of cells is not further modified. In some embodiments, none of the plurality of cells is not further modified. In some embodiments, the plurality of cells is allogeneic. In some embodiments, an allogeneic plurality of cells is produced according to the methods of the disclosure. In some embodiments, the plurality of cells is autologous. In some embodiments, an autologous plurality of cells is produced according to the methods of the disclosure.
[0227] The disclosure provides a modified cell comprising: (a) a nanotransposon of the disclosure; (b) a sequence encoding an inducible proapoptotic polypeptide; and wherein the cell is a T cell, (c) a modification of an endogenous sequence encoding a T cell Receptor (TCR), wherein the modification reduces or eliminates a level of expression or activity of the TCR. In some embodiments, the cell further comprises: (d) a non-naturally occurring sequence comprising an HLA class I histocompatibility antigen, alpha chain E (HLA-E), and (e) a modification of an endogenous sequence encoding Beta-2-Microglobulin (B2M), wherein the modification reduces or eliminates a level of expression or activity of a major histocompatibility complex (MHC) class I (MHC-I).
[0228] The disclosure provides a modified cell comprising: (a) a nanotransposon of the disclosure; (b) a sequence encoding an inducible proapoptotic polypeptide; (c) a non-naturally occurring sequence comprising an HLA class I histocompatibility antigen, alpha chain E (HLA-E), and (e) a modification of an endogenous sequence encoding Beta-2-Microglobulin (B2M), wherein the modification reduces or eliminates a level of expression or activity of a major histocompatibility complex (MHC) class I (MHC-I).
[0229] In some embodiments of the modified cells of the disclosure, the non-naturally occurring sequence comprising an HLA-E further comprises a sequence encoding a B2M signal peptide. In some embodiments, the non-naturally occurring sequence comprising an HLA-E further comprises a linker, wherein the linker is positioned between the sequence encoding a B2M polypeptide and the sequence encoding the HLA-E. In some embodiments, the non-naturally occurring sequence comprising an HLA-E further comprises a sequence encoding a peptide and a sequence encoding a B2M polypeptide. In some embodiments, the non-naturally occurring sequence comprising an HLA-E further comprises a first linker positioned between the sequence encoding the B2M signal peptide and the sequence encoding the peptide, and a second linker positioned between the sequence encoding the B2M polypeptide and the sequence encoding the HLA-E.
[0230] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, the cell is a mammalian cell.
[0231] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, the cell is a human cell.
[0232] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, the cell is a stem cell.
[0233] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, the cell is a differentiated cell.
[0234] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, the cell is a somatic cell.
[0235] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, the cell is an immune cell or an immune cell precursor. In some embodiments, the immune cell is a lymphoid progenitor cell, a natural killer (NK) cell, a cytokine induced killer (CIK) cell, a T lymphocyte (T cell), a B lymphocyte (B-cell) or an antigen presenting cell (APC). In some embodiments, the immune cell is a T cell, an early memory T cell, a stem cell-like T cell, a stem memory T cell (Tscm), or a central memory T cell (Tcm). In some embodiments, the immune cell precursor is a hematopoietic stem cell (HSC). In some embodiments, the cell is an antigen presenting cell (APC).
[0236] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, the cell further comprises a gene editing composition. In some embodiments, the gene editing composition comprises a sequence encoding a DNA binding domain and a sequence encoding a nuclease protein or a nuclease domain thereof. In some embodiments, the gene editing composition comprises a sequence encoding a nuclease protein or a sequence encoding a nuclease domain thereof. In some embodiments, the sequence encoding a nuclease protein or the sequence encoding a nuclease domain thereof comprises a DNA sequence, an RNA sequence, or a combination thereof. In some embodiments, the nuclease or the nuclease domain thereof comprises one or more of a CRISPR/Cas protein, a Transcription Activator-Like Effector Nuclease (TALEN), a Zinc Finger Nuclease (ZFN), and an endonuclease. In some embodiments, the CRISPR/Cas protein comprises a nuclease-inactivated Cas (dCas) protein.
[0237] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, the cell further comprises a gene editing composition. In some embodiments, the gene editing composition comprises a sequence encoding a DNA binding domain and a sequence encoding a nuclease protein or a nuclease domain thereof. In some embodiments, the nuclease or the nuclease domain thereof comprises a nuclease-inactivated Cas (dCas) protein and an endonuclease. In some embodiments, the endonuclease comprises a Clo051 nuclease or a nuclease domain thereof. In some embodiments, the gene editing composition comprises a fusion protein. In some embodiments, the fusion protein comprises a nuclease-inactivated Cas9 (dCas9) protein and a Clo051 nuclease or a Clo051 nuclease domain. In some embodiments, the gene editing composition further comprises a guide sequence. In some embodiments, the guide sequence comprises an RNA sequence. In some embodiments, the fusion protein comprises or consists of the amino acid sequence:
MAPKKKRKVEGIKSNI SLLKDELRGQISHI SHEYLSLIDIAFDSKQNRLFEMKVLELLVNEYGFKGRHLGGSRKP DGIVYSTTLEDNFGIIVDTKAYSEGYSLPI SQADEMERYVRENSNRDEEV PNKWWENFSEEVKKYYFVFISGSF KGKFEEQLRRLSMTTGVNGSAVNWNLLLGAEKIRSGEMTIEELERAMFNNSEFILKYGGGGSDKKYSIGIAIGT NSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQEI FSN EMAKVDDSFFHRLEESFLVEEDKKHERHPI FGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIK FRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLF GNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEIT KAPLSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEI FFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEE LLVKLNREDLLRKQRTFDNGSI PHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRI PYYVGPIARGNSRFA WMTRKSEETITPWNFEEWDKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPA FLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKI IKDKDFLDNEENE DILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGF ANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKWDELVKVMGRHKPENIVIEM ARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSDYDV DAIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEWKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDK AGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAY LNAWGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPT VAYSVLWAKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRM LASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEI IEQI SEFSKRVILADANLD KVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLS
QLGGDGSPKKKRKVSS (SEQ ID NO: 47) or a nucleic acid comprising or consisting of the sequence:
1 atggcaccaa agaagaaaag aaaagtggag ggcatcaagt caaacatcag cctgctgaaa
61 gacgaactgc ggggacagat tagtcacatc agtcacgagt acctgtcact gattgatctg
121 gccttcgaca gcaagcagaa tagactgttt gagatgaaag tgctggaact gctggtcaac
181 gagtatggct tcaagggcag acatctgggc gggtctagga aacctgacgg catcgtgtac
241 agtaccacac tggaagacaa cttcggaatc attgtcgata ccaaggctta ttccgagggc
301 tactctctgc caattagtca ggcagatgag atggaaaggt acgtgcgcga aaactcaaat
361 agggacgagg aagtcaaccc caataagtgg tgggagaatt tcagcgagga agtgaagaaa
421 tactacttcg tctttatctc aggcagcttc aaagggaagt ttgaggaaca gctgcggaga
481 ctgtccatga ctaccggggt gaacggatct gctgtcaacg tggtcaatct gctgctgggc
541 gcagaaaaga tcaggtccgg ggagatgaca attgaggaac tggaacgcgc catgttcaac
601 aattctgagt ttatcctgaa gtatggaggc gggggaagcg ataagaaata ctccatcgga
661 ctggccattg gcaccaattc cgtgggctgg gctgtcatca cagacgagta caaggtgcca
721 agcaagaagt tcaaggtcct ggggaacacc gatcgccaca gtatcaagaa aaatctgatt
781 ggagccctgc tgttcgactc aggcgagact gctgaagcaa cccgactgaa gcggactgct
841 aggcgccgat atacccggag aaaaaatcgg atctgctacc tgcaggaaat tttcagcaac
901 gagatggcca aggtggacga tagtttcttt caccgcctgg aggaatcatt cctggtggag
961 gaagataaga aacacgagcg gcatcccatc tttggcaaca ttgtggacga agtcgcttat
1021 cacgagaagt accctactat ctatcatctg aggaagaaac tggtggactc caccgataag
1081 gcagacctgc gcctgatcta tctggccctg gctcacatga tcaagttccg ggggcatttt
1141 ctgatcgagg gagatctgaa ccctgacaat tctgatgtgg acaagctgtt catccagctg
1201 gtccagacat acaatcagct gtttgaggaa aacccaatta atgcctcagg cgtggacgca
1261 aaggccatcc tgagcgccag actgtccaaa tctaggcgcc tggaaaacct gatcgctcag
1321 ctgccaggag agaagaaaaa cggcctgttt gggaatctga ttgcactgtc cctgggcctg
1381 acacccaact tcaagtctaa ttttgatctg gccgaggacg ctaagctgca gctgtccaaa
1441 gacacttatg acgatgacct ggataacctg ctggctcaga tcggcgatca gtacgcagac
1501 ctgttcctgg ccgctaagaa tctgagtgac gccatcctgc tgtcagatat tctgcgcgtg
1561 aacacagaga ttactaaggc cccactgagt gcttcaatga tcaaaagata tgacgagcac
1621 catcaggatc tgaccctgct gaaggctctg gtgaggcagc agctgcccga gaaatacaag
1681 gaaatcttct ttgatcagag caagaatgga tacgccggct atattgacgg cggggcttcc
1741 caggaggagt tctacaagtt catcaagccc attctggaaa agatggacgg caccgaggaa
1801 ctgctggtga agctgaatcg ggaggacctg ctgagaaaac agaggacatt tgataacgga
1861 agcatccctc accagattca tctgggcgaa ctgcacgcca tcctgcgacg gcaggaggac
1921 ttctacccat ttctgaagga taaccgcgag aaaatcgaaa agatcctgac cttcagaatc
1981 ccctactatg tggggcctct ggcacgggga aatagtagat ttgcctggat gacaagaaag
2041 tcagaggaaa ctatcacccc ctggaacttc gaggaagtgg tcgataaagg cgctagcgca
2101 cagtccttca ttgaaaggat gacaaatttt gacaagaacc tgccaaatga gaaggtgctg
2161 cccaaacaca gcctgctgta cgaatatttc acagtgtata acgagctgac taaagtgaag
2221 tacgtcaccg aagggatgcg caagcccgca ttcctgtccg gagagcagaa gaaagccatc
2281 gtggacctgc tgtttaagac aaatcggaaa gtgactgtca aacagctgaa ggaagactat
2341 ttcaagaaaa ttgagtgttt cgattcagtg gaaatcagcg gcgtcgagga caggtttaac
2401 gcctccctgg ggacctacca cgatctgctg aagatcatca aggataagga cttcctggac
2461 aacgaggaaa atgaggacat cctggaggac attgtgctga cactgactct gtttgaggat
2521 cgcgaaatga tcgaggaacg actgaagact tatgcccatc tgttcgatga caaagtgatg
2581 aagcagctga aaagaaggcg ctacaccgga tggggacgcc tgagccgaaa actgatcaat
2641 gggattagag acaagcagag cggaaaaact atcctggact ttctgaagtc cgatggcttc
2701 gccaacagga acttcatgca gctgattcac gatgactctc tgaccttcaa ggaggacatc
2761 cagaaagcac aggtgtctgg ccagggggac agtctgcacg agcatatcgc aaacctggcc
2821 ggcagccccg ccatcaagaa agggattctg cagaccgtga aggtggtgga cgaactggtc
2881 aaggtcatgg gacgacacaa acctgagaac atcgtgattg agatggcccg cgaaaatcag
2941 acaactcaga agggccagaa aaacagtcga gaacggatga agagaatcga ggaaggcatc
3001 aaggagctgg ggtcacagat cctgaaggag catcctgtgg aaaacactca gctgcagaat
3061 gagaaactgt atctgtacta tctgcagaat ggacgggata tgtacgtgga ccaggagctg
3121 gatattaaca gactgagtga ttatgacgtg gatgccatcg tccctcagag cttcctgaag
3181 gatgactcca ttgacaacaa ggtgctgacc aggtccgaca agaaccgcgg caaatcagat
3241 aatgtgccaa gcgaggaagt ggtcaagaaa atgaagaact actggaggca gctgctgaat
3301 gccaagctga tcacacagcg gaaatttgat aacctgacta aggcagaaag aggaggcctg
3361 tctgagctgg acaaggccgg cttcatcaag cggcagctgg tggagacaag acagatcact
3421 aagcacgtcg ctcagattct ggatagcaga atgaacacaa agtacgatga aaacgacaag
3481 ctgatcaggg aggtgaaagt cattactctg aaatccaagc tggtgtctga ctttagaaag
3541 gatttccagt tttataaagt cagggagatc aacaactacc accatgctca tgacgcatac
3601 ctgaacgcag tggtcgggac cgccctgatt aagaaatacc ccaagctgga gtccgagttc
3661 gtgtacggag actataaagt gtacgatgtc cggaagatga tcgccaaatc tgagcaggaa
3721 attggcaagg ccaccgctaa gtatttcttt tacagtaaca tcatgaattt ctttaagacc
3781 gaaatcacac tggcaaatgg ggagatcaga aaaaggcctc tgattgagac caacggggag
3841 acaggagaaa tcgtgtggga caagggaagg gattttgcta ccgtgcgcaa agtcctgtcc
3901 atgccccaag tgaatattgt caagaaaact gaagtgcaga ccgggggatt ctctaaggag
3961 agtattctgc ctaagcgaaa ctctgataaa ctgatcgccc ggaagaaaga ctgggacccc
4021 aagaagtatg gcgggttcga ctctccaaca gtggcttaca gtgtcctggt ggtcgcaaag
4081 gtggaaaagg ggaagtccaa gaaactgaag tctgtcaaag agctgctggg aatcactatt
4141 atggaacgca gctccttcga gaagaatcct atcgattttc tggaagccaa gggctataaa
4201 gaggtgaaga aagacctgat cattaagctg ccaaaatact cactgtttga gctggaaaac
4261 ggacgaaagc gaatgctggc aagcgccgga gaactgcaga agggcaatga gctggccctg
4321 ccctccaaat acgtgaactt cctgtatctg gctagccact acgagaaact gaaggggtcc
4381 cctgaggata acgaacagaa gcagctgttt gtggagcagc acaaacatta tctggacgag
4441 atcattgaac agatttcaga gttcagcaag agagtgatcc tggctgacgc aaatctggat
4501 aaagtcctga gcgcatacaa caagcaccga gacaaaccaa tccgggagca ggccgaaaat
4561 atcattcatc tgttcaccct gacaaacctg ggcgcccctg cagccttcaa gtattttgac
4621 accacaatcg atcggaagag atacacttct accaaagagg tgctggatgc taccctgatc
4681 caccagagta ttaccggcct gtatgagaca cgcatcgacc tgtcacagct gggaggcgat
4741 gggagcccca agaaaaagcg gaaggtgtct agttaa (SEQ ID NO: 48). In some embodiments, the fusion protein comprises or consists of the amino acid sequence:
1 MPKKKRKVEG IKSNISLLKD ELRGQISHIS HEYLSLIDLA FDSKQNRLFE MKVLELLVNE
61 YGFKGRHLGG SRKPDGIVYS TTLEDNFGII VDTKAYSEGY SLPISQADEM ERYVRENSNR
121 DEEVNPNKWW ENFSEEVKKY YFVFISGSFK GKFEEQLRRL SMTTGVNGSA VNVVNLLLGA
181 EKIRSGEMTI EELERAMFNN SEFILKYGGG GSDKKYSIGL AIGTNSVGWA VITDEYKVPS
241 KKFKVLGNTD RHSIKKNLIG ALLFDSGETA EATRLKRTAR RRYTRRKNRI CYLQEIFSNE
301 MAKVDDSFFH RLEESFLVEE DKKHERHPIF GNIVDEVAYH EKYPTIYHLR KKLVDSTDKA
361 DLRLIYLALA HMIKFRGHFL IEGDLNPDNS DVDKLFIQLV QTYNQLFEEN PINASGVDAK
421 AILSARLSKS RRLENLIAQL PGEKKNGLFG NLIALSLGLT PNFKSNFDLA EDAKLQLSKD
481 TYDDDLDNLL AQIGDQYADL FLAAKNLSDA ILLSDILRVN TEITKAPLSA SMIKRYDEHH
541 QDLTLLKALV RQQLPEKYKE IFFDQSKNGY AGYIDGGASQ EEFYKFIKPI LEKMDGTEEL
601 LVKLNREDLL RKQRTFDNGS IPHQIHLGEL HAILRRQEDF YPFLKDNREK IEKILTFRIP
661 YYVGPLARGN SRFAWMTRKS EETITPWNFE EVVDKGASAQ SFIERMTNFD KNLPNEKVLP
721 KHSLLYEYFT VYNELTKVKY VTEGMRKPAF LSGEQKKAIV DLLFKTNRKV TVKQLKEDYF
781 KKIECFDSVE ISGVEDRFNA SLGTYHDLLK IIKDKDFLDN EENEDILEDI VLTLTLFEDR
841 EMIEERLKTY AHLFDDKVMK QLKRRRYTGW GRLSRKLING IRDKQSGKTI LDFLKSDGFA
901 NRNFMQLIHD DSLTFKEDIQ KAQVSGQGDS LHEHIANLAG SPAIKKGILQ TVKVVDELVK
961 VMGRHKPENI VIEMARENQT TQKGQKNSRE RMKRIEEGIK ELGSQILKEH PVENTQLQNE
1021 KLYLYYLQNG RDMYVDQELD INRLSDYDVD AIVPQSFLKD DSIDNKVLTR SDKNRGKSDN
1081 VPSEEVVKKM KNYWRQLLNA KLITQRKFDN LTKAERGGLS ELDKAGFIKR QLVETRQITK
1141 HVAQILDSRM NTKYDENDKL IREVKVITLK SKLVSDFRKD FQFYKVREIN NYHHAHDAYL
1201 NAVVGTALIK KYPKLESEFV YGDYKVYDVR KMIAKSEQEI GKATAKYFFY SNIMNFFKTE
1261 ITLANGEIRK RPLIETNGET GEIVWDKGRD FATVRKVLSM PQVNIVKKTE VQTGGFSKES
1321 ILPKRNSDKL IARKKDWDPK KYGGFDSPTV AYSVLVVAKV EKGKSKKLKS VKELLGITIM
1381 ERSSFEKNPI DFLEAKGYKE VKKDLIIKLP KYSLFELENG RKRMLASAGE LQKGNELALP
1441 SKYVNFLYLA SHYEKLKGSP EDNEQKQLFV EQHKHYLDEI IEQISEFSKR VILADANLDK
1501 VLSAYNKHRD KPIREQAENI IHLFTLTNLG APAAFKYFDT TIDRKRYTST KEVLDATLIH
1561 QSITGLYETR IDLSQLGGDG SPKKKRKV (SEQ ID NO: 21) or a nucleic acid comprising or consisting of the sequence:
1 atgcctaaga agaagcggaa ggtggaaggc atcaaaagca acatctccct cctgaaagac
61 gaactccggg ggcagattag ccacattagt cacgaatacc tctccctcat cgacctggct
121 ttcgatagca agcagaacag gctctttgag atgaaagtgc tggaactgct cgtcaatgag
181 tacgggttca agggtcgaca cctcggcgga tctaggaaac cagacggcat cgtgtatagt
241 accacactgg aagacaactt tgggatcatt gtggatacca aggcatactc tgagggttat
301 agtctgccca tttcacaggc cgacgagatg gaacggtacg tgcgcgagaa ctcaaataga
361 gatgaggaag tcaaccctaa caagtggtgg gagaacttct ctgaggaagt gaagaaatac
421 tacttcgtct ttatcagcgg gtccttcaag ggtaaatttg aggaacagct caggagactg
481 agcatgacta ccggcgtgaa tggcagcgcc gtcaacgtgg tcaatctgct cctgggcgct
541 gaaaagattc ggagcggaga gatgaccatc gaagagctgg agagggcaat gtttaataat
601 agcgagttta tcctgaaata cggtggcggt ggatccgata aaaagtattc tattggttta
661 gccatcggca ctaattccgt tggatgggct gtcataaccg atgaatacaa agtaccttca
721 aagaaattta aggtgttggg gaacacagac cgtcattcga ttaaaaagaa tcttatcggt
781 gccctcctat tcgatagtgg cgaaacggca gaggcgactc gcctgaaacg aaccgctcgg
841 agaaggtata cacgtcgcaa gaaccgaata tgttacttac aagaaatttt tagcaatgag
901 atggccaaag ttgacgattc tttctttcac cgtttggaag agtccttcct tgtcgaagag
961 gacaagaaac atgaacggca ccccatcttt ggaaacatag tagatgaggt ggcatatcat
1021 gaaaagtacc caacgattta tcacctcaga aaaaagctag ttgactcaac tgataaagcg
1081 gacctgaggt taatctactt ggctcttgcc catatgataa agttccgtgg gcactttctc
1141 attgagggtg atctaaatcc ggacaactcg gatgtcgaca aactgttcat ccagttagta
1201 caaacctata atcagttgtt tgaagagaac cctataaatg caagtggcgt ggatgcgaag
1261 gctattctta gcgcccgcct ctctaaatcc cgacggctag aaaacctgat cgcacaatta
1321 cccggagaga agaaaaatgg gttgttcggt aaccttatag cgctctcact aggcctgaca
1381 ccaaatttta agtcgaactt cgacttagct gaagatgcca aattgcagct tagtaaggac
1441 acgtacgatg acgatctcga caatctactg gcacaaattg gagatcagta tgcggactta
1501 tttttggctg ccaaaaacct tagcgatgca atcctcctat ctgacatact gagagttaat
1561 actgagatta ccaaggcgcc gttatccgct tcaatgatca aaaggtacga tgaacatcac
1621 caagacttga cacttctcaa ggccctagtc cgtcagcaac tgcctgagaa atataaggaa
1681 atattctttg atcagtcgaa aaacgggtac gcaggttata ttgacggcgg agcgagtcaa
1741 gaggaattct acaagtttat caaacccata ttagagaaga tggatgggac ggaagagttg
1801 cttgtaaaac tcaatcgcga agatctactg cgaaagcagc ggactttcga caacggtagc
1861 attccacatc aaatccactt aggcgaattg catgctatac ttagaaggca ggaggatttt
1921 tatccgttcc tcaaagacaa tcgtgaaaag attgagaaaa tcctaacctt tcgcatacct
1981 tactatgtgg gacccctggc ccgagggaac tctcggttcg catggatgac aagaaagtcc
2041 gaagaaacga ttactccatg gaattttgag gaagttgtcg ataaaggtgc gtcagctcaa
2101 tcgttcatcg agaggatgac caactttgac aagaatttac cgaacgaaaa agtattgcct
2161 aagcacagtt tactttacga gtatttcaca gtgtacaatg aactcacgaa agttaagtat
2221 gtcactgagg gcatgcgtaa acccgccttt ctaagcggag aacagaagaa agcaatagta
2281 gatctgttat tcaagaccaa ccgcaaagtg acagttaagc aattgaaaga ggactacttt
2341 aagaaaattg aatgcttcga ttctgtcgag atctccgggg tagaagatcg atttaatgcg
2401 tcacttggta cgtatcatga cctcctaaag ataattaaag ataaggactt cctggataac
2461 gaagagaatg aagatatctt agaagatata gtgttgactc ttaccctctt tgaagatcgg
2521 gaaatgattg aggaaagact aaaaacatac gctcacctgt tcgacgataa ggttatgaaa
2581 cagttaaaga ggcgtcgcta tacgggctgg ggacgattgt cgcggaaact tatcaacggg
2641 ataagagaca agcaaagtgg taaaactatt ctcgattttc taaagagcga cggcttcgcc
2701 aataggaact ttatgcagct gatccatgat gactctttaa ccttcaaaga ggatatacaa
2761 aaggcacagg tttccggaca aggggactca ttgcacgaac atattgcgaa tcttgctggt
2821 tcgccagcca tcaaaaaggg catactccag acagtcaaag tagtggatga gctagttaag
2881 gtcatgggac gtcacaaacc ggaaaacatt gtaatcgaga tggcacgcga aaatcaaacg
2941 actcagaagg ggcaaaaaaa cagtcgagag cggatgaaga gaatagaaga gggtattaaa
3001 gaactgggca gccagatctt aaaggagcat cctgtggaaa atacccaatt gcagaacgag
3061 aaactttacc tctattacct acaaaatgga agggacatgt atgttgatca ggaactggac
3121 ataaaccgtt tatctgatta cgacgtcgat gccattgtac cccaatcctt tttgaaggac
3181 gattcaatcg acaataaagt gcttacacgc tcggataaga accgagggaa aagtgacaat
3241 gttccaagcg aggaagtcgt aaagaaaatg aagaactatt ggcggcagct cctaaatgcg
3301 aaactgataa cgcaaagaaa gttcgataac ttaactaaag ctgagagggg tggcttgtct
3361 gaacttgaca aggccggatt tattaaacgt cagctcgtgg aaacccgcca aatcacaaag
3421 catgttgcac agatactaga ttcccgaatg aatacgaaat acgacgagaa cgataagctg
3481 attcgggaag tcaaagtaat cactttaaag tcaaaattgg tgtcggactt cagaaaggat
3541 tttcaattct ataaagttag ggagataaat aactaccacc atgcgcacga cgcttatctt
3601 aatgccgtcg tagggaccgc actcattaag aaatacccga agctagaaag tgagtttgtg
3661 tatggtgatt acaaagttta tgacgtccgt aagatgatcg cgaaaagcga acaggagata
3721 ggcaaggcta cagccaaata cttcttttat tctaacatta tgaatttctt taagacggaa
3781 atcactctgg caaacggaga gatacgcaaa cgacctttaa ttgaaaccaa tggggagaca
3841 ggtgaaatcg tatgggataa gggccgggac ttcgcgacgg tgagaaaagt tttgtccatg
3901 ccccaagtca acatagtaaa gaaaactgag gtgcagaccg gagggttttc aaaggaatcg
3961 attcttccaa aaaggaatag tgataagctc atcgctcgta aaaaggactg ggacccgaaa
4021 aagtacggtg gcttcgatag ccctacagtt gcctattctg tcctagtagt ggcaaaagtt
4081 gagaagggaa aatccaagaa actgaagtca gtcaaagaat tattggggat aacgattatg
4141 gagcgctcgt cttttgaaaa gaaccccatc gacttccttg aggcgaaagg ttacaaggaa
4201 gtaaaaaagg atctcataat taaactacca aagtatagtc tgtttgagtt agaaaatggc
4261 cgaaaacgga tgttggctag cgccggagag cttcaaaagg ggaacgaact cgcactaccg
4321 tctaaatacg tgaatttcct gtatttagcg tcccattacg agaagttgaa aggttcacct
4381 gaagataacg aacagaagca actttttgtt gagcagcaca aacattatct cgacgaaatc
4441 atagagcaaa tttcggaatt cagtaagaga gtcatcctag ctgatgccaa tctggacaaa
4501 gtattaagcg catacaacaa gcacagggat aaacccatac gtgagcaggc ggaaaatatt
4561 atccatttgt ttactcttac caacctcggc gctccagccg cattcaagta ttttgacaca
4621 acgatagatc gcaaacgata cacttctacc aaggaggtgc tagacgcgac actgattcac
4681 caatccatca cgggattata tgaaactcgg atagatttgt cacagcttgg gggtgacgga
4741 tcccccaaga agaagaggaa agtctga (SEQ ID NO: 22) .
[0238] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, a nanotransposon comprises the gene editing composition comprising a guide sequence and a sequence encoding a fusion protein comprising a sequence encoding an inactivated Cas9 (dCas9) and a sequence encoding a Clo051 nuclease or a nuclease domain thereof.
[0239] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, the cell expresses the gene editing composition transiently.
[0240] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, the cell is a T cell and the guide RNA comprises a sequence complementary to a target sequence encoding an endogenous TCR. In some embodiments, the guide RNA comprises a sequence complementary to a target sequence encoding a B2M polypeptide.
[0241] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, the guide RNA comprises a sequence complementary to a target sequence within a safe harbor site of a genomic DNA sequence.
[0242] In some embodiments of the cells, unmodified cells and modified cells of the disclosure, the Clo051 nuclease or a nuclease domain thereof induces a single or double strand break in a target sequence. In some embodiments, a donor sequence, a donor plasmid, or a donor nanotransposon intra-ITR sequence is integrated at a position of single or double strand break and/or at a position of cellular repair within a target sequence.
[0243] The disclosure provides a composition comprising a modified cell according to the disclosure. In some embodiments, the composition further comprises a pharmaceutically-acceptable carrier.
[0244] The disclosure provides a composition comprising a plurality of modified cells according to the disclosure. In some embodiments, the composition further comprises a pharmaceutically-acceptable carrier.
[0245] The disclosure provides a composition of the disclosure for use in the treatment of a disease or disorder.
[0246] The disclosure provides the use of a composition of the disclosure for the treatment of a disease or disorder.
[0247] The disclosure provides a method of treating a disease or disorder comprising administering to a subject in need thereof a therapeutically-effective amount of a composition of the disclosure. In some embodiments, the subject does not develop graft vs. host (GvH) and/or host vs. graft (HvG) following administration of the composition. In some embodiments, the administration is systemic. In some embodiments, the composition is administered by an intravenous route. In some embodiments, the composition is administered by an intravenous injection or an intravenous infusion.
[0248] The disclosure provides a method of treating a disease or disorder comprising administering to a subject in need thereof a therapeutically-effective amount of a composition of the disclosure. In some embodiments, the subject does not develop graft vs. host (GvH) and/or host vs. graft (HvG) following administration of the composition. In some embodiments, the administration is local. In some embodiments, the composition is administered by an intra-tumoral route, an intraspinal route, an intracerebroventricular route, an intraocular route or an intraosseous route. In some embodiments, the composition is administered by an intra-tumoral injection or infusion, an intraspinal injection or infusion, an intracerebroventricular injection or infusion, an intraocular injection or infusion or an intraosseous injection or infusion.
[0249] In some embodiments of the methods of treating a disease or disorder of the disclosure, the therapeutically effective dose is a single dose and wherein the allogeneic cells of the composition engraft and/or persist for a sufficient time to treat the disease or disorder. In some embodiments, the single dose is one of at least 2, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or any number of doses in between that are manufactured simultaneously.
[0250] In some embodiments of the methods of treating a disease or disorder of the disclosure, the therapeutically effective dose is a single dose and wherein the autologous cells of the composition engraft and/or persist for a sufficient time to treat the disease or disorder. In some embodiments, the single dose is one of at least 2, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or any number of doses in between that are manufactured simultaneously.
Transposition Systems
[0251] Exemplary transposon/transposase systems of the disclosure include, but are not limited to, piggyBac transposons and transposases, piggyBac-like transposons and transposases, Sleeping Beauty transposons and transposases, Helraiser transposons and transposases and Tol2 transposons and transposases.
[0252] The piggyBac transposase recognizes transposon-specific inverted terminal repeat sequences (ITRs) on the ends of the transposon, and moves the contents between the ITRs into TTAA chromosomal sites. The piggyBac transposon system has no payload limit for the genes of interest that can be included between the ITRs. In some embodiments, and, in particular, those embodiments wherein the transposon is a piggyBac transposon, the transposase is a piggyBac or a Super piggyBac (SPB) transposase. In some embodiments, and, in particular, those embodiments wherein the transposase is a Super piggyBac (SPB) transposase, the sequence encoding the transposase is an mRNA sequence.
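As a rough illustration of the TTAA target-site preference described above, the following Python sketch lists the positions of a target motif in a DNA string. The function name `find_target_sites` and the example sequence are hypothetical and are not part of the disclosure. The sketch scans a single strand only, which is sufficient here because TTAA reads the same on both strands, and it makes no prediction about which sites a transposase would actually use.

```python
def find_target_sites(dna, motif="TTAA"):
    """Return 1-based start positions of every (possibly overlapping) motif hit.

    Hypothetical helper for illustration only; not taken from the disclosure.
    """
    dna = dna.upper()
    motif = motif.upper()
    positions = []
    start = dna.find(motif)
    while start != -1:
        positions.append(start + 1)          # convert 0-based index to 1-based
        start = dna.find(motif, start + 1)   # step by one to allow overlaps
    return positions


# Made-up example sequence containing three TTAA sites.
example = "GGTTAACCTTAATTAA"
sites = find_target_sites(example)
print(sites)
```

Passing `motif="TA"` to the same helper would list candidate TA dinucleotides, the target site named for the Sleeping Beauty system elsewhere in this section.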
[0253] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac (PB) transposase enzyme. The piggyBac (PB) transposase enzyme may comprise or consist of an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTGATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RMYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPNEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 45).
[0254] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at one or more of positions 30, 165, 282, or 538 of the sequence:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTGATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RMYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPNEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 45).
[0255] In some embodiments, the transposase enzyme is a piggyBac (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at two or more of positions 30, 165, 282, or 538 of the sequence of SEQ ID NO: 45. In some embodiments, the transposase enzyme is a piggyBac (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at three or more of positions 30, 165, 282, or 538 of the sequence of SEQ ID NO: 45. In some embodiments, the transposase enzyme is a piggyBac (PB) transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at each of the following positions 30, 165, 282, and 538 of the sequence of SEQ ID NO: 45. In some embodiments, the amino acid substitution at position 30 of the sequence of SEQ ID NO: 45 is a substitution of a valine (V) for an isoleucine (I). In some embodiments, the amino acid substitution at position 165 of the sequence of SEQ ID NO: 45 is a substitution of a serine (S) for a glycine (G). In some embodiments, the amino acid substitution at position 282 of the sequence of SEQ ID NO: 45 is a substitution of a valine (V) for a methionine (M). In some embodiments, the amino acid substitution at position 538 of the sequence of SEQ ID NO: 45 is a substitution of a lysine (K) for an asparagine (N).
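The position-based substitutions described above can be sketched in Python as follows. This is a minimal illustration only: the helper `apply_substitutions` and the tuple format are hypothetical, and the example uses just the first 60 residues of SEQ ID NO: 45 (with the formatting spaces removed) together with the position 30 isoleucine-to-valine substitution. Positions are 1-based, matching the numbering in the sequence listing, and the helper refuses to apply a substitution if the residue found at that position is not the expected one.

```python
def apply_substitutions(sequence, substitutions):
    """Apply (position, original, replacement) substitutions to a protein string.

    Positions are 1-based. Hypothetical helper for illustration only.
    """
    residues = list(sequence)
    for position, original, replacement in substitutions:
        found = residues[position - 1]
        if found != original:
            raise ValueError(
                f"expected {original} at position {position}, found {found}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Residues 1-60 of SEQ ID NO: 45 as printed in the listing, spaces removed.
PB_RESIDUES_1_TO_60 = (
    "MGSSLDDEHILSALLQSDDELVGEDSDSEISDHVSEDDVQSDTEEAFIDEVHEVQPTSSG"
)

# Valine (V) substituted for isoleucine (I) at position 30.
mutant = apply_substitutions(PB_RESIDUES_1_TO_60, [(30, "I", "V")])
print(mutant[29])
```

The remaining substitutions (positions 165, 282 and 538) fall outside this 60-residue fragment, so applying them would require the full-length sequence.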
[0256] In certain embodiments of the methods of the disclosure, the transposase enzyme is a Super piggyBac (SPB) transposase enzyme. In some embodiments, the Super piggyBac (SPB) transposase enzymes of the disclosure may comprise or consist of the amino acid sequence of the sequence of SEQ ID NO: 45 wherein the amino acid substitution at position 30 is a substitution of a valine (V) for an isoleucine (I), the amino acid substitution at position 165 is a substitution of a serine (S) for a glycine (G), the amino acid substitution at position 282 is a substitution of a valine (V) for a methionine (M), and the amino acid substitution at position 538 is a substitution of a lysine (K) for an asparagine (N). In some embodiments, the Super piggyBac (SPB) transposase enzyme may comprise or consist of an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEV SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTSATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RVYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPKEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 46).
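To illustrate what a percent-identity threshold means in practice, the Python sketch below compares two ungapped, equal-length protein strings position by position. It is a simplification offered under stated assumptions: identity between sequences of different lengths is normally computed from an alignment, which this sketch does not attempt, and the function names are hypothetical. The inputs are residues 1 to 60 of SEQ ID NO: 45 and SEQ ID NO: 46 as printed in the listings, with the formatting spaces removed.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues in two equal-length strings."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch only handles ungapped, equal-length sequences")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def differing_positions(seq_a, seq_b):
    """Return 1-based positions at which the two sequences differ."""
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1) if a != b]


# Residues 1-60 of SEQ ID NO: 45 (piggyBac) and SEQ ID NO: 46 (Super piggyBac).
PB_FRAGMENT = "MGSSLDDEHILSALLQSDDELVGEDSDSEISDHVSEDDVQSDTEEAFIDEVHEVQPTSSG"
SPB_FRAGMENT = "MGSSLDDEHILSALLQSDDELVGEDSDSEVSDHVSEDDVQSDTEEAFIDEVHEVQPTSSG"

identity = percent_identity(PB_FRAGMENT, SPB_FRAGMENT)
differences = differing_positions(PB_FRAGMENT, SPB_FRAGMENT)
print(round(identity, 2), differences)
```

Within this fragment the two sequences differ only at position 30, which matches the isoleucine-to-valine substitution described for the Super piggyBac transposase.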
[0257] In certain embodiments of the methods of the disclosure, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac or Super piggyBac transposase enzyme may further comprise an amino acid substitution at one or more of positions 3, 46, 82, 103, 119, 125, 177, 180, 185, 187, 200, 207, 209, 226, 235, 240, 241, 243, 258, 296, 298, 311, 315, 319, 327, 328, 340, 421, 436, 456, 470, 486, 503, 552, 570 and 591 of the sequence of SEQ ID NO: 45 or SEQ ID NO: 46. In some embodiments, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac or Super piggyBac transposase enzyme may further comprise an amino acid substitution at one or more of positions 46, 119, 125, 177, 180, 185, 187, 200, 207, 209, 226, 235, 240, 241, 243, 296, 298, 311, 315, 319, 327, 328, 340, 421, 436, 456, 470, 485, 503, 552 and 570. In some embodiments, the amino acid substitution at position 3 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an asparagine (N) for a serine (S). In some embodiments, the amino acid substitution at position 46 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a serine (S) for an alanine (A). In some embodiments, the amino acid substitution at position 46 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a threonine (T) for an alanine (A). In some embodiments, the amino acid substitution at position 82 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a tryptophan (W) for an isoleucine (I). In some embodiments, the amino acid substitution at position 103 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a proline (P) for a serine (S). In some embodiments, the amino acid substitution at position 119 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a proline (P) for an arginine (R). In some embodiments, the amino acid substitution at position 125 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an alanine (A) for a cysteine (C). In some embodiments, the amino acid substitution at position 125 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a cysteine (C). In some embodiments, the amino acid substitution at position 177 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a lysine (K) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 177 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a histidine (H) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 180 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 180 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an isoleucine (I) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 180 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a valine (V) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 185 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 187 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a glycine (G) for an alanine (A). In some embodiments, the amino acid substitution at position 200 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a tryptophan (W) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 207 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a proline (P) for a valine (V). In some embodiments, the amino acid substitution at position 209 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a phenylalanine (F) for a valine (V). In some embodiments, the amino acid substitution at position 226 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a phenylalanine (F) for a methionine (M). In some embodiments, the amino acid substitution at position 235 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an arginine (R) for a leucine (L). In some embodiments, the amino acid substitution at position 240 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a lysine (K) for a valine (V). In some embodiments, the amino acid substitution at position 241 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 243 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a lysine (K) for a proline (P). In some embodiments, the amino acid substitution at position 258 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a serine (S) for an asparagine (N). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a tryptophan (W) for a leucine (L). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a tyrosine (Y) for a leucine (L). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a phenylalanine (F) for a leucine (L). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an alanine (A) for a methionine (M). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a valine (V) for a methionine (M). In some embodiments, the amino acid substitution at position 311 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an isoleucine (I) for a proline (P). In some embodiments, the amino acid substitution at position 311 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a valine (V) for a proline (P). In some embodiments, the amino acid substitution at position 315 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a lysine (K) for an arginine (R). In some embodiments, the amino acid substitution at position 319 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a glycine (G) for a threonine (T). In some embodiments, the amino acid substitution at position 327 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an arginine (R) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 328 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a valine (V) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 340 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a glycine (G) for a cysteine (C). In some embodiments, the amino acid substitution at position 340 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a cysteine (C). In some embodiments, the amino acid substitution at position 421 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a histidine (H) for an aspartic acid (D). In some embodiments, the amino acid substitution at position 436 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an isoleucine (I) for a valine (V). In some embodiments, the amino acid substitution at position 456 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a tyrosine (Y) for a methionine (M). In some embodiments, the amino acid substitution at position 470 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a phenylalanine (F) for a leucine (L). In some embodiments, the amino acid substitution at position 485 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a lysine (K) for a serine (S). In some embodiments, the amino acid substitution at position 503 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 503 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an isoleucine (I) for a methionine (M). In some embodiments, the amino acid substitution at position 552 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a lysine (K) for a valine (V). In some embodiments, the amino acid substitution at position 570 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a threonine (T) for an alanine (A). In some embodiments, the amino acid substitution at position 591 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a proline (P) for a glutamine (Q). In some embodiments, the amino acid substitution at position 591 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an arginine (R) for a glutamine (Q).
[0258] In certain embodiments of the methods of the disclosure, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac transposase enzyme may comprise or the Super piggyBac transposase enzyme may further comprise an amino acid substitution at one or more of positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 45 or SEQ ID NO: 46. In certain embodiments of the methods of the disclosure, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac transposase enzyme may comprise or the Super piggyBac transposase enzyme may further comprise an amino acid substitution at two, three, four, five, six or more of positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 45 or SEQ ID NO: 46. In some embodiments, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac transposase enzyme may comprise or the Super piggyBac transposase enzyme may further comprise an amino acid substitution at positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 45 or SEQ ID NO: 46. In some embodiments, the amino acid substitution at position 103 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a proline (P) for a serine (S). In some embodiments, the amino acid substitution at position 194 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a valine (V) for a methionine (M). In some embodiments, the amino acid substitution at position 372 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an alanine (A) for an arginine (R).
In some embodiments, the amino acid substitution at position 375 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an alanine (A) for a lysine (K). In some embodiments, the amino acid substitution at position 450 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an asparagine (N) for an aspartic acid (D). In some embodiments, the amino acid substitution at position 509 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a glycine (G) for a serine (S). In some embodiments, the amino acid substitution at position 570 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a serine (S) for an asparagine (N). In some embodiments, the piggyBac transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 45. In some embodiments, including those embodiments wherein the piggyBac transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 45, the piggyBac transposase enzyme may further comprise an amino acid substitution at positions 372, 375 and 450 of the sequence of SEQ ID NO: 45 or SEQ ID NO: 46. In some embodiments, the piggyBac transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 45, a substitution of an alanine (A) for an arginine (R) at position 372 of SEQ ID NO: 45, and a substitution of an alanine (A) for a lysine (K) at position 375 of SEQ ID NO: 45. In some embodiments, the piggyBac transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 45, a substitution of an alanine (A) for an arginine (R) at position 372 of SEQ ID NO: 45, a substitution of an alanine (A) for a lysine (K) at position 375 of SEQ ID NO: 45 and a substitution of an asparagine (N) for an aspartic acid (D) at position 450 of SEQ ID NO: 45.
[0259] The Sleeping Beauty (SB) transposon is transposed into the target genome by the Sleeping Beauty transposase, which recognizes ITRs and moves the contents between the ITRs into TA chromosomal sites. In various embodiments, SB transposon-mediated gene transfer, or gene transfer using any of a number of similar transposons, may be used in the compositions and methods of the disclosure.
[0260] In some embodiments, and, in particular, those embodiments wherein the transposon is a Sleeping Beauty transposon, the transposase is a Sleeping Beauty transposase or a hyperactive Sleeping Beauty transposase (SB100X).
[0261] In certain embodiments of the methods of the disclosure, the Sleeping Beauty transposase enzyme comprises an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGKSKEISQD LRKKIVDLHK SGSSLGAISK RLKVPRSSVQ TIVRKYKHHG TTQPSYRSGR
61 RRVLSPRDER TLVRKVQINP RTTAKDLVKM LEETGTKVSI STVKRVLYRH NLKGRSARKK
121 PLLQNRHKKA RLRFATAHGD KDRTFWRNVL WSDETKIELF GHNDHRYVWR KKGEACKPKN
181 TIPTVKHGGG SIMLWGCFAA GGTGALHKID GIMRKENYVD ILKQHLKTSV RKLKLGRKWV
241 FQMDNDPKHT SKVVAKWLKD NKVKVLEWPS QSPDLNPIEN LWAELKKRVR ARRPTNLTQL
301 HQLCQEEWAK IHPTYCGKLV EGYPKRLTQV KQFKGNATKY (SEQ ID NO: 49) .
[0262] In certain embodiments of the methods of the disclosure, the hyperactive Sleeping Beauty (SB100X) transposase enzyme comprises an amino acid sequence at least 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGKSKEISQD LRKRIVDLHK SGSSLGAISK RLAVPRSSVQ TIVRKYKHHG TTQPSYRSGR
61 RRVLSPRDER TLVRKVQINP RTTAKDLVKM LEETGTKVSI STVKRVLYRH NLKGHSARKK
121 PLLQNRHKKA RLRFATAHGD KDRTFWRNVL WSDETKIELF GHNDHRYVWR KKGEACKPKN
181 TIPTVKHGGG SIMLWGCFAA GGTGALHKID GIMDAVQYVD ILKQHLKTSV RKLKLGRKWV
241 FQHDNDPKHT SKVVAKWLKD NKVKVLEWPS QSPDLNPIEN LWAELKKRVR ARRPTNLTQL
301 HQLCQEEWAK IHPNYCGKLV EGYPKRLTQV KQFKGNATKY (SEQ ID NO: 50) .
[0263] The Helraiser transposon is transposed by the Helitron transposase. Helitron transposases mobilize the Helraiser transposon, an ancient element from the bat genome that was active about 30 to 36 million years ago. An exemplary Helraiser transposon of the disclosure includes Helibat1, which comprises a nucleic acid sequence comprising:
1 TCCTATATAA TAAAAGAGAA ACATGCAAAT TGACCATCCC TCCGCTACGC TCAAGCCACG
61 CCCACCAGCC AATCAGAAGT GACTATGCAA ATTAACCCAA CAAAGATGGC AGTTAAATTT
121 GCATACGCAG GTGTCAAGCG CCCCAGGAGG CAACGGCGGC CGCGGGCTCC CAGGACCTTC
181 GCTGGCCCCG GGAGGCGAGG CCGGCCGCGC CTAGCCACAC CCGCGGGCTC CCGGGACCTT
241 CGCCAGCAGA GAGCAGAGCG GGAGAGCGGG CGGAGAGCGG GAGGTTTGGA GGACTTGGCA
301 GAGCAGGAGG CCGCTGGACA TAGAGCAGAG CGAGAGAGAG GGTGGCTTGG AGGGCGTGGC
361 TCCCTCTGTC ACCCCAGCTT CCTCATCACA GCTGTGGAAA CTGACAGCAG GGAGGAGGAA
421 GTCCCACCCC CACAGAATCA GCCAGAATCA GCCGTTGGTC AGACAGCTCT CAGCGGCCTG
481 ACAGCCAGGA CTCTCATTCA CCTGCATCTC AGACCGTGAC AGTAGAGAGG TGGGACTATG
541 TCTAAAGAAC AACTGTTGAT ACAACGTAGC TCTGCAGCCG AAAGATGCCG GCGTTATCGA
601 CAGAAAATGT CTGCAGAGCA ACGTGCGTCT GATCTTGAAA GAAGGCGGCG CCTGCAACAG
661 AATGTATCTG AAGAGCAGCT ACTGGAAAAA CGTCGCTCTG AAGCCGAAAA ACAGCGGCGT
721 CATCGACAGA AAATGTCTAA AGACCAACGT GCCTTTGAAG TTGAAAGAAG GCGGTGGCGA
781 CGACAGAATA TGTCTAGAGA ACAGTCATCA ACAAGTACTA CCAATACCGG TAGGAACTGC
841 CTTCTCAGCA AAAATGGAGT ACATGAGGAT GCAATTCTCG AACATAGTTG TGGTGGAATG
901 ACTGTTCGAT GTGAATTTTG CCTATCACTA AATTTCTCTG ATGAAAAACC ATCCGATGGG
961 AAATTTACTC GATGTTGTAG CAAAGGGAAA GTCTGTCCAA ATGATATACA TTTTCCAGAT
1021 TACCCGGCAT ATTTAAAAAG ATTAATGACA AACGAAGATT CTGACAGTAA AAATTTCATG
1081 GAAAATATTC GTTCCATAAA TAGTTCTTTT GCTTTTGCTT CCATGGGTGC AAATATTGCA
1141 TCGCCATCAG GATATGGGCC ATACTGTTTT AGAATACACG GACAAGTTTA TCACCGTACT
1201 GGAACTTTAC ATCCTTCGGA TGGTGTTTCT CGGAAGTTTG CTCAACTCTA TATTTTGGAT
1261 ACAGCCGAAG CTACAAGTAA AAGATTAGCA ATGCCAGAAA ACCAGGGCTG CTCAGAAAGA
1321 CTCATGATCA ACATCAACAA CCTCATGCAT GAAATAAATG AATTAACAAA ATCGTACAAG
1381 ATGCTACATG AGGTAGAAAA GGAAGCCCAA TCTGAAGCAG CAGCAAAAGG TATTGCTCCC
1441 ACAGAAGTAA CAATGGCGAT TAAATACGAT CGTAACAGTG ACCCAGGTAG ATATAATTCT
1501 CCCCGTGTAA CCGAGGTTGC TGTCATATTC AGAAACGAAG ATGGAGAACC TCCTTTTGAA
1561 AGGGACTTGC TCATTCATTG TAAACCAGAT CCCAATAATC CAAATGCCAC TAAAATGAAA
1621 CAAATCAGTA TCCTGTTTCC TACATTAGAT GCAATGACAT ATCCTATTCT TTTTCCACAT
1681 GGTGAAAAAG GCTGGGGAAC AGATATTGCA TTAAGACTCA GAGACAACAG TGTAATCGAC
1741 AATAATACTA GACAAAATGT AAGGACACGA GTCACACAAA TGCAGTATTA TGGATTTCAT
1801 CTCTCTGTGC GGGACACGTT CAATCCTATT TTAAATGCAG GAAAATTAAC TCAACAGTTT
1861 ATTGTGGATT CATATTCAAA AATGGAGGCC AATCGGATAA ATTTCATCAA AGCAAACCAA
1921 TCTAAGTTGA GAGTTGAAAA ATATAGTGGT TTGATGGATT ATCTCAAATC TAGATCTGAA
1981 AATGACAATG TGCCGATTGG TAAAATGATA ATACTTCCAT CATCTTTTGA GGGTAGTCCC
2041 AGAAATATGC AGCAGCGATA TCAGGATGCT ATGGCAATTG TAACGAAGTA TGGCAAGCCC
2101 GATTTATTCA TAACCATGAC ATGCAACCCC AAATGGGCAG ATATTACAAA CAATTTACAA
2161 CGCTGGCAAA AAGTTGAAAA CAGACCTGAC TTGGTAGCCA GAGTTTTTAA TATTAAGCTG
2221 AATGCTCTTT TAAATGATAT ATGTAAATTC CATTTATTTG GCAAAGTAAT AGCTAAAATT
2281 CATGTCATTG AATTTCAGAA ACGCGGACTG CCTCACGCTC ACATATTATT GATATTAGAT
2341 AGTGAGTCCA AATTACGTTC AGAAGATGAC ATTGACCGTA TAGTTAAGGC AGAAATTCCA
2401 GATGAAGACC AGTGTCCTCG ACTTTTTCAA ATTGTAAAAT CAAATATGGT ACATGGACCA
2461 TGTGGAATAC AAAATCCAAA TAGTCCATGT ATGGAAAATG GAAAATGTTC AAAGGGATAT
2521 CCAAAAGAAT TTCAAAATGC GACCATTGGA AATATTGATG GATATCCCAA ATACAAACGA
2581 AGATCTGGTA GCACCATGTC TATTGGAAAT AAAGTTGTCG ATAACACTTG GATTGTCCCT
2641 TATAACCCGT ATTTGTGCCT TAAATATAAC TGTCATATAA ATGTTGAAGT CTGTGCATCA
2701 ATTAAAAGTG TCAAATATTT ATTTAAATAC ATCTATAAAG GGCACGATTG TGCAAATATT
2761 CAAATTTCTG AAAAAAATAT TATCAATCAT GACGAAGTAC AGGACTTCAT TGACTCCAGG
2821 TATGTGAGCG CTCCTGAGGC TGTTTGGAGA CTTTTTGCAA TGCGAATGCA TGACCAATCT
2881 CATGCAATCA CAAGATTAGC TATTCATTTG CCAAATGATC AGAATTTGTA TTTTCATACC
2941 GATGATTTTG CTGAAGTTTT AGATAGGGCT AAAAGGCATA ACTCGACTTT GATGGCTTGG
3001 TTCTTATTGA ATAGAGAAGA TTCTGATGCA CGTAATTATT ATTATTGGGA GATTCCACAG
3061 CATTATGTGT TTAATAATTC TTTGTGGACA AAACGCCGAA AGGGTGGGAA TAAAGTATTA
3121 GGTAGACTGT TCACTGTGAG CTTTAGAGAA CCAGAACGAT ATTACCTTAG ACTTTTGCTT
3181 CTGCATGTAA AAGGTGCGAT AAGTTTTGAG GATCTGCGAA CTGTAGGAGG TGTAACTTAT
3241 GATACATTTC ATGAAGCTGC TAAACACCGA GGATTATTAC TTGATGACAC TATCTGGAAA
3301 GATACGATTG ACGATGCAAT CATCCTTAAT ATGCCCAAAC AACTACGGCA ACTTTTTGCA
3361 TATATATGTG TGTTTGGATG TCCTTCTGCT GCAGACAAAT TATGGGATGA GAATAAATCT
3421 CATTTTATTG AAGATTTCTG TTGGAAATTA CACCGAAGAG AAGGTGCCTG TGTGAACTGT
3481 GAAATGCATG CCCTTAACGA AATTCAGGAG GTATTCACAT TGCATGGAAT GAAATGTTCA
3541 CATTTCAAAC TTCCGGACTA TCCTTTATTA ATGAATGCAA ATACATGTGA TCAATTGTAC
3601 GAGCAACAAC AGGCAGAGGT TTTGATAAAT TCTCTGAATG ATGAACAGTT GGCAGCCTTT
3661 CAGACTATAA CTTCAGCCAT CGAAGATCAA ACTGTACACC CCAAATGCTT TTTCTTGGAT
3721 GGTCCAGGTG GTAGTGGAAA AACATATCTG TATAAAGTTT TAACACATTA TATTAGAGGT
3781 CGTGGTGGTA CTGTTTTACC CACAGCATCT ACAGGAATTG CTGCAAATTT ACTTCTTGGT
3841 GGAAGAACCT TTCATTCCCA ATATAAATTA CCAATTCCAT TAAATGAAAC TTCAATTTCT
3901 AGACTCGATA TAAAGAGTGA AGTTGCTAAA ACCATTAAAA AGGCCCAACT TCTCATTATT
3961 GATGAATGCA CCATGGCATC CAGTCATGCT ATAAACGCCA TAGATAGATT ACTAAGAGAA
4021 ATTATGAATT TGAATGTTGC ATTTGGTGGG AAAGTTCTCC TTCTCGGAGG GGATTTTCGA
4081 CAATGTCTCA GTATTGTACC ACATGCTATG CGATCGGCCA TAGTACAAAC GAGTTTAAAG
4141 TACTGTAATG TTTGGGGATG TTTCAGAAAG TTGTCTCTTA AAACAAATAT GAGATCAGAG
4201 GATTCTGCTT ATAGTGAATG GTTAGTAAAA CTTGGAGATG GCAAACTTGA TAGCAGTTTT
4261 CATTTAGGAA TGGATATTAT TGAAATCCCC CATGAAATGA TTTGTAACGG ATCTATTATT
4321 GAAGCTACCT TTGGAAATAG TATATCTATA GATAATATTA AAAATATATC TAAACGTGCA
4381 ATTCTTTGTC CAAAAAATGA GCATGTTCAA AAATTAAATG AAGAAATTTT GGATATACTT
4441 GATGGAGATT TTCACACATA TTTGAGTGAT GATTCCATTG ATTCAACAGA TGATGCTGAA
4501 AAGGAAAATT TTCCCATCGA ATTTCTTAAT AGTATTACTC CTTCGGGAAT GCCGTGTCAT
4561 AAATTAAAAT TGAAAGTGGG TGCAATCATC ATGCTATTGA GAAATCTTAA TAGTAAATGG
4621 GGTCTTTGTA ATGGTACTAG ATTTATTATC AAAAGATTAC GACCTAACAT TATCGAAGCT
4681 GAAGTATTAA CAGGATCTGC AGAGGGAGAG GTTGTTCTGA TTCCAAGAAT TGATTTGTCC
4741 CCATCTGACA CTGGCCTCCC ATTTAAATTA ATTCGAAGAC AGTTTCCCGT GATGCCAGCA
4801 TTTGCGATGA CTATTAATAA ATCACAAGGA CAAACTCTAG ACAGAGTAGG AATATTCCTA
4861 CCTGAACCCG TTTTCGCACA TGGTCAGTTA TATGTTGCTT TCTCTCGAGT TCGAAGAGCA
4921 TGTGACGTTA AAGTTAAAGT TGTAAATACT TCATCACAAG GGAAATTAGT CAAGCACTCT
4981 GAAAGTGTTT TTACTCTTAA TGTGGTATAC AGGGAGATAT TAGAATAAGT TTAATCACTT
5041 TATCAGTCAT TGTTTGCATC AATGTTGTTT TTATATCATG TTTTTGTTGT TTTTATATCA
5101 TGTCTTTGTT GTTGTTATAT CATGTTGTTA TTGTTTATTT ATTAATAAAT TTATGTATTA
5161 TTTTCATATA CATTTTACTC ATTTCCTTTC ATCTCTCACA CTTCTATTAT AGAGAAAGGG
5221 CAAATAGCAA TATTAAAATA TTTCCTCTAA TTAATTCCCT TTCAATGTGC ACGAATTTCG
5281 TGCACCGGGC CACTAG (SEQ ID NO: 51)
[0264] Unlike other transposases, the Helitron transposase does not contain an RNase-H like catalytic domain, but instead comprises a RepHel motif made up of a replication initiator domain (Rep) and a DNA helicase domain. The Rep domain is a nuclease domain of the HUH superfamily of nucleases.
[0265] An exemplary Helitron transposase of the disclosure comprises an amino acid sequence comprising:
1 MSKEQLLIQR SSAAERCRRY RQKMSAEQRA SDLERRRRLQ QNVSEEQLLE KRRSEAEKQR
61 RHRQKMSKDQ RAFEVERRRW RRQNMSREQS STSTTNTGRN CLLSKNGVHE DAILEHSCGG
121 MTVRCEFCLS LNFSDEKPSD GKFTRCCSKG KVCPNDIHFP DYPAYLKRLM TNEDSDSKNF
181 MENIRSINSS FAFASMGANI ASPSGYGPYC FRIHGQVYHR TGTLHPSDGV SRKFAQLYIL
241 DTAEATSKRL AMPENQGCSE RLMININNLM HEINELTKSY KMLHEVEKEA QSEAAAKGIA
301 PTEVTMAIKY DRNSDPGRYN SPRVTEVAVI FRNEDGEPPF ERDLLIHCKP DPNNPNATKM
361 KQISILFPTL DAMTYPILFP HGEKGWGTDI ALRLRDNSVI DNNTRQNVRT RVTQMQYYGF
421 HLSVRDTFNP ILNAGKLTQQ FIVDSYSKME ANRINFIKAN QSKLRVEKYS GLMDYLKSRS
481 ENDNVPIGKM IILPSSFEGS PRNMQQRYQD AMAIVTKYGK PDLFITMTCN PKWADITNNL
541 QRWQKVENRP DLVARVFNIK LNALLNDICK FHLFGKVIAK IHVIEFQKRG LPHAHILLIL
601 DSESKLRSED DIDRIVKAEI PDEDQCPRLF QIVKSNMVHG PCGIQNPNSP CMENGKCSKG
661 YPKEFQNATI GNIDGYPKYK RRSGSTMSIG NKWDNTWIV PYNPYLCLKY NCHINVEVCA
721 SIKSVKYLFK YIYKGHDCAN IQISEKNIIN HDEVQDFIDS RYVSAPEAW RLFAMRMHDQ
781 SHAITRLAIH LPNDQNLYFH TDDFAEVLDR AKRHNSTLMA WFLLNREDSD ARNYYYWEIP
841 QHYVFNNSLW TKRRKGGNKV LGRLFTVSFR EPERYYLRLL LLHVKGAISF EDLRTVGGVT
901 YDTFHEAAKH RGLLLDDTIW KDTIDDAIIL NMPKQLRQLF AYICVFGCPS AADKLWDENK
961 SHFIEDFCWK LHRREGACVN CEMHALNEIQ EVFTLHGMKC SHFKLPDYPL LMNANTCDQL
1021 YEQQQAEVLI NSLNDEQLAA FQTITSAIED QTVHPKCFFL DGPGGSGKTY LYKVLTHYIR
1081 GRGGTVLPTA STGIAANLLL GGRTFHSQYK LPIPLNETSI SRLDIKSEVA KTIKKAQLLI
1141 IDECTMASSH AINAIDRLLR EIMNLNVAFG GKVLLLGGDF RQCLSIVPHA MRSAIVQTSL
1201 KYCNVWGCFR KLSLKTNMRS EDSAYSEWLV KLGDGKLDSS FHLGMDIIEI PHEMICNGSI
1261 IEATFGNSIS IDNIKNISKR AILCPKNEHV QKLNEEILDI LDGDFHTYLS DDSIDSTDDA
1321 EKENFPIEFL NSITPSGMPC HKLKLKVGAI IMLLRNLNSK WGLCNGTRFI IKRLRPNIIE
1381 AEVLTGSAEG EWLIPRIDL SPSDTGLPFK LIRRQFPVMP AFAMTINKSQ GQTLDRVGIF
1441 LPEPVFAHGQ LYVAFSRVRR ACDVKVKWN TSSQGKLVKH SESVFTLNW YREILE (SEQ ID NO: 52).
[0266] In Helitron transpositions, a hairpin close to the 3’ end of the transposon functions as a terminator. However, this hairpin can be bypassed by the transposase, resulting in the transduction of flanking sequences. In addition, Helraiser transposition generates covalently closed circular intermediates. Furthermore, Helitron transpositions can lack target site duplications. In the Helraiser sequence, the transposase is flanked by left and right terminal sequences termed LTS (5’ terminal sequence) and RTS (3’ terminal sequence). These sequences terminate with a conserved 5’-TC/CTAG-3’ motif. A 19 bp palindromic sequence with the potential to form the hairpin termination structure is located 11 nucleotides upstream of the RTS and consists of the sequence GTGCACGAATTTCGTGCACCGGGCCACTAG (SEQ ID NO: 53).
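The hairpin geometry described in this paragraph can be checked against SEQ ID NO: 53 itself. The following is a minimal sketch in Python, not part of the disclosure. It assumes a perfectly base-paired 9 bp stem separated by a 1 nt loop (stem and loop sizes are assumptions inferred from the sequence, not stated in the text), and the function names are hypothetical.

```python
# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_hairpin(seq, stem=9, loop=1):
    """Return (0-based start, hairpin) for the first perfect stem-loop, or None.

    A window qualifies when its right arm is the reverse complement of its
    left arm, with `loop` unpaired bases in between (assumed geometry).
    """
    span = 2 * stem + loop
    for start in range(len(seq) - span + 1):
        left = seq[start:start + stem]
        right = seq[start + stem + loop:start + span]
        if right == reverse_complement(left):
            return start, seq[start:start + span]
    return None


SEQ_ID_NO_53 = "GTGCACGAATTTCGTGCACCGGGCCACTAG"

start, hairpin = find_hairpin(SEQ_ID_NO_53)
trailing = len(SEQ_ID_NO_53) - (start + len(hairpin))
print(hairpin, len(hairpin), trailing, SEQ_ID_NO_53.endswith("CTAG"))
```

Under these assumptions the sketch finds a 19 nt hairpin-forming segment at the start of SEQ ID NO: 53, followed by 11 nt that end in CTAG, which is consistent with the spacing given in the paragraph.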
[0267] Tol2 transposons may be isolated or derived from the genome of the medaka fish, and may be similar to transposons of the hAT family. Exemplary Tol2 transposons of the disclosure are encoded by a sequence comprising about 4.7 kilobases and contain a gene encoding the Tol2 transposase, which contains four exons. An exemplary Tol2 transposase of the disclosure comprises an amino acid sequence comprising the following:
1 MEEVCDSSAA ASSTVQNQPQ DQEHPWPYLR EFFSLSGVNK DSFKMKCVLC LPLNKEISAF
61 KSSPSNLRKH IERMHPNYLK NYSKLTAQKR KIGTSTHASS SKQLKVDSVF PVKHVSPVTV
121 NKAILRYIIQ GLHPFSTVDL PSFKELISTL QPGISVITRP TLRSKIAEAA LIMKQKVTAA
181 MSEVEWIATT TDCWTARRKS FIGVTAHWIN PGSLERHSAA LACKRLMGSH TFEVIASAMN
241 DIHSEYEIRD KWCTTTDSG SNFMKAFRVF GVENNDIETE ARRCESDDTD SEGCGEGSDG
301 VEFQDASRVL DQDDGFEFQL PKHQKCACHL LNLVSSVDAQ KALSNEHYKK LYRSVFGKCQ
361 ALWNKSSRSA LAAEAVESES RLQLLRPNQT RWNSTFMAVD RILQICKEAG EGALRNICTS
421 LEVPMFNPAE MLFLTEWANT MRPVAKVLDI LQAETNTQLG WLLPSVHQLS LKLQRLHHSL
481 RYCDPLVDAL QQGIQTRFKH MFEDPEIIAA AILLPKFRTS WTNDETIIKR GMDYIRVHLE
541 PLDHKKELAN SSSDDEDFFA SLKPTTHEAS KELDGYIACV SDTRESLLTF PAICSLSIKT
601 NTPLPASAAC ERLFSTAGLL FSPKRARLDT NNFENQLLLK LNLRFYNFE (SEQ ID NO: 54).
[0268] An exemplary Tol2 transposon of the disclosure, including inverted repeats, subterminal sequences and the Tol2 transposase, is encoded by a nucleic acid sequence comprising the following:
1 CAGAGGTGTA AAGTACTTGA GTAATTTTAC TTGATTACTG TACTTAAGTA TTATTTTTGG
61 GGATTTTTAC TTTACTTGAG TACAATTAAA AATCAATACT TTTACTTTTA CTTAATTACA
121 TTTTTTTAGA AAAAAAAGTA CTTTTTACTC CTTACAATTT TATTTACAGT CAAAAAGTAC
181 TTATTTTTTG GAGATCACTT CATTCTATTT TCCCTTGCTA TTACCAAACC AATTGAATTG
241 CGCTGATGCC CAGTTTAATT TAAATGTTAT TTATTCTGCC TATGAAAATC GTTTTCACAT
301 TATATGAAAT TGGTCAGACA TGTTCATTGG TCCTTTGGAA GTGACGTCAT GTCACATCTA
361 TTACCACAAT GCACAGCACC TTGACCTGGA AATTAGGGAA ATTATAACAG TCAATCAGTG
421 GAAGAAAATG GAGGAAGTAT GTGATTCATC AGCAGCTGCG AGCAGCACAG TCCAAAATCA
481 GCCACAGGAT CAAGAGCACC CGTGGCCGTA TCTTCGCGAA TTCTTTTCTT TAAGTGGTGT
541 AAATAAAGAT TCATTCAAGA TGAAATGTGT CCTCTGTCTC CCGCTTAATA AAGAAATATC
601 GGCCTTCAAA AGTTCGCCAT CAAACCTAAG GAAGCATATT GAGGTAAGTA CATTAAGTAT
661 TTTGTTTTAC TGATAGTTTT TTTTTTTTTT TTTTTTTTTT TTTTTGGGTG TGCATGTTTT
721 GACGTTGATG GCGCGCCTTT TATATGTGTA GTAGGCCTAT TTTCACTAAT GCATGCGATT
781 GACAATATAA GGCTCACGTA ATAAAATGCT AAAATGCATT TGTAATTGGT AACGTTAGGT
841 CCACGGGAAA TTTGGCGCCT ATTGCAGCTT TGAATAATCA TTATCATTCC GTGCTCTCAT
901 TGTGTTTGAA TTCATGCAAA ACACAAGAAA ACCAAGCGAG AAATTTTTTT CCAAACATGT
961 TGTATTGTCA AAACGGTAAC ACTTTACAAT GAGGTTGATT AGTTCATGTA TTAACTAACA
1021 TTAAATAACC ATGAGCAATA CATTTGTTAC TGTATCTGTT AATCTTTGTT AACGTTAGTT
1081 AATAGAAATA CAGATGTTCA TTGTTTGTTC ATGTTAGTTC ACAGTGCATT AACTAATGTT
1141 AACAAGATAT AAAGTATTAG TAAATGTTGA AATTAACATG TATACGTGCA GTTCATTATT
1201 AGTTCATGTT AACTAATGTA GTTAACTAAC GAACCTTATT GTAAAAGTGT TACCATCAAA
1261 ACTAATGTAA TGAAATCAAT TCACCCTGTC ATGTCAGCCT TACAGTCCTG TGTTTTTGTC
1321 AATATAATCA GAAATAAAAT TAATGTTTGA TTGTCACTAA ATGCTACTGT ATTTCTAAAA
1381 TCAACAAGTA TTTAACATTA TAAAGTGTGC AATTGGCTGC AAATGTCAGT TTTATTAAAG
1441 GGTTAGTTCA CCCAAAAATG AAAATAATGT CATTAATGAC TCGCCCTCAT GTCGTTCCAA
1501 GCCCGTAAGA CCTCCGTTCA TCTTCAGAAC ACAGTTTAAG ATATTTTAGA TTTAGTCCGA
1561 GAGCTTTCTG TGCCTCCATT GAGAATGTAT GTACGGTATA CTGTCCATGT CCAGAAAGGT
1621 AATAAAAACA TCAAAGTAGT CCATGTGACA TCAGTGGGTT AGTTAGAATT TTTTGAAGCA
1681 TCGAATACAT TTTGGTCCAA AAATAACAAA ACCTACGACT TTATTCGGCA TTGTATTCTC
1741 TTCCGGGTCT GTTGTCAATC CGCGTTCACG ACTTCGCAGT GACGCTACAA TGCTGAATAA
1801 AGTCGTAGGT TTTGTTATTT TTGGACCAAA ATGTATTTTC GATGCTTCAA ATAATTCTAC
1861 CTAACCCACT GATGTCACAT GGACTACTTT GATGTTTTTA TTACCTTTCT GGACATGGAC
1921 AGTATACCGT ACATACATTT TCAGTGGAGG GACAGAAAGC TCTCGGACTA AATCTAAAAT
1981 ATCTTAAACT GTGTTCCGAA GATGAACGGA GGTGTTACGG GCTTGGAACG ACATGAGGGT
2041 GAGTCATTAA TGACATCTTT TCATTTTTGG GTGAACTAAC CCTTTAATGC TGTAATCAGA
2101 GAGTGTATGT GTAATTGTTA CATTTATTGC ATACAATATA AATATTTATT TGTTGTTTTT
2161 ACAGAGAATG CACCCAAATT ACCTCAAAAA CTACTCTAAA TTGACAGCAC AGAAGAGAAA
2221 GATCGGGACC TCCACCCATG CTTCCAGCAG TAAGCAACTG AAAGTTGACT CAGTTTTCCC
2281 AGTCAAACAT GTGTCTCCAG TCACTGTGAA CAAAGCTATA TTAAGGTACA TCATTCAAGG
2341 ACTTCATCCT TTCAGCACTG TTGATCTGCC ATCATTTAAA GAGCTGATTA GTACACTGCA
2401 GCCTGGCATT TCTGTCATTA CAAGGCCTAC TTTACGCTCC AAGATAGCTG AAGCTGCTCT
2461 GATCATGAAA CAGAAAGTGA CTGCTGCCAT GAGTGAAGTT GAATGGATTG CAACCACAAC
2521 GGATTGTTGG ACTGCACGTA GAAAGTCATT CATTGGTGTA ACTGCTCACT GGATCAACCC
2581 TGGAAGTCTT GAAAGACATT CCGCTGCACT TGCCTGCAAA AGATTAATGG GCTCTCATAC
2641 TTTTGAGGTA CTGGCCAGTG CCATGAATGA TATCCACTCA GAGTATGAAA TACGTGACAA
2701 GGTTGTTTGC ACAACCACAG ACAGTGGTTC CAACTTTATG AAGGCTTTCA GAGTTTTTGG
2761 TGTGGAAAAC AATGATATCG AGACTGAGGC AAGAAGGTGT GAAAGTGATG ACACTGATTC
2821 TGAAGGCTGT GGTGAGGGAA GTGATGGTGT GGAATTCCAA GATGCCTCAC GAGTCCTGGA
2881 CCAAGACGAT GGCTTCGAAT TCCAGCTACC AAAACATCAA AAGTGTGCCT GTCACTTACT
2941 TAACCTAGTC TCAAGCGTTG ATGCCCAAAA AGCTCTCTCA AATGAACACT ACAAGAAACT
3001 CTACAGATCT GTCTTTGGCA AATGCCAAGC TTTATGGAAT AAAAGCAGCC GATCGGCTCT
3061 AGCAGCTGAA GCTGTTGAAT CAGAAAGCCG GCTTCAGCTT TTAAGGCCAA ACCAAACGCG
3121 GTGGAATTCA ACTTTTATGG CTGTTGACAG AATTCTTCAA ATTTGCAAAG AAGCAGGAGA
3181 AGGCGCACTT CGGAATATAT GCACCTCTCT TGAGGTTCCA ATGTAAGTGT TTTTCCCCTC
3241 TATCGATGTA AACAAATGTG GGTTGTTTTT GTTTAATACT CTTTGATTAT GCTGATTTCT
3301 CCTGTAGGTT TAATCCAGCA GAAATGCTGT TCTTGACAGA GTGGGCCAAC ACAATGCGTC
3361 CAGTTGCAAA AGTACTCGAC ATCTTGCAAG CGGAAACGAA TACACAGCTG GGGTGGCTGC
3421 TGCCTAGTGT CCATCAGTTA AGCTTGAAAC TTCAGCGACT CCACCATTCT CTCAGGTACT
3481 GTGACCCACT TGTGGATGCC CTACAACAAG GAATCCAAAC ACGATTCAAG CATATGTTTG
3541 AAGATCCTGA GATCATAGCA GCTGCCATCC TTCTCCCTAA ATTTCGGACC TCTTGGACAA
3601 ATGATGAAAC CATCATAAAA CGAGGTAAAT GAATGCAAGC AACATACACT TGACGAATTC
3661 TAATCTGGGC AACCTTTGAG CCATACCAAA ATTATTCTTT TATTTATTTA TTTTTGCACT
3721 TTTTAGGAAT GTTATATCCC ATCTTTGGCT GTGATCTCAA TATGAATATT GATGTAAAGT
3781 ATTCTTGCAG CAGGTTGTAG TTATCCCTCA GTGTTTCTTG AAACCAAACT CATATGTATC
3841 ATATGTGGTT TGGAAATGCA GTTAGATTTT ATGCTAAAAT AAGGGATTTG CATGATTTTA
3901 GATGTAGATG ACTGCACGTA AATGTAGTTA ATGACAAAAT CCATAAAATT TGTTCCCAGT
3961 CAGAAGCCCC TCAACCAAAC TTTTCTTTGT GTCTGCTCAC TGTGCTTGTA GGCATGGACT
4021 ACATCAGAGT GCATCTGGAG CCTTTGGACC ACAAGAAGGA ATTGGCCAAC AGTTCATCTG
4081 ATGATGAAGA TTTTTTCGCT TCTTTGAAAC CGACAACACA TGAAGCCAGC AAAGAGTTGG
4141 ATGGATATCT GGCCTGTGTT TCAGACACCA GGGAGTCTCT GCTCACGTTT CCTGCTATTT
4201 GCAGCCTCTC TATCAAGACT AATACACCTC TTCCCGCATC GGCTGCCTGT GAGAGGCTTT
4261 TCAGCACTGC AGGATTGCTT TTCAGCCCCA AAAGAGCTAG GCTTGACACT AACAATTTTG
4321 AGAATCAGCT TCTACTGAAG TTAAATCTGA GGTTTTACAA CTTTGAGTAG CGTGTACTGG
4381 CATTAGATTG TCTGTCTTAT AGTTTGATAA TTAAATACAA ACAGTTCTAA AGCAGGATAA
4441 AACCTTGTAT GCATTTCATT TAATGTTTTT TGAGATTAAA AGCTTAAACA AGAATCTCTA
4501 GTTTTCTTTC TTGCTTTTAC TTTTACTTCC TTAATACTCA AGTACAATTT TAATGGAGTA
4561 CTTTTTTACT TTTACTCAAG TAAGATTCTA GCCAGATACT TTTACTTTTA ATTGAGTAAA
4621 ATTTTCCCTA AGTACTTGTA CTTTCACTTG AGTAAAATTT TTGAGTACTT TTTACACCTC
4681 TG (SEQ ID NO: 55).
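Paragraph [0268] states that SEQ ID NO: 55 includes inverted repeats. The following is a minimal sketch in Python, not part of the disclosure, that compares the first 20 and last 20 bases of SEQ ID NO: 55 and counts how many terminal bases form a perfect inverted repeat. Only exact base pairing is counted, which is an assumption; the disclosure does not define the repeat boundaries, and the function names are hypothetical.

```python
# Complement table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def perfect_tir_length(left_end, right_end):
    """Count leading bases of left_end that match the reverse complement
    of right_end, stopping at the first mismatch."""
    mirrored = reverse_complement(right_end)
    length = 0
    for a, b in zip(left_end, mirrored):
        if a != b:
            break
        length += 1
    return length


# Terminal windows copied from SEQ ID NO: 55 as listed above.
LEFT_END = "CAGAGGTGTAAAGTACTTGA"   # bases 1-20
RIGHT_END = "GAGTACTTTTTACACCTCTG"  # bases 4663-4682

print(perfect_tir_length(LEFT_END, RIGHT_END))
```

With these two windows the sketch reports a 12 bp perfect terminal inverted repeat (CAGAGGTGTAAA at the 5’ end). A tolerant comparison that allows mismatches would likely extend further, but that is outside this sketch.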
[0269] Exemplary transposon/transposase systems of the disclosure include, but are not limited to, piggyBac and piggyBac-like transposons and transposases.
[0270] PiggyBac and piggyBac-like transposases recognize transposon-specific inverted terminal repeat sequences (ITRs) on the ends of the transposon, and move the contents between the ITRs into TTAA or TTAT chromosomal sites. The piggyBac or piggyBac-like transposon system has no payload limit for the genes of interest that can be included between the ITRs.
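Candidate integration sites of the kind described in paragraph [0270] can be located by scanning a sequence for the TTAA and TTAT tetranucleotides. The following is a minimal sketch in Python, not part of the disclosure. The function name and the example sequence are hypothetical, and only the strand as written is scanned.

```python
import re


def find_target_sites(dna, motifs=("TTAA", "TTAT")):
    """Return (1-based position, motif) for every occurrence of a motif.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    pattern = re.compile("(?=(" + "|".join(motifs) + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(dna.upper())]


# Hypothetical 12 nt example sequence, for illustration only.
example = "GCTTAATTATCG"
print(find_target_sites(example))
```

For the example sequence the sketch reports TTAA at position 3 and TTAT at position 7. Because TTAA is its own reverse complement while TTAT is not, a scan intended to cover both strands would also need to look for ATAA on the strand as written.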
[0271] In some embodiments, and, in particular, those embodiments wherein the transposon is a piggyBac transposon, the transposase is a piggyBac or Super piggyBac (SPB) transposase. In some embodiments, and, in particular, those embodiments wherein the transposase is a piggyBac or Super piggyBac (SPB) transposase, the sequence encoding the transposase is an mRNA sequence.
[0272] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme.
[0273] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or a piggyBac-like transposase enzyme. The piggyBac (PB) or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTGATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RMYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPNEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 45).
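The "at least X% identical" language in paragraph [0273] presupposes some way of computing percent identity. The following is a minimal sketch in Python, not part of the disclosure. It assumes the simplest case of two equal-length, ungapped sequences compared position by position; the disclosure does not specify an alignment method in this passage, and the function name is hypothetical. Residues 1-60 of SEQ ID NO: 45 and SEQ ID NO: 46 are used as inputs.

```python
def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences agree.

    Assumption: no gaps and no alignment step; a real comparison of
    sequences of different length would need an alignment first.
    """
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-length, ungapped sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Residues 1-60 of SEQ ID NO: 45 and SEQ ID NO: 46 (spaces removed).
SEQ45_1_60 = "MGSSLDDEHILSALLQSDDELVGEDSDSEISDHVSEDDVQSDTEEAFIDEVHEVQPTSSG"
SEQ46_1_60 = "MGSSLDDEHILSALLQSDDELVGEDSDSEVSDHVSEDDVQSDTEEAFIDEVHEVQPTSSG"

print(round(percent_identity(SEQ45_1_60, SEQ46_1_60), 2))
```

The two 60-residue windows differ only at position 30, so 59 of 60 positions match.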
[0274] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at one or more of positions 30, 165, 282, or 538 of the sequence:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEI SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTGATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RMYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPNEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 45).
[0275] In some embodiments, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at two or more of positions 30, 165, 282, or 538 of the sequence of SEQ ID NO: 45. In some embodiments, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at three or more of positions 30, 165, 282, or 538 of the sequence of SEQ ID NO: 45. In some embodiments, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme that comprises or consists of an amino acid sequence having an amino acid substitution at each of positions 30, 165, 282, and 538 of the sequence of SEQ ID NO: 45. In some embodiments, the amino acid substitution at position 30 of the sequence of SEQ ID NO: 45 is a substitution of a valine (V) for an isoleucine (I). In some embodiments, the amino acid substitution at position 165 of the sequence of SEQ ID NO: 45 is a substitution of a serine (S) for a glycine (G). In some embodiments, the amino acid substitution at position 282 of the sequence of SEQ ID NO: 45 is a substitution of a valine (V) for a methionine (M). In some embodiments, the amino acid substitution at position 538 of the sequence of SEQ ID NO: 45 is a substitution of a lysine (K) for an asparagine (N).
[0276] In certain embodiments of the methods of the disclosure, the transposase enzyme is a Super piggyBac (SPB) or piggyBac-like transposase enzyme.
In some embodiments, the Super piggyBac (SPB) or piggyBac-like transposase enzyme of the disclosure may comprise or consist of the amino acid sequence of the sequence of SEQ ID NO: 45 wherein the amino acid substitution at position 30 is a substitution of a valine (V) for an isoleucine (I), the amino acid substitution at position 165 is a substitution of a serine (S) for a glycine (G), the amino acid substitution at position 282 is a substitution of a valine (V) for a methionine (M), and the amino acid substitution at position 538 is a substitution of a lysine (K) for an asparagine (N). In some embodiments, the Super piggyBac (SPB) or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEV SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTSATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RVYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPKEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 46).
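The relationship between SEQ ID NO: 45 and SEQ ID NO: 46 described in paragraphs [0275] and [0276] can be expressed as a set of point substitutions applied to the parent sequence. The following is a minimal sketch in Python, not part of the disclosure. The shorthand "I30V" (original residue, 1-based position, new residue) and the function name are assumptions introduced for illustration. For brevity the demonstration uses only residues 1-60, which contain position 30.

```python
import re


def apply_substitutions(seq, substitutions):
    """Apply point substitutions written as <old><position><new>, e.g. "I30V".

    Each substitution is checked against the residue actually present at
    that position, so a numbering mistake raises an error instead of
    silently producing a wrong sequence.
    """
    residues = list(seq)
    for code in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
        if match is None:
            raise ValueError(f"unrecognized substitution: {code!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(
                f"expected {old} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# Residues 1-60 of SEQ ID NO: 45 and SEQ ID NO: 46 (spaces removed).
SEQ45_1_60 = "MGSSLDDEHILSALLQSDDELVGEDSDSEISDHVSEDDVQSDTEEAFIDEVHEVQPTSSG"
SEQ46_1_60 = "MGSSLDDEHILSALLQSDDELVGEDSDSEVSDHVSEDDVQSDTEEAFIDEVHEVQPTSSG"

print(apply_substitutions(SEQ45_1_60, ["I30V"]) == SEQ46_1_60)
```

Applied to the full-length SEQ ID NO: 45, the same function with the list I30V, G165S, M282V and N538K would be expected to yield SEQ ID NO: 46, assuming the listings above are complete.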
[0277] In certain embodiments of the methods of the disclosure, including those
embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac, Super piggyBac or piggyBac-like transposase enzyme may further comprise an amino acid substitution at one or more of positions 3, 46, 82, 103, 119, 125, 177, 180, 185, 187, 200, 207, 209, 226, 235, 240, 241, 243, 258, 296, 298, 311,
315, 319, 327, 328, 340, 421, 436, 456, 470, 486, 503, 552, 570 and 591 of the sequence of SEQ ID NO: 45 or SEQ ID NO: 46. In some embodiments, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac, Super piggyBac or piggyBac-like transposase enzyme may further comprise an amino acid substitution at one or more of positions 46, 119, 125, 177, 180, 185, 187, 200, 207, 209, 226, 235, 240, 241, 243, 296, 298, 311, 315, 319, 327, 328, 340, 421,
436, 456, 470, 485, 503, 552 and 570. In some embodiments, the amino acid substitution at position 3 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an asparagine (N) for a serine (S). In some embodiments, the amino acid substitution at position 46 of SEQ ID NO:
45 or SEQ ID NO: 46 is a substitution of a serine (S) for an alanine (A). In some
embodiments, the amino acid substitution at position 46 of SEQ ID NO: 45 or SEQ ID NO:
46 is a substitution of a threonine (T) for an alanine (A). In some embodiments, the amino acid substitution at position 82 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a tryptophan (W) for an isoleucine (I). In some embodiments, the amino acid substitution at position 103 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a proline (P) for a serine (S). In some embodiments, the amino acid substitution at position 119 of SEQ ID NO:
45 or SEQ ID NO: 46 is a substitution of a proline (P) for an arginine (R). In some embodiments, the amino acid substitution at position 125 of SEQ ID NO: 45 or SEQ ID NO:
46 is a substitution of an alanine (A) for a cysteine (C). In some embodiments, the amino acid substitution at position 125 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a cysteine (C). In some embodiments, the amino acid substitution at position 177 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a lysine (K) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 177 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a histidine (H) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 180 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 180 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an isoleucine (I) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 180 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a valine (V) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 185 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 187 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a glycine (G) for an alanine (A). In some embodiments, the amino acid substitution at position 200 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a tryptophan (W) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 207 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a proline (P) for a valine (V). In some embodiments, the amino acid substitution at position 209 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a phenylalanine (F) for a valine (V). In some embodiments, the amino acid substitution at position 226 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a phenylalanine (F) for a methionine (M).
In some embodiments, the amino acid substitution at position 235 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an arginine (R) for a leucine (L). In some embodiments, the amino acid substitution at position 240 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a lysine (K) for a valine (V). In some embodiments, the amino acid substitution at position 241 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a phenylalanine (F). In some embodiments, the amino acid substitution at position 243 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a lysine (K) for a proline (P). In some embodiments, the amino acid substitution at position 258 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a serine (S) for an asparagine (N).
In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a tryptophan (W) for a leucine (L). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a tyrosine (Y) for a leucine (L). In some embodiments, the amino acid substitution at position 296 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a phenylalanine (F) for a leucine (L). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a methionine (M). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an alanine (A) for a methionine (M). In some embodiments, the amino acid substitution at position 298 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a valine (V) for a methionine (M). In some embodiments, the amino acid substitution at position 311 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an isoleucine (I) for a proline (P). In some embodiments, the amino acid substitution at position 311 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a valine (V) for a proline (P). In some embodiments, the amino acid substitution at position 315 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a lysine (K) for an arginine (R). In some embodiments, the amino acid substitution at position 319 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a glycine (G) for a threonine (T). In some embodiments, the amino acid substitution at position 327 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an arginine (R) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 328 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a valine (V) for a tyrosine (Y). In some embodiments, the amino acid substitution at position 340 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a glycine (G) for a cysteine (C). In some embodiments, the amino acid substitution at position 340 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a cysteine (C). In some embodiments, the amino acid substitution at position 421 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a histidine (H) for an aspartic acid (D). In some embodiments, the amino acid substitution at position 436 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an isoleucine (I) for a valine (V). In some embodiments, the amino acid substitution at position 456 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a tyrosine (Y) for a methionine (M). In some embodiments, the amino acid substitution at position 470 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a phenylalanine (F) for a leucine (L). In some embodiments, the amino acid substitution at position 485 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a lysine (K) for a serine (S). In some embodiments, the amino acid substitution at position 503 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a leucine (L) for a methionine (M).
In some embodiments, the amino acid substitution at position 503 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an isoleucine (I) for a methionine (M). In some embodiments, the amino acid substitution at position 552 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a lysine (K) for a valine (V). In some embodiments, the amino acid substitution at position 570 of SEQ ID NO:
45 or SEQ ID NO: 46 is a substitution of a threonine (T) for an alanine (A). In some embodiments, the amino acid substitution at position 591 of SEQ ID NO: 45 or SEQ ID NO:
46 is a substitution of a proline (P) for a glutamine (Q). In some embodiments, the amino acid substitution at position 591 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an arginine (R) for a glutamine (Q).
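The substitution sentences in paragraph [0277] follow a fixed pattern, in which "a substitution of X for Y" means that residue Y of the reference sequence is replaced by X. The following is a minimal sketch in Python, not part of the disclosure, that condenses such sentences into conventional shorthand (original residue, position, new residue). The regular expression and the function name are assumptions written for this phrasing only; the two sample sentences are taken from the paragraph above.

```python
import re

# Matches "position N of SEQ ID NO: ... is a substitution of a(n) <new> (X)
# for a(n)/the <old> (Y)". Group order: position, new residue, old residue.
PATTERN = re.compile(
    r"position (\d+) of SEQ ID NO: \d+(?: or SEQ ID NO: \d+)? "
    r"is a substitution of (?:an?|the) [a-z ]+? \(([A-Z])\) "
    r"for (?:an?|the) [a-z ]+? \(([A-Z])\)"
)


def to_shorthand(text):
    """Return shorthand codes such as 'S3N' for each sentence that matches."""
    return [f"{old}{pos}{new}" for pos, new, old in PATTERN.findall(text)]


sample = (
    "In some embodiments, the amino acid substitution at position 3 of "
    "SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an asparagine (N) "
    "for a serine (S). In some embodiments, the amino acid substitution at "
    "position 46 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a "
    "serine (S) for an alanine (A)."
)

print(to_shorthand(sample))
```

For the two sample sentences the sketch yields S3N and A46S. Sentences that are broken across lines or missing a word in the source text would need to be normalized before they match.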
[0278] In certain embodiments of the methods of the disclosure, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac or piggyBac-like transposase enzyme may comprise, or the Super piggyBac transposase enzyme may further comprise, an amino acid substitution at one or more of positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 45 or SEQ ID NO: 46. In certain embodiments of the methods of the disclosure, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac or piggyBac-like transposase enzyme may comprise, or the Super piggyBac transposase enzyme may further comprise, an amino acid substitution at two, three, four, five, six or more of positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 45 or SEQ ID NO: 46. In some embodiments, including those embodiments wherein the transposase comprises the above-described mutations at positions 30, 165, 282 and/or 538, the piggyBac or piggyBac-like transposase enzyme may comprise, or the Super piggyBac transposase enzyme may further comprise, an amino acid substitution at positions 103, 194, 372, 375, 450, 509 and 570 of the sequence of SEQ ID NO: 45 or SEQ ID NO: 46. In some embodiments, the amino acid substitution at position 103 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a proline (P) for a serine (S). In some embodiments, the amino acid substitution at position 194 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a valine (V) for a methionine (M).
In some embodiments, the amino acid substitution at position 372 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an alanine (A) for an arginine (R). In some embodiments, the amino acid substitution at position 375 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an alanine (A) for a lysine (K). In some embodiments, the amino acid substitution at position 450 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of an asparagine (N) for an aspartic acid (D). In some embodiments, the amino acid substitution at position 509 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a glycine (G) for a serine (S). In some embodiments, the amino acid substitution at position 570 of SEQ ID NO: 45 or SEQ ID NO: 46 is a substitution of a serine (S) for an asparagine (N). In some embodiments, the piggyBac or piggyBac-like transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 45. In some embodiments, including those embodiments wherein the piggyBac or piggyBac-like transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 45, the piggyBac or piggyBac-like transposase enzyme may further comprise an amino acid substitution at positions 372, 375 and 450 of the sequence of SEQ ID NO: 45 or SEQ ID NO: 46. In some embodiments, the piggyBac or piggyBac-like transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 45, a substitution of an alanine (A) for an arginine (R) at position 372 of SEQ ID NO: 45, and a substitution of an alanine (A) for a lysine (K) at position 375 of SEQ ID NO: 45. 
In some embodiments, the piggyBac or piggyBac-like transposase enzyme may comprise a substitution of a valine (V) for a methionine (M) at position 194 of SEQ ID NO: 45, a substitution of an alanine (A) for an arginine (R) at position 372 of SEQ ID NO: 45, a substitution of an alanine (A) for a lysine (K) at position 375 of SEQ ID NO: 45 and a substitution of an asparagine (N) for an aspartic acid (D) at position 450 of SEQ ID NO: 45.
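Paragraph [0278] refers to substitutions at "one or more" and at "two, three, four, five, six or more" of seven listed positions. The following is a minimal sketch in Python, not part of the disclosure, that enumerates the position sets these phrases cover. The function name is hypothetical, and the sketch counts position sets only, not the particular residue chosen at each position.

```python
from itertools import combinations

# Positions listed in paragraph [0278].
POSITIONS = (103, 194, 372, 375, 450, 509, 570)


def position_sets(positions, minimum):
    """Yield every combination of positions containing at least `minimum` members."""
    for size in range(minimum, len(positions) + 1):
        yield from combinations(positions, size)


one_or_more = sum(1 for _ in position_sets(POSITIONS, 1))
two_or_more = sum(1 for _ in position_sets(POSITIONS, 2))
print(one_or_more, two_or_more)
```

Seven positions give 127 non-empty position sets, of which 120 contain two or more positions.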
[0279] In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from an insect. In some embodiments, the insect is Trichoplusia ni (GenBank Accession No. AAA87375; SEQ ID NO: 188), Argyrogramma agnata (GenBank Accession No. GU477713; SEQ ID NO: 190, SEQ ID NO: 189), Anopheles gambiae (GenBank Accession No. CR_312615 (SEQ ID NO: 191); GenBank Accession No. XP_320414 (SEQ ID NO: 192); GenBank Accession No. XP_310729 (SEQ ID NO: 193)), Aphis gossypii (GenBank Accession No. GU329918; SEQ ID NO: 194, SEQ ID NO: 195), Acyrthosiphon pisum (GenBank Accession No. CR_001948139; SEQ ID NO: 196), Agrotis ipsilon (GenBank Accession No. GU477714; SEQ ID NO: 197, SEQ ID NO: 198), Bombyx mori (GenBank Accession No. BAD11135; SEQ ID NO: 199), Chilo suppressalis (GenBank Accession No. JX294476; SEQ ID NO: 186, SEQ ID NO: 187), Drosophila melanogaster (GenBank Accession No. AAL39784; SEQ ID NO: 200), Helicoverpa armigera (GenBank Accession No. ABS18391; SEQ ID NO: 201), Heliothis virescens (GenBank Accession No. ABD76335; SEQ ID NO: 202), Macdunnoughia crassisigna (GenBank Accession No. EU287451; SEQ ID NO: 203, SEQ ID NO: 204), Pectinophora gossypiella (GenBank Accession No. GU270322; SEQ ID NO: 205, SEQ ID NO: 206), Tribolium castaneum (GenBank Accession No. CR_001814566; SEQ ID NO: 207), Ctenoplusia agnata (also called Argyrogramma agnata), Messor bouvieri, Megachile rotundata, Bombus impatiens, Mamestra brassicae, Mayetiola destructor or Apis mellifera.
[0280] In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from an insect. In some embodiments, the insect is Trichoplusia ni (AAA87375).
[0281] In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from an insect. In some embodiments, the insect is Bombyx mori (BAD11135).
[0282] In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from a crustacean. In some embodiments, the crustacean is Daphnia pulicaria (AAM76342, SEQ ID NO: 208).
[0283] In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from a vertebrate. In some embodiments, the vertebrate is Xenopus tropicalis (GenBank Accession No. BAF82026; SEQ ID NO: 209), Homo sapiens (GenBank Accession No. NP_689808; SEQ ID NO: 210), Mus musculus (GenBank Accession No. NP_741958; SEQ ID NO: 211), Macaca fascicularis (GenBank Accession No. AB179012; SEQ ID NO: 212, SEQ ID NO: 213), Rattus norvegicus (GenBank Accession No. XP_220453; SEQ ID NO: 214) or Myotis lucifugus.
[0284] In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from a urochordate. In some embodiments, the urochordate is Ciona intestinalis (GenBank Accession No. XP_002123602; SEQ ID NO: 215).
[0285] In some embodiments, the piggyBac or piggyBac-like transposase inserts a transposon at the sequence 5’-TTAT-3’ within a chromosomal site (a TTAT target sequence).
[0286] In some embodiments, the piggyBac or piggyBac-like transposase inserts a transposon at the sequence 5’-TTAA-3’ within a chromosomal site (a TTAA target sequence).
[0287] In some embodiments, the target sequence of the piggyBac or piggyBac-like transposon comprises or consists of 5’-CTAA-3’, 5’-TTAG-3’, 5’-ATAA-3’, 5’-TCAA-3’, 5’-AGTT-3’, 5’-ATTA-3’, 5’-GTTA-3’, 5’-TTGA-3’, 5’-TTTA-3’, 5’-TTAC-3’, 5’-ACTA-3’, 5’-AGGG-3’, 5’-CTAG-3’, 5’-TGAA-3’, 5’-AGGT-3’, 5’-ATCA-3’, 5’-CTCC-3’, 5’-TAAA-3’, 5’-TCTC-3’, 5’-TGAA-3’, 5’-AAAT-3’, 5’-AATC-3’, 5’-ACAA-3’, 5’-ACAT-3’, 5’-ACTC-3’, 5’-AGTG-3’, 5’-ATAG-3’, 5’-CAAA-3’, 5’-CACA-3’, 5’-CATA-3’, 5’-CCAG-3’, 5’-CCCA-3’, 5’-CGTA-3’, 5’-GTCC-3’, 5’-TAAG-3’, 5’-TCTA-3’, 5’-TGAG-3’, 5’-TGTT-3’, 5’-TTCA-3’, 5’-TTCT-3’ and 5’-TTTT-3’.
[0288] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Bombyx mori. The piggyBac or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MDIERQEERI RAMLEEELSD YSDESSSEDE TDHCSEHEVN YDTEEERIDS VDVPSNSRQE
61 EANAI IANES DSDPDDDLPL SLVRQRASAS RQVSGPFYTS KDGTKWYKNC QRPNVRLRSE
121 NIVTEQAQVK NIARDASTEY ECWNIFVTSD MLQEILTHTN SSIRHRQTKT AAENSSAETS
181 FYMQETTLCE LKALIALLYL AGLIKSNRQS LKDLWRTDGT GVDIFRTTMS LQRFQFLQNN
241 IRFDDKSTRD ERKQTDNMAA FRSIFDQFVQ CCQNAYSPSE FLTIDEMLLS FRGRCLFRVY
301 I PNKPAKYGI KILALVDAKN FDWNLEVYA GKQPSGPYAV SNRPFEWER LIQPVARSHR
361 NVTFDNWFTG YELMLHLLNE YRLTSVGTVR KNKRQIPESF IRTDRQPNSS VFGFQKDITL
421 VSYAPKKNKV VWMSTMHHD NSIDESTGEK QKPEMITFYN STKAGVDWD ELSANYNVSR
481 NSKRWPMTLF YGVLNMAAIN ACIIYRANKN VTIKRTEFIR SLGLSMIYEH LHSRNKKKNI
541 PTYLRQRIEK QLGEPSPRHV NVPGRYVRCQ DCPYKKDRKT KHSCNACAKP ICMEHAKFLC
601 ENCAELDSSL (SEQ ID NO: 56) .
[0289] The piggyBac (PB) or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MDIERQEERI RAMLEEELSD YSDESSSEDE TDHCSEHEVN YDTEEERIDS VDVPSNSRQE
61 EANAI IANES DSDPDDDLPL SLVRQRASAS RQVSGPFYTS KDGTKWYKNC QRPNVRLRSE
121 NIVTEQAQVK NIARDASTEY ECWNIFVTSD MLQEILTHTN SSIRHRQTKT AAENSSAETS
181 FYMQETTLCE LKALIALLYL AGLIKSNRQS LKDLWRTDGT GVDIFRTTMS LQRFQFLQNN
241 IRFDDKSTRD ERKQTDNMAA FRSIFDQFVQ CCQNAYSPSE FLTIDEMLLS FRGRCLFRVY
301 I PNKPAKYGI KILALVDAKN FYWNLEVYA GKQPSGPYAV SNRPFEWER LIQPVARSHR
361 NVTFDNWFTG YELMLHLLNE YRLTSVGTVR KNKRQIPESF IRTDRQPNSS VFGFQKDITL
421 VSYAPKKNKV VWMSTMHHD NSIDESTGEK QKPEMITFYN STKAGVDWD ELCANYNVSR
481 NSKRWPMTLF YGVLNMAAIN ACIIYRTNKN VTIKRTEFIR SLGLSMIYEH LHSRNKKKNI
541 PTYLRQRIEK QLGEPSPRHV NVPGRYVRCQ DCPYKKDRKT KRSCNACAKP ICMEHAKFLC
601 ENCAELDSSL (SEQ ID NO: 57) .
[0290] In some embodiments, the piggyBac or piggyBac-like transposase is fused to a nuclear localization signal. In some embodiments, the amino acid sequence of the piggyBac or piggyBac-like transposase fused to a nuclear localization signal is encoded by a polynucleotide sequence comprising:
1 atggcaccca aaaagaaacg taaagtgatg gacattgaaa gacaggaaga aagaatcagg
61 gcgatgctcg aagaagaact gagcgactac tccgacgaat cgtcatcaga ggatgaaacc
121 gaccactgta gcgagcatga ggttaactac gacaccgagg aggagagaat cgactctgtg
181 gatgtgccct ccaactcacg ccaagaagag gccaatgcaa ttatcgcaaa cgaatcggac
241 agcgatccag acgatgatct gccactgtcc ctcgtgcgcc agcgggccag cgcttcgaga
301 caagtgtcag gtccattcta cacttcgaag gacggcacta agtggtacaa gaattgccag
361 cgacctaacg tcagactccg ctccgagaat atcgtgaccg aacaggctca ggtcaagaat
421 atcgcccgcg acgcctcgac tgagtacgag tgttggaata tcttcgtgac ttcggacatg
481 ctgcaagaaa ttctgacgca caccaacagc tcgattaggc atcgccagac caagactgca
541 gcggagaact catcggccga aacctccttc tatatgcaag agactactct gtgcgaactg
601 aaggcgctga ttgcactgct gtacttggcc ggcctcatca aatcaaatag gcagagcctc
661 aaagatctct ggagaacgga tggaactgga gtggatatct ttcggacgac tatgagcttg
721 cagcggttcc agtttctgca aaacaatatc agattcgacg acaagtccac ccgggacgaa
781 aggaaacaga ctgacaacat ggctgcgttc cggtcaatat tcgatcagtt tgtgcagtgc
841 tgccaaaacg cttatagccc atcggaattc ctgaccatcg acgaaatgct tctctccttc
901 cgggggcgct gcctgttccg agtgtacatc ccgaacaagc cggctaaata cggaatcaaa
961 atcctggccc tggtggacgc caagaatttc tacgtcgtga atctcgaagt gtacgcagga
1021 aagcaaccgt cgggaccgta cgctgtttcg aaccgcccgt ttgaagtcgt cgagcggctt
1081 attcagccgg tggccagatc ccaccgcaat gttaccttcg acaattggtt caccggctac
1141 gagctgatgc ttcaccttct gaacgagtac cggctcacta gcgtggggac tgtcaggaag
1201 aacaagcggc agatcccaga atccttcatc cgcaccgacc gccagcctaa ctcgtccgtg
1261 ttcggatttc aaaaggatat cacgcttgtc tcgtacgccc ccaagaaaaa caaggtcgtg
1321 gtcgtgatga gcaccatgca tcacgacaac agcatcgacg agtcaaccgg agaaaagcaa
1381 aagcccgaga tgatcacctt ctacaattca actaaggccg gcgtcgacgt cgtggatgaa
1441 ctgtgcgcga actataacgt gtcccggaac tctaagcggt ggcctatgac tctcttctac
1501 ggagtgctga atatggccgc aatcaacgcg tgcatcatct accgcaccaa caagaacgtg
1561 accatcaagc gcaccgagtt catcagatcg ctgggtttga gcatgatcta cgagcacctc
1621 cattcacgga acaagaagaa gaatatccct acttacctga ggcagcgtat cgagaagcag
1681 ttgggagaac caagcccgcg ccacgtgaac gtgccggggc gctacgtgcg gtgccaagat
1741 tgcccgtaca aaaaggaccg caaaaccaaa agatcgtgta acgcgtgcgc caaacctatc
1801 tgcatggagc atgccaaatt tctgtgtgaa aattgtgctg aactcgattc ctccctg (SEQ ID NO: 58).
[0291] In some embodiments, the piggyBac or piggyBac-like transposase is hyperactive. A hyperactive piggyBac or piggyBac-like transposase is a transposase that is more active than the naturally occurring variant from which it is derived. In some embodiments, the hyperactive piggyBac or piggyBac-like transposase enzyme is isolated or derived from Bombyx mori. In some embodiments, the piggyBac or piggyBac-like transposase is a hyperactive variant of SEQ ID NO: 57. In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence that is at least 90% identical to:
1 MDIERQEERI RAMLEEELSD YSDESSSEDE TDHCSEHEVN YDTEEERIDS VDVPSNSRQE
61 EANAI IANES DSDPDDDLPL SLVRQRASAS RQMSGPHYTS KDGTKWYKNC QRPNVRLRSE
121 NIVTEQAQVK NIARDASTEY ECWNIFVTSD MLQEILTHTN SSIRWRQTKT AAENSSASTS
181 FYMQETTLCE LKALIGLLYI AGLIKSNRQS LKDLWRTDGT GVDIFRTTMS LQRFQFLQNN
241 IRFDDKSTRD ERKQTDNMAA FRSIFDQFVQ SCQNAYSPSE FLTIDEMLLS FRGRCLFRVY
301 I PNKPAKYGI KILALVDAKN FYVKNLEVYA GKQPSGPYAV SNRPFEWER LIQPVARSHR
361 NVTFDNWFTG YELMLHLLNE YRLTSVGTVR KNKRQIPESF IRTDRQPNSS VFGFQKDITL
421 VSYAPKKNKV VWMSTMHHD NSIDESTGEK QKPEMITFYN STKAGVDWD ELCANYNVSR
481 NSKRWPMTLF YGVLNMAAIN ACIIYRTNKN VTIKRTEFIR SLGLSMIYEH LHSRNKKKNI
541 PTYLRQRIEK QLGEPSPRHV NVPGRYVRCQ DCPYKKDRKT KRSCNACAKP ICMEHAKFLC
601 ENCAELDSHL (SEQ ID NO: 59).
[0292] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises SEQ ID NO: 59. In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence of:
1 MDIERQEERI RAMLEEELSD YSDESSSEDE TDHCSEHEVN YDTEEERIDS VDVPSNSRQE
61 EANAI IANES DSDPDDDLPL SLVRQRASAS RQVSGPFYTS KDGTKWYKNC QRPNVRLRSE
121 NIVTEQAQVK NIARDASTEY ECWNIFVTSD MLQEILTHTN SSIRWRQTKT AAENSSAETS
181 FYMQETTLCE LKALIGLLYI AGLIKSNRQS LKDLWRTDGT GVDIFRTTMS LQRFQFLLNN
241 IRFDDKSTRD ERKQTDNMAA FRSIFDQFVQ SCQNAYSPSE FLTIDEMLLS FRGRCLFRVY
301 I PNKPAKYGI KILALVDAKN FYVHNLEVYA GKQPSGPYAV SNRPFEWER LIQPVARSHR
361 NVTFDNWFTG YEVMLHLLNE YRLTSVGTVR KNKRQIPESF IRTDRQPNSS VFGFQKDITL
421 VSYAPKKNKV VWMSTMHHD NSIDESTGEK QKPEMITFYN STKAGVDWD ELCANYNVSR
481 NSKRWPMTLF YGVLNMAAIN ACIIYRTNKN VTIKRTEFIR SLGLSMIYEH LHSRNKKKNI
541 PTYLRQRIEK QLGEPSPRHV NVPGRYVRCQ DCPYKKDRKT KRSCNACAKP ICMEHAKFLC
601 ENCAHLDS (SEQ ID NO: 60).
[0293] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence of:
1 MDIERQEERI RAMLEEELSD YSDESSSEDE TDHCSEHEVN YDTEEERIDS VDVPSNSRQE
61 EANAI IANES DSDPDDDLPL SLVRQRASAS RQVSGPFYTS KDGTKWYKNC QRPNVRLRSE
121 NIVTEQAQVK NIARDASTEY ECWNIFVTSD MLQEILTHTN SSIRWRQTKT AAENSSASTS
181 FYMQETTLCE LKALIGLLYI AGLIKSNRQS LKDLWRTDGT GVDIFRTTMS LQRFQFLLNN
241 IRFDDKSTRD ERKQTDNMAA FRSIFDQFVQ SCQNAYSPSE FLTIDEMLLS FRGRCLFRVY
301 I PNKPAKYGI KILALVDAKN FYVKNLEVYA GKQPSGPYAV SNRPFEWER LIQPVARSHR
361 NVTFDNWFTG YELMLHLLNE YRLTSVGTVR KNKRQIPESF IRTDRQPNSS VFGFQKDITL
421 VSYAPKKNKV VWMSTMHHD NSIDESTGEK QKPEMITFYN STKAGVDWD ELCANYNVSR
481 NSKRWPMTLF YGVLNMAAIN ACIIYRTNKN VTIKRTEFIR SLGLSMIYEH LHSRNKKKNI
541 PTYLRQRIAM QLGEPSPRHV NVPGRYVRCQ DCPYKKDRKT KRSCNACAKP ICMEHAKFLC
601 ENCAELDSSL (SEQ ID NO: 61) .
[0294] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence of:
1 MDIERQEERI RAMLEEELSD YSDESSSEDE TDHCSEHEVN YDTEEERIDS VDVPSNSRQE
61 EANAI IANES DSDPDDDLPL SLVRQRASAS RQVSGPFYTS KDGTKWYKNC QRPNVRLRSE
121 NIVTEQAQVK NIARDASTEY ECWNIFVTSD MLQEILTHTN SSIRWRQTKT AAENSSAETS
181 FYMQETTLCE LKALIGLLYI AGLIKSNRQS LKDLWRTDGT GVDIFRTTMS LQRFQFLLNN
241 IRFDDKSTRD ERKQTDNMAA FRSIFDQFVQ SCQNAYSPSE FLTIDEMLLS FRGRCLFRVY
301 I PNKPAKYGI KILALVDAKN FYVKNLEVYA GKQPSGPYAV SNRPFEWER LIQPVARSHR
361 NVTFDNWFTG YELMLHLLNE YRLTSVGTVR KNKTQIPENF IRTDRQPNSS VFGFQKDITL
421 VSYAPKKNKV VWMSTMHHD NSIDESTGEK QKPEMITFYN STKAGVDWD ELQANYNVSR
481 NSKRWPMTLF YGVLNMAAIN ACIIYRTNKN VTIKRTEFIR SLGLSMIYEH LHSRNKKKNI
541 PTYLRQRIEK QLGEPSPRHV NVPGRYVRCQ DCPYKKDRKT KRSCNACAKP ICMEHAKFLC
601 ENCAELDSSL (SEQ ID NO: 62).
[0295] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence of:
1 MDIERQEERI RAMLEEELSD YSDESSSEDE TDHCSEHEVN YDTEEERIDS VDVPSNSRQE
61 EANAI IANES DSDPDDDLPL SLVRQRASAS RQVSGPFYTS KDGTKWYKNC QRPNVRLRSE
121 NIVTEQAQVK NIARDASTEY ECWNIFVTSD MLQEILTHTN SSIRWRQTKT AAENSSAETS
181 FYMQETTLCE LKALIGLLYI AGLIKSNRQS LKDLWRTDGT GVDIFRTTMS LQRFQFLQNN
241 IRFDDKSTRD ERKQTDNMAA FRSIFDQFVQ SCQNAYSPSE FLTIDEMLLS FRGRCLFRVY
301 I PNKPAKYGI KILALVDAKN FYVKNLEVYA GKQPSGPYAV SNRPFEWER LIQPVARSHR
361 NVTFDNWFTG YELMLHLLNE YRLTSVGTVR KNKRQIPESF IRTDRQPNSS VFGFQKDITL
421 VSYAPKKNKV VWMSTMHHD NSIDESTGEK QKPEMITFYN STKAGVDWD ELCANYNVSR
481 NSKRWPMTLF YGVLNMAAIN ACIIYRTNKN VTIKRTEFIR SLGLSMIYEH LHSRNKKKNI
541 PTYLRQRIEK QLGEPSPRHV NVPGRYVRCQ DCPYKKDRKT KRSCNACAKP ICMEHAKFLC
601 ENCAELDSSL (SEQ ID NO: 63).
[0296] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence of:
1 MDIERQEERI RAMLEEELSD YSDESSSEDE TDHCSEHEVN YDTEEERIDS VDVPSNSRQE
61 EANAI IANES DSDPDDDLPL SLVRQRASAS RQVSGPFYTS KDGTKWYKNC QRPNVRLRSE
121 NIVTEQAQVK NIARDASTEY ECWNIFVTSD MLQEILTHTN SSIRHRQTKT AAENSSAETS
181 FYMQETTLCE LKALIALLYL AGLIKSNRQS LKDLWRTDGT GVDIFRTTMS LQRFQFLQNN
241 IRFDDKSTRD ERKQTDNMAA FRSIFDQFVQ CCQNAYSPSE FLTIDEMLLS FRGRCLFRVY
301 I PNKPAKYGI KILALVDAKN DYWNLEVYA GKQPSGPYAV SNRPFEWER LIQPVARSHR
361 NVTFDNWFTG YELMLHLLNE YRLTSVGTVR KNKRQIPESF IRTDRQPNSS VFGFQKDITL
421 VSYAPKKNKV VWMSTMHHD NSIDESTGEK QKPEMITFYN STKAGVDWD ELCANYNVSR
481 NSKRWPMTLF YGVLNMAAIN ACIIYRTNKN VTIKRTEFIR SLGLSMIYEH LHSRNKKKNI
541 PTYLRQRIEK QLGEPSSRHV NVKGRYVRCQ DCPYKKDRKT KRSCNACAKP ICMEHAKFLC
601 ENCAELDSSL (SEQ ID NO: 64) .
[0297] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase is more active than the transposase of SEQ ID NO: 57. In some embodiments, the hyperactive piggyBac or piggyBac-like transposase is at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or any percentage in between identical to SEQ ID NO: 57.
[0298] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises an amino acid substitution at a position selected from 92, 93, 96, 97, 165, 178, 189, 196, 200, 201, 211, 215, 235, 238, 246, 253, 258, 261, 263, 271, 303, 321, 324, 330, 373, 389, 399, 402, 403, 404, 448, 473, 484, 507, 523, 527, 528, 543, 549, 550, 557, 601, 605, 607, 609, 610 or a combination thereof (relative to SEQ ID NO: 57). In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises an amino acid substitution of Q92A, V93L, V93M, P96G, F97H, F97C, H165E, H165W, E178S, E178H, C189P, A196G, L200I, A201Q, L211A, W215Y, G219S, Q235Y, Q235G, Q238L, K246I, K253V, M258V, F261L, S263K, C271S, N303R, F321W, F321D, V324K, V324H, A330V, L373C, L373V, V389L, S399N, R402K, T403L, D404Q, D404S, D404M, N441R, G448W, E449A, V469T, C473Q, R484K, T507C, G523A, I527M, Y528K, Y543I, E549A, K550M, P557S, E601V, E605H, E605W, D607H, S609H, L610I or any combination thereof. In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises an amino acid substitution of Q92A, V93L, V93M, P96G, F97H, F97C, H165E, H165W, E178S, E178H, C189P, A196G, L200I, A201Q, L211A, W215Y, G219S, Q235Y, Q235G, Q238L, K246I, K253V, M258V, F261L, S263K, C271S, N303R, F321W, F321D, V324K, V324H, A330V, L373C, L373V, V389L, S399N, R402K, T403L, D404Q, D404S, D404M, N441R, G448W, E449A, V469T, C473Q, R484K, T507C, G523A, I527M, Y528K, Y543I, E549A, K550M, P557S, E601V, E605H, E605W, D607H, S609H and L610I.
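The substitution notation used above (wild type residue, position relative to SEQ ID NO: 57, replacement residue) can be illustrated with a minimal Python sketch. The helper name `apply_substitutions` and the two example substitutions are hypothetical and chosen only to show the numbering convention; the ten-residue fragment is the first ten residues of SEQ ID NO: 57 as listed in this section.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply substitutions written as <wild type><1-based position><new residue>."""
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        # Positions are 1-based in the notation, 0-based in the list.
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{sub}: expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# First ten residues of SEQ ID NO: 57; the substitutions are illustrative only.
fragment = "MDIERQEERI"
print(apply_substitutions(fragment, ["E4A", "R9K"]))
```

Checking the wild type residue before replacing it guards against off-by-one numbering errors, which matter here because every position is stated relative to SEQ ID NO: 57.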
[0299] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises one or more substitutions of an amino acid that is not wild type, wherein the one or more substitutions for a wild type amino acid comprise a substitution of E4X, A12X, M13X, L14X, E15X, D20X, E24X, S25X, S26X, S27X, D32X, H33X, E36X, E44X, E45X, E46X, I48X, D49X, R58X, A62X, N63X, A64X, I65X, I66X, N68X, E69X, D71X, S72X, D76X, P79X, R84X, Q85X, A87X, S88X, Q92X, V93X, S94X, G95X, P96X, F97X, Y98X, T99X, I145X, S149X, D150X, L152X, E154X, T157X, N160X, S161X, S162X, H165X, R166X, T168X, K169X, T170X, A171X, E173X, S175X, S176X, E178X, T179X, M183X, Q184X, T186X, T187X, L188X, C189X, L194X, I195X, A196X, L198X, L200X, A201X, L203X, I204X, K205X, A206X, N207X, Q209X, S210X, L211X, K212X, D213X, L214X, W215X, R216X, T217X, G219X, V222X, D223X, I224X, T227X, M229X, Q235X, L237X, Q238X, N239X, N240X, P302X, N303X, P305X, A306X, K307X, Y308X, I310X, K311X, I312X, L313X, A314X, L315X, V316X, D317X, A318X, K319X, N320X, F321X, Y322X, V323X, V324X, L326X, E327X, V328X, A330X, Q333X, P334X, S335X, G336X, P337X, A339X, V340X, S341X, N342X, R343X, P344X, F345X, E346X, V347X, E349X, I352X, Q353X, V355X, A356X, R357X, N361X, D365X, W367X, T369X, G370X, L373X, M374X, L375X, H376X, N379X, E380X, R382X, V386X, V389X, N392X, R394X, Q395X, S399X, F400X, I401X, R402X, T403X, D404X, R405X, Q406X, P407X, N408X, S409X, S410X, V411X, F412X, F414X, Q415X, I418X, T419X, L420X, N428X, V432X, M434X, D440X, N441X, S442X, I443X, D444X, E445X, G448X, E449X, Q451X, K452X, M455X, I456X, T457X, F458X, S461X, A464X, V466X, Q468X, V469X, E471X, L472X, C473X, A474X, K483X, W485X, T488X, L489X, Y491X, G492X, V493X, M496X, I499X, C502X, I503X, T507X, K509X, N510X, V511X, T512X, I513X, R515X, E517X, S521X, G523X, L524X, S525X, I527X, Y528X, E529X, H532X, S533X, N535X, K536X, K537X, N539X, I540X, T542X, Y543X, Q546X, E549X, K550X, Q551X, G553X, E554X, P555X, S556X, P557X, R558X, H559X, V560X, N561X, V562X, P563X, G564X, R565X, Y566X, V567X, Q570X, D571X, P573X, Y574X, K576X, K581X, S583X, A586X, A588X, E594X, F598X, L599X, E601X, N602X, C603X, A604X, E605X, L606X, D607X, S608X, S609X or L610X (relative to SEQ ID NO: 57). A list of hyperactive amino acid substitutions can be found in US patent No. 10,041,077, the contents of which are incorporated herein by reference in their entirety.
[0300] In some embodiments, the piggyBac or piggyBac-like transposase is integration deficient. In some embodiments, an integration deficient piggyBac or piggyBac-like transposase is a transposase that can excise its corresponding transposon, but that integrates the excised transposon at a lower frequency than a corresponding wild type transposase. In some embodiments, the piggyBac or piggyBac-like transposase is an integration deficient variant of SEQ ID NO: 57.
[0301] In some embodiments, the excision competent, integration deficient piggyBac or piggyBac-like transposase comprises one or more substitutions of an amino acid that is not wild type, wherein the one or more substitutions for a wild type amino acid comprise a substitution of R9X, A12X, M13X, D20X, Y21K, D23X, E24X, S25X, S26X, S27X, E28X, E30X, D32X, H33X, E36X, H37X, A39X, Y41X, D42X, T43X, E44X, E45X, E46X, R47X, D49X, S50X, S55X, A62X, N63X, A64X, I66X, A67X, N68X, E69X, D70X, D71X, S72X, D73X, P74X, D75X, D76X, D77X, I78X, S81X, V83X, R84X, Q85X, A87X, S88X, A89X, S90X, R91X, Q92X, V93X, S94X, G95X, P96X, F97X, Y98X, T99X, W012X, G103X, Y107X, K108X, L117X, I122X, Q128X, I312X, D135X, S137X, E139X, Y140X, I145X, S149X, D150X, Q153X, E154X, T157X, S161X, S162X, R164X, H165X, R166X, Q167X, T168X, K169X, T170X, A171X, A172X, E173X, R174X, S175X, S176X, A177X, E178X, T179X, S180X, Y182X, Q184X, E185X, T187X, L188X, C189X, L194X, I195X, A196X, L198X, L200X, A201X, L203X, I204X, K205X, N207X, Q209X, L211X, D213X, L214X, W215X, R216X, T217X, G219X, T220X, V222X, D223X, I224X, T227X, T228X, F234X, Q235X, L237X, Q238X, N239X, N240X, N303X, K304X, I310X, I312X, L313X, A314X, L315X, V316X, D317X, A318X, K319X, N320X, F321X, Y322X, V323X, V324X, N325X, L326X, E327X, V328X, A330X, G331X, K332X, Q333X, S335X, P337X, P344X, F345X, E349X, H359X, N361X, V362X, D365X, F368X, Y371X, E372X, L373X, H376X, E380X, R382X, R382X, V386X, G387X, T388X, V389X, K391X, N392X, R394X, Q395X, E398X, S399X, F400X, I401X, R402X, T403X, D404X, R405X, Q406X, P407X, N408X, S409X, S410X, Q415X, K416X, A424X, K426X, N428X, V430X, V432X, V433X, M434X, D436X, D440X, N441X, S442X, I443X, D444X, E445X, S446X, T447X, G448X, E449X, K450X, Q451X, E454X, M455X, I456X, T457X, F458X, S461X, A464X, V466X, Q468X, V469X, C473X, A474X, N475X, N477X, K483X, R484X, P486X, T488X, L489X, G492X, V493X, M496X, I499X, I503X, Y505X, T507X, N510X, V511X, T512X, I513X, K514X, T516X, E517X, S521X, G523X, L524X, S525X, I527X, Y528X, L531X, H532X, S533X, N535X, I540X, T542X, Y543X, R545X, Q546X, E549X, L552X, G553X, E554X, P555X, S556X, P557X, R558X, H559X, V560X, N561X, V562X, P563X, G564X, V567X, Q570X, D571X, P573X, Y574X, K575X, K576X, N585X, A586X, M593X, K596X, E601X, N602X, A604X, E605X, L606X, D607X, S608X, S609X or L610X (relative to SEQ ID NO: 57). A list of integration deficient amino acid substitutions can be found in US patent No. 10,041,077, the contents of which are incorporated by reference in their entirety.
[0302] In some embodiments, the integration deficient piggyBac or piggyBac-like transposase comprises a sequence of:
1 MDIERQEERI RAMLEEELSD YSDESSSEDE TDHCSEHEVN YDTEEERIDS VDVPSNSRQE
61 EANAI IANES DSDPDDDLPL SLVRQRASAS RQVSGPFYTS KDGTKWYKNC QRPNVRLRSE
121 NIVTEQAQVK NIARDASTEY ECWNIFVTSD MLQEILTHTN SSIRHRQTKT AAENSSAETS
181 FYMQETTLCE LKALIALLYL AGLIKSNRQS LKDLWRKDGT GVDIFRTTMS LQRFQFLLNN
241 IRFDDISTRD ERKQTDNMAA FRSIFDQFVQ CCQNAYSPSE FLTIDEMLLS FRGRCLFRVY
301 I PNKPAKYGI KIIALVDAKN FYWNLEVYA GKQPSGPYAV SNRPFEWER LIQPVARSHR
361 NVTFDNWFTG YELMLHLLNE YRLTSVGTVR KNKRQIPESF IRTDRQPNSS VFGFQKDITL
421 VSYAPKKNKV VWMSTMHHD NSIDESTGEK QKPEMITFYN STKAGVDWD ELCANYNVSR
481 NSKKWPMTLF YGVLNMAAIN ACIIYRTNKN VTIKRTEFIR SLGLSMMYEH LHSRNKKKNI
541 PTYLRQRIEK QLGEPVPRHV NVPGRYVRCQ DCPYKKDRKT KRSCNACAKP ICMEHAKFLC
601 ENCAELDSSL (SEQ ID NO: 65).
In some embodiments, the integration deficient piggyBac or piggyBac-like transposase comprises a sequence of:
1 MDIERQEERI RAMLEEELSD YSDESSSEDE TDHCSEHEVN YDTEEERIDS VDVPSNSRQE
61 EANAI IANES DSDPDDDLPL SLVRQRASAS RQVSGPFYTS KDGTKWYKNC QRPNVRLRSE
121 NIVTEQAQVK NIARDASTEY ECWNIFVTSD MLQEILTHTN SSIRHRQTKT AAENSSAETS
181 FYMQETTLCE LKALIGLLYL AGLIKSNRQS LKDLWRTDGT GVDIFRTTMS LQRFYFLQNN
241 IRFDDKSTLD ERKQTDNMAA FRSIFDQFVQ SCQNAYSPSE FLTIDEMLLS FRGRCLFRVY
301 I PNKPAKYGI KIIALVDAKN FYWNLEVYA GKQPSGPYAV SNRPFEWER LIQPVARSHR
361 NVTFDNWFTG YELMLHLLNE YRLTSVGTVR KNKRQIPESF IRTDRQPNSS VFGFQKDITL
421 VSYAPKKNKV VWMSTMHHD NSIDESTGEK QKPEMITFYN STKAGVDWD ELCANYNVSR
481 NSKRWPMTLF YGVLNMAAIN ACIIYRTNKN VTIKRTEFIR SLGLSMIYEH LHSRNKKKNI
541 PTYLRQRIEK QLGEPSPRHV NYPGRYVRCQ DCPYKKDRKT KRSCNACAKP ICMEHAKFLC
601 VNCAELDSSL (SEQ ID NO: 66).
In some embodiments, the piggyBac or piggyBac-like transposase that is integration deficient comprises a sequence of:
1 MDIERQEERI RAMLEEELSD YSDESSSEDE TDHCSEHEVN YDTEEERIDS VDVPSNSRQE
61 EANAI IANES DSDPDDDLPL SLVRQRASAS RQVSGPFYTS KDGTKWYKNC QRPNVRLRSE
121 NIVTEQAQVK NIARDASTEY ECWNIFVTSD MLQEILTHTN SSIRHRQTKT AAENSSAETS
181 FYMQETTLCE LKALIALLYL AGLIKSNRQS LKDLWRKDGT GVDIFRTTMS LQRFQFLLNN
241 IRFDDKSTRD ERKQTDNMAA FRSIFDQFVQ CCQNAYSPSE FLTIDEMLLS FRGRCLFRVY
301 I PNKPAKYGI KIIALVDAKN DYWNLEVYA GKQPSGPYAV SNRPFEWER LIQPVARSHR
361 NVTFDNWFTG YECMLHLLNE YRLTSVGTVR KNKRQIPESF IRTDRQPNSS VFGFQKDITL
421 VSYAPKKNKV VWMSTMHHD NSIDESTGEK QKPEMITFYN STKAGVDWD ELCANYNVSR
481 NSKKWPMTLF YGVLNMAAIN ACIIYRTNKN VTIKRTEFIR SLGLSMIKEH LHSRNKKKNI
541 PTYLRQRIEK QLGEPSPRHV NVPGRYVRCQ DCPYKKDRKT KRSCNACAKP ICMEHAKFLC
601 ENCAELDSSL (SEQ ID NO: 67).
In some embodiments, the integration deficient transposase comprises a sequence that is at least 90% identical to SEQ ID NO: 67.
[0303] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Bombyx mori. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttatcccggc gagcatgagg cagggtatct cataccctgg taaaatttta aagttgtgta
61 ttttataaaa ttttcgtctg acaacactag cgcgctcagt agctggaggc aggagcgtgc
121 gggaggggat agtggcgtga tcgcagtgtg gcacgggaca ccggcgagat attcgtgtgc
181 aaacctgttt cgggtatgtt ataccctgcc tcattgttga cgtatttttt ttatgtaatt
241 tttccgatta ttaatttcaa ctgttttatt ggtattttta tgttatccat tgttcttttt
301 ttatgattta ctgtatcggt tgtctttcgt tcctttagtt gagttttttt ttattatttt
361 cagtttttga tcaaa (SEQ ID NO: 68) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tcatattttt agtttaaaaa aataattata tgttttataa tgaaaagaat ctcattatct
61 ttcagtatta ggttgattta tattccaaag aataatattt ttgttaaatt gttgattttt
121 gtaaacctct aaatgtttgt tgctaaaatt actgtgttta agaaaaagat taataaataa
181 taataatttc ataattaaaa acttctttca ttgaatgcca ttaaataaac cattatttta
241 caaaataaga tcaacataat tgagtaaata ataataagaa caatattata gtacaacaaa
301 atatgggtat gtcataccct gccacattct tgatgtaact ttttttcacc tcatgctcgc
361 cgggttat (SEQ ID NO: 69) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttatcccggc gagcatgagg cagggtatct cataccctgg taaaatttta aagttgtgta
61 ttttataaaa ttttcgtctg acaacactag cgcgctcagt agctggaggc aggagcgtgc
121 gggaggggat agtggcgtga tcgcagtgtg gcacgggaca ccggcgagat attcgtgtgc
181 aaacctgttt cgggtatgtt ataccctgcc tcat (SEQ ID NO: 70).
In some embodiments, the piggyBac (PB) or piggyBac-like transposon comprises a sequence of:
1 taaataataa taatttcata attaaaaact tctttcattg aatgccatta aataaaccat
61 tattttacaa aataagatca acataattga gtaaataata ataagaacaa tattatagta
121 caacaaaata tgggtatgtc ataccctgcc acattcttga tgtaactttt tttcacctca
181 tgctcgccgg gttat (SEQ ID NO: 71) .
[0304] In some embodiments, the piggyBac or piggyBac-like transposon comprises a 5’ sequence corresponding to SEQ ID NO: 68 and a 3’ sequence corresponding to SEQ ID NO: 69. In some embodiments, one piggyBac or piggyBac-like transposon end is at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or any percentage in between identical to SEQ ID NO: 68 and the other piggyBac or piggyBac-like transposon end is at least 85%, at least 90%, at least 95%, at least 98%, at least 99% or any percentage in between identical to SEQ ID NO: 69. In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 68 and SEQ ID NO: 69 or SEQ ID NO: 71. In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 70 and SEQ ID NO: 69 or SEQ ID NO: 71. In some embodiments, the 5’ and 3’ transposon ends share a 16 bp repeat sequence at their ends of CCCGGCGAGCATGAGG (SEQ ID NO: 72) immediately adjacent to the 5'-TTAT-3' target insertion site, which is inverted in orientation in the two ends. In some embodiments, the 5’ transposon end begins with a sequence comprising 5'-TTATCCCGGCGAGCATGAGG-3' (SEQ ID NO: 73), and the 3’ transposon end ends with a sequence comprising the reverse complement of the 16 bp repeat followed by the target site: 5'-CCTCATGCTCGCCGGGTTAT-3' (SEQ ID NO: 74).
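The relationship between the 16 bp repeat (SEQ ID NO: 72) and the two transposon end sequences (SEQ ID NO: 73 and SEQ ID NO: 74) can be checked with a short Python sketch. The function name `reverse_complement` is an illustrative helper, not part of the disclosure. The check shows that only the 16 bp repeat is inverted between the two ends; the TTAT target site reads the same at both ends, so SEQ ID NO: 74 is the reverse complement of the repeat followed by TTAT rather than a strict reverse complement of all 20 bases of SEQ ID NO: 73.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


ITR = "CCCGGCGAGCATGAGG"              # SEQ ID NO: 72
FIVE_PRIME = "TTATCCCGGCGAGCATGAGG"   # SEQ ID NO: 73
THREE_PRIME = "CCTCATGCTCGCCGGGTTAT"  # SEQ ID NO: 74
TARGET = "TTAT"

# 5' end: target site, then the repeat in forward orientation.
print(TARGET + ITR == FIVE_PRIME)
# 3' end: the repeat in inverted orientation, then the target site.
print(reverse_complement(ITR) + TARGET == THREE_PRIME)
```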
[0305] In some embodiments, the piggyBac or piggyBac-like transposon comprises one end comprising at least 14, 16, 18, 20, 30 or 40 contiguous nucleotides of SEQ ID NO: 68 or SEQ ID NO: 70. In some embodiments, the piggyBac or piggyBac-like transposon comprises one end comprising at least 14, 16, 18, 20, 30 or 40 contiguous nucleotides of SEQ ID NO: 69 or SEQ ID NO: 71. In some embodiments, the piggyBac or piggyBac-like transposon comprises one end with at least 90% identity to SEQ ID NO: 68 or SEQ ID NO: 70. In some embodiments, the piggyBac or piggyBac-like transposon comprises one end with at least 90% identity to SEQ ID NO: 69 or SEQ ID NO: 71.
[0306] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttaacccggc gagcatgagg cagggtatct cataccctgg taaaatttta aagttgtgta
61 ttttataaaa ttttcgtctg acaacactag cgcgctcagt agctggaggc aggagcgtgc
121 gggaggggat agtggcgtga tcgcagtgtg gcacgggaca ccggcgagat attcgtgtgc
181 aaacctgttt cgggtatgtt ataccctgcc tcattgttga cgtatttttt ttatgtaatt
241 tttccgatta ttaatttcaa ctgttttatt ggtattttta tgttatccat tgttcttttt
301 ttatgattta ctgtatcggt tgtctttcgt tcctttagtt gagttttttt ttattatttt
361 cagtttttga tcaaa (SEQ ID NO: 75) .
[0307] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tcatattttt agtttaaaaa aataattata tgttttataa tgaaaagaat ctcattatct
61 ttcagtatta ggttgattta tattccaaag aataatattt ttgttaaatt gttgattttt
121 gtaaacctct aaatgtttgt tgctaaaatt actgtgttta agaaaaagat taataaataa
181 taataatttc ataattaaaa acttctttca ttgaatgcca ttaaataatt cattatttta
241 caaaataaga tcaacataat tgagtaaata ataataagaa caatattata gtacaacaaa
301 atatgggtat gtcataccct tttttttttt tttttttttt ttttttcggg tagagggccg
361 aacctcctac gaggtccccg cgcaaaaggg gcgcgcgggg tatgtgagac tcaacgatct
421 gcatggtgtt gtgagcagac cgcgggccca aggattttag agcccaccca ctaaacgact
481 cctctgcact cttacacccg acgtccgatc ccctccgagg tcagaacccg gatgaggtag
541 gggggctacc gcggtcaaca ctacaaccag acggcgcggc tcaccccaag gacgcccagc
601 cgacggagcc ttcgaggcga atcgaaggct ctgaaacgtc ggccgtctcg gtacggcagc
661 ccgtcgggcc gcccagacgg tgccgctggt gtcccggaat accccgctgg accagaacca
721 gcctgccggg tcgggacgcg atacaccgtc gaccggtcgc tctaatcact ccacggcagc
781 gcgctagagt gctggta (SEQ ID NO: 76).
[0308] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of CCCGGCGAGCATGAGG (SEQ ID NO: 72). In some embodiments, the piggyBac or piggyBac-like transposon comprises an ITR sequence of SEQ ID NO: 72. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of TTATCCCGGCGAGCATGAGG (SEQ ID NO: 73). In some embodiments, the piggyBac or piggyBac-like transposon comprises at least 16 contiguous nucleotides from SEQ ID NO: 73. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of CCTCATGCTCGCCGGGTTAT (SEQ ID NO: 74). In some embodiments, the piggyBac or piggyBac-like transposon comprises at least 16 contiguous nucleotides from SEQ ID NO: 74. In some embodiments, the piggyBac or piggyBac-like transposon comprises one end comprising at least 16 contiguous nucleotides from SEQ ID NO: 73 and one end comprising at least 16 contiguous nucleotides from SEQ ID NO: 74. In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 73 and SEQ ID NO: 74. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of TTAACCCGGCGAGCATGAGG (SEQ ID NO: 77). In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of CCTCATGCTCGCCGGGTTAA (SEQ ID NO: 78).
[0309] In some embodiments, the piggyBac or piggyBac-like transposon may have ends comprising SEQ ID NO: 68 and SEQ ID NO: 69, or a variant of either or both of these having at least 90% sequence identity to SEQ ID NO: 68 or SEQ ID NO: 69, and the piggyBac or piggyBac-like transposase has the sequence of SEQ ID NO: 56 or SEQ ID NO: 57, or a sequence having at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identity to SEQ ID NO: 56 or SEQ ID NO: 57. In some embodiments, the piggyBac or piggyBac-like transposon comprises a heterologous polynucleotide inserted between a pair of inverted repeats, where the transposon is capable of transposition by a piggyBac or piggyBac-like transposase having at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identity to SEQ ID NO: 56 or SEQ ID NO: 57. In some embodiments, the transposon comprises two transposon ends, each of which comprises SEQ ID NO: 72 in inverted orientations in the two transposon ends. In some embodiments, each inverted terminal repeat (ITR) is at least 90% identical to SEQ ID NO: 72.
[0310] In some embodiments, the piggyBac or piggyBac-like transposon is capable of insertion by a piggyBac or piggyBac-like transposase at the sequence 5'-TTAT-3' within a target nucleic acid. In some embodiments, one end of the piggyBac or piggyBac-like transposon comprises at least 16 contiguous nucleotides from SEQ ID NO: 68 and the other transposon end comprises at least 16 contiguous nucleotides from SEQ ID NO: 69. In some embodiments, one end of the piggyBac or piggyBac-like transposon comprises at least 17, at least 18, at least 19, at least 20, at least 22, at least 25, at least 30 contiguous nucleotides from SEQ ID NO: 68 and the other transposon end comprises at least 17, at least 18, at least 19, at least 20, at least 22, at least 25, at least 30 contiguous nucleotides from SEQ ID NO: 69.
[0311] In some embodiments, the piggyBac or piggyBac-like transposon comprises transposon ends (each end comprising an ITR) corresponding to SEQ ID NO: 68 and SEQ ID NO: 69, and has a target sequence corresponding to 5'-TTAT-3'. In some embodiments, the piggyBac or piggyBac-like transposon also comprises a sequence encoding a transposase (e.g. SEQ ID NO: 57). In some embodiments, the piggyBac or piggyBac-like transposon comprises one transposon end corresponding to SEQ ID NO: 68 and a second transposon end corresponding to SEQ ID NO: 76. SEQ ID NO: 76 is very similar to SEQ ID NO: 69, but has a large insertion shortly before the ITR. Although the ITR sequences for the two transposon ends are identical (they are both identical to SEQ ID NO: 72), they have different target sequences: the second transposon has a target sequence corresponding to 5'-TTAA-3', providing evidence that no change in ITR sequence is necessary to modify the target sequence specificity. The piggyBac or piggyBac-like transposase (SEQ ID NO: 56), which is associated with the 5'-TTAA-3' target site, differs from the 5'-TTAT-3'-associated transposase (SEQ ID NO: 57) by only 4 amino acid changes (D322Y, S473C, A507T, H582R). In some embodiments, the piggyBac or piggyBac-like transposase (SEQ ID NO: 56), which is associated with the 5'-TTAA-3' target site, is less active than the 5'-TTAT-3'-associated piggyBac or piggyBac-like transposase (SEQ ID NO: 57) on the transposon with 5'-TTAT-3' ends. In some embodiments, piggyBac or piggyBac-like transposons with 5'-TTAA-3' target sites can be converted to piggyBac or piggyBac-like transposons with 5'-TTAT-3' target sites by replacing 5'-TTAA-3' target sites with 5'-TTAT-3'. Such transposons can be used either with a piggyBac or piggyBac-like transposase such as SEQ ID NO: 57, which recognizes the 5'-TTAT-3' target sequence, or with a variant of a transposase originally associated with the 5'-TTAA-3' transposon. In some embodiments, the high similarity between the 5'-TTAA-3' and 5'-TTAT-3' piggyBac or piggyBac-like transposases demonstrates that very few changes to the amino acid sequence of a piggyBac or piggyBac-like transposase are needed to alter target sequence specificity. In some embodiments, piggyBac or piggyBac-like transposon-transposase gene transfer systems can be formed by the modification of any piggyBac or piggyBac-like transposon-transposase gene transfer system, in which 5'-TTAA-3' target sequences are replaced with 5'-TTAT-3' target sequences, the ITRs remain the same, and the transposase is the original piggyBac or piggyBac-like transposase or a variant thereof resulting from using low-level mutagenesis to introduce mutations into the transposase. In some embodiments, piggyBac or piggyBac-like transposon-transposase gene transfer systems can be formed by the modification of a 5'-TTAT-3'-active piggyBac or piggyBac-like transposon-transposase gene transfer system, in which 5'-TTAT-3' target sequences are replaced with 5'-TTAA-3' target sequences, the ITRs remain the same, and the piggyBac or piggyBac-like transposase is the original transposase or a variant thereof.
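The amino acid differences between the 5'-TTAA-3'-associated transposase (SEQ ID NO: 56) and the 5'-TTAT-3'-associated transposase (SEQ ID NO: 57) can be recovered by a position-by-position comparison, sketched below in Python. The helper name `list_substitutions` is hypothetical, and the comparison assumes equal-length, ungapped fragments. The three ten-residue fragments are taken from the sequence listings in this section (residues 471-480, 501-510 and 581-590); the fragment containing position 322 is omitted here.

```python
def list_substitutions(reference, variant, start=1):
    """List differences as <reference residue><position><variant residue>.

    `start` is the 1-based position of the first residue of both fragments.
    """
    if len(reference) != len(variant):
        raise ValueError("fragments must have the same length (no gaps handled)")
    return [
        f"{ref}{start + offset}{var}"
        for offset, (ref, var) in enumerate(zip(reference, variant))
        if ref != var
    ]


# (start position, fragment of SEQ ID NO: 56, fragment of SEQ ID NO: 57)
fragments = [
    (471, "ELSANYNVSR", "ELCANYNVSR"),
    (501, "ACIIYRANKN", "ACIIYRTNKN"),
    (581, "KHSCNACAKP", "KRSCNACAKP"),
]

changes = []
for start, seq56, seq57 in fragments:
    changes.extend(list_substitutions(seq56, seq57, start))
print(changes)  # ['S473C', 'A507T', 'H582R']
```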
[0312] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Bombyx mori. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 cccggcgagc atgaggcagg gtatctcata ccctggtaaa attttaaagt tgtgtatttt
61 ataaaatttt cgtctgacaa cactagcgcg ctcagtagct ggaggcagga gcgtgcggga
121 ggggatagtg gcgtgatcgc agtgtggcac gggacaccgg cgagatattc gtgtgcaaac
181 ctgtttcggg tatgttatac cctgcctcat tgttgacgta t (SEQ ID NO: 79) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tttaagaaaa agattaataa ataataataa tttcataatt aaaaacttct ttcattgaat
61 gccattaaat aaaccattat tttacaaaat aagatcaaca taattgagta aataataata
121 agaacaatat tatagtacaa caaaatatgg gtatgtcata ccctgccaca ttcttgatgt
181 aacttttttt cacctcatgc tcgccggg (SEQ ID NO: 80) .
In some embodiments, the transposon comprises at least 16 contiguous bases from SEQ ID NO: 79 and at least 16 contiguous bases from SEQ ID NO: 80, and inverted terminal repeats that are at least 87% identical to CCCGGCGAGCATGAGG (SEQ ID NO: 72).
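The two criteria in the preceding sentence (a minimum run of contiguous bases shared with a reference end, and a minimum percent identity to the 16-base ITR of SEQ ID NO: 72) can be checked with a short script. This is an illustrative sketch only: the function names are hypothetical, and identity is computed position by position over equal-length sequences without gapped alignment, which is an assumption rather than a method stated in the disclosure. For a 16-base ITR, 13, 14 and 15 matching positions correspond to 81.25%, 87.5% and 93.75% identity.

```python
SEQ_ID_NO_72 = "CCCGGCGAGCATGAGG"  # 16-base ITR as given in the text


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def shares_contiguous_bases(candidate: str, reference: str, k: int = 16) -> bool:
    """True if the candidate contains at least one k-base window of the reference."""
    candidate, reference = candidate.upper(), reference.upper()
    return any(
        reference[i:i + k] in candidate
        for i in range(len(reference) - k + 1)
    )


# Hypothetical ITR with two mismatches relative to SEQ ID NO: 72 (14 of 16 positions match).
hypothetical_itr = "CCCGGCGAGCATGATT"
print(percent_identity(SEQ_ID_NO_72, hypothetical_itr))  # 87.5
```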
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 cccggcgagc atgaggcagg gtatctcata ccctggtaaa attttaaagt tgtgtatttt
61 ataaaatttt cgtctgacaa cactagcgcg ctcagtagct ggaggcagga gcgtgcggga
121 ggggatagtg gcgtgatcgc agtgtggcac gggacaccgg cgagatattc gtgtgcaaac
181 ctgtttcggg tatgttatac cctgcctcat tgttgacgta ttttttttat gtaatttttc
241 cgattattaa tttcaactgt tttattggta tttttatgtt atccattgtt ctttttttat
301 gatttactgt atcggttgtc tttcgttcct ttagttgagt ttttttttat tattttcagt
361 ttttgatcaa a (SEQ ID NO: 81) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tcatattttt agtttaaaaa aataattata tgttttataa tgaaaagaat ctcattatct
61 ttcagtatta ggttgattta tattccaaag aataatattt ttgttaaatt gttgattttt
121 gtaaacctct aaatgtttgt tgctaaaatt actgtgttta agaaaaagat taataaataa
181 taataatttc ataattaaaa acttctttca ttgaatgcca ttaaataaac cattatttta
241 caaaataaga tcaacataat tgagtaaata ataataagaa caatattata gtacaacaaa
301 atatgggtat gtcataccct gccacattct tgatgtaact ttttttcacc tcatgctcgc
361 cggg (SEQ ID NO: 82).
[0313] In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 81 and SEQ ID NO: 82, and is transposed by the piggyBac or piggyBac-like transposase of SEQ ID NO: 57. In some embodiments, the ITRs of SEQ ID NO: 81 and SEQ ID NO: 82 are not flanked by a 5’-TTAA-3’ sequence. In some embodiments, the ITRs of SEQ ID NO: 81 and SEQ ID NO: 82 are flanked by a 5’-TTAT-3’ sequence.
[0314] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 cccggcgagc atgaggcagg gtatctcata ccctggtaaa attttaaagt tgtgtatttt
61 ataaaatttt cgtctgacaa cactagcgcg ctcagtagct ggaggcagga gcgtgcggga
121 ggggatagtg gcgtgatcgc agtgtggcac gggacaccgg cgagatattc gtgtgcaaac
181 ctgtttcggg tatgttatac cctgcctcat tgttgacgta ttttttttat gtaatttttc
241 cgattattaa tttcaactgt tttattggta tttttatgtt atccattgtt ctttttttat
301 g (SEQ ID NO: 83) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 cagggtatct cataccctgg taaaatttta aagttgtgta ttttataaaa ttttcgtctg
61 acaacactag cgcgctcagt agctggaggc aggagcgtgc gggaggggat agtggcgtga
121 tcgcagtgtg gcacgggaca ccggcgagat attcgtgtgc aaacctgttt cgggtatgtt
181 ataccctgcc tcattgttga cgtatttttt ttatgtaatt tttccgatta ttaatttcaa
241 ctgttttatt ggtattttta tgttatccat tgttcttttt ttatg (SEQ ID NO: 84) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 cagggtatct cataccctgg taaaatttta aagttgtgta ttttataaaa ttttcgtctg
61 acaacactag cgcgctcagt agctggaggc aggagcgtgc gggaggggat agtggcgtga
121 tcgcagtgtg gcacgggaca ccggcgagat attcgtgtgc aaacctgttt cgggtatgtt
181 ataccctgcc tcattgttga cgtat (SEQ ID NO: 85) .
In some embodiments, the 5’ end of the piggyBac or piggyBac-like transposon comprises a sequence of SEQ ID NO: 79, SEQ ID NO: 81, or SEQ ID NOs: 83-85. In some embodiments, the 5’ end of the piggyBac or piggyBac-like transposon is preceded by a 5’ target sequence.
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tcatattttt agtttaaaaa aataattata tgttttataa tgaaaagaat ctcattatct
61 ttcagtatta ggttgattta tattccaaag aataatattt ttgttaaatt gttgattttt
121 gtaaacctct aaatgtttgt tgctaaaatt actgtgttta agaaaaagat taataaataa
181 taataatttc ataattaaaa acttctttca ttgaatgcca ttaaataaac cattatttta
241 caaaataaga tcaacataat tgagtaaata ataataagaa caatattata gtacaacaaa
301 atatgggtat gtcataccct gccacattct tgatgtaact ttttttcacc tcatgctcgc
361 cggg (SEQ ID NO: 86) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tttaagaaaa agattaataa ataataataa tttcataatt aaaaacttct ttcattgaat
61 gccattaaat aaaccattat tttacaaaat aagatcaaca taattgagta aataataata
121 agaacaatat tatagtacaa caaaatatgg gtatgtcata ccctgccaca ttcttgatgt
181 aacttttttt ca (SEQ ID NO: 87).
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 cccggcgagc atgaggcagg gtatctcata ccctggtaaa attttaaagt tgtgtatttt
61 ataaaatttt cgtctgacaa cactagcgcg ctcagtagct ggaggcagga gcgtgcggga
121 ggggatagtg gcgtgatcgc agtgtggcac gggacaccgg cgagatattc gtgtgcaaac
181 ctgtttcggg tatgttatac cctgcctcat tgttgacgta ttttttttat gtaatttttc
241 cgattattaa tttcaactgt tttattggta tttttatgtt atccattgtt ctttttttat
301 gatttactgt atcggttgtc tttcgttcct ttagttgagt ttttttttat tattttcagt
361 ttttgatcaa a (SEQ ID NO: 88) .
[0315] In some embodiments, the 3’ end of the piggyBac or piggyBac-like transposon comprises a sequence of SEQ ID NO: 80, SEQ ID NO: 82, or SEQ ID NOs: 86-87. In some embodiments, the 3’ end of the piggyBac or piggyBac-like transposon is followed by a 3’ target sequence. In some embodiments, the transposon is transposed by the transposase of SEQ ID NO: 57. In some embodiments, the 5’ and 3’ ends of the piggyBac or piggyBac-like transposon share a 16 bp repeat sequence of SEQ ID NO: 72 in inverted orientation and immediately adjacent to the target sequence. In some embodiments, the 5’ transposon end begins with SEQ ID NO: 72, and the 3’ transposon end ends with the reverse complement of SEQ ID NO: 72, 5’-CCTCATGCTCGCCGGG-3’ (SEQ ID NO: 89). In some embodiments, the piggyBac or piggyBac-like transposon comprises an ITR with at least 93%, at least 87%, or at least 81% or any percentage in between identity to SEQ ID NO: 72 or SEQ ID NO: 89. In some embodiments, the piggyBac or piggyBac-like transposon comprises a target sequence followed by a 5’ transposon end comprising a sequence selected from SEQ ID NOs: 88, 105 or 107 and a 3’ transposon end comprising SEQ ID NO: 80 or 106 followed by a target sequence. In some embodiments, the piggyBac or piggyBac-like transposon comprises one end that comprises a sequence that is at least 90%, at least 95% or at least 99% or any percentage in between identical to SEQ ID NO: 79 and one end that comprises a sequence that is at least 90%, at least 95% or at least 99% or any percentage in between identical to SEQ ID NO: 80. In some embodiments, one transposon end comprises at least 14, at least 16, at least 18 or at least 20 contiguous bases from SEQ ID NO: 79 and one transposon end comprises at least 14, at least 16, at least 18 or at least 20 contiguous bases from SEQ ID NO: 80.
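The relationship between SEQ ID NO: 72 and SEQ ID NO: 89 stated above (one is the reverse complement of the other) can be confirmed with a few lines of code. The sketch below is illustrative; the helper names `reverse_complement` and `has_inverted_itrs` are hypothetical, and the two end fragments are short excerpts of the first bases of SEQ ID NO: 79 and the last bases of SEQ ID NO: 80 as printed in this document.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_inverted_itrs(five_prime_end: str, three_prime_end: str, itr: str) -> bool:
    """True if the 5' end begins with the ITR and the 3' end ends with its reverse complement."""
    five, three, itr = five_prime_end.upper(), three_prime_end.upper(), itr.upper()
    return five.startswith(itr) and three.endswith(reverse_complement(itr))


SEQ_ID_NO_72 = "CCCGGCGAGCATGAGG"
SEQ_ID_NO_89 = "CCTCATGCTCGCCGGG"

# Excerpts only: first 30 bases of SEQ ID NO: 79 and last 28 bases of SEQ ID NO: 80.
five_prime_fragment = "cccggcgagcatgaggcagggtatctcata"
three_prime_fragment = "aactttttttcacctcatgctcgccggg"

print(reverse_complement(SEQ_ID_NO_72) == SEQ_ID_NO_89)  # True
print(has_inverted_itrs(five_prime_fragment, three_prime_fragment, SEQ_ID_NO_72))  # True
```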
[0316] In some embodiments, the piggyBac or piggyBac-like transposon comprises two transposon ends wherein each transposon end comprises a sequence that is at least 81% identical, at least 87% identical or at least 93% identical or any percentage in between identical to SEQ ID NO: 72 in inverted orientation in the two transposon ends. One end may further comprise at least 14, at least 16, at least 18 or at least 20 contiguous bases from SEQ ID NO: 85, and the other end may further comprise at least 14, at least 16, at least 18 or at least 20 contiguous bases from SEQ ID NO: 87. The piggyBac or piggyBac-like transposon may be transposed by the transposase of SEQ ID NO: 57, and the transposase may optionally be fused to a nuclear localization signal.
[0317] In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 81 and SEQ ID NO: 82 and the piggyBac or piggyBac-like transposase comprises SEQ ID NO: 56 or SEQ ID NO: 57. In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 83 and SEQ ID NO: 82 and the piggyBac or piggyBac-like transposase comprises SEQ ID NO: 56 or SEQ ID NO: 57. In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 81 and SEQ ID NO: 80 and the piggyBac or piggyBac-like transposase comprises SEQ ID NO: 56 or SEQ ID NO: 57. In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 88 and SEQ ID NO: 86 and the piggyBac or piggyBac-like transposase comprises SEQ ID NO: 56 or SEQ ID NO: 57.
[0318] In some embodiments, the piggyBac or piggyBac-like transposon comprises a 5’ end comprising 1, 2, 3, 4, 5, 6, or 7 sequences selected from ATGAGGCAGGGTAT (SEQ ID NO: 90), ATACCCTGCCTCAT (SEQ ID NO: 91), GGCAGGGTAT (SEQ ID NO: 92), ATACCCTGCC (SEQ ID NO: 93), TAAAATTTTA (SEQ ID NO: 94), ATTTTATAAAAT (SEQ ID NO: 95), TCATACCCTG (SEQ ID NO: 96) and TAAATAATAATAA (SEQ ID NO: 97). In some embodiments, the piggyBac or piggyBac-like transposon comprises a 3’ end comprising 1, 2 or 3 sequences selected from SEQ ID NO: 93, SEQ ID NO: 96 and SEQ ID NO: 97.
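Which of the listed motifs occur in a given transposon end can be determined by simple substring search. The following sketch is illustrative only; the dictionary `MOTIFS` and the function `motifs_present` are hypothetical names, and the example input is just the first 60 bases of SEQ ID NO: 79 as printed above, not the full end.

```python
# Motifs keyed by SEQ ID NO, as listed in the paragraph above.
MOTIFS = {
    90: "ATGAGGCAGGGTAT",
    91: "ATACCCTGCCTCAT",
    92: "GGCAGGGTAT",
    93: "ATACCCTGCC",
    94: "TAAAATTTTA",
    95: "ATTTTATAAAAT",
    96: "TCATACCCTG",
    97: "TAAATAATAATAA",
}


def motifs_present(end_sequence: str) -> list[int]:
    """Return the SEQ ID NOs of all motifs found as exact substrings of the sequence."""
    # Keep only base letters so that a listing line with numbers and spaces can be pasted in.
    cleaned = "".join(ch for ch in end_sequence.upper() if ch in "ACGT")
    return sorted(seq_id for seq_id, motif in MOTIFS.items() if motif in cleaned)


# First line (bases 1-60) of SEQ ID NO: 79, pasted in listing format.
first_line_of_seq_79 = "1 cccggcgagc atgaggcagg gtatctcata ccctggtaaa attttaaagt tgtgtatttt"
print(motifs_present(first_line_of_seq_79))  # [90, 92, 94, 96]
```

Searching the complete end sequence rather than a 60-base excerpt may find additional motifs, for example ones that span a line boundary in the listing.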
[0319] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Xenopus tropicalis. The piggyBac or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MAKRFYSAEE AAAHCMASSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQNPLPRY ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SLESYWDTTT VLSIPVFSAT MSRNRYQLLL RFLHFNNNAT AVPPDQPGHD RLHKLRPLID
241 SLSERFAAVY TPCQNICIDE SLLLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSSGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLDT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RAWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPP SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCFKPCF EIYHTQLHY (SEQ ID NO: 99).
[0320] In some embodiments, the piggyBac or piggyBac-like transposase is a hyperactive variant of SEQ ID NO: 99. In some embodiments, the piggyBac or piggyBac-like transposase is an integration defective variant of SEQ ID NO: 99. The piggyBac or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MAKRFYSAEE AAAHCMAPSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQNPLPRY ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SLESYWNTTT VLSIPVFSAT MSRNRYQLLL RFLHFNNNAT AVPPDQPDHD RLHKLRPLID
241 SLSERFAAVY TPCQNICIDE SLLLFKGRLR FRQYIPSKRA RYGIKFYKLC ESSSGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLDT
361 PACGTINRTR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT SAWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMLP SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCFKPCF EIYHTQLHY (SEQ ID NO: 98).
[0321] In some embodiments, the piggyBac or piggyBac-like transposase is isolated or derived from Xenopus tropicalis. In some embodiments, the piggyBac or piggyBac-like transposase is a hyperactive piggyBac or piggyBac-like transposase. In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence at least 90% identical to:
1 MAKRFYSAEE AAAHCSASSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQNPLTRG ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SIESYWDTTT VLSIPVFGAT MSRNRYQLLL RFLHFNNNAT AVPPDQPGHD RLHKLRPLID
241 SLSERFANVY TPCQNICIDE SLMLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSTGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLNT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RHWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPD SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCRKPCF EIYHTQLHY (SEQ ID NO: 100).
[0322] In some embodiments, the piggyBac or piggyBac-like transposase is a hyperactive piggyBac or piggyBac-like transposase. A hyperactive piggyBac or piggyBac-like transposase is a transposase that is more active than the naturally occurring variant from which it is derived. In some embodiments, a hyperactive piggyBac or piggyBac-like transposase is more active than the transposase of SEQ ID NO: 99. In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence of:
1 MAKRFYSAEE AAAHCSASSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQNPLTRG ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SIESYWDTTT VLSIPVFGAT MSRNRYQLLL RFLHFNNNAT AVPPDQPGHD RLHKLRPLID
241 SLSERFANVY TPCQNICIDE SLMLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSTGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLNT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RHWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPD SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCRKPCF EIYHTQLHY (SEQ ID NO: 100).
[0323] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence of:
1 MAKRFYSAEE AAAHCMASSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQNPLTRY ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SLESYWDTTT VLSIPVFSAT MSRNRYQLLL RFLHFNNNAT AVPPDQPGHD RLHKLRPLID
241 SLSERFAAVY TPCQNICIDE SLLLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSSGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLNT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RHWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPP SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCRKPCF EIYHTQLHY (SEQ ID NO: 101).
[0324] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence of:
1 MAKRFYSAEE AAAHCMASSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQNPLPRY ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SLESYWDTTT VLKIPVFSAT MSRNRYQLLL RFLHFNNNAT AVPPDQPGHD RLHKLRPLID
241 SLSERFAAVY TPCQNICIDE SLLLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSSGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLNT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RHWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPP SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCFKPCF EIYHTQLHY (SEQ ID NO: 102).
[0325] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence of:
1 MAKRFYSAEE AAAHCMASSS EQTSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQNPLTRY ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SIESYWDTTT VLSIPVFGAT MSRNRYQLLL RFLHFNNNAT AVPPDQPGHD RLHKLRPLID
241 SLSERFANVY TPCQNICIDE SLLLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSSGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLNT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRKPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RHWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPP SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCRKPCF EIYHTQLHY (SEQ ID NO: 103).
[0326] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence of:
1 MAKRFYSAEE AAAHCSASSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQNPLTRG ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SLESYWDTTT VLSIPVFGAT MSRNRYQLLL RFLHFNNNAT AVPPDQPGHD RLHKLRPLID
241 SLSERFANVY TPCQNICIDE SLMLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSTGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLNT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RHWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPP SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCRKPCF EIYHTQLHY (SEQ ID NO: 104).
[0327] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises a sequence of:
1 MAKRFYSAEE AAAHCMASSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQNPLTRY ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SLESYWDTTT VLSIPVFGAT MSRNRYQLLL RFLHFNNNAT AVPPDQPGHD RLHKLRPLID
241 SLSERFANVY TPCQNICIDE SLLLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSSGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLNT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RHWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPP SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCRKPCF EIYHTQLHY (SEQ ID NO: 105).
[0328] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises an amino acid substitution at a position selected from amino acid 6, 7, 16, 19, 20, 21, 22, 23, 24, 26, 28, 31, 34, 67, 73, 76, 77, 88, 91, 141, 145, 146, 148, 150, 157, 162, 179, 182, 189, 192, 193, 196, 198, 200, 210, 212, 218, 248, 263, 270, 294, 297, 308, 310, 333, 336, 354, 357, 358, 359, 377, 423, 426, 428, 438, 447, 450, 462, 469, 472, 498, 502, 517, 520, 523, 533, 534, 576, 577, 582, 583 or 587 (relative to SEQ ID NO: 99). In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises an amino acid substitution of Y6C, S7G, M16S, S19G, S20Q, S20G, S20D, E21D, E22Q, F23T, F23P, S24Y, S26V, S28Q, V31K, A34E, L67A, G73H, A76V, D77N, P88A, N91D, Y141Q, Y141A, N145E, N145V, P146T, P146V, P146K, P148T, P148H, Y150G, Y150S, Y150C, H157Y, A162C, A179K, L182I, L182V, T189G, L192H, S193N, S193K, V196I, S198G, T200W, L210H, F212N, N218E, A248N, L263M, Q270L, S294T, T297M, S308R, L310R, L333M, Q336M, A354H, C357V, L358F, D359N, L377I, V423H, P426K, K428R, S438A, T447G, T447A, L450V, A462H, A462Q, I469V, I472L, Q498M, L502V, E517I, P520D, P520G, N523S, I533E, D534A, F576R, F576E, K577I, I582R, Y583F, L587Y or L587W, or any combination thereof including at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or all of these mutations (relative to SEQ ID NO: 99).
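Substitutions written in this notation can be applied to a reference sequence programmatically, with a check that the stated wild-type residue is actually present at the stated position. The sketch below is an illustration under stated assumptions: the function name `apply_substitutions` is hypothetical, positions are 1-based relative to the reference, and the example uses only the first 30 residues of SEQ ID NO: 99 as printed above.

```python
import re

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(reference: str, substitutions: list[str]) -> str:
    """Apply substitutions such as 'Y6C' to a protein sequence (1-based positions).

    Raises ValueError if a substitution is malformed, out of range, or names a
    wild-type residue that does not match the reference at that position.
    """
    residues = list(reference)
    for sub in substitutions:
        match = _SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"malformed substitution: {sub!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(reference):
            raise ValueError(f"position out of range: {sub!r}")
        if reference[position - 1] != wild_type:
            raise ValueError(f"reference has {reference[position - 1]} at {position}, not {wild_type}")
        residues[position - 1] = replacement
    return "".join(residues)


# First 30 residues of SEQ ID NO: 99 (excerpt only).
n_terminus = "MAKRFYSAEEAAAHCMASSSEEFSGSDSEY"
print(apply_substitutions(n_terminus, ["Y6C", "M16S"]))
# MAKRFCSAEEAAAHCSASSSEEFSGSDSEY
```

The wild-type check is a deliberate design choice: it catches numbering errors before a variant sequence is generated.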
[0329] In some embodiments, the hyperactive piggyBac or piggyBac-like transposase comprises one or more substitutions of an amino acid that is not wild type, wherein the one or more substitutions for a wild type amino acid comprises a substitution of A2X, K3X, R4X, F5X, Y6X, S7X, A11X, A13X, C15X, M16X, A17X, S18X, S19X, S20X, E21X, E22X, F23X, S24X, G25X, S26X, D27X, S28X, E29X, E42X, E43X, S44X, C46X, S47X, S48X, S49X, T50X, V51X, S52X, A53X, L54X, E55X, E56X, P57X, M58X, E59X, E62X, D63X, V64X, D65X, D66X, L67X, E68X, D69X, Q70X, E71X, A72X, G73X, D74X, R75X, A76X, D77X, A78X, A79X, A80X, G81X, G82X, E83X, P84X, A85X, W86X, G87X, P88X, P89X, C90X, N91X, F92X, P93X, E95X, I96X, P97X, P98X, F99X, T100X, T101X, P103X, G104X, V105X, K106X, V107X, D108X, T109C, N111C, P114X, I115C, N116C, F117X, F118X, Q119X, M122X, T123X, E124X, A125X, I126X, L127X, Q128X, D129X, M130X, L132X, Y133X, V126X, Y127X, A138X, E139X, Q140X, Y141X, L142X, Q144X, N145X, P146X, L147X, P148X, Y150X, A151X, A155X, H157X, P158X, I161X, A162X, V168X, T171X, L172X, A173X, M174X, I177X, A179X, L182X, D187X, T188X, T189X, T190X, L192X, S193X, I194X, P195X, V196X, S198X, A199X, T200X, S202X, L208X, L209X, L210X, R211X, F212X, F215X, N217X, N218X, A219X, T220X, A221X, V222X, P224X, D225X, Q226X, P227X, H229X, R231X, H233X, L235X, P237X, I239X, D240X, L242X, S243X, E244X, R244X, F246X, A247X, A248X, V249X, Y250X, T251X, P252X, C253X, Q254X, I256X, C257X, I258X, D259X, E260X, S261X, L262X, L263X, L264X, F265X, K266X, G267X, R268X, L269X, Q270X, F271X, R272X, Q273X, Y274X, I275X, P276X, S277X, K278X, R279X, A280X, R281X, Y282X, G283X, I284X, K285X, F286X, Y287X, K288X, L289X, C290X, E291X, S292X, S293X, S294X, G295X, Y296X, T297X, S298X, Y299X, F300X, E304X, L310X, P313X, G314X, P316X, P317X, D318X, L319X, T320X, V321X, K324X, E328X, I330X, S331X, P332X, L333X, L334X, G335X, Q336X, F338X, L340X, D343X, N344X, F345X, Y346X, S347X, L351X, F352X, A354X, L355X, Y356X, C357X, L358X, D359X, T360X, R422X, Y423X, G424X, P426X, K428X, N429X, K430X, P431X, L432X, S434X, K435X, E436X, S438X, K439X, Y440X, G443X, R446X, T447X, L450X, Q451X, N455X, T460X, R461X, A462X, K465X, V467X, G468X, I469X, Y470X, L471X, I472X, M474X, A475X, L476X, R477X, S479X, Y480X, V482X, Y483X, K484X, A485X, A486X, V487X, P488X, P490X, K491X, S493X, Y494X, Y495X, K496X, Y497T, Q498X, L499X, Q500X, I501X, L502X, P503X, A504X, L505X, L506X, F507X, G508X, G509X, V510X, E511X, E512X, Q513X, T514X, V515X, E517X, M518X, P519X, P520X, S521X, D522X, N523X, V524X, A525X, L527X, I528X, K530X, H531X, F532X, I533X, D534X, T535X, L536X, T539X, P540X, Q546X, K550X, R553X, K554X, R555X, G556X, I557X, R558X, R559X, D560X, T561X, Y564X, P566X, K567X, P569X, R570X, N571X, L574X, C575X, F576X, K577X, P578X, F580X, E581X, I582X, Y583X, T585X, Q586X, L587X, H588X or Y589X (relative to SEQ ID NO: 99). A list of hyperactive amino acid substitutions can be found in U.S. Patent No. 10,041,077, the contents of which are incorporated by reference in their entirety.
[0330] In some embodiments, the piggyBac or piggyBac-like transposase is integration deficient. In some embodiments, an integration deficient piggyBac or piggyBac-like transposase is a transposase that can excise its corresponding transposon, but that integrates the excised transposon at a lower frequency than a corresponding naturally occurring transposase. In some embodiments, the piggyBac or piggyBac-like transposase is an integration deficient variant of SEQ ID NO: 99. In some embodiments, the integration deficient piggyBac or piggyBac-like transposase is deficient relative to SEQ ID NO: 99.
[0331] In some embodiments, the piggyBac or piggyBac-like transposase is active for excision but deficient in integration. In some embodiments, the integration deficient piggyBac or piggyBac-like transposase comprises a sequence that is at least 90% identical to a sequence of:
1 MAKRFYSAEE AAAHCMASSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRVDAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQNPLPRY ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SLESYWDTTT VLSIPVFSAT MSRNRYQLLL KFLHFNNEAT AVPPDQPGHD RLHKLRPLID
241 SLSERFAAVY TPCQNICIDE SLLLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSSGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLDT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RAWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPP SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCFKPCF EIYHTQLHYG RR (SEQ ID NO: 106).
[0332] In some embodiments, the integration deficient piggyBac or piggyBac-like transposase comprises a sequence that is at least 90% identical to a sequence of:
1 MAKRFYSAEE AAAHCMASSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQVPLPRY ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SLESYWDTTT VLNIPVFSAT MSRNRYQLLL RFLEFNNEAT AVPPDQPGHD RLHKLRPLID
241 SLSERFAAVY TPCQNICIDE SLLLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSSGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLDT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RAWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPP SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCFKPCF EIYHTQLHY (SEQ ID NO: 107).
[0333] In some embodiments, the integration deficient piggyBac or piggyBac-like transposase comprises a sequence that is at least 90% identical to a sequence of:
1 MAKRFYSAEE AAAHCMASSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQNVLPRY ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SLESYWDTTT VLSIPVFSAT MSRNRYQLLL RFLHFNNDAT AVPPDQPGHD RLHKLRPLID
241 SLTERFAAVY TPCQNICIDE SLLLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSSGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLDT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RAWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPP SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCFKPCF EIYHTQLHYG RR (SEQ ID NO: 108).
[0334] In some embodiments, the integration deficient piggyBac or piggyBac-like transposase comprises SEQ ID NO: 108. In some embodiments, the integration deficient piggyBac or piggyBac-like transposase comprises a sequence that is at least 90% identical to a sequence of:
1 MAKRFYSAEE AAAHCMASSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAP GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQVPLPRY ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SLESYWDTTT VLSIPVFSAT MSRNRYQLLL RFLHFNNEAT AVPPDQPGHD RLHKLRPLID
241 SLSERFAAVY TPCQNICIDE SLLLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSSGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLDT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RAWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPP SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCFKPCF EIYHTQLHYG RR (SEQ ID NO: 109).
[0335] In some embodiments, the integration deficient piggyBac or piggyBac-like transposase comprises SEQ ID NO: 109. In some embodiments, the integration deficient piggyBac or piggyBac-like transposase comprises a sequence that is at least 90% identical to a sequence of:
1 MAKRFYSAEE AAAHCMASSS EEFSGSDSEY VPPASESDSS TEESWCSSST VSALEEPMEV
61 DEDVDDLEDQ EAGDRADAAA GGEPAWGPPC NFPPEIPPFT TVPGVKVDTS NFEPINFFQL
121 FMTEAILQDM VLYTNVYAEQ YLTQVPLPRY ARAHAWHPTD IAEMKRFVGL TLAMGLIKAN
181 SLESYWDTTT VLNIPVFSAT MSRNRYQLLL RFLEFNNNAT AVPPDQPGHD RLHKLRPLID
241 SLSERFAAVY TPCQNICIDE SLLLFKGRLQ FRQYIPSKRA RYGIKFYKLC ESSSGYTSYF
301 LIYEGKDSKL DPPGCPPDLT VSGKIVWELI SPLLGQGFHL YVDNFYSSIP LFTALYCLDT
361 PACGTINRNR KGLPRALLDK KLNRGETYAL RKNELLAIKF FDKKNVFMLT SIHDESVIRE
421 QRVGRPPKNK PLCSKEYSKY MGGVDRTDQL QHYYNATRKT RAWYKKVGIY LIQMALRNSY
481 IVYKAAVPGP KLSYYKYQLQ ILPALLFGGV EEQTVPEMPP SDNVARLIGK HFIDTLPPTP
541 GKQRPQKGCK VCRKRGIRRD TRYYCPKCPR NPGLCFKPCF EIYHTQLHYG RR (SEQ ID NO: 110).
[0336] In some embodiments, the integration deficient piggyBac or piggyBac-like transposase comprises SEQ ID NO: 110. In some embodiments, the integration deficient piggyBac or piggyBac-like transposase comprises an amino acid substitution wherein the Asn at position 218 is replaced by a Glu or an Asp (N218D or N218E) (relative to SEQ ID NO: 99).
[0337] In some embodiments, the excision competent, integration deficient piggyBac or piggyBac-like transposase comprises one or more substitutions of an amino acid that is not wild type, wherein the one or more substitutions for a wild type amino acid comprises a substitution of A2X, K3X, R4X, F5X, Y6X, S7X, A8X, E9X, E10X, A11X, A12X, A13X, H14X, C15X, M16X, A17X, S18X, S19X, S20X, E21X, E22X, F23X, S24X, G25X, S26X, D27X, S28X, E29X, V31X, P32X, P33X, A34X, S35X, E36X, S37X, D38X, S39X, S40X, T41X, E42X, E43X, S44X, W45X, C46X, S47X, S48X, S49X, T50X, V51X, S52X, A53X, L54X, E55X, E56X, P57X, M58X, E59X, V60X, M122X, T123X, E124X, A125X, L127X, Q128X, D129X, L132X, Y133X, V126X, Y127X, E139X, Q140X, Y141X, L142X, T143X, Q144X, N145X, P146X, L147X, P148X, R149X, Y150X, A151X, H154X, H157X, P158X, T159X, D160X, I161X, A162X, E163X, M164X, K165X, R166X, F167X, V168X, G169X, L170X, T171X, L172X, A173X, M174X, G175X, L176X, I177X, K178X, A179X, N180X, S181X, L182X, S184X, Y185X, D187X, T188X, T189X, T190X, V191X, L192X, S193X, I194X, P195X, V196X, F197X, S198X, A199X, T200X, M201X, S202X, R203X, N204X, R205X, Y206X, Q207X, L208X, L209X, L210X, R211X, F212X, L213X, H241X, F215X, N216X, N217X, N218X, A219X, T220X, A221X, V222X, P223X, P224X, D225X, Q226X, P227X, G228X, H229X, D230X, R231X, H233X, K234X, L235X, R236X, L238X, I239X, D240X, L242X, S243X, E244X, R244X, F246X, A247X, A248X, V249X, Y250X, T251X, P252X, C253X, Q254X, N255X, I256X, C257X, I258X, D259X, E260X, S261X, L262X, L263X, L264X, F265X, K266X, G267X, R268X, L269X, Q270X, F271X, R272X, Q273X, Y274X, I275X, P276X, S277X, K278X, R279X, A280X, R281X, Y282X, G283X, I284X, K285X, F286X, Y287X, K288X, L289X, C290X, E291X, S292X, S293X, S294X, G295X, Y296X, T297X, S298X, Y299X, F300X, I302X, E304X, G305X, K306X, D307X, S308X, K309X, L310X, D311X, P312X, P313X, G314X, C315X, P316X, P317X, D318X, L319X, T320X, V321X, S322X, G323X, K324X, I325X, V326X, W327X, E328X, L329X, I330X, S331X, P332X, L333X, L334X, G335X, Q336X, F338X, H339X, L340X, V342X, N344X, F345X, Y346X, S347X, S348X, I349X, L351X, T353X, A354X, Y356X, C357X, L358X, D359X, T360X, P361X, A362X, C363X, G364X, I366X, N367X, R368X, D369X, K371X, G372X, L373X, R375X, A376X, L377X, L378X, D379X, K380X, K381X, L382X, N383X, R384X, G385X, T387X, Y388X, A389X, L390X, K392X, N393X, E394X, A397X, K399X, F400X, F401X, D402X, N405X, L406X, L409X, R422X, Y423X, G424X, E425X, P426X, K428X, N429X, K430X, P431X, L432X, S434X, K435X, E436X, S438X, K439X, Y440X, G442X, G443X, V444X, R446X, T447X, L450X, Q451X, H452X, N455X, T457X, R458X, T460X, R461X, A462X, Y464X, K465X, V467X, G468X, I469X, L471X, I472X, Q473X, M474X, L476X, R477X, N478X, S479X, Y480X, V482X, Y483X, K484X, A485X, A486X, V487X, P488X, G489X, P490X, K491X, L492X, S493X, Y494X, Y495X, K496X, Q498X, L499X, Q500X, I501X, L502X, P503X, A504X, L505X, L506X, F507X, G508X, G509X, V510X, E511X, E512X, Q513X, T514X, V515X, E517X, M518X, P519X, P520X, S521X, D522X, N523X, V524X, A525X, L527X, I528X, G529X, K530X, F532X, I533X, D534X, T535X, L536X, P537X, P538X, T539X, P540X, G541X, F542X, Q543X, R544X, P545X, Q546X, K547X, G548X, C549X, K550X, V551X, C552X, R553X, K554X, R555X, G556X, I557X, R558X, R559X, D560X, T561X, R562X, Y563X, Y564X, C565X, P566X, K567X, C568X, P569X, R570X, N571X, P572X, G573X, L574X, C575X, F576X, K577X, P578X, C579X, F580X, E581X, I582X, Y583X, H584X, T585X, Q586X, L587X, H588X or Y589X (relative to SEQ ID NO: 99). A list of excision competent, integration deficient amino acid substitutions can be found in U.S. Patent No. 10,041,077, the contents of which are incorporated by reference in their entirety.
[0338] In some embodiments, the piggyBac or piggyBac-like transposase is fused to a nuclear localization signal. In some embodiments, SEQ ID NO: 99 or SEQ ID NO: 98 is fused to a nuclear localization signal. In some embodiments, the amino acid sequence of the piggyBac or piggyBac-like transposase fused to a nuclear localization signal is encoded by a polynucleotide sequence comprising:
1 atggcaccca aaaagaaacg taaagtgatg gccaaaagat tttacagcgc cgaagaagca 61 gcagcacatt gcatggcatc gtcatccgaa gaattctcgg ggagcgattc cgaatatgtc 121 ccaccggcct cggaaagcga ttcgagcact gaggagtcgt ggtgttcctc ctcaactgtc 181 tcggctcttg aggagccgat ggaagtggat gaggatgtgg acgacttgga ggaccaggaa 241 gccggagaca gggccgacgc tgccgcggga ggggagccgg cgtggggacc tccatgcaat 301 tttcctcccg aaatcccacc gttcactact gtgccgggag tgaaggtcga cacgtccaac 361 ttcgaaccga tcaatttctt tcaactcttc atgactgaag cgatcctgca agatatggtg 421 ctctacacta atgtgtacgc cgagcagtac ctgactcaaa acccgctgcc tcgctacgcg 481 agagcgcatg cgtggcaccc gaccgatatc gcggagatga agcggttcgt gggactgacc 541 ctcgcaatgg gcctgatcaa ggccaacagc ctcgagtcat actgggatac cacgactgtg 601 cttagcattc cggtgttctc cgctaccatg tcccgtaacc gctaccaact cctgctgcgg 661 ttcctccact tcaacaacaa tgcgaccgct gtgccacctg accagccagg acacgacaga 721 ctccacaagc tgcggccatt gatcgactcg ctgagcgagc gattcgccgc ggtgtacacc 781 ccttgccaaa acatttgcat cgacgagtcg cttctgctgt ttaaaggccg gcttcagttc 841 cgccagtaca tcccatcgaa gcgcgctcgc tatggtatca aattctacaa actctgcgag 901 tcgtccagcg gctacacgtc atacttcttg atctacgagg ggaaggactc taagctggac 961 ccaccggggt gtccaccgga tcttactgtc tccggaaaaa tcgtgtggga actcatctca 1021 cctctcctcg gacaaggctt tcatctctac gtcgacaatt tctactcatc gatccctctg 1081 ttcaccgccc tctactgcct ggatactcca gcctgtggga ccattaacag aaaccggaag 1141 ggtctgccga gagcactgct ggataagaag ttgaacaggg gagagactta cgcgctgaga 1201 aagaacgaac tcctcgccat caaattcttc gacaagaaaa atgtgtttat gctcacctcc 1261 atccacgacg aatccgtcat ccgggagcag cgcgtgggca ggccgccgaa aaacaagccg 1321 ctgtgctcta aggaatactc caagtacatg gggggtgtcg accggaccga tcagctgcag 1381 cattactaca acgccactag aaagacccgg gcctggtaca agaaagtcgg catctacctg 1441 atccaaatgg cactgaggaa ttcgtatatt gtctacaagg ctgccgttcc gggcccgaaa 1501 ctgtcatact acaagtacca gcttcaaatc ctgccggcgc tgctgttcgg tggagtggaa 1561 gaacagactg tgcccgagat gccgccatcc gacaacgtgg cccggttgat cggaaagcac 1621 ttcattgata ccctgcctcc gacgcctgga aagcagcggc cacagaaggg atgcaaagtt 1681 tgccgcaagc gcggaatacg 
gcgcgatacc cgctactatt gcccgaagtg cccccgcaat 1741 cccggactgt gtttcaagcc ctgttttgaa atctaccaca cccagttgca ttac (SEQ ID NO: 111) .
[0339] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Xenopus tropicalis. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttaacctttt tactgccaat gacgcatggg atacgtcgtg gcagtaaaag ggcttaaatg
61 ccaacgacgc gtcccatacg ttgttggcat tttaagtctt ctctctgcag cggcagcatg
121 tgccgccgct gcagagagtt tctagcgatg acagcccctc tgggcaacga gccggggggg
181 ctgtc (SEQ ID NO: 112) .
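The sequences in this section are printed in numbered blocks (a position index followed by groups of ten nucleotides). As an illustrative sketch only, such a block can be collapsed into a plain string before any comparison is made. The helper name `parse_numbered_sequence` is hypothetical, and the block below is copied from SEQ ID NO: 112 as printed above.

```python
def parse_numbered_sequence(block: str) -> str:
    """Drop position indices and whitespace from a numbered sequence block."""
    tokens = block.split()
    # Position indices are purely numeric; everything else is sequence.
    return "".join(t for t in tokens if not t.isdigit()).lower()


SEQ_ID_NO_112_BLOCK = """
1 ttaacctttt tactgccaat gacgcatggg atacgtcgtg gcagtaaaag ggcttaaatg
61 ccaacgacgc gtcccatacg ttgttggcat tttaagtctt ctctctgcag cggcagcatg
121 tgccgccgct gcagagagtt tctagcgatg acagcccctc tgggcaacga gccggggggg
181 ctgtc
"""

seq_112 = parse_numbered_sequence(SEQ_ID_NO_112_BLOCK)
print(len(seq_112))   # number of nucleotides in the parsed sequence
print(seq_112[:18])   # first 18 nucleotides, including the TTAA site
```

The sequence identifier and trailing punctuation are assumed to have been removed before parsing; a stricter version could also verify that each position index equals the running length plus one.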
[0340] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tttgcatttt tagacattta gaagcctata tcttgttaca gaattggaat tacacaaaaa 61 ttctaccata ttttgaaagc ttaggttgtt ctgaaaaaaa caatatattg ttttcctggg 121 taaactaaaa gtcccctcga ggaaaggccc ctaaagtgaa acagtgcaaa acgttcaaaa 181 actgtctggc aatacaagtt ccactttgac caaaacggct ggcagtaaaa gggttaa (SEQ ID NO: 113) .
[0341] In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 112 and SEQ ID NO: 113. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttaacccttt gcctgccaat cacgcatggg atacgtcgtg gcagtaaaag ggcttaaatg
61 ccaacgacgc gtcccatacg ttgttggcat tttaagtctt ctctctgcag cggcagcatg
121 tgccgccgct gcagagagtt tctagcgatg acagcccctc tgggcaacga gccggggggg
181 ctgtc (SEQ ID NO: 114) .
[0342] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tttgcatttt tagacattta gaagcctata tcttgttaca gaattggaat tacacaaaaa 61 ttctaccata ttttgaaagc ttaggttgtt ctgaaaaaaa caatatattg ttttcctggg
121 taaactaaaa gtcccctcga ggaaaggccc ctaaagtgaa acagtgcaaa acgttcaaaa
181 actgtctggc aatacaagtt ccactttggg acaaatcggc tggcagtgaa agggttaa
(SEQ ID NO: 115) .
[0343] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttaacctttt tactgccaat gacgcatggg atacgtcgtg gcagtaaaag ggcttaaatg
61 ccaacgacgc gtcccatacg ttgttggcat tttaattctt ctctctgcag cggcagcatg
121 tgccgccgct gcagagagtt tctagcgatg acagcccctc tgggcaacga gccggggggg
181 ctgtc (SEQ ID NO: 116) .
[0344] In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 113 and SEQ ID NO: 112, SEQ ID NO: 114 or SEQ ID NO: 116. In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 115 and SEQ ID NO: 112, SEQ ID NO: 114 or SEQ ID NO: 116. In some embodiments, the piggyBac or piggyBac-like transposon comprises one end comprising at least 14, 16, 18, 20, 30 or 40 contiguous nucleotides from SEQ ID NO: 112, SEQ ID NO: 114 or SEQ ID NO: 116. In some embodiments, the piggyBac or piggyBac-like transposon comprises one end comprising at least 14, 16, 18, 20, 30 or 40 contiguous nucleotides from SEQ ID NO: 113 or SEQ ID NO: 115. In some embodiments, the piggyBac or piggyBac-like transposon comprises one end with at least 90% identity to SEQ ID NO: 112, SEQ ID NO: 114 or SEQ ID NO: 116. In some embodiments, the piggyBac or piggyBac-like transposon comprises one end with at least 90% identity to SEQ ID NO: 113 or SEQ ID NO: 115. In one embodiment, one transposon end is at least 90% identical to SEQ ID NO: 112 and the other transposon end is at least 90% identical to SEQ ID NO: 113.
[0345] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of TTAACCTTTTTACTGCCA (SEQ ID NO: 117). In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of TTAACCCTTTGCCTGCCA (SEQ ID NO: 118). In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of TTAACCYTTTTACTGCCA (SEQ ID NO: 119). In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of
TGGCAGTAAAAGGGTTAA (SEQ ID NO: 120). In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of TGGCAGTGAAAGGGTTAA (SEQ ID NO: 121). In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of TTAACCYTTTKMCTGCCA (SEQ ID NO: 122). In some embodiments, one end of the piggyBac or piggyBac-like transposon comprises a sequence selected from SEQ ID NO: 117, SEQ ID NO: 118 and SEQ ID NO: 119. In some embodiments, one end of the piggyBac (PB) or piggyBac-like transposon comprises a sequence selected from SEQ ID NO: 120 and SEQ ID NO: 121. In some embodiments, each inverted terminal repeat (ITR) of the piggyBac or piggyBac-like transposon comprises an ITR sequence of
CCYTTTKMCTGCCA (SEQ ID NO: 123). In some embodiments, each end of the piggyBac (PB) or piggyBac-like transposon comprises SEQ ID NO: 123 in inverted orientations. In some embodiments, one ITR of the piggyBac or piggyBac-like transposon comprises a sequence selected from SEQ ID NO: 117, SEQ ID NO: 118 and SEQ ID NO: 119. In some embodiments, one ITR of the piggyBac or piggyBac-like transposon comprises a sequence selected from SEQ ID NO: 120 and SEQ ID NO: 121. In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 122 in inverted orientation in the two transposon ends.
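SEQ ID NO: 122 and SEQ ID NO: 123 are written with degenerate nucleotide codes (Y, K, M). As a minimal sketch, assuming the standard IUPAC meanings of those codes (Y = C or T, K = G or T, M = A or C), a concrete sequence can be tested against a degenerate consensus position by position. The function name `matches_consensus` is hypothetical.

```python
# Standard IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches_consensus(seq: str, consensus: str) -> bool:
    """True if every base of seq is allowed by the consensus code at that position."""
    if len(seq) != len(consensus):
        return False
    return all(
        base in IUPAC.get(code, "")
        for base, code in zip(seq.upper(), consensus.upper())
    )


SEQ_117 = "TTAACCTTTTTACTGCCA"
SEQ_118 = "TTAACCCTTTGCCTGCCA"
SEQ_122 = "TTAACCYTTTKMCTGCCA"  # 18 nt consensus including the TTAA site
SEQ_123 = "CCYTTTKMCTGCCA"      # 14 nt ITR consensus

print(matches_consensus(SEQ_117, SEQ_122))      # both concrete 5' ends fit the consensus
print(matches_consensus(SEQ_118, SEQ_122))
print(matches_consensus(SEQ_117[4:], SEQ_123))  # dropping TTAA leaves the 14 nt ITR
```

This check compares equal-length sequences only and does not search for the consensus within a longer sequence.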
[0346] In some embodiments, the piggyBac or piggyBac-like transposon has ends comprising SEQ ID NO: 112 and SEQ ID NO: 113, or a variant of either or both of these having at least 90% sequence identity to SEQ ID NO: 112 or SEQ ID NO: 113, and the piggyBac or piggyBac-like transposase has the sequence of SEQ ID NO: 99 or a variant showing at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between sequence identity to SEQ ID NO: 99 or SEQ ID NO: 98. In some embodiments, one piggyBac or piggyBac-like transposon end comprises at least 14 contiguous nucleotides from SEQ ID NO: 112, SEQ ID NO: 114 or SEQ ID NO: 116, and the other transposon end comprises at least 14 contiguous nucleotides from SEQ ID NO: 113 or SEQ ID NO: 115. In some embodiments, one transposon end comprises at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 22, at least 25 or at least 30 contiguous nucleotides from SEQ ID NO: 112, SEQ ID NO: 114 or SEQ ID NO: 116, and the other transposon end comprises at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 22, at least 25 or at least 30 contiguous nucleotides from SEQ ID NO: 113 or SEQ ID NO: 115.
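The "at least N contiguous nucleotides from" language can be read as a shared-substring test. The following is an illustrative sketch under that reading, not a definition from the disclosure: the function name `shares_contiguous_run` is hypothetical, and the reference string is the first 60 nucleotides of SEQ ID NO: 112 as printed above.

```python
def shares_contiguous_run(candidate: str, reference: str, min_len: int) -> bool:
    """True if candidate contains at least min_len contiguous nucleotides of reference."""
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    candidate = candidate.lower()
    reference = reference.lower()
    # Slide a window of min_len over the reference and look for it in the candidate.
    return any(
        reference[i:i + min_len] in candidate
        for i in range(len(reference) - min_len + 1)
    )


# First 60 nt of SEQ ID NO: 112.
REF_112_HEAD = "ttaacctttttactgccaatgacgcatgggatacgtcgtggcagtaaaagggcttaaatg"

# Hypothetical candidate end carrying 14 nt of the reference between unrelated flanks.
candidate_end = "aaaa" + REF_112_HEAD[:14] + "aaaa"
print(shares_contiguous_run(candidate_end, REF_112_HEAD, 14))
```

A window longer than the reference can never match, so the function returns False in that case rather than raising.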
[0347] In some embodiments, the piggyBac or piggyBac-like transposase recognizes a transposon with a 5’ end sequence corresponding to SEQ ID NO: 112 and a 3’ end sequence corresponding to SEQ ID NO: 113. The transposase excises the transposon from one DNA molecule by cutting the DNA from the 5'-TTAA-3' sequence at the 5’ end of one transposon end to the 5'-TTAA-3' sequence at the 3’ end of the second transposon end, including any heterologous DNA that is placed between them, and inserts the excised sequence into a second DNA molecule. In some embodiments, truncated and modified versions of the 5’ and 3’ transposon ends will also function as part of a transposon that can be transposed by the piggyBac or piggyBac-like transposase. For example, the 5’ transposon end can be replaced by a sequence corresponding to SEQ ID NO: 114 or SEQ ID NO: 116, and the 3’ transposon end can be replaced by a shorter sequence corresponding to SEQ ID NO: 115. In some embodiments, the 5’ and 3’ transposon ends share an 18 bp almost perfectly repeated sequence at their ends (5'-TTAACCYTTTKMCTGCCA-3'; SEQ ID NO: 122) that includes the 5'-TTAA-3' insertion site, which sequence is inverted in orientation in the two ends. That is, in SEQ ID NO: 112 and SEQ ID NO: 116 the 5’ transposon end begins with the sequence 5'-TTAACCTTTTTACTGCCA-3' (SEQ ID NO: 117), or in SEQ ID NO: 114 the 5’ transposon end begins with the sequence 5'-TTAACCCTTTGCCTGCCA-3' (SEQ ID NO: 118); the 3’ transposon end ends with approximately the reverse complement of this sequence: in SEQ ID NO: 113 it ends 5'-TGGCAGTAAAAGGGTTAA-3' (SEQ ID NO: 120), and in SEQ ID NO: 115 it ends 5'-TGGCAGTGAAAGGGTTAA-3' (SEQ ID NO: 121). One embodiment of the invention is a transposon that comprises a heterologous polynucleotide inserted between two transposon ends each comprising SEQ ID NO: 122 in inverted orientations in the two transposon ends. In some embodiments, one transposon end comprises a sequence selected from SEQ ID NO: 117, SEQ ID NO: 118 and SEQ ID NO: 119. In some embodiments, one transposon end comprises a sequence selected from SEQ ID NO: 120 and SEQ ID NO: 121.
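The statement that the 3’ end terminates in approximately the reverse complement of the 5’ end can be checked directly. The sketch below, with hypothetical helper names `reverse_complement` and `count_mismatches`, computes the reverse complement of SEQ ID NO: 117 and SEQ ID NO: 118 and counts the positions at which each differs from SEQ ID NO: 120 and SEQ ID NO: 121, respectively.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_mismatches(a: str, b: str) -> int:
    """Number of differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


SEQ_117 = "TTAACCTTTTTACTGCCA"  # start of the 5' end in SEQ ID NO: 112 / 116
SEQ_118 = "TTAACCCTTTGCCTGCCA"  # start of the 5' end in SEQ ID NO: 114
SEQ_120 = "TGGCAGTAAAAGGGTTAA"  # terminus of the 3' end in SEQ ID NO: 113
SEQ_121 = "TGGCAGTGAAAGGGTTAA"  # terminus of the 3' end in SEQ ID NO: 115

rc_117 = reverse_complement(SEQ_117)
print(rc_117, count_mismatches(rc_117, SEQ_120))  # one differing position
rc_118 = reverse_complement(SEQ_118)
print(rc_118, count_mismatches(rc_118, SEQ_121))  # two differing positions
```

The small non-zero mismatch counts are consistent with the description of the terminal repeat as almost, rather than exactly, perfectly inverted.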
[0348] In some embodiments, the piggyBac (PB) or piggyBac-like transposon is isolated or derived from Xenopus tropicalis. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of: 1 ccctttgcct gccaatcacg catgggatac gtcgtggcag taaaagggct taaatgccaa
61 cgacgcgtcc catacgtt (SEQ ID NO: 124) .
[0349] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 cctgggtaaa ctaaaagtcc cctcgaggaa aggcccctaa agtgaaacag tgcaaaacgt 61 tcaaaaactg tctggcaata caagttccac tttgggacaa atcggctggc agtgaaaggg
(SEQ ID NO: 125).
[0350] In some embodiments, the piggyBac or piggyBac-like transposon comprises at least 16 contiguous bases from SEQ ID NO: 124 or SEQ ID NO: 125, and an inverted terminal repeat of CCYTTTBMCTGCCA (SEQ ID NO: 126).
[0351] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ccctttgcct gccaatcacg catgggatac gtcgtggcag taaaagggct taaatgccaa
61 cgacgcgtcc catacgttgt tggcatttta agtcttctct ctgcagcggc agcatgtgcc
121 gccgctgcag agagtttcta gcgatgacag cccctctggg caacgagccg ggggggctgt
181 c (SEQ ID NO: 127) .
[0352] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 cctttttact gccaatgacg catgggatac gtcgtggcag taaaagggct taaatgccaa
61 cgacgcgtcc catacgttgt tggcatttta attcttctct ctgcagcggc agcatgtgcc
121 gccgctgcag agagtttcta gcgatgacag cccctctggg caacgagccg ggggggctgt
181 c (SEQ ID NO: 128) .
[0353] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 cctttttact gccaatgacg catgggatac gtcgtggcag taaaagggct taaatgccaa
61 cgacgcgtcc catacgttgt tggcatttta agtcttctct ctgcagcggc agcatgtgcc
121 gccgctgcag agagtttcta gcgatgacag cccctctggg caacgagccg ggggggctgt
181 c (SEQ ID NO: 129) .
[0354] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 cctttttact gccaatgacg catgggatac gtcgtggcag taaaagggct taaatgccaa
61 cgacgcgtcc catacgttgt tggcatttta agtcttctct ctgcagcggc agcatgtgcc
121 gccgctgcag agag (SEQ ID NO: 130) .
[0355] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 cctttttact gccaatgacg catgggatac gtcgtggcag taaaagggct taaatgccaa 61 cgacgcgtcc catacgttgt tggcatttta agtctt (SEQ ID NO: 131) .
[0356] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ccctttgcct gccaatcacg catgggatac gtcgtggcag taaaagggct taaatgccaa 61 cgacgcgtcc catacgttgt tggcatttta agtctt (SEQ ID NO: 132) .
[0357] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttatcctttt tactgccaat gacgcatggg atacgtcgtg gcagtaaaag ggcttaaatg
61 ccaacgacgc gtcccatacg ttgttggcat tttaagtctt ctctctgcag cggcagcatg
121 tgccgccgct gcagagagtt tctagcgatg acagcccctc tgggcaacga gccggggggg
181 ctgtc (SEQ ID NO: 133) .
[0358] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tttgcatttt tagacattta gaagcctata tcttgttaca gaattggaat tacacaaaaa
61 ttctaccata ttttgaaagc ttaggttgtt ctgaaaaaaa caatatattg ttttcctggg
121 taaactaaaa gtcccctcga ggaaaggccc ctaaagtgaa acagtgcaaa acgttcaaaa
181 actgtctggc aatacaagtt ccactttggg acaaatcggc tggcagtgaa aggg (SEQ ID NO: 134) .
[0359] In some embodiments, the piggyBac or piggyBac-like transposon comprises a 5’ transposon end sequence selected from SEQ ID NO: 124 and SEQ ID NOs: 127-133. In some embodiments, the 5’ transposon end sequence is preceded by a 5’ target sequence. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tttgcatttt tagacattta gaagcctata tcttgttaca gaattggaat tacacaaaaa
61 ttctaccata ttttgaaagc ttaggttgtt ctgaaaaaaa caatatattg ttttcctggg
121 taaactaaaa gtcccctcga ggaaaggccc ctaaagtgaa acagtgcaaa acgttcaaaa
181 actgtctggc aatacaagtt ccactttgac caaaacggct ggcagtaaaa ggg (SEQ ID NO: 135) .
[0360] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttgttctgaa aaaaacaata tattgttttc ctgggtaaac taaaagtccc ctcgaggaaa
61 ggcccctaaa gtgaaacagt gcaaaacgtt caaaaactgt ctggcaatac aagttccact
121 ttgaccaaaa cggctggcag taaaaggg (SEQ ID NO: 136) .
[0361] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tttgcatttt tagacattta gaagcctata tcttgttaca gaattggaat tacacaaaaa
61 ttctaccata ttttgaaagc ttaggttgtt ctgaaaaaaa caatatattg ttttcctggg
121 taaactaaaa gtcccctcga ggaaaggccc ctaaagtgaa acagtgcaaa acgttcaaaa
181 actgtctggc aatacaagtt ccactttgac caaaacggct ggcagtaaaa gggttat
(SEQ ID NO: 137) .
[0362] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttgttctgaa aaaaacaata tattgttttc ctgggtaaac taaaagtccc ctcgaggaaa
61 ggcccctaaa gtgaaacagt gcaaaacgtt caaaaactgt ctggcaatac aagttccact
121 ttgggacaaa tcggctggca gtgaaaggg (SEQ ID NO: 138) .
[0363] In some embodiments, the piggyBac or piggyBac-like transposon comprises a 3’ transposon end sequence selected from SEQ ID NO: 125 and SEQ ID NOs: 135-138. In some embodiments, the 3’ transposon end sequence is followed by a 3’ target sequence. In some embodiments, the 5’ and 3’ transposon ends share a 14 bp repeated sequence inverted in orientation in the two ends (SEQ ID NO: 126) adjacent to the target sequence. In some embodiments, the piggyBac or piggyBac-like transposon comprises a 5’ transposon end comprising a target sequence and a sequence that is selected from SEQ ID NOs: 130-132 and 124, and a 3’ transposon end comprising a sequence selected from SEQ ID NOs: 136-138 and 125 followed by a 3’ target sequence.
[0364] In some embodiments, the 5’ transposon end of the piggyBac or piggyBac-like transposon comprises
1 atcacgcatg ggatacgtcg tggcagtaaa agggcttaaa tgccaacgac gcgtcccata 61 cgtt
(SEQ ID NO: 139), and an ITR. In some embodiments, the 5' transposon end comprises
1 atgacgcatg ggatacgtcg tggcagtaaa agggcttaaa tgccaacgac gcgtcccata
61 cgttgttggc attttaagtc tt
(SEQ ID NO: 140) and an ITR. In some embodiments, the 3' transposon end of the piggyBac or piggyBac-like transposon comprises
1 cctgggtaaa ctaaaagtcc cctcgaggaa aggcccctaa agtgaaacag tgcaaaacgt 61 tcaaaaactg tctggcaata caagttccac tttgggacaa atcggc
(SEQ ID NO: 141) and an ITR. In some embodiments, the 3' transposon end comprises
1 ttgttctgaa aaaaacaata tattgttttc ctgggtaaac taaaagtccc ctcgaggaaa
61 ggcccctaaa gtgaaacagt gcaaaacgtt caaaaactgt ctggcaatac aagttccact
121 ttgaccaaaa cggc
(SEQ ID NO: 142) and an ITR.
[0365] In some embodiments, one transposon end comprises a sequence that is at least 90%, at least 95%, at least 99% or any percentage in between identical to SEQ ID NO: 124 and the other transposon end comprises a sequence that is at least 90%, at least 95%, at least 99% or any percentage in between identical to SEQ ID NO: 125. In some embodiments, one transposon end comprises at least 14, at least 16, at least 18, at least 20 or at least 25 contiguous nucleotides from SEQ ID NO: 124 and one transposon end comprises at least 14, at least 16, at least 18, at least 20 or at least 25 contiguous nucleotides from SEQ ID NO: 125. In some embodiments, one transposon end comprises at least 14, at least 16, at least 18 or at least 20 contiguous nucleotides from SEQ ID NO: 139, and the other end comprises at least 14, at least 16, at least 18 or at least 20 contiguous nucleotides from SEQ ID NO: 141. In some embodiments, each transposon end comprises SEQ ID NO: 126 in inverted orientations.
[0366] In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence selected from SEQ ID NO: 124, SEQ ID NO: 127, SEQ ID NO: 129, SEQ ID NO: 130, SEQ ID NO: 131, and SEQ ID NO: 136, and a sequence selected from SEQ ID NO: 135, SEQ ID NO: 136, SEQ ID NO: 137 and SEQ ID NO: 134, and the piggyBac or piggyBac-like transposase comprises SEQ ID NO: 99 or SEQ ID NO: 98.
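The percent-identity thresholds recited above depend on how identity is computed, which this passage does not specify. The sketch below assumes the simplest possible convention, an ungapped position-by-position comparison of two equal-length sequences; alignment-based definitions would handle insertions and deletions differently. The function name `ungapped_percent_identity` is hypothetical, and the example compares only the first 20 nucleotides of SEQ ID NO: 131 and SEQ ID NO: 132 as printed above.

```python
def ungapped_percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two equal-length sequences (no gaps)."""
    if len(a) != len(b):
        raise ValueError("ungapped comparison requires equal-length sequences")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a.lower(), b.lower()))
    return 100.0 * matches / len(a)


# First 20 nt of SEQ ID NO: 131 and SEQ ID NO: 132 (two 5' transposon end variants).
HEAD_131 = "cctttttactgccaatgacg"
HEAD_132 = "ccctttgcctgccaatcacg"

identity = ungapped_percent_identity(HEAD_131, HEAD_132)
print(f"{identity:.1f}% identical over {len(HEAD_131)} positions")
print(identity >= 90.0)  # whether this window alone would meet a 90% threshold
```

Because the two variants differ mainly near their 5’ termini, identity over the full-length sequences is higher than over this 20-nucleotide window.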
[0367] In some embodiments, the piggyBac or piggyBac-like transposon comprises ITRs of CCCTTTGCCTGCCA (SEQ ID NO: 216) (5’ ITR) and TGGCAGTGAAAGGG (SEQ ID NO: 217) (3’ ITR) adjacent to the target sequences.
[0368] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Helicoverpa armigera. The piggyBac or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MASRQRLNHD EIATILENDD DYSPLDSESE KEDCWEDDV WSDNEDAIVD FVEDTSAQED 61 PDNNIASRES PNLEVTSLTS HRIITLPQRS IRGKNNHVWS TTKGRTTGRT SAINIIRTNR 121 GPTRMCRNIV DPLLCFQLFI TDEI IHEIVK WTNVEIIVKR QNLKDISASY RDTNTMEIWA 181 LVGILTLTAV MKDNHLSTDE LFDATFSGTR YVSVMSRERF EFLIRCIRMD DKTLRPTLRS 241 DDAFLPVRKI WEIFINQCRQ NHVPGSNLTV DEQLLGFRGR CPFRMYI PNK PDKYGIKFPM 301 MCAAATKYMI DAIPYLGKST KTNGLPLGEF YVKDLTKTVH GTNRNITCDN WFTSIPLAKN 361 MLQAPYNLTI VGTIRSNKRE MPEEIKNSRS RPVGSSMFCF DGPLTLVSYK PKPSKMVFLL 421 SSCDENAVIN ESNGKPDMIL FYNQTKGGVD SFDQMCKSMS ANRKTNRWPM AVFYGMLNMA 481 FVNSYIIYCH NKINKQEKPI SRKEFMKKLS IQLTTPWMQE RLQAPTLKRT LRDNITNVLK 541 NWPASSENI SNEPEPKKRR YCGVCSYKKR RMTKAQCCKC KKAICGEHNI DVCQDCI (SEQ ID NO: 143) .
[0369] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Helicoverpa armigera. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttaaccctag aagcccaatc tacgtaaatt tgacgtatac cgcggcgaaa tatctctgtc
61 tctttcatgt ttaccgtcgg atcgccgcta acttctgaac caactcagta gccattggga
121 cctcgcagga cacagttgcg tcatctcggt aagtgccgcc attttgttgt actctctatt
181 acaacacacg tcacgtcacg tcgttgcacg tcattttgac gtataattgg gctttgtgta
241 acttttgaat ttgtttcaaa ttttttatgt ttgtgattta tttgagttaa tcgtattgtt
301 tcgttacatt tttcatataa taataatatt ttcaggttga gtacaaa (SEQ ID NO: 144) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 agactgtttt tttctaagag acttctaaaa tattattacg agttgattta attttatgaa 61 aacatttaaa actagttgat tttttttata attacataat tttaagaaaa agtgttagag 121 gcttgatttt tttgttgatt ttttctaaga tttgattaaa gtgccataat agtattaata 181 aagagtattt tttaacttaa aatgtatttt atttattaat taaaacttca attatgataa 241 ctcatgcaaa aatatagttc attaacagaa aaaaatagga aaactttgaa gttttgtttt 301 tacacgtcat ttttacgtat gattgggctt tatagctagt taaatatgat tgggcttcta 361 gggttaa (SEQ ID NO: 145) .
[0370] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Pectinophora gossypiella. The piggyBac or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MDLRKQDEKI RQWLEQDIEE DSKGESDNSS SETEDIVEME VHKNTSSESE VSSESDYEPV
61 CPSKRQRTQI IESEESDNSE SIRPSRRQTS RVIDSDETDE DVMSSTPQNI PRNPNVIQPS
121 SRFLYGKNKH KWSSAAKPSS VRTSRRNI IH FIPGPKERAR EVSEPIDIFS LFISEDMLQQ
181 WTFTNAEML IRKNKYKTET FTVSPTNLEE IRALLGLLFN AAAMKSNHLP TRMLFNTHRS
241 GTI FKACMSA ERLNFLIKCL RFDDKLTRNV RQRDDRFAPI RDLWQALISN FQKWYTPGSY
301 ITVDEQLVGF RGRCSFRMYI PNKPNKYGIK LVMAADVNSK YIVNAIPYLG KGTDPQNQPL
361 ATFFIKEITS TLHGTNRNIT MDNWFTSVPL ANELLMAPYN LTLVGTLRSN KREI PEKLKN
421 SKSRAIGTSM FCYDGDKTLV SYKAKSNKW FILSTIHDQP DINQETGKPE MIHFYNSTKG
481 AVDTVDQMCS SI STNRKTQR WPLCVFYNML NLSIINAYW YVYNNVRNNK KPMSRRDFVI
541 KLGDQLMEPW LRQRLQTVTL RRDIKVMIQD ILGESSDLEA PVPSVSNVRK IYYLCPSKAR
601 RMTKHRCIKC KQAICGPHNI DICSRCIE (SEQ ID NO: 146) .
[0371] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Pectinophora gossypiella. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttaaccctag ataactaaac attcgtccgc tcgacgacgc gctatgccgc gaaattgaag
61 tttacctatt attccgcgtc ccccgccccc gccgcttttt ctagcttcct gatttgcaaa
121 atagtgcatc gcgtgacacg ctcgaggtca cacgacaatt aggtcgaaag ttacaggaat
181 ttcgtcgtcc gctcgacgaa agtttagtaa ttacgtaagt ttggcaaagg taagtgaatg
241 aagtattttt ttataattat tttttaattc tttatagtga taacgtaagg tttatttaaa
301 tttattactt ttatagttat ttagccaatt gttataaatt ccttgttatt gctgaaaaat
361 ttgcctgttt tagtcaaaat ttattaactt ttcgatcgtt ttttag (SEQ ID NO: 147) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tttcactaag taattttgtt cctatttagt agataagtaa cacataatta ttgtgatatt
61 caaaacttaa gaggtttaat aaataataat aaaaaaaaaa tggtttttat ttcgtagtct
121 gctcgacgaa tgtttagtta ttacgtaacc gtgaatatag tttagtagtc tagggttaa
(SEQ ID NO: 148) .
[0372] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Ctenoplusia agnata. The piggyBac or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MASRQHLYQD EIAAILENED DYSPHDTDSE MEDCVTQDDV RSDVEDEMVD NIGNGTSPAS 61 RHEDPETPDP SSEASNLEVT LSSHRIIILP QRSIREKNNH IWSTTKGQSS GRTAAINIVR 121 TNRGPTRMCR NIVDPLLCFQ LFIKEEIVEE IVKWTNVEMV QKRVNLKDIS ASYRDTNEME 181 IWAIISMLTL SAVMKDNHLS TDELFNVSYG TRYVSVMSRE RFEFLLRLLR MGDKLLRPNL 241 RQEDAFTPVR KIWEIFINQC RLNYVPGTNL TVDEQLLGFR GRCPFRMYIP NKPDKYGIKF 301 PMVCDAATKY MVDAI PYLGK STKTQGLPLG EFYVKELTQT VHGTNRNVTC DNWFTSVPLA 361 KSLLNSPYNL TLVGTIRSNK REIPEEVKNS RSRQVGSSMF CFDGPLTLVS YKPKPSKMVF 421 LLSSCNEDAV VNQSNGKPDM ILFYNQTKGG VDSFDQMCSS MSTNRKTNRW PMAVFYGMLN 481 MAFVNSYIIY CHNMLAKKEK PLSRKDFMKK LSTDLTTPSM QKRLEAPTLK RSLRDNITNV 541 LKIVPQAAI D TSFDEPEPKK RRYCGFCSYK KKRMTKTQCF KCKKPVCGEH NIDVCQDCI (SEQ ID NO: 149) .
[0373] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Ctenoplusia agnata. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttaaccctag aagcccaatc tacgtcattc tgacgtgtat gtcgccgaaa atactctgtc
61 tctttctcct gcacgatcgg attgccgcga acgctcgatt caacccagtt ggcgccgaga
121 tctattggag gactgcggcg ttgattcggt aagtcccgcc attttgtcat agtaacagta
181 ttgcacgtca gcttgacgta tatttgggct ttgtgttatt tttgtaaatt ttcaacgtta
241 gtttattatt gcatcttttt gttacattac tggtttattt gcatgtatta ctcaaatatt
301 atttttattt tagcgtagaa aataca (SEQ ID NO: 150) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 agactgtttt ttttgtattt gcattatata ttatattcta aagttgattt aattctaaga
61 aaaacattaa aataagtttc tttttgtaaa atttaattaa ttataagaaa aagtttaagt
121 tgatctcatt ttttataaaa atttgcaatg tttccaaagt tattattgta aaagaataaa
181 taaaagtaaa ctgagtttta attgatgttt tattatatca ttatactata tattacttaa
241 ataaaacaat aactgaatgt atttctaaaa ggaatcacta gaaaatatag tgatcaaaaa
301 tttacacgtc atttttgcgt atgattgggc tttataggtt ctaaaaatat gattgggcct
361 ctagggttaa (SEQ ID NO: 151).
In some embodiments, the piggyBac or piggyBac-like transposon comprises an ITR sequence of CCCTAGAAGCCCAATC (SEQ ID NO: 152).
[0374] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Agrotis ipsilon. The piggyBac (PB) or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%,
70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MESRQRLNQD EIATI LENDD DYSPLDSDSE AEDRWEDDV WSDNEDAMID YVEDTSRQED 61 PDNNIASQES ANLEVTSLTS HRIISLPQRS ICGKNNHVWS TTKGRTTGRT SAINIIRTNR 121 GPTRMCRNIV DPLLCFQLFI TDEI IHEIVK WTNVEMIVKR QNLIDISASY RDTNTMEMWA 181 LVGILTLTAV MKDNHLSTDE LFDATFSGTR YVSVMSRERF EFLIRCMRMD DKTLRPTLRS 241 DDAFI PVRKL WEI FINQCRL NYVPGGNLTV DEQLLGFRGR CPFRMYIPNK PDKYGIRFPM 301 MCDAATKYMI DAI PYLGKST KTNGLPLGEF YVKELTKTVH GTNRNVTCDN WFTSI PLAKN 361 MLQAPYNLTI VGTIRSNKRE IPEEIKNSRS RPVGSSMFCF DGPLTLVSYK PKPSRMVFLL 421 SSCDENAVIN ESNGKPDMIL FYNQTKGGVD SFDQMCKSMS ANRKTNRWPM AVFYGMLNMA 481 FVNSYIIYCH NKINKQKKPI NRKEFMKNLS TDLTTPWMQE RLKAPTLKRT LRDNITNVLK 541 NWPPSPANN SEEPGPKKRS YCGFCSYKKR RMTKTQFYKC KKAI CGEHNI DVCQDCV (SEQ ID NO: 153) .
[0375] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Agrotis ipsilon. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of: 1 ttaaccctag aagcccaatc tacgtaaatt tgacgtatac cgcggcgaaa tatatctgtc 61 tctttcacgt ttaccgtcgg attcccgcta acttcggaac caactcagta gccattgaga 121 actcccagga cacagttgcg tcatctcggt aagtgccgcc attttgttgt aatagacagg 181 ttgcacgtca ttttgacgta taattgggct ttgtgtaact tttgaaatta tttataattt 241 ttattgatgt gatttatttg agttaatcgt attgtttcgt tacatttttc atatgatatt 301 aatattttca gattgaatat aaa (SEQ ID NO: 154 ) . In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 agactgtttt ttttaaaagg cttataaagt attactattg cgtgatttaa ttttataaaa
61 atatttaaaa ccagttgatt tttttaataa ttacctaatt ttaagaaaaa atgttagaag
121 cttgatattt ttgttgattt ttttctaaga tttgattaaa aggccataat tgtattaata
181 aagagtattt ttaacttcaa atttatttta tttattaatt aaaacttcaa ttatgataat
241 acatgcaaaa atatagttca tcaacagaaa aatataggaa aactctaata gttttatttt
301 tacacgtcat ttttacgtat gattgggctt tatagctagt caaatatgat tgggcttcta
361 gggttaa (SEQ ID NO: 155) .
[0376] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Megachile rotundata. The piggyBac (PB) or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MNGKDSLGEF YLDDLSDCLD CRSASSTDDE SDSSNIAIRK RCPIPLIYSD SEDEDMNNNV
61 EDNNHFVKES NRYHYQIVEK YKITSKTKKW KDVTVTEMKK FLGLI ILMGQ VKKDVLYDYW
121 STDPSIETPF FSKVMSRNRF LQIMQSWHFY NNNDI SPNSH RLVKIQPVID YFKEKFNNVY
181 KSDQQLSLDE CLIPWRGRLS IKTYNPAKIT KYGILVRVLS EARTGYVSNF CVYAADGKKI
241 EETVLSVIGP YKNMWHHVYQ DNYYNSVNIA KIFLKNKLRV CGTIRKNRSL PQILQTVKLS
301 RGQHQFLRNG HTLLEVWNNG KRNVNMISTI HSAQMAESRN RSRTSDCPIQ KPISIIDYNK
361 YMKGVDRADQ YLSYYSIFRK TKKWTKRWM FFINCALFNS FKVYTTLNGQ KITYKNFLHK
421 AALSLIEDCG TEEQGTDLPN SEPTTTRTTS RVDHPGRLEN FGKHKLVNIV TSGQCKKPLR
481 QCRVCASKKK LSRTGFACKY CNVPLHKGDC FERYHSLKKY (SEQ ID NO: 156) .
[0377] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Megachile rotundata. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttaaataatg cccactctag atgaacttaa cactttaccg accggccgtc gattattcga
61 cgtttgctcc ccagcgctta ccgaccggcc atcgattatt cgacgtttgc ttcccagcgc
121 ttaccgaccg gtcatcgact tttgatcttt ccgttagatt tggttaggtc agattgacaa
181 gtagcaagca tttcgcattc tttattcaaa taatcggtgc ttttttctaa gctttagccc
241 ttagaa (SEQ ID NO: 157).
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 acaacttctt ttttcaacaa atattgttat atggattatt tatttattta tttatttatg
61 gtatatttta tgtttattta tttatggtta ttatggtata ttttatgtaa ataataaact
121 gaaaacgatt gtaatagatg aaataaatat tgttttaaca ctaatataat taaagtaaaa
181 gattttaata aatttcgtta ccctacaata acacgaagcg tacaatttta ccagagttta
241 ttaa (SEQ ID NO: 158) .
[0378] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Bombus impatiens. The piggyBac (PB) or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MNEKNGIGEF YLDDLSDCPD SYSRSNSGDE SDGSDTI IRK RGSVLPPRYS DSEDDEINNV
61 EDNANNVENN DDIWSTNDEA IILEPFEGSP GLKIMPSSAE SVTDNVNLFF GDDFFEHLVR
121 ESNRYHYQVM EKYKIPSKAK KWTDITVPEM KKFLGLIVLM GQIKKDVLYD YWSTDPSIET
181 PFFSQVMSRN RFVQIMQSWH FCNNDNIPHD SHRLAKIQPV IDYFRRKFND VYKPCQQLSL
241 DESIIPWRGR LSIKTYNPAK ITKYGILVRV LSEAVTGYVC NFDVYAADGK KLEDTAVIEP
301 YKNIWHQIYQ DNYYNSVKMA RILLKNKVRV CGTIRKNRGL PRSLKTIQLS RGQYEFRRNH
361 QILLEVWNNG RRNVNMISTI HSAQLMESRS KSKRSDVPIQ KPNSIIDYNK YMKGVDRADQ
421 YLAYYSI FRK TKKWTKRWM FFINCALFNS FRVYTILNGK NITYKNFLHK VAVSWIEDGE
481 TNCTEQDDNL PNSEPTRRAP RLDHPGRLSN YGKHKLINIV TSGRSLKPQR QCRVCAVQKK
541 RSRTCFVCKF CNVPLHKGDC FERYHTLKKY (SEQ ID NO: 159) .
[0379] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Bombus impatiens. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttaatttttt aacattttac cgaccgatag ccgattaatc gggtttttgc cgctgacgct
61 taccgaccga taacctatta atcggctttt tgtcgtcgaa gcttaccaac ctatagccta
121 cctatagtta atcggttgcc atggcgataa acaatctttc tcattatatg agcagtaatt
181 tgttatttag tactaaggta ccttgctcag ttgcgtcagt tgcgttgctt tgtaagctcc
241 cacagtttta taccaattcg aaaaacttac cgttcgcg (SEQ ID NO: 160) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 actatttcac atttgaacta aaaaccgttg taatagataa aataaatata atttagtatt 61 aatattatgg aaacaaaaga ttttattcaa tttaattatc ctatagtaac aaaaagcggc
121 caattttatc tgagcatacg aaaagcacag atactcccgc ccgacagtct aaaccgaaac
181 agagccggcg ccagggagaa tctgcgcctg agcagccggt cggacgtgcg tttgctgttg
241 aaccgctagt ggtcagtaaa ccagaaccag tcagtaagcc agtaactgat cagttaacta
301 gattgtatag ttcaaattga acttaatcta gtttttaagc gtttgaatgt tgtctaactt
361 cgttatatat tatattcttt ttaa (SEQ ID NO: 161) .
[0380] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Mamestra brassicae. The piggyBac (PB) or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MFSFVPNKEQ TRTVLIFCFH LKTTAAESHR PLVEAFGEQV PTVKTCERWF QRFKSGDFDV
61 DDKEHGKPPK RYEDAELQAL LDEDDAQTQK QIAEQLEVSQ QAVSNRLREG GKIQKVGRWV
121 PHELNERQRE RRKNTCEILL SRYKRKSFLH RIVTGEEKWI FFVNPKRKKS YVDPGQPATS
181 TARPNRFGKK TRLCVWWDQS GVIYYELLKP GETVNTARYQ QQLINLNRAL QRKRPEYQKR
241 QHRVIFLHDN APSHTARAVR DTLETLNWEV LPHAAYSPDL APSDYHLFAS MGHAIAEQRF
301 DSYESVEEWL DEWFAAKDDE FYWRGIHKLP ERWDNCVASD GKYFE (SEQ ID NO: 162).
[0381] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Mamestra brassicae. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttattgggtt gcccaaaaag taattgcgga tttttcatat acctgtcttt taaacgtaca
61 tagggatcga actcagtaaa actttgacct tgtgaaataa caaacttgac tgtccaacca
121 ccatagtttg gcgcgaattg agcgtcataa ttgttttgac tttttgcagt caac (SEQ ID NO: 163).
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 atgatttttt ctttttaaac caattttaat tagttaattg atataaaaat ccgcaattac
61 tttttgggca acccaataa (SEQ ID NO: 164).
[0382] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Mayetiola destructor. The piggyBac (PB) or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MENFENWRKR RHLREVLLGH FFAKKTAAES HRLLVEVYGE HALAKTQCFE WFQRFKSGDF
61 DTEDKERPGQ PKKFEDEELE ALLDEDCCQT QEELAKSLGV TQQAISKRLK AAGYIQKQGN
121 WVPHELKPRD VERRFCMSEM LLQRHKKKSF LSRIITGDEK WIHYDNSKRK KSYVKRGGRA
181 KSTPKSNLHG AKVMLCIWWD QRGVLYYELL EPGQTITGDL YRTQLIRLKQ ALAEKRPEYA
241 KRHGAVIFHH DNARPHVALP VKNYLENSGW EVLPHPPYSP DLAPSDYHLF RSMQNDLAGK
301 RFTSEQGIRK WLDSFLAAKP AKFFEKGIHE LSERWEKVIA SDGQYFE (SEQ ID NO: 165).
[0383] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Mayetiola destructor. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 taagacttcc aaaatttcca cccgaacttt accttccccg cgcattatgt ctctcttttc
61 accctctgat ccctggtatt gttgtcgagc acgatttata ttgggtgtac aacttaaaaa
121 ccggaattgg acgctagatg tccacactaa cgaatagtgt aaaagcacaa atttcatata
181 tacgtcattt tgaaggtaca tttgacagct atcaaaatca gtcaataaaa ctattctatc
241 tgtgtgcatc atattttttt attaact (SEQ ID NO: 166) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tgcattcatt cattttgtta tcgaaataaa gcattaattt tcactaaaaa attccggttt
61 ttaagttgta cacccaatat catccttagt gacaattttc aaatggcttt cccattgagc
121 tgaaaccgtg gctctagtaa gaaaaacgcc caacccgtca tcatatgcct tttttttctc
181 aacatccg (SEQ ID NO: 167).
[0384] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Apis mellifera. The piggyBac (PB) or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MENQKEHYRH ILLFYFRKGK NASQAHKKLC AVYGDEALKE RQCQNWFDKF RSGDFSLKDE
61 KRSGRPVEVD DDLIKAIIDS DRHSTTREIA EKLHVSHTCI ENHLKQLGYV QKLDTWVPHE
121 LKEKHLTQRI NSCDLLKKRN ENDPFLKRLI TGDEKWWYN NIKRKRSWSR PREPAQTTSK
181 AGIHRKKVLL SVWWDYKGIV YFELLPPNRT INSWYIEQL TKLNNAVEEK RPELTNRKGV
241 VFHHDNARPH TSLVTRQKLL ELGWDVLPHP PYSPDIAPSD YFLFRSLQNS LNGKNFNNDD
301 DIKSYLIQFF ANKNQKFYER GIMMLPERWQ KVIDQNGQHI TE (SEQ ID NO: 168) .
[0385] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Apis mellifera. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttgggttggc aactaagtaa ttgcggattt cactcataga tggcttcagt tgaattttta
61 ggtttgctgg cgtagtccaa atgtaaaaca cattttgtta tttgatagtt ggcaattcag
121 ctgtcaatca gtaaaaaaag ttttttgatc ggttgcgtag ttttcgtttg gcgttcgttg
181 aaaa (SEQ ID NO: 169) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 agttatttag ttccatgaaa aaattgtctt tgattttcta aaaaaaatcc gcaattactt
61 agttgccaat ccaa (SEQ ID NO: 170).
[0386] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Messor bouvieri. The piggyBac (PB) or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MSSFVPENVH LRHALLFLFH QKKRAAESHR LLVETYGEHA PTIRTCETWF RQFKCGDFNV
61 QDKERPGRPK TFEDAELQEL LDEDSTQTQK QLAEKLNVSR VAICERLQAM GKIQKMGRWV
121 PHELNDRQME NRKIVSEMLL QRYERKSFLH RIVTGDEKWI YFENPKRKKS WLSPGEAGPS
181 TARPNRFGRK TMLCVWWDQI GWYYELLKP GETVNTDRYR QQMINLNCAL IEKRPQYAQR
241 HDKVILQHDN APSHTAKPVK EMLKSLGWEV LSHPPYSPDL APSDYHLFAS MGHALAEQHF
301 ADFEEVKKWL DEWFSSKEKL FFWNGIHKLS ERWTKCIESN GQYFE (SEQ ID NO: 171).
[0387] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Messor bouvieri. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 agtcagaaat gacacctcga tcgacgacta atcgacgtct aatcgacgtc gattttatgt
61 caacatgtta ccaggtgtgt cggtaattcc tttccggttt ttccggcaga tgtcactagc
121 cataagtatg aaatgttatg atttgataca tatgtcattt tattctactg acattaacct
181 taaaactaca caagttacgt tccgccaaaa taacagcgtt atagatttat aattttttga
241 aa (SEQ ID NO: 172) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ataaatttga actatccatt ctaagtaacg tgttttcttt aacgaaaaaa ccggaaaaga
61 attaccgaca ctcctggtat gtcaacatgt tattttcgac attgaatcgc gtcgattcga
121 agtcgatcga ggtgtcattt ctgact (SEQ ID NO: 173).
[0388] In certain embodiments of the methods of the disclosure, the transposase enzyme is a piggyBac or piggyBac-like transposase enzyme. In some embodiments, the piggyBac or piggyBac-like transposase enzyme is isolated or derived from Trichoplusia ni. The piggyBac (PB) or piggyBac-like transposase enzyme may comprise or consist of an amino acid sequence at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or any percentage in between identical to:
1 MGSSLDDEHI LSALLQSDDE LVGEDSDSEV SDHVSEDDVQ SDTEEAFIDE VHEVQPTSSG
61 SEILDEQNVI EQPGSSLASN RILTLPQRTI RGKNKHCWST SKSTRRSRVS ALNIVRSQRG
121 PTRMCRNIYD PLLCFKLFFT DEIISEIVKW TNAEISLKRR ESMTSATFRD TNEDEIYAFF
181 GILVMTAVRK DNHMSTDDLF DRSLSMVYVS VMSRDRFDFL IRCLRMDDKS IRPTLRENDV
241 FTPVRKIWDL FIHQCIQNYT PGAHLTIDEQ LLGFRGRCPF RVYIPNKPSK YGIKILMMCD
301 SGTKYMINGM PYLGRGTQTN GVPLGEYYVK ELSKPVHGSC RNITCDNWFT SIPLAKNLLQ
361 EPYKLTIVGT VRSNKREIPE VLKNSRSRPV GTSMFCFDGP LTLVSYKPKP AKMVYLLSSC
421 DEDASINEST GKPQMVMYYN QTKGGVDTLD QMCSVMTCSR KTNRWPMALL YGMINIACIN
481 SFIIYSHNVS SKGEKVQSRK KFMRNLYMSL TSSFMRKRLE APTLKRYLRD NISNILPKEV
541 PGTSDDSTEE PVMKKRTYCT YCPSKIRRKA NASCKKCKKV ICREHNIDMC QSCF (SEQ ID NO: 174).
[0389] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Trichoplusia ni. In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ttaaccctag aaagatagtc tgcgtaaaat tgacgcatgc attcttgaaa tattgctctc
61 tctttctaaa tagcgcgaat ccgtcgctgt gcatttagga catctcagtc gccgcttgga
121 gctcccgtga ggcgtgcttg tcaatgcggt aagtgtcact gattttgaac tataacgacc
181 gcgtgagtca aaatgacgca tgattatctt ttacgtgact tttaagattt aactcatacg
241 ataattatat tgttatttca tgttctactt acgtgataac ttattatata tatattttct
301 tgttatagat atc (SEQ ID NO: 175) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tttgttactt tatagaagaa attttgagtt tttgtttttt tttaataaat aaataaacat
61 aaataaattg tttgttgaat ttattattag tatgtaagtg taaatataat aaaacttaat
121 atctattcaa attaataaat aaacctcgat atacagaccg ataaaacaca tgcgtcaatt
181 ttacgcatga ttatctttaa cgtacgtcac aatatgatta tctttctagg gttaa (SEQ ID NO: 176).
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 ccctagaaag atagtctgcg taaaattgac gcatgcattc ttgaaatatt gctctctctt
61 tctaaatagc gcgaatccgt cgctgtgcat ttaggacatc tcagtcgccg cttggagctc
121 ccgtgaggcg tgcttgtcaa tgcggtaagt gtcactgatt ttgaactata acgaccgcgt
181 gagtcaaaat gacgcatgat tatcttttac gtgactttta agatttaact catacgataa
241 ttatattgtt atttcatgtt ctacttacgt gataacttat tatatatata ttttcttgtt
301 atagatatc (SEQ ID NO: 177) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tttgttactt tatagaagaa attttgagtt tttgtttttt tttaataaat aaataaacat
61 aaataaattg tttgttgaat ttattattag tatgtaagtg taaatataat aaaacttaat
121 atctattcaa attaataaat aaacctcgat atacagaccg ataaaacaca tgcgtcaatt
181 ttacgcatga ttatctttaa cgtacgtcac aatatgatta tctttctagg g (SEQ ID NO: 178).
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tctaaatagc gcgaatccgt cgctgtgcat ttaggacatc tcagtcgccg cttggagctc
61 ccgtgaggcg tgcttgtcaa tgcggtaagt gtcactgatt ttgaactata acgaccgcgt
121 gagtcaaaat gacgcatgat tatcttttac gtgactttta agatttaact catacgataa
181 ttatattgtt atttcatgtt ctacttacgt gataacttat tatatatata ttttcttgtt
241 atagatatc (SEQ ID NO: 179) .
In some embodiments, the piggyBac or piggyBac-like transposon comprises a sequence of:
1 tttgttactt tatagaagaa attttgagtt tttgtttttt tttaataaat aaataaacat
61 aaataaattg tttgttgaat ttattattag tatgtaagtg taaatataat aaaacttaat
121 atctattcaa attaataaat aaacctcgat atacagaccg ataaaacaca tgcgtcaatt
181 ttacgcatga ttatctttaa cgtacgtcac aatatgatta tctttctagg g (SEQ ID NO: 180).
[0390] In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 177 and SEQ ID NO: 178, and the piggyBac or piggyBac-like transposase comprises SEQ ID NO: 174. In some embodiments, the piggyBac or piggyBac-like transposon comprises SEQ ID NO: 179 and SEQ ID NO: 180, and the piggyBac or piggyBac-like transposase comprises SEQ ID NO: 174.
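By way of illustration only, the relationship between paired transposon ends can be checked computationally. The following is a minimal sketch, assuming that the opening bases of SEQ ID NO: 177 and the closing bases of SEQ ID NO: 180 are compared over a 13-base window; the window length and the helper names `reverse_complement` and `ends_are_inverted_repeats` are hypothetical choices and are not part of the disclosure.

```python
# Translation table for complementing DNA bases (case preserved).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def ends_are_inverted_repeats(left_end: str, right_end: str, n: int) -> bool:
    """True if the first n bases of left_end equal the reverse complement
    of the last n bases of right_end (spaces from the listing are ignored)."""
    left = left_end.replace(" ", "").lower()
    right = right_end.replace(" ", "").lower()
    return left[:n] == reverse_complement(right[-n:])


# Opening bases of SEQ ID NO: 177 and closing bases of SEQ ID NO: 180,
# copied from the listings above; the 13-base window is an assumption.
left_start = "ccctagaaag atagtctgcg"
right_stop = "aatatgatta tctttctagg g"
repeat_found = ends_are_inverted_repeats(left_start, right_stop, 13)
```

A check of this kind only confirms sequence symmetry at the termini; it says nothing about whether a given transposase recognizes those ends.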
[0391] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Aphis gossypii. In some embodiments, the piggyBac or piggyBac-like transposon comprises an ITR sequence of CCTTCCAGCGGGCGCGC (SEQ ID NO: 181).
[0392] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Chilo suppressalis. In some embodiments, the piggyBac or piggyBac-like transposon comprises an ITR sequence of CCCAGATTAGCCT (SEQ ID NO: 182).
[0393] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Heliothis virescens. In some embodiments, the piggyBac or piggyBac-like transposon comprises an ITR sequence of CCCTTAATTACTCGCG (SEQ ID NO: 183).
[0394] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Pectinophora gossypiella. In some embodiments, the piggyBac or piggyBac-like transposon comprises an ITR sequence of CCCTAGATAACTAAAC (SEQ ID NO: 184).
[0395] In some embodiments, the piggyBac or piggyBac-like transposon is isolated or derived from Anopheles stephensi. In some embodiments, the piggyBac or piggyBac-like transposon comprises an ITR sequence of CCCTAGAAAGATA (SEQ ID NO: 185).
EXAMPLES
[0396] Different micelle formulations may be used to compare the properties between resulting particles. For example, poly(histidine)-containing triblock copolymers are used to form micelles that complex with negatively charged particles, including nucleic acids and some proteins.
[0397] Details about comparative and quantitative studies that were performed are provided below.
Experimental Procedures
[0398] Formation and Characterization of Polymeric Micelles. PEO-b-PLA-b-PHIS micelles were prepared by a thin-film rehydration method.
[0399] In overview, 20 mg of polymer was dissolved in 1 mL of dichloromethane (DCM). The organic solvent was evaporated under a stream of nitrogen gas to form a polymer thin film. The thin film was rehydrated in PBS, and particles were formed by sonication in an ice-water bath for 30 minutes at 30 kHz. Particle sizes and zeta potentials were measured using a Delsa Nano Submicron Particle Size and Zeta Potential Analyzer (Beckman Coulter).
[0400] Complexation of protein or DNA with Polymeric Micelles: Suspensions of polymeric micelles were diluted with Opti-MEM® media (Invitrogen) to different concentrations, varying the final number of amino groups in solution. Equal-volume solutions containing protein or DNA were then added to the micelles. For DNA, the primary parameter that was varied was the initial ratio of free amines to phosphates (N/P) in suspension, which ranged from 5:1 to 40:1. Protein- or DNA-complexed micelles were then formed by gentle pipetting and allowed to equilibrate for 30 min at room temperature (RT). To determine the maximal loading of protein or DNA, the efficiency of micellar complexation, and the rates of release in solutions of different pH, the micro-BSA assay (for protein concentrations), a fluorescence standard curve (for fluorophore-conjugated protein), and/or ICP-MS (to determine the amount of platinum-bound DNA in solution) was used.
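By way of illustration only, the amine-to-phosphate ratio described above can be converted into the amount of free amine needed for a given mass of DNA. The following is a minimal sketch, assuming one phosphate per nucleotide and an average nucleotide mass of about 330 g/mol; that constant, the 1 µg DNA mass, and the function names are assumptions made for the example and are not taken from the disclosure.

```python
# Assumed average mass of one nucleotide (one phosphate each); not from the disclosure.
AVG_NUCLEOTIDE_MASS_G_PER_MOL = 330.0


def phosphate_nmol(dna_mass_ug: float) -> float:
    """Approximate nmol of phosphate groups in a given mass of DNA (in µg)."""
    if dna_mass_ug < 0:
        raise ValueError("DNA mass must be non-negative")
    # µg / (g/mol) = µmol; multiply by 1000 to express in nmol.
    return dna_mass_ug * 1000.0 / AVG_NUCLEOTIDE_MASS_G_PER_MOL


def amine_nmol_for_ratio(dna_mass_ug: float, amine_to_phosphate: float) -> float:
    """nmol of free amine needed to reach the requested amine:phosphate ratio."""
    if amine_to_phosphate <= 0:
        raise ValueError("ratio must be positive")
    return amine_to_phosphate * phosphate_nmol(dna_mass_ug)


# Ratios spanning the 5:1 to 40:1 range for a hypothetical 1 µg of DNA.
amines_needed = {ratio: amine_nmol_for_ratio(1.0, ratio) for ratio in (5, 10, 20, 40)}
```

Converting the amine amount into a polymer mass would additionally require the number of free amines per polymer chain, which depends on the poly(histidine) block length.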
Example 1: Diblock copolymer micelle model
[0401] In a first copolymer micelle model, micelles were created using the diblock copolymer PEO-b-PLA. Various time durations for polymerizing the PLA block were tested in combination with different techniques for forming the micelles (i.e., "test combinations"). The PLA polymerization times, micelle formation techniques, and mean diameters of the resulting nanoparticles are shown in FIG. 1A. As shown, using the particular test combination of PLA polymerization for 6 hours (25 PLA units) and sonication of the copolymers in phosphate-buffered saline (PBS), the mean diameter of the resulting micelles was 247 nm.
[0402] As also shown, increasing the PLA polymerization time resulted in larger mean diameters of the resulting nanoparticles. FIG. 1B is a graph showing the size distribution for the PEO-b-PLA micelles generated using the same test combination (i.e., 6 hours PLA polymerization and sonication in PBS). FIG. 1C is a graph showing the z-potential distribution of the PEO-b-PLA micelles generated using the same test combination (i.e., 6 hours PLA polymerization and sonication in PBS). As demonstrated, the z-potential of the tested PEO-b-PLA micelles is about -12.20 mV.
[0403] The block copolymer micelles in the various embodiments may also encapsulate water-insoluble molecules in the hydrophobic block. This capability was shown by encapsulating a lipophilic carbocyanine fluorescent dye (DIL dye) in the hydrophobic portion (i.e., PLA) of the PEO-b-PLA micelles described with respect to FIGS. 1B and 1C (i.e., prepared by 6 hours PLA polymerization and sonication in PBS). FIG. 2 is a graph showing the absorbance of light at a wavelength of 560 nm by the micelles with different concentrations of the DIL dye in solution. In particular, the graph may be used to quantify how much DIL dye can be bound to the hydrophobic portion of the micelles. Specifically, it was found that 1 mg of the PEO-b-PLA micelles was able to load around 4 mM of the DIL dye.
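By way of illustration only, quantifying dye loading from absorbance at 560 nm can be sketched as a linear standard curve that is fitted to known dye concentrations and then inverted for an unknown sample. The calibration points below are hypothetical, the linear (Beer-Lambert-like) relationship is an assumption, and the function names are invented for the example; none of these values come from FIG. 2.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_absorbance(absorbance, slope, intercept):
    """Invert the standard curve to estimate concentration from absorbance."""
    if slope == 0:
        raise ValueError("standard curve has zero slope")
    return (absorbance - intercept) / slope


# Hypothetical calibration: dye concentration (arbitrary units) vs. A560.
standards_conc = [0.0, 1.0, 2.0, 4.0]
standards_a560 = [0.02, 0.12, 0.22, 0.42]
slope, intercept = fit_line(standards_conc, standards_a560)
unknown_conc = concentration_from_absorbance(0.32, slope, intercept)
```

In practice the fit is only trustworthy within the concentration range covered by the standards, and light scattering by the micelles themselves would need to be subtracted with a dye-free blank.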
[0404] In various embodiments, creating a triblock copolymer for use in micelle formation involves attaching a poly(histidine) block to a diblock copolymer that has a hydrophobic and a hydrophilic block. The poly(histidine) block may be attached to the hydrophobic block, such that the resulting polymer contains the hydrophobic block in between a hydrophilic block and the poly(histidine) block.
[0405] In an embodiment, poly(histidine)-based micelles were created using the triblock copolymer PEO-b-PLA-b-PHIS. As described above with respect to FIGS. 1B and 2, creating the PEO-b-PLA portion of the copolymers involved PLA polymerization for 6 hours and sonication of the diblock copolymers in PBS. Various time durations for creating the poly(histidine) block and adding it to the PEO-b-PLA copolymer were used in combination with different techniques for forming the triblock copolymer micelles. The PHIS polymerization times, micelle formation techniques, and mean diameters of the resulting nanoparticles are shown in FIG. 3A. Using the particular combination of PHIS polymerization for 48 hours and thin film rehydration (TFR), in which the block copolymers were dissolved in dichloromethane (DCM) and the resulting film was rehydrated in PBS, the mean diameter of the resulting micelles was 248 nm. FIG. 3B is a graph showing the size distribution (around 248 nm in diameter) for the PEO-b-PLA-b-PHIS micelles generated using the same preparation parameters (i.e., 6 hours PLA polymerization, 48 hours PHIS polymerization, and TFR in DCM). FIG. 3C is a graph of the z-potential distribution of the PEO-b-PLA-b-PHIS micelles generated using the same preparation parameters (i.e., 6 hours PLA polymerization, 48 hours PHIS polymerization, and TFR in DCM). As demonstrated, the z-potential of the tested PEO-b-PLA-b-PHIS micelles is about -18 mV.
[0406] FIG. 4 is a chart showing the variation in properties of the PEO-b-PLA-b-PHIS micelles tested at different pH values. As shown, the micelles were smallest at a pH of around 7, with a mean diameter of around 316 nm. When the pH was substantially raised or lowered, the mean diameter increased. At lower pH, this increase is likely due to swelling of the micelles as the poly(histidine) chains gain positive charges and expand.
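By way of illustration only, the gain of positive charge on poly(histidine) at lower pH can be approximated with the Henderson-Hasselbalch relationship for the imidazole side chain. The following is a minimal sketch, assuming a single effective pKa of 6.5 for the poly(histidine) block; that value is a hypothetical placeholder, and the model ignores the polyelectrolyte effects that shift and broaden the titration of real polymer chains.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of ionizable groups carrying a proton at the given pH
    (Henderson-Hasselbalch, single pKa, no neighbor interactions)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical effective pKa for the imidazole groups; not from the disclosure.
ASSUMED_PKA = 6.5

# pH values that appear in the Examples: 4.6, 6.6, and about 7.
fractions = {ph: protonated_fraction(ph, ASSUMED_PKA) for ph in (4.6, 6.6, 7.0)}
```

Under this simplified model most imidazole groups are protonated at pH 4.6 and a minority at pH 7, which is qualitatively consistent with the swelling and release behavior described for acidic conditions.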
[0407] To demonstrate the capability of poly(histidine)-based micelles to complex with negatively charged proteins, bovine serum albumin (BSA) was added to a solution with PEO-b-PLA-b-PHIS micelles at a low pH (i.e., lower than 6.6). In this manner, the BSA may complex with positively charged PHIS blocks, thereby creating neutrally charged nanoparticle-protein complexes.
[0408] When the BSA was added at a ratio of 1:3 polymer-to-protein, the complexation efficiency was around 50%. Without wishing to be bound by a particular theory, it is believed that the micelle core was formed by the hydrophobic PLA blocks. It is also believed that the BSA complexed with the poly(histidine) created a "shell" layer on the surface of the PLA core, while the PEO created a dispersed second "shell" layer around the BSA/poly(histidine) layer.
[0409] To demonstrate the capability of poly(histidine)-based micelles to complex with a nucleic acid, a model plasmid DNA (i.e., pEF-GFP DNA), a mammalian DNA vector for expression of green fluorescent protein (GFP) using the elongation factor 1 alpha (EF1a) promoter, was added to a solution with PEO-b-PLA-b-PHIS micelles at a low pH (i.e., lower than 6.6). Similar to BSA, the pEF-GFP DNA may complex with positively charged PHIS blocks, thereby creating neutrally charged nanoparticle-DNA complexes.
[0410] Without wishing to be bound by a particular theory, it is believed that the micelle cores were formed by the hydrophobic PLA blocks. It is also believed that the BSA or pEF-GFP complexed with the poly(histidine) created a "shell" layer on the surface of the PLA core, while the PEO created a dispersed second "shell" layer around the BSA/poly(histidine) or pEF-GFP DNA/poly(histidine) layer.
[0411] Figure 5 demonstrates DNA and mRNA encapsulation in, and release from, PEO-PLA-PHIS particles. Agarose gel electrophoresis (1%) was used to demonstrate the encapsulation of DNA and mRNA into PEO-PLA-PHIS particles (well 1). Exposure of the particles to an acidic pH of 4.6 causes protonation of PHIS and disruption of the particle conformation, resulting in plasmid release, as observed in the DNA band from well 2 in the gel image. Plasmid release can also be triggered by exposure to surfactant from the SDS-containing loading dye, as can be seen in well 3. The DNA band from release was compared to the band resulting from running DNA alone in the gel (well 4).
[0412] The properties of the poly(histidine)-based micelles complexed with protein and/or nucleic acid also vary based on the pH. FIG. 6A is a graph of the average diameter of PEO-b-PLA-b-PHIS micelles complexed with BSA as a function of pH, and FIG. 6B is a graph of the amount of released BSA as a function of pH.
[0413] As shown, the nanoparticles are smallest at a pH of around 7 (around 400 nm). When the pH is raised above 7 (e.g., up to around 10), the overall micelle size also increases, and BSA remains complexed with the micelle. When the pH is lowered below 7 (e.g., to about 3-4), the overall micelle size also increases (e.g., swells), but at a pH of about 3-4, BSA is released from the micelles.
[0414] In another example, HepG2 cells were seeded overnight into a 24-well plate at a density of 50,000 cells per well. Bare DNA and the different formulations containing DNA were added to the cells to achieve a final concentration of DNA of 5pg per well. The formulations used were: Lipofectamine (a commercially available reagent traditionally used to transfect cells in vitro), PEO-PLA-PHIS particles, and PEO-PLL-PHIS particles. After 2 days of co-incubation, the cells were detached from the surface with trypsin, diluted with PBS, and analyzed by flow cytometry. Flow cytometry was used to measure GFP fluorescence of transfected cells in a population of 10,000 cells per sample. Each condition was measured in 5 biological replicates.
[0415] Figure 7 demonstrates transfection efficiency. HepG2 cells were seeded overnight in 24-well plates at 50,000 cells/well. Cells were exposed to different formulations in Opti-MEM media (DNA alone, Lipofectamine + DNA + mRNA, and PEO-PLA-PHIS + DNA + mRNA) at a final concentration of 500 ng of DNA per well. At 48 hours post-incubation, cells were analyzed for GFP expression by microscopy and flow cytometry to determine the transfection efficiency for each condition.
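By way of illustration only, transfection efficiency from flow cytometry can be summarized as the percentage of GFP-positive events per sample, averaged over replicates. The following is a minimal sketch using the 10,000 events per sample and five replicates described above; the GFP-positive counts are hypothetical placeholders, not measured results, and the gating that defines a GFP-positive event is assumed to have been done upstream.

```python
from statistics import mean, stdev


def percent_gfp_positive(gfp_positive_events: int, total_events: int) -> float:
    """Percentage of analyzed events that fall in the GFP-positive gate."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= gfp_positive_events <= total_events:
        raise ValueError("positive events must lie between 0 and total_events")
    return 100.0 * gfp_positive_events / total_events


EVENTS_PER_SAMPLE = 10_000  # cells analyzed per sample, as described in the text

# Hypothetical GFP-positive counts for five replicates of one condition.
replicate_counts = [3100, 2950, 3230, 3010, 3080]
efficiencies = [percent_gfp_positive(c, EVENTS_PER_SAMPLE) for c in replicate_counts]

# Mean and sample standard deviation across the replicates.
summary = (mean(efficiencies), stdev(efficiencies))
```

The same summary would be computed separately for each formulation so that the conditions can be compared on a common scale.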
[0416] Further poly(histidine)-containing triblock copolymers using the same protocols have been and continue to be developed. Such copolymers include, in addition to poly(histidine), non-degradable and degradable diblocks. Degradable polymers include, but are not limited to: PEO(5000)-b-PCL(16300) ("P2350-EOCL"); PEO(2000)-b-PMCL(11900) ("OCL"); PEO(2000)-b-PMCL(8300) ("OMCL"); PEO(1100)-b-PTMC(5100) ("OTMC"); and PEO(2000)-b-PTMC/PCL(11200) ("OTCL").
[0417] The various embodiments include a micelle structure containing a triblock copolymer capable of complexing with at least one protein or nucleic acid, the triblock copolymer including a hydrophilic block including poly(ethylene oxide), a hydrophobic block, and a poly(L-histidine) block, wherein the poly(L-histidine) block enables pH-dependent release of the at least one protein or nucleic acid. In an embodiment, the hydrophobic block is selected from the group including poly(esters), poly(anhydrides), poly(peptides), and artificial poly(nucleic acids). In an embodiment, the hydrophobic block includes at least one aliphatic polyester selected from the group including poly(lactic acid), poly(glycolic acid) (PGA), poly(lactic-co-glycolic acid) (PLGA), poly(ε-caprolactone) (PCL), and poly(3-hydroxybutyrate) (PHB). In an embodiment, the hydrophobic block includes poly(lactic acid) having an average length of 25 units.
[0418] Various embodiments may include a composition for delivering at least one gene editing molecule to a cell, the composition including a micelle assembled from a plurality of triblock copolymers in which each triblock copolymer has at least one hydrophobic block, at least one hydrophilic block, and at least one poly(L-histidine) block, in which the at least one poly(L-histidine) block complexes with the at least one gene editing molecule, and the at least one poly(L-histidine) block is capable of a pH-dependent release of the at least one gene editing molecule.
[0419] In an embodiment, the at least one gene editing molecule may include one or more proteins or nucleic acids encoding a protein, in which the protein is selected from a group that includes transposases, nucleases, and integrases. In an embodiment, the protein may be a nuclease selected from a group that includes CRISPR associated protein 9 (Cas9), transcription activator-like effector nucleases, and zinc finger nucleases. In an embodiment, the Cas9 may be a dCas9 or dSaCas9. In an embodiment, the at least one gene editing molecule may include one or more transposable elements. In an embodiment, the one or more transposable elements may include, but are not limited to, a piggyBac transposon, a Sleeping Beauty transposon, a Helraiser transposon, a Tol2 transposon, or a LINE-1 (L1) retrotransposon. In an embodiment, the at least one gene editing molecule may further include one or more transposases.
[0420] Further embodiments may include a kit including a pharmaceutical composition for delivering at least one gene editing molecule to a cell. The composition may include a micelle assembled from a plurality of triblock copolymers in which each triblock copolymer has at least one hydrophobic block, at least one hydrophilic block, and at least one poly(L-histidine) block, in which the at least one poly(L-histidine) block complexes with the at least one gene editing molecule, and the at least one poly(L-histidine) block is capable of a pH-dependent release of the at least one gene editing molecule. The kit may further include an implement for administering the pharmaceutical composition intravenously, via inhalation, topically, per rectum, per the vagina, transdermally, subcutaneously, intraperitoneally, intrathecally, intramuscularly, or orally.

Claims

1. A composition for delivering at least one gene editing molecule to a cell, the composition comprising:
a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer having at least one hydrophobic block, at least one hydrophilic block, and at least one cationically-charged block,
wherein:
the at least one cationically-charged block complexes with the at least one gene editing molecule; and
the at least one cationically-charged block is capable of a pH dependent release of the at least one gene editing molecule.
2. The composition of claim 1, wherein the cationically-charged block is constitutively positively charged at a physiological pH > 6.0.
3. The composition of claim 2, wherein the cationically-charged block comprises polyallylamine (PAA), polyamidoamine (PAMAM), polydimethylaminoethylmethacrylate (PDMAEMA), poly(2(diisopropylamino)ethyl methacrylate), polyethylenimine (PEI), branched polyethylenimine (bPEI), poly(L-lysine) (PLL), poly(L-arginine) (PLA), poly(oligoethanamino)amide) (POEAA), chitosan, succinated chitosan, 6-N,N,N-trimethyltriazole-chitosans, a polyphosphoramidate, a polyhydroxyalkanoate, poly(calixane), poly(cyclodextrin), poly(aspartamide) (pASP(DET)), a poly(aminoamide), Poly(methacrylic acid-g-ethylene glycol) (P(MAA-g-EG)), or Poly(N,N-dialkylaminoethylmethacrylates) (PDAAEMA).
4. The composition of claim 1, wherein the cationically-charged block has a neutral charge at a physiological pH of between 7.0 and 7.8 and a positive charge at a reduced pH of between 6.0 and 7.0.
5. The composition of claim 4, wherein the cationically-charged block comprises a substituted polyimidazole, a poly (L-histidine), a poly(beta)amino ester (PBAE), a poly(allylamine) hydrochloride(PAH), a poly(meth)acrylamide, or a poly(styrene-alt-maleic anhydride) (pSMA).
6. The composition of any one of claims 1-5, wherein the at least one gene editing molecule comprises one or more of a protein or a nucleic acid encoding for a protein, wherein the protein is selected from the group comprising a transposase, a nuclease, and an integrase.
7. The composition of claim 6, wherein the nuclease is selected from the group comprising:
a CRISPR associated protein 9 (Cas9);
a type IIS restriction enzyme;
a transcription activator-like effector nuclease (TALEN); and
a zinc finger nuclease (ZFN).
8. The composition of claim 7, wherein the type IIS restriction enzyme comprises Clo051.
9. The composition of claim 7 or 8, wherein the Cas9 comprises a dCas9 or dSaCas9.
10. The composition of claim 1, wherein the at least one gene editing molecule comprises one or more transposable elements.
11. The composition of claim 10, wherein the one or more transposable elements comprises one or more of a piggyBac transposon, a piggyBac-like transposon, a Sleeping Beauty transposon, a Helraiser transposon, a Tol2 transposon or a LINE-1 (L1) transposon.
12. The composition of claim 10, wherein the one or more transposable elements comprise a piggyBac transposon.
13. The composition of claim 10, wherein the one or more transposable elements comprise a piggyBac-like transposon.
14. The composition of claim 10, wherein the one or more transposable elements comprise a Sleeping Beauty transposon.
15. The composition of claim 10, wherein the one or more transposable elements comprise a Helraiser transposon.
16. The composition of claim 10, wherein the one or more transposable elements comprise a Tol2 transposon.
17. The composition of any one of claims 10-16, wherein the at least one gene editing molecule further comprises one or more transposase(s).
18. The composition of claim 17, wherein the one or more transposase(s) comprises a piggyBac transposase, a super piggyBac transposase (SPB), a piggyBac-like transposase, a Sleeping Beauty transposase, a hyperactive Sleeping Beauty transposase (SB100X), a Helitron transposase, a Tol2 transposase or a transposase capable of transposing a LINE-1 (L1) transposon.
19. The composition of claim 17, wherein the transposase comprises a piggyBac transposase or a super piggyBac transposase (SPB).
20. The composition of claim 17, wherein the transposase comprises a piggyBac-like transposase.
21. The composition of claim 17, wherein the transposase comprises a Sleeping Beauty transposase or a hyperactive Sleeping Beauty transposase (SB100X).
22. The composition of claim 17, wherein the transposase comprises Helitron transposase.
23. The composition of claim 17, wherein the transposase comprises a Tol2 transposase.
24. A kit, comprising:
a pharmaceutical composition for delivering at least one gene editing molecule to a cell, the composition comprising:
a micelle assembled from a plurality of triblock copolymers, wherein each triblock copolymer having at least one hydrophobic block, at least one hydrophilic block, and at least one cationically-charged block, wherein:
the at least one cationically-charged block complexes with the at least one gene editing molecule; and
the at least one cationically-charged block is capable of a pH dependent release of the at least one gene editing molecule; and
an implement for administering the pharmaceutical composition by a systemic route or by a local route.
25. The kit of claim 24, wherein the systemic route comprises intravenous delivery, inhalation, transmucosal delivery, rectal delivery, vaginal delivery, subcutaneous delivery, intraperitoneal delivery, intrathecal delivery, intramuscular delivery or oral delivery.
26. The kit of claim 24, wherein the local route comprises topical delivery, transdermal delivery, intracerebrospinal delivery, intraspinal delivery, direct delivery to the central nervous system (CNS), intraocular delivery, intravitreal delivery, intramuscular delivery, or intraosseous delivery.
27. The kit of any one of claims 24-26, wherein the cationically-charged block is constitutively positively charged at a physiological pH of > 6.0.
28. The kit of claim 27, wherein the cationically-charged block comprises polyallylamine (PAA), polyamidoamine (PAMAM), polydimethylaminoethylmethacrylate (PDMAEMA), poly(2-(diisopropylamino)ethyl methacrylate), polyethylenimine (PEI), branched polyethylenimine (bPEI), poly(L-lysine) (PLL), poly(L-arginine) (PLA), poly(oligoethanamino)amide) (POEAA), chitosan, succinated chitosan, 6-N,N,N-trimethyltriazole-chitosans, a polyphosphoramidate, a polyhydroxyalkanoate, poly(calixane), poly(cyclodextrin), poly(aspartamide) (pASP(DET)), a poly(aminoamide), Poly(methacrylic acid-g-ethylene glycol) (P(MAA-g-EG)), or Poly(N,N-dialkylaminoethylmethacrylates) (PDAAEMA).
29. The kit of any one of claims 24-26, wherein the cationically-charged block has a neutral charge at a physiological pH of between 7.0 and 7.8 and a positive charge at reduced pH of between 6.0 and 7.0.
30. The kit of claim 29, wherein the cationically-charged block comprises a substituted polyimidazole, a poly(L-histidine), a poly(beta)amino ester (PBAE), a poly(allylamine) hydrochloride (PAH), a poly(meth)acrylamide, or a poly(styrene-alt-maleic anhydride) (pSMA).
31. The use of the composition of any one of claims 1-23 for the modification of a target sequence, comprising contacting the composition and the target sequence under conditions suitable for nuclease activity.
32. The use of the composition of any one of claims 1-23 for the modification of a target sequence, comprising contacting the composition and the target sequence under conditions suitable for transposase activity.
33. The use of claim 31 or 32, wherein a target cell comprises the target sequence.
34. The use of any one of claims 31-33, wherein the target sequence is a DNA sequence.
35. The use of claim 34, wherein the DNA sequence is a genomic sequence.
36. The use of any one of claims 31-33, wherein the target sequence is an RNA sequence.
37. The use of any one of claims 33-36, wherein the target cell is ex vivo or in vivo.
38. The use of the composition of any one of claims 1-23 for the treatment of a disease or disorder, comprising administering a therapeutically-effective amount of the composition to a subject in need thereof.
39. A method of modifying a target sequence, comprising contacting the composition of any one of claims 1-23 and the target sequence under conditions suitable for nuclease activity.
40. A method of modifying a target sequence, comprising contacting the composition of any one of claims 1-23 and the target sequence under conditions suitable for transposase activity.
41. The method of claim 39 or 40, wherein a target cell comprises the target sequence.
42. The method of any one of claims 39-41, wherein the target sequence is a DNA sequence.
43. The method of claim 42, wherein the DNA sequence is a genomic sequence.
44. The method of any one of claims 39-41, wherein the target sequence is an RNA sequence.
45. The method of any one of claims 41-44, wherein the target cell is ex vivo or in vivo.
46. A method of treating a disease or disorder, comprising administering a therapeutically-effective amount of the composition of any one of claims 1-23 to a subject in need thereof.
PCT/US2018/066961 2017-12-20 2018-12-20 Micelles for complexation and delivery of proteins and nucleic acids WO2019126589A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201762608518P 2017-12-20 2017-12-20
US62/608,518 2017-12-20

Publications (1)

Publication Number Publication Date
WO2019126589A1 (en) 2019-06-27

Family

ID=65278451

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/066961 WO2019126589A1 (en) 2017-12-20 2018-12-20 Micelles for complexation and delivery of proteins and nucleic acids

Country Status (1)

Country Link
WO (1) WO2019126589A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007008300A2 (en) * 2005-05-31 2007-01-18 ECOLE POLYTECHNIQUE FéDéRALE DE LAUSANNE Triblock copolymers for cytoplasmic delivery of gene-based drugs
WO2017190091A1 (en) * 2016-04-29 2017-11-02 Vindico Nanobiotechnology, Llc Poly(histidine)-based micelles for complexation and delivery of proteins and nucleic acids
US10041077B2 (en) 2014-04-09 2018-08-07 Dna2.0, Inc. DNA vectors, transposons and transposases for eukaryotic genome modification

Non-Patent Citations (25)

* Cited by examiner, † Cited by third party
Title
"GenBank", Database accession no. 741958
"GenBank", Database accession no. AAA87375
"GenBank", Database accession no. AAL39784
"GenBank", Database accession no. AB179012
"GenBank", Database accession no. ABD76335
"GenBank", Database accession no. BAD11135
"GenBank", Database accession no. BAF82026
"GenBank", Database accession no. EU287451
"GenBank", Database accession no. GU270322
"GenBank", Database accession no. GU329918
"GenBank", Database accession no. GU477713
"GenBank", Database accession no. GU477714
"GenBank", Database accession no. NP_689808
"GenBank", Database accession no. XP_001814566
"GenBank", Database accession no. XP_001948139
"GenBank", Database accession no. XP_002123602
"GenBank", Database accession no. XP_220453
"GenBank", Database accession no. XP_310729
"GenBank", Database accession no. XP_312615
"GenBank", Database accession no. XP_320414
CHANGYU HE ET AL: "Reductive triblock copolymer micelles with a dynamic covalent linkage deliver antimiR-21 for gastric cancer therapy", POLYMER CHEMISTRY ROYAL SOCIETY OF CHEMISTRY UK, vol. 7, no. 26, 14 July 2016 (2016-07-14), pages 4352 - 4366, XP002790537, ISSN: 1759-9954 *
CONNIE CHENG ET AL: "Multifunctional triblock copolymers for intracellular messenger RNA delivery", BIOMATERIALS, ELSEVIER SCIENCE PUBLISHERS BV., BARKING, GB, vol. 33, no. 28, 15 June 2012 (2012-06-15), pages 6868 - 6876, XP028428397, ISSN: 0142-9612, [retrieved on 20120620], DOI: 10.1016/J.BIOMATERIALS.2012.06.020 *
GASPAR VÍTOR M ET AL: "Bioreducible poly(2-ethyl-2-oxazoline)-PLA-PEI-SS triblock copolymer micelles for co-delivery of DNA minicircles and Dox", JOURNAL OF CONTROLLED RELEASE, ELSEVIER, AMSTERDAM, NL, vol. 213, 13 July 2015 (2015-07-13), pages 175 - 191, XP029259364, ISSN: 0168-3659, DOI: 10.1016/J.JCONREL.2015.07.011 *
KANJIRO MIYATA ET AL: "Polyplex Micelles from Triblock Copolymers Composed of Tandemly Aligned Segments with Biocompatible, Endosomal Escaping, and DNA-Condensing Functions for Systemic Gene Delivery to Pancreatic Tumor Tissue", PHARMACEUTICAL RESEARCH, KLUWER ACADEMIC PUBLISHERS-PLENUM PUBLISHERS, NL, vol. 25, no. 12, 10 September 2008 (2008-09-10), pages 2924 - 2936, XP019647965, ISSN: 1573-904X, DOI: 10.1007/S11095-008-9720-2 *
KIM HYUN JIN ET AL: "siRNA delivery from triblock copolymer micelles with spatially-ordered compartments of PEG shell, siRNA-loaded intermediate layer, and hydrophobic core", BIOMATERIALS, vol. 35, no. 15, 6 March 2014 (2014-03-06), pages 4548 - 4556, XP028631254, ISSN: 0142-9612, DOI: 10.1016/J.BIOMATERIALS.2014.02.016 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020132396A1 (en) 2018-12-20 2020-06-25 Poseida Therapeutics, Inc. Nanotransposon compositions and methods of use
WO2021046261A1 (en) 2019-09-05 2021-03-11 Poseida Therapeutics, Inc. Allogeneic cell compositions and methods of use
CN110840839A (en) * 2019-09-20 2020-02-28 中山大学 Multifunctional polymer micelle for combined delivery of photosensitizer and gene editing system, and preparation method and application thereof
CN110974784A (en) * 2019-09-20 2020-04-10 中山大学 Multifunctional polymer micelle for combined delivery of chemotherapeutic drugs and gene editing system, and preparation method and application thereof
CN110974784B (en) * 2019-09-20 2021-01-12 中山大学 Multifunctional polymer micelle for combined delivery of chemotherapeutic drugs and gene editing system, and preparation method and application thereof
CN110840839B (en) * 2019-09-20 2021-01-12 中山大学 Multifunctional polymer micelle for combined delivery of photosensitizer and gene editing system, and preparation method and application thereof
WO2021127505A1 (en) 2019-12-20 2021-06-24 Poseida Therapeutics, Inc. Anti-muc1 compositions and methods of use
WO2021183795A1 (en) 2020-03-11 2021-09-16 Poseida Therapeutics, Inc. Chimeric stimulatory receptors and methods of use in t cell activation and differentiation
WO2022012758A1 (en) * 2020-07-17 2022-01-20 Probiogen Ag Hyperactive transposons and transposases
WO2022182797A1 (en) 2021-02-23 2022-09-01 Poseida Therapeutics, Inc. Genetically modified induced pluripotent stem cells and methods of use thereof
WO2023060088A1 (en) 2021-10-04 2023-04-13 Poseida Therapeutics, Inc. Transposon compositions and methods of use thereof
WO2023164573A1 (en) 2022-02-23 2023-08-31 Poseida Therapeutics, Inc. Genetically modified cells and methods of use thereof
WO2023225471A3 (en) * 2022-05-16 2024-01-11 Flagship Pioneering Innovations Vi, Llc Helitron compositions and methods
WO2024178055A1 (en) 2023-02-21 2024-08-29 Poseida Therapeutics, Inc. Compositions and methods for genome editing
WO2024178069A1 (en) 2023-02-21 2024-08-29 Poseida Therapeutics, Inc. Compositions and methods for genome editing

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18842868

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24/09/2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18842868

Country of ref document: EP

Kind code of ref document: A1