WO2012074758A1 - Hyperactive piggyBac transposases - Google Patents

Hyperactive piggyBac transposases

Info

Publication number
WO2012074758A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
amino acid
leucine
methionine
protein
Application number
PCT/US2011/061054
Other languages
English (en)
Inventor
Eric Ostertag
Blair Madison
Original Assignee
Transposagen Biopharmaceuticals, Inc.
Application filed by Transposagen Biopharmaceuticals, Inc.

Classifications

    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 - Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 - Recombinant DNA-technology
    • C12N15/63 - Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 - Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 - Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/8509 - Vectors or expression systems specially adapted for eukaryotic hosts for animal cells for producing genetically modified animals, e.g. transgenic
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 - Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 - Hydrolases (3)
    • C12N9/16 - Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 - Ribonucleases RNAses, DNAses
    • A - HUMAN NECESSITIES
    • A01 - AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K - ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2217/00 - Genetically modified animals
    • A01K2217/15 - Animals comprising multiple alterations of the genome, by transgenesis or homologous recombination, e.g. obtained by cross-breeding
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2799/00 - Uses of viruses
    • C12N2799/02 - Uses of viruses as vector
    • C12N2799/021 - Uses of viruses as vector for the expression of a heterologous nucleic acid
    • C12N2799/027 - Uses of viruses as vector for the expression of a heterologous nucleic acid where the vector is derived from a retrovirus
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N - MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 - Nucleic acids vectors
    • C12N2800/90 - Vectors containing a transposable element

Definitions

  • the present invention is directed, in part, to PiggyBac transposase proteins, nucleic acids encoding the same, compositions comprising the same, kits comprising the same, non-human transgenic animals, and methods of using the same.
  • the PiggyBac (PB) transposase is a compact functional transposase protein that catalyzes the excision and re-integration of the PB transposon (Fraser et al., Insect Mol. Biol., 1996, 5, 141-51; Mitra et al., EMBO J., 2008, 27, 1097-1109; and Ding et al., Cell, 2005, 122, 473-83). In many cases, an increase in the movement of the transposon to another part of the genome is desired.
  • Hyperactive piggybac transposases have been described in PCT Serial Number PCT/US2010/025380, filed on February 25, 2010, and PCT/US2010/025386, filed on February 25, 2010, both of which are herein incorporated by reference in their entirety.
  • hyperactive PiggyBac transposase proteins, nucleic acids encoding the same, compositions comprising the same, kits comprising the same, non-human transgenic animals comprising the same, and methods of using the same that comprise at least one mutation or modification within the defined N-terminal region and/or within the C-terminal region but specifically excluding any previously published species in those regions.
  • the present invention provides proteins comprising at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the present invention provides proteins comprising at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise one of the following amino acid substitutions in SEQ ID NO:2: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for the cysteine at position 125; a leucine for the cysteine at position 125; a serine for the glycine at position 165; a lysine for
  • the present invention provides proteins comprising at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein comprises one of the following amino acid substitutions in SEQ ID NO:2: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for the cysteine at position 125; a leucine for the cysteine at position 125; a serine for the glycine at position 165; a lysine for the
  • the protein comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 and wherein the protein does not comprise one or more of the following amino acid substitutions in SEQ ID NO:2: a serine for the glycine at position 165; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the methionine at position 226; a lysine for the valine at position 240; a leucine for the phenylalanine at position 241 ; a glutamine for the methionine at position 282; a tryptophan for the leucine at position 296
  • the protein comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 and wherein the protein comprises one or more of the following amino acid substitutions in SEQ ID NO:2: a serine for the glycine at position 165; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the methionine at position 226; a lysine for the valine at position 240; a leucine for the phenylalanine at position 241 ; a glutamine for the methionine at position 282; a tryptophan for the leucine at position 296;
  • the protein comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein comprises one of the amino acid substitutions in SEQ ID NO:2 described in PCT Serial Number PCT/US2010/025380, filed on February 25, 2010, or PCT/US2010/025386, filed on February 25, 2010.
  • the protein comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein does not comprise one of the amino acid substitutions in SEQ ID NO:2 described in PCT Serial Number PCT/US2010/025380, filed on February 25, 2010, or PCT/US2010/025386, filed on February 25, 2010.
  • the protein comprises at least 90% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein comprises one of the amino acid substitutions in SEQ ID NO:2 described in PCT Serial Number PCT/US2010/025380, filed on February 25, 2010, or PCT/US2010/025386, filed on February 25, 2010.
  • the protein comprises at least 90% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein does not comprise one of the amino acid substitutions in SEQ ID NO:2 described in PCT Serial Number PCT/US2010/025380, filed on February 25, 2010, or PCT/US2010/025386, filed on February 25, 2010.
  • the protein comprises at least 95% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein comprises at least one substitution mutation described in PCT Serial Number PCT/US2010/025380, filed on February 25, 2010, or PCT/US2010/025386, filed on February 25, 2010.
  • the protein comprises at least 95% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise a substitution mutation described in PCT Serial
  • the protein comprises at least 99% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein comprises at least one substitution mutation described in PCT Serial Number PCT/US2010/025380, filed on February 25, 2010, or PCT/US2010/025386, filed on February 25, 2010.
  • the protein comprises at least 99% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise one substitution mutation described in PCT Serial Number PCT/US2010/025380, filed on February 25, 2010, or PCT/US2010/025386, filed on February 25, 2010.
  • the protein comprises at least 80% sequence identity to SEQ ID NO:
  • the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise one of the amino acid substitutions in SEQ ID NO:2 described in PCT Serial Number PCT/US2010/025380, filed on February 25, 2010, or
  • the protein comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise one of the following amino acid substitutions in SEQ ID NO:2: an asparagine for serine at position 3; a valine for isoleucine at position 30, a tryptophan or valine for isoleucine at position 82, a proline for serine at position 103, or a proline for arginine at position 119.
  • the protein comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein comprises one of the following amino acid substitutions in SEQ ID NO:2: an asparagine for serine at position 3; a valine for isoleucine at position 30, a tryptophan or valine for isoleucine at position 82, a proline for serine at position 103, or a proline for arginine at position 119.
  • the protein comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 and wherein the protein does not comprise one of the following amino acid substitutions in SEQ ID NO:2: an isoleucine for valine at position 436; a tyrosine for methionine at position 456; a phenylalanine for leucine at position 470; or a leucine for methionine at position 503.
  • the protein comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises more than one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein one of the mutations in the protein comprises only one of the following amino acid substitutions in SEQ ID NO:2: an isoleucine for valine at position 436; a tyrosine for methionine at position 456; a phenylalanine for leucine at position 470; or a leucine for methionine at position 503.
  • the protein comprises at least 80% sequence identity to SEQ ID NO:2, and comprises more than one of the amino acid substitutions in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2. In some embodiments, the protein comprises at least 90% sequence identity to SEQ ID NO:2, and comprises more than one of the amino acid substitutions in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein comprises at least 95% sequence identity to SEQ ID NO:2, and comprises more than one of the amino acid substitutions in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2. In some embodiments, the protein comprises at least 99% sequence identity to SEQ ID NO:2, and comprises more than one of the amino acid substitutions in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein comprises at least 70% sequence identity to SEQ ID NO:2, and comprises an amino acid substitution in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid numbers 436 and 503 of SEQ ID NO:2, wherein the protein does not consist of: SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ
  • the protein comprises at least 80% sequence identity to SEQ ID NO:2, and comprises an amino acid substitution in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid numbers 436 and 503 of SEQ ID NO:2, wherein the protein does not consist of: SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:
  • the protein comprises at least 85% sequence identity to SEQ ID NO:2, and comprises an amino acid substitution in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid numbers 436 and 503 of SEQ ID NO:2, wherein the protein does not consist of: SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72, SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:80, SEQ ID NO:82, SEQ ID NO:84, SEQ ID NO:
  • the protein comprises at least 90% sequence identity to SEQ ID NO:2, and comprises an amino acid substitution in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid numbers 436 and 503 of SEQ ID NO:2, wherein the protein does not consist of: SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:
  • the protein comprises at least 95% sequence identity to SEQ ID NO:2, and comprises an amino acid substitution in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid numbers 436 and 503 of SEQ ID NO:2, wherein the protein does not consist of: SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72, SEQ ID NO:74, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:80, SEQ ID NO:82, SEQ ID NO:84, SEQ ID NO:
  • the protein comprises at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to SEQ ID NO:2, and comprises an amino acid substitution in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid numbers 436 and 503 of SEQ ID NO:2, wherein the protein does not consist of: SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ
  • the protein comprises at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2 and comprises more than one of the amino acid substitutions in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2. Any residue within the above-identified domains may be mutated to any natural or unnatural amino acid.
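The two criteria that recur in the embodiments above, a minimum percent identity to SEQ ID NO:2 and at least one change inside residues 1 to 119 or 436 to 503, can be illustrated with a minimal Python sketch. It assumes the variant differs from the reference only by substitutions, so no alignment step is needed; the function names and the 503-residue placeholder reference are hypothetical and are not taken from the application, and the sketch does not model the excluded substitution lists.

```python
# Domain boundaries as stated in the text (1-based, inclusive, numbered on SEQ ID NO:2).
N_TERMINAL_DOMAIN = (1, 119)
C_TERMINAL_DOMAIN = (436, 503)


def substitutions(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_residue, variant_residue) for every mismatch.

    Assumption: both sequences are non-empty and of equal length, i.e. the
    variant carries substitutions only (no insertions or deletions).
    """
    if not reference or len(reference) != len(variant):
        raise ValueError("this sketch only handles non-empty, equal-length sequences")
    return [
        (pos, ref, var)
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


def percent_identity(reference: str, variant: str) -> float:
    """Percentage of positions at which the two sequences carry the same residue."""
    mismatches = len(substitutions(reference, variant))
    return 100.0 * (len(reference) - mismatches) / len(reference)


def in_claimed_domain(position: int) -> bool:
    """True if a 1-based position lies in residues 1-119 or 436-503."""
    return any(lo <= position <= hi for lo, hi in (N_TERMINAL_DOMAIN, C_TERMINAL_DOMAIN))


def meets_criteria(reference: str, variant: str, min_identity: float = 80.0) -> bool:
    """Identity threshold met AND at least one substitution inside either domain."""
    subs = substitutions(reference, variant)
    return percent_identity(reference, variant) >= min_identity and any(
        in_claimed_domain(pos) for pos, _, _ in subs
    )


# Placeholder reference: NOT the real SEQ ID NO:2, only a stand-in of the same numbering range.
reference = "A" * 503
n_terminal_variant = reference[:2] + "N" + reference[3:]  # change at position 3
print(substitutions(reference, n_terminal_variant))
print(meets_criteria(reference, n_terminal_variant))
```

A real comparison would first align the candidate against SEQ ID NO:2 with a pairwise alignment tool, because percent identity for sequences of different lengths depends on the alignment and on how gaps are counted.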
  • the protein comprises at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2 and comprises more than one of the amino acid substitutions in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2. Any residue within the above-identified domains may be mutated to any natural or unnatural amino acid.
  • the protein comprises at least 80% sequence identity to SEQ ID NO:2, and comprises more than one of the amino acid substitutions in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2. In some embodiments, the protein comprises at least 90% sequence identity to SEQ ID NO:2, and comprises more than one of the amino acid substitutions in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein comprises at least 95% sequence identity to SEQ ID NO:2, and comprises more than one of the amino acid substitutions in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2. In some embodiments, the protein comprises at least 99% sequence identity to SEQ ID NO:2, and comprises more than one of the amino acid substitutions in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein does not comprise SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO: 14, SEQ ID NO:16, SEQ ID NO: 18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72, SEQ ID NO:
  • the protein does not consist of: SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO: 10, SEQ ID NO: 12, SEQ ID NO: 14, SEQ ID NO: 16, SEQ ID NO: 18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:60, SEQ ID NO:62, SEQ ID NO:64, SEQ ID NO:66, SEQ ID NO:68, SEQ ID NO:70, SEQ ID NO:72, SEQ ID NO:74
  • the protein does not comprise SEQ ID NO:24, SEQ ID NO:24, SEQ ID NO:24, SEQ ID NO:24, SEQ ID NO:24, SEQ ID NO:24, SEQ ID NO:
  • the protein does not consist of: SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:34, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:62, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:82, SEQ ID NO:84, SEQ ID NO:86, or SEQ ID NO:102.
  • the present invention also provides nucleic acids encoding any of the proteins described above.
  • the nucleic acid does not consist of: SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO: 11, SEQ ID NO: 13, SEQ ID NO: 15, SEQ ID NO: 17, SEQ ID NO: 19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51 , SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:
  • the nucleic acid does not consist of: SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:33, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51 , SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:61 , SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, or SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:33, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51 , SEQ ID NO:53, SEQ
  • the present invention also provides nucleic acids encoding any of the proteins described above.
  • the nucleic acid does not comprise: SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65,
  • the nucleic acid does not comprise: SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:33, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51 , SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:61 , SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, or SEQ ID NO: 101.
  • the present invention also provides vectors comprising any of the nucleic acids described above encoding any of the proteins described above.
  • the vector is a plasmid.
  • the vector is a retrovirus or component of a retrovirus.
  • the retrovirus comprises long terminal repeats, a psi packaging signal, a cloning site, and a sequence encoding a selectable marker.
  • the present invention also provides cells comprising any of the nucleic acids or vectors described herein.
  • the cell is a sperm or an egg.
  • the cell is a stem cell.
  • the stem cell is an induced pluripotent stem cell, an embryonic stem cell, or a spermatogonial stem cell.
  • the nucleic acid vectors are introduced into spermatogonial stem cells using methods and compositions described in PCT/US2009/066275, which is herein incorporated by reference in its entirety.
  • kits comprising: a vector comprising a nucleic acid encoding any of the proteins described herein; and a transposon comprising an insertion site for an exogenous nucleic acid, wherein the insertion site is flanked by a first inverted repeat sequence comprising a sequence having at least about 90% sequence identity to SEQ ID NO:91 and/or a second inverted repeat sequence comprising a sequence having at least about 90% sequence identity to SEQ ID NO:92.
  • the present invention also provides non-human, transgenic animals comprising a nucleic acid molecule encoding any of the proteins described herein.
  • the non-human, transgenic animal further comprises a transposon comprising an insertion site for an exogenous nucleic acid, wherein the insertion site is flanked by a first inverted repeat sequence comprising a sequence having at least about 90% sequence identity to SEQ ID NO:91 and/or a second inverted repeat sequence comprising a sequence having at least about 90% sequence identity to SEQ ID NO:92.
  • the present invention also provides methods of integrating an exogenous nucleic acid into the genome of at least one cell of a multicellular or unicellular organism comprising administering directly to the multicellular or unicellular organism: a transposon comprising the exogenous nucleic acid, wherein the exogenous nucleic acid is flanked by a first inverted repeat sequence comprising a sequence having at least about 90% sequence identity to SEQ ID NO:91 and/or a second inverted repeat sequence comprising a sequence having at least about 90% sequence identity to SEQ ID NO:92; and a protein described herein to excise the exogenous nucleic acid from a plasmid, episome, or transgene and integrate the exogenous nucleic acid into the genome.
  • the protein is administered as a nucleic acid encoding the protein.
  • the transposon and nucleic acid encoding the protein are present on separate vectors.
  • the transposon and nucleic acid encoding the protein are present on the same vector.
  • the multicellular or unicellular organism is a vertebrate.
  • the vertebrate animal is a mammal.
  • the vertebrate animal or non-human, transgenic animal is selected from a rodent, mini-pig, dog, cat, sheep, pig, cow and goat.
  • the administering is administering systemically.
  • the exogenous nucleic acid comprises a gene.
  • the present invention also provides methods of generating a non-human, transgenic animal comprising a germline mutation comprising: breeding a first non-human, transgenic animal comprising a transposon with a second non-human, transgenic animal comprising a vector comprising a nucleotide sequence encoding any of the proteins described herein.
  • the present invention also provides methods of generating a non-human, transgenic animal comprising: introducing a nucleic acid molecule encoding any of the proteins described herein into a cell under conditions sufficient to generate a transgenic animal.
  • the Tichoplusia ni wikltype piggyBac transposon nucleic acid sequence corresponds to SEQ ID NO: 1 , shown below: ATGGGCTCTAGCCTGGACGACGAGCACATCCTGAGCGCCCTGCTGCAGA GCGACGACGAACTGGTGGGCGAGGACAGCGACAGCGAGATCAGCGACCACGTG TCCGAGGACGACGTGCAGTCCGACACCGAGGAAGCCTTCATCGACGAGGTGCAC GAAGTGCAGCCTACCAGCAGCGGCTCCGAGATCCTGGACGAGCAGAACGTGATC GAGCAGCCTGGCAGCTCCCTGGCCAGCAACAGAATCCTGACCCTGCCCCAGAGA ACCATCAGAGGCAAGAACAAGCACTGCTGGTCCACCTCCAAGAGCACCAGGCGG AGCAGAGTGTCCGCCCTGAACATCGTGCGGAGCCAGAGGGGCCCCACCAGAATG TGCAGAAACATCTACGACCCCCTGCTGTGCTTCAAGCTGTTCTTCACCGACGAGA TCATCA
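To connect the nucleotide numbering of SEQ ID NO:1 with the residue positions used for the substitutions, a short Python sketch can translate the first ten codons of the sequence listed above using the standard genetic code. The helper name `translate` is hypothetical, only the first 30 nucleotides are copied from the text, and spaces in the listed sequence are treated as line-wrap artifacts; this is an illustration, not the applicants' tooling.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order
# (the last base varies fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA string in frame 1, ignoring whitespace.

    Any trailing partial codon is dropped; stop codons appear as '*'.
    """
    cleaned = "".join(dna.split()).upper()
    usable = len(cleaned) - len(cleaned) % 3
    return "".join(CODON_TABLE[cleaned[i:i + 3]] for i in range(0, usable, 3))


# First 30 nucleotides of SEQ ID NO:1 as listed in the text.
first_codons = "ATGGGCTCTAGCCTGGACGACGAGCACATC"
peptide = translate(first_codons)
print(peptide)     # MGSSLDDEHI
print(peptide[2])  # residue at position 3 (1-based): S
```

The residue at position 3 comes out as serine, which agrees with the substitution described in the text as "an asparagine for the serine at position 3".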
  • the Trichoplusia ni wild-type piggyBac transposon amino acid sequence corresponds to SEQ ID NO:2, shown below:
  • the invention relates to a protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprising at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one modification in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503;
  • the protein does not comprise the following single amino acid substitutions: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for the cysteine at position 125; a leucine for the cysteine at position 125; a serine for the glycine at position 165; a lysine for the tyrosine at position 177; a histidine for the tyrosine at position 177; a leucine for the phenylalanine at position 180; an isoleucine for the phenylalanine at position 180; a valine for the phenylalanine at position 180; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the valine at position 209; a phenylalanine for the methionine at position 226; an arginine for the leucine at position 235; a lysine for the valine at position 240; a leucine for the phenylalanine at position 241; a lysine for the proline at position 243; a serine for the asparagine at position 258; a glutamine for the methionine at position 282; a tryptophan for the leucine at position 296; a tyrosine for the leucine at position 296; a phenylalanine for the leucine at position 296;
  • S26P, DOV, G165S, T43A, Q55R, T57A, S61R, I82V, I90V, S103P, S103T, N113S, M185L, M194V, S230N, R281G, M282V, G316E, P410L, I426V, Q497L, K501N, K565I, N505D, S573L, S509G, N570S, N538K, K575R, Q591P, Q591R, F594L, L15P, H33Y, E45G, C97R/T242I, S103P, M194T, M194V, I221T, R372A, S373P, K375A, N384T, T560A, N571S, S573A, S584P, M589V, M589V/D170D, S592G, or F594L.
  • the invention relates to a protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprising at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one modification in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503;
  • the protein does not comprise the following single amino acid substitutions in the transposase amino acid sequence
  • the invention relates to a protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprising at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one modification in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503; provided that the protein does not comprise the amino acid substitution mutants disclosed in WO/2010/085699, WO/2010/099301, WO/2010/099296, or any priority document related thereto, all of which are incorporated herein by reference in their entirety.
  • the invention relates to a protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprising at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one modification in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503; provided that the protein does not comprise the following single amino acid substitutions or individual species: F395L, E66G, T319A,
  • the invention relates to a protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprising at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one modification in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503; provided that the protein does not comprise the following single amino acid substitutions or individual species:
  • the invention relates to a protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprising at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one modification in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503; provided that the protein does not comprise the following amino acid substitutions in isolation: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for
  • the protein comprises I221T only in combination with at least one other mutation. In some embodiments, the protein comprises M589V only in combination with at least one other mutation excluding the D170D mutation. In some embodiments, the protein comprises M589V, D170D only in combination with a third mutation.
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein does not consist of single substitutions at the following amino acid positions in SEQ ID NO:2: position 3, position 30, position 46, position 82, position 103, position 119, position 125, position 165, position 177, position 180, position 185, position 187, position 200, position 207, position 209, position 226, position 235, position 240, position 241, position 243, position 258, position 282, position 296, position 298, position 311, position 315, position 319, position 327, position 328, position 340, position 421, position 436, position 456, position 470, position 486, position 503, position 552; position 570, or position 5
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein does not comprise a substitution in at least one of the following amino acid positions in SEQ ID NO:2: position 3, position 30, position 46, position 82, position 103, position 119, position 125, position 165, position 177, position 180, position 185, position 187, position 200, position 207, position 209, position 226, position 235, position 240, position 241, position 243, position 258, position 282, position 296, position 298, position 311, position 315, position 319, position 327, position 328, position 340, position 421, position 436, position 456, position 470, position 486, position 503, position 552; position 570, or position 591.
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise a conservative amino acid substitution in at least one of the following amino acid positions in SEQ ID NO:2: position 3, position 30, position 46, position 82, position 103, position 119, position 125, position 165, position 177, position 180, position 185, position 187, position 200, position 207, position 209, position 226, position 235, position 240, position 241, position 243, position 258, position 282, position 296, position 298, position 311, position 315, position 319, position 327, position 328, position 340, position 421, position 436, position 456, position 470, position 486, position 503, position 552; position 570, or position 591.
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not consist of the following amino acid substitutions as a single mutation within SEQ ID NO:2: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for the cysteine at position 125; a leucine for the cysteine at position 125; a serine for the glycine at position 165; a lysine for the tyrosine at position 177; a
  • phenylalanine at position 180; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the valine at position 209; a phenylalanine for the methionine at position 226; an arginine for the leucine at position 235; a lysine for the valine at position 240; a leucine for the phenylalanine at position 241; a lysine for the proline at position 243; a serine for the asparagine at position 258; a glutamine for the methionine at position 282; a tryptophan for the leucine at position 296; a tyrosine for the leucine at position 296; a phenylalanine for the leucine at position 296;
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise one of the following amino acid substitutions in SEQ ID NO:2: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for the cysteine at position 125; a leucine for the cysteine at position 125; a serine for the glycine at position 165; a lysine for
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not consist of one of the following amino acid substitutions in SEQ ID NO:2: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for the cysteine at position 125; a leucine for the cysteine at position 125; a serine for the glycine at position 165; a lysine for the tyrosine at position 177; a histidine
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not consist of one or more of the following amino acid substitutions in SEQ ID NO:2: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for the cysteine at position 125; a leucine for the cysteine at position 125; a serine for the glycine at position 165; a ly
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise one of the following amino acid substitutions in SEQ ID NO:2: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for the cysteine at position 125; a leucine for the cysteine at position 125; a serine for the glycine at position 165; a lysine for
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not consist of a single substitution in the following amino acid positions of SEQ ID NO:2: position 165, position 185, position 187, position 200, position 207, position 226, position 240, position 241, position 282, position 296, position 298, position 311, position 315, position 436, position 456, position 486, or position 503.
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise a substitution in at least one of the following amino acid positions in SEQ ID NO:2: position 165, position 185, position 187, position 200, position 207, position 226, position 240, position 241, position 282, position 296, position 298, position 311, position 315, position 436, position 456, position 486, or position 503.
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not consist of a single substitution in the following amino acid positions of SEQ ID NO:2: position 165, position 185, position 187, position 200, position 207, position 226, position 240, position 241, position 282, position 296, position 298, position 311, position 315, position 436, position 456, position 486, or position 503.
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise a conservative amino acid substitution in at least one of the following amino acid positions in SEQ ID NO:2: position 165, position 185, position 187, position 200, position 207, position 226, position 240, position 241, position 282, position 296, position 298, position 311, position 315, position 436, position 456, position 486, or position 503.
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not consist of the following single amino acid substitutions in SEQ ID NO:2: a serine for the glycine at position 165; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the methionine at position 226; a lysine for the valine at position 240; a leucine for the phenylalanine at position 241 ; a glutamine for the methionine at position 282; a tryptophan for the leu
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise at least one of the following amino acid substitutions in SEQ ID NO:2: a serine for the glycine at position 165; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the methionine at position 226; a lysine for the valine at position 240; a leucine for the phenylalanine at position 241 ; a glutamine for the methionine at position 282; a tryptophan for the
  • the protein comprises at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and, if the protein comprises a single mutation, the following amino acid substitutions in SEQ ID NO:2 are excluded: G2C (a cysteine for glycine in position 2), Q40R, S3N, S26P.
  • the protein comprises at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and, if the protein comprises one or two mutations, the following amino acid substitutions in SEQ ID NO:2 are excluded: L15P, D19N/F395L, S31P/T164A, H33Y, E44K/K334R, E45G, C97R/T242I, S103P, R189K/G120G,
  • the protein comprises at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations and wherein at least one of the mutations is in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the following amino acid substitutions in SEQ ID NO:2 are excluded: G2C (a cysteine for glycine in position 2), Q40R, S3N, S26P, DOV, G165S, T43A, Q55R, T57A, S61R, I82V, I90V, S103P, S103T, N113S, M185L, M194V, S230N, R281G, M282V, G316E, P410L, I426V, Q497L, K501N, K565I, N505
  • the protein comprises at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations and wherein at least one of the mutations is in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the following amino acid substitutions in SEQ ID NO:2 are excluded: L15P, D19N/F395L, S31P/T164A, H33Y, E44K/K334R, E45G, C97R/T242I, S103P, R189K/G120G, R189R/D450N/R526R, M194T, M194V, S213SN 361, I221T, R372A, S373P, K375
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise at least one of the following amino acid substitutions in SEQ ID NO:2: G2C (a cysteine for glycine in position 2), Q40R, S3N, S26P, DOV, G165S, T43A, Q55R, T57A, S61R, I82V, I90V, S103P, S103T, N113S, M185L, M194V, S230N, R281G, M282V, G316E, P410L, I426V, Q497L, K501N, K565I, N505D, S573L, S509G, N570S, N538K, K575R, Q59
  • the protein comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise at least one of the following amino acid substitutions in SEQ ID NO:2: L15P, D19N/F395L, S31P/T164A, H33Y, E44K/K334R, E45G, C97R/T242I, S103P, R189K/G120G, R189R/D450N/R526R.
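The exclusions above alternate between a protein that does not "comprise" a listed substitution and one that does not "consist of" it. The sketch below shows one plausible set-based reading of that difference and asserts no legal interpretation. The helper names (`parse_combo`, `comprises_any`, `consists_of_any`) are hypothetical, and `EXCLUDED_EXAMPLE` holds only a few legible entries from the lists above, not a complete exclusion list.

```python
from typing import FrozenSet, Iterable, List


def parse_combo(label: str) -> FrozenSet[str]:
    """Split a label such as 'S31P/T164A' into its single substitutions."""
    return frozenset(part.strip() for part in label.split("/") if part.strip())


# Illustrative subset of the excluded entries (assumption: not exhaustive).
EXCLUDED_EXAMPLE: List[FrozenSet[str]] = [
    parse_combo(label)
    for label in ("L15P", "S31P/T164A", "H33Y", "E45G", "R189R/D450N/R526R")
]


def comprises_any(candidate: Iterable[str], excluded: List[FrozenSet[str]]) -> bool:
    """True if every part of some excluded entry is present in the candidate."""
    have = frozenset(candidate)
    return any(combo <= have for combo in excluded)


def consists_of_any(candidate: Iterable[str], excluded: List[FrozenSet[str]]) -> bool:
    """True if the candidate is exactly one excluded entry and nothing else."""
    have = frozenset(candidate)
    return any(combo == have for combo in excluded)
```

Under this reading, a protein carrying H33Y plus Q497L "comprises" an excluded substitution but does not "consist of" one, so the two forms of exclusion treat it differently.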
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise at least one of the aforementioned amino acid substitutions in SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein comprises at least one of the following amino acid substitutions as compared to SEQ ID NO:2: SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:2, S
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 85% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise at least one of the aforementioned amino acid substitutions in SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 90% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise at least one of the aforementioned amino acid substitutions in SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 95% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise at least one of the aforementioned amino acid substitutions in SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 99% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise at least one of the aforementioned amino acid substitutions in SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise more than one of the aforementioned amino acid substitutions in SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise more than one of the aforementioned amino acid substitutions in SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 85% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise more than one of the aforementioned amino acid substitutions in SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 90% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise more than one of the aforementioned amino acid substitutions in SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 95% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise more than one of the aforementioned amino acid substitutions in SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 99% sequence identity to SEQ ID NO:2, wherein the protein comprises at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise more than one of the aforementioned amino acid substitutions in SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 85% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 90% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 95% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 99% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 75% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 80% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 85% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 90% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 95% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
  • the protein (as nucleic acid, as nucleic acid in a vector, or as purified recombinant protein) comprises at least 99% sequence identity to SEQ ID NO:2, wherein the protein comprises two or more mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2.
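The domain limitations recited above can be illustrated with a short position check. This is a minimal sketch, not a claim construction: the inclusive boundaries 1-119 and 436-503 come from the text, while the function names and the particular reading of the "or" and "and" variants (stated in the comments) are assumptions made for illustration only.

```python
# Inclusive, 1-based residue ranges in SEQ ID NO:2, as recited in the text.
N_TERMINAL = (1, 119)
C_TERMINAL = (436, 503)


def count_in_domain(positions, domain):
    """Count distinct mutated positions that fall inside an inclusive domain."""
    start, end = domain
    return sum(1 for p in set(positions) if start <= p <= end)


def meets_or_variant(positions, minimum=2):
    """Assumed reading of the 'or' variant: at least `minimum` mutated
    positions lie in the N-terminal or the C-terminal domain (counted together)."""
    n = count_in_domain(positions, N_TERMINAL)
    c = count_in_domain(positions, C_TERMINAL)
    return n + c >= minimum


def meets_and_variant(positions, minimum=2):
    """Assumed reading of the 'and' variant: both domains carry at least one
    mutated position, and together they carry at least `minimum`."""
    n = count_in_domain(positions, N_TERMINAL)
    c = count_in_domain(positions, C_TERMINAL)
    return n >= 1 and c >= 1 and n + c >= minimum
```

Under these assumptions, a variant mutated at positions 3 and 450 would satisfy both readings, whereas a variant mutated only at positions 200 and 250 would satisfy neither, because both positions lie outside the two recited domains.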
  • sequence identity is determined by using the stand-alone executable BLAST engine program for blasting two sequences (bl2seq), which can be retrieved from the National Center for Biotechnology Information (NCBI) ftp site, using the default parameters (Tatusova and Madden, FEMS Microbiol Lett., 1999, 174, 247-250; which is incorporated herein by reference in its entirety).
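To make the notion of percent sequence identity concrete, the following sketch computes identity over a pre-aligned pair of sequences. It is an illustration only and does not reproduce bl2seq or its default parameters: the function name, the use of `-` as the gap character, and the choice of alignment length as the denominator are assumptions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must already be aligned (equal length, '-' marks a gap).
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)
```

For example, an alignment of four columns with three matching residues gives 75% identity, which would sit exactly at the lowest threshold recited above.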
  • “conservative” amino acid substitutions may be defined as set out in Tables A, B, or C below.
  • Hyperactive transposases include those wherein conservative substitutions have been introduced by modification of polynucleotides encoding polypeptides of the invention.
  • Amino acids can be classified according to physical properties and contribution to secondary and tertiary protein structure.
  • a conservative substitution is recognized in the art as a substitution of one amino acid for another amino acid that has similar properties.
  • Exemplary conservative substitutions are set out in Table A.
  • Val (V) Ile Leu Met Ala
  • hyperactive PiggyBac transposases described herein are intended to include polypeptides bearing one or more insertions, deletions, or substitutions, or any combination thereof, of amino acid residues as well as modifications other than insertions, deletions, or substitutions of amino acid residues.
  • the hyperactive piggyBac transposons are generated from the integration defective piggyBac variants. That is, alterations, one or more mutations, are made in the integration defective piggyBac transposon sequence. In some embodiments, the hyperactive piggyBac transposons are generated from the wildtype sequences. That is, alterations are made in the wild type piggyBac transposon sequence. In some embodiments, the transposons are not generated from integration defective piggyBac variants. In some embodiments, the transposons are not generated from the wildtype sequences.
  • “more than one” or “two or more” of the aforementioned amino acid substitutions means 2, 3, 4, 5, 6, 7, 8, 9, or 10 of the recited amino acid substitutions. In some embodiments, “more than one” means 2, 3, 4, or 5 of the recited amino acid substitutions. In some embodiments, “more than one” means 2, 3, or 4 of the recited amino acid substitutions. In some embodiments, “more than one” means 2 or 3 of the recited amino acid substitutions. In some embodiments, “more than one” means 2 of the recited amino acid substitutions.
  • the transposon of the protein may be fused to a marker protein.
  • the nucleic acid sequence can encode a portion of any variety of recombinant proteins, e.g. any protein known in the art. e.g.
  • the protein encoded by the nucleic acid sequence can be a marker protein such as green fluorescent protein (GFP), the blue fluorescent protein (BFP), the photoactivatable GFP (PA-GFP), the yellow shifted green fluorescent protein (Yellow GFP), the yellow fluorescent protein (YFP), the enhanced yellow fluorescent protein (EYFP), the cyan fluorescent protein (CFP), the enhanced cyan fluorescent protein (ECFP), the monomeric red fluorescent protein (mRFP1), the kindling fluorescent protein (KFP1), aequorin, the autofluorescent proteins (AFPs), or the fluorescent proteins JRed, TurboGFP, PhiYFP and PhiYFP-m, tHc-Red (HcRed-Tandem), PS-CFP2 and KFP-Red (all commercially available), or other suitable fluorescent proteins, or chloramphenicol acetyltransferase (CAT).
  • the protein further may be selected from growth hormones, for example to promote growth in a transgenic animal, or from beta-galactosidase (lacZ), luciferase (LUC), and insulin-like growth factors (IGFs), alpha-anti-trypsin, erythropoietin (EPO), factors VIII and XI of the blood clotting system, LDL-receptor, GATA-1, etc.
  • the nucleic acid sequence further may be a suicide gene encoding e.g. apoptotic or apoptosis-related enzymes and genes including: AIF, Apaf e.g.
  • the protein comprises more than one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein does not comprise SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:42, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:4,
  • the protein comprises more than one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein does not comprise SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:34, SEQ ID NO:38, SEQ ID NO:40, SEQ ID NO:44, SEQ ID NO:46, SEQ ID NO:48, SEQ ID NO:50, SEQ ID NO:52, SEQ ID NO:54, SEQ ID NO:56, SEQ ID NO:58, SEQ ID NO:62, SEQ ID NO:76, SEQ ID NO:78, SEQ ID NO:82, SEQ ID NO:84, SEQ ID NO:86, or SEQ ID NO:102.
  • the present invention also provides nucleic acids encoding any one of the hyperactive PiggyBac transposase proteins described herein.
  • the present invention provides nucleic acids encoding a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, and the protein comprises more than one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the following, single amino acid substitutions in SEQ ID NO:2 are excluded: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine
  • phenylalanine at position 180; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the valine at position 209; a phenylalanine for the methionine at position 226; an arginine for the leucine at position 235; a lysine for the valine at position 240; a leucine for the phenylalanine at position 241; a lysine for the proline at position 243; a serine for the asparagine at position 258; a glutamine for the methionine at position 282; a tryptophan for the leucine at position 296; a tyrosine for the leucine at position 296; a phenylalanine for the leucine at position 296;
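The substitution phrases used in these exclusion lists ("an asparagine for the serine at position 3") map naturally onto the conventional short notation (S3N). The sketch below converts the phrases and checks whether a variant carries exactly one substitution that appears on an exclusion list. It is illustrative only: the helper names are hypothetical, the one-letter codes are the standard amino acid codes, and the exclusion set contains only the first seven entries legible in the text, because the full list is truncated in this copy.

```python
import re

# Standard one-letter amino acid codes.
ONE_LETTER = {
    "alanine": "A", "arginine": "R", "asparagine": "N", "aspartic acid": "D",
    "cysteine": "C", "glutamine": "Q", "glutamic acid": "E", "glycine": "G",
    "histidine": "H", "isoleucine": "I", "leucine": "L", "lysine": "K",
    "methionine": "M", "phenylalanine": "F", "proline": "P", "serine": "S",
    "threonine": "T", "tryptophan": "W", "tyrosine": "Y", "valine": "V",
}

_PHRASE = re.compile(
    r"an?\s+([a-z ]+?)\s+for\s+the\s+([a-z ]+?)\s+at\s+position\s+(\d+)"
)


def to_short(phrase):
    """Convert 'a(n) NEW for the OLD at position N' to short form, e.g. 'S3N'."""
    match = _PHRASE.fullmatch(phrase.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised substitution phrase: {phrase!r}")
    new_name, old_name, position = match.groups()
    return f"{ONE_LETTER[old_name]}{int(position)}{ONE_LETTER[new_name]}"


# Partial list: only the entries legible before the text is cut off.
EXCLUDED_SINGLES = {to_short(p) for p in (
    "an asparagine for the serine at position 3",
    "a valine for the isoleucine at position 30",
    "a serine for the alanine at position 46",
    "a threonine for the alanine at position 46",
    "a tryptophan for the isoleucine at position 82",
    "a proline for the serine at position 103",
    "a proline for the arginine at position 119",
)}


def is_excluded_single(substitutions, excluded=EXCLUDED_SINGLES):
    """True if the variant has exactly one substitution and it is on the list."""
    subs = set(substitutions)
    return len(subs) == 1 and subs <= excluded
```

Under this reading, a variant carrying only S3N would be excluded, while a variant carrying both S3N and I30V would not, since the exclusion is directed at single substitutions.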
  • the present invention also provides nucleic acids encoding a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, and the protein comprises more than one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein does not comprise at least one of the following amino acid substitutions in SEQ ID NO:2: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for the cysteine at position 125; a leucine for the cyste
  • the nucleic acid encodes a protein that comprises at least
  • nucleic acid encodes a protein comprising at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein does not comprise at least one of the following amino acid substitutions in SEQ ID NO:2: a serine for the glycine at position 165; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the methionine at position 226; a lysine for the valine at position 240; a leucine for the phenylalanine at position 241; a glutamine for the methion
  • the nucleic acid encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, wherein nucleic acid encodes a protein comprising at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein does not comprise more than one of the following amino acid substitutions in SEQ ID NO:2: a serine for the glycine at position 165; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the methionine at position 226; a lysine for the valine at position 240; a leucine for the
  • the nucleic acid encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, wherein nucleic acid encodes a protein comprising at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 and in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein does not comprise at least one of the following amino acid substitutions in SEQ ID NO:2: a serine for the glycine at position 165; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the methionine at position 226; a lysine for the valine at position 240; a leucine for the
  • the nucleic acid encodes a protein that comprises at least
  • nucleic acid encodes a protein comprising at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2, and wherein the protein does not comprise more than one of the following amino acid substitutions in SEQ ID NO:2: a serine for the glycine at position 165; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the methionine at position 226; a lysine for the valine at position 240; a leucine for the phenylalanine at position 241 ; a glutamine for the methi
  • the nucleic acid does not comprise SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55,
  • the nucleic acid does not comprise SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:33, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:61, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, or SEQ ID NO:101.
  • the present invention also provides vectors comprising any of the aforementioned nucleic acids.
  • the present invention provides vectors comprising a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2 with at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise at least one of the following amino acid substitutions in SEQ ID NO:2: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119;
  • the vector comprises a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, with at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise at least one of the following amino acid substitutions in SEQ ID NO:2: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for the cysteine at position 125; a leucine for the cyst
  • the vector does not comprise a nucleic acid that comprises SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:
  • the vector does not comprise a nucleic acid that comprises SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:33, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:61, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, or SEQ ID NO:101.
  • the vector is a plasmid. In other embodiments, the vector is a retrovirus. In some embodiments, the vector is a linear DNA molecule. In some
  • the retrovirus comprises long terminal repeats, a psi packaging signal, a cloning site, and a sequence encoding a selectable marker.
  • the vector is a viral vector, such as pLXIN (Clontech).
  • the present invention also provides cells or organisms comprising any of the aforementioned nucleic acids.
  • the present invention provides cells or organisms comprising a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, with at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and, if the protein comprises a single mutation, the following amino acid substitutions in SEQ ID NO:2 are excluded: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the argin
  • phenylalanine at position 180; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the valine at position 209; a phenylalanine for the methionine at position 226; an arginine for the leucine at position 235; a lysine for the valine at position 240; a leucine for the phenylalanine at position 241; a lysine for the proline at position 243; a serine for the asparagine at position 258; a glutamine for the methionine at position 282; a tryptophan for the leucine at position 296; a tyrosine for the leucine at position 296; a phenylalanine for the leucine at position 296;
  • the present invention relates to pharmaceutical compositions containing either a piggyBac transposase as a protein or encoded by a nucleic acid, and/or a hyperactive piggyBac transposon, or a gene transfer system as described herein comprising a piggyBac transposase as a protein or encoded by a nucleic acid, in combination with a hyperactive piggyBac transposon.
  • the pharmaceutical composition may optionally be provided together with a pharmaceutically acceptable carrier, adjuvant or vehicle.
  • pharmaceutically acceptable carrier, adjuvant, or vehicle according to the invention refers to a non-toxic carrier, adjuvant or vehicle that does not destroy the pharmacological activity of the component(s) with which it is formulated.
  • Pharmaceutically acceptable carriers, adjuvants or vehicles that may be used in the compositions of this invention include, but are not limited to, ion exchangers, alumina, aluminum stearate, lecithin, serum proteins, such as human serum albumin, buffer substances such as phosphates, glycine, sorbic acid, potassium sorbate, partial glyceride mixtures of saturated vegetable fatty acids, water, salts or electrolytes, such as protamine sulfate, disodium hydrogen phosphate, potassium hydrogen phosphate, sodium chloride, zinc salts, colloidal silica, magnesium trisilicate, polyvinyl pyrrolidone, cellulose-based substances, polyethylene glycol, sodium carboxymethylcellulose, polyacrylates, waxes, polyethylene-polyoxypropylene-block polymers, polyethylene glycol and wool fat.
  • compositions of the present invention may be administered orally, parenterally, by inhalation spray, topically, rectally, nasally, buccally, vaginally or via an implanted reservoir.
  • parenteral as used herein includes subcutaneous, intravenous,
  • Sterile injectable forms of the pharmaceutical compositions of this invention may be aqueous or oleaginous suspensions. These suspensions may be formulated according to techniques known in the art using suitable dispersing or wetting agents and suspending agents.
  • the sterile injectable preparation may also be a sterile injectable solution or suspension in a non-toxic parenterally-acceptable diluent or solvent, for example as a solution in 1,3-butanediol.
  • the acceptable vehicles and solvents that may be employed are water, Ringer's solution and isotonic sodium chloride solution.
  • sterile, fixed oils are conventionally employed as a solvent or suspending medium.
  • any bland fixed oil may be employed including synthetic mono- or di-glycerides.
  • Fatty acids such as oleic acid and its glyceride derivatives are useful in the preparation of injectables, as are natural pharmaceutically-acceptable oils, such as olive oil or castor oil, especially in their polyoxyethylated versions.
  • These oil solutions or suspensions may also contain a long-chain alcohol diluent or dispersant, such as carboxymethyl cellulose or similar dispersing agents that are commonly used in the formulation of pharmaceutically acceptable dosage forms including emulsions and suspensions.
  • Other commonly used surfactants such as Tweens, Spans and other emulsifying agents or bioavailability enhancers which are commonly used in the manufacture of pharmaceutically acceptable solid, liquid, or other dosage forms may also be used for the purposes of formulation.
  • compositions of this invention may be orally administered in any orally acceptable dosage form including, but not limited to, capsules, tablets, aqueous suspensions or solutions.
  • carriers commonly used include lactose and corn starch.
  • lubricating agents such as magnesium stearate, are also added.
  • useful diluents include lactose and dried cornstarch.
  • the active ingredient is combined with emulsifying and suspending agents. If desired, certain sweetening, flavoring or coloring agents may also be added.
  • the pharmaceutically acceptable compositions of this invention may be administered in the form of suppositories for rectal administration.
  • these can be prepared by mixing the inventive gene transfer system or components thereof with a suitable non-irritating excipient that is solid at room temperature but liquid at rectal temperature.
  • the suppositories will melt in the rectum to release the drug.
  • the compositions comprise cocoa butter, beeswax and polyethylene glycols.
  • the pharmaceutically acceptable compositions of this invention may also be administered topically, especially when the target of treatment includes areas or organs readily accessible by topical application, including diseases of the eye, the skin, or the lower intestinal tract. Suitable topical formulations are readily prepared for each of these areas or organs.
  • the pharmaceutically acceptable compositions may be formulated in a suitable ointment containing the inventive gene transfer system or components thereof suspended or dissolved in one or more carriers.
  • Carriers for topical administration of the components of this invention include, but are not limited to, mineral oil, liquid petrolatum, white petrolatum, propylene glycol, polyoxyethylene, polyoxypropylene component, emulsifying wax and water.
  • the pharmaceutically acceptable compositions can be formulated in a suitable lotion or cream containing the active components suspended or dissolved in one or more pharmaceutically acceptable carriers. Suitable carriers include, but are not limited to, mineral oil, sorbitan monostearate, polysorbate 60, cetyl esters wax, cetearyl alcohol, 2-octyldodecanol, benzyl alcohol and water.
  • the pharmaceutically acceptable compositions may be formulated as micronized suspensions in isotonic, pH adjusted sterile saline, or, preferably, as solutions in isotonic, pH adjusted sterile saline, either with or without a preservative such as benzalkonium chloride.
  • the pharmaceutically acceptable compositions may be formulated in an ointment such as petrolatum.
  • compositions of this invention may also be administered by nasal aerosol or inhalation.
  • Such compositions are prepared according to techniques well-known in the art of pharmaceutical formulation and may be prepared as solutions in saline, employing benzyl alcohol or other suitable preservatives, absorption promoters to enhance bioavailability, fluorocarbons, and/or other conventional solubilizing or dispersing agents.
  • the amount of the components of the present invention that may be combined with the carrier materials to produce a composition in a single dosage form will vary depending upon the host treated and the particular mode of administration. It has to be noted that a specific dosage and treatment regimen for any particular patient will depend upon a variety of factors, including the activity of the specific component employed, the age, body weight, general health, sex, diet, time of administration, rate of excretion, drug combination, and the judgment of the treating physician and the severity of the particular disease being treated. The amount of a component of the present invention in the composition will also depend upon the particular component(s) in the composition.
  • the pharmaceutical composition is preferably suitable for the treatment of diseases, particularly diseases caused by gene defects such as cystic fibrosis, hypercholesterolemia, hemophilia, immune deficiencies including HIV, Huntington disease, alpha-anti-trypsin deficiency, as well as cancer selected from colon cancer, melanomas, kidney cancer, lymphoma, acute myeloid leukemia (AML), acute lymphoid leukemia (ALL), chronic myeloid leukemia (CML), chronic lymphocytic leukemia (CLL), gastrointestinal tumors, lung cancer, gliomas, thyroid cancer, mamma carcinomas, prostate tumors, hepatomas, diverse virus-induced tumors such as e.g.
  • papilloma virus induced carcinomas e.g. cervix carcinoma
  • adeno carcinomas, herpes virus induced tumors (e.g. Burkitt's lymphoma, EBV induced B cell lymphoma), Hepatitis B induced tumors (Hepato cell carcinomas), HTLV-1 and HTLV-2 induced lymphoma, lung cancer, pharyngeal cancer, anal carcinoma, glioblastoma, lymphoma, rectum carcinoma, astrocytoma, brain tumors, stomach cancer, retinoblastoma, basalioma, brain metastases, medulloblastoma, vaginal cancer, pancreatic cancer, testis cancer, melanoma, bladder cancer, Hodgkin syndrome, meningeoma, Schneeberger's disease, bronchial carcinoma, pituitary cancer, mycosis fungoides, gullet cancer, breast cancer, neurinoma, spinaliom
  • the present invention also provides cells or organisms comprising any of the aforementioned nucleic acids or proteins.
  • the present invention provides cells or organisms comprising a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, with at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and, if the protein comprises a single mutation, the protein does not comprise the following amino acid substitutions in SEQ ID NO:2: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline
  • the cells or organisms comprise a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, with at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise at least one of the following amino acid substitutions in SEQ ID NO:2: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for the cysteine at position 125; a leucine
  • the cells or organisms comprise a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, with at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2; provided that the protein does not comprise the species of amino acid substitution mutants disclosed in WO/2010/085699, WO/2010/099301, WO/2010/099296, or any priority document related thereto, all of which are incorporated herein by reference in their entirety.
  • the cells or organisms comprise a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, with at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2 or in the domain defined between and including amino acid number 436 and 503 of SEQ ID NO:2 and wherein the protein does not comprise at least one of the following amino acid substitutions in SEQ ID NO:2: a serine for the glycine at position 165; a leucine for the methionine at position 185; a glycine for the alanine at position 187; a tryptophan for the phenylalanine at position 200; a proline for the valine at position 207; a phenylalanine for the methionine at position 226; a lysine for the va
  • the cells or organisms do not comprise a nucleic acid that consists of: SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO
  • the cells or organisms do not comprise a nucleic acid that consists of: SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:33, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:61, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, or SEQ ID NO:101.
  • the cells or organisms do not comprise a nucleic acid that comprises SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:63, SEQ ID NO:65, SEQ ID NO:67, SEQ ID NO:69, SEQ ID NO:
  • the cells or organisms do not comprise a nucleic acid that comprises SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:33, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:43, SEQ ID NO:45, SEQ ID NO:47, SEQ ID NO:49, SEQ ID NO:51, SEQ ID NO:53, SEQ ID NO:55, SEQ ID NO:57, SEQ ID NO:61, SEQ ID NO:75, SEQ ID NO:77, SEQ ID NO:81, SEQ ID NO:83, SEQ ID NO:85, or SEQ ID NO:101.
  • the vector comprises a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, with at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2, wherein the following single or double mutations are excluded: an asparagine for the serine at position 3; a valine for the isoleucine at position 30; a serine for the alanine at position 46; a threonine for the alanine at position 46; a tryptophan for the isoleucine at position 82; a proline for the serine at position 103; a proline for the arginine at position 119; an alanine for the cysteine at position 125; a leucine for the cysteine at position 125; a serine for the glycine at position 165; a lysine for the tyrosine at position
  • the vector comprises a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, with at least one mutation in the domain defined between and including amino acid numbers 110 and 119 of SEQ ID NO:2, between and including amino acid numbers 100 and 119 of SEQ ID NO:2, between and including amino acid numbers 90 and 119 of SEQ ID NO:2, between and including amino acid numbers 80 and 119 of SEQ ID NO:2, between and including amino acid numbers 70 and 119 of SEQ ID NO:2, between and including amino acid numbers 60 and 119 of SEQ ID NO:2, between and including amino acid numbers 50 and 119 of SEQ ID NO:2, between and including amino acid numbers 40 and 119 of SEQ ID NO:2, between and including amino acid numbers 30 and 119 of SEQ ID NO:2, between and including amino acid numbers 20 and 119 of SEQ ID NO:2, or between and including amino acid numbers 15 and 119 of SEQ ID NO:2.
  • the vector comprises a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, with at least one mutation in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2, between and including amino acid numbers 2 and 119 of SEQ ID NO:2, between and including amino acid numbers 3 and 119 of SEQ ID NO:2, between and including amino acid numbers 4 and 119 of SEQ ID NO:2, between and including amino acid numbers 5 and 119 of SEQ ID NO:2, between and including amino acid numbers 6 and 119 of SEQ ID NO:2, between and including amino acid numbers 7 and 119 of SEQ ID NO:2, between and including amino acid numbers 8 and 119 of SEQ ID NO:2, between and including amino acid numbers 9 and 119 of SEQ ID NO:2, between and including amino acid numbers 10 and 119 of SEQ ID NO:2, or between and including amino acid numbers 11 and 119 of SEQ ID NO:2.
  • the vector comprises a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, with at least one mutation in the domain defined between and including amino acid numbers 1 and 10 of SEQ ID NO:2, between and including amino acid numbers 1 and 20 of SEQ ID NO:2, between and including amino acid numbers 1 and 30 of SEQ ID NO:2, between and including amino acid numbers 1 and 40 of SEQ ID NO:2, between and including amino acid numbers 1 and 50 of SEQ ID NO:2, between and including amino acid numbers 1 and 60 of SEQ ID NO:2, between and including amino acid numbers 1 and 70 of SEQ ID NO:2, between and including amino acid numbers 1 and 80 of SEQ ID NO:2, between and including amino acid numbers 1 and 90 of SEQ ID NO:2, between and including amino acid numbers 1 and 100 of SEQ ID NO:2, or between and including amino acid numbers 1 and 110 of SEQ ID NO:2.
  • the vector comprises a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, with at least one mutation in the domain defined between and including amino acid numbers 1 and 10 of SEQ ID NO:2, between and including amino acid numbers 2 and 10 of SEQ ID NO:2, between and including amino acid numbers 3 and 10 of SEQ ID NO:2, between and including amino acid numbers 4 and 10 of SEQ ID NO:2, between and including amino acid numbers 5 and 10 of SEQ ID NO:2, between and including amino acid numbers 5 and 15 of SEQ ID NO:2, between and including amino acid numbers 10 and 20 of SEQ ID NO:2, between and including amino acid numbers 20 and 30 of SEQ ID NO:2, between and including amino acid numbers 30 and 40 of SEQ ID NO:2, between and including amino acid numbers 40 and 50 of SEQ ID NO:2, between and including amino acid numbers 50 and 60 of SEQ ID NO:2, between and including amino acid numbers 60 and 70 of SEQ ID NO:2, between and
  • the vector comprises a nucleic acid that encodes a protein that comprises at least 75% (or 80%, 85%, 90%, 95%, or 99%) sequence identity to SEQ ID NO:2, with at least one mutation in the domain defined between and including amino acid numbers 436 and 503 of SEQ ID NO:2.
  • the vector comprises a nucleic acid sequence that encodes a protein comprising at least one mutation in the domain defined between and including amino acid numbers: 430 and 440 of SEQ ID NO:2, 440 and 450 of SEQ ID NO:2, 450 and 460 of SEQ ID NO:2, 460 and 470 of SEQ ID NO:2, 470 and 480 of SEQ ID NO:2, 480 and 490 of SEQ ID NO:2, 500 and 503 of SEQ ID NO:2, 430 and 460 of SEQ ID NO:2, 440 and 470 of SEQ ID NO:2, 450 and 480 of SEQ ID NO:2, 460 and 490 of SEQ ID NO:2, or 480 and 503 of SEQ ID NO:2.
  • the vector comprises a nucleic acid sequence that encodes a protein comprising at least one mutation in the domain defined between and including amino acid numbers: 495 and 503 of SEQ ID NO:2, 485 and 503 of SEQ ID NO:2, 475 and 503 of SEQ ID NO:2, 465 and 503 of SEQ ID NO:2, 455 and 503 of SEQ ID NO:2, 445 and 503 of SEQ ID NO:2, 440 and 503 of SEQ ID NO:2, or 437 and 503 of SEQ ID NO:2.
  • the vector comprises a nucleic acid sequence that encodes a protein comprising at least one mutation in the domain defined between and including amino acid numbers: 436 and 446 of SEQ ID NO:2, 436 and 456 of SEQ ID NO:2, 436 and 466 of SEQ ID NO:2, 436 and 476 of SEQ ID NO:2, 436 and 486 of SEQ ID NO:2, 436 and 496 of SEQ ID NO:2, 436 and 497 of SEQ ID NO:2, 436 and 498 of SEQ ID NO:2, 436 and 499 of SEQ ID NO:2, 436 and 500 of SEQ ID NO:2, 436 and 501 of SEQ ID NO:2, 436 and 502 of SEQ ID NO:2, or 436 and 503 of SEQ ID NO:2.
  • the protein comprises three mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2.
  • the protein comprises three mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2, wherein the three mutations are selected from the following: I30V, I82W, S103P, and R119P. In some embodiments, the protein comprises the following four mutations in the domain defined between and including amino acid numbers 1 and 119 of SEQ ID NO:2: I30V, I82W, S103P, and R119P.
  • the protein comprises three mutations in the domain defined between and including amino acid numbers 436 and 503 of SEQ ID NO:2. In some embodiments, the protein comprises three mutations in the domain defined between and including amino acid numbers 436 and 503 of SEQ ID NO:2 wherein at least one of the three mutations is selected from the following: V436I, M456Y, and M503L. In some embodiments, the protein comprises the following three mutations in the domain defined between and including amino acid numbers 436 and 503 of SEQ ID NO:2: V436I, M456Y, and M503L. In some embodiments, the protein comprises three or more mutations in the domain defined between and including amino acid numbers 436 and 503 of SEQ ID NO:2, wherein none of the mutations are L470F.
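The substitution labels used above (for example I30V or M503L) encode the original residue, the position in SEQ ID NO:2, and the replacement residue. As a minimal sketch, assuming one-letter amino acid codes and the two domain ranges stated in the text (1 to 119 and 436 to 503), the following Python shows how such labels could be parsed and assigned to a domain. The names `Mutation`, `parse_mutation`, `domain_of`, and the domain labels "N-terminal" and "C-terminal" are illustrative and do not come from the specification.

```python
import re
from typing import NamedTuple, Optional

# Inclusive domain boundaries from the text, numbered against SEQ ID NO:2.
# The labels "N-terminal" and "C-terminal" are illustrative names only.
DOMAINS = {
    "N-terminal": (1, 119),
    "C-terminal": (436, 503),
}


class Mutation(NamedTuple):
    original: str    # residue in SEQ ID NO:2, one-letter code
    position: int    # 1-based position in SEQ ID NO:2
    substitute: str  # replacement residue, one-letter code


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> Mutation:
    """Parse a substitution label such as 'I30V' into its three parts."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Mutation(match.group(1), int(match.group(2)), match.group(3))


def domain_of(mutation: Mutation) -> Optional[str]:
    """Return the name of the domain containing the mutation, or None."""
    for name, (start, end) in DOMAINS.items():
        if start <= mutation.position <= end:
            return name
    return None


if __name__ == "__main__":
    for label in ["I30V", "I82W", "S103P", "V436I", "M456Y", "M503L"]:
        print(label, domain_of(parse_mutation(label)))
```

A label that lacks the replacement residue is rejected rather than guessed at, which makes incomplete labels easy to spot.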
  • a cell comprises any of the aforementioned vectors.
  • kits comprising: 1) any of the aforementioned vectors; and 2) any of the hyperactive PiggyBac transposons described herein comprising an insertion site for an exogenous nucleic acid, wherein the insertion site is flanked by either one or more of the inverted repeat sequences that are specifically recognized by any of the aforementioned proteins.
  • the inverted repeats comprise a first inverted repeat and/or a second inverted repeat, wherein the first inverted repeat comprises a sequence having at least about 80% sequence identity to SEQ ID NO:91 and the second inverted repeat comprises a sequence having at least about 80% sequence identity to SEQ ID NO:92.
  • the first inverted repeat comprises a sequence having at least about 85% sequence identity to SEQ ID NO:91 and the second inverted repeat comprises a sequence having at least about 85% sequence identity to SEQ ID NO:92. In some embodiments, the first inverted repeat comprises a sequence having at least about 90% sequence identity to SEQ ID NO:91 and the second inverted repeat comprises a sequence having at least about 90% sequence identity to SEQ ID NO:92. In some embodiments, the first inverted repeat comprises a sequence having at least about 95% sequence identity to SEQ ID NO:91 and the second inverted repeat comprises a sequence having at least about 95% sequence identity to SEQ ID NO:92.
  • the first inverted repeat comprises a sequence having at least about 99% sequence identity to SEQ ID NO:91 and the second inverted repeat comprises a sequence having at least about 99% sequence identity to SEQ ID NO:92. In some embodiments, the first inverted repeat comprises a sequence identical to SEQ ID NO:91 and the second inverted repeat comprises a sequence identical to SEQ ID NO:92.
  • the aforementioned transposon is a nucleic acid that is flanked at either end by inverted repeats which are recognized by an enzyme having PiggyBac transposase activity.
  • a PiggyBac transposase such as any of the aforementioned proteins, is capable of binding to the inverted repeat, excising the segment of nucleic acid flanked by the inverted repeats, and integrating the segment of nucleic acid flanked by the inverted repeats into the genome of the target cell.
  • the left (5') inverted repeat sequence is: 5'-CCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG-3' (SEQ ID NO:91) and the right (3') inverted repeat is: 5'-CCCTAGAAAGATAATCATATTGTGACGTACGTTAAAGATAATCATGCGTAAAATTGACGCATG-3' (SEQ ID NO:92).
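The identity thresholds recited for the inverted repeats (80%, 85%, 90%, 95%, 99%, or identical) can be illustrated with a simple comparison against SEQ ID NO:91 and SEQ ID NO:92. The sketch below is a deliberate simplification: it assumes the candidate has the same length as the reference and counts matching positions without gaps, whereas sequence identity is normally computed from an alignment. The function names `percent_identity` and `meets_threshold` are hypothetical.

```python
# Inverted repeat sequences as given in the text.
SEQ_ID_NO_91 = "CCCTAGAAAGATAGTCTGCGTAAAATTGACGCATG"
SEQ_ID_NO_92 = (
    "CCCTAGAAAGATAATCATATTGTGACGTACGTTAAAGATAATCATGCGTAAAATTGACGCATG"
)


def percent_identity(candidate: str, reference: str) -> float:
    """Ungapped, position-by-position identity for equal-length sequences.

    Simplifying assumption: no insertions or deletions. A gapped alignment
    would be needed for sequences of different lengths.
    """
    if len(candidate) != len(reference):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(a == b for a, b in zip(candidate.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def meets_threshold(candidate: str, reference: str, threshold: float) -> bool:
    """True if the candidate reaches the given percent identity."""
    return percent_identity(candidate, reference) >= threshold


if __name__ == "__main__":
    # Hypothetical variant: one substitution at the first base of the left repeat.
    variant = "T" + SEQ_ID_NO_91[1:]
    for threshold in (80, 85, 90, 95, 99):
        print(threshold, meets_threshold(variant, SEQ_ID_NO_91, threshold))
```

With a 35-nucleotide reference, a single substitution leaves 34 of 35 positions matching, so such a variant would satisfy the 95% embodiment but not the 99% one.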
  • the various elements of the transposon systems described herein can be produced by standard methods of restriction enzyme cleavage, ligation, and molecular cloning.
  • One protocol for constructing the vectors described herein includes the following steps. Purified nucleic acid fragments containing the desired component nucleotide sequences as well as extraneous sequences are cleaved with restriction endonucleases from initial sources, such as a vector comprising the PiggyBac transposase gene. Fragments containing the desired nucleotide sequences are separated from unwanted fragments of different size using conventional separation methods, such as for example, agarose gel electrophoresis.
  • RNA comprising the PiggyBac transposase can be produced with an RNA polymerase, using a DNA plasmid as a substrate.
  • Recombinant protein comprising the PiggyBac transposase can be produced by methods including, but not limited to, in vitro transcription and translation, or expression in E. coli followed by purification by affinity or fractionation. The procedures of cleavage, plasmid construction, cell transformation, plasmid production, RNA
  • the PiggyBac transposons described herein can include a wide variety of inserted nucleic acids, where the nucleic acids can include a sequence of bases that is endogenous and/or exogenous to a multicellular or unicellular organism.
  • the nature of the nucleic acid can vary depending upon the particular protocol being carried out.
  • the exogenous nucleic acid can be a gene.
  • the inserted nucleic acid that is positioned between the flanking inverted repeats can vary greatly in size. The only limitation on the size of the inserted nucleic acid is that the size should not be so great as to inactivate the ability of the transposon system to integrate the transposon into the target genome.
  • the upper and lower limits of the size of inserted nucleic acid can be determined empirically by those of skill in the art.
  • the inserted nucleic acid comprises at least one transcriptionally active gene, which is a coding sequence that is capable of being expressed under intracellular conditions, e.g. a coding sequence in combination with any requisite expression regulatory elements that are required for expression in the intracellular environment of the target cell whose genome is modified by integration of the transposon.
  • the transcriptionally active genes of the transposon can comprise a domain of nucleotides, i.e., an expression module that includes a coding sequence of nucleotides operably linked with requisite transcriptional mediation or regulatory element(s).
  • Requisite transcriptional mediation elements that may be present in the expression module include, but are not limited to, promoters, enhancers, termination and polyadenylation signal elements, splicing signal elements, and the like.
  • the expression module includes transcription regulatory elements that provide for expression of the gene in a broad host range.
  • transcription regulatory elements include, but are not limited to: SV40 elements, transcription regulatory elements derived from the LTR of the Rous sarcoma virus, transcription regulatory elements derived from the LTR of human cytomegalovirus (CMV), hsp70 promoters, and the like.
  • At least one transcriptionally active gene or expression module present in the inserted nucleic acid acts as a selectable marker.
  • A variety of different genes have been employed as selectable markers, and the particular gene employed in the vectors described herein as a selectable marker is chosen primarily as a matter of convenience.
  • selectable marker genes include, but are not limited to: thymidine kinase gene, dihydrofolate reductase gene, xanthine-guanine phosphoribosyl transferase gene, CAD, adenosine deaminase gene, asparagine synthetase gene, numerous antibiotic resistance genes (tetracycline, ampicillin, kanamycin, neomycin, and the like), aminoglycoside phosphotransferase genes, hygromycin B phosphotransferase gene, and genes whose expression provides for the presence of a detectable product, either directly or indirectly, such as, for example, beta-galactosidase, GFP, and the like.
  • the portion of the transposon containing the inverted repeats also comprises at least one restriction site located between the flanking inverted repeats, which serves as a site for insertion of an exogenous nucleic acid.
  • restriction sites include, but are not limited to: HindIII, PstI, SalI, AccI, HincII, XbaI, BamHI, SmaI, XmaI, KpnI, SacI, EcoRI, and the like.
  • the vector includes a polylinker, i.e. a closely arranged series or array of sites recognized by a plurality of different restriction enzymes, such as those listed above.
  • the inserted exogenous nucleic acid could comprise recombinase recognition sites, such as LoxP, FRT, or AttB/AttP sites, which are recognized by the Cre, Flp, and PhiC31 recombinases, respectively.
  • the source of hyperactive transposase is a nucleic acid that encodes the hyperactive transposase
  • the nucleic acid encoding the hyperactive transposase protein is generally part of an expression module, as described above, where the additional elements provide for expression of the transposase as required.
  • the present invention also provides methods of integrating an exogenous nucleic acid into the genome of at least one cell of a multicellular or unicellular organism comprising administering directly to the multicellular or unicellular organism: a) a transposon comprising the exogenous nucleic acid, wherein the exogenous nucleic acid is flanked by one or more of any of the aforementioned inverted repeat sequences that are recognized by any of the aforementioned proteins; and b) any one of the aforementioned proteins to excise the exogenous nucleic acid from a plasmid, episome, or transgene and integrate the exogenous nucleic acid into the genome.
  • the protein of b) is administered as a nucleic acid encoding the protein.
  • the transposon and nucleic acid encoding the protein of b) are present on separate vectors. In some embodiments, the transposon and nucleic acid encoding the protein of b) are present on the same vector.
  • the portion of the vector encoding the hyperactive transposase is located outside the portion carrying the inserted nucleic acid.
  • the transposase encoding region is located external to the region flanked by the inverted repeats. Put another way, the transposase encoding region is positioned to the left of the left terminal inverted repeat or to the right of the right terminal inverted repeat.
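The placement rule above, with the transposase encoding region lying outside the region flanked by the inverted repeats, can be expressed as a simple interval check. The sketch below assumes a linearized vector map with 1-based inclusive coordinates. The coordinate values and the function name `transposase_outside_transposon` are hypothetical and serve only as illustration.

```python
from typing import Tuple

# (start, end) on a linearized vector map, 1-based and inclusive.
Interval = Tuple[int, int]


def transposase_outside_transposon(
    left_ir: Interval, right_ir: Interval, transposase: Interval
) -> bool:
    """True if the transposase coding region lies wholly to the left of the
    left inverted repeat or wholly to the right of the right inverted repeat."""
    transposon_start = left_ir[0]
    transposon_end = right_ir[1]
    t_start, t_end = transposase
    return t_end < transposon_start or t_start > transposon_end


if __name__ == "__main__":
    # Hypothetical coordinates for a same-vector ("cis") arrangement.
    left_ir = (1000, 1034)    # 35 nt left repeat
    right_ir = (4000, 4062)   # 63 nt right repeat
    print(transposase_outside_transposon(left_ir, right_ir, (5000, 6800)))
    print(transposase_outside_transposon(left_ir, right_ir, (1500, 3300)))
```

On a circular plasmid "left" and "right" depend on where the map is linearized, so a check of this kind only makes sense relative to a chosen origin.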
  • the hyperactive transposase protein recognizes the inverted repeats that flank an inserted nucleic acid, such as a nucleic acid that is to be inserted into a target cell genome.
  • the multicellular or unicellular organism is a plant or animal. In some embodiments, the multicellular or unicellular organism is a vertebrate. In some embodiments, the vertebrate animal is a mammal, such as for example, a rodent (mouse or rat), livestock (pig, horse, cow, etc.), pets (dog or cat), and primates, such as, for example, a human.
  • the methods described herein can be used in a variety of applications in which it is desired to introduce and stably integrate an exogenous nucleic acid into the genome of a target cell.
  • In vivo methods of integrating exogenous nucleic acid into a target cell are known.
  • the route of administration of the transposon system to the multicellular or unicellular organism depends on several parameters, including: the nature of the vectors that carry the system components, the nature of the delivery vehicle, the nature of the multicellular or unicellular organism, and the like, where a common feature of the mode of administration is that it provides for in vivo delivery of the transposon system components to the target cell(s).
  • linear or circularized DNA, such as a plasmid, is employed as the vector for delivery of the transposon system to the target cell.
  • the plasmid may be administered in an aqueous delivery vehicle, such as a saline solution.
  • an agent that modulates the distribution of the vector in the multicellular or unicellular organism can be employed.
  • where the vectors comprising the subject system components are plasmid vectors, lipid-based vehicles, such as liposomes, can be employed, where the lipid-based vehicle may be targeted to a specific cell type for cell or tissue specific delivery of the vector.
  • polylysine-based peptides can be employed as carriers, which may or may not be modified with targeting moieties, and the like (Brooks et al., J. Neurosci. Methods, 1998, 80, 137-47; and Muramatsu et al., Int. J. Mol. Med., 1998, 1, 55-62).
  • the system components can also be incorporated onto viral vectors, such as adenovirus-derived vectors, Sindbis virus-derived vectors, retrovirus-derived vectors, hybrid vectors, and the like.
  • the above vectors and delivery vehicles are merely representative. Any vector/delivery vehicle combination can be employed, so long as it provides for in vivo administration of the transposon system to the multicellular or unicellular organism and target cell.
  • the elements of the PiggyBac transposase system are administered to the multicellular or unicellular organism in an in vivo manner such that they are introduced into a target cell of the multicellular or unicellular organism under conditions sufficient for excision of the inverted repeat flanked nucleic acid from the vector carrying the transposon and subsequent integration of the excised nucleic acid into the genome of the target cell.
  • the method can further include a step of ensuring that the requisite PiggyBac transposase activity is present in the target cell along with the introduced transposon.
  • the method can further include introducing a second vector into the target cell that encodes the requisite transposase activity, where this step also includes an in vivo administration step.
  • administration can be by a number of different routes, where representative routes of administration include, but are not limited to: oral, topical, intraarterial, intravenous, intraperitoneal, intramuscular, and the like.
  • the administering is administering systemically.
  • the particular mode of administration depends, at least in part, on the nature of the delivery vehicle employed for the vectors that harbor the PiggyBac transposon system.
  • the vector or vectors harboring the PiggyBac transposase system are administered intravascularly, such as intraarterially or intravenously, employing an aqueous based delivery vehicle, such as a saline solution.
  • the amount of vector nucleic acid comprising the transposon element, and in many embodiments the amount of vector nucleic acid encoding the transposase, which is introduced into the cell is sufficient to provide for the desired excision and insertion of the transposon nucleic acid into the target cell genome.
  • the amount of vector nucleic acid introduced should provide for a sufficient amount of transposase activity and a sufficient copy number of the nucleic acid that is desired to be inserted into the target cell.
  • the amount of vector nucleic acid that is introduced into the target cell varies depending on the efficiency of the particular introduction protocol that is employed, such as the particular in vivo administration protocol that is employed.
  • each component of the system that is administered to the multicellular or unicellular organism varies depending on the nature of the transposon nucleic acid, e.g. the nature of the expression module and gene, the nature of the vector on which the component elements are present, the nature of the delivery vehicle and the like. Dosages can readily be determined empirically by those of skill in the art.
  • the amount of transposon plasmid that is administered in many embodiments typically ranges from about 0.5 to 40 μg and is typically about 25 μg, while the amount of PiggyBac transposase encoding plasmid that is administered typically ranges from about 0.5 to 25 μg and is usually about 1 μg.
  • the nucleic acid region of the vector that is flanked by inverted repeats, i.e. the vector nucleic acid positioned between the PiggyBac transposase-recognized inverted repeats, is excised from the vector via the provided transposase and inserted into the genome of the targeted cell.
  • introduction of the vector DNA into the target cell is followed by subsequent transposase mediated excision and insertion of the exogenous nucleic acid carried by the vector into the genome of the targeted cell.
  • the subject methods may be used to integrate nucleic acids of various sizes into the target cell genome.
  • the size of DNA that is inserted into a target cell genome using the subject methods ranges from about 0.5 kb to 100.0 kb, usually from about 1.0 kb to about 60.0 kb, or from about 1.0 kb to about 10.0 kb.
  • the subject methods result in stable integration of the nucleic acid into the target cell genome.
  • by stable integration is meant that the nucleic acid remains present in the target cell genome for more than a transient period of time, and is passed on as a part of the chromosomal genetic material to the progeny of the target cell.
  • the subject methods of stable integration of nucleic acids into the genome of a target cell find use in a variety of applications in which the stable integration of a nucleic acid into a target cell genome is desired. Applications in which the subject vectors and methods find use include, for example, research applications, polypeptide synthesis applications and therapeutic applications.
  • the present invention can be used in, for example, germline mutagenesis in a rat, mouse, or other vertebrate; somatic mutagenesis in a rat, mouse, or other vertebrate;
  • the hyperactive transposase can be delivered as DNA, RNA, or protein.
  • the present invention relates to a colony of transgenic animals, each such transgenic animal comprising one or more exogenous nucleic acid sequences and one or two internal tandem repeat sequences of a transposon.
  • the present invention relates to one or more progeny from an animal comprising the one or more exogenous nucleic acid sequences and one or more internal tandem repeat sequences of the transposons.
  • the present invention relates to a colony of transgenic animals, each such transgenic animal comprising one or more exogenous nucleic acid sequences and one or two internal tandem repeat sequences of a transposon described herein.
  • the present invention relates to one or more progeny from an animal comprising the one or more exogenous nucleic acid sequences and one or more internal tandem repeat sequences of the transposons described herein.
  • the hyperactive PiggyBac transposase system described herein can be used for germline mutagenesis in a vertebrate species.
  • One method would entail the production of transgenic animals by, for example, pronuclear injection of newly fertilized oocytes.
  • typically, two types of transgene can be produced: one transgene (a "driver" transgene) provides expression of the transposase in germ cells (i.e., developing sperm or ova), and the other transgene (the "donor" transgene) comprises a transposon containing gene-disruptive sequences, such as a gene trap.
  • the transposase may be directed to the germline via a ubiquitously active promoter, such as the ROSA26 (Gt(ROSA)26Sor), pPol2 (Polr2a), or CMV/beta-actin (CAG) promoters.
  • the germline-specific promoter is a female-specific promoter (e.g., a ZP3 promoter).
  • double transgenic animals which contain both transgenes in their genome
  • the PiggyBac transposase expressed in germ cells catalyzes the excision of the transposon and mediates mobilization to another site in the genome. If this new site contains a gene, then gene expression or protein production can be perturbed through a gene trap.
  • the most effective gene traps consist of strong splicing signals, whereby disruption and creation of a null allele is mediated through a strong splice acceptor.
  • a strong splice acceptor can also create alleles of altered function (such as a dominant negative, dominant active, or gain of function). Alternately, expression is rendered ectopic, constitutive, or altered through the use of a heterologous promoter and strong splice donor.
  • Mutagenesis occurs in the germline of double-transgenic animals (with both driver and donor transgenes) and upon breeding double-transgenic animals, mutant offspring with heritable and permanent mutations are produced. Mutations can be generated by injection of a fertilized oocyte with transposase RNA or protein. Alternatively, the transposase (as DNA, RNA, or protein) is electroporated, transfected or injected into embryonic stem cells, induced pluripotent stem cells, or spermatogonial stem cells. These mutations (transposon insertions) can be detected by, for example, Southern blot and PCR.
  • the specific insertion sites within each mutant animal can then be identified by, for example, linker-mediated PCR, inverse PCR, or other PCR cloning techniques. Some of the mutant animals identified via PiggyBac-mediated mutagenesis can serve as valuable models for studying human disease.
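Once a PCR cloning technique has yielded a sequence read spanning the junction between the genome and the transposon, the insertion site can be located by finding where the transposon end begins in the read. The sketch below is only an illustration of that idea, not the protocol of the specification. It assumes the read is oriented so that genomic sequence precedes the left inverted repeat, uses the first 20 nucleotides of SEQ ID NO:91 as the search key, and requires an exact match. The name `genomic_flank` is hypothetical.

```python
from typing import Optional

# First 20 nt of the left (5') inverted repeat, SEQ ID NO:91 in the text.
LEFT_IR_PREFIX = "CCCTAGAAAGATAGTCTGCG"


def genomic_flank(read: str, ir_prefix: str = LEFT_IR_PREFIX) -> Optional[str]:
    """Return the genomic sequence that precedes the transposon end in a
    junction read, or None if the transposon end is not found.

    Assumptions: the read runs from genome into transposon, and the
    transposon end matches the prefix exactly (no sequencing errors).
    """
    read = read.upper()
    junction = read.find(ir_prefix)
    if junction == -1:
        return None
    return read[:junction]


if __name__ == "__main__":
    # Hypothetical junction read: 12 nt of genomic flank, then the repeat.
    read = "acgtgcatttaa" + LEFT_IR_PREFIX + "TAAAATT"
    print(genomic_flank(read))
```

The recovered flank would then be mapped to the reference genome to assign the insertion to a locus; that mapping step is outside the scope of this sketch.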
  • Somatic mutagenesis is very useful for discovering tumor suppressors and oncogenes in a model vertebrate animal, such as the rat. Such experiments are otherwise not possible in humans, but through PiggyBac-driven mutagenesis, carcinogenesis can be triggered, much in the same way that ionizing radiation triggers carcinogenesis through DNA damage. With PiggyBac transposon-mediated insertional mutagenesis, however, mutations can easily be pinpointed through, for example, PCR cloning techniques. The mutations uncovered are often directly linked to the cancer, and in a single animal, hundreds of such mutations can be identified. This is incredibly valuable for linking specific genes as causative agents (tumor suppressors and oncogenes) that are directly involved in providing the growth and survival advantages inherent in a developing neoplasia.
  • the transgenic strategy can be very similar to that for germline mutagenesis, except the driver transgene provides expression of the transposase in the tissue where carcinogenesis will be targeted.
  • the intestine-specific Villin (Vil1) promoter can provide highly specific expression of the PiggyBac transposase for targeted mutagenesis and carcinogenesis in the intestine and colon. This provides a valuable gene-discovery system for colon cancer in which oncogenes and tumor suppressors directly linked to colon cancer can be easily and rapidly identified.
  • the donor transgene would likely be a bi-directional gene trap that can cause a loss of function, such as a null allele in either orientation, and a gain of function.
  • the gain-of-function parameter is achieved through the use of a constitutive promoter, perhaps containing a strong enhancer sequence, which overexpresses a trapped oncogene.
  • the resulting tumors from PiggyBac-mediated mutagenesis would likely contain both types of mutations, and thus both tumor suppressors and oncogenes can be uncovered.
  • the PiggyBac transposon can mediate gene delivery in a target tissue with a much lower risk for immune reactions and cancer.
  • the inherently low immunogenicity of the PiggyBac transposon is due to its simplicity; there are no coat proteins, no receptor molecules, and no extracellular components, but simply a single small enzyme that interacts with host factors to mediate transposon insertion. While the PiggyBac transposon shows a slight preference for inserting within genes, this preference is much less pronounced than that of a retrovirus, which has a very high preference for inserting within transcriptional units.
  • in PiggyBac-mediated gene therapy, two plasmids can be delivered to the patient: one that provides expression of the transposase (a driver plasmid), and another that provides the transposon containing a therapeutic transgene (the donor plasmid). These DNAs can be complexed with liposomes and administered via parenteral injection. Upon entering a cell, the PiggyBac transposase may bind to the transposon in the donor plasmid, excise it, and then integrate it into the genome.
  • the present invention also provides methods of generating a transgenic, non-human vertebrate comprising in the genome of one or more of its cells a PiggyBac transposon which comprises a nucleotide sequence that, when integrated into the genome, modifies a trait in the transgenic, non-human vertebrate, comprising: introducing ex vivo into a non-human vertebrate embryo or fertilized oocyte a nucleic acid comprising a PiggyBac transposon which comprises a nucleotide sequence that, when integrated into the genome, modifies a trait in the transgenic, non-human vertebrate, and, within the same or on a separate nucleic acid, a nucleotide sequence encoding a PiggyBac transposase; implanting the resultant non-human vertebrate embryo or fertilized oocyte into a foster mother of the same species under conditions favoring development of the embryo into a transgenic, non-human vertebrate; and, after a period of time sufficient to allow development of the
  • the present invention also provides methods of mobilizing a PiggyBac transposon in a non-human vertebrate, comprising: mating a first transgenic, non-human vertebrate comprising in the genome of one or more of its germ cells a PiggyBac transposon, wherein the PiggyBac transposon comprises a nucleotide sequence, that when integrated into the genome, modifies a trait in the transgenic, non-human vertebrate, with a second transgenic, non-human vertebrate comprising in the genome of one or more of its germ cells a nucleotide sequence encoding a PiggyBac transposase to yield one or more progeny; identifying at least one of the one or more progeny comprising in the genome of one or more of its cells both the PiggyBac transposon and the nucleotide sequence encoding the PiggyBac transposase, such that the PiggyBac transposase is expressed and the transposon is mobilized
  • transgenes are introduced into the pronuclei of fertilized oocytes.
  • In animals such as mice, fertilization is performed in vivo and fertilized ova are surgically removed.
  • in vitro fertilization permits a transgene to be introduced into substantially synchronous cells at an optimal phase of the cell cycle for integration (not later than S-phase).
  • Transgenes are usually introduced by microinjection (see, U.S. Patent No. 4,873,292). Fertilized oocytes are cultured in vitro until a pre-implantation embryo is obtained containing about 16-150 cells.
  • Pre-implantation embryos can be stored frozen for a period pending implantation. Pre-implantation embryos are transferred to an appropriate female resulting in the birth of a transgenic or chimeric animal depending upon the stage of development when the transgene is integrated. Chimeric mammals can be bred to form true germline transgenic animals.
  • the PiggyBac transgenes described above are introduced into nonhuman mammals. Most nonhuman mammals can be used, including rodents such as mice and rats, rabbits, sheep, goats, pigs, and cattle.
  • transgenes can be introduced into embryonic stem cells (ES) or SS cells, or iPS cells, or spermatogonial stem cells, etc. These cells are obtained from preimplantation embryos cultured in vitro (Bradley et al., Nature, 1984, 309, 255-258). Transgenes can be introduced into such cells by electroporation or microinjection.
  • Transformed ES cells are combined with blastocysts from a nonhuman animal.
  • the ES cells colonize the embryo and in some embryos form the germ line of the resulting chimeric animal (Jaenisch, Science, 1988, 240, 1468-1474).
  • ES cells can be used as a source of nuclei for transplantation into an enucleated fertilized oocyte giving rise to a transgenic mammal.
  • the transgenes can be introduced simultaneously using the same procedure as for a single transgene.
  • the transgenes can be initially introduced into separate animals and then combined into the same genome by breeding the animals.
  • a first transgenic animal is produced containing one of the transgenes.
  • a second transgene is then introduced into fertilized ova or embryonic stem cells from that animal.
  • Transgenic mammals can be generated conventionally by microinjecting the above-described transgenes into mammals' fertilized eggs (those at the pronucleus phase), implanting the eggs in the oviducts of female mammals (recipient mammals) after a brief additional incubation, or directly in their uteri synchronized to pseudopregnancy, and obtaining the offspring.
  • Transgenic mammals can be generated by introducing the above-described transgenes into spermatogonial stem cells, implanting the cells into the gonads of previously sterilized male mammals, and mating after incubation, and obtaining the offspring.
  • the transgenic mammals generated can be propagated by conventionally mating and obtaining the offspring, or transferring nuclei (nucleus transfer) of the transgenic mammal's somatic cells, which have been initialized or not, into fertilized eggs of which nuclei have previously been enucleated, implanting the eggs in the oviducts or uteri of the recipient mammals, and obtaining the clone offspring.
  • Transformed cells and/or transgenic organisms, such as those containing the DNA inserted into the host cell's DNA, can be selected from untransformed cells and/or untransformed organisms if a selectable marker is included as part of the introduced DNA sequences.
  • Selectable markers include, for example, genes that provide antibiotic resistance; genes that modify the physiology of the host, such as for example green fluorescent protein, to produce an altered visible phenotype. Cells and/or organisms containing these genes are capable of surviving in the presence of antibiotic, insecticides or herbicide concentrations that kill untransformed cells/organisms or producing an altered visible phenotype.
  • DNA can be isolated from transgenic cells and/or organisms to confirm that the introduced DNA has been inserted.
  • Retroviral vectors derived from the MoMuLV retrovirus contain elements for expression and packaging of the RNA genome, but lack genes that enable replication. These vectors contain viral long terminal repeats (LTRs), a psi packaging signal that regulates encapsidation of the RNA, a site for cloning the cDNA of interest, and typically, a selectable marker such as the neomycin phosphotransferase gene (Neo). These viral vectors are transfected (as plasmid DNA) into a special packaging cell line that expresses genes necessary for encapsidation of the RNA genome into infectious virions.
  • the virus is then harvested by collecting the supernatant from transfected packaging cells. Upon infecting a susceptible target cell one or more viral particles fuse with the cell membrane and are uncoated.
  • the nucleocapsid (uncoated virion) enters the nucleus, where it is reverse-transcribed into DNA, and integrates into the genome as a permanent proviral insertion. Since the LTRs of viral vectors act as a strong promoter in many cells, the cDNA of choice is expressed. The proviral insertion cannot produce more infectious virus particles.
  • a packaging line derived from 293T cells which produces high-titer virus (>10^7 cfu/ml) from transiently transfected plasmid DNA is used herein (Morita et al., Gene Ther., 2000, 7, 1063-6). Transient transfection is desired to maintain library diversity.
  • the level of efficiency of the retroviral packaging cell lines is achieved through enhanced expression of the gag-pol and env retroviral genes in a cell line (293T) that is easily transfected at high efficiency. This cell line outperforms two previously designed packaging cell lines, Bosc23 and Phoenix-Eco.
  • the retroviral packaging cell line produces replication-defective ecotropic viruses, which only infect mouse and rat cells via the ecotropic retroviral receptor, and are, thus, quite safe.
  • each distinct variant should be easily discriminated in an appropriate cell-based assay.
  • For selection of a hyperactive transposase, one would want to select a transposase that can rapidly and efficiently mobilize a transposon from one genomic location to another. Thus, for a given experimental period, a hyperactive transposase should yield a greater number of transposon integrations per cell than a mediocre transposase.
  • Described herein is a special transposon-based selection system, which when integrated, yields either: 1) a green fluorescent protein (GFP) signal proportional to the number of integrations per cell, or 2) variable resistance to the toxic alkaloid colchicine, which is likewise proportional to the number of integrations per cell.
  • To compare transposase variants within each library, the intensity of GFP fluorescence was used as a read-out of transposase efficiency.
  • Two versions of the transposon (BII-sd2GFP and BII-sMdr1) have been created, each containing the PiggyBac ITRs recognized by the PiggyBac transposase. These genetraps express either an EGFP gene or a human multidrug resistance gene (Mdr1 or ABCB1) upon integration within a gene and upstream of a polyadenylation (polyA) signal, and has been designed
  • a hyperactive transposase drives the insertion of multiple copies of the transposon, yielding more GFP signal or a higher Mdr1 gene dosage.
  • the GFP protein is destabilized by a C-terminal PEST domain, which causes rapid turnover. This yields a larger dynamic range for measuring copy-number dependent expression of GFP; this allows one to easily discriminate between low-expressors (low copy-number) and high-expressors (high copy-number).
  • Selection for high-copy-number transposition events of the BII-sMdr1 transposon is achieved through stringent drug selection with colchicine, a microtubule-depolymerizing toxic alkaloid.
  • Mdr1 is a glycoprotein transporter that confers resistance to a variety of drugs, including chemotherapeutic compounds, by reducing intracellular levels of the drug (Metz et al., Virology, 1995, 208, 634-43; Pastan et al., Proc. Natl. Acad. Sci. USA, 1988, 85, 4486-90; and Ueda et al., Proc. Natl. Acad. Sci. USA, 1987, 84, 3004-8).
  • the reporter (GFP or Mdr1) is driven by a constitutive promoter that lacks CpG dinucleotides, and thus cannot be silenced by methylation.
  • the genetrap is flanked by the core insulator from the chicken beta-globin locus control region hypersensitive site IV (cHSIV) (Burgess-Beusse et al., Proc. Natl. Acad. Sci. USA, 2002, 99, 16433-7; Chung et al., Proc. Natl. Acad. Sci. USA, 1997, 94, 575-80; and Chung et al., Cell, 1993, 74, 505-14).
  • the cHSIV insulator insulates against any adjacent enhancers to reduce variability in expression (enhancer-blocking activity). This cHSIV element also prevents the encroachment of gene-silencing heterochromatin (insulator activity) that could silence expression (Burgess-Beusse et al., Proc. Natl. Acad. Sci. USA, 2002, 99, 16433-7; Chung et al., Proc. Natl. Acad. Sci. USA, 1997, 94, 575-80; and Chung et al., Cell, 1993, 74, 505-14).
  • the polyA trap components include an internal ribosomal entry site (IRES) from the encephalomyocarditis virus downstream of the GFP/Mdr1 open reading frame (ORF).
  • the polyA trap also includes a splice donor from the exon 1/intron 1 boundary of the adenovirus type 2 (Ad2) late major transcript, which enables the reporter transcript to splice with the splice acceptor of a trapped exon.
  • an mRNA instability signal from the 3' untranslated region (UTR) of the mouse Csf3 gene was also included, which causes active deadenylation and degradation of the mRNA transcript.
  • the use of an mRNA instability signal from the Csf3 gene was effectively employed for reducing transcript levels in a previous design of a polyA trap (Ishida et al., Nucleic Acids Res., 1999, 27, e35).
  • Prior to mobilization, the transposon can be introduced into cells in two ways: 1) as a multi-copy tandem array (concatemer) integrated into the genome of cells at an intergenic region (where the polyA trap does not have any genes nearby for splicing and transcript stabilization) or 2) as a transfected circular plasmid.
  • For the BII-sEGFP transposon, one can easily distinguish the absence of (or weak) GFP expression produced by an unmobilized transposon from the increased GFP expression that occurs when the transposon integrates into a gene. Fluorescence is therefore a surrogate marker of transposase activity and enables cell sorting by FACS.
  • the efficient mobilization of the BII-sMdr1 transposon is easily screened by treating cells with increasing amounts of colchicine; only PiggyBac transposases that can mobilize multiple copies of the BII-sMdr1 transposon are isolated following stringent selection with high doses of colchicine. Those cells exhibiting a high level of GFP fluorescence or tolerating high doses of colchicine can then be collected.
  • transposons contain components for driving expression of either the destabilized GFP (d2GFP), EGFP, or human Mdr1 protein upon insertion within a gene, and upstream of a polyadenylation signal.
  • Expression of either cDNA is driven by a CpG-less promoter that consists of a mouse cytomegalovirus (CMV) enhancer and a basal promoter from the human Ef1α gene.
  • Downstream of the d2GFP, EGFP, or Mdr1 cDNA is an IRES that prevents nonsense-mediated decay of hybrid transcripts.
  • the splice donor (SD) is from the human Adenovirus type 2 late major transcript, and efficiently permits splicing with 3' trapped exons.
  • An mRNA instability signal from the Csf3 gene (zigzag) minimizes transcript levels when no polyA trap occurs.
  • ITRs inverted terminal repeats
  • a high-throughput fluorescence-activated cell sorting (FACS) assay that measures the intensity of EGFP fluorescence as a read-out of transposase efficiency was developed. This was accomplished by using a polyA-trap genetrap transposon, called the sEGFP transposon. sEGFP transposons were created that are flanked by the PB inverted terminal repeats (ITRs). This genetrap expresses EGFP upon integration within a gene and upstream of a polyadenylation (polyA) signal, and has been designed to yield copy-number dependent expression.
  • a hyperactive transposase will drive the insertion of multiple copies of the transposon, yielding more EGFP expression, and thus a brighter fluorescence signal.
  • Copy number-dependent and polyA-dependent expression is conferred by several additional components.
  • EGFP cDNA is driven by a CpG-less promoter (consisting of a mouse cytomegalovirus (CMV) enhancer and a basal promoter from the human Ef1α gene).
  • Downstream of the EGFP ORF is an IRES that prevents nonsense-mediated decay of hybrid transcripts following a successful genetrap event.
  • the splice donor (SD) is from the human Adenovirus type 2 late major transcript, and permits efficient splicing when integration occurs 5' of an exon.
  • An mRNA instability signal from the Csf3 gene (zigzag) destabilizes untrapped transcripts.
  • the cHSIV insulator (blue ovals) promotes consistent copy-number dependent expression.
  • the inverted terminal repeats (ITRs, arrows) from the SB, TniPB, or TcB transposons provide mobilization by the respective transposases.
  • In FIG. 2B, FACS analysis is shown.
  • 2.5 x 10^6 NIH3T3 cells were electroporated with 2.5 µg of the sEGFP transposon flanked by piggyBac ITRs, along with 1.5 µg of an expression plasmid containing a PB transposase (pCMV-PBoM226F) or an empty vector (pCMV empty), and then assayed by flow cytometry 72 hours later.
  • Most EGFP+ cells were between 10- and 100-fold below the detection maximum.
  • PiggyBac-like polypeptide sequences from the following species were analyzed for phylogenetic comparison: Trichoplusia ni
  • Gasterosteus aculeatus ChngroupXX, 10624991-10626727
  • Ciona savignyi reftig_140, 9946-11658
  • Ciona intestinalis NW_001955008.1, 51209-52888, and NW_001955804.1, 1016-2667
  • Anopheles gambiae NZ_AAAB02008849, 2962684-2964410
  • Tribolium castaneum NW_001092821.1, 1981685-1983397
  • Myotis lucifugus GeneScaffold_410, 157651-159369.
  • The comparison covered PiggyBac-like transposases that each contain a critical DDD motif (D268, D346, D447) and a tryptophan residue (W465), all of which are necessary for Trichoplusia ni PiggyBac enzyme activity. Protein sequences were aligned by the ClustalW method. Individual amino acid positions of the PiggyBac transposase sequence were deemed divergent if no more than five PiggyBac-like sequences (from species other than Trichoplusia ni) shared the PiggyBac amino acid (or a highly similar amino acid) with Trichoplusia ni (T. ni) at a given position, when no fewer than seven non-T. ni transposases contained the identical (or highly similar) amino acid for that position. Such commonly shared amino acids at a given position among these PiggyBac-like transposases represent a "consensus" amino acid sequence. In addition, at least twice as many species must contain the consensus amino acid, compared to the number of species (including T. ni) that share an identical non-consensus amino acid. The rational substitution of individual amino acids was deduced from divergent positions, such that PiggyBac sequences were reverted, or restored, to the consensus.
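The divergence rule described above can be sketched for a single alignment column. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, "highly similar" residues are not modeled (only identical residues are counted), and the "twice as many species" condition is interpreted as comparing the consensus count against the largest group sharing one non-consensus residue, with T. ni counted in its own group.

```python
from collections import Counter

def consensus_substitution(t_ni_residue, other_residues,
                           max_sharing_t_ni=5, min_consensus=7, ratio=2):
    """Return the consensus residue to restore at one aligned position,
    or None if the position does not qualify as divergent.

    t_ni_residue   -- residue in the T. ni sequence at this column
    other_residues -- residues of the non-T. ni sequences ('-' marks a gap)
    Thresholds follow the text; identity only, no similarity groups (assumption).
    """
    counts = Counter(r for r in other_residues if r != "-")
    if not counts:
        return None
    consensus, n_consensus = counts.most_common(1)[0]
    if consensus == t_ni_residue:
        return None                      # T. ni already matches the consensus
    if counts[t_ni_residue] > max_sharing_t_ni:
        return None                      # too many sequences agree with T. ni
    if n_consensus < min_consensus:
        return None                      # consensus not widely enough shared
    # Largest group sharing one non-consensus residue, counting T. ni itself.
    rivals = Counter(counts)
    rivals[t_ni_residue] += 1
    del rivals[consensus]
    if n_consensus < ratio * max(rivals.values()):
        return None
    return consensus

# Hypothetical column: T. ni has A, eight other sequences have L, two have A.
print(consensus_substitution("A", ["L"] * 8 + ["A"] * 2))
```

Applying such a function column by column over a ClustalW alignment would give a list of candidate reversions; the residue groupings used for "highly similar" would need to be supplied separately.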
  • StEP staggered extension process
  • ni PiggyBac amino acid sequence ( Figure 1).
  • For StEP, equimolar concentrations of the PBo and the PB-var DNA were combined in 8 to 12 separate reactions containing a primer pair that flanks two unique SfiI sites and the PiggyBac transposase sequence.
  • Phusion (NEB) high-fidelity polymerase was used. Thermocycling used this program: 98°C for 60 seconds, followed by 198 cycles of 98°C for 15 seconds and 56°C for 5 seconds.
  • the short 56°C step incorporates primer annealing and a very brief period of polymerase extension, producing small fragments, which after denaturation then randomly anneal to homologous regions (template switching). 198 cycles are desired to generate full-length hybrid fragments at a length of 2 kb.
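As a rough illustration, the thermocycling program can be written down as data and its programmed hold time totalled. The sketch assumes the program reads as one initial 98°C/60 s denaturation followed by 198 two-step cycles; the class and function names are hypothetical, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    temp_c: int      # block temperature in degrees Celsius
    seconds: int     # programmed hold time

# Assumed reading of the program described in the text.
INITIAL_DENATURATION = Step(98, 60)
CYCLE = (Step(98, 15), Step(56, 5))   # denature, then anneal/brief extension
N_CYCLES = 198

def hold_time_seconds(initial, cycle, n_cycles):
    """Sum of programmed hold times, excluding temperature ramps."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)

total = hold_time_seconds(INITIAL_DENATURATION, CYCLE, N_CYCLES)
print(f"{total} s of programmed holds ({total / 60:.0f} min), ramps excluded")
```

The many very short annealing/extension steps are what drive template switching; the actual run time depends on the ramp rate of the instrument used.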
  • PCR reactions were digested with DpnI, run on a 1% agarose gel, extracted, and pooled; if the DNA yield was too low, a standard high-fidelity PCR was performed on this template. DNA was digested with SfiI and ligated into the retroviral expression vector pLXIN (Clontech) that was modified by adding two unique and compatible SfiI sites.
  • This ligation was then purified by agarose gel electrophoresis, and 50 to 100 ng of DNA was electroporated into each of 4 aliquots of DH10B Mega-X (Invitrogen) electrocompetent cells, which were then recovered by shaking at 37°C for 45 minutes. Serial dilutions (10^-2, 10^-3, 10^-4) of each transformation were plated onto LB-Ampicillin to determine the number of colony forming units (cfu) per ml and to estimate potential diversity. Approximately 5 x 10^6 cfu from the library was then amplified in semisolid LB agar and DNA isolated using the PureYield Plasmid Midiprep Kit (Promega). Prior to DNA isolation, a small aliquot of the amplified transformation was plated on LB agar plus Ampicillin, and 24 colonies were picked and sequenced to assess diversity and recombination frequency between each substitution (the G0 library).
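The cfu/ml estimate from a dilution plate follows the usual plate-count arithmetic: colonies divided by the product of the dilution factor and the plated volume. The sketch below is illustrative only; the function names, colony count, plated volume, and recovery volume are hypothetical values, not figures from the text.

```python
def cfu_per_ml(colonies, dilution, plated_volume_ml):
    """Estimate cfu/ml of the undiluted transformation from one plate.

    colonies         -- colonies counted on the plate
    dilution         -- dilution factor of the plated sample, e.g. 1e-3
    plated_volume_ml -- volume spread on the plate, in ml
    """
    if colonies < 0 or dilution <= 0 or plated_volume_ml <= 0:
        raise ValueError("counts must be non-negative and factors positive")
    return colonies / (dilution * plated_volume_ml)

def total_cfu(titer_cfu_per_ml, recovery_volume_ml):
    """Total transformants in the recovery culture, an upper bound on library diversity."""
    return titer_cfu_per_ml * recovery_volume_ml

# Hypothetical example: 150 colonies from 0.1 ml of the 10^-4 dilution,
# with a 1 ml recovery culture.
titer = cfu_per_ml(150, 1e-4, 0.1)
print(f"{titer:.2e} cfu/ml, {total_cfu(titer, 1.0):.2e} cfu in total")
```

Plates with very few or very many colonies give unreliable counts, which is one reason several dilutions are plated.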
  • PB-var sequence was shuffled with a wildtype PBo sequence. After shuffling, DNA libraries were ligated via directional SfiI sites into a modified pLXIN (Clontech) vector. Following transformation of each library into Mega-X electrocompetent DH10B cells (Invitrogen), 5 x 10^6 cfu were amplified for the G0 PB library, in semi-solid agar. Plasmid DNA will be isolated and used to transfect the retroviral packaging cell line for retrovirus production.
  • transduction of the library into NIH-3T3 cells enables each individual transposase to be assayed in a single cell, if one adjusts the multiplicity of infection (MOI) such that each cell receives, on average, one functional proviral insertion.
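Why an average of one insertion per cell still leaves a mixed population can be illustrated with the Poisson model that is commonly assumed for viral infection. The model itself is an assumption brought in here, not something stated in the text, and the function names are hypothetical.

```python
import math

def poisson_fraction(moi, k):
    """Fraction of cells expected to carry exactly k proviral insertions
    when insertions per cell follow a Poisson distribution with mean `moi`."""
    return math.exp(-moi) * moi**k / math.factorial(k)

def infection_summary(moi):
    """Split the population into uninfected, singly and multiply infected cells."""
    none = poisson_fraction(moi, 0)
    single = poisson_fraction(moi, 1)
    return {"uninfected": none, "single": single, "multiple": 1.0 - none - single}

for moi in (0.1, 0.3, 1.0):
    summary = infection_summary(moi)
    print(moi, {name: round(value, 3) for name, value in summary.items()})
```

Under this model, lowering the MOI reduces the share of cells that carry more than one transposase variant, at the cost of more uninfected cells.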
  • a hyperactive transposase drives the mobilization of multiple transposon insertions per cell.
  • the sEGFP transposon which is capable of driving copy-number dependent expression, serves as a surrogate read-out of transposase efficiency, as determined by the intensity of EGFP fluorescence. Transduced cells expressing a library of variants can thus be functionally sorted using FACS.
  • the retroviral packaging cells in 10 cm dishes were transiently transfected with 7.5 µg of the amplified and purified plasmid library using LipoD293 Transfection Reagent (SignaGen). Viral supernatant was then collected 48 hours later, gently filtered through a 0.45 µm filter, and mixed with polybrene (8 µg/ml final concentration). Viral titers were assessed by infecting NIH-3T3 cells with serial dilutions (10^-1, 10^-2, 10^-3, 10^-4, 10^-5, 10^-6) of the viral supernatant in 6-well plates. A titer between 5 x 10^5 and 1 x 10^7 cfu/ml was typical.
  • GFP intensity was used as a surrogate marker of transposase efficiency.
  • the BII-sd2GFP or EGFP transposon was integrated into the genome of NIH-3T3 cells as a concatemer repeat, to establish a source of the transposon in a native chromatin environment. This was a standard procedure accomplished by transfecting NIH-3T3 cells with the BII-sd2GFP or EGFP transposon along with one-tenth (molar ratio) the amount of a plasmid containing a puromycin or hygromycin selectable marker.
  • Stable lines were selected by picking individual drug-resistant colonies that were not GFP-positive, as assessed by examination under an inverted fluorescence microscope. The copy number of the sd2GFP or EGFP transposon was then evaluated by quantitative PCR (QPCR). Clones with at least 25 copies were retained for further study. Each cell line was then evaluated for transposition by transfection of an expression vector containing the PB transposase. Successful mobilization of the BII-sd2GFP or EGFP transposon was then evaluated by flow cytometry on a Becton Dickinson FACSCalibur (BD Biosciences). Alternatively, BII-sd2EGFP or EGFP transposons can be used in these assays when transiently transfected as a plasmid.
  • Retroviral libraries were used to infect approximately 5 x 10^6 NIH-3T3-sd2GFP or NIH-3T3-EGFP cells at an MOI of one. After 8 hours, the medium was changed; after 16 hours the medium was changed again, and the cells were then incubated for an additional 48 hours. Cells were sorted for the brightest GFP fluorescence (if at least 100-fold over background) on a FACSVantage SE (BD Biosciences). Genomic DNA was isolated from these cells and the transposase coding sequence within proviral insertions was amplified by PCR using Phusion polymerase (NEB). PCR products were digested with SfiI and ligated into the pLXIN retroviral vector, and a second retroviral library was produced as above.
  • the first generation (G1) library was produced by selection in a puromycin assay to acquire the mutation frequency described below in Table 1.
  • the retroviral library was then put through an assay to infect NIH-3T3-sd2GFP or NIH-3T3-EGFP cells and the process repeated (in succession) to produce subsequent generations (G2, G3), with three
  • BII-sEGFP polyA-trap transposon
  • EGFP: enhanced (non-destabilized) GFP
  • the EGFP protein can be used to enable more sensitive detection of single transposition events in single cells.
  • HEK293T cells were transfected with plasmids containing the PB transposon (BII-sEGFP) and each mutant PB transposase.
  • Approximately 3 x 10^5 HEK293T cells were transfected in 6-well plates with calcium phosphate, and the GFP fluorescence was analyzed 72 hours later by FACS.
  • the mutant PiggyBac transposases yielded increased mobilization of the BII-sEGFP transposon, as measured by the number of GFP positive cells. These assays illustrate an increased ability of these mutant transposases to mobilize the BII-sEGFP transposon.
  • This hyperactivity may be due to an enhanced stability of the transposase, an increase in the catalytic efficiency, and/or an augmented preference for integration within genes (which would yield more GFP signal). Any of these features would be desirable for performing mutagenesis in vertebrate or mammalian cells. A refining process occurs through this repeated cycling, which culminates in a small collection of hyperactive transposases. No more than four generations were analyzed.
  • Proviral insertions from the final population of sorted cells were amplified by PCR and clones sequenced to identify each hyperactive transposase.
  • the primary (G0) library for PB was initially screened for the ability to mobilize a PB transposon containing a PAC expression cassette (BII-SV-Puro), which confers puromycin resistance in NIH-3T3 cells.
  • To screen PB libraries using the transposons containing the PAC expression cassettes, approximately 1.2 x 10^7 NIH-3T3 cells were electroporated with the respective SVPuro transposon and split between four T175 flasks. Twenty-four hours after electroporation, cells were infected with the PB retroviral libraries. Puromycin selection yielded hundreds of thousands of surviving cells from each library. In the absence of transposase expression (i.e., following infection with a virus lacking a transposase open reading frame), no cells survived puromycin selection.
  • genomic DNA was isolated and the provirus was PCR amplified with Phusion Hot Start (Finnzymes) using primers that flank the unique pair of SfiI sites adjacent to the coding region of the transposase.
  • the amplified transposases were subcloned into the pLXIN vector for production of the secondary and tertiary transposase libraries.
  • the generation of a G1, G2, and G3 library for the PB transposase was accomplished. Individual clones from the ensuing G1 (24 clones), G2 (10 clones), or G3 (24 clones) libraries were sequenced to gauge mutation abundance.
  • Table 1 shows a quantitative summary of the frequency of mutations in each of the G0, G1, G2, and G3 libraries.
  • the expected frequency in the G0 library was 50% for each of the mutations.
  • the frequency of the mutations listed in Table 1 in the G2 and G3 libraries, among other mutations, suggests that mutations within these domains 1-119 and 436-503 are heavily selected for potential hyperactivity and far exceed the expected frequencies that should have occurred during the analytics.
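One way to judge whether a mutation exceeds the 50% expected starting frequency is a one-sided binomial tail probability. The sketch below is not part of the described method; the function name is hypothetical and the clone counts in the example are made-up illustrations, not values from Table 1.

```python
from math import comb

def binomial_tail(n_clones, n_with_mutation, expected_freq=0.5):
    """Probability of seeing at least `n_with_mutation` carriers among
    `n_clones` sequenced clones if each clone carried the mutation
    independently with probability `expected_freq`."""
    if not 0 <= n_with_mutation <= n_clones:
        raise ValueError("carrier count must lie between 0 and n_clones")
    return sum(
        comb(n_clones, i)
        * expected_freq**i
        * (1 - expected_freq) ** (n_clones - i)
        for i in range(n_with_mutation, n_clones + 1)
    )

# Hypothetical example: a mutation found in 20 of 24 sequenced clones.
print(f"P(>= 20 of 24 | p = 0.5) = {binomial_tail(24, 20):.2e}")
```

A small tail probability would be consistent with selection for that mutation, although with only 10 to 24 clones per library the estimate is coarse.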
  • the hyperactivity of isolated PB transposases will be compared to wildtype T. ni PBo by FACS assays in HEK293T cells and in a chromosomal transposition assay in HeLa cells (Baus et al., Molecular Therapy, 2005, 12, 1148-1156; Ivics et al., Cell, 1997, 91, 501-10; Zayed et al., Mol. Ther., 2004, 9, 292-304; and Yant et al., Mol. Cell. Biol., 2004, 24, 9239-47).
  • a Neo cassette in the transposon confers G418-resistance while the transposase is expressed from a separate plasmid.

Abstract

The invention relates to PiggyBac transposase proteins, nucleic acids encoding them, compositions comprising them, kits comprising them, non-human transgenic animals comprising them, and methods of using them.
PCT/US2011/061054 2010-11-16 2011-11-16 Hyperactive piggyBac transposases WO2012074758A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US41439910P 2010-11-16 2010-11-16
US61/414,399 2010-11-16

Publications (1)

Publication Number Publication Date
WO2012074758A1 true WO2012074758A1 (fr) 2012-06-07

Family

ID=46172217

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2011/061054 WO2012074758A1 (fr) 2010-11-16 2011-11-16 Hyperactive piggyBac transposases

Country Status (1)

Country Link
WO (1) WO2012074758A1 (fr)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200190524A1 (en) * 2014-04-09 2020-06-18 Dna2.0, Inc. Dna vectors, transposons and transposases for eukaryotic genome modification
WO2021110119A1 (fr) * 2019-12-04 2021-06-10 上海细胞治疗集团有限公司 Highly active transposase and application thereof
WO2021178707A1 (fr) * 2020-03-04 2021-09-10 Poseida Therapeutics, Inc. Compositions and methods for the treatment of metabolic liver disorders
WO2021218000A1 (fr) 2020-04-30 2021-11-04 深圳市深研生物科技有限公司 Producer cell and packaging cell for retroviral vector and preparation method therefor
WO2022012758A1 (fr) * 2020-07-17 2022-01-20 Probiogen Ag Hyperactive transposons and transposases
US11913015B2 (en) 2017-04-17 2024-02-27 University Of Maryland, College Park Embryonic cell cultures and methods of using the same

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010099301A2 (fr) * 2009-02-25 2010-09-02 The Johns Hopkins University PiggyBac transposon variants and methods of use
US20100287633A1 (en) * 2009-02-26 2010-11-11 Transposagen Bio Hyperactive PiggyBac Transposases

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DATABASE UNIPROTKB/TREMBL 29 April 2008 (2008-04-29), Database accession no. B1P5D5_9NEOP *

Similar Documents

Publication Publication Date Title
US11485959B2 (en) Hyperactive piggybac transposases
US20230348867A1 (en) Transposon, gene transfer system and method of using the same
SG174155A1 (en) Piggybac transposon variants and methods of use
WO2012074758A1 (fr) Hyperactive piggyBac transposases
US6576463B1 (en) Hybrid vectors for gene therapy
US7951927B2 (en) Reconstructed human mariner transposon capable of stable gene transfer into chromosomes in vertebrates
Converse et al. Counterselection and co-delivery of transposon and transposase functions for Sleeping Beauty-mediated transposition in cultured mammalian cells
US11882815B2 (en) Recombinant adeno-associated viruses for delivering gene editing molecules to embryonic cells
EP2025748A1 (fr) Hyperactive variants of the transposase protein of the transposon system Sleeping Beauty
US20230081547A1 (en) Non-human animals comprising a humanized klkb1 locus and methods of use
US20050071895A1 (en) Generation of transgenic mice by transgene-mediated rescue of spermatogenesis

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11844828

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 20/09/2013)

122 Ep: pct application non-entry in european phase

Ref document number: 11844828

Country of ref document: EP

Kind code of ref document: A1