WO2010028245A2 - Altered phic31 integrases having improved efficiency and specificity and methods of using same - Google Patents

Altered phic31 integrases having improved efficiency and specificity and methods of using same

Publication number
WO2010028245A2
Authority
WO
WIPO (PCT)
Prior art keywords
site
integrase
vector
seq
pseudo
Application number
PCT/US2009/056040
Other languages
French (fr)
Inventor
Michele P. Calos
Original Assignee
Poetic Genetics, Llc.
Application filed by Poetic Genetics, Llc.
Publication of WO2010028245A2

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases RNAses, DNAses
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/87: Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
    • C12N15/90: Stable introduction of foreign DNA into chromosome
    • C12N15/902: Stable introduction of foreign DNA into chromosome using homologous recombination

Definitions

  • DNA into the chromosomes of higher organisms is holding up advances in basic and applied biology.
  • Recently strategies for chromosomal integration that take advantage of the high efficiency and tight sequence specificity of recombinase enzymes isolated from microorganisms have been described.
  • a class of phage integrases that includes the φC31 integrase (Kuhstoss, S., and Rao, R. N., J. Mol. Biol. 222, 897-908 (1991); Rausch, H., and Lehmann, M., Nucleic Acids Research 19, 5187-5189 (1991)) has been shown to function in mammalian cells (Groth, A. C., et al., Proc. Natl. Acad. Sci. USA 97, 5995-6000 (2000)).
  • the present disclosure addresses these needs by providing altered φC31 integrases that can be used more effectively in genetic engineering of the chromosomes of higher cells.
  • the present invention relates to the identification, isolation, cloning, expression, purification, and methods of use of altered φC31 integrases.
  • the present invention is directed to a method of site-specifically integrating a polynucleotide sequence of interest in a genome of a target cell using an altered φC31 integrase of the present invention.
  • FIG 1 shows schematic diagrams of screens used to characterize φC31 integrase mutants.
  • Panel A shows a schematic of a bacterial pre-screen for functional integrases.
  • pFC1 was the recipient plasmid for libraries of mutant integrases. Recombination between the attB and attP sites mediated by a functional mutant integrase led to excision of the lacZ gene when transformed into E. coli, giving rise to a white colony on X-gal plates.
  • Panel B shows a schematic of a mammalian screen for recombination efficiency. DNA isolated from white colonies (pFC-M) was co-transfected into 293 cells with pBP-Green.
  • pBP-Green carries the eGFP gene and a CMV promoter in the inactive orientation, flanked by attP and attB. Recombination between the att sites, mediated by a functional mutant integrase, "flips" the CMV promoter, allowing eGFP expression, detected and quantified by a fluorescence analyzer.
  • FIG 2 shows schematics of plasmids.
  • Panel A shows the pFC-P1, pFC-P2, and pFC-P3 plasmids that express the mutant integrases P1, P2, and P3 under the control of the CMV and lacZ promoters and also carry the chloramphenicol resistance gene.
  • Panel B shows the pCS-P1, pCS-P2, and pCS-P3 plasmids that express the mutant integrases P1, P2, and P3 having the N-terminal fusion sequence and under the control of the CMV promoter on the ampicillin-resistant pCS vector backbone.
  • Panel C shows the pCS-dP1, pCS-dP2, and pCS-dP3 plasmids that express the mutant integrases dP1, dP2, and dP3, deleted for the N-terminal fusion sequence, on a pCS backbone vector.
  • Panel D shows the pNC-attB donor plasmid containing the φC31 attB site immediately downstream of a CMV promoter, an eGFP gene driven by a second CMV promoter, and the kanamycin resistance gene.
  • Panel E shows the pVFB donor plasmid carrying the φC31 attB site, the human factor IX gene expressed from the hAAT promoter, and a kanamycin resistance gene.
  • FIG 3 shows results of a Western blot that demonstrates expression of the N-terminal fusion sequence.
  • Each lane contains a HeLa cell extract from cells transfected with the following plasmids expressing an HA-tagged integrase: lane 1: pCSI-HA; lane 2: pCS-dP2-HA; lane 3: pCS-P2-HA; lane 4: pCS-dP3-HA; lane 5: pCS-P3-HA.
  • the blot was stained with antibodies that detect the HA tag and β-actin.
  • the lower band in all five lanes is the β-actin loading control.
  • FIGS 4A and 4B show results of integration into pseudo attP sites in cultured human cells.
  • FIG 4A shows that HeLa cells received a donor attB neomycin resistant plasmid and a plasmid expressing a wild-type or mutant integrase and were grown under G418 selection for 2 weeks. The average numbers of G418-resistant colonies obtained for each integrase are shown. Values of interest are ± SEM. Asterisks denote values that differ at p < 0.05 (Student's t test).
  • FIG 4B shows frequency of integration at five commonly used human pseudo attP sites. Transfection and selection were done as above using the three designated integrases.
  • PCR was performed on pooled G418-resistant colonies from plates that had been diluted as noted. PCR primers specific for detecting integration of the plasmid at each of the five sites were used, and the reactions were scored as positive or negative, depending on whether a band was obtained.
  • FIGS 5A and 5B show integration into the pre-integrated chromosomal attP site in 293-P3 cells.
  • FIG 5A shows results of 293-P3 cells that were co-transfected with pNC-attB and a wild-type or mutant integrase-expressing plasmid and were maintained in selection media containing G418 and zeocin for ~2 weeks. The average number of colonies resistant to both G418 + zeocin is shown. All values shown represent the mean ± SEM.
  • FIG 5B shows PCR analysis to determine specificity of integration at attP. 293-P3 cells were co-transfected with a G418-resistant attB donor plasmid and either pCS-P3 or pCS-dP3. Numbers of G418-resistant colonies are shown. After two weeks of selection with G418, ~50 colonies generated by each integrase were picked and individually analyzed by PCR using primers that detect integration at the chromosomal attP site. Values are ± SEM, and asterisks denote values that differ at p < 0.05 (Student's t test).
  • FIGS 6A and 6B show results of human factor IX expression in mouse liver.
  • FIG 6A shows levels of hFIX expression in mouse serum up to 90 days post-injection.
  • Animals were hydrodynamically injected with pVFB and pCSml, pCSI, pCS-P2, or pCS-P3.
  • Sera from blood samples were assayed by ELISA at various time points.
  • Asterisk (*) denotes p-values of interest that were < 0.05.
  • FIG 6B shows immunofluorescence staining of mouse liver sections for human factor IX. The top row shows DAPI stained nuclei, the middle row shows staining for human factor IX, and the bottom row is a merged image of the top and middle rows.
  • Panel 1, naïve control; panel 2, HBSS alone; panel 3, pVFB + pCSml; panel 4, pVFB + pCSI; panel 5, pVFB + pCS-P2; and panel 6, pVFB + pCS-P3.
  • FIG 7 shows evidence of integration of nucleic acid into a genomic pseudo attP site in mouse liver.
  • Panel A shows that PCR was carried out on DNA extracted from liver, using primers that detect the junction between the attB site of pVFB and mpsL1, a preferred pseudo attP site in the mouse genome.
  • Lane 1, liver DNA sample known to have attB-plasmid DNA integrated at mpsL1; lane 2, primers alone; lane 3, liver that received pVFB and pCSml; lane 4, liver that received pVFB and pCSI; lane 5, liver that received pVFB and pCS-P2; and lane 6, liver that received pVFB and pCS-P3.
  • Panel B shows PCR bands from lanes 1, 4, and 5 that were excised, purified, TOPO-cloned, and sequenced. The sequences depicted begin in attB and join genomic DNA within the mpsL1 site. The TT core cross-over sequence is in bold, and ":" represents small deletions seen in the cross-over region.
  • FIG 8 shows the nucleic acid sequence of the wild-type φC31 integrase (SEQ ID NO: 1
  • FIG 9 shows the amino acid sequence of the wild-type φC31 integrase (SEQ ID NO:
  • FIG 10 shows the nucleic acid sequence of the altered φC31 integrase P1 (SEQ ID NO:
  • FIG 11 shows the amino acid sequence of the altered φC31 integrase P1 (SEQ ID NO:
  • FIG 12 shows the nucleic acid sequence of the altered φC31 integrase P2 (SEQ ID NO: 1
  • FIG 13 shows the amino acid sequence of the altered φC31 integrase P2 (SEQ ID NO:
  • FIG 14 shows the nucleic acid sequence of the altered φC31 integrase P3 (SEQ ID NO: 1]
  • FIG 15 shows the amino acid sequence of the altered φC31 integrase P3 (SEQ ID NO:
  • FIGS 16A-16C show the DNA sequences of the full length φC31 attP (SEQ ID NO:08) (FIG 16A) and attB (SEQ ID NO: 10) (FIG 16B) sites, respectively, and a 59 bp wild-type φC31 attP site (SEQ ID NO: 11) (FIG 16C).
  • the TT core is indicated in upper case.
  • FIG. 17 shows approximately 475 bp of DNA sequence from human chromosome 8 that encompasses an exemplary φC31 integrase pseudo-attP site ψA (SEQ ID NO: 12).
  • the core TT sequence of the pseudo site is shown in bold.
  • Approximately 40 bp surrounding the core represent the minimal pseudo attP site.
  • FIG 18 shows the nucleic acid sequences of 19 different pseudo attP sites present in the human genome (SEQ ID NO: 13-31).
  • the site name identifies the chromosomal location of each pseudo attP site.
  • the top sequence is a consensus attP site, which is symmetrical about the core and contains inverted repeats extending over the length of the consensus, indicated by the arrows.
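The symmetry described for the consensus site can be illustrated with a short check. The following is a minimal sketch, assuming a site whose core sits exactly at the centre and whose arms are compared as reverse complements; the function names and the 12-nt toy sequence are hypothetical and are not taken from SEQ ID NO: 13-31.

```python
# Translation table for DNA base complementation.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_repeat_matches(site: str, core: str = "TT") -> tuple:
    """Count arm positions consistent with an inverted repeat around a centred core.

    Returns (matching_positions, arm_length). Assumption: the core lies
    exactly in the middle of the site, so both arms have equal length.
    """
    site = site.upper()
    core = core.upper()
    arm_len = (len(site) - len(core)) // 2
    left = site[:arm_len]
    middle = site[arm_len:arm_len + len(core)]
    right = site[arm_len + len(core):]
    if middle != core or len(right) != arm_len:
        raise ValueError("core is not centred in the site")
    # In a perfect inverted repeat the right arm equals the
    # reverse complement of the left arm.
    expected_right = reverse_complement(left)
    matches = sum(a == b for a, b in zip(right, expected_right))
    return matches, arm_len


# Hypothetical toy site: 5-nt arms flanking a TT core.
perfect = "ACGGT" + "TT" + "ACCGT"
imperfect = "ACGGT" + "TT" + "ACCGA"
print(inverted_repeat_matches(perfect))
print(inverted_repeat_matches(imperfect))
```

A score of this kind only measures sequence symmetry; it says nothing about whether a given site is actually used by the integrase.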
  • FIG 19 shows an amino acid sequence alignment of the wild-type φC31 integrase WT (SEQ ID NO:02) and the altered φC31 integrases P1 (SEQ ID NO:04), P2 (SEQ ID NO:06), and P3 (SEQ ID NO:08).
  • Recombinases are a family of enzymes that mediate site-specific recombination between specific DNA sequences recognized by the recombinase (Esposito, D., and Scocca, J. J., Nucleic Acids Research 25, 3605-3614 (1997); Nunes-Duby, S. E., et al., Nucleic Acids Research 26, 391-406 (1998); Stark, W. M., et al., Trends in Genetics 8, 432-439 (1992)).
  • "Altered recombinases" refer to recombinase enzymes in which the native, wild-type recombinase gene found in the organism of origin has been mutated in one or more positions.
  • An altered recombinase possesses a DNA binding specificity and/or level of activity that differs from that of the wild-type enzyme.
  • Such altered binding specificity permits the recombinase to react with a given DNA sequence differently than would the native enzyme, while an altered level of activity permits the recombinase to carry out the reaction at greater or lesser efficiency.
  • a recombinase reaction typically includes binding to the recognition sequence and performing concerted cutting and ligation, resulting in strand exchanges between two recombining recognition sites.
  • a "unidirectional site-specific recombinase" is a naturally-occurring recombinase, such as the φC31 integrase, a mutated or altered recombinase, such as a mutated or altered φC31 integrase that retains unidirectional, site-specific recombination activity, or a bi-directional recombinase modified so as to be unidirectional, such as a Cre recombinase that has been modified to become unidirectional.
  • altered recombinases and “mutant recombinases” are used interchangeably herein to refer to recombinase enzymes in which the native, wild-type recombinase gene found in the organism of origin has been mutated in one or more positions relative to a parent recombinase (e.g., in one or more nucleotides, which may result in alterations of one or more amino acids in the altered recombinase relative to a parent recombinase).
  • Parent recombinase is used to refer to the nucleotide and/or amino acid sequence of the recombinase from which the altered recombinase is generated.
  • the parent recombinase can be a naturally occurring enzyme (i.e., a native or wild-type enzyme) or a non-naturally occurring enzyme (e.g., a genetically engineered enzyme).
  • Altered recombinases of interest in the invention exhibit a DNA binding specificity and/or level of activity that differs from that of the wild-type enzyme or other parent enzyme. Such altered binding specificity permits the recombinase to react with a given DNA sequence differently than would the parent enzyme, while an altered level of activity permits the recombinase to carry out the reaction at greater or lesser efficiency.
  • a recombinase reaction typically includes binding to the recognition sequence and performing concerted cutting and ligation, resulting in strand exchanges between two recombining recognition sites.
  • Site-specific integration refers to the sequence specific recombination and integration of a first nucleic acid with a second nucleic acid, typically mediated by a recombinase.
  • site-specific recombination or integration occurs at particular defined sequences recognized by the recombinase.
  • site specific integration occurs at a particular sequence (e.g., a recombinase attachment site) at a higher efficiency.
  • a "wild-type recombination site" as used herein means a recombination site normally used by an integrase or recombinase.
  • lambda is a temperate bacteriophage that infects E. coli.
  • the phage has one attachment site for recombination (attP) and the E. coli bacterial genome has an attachment site for recombination (attB). Both of these sites are wild-type recombination sites for lambda integrase.
  • wild-type recombination sites occur in the homologous phage/bacteria system. Accordingly, wild-type recombination sites can be derived from the homologous system and associated with heterologous sequences, for example, the attB site can be placed in other systems to act as a substrate for the integrase.
  • the wild-type attB and attP recognition sites of phage φC31 are generally about 34 to 40 nucleotides in length (Groth et al. Proc Natl Acad Sci USA 97:5995-6000 (2000)). These sites are typically arranged as follows: AttB comprises a first DNA sequence attB5', a core region, and a second DNA sequence attB3', in the relative order from 5' to 3': attB5'-core region-attB3'.
  • AttP comprises a first DNA sequence attP5', a core region, and a second DNA sequence attP3', in the relative order from 5' to 3': attP5'-core region-attP3'.
  • the core region of attP and attB of φC31 has the sequence 5'-TTG-3'.
  • Action of the integrase upon these recognition sites is unidirectional in that the enzymatic reaction produces nucleic acid recombination products that are not effective substrates of the integrase. This results in stable integration with little or no detectable recombinase-mediated excision, i.e., recombination that is "unidirectional".
  • the recombination product of integrase action upon the recognition site pair comprises, for example, in order from 5' to 3': attB5'-recombination product site sequence-attP3', and attP5'-recombination product site sequence-attB3'.
  • a typical recombination product comprises the sequence (from 5' to 3'): attP5'-TTG-attB3' {targeting vector sequence} attB5'-TTG-attP3'.
  • recombination results in a hybrid site-specific recombination site (designated attL or attR for left and right, respectively) that is neither an attB sequence nor an attP sequence, and is functionally unrecognizable as a site-specific recombination site (e.g., attB or attP) to the relevant unidirectional site-specific recombinase, thus removing the possibility that the unidirectional site-specific recombinase will catalyze a second recombination reaction between the attL and the attR that would reverse the first recombination reaction.
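The arm exchange that produces the hybrid sites can be sketched as a small data model. This is an illustrative sketch only: the `AttSite` class, the `recombine` function, and the four-letter arm sequences are hypothetical placeholders (they are not the real φC31 arm sequences); only the 5'-TTG-3' core is taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttSite:
    """An attachment site modelled as 5' arm, core, and 3' arm."""
    name: str
    five_prime: str
    core: str
    three_prime: str

    def sequence(self) -> str:
        return self.five_prime + self.core + self.three_prime


def recombine(att_b: AttSite, att_p: AttSite) -> tuple:
    """Exchange the arms of attB and attP at their shared core.

    Returns the two hybrid products: attB5'-core-attP3' and
    attP5'-core-attB3'. Neither product is an attB or an attP site.
    """
    if att_b.core != att_p.core:
        raise ValueError("sites must share the same core to exchange strands")
    hybrid_bp = AttSite("attB5'-core-attP3'",
                        att_b.five_prime, att_b.core, att_p.three_prime)
    hybrid_pb = AttSite("attP5'-core-attB3'",
                        att_p.five_prime, att_p.core, att_b.three_prime)
    return hybrid_bp, hybrid_pb


# Placeholder arms (hypothetical); the TTG core follows the text.
att_b = AttSite("attB", "AAAA", "TTG", "CCCC")
att_p = AttSite("attP", "GGGG", "TTG", "ATAT")

hybrid_bp, hybrid_pb = recombine(att_b, att_p)
for site in (hybrid_bp, hybrid_pb):
    print(site.name, site.sequence())
```

Because each product carries one arm from attB and one from attP, it matches neither parental site, which is the property the text relies on to explain why the reaction is not reversed by the same enzyme.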
  • a "native recognition site”, as used herein, means a recognition site that occurs naturally in the genome of a cell (i.e., the sites are not introduced into the genome, for example, by recombinant means).
  • a "pseudo-site” or a “pseudo-recombination site” as used herein means a DNA sequence comprising a recognition site that is bound by a recombinase enzyme where the recognition site differs in one or more nucleotides from a wild-type recombinase recognition sequence and/or is present as an endogenous sequence in a genome that differs from the sequence of a genome where the wild-type recognition sequence for the recombinase resides.
  • a pseudo-recombination sequence is functionally equivalent to a wild-type recombination sequence, occurs in an organism other than that in which the recombinase is found in nature, and may have sequence variation relative to the wild type recombination sequences.
  • a "pseudo attP site" or "pseudo attB site" refers to a pseudo site that is similar to the recognition site for wild-type phage (attP) or bacterial (attB) attachment site sequences, respectively, for phage integrase enzymes, such as that of phage φC31.
  • the pseudo attP site is present in the genome of a host cell, while the wild type attB site is present on a targeting vector.
  • "Pseudo att site” is a more general term that can refer to either a pseudo attP site or a pseudo attB site. It is understood that att sites or pseudo att sites may be present on linear or circular nucleic acid molecules.
  • the presence of "pseudo-recombination sites" in the genome of the target cell avoids the need for introducing a recombination site into the genome.
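Since a pseudo site differs from the wild-type recognition sequence in one or more nucleotides, one simple way to picture a search for candidates is a sliding-window comparison against a reference site. The sketch below is an assumption-laden illustration: the function name, the identity threshold, and the toy sequences are hypothetical, only the forward strand is scanned, and sequence similarity by itself does not show that a site is bound or used by the integrase.

```python
def find_candidate_pseudo_sites(genome: str, reference_site: str,
                                min_identity: float = 0.8) -> list:
    """Slide the reference site along the genome and report similar windows.

    Returns a list of (start, window, identity) tuples, where identity is
    the fraction of positions that match the reference exactly.
    """
    genome = genome.upper()
    reference_site = reference_site.upper()
    width = len(reference_site)
    hits = []
    for start in range(len(genome) - width + 1):
        window = genome[start:start + width]
        matches = sum(g == r for g, r in zip(window, reference_site))
        identity = matches / width
        if identity >= min_identity:
            hits.append((start, window, identity))
    return hits


# Hypothetical toy data: an 8-nt reference and a 16-nt "genome"
# containing one window that differs from the reference at one position.
reference = "ACGTTTGA"
toy_genome = "GGGG" + "ACGTTTCA" + "GGGG"
hits = find_candidate_pseudo_sites(toy_genome, reference, min_identity=0.8)
print(hits)
```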
  • a “hybrid-recombination site”, as used herein, refers to a recombination site constructed from portions of wild type and/or pseudo-recombination sites.
  • a wild-type recombination site may have a short, core region flanked by palindromes.
  • the sequence 5' of the core region sequence of the hybrid-recombination site matches a pseudo-recombination site and the sequence 3 ' of the core of the hybrid-recombination site match the wild-type recombination site.
  • the hybrid-recombination site may be comprised of the region 5' of the core from a wild-type attB site and the region 3' of the core from a wild-type attP recombination site, or vice versa.
  • Other combinations of such hybrid-recombination sites will be evident to those having ordinary skill in the art, in view of the teachings of the present specification.
  • By "nucleic acid construct" it is meant a nucleic acid sequence that has been constructed to comprise one or more functional units not found together in nature. Examples include circular, double-stranded, extrachromosomal DNA molecules (plasmids), cosmids (plasmids containing COS sequences from lambda phage), viral genomes comprising non-native nucleic acid sequences, and the like.
  • By "nucleic acid fragment of interest" it is meant any nucleic acid fragment that one wishes to insert into a genome. Suitable examples of nucleic acid fragments of interest include therapeutic genes, marker genes, control regions, trait-producing fragments, and the like.
  • therapeutic genes are those nucleic acid sequences which encode molecules that provide some therapeutic benefit to the host, including proteins, antibodies, functional RNAs (antisense, hammerhead ribozymes), RNAi, siRNA, miRNA, shRNA, and the like.
  • Well known examples include the cystic fibrosis transmembrane conductance regulator (CFTR) gene and the Factor IX gene.
  • The primary physiological defect in cystic fibrosis is the failure of electrogenic chloride ion secretion across the epithelia of many organs, including the lungs.
  • One of the most dangerous aspects of the disorder is the cycle of recurrent airway infections which gradually destroy lung function resulting in premature death.
  • Cystic fibrosis is caused by a variety of mutations in the CFTR gene. Since the problems arising in cystic fibrosis result from mutations in a single gene, the possibility exists that the introduction of a normal copy of the gene into the lung epithelia could provide a treatment for the disease, or effect a cure if the gene transfer was permanent.
  • disorders resulting from mutations in a single gene include alpha-1-antitrypsin deficiency, chronic granulomatous disease, familial hypercholesterolemia, Fanconi anemia, Gaucher disease, Hunter syndrome, ornithine transcarbamylase deficiency, purine nucleoside phosphorylase deficiency, severe combined immunodeficiency disease (SCID)-ADA, X-linked SCID, hemophilia, muscular dystrophy, and the like.
  • Therapeutic benefit in other disorders may also result from the addition of a protein-encoding therapeutic nucleic acid.
  • addition of a nucleic acid encoding an immunomodulating protein such as interleukin-2 may be of therapeutic benefit for patients suffering from different types of cancer.
  • a nucleic acid fragment of interest may additionally be a "marker nucleic acid” or “marker polypeptide”.
  • Marker genes encode proteins which can be easily detected in transformed cells and are, therefore, useful in the study of those cells. Marker genes are being used in bone marrow transplantation studies, for example, to investigate the biology of marrow reconstitution and the mechanism of relapse in patients. Examples of suitable marker genes include beta-galactosidase, green or yellow fluorescent proteins, chloramphenicol acetyl transferase, luciferase, and the like.
  • a nucleic acid fragment of interest may additionally be a control region.
  • control region or "control element” includes all nucleic acid components which are operably linked to a nucleic acid fragment (e.g., DNA) and involved in the expression of a protein or RNA therefrom.
  • the precise nature of the control (or regulatory) regions needed for coding sequence expression may vary from organism to organism. Such regions typically include those 5' noncoding sequences involved with initiation of transcription and translation, such as the enhancer, TATA box, capping sequence, CAAT sequence, and the like.
  • Further exemplary control sequences include, but are not limited to, any sequence that functions to modulate replication, transcriptional or translational regulation, and the like. Examples include promoters, signal sequences, propeptide sequences, transcription terminators, polyadenylation sequences, enhancer sequences, attenuatory sequences, intron splice site sequences, and the like.
  • a nucleic acid fragment of interest may additionally be a trait-producing sequence, by which it is meant a sequence conferring some non-native trait upon the organism or cell in which the protein encoded by the trait-producing sequence is expressed.
  • the term "non-native" when used in the context of a trait-producing sequence means that the trait produced is different than one would find in an unmodified organism which can mean that the organism produces high amounts of a natural substance in comparison to an unmodified organism, or produces a non-natural substance.
  • the genome of a crop plant, such as corn can be modified to produce higher amounts of an essential amino acid, thus creating a plant of higher nutritional quality, or could be modified to produce proteins not normally produced in plants, such as antibodies. (See U.S. Pat.
  • Methods of transforming cells are well known in the art.
  • By "transformed" it is meant a heritable alteration in a cell resulting from the uptake of foreign DNA.
  • Suitable methods include viral infection, transfection, conjugation, protoplast fusion, electroporation, particle gun technology, calcium phosphate precipitation, direct microinjection, and the like.
  • the choice of method is generally dependent on the type of cell being transformed and the circumstances under which the transformation is taking place (i.e. in vitro, ex vivo, or in vivo).
  • a general discussion of these methods can be found in Ausubel, et al, Short Protocols in Molecular Biology, 3rd ed., Wiley & Sons, 1995.
  • The terms "nucleic acid molecule" and "polynucleotide" are used interchangeably and refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown.
  • Non-limiting examples of polynucleotides include a gene, a gene fragment, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers.
  • a polynucleotide is typically composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); and thymine (T) (uracil (U) for thymine (T) when the polynucleotide is RNA).
  • a "coding sequence” or a sequence which "encodes" a selected polypeptide is a nucleic acid molecule which is transcribed (in the case of DNA) and translated (in the case of mRNA) into a polypeptide, for example, in vivo when placed under the control of appropriate regulatory sequences (or “control elements”).
  • the boundaries of the coding sequence are typically determined by a start codon at the 5' (amino) terminus and a translation stop codon at the 3' (carboxy) terminus.
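The boundary rule can be sketched as a simple reading-frame scan. This is a minimal illustration, assuming the standard ATG start codon and the three standard stop codons, a forward-strand search, and no introns; the function name and the toy sequence are hypothetical.

```python
from typing import Optional, Tuple

# Standard translation stop codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna: str,
                         start_codon: str = "ATG") -> Optional[Tuple[int, int]]:
    """Return (start, end) of the first start codon followed by an in-frame stop.

    `end` is exclusive and includes the stop codon, so dna[start:end]
    is the coding sequence from start codon through stop codon.
    Returns None if no such region exists on the forward strand.
    """
    dna = dna.upper()
    start = dna.find(start_codon)
    while start != -1:
        # Step through codons in the frame set by the start codon.
        for pos in range(start + 3, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                return start, pos + 3
        start = dna.find(start_codon, start + 1)
    return None


# Hypothetical toy sequence; the out-of-frame TGA at index 3 is ignored.
toy = "CCATGAAATGGTAACC"
bounds = find_coding_sequence(toy)
print(bounds, toy[bounds[0]:bounds[1]])
```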
  • a coding sequence can include, but is not limited to, cDNA from viral, procaryotic or eucaryotic mRNA, genomic DNA sequences from viral or procaryotic DNA, and even synthetic DNA sequences.
  • a transcription termination sequence may be located 3' to the coding sequence.
  • Other "control elements" may also be associated with a coding sequence.
  • a DNA sequence encoding a polypeptide can be optimized for expression in a selected cell by using the codons preferred by the selected cell to represent the DNA copy of the desired polypeptide coding sequence.
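Codon optimization of this kind amounts to back-translating the polypeptide with one chosen codon per amino acid. The sketch below assumes a caller-supplied table of preferred codons; the small `EXAMPLE_PREFERRED_CODONS` table is a hypothetical placeholder covering only a few residues and does not represent measured codon usage for any particular cell type.

```python
# Hypothetical placeholder table: one "preferred" codon per residue.
# "*" stands for the translation stop. Real work would use a full table
# derived from codon-usage data for the selected host cell.
EXAMPLE_PREFERRED_CODONS = {
    "M": "ATG",
    "K": "AAG",
    "L": "CTG",
    "*": "TGA",
}


def back_translate(protein: str, preferred_codons: dict) -> str:
    """Encode a polypeptide as DNA using one preferred codon per amino acid."""
    dna = []
    for residue in protein.upper():
        if residue not in preferred_codons:
            raise ValueError(f"no preferred codon supplied for {residue!r}")
        dna.append(preferred_codons[residue])
    return "".join(dna)


optimized = back_translate("MKL*", EXAMPLE_PREFERRED_CODONS)
print(optimized)
```

Choosing a single codon per residue is the simplest possible strategy; it ignores considerations such as repeats, secondary structure, and restriction sites that a practical design would also weigh.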
  • "Encoded by" refers to a nucleic acid sequence which codes for a polypeptide sequence, wherein the polypeptide sequence or a portion thereof contains an amino acid sequence of at least 3 to 5 amino acids, more preferably at least 8 to 10 amino acids, and even more preferably at least 15 to 20 amino acids from a polypeptide encoded by the nucleic acid sequence. Also encompassed are polypeptide sequences which are immunologically identifiable with a polypeptide encoded by the sequence.
  • "Operably linked" refers to an arrangement of elements wherein the components so described are configured so as to perform their usual function.
  • a given promoter that is operably linked to a coding sequence e.g., a reporter expression cassette
  • the promoter or other control elements need not be contiguous with the coding sequence, so long as they function to direct the expression thereof.
  • intervening untranslated yet transcribed sequences can be present between the promoter sequence and the coding sequence and the promoter sequence can still be considered “operably linked" to the coding sequence.
  • a "vector" is capable of transferring gene sequences to target cells. Typically,
  • vector construct means any nucleic acid construct capable of directing the expression of a gene of interest and which can transfer gene sequences to target cells.
  • the term includes cloning and expression vehicles, as well as integrating vectors.
  • An "expression cassette” comprises any nucleic acid construct capable of directing the expression of a gene/coding sequence of interest. Such cassettes can be constructed into a “vector,” “vector construct,” “expression vector,” or “gene transfer vector,” in order to transfer the expression cassette into target cells. Thus, the term includes cloning and expression vehicles, as well as viral vectors.
  • Techniques for determining nucleic acid and amino acid "sequence identity" also are known in the art. Typically, such techniques include determining the nucleotide sequence of the mRNA for a gene and/or determining the amino acid sequence encoded thereby, and comparing these sequences to a second nucleotide or amino acid sequence. In general, "identity" refers to an exact nucleotide-to-nucleotide or amino acid-to-amino acid correspondence of two polynucleotides or polypeptide sequences, respectively.
  • Two or more sequences can be compared by determining their "percent identity.”
  • the percent identity of two sequences, whether nucleic acid or amino acid sequences, is the number of exact matches between two aligned sequences divided by the length of the shorter sequence and multiplied by 100.
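The percent identity calculation can be written out directly. This is a minimal sketch, assuming the two sequences have already been aligned to equal length and that gaps are written as "-"; the function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Exact matches divided by the length of the shorter sequence, times 100.

    Both inputs must be rows of the same alignment (equal length, gaps
    marked with `gap`). Sequence lengths are counted without gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # A column counts as a match only if both residues are identical
    # and neither is a gap.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)
    shorter = min(len(aligned_a.replace(gap, "")),
                  len(aligned_b.replace(gap, "")))
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


# Toy alignment: 4 matching columns; ungapped lengths are 5 and 6.
identity = percent_identity("ACGT-A", "ACCTGA")
print(identity)
```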
  • An approximate alignment for nucleic acid sequences is provided by the local homology algorithm of Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981). This algorithm can be applied to amino acid sequences by using the scoring matrix developed by Dayhoff, Atlas of Protein Sequences and Structure, M. O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical Research Foundation, Washington, D.
  • the Smith-Waterman algorithm can be employed where default parameters are used for the scoring table (for example, gap open penalty of 12, gap extension penalty of one, and a gap of six). From the data generated the "Match" value reflects "sequence identity."
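For orientation, the core recurrence of Smith-Waterman local alignment can be sketched in a few lines. This is a simplified illustration, not the parameterization quoted above: it assumes a flat match/mismatch score and a single linear gap penalty (the values 2, -1, and -2 are arbitrary choices for the example), rather than a Dayhoff scoring matrix with separate gap open and gap extension penalties, and it returns only the best local score without a traceback.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty).

    Each cell holds the best score of any local alignment ending at that
    pair of positions; scores are floored at zero so that an alignment
    can start anywhere in either sequence.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("ACGT", "ACGT"))
print(smith_waterman_score("TTTACGTTT", "GGACGGG"))
```

Keeping only two rows of the score matrix is enough when the score alone is needed; recovering the aligned segments would require retaining the full matrix for a traceback.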
  • Other suitable programs for calculating the percent identity or similarity between sequences are generally known in the art, for example, another alignment program is BLAST, used with default parameters.
  • Homology can be determined by hybridization of polynucleotides under conditions that form stable duplexes between homologous regions, followed by digestion with single-stranded-specific nuclease(s), and size determination of the digested fragments.
  • Two DNA, or two polypeptide, sequences are "substantially homologous" to each other when the sequences exhibit at least about 80%-85%, preferably at least about 85%-90%, more preferably at least about 90%-95%, and most preferably at least about 95%-98% sequence identity over a defined length of the molecules, as determined using the methods above.
  • The term "substantially homologous" also refers to sequences showing complete identity to the specified DNA or polypeptide sequence.
  • DNA sequences that are substantially homologous can be identified in a Southern hybridization experiment under, for example, stringent conditions, as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., Sambrook et al., supra; DNA Cloning, supra; Nucleic Acid Hybridization, supra.
  • Two nucleic acid fragments are considered to "selectively hybridize” as described herein.
  • the degree of sequence identity between two nucleic acid molecules affects the efficiency and strength of hybridization events between such molecules.
  • a partially identical nucleic acid sequence will at least partially inhibit a completely identical sequence from hybridizing to a target molecule. Inhibition of hybridization of the completely identical sequence can be assessed using hybridization assays that are well known in the art (e.g., Southern blot, Northern blot, solution hybridization, or the like, see Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, (1989) Cold Spring Harbor, N.Y.).
  • Such assays can be conducted using varying degrees of selectivity, for example, using conditions varying from low to high stringency. If conditions of low stringency are employed, the absence of non-specific binding can be assessed using a secondary probe that lacks even a partial degree of sequence identity (for example, a probe having less than about 30% sequence identity with the target molecule), such that, in the absence of non-specific binding events, the secondary probe will not hybridize to the target.
  • a nucleic acid probe is chosen that is complementary to a target nucleic acid sequence, and then by selection of appropriate conditions the probe and the target sequence "selectively hybridize,” or bind, to each other to form a hybrid molecule.
  • A nucleic acid molecule that is capable of hybridizing selectively to a target sequence under "moderately stringent" conditions typically hybridizes under conditions that allow detection of a target nucleic acid sequence of at least about 10-14 nucleotides in length having at least approximately 70% sequence identity with the sequence of the selected nucleic acid probe.
  • Stringent hybridization conditions typically allow detection of target nucleic acid sequences of at least about 10-14 nucleotides in length having a sequence identity of greater than about 90-95% with the sequence of the selected nucleic acid probe.
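The two stringency levels described above can be expressed as simple numeric checks. The sketch below is illustrative only: the cut-offs (10 nucleotides, 70%, 90%) are taken as the lower ends of the "about" ranges in the text, and the function name `detectable` is hypothetical. Actual detection depends on the experimental conditions, not on these numbers alone.

```python
def detectable(target_length: int, identity_pct: float,
               stringent: bool = False, min_length: int = 10) -> bool:
    """Rough check of whether a target falls within the stated ranges."""
    if target_length < min_length:
        return False
    if stringent:
        # Stringent: sequence identity greater than about 90%.
        return identity_pct > 90.0
    # Moderately stringent: at least approximately 70% sequence identity.
    return identity_pct >= 70.0


if __name__ == "__main__":
    print(detectable(14, 75.0), detectable(14, 75.0, stringent=True))
```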
  • Hybridization conditions useful for probe/target hybridization where the probe and target have a specific degree of sequence identity can be determined as is known in the art (see, for example, Nucleic Acid Hybridization: A Practical Approach, editors B. D. Hames and S. J. Higgins, (1985) Oxford; Washington, D.C.; IRL Press).
  • With respect to stringency conditions for hybridization, it is well known in the art that numerous equivalent conditions can be employed to establish a particular stringency by varying, for example, the following factors: the length and nature of probe and target sequences, base composition of the various sequences, concentrations of salts and other hybridization solution components, the presence or absence of blocking agents in the hybridization solutions (e.g., formamide, dextran sulfate, and polyethylene glycol), hybridization reaction temperature and time parameters, as well as varying wash conditions.
  • A particular set of hybridization conditions is selected following standard methods in the art (see, for example, Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, (1989) Cold Spring Harbor, N.Y.).
  • a first polynucleotide is "derived from" a second polynucleotide if it has the same or substantially the same basepair sequence as a region of the second polynucleotide, its cDNA, complements thereof, or if it displays sequence identity as described above.
  • a first polypeptide is "derived from” a second polypeptide if it is (i) encoded by a first polynucleotide derived from a second polynucleotide, or (ii) displays sequence identity to the second polypeptides as described above.
  • When a recombinase is "derived from a phage," the recombinase need not be explicitly produced by the phage itself; the phage is simply considered to be the original source of the recombinase and coding sequences thereof.
  • Recombinases can, for example, be produced recombinantly or synthetically, by methods known in the art, or alternatively, recombinases may be purified from phage infected bacterial cultures.
  • The term "substantially purified" generally refers to isolation of a substance (compound, polynucleotide, protein, polypeptide, polypeptide composition) such that the substance comprises the majority percent of the sample in which it resides.
  • A substantially purified component comprises 50%, preferably 80%-85%, more preferably 90-95% of the sample.
  • Techniques for purifying polynucleotides and polypeptides of interest are well-known in the art and include, for example, ion-exchange chromatography, affinity chromatography and sedimentation according to density.
  • "Altered φC31 integrase" refers to φC31 integrase enzymes in which the native, wild-type integrase gene found in φC31 has been mutated in one or more positions.
  • An altered φC31 integrase possesses a DNA binding specificity and/or level of activity that differs from that of the wild-type φC31 integrase enzyme.
  • a recombinase reaction typically includes binding to the recognition sequence and performing concerted cutting and ligation, resulting in strand exchanges between two recombining recognition sites.
  • Altered φC31 integrases of the present invention include the P1 (SEQ ID NO:04), P2 (SEQ ID NO:06), and P3 (SEQ ID NO:08) integrases.
  • The amino acid sequences of the P1, P2, and P3 φC31 integrases are described in Figs. 11 (SEQ ID NO:04), 13 (SEQ ID NO:06), and 15 (SEQ ID NO:08), and the nucleic acid sequences encoding the polypeptides are described in Figs. 10 (SEQ ID NO:03), 12 (SEQ ID NO:05), and 14 (SEQ ID NO:07).
  • homologues of the above sequences are also of interest.
  • the source of homologous sequence may be any species or the sequence may be wholly or partially synthetic.
  • sequence similarity between homologues is at least about 20%, sometimes at least about 25%, and may be 30%, 35%, 40%, 50%, 60%, 70% or higher, including 75%, 80%, 85%, 90% and 95% or higher.
  • Sequence similarity is calculated based on a reference sequence, which may be a subset of a larger sequence, such as a conserved motif, coding region, flanking region, etc.
  • a reference sequence will usually be at least about 18 nt long, more usually at least about 30 nt long, and may extend to the complete sequence that is being compared.
  • the sequences provided herein are essential for recognizing related and homologous nucleic acids in database searches.
  • nucleic acids of substantially the same length as the nucleic acid identified as SEQ ID NOS: 03, 05, or 07 where by substantially the same length is meant that any difference in length does not exceed about 20 number %, usually does not exceed about 10 number % and more usually does not exceed about 5 number %; and have sequence identity to any of these sequences of at least about 90%, usually at least about 95% and more usually at least about 99% over the entire length of the nucleic acid.
  • the nucleic acids have a sequence that is substantially similar (i.e. the same as) or identical to the sequences of SEQ ID NOS: 03, 05, or 07.
  • sequence identity will generally be at least about 60%, usually at least about 75% and often at least about 80, 85, 90, or even 95%.
  • amino acids of substantially the same length as the amino acid identified as SEQ ID NOS: 04, 06, or 08 where by substantially the same length is meant that any difference in length does not exceed about 20 number %, usually does not exceed about 10 number % and more usually does not exceed about 5 number %; and have sequence identity to any of these sequences of at least about 90%, usually at least about 95% and more usually at least about 99% over the entire length of the amino acid.
  • the amino acids have a sequence that is substantially similar (i.e. the same as) or identical to the sequences of SEQ ID NOS: 04, 06, or 08.
  • sequence identity will generally be at least about 60%, usually at least about 75% and often at least about 80, 85, 90, or even 95%.
  • nucleic acids that encode the proteins encoded by the above described nucleic acids, but differ in sequence from the above described nucleic acids due to the degeneracy of the genetic code.
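Degeneracy of the genetic code can be shown with a short translation sketch. The codon table below is the standard genetic code; the two DNA sequences are hypothetical examples chosen for illustration and are unrelated to SEQ ID NOS: 03, 05, or 07.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    # Two different nucleotide sequences (hypothetical) encoding the same peptide.
    print(translate("ATGCTGAGCAAATAA"), translate("ATGTTATCTAAGTGA"))
```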
  • nucleic acids that hybridize to the above described nucleic acid under stringent conditions.
  • An example of stringent hybridization conditions is hybridization at 50°C or higher and 0.1xSSC (15 mM sodium chloride/1.5 mM sodium citrate).
  • Stringent hybridization conditions are hybridization conditions that are at least as stringent as the above representative conditions, where conditions are considered to be at least as stringent if they are at least about 80% as stringent, typically at least about 90% as stringent as the above specific stringent conditions.
  • Other stringent hybridization conditions are known in the art and may also be employed to identify nucleic acids of this particular embodiment of the invention.
  • nucleic acids that encode fusion proteins of the subject proteins, or fragments thereof, which are fused to a second protein, e.g., a degradation sequence, a signal peptide, etc.
  • Fusion proteins may comprise a subject polypeptide, or fragment thereof, and a non-φC31 integrase polypeptide ("the fusion partner") fused in-frame at the N-terminus and/or C-terminus of the subject polypeptide.
  • Fusion partners include, but are not limited to, polypeptides that can bind antibody specific to the fusion partner (e.g., epitope tags); antibodies or binding fragments thereof; polypeptides that provide a catalytic function or induce a cellular response; ligands or receptors or mimetics thereof; and the like.
  • The fusion partner is generally not naturally associated with the subject altered φC31 integrase portion of the fusion protein, and is typically not a φC31 protein or derivative/fragment thereof, i.e., it is not found in φC31 bacteriophage.
  • constructs comprising the subject nucleic acids inserted into a vector, where such constructs may be used for a number of different applications, including propagation, protein production, etc.
  • Viral and non-viral vectors may be prepared and used, including plasmids.
  • the choice of vector will depend on the type of cell in which propagation is desired and the purpose of propagation. Certain vectors are useful for amplifying and making large amounts of the desired DNA sequence.
  • Other vectors are suitable for expression in cells in culture.
  • Still other vectors are suitable for transfer and expression in cells in a whole animal or person. The choice of appropriate vector is well within the skill of the art. Many such vectors are available commercially.
  • the partial or full-length polynucleotide is inserted into a vector typically by means of DNA ligase attachment to a cleaved restriction enzyme site in the vector.
  • the desired nucleotide sequence can be inserted by homologous recombination in vivo. Typically this is accomplished by attaching regions of homology to the vector on the flanks of the desired nucleotide sequence. Regions of homology are added by ligation of oligonucleotides, or by polymerase chain reaction using primers comprising both the region of homology and a portion of the desired nucleotide sequence, for example.
  • expression cassettes or systems that find use in, among other applications, the synthesis of the subject proteins.
  • the gene product encoded by a polynucleotide of the invention is expressed in any convenient expression system, including, for example, bacterial, yeast, insect, amphibian and mammalian systems. Suitable vectors and host cells are described in U.S. Patent No. 5,654,173.
  • A subject polynucleotide, e.g., as set forth in SEQ ID NOS: 03, 05, or 07, is linked to a regulatory sequence as appropriate to obtain the desired expression properties.
  • These regulatory sequences can include promoters (attached either at the 5' end of the sense strand or at the 3' end of the antisense strand), enhancers, terminators, operators, repressors, and inducers.
  • The promoters can be regulated or constitutive. In some situations it may be desirable to use conditionally active promoters, such as tissue-specific or developmental stage-specific promoters. These are linked to the desired nucleotide sequence using the techniques described above for linkage to vectors. Any techniques known in the art can be used.
  • the expression vector will provide a transcriptional and translational initiation region, which may be inducible or constitutive, where the coding region is operably linked under the transcriptional control of the transcriptional initiation region, and a transcriptional and translational termination region.
  • control regions may be native to the subject species from which the subject nucleic acid is obtained, or may be derived from exogenous sources.
  • Expression vectors generally have convenient restriction sites located near the promoter sequence to provide for the insertion of nucleic acid sequences encoding heterologous proteins.
  • a selectable marker operative in the expression host may be present.
  • Expression vectors may be used for, among other things, the production of fusion proteins, as described above.
  • Expression cassettes may be prepared comprising a transcription initiation region, the gene or fragment thereof, and a transcriptional termination region.
  • sequences that allow for the expression of functional epitopes or domains usually at least about 8 amino acids in length, more usually at least about 15 amino acids in length, to about 25 amino acids, and up to the complete open reading frame of the gene.
  • the cells containing the construct may be selected by means of a selectable marker, the cells expanded and then used for expression.
  • the above described expression systems may be employed with prokaryotes or eukaryotes in accordance with conventional ways, depending upon the purpose for expression.
  • a unicellular organism such as E. coli, B. subtilis, S. cerevisiae, insect cells in combination with baculovirus vectors, or cells of a higher organism such as vertebrates, e.g. COS 7 cells, HEK 293, CHO, Xenopus Oocytes, etc.
  • Small peptides can also be synthesized in the laboratory. Polypeptides that are subsets of the complete protein sequence may be used to identify and investigate parts of the protein important for function.
  • humanized versions of the subject nucleic acids are also of interest.
  • The term "humanized" refers to changes made to a nucleic acid sequence to optimize the codons for expression of the protein in human cells (Yang et al., Nucleic Acids Research 24 (1996), 4592-4593). See also U.S. Patent No. 5,795,737, which describes humanization of proteins, the disclosure of which is herein incorporated by reference.
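Codon "humanization" as described above can be sketched as a synonymous codon substitution that leaves the encoded protein unchanged. The preference table `PREFERRED` below is an illustrative assumption (one commonly used human codon per amino acid), not a table taken from the cited references, and the input sequence is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Illustrative (assumed) preferred codon per amino acid; "*" is the stop codon.
PREFERRED = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def translate(dna: str) -> str:
    """Translate every complete in-frame codon, keeping '*' for stops."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def humanize(dna: str) -> str:
    """Replace each codon with the preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(dna))


if __name__ == "__main__":
    original = "ATGTTATCTAAATAA"  # hypothetical coding sequence
    optimized = humanize(original)
    print(optimized, translate(original) == translate(optimized))
```

A practical codon optimization would also weigh factors such as GC content and repeated motifs; this sketch shows only the synonymous-substitution step.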
  • The inventors have discovered native recombination sites existing in the genomes of a variety of organisms, where the native recombination site does not necessarily have a nucleotide sequence identical to the wild-type recombination sequences for a φC31 integrase; but such native recombination sites are nonetheless sufficient to promote recombination mediated by φC31 integrase.
  • Such native recombination site sequences existing in the genomes are referred to herein as "pseudo-recombination sequences.”
  • Identification of pseudo-recombination sequences can be accomplished, for example, by using sequence alignment and analysis, where the query sequence is the recombination site of interest (for example, attP and/or attB).
  • The genome of a target cell may be searched for sequences having sequence identity to the selected recombination site for a given recombinase, for example, the attP and/or attB of φC31.
  • Nucleic acid sequence databases may be searched by computer.
  • The findpatterns algorithm of the Wisconsin Software Package Version 9.0, developed by the Genetics Computer Group (GCG; Madison, Wis.), is an example of a program used to screen all sequences in the GenBank database (Benson et al., 1998, Nucleic Acids Res. 26, 1-7).
  • The genomic sequences of the target cell can be searched for suitable pseudo-recombination sites using either the attP or attB sequences associated with a φC31 integrase or an altered φC31 integrase.
  • Functional sizes and the amount of heterogeneity that can be tolerated in these recombination sequences can be empirically evaluated, for example, by evaluating integration efficiency of a targeting construct using an altered φC31 integrase of the present invention (for exemplary methods of evaluating integration events, see WO 00/11155, published 2 Mar. 2000).
  • Such sites are recovered, for example, by plasmid rescue and analyzed at the DNA sequence level, producing, for example, the DNA sequence of a pseudo attP site from the human genome, such as ψA (FIG. 17).
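A computer search of genomic sequence for candidate pseudo-recombination sites can be sketched as a sliding-window comparison against a query site on both strands. The query and genome strings below are short hypothetical sequences, not the actual attP or attB sequences, and the identity threshold is an arbitrary illustrative value. Candidates found this way would still require the empirical evaluation described in the specification.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_pseudo_sites(genome: str, query: str, min_identity: float = 0.8):
    """Return (start, strand, identity) for windows resembling the query site."""
    genome, query = genome.upper(), query.upper()
    n = len(query)
    hits = []
    for strand, probe in (("+", query), ("-", reverse_complement(query))):
        for start in range(len(genome) - n + 1):
            window = genome[start:start + n]
            # Ungapped identity between the window and the probe.
            identity = sum(x == y for x, y in zip(window, probe)) / n
            if identity >= min_identity:
                hits.append((start, strand, identity))
    # Best candidates first.
    return sorted(hits, key=lambda h: -h[2])


if __name__ == "__main__":
    print(find_pseudo_sites("TTTTGGTACGATTTTT", "GGTACCAT", 0.8))
```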
  • This empirical method for identification of pseudo-sites can be used, even if a detailed knowledge of the recombinase recognition sites and the nature of recombinase binding to them are unknown.
  • Exemplary pseudo attP sites present in the human genome are provided in Fig. 18, as described in Chalberg et al., JMB 357:28-48 (2006).
  • Exemplary pseudo attP sites present in the genome of human embryonic stem cells are also described in Thyagarajan et al., Stem Cells, published online October 26, 2007, available at www.StemCells.com.
  • When a pseudo-recombination site is identified (using either attP or attB search sequences) in a target genome (such as human or mouse), that pseudo-recombination site can be used in the methods of the present invention of using an altered φC31 integrase to integrate a nucleic acid of interest into a target cell genome.
  • AttP or attB sites corresponding to the pseudo-recombination sites can be used in the targeting construct to be employed with an altered φC31 integrase.
  • the wild-type attB sequence can be used in the targeting construct.
  • the wild-type attP sequence can be used in the targeting construct.
  • the targeting constructs contemplated by the invention may contain additional nucleic acid fragments such as control sequences, marker sequences, selection sequences and the like as discussed below.
  • The present invention also provides means for targeted insertion of a polynucleotide (or nucleic acid sequence(s)) of interest into a genome by, for example, (i) providing an altered φC31 integrase capable of facilitating recombination between a first recombination site and a second recombination site, (ii) providing a targeting construct having a first recombination sequence and a polynucleotide of interest, (iii) introducing the altered φC31 integrase and the targeting construct into a cell which contains in its nucleic acid the second recombination site, wherein said introducing is done under conditions that allow the altered φC31 integrase to facilitate a recombination event between the first and second recombination sites.
  • the attachment site in a bacterial genome is designated “attB” and in a corresponding bacteriophage the site is designated “attP".
  • A pseudo-recombination site for an altered φC31 integrase is identified in a target cell of interest.
  • These sites can be identified by several methods including searching all known sequences derived from the cell of interest against a wild-type recombination site (e.g., attB or attP) for an altered φC31 integrase (e.g., as described above).
  • the functionality of pseudo-recombination sites identified in this way can then be empirically evaluated following the teachings of the present specification to determine their ability to participate in a
  • A targeting construct, to direct integration to a pseudo-recombination site, would then comprise a recombination site wherein the altered φC31 integrase can facilitate a recombination event between the recombination site in the genome of the target cell and a recombination site in the targeting construct.
  • a targeting vector may further comprise a polynucleotide of interest. Polynucleotides of interest can include, but are not limited to, expression cassettes encoding polypeptide products.
  • The targeting constructs of the present invention are typically circular and may also contain selectable markers, an origin of replication, and other elements.
  • the targeting construct will have one or more of the following features: a promoter, promoter-enhancer sequences, a selection marker sequence, an origin of replication, an inducible element sequence, an epitope-tag sequence, and the like.
  • Promoter and promoter-enhancer sequences are DNA sequences to which RNA polymerase binds and initiates transcription.
  • the promoter determines the polarity of the transcript by specifying which strand will be transcribed.
  • Bacterial promoters consist of consensus sequences, -35 and -10 nucleotides relative to the transcriptional start, which are bound by a specific sigma factor and RNA polymerase.
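The -35 and -10 arrangement of a bacterial promoter can be illustrated with a simple motif scan. The hexamers TTGACA and TATAAT and the 16-18 bp spacer are the widely cited E. coli sigma-70 consensus values, which are supplied here as assumptions since the text does not state them; the test sequence is hypothetical.

```python
MINUS_35 = "TTGACA"  # assumed sigma-70 consensus, -35 element
MINUS_10 = "TATAAT"  # assumed sigma-70 consensus, -10 element


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq: str, max_mismatches: int = 1, spacers=(16, 17, 18)):
    """Return (pos_35, pos_10, spacer) for candidate -35/-10 pairs."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(MINUS_35) + 1):
        if hamming(seq[i:i + 6], MINUS_35) > max_mismatches:
            continue
        for spacer in spacers:
            j = i + 6 + spacer  # start of the candidate -10 element
            box = seq[j:j + 6]
            if len(box) == 6 and hamming(box, MINUS_10) <= max_mismatches:
                hits.append((i, j, spacer))
    return hits


if __name__ == "__main__":
    candidate = "GG" + MINUS_35 + "C" * 17 + MINUS_10 + "GG"
    print(find_promoters(candidate, max_mismatches=0))
```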
  • Eukaryotic promoters are more complex. Most promoters utilized in expression vectors are transcribed by RNA polymerase II.
  • General transcription factors (GTFs) first bind specific sequences near the start and then recruit the binding of RNA polymerase II.
  • Viral promoters serve the same function as bacterial or eukaryotic promoters and either provide a specific RNA polymerase in trans (bacteriophage T7) or recruit cellular factors and RNA polymerase (SV40, RSV, CMV). Viral promoters may be preferred as they are generally particularly strong promoters.
  • Promoters may be, furthermore, either constitutive or regulatable.
  • Inducible elements are DNA sequence elements which act in conjunction with promoters and may bind either repressors (e.g. lacO/LAC Iq repressor system in E. coli) or inducers (e.g. gal1/GAL4 inducer system in yeast). In such cases, transcription is virtually "shut off" until the promoter is derepressed or induced, at which point transcription is "turned-on."
  • Examples of constitutive promoters include the int promoter of bacteriophage λ, the bla promoter of the β-lactamase gene sequence of pBR322, the CAT promoter of the chloramphenicol acetyl transferase gene sequence of pPR325, and the like.
  • Examples of inducible prokaryotic promoters include the major right and left promoters of bacteriophage λ (PL and PR), the trp, recA, lacZ, AraC and gal promoters of E. coli, the α-amylase (Ulmanen, et al., J. Bacteriol.
  • Preferred eukaryotic promoters include, but are not limited to, the following: the promoter of the mouse metallothionein I gene sequence (Hamer et al., J. Mol. Appl. Gen. 1:273-288, 1982); the TK promoter of Herpes virus (McKnight, Cell 31:355-365, 1982); the SV40 early promoter (Benoist et al., Nature (London) 290:304-310, 1981); the yeast gal1 gene sequence promoter (Johnston et al., Proc. Natl. Acad. Sci. (USA) 79:6971-6975, 1982; Silver et al., Proc. Natl. Acad. Sci. (USA) 81:5951-5955, 1984), the CMV promoter, the EF-1 promoter, ecdysone-responsive promoter(s), tetracycline-responsive promoter, and the like.
  • Exemplary promoters for use in the present invention are selected such that they are functional in the cell type (and/or animal or plant) into which they are being introduced.
  • Selection markers are valuable elements in expression vectors as they provide a means to select for growth of only those cells that contain a vector. Such markers are typically of two types: drug resistance and auxotrophic.
  • a drug resistance marker enables cells to detoxify an exogenously added drug that would otherwise kill the cell.
  • Auxotrophic markers allow cells to synthesize an essential component (usually an amino acid) while grown in media that lacks that essential component.
  • Common selectable marker genes include those for resistance to antibiotics such as ampicillin, tetracycline, kanamycin, bleomycin, streptomycin, hygromycin, neomycin, Zeocin™, G418, and the like.
  • Selectable auxotrophic genes include, for example, hisD, that allows growth in histidine free media in the presence of histidinol.
  • a further element useful in an expression vector is an origin of replication.
  • Replication origins are unique DNA segments that contain multiple short repeated sequences that are recognized by multimeric origin-binding proteins and that play a key role in assembling DNA replication enzymes at the origin site.
  • Suitable origins of replication for use in expression vectors employed herein include E. coli oriC, ColE1 plasmid origin, 2μ and ARS (both useful in yeast systems), SV40, and EBV oriP (useful in mammalian systems), and the like.
  • Epitope tags are short peptide sequences that are recognized by epitope specific antibodies.
  • a fusion protein comprising a recombinant protein and an epitope tag can be simply and easily purified using an antibody bound to a chromatography resin.
  • the presence of the epitope tag furthermore allows the recombinant protein to be detected in subsequent assays, such as Western blots, without having to produce an antibody specific for the recombinant protein itself.
  • Examples of commonly used epitope tags include V5, glutathione-S-transferase (GST), hemagglutinin (HA), the peptide Phe-His-His-Thr-Thr, chitin binding domain, and the like.
  • a further useful element in an expression vector is a multiple cloning site or polylinker.
  • Synthetic DNA encoding a series of restriction endonuclease recognition sites is inserted into a plasmid vector, for example, downstream of the promoter element. These sites are engineered for convenient cloning of DNA into the vector at a specific position.
  • Suitable prokaryotic vectors include plasmids such as those capable of replication in E. coli (for example, pBR322, ColE1, pSC101, pACYC184, pVX, pRSET, pBAD (Invitrogen, Carlsbad, Calif.) and the like). Such plasmids are disclosed by Sambrook (cf.
  • Bacillus plasmids include pC194, pC221, pT127, and the like, and are disclosed by Gryczan (In: The Molecular Biology of the Bacilli, Academic Press, NY (1982), pp. 307-329).
  • Suitable Streptomyces plasmids include pIJ101 (Kendall et al., J. Bacteriol.
  • Suitable eukaryotic plasmids include, for example, BPV, EBV, vaccinia, SV40,
  • the targeting cassettes described herein can be constructed utilizing methodologies known in the art of molecular biology (see, for example, Ausubel or Maniatis) in view of the teachings of the specification. As described above, the targeting constructs are assembled by inserting, into a suitable vector backbone, a recombination site, polynucleotides encoding sequences of interest operably linked to a promoter of interest; and, optionally a sequence encoding a positive selection marker.
  • a preferred method of obtaining polynucleotides, including suitable regulatory sequences (e.g., promoters) is PCR.
  • General procedures for PCR are taught in MacPherson et al., PCR: A PRACTICAL APPROACH, (IRL Press at Oxford University Press, (1991)).
  • PCR conditions for each application reaction may be empirically determined.
  • a number of parameters influence the success of a reaction. Among these parameters are annealing temperature and time, extension time, Mg2+ and ATP concentration, pH, and the relative concentration of primers, templates and deoxyribonucleotides.
  • the resulting fragments can be detected by agarose gel electrophoresis followed by visualization with ethidium bromide staining and ultraviolet illumination.
  • kits can include, but are not limited to, containers, instructions, solutions, buffers, disposables, and hardware.
  • An altered φC31 integrase is introduced into a cell whose genome is to be modified.
  • Methods of introducing functional proteins into cells are well known in the art.
  • Introduction of purified altered φC31 integrase protein ensures a transient presence of the protein and its function, which is often a preferred embodiment.
  • A gene encoding the altered φC31 integrase can be included in an expression vector used to transform the cell. It is generally preferred that the altered φC31 integrase be present for only such time as is necessary for insertion of the nucleic acid fragments into the genome being modified. Thus, the lack of permanence associated with most expression vectors is not expected to be detrimental.
  • The altered φC31 integrase used in the practice of the present invention can be introduced into a target cell before, concurrently with, or after the introduction of a targeting vector.
  • The altered φC31 integrase can be directly introduced into a cell as a protein, for example, using liposomes, coated particles, or microinjection.
  • A polynucleotide encoding the altered φC31 integrase can be introduced into the cell using a suitable expression vector.
  • The targeting vector components described above are useful in the construction of expression cassettes containing sequences encoding an altered φC31 integrase of interest. Expression of the altered φC31 integrase is typically desired to be transient.
  • Vectors providing transient expression of the altered φC31 integrase are preferred in the practice of the present invention.
  • Expression of the altered φC31 integrase may be regulated in other ways, for example, by placing the expression of the altered φC31 integrase under the control of a regulatable promoter (i.e., a promoter whose expression can be selectively induced or repressed).
  • Altered φC31 integrase for use in the practice of the present invention can be produced recombinantly or purified as previously described.
  • Altered φC31 integrase polypeptides having the desired recombinase activity can be purified to a desired degree of purity by methods known in the art of protein purification, including, but not limited to, ammonium sulfate precipitation, size fractionation, affinity chromatography, HPLC, ion exchange chromatography, heparin agarose affinity chromatography (e.g., Thorpe & Smith, Proc. Nat. Acad. Sci.
  • Nucleic acids of interest may be introduced into the genome of a cell using the methods of the invention, including protein-encoding nucleic acids that encode, for example, enzymes that can be used for the production of nutrients and for performing enzymatic reactions in chemistry, or polypeptides which are useful and valuable as nutrients or for the treatment of human or animal diseases or for the prevention thereof, for example hormones, polypeptides with immunomodulatory activity, anti-viral and/or anti-tumor properties (e.g., maspin), antibodies, viral antigens, vaccines, clotting factors, enzyme inhibitors, foodstuffs, and the like.
  • polypeptides encoded by the nucleic acid of interest that may be introduced by the methods of the invention are, for example, hormones such as secretin, thymosin, relaxin, luteinizing hormone, parathyroid hormone, adrenocorticotropin, melanocyte-stimulating hormone, β-lipotropin, urogastrone or insulin; growth factors, such as epidermal growth factor, insulin-like growth factor (IGF), e.g. IGF-I and IGF-II, mast cell growth factor, nerve growth factor, glial cell line-derived neurotrophic factor (GDNF), or transforming growth factor (TGF), such as TGF-α or TGF-β (e.g. TGF-β1, β2 or β3); growth hormone, such as human or bovine growth hormones; interleukins, such as interleukin-1 or -2; human macrophage migration inhibitory factor (MIF); interferons, such as human α-interferon, for example interferon-αA, αB, αD or αF, β-interferon, γ-interferon or a hybrid interferon, for example an αA-αD- or an αB-αD-hybrid interferon, especially the hybrid interferon BDBB; protease inhibitors such as α1-antitrypsin, SLPI, α1-antichymotrypsin, C1 inhibitor; hepatitis virus antigens, such as hepatitis B virus surface or core antigen or hepatitis A virus antigen, or hepatitis nonA-nonB (i.e., hepatitis C) virus antigen; plasmin
  • calcitonin, human calcitonin-related peptide
  • blood clotting factors, such as factor IX or VIIIc
  • erythropoietin
  • eglin, such as eglin C
  • desulphatohirudin, such as desulphatohirudin variant HV1, HV2 or PA
  • human superoxide dismutase, viral thymidine kinase, β-lactamase, glucose isomerase
  • transport proteins such as human plasma proteins, e.g., serum albumin and transferrin, and transcription factors, including Oct-3/4, Sox2, c-myc, and the like.
  • Cells suitable for modification employing the methods of the invention include both prokaryotic cells and eukaryotic cells, provided that the cell's genome contains a pseudo-recombination sequence recognizable by an altered φC31 integrase of the present invention.
  • Prokaryotic cells are cells that lack a defined nucleus. Examples of suitable prokaryotic cells include bacterial cells, mycoplasmal cells and archaebacterial cells.
  • Particularly preferred prokaryotic cells include those that are useful either in various types of test systems or those that have some industrial utility, such as Klebsiella oxytoca (ethanol production), Clostridium acetobutylicum (butanol production), and the like (see Green and Bennet, Biotech & Bioengineering 58:215-221, 1998; Ingram, et al., Biotech & Bioengineering 58:204-206, 1998).
  • Suitable eukaryotic cells include both animal cells (such as, from insect, fish, bird, rodent (including mice and rats), cow, goat, rabbit, sheep, non-human primate, human, and the like) and plant cells (such as, from rice, corn, cotton, tobacco, tomato, potato, and the like). Cell types applicable to particular purposes are discussed in greater detail below.
  • Yet another embodiment of the invention comprises isolated genetically engineered cells.
  • Suitable cells may be prokaryotic or eukaryotic, as discussed above.
  • the genetically engineered cells of the invention may be unicellular organisms or may be derived from multicellular organisms.
  • By "isolated" in reference to genetically engineered cells derived from multicellular organisms, it is meant that the cells are outside a living body, whether plant or animal, and in an artificial environment. The use of the term "isolated" does not imply that the genetically engineered cells are the only cells present.
  • the genetically engineered cells of the invention contain any one of the nucleic acid constructs of the invention.
  • an altered φC31 integrase that specifically recognizes recombination sequences is introduced into genetically engineered cells containing one of the nucleic acid constructs of the invention under conditions such that the nucleic acid sequence(s) of interest will be inserted into the genome.
  • the genetically engineered cells possess a modified genome. Methods of introducing polypeptides and DNA sequences into such cells are well known in the art and are discussed above.
  • the genetically engineered cells of the invention can be employed in a variety of ways.
  • Unicellular organisms can be modified to produce commercially valuable substances such as recombinant proteins, industrial solvents, industrially useful enzymes, and the like.
  • Preferred unicellular organisms include fungi such as yeast (for example, S. pombe, Pichia pastoris, S. cerevisiae (such as INVSc1), and the like), Aspergillus, and the like, and bacteria such as Klebsiella, Streptomyces, and the like.
  • Isolated cells from multicellular organisms can be similarly useful, including insect cells, mammalian cells and plant cells.
  • Mammalian cells that may be useful include those derived from rodents, primates and the like. They include HeLa cells, cells of fibroblast origin such as VERO, 3T3 or CHOK1, HEK 293 cells or cells of lymphoid origin (such as 32D cells) and their derivatives, neuronal cells, hepatic cells, and the like.
  • Exemplary mammalian host cells include nonadherent cells such as CHO, 32D, and the like.
  • plant cells are also available as hosts, and control sequences compatible with plant cells are available, such as the cauliflower mosaic virus 35S and 19S, nopaline synthase promoter and polyadenylation signal sequences, and the like. Appropriate transgenic plant cells can be used to produce transgenic plants.
  • Another preferred host is an insect cell, for example from Drosophila larvae.
  • The Drosophila alcohol dehydrogenase promoter can be used (Rubin, Science 240:1453-1459, 1988).
  • baculovirus vectors can be engineered to express large amounts of peptide encoded by a desired nucleic acid sequence in insect cells (Jasny, Science 238:1653 (1987); Miller et al., In: Genetic Engineering (1986), Setlow, J. K., et al., eds., Plenum, Vol. 8, pp. 277-297).
  • the genetically engineered cells of the invention are additionally useful as tools to screen for substances capable of modulating the activity of a protein encoded by a nucleic acid fragment of interest.
  • an additional embodiment of the invention comprises methods of screening comprising contacting genetically engineered cells of the invention with a test substance and monitoring the cells for a change in cell phenotype, cell proliferation, cell differentiation, enzymatic activity of the protein or the interaction between the protein and a natural binding partner of the protein when compared to test cells not contacted with the test substance.
  • test substances can be evaluated using the genetically engineered cells of the invention including peptides, proteins, antibodies, low molecular weight organic compounds, natural products derived from, for example, fungal or plant cells, and the like.
  • By "low molecular weight organic compound" it is meant a chemical species with a molecular weight of generally less than 500-1000.
  • Sources of test substances are well known to those of skill in the art.
  • Various assay methods employing cells are also well known by those skilled in the art. They include, for example, assays for enzymatic activity (Hirth, et al, U.S. Pat. No. 5,763,198, issued Jun. 9, 1998), assays for binding of a test substance to a protein expressed by the genetically engineered cells, assays for transcriptional activation of a reporter gene, and the like.
  • Cells modified by the methods of the present invention can be maintained under conditions that, for example, (i) keep them alive but do not promote growth, (ii) promote growth of the cells, and/or (iii) cause the cells to differentiate or dedifferentiate.
  • Cell culture conditions are typically permissive for the action of the altered φC31 integrase in the cells, although the activity of the altered φC31 integrase may also be modulated by culture conditions (e.g., raising or lowering the temperature at which the cells are cultured). For a given cell, cell-type, tissue, or organism, culture conditions are known in the art.
  • the present invention comprises transgenic plants and nonhuman transgenic animals whose genomes have been modified by employing the methods and compositions of the invention.
  • Transgenic animals may be produced employing the methods of the present invention to serve as a model system for the study of various disorders and for screening of drugs that modulate such disorders.
  • a "transgenic" plant or animal refers to a genetically engineered plant or animal, or offspring of genetically engineered plants or animals.
  • a transgenic plant or animal usually contains material from at least one unrelated organism, such as from a virus.
  • the term "animal" as used in the context of transgenic organisms means all species except human. It also includes an individual animal in all stages of development, including embryonic and fetal stages. Farm animals (e.g., chickens, pigs, goats, sheep, cows, horses, rabbits and the like), rodents (such as mice and rats), and domestic pets (e.g., cats and dogs) are included within the scope of the present invention.
  • the animal is a mouse or a rat.
  • The term "chimeric" plant or animal is used to refer to plants or animals in which the heterologous gene is found, or in which the heterologous gene is expressed, in some but not all cells of the plant or animal.
  • The term "transgenic animal" also includes a "germ cell line" transgenic animal.
  • A "germ cell line transgenic animal" is a transgenic animal in which the genetic information provided by the invention method has been taken up and incorporated into a germ line cell, therefore conferring the ability to transfer the information to offspring. If such offspring, in fact, possess some or all of that information, then they, too, are transgenic animals.
  • a transgenic animal of the present invention is produced by introducing into a single cell embryo a nucleic acid construct (e.g., a targeting construct), comprising a recombination site capable of recombining with a recombination site found within the genome of the organism from which the cell was derived and a nucleic acid fragment of interest, in a manner such that the nucleic acid fragment of interest is stably integrated into the DNA of germ line cells of the mature animal and is inherited in normal Mendelian fashion.
  • the nucleic acid fragment of interest can be any one of the fragments described previously.
  • the nucleic acid sequence of interest can encode an exogenous product that disrupts or interferes with expression of an endogenously produced protein of interest, yielding transgenic animals with decreased expression of the protein of interest.
  • a nucleic acid construct of the invention can be injected into the pronucleus, or cytoplasm, of a fertilized egg before fusion of the male and female pronuclei, or injected into the nucleus of an embryonic cell (e.g., the nucleus of a two-cell embryo) following the initiation of cell division (Brinster, et al., Proc. Nat. Acad. Sci. USA 82: 4438, 1985).
  • Embryos can be infected with viruses, especially retroviruses, modified at a recombination site with a nucleic acid sequence of interest.
  • the cell can further be treated with an altered φC31 integrase as described above to promote integration of the nucleic acid sequence of interest into the genome.
  • an altered φC31 integrase in the form of an mRNA may be particularly advantageous. There would then be no requirement for transcription of the incoming altered φC31 integrase gene and no chance that the altered φC31 integrase gene would become integrated into the genome.
  • To prepare transgenic mice, female mice are induced to superovulate. After being allowed to mate, the females are sacrificed by CO2 asphyxiation or cervical dislocation and embryos are recovered from excised oviducts. Surrounding cumulus cells are removed. Pronuclear embryos are then washed and stored until the time of injection. Randomly cycling adult female mice are paired with vasectomized males. Recipient females are mated at the same time as donor females. Embryos then are transferred surgically. The procedure for generating transgenic rats is similar to that of mice (see Hammer, et al., Cell 63:1099-1112, 1990). Rodents suitable for transgenic experiments can be obtained from standard commercial sources such as Charles River (Wilmington, Mass.), Taconic (Germantown, N.Y.), Harlan Sprague Dawley (Indianapolis, Ind.), etc.
  • Procedures for microinjection of DNA into the pronucleus of the zygote are well known to those of ordinary skill in the art (Hogan, et al., supra). Microinjection procedures for fish, amphibian eggs and birds are detailed in Houdebine and Chourrout, Experientia 47:897-905, 1991. Other procedures for introduction of DNA into tissues of animals are described in U.S. Pat. No. 4,945,050 (Sandford et al., Jul. 30, 1990).
  • Totipotent or pluripotent stem cells derived from the inner cell mass of the embryo and stabilized in culture can be manipulated in culture to incorporate nucleic acid sequences employing invention methods.
  • a transgenic animal can be produced from such cells through injection into a blastocyst that is then implanted into a foster mother and allowed to come to term.
  • Methods for the culturing of stem cells and the subsequent production of transgenic animals by the introduction of DNA into stem cells using methods such as electroporation, calcium phosphate/DNA precipitation, microinjection, liposome fusion, retroviral infection, and the like are also well known to those of ordinary skill in the art. (See, for example, Teratocarcinomas and Embryonic Stem Cells, A Practical Approach, E. J. Robertson, ed., IRL Press, 1987).
  • the final phase of the procedure is to inject targeted ES cells into blastocysts and to transfer the blastocysts into pseudopregnant females.
  • the resulting chimeric animals are bred and the offspring are analyzed by Southern blotting to identify individuals that carry the transgene.
  • Procedures for the production of non-rodent mammals and other animals have been discussed by others (see Houdebine and Chourrout, supra; Pursel, et al., Science 244:1281-1288, 1989; and Simms, et al., Bio/Technology 6:179-183, 1988).
  • The term "transgenic" as used herein additionally includes any organism whose genome has been altered by in vitro manipulation of the early embryo or fertilized egg or by any transgenic technology to induce a specific gene knockout.
  • The term "gene knockout" refers to the targeted disruption of a gene in vivo, with loss of function, that has been achieved by use of the invention vector.
  • transgenic animals having gene knockouts are those in which the target gene has been rendered nonfunctional by an insertion targeted to a pseudo-recombination site located within the gene sequence.
  • a further embodiment of the invention comprises a method of treating a disorder in a subject in need of such treatment.
  • at least one cell or cell type (or tissue, etc.) of the subject has a target recombination sequence for an altered φC31 integrase of the present invention, such as a pseudo attP site.
  • This cell(s) is transformed with a nucleic acid construct (a "targeting construct") comprising a second recombination sequence and one or more polynucleotides of interest (typically a therapeutic gene).
  • an altered φC31 integrase is introduced that specifically recognizes the recombination sequences under conditions such that the nucleic acid sequence of interest is inserted into the genome via a recombination event.
  • Subjects treatable using the methods of the invention include both humans and non-human animals. Such methods utilize the targeting constructs and altered φC31 integrase of the present invention.
  • a variety of disorders may be treated by employing the method of the invention including monogenic disorders, infectious diseases, acquired disorders, cancer, and the like.
  • monogenic disorders include ADA deficiency, cystic fibrosis, familial hypercholesterolemia, hemophilia, chronic granulomatous disease, Duchenne muscular dystrophy, Fanconi anemia, sickle-cell anemia, Gaucher's disease, Hunter syndrome, X-linked SCID, and the like.
  • Infectious diseases treatable by employing the methods of the invention include infection with various types of virus including human T-cell lymphotropic virus, influenza virus, papilloma virus, hepatitis virus, herpes virus, Epstein-Barr virus, immunodeficiency viruses (HIV, and the like), cytomegalovirus, and the like. Also included are infections with other pathogenic organisms such as Mycobacterium tuberculosis, Mycoplasma pneumoniae, and the like, or parasites such as Plasmodium falciparum, and the like.
  • the term "acquired disorder" as used herein refers to a noncongenital disorder.
  • Such disorders are generally considered more complex than monogenic disorders and may result from inappropriate or unwanted activity of one or more genes.
  • Examples of such disorders include peripheral artery disease, rheumatoid arthritis, coronary artery disease, and the like.
  • a particular group of acquired disorders treatable by employing the methods of the invention include various cancers, including both solid tumors and hematopoietic cancers such as leukemias and lymphomas.
  • Solid tumors that are treatable utilizing the invention method include carcinomas, sarcomas, osteomas, fibrosarcomas, chondrosarcomas, and the like.
  • Specific cancers include breast cancer, brain cancer, lung cancer (non-small cell and small cell), colon cancer, pancreatic cancer, prostate cancer, gastric cancer, bladder cancer, kidney cancer, head and neck cancer, and the like.
  • the suitability of the particular place in the genome is dependent in part on the particular disorder being treated.
  • For example, if the disorder is a monogenic disorder and the desired treatment is the addition of a therapeutic nucleic acid encoding a non-mutated form of the nucleic acid thought to be the causative agent of the disorder, a suitable place may be a region of the genome that does not encode any known protein and which allows for a reasonable expression level of the added nucleic acid.
  • the nucleic acid construct (e.g., a targeting vector) useful in this embodiment is additionally comprised of one or more nucleic acid fragments of interest.
  • Preferred nucleic acid fragments of interest for use in this embodiment are therapeutic genes and/or control regions, as previously defined.
  • the choice of nucleic acid sequence will depend on the nature of the disorder to be treated.
  • a nucleic acid construct intended to treat hemophilia B, which is caused by a deficiency of coagulation factor IX, may comprise a nucleic acid fragment encoding functional factor IX.
  • a nucleic acid construct intended to treat obstructive peripheral artery disease may comprise nucleic acid fragments encoding proteins that stimulate the growth of new blood vessels, such as, for example, vascular endothelial growth factor, platelet-derived growth factor, and the like. Those of skill in the art would readily recognize which nucleic acid fragments of interest would be useful in the treatment of a particular disorder.
  • the nucleic acid construct can be administered to the subject being treated using a variety of methods. Administration can take place in vivo or ex vivo.
  • By "in vivo" it is meant in the living body of an animal.
  • By "ex vivo" it is meant that cells or organs are modified outside of the body; such cells or organs are typically returned to a living body.
  • Nucleic acid constructs can be delivered with cationic lipids (Goddard, et al., Gene Therapy 4:1231-1236, 1997; Gorman, et al., Gene Therapy 4:983-992, 1997; Chadwick, et al., Gene Therapy 4:937-942, 1997; Gokhale, et al., Gene Therapy 4:1289-1299, 1997; Gao and Huang, Gene Therapy 2:710-722, 1995, all of which are incorporated by reference herein), using viral vectors (Monahan, et al., Gene Therapy 4:40-49, 1997; Onodera, et al., Blood 91:30-36, 1998, all of which are incorporated by reference herein), by uptake of "naked DNA", and the like.
  • nucleic acid constructs can be used for the ex vivo administration of nucleic acid constructs.
  • the exact formulation, route of administration and dosage can be chosen by the individual physician in view of the patient's condition. (See e.g. Fingl et al., 1975, in "The Pharmacological Basis of Therapeutics", Ch. 1 p. 1).
  • the attending physician would know how to and when to terminate, interrupt, or adjust administration due to toxicity, to organ dysfunction, and the like. Conversely, the attending physician would also know how to adjust treatment to higher levels if the clinical response were not adequate (precluding toxicity).
  • the magnitude of an administered dose in the management of the disorder being treated will vary with the severity of the condition to be treated, with the route of administration, and the like. The severity of the condition may, for example, be evaluated, in part, by standard prognostic evaluation methods. Further, the dose and perhaps dose frequency will also vary according to the age, body weight, and response of the individual patient.
  • the method and route of administration will optimally be chosen to modify at least 0.1-1% of the target cells per administration. In this way, the number of administrations can be held to a minimum in order to increase the efficiency and convenience of the treatment.
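The link between the per-administration modification rate and the number of administrations can be illustrated with a small calculation. The sketch below rests on an assumption that is not made in the text: that each administration independently modifies a fixed fraction of the target cells that are still unmodified. The function names and the 5% cumulative target are hypothetical and chosen only for illustration.

```python
import math


def cumulative_fraction(per_dose: float, doses: int) -> float:
    """Fraction of target cells modified after `doses` administrations.

    Assumes (hypothetically) that every administration modifies the same
    fraction `per_dose` of the cells that are still unmodified.
    """
    return 1.0 - (1.0 - per_dose) ** doses


def doses_needed(per_dose: float, target: float) -> int:
    """Smallest number of administrations whose cumulative fraction reaches `target`."""
    if not (0.0 < per_dose < 1.0 and 0.0 < target < 1.0):
        raise ValueError("per_dose and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - per_dose))


if __name__ == "__main__":
    # Compare the two ends of the 0.1-1% range mentioned in the text.
    for per_dose in (0.001, 0.01):
        n = doses_needed(per_dose, target=0.05)
        print(f"per-dose fraction {per_dose:.3f}: {n} administrations")
```

Under this simplified model, a tenfold higher per-administration rate cuts the number of administrations needed for a given cumulative target by roughly a factor of ten.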
  • Such agents may be formulated and administered systemically or locally. Techniques for formulation and administration may be found in "Remington's Pharmaceutical Sciences," 1990, 18th ed., Mack Publishing Co., Easton, Pa. Suitable routes may include oral, rectal, transdermal, vaginal, transmucosal, or intestinal administration; parenteral delivery, including intramuscular, subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, intraperitoneal, intranasal, or intraocular injections, just to name a few.
  • the subject being treated will additionally be administered an altered φC31 integrase that specifically recognizes the recombination sequences that are selected for use.
  • the particular altered φC31 integrase can be administered by including a nucleic acid encoding it as part of a nucleic acid construct, or as a protein to be taken up by the cells whose genome is to be modified. Methods and routes of administration will be similar to those described above for administration of a targeting construct comprising a recombination sequence and nucleic acid sequence of interest.
  • the altered φC31 integrase protein is likely to only be required for a limited period of time for integration of the nucleic acid sequence of interest.
  • the vector carrying the altered φC31 integrase gene will lack sequences mediating prolonged retention.
  • conventional plasmid DNA decays rapidly in most mammalian cells.
  • the altered φC31 integrase gene may also be equipped with gene expression sequences that limit its expression.
  • an inducible promoter can be used, so that altered φC31 integrase expression can be temporally limited by limited exposure to the inducing agent.
  • Examples of such promoters are tetracycline-responsive promoters, the expression of which can be regulated using tetracycline or doxycycline.
  • kits for practicing the subject methods include at least one or more of, and usually all of: an altered φC31 integrase or an expression vector encoding the same, and a targeting vector as described above.
  • the altered φC31 integrase component can be provided in any suitable form (e.g., as a protein formulated for introduction into a target cell or in a recombinase vector which provides for expression of the desired recombinase following introduction into the target cell).
  • the targeting vector will include at least a first recombination site, such as an attB site, and a restriction endonuclease site for insertion of a nucleic acid sequence of interest.
  • kits may further include an aqueous delivery vehicle, e.g. a buffered saline solution, etc.
  • the kits may include one or more restriction endonucleases for use in transferring a nucleic acid of interest into the targeting vector.
  • the above components may be combined into a single aqueous composition for delivery into the host or separate as different or disparate compositions, e.g., in separate containers.
  • the kit may further include a vascular delivery means for delivering the aqueous composition to the host, e.g. a syringe etc., where the delivery means may or may not be pre-loaded with the aqueous composition.
  • the subject kits typically further include instructions for using the components of the kit to practice the subject methods.
  • the instructions for practicing the subject methods are generally recorded on a suitable recording medium.
  • the instructions may be printed on a substrate, such as paper or plastic, etc.
  • the instructions may be present in the kits as a package insert, in the labeling of the container of the kit or components thereof (i.e., associated with the packaging or subpackaging) etc.
  • the instructions are present as an electronic storage data file present on a suitable computer readable storage medium, e.g. CD-ROM, diskette, etc.
  • the actual instructions are not present in the kit, but means for obtaining the instructions from a remote source, e.g. via the internet, are provided.
  • An example of this embodiment is a kit that includes a web address where the instructions can be viewed and/or from which the instructions can be downloaded. As with the instructions, this means for obtaining the instructions is recorded on a suitable substrate.
  • Plasmid pFC1 was used as the recipient plasmid for cloning libraries of mutant integrases for use in the pre-screen in bacteria.
  • pFC1 (FIG 1, panel A) was derived from the plasmid pBCPB+ (Groth et al., PNAS 97:5995-6000 (2000)), which carries the chloramphenicol resistance gene and the lacZ gene flanked by φC31 attP and attB sites in the same orientation.
  • An empty expression cassette containing the CMV promoter, the lac promoter, the SV40 polyA, and a multiple cloning site was cloned into the AgeI site of pBCPB+.
  • Mutant integrase libraries were cloned downstream of the CMV and lac promoters, for expression in both mammalian cells and bacteria.
  • the lacZ gene was excised, giving rise to a plasmid termed pFC1-M, containing a candidate mutant integrase gene (FIG 1, Panel A and FIG 2, Panel A).
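The logic of the bacterial pre-screen, in which recombination between attP and attB sites in the same orientation removes the intervening lacZ gene, can be sketched as a toy model. The feature order in the list and the label "att-hybrid" for the site left behind are assumptions made for illustration; they are not taken from the plasmid map.

```python
from typing import List, Tuple


def excise(features: List[str],
           site_a: str = "attP",
           site_b: str = "attB") -> Tuple[List[str], List[str]]:
    """Model recombination between two sites in the same orientation.

    Returns (remaining plasmid, excised segment). The two sites are replaced
    by a single placeholder hybrid site on the remaining plasmid.
    """
    i, j = sorted((features.index(site_a), features.index(site_b)))
    excised = features[i + 1:j]
    remaining = features[:i] + ["att-hybrid"] + features[j + 1:]
    return remaining, excised


# Hypothetical linear representation of the pFC1 library plasmid.
pfc1 = ["cat", "attP", "lacZ", "attB", "CMV/lac promoters", "mutant integrase"]
pfc1_m, lost = excise(pfc1)

if __name__ == "__main__":
    print("remaining:", pfc1_m)
    print("excised:  ", lost)
```

In this picture, a colony carrying an active integrase mutant loses lacZ and can be told apart from one whose plasmid is unrecombined.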
  • pFC1-M mutant integrase candidates were assayed in the mammalian screen by using the plasmid pBP-Green (FIG 1, panel B), which was derived from pEGFP-C1 (Stratagene, La Jolla, CA).
  • the CMV promoter of pEGFP-C1 was reversed so that it no longer drove expression of the enhanced green fluorescent protein (eGFP) gene.
  • the φC31 attP and attB sites were cloned in opposite orientations flanking the reversed CMV promoter. An integrase-mediated recombination reaction between the two att sites would result in "flipping" the CMV promoter, enabling it to drive expression of eGFP.
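The promoter "flipping" readout can be modeled in the same toy fashion: recombination between sites in opposite orientations inverts the segment between them. In the sketch below, the feature list, the strand labels, and the helper names are hypothetical, and the att sites keep their original names after recombination for simplicity (in reality they become hybrid sites).

```python
from typing import List, Tuple

Feature = Tuple[str, str]  # (name, strand), strand is "+" or "-"


def invert_between(features: List[Feature],
                   site_a: str = "attP",
                   site_b: str = "attB") -> List[Feature]:
    """Invert the segment lying between two oppositely oriented sites."""
    names = [name for name, _ in features]
    i, j = sorted((names.index(site_a), names.index(site_b)))
    flip = {"+": "-", "-": "+"}
    # Inversion reverses the order of the segment and flips each strand.
    inverted = [(name, flip[strand]) for name, strand in reversed(features[i + 1:j])]
    return features[:i + 1] + inverted + features[j:]


def promoter_drives(features: List[Feature], promoter: str, gene: str) -> bool:
    """True if the promoter is on the gene's strand and upstream of it."""
    lookup = {name: (pos, strand) for pos, (name, strand) in enumerate(features)}
    (p_pos, p_strand), (g_pos, g_strand) = lookup[promoter], lookup[gene]
    if p_strand != g_strand:
        return False
    return p_pos < g_pos if p_strand == "+" else p_pos > g_pos


# Hypothetical layout of the pBP-Green reporter before recombination.
reporter = [("attP", "+"), ("CMV promoter", "-"), ("attB", "-"), ("eGFP", "+")]
flipped = invert_between(reporter)

if __name__ == "__main__":
    print("before:", promoter_drives(reporter, "CMV promoter", "eGFP"))
    print("after: ", promoter_drives(flipped, "CMV promoter", "eGFP"))
```

The design point is that eGFP expression acts as a gain-of-signal reporter: cells fluoresce only after a successful integrase reaction.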
  • pCS-P1, pCS-P2, pCS-P3, pCS-dP1, pCS-dP2, and pCS-dP3 were constructed by cloning integrases P1, P2, P3, dP1, dP2, and dP3 from the pFC1-M plasmid into the pCS (Olivares et al., 2002) backbone.
  • the "d" signified deletion of the N-terminal fusion sequence.
  • Plasmid pCS was digested with EcoRI and SmaI (New England Biolabs, Ipswich, MA).
  • pFC-P1, -P2, and -P3 were digested with BamHI, treated with DNA polymerase I (Klenow; New England Biolabs) to blunt the ends, and then digested with EcoRI.
  • the EcoRI-BamHI fragments containing the mutant integrase genes including the N-terminal fusion sequence were ligated into the EcoRI-SmaI-digested pCS backbone to give pCS-P1, pCS-P2, and pCS-P3 (FIG 2, panel B).
  • pCS was digested with SmaI.
  • pFC-P1, -P2, and -P3 were digested with AseI, BamHI, AlwNI, and DraIII.
  • the AseI-BamHI fragment containing the mutant integrase gene was treated with DNA polymerase I (Klenow) to create blunt ends and then ligated into the SmaI-linearized pCS vector to give pCS-dP1, pCS-dP2, and pCS-dP3 (FIG 2, Panel C).
  • the plasmid pNC-attB (Thyagarajan et al., Mol. Cell. Biol., 21:3926-3934 (2001); FIG 2, Panel D) was used for in vitro cell culture assays in mammalian cells.
  • This donor plasmid carried the φC31 attB site positioned 3' of a CMV promoter, the eGFP gene, and a neomycin resistance gene.
  • the pVFB donor plasmid used for mouse liver studies was generated by cloning a human factor IX mini-gene under the hAAT promoter (Miao et al., Mol. Ther., 1:522-532 (2000); Olivares et al., (2002)) into the pVax plasmid (Invitrogen, Carlsbad, CA).
  • pVax was digested with SpeI and then self-ligated to delete the CMV promoter (pVax-CMV⁻).
  • This pVax-CMV⁻ vector was then digested with SpeI and XhoI.
  • pBS-hFIX-attB (Ehrhardt et al., Molecular Therapy 11:695-706 (2005)) was digested with SpeI and XhoI to release a fragment containing the hFIX gene with hAAT promoter and the φC31 attB site. This fragment was ligated into pVax-CMV⁻ to generate pVFB (FIG 2, Panel E). Structure of the plasmid was confirmed by diagnostic digests and DNA sequencing.
  • a negative control (pFC1) and the wild-type integrase (pFC1-WT, generated by cloning the wild-type φC31 integrase into pFC1) were also transfected into 8 wells of each plate to provide a comparison for the mutants.
  • HA-tagged versions of the wild-type and integrase mutants were generated.
  • pCSI-HA was constructed by introducing the sequence coding the HA tag, YPYDVPY (SEQ ID NO:32), onto the C-terminal end of the φC31 integrase gene by PCR of a fragment using the forward primer FC31-873-F, 3'-CTGAGGTGATCTACAAGAA (SEQ ID NO:33), and the reverse primer FC31-CHA-R, 3'-
  • PCR product and pCSI were digested with BstEII and BamHI, gel extracted, and ligated to form pCSI-HA.
  • mutant integrase plasmids were digested with BstEII and BamHI and ligated with the digested PCR product containing the HA tag.
  • Total cell lysates were prepared from HeLa cells transfected with pCSI-HA, pCS-dP2-HA, pCS-P2-HA, pCS-dP3-HA, and pCS-P3-HA, and integrase expression levels were determined by Western blot analysis.
  • the cell extracts were prepared using cell lysis buffer (IM Tris pH7.5-7.8, 0.5M EDTA, 5 M NaCl, 0.5 M NaF, 1 M B-glycerophosphate, 0.2 M Na orthovanate, 10% Triton X-100, 0.1 M PMSF in isopropanol, and aprotanin), separated by 10% SDS-PAGE, and the protein bands transferred to nitrocellulose (Bio-Rad, Hercules, CA).
  • cell lysis buffer IM Tris pH7.5-7.8, 0.5M EDTA, 5 M NaCl, 0.5 M NaF, 1 M B-glycerophosphate, 0.2 M Na orthovanate, 10% Triton X-100, 0.1 M PMSF in isopropanol, and aprotanin
  • the blot was incubated for 1 hour in blocking buffer (1X TBS, containing 5% milk), followed by overnight incubation of the blot with blocking buffer containing an anti-β-actin antibody and a mouse anti-HA antibody (CalBiochem, San Diego, CA) at a 1:1,000 dilution at 4° C.
  • the blot was washed three times with 1X TBS containing 0.2% Tween for 7 min each and then incubated for 1 hour at room temperature with blocking buffer (1X TBS, with 0.2% Tween and 5% milk) containing HRP-labeled goat anti-mouse IgG (CalBiochem) at a 1:10,000 dilution.
  • the blot was washed three times again for 7 min each with wash buffer (1X TBS with 0.2% Tween) before developing using the SuperSignal West Pico Chemiluminescent Substrate kit (Pierce Biotechnology, Rockford, IL).
  • the cells were cultured in Dulbecco's modified Eagle medium (DMEM; Gibco) containing 4 mM L-glutamine and supplemented with 9% fetal bovine serum (FBS; Gibco) and 1% penicillin-streptomycin (Gibco). 24 hours after transfection, the cells from each well were trypsinized and seeded onto two 100 mm plates. 72 hours after the transfection, the medium was replaced with DMEM containing 1.25 mg/ml G418 (Invitrogen), and selection was maintained for 14-17 days, after which the colonies were counted.
  • 293-P3 cells (Thyagarajan et al., Mol. Cell. Biol. 21:3926-3934 (2001)) were used to test the ability of mutants to integrate at an attP site placed in the chromosomes.
  • 293-P3 cells at 70% confluence in 6-well plates were transfected in triplicate with 20 ng pNC-attB donor plasmid and 980 ng of plasmids expressing the mutant integrases (pCS-P1, pCS-P2, pCS-P3, pCS-dP1, pCS-dP2, and pCS-dP3) using 3 µl FuGENE 6.
  • the cells from each well were trypsinized and seeded onto two 100 mm dishes and cultured in complete DMEM for 24 hours, after which drug selection was initiated.
  • the 293-P3 cells on one 100 mm dish from each transfection were selected by using a combination of 350 µg/ml of G418 and 100 µg/ml of zeocin (Invitrogen), while the cells on the second 100 mm dish were selected using 350 µg/ml of G418 alone.
  • Genomic DNA was isolated using the DNEasy Tissue kit (Qiagen). 200 ng of this genomic DNA was used as a template for PCR amplification with primers designed to detect integration at five of the most frequently observed integration sites for φC31 integrase in the human genome (Chalberg et al., J. Mol. Biol. 357:28-48 (2006)).
  • each colony was individually trypsinized, transferred to one well of a 6-well plate, and each clone was allowed to grow to confluency. Fifty clones were picked for each integrase transfection. Genomic DNA was isolated using the DNEasy Tissue kit (Qiagen) following the manufacturer's protocol. Integration at the attP site was scored by PCR amplification of 200 ng of genomic DNA using primers designed to detect the junction created by recombination between the attP site placed in the genome and the attB site present in the donor plasmid.
  • the primer sequences were P3F: 5'-AGGTCTATATAAGCAGAGCTC (SEQ ID NO:35) and P3R: 5'-TGAGCACCGGAACGGCACTGG (SEQ ID NO:36).
  • the reaction was carried out at 95° C for 30 s; 65° C for 30 s; 72° C for 30 s for 10 cycles, with each cycle decreasing the annealing temperature by 1 degree, followed by 25 cycles of 95° C for 30 s; 55° C for 30 s; 72° C for 30 s, followed by 72° C for 7 min.
  • the PCR reactions were run on a 1% agarose gel, and the number of colonies having a band corresponding to the expected product size was counted to determine specificity for the attP site.
  • mice were acclimatized for 3-4 days prior to experimentation. The animal protocol was approved by the Stanford University Animal Use and Care Committee, based on NIH guidelines. Each experimental group consisted of 5-8 mice.
  • 20 µg pVFB donor plasmid and 20 µg pCSmI, pCSI, pCS-P2, or pCS-P3 in 1.8 ml of Hank's Balanced Salt Solution (HBSS; Gibco) were hydrodynamically injected over 6-8 seconds using a 3 ml Luer-Lok syringe (Becton Dickinson; Franklin Lakes, NJ) with a 21 1 A G needle (Becton Dickinson) via the tail vein. To dilate the tail vein, the mice were kept under a heat lamp for 3-7 minutes prior to injection. Detection of Human Factor IX by ELISA
  • For quantification of hFIX, whole blood was collected retro-orbitally from the mice at various time points. The blood samples were allowed to clot for 2 hours at room temperature or overnight at 2-8° C before centrifuging for 20 min at approximately 2000 x g. The serum was removed and stored at -20° C. The level of hFIX in serum was determined by using an ELISA protocol described previously (Olivares et al., Nature Biotechnology 20:1124-1128 (2002)). Briefly, ELISA plates were coated overnight at 4° C using monoclonal anti-hFIX antibody produced in mouse (Sigma, St. Louis, MO), diluted 1:1000 in coating buffer (0.1 M NaHCO3, pH 9.5).
  • the primary antibody was discarded and the wells rinsed thrice with 1X phosphate buffered saline containing 0.5% Tween (PBST; Gibco).
  • the plate was then incubated for 1 hour in blocking buffer (1X PBST containing 5% BSA; Roche), followed by 1-2 hours of incubation with the serum samples and a hFIX standard (Factor IX From Human Plasma, Sigma) diluted in blocking buffer at 37° C.
  • the wells were washed three times with 1X PBST and subsequently incubated for 1 hour at 37° C with the secondary HRP-labeled goat anti-hFIX antibody (GAFIX-APHRP; Enzyme Research, South Bend, IN) at a 1:1200 dilution in blocking buffer.
  • the wells were rinsed five times with 1X PBST before developing using an o-phenylenediamine (OPD; Sigma, St. Louis, MO) tablet dissolved in sodium citrate buffer, pH 4.5, and freshly added H2O2.
  • the reaction was stopped by addition of 2N H2SO4, and the plate was read at 490 nm using a standard microplate reader (Bio-Rad).
  • the data were analyzed using the Microplate Manager III Macintosh data analysis software.
  • Genomic DNA was obtained from the livers as previously described (Laird et al., Nucleic Acids Res 19:4293 (1991)). Generally, sections from different lobes of the liver were finely chopped and treated with lysis buffer (100 mM Tris pH 8.5, 5 mM EDTA pH 8, 0.2% SDS, 200 mM NaCl, 100 µg/ml Proteinase K) at 55° C for 5 hours. The lysed liver samples were then centrifuged at 3,000 x g for 10-15 min and the supernatant carefully transferred to another tube containing 1 volume of isopropanol.
  • 1 µl of the primary PCR product was used as a template for the second round of PCR.
  • the primers for this round were attBF4: 5'-CGGTGCGGGTGCCA (SEQ ID NO:39) and mpsL1R2: 5'-GGTCATGGAGCCCCTTCACAA (SEQ ID NO:40), and run on a program of 5 min at 94° C, 35 cycles of 94° C for 30 s, 63° C for 30 s, and 72° C for 45 s, followed by 7 min at 72° C.
  • the PCR product was subjected to agarose gel electrophoresis to detect a band corresponding to the expected size of 290 bp.
  • the sections were then blocked with 10% rabbit serum in PBST for 1 hour at room temperature, washed with PBST, and incubated overnight at 4° C with goat anti-hFIX primary antibody (Affinity Biologicals, Ancaster, ON, Canada). After rinsing three times with 1X PBST, sections were incubated for 1 h at room temperature with rabbit anti-goat Alexa Fluor 488 (Invitrogen) as the secondary antibody. The sections were washed three times with 1X PBST and mounted using ProLong Gold Antifade Reagent with DAPI (Invitrogen). Fluorescence images were obtained using a Zeiss microscope.
  • [00175] Libraries carrying mutant φC31 integrase genes were generated by using three different methods. In method one, site-directed mutants were synthesized by using overlapping oligonucleotides and high-fidelity PCR. This method was carried out primarily for alanine-scanning mutagenesis and for combining beneficial mutations. For alanine-scanning mutagenesis, all charged amino acids in the N-terminal catalytic domain (~amino acids 1-150) were replaced with alanine.
  • mutant integrase genes were cloned into a pFC1 vector, transformed into E. coli, and plated on LB-agar plates containing chloramphenicol and X-gal. After overnight incubation at 37° C, plates were screened for the presence of blue and white colonies (FIG 1, Panel A). White colonies signified plasmids carrying a functional integrase and were picked for further analysis.
  • the mean fluorescence intensity of the green cells was used as a measure of the efficiency of the integrase. Control experiments showed that the mean fluorescence intensity of the green cells was not highly sensitive to the amount of integrase plasmid transfected into the cells (data not shown). Eight replicates of the "flipper" assay were performed with each mutant, in order to more reliably detect the expected small improvements in efficiency reflected by higher mean fluorescence. The best mutants showed improvement in activity over wild-type that typically ranged from 1.2- to 1.7-fold (Table 1). A collection of mutants that showed improved activity was sequenced to determine the location of the mutations.
  • mutant integrases were chosen for combinatorial studies. For convenience, and because of the focus on increased efficiency, only mutants that had amino acid changes located in the N-terminal catalytic domain were used. Several mutants were combined in various configurations to generate a second generation of mutants. The second-generation mutants were synthesized by using overlapping oligonucleotides containing the desired sequences and high-fidelity PCR amplification. Second generation mutants were tested in the pBP-Green flipper assay. Some of the combinations produced mutants that showed higher catalytic efficiency than either of the parents. The amino acid changes and fold improvements of the first and second generation mutants with highest integration efficiencies are shown in Table 1.
  • the best second-generation mutants, called P1, P2, and P3 and cloned in the pFC backbone (FIG 2, Panel A), had a 1.8- to 2.3-fold improved ability over wild-type integrase to recombine native attB and attP sites in the pBP-Green extra-chromosomal flipper assay.
  • the P1 mutant was a combination of two individual mutants and had a total of five amino acid changes in the φC31 integrase protein.
  • the P2 mutant combined three individual alanine-scanning mutations, for a total of three amino acid changes in the integrase protein.
  • the P3 mutant was a combination of P1 and P2 and had nine amino acid changes in the integrase amino acid sequence including an inadvertent deletion of 3 base pairs that deleted an amino acid (Table 1).
  • the P2 mutant carries three mutations in the catalytic domain of φC31 integrase that change charged residues to alanines.
  • the mutational changes in P2 involve amino acids 40, 44, and 52 (Table 1), all of which are well within the predicted ~160 amino acid catalytic domain of φC31 integrase (Smith et al., Molec. Microbiol. 44:299-307 (2002)).
  • the atomic structure has been solved to date only for the gamma-delta resolvase (Yang et al., Cell 82:193-207 (1995)).
  • φC31 integrase may share a similar three-dimensional structure in the catalytic domain. It is plausible that the amino acid changes in P2, as they are in proximity to the catalytic serine residue, influence the activity of the enzyme.
  • AN N-TERMINAL FUSION SEQUENCE IS TRANSLATED [00182] The efficiency of the mutant integrases in mediating genomic integration of plasmid DNA in mammalian cells was also examined. In order to compare directly the new mutants with the previously characterized wild-type integrase, pCSI, which is in the pCS backbone (Thorpe et al., Proc. Natl. Acad. Sci. USA 95:5505-5510 (1998)), the P1, P2, and P3 integrase genes were first transferred from the pFC1 backbone to the pCS backbone.
  • mutant integrases with or without the putative fusion sequence were cloned into a pCS backbone such that they were tagged with the haemagglutinin (HA) peptide to permit easy purification of the proteins.
  • pCS-P1-HA, pCS-P2-HA, and pCS-P3-HA carried the fusion sequence (FIG 2, Panel B), while pCS-dP1-HA, pCS-dP2-HA, and pCS-dP3-HA lacked the fusion sequence (FIG 2, Panel C).
  • Western blot analysis was performed on total HeLa cell lysates isolated 48 hours after transfecting plasmids encoding the HA-tagged integrases. As shown in FIG 3, the mutant integrases with the N-terminal fusion sequence were larger than those without the fusion sequence. The size difference corresponded to the expected 33 amino acids that would result from translation of the fusion sequence (Table 1).
  • mutant integrases containing the fusion had higher integration efficiencies than the integrases from which the fusion sequence had been removed.
  • pCS-P2 had an integration frequency that was approximately two-fold elevated compared to wild-type φC31 integrase.
  • HeLa cells did not address the integration specificity of the integrase.
  • HeLa cells were transfected with pNC-attB and either pCSI, pCS-P2, or pCS-dP2, and G418 selection was carried out.
  • the cells were either plated undiluted or were diluted 1:2, 1:4, or 1:10, to create populations representing various numbers of clones. For each integrase, approximately 250 colonies were pooled from the undiluted plate, 130 colonies from the 1:2 diluted plate, 60 colonies from the 1:4 diluted plate, and 25 colonies from the 1:10 diluted plate. Genomic DNA was isolated from the pools, and PCR analysis was performed to look for integration at the five pseudo sites. This analysis utilized PCR primers (Chalberg et al., J. Mol. Biol. 357:28-48 (2006)) that specifically detected junction fragments created by the juxtaposition of attB and chromosomal sequences located at the five pseudo attP sites.
  • the 293-P3 cell line contains a randomly integrated expression cassette carrying an attP site and a promoterless zeocin resistance gene (Thyagarajan et al., Molecular and Cellular Biology 21:3926-3934 (2001)).
  • 293-P3 was co-transfected with pNC-attB and the three mutant integrases, P1, P2, and P3, with or without the N-terminal fusion sequence. Integration of pNC-attB at the chromosomal attP site was expected to give rise to colonies that were resistant to both G418 and zeocin.
  • G418-resistant clones generated by these integrases were analyzed for the numbers of clones having integration at the attP site, versus at another location. To perform this analysis, approximately 50 G418-resistant clones were picked for each integrase, genomic DNA was isolated, and PCR analysis was carried out to determine the presence or absence of a band diagnostic for integration at attP.
  • a donor plasmid, pVFB, carrying the φC31 attB site and the human factor IX (hFIX) gene (FIG 2, Panel E) was co-injected with either pCSI carrying the wild-type φC31 integrase gene; pCSmI, an identical plasmid carrying a point mutation inactivating the integrase; pCS-P2; or pCS-P3.
  • The results of this study are depicted in FIG 6A.
  • the group that received wild-type φC31 integrase (pCSI) displayed a 3.9-fold increase in hFIX levels over the group that received pCSmI, a significant increase (p < 0.05).
  • Mice that were co-injected with mutant integrase pCS-P2 had a significant (2.3-fold) increase in hFIX levels over pCSI. This result shows that the P2 mutant can elevate the level of integration in mouse liver, as it did in cultured human cells.
  • hFIX levels mediated by pCSI and pCS-P2 persisted during the three-month duration of the experiment and represented therapeutic levels of hFIX.
  • the levels of hFIX generated by the pCS-P3 mutant were not significantly higher than those generated by the pCSmI inactive integrase.
  • FIG 6B shows representative liver sections from mice that were uninjected, received buffer alone, or received pVFB along with pCSmI, pCSI, pCS-P2, or pCS-P3. These sections revealed that a significantly higher number of cells stained positive for hFIX in livers that received wild-type integrase, compared to the control group that received the inactive form of integrase, as expected. Moreover, the P2 mutant gave rise to more cells expressing hFIX than did the wild-type integrase. This result shows that P2 had a higher integration efficiency than wild-type φC31 integrase. By contrast, the P3 mutant generated a lower number of stained cells, consistent with an integration efficiency at pseudo attP sites that was lower than that of the wild-type or P2 integrases.
  • hFIX-positive cells and all nuclei visible from the DAPI stain were counted. Mice that received pCSmI had about 1.9% of hepatocytes that were positive for hFIX. Much of this signal may be due to random integration of the pVFB plasmid following hydrodynamic delivery of many copies of plasmid DNA into the hepatocytes. In the group that was given pCSI, approximately 12.4% of the hepatocytes were positive for hFIX, representing a robust integration frequency. The P2 mutant resulted in an even higher integration efficiency, with approximately 18.5% of the cells expressing hFIX. However, the P3 mutant generated only about 4.25% positive cells.
  • genomic DNA was isolated from the livers of two animals per group and analyzed by PCR for integration of pVFB at the mpsL1 site. PCR bands of the expected size were seen in the positive control lane, as well as in samples from the groups that received pVFB and pCSI or pCS-P2 (FIG 7, Panel A). The PCR product was not observed in reactions using DNA from animals that received the inactive integrase or the P3 mutant integrase. The PCR bands from the positive control liver, as well as from the pCSI and pCS-P2 livers, were excised and subjected to DNA sequencing.
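
The touchdown PCR program given above for scoring integration at the chromosomal attP site (ten cycles starting at a 65° C annealing step and dropping 1 degree per cycle, then 25 cycles at 55° C, then a 7 min final extension) can be laid out cycle by cycle. The following Python sketch is illustrative only; the function names are hypothetical, and it assumes that the first touchdown cycle anneals at 65° C and that ramp times are ignored.

```python
# Illustrative sketch of the touchdown thermal profile described in the text.
# Assumption: the first touchdown cycle anneals at 65 C, and each later
# touchdown cycle anneals 1 degree lower (65, 64, ..., 56).

STEP_SECONDS = 30                 # each denature/anneal/extend step lasts 30 s
FINAL_EXTENSION_SECONDS = 7 * 60  # final 72 C hold


def touchdown_program(start_anneal=65, touchdown_cycles=10, step_down=1,
                      plateau_anneal=55, plateau_cycles=25):
    """Return one (denature, anneal, extend) temperature tuple per cycle, in C."""
    cycles = []
    for i in range(touchdown_cycles):
        cycles.append((95, start_anneal - i * step_down, 72))
    cycles.extend([(95, plateau_anneal, 72)] * plateau_cycles)
    return cycles


def cycling_seconds(cycles):
    """Total hold time in seconds, excluding ramping between temperatures."""
    return len(cycles) * 3 * STEP_SECONDS + FINAL_EXTENSION_SECONDS


program = touchdown_program()
for number, (denature, anneal, extend) in enumerate(program, start=1):
    print(f"cycle {number:2d}: {denature} C / {anneal} C / {extend} C")
print("hold time (s):", cycling_seconds(program))
```

Under these assumptions the program has 35 cycles in total, and the last touchdown cycle anneals at 56° C before the plateau at 55° C begins.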

Description

ALTERED PHIC31 INTEGRASES HAVING IMPROVED EFFICIENCY AND SPECIFICITY AND METHODS OF USING SAME
CROSS-REFERENCE
[0001] This application claims the benefit of U.S. Provisional Application No. 61/095,257, filed September 8, 2008, which application is incorporated herein by reference in its entirety.
BACKGROUND OF THE INVENTION
[0002] The current inability to perform efficient, site-specific integration of incoming DNA into the chromosomes of higher organisms is holding up advances in basic and applied biology. Recently, strategies for chromosomal integration that take advantage of the high efficiency and tight sequence specificity of recombinase enzymes isolated from microorganisms have been described. In particular, a class of phage integrases that includes the φC31 integrase (Kuhstoss, S., and Rao, R. N., J. Mol. Biol. 222, 897-908 (1991); Rausch, H., and Lehmann, M., Nucleic Acids Research 19, 5187-5189 (1991)) has been shown to function in mammalian cells (Groth, A. C., et al., Proc. Natl. Acad. Sci. USA 97, 5995-6000 (2000)).
[0003] Such site-specific recombinase enzymes have long DNA recognition sites that are typically not present even in the large genomes of mammalian cells. However, it has been recently demonstrated that recombinase pseudo sites, i.e. sites with a significant degree of identity to the wild-type binding site for the recombinase, are present in these genomes (Thyagarajan, B., et al., Gene 244, 47-54 (2000)).
[0004] The present disclosure addresses these needs by providing altered φC31 integrases that can be used more effectively in genetic engineering of the chromosomes of higher cells.
Relevant Literature
[0005] U.S. Patent No. 7,141,426 and Published U.S. Patent Application No. 2007/0077589.
SUMMARY OF THE INVENTION
[0006] The present invention relates to the identification, isolation, cloning, expression, purification, and methods of use of altered φC31 integrases. In one aspect, the present invention is directed to a method of site-specifically integrating a polynucleotide sequence of interest in a genome of a target cell using an altered φC31 integrase of the present invention.
[0007] These and other objects, advantages, and features of the invention will become apparent to those persons skilled in the art upon reading the details of the invention as more fully described below.
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] The invention is best understood from the following detailed description when read in conjunction with the accompanying drawings. It is emphasized that, according to common practice, the various features of the drawings are not to-scale. On the contrary, the dimensions of the various features are arbitrarily expanded or reduced for clarity. Included in the drawings are the following figures:
[0009] FIG 1 shows schematic diagrams of screens used to characterize φC31 integrase mutants. Panel A shows a schematic of a bacterial pre-screen for functional integrases. pFC1 was the recipient plasmid for libraries of mutant integrases. Recombination between the attB and attP sites mediated by a functional mutant integrase led to excision of the lacZ gene when transformed into E. coli, giving rise to a white colony on X-gal plates. Panel B shows a schematic of a mammalian screen for recombination efficiency. DNA isolated from white colonies (pFC-M) was co-transfected into 293 cells with pBP-Green. pBP-Green carries the eGFP gene and a CMV promoter in the inactive orientation, flanked by attP and attB. Recombination between the att sites, mediated by a functional mutant integrase, "flips" the CMV promoter, allowing eGFP expression, detected and quantified by a fluorescence analyzer.
[0010] FIG 2 shows schematics of plasmids. Panel A shows the pFC-P1, pFC-P2, and pFC-P3 plasmids that express the mutant integrases P1, P2, and P3 under the control of the CMV and lacZ promoters and also carry the chloramphenicol resistance gene. Panel B shows the pCS-P1, pCS-P2, and pCS-P3 plasmids that express the mutant integrases P1, P2, and P3 having the N-terminal fusion sequence and under the control of the CMV promoter on the ampicillin-resistant pCS vector backbone. Panel C shows the pCS-dP1, pCS-dP2, and pCS-dP3 plasmids that express the mutant integrases dP1, dP2, and dP3, deleted for the N-terminal fusion sequence, on a pCS backbone vector. Panel D shows the pNC-attB donor plasmid containing the φC31 attB site immediately downstream of a CMV promoter, an eGFP gene driven by a second CMV promoter, and the kanamycin resistance gene. Panel E shows the pVFB donor plasmid carrying the φC31 attB site, the human factor IX gene expressed from the hAAT promoter, and a kanamycin resistance gene.
[0011] FIG 3 shows results of a Western blot that demonstrates expression of the N-terminal fusion sequence. Each lane contains a HeLa cell extract from cells transfected with the following plasmids expressing an HA-tagged integrase: lane 1: pCSI-HA; lane 2: pCS-dP2-HA; lane 3: pCS-P2-HA; lane 4: pCS-dP3-HA; lane 5: pCS-P3-HA. The blot was stained with antibodies that detect the HA tag and β-actin. The lower band in all five lanes is the β-actin loading control.
[0012] FIGS 4A and 4B show results of integration into pseudo attP sites in cultured human cells. FIG 4A shows that HeLa cells received a donor attB neomycin resistant plasmid and a plasmid expressing a wild-type or mutant integrase and were grown under G418 selection for 2 weeks. The average numbers of G418-resistant colonies obtained for each integrase are shown. Values of interest are ± SEM. Asterisks denote values that differ at p<0.05 (Student's t test). FIG 4B shows frequency of integration at five commonly used human pseudo attP sites. Transfection and selection were done as above using the three designated integrases. PCR was performed on pooled G418-resistant colonies from plates that had been diluted as noted. PCR primers specific for detecting integration of the plasmid at each of the five sites were used, and the reactions were scored as positive or negative, depending on whether a band was obtained.
[0013] FIGS 5A and 5B show integration into the pre-integrated chromosomal attP site in 293-P3 cells. FIG 5A shows results of 293-P3 cells that were co-transfected with pNC-attB and a wild-type or mutant integrase-expressing plasmid and were maintained in selection media containing G418 and zeocin for ~2 weeks. The average number of colonies resistant to both G418 + zeocin is shown. All values shown represent the mean ± SEM. Asterisks denote comparative values between wild-type integrase (pCSI) and the P3 integrase with the N-terminal fusion sequence (pCS-P3) that differed at p<0.05 (Student's t test). FIG 5B shows PCR analysis to determine specificity of integration at attP. 293-P3 cells were co-transfected with a G418-resistant attB donor plasmid and either pCS-P3 or pCS-dP3. Numbers of G418-resistant colonies are shown. After two weeks of selection with G418, ~50 colonies generated by each integrase were picked and individually analyzed by PCR using primers that detect integration at the chromosomal attP site. Values are ± SEM, and asterisks denote values that differ at p<0.05 (Student's t test).
[0014] FIGS 6A and 6B show results of human factor IX expression in mouse liver. FIG 6A shows levels of hFIX expression in mouse serum up to 90 days post-injection. Animals were hydrodynamically injected with pVFB and pCSmI, pCSI, pCS-P2, or pCS-P3. Sera from blood samples were assayed by ELISA at various time points. Asterisk (*) denotes p-values of interest that were < 0.05. FIG 6B shows immunofluorescence staining of mouse liver sections for human factor IX. The top row shows DAPI stained nuclei, the middle row shows staining for human factor IX, and the bottom row is a merged image of the top and middle rows. Panel 1, naïve control; panel 2, HBSS alone; panel 3, pVFB + pCSmI; panel 4, pVFB + pCSI; panel 5, pVFB + pCS-P2; and panel 6, pVFB + pCS-P3.
[0015] FIG 7 shows evidence of integration of nucleic acid into a genomic pseudo attP site in mouse liver. Panel A shows that PCR was carried out on DNA extracted from liver, using primers that detect the junction between the attB site of pVFB and mpsL1, a preferred pseudo attP site in the mouse genome. Lane 1, liver DNA sample known to have attB-plasmid DNA integrated at mpsL1; lane 2, primers alone; lane 3, liver that received pVFB and pCSmI; lane 4, liver that received pVFB and pCSI; lane 5, liver that received pVFB and pCS-P2; and lane 6, liver that received pVFB and pCS-P3. Panel B shows sequences of the PCR bands from lanes 1, 4, and 5, which were excised, purified, TOPO-cloned, and sequenced. The sequences depicted begin in attB and join genomic DNA within the mpsL1 site. The TT core cross-over sequence is in bold, and ":" represents small deletions seen in the cross-over region.
[0016] FIG 8 shows the nucleic acid sequence of the wild-type φC31 integrase (SEQ ID NO:01).
[0017] FIG 9 shows the amino acid sequence of the wild-type φC31 integrase (SEQ ID NO:02).
[0018] FIG 10 shows the nucleic acid sequence of the altered φC31 integrase P1 (SEQ ID NO:03).
[0019] FIG 11 shows the amino acid sequence of the altered φC31 integrase P1 (SEQ ID NO:04).
[0020] FIG 12 shows the nucleic acid sequence of the altered φC31 integrase P2 (SEQ ID NO:05).
[0021] FIG 13 shows the amino acid sequence of the altered φC31 integrase P2 (SEQ ID NO:06).
[0022] FIG 14 shows the nucleic acid sequence of the altered φC31 integrase P3 (SEQ ID NO:07).
[0023] FIG 15 shows the amino acid sequence of the altered φC31 integrase P3 (SEQ ID NO:08).
[0024] FIGS 16A-16C show the DNA sequences of the full length φC31 attP (SEQ ID NO:09) (FIG 16A) and attB (SEQ ID NO:10) (FIG 16B) sites, respectively, and a 59 bp wild-type φC31 attP site (SEQ ID NO:11) (FIG 16C). In the figures the TT core is indicated in upper case.
[0025] FIG 17 shows approximately 475 bp of DNA sequence from human chromosome 8 that encompasses an exemplary φC31 integrase pseudo-attP site ψA (SEQ ID NO:12). The core TT sequence of the pseudo site is shown in bold. Approximately 40 bp surrounding the core represent the minimal pseudo attP site.
[0026] FIG 18 shows the nucleic acid sequences of 19 different pseudo attP sites present in the human genome (SEQ ID NO: 13-31). The site name identifies the chromosomal location of each pseudo attP site. The top sequence is a consensus attP site, which is symmetrical about the core and contains inverted repeats extending over the length of the consensus, indicated by the arrows.
[0027] FIG 19 shows an amino acid sequence alignment of the wild-type φC31 integrase (WT) (SEQ ID NO:02) and the altered φC31 integrases P1 (SEQ ID NO:04), P2 (SEQ ID NO:06), and P3 (SEQ ID NO:08).
DEFINITIONS
[0028] "Recombinases" are a family of enzymes that mediate site-specific recombination between specific DNA sequences recognized by the recombinase (Esposito, D., and Scocca, J. J., Nucleic Acids Research 25, 3605-3614 (1997); Nunes-Duby, S. E., et al., Nucleic Acids Research 26, 391-406 (1998); Stark, W. M., et al., Trends in Genetics 8, 432-439 (1992)). Within this group are several subfamilies including "Integrase" or tyrosine recombinase (including, for example, Cre and lambda integrase) and "Resolvase/Invertase" or serine recombinase (including, for example, φC31 integrase, R4 integrase, and TP-901 integrase).
[0029] "Altered recombinases" refer to recombinase enzymes in which the native, wild-type recombinase gene found in the organism of origin has been mutated in one or more positions. An altered recombinase possesses a DNA binding specificity and/or level of activity that differs from that of the wild-type enzyme. Such altered binding specificity permits the recombinase to react with a given DNA sequence differently than would the native enzyme, while an altered level of activity permits the recombinase to carry out the reaction at greater or lesser efficiency. A recombinase reaction typically includes binding to the recognition sequence and performing concerted cutting and ligation, resulting in strand exchanges between two recombining recognition sites.
[0030] A "unidirectional site-specific recombinase" is a naturally-occurring recombinase, such as the φC31 integrase, a mutated or altered recombinase, such as a mutated or altered φC31 integrase that retains unidirectional, site-specific recombination activity, or a bi-directional recombinase modified so as to be unidirectional, such as a Cre recombinase that has been modified to become unidirectional.
[0031] "Altered recombinases" and "mutant recombinases" are used interchangeably herein to refer to recombinase enzymes in which the native, wild-type recombinase gene found in the organism of origin has been mutated in one or more positions relative to a parent recombinase (e.g., in one or more nucleotides, which may result in alterations of one or more amino acids in the altered recombinase relative to a parent recombinase). "Parent recombinase" is used to refer to the nucleotide and/or amino acid sequence of the recombinase from which the altered recombinase is generated. The parent recombinase can be a naturally occurring enzyme (i.e., a native or wild-type enzyme) or a non-naturally occurring enzyme (e.g., a genetically engineered enzyme). Altered recombinases of interest in the invention exhibit a DNA binding specificity and/or level of activity that differs from that of the wild-type enzyme or other parent enzyme. Such altered binding specificity permits the recombinase to react with a given DNA sequence differently than would the parent enzyme, while an altered level of activity permits the recombinase to carry out the reaction at greater or lesser efficiency. A recombinase reaction typically includes binding to the recognition sequence and performing concerted cutting and ligation, resulting in strand exchanges between two recombining recognition sites.
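
By way of illustration only, the relationship between a parent recombinase and an altered recombinase can be expressed as a list of positions at which the two amino acid sequences differ. The Python sketch below is not part of the disclosed methods. It assumes two pre-aligned sequences of equal length in which "-" marks a deleted residue, and the short sequences used are hypothetical toy strings, not any integrase sequence of the disclosure.

```python
def list_changes(parent, altered):
    """List amino acid changes between two pre-aligned sequences.

    Positions are 1-based. A "-" in the altered sequence is reported as a
    deletion. Both inputs must already be aligned to the same length.
    """
    if len(parent) != len(altered):
        raise ValueError("sequences must be aligned to the same length")
    changes = []
    for position, (p, a) in enumerate(zip(parent, altered), start=1):
        if p == a:
            continue
        if a == "-":
            changes.append(f"{p}{position}del")
        else:
            changes.append(f"{p}{position}{a}")
    return changes


# Hypothetical toy sequences: two charged residues replaced by alanine.
toy_parent = "MKDERKLD"
toy_altered = "MKAERALD"
print(list_changes(toy_parent, toy_altered))

# Hypothetical toy example of a single-residue deletion.
print(list_changes("MKDE", "MK-E"))
```

An altered recombinase carrying several alanine substitutions and a one-residue deletion would, under this representation, yield one entry per changed position.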
[0032] "Site-specific integration" or "site-specifically integrating" as used herein refers to the sequence specific recombination and integration of a first nucleic acid with a second nucleic acid, typically mediated by a recombinase. In general, site-specific recombination or integration occurs at particular defined sequences recognized by the recombinase. In contrast to random integration, site specific integration occurs at a particular sequence (e.g., a recombinase attachment site) at a higher efficiency.
[0033] A "wild-type recombination site" as used herein means a recombination site normally used by an integrase or recombinase. For example, lambda is a temperate bacteriophage that infects E. coli. The phage has one attachment site for recombination (attP) and the E. coli bacterial genome has an attachment site for recombination (attB). Both of these sites are wild-type recombination sites for lambda integrase. In the context of the present invention, wild-type recombination sites occur in the homologous phage/bacteria system. Accordingly, wild-type recombination sites can be derived from the homologous system and associated with heterologous sequences, for example, the attB site can be placed in other systems to act as a substrate for the integrase.
[0034] The wild-type attB and attP recognition sites of phage φC31 (i.e. bacteriophage φC31) are generally about 34 to 40 nucleotides in length (Groth et al. Proc Natl Acad Sci USA 97:5995-6000 (2000)). These sites are typically arranged as follows: AttB comprises a first DNA sequence attB5', a core region, and a second DNA sequence attB3', in the relative order from 5' to 3' attB5'-core region-attB3'. AttP comprises a first DNA sequence attP5', a core region, and a second DNA sequence attP3', in the relative order from 5' to 3' attP5'-core region-attP3'. The core region of attP and attB of φC31 has the sequence 5'-TTG-3'.
[0035] Action of the integrase upon these recognition sites is unidirectional in that the enzymatic reaction produces nucleic acid recombination products that are not effective substrates of the integrase. This results in stable integration with little or no detectable recombinase-mediated excision, i.e., recombination that is "unidirectional". The recombination product of integrase action upon the recognition site pair comprises, for example, in order from 5' to 3': attB5'-recombination product site sequence-attP3', and attP5'-recombination product site sequence-attB3'. Thus, where the targeting vector comprises an attB site and the target genome comprises an attP sequence, a typical recombination product comprises the sequence (from 5' to 3'): attP5'-TTG-attB3'{targeting vector sequence}attB5'-TTG-attP3'. Because the attB and attP sites are different sequences, recombination results in a hybrid site-specific recombination site (designated attL or attR for left and right, respectively) that is neither an attB sequence nor an attP sequence, and is functionally unrecognizable as a site-specific recombination site (e.g., attB or attP) to the relevant unidirectional site-specific recombinase, thus removing the possibility that the unidirectional site-specific recombinase will catalyze a second recombination reaction between the attL and the attR that would reverse the first recombination reaction.
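The arrangement of the recombination product described above can be illustrated with a short Python sketch. The arm sequences used in any call are hypothetical placeholders chosen by the caller, not the actual φC31 attB or attP sequences, and the function name integrate is an assumption made for illustration only. The sketch treats the circular targeting vector as a string that is opened at its attB site, and it only rearranges text; it does not model the enzymology.

```python
# Illustrative string model of attB x attP recombination.
# The core "TTG" comes from the text; everything else is a hypothetical
# placeholder and not a real phiC31 sequence.
CORE = "TTG"


def make_site(five_prime_arm, three_prime_arm):
    """Build a site in the order 5' arm - core - 3' arm."""
    return five_prime_arm + CORE + three_prime_arm


def integrate(genome, attp_arms, vector, attb_arms):
    """Insert a circular vector carrying attB into a genome carrying attP.

    The vector is written as a linear string but treated as circular,
    so the sequence after attB is joined to the sequence before it.
    Returns: genome_left + attP5'-TTG-attB3' + {vector} + attB5'-TTG-attP3'
             + genome_right
    """
    p5, p3 = attp_arms
    b5, b3 = attb_arms
    attp = make_site(p5, p3)
    attb = make_site(b5, b3)

    g = genome.index(attp)  # raises ValueError if attP is absent
    v = vector.index(attb)  # raises ValueError if attB is absent

    genome_left = genome[:g]
    genome_right = genome[g + len(attp):]
    vector_before = vector[:v]
    vector_after = vector[v + len(attb):]

    left_junction = p5 + CORE + b3    # attP5'-TTG-attB3'
    right_junction = b5 + CORE + p3   # attB5'-TTG-attP3'
    return (genome_left + left_junction
            + vector_after + vector_before
            + right_junction + genome_right)
```

In this model neither junction matches the original attB or attP string, which mirrors the statement that the hybrid sites are not recognized as substrates for a reverse reaction.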
[0036] A "native recognition site", as used herein, means a recognition site that occurs naturally in the genome of a cell (i.e., the sites are not introduced into the genome, for example, by recombinant means).
[0037] A "pseudo-site" or a "pseudo-recombination site" as used herein means a DNA sequence comprising a recognition site that is bound by a recombinase enzyme where the recognition site differs in one or more nucleotides from a wild-type recombinase recognition sequence and/or is present as an endogenous sequence in a genome that differs from the sequence of a genome where the wild-type recognition sequence for the recombinase resides. For a given recombinase, a pseudo-recombination sequence is functionally equivalent to a wild-type recombination sequence, occurs in an organism other than that in which the recombinase is found in nature, and may have sequence variation relative to the wild type recombination sequences. In some embodiments a "pseudo attP site" or "pseudo attB site" refers to a pseudo site that is similar to the recognition site for wild-type phage (attP) or bacterial (attB) attachment site sequences, respectively, for phage integrase enzymes, such as the phage φC31. In many embodiments of the invention the pseudo attP site is present in the genome of a host cell, while the wild type attB site is present on a targeting vector. "Pseudo att site" is a more general term that can refer to either a pseudo attP site or a pseudo attB site. It is understood that att sites or pseudo att sites may be present on linear or circular nucleic acid molecules. In certain embodiments, the presence of "pseudo-recombination sites" in the genome of the target cell avoids the need for introducing a recombination site into the genome.
[0038] A "hybrid-recombination site", as used herein, refers to a recombination site constructed from portions of wild type and/or pseudo-recombination sites. As an example, a wild-type recombination site may have a short, core region flanked by palindromes. In one embodiment of a "hybrid-recombination site" the sequence 5' of the core region sequence of the hybrid-recombination site matches a pseudo-recombination site and the sequence 3' of the core of the hybrid-recombination site matches the wild-type recombination site. In an alternative embodiment, the hybrid-recombination site may be comprised of the region 5' of the core from a wild-type attB site and the region 3' of the core from a wild-type attP recombination site, or vice versa. Other combinations of such hybrid-recombination sites will be evident to those having ordinary skill in the art, in view of the teachings of the present specification.
[0039] By "nucleic acid construct" it is meant a nucleic acid sequence that has been constructed to comprise one or more functional units not found together in nature. Examples include circular, double-stranded, extrachromosomal DNA molecules (plasmids), cosmids (plasmids containing COS sequences from lambda phage), viral genomes comprising non-native nucleic acid sequences, and the like.
[0040] By "nucleic acid fragment of interest" it is meant any nucleic acid fragment that one wishes to insert into a genome. Suitable examples of nucleic acid fragments of interest include therapeutic genes, marker genes, control regions, trait-producing fragments, and the like.
[0041] "Therapeutic genes" are those nucleic acid sequences which encode molecules that provide some therapeutic benefit to the host, including proteins, antibodies, functional RNAs (antisense, hammerhead ribozymes), RNAi, siRNA, miRNA, shRNA, and the like. Well known examples include the cystic fibrosis transmembrane conductance regulator (CFTR) gene and the Factor IX gene. The primary physiological defect in cystic fibrosis is the failure of electrogenic chloride ion secretion across the epithelia of many organs, including the lungs. One of the most dangerous aspects of the disorder is the cycle of recurrent airway infections which gradually destroy lung function resulting in premature death. Cystic fibrosis is caused by a variety of mutations in the CFTR gene. Since the problems arising in cystic fibrosis result from mutations in a single gene, the possibility exists that the introduction of a normal copy of the gene into the lung epithelia could provide a treatment for the disease, or effect a cure if the gene transfer was permanent.
[0042] Other disorders resulting from mutations in a single gene (known as monogenic disorders) include alpha-1-antitrypsin deficiency, chronic granulomatous disease, familial hypercholesterolemia, Fanconi anemia, Gaucher disease, Hunter syndrome, ornithine transcarbamylase deficiency, purine nucleoside phosphorylase deficiency, severe combined immunodeficiency disease (SCID)-ADA, X-linked SCID, hemophilia, muscular dystrophy, and the like.
[0043] Therapeutic benefit in other disorders may also result from the addition of a protein-encoding therapeutic nucleic acid. For example, addition of a nucleic acid encoding an immunomodulating protein such as interleukin-2 may be of therapeutic benefit for patients suffering from different types of cancer.
[0044] A nucleic acid fragment of interest may additionally be a "marker nucleic acid" or "marker polypeptide". Marker genes encode proteins which can be easily detected in transformed cells and are, therefore, useful in the study of those cells. Marker genes are being used in bone marrow transplantation studies, for example, to investigate the biology of marrow reconstitution and the mechanism of relapse in patients. Examples of suitable marker genes include beta-galactosidase, green or yellow fluorescent proteins, chloramphenicol acetyl transferase, luciferase, and the like.
[0045] A nucleic acid fragment of interest may additionally be a control region. The term "control region" or "control element" includes all nucleic acid components which are operably linked to a nucleic acid fragment (e.g., DNA) and involved in the expression of a protein or RNA therefrom. The precise nature of the control (or regulatory) regions needed for coding sequence expression may vary from organism to organism. Such regions typically include those 5' noncoding sequences involved with initiation of transcription and translation, such as the enhancer, TATA box, capping sequence, CAAT sequence, and the like. Further exemplary control sequences include, but are not limited to, any sequence that functions to modulate replication, transcriptional or translational regulation, and the like. Examples include promoters, signal sequences, propeptide sequences, transcription terminators, polyadenylation sequences, enhancer sequences, attenuatory sequences, intron splice site sequences, and the like.
[0046] A nucleic acid fragment of interest may additionally be a trait-producing sequence, by which it is meant a sequence conferring some non-native trait upon the organism or cell in which the protein encoded by the trait-producing sequence is expressed. The term "non-native" when used in the context of a trait-producing sequence means that the trait produced is different than one would find in an unmodified organism which can mean that the organism produces high amounts of a natural substance in comparison to an unmodified organism, or produces a non-natural substance. For example, the genome of a crop plant, such as corn, can be modified to produce higher amounts of an essential amino acid, thus creating a plant of higher nutritional quality, or could be modified to produce proteins not normally produced in plants, such as antibodies. (See U.S. Pat. No. 5,202,422 (issued Apr. 13, 1993); U.S. Pat. No. 5,639,947 (Jun. 17, 1997).) Likewise, the genomes of industrially important microorganisms can be modified to make them more useful such as by inserting new metabolic pathways with the aim of producing novel metabolites or improving both new and existing processes such as the production of antibiotics and industrial enzymes. Other useful traits include herbicide resistance, antibiotic resistance, disease resistance, resistance to adverse environmental conditions (e.g., temperature, pH, salt, drought), and the like.
[0047] Methods of transforming cells are well known in the art. By "transformed" it is meant a heritable alteration in a cell resulting from the uptake of foreign DNA. Suitable methods include viral infection, transfection, conjugation, protoplast fusion, electroporation, particle gun technology, calcium phosphate precipitation, direct microinjection, and the like. The choice of method is generally dependent on the type of cell being transformed and the circumstances under which the transformation is taking place (i.e. in vitro, ex vivo, or in vivo). A general discussion of these methods can be found in Ausubel, et al, Short Protocols in Molecular Biology, 3rd ed., Wiley & Sons, 1995.
[0048] The terms "nucleic acid molecule" and "polynucleotide" are used interchangeably and refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown. Non-limiting examples of polynucleotides include a gene, a gene fragment, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers.
[0049] A polynucleotide is typically composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); and thymine (T) (uracil (U) for thymine (T) when the polynucleotide is RNA). Thus, the term polynucleotide sequence is the alphabetical representation of a polynucleotide molecule. This alphabetical representation can be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching.
[0050] A "coding sequence" or a sequence which "encodes" a selected polypeptide, is a nucleic acid molecule which is transcribed (in the case of DNA) and translated (in the case of mRNA) into a polypeptide, for example, in vivo when placed under the control of appropriate regulatory sequences (or "control elements"). The boundaries of the coding sequence are typically determined by a start codon at the 5' (amino) terminus and a translation stop codon at the 3' (carboxy) terminus. A coding sequence can include, but is not limited to, cDNA from viral, procaryotic or eucaryotic mRNA, genomic DNA sequences from viral or procaryotic DNA, and even synthetic DNA sequences. A transcription termination sequence may be located 3' to the coding sequence. Other "control elements" may also be associated with a coding sequence. A DNA sequence encoding a polypeptide can be optimized for expression in a selected cell by using the codons preferred by the selected cell to represent the DNA copy of the desired polypeptide coding sequence.
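The boundaries described above, a start codon at the 5' end and an in-frame translation stop codon at the 3' end, can be illustrated with a minimal Python sketch. It assumes the standard genetic code (start codon ATG; stop codons TAA, TAG, and TGA), scans only the given strand, and ignores introns and regulatory context. The function name find_coding_sequence is hypothetical.

```python
# Stop codons of the standard genetic code, written as DNA.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_sequence(dna):
    """Return (start, end) of the first ATG-initiated reading frame that
    ends with an in-frame stop codon, or None if there is none.

    `end` is exclusive, so dna[start:end] includes the stop codon.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    while start != -1:
        # Walk codon by codon in the frame set by this ATG.
        for i in range(start + 3, len(dna) - 2, 3):
            if dna[i:i + 3] in STOP_CODONS:
                return start, i + 3
        # No in-frame stop for this ATG; try the next one.
        start = dna.find("ATG", start + 1)
    return None
```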
[0051] "Encoded by" refers to a nucleic acid sequence which codes for a polypeptide sequence, wherein the polypeptide sequence or a portion thereof contains an amino acid sequence of at least 3 to 5 amino acids, more preferably at least 8 to 10 amino acids, and even more preferably at least 15 to 20 amino acids from a polypeptide encoded by the nucleic acid sequence. Also encompassed are polypeptide sequences which are immunologically identifiable with a polypeptide encoded by the sequence.
[0052] "Operably linked" refers to an arrangement of elements wherein the components so described are configured so as to perform their usual function. Thus, a given promoter that is operably linked to a coding sequence (e.g., a reporter expression cassette) is capable of effecting the expression of the coding sequence when the proper enzymes are present. The promoter or other control elements need not be contiguous with the coding sequence, so long as they function to direct the expression thereof. For example, intervening untranslated yet transcribed sequences can be present between the promoter sequence and the coding sequence and the promoter sequence can still be considered "operably linked" to the coding sequence.
[0053] A "vector" is capable of transferring gene sequences to target cells. Typically, "vector construct," "expression vector," and "gene transfer vector," mean any nucleic acid construct capable of directing the expression of a gene of interest and which can transfer gene sequences to target cells. Thus, the term includes cloning, and expression vehicles, as well as integrating vectors.
[0054] An "expression cassette" comprises any nucleic acid construct capable of directing the expression of a gene/coding sequence of interest. Such cassettes can be constructed into a "vector," "vector construct," "expression vector," or "gene transfer vector," in order to transfer the expression cassette into target cells. Thus, the term includes cloning and expression vehicles, as well as viral vectors.
[0055] Techniques for determining nucleic acid and amino acid "sequence identity" also are known in the art. Typically, such techniques include determining the nucleotide sequence of the mRNA for a gene and/or determining the amino acid sequence encoded thereby, and comparing these sequences to a second nucleotide or amino acid sequence. In general, "identity" refers to an exact nucleotide-to-nucleotide or amino acid-to-amino acid correspondence of two polynucleotides or polypeptide sequences, respectively. Two or more sequences (polynucleotide or amino acid) can be compared by determining their "percent identity." The percent identity of two sequences, whether nucleic acid or amino acid sequences, is the number of exact matches between two aligned sequences divided by the length of the shorter sequence and multiplied by 100. An approximate alignment for nucleic acid sequences is provided by the local homology algorithm of Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981). This algorithm can be applied to amino acid sequences by using the scoring matrix developed by Dayhoff, Atlas of Protein Sequences and Structure, M. O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical Research Foundation, Washington, D.C., USA, and normalized by Gribskov, Nucl. Acids Res. 14(6):6745-6763 (1986). An exemplary implementation of this algorithm to determine percent identity of a sequence is provided by the Genetics Computer Group (Madison, Wis.) in the "BestFit" utility application. The default parameters for this method are described in the Wisconsin Sequence Analysis Package Program Manual, Version 8 (1995) (available from Genetics Computer Group, Madison, Wis.). A preferred method of establishing percent identity in the context of the present invention is to use the MPSRCH package of programs copyrighted by the University of Edinburgh, developed by John F. Collins and Shane S. Sturrok, and distributed by IntelliGenetics, Inc. (Mountain View, Calif.). From this suite of packages the Smith-Waterman algorithm can be employed where default parameters are used for the scoring table (for example, gap open penalty of 12, gap extension penalty of one, and a gap of six). From the data generated the "Match" value reflects "sequence identity." Other suitable programs for calculating the percent identity or similarity between sequences are generally known in the art, for example, another alignment program is BLAST, used with default parameters. For example, BLASTN and BLASTP can be used using the following default parameters: genetic code=standard; filter=none; strand=both; cutoff=60; expect=10; Matrix=BLOSUM62; Descriptions=50 sequences; sort by =HIGH SCORE; Databases=non-redundant, GenBank+EMBL+DDBJ+PDB+GenBank CDS translations+Swiss protein+Spupdate+PIR. Details of these programs can be found at the following internet address: the world wide website of the National Center for Biotechnology Information.
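The percent identity calculation defined above (exact matches between two aligned sequences, divided by the length of the shorter sequence, multiplied by 100) can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned by some other tool and that gaps are written as "-"; it does not perform the Smith-Waterman alignment itself. The function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences.

    Matches are counted column by column (gap columns never match) and
    divided by the ungapped length of the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    shorter = min(
        len(aligned_a.replace(gap, "")),
        len(aligned_b.replace(gap, "")),
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter
```

Because the divisor is the shorter sequence, a short sequence that matches perfectly within a longer one scores 100% under this definition.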
[0056] Alternatively, homology can be determined by hybridization of polynucleotides under conditions that form stable duplexes between homologous regions, followed by digestion with single-stranded-specific nuclease(s), and size determination of the digested fragments. Two DNA, or two polypeptide sequences are "substantially homologous" to each other when the sequences exhibit at least about 80%-85%, preferably at least about 85%-90%, more preferably at least about 90%-95%, and most preferably at least about 95%-98% sequence identity over a defined length of the molecules, as determined using the methods above. As used herein, substantially homologous also refers to sequences showing complete identity to the specified DNA or polypeptide sequence. DNA sequences that are substantially homologous can be identified in a Southern hybridization experiment under, for example, stringent conditions, as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., Sambrook et al., supra; DNA Cloning, supra; Nucleic Acid Hybridization, supra.
[0057] Two nucleic acid fragments are considered to "selectively hybridize" as described herein. The degree of sequence identity between two nucleic acid molecules affects the efficiency and strength of hybridization events between such molecules. A partially identical nucleic acid sequence will at least partially inhibit a completely identical sequence from hybridizing to a target molecule. Inhibition of hybridization of the completely identical sequence can be assessed using hybridization assays that are well known in the art (e.g., Southern blot, Northern blot, solution hybridization, or the like, see Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, (1989) Cold Spring Harbor, N.Y.). Such assays can be conducted using varying degrees of selectivity, for example, using conditions varying from low to high stringency. If conditions of low stringency are employed, the absence of non-specific binding can be assessed using a secondary probe that lacks even a partial degree of sequence identity (for example, a probe having less than about 30% sequence identity with the target molecule), such that, in the absence of non-specific binding events, the secondary probe will not hybridize to the target.
[0058] When utilizing a hybridization-based detection system, a nucleic acid probe is chosen that is complementary to a target nucleic acid sequence, and then by selection of appropriate conditions the probe and the target sequence "selectively hybridize," or bind, to each other to form a hybrid molecule. A nucleic acid molecule that is capable of hybridizing selectively to a target sequence under "moderately stringent" conditions typically hybridizes under conditions that allow detection of a target nucleic acid sequence of at least about 10-14 nucleotides in length having at least approximately 70% sequence identity with the sequence of the selected nucleic acid probe. Stringent hybridization conditions typically allow detection of target nucleic acid sequences of at least about 10-14 nucleotides in length having a sequence identity of greater than about 90-95% with the sequence of the selected nucleic acid probe. Hybridization conditions useful for probe/target hybridization where the probe and target have a specific degree of sequence identity, can be determined as is known in the art (see, for example, Nucleic Acid Hybridization: A Practical Approach, editors B. D. Hames and S. J. Higgins, (1985) Oxford; Washington, D.C.; IRL Press).
[0059] With respect to stringency conditions for hybridization, it is well known in the art that numerous equivalent conditions can be employed to establish a particular stringency by varying, for example, the following factors: the length and nature of probe and target sequences, base composition of the various sequences, concentrations of salts and other hybridization solution components, the presence or absence of blocking agents in the hybridization solutions (e.g., formamide, dextran sulfate, and polyethylene glycol), hybridization reaction temperature and time parameters, as well as varying wash conditions. A particular set of hybridization conditions is selected following standard methods in the art (see, for example, Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, (1989) Cold Spring Harbor, N.Y.).
[0060] A first polynucleotide is "derived from" a second polynucleotide if it has the same or substantially the same basepair sequence as a region of the second polynucleotide, its cDNA, complements thereof, or if it displays sequence identity as described above.
[0061] A first polypeptide is "derived from" a second polypeptide if it is (i) encoded by a first polynucleotide derived from a second polynucleotide, or (ii) displays sequence identity to the second polypeptide as described above.
[0062] In the present invention, when a recombinase is "derived from a phage" the recombinase need not be explicitly produced by the phage itself, the phage is simply considered to be the original source of the recombinase and coding sequences thereof. Recombinases can, for example, be produced recombinantly or synthetically, by methods known in the art, or alternatively, recombinases may be purified from phage infected bacterial cultures.
[0063] "Substantially purified" generally refers to isolation of a substance (compound, polynucleotide, protein, polypeptide, polypeptide composition) such that the substance comprises the majority percent of the sample in which it resides. Typically in a sample a substantially purified component comprises 50%, preferably 80%-85%, more preferably 90-95% of the sample. Techniques for purifying polynucleotides and polypeptides of interest are well-known in the art and include, for example, ion-exchange chromatography, affinity chromatography and sedimentation according to density.
DETAILED DESCRIPTION OF THE INVENTION
[0064] Before the present invention is described, it is to be understood that this invention is not limited to particular embodiments described, as such may, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims.
[0065] Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit unless the context clearly dictates otherwise, between the upper and lower limits of that range is also specifically disclosed. Each smaller range between any stated value or intervening value in a stated range and any other stated or intervening value in that stated range is encompassed within the invention. The upper and lower limits of these smaller ranges may independently be included or excluded in the range, and each range where either, neither or both limits are included in the smaller ranges is also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention.
[0066] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, some potential and preferred methods and materials are now described. All publications mentioned herein are incorporated herein by reference to disclose and describe the methods and/or materials in connection with which the publications are cited. It is understood that the present disclosure supersedes any disclosure of an incorporated publication to the extent there is a contradiction.
[0067] It must be noted that as used herein and in the appended claims, the singular forms "a", "an", and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a cell" includes a plurality of such cells and reference to "the compound" includes reference to one or more compounds and equivalents thereof known to those skilled in the art, and so forth.
[0068] The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.
[0069] It is further noted that the claims may be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as "solely", "only" and the like in connection with the recitation of claim elements, or the use of a "negative" limitation.
Altered φC31 Integrases
[0070] As summarized above, the present application provides altered φC31 integrase proteins, as well as nucleic acids encoding the same. Altered φC31 integrases refer to φC31 integrase enzymes in which the native, wild-type integrase gene found in φC31 has been mutated in one or more positions. An altered φC31 integrase possesses a DNA binding specificity and/or level of activity that differs from that of the wild-type φC31 integrase enzyme. Such altered binding specificity permits the altered φC31 integrase to react with a given DNA sequence differently than would the native wild-type φC31 integrase enzyme, while an altered level of activity permits the altered φC31 integrase to carry out the reaction at greater or lesser efficiency. A recombinase reaction typically includes binding to the recognition sequence and performing concerted cutting and ligation, resulting in strand exchanges between two recombining recognition sites.
[0071] Altered φC31 integrases of the present invention include the P1 (SEQ ID NO:04), P2 (SEQ ID NO:06), and P3 (SEQ ID NO:08) altered φC31 integrases. The amino acid sequences of the P1, P2, and P3 φC31 integrases are described in Figs. 11 (SEQ ID NO:04), 13 (SEQ ID NO:06), and 15 (SEQ ID NO:08), and the nucleic acid sequences encoding the polypeptides are described in Figs. 10 (SEQ ID NO:03), 12 (SEQ ID NO:05), and 14 (SEQ ID NO:07).
[0072] In addition to the above described specific amino acid sequences and nucleic acid compositions, also of interest are homologues of the above sequences. With respect to homologues of the subject amino acid sequences and nucleic acids, the source of homologous sequence may be any species or the sequence may be wholly or partially synthetic. In certain embodiments, sequence similarity between homologues is at least about 20%, sometimes at least about 25%, and may be 30%, 35%, 40%, 50%, 60%, 70% or higher, including 75%, 80%, 85%, 90% and 95% or higher. Sequence similarity is calculated based on a reference sequence, which may be a subset of a larger sequence, such as a conserved motif, coding region, flanking region, etc. For example, a reference sequence will usually be at least about 18 nt long, more usually at least about 30 nt long, and may extend to the complete sequence that is being compared. Algorithms for sequence analysis are known in the art, such as BLAST, described in Altschul et al. (1990), J. Mol. Biol. 215:403-10 (using default settings, i.e. parameters w=4 and T=17). The sequences provided herein are essential for recognizing related and homologous nucleic acids in database searches.
[0073] Of particular interest in certain embodiments are nucleic acids of substantially the same length as the nucleic acid identified as SEQ ID NOS: 03, 05, or 07, where by substantially the same length is meant that any difference in length does not exceed about 20 number %, usually does not exceed about 10 number % and more usually does not exceed about 5 number %; and have sequence identity to any of these sequences of at least about 90%, usually at least about 95% and more usually at least about 99% over the entire length of the nucleic acid. In many embodiments, the nucleic acids have a sequence that is substantially similar (i.e. the same as) or identical to the sequences of SEQ ID NOS: 03, 05, or 07. By substantially similar is meant that sequence identity will generally be at least about 60%, usually at least about 75% and often at least about 80, 85, 90, or even 95%.
[0074] Of particular interest in certain embodiments are amino acids of substantially the same length as the amino acid identified as SEQ ID NOS: 04, 06, or 08, where by substantially the same length is meant that any difference in length does not exceed about 20 number %, usually does not exceed about 10 number % and more usually does not exceed about 5 number %; and have sequence identity to any of these sequences of at least about 90%, usually at least about 95% and more usually at least about 99% over the entire length of the amino acid. In many embodiments, the amino acids have a sequence that is substantially similar (i.e. the same as) or identical to the sequences of SEQ ID NOS: 04, 06, or 08. By substantially similar is meant that sequence identity will generally be at least about 60%, usually at least about 75% and often at least about 80, 85, 90, or even 95%.
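The length and identity criteria of paragraphs [0073] and [0074] can be illustrated with a minimal Python sketch. It assumes the two sequences are already aligned end to end without gaps, which is a simplification; in practice an alignment tool such as BLAST would supply the aligned positions. The function names and the toy sequences are hypothetical and are not taken from the specification.

```python
def length_difference_pct(candidate: str, reference: str) -> float:
    """Length difference expressed as a percentage of the reference length."""
    return abs(len(candidate) - len(reference)) * 100.0 / len(reference)


def ungapped_identity_pct(candidate: str, reference: str) -> float:
    """Percent of matching positions, counted over the longer sequence.

    Assumption: both sequences start at the same position and need no gaps.
    """
    matches = sum(a == b for a, b in zip(candidate.upper(), reference.upper()))
    return matches * 100.0 / max(len(candidate), len(reference))


def is_substantially_same(candidate: str, reference: str,
                          max_len_diff_pct: float = 20.0,
                          min_identity_pct: float = 90.0) -> bool:
    """Apply the broadest thresholds named in the text (20% length, 90% identity)."""
    return (length_difference_pct(candidate, reference) <= max_len_diff_pct
            and ungapped_identity_pct(candidate, reference) >= min_identity_pct)


# Toy 20-nt sequences (hypothetical, for illustration only).
reference = "ATGGCCAAGCTGACCGGTAA"
one_mismatch = "ATGGCCAAGCTGACCGGTAT"
print(is_substantially_same(one_mismatch, reference))
```

Tightening `max_len_diff_pct` to 10 or 5 and `min_identity_pct` to 95 or 99 corresponds to the narrower ranges recited in the same paragraphs.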
[0075] Also provided are nucleic acids that encode the proteins encoded by the above described nucleic acids, but differ in sequence from the above described nucleic acids due to the degeneracy of the genetic code.
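As an illustration of degeneracy, the following Python sketch translates two different nucleotide sequences with the standard genetic code and shows that they encode the same peptide. The two short sequences are hypothetical examples and are not sequences from this specification.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical sequences that differ at three codons.
seq_a = "ATGGCTAAACTG"
seq_b = "ATGGCCAAGTTA"
print(translate(seq_a), translate(seq_b), seq_a != seq_b)
```

Both sequences translate to the same four-residue peptide even though their nucleotide sequences differ, which is the situation the paragraph describes.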
[0076] Also provided are nucleic acids that hybridize to the above described nucleic acid under stringent conditions. An example of stringent hybridization conditions is hybridization at 50°C or higher and 0.1 x SSC (15 mM sodium chloride/1.5 mM sodium citrate). Another example of stringent hybridization conditions is overnight incubation at 42°C in a solution: 50% formamide, 5 x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5 x Denhardt's solution, 10% dextran sulfate, and 20 μg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1 x SSC at about 65°C. Stringent hybridization conditions are hybridization conditions that are at least as stringent as the above representative conditions, where conditions are considered to be at least as stringent if they are at least about 80% as stringent, typically at least about 90% as stringent as the above specific stringent conditions. Other stringent hybridization conditions are known in the art and may also be employed to identify nucleic acids of this particular embodiment of the invention.
[0077] Also provided are nucleic acids that encode fusion proteins of the subject proteins, or fragments thereof, which are fused to a second protein, e.g., a degradation sequence, a signal peptide, etc. Fusion proteins may comprise a subject polypeptide, or fragment thereof, and a non-φC31 integrase polypeptide ("the fusion partner") fused in-frame at the N-terminus and/or C-terminus of the subject polypeptide. Fusion partners include, but are not limited to, polypeptides that can bind antibody specific to the fusion partner (e.g., epitope tags); antibodies or binding fragments thereof; polypeptides that provide a catalytic function or induce a cellular response; ligands or receptors or mimetics thereof; and the like. In such fusion proteins, the fusion partner is generally not naturally associated with the subject altered φC31 integrase portion of the fusion protein, and is typically not a φC31 protein or derivative/fragment thereof, i.e., it is not found in φC31 bacteriophage.
[0078] Also provided are constructs comprising the subject nucleic acids inserted into a vector, where such constructs may be used for a number of different applications, including propagation, protein production, etc. Viral and non-viral vectors may be prepared and used, including plasmids. The choice of vector will depend on the type of cell in which propagation is desired and the purpose of propagation. Certain vectors are useful for amplifying and making large amounts of the desired DNA sequence. Other vectors are suitable for expression in cells in culture. Still other vectors are suitable for transfer and expression in cells in a whole animal or person. The choice of appropriate vector is well within the skill of the art. Many such vectors are available commercially. To prepare the constructs, the partial or full-length polynucleotide is inserted into a vector typically by means of DNA ligase attachment to a cleaved restriction enzyme site in the vector. Alternatively, the desired nucleotide sequence can be inserted by homologous recombination in vivo. Typically this is accomplished by attaching regions of homology to the vector on the flanks of the desired nucleotide sequence. Regions of homology are added by ligation of oligonucleotides, or by polymerase chain reaction using primers comprising both the region of homology and a portion of the desired nucleotide sequence, for example.
[0079] Also provided are expression cassettes or systems that find use in, among other applications, the synthesis of the subject proteins. For expression, the gene product encoded by a polynucleotide of the invention is expressed in any convenient expression system, including, for example, bacterial, yeast, insect, amphibian and mammalian systems. Suitable vectors and host cells are described in U.S. Patent No. 5,654,173. In the expression vector, a subject polynucleotide, e.g., as set forth in SEQ ID NOS: 03, 05, or 07, is linked to a regulatory sequence as appropriate to obtain the desired expression properties. These regulatory sequences can include promoters (attached either at the 5' end of the sense strand or at the 3' end of the antisense strand), enhancers, terminators, operators, repressors, and inducers. The promoters can be regulated or constitutive. In some situations it may be desirable to use conditionally active promoters, such as tissue-specific or developmental stage-specific promoters. These are linked to the desired nucleotide sequence using the techniques described above for linkage to vectors. Any techniques known in the art can be used. In other words, the expression vector will provide a transcriptional and translational initiation region, which may be inducible or constitutive, where the coding region is operably linked under the transcriptional control of the transcriptional initiation region, and a transcriptional and translational termination region. These control regions may be native to the subject species from which the subject nucleic acid is obtained, or may be derived from exogenous sources.
[0080] Expression vectors generally have convenient restriction sites located near the promoter sequence to provide for the insertion of nucleic acid sequences encoding heterologous proteins. A selectable marker operative in the expression host may be present. Expression vectors may be used for, among other things, the production of fusion proteins, as described above.
[0081] Expression cassettes may be prepared comprising a transcription initiation region, the gene or fragment thereof, and a transcriptional termination region. Of particular interest is the use of sequences that allow for the expression of functional epitopes or domains, usually at least about 8 amino acids in length, more usually at least about 15 amino acids in length, to about 25 amino acids, and up to the complete open reading frame of the gene. After introduction of the DNA, the cells containing the construct may be selected by means of a selectable marker, the cells expanded and then used for expression.
[0082] The above described expression systems may be employed with prokaryotes or eukaryotes in accordance with conventional ways, depending upon the purpose for expression. For large scale production of the protein, a unicellular organism, such as E. coli, B. subtilis, S. cerevisiae, insect cells in combination with baculovirus vectors, or cells of a higher organism such as vertebrates, e.g. COS 7 cells, HEK 293, CHO, Xenopus Oocytes, etc., may be used as the expression host cells. In some situations, it is desirable to express the gene in eukaryotic cells, where the expressed protein will benefit from native folding and post-translational modifications. Small peptides can also be synthesized in the laboratory. Polypeptides that are subsets of the complete protein sequence may be used to identify and investigate parts of the protein important for function.
[0083] Also of interest are humanized versions of the subject nucleic acids. As used herein, the term "humanized" refers to changes made to a nucleic acid sequence to optimize the codons for expression of the protein in human cells (Yang et al., Nucleic Acids Research 24 (1996), 4592-4593). See also U.S. Patent No. 5,795,737, which describes humanization of proteins, the disclosure of which is herein incorporated by reference.
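A simple way to picture codon humanization is to replace every codon with a synonymous codon that is favored in human cells, leaving the encoded protein unchanged. The Python sketch below does this with one preferred codon per amino acid. The `PREFERRED_CODON` table is an illustrative assumption for this sketch, not data from the specification or the cited references; a real design would use a measured human codon-usage table and would also weigh factors such as GC content and repeats. The input sequence is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Illustrative "preferred" codon per residue (assumed values for this sketch).
PREFERRED_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def humanize(coding_sequence: str) -> str:
    """Swap each codon for the preferred synonymous codon; the protein is unchanged."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(PREFERRED_CODON[CODON_TO_AA[seq[i:i + 3]]]
                   for i in range(0, len(seq), 3))


# Hypothetical five-codon open reading frame.
print(humanize("ATGGCTTTAAAATAA"))
```

Because every replacement codon is synonymous, translating the input and the output gives the same amino acid sequence.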
Recombination Sites
[0084] The inventors have discovered native recombination sites existing in the genomes of a variety of organisms, where the native recombination site does not necessarily have a nucleotide sequence identical to the wild-type recombination sequences for a φC31 integrase; but such native recombination sites are nonetheless sufficient to promote recombination mediated by φC31 integrase. Such native recombination site sequences existing in the genomes are referred to herein as "pseudo-recombination sequences."
[0085] Identification of pseudo-recombination sequences can be accomplished, for example, by using sequence alignment and analysis, where the query sequence is the recombination site of interest (for example, attP and/or attB).
[0086] The genome of a target cell may be searched for sequences having sequence identity to the selected recombination site for a given recombinase, for example, the attP and/or attB of φC31. Nucleic acid sequence databases, for example, may be searched by computer. The findpatterns algorithm of the Wisconsin Software Package Version 9.0 developed by the Genetics Computer Group (GCG; Madison, Wis.) is an example of a program used to screen all sequences in the GenBank database (Benson et al., 1998, Nucleic Acids Res. 26, 1-7). In this aspect, when selecting pseudo-recombination sites in a target cell, the genomic sequences of the target cell can be searched for suitable pseudo-recombination sites using either the attP or attB sequences associated with a φC31 integrase or an altered φC31 integrase. Functional sizes and the amount of heterogeneity that can be tolerated in these recombination sequences can be empirically evaluated, for example, by evaluating integration efficiency of a targeting construct using an altered φC31 integrase of the present invention (for exemplary methods of evaluating integration events, see WO 00/11155, published 2 Mar. 2000).
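The kind of computer search described above can be sketched as a sliding-window scan that reports genomic windows with at least a chosen fraction of identity to a query site on either strand. This is a minimal illustration and not the findpatterns algorithm; the function names, the identity threshold, and the short query and genome strings are hypothetical placeholders, not actual attP or attB sequences.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan_for_pseudo_sites(genome: str, query: str, min_identity: float = 0.5):
    """Return (start, strand, identity) for every window meeting the threshold.

    Identity is the fraction of matching bases in an ungapped comparison.
    Hits are sorted from highest to lowest identity.
    """
    genome = genome.upper()
    hits = []
    for strand, pattern in (("+", query.upper()), ("-", reverse_complement(query))):
        for start in range(len(genome) - len(pattern) + 1):
            window = genome[start:start + len(pattern)]
            matches = sum(a == b for a, b in zip(window, pattern))
            identity = matches / len(pattern)
            if identity >= min_identity:
                hits.append((start, strand, identity))
    return sorted(hits, key=lambda hit: -hit[2])


# Hypothetical 8-nt query and toy genome containing a one-mismatch copy.
toy_query = "GGTACCTT"
toy_genome = "TTTTGGTACGTTAAAA"
for hit in scan_for_pseudo_sites(toy_genome, toy_query, min_identity=0.8):
    print(hit)
```

Candidate sites found this way would still need the empirical evaluation of integration efficiency that the paragraph describes, since sequence identity alone does not establish function.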
[0087] Functional pseudo-sites can also be found empirically. For example, experiments performed in support of the present invention have shown that after co-transfection into human cells of a plasmid carrying φC31 attB and the neomycin resistance gene, along with a plasmid expressing the φC31 integrase or an altered φC31 integrase, an elevated number of neomycin resistant colonies are obtained, compared to co-transfections in which either attB or the integrase gene was omitted. Most of these colonies reflected integration into native pseudo attP sites. Such sites are recovered, for example, by plasmid rescue and analyzed at the DNA sequence level, producing, for example, the DNA sequence of a pseudo attP site from the human genome, such as ψA (FIG. 17). This empirical method for identification of pseudo-sites can be used even if detailed knowledge of the recombinase recognition sites and the nature of recombinase binding to them is lacking. Exemplary pseudo attP sites present in the human genome are provided in FIG. 18, as described in Chalberg et al., JMB 357:28-48 (2006). Exemplary pseudo attP sites present in the genome of human embryonic stem cells are also described in Thyagarajan et al., Stem Cells, published online October 26, 2007, available at www.StemCells.com.
[0088] When a pseudo-recombination site is identified (using either attP or attB search sequences) in a target genome (such as human or mouse), that pseudo-recombination site can be used in the methods of the present invention of using an altered φC31 integrase to integrate a nucleic acid of interest into a target cell genome.
[0089] Then attP or attB sites corresponding to the pseudo-recombination sites can be used in the targeting construct to be employed with an altered φC31 integrase. For example, if an attP site for an altered φC31 integrase is used to identify a pseudo-recombination site in the target cell genome, then the wild-type attB sequence can be used in the targeting construct. In an alternative example, if attB for an altered φC31 integrase is used to identify a pseudo-recombination site in the target cell genome, then the wild-type attP sequence can be used in the targeting construct.
[0090] The targeting constructs contemplated by the invention may contain additional nucleic acid fragments such as control sequences, marker sequences, selection sequences and the like as discussed below.
Targeting Constructs and Methods of the Present Invention
[0091] The present invention also provides means for targeted insertion of a polynucleotide (or nucleic acid sequence(s)) of interest into a genome by, for example, (i) providing an altered φC31 integrase capable of facilitating recombination between a first recombination site and a second recombination site, (ii) providing a targeting construct having a first recombination sequence and a polynucleotide of interest, (iii) introducing the altered φC31 integrase and the targeting construct into a cell which contains in its nucleic acid the second recombination site, wherein said introducing is done under conditions that allow the altered φC31 integrase to facilitate a recombination event between the first and second recombination sites.
[0092] Historically, the attachment site in a bacterial genome is designated "attB" and in a corresponding bacteriophage the site is designated "attP". In one aspect of the present invention, at least one pseudo-recombination site for an altered φC31 integrase is identified in a target cell of interest. These sites can be identified by several methods including searching all known sequences derived from the cell of interest against a wild-type recombination site (e.g., attB or attP) for an altered φC31 integrase (e.g., as described above). The functionality of pseudo-recombination sites identified in this way can then be empirically evaluated following the teachings of the present specification to determine their ability to participate in a recombinase-mediated recombination event.
[0093] A targeting construct, to direct integration to a pseudo-recombination site, would then comprise a recombination site wherein the altered φC31 integrase can facilitate a recombination event between the recombination site in the genome of the target cell and a recombination site in the targeting construct. A targeting vector may further comprise a polynucleotide of interest. Polynucleotides of interest can include, but are not limited to, expression cassettes encoding polypeptide products. The targeting constructs are typically circular and may also contain selectable markers, an origin of replication, and other elements.
[0094] A variety of expression vectors are suitable for use in the practice of the present invention, both for prokaryotic expression and eukaryotic expression. In general, the targeting construct will have one or more of the following features: a promoter, promoter-enhancer sequences, a selection marker sequence, an origin of replication, an inducible element sequence, an epitope-tag sequence, and the like.
[0095] Promoter and promoter-enhancer sequences are DNA sequences to which RNA polymerase binds and initiates transcription. The promoter determines the polarity of the transcript by specifying which strand will be transcribed. Bacterial promoters consist of consensus sequences, -35 and -10 nucleotides relative to the transcriptional start, which are bound by a specific sigma factor and RNA polymerase. Eukaryotic promoters are more complex. Most promoters utilized in expression vectors are transcribed by RNA polymerase II. General transcription factors (GTFs) first bind specific sequences near the start and then recruit the binding of RNA polymerase II. In addition to these minimal promoter elements, small sequence elements are recognized specifically by modular DNA-binding/trans-activating proteins (e.g. AP-1, SP-1) that regulate the activity of a given promoter. Viral promoters serve the same function as bacterial or eukaryotic promoters and either provide a specific RNA polymerase in trans (bacteriophage T7) or recruit cellular factors and RNA polymerase (SV40, RSV, CMV). Viral promoters may be preferred as they are generally particularly strong promoters.
[0096] Promoters may be, furthermore, either constitutive or regulatable. Inducible elements are DNA sequence elements which act in conjunction with promoters and may bind either repressors (e.g. lacO/LAC Iq repressor system in E. coli) or inducers (e.g. gal1/GAL4 inducer system in yeast). In such cases, transcription is virtually "shut off" until the promoter is derepressed or induced, at which point transcription is "turned on."
[0097] Examples of constitutive promoters include the int promoter of bacteriophage λ, the bla promoter of the β-lactamase gene sequence of pBR322, the CAT promoter of the chloramphenicol acetyl transferase gene sequence of pPR325, and the like. Examples of inducible prokaryotic promoters include the major right and left promoters of bacteriophage λ (PL and PR), the trp, recA, lacZ, AraC and gal promoters of E. coli, the α-amylase (Ulmanen et al., J. Bacteriol. 162:176-182, 1985) and the sigma-28-specific promoters of B. subtilis (Gilman et al., Gene 32:11-20 (1984)), the promoters of the bacteriophages of Bacillus (Gryczan, In: The Molecular Biology of the Bacilli, Academic Press, Inc., NY (1982)), Streptomyces promoters (Ward et al., Mol. Gen. Genet. 203:468-478, 1986), and the like. Exemplary prokaryotic promoters are reviewed by Glick (J. Ind. Microbiol. 1:277-282, 1987); Cenatiempo (Biochimie 68:505-516, 1986); and Gottesman (Ann. Rev. Genet. 18:415-442, 1984).
[0098] Preferred eukaryotic promoters include, but are not limited to, the following: the promoter of the mouse metallothionein I gene sequence (Hamer et al., J. Mol. Appl. Gen. 1:273-288, 1982); the TK promoter of Herpes virus (McKnight, Cell 31:355-365, 1982); the SV40 early promoter (Benoist et al., Nature (London) 290:304-310, 1981); the yeast gal1 gene sequence promoter (Johnston et al., Proc. Natl. Acad. Sci. (USA) 79:6971-6975, 1982; Silver et al., Proc. Natl. Acad. Sci. (USA) 81:5951-5955, 1984), the CMV promoter, the EF-1 promoter, Ecdysone-responsive promoter(s), tetracycline-responsive promoter, and the like.
[0099] Exemplary promoters for use in the present invention are selected such that they are functional in the cell type (and/or animal or plant) into which they are being introduced.
[00100] Selection markers are valuable elements in expression vectors as they provide a means to select for growth of only those cells that contain a vector. Such markers are typically of two types: drug resistance and auxotrophic. A drug resistance marker enables cells to detoxify an exogenously added drug that would otherwise kill the cell. Auxotrophic markers allow cells to synthesize an essential component (usually an amino acid) while grown in media that lacks that essential component.
[00101] Common selectable marker genes include those for resistance to antibiotics such as ampicillin, tetracycline, kanamycin, bleomycin, streptomycin, hygromycin, neomycin, Zeocin™, G418, and the like. Selectable auxotrophic genes include, for example, hisD, that allows growth in histidine free media in the presence of histidinol.
[00102] A further element useful in an expression vector is an origin of replication. Replication origins are unique DNA segments that contain multiple short repeated sequences that are recognized by multimeric origin-binding proteins and that play a key role in assembling DNA replication enzymes at the origin site. Suitable origins of replication for use in expression vectors employed herein include E. coli oriC, ColE1 plasmid origin, 2μ and ARS (both useful in yeast systems), SV40, and EBV oriP (useful in mammalian systems), and the like.
[00103] Epitope tags are short peptide sequences that are recognized by epitope specific antibodies. A fusion protein comprising a recombinant protein and an epitope tag can be simply and easily purified using an antibody bound to a chromatography resin. The presence of the epitope tag furthermore allows the recombinant protein to be detected in subsequent assays, such as Western blots, without having to produce an antibody specific for the recombinant protein itself. Examples of commonly used epitope tags include V5, glutathione-S-transferase (GST), hemagglutinin (HA), the peptide Phe-His-His-Thr-Thr, chitin binding domain, and the like.
[00104] A further useful element in an expression vector is a multiple cloning site or polylinker. Synthetic DNA encoding a series of restriction endonuclease recognition sites is inserted into a plasmid vector, for example, downstream of the promoter element. These sites are engineered for convenient cloning of DNA into the vector at a specific position.
[00105] The foregoing elements can be combined to produce expression vectors suitable for use in the methods of the invention. Those of skill in the art would be able to select and combine the elements suitable for use in their particular system in view of the teachings of the present specification. Suitable prokaryotic vectors include plasmids such as those capable of replication in E. coli (for example, pBR322, ColE1, pSC101, pACYC184, pVX, pRSET, pBAD (Invitrogen, Carlsbad, Calif.) and the like). Such plasmids are disclosed by Sambrook (cf. "Molecular Cloning: A Laboratory Manual," second edition, edited by Sambrook, Fritsch, & Maniatis, Cold Spring Harbor Laboratory, (1989)) and many such vectors are commercially available. Bacillus plasmids include pC194, pC221, pT127, and the like, and are disclosed by Gryczan (In: The Molecular Biology of the Bacilli, Academic Press, NY (1982), pp. 307-329). Suitable Streptomyces plasmids include pIJ101 (Kendall et al., J. Bacteriol. 169:4177-4183, 1987), and Streptomyces bacteriophages such as φC31 (Chater et al., In: Sixth International Symposium on Actinomycetales Biology, Akademiai Kiado, Budapest, Hungary (1986), pp. 45-54). Pseudomonas plasmids are reviewed by John et al. (Rev. Infect. Dis. 8:693-704, 1986), and Izaki (Jpn. J. Bacteriol. 33:729-742, 1978).
[00106] Suitable eukaryotic plasmids include, for example, BPV, EBV, vaccinia, SV40, 2-micron circle, pcDNA3.1, pcDNA3.1/GS, pYES2/GS, pMT, pIND, pIND(Sp1), pVgRXR (Invitrogen), and the like, or their derivatives. Such plasmids are well known in the art (Botstein et al., Miami Wntr. Symp. 19:265-274, 1982; Broach, In: "The Molecular Biology of the Yeast Saccharomyces: Life Cycle and Inheritance", Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., p. 445-470, 1981; Broach, Cell 28:203-204, 1982; Dilon et al., J. Clin. Hematol. Oncol. 10:39-48, 1980; Maniatis, In: Cell Biology: A Comprehensive Treatise, Vol. 3, Gene Sequence Expression, Academic Press, NY, pp. 563-608, 1980).
[00107] The targeting cassettes described herein can be constructed utilizing methodologies known in the art of molecular biology (see, for example, Ausubel or Maniatis) in view of the teachings of the specification. As described above, the targeting constructs are assembled by inserting, into a suitable vector backbone, a recombination site, polynucleotides encoding sequences of interest operably linked to a promoter of interest; and, optionally a sequence encoding a positive selection marker.
[00108] A preferred method of obtaining polynucleotides, including suitable regulatory sequences (e.g., promoters) is PCR. General procedures for PCR are taught in MacPherson et al., PCR: A PRACTICAL APPROACH, (IRL Press at Oxford University Press, (1991)). PCR conditions for each application reaction may be empirically determined. A number of parameters influence the success of a reaction. Among these parameters are annealing temperature and time, extension time, Mg2+ and ATP concentration, pH, and the relative concentration of primers, templates and deoxyribonucleotides. After amplification, the resulting fragments can be detected by agarose gel electrophoresis followed by visualization with ethidium bromide staining and ultraviolet illumination.
[00109] The expression cassettes, targeting constructs, vectors, altered recombinases and altered recombinase-coding sequences of the present invention can be formulated into kits. Components of such kits can include, but are not limited to, containers, instructions, solutions, buffers, disposables, and hardware.
Introducing Recombinases Into Cells
[00110] In the methods of the invention an altered φC31 integrase is introduced into a cell whose genome is to be modified. Methods of introducing functional proteins into cells are well known in the art. Introduction of purified altered φC31 integrase protein ensures a transient presence of the protein and its function, which is often a preferred embodiment.
[00111] Alternatively, a gene encoding the altered φC31 integrase can be included in an expression vector used to transform the cell. It is generally preferred that the altered φC31 integrase be present for only such time as is necessary for insertion of the nucleic acid fragments into the genome being modified. Thus, the lack of permanence associated with most expression vectors is not expected to be detrimental.
[00112] The altered φC31 integrase used in the practice of the present invention can be introduced into a target cell before, concurrently with, or after the introduction of a targeting vector. The altered φC31 integrase can be directly introduced into a cell as a protein, for example, using liposomes, coated particles, or microinjection. Alternately, a polynucleotide encoding the altered φC31 integrase can be introduced into the cell using a suitable expression vector. The targeting vector components described above are useful in the construction of expression cassettes containing sequences encoding an altered φC31 integrase of interest. Expression of the altered φC31 integrase is typically desired to be transient. Accordingly, vectors providing transient expression of the altered φC31 integrase are preferred in the practice of the present invention. However, expression of the altered φC31 integrase may be regulated in other ways, for example, by placing the expression of the altered φC31 integrase under the control of a regulatable promoter (i.e., a promoter whose expression can be selectively induced or repressed).
[00113] Sequences encoding altered φC31 integrase useful in the practice of the present invention are disclosed herein and may be obtained following the teachings of the present specification.
[00114] Altered φC31 integrase for use in the practice of the present invention can be produced recombinantly or purified as previously described. Altered φC31 integrase polypeptides having the desired recombinase activity can be purified to a desired degree of purity by methods known in the art of protein purification, including, but not limited to, ammonium sulfate precipitation, size fractionation, affinity chromatography, HPLC, ion exchange chromatography, heparin agarose affinity chromatography (e.g., Thorpe & Smith, Proc. Nat. Acad. Sci. 95:5505-5510, 1998). A variety of nucleic acids of interest may be introduced into the genome of a cell using the methods of the invention including protein encoding nucleic acids, including, for example, enzymes that can be used for the production of nutrients and for performing enzymatic reactions in chemistry, or polypeptides which are useful and valuable as nutrients or for the treatment of human or animal diseases or for the prevention thereof, for example hormones, polypeptides with immunomodulatory activity, anti-viral and/or anti-tumor properties (e.g., maspin), antibodies, viral antigens, vaccines, clotting factors, enzyme inhibitors, foodstuffs, and the like. Other useful polypeptides encoded by the nucleic acid of interest that may be introduced by the methods of the invention are, for example, those coding for hormones such as secretin, thymosin, relaxin, luteinizing hormone, parathyroid hormone, adrenocorticotropin, melanocyte-stimulating hormone, β-lipotropin, urogastrone or insulin, growth factors, such as epidermal growth factor, insulin-like growth factor (IGF), e.g. IGF-I and IGF-II, mast cell growth factor, nerve growth factor, glial cell line-derived neurotrophic factor (GDNF), or transforming growth factor (TGF), such as TGF-α or TGF-β (e.g. TGF-β1, β2 or β3), growth hormone, such as human or bovine growth hormones, interleukins, such as interleukin-1 or -2, human macrophage migration inhibitory factor (MIF), interferons, such as human α-interferon, for example interferon-αA, αB, αD or αF, α-interferon, γ-interferon or a hybrid interferon, for example an αA-αD- or an αB-αD-hybrid interferon, especially the hybrid interferon BDBB, protease inhibitors such as α1-antitrypsin, SLPI, α1-antichymotrypsin, C1 inhibitor, hepatitis virus antigens, such as hepatitis B virus surface or core antigen or hepatitis A virus antigen, or hepatitis nonA-nonB (i.e., hepatitis C) virus antigen, plasminogen activators, such as tissue plasminogen activator or urokinase, tumor necrosis factors (e.g., TNF-α or TNF-β), somatostatin, renin, β-endorphin, immunoglobulins, such as the light and/or heavy chains of immunoglobulin A, D, E, G, or M or human-mouse hybrid immunoglobulins, immunoglobulin binding factors, such as immunoglobulin E binding factor, e.g. sCD23 and the like, calcitonin, human calcitonin-related peptide, blood clotting factors, such as factor IX or VIIIc, erythropoietin, eglin, such as eglin C, desulphatohirudin, such as desulphatohirudin variant HV1, HV2 or PA, human superoxide dismutase, viral thymidine kinase, β-lactamase, glucose isomerase, transport proteins such as human plasma proteins, e.g., serum albumin and transferrin, and transcription factors, including Oct-3/4, Sox2, c-myc, and the like.
Cells
[00116] Cells suitable for modification employing the methods of the invention include both prokaryotic cells and eukaryotic cells, provided that the cell's genome contains a pseudo-recombination sequence recognizable by an altered φC31 integrase of the present invention. Prokaryotic cells are cells that lack a defined nucleus. Examples of suitable prokaryotic cells include bacterial cells, mycoplasmal cells and archaebacterial cells. Particularly preferred prokaryotic cells include those that are useful either in various types of test systems or those that have some industrial utility, such as Klebsiella oxytoca (ethanol production), Clostridium acetobutylicum (butanol production), and the like (see Green and Bennet, Biotech & Bioengineering 58:215-221, 1998; Ingram et al., Biotech & Bioengineering 58:204-206, 1998).
[00117] Suitable eukaryotic cells include both animal cells (such as, from insect, fish, bird, rodent (including mice and rats), cow, goat, rabbit, sheep, non-human primate, human, and the like) and plant cells (such as, from rice, corn, cotton, tobacco, tomato, potato, and the like). Cell types applicable to particular purposes are discussed in greater detail below.
[00118] Yet another embodiment of the invention comprises isolated genetically engineered cells. Suitable cells may be prokaryotic or eukaryotic, as discussed above. The genetically engineered cells of the invention may be unicellular organisms or may be derived from multicellular organisms. By "isolated" in reference to genetically engineered cells derived from multicellular organisms it is meant the cells are outside a living body, whether plant or animal, and in an artificial environment. The use of the term isolated does not imply that the genetically engineered cells are the only cells present.
[00119] In one embodiment, the genetically engineered cells of the invention contain any one of the nucleic acid constructs of the invention. In a second embodiment, an altered φC31 integrase that specifically recognizes recombination sequences is introduced into genetically engineered cells containing one of the nucleic acid constructs of the invention under conditions such that the nucleic acid sequence(s) of interest will be inserted into the genome. Thus, the genetically engineered cells possess a modified genome. Methods of introducing polypeptides and DNA sequences into such cells are well known in the art and are discussed above.
[00120] The genetically engineered cells of the invention can be employed in a variety of ways. Unicellular organisms can be modified to produce commercially valuable substances such as recombinant proteins, industrial solvents, industrially useful enzymes, and the like. Preferred unicellular organisms include fungi such as yeast (for example, S. pombe, Pichia pastoris, S. cerevisiae (such as INVSc1), and the like), Aspergillus, and the like, and bacteria such as Klebsiella, Streptomyces, and the like.
[00121] Isolated cells from multicellular organisms can be similarly useful, including insect cells, mammalian cells and plant cells. Mammalian cells that may be useful include those derived from rodents, primates and the like. They include HeLa cells, cells of fibroblast origin such as VERO, 3T3 or CHO-K1, HEK 293 cells or cells of lymphoid origin (such as 32D cells) and their derivatives, neuronal cells, hepatic cells, and the like. Exemplary mammalian host cells include nonadherent cells such as CHO, 32D, and the like.
[00122] In addition, plant cells are also available as hosts, and control sequences compatible with plant cells are available, such as the cauliflower mosaic virus 35S and 19S, nopaline synthase promoter and polyadenylation signal sequences, and the like. Appropriate transgenic plant cells can be used to produce transgenic plants.
[00123] Another preferred host is an insect cell, for example from Drosophila larvae. Using insect cells as hosts, the Drosophila alcohol dehydrogenase promoter can be used (Rubin, Science 240:1453-1459, 1988). Alternatively, baculovirus vectors can be engineered to express large amounts of peptide encoded by a desired nucleic acid sequence in insect cells (Jasny, Science 238:1653 (1987); Miller et al., In: Genetic Engineering (1986), Setlow, J. K., et al., eds., Plenum, Vol. 8, pp. 277-297).
[00124] The genetically engineered cells of the invention are additionally useful as tools to screen for substances capable of modulating the activity of a protein encoded by a nucleic acid fragment of interest. Thus, an additional embodiment of the invention comprises methods of screening comprising contacting genetically engineered cells of the invention with a test substance and monitoring the cells for a change in cell phenotype, cell proliferation, cell differentiation, enzymatic activity of the protein or the interaction between the protein and a natural binding partner of the protein when compared to test cells not contacted with the test substance.
[00125] A variety of test substances can be evaluated using the genetically engineered cells of the invention including peptides, proteins, antibodies, low molecular weight organic compounds, natural products derived from, for example, fungal or plant cells, and the like. By "low molecular weight organic compound" it is meant a chemical species with a molecular weight of generally less than 500-1000. Sources of test substances are well known to those of skill in the art.
[00126] Various assay methods employing cells are also well known by those skilled in the art. They include, for example, assays for enzymatic activity (Hirth, et al., U.S. Pat. No. 5,763,198, issued Jun. 9, 1998), assays for binding of a test substance to a protein expressed by the genetically engineered cells, assays for transcriptional activation of a reporter gene, and the like.
[00127] Cells modified by the methods of the present invention can be maintained under conditions that, for example, (i) keep them alive but do not promote growth, (ii) promote growth of the cells, and/or (iii) cause the cells to differentiate or dedifferentiate. Cell culture conditions are typically permissive for the action of the altered φC31 integrase in the cells, although regulation of the activity of the altered φC31 integrase may also be modulated by culture conditions (e.g., raising or lowering the temperature at which the cells are cultured). For a given cell, cell-type, tissue, or organism, culture conditions are known in the art.
Transgenic Plants and Non-Human Animals
[00128] In another embodiment, the present invention comprises transgenic plants and nonhuman transgenic animals whose genomes have been modified by employing the methods and compositions of the invention. Transgenic animals may be produced employing the methods of the present invention to serve as a model system for the study of various disorders and for screening of drugs that modulate such disorders.
[00129] A "transgenic" plant or animal refers to a genetically engineered plant or animal, or offspring of genetically engineered plants or animals. A transgenic plant or animal usually contains material from at least one unrelated organism, such as, from a virus. The term "animal" as used in the context of transgenic organisms means all species except human. It also includes an individual animal in all stages of development, including embryonic and fetal stages. Farm animals (e.g., chickens, pigs, goats, sheep, cows, horses, rabbits and the like), rodents (such as mice and rats), and domestic pets (e.g., cats and dogs) are included within the scope of the present invention. In a preferred embodiment, the animal is a mouse or a rat.
[00130] The term "chimeric" plant or animal is used to refer to plants or animals in which the heterologous gene is found, or in which the heterologous gene is expressed in some but not all cells of the plant or animal.
[00131] The term transgenic animal also includes a germ cell line transgenic animal. A "germ cell line transgenic animal" is a transgenic animal in which the genetic information provided by the invention method has been taken up and incorporated into a germ line cell, therefore conferring the ability to transfer the information to offspring. If such offspring, in fact, possess some or all of that information, then they, too, are transgenic animals.
[00132] Methods of generating transgenic plants and animals are known in the art and can be used in combination with the teachings of the present application.
[00133] In one embodiment, a transgenic animal of the present invention is produced by introducing into a single cell embryo a nucleic acid construct (e.g., a targeting construct), comprising a recombination site capable of recombining with a recombination site found within the genome of the organism from which the cell was derived and a nucleic acid fragment of interest, in a manner such that the nucleic acid fragment of interest is stably integrated into the DNA of germ line cells of the mature animal and is inherited in normal Mendelian fashion. In this embodiment, the nucleic acid fragment of interest can be any one of the fragments described previously. Alternatively, the nucleic acid sequence of interest can encode an exogenous product that disrupts or interferes with expression of an endogenously produced protein of interest, yielding transgenic animals with decreased expression of the protein of interest.
[00134] A variety of methods are available for the production of transgenic animals. A nucleic acid construct of the invention can be injected into the pronucleus, or cytoplasm, of a fertilized egg before fusion of the male and female pronuclei, or injected into the nucleus of an embryonic cell (e.g., the nucleus of a two-cell embryo) following the initiation of cell division (Brinster, et al., Proc. Nat. Acad. Sci. USA 82: 4438, 1985). Embryos can be infected with viruses, especially retroviruses, modified at a recombination site with a nucleic acid sequence of interest. The cell can further be treated with an altered φC31 integrase as described above to promote integration of the nucleic acid sequence of interest into the genome. In this case, introducing the altered φC31 integrase in the form of an mRNA may be particularly advantageous. There would then be no requirement for transcription of the incoming altered φC31 integrase gene and no chance that the altered φC31 integrase gene would become integrated into the genome.
[00135] By way of example only, to prepare a transgenic mouse, female mice are induced to superovulate. After being allowed to mate, the females are sacrificed by CO2 asphyxiation or cervical dislocation and embryos are recovered from excised oviducts. Surrounding cumulus cells are removed. Pronuclear embryos are then washed and stored until the time of injection. Randomly cycling adult female mice are paired with vasectomized males. Recipient females are mated at the same time as donor females. Embryos then are transferred surgically. The procedure for generating transgenic rats is similar to that of mice (see Hammer, et al., Cell 63:1099-1112, 1990). Rodents suitable for transgenic experiments can be obtained from standard commercial sources such as Charles River (Wilmington, Mass.), Taconic (Germantown, N.Y.), Harlan Sprague Dawley (Indianapolis, Ind.), etc.
[00136] The procedures for manipulation of the rodent embryo and for microinjection of DNA into the pronucleus of the zygote are well known to those of ordinary skill in the art (Hogan, et al., supra). Microinjection procedures for fish, amphibian eggs and birds are detailed in Houdebine and Chourrout, Experientia 47:897-905, 1991. Other procedures for introduction of DNA into tissues of animals are described in U.S. Pat. No. 4,945,050 (Sanford et al., Jul. 30, 1990).
[00137] Totipotent or pluripotent stem cells derived from the inner cell mass of the embryo and stabilized in culture can be manipulated in culture to incorporate nucleic acid sequences employing invention methods. A transgenic animal can be produced from such cells through injection into a blastocyst that is then implanted into a foster mother and allowed to come to term.
[00138] Methods for the culturing of stem cells and the subsequent production of transgenic animals by the introduction of DNA into stem cells using methods such as electroporation, calcium phosphate/DNA precipitation, microinjection, liposome fusion, retroviral infection, and the like are also well known to those of ordinary skill in the art. (See, for example, Teratocarcinomas and Embryonic Stem Cells, A Practical Approach, E. J. Robertson, ed., IRL Press, 1987). Reviews of standard laboratory procedures for microinjection of heterologous DNAs into mammalian (mouse, pig, rabbit, sheep, goat, cow) fertilized ova include: Hogan et al., Manipulating the Mouse Embryo (Cold Spring Harbor Press 1986); Krimpenfort et al., 1991, Bio/Technology 9:86; Palmiter et al., 1985, Cell 41:343; Kraemer et al., Genetic Manipulation of the Early Mammalian Embryo (Cold Spring Harbor Laboratory Press 1985); Hammer et al., 1985, Nature 315:680; Pursel et al., 1989, Science 244:1281; Wagner et al., U.S. Pat. No. 5,175,385; Krimpenfort et al., U.S. Pat. No. 5,175,384, the respective contents of which are incorporated by reference.
[00139] The final phase of the procedure is to inject targeted ES cells into blastocysts and to transfer the blastocysts into pseudopregnant females. The resulting chimeric animals are bred and the offspring are analyzed by Southern blotting to identify individuals that carry the transgene. Procedures for the production of non-rodent mammals and other animals have been discussed by others (see Houdebine and Chourrout, supra; Pursel, et al., Science 244:1281-1288, 1989; and Simms, et al., Bio/Technology 6:179-183, 1988). Animals carrying the transgene can be identified by methods well known in the art, e.g., by dot blotting or Southern blotting.
[00140] The term transgenic as used herein additionally includes any organism whose genome has been altered by in vitro manipulation of the early embryo or fertilized egg or by any transgenic technology to induce a specific gene knockout. The term "gene knockout" as used herein refers to the targeted disruption of a gene in vivo, with loss of function, that has been achieved by use of the invention vector. In one embodiment, transgenic animals having gene knockouts are those in which the target gene has been rendered nonfunctional by an insertion targeted to a pseudo-recombination site located within the gene sequence.
Gene Therapy and Disorders
[00141] A further embodiment of the invention comprises a method of treating a disorder in a subject in need of such treatment. In one embodiment of the method, at least one cell or cell type (or tissue, etc.) of the subject has a target recombination sequence for an altered φC31 integrase of the present invention, such as a pseudo attP site. This cell(s) is transformed with a nucleic acid construct (a "targeting construct") comprising a second recombination sequence and one or more polynucleotides of interest (typically a therapeutic gene). Into the same cell an altered φC31 integrase is introduced that specifically recognizes the recombination sequences under conditions such that the nucleic acid sequence of interest is inserted into the genome via a recombination event. Subjects treatable using the methods of the invention include both humans and non-human animals. Such methods utilize the targeting constructs and altered φC31 integrase of the present invention.
[00142] A variety of disorders may be treated by employing the method of the invention including monogenic disorders, infectious diseases, acquired disorders, cancer, and the like. Exemplary monogenic disorders include ADA deficiency, cystic fibrosis, familial hypercholesterolemia, hemophilia, chronic granulomatous disease, Duchenne muscular dystrophy, Fanconi anemia, sickle-cell anemia, Gaucher's disease, Hunter syndrome, X-linked SCID, and the like.
[00143] Infectious diseases treatable by employing the methods of the invention include infection with various types of virus including human T-cell lymphotropic virus, influenza virus, papilloma virus, hepatitis virus, herpes virus, Epstein-Barr virus, immunodeficiency viruses (HIV, and the like), cytomegalovirus, and the like. Also included are infections with other pathogenic organisms such as Mycobacterium tuberculosis, Mycoplasma pneumoniae, and the like or parasites such as Plasmodium falciparum, and the like.
[00144] The term "acquired disorder" as used herein refers to a noncongenital disorder. Such disorders are generally considered more complex than monogenic disorders and may result from inappropriate or unwanted activity of one or more genes. Examples of such disorders include peripheral artery disease, rheumatoid arthritis, coronary artery disease, and the like.
[00145] A particular group of acquired disorders treatable by employing the methods of the invention include various cancers, including both solid tumors and hematopoietic cancers such as leukemias and lymphomas. Solid tumors that are treatable utilizing the invention method include carcinomas, sarcomas, osteomas, fibrosarcomas, chondrosarcomas, and the like. Specific cancers include breast cancer, brain cancer, lung cancer (non-small cell and small cell), colon cancer, pancreatic cancer, prostate cancer, gastric cancer, bladder cancer, kidney cancer, head and neck cancer, and the like.
[00146] The suitability of the particular place in the genome is dependent in part on the particular disorder being treated. For example, if the disorder is a monogenic disorder and the desired treatment is the addition of a therapeutic nucleic acid encoding a non-mutated form of the nucleic acid thought to be the causative agent of the disorder, a suitable place may be a region of the genome that does not encode any known protein and which allows for a reasonable expression level of the added nucleic acid. Methods of identifying suitable places in the genome are known in the art and identification of target recombination sequences is discussed herein in the context of the altered recombinases of the present invention.
[00147] The nucleic acid construct (e.g., a targeting vector) useful in this embodiment is additionally comprised of one or more nucleic acid fragments of interest. Preferred nucleic acid fragments of interest for use in this embodiment are therapeutic genes and/or control regions, as previously defined. The choice of nucleic acid sequence will depend on the nature of the disorder to be treated. For example, a nucleic acid construct intended to treat hemophilia B, which is caused by a deficiency of coagulation factor IX, may comprise a nucleic acid fragment encoding functional factor IX. A nucleic acid construct intended to treat obstructive peripheral artery disease may comprise nucleic acid fragments encoding proteins that stimulate the growth of new blood vessels, such as, for example, vascular endothelial growth factor, platelet-derived growth factor, and the like. Those of skill in the art would readily recognize which nucleic acid fragments of interest would be useful in the treatment of a particular disorder.
[00148] The nucleic acid construct can be administered to the subject being treated using a variety of methods. Administration can take place in vivo or ex vivo. By "in vivo," it is meant in the living body of an animal. By "ex vivo," it is meant that cells or organs are modified outside of the body; such cells or organs are typically returned to a living body.
[00149] Methods for the therapeutic administration of nucleic acid constructs are well known in the art. Nucleic acid constructs can be delivered with cationic lipids (Goddard, et al., Gene Therapy 4:1231-1236, 1997; Gorman, et al., Gene Therapy 4:983-992, 1997; Chadwick, et al., Gene Therapy 4:937-942, 1997; Gokhale, et al., Gene Therapy 4:1289-1299, 1997; Gao and Huang, Gene Therapy 2:710-722, 1995, all of which are incorporated by reference herein), using viral vectors (Monahan, et al., Gene Therapy 4:40-49, 1997; Onodera, et al., Blood 91:30-36, 1998, all of which are incorporated by reference herein), by uptake of "naked DNA", and the like. Techniques well known in the art for the transfection of cells (see discussion above) can be used for the ex vivo administration of nucleic acid constructs. The exact formulation, route of administration and dosage can be chosen by the individual physician in view of the patient's condition. (See e.g. Fingl et al., 1975, in "The Pharmacological Basis of Therapeutics", Ch. 1, p. 1).
[00150] It should be noted that the attending physician would know how to and when to terminate, interrupt, or adjust administration due to toxicity, to organ dysfunction, and the like. Conversely, the attending physician would also know how to adjust treatment to higher levels if the clinical response were not adequate (precluding toxicity). The magnitude of an administered dose in the management of the disorder being treated will vary with the severity of the condition to be treated, with the route of administration, and the like. The severity of the condition may, for example, be evaluated, in part, by standard prognostic evaluation methods. Further, the dose and perhaps dose frequency will also vary according to the age, body weight, and response of the individual patient.
[00151] In general at least 1-10% of the cells targeted for genomic modification should be modified in the treatment of a disorder. Thus, the method and route of administration will optimally be chosen to modify at least 0.1-1% of the target cells per administration. In this way, the number of administrations can be held to a minimum in order to increase the efficiency and convenience of the treatment.
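The relationship between the per-administration figure and the overall target can be illustrated with a short calculation. The following is a minimal sketch only: it assumes, purely for illustration, that each administration modifies a fixed fraction of the target cells that are still unmodified, independently of earlier administrations. That model and the function names are hypothetical and are not taken from the specification.

```python
import math


def cumulative_fraction(n_doses: int, per_dose_fraction: float) -> float:
    """Fraction of target cells modified after n_doses administrations.

    Assumption (illustrative only): every administration modifies the same
    fraction of the cells that are still unmodified.
    """
    return 1.0 - (1.0 - per_dose_fraction) ** n_doses


def administrations_needed(target_fraction: float, per_dose_fraction: float) -> int:
    """Smallest number of administrations that reaches target_fraction."""
    if not (0.0 < target_fraction < 1.0 and 0.0 < per_dose_fraction < 1.0):
        raise ValueError("fractions must lie strictly between 0 and 1")
    if per_dose_fraction >= target_fraction:
        return 1  # a single administration already meets the target
    ratio = math.log(1.0 - target_fraction) / math.log(1.0 - per_dose_fraction)
    return math.ceil(ratio)


if __name__ == "__main__":
    # Example: 10% overall target with 1% of target cells modified per dose.
    n = administrations_needed(0.10, 0.01)
    print(n, round(cumulative_fraction(n, 0.01), 4))
```

Under this assumed model, a higher fraction modified per administration reduces the number of administrations needed roughly in proportion, which is the sense in which the choice of method and route can hold the number of administrations to a minimum.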
[00152] Depending on the specific conditions being treated, such agents may be formulated and administered systemically or locally. Techniques for formulation and administration may be found in "Remington's Pharmaceutical Sciences," 1990, 18th ed., Mack Publishing Co., Easton, Pa. Suitable routes may include oral, rectal, transdermal, vaginal, transmucosal, or intestinal administration; parenteral delivery, including intramuscular, subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, intraperitoneal, intranasal, or intraocular injections, just to name a few.
[00153] The subject being treated will additionally be administered an altered φC31 integrase that specifically recognizes the recombination sequences that are selected for use. The particular altered φC31 integrase can be administered by including a nucleic acid encoding it as part of a nucleic acid construct, or as a protein to be taken up by the cells whose genome is to be modified. Methods and routes of administration will be similar to those described above for administration of a targeting construct comprising a recombination sequence and nucleic acid sequence of interest. The altered φC31 integrase protein is likely to be required only for a limited period of time for integration of the nucleic acid sequence of interest. Therefore, if introduced as a gene encoding an altered φC31 integrase, the vector carrying the altered φC31 integrase gene will lack sequences mediating prolonged retention. For example, conventional plasmid DNA decays rapidly in most mammalian cells. The altered φC31 integrase gene may also be equipped with gene expression sequences that limit its expression. For example, an inducible promoter can be used, so that altered φC31 integrase expression can be temporally limited by limited exposure to the inducing agent. One such exemplary group of promoters are tetracycline-responsive promoters, the expression of which can be regulated using tetracycline or doxycycline.
Kits
[00154] Also provided by the subject invention are kits for practicing the subject methods, as described above. In certain embodiments, the subject kits include at least one or more, and usually all, of the following: an altered φC31 integrase or an expression vector encoding the same, and a targeting vector as described above. For example, the altered φC31 integrase component can be provided in any suitable form (e.g., as a protein formulated for introduction into a target cell or in a recombinase vector which provides for expression of the desired recombinase following introduction into the target cell). In general, the targeting vector will include at least a first recombination site, such as an attB site, and a restriction endonuclease site for insertion of a nucleic acid sequence of interest.
[00155] The subject kits may further include an aqueous delivery vehicle, e.g. a buffered saline solution, etc. In addition, the kits may include one or more restriction endonucleases for use in transferring a nucleic acid of interest into the targeting vector. In the subject kits, the above components may be combined into a single aqueous composition for delivery into the host or separate as different or disparate compositions, e.g., in separate containers. Optionally, the kit may further include a vascular delivery means for delivering the aqueous composition to the host, e.g. a syringe etc., where the delivery means may or may not be pre-loaded with the aqueous composition.
[00156] In addition to above-mentioned components, the subject kits typically further include instructions for using the components of the kit to practice the subject methods. The instructions for practicing the subject methods are generally recorded on a suitable recording medium. For example, the instructions may be printed on a substrate, such as paper or plastic, etc. As such, the instructions may be present in the kits as a package insert, in the labeling of the container of the kit or components thereof (i.e., associated with the packaging or subpackaging) etc. In other embodiments, the instructions are present as an electronic storage data file present on a suitable computer readable storage medium, e.g. CD-ROM, diskette, etc. In yet other embodiments, the actual instructions are not present in the kit, but means for obtaining the instructions from a remote source, e.g. via the internet, are provided. An example of this embodiment is a kit that includes a web address where the instructions can be viewed and/or from which the instructions can be downloaded. As with the instructions, this means for obtaining the instructions is recorded on a suitable substrate.
EXAMPLES
[00157] The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention nor are they intended to represent that the experiments below are all or the only experiments performed. Efforts have been made to ensure accuracy with respect to numbers used (e.g. amounts, temperature, etc.) but some experimental errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, molecular weight is weight average molecular weight, temperature is in degrees Centigrade, and pressure is at or near atmospheric.
Methods and Materials
[00158] The following methods and materials were used in the examples below.
Plasmids
[00159] Plasmid pFC1 was used as the recipient plasmid for cloning libraries of mutant integrases for use in the pre-screen in bacteria. pFC1 (FIG 1, panel A) was derived from the plasmid pBCPB+ (Groth et al., PNAS 97:5995-6000 (2000)), which carries the chloramphenicol resistance gene and the lacZ gene flanked by φC31 attP and attB sites in the same orientation. An empty expression cassette containing the CMV promoter, the lac promoter, the SV40 polyA, and a multiple cloning site was cloned into the AgeI site of pBCPB+. A similar plasmid, pFC1-CTerm, also carried the wild-type C-terminal section of the φC31 integrase gene and was used as a recipient for DNA fragments containing mutations in the N-terminal catalytic domain of the integrase. Mutant integrase libraries were cloned downstream of the CMV and lac promoters, for expression in both mammalian cells and bacteria. In the event of a successful recombination between attB and attP, the lacZ gene was excised, giving rise to a plasmid termed pFC1-M, containing a candidate mutant integrase gene (FIG 1, Panel A and FIG 2, Panel A).
[00160] pFC1-M mutant integrase candidates were assayed in the mammalian screen by using the plasmid pBP-Green (FIG 1, panel B), which was derived from pEGFP-C1 (Stratagene, La Jolla, CA). The CMV promoter of pEGFP-C1 was reversed so that it no longer drove expression of the enhanced green fluorescent protein (eGFP) gene. The φC31 attP and attB sites were cloned in opposite orientations flanking the reversed CMV promoter. An integrase-mediated recombination reaction between the two att sites would result in "flipping" the CMV promoter, enabling it to drive expression of eGFP.
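The two reporter designs described above rely on the general behavior of site-specific recombination: sites in the same orientation lead to excision of the intervening DNA, and sites in opposite orientations lead to its inversion. The following is a minimal sketch of that logic only. The `Feature` class, the `recombine` function, the feature names, and the strand labels given to the hybrid sites are hypothetical illustrations and do not describe the actual plasmid maps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    strand: int  # +1 = forward orientation, -1 = reverse orientation


def recombine(features):
    """Recombine the single attP and single attB in an ordered feature list.

    Same orientation  -> excision of everything between the sites.
    Opposite orientation -> inversion of everything between the sites.
    Hybrid product sites are labeled generically; their strands are arbitrary.
    """
    i = next(k for k, f in enumerate(features) if f.name == "attP")
    j = next(k for k, f in enumerate(features) if f.name == "attB")
    lo, hi = min(i, j), max(i, j)
    left, right = features[:lo], features[hi + 1:]
    if features[lo].strand == features[hi].strand:
        # One hybrid site stays behind; the excised circle is discarded.
        return "excision", left + [Feature("hybrid_att", features[lo].strand)] + right
    inverted = [Feature(f.name, -f.strand) for f in reversed(features[lo + 1:hi])]
    product = left + [Feature("hybrid_att", +1)] + inverted + [Feature("hybrid_att", -1)] + right
    return "inversion", product


# Bacterial pre-screen layout (illustrative): lacZ flanked by sites in the same orientation.
pfc1_like = [Feature("integrase_candidate", +1), Feature("attP", +1),
             Feature("lacZ", +1), Feature("attB", +1)]

# Mammalian screen layout (illustrative): reversed CMV flanked by sites in opposite orientations.
pbp_green_like = [Feature("attP", +1), Feature("CMV", -1),
                  Feature("attB", -1), Feature("eGFP", +1)]

if __name__ == "__main__":
    for plasmid in (pfc1_like, pbp_green_like):
        outcome, product = recombine(plasmid)
        print(outcome, [(f.name, f.strand) for f in product])
```

In this sketch, loss of lacZ corresponds to the white-colony readout of the bacterial pre-screen, and the CMV promoter switching to the same strand as eGFP corresponds to the fluorescence readout of the mammalian screen.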
[00161] Plasmids pCSI (pCMVInt) (Groth et al., 2000), expressing wild-type φC31 integrase, and pCSmI (Olivares et al., Nat. Biotech., 20:1124-1128 (2002)), containing a non-functional integrase gene, have been described. pCS-P1, pCS-P2, and pCS-P3 (FIG 2, panel B), as well as pCS-dP1, pCS-dP2, and pCS-dP3 (FIG 2, panel C) were constructed by cloning integrases P1, P2, P3, dP1, dP2, and dP3 from the pFC1-M plasmid into the pCS (Olivares et al., 2002) backbone. The "d" signified deletion of the N-terminal fusion sequence. Plasmid pCS was digested with EcoRI and SmaI (New England Biolabs, Ipswich, MA). pFC-P1, -P2, and -P3 (FIG 2, panel A) were digested with BamHI, treated with DNA polymerase I (Klenow; New England Biolabs) to blunt the ends, and then digested with EcoRI. The EcoRI-BamHI fragments containing the mutant integrase genes including the N-terminal fusion sequence were ligated into the EcoRI-SmaI-digested pCS backbone to give pCS-P1, pCS-P2, and pCS-P3 (FIG 2, panel B). To clone the mutants without the N-terminal fusion sequence, pCS was digested with SmaI. In parallel, pFC-P1, -P2, and -P3 were digested with AseI, BamHI, AlwNI, and DraIII. The AseI-BamHI fragment containing the mutant integrase gene was treated with DNA polymerase I (Klenow) to create blunt ends and then ligated into the SmaI-linearized pCS vector to give pCS-dP1, pCS-dP2, and pCS-dP3 (FIG 2, Panel C).
[00162] The plasmid pNC-attB (Thyagarajan et al., Mol. Cell Biol., 21:3926-3934 (2001)) (FIG 2, Panel D) was used for in vitro cell culture assays in mammalian cells. This donor plasmid carried the φC31 attB site positioned 3' of a CMV promoter, the eGFP gene, and a neomycin resistance gene. The pVFB donor plasmid, used for mouse liver studies, was generated by cloning a human factor IX mini-gene under the hAAT promoter (Miao et al., Mol. Ther., 1:522-532 (2000); Olivares et al., (2002)) into the pVax plasmid (Invitrogen, Carlsbad, CA). pVax was digested with SpeI and then self-ligated to delete the CMV promoter (pVax-CMV⁻). This pVax-CMV⁻ vector was then digested with SpeI and XhoI. In parallel, pBS-hFIX-attB (Ehrhardt et al., Molecular Therapy 11:695-706 (2005)) was digested with SpeI and XhoI to release a fragment containing the hFIX gene with hAAT promoter and the φC31 attB site. This fragment was ligated into pVax-CMV⁻ to generate pVFB (FIG 2, Panel E). The structure of the plasmid was confirmed by diagnostic digests and DNA sequencing.
Mammalian Screen For Improved Integrase Mutants
[00163] White colonies from the bacterial screen were picked and plasmid DNA was purified from them using the QiaPrep8 Turbo Miniprep Kit (Qiagen, Valencia, CA). 25 ng of this plasmid DNA, along with 25 ng pBP-Green (FIG 1, Panel B), were transfected into 293 cells in 96-well plates using FuGENE 6 (Roche, Palo Alto, CA). Each mutant was transfected into 8 wells of a 96-well plate containing 293 cells. A negative control (pFC1) and the wild-type integrase (pFC1-WT, generated by cloning the wild-type φC31 integrase into pFC1) were also transfected into 8 wells of each plate to provide a comparison for the mutants. Three days after transfection, cells were analyzed for eGFP expression using a Guava PCA-96 Analyzer (Guava Technologies, Hayward, CA). The mean fluorescence intensity was used as an indicator of efficiency of the mutant integrases.
Integrase Protein Analysis
[00164] HA-tagged versions of the wild-type and integrase mutants were generated. pCSI-HA was constructed by introducing the sequence coding the HA tag, YPYDVPY (SEQ ID NO:32), onto the C-terminal end of the φC31 integrase gene by PCR of a fragment using the forward primer FC31-873-F, 3'-CTGAGGTGATCTACAAGAA (SEQ ID NO:33), and the reverse primer FC31-CHA-R, 3'-ATTCCGGGATCCAGTCTAAGCGTAGTCTGGGACGTCGTATGGGTACGCCGCTACGTCTTCCGTG (SEQ ID NO:34), using pCSI as the template. Both the PCR product and pCSI were digested with BstEII and BamHI, gel extracted, and ligated to form pCSI-HA. Similarly, the mutant integrase plasmids were digested with BstEII and BamHI and ligated with the digested PCR product containing the HA tag. Total cell lysates were prepared from HeLa cells transfected with pCSI-HA, pCS-dP2-HA, pCS-P2-HA, pCS-dP3-HA, and pCS-P3-HA, and integrase expression levels were determined by Western blot analysis. The cell extracts were prepared using cell lysis buffer (1 M Tris pH 7.5-7.8, 0.5 M EDTA, 5 M NaCl, 0.5 M NaF, 1 M β-glycerophosphate, 0.2 M Na orthovanadate, 10% Triton X-100, 0.1 M PMSF in isopropanol, and aprotinin), separated by 10% SDS-PAGE, and the protein bands transferred to nitrocellulose (Bio-Rad, Hercules, CA).
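As an illustration of how the reverse primer carries the tag, the primer sequence given above can be reverse-complemented and translated in silico. The following is a minimal sketch that assumes the standard genetic code; the helper names are hypothetical and no particular software was named in the methods. Run on the FC31-CHA-R sequence, one reading frame of the reverse complement contains a stretch beginning with the YPYDVP residues of the tag, followed directly by a stop codon, and the primer also contains a BamHI recognition site (GGATCC), consistent with the BamHI digestion step.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str, frame: int = 0) -> str:
    """Translate complete codons of seq starting at offset frame ('*' = stop)."""
    codons = (seq[p:p + 3] for p in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Reverse primer FC31-CHA-R as printed in the text (SEQ ID NO:34).
REVERSE_PRIMER = "ATTCCGGGATCCAGTCTAAGCGTAGTCTGGGACGTCGTATGGGTACGCCGCTACGTCTTCCGTG"

coding_strand = reverse_complement(REVERSE_PRIMER)
frames = [translate(coding_strand, f) for f in range(3)]

if __name__ == "__main__":
    for f, peptide in enumerate(frames):
        print(f, peptide)
    print("BamHI site present:", "GGATCC" in REVERSE_PRIMER)
```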
[00165] The blot was incubated for 1 hour in blocking buffer (1X TBS, containing 5% milk), followed by overnight incubation of the blot with blocking buffer containing an anti-β-actin antibody and a mouse anti-HA antibody (CalBiochem, San Diego, CA) at a 1:1,000 dilution at 4° C. The blot was washed three times with 1X TBS containing 0.2% Tween for 7 min each and then incubated for 1 hour at room temperature with blocking buffer (1X TBS, with 0.2% Tween and 5% milk) containing HRP-labeled goat anti-mouse IgG (CalBiochem) at a 1:10,000 dilution. The blot was washed three times again for 7 min each with wash buffer (1X TBS with 0.2% Tween) before developing using the SuperSignal West Pico Chemiluminescent Substrate kit (Pierce Biotechnology, Rockford, IL).
Chromosomal Integration Assays
[00166] 50-70% confluent HeLa cells in 6-well plates were co-transfected, in duplicate, with 20 ng pNC-attB donor plasmid and 980 ng of plasmid expressing the mutant integrases (pCS-P1, pCS-P2, pCS-P3, pCS-dP1, pCS-dP2, and pCS-dP3) using 3 μl FuGENE 6 (Roche) as the transfection reagent according to the manufacturer's instructions. The cells were cultured in Dulbecco's modified Eagle medium (DMEM; Gibco) containing 4 mM L-glutamine and supplemented with 9% fetal bovine serum (FBS; Gibco) and 1% penicillin-streptomycin (Gibco). 24 hours after transfection, the cells from each well were trypsinized and seeded onto two 100 mm plates. 72 hours after the transfection, the medium was replaced with DMEM containing 1.25 mg/ml G418 (Invitrogen), and selection was maintained for 14-17 days, after which the colonies were counted.
[00167] 293-P3 cells (Thyagarajan et al., Mol. Cell. Biol., 21:3926-3934 (2001)) were used to test the ability of mutants to integrate at an attP site placed in the chromosomes. 293-P3 cells at 70% confluence in 6-well plates were transfected in triplicate with 20 ng pNC-attB donor plasmid and 980 ng of plasmids expressing the mutant integrases (pCS-P1, pCS-P2, pCS-P3, pCS-dP1, pCS-dP2, and pCS-dP3) using 3 μl FuGENE 6. The cells from each well were trypsinized and seeded onto two 100 mm dishes and cultured in complete DMEM for 24 hours, after which drug selection was initiated. The 293-P3 cells on one 100 mm dish from each transfection were selected by using a combination of 350 μg/ml of G418 and 100 μg/ml of zeocin (Invitrogen), while the cells on the second 100 mm dish were selected using 350 μg/ml of G418 alone. During selection, cells were maintained at 37°C in a CO2 incubator and were supplied with fresh medium every 2-3 days until well-isolated colonies could be seen, usually a period of 14-17 days.
Integration Site Analysis In Cultured Human Cells
[00168] To demonstrate genomic integration at pseudo attP sites mediated by pCS-P2 and pCS-dP2 mutant integrases, 50-60% sub-confluent HeLa cells in 6-well plates were transfected using FuGENE 6 with 20 ng pNC-attB and 980 ng pCS-P2 or pCS-dP2. 24 hours post-transfection, the cells in each well were trypsinized, diluted 1:1, 1:2, 1:4, and 1:10, and seeded onto 100 mm plates. 24 hours later, selection was carried out at 37° C in a CO2 incubator using DMEM media containing 4 mM L-glutamine, 10% FBS, 1% penicillin/streptomycin, and 1.25 mg/ml G418. After 14 days, colonies were trypsinized as pools and allowed to grow to confluency in DMEM with 10% FBS and 1% antibiotics. Genomic DNA was isolated using the DNEasy Tissue kit (Qiagen). 200 ng of this genomic DNA was used as a template for PCR amplification with primers designed to detect integration at five of the most frequently observed integration sites for φC31 integrase in the human genome (Chalberg et al., J. Mol. Biol. 357:28-48 (2006)). The amplified bands were excised, TOPO-cloned, and sequenced for verification.
[00169] To demonstrate specific integration at a pre-integrated attP site in the human genome, 60-70% confluent 293-P3 cells were transfected in a 6-well plate with 20 ng of donor plasmid pNC-attB and 980 ng of pCSI or mutant integrase-expressing pCS-P3 or pCS-dP3 using 3 μl FuGENE 6. 24 hours after transfection, the cells in each well were split to two 100 mm plates. 48 hours after transfection, selection was begun using media containing 350 μg/ml G418. After 14-17 days of selection to ensure well-isolated colonies, each colony was individually trypsinized, transferred to one well of a 6-well plate, and each clone was allowed to grow to confluency. Fifty clones were picked for each integrase transfection. Genomic DNA was isolated using the DNEasy Tissue kit (Qiagen) following the manufacturer's protocol.
Integration at the attP site was scored by PCR amplification of 200 ng of genomic DNA using primers designed to detect the junction created by recombination between the attP site placed in the genome and the attB site present in the donor plasmid. The primer sequences were P3F: 5'-AGGTCTATATAAGCAGAGCTC (SEQ ID NO:35) and P3R: 5'-TGAGCACCGGAACGGCACTGG (SEQ ID NO:36). The reaction was carried out as 10 cycles of 95° C for 30 s, 65° C for 30 s, and 72° C for 30 s, with the annealing temperature decreasing by 1 degree in each cycle, followed by 25 cycles of 95° C for 30 s, 55° C for 30 s, and 72° C for 30 s, and a final 72° C for 7 min. The PCR reactions were run on a 1% agarose gel, and the number of colonies having a band corresponding to the expected product size was counted to determine specificity for the attP site.
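The touchdown cycling program described above can be written out explicitly. The following is a minimal sketch, assuming the annealing temperature starts at 65° C in the first cycle and drops by 1 degree per cycle (so the tenth cycle anneals at 56° C) before the 25 cycles at 55° C; the function name and the step tuples are illustrative and are not part of the original protocol.

```python
from typing import List, Tuple

# One step is (name, temperature in degrees C, duration in seconds).
Step = Tuple[str, float, int]


def touchdown_program(start_anneal: float = 65.0,
                      touchdown_cycles: int = 10,
                      final_anneal: float = 55.0,
                      plateau_cycles: int = 25) -> List[List[Step]]:
    """Return one list of steps per cycle, ending with the final extension."""
    program: List[List[Step]] = []
    # Touchdown phase: annealing temperature drops 1 degree per cycle.
    for i in range(touchdown_cycles):
        program.append([
            ("denature", 95.0, 30),
            ("anneal", start_anneal - i, 30),
            ("extend", 72.0, 30),
        ])
    # Plateau phase: fixed annealing temperature.
    for _ in range(plateau_cycles):
        program.append([
            ("denature", 95.0, 30),
            ("anneal", final_anneal, 30),
            ("extend", 72.0, 30),
        ])
    # Final extension, 7 minutes.
    program.append([("final extension", 72.0, 7 * 60)])
    return program


if __name__ == "__main__":
    for number, cycle in enumerate(touchdown_program(), start=1):
        print(number, cycle)
```

Printing the program makes it easy to compare against a thermocycler setup sheet before a run.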
Hydrodynamic Injection In Mouse Livers
[00170] Eight week old female C57/BL6 mice were purchased from Charles River Laboratories (Wilmington, MA) and housed in the Stanford Research Animal Facility. Animals were acclimatized for 3-4 days prior to experimentation. The animal protocol was approved by the Stanford University Animal Use and Care Committee, based on NIH guidelines. Each experimental group consisted of 5-8 mice. 20 μg pVFB donor plasmid and 20 μg pCSml, pCSI, pCS-P2, or pCS-P3 in 1.8 ml of Hank's Balanced Salt Solution (HBSS; Gibco) were hydrodynamically injected over 6-8 seconds using a 3 ml Luer-Lok syringe (Becton Dickinson; Franklin Lakes, NJ) with a 211A G needle (Becton Dickinson) via the tail vein. To dilate the tail-vein, the mice were kept under a heat lamp for 3-7 minutes prior to injection.
Detection of Human Factor IX by ELISA
[00171] For quantification of hFIX, whole blood was collected retro-orbitally from the mice at various time points. The blood samples were allowed to clot for 2 hours at room temperature or overnight at 2-8°C before centrifuging for 20 min at approximately 2000 x g. The serum was removed and stored at < -20° C. The level of hFIX in serum was determined by using an ELISA protocol described previously (Olivares et al., Nature Biotechnology 20:1124-1128 (2002)). Briefly, ELISA plates were coated overnight at 4° C using monoclonal anti-hFIX antibody produced in mouse (Sigma, St. Louis, MO), diluted 1:1000 in coating buffer (0.1 M NaHCO3, pH 9.5). The primary antibody was discarded and the wells rinsed thrice with 1X phosphate buffered saline containing 0.5% Tween (PBST; Gibco). The plate was then incubated for 1 hour in blocking buffer (1X PBST containing 5% BSA; Roche), followed by 1-2 hours of incubation with the serum samples and a hFIX standard (Factor IX From Human Plasma, Sigma) diluted in blocking buffer at 37° C. The wells were washed three times with 1X PBST and subsequently incubated for 1 hour at 37° C with the secondary HRP-labeled goat anti-hFIX antibody (GAFIX-APHRP; Enzyme Research, South Bend, IN) at a 1:1200 dilution in blocking buffer. The wells were rinsed five times with 1X PBST before developing using an o-phenylenediamine (OPD; Sigma, St. Louis, MO) tablet dissolved in sodium citrate buffer, pH 4.5, and freshly added H2O2. The reaction was stopped by addition of 2N H2SO4, and the plate was read at 490 nm using a standard microplate reader (Bio-Rad). The data were analyzed using the Microplate Manager III Macintosh data analysis software.
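Converting an absorbance reading at 490 nm into an hFIX concentration requires a standard curve built from the hFIX standard. The text does not state which curve fit the Microplate Manager software applied, so the following is only a sketch that assumes simple linear interpolation between neighboring standards; the function name and the standard values in the usage example are hypothetical.

```python
from bisect import bisect_left
from typing import Sequence


def interpolate_concentration(a490: float,
                              std_abs: Sequence[float],
                              std_conc: Sequence[float]) -> float:
    """Linearly interpolate a concentration from a standard curve.

    std_abs must be strictly increasing; std_conc holds the matching
    concentrations of the standards. Readings outside the standard
    range are rejected rather than extrapolated.
    """
    if len(std_abs) != len(std_conc) or len(std_abs) < 2:
        raise ValueError("need at least two matched standards")
    if not std_abs[0] <= a490 <= std_abs[-1]:
        raise ValueError("reading is outside the standard curve")
    i = bisect_left(std_abs, a490)
    if i == 0:
        return std_conc[0]
    x0, x1 = std_abs[i - 1], std_abs[i]
    y0, y1 = std_conc[i - 1], std_conc[i]
    return y0 + (y1 - y0) * (a490 - x0) / (x1 - x0)


if __name__ == "__main__":
    # Hypothetical standards (absorbance, ng/ml), for illustration only.
    absorbances = [0.05, 0.20, 0.50, 1.10]
    concentrations = [0.0, 10.0, 50.0, 200.0]
    print(interpolate_concentration(0.35, absorbances, concentrations))
```

Any dilution applied to the serum sample would still need to be multiplied back in after interpolation.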
Integration Site Analysis in Mouse Liver
[00172] Three months after injection, animals were sacrificed and the livers collected. Genomic DNA was obtained from the livers as previously described (Laird et al., Nucleic Acids Res 19:4293 (1991)). Generally, sections from different lobes of the liver were finely chopped and treated with lysis buffer (100 mM Tris pH 8.5, 5 mM EDTA pH 8, 0.2% SDS, 200 mM NaCl, 100 μg/ml Proteinase K) at 55° C for 5 hours. The lysed liver samples were then centrifuged at 3,000 x g for 10-15 min and the supernatant carefully transferred to another tube containing 1 volume of isopropanol. The DNA was spooled and dissolved in Tris-EDTA pH 7.5, then extracted with phenol:chloroform, ethanol precipitated, and resuspended in Tris-EDTA. Integration into the mouse genome was confirmed by PCR analysis at the mpsL1 site. Primers used for the first round were attBF3: 5'-CGAAGCCGCGGTGCG (SEQ ID NO:37) and mpsL1R1: 5'-GTAAATGTTATTGCGGCTCT (SEQ ID NO:38) with the following PCR conditions: 5 min at 94° C, 35 cycles of 94° C for 30 s, 63° C for 30 s, and 72° C for 45 s, followed by 7 min at 72° C. 1 μl of the primary PCR product was used as a template for the second round of PCR. The primers for this round were attBF4: 5'-CGGTGCGGGTGCCA (SEQ ID NO:39) and mpsL1R2: 5'-GGTCATGGAGCCCCTTCACAA (SEQ ID NO:40), and run on a program of 5 min at 94° C, 35 cycles of 94° C for 30 s, 63° C for 30 s, and 72° C for 45 s, followed by 7 min at 72° C. The PCR product was subjected to agarose gel electrophoresis to detect a band corresponding to the expected size of 290 bp.
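As a quick sanity check on the nested PCR primers listed above, basic composition figures can be computed from the sequences. The sketch below uses the Wallace rule (2° C per A or T, 4° C per G or C), which is only a rough estimate for short oligonucleotides and is not how the 63° C annealing temperature in the text was chosen; the function names and dictionary labels are illustrative.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature estimate: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("primer contains characters other than A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def gc_fraction(primer: str) -> float:
    """Fraction of bases that are G or C."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Primer sequences as given in the text (SEQ ID NOs 37-40).
PRIMERS = {
    "first-round forward": "CGAAGCCGCGGTGCG",
    "first-round reverse": "GTAAATGTTATTGCGGCTCT",
    "second-round forward": "CGGTGCGGGTGCCA",
    "second-round reverse": "GGTCATGGAGCCCCTTCACAA",
}

if __name__ == "__main__":
    for label, seq in PRIMERS.items():
        print(f"{label}: {len(seq)} nt, "
              f"GC {gc_fraction(seq):.2f}, Tm ~{wallace_tm(seq)} C")
```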
Immunofluorescence Microscopy
[00173] Harvested mouse liver was cut into ~1 cm pieces, and liver sections were washed in PBS briefly, fixed in 4% paraformaldehyde (Electron Microscopy Sciences, Hatfield, PA) overnight, embedded in paraffin, and sectioned by Histo-Tec Laboratory (Hayward, CA). The sections were deparaffinized and rehydrated by using the Trilogy reagent (Cell Marque, Hot Springs, AR). Slides were rinsed with 1X PBS and washed three times with 1X PBST. The sections were then blocked with 10% rabbit serum in PBST for 1 hour at room temperature, washed with PBST, and incubated overnight at 4° C with goat anti-hFIX primary antibody (Affinity Biologicals, Ancaster, ON, Canada). After rinsing three times with 1X PBST, sections were incubated for 1 h at room temperature with rabbit anti-goat Alexa Fluor 488 (Invitrogen) as the secondary antibody. The sections were washed three times with 1X PBST and mounted using ProLong Gold Antifade Reagent with DAPI (Invitrogen). Fluorescence images were obtained using a Zeiss microscope.
Statistical Analysis
[00174] Data were analyzed using the Microsoft Excel graphing and statistical program. The Student's t-test assuming unequal variances was performed in order to determine significant differences between groups. A p value of <0.05 was considered statistically significant.
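The t-test assuming unequal variances is commonly known as Welch's t-test. The sketch below, which uses only the Python standard library, computes the test statistic and the Welch-Satterthwaite degrees of freedom for two groups; it stops short of the p value, which would require the t distribution from a table or a statistics package. The function name and the sample numbers are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, degrees of freedom) for Welch's t-test."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    # Squared standard errors of the two group means.
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
    print(welch_t(group_a, group_b))
```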
EXAMPLE 1 GENERATION OF IMPROVED MUTANT INTEGRASES
[00175] Libraries carrying mutant φC31 integrase genes were generated by using three different methods. In method one, site-directed mutants were synthesized by using overlapping oligonucleotides and high-fidelity PCR. This method was carried out primarily for alanine-scanning mutagenesis and for combining beneficial mutations. For alanine-scanning mutagenesis, all charged amino acids in the N-terminal catalytic domain (~amino acids 1-150) were replaced with alanine. In method two, error-prone PCR (GeneMorph, Stratagene, La Jolla, CA) was used to generate mutants that had mutations either throughout the integrase gene or localized within the N-terminal catalytic domain. In method three, mutator strains of E. coli (XL1-Red, Stratagene) were used to generate mutations throughout the integrase gene.
[00176] Some mutations were expected to inactivate the integrase, and such nonfunctional integrases were not of interest. In order to screen candidate mutants quickly for presence of functional integrases, the mutants were tested for their ability to mediate recombination in E. coli between native attB and attP sites on plasmid pFC1. This assay, diagrammed in FIG 1, Panel A, constituted a blue-white colony color screen. Integrases having reduced function were identified by their inability to recombine the att sites and delete the lacZ gene and consequent failure to produce white colonies. Such mutants, producing blue colonies in E. coli, were eliminated from further consideration. The majority of the mutants were non-functional or functionally impaired in this assay, while a smaller fraction were indistinguishable from wild-type integrase, and rare mutants (<3%) were improved. To perform the bacterial pre-screen, mutant integrase genes were cloned into a pFC1 vector, transformed into E. coli, and plated on LB-agar plates containing chloramphenicol and X-gal. After overnight incubation at 37°C, plates were screened for the presence of blue and white colonies (FIG 1, Panel A). White colonies signified plasmids carrying a functional integrase and were picked for further analysis.
[00177] Small-scale DNA preparations of plasmids derived from white colonies, carrying functional integrases, were screened for improved function over wild-type integrase in cultured human cells in an extra-chromosomal assay. This assay, diagrammed in FIG 1, Panel B, quantitatively measured the ability of the candidate integrase mutants to recombine the wild-type attB and attP sites on plasmid pBP-Green. Such recombination inverted or "flipped" the incorrectly oriented CMV promoter on the pBP-Green plasmid. Therefore, recombination gave rise to GFP expression, which was monitored by fluorescence (FIG 1, Panel B). Since a more efficient integrase can recombine more plasmids than a less efficient integrase, the mean fluorescence intensity of the green cells was used as a measure of the efficiency of the integrase. Control experiments showed that the mean fluorescence intensity of the green cells was not highly sensitive to the amount of integrase plasmid transfected into the cells (data not shown). Eight replicates of the "flipper" assay were performed with each mutant, in order to more reliably detect the expected small improvements in efficiency reflected by higher mean fluorescence. The best mutants showed improvement in activity over wild-type that typically ranged from 1.2 - 1.7 fold (Table 1). A collection of mutants that showed improved activity was sequenced to determine the location of the mutations. Several of these mutant integrases were chosen for combinatorial studies. For convenience, and because of the focus on increased efficiency, only mutants that had amino acid changes located in the N-terminal catalytic domain were used. Several mutants were combined in various configurations to generate a second generation of mutants. The second-generation mutants were synthesized by using overlapping oligonucleotides containing the desired sequences and high-fidelity PCR amplification. Second generation mutants were tested in the pBP-Green flipper assay. 
Some of the combinations produced mutants that showed higher catalytic efficiency than either of the parents. The amino acid changes and fold improvements of the first and second generation mutants with highest integration efficiencies are shown in Table 1. The best second-generation mutants, called P1, P2, and P3 and cloned in the pFC backbone (FIG 2, Panel A), had a 1.8 to 2.3-fold improved ability over wild-type integrase to recombine native attB and attP sites in the pBP-Green extra-chromosomal flipper assay. The P1 mutant was a combination of two individual mutants and had a total of five amino acid changes in the φC31 integrase protein. The P2 mutant combined three individual alanine-scanning mutations, for a total of three amino acid changes in the integrase protein. The P3 mutant was a combination of P1 and P2 and had nine amino acid changes in the integrase amino acid sequence including an inadvertent deletion of 3 base pairs that deleted an amino acid (Table 1).
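The fold improvements reported for the flipper assay compare the mean fluorescence intensity of a mutant with that of wild-type integrase across replicate wells. A minimal sketch of that ratio follows, assuming a plain ratio of replicate means with no background subtraction (the text does not say whether any was applied); the function name and the well readings are hypothetical.

```python
from statistics import mean
from typing import Sequence


def fold_over_wild_type(mutant_wells: Sequence[float],
                        wild_type_wells: Sequence[float]) -> float:
    """Ratio of the mean fluorescence of mutant wells to wild-type wells."""
    if not mutant_wells or not wild_type_wells:
        raise ValueError("both groups need at least one well")
    wt_mean = mean(wild_type_wells)
    if wt_mean <= 0:
        raise ValueError("wild-type mean fluorescence must be positive")
    return mean(mutant_wells) / wt_mean


if __name__ == "__main__":
    # Hypothetical mean fluorescence intensities from 8 wells each.
    wild_type = [100.0, 104.0, 98.0, 101.0, 99.0, 102.0, 97.0, 103.0]
    mutant = [170.0, 165.0, 172.0, 168.0, 175.0, 169.0, 166.0, 171.0]
    print(round(fold_over_wild_type(mutant, wild_type), 2))
```

Using eight replicates per construct, as the text describes, reduces the noise in each mean so that small fold differences are easier to distinguish.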
Table 1: Amino acid changes in integrases that showed improved efficiency in the extrachromosomal "flipper" assay
[Table 1 is reproduced as images in the original publication: Figure imgf000048_0001 and Figure imgf000049_0001.]
1 Numbers refer to positions in the wild-type 613 amino acid protein (SEQ ID NO:02). All mutants also carry a 33 amino acid N-terminal fusion sequence (MTMITPSAQLTLTKGNKSWSSLVTAASVLEFAT (SEQ ID NO:41)).
2 Three base pairs were inadvertently deleted when P3 was synthesized, leading to replacement of V6 and S7 with A.
[00179] This study reports the generation of two useful φC31 integrase variants. The P2 mutant has double the integration frequency of wild-type integrase, while the P3 mutant displays a several-fold improvement in specificity for the attP site. This work demonstrates the feasibility of using mutagenesis, in combination with bacterial and mammalian screens, to create and identify altered integrases with increased functionality.
[00180] The P2 mutant carries three mutations in the catalytic domain of φC31 integrase that change charged residues to alanines. The mutational changes in P2 involve amino acids 40, 44, and 52 (Table 1), all of which are well within the predicted ~160 amino acid catalytic domain of φC31 integrase (Smith et al., Molec. Microbiol. 44:299-307 (2002)). Among serine recombinase family members, the atomic structure has been solved to date only for the gamma-delta resolvase (Yang et al., Cell 82:193-207 (1995)). Based on amino acid sequence similarities and structural predictions, φC31 integrase may share a similar three-dimensional structure in the catalytic domain. It is plausible that the amino acid changes in P2, as they are in proximity to the catalytic serine residue, influence the activity of the enzyme.
[00181] Mutant integrase P3 resulted from combining the three amino acid changes present in P2 with the five amino acid changes of P1 (Table 1), at amino acids 2, 7, 9, 10, and 49, plus an inadvertent deletion of one amino acid (see Table 1). All of the changes in P3 are encompassed within the predicted catalytic domain. The P3 mutant provides increased integration specifically at the attP sequence. The results show that P3 combines the higher catalytic activity of the P2 mutant with the higher recognition specificity provided by P1 (FIG 5A).
EXAMPLE 2 AN N-TERMINAL FUSION SEQUENCE IS TRANSLATED
[00182] The efficiency of the mutant integrases in mediating genomic integration of plasmid DNA in mammalian cells was also examined. In order to compare directly the new mutants with the previously characterized wild-type integrase, pCSI, which is in the pCS backbone (Thorpe et al., Proc. Natl. Acad. Sci. USA 95:5505-5510 (1998)), the P1, P2, and P3 integrase genes were first transferred from the pFC1 backbone to the pCS backbone. During the subcloning process, it was noticed that a sequence that was part of the bacterial/mammalian promoter in the pFC1 backbone formed a potential translational fusion at the N-terminus of the mutant integrase genes. To determine if this sequence was actually being translated, the mutant integrases with or without the putative fusion sequence were cloned into a pCS backbone such that they were tagged with the hemagglutinin (HA) peptide to permit easy purification of the proteins. pCS-P1-HA, pCS-P2-HA, and pCS-P3-HA carried the fusion sequence (FIG 2, Panel B), while pCS-dP1-HA, pCS-dP2-HA, and pCS-dP3-HA lacked the fusion sequence (FIG 2, Panel C). To examine the sizes of the mutant integrases, Western blot analysis was performed on total HeLa cell lysates isolated 48 hours after transfecting plasmids encoding the HA-tagged integrases. As shown in FIG 3, the mutant integrases with the N-terminal fusion sequence were larger than those without the fusion sequence. The size difference corresponded to the expected 33 amino acids that would result from translation of the fusion sequence (Table 1). This result showed that the translational fusion was present in the mutant integrases. Further characterization of the fusion sequence by Edman N-terminal sequencing was attempted to verify that it was being translated. However, the N-terminus appeared to be blocked, possibly by myristoylation, inhibiting the sequencing reaction.
EXAMPLE 3 GENOMIC INTEGRATION EFFICIENCY IN HELA CELLS
[00183] The efficiency of genomic integration in mammalian cells mediated by the mutants was also compared to the wild-type integrase. HeLa cells were co-transfected with an attB-containing donor plasmid carrying a neomycin (G418) resistance gene (pNC-attB, FIG 2, Panel D) (Thyagarajan et al., Molecular and Cellular Biology 21:3926-3934 (2001)) and plasmids expressing the mutant integrases pCS-P1, pCS-P2, pCS-P3, pCS-dP1, pCS-dP2, pCS-dP3, the wild-type pCSI, or the negative control pCSml, carrying an inactive integrase (Olivares et al., Nature Biotechnology 20:1124-1128 (2002)). After two weeks of selection, the numbers of G418-resistant colonies were counted to determine the integration efficiencies among the integrases. As shown in FIG 4A, mutant integrases containing the fusion had higher integration efficiencies than the integrases from which the fusion sequence had been removed. In addition, pCS-P2 had an integration frequency that was approximately two-fold elevated compared to wild-type φC31 integrase.
EXAMPLE 4 CHARACTERIZATION OF PSEUDO ATTP SITES IN HELA CELLS
[00184] Example 3 showed that the P2 integrase has a higher integration efficiency in HeLa cells, but did not address the integration specificity of the integrase. To shed light on the integration specificity of the mutant integrases pCS-P2 and pCS-dP2 in the human genome, it was investigated whether integration occurred at five pseudo attP sites that a previous study (Chalberg et al., J. Mol. Biol. 357:28-48 (2006)) had characterized as commonly used by wild-type φC31 integrase in human tissue culture cells. HeLa cells were transfected with pNC-attB and either pCSI, pCS-P2, or pCS-dP2, and G418 selection was carried out. The cells were either plated undiluted or were diluted 1:2, 1:4, or 1:10, to create populations representing various numbers of clones. For each integrase, approximately 250 colonies were pooled from the undiluted plate, 130 colonies from the 1:2 diluted plate, 60 colonies from the 1:4 diluted plate, and 25 colonies from the 1:10 diluted plate. Genomic DNA was isolated from the pools, and PCR analysis was performed to look for integration at the five pseudo sites. This analysis utilized PCR primers (Chalberg et al., J. Mol. Biol. 357:28-48 (2006)) that specifically detected junction fragments created by the juxtaposition of attB and chromosomal sequences located at the five pseudo attP sites. As shown in FIG 4B, integration at all five pseudo attP sites was observed with wild-type integrase, as well as with the pCS-P2 and pCS-dP2 mutants. This result shows that the integration specificity of the P2 mutant does not differ at a gross level from that of the wild-type φC31 integrase, despite a higher integration efficiency.
EXAMPLE 5 INTEGRATION INTO A PRE-INTEGRATED CHROMOSOMAL ATTP SITE
[00185] To examine the extent of specificity of the mutant integrases for the wild-type attP site, the 293-P3 cell line was used, which contains a randomly integrated expression cassette carrying an attP site and a promoterless zeocin resistance gene (Thyagarajan et al., Molecular and Cellular Biology 21:3926-3934 (2001)). 293-P3 was co-transfected with pNC-attB and the three mutant integrases, P1, P2, and P3, with or without the N-terminal fusion sequence. Integration of pNC-attB at the chromosomal attP site was expected to give rise to colonies that were resistant to both G418 and zeocin. Integration at other locations would result in colonies that were resistant to neomycin, but sensitive to zeocin. After two weeks of selection with either G418 and zeocin or G418 alone, the numbers of antibiotic resistant colonies were counted. Drug-resistant clones were picked, and genomic DNA was isolated and analyzed by PCR for presence of a recombination junction band diagnostic for integration at the attP site.
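The dual-selection logic in this paragraph amounts to a small decision table: resistance to both drugs indicates integration at the chromosomal attP site, while resistance to G418 alone indicates integration elsewhere. The sketch below encodes that table; the enum and function names are illustrative, and the case of a clone resistant to zeocin but not to G418 is not discussed in the text, so it is reported separately as an assumption of this sketch.

```python
from enum import Enum


class Outcome(Enum):
    AT_ATTP = "integration at the chromosomal attP site"
    ELSEWHERE = "integration at another location"
    NO_INTEGRATION = "no stable donor integration detected"
    NOT_DESCRIBED = "phenotype not described in the text"


def classify_clone(g418_resistant: bool, zeocin_resistant: bool) -> Outcome:
    """Interpret a clone's drug-resistance phenotype."""
    if g418_resistant and zeocin_resistant:
        # Donor integrated at attP, activating the promoterless zeocin gene.
        return Outcome.AT_ATTP
    if g418_resistant:
        # Donor integrated, but not where it can drive zeocin resistance.
        return Outcome.ELSEWHERE
    if zeocin_resistant:
        # Assumption of this sketch: flagged rather than interpreted.
        return Outcome.NOT_DESCRIBED
    return Outcome.NO_INTEGRATION


if __name__ == "__main__":
    for g418 in (True, False):
        for zeo in (True, False):
            print(g418, zeo, classify_clone(g418, zeo).value)
```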
[00186] The numbers of G418 and zeocin resistant colonies are depicted in FIG 5A.
Modest numbers of colonies were obtained with pCSI, encoding the wild-type integrase, indicating a moderate level of integration at the pre-integrated attP site. Approximately twice as many colonies were obtained with pCS-P2, consistent with increased integration efficiency, but no increased specificity for the attP site. However, pCS-Pl and pCS-P3 both showed a greatly elevated number of colonies having dual resistance, consistent with increased integration at the attP site. pCS-P3 had the highest colony numbers, showing that it was the most active specificity mutant, with 25-fold more integration at attP compared to the wild-type integrase. The results also show that the fusion sequence was beneficial in conferring on pCS-P3 an increased ability to integrate at attP (FIG 5A).
[00187] To measure the integration specificity of pCSI, pCS-P3, and pCS-dP3 for the pre-integrated attP site more directly, G418-resistant clones generated by these integrases were analyzed for the numbers of clones having integration at the attP site, versus at another location. To perform this analysis, approximately 50 G418-resistant clones were picked for each integrase, genomic DNA was isolated, and PCR analysis was carried out to determine the presence or absence of a band diagnostic for integration at attP. As summarized in FIG 5B, 5% of the clones (3 out of 60) generated by pCSI carried an integration at the pre-integrated attP site, whereas 16% (8 out of 50) and 44% (22 out of 50) of the clones generated by pCS-dP3 and pCS-P3, respectively, carried integration events at attP. Since φC31 integrase generally mediates a single integration event per cell, this increase shows that mutant integrase pCS-P3 had significantly higher specificity for the attP site, compared to integration at pseudo attP sites. Furthermore, the pCS-P3 mutant integrase carrying the fusion sequence had a higher specificity for attP than the version (pCS-dP3) in which the fusion sequence had been removed (FIG 5B).
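The percentages quoted above follow directly from the clone counts, and the same counts give the fold change in attP specificity relative to wild-type integrase. A short sketch of that arithmetic, using the counts stated in the paragraph (the variable and function names are illustrative):

```python
from typing import Dict, Tuple

# (clones with integration at attP, clones analyzed), as stated in the text.
CLONE_COUNTS: Dict[str, Tuple[int, int]] = {
    "pCSI": (3, 60),
    "pCS-dP3": (8, 50),
    "pCS-P3": (22, 50),
}


def attp_percent(positive: int, total: int) -> float:
    """Percentage of analyzed clones that integrated at attP."""
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("counts must satisfy 0 <= positive <= total, total > 0")
    return 100.0 * positive / total


def fold_vs(reference: str, construct: str) -> float:
    """Ratio of attP percentages between a construct and a reference."""
    return attp_percent(*CLONE_COUNTS[construct]) / attp_percent(*CLONE_COUNTS[reference])


if __name__ == "__main__":
    for name, (pos, total) in CLONE_COUNTS.items():
        print(f"{name}: {attp_percent(pos, total):.0f}% "
              f"({fold_vs('pCSI', name):.1f}-fold vs pCSI)")
```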
EXAMPLE 6 INTEGRATION EFFICIENCY IN MOUSE LIVER IN VIVO
[00188] To determine if the properties that these mutant integrases exhibited in human cultured cells would translate to a mouse model system, the pCS-P2 and pCS-P3 mutant integrases were tested in vivo in mouse liver studies. Hydrodynamic tail vein injection (Zhang et al., Human Gene Therapy 10:1735-1737 (1999); Liu et al., Gene Therapy 6:1258-1266 (1999)) was used to achieve efficient delivery of plasmid DNA to the liver. A donor plasmid, pVFB, carrying the φC31 attB site and the human factor IX (hFIX) gene (FIG 2, Panel E) was co-injected with either pCSI carrying the wild-type φC31 integrase gene; pCSml, an identical plasmid carrying a point mutation inactivating the integrase; pCS-P2; or pCS-P3. In addition, two animals were injected with buffer alone, and two animals were used as naïve controls.
[00189] At various time points, blood was collected and serum hFIX levels were determined by ELISA. The results of this study are depicted in FIG 6A. The group that received wild-type φC31 integrase (pCSI) displayed a 3.9-fold increase in hFIX levels over the group that received pCSml, a significant increase (p<0.05). Mice that were co-injected with mutant integrase pCS-P2 had a significant (2.3-fold) increase in hFIX levels over pCSI. This result shows that the P2 mutant can elevate the level of integration in mouse liver, as it did in cultured human cells. The increased hFIX levels mediated by pCSI and pCS-P2 persisted during the three-month duration of the experiment and represented therapeutic levels of hFIX. In contrast, the levels of hFIX generated by pCS-P3 mutant were not significantly higher than those generated by the pCSml inactive integrase.
[00190] To further investigate the ability of the wild-type and mutant integrases to mediate integration and provide long-term expression of hFIX, immunofluorescence staining was performed on liver sections using an antibody against hFIX. The livers of mice in the groups described above were harvested at three months after injection, a time when little unintegrated plasmid DNA is present (Olivares et al., Nature Biotechnology 20: 1124-1128 (2002)). It was expected, then, that most hFIX expression detected was coming from integrated plasmid DNA.
[00191] FIG 6B shows representative liver sections from mice that were uninjected, received buffer alone, or received pVFB along with pCSml, pCSI, pCS-P2, or pCS-P3. These sections revealed that a significantly higher number of cells stained positive for hFIX in livers that received wild-type integrase, compared to the control group that received the inactive form of integrase, as expected. Moreover, the P2 mutant gave rise to more cells expressing hFIX than did the wild-type integrase. This result shows that P2 had a higher integration efficiency than wild-type φC31 integrase. By contrast, the P3 mutant generated a lower number of stained cells, consistent with an integration efficiency at pseudo attP sites that was lower than that of the wild-type or P2 integrases.
[00192] To quantify the percentage of hFIX positive cells in these sections, hFIX positive cells and all nuclei visible from the DAPI stain were counted. Mice that received pCSml had about 1.9% of hepatocytes that were positive for hFIX. Much of this signal may be due to random integration of the pVFB plasmid following hydrodynamic delivery of many copies of plasmid DNA into the hepatocytes. In the group that was given pCSI, approximately 12.4% of the hepatocytes were positive for hFIX, representing a robust integration frequency. The P2 mutant resulted in an even higher integration efficiency, with approximately 18.5% of the cells expressing hFIX. However, the P3 mutant generated only about 4.25% positive cells.
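The percentage of hFIX positive cells is the number of stained cells divided by the number of DAPI-stained nuclei. When several microscope fields are counted per liver, one reasonable way to combine them, assumed here because the text does not specify it, is to pool the counts before dividing so that fields with more cells carry more weight. The function name and the field counts below are hypothetical.

```python
from typing import Iterable, Tuple


def pooled_percent_positive(fields: Iterable[Tuple[int, int]]) -> float:
    """Percent positive cells from (positive cells, total nuclei) per field."""
    positives = 0
    nuclei = 0
    for pos, total in fields:
        if total < 0 or not 0 <= pos <= total:
            raise ValueError("each field needs 0 <= positive <= total nuclei")
        positives += pos
        nuclei += total
    if nuclei == 0:
        raise ValueError("no nuclei were counted")
    return 100.0 * positives / nuclei


if __name__ == "__main__":
    # Hypothetical counts from three fields of one section.
    example_fields = [(12, 100), (25, 200), (19, 150)]
    print(round(pooled_percent_positive(example_fields), 2))
```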
EXAMPLE 7 INTEGRATION AT A KNOWN PSEUDO ATTP SITE IN MOUSE LIVER
[00193] Previous studies utilizing φC31 integrase in the liver and muscle of mice have demonstrated that the enzyme frequently mediates integration at a genomic location termed mpsL1 that is located on chromosome 2 (Thyagarajan et al., Molecular and Cellular Biology 21:3926-3934 (2001); Olivares et al., Nature Biotechnology 20:1124-1128 (2002); Held et al., Molecular Therapy 11:399-408 (2005); Bertoni et al., Proc. Natl. Acad. Sci. USA 103:419-424 (2006); and Portlock et al., Human Gene Therapy 17:871-876 (2006)). Integration at this site has been demonstrated by PCR, using primers specific for the vector and for sequences at mpsL1 (Olivares et al., Nature Biotechnology 20:1124-1128 (2002)).
[00194] To investigate whether the mutant integrases also utilized this integration site in the mouse genome, genomic DNA was isolated from the livers of two animals per group and analyzed by PCR for integration of pVFB at the mpsL1 site. PCR bands of the expected size were seen in the positive control lane, as well as in samples from the groups that received pVFB and pCSI or pCS-P2 (FIG 7, Panel A). The PCR product was not observed in reactions using DNA from animals that received the inactive integrase or the P3 mutant integrase. The PCR bands from the positive control liver, as well as from the pCSI and pCS-P2 livers, were excised and subjected to DNA sequencing. This analysis confirmed that integration took place at mpsL1 (FIG 7, Panel B). Furthermore, sequence features, including small deletions near the recombination junction that are characteristic of φC31-mediated recombination, were observed in some instances (FIG 7, Panel B).
[00195] The preceding merely illustrates the principles of the invention. It will be appreciated that those skilled in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are principally intended to aid the reader in understanding the principles of the invention and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents and equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure. The scope of the present invention, therefore, is not intended to be limited to the exemplary embodiments shown and described herein. Rather, the scope and spirit of the present invention is embodied by the appended claims.

Claims

That which is claimed is:
1. A nucleic acid encoding an altered φC31 integrase, wherein said altered φC31 integrase comprises an amino acid sequence of SEQ ID NO:04, SEQ ID NO:06, or SEQ ID NO:08.
2. An altered φC31 integrase comprising an amino acid sequence of SEQ ID NO:04, SEQ ID NO:06, or SEQ ID NO:08.
3. An expression cassette, comprising:
(a) a transcriptional initiation region functional in an expression host;
(b) a nucleic acid encoding an altered φC31 integrase, wherein said altered φC31 integrase comprises an amino acid sequence of SEQ ID NO:04, SEQ ID NO:06, or SEQ ID NO:08, and
(c) a transcriptional termination region functional in said expression host.
4. A system comprising:
(a) a targeting vector comprising a vector attachment site; and
(b) a nucleic acid encoding an altered φC31 integrase that recognizes said vector attachment site as a substrate, wherein said altered φC31 integrase comprises an amino acid sequence of SEQ ID NO:04, SEQ ID NO:06, or SEQ ID NO:08.
5. The system according to Claim 4, wherein said targeting vector further comprises a coding sequence.
6. The system according to Claim 5, wherein said coding sequence is present in an expression cassette.
7. The system according to claim 4, wherein the vector attachment site is a bacterial genomic recombination site (attB) or a phage genomic recombination site (attP).
8. The system according to claim 4, wherein the vector attachment site is a pseudo-bacterial genomic recombination site (pseudo-attB) or a pseudo-phage genomic recombination attP site (pseudo-attP).
9. The system according to claim 5, wherein said coding sequence is a therapeutic gene.
10. A targeting vector comprising a therapeutic protein coding sequence and a vector attachment site that serves as a substrate for an altered φC31 integrase comprising an amino acid sequence of SEQ ID NO:04, SEQ ID NO:06, or SEQ ID NO:08.
11. The vector according to Claim 10, wherein said therapeutic protein coding sequence is present in an expression cassette.
12. The vector according to Claim 10, wherein said vector is a viral vector.
13. The vector according to claim 10, wherein the vector attachment site is a bacterial genomic recombination site (attB) or a phage genomic recombination site (attP).
14. The vector according to claim 10, wherein the vector attachment site is a pseudo-bacterial genomic recombination site (pseudo-attB) or a pseudo-phage genomic recombination attP site (pseudo-attP).
15. A kit for use in integrating a nucleic acid into a genome of a target cell of a multicellular organism, said kit comprising:
(a) a targeting vector comprising a vector attachment site; and
(b) a nucleic acid encoding an altered φC31 integrase that recognizes said vector attachment site as a substrate, wherein said altered φC31 integrase comprises an amino acid sequence of SEQ ID NO:04, SEQ ID NO:06, or SEQ ID NO:08.
16. The kit according to claim 15, wherein the vector attachment site is a bacterial genomic recombination site (attB) or a phage genomic recombination site (attP).
17. The kit according to claim 15, wherein the vector attachment site is a pseudo-bacterial genomic recombination site (pseudo-attB) or a pseudo-phage genomic recombination attP site (pseudo-attP).
18. The kit according to claim 15, wherein said targeting vector further comprises a coding sequence.
19. The kit according to claim 18, wherein said coding sequence is a therapeutic gene.
20. A method of site-specifically integrating a nucleic acid into a genome of a cell of a multicellular non-human animal, said method comprising: introducing a targeting vector comprising said nucleic acid and a vector attachment site and an altered φC31 integrase into a cell and maintaining said cell under conditions sufficient for said vector to integrate into said genome of said cell by a recombination event mediated by said φC31 integrase, wherein said altered φC31 integrase comprises an amino acid sequence of SEQ ID NO:04, SEQ ID NO:06, or SEQ ID NO:08.
21. The method according to claim 20, wherein said vector attachment site is a bacterial genomic recombination site (attB) or a phage genomic recombination site (attP).
22. The method according to claim 20, wherein said vector attachment site is a pseudo-bacterial genomic recombination site (pseudo-attB) or a pseudo-phage genomic recombination attP site (pseudo-attP).
23. The method according to Claim 20, wherein said cell is present in vitro.
24. The method according to Claim 20, wherein said cell is present in vivo.
25. The method according to Claim 20, wherein said nucleic acid comprises a coding sequence.
26. The method according to Claim 25, wherein said coding sequence is present in an expression cassette.
27. The method according to claim 20, wherein said genome attachment site is a preselected site in said genome.
28. The method according to claim 20, wherein said nucleic acid comprises a control region.
29. The method according to claim 28, wherein said control region is a promoter.
30. The method according to claim 20, wherein said cell is a mammalian cell.
31. A method of site-specifically integrating a nucleic acid into a genome of a mammalian cell of a multicellular organism, said method comprising: introducing a targeting vector comprising said nucleic acid and a vector attachment site and an altered φC31 integrase into a cell present in vitro and maintaining said cell under conditions sufficient for said vector to integrate into said genome of said cell by a recombination event mediated by said altered φC31 integrase, wherein said altered φC31 integrase comprises an amino acid sequence of SEQ ID NO:04, SEQ ID NO:06, or SEQ ID NO:08.
32. The method according to claim 31, wherein said vector attachment site is a bacterial genomic recombination site (attB) or a phage genomic recombination site (attP).
33. The method according to claim 31, wherein said vector attachment site is a pseudo-bacterial genomic recombination site (pseudo-attB) or a pseudo-phage genomic recombination attP site (pseudo-attP).
34. The method according to Claim 31, wherein said nucleic acid comprises a coding sequence.
35. The method according to Claim 34, wherein said coding sequence is present in an expression cassette.
36. The method according to claim 31, wherein said genome attachment site is a preselected site in said genome.
37. The method according to claim 31, wherein said nucleic acid comprises a control region.
38. The method according to claim 37, wherein said control region is a promoter.
39. The method according to claim 31, wherein said cell is a human cell.
40. A method of site-specifically integrating a nucleic acid into a genome of a cell of a multicellular non-human animal, said method comprising: introducing a targeting vector comprising said nucleic acid and a vector attachment site and an altered φC31 integrase into a cell present in vivo and maintaining said cell under conditions sufficient for said vector to integrate into said genome of said cell by a recombination event mediated by said altered φC31 integrase, wherein said altered φC31 integrase comprises an amino acid sequence of SEQ ID NO:04, SEQ ID NO:06, or SEQ ID NO:08.
41. The method according to claim 40, wherein said vector attachment site is a bacterial genomic recombination site (attB) or a phage genomic recombination site (attP).
42. The method according to claim 40, wherein said vector attachment site is a pseudo-bacterial genomic recombination site (pseudo-attB) or a pseudo-phage genomic recombination attP site (pseudo-attP).
43. The method according to Claim 40, wherein said nucleic acid comprises a coding sequence.
44. The method according to Claim 43, wherein said coding sequence is present in an expression cassette.
45. The method according to claim 40, wherein said genome attachment site is a preselected site in said genome.
46. The method according to claim 40, wherein said nucleic acid comprises a control region.
47. The method according to claim 46, wherein said control region is a promoter.
PCT/US2009/056040 2008-09-08 2009-09-04 Altered phic31 integrases having improved efficiency and specificity and methods of using same WO2010028245A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US9525708P 2008-09-08 2008-09-08
US61/095,257 2008-09-08

Publications (1)

Publication Number Publication Date
WO2010028245A2 (en) 2010-03-11

Family

ID=41797879

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/056040 WO2010028245A2 (en) 2008-09-08 2009-09-04 Altered phic31 integrases having improved efficiency and specificity and methods of using same

Country Status (1)

Country Link
WO (1) WO2010028245A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021220020A1 (en) * 2020-05-01 2021-11-04 Mote Research Limited Modifying genomes with integrase

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09812282

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09812282

Country of ref document: EP

Kind code of ref document: A2