EP0871711A1 - Compositions and methods for site-directed integration into DNA

Compositions and methods for site-directed integration into DNA

Info

Publication number
EP0871711A1
EP0871711A1 EP96944223A EP96944223A EP0871711A1 EP 0871711 A1 EP0871711 A1 EP 0871711A1 EP 96944223 A EP96944223 A EP 96944223A EP 96944223 A EP96944223 A EP 96944223A EP 0871711 A1 EP0871711 A1 EP 0871711A1
Authority
EP
European Patent Office
Prior art keywords
dna
fusion protein
protein
integrase
lexa
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP96944223A
Other languages
German (de)
English (en)
Inventor
Samson A. Chow
Hélène GOULAOUIC
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chow Samson A
University of California
Original Assignee
University of California
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of California
Publication of EP0871711A1
Legal status: Withdrawn

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide

Definitions

  • the present invention relates generally to molecular biological techniques for manipulating nucleic acid molecules.
  • the present invention provides a fusion protein comprising an N-terminal integrase catalytic domain and a C-terminal nucleic acid binding domain having binding specificity for a target nucleic acid.
  • the fusion protein is useful for site-specific integration of a donor nucleic acid into a target nucleic acid at or near the site of binding of the nucleic acid binding protein.
  • Nucleic acids encoding the fusion protein, expression vectors, hosts, and methods of integrating a donor nucleic acid into a target nucleic acid are provided.
  • Retroviral RNA is copied by the enzyme reverse transcriptase into a double-stranded linear viral DNA which is integrated into the host genome as a provirus. Integration of retroviral DNA into the host cell genome is an essential step during the life cycle of retroviruses (Varmus and Brown, 1989). Three factors are required for the integration process: the viral protein integrase, sequences at each end of the linear viral DNA, and a divalent metal ion cofactor.
  • the human immunodeficiency virus type 1 integrase is encoded as a 32-kDa protein at the C-terminus of the Gag-Pol polyprotein which is processed into its individual components by the viral protease during budding. Integrase can be considered as having three domains, an N-terminal zinc finger domain, a central catalytic domain, and a C-terminal DNA binding domain.
  • the viral DNA precursor for the integration reaction is a linear double-stranded molecule. Two bases from each 3' end of the linear viral DNA are removed by integrase such that the viral 3' ends are recessed by two bases from the 5' ends and terminate with the dinucleotide CA. A staggered cut is then made in the target DNA and the resulting overhanging 5'-P ends are covalently joined to the recessed 3'-OH ends of the viral DNA.
  • This cleavage-ligation reaction produces a gapped intermediate; integration is completed by a gap repair process that remains to be characterized.
  • integrase can carry out an in vitro reversal of the integration reaction, named disintegration, in which a branched DNA structure resembling an integration product is converted into two molecules resembling the initial viral and target DNAs.
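As an illustration only, the 3' processing step described above (removal of two bases so that the strand terminates in CA) can be sketched in Python. The function name `process_3prime_end` and the example sequence are hypothetical and are not taken from the specification; the sketch models the sequence bookkeeping, not the enzymology.

```python
def process_3prime_end(strand: str) -> str:
    """Toy model of 3' processing on one strand written 5'->3'.

    The unprocessed strand is expected to carry the CA dinucleotide at
    positions 3 and 4 counted from its 3' end. The two bases that follow
    CA are removed, so the processed strand terminates with CA.
    """
    strand = strand.upper()
    if len(strand) < 4 or strand[-4:-2] != "CA":
        raise ValueError("expected CA at positions 3 and 4 from the 3' end")
    return strand[:-2]


# Hypothetical example sequence, for illustration only.
example = "ACTGCTAGAGATTTTCCACAGT"
processed = process_3prime_end(example)
print(processed)
```

The processed strand is two bases shorter than the input and ends in CA, which mirrors the recessed 3'-OH end that is later joined to the target DNA.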
  • nucleosomal DNA in the chromatin is preferred to nucleosome-free DNA, and integration tends to cluster in the exposed face of the major groove within the nucleosome core (Pruss et al., 1994; Pryciak and Varmus, 1992).
  • the basis for preferred integration in nucleosomes may be related to DNA distortion, as DNA bending itself creates favored sites for integration (Muller and Varmus, 1994;
  • Another factor in target site selection is sequence- or structure-specific DNA binding proteins.
  • Certain DNA-binding proteins, such as the yeast transcriptional repressor α2 and the lac repressor of E. coli, can prevent integration, presumably by steric hindrance (Muller and Varmus, 1994; Pryciak and Varmus, 1992).
  • Histones and certain other DNA-binding proteins stimulate integration by inducing DNA bends.
  • DNA-binding proteins may promote integration by interacting with the integration machinery.
  • the significance of such an interaction is illustrated by the position-specific integration of the yeast retrovirus-like element Ty3 (Sandmeyer et al., 1990).
  • Integrase itself is a major factor in determining target site specificity.
  • C-terminus does not show any sequence specificity, which led to its proposed role as the domain for binding target DNA, and this binding may partly explain the ability of integrase to insert viral DNA at sites with weak consensus sequences.
  • Directed integration has been reported by tethering integrase to a target DNA site, accomplished by use of a hybrid protein composed of the DNA-binding domain of λ repressor at the N-terminus and a full-length HIV-1 integrase at the C-terminus of the hybrid protein (Bushman, 1994).
  • The hybrid protein mediates integration preferentially to target DNA containing λ operators.
  • The integration sites are near the λ operator on the same face of the DNA helix, indicating that the hybrid protein binds to the operator and captures targets probably by looping out the intervening DNA (Bushman, 1994).
  • Genes have been transferred by incubating cells with DNA, possibly in the presence of chemicals such as polyions or calcium phosphate. Genetic material can also be injected into the nucleus or cytoplasm of cells or zygotes. Other methods include electroporation, liposome mediated gene insertion, asialoglycoprotein gene insertion, particle acceleration and viral transduction. The use of viruses in the transduction method has been shown to be very efficient when retroviruses are used.
  • Foreign genes are inserted into either a replication defective or replication competent viral vector construct (usually as a plasmid), and are transferred into cells containing all the genes necessary for packaging and replication of the virus (helper or viral packaging cells).
  • the vectors themselves do not harbor the necessary genes for replication so that when the vectors infect cells, the vectors replicate using the enzymes in the viral particle to insert themselves into the host genome (chromosomes).
  • the vectors should be unable to replicate further because the essential viral genes were left behind in the "helper" cell.
  • Retroviruses are now widely used as vectors for genetic engineering in higher eukaryotes and are considered to be promising vectors for gene therapy, owing to their natural aptitude for introducing foreign genes into cellular chromosomes (Mulligan, 1993).
  • several features of current retroviral vectors limit their usefulness in gene therapy, including the limited size of their genome, their inability to infect nondividing cells, and their inability to target integration to a specific site (Mulligan, 1993; Shiramizu et al, 1994; Temin, 1990).
  • the major shortcoming of retroviral vectors is their inability to target the DNA integration to a specific site. With random integration, there is a risk of activating a proto-oncogene or inactivating a tumor suppressor gene in the target DNA.
  • the present invention seeks to overcome these and other drawbacks inherent in the prior art by providing a fusion peptide having an N-terminal retroviral integrase catalytic domain covalently bonded to a C-terminal DNA binding moiety. Integration into a specific site is facilitated by the fusion protein since the DNA binding moiety provides the binding specificity for a particular site on a target DNA molecule and the integrase catalytic domain provides the catalytic machinery for accomplishing the integration.
  • An aspect of the invention is a fusion protein comprising a retroviral integrase catalytic domain COOH-terminally coupled to a DNA binding protein domain having binding specificity for a target nucleotide sequence, the fusion protein capable of integrating a donor DNA molecule into a target DNA molecule at or near the target nucleotide sequence.
  • Integrase catalytic domain is meant to include the sequence of amino acids from the catalytic domain of a retroviral integrase capable of carrying out disintegration, an in vitro reversal of the normal DNA strand transfer reaction.
  • the catalytic domain includes amino acids from about position 50 to about position 212, or about position 234, of the HIV-1 integrase (Cannon et al., 1994).
  • the catalytic domain is relatively conserved among retroviral integrases, and this region may be considered as applying to other retroviral integrases as well as HIV-1 integrase (Engelman and Craigie, 1992).
  • Disintegration is the reverse reaction of integration. In this reaction, a branched oligonucleotide substrate, or Y-mer, is resolved into its constituent donor and target double-stranded DNA components (see FIGS. 1-3 and brief description thereof).
  • the disintegration substrate has the advantage that the site of integration into target DNA is predetermined and can be manipulated. The disintegration substrate is therefore particularly well suited for studies that benefit from a defined site of integration, such as investigations of protein-target DNA interactions during retroviral DNA integration.
  • the nucleotide sequence and structural requirements for disintegration are less stringent than those for 3' processing and strand transfer (Chow et al, 1992). This characteristic allows genetic variants of integrase that lack detectable activity in 3' processing and strand transfer to retain disintegration activity (Bushman et al, 1993; Engelman and Craigie, 1992; Leavitt et al, 1993; van Gent et al, 1992; Vincent et al, 1993; Vink et al, 1993). Thus, the disintegration assay has played an important role in locating the catalytic domain of integrase and is useful in mapping other functional domains of the protein (Chow and Brown, 1994).
  • a retroviral integrase may be from human immunodeficiency virus type 1 or type 2, simian immunodeficiency virus, equine infectious anemia virus, feline immunodeficiency virus, caprine arthritis-encephalitis virus, bovine immunodeficiency virus, Mason-Pfizer monkey virus, mouse mammary tumor virus, intracisternal A particle, Rous sarcoma virus, bovine leukemia virus, human T-cell leukemia virus type
  • a retroviral integrase may also be from avian myeloblastosis virus
  • Retrotransposons, some eukaryotic and prokaryotic transposons, and the integrase of murine leukemia virus also share mechanistic features of HIV integration.
  • the retroviral integrase catalytic domain is from the integrase of human immunodeficiency virus type 1 or type 2, or of feline immunodeficiency virus.
  • a "DNA binding protein domain" or moiety is a functional amino acid sequence that has binding affinity and specificity for a particular nucleotide sequence in DNA.
  • a DNA binding protein domain may include binding domains from: Cro repressor from phage lambda, cI repressor from phage lambda, Cro from phage 434, cI repressor from phage 434, P22 repressor, E. coli tryptophan repressor, E. coli CAP, P22 Arc, P22 Mnt, E. coli lactose repressor, tetracycline repressor from E. coli, MAT-a1-alpha2 from yeast, GAL4 from yeast, Polyoma Large T antigen, SV40 Large T antigen, adenovirus
  • TFIIIA from Xenopus laevis, or zinc finger DNA binding proteins.
  • An example of a DNA binding protein domain having binding specificity for a target nucleotide sequence is the LexA binding protein domain.
  • a preferred target nucleotide sequence is the LexA consensus sequence, CTGTNNNNNNACAG, (SEQ ID NO:20) and a more preferred target nucleotide sequence is the LexA sequence,
  • the N-terminal integrase catalytic domain is covalently bonded at its carboxy terminus to a DNA binding protein domain, so that the DNA binding protein domain is at the carboxy terminus of the resultant fusion protein.
  • the covalent bonding may be accomplished chemically by fusing the C-terminal carboxyl group of the integrase domain to the N-terminal amino group of the DNA binding moiety to form a peptide bond, but the fusion protein is more easily made by genetic engineering means, for example, by ligating nucleotide sequences together that encode the different moieties.
  • the fusion proteins of the present invention are useful for their capability of integrating a donor DNA molecule into a target DNA molecule at or near a target nucleotide sequence. This utility is very broad and includes the integration of genes encoding therapeutic products, or the integration of a piece of DNA for purposes of disrupting a particular function, for example, disrupting oncogene function.
  • a preferred fusion protein has an amino acid sequence essentially as set forth in SEQ ID NO:23, or SEQ ID NO:25, SEQ ID NO:29, or SEQ ID NO:31, a combination thereof, or a biologically functional fragment thereof.
  • Capable of integrating a donor DNA molecule into a target DNA molecule at or near the target nucleotide sequence means that the donor DNA molecule may be integrated within a distance of about 30-50 base pairs or so from the target nucleotide sequence.
  • the DNA binding domain when bound to the nucleotide sequence for which it has affinity, will occupy about 30 nucleotides and therefore, the actual binding site is unavailable for integration. Integration will preferably occur within about 30-50 base pairs of the DNA binding site, a distance affected in part by topology and flexibility of the fusion protein and the target DNA molecule.
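The geometry described above can be turned into a rough coordinate calculation. The sketch below is an assumption-laden illustration: it treats the roughly 30 nucleotides occupied by the bound domain as centred on the recognition site and takes 50 bp as the outer reach on each side. The function name `candidate_windows`, its parameters, and the example coordinates are hypothetical.

```python
def candidate_windows(site_start: int, site_end: int, target_len: int,
                      footprint: int = 30, reach: int = 50):
    """Return two half-open intervals (left, right) flanking a bound site.

    Assumptions (not stated in the source): the protein footprint is
    centred on the recognition site, and integration may occur up to
    `reach` base pairs beyond either edge of the footprint.
    Coordinates are 0-based positions on a linear target of target_len bp.
    """
    center = (site_start + site_end) // 2
    # Region assumed to be shielded by the bound DNA binding domain
    shield_start = max(0, center - footprint // 2)
    shield_end = min(target_len, center + footprint // 2)
    left = (max(0, shield_start - reach), shield_start)
    right = (shield_end, min(target_len, shield_end + reach))
    return left, right


# Hypothetical example: a 16-bp site at positions 100-116 of a 543-bp fragment.
left, right = candidate_windows(100, 116, 543)
print(left, right)
```

Because the distance is said to depend on the topology and flexibility of both the fusion protein and the target DNA, `footprint` and `reach` are best read as tunable guesses rather than fixed constants.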
  • the conditions for integration include temperatures for enzymatic activity to occur, preferably at room or body temperature, keeping in mind that the reaction will occur more slowly at lower temperatures.
  • a divalent metal cation is important for catalysis, preferably the cation is Mn(II) or Mg(II).
  • a fusion protein having an N-terminal integrase catalytic domain and a nucleic acid binding domain at the C-terminus has several advantages over a construction where the nucleic acid binding domain is at the N-terminus of the fusion protein. For example, when the DNA encoding the fusion protein is introduced into the viral genome, placement of the DNA-binding protein at the N-terminus of integrase may affect the ability of viral protease to process the precursor polypeptide, leading to defective viruses and nonfunctional proteins. It is therefore, an advantage to place the
  • the invention provides major improvements as a result of site-specific integration: i) safety: insertion of exogenous DNA will be directed towards innocuous regions of chromosomes, and away from essential genes, cancer-causing genes, or tumor suppressor genes; and ii) improved expression: insertion of exogenous DNA will be directed towards regions that are known for efficient and stable expression of genes.
  • Donor DNA is a linear double-stranded oligonucleotide with end sequences of about 15-35 nucleotides derived from the U5 or U3 ends of the retroviral long terminal repeat (LTR) (Varmus and Brown, 1989).
  • LTR contains regulatory sequences, such as promoter and enhancer sequences for gene expression, transcription initiation, and polyadenylation. Since the LTR sequence varies among different retroviruses, the exact sequence of the ends of the donor DNA will depend on the particular integrase used in the fusion construct. For instance, if the fusion protein comprises HIV-1 integrase and LexA protein, the sequences of the ends of the donor DNA will be constructed so as to mimic either the U5 or U3 end of the HIV-1 LTR. Although there is no consensus DNA sequence for the retroviral LTR, one invariant feature is a CA dinucleotide at positions 3 and 4 from the 3' end of the processed DNA strand.
  • the donor DNA can be blunt-ended with the CA dinucleotide located 2 nucleotides from the 3' end of the processed strand.
  • the donor DNA can also have a
  • the donor DNA may be a DNA molecule up to 10 kbp in length.
  • the donor DNA may contain the entire LTR (350-700 bp) at both ends of the donor DNA.
  • the sequence of the LTR corresponds to that of the retrovirus from which the integrase component of the fusion protein is obtained.
  • the donor DNA contains a psi sequence which is important for RNA packaging, and may contain a gene for therapeutic purposes (e.g. cystic fibrosis gene), or a reporter gene for selection (e.g. neomycin resistance gene) or for gene disruption, or a toxic gene for cell killing (e.g. ricin gene).
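The two sequence-level requirements stated for a blunt-ended donor (a CA dinucleotide at positions 3 and 4 from the 3' end of each processed strand, and a length of up to 10 kbp) lend themselves to a simple check. The following Python sketch is illustrative only; `check_blunt_donor`, `reverse_complement`, and the example sequences are hypothetical names and data, not part of the specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def check_blunt_donor(top: str, max_len: int = 10_000) -> bool:
    """Check a blunt-ended donor given its top strand (5'->3').

    Both strands must carry CA at positions 3 and 4 counted from their
    own 3' ends, and the donor must not exceed max_len base pairs.
    """
    top = top.upper()
    bottom = reverse_complement(top)
    ends_ok = all(strand[-4:-2] == "CA" for strand in (top, bottom))
    return ends_ok and len(top) <= max_len


# Hypothetical donors, for illustration only.
good_donor = "ACTG" + "GATTACA" * 3 + "CAGT"
bad_donor = "GGGG" + "GATTACA" * 3 + "CAGT"
print(check_blunt_donor(good_donor), check_blunt_donor(bad_donor))
```

The bottom strand is derived by reverse complementation, so a single top-strand string is enough to test both ends of the duplex.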
  • Target DNA is DNA that has a site recognizable by a DNA binding protein domain.
  • a DNA molecule can be made into a target DNA by incorporation of nucleotides, the sequence of which is recognizable by a DNA binding protein domain. Incorporation of a sequence of nucleotides is most easily accomplished by restriction enzyme digestion of a DNA, and ligation to a double stranded oligonucleotide having the particular sequence of nucleotides and having end linkers corresponding to the restriction enzyme used. Therefore, the target DNA is very broad, and includes any sequence where one would desire to incorporate a donor DNA molecule.
  • the invention relates to a purified nucleic acid molecule consisting essentially of a nucleotide sequence encoding an integrase-DNA binding protein domain fusion protein, the protein having an amino acid sequence essentially as set forth in SEQ ID NOS:23, 25, 29 or 31.
  • "Purified" nucleic acid molecule having a nucleotide sequence encoding an integrase-DNA binding protein domain fusion protein means a fusion protein encoding nucleic acid molecule substantially free of nucleic acid molecules not encoding a fusion protein essentially as set forth in SEQ ID NOS:23, 25, 29 or 31.
  • the purified nucleic acid molecule is a DNA molecule wherein the nucleotide sequence is essentially as set forth in SEQ ID NOS:22, 24, 28, or 30.
  • amino acid sequence essentially as set forth in SEQ ID NOS:23, 25, 29 or 31 means that the sequence substantially corresponds to a portion of SEQ ID NOS:23, 25, 29 or 31, and has relatively few amino acids which are not identical to, or a biologically functional equivalent of, the amino acids of SEQ ID NOS:23, 25, 29 or 31.
  • biologically functional equivalent is well understood in the art and is further defined as a protein having a sequence essentially as set forth in SEQ ID NOS:23, 25, 29 or 31, capable of integrating a donor DNA molecule into a target DNA molecule at or near a site specific to the DNA binding protein domain portion of the fusion protein.
  • sequences which have between about 70% and about 80%; or more preferably, between about 81% and about 90%; or even more preferably, between about 91% and about 99%; of amino acids that are identical or functionally equivalent to the amino acids of SEQ ID NOS:23, 25, 29 or 31 will be sequences which are "essentially as set forth in SEQ ID NOS:23, 25, 29 or 31".
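The percentage bands above can be made concrete with a small calculation. This Python sketch is a simplification: it counts identical residues only (the text also admits functionally equivalent residues, which are not modelled here) and it assumes the two sequences are already aligned and of equal length. The function names and the example peptides are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions with identical residues in two aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def identity_tier(pct: float) -> str:
    """Map a percentage onto the bands named in the text (approximate cut-offs)."""
    if pct >= 91:
        return "even more preferred"
    if pct >= 81:
        return "more preferred"
    if pct >= 70:
        return "essentially as set forth"
    return "below the stated range"


# Hypothetical 10-residue peptides differing at one position.
reference = "MKALTARQQE"
candidate = "MKALTARQQD"
pct = percent_identity(reference, candidate)
print(pct, identity_tier(pct))
```

A real comparison would first align the sequences and would need a substitution scheme to decide which non-identical residues count as functionally equivalent.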
  • a further embodiment of the present invention is where the nucleic acid molecule has a nucleotide sequence as set forth in SEQ ID NOS:22, 24, 28, 30, a combination or a biologically functional fragment thereof.
  • the nucleic acid molecule is further defined as including a detectable label.
  • An embodiment of the present invention is a purified nucleic acid molecule that encodes an integrase-DNA binding moiety fusion protein.
  • the fusion protein includes at a minimum an integrase catalytic domain covalently bonded to a DNA binding moiety and may have an amino acid sequence in accordance with SEQ ID NOS:23, 25, 29, 31, a combination or a biologically functional fragment thereof.
  • nucleic acid molecule may refer to a DNA or RNA molecule which has been isolated free of total genomic DNA, or free of total RNA, of a particular species.
  • a "purified" nucleic acid molecule refers to a nucleic acid molecule that contains an integrase catalytic domain-DNA binding moiety coding sequence, yet is isolated away from, or purified free from, total genomic DNA or total RNA, for example, total human genomic DNA.
  • DNA molecule includes DNA segments and smaller fragments of such segments, and also recombinant vectors, including, for example, plasmids, cosmids, phage, viruses, and the like.
  • biologically functional as used in the description of the present invention is defined as capable of providing the site-directed integration of a nucleic acid into DNA as described in the present disclosure.
  • Another embodiment of the present invention is a purified nucleic acid molecule, further defined as including a nucleotide sequence in accordance with SEQ ID NOS:22, 24, 28 or 30.
  • the purified nucleic acid segment consists essentially of the nucleotide sequence of SEQ ID NOS:22, 24, 28, 30, or a combination thereof.
  • Such nucleotide sequences are more particularly defined as being substantially free of nucleic acids not encoding the corresponding fusion protein.
  • a DNA molecule comprising an isolated or purified integrase-DNA binding moiety fusion protein gene refers to a DNA molecule including fusion protein coding sequences isolated substantially away from other naturally occurring genes or protein encoding sequences.
  • the term “gene” is used for simplicity to refer to a functional protein, polypeptide or peptide encoding unit.
  • this functional term includes genomic sequences, cDNA sequences or combinations thereof.
  • isolated substantially away from other coding sequences means that the gene of interest, in this case the fusion protein encoding gene, forms the significant part of the coding region of the DNA molecule, and that the DNA molecule does not contain large portions of naturally-occurring coding DNA, such as large chromosomal fragments or other functional genes or cDNA coding regions. Of course, this refers to the DNA molecule as originally isolated, and does not exclude genes or coding regions later added to the segment by the hand of man.
  • Another embodiment of the present invention is a purified nucleic acid molecule that encodes a protein in accordance with SEQ ID NOS:23, 25, 29, or 31, or a combination thereof, further defined as a recombinant vector.
  • the term "recombinant vector" refers to a vector that has been modified to contain a nucleic acid segment that encodes a fusion protein of the present invention, or fragment of interest thereof.
  • the recombinant vector may be further defined as an expression vector comprising a promoter operatively linked to said fusion protein encoding nucleic acid molecule.
  • the recombinant vector comprises a nucleic acid sequence in accordance with SEQ ID NOS:22, 24, 28, 30, a combination or a biologically functional fragment thereof.
  • vectors may be further defined as a pT7-7, pET, pBluescript, pCMV, pUC and derivatives thereof, pBS24Ub, pYes2, pAC360, SV40, adenoviral, retroviral, yeast plasmids, Baculovirus or Vaccinia virus vector.
  • the expression vector is pT7-7, pET, pBS24Ub, pYes2, or pAC360.
  • a further embodiment of the present invention is a host cell, made recombinant with a recombinant vector comprising an integrase-DNA binding moiety encoding gene.
  • the recombinant host cell may be a prokaryotic or a eukaryotic cell, or a helper cell.
  • the recombinant host cell is a eukaryotic cell.
  • engineered or "recombinant" cell is intended to refer to a cell into which a recombinant gene, such as a gene encoding an integrase-DNA binding moiety, has been introduced.
  • engineered cells are distinguishable from naturally occurring cells which do not contain a recombinantly introduced gene.
  • engineered cells are cells having a gene or genes introduced through the hand of man.
  • Recombinantly introduced genes will either be in the form of a cDNA gene (i.e., they will not contain introns), a copy of a genomic gene, or will include genes positioned adjacent to a promoter not naturally associated with the particular introduced gene, or combinations thereof.
  • Preferred host cells may be further defined as any cell derived from a human, such as a stem cell, hepatocyte, fibroblast, or muscle cell; established cell lines such as CEM, MT-2, MT-4, T293, Jurkat, H9, HeLa, a COS cell, Saccharomyces cerevisiae, or Escherichia coli cell.
  • a further aspect of the present invention is a method of integrating a donor DNA molecule at or near a specific site or region thereof on a target DNA molecule.
  • the method comprises the steps of i) selecting a DNA binding protein domain having binding affinity for the specific site or region thereof on the target DNA molecule, ii) constructing a fusion protein having an N-terminal retroviral integrase catalytic domain and the DNA binding protein domain at a C-terminus, and iii) contacting the donor DNA molecule, the target DNA molecule and the fusion protein, wherein the fusion protein facilitates integration of the donor DNA molecule at or near the specific site or region thereof of the target DNA molecule.
  • the donor DNA molecule comprises a gene encoding an integrase-DNA binding moiety fusion protein
  • the donor DNA molecule may comprise HIV-1 viral DNA having an integrase gene replaced with a gene encoding an integrase-DNA binding moiety fusion protein.
  • the contacting step may further comprise the steps of i) incubating the fusion protein with the target DNA molecule to form an incubate, and ii) contacting the incubate with the donor DNA molecule.
  • the target DNA is DNA containing a defective gene, or DNA containing an oncogene or other disease causing gene, or DNA having no genes but is suitable as an acceptor site for exogenous DNA.
  • a preferred DNA binding domain has binding affinity for nucleotide sequences found in regions of DNA as mentioned above for preferred target DNA.
  • the retroviral integrase catalytic domain may be integrase from human immunodeficiency virus type 1 or type 2, or feline immunodeficiency virus.
  • the DNA binding domain protein may be the LexA binding protein, and the specific site on the target nucleic acid may be the LexA binding sequence.
  • the LexA nucleotide sequence may be CTGTATGAGCATACAG (SEQ ID NO:21).
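Locating a LexA binding sequence in a candidate target can be sketched with a regular expression built from a consensus string. The Python below is a minimal illustration: `find_sites`, `lexa_consensus`, and the flanking sequence are hypothetical, and the spacer length is left as a parameter because the consensus printed for SEQ ID NO:20 shows six N positions, whereas the 16-nucleotide site of SEQ ID NO:21 has eight bases between CTGT and ACAG.

```python
import re

# Only the symbols needed here; N stands for any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def lexa_consensus(spacer: int) -> str:
    """Build a CTGT-N(spacer)-ACAG consensus string."""
    return "CTGT" + "N" * spacer + "ACAG"


def find_sites(target: str, consensus: str) -> list[int]:
    """Return 0-based start positions of every match to the consensus."""
    pattern = "".join(IUPAC[base] for base in consensus.upper())
    # A lookahead is used so that overlapping matches are also reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", target.upper())]


LEXA_SITE = "CTGTATGAGCATACAG"  # SEQ ID NO:21 as quoted in the text
# Hypothetical flanking sequence, for illustration only.
target = "GGATCC" + LEXA_SITE + "AAGCTT"

print(find_sites(target, LEXA_SITE))
print(find_sites(target, lexa_consensus(8)))
print(find_sites(target, lexa_consensus(6)))
```

With this example target, the exact site and the eight-base-spacer consensus both locate the site, while the six-base-spacer form does not match it.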
  • a further embodiment of the present invention is a method of inactivating an oncogene by integrating a donor DNA molecule at or near the oncogene, or regulatory regions thereof.
  • the method comprises i) selecting a DNA binding protein domain having binding affinity for the oncogene or regulatory regions thereof, ii) constructing a fusion protein having an N-terminal retroviral integrase catalytic domain and the DNA binding protein domain at a C-terminus, and iii) contacting a donor DNA molecule, the oncogene or regulatory regions thereof, and the fusion protein, wherein the fusion protein facilitates integration of the donor DNA molecule at or near the oncogene or regulatory regions thereof, thereby inactivating the oncogene.
  • a further aspect of the present invention is a fusion protein comprising a catalytic domain of retroviral integrase and an N-terminal zinc finger domain having binding specificity for a DNA molecule.
  • the zinc finger domain is other than a zinc finger domain naturally occurring with the catalytic domain in a retroviral integrase molecule.
  • a fusion protein comprising an integrase catalytic domain fused to a protein domain having affinity for a transcription factor is also an embodiment of the present invention.
  • the transcription factor may be RNA polymerase III or TFIIIC.
  • the protein domain having affinity for a transcription factor may be transcription factor IIIB-related factor (BRF).
  • a protein-oligonucleotide construct comprising an integrase catalytic domain covalently bonded to an oligonucleotide is also an aspect of the present invention.
  • LABD - LexA DNA binding protein domain from about amino acids 1-87 of LexA
  • WT - wild-type
  • FIG. 1 Formation of recombination intermediate.
  • the initially blunt-ended linear viral DNA is cleaved by integrase, resulting in 3' ends recessed by 2 bases.
  • the target DNA is cleaved with a 5-bp stagger, and the resulting 5'-P ends are joined to the 3'-OH ends of the viral DNA.
  • the DNA joining reaction that gives rise to this recombination intermediate is referred to as integration (signified by a solid arrow), and the reverse reaction that resolves the intermediate into its viral and target components is referred to as disintegration (signified by a broken arrow).
  • Arrowheads indicate site of cleavage or strand exchange.
  • the 3'-OH ends of DNA strands are denoted by half-arrows.
  • the Y-oligomer substrate, which resembles the initial recombination intermediate shown in FIG. 1, was formed by annealing the following four oligonucleotides: T1, 16-mer; T3, 30-mer; V2, 21-mer; and the hybrid strand, V1.T2, 33-mer (SEQ ID NOS:12-15, respectively).
  • FIG. 3 Strand breakage and joining mediated by fusion proteins of the present invention. Schematic illustration of the expected products after disintegration of the Y-oligomer. Thick lines represent viral DNA sequences, and thin lines represent target DNA sequences. Closed circles denote the 32P-labeled 5' ends. The length in nucleotides of each strand is indicated.
  • FIG. 4 Primary structures of HIV-1 integrase-E. coli LexA fusion proteins. Open and stippled boxes represent peptides derived from HIV-1 integrase and LexA proteins, respectively. Filled boxes represent the seven consecutive histidine residues (7xHis) used for protein purification. The left and right ends of the boxes denote the amino- and carboxy-terminus of the fusion proteins, respectively. The numbers in the boxes correspond to the amino acid residues from the native protein included in each fusion protein. Full-length HIV-1 integrase and LexA have 288 and 202 amino acids, respectively. LexA, full-length LexA protein; LexA BD, DNA-binding domain (amino acid residues 1-87) of LexA.
  • FIG. 5 DNA substrate for assaying distribution of integration sites.
  • the LexA-binding sequence (underlined) was cloned into the Kpn I site of a plasmid derived from pBluescript KSII+.
  • the resulting plasmid pBS-LA was digested with Mbo II to produce 6 fragments of different sizes (978, 639, 543, 409, 228, and 187 bp).
  • LexA-binding site is present in the 543-bp fragment.
  • the arrows represent the primers used in PCR amplification of the integration products occurring in the plus or minus strand of the plasmid DNA.
  • Primer BS+ is complementary to the plus strand of pBS-LA, whereas primer BS- is complementary to the minus strand.
  • the numbers in parentheses denote the map positions of the sites for primer annealing and restriction enzyme cleavage. M, Mbo II. FIG. 6.
  • a peptide linker indicated by arrows ( 1 ) is the result of cloning techniques.
  • FIG. 7 Nucleotide sequence (SEQ ID NO:24) and amino acid sequence (SEQ ID NO:25) of IN1-288/LexA, the full-length HIV integrase (amino acids 1-288 of integrase) fused to the full-length LexA repressor (amino acids 2-202 of LexA repressor).
  • a peptide linker indicated by arrows ( ! ) is the result of cloning techniques.
  • FIG. 8 Full-length nucleotide sequence (SEQ ID NO:28), and full-length amino acid sequence (SEQ ID NO:29), of F-IN1-281/LexA (full-length FIV integrase fused to full-length LexA repressor).
  • FIG. 9 Nucleotide sequence (SEQ ID NO:30) and amino acid sequence (SEQ ID NO:31).
  • the present invention demonstrates that selection of sites in a target DNA molecule can be manipulated by fusing retroviral integrase with a sequence-specific DNA binding protein.
  • a hybrid protein was constructed that has the E. coli LexA protein fused to the C-terminus of the HIV-1 integrase. The fusion protein, IN1-288/LA, retained the catalytic activities in vitro of the wild-type HIV-1 integrase (WT IN).
  • IN1-288/LA preferentially integrated viral DNA into the fragment containing a DNA sequence specifically bound by LexA protein. No bias was observed when the LexA-binding sequence was absent, when the fusion protein was replaced by
  • the invention concerns isolated DNA molecules and recombinant vectors which encode a fusion protein or peptide that includes within its amino acid sequence an amino acid sequence essentially as set forth in SEQ ID NO:23, 25, 29, 31, a combination thereof or a biologically functional fragment thereof.
  • DNA segment or vector encodes a full length integrase-LexA binding protein, or is intended for use in expressing the integrase-LexA binding protein
  • the most preferred sequences are those which are essentially as set forth in SEQ ID NO:25.
  • the invention concerns isolated DNA segments and recombinant vectors that include within their sequence a nucleic acid sequence essentially as set forth in SEQ ID NO:22, 24, 28, 30, a combination thereof, or a biologically functional fragment thereof.
  • the term "essentially as set forth in SEQ ID NO:22, 24, 28 or 30", is used in the same sense as described above and means that the nucleic acid sequence substantially corresponds to a portion of SEQ ID NO:22, 24, 28 or 30, and has relatively few codons which are not identical, or functionally equivalent, to the codons of SEQ ID NO:22, 24, 28 or 30.
  • codons that encode the same amino acid such as the six codons for arginine or serine, as set forth in Table 1 , and also refers to codons that encode biologically equivalent amino acids.
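  • The codon redundancy mentioned above can be illustrated with a short sketch. It builds the standard genetic code from the conventional TCAG ordering and groups codons by the amino acid they encode. The table below is an assumption of this example (Table 1 of the disclosure is not reproduced in this text), and the function name is illustrative only.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
# Assumption: this stands in for Table 1, which is not reproduced here.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def synonymous_codons(table=CODON_TABLE):
    """Group codons by encoded amino acid (one-letter code, '*' = stop)."""
    groups = defaultdict(list)
    for codon, aa in table.items():
        groups[aa].append(codon)
    return dict(groups)


groups = synonymous_codons()
print("Arg:", sorted(groups["R"]))
print("Ser:", sorted(groups["S"]))
```

  • Any codon within one group can replace another without changing the encoded amino acid, which is the sense of "functionally equivalent" codons used in the text.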
  • amino acid and nucleic acid sequences may include additional residues, such as additional N- or C-terminal amino acids or 5' or 3' sequences, and yet still be essentially as set forth in one of the sequences disclosed herein, so long as the sequence meets the criteria set forth above, including the maintenance of biological protein activity where protein expression is concerned.
  • the addition of terminal sequences particularly applies to nucleic acid sequences which may, for example, include various non-coding sequences flanking either of the 5' or 3' portions of the coding region or may include various internal sequences, i.e., amino acids that form the junction between the integrase catalytic domain and the DNA binding protein domain of the fusion protein.
  • nucleic acid segments of the present invention may be combined with other DNA sequences, such as promoters, polyadenylation signals, additional restriction enzyme sites, multiple cloning sites, other coding segments, and the like, such that their overall length may vary considerably.
  • sequences which have between about 70% and about 80%; or more preferably, between about 80% and about 90%; or even more preferably, between about 90% and about 99%; of nucleotides which are identical to the nucleotides of SEQ ID NO:
  • sequences which are "essentially as set forth in SEQ ID NO:22, 24, 28 or 30" will be sequences which are "essentially as set forth in SEQ ID NO:22, 24, 28 or 30". Sequences which are essentially the same as those set forth in SEQ ID NO:22, 24, 28 or 30 may also be functionally defined as sequences which are capable of hybridizing to a nucleic acid segment containing the complement of SEQ ID NO:22, 24, 28 or 30 under relatively stringent conditions. Suitable relatively stringent hybridization conditions will be well known to those of skill in the art and are clearly set forth herein, for example conditions for use with PCR, and as described in the examples.
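  • As a rough illustration of the identity ranges recited above, the following sketch computes percent identity between two pre-aligned, equal-length nucleotide strings and maps it onto the stated ranges. The ungapped comparison, the treatment of the "about" boundaries as hard cut-offs, and the function names are simplifying assumptions of this example, not a method set out in the disclosure.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that are identical in two pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def identity_tier(pct: float):
    """Map percent identity onto the ranges recited in the text.

    Assumption: values above 99% are folded into the top range for convenience.
    """
    if pct >= 90:
        return "even more preferred"
    if pct >= 80:
        return "more preferred"
    if pct >= 70:
        return "preferred"
    return None


# Hypothetical 10-nt sequences differing at one position.
pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(pct, identity_tier(pct))
```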
  • the present invention includes a purified nucleic acid molecule complementary, or essentially complementary, to the nucleic acid molecule having the sequence set forth in SEQ ID NO:22, 24, 28 or 30.
  • Nucleic acid sequences which are "complementary" are those which are capable of base-pairing according to the standard Watson-Crick complementarity rules.
  • the term "complementary sequences" means nucleic acid sequences which are substantially complementary, as may be assessed by the same nucleotide comparison set forth above, or as defined as being capable of hybridizing to the nucleic acid segment of SEQ ID NO:22, 24, 28 or 30 under relatively stringent conditions such as those described herein in the detailed description of the preferred embodiments.
  • Complementary nucleotide sequences are useful for detection and purification of hybridizing nucleic acid molecules.
  • the present fusion proteins have an N-terminal histidine tag for purposes of facilitating purification of the fusion proteins.
  • other molecular tags known to those of skill in the art may also be used in conjunction with the practise of the present invention.
  • the present inventors also envision the preparation of further fusion proteins and peptides, e.g., where the DNA binding moiety is from different DNA binding proteins as cited above, also where the fusion protein coding regions are aligned within the same expression unit with other proteins or peptides having desired functions, such as for further purification or immunodetection purposes (e.g., proteins which may be purified by affinity chromatography and enzyme label coding regions, respectively).
  • the fusion proteins of the present invention have been successfully expressed in a prokaryotic expression system by the present inventors, especially using the pT7- 7(His) vector in E. coli cells.
  • Other expression systems contemplated by the present inventors include, e.g., baculovirus-based, yeast-based, mammalian cell-based, or the like.
  • the transcriptional unit which includes the fusion protein gene, an appropriate polyadenylation site if one was not contained within the original cloned segment.
  • the poly A addition site is placed about 30 to 2000 nucleotides "downstream" of the termination site of the protein at a position prior to transcription termination.
  • any of the commonly employed host cells can be used in connection with the expression of the fusion proteins of the present invention in accordance herewith.
  • Examples include cell lines typically employed for eukaryotic expression such as COS, CV-1, CHO, murine fibroblasts C127 and 3T3, HeLa, HeLa
  • a pseudotype virus is made using two components, i) donor DNA having viral LTR-like ends, and ii) a helper cell encoding a fusion protein of the present invention and other essential viral proteins, and having necessary cellular machinery for making virus.
  • Donor DNA includes a packaging signal that allows the packaging of RNA made from donor DNA. This RNA, together with viral proteins synthesized by the helper cell, produces infectious virus.
  • the virus is harvested and used to infect cells in need of treatment.
  • Oligonucleotide sequences based on the fusion proteins of the present invention may be used as primers in a polymerase chain reaction or as hybridization probes to screen for the incorporation of fusion protein encoding sequences into a subject of interest, a helper cell, for example.
  • DNA probes and primers useful in hybridization studies and PCR reactions may be derived from any portion of SEQ ID NO:22, 24, 28 or 30, and are generally at least about seventeen nucleotides in length. Therefore, probes and primers are specifically contemplated that comprise nucleotides 1 to 17, or 2 to 18, or 3 to 19 and so forth up to a probe comprising the last 17 nucleotides of the nucleotide sequence of SEQ ID
  • each probe would comprise at least about 17 linear nucleotides of the nucleotide sequence of SEQ ID NO:22, 24, 28 or 30, designated by the formula "n to n + 16," where n is an integer from 1 to about 753 or 1473, respectively.
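  • The "n to n + 16" formula above can be sketched as a sliding window. The example uses a short made-up sequence rather than any SEQ ID NO, and the function name is hypothetical.

```python
def probe_windows(sequence: str, length: int = 17):
    """Yield (n, probe) for every window 'n to n + length - 1' (n is 1-based)."""
    for start in range(len(sequence) - length + 1):
        yield start + 1, sequence[start:start + length]


# Hypothetical 20-nt template, used only to show the indexing.
template = "ACGT" * 5
for n, probe in probe_windows(template):
    print(f"{n} to {n + 16}: {probe}")
```

  • A template of length L yields L - 16 probes of 17 nucleotides, so the largest value of n is L - 16.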
  • Longer probes that hybridize to the fusion protein gene under low, medium, medium-high and high stringency conditions are also contemplated, including those that comprise the entire nucleotide sequence of SEQ ID NO:22, 24, 28 or 30.
  • Selected oligonucleotide subportions of the gene encoding a fusion protein of the present invention have significant utility as hybridization probes.
  • Such probes may be used in the identification of genes encoding a fusion protein of the present invention that have been incorporated into helper cells or into a virus, for example.
  • a general method for preparing oligonucleotides of various lengths and sequences is described by Caracciolo et al. (1989).
  • Preferred oligonucleotides resistant to in vivo hydrolysis may contain a phosphorothioate substitution at each base.
  • Oligodeoxynucleotides or their phosphorothioate analogues may be synthesized using an Applied Biosystem 380B DNA synthesizer (Applied Biosystems, Inc., Foster City, CA).
  • a further embodiment of the invention is a purified nucleic acid molecule having at least a 17, 20, 25, 30, 50, 100, 200, 500, or 1000 nucleotide sequence that corresponds to, or is capable of hybridizing to the nucleic acid sequence of SEQ ID NO:22, 24, 28 or 30 under conditions standard for hybridization fidelity and stability.
  • nucleic acid molecules having a nucleotide sequence of SEQ ID NO:22, 24, 28 or 30 for stretches of between about 10 nucleotides to about 20 or to about 30 nucleotides will find particular utility, with even longer sequences, e.g., 40, 50, 150, 250, 450, even up to full length, being more preferred for certain embodiments.
  • probes will be useful in hybridization embodiments, such as Southern and Northern blotting.
  • the total size of fragment, as well as the size of the complementary stretch(es), will ultimately depend on the intended use or application of the particular nucleic acid segment. Smaller fragments will generally find use in hybridization embodiments, wherein the length of the complementary region may be varied, such as between about 20 and about 40 nucleotides, or even up to the full length of the nucleic acid as shown in SEQ ID NOS: 1, 9-13, 26 and 27 according to the complementary sequences one wishes to detect.
  • hybridization probe of about 10 nucleotides in length allows the formation of a duplex molecule that is both stable and selective. Molecules having complementary sequences over stretches greater than 10 bases in length are preferred, though, in order to increase stability and selectivity of the hybrid, and thereby improve the quality and degree of specific hybrid molecules obtained.
  • Such fragments may be readily prepared by, for example, directly synthesizing the fragment by chemical means, by application of nucleic acid reproduction technology, such as the PCR technology of U.S. Patent 4,683,202 (herein incorporated by reference) or by introducing selected sequences into recombinant vectors for recombinant production.
  • nucleic acid sequences of the present invention in combination with an appropriate means, such as a label, for determining hybridization.
  • appropriate indicator means include fluorescent, radioactive, enzymatic or other ligands, such as avidin/biotin, which are capable of giving a detectable signal.
  • fluorescent label or an enzyme tag such as urease, alkaline phosphatase or peroxidase, instead of radioactive or other environmentally undesirable reagents.
  • in the case of enzyme tags, colorimetric indicator substrates are known which can be employed to provide a means visible to the human eye or spectrophotometrically, to identify specific hybridization with complementary nucleic acid-containing samples.
  • the hybridization probes described herein will be useful both as reagents in solution hybridization as well as in embodiments employing a solid phase.
  • the test DNA (or RNA) is adsorbed or otherwise affixed to a selected matrix or surface.
  • This fixed, single-stranded nucleic acid is then subjected to specific hybridization with selected probes under desired conditions.
  • the selected conditions will depend on the particular circumstances based on the particular criteria required (depending, for example, on the G+C content, type of target nucleic acid, source of nucleic acid, size of hybridization probe, etc.). Following washing of the hybridized surface so as to remove nonspecifically bound probe molecules, specific hybridization is detected, or even quantified, by means of the label.
  • DNA segments prepared in accordance with the present invention may also encode biologically functional equivalent proteins or peptides which have variant amino acid sequences. Such sequences may arise as a consequence of codon redundancy and functional equivalency which are known to occur naturally within nucleic acid sequences and the proteins thus encoded.
  • functionally equivalent proteins or peptides may be constructed via the application of recombinant DNA technology, in which changes in the protein structure may be engineered, based on considerations of the properties of the amino acids being exchanged.
  • Table 2 lists the identity of sequences of the present disclosure having sequence identifiers.
  • Table 2. Identification of Sequences Having Sequence Identifiers
  • double-stranded oligonucleotide that allowed insertion of an ATG initiation codon (italicized) and seven histidine codons (underlined) into the unique Nde I site of pT7-7
  • SEQ ID NO:10, 5'-CAGGCCTGTATGAGCATACAGGTAC-3': double-stranded oligonucleotide that allowed preparation of a plasmid that contains a single specific binding site for LexA protein
  • SEQ ID NO:11, 5'-CTGTATGCTCATACAGGCCTGGTAC-3': complement to SEQ
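  • The relationship between the two 25-nt oligonucleotides listed above for the LexA binding-site plasmid can be checked with a short sketch. It tests whether the strands pair over 21 nt and leave 4-nt 3' overhangs. The overhang length is an assumption of this example, chosen because Kpn I (GGTAC^C) leaves a 3' GTAC overhang; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


TOP = "CAGGCCTGTATGAGCATACAGGTAC"
BOTTOM = "CTGTATGCTCATACAGGCCTGGTAC"


def pairs_with_3prime_overhangs(top: str, bottom: str, overhang: int) -> bool:
    """True if the strands are complementary except for 3' overhangs."""
    return reverse_complement(top[:-overhang]) == bottom[:-overhang]


print(pairs_with_3prime_overhangs(TOP, BOTTOM, 4))
print(TOP[-4:], BOTTOM[-4:])
```

  • Both overhangs read GTAC, which is consistent with insertion of the annealed duplex into a Kpn I site.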
  • the present invention provides a purified integrase-DNA binding moiety fusion protein having an amino acid sequence essentially as set forth in SEQ ID NO:23, 25, 29 or 31.
  • Peptides of a fusion protein are useful for designing oligonucleotides for screening for the presence of the gene encoding said fusion protein.
  • Peptides having fewer than about 45 amino acid residues may be chemically synthesized by the solid-phase method of Merrifield (1963) in light of this disclosure, using an automatic peptide synthesizer with standard t-butoxycarbonyl (t-Boc) chemistry that is well known to one skilled in this art.
  • the Merrifield reference is specifically incorporated by reference herein.
  • the amino acid composition of the synthesized peptides may be determined by amino acid analysis with an automated amino acid analyzer to confirm that they correspond to the expected compositions.
  • the purity of the peptides may be determined by sequence analysis or HPLC
  • the method comprises growing recombinant host cells comprising a vector that encodes a protein which includes an amino acid sequence in accordance with SEQ ID NO:23, 25, 29 or 31 , under conditions permitting nucleic acid expression and protein production followed by recovering the protein so produced.
  • the host cell, conditions permitting nucleic acid expression, protein production and recovery, will be known to those of skill in the art, in light of the present disclosure of the fusion proteins of the invention.
  • a preferred host cell is an E. coli cell.
  • Modifications and changes may be made in the sequence of the fusion proteins of the present invention and still obtain a peptide or protein having like or otherwise desirable characteristics.
  • certain amino acids may be substituted for other amino acids in a peptide without appreciable loss of function. Since it is the interactive capacity and nature of an amino acid sequence that defines the peptide's functional activity, certain amino acid sequences may be chosen (or, of course, its underlying DNA coding sequence) and nevertheless obtain a peptide with like properties. It is thus contemplated by the inventors that certain changes may be made in the sequence of an integrase-DNA binding moiety fusion protein (or underlying DNA) without appreciable loss of its ability to function.
  • an amino acid can be substituted for another having a similar hydrophilicity value and still obtain a biologically equivalent peptide.
  • substitution of amino acids whose hydrophilicity values are within ±2 is preferred, those which are within ±1 are more preferred, and those within ±0.5 are most preferred.
  • amino acid substitutions are generally therefore based on the relative similarity of the amino acid side-chain substituents, for example, their hydrophobicity, hydrophilicity, charge, size, and the like.
  • Exemplary substitutions which take various of the foregoing characteristics into consideration are well known to those of skill in the art and include: arginine and lysine; glutamate and aspartate; serine and threonine; glutamine and asparagine; and valine, leucine and isoleucine.
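  • The hydrophilicity thresholds above can be applied as in the following sketch. The hydrophilicity values are not given in this text; the table below uses the values commonly recited from U.S. Patent 4,554,101 and is therefore an assumption of this example (for aspartate, glutamate and proline only the nominal value is used). The function name is hypothetical.

```python
# Assumed hydrophilicity values (as commonly recited from U.S. Patent 4,554,101).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def substitution_preference(a: str, b: str):
    """Classify a substitution by the difference in hydrophilicity values."""
    # Rounding avoids floating-point noise at the threshold boundaries.
    diff = round(abs(HYDROPHILICITY[a] - HYDROPHILICITY[b]), 6)
    if diff <= 0.5:
        return "most preferred"
    if diff <= 1.0:
        return "more preferred"
    if diff <= 2.0:
        return "preferred"
    return None


for pair in [("R", "K"), ("S", "T"), ("V", "L"), ("R", "W")]:
    print(pair, substitution_preference(*pair))
```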
  • Another aspect of the present invention provides therapeutic agents for the incorporation of a therapeutic gene or for the inactivation of an oncogene, for example, in an animal.
  • the therapeutic agent comprises an admixture of integrase-DNA binding moiety fusion protein in a pharmaceutically acceptable excipient.
  • the therapeutic agent will be formulated so as to be suitable for injection.
  • Pharmacologically active fusion proteins may also be provided to a subject via gene therapy. Many different vehicles exist for accomplishing this end, such as incorporation of the fusion protein gene, or fragment thereof, into an adenovirus, retrovirus, or other techniques known to those of skill in the art in light of the present disclosure. Ex vivo gene therapy is also contemplated as another mode of administration.
  • compositions and preparations should contain at least 0.1 % of active compound.
  • the percentage of the compositions and preparations may, of course, be varied and may conveniently be between about 2 to about 60% of the weight of the unit.
  • the amount of active compounds in such therapeutically useful compositions is such that a suitable dosage will be obtained.
  • the active compounds may be administered parenterally or intraperitoneally.
  • Solutions of the active compounds as free base or pharmacologically acceptable salts can be prepared in water suitably mixed with a surfactant, such as hydroxypropyl cellulose.
  • Dispersions can also be prepared in glycerol, liquid polyethylene glycols, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations contain a preservative to prevent the growth of microorganisms.
  • the pharmaceutical forms suitable for injectable use include sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. In all cases the form must be sterile and must be fluid to the extent that easy syringability exists.
  • the carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), suitable mixtures thereof, and vegetable oils.
  • the proper fluidity can be maintained, for example, by the use of a coating, such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants.
  • the prevention of the action of microorganisms can be brought about by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like.
  • isotonic agents for example, sugars or sodium chloride.
  • Prolonged absorption of the injectable compositions can be brought about by the use in the compositions of agents delaying absorption, for example, aluminum monostearate and gelatin.
  • Sterile injectable solutions are prepared by incorporating the active compounds in the required amount in the appropriate solvent with various of the other ingredients enumerated above, as required, followed by filtered sterilization.
  • dispersions are prepared by incorporating the various sterilized active ingredients into a sterile vehicle which contains the basic dispersion medium and the required other ingredients from those enumerated above.
  • the preferred methods of preparation are vacuum-drying and freeze-drying techniques which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.
  • pharmaceutically acceptable carrier includes any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents and the like.
  • the use of such media and agents for pharmaceutical active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active ingredient, its use in the therapeutic compositions is contemplated. Supplementary active ingredients can also be incorporated into the compositions. See, for example, Remington (1995), which reference is incorporated by reference herein.
  • the present invention includes an antibody that is immunoreactive with an integrase-DNA binding moiety fusion polypeptide as described for the invention.
  • An antibody can be a polyclonal or a monoclonal antibody. In some embodiments, the antibody is a monoclonal antibody.
  • Means for preparing and characterizing antibodies are well known in the art (see, e.g., Antibodies: A Laboratory Manual, E. Harlow and D. Lane, Cold Spring Harbor Laboratory, 1988).
  • the present invention in still another aspect defines an immunoassay for the detection of an integrase-DNA binding moiety fusion protein in a biological sample.
  • the immunoassay comprises: preparing an antibody having binding specificity for the fusion protein to provide an anti-fusion protein antibody, incubating the anti-fusion protein antibody with the biological sample for a sufficient time to permit binding between antibody and fusion protein present in said biological sample, and determining the presence of bound antibody by contacting the incubate of the sample and antibody with a detectably labeled antibody specific for the anti-fusion protein antibody, wherein the presence of anti-fusion protein antibody in the biological sample is detectable as the measure of the detectably labeled antibody from the biological sample.
  • the antibody may be labeled with any of a variety of detectable molecular labeling tags.
  • detectable molecular labeling tags include an enzyme-linked antibody, a fluorescent-tagged antibody, or a radio-labelled antibody.
  • the present example provides constructs of fusion proteins studied as part of the present invention.
  • LexA repressor a sequence-specific DNA binding protein.
  • the LexA repressor of E. coli negatively regulates the transcription of about 20 SOS genes that are mostly involved in DNA repair, mutagenesis, DNA replication, and cell division (for reviews, see Little and Mount, 1982; and Schnarr et al., 1991).
  • LexA protein contains two domains: the first 87 amino acids at the N-terminus constitute the DNA binding domain, and amino acid residues 88 to 202 constitute the dimerization domain (Fogh et al., 1994; Schnarr et al., 1988; Thliveris and Mount, 1992).
  • LexA protein binds specifically to a 16-bp DNA sequence that consists of two dyad symmetric half-sites of 8 bp each, starting with a highly conserved CTG trinucleotide and followed by a less conserved but AT-rich 5-bp sequence (Wertman and Mount, 1985).
  • the sequence used in this study corresponds to the recA operator, a site that LexA binds with high affinity (Lewis et al, 1994).
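  • The dyad symmetry described above can be examined with a small sketch. It assumes that the 16-bp site is the CTG-initiated core CTGTATGAGCATACAG found within the 25-nt oligonucleotide of Table 2; that placement is an inference of this example rather than a statement of the source.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Assumed 16-bp LexA-binding core (see lead-in).
SITE = "CTGTATGAGCATACAG"

left, right = SITE[:8], SITE[8:]
right_rc = reverse_complement(right)  # second half-site read on the other strand

# Positions at which the site equals its own reverse complement.
symmetric_positions = sum(
    x == y for x, y in zip(SITE, reverse_complement(SITE))
)

print(left, right_rc, symmetric_positions)
```

  • Under this assumption both half-sites begin with CTG, and the site matches its own reverse complement at 14 of 16 positions, so the dyad is close to but not perfectly symmetric.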
  • The ability of LexA to bind to specific DNA sequences is retained after LexA is fused to various other proteins (Brent and Ptashne, 1985; Golemis and Brent, 1992; Schmidt-Dorr et al., 1991; Wang and Stillman, 1993).
  • HIV-1 integrase and the lexA genes were obtained from plasmids pT7-7-IN (Vincent et al., 1993) and pBTM117, respectively.
  • a parent plasmid to pBTM117, pBTM116, is described in Vojtek (1993). For purposes of the present invention, these plasmids are essentially the same.
  • the genes were amplified by polymerase chain reaction (PCR). Oligonucleotide primers used in PCR were from Operon Technologies, Inc.
  • the primers for the N-terminus of the full-length and the N-terminus truncated (amino acid residues 1-50) integrases were 5'-GAAGGAGATATACATATGTTTTTAGATGGA-3' (SEQ ID NO:1) and 5'-TAGACTCATATGCATGGACAAGTA-3' (SEQ ID NO:2), respectively.
  • N-terminus primers contain an Nde I site.
  • the primers for the C-terminus of the full-length and the C-terminus truncated (amino acid residues 235-288) integrases were 5'-GCTAGAGGTACCATCCTCATCCTGTCTACT-3' (SEQ ID NO:3) and 5'-GCTAGAGGTACCAACTGGATCTCTGCTGTC-3' (SEQ ID NO:4), respectively.
  • the C-terminus primers contain a Kpn I site.
  • the primer for the N-terminus of the lexA gene was 5'-CAGTCAGGTACCAAAGCGTTAACGGCCAGG-3' (SEQ ID NO:5) and contains a Kpn I site.
  • the primers for the C terminus of the full-length and the DNA-binding domain (amino acids 1 to 87) of LexA protein were
  • the C-terminus primers for the lexA gene contain a BamH I site and a stop codon (italicized). After PCR, the DNA fragments containing the integrase gene were cut with Nde I and Kpn I, and the DNA fragments containing the lexA gene were cut with Kpn
  • the cleaved DNA fragments were purified with the Qiaex gel extraction kit (Qiagen) and ligated to pT7-7(His) plasmid DNA, previously cut with Nde I and BamH I.
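  • The restriction sites stated for the primers above can be confirmed with a simple search. The recognition sequences used here (Nde I, CATATG; Kpn I, GGTACC) are the standard ones; the dictionary layout and function name are illustrative.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"Nde I": "CATATG", "Kpn I": "GGTACC"}

# Primer sequences as given in the text, with the site each is said to carry.
PRIMERS = {
    "SEQ ID NO:1": ("GAAGGAGATATACATATGTTTTTAGATGGA", "Nde I"),
    "SEQ ID NO:2": ("TAGACTCATATGCATGGACAAGTA", "Nde I"),
    "SEQ ID NO:3": ("GCTAGAGGTACCATCCTCATCCTGTCTACT", "Kpn I"),
    "SEQ ID NO:4": ("GCTAGAGGTACCAACTGGATCTCTGCTGTC", "Kpn I"),
    "SEQ ID NO:5": ("CAGTCAGGTACCAAAGCGTTAACGGCCAGG", "Kpn I"),
}


def site_offset(primer: str, enzyme: str) -> int:
    """Return the 0-based offset of the enzyme site in the primer, or -1."""
    return primer.find(SITES[enzyme])


for name, (sequence, enzyme) in PRIMERS.items():
    print(name, enzyme, site_offset(sequence, enzyme))
```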
  • the plasmid pT7-7(His) is derived from pT7-7, a T7 RNA polymerase-promoter system (Tabor and Richardson, 1985), and was prepared by inserting a double-stranded oligonucleotide 5'-TATGCATCACCATCACCATCACCA-3' (SEQ ID NO:8) and
  • the various fusion proteins constructed and studied in this report are shown in FIG. 4.
  • the fusion protein consisting of full-length HIV-1 integrase fused to LexA (IN1-288/LA) serves as the prototype.
  • IN1-234/LABD were prepared for determining whether fusion proteins containing only the DNA binding domain of LexA were sufficient for altering target site selection. Since the central core of integrase contains the catalytic site and the C-terminus of integrase shows non-specific DNA binding (Engelman et al., 1994; Schauer and Billich, 1992; Vink et al., 1993; Woerner et al., 1992), several fusion constructs were prepared that include various truncated forms of integrase, such as IN1-234/LA, IN50-288/LA, and IN50-234/LA. These constructs would indicate whether the fusion proteins containing truncated integrase, when compared with those containing full-length integrase, have an increased specificity toward the LexA-binding sequence in target site usage.
  • the present example provides studies carried out to demonstrate 3'-end processing and 3'-end joining activities, and footprinting analyses of protein binding to a LexA-recognition sequence.
  • the DNA constructs were transformed into E. coli BL21 (DE3). The cells were grown at 30°C. When the OD600 was 0.8-1, 0.4 mM isopropyl-1-thio-β-D-galactopyranoside was added for expression induction, and the culture was grown for an additional 3 hours.
  • the cell pellet was resuspended in a buffer (5 ml buffer per gram of cells) containing 20 mM Tris-HCl, pH 8, 0.5 M NaCl and 6 M guanidine-HCl (Buffer A). The suspension was frozen and thawed, homogenized by stirring for one hour at room temperature, and spun at 27,000 x g for
  • the cell pellet was resuspended in a buffer containing a final concentration of 20 mM HEPES, pH 7.5, 1 M NaCl, 10% glycerol,
  • Ni-NTA resin was sequentially washed with buffer C, buffer C plus 10 mM imidazole, buffer C plus 50 mM imidazole, and buffer C plus 70 mM imidazole. The resin was then packed in a column and the protein was eluted with a linear gradient from buffer C plus 70 mM imidazole to buffer C plus 500 mM imidazole. The fractions containing the protein were pooled, concentrated on a Centricon-10 column (Amicon), and dialyzed against the final buffer (20 mM HEPES, pH 7.5, 0.5 M NaCl, 20% glycerol, 0.1 mM EDTA, 1 mM DTT and 10 mM CHAPS). Protein concentrations were determined by the Bradford method (Bio-Rad) using bovine serum albumin (BSA) as a standard.
  • the wild-type integrase and the fusion proteins IN1-234/LA and IN50-234/LA were purified in both native and denaturing conditions. For each protein, no difference in activity was observed when the protein was purified in either condition.
  • the proteins IN50-234 and IN50-288/LA were purified under the native condition only, whereas the proteins IN1-234, IN1-288/LABD, and IN1-234/LABD were purified under the denaturing condition only.
  • the digestion was stopped by the addition of 18 mM EDTA, and the samples were deproteinized by phenol-chloroform extraction, ethanol precipitated in the presence of 10 µg of tRNA as a carrier, and resuspended in 5 µl of formamide, 10 mM EDTA. After denaturation at 90°C for 3 min, the samples were analyzed by electrophoresis through a 5% denaturing polyacrylamide gel.
  • oligonucleotides (Operon Technologies, Inc., Alameda, CA) were used as DNA substrates: T1 (16 mer), 5'-CAGCAACGCAAGCTTG-3', (SEQ ID NO:12); T3 (30 mer), 5'-GTCGACCTGCAGCCCAAGCTTGCGTTGCTG-3', (SEQ ID NO:13); V2 (21 mer), 5'-ACTGCTAGAGATTTTCCACAT-3', (SEQ ID NO:14); V1/T2 (33 mer), 5'-ATGTGGAAAATCTCTAGCAGGCTGCAGGTCGAC-3', (SEQ ID NO:
  • oligonucleotides were purified by electrophoresis through a 15% denaturing polyacrylamide gel. Oligonucleotides T1, C220 and B2-1 were labeled at the 5' end with [γ-32P]ATP (6000 Ci/mmol, Amersham, Arlington Heights, IL) using T4 polynucleotide kinase.
  • the 3'-end processing and 3'-end joining substrate, which corresponds to the terminal 21 nucleotides of the U5 end of viral DNA, was prepared by annealing the labeled C220 strand with its complementary oligonucleotide V2.
  • the preprocessed substrate, which resembles the viral U5 end after 3'-end processing, was prepared by annealing the labeled B2-1 strand with the V2 strand and was used to assay only the
  • the substrate for assaying disintegration activity was prepared by annealing the labeled T1 strand with oligonucleotides T3, V2 and V1/T2 (Chow et al., 1992).
  • the DNA substrate (0.1 pmol) was incubated with the protein for one hour at 37°C in the standard reaction buffer containing a final concentration of 20 mM HEPES, pH 7.5, 10 mM DTT, 0.05% Nonidet P-40 and 10 mM MnCl2.
  • the reaction was stopped by the addition of 18 mM EDTA.
  • reaction products were heated at 90°C for 3 min before analysis by electrophoresis on 15% polyacrylamide gels with 7M urea in Tris-borate-EDTA buffer.
  • a reaction was carried out with 5 nM of the Y-oligomer substrate and 250 nM of protein.
  • the 5'-end-labeled T1 strand of the Y-substrate migrated as a 16-nucleotide band on the denaturing gel.
  • the disintegration product was a 30-mer. Controls were done in the absence of protein.
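The base pairing that holds the Y-shaped disintegration substrate together can be checked directly from the four oligonucleotide sequences listed above. The sketch below is an illustration only: the helper `revcomp` is a hypothetical name, and the assumption that the 30-mer product is T1 joined to the last 14 nucleotides of V1/T2 is inferred from the stated lengths (16-mer labeled strand, 30-mer product) rather than taken from the text.

```python
# Oligonucleotides of the Y-shaped disintegration substrate, written 5' -> 3'.
T1 = "CAGCAACGCAAGCTTG"                      # 16-mer, 5'-end labeled
T3 = "GTCGACCTGCAGCCCAAGCTTGCGTTGCTG"        # 30-mer
V2 = "ACTGCTAGAGATTTTCCACAT"                 # 21-mer
V1_T2 = "ATGTGGAAAATCTCTAGCAGGCTGCAGGTCGAC"  # 33-mer

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# T1 pairs with the 3' portion of T3.
t1_pairs_t3 = revcomp(T1) == T3[-len(T1):]

# The first 20 nt of V1/T2 pair with V2 (all of V2 except its 5'-terminal A).
v1_pairs_v2 = V1_T2[:20] == revcomp(V2[1:])

# The last 13 nt of V1/T2 pair with the 5' portion of T3.
t2_pairs_t3 = revcomp(V1_T2[20:]) == T3[:13]

# Assumed product: T1 joined to the last 14 nt of V1/T2 (16 + 14 = 30 nt).
product = T1 + V1_T2[-14:]
product_is_t3_complement = product == revcomp(T3)

print(t1_pairs_t3, v1_pairs_v2, t2_pairs_t3, len(product), product_is_t3_complement)
```

Under that assumption the joined strand is exactly complementary to T3 over its full length, which is consistent with a 30-nucleotide labeled product on the denaturing gel.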
  • Relative activities are expressed as the percentage of the activity of wild-type HIV-1 integrase: +, 50% or less; ++, wild-type level of activity; +++, 150% or more; -, no activity.
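The scoring scheme above can be written as a small lookup. This is a minimal sketch, not the inventors' procedure: the function name `activity_symbol` is hypothetical, and treating every value strictly between 50% and 150% as "wild-type level" is an assumption, since the text does not say how intermediate values were scored.

```python
def activity_symbol(percent_of_wild_type: float) -> str:
    """Map an activity (as % of wild-type integrase) to the table symbol."""
    if percent_of_wild_type <= 0:
        return "-"      # no activity
    if percent_of_wild_type <= 50:
        return "+"      # 50% or less
    if percent_of_wild_type >= 150:
        return "+++"    # 150% or more
    return "++"         # assumed: everything in between counts as wild-type level


for value in (0, 30, 100, 200):
    print(value, activity_symbol(value))
```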
  • Integrases containing various truncations, and fusion proteins containing truncated integrase, were inactive in 3'-end joining and 3'-end processing but retained disintegration activity (Table 1). Although the truncated variants of integrase, either by themselves or fused with LexA, did not exhibit 3'-end joining activity using the oligonucleotide-based assays, the ability of these proteins to mediate 3'-end joining was demonstrated by a more sensitive PCR-based assay. IN1-186/LA did not display any catalytic activities. Fusing WT IN or truncated integrase to full-length LexA or only the DNA-binding domain of LexA increased the disintegration activity of the cognate protein.
  • the present example demonstrates selective integration into DNA mediated by integrase-LexA fusion proteins and the effect of preincubation of IN1-288/LA with target DNA.
  • the donor DNA substrate used to assay the distribution of integration sites of the HIV integrase-LexA fusion proteins was the preprocessed U5 DNA substrate (B2-1/V2).
  • the target DNA was the plasmid pBS-LA, as described in Example 1.
  • the distribution of the integration sites was analyzed by the following assay and the PCR assay of Example 5.
  • pBS-LA was cleaved with Mbo II to generate multiple fragments ranging in size from 0.1 to 1 kbp (see FIG. 5).
  • the fragment that contains the LexA-binding sequence is 543 bp in length (FIG. 5).
  • the DNA fragments (1 µg) were incubated with WT IN or with the fusion protein for 5 min on ice in the standard reaction buffer.
  • the integration reaction was started by adding 15 nM of the preprocessed U5 substrate (B2-1/V2), labeled at the 5' end of B2-1, and transferring the reaction to 37°C. After a 30-min incubation, the reaction was stopped by adding 2 µl of 0.2 M EDTA, pH 8.0.
  • the total reaction volume was 20 µl.
  • the reaction product was mixed with a 1/6 volume of loading buffer (30% glycerol, 0.25% bromophenol blue, 0.25% xylene cyanol) and separated by electrophoresis on a 1.5% agarose gel in Tris-borate-EDTA buffer. After electrophoresis, the DNA fragments were visualized by ethidium bromide staining (0.5 µg/ml) and autoradiography.
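The donor DNA is quoted here as a concentration (15 nM in a 20 µl reaction), while other passages quote substrate as an amount (0.1 pmol or 0.3 pmol). The sketch below converts between the two; the function names are hypothetical, and applying the 20 µl volume to the oligonucleotide assay is an assumption, since that volume is stated only for the integration reaction.

```python
def amount_pmol(concentration_nM: float, volume_ul: float) -> float:
    """nM x µl gives femtomoles; divide by 1000 to express the result in pmol."""
    return concentration_nM * volume_ul / 1000.0


def concentration_nM(amount_in_pmol: float, volume_ul: float) -> float:
    """Inverse conversion: pmol in a given volume (µl) to nM."""
    return amount_in_pmol * 1000.0 / volume_ul


# 15 nM donor DNA in the 20 µl integration reaction.
print(amount_pmol(15, 20))

# 0.1 pmol of substrate, assuming the same 20 µl volume.
print(concentration_nM(0.1, 20))
```

With these inputs, 15 nM in 20 µl corresponds to 0.3 pmol, and 0.1 pmol in 20 µl corresponds to 5 nM, which matches the amounts and concentrations quoted elsewhere in the examples.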
  • Directed integration mediated by integrase-LexA fusion protein. Formation of recombinant products by integration of the labeled U5 DNA into target DNA was assayed by the appearance of labeled, high molecular weight DNA fragments. In the presence of WT IN (no fusion), integration appeared to be random and occurred in each of the DNA fragments with similar frequency. The integration frequency using WT IN increased at higher protein concentrations but the relative intensity among the various DNA fragments remained the same. In contrast, integration of the U5 DNA by the fusion protein IN1-288/LA was unevenly distributed and showed a bias towards the DNA fragment containing the LexA-binding sequence.
  • the molar ratio between the DNA fragment containing the LexA-binding sequence and the IN1-288/LA dimer was about 1:1.
  • the 543-bp lexA-containing DNA fragment was preferred approximately 14-50 fold over the other fragments.
  • the integration frequency increased but the bias became less apparent.
  • 543-bp fragment was approximately 4-fold.
  • the protein was preincubated at room temperature for 5 min with the preprocessed U5 DNA before the reaction was started by adding target DNA.
  • the DNA fragment containing the LexA-binding sequence was preferred when the fusion protein was preincubated with the target DNA, although the time of preincubation was not critical.
  • the integration events became more evenly distributed.
  • no difference was observed whether the protein was preincubated with the target or donor DNA. The result is consistent with the preferred integration being mediated by the specific interaction between the fusion protein and the LexA-binding sequence, and that such an interaction is promoted when the fusion protein is preincubated with the target DNA.
  • the present example confirms that integration by the fusion protein at a targeted site is directed by a DNA binding protein domain having binding specificity for a target nucleotide sequence, such as for example the presence of the LexA-binding sequence.
  • the present inventor examined the distribution of integration sites into DNA fragments generated from Mbo II cleavage of the parental plasmid pBS, which contains no LexA-binding sequence as a model.
  • IN1-288/LA in the presence of 0-20 pmol of LexA repressor.
  • the LexA protein was preincubated first with the target DNA (Mbo II-cleaved pBS-LA) for 5 min at room temperature before the reaction was started by adding the WT IN or the IN1-288/LA and 0.3 pmol of the 5'-end labeled U5 DNA.
  • the preferred integration mediated by IN1-288/LA into the DNA fragment containing the LexA-binding sequence correspondingly diminished, and the integration became more evenly distributed among all DNA fragments.
  • the present example provides a detailed analysis of the integration sites using a PCR-based assay that has a much higher sensitivity and resolution than the agarose gel assay (Pryciak and Varmus, 1992).
  • PCR assay. One microgram of plasmid pBS-LA was incubated with the protein on ice for 5 min in the standard reaction buffer. The integration reaction was started by adding 15 nM of preprocessed U5 DNA (B2-1/V2) and incubating the sample at 37°C. After 30 or 60 min, the reaction was stopped by the addition of a final concentration of 15 mM EDTA. The sample was extracted with phenol-chloroform, ethanol precipitated in the presence of 10 µg tRNA, and washed with 70% ethanol. The pellet was resuspended in 50 µl of 10 mM Tris-HCl and 1 mM EDTA, pH 7.5.
  • PCR primers used were 0.2 µM unlabeled B2-1, 0.05 µM 5'-end labeled B2-1 and 0.25 µM BS+ (5'-CATTAATGCAGCTGGCACGA-3', SEQ ID NO: 18), which is complementary to the plus strand of the plasmid DNA and is located at 232 bp from the 3'-end of the LexA-binding sequence.
  • the BS+ primer was replaced by the primer BS- (5'-TAATACGACTCACTATAGGG-3', SEQ ID NO: 19), which is complementary to the minus strand of the plasmid DNA and is located at 140 bp from the 3'-end of the LexA-binding sequence.
  • The PCR reaction was performed in a buffer containing a final concentration of 10 mM Tris-HCl, pH 8.3, 50 mM KCl, 0.001% w/v gelatin, 1.5 mM MgCl2, 200 µM dNTPs, and 1 unit Taq polymerase (Perkin-Elmer Corp., Norwalk, CT), in a final volume of 20 µl.
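The primer concentrations above imply that the two primers are used in equal amounts and that only a fraction of the B2-1 primer carries the label. The following sketch works through that arithmetic; the variable names are hypothetical, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction

# Primer concentrations in µM, as stated for the PCR assay.
unlabeled_b2_1 = Fraction("0.2")
labeled_b2_1 = Fraction("0.05")
bs_plus = Fraction("0.25")

reaction_volume_ul = 20  # final PCR volume in µl

total_b2_1 = unlabeled_b2_1 + labeled_b2_1
labeled_fraction = labeled_b2_1 / total_b2_1

# µM x µl = pmol
b2_1_pmol = float(total_b2_1 * reaction_volume_ul)
bs_plus_pmol = float(bs_plus * reaction_volume_ul)

print(f"total B2-1: {float(total_b2_1)} µM, labeled fraction: {float(labeled_fraction):.0%}")
print(f"B2-1: {b2_1_pmol} pmol, BS+: {bs_plus_pmol} pmol per reaction")
```

On these figures the two primers are equimolar (0.25 µM, or 5 pmol each in 20 µl) and one fifth of the B2-1 primer is labeled.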
  • the labeled PCR products were analyzed on a denaturing 5% polyacrylamide gel and visualized by autoradiography. Each band on the resulting autoradiogram corresponded to an integration event at a given phosphodiester bond.
  • the frequency of integration at a particular site and its exact position was determined by the intensity of the band and by use of a sequencing ladder, respectively.
  • the distribution and frequency of integration events around the LexA-recognition sequence were compared between WT IN and
  • the integration reaction was carried out in the presence of a fixed amount of WT IN and various amounts of LexA protein.
  • concentration of LexA protein increased in the reaction, there was a proportional decrease in the integration events occurring in the LexA-binding sequence.
  • IN1-288/LA there was no increase in integration in the regions flanking the LexA-binding sequence, nor a decrease in integration in the outlying regions.
  • the data show that the integration pattern of IN1-288/LA results from two components working in cis, and not from a combined effect of two separate functions provided in trans by individual components.
  • Integration reaction using the PCR assay was also performed with the fusion protein IN1-288/LABD in order to examine possible differences in the integration pattern between fusion proteins containing full-length or only the DNA-binding domain of LexA protein.
  • the integration pattern of IN1-288/LABD was similar to that of IN1-288/LA, except that the pattern of IN1-288/LABD was less specific since there was more integration within the LexA-binding sequence as well as the outlying regions. The result is consistent with the findings from the agarose gel assay and the footprinting analysis.
  • the present example provides studies that examine whether truncated forms of integrase are competent at the integration function.
  • the central core region of integrase contains the catalytic domain and the C-terminus of the protein is reported to bind non-specific DNA.
  • the integration patterns of fusion proteins containing various truncations of integrase by the PCR assay were examined.
  • the integration reaction was carried out for 1 h at 37°C in the presence of 250 nM of IN50-234, IN50-234/LA, IN50-288/LA, and IN1-234/LA.
  • the recombinant products were amplified by PCR using oligonucleotides B2-1 and BS+ as primers.
  • PCR-based assay between IN1-288/LA and the various truncated integrase-LexA fusion proteins indicate that no added specificity was achieved by removing the N- or C-terminus of integrase. The result indicates that though the C-terminus contributes to non-specific DNA binding, it is unlikely to be involved in target site selection.
  • the result on the integration pattern of the truncated integrases suggests that the integrase domain responsible for target site selection may reside in the central core (amino acid residues from about 50-234, or about 50-212) of the protein.
  • the present example provides for a fusion protein having an integrase domain with an aspartic acid residue, previously thought to be critical for catalysis, replaced with an asparagine residue.
  • the truncated integrases IN1-234 and IN50-234 showed a weak 3'-end joining activity when assayed by the sensitive PCR-based method; no 3'-end joining activity was detectable using the conventional in vitro assays.
  • a weak 3'-end joining activity was also observed by the same PCR assay with a D116N mutant, which contains an asparagine substituting for the highly conserved aspartic acid at position 116.
  • the weak 3'-end joining activity observed with the truncated integrases and the D116N mutant was not changed in the presence or absence of the N-terminal His-tag.
  • the D116N mutant has been shown previously to be inactive in all known catalytic activities of integrase using the conventional assays (Engelman and Craigie, 1992; Kulkosky et al,
  • viruses containing a D116 mutation of integrase may be capable of forming a low level of proviruses, which may in turn produce sufficient Tat protein required for the indicator cell assay.
  • the present example provides a further fusion protein construct where the integrase catalytic domain is from feline immunodeficiency virus.
  • the feline immunodeficiency virus (FIV) full-length integrase gene was obtained from plasmid p34TF10 (Talbott, et al, 1989, provided by Tom Phillips at Scripps Research Institute) and was amplified by polymerase chain reaction (PCR).
  • the 5' and 3' oligonucleotide primers for FIV integrase are 5'-CCAGTGCATATGTCCTCTTGGGTTGACAGA-3' and 5'-CAGTCAGGTACCCTCATCCCCTTCAGG-3' and contain Nde I and Kpn I sites at the N- and C-termini, respectively.
  • the DNA fragment containing the integrase gene was cut with Nde I and Kpn I.
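The restriction sites said to be carried by the two FIV primers can be located by a simple string search. This is a minimal sketch: the recognition sequences for Nde I (CATATG) and Kpn I (GGTACC) are standard, while the dictionary and function names are hypothetical. Both sites are palindromic, so searching the primer strand as written is sufficient.

```python
RECOGNITION_SITES = {"Nde I": "CATATG", "Kpn I": "GGTACC"}

FIV_PRIMERS = {
    "5' primer": "CCAGTGCATATGTCCTCTTGGGTTGACAGA",
    "3' primer": "CAGTCAGGTACCCTCATCCCCTTCAGG",
}


def find_sites(primer: str) -> dict:
    """Return {enzyme: 0-based index of the site} for each site present in the primer."""
    return {
        enzyme: primer.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in primer
    }


for name, sequence in FIV_PRIMERS.items():
    print(name, find_sites(sequence))
```

Each primer carries its site after a six-nucleotide 5' extension, and the Nde I site in the 5' primer contains an ATG that can serve as the start of the integrase reading frame.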
  • the cleaved DNA fragment was purified and ligated to pT7-7(His)/H-IN/LA plasmid DNA, previously cut with Nde I and BamH I.
  • the plasmid pT7-7(His) is derived from pT7-7, a T7 RNA polymerase-promoter system (Tabor and Richardson, 1985), and it contains an ATG initiation codon and seven histidine codons that are in-frame with the unique Nde I site.
  • the DNA sequence of the fusion construct was confirmed by dideoxy sequencing and the construct was transformed into E. coli BL21 (DE3).
  • the fusion protein was expressed under IPTG induction, and purified by nickel-chelating affinity chromatography and gel filtration chromatography.
  • the purified FIV integrase-LexA fusion protein was catalytically active when tested by conventional in vitro assays (Vincent et al, 1993; Chow and Brown, 1994); it was capable of carrying out 3'-end processing, 3'-end joining, and disintegration.
  • a PCR-based assay as described in Example 5 was utilized to determine if there was a bias in the selection of target sites towards the LexA DNA-binding sequence.
  • the target substrate was a plasmid DNA containing a single binding site (LexA operator) for the LexA protein.
  • the enzyme was first incubated with a preprocessed U5 viral DNA end to allow the integration reaction to proceed. The reaction products were then subjected to PCR to determine at what locations integration had occurred.
  • the PCR reaction was carried out with a radiolabeled primer to the U5 viral DNA substrate, and a primer approximately 250 bases downstream from the LexA operator.
  • the 5' primer for FIV IN1-235 is identical to that described earlier for the full-length FIV integrase; the 3' primer is 5'-GCTAGAGGTACCTTTCTTATCTTTTTGATC-3' and contains a Kpn I site.
  • the DNA fragments containing the truncated integrase gene were cut with Nde I and Kpn I.
  • the cleaved DNA fragments were purified and ligated to pT7-7(His)/F-IN/LA plasmid DNA, previously cut with Nde I and Kpn I, and purified to remove the full length FIV integrase gene.
  • the DNA sequence of the fusion construct was confirmed by dideoxy sequencing and the construct was transformed into E. coli BL21 (DE3).
  • the protein was expressed under IPTG induction, and purified by nickel-chelating affinity chromatography and SP-sepharose chromatography.
  • the purified F-IN1-235/LA fusion protein was catalytically active when tested by conventional in vitro assays; it was capable of carrying out 3'-end processing, 3'-end joining, and disintegration.
  • Preliminary results obtained from the PCR-based assay showed that integration of donor DNA mediated by the fusion protein containing a truncated FIV integrase, F-IN1-235/LA, is also biased towards the LexA-binding sequence.
  • the relative specificity between the full-length and truncated fusion proteins is still under investigation. However, unlike the case with HIV-1 integrase, the activity of the F-IN1-235/LA was only 2- to 3-fold less than that of the full-length integrase fusion protein.
  • the present example provides for a variety of DNA binding domains that may be fused to an integrase catalytic domain for purposes of the present invention.
  • sequences and/or plasmid sources include (the references are incorporated by reference herein for this particular purpose): i) the tetracycline repressor of E. coli (Gossen and Bujard, 1992; Gossen et al, 1995), ii) the Lac repressor of E. coli (Reznikoff, 1992; Brown et al, 1987), iii) GAL4 protein of yeast (S. cerevisiae) (Laughon and Gesteland, 1984), and iv) Cro repressor of phage lambda (Ohlendorf et al, 1982; Hochschild and Ptashne, 1986).
  • DNA binding proteins or binding domains thereof will be fused to the C-terminus of integrase or to the C-terminus of an integrase catalytic domain in a similar manner to the strategy used for the integrase-LexA fusion protein as described in Example 1.
  • the present example provides expression vectors, and host cells for the expression of fusion proteins of the present invention.
  • a fusion protein consisting of full-length HIV-1 integrase and the reverse tetracycline repressor (rTet) of E. coli (Gossen et al., 1995) was prepared.
  • the N-terminus of rTet was fused to the C-terminus of HIV-1 integrase.
  • the rTet gene was obtained by PCR amplification using pUHD172-1neo as the template.
  • the 5' and 3' primers for the rTet gene are 5'-CAGTCAGGTACCTCTAGATTAGATAAAAGT-3' (SEQ ID NO:33) and 5'-CAGTCAGGATCCGGACCCACTTTCACATTT-3' (SEQ ID NO:34), and contain a Kpn I and a BamH I site, respectively.
  • the PCR-amplified fragment was digested with Kpn I and BamH I and cloned into pIN1-288/LA previously cut with Kpn I and BamH I.
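For an in-frame fusion, the coding sequence has to run through the Kpn I and BamH I sites without a stop codon. The sketch below translates the coding portion of each rTet primer to illustrate this. It rests on an assumption not stated in the text: that the reading frame starts at the first base of the Kpn I site (GGT ACC) and ends at the last base of the BamH I site. The helper names are hypothetical; the codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


FORWARD = "CAGTCAGGTACCTCTAGATTAGATAAAAGT"  # contains Kpn I (GGTACC)
REVERSE = "CAGTCAGGATCCGGACCCACTTTCACATTT"  # contains BamH I (GGATCC)

# Assumed frame: begin at the Kpn I site, skipping the 6-nt 5' extension.
n_terminal_peptide = translate(FORWARD[FORWARD.find("GGTACC"):])

# Coding strand at the 3' end: reverse complement, read up to the end of the BamH I site.
coding_3prime = revcomp(REVERSE)
c_terminal_peptide = translate(coding_3prime[:coding_3prime.find("GGATCC") + 6])

print(n_terminal_peptide, c_terminal_peptide)
```

Under the assumed frame neither stretch contains a stop codon, so the insert could be read through both cloning sites.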
  • the fusion protein was purified according to the procedure described in Example 2, and the activities examined as described in Examples 2-5.
  • the target DNA for IN/rTet fusion protein was pUHC13-3, which contains heptamerized Tet-operator sequences for specific binding of rTet.
  • the result shows that integrase from different sources, such as HIV-1 and FIV, can be fused with different DNA-binding proteins, such as LexA and rTet, to achieve site-directed integration.
  • Prokaryotic and eukaryotic cells useful for propagating vectors carrying a fusion protein gene of the present invention and for expression of the fusion protein include E. coli (e.g. BL21(DE3), HB101, DH5α), yeast such as Pichia pastoris (e.g. GS115) and S. cerevisiae (e.g. AB116), and insect cells (e.g. Sf9).
  • the expression vectors useful for expression and purification of the fusion protein include pT7-7, pET, pBS24Ub, pYes2, and pAC360.
  • the expression vector and the prokaryotic cell employed to propagate and express the fusion protein of the present invention are pT7-7 and E. coli BL21(DE3), respectively.
  • the fusion protein of the present invention was purified with a histidine-tag (His-tag; sequence is a methionine followed by seven histidine residues) fused to the N-terminus of integrase. Inserted between the integrase and the His-tag was a thrombin cleavage site.
  • Other peptides that can be fused to the N-terminus of integrase for the purpose of purification include glutathione-S-transferase, maltose-binding protein, and thioredoxin (Ausubel et al., 1995).
  • the His-tag can be removed by thrombin digestion.
  • the peptides for purification can also be fused to the C-terminus of the LexA component of the fusion protein.
  • Fusion proteins will also be expressed in mammalian cell lines. Examples include VERO, HeLa cells, WI38, COS, HOS, Jurkat, CEM, 293T and MDCK cell lines. Most preferably, a mammalian cell line employed to propagate an expression vector and for the expression of the fusion proteins of the present invention is 293T cells.
  • Expression vectors for mammalian cells useful for the expression of fusion proteins of the present invention include pCDM8, pZeoSV, pEUK-C1, pMAM, pREP, and pEBVHis. These vectors contain promoters (e.g. CMV, MMTV, RSV, SV40) for driving the expression of the cloned gene, polyA signal for termination of transcription, origin of replication (SV40, oriP), and selectable markers (e.g. resistance to neomycin, hygromycin, and zeocin).
  • the present example provides for targeted delivery of a fusion protein of the present invention.
  • the nucleotide sequence representing the LexA binding site may be introduced into the target DNA.
  • these reagents may be supplied as laboratory reagents for that purpose.
  • the LexA binding site is most easily introduced into a target DNA at a restriction enzyme site, where the appropriate linkers have been attached to the ends of the double stranded LexA binding site oligonucleotide molecule.
  • the LexA-binding site may also be introduced by homologous recombination (Bollag et al, 1989). In such an approach, the LexA-binding sequence will be flanked by DNA sequences homologous to the region of insertion.
  • any nucleotide sequence that represents a binding site on DNA may be introduced into a target DNA, and the corresponding DNA binding domain having binding specificity for that DNA sequence is engineered into a fusion protein.
  • the first step of the process is to produce infectious, yet replication-defective viruses. There are two general methods for doing so. In the first method, a stable helper cell line will be prepared by transforming 293T cells with a plasmid containing a partial retrovirus genome.
  • the partial genome contains the essential genes, gag, pol, env; and the integrase gene at the 3' end of the pol gene is substituted with a gene encoding a fusion protein of the present invention.
  • the partial viral genome lacks the packaging signal and the psi sequence, so the RNA transcribed from the viral genome cannot be packaged into viral particles.
  • the function of the helper cell is, therefore, to provide essential viral proteins and the fusion protein so that a donor DNA of choice can be packaged.
  • a donor retroviral DNA vector will be introduced.
  • retroviral vectors include LNSX, LNCX, LHDCX, LXSHD, and LXSH (Large et al, 1993).
  • Many of these vectors contain DNA sequences derived from murine leukemia virus (MLV).
  • the donor vector DNA contains the LTR (which contains the sequences for integration), the packaging signal, a selectable marker (e.g. neomycin resistance), and a promoter upstream of a site for gene insertion.
  • the gene inserted can be any gene of interest, for example, the adenosine deaminase gene.
  • the retroviral vector does not contain any essential viral genes.
  • the necessary viral proteins deleted from the disabled vector must be therefore provided "in trans" by the helper cell. Since the RNA transcribed from the retroviral vector has the packaging signal, it will be packaged by the viral proteins provided by the helper cell to form infectious, replication-defective viruses, which can be harvested from the culture medium.
  • MLV spleen necrosis virus
  • ABV avian leukosis virus
  • REV reticuloendotheliosis virus
  • Patents have issued for helper cell lines for MLV and REV (Miller, U.S. Pat. No. 4,861,719; Temin et al, U.S. Pat. No. 4,650,764).
  • These existing helper cell lines do not contain a gene that encodes a fusion protein of the present invention; however, they can be modified to carry a fusion protein-encoding gene.
  • MLV viruses have become useful vectors for animal genetic engineering of cells and organisms, because of their compatibility with a wide variety of animal cell types including certain germ cells as well as human cells. MLV was used to insert viral transgenes into the mouse germline, creating a transgenic mouse (Jaenisch et al, 1976,
  • MLV vector systems have been approved for limited human gene therapy trials despite some of the problems described previously.
  • a helper cell is not prepared. Instead, the plasmid DNA containing the essential viral genes and the plasmid containing the donor retroviral vector will be co-transfected into 293T cells. The replication-defective viruses will then be harvested from the culture medium. In both methods, the replication-defective retroviruses, which contain the donor RNA and the fusion protein, will be used to infect target cells. It is envisioned that the replication-defective virus, prepared by the methods described earlier, will be used to introduce a donor RNA containing a therapeutic gene into a host cell. After infection, the donor RNA will be made into cDNA by the viral reverse transcriptase. The donor cDNA will then enter the nucleus and integrate into a specific site determined by the specificity of the DNA-binding moiety of the fusion protein.
  • a modified FIV containing the integrase/LexA fusion will be prepared to produce infectious, replication-defective retroviruses for site-directed integration as an in vivo representative model.
  • the approach involves the use of a replication-defective virus, FIVΔE-N, which is derived from the full-length FIV clone or f2rep (Scripps Research Institute).
  • FIVΔE-N contains a deletion (map positions 7248-8287) in the env gene, and the deleted fragment will be replaced with a neomycin-resistance gene.
  • the plasmid DNA containing the FIVΔE-N will be digested with BspH I and Avr II, which cleave the genome within the integrase gene at positions 4436 and 6718, respectively.
  • the FIV integrase/LexA fusion gene will be amplified by PCR, and the product partially digested with BspH I and Avr II. The desired fragment will be isolated and ligated with the similarly cleaved FIVΔE-N to produce FIV fTNΔE-N.
  • the final construct retains all the known splice donor and acceptor sites, and the putative vif and rev genes of FIV that are required for gene expression and infectivity (Talbott, et al,
  • the replication-defective virus will be pseudotyped with the envelope of vesicular stomatitis virus.
  • a virus stock will be generated by electroporation of 293T cells at 50% confluence using 10 µg of FIV fTNΔE-N plasmid DNA and 10 µg of envelope-expressing plasmid DNA. The culture supernatant will be collected and filtered 60 h later. The virus stock will be titered and characterized by measuring the p25 (capsid) content and the in vitro reverse transcriptase activity.
  • the ability of the fusion protein to mediate site-directed integration in tissue culture cells will be examined by using the pseudotyped, modified FIV (FIV fTNΔE-N) to infect HeLa cells that have previously been infected with SV40.
  • the SV40 used contains a wild-type or mutated LexA operator site inserted into the unique Kpn I site located in the noncoding region of the 5.2 kbp genome.
  • SV40 DNA was chosen as a target because SV40 replicates to a copy number of about 10^5, which makes it possible to analyze many thousands of integration events from a single experiment.
  • the use of extrachromosomal DNA as a target will also lower the nonspecific amplification that can result from using the genomic DNA.
  • the recombinant products will be separated from the chromosomal DNA, and the distribution of the integration sites used in vivo will be determined by the assays described earlier in Examples 2-5.
  • Zinc Finger Domain is Substituted by a DNA Binding Domain
  • the present example provides another potential approach for engineering integration proteins having site-specificity for binding to DNA.
  • the present inventors envision the replacement of the N-terminal zinc-finger motif of integrase (from about amino acids 1-50) with other zinc-finger protein domains having binding specificity for DNA sequences (Berg, 1990; Klug and Rhodes, 1987).
  • the zinc-finger motif of integrase will be deleted and replaced with other zinc-finger motif that recognizes specific DNA sequences.
  • the resulting hybrid protein may retain the integration activity and may gain an added ability to recognize specific DNA sequences.
  • the integrase-LexA fusion protein of the present invention has binding specificity for an E. coli LexA nucleotide sequence and would not normally be expected to bind specifically to a human DNA sequence. However, considering the size of the human genome of 3 billion bp, the integrase-LexA protein may bind to several LexA-like sequences in the genome. Integration into these LexA-like sequences may be harmless; alternatively, the LexA-binding sequence may be introduced into a desired target site for specific integration.
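The remark that a 3 billion bp genome may contain several LexA-like sequences can be illustrated with a rough expectation for chance matches. This is a back-of-the-envelope sketch under stated assumptions: bases are treated as equally frequent and independent, only exact matches are counted, and the site lengths used (12, 16 and 20 bp) are illustrative values, not lengths given in the text. The function name is hypothetical.

```python
def expected_chance_sites(genome_bp: int, site_length_bp: int,
                          both_strands: bool = True) -> float:
    """Expected number of exact matches to one specific sequence in random DNA.

    Each position matches with probability 1 / 4**site_length_bp; the number of
    start positions is approximated by the genome length.
    """
    strands = 2 if both_strands else 1
    return strands * genome_bp / 4 ** site_length_bp


HUMAN_GENOME_BP = 3_000_000_000  # figure quoted in the text

for length in (12, 16, 20):
    print(length, expected_chance_sites(HUMAN_GENOME_BP, length))
```

For an exact 16-bp site the expectation is on the order of one occurrence per genome, and tolerating mismatches would raise it, which fits the statement that several LexA-like sequences may be present.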
  • the present example addresses this aspect and provides for further integrase constructs, for example, a construct where an N-terminal integrase catalytic domain is fused to a protein domain having affinity for a transcription factor, and a construct where an integrase is covalently bonded to an oligonucleotide which provides binding specificity for its complementary nucleotide sequence.
  • RNA polymerase III (Pol III) is responsible for transcribing tRNA and some small nuclear RNA genes. Transcription by Pol III involves the polymerase itself and several protein factors called transcription factors, such as TFIIIA, TFIIIB, and TFIIIC. TFIIIB is believed to be recruited to the transcription complex by its interaction with TFIIIC and Pol III. TFIIIB itself is a large complex and contains many subunits. One subunit is BRF (IIIB-related factor). The present inventor envisions a fusion protein consisting of integrase and BRF.
  • the fusion protein will be brought into close proximity of Pol III transcribed genes through protein-protein interaction (BRF and TFIIIC and Pol III).
  • Advantages of such an approach are i) protein-protein interaction may be more specific than protein-DNA interaction, ii) integration would likely be directed towards regions that are transcribed by Pol III, which most likely are tRNA genes. These regions are ideal sites because i) they are transcriptionally active, and ii) tRNA genes are in multiple copies, and disruption of one tRNA gene by integration should not have a detrimental effect on the cell.
  • Integrase Covalently Linked with an Oligonucleotide. In this approach, an oligonucleotide will be covalently linked to an amino acid residue of integrase, possibly through an amide bond with aspartic acid or glutamic acid, or a disulfide linkage with a cysteine. Site-directed integration will be achieved by base-pairing between the oligonucleotide of the integrase-linked oligonucleotide and the complementary region of the genome.
  • the main advantage of this strategy is that any region of the genome can be targeted as long as some information on the DNA sequence of the desired region is known. This approach is particularly applicable to ex vivo gene therapy.
  • the present example provides a description of potential uses of the herein described site-specific integration of DNA into stem or cord blood cells ex vivo.
  • Stem cells are obtained from a patient in need of gene therapy, for example, a patient having cancer, particularly leukemia, AIDS, or a genetic disease.
  • Cord blood cells are obtained from placenta.
  • Stem cells or cord blood cells are treated with a replication-defective retrovirus harvested from helper cells encoding a fusion protein of the present invention and with donor DNA. Treated stem or cord blood cells are transferred to the patient to provide a transplant.
  • Donor DNA in this case may be genes for therapeutic replacement of defective genes, genes for providing a therapeutic function, or DNA for disruption of an undesirable gene. Examples include providing a gene encoding clotting factor VIII or IX for hemophilia, the ada gene for adenosine deaminase deficiency, a gene encoding the chloride channel for cystic fibrosis, or an LDL receptor encoding gene for hypercholesterolemia.
  • compositions and methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the composition, methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Zoology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medicinal Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Wood Science & Technology (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Biotechnology (AREA)
  • Toxicology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Biophysics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Peptides Or Proteins (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Immobilizing And Processing Of Enzymes And Microorganisms (AREA)
  • Saccharide Compounds (AREA)

Abstract

The invention relates to fusion proteins capable of integrating a donor DNA molecule into a target DNA molecule at or near a target nucleotide sequence. The fusion proteins comprise a retroviral integrase catalytic domain linked at its COOH terminus to a DNA-binding protein domain having binding affinity for the target nucleotide sequence. The invention also relates to nucleic acids encoding said fusion proteins, and to vectors, expression systems and host cells carrying nucleic acids encoding these fusion proteins. The invention further relates to a method of integrating a donor DNA molecule into a target DNA molecule at or near a target nucleotide sequence, where the integration may result, for example, in the introduction by gene therapy of a gene encoding a therapeutic agent, or in an inactivated oncogene.
EP96944223A 1995-12-01 1996-11-27 Compositions et procedes d'integration directionnelle dans de l'adn Withdrawn EP0871711A1 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US826395P 1995-12-01 1995-12-01
US8263P 1995-12-01
PCT/US1996/019277 WO1997020038A1 (fr) 1995-12-01 1996-11-27 Compositions et procedes d'integration directionnelle dans de l'adn

Publications (1)

Publication Number Publication Date
EP0871711A1 (fr) 1998-10-21

Family

ID=21730660

Family Applications (1)

Application Number Title Priority Date Filing Date
EP96944223A Withdrawn EP0871711A1 (fr) 1995-12-01 1996-11-27 Compositions et procedes d'integration directionnelle dans de l'adn

Country Status (3)

Country Link
EP (1) EP0871711A1 (fr)
AU (1) AU1408597A (fr)
WO (1) WO1997020038A1 (fr)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6855545B1 (en) 1996-10-04 2005-02-15 Lexicon Genetics Inc. Indexed library of cells containing genomic modifications and methods of making and utilizing the same
US7332338B2 (en) 1996-10-04 2008-02-19 Lexicon Pharmaceuticals, Inc. Vectors for making genomic modifications
US6139833A (en) * 1997-08-08 2000-10-31 Lexicon Genetics Incorporated Targeted gene discovery
SI1546322T1 (sl) * 2002-07-24 2011-05-31 Manoa Biosciences Inc Vektorji na osnovi transpozona in metode integracije nukleinskih kislin
US20050074865A1 (en) * 2002-08-27 2005-04-07 Compound Therapeutics, Inc. Adzymes and uses thereof

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6150511A (en) * 1995-05-09 2000-11-21 Fox Chase Cancer Center Chimeric enzyme for promoting targeted integration of foreign DNA into a host genome

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9720038A1 *

Also Published As

Publication number Publication date
WO1997020038A1 (fr) 1997-06-05
AU1408597A (en) 1997-06-19

Similar Documents

Publication Publication Date Title
KR100659922B1 (ko) Gnn에 대한 아연 핑거 결합 도메인
CA2442909C (fr) Multimerisation de la proteine vif du vih-1 utilisee comme cible therapeutique
US6010860A (en) Method for site-specific integration of nucleic acids and related products
Chiu et al. Structure and function of HIV-1 integrase
Goulaouic et al. Directed integration of viral DNA mediated by fusion proteins consisting of human immunodeficiency virus type 1 integrase and Escherichia coli LexA protein
US6221355B1 (en) Anti-pathogen system and methods of use thereof
Violot et al. The human polycomb group EED protein interacts with the integrase of human immunodeficiency virus type 1
WO2000062067A9 (fr) Nouvelles molecules de transduction et leurs procedes d'utilisation
JP2010207234A (ja) ハイブリッドおよび単鎖メガヌクレアーゼならびにその使用
CA2362560A1 (fr) Regulation des taux de proteines dans des organismes eucaryotiques
Shibagaki et al. Characterization of feline immunodeficiency virus integrase and analysis of functional domains
EP0871711A1 (fr) Compositions et procedes d'integration directionnelle dans de l'adn
US7709606B2 (en) Interacting polypeptide comprising a heptapeptide pattern and a cellular penetration domain
US5654398A (en) Compositions and methods for inhibiting replication of human immunodeficiency virus-1
US11186614B2 (en) Anti-HIV peptides
Boross et al. Drug targets in human T-lymphotropic virus type 1 (HTLV-1) infection
JP4562290B2 (ja) インテグラーゼn−末端領域を標的としたウイルス感染阻害剤
WO1997006257A1 (fr) Cofacteur cellulaire pour vih rev et htlv rex
WO1997006257A9 (fr) Cofacteur cellulaire pour vih rev et htlv rex
WO1998001155A1 (fr) Compositions et procedes de regulation de l'expression genique du vih
Boulton An Investigation Into the Effect of Myristoylation on the Interactions Between HIV-1 Nef and Cellular Proteins
WO2000040606A2 (fr) Modulation de la replication du vih par l'utilisation de sam68

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19980618

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: CHOW, SAMSON A.

Owner name: THE REGENTS OF THE UNIVERSITY OF CALIFORNIA

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20000601