WO2001053479A2 - Sequences activatrices ii - Google Patents

Info

Publication number
WO2001053479A2
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
molecule
binding
ligand
candidate
Prior art date
Application number
PCT/GB2001/000187
Other languages
English (en)
Other versions
WO2001053479A3 (fr
Inventor
Yen Choo
Christopher Graeme Ullman
Michael Moore
Original Assignee
Sangamo Biosciences, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GB0001578A external-priority patent/GB0001578D0/en
Priority claimed from GB0001582A external-priority patent/GB0001582D0/en
Priority claimed from PCT/GB2000/002071 external-priority patent/WO2000073434A1/fr
Priority claimed from PCT/GB2000/002080 external-priority patent/WO2001000815A1/fr
Priority claimed from GBGB0029901.6A external-priority patent/GB0029901D0/en
Application filed by Sangamo Biosciences, Inc. filed Critical Sangamo Biosciences, Inc.
Priority to AU2001226921A priority Critical patent/AU2001226921A1/en
Publication of WO2001053479A2 publication Critical patent/WO2001053479A2/fr
Publication of WO2001053479A3 publication Critical patent/WO2001053479A3/fr

Classifications

    • CCHEMISTRY; METALLURGY
    • C40COMBINATORIAL TECHNOLOGY
    • C40BCOMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00Libraries per se, e.g. arrays, mixtures
    • C40B40/02Libraries contained in or displayed by microorganisms, e.g. bacteria or animal cells; Libraries contained in or displayed by vectors, e.g. plasmids; Libraries containing only microorganisms or vectors
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K14/00Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C07K14/4701Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
    • C07K14/4702Regulators; Modulating activity
    • C07K14/4703Inhibitors; Suppressors
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/10Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034Isolating an individual clone by screening libraries
    • C12N15/1037Screening libraries presented on the surface of microorganisms, e.g. phage display, E. coli display
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/10Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034Isolating an individual clone by screening libraries
    • C12N15/1055Protein x Protein interaction, e.g. two hybrid selection
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8217Gene switch
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12NMICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09Recombinant DNA-technology
    • C12N15/63Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/82Vectors or expression systems specially adapted for eukaryotic hosts for plant cells, e.g. plant artificial chromosomes (PACs)
    • C12N15/8216Methods for controlling, regulating or enhancing expression of transgenes in plant cells
    • C12N15/8237Externally regulated expression systems
    • C12N15/8238Externally regulated expression systems chemically inducible, e.g. tetracycline
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K2319/00Fusion polypeptide
    • C07K2319/70Fusion polypeptide containing domain for protein-protein interaction
    • C07K2319/71Fusion polypeptide containing domain for protein-protein interaction containing domain for transcriptional activation, e.g. VP16
    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K2319/00Fusion polypeptide
    • C07K2319/80Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor
    • C07K2319/81Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor containing a Zn-finger domain for DNA binding

Definitions

  • This invention relates to molecular switches and methods for identifying and selecting such switches.
  • Particular molecular switches include gene switches that use nucleic acid binding molecules capable of binding a specific nucleic acid sequence (for example, a DNA sequence) in a ligand-dependent manner, and protein switches in which two protein binding partners bind in a manner which is modulatable by a ligand.
  • Such methods optionally make use of array technology.
  • this invention relates to methods for the identification of the ligand-dependent binding molecules as well as identification of ligands.
  • the invention in particular relates to screening of arrays of nucleic acid targets with a known nucleic acid binding molecule, and a known or library of ligands for identification of new molecules which potentially modulate the interaction between nucleic acid and nucleic acid binding molecule.
  • Protein-protein interactions are crucial to almost every physiological and pharmacological process. These interactions often are characterized by very high affinity, with dissociation constants in the low nanomolar to subpicomolar range. Such strong affinity between proteins is possible when a high level of specificity allows subtle discrimination among closely related structures. Proteins can bind to each other through several types of interface, for example, a "surface string" where a portion of the surface of one protein contacts an extended loop of polypeptide chain on a second protein, a "helix-helix" configuration involving two alpha helices, and a "surface-surface" configuration involving the matching of one surface to another. For example, it is known that the SH2 domain binds tightly to a region of a polypeptide chain that contains a phosphorylated tyrosine side chain.
  • Polypeptides can form higher order tertiary structures with like polypeptides (homo-oligomers) or with unlike polypeptides (hetero-oligomers).
  • two identical polypeptides associate to form an active homodimer.
  • An example of this type of association is the natural association of myosin II molecules in the assembly of myosin into filaments.
  • Protein-protein association may be mediated by several factors, including post-translational modifications, by means of which enzymatic activity may be biologically controlled. For example, the phosphorylation state of a protein may cause it to associate with or dissociate from another protein.
  • the phosphorylation state of a protein is thought to be determined by the relative activity of protein kinases which add phosphate and protein phosphatases which remove the phosphate moiety from the protein. For example, it is thought that phosphorylation of myosin II by protein kinases is involved in the priming event leading to dimerization of myosin II monomers and subsequent formation of myosin filaments.
  • Ligand mediated association and dissociation of proteins is also known, in which the ability of a protein to interact with another protein is dependent on the binding of a ligand to one or both proteins.
  • An example of ligand-mediated heterodimer association is described in patent application number WO92/00388.
  • This publication describes an adenosine 3',5'-cyclic monophosphate (cAMP) dependent protein kinase, a four-subunit enzyme composed of two catalytic polypeptides (C) and two regulatory polypeptides (R).
  • the present invention seeks to describe methods of identifying tripartite systems comprising protein switches.
  • Proteins can also interact with nucleic acids, for example, DNA and RNA. Many important biological interactions involve the binding of proteins to nucleic acids. Such interactions are important, for example, in the control of transcription and translation, and include the control and execution of nucleic acid replication, recombination, modification, cleavage, degradation, ligation, splicing, packaging etc.
  • nucleic acids aptamers
  • nucleic acids have been engineered which are capable of binding to proteins and affecting the function of these proteins.
  • Zinc finger proteins are transcriptional regulators of gene expression which may be adapted to regulate a desired gene by modulating the binding specificity of the zinc finger for its target nucleic acid. A number of applications for zinc finger technology have been suggested, including the treatment of diseases, use as reagents for manipulating nucleic acids and the regulation of gene expression.
  • zinc fingers are relatively large polypeptides.
  • regulation of zinc finger activity is difficult, because the amount of zinc finger present in each cell nucleus can only be controlled indirectly, by influencing the level of zinc finger expression, or by varying the amount of protein administered to the cell.
  • upregulation or downregulation of zinc finger activity is slow. Downregulation in particular is dependent on the natural turnover of zinc finger molecules within the cell.
  • In the natural context of these switches, the switching ligand is able to produce a physiological effect by affecting a protein-nucleic acid interaction.
  • the ability to apply gene switch capability to any desired promoter is highly desirable. Most promoters of clinical or commercial significance, however, do not possess regulatory elements which are susceptible to gene switch regulation.
  • the present invention seeks to describe methods of identifying tripartite systems comprising gene switches.
  • a method of selecting one or more components of a switching system comprising: (i) a first molecule and (ii) a second molecule, in which the first molecule binds to the second molecule in a manner modulatable by a ligand, and (iii) a ligand, the method comprising the steps of: (a) determining the degree of binding between a candidate first molecule and a candidate second molecule in the presence of a candidate ligand; (b) determining the degree of binding between the candidate first molecule and the candidate second molecule in the absence of the candidate ligand; (c) identifying a first molecule / second molecule pair in which the binding of the first molecule to the second molecule differs in the presence and absence of a ligand; and (d) optionally isolating and/or identifying the first molecule, the second molecule or the ligand.
  • the degree of binding between each of a plurality of candidate first molecules and a single candidate second molecule is determined substantially simultaneously. Furthermore, the degree of binding between each of a plurality of candidate first molecules and each of a plurality of candidate second molecules may also be determined substantially simultaneously.
  • the plurality of candidate second molecules is provided in the form of an array of candidate second molecules.
  • the plurality of candidate first molecules is provided in the form of an array of candidate first molecules.
  • the degree of binding between the or each candidate first molecule and the or each candidate second molecule is determined substantially simultaneously in the presence and/or absence of each of a plurality of candidate ligands.
  • the plurality of candidate ligands may be provided in the form of an array of candidate ligands.
  • the invention provides for a method of selecting one or more components of a switching system, the switching system comprising: (i) a first molecule and (ii) a second molecule, in which the first molecule binds to the second molecule in a manner modulatable by a ligand, and (iii) a ligand, the method comprising the steps of: (a) determining the degree of binding between one or more candidate first molecules and one or more candidate second molecules in the presence of one or more candidate ligands; (b) determining the degree of binding between the one or more candidate first molecules and the one or more candidate second molecules in the absence of the candidate ligand(s); and (c) identifying a first molecule / second molecule pair in which the binding of the first molecule to the second molecule differs in the presence and absence of a ligand; and (d) optionally isolating and/or identifying a first molecule, a second molecule or a ligand, in which at least one
  • the first molecule may have a higher affinity for the second molecule in the presence of the ligand than in the absence of the ligand.
  • the first molecule component has a higher affinity for the second molecule in the absence of the ligand than in the presence of the ligand.
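Steps (a) to (c) of the selection method amount to a paired comparison of binding with and without the ligand. The following is a minimal Python sketch of that comparison, assuming that binding is reported as a single non-negative signal per candidate pair. The function name `classify_pairs`, the `min_fold` cut-off and the example readings are illustrative assumptions and do not come from the specification.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (candidate first molecule, candidate second molecule)


def classify_pairs(
    signal_plus: Dict[Pair, float],    # binding measured with the ligand
    signal_minus: Dict[Pair, float],   # binding measured without the ligand
    min_fold: float = 2.0,             # assumed cut-off; not given in the source
    floor: float = 1e-6,               # avoids division by zero for blank wells
) -> Dict[Pair, str]:
    """Label each pair whose binding differs with and without the ligand."""
    hits = {}
    for pair, plus in signal_plus.items():
        if pair not in signal_minus:
            continue  # both measurements are needed for a comparison
        minus = signal_minus[pair]
        ratio = max(plus, floor) / max(minus, floor)
        if ratio >= min_fold:
            hits[pair] = "ligand-induced binding"
        elif ratio <= 1.0 / min_fold:
            hits[pair] = "ligand-disrupted binding"
    return hits


# Hypothetical signals for three candidate pairs (not experimental data).
plus = {("A", "site1"): 1.20, ("B", "site1"): 0.10, ("C", "site1"): 0.50}
minus = {("A", "site1"): 0.10, ("B", "site1"): 0.90, ("C", "site1"): 0.55}
result = classify_pairs(plus, minus)
```

With these invented readings, pair A would be reported as binding only with the ligand, pair B as released by the ligand, and pair C as not responsive. A fold-change threshold is only one possible way to decide that binding "differs".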
  • the invention allows for the selection of one or more components of a gene switch.
  • one of the first molecule and the second molecule is a nucleic acid binding molecule and the other of the first molecule and the second molecule is a nucleic acid.
  • a method of selecting one or more components of a gene switch which gene switch comprises (i) a target nucleic acid molecule; (ii) a nucleic acid binding molecule which binds to the target nucleic acid molecule in a manner modulatable by a ligand; and (iii) the ligand, which method comprises: (a) contacting one or more candidate target nucleic acid molecule(s) with one or more candidate nucleic acid binding molecules, in the presence of one or more ligands; (b) selecting a complex comprising a candidate target nucleic acid, a nucleic acid binding molecule and a ligand; (c) optionally isolating and/or identifying the unknown components of the complex; (d) comparing the binding of the nucleic acid binding molecule component of the complex to the target nucleic acid component of the complex in the presence and absence of the ligand component of the complex; and (e) identifying
  • a method of selecting a gene switch which gene switch comprises (i) a target nucleic acid molecule; (ii) a nucleic acid binding molecule which binds to the target nucleic acid molecule in a manner modulatable by a ligand; and (iii) the ligand, which method comprises: (a) identifying a candidate target nucleic acid and a candidate nucleic acid binding molecule, capable of binding to each other, in which: (i) the target nucleic acid is a modified target nucleic acid and the nucleic acid binding molecule is an unmodified nucleic acid binding molecule; or (ii) the target nucleic acid is an unmodified target nucleic acid and the nucleic acid binding molecule is a modified nucleic acid binding molecule; or (iii) the target nucleic acid is a modified target nucleic acid and the nucleic acid binding molecule is a modified nucleic acid binding molecule;
  • the candidate target nucleic acid and/or a candidate nucleic acid binding molecule are identified in step (a) by contacting one or more target nucleic acid molecule(s) with one or more nucleic acid binding molecules.
  • a plurality of nucleic acid binding molecules preferably an array of nucleic acid binding molecules, may be used.
  • a plurality of nucleic acids preferably an array of nucleic acids, more preferably related to one another by sequence homology, may be used.
  • the candidate target nucleic acid identified in step (a) is a modified nucleic acid molecule
  • the candidate nucleic acid binding molecule identified in step (a) is an unmodified nucleic acid binding molecule.
  • the binding of the unmodified nucleic acid binding molecule to the unmodified nucleic acid is compared in step (c) in the presence and absence of a plurality of candidate ligands, preferably a library of candidate ligands.
  • One, two or all of the plurality of candidate target nucleic acids, the plurality of candidate nucleic acid binding molecules and the plurality of candidate ligands, where present, may be provided in the form of an array.
  • the modified nucleic acid may be selected from a methylated nucleic acid and a phosphorylated nucleic acid.
  • the modified nucleic acid binding molecule may be a modified polypeptide selected from the group consisting of: a polypeptide modified with a ubiquitin moiety, a polypeptide modified with a glycosyl moiety, a polypeptide modified with a fatty acyl moiety, a polypeptide modified with a sentrin moiety, a polypeptide modified with an ADP-ribosyl and a polypeptide modified with a phosphate moiety.
  • the modified nucleic acid is produced by the reaction of a nucleic acid capable of being derivatised together with a modifying moiety. More preferably, the nucleic acid capable of being derivatised contains an amino, thio, oxo or bromo group, or a group that can be chemically or photo-activated.
  • the nucleic acid binding molecule has a higher affinity for the target nucleic acid in the presence of the ligand by virtue of the ligand mimicking the interaction between the modified nucleic acid and the nucleic acid binding molecule, or the interaction between the nucleic acid and the modified nucleic acid binding molecule, or the interaction between the modified nucleic acid and the modified nucleic acid binding molecule, as the case may be.
  • the nucleic acid binding molecule is a polypeptide, which is preferably at least partly derived from a transcription factor, preferably a zinc finger transcription factor.
  • the target nucleic acid is a DNA or RNA.
  • at least one of the candidate nucleic acid binding molecules may comprise a non-naturally occurring nucleic acid binding domain.
  • the candidate nucleic acid binding molecules may be provided as a phage display library.
  • the methods of our invention allow the identification of any or all of the first molecule, the second molecule, and the ligand.
  • a ligand is isolated and/or identified.
  • a nucleic acid binding molecule is isolated and/or identified.
  • the ligand may be a nucleic acid binding ligand, preferably selected from Distamycin A, Actinomycin D and echinomycin.
  • a method of selecting a ligand which is capable of modulating the interaction between a nucleic acid binding molecule and a target nucleic acid comprising the steps of: (a) providing one or a plurality of candidate ligands; (b) determining the degree of binding between the nucleic acid binding molecule and the target nucleic acid sequence in the presence of the or each candidate ligand; (c) determining the degree of binding between the nucleic acid binding molecule and the target nucleic acid sequence in the absence of the or each candidate ligand; and (d) identifying a ligand for which the binding of the nucleic acid binding molecule to the target nucleic acid sequence differs in the presence and absence of the ligand.
  • the degree of binding between each of a plurality of target nucleic acids and a single candidate transcription factor may be determined substantially simultaneously. Furthermore, the degree of binding between each of a plurality of target nucleic acids and each of a plurality of nucleic acid binding molecules is determined substantially simultaneously. Furthermore, one, two or all of the plurality of candidate ligands, the plurality of nucleic acids and/or the plurality of nucleic acid binding molecules, where present, is provided in the form of an array.
  • the nucleic acid binding molecule is a transcription factor and the target nucleic acid is a DNA sequence.
  • As a fifth aspect of the present invention, there is provided a nucleic acid binding molecule, a target nucleic acid or a ligand selected by a method according to the first, second, third or fourth aspects of the invention.
  • each of the first and second molecules comprises a polypeptide.
  • One or both of the first and second molecules may comprise a polypeptide binding domain.
  • One or both of the first and second molecules may comprise an immunoglobulin molecule, preferably an antibody molecule.
  • the ligand may also comprise an immunoglobulin molecule, preferably an antibody molecule.
  • one or both of the first and second molecules is a nucleic acid binding protein capable of binding to nucleic acid. More preferably, the nucleic acid binding protein binds to nucleic acid in a manner modulatable by the other of the first and second molecule.
  • the present invention in a sixth aspect, provides a polypeptide or ligand selected by a method according to the above protein switch embodiment.
  • a nucleic acid binding molecule, a target nucleic acid, and/or a ligand according to fifth or sixth aspect of the invention, or as selected by a method according to the first to fourth aspects of the invention in a method of regulating a biological process, in which the biological process involves binding between a nucleic acid binding molecule and a target nucleic acid.
  • the binding between a target nucleic acid and a nucleic acid binding molecule is dependent on the presence or absence of a ligand.
  • nucleic acid binding molecule a target nucleic acid, and/or a ligand according to fifth or sixth aspect of the invention, or as selected by a method according to the first to fourth aspects of the invention, in a method of regulating transcription or translation from a nucleic acid sequence comprising a target nucleic acid to which a nucleic acid binding molecule binds in a manner modulatable by the ligand.
  • a method of modulating a biological process affecting one or more genes comprising administering a nucleic acid binding molecule and/or a ligand according to the fifth or sixth aspect of the invention, or as selected by a method according to the first to fourth aspects of the invention, to a cell, in which the regulatory sequences of said genes comprise a target nucleic acid according to the fifth aspect of the invention, or as selected by a method according to the first to fourth aspects of the invention.
  • a method of modulating a biological process affecting one or more nucleotide sequences of interest in a host cell which host cell comprises a nucleic acid sequence capable of directing the expression of a nucleic acid binding molecule and a target nucleic acid sequence to which the nucleic acid binding molecule binds in a manner modulatable by a ligand, which method comprises administering said ligand to the cell and wherein the nucleic acid binding molecule is heterologous or endogenous to the host cell, in which one or more of the nucleic acid binding molecule, the ligand and the target nucleic acid sequence is/are according to fifth or sixth aspect of the invention, or as selected by a method according to the first to fourth aspects of the invention.
  • the biological process is selected from the group consisting of: transcription, translation, phosphorylation, methylation, replication, restriction, modification, ligation, transport, degradation, editing, splicing, integration and recombination.
  • the host cell is a plant cell. More preferably, the plant cell is part of a plant and the target sequence is part of a regulatory sequence to which the nucleotide sequence of interest is operably linked, said regulatory sequence being preferentially active in the male or female organs of the plant.
  • a non-human transgenic organism comprising a target nucleic acid sequence and a nucleic acid sequence capable of directing the expression of a nucleic acid binding molecule which binds to the target nucleic acid in a manner modulatable by a ligand wherein the target nucleic acid sequence and/or nucleic acid sequence are heterologous to the organism, in which one or more of the nucleic acid binding molecule, the ligand and the target nucleic acid sequence is/are according to the fifth or sixth aspect of the invention, or in which one or more of the nucleic acid binding molecule, the ligand and the target nucleic acid sequence is/are as selected by a method according to the first to fourth aspects of the invention.
  • the transgenic non-human organism is a plant.
  • a method of selecting a ligand which is capable of modulating the interaction between a nucleic acid binding molecule and a nucleic acid target comprising contacting a nucleic acid binding molecule with one or a plurality of nucleic acid targets in the form of an array, together with one or a plurality of candidate target nucleic acid molecules in the form of an array; selecting a complex comprising a candidate target nucleic acid, a nucleic acid binding molecule and a ligand; optionally isolating and/or identifying the unknown components of the complex; comparing the binding of the nucleic acid binding molecule component of the complex to the target nucleic acid component of the complex in the presence and absence of the ligand component of the complex; and selecting complexes where said binding differs in the presence and absence of the ligand component.
  • a ligand selected according to a method according to the twelfth aspect of the invention.
  • Figure 1 shows a graph of the effect of Distamycin A concentration on binding of two different phage (clone 3 (3/2F) and clone 4 (4/5F)) to the DNA sequence AAAAAGGCG.
  • the small molecule causes phage binding to DNA.
  • Figure 2 shows a graph of the effect of Actinomycin D concentration on binding of two different phage (AD clone 1 and 6) to the DNA sequence AGCTTGGCG.
  • the small molecule causes phage binding to DNA.
  • Figure 3 shows four different phage (0.4/1, 0.4/2, 0.4/4 and 0.4/5) binding to the randomised DNA oligo YRYRYGGCG (where Y is C or T and R is G or A) in the presence, but not in the absence, of echinomycin (EM).
  • Figure 4 shows the binding site signature of phage 0.4/4 selected using the randomised DNA sequence (Y1)(R2)(Y3)(R4)(Y5)GGCG.
  • the phage has a preference for the DNA sequence (T)(G/A)(C)(G/A)(T) in the presence of echinomycin.
  • Figure 5 shows binding of the phage 0.4/4 to three related DNA sequences, TACGTGGCG, TGTATGGCG and CGTACGGCG, as a function of echinomycin concentration.
  • the first DNA site contains the optimal binding sequence as revealed by the binding site signature.
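The relationship between the randomised oligo of Figure 3, the binding site signature of Figure 4 and the three sites of Figure 5 can be checked mechanically. The sketch below is in Python and uses only the degeneracy codes defined in the text (Y is C or T, R is G or A). The helper names `expand` and `matches`, and the shorthand `TRCRTGGCG` for the signature (T)(G/A)(C)(G/A)(T) followed by GGCG, are assumptions made for illustration.

```python
from itertools import product

# Degeneracy codes used in the text: Y is C or T, R is G or A.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "GA"}


def expand(degenerate: str) -> list:
    """Return every concrete sequence encoded by a degenerate oligo."""
    choices = [IUPAC[code] for code in degenerate]
    return ["".join(combo) for combo in product(*choices)]


def matches(sequence: str, pattern: str) -> bool:
    """True if each base of `sequence` is allowed by `pattern`."""
    return len(sequence) == len(pattern) and all(
        base in IUPAC[code] for base, code in zip(sequence, pattern)
    )


library = expand("YRYRYGGCG")   # randomised oligo from Figure 3
signature = "TRCRTGGCG"         # assumed shorthand for the Figure 4 signature
sites = ["TACGTGGCG", "TGTATGGCG", "CGTACGGCG"]  # sites from Figure 5
preferred = [site for site in sites if matches(site, signature)]
```

Five two-fold degenerate positions give 32 concrete sequences in `library`, all three Figure 5 sites are members of it, and only TACGTGGCG satisfies the signature, which agrees with the statement that the first site contains the optimal binding sequence.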
  • Figure 6 shows a graph of the effect of ligand concentration on binding of two different phage to specific DNA sequences. In this case, the respective phage are dissociated from the DNA in the presence of distamycin A or actinomycin D.
  • Figures 7 to 12 are referred to in Example 14.
  • Figure 7 shows the layout of the 96-well stock plate of oligos used in the arrays.
  • Figure 8 shows the layout of the 96-well assays for a single zinc finger phage arrayed against a single drug.
  • Figure 9 shows a graph of the sensitivity of actinomycin D binding phage clone 1 to actinomycin D against the array of oligos.
  • Figure 10 shows a graph of the sensitivity of echinomycin binding phage 0.4/4 to echinomycin against the array of oligos.
  • Figure 11 shows the layout of the 384-well assay for a single zinc finger phage against multiple drugs.
  • the white square in each quadrant contains no drug whereas the shaded squares contain drug.
  • a different shade represents a different drug with the square in the top right of each quadrant being distamycin A (DA), bottom right of each quadrant being echinomycin (EM) and bottom left being actinomycin D (AD).
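The quadrant layout described for Figure 11 can be written as a mapping from a 96-well stock position to four 384-well positions. The Python sketch below rests on two assumptions that the text does not state outright: each quadrant is the 2 x 2 block of 384-well positions derived from one well of the 96-well stock plate, and the no-drug well sits at the top left of the quadrant (inferred by elimination from the three named positions). The function name `quadrant_wells` is hypothetical.

```python
import string

# Offsets (row, column) inside one 2 x 2 quadrant.
DRUG_BY_OFFSET = {
    (0, 0): None,             # top left: no drug (inferred, see lead-in)
    (0, 1): "distamycin A",   # top right
    (1, 1): "echinomycin",    # bottom right
    (1, 0): "actinomycin D",  # bottom left
}


def quadrant_wells(well_96: str) -> dict:
    """Map a 96-well position such as 'A1' to its four 384-well positions."""
    row = string.ascii_uppercase.index(well_96[0].upper())  # A-H -> 0-7
    col = int(well_96[1:]) - 1                              # 1-12 -> 0-11
    if not (0 <= row < 8 and 0 <= col < 12):
        raise ValueError(f"not a 96-well position: {well_96}")
    wells = {}
    for (d_row, d_col), drug in DRUG_BY_OFFSET.items():
        letter = string.ascii_uppercase[2 * row + d_row]    # 384-well rows A-P
        wells[f"{letter}{2 * col + d_col + 1}"] = drug      # columns 1-24
    return wells


layout = quadrant_wells("B3")
```

Under these assumptions, stock well B3 occupies 384-well positions C5 (no drug), C6 (distamycin A), D6 (echinomycin) and D5 (actinomycin D).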
  • Figure 12 shows a graph of the sensitivity of distamycin A binding phage clone 3 (3/2F) to the 3 drugs against the array of oligos in a 384-well assay.
  • the array follows the layout of Figure 11.
  • Figure 13 shows a graph of the sensitivity of distamycin A binding phage clone 3
  • Figure 14 shows a graph of the sensitivity of distamycin A binding phage clone 3 (3/2F) to echinomycin against the array of oligos. These data were extracted from the 384-well assay in Figure 12.
  • Figure 15 shows a graph of the sensitivity of distamycin A binding phage clone 3 (3/2F) to actinomycin D against the array of oligos. These data were extracted from the 384-well assay in Figure 12.
  • the term 'modulatable by' is used to indicate that binding of the first molecule to the second molecule can be modulated or affected by the ligand.
  • the ligand modulates or affects the binding of the nucleic acid binding molecule to the target nucleic acid, and (as applied to a protein switch), the binding of the two polypeptide molecules is modulated or affected by the ligand.
  • the ligand can modulate, affect, regulate, adjust, alter, or vary the binding of the first molecule to the second molecule.
  • 'isolating' in the context of the invention, refers to the act of removing one or more components or molecules from a sample of candidate molecules which are used in the methods disclosed herein. Alternatively, 'isolating' means deducing the identity of the molecule, though it may not be physically separated from a mixture.
  • the term 'complex' is used to describe an association between a DNA and one or more molecules as defined herein, or between a polypeptide molecule and one or more molecules. In the case of a polypeptide, these molecules may include another polypeptide molecule and/or a ligand molecule.
  • Nucleic acids will in general be RNA or DNA, double stranded or single stranded. RNA may be at least partially double-stranded in the context of the present invention.
  • references to "DNA” mean deoxyribonucleic acid in a literal sense.
  • An array may be defined as an orderly arrangement of samples. Such samples may include nucleic acids (including oligonucleotides, double stranded and single stranded DNA, cDNAs, mRNAs, whole chromosomes, etc), proteins (including polypeptides), or any other molecules (such as ligands). Where reference is made to an "array" in this document, such references should be understood to include references to alternative terminologies for the same or similar technology. Thus, arrays are also known as: biochips, chips (e.g., DNA chips), microarrays, gene arrays, genome chips and GeneChip®s (Affymetrix, Inc).
  • "DNA microarray(s)" and "DNA chip(s)" are used interchangeably.
  • array also includes microfluidics-based chips or lab-on-chip systems (as disclosed in for example US Patent 5,750,015).
  • This invention relates to the selection of one or more components of a gene switch.
  • the method involves the use of arrays of any or all of the components.
  • the term "gene switch" is used herein to describe a multiple component system comprising (i) a target nucleic acid molecule; (ii) a nucleic acid binding molecule which binds to the target nucleic acid molecule in a manner modulatable by a ligand; and (iii) the ligand.
  • the nucleic acid is a DNA, and preferably, the nucleic acid binding molecule is a polypeptide.
  • the nucleic acid binding molecule may or may not comprise a transcriptional effector domain, especially when part of the assay procedure. However, since ultimately the gene switch will be used to regulate transcription from one or more promoters, the nucleic acid binding molecule may need to be modified to include a transcriptional activator or repressor domain, if one is not already present. It is noted however that other effector domains, e.g. a nuclease domain such as FokI or an integrase domain such as from HIVb, may be used to modulate gene expression indirectly, as described in further detail below.
  • a nucleic acid binding molecule, a nucleic acid and a ligand, in which binding between the nucleic acid binding molecule and the nucleic acid is modulatable by the ligand.
  • Some of the methods disclosed make use of single nucleic acid binding molecule species, single nucleic acid species and a single ligand species.
  • Other methods make use of a plurality of nucleic acid binding molecule species, a plurality of nucleic acid species and a plurality of ligand species.
  • Yet other methods make use of combinations of the above, e.g., a plurality of nucleic acid species, a plurality of nucleic acid binding molecule species and a single ligand species.
  • the component is preferably provided in the form of a library.
  • Various libraries may be used and are disclosed in detail below.
  • the invention encompasses methods of selecting one or more components of a gene switch in which one, two or all three of the nucleic acid binding molecules, target nucleic acids and candidate ligands are in the form of an array.
  • the methods may involve: arrayed nucleic acid binding molecules and arrayed target nucleic acids; arrayed nucleic acid binding molecules and arrayed candidate ligands; or arrayed target nucleic acids and arrayed candidate ligands.
  • the invention includes the use of all three components in the form of arrays, i.e., arrayed nucleic acid binding molecules, arrayed target nucleic acids and arrayed candidate ligands.
  • the ligand may be capable of binding to the nucleic acid binding molecule.
  • the ligand may be capable of binding to the nucleic acid.
  • the ligand may be capable of binding to each of the nucleic acid and the nucleic acid binding molecule.
  • the nucleic acid is a DNA and the nucleic acid binding molecule is a DNA binding polypeptide (DNA binding protein), such as a transcription factor.
  • the nucleic acid may be an RNA, as disclosed in further detail below.
  • one nucleic acid binding molecule species is used, in conjunction with a plurality of candidate ligand species in the form of an array.
  • Either one target nucleic acid or a plurality of target nucleic acid species may be used, in which case the target DNAs may be in the form of an array.
  • our methods make use of modified nucleic acid or polypeptide to select gene switches.
  • Such a method involves firstly identifying a candidate target nucleic acid and a candidate nucleic acid binding molecule which are capable of binding to each other. Either or both of the candidate target nucleic acid and the candidate nucleic acid binding molecule are in a modified form, as explained in further detail below.
  • the target nucleic acid may be a modified target nucleic acid and the nucleic acid binding molecule is an unmodified nucleic acid binding molecule.
  • the target nucleic acid is an unmodified target nucleic acid and the nucleic acid binding molecule is a modified nucleic acid binding molecule.
  • the target nucleic acid may be a modified target nucleic acid and the nucleic acid binding molecule is a modified nucleic acid binding molecule.
  • Unmodified versions of the or each modified component in the binding pair are then identified. Finally, the method involves comparing the binding of the unmodified nucleic acid binding molecule to the unmodified nucleic acid in the presence and absence of a candidate ligand, and selecting complexes where the binding differs in the presence and absence of a ligand.
  • the candidate target nucleic acid and the candidate nucleic acid binding molecule may be in the form of a single species.
  • the candidate target nucleic acid and the candidate nucleic acid binding molecule may be molecules known to bind to each other.
  • the binding pair may comprise a nucleic acid binding molecule such as a polypeptide (for example, a transcription factor) known to bind to a modified nucleic acid sequence (such as a methylated DNA), or a modified polypeptide known to bind to a nucleic acid sequence.
  • the binding pair may be identified through screening one or a plurality of candidate target nucleic acids against one or a plurality of candidate nucleic acid binding molecules, either or both of which may comprise known or unknown species. Where a plurality of a candidate target nucleic acid or a candidate nucleic acid binding molecule is involved, this is preferably in the form of a library of the component, more preferably in the form of an array of that component.
  • Assays have been devised for selecting RNA binding polypeptides.
  • One such assay, the TRAP assay, is useful for determining whether a polypeptide interacts with RNA in vivo, and is based on translational repression of a reporter mRNA encoding green fluorescent protein by an RNA-binding protein for which a cognate binding site has been introduced into the 5' untranslated region.
  • An in vitro variation of the TRAP assay, utilising coupled transcription and translation, may also be used.
  • In vitro selection of RNA-BP variants that bind to a target RNA is also described in Laird-Offringa and Belasco, Methods Enzymol, 261.
  • a modified nucleic acid may be screened against a library, or an array, of randomised nucleic acid binding molecules.
  • the modified nucleic acid may be modified (for example, covalently modified) with any suitable moiety, which may be a drug, a ligand, a chemical group or other small molecule. The moiety is thus attached to the nucleic acid in a permanent or semi-permanent manner.
  • the nucleic acid binding molecule may be randomised by means known in the art and described in detail below. Binding between the modified nucleic acid and the nucleic acid binding molecules is assessed to select those nucleic acid binding molecules which bind to the modified nucleic acid (i.e., to select one or more binding pairs).
  • An unmodified nucleic acid is then provided (i.e., a nucleic acid which lacks the attached moiety).
  • Such an unmodified nucleic acid may or may not bind to those nucleic acid binding molecules identified previously as being capable of binding to the modified nucleic acid. More often than not, the unmodified nucleic acid does not bind to the nucleic acid binding molecules identified previously.
  • Binding between the unmodified nucleic acid and each of the nucleic acid binding molecules is then assessed in the presence of a ligand, and those nucleic acid binding molecules are selected which are capable of binding to the unmodified nucleic acid in the presence of ligand.
  • Such an unmodified nucleic acid, together with the selected nucleic acid binding molecule and the ligand, forms a gene switch; i.e., the interaction between the nucleic acid and the nucleic acid binding molecule is modulatable by the presence or absence of a ligand.
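  • The selection steps described above (bind the modified nucleic acid, then test the unmodified nucleic acid with and without ligand) can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the disclosed protocol: binding is reduced to a yes/no result, and the function `select_switch_binders` and its two callback parameters are hypothetical names standing in for real binding assays.

```python
def select_switch_binders(candidates, binds_modified, binds_unmodified):
    """Return candidates that behave as ligand-dependent binders.

    candidates        -- iterable of candidate binding molecule identifiers
    binds_modified    -- callable(candidate) -> bool, binding to the
                         modified nucleic acid (hypothetical assay result)
    binds_unmodified  -- callable(candidate, ligand_present) -> bool,
                         binding to the unmodified nucleic acid
    """
    # Step 1: keep molecules that bind the modified nucleic acid.
    binding_pairs = [c for c in candidates if binds_modified(c)]

    # Step 2: test each against the unmodified nucleic acid and keep
    # those that bind with ligand present but not with ligand absent.
    selected = []
    for candidate in binding_pairs:
        with_ligand = binds_unmodified(candidate, True)
        without_ligand = binds_unmodified(candidate, False)
        if with_ligand and not without_ligand:
            selected.append(candidate)
    return selected
```

  • A molecule that binds the unmodified nucleic acid regardless of ligand is excluded here, since its binding is not modulated by the ligand.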
  • a similar selection for one or more components of a gene switch may be performed using a modified nucleic acid binding molecule and a library of randomised nucleic acids.
  • a plurality of modified nucleic acid species may be screened against a single nucleic acid binding molecule species, or a single modified nucleic acid species may be screened against a plurality of nucleic acid binding molecule species.
  • a plurality of nucleic acid species may be screened against a single modified nucleic acid binding molecule species, or a single nucleic acid species may be screened against a plurality of modified nucleic acid binding molecule species.
  • a plurality of modified nucleic acid species may be screened against a plurality of nucleic acid binding molecule species, or a plurality of nucleic acid species may be screened against a plurality of modified nucleic acid binding molecule species.
  • a plurality of modified nucleic acid species may be screened against a plurality of modified nucleic acid binding molecule species. Where a plurality of species is involved, this is preferably in the form of a randomised library, such as a combinatorial library, and preferably in the form of an array of molecules.
  • Although the above description refers to determination of binding in the presence of a single ligand, it will be appreciated that a plurality of ligands may be used.
  • the plurality of ligands is in the form of a library of ligands, more preferably in the form of a combinatorial library of ligands, most preferably in the form of an array of ligands.
  • the modification to the nucleic acid may be any known modification, such as methylation or phosphorylation.
  • the modification to the nucleic acid binding molecule may be any known modification such as ubiquitination, glycosylation, fatty acylation, sentrinisation, ADP-ribosylation or phosphorylation.
  • the invention relates to protein switches, their components and selection of the protein switch and selection of any or all of the components of the switch.
  • Array technology may be utilised in this aspect of the invention, in that one, two or all three of the first polypeptide, second polypeptide and ligand may be in the form of an array.
  • protein switch is used herein to describe a multiple component system comprising (i) a first polypeptide molecule; (ii) a second polypeptide molecule which binds to the first polypeptide molecule in a manner modulatable by a ligand; and (iii) the ligand.
  • first and second polypeptide molecules may be in the form of a plurality of polypeptide molecules, preferably in the form of an array of polypeptide molecules.
  • the first and second polypeptides may be screened against one or a plurality of ligands, preferably a library of ligands, most preferably an array of ligands.
  • this aspect of the invention encompasses the use of: arrayed first polypeptide and arrayed second polypeptide; and arrayed first and/or second polypeptide and arrayed candidate ligands.
  • the invention includes the use of all three components in the form of arrays, i.e., arrayed first polypeptide, arrayed second polypeptide and arrayed candidate ligands.
  • Array technology overcomes the disadvantages with traditional methods in molecular biology, which generally work on a "one gene in one experiment” basis, resulting in low throughput and the inability to appreciate the "whole picture” of gene function.
  • the major applications for array technology include the identification of sequence (gene / gene mutation) and the determination of expression level (abundance) of genes.
  • Gene expression profiling may make use of array technology, optionally in combination with proteomics techniques (Celis et al, 2000, FEBS Lett, 480(1):2-16;
  • any library may be arranged in an orderly manner into an array, by spatially separating the members of the library.
  • suitable libraries for arraying include nucleic acid libraries (including DNA, cDNA, oligonucleotide, etc libraries), peptide, polypeptide and protein libraries, as well as libraries comprising any molecules, such as ligand libraries, among others. Accordingly, where reference is made to a "library” in this document, unless the context dictates otherwise, such reference should be taken to include reference to a library in the form of an array.
  • the samples are generally fixed or immobilised onto a solid phase, preferably a solid substrate, to limit diffusion and admixing of the samples.
  • libraries of ligands may be prepared.
  • the libraries may be immobilised to a substantially planar solid phase, including membranes and non-porous substrates such as plastic and glass.
  • the samples are preferably arranged in such a way that indexing (i.e., reference or access to a particular sample) is facilitated.
  • the samples are applied as spots in a grid formation. Common assay systems may be adapted for this purpose.
  • an array may be immobilised on the surface of a microplate, either with multiple samples in a well, or with a single sample in each well.
  • the solid substrate may be a membrane, such as a nitrocellulose or nylon membrane (for example, membranes used in blotting experiments).
  • Alternative substrates include glass, or silica based substrates.
  • the samples are immobilised by any suitable method known in the art, for example, by charge interactions, or by chemical coupling to the walls or bottom of the wells, or the surface of the membrane.
  • Other means of arranging and fixing may be used, for example, pipetting, drop-touch, piezoelectric means, ink-jet and bubblejet technology, electrostatic application, etc.
  • photolithography may be utilised to arrange and fix the samples on the chip.
  • the samples may be arranged by being "spotted" onto the solid substrate; this may be done by hand or by making use of robotics to deposit the sample.
  • arrays may be described as macroarrays or microarrays, the difference being the size of the sample spots.
  • Macroarrays typically contain sample spot sizes of about 300 microns or larger and may be easily imaged by existing gel and blot scanners.
  • the sample spot sizes in microarrays are typically less than 200 microns in diameter and these arrays usually contain thousands of spots.
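  • As a small illustration, the size distinction above can be written as a classifier. The thresholds are the approximate figures given in the text ("about 300 microns or larger", "typically less than 200 microns"); the function name and the "intermediate" label for sizes between the two figures are assumptions, since the text does not classify that range.

```python
def classify_array(spot_diameter_um):
    """Classify an array by sample spot diameter in microns."""
    if spot_diameter_um <= 0:
        raise ValueError("spot diameter must be positive")
    if spot_diameter_um >= 300:
        return "macroarray"      # about 300 microns or larger
    if spot_diameter_um < 200:
        return "microarray"      # typically less than 200 microns
    return "intermediate"        # 200-300 microns: not classified in the text
```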
  • microarrays may require specialized robotics and imaging equipment, which may need to be custom made. Instrumentation is described generally in a review by Cortese, 2000, The Engineer 14[11]:26.
  • Arrays of peptides may also be synthesised on a surface in a manner that places each distinct library member (e.g., unique peptide sequence) at a discrete, predefined location in the array.
  • the identity of each library member is determined by its spatial location in the array.
  • the locations in the array where binding interactions between a predetermined molecule (e.g., a target or probe) and reactive library members occur is determined, thereby identifying the sequences of the reactive library members on the basis of spatial location.
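  • The idea that a library member is identified purely by its position can be sketched as a lookup. This is an illustrative sketch, assuming the array layout is recorded as a mapping from (row, column) to sequence and that binding is read out as a numeric signal per position; `reactive_members`, its parameters and the threshold are hypothetical.

```python
def reactive_members(layout, signal, threshold):
    """Identify reactive library members from their array positions.

    layout    -- dict mapping (row, col) -> library member sequence
    signal    -- dict mapping (row, col) -> measured binding signal
    threshold -- minimum signal counted as a binding interaction (assumed)
    """
    hits = []
    for position, value in signal.items():
        # The sequence is never read from the sample itself; it is
        # recovered from the known layout of the array.
        if value >= threshold and position in layout:
            hits.append((position, layout[position]))
    return sorted(hits)
```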
  • targets and probes may be labelled with any readily detectable reporter, for example, a fluorescent, bioluminescent, phosphorescent, radioactive, etc reporter.
  • Such reporters, their detection, coupling to targets/probes, etc are discussed elsewhere in this document. Labelling of probes and targets is also disclosed in Shalon et al., 1996, Genome Res 6(7):639-45.
  • Formats of DNA arrays are as follows:
  • in one format, probe cDNA (500-5,000 bases long) is immobilised on a solid surface such as glass by robot spotting and exposed to a set of targets either separately or in a mixture. This method is widely considered as having been developed at Stanford University (Ekins and Chu, 1999, Trends in Biotechnology, 17, 217-218).
  • in another format, an array of oligonucleotide (20 to 25-mer oligos) or peptide nucleic acid (PNA) probes is synthesized either in situ (on-chip) or by conventional synthesis followed by on-chip immobilization.
  • the array is exposed to labeled sample DNA and hybridized, and the identity/abundance of complementary sequences is determined.
  • a DNA chip is sold by Affymetrix, Inc., under the GeneChip® trademark.
  • the raw data from a microarray experiment typically are images, which need to be transformed into gene expression matrices - tables where rows represent for example genes, columns represent for example various samples such as tissues or experimental conditions, and numbers in each cell for example characterize the expression level of the particular gene in the particular sample.
  • These matrices have to be analyzed further, if any knowledge about the underlying biological processes is to be extracted.
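  • A gene expression matrix of the kind described above can be sketched as follows. This is a minimal illustration that starts from already-quantified values rather than raw images; the input format of (gene, sample, level) triples and the function name `expression_matrix` are assumptions made for the example.

```python
def expression_matrix(measurements):
    """Build a gene x sample matrix from (gene, sample, level) triples.

    Returns (genes, samples, matrix), where matrix[i][j] is the level of
    genes[i] in samples[j], or None if that pair was not measured.
    """
    measurements = list(measurements)
    genes = sorted({gene for gene, _, _ in measurements})
    samples = sorted({sample for _, sample, _ in measurements})
    row_of = {gene: i for i, gene in enumerate(genes)}
    col_of = {sample: j for j, sample in enumerate(samples)}

    # Rows represent genes, columns represent samples or conditions.
    matrix = [[None] * len(samples) for _ in genes]
    for gene, sample, level in measurements:
        matrix[row_of[gene]][col_of[sample]] = level
    return genes, samples, matrix
```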
  • proteins, polypeptides, etc may also be immobilised in arrays.
  • antibodies have been used in microarray analysis of the proteome using protein chips (Borrebaeck CA, 2000, Immunol Today 21(8):379-82).
  • Polypeptide arrays are reviewed in, for example, MacBeath and Schreiber, 2000, Science, 289(5485): p. 1760-1763.
  • any, some or all of the components of the gene switch and/or the protein switch may be provided in the form of an array. This is done by for example immobilising one of the components (for example, a nucleic acid) in an array, and exposing the members of the array to the other component (i.e., the nucleic acid binding molecule). Duplicates are set up, in which in one set, the candidate ligand(s) is present, and in the other set the candidate ligand(s) are absent.
  • Where the second component binds to the first component on the array, it may be detected by means of a suitable probe (e.g., an antibody against the nucleic acid binding molecule).
  • the nucleic acid binding molecule may incorporate a suitable tag, whose presence may easily be detected. Comparison of the pattern of bound probe in the duplicates allows the determination of binding in the presence and absence of the ligand. It will be appreciated that, with the use of arrays of one, several or all of the components of the gene switch, at no time is it necessary to isolate or select complexes of candidate nucleic acid, candidate nucleic acid binding molecule and candidate ligand, in order to identify a gene switch. Similarly, at no time is it necessary to isolate complexes of first and second polypeptide and ligand in order to identify a protein switch. Thus, the strength of binding between the partners may be determined in the presence and absence of the candidate ligand(s) directly.
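  • The comparison of duplicate arrays described above can be sketched as follows. This is an illustration under stated assumptions rather than the disclosed analysis: each duplicate is represented as a mapping from array position to a numeric binding signal, and a position counts as ligand-modulated when the signal changes by at least a chosen fold in either direction. The function name, the `min_fold` default and the `floor` guard against division by zero are all hypothetical.

```python
def ligand_modulated_positions(with_ligand, without_ligand,
                               min_fold=2.0, floor=1e-9):
    """Find array positions whose binding differs with and without ligand.

    with_ligand / without_ligand -- dicts mapping position -> signal
    min_fold -- fold change (up or down) required to call a difference
    floor    -- lower bound applied to signals to avoid dividing by zero

    Returns a dict mapping position -> ratio (with ligand / without).
    """
    hits = {}
    # Only positions present in both duplicates can be compared.
    for position in with_ligand.keys() & without_ligand.keys():
        bound = max(with_ligand[position], floor)
        unbound = max(without_ligand[position], floor)
        ratio = bound / unbound
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            hits[position] = ratio
    return hits
```

  • A ratio above 1 indicates binding that is increased by the ligand and a ratio below 1 indicates binding that is decreased; no complex has to be isolated, because each position already identifies the components involved.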
  • libraries according to the invention may suitably be in the form of combinatorial libraries (also known as combinatorial chemical libraries).
  • a "combinatorial library”, as the term is used in this document, is a collection of multiple species of chemical compounds that consist of randomly selected subunits. According to the invention, combinatorial libraries may be screened for ligands that affect the binding between a nucleic acid binding molecule and a target nucleic acid.
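  • To make the definition concrete, a combinatorial library built from a small set of subunits can be sketched as below. The subunit names are placeholders and the two helper functions are hypothetical; real libraries are made by chemical synthesis, and this sketch only illustrates how the number of species grows with the number of subunits and positions.

```python
import itertools
import random


def enumerate_library(subunits, length):
    """List every compound made of `length` subunits, in order."""
    return ["-".join(combo)
            for combo in itertools.product(subunits, repeat=length)]


def random_member(subunits, length, rng):
    """Draw one compound whose subunits are selected at random."""
    return "-".join(rng.choice(subunits) for _ in range(length))
```

  • For example, three subunits at two positions give 3 x 3 = 9 species, and twenty subunits at six positions would give 20**6 = 64,000,000.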
  • the target nucleic acid may be a known nucleotide sequence of interest, for example, a transcription regulatory element.
  • the nucleic acid binding molecule may be a known nucleic acid binding molecule, for example, a transcription factor.
  • combinatorial libraries of chemical compounds are currently available, including libraries active against proteolytic and nonproteolytic enzymes, libraries of agonists and antagonists of G-protein coupled receptors (GPCRs), libraries active against non-GPCR targets (e.g., integrins, ion channels, domain interactions, nuclear receptors, and transcription factors) and libraries of whole-cell oncology and antiinfective targets, among others.
  • Soluble random combinatorial libraries may be synthesized using a simple principle for the generation of equimolar mixtures of peptides which was first described by Furka (Furka, A. et al, 1988, Xth International Symposium on Medicinal Chemistry, Budapest 1988; Furka, A. et al, 1988, 14th International Congress of Biochemistry, Prague 1988; Furka, A. et al, 1991, Int. J. Peptide Protein Res. 37:487-493). The construction of soluble libraries for iterative screening has also been described (Houghten, R. A. et al, 1991, Nature 354:84-86).
  • K. S. Lam disclosed the novel and unexpectedly powerful technique of using insoluble random combinatorial libraries.
  • Lam synthesized random combinatorial libraries on solid phase supports so that each support had a test compound of uniform molecular structure, and screened the libraries without prior removal of the test compounds from the support by solid phase binding protocols (Lam, K. S. et al, 1991, Nature 354:82-84).
  • a library of candidate ligands or nucleic acid binding molecules may be a synthetic combinatorial library (e.g., a combinatorial chemical library), a cellular extract, a bodily fluid (e.g., urine, blood, tears, sweat, or saliva), or other mixture of synthetic or natural products (e.g., a library of small molecules or a fermentation mixture).
  • a library of candidate ligands, nucleic acid binding molecules or target nucleic acids may include, for example, amino acids, oligopeptides, polypeptides, proteins, or fragments of peptides or proteins; nucleic acids (e.g., antisense; DNA; RNA; or peptide nucleic acids, PNA); aptamers; or carbohydrates or polysaccharides.
  • Each member of the library can be singular or can be a part of a mixture (e.g., a compressed library).
  • the library may contain purified compounds or can be "dirty" (i.e., containing a significant quantity of impurities).
  • Diversity files contain a large number of compounds (e.g., 1000 or more small molecules) representative of many classes of compounds that could potentially result in nonspecific detection in an assay. Diversity files are commercially available or can also be assembled from individual compounds commercially available from the vendors listed above.
  • nucleic acid binding molecule includes any molecule which is capable of binding or associating with nucleic acid.
  • polypeptide binding molecule includes any molecule capable of binding or association with a polypeptide. This binding or association may be via covalent bonding, via ionic bonding, via hydrogen bonding, via Van-der-Waals bonding, or via any other type of reversible or irreversible association.
  • nucleic acid is usually DNA.
  • Reference to 'nucleic acid binding molecule' is to be taken to include reference to 'DNA binding molecule'.
  • DNA binding molecule is to be construed as including any molecule which is capable of binding or associating with DNA.
  • the nucleic acid may be RNA, as set out below, or any other form of nucleic acid, including completely or partially synthetic nucleic acids.
  • the nucleic acid binding molecule is a polypeptide.
  • peptide refers to a polymer in which the monomers are amino acids and are joined together through peptide or disulfide bonds.
  • "Polypeptide" refers to either a full-length naturally-occurring amino acid chain or a "fragment thereof" or "peptide", such as a selected region of the polypeptide that binds to another protein, peptide or polypeptide in a manner modulatable by a ligand, or to an amino acid polymer, or a fragment or peptide thereof, which is partially or wholly non-natural.
  • "Fragment thereof" thus refers to an amino acid sequence that is a portion of a full-length polypeptide, between about 8 and about 500 amino acids in length, preferably about 8 to about 300, more preferably about 8 to about 200 amino acids, and even more preferably about 10 to about 50 or 100 amino acids in length.
  • "Peptide" refers to a short amino acid sequence that is 10-40 amino acids long, preferably 10-35 amino acids. Additionally, unnatural amino acids, for example, β-alanine, phenylglycine and homoarginine may be included. Commonly-encountered amino acids which are not gene-encoded may also be used in the present invention. All of the amino acids used in the present invention may be either the D- or L- optical isomer.
  • a "polypeptide binding molecule” is a molecule, preferably a polypeptide, protein or peptide, which has the ability to bind to another polypeptide, protein or peptide. Preferably, this binding ability is modulatable by a ligand.
  • domain refers to a linear sequence of amino acids which exhibits biological function, such as the ability to bind another molecule (for example, another polypeptide or fragment thereof).
  • This linear sequence includes full-length amino acid sequences (e.g. those encoded by a full-length gene or polynucleotide), or a portion or fragment thereof, provided the biological function, in particular binding ability, is maintained by that portion or fragment.
  • domain also may refer to polypeptides and peptides having biological function.
  • a polypeptide useful in the invention will at least have a binding capability, i.e, with respect to binding as or to a binding partner, and also may have another biological function that is a biological function of a protein or domain from which the peptide sequence is derived.
  • molecules according to the invention may be free in solution, or may be partially or fully immobilised. They may be present as discrete entities, or may be complexed with other molecules.
  • molecules according to the invention include polypeptides displayed on the surface of bacteriophage particles. Alternatively or in addition, the polypeptides may be arranged in arrays. More preferably, molecules according to the invention include libraries of polypeptides presented as integral parts of the envelope proteins on the outer surface of bacteriophage particles. Methods for the production of libraries encoding randomised polypeptides are known in the art and may be applied in the present invention.
  • Randomisation may be total, or partial; in the case of partial randomisation, the selected codons preferably encode options for amino acids, and not for stop codons.
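  • Partial randomisation that avoids stop codons can be sketched as follows. This is a minimal illustration, assuming the standard genetic code (stop codons TAA, TAG and TGA) and a coding sequence already split into codons; the function `randomise_codons` and its parameters are hypothetical and do not represent a specific library construction method.

```python
import itertools
import random

BASES = "ACGT"
# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# The 61 codons that encode an amino acid.
SENSE_CODONS = [
    "".join(triplet)
    for triplet in itertools.product(BASES, repeat=3)
    if "".join(triplet) not in STOP_CODONS
]


def randomise_codons(template_codons, positions, rng):
    """Return a copy of the template with selected codons randomised.

    Only the codon indices listed in `positions` are changed, and each
    is replaced by a randomly chosen codon that is not a stop codon.
    """
    result = list(template_codons)
    for index in positions:
        result[index] = rng.choice(SENSE_CODONS)
    return result
```

  • Passing every index in `positions` corresponds to total randomisation, and passing a subset corresponds to partial randomisation.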
  • the term 'candidate nucleic acid binding molecules' is used to describe any one or more molecule(s) as defined above which may or may not be capable of binding nucleic acids.
  • the term 'candidate DNA binding molecules' is used to describe any one or more molecule(s) as defined above which may or may not be capable of binding DNA.
  • the term 'candidate polypeptide binding molecules' is used to describe any one or more molecule(s) as defined above which may or may not be capable of binding polypeptides.
  • the capability of molecules to bind DNA or nucleic acids may or may not be modulatable by a ligand. The latter of these properties may be investigated by the methods of this invention.
  • candidate polypeptide, DNA or nucleic acid binding molecules such as DNA binding molecules comprise a plurality of, or a library of polypeptides.
  • the candidate polypeptide, DNA or nucleic acid binding molecules comprise an array of polypeptide, DNA or nucleic acid binding molecules.
  • polypeptide binding molecules are, or are derived from known polypeptide binding proteins, preferably polypeptide binding domains.
  • a polypeptide binding domain useful in the present invention will comprise a binding site which will permit it to bind to other polypeptides (binding partners) to form a complex.
  • Polypeptides are known to be able to associate in a number of ways, and domains which mediate polypeptide association are known in the art. For example, coiled coils, acid patches, zinc fingers, calcium hands, WD40 motifs, SH2/SH3 domains and leucine zippers are all polypeptide domains known to mediate protein-protein interactions, as are other domains known to those skilled in the art.
  • the invention makes use of randomised polypeptides for the selection of protein switches.
  • the randomised polypeptides comprise any of the polypeptide binding domains described here; most preferably, the randomisation occurs at the domain (i.e., that part of the polypeptide responsible for its interaction with another polypeptide).
  • DNA or nucleic acid binding polypeptides are, or are derived from, DNA binding proteins such as DNA repair enzymes, polymerases, recombinases, methylases, restriction enzymes, replication factors, histones, or DNA binding structural proteins such as chromosomal scaffold proteins; even more preferably DNA binding polypeptides are derived from DNA binding proteins, preferably transcription factors.
  • nucleic acid binding polypeptides are derived from RNA-binding proteins such as ribosomal proteins, components of the splicing machinery, viral or cellular regulatory or RNA packaging proteins, or RNA trafficking proteins.
  • nucleic acid binding molecules comprise molecules capable of binding to both RNA and DNA, for example ethidium bromide, or the transcription factor TFIIIA.
  • the candidate DNA binding molecules preferably comprise one or more of: DNA binding proteins, transcription factors, fragment(s) of DNA binding proteins or transcription factors, sequences homologous to DNA binding proteins or transcription factors, or polypeptides which have been fully or partially randomised from a starting sequence which is a DNA binding protein or a transcription factor, a fragment of a DNA binding protein or a transcription factor, or homologous to a DNA binding protein or a transcription factor.
  • candidate polypeptide binding molecules preferably comprise one or more of the known polypeptide binding molecules or domains listed above or known in the art.
  • candidate polypeptide, DNA or nucleic acid binding molecules comprise polypeptides which are at least 40% homologous, more preferably at least 60% homologous, even more preferably at least 75% homologous or even more, for example 85%, or 90%, or even more than 95% homologous to one or more DNA/polypeptide binding proteins (as the case may be), preferably transcription factors (in the case of DNA or nucleic acid binding molecules), using one of the homology calculation algorithms defined below.
  • Candidate DNA or nucleic acid binding molecules may comprise, among other things, DNA binding part(s) of any protein(s), for example zinc finger transcription factors, Zif268, ATF family transcription factors, ATF1, ATF2, bZIP proteins, CHOP, NF-I BKB, TATA binding protein (TBP), MDM, c-jun, elk, serum response factor (SRF), ternary complex factor (TCF); KRUPPEL, Odd Skipped, even skipped and other D.melanogaster transcription factors; yeast transcription factors such as GCN4, the GAL family of galactose-inducible transcription factors; bacterial transcription factors or repressors such
  • candidate polypeptide binding molecules or DNA or nucleic acid binding molecules may be non-randomised polypeptides, for example 'wild-type' or allelic variants of naturally occurring polypeptides, or may be specific mutant(s), or may be wholly or partially randomised polypeptides, preferably structurally related to protein, nucleic acid or DNA binding proteins as described herein.
  • the candidate DNA binding molecules may be displayed on the surface of bacteriophage particles.
  • Such displayed nucleic acid and polypeptide binding molecules are preferably partially randomised zinc-finger type transcription factors, preferably retaining at least 40% homology (as described herein) to zinc-finger type transcription factors.
  • the bacteriophage particles may themselves be arranged in the form of an array.
  • sequence homology may be considered in relation to structurally important residues, or those residues which are known or suspected of being evolutionarily conserved.
  • residues known to be variable or non-essential for a particular structural conformation may be discounted from the homology calculation.
  • zinc fingers are known to have certain residues which are important for the formation of the three-dimensional zinc finger structure.
  • homology may be considered over about seven of said important amino acid residues amongst approximately thirty residues which may comprise the whole finger structure.
  • homology may refer to structural homology.
  • Structural homology may be estimated by comparing the structural RMS deviation of the main part of the carbon atom backbone of two or more molecules.
  • the molecules may be considered structurally homologous if the deviation is 5 Å or less, preferably 3 Å or less, more preferably 1.5 Å or less.
  • Structurally homologous molecules will not necessarily show significant sequence homology.
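
As a rough illustration of the structural comparison described above, the following Python sketch computes the RMS deviation between two equal-length sets of backbone atom coordinates and maps the result onto the 5 Å, 3 Å and 1.5 Å thresholds. It assumes the two structures have already been optimally superimposed (no fitting step is performed); the function names and tier labels are hypothetical and are not taken from the specification.

```python
import math


def backbone_rmsd(coords_a, coords_b):
    """RMS deviation (in the units of the input, e.g. angstroms) between two
    pre-superimposed lists of (x, y, z) backbone atom coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


def structural_homology_tier(rmsd):
    """Map an RMSD in angstroms onto the thresholds quoted in the text.
    The tier labels are illustrative only."""
    if rmsd <= 1.5:
        return "more preferred"
    if rmsd <= 3.0:
        return "preferred"
    if rmsd <= 5.0:
        return "homologous"
    return "not homologous"
```

A full comparison would normally first find the rotation and translation that minimise the deviation; that step is deliberately left out of this sketch.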
  • Candidate nucleic acid or DNA binding molecules, or polypeptide binding molecules, may be pre-screened prior to being tested in the methods of the invention, using routine assays known in the art for determining the binding of molecules to nucleic acids or polypeptides, so as to eliminate molecules that do not have binding ability.
  • a candidate nucleic acid or DNA binding molecule, preferably a library of candidate nucleic acid or DNA binding molecules, is contacted with nucleic acid and binding determined.
  • the nucleic acids may for example be labelled with a detectable label, such as a fluorophore/fluorochrome, such that after a wash step binding can be determined easily, for example by monitoring fluorescence. Similar methods may be used to pre-screen polypeptide binding molecules. Other methods for measuring binding to nucleic acid or DNA are set out below.
  • the nucleic acid with which the candidate nucleic acid binding molecules are contacted may be non-specific nucleic acids, such as a random oligonucleotide library or sonicated genomic DNA and the like.
  • a specific sequence, such as a specific DNA sequence, or a partially randomised library of sequences may be used.
  • the nucleic acid or DNA binding molecules of the invention may bind the target nucleic acid with different affinity in the presence or in the absence of a ligand.
  • the polypeptide binding molecules may bind their targets (i.e., where the first and second molecules are both polypeptides) with a different affinity in the presence or in the absence of ligand.
  • the binding to the nucleic acid may be enhanced by the presence of the ligand (i.e. bind with a higher affinity in the presence of ligand), or may be reduced in the presence of ligand (i.e. bind with a lower affinity in the presence of ligand).
  • association of the nucleic acid or DNA binding molecule(s) with the target nucleic acid is enhanced by the presence of ligand
  • said association may be additive with the binding of the ligand, or may be synergistic with the binding of the ligand, or may affect the binding in another way. If the binding is synergistic with the binding of the ligand, said binding may be either wholly or partly dependent on the presence of the ligand.
  • the characteristics of binding may be such that the nucleic acid or DNA binding molecule(s) or polypeptide binding molecule(s) may be eluted by addition of an excess of the ligand.
  • Nucleic acid/nucleic acid binding molecule and polypeptide/polypeptide binding assays are known in the art.
  • the strength of binding or degree of binding is measured by measuring the dissociation constant of the relevant complex.
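
A minimal sketch of how a dissociation constant relates to the degree of binding, assuming a simple 1:1 equilibrium between a binding molecule and its target. The function names, and the idea of summarising ligand modulation as a ratio of two measured Kd values, are illustrative assumptions and are not part of the assays described here.

```python
def dissociation_constant(free_binder, free_target, complex_conc):
    """Kd = [binder][target] / [complex]; all concentrations in the same units."""
    if complex_conc <= 0:
        raise ValueError("complex concentration must be positive")
    return free_binder * free_target / complex_conc


def fraction_bound(free_binder, kd):
    """Fraction of target occupied at a given free binder concentration."""
    return free_binder / (kd + free_binder)


def ligand_modulation(kd_without_ligand, kd_with_ligand):
    """Ratio of Kd values: above 1 means binding is enhanced by the ligand
    (lower Kd with ligand), below 1 means binding is reduced."""
    return kd_without_ligand / kd_with_ligand
```

A lower Kd corresponds to tighter binding, so a molecule whose Kd falls in the presence of ligand is one whose binding is enhanced by that ligand.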
  • the term 'DNA binding molecule' includes any molecule which is capable of binding or associating with DNA. This binding or association may be via covalent bonding, via ionic bonding, via hydrogen bonding, via Van-der-Waals bonding, or via any other type of reversible or irreversible association.
  • molecules according to the invention may be free in solution, or may be partially or fully immobilised. They may be present as discrete entities, or may be complexed with other molecules.
  • molecules according to the invention include polypeptides displayed on the surface of bacteriophage particles. More preferably, molecules according to the invention include libraries of polypeptides presented as integral parts of the envelope proteins on the outer surface of bacteriophage particles. Methods for the production of libraries encoding randomised polypeptides are known in the art and may be applied in the present invention. Randomisation may be total, or partial; in the case of partial randomisation, the selected codons preferably encode options for amino acids, and not for stop codons.
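
The preference for randomised codons that encode amino acids rather than stop codons can be checked computationally. The sketch below expands a degenerate codon against the standard genetic code and reports how many amino acids and which stop codons it covers. The NNK scheme used in the example is a commonly used degenerate codon chosen here as an assumption; it is not specified in the text, and all names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with each base cycling through T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Subset of IUPAC nucleotide codes needed for this illustration.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT", "S": "CG"}


def expand(degenerate_codon):
    """List every concrete codon represented by a degenerate codon such as 'NNK'."""
    return ["".join(c) for c in product(*(IUPAC[b] for b in degenerate_codon))]


def summarise(degenerate_codon):
    """Return (number of codons, set of encoded amino acids, sorted stop codons)."""
    codons = expand(degenerate_codon)
    amino_acids = {CODON_TABLE[c] for c in codons if CODON_TABLE[c] != "*"}
    stops = sorted(c for c in codons if CODON_TABLE[c] == "*")
    return len(codons), amino_acids, stops
```

Restricting the third base in this way keeps all twenty amino acids available while reducing the number of stop codons in the randomised position from three to one.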
  • the term 'candidate DNA binding molecules' is used to describe any one or more molecule(s) as defined above which may or may not be capable of binding DNA.
  • the capability of said molecules to bind DNA may or may not be modulatable by a ligand.
  • candidate DNA binding molecules comprise a plurality of, or a library of polypeptides.
  • these polypeptides are, or are derived from, DNA binding proteins such as DNA repair enzymes, polymerases, recombinases, methylases, restriction enzymes, replication factors, histones, or DNA binding structural proteins such as chromosomal scaffold proteins; even more preferably said polypeptides are derived from DNA binding proteins, preferably transcription factors.
  • the candidate DNA binding molecules preferably comprise one or more of: DNA binding proteins, transcription factors, fragment(s) of DNA binding proteins or transcription factors, sequences homologous to DNA binding proteins or transcription factors, or polypeptides which have been fully or partially randomised from a starting sequence which is a DNA binding protein or a transcription factor, a fragment of a DNA binding protein or a transcription factor, or homologous to a DNA binding protein or a transcription factor.
  • candidate DNA binding molecules comprise polypeptides which are at least 40% homologous, more preferably at least 60% homologous, even more preferably at least 75% homologous or even more, for example 85%, or 90%, or even more than 95% homologous to one or more DNA binding proteins, preferably transcription factors, using one of the homology calculation algorithms defined below.
  • Candidate DNA binding molecules may comprise, among other things, DNA binding part(s) of any protein(s), for example zinc finger transcription factors, Zif268, ATF family transcription factors, ATF1, ATF2, bZIP proteins, CHOP, NF-κB, TATA binding protein (TBP), MDM, c-jun, elk, serum response factor (SRF), ternary complex factor (TCF); KRUPPEL, Odd Skipped, even skipped and other D. melanogaster transcription factors; yeast transcription factors such as GCN4, the GAL family of galactose-inducible transcription factors; bacterial transcription factors or repressors such as lacI, or fragments or derivatives thereof. Derivatives would be considered by a person skilled in the art to be functionally and/or structurally related to the molecule(s) from which they are derived, for example through sequence homology of at least 40%.
  • the candidate DNA binding molecules may be non-randomised polypeptides, for example 'wild-type' or allelic variants of naturally occurring polypeptides, or may be specific mutant(s), or may be wholly or partially randomised polypeptides, preferably structurally related to DNA binding proteins as described herein.
  • these polypeptide candidate DNA binding molecules are displayed on the surface of bacteriophage particles, and are preferably partially randomised zinc-finger type transcription factors, preferably retaining at least 40% homology (as described herein) to zinc-finger type transcription factors.
  • sequence homology may be considered in relation to structurally important residues, or those residues which are known or suspected of being evolutionarily conserved. In such instances, residues known to be variable or non-essential for a particular structural conformation may be discounted from the homology calculation.
  • zinc fingers are known to have certain residues which are important for the formation of the three-dimensional zinc finger structure. In these cases, homology may be considered over about seven of said important amino acid residues amongst approximately thirty residues which may comprise the whole finger structure.
  • homology may refer to structural homology.
  • Structural homology may be estimated by comparing the structural RMS deviation of the main part of the carbon atom backbone of two or more molecules.
  • the molecules may be considered structurally homologous if the deviation is 5 Å or less, preferably 3 Å or less, more preferably 1.5 Å or less.
  • Structurally homologous molecules will not necessarily show significant sequence homology.
  • Candidate DNA binding molecules may be pre-screened prior to being tested in the methods of the invention, using routine assays known in the art for determining the binding of molecules to nucleic acids, so as to eliminate molecules that do not bind DNA.
  • a candidate DNA binding molecule, preferably a library of candidate DNA binding molecules, is contacted with nucleic acid and binding determined.
  • the nucleic acids may for example be labelled with a detectable label, such as a fluorophore/fluorochrome, such that after a wash step binding can be determined easily, for example by monitoring fluorescence.
  • Other methods for measuring binding to DNA are set out below.
  • the nucleic acid with which the candidate nucleic acid binding molecules are contacted may be non-specific nucleic acids, such as a random oligonucleotide library or sonicated genomic DNA and the like. Alternatively, a specific sequence or a partially randomised library of sequences may be used.
  • the library of sequences may be in the form of an array.
  • Nucleic acid binding molecules and polypeptide binding molecules according to the invention are preferably polypeptide sequences, optionally encoded by nucleic acid sequences. Fragments, mutants, alleles and other derivatives of the molecules of the invention preferably retain substantial homology with said sequence(s).
  • "homology" means that the two entities share sufficient characteristics for the skilled person to determine that they are similar. Preferably, homology is used to refer to sequence identity.
  • the derivatives of said nucleic acid binding molecules of the invention and polypeptide binding molecules of the invention preferably retain substantial sequence identity with said molecules.
  • a homologous sequence is taken to include any sequence which is at least 60, 70, 80 or 90% identical, preferably at least 95 or 98% identical over at least 5, preferably 8, 10, 15, 20, 30, 40 or even more residues or bases with the molecules (i.e. the sequences thereof) of the invention, for example as shown in the sequence listing herein.
  • homology should typically be considered with respect to those regions of the molecule(s) which may be known to be functionally important rather than non-essential neighbouring sequences.
  • although homology can also be considered in terms of similarity (i.e. amino acid residues having similar chemical properties/functions), in the context of the present invention it is preferred to express homology in terms of sequence identity.
  • Homology comparisons can be conducted by eye, or more usually, with the aid of readily available sequence comparison programs. These commercially available computer programs can calculate % homology between two or more sequences.
  • % homology may be calculated over contiguous sequences, i.e. one sequence is aligned with the other sequence and each amino acid in one sequence directly compared with the corresponding amino acid in the other sequence, one residue at a time. This is called an "ungapped" alignment. Typically, such ungapped alignments are performed only over a relatively short number of residues (for example less than 50 contiguous amino acids). Although this is a very simple and consistent method, it fails to take into consideration that, for example, in an otherwise identical pair of sequences, one insertion or deletion will cause the following amino acid residues to be put out of alignment, thus potentially resulting in a large reduction in % homology when a global alignment is performed. Consequently, most sequence comparison methods are designed to produce optimal alignments that take into consideration possible insertions and deletions without penalising unduly the overall homology score. This is achieved by inserting "gaps" in the sequence alignment to try to maximise local homology.
  • BLAST and FASTA are available for offline and online searching (see Ausubel et al., 1999 ibid, pages 7-58 to 7-60). However it is preferred to use the GCG Bestfit program. Although the final % homology can be measured in terms of identity, the alignment process itself is typically not based on an all-or-nothing pair comparison. Instead, a scaled similarity score matrix is generally used that assigns scores to each pairwise comparison based on chemical similarity or evolutionary distance. An example of such a matrix commonly used is the BLOSUM62 matrix - the default matrix for the BLAST suite of programs. GCG Wisconsin programs generally use either the public default values or a custom symbol comparison table if supplied (see user manual for further details). It is preferred to use the public default values for the GCG package, or in the case of other software, the default matrix, such as BLOSUM62.
  • % homology, preferably % sequence identity, is calculated from the alignment; the software typically does this as part of the sequence comparison and generates a numerical result.
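
The difference between ungapped and gapped comparison described above can be illustrated with a short Python sketch. It is not the GCG Bestfit, BLAST or FASTA implementation: the gapped version is a plain global dynamic-programming alignment using an assumed scoring scheme (match +1, mismatch -1, gap -1) instead of a scaled matrix such as BLOSUM62, and % identity is reported over the alignment length including gap columns, which is one of several possible conventions.

```python
def ungapped_identity(seq_a, seq_b):
    """% identity comparing residues position by position, with no gaps."""
    n = min(len(seq_a), len(seq_b))
    if n == 0:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for x, y in zip(seq_a, seq_b) if x == y)
    return 100.0 * matches / n


def gapped_identity(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """% identity from a simple global alignment that allows gaps.
    Scoring values are illustrative assumptions, not package defaults."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal path, counting identical columns and total columns.
    i, j = len(seq_a), len(seq_b)
    matches = columns = 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0:
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                if seq_a[i - 1] == seq_b[j - 1]:
                    matches += 1
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return 100.0 * matches / columns
```

For two sequences that differ only by one deleted residue, the ungapped score drops sharply because every residue after the deletion is out of register, whereas the gapped score stays high.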
  • Nucleic acid binding molecules and polypeptide binding molecules according to the invention may include any atom, ion, molecule, macromolecule (for example polypeptide), or combination of such entities that are capable of binding to nucleic acids, such as DNA, or (in the case of polypeptide binding molecules) polypeptides.
  • nucleic acid binding molecules according to the invention may include families of polypeptides with known or suspected nucleic acid binding motifs. These may include for example zinc finger proteins (see below). Molecules according to the invention may also include helix-turn-helix proteins, homeodomains, leucine zipper proteins, helix-loop-helix proteins or β-sheet motifs which are well known to a person skilled in the art.
  • Polypeptide binding molecules of the invention advantageously contain protein-binding motifs, such as protein dimerization motifs as known in the art.
  • protein-binding motifs include the tetratricopeptide repeat (TPR) which is found in proteins associated with multiprotein complexes (Blatch and Lässle, 1999, Bioessays 21, 932-9), the Arg-Gly-Asp-Ser found in multimerin (Hayward 1997, Clin Invest Med 20, 176-87), the LXCXE motif found in SV40 Large T antigen necessary for binding to p53 protein (DeCaprio 1999, Biologicals 27, 23-8), the C-terminal VXI motif of ABP, which mediates binding of ABP to GluR2/3 through a Class I PDZ interaction to form homodimers and heteromultimers (Srivastava and Ziff, Ann N Y Acad Sci 868, 561-4), as well as the conserved Ran-binding motif found in species from yeasts to mammals (Seki et al
  • nucleic acid binding motifs such as DNA binding motifs of one or more known or suspected nucleic acid/DNA binding polypeptide(s), or polypeptide binding motifs, may advantageously be randomised, in order to provide libraries of candidate nucleic acid binding molecules.
  • the randomised libraries of candidate nucleic acid binding molecules may be in the form of an array.
  • Crystal structures may advantageously be used in selecting or predicting the relevant binding regions of nucleic acid and polypeptide binding proteins by methods known in the art.
  • Nucleic acid binding regions of proteins within the same structural family are often conserved or homologous to one another, for example zinc finger α-helices, the leucine zipper basic region, homeodomain helix 3.
  • nucleic acid binding criteria for zinc fingers as preferred nucleic acid binding molecules according to the present invention are set out in this application (see above).
  • the methods of the present invention could be advantageously applied to the selection of ligand-modulatable nucleic acid binding molecules from other families of transcription factors, for example from the helix-turn-helix (HTH) family and/or from the probe helix (PH) family, and/or from the C4 zinc-binding family (which includes the hormone receptor (HR) family), from the Gal4 family, from the c-myb family, from other zinc finger families, or from any other family of nucleic acid binding proteins or polypeptide binding proteins known to one skilled in the art.
  • HTH: helix-turn-helix
  • PH: probe helix
  • C4: zinc-binding family, which includes the hormone receptor (HR) family
  • polypeptides from one or more of these families could be advantageously randomised to provide a library of candidate molecules for use in the methods of the invention.
  • the amino acid residues known to be important for nucleic acid or polypeptide binding could be randomised.
  • the randomised library is in the form of an array of candidate molecules.
  • randomisation may involve alteration of zinc finger polypeptides, said alteration being accomplished at the DNA or protein level.
  • Mutagenesis and screening of zinc finger polypeptides may be achieved by any suitable means.
  • the mutagenesis is performed at the nucleic acid level, for example by synthesising novel genes encoding mutant polypeptides and expressing these to obtain a variety of different proteins.
  • existing genes can themselves be mutated, such as by site-directed or random mutagenesis, in order to obtain the desired mutant genes.
  • Mutations may be performed by any method known to those of skill in the art.
  • one suitable method is site-directed mutagenesis of a nucleic acid sequence encoding the protein of interest.
  • a number of methods for site-directed mutagenesis are known in the art, from methods employing single-stranded phage such as M13 to PCR-based techniques (see "PCR Protocols: A guide to methods and applications", M.A. Innis, D.H. Gelfand, J.J. Sninsky, T.J. White (eds.). Academic Press, New York, 1990).
  • the commercially available Altered Site II Mutagenesis System may be employed, according to the manufacturer's instructions.
  • Randomisation of the zinc finger binding motifs is preferably directed to those amino acid residues where the code provided herein gives a choice of residues (see below).
  • positions +1, +5 and +8 are advantageously randomised, whilst preferably avoiding hydrophobic amino acids; positions involved in binding to the nucleic acid, notably -1, +2, +3 and +6, may be randomised also, preferably within the choices provided by the rules of the present invention.
  • Screening of the proteins produced by mutant genes is preferably performed by expressing the genes and assaying the binding ability of the protein product.
  • a simple and advantageously rapid method by which this may be accomplished is by phage display, in which the mutant polypeptides are expressed as fusion proteins with the coat proteins of filamentous bacteriophage, such as the minor coat protein pIII of bacteriophage M13 or gene III of bacteriophage fd, and displayed on the capsid of bacteriophage transformed with the mutant genes.
  • the target nucleic acid sequence or target polypeptide is used as a probe to bind directly to the protein on the phage surface and select the phage possessing advantageous mutants, by affinity purification.
  • the phage are then amplified by passage through a bacterial host, and subjected to further rounds of selection and amplification in order to enrich the mutant pool for the desired phage and eventually isolate the preferred clone(s).
  • Detailed methodology for phage display is known in the art and set forth, for example, in US Patent 5,223,409; Choo and Klug, (1995) Current Opinion in Biotechnology 6:431-436; Smith, (1985) Science 228:1315-1317; and McCafferty et al, (1990) Nature 348:552-554; all incorporated herein by reference.
  • Vector systems and kits for phage display are available commercially, for example from Pharmacia.
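
The effect of repeated rounds of selection and amplification on a mutant pool can be pictured with a toy model. The sketch below assumes that each round retains binding phage more efficiently than non-binding phage by a constant enrichment factor; that factor, the starting fraction and the function names are hypothetical values chosen for illustration, not measured parameters of any phage display system.

```python
def enrich(fraction_binders, enrichment_factor):
    """Fraction of binders in the pool after one round of selection, where
    binders are recovered `enrichment_factor` times more efficiently than
    non-binders. Amplification is assumed not to change the ratio."""
    kept_binders = fraction_binders * enrichment_factor
    kept_others = 1.0 - fraction_binders
    return kept_binders / (kept_binders + kept_others)


def rounds_to_reach(start_fraction, enrichment_factor, target_fraction, max_rounds=50):
    """Number of rounds needed for binders to reach `target_fraction` of the
    pool, or None if this does not happen within `max_rounds`."""
    fraction = start_fraction
    for round_number in range(1, max_rounds + 1):
        fraction = enrich(fraction, enrichment_factor)
        if fraction >= target_fraction:
            return round_number
    return None
```

Under these assumptions, even a clone present at one in a million dominates the pool after a few rounds, which is why a small number of selection cycles is typically enough to isolate preferred clones.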
  • Specific peptide ligands such as zinc finger polypeptides may moreover be selected for binding to targets by affinity selection using large libraries of peptides linked to the C-terminus of the lac repressor LacI (Cull et al, (1992) Proc Natl Acad Sci U S A, 89,
  • When expressed in E. coli, the repressor protein physically links the ligand to the encoding plasmid by binding to a lac operator sequence on the plasmid.
  • polypeptides may be partitioned in physical compartments, for example wells of an in vitro dish, or subcellular compartments, or in small fluid particles or droplets such as emulsions; further teachings on this topic may be found in Griffiths et al. (see WO 99/02671).
  • a library for use in the invention may be randomised at those positions for which choices are given as set out below. The rules are intended to allow the person of ordinary skill in the art to make informed choices concerning the desired codon usage at the given positions.
  • a library for use in the invention may be in the form of an array, as discussed above.
  • the recognition helix of PH family polypeptides contains conserved Arg/Lys residues which are important structural elements involved in the binding of phosphates in the nucleic acid. Base specificity is attributed to amino acids 1, 4, 5 and 8 of the helix. These residues could be advantageously varied, for example amino acid 1 could be selected from Asn, Asp, His, Val, Ile to provide the possibility of binding to A, C, G, or T. Similarly, amino acid 4 could be selected from Asn, Asp, His, Val, Ile, Gln, Glu, Arg, Lys, Met, or Leu to provide the possibility of binding to A, C, G or T.
  • the rules laid out in would be used in order to randomise those amino acids which affect interaction of the molecule with the nucleic acid, whether in a base specific manner, or via binding to the phosphate backbone, thereby producing a library of candidate nucleic acid binding molecules, which may be in the form of an array, for use in the methods of the invention.
  • polypeptide molecules of the helix-turn-helix family could be randomised to produce a library of candidate molecules, optionally in the form of an array, at least some of which may preferably be capable of binding nucleic acid in a ligand-dependent manner when used in the methods of the present invention.
  • amino acids 1, 2, 5 and 6 are known to be conserved and function in base-specific nucleic acid binding in HTH motifs. Therefore, at least amino acids 1, 2, 5 or 6 would preferably be randomised so as to produce molecules for use according to the present invention.
  • amino acids 1, 5 and 6 could be selected from Asn, Asp, His, Val, Ile, Glu, Gln, Arg, Met, Lys or Leu
  • amino acid 2 could be selected from Asn, Asp, His, Val, Ile, Glu, Gln, Arg, Met, Lys, Leu, Cys, Ser, Thr, or Ala.
  • C4 family which includes hormone receptor type transcription factors. It is envisaged that polypeptides of this family could advantageously be used to provide candidate molecules for use in selecting nucleic acid binding molecules whose association with nucleic acid is modulatable by a nucleic acid binding ligand. Amino acids 1, 4, 5 and 9 of the C4 motif are known to be involved in contacting the DNA, and therefore these residues would preferably be altered to provide a plurality of different molecules which may bind DNA in a ligand dependent manner.
  • amino acids 1 and 5 could be selected from Asn, Asp, His, Val, Ile, Glu, Gln, Arg, Met, Lys or Leu, and amino acids 4 and 9 could be selected from Gln, Glu, Arg, Lys, Leu or Met.
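
To give a sense of scale, the amino acid choices listed above for the HTH and C4 motifs can be multiplied out to give the number of distinct variants in a library randomised only at those positions. This is a simple combinatorial sketch; the dictionary layout and function name are illustrative, and it assumes every listed option is allowed independently at each position.

```python
from math import prod

# Residue options as listed in the text for the HTH and C4 motifs.
ELEVEN = ["Asn", "Asp", "His", "Val", "Ile", "Glu", "Gln", "Arg", "Met", "Lys", "Leu"]
SIX = ["Gln", "Glu", "Arg", "Lys", "Leu", "Met"]

HTH_OPTIONS = {
    1: ELEVEN,
    2: ELEVEN + ["Cys", "Ser", "Thr", "Ala"],
    5: ELEVEN,
    6: ELEVEN,
}

C4_OPTIONS = {1: ELEVEN, 4: SIX, 5: ELEVEN, 9: SIX}


def library_size(options_by_position):
    """Number of distinct sequences when each position varies independently."""
    return prod(len(options) for options in options_by_position.values())
```

With these lists the HTH library has 11 x 15 x 11 x 11 = 19,965 members and the C4 library has 11 x 6 x 11 x 6 = 4,356 members, both comfortably within the capacity of a phage display library.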
  • C4 transcription factors including hormone receptors bind DNA as homo or heterodimers in either inverted or direct repeat configurations through the action of a dimerisation domain or surface, and can therefore be used to exemplify a further embodiment of the invention.
  • randomisations or modifications may be introduced into one or both of the respective dimerisation domains or interfaces and optionally simultaneously into the DNA binding surfaces of these proteins, and a gene switch and/or a protein switch may be isolated by the methods described above.
  • DNA binding molecules are Cys2-His2 zinc finger binding proteins which, as is well known in the art, bind to target nucleic acid sequences via α-helical zinc metal atom co-ordinated binding motifs known as zinc fingers.
  • Each zinc finger in a zinc finger nucleic acid binding protein is responsible for determining binding to a nucleic acid triplet, or an overlapping quadruplet, in a nucleic acid binding sequence.
  • there are 2 or more zinc fingers for example 2, 3, 4, 5 or 6 zinc fingers, in each binding protein.
  • the invention provides a method for preparing a DNA binding polypeptide of the Cys2-His2 zinc finger class capable of binding to a target DNA sequence, wherein binding is via a zinc finger DNA binding motif of the polypeptide, and wherein said binding is modulatable by a ligand.
  • the present invention is in one aspect concerned with the production of what are essentially artificial nucleic acid binding proteins such as DNA binding proteins as well as polypeptide binding molecules such as proteins.
  • artificial analogues of amino acids may be used, to impart the proteins with desired properties or for other reasons.
  • amino acid particularly in the context where "any amino acid” is referred to, means any sort of natural or artificial amino acid or amino acid analogue that may be employed in protein construction according to methods known in the art.
  • any specific amino acid referred to herein may be replaced by a functional analogue thereof, particularly an artificial functional analogue.
  • the nomenclature used herein therefore specifically comprises within its scope functional analogues or mimetics of the defined amino acids.
  • the α-helix of a zinc finger binding protein aligns antiparallel to the nucleic acid strand, such that the primary nucleic acid sequence is arranged 3' to 5' in order to correspond with the N-terminal to C-terminal sequence of the zinc finger. Since nucleic acid sequences are conventionally written 5' to 3', and amino acid sequences N-terminus to C-terminus, the result is that when a nucleic acid sequence and a zinc finger protein are aligned according to convention, the primary interaction of the zinc finger is with the minus (-) strand of the nucleic acid, since it is this strand which is aligned 3' to 5'. These conventions are followed in the nomenclature used herein.
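
The antiparallel convention can be made concrete with a small sketch that assigns the triplets of a target site, written 5' to 3', to the fingers of a protein numbered from the N-terminus. It assumes one non-overlapping triplet per finger (the overlapping quadruplet case is ignored), and the function name and example site are hypothetical.

```python
def assign_triplets(target_5_to_3):
    """Pair each finger (numbered from the N-terminus) with its triplet.
    The target is written 5'->3'; because the protein lies antiparallel,
    finger 1 takes the 3'-most triplet. Each triplet is still reported 5'->3'."""
    site = target_5_to_3.upper()
    if len(site) == 0 or len(site) % 3 != 0:
        raise ValueError("target length must be a non-zero multiple of three")
    triplets = [site[i:i + 3] for i in range(0, len(site), 3)]
    return list(enumerate(reversed(triplets), start=1))
```

For the nine-base site AAATTTGGG, this pairs finger 1 with GGG, finger 2 with TTT and finger 3 with AAA.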
  • the present invention may be integrated with the rules set forth for zinc finger polypeptide design in our copending European or PCT patent applications having publication numbers WO 98/53057, WO 98/53060, WO 98/53058 and WO 98/53059, which describe improved techniques for designing zinc finger polypeptides capable of binding desired nucleic acid sequences. In combination with selection procedures, such as phage display, set forth for example in WO 96/06166, these techniques enable the production of zinc finger polypeptides capable of recognising practically any desired sequence.
  • the invention provides a method for preparing a DNA binding polypeptide of the Cys2-His2 zinc finger class capable of binding to a target DNA sequence, wherein said binding is modulatable by a ligand, and wherein binding to each base of the triplet by an α-helical zinc finger DNA binding motif in the polypeptide is determined as follows:
  • position +6 in the α-helix may be any amino acid, provided that position ++2 in the α-helix is not Asp;
  • the foregoing represents a set of rules which permits the design of a zinc finger binding protein specific for any given target DNA sequence.
  • a zinc finger binding motif is a structure well known to those in the art and defined in, for example, Miller et al, (1985) EMBO J. 4:1609-1614; Berg (1988) PNAS (USA) 85:99-102; Lee et al, (1989) Science 245:635-637; see International patent applications WO 96/06166 and WO 96/32475, corresponding to USSN 08/422,107, incorporated herein by reference.
  • a preferred zinc finger framework has the structure: (A) X0-2 C X1-5 C X9-14 H X3-6 H/C
  • zinc finger nucleic acid binding motifs may be represented as motifs having the following primary structure:
  • X (including Xa, Xb and Xc) is any amino acid.
  • X2-4 and X2-3 refer to the presence of 2 or 4, or 2 or 3, amino acids, respectively.
  • the Cys and His residues, which together co-ordinate the zinc metal atom, are marked in bold text and are usually invariant, as is the Leu residue at position +4 in the α-helix.
  • Modifications to this representation may occur or be effected without necessarily abolishing zinc finger function, by insertion, mutation or deletion of amino acids.
  • the second His residue may be replaced by Cys (Krizek et al., (1991) J. Am. Chem. Soc. 113:4518-4523) and Leu at +4 can in some circumstances be replaced with Arg.
  • the Phe residue before Xc may be replaced by any aromatic other than Trp.
  • experiments have shown that departure from the preferred structure and residue assignments for the zinc finger are tolerated and may even prove beneficial in binding to certain nucleic acid sequences.
  • structures (A) and (B) above are taken as an exemplary structure representing all zinc finger structures of the Cys2-His2 type.
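
Reading framework (A) as X0-2 C X1-5 C X9-14 H X3-6 H/C, where X is any amino acid and the subscripts give the permitted number of residues, the spacing rule can be expressed as a regular expression over one-letter amino acid codes. This is only a spacing check under that reading; it says nothing about whether a matching segment folds or binds, and the names used are hypothetical.

```python
import re

# The twenty standard amino acids in one-letter code.
AA = "ACDEFGHIKLMNPQRSTVWY"

# X0-2 C X1-5 C X9-14 H X3-6 H/C
FRAMEWORK_A = re.compile(
    rf"[{AA}]{{0,2}}C[{AA}]{{1,5}}C[{AA}]{{9,14}}H[{AA}]{{3,6}}[HC]"
)


def matches_framework_a(segment):
    """True if the whole segment fits the spacing of framework (A)."""
    return FRAMEWORK_A.fullmatch(segment.upper()) is not None
```

A segment is accepted only when the two Cys residues and the two zinc-coordinating His (or terminal Cys) residues fall at permitted spacings, so replacing the first coordinating His with another residue causes the check to fail.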
  • Xa is F/Y-X or P-F/Y-X.
  • X is any amino acid.
  • X is E, K, T or S. Less preferred but also envisaged are Q, V, A and P.
  • the remaining amino acids remain possible.
  • X2-4 consists of two amino acids rather than four. The first of these amino acids may be any amino acid, but S, E, K, T, P and R are preferred.
  • it is P or R.
  • the second of these amino acids is preferably E, although any amino acid may be used.
  • Xb is T or I.
  • Xc is S or T.
  • X2-3 is G-K-A, G-K-C, G-K-S or G-K-G.
  • departures from the preferred residues are possible, for example in the form of M-R-N or M-R.
  • the linker is T-G-E-K or T-G-E-K-P.
  • amino acids -1, +3 and +6 are the principal base-contacting positions; amino acids +4 and +7 are largely invariant.
  • the remaining amino acids may be essentially any amino acids.
  • position +9 is occupied by Arg or Lys.
  • positions +1, +5 and +8 are not hydrophobic amino acids, that is to say are not Phe, Trp or Tyr.
  • position ++2 is any amino acid, and preferably serine, save where its nature is dictated by its role as a ++2 amino acid for an N-terminal zinc finger in the same nucleic acid binding molecule.
  • the invention allows the definition of every residue in a zinc finger DNA binding motif which will bind specifically to a given target DNA triplet.
  • Arrays may be constructed which include some or all of the zinc fingers capable of binding to a given target DNA. Such arrays may be used in the methods of the invention.
  • the code provided by the present invention is not entirely rigid; certain choices are provided. For example, positions +1, +5 and +8 may have any amino acid allocation, whilst other positions may have certain options: for example, the present rules provide that, for binding to a central T residue, any one of Ala, Ser or Val may be used at +3. In its broadest sense, therefore, the present invention provides a very large number of proteins which are capable of binding to every defined target DNA triplet. Preferably, however, the number of possibilities may be significantly reduced. For example, the non-critical residues +1, +5 and +8 may be occupied by the residues Lys, Thr and Gln respectively as a default option. In the case of the other choices, for example, the first-given option may be employed as a default. Thus, the code according to the present invention allows the design of a single, defined polypeptide (a "default" polypeptide) which will bind to its target triplet.
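The "default polypeptide" idea described above can be sketched in a few lines. The function name `default_assignment` and the dictionary layout are hypothetical; the only values taken from the text are the defaults Lys, Thr and Gln at +1, +5 and +8, the "first-given option" rule, and the Ala/Ser/Val example for a central T.

```python
# Defaults for the non-critical helix positions, as stated in the text.
DEFAULT_NONCRITICAL = {"+1": "Lys", "+5": "Thr", "+8": "Gln"}

def default_assignment(options_by_position):
    """Pick one residue per position.

    options_by_position maps a helix position (e.g. "+3") to the ordered
    list of residues the rules allow there; the first-given option wins.
    Positions +1, +5 and +8 fall back to the stated defaults if absent.
    """
    chosen = {pos: opts[0] for pos, opts in options_by_position.items()}
    for pos, residue in DEFAULT_NONCRITICAL.items():
        chosen.setdefault(pos, residue)
    return chosen

# Example from the text: a central T allows Ala, Ser or Val at +3.
example = default_assignment({"+3": ["Ala", "Ser", "Val"]})
```

Collapsing every choice to its first option yields exactly one polypeptide per target triplet, which is the point of the default.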
  • a method for preparing a DNA binding protein of the Cys2-His2 zinc finger class capable of binding to a target DNA sequence in a manner modulatable by a ligand comprising the steps of: (a) selecting a model zinc finger domain from the group consisting of naturally occurring zinc fingers and consensus zinc fingers; and (b) mutating at least one of positions -1, +3, +6 (and ++2) of the finger as required by a method according to the present invention.
  • naturally occurring zinc fingers may be selected from those fingers for which the DNA binding specificity is known.
  • these may be the fingers for which a crystal structure has been resolved: namely Zif 268 (Elrod-Erickson et al, (1996) Structure 4:1171-1180), GLI (Pavletich and Pabo, (1993) Science 261:1701-1707), Tramtrack (Fairall et al, (1993) Nature 366:483-487) and YY1 (Houbaviy et al, (1996) PNAS (USA) 93:13577-13582).
  • residues which are outside the DNA-contacting region may be mutated. Mutations in such residues may affect the interaction between zinc fingers in a zinc finger polypeptide, and thus alter binding site specificity. Moreover, ligands which bind to a zinc finger polypeptide so as to influence zinc finger interaction and thus binding may be identified. Mutation of residues which affect the interaction between zinc fingers allows for selection of fingers which are modulatable by ligand binding at these sites.
  • the naturally occurring zinc finger 2 in Zif 268 makes an excellent starting point from which to engineer a zinc finger and is preferred.
  • Consensus zinc finger structures may be prepared by comparing the sequences of known zinc fingers, irrespective of whether their binding domain is known.
  • the consensus structure is selected from the group consisting of the consensus structure PYKCPECGKSFSQKSDLVKHQRTHTG, and the consensus structure PYKCSECGKAFSQKSNLTRHQRIHTGEKP.
  • the consensuses are derived from the consensus provided by Krizek et al, (1991) J. Am. Chem. Soc. 113:4518-4523 and from Jacobs, (1993) PhD thesis, University of Cambridge, UK.
  • the linker sequences described above for joining two zinc finger motifs together, namely TGEK or TGEKP, can be formed on the ends of the consensus.
  • a P may be removed where necessary, or, in the case of the consensus terminating TG, EK(P) can be added.
  • the mutation of the finger in order to modify its specificity to bind to the target DNA may be directed to residues known to affect binding to bases at which the natural and desired targets differ. Otherwise, mutation of the model fingers should be concentrated upon residues -1, +3, +6 and ++2 as provided for in the foregoing rules.
  • the rules provided by the present invention may be supplemented by physical or virtual modelling of the protein/DNA interface in order to assist in residue selection.
  • a method for producing a zinc finger polypeptide capable of binding to a target DNA sequence, wherein said binding is modulatable by a ligand comprising: (a) providing a nucleic acid library encoding a repertoire of zinc finger polypeptides, the nucleic acid members of the library being at least partially randomised at one or more of the positions encoding residues -1, 2, 3 and 6 of the α-helix of the zinc finger polypeptides; (b) displaying the library in a selection system and screening it against a target DNA sequence; (c) isolating the nucleic acid members of the library encoding zinc finger polypeptides capable of binding to the target sequence in the presence/absence of ligand; (d) selecting those members of the library isolated in (c) which bind the target nucleic acid sequence with different affinities in the presence and absence of the ligand.
  • the nucleic acid library encoding the zinc finger polypeptides may be (and preferably is) in the form of an array.
  • Randomisation may be total, or partial; in the case of partial randomisation, the selected codons preferably encode options for amino acids as set forth in the rules above.
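Step (d) of the selection method, keeping library members whose affinity for the target differs with and without ligand, can be sketched as a simple filter. Everything here is illustrative: the function name, the use of dissociation constants as the affinity measure, the 10-fold threshold and the clone values are assumptions, not data from the patent.

```python
def select_switchable(clones, min_fold_change=10.0):
    """Keep clones whose binding affinity changes on ligand addition.

    clones maps a clone name to (kd_without_ligand, kd_with_ligand),
    both positive dissociation constants in the same units.
    min_fold_change is an assumed cut-off for "different affinities".
    Returns a dict of clone name -> fold change.
    """
    selected = {}
    for name, (kd_absent, kd_present) in clones.items():
        fold = max(kd_absent, kd_present) / min(kd_absent, kd_present)
        if fold >= min_fold_change:
            selected[name] = fold
    return selected

# Hypothetical measurements (molar Kd values).
clones = {
    "clone_A": (1e-9, 1e-9),   # unaffected by ligand
    "clone_B": (1e-9, 5e-7),   # ligand weakens binding
    "clone_C": (2e-7, 4e-9),   # ligand strengthens binding
}
switchable = select_switchable(clones)
```

Taking the ratio of the larger to the smaller value treats ligand-induced and ligand-repressed binders alike, matching the "presence/absence" wording of the method.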
  • Zinc finger polypeptides may be designed which specifically bind to nucleic acids incorporating the base U, in preference to the equivalent base T.
  • a method for producing a zinc finger polypeptide capable of binding to a target DNA sequence comprising: (a) providing a nucleic acid library encoding a repertoire of zinc finger polypeptides each possessing more than one zinc finger, the nucleic acid members of the library being at least partially randomised at one or more of the positions encoding residues -1, 2, 3 and 6 of the α-helix in a first zinc finger and at one or more of the positions encoding residues -1, 2, 3 and 6 of the α-helix in a further zinc finger of the zinc finger polypeptides; (b) displaying the library in a selection system and screening it against a target DNA sequence; (c) assessing the affinity of the DNA binding molecules for the target DNA in the presence and absence of the ligand, and (d) isolating the nucleic acid members of the library encoding zinc finger polypeptides capable of binding to the target sequence with different affinities in the presence and
  • WO 98/53057 describes the production of zinc finger polypeptide libraries in which each individual zinc finger polypeptide comprises more than one, for example two or three, zinc fingers; and wherein within each polypeptide partial randomisation occurs in at least two zinc fingers.
  • Zinc finger binding motifs designed according to the invention may be combined into nucleic acid binding polypeptide molecules having a multiplicity of zinc fingers.
  • the proteins have at least two zinc fingers.
  • zinc finger binding proteins commonly have at least three zinc fingers, although two-zinc finger proteins such as Tramtrack are known. The presence of at least three zinc fingers is preferred.
  • Nucleic acid binding proteins may be constructed by joining the required fingers end to end, N-terminus to C-terminus. Preferably, this is effected by joining together the relevant nucleic acid sequences which encode the zinc fingers to produce a composite nucleic acid coding sequence encoding the entire binding protein.
  • DNA binding protein as defined above, wherein the DNA binding protein is constructed by recombinant DNA technology, the method comprising the steps of: (a) preparing a nucleic acid coding sequence encoding two or more zinc finger binding motifs as defined above, placed N-terminus to C-terminus; (b) inserting the nucleic acid sequence into a suitable expression vector; and (c) expressing the nucleic acid sequence in a host organism in order to obtain the DNA binding protein.
  • a "leader" peptide may be added to the N-terminal finger.
  • the leader peptide is MAEEKP.
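The end-to-end assembly described above (fingers joined N-terminus to C-terminus through TGEK or TGEKP linkers, with an optional MAEEKP leader) can be sketched at the amino acid level. The helper names are hypothetical, and the handling of the linker and of the "a P may be removed where necessary" rule is one plausible reading of the text, not the authors' procedure.

```python
def ensure_linker(finger):
    """Make sure a non-terminal finger ends in a TGEK/TGEKP linker."""
    if finger.endswith(("TGEKP", "TGEK")):
        return finger
    if finger.endswith("TG"):
        return finger + "EKP"   # the text notes EK(P) can be added after TG
    raise ValueError("finger does not end in TG, TGEK or TGEKP")

def join_fingers(fingers, leader="MAEEKP"):
    """Concatenate fingers N- to C-terminus behind an optional leader."""
    parts = [leader]
    for i, finger in enumerate(fingers):
        is_last = i == len(fingers) - 1
        segment = finger if is_last else ensure_linker(finger)
        # Assumed reading of "a P may be removed where necessary":
        # avoid a doubled proline at each junction.
        if parts[-1].endswith("P") and segment.startswith("P"):
            segment = segment[1:]
        parts.append(segment)
    return "".join(parts)

protein = join_fingers([
    "PYKCPECGKSFSQKSDLVKHQRTHTG",
    "PYKCSECGKAFSQKSNLTRHQRIHTGEKP",
])
```

In practice the document prefers joining the encoding nucleic acid sequences; the same logic would apply to codons rather than residues.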
  • the invention also provides zinc finger polypeptides which comprise more than three zinc fingers, such as four, five, six, seven, eight or nine zinc fingers.
  • the invention comprises a zinc finger polypeptide which includes the natural zinc fingers TFIIIA and Zif268.
  • a zinc finger protein has been engineered that contains a novel linker sequence.
  • the novel linker joins two three-zinc-finger domains, but may be used to join multiple groups of zinc finger domains or other domains used in engineered transcription factors. This linker differs from previously described linkers in that it is structured and consists of a single non-DNA-binding zinc finger.
  • structured linkers are more suitable for connecting zinc finger domains that bind to subsites separated by longer gaps of DNA sequence than linkers previously described, for example, the 8 and 11 amino acid linkers used to span 1 and 2 base pairs respectively. No linkers have been designed for spanning longer regions. The ability of structured linkers to span longer gaps is probably due to the fact that these linkers confine the relative freedom of the two domains, thus minimising the conformational search that precedes binding and also the entropy loss on binding.
  • the multiple zinc finger protein that we have engineered here is composed of zinc fingers 1-3 of TFIIIA and the three zinc fingers from Zif268 joined by zinc finger 4, including flanking sequences, of TFIIIA.
  • Zinc finger 4 of TFIIIA does not bind DNA but acts as a linker between the two sets of zinc fingers that are involved in DNA recognition. Despite the fact that this zinc finger does not make any base contacts within the major groove of the DNA, it is folded in the classical way for Cys2His2 zinc fingers, around a Zn(II) ion, and contains an alpha helix within its structure (Nolte et al, 1998). Although this particular finger was used in this example solely because it was a familiar structured polypeptide, we believe that other tertiary structures would also be suitable for use as structured linkers.
  • the DNA binding site for the TFIIIAZif protein contains the DNA recognition sites for zinc fingers 1-3 of TFIIIA and the three zinc fingers of Zif 268. These are the DNA sequences GGATGGGAGAC and GCGTGGGCGT, respectively, as shown in Sequence ID 3.
  • the six base pair sequence GTACCT in Sequence ID 3 is a spacer region of DNA that separates the two binding sites and the nucleotide composition of the DNA spacer appears to have no effect on binding of the protein. Therefore, this or other structured linkers could be used with other DNA spacers of different length and sequence.
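To make the site layout concrete, the following sketch locates the two recognition sequences in a DNA string and reports the spacer between them. The function name is hypothetical, and the order in which the two subsites are concatenated in the example is an assumption made for illustration; the actual arrangement is defined by Sequence ID 3.

```python
TFIIIA_SITE = "GGATGGGAGAC"   # recognition site for TFIIIA fingers 1-3
ZIF268_SITE = "GCGTGGGCGT"    # recognition site for the Zif268 fingers

def spacer_between(dna, site_a, site_b):
    """Return the DNA between the first site_a and the next site_b.

    Returns None if either site is missing or site_b does not occur
    downstream of site_a.
    """
    start_a = dna.find(site_a)
    if start_a < 0:
        return None
    end_a = start_a + len(site_a)
    start_b = dna.find(site_b, end_a)
    if start_b < 0:
        return None
    return dna[end_a:start_b]

# Illustrative composite site; the subsite order is assumed.
composite = TFIIIA_SITE + "GTACCT" + ZIF268_SITE
spacer = spacer_between(composite, TFIIIA_SITE, ZIF268_SITE)
```

Because the spacer composition reportedly has no effect on binding, the same check could be run on candidate targets with spacers of other lengths and sequences.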
  • the amino acid sequence of zinc finger 4 of TFIIIA is NIKICVYVCHFENCGKAFKKHNQLKVHQFSHTQQLP.
  • the nucleotide Sequence of Zinc Finger 4 of TFIIIA, including the flanking sequences, is -
  • RNA binding molecule is an RNA binding polypeptide.
  • Preferred polypeptides capable of binding RNA may comprise one or more domains which facilitate interaction with RNA, for example, the RNA Recognition Motif (RRM).
  • cs-RBD (or RBD): consensus sequence RNA-binding domain.
  • the βαββαβ secondary structural elements of RBDs form a four-stranded antiparallel β sheet packed against the two perpendicularly oriented α helices.
  • Amino acids of RNP1 and RNP2 are solvent exposed and make direct contact with bound RNA, probably through hydrogen bonds and ring stacking.
  • a second role is structural: the aromatic side chain at the last position of RNP1 points to the interior of the folded domain and, along with other highly conserved hydrophobic amino acids in the two α helices, forms part of the hydrophobic core of the domain.
  • Other structural features include the pronounced right-handed twist of the β sheets, a very small antiparallel β sheet between α2 and β4 with a type F turn, and bulges in β1 and β4.
  • The highly conserved RNP1 and RNP2, although crucial for RNA binding, probably do not distinguish between different RNA sequences.
  • Major determinants of RNA-binding specificity reside in the most variable regions of the RNP motif.
  • Short arginine-rich sequences also called Basic domains
  • ribosomal proteins also mediate RNA binding.
  • Among arginine-rich elements, there is little identity between different ARM sequences, and the structures of the ARM regions of two proteins, Tat and Rev, are diverse.
  • Rev binds with high affinity to an internal bulged loop (the Rev responsive element, RRE) found in all intron-containing viral mRNAs.
  • Another HIV-encoded ARM protein, Tat, binds the trans-acting responsive element (TAR) of HIV mRNAs and functions in transcription. Amino acids outside the Rev and Tat ARMs also contribute to RNA binding. Peptides encompassing the Rev ARM (TRQARRNRRRWRERQ) specifically bind RRE as an α-helix, and at least six amino acids, including four arginines, are essential for specificity of binding.
  • Tat ARM (ALGISYGRKKRRQRRRP) peptides are unstructured but adopt a stable conformation upon binding TAR. Again, amino acids outside the ARM are required for wild-type binding activity.
  • RNA binding sites of proteins are complex, and consist of stem-loops (N-proteins), internal loops (Rev) or bulges (Tat), and their structure, rather than particular sequence, may be the major binding determinant.
  • the RGG box is a 20-25 amino acid long RNA-binding motif typically found in combination with other types of RNA-binding domains.
  • the motif is defined as closely spaced Arg-Gly-Gly (RGG) repeats interspersed with other, often aromatic amino acids.
  • the high density of glycine and variations within the motif suggest that it is not a rigid protein structure, but spectroscopic modelling of nucleolin RGGF repeats (GRGGFGGRGGGRGGRGGFGGRGR) predicted a helical β-spiral structure.
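The definition of the RGG box as closely spaced Arg-Gly-Gly repeats can be checked directly on the nucleolin peptide quoted above. This is a minimal sketch; the function name `rgg_positions` is hypothetical and the scan simply reports every position where an RGG tripeptide begins.

```python
import re

# Nucleolin RGGF repeat peptide quoted in the text.
NUCLEOLIN_RGGF = "GRGGFGGRGGGRGGRGGFGGRGR"

def rgg_positions(peptide):
    """Return 0-based start positions of every Arg-Gly-Gly tripeptide."""
    # A lookahead is used so that overlapping occurrences are not skipped.
    return [m.start() for m in re.finditer(r"(?=RGG)", peptide)]

positions = rgg_positions(NUCLEOLIN_RGGF)
```

The peptide is 23 residues long, within the 20-25 residue range given for the motif, and its repeats are separated by at most a few glycine or phenylalanine residues.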
  • K homology (KH) motif containing proteins have been associated with important biological functions. A single amino acid substitution that unfolds a human KH domain leads to the fragile-X syndrome. The definitive role of the KH motif in RNA binding is not clear, but the motif is essential for RNA binding and it probably binds RNA directly. KH motifs are found in one or multiple copies per protein. Their topology is βααββα.
  • Double-Stranded RNA-Binding Motif (DSRM or dsRBD)
  • the dsRBD domain (a region of around 70 amino acids) is a general double-stranded RNA-binding module with α-β-β-β-α topology. Isolated domains bind double-stranded RNA of any sequence with little or no specificity, but multiple domains may specifically recognize certain RNA structures. Conserved positions, including many basic (Arg and Lys) and hydrophobic amino acids, are scattered throughout the DSRM. Some DSRM proteins bind unique RNA sequences and they do not bind dsDNA. DSRM proteins are involved in diverse functions and provide an example of post-transcriptional gene regulation by RNA-binding proteins. An important component of the response of mammalian cells to viral infection is the interferon-induced protein kinase (DAI), containing two DSRMs.
  • a cellular protein containing two DSRMs binds the TAR RNA element of HIV mRNA and may influence Tat-mediated activation.
  • RNP, KH and DSRM represent αβ protein folds with an antiparallel β-sheet on one face of the protein packed by a hydrophobic core against an α-helical face. Although a number of all-helical RNA-binding proteins have been identified, the αβ structural theme is conserved in many RNA-binding proteins that do not share sequence homology with RNP, KH and DSRM motifs.
  • RNA-binding proteins including retroviral nucleocapsid proteins, RNA polymerases and yeast RNA-binding proteins, contain sequences with appropriately spaced cysteine-histidine residues that relate these proteins to the zinc finger family of DNA-binding proteins.
  • a generalized zinc-finger-knuckle motif can be written as CX2-5CX4-12C/HX2-
  • TFIIIA a nine zinc-finger protein that binds both the 5S rRNA gene and 5S rRNA.
  • the middle three out of nine zinc fingers are primarily responsible for RNA binding.
  • the amino acid sequences of several eukaryotic transcription factors (Y-box proteins) are related to the bacterial cold shock domain and they may have dual roles as DNA- and RNA-binding proteins.
  • Most tRNA synthetases have motifs that are common to related groups of synthetases and whose amino acids directly contact RNA. They can facilitate or hinder the formation of specialized complexes at particular sites on the RNA.
  • RNA binding proteins can serve as structural components and form, together with RNA, stable RNP particles, or they may serve to transport and localize RNAs.
  • Table 1 lists some non-limiting examples of post-translational modifications known to affect polypeptides. Any of these modifications may be used in the methods disclosed here for selection of gene switches using modified entities.
  • a particularly important polypeptide modification is phosphorylation and dephosphorylation.
  • the art is replete with references to enzymes capable of effecting phosphorylation and dephosphorylation, i.e. protein kinases and phosphatases, and their targets, including consensus phosphorylation motifs (such as -SQ- or -TQ- for the DNA dependent protein kinase (DNA-PK)).
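As a small worked example of a consensus phosphorylation motif, the -SQ-/-TQ- motif cited for DNA-PK can be located with a regular expression. The function name and the test peptide are hypothetical; a motif hit only flags a candidate site and does not show that the residue is phosphorylated.

```python
import re

def find_sq_tq_sites(protein):
    """Return (position, residue) for each Ser or Thr followed by Gln.

    Positions are 0-based indices of the candidate phospho-acceptor.
    """
    return [(m.start(), m.group(1)) for m in re.finditer(r"([ST])Q", protein)]

# Hypothetical peptide used only to exercise the function.
sites = find_sq_tq_sites("ASQLTQK")
```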
  • protein kinases identified to date include the protein tyrosine kinase subfamily (such as PDGF receptors, EGF receptors, src family kinases (see Brown and Cooper, 1996, Biochimica et Biophysica Acta 1287: 121-149 for a review), the JAK kinase family (such as JAK1, JAK2 and tyk2), Erb B2, Bcr-Abl, Alk, Trk, Res/Sky - for a detailed review see Al-Obeidi et al, 1998, Biopolymers (Peptide Science), Vol 47: 197-223), the MAP kinase pathway subfamily (such as the MAP family, the ERK family, the MEK family, the MEKK family, RAF-1 and JNK), the cyclin-dependent kinase subfamily (such as p34 cdc2 and cdk2 - see Nigg, 1995, Bioessays 17: 471-480 for a protein
  • PK-A (protein kinase A); protein kinase C
  • PKG cyclic-GMP dependent kinase
  • Ca2+/calmodulin dependent kinases such as CaM kinase I, II and IV
  • DNA dependent protein kinase
  • PI3K (phosphoinositide 3-kinases)
  • PDK-1; the p21-activated protein kinase family (PAKs), such as Pak1, Pak2 and Pak3 (see Sells and Chernoff, 1997, Trends in Cell Biol.)
  • p70 S6 kinase; IκB kinase
  • casein kinase II; glycogen-synthase kinases
  • kinases include the src family tyrosine kinases Lck and
  • the TCRζ chain comprises specific tyrosine residues present in immunoreceptor tyrosine-based activation motifs (ITAMs) that are phosphorylated by Lck and Fyn (Kuriyan and Cowburn, 1997, ibid.).
  • the SH2 domain of another tyrosine kinase, ZAP70, binds to phosphorylated TCRζ.
  • the TCRζ ITAM and ZAP70 SH2 represent binding domains and binding partners that may be of interest in studying the activity of the kinases Lck and Fyn (see Elder et al, 1994, Science 264: 1596-1599 and Chan et al, 1994, Science 264: 1599-1601).
  • Another example is the IgE receptor γ subunit and the SH2 domain of Syk that may be used to study the activity of the Lyn kinase.
  • the PPP family includes the following catalytic subunits: PP1c, PP2Ac, PP2B, PPP1, PPP2A and PPP5 and the following regulatory subunits: NIPP-1, RIPP-1, p53BP2, ⁇ 4.5, PR65, PR55, PR72, PTPA, SV40 small T antigen, PPY, PP4, PP6 and PP5.
  • the PPM family includes pyruvate dehydrogenase phosphatase and Arabidopsis ABI1.
  • the protein tyrosine phosphatase family includes PTP1B, SHP-1, SHP-2 (cytosolic non-receptor forms), CD45 (see Thomas and Brown, 1999, Trends in Immunol, 20: 406 and Ashwell and D'Oro, 1999, Trends in Immunol, 20: 412 for further details), RPTP (receptor-like, transmembrane forms) and cdc25, kinase-associated phosphatase and MAP kinase phosphatase-1 (dual-specificity phosphatases).
  • PTP1B is known to associate with the insulin receptor in vivo (Bandyopadhyay et al, 1997, J. Biol. Chem. 272: 1639-1645).
  • Table 3 provides a non-limiting list of enzymes that are representative of some of the classes of modifying enzymes discussed herein which may be used to modify polypeptides.
  • Mono-ADP-ribosylation is a post-translational modification of proteins which is currently thought to play a fundamental role in cellular signalling.
  • a number of mono-ADP-ribosyl-transferases have been identified, including endogenous enzymes from both bacterial and eukaryotic sources and bacterial toxins.
  • a mono-ADP-ribosylating enzyme, using as substrates the protein to be modified and nicotinamide adenine dinucleotide (NAD+), is NAD:arginine ADP-ribosyltransferase (Zolkiewska et al, 1992, Proc. Natl. Acad. Sci. U.S.A., 89: 11352-11356).
  • This toxin induces the mono-ADP-ribosylation of BARS-50 (a G protein involved in membrane transport) and glyceraldehyde-3-phosphate dehydrogenase.
  • the cellular effects of brefeldin A include the blocking of constitutive protein secretion and the extensive disruption of the Golgi apparatus.
  • Inhibitors of the brefeldin A mono-ADP-ribosyl-transferase reaction have been shown to antagonise the disassembly of the Golgi apparatus induced by the toxin (Weigert et al, 1997, J. Biol. Chem, 272: 14200-14207).
  • a number of amino acid residues within proteins have been shown to function as ADP-ribose acceptors.
  • Bacterial transferases have been identified which modify arginine, asparagine, cysteine and diphthamide residues in target proteins.
  • Endogenous eukaryotic transferases are known which also modify these amino acids, in addition there is evidence that serine, threonine, tyrosine, hydroxyproline and histidine residues may act as ADP-ribose acceptors but the relevant transferases have not yet been identified (Cervantes-Laurean et al, 1997, Methods Enzymol, 280: 275-287 and references therein).
  • Poly-ADP-ribosylation is thought to play an important role in events such as DNA repair, replication, recombination and packaging and also in chromosome decondensation.
  • the enzyme responsible for the poly-ADP-ribosylation of proteins involved in these processes is poly (ADP-ribose) polymerase (PARP; for Drosophila melanogaster PARP, see Genbank Accession Nos. D13806, D13807 and D13808).
  • ADP-ribosylation sites are those found at Cys3 and Cys4 (underlined) of the B-50 protein (Coggins et al, 1993, J. Neurochem, 60: 368-371; SwissProt Accession No. P06836): MLCCMRRTKQVEKNDDD and Pγ (the γ subunit of cyclic GMP phosphodiesterase; Bondarenko et al, 1997, J. Biol. Chem, 272: 15856-15864; Genbank Accession No. X04270): FKQRQTRQFK.
  • N-linked glycosylation is a post-translational modification of proteins which occurs in the endoplasmic reticulum and Golgi apparatus and is utilized with some proteins en route for secretion or destined for expression on the cell surface or in another organelle.
  • the carbohydrate moiety is attached to Asn residues in the non-cytoplasmic domains of the target proteins, and the consensus sequence (Shakin-Eshleman, 1996, Trends Glycosci. Glycotech, 8: 115-130) for a glycosylation site is NxS/T, where x cannot be proline or aspartic acid.
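The NxS/T consensus stated above translates directly into a pattern search. The sketch below follows the rule exactly as given in the text (x excludes both proline and aspartic acid); the function name and the test peptide are hypothetical.

```python
import re

# N, then any residue except P or D, then S or T.
# The lookahead lets overlapping candidate sites be reported.
SEQUON = re.compile(r"(?=N[^PD][ST])")

def find_n_glycosylation_sites(protein):
    """Return 0-based positions of Asn residues in an NxS/T sequon."""
    return [m.start() for m in SEQUON.finditer(protein)]

# Hypothetical peptide: NGS and NKT qualify, NPS and NDT do not.
sites = find_n_glycosylation_sites("ANGSANPSNDTNKT")
```

A hit marks a potential site only; whether it is used also depends on the residue lying in a non-cytoplasmic domain, as the text notes.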
  • N-linked sugars have a common five-residue core consisting of two GlcNAc residues and three mannose residues due to the biosynthetic pathway.
  • This core is modified by a variety of Golgi enzymes to give three general classes of carbohydrate known as oligomannosyl, hybrid and lactosamine-containing or complex structures (Zubay, 1998, Biochemistry, Wm. C. Brown Publishers).
  • An enzyme known to mediate N-glycosylation at the initial step of synthesis of dolichyl-P-P-oligosaccharides is UDP-N-acetylglucosamine-dolichyl-phosphate-N-acetylglucosamine phosphotransferase (for mouse, Genbank Accession Nos. X65603 and S41875).
  • Oxygen-linked glycosylation also occurs in nature with the attachment of various sugar moieties to Ser or Thr residues (Hansen et al, 1995, Biochem. J, 308: 801-813).
  • Intracellular proteins are among the targets for O-glycosylation through the dynamic attachment and removal of O-N-Acetyl-D-glucosamine (O-GlcNAc; reviewed by Hart, 1997, Ann. Rev. Biochem, 66: 315-335).
  • Proteins known to be O-glycosylated include cytoskeletal proteins, transcription factors, the nuclear pore protein complex, and tumor-suppressor proteins (Hart, 1997, supra). Frequently these proteins are also phosphoproteins, and there is a suggestion that O-GlcNAc and phosphorylation of a protein play reciprocal roles. Furthermore, it has been proposed that the glycosylation of an individual protein regulates protein:protein interactions in which it is involved.
  • The identity of sites of O-GlcNAc is additionally known for a small number of proteins including c-myc (Thr58, also a phosphorylation site; Chou et al, 1995, J. Biol. Chem, 270: 18961-18965), the nucleopore protein p62 (see Reason et al, 1992, supra): MAGGPADTSDPL and band 4.1 of the erythrocyte (see Reason et al, 1992, supra): AQTITSETPSSTT.
  • the post-translational modification of proteins with fatty acids includes the attachment of myristic acid to the primary amino group of an N-terminal glycine residue (Johnson et al, 1994, Ann. Rev. Biochem, 63: 869-914) and the attachment of palmitic acid to cysteine residues (Milligan et al, 1995, Trends Biochem. Sci, 20: 181-186).
  • Fatty acylation of proteins is a dynamic post-translational modification which is critical for the biological activity of many proteins, as well as their interactions with other proteins and with membranes.
  • the location of the protein within a cell can be controlled by its state of prenylation (fatty acid modification) as can its ability to interact with effector enzymes (ras and MAP kinase, Itoh et al. , 1993, J. Biol. Chem, 268: 3025-; ras and adenylate cyclase in yeast; Horiuchi et al, 1992, Mol. Cell. Biol, 12: 4515) or with regulatory proteins (Shirataki et al, 1991, J. Biol.
  • Sentrin is a novel 101-amino acid protein which has 18% identity and 48% similarity with human ubiquitin (Okura et al, 1996, J. Immunol, 157: 4277-4281). This protein is known by a number of other names including SUMO-1, UBL1, PIC1, GMP1 and SMT3C and is one of a number of ubiquitin-like proteins that have recently been identified. Sentrin is expressed in all tissues (as shown by Northern blot analysis), but mRNA levels are higher in the heart, skeletal muscle, testis, ovary and thymus.
  • RanGAP1: Ran-specific GTPase-activating protein
  • NPC nuclear pore complex
  • Sentrin has been shown in yeast two-hybrid screens to interact with a number of proteins; these include the death domains of Fas/APO1 and the TNF receptors, PML, RAD51 and RAD52 (Johnson and Hochstrasser, 1997, supra).
  • Fas/APO1 and TNF receptors are involved in transducing the apoptosis signal via their death domains.
  • Ligation of Fas on the cell surface results in the formation of a complex via death domains and death-effector domains, triggering the induction of apoptosis.
  • the overexpression of sentrin protects cells from both anti-Fas/APO and TNF-induced cell death (Okura et al, 1996, supra). It is not clear whether this protection is achieved simply by preventing the binding of other proteins to these death domains or whether a more complex process is involved, possibly one involving the ubiquitin pathway.
  • PML a RING finger protein
  • PML is found in a nuclear multiprotein complex known as a nuclear body. These nuclear bodies are disrupted in acute promyelocytic leukaemia, where a chromosomal translocation generates a fusion between regions of the retinoic acid receptor and PML. This disruption can be reversed by treatment with retinoic acid. It has been shown that PML is covalently modified at multiple sites by members of the sentrin family of proteins (but not by ubiquitin or NEDD8). Two forms of the aberrant fusion protein have been identified, neither of which is modified by sentrin.
  • a modified polypeptide comprises one or more modified amino acids.
  • modified amino acids include: 2-Aminoadipic acid, 3-Aminoadipic acid, beta-Alanine, beta-Aminopropionic acid, 2-Aminobutyric acid, 4-Aminobutyric acid, piperidinic acid, 6-Aminocaproic acid, 2-Aminoheptanoic acid, 2-Aminoisobutyric acid, 3-Aminoisobutyric acid, 2-Aminopimelic acid, 2,4-Diaminobutyric acid, Desmosine, 2,2'-Diaminopimelic acid, 2,3-Diaminopropionic acid, N-Ethylglycine, N-Ethylasparagine, Hydroxylysine, allo-Hydroxylysine, 3-Hydroxyproline, 4-Hydroxyproline, Isodesmosine, allo-Isoleucine, N-Methylglycine
  • Nucleic acid modifications which are useful in the embodiments of the invention which make use of modified nucleic acid or polypeptide to select gene switches, are discussed in this section.
  • the modified nucleic acids may comprise epigenetic modifications such as methylated nucleic acids, or comprise nucleotide analogues as described below, etc.
  • Nucleotides generally include a base, a sugar and a phosphate group, with the base generally located at the 1' position of a sugar moiety.
  • Modified nucleic acids generally comprise one or more modified nucleotides (also referred to interchangeably as nucleotide analogs, modified nucleotides, non-natural nucleotides, non-standard nucleotides and others; see for example, Usman and McSwiggen, supra; Eckstein et al. International PCT Publication No. WO 92/07065; Usman et al. International PCT Publication No. WO 93/15187; all hereby incorporated by reference herein).
  • modified nucleotides may be modified at the sugar, phosphate and/or base moiety.
  • modified nucleic acid bases are known in the art, as recently summarized by Limbach et al, 1994, Nucleic Acids Res. 22, 2183.
  • base modifications that can be introduced into nucleic acids include inosine, purine, pyridin-4-one, pyridin-2-one, phenyl, pseudouracil, 2,4,6-trimethoxybenzene, 3-methyluracil, dihydrouridine, naphthyl, aminophenyl, 5-alkylcytidines (e.g., 5-methylcytidine), 5-alkyluridines (e.g., ribothymidine), 5-halouridine (e.g., 5-bromouridine) or 6-azapyrimidines or 6-alkylpyrimidines (e.g.
  • modified nucleotides include: 4-acetylcytidine, 5-(carboxyhydroxymethyl)uridine, 2'-O-methylcytidine, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluridine, dihydrouridine, 2'-O-methylpseudouridine, beta-D-galactosylqueuosine, 2'-O-methylguanosine, inosine, N6-isopentenyladenosine, 1-methyladenosine, 1-methylpseudouridine, 1-methylguanosine, 1-methylinosine, 2,2-dimethylguanosine, 2-methyladenosine, 2-methylguanosine, 3-methylcytidine, 5-methylcytidine, N6-methyladenosine, 7-methylguanosine, 5-methylaminomethyl
  • Methylation of DNA is an epigenetic modification that can play an important role in the control of gene expression in mammalian cells (reviewed in Momparler and Bovenzi 2000, J Cell Physiol 183(2): 145-54 and Robertson and Jones, 2000, Carcinogenesis 21, 461-467).
  • spermine conjugated oligonucleotides may be made by linking a sperminyl moiety at N4 of dC.
  • terminal amino groups of spermidine may be derivatised into a guanidinium function.
  • the spermine appendage at C4 of 5-Me-dC may be replaced with 1,11-diamino-3,6,9-trioxaundecane to create 5-Me-dC-(N4-tetraethyleneglycolmonoamine) (teg)-ODNs.
  • Spermine conjugated oligonucleotides are described in detail in Org. Chem, 1997, 62, 5169-5173; Tung, C. H, Breslauer, K. J.
  • 5-Amino-dU oligonucleotides may be made by replacement of the 5-methyl group of T by an amino function to generate 5-amino-dU, a purine mimic.
  • 5-amino-dU is described in detail in Barawkar, D. A, Krishna Kumar, R. and Ganesh, K. N, Tetrahedron, 1992, 48, 8505-8514; Barawkar, D. A. and Ganesh, K. N, BioMed. Chem. Lett, 1993, 3, 347-352; Rana, V. S, Barawkar, D. A. and Ganesh, K. N, J. Org. Chem, 1996, 61, 3578-3579; Trapane, T. L, Christopherson, M. S, Coby, C. D, Ts'O, P. and Wang, D, J. Am. Chem. Soc, 1994, 116, 8412-8420.
  • the 5-amino function in 5-Amino-dU may be used to append ligands such as fluorescent groups, metallocomplexes, peptides, etc.
  • dansyl and 5/6-carboxyfluorescein groups have been linked to this analogue to form 5-amidodansyl-dU, etc.
  • Fluorescent oligonucleotides are described in detail in Singh, D, Kumar, V. A. and Ganesh, K. N, Nucleic Acids Res, 1990, 18, 3339-3345; Barawkar, D. A. and Ganesh, K. N, Nucleic Acids Res, 1995, 23, 159-164; Barawkar, D. A. and Ganesh, K. N, Biochem.
  • Chloramphenicol backbone containing oligonucleotides are described in detail in Sanghvi, Y. S. and Cook, P. D, Carbohydrate Modification in Antisense Research, ACS Symposium Series, ACS, Washington DC, 1994; Rana, V. S, Kumar, V. A. and Ganesh, K. N, Bioorg. Med. Chem. Lett, 1997, 7, 2837-2842.
  • PNAs Peptide nucleic acids
  • their derivatives such as chiral, fluorescent and polyamine conjugates
  • Peptide nucleic acids are a novel class of non-chiral, designed synthetic molecules and have an ethylenediamine-glycine backbone to which the nucleobases are linked (at N) through an acetyl chain.
  • PNA is chemically stable, in contrast to natural nucleic acids and peptides. PNAs are described in detail in Nielsen, P. E, Egholm, M, Berg, R. H. and Buchardt, O, Science, 1991, 254, 1497-1501; Nielsen, P. E, Egholm, M.
  • nucleic acid may be modified by binding, conjugating or linking a ligand or moiety to the nucleic acid.
  • the modified nucleic acid is produced by the reaction of a nucleic acid capable of being derivatised together with a modifying moiety. More preferably, the nucleic acid capable of being derivatised contains an amino, thio, oxo or bromo group, or a group that can be chemically or photo-activated. Photochemically induced cross-linking is especially suitable for this purpose.
  • modified nucleic acids include derivatives of nucleic acids, for example, 5-bromo-pyrimidines, 5-iodo-pyrimidines and 4-thiopyrimidines.
  • Photoaffinity labelling using modified bases such as phosphoramidites of thionucleosides is also possible.
  • 4-thiothymidine and 6-thiodeoxyguanosine (S6-dG) are shown to cross-link effectively with EcoRV endonuclease and methyltransferase.
  • nucleobases containing thiocarbonyl groups can also be chemically modified selectively at the sulfur position by alkylating reagents.
  • 6-Thiodeoxyguanosine has also been incorporated into G-rich triple helix-forming oligonucleotides. Replacement of all or some G residues in G-rich oligonucleotides with S6-dG has been shown to inhibit self association and formation of G tetrads, especially in potassium buffers. This allows triple helix formation to take place normally.
  • Deprotected bases may also be effectively used in the manufacture of modified nucleic acids.
  • an S6-dG monomer (Glen Research, www.glenres.com) in which the S6 position has been protected with cyanoethyl and N2 with trifluoroacetyl protecting groups may be used.
  • the synthesis column with oligonucleotides containing S6-dG may be treated with 1 M 1,8-diazabicyclo[5.4.0]undec-7-ene (DBU) in anhydrous acetonitrile at room temperature for 5 hours to remove the S6-cyanoethyl protecting group.
  • the oligo deprotection is completed with 50 mM sodium hydrosulfide (NaSH) in ammonium hydroxide at room temperature for 24 hours.
  • Ribonucleotides may also be derivatised or modified to produce modified RNA such as 2'-Amino-RNA.
  • RNA modification is currently in vogue for such applications as antisense and ribozymes.
  • interesting changes in RNA activity can be effected by substituting the 2'-hydroxyl with 2'-fluoro or 2'-O-alkyl groups.
  • a further substitution which may be used could include the 2'-amino group.
  • the thermal stability of duplexes containing 2'-amino-RNA has been determined and it is reported that 2'-amino-C substitutions destabilized by about 4 relative to RNA C. It is also further reported that 2'-amino-RNA linkages are nuclease-resistant.
  • the pKa of the 2'-amino group is quite low at 6.2 but this retains sufficient nucleophilicity to allow conjugation reactions to take place. It is therefore possible to label a 2'-amino group with a fluorophore like rhodamine. This activity has been used to investigate thermal motion in a large ribozyme.
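  • The relationship between the stated pKa of 6.2 and the availability of the 2'-amino group for conjugation can be illustrated with the Henderson-Hasselbalch equation. The following is a minimal sketch only: the function name and the example pH values are assumptions made for illustration and do not appear in the source, and the calculation treats the amine as an ideal monoprotic base.

```python
def unprotonated_fraction(ph: float, pka: float = 6.2) -> float:
    """Fraction of an amine present in its neutral (nucleophilic) form.

    Henderson-Hasselbalch: [B] / ([B] + [BH+]) = 1 / (1 + 10**(pKa - pH)).
    The default pKa of 6.2 is the value quoted in the text for the
    2'-amino group; everything else here is illustrative.
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # Hypothetical pH values chosen only to show the trend.
    for ph in (5.0, 6.2, 7.0, 8.0):
        print(f"pH {ph:.1f}: {unprotonated_fraction(ph):.2f} unprotonated")
```

  Under these assumptions, half of the amine is unprotonated at pH 6.2 and most of it is unprotonated at neutral pH, which is consistent with the statement that conjugation reactions can still take place.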
  • the 2'-position within an RNA duplex is directed towards the outside of the helix in a location which is very amenable to interhelix contact.
  • the researchers were able to conjugate a disulfide group to the 2'-amino group via an activated ester to yield intermediates. An exchange reaction between the activated disulfide and a thiol in the complementary section or strand neatly forms a disulfide cross-link.
  • Sequence Modifiers may be designed for use in automated synthesis of modified nucleic acids.
  • the carboxy-dT is hydrolyzed during deprotection and may be coupled directly to a molecule containing a primary amino group by a standard peptide coupling or via the intermediate N-hydroxysuccinimide (NHS) ester.
  • Both Amino-Modifier dT products can be added in place of a Thymidine residue during oligonucleotide synthesis.
  • the primary amine on the C6 analogue is separated from the oligonucleotide by a spacer arm with a total of 10 atoms and can be labelled or attached to an enzyme.
  • the C2 analogue is more suitable for the attachment of molecules designed to react with the oligonucleotide.
  • Oligonucleotides containing pyrazolo[3,4-D]pyrimidines are described in detail in US Patent No. 6,127,121.
  • Arrays of modified nucleic acid probes and methods using them are described in detail in US Patent No. 6,156,501.
  • modified nucleic acids or modifications of nucleic acids described above, including modified DNA and modified RNA, and other known modifications of these, may be used in the methods of our invention for selecting switching systems such as protein or gene switches.
  • a nucleic acid encoding a polypeptide including a nucleic acid binding protein (which may be a DNA binding protein) as well as a polypeptide binding protein according to the invention can be incorporated into vectors for further manipulation.
  • vector or plasmid refers to discrete elements that are used to introduce heterologous nucleic acid into cells for either expression or replication thereof. Selection and use of such vehicles are well within the skill of the person of ordinary skill in the art. Many vectors are available, and selection of appropriate vector will depend on the intended use of the vector, i.e. whether it is to be used for DNA amplification or for nucleic acid expression, the size of the DNA to be inserted into the vector, and the host cell to be transformed with the vector.
  • Each vector contains various components depending on its function (amplification of DNA or expression of DNA) and the host cell for which it is compatible.
  • the vector components generally include, but are not limited to, one or more of the following: an origin of replication, one or more marker genes, an enhancer element, a promoter, a transcription termination sequence and a signal sequence.
  • Both expression and cloning vectors generally contain a nucleic acid sequence that enables the vector to replicate in one or more selected host cells.
  • this sequence is one that enables the vector to replicate independently of the host chromosomal DNA, and includes origins of replication or autonomously replicating sequences.
  • origins of replication or autonomously replicating sequences are well known for a variety of bacteria, yeast and viruses.
  • the origin of replication from the plasmid pBR322 is suitable for most Gram-negative bacteria, the 2μ plasmid origin is suitable for yeast, and various viral origins (e.g. SV40, polyoma, adenovirus) are useful for cloning vectors in mammalian cells.
  • the origin of replication component is not needed for mammalian expression vectors unless these are used in mammalian cells competent for high level DNA replication, such as COS cells.
  • Most expression vectors are shuttle vectors, i.e. they are capable of replication in at least one class of organisms but can be transfected into another class of organisms for expression.
  • a vector is cloned in E. coli and then the same vector is transfected into yeast, mammalian or plant cells even though it is not capable of replicating independently of the host cell chromosome.
  • DNA may also be replicated by insertion into the host genome.
  • the recovery of genomic DNA encoding the nucleic acid or polypeptide binding protein is more complex than that of episomally replicated vector because restriction enzyme digestion is required to excise nucleic acid binding protein
  • DNA can be amplified by PCR and be directly transfected into the host cells without any replication component.
  • an expression and cloning vector may contain a selection gene, also referred to as a selectable marker.
  • This gene encodes a protein necessary for the survival or growth of transformed host cells grown in a selective culture medium. Host cells not transformed with the vector containing the selection gene will not survive in the culture medium.
  • Typical selection genes encode proteins that confer resistance to antibiotics and other toxins, e.g. ampicillin, neomycin, methotrexate or tetracycline, complement auxotrophic deficiencies, or supply critical nutrients not available from complex media.
  • Selectable markers which may be used in fungal cells include wild-type genes which complement auxotrophic defects in for example the Uracil (e.g. URA3 gene), Lysine (e.g. LYS2 gene), Adenine (e.g. ADE2 gene), Methionine (e.g. MET3 gene), Histidine (e.g. HIS3 gene), Tryptophan (e.g. TRP1 gene), Leucine (e.g. LEU2 gene) or other metabolic pathways.
  • Examples of these include: 5-fluoro-orotic acid, which is converted to a toxic compound by the action of the URA3 gene product; α-amino-adipic acid, which is converted to a toxic compound by the LYS2 gene product; allyl alcohol, which is converted to a toxic compound by alcohol dehydrogenase activity as encoded by the ADH genes, or any other suitable selective regime known to those skilled in the art.
  • Other selective markers are based on the expression of a gene in a fungus such as yeast which overcomes the metabolic arrest induced by, or toxicity of, a chemical entity which may be added to the growth medium or otherwise presented to the cells.
  • Examples of these may include the KAN gene(s) which confer resistance to antibiotics such as G418, the HIS3 gene which confers resistance to 3-amino-triazole, or the ADH2 gene which can confer resistance to heavy metal ions such as cadmium, or any other suitable genes which confer resistance to toxic or growth arresting regimes.
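  • The fungal marker genes and toxic-substrate examples listed above can be collected into a small lookup structure. The sketch below is illustrative only: the dictionary contents are taken from the text, but the function `selective_medium` and the rule it encodes (omit the nutrient that a vector marker complements, keep supplying the host's remaining requirements) are assumptions added here, not a procedure stated in the source.

```python
from __future__ import annotations

# Marker gene -> nutrient whose auxotrophy it complements (from the text).
AUXOTROPHIC_MARKERS = {
    "URA3": "uracil",
    "LYS2": "lysine",
    "ADE2": "adenine",
    "MET3": "methionine",
    "HIS3": "histidine",
    "TRP1": "tryptophan",
    "LEU2": "leucine",
}

# Gene product -> substrate it converts to a toxic compound (from the text).
COUNTER_SELECTION = {
    "URA3": "5-fluoro-orotic acid",
    "LYS2": "alpha-amino-adipic acid",
    "ADH": "allyl alcohol",
}


def selective_medium(host_auxotrophies: set[str], vector_markers: set[str]) -> dict:
    """Hypothetical helper: decide which nutrients to omit or supply.

    A nutrient is omitted when the host cannot make it but a vector
    marker complements the defect; all other host requirements must
    still be supplied in the medium.
    """
    omit = sorted(
        AUXOTROPHIC_MARKERS[m]
        for m in vector_markers
        if AUXOTROPHIC_MARKERS.get(m) in host_auxotrophies
    )
    supply = sorted(n for n in host_auxotrophies if n not in omit)
    return {"omit": omit, "supply": supply}


if __name__ == "__main__":
    # Hypothetical host lacking uracil and leucine synthesis, URA3 vector.
    print(selective_medium({"uracil", "leucine"}, {"URA3"}))
```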
  • an E. coli genetic marker and an E. coli origin of replication are advantageously included. These can be obtained from E. coli plasmids, such as pBR322, Bluescript vector or a pUC plasmid, e.g. pUC18 or pUC19, which contain both E. coli replication origin and E. coli genetic marker conferring resistance to antibiotics, such as ampicillin.
  • Suitable selectable markers for mammalian cells are those that enable the identification of cells competent to take up nucleic acid binding protein or polypeptide binding protein nucleic acid, such as dihydrofolate reductase (DHFR, methotrexate resistance), thymidine kinase, or genes conferring resistance to G418 or hygromycin.
  • the mammalian cell transformants are placed under selection pressure that only those transformants which have taken up and are expressing the marker are adapted to survive.
  • selection pressure can be imposed by culturing the transformants under conditions in which the pressure is progressively increased, thereby leading to amplification (at its chromosomal integration site) of both the selection gene and the linked DNA that encodes the nucleic acid binding protein or the polypeptide binding protein.
  • Amplification is the process by which genes in greater demand for the production of a protein critical for growth, together with closely associated genes which may encode a desired protein, are reiterated in tandem within the chromosomes of recombinant cells. Increased quantities of desired protein are usually synthesised from thus amplified DNA.
  • Expression and cloning vectors usually contain a promoter that is recognised by the host organism and is operably linked to nucleic acid encoding nucleic acid binding protein or the nucleic acid encoding polypeptide binding protein. Such a promoter may be inducible or constitutive.
  • the promoters are operably linked to DNA encoding the binding protein by removing the promoter from the source DNA by restriction enzyme digestion and inserting the isolated promoter sequence into the vector. Both the native nucleic acid binding protein (or polypeptide binding protein, as the case may be) promoter sequence and many heterologous promoters may be used to direct amplification and/or expression of the binding protein.
  • Promoters suitable for use with prokaryotic hosts include, for example, the β-lactamase and lactose promoter systems, alkaline phosphatase, the tryptophan (trp) promoter system and hybrid promoters such as the tac promoter.
  • Their nucleotide sequences have been published, thereby enabling the skilled worker operably to ligate them to DNA encoding nucleic acid or polypeptide binding protein, using linkers or adapters to supply any required restriction sites.
  • Promoters for use in bacterial systems will also generally contain a Shine-Dalgarno sequence operably linked to the DNA encoding the nucleic acid or polypeptide binding protein.
  • Preferred expression vectors are bacterial expression vectors which comprise a promoter of a bacteriophage such as phage λ or T7 which is capable of functioning in the bacteria.
  • the nucleic acid encoding the fusion protein may be transcribed from the vector by T7 RNA polymerase (Studier et al, Methods in Enzymol. 185; 60-89, 1990).
  • In the E. coli BL21(DE3) host strain, used in conjunction with pET vectors, the T7 RNA polymerase is produced from the λ-lysogen DE3 in the host bacterium, and its expression is under the control of the IPTG inducible lacUV5 promoter. This system has been employed successfully for over-production of many proteins.
  • the polymerase gene may be introduced on a lambda phage by infection with an int- phage such as the CE6 phage which is commercially available (Novagen, Madison, USA).
  • vectors include vectors containing the lambda PL promoter such as pLEX (Invitrogen, NL), vectors containing the trc promoters such as pTrcHisXpress™ (Invitrogen) or pTrc99 (Pharmacia Biotech, SE) or vectors containing the tac promoter such as pKK223-3 (Pharmacia Biotech) or pMAL (New England Biolabs, MA, USA).
  • the nucleic acid binding protein or polypeptide binding protein gene according to the invention preferably includes a secretion sequence in order to facilitate secretion of the polypeptide from bacterial hosts, such that it will be produced as a soluble native peptide rather than in an inclusion body.
  • the peptide may be recovered from the bacterial periplasmic space, or the culture medium, as appropriate.
  • Suitable promoting sequences for use with yeast hosts may be regulated or constitutive and are preferably derived from a highly expressed yeast gene, especially a Saccharomyces cerevisiae gene.
  • a promoter of the ADHII gene or the acid phosphatase (PHO5) gene, a promoter of the yeast mating pheromone genes coding for the a- or α-factor or a promoter derived from a gene encoding a glycolytic enzyme such as the promoter of the enolase, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), 3-phosphoglycerate kinase (PGK), hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triose phosphate isomerase, phosphoglucose isomerase or glucokinase genes, or a promoter from the TATA binding protein (TBP) gene can be used.
  • hybrid promoters comprising upstream activation sequences (UAS) of one yeast gene and downstream promoter elements including a functional TATA box of another yeast gene
  • a hybrid promoter including the UAS(s) of the yeast PHO5 gene and downstream promoter elements including a functional TATA box of the yeast GAP gene (PHO5-GAP hybrid promoter)
  • a suitable constitutive PHO5 promoter is e.g. a shortened acid phosphatase PHO5 promoter devoid of the upstream regulatory elements (UAS) such as the PHO5 (-173) promoter element starting at nucleotide -173 and ending at nucleotide -9 of the PHO5 gene.
  • Binding protein gene transcription from vectors in mammalian hosts may be controlled by promoters derived from the genomes of viruses such as polyoma virus, adenovirus, fowlpox virus, bovine papilloma virus, avian sarcoma virus, cytomegalovirus (CMV), a retrovirus and Simian Virus 40 (SV40), from heterologous mammalian promoters such as the actin promoter or a very strong promoter, e.g. a ribosomal protein promoter, and from the promoter normally associated with nucleic acid binding protein or polypeptide binding protein sequence, provided such promoters are compatible with the host cell systems.
  • Enhancers are relatively orientation and position independent. Many enhancer sequences are known from mammalian genes (e.g. elastase and globin). However, typically one will employ an enhancer from a eukaryotic cell virus. Examples include the SV40 enhancer on the late side of the replication origin (bp 100-270) and the CMV early promoter enhancer. The enhancer may be spliced into the vector at a position 5' or 3' to binding protein DNA, but is preferably located at a site 5' from the promoter.
  • a eukaryotic expression vector encoding a nucleic acid binding protein or polypeptide binding protein according to the invention may comprise a locus control region (LCR).
  • LCRs are capable of directing high-level integration site independent expression of transgenes integrated into host cell chromatin, which is of importance especially where the binding protein gene is to be expressed in the context of a permanently-transfected eukaryotic cell line in which chromosomal integration of the vector has occurred, or in transgenic animals.
  • Eukaryotic vectors may also contain sequences necessary for the termination of transcription and for stabilising the mRNA. Such sequences are commonly available from the 5' and 3' untranslated regions of eukaryotic or viral DNAs or cDNAs. These regions contain nucleotide segments transcribed as polyadenylated fragments in the untranslated portion of the mRNA encoding nucleic acid or polypeptide binding protein.
  • An expression vector includes any vector capable of expressing nucleic acid binding protein nucleic acids and polypeptide binding protein nucleic acids that are operatively linked with regulatory sequences, such as promoter regions, that are capable of expression of such DNAs.
  • an expression vector refers to a recombinant DNA or RNA construct, such as a plasmid, a phage, recombinant virus or other vector, that upon introduction into an appropriate host cell, results in expression of the cloned DNA.
  • Appropriate expression vectors are well known to those with ordinary skill in the art and include those that are replicable in eukaryotic and/or prokaryotic cells and those that remain episomal or those which integrate into the host cell genome.
  • DNAs encoding relevant binding protein may be inserted into a vector suitable for expression of cDNAs in mammalian cells, e.g. a CMV enhancer-based vector such as pEVRF (Matthias et al., (1989) NAR 17, 6418).
  • nucleic acid binding protein and polypeptide binding protein constructs of the invention are expressed in plant cells under the control of transcriptional regulatory sequences that are known to function in plants.
  • the regulatory sequences selected will depend on the required temporal and spatial expression pattern of the binding protein in the host plant.
  • Many plant promoters have been characterised and would be suitable for use in conjunction with the invention. By way of illustration, some examples are provided below:
  • promoters are known in the art which direct expression in specific tissues and organs (e.g. roots, leaves, flowers) or in cell types (e.g. leaf epidermal cells, leaf mesophyll cells, root cortex cells).
  • the maize PEPC promoter from the phosphoenol carboxylase gene (Hudspeth & Grula Plant Mol. Bio. 12: 579-589 (1989)) is green tissue-specific; the trpA gene promoter is pith cell-specific (WO 93/07278 to Ciba-Geigy); the TA29 promoter is pollen-specific (Mariani et al. Nature 347: 737-741 (1990); Mariani et al. Nature 357: 384-387 (1992)).
  • promoters direct transcription in the presence of light, in the absence of light, or in a circadian manner.
  • the GS2 promoter described by Edwards and Coruzzi, Plant Cell 1: 241-248 (1989) is induced by light
  • the AS1 promoter described by Tsai and Coruzzi, EMBO J 9: 323-332 (1990) is expressed only in conditions of darkness.
  • promoters are wound-inducible and typically direct transcription not just on wound induction, but also at the sites of pathogen infection. Examples are described by Xu et al. (Plant Mol. Biol. 22: 573-588 (1993)); Logemann et al. (Plant Cell 1: 151-158 (1989)); and Firek et al. (Plant Mol Biol 22: 129-142 (1993)).
  • a number of constitutive promoters can be used in plants. These include the Cauliflower Mosaic Virus 35S promoter (US 5,352,605 and US 5,322,938, both to Monsanto) including minimal promoters (such as the -46 or -90 CaMV 35S promoter) linked to other regulatory sequences, the rice actin promoter (McElroy et al. Mol. Gen. Genet. 231: 150-160 (1991)), and the maize and sunflower ubiquitin promoters (Christensen et al. Plant Mol Biol. 12: 619-632 (1989); Binet et al. Plant Science 79: 87-94 (1991)).
  • the nucleic acid or polypeptide binding protein of the invention can be expressed in the required cell or tissue types. For example, if it is the intention to utilise the nucleic acid or polypeptide binding protein to regulate a gene in a specific cell or tissue type, then the appropriate promoter can be used to direct expression of the binding protein construct.
  • transcriptional terminator sequences that are known to function in plants include the nopaline synthase terminator found in the pBI vectors (Clontech catalog 1993/1994), the E9 terminator from the rbcS gene (ref), and the tml terminator from Cauliflower Mosaic Virus.
  • sequences found within the transcriptional unit are known to enhance gene expression and these can be used within the context of the current invention.
  • Such sequences include intron sequences which, particularly in monocotyledonous cells, are known to enhance expression.
  • intron 1 of the maize Adh1 gene and the intron from the maize bronze 1 gene have been found to be effective in enhancing expression in maize cells (Callis et al. Genes Develop. 1: 1183-1200 (1987)) and intron sequences are frequently incorporated into plant transformation vectors, typically within the non-translated leader.
  • a number of virus-derived non-translated leader sequences have been found to enhance expression, especially in dicotyledonous cells. Examples include the "Ω" leader sequence of Tobacco Mosaic Virus, and similar leader sequences of Maize Chlorotic Mottle Virus and Alfalfa Mosaic Virus (Gallie et al. Nucl. Acids Res. 15: 8693-8711 (1987); Shuzeski et al. Plant Mol Biol, 15: 65-79 (1990)).
  • the nucleic acid binding proteins of the current invention are targeted to the cell nucleus so that they are able to interact with host cell DNA and bind to the appropriate DNA target in the nucleus and regulate transcription. It may also be desirable to target the polypeptide binding proteins of the invention to the nucleus, if this is where the target polypeptides bound by the polypeptide binding proteins are located, and/or where the activity modulated by binding of the proteins to each other is to be expressed.
  • NLS Nuclear Localisation Sequence
  • Nuclear Localisation Signals of TGA-1A and TGA-1B (van der Krol et al.; Plant Cell 3: 667-675 (1991)).
  • a variety of transformation vectors are available for plant transformation and the nucleic acid or polypeptide binding protein encoding genes of the invention can be used in conjunction with any such vectors. The selection of vector will depend on the preferred transformation technique and the plant species which is to be transformed. For certain target species, different selectable markers may be preferred.
  • binary vectors or vectors carrying at least one T-DNA border sequence are suitable.
  • a number of vectors are available including pBIN19 (Bevan, Nucl. Acids Res. 12: 8711-8721 (1984)), the pBI series of vectors, and pCIB10 and derivatives thereof (Rothstein et al. Gene 53: 153-161 (1987); WO 95/33818 to Ciba-Geigy).
  • Binary vector constructs prepared for Agrobacterium transformation are introduced into an appropriate strain of Agrobacterium tumefaciens (for example, LBA 4404 or GV 3101) either by triparental mating (Bevan; Nucl. Acids Res. 12: 8711-8721 (1984)) or direct transformation (Höfgen & Willmitzer, Nucl. Acids Res. 16: 9877 (1988)).
  • any vector is suitable and linear DNA containing only the construct of interest may be preferred.
  • Direct gene transfer can be undertaken using a single DNA species or multiple DNA species (co-transformation; Schroder et al. Biotechnology 4: 1093-1096 (1986)).
  • transient expression usually involves the use of an expression vector that is able to replicate efficiently in a host cell, such that the host cell accumulates many copies of the expression vector, and, in turn, synthesises high levels of nucleic acid or polypeptide binding protein.
  • transient expression systems are useful e.g. for identifying DNA binding protein mutants, to identify potential phosphorylation sites, or to characterise functional domains, for example domains which mediate protein-protein interaction, of the protein.
  • Construction of plasmids according to the invention employs conventional ligation techniques. Isolated plasmids or DNA fragments are cleaved, tailored, and religated in the form desired to generate the plasmids required. If desired, analysis to confirm correct sequences in the constructed plasmids is performed in a known fashion. Suitable methods for constructing expression vectors, preparing in vitro transcripts, introducing DNA into host cells, and performing analyses for assessing DNA binding protein expression and function are known to those skilled in the art.
  • Gene presence, amplification and/or expression may be measured in a sample directly, for example, by conventional Southern blotting, Northern blotting to quantitate the transcription of mRNA, dot blotting (DNA or RNA analysis), or in situ hybridisation, using an appropriately labelled probe which may be based on a sequence provided herein. Those skilled in the art will readily envisage how these methods may be modified, if desired.
  • cells containing the above-described nucleic acids are provided.
  • host cells such as prokaryote, yeast and higher eukaryote cells may be used for replicating DNA and producing the DNA binding protein.
  • Suitable prokaryotes include eubacteria, such as Gram-negative or Gram-positive organisms, such as E. coli, e.g. E. coli K-12 strains, DH5α and HB101, or Bacilli.
  • Further hosts suitable for the nucleic acid or polypeptide binding protein encoding vectors include eukaryotic microbes such as filamentous fungi or yeast, e.g. Saccharomyces cerevisiae.
  • Higher eukaryotic cells include plant cells and animal cells such as insect and vertebrate cells, particularly mammalian cells including human cells, or nucleated cells from other multicellular organisms.
  • useful mammalian host cell lines are epithelial or fibroblastic cell lines such as Chinese hamster ovary (CHO) cells, NIH 3T3 cells, HeLa cells or 293T cells.
  • the host cells referred to in this disclosure comprise cells in in vitro culture as well as cells that are within a multicellular host organism.
  • DNA may be stably incorporated into cells or may be transiently expressed using methods known in the art.
  • Stably transfected cells may be prepared by transfecting cells with an expression vector having a selectable marker gene, and growing the transfected cells under conditions selective for cells expressing the marker gene. To prepare transient transfectants, cells are transfected with a reporter gene to monitor transfection efficiency.
  • the cells should be transfected with a sufficient amount of the nucleic acid or polypeptide binding protein-encoding nucleic acid to form the relevant binding protein.
  • the precise amounts of DNA encoding the nucleic acid or polypeptide binding protein may be empirically determined and optimised for a particular cell and assay.
  • Host cells are transfected or, preferably, transformed with the above-mentioned expression or cloning vectors of this invention and cultured in conventional nutrient media modified as appropriate for inducing promoters, selecting transformants, or amplifying the genes encoding the desired sequences.
  • Heterologous DNA may be introduced into host cells by any method known in the art, such as transfection with a vector encoding a heterologous DNA by the calcium phosphate coprecipitation technique or by electroporation. Numerous methods of transfection are known to the skilled worker in the field. Successful transfection is generally recognised when any indication of the operation of this vector occurs in the host cell. Transformation is achieved using standard techniques appropriate to the particular host cells used.
  • Transfected or transformed cells are cultured using media and culturing methods known in the art, preferably under conditions whereby the nucleic acid or polypeptide binding protein encoded by the DNA is expressed.
  • suitable media are known to those in the art and can be readily prepared.
  • Suitable culturing media are also commercially available. Transformation of plant cells is normally undertaken with a selectable marker which may provide resistance to an antibiotic or to a herbicide. Selectable markers that are routinely used in transformation include the nptII gene which confers resistance to kanamycin (Messing & Vieira Gene 19: 259-268 (1982); Bevan et al.
  • Agrobacterium-mediated transformation is generally a preferred technique as it has broad application to many dicotyledonous species and is generally very efficient.
  • Agrobacterium-mediated transformation generally involves the co-cultivation of Agrobacterium with explants from the plant and follows procedures and protocols that are known in the art. Transformed tissue is generally regenerated on medium carrying the appropriate selectable marker. Protocols are known in the art for many dicotyledonous crops including (for example) cotton, tomato, canola and oilseed rape, poplar, potato, sunflower, tobacco and soybean (see for example EP 0 317 511, EP 0 249 432, WO 87/07299, US 5,795,855).
  • nucleic acid and polypeptide binding protein constructs of the invention are suitable for expression in a variety of different organisms. However, to enhance the efficiency of expression it may be necessary to modify the nucleotide sequence encoding the nucleic acid or polypeptide binding protein to account for different frequencies of codon usage in different host organisms. Hence it is preferable that the sequences to be introduced into organisms, such as plants, conform to preferred usage of codons in the host organism.
  • codon sequences that have a GC content of at least 35% and preferably more than 45%. This is thought to be because ATTTA motifs destabilise messenger RNAs and AATAAA motifs may cause inappropriate polyadenylation, resulting in truncation of transcription.
  • Murray et al. (Nucl. Acids Res. 17: 477-498 (1989)) have shown that even within plants, monocotyledonous and dicotyledonous species have differing preferences for codon usage, with monocotyledonous species generally preferring GC richer sequences.
  • gene sequences can be altered to accommodate such preferences in codon usage in such a manner that the amino acids encoded by the DNA are not changed.
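The sequence features discussed above (GC content, ATTTA motifs and AATAAA motifs) can be checked computationally before a coding sequence is modified. The following is a minimal sketch only: the function names and the default 35% threshold parameter are illustrative assumptions and do not form part of the disclosed methods.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a DNA sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def find_motif(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) motif occurrence."""
    seq, motif = seq.upper(), motif.upper()
    width = len(motif)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == motif]


def screen_coding_sequence(seq, min_gc=0.35):
    """Summarise GC content and the two motifs named in the text.

    min_gc is a hypothetical parameter defaulting to the 35% figure above.
    """
    gc = gc_content(seq)
    return {
        "gc": gc,
        "gc_ok": gc >= min_gc,
        "ATTTA": find_motif(seq, "ATTTA"),    # possible mRNA destabilising motif
        "AATAAA": find_motif(seq, "AATAAA"),  # possible polyadenylation signal
    }
```

A sequence flagged by such a check could then be recoded with synonymous codons, as described above, and re-screened.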
  • Plants also have a preference for certain nucleotides adjacent to the ATG encoding the initiating methionine and for most efficient translation, these nucleotides may be modified.
  • a plant translational initiation context sequence. A variety of sequences can be inserted at this position. These include the sequence 5'-AAGGAGATATAACAATG-3' (Prasher et al Gene 111: 229-233 (1992); Chalfie et al.
  • Any changes that are made to the coding sequence can be made using techniques that are well known in the art and include site directed mutagenesis, PCR, and synthetic gene construction. Such methods are described in published patent applications EP 0 385 962 (to Monsanto), EP 0 359 472 (to Lubrizol) and WO 93/07278 (to Ciba-Geigy). Well known protocols for transient expression in plants can be used to check the expression of modified genes before their transfer to plants by transformation.
  • any of the vectors including expression vectors, described above, in particular those vectors comprising cloned members of a library of nucleic acid binding polypeptides, may be arranged in the form of an array.
  • the invention therefore includes such arrays of vectors and uses of these arrays.
  • a ligand according to the invention is typically any molecule capable of binding to any of the other components of a switching system.
  • a ligand is typically capable of binding to nucleic acid such as DNA, the nucleic acid binding molecule or any other component of the gene expression machinery.
  • nucleic acid binding ligands include acridine orange, 9-amino-6-chloro-2-methoxyacridine, actinomycin D, 7-aminoactinomycin D, echinomycin, dihydroethidium, ethidium-acridine heterodimer, ethidium bromide, propidium iodide, hexidium iodide, Hoechst 33258, Hoechst 33342, hydroxystilbamidine, psoralen, distamycin A, calicheamicin oligosaccharides, triple helix forming oligos, PNA, pyrrole-imidazole polyamides and synthetic peptides or peptide derivatives such as described by Les
  • a ligand may be capable of binding to a nucleic acid.
  • a nucleic acid binding ligand for example, a DNA binding ligand
  • a polypeptide capable of binding to the nucleic acid i.e., a DNA binding molecule such as a DNA binding polypeptide.
  • the term ligand includes molecules which are themselves comprised of nucleic acids, for example, RNA aptamers, and are capable of binding to a polypeptide, a nucleic acid, or both.
  • ligand and nucleic acid binding molecules capable of binding RNA and/or other nucleic acids.
  • Ligands may bind to the primary, secondary or tertiary structure of RNA.
  • RNA binding ligands include aminoglycosides, which are capable of binding ribosomal RNA and causing misreading of the genetic code, paromomycin, which is capable of interacting with the RNA major groove within a pocket created by an A-A base pair and a bulged adenine, and paromycin.
  • the ligand may be capable of binding to the nucleic acid binding molecule, or indeed to both the nucleic acid and the nucleic acid binding molecule.
  • a ligand is any molecule capable of binding to the polypeptide binding molecule (including a polypeptide binding protein), or another protein.
  • Protein binding ligands are known in the art, and include, for example, immunoglobulins, antibodies, ATP, cAMP, GABA, Fas ligand, CIDs (chemical inducers of dimerization), FK506 and FK1012 (as described in Spencer et al, 1993, Science 262, 1019), peptide hormone molecules, retinoic acid, acridine derivatives and other anticancer drugs as described in Finlay and Baguley (2000), Cancer Chemother Pharmacol 45, 417, etc.
  • the ligand may be capable of binding to both members of the protein switch, i.e., the first polypeptide and the second polypeptide.
  • Ligand mediated protein-protein association is described for example in Lin et al (1998), Blood 91, 890-897, Spencer et al (1993), Science 262, 1019, Keenan et al. (1998) Bioorganic and Medicinal Chemistry 6, 1309-1335 and Fan et al. (1999), Human Gene Therapy 10, 2273-2285.
  • ligands are also included provided that they are capable of binding to the nucleic acid or polypeptide components, as the case may be, of the switching system as described herein.
  • the ligands according to the present invention for use in the switching systems described here are preferably capable of modulating the interaction between the components of the switching system.
  • a ligand component is capable of affecting the strength of binding between the nucleic acid binding molecule component and the nucleic acid component of the switch.
  • the ligand is capable of affecting the strength of binding between the polypeptide members of the switch. Addition of a ligand to a switching system can therefore cause the association or disassociation between the components of the switching system.
  • the ligand may do this by direct or indirect means.
  • the ligand may be capable of directly binding to one or more members of the switching system to disrupt the binding or to precipitate association.
  • the ligand may be capable of binding to other entities associated with one or more members of the gene switch (including accessory proteins).
  • a nucleic acid binding molecule such as a DNA binding protein (for example a transcription factor) may be in the form of a complex which binds to the nucleic acid (e.g., DNA). Only one or some of the members of the complex, however, may actually physically contact the nucleic acid.
  • the ligand may bind to one or more members of the complex, not necessarily the nucleic acid binding molecule, to disrupt the binding between the nucleic acid binding molecule and the DNA (and hence the complex and the DNA).
  • association may be promoted by ligand binding to one or more members of a complex comprising the nucleic acid binding molecule. Similar considerations apply to protein switches.
  • a ligand according to the invention is capable of modulating the topology, locally or otherwise, of the nucleic acid or polypeptide to which it is bound.
  • a ligand according to the invention may be capable of modulating the topology and/or stereochemistry of a juxtaposed nucleic acid sequence motif to which it is desired to bind a DNA binding molecule according to the invention, or the topology and/or stereochemistry of a protein binding motif on a protein capable of binding to another protein.
  • the ligand is preferably capable of mimicking the topology and/or stereochemistry of the modified component, most preferably, at or around the interface with another component (which itself may or may not be modified).
  • the initial steps of selection in this embodiment may involve, for example, selection of a polypeptide modified with a moiety, which is capable of binding to a nucleic acid.
  • a subsequent step compares the binding of the unmodified form of the polypeptide to the nucleic acid, in the presence of one or more candidate ligands, to select ligand(s) which are capable of causing the unmodified polypeptide to bind to the nucleic acid.
  • the ligand in this case is preferably capable of mimicking the topology or stereochemistry, etc of the (unmodified) polypeptide where it interacts with the nucleic acid.
  • the ligand may do this by binding transiently to the polypeptide such that the resultant complex is topologically similar to the modified polypeptide.
  • the ligand may be similar in shape to the moiety.
  • the ligand bound to the unmodified polypeptide may adopt a similar charge or hydrophobicity, etc as the modified polypeptide.
  • the ligand and the unmodified polypeptide together preferably adopt or mimic one or more characteristics of the modified polypeptide. More preferably, the ligand mimics one or more characteristics of the moiety with which the polypeptide is modified.
  • Exemplary ligands for nucleic acid binding have shape and charge characteristics that allow them to reside along the DNA, in either the minor or major groove, intercalate or a combination of these.
  • Suitable ligands in addition to those known in the art may be selected by the use of nucleic acid or polypeptide binding assays.
  • a candidate ligand preferably a plurality of candidate ligands, is contacted with target nucleic acid or polypeptides and binding determined.
  • the targets may for example be labelled with a detectable label, such as a fluorophore/fluorochrome, such that after a wash step binding can be determined easily, for example by monitoring fluorescence.
  • the target with which the candidate binding ligands are contacted may be non-specific, such as a random polypeptide or nucleic acid libraries or sonicated genomic DNA and the like.
  • a specific sequence may be used, or a partially randomised library of sequences.
  • a ligand library may be in the form of a combinatorial chemical library.
  • ligands of the invention bind to polypeptides or DNA in a sequence and/or topology dependent manner so that binding can be restricted to a particular target, thus enhancing the specificity of the gene or protein switch. Specificity of binding may be determined, for example, by comparing the binding of the ligand to a target sequence with binding to a mixture of non-specific molecules.
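The specificity comparison described above can be expressed as a simple ratio of two binding signals. This is a sketch under stated assumptions: the function name, and the use of a plain ratio of background-corrected signals, are illustrative choices rather than anything specified in the text.

```python
def specificity_ratio(target_signal, nonspecific_signal):
    """Ratio of ligand binding to the target sequence over binding to a
    mixture of non-specific molecules (both as background-corrected signals).

    Values well above 1 suggest sequence- or topology-dependent binding.
    """
    if nonspecific_signal <= 0:
        raise ValueError("non-specific signal must be positive to form a ratio")
    return target_signal / nonspecific_signal
```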
  • Ligands according to the invention may bind conditionally to their targets.
  • psoralen is a ligand that can bind DNA covalently if illuminated at wavelengths of about 400 nm or less.
  • Ligands capable of binding their targets in more than one manner may be employed in the current invention.
  • Such ligands may bind or associate with the target via any one or more mechanism(s) such as outlined above.
  • libraries of ligands may be prepared.
  • libraries of ligands may be immobilised to a solid phase, such as a substantially planar solid phase, including membranes and non-porous substrates such as plastic and glass.
  • the resulting immobilised library may conveniently be used in high throughput screening procedures.
  • Particularly preferred ligands are those which are substantially non-toxic to plant and/or animal cells such that they may be administered to said cells and modulate binding of the nucleic acid or polypeptide binding molecule without having an adverse effect on the cells. Thus it may be desirable to pre-screen compounds to exclude toxic compounds.
  • preferred compounds are suitable for administration to animals and plants.
  • preferred compounds are capable of being taken up via the leaves (for foliar application) or roots of plants (for application to the soil) or of permeating seeds (for use in seed treatment). It may also be preferred to use compounds that can be taken up by bacteria, yeast and/or fungi that can themselves be delivered to the target host organism.
  • the compounds should also preferably be stable in the soil and/or plant for prolonged periods.
  • preferred compounds are suitable for topical, systemic, or oral administration.
  • target nucleic acid refers to any DNA or other nucleic acid for use in the methods of the invention.
  • This nucleic acid may be of known sequence, or may be of unknown sequence.
  • This nucleic acid may be prepared artificially in a laboratory, or may be a naturally occurring nucleic acid.
  • This nucleic acid may be in substantially pure form, or may be in a partially purified form, or may be part of an unpurified or heterogeneous sample.
  • the target nucleic acid is a putative promoter or other transcription regulatory region such as an enhancer. More preferably, the target nucleic acid is in substantially pure form. Even more preferably, the target nucleic acid is of known sequence.
  • the target nucleic acid is purified nucleic acid of known sequence of a promoter from a gene of interest, for example from a gene suspected of being associated with a disease state, more preferably from a gene useful in gene therapy.
  • target sequences of interest include sequence motifs that are bound by transcription factors, such as zinc fingers. Particular examples include the promoters of genes involved in the biosynthesis and catabolism of gibberellins (Phillips et al, Plant Physiol 108: 1049-1057 (1995), MacMillin et al.
  • Target nucleic acid may also be provided as a plurality of sequences, for example where one or more residues in the nucleic acid sequence are varied or random.
  • Examples of a plurality of sequences are libraries of nucleic acid sequences comprising putative zinc finger binding sites.
  • the plurality or library of target nucleic acid sequences is in the form of an array of target nucleic acid sequences, as described above.
  • the invention encompasses arrays of nucleic acids comprising putative zinc finger binding sites, and their use in screening for gene switches.
  • Other sequence motifs that bind the nucleic acid binding domain of a transcription factor may also be included in the plurality of sequences, typically varied or randomised at one or more positions.
  • the chemically inducible promoter fragments described above may be randomised to produce a plurality of target nucleic acid sequences for use in the screening methods of the present invention.
  • the invention includes arrays of randomised chemically inducible promoter fragments, and their use.
  • the methods of the present invention typically involve using a tripartite configuration of one or more nucleic acid binding molecules, one or more ligands and one or more target nucleic acid sequences as described above to screen for (i) nucleic acid binding molecules that bind to a target nucleic acid in a manner that is modulatable by a ligand (ii) ligands that modulate binding of a nucleic acid binding molecule to a target nucleic acid and/or (iii) a target nucleic acid that is bound modulatably by a nucleic acid binding molecule as a result of an interaction with a ligand.
  • the methods of the present invention typically involve using a tripartite configuration of one or more first polypeptide molecules, one or more ligands and one or more second polypeptide as described above to screen for (i) polypeptide binding molecules that bind to a (another) target polypeptide in a manner that is modulatable by a ligand and/or (ii) ligands that modulate binding of two polypeptides to each other.
  • the methods of the invention may be used to screen for any or all of the components of the gene switch system or protein switch system of the present invention.
  • one or two of the components is a known constant while two or one, respectively, of the other components are screened.
  • a given nucleic acid binding molecule and target nucleic acid may be used to screen a plurality of ligands or candidate ligands.
  • a plurality of nucleic acid binding molecules and of ligands may be screened against a given target nucleic acid for a gene switch, and a plurality of polypeptide binding molecules and of ligands may be screened against a given target polypeptide for a protein switch.
  • Other combinations are also envisaged.
  • Each component may be one individual molecular species or a plurality of molecular species. Where a plurality of species is used, they may be substantially all known, partially randomised or fully randomised.
  • the plurality of nucleic acid binding molecules may be a randomised zinc finger library and the plurality of target nucleic acid may be a library of nucleic acid molecules randomised at one or more, typically three or more contiguous, residues.
  • the plurality of polypeptide molecules may be a library of polypeptides randomised at one or more locations.
  • the library corresponding to the plurality of species is in the form of an array.
  • the invention provides a method for isolating multiple nucleic acid or polypeptide binding molecules in the presence of multiple ligands, said nucleic acid or polypeptide binding molecules being selected using multiple target nucleic acid sequences (or target polypeptides as the case may be) in a single selection (isolation) procedure.
  • Preferably, however, in the selection of a gene switch, at least one of the nucleic acid binding molecules, target nucleic acids and candidate ligands is provided in the form of an array.
  • the invention encompasses methods of selecting a gene switch in which one, two or all three of the nucleic acid binding molecules, target nucleic acids and candidate ligands is in the form of an array.
  • the methods may involve: arrayed nucleic acid binding molecules and arrayed target nucleic acids; arrayed nucleic acid binding molecules and arrayed candidate ligands; or arrayed target nucleic acids and arrayed candidate ligands.
  • the invention includes the use of all three components in the form of arrays, i.e., arrayed nucleic acid binding molecules, arrayed target nucleic acids and arrayed candidate ligands.
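The possible arrangements in which one, two or all three gene switch components are arrayed can be enumerated mechanically. The sketch below is illustrative only; the constant and function names are assumptions introduced for the example.

```python
from itertools import combinations

# The three components of a gene switch screen, as named in the text.
COMPONENTS = (
    "nucleic acid binding molecules",
    "target nucleic acids",
    "candidate ligands",
)


def array_configurations(components=COMPONENTS):
    """Return every non-empty subset of components that could be arrayed."""
    configs = []
    for size in range(1, len(components) + 1):
        configs.extend(combinations(components, size))
    return configs


if __name__ == "__main__":
    for config in array_configurations():
        print(" + ".join(config))
```

For three components this yields seven configurations: three with a single arrayed component, the three pairs listed above, and the case in which all three are arrayed.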
  • at least one of the candidate first and second polypeptides and the candidate ligand is in the form of an array.
  • any or all of the components not in the form of an array may comprise a single species, or a plurality of species (such as a library).
  • the library of candidate nucleic acid or polypeptide binding molecules is preferably a phage display library.
  • individual candidate molecules of the library optionally are structurally related to zinc finger transcription factors (for example see Choo and Klug, (1994) PNAS (USA) 91:11163-67, which describes aspects of such libraries and is incorporated herein by reference).
  • This library is preferably constructed with DNA sequences of the form GCGNNNGCG (where all 64 middle triplets are represented in the mixture).
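The statement that all 64 middle triplets are represented follows from there being four bases at each of three positions (4 × 4 × 4 = 64). A short sketch enumerating the GCGNNNGCG sites is given below; the function name is an assumption made for illustration.

```python
from itertools import product


def gcgnnngcg_sites():
    """Enumerate every DNA site of the form GCGNNNGCG, N being A, C, G or T."""
    return ["GCG" + "".join(triplet) + "GCG" for triplet in product("ACGT", repeat=3)]


if __name__ == "__main__":
    sites = gcgnnngcg_sites()
    print(len(sites), sites[0], sites[-1])
```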
  • One or more ligands means at least one ligand, preferably two, three or four ligands, more preferably five, six, or seven ligands, most preferably a mixture of eight ligands, or even more.
  • the ligands may be in any molar ratio to one another within the mixture, but will preferably be approximately equimolar with one another.
  • the ligands may be provided in the form of a library of ligands.
  • the methods of our invention as described herein allow the selection of potential ligand molecules of interest as a first step, i.e., those which form complexes with the first and second polypeptides.
  • ligands of interest are selected which are capable of binding to one or both of the polypeptides.
  • the strength of binding of the polypeptides to each other is tested in the absence or presence of the ligand component of the complex to select those complexes in which the binding between the polypeptides differs in the presence or absence of the ligand component.
  • Our selection method therefore directly selects ligands which bind to one or both of the polypeptides, without the need for any further screen to determine whether an individual ligand molecule is capable of forming a complex with the polypeptides.
  • the method of our invention may preferably be carried out over at least 3, 4, 5 or 6 rounds of selection, preferably about 6 rounds of selection.
  • Nucleic acid or polypeptide binding molecules (such as phage clones) isolated by the above methods are preferably individually assayed (for example in microtitre plates as described below, preferably in the form of an array) for binding to the target nucleic acid (such as a GCGNNNGCG mixture) or a target polypeptide (as the case may be) in the presence and absence of a mixture of the ligands to identify clones which are capable of ligand-modulatable binding.
  • Those phage clones which are capable of ligand-modulatable binding are preferably tested in the presence of a mixture of the ligands, in order to deduce the optimum target nucleic acid or polypeptide sequence, for example using different or variant target sequences, or by the binding site signature method for nucleic acid binding proteins (see Choo and Klug, (1994) PNAS (USA) 91:11163-67). As described above, array technology may be employed in such a screen.
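The comparison of clone binding in the presence and absence of ligand can be sketched as a fold-change filter over plate readings. This is an illustration only: the two-fold threshold, the small floor used to avoid division by zero, and the function name are all assumptions and are not values taken from the text.

```python
def ligand_modulated_clones(readings, min_fold=2.0, floor=1e-9):
    """Pick clones whose binding signal changes with ligand.

    readings maps a clone name to (signal_with_ligand, signal_without_ligand).
    A clone is kept when the with/without ratio is at least min_fold
    (ligand-induced binding) or at most 1/min_fold (ligand-disrupted binding).
    min_fold and floor are hypothetical parameters.
    """
    hits = {}
    for clone, (with_ligand, without_ligand) in readings.items():
        ratio = max(with_ligand, floor) / max(without_ligand, floor)
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            hits[clone] = ratio
    return hits
```

Clones with a ratio near 1 bind (or fail to bind) regardless of ligand and would not be carried forward as switch candidates.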
  • the method of the invention preferably features a pre-selection step to remove candidate binding molecules which do not require ligand to bind the nucleic acid or polypeptide.
  • Association of the candidate nucleic acid or polypeptide binding molecule with the target nucleic acid or polypeptide may be assessed by any suitable means known to those skilled in the art.
  • the nucleic acid or polypeptide may be immobilised by biotinylation and linking to beads such as streptavidin coated beads (Dynal).
  • the nucleic acid or polypeptide is immobilised on an array, so that a high throughput screen may be carried out.
  • nucleic acid or polypeptide binding molecules are phage displayed polypeptides
  • binding of said molecules to the nucleic acid or polypeptide may be assessed by eluting those phage which bind, and infecting logarithmic phase E. coli TG1 cells.
  • the presence of infective particles eluted from the nucleic acid indicates that association of the nucleic acid binding molecule(s) with the nucleic acid has occurred, or that association of the polypeptide binding molecule(s) with the polypeptide has occurred in the case of a protein switch.
  • association of the candidate nucleic acid or polypeptide binding molecule(s) with the target nucleic acid or polypeptide may be assessed by Scintillation Proximity Assay (SPA).
  • the target nucleic acid or polypeptide could be biotinylated and immobilised to streptavidin coated SPA beads, and the candidate nucleic acid or polypeptide binding molecules may be radioactively labelled, for example with 35S-methionine where the molecules are polypeptides.
  • Association of the candidate nucleic acid or polypeptide binding molecules with the target nucleic acid or polypeptide could then be assessed by monitoring the readout of the SPA. Alternatively, the association could be monitored by fluorescent resonance energy transfer (FRET).
  • the target nucleic acid or polypeptide could be labelled with a donor fluor, and the nucleic acid binding molecule(s) or polypeptide(s) could be labelled with a suitable acceptor fluor. Whilst the two entities are separated, no FRET would be observed, but if association (binding) took place, then there would be a change in the amount of FRET observed, thus allowing assessment of the degree of association.
  • Array techniques may be employed in such assessment: thus, the candidate nucleic acid binding molecules may be formed into an array, and binding to the labelled target nucleic acid assayed as described.
  • an in vivo assay such as a TRAP assay described in Paraskeva et al (1998), Proc. Natl. Acad. Sci. USA, 95, 951-956, may be used for determining whether a polypeptide interacts with RNA in vivo. Association of the candidate nucleic acid or polypeptide binding molecule with the target nucleic acid or polypeptide may also be assessed by bandshift assays. Bandshift assays are conducted by measuring the mobility of one or more of the components of the assay, for example the mobility of the nucleic acid or polypeptide, as it is electrophoresed through a suitable gel such as a polyacrylamide or agarose gel, as is well known to those skilled in the art.
  • the mobility of the nucleic acid or polypeptide could be measured in the presence and absence of the candidate binding molecule. If the mobility of the target nucleic acid or polypeptide is essentially the same in the presence or absence of the candidate binding molecule, then it may be inferred that the molecules do not associate, or that the association is weak. If the mobility of the nucleic acid or polypeptide is retarded in the presence of the candidate binding molecule, then it may be inferred that the candidate molecule is associating with or binding to the nucleic acid or polypeptide.
  • association of the candidate nucleic acid or polypeptide binding molecule with the target nucleic acid or polypeptide may also be assessed using filter binding assays.
  • the target nucleic acid or polypeptide molecule may be immobilised on a suitable filter, such as a nitrocellulose filter.
  • the candidate binding molecule may then be labelled, for example radioactively labelled, and contacted with the immobilised target nucleic acid or polypeptide.
  • the binding of or association with the target nucleic acid or polypeptide may be assessed by comparing the amount of labelled candidate nucleic acid or polypeptide binding molecule which associates with the filter only to the amount of labelled candidate nucleic acid or polypeptide binding molecule which associates with the filter-immobilised target. If more labelled candidate nucleic acid or polypeptide binding molecule associates with the immobilised nucleic acid or polypeptide than with the filter only, it may be inferred that the target molecule does indeed associate with the candidate binding molecule.
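The filter binding comparison above amounts to subtracting the filter-only signal from the signal on the target-bearing filter. A minimal sketch follows, assuming replicate counts are averaged; the function name and the use of a simple mean are illustrative assumptions.

```python
from statistics import mean


def filter_binding_excess(target_counts, filter_only_counts):
    """Mean labelled-candidate signal on the filter-immobilised target minus
    the mean signal on the filter alone.

    A clearly positive value is consistent with association between the
    candidate binding molecule and the target.
    """
    return mean(target_counts) - mean(filter_only_counts)
```

In practice the size of the excess needed to call binding would depend on replicate variability, which this sketch does not model.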
  • Binding affinities may be estimated by any suitable means known to those skilled in the art. Binding affinities for the purposes of this invention may be absolute or may be relative. Binding affinities may be determined biochemically, or may simply be estimated by assessing the association of the candidate nucleic acid or polypeptide binding molecule with the target nucleic acid or polypeptide as described above. As used herein, the term binding affinity may refer to a simple estimation of the association of one component of the system with another.
  • Another suitable detection method for nucleic acid binding proteins is the use of target nucleic acid sequences linked to reporter constructs, such as bacterial luciferase or lacZ.
  • the reporter gene product can be measured using optical detection techniques.
  • a multiarray format could be used with a different candidate ligand in each position in the array (such as a microtitre plate well) and the same library of zinc finger proteins and target nucleic acid sequences at each position.
  • the zinc finger proteins will generally be fused to a transcriptional activation domain such as the GAL4 acidic activation domain.
  • Transcription may then be compared in the various wells and wells showing a variation in transcription compared to a control well with no ligand may be selected and the ligand further tested to identify specific target sequences/zinc finger proteins whose interaction is affected. These further tests may again be performed using an array format in which this time the ligand is kept constant and the target sequence/zinc fingers varied. Phage display techniques as described above may be used to simplify the isolation of suitable zinc finger proteins. Although described in the context of zinc fingers, this method could be applied to other nucleic acid binding molecules.
  • a fluorescence polarization assay, as described in Keenan et al (1998) Bioorganic and Medicinal Chemistry 6, 1309-1335.
  • Other assays described in Keenan et al include assays for inducible Fas activation and for inducible transcriptional activation.
  • two fusion proteins are constructed, each comprising amino acids 175 to 304 of human Fas together with a first polypeptide or a second polypeptide respectively.
  • Cell line clones expressing both constructs are plated in 96-well plates and treated the next day with serial dilutions of compound (ligand or candidate ligand) at typically 1 µM maximum concentration.
  • Wells are assayed the next day for viability with for example Alamar Blue or Trypan Blue.
  • Controls can include, for example, untransfected cells.
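The serial dilutions used in the Fas activation assay can be tabulated from the stated 1 µM maximum concentration. In this sketch the dilution factor and the number of steps are hypothetical parameters, since the text does not specify them.

```python
def serial_dilution(max_conc_um=1.0, factor=10.0, steps=6):
    """Return compound concentrations (in µM) for a serial dilution series.

    max_conc_um follows the 1 µM maximum given in the text; factor and steps
    are assumptions chosen purely for illustration.
    """
    if factor <= 1.0:
        raise ValueError("dilution factor must be greater than 1")
    return [max_conc_um / factor ** i for i in range(steps)]
```

Each concentration would be applied to its own well, with untransfected cells treated in parallel as a control.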
  • transcription factor fusions are expressed from the tricistronic vector pCGNN-F3p65/ZIF3 Neo.
  • an HT1080 cell line (ATCC CCL-121) which contains an integrated secreted alkaline phosphatase (SEAP) target gene under control of a minimal interleukin 2 gene promoter and 12 ZFHD1 binding sites is generated as described in Rivera et al (1996) Nature Med 2, 1028.
  • This cell line is transiently transfected with fusion protein expressing construct, with and without incubation for 18-24 hours with the ligand or candidate ligand.
  • Cell supernatant is removed and assayed for SEAP activity using any suitable phosphatase assay, for example, the assay described in Rivera et al (supra), taking into account background SEAP activity (as measured from mock transfected HT1080 cells).
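Taking background SEAP activity into account can be sketched as subtracting the mock-transfected reading from both the ligand-treated and untreated readings before comparing them. The function name and the choice to report a fold induction are assumptions made for illustration; the text does not prescribe a particular calculation.

```python
def seap_fold_induction(with_ligand, without_ligand, mock_background):
    """Background-corrected ratio of SEAP activity with versus without ligand.

    mock_background is the activity measured from mock transfected cells.
    """
    net_with = with_ligand - mock_background
    net_without = without_ligand - mock_background
    if net_without <= 0:
        raise ValueError("untreated activity does not exceed background")
    return net_with / net_without
```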
  • Ligand mediated protein-protein interaction may also be assayed by way of a modified two hybrid assay.
  • two fusion protein constructs are made, one comprising one of a pair of protein binding partners and the GAL4 binding domain, and the other comprising the other of the pair of protein binding partners and the VP16 activation domain.
  • Expression of a reporter gene for example, beta-galactosidase, is measured in the presence and absence of the candidate ligand.
  • the methods of the invention may be applied in vivo, for example they could be applied to the selection or isolation of nucleic acid or polypeptide binding molecules capable of associating with target nucleic acid or polypeptide in vivo inside one or more cells, in a manner analogous to the one-hybrid system.
  • multiple target nucleic acids or polypeptides could be used in a single selective step, thereby enabling multiple nucleic acid or polypeptide binding molecules to be isolated simultaneously, even in the same physical vessel.
  • the multiple nucleic acid or polypeptide binding molecules may preferably be different from one another.
  • the multiple nucleic acid or polypeptide binding molecules may have similar or identical binding specificities, or may preferably have different binding specificities.
  • the invention may be worked using multiple ligands, either separately or in combination.
  • a target nucleic acid or polypeptide sequence may be used to isolate binding molecules according to the methods essentially as disclosed above, with the modification that more than one ligand may be present. In this way, it is possible to isolate multiple nucleic acid or polypeptide binding molecules which require different ligands to bind to the same target nucleic acid or polypeptide sequence(s).
  • Bacterial colonies containing phage libraries that express a library of zinc fingers randomised at one or more nucleic acid binding residues are transferred from plates to culture medium. Bacterial cultures are grown overnight at 30°C. Culture supernatant containing phages is obtained by centrifugation.
  • phage solution is transferred to a streptavidin coated tube and incubated with biotinylated nucleic acid target site in the presence of a candidate ligand and 4 µg poly [d(I-C)]. After a one hour incubation the tubes are washed 20 times with PBS containing 50 µM ZnCl2 and 1% Tween, and 3 times with PBS containing 50 µM ZnCl2 to remove non-binding phage.
  • step 1 to 5 bacteria are plated and phage prepared from 96 colonies are screened for binding to the nucleic acid target site in the presence and absence of the ligand. Binding reactions are carried out in wells of a streptavidin-coated microtitre plate (Boehringer Mannheim) and contain 50 µl of phage solution (bacterial culture supernatant diluted 1:1 with PBS containing 50 µM ZnCl2, 4% Marvel, 2% Tween), 0.15 pmol nucleic acid target site and 0.25 µg poly [d(I-C)]. When added, the ligand is present at a concentration of about 1 µM.
  • Bound phage are detected by ELISA (carried out in the presence of the ligand at a concentration of about 1 µM where appropriate) with horseradish peroxidase-conjugated anti-M13 IgG (Pharmacia Biotech) and quantitated using SOFTMAX 2.32 (Molecular Devices).
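Once ELISA readings have been taken with and without the ligand, the clones can be sorted by how the ligand affects binding. The sketch below is one possible way to do this. The signal threshold, the fold-difference cut-off, the category labels and the function name are all assumptions for illustration; the text gives no numerical criteria.

```python
def classify_clones(readings, threshold=0.5, min_ratio=3.0):
    """Sort phage clones by ELISA signal with and without ligand.

    readings maps clone name -> (signal_with_ligand, signal_without_ligand).
    threshold and min_ratio are illustrative cut-offs, not source values.
    """
    result = {}
    for clone, (plus, minus) in readings.items():
        binds_plus = plus >= threshold
        binds_minus = minus >= threshold
        if not binds_plus and not binds_minus:
            result[clone] = "non-binder"
        elif binds_plus and plus >= min_ratio * minus:
            # Binding is seen mainly when the ligand is present.
            result[clone] = "ligand-dependent"
        elif binds_minus and minus >= min_ratio * plus:
            # Binding is seen mainly when the ligand is absent.
            result[clone] = "ligand-inhibited"
        else:
            result[clone] = "ligand-independent or weakly modulated"
    return result
```

Clones in the first two modulated categories are the ones of interest as candidate switches; the remainder bind regardless of ligand or not at all.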
  • yet another particular embodiment of the method of the invention is as follows:
  • Bacteria containing a phage display library of zinc fingers randomised at one or more DNA binding residues are transferred from plates to culture medium. Bacterial cultures are grown overnight at 30°C. Culture supernatant containing phage is obtained by centrifugation and 200 µl is aliquoted into each well of a 96-well microtitre plate.
  • the bacterial culture supernatant in each well (from step 1 or step 5) is diluted 1:1 with PBS containing 50 µM ZnCl2, 4% Marvel, 2% Tween.
  • the phage mixture is replica-transferred into an array plate and binding to DNA is allowed to proceed for 1 hour at 20°C as a preselection step to remove phage that bind the target DNA in the absence of a ligand.
  • the retained phage are eluted using 10 µl 0.1 M triethylamine, replica-transferred to a microtitre plate and the solutions neutralised with an equal volume of 1 M Tris-Cl (pH 7.4).
  • Eluted phage are replica-transferred to a deep-well/96-well plate in which wells contain logarithmic-phase E. coli TG1 cells. Infections are allowed to proceed for 1 h at 37°C, and cells are collected by centrifugation. The supernatant is discarded and replaced with 1 ml 2xTY medium containing Tet, and the bacteria are grown overnight to prepare phage supernatants for subsequent rounds of selection.
  • step 7 bacteria from each well (i.e. each different selection) are plated and phage prepared from 36 colonies are screened for binding to the corresponding DNA target site in the presence and absence of the ligand. Binding reactions are carried out in wells of a streptavidin-coated microtitre plate (Boehringer Mannheim) and contain 50 µl of phage solution (bacterial culture supernatant diluted 1:1 with PBS containing 50 µM ZnCl2, 4% Marvel, 2% Tween), 0.15 pmol DNA target site and 0.25 µg poly [d(I-C)]. When added, the DNA binding ligand is present at a concentration of about 1 µM.
  • Bound phage are detected by ELISA (carried out in the presence of the ligand at a concentration of about 1 µM where appropriate) with horseradish peroxidase-conjugated anti-M13 IgG (Pharmacia Biotech) and quantitated using SOFTMAX 2.32 (Molecular Devices).
  • Single colonies of transformants obtained after four rounds of selection as described, are grown overnight in culture.
  • Single-stranded DNA is prepared from phage in the culture supernatant and sequenced using the Sequenase™ 2.0 kit (U.S. Biochemical Corp.). The amino acid sequences of the zinc finger clones are deduced.
  • a modification of the above examples may be used to select polypeptide binding proteins. Briefly, bacteria containing phage libraries expressing a library of polypeptide binding proteins randomised at one or more residues as described above are screened against a biotinylated target polypeptide or protein, which has been immobilised on streptavidin coated beads, essentially as described above. Unbound phage are washed away, and bound phage are eluted and used to infect E. coli cells. After several rounds of selection, each round involving the above steps, phage are prepared and screened for binding to the target polypeptide or protein in the presence and absence of the ligand.
  • Bound phage are detected by ELISA and identified, the corresponding colonies are amplified, and the DNA sequences of the polypeptide binding proteins are deduced.
  • a modification of the above example using arrays of sequences in the wells of a microtitre plate may also be used to select protein switches.
  • the library of sequences can be screened using the ligand and selected phage expressing the zinc finger or other protein of interest to identify specific target nucleic acid or polypeptide sequences. This may conveniently be carried out with the nucleic acid or polypeptide sequences arrayed onto a solid substrate.
  • nucleic acid or polypeptide binding molecules e.g., zinc fingers
  • alternative methods for displaying the nucleic acid or polypeptide binding molecules could be used.
  • an entirely in vitro polysome display system has also been reported (Mattheakis et al, (1994) Proc Natl Acad Sci U S A, 91, 9022-6) in which nascent peptides are physically attached via the ribosome to the RNA which encodes them.
  • screening is performed in a similar manner to the phage display method except that typically, after an initial preselection step to remove nucleic acid or polypeptide binding molecules that bind in the absence of the ligand, only one selection step is performed and the resulting nucleic acid or polypeptide binding molecules are identified by cloning the RNA from the RNA/ribosome complexes and sequencing the clones obtained.
  • nucleic acid or polypeptide may be labelled with a fluorescent tag and the nucleic acid (or polypeptide) binding molecule labelled with biotin, such that an enzyme conjugate such as streptavidin-horseradish peroxidase (HRP), which catalyses an optically detectable change in a substrate (different from the fluorescent tag), can be used.
  • tripartite complexes can be detected because they will both fluoresce and give HRP activity. Labelling of one or more of the components with a detectable label is especially useful where screening is being performed on array(s) of the(se) component(s).
  • a further method which is useful where multiple candidate ligands are to be screened involves the use of beads to which are attached different peptide tags.
  • Known combinatorial chemistry techniques are used to produce a library of beads whereby the peptide tag can be used to identify unambiguously the ligand attached to the same bead.
  • Complexes comprising the ligand, a target nucleic acid and a nucleic acid binding molecule (or a ligand, a target polypeptide and a polypeptide binding protein) can be identified by the use of labelled target and binding molecules as described above.
  • Beads comprising a tripartite complex can then be selected and the identity of the tag determined by spectroscopy techniques which will then give the identity of the ligand.
  • a bead format is advantageous since it allows easier isolation of productive tripartite complexes and prescreening.
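The bead-based screen can be pictured as a lookup: beads that both fluoresce and show HRP activity carry a tripartite complex, and the peptide tag on each such bead identifies its ligand. The sketch below illustrates that logic only. The dictionary keys, the tag-to-ligand table and the function name are hypothetical, and the spectroscopic reading of the tag is assumed to have already produced a tag identifier.

```python
def identify_ligands(beads, tag_to_ligand):
    """Return ligands on beads that show both fluorescence and HRP activity.

    beads is a list of dicts with hypothetical keys 'tag', 'fluorescent'
    and 'hrp_active'; tag_to_ligand maps each peptide tag to its ligand.
    """
    hits = []
    for bead in beads:
        # Both signals together indicate a tripartite complex on this bead.
        if bead["fluorescent"] and bead["hrp_active"]:
            hits.append(tag_to_ligand[bead["tag"]])
    return hits
```

Beads showing only one of the two signals carry an incomplete complex and are ignored.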
  • nucleic acid binding molecules according to the invention may be advantageously used to determine the sequence composition of a sample of target nucleic acid.
  • a nucleic acid binding molecule according to the invention may be prepared which binds to a known target nucleic acid sequence. By applying this molecule to, or contacting it with, one or more test nucleic acid samples and monitoring its binding thereto, it is possible to determine whether said nucleic acid sample(s) contain the cognate nucleic acid recognition site of the nucleic acid binding molecule, and therefore derive information about the nucleotide composition of said nucleic acid test sample(s).
  • Such analyses may be advantageously conducted using the binding site signature method (see Choo and Klug, (1994) PNAS (USA) 91:11163-67). Where a plurality of nucleic acid samples is being tested for possession of the cognate sequence, they may usefully be disposed in the form of an array for high throughput analysis.
  • Individual phage clones could advantageously be assayed for binding of their cognate nucleic acid sequence(s) in the presence or absence of individual ligands, to monitor which particular ligand modulates binding, i.e., binding between the nucleic acid and the nucleic acid binding molecule, or binding between a protein and a polypeptide binding molecule such as a protein.
  • nucleic acid or polypeptide binding molecules may be assayed for binding to target sequence(s) in the presence of discrete ligand mixtures, wherein each ligand mixture preferably contains a unique mixture of ligands.
  • ligands which may modulate binding of a particular nucleic acid or polypeptide binding molecule to its cognate target sequence may advantageously be determined.
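Where each mixture contains a unique combination of ligands, the pattern of mixtures that modulate binding can narrow down the responsible ligand. The sketch below shows one simple way to do this. It assumes that a single ligand is responsible and that the other ligands in a mixture do not mask its effect; neither assumption comes from the text, and the function name is hypothetical.

```python
def candidate_modulators(mixtures, modulated):
    """Narrow down which ligand modulates binding from pooled assays.

    mixtures maps mixture id -> iterable of ligand names.
    modulated maps mixture id -> True if that mixture modulated binding.
    Assumes a single active ligand that is not masked by the others.
    """
    positive = [set(mixtures[m]) for m in mixtures if modulated[m]]
    negative = [set(mixtures[m]) for m in mixtures if not modulated[m]]
    if not positive:
        return set()
    # The active ligand must be present in every mixture that modulated binding...
    candidates = set.intersection(*positive)
    # ...and absent from every mixture that did not.
    for ligands in negative:
        candidates -= ligands
    return candidates
```

Any ligand returned would still need to be confirmed individually.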
  • this invention may be advantageously used in the isolation of a ligand that is capable of modulating the association of a particular nucleic acid binding molecule or a particular polypeptide binding molecule with its target nucleic acid or polypeptide sequence.
  • a pre-selection step may optionally be performed in the absence of ligand prior to each round of selection. This step removes from the library those clones which do not require ligand for nucleic acid or polypeptide binding.
  • candidate molecules selected in this manner may be screened by ELISA for binding to the nucleic acid or polypeptide target in the presence or absence of the ligand(s).
  • RNA binding molecules capable of binding nucleic acids other than DNA
  • Structural considerations of RNA binding molecules are discussed in Afshar et al (Afshar et al, 1999: Curr. Op. Biotech, vol 10 pages 59-63).
  • ligands suitable for use in the methods of the invention as applied to RNA include those ligands described above, or may be selected from aminoglycosides and their derivatives such as paromomycin, neomycin (for examples see Park et al, 1996: J. Am. Chem. Soc.
  • nucleic acid binding ligands may be prepared.
  • RNA binding molecule which binds to a target RNA molecule in a manner modulatable by an RNA-binding ligand, wherein said RNA-binding ligand and said RNA-binding molecule are different, said method comprising: (a) providing a target RNA molecule; (b) contacting the target RNA molecule with an RNA-binding ligand, to produce an RNA-ligand complex; (c) assessing the ability of candidate RNA-binding molecules to bind the target RNA molecule and the RNA-ligand complex; and (d) isolating those candidate RNA-binding molecules which bind the target RNA molecule and RNA-ligand complex with different binding affinities.
  • the candidate RNA-binding molecules may be in the form of an array.
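The final step of the method above, isolating candidates that bind the target RNA and the RNA-ligand complex with different affinities, can be sketched as a comparison of two measured values per candidate. The use of dissociation constants, the five-fold cut-off and the function name are assumptions for illustration; the text only requires that the affinities differ.

```python
def ligand_modulated_candidates(affinities, min_fold=5.0):
    """Select candidates whose affinity differs between RNA and RNA-ligand complex.

    affinities maps candidate name -> (kd_rna, kd_complex), both dissociation
    constants in the same unit. min_fold is an illustrative cut-off.
    """
    selected = []
    for name, (kd_rna, kd_complex) in affinities.items():
        if kd_rna <= 0 or kd_complex <= 0:
            raise ValueError(f"dissociation constants must be positive: {name}")
        # The fold difference is symmetric: tighter or weaker binding both count.
        fold = max(kd_rna, kd_complex) / min(kd_rna, kd_complex)
        if fold >= min_fold:
            selected.append(name)
    return selected
```

With candidates arranged in an array, the same comparison would be applied position by position.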
  • the methods of the invention may be advantageously used to select nucleic acid sequences which allow binding of a particular ligand/nucleic acid binding molecule combination, or alternatively to select polypeptide sequences which allow binding of a particular ligand/polypeptide binding protein combination.
  • a method for isolating target nucleic acid sequences to which a particular nucleic acid binding molecule will bind comprising: providing a library of target nucleic acid molecule(s); contacting said nucleic acid molecules with a nucleic acid binding molecule in the presence or absence of ligand; assessing the ability of the candidate target nucleic acid molecule(s) to bind the nucleic acid binding molecule; and isolating those target nucleic acid molecules which bind the nucleic acid binding molecule.
  • a method for isolating target polypeptide sequences to which a particular polypeptide binding molecule will bind comprising: providing a library of target polypeptide molecule(s); contacting said polypeptide molecules with a polypeptide binding molecule in the presence or absence of ligand; assessing the ability of the candidate target polypeptide molecule(s) to bind the polypeptide binding molecule; and isolating those target polypeptide molecules which bind the polypeptide binding molecule.
  • a library of target nucleic acid or polypeptide molecule(s) according to the invention may preferably comprise a plurality of different nucleic acid or polypeptide molecules; preferably said nucleic acid or polypeptide molecules may be related to one another in terms of sequence homology.
  • the library of target nucleic acid molecule(s) may advantageously be in the form of an array of target nucleic acid molecule(s).
  • a library of candidate nucleic acid or polypeptide binding molecule(s) according to the invention may preferably comprise a plurality of different candidate nucleic acid or polypeptide binding proteins; preferably said candidate nucleic acid or polypeptide binding proteins may be related to one another in terms of amino acid sequence homology.
  • the library of candidate nucleic acid molecule(s) may advantageously be in the form of an array of candidate nucleic acid molecule(s).
  • this method could be advantageously used in order to isolate nucleic acid or polypeptide sequences which require ligand to associate with a known nucleic acid or polypeptide binding molecule.
  • there may be a nucleic acid or polypeptide sequence which is bound by a known nucleic acid or polypeptide binding molecule in a ligand-independent manner and it may be desirable to find a nucleic acid or polypeptide sequence(s) which can also associate with the same wild-type nucleic acid or polypeptide binding molecule, but which do so in a ligand-modulatable manner.
  • this may be accomplished according to the above method of the present invention.
  • the assay methods of the invention may be used to identify nucleic acid or polypeptide binding molecules, ligands and/or target nucleic acid or polypeptide where the binding of the binding molecule to the target is modulatable by the ligand.
  • nucleic acid binding proteins may be used individually or in combination in a wide variety of applications.
  • nucleic acid or polypeptide binding proteins according to the invention and identified by the assay methods of the invention may be employed in a wide variety of applications, including diagnostics and as research tools.
  • they may be employed as diagnostic tools for identifying the presence of particular nucleic acid or polypeptide molecules in a complex mixture.
  • Nucleic acid or polypeptide binding molecules according to the invention can preferably differentiate between different target nucleic acid or polypeptide molecules, and their binding affinities for the nucleic acid or polypeptide target sequences are preferably modulated by ligand(s).
  • Nucleic acid or polypeptide binding molecules according to the invention are useful in switching or modulating gene expression, especially in gene therapy applications and agricultural biotechnology applications as described below.
  • the polypeptides, nucleic acids, nucleic acid binding molecules and ligands may be used generally in regulating any biological process.
  • the polypeptides, etc are suitable for regulating any biological process which is dependent on interaction between a nucleic acid binding molecule and a nucleic acid, or which is dependent on interaction between a polypeptide and another polypeptide.
  • biological processes include enzyme functions, signal transduction, protein and nucleic acid trafficking, macromolecular assembly, antibody-antigen interactions, DNA / gene transcription, translation, phosphorylation, methylation, replication, restriction, modification, ligation, transport, degradation, editing, splicing, integration and recombination, etc.
  • targeted nucleic acid or polypeptide binding molecules such as zinc fingers
  • targeted nucleic acid or polypeptide binding molecules may moreover be employed in the regulation of gene transcription, for example by specific cleavage of nucleic acid sequences using a fusion polypeptide comprising a zinc finger targeting domain and a nucleic acid cleavage domain, or by fusion of a transcriptional effector domain to a zinc finger, to activate or repress transcription from a gene which possesses the zinc finger binding sequence in its upstream sequences.
  • a polypeptide binding protein according to the invention fused to a transcriptional effector domain may be used to target proteins bound to particular gene regulatory sequences such as promoters or enhancers, to turn on or off transcription of a gene.
  • Gene transcription may also be increased or decreased from a promoter or enhancer containing zinc finger binding sequences, by making use of a fusion protein comprising a zinc finger fused to a polypeptide binding protein and another fusion protein comprising a protein which binds to the polypeptide binding protein fused to a transcriptional effector domain, for example, VP16.
  • activation or repression only occurs in the presence of the ligand, since in a preferred embodiment the zinc fingers or polypeptide binding proteins will not bind their targets in the absence of the ligand.
  • activation only occurs in the absence of the ligand, since the zinc fingers or polypeptide binding proteins may not bind their target nucleic acid or polypeptide sequences in the presence of the ligand.
  • Zinc fingers capable of differentiating between U and T may be used to preferentially target RNA or nucleic acid, as required. Where RNA-targeting polypeptides are intended, these are included in the term "nucleic acid binding molecule".
  • nucleic acid or polypeptide binding molecules according to the invention will typically require the presence of a transcriptional effector domain, such as an activation domain or a repressor domain.
  • transcriptional activation domains include the VP16 and VP64 transactivation domains of Herpes Simplex Virus.
  • Alternative transactivation domains are various and include the maize C1 transactivation domain sequence (Sainz et al, 1997, Mol. Cell. Biol. 17: 115-22) and PI (Goff et al, 1992, Genes Dev. 6: 864-75; Estruch et al, 1994, Nucleic Acids Res. 22: 3983-89) and a number of other domains that have been reported from plants (see Estruch et al, 1994, ibid).
  • a repressor of gene expression can be fused to the nucleic acid binding protein or polypeptide binding protein and used to down regulate the expression of a gene contiguous with or incorporating the nucleic acid binding protein target sequence, or a gene bound by the target polypeptide of the polypeptide binding protein as described above.
  • repressors are known in the art and include, for example, the KRAB-A domain (Moosmann et al., Biol. Chem. 378: 669-677 (1997)), the engrailed domain (Han et al, Embo J. 12: 2723-2733 (1993)) and the snag domain (Grimes et al, Mol Cell. Biol. 16: 6263-6272 (1996)). These can be used alone or in combination to down-regulate gene expression.
  • nucleic acid cleavage moieties such as the catalytic domain of a restriction enzyme
  • a restriction enzyme capable of cleaving only target nucleic acid of a specific sequence; see Kim et al, (1996) Proc. Natl. Acad. Sci. USA 93:1156-1160.
  • nucleic acid binding domains can be used to create restriction enzymes with any desired recognition nucleotide sequence, but which cleave nucleic acid conditionally dependent on the presence or absence of a particular ligand, for instance Distamycin A. It may also be possible to use enzymes other than those that cleave nucleic acids for a variety of purposes.
  • catalytic polypeptides are known, including naturally occurring proteins such as enzymes and engineered proteins such as catalytic antibodies.
  • catalytic RNAs are being developed by processes such as SELEX. It is a further application of this technology that tripartite systems can be isolated comprising a catalytically active first molecule and a substrate second molecule wherein the reaction between these components is modulated by a ligand.
  • the zinc finger polypeptides of the invention may be employed to detect the presence or absence of a particular target nucleic acid sequence in a sample.
  • the polypeptide binding proteins of the invention may be used to detect the presence or absence of a particular target polypeptide sequence in a sample.
  • a method for determining the presence of a target nucleic acid molecule comprising the steps of: (a) preparing a nucleic acid binding protein by the method set forth above which is specific for the target nucleic acid molecule; (b) exposing a test system which may comprise the target nucleic acid molecule to the nucleic acid binding protein under conditions which promote binding, and removing any nucleic acid binding protein which remains unbound; (c) detecting the presence of the nucleic acid binding protein in the test system.
  • To detect the presence of a target protein in a sample, the following steps may be taken: (a) preparing a polypeptide binding protein by the method set forth above which is specific for the target polypeptide molecule; (b) exposing a test system which may comprise the target polypeptide molecule to the polypeptide binding protein under conditions which promote binding, and removing any polypeptide binding protein which remains unbound; (c) detecting the presence of the polypeptide binding protein in the test system.
  • the methods disclosed here are suitable for screening of arrays of DNA with a known transcription factor, and a known ligand or a library of ligands, for identification of new molecules which potentially modulate DNA-transcription factor interaction.
  • this is preferably a natural or known target of the DNA binding molecule; thus, our method is capable of selecting for molecules which alter or modulate the interaction between a transcription factor and its DNA target.
  • RNA switches i.e., gene switches comprising an RNA component, an RNA binding component and a ligand
  • an RNA switch may be used to disrupt translation of an mRNA which comprises a binding site for the RNA binding component. Addition of ligand may cause the RNA binding molecule to bind to the RNA, and hence prevent or inhibit translation by, for example, sterically hindering ribosome or tRNA binding.
  • the binding site may be located in the coding sequence, or at the 5' or 3' UTR.
  • RNA interactions with small molecules may be used to control gene expression (Werstuck and Green, 1998, Science 282, 296-298).
  • small molecules e.g., ligands
  • short RNA aptamers that specifically bind to a wide variety of ligands in vitro are isolated from randomised pools of RNA, and are shown to bind their ligands in vivo. Insertion of a small molecule aptamer into the 5' UTR of an mRNA allows its translation to be repressible by ligand addition in vitro as well as in vivo.
  • RNA binding ligands isolated according to the methods of our invention may be used in controlling gene regulation in the system described in Werstuck and Green, as well as similar systems.
  • the ligands isolated according to our invention would be expected to be particularly useful, as in some embodiments they are capable of binding to both RNA as well as RNA binding molecules (e.g., RNA binding proteins).
  • RNA binding molecules e.g., RNA binding proteins.
  • the methods of selecting gene switches, particularly using arrays, may also be used in the treatment of diseases. It is known that certain diseases are associated with up-regulation or down-regulation of particular genes.
  • An array comprising DNA sequences (including regulatory sequences, for example, promoters and enhancers) of genes of interest may be made. Such an array may be contacted with a cellular extract, for example, a nuclear extract, from a diseased patient, and also with a corresponding extract from a normal, undiseased patient. Changes in promoter occupancy in the diseased patient may be identified for example by probing the arrays with suitable probes against proteins of interest (for example, antibodies against particular transcription factors), and detecting transcription factors bound to the DNA sequences (e.g., promoter or enhancer sequences, etc).
  • the transcription factor / DNA sequence binding pairs may additionally or separately be screened against one or a library of candidate ligands which are capable of modulating the interaction between the transcription factor and the DNA.
  • Such ligands are suitable candidates for treatment of the particular diseases.
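The comparison of promoter occupancy between a diseased and a normal extract can be sketched as a per-feature ratio of array signals. The sketch below is illustrative only: the two-fold cut-off, the small floor used to avoid division by zero, the input format and the function name are all assumptions, and real array data would need normalisation that is not shown here.

```python
def occupancy_changes(diseased, normal, min_fold=2.0, floor=1e-9):
    """List array features whose bound transcription factor signal differs.

    diseased and normal map a feature name (e.g. a promoter) -> probe signal.
    min_fold and floor are illustrative assumptions, not source values.
    """
    changes = {}
    # Only features measured on both arrays can be compared.
    for feature in diseased.keys() & normal.keys():
        ratio = (diseased[feature] + floor) / (normal[feature] + floor)
        if ratio >= min_fold:
            changes[feature] = "increased"
        elif ratio <= 1.0 / min_fold:
            changes[feature] = "decreased"
    return changes
```

Features flagged in this way would identify the transcription factor / DNA sequence pairs to carry forward into the ligand screen.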
  • Components of protein switches, and protein switches themselves, may be used in various ways. Protein-protein interactions underpin a wide variety of biological processes. These biological processes include signal transduction, which may be brought about by dimerisation of receptors.
  • the methods of our invention may be used to select ligands which are capable of modulating the association between a first receptor molecule and a second receptor molecule, to regulate any process involved in signal transduction.
  • protein-protein interactions are also involved in intracellular trafficking of proteins, nucleic acids, etc, and the methods of our invention may be used to identify ligands which are capable of modulating these processes.
  • These ligands may be used as therapeutics and administered to a cell, tissue, organ or patient, to regulate processes involving protein-protein interactions, which processes are associated with a diseased state.
  • a disease may be identified caused by or associated with decreased signal transduction.
  • a ligand may be identified which is capable of promoting dimerisation of receptors. Such a ligand may be used to treat that disease.
  • nucleic acid binding molecules capable of binding to a target nucleic acid in a manner modulatable by a ligand are used to regulate expression from a gene in vivo.
  • the target gene may be endogenous to the genome of the cell or may be heterologous. However, in either case it will comprise a target nucleic acid sequence, such as a target nucleic acid sequence described above, to which a nucleic acid binding molecule of the invention binds in a manner modulatable by a ligand, or which is bound by the complex consisting of a polypeptide and a polypeptide binding protein.
  • the nucleic acid binding molecule is a polypeptide
  • it may typically be expressed from a nucleic acid construct present in the host cell comprising the target sequence.
  • a polypeptide binding protein may similarly be expressed.
  • Such a nucleic acid construct is preferably stab
  • a host cell comprises a target nucleic acid sequence and a construct capable of directing expression of the nucleic acid binding molecule in the cell.
  • the host cell may comprise a target nucleic acid sequence and a construct capable of directing expression of the polypeptide binding molecule in the cell.
  • nucleic acid constructs for expressing the nucleic acid or polypeptide binding molecule are known in the art and are described above.
  • the coding sequence may be expressed constitutively or be regulated. Expression may be ubiquitous or tissue-specific.
  • Suitable regulatory sequences are known in the art and are also described above.
  • the nucleic acid construct will comprise a nucleic acid sequence encoding a nucleic acid binding molecule or a polypeptide binding molecule operably linked to a regulatory sequence capable of directing expression of the nucleic acid or polypeptide binding molecule in a host cell.
  • target nucleic acid sequences that include operably linked neighbouring sequences that bind transcriptional regulatory proteins, such as transactivators.
  • transcriptional regulatory proteins are endogenous to the cell. If not, they typically will need to be introduced into the host cell using suitable nucleic acid constructs.
  • nucleic acid constructs into host cells are known in the art for both prokaryotic and eukaryotic cells, including yeast, fungi, plant and animal cells. Many of these techniques are mentioned below in the section on the production of transgenic organisms.
  • the ligand is a molecule such as Distamycin A which may be administered exogenously to the cell and taken up by the cell whereupon it may contact the nucleic acid or polypeptide binding molecule and modulate its binding directly or indirectly to the target sequence.
  • the ligand may interact in such a way that they bind to the target sequence only when bound to each other (i.e., dimerised), in which case an antibody which modulates the interaction between two protein binding partners may be used to modulate binding of the proteins to the target sequence.
  • antibody ligands may be identified by screening a library of randomised antibodies with the methods of our invention.
  • polypeptide ligands may also be introduced into the cell either directly or by introducing suitable nucleic acid vectors, including viruses.
  • the target nucleic acid sequence and the nucleic acid construct encoding the nucleic acid or polypeptide binding molecule are preferably stably integrated into the genome of the host cell.
  • the host cell is a single celled organism or part of a multicellular organism, the resulting organism may be termed transgenic.
  • the target nucleic acid may, in a preferred embodiment, be a naturally occurring sequence for which a corresponding nucleic acid or polypeptide binding molecule and ligand have been identified using the screening methods of the invention.
  • multicellular organism here denotes all multicellular plants, fungi and animals except humans, i.e. prokaryotes and unicellular eukaryotes are excluded specifically.
  • a "transgenic" multicellular organism is any multicellular organism containing cells that bear genetic information received, directly or indirectly, by deliberate genetic manipulation at the subcellular level, such as by microinjection or infection with recombinant virus.
  • the organism is transgenic by virtue of comprising at least a heterologous nucleotide sequence encoding a nucleic acid binding molecule (or a polypeptide binding molecule) or target nucleic acid as herein defined.
  • Transgenic in the present context does not encompass classical crossbreeding or in vitro fertilization, but rather denotes organisms in which one or more cells receive a recombinant nucleic acid molecule. Transgenic organisms obtained by subsequent classical crossbreeding or in vitro fertilization of one or more transgenic organisms are included within the scope of the term "transgenic".
  • germline transgenic organism refers to a transgenic organism in which the genetic information has been taken up and incorporated into a germline cell, therefore conferring the ability to transfer the information to offspring. If such offspring, in fact, possess some or all of that information, then they, too, are transgenic multicellular organisms within the scope of the present invention.
  • the information to be introduced into the organism is preferably foreign to the species of animal to which the recipient belongs (i.e., "heterologous"), but the information may also be foreign only to the particular individual recipient, or genetic information already possessed by the recipient. In the last case, the introduced gene may be differently expressed than is the native gene.
  • control sequences refers to polynucleotide sequences which are necessary to effect the expression of coding and non-coding sequences to which they are ligated.
  • the nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include promoter, ribosomal binding site, and transcription termination sequence; in eukaryotes, generally, such control sequences include promoters and a transcription termination sequence.
  • control sequences is intended to include, at a minimum, components whose presence can influence expression, and can also include additional components whose presence is advantageous, for example, leader sequences and fusion partner sequences.
  • nucleic acid constructs are typically to be integrated into the host genome, it is important to include sequences that will permit expression of polypeptides in a particular genomic context.
  • One possible approach would be to use homologous recombination to replace all or part of the endogenous gene whose expression it is desired to regulate with equivalent sequences comprising a target nucleic acid in its regulatory sequences. This should ensure that the gene is subject to the same transcriptional regulatory mechanisms as the endogenous gene, with the exception of the target nucleic acid sequence.
  • homologous recombination may be used in a similar manner but with the regulatory sequences also replaced so that the gene is subject to a different form of regulation.
  • LCRs (locus control regions), also known as scaffold attachment regions (SARs) or matrix attachment regions (MARs)
  • CD2 gene LCR described by Lang et al, 1991, Nucl. Acid. Res. 19: 5851-5856.
  • a polynucleotide construct for use in the present invention typically comprises a nucleotide sequence encoding the nucleic acid or polypeptide binding molecule operably linked to a regulatory sequence capable of directing expression of the coding sequence.
  • the polynucleotide construct may comprise flanking sequences homologous to the host cell organism genome to aid in integration.
  • An alternative approach would be to use viral vectors that are capable of integrating into the host genome, such as retroviruses.
  • a nucleotide construct for use in the present invention further comprises flanking LCRs.
  • a transgenic organism of the invention is preferably a multicellular eukaryotic organism, such as an animal, a plant or a fungus.
  • Animals include animals of the phyla cnidaria, ctenophora, platyhelminthes, nematoda, annelida, mollusca, chelicerata, uniramia, crustacea and chordata.
  • Uniramians include the subphylum hexapoda that includes insects such as the winged insects.
  • Chordates include vertebrate groups such as mammals, birds, reptiles and amphibians. Particular examples of mammals include non-human primates, cats, dogs, ungulates such as cows, goats, pigs, sheep and horses and rodents such as mice, rats, gerbils and hamsters.
  • Plants include the seed-bearing plants (angiosperms) and conifers.
  • Angiosperms include dicotyledons and monocotyledons.
  • dicotyledonous plants include tobacco (Nicotiana plumbaginifolia and Nicotiana tabacum), arabidopsis (Arabidopsis thaliana), Brassica napus, Brassica nigra, Datura innoxia, Vicia narbonensis, Vicia faba, pea (Pisum sativum), cauliflower, carnation and lentil (Lens culinaris).
  • monocotyledonous plants include cereals such as wheat, barley, oats and maize.

PRODUCTION OF TRANSGENIC ANIMALS
  • transgenic animals are well known in the art.
  • a useful general textbook on this subject is Houdebine, Transgenic animals - Generation and Use (Harwood Academic, 1997) - an extensive review of the techniques used to generate transgenic animals from fish to mice and cows.
  • totipotent or pluripotent stem cells can be transformed by microinjection, calcium phosphate mediated precipitation, liposome fusion, retroviral infection or other means, the transformed cells are then introduced into the embryo, and the embryo then develops into a transgenic animal.
  • developing embryos are infected with a retrovirus containing the desired nucleic acid, and transgenic animals produced from the infected embryo.
  • the appropriate nucleic acids are coinjected into the pronucleus or cytoplasm of embryos, preferably at the single cell stage, and the embryos allowed to develop into mature transgenic animals.
  • These techniques are well known. See reviews of standard laboratory procedures for microinjection of heterologous nucleic acids into mammalian fertilized ova, including Hogan et al, Manipulating the Mouse Embryo, (Cold Spring Harbor Press 1986); Krimpenfort et al, Bio/Technology 9:844 (1991); Palmiter et al, Cell, 41: 343 (1985); Kraemer et al, Genetic manipulation of the Mammalian Embryo, (Cold Spring Harbor Laboratory Press 1985); Hammer et al, Nature, 315: 680 (1985); Wagner et al, U.S. Pat. No. 5,175,385; Krimpenfort et al, U.S. Pat. No. 5,175,384, the respective contents of which are incorporated herein by reference.
  • Another method used to produce a transgenic animal involves microinjecting a nucleic acid into pro-nuclear stage eggs by standard methods. Injected eggs are then cultured before transfer into the oviducts of pseudopregnant recipients.
  • Transgenic animals may also be produced by nuclear transfer technology as described in Schnieke, A.E. et al, 1997, Science, 278: 2130 and Cibelli, J.B. et al, 1998, Science, 280: 1256.
  • fibroblasts from donor animals are stably transfected with a plasmid incorporating the coding sequences for a binding domain or binding partner of interest under the control of regulatory sequences.
  • Stable transfectants are then fused to enucleated oocytes, cultured and transferred into female recipients.
  • nucleotide constructs comprising a sequence encoding a nucleic acid binding molecule are microinjected using, for example, the technique described in U.S. Pat. No. 4,873,191, into oocytes which are obtained from ovaries freshly removed from the mammal.
  • the oocytes are aspirated from the follicles and allowed to settle before fertilization with thawed frozen sperm capacitated with heparin and prefractionated by Percoll gradient to isolate the motile fraction.
  • the fertilized oocytes are centrifuged, for example, for eight minutes at 15,000 g to visualize the pronuclei for injection and then cultured from the zygote to morula or blastocyst stage in oviduct tissue-conditioned medium.
  • This medium is prepared by using luminal tissues scraped from oviducts and diluted in culture medium.
  • the zygotes must be placed in the culture medium within two hours following microinjection.
  • Oestrus is then synchronized in the intended recipient mammals, such as cattle, by administering coprostanol. Oestrus is produced within two days and the embryos are transferred to the recipients 5-7 days after oestrus. Successful transfer can be evaluated in the offspring by Southern blot.
  • the desired constructs can be introduced into embryonic stem cells (ES cells) and the cells cultured to ensure modification by the transgene.
  • the modified cells are then injected into the blastula embryonic stage and the blastulas replaced into pseudopregnant hosts.
  • the resulting offspring are chimeric with respect to the ES and host cells, and nonchimeric strains which exclusively comprise the ES progeny can be obtained using conventional cross-breeding. This technique is described, for example, in WO91/10741.

PRODUCTION OF TRANSGENIC PLANTS
  • transgenic plants are well known in the art. Typically, either whole plants, cells or protoplasts may be transformed with a suitable nucleic acid construct encoding a nucleic acid binding molecule or a polypeptide binding molecule or target nucleic acid (see above for examples of nucleic acid constructs).
  • Suitable methods include Agrobacterium infection (see, among others, Turpen et al, 1993, J. Virol. Methods, 42: 227-239) or direct delivery of nucleic acid such as, for example, by PEG-mediated transformation, by electroporation or by acceleration of nucleic acid coated particles. Acceleration methods are generally preferred and include, for example, microprojectile bombardment.
  • a typical protocol for producing transgenic plants (in particular monocotyledons), taken from U.S. Patent No. 5, 874, 26
  • non-biological particles may be coated with nucleic acids and delivered into cells by a propelling force.
  • Exemplary particles include those comprised of tungsten, gold, platinum, and the like.
  • a particular advantage of microprojectile bombardment in addition to it being an effective means of reproducibly stably transforming both dicotyledons and monocotyledons, is that neither the isolation of protoplasts nor the susceptibility to
  • An illustrative embodiment of a method for delivering nucleic acid into plant cells by acceleration is a Biolistics Particle Delivery System, which can be used to propel particles coated with nucleic acid through a screen, such as a stainless steel or Nytex screen, onto a filter surface covered with plant cells cultured in suspension.
  • the screen disperses the tungsten-nucleic acid particles so that they are not delivered to the recipient cells in large aggregates. It is believed that without a screen intervening between the projectile apparatus and the cells to be bombarded, the projectiles aggregate and may be too large for attaining a high frequency of transformation. This may be due to damage inflicted on the recipient cells by projectiles that are too large.
  • cells in suspension are preferably concentrated on filters.
  • Filters containing the cells to be bombarded are positioned at an appropriate distance below the macroprojectile stopping plate.
  • one or more screens are also positioned between the gun and the cells to be bombarded.
  • froci a marker gene
  • a preferred step is to identify the transformed cells for further culturing and plant regeneration. This step may include assaying cultures directly for a screenable trait or by exposing the bombarded cultures to a selective agent or agents.
  • An example of a screenable marker trait is the red pigment produced under the control of the R-locus in maize.
  • This pigment may be detected by culturing cells on a solid support containing nutrient media capable of supporting growth at this stage, incubating the cells at, e.g., 18°C and greater than 180 µE m⁻² s⁻¹, and selecting cells from colonies (visible aggregates of cells) that are pigmented. These cells may be cultured further, either in suspension or on solid media.
  • An exemplary embodiment of methods for identifying transformed cells involves exposing the bombarded cultures to a selective agent, such as a metabolic inhibitor, an antibiotic, herbicide or the like.
  • Cells which have been transformed and have stably integrated a marker gene conferring resistance to the selective agent used will grow and divide in culture. Sensitive cells will not be amenable to further culturing.
  • bombarded cells on filters are resuspended in nonselective liquid medium, cultured (e.g. for one to two weeks) and transferred to filters overlaying solid medium containing from 1-3 mg/l bialaphos. While ranges of 1-3 mg/l will typically be preferred, it is proposed that ranges of 0.1-50 mg/l will find utility in the practice of the invention.
  • the type of filter for use in bombardment is not believed to be particularly crucial, and can comprise any solid, porous, inert support.
  • Cells that survive the exposure to the selective agent may be cultured in media that supports regeneration of plants. Tissue is maintained on a basic media with hormones for about 2-4 weeks, then transferred to media with no hormones. After 2-4 weeks, shoot development will signal the time to transfer to another media.
  • Regeneration typically requires a progression of media whose composition has been modified to provide the appropriate nutrients and hormonal signals during sequential developmental stages from the transformed callus to the more mature plant.
  • Developing plantlets are transferred to soil, and hardened, e.g., in an environmentally controlled chamber at about 85% relative humidity, 600 ppm CO2, and 250 µE m⁻² s⁻¹ of light.
  • Plants are preferably matured either in a growth chamber or greenhouse. Regeneration will typically take about 3-12 weeks.
  • cells are grown on solid media in tissue culture vessels.
  • An illustrative embodiment of such a vessel is a petri dish.
  • Regenerating plants are preferably grown at about 19°C to 28°C. After the regenerating plants have reached the stage of shoot and root development, they may be transferred to a greenhouse for further growth and testing.
  • Genomic DNA may be isolated from callus cell lines and plants to determine the presence of the exogenous gene through the use of techniques well known to those skilled in the art such as PCR and/or Southern blotting.
  • the present invention relates to a vector system which carries a construct encoding a nucleic acid or polypeptide binding molecule or target nucleic acid according to the present invention and which is capable of introducing the construct into the genome of an organism, such as a plant.
  • the vector system may comprise one vector, but it can comprise at least two vectors. In the case of two vectors, the vector system is normally referred to as a binary vector system.
  • Binary vector systems are described in further detail in Gynheung An et al. (1980), Binary Vectors, Plant Molecular Biology Manual A3, 1-19.
  • One extensively employed system for transformation of plant cells with a given promoter or nucleotide sequence or construct is based on the use of a Ti plasmid from Agrobacterium tumefaciens or a Ri plasmid from Agrobacterium rhizogenes (An et al. (1986), Plant Physiol. 81, 301-305 and Butcher D.N. et al. (1980), Tissue Culture Methods for Plant Pathologists, eds.: D.S. Ingrams and J.P. Helgeson, 203-208).
  • Ti and Ri plasmids have been constructed which are suitable for the construction of the plant or plant cell constructs described above.
  • the nucleic acid (or polypeptide) binding molecule/ target nucleic acid (or polypeptide)/ ligand combination may be used to regulate the expression of a nucleotide sequence of interest, such as in a cell of an organism, including prokaryotes, yeasts, fungi, plants and animals, for example mammals, including humans.
  • Nucleotide sequences of interest include genes associated with disease in humans and animals and therapeutic genes.
  • a nucleic acid or polypeptide binding molecule may be used in conjunction with a target nucleic acid or polypeptide sequence and ligand in a method of treating or preventing disease in an animal or human patient.
  • a switching system whether a gene switch or a protein switch, may be used to regulate expression of a nucleotide sequence of interest in a plant.
  • Examples of specific applications include the following: 1. Improvement of ripening characteristics in fruit. A number of genes have been identified that are involved in the ripening process (such as in ethylene biosynthesis). Control of the ripening process via regulation of the expression of those genes will help reduce significant losses via spoilage.
  • heterologous genes may be introduced whose expression is regulated by a gene switch of the invention.
  • a nucleotide sequence of interest may encode a gene product that is preferentially toxic to cells of the male or female organs of the plant such that the ability of the plant to reproduce can be regulated.
  • the regulatory sequences to which the nucleotide sequence is operably linked may be tissue-specific such that expression when induced only occurs in male or female organs of the plant. Suitable sequences and/or gene products are described in WO89/10396, WO92/04454 (the TA29 promoter from tobacco) and EP-A-344,029, EP-A-412,006 and EP-A-412,911.
  • the methods according to our invention allow the screening of ligands that affect the binding of a nucleic acid binding molecule, for example, DNA binding protein such as a transcription factor, to its binding sites (e.g., DNA).
  • a transcription factor such as c-myc may be used to bind to an array of all the putative promoter sites in the genome in the presence or absence of one or more ligands.
  • the transcription factor may be recombinant, or purified from a cell extract, or present in a cell extract or extract from a subcellular compartment, e.g. the nucleus.
  • the array may contain for example the promoters of therapeutically important genes, or a subsection of these, for example, cytokine genes.
  • ligands that are capable of affecting transcription factor binding to certain genes may be isolated and used for example in therapy.
  • the Examples demonstrate the above as well as other uses.
  • the proteins and ligands used are known to affect each other and the DNA sequences are quite similar. However, it should be appreciated that all three components could have been chosen arbitrarily.
  • the Examples describe experiments in which the protein (nucleic acid binding molecule) was displayed on phage, but the nucleic acid binding molecule could have been identified by other means.
  • the nucleic acid binding molecule e.g., protein
  • the protein could have been epitope tagged, or an antibody to the native protein could have been used or the protein could have been fluorescently tagged or made as a GFP fusion, etc.
  • the protein is expressed by E coli on phage in the Examples, it could have been produced by any means known in the art including in vitro translation or importantly isolation from a relevant cell type.
  • a cell extract could have been used in the assays, preferably a mammalian cell extract where the protein is expressed, e.g. HeLa cell extract (for example, for Sp1 protein).
  • Proteins isolated from relevant cell extracts may be modified (including being glycosylated, phosphorylated, or otherwise post-translationally modified) and the drug (candidate ligand) could affect the modification.
  • the relevant protein in the cell extract may be capable of binding to the DNA together with other proteins present in the extract.
  • the DNA array can be stained with antibodies to any protein thought to bind the extract, for example, an Sp1 accessory protein.
  • test chemical may bind and 'knock out' any component of the DNA binding complex, not just the target (epitope tagged) protein.
  • Cells may be pre-treated with chemicals or modified in other ways (for example, by altering of growth conditions) prior to the preparation of extracts and testing on DNA arrays.
  • Arrays may be RNA and RNA-binding proteins may be assayed according to our invention.
  • the RNAs may be short synthetic fragments or full length mRNAs. Not only RNA binding, but RNA-processing, editing and splicing may be assayed.
  • other processes such as RNA trans splicing may be assayed instead of protein binding.
  • Assays according to our methods involving wild-type proteins may be used to identify protein-drug-DNA combinations for use as gene switches, as well as to discover therapeutics.
  • a powerful method of selecting DNA binding proteins is the cloning of peptides (Smith (1985) Science 228, 1315-1317), or protein domains (McCafferty et al, (1990) Nature 348:552-554; Bass et al, (1990) Proteins 8:309-314), as fusions to the minor coat protein (pIII) of bacteriophage fd, which leads to their expression on the tip of the capsid.
  • a phage display library is created comprising variants of the middle finger from the DNA binding domain of Zif268.
  • the gene for the Zif268 fingers (residues 333-420) is assembled from 8 overlapping synthetic oligonucleotides (see Choo and Klug, (1994) PNAS (USA) 91:11163-67), giving SfiI and NotI overhangs.
  • the genes for fingers of the phage library are synthesised from 4 oligonucleotides by directional end to end ligation using 3 short complementary linkers, and amplified by PCR from the single strand using forward and backward primers which contain sites for NotI and SfiI respectively.
  • Backward PCR primers in addition introduce Met-Ala-Glu as the first three amino acids of the zinc finger peptides, and these are followed by the residues of the wild type or library fingers as required. Cloning overhangs are produced by digestion with SfiI and NotI where necessary. Fragments are ligated to 1 µg similarly prepared Fd-Tet-SN vector.
  • The Fd-Tet-SN vector is a derivative of fd-tet-DOG1 (Hoogenboom et al, (1991) Nucleic Acids Res. 19, 4133-4137) in which a section of the pelB leader and a restriction site for the enzyme SfiI (underlined) have been added by site-directed mutagenesis using the oligonucleotide:
  • Electrocompetent DH5α cells are transformed with recombinant vector in 200ng aliquots, grown for 1 hour in 2xTY medium with 1% glucose, and plated on TYE containing 15 µg/ml tetracycline and 1% glucose.
  • the zinc finger phage display library of the present invention contains amino acid randomisations in putative base-contacting positions from the second and third zinc fingers of the three-finger DNA binding domain of Zif268, and contains members that bind DNA of the sequence XXXXXGGCG where X is any base. Further details of the library used may be found in WO 98/53057, which is incorporated herein by reference.
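The site notation XXXXXGGCG can be read as a simple pattern: five unconstrained bases followed by the fixed tetranucleotide GGCG. The following is a minimal sketch, not part of the described method, that turns this notation into a regular expression and scans a DNA string for matching sites. The function names `site_to_regex`, `find_sites` and `count_variants` are hypothetical and introduced only for illustration.

```python
import re
from typing import List

# Site as written in the text; "X" stands for any base.
LIBRARY_SITE = "XXXXXGGCG"

def site_to_regex(site: str) -> str:
    """Translate the X-wildcard notation into a regex fragment."""
    return "".join("[ACGT]" if ch == "X" else ch for ch in site)

def find_sites(dna: str, site: str = LIBRARY_SITE) -> List[int]:
    """Return 0-based start positions of all matches, including overlaps."""
    # A zero-width lookahead reports every start position, so overlapping
    # sites are not skipped.
    lookahead = re.compile("(?=" + site_to_regex(site) + ")")
    return [m.start() for m in lookahead.finditer(dna.upper())]

def count_variants(site: str = LIBRARY_SITE) -> int:
    """Number of distinct DNA sequences covered by the wildcard positions."""
    return 4 ** site.count("X")

print(find_sites("AAAAAAGGCG"))   # the 10-base selection target
print(count_variants())           # 4**5 sequences share the GGCG suffix
```

Under this reading, any 9-base window ending in GGCG counts as a candidate site, which is why both selection targets used in the Examples fall within the library's stated specificity.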
  • the DNA sequences AAAAAAGGCG and AAAAAAGGCGAAAAAA are used as selection targets in this example because short runs of adenines can cause intrinsic DNA bending - moreover, the structure of the bend can be disrupted by binding of the antibiotic distamycin A.
  • Bacterial colonies containing zinc finger phage libraries are transferred from plates to 200ml 2xTY medium (16g/litre Bactotryptone, 10g/litre Bactoyeast extract, 5g/litre NaCl) containing 50 µM ZnCl2 and 15 µg/ml tetracycline. Bacterial cultures are grown overnight at 30°C. Culture supernatant containing phages is obtained by centrifuging at 1500xg for 5 minutes. Phage selection is over 4 rounds.
  • a pre-selection step comprising binding of 10 pmol of biotinylated DNA target sites immobilised on 50mg streptavidin coated beads (Dynal) to 1 ml of phage solution (bacterial culture supernatant diluted 1:1 with PBS containing 50 µM ZnCl2, 4% Marvel, 2% Tween), for 1 hour at 20°C on a rolling platform. After this time, 0.5 ml of phage solution is transferred to a streptavidin coated tube and incubated with 2 pmol biotinylated DNA target site in the presence of 2 µM distamycin A (Sigma) and 4 µg poly [d(I-C)].
  • phage prepared from 96 colonies are screened for binding to the DNA target site in the presence and absence of distamycin A. Binding reactions are carried out in wells of a streptavidin-coated microtitre plate (Boehringer Mannheim) and contain 50 µl of phage solution (bacterial culture supernatant diluted 1:1 with PBS containing 50 µM ZnCl2, 4% Marvel, 2% Tween), 0.15 pmol DNA target site and 0.25 µg poly [d(I-C)]. When added, distamycin A is present at a concentration of 2 µM.
  • Single colonies of transformants obtained after four rounds of selection as described, are grown overnight in 2xTY/Zn/Tet. Small aliquots of the cultures are stored in 15% glycerol at -20°C, to be used as an archive.
  • Single-stranded DNA is prepared from TiVT phage in the culture supernatant and sequenced using the Sequenase 2.0 kit (U.S. Biochemical Corp.). The amino acid sequences of the zinc finger clones are deduced.
  • Amino acid sequences from helical regions of zinc fingers selected to bind DNA in the presence of distamycin
  • Clones 1-4 were selected to bind the oligo:
  • Clones 5-8 were selected to bind the oligo:
  • Zinc finger phage clones are isolated according to this method which bind the target with higher affinity in the presence of ligand than in the absence of ligand (see Figure 1). This method also selected certain clones that bound DNA in the absence of the ligand but were displaced from the DNA in the presence of the ligand (see Example 1.4 below).
  • Example 1.1 An adaptation of the method outlined in Example 1.1 was used to isolate phage that bound DNA in the presence of a different small molecule, actinomycin D.
  • the DNA target was AGCTTGGCG.
  • the preselection step comprised 7.5 pmol of biotinylated DNA target site immobilised on 18.75 µl streptavidin coated beads (Dynal) in a 100 µl mixture containing 4 µl phage library, 96 µl PBS, 2% Marvel, 1% Tween-20, 50 µM ZnCl2 for 1 hour at room temperature with constant mixing.
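The protocol mixes amounts (pmol) and concentrations (nM, µM), so a small conversion helps when comparing steps. The sketch below assumes the 7.5 pmol of target DNA is distributed through the full 100 µl mixture; that is a simplification, since the DNA is immobilised on beads. The function name `molar_concentration_nM` is hypothetical.

```python
def molar_concentration_nM(amount_pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol and a volume in µl to a concentration in nM.

    1 pmol per µl equals 1 µM, i.e. 1000 nM.
    """
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount_pmol * 1000.0 / volume_ul

# Nominal target-site concentration in the preselection mixture
# (assumes even distribution through the 100 µl volume).
preselection_nM = molar_concentration_nM(7.5, 100.0)
print(preselection_nM)
```

On this assumption the preselection step presents the target at a nominal 75 nM, well above the 5 nM used in the subsequent selections.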
  • Phage selections were made in streptavidin coated tubes with the phage supernatant, 5 nM biotinylated target DNA, 10 µM actinomycin D in the presence of 1 µg poly [d(I-C)] competitor. The selections were incubated for 1 hour at room temperature. The bound phage were washed and eluted as described above.
  • ELISA was performed as described above but using 5 nM biotinylated target DNA, 0.25 µg poly[d(I-C)] competitor in the assay and 10 µM actinomycin D where appropriate. Phage were sequenced using Big Dye Terminator Cycle Sequencing Kit (Perkin Elmer Biosystems) and automated sequencing.
  • amino acid sequences from the helical regions of the selected zinc fingers were sequenced as:
  • the library of DNA binding molecules was sorted using a library of DNA sequences in the presence of a small molecule. After DNA binding molecules that bound to DNAs in the presence of the small molecule had been selected, the optimal binding site(s) for each DNA binding molecule were determined using the binding site signature.
  • ELISA was performed as described above but using 30 nM biotinylated target DNA, 0.5 µg poly[d(I-C)] competitor in the assay and 10 µM echinomycin where appropriate. Phage were sequenced using Big Dye Terminator Cycle Sequencing Kit (Perkin Elmer Biosystems) and automated sequencing.
  • amino acid sequences from the helical regions of the selected zinc fingers were sequenced as:
  • the signature of the clone 0.4/4 was determined using a modified binding site signature assay. For each of the 5 randomised positions of the oligo, a base was fixed at one of the five positions whilst the remaining 4 positions contained defined mixtures of bases. For the pyrimidine position the base was fixed as either C or T and for the purine position the base was fixed as either G or A so that by testing each position in turn an optimal sequence or binding site signature could be determined.
  • This method determined the binding site sequence of clone 0.4/4 to be
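The logic of the binding site signature assay can be sketched in a few lines: for each randomised position, compare the signals obtained with each fixed base and keep the strongest. The readings below are invented placeholders, not data from the Examples, and the choice of which positions are the pyrimidine and purine positions is an assumption made only to show the restricted alphabets. The function name `binding_site_signature` is hypothetical.

```python
from typing import Dict, List

def binding_site_signature(signals: List[Dict[str, float]]) -> str:
    """For each randomised position, pick the fixed base with the
    strongest signal and join the winners into one sequence."""
    return "".join(max(per_base, key=per_base.get) for per_base in signals)

# Invented ELISA readings, for illustration only.
# One dict per randomised position; keys are the bases fixed there.
hypothetical_signals = [
    {"A": 0.21, "C": 0.15, "G": 0.92, "T": 0.18},
    {"C": 0.30, "T": 0.85},   # assumed pyrimidine position: C or T only
    {"A": 0.77, "G": 0.25},   # assumed purine position: G or A only
    {"A": 0.12, "C": 0.88, "G": 0.20, "T": 0.16},
    {"A": 0.10, "C": 0.14, "G": 0.11, "T": 0.95},
]

signature = binding_site_signature(hypothetical_signals)
print(signature)
```

Taking the per-position maximum treats the positions as independent, which is a simplification; a real analysis would also look at how clearly the top base is separated from the others.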
  • the optimal target DNA sequence as determined by the binding site signature, was synthesised together with two other related DNA sequences that were present in the original random DNA library but differed in some of the optimal base positions of the binding site.
  • oligonucleotides had the sequence:
  • Binding of the phage clone was tested as a function of DNA concentrations (from 5 nM to 0.312 nM) in the presence of 10 µM echinomycin.
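The text gives only the two ends of the DNA concentration range. One reading consistent with those endpoints is a two-fold serial dilution over five points, since 5 nM divided by 2 four times gives 0.3125 nM. That dilution scheme is an assumption, not something the text states; the sketch below simply generates such a series. The function name `dilution_series` is hypothetical.

```python
from typing import List

def dilution_series(start_nM: float, fold: float, n_points: int) -> List[float]:
    """Concentrations obtained by diluting `start_nM` by `fold` at each step."""
    return [start_nM / fold ** i for i in range(n_points)]

# Assumed two-fold series starting at 5 nM with five points.
series = dilution_series(5.0, 2.0, 5)
print(series)
```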
  • a phage ELISA was set up using 20 µl phage supernatant, 0.5 µg poly[d(I-C)], 10 µM echinomycin in PBS containing 1% Marvel, 1% Tween-20, 50 µM ZnCl2.
  • the total volume of the assay was 50 µl.
  • the assay was washed and developed as described for the binding site signature assay.
  • This example describes phage that bound DNA targets with higher affinity in the absence of ligand. These phage were isolated using either: (a) the same method as in example 1.1, or (b) by selection in the absence of small molecule and phage elution from DNA using a small molecule. In this latter case (b) the method was as follows.
  • Phage selection is over 4 rounds. Binding reactions contain 10 pmol biotinylated DNA site immobilised on 50mg streptavidin coated beads (Dynal) and a 1 ml solution of zinc finger phage library (as described in 1.1). Reactions were incubated for 1 h on a rolling platform. After this time, beads were washed 20 times as described in 1.1 and finally phage were eluted from the beads over 5 minutes using a solution containing ligand (10 µM Distamycin A, or 1 µM Actinomycin D in PBS/Zn).
  • phage isolated by either of the above methods (a or b) bound DNA in the absence of ligand but could be displaced by concentrations of distamycin A at 10 µM and actinomycin D at 1 µM.
  • the distamycin sensitive clone was selected using the DNA target AAAAAGCGGAAAAA and its helices were sequenced as:
  • actinomycin D sensitive clone was selected with the DNA target AGCTTGGCG and its helices were sequenced as:
  • Figure 6 demonstrates the sensitivity of each clone to the respective drug.
  • Binding assay reactions are carried out in wells of a streptavidin-coated microtitre plate (Boehringer Mannheim) as in Example 1, except that the distamycin concentration is varied while the DNA concentration is kept constant at 2 nM. Induction of higher affinity DNA binding is observed when distamycin is added to the binding reaction at 10⁻⁶ M to 10⁻⁷ M.
  • Background level affinity binding is defined as the phage retention in binding reactions that contain no DNA binding site.
  • Phage-selected or rationally designed zinc finger domains which bind target DNA sequences in a manner modulatable by a ligand can be converted to restriction enzymes which cleave DNA containing said target sequences in a manner modulatable by ligand. This is achieved by coupling an appropriate zinc finger, as isolated in Example 1 above, to a cleavage domain of a restriction enzyme or other nucleic acid cleaving moiety.
  • the oligonucleotides AAAAAAGGCG and AAAAAAGGCGAAAAAA are synthesised and ligated to arbitrary DNA sequences. After incubation with the zinc finger restriction enzyme, the nucleic acids are analysed by gel electrophoresis. Bands indicating cleavage of the nucleic acid at a position corresponding to the location of the oligonucleotide(s) (AAAAAAGGCG / AAAAAAGGCGAAAAAA) are visible. In a further experiment, the zinc finger is fused to an amino terminal copper/nickel binding motif. Under the correct redox conditions (Nagaoka, M, et al, (1994) J. Am. Chem. Soc. 116:4085-4086), sequence-specific DNA cleavage is observed, only in the presence of DNA incorporating oligonucleotide AAAAAAGGCG or AAAAAAGGCGAAAAAA.
  • a reporter system is produced which produces a reporter signal conditionally depending on the binding of the zinc finger DNA binding molecule to its target DNA sequence. This binding, and hence transcription from the reporter system, is modulated by the ligand Distamycin A.
  • a transient transfection system using zinc finger transcription factors is produced as described in Choo, Y, et al, (1997) J. Mol. Biol. 273:525-532.
  • This system comprises an expression plasmid which produces a phage-selected zinc finger fused to the activation domain of HSV VP16, and a reporter plasmid which contains the recognition sequence of the zinc finger upstream of a CAT reporter gene.
  • a zinc finger which recognises the DNA sequence AAAAAAGGCG is selected by phage display as described in Example 1.
  • This zinc finger domain is used to construct a multifinger protein.
  • said zinc finger is used to construct transcription factors as described above.
  • a transient expression experiment is conducted, wherein the CAT reporter gene on the reporter plasmid is placed downstream of a ten copy repeat of the sequence AAAAAAGGCG.
  • the reporter plasmid is cotransfected with a plasmid vector expressing the zinc finger-HSV fusion under the control of a constitutive promoter. No activation of CAT gene expression is observed.
  • CAT expression is observed as a result of the binding of the zinc finger transcription factor to its recognition sequence AAAAAAGGCG.
  • target DNA sequences to which it can bind are isolated.
  • the 434 repressor is a gene regulatory protein of phage 434. It binds to a 14 bp operator site (see Koudelka et al, 1987, Nature vol 326 pp 886-888). This operator site consists of five conserved bp (1-5), then four variable bp (6-9), then five more conserved bp (10-14) as shown below:
  • the conserved bases contact the 434 repressor protein.
  • the four variable bases are thought not to contact the 434 repressor protein. However, the four bases which do not contact the 434 repressor protein may affect the affinity of binding of the repressor to the operator site.
  • the 434 repressor protein (i.e. the DNA binding molecule) is contacted with a library of different target DNA sequences in the presence and absence of ligand.
  • the target DNA sequences are synthesized using an Applied Biosystems 380A DNA synthesizer and are purified by gel electrophoresis.
  • the four variable bases ('X' as shown above) are randomised, producing a library of 256 different target DNA molecules, position 5 being T, and position 10 being A.
  • the different target molecules are arranged in an array spotted onto a glass slide by means of a polypeptide linker. Structure of target DNA sequence library:
  • X is any base, and the partially randomised 434 operator is underlined.
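The library size follows directly from randomising four positions over four bases (4⁴ = 256). The following is a minimal Python sketch of that enumeration. The conserved half-site sequences are not reproduced in this text, so `left_flank` and `right_flank` are hypothetical caller-supplied parameters; the only constraints enforced are the ones stated above (five-base flanks, position 5 being T, position 10 being A).

```python
from itertools import product

BASES = "ACGT"


def variable_cores(n=4):
    """Return every possible n-base sequence (4**n of them)."""
    return ["".join(p) for p in product(BASES, repeat=n)]


def operator_library(left_flank, right_flank):
    """Embed each 4-base core between two 5-base conserved flanks.

    left_flank covers operator positions 1-5 and right_flank covers
    positions 10-14. Both are supplied by the caller (assumption: the
    actual conserved sequences are not given in the text).
    """
    if len(left_flank) != 5 or len(right_flank) != 5:
        raise ValueError("each conserved flank must be 5 bases long")
    if left_flank[-1] != "T" or right_flank[0] != "A":
        raise ValueError("position 5 must be T and position 10 must be A")
    return [left_flank + core + right_flank for core in variable_cores(4)]


if __name__ == "__main__":
    print(len(variable_cores()))  # 256
```

Each returned operator is 14 bases long, with the randomised core at positions 6-9.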
  • the 434 repressor protein is added to the library of target DNA sequences, in the presence and absence of 2 µM distamycin A (Sigma) ligand in 200 µl binding buffer (9 mM Tris-HCl pH 8.0, 90 mM KCl, 90 µM ZnSO₄) and incubated for 30 min.
  • Binding of the 434 repressor to the array is visualised by immunosorbent techniques, for instance using a fluorescently labeled antibody.
  • Target DNA sequences are selected which bind the 434 repressor with higher affinity in the presence of ligand than in the absence of ligand. Furthermore, DNA sequences are selected which bind the 434 repressor in the absence of ligand with a higher affinity than in the presence of ligand.
  • Example 6 Isolation of ligands which affect the binding of a DNA binding molecule to its cognate DNA target
  • the 434 repressor protein of Example 5 is used in conjunction with a target operator DNA sequence to which it binds.
  • a library of ligands is used in place of the 2 µM distamycin A (Sigma) ligand of
  • ligands are isolated which are capable of increasing the affinity of the 434 repressor for its cognate DNA target sequence. Ligands are also isolated which are capable of decreasing the affinity of the 434 repressor for its cognate DNA target sequence.
  • oligonucleotides are made containing 13 base binding regions on a 37 base oligo template TATANN123456789NNTCACAGTCAGTCCACACGTC where NN are the two base flanking sequences and 123456789 is the specific 9 base binding sequence.
  • 47 oligos had the binding sequence 12345GGCG where bases 1-5 are of a particular sequence and 46 oligos the sequence GCGG56789 where bases 5-9 are of a particular sequence.
  • Two oligos are used that had the binding sequence AAAAAAGGCGAAAAAA and AAAAAAGCGGAAAAAA in place of NN123456789NN.
  • the oligos are annealed to a biotinylated oligo GACGTGTGGACTGACTGTGA and filled-in using dideoxynucleotides and Klenow polymerase.
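The template arithmetic and the primer design can be checked with a short sketch. The Python below assembles a 37-base oligo from the fixed 5' and 3' portions of the template and confirms that the biotinylated oligo is the reverse complement of the template's 3' end, which is what allows it to anneal and be filled in. The function names are illustrative, and the 13-base binding region used in the example is the oligo 55 site quoted elsewhere in this description.

```python
TEMPLATE_5 = "TATA"                   # fixed 5' end of the template
TEMPLATE_3 = "TCACAGTCAGTCCACACGTC"   # fixed 3' end of the template
PRIMER = "GACGTGTGGACTGACTGTGA"       # biotinylated oligo from the text

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_oligo(binding_region):
    """Place a binding region (flanks plus site) into the template."""
    return TEMPLATE_5 + binding_region + TEMPLATE_3


def primer_anneals(oligo, primer=PRIMER):
    """True if the primer is complementary to the 3' end of the oligo."""
    return oligo.endswith(reverse_complement(primer))


if __name__ == "__main__":
    oligo = build_oligo("ATGTTAGGGCGTG")  # 13-base region (oligo 55 site)
    print(len(oligo), primer_anneals(oligo))  # 37 True
```

With a 16-base region such as AAAAAAGGCGAAAAAA the same function yields a 40-base oligo that still carries the primer annealing site.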
  • the oligonucleotides are diluted to a concentration of 0.1 pmol/µl and arrayed randomly into single wells on a 96-well plate as shown in Figure 7. Position 95 in Figure 7 did not have any oligo arrayed into the well and is used as a negative (background) control. This plate served as a stock oligonucleotide plate.
  • the drugs are distamycin A, a minor groove binding drug, actinomycin D, an intercalating drug and echinomycin, a bis-intercalating antibiotic.
  • the phage chosen are distamycin binding phage clone 3 (Dist3/2F), actinomycin D binding phage clone 1 (Ad1) and echinomycin binding phage 0.4/4 (EM0.4/4).
  • Two 96-well streptavidin coated plates are preblocked with 150 µl 4% Marvel, PBS, 50 µM ZnCl₂ for 1 hour at room temperature. Following blocking, the solution is discarded and 45 µl of Assay Mixture (2% Marvel, 1% Tween 20, PBS, 50 µM ZnCl₂, 20 µg salmon sperm DNA) with and without 10 µM drug are arrayed so that in the 96-well plates there are adjacent columns of no drug and drug solutions. Binding sites are added to a concentration of 8 nM as in Figure 8 so that the same oligo is arrayed into adjacent wells without drug and containing drug. 5 µl of overnight zinc finger phage culture supernatant are then added to each well in each plate and the assay is incubated at room temperature for 1 hour.
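The relation between the 0.1 pmol/µl stock and the 8 nM working concentration is simple arithmetic: 1 pmol/µl equals 1 µM, so the stock is 100 nM. The Python sketch below works out the stock volume needed per well. The 50 µl final volume in the example is an assumption (45 µl Assay Mixture plus 5 µl phage supernatant); the text does not state the volume of binding site added, and the function names are illustrative.

```python
def pmol_per_ul_to_nM(conc_pmol_per_ul):
    """Convert pmol/µl to nM (1 pmol/µl = 1 µmol/l = 1000 nM)."""
    return conc_pmol_per_ul * 1000.0


def stock_volume_ul(final_nM, final_volume_ul, stock_pmol_per_ul=0.1):
    """Volume of stock (µl) giving final_nM in final_volume_ul (C1*V1 = C2*V2)."""
    stock_nM = pmol_per_ul_to_nM(stock_pmol_per_ul)
    if final_nM > stock_nM:
        raise ValueError("final concentration exceeds stock concentration")
    return final_nM * final_volume_ul / stock_nM


if __name__ == "__main__":
    # Assumed final volume of 50 µl; 8 nM is the concentration given in the text.
    print(stock_volume_ul(8, 50))
```

Under that assumed volume, roughly 4 µl of stock oligonucleotide per well would be needed.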
  • Assay Mixture: 2% Marvel, 1% Tween 20, PBS, 50 µM ZnCl₂, 20 µg salmon sperm DNA
  • the plates are washed seven times by flooding them with 1% Tween 20, PBS, 50 µM ZnCl₂ followed by three washes with PBS, 50 µM ZnCl₂.
  • 100 µl of anti-M13 HRP conjugated antibody (1/5000 dilution) in 2% Marvel, 0.05% Tween 20, PBS, 50 µM ZnCl₂ are incubated in each well for 1 hour at room temperature and then the plates are washed with three washes of 0.05% Tween 20, PBS, 50 µM ZnCl₂ and three washes with PBS, 50 µM ZnCl₂.
  • the assay is developed with TMB substrate and stopped by the addition of 100 µl 0.1 M H₂SO₄.
  • the assays are read at OD450 and subtracted from absorbance readings at OD695. Readings from the zinc finger phage incubated in the absence of drug are subtracted from readings in the presence of drug to give values for drug/nucleic acid binding.
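The final subtraction step can be written out as a short sketch. The Python below takes already-corrected absorbance readings for each oligo, without and with drug, and returns the difference described above (with drug minus without drug). The data structure and function names are assumptions for illustration, and the numbers in the example are made up; they are not results from this document.

```python
def drug_effect(reading_with_drug, reading_without_drug):
    """Drug-dependent signal: reading with drug minus reading without drug."""
    return reading_with_drug - reading_without_drug


def drug_effects(readings):
    """Apply drug_effect to every oligo.

    readings maps an oligo identifier to a (without_drug, with_drug)
    pair of corrected absorbance values.
    """
    return {
        oligo: drug_effect(with_drug, without_drug)
        for oligo, (without_drug, with_drug) in readings.items()
    }


if __name__ == "__main__":
    # Illustrative values only.
    example = {"site_a": (0.25, 1.0), "site_b": (0.875, 0.125)}
    print(drug_effects(example))
```

A positive value indicates that the drug increases phage retention on that site, and a negative value indicates that the drug reduces it.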
  • Table 2 Examples of ON/OFF Switches Identified by Zinc Finger/Drug/DNA Microarrays Single Zinc Finger Proteins/Multiple Drugs/Multiple Site Binding Array
  • binding site oligos are arrayed on 384-well plates so that each site is dispensed into 4 adjacent wells. For each binding site, one well remained free of drug, whilst distamycin A, actinomycin D and echinomycin are each dispensed into one of the three remaining wells (see Figure 11). Binding sites and drug mixtures are prepared in stock plates and 18 µl of the Assay Mixture (see above) with or without 10 µM drug are dispensed into a 384-well plate. After 5 minutes, 2 µl phage are then added to each well and incubated for 1 hour at room temperature.
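One way to picture the plate described above is as a mapping from well name to (site, condition). The Python sketch below fills a 16 x 24 plate row by row, giving each site four adjacent wells in a row, so that 96 sites occupy all 384 wells. The row-by-row order and the order of conditions within each group of four are assumptions; the actual arrangement is the one shown in Figure 11, which is not reproduced here.

```python
import string

CONDITIONS = ("no drug", "distamycin A", "actinomycin D", "echinomycin")


def layout_384(n_sites=96, conditions=CONDITIONS, n_rows=16, n_cols=24):
    """Map well names (e.g. 'A1') to (site number, condition).

    Each site takes len(conditions) adjacent wells in one row; sites are
    filled left to right, then top to bottom (assumed ordering).
    """
    per_row = n_cols // len(conditions)
    if n_sites > per_row * n_rows:
        raise ValueError("too many sites for this plate")
    wells = {}
    for site in range(n_sites):
        row, block = divmod(site, per_row)
        for offset, condition in enumerate(conditions):
            col = block * len(conditions) + offset + 1
            well = f"{string.ascii_uppercase[row]}{col}"
            wells[well] = (site + 1, condition)
    return wells


if __name__ == "__main__":
    plate = layout_384()
    print(len(plate), plate["A1"], plate["A5"])
```

The same function covers the paired no-drug/drug arrangement of the 96-well assay if called with two conditions and `n_rows=8, n_cols=12`.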
  • the wells are washed seven times with 100 µl 1% Tween 20, PBS, 50 µM ZnCl₂ followed by three washes with 100 µl PBS, 50 µM ZnCl₂.
  • 40 µl of anti-M13 HRP conjugated antibody (1/5000 dilution) in 2% Marvel, 0.05% Tween 20, PBS, 50 µM ZnCl₂ are incubated in each well for 1 hour at room temperature and then the plates are washed with three washes of 100 µl 0.05% Tween 20, PBS, 50 µM ZnCl₂ and three washes with 100 µl PBS, 50 µM ZnCl₂.
  • the assay is developed with TMB substrate and stopped by the addition of 40 µl 0.1 M H₂SO₄. The assays are read as previously described.
  • Oligo 86 is an 'ON' sequence with distamycin A; however, it becomes an 'OFF' switch with either echinomycin or actinomycin D.
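The ON/OFF labelling of a site across several drugs can be sketched as a comparison of each drug well with the no-drug well of the same site. In the Python below the cut-off `threshold` is a hypothetical parameter (the text gives no numerical criterion), and the readings in the example are invented solely to reproduce the qualitative pattern described for oligo 86.

```python
def classify(effect, threshold=0.2):
    """Label a drug-dependent change in signal.

    'ON' if the drug raises the signal by at least `threshold`, 'OFF' if it
    lowers it by at least `threshold`, otherwise 'no switch'. The threshold
    value is an assumption, not taken from the text.
    """
    if effect >= threshold:
        return "ON"
    if effect <= -threshold:
        return "OFF"
    return "no switch"


def classify_site(no_drug, with_drug, threshold=0.2):
    """Classify one binding site against each drug.

    no_drug is the reading of the drug-free well; with_drug maps a drug
    name to the reading of the corresponding well.
    """
    return {
        drug: classify(reading - no_drug, threshold)
        for drug, reading in with_drug.items()
    }


if __name__ == "__main__":
    # Invented readings following the oligo 86 pattern described above.
    print(classify_site(0.5, {
        "distamycin A": 1.25,
        "echinomycin": 0.125,
        "actinomycin D": 0.125,
    }))
```

A single site can therefore carry different labels for different drugs, which is the behaviour reported for oligo 86.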
  • DNA sites are short lengths of DNA but these may be longer stretches of DNA such as cDNA libraries or libraries of subgenomic fragments of DNA such as promoter regions.
  • the following are examples of how the nucleic acid binding proteins used in these assays and their binding characteristics in response to small molecules are to be utilised.
  • Dist3/2F will bind DNA site AAAAAAGGCGAAAAAA in the presence of distamycin A (oligo 35), but will dissociate from DNA site ATGTTAGGGCGTG (oligo 55) in the presence of distamycin A.
  • the same drug will therefore cause this protein to re-locate to a different DNA sequence, which has broad utility in molecular biology, for example in differential regulation of two genes.
  • Dist3/2F will bind DNA site GAGCTGGGGCGTG (oligo 86) in the presence of distamycin A, but will bind DNA site GCGCCGCGGCGTG (oligo 94) in the presence of echinomycin.
  • Different drugs will therefore determine the binding of this protein to different DNA sequences, which has broad utility in molecular biology, for example in differential regulation of two genes.
  • 6-finger protein that comprises a dimer of domain EM0.4/4 (protein EM0.4/4-EM0.4/4) and a 6-finger protein that comprises a dimer of domain Dist3/2F (protein Dist3/2F-Dist3/2F) where the size of the two molecules differs such that they can be resolved by non-denaturing polyacrylamide gel electrophoresis.
  • Dist3/2F protein Dist3/2F-Dist3/2F
  • In the absence of the drug echinomycin, only protein Dist3/2F-Dist3/2F is seen to bind the DNA. In the presence of increasing concentrations of drug, the Dist3/2F-Dist3/2F protein no longer binds the DNA, but rather protein EM0.4/4-EM0.4/4 is seen to bind the DNA. Control reactions showing each protein bound to the DNA separately, as well as the free DNA, are included as size standards.
  • This example describes the use of the methods of our invention to select gene switches, and their components, using modified DNA.
  • the creation of the modified DNA is prophetic. There are protocols for linking BP to DNA though, but perhaps it's best not to disclose these at present.
  • a zinc finger library is constructed according to the protocols described in US Patent Nos. 6,013,453 and 6,007,988 and International Patent Publication Nos. WO 98/53060, WO 98/53057, and WO 98/53058; selection of zinc finger proteins capable of binding to specific sequences is also described in these documents. These documents also describe phage ELISA screening in presence and absence of a drug, formation of multifinger proteins, and bandshift assays, and the relevant protocols described in those documents are used here.
  • a zinc finger protein is selected to bind the modified DNA sequence d(G1-C2-G3-G4-C5-[BP]G6-C7-T8-A9-C10-C11) by selections from the library lib12 which is designed to recognise DNA sequences of the form GCGGXXXX.
  • the [BP]dG moiety is derived from the covalent linkage of a benzo[a]pyrenyl moiety to the guanine.
  • the recovered zinc finger phage clones are tested for binding to the unmodified DNA sequence (i.e., d(G1-C2-G3-G4-C5-G6-C7-T8-A9-C10-C11)). Those clones which do not bind the unmodified DNA are re-tested in the presence of compounds derived from a combinatorial library of benzo[a]pyrenyl derivatives.
  • Gene switches are identified comprising combinations of (1) the unmodified DNA sequence, (2) zinc fingers selected by the above method and (3) library-derived compounds capable of promoting the association of (1) and (2).
  • any selected zinc finger domain can be homo- or hetero-multimerised to the form of a polydactyl protein using appropriate linkers and tested for binding to a suitable head-to-tail multimer of a composite cognate DNA sequence, in the presence and absence of the corresponding BP derivative. In this way, gene switches that bind DNA with greater affinity and that display a greater responsiveness to the BP derivative are identified.

Abstract

The invention relates to a method for selecting one or more components of a molecular switch system comprising (I) a first molecule and (II) a second molecule, the first molecule binding to the second in a manner modulatable by a ligand, and (III) a ligand. The method comprises (a) determining the degree of binding between a candidate first molecule and a candidate second molecule in the presence of a candidate ligand, (b) determining the degree of binding between the candidate first molecule and the candidate second molecule in the absence of the candidate ligand, (c) identifying a first molecule/second molecule pair in which the binding of the first molecule to the second differs according to the presence or absence of a ligand, and (d) optionally isolating and/or identifying the first molecule, the second molecule, or the ligand.
PCT/GB2001/000187 2000-01-24 2001-01-18 Molecular switches II WO2001053479A2 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001226921A AU2001226921A1 (en) 2000-01-24 2001-01-18 Molecular switches ii

Applications Claiming Priority (10)

Application Number Priority Date Filing Date Title
GB0001578.4 2000-01-24
GB0001578A GB0001578D0 (en) 2000-01-24 2000-01-24 Gene switches
GB0001582A GB0001582D0 (en) 2000-01-24 2000-01-24 Gene switches
GB0001582.6 2000-01-24
PCT/GB2000/002071 WO2000073434A1 (fr) 1999-05-28 2000-05-30 Gene switches
PCT/GB2000/002080 WO2001000815A1 (fr) 1999-05-28 2000-05-30 Molecular switches
GBPCT/GB00/02080 2000-05-30
GBPCT/GB00/02071 2000-05-30
GBGB0029901.6A GB0029901D0 (en) 2000-01-24 2000-12-07 Molecular switches II
GB0029901.6 2000-12-07

Publications (2)

Publication Number Publication Date
WO2001053479A2 (fr) 2001-07-26
WO2001053479A3 (fr) 2002-01-31

Family

ID=27447759

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2001/000187 WO2001053479A2 (fr) 2000-01-24 2001-01-18 Molecular switches II

Country Status (1)

Country Link
WO (1) WO2001053479A2 (fr)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1991016429A1 (fr) * 1990-04-19 1991-10-31 The General Hospital Corporation Protein partner screening assays and uses thereof
WO1992000388A1 (fr) * 1990-07-02 1992-01-09 The Regents Of The University Of California Detection of analytes using fluorescent energy transfer
WO1993023431A1 (fr) * 1992-05-14 1993-11-25 Baylor College Of Medicine Mutated steroid hormone receptors, methods for their use and molecular switch for gene therapy
WO1995029195A1 (fr) * 1994-04-22 1995-11-02 California Institute Of Technology Ubiquitin-based split protein sensor
WO1996040892A1 (fr) * 1995-06-07 1996-12-19 Basf Ag Tetracycline-regulated transcriptional modulators with altered DNA binding specificities
WO1998034120A1 (fr) * 1997-01-31 1998-08-06 Universite De Montreal Protein fragment complementation assays to detect biomolecular interactions
WO1998044350A1 (fr) * 1997-04-02 1998-10-08 The Board Of Trustees Of The Leland Stanford Junior University Detection of molecular interactions by reporter subunit complementation
WO1999000488A1 (fr) * 1997-06-30 1999-01-07 The Regents Of The University Of California Method for screening nuclear transcription factors for the ability to modulate an estrogen response
WO1999028746A1 (fr) * 1997-12-04 1999-06-10 Institut Pasteur Bacterial multi-hybrid system and applications thereof
WO2000009704A1 (fr) * 1998-08-13 2000-02-24 Syngenta Limited Gene switch
WO2000052179A2 (fr) * 1999-03-03 2000-09-08 Genelabs Technologies, Inc. DNA binding compound-mediated molecular switch system
WO2000073434A1 (fr) * 1999-05-28 2000-12-07 Sangamo Biosciences, Inc. Gene switches
WO2001000815A1 (fr) * 1999-05-28 2001-01-04 Sangamo Biosciences, Inc. Molecular switches

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BULYK MARTHA L ET AL: "Exploring the DNA-binding specificities of zinc fingers with DNA microarrays." PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES, vol. 98, no. 13, 19 June 2001 (2001-06-19), pages 7158-7163, XP002174591 June 19, 2001 ISSN: 0027-8424 *
DICKINSON L A ET AL: "INHIBITION OF ETS-1 DNA BINDING AND TERNARY COMPLEX FORMATION BETWEEN ETS-1, NF-KB, AND DNA BY A DESIGNED DNA-BINDING LIGAND" JOURNAL OF BIOLOGICAL CHEMISTRY,AMERICAN SOCIETY OF BIOLOGICAL CHEMISTS, BALTIMORE, MD,US, vol. 274, no. 18, 1999, pages 12765-12773, XP000901235 ISSN: 0021-9258 *
M.L. BULYK ET AL.: "Quantifying DNA-protein interactions by double-stranded DNA arrays" NATURE BIOTECHNOLOGY, vol. 17, no. 6, June 1999 (1999-06), pages 573-577, XP002174590 NATURE PUBL. CO.,NEW YORK, US *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080058219A1 (en) * 2002-04-02 2008-03-06 Ambit Biosciences Corporation Assays and kits for detecting protein binding
US9267165B2 (en) * 2002-04-02 2016-02-23 Discoverx Corporation Assays and kits for detecting protein binding
US8329889B2 (en) 2008-02-15 2012-12-11 Trustees Of Boston University In vivo gene sensors

Also Published As

Publication number Publication date
WO2001053479A3 (fr) 2002-01-31

Similar Documents

Publication Publication Date Title
US6706470B2 (en) Gene switches
AU751487B2 (en) Nucleic acid binding proteins
Denis et al. The CCR4-NOT complex plays diverse roles in mRNA metabolism
US7262055B2 (en) Regulated gene expression in plants
US5498530A (en) Peptide library and screening method
White Gene transcription: mechanisms and control
US7521241B2 (en) Regulated gene expression in plants
AU2005200548B2 (en) Protein Switches
WO1998053060A1 (fr) Nucleic acid binding proteins
Yamada et al. An Arabidopsis protein that interacts with the cytokinin-inducible response regulator, ARR4, implicated in the His-Asp phosphorylay signal transduction
WO2001053479A2 (fr) Molecular switches II
CA2382541A1 (fr) Bibliotheque d'adn
US7943731B1 (en) Dimerizing peptides
NAGAMINE et al. STAN STASINOPOULOS,* HOANH TRAN,* EMILY CHEN, MYTHILY SACHCHITHANANTHAN
AU726759B2 (en) Improvements in or relating to binding proteins for recognition of DNA
AU2018241075A1 (en) Novel Methods of Constructing Libraries Comprising Displayed and/or Expressed Members of a Diverse Family of Peptides, Polypeptides or Proteins and the Novel Libraries
Szymanski DNA-protein interactions throughout the Arabidopsis calmodulin-3 gene
Palzkill Identification of Protein
Klemm Structural and biochemical studies of Oct-1 POU domain-DNA interactions

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP