EP1349943A2 - Detection of protein conformation using a split ubiquitin reporter system - Google Patents

Detection of protein conformation using a split ubiquitin reporter system

Info

Publication number
EP1349943A2
Authority
EP
European Patent Office
Prior art keywords
protein
polypeptide
domain
fusion protein
amino
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP02718797A
Other languages
German (de)
English (en)
Inventor
Nils Johnsson
Xavier Raquet
Jorg H. Eckert
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Max Planck Gesellschaft zur Foerderung der Wissenschaften eV
Original Assignee
Max Planck Gesellschaft zur Foerderung der Wissenschaften eV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Max Planck Gesellschaft zur Foerderung der Wissenschaften eV
Publication of EP1349943A2

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803 General methods of protein analysis not limited to specific proteins or families of proteins
    • G01N33/6842 Proteomic analysis of subsets of protein mixtures with reduced complexity, e.g. membrane proteins, phosphoproteins, organelle proteins
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/62 DNA sequences coding for fusion proteins
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/536 Immunoassay; Biospecific binding assay; Materials therefor with immune complex formed in liquid phase
    • G01N33/542 Immunoassay; Biospecific binding assay; Materials therefor with immune complex formed in liquid phase with steric inhibition or signal modification, e.g. fluorescent quenching
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G01N33/6803 General methods of protein analysis not limited to specific proteins or families of proteins
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/40 Fusion polypeptide containing a tag for immunodetection, or an epitope for immunisation
    • C07K2319/42 Fusion polypeptide containing a tag for immunodetection, or an epitope for immunisation containing a HA (hemagglutinin)-tag
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/70 Fusion polypeptide containing domain for protein-protein interaction
    • C07K2319/71 Fusion polypeptide containing domain for protein-protein interaction containing domain for transcriptional activation, e.g. VP16
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/80 Fusion polypeptide containing a DNA binding domain, e.g. LacI or Tet-repressor
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/95 Fusion polypeptide containing a motif/fusion for degradation (ubiquitin fusions, PEST sequence)
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2500/00 Screening for compounds of potential therapeutic value

Definitions

  • Protein interactions facilitate most biological processes including signal transduction and homeostasis.
  • the elucidation of particular interacting protein partners facilitating these biological processes has been advanced by the development of in vivo "two-hybrid" or "interaction trap" methods for detecting and selecting interacting protein partners (see Fields & Song, Nature 340: 245-6, 1989; Gyuris et al., Cell 75: 791-803, 1993).
  • These methods rely upon the reconstitution of a nuclear transcriptional activator via the interaction of two binding partner polypeptides - i.e. a first polypeptide fused to a DNA binding domain and a second polypeptide fused to a transcriptional activation domain.
  • both proteins need to be localized to the nucleus. Accordingly, the interaction of polypeptides which are normally localized to other compartments may not be detected because of the absence of other non-nuclear polypeptide components which facilitate the interaction or particular non-nuclear post-translational modifications which fail to occur in the nucleus or because the interacting proteins fail to fold properly when localized to the nuclear compartment.
  • the nuclear two-hybrid assay is ill-suited to the detection of protein interactions occurring within or at the surface of cellular membranes.
  • the Split Ubiquitin Protein Sensor is described in U.S. Patent Nos. 5,585,245 and 5,503,977.
  • the "split ubiquitin" method is a means of detecting protein-protein interactions that relies in part upon the fact that isolated amino- and carboxyl fragments of ubiquitin (e.g. comprising amino acids 1 to 37 and 38 to 76 respectively) are able to spontaneously associate to reconstitute a bimolecular ubiquitin polypeptide complex that is recognized by ubiquitin specific proteases (UBPs). These proteases can then actively cleave the polypeptide bond between amino acid residue 76 of the carboxyl fragment of ubiquitin and any linked polypeptide.
  • If this linked polypeptide is a reporter that can be detected once it is released from the carboxyl-terminal ubiquitin fragment, then the association of amino and carboxyl ubiquitin fragments can be monitored by the release of the reporter activity.
  • This "re-association" of ubiquitin amino and carboxyl fragments can be made dependent upon the association of two heterologous polypeptides by mutating one or both of the ubiquitin fragments (e.g. by a conservative amino acid substitution of a neutral amino acid residue) so that they fail to "reassociate" without the aid of linked heterologous binding partners.
  • the two heterologous polypeptides, i.e. a first polypeptide and a second polypeptide, are provided as fusions to the mutant amino and/or carboxyl ubiquitin fragments.
  • the carboxyl ubiquitin fragment is fused at its C-terminus to a reporter gene.
  • the resulting two fusions have the structures 1st polypeptide-N-Ub*(1-37) and 2nd polypeptide-C-Ub(38-76)-reporter.
  • the altered ubiquitin amino and carboxyl fragments fail to associate.
  • association of the first and second polypeptides results in reassembly of the amino Ub* and carboxyl Ub fragments and cleavage of the carboxyl Ub-reporter bond, thereby releasing free reporter. If the reporter is active upon its release, but inactive while fused to the carboxyl fragment of ubiquitin, its activity can be monitored in a screen for polypeptide binding partners (see U.S. Patent Nos. 5,585,245 and 5,503,977).
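  • The logic of the two-fusion assay described above can be summarized in a small truth table. The sketch below is only a toy boolean model: the class and attribute names are hypothetical, and it ignores expression levels, kinetics and partial cleavage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitUbAssay:
    """Toy logical model of the two-fusion split-ubiquitin assay (illustrative only)."""

    nub_is_mutant: bool      # True for the mutated amino fragment (Ub*)
    partners_interact: bool  # do the first and second polypeptides bind each other?

    def fragments_reassociate(self) -> bool:
        # Wild-type fragments associate spontaneously; the mutated amino
        # fragment needs the linked binding partners to bring the halves together.
        return (not self.nub_is_mutant) or self.partners_interact

    def reporter_released(self) -> bool:
        # In this model the UBPs cleave the Ub-reporter bond whenever a
        # ubiquitin complex has been reconstituted.
        return self.fragments_reassociate()


if __name__ == "__main__":
    for mutant in (False, True):
        for interact in (False, True):
            assay = SplitUbAssay(mutant, interact)
            print(f"mutant Nub={mutant!s:5} partners interact={interact!s:5} "
                  f"reporter released={assay.reporter_released()}")
```

Only the combination of a mutated amino fragment and non-interacting partners leaves the reporter attached, which is what makes reporter release informative about binding.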
  • the split-Ub assay has been shown to detect stable interactions between soluble proteins, between membrane proteins, and a transient interaction between substrate and transporter during protein translocation in vivo (Dünnwald et al., Mol. Biol. Cell.
  • the split ubiquitin method has been applied to measuring protein-protein (interpolypeptide) interactions, but not to measuring intramolecular or intrapolypeptide interactions, such as occur during polypeptide folding.
  • the proper folding of a protein to its mature conformation is a particularly important step in the expression of any biologically active protein. For example, proper folding is essential to the activity of proteins encoding enzymatic activities as well as to those serving structural roles in the cell. Indeed, two interacting proteins must each first fold into a proper conformation before they can interact. Protein folding and polypeptide conformation have many important biological consequences.
  • Conformational transitions are usually accompanied by changes in chemical and physical parameters of the protein.
  • Existing biophysical techniques that measure these parameters, such as circular dichroism or X-ray crystallography, generally rely on purified samples and cannot monitor conformational alterations in living cells.
  • one alternative to the established methods is to attach conformation-specific probes to the protein of interest.
  • the instant invention provides methods and reagents for measuring polypeptide conformation using a unique intrapolypeptide split-ubiquitin method.
  • the invention provides certain methods and reagents useful for detecting and measuring polypeptide conformational changes.
  • the method of the invention allows for the detection of a conformational change in a polypeptide resulting from a mutational alteration in the polypeptide sequence or from contact of the polypeptide with a test compound.
  • the method of the invention utilizes a fusion protein having the general structure Nub-X-Cub, where Nub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain and X is a polypeptide of interest - preferably a nonubiquitin polypeptide.
  • the fusion further comprises a reporter polypeptide fused to the carboxy-terminus of the Cub domain and the fusion protein comprises a fusion protein reporter moiety having the general structure: Nub-X-Cub-Reporter.
  • the reporter is URA3, thymidine kinase or Green Fluorescent Protein (GFP).
  • the Nub domain is a wild-type or mutant amino-terminal ubiquitin domain having mutated amino acid replacements at positions three and thirteen of ubiquitin such as: Nvi, Nia, Nva, Nig, Nvg, Nai, Naa, Nag, Ngi, Nga, or Ngg respectively.
  • the protein of interest is Guk1, Fpr1, Sec62p, beta-amyloid, G-proteins or p53.
  • a polypeptide conformational alteration resulting from a mutational alteration in the polypeptide sequence or from contact of the polypeptide with a test compound is detected by first measuring the reporter activity from the Nub-X-Cub-reporter and then comparing it to the reporter activity from the Nub-X-Cub-reporter following mutational alteration of the polypeptide of interest or contact of the polypeptide of interest with a test compound.
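  • The two-letter variant labels can be expanded into explicit substitutions. The sketch below assumes, as a reading of the naming scheme rather than a statement from the text, that the first letter gives the residue at position 3 and the second the residue at position 13; both positions are isoleucine in wild-type ubiquitin. The function name is hypothetical.

```python
from __future__ import annotations

# Wild-type ubiquitin carries isoleucine at positions 3 and 13.
WILD_TYPE = {3: "I", 13: "I"}


def decode_nub_variant(name: str) -> list[str]:
    """Expand a label such as 'Nvg' into substitutions such as ['I3V', 'I13G'].

    Assumption: letter 1 -> position 3, letter 2 -> position 13.
    """
    if len(name) != 3 or name[0] != "N" or not name[1:].isalpha():
        raise ValueError(f"unexpected variant label: {name!r}")
    substitutions = []
    for position, letter in zip((3, 13), name[1:].upper()):
        if letter != WILD_TYPE[position]:
            substitutions.append(f"{WILD_TYPE[position]}{position}{letter}")
    return substitutions


if __name__ == "__main__":
    for label in ("Nvi", "Nia", "Nvg", "Ngg"):
        print(label, decode_nub_variant(label))
```

Under that assumption, `decode_nub_variant("Nvg")` yields `['I3V', 'I13G']` and `decode_nub_variant("Nia")` yields `['I13A']`.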
  • a change in the level of the fusion protein reporter activity following mutation or contact with the test compound indicates that the mutational alteration or test compound causes a conformational change in the protein.
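  • The comparison of the two reporter measurements can be sketched as follows. This is a minimal illustration, not the procedure of the invention: the function name, the use of replicate means and the 1.5-fold threshold are all assumptions chosen for the example.

```python
from statistics import mean


def conformation_change_detected(baseline, altered, min_fold_change=1.5):
    """Compare reporter activity of Nub-X-Cub-reporter before and after a change.

    baseline: replicate activities for the unaltered polypeptide X
    altered:  replicate activities after mutation or test-compound contact
    min_fold_change: hypothetical cutoff; a real assay needs its own calibration
    Returns (changed, fold) where fold = mean(altered) / mean(baseline).
    """
    base, alt = mean(baseline), mean(altered)
    if base <= 0 or alt <= 0:
        raise ValueError("reporter activities must be positive")
    fold = alt / base
    # A change in either direction counts, since the text only asks for
    # "a change in the level" of reporter activity.
    changed = fold >= min_fold_change or fold <= 1 / min_fold_change
    return changed, fold


if __name__ == "__main__":
    print(conformation_change_detected([100, 110, 90], [40, 50, 60]))
    print(conformation_change_detected([100, 100], [105, 95]))
```

The symmetric threshold is a design choice: depending on where the termini of X sit, a conformational change could either increase or decrease reassociation of the ubiquitin halves.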
  • the method of the invention is performed in vivo, with the polypeptide reporter fusion expressed in an intact cell in situ or in a cultured cell existing in vitro.
  • the test compound is a polypeptide or a small molecule and is supplied from a library of test polypeptides or library of test small molecules.
  • One aspect of the invention provides a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the Cub domain, and X is a nonubiquitin polypeptide selected from the group consisting of: Guk1, Fpr1, Sec62p, beta-amyloid, p53, calmodulin, estrogen receptor alpha (ERα), FKBP, G-protein, VHL, tyrosine kinases, Src, Abl, Epidermal Growth Factor receptor (EGFR), Protein Kinase A (PKA), Protein Kinase C (PKC), Cyclophillins, Cyclin Dependent Kinases (CDKs), Cyclins, a protein of therapeutic, physiological or biological interest or variants / fragments thereof.
  • a related aspect of the invention provides a fusion protein comprising the structure Nub-X-Cub-RM, wherein Cub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the Cub domain, X is a nonubiquitin polypeptide, and Nub is a mutant amino-terminal ubiquitin domain selected from the group consisting of: Nvi, Nva, Nvg, Nai, Naa, Nag, Ngi, Nga, and Ngg.
  • a related aspect of the invention provides a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the Cub domain, and X is a nonubiquitin polypeptide, wherein RM is a selectable marker.
  • a related aspect of the invention provides a fusion protein comprising the structure Nub-X-Cub-RM, wherein Cub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the Cub domain, X is a nonubiquitin polypeptide, and Nub is a mutant amino-terminal ubiquitin domain which has altered affinity for Cub chosen such that for a given X polypeptide it just inhibits or just allows the reconstitution of a quasi-native ubiquitin and hence cleavage of RM from the fusion protein.
  • a related aspect of the invention provides a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the Cub domain, and X is a non-yeast nonubiquitin polypeptide.
  • a related aspect of the invention provides a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin polypeptide, and RM is a reporter moiety fused to the carboxy-terminus of the Cub domain, wherein upon cleavage of the Cub-RM junction, the first amino acid of the released RM is an amino acid other than methionine.
  • the first amino acid of the cleaved RM is Arginine, Lysine, Histidine, Phenylalanine, Tryptophan, Tyrosine, Leucine, Aspartate, Glutamate, Cysteine, Asparagine, Glutamine or Isoleucine.
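  • Checking whether a designed reporter moiety starts with one of the residues listed above is a simple lookup. The sketch below is illustrative only: the function name and the example sequences are made up, and the check says nothing about the actual stability of the released reporter.

```python
# One-letter codes for the first residues listed in the text.
LISTED_FIRST_RESIDUES = {
    "R": "Arginine", "K": "Lysine", "H": "Histidine", "F": "Phenylalanine",
    "W": "Tryptophan", "Y": "Tyrosine", "L": "Leucine", "D": "Aspartate",
    "E": "Glutamate", "C": "Cysteine", "N": "Asparagine", "Q": "Glutamine",
    "I": "Isoleucine",
}


def first_residue_is_listed(released_rm_sequence: str) -> bool:
    """Return True if the released RM starts with one of the listed residues."""
    if not released_rm_sequence:
        raise ValueError("empty sequence")
    return released_rm_sequence[0].upper() in LISTED_FIRST_RESIDUES


if __name__ == "__main__":
    # Hypothetical reporter N-termini, for illustration only.
    for seq in ("RHGSGAW", "MSKGEEL"):
        print(seq, first_residue_is_listed(seq))
```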
  • Another aspect of the invention provides a polynucleotide sequence encoding any one of the fusion proteins of the instant invention.
  • Another aspect of the invention provides a host cell harboring a polynucleotide sequence encoding any one of the fusion proteins of the instant invention.
  • Another aspect of the invention provides a method of detecting a conformational change in a polypeptide resulting from a mutational alteration in the polypeptide sequence comprising: (a) measuring a first fusion protein reporter moiety activity from a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin polypeptide of interest and RM is a reporter moiety, wherein upon cleavage of the Cub-RM junction, the first amino acid of the released RM is an amino acid other than methionine; and, (b) measuring a second fusion protein reporter moiety activity from a Nub-
  • a related aspect of the invention provides a method of detecting a conformational change in a polypeptide resulting from a point mutation or an insertion / deletion of no more than 3 amino acids in the polypeptide sequence comprising: (a) measuring a first fusion protein reporter moiety activity from a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin polypeptide of interest and RM is a reporter moiety; and, (b) measuring a second fusion protein reporter moiety activity from a Nub-X'-Cub-RM, wherein X' is a form of polypeptide X carrying a point mutation or a deletion / insertion of no more than three amino acids; wherein a change in the level of the second fusion protein RM activity relative to the first fusion protein
  • a related aspect of the invention provides a method of detecting a conformational change in a polypeptide resulting from a stimulus comprising: (a) measuring a first fusion protein reporter moiety activity from a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin polypeptide of interest and RM is a reporter moiety; and, (b) measuring a second fusion protein reporter moiety activity from a Nub-X'-Cub-RM, wherein X' is the X polypeptide which has been altered by the stimulus; wherein a change in the level of the second fusion protein RM activity relative to the first fusion protein RM activity indicates that the polypeptide has undergone a conformation change resulting from the stimulus.
  • the stimulus is a post-translational modification of the X protein, which can be phosphorylation, methylation, prenylation, acetylation, palmitoylation, myristoylation, reduction, oxidation, glycosylation, proteolytic cleavage, sulfation, hydroxylation, carboxylation, or the covalent linkage of ubiquitin-like proteins (Ubl) to X such as ubiquitination or sumoylation.
  • the stimulus is contacting the X protein with a test compound in trans, which test compound is selected from the group consisting of: a polypeptide, a hormone, a steroid, an ion, a polynucleotide, an oligosaccharide, a lipid, an enzyme substrate, a gas molecule, a small molecule, a co-factor, a vitamin, a metal ion, and a nucleotide phosphate.
  • the nonubiquitin polypeptide of interest is selected from the group consisting of: Guk1, Fpr1, Sec62p, beta-amyloid, p53, calmodulin, estrogen receptor alpha (ERα), FKBP, G-protein, VHL, tyrosine kinases, Src, Abl, Epidermal Growth Factor (EGF) receptor, Protein Kinase A (PKA), Protein Kinase C (PKC), Cyclophillins, Cyclin Dependent Kinases (CDKs), Cyclins, a protein of therapeutic, physiological or biological interest or variants / fragments thereof.
  • the Nub domain is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain selected from the group consisting of: Nia, Nig, Nvi, Nva, Nvg, Nai, Naa, Nag, Ngi, Nga, and Ngg.
  • the reporter moiety is a selectable marker.
  • the first amino acid of the RM is a non-methionine residue when the RM is released by cleavage of the Cub-RM junction by a ubiquitin-specific protease (UBP).
  • Nub is a mutant amino-terminal ubiquitin domain which has altered affinity for Cub chosen such that for a given X polypeptide it just inhibits or just allows the reconstitution of a quasi-native ubiquitin and hence cleavage of RM from the fusion protein.
  • X is a non-yeast nonubiquitin polypeptide.
  • At least one step is performed in a host cell expressing a ubiquitin-specific protease.
  • Another aspect of the invention provides a method to identify a compound which can change the conformation of a protein upon contacting the protein, comprising: (a) providing a plurality of test compounds which are not known to be able to cause the conformation change of the protein; (b) testing each compound by measuring a first fusion protein reporter moiety activity from a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin protein of interest, and RM is a reporter moiety; and, measuring a second fusion protein reporter moiety activity from a Nub-X'-Cub-RM, wherein X' is the X protein which has been
  • the method further comprises formulating the identified compound into a pharmaceutical composition.
  • the plurality of test compounds is a library of compounds which comprises 2 to 10 test compounds, or greater than 10 test compounds, preferably 10 to 500, 500 to 10,000 or greater than 10,000 test compounds.
  • Another aspect of the invention provides a method to identify a mutation in a protein which leads to the conformation change of the protein, comprising: (a) generating a plurality of candidate mutations of the protein; (b) testing each candidate mutation by measuring a first fusion protein reporter moiety activity from a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin protein of interest and RM is a reporter moiety; and, measuring a second fusion protein reporter moiety activity from a Nub-X'-Cub-RM, wherein X' is a mutationally altered form of the X protein harboring the candidate mutation; wherein a change in the level of the second fusion protein RM activity relative to the first fusion protein RM activity indicates that the X protein has undergone a conformation
  • Another aspect of the invention provides a method to identify a protein which changes conformation upon contacting a given compound or encountering an alteration in environmental factor, comprising: (a) providing a plurality of test proteins; (b) testing each protein X by measuring a first fusion protein reporter moiety activity from a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, X is the nonubiquitin test protein, and RM is a reporter moiety; and, measuring a second fusion protein reporter moiety activity from a Nub-X'-Cub-RM, wherein X' is the X protein which has been altered by contacting the given compound or by the given alteration in environmental factor; wherein a change in the level of the second fusion protein RM activity relative to the first fusion protein RM activity indicates that the
  • Another aspect of the invention provides a method to conduct a business, comprising: (a) by a suitable method of the invention, identifying one or more compounds which change the conformation of a polypeptide; (b) conducting therapeutic profiling of said identified compounds, or other derivatives thereof, for using the compounds in therapy for a condition; and, (c) formulating a pharmaceutical preparation including one or more compounds identified in (b) as a product having an acceptable therapeutic profile.
  • the business method further comprises establishing a distribution system for distributing said product for sale.
  • the business method further includes establishing a sales group for marketing the product.
  • Another aspect of the invention provides a method to conduct a business, comprising: (a) by a suitable method of the invention, identifying one or more compounds which change the conformation of a polypeptide; (b) conducting therapeutic profiling of said identified compounds, or other derivatives thereof, for using the compounds in therapy for a condition; and, (c) licensing, to a third party, the rights for further development of compounds and/or formulating a pharmaceutical preparation including one or more compounds identified in (b) to affect conformation change of the polypeptide for treatment of the condition.
  • Another aspect of the invention provides a method to conduct a business, comprising: (a) by one or more suitable methods of the invention, generating information or data, or identifying compounds, proteins or mutations / variants / derivatives thereof; (b) licensing, selling, providing for consideration or access to said information, said data, said identified compounds, proteins or mutations / variants / derivatives thereof.
  • kits for detecting or identifying alterations in the conformation of an X protein comprising a panel of at least two vector constructs for expressing fusion proteins of the general structure Nub-X-Cub-RM, wherein each vector construct comprises a coding sequence for Nub, an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain selected from Nia, Nig, Nvi, Nia, Nva, Nig, Nvg, Nai, Naa, Nag, Ngi, Nga, or Ngg; a coding sequence for Cub, a carboxy-terminal ubiquitin domain; a coding sequence for RM, a reporter moiety fused to the carboxy-terminus of the Cub domain; and at least one cloning site or multicloning site for subclon
  • the kit further comprises a host cell for expressing said fusion proteins from said vector constructs.
  • the kit further comprises instructions for detecting or identifying alterations in protein conformation by using the vector constructs.
  • Another aspect of the invention provides a method for detecting a conformation change of a protein resulting from a stimulus, comprising: (a) measuring a first spectrum of fusion protein reporter moiety activity from a first panel of at least two fusion proteins, each different from the other, comprising the general structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain selected from at least one of Nia, Nig, Nvi, Nia, Nva, Nig, Nvg, Nai, Naa, Nag, Ngi, Nga, and Ngg, Cub is a carboxy-terminal ubiquitin domain, X is the protein, and RM is a reporter moiety; (b) measuring a first
  • the stimulus is a mutational alteration of the X protein, an alteration in environmental factor, a post-translational modification of the X protein, or contacting the X protein with a test compound in trans.
  • compositions comprising: (a) a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the Cub domain and X is a nonubiquitin polypeptide; and, (b) a compound that, when brought into contact with polypeptide X, causes a conformational change in it; and/or, (c) a fusion protein comprising the structure Nub-X'-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the Cub domain and
  • Another aspect of the invention provides a method of controlling activity of a target gene, comprising: (a) providing a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin polypeptide, and RM is a reporter moiety fused to the carboxy-terminus of the Cub domain, wherein the reporter moiety is a gene activating moiety; (b) treating the X polypeptide with a stimulus, thereby causing the cleavage of the RM as a result of a conformational change of the X polypeptide; wherein the released RM controls activity of the target gene.
  • Another aspect of the invention provides a method of controlling activity of a protein, comprising: (a) providing a fusion protein comprising the structure Nub-X-Cub-RM, wherein Nub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, Cub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin polypeptide, and RM is the protein fused to the carboxy-terminus of the Cub domain, wherein upon cleavage of the Cub-RM junction, the first amino acid of the released RM is an amino acid other than methionine; (b) treating the X polypeptide with a stimulus, thereby causing the cleavage of the RM as a result of a conformational change of the X polypeptide; wherein the released RM is degraded by N-end rule components, thereby controlling activity of the protein.
  • the stimulus is an alteration in environmental factor, a post-translational modification of the X protein, or contacting the X protein with a test compound in trans.
  • kits for measuring or detecting protein conformation change caused by a stimulus comprising: (a) one or more vector constructs for expressing fusion proteins of the general structure Nub-X-Cub-RM, wherein each vector construct comprises a coding sequence for Nub, an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain selected from Nia, Nig, Nvi, Nia, Nva, Nig, Nvg, Nai, Naa, Nag, Ngi, Nga, or Ngg; a coding sequence for Cub, a carboxy-terminal ubiquitin domain; a coding sequence for RM, a reporter moiety fused to the carboxy-terminus of the Cub domain; and at least one cloning site or multicloning site for subcloning the X-
  • the instruction is not physically associated with the vector constructs of (a).
  • the instruction can be posted on a website, or updated periodically, or accessible as a published document.
  • the stimulus is an alteration in environmental factor, a post-translational modification of the X protein, or contacting the X protein with a test compound in trans.
  • Another aspect of the invention provides a method to detect or measure an alteration of an environmental factor or the presence of a compound in a sample comprising the steps: (a) providing a fusion protein comprising the structure N ub -X-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the C ub domain, and X is a nonubiquitin polypeptide which changes conformation in response to said alteration in environmental factor or presence of said compound; (b) contacting the fusion protein with the environment or the sample containing the compound; and, (c) measuring the degree of cleavage of the reporter moiety (RM) from the fusion protein; wherein a change in the degree of RM activity compared to a standard or control indicates an alteration in said environmental factor or the presence of said compound in the sample.
  • Figure 3 Intrapolypeptide split-ubiquitin assay applied to visualizing different spatial arrangements of N- and C-termini.
  • FIG. 5 Application of the intrapolypeptide split-ubiquitin method to monitoring destabilizing mutations in three proteins with different N- and C- terminal topologies,
  • Fpr1p carries the N- and C-terminus on opposite faces of the molecule.
  • An immunoblot of the extracts of cells expressing the N vg -Fpr1p-C ub -Dha and the corresponding constructs of the different mutants of Fpr1p is shown together with the quantification of four independent experiments by chemiluminescence.
  • Guk1p carries the N- and C-terminus in spatial proximity.
  • the exposed N-terminal arginine of the RUra3p channels the reporter into the N-end rule pathway of protein degradation.
  • the cells are uracil auxotrophs.
  • the cells are uracil prototrophs.
  • (b) 10⁵, 10⁴, 10³, 10² and 10¹ yeast cells expressing the different N ub -X-C ub -RUra3p fusion proteins as indicated were spotted onto medium lacking uracil and tryptophan to select for the presence of the plasmid. Cells were incubated for three days at 30°C.
  • Cells containing ΔC125-C ub -Dha or an empty plasmid and coexpressing different N ub -ΔC125-Dha were extracted after one hour of induction with CuSO4 and extracts probed after 12.5% SDS-PAGE with anti-HA antibody.
  • FIG. 8 The three potential outcomes of the split-Ub experiment.
  • A) Ste18p is bound to Ste4p.
  • the structure of Ste18p keeps N ub and C ub at a distance that inhibits their efficient reassociation.
  • the RUra3p reporter is not cleaved off and the cells grow on SD-ura.
  • B) Ste18p maintains its conformation in the absence of Ste4p. The cells can grow on SD-ura.
  • the coupled N ub and C ub can reassociate, the RUra3p reporter is cleaved off and degraded.
  • the cells cannot grow on SD-ura.
  • Ste18 91 p, an N-terminally truncated form of Ste18p that was used for the experiments, is functional.
  • the halo of non-growing cells around a filter disk soaked with α-factor documents the functionality of the protein.
  • the cells containing an empty vector instead are unaffected by the mating hormone.
  • Ste4p induces an altered conformation in Ste18p.
  • A) Cells containing N ia -STE18-C ub -RURA3 and an empty vector or P GAL1 -HA-STE4 (shown are two independent transformants) and cells containing N-terminally truncated derivatives of N ia -STE18 91 -C ub -RURA3 and P GAL1 -HA-STE4 were spotted (10⁵, 10³, 10², 10¹) on plates lacking uracil and tryptophan and containing galactose to induce or glucose to repress the expression of HA-Ste4p.
  • Plates contained 20 µM copper ions to moderately express the STE18 constructs. Cells were grown for 3 days at 30°C.
  • Figure 11 The ratio of uncleaved to cleaved N vg -Ste18 ⁇ -C ub -Dha is influenced by coexpression of HA-Ste4p.
  • the quantification of the experiment is shown in (B). Bars indicate the percentage of cleaved Dha.
  • the N ia fusion protein allows one to distinguish between the cells expressing the wild type p53 core (growth) and the cells expressing the V143A mutant of the core (non-growth).
  • the invention provides methods and reagents for monitoring protein conformation by attaching N Ub and C Ub to the N- and C- terminus of the same polypeptide, thereby allowing the measurement of intramolecular N ub and C Ub reassociation by quantifying the ratio of cleaved to uncleaved fusion protein.
  • the resulting ratio is defined by the affinity of N ub to C ub, and by the nature of the polypeptide separating N ub from C ub.
  • the invention further provides a variety of mutant N ub sequences for varying the intrinsic affinity of the N ub and C Ub moieties.
  • By introducing mutations into N ub and thereby altering its affinity for C ub, a cleavage spectrum of different N ub -X-C ub fusion proteins is obtained which is characteristic for the inserted polypeptide X.
  • the invention allows the balance between the folded and the unfolded state of a protein to be sensitively adjusted by appropriate selection of the N Ub moiety.
  • Attaching N ub and C ub to the termini of a protein will disturb this balance. This effect is clearly seen for the N ⁇ -labeled fusion proteins, where cleavage is always complete irrespective of the position of the termini and the stability of the structure (Figures 2, 3).
  • an N ub -C ub pair has to be selected that does not disturb the balance between the folded and the unfolded state too much. That is, it may be advantageous to choose an N ub such that, for a given protein, it just inhibits or just allows the reconstitution of a quasi-native ubiquitin and hence cleavage of the reporter moiety from the fusion protein.
  • an N ub is chosen that just inhibits the growth of yeast cells expressing an N ub -X-C ub -RUra3p fusion protein when plated on appropriate selective medium.
  • the detection of uncleaved fusion protein in a steady state analysis is a good first indication.
  • the optimal extent of cleavage depends on the known or expected shift upon destabilizing the structure. The optimal cleavage is greater than 50% for proteins like Guk1p and less than 50% for proteins like Fpr1p.
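The selection rule described above can be sketched in Python. This is only a minimal illustration, assuming a cleavage spectrum has already been measured as the percentage of cleaved reporter for each N ub variant. The function names `cleaved_percent` and `pick_nub`, the target value, and the example percentages are hypothetical and are not data from the experiments.

```python
def cleaved_percent(cleaved_signal, uncleaved_signal):
    """Percentage of reporter released, from two band intensities."""
    total = cleaved_signal + uncleaved_signal
    if total <= 0:
        raise ValueError("no signal to quantify")
    return 100.0 * cleaved_signal / total


def pick_nub(spectrum, target_percent):
    """Return the N ub variant whose cleavage is closest to the target."""
    return min(spectrum, key=lambda name: abs(spectrum[name] - target_percent))


# Hypothetical cleavage spectrum (percent cleaved per N ub variant).
example_spectrum = {"N_ia": 90.0, "N_ag": 60.0, "N_vg": 22.0, "N_gg": 5.0}

# A target below 50% for a protein whose cleavage is expected to rise
# when its structure is destabilized.
chosen = pick_nub(example_spectrum, target_percent=30.0)
```

Under the rule of thumb stated above, the target would be set above 50% when destabilization is expected to lower cleavage and below 50% when it is expected to raise cleavage.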
  • the invention provides additional functional ubiquitin mutants having particular desirable reassociation properties.
  • the cleavage spectrum of N ub -Guk1p-C ub -Dha revealed an interesting feature about the influence of the two isoleucines at positions three and thirteen of Ub on the stability of the protein (Figure 2c).
  • the amount of cleaved Dha is always higher for the N ub fusions carrying the residue with the smaller side chain at position three of N ub (compare N ia with N ai, N ig with N gi, or N ga with N ag; Figure 1c).
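The N ub mutant names used in this document (for example N vg, N ag, N gg) can be read as a two-letter code for the residues at positions three and thirteen of ubiquitin. The sketch below enumerates such names in Python. It rests on two assumptions that are inferred from the names and comparisons in the text rather than stated outright: the first letter denotes position three and the second position thirteen, and valine is not used at position thirteen. The identifiers are hypothetical.

```python
from itertools import product

# One-letter residue codes (lower case, as in the mutant names).
POSITION_3 = "ivag"   # isoleucine (wild type), valine, alanine, glycine
POSITION_13 = "iag"   # assumption: no valine at position thirteen


def nub_variant_names():
    """List mutant N ub names, skipping the wild-type i/i combination."""
    names = []
    for p3, p13 in product(POSITION_3, POSITION_13):
        if p3 == "i" and p13 == "i":
            continue  # wild-type N ub, not a mutant
        names.append(f"N_{p3}{p13}")
    return names
```

Enumerating the panel this way makes it easy to check that every variant in a cleavage spectrum has been measured.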
  • one aspect of the invention provides a fusion protein comprising the structure N ub -X-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the C ub domain, and X is a nonubiquitin polypeptide selected from the group consisting of: Guk1, Fpr1, Sec62p, beta-amyloid, p53, calmodulin, estrogen receptor alpha (ERα), FKBP, G-protein, VHL, tyrosine kinases, Src, Abl, Epidermal Growth Factor receptor (EGFR), Protein Kinase A (PKA), Protein Kinase C (PKC), Cyclophilins, Cyclin Dependent Kinases (CDKs), Cyclins, a protein of therapeutic, physiological or biological
  • the nonubiquitin polypeptide may be a protein of therapeutic, physiological or biological interest.
  • proteins include VHL, tyrosine kinases, Src, Abl, Epidermal Growth Factor (EGF) receptor, Protein Kinase A (PKA), Protein Kinase C (PKC), Cyclophilins, Cyclin Dependent Kinases (Cdk), Cyclins or variants / fragments thereof.
  • the nonubiquitin polypeptide comprises a protein from a non-yeast species.
  • Such polypeptides offer particularly advantageous features and may be obtained from species that include mammalian, insect, plant, vertebrate, animal or prokaryotic species.
  • fusion proteins of the invention comprising such polypeptides may be obtained by first isolating or synthesizing nucleic acids that encode a non-yeast polypeptide of interest and then expressing it as a fusion protein of the invention in a suitable host cell.
  • cleavage of the reporter moiety can be detected. This may be conducted, for example, by Western blot using an antibody specific for the reporter moiety or for an epitope attached to the reporter moiety. Many suitable epitopes will be known to a person skilled in the art, and include HLAD, HA, HIS etc.
  • the reporter moiety has an activity.
  • the reporter moiety is a selectable marker, a transcription factor or a fluorescent marker.
  • the selectable marker is selected from the group consisting of: URA3, HIS3, LYS2, HygTk, Tkneo, TkBSD, PACTk, HygCoda, Codaneo, CodaBSD, PACCoda, Tk, codA, HPRT, and GPT2.
  • the selectable marker is selected from the group consisting of: TRP1, CYH2, and CAN1.
  • the reporter moiety is a negative selectable marker such as URA3 or CYH2.
  • the reporter moiety is a positive selectable marker such as HIS3, TRP2, ADE2, Zeocin, Bla, or β-galactosidase.
  • a fusion protein comprising the structure N ub -X-C ub -RM, wherein C ub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the C ub domain, X is a nonubiquitin polypeptide, and N ub is a mutant amino-terminal ubiquitin domain selected from the group consisting of: N vi, N va, N vg, N ai, N aa, N ag, N gi, N ga, and N gg.
  • a related aspect of the invention provides a fusion protein comprising the structure N Ub -X-C Ub -RM, wherein N Ub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, C Ub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the C ub domain, and X is a nonubiquitin polypeptide, wherein RM is a selectable marker.
  • a fusion protein comprising the structure N ub -X-C ub -RM, wherein C ub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the C ub domain, X is a nonubiquitin polypeptide, and N ub is a mutant amino-terminal ubiquitin domain which has altered affinity for C ub in such a way that cleavage of the RM from C ub as a result of reconstituting a quasi-native ubiquitin moiety leads to a growth advantage on a selective medium for a cell harboring the N ub -X-C ub -RM fusion protein.
  • a related aspect of the invention provides a fusion protein comprising the structure N ub -X-C ub -RM, wherein C ub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the C ub domain, X is a nonubiquitin polypeptide, and N ub is a mutant amino-terminal ubiquitin domain which has altered affinity for C ub, chosen such that for a given X polypeptide it just inhibits or just allows the reconstitution of a quasi-native ubiquitin and hence cleavage of RM from the fusion protein.
  • N Ub can be generated by any art-recognized mutagenesis procedure (such as random mutagenesis or combinatory mutagenesis).
  • the candidate plurality (or library) of N ub -X-C ub -RM constructs can be introduced into a plurality of target cells. The ability of any of these cells to cleave the RM from the fusion protein depends on the reconstitution of a quasi-native ubiquitin moiety (see below).
  • a quasi-native ubiquitin moiety may be reconstituted, resulting in the cleavage of the C ub -RM juncture.
  • the RM may (or may not) be degraded based on the identity of its nascent N-terminal amino acid, and the survival of the host cell on a selective medium can be determined by this event. For example, if RM is a negative selectable marker (R-ura3, see example below), only cells that have lost the RM will survive on the selective medium (e.g. 5-FOA).
  • Such cells can be selected and their N ub -X-C ub -RM constructs recovered to obtain the N ub mutant.
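The selection logic for an R-ura3 reporter, as described here and in the description of Figure 8, can be summarized in a short Python sketch. It is a deliberate simplification that treats cleavage as all-or-none and assumes that a cleaved reporter is fully degraded. The function name `grows` and the medium labels are illustrative choices, not part of the original disclosure.

```python
def grows(reporter_cleaved, medium):
    """Predict growth of a cell carrying an N ub -X-C ub -RUra3p fusion.

    reporter_cleaved: True if RUra3p was cleaved off (and then degraded).
    medium: "SD-ura" (lacks uracil) or "5-FOA" (counter-selects Ura3).
    """
    if medium == "SD-ura":
        # Growth needs Ura3 activity, so the reporter must stay attached.
        return not reporter_cleaved
    if medium == "5-FOA":
        # Only cells that have lost the reporter survive.
        return reporter_cleaved
    raise ValueError(f"unknown medium: {medium}")
```

The two media therefore select for opposite outcomes, which is why the same construct can be used to recover N ub mutants that either just allow or just inhibit reassociation.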
  • If RM is a transcription factor that is stable after cleavage, it may enter the nucleus to initiate the transcription of a gene essential for host cell survival in a selective medium, a function that a tethered RM is unable to perform because of its cytosolic or non-nuclear localization or its unfavorable conformation as an N ub -X-C ub -RM fusion protein.
  • an N ub that just allows association of the amino- and carboxy-terminal ubiquitin domains of an N ub -X-C ub -RM fusion protein.
  • Such mutations would be useful to detect conformational changes in the X polypeptide that cause disassociation of the quasi-native ubiquitin moiety, for example through the use of a RM that is a positive selectable marker with a non-methionine first amino acid after cleavage.
  • the invention also provides a method to identify a fusion protein comprising the structure N ub -X-C ub -RM, wherein C ub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the C ub domain, X is a nonubiquitin protein, and N ub is a mutant amino-terminal ubiquitin domain which has altered affinity for C ub in such a way that the cleavage of the RM from C ub as a result of reconstituting a quasi-native ubiquitin moiety leads to a growth advantage of a cell harboring the N ub -X-C ub -RM fusion protein on a selective medium, comprising: (i) generating a plurality of mutant N ub of the N ub -X-C ub -RM fusion construct; (ii) introducing the plurality of mutant N Ub -X-C u
  • a related aspect of the invention provides a fusion protein comprising the structure N ub -X-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the C ub domain, and X is a non-yeast nonubiquitin polypeptide.
  • a related aspect of the invention provides a fusion protein comprising the structure N Ub -X-C ub -RM, wherein N Ub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, C Ub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin polypeptide, and RM is a reporter moiety fused to the carboxy-terminus of the C Ub domain, wherein upon cleavage of the C ub -RM junction, the first amino acid of the released RM is an amino acid other than methionine.
  • the first amino acid of the cleaved RM is Arginine, Lysine, Histidine, Phenylalanine, Tryptophan, Tyrosine, Leucine, Aspartate, Glutamate, Cysteine, Asparagine, Glutamine or Isoleucine.
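As a small illustration, the list of first residues given above can be encoded as a lookup in Python. This sketch only checks membership in that list. It does not model how strongly each residue destabilizes the released RM, which the text here does not specify. The names `LISTED_FIRST_RESIDUES` and `is_listed_first_residue` are hypothetical.

```python
# One-letter codes for the residues named in the embodiment above:
# Arg, Lys, His, Phe, Trp, Tyr, Leu, Asp, Glu, Cys, Asn, Gln, Ile.
LISTED_FIRST_RESIDUES = frozenset("RKHFWYLDECNQI")


def is_listed_first_residue(residue):
    """True if the one-letter residue code is in the list above."""
    if len(residue) != 1:
        raise ValueError("expected a one-letter amino acid code")
    return residue.upper() in LISTED_FIRST_RESIDUES
```

For example, the arginine that heads the RUra3p reporter is in the list, whereas methionine is not.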
  • Another aspect of the invention provides a polynucleotide sequence encoding any one of the fusion proteins of the instant invention.
  • a vector that encompasses a fusion polynucleotide encoding any of the fusion proteins of the instant invention.
  • a host cell that harbors a vector or a fusion polynucleotide or a fusion protein of the instant invention.
  • Various types of host cells will be suitable to practice this aspect of the invention and include a mammalian cell, or a plant cell, or an insect cell.
  • the cell is selected from the group consisting of: a human cell, a mouse cell, a rat cell, a hamster cell, a zebrafish cell, a Drosophila cell, or a nematode cell.
  • the cell is selected from the group consisting of: an A. thaliana cell and an N. tabacum cell.
  • the invention further provides methods for measuring the effects of a number of stimuli on polypeptide conformation.
  • One aspect of the invention provides a method of detecting a conformational change in a polypeptide resulting from a mutational alteration in the polypeptide sequence comprising: (a) measuring a first fusion protein reporter moiety activity from a fusion protein comprising the structure N ub -X-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin polypeptide of interest and RM is a reporter moiety, wherein upon cleavage of the C ub -RM junction, the first amino acid of the released RM is an amino acid other than methionine; and, (b) measuring a second fusion protein reporter moiety activity from a N ub -X'-C ub -RM, wherein
  • a method of detecting a conformational change in a polypeptide resulting from a point mutation or a small insertion or deletion (such as an insertion / deletion of no more than 3 amino acids, preferably no more than 5, 10, 15, 20, 30, or 50 amino acids) in the polypeptide sequence comprising: (a) measuring a first fusion protein reporter moiety activity from a fusion protein comprising the structure N ub -X-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin polypeptide of interest and RM is a reporter moiety; and, (b) measuring a second fusion protein reporter moiety activity from a N ub -X'-C ub -RM, wherein X' is a point mutation or a deletion / insertion of no more
  • Point mutation includes one or more point mutations in the same protein, either adjacent to one another or apart from one another in the primary protein sequence.
  • the mutation is selected from deletion, insertion /addition, substitution, reversion, missense or nonsense point mutation.
  • the technique was used to measure the effect of mutations in Fpr1p which we assumed would most probably alter the general stability of the protein. This was achieved by drastically changing residues in the C-terminal β-strand of the protein. Consequently, the amount of the cleaved N vg -C ub fusion protein rose from 22% for the wild type to approximately 61% for the mutant.
  • the valine at position 107 that is part of the C-terminal β-strand with alanine or glycine
  • FprlMC probably unfolded protein
  • the alanine mutations shift the ratio of cleaved to uncleaved N vg -Fpr1-C ub -Dha from 22% to 55% (Figure 4a).
  • the corresponding mutation in the human FKBP12 has been previously analyzed in vitro.
  • the structure of FKBP12 is superimposable onto the structure of the yeast homologue Fpr1p (Rotonda et al., J. Biol. Chem. 268: 7607-09, 1993).
  • Replacing the equivalent valine 101 with an alanine reduces the stability of the human protein by 2.75 kcal mol⁻¹ (Main et al., Biochemistry 37: 6145-53, 1998).
  • FKBP12 was denatured by urea and the unfolding was followed by changes in the spectroscopic parameters of the protein ensemble.
  • the split-Ub assisted analysis was performed under cellular conditions at 30°C.
  • a glycine at this position of the protein should therefore destabilize the structure of Fprlp even more than the corresponding alanine exchange (Kellis et al., Biochemistry 28: 4914-22, 1989).
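To give a feel for the size of the in vitro destabilization quoted above, the following Python sketch converts a free-energy difference into a fold change of the unfolding equilibrium constant, using K_mut / K_wt = exp(ΔΔG / RT). It assumes a simple two-state folding equilibrium and, purely for illustration, applies the in vitro value at the 30°C used for the cellular assay. Neither assumption is made in the text itself, and the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # kcal mol^-1 K^-1


def unfolding_constant_fold_change(ddg_kcal_per_mol, temp_celsius):
    """Fold change in K(unfolding) for a stability loss of ddg_kcal_per_mol."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(ddg_kcal_per_mol / (GAS_CONSTANT_KCAL * temp_kelvin))


# Stability loss reported for the valine-to-alanine exchange in FKBP12.
fold = unfolding_constant_fold_change(2.75, 30.0)
```

Under these assumptions the unfolding equilibrium constant shifts by roughly two orders of magnitude, which is a far larger shift than the change in the cleaved fraction, so the two readouts should not be expected to scale linearly with each other.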
  • a related aspect of the invention provides a method of detecting a conformational change in a polypeptide resulting from a stimulus comprising: (a) measuring a first fusion protein reporter moiety activity from a fusion protein comprising the structure N Ub -X-C ub -RM, wherein N Ub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, C Ub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin polypeptide of interest and RM is a reporter moiety; and, (b) measuring a second fusion protein reporter moiety activity from a N Ub -X' -C ub -RM, wherein X' is the X polypeptide which has been altered by the stimulus; wherein a change in the level of the second fusion protein RM activity relative to the first fusion protein RM activity indicates that the polypeptide has undergone a conformation change resulting from the stimulus.
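The comparison in steps (a) and (b) can be sketched as a simple decision rule in Python. The method only requires "a change in the level" of RM activity. The relative threshold used below is an assumption added for illustration, as are the function and parameter names.

```python
def conformation_changed(first_activity, second_activity,
                         min_relative_change=0.2):
    """Compare RM activity without (first) and with (second) the stimulus.

    min_relative_change is a hypothetical cutoff; a real assay would
    derive it from replicate measurements and controls.
    """
    if first_activity <= 0:
        raise ValueError("first activity must be positive")
    relative = abs(second_activity - first_activity) / first_activity
    return relative >= min_relative_change
```

A change in either direction counts, since a stimulus may bring the termini of X closer together or move them apart.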
  • the stimulus is an alteration in environmental factor, which can be pH change, temperature change, pressure change, redox-state change or ionic strength change.
  • the stimulus is a post-translational modification of the X protein, which can be phosphorylation, methylation, prenylation, acetylation, palmitoylation, myristoylation, reduction, oxidation, glycosylation, proteolytic cleavage, sulfation, hydroxylation, carboxylation, or the covalent linkage of ubiquitin-like proteins (Ubl) to X such as ubiquitination or sumoylation.
  • the post-translational modification may be brought about through the use of a test compound or an alteration in an environmental factor.
  • phosphorylation through the provision of a protein kinase, or the reduction of a protein through a change in redox state or through the provision of a reducing agent.
  • the stimulus is contacting the X protein with a test compound in trans, which test compound is selected from the group consisting of: a polypeptide, a hormone, a steroid, an ion, a polynucleotide, an oligosaccharide, a lipid, an enzyme substrate, gas molecules such as CO, CO2 or O2, a small molecule, a co-factor, a vitamin, a metal ion, and a nucleotide phosphate.
  • the test compound may be provided by the expression of a nucleotide sequence.
  • test polypeptide for example, may be provided by any of numerous methods known to the skilled artisan including recombinant DNA technology, expression cloning, from a cDNA or genomic library.
  • test compounds may be provided by the expression of activity of endogenous or recombinant genes, such as through the use of recombinant or natural organisms that produce proteins or secondary metabolites suitable for testing.
  • the nonubiquitin polypeptide of interest is selected from the group consisting of: Guk1, Fpr1, Sec62p, beta-amyloid, p53, calmodulin, estrogen receptor alpha (ERα), FKBP, G-protein, VHL, tyrosine kinases, Src, Abl, Epidermal Growth Factor (EGF) receptor, Protein Kinase A (PKA), Protein Kinase C (PKC), Cyclophilins, Cyclin Dependent Kinases (Cdk), Cyclins, a protein of therapeutic, physiological or biological interest or variants / fragments thereof.
  • the N ub domain is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain selected from the group consisting of: N ia, N ig, N vi, N va, N vg, N ai, N aa, N ag, N gi, N ga, and N gg.
  • the reporter moiety (RM) is a selectable marker.
  • the first amino acid of the RM is a non-methionine residue when the RM is released by cleavage of the C ub -RM junction by a ubiquitin-specific protease (UBP).
  • N ub is a mutant amino-terminal ubiquitin domain which has altered affinity for C ub, chosen such that for a given X polypeptide it just inhibits or just allows the reconstitution of a quasi-native ubiquitin and hence cleavage of RM from the fusion protein.
  • X is a non-yeast nonubiquitin polypeptide.
  • At least one step is performed in a host cell expressing a ubiquitin-specific protease.
  • the host cell will express a ubiquitin-specific protease.
  • a cell-free environment may be utilised to practice the methods of the invention. For example, a cell-extract, a transcription-translation mixture or an appropriate set of purified proteins in an appropriate buffer.
  • the cell-free environment contains a ubiquitin-specific protease and/or components of the N-end rule protein degradation system.
  • Another aspect of the invention provides a method for detecting a conformation change of a protein resulting from a stimulus, comprising: (a) measuring a first spectrum of fusion protein reporter moiety activity from a first panel of at least two fusion proteins, each different from the other, comprising the general structure N ub -X-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain selected from at least one of N ia, N ig, N vi, N ia, N va, N ig, N vg, N ai, N aa, N ag, N gi, N ga, and N gg, C ub is a carboxy-terminal ubiquitin domain, X is the protein, and RM is a reporter moiety; (b) measuring a second spectrum of fusion protein reporter moiety activity from a second panel of fusion proteins comprising the general structure N Ub
  • the stimulus is a mutational alteration of the X protein, an alteration in environmental factor, a post-translational modification of the X protein, or contacting the X protein with a test compound in trans.
  • other stimuli may be detected by the selection of an appropriate X protein, or alternatively, a given X protein may change conformation with another stimulus or in fact more than one stimulus.
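The panel-based method compares two spectra of RM activity, one per panel of N ub variants. A minimal Python sketch of that comparison follows. It assumes that each spectrum is stored as a mapping from variant name to measured activity and that only variants present in both panels are compared. The function name and the data layout are hypothetical.

```python
def spectrum_shift(first_spectrum, second_spectrum):
    """Per-variant change in RM activity between two panels.

    Both arguments map an N ub variant name to a measured activity
    (for example, percent cleaved). Variants missing from either
    panel are ignored.
    """
    shared = sorted(set(first_spectrum) & set(second_spectrum))
    return {name: second_spectrum[name] - first_spectrum[name]
            for name in shared}
```

Looking at the shift for every variant, rather than a single one, reflects the idea that the cleavage spectrum as a whole is characteristic of the inserted polypeptide.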
  • the invention further provides a number of methods to identify or screen for conditions that may cause a conformation change of a protein.
  • One aspect of the invention provides a method to identify a compound which can change the conformation of a protein upon contacting the protein, comprising: (a) providing a plurality of test compounds which are not known to be able to cause the conformation change of the protein; (b) testing each compound by measuring a first fusion protein reporter moiety activity from a fusion protein comprising the structure N Ub -X-C Ub -RM, wherein N ub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin protein of interest, and RM is a reporter moiety; and, measuring a second fusion protein reporter moiety activity from a N Ub -X'-C ub -RM, wherein X' is the X protein which has been altered by the test compound
  • the method further comprises formulating the identified compound into a pharmaceutical composition.
  • the plurality of test compounds is a library of compounds which comprises 2 to 10 test compounds, or greater than 10 test compounds. Preferably 10 to 500, 500 to 10,000 or greater than 10,000 test compounds.
  • Another aspect of the invention provides a method to identify a mutation in a protein which leads to the conformation change of the protein, comprising: (a) generating a plurality of candidate mutations of the protein; (b) testing each candidate mutation by measuring a first fusion protein reporter moiety activity from a fusion protein comprising the structure N ub -X-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin protein of interest and RM is a reporter moiety; and, measuring a second fusion protein reporter moiety activity from a N ub -X'-C ub -RM, wherein X' is a mutationally altered form of the X protein harboring the candidate mutation; wherein a change in the level of the second fusion protein RM activity relative to the first fusion protein RM activity indicates that the X protein has undergone
  • Another aspect of the invention provides a method to identify a protein which changes conformation upon contacting a given compound or encountering an alteration in environmental factor, comprising: (a) providing a plurality of test proteins; (b) testing each protein X by measuring a first fusion protein reporter moiety activity from a fusion protein comprising the structure N ub -X-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, X is the nonubiquitin test protein, and RM is a reporter moiety; and, measuring a second fusion protein reporter moiety activity from a N ub -X'-C ub -RM, wherein X' is the X protein which has been altered by contacting the given compound or by the given alteration in environmental factor; wherein a change in the level of the second fusion protein RM activity relative to the first fusion protein RM activity indicates that the
  • test compounds, mutations, or X proteins may be provided as a plurality, wherein the plurality is a library.
  • libraries may be provided by a variety of chemical, biochemical, natural or genetic means (see below). Such libraries may comprise 2 to 10, 10 to 500, 500 to 10,000 or greater than 10,000 members.
  • the invention further provides a useful assay for monitoring the effect of specific amino acid alterations on protein conformation.
  • Using this assay, we learned that the mutation of an intensively studied allele of SEC62 most likely induces a conformational alteration in the N-terminal cytosolic domain of this protein (Figure 7). Since the exchange of the glycine for an aspartate at position 46 increases the proportion of uncleaved N vg -ΔC125-C ub -Dha, the N-terminal domain of Sec62p reacts to the destabilization of its structure as Guk1p does (Figure 5b). We therefore interpret the increase in mean distance between its N- and C-terminus as a reflection of a higher proportion of unfolded molecules.
  • the invention also provides various methods to conduct a pharmaceutical business.
  • One aspect of the invention provides a method to conduct a business, comprising: (a) by a suitable method of the invention, identifying one or more compounds which change the conformation of a polypeptide; (b) conducting therapeutic profiling of said identified compounds, or other derivatives thereof, for using the compounds in therapy for a condition; and, (c) formulating a pharmaceutical preparation including one or more compounds identified in (b) as a product having an acceptable therapeutic profile.
  • the business method further comprises establishing a distribution system for distributing said product for sale.
  • the business method further includes establishing a sales group for marketing the product.
  • Another aspect of the invention provides a method to conduct a business, comprising: (a) by a suitable method of the invention, identifying one or more compounds which change the conformation of a polypeptide; (b) conducting therapeutic profiling of said identified compounds, or other derivatives thereof, for using the compounds in therapy for a condition; and, (c) licensing, to a third party, the rights for further development of compounds and/or formulating a pharmaceutical preparation including one or more compounds identified in (b) to affect conformation change of the polypeptide for treatment of the condition.
  • Another aspect of the invention provides a method to conduct a business, comprising: (a) by one or more suitable methods of the invention, generating information or data, or identifying compounds, proteins or mutations / variants / derivatives thereof; (b) licensing, selling, providing for consideration or access to said information, said data, said identified compounds, proteins or mutations / variants / derivatives thereof
  • kits for detecting or identifying alterations in the conformation of an X protein comprising a panel of at least two vector constructs for expressing fusion proteins of the general structure N ub -X-C ub -RM, wherein each vector construct comprises a coding sequence for N ub, an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain selected from N ia, N ig, N vi, N ia, N va, N ig, N vg, N ai, N aa, N ag, N gi, N ga, or N gg; a coding sequence for C ub, a carboxy-terminal ubiquitin domain; a coding sequence for RM, a reporter moiety fused to the carboxy-terminus of the C ub domain; and at least one cloning site or multicloning site for subcloning the
  • the kit further comprises a host cell for expressing said fusion proteins from said vector constructs. In one embodiment, the kit further comprises instructions for detecting or identifying alterations in protein conformation by using the vector constructs.
  • the kit further comprises a control protein that can be subcloned into the multicloning site so that a user can positively control for a working experimental condition.
  • a kit for measuring or detecting protein conformation change caused by a stimulus comprising: (a) one or more vector constructs for expressing fusion proteins of the general structure N ub -X-C ub -RM, wherein each vector construct comprises a coding sequence for N ub , an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain selected from N ia , N ig , N vi , N ia , N va , N ig , N vg , N ai , N aa , N ag , N gi , N ga , or N gg ; a coding sequence for C ub , a carboxy-terminal ubiquitin domain; a coding sequence for RM, a reporter mo
  • the instruction is not physically associated with the vector constructs of (a).
  • the instruction can be posted on a website, or updated periodically, or accessible as a published document.
  • the stimulus is an alteration in environmental factor, a post-translational modification of the X protein, or contacting the X protein with a test compound in trans.
  • compositions comprising: (a) a fusion protein comprising the structure N ub -X-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the C ub domain and X is a nonubiquitin polypeptide; and, (b) a compound that, when brought into contact with polypeptide X, causes a conformational change in polypeptide X; and/or, (c) a fusion protein comprising the structure N ub -X'-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the C ub domain
  • Another aspect of the invention provides a method of controlling activity of a target gene, comprising: (a) providing a fusion protein comprising the structure N ub -X-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin polypeptide, and RM is a reporter moiety fused to the carboxy-terminus of the C ub domain, wherein the reporter moiety is a gene activating moiety; (b) treating the X polypeptide with a stimulus, thereby causing the cleavage of the RM as a result of a conformational change of the X polypeptide; wherein the released RM controls activity of the target gene.
  • Another aspect of the invention provides a method of controlling activity of a protein, comprising: (a) providing a fusion protein comprising the structure N ub -X-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, X is a nonubiquitin polypeptide, and RM is the protein fused to the carboxy-terminus of the C ub domain, wherein upon cleavage of the C ub -RM junction, the first amino acid of the released RM is an amino acid other than methionine; (b) treating the X polypeptide with a stimulus, thereby causing the cleavage of the RM as a result of a conformational change of the X polypeptide; wherein the released RM is degraded by N-end rule components, thereby controlling activity of the protein.
  • the cleavage of the reporter moiety serves as a "switch" that can be controlled by a stimulus.
  • the stimulus is an alteration in environmental factor, a post-translational modification of the X protein, or contacting the X protein with a test compound in trans.
  • Another aspect of the invention provides a method to detect or measure an alteration of an environmental factor or the presence of a compound in a sample comprising the steps: (a) providing a fusion protein comprising the structure N ub -X-C ub -RM, wherein N ub is an amino-terminal ubiquitin domain or a mutant amino-terminal ubiquitin domain, C ub is a carboxy-terminal ubiquitin domain, RM is a reporter moiety fused to the carboxy-terminus of the C ub domain, and X is a nonubiquitin polypeptide which changes conformation in response to said alteration in environmental factor or presence of said compound; (b) contacting the fusion protein with the environment or the sample containing the compound; and, (c) measuring the degree of cleavage of the reporter moiety (RM) from the fusion protein; wherein a change in the degree of RM activity compared to a standard or control indicates an alteration in said environmental factor or the presence of said compound in the sample.
  • agonist is meant to refer to an agent that mimics or upregulates (e.g. potentiates or supplements) bioactivity of a protein of interest, or an agent that facilitates or promotes (e.g. potentiates or supplements) an interaction among polypeptides or between a polypeptide and another molecule (e.g. a steroid, hormone, nucleic acids, small molecule etc.).
  • An agonist can be a wild-type protein or derivative thereof having at least one bioactivity of the wild-type protein.
  • An agonist can also be a small molecule that upregulates expression of a gene or which increases at least one bioactivity of a protein.
  • An agonist can also be a protein or small molecule which increases the interaction of a polypeptide of interest with another molecule, e.g., a target peptide or nucleic acid.
  • Antagonist as used herein is meant to refer to an agent that downregulates (e.g. suppresses or inhibits) bioactivity of the protein of interest, or an agent that inhibits/suppresses or reduces (e.g. destabilizes or decreases) interaction among polypeptides or other molecules (e.g. steroids, hormones, nucleic acids, etc.).
  • An antagonist can be a compound which inhibits or decreases the interaction between a protein and another molecule, e.g., a target peptide, such as interaction between ubiquitin and its substrate.
  • An antagonist can also be a compound that downregulates expression of a gene of interest or which reduces the amount of the wild type protein present.
  • An antagonist can also be a protein or small molecule which decreases or inhibits the interaction of a polypeptide of interest with another molecule, e.g., a target peptide or nucleic acid.
  • "Alter" as used in "altered by a test composition" or "altered by an environmental alteration" covers a number of situations in its broadest sense. It should be understood to encompass reversible and irreversible changes caused by a test composition. In the irreversible change situation, a test compound may contact an X protein and react with it (for example, transferring at least part of the compound to the X protein), causing an irreversible conformation change of the X protein.
  • Certain suicide substrates of an enzyme may react with the catalytic site of the enzyme and lead to the inactivation and irreversible conformation change of the enzyme.
  • reversible change may be caused by binding of a test compound to the X protein.
  • the test compound may or may not need to remain bound to the X protein for the X protein to assume the changed conformation for at least a certain period of time.
  • allele which is used interchangeably herein with "allelic variant” refers to alternative forms of a gene or portions thereof. Alleles occupy the same locus or position on homologous chromosomes. When a subject has two identical alleles of a gene, the subject is said to be homozygous for that gene or allele.
  • bioactive fragment of a polypeptide refers to a fragment of a full-length polypeptide, wherein the fragment specifically agonizes (mimics) or antagonizes (inhibits) the activity of a wild-type polypeptide.
  • the bioactive fragment preferably is a fragment capable of interacting with at least one other molecule, protein or DNA, with which a full length protein can bind.
  • Bioactivity or “bioactivity” or “activity” or “biological function”, which are used interchangeably, for the purposes herein means a catalytic, effector, antigenic, molecular tagging or molecular interaction function that is directly or indirectly performed by the polypeptides of this invention (whether in its native or denatured conformation), or by any subsequence thereof.
  • Activity as used in "reporter moiety activity” means a number of things in its broadest sense. It generally means a detectable event. For example, it means an enzymatic activity if the RM is an enzyme; it means fluorescent signal if the RM is a fluorescent protein; it means a cleavage between the C ub and RM if the RM is to be detected by Western blot using an antibody specific for the RM or an epitope attached to the RM (FLAG tag or HA tag, etc.); it means a cleaved (rather than a C ub fusion tethered outside the nucleus) thus activated transcription factor for initiating downstream reporter gene transcription in the nucleus if the RM is a transcription factor, etc.
  • Reporter moiety cleavage means cleavage by ubiquitin-specific proteases (UBPs) of the C ub -RM juncture. There could be a number of consequences resulting from this cleavage. It generally creates a detectable event.
  • cleavage of the RM can result in a change of enzymatic activity if the RM is an enzyme, since the cleaved RM may not be stable in the presence of the N-end rule components (or it may be stable and activated upon cleavage, due to the removal of the inhibitory N-terminal domain); it can result in a change in fluorescent signal if the RM is a fluorescent protein (for example, fluorescent RM may be degraded by N-end rule components); it can result in a detectable size change of the RM on Western blot using an antibody specific for the RM or an epitope attached to the RM (FLAG tag or HA tag, etc.); it can release a cleaved (rather than a C ub fusion tethered outside the nucleus) thus activated transcription factor for initiating downstream reporter gene transcription in the nucleus if the RM is a transcription factor, etc.
  • "Cells", "host cells" or "recombinant host cells" are terms used interchangeably herein. It is understood that such terms refer not only to a particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.
  • the term “cell death” or “necrosis”, is a phenomenon when cells die as a result of being killed by a toxic material, or other extrinsically imposed loss of function of a particular essential gene function.
  • Characterize means a detailed study of a polypeptide or a nucleic acid (polynucleotide) encoding a polypeptide to reveal relevant chemical and biological information.
  • This information generally includes one or more, but is not limited to, the following: sequence information for protein and nucleic acid, secondary, tertiary, and quaternary structure information, molecular weight, enzymatic or other activity, isoelectric focusing point, binding affinity to other molecules, binding partners, stability, expression pattern, tissue distribution, subcellular localization, expression regulation, developmental roles, phenotypes of transgenic animals overexpressing or devoid of the polypeptide or nucleic acid, size of nucleic acid, and hybridization property of nucleic acid.
  • a "chimeric polypeptide” or “fusion polypeptide” is a fusion of a first amino acid sequence encoding a first polypeptide with a second amino acid sequence defining a domain (e.g. polypeptide portion) foreign to and not substantially homologous with any domain of the first polypeptide.
  • Such second amino acid sequence may present a domain which is found (albeit in a different polypeptide) in an organism which also expresses the first polypeptide, or it may be an "interspecies", “intergenic”, etc. fusion of polypeptide structures expressed by different kinds of organisms. At least one of the first and the second polypeptides may also be partially or completely synthetic or random, i.e. not previously identified in any organism.
  • To clone as used herein, as will be apparent to the skilled artisan, may mean obtaining exact copies of a given polynucleotide molecule using recombinant DNA technology. Details of molecular cloning can be found in a number of commonly used laboratory protocol books such as Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press: 1989). "To clone" as used herein, as will be apparent to the skilled artisan, may also mean obtaining an identical or nearly identical population of cells possessing a common given property, such as the presence or absence of a fluorescent marker, or a positive or negative selectable marker.
  • the population of identical or nearly identical cells obtained by cloning is also called a "clone.”
  • Cell cloning methods are well known in the art as described in many commonly available laboratory manuals (see Current Protocols in Cell Biology, CD-ROM Edition, ed. by Juan S. Bonifacino, Jennifer Lippincott-Schwartz, Joe B. Harford, and Kenneth M. Yamada, John Wiley & Sons, 1999).
  • Complementation screen means genetic screening for genes or source DNA that can confer a certain specified phenotype which will not exist without the presence of said genes or source DNA. It is usually done in vivo, by introducing into cells lacking a certain phenotype a library of source DNA to be screened, and identifying cells that have obtained a source DNA and now exhibit the specified phenotype. Alternatively, it could be done in vivo by randomly inactivating genes in the genome of the cell lacking a certain phenotype and identifying cells that have lost the function of certain genes and exhibit the specified phenotype. However, a complementation screen can also be done in vitro in cell-free systems, either by testing each candidate individually or as pools of individuals.
  • a cell under conditions wherein a cell is selectable. That can mean selecting from a population of cells, a subpopulation or a single cell possessing a common given property such as the presence or absence of fluorescent markers, or the presence or absence of positive or negative selectable markers, and obtaining a clone of each selected cell.
  • the cells can be selected under conditions that will completely or nearly completely eliminate any cell that does not have the desired property of the cells to be selected. For example, by growing cells in selective media, only cells possessing a certain desired property will survive. The surviving cells can be cloned using standard cell and molecular biology protocols (see Current Protocols in Cell Biology, CD-ROM Edition, ed. by Juan S.
  • cells possessing a desired property can be selected from a population based on the observation of a certain discernible phenotype, such as the presence or absence of fluorescent markers. The selected cells can then be cloned using standard cell and molecular biology protocols (see Current Protocols in Cell Biology, CD-ROM Edition, ed. by Juan S. Bonifacino, Jennifer Lippincott-Schwartz, Joe B. Harford, and Kenneth M. Yamada, John Wiley & Sons, 1999).
  • “Compound” in its broadest sense shall include (but is not limited to) chemical compounds (organic or inorganic), macromolecules such as polynucleotides, polypeptides, polysaccharides, lipids, or derivatives thereof.
  • a “delivery complex” shall mean a targeting means (e.g. a molecule that results in higher affinity binding of a gene, protein, polypeptide or peptide to a target cell surface and/or increased cellular or nuclear uptake by a target cell).
  • targeting means include: sterols (e.g. cholesterol), lipids (e.g. a cationic lipid, virosome or liposome), viruses (e.g. adenovirus, adeno-associated virus, and retrovirus) or target cell specific binding agents (e.g. ligands recognized by target cell specific receptors).
  • Preferred complexes are sufficiently stable in vivo to prevent significant uncoupling prior to internalization by the target cell. However, the complex is cleavable under appropriate conditions within the cell so that the gene, protein, polypeptide or peptide is released in a functional form.
  • epitope and epitope tag are meant to refer to any of various convenient molecular markers known in the art, such as hemagglutinin or FLAG, so that the level of a polypeptide can be confirmed in a Western blot using, for example, a suitable anti-flu or anti-FLAG antibody.
  • Equivalent nucleotide sequences will include sequences that differ by one or more nucleotide substitutions, additions or deletions, such as allelic variants; and will, therefore, include sequences that differ from the nucleotide sequence of a particular gene, due to the degeneracy of the genetic code.
  • Equivalent polypeptides will include polypeptides that differ by one or more amino acid substitutions, additions or deletions, which amino acid substitutions, additions or deletions leave the function and/or activity of the polypeptide substantially unaltered.
  • a polypeptide equivalent to a given polypeptide could e.g. be the polypeptide that performs the same function in another species.
  • murine ubiquitin herein is considered an equivalent of human ubiquitin.
  • G-protein as used herein means heterotrimeric G proteins, including the α, β and γ subunits, as well as the Ras superfamily of small G-proteins, and fragments thereof.
  • the Ras superfamily of small G-proteins shall include, but is not limited to, the Ras, Ran, Rho, Arf, and Rab subfamilies of small G-proteins.
  • G-protein includes proteins that are structurally and/or functionally similar to the higher eukaryotic G-proteins, but are found in other species, such as Ste4p and Ste18p of yeast, etc.
  • the terms “gene”, “recombinant gene” and “gene construct” refer to a nucleic acid comprising an open reading frame encoding a polypeptide, including both exon and (optionally) intron sequences.
  • the term “intron” refers to a DNA sequence present in a given gene which is not translated into protein and is generally found between exons.
  • a “recombinant gene” refers to nucleic acid encoding a polypeptide and comprising polypeptide-encoding exon sequences, though it may optionally include intron sequences which are derived from, for example, a chromosomal gene or from an unrelated chromosomal gene.
  • Homology or “identity” or “similarity” refers to sequence similarity between two peptides or between two nucleic acid molecules, with identity being a more strict comparison. Homology and identity can each be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When a position in the compared sequence is occupied by the same base or amino acid, then the molecules are identical at that position.
  • a degree of homology or similarity or identity between nucleic acid sequences is a function of the number of identical or matching nucleotides at positions shared by the nucleic acid sequences.
  • a degree of identity of amino acid sequences is a function of the number of identical amino acids at positions shared by the amino acid sequences.
  • a degree of homology or similarity of amino acid sequences is a function of the number of similar amino acids, i.e. structurally related amino acids, at positions shared by the amino acid sequences.
  • An "unrelated" or “non-homologous” sequence shares less than 40 % identity, though preferably less than 25 % identity with another sequence.
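The distinction drawn above between identity (same residue at a shared position) and similarity (same or structurally related residue) can be illustrated with a minimal sketch. It assumes two already-aligned, gap-free peptide sequences of equal length. The residue groupings in `SIMILAR_GROUPS`, the function names, and the single-substitution example are illustrative assumptions and are not taken from the specification; only the 40 % threshold for "unrelated" sequences comes from the text.

```python
# Hypothetical grouping of structurally related residues (an assumption for
# illustration only; the specification does not define such groups).
SIMILAR_GROUPS = ["AVLIM", "FYW", "KRH", "DE", "STNQ", "CGP"]


def _same_group(x: str, y: str) -> bool:
    """Return True if both residues fall in one of the assumed groups."""
    return any(x in group and y in group for group in SIMILAR_GROUPS)


def percent_identity(a: str, b: str) -> float:
    """Percentage of shared positions occupied by the same residue."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def percent_similarity(a: str, b: str) -> float:
    """Percentage of shared positions with identical or related residues."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    hits = sum(1 for x, y in zip(a, b) if x == y or _same_group(x, y))
    return 100.0 * hits / len(a)


def is_unrelated(identity_percent: float, threshold: float = 40.0) -> bool:
    """Apply the 'less than 40 % identity' criterion stated in the text."""
    return identity_percent < threshold


# Example: a ten-residue peptide and a variant with Ile -> Val at position 3.
wild_type = "MQIFVKTLTG"
variant = "MQVFVKTLTG"
print(percent_identity(wild_type, variant))    # identical at 9 of 10 positions
print(percent_similarity(wild_type, variant))  # Ile and Val share a group here
```

A conservative substitution lowers identity but not similarity, which is why the text calls identity the stricter comparison.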
  • interact as used herein is meant to include detectable interactions (e.g. biochemical interactions) between molecules, such as interaction between protein-protein, protein-nucleic acid, nucleic acid-nucleic acid, and protein-small molecule or nucleic acid-small molecule in nature.
  • isolated as used herein with respect to nucleic acids, such as DNA or RNA, refers to molecules separated from other DNAs, or RNAs, respectively, that are present in the natural source of the macromolecule.
  • an isolated nucleic acid encoding one of the subject polypeptides preferably includes no more than 10 kilobases (kb) of nucleic acid sequence which naturally immediately flanks the gene in genomic DNA, more preferably no more than 5 kb of such naturally occurring flanking sequences, and most preferably less than 1.5 kb of such naturally occurring flanking sequence.
  • isolated as used herein also refers to a nucleic acid or peptide that is substantially free of cellular material, viral material, or culture medium when produced by recombinant DNA techniques, or chemical precursors or other chemicals when chemically synthesized.
  • isolated nucleic acid is meant to include nucleic acid fragments which are not naturally occurring as fragments and would not be found in the natural state.
  • isolated is also used herein to refer to polypeptides which are isolated from other cellular proteins and is meant to encompass both purified and recombinant polypeptides.
  • Kit as used herein means a collection of at least two components constituting the kit. Together, the components constitute a functional unit for a given purpose. Individual member components may be physically packaged together or separately. For example, a kit comprising an instruction for using the kit may or may not physically include the instruction with other individual member components. Instead, the instruction can be supplied as a separate member component, either in a paper form or an electronic form which may be supplied on computer readable memory device or downloaded from an internet website, or as recorded presentation. The individual components of the kit may or may not be from the same supplier, or manufacturer. A component can either be purchased as a part of the kit, or generated by user "in-house” according to the instruction of the kit.
  • "Instruction(s)" as used herein means documents describing relevant materials or methodologies pertaining to a kit. These materials may include any combination of the following: background information, list of components and their availability information (purchase information, etc.), brief or detailed protocols for using the kit, trouble-shooting, references, technical support, and any other related documents. Instructions can be supplied with the kit or as a separate member component, either as a paper form or an electronic form which may be supplied on computer readable memory device or downloaded from an internet website, or as recorded presentation. Instructions can contain one or multiple documents or future updates. Instruction also includes patents, published patent applications, published scientific literature or references, etc.
  • Identify or “identification” as used herein means selecting, screening, or finding at least one candidate possessing a desired property, from a pool of more than 2 candidates.
  • the desired property can be, for example, a mutation or a compound that can cause a conformational change in a given protein, a condition or stimulus that can cause a conformational change in a given protein, etc.
  • identification may also include further characterization (see above) of the identified candidate.
  • Library as used herein generally means a multiplicity of member components constituting the library which member components individually differ with respect to at least one property, for example, a chemical compound library.
  • library means a plurality of nucleic acids / polynucleotides, preferably in the form of vectors comprising functional elements (promoter, transcription factor binding sites, enhancer, etc.) necessary for expression of polypeptides, either in vitro or in vivo, which are functionally linked to coding sequences for polypeptides.
  • the vector can be a plasmid or a viral-based vector suitable for expression in prokaryotes or eukaryotes or both, preferably for expression in mammalian cells.
  • the cloning sites can be restriction endonuclease recognition sequences, or other recombination based recognition sequences such as loxP sequences for Cre recombinase, or the Gateway system (Life Technologies, Inc.) as described in U.S. Pat. No. 5,888,732, the contents of which are incorporated by reference herein.
  • Coding sequences for polypeptides can be cDNA, genomic DNA fragments, or random/semi-random polynucleotides. The methods for cDNA or genomic DNA library construction are well-known in the art, which can be found in a number of commonly used laboratory molecular biology manuals (see below).
  • modulation refers to both upregulation (i.e., activation or stimulation, e.g., by agonizing or potentiating) and downregulation (i.e. inhibition or suppression e.g., by antagonizing, decreasing or inhibiting) of an activity.
  • mutation or “mutated” as it refers to a gene or nucleic acid means an allelic or modified form of a gene or nucleic acid, which exhibits a different nucleotide sequence and/or an altered physical or chemical property as compared to the wild-type gene or nucleic acid. Generally, the mutation could alter the regulatory sequence of a gene without affecting the polypeptide sequence encoded by the wild-type gene.
  • a mutated gene or nucleic acid will either completely lose the ability to encode a polypeptide (null mutation) or encode a polypeptide with an altered property, including a polypeptide with reduced or enhanced biological activity, a polypeptide with novel biological activity, or a polypeptide that interferes with the function of the corresponding wild-type polypeptide.
  • a mutation may take advantage of the degeneracy of the genetic code, by replacing a triplet codon by a different triplet codon that nevertheless encodes the same amino acid as the wild-type triplet codon. Such replacement may, for example, lead to increased stability of the gene or nucleic acid under certain conditions.
  • a mutation may comprise a nucleotide change in a single position of the gene or nucleic acid, or in several positions, or deletions or additions of nucleotides in one or several positions.
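As an illustration of the kinds of mutation described above, the following sketch classifies a single-codon change as synonymous (degeneracy of the genetic code), missense, or nonsense. It assumes the standard genetic code and DNA codons; the function name `classify_codon_change` and the example codons are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (the third base varies fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(wild_type: str, mutant: str) -> str:
    """Classify the effect of replacing one DNA codon with another."""
    wt_aa = CODON_TABLE[wild_type.upper()]
    mut_aa = CODON_TABLE[mutant.upper()]
    if wt_aa == mut_aa:
        return "synonymous"   # different triplet, same amino acid
    if mut_aa == "*":
        return "nonsense"     # introduces a stop codon
    return "missense"         # encodes a different amino acid


# Illustrative codons only: ATT and ATC both encode isoleucine,
# whereas GCT encodes alanine.
print(classify_codon_change("ATT", "ATC"))  # synonymous
print(classify_codon_change("ATT", "GCT"))  # missense
```

A missense change of this kind (isoleucine replaced by another residue) is the type of point mutation the text goes on to describe for the amino-terminal ubiquitin domain.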
  • reduced-associating mutant as used herein means a mutant polypeptide that exhibits reduced affinity for its normal binding partner.
  • a reduced-associating mutant of the ubiquitin N-terminus is a polypeptide that exhibits reduced affinity for its normal binding partner, the C-terminal half of ubiquitin (C ub ), to the point that it will show reduced association or not associate with a wild-type C ub and form a "quasi-wild-type ubiquitin" without the supplemented binding affinity between two polypeptides fused to N ub and C ub , respectively.
  • such mutations in N ux are certain missense mutations introduced to either the 3rd or the 13th amino acid residue of the wild-type ubiquitin.
  • missense mutations at these positions may differentially affect the affinity/association between N ux and C Ub , thereby providing different sensitivity of the assay as disclosed by the instant invention.
  • These missense point mutations can be routinely introduced into cloned genes using standard molecular biology protocols, such as site-directed mutagenesis using PCR.
  • the "non-human animals" of the invention include mammals such as rodents, non-human primates, sheep, dogs and cows, as well as chickens, amphibians, reptiles, etc.
  • Preferred non-human animals are selected from the rodent family including rat and mouse, most preferably mouse, though transgenic amphibians, such as members of the Xenopus genus, and transgenic chickens can also provide important tools for understanding and identifying agents which can affect, for example, embryogenesis and tissue formation.
  • the term "chimeric animal” is used herein to refer to animals in which the recombinant gene is found, or in which the recombinant gene is expressed in some but not all cells of the animal.
  • tissue-specific chimeric animal indicates that one of the recombinant genes is present and/or expressed or disrupted in some tissues but not others.
  • nucleic acid in its broadest sense, refers to polynucleotides such as deoxyribonucleic acid (DNA), and, where appropriate, ribonucleic acid (RNA).
  • the term should also be understood to include, as equivalents, analogs of either RNA or DNA made from nucleotide analogs, and, as applicable to the embodiment being described, single (sense or antisense) and double-stranded polynucleotides.
  • nucleic acid(s) may refer to polynucleotides that contain information required for transcription and/or translation of polypeptides encoded by the polynucleotides.
  • plasmids comprising transcription signals (e.g. transcription factor binding sites, promoters and/or enhancers) functionally linked to downstream coding sequences for polypeptides
  • genomic DNA fragments comprising transcription signals (e.g. transcription factor binding sites, promoters and/or enhancers) functionally linked to downstream coding sequences for polypeptides
  • cDNA fragments linear or circular
  • RNA molecules comprising functional elements for translation either in vitro or in vivo or both, which are functionally linked to sequences encoding polypeptides.
  • polynucleotides should also be understood to include, as equivalents, analogs of either RNA or DNA made from nucleotide analogs, and, as applicable to the embodiment being described, single (sense or antisense) and double-stranded polynucleotides.
  • These polynucleotides can be in an isolated form, e.g. an isolated vector, or included into the episome or the genome of a cell.
  • the term "nucleotide sequence complementary to the nucleotide sequence set forth in SEQ ID NO. x” refers to the nucleotide sequence of the complementary strand of a nucleic acid strand having SEQ ID NO. x.
  • complementary strand is used herein interchangeably with the term "complement”.
  • the complement of a nucleic acid strand can be the complement of a coding strand or the complement of a non-coding strand.
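The notion of a complementary strand can be made concrete with a short sketch that derives the complement of a DNA strand, written 5' to 3'. This is a minimal illustration assuming an unambiguous A/C/G/T alphabet; the function name `reverse_complement` and the example sequence are hypothetical.

```python
# Watson-Crick pairing for DNA: A pairs with T, C pairs with G.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    # Complement every base, then reverse so the result is also 5' -> 3'.
    return strand.translate(_COMPLEMENT)[::-1]


coding = "ATGCAG"
print(reverse_complement(coding))                      # CTGCAT
print(reverse_complement(reverse_complement(coding)))  # back to ATGCAG
```

Applying the function twice returns the original strand, which reflects the statement that either a coding or a non-coding strand can serve as the starting point.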
  • Nucleotide phosphate refers to any one or more of the following and derivatives or variants: AMP, ADP, ATP, TMP, TDP, TTP, CMP, CDP, CTP, GMP, GDP, GTP, cAMP, cTMP, cGMP, cCMP, dAMP, dADP, dATP, dTMP, dTDP, dTTP, dCMP, dCDP, dCTP, dGMP, dGDP, dGTP, ddAMP, ddADP, ddATP, ddTMP, ddTDP, ddTTP, ddCMP, ddCDP, ddCTP, ddGMP, ddGDP, ddGTP.
  • either 3'- or 2'- can be -OH.
  • Modifications (including replacement of O, N, or P atoms by S or others) on sugar rings, bases, and/or phosphate groups are all considered derivatives or variants.
  • genes or a particular polypeptide may exist in single or multiple copies within the genome of an individual. Such duplicate genes may be identical or may have certain modifications, including nucleotide substitutions, additions or deletions, which all still code for polypeptides having substantially the same activity.
  • certain differences in nucleotide sequences may exist between individual organisms, which are called alleles. Such allelic differences may or may not result in differences in amino acid sequence of the encoded polypeptide yet still encode a polypeptide with the same biological activity.
  • percent identical refers to sequence identity between two amino acid sequences or between two nucleotide sequences. Identity can each be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When an equivalent position in the compared sequences is occupied by the same base or amino acid, then the molecules are identical at that position; when the equivalent site is occupied by the same or a similar amino acid residue (e.g., similar in steric and/or electronic nature), then the molecules can be referred to as homologous (similar) at that position.
  • Expression as a percentage of homology, similarity, or identity refers to a function of the number of identical or similar amino acids at positions shared by the compared sequences.
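As a rough sketch of how such percentages can be computed from two sequences that have already been aligned: the function name and the residue similarity groups below are assumptions chosen for illustration, not definitions from the specification. Gapped positions are excluded here, so the percentage is taken over the positions shared by both sequences; alignment programs may use other denominators.

```python
# Hypothetical similarity groups (residues of similar steric/electronic nature),
# chosen only for illustration; real tools use substitution matrices.
SIMILAR_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def _similar(a: str, b: str) -> bool:
    """True when two residues are identical or fall in the same group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def percent_identity_and_similarity(aligned_a: str, aligned_b: str):
    """Return (percent identity, percent similarity) over shared positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only positions occupied in both sequences ('-' marks a gap).
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0, 0.0
    identical = sum(a == b for a, b in pairs)
    similar = sum(_similar(a, b) for a, b in pairs)
    return 100 * identical / len(pairs), 100 * similar / len(pairs)


print(percent_identity_and_similarity("MKVLA", "MRVIA"))  # (60.0, 100.0)
```

In the example, three of five positions are identical and the remaining two (K/R and L/I) fall in the same illustrative group, giving 60% identity and 100% similarity.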
  • various alignment programs may be used, including FASTA, BLAST, or ENTREZ. ENTREZ is available through the National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Md.
  • the percent identity of two sequences can be determined by the GCG program with a gap weight of 1, e.g., each amino acid gap is weighted as if it were a single amino acid or nucleotide mismatch between the two sequences.
  • MPSRCH uses a Smith-Waterman algorithm to score sequences on a massively parallel computer. This approach improves the ability to pick up distantly related matches, and is especially tolerant of small gaps and nucleotide sequence errors. Nucleic acid-encoded amino acid sequences can be used to search both protein and DNA databases. Databases with individual sequences are described in Methods in
  • Preferred nucleic acids have a sequence at least 70%, more preferably 80%, more preferably 90%, and even more preferably at least 95% identical to a nucleic acid sequence encoding any one of the polypeptides of the instant application.
  • the nucleic acid is mammalian.
  • several alignment tools are available. Examples include PileUp, which creates a multiple sequence alignment, and is described in Feng et al., J. Mol. Evol. (1987) 25: 351-360.
  • Another method, GAP, uses the alignment method of Needleman et al., J. Mol. Biol. (1970) 48: 443-453. GAP is best suited for global alignment of sequences.
  • “Pharmaceutical compositions” of the present invention comprise any one or more of the described compounds, or compositions of the present invention, or a pharmaceutically acceptable salt thereof, together with a pharmaceutically acceptable carrier in accordance with the properties and expected performance of such carriers which are well-known in the pertinent art.
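The global alignment method of Needleman et al. can be sketched as a small dynamic program. This is not the GAP program itself: the scoring values (match +1, mismatch -1, gap -1) are assumptions, chosen so that a gap costs the same as a single mismatch, and the function name is hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT")[0])  # 2
```

Because every position of both sequences must be placed in the alignment, this approach suits end-to-end (global) comparison, in contrast to local methods such as Smith-Waterman.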
  • promoter means a DNA sequence that regulates expression of a selected DNA sequence operably linked to the promoter, and which effects expression of the selected DNA sequence in cells.
  • tissue specific promoters, i.e. promoters which effect expression of the selected DNA sequence only in specific cells (e.g. cells of a specific tissue).
  • so-called “leaky” promoters, which regulate expression of a selected DNA primarily in one tissue, but cause expression in other tissues as well.
  • the term also encompasses non-tissue specific promoters and promoters that constitutively express or that are inducible (i.e. expression levels can be controlled).
  • the terms protein, polypeptide and peptide are used interchangeably herein.
  • a "protein of therapeutic, physiological or biological interest” shall mean a polypeptide for which exists at least one publicly available document, for example, without limitation, a published patent document or an article in the scientific literature, in which document a causal relationship is shown or proposed between said polypeptide and a state of a biological system, or a particular change of state of a biological system, which state or change of state may be desirable or undesirable.
  • An example of a polypeptide with a proposed causal relationship to an undesirable state of a biological system is, without limitation, the ScPrP polypeptide, the proposed cause of Bovine Spongiform Encephalopathy (BSE).
  • BSE Bovine Spongiform Encephalopathy
  • An example of a polypeptide with a proposed causal relationship to a desirable state of a biological system is, without limitation, the unmutated form of CFTR (Cystic Fibrosis Transmembrane Conductance Regulator).
  • recombinant protein refers to a polypeptide which is produced by recombinant DNA techniques, wherein generally, DNA encoding a polypeptide is inserted into a suitable expression vector which is in turn used to transform a host cell to produce the polypeptide encoded by said DNA.
  • This polypeptide may be one that is naturally expressed by the host cell, or it may be heterologous to the host cell, or the host cell may have been engineered to have lost the capability to express the polypeptide which is otherwise expressed in wild type forms of the host cell.
  • the polypeptide may also be a fusion polypeptide.
  • phrase "derived from”, with respect to a recombinant gene is meant to include within the meaning of "recombinant protein” those proteins having an amino acid sequence of a native polypeptide, or an amino acid sequence similar thereto which is generated by mutations, including substitutions, deletions and truncation, of a naturally occurring form of the polypeptide.
  • Small molecule as used herein, is meant to refer to a composition, which has a molecular weight of less than about 5 kD and most preferably less than about 4 kD.
  • Small molecules can be nucleic acids, peptides, polypeptides, peptidomimetics, carbohydrates, lipids or other organic (carbon containing) or inorganic molecules.
  • Many pharmaceutical companies have extensive libraries of chemical and/or biological mixtures, often fungal, bacterial, or algal extracts, which can be screened with any of the assays of the invention to identify compounds that modulate a bioactivity.
  • Reporter moiety is a reporter for the cleavage of the peptide bond between
  • the UBP-cleaved RM is unstable because its first amino acid is non-Met, and the RM will be degraded in the presence of a functional N-end rule system.
  • RM may be stable if the first amino acid is Met or any other stabilizing amino acid residue.
  • the detection of the RM activity can be through a variety of means. It can be detected by Western blot using an antibody against the RM or an attached epitope to reveal the cleaved RM and the uncleaved RM. It can be detected by degree of enzymatic activity of the RM if the RM is an enzyme with a non-Met first amino acid.
  • the reporter moiety may be chosen from the list of URA3, HIS3, LYS2, HygTk, Tkneo, TkBSD, PACTk, HygCoda, Codaneo, CodaBSD, PACCoda, Tk, codA, and GPT2.
  • the reporter moiety may also be TRP1, CYH2, CAN1, HPRT, beta-galactosidase or a luciferase.
  • the reporter moiety may also be a fluorescent marker, e.g. GFP, YFP, BFP or RFP, a transcription factor, e.g. hTBP1 (human TATA binding protein 1), or DHFR.
  • hTBP1 human TATA binding protein 1
  • Transcription is a generic term used throughout the specification to refer to a process of synthesizing RNA molecules according to their corresponding DNA template sequences, which may include initiation signals, enhancers, and promoters that induce or control transcription of protein coding sequences with which they are operably linked.
  • Transcriptional repressor refers to any of various polypeptides of prokaryotic or eukaryotic origin, or which are synthetic artificial chimeric constructs, capable of repression either alone or in conjunction with other polypeptides and which repress transcription in either an active or a passive manner.
  • transcription of a recombinant gene can be under the control of transcriptional regulatory sequences which are the same or which are different from those sequences which control transcription of the naturally-occurring forms of the recombinant gene, or its components.
  • Translation is a generic term used to describe the synthesis of protein or polypeptide on a template, such as messenger RNA (mRNA). It is the making of a protein polypeptide sequence by translating the genetic code of an mRNA molecule associated with a ribosome. The whole process can be performed in vivo inside a cell using protein translation machinery of the cell, or be performed in vitro using cell-free systems, such as reticulocyte lysates or any other equivalents.
  • the RNA template for translation may be separately provided either directly as RNA or indirectly as the product of transcription from a provided DNA template, such as a plasmid.
  • Translationally providing means providing a polypeptide/protein by way of translation.
  • translation is a process that can be done in vivo inside a cell using protein translation machinery of the cell, or be performed in vitro using cell-free systems, such as reticulocyte lysates or any other equivalents.
  • the RNA template for translation may be separately provided either directly as RNA or indirectly as the product of transcription from a provided DNA template, such as a plasmid.
  • the template DNA can be introduced into a host/target cell by a variety of standard molecular biology procedures, such as transformation, transfection, mating or cell fusion, or can be provided to an in vitro translation reaction directly.
  • the term “transfection” means the introduction of a nucleic acid, e.g., via an expression vector, into a recipient cell by nucleic acid-mediated gene transfer.
  • "Transformation" refers to a process in which a cell's genotype is changed as a result of the cellular uptake of exogenous DNA or RNA, and, for example, the transformed cell expresses a recombinant form of a polypeptide or, in the case of anti-sense expression from the transferred gene, the expression of a naturally-occurring form of the polypeptide is disrupted.
  • transgene means a nucleic acid sequence (encoding, e.g., one of the polypeptides, or an antisense transcript thereto) which has been introduced into a cell.
  • a transgene could be partly or entirely heterologous, i.e., foreign, to the transgenic animal or cell into which it is introduced, or, homologous to an endogenous gene of the transgenic animal or cell into which it is introduced, but which is designed to be inserted, or is inserted, into the animal's genome in such a way as to alter the genome of the cell into which it is inserted (e.g., it is inserted at a location which differs from that of the natural gene or its insertion results in a knockout).
  • a transgene can also be present in a cell in the form of an episome.
  • a transgene can include one or more transcriptional regulatory sequences and any other nucleic acid, such as introns, that may be necessary for optimal expression of a selected nucleic acid.
  • a "transgenic animal” refers to any animal, preferably a non-human mammal, bird or an amphibian, in which one or more of the cells of the animal contain heterologous nucleic acid introduced by way of human intervention, such as by transgenic techniques well known in the art.
  • the nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with a recombinant virus.
  • the term genetic manipulation does not include classical crossbreeding, or in vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule. This molecule may be integrated within a chromosome, or it may be extrachromosomally replicating DNA.
  • In the typical transgenic animals described herein, the transgene causes cells to express a recombinant form of one of the polypeptides, e.g. either agonistic or antagonistic forms.
  • transgenic animals in which the recombinant gene is silent are also contemplated, as for example, the FLP or CRE recombinase dependent constructs described below.
  • transgenic animal also includes those recombinant animals in which gene disruption of one or more genes is caused by human intervention, including both recombination and antisense techniques.
  • treating is intended to encompass curing as well as ameliorating at least one symptom of the condition or disease.
  • the term also means, in the context of "treating ... with a stimulus,” contacting, affecting, effecting, causing to happen, exposing to (an environmental alteration), etc.
  • vector refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked.
  • One type of preferred vector is an episome, i.e., a nucleic acid capable of extra-chromosomal replication.
  • Preferred vectors are those capable of autonomous replication and/or expression of nucleic acids to which they are linked.
  • Vectors capable of directing the expression of genes to which they are operatively linked are referred to herein as "expression vectors”.
  • expression vectors of utility in recombinant DNA techniques are often in the form of "plasmids" which refer generally to circular double stranded DNA loops which, in their vector form are not bound to the chromosome.
  • plasmid and "vector” are used interchangeably as the plasmid is the most commonly used form of vector.
  • vector is intended to include such other forms of expression vectors which serve equivalent functions and which become known in the art subsequently hereto.
  • wild-type allele refers to an allele of a gene which, when present in two copies in a subject results in a wild-type phenotype. There can be several different wild-type alleles of a specific gene, since certain nucleotide changes in a gene may not affect the phenotype of a subject having two copies of the gene with the nucleotide changes.
  • ubiquitin refers to an abundant 76 amino acid residue polypeptide that is found in all eukaryotic cells.
  • the ubiquitin polypeptide is characterized by a carboxy-terminal glycine residue that is activated by ATP to a high-energy thiol-ester intermediate in a reaction catalyzed by a ubiquitin-activating enzyme (E1).
  • E1 ubiquitin-activating enzyme
  • the activated ubiquitin is transferred to a substrate polypeptide via an isopeptide bond between the activated carboxy-terminus of ubiquitin and the epsilon-amino group of a lysine residue(s) in the protein substrate.
  • this transfer requires the action of ubiquitin conjugating enzymes such as E2 and, in some instances, E3 activities.
  • the ubiquitin modified substrate is thereby altered in biological function, and, in some instances, becomes a substrate for components of the ubiquitin-dependent proteolytic machinery which includes both UBP enzymes as well as proteolytic proteins which are subunits of the proteasome.
  • the term "ubiquitin” includes within its scope all known as well as unidentified eukaryotic ubiquitin homologs of vertebrate or invertebrate origin which can be classified as equivalents of human ubiquitin.
  • examples of ubiquitin polypeptides include the human ubiquitin polypeptide which is encoded by the human ubiquitin encoding nucleic acid sequence (GenBank Accession Numbers: U49869, X04803).
  • Equivalent ubiquitin polypeptide encoding nucleotide sequences are understood to include those sequences that differ by one or more nucleotide substitutions, additions or deletions, such as allelic variants; as well as sequences which differ from the nucleotide sequence encoding the human ubiquitin coding sequence due to the degeneracy of the genetic code.
  • another example of a ubiquitin polypeptide as referred to herein is murine ubiquitin which is encoded by the murine ubiquitin encoding nucleic acid sequence (GenBank Accession Number: X51730). It will be readily apparent to the person skilled in the art how to adapt the methods and reagents provided by the present invention to the use of ubiquitin polypeptides other than human ubiquitin.
  • ubiquitin-like protein refers to a group of naturally occurring proteins, not otherwise describable as ubiquitin equivalents, but which nonetheless show strong amino acid homology to human ubiquitin. As used herein this term includes the polypeptides NEDD8, UBL1, NPVAC, and NPVOC. These "ubiquitin-like proteins" are over 40% identical in sequence to the human ubiquitin polypeptide and contain a pair of carboxy-terminal glycine residues which function in the activation and transfer of ubiquitin to target substrates as described supra.
  • ubiquitin-related protein refers to a group of naturally occurring proteins, not otherwise describable as ubiquitin equivalents, but which nonetheless show some relatively low degree (<40% identity) of amino acid homology to human ubiquitin.
  • ubiquitin-related proteins include human Ubiquitin Cross-Reactive Protein (UCRP, 36% identical to huUb, Accession No. P05161), FUBI (36% identical to huUb, GenBank Accession No. AA449261), and Sentrin/Sumo/Pic1 (20% identical to huUb, GenBank Accession No. U83117).
  • ubiquitin-related protein as used herein further pertains to polypeptides possessing a carboxy-terminal pair of glycine residues and which function as protein tags through activation of the carboxy-terminal glycine residue and subsequent transfer to a protein substrate.
  • ubiquitin-homologous protein refers to a group of naturally occurring proteins, not otherwise describable as ubiquitin equivalents or ubiquitin-like or ubiquitin-related proteins, which appear functionally distinct from ubiquitin in their ability to act as protein tags, but which nonetheless show some degree of homology to human ubiquitin (34-41 % identity).
  • ubiquitin-homologous proteins include RAD23A (36% identical to huUb, SWISS-PROT. Accession No. P54725), RAD23B (34% identical to huUb, SWISS-PROT. Accession No. P54727), DSK2 (41% identical to huUb, GenBank Accession No.
  • ubiquitin-homologous protein as used herein is further meant to signify a class of ubiquitin homologous polypeptides whose similarity to ubiquitin does not include glycine residues in the carboxy-terminal and penultimate residue positions. Said proteins appear functionally distinct from ubiquitin, as well as ubiquitin-like and ubiquitin-related polypeptides, in that, consistent with their lack of a conserved carboxy-terminal glycine for use in an activation reaction, they have not been demonstrated to serve as tags to other proteins by covalent linkage.
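The three categories defined above differ mainly in percent identity to human ubiquitin and in the presence of a carboxy-terminal glycine pair. The sketch below encodes only those two criteria; it is a simplification, since the definitions also depend on demonstrated function as a protein tag, and the function name and the handling of exactly 40% identity are assumptions.

```python
def classify_ubiquitin_relative(percent_identity: float, sequence: str) -> str:
    """Rough sequence-based classification relative to human ubiquitin.

    Hypothetical helper: uses only percent identity and the C-terminal
    Gly-Gly pair; functional evidence is not modelled.
    """
    has_c_terminal_gly_pair = sequence.upper().endswith("GG")
    if not has_c_terminal_gly_pair:
        # No Gly-Gly at the C-terminus: not expected to act as a covalent tag.
        return "ubiquitin-homologous"
    # With a Gly-Gly pair, the identity level separates the two tag-like classes.
    # Exactly 40% is treated as "related" here; the text leaves this case open.
    return "ubiquitin-like" if percent_identity > 40 else "ubiquitin-related"


# Toy sequences (placeholders, not real protein sequences).
print(classify_ubiquitin_relative(55.0, "MLIKVKTLRGG"))  # ubiquitin-like
print(classify_ubiquitin_relative(36.0, "MTVKAEYLRGG"))  # ubiquitin-related
print(classify_ubiquitin_relative(41.0, "MSLNIHIKSGQ"))  # ubiquitin-homologous
```

The third call shows why the glycine pair is checked first: a protein can exceed 40% identity, as stated for DSK2, and still be classed as ubiquitin-homologous.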
  • ubiquitin conjugation machinery refers to a group of proteins which function in the ATP-dependent activation and transfer of ubiquitin to substrate proteins.
  • the term thus encompasses: E1 enzymes, which transform the carboxy-terminal glycine of ubiquitin into a high energy thiol intermediate by an ATP-dependent reaction; E2 enzymes (the UBC genes), which transform the E1-S~Ubiquitin activated conjugate into an E2-S~Ubiquitin intermediate which acts as a ubiquitin donor to a substrate, another ubiquitin moiety (in a poly-ubiquitination reaction), or an E3; and the E3 enzymes (or ubiquitin ligases) which facilitate the transfer of an activated ubiquitin molecule from an E2 to a substrate molecule or to another ubiquitin moiety as part of a polyubiquitin chain.
  • ubiquitin conjugation machinery is further meant to include all known members of these groups as well as those members which have yet to be discovered or characterized but which are sufficiently related by homology to known ubiquitin conjugation enzymes so as to allow an individual skilled in the art to readily identify it as a member of this group.
  • the term as used herein is meant to include novel ubiquitin activating enzymes which have yet to be discovered as well as those which function in the activation and conjugation of ubiquitin-like or ubiquitin-related polypeptides to their substrates and to poly-ubiquitin-like or poly-ubiquitin-related protein chains.
  • ubiquitin-dependent proteolytic machinery refers to proteolytic enzymes which function in the biochemical pathways of ubiquitin, ubiquitin-like, and ubiquitin-related proteins.
  • proteolytic enzymes include the ubiquitin C-terminal hydrolases, which hydrolyze the linkage between the carboxy-terminal glycine residue of ubiquitin and various adducts; UBPs, which hydrolyze the glycine76-lysine48 linkage between cross-linked ubiquitin moieties in polyubiquitin conjugates; as well as other enzymes which function in the removal of ubiquitin conjugates from ubiquitinated substrates (generally termed "deubiquitinating enzymes").
  • protease activities function in the removal of ubiquitin units from a ubiquitinated substrate following or during ubiquitin-dependent degradation as well as in certain proofreading functions in which free ubiquitin polypeptides are removed from incorrectly ubiquitinated proteins.
  • ubiquitin-dependent proteolytic machinery as used herein is also meant to encompass the proteolytic subunits of the proteasome (including human proteasome subunits C2, C3, C5, C8, and C9).
  • the term "ubiquitin-dependent proteolytic machinery” as used herein thus encompasses two classes of proteases: the deubiquitinating enzymes and the proteasome subunits.
  • protease functions of the proteasome subunits are not known to occur outside the context of the assembled proteasome, however independent functioning of these polypeptides has not been excluded.
  • the term "ubiquitin system" as referred to herein is meant to describe all of the aforementioned components of the ubiquitin biochemical pathways including ubiquitin, ubiquitin-like proteins, ubiquitin-related proteins, ubiquitin-homologous proteins, ubiquitin conjugation machinery, ubiquitin-dependent proteolytic machinery, or any of the substrates which these ubiquitin system components act upon.
  • the invention provides negative selectable marker genes or "negative selectable reporter moieties" which can be used in a eukaryotic host cell, preferably a yeast or a mammalian cell, and which can be selected against under appropriate conditions.
  • the selectable reporter is provided as a fusion polypeptide with a carboxy- or C-terminal subdomain of ubiquitin (or Cub) and is in some embodiments of the present invention altered so as to encode a non-methionine amino acid residue at the junction with the Cub.
  • the non-methionine amino acid residue is preferably an amino acid which is recognized by the N-end rule ubiquitin protease system (e.g. an arginine, lysine, histidine, phenylalanine, tryptophan, tyrosine, leucine or isoleucine residue).
  • a preferred example of a negative selectable marker gene for use in yeast is the URA3 gene which can be both selected for (positive selection) by growing ura3 auxotrophic yeast strains in the absence of uracil, and selected against (negative selection) by growing cells on media containing 5-fluoroorotic acid (5-FOA) (see Boeke, et al. (1987) Methods Enzymol 154: 164-75).
  • the concentration of 5-FOA can be optimized by titration so as to maximally select for cells in which the URA3 reporter is inactivated by proteolytic degradation to some preferred extent. For example, relatively high concentrations of 5-FOA can be used which allow only cells expressing very low steady-state levels of URA3 reporter to survive.
  • Such cells will correspond to those in which the first and second ubiquitin subdomain fusion proteins have a relatively high affinity for one another, resulting in efficient reassembly of the Nub and Cub fragments and a correspondingly efficient release of the n-URA3 labilized marker.
  • lower concentrations of 5-FOA can be used to select for protein binding partners with relatively weak affinities for one another.
  • proline can be used in the media as a nitrogen source to make the cells hypersensitive to the toxic effects of the 5-FOA (McCusker & Davis (1991) Yeast 7: 607-8).
  • proline concentrations as well as 5-FOA concentrations can be titrated so as to obtain an optimal selection for URA3 reporter deficient cells. Therefore the use of URA3 as a negative selectable marker allows a broad range of selective stringencies which can be adapted to minimize false positive background noise and/or to optimize selection for high affinity binding interactions.
  • Other negative selectable markers which operate in yeast and which can be adapted to the method of the invention are included within the scope of the invention.
  • a further negative selectable marker gene for use in yeast is the TRP1 gene which can be both selected for (positive selection) by growing trp1 auxotrophic yeast strains in the absence of tryptophan, and selected against (negative selection) by growing cells on media containing 5-fluoroanthranilic acid (5-FAA) (Toyn et al. (2000) Yeast 16: 553-560).
  • 5-FAA 5-fluoroanthranilic acid
  • Two other negative selectable marker genes for the use in yeast are CYH2 and CAN1 both of which can be selected against (negative selection) by growing cells on media containing cycloheximide or canavanine (The yeast two-hybrid system, ed. by Bartel and Fields, Oxford University Press: 1997).
  • mammalian negative selectable markers include Thymidine kinase (Tk) (Wigler et al. (1977) Cell 11: 223-32; Borrelli et al. (1988) Proc. Natl. Acad. Sci. USA 85: 7572-76) of the Herpes Simplex virus, the human gene for hypoxanthine phosphoribosyl transferase (HPRT) (Lester et al. (1980) Somatic Cell Genet. 6: 241-59; Albertini et al.
  • Tk Thymidine kinase
  • HPRT hypoxanthine phosphoribosyl transferase
  • the Tk gene can be selected against using Gancyclovir (GANC) (e.g. using a 1 µM concentration) and the codA gene can be selected against using 5-Fluor Cytidin (5-FIC) (e.g. using a 0.1-1.0 mg/ml concentration).
  • GANC Gancyclovir
  • 5-Fluor Cytidin 5-FIC
  • chimeric selectable markers have been reported (Karreman (1998) Gene 218: 57-61) in which a functional mammalian negative selectable marker is fused to a functional mammalian positive selectable marker such as hygromycin resistance (HygR), neomycin resistance (neoR), puromycin resistance (PACR) or Blasticidin S resistance (BlaSR).
  • HygR hygromycin resistance
  • neoR neomycin resistance
  • PACR puromycin resistance
  • BlaSR Blasticidin S resistance
  • Tk-based positive/negative selectable markers for mammalian cells such as HygTk, Tkneo, TkBSD, and PACTk
  • codA-based positive/negative selectable markers for mammalian cells such as HygCoda, Codaneo, CodaBSD, and PACCoda.
  • Tk-neo reporters which incorporate luciferase, green fluorescent protein and/or beta-galactosidase have also been recently reported (Strathdee et al. (2000) BioTechniques 28: 210-14). These vectors have the advantage of allowing ready screening of the "positive" marker/reporter by fluorescent and/or immunofluorescent microscopy.
  • the invention further provides positive selectable marker genes or "positive selectable reporter moieties" which can be used in a eukaryotic host cell, preferably a yeast or a mammalian cell, and which can be selected for under appropriate conditions.
  • the selectable reporter is provided as a fusion polypeptide with a carboxy- or C-terminal subdomain of ubiquitin (or Cub) and is in some embodiments of the present invention altered so as to encode a non-methionine amino acid residue at the junction with the Cub as further described supra.
  • any non-redundant gene in a synthetic pathway that is essential to the survival of the cell can be used for the construction of an auxotrophic positive selectable marker; frequently used markers of this kind include, without limitation, HIS3, LYS2, LEU2, TRP2, ADE2.
  • a cell line is constructed that is deficient in the marker gene, and that can only grow on media supplemented with the corresponding metabolic product, i.e. histidine, lysine, leucine, tryptophan or adenine.
  • a desirable phenotype i.e.
  • antibiotic resistance markers, e.g. hygromycin resistance (HygR), neomycin resistance (neoR), puromycin resistance (PACR) or Blasticidin S resistance (BlaSR), as mentioned supra, or any other antibiotic resistance marker.
  • expression of a desired recombinant gene is linked to the expression of the antibiotic resistance marker by transforming cells with gene constructs comprising both the desired recombinant gene and a recombinant form of the antibiotic resistance marker gene. Selection is then carried out on media containing the antibiotic, e.g. Hygromycin, neomycin, puromycin or Blasticidin S. Furthermore, the above mentioned combinations of positive and negative markers can also be employed.
  • N-end rule system for proteolytic degradation is a particular branch of the ubiquitin-mediated proteolytic pathway present in eukaryotic cells (Bachmair et al., Science 234: 179-86, 1986). This system operates to degrade a cellular polypeptide at a rate dependent upon the amino-terminal amino acid residue of that polypeptide. Protein translation ordinarily initiates with an ATG methionine codon and so most polypeptides have an amino-terminal methionine residue and are typically relatively stable in vivo. For example, in the yeast S.
  • a beta-galactosidase polypeptide with a methionine amino terminus has a half-life of >20 hours (Varshavsky, Cell 69: 725-35, 1992).
  • polypeptides possessing a non-methionine amino-terminal residue can be created.
  • when an endoprotease hydrolyzes and thus cleaves a unique polypeptide bond (Y-n) internal to a polypeptide, it results in the release of two separate polypeptides, one of which possesses an amino-terminal amino acid, n, which may not be methionine.
  • the endoproteases UBP (ubiquitin-specific proteases), which are a preferred component of the present invention, will cleave a polypeptide bond carboxy-terminal to the final glycine residue (codon 76) of ubiquitin, regardless of what the next codon is.
  • these UBPs serve to cleave a polyubiquitin precursor or other ubiquitin fusion proteins to liberate individual ubiquitin units.
  • it is thus possible to generate a target polypeptide with virtually any amino-terminal residue by merely fusing the target polypeptide in-frame to a codon corresponding to the desired amino-terminal amino acid (n), which codon, in turn, is fused downstream of ubiquitin (typically contiguous with ubiquitin Gly codon 76).
  • the resulting target gene chimera construct has the general structure Ubiquitin-n-target.
  • Preferred target constructs further comprise an epitope tag (Ep) so that the resulting target gene chimera construct has the general structure Ubiquitin-n-Ep-target, which results in the eventual production of a polypeptide of the general structure n-Ep-target.
  • ubiquitin-specific protease activities present in eukaryotic cells will result in the endoproteolytic processing of the Ubiquitin-n-target polypeptide into Ubiquitin and n-target entities.
  • the n-target polypeptide is further acted upon by the components of the N-end rule system as described below. If the target polypeptide is a negative selection marker (NSM) and if n is an amino acid residue (such as arg) which potentiates rapid degradation by the N-end rule system, then cells expressing intact Ubiquitin-n-NSM can be selected against while cells in which the fusion is clipped into a relatively labile n-NSM polypeptide can be selected for.
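The processing of a Ubiquitin-n-Ep-target fusion into ubiquitin and an n-Ep-target polypeptide can be sketched at the sequence level. The helper names, the toy target sequence and the choice of an HA-type epitope tag are assumptions made for illustration; the ubiquitin constant is the 76-residue human ubiquitin sequence, and cleavage is modelled simply as a cut after Gly76.

```python
# Human ubiquitin (76 residues, ending in Gly75-Gly76).
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQ"
    "QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)


def build_fusion(n: str, target: str, epitope: str = "") -> str:
    """Assemble Ubiquitin-n-Ep-target (hypothetical helper)."""
    return UBIQUITIN + n + epitope + target


def ubp_cleave(fusion: str, ub_length: int = 76):
    """Model UBP cleavage: cut the bond carboxy-terminal to Gly76."""
    if len(fusion) <= ub_length:
        raise ValueError("fusion carries no sequence downstream of ubiquitin")
    if fusion[ub_length - 1] != "G":
        raise ValueError("expected glycine at the final ubiquitin position")
    # The downstream fragment now begins with residue n.
    return fusion[:ub_length], fusion[ub_length:]


# Toy example: n = Arg, an HA-type epitope tag (assumption), placeholder target.
fusion = build_fusion("R", "MSKATYK", epitope="YPYDVPDYA")
ub, released = ubp_cleave(fusion)
print(len(ub), released[0])  # 76 R
```

The released fragment begins with the chosen residue n rather than methionine, which is the property the N-end rule system then acts upon.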
  • NSM negative selection marker
  • N-end rule system components are those gene products which act to bring about the rapid proteolysis of polypeptides possessing amino-terminal residues which confer instability.
  • the N-end rule system for proteolysis in eukaryotes appears to be a part of the general ubiquitin-dependent proteolytic system pathways possessed by apparently all eukaryotic cells.
  • this system involves the covalent tagging of a target polypeptide on one or more lysine residues by a ubiquitin polypeptide marker (to form a target(lys)-epsilon amino-gly(76)Ubiquitin covalent bond). Additional ubiquitin moieties may be subsequently conjugated to the target polypeptide and the resulting "ubiquitinated" target polypeptide is then subject to complete proteolytic destruction by a large (26S) multiprotein complex known as the proteasome.
  • the enzymes which conjugate the ubiquitin moieties to the targeted protein include E2 and E3 (or ubiquitin ligase) functions. The E2 and E3 enzymes are thought to possess most of the specificity for ubiquitin dependent proteolytic processes.
  • a key component of the N-end rule proteolytic pathway in yeast is UBR1
  • UBR1 can be used as a regulatable N-end rule component which is the effector of proteolytic degradation of the target gene polypeptide.
  • the UBR1 gene has now been cloned from a mammalian organism (Kwon et al, Proc. Natl. Acad. Sci. USA 95: 7893-903, 1998) as well as from yeast.
  • the UBR1 gene is particularly central to the invention because it can be selectively used in conjunction with any of the above described non-methionine "n" amino-terminal destabilizing residues including: the most destabilizing - arg; strongly destabilizing residues - such as lys, phe, leu, trp, his, asp, and asn; and moderately destabilizing residues - such as tyr, ile, glu, or gln.
  • S. cerevisiae UBC2 RAD6
  • E2 ubiquitin conjugating function which cooperates with the UBR1 - encoded N-end rule E3 to promote multiubiquitination and subsequent degradation of N-end rule substrates
  • a target gene polypeptide possessing an N-end rule destabilizing amino-terminal amino acid (such as arg) will be stable until expression of either the UBR1 (E3) or the UBC2 (E2) is induced from the cognate inducible promoter construct.
  • Both UBR1 and UBC2 can be used in conjunction with any of the above described "n" amino-terminal destabilizing residues including: the most destabilizing - arg; strongly destabilizing residues - such as lys, phe, leu, trp, his, asp, and asn; and moderately destabilizing residues - such as tyr, ile, glu, or gln.
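  • By way of illustration only, the three classes of destabilizing amino-terminal residues listed above can be written as a small lookup table. The names `DESTABILIZING_CLASSES` and `destabilizing_class` are hypothetical and chosen for this sketch; "gln" denotes glutamine, and the class membership is copied from the text rather than from any measured half-life.

```python
# Destabilizing classes as listed in the text (three-letter codes, lower case).
# The table and function names are illustrative assumptions.
DESTABILIZING_CLASSES = {
    "most": ["arg"],
    "strong": ["lys", "phe", "leu", "trp", "his", "asp", "asn"],
    "moderate": ["tyr", "ile", "glu", "gln"],
}


def destabilizing_class(residue: str) -> str:
    """Return the class of an amino-terminal residue, or 'not listed'."""
    key = residue.strip().lower()
    for cls, members in DESTABILIZING_CLASSES.items():
        if key in members:
            return cls
    return "not listed"


if __name__ == "__main__":
    for r in ("arg", "Leu", "gln", "met"):
        print(r, "->", destabilizing_class(r))
```

  • A residue that does not appear in the table, such as met, is reported as "not listed" rather than being assigned a stability.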
  • Still other alternative embodiments of the N-end rule component of the present invention are components of the N-end rule system which affect only a subset of the destabilizing residues.
  • the NTA1 deamidase (Baker and Varshavsky, J. Biol. Chem. 270: 12065-74, 1995) functions to deamidate amino-terminal asn or gln residues (to form polypeptides with asp or glu amino-terminal residues respectively).
  • Yeast strains harboring nta1 null alleles are unable to degrade N-end rule substrates that bear amino-terminal asn or gln residues.
  • the NTA1 gene is an alternative embodiment of the N-end rule component of the present invention, but is used preferably in conjunction with a target gene polypeptide (n-target), in which n is either asn or gln.
  • the ATE1 transferase (Balzi et al., J. Biol. Chem. 265: 7464-71, 1990) is an enzyme which acts to transfer the arg moiety from an arg-activated tRNA (tRNA-Arg) to amino-terminal glu or asp bearing polypeptides.
  • the resulting arg-glu-polypeptide and arg-asp-polypeptide products are then susceptible to the E2/E3 - mediated N-end rule dependent proteolytic processes described above.
  • the ATE1 transferase is an alternative embodiment of the N-end rule component of the present invention, but its use is preferably tied to target gene polypeptides (n-target), in which n is asp, glu, asn or gln.
  • Polypeptides bearing the latter two amino-terminal residues are first converted to polypeptides bearing one of the former two amino-terminal residues by NTA1 deamidase function described above. From the description above, it is apparent to a skilled artisan that different cell types might possess different N-end rule components. Therefore, it might be necessary and important to genetically engineer a given cell line so that a complementation screen based on the instant invention can be successfully carried out in that given cell line. For example, many libraries or constructs generated for use in mammalian systems might be easily adapted for use in a different cell type if that cell type has the same or very similar N-end rule components and operates essentially the same as mammalian cells.
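  • The order in which the N-end rule components described above act on a given amino-terminal residue can be sketched as follows. This is a simplified model under stated assumptions: the function name `n_end_rule_steps` is hypothetical, only the twelve residues named in this specification are accepted, and each component is reduced to a single descriptive step.

```python
from typing import List

# Residues named in the text as N-end rule destabilizing (assumption: no others).
DESTABILIZING = {
    "arg", "lys", "phe", "leu", "trp", "his",
    "tyr", "ile", "asp", "glu", "asn", "gln",
}


def n_end_rule_steps(n: str) -> List[str]:
    """List, in order, the components expected to act on residue n."""
    key = n.strip().lower()
    if key not in DESTABILIZING:
        raise ValueError(f"{n!r} is not a listed destabilizing residue")
    steps = []
    if key in ("asn", "gln"):
        # NTA1 converts asn to asp and gln to glu.
        steps.append("NTA1 deamidase: asn -> asp, gln -> glu")
        key = {"asn": "asp", "gln": "glu"}[key]
    if key in ("asp", "glu"):
        # ATE1 adds an arg residue in front of amino-terminal asp or glu.
        steps.append(f"ATE1 transferase: adds arg to amino-terminal {key}")
    # The E2/E3 pair acts last in every case.
    steps.append("UBC2 (E2) / UBR1 (E3): multiubiquitination and degradation")
    return steps


if __name__ == "__main__":
    for residue in ("asn", "glu", "arg"):
        print(residue, n_end_rule_steps(residue))
```

  • In this model an asn or gln substrate needs three components, an asp or glu substrate needs two, and the remaining residues need only the E2/E3 pair, which is why NTA1 and ATE1 are useful as regulatable components only for the corresponding subsets of residues.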
  • the N-end rule components may be provided as clones so that they can be put under the control of an inducible promoter (using standard subcloning methods well known in the art). It is also possible that other genetic engineering steps can be performed in a given cell type to make it suitable for expression of source DNA in libraries using mammalian expression vectors.
  • genes which genes may potentially be heterologous to the cell type employed, and/or "knocking-out" genes, techniques which are well known in the art and can be readily appreciated by a skilled artisan.
  • the N-end rule component must be available as a clone so that it can be put under the control of an inducible promoter (using standard subcloning methods known in the art). This can be achieved by first introducing genetically engineered copies of the inducible repressor and the inducible N-end rule component constructs, and subsequently deleting the normal chromosomal copies of these genes from the host by "knockout" methods. Such methods, we note here, are well developed in the art - particularly in the case of both the yeast Saccharomyces cerevisiae and the mouse.
  • FIG. 2A diagrams this process for the replacement of the native promoter of the target gene with a repressible promoter, but this principle is also applicable to the replacement of the native promoter of the effector of suppression (i.e. the transcriptional repressor and/or the N-end rule component) with a suitable inducible promoter.
  • Ub ubiquitin
  • Ub is a 76-residue, single-domain protein whose covalent coupling to other proteins yields branched Ub-protein conjugates and plays a role in a number of cellular processes, primarily through routes that involve protein degradation.
  • linear Ub adducts are the translational products of natural or engineered Ub fusions.
  • UBPs Ub-specific proteases
  • the present invention relies in part upon the previously described split ubiquitin protein sensor system (see U.S. Patent Nos. 5,503,977 & 5,585,245). Briefly, it has been demonstrated that an N-terminal ubiquitin subdomain and a C-terminal ubiquitin subdomain, the latter bearing a reporter extension at its C-terminus, when coexpressed in the same cell by recombinant DNA techniques as distinct entities, have the ability to associate, reconstituting a ubiquitin molecule which is recognized, and cleaved, by ubiquitin-specific processing proteases which are present in all eukaryotic cells.
  • ubiquitin-specific proteases recognize the folded conformation of ubiquitin.
  • ubiquitin-specific proteases retained their cleavage activity and specificity of recognition of the ubiquitin moiety that had been reconstituted from two unlinked ubiquitin subdomains.
  • Ubiquitin is a 76-residue, single-domain protein comprising two subdomains which are relevant to the present invention: the N-terminal subdomain and the C-terminal subdomain.
  • the ubiquitin protein has been studied extensively and the
  • Nub N-terminal subdomain
  • the C-terminal subdomain of ubiquitin (Cub), as referred to herein, is that portion of the ubiquitin which is not a portion of the N-terminal subdomain defined in the preceding paragraph. Generally speaking, this subdomain comprises amino acid residues from about 35-38 to about 76. It should be recognized that by using only routine experimentation it will be possible to define with precision the minimum requirements at both ends of the N-terminal subdomain and the C-terminal subdomain which are necessary to be useful in connection with the present invention.
  • Nub refer, in preferred embodiments of the invention, to ubiquitin subdomain units which have been mutated so as to decrease their binding affinity, thereby making the Cub/Nub association dependent upon the binding of a second protein pair fused to the Cub and Nub subunits or a conformational change of a protein fused in between Nub and Cub. Suitable forms of Nub are described below and still others are readily available to the skilled artisan by routine mutation and screening methods.
  • one member of the pair is fused to the N-terminal subdomain of ubiquitin and the other member of the specific-binding pair is fused to the C-terminal subdomain of ubiquitin. Since the members of the specific-binding pair (linked to subdomains of ubiquitin) have an affinity for one another, this affinity increases the "effective" (local) concentration of the N-terminal and C-terminal subdomains of ubiquitin, thereby promoting the reconstitution of a quasi-native ubiquitin moiety.
  • the term "quasi-native ubiquitin moiety" will be used herein to denote a moiety recognizable as a substrate by ubiquitin-specific proteases.
  • ubiquitin-specific proteases a further requirement is imposed in the present invention in order to increase the resolving capacity of the method for studying such interactions.
  • the binding interaction studies described herein are carried out under conditions appropriate for protein/protein interaction. Such conditions are provided in vivo (i.e., under physiological conditions inside living cells) or in vitro, when parameters such as temperature, pH and salt concentration are controlled in a manner intended to mimic physiological conditions.
  • the present invention preferably uses the disclosed in vivo screening methods which have the advantage of being subject to a powerful negative selection method.
  • the mutational alteration of a ubiquitin subdomain for use with the instant invention is preferably a point mutation.
  • mutational alterations which would be expected to grossly affect the structure of the subdomain bearing the mutation are to be avoided.
  • a number of ubiquitin-specific proteases have been reported, and the nucleic acid sequences encoding such proteases are also known (see e.g., Tobias et al, J. Biol. Chem. 266: 12021, 1991; Baker et al., J. Biol. Chem. 267: 23364, 1992).
  • the preferred mutational alteration within the Nub subunit is a mutation in which an amino acid substitution is effected. For example, the substitution of an amino acid having chemical properties similar to the substituted amino acid (e.g., a conservative substitution) is preferred.
  • the desired mild perturbation of ubiquitin subdomain interaction is achieved by substituting a chemically similar amino acid residue which differs primarily in the size of its side chain.
  • Such a steric perturbation is expected to introduce a desired (mild) conformational destabilization of a ubiquitin subdomain.
  • the goal is to reduce the affinity of the N-terminal and C-terminal subdomains for one another, not necessarily to eliminate this affinity.
  • the mutational alteration may be introduced into the N-terminal subdomain of ubiquitin. More specifically, a first neutral amino acid residue may be replaced with a second neutral amino acid having a side chain which differs in size from the first neutral amino acid residue side chain to achieve the desired decrease in affinity.
  • the first neutral amino acid residue isoleucine (either residue 3 or 13 of wild-type ubiquitin) may be replaced with a neutral amino acid which has a side chain which differs in size from isoleucine, such as glycine, alanine or valine.
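  • The isoleucine substitutions described above can be enumerated with a short sketch. The sequence used is the 76-residue human ubiquitin sequence in one-letter code, included here as an assumption for illustration; the names `UBIQUITIN` and `nub_variants` are hypothetical, and the variants are shown as full-length sequences because the exact boundary of the N-terminal subdomain is left to the practitioner.

```python
# Human ubiquitin, 76 residues, one-letter code (illustrative assumption).
UBIQUITIN = (
    "MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQ"
    "RLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG"
)


def nub_variants(seq: str = UBIQUITIN) -> dict:
    """Replace Ile3 or Ile13 with Gly, Ala or Val; positions are 1-based."""
    variants = {}
    for pos in (3, 13):
        if seq[pos - 1] != "I":
            raise ValueError(f"residue {pos} is not isoleucine")
        for new in "GAV":
            # Name follows the usual <old><position><new> convention, e.g. I13G.
            variants[f"I{pos}{new}"] = seq[: pos - 1] + new + seq[pos:]
    return variants


if __name__ == "__main__":
    for name, mutated in nub_variants().items():
        print(name, mutated[:15])
```

  • Each of the six variants differs from the wild-type sequence at a single position, consistent with the stated preference for point mutations that perturb the subdomain only mildly.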
  • fusion construct combinations can be used in the methods of this invention.
  • One strict requirement which applies to all N- and C-terminal fusion construct combinations is that the C-terminal subdomain must bear an amino acid (e.g., peptide, polypeptide or protein) extension. This requirement is based on the fact that the detection of interaction between two proteins of interest linked to two subdomains of ubiquitin is achieved through cleavage after the C-terminal residue of the quasi-native ubiquitin moiety, with the formation of a free reporter moiety (or peptide) that had previously been linked to a C-terminal subdomain of ubiquitin.
  • Ubiquitin-specific proteases cleave a linear ubiquitin fusion between the C-terminal residue of ubiquitin and the N-terminal residue of the ubiquitin fusion partner, but they do not cleave an otherwise identical fusion whose ubiquitin moiety is conformationally perturbed. In particular, they do not recognize as a substrate a C-terminal subdomain of ubiquitin linked to a "downstream" reporter sequence, unless this C-terminal subdomain associates with an N-terminal subdomain of ubiquitin to yield a quasi-native ubiquitin moiety.
  • the characteristics of the C-terminal amino acid extension of the C-terminal ubiquitin subdomain must be such that the products of the cleaved fusion protein are distinguishable from the uncleaved fusion protein. In practice, this is generally accomplished by monitoring a physical property or activity of the C-terminal extension which is cleaved free from the C-terminal ubiquitin moiety. It is generally a property of the free C-terminal extension that is monitored as an indication that a quasi-native ubiquitin has formed, because monitoring of the quasi-native ubiquitin moiety directly is difficult in eukaryotic cells due to the presence of native ubiquitin.
  • the size of the C-terminal extension which is released following cleavage of the quasi-native ubiquitin moiety within a reporter fusion by a ubiquitin-specific protease is a particularly convenient characteristic in light of the fact that it is relatively easy to monitor changes in size using, for example, electrophoretic methods. For instance, if the C-terminal reporter extension has a molecular weight of about 20 kD, the cleavage products will be distinguishable from the non-cleaved quasi-native ubiquitin moiety by virtue of the appearance of a previously absent reporter-specific 20 kD band following cleavage of the reporter fusion.
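  • The size shift described above can be estimated with simple arithmetic. The following sketch is an approximation under stated assumptions: an average mass of about 110 Da per residue, a C-terminal subdomain beginning at residue 35, and no further protein fused to the subdomain; the names `AVG_RESIDUE_KD` and `expected_bands` are hypothetical.

```python
AVG_RESIDUE_KD = 0.110  # rough average residue mass in kD (assumption)


def expected_bands(cub_start: int = 35, ub_length: int = 76,
                   reporter_kd: float = 20.0) -> dict:
    """Estimate reporter-containing band sizes before and after cleavage."""
    # Residues cub_start..ub_length inclusive make up the C-terminal subdomain.
    cub_residues = ub_length - cub_start + 1
    cub_kd = cub_residues * AVG_RESIDUE_KD
    return {
        "uncleaved_fusion_kd": round(cub_kd + reporter_kd, 1),
        "free_reporter_kd": round(reporter_kd, 1),
    }


if __name__ == "__main__":
    print(expected_bands())
```

  • With these assumptions the uncleaved fusion is expected a few kD above the free 20 kD reporter, a difference that is resolvable on a polyacrylamide gel when the reporter is detected specifically.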
  • Because cleavage can take place, for example, in crude cell extracts or in vivo, it is generally not possible to monitor such changes in molecular weight of cleavage products by simply staining an electrophoretogram with a dye that stains proteins nonspecifically, because there are too many proteins in the mixture to analyze in this manner.
  • One preferred method of analysis is immunoblotting. This is a conventional analytical method wherein the cleavage products are separated electrophoretically, generally in a polyacrylamide gel matrix, and subsequently transferred to a charged solid support (e.g., nitrocellulose or a charged nylon membrane). An antibody which binds to the reporter of the ubiquitin-specific protease cleavage products is then employed to detect the transferred cleavage products using routine methods for detection of the bound antibody.
  • Another useful method is immunoprecipitation of either a reporter-containing fusion to C-terminal subdomains of ubiquitin or the free reporter (liberated through the cleavage by ubiquitin-specific proteases upon reconstitution of a quasi-native ubiquitin moiety) with an antibody to the reporter.
  • the proteins to be immunoprecipitated are first labeled in vivo with a radioactive amino acid such as
  • a cell extract is then prepared, and reporter-containing proteins are precipitated from the extract using an anti-reporter antibody.
  • the immunoprecipitated proteins are fractionated by electrophoresis in a polyacrylamide gel, followed by detection of radioactive protein species by autoradiography or fluorography.
  • a preferred experimental design is to extend the C-terminal subdomain of ubiquitin with a peptide containing an epitope foreign to the system in which the assay is being carried out. It is also preferable to design the experiment so that the C-terminal reporter extension of the C-terminal subdomain of ubiquitin is sufficiently large, i.e., easily detectable by the electrophoretic system employed. In this preferred embodiment, the C-terminal reporter extension of the C-terminal subdomain should be viewed as a molecular weight marker. The characteristics of the extension other than its molecular weight and immunological reactivity are not of particular significance.
  • this C-terminal extension can represent an amalgam comprising virtually any amino acid sequence combination fused to an epitope for which a specifically binding antibody is available.
  • the C-terminal extension of the C-terminal ubiquitin subdomain may be a combination of the "ha" epitope fused to mouse DHFR (an antibody to the "ha" epitope is readily available).
  • a "reporter" enzyme which, in its native form, exhibits an enzymatic activity that is abolished when the enzyme is N-terminally extended, can also serve as the C-terminal reporter linked to the C-terminal ubiquitin subdomain.
  • When the reporter is present as a fusion to the C-terminal ubiquitin subdomain, the reporter moiety is inactive. However, if the C-terminal ubiquitin subdomain and the N-terminal ubiquitin subdomain associate to reconstitute a quasi-native ubiquitin moiety in the presence of a ubiquitin-specific protease, the reporter moiety will be released, with the concomitant restoration of its enzymatic activity.
  • the reporter moiety is a eukaryotic negative selectable marker (NSM) which has been engineered to be processed and released as an N-end rule-labile n-NSM fusion following UBP cleavage.
  • NSM eukaryotic negative selectable marker
  • the negative selectable markers (NSMs) for use in the invention are described elsewhere.
  • the advantage of using an n-NSM fusion is that interaction of the specific binding pair can be directly selected for (as opposed to screened for) by virtue of the fact that only cells in which n-NSM has been released will survive negative selection.
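  • The selection logic described above can be summarized as a small truth function. This is a deliberately simplified model: it assumes that ubiquitin-specific proteases are always present (as stated for eukaryotic cells), that release of the n-NSM polypeptide requires subdomain association, and that survival requires degradation of the released marker; the function name is hypothetical.

```python
def survives_negative_selection(subdomains_associate: bool,
                                n_is_destabilizing: bool,
                                n_end_rule_active: bool) -> bool:
    """Return True when the negative selection marker is expected to be removed."""
    # UBP cleavage needs a reconstituted quasi-native ubiquitin moiety.
    released = subdomains_associate
    # The released n-NSM is degraded only if n is destabilizing and the
    # N-end rule components are expressed.
    return released and n_is_destabilizing and n_end_rule_active


if __name__ == "__main__":
    for assoc in (True, False):
        for destab in (True, False):
            for active in (True, False):
                result = survives_negative_selection(assoc, destab, active)
                print(assoc, destab, active, "->", result)
```

  • Only one of the eight combinations permits survival, which is what allows interaction of the specific binding pair to be selected for directly rather than screened for.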
  • the target gene reporter (negative selectable marker) must be fused downstream of a codon which encodes an N-end rule susceptible residue (n, as described above) and this residue, in turn, must be fused in-frame to the carboxy-terminus of a ubiquitin coding sequence (generally the carboxy-terminus of a C-terminal ubiquitin subdomain (Cub) which corresponds to gly76 of intact ubiquitin).
  • Cub C-terminal ubiquitin subdomain
  • UBPs normally function to process poly-ubiquitin chains (the translational product of the tandem ubiquitin encoding sequences of eukaryotic genomes) into discrete (normally 76 a.a.) ubiquitin moieties which are used in ubiquitin-system pathways.
  • the UBPs serve as a convenient means to generate target gene polypeptides bearing specific amino-terminal residues (n). Nonetheless, it is understood that other alternatives to mammalian or yeast ubiquitin exist which can function in the method of the present invention.
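  • The in-frame arrangement described above (ubiquitin coding sequence ending at gly76, then a codon for n, then the reporter) can be checked with a minimal sketch. The codons are taken from the standard genetic code, one per residue; the function name `build_fusion` and the short toy sequences in the example are hypothetical placeholders and not sequences from this specification.

```python
# One standard-code codon per N-end rule residue (choice of codon is arbitrary).
CODONS = {
    "arg": "CGT", "lys": "AAA", "his": "CAT", "leu": "CTG",
    "phe": "TTT", "tyr": "TAT", "ile": "ATT", "trp": "TGG",
    "asn": "AAT", "gln": "CAA", "asp": "GAT", "glu": "GAA",
}


def build_fusion(cub_cds: str, n: str, reporter_cds: str) -> str:
    """Join Cub coding sequence, the codon for n, and the reporter in frame."""
    cub_cds, reporter_cds = cub_cds.upper(), reporter_cds.upper()
    if len(cub_cds) % 3 or len(reporter_cds) % 3:
        raise ValueError("coding sequences must be whole codons")
    if not cub_cds.endswith(("GGT", "GGC", "GGA", "GGG")):
        # The last ubiquitin codon should encode gly76.
        raise ValueError("ubiquitin coding sequence must end with a gly codon")
    return cub_cds + CODONS[n.lower()] + reporter_cds


if __name__ == "__main__":
    toy_cub = "CGTCTGCGTGGTGGT"   # encodes R-L-R-G-G (toy 3' end only)
    toy_reporter = "GCTGCTGCTTAA"  # placeholder reporter with stop codon
    print(build_fusion(toy_cub, "arg", toy_reporter))
```

  • Because every part is a whole number of codons, the reading frame established in the ubiquitin sequence is carried through the n codon into the reporter.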
  • Such ubiquitin equivalents include, for example, ubiquitin mutants, ubiquitin-like proteins, ubiquitin-related proteins, and ubiquitin-homologous proteins.
  • ubiquitin-like proteins such as NEDD8, UBL1, FUBI, and UCRP, as well as analogous ubiquitin-related proteins such as SUMO/Sentrin/Pic1 may be used as ubiquitin equivalents in the method of the invention.
  • These ubiquitin-like proteins share the common features of being related to ubiquitin by amino acid sequence homology and, with the apparent exception of the ubiquitin homologous proteins, of being covalently transferred to cellular protein targets post-translationally.
  • N-end rule susceptible residue arg, lys, his, leu, phe, tyr, ile, trp, asn, gln, asp, or glu
  • General methods for engineering such N-end rule residues into ubiquitin-reporter chimera expression vectors are well known in the art (e.g. the "fusion PCR" method; see Karreman, BioTechniques 24: 736-42, 1988).
  • In the intrapolypeptide split-ubiquitin conformational assays of the invention, polypeptide and/or small molecule libraries may be utilized.
  • intrapolypeptide split-ubiquitin assays for therapeutic compounds which stabilize or destabilize the conformation of a particular target protein, such as a beta conformation of the beta-amyloid protein can be performed using small molecule libraries, peptide libraries or nucleic acid expression libraries.
  • specific binding polypeptides for a predetermined ligand can be designed by expression of appropriate libraries of variegated split-ubiquitin/polypeptide sequences; interacting polypeptides can be selected by monitoring split-ubiquitin reporter output from individual clones in the presence and absence of the ligand.
  • cDNA complementary DNA
  • Genomic DNA is another major source of DNA, although it is less common for construction of an expression library, largely due to the presence of introns and other non-coding regions.
  • the isolation of genomic DNA and size fractionation into suitable pieces for library construction is also well-known in the art.
  • DNA sources can also be used.
  • random or semi-random polynucleotide sequences can be used as source DNA for library construction. This is a particularly powerful method when small stretches of these random fragments are incorporated into a known coding sequence to screen for optimal sequences for certain activity, e.g. binding between two proteins or enzymatic activity.
  • the chosen vector shall have at least one cloning site for insertion of source DNA.
  • the most commonly used cloning sites are restriction enzyme sites, preferably those of restriction enzymes that rarely cut inside coding sequences, such as NotI and SalI.
  • other sites can also be used.
  • loxP sites can be used instead of or in addition to restriction enzyme sites.
  • Such sites flanking the cloned source DNA can be recognized by Cre recombinase and readily excised in a controlled manner since Cre recombinase can be conditionally provided by induced expression.
  • Many other similar recombination-based systems are also commercially available, such as the Gateway system (Life Technology, Inc.) that is described in U.S. Pat. No. 5,888,732, the content of which is incorporated by reference herein.
  • the vector shall also be suitable for expression of the cloned source DNA, either in vitro or in vivo. At the minimum, it shall have a promoter for transcription of the DNA in its intended host.
  • the host can be a mammalian cell, an insect cell, or a plant cell, or any other cell as specified in other sections of this specification.
  • the vector shall also have the ability to maintain itself in the host cell, at least during the pendency of the experiment. That can be achieved by self replication or integration into the host genome. Some vectors may also contain selectable markers to facilitate easy identification of cells that have accepted/maintained the vector, and thus the source DNA.
  • U.S. Pat. No. 6,255,071, which is incorporated herein by reference in its entirety, provides a detailed description of a variety of viral vectors suitable for mammalian expression screens. Specifically, U.S. Pat. No. 6,255,071 relates to methods and compositions for improved mammalian complementation screening, functional inactivation of specific essential or non-essential mammalian genes, and identification of mammalian genes which are modulated in response to specific stimuli.
  • retroviral vectors, libraries comprising such vectors, retroviral particles produced by such vectors in conjunction with retroviral packaging cell lines, integrated provirus sequences derived from the retroviral particles and circularized provirus sequences which have been excised from the integrated provirus sequences. It further discloses novel retroviral packaging cell lines for use with those viral vectors. Exemplary vectors disclosed by the patent are:
  • the retroviral vector may also contain a polycistronic message cassette which makes possible a selection scheme that directly links expression of a selectable marker to transcription of a cDNA or genomic DNA (gDNA) sequence.
  • Such a polycistronic message cassette can comprise, in one embodiment, from 5' to 3', the following elements: a nucleotide polylinker, an internal ribosome entry site and a mammalian selectable marker.
  • the polycistronic cassette is situated within the retroviral vector between the 5' LTR and the 3' LTR at a position such that transcription from the 5' LTR promoter transcribes the polycistronic message cassette.
  • the transcription of the polycistronic message cassette may also be driven by an internal cytomegalovirus (CMV) promoter or an inducible promoter, which may be preferable depending on the screenings.
  • CMV cytomegalovirus
  • the polycistronic message cassette can further comprise a cDNA or genomic DNA (gDNA) sequence operatively associated within the polylinker.
  • Internal ribosome entry site sequences are well known to those of skill in the art and can comprise, for example, internal ribosome entry sites derived from foot and mouth disease virus (FDV), encephalomyocarditis virus, poliovirus and RDV (Scheper, 1994, Biochimie 76: 801-809; Meyer, 1995, J. Virol. 69: 2819-2824; Jang, 1988, J. Virol. 62: 2636-2643; Haller, 1992, J. Virol. 66: 5075-5086).
  • Any mammalian selectable marker can be utilized as the polycistronic message cassette mammalian selectable marker.
  • Such mammalian selectable markers are well known to those of skill in the art and can include, but are not limited to, kanamycin/G418, hygromycin B or mycophenolic acid resistance markers. Other examples are provided elsewhere herein.
  • the retroviral vectors' proviral excision element allows for excision of retroviral provirus (see below) from the genome of a recipient cell.
  • the element comprises a nucleotide sequence which is specifically recognized by a recombinase enzyme.
  • the recombinase enzyme cleaves nucleic acid at its site of recognition in such a manner that excision via recombinase action leads to circularization of the excised nucleic acid molecules.
  • the recombinase recognition site is located within the 3' LTR at a position which is duplicated upon integration of the provirus. This results in a provirus that is flanked by recombinase sites.
  • the proviral excision element comprises a loxP recombination site, which is cleavable by a Cre recombinase enzyme. Contacting Cre recombinase to an integrated provirus derived from the retroviral vector results in excision of the provirus nucleic acid.
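  • The excision and circularization of a provirus flanked by two directly repeated loxP sites can be modeled on plain sequence strings. The 34 bp loxP sequence below is the commonly cited wild-type site; the function name `excise` and the toy flanking and proviral sequences are hypothetical, and the sketch models only the outcome of recombination, not the enzyme.

```python
# Wild-type loxP: 13 bp inverted repeat, 8 bp spacer, 13 bp inverted repeat.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"


def excise(genome: str):
    """Return (genome after excision, excised circle written linearly)."""
    first = genome.find(LOXP)
    second = genome.find(LOXP, first + len(LOXP)) if first != -1 else -1
    if first == -1 or second == -1:
        raise ValueError("two loxP sites in direct orientation are required")
    # One loxP site stays in the genome; the other leaves with the circle.
    remaining = genome[:first] + genome[second:]
    circle = genome[first:second]
    return remaining, circle


if __name__ == "__main__":
    toy = "AAAA" + LOXP + "GGGGCCCC" + LOXP + "TTTT"
    remaining, circle = excise(toy)
    print(remaining)
    print(circle)
```

  • The excised molecule carries a single loxP site together with everything that lay between the two sites, which is why elements placed inside the provirus are carried along on the recovered circle.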
  • a mutant lox P recombination site may be used (e.g., lox P511 (Hoess et al., 1986, Nucleic Acids Research 14:2287-2300)) that can only recombine with an identical mutant site.
  • an FRT recombination site which is cleavable by a FLP recombinase enzyme, is utilized in conjunction with FLP recombinase enzyme, as described above for the loxP/Cre embodiment.
  • a rare-cutting restriction enzyme e.g., Not I
  • the recovered DNA would be digested with Not I and then recircularized with ligase.
  • the Not I site is included in the vector next to loxP.
  • an r recombinase site and r recombinase from Zygosaccharomyces rouxii can be utilized, as described above, for the loxP/Cre embodiment.
  • the retroviral vectors' proviral recovery element allows for recovery of excised provirus from a complex mixture of nucleic acid, thus allowing for the selective recovery and excision of provirus from a recipient cell genome.
  • the proviral recovery element comprises a nucleic acid sequence which corresponds to the nucleic acid portion of a high affinity binding nucleic acid/protein pair.
  • the nucleic acid can include, but is not limited to, a nucleic acid which binds with high affinity to a lac repressor, tet repressor or lambda repressor protein.
  • the proviral recovery element comprises a lac operator nucleic acid sequence, which binds to a lac repressor peptide sequence.
  • a proviral recovery element can be affinity-purified using lac repressor bound to a matrix (e.g., magnetic beads or sepharose).
  • An excised provirus derived from the retroviral vectors of the invention also contains the retroviral recovery element and can be affinity purified.
  • the 5' LTR comprises a promoter, including but not limited to an LTR promoter, an R region, a U5 region and a primer binding site, in that order. Nucleotide sequences of these LTR elements are well known to those of skill in the art.
  • the 3' LTR comprises a U3 region which comprises the proviral excision element, a promoter, an R region and a polyadenylation signal. Nucleotide sequences of such elements are well known to those of skill in the art.
  • the bacterial origin of replication (Ori) utilized is preferably one which does not adversely affect viral production or gene expression in infected cells.
  • the bacterial Ori is a non-pUC bacterial Ori relative (e.g., pUC, colE1, pSC101, p15A and the like). Further, it is preferable that the bacterial Ori exhibit less than 90% overall nucleotide similarity to the pUC bacterial Ori.
  • the bacterial origin of replication is a RK2 OriV or f1 phage Ori.
  • Bacterial selectable markers are well known to those of skill in the art and can include, but are not limited to, kanamycin/G418, zeocin, actinomycin, ampicillin, gentamycin, tetracycline, chloramphenicol or penicillin resistance markers.
  • the retroviral vectors can further comprise a lethal stuffer fragment which can be utilized to select for vectors containing cDNA or gDNA inserts during, for example, construction of libraries comprising the retroviral vectors of the invention.
  • Lethal stuffer fragments are well known to those of skill in the art (see, e.g., Bernard et al., 1994, Gene 148:71-74, which is incorporated herein by reference in its entirety).
  • a lethal stuffer fragment contains a gene sequence whose expression conditionally inhibits cellular growth.
  • the stuffer fragment is present in the retroviral vectors of the invention within the polycistronic message cassette polylinker such that insertion of a cDNA or gDNA sequence into the polylinker replaces the stuffer fragment.
  • the polycistronic message cassette polylinker is located within the lethal stuffer fragment coding sequence such that, upon insertion of a cDNA or gDNA sequence into the polylinker, the lethal stuffer fragment coding region is disrupted.
  • the retroviral vectors can further comprise a single-stranded replication origin, preferably an f1 single-stranded replication origin.
  • the single-stranded replication origin allows for the production of normalized single-stranded retroviral libraries derived from the retroviral vectors of the invention.
  • a normalized library is one constructed in a manner that increases the relative frequency of occurrence of rare clones while decreasing simultaneously the relative frequency of the occurrence of abundant clones.
  • Soares et al. Soares, M. B. et al, 1994, Proc. Natl. Acad. Sci. USA
  • pEHRE vector A mammalian episomal vector, termed pEHRE vector, which makes possible, stable, efficient, high-level episomal expression within a wide spectrum of mammalian cells.
  • Such vectors can also, for example, be utilized as part of the complementation screening methods of the invention.
  • Such pEHRE expression vectors comprise a replication cassette, an expression cassette and minimal cis-acting elements necessary for replication and stable episomal maintenance.
  • the pEHRE vectors can further contain at least one bacterial origin of replication and/or recombination sites.
  • the recombination sites preferably flank the replication cassette, and can include, but are not limited to, any of the recombination sites described above.
  • any bacterial origin of replication which does not adversely affect the expression of pEHRE sequences can be utilized.
  • the bacterial Ori can be a pUC bacterial Ori relative (e.g., pUC, colE1, pSC101, p15A and the like).
  • the bacterial origin of replication can also, for example, be a RK2 OriV or f1 phage Ori.
  • the pEHRE vectors can further comprise a single-stranded replication origin, preferably an f1 single-stranded replication origin.
  • the single-stranded replication origin allows for the production of normalized single-stranded libraries derived from the pEHRE vectors of the invention.
  • the pEHRE vectors can additionally comprise a nucleic acid sequence which corresponds to the nucleic acid portion of a high affinity binding nucleic acid/protein pair.
  • nucleic acid/protein pairs can be those as described above, the nucleic acid portion of which can include, but is not limited to, a lacO site.
  • the nucleic acid can include, but is not limited to, a nucleic acid which binds with high affinity to a lac repressor, tet repressor or lambda repressor protein.
  • the proviral recovery element comprises a lac operator nucleic acid sequence, which binds to a lac repressor peptide sequence.
  • Such a proviral recovery element can be affinity-purified using lac repressor bound to a matrix (e.g., magnetic beads or sepharose).
  • An excised provirus derived from the retroviral vectors of the invention also contains the retroviral recovery element and can be affinity purified.
  • a pEHRE vector replication cassette comprises nucleic acid sequences which encode papillomavirus (PV) E1 and E2 proteins, wherein such nucleic acid sequences are operatively attached to, and transcribed by, a constitutive transcriptional regulatory sequence.
  • Representative E1 and E2 amino acid sequences are well known to those of skill in the art. See, e.g., sequences publicly available in databases such as GenBank.
  • the E1 and E2 coding sequences can, first, include any nucleotide sequences which encode endogenous PV, including but not limited to bovine papillomavirus (BPV), such as BPV-1 E1 or E2 gene products.
  • E1 also refers to any protein which is capable of functioning in PV in the same manner as the endogenous E1 protein, i.e., is capable of complementing an E1 mutation.
  • E2 refers to any protein which is capable of functioning in PV in the same manner as the endogenous E2 protein, i.e., is capable of complementing an E2 mutation.
  • Taking BPV as an example, an E2 protein, as described herein, is one capable of complementing a BPV E2 mutation.
  • the replication cassette constitutive transcriptional regulatory sequence can include, but is not limited to, any polII promoter, such as an SV40, CMV or PGK promoter, nucleotide sequences of which are well known to those of skill in the art.
  • E1 and E2 coding sequences can be operatively attached to, and transcribed by, separate transcriptional regulatory sequences.
  • at least one of the E1 or E2 coding sequences can be transcribed along with a selectable marker, preferably a mammalian selectable marker, as a polycistronic message.
  • the portion of a replication cassette encoding such a polycistronic message could comprise, from 5' to 3': a constitutive transcriptional regulatory sequence, an E2 (or E1) coding sequence, an internal ribosome entry site (IRES), and a selectable marker.
  • both E1 and E2 coding sequences can be transcribed as a polycistronic message. That is, both E1 and E2 coding sequences, separated by an internal ribosome entry site, can be transcribed by a single transcriptional regulatory sequence.
  • E1, E2 and selectable marker sequences can be transcribed as a polycistronic message.
  • the replication cassette could comprise, from 5' to 3': a constitutive transcriptional regulatory sequence, an E2 (or E1) coding sequence, an IRES, an E1 (or E2) coding sequence, an IRES and a selectable marker.
  • in instances wherein the E1 and E2 coding sequences are transcribed as part of a polycistronic message, it is preferred that the order, from 5' to 3', be E2 then E1. This is to ensure against possible rare, undesirable RNA splicing events.
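The preferred 5' to 3' element order of a polycistronic replication cassette can be written down as a small data structure. The following is a minimal sketch for illustration only; the function names and the element labels are hypothetical and do not come from the specification.

```python
from typing import List


def polycistronic_replication_cassette(include_marker: bool = True) -> List[str]:
    """Return the 5'->3' element order, with E2 placed upstream of E1."""
    elements = [
        "constitutive transcriptional regulatory sequence",
        "E2 coding sequence",
        "IRES",
        "E1 coding sequence",
    ]
    if include_marker:
        # A second IRES separates the last coding sequence from the marker.
        elements += ["IRES", "selectable marker"]
    return elements


def has_preferred_order(elements: List[str]) -> bool:
    """True when both coding sequences are present and E2 lies 5' of E1."""
    try:
        e2 = elements.index("E2 coding sequence")
        e1 = elements.index("E1 coding sequence")
    except ValueError:
        return False
    return e2 < e1
```

Calling `has_preferred_order(polycistronic_replication_cassette())` checks a candidate layout against the stated E2-then-E1 preference before any construct is designed in detail.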
  • the pEHRE vector expression cassette is designed to yield high level expression of a cDNA or genomic DNA (gDNA) sequence.
  • a pEHRE vector expression cassette comprises, from 5' to 3', a transcriptional regulatory sequence, a nucleotide polylinker, an internal ribosome entry site, a mammalian selectable marker and, preferably, either a poly-A site or a transcriptional termination sequence, depending upon the transcriptional regulatory sequence utilized (see below).
  • a cDNA or gDNA sequence can be expressed via operative association within the polylinker.
  • a pEHRE expression vector can contain a single or multiple expression cassettes, such that greater than one cDNA or gDNA sequence can be expressed from the same pEHRE expression vector.
  • the pEHRE vector expression cassette transcriptional regulatory sequence can be either constitutive or inducible, and can be derived from cellular or viral sources.
  • transcriptional regulatory sequences can include, but are not limited to, a retroviral long terminal repeat (LTR), cytomegalovirus (CMV), VA-1 RNA or U6 snRNA promoter sequence, nucleotide sequences of which are well known to those of skill in the art.
  • the expression cassette can contain either a poly-A site (pA) or a transcriptional termination sequence.
  • polII-type transcriptional regulatory sequences can be coupled with pA sites.
  • polIII-type transcriptional regulatory sequences can be coupled with transcriptional termination sequences.
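The pairing rule above (polII-type promoters with a poly-A site, polIII-type promoters with a transcriptional termination sequence) can be expressed as a lookup. This is a minimal sketch under stated assumptions: the dictionary and function names are hypothetical, and the assignment of the LTR to the polII class and of U6 snRNA to the polIII class is supplied from general knowledge rather than from the text.

```python
# Promoter classes; SV40, CMV and PGK are named as polII promoters in the text,
# the remaining assignments are assumptions made for this illustration.
POLYMERASE_CLASS = {
    "SV40": "polII",
    "CMV": "polII",
    "PGK": "polII",
    "LTR": "polII",
    "VA-1 RNA": "polIII",
    "U6 snRNA": "polIII",
}

THREE_PRIME_ELEMENT = {
    "polII": "poly-A site",
    "polIII": "transcriptional termination sequence",
}


def three_prime_element(promoter: str) -> str:
    """Return the 3' element to pair with the given promoter."""
    if promoter not in POLYMERASE_CLASS:
        raise ValueError(f"unknown promoter: {promoter!r}")
    return THREE_PRIME_ELEMENT[POLYMERASE_CLASS[promoter]]
```

For example, `three_prime_element("CMV")` selects a poly-A site, while `three_prime_element("U6 snRNA")` selects a transcriptional termination sequence.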
  • Expression from the transcriptional regulatory sequence yields a polycistronic message comprising the cDNA or gDNA sequence of interest, IRES and mammalian selectable marker.
  • a polycistronic message approach allows a selection scheme which ensures that the cDNA or gDNA of interest has been expressed.
  • the pEHRE vectors further comprise cis-acting elements which function in replication and stable episomal maintenance.
  • Such sequences include: a PV minimal origin of replication (MO) and a PV minichromosomal maintenance element (MME).
  • MO and MME sequences are well known to those of skill in the art. See, e.g., Piirsoo, M. et al., 1996, EMBO J. 15:1-11, which is incorporated herein by reference in its entirety.
  • the term "MO" refers to any nucleotide sequence capable of functioning in PV in the same manner as endogenous MO, i.e., is capable of complementing an MO mutation.
  • Taking BPV as an example, an MO sequence, as described herein, would be one capable of complementing or replacing a BPV MO mutation.
  • MME refers to any nucleotide sequence capable of functioning in PV in the same manner as endogenous MME, i.e., is capable of complementing an MME mutation.
  • an MME sequence can be one containing multiple E2 binding sites.
  • Taking BPV as an example, an MME sequence, as described herein, would be one capable of complementing or replacing a BPV MME mutation.
  • the pEHRE IRES and mammalian and bacterial selectable markers can be, for example, as those described above.
  • the pEHRE expression vectors of the invention can be utilized for the production, including large scale production, of recombinant proteins.
  • the vectors' desirable features, in fact, make them especially amenable to large scale production.
  • current methods of producing recombinant proteins in mammalian cells involve transfection of cells (e.g., CHO, NS/0 cells) and subsequent amplification of the transfected sequence using drugs (e.g., methotrexate or inhibitors of glutamine synthetase).
  • such methods suffer from the fact that amplicons are subject to statistical variation depending on their genomic integration loci, and from the fact that the amplicons are unstable in the absence of continued selection (which is impractical at production scale).
  • the pEHRE vectors, it should be pointed out, achieve levels equal to or higher than these naturally, that is, in the absence of outside selection.
  • the pEHRE vectors give consistently high episomal expression, making them genomic integration-independent. Further, the episomal pEHRE vectors are retained as stable nuclear plasmids even in the absence of selective pressure.
  • pEHRE vectors can be utilized which employ an additional level of such internal, or self, selection (that is, selection which does not depend on the addition of outside selective pressures such as, e.g., drugs).
  • pEHRE vectors can be utilized which complement a defect in the specific producer cell line being utilized for expression.
  • such pEHRE selection elements can complement an auxotrophic mutation or can bypass a growth factor requirement (e.g., proline or insulin, respectively) from the cell media.
  • the coding sequence of the marker is transcribed as part of a polycistronic message along with the coding sequence of the proteins being recombinantly expressed.
  • such an expression selection cassette can comprise, from 5' to 3': a transcriptional regulatory sequence, recombinant protein coding sequence, IRES, selection marker, poly-A site.
  • the episomal pEHRE vectors can further be utilized, for example, in the delivery of large nucleic acid segments, e.g., chromosomal segments.
  • pEHRE vectors can be utilized in connection with bacterial artificial chromosome (BAC) or yeast artificial chromosome (YAC) sequences to allow delivery of large genomic segments (e.g., segments ranging from tens of kilobases to megabases in length).
  • pEHRE vectors can be combined with existing BAC clones to generate pEHRE/BAC hybrid constructs, comprising BACs into which pEHRE vector sequences have been inserted.
  • Such pEHRE/BAC hybrids represent BACs that can replicate in a wide variety of mammalian, including human, cells.
  • pEHRE vectors which can be utilized to donate elements to BACs comprise a pEHRE replication cassette, MO and MME sequences, and a bacterial selectable marker, all flanked by BAC recombination sequences.
  • BAC recombination sequences can include any nucleotide sequence which can be cleaved and then used to recombine with BAC elements so as to incorporate the necessary pEHRE sequences described above. Any recombination site for which a compatible recombination site exists, or is engineered to exist, in the recipient BAC can be used.
  • BAC recombination elements can include, but are not limited to, loxP, mutant loxP or frt sites as described above.
  • CosN sites, whose nucleotide sequences are well known to those of skill in the art, can be utilized. Rather than by a recombinase enzyme, such CosN sites are cleaved by lambda terminase enzyme.
  • For BAC teaching, including CosN teaching, see, e.g., Shizuya, H. et al., 1992, Proc. Natl. Acad. Sci. USA 89:8794-8797; and Kim, U.-J. et al., 1996, Genomics 34:213-218, which are incorporated herein by reference in their entirety.
  • pEHRE vectors and BAC are treated together with the appropriate recombinase or terminase enzyme.
  • a subsequent ligation step is included.
  • Concatamers representing the desired pEHRE/BAC hybrids can be selected for based upon their resistance to both the BAC selectable marker (usually chloramphenicol) and the pEHRE vector selectable marker within the pEHRE region meant to be donated. It is, therefore, desirable that the BAC and pEHRE selectable markers be different.
  • the resulting constructs are further tested to ensure that the second pEHRE bacterial selectable marker is no longer present. Plasmids which have recombined the desired BAC and pEHRE elements will be able to replicate in E. coli, as well as a wide range of mammalian cells, including human cells.
  • the vector termed a pBPV-BacDonor vector represents one embodiment of a pEHRE vector designed to donate essential pEHRE sequences to recipient BAC clones.
  • the vector's recombination elements are depicted as containing loxP and/or CosN sites.
  • the bacterial marker to be incorporated into the pEHRE/BAC hybrid is depicted as tetracycline or kanamycin.
  • the vector contains a pUC bacterial origin (Ori) of replication, an f1 Ori and a second bacterial selectable marker, ampicillin.
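The selection logic for pEHRE/BAC hybrids described above amounts to a three-part test on each candidate clone: resistance to the BAC marker, resistance to the donated pEHRE marker, and loss of the second pEHRE bacterial marker. The sketch below is illustrative only; the function name, its parameters and the default marker choices are assumptions drawn from the examples in the text (chloramphenicol, kanamycin, ampicillin).

```python
from typing import Iterable


def is_desired_hybrid(
    resistances: Iterable[str],
    bac_marker: str = "chloramphenicol",
    donated_marker: str = "kanamycin",
    backbone_marker: str = "ampicillin",
) -> bool:
    """Apply the dual-resistance test plus the loss-of-backbone-marker check."""
    if bac_marker == donated_marker:
        # The text notes that the BAC and pEHRE markers should differ.
        raise ValueError("BAC and donated pEHRE markers must be different")
    observed = set(resistances)
    return (
        bac_marker in observed
        and donated_marker in observed
        and backbone_marker not in observed
    )
```

A clone that is still ampicillin resistant has retained the donor backbone and is rejected, even if it passes the dual-resistance selection.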
  • pEHRE/BAC cloning vectors can be produced and utilized.
  • Such vectors contain the pEHRE replication cassette, MO and MME sequences as described above, the nucleotide sequences necessary for BAC maintenance in E. coli (such sequences are well known to those of skill in the art; see, e.g., Shizuya and Kim, above), and a polylinker site.
  • the vector termed pBPV-BlueBAC represents one embodiment of such a pEHRE/BAC cloning vector.
  • the E1 and E2 coding sequences are BPV sequences, and are in operative association with individual SV40 promoters. E1 is transcribed as part of a polycistronic message along with the selectable marker, hygro.
  • the replication cassette further comprises an SV40 pA site downstream of the IRES-marker.
  • the MO and MME sequences are BPV-derived (in the figure, both of these sequences are illustrated as "BPV origin").
  • the cloning site comprises a polylinker embedded within the alpha complementation fragment of lacZ, which allows blue/white selection of recombinants.
  • T7 and SP6 promoters flank the lacZ sequence, and the vector additionally contains cosN and loxP sites for linearization. The remainder of the elements depicted are present for BAC maintenance in E. coli.
  • Genetic suppressor element (GSE)-producing replication-deficient retroviral vectors are designed to facilitate the expression of antisense GSE single-stranded nucleic acid sequences in mammalian cells, and can, for example, be utilized in conjunction with the antisense-based functional gene inactivation methods of the invention.
  • the GSE-producing retroviral vectors can comprise a replication-deficient retroviral genome containing a proviral excision element, a proviral recovery element and a genetic suppressor element (GSE) cassette.
  • the GSE-producing retroviral vectors can further comprise, (a) a 5' LTR; (b) a 3' LTR; (c) a bacterial Ori; (d) a mammalian selectable marker; (e) a bacterial selectable marker; and (f) a packaging signal.
  • the proviral recovery element, GSE cassette, bacterial Ori, mammalian selectable marker and bacterial selectable marker are located between the 5'LTR and the 3' LTR.
  • the proviral excision element is located within the 3' LTR.
  • the proviral excision element can also flank the functional cassette without being present in the 3' LTR.
  • the 5' LTR, 3' LTR, proviral excision element, bacterial selectable marker, mammalian selectable marker and proviral recovery element are as described above.
  • Each of the GSE cassette embodiments described below can further comprise a sense or antisense cDNA or gDNA fragment or full length sequence operatively associated within the polylinker.
  • the GSE cassette can, for example, comprise, from 5' to 3': (a) a transcriptional regulatory sequence; (b) a polylinker; and (c) a polyadenylation signal.
  • the GSE cassette polyadenylation signal is located within the 3' retroviral long terminal repeat.
  • the GSE cassette can comprise, from 5' to 3': (a) a transcriptional regulatory sequence; (b) a polylinker; (c) a cis-acting ribozyme sequence; (d) an internal ribosome entry site; (e) the mammalian selectable marker; and (f) a polyadenylation signal.
  • a sense GSE can be constructed, in which case the GSE cassette can further comprise a polylinker containing a Kozak consensus methionine in front of the sense-orientation fragments to create a "domain library" for domain and fragment expression.
  • transcription from the transcriptional regulatory sequence produces a bifunctional transcript.
  • the first half, i.e., the portion upstream of the ribozyme sequence
  • the portion downstream of the ribozyme sequence, i.e., the portion containing the selectable marker
  • Such a bicistronic configuration, therefore, directly links selection for the selectable marker to expression of the GSE.
  • the GSE cassette can comprise, from 5' to 3': (a) an RNA polymerase III transcriptional regulatory sequence; (b) a polylinker; and (c) a transcriptional termination sequence.
  • the transcriptional regulatory sequence and transcriptional termination sequence are adenovirus Ad2 VA RNAI transcriptional regulatory and termination sequences.
  • the GSE-producing pEHRE vectors of the invention can comprise a replication cassette, a genetic suppressor element (GSE) cassette and minimal cis-acting elements necessary for replication and stable episomal maintenance.
  • the GSE-producing pEHRE vectors can further comprise at least one bacterial origin of replication and at least one bacterial selectable marker.
  • the replication cassette, minimal cis-acting elements, bacterial origin of replication and bacterial selectable marker are as described above.
  • Each of the GSE cassette embodiments described below can further comprise a sense or antisense cDNA or gDNA fragment or full length sequence operatively associated within the polylinker.
  • the GSE cassette can, for example, comprise, from 5' to 3': (a) a transcriptional regulatory sequence; (b) a polylinker; and (c) a polyadenylation signal.
  • the GSE transcriptional regulatory sequence can be a constitutive or inducible one, and can represent, for example, a retroviral long terminal repeat (LTR), cytomegalovirus (CMV), VA-1 RNA or U6 snRNA promoter sequence, nucleotide sequences of which are well known to those of skill in the art.
  • a pEHRE GSE vector could, for example, be constructed in such a way that the E1 and E2 coding sequences are BPV sequences, and are in operative association with individual SV40 promoters.
  • E1 is transcribed as part of a polycistronic message along with the selectable marker, hygro.
  • the replication cassette further comprises an SV40 pA site downstream of the IRES-marker.
  • the MO and MME sequences are BPV-derived.
  • the vector's GSE cassette comprises a CMV promoter operatively associated with a sequence to be expressed as a GSE, which, in turn, is operatively attached to a bGH poly-A site.
  • the vector contains a pUC bacterial origin (Ori) of replication, an f1 Ori and an ampicillin bacterial selectable marker.
  • the GSE cassette can comprise, from 5' to 3': (a) a transcriptional regulatory sequence; (b) a polylinker; (c) a cis-acting ribozyme sequence; (d) an internal ribosome entry site; (e) the mammalian selectable marker; and (f) a polyadenylation signal.
  • a sense GSE can be constructed, in which case the GSE cassette can further comprise a polylinker containing a Kozak consensus methionine in front of the sense-orientation fragments to create a "domain library" for domain and fragment expression.
  • transcription from the transcriptional regulatory sequence produces a bifunctional transcript.
  • the first half, i.e., the portion upstream of the ribozyme sequence
  • the portion downstream of the ribozyme sequence, i.e., the portion containing the selectable marker
  • Such a bicistronic configuration, therefore, directly links selection for the selectable marker to expression of the GSE.
  • the GSE cassette can comprise, from 5' to 3': (a) an RNA polymerase III transcriptional regulatory sequence; (b) a polylinker; and (c) a transcriptional termination sequence.
  • the transcriptional regulatory sequence and transcriptional termination sequence are adenovirus Ad2 VA RNA transcriptional regulatory and termination sequences.
  • Vectors useful for the display of constrained and unconstrained random peptide sequences are designed to facilitate the selection and identification of random peptide sequences that bind to a protein of interest.
  • the retroviral and pEHRE vectors displaying random peptide sequences of the present invention can comprise: (a) a splice donor site or a LoxP site (e.g., LoxP511 site); (b) a bacterial promoter (e.g., pTac) and a Shine-Dalgarno sequence; (c) a pelB secretion signal for targeting fusion peptides to the periplasm; (d) a splice-acceptor site or another LoxP511 site (LoxP511 sites will recombine with each other, but not with the LoxP site in the 3' LTR); (e) a peptide display cassette or vehicle; (f) an amber stop codon; (g) the M13 bacteriophage
  • a peptide display cassette or vehicle consists of a vector protein, either natural or synthetic, in which a polylinker has been inserted into one flexible loop of the natural or synthetic protein.
  • a library of random oligonucleotides encoding random peptides may be inserted into the polylinker, so that the peptides are expressed on the cell surface.
  • the display vehicle of the vector may be, but is not limited to, thioredoxin for intracellular peptide display in mammalian cells (Colas et al., 1996, Nature 380:548-550) or may be a minibody (Tramontano, 1994, J. Mol. Recognit. 7:9-24) for the display of peptides on the mammalian cell surface.
  • the display vehicle may be extracellular; in this case the minibody could be preceded by a secretion signal and followed by a membrane anchor, such as the one encoded by the last 37 amino acids of DAF-1 (Rice et al., 1992, Proc. Natl. Acad. Sci. 89:5467-5471). This could be flanked by recombinase sites (e.g., FRT sites) to allow the production of secreted proteins following passage of the library through a recombinase-expressing host.
  • these cassettes would reside at the position normally occupied by the cDNA in the sense-expression vectors described above.
  • these vectors would produce a relatively conventional phage display library which could be used exactly as has been previously described for conventional phage display vectors.
  • Recovered phage that display affinity for the selected target would be used to infect bacterial hosts of the appropriate genotype (i.e., expressing the desired recombinases depending upon the cassettes that must be removed for a particular application).
  • any bacterial host would be appropriate (provided that splice sites are used to remove pelB in the mammalian host).
  • For a secreted display, the minibody vector would be passed through bacterial cells that catalyze the removal of the DAF anchor sequence. Plasmids prepared from these bacterial hosts are used to produce virus for assay of specific phenotypes in mammalian cells.
  • Replication-deficient retroviral gene trapping vectors contain reporter sequences which, when integrated into an expressed gene, "tag" the expressed gene, allowing for the monitoring of the gene's expression, for example, in response to a stimulus of interest.
  • the gene trapping vectors of the invention can be used, for example, in conjunction with the gene trapping-based methods of the invention for the identification of mammalian genes which are modulated in response to specific stimuli.
  • the replication-deficient retroviral gene trapping vectors of the invention can comprise: (a) a 5' LTR; (b) a promoterless 3' LTR (a SIN LTR); (c) a bacterial Ori; (d) a bacterial selectable marker; (e) a selective nucleic acid recovery element for recovering nucleic acid containing a nucleic acid sequence from a complex mixture of nucleic acid; (f) a polylinker; (g) a mammalian selectable marker; and (h) a gene trapping cassette.
  • those elements necessary to produce a high titer virus are required. Such elements are well known to those of skill in the art and contain, for example, a packaging signal.
  • the bacterial Ori, bacterial selectable marker, selective nucleic acid recovery element, polylinker, and mammalian selectable marker are located between the 5' LTR and the 3' LTR.
  • the bacterial selectable marker and the bacterial Ori are located in close operative association in order to facilitate nucleic acid recovery, as described below.
  • the gene trapping cassette element is located within the 3' LTR.
  • the 5' LTR, bacterial selectable marker and mammalian selectable marker are as described above.
  • the selective nucleic acid recovery element is as the proviral recovery element described above.
  • the 3' LTR contains the gene trapping cassette and lacks a functional LTR transcriptional promoter.
  • the gene trapping cassette can comprise, from 5' to 3': (a) a nucleic acid sequence encoding at least one stop codon in each reading frame; (b) an internal ribosome entry site; and (c) a reporter sequence.
  • the gene trapping cassette can further comprise, upstream of the stop codon sequences, a transcriptional splice acceptor nucleic acid sequence.
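The requirement that the cassette encode at least one stop codon in each reading frame can be checked mechanically. The sketch below is illustrative only: the function names are hypothetical, the sequence is read on the given strand as DNA, and the standard stop codons TAA, TAG and TGA are assumed.

```python
from typing import Set

STOP_CODONS = {"TAA", "TAG", "TGA"}


def frames_with_stop(seq: str) -> Set[int]:
    """Return the forward reading frames (0, 1, 2) containing an in-frame stop."""
    seq = seq.upper()
    found = set()
    for frame in range(3):
        # Step through complete codons only, starting at the frame offset.
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                found.add(frame)
                break
    return found


def stops_in_all_frames(seq: str) -> bool:
    """True when every forward reading frame is terminated."""
    return frames_with_stop(seq) == {0, 1, 2}
```

For instance, the 11-nucleotide sequence `TAAATAAATAA` places a TAA in each of the three frames, whereas `ATGTAA` terminates frame 0 only.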
  • the inclusion of the IRES sequence in the gene trapping vectors of the present invention offers a key improvement over conventional gene trapping vectors.
  • the IRES sequence allows the vector to land anywhere in the mature message to create a bicistronic transcript; this effectively increases the number of integration sites that will report promoters by a factor of at least 10.
  • Although the vectors of U.S. Pat. No. 6,255,071 are intended for use in mammalian cells, with minor modification most can be adapted for use in other cell types, especially when specific packaging cells are used to generate viruses with a wide spectrum of infection.
  • a Nux coding sequence shall be present in the vector.
  • the Nux coding sequence could be either at the 5'- or 3'-end of the cloning site(s) for source DNA.
  • a normalized library is one constructed in a manner that increases the relative frequency of occurrence of rare clones while decreasing simultaneously the relative frequency of the occurrence of abundant clones.
  • See Soares, M. B. et al., 1994, Proc. Natl. Acad. Sci. USA 91:9228-9232, which is incorporated herein by reference in its entirety.
  • Alternative normalization procedures based upon biotinylated nucleotides may also be utilized.
  • methods for vector construction and protein expression described above and/or provided in the examples are exemplary.
  • yeast complementation screens have been adapted for use in cross-species complementation screens, for example, in yeast for plant (Arabidopsis) genes (Gietz, D. et al., Nucl. Acids Res. 20: 1425, 1992; Schiestl, R.H. and Gietz, R.D., Curr. Genet. 16: 339-346, 1989), the details of which will not be discussed further. Nevertheless, complementation screens in mammalian cells constitute one of the most important aspects of the invention.
  • Such complementation screen methods can include, for example, a method for identification of a nucleic acid sequence whose expression complements a cellular phenotype, comprising: (a) infecting a mammalian cell exhibiting the cellular phenotype with, for example, a retrovirus particle derived from a cDNA or gDNA-containing retroviral vector of the invention, or, alternatively, transfecting such a cell with a pEHRE vector of the invention wherein, depending on the vector, upon infection an integrated retroviral provirus is produced or upon transfection an episomal sequence is established, and the cDNA or gDNA sequence is expressed; and (b) analyzing the cell for the phenotype, so that suppression of the phenotype identifies a nucleic acid sequence which complements the cellular phenotype.
  • when a Nux-fusion protein is expressed in the presence of P-Cub-n-RM, interaction between P and the polypeptide encoded as a Nux-fusion will result in the generation of n-RM, which can then be detected depending on the specific nature of the reporter moiety and the nature of the amino acid n. Phenotypic differences between an uncleaved and cleaved n-RM shall allow selection of cells comprising cleaved n-RM.
  Isolation and characterization of positive clones
  • the vectors used may also facilitate the cloning and further characterization of the encoded polypeptide in the selected cell(s). Such methods utilize the proviral excision and the proviral recovery elements described above.
  • the proviral excision element comprises a loxP recombination site present in two copies within the integrated provirus.
  • the proviral recovery element comprises a lacO site, present in the provirus between the two loxP sites.
  • the loxP sites are cleaved by a Cre recombinase enzyme, yielding an excised provirus which, upon excision, becomes circularized.
  • the excised, circular provirus, which contains the lacO site is recovered from the complex mixture of recipient cell genomic nucleic acid by lac repressor affinity purification. Such an affinity purification is made possible by the fact that the lacO nucleic acid specifically binds to the lac repressor protein.
  • the excised provirus is amplified in order to increase its rescue efficiency.
  • the excised provirus can further comprise an SV40 origin of replication such that in vivo amplification of the excised provirus can be accomplished via delivery of large T antigen. The delivery can be made at the time of recombinase administration, for example.
  • the excised provirus may be recovered by use of a Cre recombinase.
  • the isolated DNA is fragmented to a controlled size.
  • the provirus-containing fragments are isolated via LacO/LacI. Following IPTG elution, circularization of the provirus can be accomplished by treatment with purified recombinase.
  • the person skilled in the art will be able to anticipate other methods to isolate and characterize nucleic acids from selected cells.
  Variegated Peptide Display
  • the variegated peptide libraries of the subject method can be generated by any of a number of methods and preferably, though without limitation, exploit recent trends in the preparation of chemical libraries.
  • the library can be prepared, for example, by either synthetic or biosynthetic approaches, and screened for activity against the D-enantiomer target in a variety of assay formats.
  • "variegated" refers to the fact that a population of peptides is characterized by having a peptide sequence which differs from one member of the library to the next.
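  • for a peptide library $n$ amino acids in length, the total number of different peptide sequences in the library is given by the product $v_1 \times v_2 \times \cdots \times v_{n-1} \times v_n$, where each $v_i$ represents the number of different amino acid residues occurring at position $i$ of the peptide.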
  • the peptide display collectively produces a peptide library including at least 96 to $10^7$ different peptides, so that diverse peptides may be simultaneously assayed for the ability to interact with the target protein.
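The library size described above is the product of the number of residues allowed at each position. A minimal sketch of that arithmetic follows; the function name is hypothetical, and `math.prod` requires Python 3.8 or later.

```python
from math import prod
from typing import Sequence


def library_size(residues_per_position: Sequence[int]) -> int:
    """Multiply the per-position residue counts to get the number of sequences."""
    if any(v < 1 for v in residues_per_position):
        raise ValueError("each position must allow at least one residue")
    return prod(residues_per_position)
```

As a usage example, a pentapeptide with all 20 amino acids allowed at every position gives `library_size([20] * 5)`, i.e. 3,200,000 distinct sequences, while restricting three positions to 2, 3 and 4 residues gives `library_size([2, 3, 4])`, i.e. 24.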
  • Peptide libraries are systems which simultaneously display, in a form which permits interaction with a target protein, a highly diverse and numerous collection of peptides. These peptides may be presented in solution (Houghten, BioTechniques 13: 412-421, 1992), or on beads (Lam, Nature 354: 82-84, 1991), chips (Fodor, Nature 364: 555-556, 1993), bacteria (Ladner US Pat. No. 5,223,409), spores (Ladner US Pat. No.
  • the peptide library is derived to express a combinatorial library of peptides which are not based on any known sequence, nor derived from cDNA.
  • the sequences of the library are largely random. It will be evident that the peptides of the library may range in size from dipeptides to large proteins.
  • the peptide library is derived to express a combinatorial library of peptides which are based at least in part on a known polypeptide sequence or a portion thereof (not a cDNA library). That is, the sequences of the library are semi-random, being derived by combinatorial mutagenesis of a known sequence(s). See, for example, Ladner et al. PCT publication WO 90/02909; Garrard et al., PCT publication WO 92/09690; Marks et al, J. Biol.
  • polypeptide(s) which are known ligands for a target protein can be mutagenized by standard techniques to derive a variegated library of polypeptide sequences which can further be screened for agonists and/or antagonists.
  • the combinatorial polypeptides are produced from a cDNA library.
  • the combinatorial peptides of the library can be generated as is, or can be incorporated into larger fusion proteins.
  • the fusion protein can provide, for example, stability against degradation or denaturation, as well as a secretion signal if secreted.
  • the polypeptide library is provided as part of thioredoxin fusion proteins (see, for example, U.S. Patent Nos. 5,270,181 and 5,292,646; and PCT publication WO94/02502).
  • the combinatorial peptide can be attached on the terminus of the thioredoxin protein, or, for short peptide libraries, inserted into the so-called active loop.
  • the combinatorial polypeptides are in the range of 3-100 amino acids in length, more preferably at least 5-50, and even more preferably at least 10, 13, 15, 20 or 25 amino acid residues in length.
  • the polypeptides of the library are of uniform length. It will be understood that the length of the combinatorial peptide does not reflect any extraneous sequences which may be present in order to facilitate expression, such as signal sequences or invariant portions of a fusion protein.
  • the harnessing of biological systems for the generation of peptide diversity is now a well established technique which can be exploited to generate the peptide libraries of the subject method.
  • the source of diversity is the combinatorial chemical synthesis of mixtures of oligonucleotides. Oligonucleotide synthesis is a well-characterized chemistry that allows tight control of the composition of the mixtures created. Degenerate DNA sequences produced are subsequently placed into an appropriate genetic context for expression as peptides.
  • the DNAs are synthesized a base at a time.
  • a suitable mixture of nucleotides is reacted with the nascent DNA, rather than the pure nucleotide reagent of conventional polynucleotide synthesis.
  • the second method provides more exact control over the amino acid variation.
  • trinucleotide reagents are prepared, each trinucleotide being a codon of one (and only one) of the amino acids to be featured in the peptide library.
  • a mixture is made of the appropriate trinucleotides and reacted with the nascent DNA.
  • Once the necessary "degenerate" DNA is complete, it must be joined with the DNA sequences necessary to assure the expression of the peptide, as discussed in more detail below, and the complete DNA construct must be introduced into the cell.
  • chemical synthesis of a degenerate gene sequence can be carried out in an automatic DNA synthesizer, and the synthetic genes can then be ligated into an appropriate gene for expression.
  • the purpose of a degenerate set of genes is to provide, in one mixture, all of the sequences encoding the desired set of potential test peptide sequences.
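  • As a rough illustration of how a degenerate codon mixture translates into library diversity, the following minimal sketch counts the distinct DNA sequences and the distinct peptides encoded by a run of identical degenerate codons. The NNK scheme (N = A/C/G/T, K = G/T) and the function names are assumptions chosen for illustration; they do not appear in the text above.

```python
from itertools import product

# Standard genetic code as a compact 64-character table
# (base order T, C, A, G at each of the three codon positions).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC ambiguity codes needed for the assumed NNK scheme.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def expand_codon(degenerate_codon):
    """List every concrete codon covered by a degenerate codon such as 'NNK'."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in degenerate_codon))]


def library_size(degenerate_codon, positions):
    """Return (number of DNA sequences, number of stop-free peptides)."""
    codons = expand_codon(degenerate_codon)
    residues = {CODON_TABLE[c] for c in codons} - {"*"}  # drop stop codons
    return len(codons) ** positions, len(residues) ** positions


if __name__ == "__main__":
    dna, peptides = library_size("NNK", 5)
    print(f"5 NNK codons: {dna} DNA sequences, {peptides} peptides")
```

  • With trinucleotide reagents, each reagent is a codon of exactly one amino acid, so the number of DNA sequences and the number of peptides coincide; the base-at-a-time mixture modelled here produces redundant codons and an occasional stop codon.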
  • a variegated peptide library can be expressed by a population of display packages to form a peptide display library.
  • With respect to the display package on which the variegated peptide library is manifest, it will be appreciated from the discussion provided herein that the display package will often preferably be able to be (i) genetically altered to encode a test peptide, (ii) maintained and amplified in culture, (iii) manipulated to display the peptide in a manner permitting the peptide to interact with a target protein during an affinity separation step, and (iv) affinity separated while retaining the peptide-encoding gene such that the sequence of the peptide can be obtained.
  • the display remains viable after affinity separation.
  • the display package comprises a system that allows the sampling of very large variegated peptide display libraries, rapid sorting after each affinity separation round, and easy isolation of the peptide-encoding gene from purified display packages.
  • the most attractive candidates for this type of screening are prokaryotic organisms and viruses, as they can be amplified quickly, they are relatively easy to manipulate, and a large number of clones can be created.
  • Preferred display packages include, for example, vegetative bacterial cells, bacterial spores, and most preferably, bacterial viruses (especially DNA viruses).
  • the present invention also contemplates the use of eukaryotic cells, including yeast and their spores, as potential display packages.
  • kits for generating phage display libraries, e.g. the Pharmacia Recombinant Phage Peptide System, catalog no. 27-9400-01; and the Stratagene SurfZAP™ phage display kit, catalog no. 240612
  • methods and reagents particularly amenable for use in generating the variegated peptide display library of the present method can be found in, for example, the Ladner et al. U.S. Patent No. 5,223,409; the Kang et al. International Publication No. WO 92/18619; the Dower et al. International Publication No. WO 91/17271; the Winter et al.
  • an important criterion for the present selection method can be that it is able to discriminate between peptides of different affinity for a particular target, and preferentially enrich for the peptides of highest affinity.
  • manipulating the display package to be rendered effectively monovalent can allow affinity enrichment to be carried out for generally higher binding affinities (i.e. binding constants in the range of 10⁶ to 10¹⁰ M⁻¹) as compared to the broader range of affinities isolable using a multivalent display package.
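  • To put the quoted range of binding constants in more familiar terms, the sketch below converts an association constant into a dissociation constant and estimates the fraction of displayed peptide occupied at a given free target concentration. It assumes simple 1:1 equilibrium binding with the target in excess, which is a simplification not stated in the text; the function names are hypothetical.

```python
def dissociation_constant(ka_per_molar):
    """Kd (in M) is the reciprocal of the association constant Ka (in 1/M)."""
    return 1.0 / ka_per_molar


def fraction_bound(ka_per_molar, free_target_molar):
    """Occupancy of a monovalent binder under an assumed 1:1 equilibrium model."""
    kd = dissociation_constant(ka_per_molar)
    return free_target_molar / (kd + free_target_molar)


if __name__ == "__main__":
    # Compare the two ends of the quoted range at 10 nM free target.
    for ka in (1e6, 1e10):
        print(f"Ka = {ka:.0e} 1/M -> Kd = {dissociation_constant(ka):.0e} M, "
              f"bound at 10 nM = {fraction_bound(ka, 1e-8):.3f}")
```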
  • the natural (i.e. wild-type) form of the surface or coat protein used to anchor the peptide to the display can be added at a high enough level that it almost entirely eliminates inclusion of the peptide fusion protein in the display package.
  • the library of display packages will comprise no more than 5 to 10% polyvalent displays, and more preferably no more than 2% of the display will be polyvalent, and most preferably, no more than 1% polyvalent display packages in the population.
  • the source of the wild-type anchor protein can be, for example, provided by a copy of the wild-type gene present on the same construct as the peptide fusion protein, or provided by a separate construct altogether.
  • the library is comprised of a variegated pool of nucleic acids, e.g. single or double-stranded DNA or RNA.
  • a variety of techniques are known in the art for generating screenable nucleic acid libraries which may be exploited in the present invention.
  • many of the techniques described above for synthetic peptide libraries can be used to generate nucleic acid libraries of a variety of formats. For example, divide-couple-recombine techniques can be used in conjunction with standard nucleic acid synthesis techniques to generate bead immobilized nucleic acid libraries.
  • solution libraries of nucleic acids can be generated which rely on PCR techniques to amplify for sequencing those nucleic acid molecules which selectively bind the screening target.
  • the SELEX (systematic evolution of ligands by exponential enrichment)
  • the enantiomeric screening target. See, for example, Tuerk et al., Science 249: 505-510, 1990, for a review of SELEX.
  • a pool of variant nucleic acid sequences is created, e.g. as a random or semi-random library.
  • an invariant 3' and (optionally) 5' primer sequence are provided for use with PCR anchors or for permitting subcloning.
  • the nucleic acid library is applied to screening a target, and nucleic acids which selectively bind (or otherwise act on the target) are isolated from the pool. The isolates are amplified by PCR and subcloned into, for example, phagemids. The phagemids are then transfected into bacterial cells, and individual isolates can be obtained and the sequence of the nucleic acid cloned from the screening pool can be determined.
  • RNA is the test ligand
  • the RNA library can be directly synthesized by standard organic chemistry, or can be provided by in vitro transcription as described by Tuerk et al., supra.
  • RNA isolated by binding to the screening target can be reverse transcribed and the resulting cDNA subcloned and sequenced as above.
  • Exemplary combinatorial libraries include benzodiazepines, peptoids, biaryls and hydantoins.
  • the subject method is envisaged with a variety of detection methods for isolating and identifying compounds which interact with the screening target.
  • the screening programs which test libraries of compounds will be derived for high throughput analysis in order to maximize the number of compounds surveyed in a given period of time.
  • the screening portion of the subject method involves contacting the screening target with the compound library and isolating those compounds from the library which interact with the screening target.
  • Such interaction may be detected, for example, based on directly detecting the binding of the compounds to the screening target, or inferred through the modulation of interactions involving the screening target with other molecules, such as protein-protein or protein-DNA interaction involving the screening target or modulation of an enzymatic/catalytic activity of the screening target.
  • the efficacy of the test compounds can be assessed by generating dose response curves from data obtained using various concentrations of the test compound.
  • a control assay can also be performed to provide a baseline for comparison.
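  • One way such a dose response curve might be summarized is by estimating the half-maximal concentration. The sketch below normalizes signals to the control assay and interpolates on a log-concentration axis; the interpolation approach, the function names and the example numbers are illustrative assumptions, not a procedure given in the text.

```python
import math


def percent_of_control(signal, control_signal):
    """Express an assay signal relative to the baseline control assay."""
    return 100.0 * signal / control_signal


def estimate_ec50(concentrations, responses):
    """Interpolate the concentration giving the half-maximal response.

    Works on a log10 concentration axis between the two data points that
    bracket the midpoint of the observed response range. Returns None if
    no bracketing pair is found.
    """
    half = (min(responses) + max(responses)) / 2.0
    points = sorted(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo != r_hi and (r_lo - half) * (r_hi - half) <= 0:
            frac = (half - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


if __name__ == "__main__":
    # Hypothetical data: molar concentrations and raw signals for an inhibitor.
    conc = [1e-9, 1e-8, 1e-7, 1e-6]
    raw = [200.0, 180.0, 20.0, 0.0]
    resp = [percent_of_control(s, 200.0) for s in raw]
    print(f"Estimated half-maximal concentration: {estimate_ec50(conc, resp):.2e} M")
```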
  • Complex formation between a test compound and a screening target may be directly detected by a variety of techniques.
  • the complexes can be scored for using, for example, detectable labeled compounds or screening targets, such as radiolabeled, fluorescently labeled, or enzymatically labeled polypeptides, by immunoassay, or by chromatographic detection.
  • the variegated compound library is subjected to affinity enrichment in order to select for compounds which bind a preselected screening target.
  • “affinity separation” or “affinity enrichment” includes, but is not limited to, (1) affinity chromatography utilizing immobilized screening targets, (2) precipitation using screening targets, (3) fluorescence activated cell sorting where the compound library is so amenable, (4) agglutination, and (5) plaque lifts.
  • the library of compounds is ultimately separated based on the ability of a particular compound to bind a screening target of interest. See, for example, the Ladner et al. U.S. Patent No. 5,223,409; the Kang et al. International Publication No.
  • With respect to affinity chromatography, it will be generally understood by those skilled in the art that a great number of chromatography techniques can be adapted for use in the present invention, ranging from column chromatography to batch elution, and including ELISA and reverse biopanning techniques.
  • the screening target is immobilized on an insoluble carrier, such as sepharose or polyacrylamide beads, or, alternatively, the wells of a microtitre plate.
  • the population of compounds is applied to the affinity matrix under conditions compatible with the binding of compounds in the library to the immobilized screening target.
  • the population is then fractionated by washing with a solute that does not greatly affect specific binding of compounds to the screening target, but which substantially disrupts any non-specific binding of components of the library to the screening target or matrix.
  • a certain degree of control can be exerted over the binding characteristics of the compounds recovered from the library by adjusting the conditions of the binding incubation and subsequent washing.
  • the temperature, pH, ionic strength, divalent cation concentration, and the volume and duration of the washing can select for compounds within a particular range of affinity and specificity. Selection based on slow dissociation rate, which is usually predictive of high affinity, is a very practical route. This may be done either by continued incubation in the presence of a saturating amount of free screening target, or by increasing the volume, number, and length of the washes. In each case, the rebinding of dissociated compounds from the applied library is prevented, and with increasing time, compounds of higher and higher affinity are recovered.
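  • The effect of wash duration on off-rate selection can be sketched with a first-order dissociation model. Assuming rebinding is fully prevented, as described above, the fraction of a compound still bound after a wash of length t is exp(-k_off · t). The rate constants and function names below are hypothetical values chosen for illustration.

```python
import math


def fraction_still_bound(k_off_per_s, wash_seconds):
    """First-order dissociation with no rebinding: exp(-k_off * t)."""
    return math.exp(-k_off_per_s * wash_seconds)


def enrichment(k_off_slow, k_off_fast, wash_seconds):
    """Ratio of retained slow-dissociating to fast-dissociating compound."""
    return (fraction_still_bound(k_off_slow, wash_seconds)
            / fraction_still_bound(k_off_fast, wash_seconds))


if __name__ == "__main__":
    # Hypothetical off-rates (1/s) for a tight and a weak binder.
    for minutes in (1, 10, 30):
        t = 60 * minutes
        print(f"{minutes:>2} min wash: enrichment of slow over fast = "
              f"{enrichment(1e-4, 1e-2, t):.3g}")
```

  • In this simplified model, longer washes increase the enrichment exponentially, at the cost of also losing some of the tightest binders.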
  • affinities of some compounds may be dependent on ionic strength or cation concentration.
  • Specific examples are peptides which depend on Ca²⁺ or other ions for binding activity and which release from the screening target in the presence of a chelating agent such as EGTA (see Hopp et al., Biotechnology 6: 1204-1210, 1988).
  • Such peptides may be identified in the compound library by a double screening technique isolating first those that bind the screening target in the presence of Ca²⁺, and by subsequently identifying those in this group that fail to bind in the presence of EGTA.
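  • The double screening logic amounts to a set difference between the two screens. A minimal sketch, using hypothetical clone identifiers:

```python
def calcium_dependent_binders(bound_with_calcium, bound_with_egta):
    """Clones that bind in the presence of Ca2+ but fail to bind with EGTA."""
    return sorted(set(bound_with_calcium) - set(bound_with_egta))


if __name__ == "__main__":
    # Hypothetical clone identifiers recovered from each screen.
    first_screen = ["clone_01", "clone_02", "clone_07", "clone_11"]
    second_screen = ["clone_02", "clone_11"]
    print(calcium_dependent_binders(first_screen, second_screen))
```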
  • specifically bound compounds can be eluted by either specific desorption (using excess screening target) or non-specific desorption (using pH, polarity reducing agents, or chaotropic agents).
  • the elution protocol does not kill the organism used as the display package such that the enriched population of display packages can be further amplified by reproduction.
  • the list of potential eluants includes salts (such as those in which one of the counter ions is Na⁺, NH₄⁺, Rb⁺, SO₄²⁻, H₂PO₄⁻, citrate, K⁺, Li⁺, Cs⁺, HSO₄⁻, CO₃²⁻, Ca²⁺, Sr²⁺, Cl⁻, PO₄³⁻, HCO₃⁻, Mg²⁺, Ba²⁺, Br⁻, HPO₄²⁻, or acetate), acid, heat, and, when available, soluble forms of the target antigen (or analogs thereof).
  • buffer components especially eluates
  • Neutral solutes such as ethanol, acetone, ether, or urea, are examples of other agents useful for eluting the bound display packages.
  • affinity enriched packages or nucleic acids are iteratively amplified and subjected to further rounds of affinity separation until enrichment of the desired binding activity is detected.
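  • The iterative amplify-and-select cycle can be sketched as a loop that tracks the binder fraction from round to round. The model assumes fixed per-round recovery probabilities for binders and background and that amplification preserves proportions; these parameters, the threshold and the function name are illustrative assumptions.

```python
def enrichment_history(initial_fraction, binder_recovery, background_recovery,
                       threshold=0.9, max_rounds=20):
    """Binder fraction after each round, stopping once the threshold is reached."""
    fraction = initial_fraction
    history = [fraction]
    for _ in range(max_rounds):
        if fraction >= threshold:
            break
        binders = fraction * binder_recovery
        background = (1.0 - fraction) * background_recovery
        # Amplification is assumed to keep the binder/background proportion.
        fraction = binders / (binders + background)
        history.append(fraction)
    return history


if __name__ == "__main__":
    # One binder per million packages, recovered 1000-fold better than background.
    for round_number, frac in enumerate(enrichment_history(1e-6, 0.5, 0.0005)):
        print(f"after round {round_number}: binder fraction = {frac:.6f}")
```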
  • the specifically bound biological display packages, especially bacterial cells, need not be eluted per se, but rather, the matrix bound display packages can be used directly to inoculate a suitable growth media for amplification.
  • the fusion protein generated with the coat protein can interfere substantially with the subsequent amplification of eluted phage particles, particularly in embodiments wherein the cpIII protein is used as the display anchor.
  • the peptide can be derived on the surface of the display package so as to be susceptible to proteolytic cleavage which severs the covalent linkage of at least the antigen binding sites of the displayed peptide from the remaining package.
  • such a strategy can be used to obtain infectious phage by treatment with an enzyme which cleaves between the peptide portion and cpIII portion of a tail fiber fusion protein (e.g., through the use of an enterokinase cleavage recognition sequence).
  • DNA prepared from the eluted phage can be transformed into host cells by electroporation or well known chemical means.
  • the cells are cultivated for a period of time sufficient for marker expression, and selection is applied as typically done for DNA transformation.
  • the colonies are amplified, and phage harvested for a subsequent round(s) of panning.
  • the nucleic acid encoding the peptide for each of the purified display packages can be recloned in a suitable eukaryotic or prokaryotic expression vector and transfected into an appropriate host for production of large amounts of protein.
  • the isolated peptides are identified either directly from the display, e.g., by direct microsequencing, or the display packages are appropriately decoded, e.g., by elucidating the identity of an associated tag/index. Deconvolution techniques are also known in the art.
  • compound libraries can be fractionated based on other activities of the target molecule, such as modulation of catalytic activity.
  • the intrapolypeptide split-ubiquitin therapeutic formulations used in the method of the invention are most preferably applied in the form of appropriate compositions.
  • As appropriate compositions there may be cited all compositions usually employed for systemically or topically administering drugs.
  • the pharmaceutically acceptable carrier should be substantially inert, so as not to react with the active component. Suitable inert carriers include water, alcohol, polyethylene glycol, mineral oil or petroleum gel, propylene glycol and the like.
  • Knock out mice are generated by homologous integration of a "knock out" construct into a mouse embryonic stem cell chromosome which encodes the gene to be knocked out.
  • gene targeting, which is a method of using homologous recombination to modify an animal's genome, can be used to introduce changes into cultured embryonic stem cells. By targeting a target gene of interest in ES cells, these changes can be introduced into the germlines of animals to generate chimeras.
  • the gene targeting procedure is accomplished by introducing into tissue culture cells a DNA targeting construct that includes a segment homologous to a target gene locus, and which also includes an intended sequence modification to the target genomic sequence (e.g., insertion, deletion, point mutation). The treated cells are then screened for accurate targeting to identify and isolate those which have been properly targeted.
  • Gene targeting in embryonic stem cells is in fact a scheme contemplated by the present invention as a means for disrupting a target gene function through the use of a targeting transgene construct designed to undergo homologous recombination with one or more target genomic sequences.
  • the targeting construct can be arranged so that, upon recombination with an element of a target gene, a positive selection marker is inserted into (or replaces) coding sequences of the gene.
  • the inserted sequence functionally disrupts the target gene, while also providing a positive selection trait.
  • Exemplary target gene targeting constructs are described in more detail below.
  • the embryonic stem cells (ES cells) used to produce the knockout animals will be of the same species as the knockout animal to be generated.
  • mouse embryonic stem cells will usually be used for generation of knockout mice.
  • Embryonic stem cells are generated and maintained using methods well known to the skilled artisan such as those described by Doetschman et al. (J. Embryol. Exp. Morphol. 87: 27-45, 1985). Any line of ES cells can be used; however, the line chosen is typically selected for the ability of the cells to integrate into and become part of the germ line of a developing embryo so as to create germ line transmission of the knockout construct. Thus, any ES cell line that is believed to have this capability is suitable for use herein.
  • One mouse strain that is typically used for production of ES cells is the 129J strain.
  • Another ES cell line is murine cell line D3 (American Type Culture Collection, catalog no.
  • Still another preferred ES cell line is the WW6 cell line (Ioffe et al., PNAS 92: 7357-7361, 1995).
  • the cells are cultured and prepared for knockout construct insertion using methods well known to the skilled artisan, such as those set forth by Robertson (in: Teratocarcinomas and Embryonic Stem Cells: A Practical Approach, E.J. Robertson, ed., IRL Press, Washington, D.C., 1987); by Bradley et al. (Current Topics in Devel. Biol. 20: 357-371, 1986); and by Hogan et al. (Manipulating the Mouse Embryo: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1986).
  • a knock out construct refers to a uniquely configured fragment of nucleic acid which is introduced into a stem cell line and allowed to recombine with the genome at the chromosomal locus of the gene of interest to be mutated.
  • a given knock out construct is specific for a given gene to be targeted for disruption. Nonetheless, many common elements exist among these constructs and these elements are well known in the art.
  • a typical knock out construct contains nucleic acid fragments of not less than about 0.5 kb nor more than about 10.0 kb from both the 5' and the 3' ends of the genomic locus which encodes the gene to be mutated.
  • nucleic acid which encodes a positive selectable marker, such as the neomycin resistance gene (neoᴿ).
  • the resulting nucleic acid fragment, consisting of a nucleic acid from the extreme 5' end of the genomic locus linked to a nucleic acid encoding a positive selectable marker which is in turn linked to a nucleic acid from the extreme 3' end of the genomic locus of interest, omits most of the coding sequence for the target gene or other gene of interest to be knocked out.
  • When the resulting construct recombines homologously with the chromosome at this locus, it results in the loss of the omitted coding sequence, otherwise known as the structural gene, from the genomic locus.
  • a stem cell in which such a rare homologous recombination event has taken place can be selected for by virtue of the stable integration into the genome of the nucleic acid of the gene encoding the positive selectable marker and subsequent selection for cells expressing this marker gene in the presence of an appropriate drug (neomycin in this example).
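  • The arrangement of a typical knock out construct (5' arm, positive selectable marker, 3' arm) and its effect on the locus can be sketched as a small data structure. The class, the string-based model of recombination and the toy sequences are illustrative assumptions; only the arm length range of about 0.5 kb to about 10.0 kb is taken from the text.

```python
from dataclasses import dataclass

MIN_ARM_BP = 500      # about 0.5 kb
MAX_ARM_BP = 10_000   # about 10.0 kb


@dataclass
class KnockOutConstruct:
    five_prime_arm: str
    positive_marker: str
    three_prime_arm: str

    def arms_within_range(self):
        """Check both homology arms against the quoted length range."""
        return all(MIN_ARM_BP <= len(arm) <= MAX_ARM_BP
                   for arm in (self.five_prime_arm, self.three_prime_arm))

    def sequence(self):
        return self.five_prime_arm + self.positive_marker + self.three_prime_arm


def recombine(genomic_locus, construct):
    """Toy model of homologous recombination by exact string matching.

    The region between the two arms is replaced by the marker; the locus is
    returned unchanged if either arm is not found.
    """
    start = genomic_locus.find(construct.five_prime_arm)
    if start == -1:
        return genomic_locus
    arm_end = start + len(construct.five_prime_arm)
    end = genomic_locus.find(construct.three_prime_arm, arm_end)
    if end == -1:
        return genomic_locus
    tail = genomic_locus[end + len(construct.three_prime_arm):]
    return genomic_locus[:start] + construct.sequence() + tail


if __name__ == "__main__":
    construct = KnockOutConstruct("A" * 500, "NEO", "C" * 500)
    locus = "G" * 10 + "A" * 500 + "TTGTTT" + "C" * 500 + "G" * 10
    targeted = recombine(locus, construct)
    print(construct.arms_within_range(), "TTGTTT" in targeted, "NEO" in targeted)
```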
  • a "knock-in" construct refers to the same basic arrangement of a nucleic acid encoding a 5' genomic locus fragment linked to nucleic acid encoding a positive selectable marker which in turn is linked to a nucleic acid encoding a 3' genomic locus fragment, but which differs in that none of the coding sequence is omitted and thus the 5' and the 3' genomic fragments used were initially contiguous before being disrupted by the introduction of the nucleic acid encoding the positive selectable marker gene.
  • This "knock-in" type of construct is thus very useful for the construction of mutant transgenic animals when only a limited region of the genomic locus of the gene to be mutated, such as a single exon, is available for cloning and genetic manipulation.
  • the "knock-in" construct can be used to specifically eliminate a single functional domain of the targeted gene, resulting in a transgenic animal which expresses a polypeptide of the targeted gene which is defective in one function, while retaining the function of other domains of the encoded polypeptide.
  • This type of "knock-in" mutant frequently has the characteristic of a so-called "dominant negative" mutant because, especially in the case of proteins which homomultimerize, it can specifically block the action of (or "poison") the polypeptide product of the wild-type gene from which it was derived.
  • a marker gene is integrated at the genomic locus of interest such that expression of the marker gene comes under the control of the transcriptional regulatory elements of the targeted gene.
  • Where the marker gene is one that encodes an enzyme whose activity can be detected (e.g., β-galactosidase), the enzyme substrate can be added to the cells under suitable conditions, and the enzymatic activity can be analyzed.
  • One skilled in the art will be familiar with other useful markers and the means for detecting their presence in a given cell. All
  • homologous recombination of the above described "knock out" and "knock in" constructs is very rare and frequently such a construct inserts nonhomologously into a random region of the genome where it has no effect on the gene which has been targeted for deletion, and where it can potentially recombine so as to disrupt another gene which was otherwise not intended to be altered.
  • Such nonhomologous recombination events can be selected against by modifying the above-mentioned knock out and knock in constructs so that they are flanked by negative selectable markers at either end (particularly through the use of two allelic variants of the thymidine kinase gene, the polypeptide product of which can be selected against in expressing cell lines in an appropriate tissue culture medium well known in the art, i.e. one containing a drug such as 5-bromodeoxyuridine).
  • a preferred embodiment of such a knock out or knock in construct of the invention consists of a nucleic acid encoding a negative selectable marker linked to a nucleic acid encoding a 5' end of a genomic locus linked to a nucleic acid of a positive selectable marker which in turn is linked to a nucleic acid encoding a 3' end of the same genomic locus which in turn is linked to a second nucleic acid encoding a negative selectable marker.
  • Nonhomologous recombination between the resulting knock out construct and the genome will usually result in the stable integration of one or both of these negative selectable marker genes and hence cells which have undergone nonhomologous recombination can be selected against by growth in the appropriate selective media (e.g.
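  • The combined positive and negative selection described above reduces to a simple rule: a cell survives only if it carries the positive selectable marker and none of the flanking negative selectable markers. A minimal sketch, with hypothetical event labels and function names:

```python
def survives_selection(has_positive_marker, negative_marker_copies):
    """True if the cell passes both positive and negative selection."""
    return has_positive_marker and negative_marker_copies == 0


def classify(events):
    """Map each labelled integration event to its selection outcome."""
    return {label: survives_selection(positive, negatives)
            for label, (positive, negatives) in events.items()}


if __name__ == "__main__":
    # (positive marker present, number of negative marker copies integrated)
    events = {
        "homologous recombination": (True, 0),
        "nonhomologous, one flanking marker": (True, 1),
        "nonhomologous, both flanking markers": (True, 2),
        "no integration": (False, 0),
    }
    for label, survives in classify(events).items():
        print(f"{label}: {'survives' if survives else 'eliminated'}")
```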
  • If the knockout construct is inserted into a vector (described infra), linearization is accomplished by digesting the DNA with a suitable restriction endonuclease selected to cut only within the vector sequence and not within the knockout construct sequence.
  • the knockout construct is added to the ES cells under appropriate conditions for the insertion method chosen, as is known to the skilled artisan. For example, if the ES cells are to be electroporated, the ES cells and knockout construct DNA are exposed to an electric pulse using an electroporation machine and following the manufacturer's guidelines for use. After electroporation, the ES cells are typically allowed to recover under suitable incubation conditions. The cells are then screened for the presence of the knock out construct as explained above. Where more than one construct is to be introduced into the ES cell, each knockout construct can be introduced simultaneously or one at a time.
  • the cells can be inserted into an embryo. Insertion may be accomplished in a variety of ways known to the skilled artisan; however, a preferred method is by microinjection. For microinjection, about 10-30 cells are collected into a micropipet and injected into embryos that are at the proper stage of development to permit integration of the foreign ES cell containing the knockout construct into the developing embryo. For instance, the transformed ES cells can be microinjected into blastocysts. The suitable stage of development for the embryo used for insertion of ES cells is very species dependent; however, for mice it is about 3.5 days. The embryos are obtained by perfusing the uterus of pregnant females.
  • Suitable methods for accomplishing this are known to the skilled artisan, and are set forth by, e.g., Bradley et al. (supra). While any embryo of the right stage of development is suitable for use, preferred embryos are male. In mice, the preferred embryos also have genes coding for a coat color that is different from the coat color encoded by the ES cell genes. In this way, the offspring can be screened easily for the presence of the knockout construct by looking for mosaic coat color (indicating that the ES cell was incorporated into the developing embryo). Thus, for example, if the ES cell line carries the genes for white fur, the embryo selected will carry genes for black or brown fur.
  • the embryo may be implanted into the uterus of a pseudopregnant foster mother for gestation. While any foster mother may be used, the foster mother is typically selected for her ability to breed and reproduce well, and for her ability to care for the young. Such foster mothers are typically prepared by mating with vasectomized males of the same species.
  • the stage of the pseudopregnant foster mother is important for successful implantation, and it is species dependent. For mice, this stage is about 2-3 days pseudopregnant.
  • Offspring that are born to the foster mother may be screened initially for mosaic coat color where the coat color selection strategy (as described above, and in the appended examples) has been employed.
  • DNA from tail tissue of the offspring may be screened for the presence of the knockout construct using Southern blots and/or PCR as described above. Offspring that appear to be mosaics may then be crossed to each other, if they are believed to carry the knockout construct in their germ line, in order to generate homozygous knockout animals.
  • Homozygotes may be identified by Southern blotting of equivalent amounts of genomic DNA from mice that are the product of this cross, as well as mice that are known heterozygotes and wild type mice.
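  • The genotype proportions expected from such a cross can be enumerated with a simple Punnett square. The sketch assumes both parents transmit as heterozygotes at a single autosomal locus, which the text does not state explicitly; "+" denotes the wild-type allele and "-" the knockout allele.

```python
from collections import Counter
from itertools import product


def cross(parent_a, parent_b):
    """Expected genotype frequencies from a single-locus cross.

    Each parent is given as a two-character string of alleles, e.g. "+-".
    """
    offspring = Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(offspring.values())
    return {genotype: count / total for genotype, count in offspring.items()}


if __name__ == "__main__":
    for genotype, frequency in cross("+-", "+-").items():
        print(f"{genotype}: {frequency:.2f}")
```

  • Under these assumptions about one quarter of the offspring are expected to be homozygous for the knockout allele, which is why known heterozygotes and wild type mice are useful as comparison lanes on the Southern blot.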
  • Northern blots can be used to probe the mRNA for the presence or absence of transcripts encoding either the gene knocked out, the marker gene, or both.
  • Western blots can be used to assess the level of expression of the MFGF gene knocked out in various tissues of the offspring by probing the Western blot with an antibody against the particular MFGF protein, or an antibody against the marker gene product, where this gene is expressed.
  • in situ analysis such as fixing the cells and labeling with antibody
  • FACS (fluorescence activated cell sorting)
  • knock-out or disruption transgenic animals are also generally known. See, for example, Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986).
  • Recombinase dependent knockouts can also be generated, e.g. by homologous recombination to insert target sequences, such that tissue specific and/or temporal inactivation of a target gene can be controlled by recombinase sequences (described infra).
  • Animals containing more than one knockout construct and/or more than one transgene expression construct are prepared in any of several ways.
  • the preferred manner of preparation is to generate a series of mammals, each containing one of the desired transgenic phenotypes. Such animals are bred together through a series of crosses, backcrosses and selections, to ultimately generate a single animal containing all desired knockout constructs and/or expression constructs, where the animal is otherwise congenic (genetically identical) to the wild type except for the presence of the knockout construct(s) and/or transgene(s).
  • a target transgene can encode the wild-type form of the protein, or can encode homologs thereof, including both agonists and antagonists, as well as antisense constructs.
  • the expression of the transgene is restricted to specific subsets of cells, tissues or developmental stages utilizing, for example, cis-acting sequences that control expression in the desired pattern.
  • mosaic expression of a target gene protein can be essential for many forms of lineage analysis and can additionally provide a means to assess the effects of, for example, lack of target gene expression which might grossly alter development in small patches of tissue within an otherwise normal embryo.
  • tissue-specific regulatory sequences and conditional regulatory sequences can be used to control expression of the transgene in certain spatial patterns.
  • temporal patterns of expression can be provided by, for example, conditional recombination systems or prokaryotic transcriptional regulatory sequences.
  • target sequence refers to a nucleotide sequence that is genetically recombined by a recombinase.
  • the target sequence is flanked by recombinase recognition sequences and is generally either excised or inverted in cells expressing recombinase activity.
  • Recombinase catalyzed recombination events can be designed such that recombination of the target sequence results in either the activation or repression of expression of one of the subject target gene proteins.
  • excision of a target sequence which interferes with the expression of a recombinant target gene such as one which encodes an antagonistic homolog or an antisense transcript, can be designed to activate expression of that gene.
  • This interference with expression of the protein can result from a variety of mechanisms, such as spatial separation of the target gene from the promoter element or an internal stop codon.
  • the transgene can be made wherein the coding sequence of the gene is flanked by recombinase recognition sequences and is initially transfected into cells in a 3' to 5' orientation with respect to the promoter element.
  • inversion of the target sequence will reorient the subject gene by placing the 5' end of the coding sequence in an orientation with respect to the promoter element which allows for promoter driven transcriptional activation.
  • the transgenic animals of the present invention all include within a plurality of their cells a transgene of the present invention, which transgene alters the phenotype of the "host cell" with respect to regulation of cell growth, death and/or differentiation. Since it is possible to produce transgenic organisms of the invention utilizing one or more of the transgene constructs described herein, a general description will be given of the production of transgenic organisms by referring generally to exogenous genetic material. This general description can be adapted by those skilled in the art in order to incorporate specific transgene sequences into organisms utilizing the methods and materials described below.
  • The cre/loxP recombinase system of bacteriophage P1 (Lakso et al., PNAS 89: 6232-6236, 1992; Orban et al., PNAS 89: 6861-6865, 1992) or the FLP recombinase system of Saccharomyces cerevisiae (O'Gorman et al., Science 251: 1351-1355, 1991; PCT publication WO 92/15694) can be used to generate in vivo site-specific genetic recombination systems. Cre recombinase catalyzes the site-specific recombination of an intervening target sequence located between loxP sequences.
  • loxP sequences are 34 base pair nucleotide repeat sequences to which the Cre recombinase binds and are required for Cre recombinase mediated genetic recombination.
  • The orientation of loxP sequences determines whether the intervening target sequence is excised or inverted when Cre recombinase is present (Abremski et al., J. Biol. Chem. 259: 1509-1514, 1984): Cre catalyzes excision of the target sequence when the loxP sequences are oriented as direct repeats, and inversion of the target sequence when the loxP sequences are oriented as inverted repeats. Accordingly, genetic recombination of the target sequence is dependent on expression of the Cre recombinase.
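  • The dependence of the outcome on loxP orientation can be sketched as a toy model. This is only an illustration under stated assumptions: the construct is represented as lists of labelled segments, the loxP sites themselves are not modelled, and the function name and segment labels are hypothetical rather than taken from the cited references.

```python
def cre_recombine(left, target, right, lox_orientation, cre_present=True):
    """Toy model of Cre-mediated recombination of a loxP-flanked target.

    left, target, right: lists of segment labels (hypothetical names).
    lox_orientation: "direct" (excision) or "inverted" (inversion).
    The loxP sites themselves are left out of the model for simplicity.
    """
    if not cre_present:
        # No recombinase activity: the construct is unchanged.
        return left + target + right
    if lox_orientation == "direct":
        # Direct repeats: the intervening target sequence is excised.
        return left + right
    if lox_orientation == "inverted":
        # Inverted repeats: the intervening target sequence is inverted.
        return left + target[::-1] + right
    raise ValueError("lox_orientation must be 'direct' or 'inverted'")


# Excision of an interfering sequence places the gene next to the promoter.
activated = cre_recombine(["promoter"], ["stop"], ["gene"], "direct")
```

  In this sketch, excision of an interfering segment leaves the promoter directly upstream of the gene, which mirrors the activation scenario described in the text; cells without recombinase keep the silent arrangement.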
  • Expression of the recombinase can be regulated by promoter elements which are subject to regulatory control, e.g., tissue-specific, developmental stage-specific, inducible or repressible by externally added agents. This regulated control will result in genetic recombination of the target sequence only in cells where recombinase expression is mediated by the promoter element.
  • the activation of expression of a recombinant target gene protein can be regulated via control of recombinase expression.
  • cre/loxP recombinase system to regulate expression of a recombinant target gene protein requires the construction of a transgenic animal containing transgenes encoding both the Cre recombinase and the subject protein. Animals containing both the Cre recombinase and a recombinant target gene can be provided through the construction of "double" transgenic animals. A convenient method for providing such animals is to mate two transgenic animals each containing a transgene, e.g., a target gene and recombinase gene.
  • transgenic animals containing a target transgene in a recombinase-mediated expressible format derives from the likelihood that the subject protein, whether agonistic or antagonistic, can be deleterious upon expression in the transgenic animal.
  • a founder population in which the subject transgene is silent in all tissues, can be propagated and maintained. Individuals of this founder population can be crossed with animals expressing the recombinase in, for example, one or more tissues and/or a desired temporal pattern.
  • prokaryotic promoter sequences which require prokaryotic proteins to be simultaneously expressed in order to facilitate expression of the target transgene.
  • Exemplary promoters and the corresponding trans-activating prokaryotic proteins are given in U.S. Patent No. 4,833,080.
  • conditional transgenes can be induced by gene therapy-like methods wherein a gene encoding the trans-activating protein, e.g. a recombinase or a prokaryotic protein, is delivered to the tissue and caused to be expressed, such as in a cell-type specific manner.
  • a target transgene could remain silent into adulthood until "turned on" by the introduction of the trans-activator.
  • the "transgenic non-human animals" of the invention are produced by introducing transgenes into the germline of the non-human animal.
  • Embryonal target cells at various developmental stages can be used to introduce transgenes. Different methods are used depending on the stage of development of the embryonal target cell.
  • the specific line(s) of any animal used to practice this invention are selected for general good health, good embryo yields, good pronuclear visibility in the embryo, and good reproductive fitness.
  • the haplotype is a significant factor. For example, when transgenic mice are to be produced, strains such as C57BL/6 or FVB lines are often used (Jackson Laboratory, Bar Harbor, ME).
  • Preferred strains are those with H-2b, H-2d or H-2q haplotypes such as C57BL/6 or DBA/1.
  • the line(s) used to practice this invention may themselves be transgenics, and/or may be knockouts (i.e., obtained from animals which have one or more genes partially or completely suppressed).
  • the transgene construct is introduced into a single stage embryo.
  • the zygote is the best target for micro-injection.
  • the male pronucleus reaches the size of approximately 20 μm in diameter, which allows reproducible injection of 1-2 pL of DNA solution.
  • the use of zygotes as a target for gene transfer has a major advantage in that in most cases the injected DNA will be incorporated into the host gene before the first cleavage (Brinster et al, PNAS 82: 4438-4442, 1985).
  • all cells of the transgenic animal will carry the incorporated transgene. This will in general also be reflected in the efficient transmission of the transgene to offspring of the founder since 50% of the germ cells will harbor the transgene.
  • the nucleotide sequence comprising the transgene is introduced into the female or male pronucleus as described below. In some species such as mice, the male pronucleus is preferred. It is most preferred that the exogenous genetic material be added to the male DNA complement of the zygote prior to its being processed by the ovum nucleus or the zygote female pronucleus.
  • ovum nucleus or female pronucleus release molecules which affect the male DNA complement, perhaps by replacing the protamines of the male DNA with histones, thereby facilitating the combination of the female and male DNA complements to form the diploid zygote.
  • the exogenous genetic material be added to the male complement of DNA or any other complement of DNA prior to its being affected by the female pronucleus.
  • the exogenous genetic material is added to the early male pronucleus, as soon as possible after the formation of the male pronucleus, which is when the male and female pronuclei are well separated and both are located close to the cell membrane.
  • the exogenous genetic material could be added to the nucleus of the sperm after it has been induced to undergo decondensation.
  • Sperm containing the exogenous genetic material can then be added to the ovum or the decondensed sperm could be added to the ovum with the transgene constructs being added as soon as possible thereafter.
  • Introduction of the transgene nucleotide sequence into the embryo may be accomplished by any means known in the art such as, for example, microinjection, electroporation, or lipofection.
  • the embryo may be incubated in vitro for varying amounts of time, or reimplanted into the surrogate host, or both. In vitro incubation to maturity is within the scope of this invention.
  • a zygote is essentially the formation of a diploid cell which is capable of developing into a complete organism.
  • the zygote will be comprised of an egg containing a nucleus formed, either naturally or artificially, by the fusion of two haploid nuclei from a gamete or gametes.
  • the gamete nuclei must be ones which are naturally compatible, i.e., ones which result in a viable zygote capable of undergoing differentiation and developing into a functioning organism.
  • a euploid zygote is preferred.
  • the number of chromosomes should not vary by more than one with respect to the euploid number of the organism from which either gamete originated.
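  • The tolerance stated above amounts to a simple numerical check. The following is a minimal sketch; the function name is hypothetical, and the mouse diploid number of 40 is used only as a familiar example.

```python
def within_euploid_tolerance(chromosome_count, euploid_number, tolerance=1):
    """Return True if the count differs from the euploid number by at most `tolerance`."""
    return abs(chromosome_count - euploid_number) <= tolerance


# Mouse (2n = 40): 39, 40 or 41 chromosomes pass the check; 42 does not.
mouse_results = [within_euploid_tolerance(n, 40) for n in (39, 40, 41, 42)]
```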
  • physical ones also govern the amount (e.g., volume) of exogenous genetic material which can be added to the nucleus of the zygote or to the genetic material which forms a part of the zygote nucleus. If no genetic material is removed, then the amount of exogenous genetic material which can be added is limited by the amount which will be absorbed without being physically disruptive. Generally, the volume of exogenous genetic material inserted will not exceed about 10 picoliters.
  • the physical effects of addition must not be so great as to physically destroy the viability of the zygote.
  • the biological limit of the number and variety of DNA sequences will vary depending upon the particular zygote and functions of the exogenous genetic material and will be readily apparent to one skilled in the art, because the genetic material, including the exogenous genetic material, of the resulting zygote must be biologically capable of initiating and maintaining the differentiation and development of the zygote into a functional organism.
  • the number of copies of the transgene constructs which are added to the zygote is dependent upon the total amount of exogenous genetic material added and will be the amount which enables the genetic transformation to occur.
  • exogenous genetic material is preferentially inserted into the nucleic genetic material by microinjection. Microinjection of cells and cellular structures is known and is used in the art.
  • Reimplantation is accomplished using standard methods. Usually, the surrogate host is anesthetized, and the embryos are inserted into the oviduct. The number of embryos implanted into a particular host will vary by species, but will usually be comparable to the number of offspring the species naturally produces. Transgenic offspring of the surrogate host may be screened for the presence and/or expression of the transgene by any suitable method. Screening is often accomplished by Southern blot or Northern blot analysis, using a probe that is complementary to at least a portion of the transgene. Western blot analysis using an antibody against the protein encoded by the transgene may be employed as an alternative or additional method for screening for the presence of the transgene product.
  • DNA is prepared from tail tissue and analyzed by Southern analysis or PCR for the transgene.
  • the tissues or cells believed to express the transgene at the highest levels are tested for the presence and expression of the transgene using Southern analysis or PCR, although any tissues or cell types may be used for this analysis.
  • Alternative or additional methods for evaluating the presence of the transgene include, without limitation, suitable biochemical assays such as enzyme and/or immunological assays, histological stains for particular marker or enzyme activities, flow cytometric analysis, and the like. Analysis of the blood may also be useful to detect the presence of the transgene product in the blood, as well as to evaluate the effect of the transgene on the levels of various types of blood cells and other blood constituents.
  • Progeny of the transgenic animals may be obtained by mating the transgenic animal with a suitable partner, or by in vitro fertilization of eggs and/or sperm obtained from the transgenic animal.
  • the partner may or may not be transgenic and/or a knockout; where it is transgenic, it may contain the same or a different transgene, or both.
  • the partner may be a parental line.
  • in vitro fertilization is used, the fertilized embryo may be implanted into a surrogate host or incubated in vitro, or both. Using either method, the progeny may be evaluated for the presence of the transgene using methods described above, or other appropriate methods.
  • the transgenic animals produced in accordance with the present invention will include exogenous genetic material.
  • the exogenous genetic material will, in certain embodiments, be a DNA sequence which results in the production of a target protein (either agonistic or antagonistic), an antisense transcript, or a target mutant.
  • the sequence will be attached to a transcriptional control element, e.g., a promoter, which preferably allows the expression of the transgene product in a specific type of cell.
  • Retroviral infection can also be used to introduce a transgene into a non-human animal.
  • the developing non-human embryo can be cultured in vitro to the blastocyst stage.
  • the blastomeres can be targets for retroviral infection (Jaenisch, PNAS 73: 1260-1264, 1976).
  • Efficient infection of the blastomeres is obtained by enzymatic treatment to remove the zona pellucida (Manipulating the Mouse Embryo, Hogan eds. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1986)).
  • the viral vector system used to introduce the transgene is typically a replication-defective retrovirus carrying the transgene.
  • transgenes into the germ line by intrauterine retroviral infection of the midgestation embryo (Jahner et al., 1982, supra).
  • ES cells are obtained from pre-implantation embryos cultured in vitro and fused with embryos (Evans et al., Nature 292: 154-156, 1981; Bradley et al., Nature 309: 255-258, 1984; Gossler et al, PNAS 83: 9065-9069, 1986; and Robertson et al, Nature 322: 445-448, 1986).
  • Transgenes can be efficiently introduced into the ES cells by DNA transfection or by retrovirus-mediated transduction. Such transformed ES cells can thereafter be combined with blastocysts from a non-human animal.
  • the instant invention provides a method to detect or screen for polypeptide conformation changes arising from a variety of causes.
  • the intrapolypeptide split-ubiquitin therapeutic formulations used in the method of the invention are most preferably applied in the form of appropriate compositions.
  • As appropriate compositions there may be cited all compositions usually employed for systemically or topically administering drugs.
  • the pharmaceutically acceptable carrier should be substantially inert, so as not to react with the active component. Suitable inert carriers include water, alcohol, polyethylene glycol, mineral oil or petroleum gel, propylene glycol and the like.
  • For example, several p53 mutations in the core domain are shown in Example 3 to cause conformation change in the core domain, which is readily detectable by the split-ubiquitin system of the instant invention.
  • a second type of conformation change of the polypeptide can be caused by binding of the polypeptide to an agent or compound that is not covalently linked to the polypeptide.
  • the nature of the agent/compound varies. It can be a polypeptide, a hormone, a steroid, an ion, a polynucleotide, a sugar or an oligosaccharide, a lipid, an enzyme substrate, O2, or a small molecule.
  • Protein-protein interaction may lead to conformation change of at least one of the interacting partners.
  • yeast G ⁇ binding to G ⁇ causes a conformation change in the G ⁇ protein, which can be readily detected by the method of the instant invention.
  • Hormone induced conformation change is well known in the art.
  • the nuclear hormone receptor estrogen receptor (ER) contains a ligand-binding domain (LBD) that is completely enveloped by other protein domains.
  • the process of ligand binding or unbinding must involve a significant conformational change of this domain.
  • As to conformational change induced by small chemical compounds or drugs, Connor et al. (Cancer Res. 61: 2917-22, 2001) report that the anti-estrogen GW5638 induces a unique structural change in the estrogen receptor (ER). It is known that tamoxifen inhibits estrogen receptor transcriptional activity by competitively inhibiting estradiol binding and inducing conformational changes in the receptor that may prevent its interaction with coactivators. In bone, the cardiovascular system, and some breast tumors, however, tamoxifen exhibits agonist activity, suggesting that the tamoxifen-ER complex is not recognized identically in all cells. Using phage display, Connor et al.
  • One conformational change in the rat glucocorticoid receptor (GR) can be readily discerned by following the ability of trypsin digestion to afford a 16-kDa fragment. This fragment is seen after proteolysis of steroid-free receptors but disappears in digests of either glucocorticoid- or antiglucocorticoid-bound receptors (Xu et al., Mol. Cell. Endocrinology 155: 85-100, 1999).
  • Ion binding may also change protein conformation.
  • Fur ferric uptake regulation protein
  • Fe iron uptake regulation protein
  • Gonzalez de Peredo et al. used selective chemical modification and mass spectrometry to investigate this mechanism of conformation change.
  • the reactivity of each lysine residue of the Fur protein was studied, first in the apo form of the protein, then after metal activation and finally after DNA binding.
  • Lys76 was shown to be highly protected from modification in the presence of target DNA. Hydrogen-deuterium exchange experiments were performed to map with higher resolution the conformational changes induced by metal binding.
  • DNA-induced conformation change in certain transcription factors is well known in the art.
  • Although bacteriophage 434 repressor binds to its specific DNA sites only as a dimer, formation of the dimers in solution occurs at concentrations three orders of magnitude higher than those needed to bind the 434 operator DNA.
  • Ciubotaru et al. show that both specific and non-specific DNA may induce conformational changes that can lead to formation of repressor dimers (J. Mol. Biol. 294: 859-873, 1999).
  • the repressor conformational changes induced by DNA occur at concentrations much lower than those needed for binding of repressor, suggesting that the alternative conformations of repressor persist even if the protein is not in direct contact with DNA.
  • DNA acts in a "catalytic" fashion to induce a steady-state amount of an alternative repressor conformation that has an enhanced affinity for its specific binding site.
  • Lipid-protein interaction may also induce conformation change.
  • Apo H apolipoprotein H
  • lipid membrane has been considered to be a basic mechanism for the biological function of the protein.
  • Previous reports have demonstrated that Apo H can interact only with membranes containing anionic phospholipids.
  • Wang et al. studied the membrane-induced conformational change of Apo H by CD spectroscopy with two different model systems: anionic-phospholipid-containing liposomes [such as 1,2-dimyristoyl-sn-glycero-3-phosphoglycerol (DMPG) and cardiolipin], and water/methanol mixtures at moderately low pH, which mimic the micro-physicochemical environment near the membrane surface.
  • Apo H undergoes a remarkable conformational change on interaction with liposomes containing anionic phospholipid.
  • On interaction with liposomes containing DMPG, there is a 6.8% increase in alpha-helix in the secondary structure; with liposomes containing cardiolipin, however, there is a 12.6% increase in alpha-helix and a 9% decrease in beta-sheet.
  • A similar conformation change in Apo H can be induced by treatment with an appropriate mixture of water/methanol. The results indicate that the association of Apo H with membrane is correlated with a certain conformational change in the secondary structure of the protein.
  • A key element in the ability of lac repressor protein to control transcription reversibly is the capacity to assume different conformations in response to ligand (lactose and other structural homologs of the sugar, such as IPTG) binding.
  • Barry and Matthews investigated mutant repressor proteins containing single tryptophans created by mutating each of the two native tryptophan residues to tyrosine and changing the residue of interest to tryptophan (Biochemistry 36: 15632-42, 1997). This study suggests that, in the areas of the lac repressor probed by those substitutions, the inducer-bound form differs from the conformation of the unliganded form.
  • Gas can also change protein conformation.
  • Hb hemoglobin
  • Quaternary structure of Hb is a tetramer with 2 alpha and 2 beta subunits and they are labelled alpha-1, alpha-2, beta-1 and beta-2. They join together in pairs of alpha-1 + beta-1 and alpha-2 + beta-2 because the interactions between these pairings are much stronger than e.g. between alpha-1 and alpha-2.
  • the interactions are salt bridges or electrostatic interactions, and all 4 are connected to each other by these electrostatic contacts. Examination of the structure showed that deoxy Hb has 8 more contacts (salt bridges) between the subunits than has oxy Hb. So deoxy Hb is a tighter, more rigid molecule.
  • When Hb binds oxygen it undergoes a change in conformation, which disrupts the salt bridges. When all 4 oxygen molecules have bound, all 8 salt bridges are disrupted; so oxy Hb is more relaxed, held together loosely.
  • a change in conformation arises when oxygen binds to Fe and pulls the Fe into the plane of the haem. This movement pulls on the His F8 residue (the eighth residue in helix F; the Fe atom is located very slightly above the haem plane, and attached to the His F8 side) and flattens the haem plane. These movements cause a series of small changes in orientation of the amino acids involved in salt bridges between subunits so that the salt bridges break. Thus small alterations in the molecular shape or conformation of Hb facilitate rapid additional binding of oxygen.
  • binding of glucose to hexokinase induces a large conformational change of the enzyme (a phenomenon also referred to as "induced fit").
  • the change in conformation brings the C6 hydroxyl of glucose close to the terminal phosphate of ATP, and excludes water from the active site. This prevents the enzyme from catalyzing ATP hydrolysis, rather than transfer of phosphate to glucose.
  • a third type of protein conformation change can be induced by environmental alterations, such as temperature, pressure, pH, redox state, etc.
  • Temperature change can induce conformation change in proteins.
  • Numerous temperature sensitive (including cold sensitive) mutant proteins have been described. These mutants typically adopt a defective conformation at higher temperature (or lower temperature in the cold sensitive case) and return to a relatively normal conformation at a lower temperature.
  • the intermediate state (I) was found to have a Km identical with that of the native state and a turnover rate kcat twofold higher than that of the native state with butyrylthiocholine as the substrate.
  • the increased catalytic efficiency (kcat/Km) of (I) can be explained by a conformational change in the active-site gorge and/or restructuring of the water-molecule network in the active-site pocket, making the catalytic steps faster.
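  • The size of the gain in catalytic efficiency follows directly from the two reported observations. As a worked relation, writing N for the native state and I for the intermediate state (the superscripts are notation introduced here for illustration only):

```latex
\[
\frac{\left(k_{\mathrm{cat}}/K_m\right)^{I}}{\left(k_{\mathrm{cat}}/K_m\right)^{N}}
= \frac{2\,k_{\mathrm{cat}}^{N}/K_m^{N}}{k_{\mathrm{cat}}^{N}/K_m^{N}}
= 2,
\qquad \text{since } K_m^{I} = K_m^{N} \text{ and } k_{\mathrm{cat}}^{I} = 2\,k_{\mathrm{cat}}^{N}.
\]
```

  That is, with an unchanged Km, the twofold higher turnover rate translates into a twofold higher catalytic efficiency.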
  • Protein conformation may also change in response to redox changes.
  • the Escherichia coli OxyR transcription factor senses H2O2 and is activated through the formation of an intramolecular disulfide bond.
  • Choi et al. recently reported the crystal structures of the regulatory domain of OxyR in its reduced and oxidized forms, determined at 2.7 Å and 2.3 Å resolution, respectively (Cell 105: 103-113, 2001).
  • the two redox-active cysteines are separated by approximately 17 Å.
  • Disulfide bond formation in the oxidized form results in a significant structural change in the regulatory domain.
  • the structural remodeling, which leads to different oligomeric associations, accounts for the redox-dependent switch in OxyR and provides a novel example of protein regulation by "fold editing" through a reversible disulfide bond formation within a folded domain.
  • a fourth type of protein conformation change can be induced by post-translational modification, such as phosphorylation, acetylation, methylation, glycosylation, proteolytic cleavage, sulfation, hydroxylation, carboxylation and prenylation, etc.
  • Phosphorylation/dephosphorylation is one of the most important mechanisms in signal transduction.
  • There are numerous examples of phosphorylation induced protein conformation changes. For example, Lee et al. studied phosphorylation induced conformation change in NADPH oxidase (Biochimie 82: 727-32, 2000).
  • the leukocyte NADPH oxidase of neutrophils is a membrane-bound enzyme that catalyzes the production of O2⁻ from oxygen using NADPH as the electron donor.
  • the cytosolic oxidase components p47(phox) and p67(phox), each containing two Src homology 3 (SH3) domains, migrate to the plasma membrane, where they associate with cytochrome b(558), a membrane-integrated flavohemoprotein, to assemble the active oxidase.
  • Oxidase activation can be mimicked in a cell-free system using an anionic amphiphile, such as SDS or arachidonic acid. Anionic amphiphiles and the phosphorylation of p47(phox) with protein kinase C, both activators of the oxidase in vitro, cause exposure of p47(phox)-SH3, which has probably been masked by the C-terminal region of this protein in a resting state.
  • Acetylation is an important mechanism in regulating conformation change of proteins. For example, modification of histones, DNA-binding proteins found in chromatin, by addition of acetyl groups occurs to a greater degree when the histones are associated with transcriptionally active DNA.
  • a breakthrough in understanding how this acetylation is mediated was the discovery that various transcriptional co-activator proteins have intrinsic histone acetyltransferase activity (for example, Gcn5p, PCAF, TAF(II)250 and p300/CBP). These acetyltransferases also modify certain transcription factors (TFIIEbeta, TFIIF, EKLF and p53).
  • GATA-1 is an important transcription factor in the haematopoietic lineage and is essential for terminal differentiation of erythrocytes and megakaryocytes. It is associated in vivo with the acetyltransferase p300/CBP. Boyes et al. report that GATA-1 is acetylated in vitro by p300 (Nature 396: 594-8, 1998). This significantly increases the amount of GATA-1 bound to DNA and alters the mobility of GATA-1-DNA complexes, suggestive of a conformational change in GATA-1. GATA-1 is also acetylated in vivo and acetylation directly stimulates GATA-1-dependent transcription.
  • acetylation induced conformation change in transcription factors can alter interactions between these factors and DNA and among different transcription factors, and is an integral part of transcription and differentiation processes.
  • Methylation may also change protein conformation.
  • the Escherichia coli protein Ada specifically repairs the S(p) diastereomer of DNA methyl phosphotriesters in DNA by direct and irreversible transfer of the methyl group to its own Cys 69 which is part of a zinc-thiolate center.
  • the methyl transfer converts Ada into a transcriptional activator that binds sequence-specifically to promoter regions of its own gene and other methylation resistance genes.
  • Ada thus acts as a chemosensor to activate repair mechanisms in situations of methylation damage. Lin et al.
  • N-Ada10: 10 kDa N-terminal domain
  • results from that study show that methylation of N-Ada induces a structural change, which enhances the promoter affinity of a remodeled surface region that does not include the transferred methyl group.
  • the maturation, conformational stability, and the rate of in vivo degradation are specific for each protein and depend on both the intrinsic features of the protein and those of the surrounding cellular environment. While synthesis and degradation can be measured in living cells, stability and maturation of proteins are more difficult to quantify.
  • the biophysical parameter that forms the basis of these measurements is the time-averaged distance between the N-terminus and C-terminus of a protein.
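  • To make the parameter concrete, a time-averaged N- to C-terminus distance can be written as the mean of the instantaneous end-to-end distances over a set of conformations. The sketch below is purely definitional and is not part of the assay described here: the function name and the coordinate snapshots are hypothetical, and the split-ubiquitin readout reports on this distance only indirectly.

```python
from math import dist


def time_averaged_distance(n_positions, c_positions):
    """Mean Euclidean distance between paired N- and C-terminus coordinates.

    n_positions, c_positions: equal-length sequences of (x, y, z) tuples,
    one pair per sampled conformation (hypothetical input data).
    """
    if len(n_positions) != len(c_positions) or not n_positions:
        raise ValueError("need the same non-zero number of N and C positions")
    distances = [dist(n, c) for n, c in zip(n_positions, c_positions)]
    return sum(distances) / len(distances)


# Two made-up snapshots in which the termini are 5 units apart each time.
average = time_averaged_distance(
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
    [(3.0, 4.0, 0.0), (0.0, 0.0, 5.0)],
)
```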
  • Example 1.1 Split-Ubiquitin intrapolypeptide Assay for N- to C- Terminus Distance
  • n was an amino-acid residue capable of destabilizing the reporter when the reporter is released as an n-RM following cleavage of the Nub-polypeptide-Cub-n-RM by a ubiquitin-specific protease (UBP).
  • FKBP12 homologue spatially separates the N- from the C-terminus (Rotonda et al., J. Biol. Chem. 268: 7607-09, 1993).
  • Fpr1p in a Nub-Fpr1-Cub fusion protein should therefore inhibit the Ub-reassembly (Figure 1a, b). These predictions should only apply to the fully folded state of the two proteins.
  • Example 1.2 Spatial Arrangement of the N- and C-Termini Influences Reassembly of Coupled Ub-peptides
  • The isoleucine residues at positions 3 and 13 of Nub (Nii) were replaced by valine (Nxv; Nvx), alanine (Nxa; Nax) and glycine (Nxg; Ngx) (Johnsson et al., Proc. Natl. Acad. Sci. USA 91: 10340-44, 1994; Kellis et al., Biochemistry 28: 4914-22, 1989).
  • the residues at positions 3 and 13 are indicated by the suffix in Nub with the corresponding amino acids in the single letter code.
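  • The suffix convention can be illustrated with a short sketch that maps a two-letter suffix onto the first 13 residues of ubiquitin. The helper name is hypothetical, and the 13-residue fragment is used only for illustration; it is not the full Nub peptide used in the constructs.

```python
# First 13 residues of ubiquitin; positions 3 and 13 are isoleucine.
UB_N_TERMINUS = "MQIFVKTLTGKTI"

ALLOWED = "ivag"  # isoleucine, valine, alanine, glycine


def nub_variant(suffix, wild_type=UB_N_TERMINUS):
    """Return the N-terminal fragment for a suffix such as 'ig' (Ile3, Gly13)."""
    if len(suffix) != 2 or any(letter not in ALLOWED for letter in suffix):
        raise ValueError("suffix must be two letters chosen from i, v, a, g")
    residues = list(wild_type)
    residues[2] = suffix[0].upper()    # position 3 (1-based)
    residues[12] = suffix[1].upper()   # position 13 (1-based)
    return "".join(residues)


wild_type_fragment = nub_variant("ii")
nag_fragment = nub_variant("ag")
```

  Under this convention, "ii" reproduces the wild-type fragment and "ag" denotes alanine at position 3 with glycine at position 13.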
  • Starting from 100% cleaved Nii-Guk1-F-Cub-Dha, the uncleaved fusion protein accumulates according to the expected decrease in the affinity between Nub and Cub. Approximately 90% of uncleaved protein is obtained for the Ngg-construct.
  • the first detectable uncleaved fusion protein carries Nig at its N-terminus.
  • Unlike that of Guk1p, the structure of Fpr1p separates the N- and C-termini onto opposite faces of the molecule (Rotonda et al., J. Biol. Chem. 268: 7607-09, 1993).
  • FPR1 was cloned between eight different Nub variants and Cub-Dha.
  • the length of the linker that connects the Ub-moieties to the inserted protein slightly influences the reassembly reaction (compare Figure 2b and 3a, and our unpublished observation).
  • the FLAG-epitope was therefore omitted in the constructs of this and all the following experiments.
  • the cleavage spectrum was again examined after cell extraction and immunoblot analysis ( Figure 3).
  • Nub mutants with a lower affinity for Cub than Nig were therefore not tested. It can thus be concluded that the cleaved Dha observed for Nig-Cub fusion proteins, and for those fusion proteins containing an Nub with a lower affinity for Cub than Nig, must result exclusively from the intramolecular reassociation of the coupled Ub peptides.
  • the structure of Guk1p offers a reasonable explanation for this observation that cannot apply to the Fpr1p fusion proteins. Here the reassociation must occur either before Fpr1p is folded or after the unfolding of the already matured protein.
  • Nub-Fpr1MC-Cub-Dha displays a cleavage spectrum that lies between those of Nub-Fpr1-Cub-Dha and Nub-Guk1-Cub-Dha (Figure 3c).
  • Uncleaved Nub-Fpr1MC-Cub-Dha is first detected for the Nig construct.
  • Nig has a weaker affinity for Cub than Nia, the first Nub that allows the accumulation of the uncleaved Fpr1p fusion protein, but a stronger affinity than Nai, the first Nub that yields uncleaved Guk1p fusion protein (Figure 3d).
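The comparison above amounts to a ranking of the inserted proteins by the first Nub variant at which uncleaved fusion protein becomes detectable. The following minimal Python sketch restates that ranking. The affinity order Nia > Nig > Nai and the three thresholds are taken from the text; the function and variable names are illustrative assumptions and not part of the described method.

```python
# Nub variants ordered from higher to lower affinity for Cub, as stated in the text.
AFFINITY_ORDER = ["Nia", "Nig", "Nai"]

# First Nub variant for which uncleaved fusion protein was detected (from the text).
FIRST_UNCLEAVED = {"Fpr1": "Nia", "Fpr1MC": "Nig", "Guk1": "Nai"}


def rank_by_reassembly(first_uncleaved, order):
    """Sort proteins from least to most efficient Ub reassembly.

    A protein whose uncleaved form already appears with a higher-affinity Nub
    supports reassembly of the Ub halves less efficiently.
    """
    return sorted(first_uncleaved, key=lambda name: order.index(first_uncleaved[name]))


ranking = rank_by_reassembly(FIRST_UNCLEAVED, AFFINITY_ORDER)
print(ranking)  # ['Fpr1', 'Fpr1MC', 'Guk1']
```

Read this way, the Fpr1MC mutant sits between wild-type Fpr1p and Guk1p, which is the same ordering the cleavage spectra describe.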
  • Example 1.5 Characterizing Destabilizing Mutations In vivo
  • Nia-Fpr1-Cub-RUra3p already supports the growth of the cells, whereas the cells require Nag-Guk1-Cub-RUra3p to achieve the same phenotype.
  • Nag has a much lower affinity for Cub than Nia.
  • Fpr1MC, one of the mutants of Fpr1p, can be distinguished from the wild-type protein by comparing the growth of the cells bearing Nia at the N-terminus of both fusion constructs. Growth is solid for the cells harboring the native Nia-FPR1-Cub-RURA3, whereas growth is very poor for cells containing Nia-
  • N ub derivatives containing mutations at position 3 were obtained via PCR using oligonucleotides carrying the corresponding nucleotide exchanges at position 3 and the plasmids carrying the UB4 genes coding for either an I, A, V or G in codon position 13 of the ORF as a template (Johnsson et al., Proc. Natl. Acad. Sci. USA 91: 10340-44, 1994).
  • Fragments encoding the complete ORF of GUK1 (558bp) or FPR1 (339bp) were obtained by PCR using yeast genomic DNA (JD53) as a template and an oligonucleotide primer complementary to the 5' and 3' ends of the gene, respectively.
  • Fragments containing the ORF of SUMO1 lacking the last 18 nucleotides were obtained via PCR using the plasmid pGEX-SUMO-1 as a template (Bayer et al., J. Mol. Biol. 280: 275-86, 1998). All 5'-primers contained an additional BamHI site and all 3'-primers an additional SalI site to allow for the in-frame fusion with the Nub and Cub moieties. The obtained PCR fragments were cut with BamHI and SalI and introduced into the correspondingly cut PCUP1-Nub-Cub-Dha cassette on a pRS314 vector.
  • the corresponding Cub-RURA3 constructs were obtained by inserting the EagI-SalI cut PCUP1-Nub-ORF in front of the Cub-RURA3 module on a pRS313 vector (Wittke et al., Mol. Biol. Cell 10: 2519-30, 1999).
  • Nub-GUK1-ha, Nub-FPR1-ha, GUK1-Cub-Dha and FPR1-Cub-Dha were derived from the corresponding PCUP1-Nub-X-Cub-Dha constructs.
  • the constructs containing mutations or deletions of the ORFs of GUK1, FPR1 and SUMO1 were obtained via PCR using the corresponding wild-type fragments as a template and a combination of primers introducing the desired mutation and the BamHI or SalI site at the 5' or 3' end, respectively.
  • GUK1 ⁇ H was obtained by an in frame deletion of the internal ecru fragment (position 172-198).
  • a PCR of the first 474 bp of the ORF of SEC62 was performed using genomic DNA of the yeast strain RSY529 carrying the sec62-1 allele as a template or the yeast strain JD53 carrying SEC62 as a template.
  • the product was digested with SalI and BamHI and inserted into the correspondingly cut vector cassettes to obtain Nvg-ΔC125-Cub-Dha, Nub-ΔC125-Dha and ΔC125-Cub-Dha on the pRS314 and pRS315 vectors, respectively (Wittke et al., Mol. Biol. Cell 10: 2519-30, 1999).
  • the constructs of sec62-141 were obtained by chance by a PCR on genomic DNA of RSY529 which introduced a T to C transition at position 422 of the ORF of sec62-1.
  • DNA sequences were determined by the MPIZ DNA facility on PE Biosystems ABI Prism 377 and 3700 sequencers using BigDye-terminator chemistry. Oligonucleotides were purchased from Metabion (Martinsried, Germany) and MWG Biotech (Ebersberg, Germany).
  • Pulse-chase analysis
  • S. cerevisiae cells expressing Nvg-FPR1-Cub-Dha were grown at 30°C in 10 ml of SD-trp medium to an OD600 of ~1, and supplemented with 100 µM CuSO4 30 min prior to labeling the cells for 5 min with Redivue Promix-[35S] (Amersham, Buckinghamshire, UK).
  • the chase, preparation of cell extracts in the presence of N-ethylmaleimide and immunoprecipitation with the monoclonal anti-HA antibody were carried out essentially as described (Johnsson et al., EMBO J. 13: 2686-98, 1994).
  • Proteins were fractionated by SDS-12.5% PAGE and electroblotted onto nitrocellulose membranes (Schleicher and Schuell, Dassel, Germany), using a semi-dry transfer system (Hoefer, Pharmacia Biotech Inc., San Francisco, CA). Blots were incubated with a monoclonal anti-HA antibody (Babco, Richmond, CA). Bound antibody was visualized with horseradish peroxidase-coupled rabbit anti-mouse or goat anti-rabbit antibody (BioRad, Hercules, CA), using the chemiluminescence detection system (Pierce, Rockford, IL). The chemiluminescence was quantified with the aid of the lumi-imager system (Boehringer, Mannheim, Germany).
  • Yeast strains, Growth and Functionality assays
  • S. cerevisiae strains were JD53 (MATα his3-Δ200 leu2-3,112 lys2-801 trp1-Δ63 ura3-52), JD47-13C (MATa of JD53), AG215 (MATa GAL10 GUK LEU2 his3) and YDF5 (MATa trp1-901 ade2-101 ura3-52 leu2-3,112 lys2-801 his3-200 gal4Δ gal80Δ LYS2::GAL1-HIS3 URA3::GAL1-lacZ FPR1::ADE2), and RSY529 (MATa his4 leu2-3,112 ura3-52 sec62-1).
  • JD53 MATα his3-Δ200 leu2-3,112 lys2-801 trp1-Δ63 ura3-52
  • JD47-13C MATa of JD53
  • AG215 MATa GAL10 GUK LEU2 his3
  • Growth assays
  • S. cerevisiae cells were first grown at 30°C in liquid selective media containing uracil. Cells were diluted in water and dilutions were spotted on agar plates selecting for the presence of the fusion constructs and lacking uracil. The same dilutions were spotted onto plates containing uracil to check for cell numbers. The plates were incubated at 30°C for 2-3 days.
  • the strain YDF5 containing a deletion of FPR1 was transformed with the plasmids containing the different FPR1 fusion constructs.
  • the cells were tested for regaining rapamycin sensitivity by a halo assay.
  • Filter disks containing 5 µl of rapamycin (Sigma, Deisenhofen, Germany) at a concentration of either 0.1 or 1 µg/ml were mounted onto media lacking tryptophan to select for the presence of the constructs. Cells were grown at 30°C for two days. A halo of non-dividing cells around the filter disk indicated the functionality of the constructs (Heitman et al., Science 253: 905-09, 1991).
  • Example 2 Introduction
  • Interaction-induced changes in the conformation of proteins are frequently the molecular basis for the modulation of their activities. Although proteins perform their functions in cells, surrounded by many potential interaction partners, the studies of their conformational changes have been mainly restricted to in vitro studies.
  • Ste4p (Gβ) and Ste18p (Gγ) are the subunits of a heterotrimeric G-protein in the yeast Saccharomyces cerevisiae.
  • a split-ubiquitin based conformational sensor was used to detect a major structural rearrangement in Ste18p upon binding to a test compound, in this case to the polypeptide Ste4p. Based on these in vivo results and the solved structure of the mammalian Gγ, it is shown that Gγ of yeast adopts an equally extended structure, which is only induced upon association with Gβ.
  • Example 2.1 Split-Ub Based Approach
  • the N- and the C-termini of Ste18p are spatially separated. This should prevent the association of the Nub and Cub when attached to the N- and the C-termini of the same Ste18p polypeptide.
  • an RUra3p reporter coupled to the C-terminus of Cub is not cleaved and will enable yeast ura3 cells to grow on plates lacking uracil (Fig. 8A).
  • Two possibilities can be envisioned for the free Ste18p. If the structure of the free protein is indistinguishable from its bound form, the RUra3p reporter will remain linked to Cub and the cells will retain the original phenotype (Fig. 8B).
  • the first 19 residues of Ste18p are unique to Gγ of the yeast.
  • the structure of this N-terminal stretch could therefore not be predicted and its sequence was deleted to create Ste18₉₁p.
  • α-Factor-induced growth inhibition of cells expressing Ste18₉₁p instead of the wild-type protein documented an undiminished functionality of Ste18₉₁p and thereby indirectly its binding to Ste4p (Fig. 9A, and data not shown) (Clark et al., Mol. Cell. Biol. 13: 1-8, 1993).
  • a STE18₉₁ construct that lacked the last five C-terminal residues including the motif for isoprenylation was placed between Nub and Cub-RURA3.
  • Nub that just inhibits the growth of the Nub-Ste18₉₁-Cub-RUra3p transformed cells.
  • Nii and Nia in an Nub-STE18₉₁-Cub-RURA3 construct both inhibit the SD-ura growth of cells that do not overexpress Ste4p (Fig. 9B, and data not shown).
  • Nia has a weaker affinity for Cub and should therefore react more sensitively to alterations in the conformation of Ste18p.
  • Nvg displays a lower affinity for Cub, which results in the accumulation of a larger fraction of uncleaved fusion protein than observed with the otherwise identical Nia construct.
  • Nvg-Ste18₉₁-Cub-RUra3p was coexpressed together with an Nia-STE18₉₁ construct which was C-terminally extended by Dha to facilitate the detection of the protein by immunoblotting (Nia-Ste18₉₁-Dha). If the free Ste18p forms dimers or multimers, the coexpression of Nia-Ste18₉₁-Dha should increase the cleavage of Nvg-Ste18₉₁-Cub-RUra3p and thereby inhibit the SD-ura growth of the cells. This was not observed.
  • Example 2.3 Ste18p Undergoes a Change in Conformation upon Binding to Ste4p
  • Nia-Ste18₉₁-Cub-RUra3p together with HA-tagged Ste4p (HA-Ste4p) in cells lacking the chromosomal STE18 gene.
  • the cells were spotted in different dilutions onto plates without uracil and containing either glucose or galactose as the carbon source. Since the expression of HA-STE4 was controlled by the PGAL1 promoter, the intracellular HA-Ste4p concentration was high on galactose medium but below the limits of detection on glucose medium (Fig.
  • Example 2.4 Quantitative Analysis of Conformation Change using the Split Ubiquitin System
  • Nvg-Ste18₉₁-Cub-Dha or Nvg-Ste18₇₄-Cub-Dha was either expressed alone or together with HA-Ste4p (Fig. 11).
  • Expression of the Nvg-Ste18-Cub-Dha constructs from the PCUP1 promoter was held at a relatively low level or was induced by addition of copper ions to 100 µM 1 hr prior to protein extraction.
  • the ratio of uncleaved to cleaved fusion protein was calculated after denaturing gel electrophoresis and immunoblotting with the anti-HA antibody. This value was compared to the ratio of uncleaved to cleaved Nvg-Ste18-Cub-Dha from protein extracts of cells lacking HA-Ste4p.
  • the expression of HA-Ste4p increases the fraction of uncleaved Nvg-Ste18₉₁-Cub-Dha by a factor of two under inducing conditions (+Cu2+) and a factor of 3.6 under non-inducing conditions (Fig. 11A, B).
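The quantification described above is a ratio of ratios: the uncleaved-to-cleaved ratio measured with HA-Ste4p divided by the same ratio measured without it. The following Python sketch illustrates only that arithmetic. The band intensities are hypothetical values chosen for illustration, and the function names are assumptions; no measured data from the figures are reproduced here.

```python
def uncleaved_ratio(uncleaved, cleaved):
    """Ratio of uncleaved to cleaved fusion protein from two band intensities."""
    if cleaved <= 0:
        raise ValueError("cleaved band intensity must be positive")
    return uncleaved / cleaved


def fold_change(with_partner, without_partner):
    """Fold change of the uncleaved/cleaved ratio caused by the binding partner.

    Each argument is an (uncleaved, cleaved) pair of band intensities.
    """
    return uncleaved_ratio(*with_partner) / uncleaved_ratio(*without_partner)


# Hypothetical chemiluminescence intensities in arbitrary units.
without_ste4 = (10.0, 90.0)
with_ste4 = (20.0, 90.0)

change = fold_change(with_ste4, without_ste4)
print(round(change, 2))
```

Comparing ratios rather than raw band intensities keeps the readout independent of how much total fusion protein was loaded in each lane.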
  • Fragments containing the open reading frame (ORF) of STE18 lacking either the first 57 (STE18₉₁), 75 (STE18₈₅), 90 (STE18₈₀), or 108 (STE18₇₄) nucleotides and lacking the last 15 nucleotides were obtained by PCR using yeast genomic DNA, the Pwo polymerase (Roche-Biochemicals, Penzberg, Germany) and oligonucleotide primers complementary to the 5' and 3' ends of the desired ORF, respectively.
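The nucleotide counts given for the truncated STE18 fragments can be converted into residue counts by dividing by the codon length. The short Python sketch below does this conversion. The deletion sizes come from the text; the function name is an illustrative assumption.

```python
CODON_LENGTH = 3


def residues_removed(nucleotides):
    """Number of amino-acid residues removed by an in-frame deletion."""
    if nucleotides % CODON_LENGTH != 0:
        raise ValueError("deletion is not a whole number of codons")
    return nucleotides // CODON_LENGTH


n_terminal_deletions = [57, 75, 90, 108]  # nucleotides removed at the 5' end
c_terminal_deletion = 15                  # nucleotides removed at the 3' end

n_removed = [residues_removed(n) for n in n_terminal_deletions]
c_removed = residues_removed(c_terminal_deletion)
print(n_removed, c_removed)  # [19, 25, 30, 36] 5
```

The results agree with the rest of the description: 57 nucleotides correspond to the first 19 residues, and 15 nucleotides to the last five C-terminal residues. If the full-length protein is taken to be 110 residues (an assumption not stated in this section), the N-terminal deletions would leave 91, 85, 80 and 74 residues, which would match construct labels such as 91 and 74.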
  • the corresponding Cub-RURA3 constructs were obtained by inserting the EagI-SalI cut PCUP1-Nub-ORF in front of the Cub-RURA3 module on a pRS313 vector (Wittke et al., Mol. Biol. Cell 10: 2519-2530, 1999).
  • Nia-STE18₉₁-Dha was derived from the corresponding PCUP1-Nub-STE18₉₁-Cub-Dha constructs by cloning the EagI-SalI fragment in front of the ORF of DHFR extended by the HA epitope (Dha).
  • STE18₉₁ containing the natural stop codon and an additional start codon at the 5' end was obtained by PCR using an oligonucleotide complementary to the 3' region starting 61 bp downstream of the ORF and an oligonucleotide complementary to the 5' region of the ORF.
  • the HA-STE4 construct was obtained by PCR using an oligonucleotide complementary to a 3' region starting 60 bp downstream of the ORF and an oligonucleotide complementary to the 5' end of the ORF.
  • the introduced SalI and KpnI sites were used to clone the PCR fragment downstream and in-frame of the PGAL1-HA module in a pRS416 vector.
  • the 5' sequence of the newly generated ORF reads: ATG TCG ACC TAC CCA TAC GAT GTT CCA GAT TAC GCT GGC TCG ACC ATG (SEQ ID No. 3).
  • the sequence of the HA epitope is underlined.
  • the first codon of STE4 is printed in bold letters.
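Because the underlining and bold type of the printed sequence do not survive in plain text, the reading frame can be checked by translating the quoted 5' sequence. The Python sketch below translates it with the standard genetic code and locates the HA epitope by its commonly used peptide sequence YPYDVPDYA, which is supplied here as an assumption since the text does not spell it out. It also looks for the SalI recognition sequence GTCGAC. All names in the code are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# 5' sequence of the newly generated ORF as quoted in the text (SEQ ID No. 3).
SEQ = "ATG TCG ACC TAC CCA TAC GAT GTT CCA GAT TAC GCT GGC TCG ACC ATG"
HA_EPITOPE = "YPYDVPDYA"  # assumed peptide sequence of the HA tag
SALI_SITE = "GTCGAC"      # SalI recognition sequence


def translate(dna):
    """Translate a DNA string in frame 1, ignoring spaces and a trailing partial codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


protein = translate(SEQ)
epitope_start = protein.find(HA_EPITOPE)
sali_index = SEQ.replace(" ", "").find(SALI_SITE)

print(protein)        # MSTYPYDVPDYAGSTM
print(epitope_start)  # 3
print(sali_index)     # 2
```

Under these assumptions the epitope spans codons 4 to 12 (TAC CCA ... TAC GCT), and the SalI site overlaps the first three codons.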
  • S. cerevisiae cells expressing the different fusion proteins were grown at 30°C to an OD600 of ~0.8 in 10 ml of SG lacking tryptophan and uracil.
  • Cell extraction for immunoblotting was performed essentially as previously described (Johnsson and Varshavsky, EMBO J. 13: 2686-2698, 1994). Bound antibody was visualized with horseradish peroxidase-coupled rabbit anti-mouse antibodies (BioRad, Hercules, CA, USA), using the chemiluminescence detection system (Pierce, Rockford, IL, USA) and quantified with the lumi-imager system (Boehringer, Mannheim, Germany).
  • Yeast strains, functionality assay
  • S. cerevisiae strains were JD53 (MATα his3-Δ200 leu2-3,112 lys2-801 trp1-Δ63 ura3-52), JD55 (MATα his3-Δ200 leu2-3,112 lys2-801 trp1-Δ63 ura3-52 ubr1::HIS3), KMY940 (MATa his3-11,15 leu2-3,112 ade2-1 trp1-1 ura3-1 can1-100 ste18::LEU2), YEL2 (MATa his3-11,15 leu2-3,112 ade2-1 trp1-1 ura3-1 can1-100 ste4::URA3).
  • SD dextrose
  • SG galactose
  • the cells were tested for α-factor sensitivity by a halo assay.
  • Filter disks containing 2.4 µg of α-factor were mounted onto media lacking tryptophan or uracil to select for the presence of the plasmids expressing the constructs and 2% galactose to express HA-Ste4p. Cells were grown at 30°C for 1 day.
  • Example 3 Detecting Conformation Change in p53
  • the tumor suppressor gene p53 has been identified as the most frequent target of genetic alterations in human cancers. Most of these mutations occur in highly conserved regions in the DNA-binding core domain of the p53 protein, suggesting that the amino acid residues in these regions are critical for maintaining normal p53 structure and function.
  • Nvg-p53core-Cub-Dha and the nine different p53core mutants were expressed in yeast at 37°C, and the cleaved and uncleaved fractions of the fusion proteins were probed with the anti-HA antibody on a nitrocellulose blot after cell extraction and SDS-PAGE.
  • the amounts of cleaved and uncleaved fusion protein were quantified by chemiluminescence.
  • the amino acid exchanges of the different p53core mutants are indicated at the top of the blot (Figure 12A). A representation of the effect of the different mutations on the stability of the p53core is shown in Figure 12B.
  • the wild-type core domain and the destabilizing mutation V143A were further tested using a simple growth assay employing the R-ura3 reporter (see above) and selecting clones that can grow on 5-FOA.
  • the Nub (I13A) mutant Nia was used. If the Nub-X-Cub-R-ura3 fusion is stable, the R-ura3 reporter will be cleaved off the C-terminal end of Cub and subsequently be degraded by N-end rule components (see above), allowing the host yeast cell to grow on the 5-FOA selective media. The less stable the p53 mutant, the less cleavage occurs, and thus the less growth of the host cell is observed.
  • Figure 13A shows a Western blot of protein extracts of yeast cells expressing Nub-p53core-Cub-Dha and Nub-V143A-Cub-Dha containing the Nub mutants Nia and Nig.
  • the quantification of the experiment is shown in Figure 13B. Consistent with previous results, the V143A core mutant is a destabilizing mutant when compared to the wild-type p53.
  • Figure 13C shows a growth assay of yeast cells expressing the corresponding Nii, Nia, and Nig Nub-p53core-Cub-RUra3p fusion proteins on media containing 5-FOA.
  • the Nia fusion protein makes it possible to distinguish between the cells expressing the wild-type p53 core (growth) and the cells expressing the V143A mutant of the core (non-growth).
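The growth readout can be summarized as a simple decision rule: a cleaved reporter is degraded, so the cells tolerate 5-FOA but cannot grow without uracil, whereas an uncleaved reporter gives the opposite phenotype. The Python sketch below encodes that rule and applies it to the two observations reported for the Nia constructs. Treating cleavage as all-or-nothing is a simplifying assumption made for illustration, and the function and dictionary names are hypothetical.

```python
def predicted_growth(reporter_cleaved):
    """Expected growth phenotypes for a given state of the RUra3p reporter.

    Cleaved RUra3p is degraded by the N-end rule pathway, so the cells lack
    Ura3p activity: they grow on 5-FOA but not on medium lacking uracil.
    """
    return {"5-FOA": reporter_cleaved, "SD-ura": not reporter_cleaved}


def inferred_state(grows_on_5foa):
    """Infer the predominant reporter state from growth on 5-FOA."""
    return "mostly cleaved" if grows_on_5foa else "mostly uncleaved"


# Growth on 5-FOA reported in the text for the Nia fusion proteins.
observed = {"p53 core wild type": True, "p53 core V143A": False}

states = {name: inferred_state(grows) for name, grows in observed.items()}
for name, state in states.items():
    print(name, "->", state)
```

In a screen for stabilizing compounds, the same rule suggests looking for conditions under which cells expressing the V143A construct regain growth on 5-FOA.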
  • the instant invention provides an assay to identify substances/compounds that can stabilize p53-V143A (or any of the other destabilizing mutants of p53 or other proteins).
  • A high-throughput screening assay can be set up to screen for test compounds (such as small molecules, chemical compounds, etc.) that can stabilize those unstable mutants, thereby correcting defects of mutant proteins. This can be particularly useful in treating cancer and a variety of other diseases, wherein restoration of wild-type p53 function can trigger apoptosis in cancer cells and/or induce growth arrest of aberrantly proliferating cells.
  • test shown in this example is carried out in yeast cells
  • the same assay and also other methods of the invention can also be carried out in mammalian cells with minor modifications using mammalian selectable markers (see above) which are apparent to a skilled artisan.
  • mammalian selectable markers see above
  • the method of the invention may be practiced in a cell-free environment.
  • Equivalents
  • Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents of the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

Abstract

The present invention relates to methods and reagents for monitoring protein structure using a split-ubiquitin intrapolypeptide assay.
EP02718797A 2001-01-04 2002-01-03 Detection de conformation de proteines a l'aide d'un systeme de rapporteurs d'ubiquitine dedoublee Withdrawn EP1349943A2 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US25982701P 2001-01-04 2001-01-04
US259827P 2001-01-04
PCT/US2002/000325 WO2002066656A2 (fr) 2001-01-04 2002-01-03 Detection de conformation de proteines a l'aide d'un systeme de rapporteurs d'ubiquitine dedoublee

Publications (1)

Publication Number Publication Date
EP1349943A2 true EP1349943A2 (fr) 2003-10-08

Family

ID=22986559

Family Applications (1)

Application Number Title Priority Date Filing Date
EP02718797A Withdrawn EP1349943A2 (fr) 2001-01-04 2002-01-03 Detection de conformation de proteines a l'aide d'un systeme de rapporteurs d'ubiquitine dedoublee

Country Status (3)

Country Link
EP (1) EP1349943A2 (fr)
CA (1) CA2432489A1 (fr)
WO (1) WO2002066656A2 (fr)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7056683B2 (en) 2002-11-12 2006-06-06 Massachusetts Institute Of Technology Genetically encoded fluorescent reporters of kinase, methyltransferase, and acetyl-transferase activities
GB0404187D0 (en) * 2004-02-25 2004-03-31 Biotransformations Ltd Binding agents
AU2008289441A1 (en) 2007-08-22 2009-02-26 Cytomx Therapeutics, Inc. Activatable binding polypeptides and methods of identification and use thereof
BRPI1006141B8 (pt) 2009-01-12 2021-05-25 Cytomx Therapeutics Llc composições de anticorpo modificado, métodos para preparar e usar as mesmas
BRPI1011384A2 (pt) 2009-02-23 2016-03-15 Cytomx Therapeutics Inc pro-proteinas e seus metodos de uso

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO02066656A3 *

Also Published As

Publication number Publication date
CA2432489A1 (fr) 2002-08-29
WO2002066656A3 (fr) 2003-02-27
WO2002066656A2 (fr) 2002-08-29

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20030701

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR

AX Request for extension of the european patent

Extension state: AL LT LV MK RO SI

17Q First examination report despatched

Effective date: 20031219

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20070801