EP1464960A1

EP1464960A1 - Screening Method for the Identification of new Proteome-Interacting Compounds

Info

Publication number: EP1464960A1
Application number: EP03007690A
Authority: EP
Inventors: Anne-Claude Gavin; Paola Grandi; Ulrich Kruse
Original assignee: Cellzome GmbH
Current assignee: Cellzome GmbH
Priority date: 2003-04-03
Filing date: 2003-04-03
Publication date: 2004-10-06

Abstract

The present invention relates to the search for new drugs and in particular to a method for screening a library of potentially proteome-interacting candidate compounds for identifying a protein/protein-complex interacting compound and thereby further identifying a proteome-interacting compound. Furthermore, new and yet unidentified interactions between the proteome and compounds can be identified using the method according to the present invention.

Description

The present invention relates to the search for new drugs and in particular to a method for screening a library of potentially proteome-interacting candidate compounds for identifying a protein and/or protein complex interacting compound and thereby further identifying a proteome-interacting compound. Furthermore, new and yet unidentified interactions between the proteome and compounds can be identified using the method according to the present invention.
The search for new drugs in essentially all therapeutic fields of pharmaceutical industry has been limited so far by the identification of so-called "suitable drug targets". These targets are mostly present as proteins and/or enzymes that constitute targets by which, for example, the processes underlying a specific disease can be modulated in order to achieve a beneficial therapeutic effect. In addition, these targets must be "druggable", that is, must be available for the interaction with a drug to be applied.
Currently, the concept of "druggability" (availability) is limited to a rather small number of proteins with enzymatic activity and some cell surface/nuclear receptors. Around 10 % of the human genome can be targeted for the development of new drugs (i.e. is expected to be "druggable"), according to top pharmaceutical industry scientists (BA Festival of Science at the University of Glasgow, 3rd September 2001; see also, for example, Johnson JA, Harris S, Foord SM. Drug target pharmacogenomics: an overview. Am J Pharmacogenomics 2001;1(4):271-81).
"Genomics is already beginning to reshape the way drugs are discovered, with around 3,500 of human genes representing potential druggable targets. So far, the complexities of the drug discovery process have meant that the pharmaceutical industry has exploited only 450 targets." (Mark Fidock, Pfizer Ltd. "From genes and cells to healthcare forum", organised by the Biotechnology and Biological Sciences Research Council).
A greater understanding of the molecular pathways that lead to disease will provide better targets and more selective and safer compounds. This will help reduce the rate of attrition (drugs that fail to make it to market due to poor clinical efficacy or safety) and improve the overall efficiency of the process. A deeper understanding of cellular processes is needed, particularly in disease. Novel therapeutic targets can be found from understanding how proteins interact with each other and how they work co-operatively in cells as part of larger complexes.
From a pharmaceutical perspective, not all genes are equal and the industry is heavily investing in technologies required to identify those genes that are both highly druggable and disease relevant.
One way out of the dilemma of the lack of druggable targets would be to increase the number of drugs that are available for the druggable targets. The recent development of new drug classes, like anti-sense RNA (and probably also RNAi) and antibodies, could definitely help to bypass this limitation. These approaches remain, however, at relatively early stages, and their general use as future therapeutic tools to cure diseases is still in question.
Therefore, the therapeutic tools of choice for curing diseases remain small compounds. The broadening of the "druggable space" for small compounds and the possibility to target new enzyme classes will represent a clear advantage.
In the case of traditional drug discovery, in one approach so-called "small molecule libraries" are screened against a single pre-defined target. Little is yet known about the selectivity of such an assay in a crude proteome (the specific protein composition of a cell or tissue); for example, the ability to bind related enzymes (coming from similar or identical protein families) or other enzymes/protein classes including, but not limited to, drug transporters and/or drug modifying enzymes.
The current methods that are developed in order to tackle the above mentioned issues usually involve proteins that are expressed in heterologous systems, like bacteria or phages. A very commonly used system employs the phage display system.
One example of this system is the so-called "Proteome scan™", which screens for targets and lead structures (i.e. structures forming the chemical and structural basis ("core") of a future drug) at the proteome level. The method comprises a screening format in which one compound/peptide is tested against a phage display library that usually contains approx. 10000 human proteins (in the case of a high quality library). The drugs themselves are compounds with unknown modes of action and can constitute a library as well. Currently, the introduction of robotics leads to a "through-put" of 200 scans per month, a number that is said to be expected to rise to a through-put of 4000 screens per month.
WO 02/092118 relates to proteome chips comprising arrays having a large proportion of all proteins expressed in a single species and to methods for using proteome chips to systematically assay all protein interactions in a species in a high-throughput manner. Furthermore, also methods for making protein arrays by attaching double-tagged fusion proteins to a solid support are described.
Evans DM et al. (in: Evans DM, Williams KP, McGuinness B, Tarr G, Regnier F, Afeyan N, Jindal S. Affinity-based screening of combinatorial libraries using automated, serial-column chromatography. Nat Biotechnol 1996 Apr;14(4):504-7) describe an automated serial chromatographic technique for screening a library of compounds based upon their relative affinity for a target molecule. A "target" column containing the immobilised target molecule is set in tandem with a reversed-phase column. A combinatorial peptide library is injected onto the target column. The target-bound peptides are eluted from the first column and transferred automatically to the reversed-phase column. The target-specific peptide peaks from the reversed-phase column are identified and sequenced. Using a monoclonal antibody (3E-7) against beta-endorphin as a target, a single peptide with the sequence YGGFL from approximately 5800 peptides present in a combinatorial library was selected. The technique is described as having broad applications for high-throughput screening of chemical libraries or natural product extracts.
Although powerful, the above methods have limitations, since most proteins require laborious techniques, folding, post-translational modifications and association in complexes with regulatory/modulatory components. Although important, these post-translational events are not always sustained in heterologous systems.
Thus, there is an ongoing need in the pharmaceutical industry for efficient screening systems that are suitable for high-throughput screening and allow for the identification of new and more "physiological" targets, i.e. methods that more efficiently mimic the situation in the proteome in vivo.
This problem is solved by providing a method for screening a library of potentially proteome-interacting candidate compounds, comprising: a) providing a library comprising non-labelled potentially proteome-interacting candidate compounds, b) providing a second library comprising a variety of proteomes, wherein each proteome contains at least one labelled polypeptide, c) contacting the library from a) with the library in b) in a manner as to allow for an interaction of the candidate compounds with the at least one labelled polypeptide, d) determining an interaction between said candidate compounds and said at least one labelled polypeptide, and thereby identifying a proteome-interacting compound.
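Steps a) to d) can be pictured as an all-against-all loop over the two libraries. The following Python fragment is only an illustrative sketch: the names `Proteome`, `screen` and `toy_readout`, as well as all compound and strain identifiers, are hypothetical and do not appear in the specification, and the read-out function stands in for whatever label detection is actually used.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class Proteome:
    """One member of library b): a proteome carrying one labelled polypeptide."""
    name: str
    labelled_polypeptide: str


def screen(
    compounds: Iterable[str],
    proteomes: Sequence[Proteome],
    label_detected: Callable[[str, Proteome], bool],
) -> List[Tuple[str, str, str]]:
    """Contact every compound (step a) with every proteome (step b) and
    record the pairs for which the label read-out is positive (steps c, d)."""
    hits = []
    for compound in compounds:
        for proteome in proteomes:
            if label_detected(compound, proteome):
                hits.append((compound, proteome.name, proteome.labelled_polypeptide))
    return hits


# Toy data; every identifier below is made up for illustration.
compounds = ["cmpd-1", "cmpd-2", "cmpd-3"]
proteomes = [Proteome("strain-A", "ProtA"), Proteome("strain-B", "ProtB")]
known_binding = {("cmpd-2", "ProtB")}


def toy_readout(compound: str, proteome: Proteome) -> bool:
    # Assumption: a lookup table replaces the real detection of the label.
    return (compound, proteome.labelled_polypeptide) in known_binding


hits = screen(compounds, proteomes, toy_readout)
print(hits)
```

Because each proteome carries a known labelled polypeptide, a positive read-out identifies the compound and the polypeptide (or its complex) at the same time.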
A "library" according to the present invention relates to a (mostly large) collection of (numerous) different chemical entities that are provided in a sorted manner that enables both a fast functional analysis (screening) of the different individual entities, and at the same time provide for a rapid identification of the individual entities that form the library. Examples are collections of tubes or wells or spots on surfaces that contain chemical compounds that can be added into reactions with one or more defined potentially interacting partners in a high-throughput fashion. After the identification of a desired "positive" interaction of both partners, the respective compound can be rapidly identified due to the library construction. Libraries of synthetic and natural origins can either be purchased or designed by the skilled artisan.
In the context of the present invention, two different libraries are used. In the first "library of potentially proteome-interacting candidate compounds", a library of potentially proteome-interacting compounds is provided, wherein a collection of compounds is provided that potentially interact with the proteome to be analysed (screened). Examples of compounds are synthetic and/or naturally occurring chemical compounds, peptides, proteins, nucleic acids, antibodies, and the like.
Examples of the construction of libraries are provided in, for example, Breinbauer R, Manger M, Scheck M, Waldmann H. Natural product guided compound library development. Curr Med Chem. 2002 Dec;9(23):2129-45, wherein natural products are described that are biologically validated starting points for the design of combinatorial libraries, as they have a proven record of biological relevance. This special role of natural products in medicinal chemistry and chemical biology can be interpreted in the light of new insights about the domain architecture of proteins gained by structural biology and bioinformatics. In order to fulfil the specific requirements of the individual binding pocket within a domain family it is necessary to optimise the natural product structure by chemical variation. Solid-phase chemistry is said to become an efficient tool for this optimisation process, and recent advances in this field are highlighted in this review article. Other related references include Edwards PJ, Morrell AI. Solid-phase compound library synthesis in drug design and development. Curr Opin Drug Discov Devel. 2002 Jul;5(4):594-605.; Merlot C, Domine D, Church DJ. Fragment analysis in small molecule discovery. Curr Opin Drug Discov Devel. 2002 May;5(3):391-9. Review; Goodnow RA Jr. Current practices in generation of small molecule new leads. J Cell Biochem Suppl. 2001;Suppl 37:13-21; which describes that the current drug discovery processes in many pharmaceutical companies require large and growing collections of high quality lead structures for use in high throughput screening assays. Collections of small molecules with diverse structures and "drug-like" properties have, in the past, been acquired by several means: by archive of previous internal lead optimisation efforts, by purchase from compound vendors, and by union of separate collections following company mergers. 
Although high throughput/combinatorial chemistry is described as being an important component in the process of new lead generation, the selection of library designs for synthesis and the subsequent design of library members has evolved to a new level of challenge and importance. The potential benefits of screening multiple small molecule compound library designs against multiple biological targets offer substantial opportunity to discover new lead structures. Subsequent optimisation of such compounds is often accelerated because of the structure-activity relationship (SAR) information encoded in these lead generation libraries. Lead optimisation is often facilitated due to the ready applicability of high-throughput chemistry (HTC) methods for follow-up synthesis. Some of the strategies, trends, and critical issues central to the success of lead generation processes are discussed. Finally, one use of such a library is described in, for example, Wakeling AE, Barker AJ, Davies DH, Brown DS, Green LR, Cartlidge SA, Woodburn JR. Specific inhibition of epidermal growth factor receptor tyrosine kinase by 4-anilinoquinazolines. Breast Cancer Res Treat. 1996;38(1):67-73.
The proteome of a cell, tissue, and/or organism is largely sub-organised in protein-complexes, i.e. aggregations of proteins that form functional and/or structural sub-divisions of a cell. Examples of these complexes and their functional relevance are described in, for example, Gavin AC, Superti-Furga G. Protein complexes and proteome organisation from yeast to man. Curr Opin Chem Biol 2003 Feb;7(1):21-7, wherein protein complexes are described as being the most relevant molecular units of cellular function. The activities of protein complexes have to be regulated both in time and space to integrate within the overall cell programs. The cell can be compared to a factory orchestrating individual assembly lines into integrated networks fulfilling particular and superimposed tasks.
The library of the potentially proteome-interacting compounds contains entities that are usually "non-labelled", i.e. that do not contain a marker that would be necessary for an identification of the library compound. Nevertheless, also libraries of labelled potentially proteome-interacting compounds can be screened without using the labels of the compounds.
The present invention furthermore uses a second "library of a variety of proteomes". This library therefore contains a collection of proteomes from different cells, organs, tissues or pools of cells. "Proteomes" as used in the context of the present invention are, for example, described in Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM, Remor M, Hofert C, Schelder M, Brajenovic M, Ruffner H, Merino A, Klein K, Hudak M, Dickson D, Rudi T, Gnau V, Bauch A, Bastuck S, Huhse B, Leutwein C, Heurtier MA, Copley RR, Edelmann A, Querfurth E, Rybin V, Drewes G, Raida M, Bouwmeester T, Bork P, Seraphin B, Kuster B, Neubauer G, Superti-Furga G. Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 2002 Jan 10;415(6868):141-7. Thus, a "proteome" according to the present invention is the specific protein composition of a cell or tissue or organism. Depending on the individual cell or tissue, one culture of yeast cells could, theoretically, contain as many proteomes as there are cells in said culture. For convenience, the proteomes of one cell culture under the same growth conditions are regarded as identical, i.e. representing one proteome. In other words, the proteomes of one type of organism differ depending on the state of the cells and the genomic background of the cells.
An "interaction" according to the present invention relates to an interaction between different components of the libraries that leads to either functional or non-functional effects. That is, one component is attached to the other with or without modifying the function of the polypeptide, the complex and/or the proteome. The attachment can be either covalent or non-covalent.
In general, the method of the present invention determines the interaction of a protein with a compound. Usually, a "protein-complex interacting compound" is a compound that interacts with at least one protein of a cell. Consequently, a "protein-complex interacting compound" is a compound that interacts with at least one protein of the cell as a protein-complex or as a subunit of a protein-complex. A "protein-complex" according to the present invention therefore can be formed by only one polypeptide. A "protein-complex interacting compound" therefore is a compound from one library that interacts with a complex or a protein as subunits of a proteome. Consequently, a "proteome-interacting compound" is a protein-complex interacting compound that interacts with a complex as part of a proteome.
In one embodiment of the present invention, a method is provided, wherein the potentially proteome-interacting candidate compounds are selected from enzymes, polypeptides, peptides, antibodies and fragments thereof, nucleic acids or derivatives thereof, and chemical entities having a molecular mass of less than 1000 kDa ("small molecules"). Thus, in one library of the present invention potentially proteome-interacting compounds are provided that potentially interact with the proteome to be analysed (screened). Examples of such compounds are synthetic and/or naturally occurring chemical compounds, peptides, proteins, antibodies, and the like. Since the library is related to "small" compounds, i.e. other than complete enzymes, antibodies or other proteins, the molecular weight of these compounds is preferably below 1000 kDa, more preferably below 500 kDa. Of course, these compounds can be much smaller, e.g. smaller than 1000 Da. Such compounds can be suitable as "leads" for further optimisation. One example of a peptide library is described in Sachpatzidis A, et al. Identification of allosteric peptide agonists of CXCR4. J Biol Chem 2003 Jan 10;278(2):896-907, wherein a synthetic cDNA library coding for 160,000 different SDF-based peptides was screened for small molecule CXCR4 agonist activity in a yeast strain.
In another embodiment of the method according to the present invention, said potentially proteome-interacting candidate compound is a nucleic acid, such as a DNA, RNA and/or PNA. Such nucleic acids can be present in the form of oligonucleotides or polynucleotides, covering binding-specific nucleic acid sequences and/or motifs. Hybrid nucleic acids between the different forms might also be employed. Furthermore, the library can be present on a chip for high-throughput screening purposes.
In the context of the present invention, libraries comprising synthetic and/or naturally occurring "small" chemical compounds (e.g. drugs, metabolites, prodrugs, potential drugs, potential metabolites, potential prodrugs and the like) are most preferred.
As mentioned above, the library of potentially proteome-interacting candidate compounds can be present in different formats, such as in liquid solution, such as in tubes, microtiter-plates, or on a solid support, such as on filters, glass slides, silicon surfaces, beads or a customised chemical microarray. Microarrays are preferred due to their easy handling and their uses in high-throughput formats.
In one preferred embodiment of the method according to the present invention, the potentially proteome-interacting candidate compounds are bound to beads, such as sepharose (e.g. NHS-activated sepharose) or agarose beads. In contrast to Evans DM et al. (1996, see above), the present invention uses immobilised compound libraries that are screened with proteomes, which allows for a much more flexible screening procedure than using columns with immobilised binding partners. Furthermore, on columns the "in vivo-like conditions" of the present method are easily lost, which renders some interactions unspecific and leads to imprecise results. Preferably, said potentially proteome-interacting candidate compounds are bound to said beads via an amino-group or carboxy-group.
The use of proteomes (and protein-complexes) allows for an increased flexibility during the screening procedure. In contrast to other methods according to the art, the present method can also be used in an in vivo-environment. Thus, in performing the method according to the present invention, a variety of proteomes can be used that is present in or derived from one single cell or cell culture or from a mixture of cells, such as a tissue, organ or organism. In an even more preferred embodiment of the present invention the variety of proteomes is present in or derived from one single cell or cell culture or from a mixture of cells, such as a tissue, organ or organism, wherein said single cell or cell culture or mixture of cells, such as a tissue, organ or organism, was exposed to certain conditions, such as heat, stress, starvation, drugs, radioactivity, chemical agents, toxins, viral infection, antibiotics, and ageing. These conditions lead to different "sets" of proteomes that can be used for the screening procedure. This approach resembles the "real" situation in vivo as closely as possible.
The method according to the present invention can be employed on a large variety of proteomes that are present in or derived from a cell selected from prokaryotic or eukaryotic cells, such as a bacterial cell, a pathogenic micro-organism, a fungal cell, a yeast cell, a plant cell, a mammalian cell, a fish cell, a nematode cell, an insect cell, and, in particular, a non-human stem-cell. Furthermore, the method according to the present invention can be employed on a large variety of proteomes that are present in or derived from a tissue or organ, such as connective tissue, endothelial tissue, brain, bone, liver, heart, skeletal muscles, prostate, colon, kidney, glands, lymph nodes, pancreas, roots, leaves, and flowers. Finally the variety of proteomes can be present in a non-human organism or derived from an organism, such as E. coli, Drosophila melanogaster, Caenorhabditis elegans, zebrafish, rat, hamster, mouse, goat, sheep, monkey, human, jellyfish, rice, potato, Arabidopsis, wheat, oat, and tobacco. A human in-vivo use of the present invention is explicitly excluded from the scope of the present invention.
In the case of a preferred in-vitro method according to the present invention, said variety of proteomes is present in or derived from a lysate of one single cell or cell culture or from a lysate of a mixture of cells, such as a tissue, organ or organism, identical to those described above.
As already pointed out, the method according to the present invention in general makes no use of screening compound libraries that contain labelled potentially proteome-interacting candidate compounds. Although labelled libraries could also be used for screening, according to the preferred embodiment of the present invention, each proteome used for screening contains only one proteome-complex comprising one labelled polypeptide. Even more preferred is a method according to the present invention, wherein each member of the library in b) above contains only one labelled polypeptide that is different from the other members of said library.
One example for such a library would be the production of an essentially complete collection of cells in which a different gene (or polypeptide) is labelled (tagged), respectively. One particular example is TAP-tagging in yeast which recently was used in order to identify the yeast proteome (Gavin et al. Nature 415,141-7 (2002)).
EP 1 105 508 B1 as well as Puig O, et al. ("The tandem affinity purification (TAP) method: a general procedure of protein complex purification." Methods. 2001 Jul;24(3):218-29.; and Rigaut G, et al. A generic protein purification method for protein complex characterization and proteome exploration. Nat Biotechnol 1999 Oct;17(10):1030-2) describe the general principle of the TAP-method. The tandem affinity purification (TAP) method is described as a tool that allows rapid purification under native conditions of complexes, even when expressed at their natural level. Prior knowledge of complex composition or function is not required. The TAP method requires fusion of the TAP tag, either N- or C-terminally, to the target protein of interest. The TAP method was initially developed in yeast but can be successfully adapted to various organisms, such as mammalian cells (Cox DM, Du M, Guo X, Siu KW, McDermott JC. Tandem affinity purification of protein complexes from mammalian cells. Biotechniques. 2002 Aug;33(2):267-8, 270.)
Preferably, the label of the labelled polypeptide is selected from the group of radiolabels, such as ³²P, ³⁵S, ³H, ¹²⁵I, ⁹⁹ᵐTc, ¹¹¹In, and the like, dye labels, labels that can be detected with antibodies, enzyme labels, and labels having a detectable mass. Preferred are labels that can be specifically detected with antibodies. Examples of the use of mass spectroscopy in proteome analysis are described in Ho Y, et al. (without labels) ("Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry" Nature. 2002 Jan 10;415(6868):180-3.); Gu S, et al. ("Precise peptide sequencing and protein quantification in the human proteome through in vivo lysine-specific mass tagging". J Am Soc Mass Spectrom 2003 Jan; 14(1):1-7); and Williams C and Addona TA ("The integration of SPR biosensors with mass spectrometry: possible applications for proteome analysis." Trends Biotechnol. 2000 Feb;18(2):45-8). Preferably, the label of the labelled polypeptide is selected from phosphorescent markers, fluorescent markers, chemiluminescent markers, phosphatases, streptavidin, biotin, TAP-method markers, and peroxidases. A wide variety of labels can be used when performing the method of the present invention. In addition to TAG, other markers are Arg-tag, calmodulin-binding peptide, cellulose-binding domain, DsbA, c-myc-tag, glutathione S-transferase, FLAG-tag, HAT-tag, His-tag, maltose-binding protein, NusA, S-tag, SBP-tag, Strep-tag, and thioredoxin. The uses of these tags are extensively described in the literature (for example, in Terpe K. Overview of tag protein fusions: from molecular and biochemical fundamentals to commercial systems. Appl Microbiol Biotechnol 2003 Jan;60(5):523-33).
In another aspect of the method according to the present invention, each member of the library in b) above comprises a mixture or collection (pool) of different proteomes. This embodiment is used in particular in an initial screening step that shall result into the formation of library-subgroups in order to accelerate the throughput of all library samples. These subgroups could, for example, each contain approximately 1000 proteomes in the initial screening, then 100 in the second screening of positive initial pools and finally are screened based on the individual proteomes. In accordance with this embodiment, different labels can be used in order to further discriminate between the subsets and/or proteomes in the pools.
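The pooling scheme described above (pools of approximately 1000 proteomes, then 100, then individual proteomes) can be sketched as a staged deconvolution. The fragment below is a minimal illustration under stated assumptions: the function names, the library size of 6000 and the single binder are invented for the example, and the sketch assumes that a pool reads positive whenever any of its members interacts, which a real assay may not guarantee (e.g. because of dilution in large pools).

```python
from typing import Callable, List, Sequence, Tuple


def chunk(items: Sequence[int], size: int) -> List[List[int]]:
    """Split a list of proteome identifiers into consecutive pools."""
    return [list(items[i:i + size]) for i in range(0, len(items), size)]


def deconvolve(
    proteomes: Sequence[int],
    pool_is_positive: Callable[[Sequence[int]], bool],
    pool_sizes: Tuple[int, ...] = (1000, 100, 1),
) -> Tuple[List[int], int]:
    """Screen pools of decreasing size; only positive pools are split
    further. Returns the individual hits and the number of assays run."""
    candidates = [list(proteomes)]
    assays = 0
    for size in pool_sizes:
        positives = []
        for group in candidates:
            for pool in chunk(group, size):
                assays += 1
                if pool_is_positive(pool):
                    positives.append(pool)
        candidates = positives
    hits = [p for pool in candidates for p in pool]
    return hits, assays


# Hypothetical example: 6000 tagged proteomes, one of which (4321) binds.
binders = {4321}
hits, assays = deconvolve(
    range(6000), lambda pool: any(p in binders for p in pool)
)
print(hits, assays)
```

In this toy case the three stages need 6 + 10 + 100 = 116 assays instead of 6000 individual ones; the saving shrinks as the fraction of positive pools grows.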
Most preferred is a method according to the present invention, wherein contacting the library from a) above with the library in b) above is performed essentially under physiological conditions. This approach resembles the "real" situation in vivo as closely as possible, an elementary advantage of the method of the present invention, in contrast to other methods according to the state of the art that easily result in false positive results. For this purpose, contacting the library from a) above with the library in b) above is preferably performed using a suitable buffer and, optionally, a cofactor, such as calcium, magnesium, potassium, ATP, ADP, cAMP, and the like. Cofactors can significantly improve the formation of both complexes and proteomes and at the same time the interaction of the proteome and/or the complexes with the potentially interacting compound. As used herein the term "physiological conditions" refers to temperature, pH, ionic strength, viscosity, and like biochemical parameters which are compatible with a viable organism, and/or which typically exist intracellularly in a viable cultured yeast cell or mammalian cell. For example, the intracellular conditions in a yeast cell grown under typical laboratory culture conditions are physiological conditions. Suitable in vitro reaction conditions for in vitro transcription cocktails are generally physiological conditions. In general, in vitro physiological conditions comprise 50-200 mM NaCl or KCl, pH 6.5-8.5, 20-45 °C and 0.001-10 mM divalent cation (e.g., Mg⁺⁺, Ca⁺⁺); preferably about 150 mM NaCl or KCl, pH 7.2-7.6, 5 mM divalent cation, and often include 0.01-1.0 percent nonspecific protein (e.g., BSA). A non-ionic detergent (Tween, NP-40, Triton X-100) can often be present, usually at about 0.001 to 2%, typically 0.05-0.2% (v/v). Particular aqueous conditions may be selected by the practitioner according to conventional methods.
For general guidance, the following buffered aqueous conditions may be applicable: 10-250 mM NaCl, 5-50 mM Tris HCl, pH 5-8, with optional addition of divalent cation(s) and/or metal chelators and/or non-ionic detergents and/or membrane fractions and/or anti-foam agents and/or scintillants.
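The numerical ranges given above for in vitro physiological conditions (50-200 mM NaCl or KCl, pH 6.5-8.5, 20-45 °C, 0.001-10 mM divalent cation) lend themselves to a simple plausibility check of a planned buffer. The following sketch is illustrative only; the class `Buffer`, its field names and the function `out_of_range` are hypothetical, and the temperature of 30 °C in the example is an assumption, not a value taken from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Buffer:
    salt_mM: float             # NaCl or KCl concentration
    pH: float
    temp_C: float
    divalent_cation_mM: float  # e.g. Mg++ or Ca++


# Ranges quoted in the description for in vitro physiological conditions.
RANGES = {
    "salt_mM": (50.0, 200.0),
    "pH": (6.5, 8.5),
    "temp_C": (20.0, 45.0),
    "divalent_cation_mM": (0.001, 10.0),
}


def out_of_range(buffer: Buffer) -> List[str]:
    """Return the names of all parameters lying outside the quoted ranges."""
    return [
        name
        for name, (low, high) in RANGES.items()
        if not low <= getattr(buffer, name) <= high
    ]


# About 150 mM salt, pH 7.4, 5 mM divalent cation; 30 degrees C is assumed.
preferred = Buffer(salt_mM=150, pH=7.4, temp_C=30, divalent_cation_mM=5)
harsh = Buffer(salt_mM=500, pH=7.4, temp_C=4, divalent_cation_mM=5)
print(out_of_range(preferred), out_of_range(harsh))
```

The check covers only the four parameters with explicit ranges; protein and detergent additives would need to be added separately.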
In a most preferred embodiment of the method according to the present invention, determining an interaction between said candidate compounds and said at least one labelled polypeptide comprises isolation of the labelled polypeptide and/or the complexes. Such isolation can be performed using common separation and/or purification techniques, such as chromatography, gel filtration, precipitation, immune absorption, gel electrophoresis, centrifugation, and the like. Furthermore preferred is a method according to the present invention, wherein said determining an interaction between said candidate compounds and said at least one labelled polypeptide comprises a detection of the bound labelled polypeptide using antibodies, radioactivity detection methods, dye detection methods, enzymatic detection methods and mass spectroscopy. Such detection depends on the type of label that is used. Since the proteome-complex interacting compound interacts with a complex as a subunit of a proteome, it is possible not only to analyse the direct interaction of the actually labelled polypeptide and the proteome-complex interacting compound, but also to obtain information regarding the context of the binding by analysis of the other partners in the proteome-complex, whereby initial information on the "mode of action" of a compound can be obtained. Furthermore, if the compound is a known pharmaceutical, the indirectly interacting protein-partners of said compound can be identified via the binding/interaction with a specific proteome-complex. Finally, different compound and proteome-complex interactions in different proteomes can identify different modes of action between different organisms, cellular states and different diseases, both of known and/or unknown compounds. Consequently, the polypeptide-interacting compound as a proteome-complex interacting compound also reveals information about the proteome(s). This cannot be achieved using other methods as present in the art.
As mentioned above, in a preferred method according to the present invention screening is performed in vivo or in vitro. Preferred are uses in vitro, as described above.
In yet another embodiment of the method according to the present invention, the steps a) above to d) above are repeated, wherein the interacting compounds identified in d) above are used to provide an improved candidate library for step a) above. These rounds can be regarded as "pre-screening" rounds and/or be performed as control screens. In addition, the composition of the proteome library and/or compound library can be modified between the screens. Furthermore, pools can be generated, as indicated above. Finally, an additional round of screening can be performed as end-screening, in order to improve the reliability of the present method.
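The repetition of steps a) to d) can be sketched as successive filtering, in which the compounds found to interact in one round form the candidate library of the next round and the proteome library may change between rounds. The fragment is a simplified illustration; `refine_library`, the compound names and the proteome names are hypothetical, and a lookup table replaces the real interaction read-out.

```python
from typing import Callable, Iterable, List, Sequence


def refine_library(
    compounds: Iterable[str],
    proteome_rounds: Sequence[Sequence[str]],
    interacts: Callable[[str, str], bool],
) -> List[str]:
    """Run one screen per entry of proteome_rounds; compounds that interact
    with at least one proteome of a round are carried into the next round."""
    library = list(compounds)
    for proteomes in proteome_rounds:
        library = [
            c for c in library if any(interacts(c, p) for p in proteomes)
        ]
    return library


# Toy interaction table; all names are invented for illustration.
toy_pairs = {
    ("cmpd-1", "proteome-A"),
    ("cmpd-2", "proteome-A"),
    ("cmpd-2", "proteome-B"),
}
rounds = [["proteome-A"], ["proteome-B"]]  # library b) modified between screens
refined = refine_library(
    ["cmpd-1", "cmpd-2", "cmpd-3"], rounds, lambda c, p: (c, p) in toy_pairs
)
print(refined)
```

Here the first round acts as a pre-screen that removes `cmpd-3`, and the second round keeps only the compound that also interacts in the modified proteome library.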
In another preferred embodiment of the method according to the present invention, the screening is performed at least in part in a high throughput manner. High throughput techniques are usually employed in strategies that involve large libraries of compounds, i.e. potentially proteome-interacting compounds and/or proteome-libraries. In general, in the course of the method according to the present invention, one single potentially proteome-interacting compound is screened (brought in contact) with a multitude of differently tagged proteomes, coming from one or several organisms to be screened.
Examples of protein related high-throughput technologies are described in, for example, Sreekumar A and Chinnaiyan AM ("Protein microarrays: a powerful tool to study cancer". Curr Opin Mol Ther 2002 Dec;4(6):587-93), in which protein microarrays for examining the cellular proteome are described. Further described is the use of antigen and antibody arrays. Templin MF, et al. ("Protein microarray technology" Drug Discov Today 2002 Aug 1;7(15):815-22) describe the use of microarray technology in the form of microspots of capture molecules that are immobilised in rows and columns onto a solid support and exposed to samples containing the corresponding binding molecules. Readout systems based on fluorescence, chemiluminescence, mass spectrometry, radioactivity or electrochemistry can be used to detect complex formation within each microspot. Furthermore, arrays containing immobilised DNA probes that are exposed to complementary targets and their degree of hybridisation are described. Finally, US 6,197,599 relates to a device that comprises a solid support and multiple immobilised agents for protein detection. The immobilised agents are mainly proteins, such as antibodies and recombinant proteins. The immobilised agents can be synthesised peptides or other small chemicals. Agents are individually deposited in a predetermined order, so that each of the agents can be identified by the specific position it occupies on the support. The immobilised agents on the solid support retain their protein binding capability and specificity. Methods employing the device are described as being extremely powerful in screening protein expression patterns, protein post-translational modifications and protein-protein interactions. All these methods and devices can be easily adapted for the use in a method according to the present invention.
According to yet another embodiment of the method according to the present invention, the identifying of the proteome-complex interacting compound and said further identifying said proteome-interacting compound is performed, at least in part, by a computer system.
The use of these systems is preferred, due to the enormous amount of data that is generated and/or has to be handled in the high-throughput environment. Such handling is described, for example, in Jansen R et al. ("Relating whole-genome expression data with protein-protein interactions". Genome Res. 2002 Jan;12(1):37-46.), and Stumm G, et al. ("Deductive genomics: a functional approach to identify innovative drug targets in the post-genome era." Am J Pharmacogenomics 2002;2(4):263-71). Similarly, US 6,064,754 (hereby incorporated by reference in its entirety) relates to computer-assisted methods and apparatus for identifying, selecting and characterising biomolecules in a biological sample. A two-dimensional array is generated by separating biomolecules present in a complex mixture and a computer-readable profile is constructed representing the identity and relative abundance of a plurality of biomolecules detected by imaging the two-dimensional array. Computer-mediated comparison of profiles from multiple samples permits automated identification of subsets of biomolecules that satisfy pre-ordained criteria. Identified biomolecules can be automatically isolated from the two-dimensional array by a robotic device in accordance with computer-generated instructions. A supported gel suitable for electrophoresis is provided that is bonded to a solid support such that the gel has two-dimensional spatial stability and the solid support is substantially non-interfering with respect to detection of a label, such as a fluorescent label, associated with one or more biomolecules in the gel. Finally, US 6,146,830 (hereby incorporated by reference in its entirety) relates to methods and systems for characterising the actions of drugs in cells. In particular, described are methods for determining the presence of a number of primary targets through which a drug, drug candidate, or other compound of interest acts on a cell.
Furthermore, methods for drug development based on the disclosed methods for determining the presence of a number of primary targets of a drug are also disclosed, which involve: (i) measuring responses of cellular constituents to graded exposures of the cell to a drug of interest; (ii) identifying an "inflection concentration" of the drug for each cellular constituent measured; and (iii) identifying "expression sets" of cellular constituents from the distribution of the inflection drug concentrations. Each expression set corresponds to a particular primary target of the drug. Finally, computer systems are described which determine the presence of a number of targets of a drug by executing the disclosed methods. All these methods can be easily adapted for the use in a method according to the present invention.
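Steps (i) to (iii) can be illustrated by a minimal computational sketch. It does not reproduce the algorithm of US 6,146,830. In particular, defining the "inflection concentration" as the dose at which the response magnitude first reaches half of its maximum, and grouping constituents whose inflection concentrations lie within a fixed fold-range of each other, are assumptions made here for illustration only, as are all function and parameter names.

```python
import math
from typing import Dict, List, Optional, Sequence


def inflection_concentration(conc: Sequence[float],
                             resp: Sequence[float]) -> Optional[float]:
    """Dose at which |response| first reaches half of its maximum (assumed definition).

    Interpolates linearly in log10(concentration) between the two bracketing doses.
    """
    mags = [abs(r) for r in resp]
    half = max(mags) / 2.0
    if half == 0:
        return None  # constituent does not respond at any dose
    for i, m in enumerate(mags):
        if m >= half:
            if i == 0:
                return conc[0]
            x0, x1 = math.log10(conc[i - 1]), math.log10(conc[i])
            frac = (half - mags[i - 1]) / (m - mags[i - 1])
            return 10 ** (x0 + frac * (x1 - x0))
    return None


def expression_sets(inflections: Dict[str, float],
                    max_fold: float = 3.0) -> List[List[str]]:
    """Group constituents whose sorted inflection concentrations differ by at most max_fold."""
    ordered = sorted(inflections.items(), key=lambda kv: kv[1])
    sets: List[List[str]] = []
    previous: Optional[float] = None
    for name, c in ordered:
        if previous is not None and c / previous <= max_fold:
            sets[-1].append(name)  # same presumed primary target
        else:
            sets.append([name])    # start a new expression set
        previous = c
    return sets
```

Under these assumptions, each returned group would be read as one candidate primary target of the drug; the fold threshold would need to be chosen from the actual distribution of inflection concentrations.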
According to yet another aspect of the present invention, a proteome-interacting compound or its pharmaceutically acceptable salts is provided that has been identified based on the method according to the present invention as described above. These compounds can be used in order to provide new pharmaceutical compositions that include a proteome-interacting compound according to the present invention, together with a suitable carrier and/or diluent. The compounds that are found to be interactive with a labelled polypeptide that is part of a complex that, in turn, belongs to a specific proteome can be of varying nature. Depending on the library for screening, these compounds are selected from enzymes, polypeptides, peptides, antibodies and fragments thereof, nucleic acids or derivatives thereof, and chemical entities having a molecular mass of less than 1000 kDa ("small molecules"). Thus, in one library of the present invention potentially proteome-interacting compounds are provided that potentially interact with the proteome to be analysed (screened). Examples of such compounds are synthetic and/or naturally occurring chemical compounds, peptides, proteins, antibodies, and the like. Since the library is related to "small" compounds as indicated above, i.e. other than complete enzymes, antibodies or other proteins, the molecular weight of these compounds is preferably below 1000 kDa, more preferably below 500 kDa. Such compounds can be suitable as "leads" for further optimisation. Furthermore, the proteome-interacting compound can be a nucleic acid, such as a DNA, RNA and/or PNA. Such nucleic acids can be present in the form of oligonucleotides or polynucleotides, covering binding-specific nucleic acid sequences and/or motifs, and can be suitable for, e.g., gene therapy or antisense-technology.
The method according to the present invention can be employed in a variety of medical and pharmaceutical uses, such as for further lead optimisation (e.g. in the cases of pre-screened compounds and/or already used pharmaceutical compounds), elucidating the mode of action of a compound (e.g. for scientific purposes and the finding of additional druggable targets), finding of further medical uses, the identification of potential side effects of the compound of interest in cases where the identified proteome-interacting compound is known to elicit side effects, and for the identification of diagnostic agents for a specific disease or condition. In yet another aspect, the method according to the present invention can be used for the identification of new lead compounds for established protein target classes, new protein target classes for known lead compounds and/or new lead compounds for new protein target classes. Finally, according to yet another aspect of the present invention, the inventive method can be used for the development of new tools for the functional assessment of new targets (e.g. chemical knock-outs) or the development of prediction/modellisation-tools for the binding of compounds to targets, drug transporters, and drug modifying enzymes (for example, by using computer-modelling techniques, see above), and as an improved data source for bioinformatic purposes.
Many of the above uses can be accomplished by comparing known protein-complex data with the newly identified labelled polypeptide-compound interaction. Since the proteome-complex interacting compound interacts with a complex as a subunit of a proteome, it is possible, based on the data as present, not only to analyse the direct interaction of the actually labelled polypeptide and the proteome-complex interacting compound, but also to obtain information regarding the context of the binding by analysis of the other partners in the proteome-complex, whereby initial information on the "mode of action" of a compound can be obtained. Furthermore, if the compound is a known pharmaceutical, the indirectly interacting protein-partners of said compound can be identified via the binding/interaction with a specific proteome-complex. Finally, different compound and proteome-complex interactions in different proteomes can identify different modes of action between different organisms, cellular states and different diseases, both of known and/or unknown compounds. Consequently, the polypeptide-interacting compound as a proteome-complex interacting compound also reveals information about the proteome(s). This cannot be achieved using other methods known in the art and allows for the generation of important information with respect to the mode of action, further medical uses and/or potential side effects. One example for an analysis of complex protein-protein interactions can be found in Drewes G and Bouwmeester T ("Global approaches to protein-protein interactions." Curr Opin Cell Biol 2003 Apr;15(2):199-205), who describe more global, systematic strategies that analyse genes or proteins on a genome- and proteome-wide scale and several large-scale proteomics technologies that have been developed to generate comprehensive, cellular protein-protein interaction maps. These analyses can also be used in the analyses of protein-complex and proteome interactions.
All publications as cited herein are incorporated herein by reference in their entirety. The present invention shall now be further described based on the following examples, without being limited thereto.

Example 1: Production of a library comprising a tagged (labelled) collection of proteomes

The production of a library of cells, in each of which a different gene is tagged (labelled), has been described earlier, for example, in Gavin et al., Nature 415, 141-7 (2002); Rigaut et al., Nat. Biotechnol. 17, 1030-3 (1999); Puig et al., Methods 24, 218-229 (2001); and EP 1 105 508 B1 for the TAP approach.

Example 2: Yeast drug pull-down protocol

This example describes the use of a drug coupled to a sepharose matrix through an amine reaction in order to pull down yeast TAP-tagged proteins that specifically interact with the drug.

Materials:

Buffers:

Yeast Lysis Buffer:

50 mM Tris-HCl pH 7.5
100 mM NaCl
0.15% Igepal
1.5 mM MgCl2
0.5 mM DTT
self-prepared protease inhibitors (1000x) + 1 mM PMSF (1 x stock of lysis buffer in cold room, DTT and protease inhibitors have to be added when making up 1x)

TBST: TBS 1x with 0.5% Tween
Iodoacetamide 200 mg/ml

Reagents:

NHS-activated Sepharose 4 Fast Flow provided in isopropanol, Amersham Biosciences, 17-0906-01
anhydrous Dimethylsulfoxide, Fluka, 41648
Dimethylsulfoxide for washing, e.g. FLUKA 34869
Ethanol (Merck, 1.00983.1000, pro analysis)
2-Aminoethanol (Aldrich, 11.016-7)
Methanol, GR for analysis, Merck, 1.060009

Equipment:

End-over-end shaker (Roto Shake Genie, Scientific Industries Inc.)

Materials:

50 mL Falcon tube
15 mL Falcon tubes
1.5 mL Eppendorf tubes, siliconised
NHS activated sepharose 4 fast flow, Amersham Biosciences, 17-0906-01
UZ-polycarbonate tube, Beckmann, 355654
Molbiol columns + filter 90 µm, MoBiTech, quotation 10055
Glass beads (0.5 mm diameter)
Poly-Prep Chromatography column, BIO-RAD, 731-1550.

Method:

1) Coupling of the compounds to resins via primary amines (NHS-beads drug coupling protocol)

Washing of beads: Use 1 ml (settled volume) activated NHS-beads for standard coupling reaction (NHS-activated Sepharose 4 Fast Flow provided in isopropanol, Amersham Biosciences, 17-0906-01); Insert 10 ml chromatography column into 50 ml Falcon tube (Poly-Prep Chromatography column, BIO-RAD, 731-1550); Pipet 2 ml of the resuspended 50% slurry of NHS-beads into the column; Wash the beads with 10 ml anhydrous DMSO (dimethylsulfoxide, Fluka, 41648, H₂O ≤ 0.005%) by adding the solvent directly into the chromatography column; allow flow through by gravity; discard flow-through into non-halogenous waste

2) Coupling reaction

Dissolve the compound of interest in anhydrous DMSO (final concentration = 100 µmol/ml); Add 20 µl of the 100 µmol/mL compound solution onto the washed beads in a 2.0 mL Eppendorf tube; Add 14 µL 7.2 M Triethylamine (TEA) (final conc. = 100 µmol/mL) (50 x molar excess over compound) (SIGMA, T-0886, 99% pure); Incubate at RT on an end-over-end shaker (Roto Shake Genie, Scientific Industries Inc.) for 16 h.

3) Blocking reaction:

Add 50 µL 16.56 M aminoethanol (2-Aminoethanol, Aldrich, 11.016-7) (final conc. = 830 µmol/mL) (> 40 fold excess over bead capacity) for blocking of non-reacted NHS-groups; Incubate at RT on the end-over-end shaker overnight
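The quantities given for the coupling and blocking reactions can be cross-checked with a few lines of arithmetic. The sketch below uses only the volumes and concentrations stated above; the variable names are illustrative, and the reaction volume is assumed to be approximately 1 mL (the settled bead volume), which is what the stated final concentrations imply.

```python
def micromoles(volume_ul: float, conc_mol_per_l: float) -> float:
    """Amount in µmol: 1 µL of a 1 mol/L solution contains 1 µmol."""
    return volume_ul * conc_mol_per_l


BEAD_VOLUME_ML = 1.0  # settled NHS-bead volume used per coupling reaction

# Coupling: 20 µL of a 100 µmol/mL (= 0.1 mol/L) compound solution
compound_umol = micromoles(20, 0.1)
# Base: 14 µL of 7.2 M triethylamine
tea_umol = micromoles(14, 7.2)
# Blocking: 50 µL of 16.56 M aminoethanol
aminoethanol_umol = micromoles(50, 16.56)

tea_excess = tea_umol / compound_umol               # molar excess of TEA over compound
ligand_density = compound_umol / BEAD_VOLUME_ML     # µmol compound per mL beads
implied_max_capacity = aminoethanol_umol / 40       # capacity for which the excess is 40-fold

print(f"compound: {compound_umol:.1f} µmol, TEA: {tea_umol:.1f} µmol "
      f"({tea_excess:.1f}x excess)")
print(f"aminoethanol: {aminoethanol_umol:.0f} µmol; a >40-fold excess implies "
      f"a bead capacity below {implied_max_capacity:.1f} µmol/mL")
```

The result of about 2 µmol compound per mL of beads agrees with the "2 µmol/mL" coupled drug density quoted in Example 3, and the roughly 50-fold TEA excess and 830 µmol/mL aminoethanol figures are reproduced to within rounding.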

4) Washing

Pipet this suspension back into a 15 ml Falcon tube (use cut off blue tip); Wash the beads first with 2 x 10 ml DMSO (e.g. FLUKA, 34869 or equivalent, no longer needs to be anhydrous), second with 2 x 10 ml ethanol (Merck, 1.00983.1000, pro analysis); Resuspend the beads with 1 mL ethanol to make a 50% slurry for storage at 4 °C (explosion-proof refrigerator)

5) Blocked beads (control)

Washing of beads: Use 1 ml (settled volume) activated NHS-beads for standard coupling reaction (NHS-activated Sepharose 4 Fast Flow provided in isopropanol, Amersham Biosciences, 17-0906-01); Insert 10 ml chromatography column into 50 ml Falcon tube (Poly-Prep Chromatography column, BIO-RAD, 731-1550); Pipet 2 ml of the resuspended 50% slurry of NHS-beads into the column; Wash the beads with 10 ml anhydrous DMSO (Dimethylsulfoxide, Fluka, 41648) by adding the solvent directly into the chromatography column; allow flow through by gravity; discard flow-through into non-halogenous waste.

6) Coupling reaction:

1 ml DMSO with 500 mM TEA (no drug) over night at room temperature

7) Blocking reaction

Pipet the suspension into a 2 ml Eppendorf tube (use cut off blue tip); Add 50 µl aminoethanol (2-aminoethanol, Aldrich, 11.016-7) for blocking of non-reacted NHS-groups; Incubate at RT on the end-over-end shaker overnight; Washing: Pipet this suspension back into a 10 ml chromatography column (use cut off blue tip); Wash the beads first with 20 ml DMSO, second with 20 ml ethanol (Merck, 1.00983.1000, pro analysis); Store the coupled beads (50% slurry in ethanol) in column at 4°C (explosion-proof refrigerator); Wash beads with 20 ml of the appropriate binding buffer before use.

Example 3: Affinity capture of TAP-tagged proteins via immobilised drugs

Method

Grow a fresh TAP-expressing yeast cell line in YPD until OD₂₆₀ reaches 3.5; Wash cells in PBS and lyse in 1.5 volumes of Yeast Lysis Buffer (Vol lysis = 1.5 x Vol pellet) using a planet mill beaker containing 25 mL of glass beads; Run 4 times 5 min, 350 rpm; Transfer supernatant (without glass beads) to a 50 mL Falcon tube; Wash glass beads with 10 mL of Yeast Lysis Buffer; combine with the lysate; Centrifuge 10 min, 20,000 g; Transfer supernatant to a UZ-polycarbonate tube and centrifuge 1 h 10 min, 100,000 g; Remove the lipid layer using a water pump, recover the supernatant and measure the protein concentration; Aliquot lysate in 100 mg portions and freeze in liquid nitrogen; Store at -80°C; Use 100 µL NHS-beads with coupled drug (2 µmol/mL) or blocked NHS-beads as control; Wash the beads 3 times with 5 mL of Yeast Lysis Buffer; invert tubes 3-5 times; centrifuge 1 min, 400 g, 4°C; Aspirate supernatant and discard; Thaw lysate quickly in 37°C water bath, then keep on ice; Dilute lysate with Yeast Lysis Buffer to a final concentration of 10 mg/mL; Transfer supernatant to a UZ-polycarbonate tube (Beckmann, 355654); Spin lysate for 20 min at 100,000 g at 4°C (33,500 rpm in Ti50.2, pre-cooled); Combine 100 mg of lysate with beads in a 15 mL Falcon tube; Incubate for 2 h at 4°C on a rotating wheel; Recover beads by centrifugation (1 min, 1000 rpm, 4°C); take 40 µL as NBF; remove supernatant (> 1 ml liquid should be left in the tube); Transfer beads to Mobi columns; Wash with 20 ml TBST; Centrifuge column for 1 min, 400 g in a table top centrifuge to remove excess buffer; Add 60 µL of 2xSB, put column in a 1.5 mL siliconised Eppendorf tube, heat for 5 min at 95°C; Open column (first top, then bottom), put it back into the Eppendorf tube; centrifuge 1 min at 400 g to recover the eluate; Load the eluate on a dot blot apparatus; Detect the presence of TAP proteins using a peroxidase anti-peroxidase antibody; Develop with chemiluminescence kit.
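The volumes needed for the lysis, dilution and bead steps of the above method follow from simple proportions. The helper functions below are a hedged sketch: their names are hypothetical, and the stock lysate concentration of 25 mg/mL used in the example call is an arbitrary illustrative value, since the method only states that the protein concentration is measured.

```python
def lysis_buffer_volume_ml(pellet_volume_ml: float) -> float:
    """Vol lysis = 1.5 x Vol pellet, as stated in the method."""
    return 1.5 * pellet_volume_ml


def dilution_plan(protein_mg: float,
                  stock_mg_per_ml: float,
                  final_mg_per_ml: float = 10.0) -> dict:
    """Volumes of lysate and Yeast Lysis Buffer giving protein_mg at final_mg_per_ml."""
    if stock_mg_per_ml < final_mg_per_ml:
        raise ValueError("stock lysate is already more dilute than the target")
    lysate_ml = protein_mg / stock_mg_per_ml
    total_ml = protein_mg / final_mg_per_ml
    return {"lysate_ml": lysate_ml,
            "buffer_ml": total_ml - lysate_ml,
            "total_ml": total_ml}


def drug_on_beads_umol(bead_volume_ul: float, density_umol_per_ml: float) -> float:
    """Immobilised drug carried by a given bead volume."""
    return bead_volume_ul / 1000.0 * density_umol_per_ml


plan = dilution_plan(protein_mg=100, stock_mg_per_ml=25)  # 25 mg/mL is an assumed value
print(plan)
print(drug_on_beads_umol(100, 2))  # 100 µL beads at 2 µmol/mL
```

For any stock concentration, 100 mg of lysate at 10 mg/mL occupies 10 mL, which fits the 15 mL Falcon tube specified for the binding step.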
Using the above strategy, as an example, the protein Faa4, a long-chain fatty acid CoA ligase of the yeast Saccharomyces cerevisiae, could be shown to interact with the antifungal compound Nystatin. The interaction was specific, since Faa4 did not interact with an unrelated drug (Bisindolylmaleimide III) and Faa1, a protein closely related to Faa4, did not interact with Nystatin.
Using the above strategy, as a further example, the kinase PKC1 of the yeast Saccharomyces cerevisiae could be shown to interact with the kinase inactivator Bisindolylmaleimide III. The interaction was specific, since PKC1 did not bind to an unrelated drug (Nystatin) and the related kinase TPK1 did not interact with Bisindolylmaleimide III.
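The specificity argument used in both examples has the same logical form: the tagged protein gives a signal on the drug beads, gives none on beads carrying an unrelated drug, and a closely related protein gives none on the drug beads. The following sketch encodes that rule; the data structure and function name are assumptions for illustration, and the dictionary merely restates the qualitative outcomes reported above rather than raw dot blot data.

```python
from typing import Dict, Tuple

# (immobilised drug, TAP-tagged protein) -> was a dot blot signal observed?
Signals = Dict[Tuple[str, str], bool]

reported: Signals = {
    ("Nystatin", "Faa4"): True,
    ("Bisindolylmaleimide III", "Faa4"): False,
    ("Nystatin", "Faa1"): False,
    ("Bisindolylmaleimide III", "PKC1"): True,
    ("Nystatin", "PKC1"): False,
    ("Bisindolylmaleimide III", "TPK1"): False,
}


def is_specific(signals: Signals, drug: str, protein: str,
                unrelated_drug: str, related_protein: str) -> bool:
    """True only if the interaction and both negative controls were tested as expected."""
    needed = [(drug, protein), (unrelated_drug, protein), (drug, related_protein)]
    if any(key not in signals for key in needed):
        return False  # an untested control does not support a specificity claim
    return (signals[(drug, protein)]
            and not signals[(unrelated_drug, protein)]
            and not signals[(drug, related_protein)])


print(is_specific(reported, "Nystatin", "Faa4", "Bisindolylmaleimide III", "Faa1"))
print(is_specific(reported, "Bisindolylmaleimide III", "PKC1", "Nystatin", "TPK1"))
```

Treating a missing control as "not specific" is a deliberately conservative design choice of this sketch; a blocked-bead control, as prepared in Example 2, could be added as a further required entry.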

Claims

A method for screening a library of potentially proteome-interacting candidate compounds, comprising:

a) providing a library comprising, preferably non-labelled potentially proteome-interacting candidate compounds,

b) providing a second library comprising a variety of proteomes, wherein each proteome comprises one proteome-complex comprising at least one labelled polypeptide,

c) contacting the library from a) with the library in b) in a manner as to allow for an interaction of the candidate compounds with the at least one labelled polypeptide,

d) determining an interaction between said candidate compounds and said at least one labelled polypeptide, and

thereby identifying a proteome-complex interacting compound and thereby further identifying a proteome-interacting compound.
The method according to claim 1, wherein said potentially proteome-interacting candidate compounds are selected from enzymes, polypeptides, peptides, antibodies and fragments thereof, nucleic acids or derivatives thereof, and chemical entities having a molecular mass of less than 1000 kDa ("small molecules").
The method according to claim 2, wherein said potentially proteome-interacting candidate compounds are a nucleic acid, such as a DNA, RNA and/or PNA, or a small molecule, such as a drug, metabolite, prodrug, and the like.
The method according to any of claims 1 to 3, wherein said potentially proteome-interacting candidate compounds are present in liquid solution, such as in tubes, microtiter-plates, or on a solid support, such as on filters, glass slides, silicon surfaces, beads or a customised chemical microarray.
The method according to claim 4, wherein said potentially proteome-interacting candidate compounds are bound to beads, such as sepharose (e.g. NHS-activated sepharose) or agarose beads.
The method according to claim 5, wherein said potentially proteome-interacting candidate compounds are bound to said beads via an amino-group or carboxy-group.
The method according to any of claims 1 to 6, wherein said variety of proteomes is present in or derived from one single cell or cell culture or from a mixture of cells, such as a tissue, organ or organism.
The method according to any of claims 1 to 7, wherein said variety of proteomes is present in or derived from one single cell or cell culture or from a mixture of cells, such as a tissue, organ or organism, wherein said single cell or cell culture or mixture of cells, such as a tissue, organ or organism was exposed to certain conditions, such as heat, stress, starvation, drugs, radioactivity, chemical agents, toxins, viral infection, antibiotics, and ageing.
The method according to any of claims 1 to 7, wherein said variety of proteomes is present in or derived from a cell selected from prokaryotic or eukaryotic cells, such as a bacterial cell, a pathogenic micro-organism, a fungal cell, a yeast cell, a plant cell, a mammalian cell, a fish cell, a nematode cell, an insect cell, and, in particular, a non-human stem-cell.
The method according to any of claims 1 to 7, wherein said variety of proteomes is present in or derived from a tissue or organ, such as connective tissue, endothelial tissue, brain, bone, liver, heart, skeletal muscles, prostate, colon, kidney, glands, lymph nodes, pancreas, roots, leaves, and flowers.
The method according to any of claims 1 to 7, wherein said variety of proteomes is present in or derived from an organism, such as E. coli, Drosophila melanogaster, Caenorhabditis elegans, zebrafish, rat, hamster, mouse, goat, sheep, jellyfish, rice, potato, Arabidopsis, wheat, oat, and tobacco.
The method according to any of claims 1 to 11, wherein said variety of proteomes is present in or derived from a lysate of one single cell or cell culture or from a lysate of a mixture of cells, such as a tissue, organ or organism.
The method according to any of claims 1 to 12, wherein each proteome contains only one labelled polypeptide.
The method according to any of claims 1 to 13, wherein each member of the library in b) contains only one labelled polypeptide that is different from the other members of said library.
The method according to any of claims 1 to 14, wherein said label of said labelled polypeptide is selected from the group of radiolabels, dye labels, labels that can be detected with antibodies, enzyme labels, and labels having a detectable mass.
The method according to claim 15, wherein said label of said labelled polypeptide is selected from phosphorescent markers, fluorescent markers, chemiluminescent markers, phosphatases, streptavidin, biotin, TAP, and peroxidases.
The method according to any of claims 1 to 16, wherein each member of the library in b) comprises a mixture or collection (pool) of different proteomes.
The method according to any of claims 1 to 17, wherein said contacting the library from a) with the library in b) is performed essentially under physiological conditions.
The method according to any of claims 1 to 18, wherein said contacting the library from a) with the library in b) is performed using a suitable buffer and, optionally, a cofactor, such as calcium, magnesium, potassium, ATP, ADP, cAMP, and the like.
The method according to any of claims 1 to 19, wherein said determining an interaction between said candidate compounds and said at least one labelled polypeptide comprises isolation of the labelled polypeptide and/or the complexes.
The method according to any of claims 1 to 20, wherein said determining an interaction between said candidate compounds and said at least one labelled polypeptide comprises a detection of the bound labelled polypeptide using antibodies, radioactivity detection methods, dye detection methods, enzymatic detection methods and mass spectroscopy.
The method according to any of claims 1 to 21, wherein said screening is performed in vivo or in vitro.
The method according to any of claims 1 to 22, comprising repeating steps a) to d), wherein the interacting compounds identified in d) are used to provide an improved candidate library for step a).
The method according to any of claims 1 to 23, wherein said screening is performed at least in part in a high throughput manner.
The method according to any of claims 1 to 24, wherein said identifying of said proteome-complex interacting compound and said further identifying said proteome-interacting compound is performed, at least in part, by a computer system.
A proteome-interacting compound, identified according to a method according to any of claims 1 to 25, or its pharmaceutically acceptable salts.
A pharmaceutical composition, comprising a proteome-interacting compound according to claim 26, together with a suitable carrier and/or diluent.
Use of a method according to any of claims 1 to 27 for further lead optimisation, elucidating the mode of action of a compound, finding of further medical uses, for the identification of potential side effects of the compound of interest when the identified proteome-interacting compound is known to elicit side effects, and for the identification of diagnostic agents for a specific disease or condition.
Use of a method according to any of claims 1 to 27 for the identification of new lead compounds for established protein target classes, new protein target classes for known lead compounds or new lead compounds for new protein target classes.
Use of a method according to any of claims 1 to 27 for the development of new tools for the functional assessment of new targets (chemical knock-out) or the development of prediction/modellisation tools for the binding of compounds to targets, drug transporters, drug modifying enzymes, and as a data source for bioinformatic purposes.