AU1530501A - A process for identifying the active site in a biological target
- Publication number
- AU1530501A
- Authority
- AU
- Australia
- Prior art keywords
- type
- ligand
- target
- ligands
- targets
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
- G16B15/30—Drug targeting using structural data; Docking or binding prediction
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Physics & Mathematics (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Biophysics (AREA)
- Crystallography & Structural Chemistry (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Evolutionary Biology (AREA)
- General Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Pharmacology & Pharmacy (AREA)
- Medicinal Chemistry (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Peptides Or Proteins (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Description
WO 01/36980 PCT/GB00/04420
A process for identifying the active site in a biological target
The present invention relates to processes for the identification of binding site(s) in biological targets such as mRNA, rRNA, tRNA, DNA, proteins, peptides, endogenous ligands, receptors and enzymes. The invention is based on the use of multivariate methods such as experimental design, Principal Component Analysis (PCA), Soft Independent Modelling of Class Analogy (SIMCA), Principal Component Regression (PCR), Projections to Latent Structures (PLS), Multivariate Design (MVD), Statistical Molecular Design (SMD), Informative Chemical Libraries, Multivariate Quantitative Structure Activity Relationships (MQSAR) and Multivariate Characterisation (MVC). These methods have been developed and applied since the beginning of the 1980s in the design and investigation of chemical, pharmaceutical, pharmacological and biochemical systems.
This invention, however, provides the use of the abovementioned methods, inter alia, in an integrated approach in order to identify the binding site (active site) or the binding sites of a ligand or ligands in different kinds of macromolecules. This is of outstanding value in the drug discovery process, both for lead-finding and for lead-optimisation.
Currently known methods for drug design include the synthesis of new compounds (referred to as "ligands" herein) followed by biological testing of these ligands. Usually the interaction(s) with the macromolecule or macromolecules of interest (i.e. the "targets") are measured, and a lead compound is identified and optimised to fulfil the demands of a candidate drug (CD). Typically the currently known methods involve the synthesis of a huge number of compounds before a lead can be identified. A typical criterion for a ligand to be of interest is that it shows a desired activity, affinity or selectivity for a particular target.
The ligands are most often tested for their affinity to macromolecular targets that are proteins such as enzymes, hormone receptors or G-protein coupled receptors. However, the testing of these ligands with other macromolecular targets such as specific sequences of DNA is also common. Typical examples of the ligands that it is desired to test, and to design improved variants of, include organic compounds, peptides (linear or cyclic sequences of amino acids), mixtures of peptides and organic compounds, and sequences of DNA.
A common procedure in the development of active compounds is to randomly synthesise a library containing from a few hundred up to millions of different compounds. An HTS assay is then used for measuring the biological activity of the compounds or their interaction with the macromolecule(s) of interest. The most promising compounds are then selected for further refinement.
Engineering of proteins is also a method in use. Through the use of DNA technologies well known in the art, artificial proteins may be constructed. A typical example is the design of an antibody. Antibodies are proteins with fixed and variable regions. Changing amino acids in the variable region of an existing antibody can give the antibody new properties. The antibodies with the desired properties are usually selected from a set of engineered antibodies by using a suitable selection method.
Other methods well known in the art involve the rational design of pharmaceutical entities based on knowledge of the three-dimensional (3D) structure of the macromolecule that the pharmaceutical is desired to interact with. A macromolecule of interest can, for example, be crystallized and its 3D structure determined by use of crystallographic methods. It is also possible to use NMR for elucidating the 3D structure of proteins. Once the 3D structure of the macromolecule is known, a chemical entity can then be designed to fit into a suitable region of the macromolecule. A major problem, however, is that the determination of the 3D structure of macromolecules is difficult, expensive and not always possible (see Branden and Tooze, 1991).
There also exist methods well known in the art collectively termed QSAR methods (QSAR = Quantitative Structure Activity Relationships). Such methods analyse the relation between the structures of test compounds and their affinity to the macromolecule. The information is then used to deduce better structures.
A well-known example of a QSAR method is CoMFA (Cramer et al., J. Amer. Chem. Soc., 1988, 110, 5959-5967).
So-called pharmacophore models are also used in drug design (see e.g. Daveu et al. 1999; de Groot et al. 1999; McGregor et al. 1999).
It is also known that protein and DNA originating from living organisms virtually always exist in variants that, although they have more or less differing amino acid sequences or DNA sequences, retain similar structural organizations and functions (Branden and Tooze, 1991). Thus, in living organisms numerous proteins exist showing homologous amino acid sequences, and which share similar structures and biological properties. Such variants of proteins very often exist within the same species. However, when one also considers all living organisms, one will find that numerous mutations have occurred during evolution that have led to the accumulation of a very large number of variants of proteins and genes that show similar structural and biological properties (Branden and Tooze, 1991).
Proteins are built from amino acid chains (generally termed primary structures) that form structural elements (motifs) such as α-helices, β-sheets, loop regions, hairpin β motifs, and the like (generally termed secondary structures), which then are used in the building of larger structures (generally termed tertiary structures or domains); these domains in turn form the overall protein structure (generally called quaternary structures) (for extensive examples and discussion on this topic, reference is given to Branden and Tooze, 1991).
Integral membrane proteins often exist in a large number of homologous variants. Well known examples include tyrosine kinases, serine/threonine kinases, ion channels, G-protein coupled receptors and the steroid/thyroid hormone receptor family. For example, about 1000 different variants of the G-protein coupled receptors have been cloned and sequenced.
The large group of G-protein coupled receptors constitutes a good example of homologous proteins with similar structural organization. The G-protein coupled receptors are known to be built from one single amino acid chain forming seven transmembrane α-helices, one extracellular N-terminal amino acid chain, one intracellularly located C-terminal amino acid chain, three extracellular loops and three intracellular loops (Baldwin, 1993).
The methods that are known in the art make use of information on new or known ligands, and in some cases variants thereof, and the affinity to the target. In most cases, information on a number of variants of the ligands in question is correlated with information derived from the binding of these variants with a single target molecule. In a large number of cases, information on the 3D structure of the target is used. However, the combined use of chemical/physical descriptors for both the target and the ligands simultaneously, for the identification of the active site of the target by applying quantitative methods, has never been done.
The present invention also takes advantage of the fact that available technology allows the construction and production of modified macromolecules. In the case of proteins, this technology is generally referred to as protein engineering (Branden and Tooze, 1991). Using techniques well known in the art, one or several amino acids in a protein may be exchanged for other amino acid(s) or removed, or new amino acids may be added. This is generally done by so-called directed mutagenesis techniques. For example, one specific amino acid in a protein can be exchanged for another amino acid (see e.g. Frandberg et al, 1994). However, another approach is the construction of so-called chimeric proteins. A chimeric protein has incorporated or exchanged parts of the amino acid sequence(s) from another protein. For an example of the approach, see Schiöth et al.
(1998). The analogous procedure for the construction of chimeric DNAs can of course also be undertaken.
The use of one, two, three or more different chemical/physical properties of molecules is well known and established in QSAR (Hansch et al.) for describing the ligands in a Multiple Linear Regression (MLR) model. One basic assumption in traditional QSAR is that the descriptors are independent of each other. A number of variations on this concept, introducing new descriptors and applying Stepwise Regression in order to obtain the best possible correlation, have been used in QSAR.
Since the physical/chemical descriptors for different compounds are highly unlikely to be independent of each other, they have to be handled by a different approach. The first example of handling this problem was in the investigation of "Solvent Selection for Organic Synthesis", by Carlson, Lundstedt and Albano (1985). In this paper, a multivariate characterisation of 82 solvents was made. In order to determine the number of "independent variables" describing the solvents, PCA was used (Wold, 1987; Jackson, 1991). In addition, different strategies for selecting solvents on the basis of diversity were suggested. In WO 00/033218 A1, a similar approach was suggested, i.e. to investigate and make designs in the chemical space of drug-like compounds. Recently, an investigation was published where the binding sites of 7TM (seven transmembrane) receptors were investigated. However, the investigation did not include or even suggest a joint analysis of both the ligands and the receptors (Clementi et al., 2000).
The first example of investigating systems with several types of chemicals involved was described by Lundstedt (1986). In this work, Multivariate Design (MVD) was applied for the first time and exemplified by the investigation of scope and limitations for "The Willgerodt-Kindler Reaction".
Solvents, starting material and reagents were described by multivariate characterisation. The selection for each subset was based on diversity, and finally the combination of reactions for the investigation of "scope and limitations" was suggested by the use of MVD. The same principle was the basis for "informative chemical libraries" (Lundstedt et al., 1997, and Andersson et al., 1999) and also for describing peptides and proteins, as further discussed below.
The translation of protein and peptide sequences to a quantitative description based on chemical/physical properties is reported in the literature in several investigations. A multivariate characterisation of amino acids with physical/chemical descriptors has been made, followed by a PCA to determine the dimensionality (Hellberg et al. 1986; Hellberg et al. 1987; Jonsson et al. 1989; Collantes and Dunn, 1995; Sandberg et al. 1998). The characterisation made by Hellberg et al. (1986; 1987) includes experimentally derived as well as semi-empirically derived variables. Using Principal Component Analysis (PCA, Wold, 1987), three (or more) latent variables, so-called principal property variables, may be generated, which summarise the information from the original variables. By using these principal properties, each amino acid in a protein or peptide sequence may be quantitatively characterised, i.e. be translated to three latent variables containing physical and chemical information. This means that instead of comparing sequences with a one-letter code, a quantitative description of each sequence can be generated. This method has been used for obtaining descriptors of the ligands (the active compounds), which are then used in MQSAR to relate chemical structures to properties or biological activities (BA). By applying this approach, the important chemical properties of a ligand which are needed for binding to the active site in a target are identified (Andersson et
al., 1998, and Lundstedt et al., 2000). The same principle as for peptides may be applied to DNA, RNA, proteins and similar polymeric molecules.
The analysis of MQSAR models between peptides of different length and the biological test results using PLS (Wold, 1993a) and many other calculation methods (e.g. neural networks) requires a uniform matrix of descriptors where all sequences are described with the same number of variables.
However, sequences of RNA, DNA, proteins and peptides are often of different length. In order for the calculations to be able to handle this, a transform has to be made to obtain a uniform matrix, i.e. a matrix where each sequence is described with the same number of variables. A solution to the problem is to calculate auto covariances and auto cross covariances (ACC) (Wold et al. 1993b) between amino acids. A similar approach has been developed for handling branched or circular peptides as well as for branched proteins (Lundstedt et al., 2000). Similar approaches have been used for classification of DNA and RNA sequences of different length, as well as for any set of polymers composed of building blocks (Wold et al. 1998). A further improvement is to combine the ACC with OSC (Orthogonal Scatter Correction) to reduce the noise in the models (Andersson et al. 1998). These methods have so far been used for obtaining descriptors of targets and relating them to a measured biological activity, or for comparing sequences to each other (Wold et al., 2000).
The same principle as for peptides may be applied to DNA, RNA and other polymers or oligomers.
The current invention provides a novel method for identifying the interaction site, binding site or active site in a macromolecule such as mRNA, rRNA, tRNA, DNA, peptides, proteins, carbohydrates or any kind of oligomers or polymers, whether natural or synthetic.
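The ACC transform mentioned above, which turns sequences of different length into descriptor vectors of equal length, can be sketched as follows. This is a minimal illustration, not the procedure of the cited papers: the descriptor values in `TOY_SCALES` are invented for the example, the residue letters are borrowed from the building-block notation used later in this description, and dividing each lagged sum by `n - lag` is an assumed normalisation (other conventions exist).

```python
# Hypothetical three-value descriptors for four building blocks. The numbers
# are invented for illustration and are NOT published principal properties.
TOY_SCALES = {
    "a": (0.5, -1.0, 0.2),
    "b": (-1.2, 0.3, 0.8),
    "c": (2.0, 0.1, -0.5),
    "d": (-0.4, 1.5, -1.1),
}


def acc_descriptors(sequence, scales, max_lag):
    """Return auto and cross covariance terms for lags 1..max_lag.

    The result always has max_lag * d * d entries (d = descriptors per
    residue), regardless of the sequence length.
    """
    z = [scales[residue] for residue in sequence]
    n = len(z)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    d = len(z[0])
    terms = []
    for lag in range(1, max_lag + 1):
        for j in range(d):          # descriptor taken at position i
            for k in range(d):      # descriptor taken at position i + lag
                total = sum(z[i][j] * z[i + lag][k] for i in range(n - lag))
                # Assumed normalisation: average over the n - lag pairs.
                terms.append(total / (n - lag))
    return terms


# Two sequences of different length give vectors of the same length.
short = acc_descriptors("abca", TOY_SCALES, max_lag=2)
long_ = acc_descriptors("abcdabcdab", TOY_SCALES, max_lag=2)
```

The terms with `j == k` are the auto covariances and those with `j != k` the cross covariances; `max_lag` must be smaller than the shortest sequence in the set.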
The invention relates to the use of "informative combinatorial chemistry", "informative peptide libraries", MQSAR and a chemical/physical description of the target, either based on the principal properties for the building blocks of the target (i.e. amino acids or similar) or handled as chimeric target proteins or as mutated target proteins. Other macromolecules such as mRNA, rRNA, tRNA, DNA, peptides or enzymes can be handled in the same way.
In one embodiment, the invention relates to a process for characterising the interaction between a Ligand Y and a Target X comprising:
Step 1 Obtaining information representing one or more chemical and/or physical properties of at least two ligands of the type Y;
Step 2 Obtaining information representing one or more chemical and/or physical properties of at least two targets of the type X;
Step 3 Obtaining information representing one or more chemical and/or physical properties of the interaction between at least two of the ligands of type Y and at least two of the targets of the type X;
and processing the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and Target X from which one or more of the properties of the interaction between the Ligand Y and the Target X may be characterised.
In this context, the term "characterising the interaction" includes obtaining information on, determining, predicting or estimating at least one chemical and/or physical property of the interaction or of the sites of interaction; estimating or predicting the position of the site of interaction within the Target X; estimating or predicting the position of the site of interaction within the Ligand Y; estimating or predicting the binding affinity, selectivity, activity, biological activity or avidity of the Ligand Y or Y' for Target X; estimating or predicting which subsequences, regions or parts of the Ligand Y interact with the Target X; or estimating or predicting which subsequences, regions or parts of the Target X interact with the Ligand Y.
The invention also provides a process for estimating the position of the active site in a Target X in an interaction between a Ligand Y and a Target X, or estimating one or more physical and/or chemical properties of the active site, comprising the above Steps 1, 2, and 3, and correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X from which the position of the active site or one or more physical and/or chemical properties of the active site in the Target X may be estimated.
The invention further provides a process for predicting the position of the active site in an interaction between a Ligand Y and a Target X, or predicting one or more physical and/or chemical properties of the active site, comprising: the above Steps 1, 2, and 3; Step 4, which comprises correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X; and using the model to predict the position of the active site or one or more physical and/or chemical properties of the active site.
A further embodiment of the invention provides a process performed with the aid of a programmed computer for the estimation of the position of the active site in a Target X, in an interaction between a Ligand Y and a Target X, or one or more physical and/or chemical properties of the active site, comprising the steps of:
Step 1 Inputting information representing one or more chemical and/or physical properties of at least two ligands of the type Y;
Step 2 Inputting information representing one or more chemical and/or physical properties of at least two targets of the type X;
Step 3 Inputting information representing one or more chemical and/or physical properties of the interaction between at least two of the ligands of type Y and at least two of the targets of the type X;
Step 4 Computing a model from the inputted information which describes the interaction between the Ligand Y and the Target X;
and then using the model to estimate the position of the active site, or to estimate one or more physical and/or chemical properties of the active site.
The invention also provides a process for assisting in the design of a Ligand Y' which binds to a Target X, the Ligand Y' having an increased or decreased binding affinity, selectivity or avidity for the Target X compared to that of a Ligand Y, comprising the Steps 1, 2 and 3 of the invention; and then correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X from which the structure and/or one or more chemical and/or physical properties of the Ligand Y' may be estimated or predicted.
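As a rough illustration of how the information from Steps 1, 2 and 3 might be brought together before a model is computed in Step 4, the sketch below builds one row per measured ligand/target pair. It is only a sketch under stated assumptions: the function names, the identifiers `Y1`, `X1` and so on, and all numeric values are hypothetical, and the inclusion of ligand-by-target product terms is an assumption about one possible way to let a model express the interaction, not something prescribed by the text. The rows and responses would then be passed to a regression method such as PLS, which is not shown.

```python
def pair_row(ligand_desc, target_desc):
    """Descriptor row for one ligand/target pair.

    Concatenates the ligand descriptors, the target descriptors and
    (as an assumption of this sketch) all ligand x target products.
    """
    row = list(ligand_desc) + list(target_desc)
    row += [l * t for l in ligand_desc for t in target_desc]
    return row


def build_dataset(ligands, targets, measurements):
    """Combine Step 1 (ligands), Step 2 (targets) and Step 3 (interactions)."""
    if len(ligands) < 2 or len(targets) < 2:
        raise ValueError("need at least two ligands and at least two targets")
    labels, rows, responses = [], [], []
    for (ligand_id, target_id), value in sorted(measurements.items()):
        labels.append((ligand_id, target_id))
        rows.append(pair_row(ligands[ligand_id], targets[target_id]))
        responses.append(value)
    return labels, rows, responses


# Hypothetical descriptors and interaction values, invented for illustration.
ligands = {"Y1": (0.2, 1.0), "Y2": (-0.7, 0.4)}
targets = {"X1": (1.0, 0.0, -0.5), "X2": (0.3, 0.8, 0.1)}
measurements = {
    ("Y1", "X1"): 7.2,
    ("Y1", "X2"): 5.1,
    ("Y2", "X1"): 6.0,
    ("Y2", "X2"): 8.3,
}
labels, rows, responses = build_dataset(ligands, targets, measurements)
```

Pairs that were never measured are simply absent from `measurements`, so the table does not have to cover every ligand/target combination.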
A further embodiment provides a process for estimating or predicting the binding affinity, selectivity or avidity of a Ligand Y' with a Target X, comprising Steps 1, 2 and 3 of the invention; and then correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X from which the binding affinity, selectivity or avidity of the Ligand Y' with the Target X may be estimated or predicted. In this context, the Ligand Y' is a ligand of the type Y, as herein defined.
The processes according to the invention may generally be divided into the following steps:
STEPS 1 AND 2
The first steps are to describe the target X (and ligand Y) by numbers which give a good representation of the chemical/physical properties of the target (and/or ligand, respectively). Examples of different ways to describe the targets and ligands are given below:
In some cases, the processes of the invention will make use of information which directly represents the chemical/physical properties of the targets and/or ligands. In most cases, however, the processes of the invention will make use of information (i.e. descriptors) which indirectly represents the chemical/physical properties of the targets and/or ligands, i.e. the latter information is subjected to a conversion, operation, transformation or translation process such as those described herein (e.g. principal properties, bit-vectors, PCA, ACC, etc.) before being correlated.
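One of the conversions named above, PCA, can be illustrated by checking how much of the variation in a set of correlated descriptors is captured by a single latent variable. The sketch below is an assumption-laden toy: the descriptor table is invented, the data are centred but not scaled, and the leading eigenvalue of the covariance matrix is found by plain power iteration rather than by the algorithms used in the cited literature.

```python
def center(rows):
    """Subtract the column mean from every column."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already centred rows."""
    n = len(rows)
    k = len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(k)]
        for i in range(k)
    ]


def leading_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a symmetric matrix by power iteration."""
    k = len(matrix)
    vec = [1.0] * k
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = sum(v * v for v in nxt) ** 0.5
        vec = [v / norm for v in nxt]
    # Rayleigh quotient of the converged unit vector.
    return sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(k)) for i in range(k)
    )


def first_component_fraction(rows):
    """Fraction of total variance explained by the first principal component."""
    cov = covariance(center(rows))
    total = sum(cov[i][i] for i in range(len(cov)))
    return leading_eigenvalue(cov) / total


# Invented table: three descriptors for six compounds, strongly correlated.
toy_descriptors = [
    [1.0, 2.1, -0.9],
    [2.0, 3.9, -2.1],
    [3.0, 6.0, -3.0],
    [4.0, 8.1, -3.9],
    [5.0, 9.9, -5.1],
    [6.0, 12.0, -6.0],
]
fraction = first_component_fraction(toy_descriptors)
```

A fraction close to 1 indicates that the three nominal descriptors behave as essentially one independent variable, which is the kind of redundancy that latent-variable descriptors are meant to summarise.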
In the present invention, the targets X may be of any chemical nature but it is preferred that X is represented by proteins, peptides, large or small peptides, protein subunits, receptors, ion-channels, transporters, carriers, enzymes, drug binding proteins, proteins participating in cell signaling, polymers, structures (including linear, cyclic and branched structures and combinations thereof) which are at least in part composed of building blocks, DNAs, parts of DNAs, DNA sequences, RNAs, parts of RNAs, RNA sequences, transfer RNAs, messenger RNAs or carbohydrates, with proteins being most preferred. In the case of a protein, X may be selected so as to contain one peptide chain (i.e. being a monomeric protein; in other words being composed of one sub-unit). However, X may equally well be selected to contain several peptide chains held together by molecular interactions (e.g. the macromolecule being a multimeric protein, in other words being composed of several sub-units). In the case of X being a protein, any chemical modification of the amino acid chain of X is allowed. Specific examples of such modification(s) include (but are not limited to) glycosylation, palmitoylation, phosphorylation, proteolytic degradation, peptide chain breaks, nicking, oxidations, or any other chemical, biochemical or biological modification(s). X can also contain non-protein moieties such as co-factors, prosthetic groups, metal atoms, and the like. It is also allowed that natural amino acids of a protein are exchanged for non-natural amino acids.
In the present invention, the molecular weight of the target X is preferably larger than 5000 g/mole, more preferably larger than 7000 g/mole, even more preferably larger than 10000 g/mole, even more preferably larger than 12000 g/mole, still more preferably larger than 14000 g/mole, yet more preferably larger than 17000 g/mole and most preferably larger than 20000 g/mole.
However, in some specific embodiments of the invention it is preferred that the molecular weight of target X is larger than 25000 g/mole or even larger than 30000 g/mole. However, for most embodiments of the invention the molecular weight of X can be as low as 3000 g/mole, or even as low as 2000 g/mole, or even as low as 1000 g/mole or lower, or of any other molecular weight suited to the problem to be investigated.
In the present invention, the ligands included in Y are of any chemical nature. Thus included in Y are (but not limited to) organic compounds, chemical libraries, peptides, peptide libraries, protein subunits, proteins, receptors, ion-channels, transporters, carriers, enzymes, drug binding proteins, proteins participating in cell signaling, polymers and structures (including linear, cyclic and branched structures and combinations thereof) which are at least in part composed of building blocks, non-peptides, organic chemical compounds, DNAs, parts of DNAs, DNA sequences, RNAs, RNA sequences, parts of RNAs, transfer RNAs, messenger RNAs, carbohydrates, hybrids of any of the aforementioned and the like. Y is preferably an informative organic library or an informative peptide library. Y can also be a set of substances taken from nature, e.g. natural substance libraries.
In some cases Y is selected to be a ligand which has the properties that are listed above for target X. In these cases the molecular weight of Y is preferably not restricted to any particular size; it may be small or it may be large. Thus it can be seen that the terms Target X and Ligand Y are essentially interchangeable, i.e. the invention is not limited to interactions between "targets" and "ligands"; it applies to any entities which are capable of interacting with one another.
Usually the molecular weight of Y is within the range 100 - 5000 g/mole.
However, as mentioned, for many very important implementations of the present invention it is desired that Y is of a macromolecular nature. Thus, in these cases the molecular weight of Y is preferably larger than 5000 g/mole, more preferably larger than 7000 g/mole, even more preferably larger than 10000 g/mole, even more preferably larger than 12000 g/mole, still more preferably larger than 14000 g/mole, yet more preferably larger than 17000 g/mole and most preferably larger than 20000 g/mole. However, in some specific embodiments of the invention it is preferred that the molecular weight of molecule Y is larger than 25000 g/mole or even larger than 30000 g/mole.
In other embodiments of the invention, it is preferred that the ligand Y is a small peptide or a low molecular weight organic compound within the range of 100-5000 g/mole, preferably below 2000 g/mole, or even more preferably below 1000 g/mole or most preferably below 850 g/mole.
The information on the properties of the targets of type X and/or the information on the properties of the ligands of type Y (i.e. descriptors of X and/or Y) may be derived, inter alia, from atom counts, measured or calculated thin layer chromatography (TLC), retention times on HPLC, refractive index, isoelectric point, melting point, boiling point, molecular weight, hydrophobicity, hydrophilicity, chromatographic mobility, van der Waals volume, octanol/water partition coefficient (logP), energy of molecular orbital, heat of formation, polarizability, electronegativity, hardness, total accessible molecular surface area, polar accessible molecular surface area, nonpolar accessible molecular surface area, number of hydrogen bond donors, number of hydrogen bond acceptors, charge, IR-spectra, NMR-spectra or other spectra, HOMO, LUMO, semi-empirical calculations, ab initio calculations or 3D quantum mechanical calculations.
In most cases the descriptors of X and descriptors of Y may be calculated from already known facts about X or Y (e.g. the structural formula of X or Y), rather than obtaining them by chemical or physical measurements. In some cases, however, the information on the properties may be determined experimentally.
With regard to the "targets of type X" and the "ligands of type Y", at least some of the "targets of type X" should be capable of interacting with at least some of the "ligands of type Y". However, it is not a prerequisite that all targets of type X are capable of interacting with all of the ligands of type Y since a non-interacting X/Y pair might also provide useful information. It is preferred that the majority (e.g. 50%, 60%, 70%, 80%, 90% or even 100%) of the "targets of type X" are capable of interacting with the majority (e.g. 50%, 60%, 70%, 80%, 90% or even 100%) of the "ligands of type Y".
In some embodiments of the invention, the "targets of type X" all have similar physical, chemical, biological and/or pharmacological properties. In other embodiments of the invention, the "targets of type X" share similar structural, compositional or organisational features.
In the invention, it is preferred that the targets of the type X show high diversity.
In some embodiments of the invention, the "ligands of type Y" all have similar physical, chemical, biological and/or pharmacological properties. In other embodiments of the invention, the "ligands of type Y" share similar structural, compositional or organisational features.
In this invention, it is preferred that the ligands of type Y form or are derived or are obtained from an informative library (low molecular weight organic compounds or small peptides) with as high chemical/physical diversity as possible.
Preferred targets of type X are macromolecules having a polymeric structure that show sequence (or other building block or composite building block) homologies of at least 10 %, more preferably of at least 20 % and most preferably of at least 30 %. Even more preferred are macromolecules of type X whose sequence(s) or subsequence(s) (or other building block(s) or composite building block(s)) included in the region(s) included in the analysis according to the procedures of the invention show homologies of at least 10 %, more preferably of at least 20 % and most preferably of at least 30 %. Also preferred are macromolecules of type X where the subsequences (or other building blocks or composite building blocks) comprising parts included in the analysis according to the procedures of the invention show homologies of at least 10 %, more preferably of at least 20 % and most preferably of at least 30 %. Preferred methods for calculating homologies are by using the BLAST algorithm (http:/www.bioactivesite.com/darwin2000/blast/; November 15 2000), using standard settings.
Thus it can be seen that a "target of the type X" is, in most cases, what would generally be termed a variant of the Target X, i.e. one which shares a property or function in common with the Target X. The same applies, mutatis mutandis, to the term "ligand of the type Y".
Preferably each of the targets of the type X independently shares a structure or a DNA, RNA or amino acid sequence in common with the Target X; or a building block (as defined herein) or combination of building blocks in common with the Target X. The same applies, mutatis mutandis, to the term "ligand of the type Y" when the ligands of type Y are macromolecules.
Particularly preferred are targets of the type X which are chimeric variants of the Target X, i.e.
which differ from the Target X though the exchange or addition of one or more aminoacids or nucleotides, or aminoacid or nucleotide sequences. 20 The present invention recognizes that biological macromolecules have polymeric structures, as they are composed of smaller building blocks linked together. Thus, proteins are composed of chains of amino acids linked together, while DNA is composed of chains of nucleotides linked together. As will be evident below, the present invention takes advantage of the polymeric nature of macromolecules in the analysis of the chemical 25 and/or physical properties of X. Using such an approach it is not necessary to have any information of the positions of all atoms in X in three dimensional space. This contrasts the present method from e.g. molecular modelling (or any similar method) which aims to determine the position of all (or at least most of) the molecule's atoms in X in three dimensional space. Therefore, most embodiments of the present invention exclude the use 30 of numeric information which refers to co-ordinates of all (or at least most of) the atoms in X in three-dimensional space. Moreover, for the same reason, some embodiments of the invention exclude the use of numeric information that directly refers to co-ordinates of all 13 WO 01/36980 PCT/GBOO/04420 (or at least most of) atoms in Y in three-dimensional space. Accordingly, some embodiments of the invention exclude the use of numeric information which directly refers to co-ordinates of all (or at least most of) atoms in both X and Y in three-dimensional space. Moreover, the present invention recognizes that the three dimensional structure of a 5 molecule can be described by the angles betweens its atoms. Therefore most embodiments of the invention exclude the use of numeric information which refers to angles between all (or at least most of) atoms in X. 
Moreover, for the same reason, some embodiments of the invention exclude the use of numeric information that directly refers to angles between all (or at least most of) the atoms in Y. Accordingly, some embodiments of the invention exclude the use of numeric information which directly refers to angles between all (or at least most of) the atoms in both X and Y.

The term "angle" in this context includes bond angles, torsion angles and dihedral angles.

For the same reason as stated in the preceding paragraph, most embodiments of the present invention exclude the use of any physical method (such as X-ray crystallography or two-dimensional NMR) that directly determines or predicts the structure of X in three dimensions, or the information derived therefrom.

In the present context, "building block" is defined as a chemical residue that can be linked together with other chemical residues so as to create a chain. Building blocks usually come in sets, where each member contains variable region(s) that bring different chemical properties to the different building blocks, and chemical groups which are used for linking the building blocks together. For example a set of building blocks could contain the eight members ("residues") a, b, c, d, e, f, g and h. A molecule of desired size and composition could then be created by linking the building blocks together, e.g.:

f-a

or

b-e-a-b-d-g-a-c-c-a-f-f-b, or the like,

thus creating a polymer. Usually special chemical groups are present at the "start" and "end" of the chain, such as

T-b-e-a-b-d-g-a-c-c-a-f-f-b-E, or the like,

where T and E denote start and end groups respectively. However, polymers used in the invention can also exist in cyclic variants, such as:

[Structure drawing: cyclic variant of the chain b-e-a-b-d-g-a-c-c-a-f-f-b]

or

[Structure drawing: the chain T-b-e-a-b-d-h-b-c-c-a-f-f-e-E with a linker L closing a cycle]

where L is a chemical group comprising a linker so as to create a cycle, and T and E are start and end groups.
Moreover, polymeric structures can exist in branched variants, such as e.g.

[Two structure drawings, each showing the chain T-b-e-a-b-d-g-a-c-c-a-f-f-b-E joined through a linker L to a second chain T-a-c-g-h-a-c-c-E]

where L is a chemical group comprising a linker so as to create a branch in the molecule, and T and E are start and end groups.

(It should be noted that the above examples are given only for the sake of illustration and are not intended to limit the invention in any way. Thus, any number and order of building blocks, any number of cycles and branches, as well as placement of linkers at any position(s) are allowed.)

A structure used in conjunction with the present invention can have zero, one or more cycles.

A structure used in conjunction with the present invention can have zero, one or more branches.

A structure used in conjunction with the present invention can be modified chemically by removing, adding or exchanging atom(s) within a building block.

A structure used in conjunction with the present invention can be modified chemically by removing, adding or exchanging chemical groups within a building block.

If more than one start group is present in the molecule, the start groups can be the same or different.

If more than one end group is present in the molecule, the end groups can be the same or different.

If more than one linker is present in the molecule, the linkers can be the same or different.

The molecular weight of a building block is preferably less than 10000 g/mole, more preferably less than 5000 g/mole, even more preferably less than 3000 g/mole, even somewhat more preferably less than 2000 g/mole and most preferably less than 1500 g/mole.
However, in many cases the molecular weight of the building blocks is quite small, such as less than 1000 g/mole, more preferably less than 600 g/mole, even more preferably less than 400 g/mole, even less than 300 g/mole, even less than 200 g/mole, even less than 100 g/mole.

In the case of a peptide or a protein, the building blocks are sets of amino acid residues. In the present context "amino acid residue" is defined as a residue of glycine, alanine, valine, leucine, isoleucine, serine, cysteine, threonine, methionine, phenylalanine, tyrosine, tryptophan, proline, histidine, lysine, arginine, aspartic acid, glutamic acid, asparagine, glutamine, and any other naturally occurring amino acid. Moreover, in the case of a peptide or an artificial protein, the building block may also include a residue having the following general structure

[Structure drawing: amino acid residue carrying a substituent Z, with start group T and end group E]

in which Z is hydrogen, X or -CH2X, where X is a chemical moiety of any structure with a molecular weight preferably less than 2000 g/mole, more preferably less than 1000 g/mole, even more preferably less than 600 g/mole, and most preferably less than 400 g/mole, T is a start group of any desired structure, or a bond to another amino acid residue, and E is an end group of any desired structure, or a bond to another amino acid residue.

In the present context natural amino acid residues are denoted by one-letter codes as defined in Branden & Tooze (1991, p. 6-7).

In the case of DNA or RNA, "building block" is defined as a residue of a nucleotide, in other words e.g. deoxyadenosine 5'-phosphoric acid, deoxyguanosine 5'-phosphoric acid, deoxycytidine 5'-phosphoric acid, deoxythymidine 5'-phosphoric acid, deoxyuridine 5'-phosphoric acid. Moreover, artificial nucleotides may be used, such as deoxyinosine 5'-phosphoric acid, and the like. The invention of course recognizes that DNA and RNA generally occur in double-stranded form with matching (or possibly mismatching) base pairs.
Compounds of building blocks may include the common atoms of organic compounds, such as hydrogen, carbon, nitrogen, oxygen, sulphur and phosphorus. However, other atoms may also be used, e.g. silicon. Polymers used in the present invention include, for both X and Y, silicon-containing compounds (as well as other types of organo-metallic compounds) which by use of the procedures of the invention can be optimized for desired properties, e.g. for use as catalysts that can withstand harsh conditions (e.g. high temperature, high or low pH, etc.).

Both X and Y for use with the procedures of the invention include both natural and synthetic compounds.

Other important embodiments of the present invention take advantage of the use of chimeric proteins (and, in an analogous fashion, of chimeric DNAs). By use of technology well known in the art, regions of the amino acid chains of two or more homologous proteins (or the corresponding regions of DNAs) can be exchanged so as to create new proteins (or DNAs) inheriting properties of the original proteins (or DNAs). Creating such a set of chimeric proteins (or DNAs) and using them as X's in conjunction with the procedures of the invention will create a case where the proteins (or DNAs) used are likely to show gross similarities in their three-dimensional organization. This will have the effect that any differences in the chemical and/or physical properties of the X's will depend only on local differences, such as, in the case of proteins, differences in the chemical properties of amino acid residues, rather than on differences in the overall positions of larger structural elements in the proteins. Therefore, one of the most important parts of the present invention is the use of chimeric variations of targets of type X, together with informative variations of ligands of type Y and of observed biological activity.
However, it has to be stressed that inactive ligands (no observed biological activity (BA)) are as important as active ligands in the modelling and identification of the active sites. The use of proteins with single or multiple amino acid mutations as the target X is also preferred in the present invention.

The invention also includes informative peptide libraries and the use thereof for identification of the active site in any kind of biological target. Examples of such libraries are given in Examples 3 to 8 and can be used separately or in combination with each other or with any kind of peptides. Since the number of amino acids (AA) in the peptides varies, a pre-treatment with ACC of the matrix describing the properties of the different peptides is preferably made in order to obtain a uniform matrix. This is generally necessary for making the calculations required to identify the "active site" in the target.

Further examples of the implementation of the present invention are given below. The procedure of the invention and its specific embodiments have been found in experimental investigations (including those of the specific examples of the present invention, and its amendments) to constitute a surprisingly effective and useful method for analysis and design of ligands of chemical and biochemical nature, something which was not known prior to the disclosure of the present invention.

Step 1a: Principal Properties (PPs)

The Principal Properties (PPs) approach makes use of descriptors of amino acids (see Table 1 or the 5 PPs described by Sandberg et al.) and replaces each amino acid in a target or a group of targets with the corresponding descriptors (PPs).

Table 1
The z-scale used to characterise each amino acid.

No.  Name           Code  One-letter code     z1      z2      z3
1    Alanine        ALA   A                  0.07   -1.73    0.09
2    Valine         VAL   V                 -2.69   -2.53   -1.29
3    Leucine        LEU   L                 -4.19   -1.03   -0.98
4    Isoleucine     ILE   I                 -4.44   -1.68   -1.03
5    Proline        PRO   P                 -1.22    0.88    2.23
6    Phenylalanine  PHE   F                 -4.92    1.30    0.45
7    Tryptophan     TRP   W                 -4.75    3.65    0.85
8    Methionine     MET   M                 -2.49   -0.27   -0.41
9    Lysine         LYS   K                  2.84    1.41   -3.14
10   Arginine       ARG   R                  2.88    2.52   -3.44
11   Histidine      HIS   H                  2.41    1.74    1.11
12   Glycine        GLY   G                  2.23   -5.36    0.30
13   Serine         SER   S                  1.96   -1.63    0.57
14   Threonine      THR   T                  0.92   -2.09   -1.40
15   Cysteine       CYS   C                  0.71   -0.97    4.13
16   Tyrosine       TYR   Y                 -1.39    2.32    0.01
17   Asparagine     ASN   N                  3.22    1.45    0.84
18   Glutamine      GLN   Q                  2.18    0.53   -1.14
19   Aspartic acid  ASP   D                  3.64    1.13    2.36
20   Glutamic acid  GLU   E                  3.08    0.39   -0.07

A result of this is that a 330 AA long peptide (e.g. a receptor) will be described by 990 numbers reflecting its chemical/physical properties. Principal properties may also be used to describe the ligand. In an analogous manner, any polymeric structure may be described by numbers representing chemical and/or physical properties of its building blocks.

Step 1b: Binary coding bit vectors

A second approach for the assignment of descriptors to the target X and/or to the ligand Y is to use a binary coding and create a "bit vector". For example, in the case of molecules that are composed of parts (or building blocks or composite building blocks) that can be systematically exchanged (i.e. such as when constructing chimeric molecules), a binary assignment may be performed as follows:

In the case of two variations of the part, a zero is used for one of the variations and a one for the other. For example, molecules that are composed of three parts A, B and C with two variations for each part (i.e. A = A1 or A2, B = B1 or B2 and C = C1 or C2) may be described by one binary number for each part as descriptor; thus three binary numbers may be used to describe the whole molecule, e.g.:

A1B1C1 is described by 0 0 0;
A2B1C1 is described by 1 0 0;
A1B2C1 is described by 0 1 0, etc.

In the case of more variations of a part, assignments may be done essentially as is exemplified for a case with three variations of a part, where three binary numbers are used for each type of A: A1 = 1 0 0; A2 = 0 1 0; A3 = 0 0 1. For four variations of A, one would use four binary numbers, etc.

The binary approach is a convenient way of describing DNA. It is recognized that DNA is composed of building blocks (nucleotides) whose bases are adenine (A), thymine (T), cytosine (C) and guanine (G). Thus four binary numbers may be used as descriptors:

A = 1 0 0 0
T = 0 1 0 0
C = 0 0 1 0
G = 0 0 0 1

However, it may also be recognized that bases of DNA are sometimes allowed to be exchanged with other bases without affecting the functionality of the DNA. Such a case occurs in the coding of amino acids. Thus, phenylalanine is coded for by TTT and TTC while serine is coded for by TCT, TCC, TCA and TCG. Thus, the following principle may be used in binary coding of bases, where different possibilities are allowed:

A or T = 1 1 0 0
A or C = 1 0 1 0
A or G = 1 0 0 1
T or C = 0 1 1 0
A or C or G = 1 0 1 1
A or T or C or G = 1 1 1 1, etc.

Moreover, when an artificial base such as inosine (I) is used, hybridization does not occur. Accordingly, inosine could be described as 0 0 0 0. (However, the numbers 1 1 1 1 may be used if more appropriate for the problem under investigation, as inosine would not have any negative effect on hybridization, thus allowing any base to match.)

Of course an approach similar to that for DNA may also be used for RNA, as well as for any other systematic variation of molecules composed of building blocks where the physical and chemical effects may be described in terms of the specific effect of one type of building block only, the same effect as one or more building blocks, or no effect of the building block, respectively.
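The base coding above, including positions where more than one base is allowed, can be sketched in a few lines of Python. This is only an illustration: the function names and the idea of writing an ambiguous position as a string of allowed bases are assumptions made here, not part of the described procedure.

```python
# One-hot codes for the four DNA bases, in the order A, T, C, G used in the text.
BASE_BITS = {
    "A": (1, 0, 0, 0),
    "T": (0, 1, 0, 0),
    "C": (0, 0, 1, 0),
    "G": (0, 0, 0, 1),
}


def encode_position(allowed_bases):
    """Return the 4-bit code for a position where any of `allowed_bases` may occur.

    The code is the element-wise OR of the one-hot codes, so "AT" gives 1 1 0 0.
    An empty string gives 0 0 0 0, the "no effect" coding mentioned for inosine.
    """
    bits = [0, 0, 0, 0]
    for base in allowed_bases:
        for k, bit in enumerate(BASE_BITS[base]):
            bits[k] |= bit
    return tuple(bits)


def encode_sequence(positions):
    """Concatenate the per-position codes into one flat bit vector."""
    vector = []
    for allowed in positions:
        vector.extend(encode_position(allowed))
    return vector


# The phenylalanine codons TTT and TTC differ only at the third position (T or C).
phe = encode_sequence(["T", "T", "TC"])
print(phe)
```

Treating a multi-base position as the OR of the single-base codes keeps every position at four descriptors, so sequences of equal length always give bit vectors of equal length.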
The coding of building blocks by their effects is exemplified as follows for a part with three original building blocks A1, A2 and A3:

Part of type A1 = 1 0 0
Part of type A2 = 0 1 0
Part of type A3 = 0 0 1

If additional parts are used that may be described in terms of A1, A2 and A3, then assignments may be done based on the above, e.g.:

An additional building block A4 having "no effect" may be described by 0 0 0.

An additional building block A5 combining the effects of A1 and A2 may be described by 1 1 0.

An additional building block A6 combining the effects of A2 and A3 may be described by 0 1 1.

An additional building block A7 combining the effects of A1, A2 and A3 may be described by 1 1 1, etc.

In the above, "effect" may be interpreted as changes in the chemical and/or physical properties that are of relevance for the identification of interactions between molecules of type X and molecules of type Y.

The approach described above is very useful in the handling of chimeric receptors.

Step 1c: Bit vectors for the description/characterisation of structures for use as descriptors of X or descriptors of Y

Another way of assigning descriptors to X and/or Y is to create bit vectors by counting how many times a defined structural feature occurs in a structure. The concept is illustrated for a small set of structures in Table 2.1 using the bit string variables defined in Table 2.2. The structural features defined in the bit strings are used to identify how many times they occur in the investigated structures, resulting in Table 2.3, which is a description of the structures in Table 2.1 using the descriptors in Table 2.2. The bit strings used here only serve as an example to illustrate the method. All structural functionalities that occur in the structures under investigation may be added as bit string variables. The bit strings may also be used to indicate only the presence or absence of the feature defined by the bit string, which would result in Table 2.4.
Table 2.1

Structure ID: A, B, C
[Structure drawings: structural formulae of structures A, B and C]

Table 2.2

Variable ID: x1, x2, x3
[Structure drawings: the bit strings (structural fragments) used to define x1, x2 and x3]

Table 2.3

Structure  x1  x2  x3
A          1   0   0
B          1   1   0
C          2   0   1

Table 2.4

Structure  x1  x2  x3
A          1   0   0
B          1   1   0
C          1   0   1

Additional operations may also be carried out on the descriptors of X and/or Y, for example translation, PCA and ACC, as exemplified further below.

Translating protein and peptide sequences to a quantitative description

Several investigations in the literature have previously characterised amino acids with physico-chemical variables (Hellberg et al. 1986; Hellberg et al. 1987; Jonsson et al. 1989; Collantes and Dunn, 1995; Sandberg et al. 1998). The characterisation made by Hellberg et al. (1986; 1987) includes experimental as well as semi-empirically derived variables.

Using Principal Component Analysis (PCA), three (or more) new so-called principal property variables may be generated, which summarise the information from the original variables. By using these principal properties, each amino acid in a protein or peptide sequence may be quantitatively characterised, i.e. be translated to three (or more) variables containing physical and chemical information (see Figure 1 for an example). This means that instead of comparing sequences with a one-letter code, or a binary code, a quantitative description of each sequence is generated. The method may be used for obtaining both descriptors of X and descriptors of Y for use in the procedures of the invention.

The same principle as for peptides may be applied to DNA, RNA, proteins and organic libraries as well.
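The translation of a sequence into principal property values can be illustrated with a short Python sketch. The z-values are copied from Table 1 for five amino acids only. The peptide "GFKA" and the function name are hypothetical and serve purely as an example.

```python
# z-scale values (z1, z2, z3) copied from Table 1 for a few amino acids only;
# a complete lookup would list all twenty.
Z_SCALES = {
    "A": (0.07, -1.73, 0.09),
    "G": (2.23, -5.36, 0.30),
    "K": (2.84, 1.41, -3.14),
    "F": (-4.92, 1.30, 0.45),
    "W": (-4.75, 3.65, 0.85),
}


def translate(sequence):
    """Replace each one-letter code by its three z-values, position by position."""
    descriptors = []
    for residue in sequence:
        descriptors.extend(Z_SCALES[residue])
    return descriptors


peptide = "GFKA"  # hypothetical tetrapeptide, for illustration only
vector = translate(peptide)
print(len(vector))  # three values per residue
```

A sequence of n residues always gives 3n numbers, which is why sequences of different lengths yield descriptor vectors of different lengths.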
Translating protein and peptide sequences of different lengths to a uniform matrix

When comparing a set of amino acid sequences of different lengths, problems may arise because the sequences would be characterised by different numbers of variables if the amino acid residues were used as the basis for assignment of descriptors. The analysis of the biological testing results using PLS and many other calculation methods (e.g. neural networks) requires a uniform matrix of descriptors where all sequences are described with the same number of variables. A solution to the problem is to calculate auto covariances and auto cross covariances (ACC) (Wold et al. 1993b) between building blocks, e.g. amino acids, which have been translated. ACC compares each amino acid with another amino acid in the sequence located a given number of positions, called the "lag", away. ACC is illustrated in Figure 2. The only restriction on the largest lag, L, is that it cannot exceed the length of the shortest sequence in the set minus one. ACC is thereby not dependent on all sequences being of equal length, no alignment is required and neighbouring effects are taken into account. Auto covariances with lags l = 1, 2, ..., L are given by the following equation:

$$ACC_{j,l} = \sum_{i=1}^{n-l} \frac{z_{j,i}\, z_{j,i+l}}{n-l}$$

Index j is used for the scales (j = 1, 2, 3), n is the number of amino acids in a sequence, index i is the amino acid position (i = 1, 2, ..., n) and z_{j,i} is the value of scale j at position i. Crossed auto covariances (CC) between two different scales, j and k, are calculated according to the following equation:

$$CC_{j,k,l} = \sum_{i=1}^{n-l} \frac{z_{j,i}\, z_{k,i+l}}{n-l}$$

This generates a new uniform matrix where each sequence is described by the same number of variables (see Figure 3 for an illustration of the method, Figure 3 being based on the approach shown in Figure 2), which may be used for further analysis with e.g. PCA or PLS or other calculation methods. A similar approach may be used for branched peptides.
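The ACC transformation can be sketched in Python as follows. The sketch takes each term to be the sum of products of descriptor values a given lag apart, divided by the number of such pairs (n minus the lag). That reading of the formula, the function name and the output ordering are assumptions made for illustration, and some published ACC variants centre or scale the values first.

```python
def auto_cross_covariances(z, max_lag):
    """Compute ACC terms for one sequence.

    `z` is a list of per-residue descriptor tuples (one value per scale).
    The result is a flat list ordered by lag, then scale j, then scale k, so
    every sequence yields max_lag * n_scales**2 values whatever its length.
    Terms with j == k are auto covariances; terms with j != k are cross terms.
    """
    n = len(z)
    n_scales = len(z[0])
    if max_lag > n - 1:
        raise ValueError("max_lag must not exceed sequence length minus one")
    terms = []
    for lag in range(1, max_lag + 1):
        for j in range(n_scales):
            for k in range(n_scales):
                total = sum(z[i][j] * z[i + lag][k] for i in range(n - lag))
                terms.append(total / (n - lag))
    return terms


# z-values taken from Table 1; the two peptides are hypothetical examples.
Z = {
    "G": (2.23, -5.36, 0.30),
    "A": (0.07, -1.73, 0.09),
    "V": (-2.69, -2.53, -1.29),
    "L": (-4.19, -1.03, -0.98),
}
short_acc = auto_cross_covariances([Z[r] for r in "GAV"], max_lag=2)
long_acc = auto_cross_covariances([Z[r] for r in "GAVLA"], max_lag=2)
print(len(short_acc), len(long_acc))
```

Both peptides end up with the same number of variables although their lengths differ, which is the property that makes the resulting matrix uniform.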
A similar approach may be used for DNA and RNA sequences of different lengths, as well as for any set of polymers composed of building blocks (or even composite building blocks).

The method may be used for obtaining both descriptors of X and descriptors of Y, to be used in conjunction with the procedures of the invention.

A further improvement is to combine ACC with OSC (Orthogonal Scatter Correction) to reduce the noise in the models (Andersson et al., 1998).

Application of experimental design for manufacture of optimized sets of molecules for use in conjunction with the procedures of the invention

By combining parts (or building blocks or composite building blocks) from two macromolecules of type X, chimeric macromolecules may be manufactured. In the same way, parts (or building blocks or composite building blocks) of two molecules of type Y may be exchanged so as to create chimeric variants of Y. The approach is of course not limited to mixtures of two original variants of X (or Y), but may be extended to any number of original variants. However, where not all chimeric variants are manufactured, the use of experimental design (Lundstedt et al. 1998; Box et al. 1978) will enhance the analysis when used in conjunction with the procedures of the invention. Experimental design is used to allow the extraction of the maximum information from the selected subset of chimeras.

The method is exemplified by a molecule that contains four parts A, B, C and D, with two possible variations each (cf. the case shown in Example 1:1 for chimeric MC1 and MC3 receptors). Thus the molecule may be coded in a binary fashion using "-" or "+" as follows: A1 = -, A2 = +, B1 = -, B2 = +, C1 = -, C2 = +, D1 = -, D2 = +. All 16 possible chimeric variants are shown in Table 3.1.
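The -/+ coding lends itself to enumeration in code. The Python sketch below lists all 16 variants in standard design order and, as one possible way of picking a balanced half of them, keeps the runs whose sign product is positive. This selection rule is an assumption made for illustration. It appears to coincide with the rows marked "YES" in Table 3.1, but the source does not state which rule was used.

```python
from itertools import product

PARTS = ("A", "B", "C", "D")


def full_factorial(n_parts):
    """All sign combinations, -1 for variant 1 and +1 for variant 2."""
    # product() varies the last element fastest; reversing each tuple makes
    # the first part alternate fastest, as in a standard design table.
    return [tuple(reversed(signs)) for signs in product((-1, 1), repeat=n_parts)]


def half_fraction(design):
    """Keep the runs whose sign product is +1 (assumed selection rule)."""
    selected = []
    for run in design:
        sign_product = 1
        for sign in run:
            sign_product *= sign
        if sign_product == 1:
            selected.append(run)
    return selected


def name(run):
    """Write a run as a structure label, e.g. (+1, -1, -1, -1) -> A2B1C1D1."""
    return "".join(f"{part}{1 if sign < 0 else 2}" for part, sign in zip(PARTS, run))


design = full_factorial(4)
chosen = half_fraction(design)
print(len(design), len(chosen))
print([name(run) for run in chosen])
```

In the selected half, every part appears four times at each level, and any two parts show all four level combinations equally often. This balance is what allows main effects to be estimated from only eight chimeras.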
When making a fractional factorial design where only eight chimeric receptors are manufactured, the selection should desirably be made from the ones marked "YES" in the "Manufacture" column of Table 3.1. This ensures that the best subset is selected, resulting in the best possible representation of all possible chimeric molecules when only eight molecules are generated. The molecules marked "NO" in the Manufacture column are the complementary chimeras, which together with the "YES" chimeras make up the full factorial design.

Table 3.1

No.  Structure  A  B  C  D  Manufacture
1    A1B1C1D1   -  -  -  -  YES
2    A2B1C1D1   +  -  -  -  NO
3    A1B2C1D1   -  +  -  -  NO
4    A2B2C1D1   +  +  -  -  YES
5    A1B1C2D1   -  -  +  -  NO
6    A2B1C2D1   +  -  +  -  YES
7    A1B2C2D1   -  +  +  -  YES
8    A2B2C2D1   +  +  +  -  NO
9    A1B1C1D2   -  -  -  +  NO
10   A2B1C1D2   +  -  -  +  YES
11   A1B2C1D2   -  +  -  +  YES
12   A2B2C1D2   +  +  -  +  NO
13   A1B1C2D2   -  -  +  +  YES
14   A2B1C2D2   +  -  +  +  NO
15   A1B2C2D2   -  +  +  +  NO
16   A2B2C2D2   +  +  +  +  YES

The concept can be further illustrated graphically for the case where molecules are combined from only three parts. Making all possible combinations would result in a total of eight molecules, see Table 3.2, which corresponds to the full factorial design, as is further illustrated in Figure 5 (i.e. all eight possible molecules being illustrated by white and shaded circles). Making a fractional factorial design (marked in the Manufacture column of Table 3.2 as "YES", and as shaded circles in Figure 4) will cover the possible combinations as well as possible when only four receptors are manufactured. (That this is the case becomes evident when one analyzes Figure 4.) It is of course equally valid to choose the "NO" chimeras of Table 3.2, as these represent the complementary part of the full factorial design.

Table 3.2

No.  Structure  A  B  C  Manufacture
1    A1B1C1     -  -  -  NO
2    A2B1C1     +  -  -  YES
3    A1B2C1     -  +  -  YES
4    A2B2C1     +  +  -  NO
5    A1B1C2     -  -  +  YES
6    A2B1C2     +  -  +  NO
7    A1B2C2     -  +  +  NO
8    A2B2C2     +  +  +  YES

When the selected chimeras of X (and possibly Y) have been generated according to the proper experimental design and tested in order to obtain the biological activity (BA), the results are analyzed according to the procedures of the invention. The use of experimental design according to the principles of the present method (and as is further described in the literature; see Lundstedt et al. 1998) ensures that the experimental efforts will yield as much knowledge and information as possible about the investigated system. By analysing the results of the testing with a calculation method, e.g. PLS, new directions as to how new, more effective molecules of type X and/or Y should be constructed may be derived.

The use of experimental design is not limited to cases where only two variants of each part are present. It may be generalized to any case including any desired number of variants for each part, building block or composite building block.

The second step in this procedure is to describe the ligands Y in a relevant way. The approaches include those described above and also those which are used to design and describe informative chemical libraries or informative peptide libraries (Lundstedt et al. 1997 and Andersson et al. 1999).

Any other description of the Ligand Y may be used, such as any of the conventional descriptions used in QSAR and MQSAR for description of physicochemical properties of organic molecules. Examples of such useful descriptions include GRID (Goodford, 1985) and GRIND descriptors (http/www.miasrl.com/software/amanual/backgr.html of November 15, 2000).

A preferred method for the description of the Ligand Y is through the use of an informative peptide library.
Methods for design of an informative peptide library

The twenty natural amino acids (aa's) were characterised using the z-scale developed by Hellberg et al., Table 1, resulting in a description of each amino acid with three numerical variables. The twenty aa's were thereafter sorted into nine different groups according to a 2^3 full factorial design, as in Table 4.

Table 4
Amino acids sorted according to the following 2^3 design.

Exp. No.  z1  z2  z3
1         -   -   -
2         +   -   -
3         -   +   -
4         +   +   -
5         -   -   +
6         +   -   +
7         -   +   +
8         +   +   +
9         0   0   0

This resulted in the groups presented in Table 5. For some of the experimental settings in the design there were no alternatives among the natural aa's. As an alternative, the aa closest to the setting was selected using a visual inspection of the score plot. These are indicated in grey in Table 5. For the center point (0;0;0), there was also no obvious alternative and therefore a number of aa's in the vicinity of the center of the structural space were selected to represent the center points.

Table 5. The resulting grouping of the aa's.

Exp. No. 1, amino acid setting - - -
No.  Amino acid     Coding      z1      z2      z3
2    Valine         VAL  V    -2.69   -2.53   -1.29
4    Isoleucine     ILE  I    -4.44   -1.68   -1.03
3    Leucine        LEU  L    -4.19   -1.03   -0.98
8    Methionine     MET  M    -2.49   -0.27   -0.41   CP

Exp. No. 2, amino acid setting + - -
14   Threonine      THR  T     0.92   -2.09   -1.40   CP

Exp. No. 3, amino acid setting - + -
[entry not legible in the source]                     CP

Exp. No. 4, amino acid setting + + -
10   Arginine       ARG  R     2.88    2.52   -3.44
9    Lysine         LYS  K     2.84    1.41   -3.14
18   Glutamine      GLN  Q     2.18    0.53   -1.14
(20  Glutamic acid  GLU  E     3.08    0.39   -0.07)  CP

Exp. No. 5, amino acid setting - - +
[entry not legible in the source]

Exp. No. 6, amino acid setting + - +
12   Glycine        GLY  G     2.23   -5.36    0.30
13   Serine         SER  S     1.96   -1.63    0.57   CP
15   Cysteine       CYS  C     0.71   -0.97    4.13

Exp. No. 7, amino acid setting - + +
7    Tryptophan     TRP  W    -4.75    3.65    0.85
6    Phenylalanine  PHE  F    -4.92    1.30    0.45
5    Proline        PRO  P    -1.22    0.88    2.23   CP

Exp. No. 8, amino acid setting + + +
11   Histidine      HIS  H     2.41    1.74    1.11
17   Asparagine     ASN  N     3.22    1.45    0.84
19   Aspartic acid  ASP  D     3.64    1.13    2.36

Exp. No. 9, amino acid setting 0 0 0
1    Alanine        ALA  A     0.07   -1.73    0.09
20   Glutamic acid  GLU  E     3.08    0.39   -0.07
8    Methionine     MET  M    -2.49   -0.27   -0.41
13   Serine         SER  S     1.96   -1.63    0.57
14   Threonine      THR  T     0.92   -2.09   -1.40
5    Proline        PRO  P    -1.22    0.88    2.23
16   Tyrosine       TYR  Y    -1.39    2.32    0.01

Using Table 5, the selection of peptides to include in peptide libraries ranging from di- to heptapeptides was made, as presented in Examples 3 to 8.

The library may also consist of non-peptidic compounds, e.g. low molecular weight organic or inorganic compounds.

STEP 3

The third step in the procedure is to measure the interaction between the ligands of the type Y and targets of the type X. This may be measured by any means known per se. The interaction may be quantitated, for example, on the basis of binding affinity, selectivity, activity, biological activity, avidity, Km of an enzyme, hybridisation or any other means which directly or indirectly provides a measure of the interaction.

Preferably the affinity or activity of the different Ligands Y (most preferably from an informative compound library) for a target X or a number of targets X is measured.

The binding affinity or biological activity may, for example, be determined using methods described by Lunec et al. (1992), Szardenings et al. (1992), Schiöth et al. (1992) or other similar methods.

A very specific example of how to determine the biological activity (BA) is provided by ligand binding methods. In such a method different concentrations of X and Y are usually incubated together. Usually the concentration of Y is varied systematically across different assays containing the same concentration of X, and the amount of Y bound to X is then measured and related to the activity for the interaction of variants of Y with variants of X. In most cases, a third labelled molecule (the "labelled ligand") is added which also binds to X, the binding of the labelled ligand being prevented by Y. The degree to which, and the concentrations at which, variants of Y prevent the binding of the labelled ligand to variants of X are related to the activity of the interactions of the Y's with the X's. Such ligand binding methods are well known in the art; specific examples are found in Uhlen and Wikberg (1991) and Schiöth et al. (1995).

The binding approach may also be useful when X is a non-protein macromolecule, such as DNA. Methods for hybridization measurements for DNA are well known in the art.

When X includes enzymes, the capacity of variants of X to convert a substrate to a product may be measured. The influence of different concentrations of variants of Y in either inhibiting or promoting the conversion of the substrate to the product may then be measured and used as a measure of BA.

Other examples for use as measures of BA include quantifying second messenger elements (cAMP, cGMP, intracellular calcium concentrations, inositol trisphosphate, diacylglycerol, and the like), and quantifying protein phosphorylation (including phosphorylations of tyrosine, serine and threonine). Such measurements can typically be done in organs, isolated cells, cells in culture, cell-free systems, membrane preparations, and the like.
Other examples of BA include measurements of ion-channel opening and closure, single ion channel currents, membrane potentials, voltage clamping and other electrophysiological measurements.

When X and Y are both represented by macromolecules, any suitable biochemical, biophysical or pharmacological response related to the interaction of X and Y can be used as a measure of BA. A very specific example is quantifying the dimerization of tyrosine kinase receptors. In such a case, X could e.g. be one variant of a subunit of the tyrosine kinase and Y another variant of a subunit of the tyrosine kinase, with the capacity of X to interact with Y being quantitated and used as a measure of BA.

Yet another example of a measurement for use as BA is measuring the avidity. Thus X could include variants of antibodies and Y could include variants of antigens. The degree of interaction of X with Y can then be measured by using methods well known in immunology, such as by quantifying avidity.

The X may also be included in a multicellular organism. The production of transgenic animals is well known in the art. Obtaining the BA may e.g. involve the administration of a Y to transgenic animals containing different variants of X and observing any desired physiological response in the animal.

X may also be included in a viral particle or a phage. E.g. X may be included within the amino acid sequence of a capsid protein of a virus or phage, e.g. M13 phages.

In some cases it may be possible to calculate values for the interactions of variants of X with variants of Y. Such calculated values are also of direct use for the procedures of the present invention. However, this embodiment of the invention is very rare and seldom used in practice.
The method for obtaining BA according to the procedure of the invention may be used in several ways for the analysis of the interactions of X and Y, for the design of improved macromolecules X and/or for the design of improved molecules Y.

It will, of course, be appreciated that Steps 1, 2 and 3 do not have to be carried out in this order. They may be carried out in any order or even simultaneously.

STEP 4

The fourth step is to establish a mathematical model describing the observed interaction between the Ligand Y and the Target X as a function of the properties of the ligands Y and the properties of the targets X.

A preferred procedure for identifying an active site in a macromolecule is based on the chimeric approach exemplified by receptors. This is the fastest and simplest route for finding the region of the target wherein the active site is located. The chimeric receptors are preferably combined in accordance with a multivariate design, such as a factorial or fractional factorial design, in order to obtain a well balanced and informative set of combined "chimeric" receptors. This is the first real step towards informative combinatorial biology. However, the use of naturally-occurring variants of the Target X has proven to be surprisingly effective and useful in conjunction with the present invention.

Identification of the active site is done by describing the biological activity (BA) as a function of the properties of the ligands Y, the properties of the targets X, and the interaction between the ligands Y and the targets X. The interaction is defined by multiplying the descriptors capturing the properties of the ligand with the descriptors of the target, where the descriptors may be principal properties or other chemical and/or physical descriptors. However, new descriptors may be generated by any function of the descriptors of the target X and ligand Y.
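By way of illustration only, the multiplication of target descriptors with ligand descriptors described above can be sketched as follows. The function name `cross_terms` and the descriptor values are assumptions made for this sketch and are not taken from the original disclosure.

```python
from itertools import product


def cross_terms(target, ligand):
    """Multiply every target descriptor with every ligand descriptor."""
    return [x * y for x, y in product(target, ligand)]


# Hypothetical binary descriptors: four target parts and two ligand parts.
target = [0, 1, 1, 0]
ligand = [1, 1]

# One row of the descriptor matrix: target terms, ligand terms, interaction terms.
row = target + ligand + cross_terms(target, ligand)
print(row)
```

Each interaction term is non-zero only when both the target part and the ligand part it combines are non-zero, which is what allows a coefficient on that term to be attributed to a specific target-ligand pairing.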
One example of a model of the biological activity is given by the equation:

$$BA = f(X, Y)$$

This general function can, by a Taylor expansion, be approximated by a polynomial function of varying degree of complexity. In the invention, it was found that a second order interaction model is sufficient for identification of "active sites" in a macromolecule (see the equation below, in matrix form):

$$BA = BA_{average} + b_1 (X) + b_2 (Y) + b_{12} (X)(Y)$$

The coefficients in the equation above may be determined by PLS but may also, if the number of measurements is big enough, be determined by PCR, MLR, NN (neural net), stepwise regression or another similar method.

The coefficients in the equation provide the necessary information for finding the location of the binding site in the target, as well as important information on the chemical/physical properties needed for a very active ligand. The coefficients provide information about which features are important in the ligands and in the targets, and about the important features of the interaction between them.

The binding site and/or active site is identified from the interaction terms between the chemical/physical descriptors or "principal properties" (PP's) of the ligands and the chemical/physical descriptors or principal properties of the target. Biological activity (BA) is described as a function of the PP's of the ligand, the PP's of the target and the interaction term between the ligand and the target.

The estimated correlation coefficient of the "target-ligand interaction" provides information on the position of the active site in the macromolecule as well as a description of the chemical/physical properties of the active site. This information is of outstanding value in the design of new leads as well as in lead optimisation. Mathematically this is a simple procedure which surprisingly provides information regarding the "active site".
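The determination of the coefficients of the second order interaction model can be illustrated with a minimal sketch. It assumes, purely for illustration, one descriptor for X and one descriptor for Y, each coded -1/+1 in a balanced full factorial design; under that assumption the multiple linear regression (MLR) estimates reduce to simple averages. The function name `fit_interaction_model` and the synthetic data are hypothetical, and PLS, which is the preferred method, is not shown.

```python
def fit_interaction_model(x, y, ba):
    """Estimate BA_average, b1, b2 and b12 for a balanced -1/+1 coded design.

    For such a design the columns x, y and x*y are mutually orthogonal,
    so each least squares coefficient is the column-weighted mean of BA.
    """
    n = len(ba)
    xy = [xi * yi for xi, yi in zip(x, y)]
    for column in (x, y, xy):
        if any(abs(v) != 1 for v in column) or sum(column) != 0:
            raise ValueError("design must be balanced and coded -1/+1")
    ba_average = sum(ba) / n
    b1 = sum(c * v for c, v in zip(x, ba)) / n
    b2 = sum(c * v for c, v in zip(y, ba)) / n
    b12 = sum(c * v for c, v in zip(xy, ba)) / n
    return ba_average, b1, b2, b12


# Synthetic example: a 2 x 2 full factorial design in X and Y.
x = [-1, -1, 1, 1]
y = [-1, 1, -1, 1]
# Responses generated from BA = 2.0 + 0.5*X + 0.8*Y + 0.3*X*Y (made-up values).
ba = [2.0 + 0.5 * xi + 0.8 * yi + 0.3 * xi * yi for xi, yi in zip(x, y)]

coefficients = fit_interaction_model(x, y, ba)
print(coefficients)
```

In this sketch a non-zero b12 indicates that the effect of the target descriptor depends on the ligand descriptor, which is the kind of information the interaction terms are used for in the text above.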
The model is preferably produced using one or more of multivariate methods, partial least squares methods, neural networks, multiple linear regression, non-linear regression, curve fitting, model fitting, stepwise regression and maximum likelihood methods.

STEP 5

When the region for the active site has been located, a new description of the interesting regions of the target may optionally be made with higher resolution, i.e. each AA in the interesting region is replaced by its principal properties (see Table 1). If further information regarding the target is needed, then exchange of specific AA's can be made by mutations in the interesting region or regions of the target. This should preferably be done by an informative design in order to ensure diversity in properties.

The model derived by the use of the present invention may be directly useful for predicting the properties of novel targets of type X as well as novel ligands of type Y. Hence the invention is particularly useful in drug design as well as in the engineering of new molecules of type X or Y (e.g. in protein engineering).

The processes in accordance with the present invention may be implemented at least partially using software, e.g. computer programs. It will thus be seen that, when viewed from further aspects, the present invention provides computer software specifically adapted to carry out the processes hereinabove described when installed on data processing means, and a computer program element comprising computer software code portions for performing the processes hereinabove described when the program element is run on data processing means. The invention also extends to a computer software carrier comprising such software, particularly when used to operate a process of the invention.
Such a computer software carrier may be a physical storage medium such as a ROM chip, CD ROM or disk, or may be a signal such as an electronic signal over wires, an optical signal or a radio signal such as to a satellite or the like.

It will further be appreciated that not all steps of the method of the invention need to be carried out by computer software, and thus from a further broad aspect the present invention provides computer software, and such software installed on a computer software carrier, for carrying out at least one of the steps of the processes set out hereinabove.

LEGENDS TO THE FIGURES

Figure 1 Translation of physical and chemical properties of amino acids into principal properties.
Figure 2 The ACC approach.
Figure 3 Calculation of auto covariances and auto cross covariances.
Figure 4 Full factorial and fractional factorial design.
Figure 5 Generalised template for X of Example 1 comprising aligned MC1 and MC3 receptor amino acid sequences and division of template into parts A, B, C and D.
Figure 6 Generalised template for Y of Example 1 comprising aligned MSH and MS04 peptide amino acid sequences and division of template into parts α and β.
Figure 7 Molecules of type Y in Example 1.
Figure 8 Permutation testing in Example 1:1.
Figure 9 Observed versus calculated Ki for model of Example 1:1.
Figure 10 Permutation testing in Example 1:2.
Figure 11 Observed versus calculated Ki for model of Example 1:2.
Figure 12 Permutation testing in Example 1:3.
Figure 13 Observed versus calculated Ki for model of Example 1:3.
Figure 14 Variable importance in the projection (VIP) in Example 1:4.
Figure 15 Variable importance in the projection (VIP) in Example 1:6.
Figure 16 Alignment of three subtypes of human wild-type alpha-1 adrenoceptors.
Figure 17 52 positions with sequence variation, extracted from TM regions of human alpha-1 adrenoceptor subtypes and chimeras of alpha-1 adrenoceptors.
Black characters on white background denote variation between 2 amino acids; white characters on black background denote variation between 3 amino acids.
Figure 18 X data set from Example 2.
Figure 19 Molecular template and details of the compounds used in Example 2.
Figure 20 Y block data from Example 2.
Figure 21 Summary of pKi values (BA) reported for alpha-1 adrenergic receptor interactions with 4-piperidyl oxazole antagonists.
Figure 22 Graph showing observed versus calculated pKi for Example 2.
Figure 23 Normalised PLS regression coefficients from Example 2:1.
Figure 24 Summation of MIPs and MICs for parts A-G corresponding to TM regions 1-7 (Example 2:2).
Figure 25 The 16 peptides selected according to a 2^(6-2) fractional factorial design + 3 cp (or 1 cp 2 random).
Figure 26 16 peptides selected according to a 2^(9-5) fractional factorial design + 3 cp (or 1 cp 2 random).
Figure 27 32 peptides selected according to a 2^(12-7) fractional factorial design + 3 cp (or 1 cp 2 random).
Figure 28 32 peptides selected according to a 2^(15-10) fractional factorial design + 3 cp (or 1 cp 2 random).
Figure 29 32 peptides selected according to a 2^(18-13) fractional factorial design + 3 cp (or 1 cp 2 random).
Figure 30 51 peptides selected according to a 2^(21-16) fractional factorial design with additional experiments added from a half a fold-over + 3 cp.

EXAMPLES

The following examples are intended to illustrate but not to limit the scope of the invention.

Example 1:1

Analysis of the interaction of MSH-peptide variants with chimeric MC1 and MC3 receptors by use of the method of the invention

The analysis is divided into steps as described below, conforming essentially to the above described steps of the invention.

i) A macromolecular template X was made by using a generalised structure of the melanocortin receptor 1 (MC1) (FEBS Lett. 1992, 309, 417-420) and melanocortin receptor 3 (MC3) (J. Biol. Chem. 1993, 268, 8246-8250).
Thus, X was generalised into one entity by aligning the MC1 and MC3 receptor amino acid sequences as earlier described (J. Molecular Graphics Modelling. 1997, 15, 307-317) (Figure 5). The template thus formed was then divided into 4 parts termed A, B, C and D, as illustrated in Figure 5, which shows the aligned amino acid sequences of the MC1 and the MC3 receptors with the parts A, B, C and D of template X indicated.

ii) Thus, according to the foregoing paragraph, there are two structural variants for each part of the template X. We then selected the A, B, C and D parts from the MC1 receptor sequence and termed them A1, B1, C1 and D1 (Figure 5). In similar fashion, parts from the MC3 receptor sequence were also selected and termed A2, B2, C2 and D2 (Figure 5). Combining the different variants of each part in template X gives a total of 16 possible combinations: one being the MC1 receptor, one the MC3 receptor, and 14 being MC1/MC3 receptor chimeras. Using molecular biological techniques we had earlier made 8 of these 14 possible chimeras (Mol. Pharmacol. 1998, 54, 154-161). For the present analysis we thus had in total 10 different receptors, which by their parts could be described as A1B1C1D1 (i.e. native MC1 receptor), A1B1C1D2, A1B1C2D1, A1B1C2D2, A1B2C1D1, A1B2C2D1, A1B2C2D2, A2B2C1D1, A2B2C2D1 and A2B2C2D2 (i.e. native MC3 receptor).

iii) The template Y was made by using a generalised structure derived from two known peptides, MSH and MS04 (J. Biol. Chem. 1997, 272, 27943-27948) (Figure 6). As shown in the figure, both peptides have a common sequence in the middle, but their C- and N-terminals differ. Using this central common part, the two peptides could be aligned to each other, creating the template Y (Figure 6). We then divided Y into three parts: the N-terminal part, the middle part and the C-terminal part (see Figure 6). Because both peptides have exactly the same sequence in their middle part, we neglected it in the further analysis, leaving two selected parts in Y: α (i.e. the N-terminal part) and β (i.e. the C-terminal part).

iv) Thus, according to the foregoing paragraph, there are two structural variants for each part of the template Y. Combining the different variants of each part in template Y gives a total of 4 possible combinations: one being MSH, one MS04, and two being MSH/MS04 chimeras. All were synthesized, thus yielding MSH (α1β1), MS04 (α2β2), MS05 (α2β1) and MS06 (α1β2) (Figure 7).

v) In the present example, we used a binary representation of the data for both X and Y. To the variants with subscript 1 we assigned the value 0, and to the variants with subscript 2 we assigned the value 1. Thus, e.g., the MC1 receptor together with the MSH peptide could be described with six zeroes (0,0,0,0,0,0), whereas e.g. the chimeric receptor A1B1C2D1 with the MS06 peptide (α1β2) could be described as 0,0,1,0,0,1. An abstraction of the signals used is shown in Table 6 for all the 40 possible cases:

Table 6

Molecules                                         Descriptors       XY
X                         Y                       X         Y       Z
Receptor     Receptor     Peptide   Peptide       A B C D   α β     Binding
name         structure    name      structure                       Log(Ki)
MC1          A1B1C1D1     MSH       α1β1          0 0 0 0   0 0      0.21
MC1          A1B1C1D1     MS04      α2β2          0 0 0 0   1 1      0.82
MC1          A1B1C1D1     MS05      α2β1          0 0 0 0   1 0     -0.07
MC1          A1B1C1D1     MS06      α1β2          0 0 0 0   0 1      1.65
1(1)3        A1B2C2D2     MSH       α1β1          0 1 1 1   0 0      1.37
1(1)3        A1B2C2D2     MS04      α2β2          0 1 1 1   1 1      3.82
1(1)3        A1B2C2D2     MS05      α2β1          0 1 1 1   1 0      2.03
1(1)3        A1B2C2D2     MS06      α1β2          0 1 1 1   0 1      2.72
1(6)3        A1B1C1D2     MSH       α1β1          0 0 0 1   0 0      1.37
1(6)3        A1B1C1D2     MS04      α2β2          0 0 0 1   1 1      3.36
1(6)3        A1B1C1D2     MS05      α2β1          0 0 0 1   1 0      2.30
1(6)3        A1B1C1D2     MS06      α1β2          0 0 0 1   0 1      2.18
1(1)3(6)1    A1B2C2D1     MSH       α1β1          0 1 1 0   0 0      1.56
1(1)3(6)1    A1B2C2D1     MS04      α2β2          0 1 1 0   1 1      4.14
1(1)3(6)1    A1B2C2D1     MS05      α2β1          0 1 1 0   1 0      2.31
1(1)3(6)1    A1B2C2D1     MS06      α1β2          0 1 1 0   0 1      2.83
1(1)3(4)1    A1B2C1D1     MSH       α1β1          0 1 0 0   0 0      1.85
1(1)3(4)1    A1B2C1D1     MS04      α2β2          0 1 0 0   1 1      4.19
1(1)3(4)1    A1B2C1D1     MS05      α2β1          0 1 0 0   1 0      2.57
1(1)3(4)1    A1B2C1D1     MS06      α1β2          0 1 0 0   0 1      3.70
1(4)3        A1B1C2D2     MSH       α1β1          0 0 1 1   0 0      1.55
1(4)3        A1B1C2D2     MS04      α2β2          0 0 1 1   1 1      3.11
1(4)3        A1B1C2D2     MS05      α2β1          0 0 1 1   1 0      2.41
1(4)3        A1B1C2D2     MS06      α1β2          0 0 1 1   0 1      2.40
1(4)3(6)1    A1B1C2D1     MSH       α1β1          0 0 1 0   0 0      0.54
1(4)3(6)1    A1B1C2D1     MS04      α2β2          0 0 1 0   1 1      1.84
1(4)3(6)1    A1B1C2D1     MS05      α2β1          0 0 1 0   1 0      0.86
1(4)3(6)1    A1B1C2D1     MS06      α1β2          0 0 1 0   0 1      1.57
3(4)1        A2B2C1D1     MSH       α1β1          1 1 0 0   0 0      1.77
3(4)1        A2B2C1D1     MS04      α2β2          1 1 0 0   1 1      3.08
3(4)1        A2B2C1D1     MS05      α2β1          1 1 0 0   1 0      2.76
3(4)1        A2B2C1D1     MS06      α1β2          1 1 0 0   0 1      2.88
3(6)1        A2B2C2D1     MSH       α1β1          1 1 1 0   0 0      1.84
3(6)1        A2B2C2D1     MS04      α2β2          1 1 1 0   1 1      3.67
3(6)1        A2B2C2D1     MS05      α2β1          1 1 1 0   1 0      2.63
3(6)1        A2B2C2D1     MS06      α1β2          1 1 1 0   0 1      2.86
MC3          A2B2C2D2     MSH       α1β1          1 1 1 1   0 0      1.97
MC3          A2B2C2D2     MS04      α2β2          1 1 1 1   1 1      4.49
MC3          A2B2C2D2     MS05      α2β1          1 1 1 1   1 0      3.02
MC3          A2B2C2D2     MS06      α1β2          1 1 1 1   0 1      3.17

vi) In order to obtain quantitative information for peptide and receptor for the purpose of deriving BA, we performed standard binding assays, essentially as described in Mol. Pharmacol. 1998, 54, 154-161, for all receptors versus all peptides included in the analysis, resulting in a data set of 40 binding constants (Ki values). In the further analysis, we used the positive logarithm of the Ki values [Log10(Ki)] in order to derive the BA that was used, an abstraction being shown in Table 6.

vii) We then applied the partial least squares (PLS) analysis method (Analytica Chimica Acta, 1986, 185, 1-17) to correlate the stored signals obtained in step v) with the stored signals obtained in step vi). For this purpose we used the Simca program (Umetri AB, Box 7960, SE-90719 Umeå, Sweden), which was appropriately configured for use of the appropriate stored profiles, as is detailed further below, and which resulted in a highly useful model of the BA.

Results.

One PLS component (see Analytica Chimica Acta 1986, 185, 1-17 for a description of PLS components) was sufficient for deriving a good model BA. The R² and Q² values for the model were 0.70 and 0.61, respectively (see Eriksson et al., 1996, for the definition of R² and Q²). (Computations were performed using SIMCA 7.0 (see SIMCA 7.0 manual, 1998) using autofit with 7 cross-validation groups, which indicated only one significant PLS component.) Moreover, additional validation of model BA was performed by randomising the data (i.e. the input signals) and calculating the corresponding R² and Q² values for each random model, i.e. by performing so-called permutation testing (see Eriksson et al., 1996, for a description of the procedure). The results, which are represented as the output abstraction shown in Figure 8, demonstrate the usefulness of the model BA.
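The idea behind permutation testing can be sketched as follows, as an illustration only. The sketch assumes a single descriptor and an ordinary least squares fit as a stand-in for the PLS model, computes R² only (Q² would additionally require cross-validation), and uses synthetic data; the function names, the number of permutations and the seed are hypothetical choices rather than the SIMCA procedure.

```python
import random


def r_squared(x, y):
    """R² of a one-descriptor least squares fit (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def permutation_test(x, y, n_perm=200, seed=0):
    """Compare the observed R² with R² values obtained after shuffling y."""
    rng = random.Random(seed)
    observed = r_squared(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between descriptor and response
        if r_squared(x, shuffled) >= observed:
            hits += 1
    # Fraction of random models at least as good as the real one.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value


# Synthetic data with a perfect linear relationship (made-up values).
x = list(range(10))
y = [2 * v + 1 for v in x]
observed, p_value = permutation_test(x, y)
print(observed, p_value)
```

A model is supported when the R² obtained with the real responses is clearly higher than the R² values obtained with the randomised responses.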
The goodness of the fit was further explored by comparing predicted and actual values of the responses (i.e. predicted BA versus measured BA). The correlation results are shown as the abstraction in Figure 9. As can be seen, the correlation is good.

Example 1:2

Improvement of model BA of Example 1:1 by the addition of cross-terms

The model BA of Example 1:1 was improved by adding cross-terms. This was done by calculating new descriptor signals from the original descriptor signals given in Table 6 of Example 1:1 by performing all possible multiplications of two different original descriptors. The new descriptor signals thus obtained are generally referred to as cross-terms (see SIMCA 7.0 manual, 1998). The improved PLS model (i.e. improved model BA; in the following termed model BA of Example 1:2) was obtained using SIMCA autofit, had 2 significant components (see SIMCA 7.0 manual, 1998) and yielded R² and Q² values of, respectively, 0.95 and 0.66. The permutations of the new model are shown in the output abstraction of Figure 10. Figure 11 shows an output abstraction representing the comparison of the calculated BAs, derived by use of the model BA of Example 1:2, and the measured BAs. As seen, the correlation is excellent.

Example 1:3

Improvement of model BA of Example 1:2 by removing descriptors with low variable influence

A new model BA was created from the model of Example 1:2 by removing descriptor signals which had variable influence values lower than 0.3 (see SIMCA 7.0 manual, 1998, for the meaning of variable influence and how this is performed) and performing PLS calculations essentially as described above for Examples 1:1 and 1:2. The permutations of the new model BA (in the following termed model BA of Example 1:3) are shown by the output abstraction represented in Figure 12.
Figure 13 shows the output abstraction representing a comparison of the calculated BAs derived by use of the model BA of Example 1:3 and the measured BAs. As seen, the correlation for the values is excellent.

Example 1:4

Analysis of influence of parts

The model BA of Example 1:3 was used to analyze the influence and interactions of parts in X and Y. This was done by calculating the variable importance in the projection (VIP) for each descriptor of Example 1:3 (including the cross-terms retained in Example 1:3) using SIMCA 7.0 (see SIMCA 7.0 manual, 1998, p. 15-11). An output abstraction representing these influences is shown in Figure 14. As can be seen from the abstraction, the highest influence is exerted by part β of Y and part B of X. Part A of X and part α of Y are also important, while the D and C parts of X are unimportant. Although part D is not important, the interaction of this part with part B (i.e. the B x D column) shows a significant effect on the responses (Figure 14).

Example 1:5

Use of model BA of Example 1:3 for prediction of properties of new receptors

The model BA created in Example 1:3 was used to predict the ability of new variants of X to bind MSH peptides. According to Example 1:1, step ii), only the signals derived from 8 of the 14 possible MC1/MC3 receptor chimeras were used. The interaction of the remaining 6 with the MSH peptides was predicted using the model BA of Example 1:3, an output abstraction for the prediction being shown in Table 7.

Table 7

Molecules                                         Descriptors       XY Predicted
X                         Y                       X         Y       PRZ
Receptor     Receptor     Peptide   Peptide       A B C D   α β     Predicted binding
name         structure    name      structure                       Log(Ki)
CH1          A1B2C1D2     MSH       α1β1          0 1 0 1   0 0     1.7147
CH1          A1B2C1D2     MS04      α2β2          0 1 0 1   1 1     4.0618
CH1          A1B2C1D2     MS05      α2β1          0 1 0 1   1 0     2.7177
CH1          A1B2C1D2     MS06      α1β2          0 1 0 1   0 1     3.0587
CH2          A2B1C1D1     MSH       α1β1          1 0 0 0   0 0     1.1064
CH2          A2B1C1D1     MS04      α2β2          1 0 0 0   1 1     2.1097
CH2          A2B1C1D1     MS05      α2β1          1 0 0 0   1 0     1.1273
CH2          A2B1C1D1     MS06      α1β2          1 0 0 0   0 1     2.0888
CH3          A2B1C1D2     MSH       α1β1          1 0 0 1   0 0     2.2616
CH3          A2B1C1D2     MS04      α2β2          1 0 0 1   1 1     3.7704
CH3          A2B1C1D2     MS05      α2β1          1 0 0 1   1 0     2.7879
CH3          A2B1C1D2     MS06      α1β2          1 0 0 1   0 1     3.244
CH4          A2B1C2D1     MSH       α1β1          1 0 1 0   0 0     1.7372
CH4          A2B1C2D1     MS04      α2β2          1 0 1 0   1 1     3.0229
CH4          A2B1C2D1     MS05      α2β1          1 0 1 0   1 0     2.0405
CH4          A2B1C2D1     MS06      α1β2          1 0 1 0   0 1     2.7197
CH5          A2B1C2D2     MSH       α1β1          1 0 1 1   0 0     2.8925
CH5          A2B1C2D2     MS04      α2β2          1 0 1 1   1 1     4.6835
CH5          A2B1C2D2     MS05      α2β1          1 0 1 1   1 0     3.7011
CH5          A2B1C2D2     MS06      α1β2          1 0 1 1   0 1     3.8749
CH6          A2B2C1D2     MSH       α1β1          1 1 0 1   0 0     1.4181
CH6          A2B2C1D2     MS04      α2β2          1 1 0 1   1 1     3.7652
CH6          A2B2C1D2     MS05      α2β1          1 1 0 1   1 0     2.4211
CH6          A2B2C1D2     MS06      α1β2          1 1 0 1   0 1     2.7621

Example 1:6

Studies of the interactions of parts of X with parts of Y

For this purpose we created a new model BA (in the following termed model BA of Example 1:6) using only those cross-terms containing signals derived from parts of both X and Y, together with the signals derived from the measured BAs, and using the same PLS procedure as above. The new model showed R² and Q² values of, respectively, 0.64 and 0.61. The variable importance in the projection (VIP) for each cross-term descriptor was calculated as in Example 1:4, an output abstraction for which is shown in Figure 15. As can be seen from Figure 15, the most important interactions are between part B and part β, and between part B and part α.

Example 2

Use of principal property variables of amino acids for describing macromolecules X in the analysis of the interaction of alpha-1 adrenoceptors with 4-piperidyl oxazoles

The published data of Hamaguchi et al. (Biochemistry. 37 (1998) 5730-5737), comprising studies on human alpha-1 adrenoceptor subtypes, formed the basis for the analysis. The analysis of this data was performed in a computer essentially according to the steps of the invention, as follows:

1. The three subtypes of the human wild-type alpha-1 adrenoceptors used in the Hamaguchi study were selected and aligned, thereby creating the macromolecular template X consisting of 7 parts A, B, C, D, E, F and G (i.e. the underlined amino acid sequences of Fig. 16). The alpha-1 receptor sequences were mixed as described in the Hamaguchi study, creating 12 wild-type and chimeric receptors. From the seven parts A-G of the 12 receptors the differing amino acids were identified as shown in Fig. 17. Each of the parts A-G was assigned numbers as follows: Sequence positions that did not differ among the 12 receptors were not assigned any numbers. (Note that amino acids at positions that did not differ are omitted in Fig. 17.)
For each amino acid position that differed by two amino acids or more among the 12 receptors, every amino acid was assigned 5 numbers selected from the 5 z-scale descriptors for amino acids derived by Sandberg (Sandberg et al., J. Med. Chem. 41 (1998) 2481-2491). However, for positions differing by only 2 amino acids, the 5 z-scale descriptor numbers were in an additional step merged into one number by calculating physico-chemical distances based on the two differing amino acids, as follows:

$$\Delta_{AB} = \sum_{z=1}^{5} (z_A - z_B)^2$$

wherein $\Delta_{AB}$ is the physico-chemical distance between amino acids A and B, $z_A$ the z-scale of amino acid A and $z_B$ the z-scale of amino acid B. The numbers of positions with two amino acids differing were, for parts A-G respectively, 9, 2, 5, 9, 9, 4 and 3 (41 in total). The numbers of positions with three amino acids differing were, respectively, 2, 3, 1, 2, 0, 2 and 1 (11 in total). Thus, in total 52 amino acid positions differed in the data set, which yielded in total 41 + 11*5 = 96 numbers for each receptor describing its physico-chemical properties. Thus, in total the X data set comprised a matrix of 96*12 = 1152 floating point numbers stored in the computer according to Fig. 18, hereinafter termed the X-block.

2. Twelve compounds comprising derivatives of 4-piperidyl oxazole modified at three positions were used. A molecular template Y for these compounds is indicated in Fig. 19, as well as each of the 12 compounds used. Each compound was coded using 24 binary descriptors comprising parts α, β and γ as shown in Fig. 20. Hereinafter the data created by these descriptors is termed the Y-block.

3. BA values for the interaction of X and Y defined in steps 1 and 2 of the present Example were obtained from the literature (Biochemistry. 37 (1998) 5730-5737) and are given as the pKi values shown in Fig. 21.

4. In order to correlate X and Y with BA, first all descriptors of Y were multiplied, creating the C1-block, and all descriptors of X and Y were multiplied, creating the C12-block. Thereby four blocks of descriptors, X, Y, C1 and C12, were obtained and stored in the computer. The descriptors were used to correlate to BA using PLS. In order to obtain optimal models the four descriptor blocks were scaled using scaling weights. Optimal scaling was achieved by giving the same scaling weight to one block and then varying the scaling weights of the other blocks systematically until an optimal model was found using the Simplex optimisation strategy (see Lundstedt et al., Chemometrics Intelligent Laboratory Systems. 42 (1998) 3-40). We also systematically excluded descriptors with VIPs < 0.3-0.5 until optimal models (with respect to Q² values) were obtained.

The model finally obtained showed R²X = 91.5 %, R²Y = 95.6 % and Q² = 91.3 %. PLS calculations were performed using SIMCA 7.0 software (Umetrics, Umeå, Sweden). (For definitions of R²X, R²Y, Q² and VIP see SIMCA 7.0 Manual, Umetrics, Umeå, Sweden.)

5. A graphical representation of the derived relationships is shown in Fig. 22, the figure showing observed and predicted pKi values.

Example 2:1

Assessment of importance of the physico-chemical properties of amino acids in alpha-1 adrenoceptors for binding 4-piperidyl oxazoles using normalized PLS regression coefficients

The model of Example 2 was used to assess the importance of amino acids in the alpha-1 adrenoceptors for their binding of the 4-piperidyl oxazoles. This was achieved by assessing the PLS regression coefficients of the model. (In order to normalize the coefficients they were multiplied by the standard deviation of the corresponding descriptors.) These normalised PLS regression coefficients are illustrated in Fig. 23. As can be seen, the largest impact is exerted by amino acids in TM2 (i.e. transmembrane region 2 = Part B) and TM5 (i.e. transmembrane region 5 = Part E).

Example 2:2

Assessment of importance of trans-membrane regions in alpha-1 adrenoceptors for binding of 4-piperidyl oxazoles

In order to get an overall assessment of the importance of TM regions, MIPs were calculated as follows:

$$MIP_a = \sigma_a \cdot |coeff_a|$$

wherein $MIP_a$ is the modelling importance of the primary term, $\sigma_a$ the standard deviation, and $coeff_a$ the regression coefficient of variable a in the data set.

The MIPs were summed for each of the parts A-G corresponding to TM regions 1-7, the results of which are illustrated in Fig. 24A. As can be seen, TM2 and TM5 show clearly higher importance than the other TM regions for the binding of 4-piperidyl oxazoles.

In order to find the specificity portion of the importance of TM regions, MICs were calculated as follows:

$$MIC_a = \sum_{n=1}^{N} \sigma_a \cdot |coeff_{a,n}| \cdot AD_n$$

wherein $MIC_a$ is the modelling importance, $\sigma_a$ the standard deviation, and $coeff_{a,n}$ the regression coefficient of cross-terms in the data set. $AD_n$ corresponds to the average deviation from the mean of the cross-term partners of a, and was approximated by $0.8 \cdot \sigma_n$, where $\sigma_n$ is the standard deviation of the cross-term partners of a.

The MICs were summed for each of the parts A-G corresponding to TM regions 1-7, the results of which are illustrated in Fig. 24B. As can be seen, TM2 and TM5 show clearly higher importance for the specificity of 4-piperidyl oxazoles binding to the alpha-1 adrenoceptors, compared to the other TM regions.

Example 3

Reference is made to the dipeptides disclosed in Figure 25.

The 16 peptides were selected according to a 2^(6-2) fractional factorial design + 3 cp (or 1 cp 2 random).

Example 4

Reference is made to the tripeptides disclosed in Figure 26.

The 16 peptides were selected according to a 2^(9-5) fractional factorial design + 3 cp (or 1 cp 2 random).

Example 5

Reference is made to the tetrapeptides disclosed in Figure 27.
The 32 peptides were selected according to a 2^(12-7) fractional factorial design + 3 cp (or 1 cp 2 random).

Example 6

Reference is made to the pentapeptides disclosed in Figure 28.

The 32 peptides were selected according to a 2^(15-10) fractional factorial design + 3 cp (or 1 cp 2 random).

Example 7

Reference is made to the hexapeptides disclosed in Figure 29.

The 32 peptides were selected according to a 2^(18-13) fractional factorial design + 3 cp (or 1 cp 2 random).

Example 8

Reference is made to the heptapeptides disclosed in Figure 30.

The 51 peptides were selected according to a 2^(21-16) fractional factorial design with additional experiments added from a half a fold-over + 3 cp.

REFERENCES

Adan, R.A., Oosterom, J., Toonen, R.F., Kraan, M.V., Burbach, J.P., Gispen, W.H.: Molecular pharmacology of neural melanocortin receptors. Receptors Channels. 1997, 5, 215-123.

Andersson, P.M., Sjöström, M., Lundstedt, T.: Preprocessing peptide sequences for multivariate sequence-property analysis. Chemometr. Intell. Lab. Syst. 1998, 42, 41-50.

Andersson, P.M., Linusson, A., Wold, S., Sjöström, M., Lundstedt, T., Nordén, B.: 'Design of Small Libraries for Lead Exploration'. In Molecular Diversity in Drug Design (Eds. R. Lewis, P.M. Dean), Kluwer Academic Publishers, November 1999, ISBN 0-7923-5980-1.

Baldwin, J.M.: The probable arrangement of the helices in G protein-coupled receptors. EMBO Journal. 1993, 12, 1693-1703.

Bard, Y.: Nonlinear parameter estimation. Academic Press, London, 1974, ISBN 0-12-078250-2.

Bergström, A. and Wikberg, J.E.S.: Structural and pharmacological differences between cod fish and rat brain alpha-1 receptors revealed by photoaffinity labeling with 125I-APDQ. Acta Pharmacol. Toxicol. 1986, 58, 148-155.

Box, G.E.P., Hunter, J.S., Hunter, W.G.: Statistics for experimenters: An introduction to design, data analysis, and model building. John Wiley & Sons, 1978.

Branden, C. and Tooze, J.: Introduction to protein structure. Garland Publishing, New York, 1991, ISBN 0-8153-0344-0.

Bylund, D.B., Eikenberg, D.C., Hieble, J.P., Langer, S.Z., Lefkowitz, R.J., Minneman, K.P., Molinoff, P.B., Ruffolo Jr, R.R., Trendelenburg, U.: IV. International union of pharmacology nomenclature of adrenoceptors. Pharmacol. Rev. 1994, 46, 121-136.

Carlson, R., Lundstedt, T., Albano, C.: Screening of suitable solvents in organic synthesis. Strategies for solvent selection. Acta Chem. Scand. 1985, B39(2), 79-91.

Carlson, R., Prochazka, P., Lundstedt, T.: Principal properties for synthetic screening: Ketones and aldehydes. Acta Chem. Scand. 1988, B42, 145-156.

Chothia, C. and Lesk, A.M.: The relation between the divergence of sequence and structure in proteins. EMBO J. 1986, 5, 823-826.

Clementi, M., Clementi, S., Clementi, S., Cruciani, G., Pastor, M.: "Chemometric detection of binding sites of 7TM receptors QSAR". In Molecular Modelling and Prediction of Bioactivity (Eds. K. Gundertofte, F.S. Jorgensen), Kluwer Academic/Plenum Publishers, New York, 2000.

Collantes, E.R., Dunn III, W.J.: Amino acid side chain descriptors for quantitative structure-activity relationship studies of peptide analogues. J. Med. Chem. 1995, 38, 2705-2713.

Cramer III, R.D., Patterson, D.E. and Bunce, J.D.: Comparative molecular field analysis (CoMFA). 1. Effect of shape on binding of steroids to carrier proteins. J. Amer. Chem. Soc. 1988, 110, 5959-5967.

Daveu, C., Bureau, R.: Definition of a pharmacophore for partial agonists of serotonin 5-HT3 receptors. J. Chem. Inf. Comput. Sci. 1999, 39, 362-369.

de Groot, M.J., Ackland, M.J.: Novel approach to predicting P450-mediated drug metabolism: development of a combined protein and pharmacophore model for CYP2D6. J. Med. Chem. 1999, 42, 1515-1524.

Eriksson, L., Johansson, E. and Wold, S.: Quantitative Structure-Activity Relationship Model Validation. In: Quantitative Structure-Activity Relationships in Environmental Science-VII, Eds. F. Chen et al., Proceedings of QSAR 1996, June 24-28, Elsinore, Denmark, SETAC Press, Florida, US, pages 381-397.

Frandberg, P-A., Muceniece, R., Prusis, P., Wikberg, J.E.S., Chhajlani, V.: Evidence for alternate points of attachment for α-MSH and its stereoisomer [Nle4, D-Phe7]-α-MSH at the melanocortin receptor. Biochem. Biophys. Res. Commun. 1994, 202, 1266-1271.

Goodford, J. Med. Chem. 1985, 28, 849-857.

Hansch, C., Maloney, P.P., Fujita, T., Muir, R.: Correlation of biological activity of phenoxyacetic acids with Hammett substituent constants and partition coefficients. Nature (London). 1962, 194, 178-180.

Hansch, C., Fujita, T.: ρ-σ-π Analysis. A method for the correlation of biological activity and chemical structure. J. Am. Chem. Soc. 1964, 86, 1616-1626.

Hellberg, S., Sjöström, M. and Wold, S.: The prediction of bradykinin potency of pentapeptides. An example of a peptide quantitative structure-activity relationship. Acta Chem. Scand. 1986, B40, 135-140.

Hellberg, S., Sjöström, M., Skagerberg, B., Wikström, C. and Wold, S.: On the design of multipositionally varied test series for quantitative structure-activity relationships. Acta Pharm. Jugoslavia. 1987, 37, 53-65.

Jackson, J.E.: A user's guide to principal components. Wiley, New York, 1991.

Jensen, K. and Wirth, N.: Pascal User Manual and Report, 3rd edition. Springer-Verlag, 1985.

Jonsson, J., Eriksson, L., Hellberg, M., Sjöström, M. and Wold, S.: Multivariate parametrization of 55 coded and non-coded amino acids. Quant. Struct.-Act. Relat. 1989, 8, 204-209.

Lawrence, J.: Neural networks. Design, theory and applications. California Scientific Software Press, Nevada City, CA 95959, USA, 1993.

Lundstedt, T.: The Willgerodt-Kindler reaction, a multivariate approach. (Thesis, Umeå) 1986. ISBN 91-7174-248-4.

Lundstedt, T., Andersson, P.M., Clementi, S., Cruciani, G., Kettaneh, N., Linusson, A., Nordén, B., Pastor, M., Sjöström, M., Wold, S.: 'Intelligent combinatorial libraries'. In Computer-assisted lead finding and optimization (Ed. H. van de Waterbeemd), Verlag Helvetica Chimica Acta, Basel, Switzerland, 1997, 191-208.

Lundstedt, T., Seifert, E., Abramo, L., Thelin, B., Nyström, A., Pettersen, J., Bergman, R.: Experimental design and optimization. Chemometrics Intelligent Laboratory Systems. 1998, 42, 3-40.

Lunec, J., Pieron, C., Thody, A.J.: MSH receptor expression and the relationship to melanogenesis and metastatic activity in B16 melanoma. Melanoma Res. (May 1992), 2(1), 5-12.

McGregor, M.J., Muskal, S.M.: Pharmacophore fingerprinting. 1. Application to QSAR and focused library design. J. Chem. Inf. Comput. Sci. 1999, 39, 569-574.

Nyström, A., Andersson, P.M., Lundstedt, T.: Multivariate data analysis of topographically modified-melanotropin analogues using Auto and Cross Auto Covariances (ACC). Quant. Struct.-Act. Relat. 264-269 (2000).

Rang, H.P., Dale, M.M. and Ritter, J.M.: Pharmacology, 4th edition. Churchill Livingstone, UK, 1999, ISBN 0443 059748.

Sandberg, M., Eriksson, L., Jonsson, J., Sjöström, M. and Wold, S.: New chemical descriptors relevant for the design of biologically active peptides. A multivariate characterization of 87 amino acids. J. Med. Chem. 1998, 41, 2481-2491.

Schiöth, H.B., Muceniece, R., Wikberg, J.E.S., Chhajlani, V.: Characterisation of melanocortin receptor subtypes by radioligand binding analysis. Eur. J. Pharmacol., Mol. Pharm. Sect. 1995, 288, 311-317.

Schiöth, H.B., Mutulis, F., Muceniece, R., Prusis, P., Wikberg, J.E.S.: Discovery of novel melanocortin 4 receptor selective MSH analogues. Br. J. Pharmacol. 1998, 124, 75-82.
Schi6th, H.B., Yook, P., Muceniece, R., Wikberg, JES., Szardenings, M.: Chimeric melanocortin 1/3 receptors: Identification of domains determining the specificity of MSH 10 peptides. Mol. Pharmacol. 1998, 54, 154-161. SIMCA 7.0 A new standard in multivariate data analysis, Manual, Edition August 21, 1998, Umetri AB, Box 7960, SE907 19 Umei, Sweden. 15 Sj6str6m, M. and Eriksson K: Application of statistical experimental design and PLS modelling in QSAR. In: QSAR: Chemometric method in molecular design, Methods and principles in medicinal chemistry, vol. 2. (Ed. H. Van de Waterbeemd) Verrlag Chemie, Weinheim, Germany, 1995, 63-90. 20 Szardenings, M., Tornroth, S., Mutulis, F., Muceniece, R., Keinanen, K., Kuusinen, A., Wikberg, J.E. Phage display selection on whole cells yields a peptide specific for melanocortin receptor 1. J. Biol. Chem.1997 Oct 31:272(44), 27943-8 Uhl6n, S., Wikberg, J.E.S.: Delineation of rat kidney a2A- and 2B-adrenoceptors with 25 [ 3 H]RX821002 radioligand binding: computer modelling reveals that guanfacine is an c(2A-selective compound. Eur. J. Pharmacol. 1991, 202, 235-243. Wold, S, Esbensen, K. and Geladi, P: Principal component analysis. In Chemometrics and intelligent laboratory systems, 1987, 2, 37-52. 30 Wold, S, Johansson, M., Cocchi, M.: PLS - partial least-squares projections to latent stuctures. In 3D QSAR in drug design; Theory, methods and application. (Ed. H. Kubinyi) ESCOM Science Publishers, Leiden, Holland, 1993a, 523-550. 55 WO 01/36980 PCT/GB0004420 Wold, S., Jonsson, M., Sj6str6m, M., Sandberg, S. and Rannar, S.: DNA and peptide sequences and chemical processes multivariately modelled by PCA and PLS projections to latent structures. Anal. Chim. Acta, 1993b, 227, 239-253. 5 Wold, S.: PLS for multivariate modelling. In: QSAR: Chemometric method in molecular design, Methods and principles in medicinal chemistry, vol. 2. (Ed. H. Van de Waterbeemd) Verlag Chemie, Weinheim, Germany, 1995, p. 195-218. S. Wold, M. Sj6str6m, P.M. 
Andersson, A. Linusson, M. Edman, T. Lundstedt, B. Nord6n, 10 M. Sandberg, L. Uppgird, Multivariate Design and Modelling in QSAR, Combinatorial Chemistry, and Bioinformatics, in Molecular Modelling and Prediction of Bioactivity, Eds. K. Gunddertofte, F.S. Jorgensen, Kluwer Academic/Plenum Publishers, New York (2000), p. 27-45 15 Zaliani, A and Gancia, E: MS-WHIM scores for amino acids: A new 3D-description for peptide QSAR and QSPR studies. J. Chem. Inf. Comput. Sci. 1999, 39, 525-533. 56
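The 2^(k-p) notation used in Examples 5 to 8 denotes a two-level design in k factors that needs only 2^(k-p) runs, so each of the designs named there has 2^5 = 32 base runs. The following is a minimal sketch of how such a design could be generated, not the procedure actually used for the peptides in the Figures. The function name `fractional_factorial`, the generator columns in `GENERATORS`, and the reading of "cp" as centre points coded at level 0 are all assumptions made for illustration.

```python
from itertools import product


def fractional_factorial(k, p, generators):
    """Return the runs of a two-level 2^(k-p) fractional factorial design.

    The first k - p columns form a full factorial in -1/+1 coding. Each of
    the p remaining columns is the product of the base columns named in the
    corresponding generator (a tuple of base-column indices).
    """
    base = k - p
    if len(generators) != p:
        raise ValueError("one generator is needed per additional factor")
    runs = []
    for levels in product((-1, 1), repeat=base):
        row = list(levels)
        for generator in generators:
            value = 1
            for index in generator:
                value *= levels[index]
            row.append(value)
        runs.append(row)
    return runs


# Hypothetical generators, chosen only for illustration (not from the source).
GENERATORS = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4)]

design = fractional_factorial(12, 7, GENERATORS)

# Assumption: "cp" means centre points, coded here as level 0 for every factor.
centre_points = [[0] * 12 for _ in range(3)]

print(len(design), len(design[0]))  # 32 12
```

Each row of `design` gives the low (-1) or high (+1) setting of the 12 factors for one run. In practice the generators would be chosen for the best achievable resolution, and each coded row would then be mapped to a concrete peptide.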
Claims (83)
1. A process for characterising the interaction between a Ligand Y and a Target X comprising:
Step 1 Obtaining information representing one or more chemical and/or physical properties of at least two ligands of the type Y;
Step 2 Obtaining information representing one or more chemical and/or physical properties of at least two targets of the type X;
Step 3 Obtaining information representing one or more chemical and/or physical properties of the interaction between at least two of the ligands of type Y and at least two of the targets of the type X;
and processing the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and Target X from which one or more of the properties of the interaction between the Ligand Y and the Target X may be identified and/or characterised.
2. A process for estimating the position of the active site in a Target X in an interaction between a Ligand Y and a Target X, or estimating one or more physical and/or chemical properties of the active site, comprising:
Step 1 Obtaining information representing one or more chemical and/or physical properties of at least two ligands of the type Y;
Step 2 Obtaining information representing one or more chemical and/or physical properties of at least two targets of the type X;
Step 3 Obtaining information representing one or more chemical and/or physical properties of the interaction between at least two of the ligands of type Y and at least two of the targets of the type X;
and correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X from which the position of the active site or one or more physical and/or chemical properties of the active site in the Target X may be estimated.
3. A process for identifying the position of the active site in an interaction between a Ligand Y and a Target X, or predicting one or more physical and/or chemical properties of the active site, comprising:
Step 1 Obtaining information representing one or more chemical and/or physical properties of at least two ligands of the type Y;
Step 2 Obtaining information representing one or more chemical and/or physical properties of at least two targets of the type X;
Step 3 Obtaining information representing one or more chemical and/or physical properties of the interaction between at least two of the ligands of type Y and at least two of the targets of the type X;
Step 4 Correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X;
and using the model to identify the position of the active site or one or more physical and/or chemical properties of the active site.
4. A process performed with the aid of a programmed computer for the estimation of the position of the active site in a Target X, in an interaction between a Ligand Y and a Target X, or one or more physical and/or chemical properties of the active site, comprising the steps of:
Step 1 Inputting information representing one or more chemical and/or physical properties of at least two ligands of the type Y;
Step 2 Inputting information representing one or more chemical and/or physical properties of at least two targets of the type X;
Step 3 Inputting information representing one or more chemical and/or physical properties of the interaction between at least two of the ligands of type Y and at least two of the targets of the type X;
Step 4 Computing or calculating a model from the inputted information which describes the interaction between the Ligand Y and the Target X;
and using the model to estimate the position of the active site, or to estimate one or more physical and/or chemical properties of the active site.
5. A process for assisting in the design of a Ligand Y' which binds to a Target X, the Ligand Y' having an increased or decreased binding affinity, selectivity or avidity for the Target X compared to that of a Ligand Y, comprising the steps of:
Step 1 Obtaining information representing one or more chemical and/or physical properties of at least two ligands of the type Y;
Step 2 Obtaining information representing one or more chemical and/or physical properties of at least two targets of the type X;
Step 3 Obtaining information representing one or more chemical and/or physical properties of the interaction between at least two of the ligands of type Y and at least two of the targets of the type X;
and correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X from which the structure and/or one or more chemical and/or physical properties of the Ligand Y' may be estimated or predicted.
6. A process for estimating or predicting the binding affinity, selectivity or avidity of a Ligand Y' with a Target X, comprising Steps 1, 2 and 3 of claim 1; and correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X from which the binding affinity, selectivity or avidity of the Ligand Y' with the Target X may be estimated or predicted.
7. A process for producing a Ligand Y' which binds to a Target X, the Ligand Y' having an increased or decreased binding affinity, selectivity or avidity for the Target X compared to that of a Ligand Y, comprising Steps 1, 2 and 3 of claim 1; and correlating the information from Steps 1, 2 and 3 in order to produce a model of the interaction between the Ligand Y and the Target X from which the structure and/or one or more properties of the Ligand Y' may be estimated or predicted; and then producing the Ligand Y' by a method known per se.
8. A process as claimed in any one of claims 1 to 4 from which the region(s) or part(s) or subsequence(s) of the Target X which interact with the Ligand Y can be estimated or predicted from the model.
9. A process as claimed in any one of claims 1 to 4 from which the region(s) or part(s) or subsequence(s) of the Ligand Y which interact with the Target X can be estimated or predicted from the model.
10. A process as claimed in any one of claims 1 to 7 which additionally comprises the step of determining experimentally part or all of the information on the chemical and/or physical properties of at least two targets of type X or part or all of the information on the chemical and/or physical properties of at least two ligands of type Y or part or all of the information on the interaction between the targets of type X and the ligands of type Y.
11. A process as claimed in any one of claims 1 to 10 which additionally comprises producing one or more targets of type X or one or more ligands of type Y.
12. A process as claimed in claim 11 which additionally comprises designing and producing one or more targets of type X or one or more ligands of type Y.
13. A process as claimed in any one of claims 1 to 12 which additionally comprises displaying or presenting part or all of the derived model or a representation thereof.
14. A process as claimed in claim 13 wherein the model is displayed or presented in the form of a table, graph or mathematical function.
15. A process as claimed in any one of claims 1 to 14 which additionally comprises the production of one or more lead compounds.
16. A process as claimed in any one of claims 1 to 14 which additionally comprises the production of one or more outliers.
17. A process as claimed in any one of claims 1 to 16 which additionally comprises the production of a further ligand of type Y with an affinity and/or selectivity for a target of type X.
18. A process as claimed in any one of claims 1 to 17 wherein the derived model is used to design a further target of type X or a further ligand of type Y.
19. A process as claimed in claim 18 wherein the further target of type X or the further ligand of type Y is subsequently produced.
20. A process as claimed in any one of claims 1 to 19 wherein the process is repeated using information on the chemical and/or physical properties of the further target of type X or the further ligand of type Y.
21. A process as claimed in claim 20 wherein the repeated method additionally makes use of information on the interactions of the further target of type X and/or ligand of type Y with one or more of the formerly-used ligands of type Y and/or targets of type X, respectively.
22. A process as claimed in any one of the previous claims, wherein the information on the properties of the targets of type X is derived, at least in part, from regions or parts or subsequences of the targets.
23. A process as claimed in any one of the previous claims, wherein the information on the properties of the ligands of type Y is derived, at least in part, from regions or parts or subsequences of the ligands.
24. A process as claimed in any one of the previous claims wherein the information in Steps 1, 2 and/or 3 comprises, at least in part, a binary descriptor or the information is represented, at least in part, in binary form.
25. A process as claimed in any one of the previous claims wherein the information comprises, at least in part, a bit vector or the information is represented, at least in part, in bit vector form.
26. A process as claimed in any one of the previous claims wherein the information is represented, at least in part, by Principal Property variables.
27. A process as claimed in claim 26 wherein Principal Component Analysis is used to generate the Principal Property variables.
28. A process as claimed in claim 26 or claim 27 wherein one or more characteristics of amino acids are used as the Principal Properties.
29. A process as claimed in claim 26 wherein the z-scale is used as Principal Properties for amino acids.
30. A process as claimed in any one of the previous claims which additionally comprises generating an unequal number of descriptors for each target of type X and then transforming said unequal numbers of descriptors into an equal number of descriptors for each target of type X.
31. A process as claimed in any one of the previous claims which additionally comprises generating an unequal number of descriptors for each ligand of type Y and then transforming said unequal numbers of descriptors into an equal number of descriptors for each ligand of type Y.
32. A process as claimed in claim 30 or claim 31 which involves the use of Auto Covariances and/or Auto Cross Covariances (ACC) and/or Auto Correlations.
33. A process as claimed in any one of the previous claims, wherein the model is derived using one or more of multivariate methods, partial least squares methods, neural networks, multiple linear regression, non-linear regression, curve fitting, model fitting, stepwise regression and maximum likelihood methods.
34. A process as claimed in any one of the previous claims, wherein experimental design is applied to the selection, design, manufacture or synthesis of the targets of type X and/or ligands of type Y.
35. A process as claimed in claim 34 wherein the experimental design is directed onto regions of the targets of type X and/or ligands of type Y.
36. A process as claimed in claim 34 or claim 35 wherein the experimental design is directed onto part(s) of the targets of type X and/or ligands of type Y.
37. A process as claimed in any one of the previous claims, which additionally comprises the use of cross-terms.
38. A process as claimed in any one of the previous claims, wherein the information on the properties of the targets of type X and/or the information on the properties of the ligands of type Y is derived from atom counts, measured or calculated thin layer liquid chromatography (TLC), retention times on HPLC, refractive index, isoelectric point, melting point, boiling point, molecular weight, hydrophobicity, hydrophilicity, chromatographic mobility, van der Waals volume, octanol/water partition coefficient (logP), energy of molecular orbital, heat of formation, polarizability, electronegativity, hardness, total accessible molecular surface area, polar accessible molecular surface area, nonpolar accessible molecular surface area, number of hydrogen bond donors, number of hydrogen bond acceptors, charge, IR-spectra, NMR-spectra or other spectra, HOMO, LUMO, connectivity indices, semi-empirical calculations, ab initio calculations or 3D quantum mechanical calculations.
39. A process as claimed in any one of the previous claims, wherein the information on the interaction of the targets of type X with ligands of type Y is derived from experiments, the experiment preferably being selected from chemical, physical, biological, molecular biological, physiological, microbiological, enzymological, pharmacological and molecular pharmacological experiments.
40. A process as claimed in any one of the previous claims, wherein the information on the chemical and/or physical properties of the targets of type X is derived from at least two different targets of type X, preferably more than 3, even more preferably more than 4, still even more preferably more than 6, still even more preferably more than 9, and most preferably more than 19 different targets, and/or wherein the information on the chemical and/or physical properties of the ligands of type Y is derived from at least one ligand, preferably more than 2, even more preferably more than 3, still even more preferably more than 4, still even more preferably more than 5, and still even more preferably more than 6, preferably at least 9, more preferably at least 11 and most preferably at least 19 different ligands.
41. A process as claimed in any one of the previous claims, wherein the information derived from the targets of type X is derived from targets whose molecular weight is larger than 1000 g/mole, preferably larger than 2000 g/mole, larger than 3000 g/mole, larger than 5000 g/mole, larger than 7000 g/mole, larger than 10000 g/mole, larger than 12000 g/mole, larger than 14000 g/mole, larger than 17000 g/mole, larger than 20000 g/mole, larger than 25000 g/mole, and most preferably larger than 30000 g/mole; and/or the molecular weight of the ligands of type Y is within the range 100 - 5000 g/mole, or the molecular weight of the ligands is below 3000 g/mole, below 2000 g/mole, below 1000 g/mole or preferably below 800 g/mole.
42. A process as claimed in any one of the previous claims, wherein the information on the properties of the targets of type X does not include information on the three-dimensional co-ordinates of the atoms of the targets of type X or information on the angles between the atoms of the targets of type X.
43. A process as claimed in any one of the previous claims, wherein the information on the properties of the ligands of type Y does not include information on the three-dimensional co-ordinates of the atoms of the ligands of type Y or information on the angles between the atoms of the ligands of type Y.
44. A process as claimed in any one of the previous claims, wherein the targets of type X are composed of building blocks and/or the targets of type X are composed of composite building blocks.
45. A process as claimed in any one of the previous claims, wherein the ligands of type Y are composed of building blocks and/or the ligands of type Y are composed of composite building blocks.
46. A process as claimed in claim 44 or claim 45, wherein the molecular weight of the building block is less than 10000 g/mole, less than 5000 g/mole, less than 3000 g/mole, less than 2000 g/mole, less than 1500 g/mole, less than 1000 g/mole, less than 600 g/mole, less than 400 g/mole, less than 300 g/mole, less than 200 g/mole or less than 100 g/mole.
47. A process as claimed in any one of claims 44 to 46, wherein the building block is an amino acid residue, a nucleotide, a deoxyadenosine 5'-phosphoric acid, a deoxyguanosine 5'-phosphoric acid, a deoxycytidine 5'-phosphoric acid, a deoxythymidine 5'-phosphoric acid, a deoxyuridine 5'-phosphoric acid, an organic residue or a sugar residue.
48. A process as claimed in any one of claims 44 to 47, wherein the composite building block is constructed from less than 11, more preferably less than 9, even more preferably less than 6, still even more preferably less than 4, and most preferably less than 3 building blocks and/or wherein a composite building block is constructed from 16 or less, 24 or less, or 33 or less of building blocks.
49. A process as claimed in any one of the previous claims, wherein the information on the physical/chemical properties of the target X is derived from one or more building blocks and/or composite building blocks within the target X.
50. A process as claimed in any one of the previous claims, wherein the information on the physical/chemical properties of the ligand Y is derived from one or more building blocks and/or composite building blocks within the ligand Y.
51. A process as claimed in any one of the previous claims, wherein the target X has a polymeric structure and/or wherein the ligand Y has a polymeric structure.
52. A method as claimed in any one of the previous claims, wherein the target X has a chimeric structure and/or wherein the ligand Y has a chimeric structure, preferably wherein target X is a chimeric protein/peptide or chimeric DNA molecule and/or ligand Y is a chimeric protein/peptide or a chimeric DNA molecule.
53. A process as claimed in any one of the previous claims, wherein the target X is one or more of synthetic or natural polymeric structures, synthetic or natural cyclic polymeric structures, synthetic or natural branched polymeric structures, peptides, polypeptides, proteins, DNA, RNA, enzymes, ion-channels, receptors, G-protein coupled receptors, tyrosine kinase receptors, serine/threonine kinase receptors, steroid hormone receptors, thyroid hormone receptors, membrane transporters, structural proteins, antibodies or carbohydrates.
54. A process as claimed in any one of the previous claims, wherein the ligand Y is selected from one or more of synthetic or natural polymeric structures, synthetic or natural cyclic polymeric structures, synthetic or natural branched polymeric structures, peptides, polypeptides, proteins, DNA, RNA, organic compounds, organic libraries, enzymes, ion-channels, receptors, G-protein coupled receptors, tyrosine kinase receptors, serine/threonine kinase receptors, steroid hormone receptors, thyroid hormone receptors, membrane transporters, structural proteins, antibodies or carbohydrates.
55. A process as claimed in any one of the previous claims, wherein the information is derived from a target of type X and/or a ligand of type Y when it is situated in a viral particle, a cell and/or a multicellular organism.
56. A process as claimed in any one of the previous claims, wherein the information on the physical/chemical properties of the targets of type X and/or the ligands of type Y is derived from one or more building blocks and/or composite building blocks within the macromolecules of type X and/or the molecules of type Y and the information is derived from atom counts, measured or calculated thin layer liquid chromatography (TLC), retention times on HPLC, refractive index, isoelectric point, melting point, boiling point, molecular weight, hydrophobicity, hydrophilicity, chromatographic mobility, van der Waals volume, octanol/water partition coefficient (logP), energy of molecular orbital, heat of formation, polarizability, electronegativity, hardness, total accessible molecular surface area, polar accessible molecular surface area, nonpolar accessible molecular surface area, number of hydrogen bond donors, number of hydrogen bond acceptors, charge, IR-spectra, NMR-spectra or other spectra, HOMO, LUMO, connectivity indices, semi-empirical calculations, ab initio calculations or 3D-quantum mechanical calculations.
57. A process as claimed in any one of the previous claims, wherein the information is derived from the three dimensional structure of one or more of the building blocks and/or the angles between one or more of the atoms in one or more of the building blocks.
58. A process as claimed in any one of the previous claims, wherein the use of angles between atoms in different building blocks is excluded and/or wherein the use of distances between atoms in different building blocks is excluded.
59. A process as claimed in any one of the previous claims, wherein the use of the coordinates of the Cα atoms (in three-dimensional space) of a peptide or a protein are excluded, and/or wherein the use of psi and phi angles in a peptide or a protein are excluded.
60. A process as claimed in any one of the previous claims, wherein the use of a pharmacophore model is excluded.
61. A process as claimed in any one of the previous claims, wherein the information is derived from chimeric variations of the targets of type X and/or chimeric variations of the ligands of type Y.
62. A process as claimed in any one of the previous claims, for use in identifying outliers of type X or outliers of type Y.
63. A process as claimed in any one of the previous claims, for use in drug design.
64. A process as claimed in any one of the previous claims, for use in the design or identification of lead compounds.
65. A process as claimed in any one of the previous claims, for use in the design of ligands of type Y with improved affinity and/or selectivity for targets of type X.
66. A process as claimed in any one of the previous claims, for use in protein engineering.
67. A process as claimed in any one of the previous claims, for the design of DNA or RNA molecules.
68. A process as claimed in any one of the previous claims, for the design of artificial targets of type X and/or artificial ligands of type Y.
69. A process as claimed in any one of the previous claims, for analysis and/or in the engineering of regions and/or parts of targets of type X and/or ligands of type Y.
70. A process as claimed in any one of the previous claims, for the design of an organic compound, catalyst, pharmaceutical, drug, macromolecule being capable of binding a molecule, peptide, peptidomimetic, protein, enzyme, antibody, molecule, macromolecule, DNA, RNA or a carbohydrate.
71. A process as claimed in any one of the previous claims, for the design of a ligand of type Y being capable of binding a target of type X.
72. A process as claimed in any one of the previous claims, for the design of any one of organic compound, catalyst, pharmaceutical, drug, macromolecule being capable of binding a molecule, peptide, peptidomimetic, protein, enzyme, antibody, molecule and a macromolecule.
73. A lead, organic compound, catalyst, pharmaceutical, drug, macromolecule being capable of binding a molecule, peptide, peptidomimetic, protein, enzyme, antibody, molecule, macromolecule, DNA, RNA, carbohydrate when designed by a process comprising a process as claimed in any one of claims
74. A process as claimed in any one of claims comprising the use of an organic library.
75. A process as claimed in any one of claims operated on or performed with the aid of a digital computer.
76. Computer software specifically adapted to carry out a process as claimed in any one of the previous claims when installed on data processing means.
77. A computer program element comprising computer software code portions for performing a process as claimed in any one of claims 1 to 75 when the program element is run on data processing means.
78. A computer software carrier comprising software as claimed in claim 76.
79. A ligand whose structure and/or properties has been estimated or predicted through the use of a process as claimed in any of the claims 1 to 75.
80. Use of a process as claimed in any one of claims 1 to 75 for designing new ligands for known targets and/or for new targets.
81. A process as claimed in any one of claims 1 to 75 wherein the Target X is a 7TM receptor, preferably a melanocortin receptor.
82. A process as claimed in any one of claims 1 to 75 wherein the Ligand Y is any one of the peptides disclosed in any one of Figures 25 to 30.
83. A process as claimed in any one of claims 1 to 75 wherein the ligands of the type Y comprise the set of peptides disclosed in any one of Figures 25 to 30.
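As an illustration of how the data handling recited in claims 1, 30 to 32 and 37 might be organised, the sketch below turns per-position descriptors of targets of unequal length into equal-length Auto Cross Covariance (ACC) vectors, then builds one row per ligand-target pair from the ligand block, the target block and their cross-terms. It is a sketch under stated assumptions rather than the claimed implementation: all function and variable names are hypothetical, every number is an invented placeholder, and dividing each lagged sum by (n - lag) is one common normalisation that the claims do not specify. The model-fitting step (for example partial least squares, as listed in claim 33) is left out; the measured interaction values of Step 3 would form the response for these rows.

```python
def auto_cross_covariances(z, max_lag):
    """Map n positions x d scales to a fixed-length vector of d*d*max_lag terms.

    For each lag and each pair of scales (j, k), the products
    z[i][j] * z[i + lag][k] are averaged over the n - lag available positions.
    """
    n = len(z)
    d = len(z[0])
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must be at least 1 and shorter than the sequence")
    terms = []
    for lag in range(1, max_lag + 1):
        for j in range(d):
            for k in range(d):
                total = sum(z[i][j] * z[i + lag][k] for i in range(n - lag))
                terms.append(total / (n - lag))
    return terms


def cross_terms(ligand, target):
    """Every product of one ligand descriptor with one target descriptor."""
    return [l * t for l in ligand for t in target]


def design_row(ligand, target):
    """Ligand block, target block, then the ligand-target cross-term block."""
    return list(ligand) + list(target) + cross_terms(ligand, target)


# Placeholder descriptors, invented purely for illustration.
ligands = {"Y1": [0.5, -1.0, 0.2], "Y2": [-0.3, 0.8, 1.1]}
target_positions = {
    "X1": [(0.1, -0.4), (1.2, 0.3), (-0.7, 0.9), (0.0, -1.1)],
    "X2": [(0.6, 0.2), (-0.5, -0.8), (1.0, 0.4),
           (0.3, -0.2), (-1.2, 0.7), (0.9, 0.1)],
}

# Targets of 4 and 6 positions both become 2 lags * 2 * 2 = 8 descriptors.
targets = {name: auto_cross_covariances(z, max_lag=2)
           for name, z in target_positions.items()}

rows = {(y, x): design_row(ligands[y], targets[x])
        for y in ligands for x in targets}

print(len(targets["X1"]), len(targets["X2"]), len(rows), len(rows[("Y1", "X1")]))
# 8 8 4 35
```

Each row has 3 ligand descriptors, 8 target descriptors and 3 * 8 = 24 cross-terms. Keeping the cross-terms as a separate block makes it possible to inspect which ligand-target descriptor products carry weight in a fitted model.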
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2006200494A AU2006200494A1 (en) | 1999-11-18 | 2006-02-06 | A process for identifying the active site in a biological target |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB9927346 | 1999-11-18 | ||
GBGB9927346.8A GB9927346D0 (en) | 1999-11-18 | 1999-11-18 | Method for analysis and design of entities of a chemical or biochemical nature |
PCT/GB2000/004420 WO2001036980A2 (en) | 1999-11-18 | 2000-11-20 | A process for identifying the active site in a biological target |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
AU2006200494A Division AU2006200494A1 (en) | 1999-11-18 | 2006-02-06 | A process for identifying the active site in a biological target |
Publications (1)
Publication Number | Publication Date |
---|---|
AU1530501A true AU1530501A (en) | 2001-05-30 |
Family
ID=10864777
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU15305/01A Abandoned AU1530501A (en) | 1999-11-18 | 2000-11-20 | A process for identifying the active site in a biological target |
Country Status (8)
Country | Link |
---|---|
EP (1) | EP1232466A2 (en) |
AU (1) | AU1530501A (en) |
CA (1) | CA2392086A1 (en) |
GB (1) | GB9927346D0 (en) |
HK (1) | HK1049218A1 (en) |
NZ (1) | NZ518980A (en) |
WO (1) | WO2001036980A2 (en) |
ZA (1) | ZA200203963B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2321026A1 (en) | 1998-03-09 | 1999-09-16 | Zealand Pharmaceuticals A/S | Pharmacologically active peptide conjugates having a reduced tendency towards enzymatic hydrolysis |
SE9801571D0 (en) | 1998-05-05 | 1998-05-05 | Wapharm Ab | Melanocortin-1 receptor selective compounds |
AU2002231706A1 (en) * | 2000-11-28 | 2002-06-11 | Clairbio Capital Management | Apparatus and method for determining affinity data between a target and a ligand |
EP1209610A1 (en) * | 2000-11-28 | 2002-05-29 | Valentin Capital Management | Method and apparatus for determining affinity data between a ligand and a target |
US20030158671A1 (en) * | 2001-07-18 | 2003-08-21 | Structural Genomix, Inc. | Systems and methods for predicting active site residues in a protein |
WO2003040994A2 (en) * | 2001-11-02 | 2003-05-15 | Arqule, Inc. | Cyp2c9 binding models |
DE10241793A1 (en) * | 2002-09-06 | 2004-06-17 | Roos, Gudrun, Dr. | Analysis apparatus for predicting the pharmaceutical activity of plant extracts comprises a nuclear magnetic resonance spectroscope producing a spectrum compared with a database of spectra of known active materials |
NZ566489A (en) | 2005-08-26 | 2008-10-31 | Action Pharma As | Therapeutically active alpha-MSH analogues |
US20130280238A1 (en) * | 2012-04-24 | 2013-10-24 | Laboratory Corporation Of America Holdings | Methods and Systems for Identification of a Protein Binding Site |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5587293A (en) * | 1994-01-06 | 1996-12-24 | Terrapin Technologies, Inc. | Method to identify binding partners |
EP1002235A1 (en) * | 1997-08-01 | 2000-05-24 | Novalon Pharmaceutical Corporation | Method of identifying and developing drug leads |
AU1925699A (en) * | 1997-12-18 | 1999-07-05 | Sepracor, Inc. | Methods for the simultaneous identification of novel biological targets and leadstructures for drug development |
1999
- 1999-11-18: GB, application GBGB9927346.8A, published as GB9927346D0 (en), status: not active (Ceased)
2000
- 2000-11-20: CA, application CA002392086A, published as CA2392086A1 (en), status: not active (Abandoned)
- 2000-11-20: NZ, application NZ518980A, published as NZ518980A (en), status: unknown
- 2000-11-20: AU, application AU15305/01A, published as AU1530501A (en), status: not active (Abandoned)
- 2000-11-20: WO, application PCT/GB2000/004420, published as WO2001036980A2 (en), status: active (IP Right Grant)
- 2000-11-20: EP, application EP00977666A, published as EP1232466A2 (en), status: not active (Withdrawn)
2002
- 2002-05-17: ZA, application ZA200203963A, published as ZA200203963B (en), status: unknown
2003
- 2003-02-20: HK, application HK03101309.1A, published as HK1049218A1 (en), status: unknown
Also Published As
Publication number | Publication date |
---|---|
NZ518980A (en) | 2005-05-27 |
HK1049218A1 (en) | 2003-05-02 |
WO2001036980A3 (en) | 2002-03-14 |
ZA200203963B (en) | 2003-05-19 |
EP1232466A2 (en) | 2002-08-21 |
GB9927346D0 (en) | 2000-01-12 |
WO2001036980A2 (en) | 2001-05-25 |
CA2392086A1 (en) | 2001-05-25 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Ribeiro et al. | A chemical perspective on allostery | |
Campbell et al. | Ligand binding: functional site location, similarity and docking | |
Moller et al. | Prediction of the coupling specificity of G protein coupled receptors to their G proteins | |
Jespers et al. | QresFEP: an automated protocol for free energy calculations of protein mutations in Q | |
Tian et al. | Fast and reliable prediction of domain–peptide binding affinity using coarse-grained structure models | |
Pfleger et al. | Ensemble-and rigidity theory-based perturbation approach to analyze dynamic allostery | |
Cole et al. | Interrogation of the protein-protein interactions between human BRCA2 BRC repeats and RAD51 reveals atomistic determinants of affinity | |
Bogetti et al. | A twist in the road less traveled: The AMBER ff15ipq-m force field for protein mimetics | |
AU1530501A (en) | A process for identifying the active site in a biological target | |
Wikberg et al. | Proteochemometrics: a tool for modeling the molecular interaction space | |
Neuwald | Gleaning structural and functional information from correlations in protein multiple sequence alignments | |
Zhou et al. | Quantitative sequence-activity model (QSAM): applying QSAR strategy to model and predict bioactivity and function of peptides, proteins and nucleic acids | |
US6622094B2 (en) | Method for determining relative energies of two or more different molecules | |
Alston et al. | The analytical Flory random coil is a simple-to-use reference model for unfolded and disordered proteins | |
Gallardo et al. | Protein–nucleic acid interactions for RNA polymerase II elongation factors by molecular dynamics simulations | |
Poudel et al. | Activation-Induced Reorganization of Energy Transport Networks in the β2 Adrenergic Receptor | |
Treyde et al. | Bond dissociation energies of X–H bonds in proteins | |
Mai et al. | Exploring PROTAC cooperativity with coarse-grained alchemical methods | |
Prusis et al. | Prediction of indirect interactions in proteins | |
Ben-Shimon et al. | Protonation States in molecular dynamics simulations of peptide folding and binding | |
Trnka et al. | Role of integrative structural biology in understanding transcriptional initiation | |
Chipot | Free energy calculations in biological systems. How useful are they in practice? | |
Yang et al. | Importance of interface and surface areas in protein-protein binding affinity prediction: A machine learning analysis based on linear regression and artificial neural network | |
Punia et al. | Computation of the protein conformational transition pathway on ligand binding by linear response-driven molecular dynamics | |
Ma et al. | Prediction of protein–protein binding affinity using diverse protein–protein interface features |