EP1203330A2 - Analysis of molecular and protein diversity - Google Patents

Analysis of molecular and protein diversity

Info

Publication number
EP1203330A2
EP1203330A2 EP00925889A EP00925889A EP1203330A2 EP 1203330 A2 EP1203330 A2 EP 1203330A2 EP 00925889 A EP00925889 A EP 00925889A EP 00925889 A EP00925889 A EP 00925889A EP 1203330 A2 EP1203330 A2 EP 1203330A2
Authority
EP
European Patent Office
Prior art keywords
molecules
ofthe
theoretical
diversity
target surfaces
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP00925889A
Other languages
German (de)
English (en)
Inventor
Edward A. Wintner
Ciamac C. Moallemi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Neogenesis Inc
Original Assignee
Neogenesis Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neogenesis Inc

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/50 Molecular design, e.g. of drugs
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B15/00 ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/40 Searching chemical structures or physicochemical data

Definitions

  • This invention relates to analyzing molecule and protein diversity.
  • Combinatorial chemistry allows the creation of unprecedented numbers of organic compounds.
  • the rational synthesis of millions of small organic molecules is now achievable in a matter of days.
  • By "small" we mean a molecule of fewer than 1500 Daltons, where a Dalton is defined as 1/12 of the weight of a carbon 12 atom, or roughly the weight of a hydrogen atom.
  • the utility of small molecules as drugs depends in part on molecular complementarity: how well the molecules fit and/or stick to chemically active sites (often in the form of depressions in a protein) on the surface of a cell, on the surface of an intracellular organelle, or on a cytosolic protein.
  • the potential molecular complementarity of a small molecule is in large part determined by two factors: 1) The shape of the molecule, meaning the total Van Der Waals (VDW) surface of a given conformation of the molecule and how it follows or does not follow the VDW surface of the target site of interest.
  • VDW Van Der Waals
  • the shape complementarity of molecule and target is largely responsible for energetic forces such as the displacement of water (the "hydrophobic effect") and so-called "Van Der Waals" or "London dispersion" forces.
  • 2) The types of potential energetic interactions (such as hydrogen bonding, percentage ionic bonding, proximity of polarizable moieties) found at various places on the molecule, and on the manner, order, and spatial orientation in which the energetically interactive portions of the molecule are connected to each other and presented in the presence of the target of interest.
  • each of these factors is measured or calculated in only a relative way with respect to a specific set of molecules or with respect to specific protein surfaces being examined.
  • a general comparison of potential molecular complementarity between two sets of molecules requires doing calculations or experiments using an absolute or fixed frame of reference. For example, if company 1 has tested a molecule set A for affinity to a set of target surfaces X found in cancerous cells, and company 2 has tested a molecule set B against a set of target surfaces Y found in nervous tissue, the value of molecule set B with respect to the target surfaces X of company 1 is not apparent because each of the affinity evaluations has been performed against a different standard. Also, without a full molecular calculation of set A against target surfaces Y, it is not apparent whether potential chemically active portions of the target surface Y could be bonded by molecules in set A. Finally, even if all calculations of sets A and B vs.
  • Implementations of the invention provide an absolute or fixed frame of reference for comparing sets of molecules and protein surfaces.
  • By defining molecules in terms of their complementarity or attraction to a fully enumerated basis set of theoretical protein surfaces within given parameters, it is possible to measure the diversity of a set of molecules against a non-relative standard, i.e., an absolute measurement.
  • This allows for an efficient comparison of different sets of molecules, for example, sets of drugs, and enables meaningful categorization of classes of molecules against a standard set of surfaces.
  • this allows for the detection of theoretical protein surfaces to which no molecules in a set are complementary, thus enhancing the ability of a chemist to design novel molecules that supplement the deficiencies of the original set.
  • By defining real world protein surfaces in terms of their similarity to a basis set of theoretical protein surfaces, it is similarly possible to categorize real world sets of protein surfaces against a standard set of theoretical surfaces. This allows for improved classification of proteins into similar target classes by the similarity of their surface sites.
  • the invention features a computer-based method in which a set of constraints on possible target surfaces is defined, and a fully enumerated set of theoretical target surfaces under the defined constraints is also defined, such that each surface has a defined, continuous volume and a defined, continuous surface area.
  • One or more sets of objects are mapped to the fully enumerated set of theoretical target surfaces to define corresponding subsets of the fully enumerated set of theoretical target surfaces.
  • An aspect of diversity of the objects is analyzed based on degrees of similarities and differences among the corresponding subsets.
  • Implementations of the invention may include one or more of the following features.
  • the target surfaces may include negative space target surfaces.
  • the objects may include positive space object surfaces associated with different molecules.
  • the objects may be mapped by defining corresponding subsets of the fully enumerated set of negative space theoretical target surfaces to which positive space object surfaces of conformations of molecules are complementary.
  • the aspect of diversity that is analyzed may be the difference or similarity between the molecules which map to those negative space theoretical target surfaces.
  • the objects may include negative space object surfaces associated with different proteins, and the objects may be mapped by defining corresponding subsets of the fully enumerated set of negative space theoretical target surfaces to which negative space object surfaces of protein pockets are similar.
  • the aspect of diversity that is analyzed may be the difference or similarity between protein pockets which map to those negative space theoretical target surfaces.
  • the objects may include positive space object surfaces associated with different molecules and negative space object surfaces associated with different proteins. In the case of molecules, the objects may be mapped by defining corresponding subsets of the fully enumerated set of negative space theoretical target surfaces to which positive space object surfaces of conformations of molecules are complementary.
  • the objects may be mapped by defining corresponding subsets of the fully enumerated set of negative space theoretical target surfaces to which negative space object surfaces of protein pockets are similar.
  • the aspect of diversity that is analyzed may be the difference or similarity of the molecules which map to those negative space theoretical target surfaces to the protein pockets which map to those negative space theoretical target surfaces.
  • the theoretical target surfaces and the objects may be polyhedrons, e.g., cubes, all of the same size and shape.
  • the set of all theoretical target surfaces defines a diversity space within which the diversity of objects can be measured by mapping those objects to the diversity space. Regions of the diversity space to which no objects map may be identified, and molecules may be designed that occupy at least one of the unfilled theoretical target surfaces of the diversity space. Complementarity may be associated with binding affinities of positive space object surfaces of conformations of molecules to negative space theoretical target surfaces.
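The idea of locating unfilled regions of the diversity space can be illustrated with a minimal Python sketch. It assumes each theoretical target surface has a hashable identifier and that each molecule has already been mapped to the set of surfaces it is complementary to; the function name and the toy identifiers are hypothetical and do not come from the patent.

```python
def uncovered_surfaces(all_surfaces, mappings):
    """Return the theoretical surfaces to which no object maps.

    all_surfaces: iterable of surface identifiers (the enumerated basis set).
    mappings: dict of object name -> set of surface identifiers it maps to.
    """
    covered = set().union(*mappings.values())
    return set(all_surfaces) - covered


# Toy diversity space of four surfaces and two mapped molecules (illustrative only).
basis = {"s1", "s2", "s3", "s4"}
mapped = {"molA": {"s1", "s2"}, "molB": {"s2", "s3"}}
print(uncovered_surfaces(basis, mapped))
```

In this toy case the only unfilled surface is `s4`, which is the kind of gap a chemist could target when designing new molecules.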
  • the constraints may include volume, associations of each of a number of sites of the target surface with a preselected molecular property drawn from a larger set of possible molecular properties, including hydrophobic, polarizable, H-bond acceptor, H-bond donor, H-bond donor/acceptor, potentially positively charged, and potentially negatively charged. Fewer than all of the sites of the target surface may each be associated with a different one of the molecular properties and all of the other sites of the target surface may be associated with a common molecular property, such as slightly hydrophobic.
  • the degrees of similarities or differences may involve functional properties associated with the corresponding subsets of the fully enumerated set of theoretical target surfaces or shape properties associated with the corresponding subsets of the fully enumerated set of theoretical target surfaces.
  • Each of the objects may be defined by quantizing molecules into polyhedrons. Each of a fixed set of orientations of each conformation of each of the objects may be fitted to each of the target surfaces, and each of the fittings may be scored.
  • the constraints may include a resolution of the polyhedrons, e.g., 4.24 Angstroms, or maximum and minimum numbers of polyhedrons. Each of the polyhedrons may share a common interface with another of the polyhedrons. The constraints may also include the absence of any occluded volumes greater than a given user-defined parameter.
  • the target surfaces may be defined conceptually as having been carved out of a flat surface.
  • the invention features categorizing existing molecules based on negative space target surfaces to which conformations of the molecules are complementary, and designing novel molecules that are complementary to negative space target surfaces to which no conformations of the existing molecules are complementary.
  • the invention features a method of creating novel molecules to be tested as ligands for proteins.
  • proteins are categorized based on target surfaces to which their pockets of known structure map, and novel molecules are designed that are complementary to the negative space target surfaces to which the protein pockets map.
  • the invention features a computer programmed to determine the chemical similarity of different molecules.
  • the program approximates the surface shape of each one of a plurality of molecules of interest by linking a series of cubes, each cube having a dimension R, the locations of the cubes being determined by the calculated electron probability density of the individual one of the molecules of interest, each cube sharing at least one of its six faces with another cube, such that there is a specific number of linked cubes which varies for each individual one of the plurality of molecules of interest.
  • the chemical reactivity of each individual one of the plurality of molecules of interest is approximated by assigning each cube of each individual one of the plurality of molecules of interest, no more than one functionality value from a plurality of M different chemical functionality values.
  • V is approximated by subtracting a number V/R³ of cubes of dimension R from a surface, wherein each of the cube spaces shares at least one face with another cube space and wherein N of the cube spaces have one of a plurality of M different chemical functionality values.
  • An attraction value K is calculated for each one of the plurality of molecules of interest to the chemically active surface.
  • a list of overall attraction values to the chemically active surface is calculated.
  • Implementations of the invention may include one or more of the following features.
  • the calculation of the attraction value K may be performed on a plurality of different predetermined chemically active surfaces, and a matrix of overall attractive values of each molecule of interest to each of the different surfaces may be calculated.
  • the molecules of interest may include organic molecules.
  • the chemically active surface having a plurality of predetermined active chemical locations may be calculated to correspond to the shape of an actual protein surface structure.
  • the molecules of interest may be organic molecules of 1500 Daltons or less.
  • the chemically active surface having a plurality of predetermined active chemical locations may be compared to an actual protein surface to calculate a similarity value of the actual protein surface to the predetermined active chemical locations.
  • the predetermined chemically active surfaces may be compared to a plurality of actual protein surfaces and a matrix of similarity values may be calculated.
  • the cube spaces subtracted from the surface may be calculated to approximate the electron probability density of at least one of a plurality of depressions in known protein surface structures.
  • the N sites of chemical functionality may be calculated to approximate the location and type of chemical functionality of actual depressions in known protein structures.
  • Figure 1 shows a (CH2)n chain encapsulated by 4.24 Å cubic units.
  • Figure 2 shows examples of surfaces allowed and disallowed by the non-occlusion parameter in a theoretical target surface generation algorithm. Gray shading represents the opening of the theoretical surface. A is allowed. B is disallowed due to two occluded negative space cubes (marked X).
  • Figure 3 shows a theoretical target surface of 13 negative space cubes and four sites of specific molecular property interaction: hydrophobic (white), polarizable (purple), H-bond accepting (green), and H-bond donating (orange). Blue shading indicates the opening of the theoretical surface.
  • Figure 4 shows a "quantized" representation (Q-file) of one conformation of molecule 6a superimposed on its atomic structure (ball and stick and spacefilling model). Molecular property characteristics of the Q-file are hydrophobic (white quanta), polarizable (purple quanta), H-bond accepting (green quanta), and negatively charged (red quanta).
  • Figure 5 shows test molecules.
  • Figure 6 shows a ranking of molecules by QSCD similarity scores.
  • Figure 7 illustrates examination of a theoretical target surface common to molecules 8c (top) and 8a (bottom). Blue shading indicates opening of the theoretical surface. Specific points of complementarity on the theoretical target surface are hydrophobic (white), polarizable (purple), and H-bond donating (orange). Superimposition of the original molecular conformations onto the theoretical target surface demonstrates that the extra phenyl substituent of 8c protrudes from the opening of the theoretical surface and is not involved in complementarity to the surface.
  • Figure 8 illustrates examination of a theoretical target surface common to molecules 1a (A) and 5a (B). Blue shading indicates opening of the theoretical surface.
  • Figure 9 illustrates ranking of molecules in Figure 5 by Tanimoto similarity score of 2D UNITY fingerprints.
  • Figure 10 shows a QSCD plot of all of the theoretical surface shapes covered by all of the conformations of all of the molecules (blue dots) in Figure 5. The total volume of the cube encompasses all 49,268,918 theoretical surface shapes as listed in Table 1. Red dots show two exemplary theoretical surface shapes (a, b) not covered by any of the molecules in Fig. 5. Axes used are functions of opening area, opening length/width, and depth per opening quantum.
  • Figure 11 shows a map of the 20 compounds in Fig. 5 (blue dots) in a representative BCUT three-axis diversity space.
  • BCUT axes used are, respectively: 1) BCUT HACCEPT S LNVDIST 050 R H, 2) BCUT HDONOR S LNVDIST 030 R H, and 3) BCUT TAB POLAR S LNVDIST 300 R L. Red dot shows an unfilled coordinate of diversity space, at (7.54, 7.25, 6.82). The information contained in this BCUT coordinate does not reveal information about the shape of a molecule which might be able to fill this position in diversity space.
  • Figure 12 shows use of QSCD to design complementary combinatorial libraries to unmatched theoretical target surfaces. Many conceivable libraries of a given shape and functionality may be designed to fill a given unmet diversity need.
  • Figure 13 shows two sample surfaces.
  • Figure 14 illustrates the quantization process.
  • Figure 15 shows a legend for symbols used in functionality rule diagrams.
  • Figure 16 shows functionality rules for Potential Negative Charge Functionality.
  • Figure 17 shows functionality rules for Potential Positive Charge Functionality. The structures are searched for in order.
  • Figure 18 shows a functionality rule for Hydrogen Bond Donor/Acceptor Functionality.
  • Figure 19 shows a functionality rule for Hydrogen Bond Donor Functionality.
  • Figure 20 shows a functionality rule for Hydrogen Bond Acceptor Functionality. The structures are searched for in order.
  • Figure 21 shows a functionality rule for Polarizable Functionality.
  • Figure 25 shows an example Core Molecule Mi to fill Central set Ci.
  • Figure 28 shows an Example Core Molecule Mi to fill Central set Ci.
  • Figure 30 shows a protein quantization process.
  • molecular diversity can be defined as the measure, based on biological criteria, of the difference or similarity between small molecules.
  • Each of the existing methods of calculating biologically relevant diversity of small molecules defines slightly different criteria for molecular comparison, and thus a different configuration of diversity space as a whole. Examples include low dimensional diversity space such as BCUT metrics, high dimensional diversity space such as Chem-X/ChemDiverse multiple point pharmacophores, and empirical biological diversity space such as affinity fingerprinting.
  • Another example of current diversity methods is affinity fingerprinting, in which molecules are empirically assayed against a panel of 10-20 actual proteins selected to be promiscuous in their ability to bind small molecules. Position in molecular diversity space is assigned through the resulting string of IC50 binding values, and these affinity fingerprints provide unprecedented ability to group similarly active compounds in diversity space. However, because the actual mode of binding in any assay is not incorporated in the resulting IC50 value, the mapping of molecules to the selected protein panel is an irreversible transformation. Thus, an empty coordinate in affinity fingerprinting diversity space (an "unmatched" string of IC50s to a given protein panel) cannot be back-translated into a 3D molecular template.
  • a similar affinity fingerprinting diversity method has been put into practice using a panel of computational surfaces of real-world protein pockets and a modified form of the DOCK program. While this method shows similar promise in its ability to detect pharmacological similarity, it is, like its empirical affinity fingerprinting counterpart, an irreversible mapping. Thus, for the most part, current methods are able to successfully identify compounds of the same pharmacological class as being similar and compounds of different pharmacological classes as being different. Given a starting pharmacophore from known ligands and/or the target site of a target crystal structure, such methods interface well with the design of complementary combinatorial libraries.
  • QSCD which defines a molecule numerically by a mapping that describes its complementarity K to every distinct theoretical protein surface of resolution R not exceeding volume V with N sites of M types of chemical functionality P MN.
  • K is defined as an algorithm that takes into account the molecular shape and chemical functionality of both the given molecule and the given theoretical protein surface. From this definition, it follows that a comparison of two molecules will yield a numerical difference that is representative of their complementarities: to the extent that two molecules each have complementarity for the same theoretical protein surfaces, the molecules are similar; to the extent that two molecules have no complementarity to common theoretical protein surfaces, the molecules are dissimilar.
  • each theoretical surface to be formed by successively carving cubic units out of an initially flat surface. These cubic units represent "negative space" that a potential ligand could occupy. Given cubic units with sides of length R (the resolution of the model), we use at most V/R³ negative space cubes to describe each theoretical target surface. Others have previously employed cubic units to successfully approximate complementarity between small molecules and individual protein surfaces. The size of a negative space cube is directly related to the resolution and type of diversity data which the user desires as output.
  • In choosing the size of the negative space cube, one motivation is to maximize negative space cube size such that the difference of a single cube in a surface is highly differentiating in terms of molecular recognition (i.e., every surface is orthogonal to every other surface). At the same time, enough information must be retained in each negative space cube to predict shape and functional complementarity at a ligand/surface interface. The former constraint minimizes overlap of diversity information while the latter constraint maximizes precision of diversity information. Together, the competing constraints result in a basis unit for the enumeration of theoretical target surfaces that minimizes the number of negative space cubes needed to accurately model diversity for a given volume V.
  • 4.24 Å is the approximate VDW "cross-section" of a (CH2)n chain; a series of 4.24 Å units neatly encapsulates a (CH2)n chain in its ground state conformation as shown in Figure 1.
  • the basis set for diversity can be a set of theoretical target surfaces comprised of all possible shape combinations of 6 to 14 negative space cubes of resolution 4.24 Å (negative volume between 460 and 1070 cubic Å) subject to the following rules: Surfaces are created by successively "carving out" negative space cubes from a flat block of infinite width and depth (the theoretical target). All negative space cubes of a given surface must share at least one face with another negative space cube of the surface, and all must be part of a single, contiguous negative surface. No negative space cubes may be occluded in the +Z axis of the infinite surface block; that is, there may be no solid surface between any negative space cube and the surface plane of the infinite block.
  • the surface A is allowed, but the surface B is disallowed. Surfaces duplicating a previous surface with respect to rotation in the X-Y plane are discarded.
  • the occlusion rule provides a compromise between complete coverage of topological possibilities and acceptable computational speed. This compromise was made based on the topological assumption that occlusions of 4.24 Å or more are infrequent in small molecule/target interactions, and that their omission would thus have only a small effect on predicting diversity of binding affinities of small molecules. Applying the rules yields 49,268,918 unique negative surface shapes including chiral opposites. Covering a negative volume between 460 and 1070 cubic Å, these surface shapes are deemed sufficient to examine diversity of most small molecules. For instance, examining a previously published reference set of pharmaceutically relevant compounds (a filtered Comprehensive Medicinal Chemistry or CMC database), 5049 out of 5120 compounds (98.6%) have a volume of 1070 cubic Å or less.
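The surface rules above (6 to 14 cubes, face-sharing contiguity, no occlusion along +Z) can be sketched as a validity check in Python. This is an illustrative sketch and not the patent's generation algorithm: it assumes each negative space cube is an integer triple (x, y, z) with z = 0 as the layer directly beneath the surface plane and larger z meaning deeper, and all function names are hypothetical. It also reproduces the quoted volume range from the 4.24 Å resolution.

```python
from collections import deque

RESOLUTION = 4.24  # cube edge length in angstroms, as given in the text

FACE_NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def is_contiguous(cubes):
    """True if every cube is reachable from every other through shared faces."""
    cubes = set(cubes)
    if not cubes:
        return False
    start = next(iter(cubes))
    seen = {start}
    queue = deque([start])
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in FACE_NEIGHBORS:
            neighbor = (x + dx, y + dy, z + dz)
            if neighbor in cubes and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == cubes


def is_unoccluded(cubes):
    """True if no solid lies between any cube and the surface plane (assumed at z = 0)."""
    cubes = set(cubes)
    return all(
        z >= 0 and all((x, y, k) in cubes for k in range(z))
        for x, y, z in cubes
    )


def is_valid_surface(cubes, min_cubes=6, max_cubes=14):
    """Apply the size, contiguity, and non-occlusion rules to one candidate shape."""
    cubes = set(cubes)
    return (
        min_cubes <= len(cubes) <= max_cubes
        and is_contiguous(cubes)
        and is_unoccluded(cubes)
    )


def negative_volume(n_cubes):
    """Negative volume in cubic angstroms for a surface of n_cubes cubes."""
    return n_cubes * RESOLUTION ** 3


trench = [(x, 0, 0) for x in range(6)]                 # shallow open groove
overhang = [(0, 0, 0)] + [(x, 0, 1) for x in range(5)]  # tunnel under solid material
print(is_valid_surface(trench), is_valid_surface(overhang))
print(negative_volume(6), negative_volume(14))
```

The two volumes come out near 457 and 1067 cubic Å, consistent with the rounded 460 to 1070 range in the text. Rotational de-duplication in the X-Y plane is not handled in this sketch.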
  • each negative space cube is assigned a molecular property characteristic Pm that represents the dominant molecular environment which any atoms that are placed within that negative space will experience. Properties used are P1 hydrophobic, P2 polarizable (includes aromatics), P3 H-bond acceptor, P4 H-bond donor, P5 H-bond donor/acceptor, P6 potentially positively charged (basic), and P7 potentially negatively charged (acidic). These seven types of molecular environments are assumed to represent a minimal basis set of factors that contributes to the electrostatic/VDW complementarity of a ligand and a target surface.
  • Table 1. Numerical breakdown of the total number of theoretical target surfaces created using the algorithm given in the text. Surfaces consist of 6-14 negative space cubes and 4 sites of 7 possible molecular property characteristics. Number of functionally different surfaces per surface shape varies for infrequent cases in which a given shape has an axis of symmetry, so actual number of unique surfaces is slightly less than (# surface shapes) * 7^4 * N!/((N-4)! * 4!).
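The per-shape factor in the Table 1 bound, 7^4 * N!/((N-4)! * 4!), can be evaluated directly. The short Python sketch below computes that upper bound for a single surface shape of N cubes; the function name is hypothetical, and the per-N shape counts from Table 1 are not reproduced here, so no grand total is computed.

```python
from math import comb


def max_functional_surfaces_per_shape(n_cubes, n_sites=4, n_properties=7):
    """Upper bound on functionally distinct surfaces for one shape.

    Chooses which n_sites of the n_cubes cubes carry a specific property
    (N! / ((N-4)! * 4!) for four sites) and assigns one of n_properties to each.
    Symmetric shapes yield slightly fewer unique surfaces, as the text notes.
    """
    return n_properties ** n_sites * comb(n_cubes, n_sites)


for n in (6, 14):
    print(n, max_functional_surfaces_per_shape(n))
```

For the smallest and largest shapes this gives 36,015 and 2,403,401 functional variants per shape, respectively.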
  • the small molecules must be formatted in a similar frame of reference, for instance by quantizing them into positive space cubes ("quanta") of resolution 4.24 Å according to the following process (illustrated in Fig. 4):
  • a set of up to 100 minimized energy conformations within user-defined parameters is created.
  • Tripos Multisearch modeling is used, and all conformations within 10 kcal of the lowest energy conformation found are accepted.
  • a 4.24 Å 3D grid of cubes (quanta) is aligned on top of the 3D structure using the molecule's principal axes of rotation (calculated with all atoms having mass 1).
  • Order of dominance is from P7 to P1, in order of maximum complementarity score obtainable by a given characteristic as shown in Table 2:
  • Table 2. Relative magnitudes of parameters used in calculating molecular property interactions between negative space cubes (theoretical target surfaces) and positive space cubes (quantized molecules). Magnitudes (listed from highest to lowest): +++, ++, +, 0, -, --, ---.
  • Minimum % of VDW radius parameter allows for a user-defined protrusion beyond the surface of a quantum cube, adding a measure of topological "flexibility" to the quantization process. A user defined 32% was found to be especially good.
  • the total number of 4.24 Å quanta that have been assigned a property characteristic is counted.
  • the grid alignment is shifted per user-defined parameters and the process is repeated until all shift combinations have been searched.
  • a "Q-file" (3D configuration of property-assigned quanta) is saved that has the lowest number of quanta and is closest to the principal alignment.
  • a typical Q-file (molecule 6a) is shown in Fig. 4, superimposed upon its corresponding conformation. The process of optimization of quantization parameters is described later.
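The grid-shift search in the quantization process can be sketched as follows. This is a deliberately simplified Python illustration, not the patent's procedure: atoms are treated as bare points (no VDW radii, no "minimum % of VDW radius" parameter, no property assignment), the coordinates are assumed to be already aligned to the principal axes, and the `steps` parameter and function names are hypothetical stand-ins for the user-defined shift parameters.

```python
import math
from itertools import product

RESOLUTION = 4.24  # quantum edge length in angstroms, as given in the text


def occupied_quanta(points, shift=(0.0, 0.0, 0.0), resolution=RESOLUTION):
    """Return the set of grid cells (quanta) containing at least one point."""
    return {
        tuple(math.floor((coord - offset) / resolution) for coord, offset in zip(p, shift))
        for p in points
    }


def best_grid_shift(points, steps=4, resolution=RESOLUTION):
    """Try every combination of grid shifts and keep the most compact quantization.

    Fewer quanta wins; ties go to the smallest shift, i.e. the grid closest
    to the starting (principal) alignment.
    """
    offsets = [i * resolution / steps for i in range(steps)]
    best = None
    for shift in product(offsets, repeat=3):
        quanta = occupied_quanta(points, shift, resolution)
        key = (len(quanta), sum(s * s for s in shift))
        if best is None or key < best[0]:
            best = (key, shift, quanta)
    return best[1], best[2]


# Two points that straddle a cell boundary in the unshifted grid (illustrative only).
atoms = [(4.0, 0.0, 0.0), (4.5, 0.0, 0.0)]
print(len(occupied_quanta(atoms)))
print(best_grid_shift(atoms))
```

Here the unshifted grid splits the two points across two quanta, while a quarter-cell shift along x places both in a single quantum, which is the configuration the search keeps.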
  • each quantized conformation can be mapped into the diversity space defined by the set of 1.1 x 10^14 theoretical target surfaces. In general, the following process is used. For each quantized conformation of each molecule, each of its 24 possible X/Y/Z rotations (6 faces * 4 rotations per face) is fit to each of the 49,268,918 available surface shapes.
  • a score is generated for the complementarity of the given conformation to each theoretical target surface of a given shape based on user-defined parameters. (The process of optimization of the complementarity parameters is described later.)
  • the following complementarity parameters can be used: a) A negative parameter for each rotatable bond of the conformation. b) If conformational energies are calculated, a negative parameter for the energy of the conformation above the lowest energy conformation from that molecule. c) A positive parameter for the hydrophobic energy gained by removing "water" from any hydrophobic (P1) or polarizable (P2) surface face of either the conformation or the theoretical surface.
  • the implementations of the invention resolve the problem to a framework bounded by 24 possible fitting orientations and a finite number of translations. This approximation allows three-dimensional diversity computation on a scale that is applicable to very large sets of molecules.
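The 24 fitting orientations (6 faces * 4 rotations per face) are the proper rotations of a cube. A minimal Python sketch, assuming quanta are stored as integer (x, y, z) triples, enumerates them as signed permutation matrices with determinant +1; the function names are hypothetical and the fitting and scoring steps are not shown.

```python
from itertools import permutations, product


def determinant(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def cube_rotations():
    """Return the 24 axis-aligned rotation matrices (mirror images excluded)."""
    rotations = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            matrix = [
                [signs[r] if perm[r] == c else 0 for c in range(3)]
                for r in range(3)
            ]
            if determinant(matrix) == 1:  # keep rotations, drop reflections
                rotations.append(matrix)
    return rotations


def rotate(quanta, matrix):
    """Apply one rotation to a set of integer quantum coordinates."""
    return {
        tuple(sum(matrix[r][c] * q[c] for c in range(3)) for r in range(3))
        for q in quanta
    }


shape = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (0, 1, 0)}
orientations = {frozenset(rotate(shape, m)) for m in cube_rotations()}
print(len(cube_rotations()), len(orientations))
```

Because reflections are excluded, a chiral arrangement of quanta is never silently mapped onto its mirror image, which matters given that the enumerated surface shapes include chiral opposites.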
  • QSCD makes many approximations of molecular recognition. As explained, these include cubic units of 4.24 Å resolution, gross approximations of surface contact area, exactly 4 points of 7 finite types of molecular property characteristics, static theoretical surfaces, and a limited set (up to 100) of low energy conformers. Thus, the final complementarity scores are not presumed to give precise binding energies for any individual match of conformation to target surface. However, taken over all conformations of a molecule and across an enumerated set of theoretical target surfaces, the scoring system is statistically relevant as explained below.
  • Model Validation
  • Table 4 Tabulation of surface shapes and total number of theoretical target surfaces complementary to each molecule in Fig. 5.
  • mappings were scored in similarity from 0 to 1000 based on a function of the number of theoretical surfaces in common:
  • the first term in this equation gives a percentage measure (0-100) of shape similarity between molecules A and B, while the second term gives a measure from 0-10 of functional similarity per given shape overlap.
  • the scoring constant ⁇ in the equation above adjusts the influence of functionality on scoring.
  • Fig. 6 shows a plot of all 190 pairings ranked by similarity score. Circles show "heterogeneous" pairs of expected dissimilarity (e.g. 2a, 6b), while squares show "homogeneous" pairs of expected similarity (e.g. 2a, 2b). Clearly, the QSCD model ranks homogeneous pairs almost exclusively higher than heterogeneous pairs; all 1 pharmacologically similar pairs fell within the top 20 scores out of 190. All homogeneous scores were ranked above 25, while the median score in this experiment was 2.8, showing good "signal to noise." The QSCD model is thus a valid predictor of target binding similarity among these molecules.
  • the pairings also reveal further validation. As might be expected from their relative rigidity (low number of accessible conformations) and structural similarity, the highest scoring pairs are 2a/2b, 8a/8b, and 8d/8e. Furthermore, examination of the pairings of 8c with 8a,b,d,e (triangles in Fig. 6, yellow in Table 5) yields scores that are within the top 20% of the pairing experiment but which are generally lower than the "homogeneous" pairs. This makes sense from a target-binding point of view, considering that one face of 8c contains a large molecular difference (an extra phenyl substituent).
  • Fig. 7 shows one such case of a surface common to both 8a and 8c; the protruding phenyl substituent plays no role in complementarity.
  • Fig. 8 depicts one such case between 1a and 5a; conformations of 1a and 5a are displayed that were found in the QSCD model to be complementary to the same surface (Fig. 8A, 8B).
  • 3D overlays (Fig. 8C) confirm correlation of general shape and 4 points of functionality, although they also make clear the limits of resolution of complementarity information using 4.24 Å units.
  • the surface in question can detect general shape and functional similarity, but does not provide a basis to predict atom-for-atom overlap between molecules.
  • Fig. 9 and Table 6, contained in Figure 23, show the same set of 20 molecules ranked by Tanimoto similarity of standard 2D UNITY fingerprints (see discussion below).
  • the data demonstrate that the 2D model is equally capable of predicting pharmacologically similar pairs; UNITY ranks similarity between AT1 and AT2 subtype binders much higher than our QSCD model, although it finds unusually high similarity between 8a and 8c.
  • 2D fingerprint descriptors have been found effective in clustering pharmacologically similar compounds, and are widely used in determining molecular diversity of existing structures.
  • the QSCD model determines not only diversity of existing structures, but also the structure of non-existing diversity. Given theoretical surface shapes for which no complements exist in a general screening library, QSCD allows the design of molecules to fill the given diversity void.
  • the QSCD basis set is created through a reversible process. Although some information resolution may be lost in fixing the parameters of a cube's size and functional scope, information content is retained in either direction.
  • a single molecular conformation and orientation corresponds to a defined pattern in QSCD space
  • a single point in QSCD space corresponds to a unique 3D shape with a defined 3D array of functionality.
  • unoccupied points in QSCD space directly define the molecular shapes and functionalities which those molecules do not cover.
  • a set of detailed 3D molecular templates is immediately available for the creation of novel molecules.
  • Fig. 10 shows a plot of all of the theoretical surface shapes covered by all of the conformations of all of the molecules used in the example implementation (see Fig. 5).
  • the total volume of the cube in Fig. 10 encompasses all 49,268,918 theoretical surface shapes as listed in Table 1.
  • many theoretical surface shapes are "unfilled" by the set of compounds shown in Fig. 5.
  • when searching for molecules or libraries to enhance the diversity of the given set of compounds, the chemist is presented with a set of actual 3D templates into which new compound libraries may be designed.
  • although mapping the same set of compounds in a "non-reversible" diversity space would also display a set of coordinates to which the molecules map, there would be no way to visualize the 3D shape of any point that was not filled by one of the compounds in the set.
  • the coordinates specified for an unfilled point leave the chemist with a set of normalized eigenvalues. While these may give an idea of relative abundance of a given functionality (e.g. H-Bond Donor) at this point in diversity space, the coordinates give no hint of what shape or class of molecules might fill that diversity void.
  • the above example shows how QSCD is a reversible diversity model with respect to molecular shape.
  • QSCD makes possible the contemplation of a "complete" library of screening molecules at a given resolution.
  • the model thus offers a theoretical and practical answer to the problem of generating lead structures for genomic targets of unknown structure and function.
  • UNITY 2D fingerprints (Unity 4.0, Tripos Inc., 1699 S. Hanley Rd., St. Louis, MO, 63144) were generated on an R10000 Silicon Graphics workstation. Pairwise Tanimoto coefficients were computed as described by Dixon and Koehler.
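The Tanimoto coefficient of two bit-string fingerprints is the number of bits set in both divided by the number of bits set in either. The following is a minimal sketch of the pairwise calculation, assuming each fingerprint is held as a Python set of "on" bit indices; the function names and the set representation are illustrative and are not taken from the UNITY software.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def pairwise_tanimoto(fingerprints):
    """Return {(name_a, name_b): coefficient} for every unordered pair of molecules."""
    names = sorted(fingerprints)
    return {
        (a, b): tanimoto(fingerprints[a], fingerprints[b])
        for a, b in combinations(names, 2)
    }
```

For a set of 20 molecules this produces 190 pair scores, matching the number of pairings ranked in the comparison.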
  • QSCD software for molecule quantization, mapping of Q-files, and surface complementarity display was developed using the Java programming language (JDK 1.2) and the Java3D graphics API (version 1.1) on Intel-based workstations.
  • Theoretical target surfaces were stored and indexed using an Oracle 7.3.3 database.
  • Parameters for theoretical target surface generation/molecular quantization and parameters for complementarity mapping/scoring were alternately optimized in three successive rounds as described below.
  • the parameters used for theoretical target surface generation and the closely related parameters for quantization of small molecules into quantized files (Q-files) were optimized in the context of the algorithms mentioned above.
  • Parameters were iteratively optimized by varying a given parameter and then quantizing training molecules other than those in Fig. 5. Training molecules used were taken from in-house structures and two published SAR sets.
  • the parameters for mapping/scoring molecular conformations to theoretical target surfaces were optimized in the context of the algorithm stated above. Parameters were iteratively optimized by varying a given parameter and then mapping a constant set of training molecules (see above) to a constant set of theoretical target surfaces, using the most current surface generation and quantization parameters. Diversity pairing scores were generated for all training molecules, and parameters were chosen which accurately predicted known homogeneous/heterogeneous pairs and which maximized "signal to noise" of homogeneous scores over heterogeneous scores.
  • the minimum overlap requirement was set to either 9 quanta or N-2 quanta of a conformation of N quanta. This range allows large conformations to fit partially into a theoretical surface (protruding volume must be at the mouth of the surface) while also allowing smaller conformations to be considered for complementarity. It excludes large conformations which do not overlap at least 9 quanta.
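The overlap rule can be sketched as follows. This reads "either 9 quanta or N-2 quanta" as "whichever is smaller", which is an interpretation suggested by the explanation that large conformations must still overlap at least 9 quanta while smaller ones remain eligible; the function and parameter names are hypothetical.

```python
def min_overlap_quanta(n_quanta, q_min=9, slack=2):
    """Smallest number of overlapping quanta accepted for a conformation of n_quanta.

    Assumed reading: the threshold is min(q_min, n_quanta - slack), never below zero.
    """
    return max(0, min(q_min, n_quanta - slack))


def passes_overlap(n_quanta, overlapping_quanta):
    """True if the conformation overlaps the surface by enough quanta."""
    return overlapping_quanta >= min_overlap_quanta(n_quanta)
```

Under this reading a 14-quanta conformation needs 9 overlapping quanta, while an 8-quanta conformation needs 6.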
  • Approximate computational speeds of typical QSCD operations are as follows on a single Pentium III 500 MHz workstation: Generation of the basis set of theoretical target surfaces used in the study required 17 min.; this data was stored for access by subsequent QSCD functions. Quantization of 100 conformations of a given molecule into 100 Q-files required 250 seconds. Complementarity mapping of 100 Q-files onto the basis set of theoretical target surfaces used in the study required 40 seconds.
Algorithm for Designing Molecules for Unfilled Target Surfaces
  • Appendix E describes an algorithm for quantization of protein surfaces.
  • Appendix F describes an algorithm for comparing protein surfaces to determine a degree of similarity or dissimilarity. The following algorithm generates a set of files T of quantized protein binding surfaces which together represent the available surface of a given protein binding site.
  • Protein binding site: see 1, below
  • Transinc translational increment in angstroms
  • Rot # rotations (odd integer)
  • Rotvar rotational variance (%)
  • CN 94025) which minimally consists of: a) a calculated probable electron density surface of the binding site; b) a list of all known atom types in the molecule with their coordinates and atomic radii; c) a list of known connectivities of all atoms with the type of bond connecting each atom
  • step 7 At a given grid nexus, place a cube with the center of one face tangent to the probable electron density surface of the protein binding site. 6. If the cube contains a protein atom coordinate, or any atomic radii of protein atoms protrude into the trial cube by more than Tol % of their atomic radius, then step 7, otherwise step 8.
  • if a trial cube contains a protein atom coordinate, or any atomic radii of protein atoms protrude into the trial cube by more than TolA of their atomic radius, or if the cube does not intersect with a volume of the convex hull equal to at least R^3 * TolB (see 3. above), then the trial cube is removed. Otherwise the trial cube becomes a set cube (with unchecked faces).
  • step 9 Designate all cubes as negative space cubes fully enclosed except as detailed below. Designate the layer of cubes which is a) perpendicular to the line perpendicular to the probable electron density surface at the grid nexus being examined and b) farthest from the grid nexus as negative space cubes which are open at their faces farthest from the grid nexus and perpendicular to the line perpendicular to the probable electron density surface at the grid nexus.
  • Types of functionality M may include but are not limited to:
  • Appendix G describes an algorithm for determining the complementarity of a library of molecules to a set of protein surfaces.
  • novel molecules for testing as ligands for proteins
  • novel molecules can be designed based on complementarity to negative space cube targets to which a set of protein pockets map. The following outline describes the steps for doing so:
  • Appendix H contains example parameter values useful in connection with the algorithms described in Appendixes A through G.
  • a 4.24 Å cube was found to be the largest predictive unit size of diversity measure for our criteria of designing general screening libraries. For example, both 4.48 and 4.00 Å units gave poorer prediction of homogeneous/heterogeneous pairs than the pairings of Fig. 6 (4.24 Å units). This is likely due to the fact that most organic small molecules are themselves quantized by a limited basis set: the VDW radii of H, C, N, O and a few other atoms (see for example Fig. 1). If there is no constraint on size of cubic units, however (i.e., if there is no attempt to maximize orthogonality of theoretical target surfaces), other unit measures of diversity can be found. A unit of 2.12 Å should also provide effective diversity information but at a much higher resolution.
  • size of diversity space in terms of unique molecular points. In other words, what is the minimum set of molecules needed to fully cover a given diversity space? This calculation is dependent on two factors: the resolution stipulated in the model (e.g., what amount of molecular change is recognized as different) and the maximum values of each dimension of the model's basis axes. In the model of QSCD discussed above, resolution is fixed by cubic units of 4.24 Å, and maximum values are fixed at 14 units (molecular volume of 1070 cubic Å) and 4 points of 7 types of molecular property characteristics. As described above, the result is a set of 1.1 × 10^14 unique molecular points.
  • Table 7 Summation of binding energies for an interaction of an average complementary molecule/theoretical target surface pair in the context of the QSCD model used herein.
  • An average complementary theoretical target surface is also assumed to have 60% non-polar exposed faces. Constants used in the table are taken from Ajay and Murcko.
  • the resolution used to calculate diversity translates roughly to nanomolar binding conditions for an average molecule/target surface pair.
  • a general screening library guaranteed to contain at least one nanomolar binder to any given target of interest would thus number at least 24 million molecules. This is a large number and will be attenuated by the fact that some molecules have significantly more than 100 conformations available to them.
  • the QSCD model suggests that if, in the near future, combinatorial chemistry and high-throughput screening are to generate initial hits primarily in the nanomolar rather than micromolar range, then the field must continue to focus its efforts on the development of numerically competent synthesis and screening technologies.
  • a surface opening O is a set of lattice squares in $\mathbb{Z}^2$ denoted by their corners
  • the area of a surface opening is defined to be the number of lattice squares it contains. Surface openings are considered to be unique up to translations and rotations of the x-y plane.
  • a surface shape S is a set of "negative space" cubes represented as lattice cubes in $\mathbb{Z}^3$ denoted by their corners:
  • the volume of a surface shape is defined to be the number of lattice cubes it contains. Surface shapes are considered to be unique up to translations and rotations of the x-y plane.
  • $\mathrm{shape}(O, d) = \{(x, y, z) : (x, y) \in O,\ d(x, y) > -z\}$
  • the function d specifies the "depth" of the surface shape at each opening point. Sample surface shapes and openings can be seen in Figure 13.
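A surface shape can be built from an opening and a depth function in a few lines. The sketch below takes the defining condition to be (x, y) in O and d(x, y) > -z, and additionally assumes z ≤ 0 so that each column has exactly d(x, y) cubes hanging below the opening plane; that bound on z and the names used are assumptions made for illustration.

```python
def surface_shape(opening, depth):
    """Build the set of lattice cubes for an opening and a depth map.

    opening: iterable of (x, y) lattice squares.
    depth:   dict mapping each (x, y) in the opening to a non-negative integer depth.
    Cubes are indexed by (x, y, z) with z = 0, -1, ..., -(depth - 1), i.e. every
    integer z <= 0 satisfying depth > -z (the z <= 0 bound is an assumption).
    """
    cubes = set()
    for (x, y) in opening:
        for z in range(0, -depth[(x, y)], -1):
            cubes.add((x, y, z))
    return cubes


def volume(shape):
    """Volume of a surface shape is the number of lattice cubes it contains."""
    return len(shape)
```

With this indexing the volume of a shape equals the sum of the depths over the opening.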
  • a theoretical surface consists of a surface shape where the cubes in the surface shape are each associated with functionality.
  • the set T of seven specific types of characteristic functionality is used:
  • a functionality map f : S → T defines the assignment. By default, all cubes are assigned functionality % , and all possibilities are considered where up to n_f of the cubes are given one of the functionalities T ⁇ - .
  • N ⁇ to be a set containing only the surface opening with a single square at (0, 0)
  • define T 5 to be the set of all possible openings obtained by adding a single square to O adjacent to a square already present in O
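The growth step (start from a single square at (0, 0), then repeatedly add one square adjacent to an existing square) can be sketched as below. Openings are kept unique up to translations and rotations of the x-y plane by reducing each one to a canonical form; the canonicalization scheme and all names are illustrative choices, not the patent's data structures, and no filtering by area threshold or contiguity is applied here.

```python
def normalize(squares):
    """Translate a set of squares so that its minimum x and y are both zero."""
    min_x = min(x for x, _ in squares)
    min_y = min(y for _, y in squares)
    return frozenset((x - min_x, y - min_y) for x, y in squares)


def canonical(squares):
    """Canonical form of an opening under translation and the four planar rotations."""
    forms = []
    current = list(squares)
    for _ in range(4):
        current = [(-y, x) for x, y in current]  # rotate by 90 degrees
        forms.append(tuple(sorted(normalize(current))))
    return min(forms)


def grow(opening):
    """Yield every opening obtained by adding one square adjacent to an existing one."""
    squares = set(opening)
    for x, y in opening:
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbor = (x + dx, y + dy)
            if neighbor not in squares:
                yield canonical(squares | {neighbor})


def enumerate_openings(max_area):
    """Return a list whose k-th entry is the set of unique openings of area k + 1."""
    levels = [{canonical([(0, 0)])}]
    for _ in range(max_area - 1):
        levels.append({grown for opening in levels[-1] for grown in grow(opening)})
    return levels
```

Because reflections are not treated as equivalent, mirror-image openings are counted separately.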
  • Algorithm A.2 OPENINGFILTER(O, A_t, M_nc, M_c): filter a set of surface openings O using area-threshold parameter A_t, max-non-central parameter M_nc, and max-contiguous parameter M_c.
  • Each atom in the molecule is assigned a functionality based on its type and connectivity
  • Each cube is assigned a functionality based on the atoms that it contains
  • a map f : M → T assigning functionalities to all of the atoms
  • the map is defined by using a set of rules to match molecular substructures based on extended atom types (as generated by Tripos, for example) and bonding patterns that encapsulate each functionality type
  • the algorithm keeps track of atoms that are excluded from matching a lower priority rule because they have already been matched in a higher priority rule, where T ⁇ has the highest priority and T ⁇ the lowest.
  • Atoms not matching any rule are assigned functionality T ⁇ . No atoms are assigned functionality T %
  • the functionality rules used can be seen as follows:
  • Algorithm B.1 ATOMFUNCTIONALMAP(M): assign functionality to the atoms in molecular structure M and determine which atoms are excluded from quantization
  • a conformation is a mapping $c : M \to \mathbb{R}^3$ of the molecular structure into three-dimensional space.
  • OpenEye Omega software (Open Eye Scientific Software Inc., 335c Winische Way, Santa Fe, NM, 87501)
  • up to n_c representative conformations for each molecule are generated within given rule-based energy parameters.
  • a coordinate frame $T = (R, t) : \mathbb{R}^3 \to \mathbb{R}^3$ is a rigid motion of the space defined by a rotation $R \in SO_3(\mathbb{R})$ and a translation $t \in \mathbb{R}^3$ that transforms a point p by the rule
  • a lattice on $\mathbb{R}^3$ is implicitly defined by $q \in \mathbb{Z}^3 \mapsto [r q_x, r q_x + r) \times [r q_y, r q_y + r) \times [r q_z, r q_z + r) \subset \mathbb{R}^3$
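Reading the lattice definition as assigning to each integer triple q the half-open cube from r·q to r·q + r along each axis, the cube containing a given point is found by taking the floor of each coordinate divided by r. A minimal sketch with hypothetical function names:

```python
import math


def cube_index(point, r):
    """Integer lattice index q of the cube of side r that contains the point."""
    return tuple(math.floor(coordinate / r) for coordinate in point)


def cube_bounds(q, r):
    """Half-open interval [low, high) covered by cube q along each axis."""
    return tuple((r * index, r * index + r) for index in q)
```

Using floor rather than truncation keeps the half-open convention correct for negative coordinates.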
  • the base coordinate frame is generated from a conformation of molecular structure M.
  • a subset of the atoms in M is selected via FRAMEATOMS. These are atoms in or near ring structures close to the center of the conformation
  • a ring atom is defined to be an atom that contains at least one bond which, if removed, would not result in the molecule being disconnected. If there are an insufficient number of ring atoms, all atoms sufficiently close to the center of the conformation are used
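The ring-atom definition translates directly into a connectivity test: delete each bond in turn and check whether the molecule is still connected. The sketch below uses this naive approach on a plain graph of atom indices and bond pairs; it assumes the molecule is connected to begin with, and the representation and names are illustrative only.

```python
def is_connected(atoms, bonds):
    """True if every atom can be reached from the first one along the given bonds."""
    atoms = list(atoms)
    if not atoms:
        return True
    neighbors = {atom: set() for atom in atoms}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = {atoms[0]}
    stack = [atoms[0]]
    while stack:
        for nxt in neighbors[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(atoms)


def ring_atoms(atoms, bonds):
    """Atoms having at least one bond whose removal leaves the molecule connected."""
    ring = set()
    for i, (a, b) in enumerate(bonds):
        remaining = bonds[:i] + bonds[i + 1:]
        if is_connected(atoms, remaining):
            ring.update((a, b))
    return ring
```

For a six-membered ring carrying one substituent atom, the six ring members are returned and the substituent is not.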
  • the base coordinate frame is calculated in BASEFRAME.
  • the x-axis of the base coordinate frame is defined to be the solution to the optimization problem max ⁇ x ⁇ (c(a) — p) ⁇ ⁇ M where
  • the base coordinate frame then simply involves centering the conformation by translating p to the origin, and using the new x, y, and z
  • Algorithm B.2 FRAMEATOMS(M, c, η, r_m, q_r): select a set of atoms to use for generating the coordinate frame from a molecular structure M and a conformation c.
  • the following parameters are used: ring factor η, ring minimum r_m, radius factor q_r.
  • Algorithm B.3 RINGGROUP(a, R, η): calculate the set of atoms that can be reached starting at atom a and crossing over at most η atoms that are not in the set of ring atoms R
  • the lattice defined by a coordinate frame places corners of cubes at points all of whose coordinates are integer multiples of r. Given a particular conformation, it may be better to shift the lattice by a length of r/2 in a particular direction, recentering the lattice cubes.
  • I ⁇ - ⁇ c f ⁇ M ⁇ ] set ( to be the .th lowest element in the set ⁇
  • j to be the index of the least member of the set ⁇ (n ⁇ , d z , d Vt i , d x .i ⁇ , where comparisons are done using a dictionary ordering (that is, compare the first component, in case of equality compare the second component, etc.)
  • the base coordinate frame is not necessarily optimal for quantization, so a set of "close" frames are also examined.
  • R_x to be the coordinate frame corresponding to rotation about the x-axis by $\pi v_r (2 i_x + 1 - n_r)/(4 n_r)$ radians
  • R_y to be the coordinate frame corresponding to rotation about the y-axis by $\pi v_r (2 i_y + 1 - n_r)/(4 n_r)$ radians
  • R_z to be the coordinate frame corresponding to rotation about the z-axis by $\pi v_r (2 i_z + 1 - n_r)/(4 n_r)$ radians
  • $T_x(p) = p + (r v_t (2 j_x + 1 - n_t)/(2 n_t),\ 0,\ 0)$
  • $T_y(p) = p + (0,\ r v_t (2 j_y + 1 - n_t)/(2 n_t),\ 0)$
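The "close" frames amount to a small symmetric grid of rotation angles and translation offsets around the base frame. The sketch below reads the rotation angle as π·v_r·(2i + 1 − n_r)/(4·n_r) for i = 0 … n_r − 1 and the offset as r·v_t·(2j + 1 − n)/(2·n) for j = 0 … n − 1; the leading π, the index names, and the use of the number of translations as the count n in the offset formula are reconstructions from hard-to-read source text and should be treated as assumptions.

```python
import math


def rotation_angles(n_r, v_r):
    """Candidate rotation angles (radians) about one axis for n_r rotations.

    Assumed formula: pi * v_r * (2*i + 1 - n_r) / (4 * n_r), i = 0 .. n_r - 1.
    """
    return [math.pi * v_r * (2 * i + 1 - n_r) / (4 * n_r) for i in range(n_r)]


def translation_offsets(n, v_t, r):
    """Candidate offsets along one axis for n translations on a lattice of side r.

    Assumed formula: r * v_t * (2*j + 1 - n) / (2 * n), j = 0 .. n - 1.
    """
    return [r * v_t * (2 * j + 1 - n) / (2 * n) for j in range(n)]
```

With an odd count the middle entry is exactly zero, so the unmodified base frame is always among the candidates, which fits the requirement that the number of rotations be an odd integer.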
  • cubification is the process of determining which lattice cubes are filled by the conformation.
  • the set of lattice cubes is constructed by taking any cube in which an atom center in the conformation directly falls and also cubes which are sufficiently close to the van der Waals sphere of an atom.
  • Algorithm B.7 CUBIFY(M, c, T, r, t): quantize the conformation c of the molecular structure M into a set of cubes defined on the coordinate frame T using parameters: coordinate frame T, resolution r, tolerance t.
  • d to be the distance of the point in $\mathbb{R}^3$ in the cube defined by q that is closest to p
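Cubification can be sketched with a point-to-cube distance, where d is the distance from an atom center p to the closest point of cube q. The source does not spell out how the tolerance t enters the "sufficiently close" test, so the rule used here (a cube is filled when d ≤ t × van der Waals radius) is purely an assumption for illustration, as are the function names and the input format; atom coordinates are taken to be already expressed in the chosen coordinate frame.

```python
import math
from itertools import product


def distance_to_cube(p, q, r):
    """Distance from point p to the closest point of the cube with index q and side r."""
    total = 0.0
    for coordinate, index in zip(p, q):
        low, high = r * index, r * index + r
        nearest = min(max(coordinate, low), high)  # clamp onto the cube's interval
        total += (coordinate - nearest) ** 2
    return math.sqrt(total)


def cubify(atoms, r, t):
    """Quantize atoms into lattice cubes.

    atoms: list of (center, vdw_radius) with center an (x, y, z) tuple.
    A cube is filled if an atom center lies in it (distance 0) or if its distance
    to the center is at most t * vdw_radius (assumed closeness rule).
    """
    filled = set()
    for center, radius in atoms:
        reach = t * radius
        home = tuple(math.floor(c / r) for c in center)
        span = math.ceil(reach / r)  # how many cubes away could still be in reach
        for offset in product(range(-span, span + 1), repeat=3):
            q = tuple(h + o for h, o in zip(home, offset))
            if distance_to_cube(center, q, r) <= reach:
                filled.add(q)
    return filled
```

An atom near the middle of a cube fills only that cube, while an atom close to a face also fills the neighboring cube across that face.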
  • Algorithm B.8 QUANTIZE(M, C, r, t, η, r_m, q_r, c_f, c_t, n_t, v_t, n_r, v_r): quantize the set of conformations C for the molecular structure M with parameters: resolution r, tolerance t, ring factor η, ring minimum r_m, radius factor q_r, centering fraction c_f, centering tolerance c_t, number of translations n_t, translational variance v_t, number of rotations n_r, rotational variance v_r, polarizable minimum p_m.
  • Algorithm C.1 FITSURFACES(Q, f_Q, E_c, r_b): calculate all surfaces with functionality that are complementary to the quantized conformation Q with functionality map f_Q, conformational energy E_c, and r_b rotatable bonds, using the following parameters: minimum surface opening area A, maximum surface volume V, area-threshold A_t, max-non-central M_nc, max-contiguous M_c, max-extrusion M_e, number of points of characteristic functionality n_f, minimum energy E_min, minimum fit quanta q_min, minimum slackness s_min, maximum slackness s_max, maximum protrusion levels p_max, translational-rotational-vibrational entropy E_trv, rotatable bond coefficient c_r, hydrophobic energy coefficient c_h, hydrophobic surface energy coefficient c_s, potential function
  • R to be the set of 24 lattice rotations
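The 24 lattice rotations are the proper rotations that map the cubic lattice onto itself, i.e. the 3×3 signed permutation matrices with determinant +1. A short sketch that generates them and applies one to a lattice cube index (function names are illustrative):

```python
from itertools import permutations, product


def det3(m):
    """Determinant of a 3x3 integer matrix given as nested tuples."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def lattice_rotations():
    """All signed permutation matrices with determinant +1 (24 in total)."""
    rotations = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            matrix = tuple(
                tuple(signs[row] if col == perm[row] else 0 for col in range(3))
                for row in range(3)
            )
            if det3(matrix) == 1:
                rotations.append(matrix)
    return rotations


def rotate(matrix, q):
    """Apply a lattice rotation to an integer lattice vector q."""
    return tuple(sum(matrix[i][j] * q[j] for j in range(3)) for i in range(3))
```

Applying every rotation to a quantized conformation gives the 24 fitting orientations that bound the complementarity search.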
  • Algorithm C.3 DETECTSURFACES(S_c, A, V, A_t, M_nc, M_c, M_e): detect additional surfaces by adding cubes to S_c subject to parameters: minimum surface opening area A, maximum surface volume V, area-threshold A_t, max-non-central M_nc, max-contiguous M_c, max-extrusion M_e
  • V to be the set of all possible openings obtained by adding a single square to O adjacent to a square already present in O
  • Complementarity energy between a quantization of a molecular conformation and a cubic theoretical surface is the sum of several components:
  • a library is a set of molecular structures. Given a library, the set of complementary theoretical surfaces is defined as the union of all surface shape/functionality pairs complementary to any quantized conformation of any molecule in the library
  • the algorithm LIBRARYCOMPARE calculates a score proportional to the similarity of the two libraries
  • the score is calculated by representing each library as its set of complementary theoretical surfaces, and using the SIMILARITYSCORE primitive to determine the similarity or dissimilarity of two sets of theoretical surfaces. If the molecular libraries each contain only one molecule, then the algorithm calculates a score proportional to the similarity of the two molecules
  • Algorithm D.1 LIBRARYCOMPARE(L_1, L_2): calculate a similarity score between 0 and 1000 for two molecular libraries
  • the target sites of a protein surface are quantized into the same negative space cubic representation used by theoretical surfaces. This allows the following analyses:
  • the protein quantization process is accomplished in the following steps, as depicted in Figure 30:
  • a protein surface is generated from the 3D structure
  • a protein surface is a set of triangles defining the surface of the protein that is accessible to water molecules (known as the Connolly surface).
  • Michael Connolly's MSRoll software is an example of a package that can generate a protein surface suitable for this purpose
  • Subsets of the surface which are target sites likely for the binding of small molecules are detected. This can be accomplished, for example, by looking for highly concave regions.
  • Michael Connolly's MSForm software is an example of a package that can measure surface curvature and detect pockets suitable for this purpose
  • Each target site is quantized into a set of negative space cubes with associated functionalities using the protein function map and the algorithm TARGETSITEQUANTIZE.
  • the underlying process is very similar to the algorithm QUANTIZE.
  • Each set of quantized negative space cubes with functionality is converted to a set of theoretical surfaces satisfying the proper constraints (for example, no occluded cubes are allowed) using the algorithm BUILDSURFACES
  • Algorithm E.1 TARGETSITEQUANTIZE: calculate a negative space cubic representation of the target site defined by the triangles in set T, with v as a normal vector pointing out of the target site, f as a functionality map for the entire protein, using parameters: resolution r, lattice density n, lattice van der Waals radius r_v, lattice tolerance t_l, number of lattice neighbors p, buffer distance b, search radius s_r, and additional parameters for subroutine calls (see below) as necessary
  • define a set of points $V \subset \mathbb{R}^3$ to be the points on a lattice with coordinate frame T and cube side length r, such that $p \in V$ if p is contained in the target site and the closest triangle in T is at least distance b away from p
  • a target surface set is defined as the set of all theoretical target surfaces to which a set of known protein surfaces map.
  • the target surface set may comprise, for example, all of the surfaces mapped from one protein, all of the surfaces mapped from multiple proteins, or all of the surfaces mapped from specific sites on multiple proteins
  • the algorithm PROTEINCOMPARE calculates a score proportional to the similarity of the two sets of protein surfaces.
  • the score is calculated by representing each protein surface set as its target surface set, and using the SIMILARITYSCORE primitive to determine the similarity or dissimilarity of two sets of theoretical surfaces.
  • Algorithm F.1 PROTEINCOMPARE(P_1, P_2): calculate a similarity score between 0 and 1000 for two protein surface sets.
  • T to be the target surface set corresponding to V
  • the algorithm PROTEINLIBRARYCOMPARE calculates a score proportional to the complementarity of a library of small molecules and a set of protein surfaces.
  • the score is calculated by representing the protein surface set as the theoretical surface set to which it is similar, the molecular library as the theoretical surface set to which it is complementary, and using the SIMILARITYSCORE primitive to determine the similarity or dissimilarity of two sets of theoretical surfaces.
  • T p to be the target surface set corresponding to V
  • Rotational variance (v_r): 0.1
  • Polarizable minimum (p_m): 2
  • Minimum fit quanta (q_min): 9
  • Minimum slackness (s_min): 2
  • Maximum slackness (s_max): 0
  • Maximum protrusion levels (p_max): 1
  • Minimum energy (E_min): 8.0 kcal
  • Buffer distance (b): 0.5 Å

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Physics & Mathematics (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Computing Systems (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The invention relates to a computer-based method in which a set of constraints is imposed on possible target surfaces, and a rigorously enumerated set of theoretical target surfaces is generated subject to the given constraints, such that each surface has a defined continuous volume and a defined continuous surface area. At least one set of objects is mapped to the rigorously enumerated set of theoretical target surfaces to define corresponding subsets of the rigorously enumerated set of theoretical target surfaces. An aspect of diversity of the objects is analyzed based on the degrees of similarity and difference between the corresponding subsets.
EP00925889A 1999-04-02 2000-03-31 Analyse de la diversite moleculaire et proteique Withdrawn EP1203330A2 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US12748699P 1999-04-02 1999-04-02
US127486P 1999-04-02
PCT/US2000/008777 WO2000060507A2 (fr) 1999-04-02 2000-03-31 Analyse de la diversite moleculaire et proteique

Publications (1)

Publication Number Publication Date
EP1203330A2 true EP1203330A2 (fr) 2002-05-08

Family

ID=22430392

Family Applications (1)

Application Number Title Priority Date Filing Date
EP00925889A Withdrawn EP1203330A2 (fr) 1999-04-02 2000-03-31 Analyse de la diversite moleculaire et proteique

Country Status (5)

Country Link
EP (1) EP1203330A2 (fr)
JP (1) JP2002541560A (fr)
AU (1) AU4451100A (fr)
CA (1) CA2369570A1 (fr)
WO (1) WO2000060507A2 (fr)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1307536A2 (fr) * 2000-06-16 2003-05-07 Neogenesis Pharmaceuticals, Inc. Systeme et procede d'evaluation des concavites dans une proteine
EA010258B1 (ru) 2002-07-24 2008-06-30 Кеддем Байо-Сайенс Лтд. Способ поиска лекарственного вещества
CN110010199B (zh) * 2019-03-27 2021-01-01 华中师范大学 一种分析识别蛋白质特异性药物结合口袋的方法
CN110390997B (zh) * 2019-07-17 2023-05-30 成都火石创造科技有限公司 一种化学分子式拼接方法
CN113421610B (zh) * 2021-07-01 2023-10-20 北京望石智慧科技有限公司 一种分子叠合构象确定方法、装置以及存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO0060507A2 *

Also Published As

Publication number Publication date
CA2369570A1 (fr) 2000-10-12
JP2002541560A (ja) 2002-12-03
AU4451100A (en) 2000-10-23
WO2000060507A2 (fr) 2000-10-12
WO2000060507A3 (fr) 2001-04-12

Similar Documents

Publication Publication Date Title
US6240374B1 (en) Further method of creating and rapidly searching a virtual library of potential molecules using validated molecular structural descriptors
US5307287A (en) Comparative molecular field analysis (COMFA)
US7765070B2 (en) Ellipsoidal gaussian representations of molecules and molecular fields
Perez Managing molecular diversity
US20070134662A1 (en) Structural interaction fingerprint
Andersson et al. Mapping of ligand‐binding cavities in proteins
US8374837B2 (en) Descriptors of three-dimensional objects, uses thereof and a method to generate the same
WO2000060507A2 (fr) Analyse de la diversite moleculaire et proteique
US8165818B2 (en) Method and apparatus for searching molecular structure databases
EP1862927A1 (fr) Descripteurs d'objets tridimensionnels, ses utilisations et procédé pour les générer
Gillet Designing combinatorial libraries optimized on multiple objectives
Mestres et al. A molecular-field-based similarity study of non-nucleoside HIV-1 reverse transcriptase inhibitors. 2. The relationship between alignment solutions obtained from conformationally rigid and flexible matching
Nassif et al. An inductive logic programming approach to validate hexose binding biochemical knowledge
Good 3D molecular similarity indices and their application in QSAR studies
CA2321303C (fr) Methode de determinataion de l'espace forme par un ensemble de molecules au moyen de distances metriques minimales
AU2008202475A1 (en) Descriptors of three-dimensional objects, uses thereof and a method to generate the same
Gorostiola González et al. an den, Braun, TGM, espers, W., I zerman
WO2009146735A1 (fr) Descripteurs d'objets tridimensionnels, utilisations de ces descripteurs et procédé de production de ceux-ci
US6470305B1 (en) Chemical analysis by morphological similarity
CA2633179A1 (fr) Descripteurs d'objets tridimensionnels, leurs utilisations et methode de creation connexe
Treado Computational Studies of Packing and Jamming in Biological Systems
Crippen et al. Quantitative structure-activity relationships (QSAR)
US20070005258A1 (en) Identification of ligands for macromolecules
CA2712083A1 (fr) Representations ellipsoides gaussiennes de molecules et de champs moleculaires
Starosolski et al. Intelligent System for Docking Ligands to Protein Active Sites

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20011101

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

17Q First examination report despatched

Effective date: 20040308

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20040720