US7188055B2 - Method, system, and computer program for displaying chemical data - Google Patents


Info

Publication number
US7188055B2
Authority
US
United States
Prior art keywords
window
objects
compounds
user
chemical compounds
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related, expires
Application number
US09/802,956
Other versions
US20020069043A1 (en)
Inventor
Dimitris K Agrafiotis
Victor S Lobanov
Francis R Salemme
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Janssen Research and Development LLC
Original Assignee
Johnson and Johnson Pharmaceutical Research and Development LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Johnson and Johnson Pharmaceutical Research and Development LLC
Priority to US09/802,956
Publication of US20020069043A1
Assigned to JOHNSON & JOHNSON PHARMACEUTICL RESEARCH AND DEVELOPMENT, L.L.C. (merger; see document for details). Assignors: 3-DIMENSIONAL PHARMACEUTICALS, INC.
Application granted
Publication of US7188055B2
Adjusted expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/20 Drawing from basic elements, e.g. lines or circles
    • G06T11/206 Drawing of charts or graphs
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J19/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J19/0046 Sequential or parallel reactions, e.g. for the synthesis of polypeptides or polynucleotides; Apparatus and devices for combinatorial chemistry or for making molecular arrays
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2137 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on criteria of topology preservation, e.g. multidimensional scaling or self-organising maps
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/40 Software arrangements specially adapted for pattern recognition, e.g. user interfaces or toolboxes therefor
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/60 In silico combinatorial chemistry
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/80 Data visualisation
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/0068 Means for controlling the apparatus of the process
    • B01J2219/00686 Automatic
    • B01J2219/00689 Automatic using computers
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/0068 Means for controlling the apparatus of the process
    • B01J2219/00686 Automatic
    • B01J2219/00691 Automatic using robots
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/0068 Means for controlling the apparatus of the process
    • B01J2219/00695 Synthesis control routines, e.g. using computer programs
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/0068 Means for controlling the apparatus of the process
    • B01J2219/007 Simulation or virtual synthesis
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B01 PHYSICAL OR CHEMICAL PROCESSES OR APPARATUS IN GENERAL
    • B01J CHEMICAL OR PHYSICAL PROCESSES, e.g. CATALYSIS OR COLLOID CHEMISTRY; THEIR RELEVANT APPARATUS
    • B01J2219/00 Chemical, physical or physico-chemical processes in general; Their relevant apparatus
    • B01J2219/00274 Sequential or parallel reactions; Apparatus and devices for combinatorial chemistry or for making arrays; Chemical library technology
    • B01J2219/0068 Means for controlling the apparatus of the process
    • B01J2219/00702 Processes involving means for analysing and characterising the products
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics

Definitions

  • the present invention is generally directed to displaying and processing data using a computer, and more particularly directed to visualizing and interactively processing chemical compounds using a computer.
  • a combinatorial chemical library is a collection of diverse chemical compounds generated by either chemical synthesis or biological synthesis by combining a number of chemical “building blocks” such as reagents.
  • a linear combinatorial chemical library such as a polypeptide library is formed by combining a set of chemical building blocks called amino acids in every possible way for a given compound length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical compounds theoretically can be synthesized through such combinatorial mixing of chemical building blocks.
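  • As a rough illustration of this combinatorial growth, the following Python sketch counts and lazily enumerates linear sequences of building blocks. The 20-letter alphabet of standard amino acids and the helper names `library_size` and `enumerate_library` are assumptions made for the example only.

```python
from itertools import islice, product

# One-letter codes of the 20 standard amino acids (assumed building-block set).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(n_blocks: int, length: int) -> int:
    """Number of linear compounds of the given length: n_blocks ** length."""
    return n_blocks ** length


def enumerate_library(blocks: str, length: int):
    """Lazily yield every linear combination of building blocks."""
    for combo in product(blocks, repeat=length):
        yield "".join(combo)


if __name__ == "__main__":
    for length in (2, 3, 5):
        print(length, library_size(len(AMINO_ACIDS), length))
    # Show only the first few members; the full library is never materialized.
    print(list(islice(enumerate_library(AMINO_ACIDS, 3), 5)))
```

  • With 20 building blocks, a compound length of 5 already gives 20^5 = 3,200,000 sequences, which is why the enumeration is written as a generator rather than a list.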
  • a directed diversity library is a large collection of chemical compounds having properties/features/characteristics that match some prescribed properties.
  • the generation, analysis, and processing of directed diversity libraries are described in U.S. Pat. Nos. 5,463,564; 5,574,656; and 5,684,711, and pending U.S. Application titled “SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR IDENTIFYING CHEMICAL COMPOUNDS HAVING DESIRED PROPERTIES,” Ser. No. 10/170,628 all of which are herein incorporated by reference in their entireties.
  • the present invention is directed to a system, method, and computer program product for visualizing and interactively analyzing data relating to chemical compounds.
  • the invention operates as follows.
  • a user selects a plurality of compounds to map, and also selects a method for evaluating similarity/dissimilarity between the selected compounds.
  • a non-linear map is generated in accordance with the selected compounds and the selected method.
  • the non-linear map has a point for each of the selected compounds, wherein a distance between any two points is representative of similarity/dissimilarity between the corresponding compounds.
  • a portion of the non-linear map is then displayed. Users are enabled to interactively analyze compounds represented in the non-linear map.
  • FIG. 1 illustrates a block diagram of a computing environment according to an embodiment of the invention.
  • FIG. 2 is a block diagram of a computer useful for implementing components of the invention.
  • FIG. 3 is a flowchart representing the operation of the invention in visualizing and interactively processing non-linear maps according to an embodiment of the invention.
  • FIG. 4 is a flowchart representing the manner in which a non-linear map is generated according to an embodiment of the invention.
  • FIG. 5 illustrates a structure browser window according to an embodiment of the invention.
  • FIG. 6 illustrates a compound visualization non-linear map window according to an embodiment of the invention.
  • FIG. 7 is used to describe a zoom function of the present invention.
  • FIG. 8 illustrates a dialog used to adjust properties of a set containing one or more compounds.
  • FIGS. 9 and 10 are used to describe the compound visualization non-linear map window according to an embodiment of the invention.
  • FIG. 11 is a flowchart illustrating the operation of the invention where a compound visualization non-linear map window is used as a source in an interactive operation.
  • FIG. 12 is a flowchart illustrating the operation of the invention where a compound visualization non-linear map window is used as a target in an interactive operation.
  • FIG. 13 conceptually illustrates an interactive operation where a compound visualization non-linear map window is used as a source.
  • FIG. 14 conceptually illustrates an interactive operation where a compound visualization non-linear map window is used as a target.
  • the present invention is directed to a computer-based system, method, and/or computer program product for visualizing and analyzing chemical data using interactive multi-dimensional (such as 2- and/or 3-dimensional) non-linear maps.
  • the invention employs a suite of non-linear mapping algorithms to represent chemical compounds as objects in preferably 2D or 3D Euclidean space.
  • the distances between objects in that space represent the similarities and/or dissimilarities of the corresponding compounds (relative to selected properties or features of the compounds) computed by some prescribed method.
  • the resulting maps are displayed on a suitable graphics device (such as a graphics terminal, for example), and interactively analyzed to reveal relationships between the data, and to initiate an array of tasks related to these compounds.
  • FIG. 1 is a block diagram of a computing environment 102 according to a preferred embodiment of the present invention.
  • a chemical data visualization and interactive analysis module 104 includes a map generating module 106 and user interface modules 108 .
  • the map generating module 106 determines distances between chemical compounds relative to one or more selected properties or features (herein sometimes called evaluation properties or features) of the compounds.
  • the map generating module 106 performs this function by retrieving and analyzing data on chemical compounds and reagents from reagent and compound databases 122 . These reagent and compound databases 122 store information on chemical compounds and reagents of interest.
  • the reagent and compound databases 122 are part of databases 120 , which communicate with the chemical data visualization and interactive analysis module 104 via a communication medium 118 .
  • the communication medium 118 is preferably any type of data communication means, such as a data bus, a computer network, etc.
  • the user interface modules 108, which include a map viewer 112 and optionally a structure browser 110, display a preferably 2D or 3D non-linear map on a suitable graphics device.
  • the non-linear map includes objects that represent the chemical compounds, where the distances between the objects in the non-linear map are those distances determined by the map generating module 106 .
  • the user interface modules 108 enable human operators to interactively analyze and process the information in the non-linear map so as to reveal relationships between the data, and to initiate an array of tasks related to the corresponding compounds.
  • the user interface modules 108 enable users to organize compounds as collections (representing, for example, a combinatorial library).
  • Information pertaining to compound collections is preferably stored in a collection database 124.
  • Information on reagents that are mixed to form compound collections is preferably stored in a library database 126.
  • Input Device(s) 114 receive input (such as data, commands, queries, etc.) from human operators and forward such input to, for example, the chemical data visualization and interactive analysis module 104 via the communication medium 118 .
  • Any well known, suitable input device can be used in the present invention, such as a keyboard, pointing device (mouse, roller ball, track ball, light pen, etc.), touch screen, voice recognition, etc.
  • User input can also be stored and then retrieved, as appropriate, from data/command files.
  • Output Device(s) 116 output information to human operators. Any well known, suitable output device can be used in the present invention, such as a monitor, a printer, a floppy disk drive or other storage device, a text-to-speech synthesizer, etc.
  • the present invention enables the chemical data visualization and interactive analysis module 104 to interact with a number of other modules, including but not limited to one or more map viewers 112 , NMR (nuclear magnetic resonance) widget/module 130 , structure viewers 110 , MS (mass spectrometry) widget/module 134 , spreadsheets 136 , QSAR (Quantitative Structure-Activity Relationships) module 138 , an experiment planner 140 , property prediction programs 142 , active site docker 144 , etc. These modules communicate with the chemical data visualization and interactive analysis module 104 via the communication medium 118 .
  • Components shown in the computing environment 102 of FIG. 1 can be implemented using one or more computers, such as an example computer 202 shown in FIG. 2 .
  • the computer 202 includes one or more processors, such as processor 204 .
  • Processor 204 is connected to a communication bus 206 .
  • Various software embodiments are described in terms of this example computer system. After reading this description, it will become apparent to a person skilled in the relevant art(s) how to implement the invention using other computer systems and/or computer architectures.
  • Computer 202 also includes a main memory 208 , preferably random access memory (RAM), and can also include one or more secondary storage devices 210 .
  • Secondary storage devices 210 can include, for example, a hard disk drive 212 and/or a removable storage drive 214 , representing a floppy disk drive, a magnetic tape drive, an optical disk drive, etc.
  • Removable storage drive 214 reads from and/or writes to a removable storage unit 216 in a well known manner.
  • Removable storage unit 216 represents a floppy disk, magnetic tape, optical disk, etc. which is read by and written to by removable storage drive 214 .
  • Removable storage unit 216 includes a computer usable storage medium having stored therein computer software and/or data.
  • the computer 202 can include other similar means for allowing computer programs or other instructions to be loaded into computer 202 .
  • Such means can include, for example, a removable storage unit 220 and an interface 218 .
  • Examples of such can include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 220 and interfaces 218 which allow software and data to be transferred from the removable storage unit 220 to computer 202 .
  • the computer 202 can also include a communications interface 222 .
  • Communications interface 222 allows software and data to be transferred between computer 202 and external devices. Examples of communications interface 222 include, but are not limited to a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, etc.
  • Software and data transferred via communications interface 222 are in the form of signals which can be electronic, electromagnetic, optical or other signals capable of being received by communications interface 222 .
  • the term “computer program product” is used to generally refer to media such as removable storage units 216, 220, a hard drive 212 that can be removed from the computer 202, and signals carrying software received by the communications interface 222.
  • These computer program products are means for providing software to the computer 202 .
  • Computer programs are stored in main memory and/or secondary storage devices 210 . Computer programs can also be received via communications interface 222 . Such computer programs, when executed, enable the computer 202 to perform the features of the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor 204 to perform the features of the present invention. Accordingly, such computer programs represent controllers of the computer 202 .
  • the software can be stored in a computer program product and loaded into computer 202 using removable storage drive 214 , hard drive 212 , and/or communications interface 222 .
  • the control logic when executed by the processor 204 , causes the processor 204 to perform the functions of the invention as described herein.
  • in another embodiment, the automated portion of the invention is implemented primarily in hardware using, for example, hardware components such as application specific integrated circuits (ASICs).
  • in yet another embodiment, the invention is implemented using a combination of both hardware and software.
  • the computer 202 can be any suitable computer, such as a computer system running an operating system supporting a graphical user interface and a windowing environment.
  • a suitable computer system is a Silicon Graphics, Inc. (SGI) workstation/server, a Sun workstation/server, a DEC workstation/server, an IBM workstation/server, an IBM compatible PC, an Apple Macintosh, or any other suitable computer system, such as one using one or more processors from the Intel Pentium family, such as Pentium Pro or Pentium II.
  • Suitable operating systems include, but are not limited to, IRIX, OS/Solaris, Digital Unix, AIX, Microsoft Windows 95/NT, Apple Mac OS, or any other operating system supporting a graphical user interface and a windowing environment.
  • the program may be implemented and run on a Silicon Graphics Octane workstation running the IRIX 6.4 operating system, and using the Motif graphical user interface based on the X Window System.
  • MDS: multidimensional scaling
  • NLM: non-linear mapping
  • MDS and NLM were introduced by Torgerson, Psychometrika, 17:401 (1952); Kruskal, Psychometrika, 29:115 (1964); and Sammon, IEEE Trans. Comput., C-18:401 (1969) as a means to generate low-dimensional representations of psychological data.
  • Multidimensional scaling and non-linear mapping are reviewed in Schiffman, Reynolds and Young, Introduction to Multidimensional Scaling , Academic Press, New York (1981); Young and Hamer, Multidimensional Scaling: History, Theory and Applications , Erlbaum Associates, Inc., Hillsdale, N.J. (1987); and Cox and Cox, Multidimensional Scaling , Number 59 in Monographs in Statistics and Applied Probability , Chapman-Hall (1994). The contents of these publications are incorporated herein by reference in their entireties.
  • This projection, which can only be made approximately, is carried out in an iterative fashion by minimizing an error function which measures the difference between the distance matrices of the original and projected vector sets.
  • Several such error functions have been proposed, most of which are of the least-squares type, including Kruskal's ‘stress’:
  • $S = \sqrt{\dfrac{\sum_{i<j} (\delta_{ij} - d_{ij})^2}{\sum_{i<j} \delta_{ij}^2}}$ (EQ. 1)
  • where $d_{ij}$ is the distance between the i-th and j-th objects in the original space, and $\delta_{ij} = \lVert x_i - x_j \rVert$ is the Euclidean distance between the images $x_i$ and $x_j$ on the display plane.
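  • A minimal sketch of evaluating a least-squares error function of this kind is given below, using Kruskal's stress in its usual form: the square root of the summed squared differences between map distances and original distances, divided by the summed squared map distances. The function name `kruskal_stress` and the dictionary layout of the original distances are assumptions made for illustration.

```python
import math
from itertools import combinations


def kruskal_stress(original, coords):
    """Kruskal's stress between original distances and map distances.

    original: dict mapping (i, j) with i < j to the original distance d_ij.
    coords:   list of coordinate tuples, one image x_i per object.
    """
    num = 0.0
    den = 0.0
    for i, j in combinations(range(len(coords)), 2):
        delta = math.dist(coords[i], coords[j])  # map distance delta_ij
        num += (delta - original[(i, j)]) ** 2
        den += delta ** 2
    # den is zero only if all images coincide; that case is not handled here.
    return math.sqrt(num / den)
```

  • A stress of zero means the map reproduces every original distance exactly; larger values indicate a less faithful projection. Note that every pair is visited, so the cost grows with the square of the number of objects.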
  • the solution is found in an iterative fashion by (1) computing or retrieving from a database the distances $d_{ij}$; (2) initializing the images $x_i$; (3) computing the distances of the images $\delta_{ij}$ and the value of the error function (e.g. S, E or K in EQ. 1–3); …
  • $\Delta_{pq}(m) = \dfrac{\partial E(m)}{\partial x_{pq}(m)} \bigg/ \left| \dfrac{\partial^2 E(m)}{\partial x_{pq}(m)^2} \right|$ (EQ. 5)
  • the partial derivatives in EQ. 5 are given by:
  • the general refinement paradigm described in Section 4.1 is suitable for relatively small data sets, but has one important limitation that renders it impractical for large data sets. This limitation stems from the fact that the computational effort required to compute the gradients scales with the square of the size of the data set. For relatively large data sets, this quadratic time complexity makes even a partial refinement intractable.
  • to avoid this limitation, the approach taken here is to use iterative refinement based on ‘instantaneous’ errors, i.e. errors computed for one pair of points at a time.
  • $f(\cdot)$ in EQ. 8 can assume any functional form. Ideally, this function should try to minimize the difference between the actual and target distance between the i-th and j-th points. For example, $f(\cdot)$ may be given by EQ. 9:
  • where t is the iteration number, $\delta_{ij} = \lVert x_i(t) - x_j(t) \rVert$, and $\lambda(t)$ is an adjustable parameter, referred to hereafter as the ‘learning rate.’
  • the learning rate $\lambda(t)$ in EQ. 9 plays a key role in ensuring convergence. If $\lambda$ is too small, the coordinate updates are small, and convergence is slow. If, on the other hand, $\lambda$ is too large, the rate of learning may be accelerated, but the non-linear map may become unstable (i.e. oscillatory).
  • $\lambda$ ranges in the interval [0, 1] and may be fixed, or it may decrease monotonically during the refinement process.
  • $\lambda$ may also be a function of i, j and/or $d_{ij}$, and can be used to apply different weights to certain objects, distances and/or distance pairs. For example, $\lambda$ may be computed by EQ. 10:
  • the embedding procedure described above does not guarantee convergence to the global minimum (i.e., the most faithful embedding in a least-squares sense). If so desired, the refinement process may be repeated a number of times from different starting configurations and/or random number seeds. It should also be pointed out that the absolute coordinates in the non-linear map carry no physical significance. What is important are the relative distances between points, and the general structure and topology of the data (presence, density and separation of clusters, etc.).
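  • The pairwise refinement described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the specific update rule (move both points of a randomly chosen pair along the line joining them, by half the learning rate times the relative distance error), the linear learning-rate schedule, and the name `refine_map` are all choices made for this example.

```python
import math
import random


def refine_map(target, dim=2, steps=20000, lam_start=1.0, lam_end=0.01, seed=0):
    """Embed n objects in `dim` dimensions from an n x n target distance matrix."""
    n = len(target)
    rng = random.Random(seed)
    # Random initial configuration of the images x_i.
    x = [[rng.random() for _ in range(dim)] for _ in range(n)]
    for t in range(steps):
        # Learning rate decreasing linearly from lam_start to lam_end.
        lam = lam_start + (lam_end - lam_start) * t / max(steps - 1, 1)
        i, j = rng.sample(range(n), 2)      # one random pair per step
        delta = math.dist(x[i], x[j])       # current map distance
        if delta == 0.0:
            continue                        # coincident images: skip the pair
        # Fraction of the separation vector by which each point is moved.
        scale = 0.5 * lam * (target[i][j] - delta) / delta
        for k in range(dim):
            shift = scale * (x[i][k] - x[j][k])
            x[i][k] += shift
            x[j][k] -= shift
    return x
```

  • Each step touches a single pair, so the cost per step does not depend on the size of the data set. Changing `seed` gives a different starting configuration, which is one way to repeat the refinement as suggested above; only the relative distances in the returned coordinates are meaningful.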
  • the method described above is ideally suited for both metric and non-metric scaling.
  • the latter is particularly useful when the (dis)similarity measure is not a true metric, i.e. it does not obey the distance postulates and, in particular, the triangle inequality (such as the Tanimoto coefficient, for example).
  • although an ‘exact’ projection is only possible when the distance matrix is positive definite, meaningful projections can still be obtained even when this criterion is not satisfied.
  • the overall quality of the projection is determined by a sum-of-squares error function such as those shown in EQ. 1–3.
  • the distances d ij between chemical compounds are computed according to some prescribed measure of molecular ‘similarity’.
  • This similarity can be based on any combination of properties or features of the compounds.
  • the similarity measure may be based on structural similarity, chemical similarity, physical similarity, biological similarity, and/or some other type of similarity measure which can be derived from the structure or identity of the compounds.
  • any similarity measure can be used to construct the non-linear map.
  • the properties or features that are being used to evaluate similarity or dissimilarity among compounds are sometimes herein collectively called “evaluation properties.”
  • the similarity measure may be derived from a list of physical, chemical and/or biological properties (i.e., evaluation properties) associated with a set of compounds.
  • the compounds are represented as vectors in multi-variate property space, and their similarity may be computed by some geometrical distance measure.
  • the property space is defined using one or more molecular features (descriptors).
  • molecular features may include topological indices, physicochemical properties, electrostatic field parameters, volume and surface parameters, etc.
  • these features may include, but are not limited to, molecular volume and surface areas, dipole moments, octanol-water partition coefficients, molar refractivities, heats of formation, total energies, ionization potentials, molecular connectivity indices, 2D and 3D auto-correlation vectors, 3D structural and/or pharmacophoric parameters, electronic fields, etc.
  • molecular features may include the observed biological activities of a set of compounds against an array of biological targets such as enzymes or receptors (also known as affinity fingerprints).
  • any vectorial representation of chemical data can be used in the present invention.
  • a distance measure is some algorithm or technique used to determine the difference between compounds based on the selected evaluation properties. The particular distance measure that is used in any given situation depends, at least in part, on the set of values that the evaluation properties can take.
  • a suitable distance measure is the Minkowski metric, shown in EQ. 12:
  • $d_{ij} = \left( \sum_{k} \lvert x_{ik} - x_{jk} \rvert^{r} \right)^{1/r}$ (EQ. 12)
  • where k is used to index the elements of the property vector, and $r \in [1, \infty)$.
  • for r = 1, EQ. 12 is the city-block or Manhattan metric.
  • for r = 2, EQ. 12 is the ordinary Euclidean metric.
  • for r → ∞, EQ. 12 is the maximum of the absolute coordinate distances, also referred to as the ‘dominance’ metric, the ‘sup’ metric, or the ‘ultrametric’ distance.
  • for any value of $r \in [1, \infty)$, the Minkowski metric is a true metric, i.e. it obeys the distance postulates and, in particular, the triangle inequality.
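  • For illustration, the Minkowski metric can be written as a short Python function. The name `minkowski` and the handling of an infinite exponent as the maximum coordinate difference are choices made for this sketch.

```python
import math


def minkowski(x, y, r=2.0):
    """Minkowski distance between two equal-length property vectors."""
    if len(x) != len(y):
        raise ValueError("property vectors must have the same length")
    if r < 1:
        raise ValueError("r must be at least 1 for a true metric")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(r):
        # Limiting case: the largest absolute coordinate difference.
        return max(diffs)
    return sum(d ** r for d in diffs) ** (1.0 / r)


if __name__ == "__main__":
    a, b = (0.0, 0.0), (3.0, 4.0)
    print(minkowski(a, b, 1))         # city-block
    print(minkowski(a, b, 2))         # Euclidean
    print(minkowski(a, b, math.inf))  # dominance
```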
  • the evaluation properties of the compounds may be represented in a binary form (i.e., either a compound has or does not have an evaluation property), where each bit is used to indicate the presence or absence (or potential presence or absence) of some molecular feature or characteristic.
  • compounds may be encoded using substructure keys where each bit is used to denote the presence or absence of a specific structural feature or pattern in the target molecule.
  • Such features include, but are not limited to, the presence, absence or minimum number of occurrences of a particular element (e.g. the presence of at least 1, 2 or 3 nitrogen atoms), unusual or important electronic configurations and atom types (e.g.
  • the evaluation properties of compounds may be encoded in the form of binary fingerprints, which do not depend on a predefined fragment or feature dictionary to perform the bit assignment. Instead, every pattern in the molecule up to a predefined limit is systematically enumerated, and serves as input to a hashing algorithm that turns ‘on’ a small number of bits at pseudo-random positions in the bitmap. Although it is conceivable that two different molecules may have exactly the same fingerprint, the probability of this happening is extremely small for all but the simplest cases. Experience suggests that these fingerprints contain sufficient information about the molecular structures to permit meaningful similarity comparisons.
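  • The hashing idea can be sketched in a few lines of Python. This is a toy illustration only: real fingerprints enumerate patterns (such as atom paths) in the molecular graph, whereas here the "patterns" are simply substrings of a line-notation string. The function name, the use of SHA-256, the 64-bit width, and setting two bits per pattern are all assumptions made for the example.

```python
import hashlib


def hashed_fingerprint(notation, n_bits=64, max_len=4, bits_per_pattern=2):
    """Return an integer bitmap built by hashing every substring pattern.

    bits_per_pattern must not exceed 16, since two digest bytes are
    consumed per bit position and SHA-256 yields 32 bytes.
    """
    fp = 0
    for size in range(1, max_len + 1):
        for start in range(len(notation) - size + 1):
            pattern = notation[start:start + size]
            digest = hashlib.sha256(pattern.encode("utf-8")).digest()
            for b in range(bits_per_pattern):
                # Two digest bytes select one pseudo-random bit position.
                pos = int.from_bytes(digest[2 * b:2 * b + 2], "big") % n_bits
                fp |= 1 << pos
    return fp


if __name__ == "__main__":
    for s in ("CCO", "CCN", "c1ccccc1"):
        print(s, format(hashed_fingerprint(s), "064b"))
```

  • Because no fragment dictionary is involved, any input can be encoded, at the cost that an individual bit no longer corresponds to one identifiable feature.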
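  • a number of similarity (distance) measures can be used with binary descriptors (i.e., where evaluation properties are binary or binary fingerprints). The most frequently used ones are the normalized Hamming distance:
  • $H = \dfrac{\lvert \mathrm{XOR}(x, y) \rvert}{N}$ (EQ. 13)
  • which measures the number of bits that are different between x and y, the Tanimoto or Jaccard coefficient:
  • $T = \dfrac{\lvert \mathrm{AND}(x, y) \rvert}{\lvert \mathrm{IOR}(x, y) \rvert}$ (EQ. 14)
  • which is a measure of the number of substructures shared by two molecules relative to the ones they could have in common, and the Dice coefficient:
  • $D = \dfrac{2\,\lvert \mathrm{AND}(x, y) \rvert}{\lvert x \rvert + \lvert y \rvert}$ (EQ. 15)
  • here $\lvert \cdot \rvert$ denotes the number of bits that are ‘on’, N is the length of the binary sets in bits, and NOT(y) is the binary complement of y, so that $\lvert \mathrm{XOR}(x, \mathrm{NOT}(y)) \rvert$ represents the number of bits that are identical in x and y (either 1's or 0's).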
  • the Euclidean distance is a good measure of similarity when the binary sets are relatively rich, and is mostly used in situations in which similarity is measured in a relative sense.
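  • The binary measures named above can be computed directly on integer bitmaps, as in the following sketch. The function names are illustrative, and returning 1.0 for the Tanimoto and Dice coefficients of two empty bitmaps is a convention assumed here, not something stated in the text.

```python
import math


def popcount(v):
    """Number of bits that are 'on' in a non-negative integer bitmap."""
    return bin(v).count("1")


def hamming(x, y, n_bits):
    """Normalized Hamming distance: fraction of bit positions that differ."""
    return popcount(x ^ y) / n_bits


def tanimoto(x, y):
    """Tanimoto (Jaccard) coefficient: shared 'on' bits over the union."""
    union = popcount(x | y)
    return popcount(x & y) / union if union else 1.0


def dice(x, y):
    """Dice coefficient: twice the shared 'on' bits over the total 'on' bits."""
    total = popcount(x) + popcount(y)
    return 2 * popcount(x & y) / total if total else 1.0


def binary_euclidean(x, y):
    """Euclidean distance between bit vectors: root of the differing-bit count."""
    return math.sqrt(popcount(x ^ y))


if __name__ == "__main__":
    x, y = 0b1100, 0b1010
    print(hamming(x, y, 4), tanimoto(x, y), dice(x, y), binary_euclidean(x, y))
```

  • Hamming and Euclidean are distances (zero for identical bitmaps), while Tanimoto and Dice are similarities (one for identical bitmaps), so a similarity is typically converted, for example as one minus the coefficient, before it is used as a map distance.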
  • the distance between two compounds is determined using a binary or multivariate representation.
  • the system of the present invention is not limited to this embodiment.
  • the similarity between two compounds may be determined by comparing the shapes of the molecules using a suitable 3-dimensional alignment method, or it may be inferred by a similarity model defined according to a prescribed procedure.
  • one such similarity model may be a neural network trained to predict a similarity coefficient given a suitably encoded pair of compounds.
  • Such a neural network may be trained using a training set of structure pairs and a known similarity coefficient for each such pair, as determined by user input, for example.
  • the features may be scaled differently to reflect their relative importance in assessing the proximity between two compounds. For example, suppose the user has selected two evaluation properties, Property A and Property B. If Property A has a weight of 2, and Property B has a weight of 10, then Property B will have five times the impact of Property A on the distance calculation.
  • EQ. 12 may be replaced by EQ. 17:
  • $w_k$ is the weight of the k-th property.
  • An example of such a weighting factor is a normalization coefficient. However, other weighting schemes may also be used.
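  • The effect of property weights can be illustrated with a short sketch. It assumes, for the purposes of this example only, a Minkowski-type distance in which each weight multiplies the corresponding property difference before the exponent is applied; the function name `weighted_minkowski` and the parameter `r` are hypothetical.

```python
def weighted_minkowski(x, y, weights, r=2.0):
    """Distance between property vectors x and y with per-property weights."""
    total = sum((w * abs(a - b)) ** r for a, b, w in zip(x, y, weights))
    return total ** (1.0 / r)

# Property A has weight 2 and Property B has weight 10.
weights_ab = [2.0, 10.0]
differs_in_a = weighted_minkowski([0.0, 0.0], [1.0, 0.0], weights_ab)
differs_in_b = weighted_minkowski([0.0, 0.0], [0.0, 1.0], weights_ab)
```

  • Under this assumption, a unit difference in Property B contributes five times the distance of a unit difference in Property A, matching the example given above.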
  • the scaling (weights) need not be uniform throughout the entire map, i.e. the resulting map need not be isomorphic.
  • maps derived from uniform weights shall be referred to as globally weighted (isomorphic), whereas maps derived from non-uniform weights shall be referred to as locally weighted (non-isomorphic).
  • the distances on the non-linear map reflect a local measure of similarity. That is, what determines similarity in one domain of the non-linear map is not necessarily the same as what determines similarity in another domain of the non-linear map.
  • locally-weighted maps may be used to reflect similarities derived from a locally-weighted case-based learning algorithm.
  • Locally-weighted learning uses locally weighted training to average, interpolate between, extrapolate from, or otherwise combine training data. Most learning methods (also referred to as modeling or prediction methods) construct a single model to fit all the training data.
  • Local models attempt to fit the training data in a local region around the location of the query. Examples of local models include nearest neighbors, weighted average, and locally weighted regression. Locally-weighted learning is reviewed in Vapnik, in Advances in Neural Information Processing Systems, 4:831, Morgan-Kaufman, San Mateo, Calif. (1982); Bottou and Vapnik, Neural Computation, 4(6):888 (1992); and Vapnik and Bottou, Neural Computation, 5(6):893 (1993), all of which are incorporated herein by reference in their entireties.
  • a non-linear map from a distance matrix which is not strictly symmetric, i.e. a distance matrix where $d_{ij} \neq d_{ji}$.
  • a potential use of this approach is in situations where the distance function is defined locally, e.g. in a locally weighted model using a point-based local distance function.
  • each training case has associated with it a distance function and the values of the corresponding parameters.
  • the distance between two points is evaluated twice, using the local distance functions of the respective points. The resulting distances are averaged, and are used as input in the non-linear mapping algorithm described above. If the point-based local distance functions vary in some continuous or semi-continuous fashion throughout the feature space, this approach could potentially lead to a meaningful projection.
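  • The averaging of two point-based local distances may be sketched as follows. The sketch assumes that each point carries its own vector of property weights and that the local distance function is a weighted Euclidean distance; both choices, and the name `symmetrized_distance`, are assumptions of this example.

```python
import math

def symmetrized_distance(i, j, features, local_weights):
    """Average of the distance i-j measured with i's and with j's local weights."""
    def local(owner):
        weights = local_weights[owner]
        return math.sqrt(sum(
            w * (a - b) ** 2
            for w, a, b in zip(weights, features[i], features[j])
        ))
    # Evaluate the distance twice, once per point, then average.
    return 0.5 * (local(i) + local(j))

features = [[0.0, 0.0], [3.0, 4.0]]
local_weights = [[1.0, 1.0], [4.0, 0.0]]  # hypothetical per-point weights
d01 = symmetrized_distance(0, 1, features, local_weights)
```

  • Here the distance is 5 under the weights of the first point and 6 under the weights of the second, so the averaged value supplied to the mapping algorithm is 5.5.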
  • each of the enhancements described below is under the control of the user. That is, the user can elect to perform or not perform each of the enhancements discussed below. Alternatively, the invention can be defined so that the enhancements below are performed automatically unless specifically overridden by the user (or, in some embodiments, the user may not have the option of overriding one or more of the enhancements below).
  • the approach described above for generating the non-linear map may be accelerated by pre-ordering the data using a suitable statistical method.
  • the initial configuration of the points on the non-linear map may be computed using Principal Component Analysis.
  • the initial configuration may be constructed from the first 3 principal components of the feature matrix (i.e. the 3 latent variables which account for most of the variance in the data).
  • this technique can have profound effects on the speed of refinement. Indeed, if a random initial configuration is used, a significant portion of the training time is spent establishing the general structure and topology of the non-linear map, which is typically characterized by large rearrangements. If, on the other hand, the input configuration is partially ordered, the error criterion can be reduced relatively rapidly to an acceptable level.
  • the center of mass of the non-linear map is identified, and concentric shells centered at that point are constructed. A series of regular refinement iterations are then carried out, each time selecting points from within or between these shells. This process is repeated for a prescribed number of cycles. This phase is then followed by a phase of regular refinement using global sampling, and the process is repeated.
  • EQ. 10 and 11 describe a method to ensure that short-range distances are preserved more faithfully than long-range ones through the use of weighting.
  • An alternative (and complementary) approach is to ensure that points at close separation are sampled more extensively than points at long separation.
  • a preferred embodiment is to use an alternating sequence of global and local refinement cycles, similar to the one described above. In this embodiment, a phase of global refinement is initially carried out. At the end of this phase, the resulting non-linear map is partitioned into a regular grid, and the points (objects) in each cell are subjected to a phase of local refinement (i.e. only points from within the same cell are compared and refined).
  • the number of sampling steps in each cell should be proportional to the number of points contained in that cell.
  • This process is highly parallelizable.
  • This local refinement phase is then followed by another global refinement phase, and the process is repeated for a prescribed number of cycles, or until the embedding error is minimized within a prescribed tolerance.
  • the grid method may be replaced by another suitable method for identifying proximal points, such as a k-d tree, for example.
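  • The grid-based local refinement phase might be organized along the following lines. This is a sketch under stated assumptions: the helper names `partition_into_grid` and `local_pairs` are hypothetical, the cell size is a free parameter, and the coordinate update applied to each sampled pair is omitted because it is defined elsewhere in this description.

```python
import math
import random
from collections import defaultdict

def partition_into_grid(coords, cell_size):
    """Group point indices by the regular grid cell that contains them."""
    cells = defaultdict(list)
    for index, point in enumerate(coords):
        key = tuple(math.floor(c / cell_size) for c in point)
        cells[key].append(index)
    return dict(cells)

def local_pairs(cells, steps_per_point, rng):
    """Yield pairs drawn within a cell; steps scale with the cell's population."""
    for members in cells.values():
        if len(members) < 2:
            continue  # a lone point has no local partner
        for _ in range(steps_per_point * len(members)):
            yield tuple(rng.sample(members, 2))

grid_coords = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.9), (1.5, 0.2), (1.7, 0.4), (2.5, 2.5)]
grid_cells = partition_into_grid(grid_coords, cell_size=1.0)
grid_pairs = list(local_pairs(grid_cells, steps_per_point=2, rng=random.Random(0)))
```

  • Since the cells are independent of one another during the local phase, each cell's pairs could be processed by a separate worker, which is one way to read the remark above that the process is highly parallelizable.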
  • the approach and techniques described herein may be used for incremental refinement of a map. That is, starting from an organized non-linear map of a set of objects or points (compounds), a new set of objects (compounds) may be added without modification of the original map. Strictly speaking, this is statistically acceptable if the new set of objects is significantly smaller than the original set.
  • the new set of objects may be ‘diffused’ into the existing map, using a modification of the algorithm described above.
  • EQ. 8 and 9 can be used to update only the new objects.
  • the sampling procedure ensures that the selected pairs contain at least one object from the incoming set. That is, two objects are selected at random so that at least one of these objects belongs to the incoming set.
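  • A minimal sketch of this sampling rule follows. It assumes that objects are indexed so that the existing set occupies indices 0 to `n_existing - 1` and the incoming set occupies the remaining indices; the function names are hypothetical.

```python
import random

def sample_incremental_pair(n_existing, n_incoming, rng):
    """Pick two distinct objects so that at least one belongs to the incoming set."""
    total = n_existing + n_incoming
    new = rng.randrange(n_existing, total)   # always from the incoming set
    other = rng.randrange(total - 1)         # any object except `new`
    if other >= new:
        other += 1
    return new, other

def is_movable(index, n_existing):
    # Only incoming objects are updated; the original map is left unmodified.
    return index >= n_existing

incremental_rng = random.Random(0)
incremental_pairs = [sample_incremental_pair(100, 5, incremental_rng) for _ in range(1000)]
```

  • When the second object of a pair belongs to the existing set, it serves only as a fixed reference for positioning the incoming object.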
  • the user selects one or more compounds to map in a new non-linear map.
  • the user may select compounds to map by retrieving a list of compounds from a file, by manually typing in a list of compounds, and/or by using a graphical user interface (GUI) such as the structure browser shown in FIG. 5 (described below).
  • the invention envisions other means for enabling the user to specify compounds to display in a non-linear map.
  • the user can also select compounds from an already existing compound visualization non-linear map (in one embodiment, the user drags and drops the compounds from the old compound visualization non-linear map to the new compound visualization non-linear map—drag and drop operations according to the present invention are described below).
  • In step 306, the user selects a method to be used for evaluating the molecular similarity or dissimilarity between the compounds selected in step 304.
  • the similarity/dissimilarity between the compounds selected in step 304 is determined (in step 308 ) based on a prescribed set of evaluation properties.
  • evaluation properties can be any properties related to the structure, function, or identity of the compounds selected in step 304 .
  • Evaluation properties include, but are not limited to, structural properties, functional properties, chemical properties, physical properties, biological properties, etc., of the compounds selected in step 304 .
  • the selected evaluation properties may be scaled differently to reflect their relative importance in assessing the proximity (i.e., similarity or dissimilarity) between two compounds. Accordingly, also in step 306, the user selects a scale factor for each of the selected evaluation properties. Note that such selection of scale factors is optional. The user need not select a scale factor for each selected evaluation property. If the user does not select a scale factor for a given evaluation property, then that evaluation property is given a default scale factor, such as unity.
  • step 306 the user can elect to retrieve similarity/dissimilarity values pertaining to the compounds selected in step 304 from a source, such as a database. These similarity/dissimilarity values in the database were previously generated.
  • the user in step 306 can elect to determine similarity/dissimilarity values using any well-known technique or procedure.
  • step 308 the map generating module 106 generates a new non-linear map.
  • This new non-linear map includes a point for each of the compounds selected in step 304 . Also, in this new non-linear map, the distance between any two points is representative of their similarity/dissimilarity.
  • the manner in which the map generating module 106 generates the new non-linear map shall now be further described with reference to a flowchart 402 in FIG. 4 .
  • In step 404, coordinates on the new non-linear map of points corresponding to the compounds selected in step 304 are initialized.
  • In step 406, two of the compounds i, j selected in step 304 are selected for processing.
  • In step 408, the similarity/dissimilarity $d_{ij}$ between compounds i, j is determined based on the method selected by the user in step 306.
  • In step 410, based on the similarity/dissimilarity $d_{ij}$ determined in step 408, coordinates of points corresponding to compounds i, j on the non-linear map are obtained.
  • In step 412, training/learning parameters are updated.
  • In step 414, a decision is made whether or not to terminate. If a decision is made not to terminate at this point, then control returns to step 406. Otherwise, step 416 is performed.
  • In step 416, the non-linear map is output (i.e., generation of the non-linear map is complete).
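  • The control flow of flowchart 402 may be sketched as the loop below. The sketch is illustrative only: the pairwise coordinate update shown is a generic stochastic correction that moves both points along their connecting line toward the target separation, and it is an assumption of this example rather than a reproduction of the update equations referenced elsewhere in this description. The random initialization, the linearly decaying learning rate, the fixed step count used as the termination test, and all names are likewise hypothetical.

```python
import math
import random

def refine_map(n, target_distance, dim=2, steps=20000,
               lr_start=1.0, lr_end=0.01, seed=0):
    rng = random.Random(seed)
    # Step 404: initialize the coordinates (randomly, in this sketch).
    coords = [[rng.random() for _ in range(dim)] for _ in range(n)]
    lr = lr_start
    # Step 414: terminate after a prescribed number of steps.
    for step in range(steps):
        # Step 406: select two compounds i, j.
        i, j = rng.sample(range(n), 2)
        # Step 408: determine their similarity/dissimilarity d_ij.
        target = target_distance(i, j)
        # Step 410: adjust the coordinates of both points (assumed update rule).
        delta = [a - b for a, b in zip(coords[i], coords[j])]
        current = math.sqrt(sum(c * c for c in delta))
        scale = 0.5 * lr * (target - current) / (current + 1e-9)
        for k in range(dim):
            coords[i][k] += scale * delta[k]
            coords[j][k] -= scale * delta[k]
        # Step 412: update the learning parameter.
        lr = lr_start + (lr_end - lr_start) * (step + 1) / steps
    # Step 416: output the map.
    return coords

triangle_targets = {(0, 1): 3.0, (0, 2): 4.0, (1, 2): 5.0}

def triangle_distance(i, j):
    return triangle_targets[(min(i, j), max(i, j))]

triangle_map = refine_map(3, triangle_distance)
```

  • In this toy case the three target distances form a 3-4-5 triangle, which can be embedded exactly in two dimensions, so the distances between the output points should closely reproduce the targets.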
  • step 312 the map viewer 112 displays the new non-linear map on an output device 116 (such as a computer graphics monitor). Examples of non-linear maps being displayed by the map viewer 112 are shown in FIGS. 6 and 7 (described below).
  • step 314 the user interface modules 108 enable operators to interactively analyze and process the compounds represented in the displayed non-linear map. These user interface functions of the present invention are described below.
  • the present invention enables users to modify existing compound visualization non-linear maps (as used herein, the term “compound visualization non-linear map” refers to a rendered non-linear map). For example, users can add additional compounds to the map, remove compounds from the map, highlight compounds on the map, etc. In such cases, pertinent functional steps of flowchart 302 are repeated. For example, steps 304 (selecting compounds to map), 310 (generating the non-linear map), and 312 (displaying the map) are repeated when the user opts to add new compounds to an existing map.
  • the map is incrementally refined and displayed in steps 310 and 312 when adding compounds to an existing compound visualization non-linear map (this incremental refinement is described above).
  • the user interface features of the present invention are described in this section. Various user interface modules and features are described below. Also, various functional/control threads (in the present context, a functional/control thread is a series of actions performed under the control of a user) employing these user interface modules and features are described below. It will be appreciated by persons skilled in the relevant art(s) that the user interface of the present invention is very flexible, varied, and diverse. An operator can employ the user interface of the present invention to perform a wide range of activities with respect to visualizing and interactively analyzing chemical compounds. Accordingly, it should be understood that the functional/control threads described herein are provided for illustrative purposes only. The invention is not limited to these functional/control threads.
  • the invention provides the following capabilities, features, and functions: displaying 2D and/or 3D chemical structures and/or chemical names; displaying compound collections and/or libraries; displaying components of structures (i.e. building blocks) of combinatorial libraries; visualization of compound collections and/or libraries as 2D and/or 3D maps of colored objects.
  • the present invention allows the following: (1) browsing compound collections and/or libraries; (2) selection of individual compounds, collections of compounds and/or libraries of compounds; (3) selection of compounds generated in a combinatorial fashion via selection of their respective building blocks; (4) mapping, visualization, and/or linking of compounds onto and/or from 2D and/or 3D maps; (5) manipulation of the 2D and/or 3D maps such as rotation, resizing, translation, etc.; (6) manipulation of objects on the 2D and/or 3D maps such as changing the appearance of objects (visibility, size, shape, color, etc.), changing position of objects on the map, and/or changing relationships between objects on the map; (7) interactive exploring of the 2D and/or 3D maps such as querying chemical structure, querying distance, selection of individual objects and/or areas of a map, etc.
  • the invention includes a structure browser 110 and a map viewer 112 . At any given time, each of these can have multiple instances depending on the program use.
  • FIG. 5 illustrates a structure browser window 502 generated by the structure browser 110 .
  • the structure browser window 502 includes a frame 504 , a menu pane 506 , and a group of labeled tabbed pages 508 .
  • Each tabbed page holds a molecular spreadsheet or a group of labeled tabbed pages.
  • Each tab is associated with a compound collection (tabs 510 ) or a library, such as a combinatorial library (tabs 512 ). Selecting a collection tab 510 brings up a table of corresponding chemical structures. Selecting a library tab 512 brings up a group of tabbed pages corresponding to the sets of building blocks used to generate the library. Each of the library's tabbed pages works the same way as a compound collection tabbed page. In the example shown in FIG. 5 , the tab 510 called “DDL 0 ” is selected.
  • DDL 0 has three building block tabs 512 , called “Cores,” “Acids,” and “Amines.”
  • the “Acids” collection tab is currently selected, so that a table 522 of the structures of the compounds in the “Acids” collection is shown.
  • the browser window 502 includes a table 522 , a slider 514 , an input field 516 , and two buttons: “Prev Page” 518 and “Next Page” 520 .
  • the slider 514, the input field 516, and the buttons 518, 520 facilitate browsing the content of the Acids table 522. If we consider the content of the table 522 as a contiguous ordered list of chemical structures (compounds or building blocks), then what is shown in the browser window 502 can be considered a window positioned over the list. At any given moment, this window displays the part of the list determined by its position, and the size of the displayed part equals the size of the window, i.e., the number of cells in the table. Initially, the window displays the top of the list.
  • Moving the slider 514 changes the position of the window over the list. Entering a value into the input field 516 specifies the position of the window over the list. Pushing the “Next Page” button 520 moves the window one window size down the list; pushing the “Prev Page” button 518 moves the window one window size up the list.
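  • The window-over-a-list behavior can be sketched with a small class. The class name `PageWindow` and its methods are hypothetical, and the clamping of the position at both ends of the list is an assumption of this example.

```python
class PageWindow:
    """A fixed-size window positioned over an ordered list of structures."""

    def __init__(self, n_items, page_size):
        self.n_items = n_items
        self.page_size = page_size  # number of cells in the table
        self.position = 0           # initially the top of the list

    def set_position(self, position):
        # Used by both the slider and the input field; clamped (assumption).
        last_start = max(self.n_items - self.page_size, 0)
        self.position = min(max(position, 0), last_start)

    def next_page(self):
        self.set_position(self.position + self.page_size)

    def prev_page(self):
        self.set_position(self.position - self.page_size)

    def visible(self):
        end = min(self.position + self.page_size, self.n_items)
        return range(self.position, end)

acids_window = PageWindow(n_items=25, page_size=10)
```

  • With 25 structures and 10 cells, two presses of “Next Page” would leave the window over items 15 through 24 under the clamping assumption.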
  • the user can select compounds shown in the table 522 for various actions.
  • compounds can be selected using the browser window 502 as input for the generation of a new compound visualization non-linear map, or as input for adding compounds to an existing compound visualization non-linear map.
  • Clicking with a left mouse button over a table cell selects or deselects the corresponding compound structure (toggling). Toggling on/off also changes the color of the cell, to indicate which cells have been selected. Selected structures are displayed on a first background color, and non-selected structures are displayed on a second background color. In the example of FIG. 5 , certain cells 523 in table 522 have been selected.
  • the menu pane 506 contains menus: File, Edit, Selection, Map, and/or other menus.
  • the File menu facilitates file open/save, print, and exit operations.
  • The Edit menu contains commands for editing the content of the table 522.
  • the Selection menu provides options to select/deselect (clear) a current compound collection, a collection of building blocks of a combinatorial library, and/or all compounds.
  • the Map menu includes commands for creating a map viewer and for displaying a selection of compounds in that map viewer. The latter option brings up a dialog window ( FIG. 8 ), which allows the user to specify shape, color, and/or size of the selected objects, which will be used to represent the selected compounds on the map.
  • a map viewer window 600 generated by the map viewer 112 is shown in FIG. 6 (also see FIGS. 6–10 and 13).
  • a compound visualization non-linear map is displayed in a render area 614 of the map viewer window 600.
  • the map viewer 112 is based on Open Inventor, a C++ library of objects and methods for interactive 3D graphics, publicly available from Silicon Graphics Inc. Open Inventor relies on OpenGL for fast and flexible rendering of 3D objects. Alternatively, the map viewer 112 can be based on a publicly available VRML viewer. Alternatively, any other software and/or hardware product allowing rendering of 3D objects/scenes can be used.
  • 3D compound visualization maps of chemical compounds are implemented as Open Inventor 3D scene databases.
  • Each map is built as an ordered collection of nodes referred to as a scene graph.
  • Each scene graph includes, but is not limited to, nodes representing cameras (points of view), light sources, 3D shapes, object surface materials, and geometric transformations.
  • Each chemical compound displayed on a map is associated with a 3D shape node, a material node, and a geometric transformation node.
  • The geometric transformation node reflects the compound's coordinates in the map.
  • The 3D shape node and the material node determine the shape, size, and color of the visual object associated with the compound. Combinations of a particular shape, size, and color are used to display compounds grouped by certain criteria, thus allowing easy visual differentiation of different groups/sets of compounds.
  • The 3D shapes of the visual objects in the map include, but are not limited to, point, cube, sphere, and cone. The color of a visual object in the map can be set to any combination of three basic colors: red, green, and blue. Besides the color, the material node can specify the transparency and shininess of a visual object's surface.
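  • The association of each compound with a shape node, a material node, and a transformation node may be pictured with the following language-neutral sketch. It does not use or imitate the Open Inventor API; the class names, fields, and the `build_scene` helper are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Transform:
    translation: tuple  # (x, y, z) coordinates of the compound in the map

@dataclass
class Material:
    rgb: tuple                 # red, green, blue components in [0, 1]
    transparency: float = 0.0
    shininess: float = 0.0

@dataclass
class Shape:
    kind: str          # e.g. "point", "cube", "sphere", "cone"
    size: float = 1.0

@dataclass
class CompoundNode:
    compound_id: str
    transform: Transform
    material: Material
    shape: Shape

def build_scene(coords, set_of, styles):
    """coords: id -> (x, y, z); set_of: id -> set name; styles: set name -> (Shape, Material)."""
    scene = []
    for compound_id, xyz in coords.items():
        shape, material = styles[set_of[compound_id]]
        # Members of one set share the same shape and material objects.
        scene.append(CompoundNode(compound_id, Transform(xyz), material, shape))
    return scene

scene_styles = {
    "acids": (Shape("sphere", 1.0), Material((1.0, 0.0, 0.0))),
    "amines": (Shape("cube"), Material((0.0, 0.0, 1.0))),
}
scene = build_scene(
    {"c1": (0, 0, 0), "c2": (1, 2, 3), "c3": (4, 5, 6)},
    {"c1": "acids", "c2": "acids", "c3": "amines"},
    scene_styles,
)
```

  • Sharing one shape and one material per set is a design choice of this sketch; it means that a change to a set's style is immediately visible on every object of that set.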
  • an object's display properties can represent physical, chemical, biological, and/or other properties of the corresponding compound, such as the cost of the compound, difficulty of synthesizing the compound, whether the compound is available in a compound repository, etc.
  • the larger the molecular weight of a compound, the larger the size of the corresponding object in the display map.
  • Each object or point displayed in the compound visualization non-linear map represents a chemical compound.
  • Objects in the compound visualization non-linear map can be grouped into sets.
  • a compound can be a member of several sets.
  • a different object is displayed in the compound visualization non-linear map for each set of which the compound is a member.
  • the objects in the compound visualization non-linear map that represent the compound as a member of each of the sets may overlap and only the biggest object may be visible.
  • a toggle sets feature (described below) may be used to reveal multiple set membership.
  • the map viewer window 600 includes a frame 602 , a menu pane 604 , and a viewer module preferably implemented as an Open Inventor component (examiner viewer).
  • the viewer module incorporates the following elements: (1) a render area 614 in which the compound visualization non-linear map is displayed; (2) combinations of thumbwheels 608, 610, 612, sliders, and/or viewer function icons/buttons 620, 622, 624, 626, 628, 630, 632; and (3) pop-up menus and dialogs 616, 702, 902 which provide access to all viewer functions, features, and/or properties.
  • the thumbwheels 608 , 610 rotate the compound visualization non-linear map around a reference point of interest. Thumbwheel 610 rotates in the y direction, and thumbwheel 608 rotates in the x direction.
  • the origin of rotation i.e., the camera position
  • the compound visualization non-linear map can also be panned in the screen plane, as well as dollied in and out (forward/backward movement) via thumbwheel 612 .
  • the map viewer window 600 has several different modes or states, e.g. view, pick, panning, dolly, seek, and/or others. Each mode defines a different mouse cursor and how mouse events are interpreted.
  • the view mode mouse motions are translated into rotations of the virtual trackball and corresponding rotations of the compound visualization non-linear map.
  • the view mode is the default mode.
  • the compound visualization non-linear map is translated in the screen plane following the mouse movements.
  • Seek mode allows the user to change the point of rotation (reference point) of a scene by attaching it to an object displayed in the compound visualization non-linear map.
  • Pick mode is used for picking (querying) objects displayed in the compound visualization non-linear map.
  • Picking an object in a 3D scene is achieved by projecting a conical ray from the camera through a point (defined by positioning and clicking the mouse) on the near plane of the view volume. The first object in the scene intersecting with the ray cone is picked.
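  • A simplified version of conical-ray picking can be sketched as follows. The sketch treats every object as a point at its center and ignores object size, which is a simplification assumed for this example; the function name `pick` and the default cone half-angle are hypothetical.

```python
import math

def pick(camera, through, objects, half_angle_deg=1.0):
    """Return the id of the nearest object inside the ray cone, or None."""
    direction = [t - c for t, c in zip(through, camera)]
    norm = math.sqrt(sum(d * d for d in direction))
    direction = [d / norm for d in direction]
    limit = math.radians(half_angle_deg)
    best = None
    for object_id, center in objects.items():
        offset = [p - c for p, c in zip(center, camera)]
        along = sum(o * d for o, d in zip(offset, direction))
        if along <= 0:
            continue  # behind the camera
        distance = math.sqrt(sum(o * o for o in offset))
        angle = math.acos(min(1.0, along / distance))
        # Keep the first object met along the ray, i.e. the nearest one.
        if angle <= limit and (best is None or along < best[0]):
            best = (along, object_id)
    return best[1] if best else None

pick_objects = {"far": (0.0, 0.0, 10.0), "near": (0.01, 0.0, 5.0), "off": (3.0, 0.0, 5.0)}
picked = pick((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), pick_objects)
```

  • In this example the ray runs along the z axis, so the object labeled "near" is returned because it lies inside the cone and is closer to the camera than the object labeled "far".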
  • a pick event an object being picked by pressing the left mouse button over the object
  • a small window displaying the corresponding compound pops up while the left mouse button is pressed (see, for example, window 1302 in FIG. 13 ).
  • the window will automatically disappear when the button is released. In order to keep the window on the screen, it is necessary to hold the shift key while releasing the mouse button.
  • Switching between the above-described modes can be achieved by selecting a mode from a pop-up menu, by clicking on a shortcut icon/button, and/or by pressing and/or holding a combination of mouse buttons and/or keys on a keyboard.
  • selecting a pointed arrow icon/button 620 switches to the pick mode.
  • Selecting a hand icon/button 622 switches to the view mode; selecting a target icon/button 624 switches to the seek mode.
  • Pressing and holding the middle mouse button switches to the panning mode. Pressing and holding the left and middle mouse buttons simultaneously switches to the dolly mode.
  • thumbwheels and/or sliders e.g. turning the dolly thumbwheel 612 moves the scene in and out of the screen. Also, turning the X and/or Y rotation thumbwheels 608 , 610 rotate the scene accordingly around the point of rotation.
  • the right mouse button is reserved for the pop-up menus 616 , 902 . Pressing the right mouse button anywhere over an empty rendering area brings up the viewer pop-up menu 902 . Pressing the right mouse button over an object brings up the object pop-up menu 616 .
  • the viewer pop-up menu 902 allows the user to select the mode (such modes are described above), change viewer properties (set up preferences, e.g. background color), toggle on/off sets of objects, and/or access any other viewer features.
  • the object pop-up menu 616 allows the user to change an object's shape, color (material), and/or size, select the corresponding set of compounds, and/or define a neighborhood 3D area around the object (zoom feature, described below).
  • all changes made to an object automatically apply to all other objects from the same set.
  • the object's shape can be changed to one of the predefined basic shapes (e.g. dot, cube, sphere, cone).
  • the object's material (color) is changed via a color dialog.
  • the object's size is changed via a resize dialog. Any set of objects can be visible (toggled on) or hidden (toggled off).
  • a toggle sets command brings up a list of sets defined for the current map 640 . Clicking on a set in the list (highlighting/clearing) toggles the set off and on.
  • Invoking the zoom feature creates a sphere 704 in the render area 614 ( FIG. 7 ), which is centered on the object.
  • the radius of the sphere 704 can be adjusted via a resize dialog 702 to select a desired neighborhood area around the object. All objects (and corresponding compounds) encompassed by the sphere 704 are then selected, displayed in a different map, added to a new or existing set, dragged to a target (described below), and/or viewed in a structure browser window 502 .
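  • Selecting the objects encompassed by the sphere amounts to a radius query around the chosen object, which may be sketched as below. The function name `neighborhood` and the example coordinates are hypothetical.

```python
import math

def neighborhood(center_id, coords, radius):
    """Ids of all objects within `radius` of the chosen object (itself included)."""
    center = coords[center_id]
    return [
        object_id
        for object_id, point in coords.items()
        if math.dist(center, point) <= radius
    ]

zoom_coords = {
    "c1": (0.0, 0.0, 0.0),
    "c2": (0.5, 0.0, 0.0),
    "c3": (0.0, 1.5, 0.0),
    "c4": (3.0, 3.0, 3.0),
}
zoom_small = neighborhood("c1", zoom_coords, radius=1.0)
zoom_large = neighborhood("c1", zoom_coords, radius=2.0)
```

  • Enlarging the radius from 1.0 to 2.0 in this example adds the object labeled "c3" to the selection, mirroring the effect of adjusting the sphere through the resize dialog.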
  • the map viewer 112 is capable of maintaining an interactive selection of objects/compounds. All selected objects are visualized in the same shape, color, and/or size. In other words, selecting an object changes its shape, color, and/or size (e.g. to a purple cone); deselecting an object changes its shape, color, and/or size back to the original attributes. Executing the select set command from the object pop-up menu 616 selects the whole set of objects to which this object belongs. Alternatively, an individual object can be selected or deselected by clicking a middle mouse button over an object.
  • the interactive selection of objects can be converted to a set of compounds and displayed in a structure browser window 502 .
  • the current selection can be converted into a set of compounds by invoking the save selection command from a selection menu, and/or it can be cleared by executing the clear selection command from the selection menu.
  • the present invention enables users to interact with the objects/compounds displayed in a compound visualization non-linear map. This interactivity provided by the present invention shall be further illustrated below.
  • a user can select a plurality of compounds from some source, and then add those compounds to a new or an existing compound visualization non-linear map being displayed in a map window 600 .
  • the map window 600 (or, equivalently in this context, the map viewer 112 ) is acting as a target for an interactive user activity.
  • This operation is conceptually shown in FIG. 14.
  • a compound visualization non-linear map 1404 is being displayed in a map window 600 .
  • the user can select compounds from a structure browser window 502 , and then add those selected compounds (through, for example, well known drag and drop operations) to the compound visualization non-linear map 1404 .
  • the user can select compounds from a compound database 122 , or from a MS (mass spectrometry) viewer 1402 , and then add those compounds to the compound visualization non-linear map 1404 .
  • new compounds are added to an existing compound visualization non-linear map by incremental refinement of the compound visualization non-linear map. Such incremental refinement is described above.
  • a user can select a plurality of compounds from a map window 600 , and then have those compounds processed by a target.
  • the map window 600 (or, equivalently in this context, the map viewer 112 ) is acting as a source for an interactive user activity.
  • This operation is conceptually shown in FIG. 13 .
  • a user selects one or more compounds from the compound visualization non-linear map being displayed in the map window 600 , and then drags and drops the selected compounds to a target.
  • the described action is interpreted as a submission of the corresponding chemical structure(s) to the receiving target for processing.
  • the receiving object can be anything that can handle a chemical structure: another map viewer 112 , a structure viewer 110 , a (molecular) spreadsheet 136 , a database 120 , an experiment planner 140 , an active site docker 144 , an NMR widget 130 , an MS widget 134 , a QSAR model 138 , a property prediction program 142 , or any other suitable process.
  • dragging and dropping a compound onto an NMR widget would display this compound's NMR spectrum, either an experimental or a predicted one.
  • the drag and drop concept described above provides a powerful enhancement of a 3D mapping and visualization of compound collections and libraries. Any conceivable information about a set of chemical compounds can thus be easily accessed from the compound visualization non-linear map. For example, a map of compounds capable of binding to an active site of a given enzyme or receptor would benefit from the possibility to visualize how compounds from the different areas of the map bind to that enzyme or receptor.
  • Visual maps can be based on the same and/or different non-linear maps.
  • Visual maps based on the same non-linear map can display different subsets of compounds and/or present different views of the same set of compounds (e.g. one visual map can display an XY plane view and another visual map can display an orthogonal, YZ plane view).
  • Visual maps based on different non-linear maps can visualize the same set of compounds on different projections, for example, maps derived from different similarity relations between these compounds.
  • the visual objects representing the compound on the different maps can be crosslinked.
  • Crosslinking means that any modifications made to a visual object in one of the visual maps will be automatically reflected in the other visual maps. For example, if an object is selected on one of the visual maps, it will be displayed as selected on the other visual maps as well. In fact, all objects on all maps can be crosslinked provided that they represent the same chemical compounds. Multiple visual maps can also be crosslinked in such a way that mapping any additional compounds onto one of the visual maps will automatically map the same compounds onto the crosslinked maps.
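  • One way to picture crosslinking is as change propagation between linked maps, sketched below for the selection state only. The class `VisualMap`, its methods, and the use of a visited set to stop propagation loops are assumptions of this example.

```python
class VisualMap:
    """A visual map that forwards selection changes to crosslinked maps."""

    def __init__(self, name, compounds):
        self.name = name
        self.compounds = set(compounds)
        self.selected = set()
        self.linked = []

    def crosslink(self, other):
        if other not in self.linked:
            self.linked.append(other)
            other.linked.append(self)

    def set_selected(self, compound, state, _visited=None):
        visited = _visited if _visited is not None else set()
        if id(self) in visited:
            return  # already updated during this propagation
        visited.add(id(self))
        # Only objects representing the same compound are affected.
        if compound in self.compounds:
            if state:
                self.selected.add(compound)
            else:
                self.selected.discard(compound)
        for other in self.linked:
            other.set_selected(compound, state, visited)

xy_view = VisualMap("XY plane view", ["c1", "c2"])
yz_view = VisualMap("YZ plane view", ["c2", "c3"])
xy_view.crosslink(yz_view)
xy_view.set_selected("c2", True)
```

  • After the last call, the compound labeled "c2" is marked as selected on both views, while a compound present on only one map would change state on that map alone.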
  • the present invention is useful for visualizing and interactively processing any chemical entities including but not limited to small molecules, polymers, peptides, proteins, etc. It may also be used to display different similarity relationships between these compounds.

Abstract

A system, method, and computer program product for visualizing and interactively analyzing data relating to chemical compounds. A user selects a plurality of compounds to map, and also selects a method for evaluating similarity/dissimilarity between the selected compounds. A non-linear map is generated in accordance with the selected compounds and the selected method. The non-linear map has a point for each of the selected compounds, wherein a distance between any two points is representative of similarity/dissimilarity between the corresponding compounds. A portion of the non-linear map is then displayed. Users are enabled to interactively analyze compounds represented in the non-linear map.

Description

CROSS REFERENCE TO RELATED APPLICATIONS
This application is a continuation of U.S. Application Ser. No. 08/963,872 filed Nov. 4, 1997, now U.S. Pat. No. 6,295,514, which claimed priority to U.S. Provisional Application Ser. No. 60/030,187, filed Nov. 4, 1996, both of which are herein incorporated by reference in their entirety. This application is also related to U.S. application Ser. No. 08/963,870 filed Nov. 4, 1997, now U.S. Pat. No. 6,421,612, which is also incorporated by reference in its entirety.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention is generally directed to displaying and processing data using a computer, and more particularly directed to visualizing and interactively processing chemical compounds using a computer.
2. Related Art
Currently, research to identify chemical compounds with useful properties (such as paints, finishes, plasticizers, surfactants, scents, drugs, herbicides, pesticides, veterinary products, etc.) often includes the synthesis/acquisition and analysis of large libraries of chemical compounds. More and more, combinatorial chemical libraries are being synthesized/acquired and analyzed to conduct this research.
A combinatorial chemical library is a collection of diverse chemical compounds generated by either chemical synthesis or biological synthesis by combining a number of chemical “building blocks” such as reagents. For example, a linear combinatorial chemical library such as a polypeptide library is formed by combining a set of chemical building blocks called amino acids in every possible way for a given compound length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical compounds theoretically can be synthesized through such combinatorial mixing of chemical building blocks. For example, one commentator has observed that the systematic, combinatorial mixing of 100 interchangeable chemical building blocks results in the theoretical synthesis of 100 million tetrameric compounds or 10 billion pentameric compounds (Gallop et al., “Applications of Combinatorial Technologies to Drug Discovery, Background and Peptide Combinatorial Libraries,” J. Med. Chem. 37, 1233–1250 (1994)).
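The library sizes quoted above follow from simple counting: with n interchangeable building blocks and a compound length of L, there are n^L linear combinations. The short sketch below reproduces the two figures; the function name `library_size` is illustrative only.

```python
def library_size(num_building_blocks: int, compound_length: int) -> int:
    """Number of linear compounds obtainable by combining the building blocks
    in every possible way for the given compound length."""
    return num_building_blocks ** compound_length


print(library_size(100, 4))  # 100000000 (100 million tetrameric compounds)
print(library_size(100, 5))  # 10000000000 (10 billion pentameric compounds)
```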
Advanced research in this area often involves the use of directed diversity libraries. A directed diversity library is a large collection of chemical compounds having properties/features/characteristics that match some prescribed properties. The generation, analysis, and processing of directed diversity libraries are described in U.S. Pat. Nos. 5,463,564; 5,574,656; and 5,684,711, and pending U.S. Application titled “SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR IDENTIFYING CHEMICAL COMPOUNDS HAVING DESIRED PROPERTIES,” Ser. No. 10/170,628 all of which are herein incorporated by reference in their entireties.
In conducting such research, it would be very valuable to be able to compare the properties, features, and other identifying characteristics of compounds. For example, suppose that a researcher has identified a compound X that exhibits some useful properties. It would aid the researcher greatly if he could identify similar compounds, since those similar compounds might also exhibit those same useful properties.
It would also help a researcher in his work to be able to easily synthesize compounds, or retrieve compounds from a chemical inventory. Further, it would greatly aid a researcher to be able to interactively analyze and otherwise process chemical compounds.
SUMMARY OF THE INVENTION
Briefly stated, the present invention is directed to a system, method, and computer program product for visualizing and interactively analyzing data relating to chemical compounds. The invention operates as follows. A user selects a plurality of compounds to map, and also selects a method for evaluating similarity/dissimilarity between the selected compounds. A non-linear map is generated in accordance with the selected compounds and the selected method. The non-linear map has a point for each of the selected compounds, wherein a distance between any two points is representative of similarity/dissimilarity between the corresponding compounds. A portion of the non-linear map is then displayed. Users are enabled to interactively analyze compounds represented in the non-linear map.
Further features and advantages of the present invention, as well as the structure and operation of various embodiments of the present invention, are described in detail below with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Also, the leftmost digit(s) of the reference numbers identify the drawings in which the associated elements are first introduced.
BRIEF DESCRIPTION OF THE FIGURES
The file of this patent contains at least one drawing executed in color. Copies of this patent with color drawing(s) will be provided by the Patent and Trademark Office upon request and payment of the necessary fee.
The present invention will be described with reference to the accompanying drawings, wherein:
FIG. 1 illustrates a block diagram of a computing environment according to an embodiment of the invention;
FIG. 2 is a block diagram of a computer useful for implementing components of the invention;
FIG. 3 is a flowchart representing the operation of the invention in visualizing and interactively processing non-linear maps according to an embodiment of the invention;
FIG. 4 is a flowchart representing the manner in which a non-linear map is generated according to an embodiment of the invention;
FIG. 5 illustrates a structure browser window according to an embodiment of the invention;
FIG. 6 illustrates a compound visualization non-linear map window according to an embodiment of the invention;
FIG. 7 is used to describe a zoom function of the present invention;
FIG. 8 illustrates a dialog used to adjust properties of a set containing one or more compounds;
FIGS. 9 and 10 are used to describe the compound visualization non-linear map window according to an embodiment of the invention;
FIG. 11 is a flowchart illustrating the operation of the invention where a compound visualization non-linear map window is used as a source in an interactive operation;
FIG. 12 is a flowchart illustrating the operation of the invention where a compound visualization non-linear map window is used as a target in an interactive operation;
FIG. 13 conceptually illustrates an interactive operation where a compound visualization non-linear map window is used as a source; and
FIG. 14 conceptually illustrates an interactive operation where a compound visualization non-linear map window is used as a target.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Table of Contents
  • 1. Overview of the Present Invention
  • 2. Structure of the Invention
  • 3. Implementation Embodiment of the Invention
  • 4. Overview of Multidimensional Scaling (MDS) and Non-Linear Mapping (NLM)
    • 4.1 Procedure Suitable for Relatively Small Data Sets
    • 4.2 Procedure Suitable for Large Data Sets
  • 5. Evaluation Properties (Features) and Distance Measures
    • 5.1 Evaluation Properties Having Continuous or Discrete Real Values
    • 5.2 Distance Measure Where Values of Evaluation Properties Are Continuous or Discrete Real Numbers
    • 5.3 Evaluation Properties Having Binary Values
    • 5.4 Distance Measures Where Values of Evaluation Properties Are Binary
  • 6. Scaling of Evaluation Properties
  • 7. Improvements to Map Generation Process
    • 7.1 Pre-Ordering
    • 7.2 Localized Refinement
    • 7.3 Incremental Refinement
  • 8. Operation of the Present Invention
  • 9. User Interface of the Present Invention
    • 9.1 Structure Browser
    • 9.2 Map Viewer
    • 9.3 Interactivity of the Present Invention
      • 9.3.1 Map Viewer as Target
      • 9.3.2 Map Viewer as Source
    • 9.4 Multiple Maps
  • 10. Examples
1. Overview of the Present Invention
The present invention is directed to a computer-based system, method, and/or computer program product for visualizing and analyzing chemical data using interactive multi-dimensional (such as 2- and/or 3-dimensional) non-linear maps. In particular, the invention employs a suite of non-linear mapping algorithms to represent chemical compounds as objects in preferably 2D or 3D Euclidean space.
According to the invention, the distances between objects in that space represent the similarities and/or dissimilarities of the corresponding compounds (relative to selected properties or features of the compounds) computed by some prescribed method. The resulting maps are displayed on a suitable graphics device (such as a graphics terminal, for example), and interactively analyzed to reveal relationships between the data, and to initiate an array of tasks related to these compounds.
2. Structure of the Invention
FIG. 1 is a block diagram of a computing environment 102 according to a preferred embodiment of the present invention.
A chemical data visualization and interactive analysis module 104 includes a map generating module 106 and user interface modules 108. The map generating module 106 determines distances between chemical compounds relative to one or more selected properties or features (herein sometimes called evaluation properties or features) of the compounds. The map generating module 106 performs this function by retrieving and analyzing data on chemical compounds and reagents from reagent and compound databases 122. These reagent and compound databases 122 store information on chemical compounds and reagents of interest.
The reagent and compound databases 122 are part of databases 120, which communicate with the chemical data visualization and interactive analysis module 104 via a communication medium 118. The communication medium 118 is preferably any type of data communication means, such as a data bus, a computer network, etc.
The user interface modules 108, which include a map viewer 112 and optionally a structure browser 110, display a preferably 2D or 3D non-linear map on a suitable graphics device. The non-linear map includes objects that represent the chemical compounds, where the distances between the objects in the non-linear map are those distances determined by the map generating module 106. The user interface modules 108 enable human operators to interactively analyze and process the information in the non-linear map so as to reveal relationships between the data, and to initiate an array of tasks related to the corresponding compounds.
The user interface modules 108 enable users to organize compounds as collections (representing, for example, a combinatorial library). Information pertaining to compound collections is preferably stored in a collection database 124. Information on reagents that are mixed to form compound collections is preferably stored in a library database 126.
Input Device(s) 114 receive input (such as data, commands, queries, etc.) from human operators and forward such input to, for example, the chemical data visualization and interactive analysis module 104 via the communication medium 118. Any well known, suitable input device can be used in the present invention, such as a keyboard, pointing device (mouse, roller ball, track ball, light pen, etc.), touch screen, voice recognition, etc. User input can also be stored and then retrieved, as appropriate, from data/command files.
Output Device(s) 116 output information to human operators. Any well known, suitable output device can be used in the present invention, such as a monitor, a printer, a floppy disk drive or other storage device, a text-to-speech synthesizer, etc.
As described below, the present invention enables the chemical data visualization and interactive analysis module 104 to interact with a number of other modules, including but not limited to one or more map viewers 112, NMR (nuclear magnetic resonance) widget/module 130, structure viewers 110, MS (mass spectrometry) widget/module 134, spreadsheets 136, QSAR (Quantitative Structure-Activity Relationships) module 138, an experiment planner 140, property prediction programs 142, active site docker 144, etc. These modules communicate with the chemical data visualization and interactive analysis module 104 via the communication medium 118.
3. Implementation Embodiment of the Invention
Components shown in the computing environment 102 of FIG. 1 (such as the chemical data visualization and interactive analysis module 104) can be implemented using one or more computers, such as an example computer 202 shown in FIG. 2.
The computer 202 includes one or more processors, such as processor 204. Processor 204 is connected to a communication bus 206. Various software embodiments are described in terms of this example computer system. After reading this description, it will become apparent to a person skilled in the relevant art(s) how to implement the invention using other computer systems and/or computer architectures.
Computer 202 also includes a main memory 208, preferably random access memory (RAM), and can also include one or more secondary storage devices 210. Secondary storage devices 210 can include, for example, a hard disk drive 212 and/or a removable storage drive 214, representing a floppy disk drive, a magnetic tape drive, an optical disk drive, etc. Removable storage drive 214 reads from and/or writes to a removable storage unit 216 in a well known manner. Removable storage unit 216 represents a floppy disk, magnetic tape, optical disk, etc. which is read by and written to by removable storage drive 214. Removable storage unit 216 includes a computer usable storage medium having stored therein computer software and/or data.
In alternative embodiments, the computer 202 can include other similar means for allowing computer programs or other instructions to be loaded into computer 202. Such means can include, for example, a removable storage unit 220 and an interface 218. Examples of such can include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 220 and interfaces 218 which allow software and data to be transferred from the removable storage unit 220 to computer 202.
The computer 202 can also include a communications interface 222. Communications interface 222 allows software and data to be transferred between computer 202 and external devices. Examples of communications interface 222 include, but are not limited to a modem, a network interface (such as an Ethernet card), a communications port, a PCMCIA slot and card, etc. Software and data transferred via communications interface 222 are in the form of signals which can be electronic, electromagnetic, optical or other signals capable of being received by communications interface 222.
In this document, the term “computer program product” is used to generally refer to media such as removable storage units 216, 220, a hard drive 212 that can be removed from the computer 202, and signals carrying software received by the communications interface 222. These computer program products are means for providing software to the computer 202.
Computer programs (also called computer control logic) are stored in main memory and/or secondary storage devices 210. Computer programs can also be received via communications interface 222. Such computer programs, when executed, enable the computer 202 to perform the features of the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor 204 to perform the features of the present invention. Accordingly, such computer programs represent controllers of the computer 202.
In an embodiment where the invention is implemented using software, the software can be stored in a computer program product and loaded into computer 202 using removable storage drive 214, hard drive 212, and/or communications interface 222. The control logic (software), when executed by the processor 204, causes the processor 204 to perform the functions of the invention as described herein.
In another embodiment, the automated portion of the invention is implemented primarily in hardware using, for example, hardware components such as application specific integrated circuits (ASICs). Implementation of the hardware state machine so as to perform the functions described herein will be apparent to persons skilled in the relevant art(s).
In yet another embodiment, the invention is implemented using a combination of both hardware and software.
The computer 202 can be any suitable computer, such as a computer system running an operating system supporting a graphical user interface and a windowing environment. A suitable computer system is a Silicon Graphics, Inc. (SGI) workstation/server, a Sun workstation/server, a DEC workstation/server, an IBM workstation/server, an IBM compatible PC, an Apple Macintosh, or any other suitable computer system, such as one using one or more processors from the Intel Pentium family, such as Pentium Pro or Pentium II. Suitable operating systems include, but are not limited to, IRIX, OS/Solaris, Digital Unix, AIX, Microsoft Windows 95/NT, Apple Mac OS, or any other operating system supporting a graphical user interface and a windowing environment. For example, in a preferred embodiment the program may be implemented and run on a Silicon Graphics Octane workstation running the IRIX 6.4 operating system, and using the Motif graphical user interface based on the X Window System.
4. Overview of Multidimensional Scaling (MDS) and Non-Linear Mapping (NLM)
According to the present invention, multidimensional scaling (MDS) and non-linear mapping (NLM) techniques are used to generate the non-linear map, which includes objects that represent chemical compounds, where the distances between the objects are indicative of the similarities and dissimilarities between the corresponding compounds. MDS and NLM are described in this section.
MDS and NLM were introduced by Torgerson, Psychometrika, 17:401 (1952); Kruskal, Psychometrika, 29:115 (1964); and Sammon, IEEE Trans. Comput., C-18:401 (1969) as a means to generate low-dimensional representations of psychological data. Multidimensional scaling and non-linear mapping are reviewed in Schiffman, Reynolds and Young, Introduction to Multidimensional Scaling, Academic Press, New York (1981); Young and Hamer, Multidimensional Scaling: History, Theory and Applications, Erlbaum Associates, Inc., Hillsdale, N.J. (1987); and Cox and Cox, Multidimensional Scaling, Number 59 in Monographs in Statistics and Applied Probability, Chapman-Hall (1994). The contents of these publications are incorporated herein by reference in their entireties.
4.1 Procedure Suitable for Relatively Small Data Sets
MDS and NLM (these are generally the same, and are hereafter collectively referred to as MDS) represent a collection of methods for visualizing proximity relations of objects by distances of points in a low-dimensional Euclidean space. Proximity measures are reviewed in Hartigan, J. Am. Statist. Ass., 62:1140 (1967), which is incorporated herein by reference in its entirety. In particular, given a finite set of vectorial or other samples A = {a_i, i = 1, . . . , k}, a distance function d_ij = d(a_i, a_j), with a_i, a_j ∈ A, which measures the similarity and dissimilarity between the i-th and j-th objects in A, and a set of images X = {x_1, . . . , x_k; x_i ∈ ℝ^m} of A on an m-dimensional display plane (ℝ^m being the space of m-dimensional vectors of real numbers), the objective is to place the images x_i onto the display plane in such a way that their Euclidean distances ||x_i − x_j|| approximate as closely as possible the corresponding values d_ij. This projection, which can only be made approximately, is carried out in an iterative fashion by minimizing an error function which measures the difference between the distance matrices of the original and projected vector sets. Several such error functions have been proposed, most of which are of the least-squares type, including Kruskal's ‘stress’:
$$S = \sqrt{\frac{\sum_{i<j}^{k} (d_{ij} - \delta_{ij})^2}{\sum_{i<j}^{k} d_{ij}^2}} \qquad \text{EQ. 1}$$
Sammon's error criterion:
$$E = \frac{\sum_{i<j}^{k} \frac{(d_{ij} - \delta_{ij})^2}{d_{ij}}}{\sum_{i<j}^{k} d_{ij}} \qquad \text{EQ. 2}$$
and Lingoes' alienation coefficient:
$$K = \frac{\sum_{i<j}^{k} (d_{ij}\,\delta_{ij})^2}{\sum_{i<j}^{k} \delta_{ij}} \qquad \text{EQ. 3}$$
where δij=||xi−xj|| is the Euclidean distance between the images xi and xj on the display plane. Generally, the solution is found in an iterative fashion by (1) computing or retrieving from a database the distances dij; (2) initializing the images xi; (3) computing the distances of the images δij and the value of the error function (e.g. S, E or K in EQ. 1–3 above); (4) computing a new configuration of the images xi using a gradient descent procedure, such as Kruskal's linear regression or Guttman's rank-image permutation; and (5) repeating steps 3 and 4 until the error is minimized within some prescribed tolerance.
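Step (3) of this procedure can be illustrated with a minimal sketch that evaluates Sammon's error criterion (EQ. 2) for a given configuration of images. The function names and the list-of-lists data layout are assumptions made for illustration; the target distances d_ij are assumed to be strictly positive for i ≠ j.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two coordinate sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def sammon_error(d, images):
    """Sammon's error criterion (EQ. 2).

    d      -- symmetric matrix of target distances d[i][j] (> 0 for i != j)
    images -- list of k coordinate vectors on the display plane
    """
    k = len(images)
    numerator = 0.0
    denominator = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            delta = euclidean(images[i], images[j])
            numerator += (d[i][j] - delta) ** 2 / d[i][j]
            denominator += d[i][j]
    return numerator / denominator


# A 3-4-5 right triangle reproduces its own distances exactly.
d = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
exact = [[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]]
print(sammon_error(d, exact))  # 0.0
```

Doubling every coordinate of `exact` doubles every image distance, so each term contributes d_ij to the numerator and the error becomes 1.0.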
For example, the Sammon algorithm minimizes EQ. 2 by iteratively updating the coordinates x_i using EQ. 4:
$$x_{pq}(m+1) = x_{pq}(m) - \lambda\,\Delta_{pq}(m) \qquad \text{EQ. 4}$$
where m is the iteration number, x_pq is the q-th coordinate of the p-th image x_p, λ is the learning rate, and
$$\Delta_{pq}(m) = \frac{\partial E(m)}{\partial x_{pq}(m)} \Bigg/ \left| \frac{\partial^2 E(m)}{\partial x_{pq}(m)^2} \right| \qquad \text{EQ. 5}$$
The partial derivatives in EQ. 5 are given by:
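$$\frac{\partial E(m)}{\partial x_{pq}(m)} = \frac{-2 \sum_{\substack{j=1 \\ j \neq p}}^{k} \frac{d_{pj} - \delta_{pj}}{d_{pj}\,\delta_{pj}} \,(x_{pq} - x_{jq})}{\sum_{i<j}^{k} d_{ij}} \qquad \text{EQ. 6}$$
$$\frac{\partial^2 E(m)}{\partial x_{pq}(m)^2} = \frac{-2 \sum_{\substack{j=1 \\ j \neq p}}^{k} \frac{1}{d_{pj}\,\delta_{pj}} \left[ (d_{pj} - \delta_{pj}) - \frac{(x_{pq} - x_{jq})^2}{\delta_{pj}} \left( 1 + \frac{d_{pj} - \delta_{pj}}{\delta_{pj}} \right) \right]}{\sum_{i<j}^{k} d_{ij}} \qquad \text{EQ. 7}$$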
The non-linear mapping is obtained by repeated evaluation of EQ. 2, followed by modification of the coordinates using EQ. 4 and 5, until the error is minimized within a prescribed tolerance.
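The refinement loop of this section can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes that EQ. 5 divides the first derivative by the absolute value of the second derivative (as in Sammon's original formulation), that the sums in EQ. 6 and EQ. 7 run over all j ≠ p, and that all images are updated simultaneously from the current configuration. The function names, the default learning rate of 0.3, and the `eps` guard against division by zero are assumptions introduced for the example.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def sammon_error(d, images):
    """Sammon's error criterion (EQ. 2)."""
    k = len(images)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    num = sum((d[i][j] - euclidean(images[i], images[j])) ** 2 / d[i][j]
              for i, j in pairs)
    return num / sum(d[i][j] for i, j in pairs)


def sammon_step(d, images, lam=0.3, eps=1e-12):
    """One update of every coordinate x_pq according to EQ. 4 to EQ. 7."""
    k = len(images)
    dim = len(images[0])
    c = sum(d[i][j] for i in range(k) for j in range(i + 1, k))
    new_images = [list(x) for x in images]
    for p in range(k):
        for q in range(dim):
            first = 0.0   # accumulates the sum in EQ. 6
            second = 0.0  # accumulates the sum in EQ. 7
            for j in range(k):
                if j == p:
                    continue
                delta = max(euclidean(images[p], images[j]), eps)
                diff = d[p][j] - delta
                dx = images[p][q] - images[j][q]
                first += diff / (d[p][j] * delta) * dx
                second += (diff - dx * dx / delta * (1.0 + diff / delta)) / (d[p][j] * delta)
            first *= -2.0 / c
            second *= -2.0 / c
            # EQ. 5 followed by EQ. 4
            new_images[p][q] = images[p][q] - lam * first / max(abs(second), eps)
    return new_images


# Two points whose target distance is 1 but which start 2 apart.
d = [[0, 1], [1, 0]]
images = [[0.0, 0.0], [2.0, 0.0]]
for _ in range(20):
    images = sammon_step(d, images)
print(sammon_error(d, images))  # close to zero after 20 steps
```

In this two-point case each step moves both points toward each other and shrinks the distance error by a constant factor, so the error decreases monotonically.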
4.2 Procedure Suitable for Large Data Sets
The general refinement paradigm described in Section 4.1 is suitable for relatively small data sets, but has one important limitation that renders it impractical for large data sets. This limitation stems from the fact that the computational effort required to compute the gradients scales with the square of the size of the data set. For relatively large data sets, this quadratic time complexity makes even a partial refinement intractable.
According to the present invention, the following approach is used for large data sets. This approach is to use iterative refinement based on ‘instantaneous’ errors. As in the approach described in Section 4.1, this approach of Section 4.2 starts with an initial configuration of points generated at random or by some other procedure (as described below in Section 7). This initial configuration is then continuously refined by repeatedly selecting two points i, j, at random, and modifying their coordinates on the non-linear map according to EQ. 8:
$$x_i(t+1) = f(t, x_i(t), x_j(t), d_{ij}) \qquad \text{EQ. 8}$$
where t is the current iteration, xi(t) and xj(t) are the current coordinates of the i-th and j-th points on the non-linear map, xi(t+1) are the new coordinates of the i-th point on the non-linear map, and dij is the true distance between the i-th and j-th points that we attempt to approximate on the non-linear map (see above). ƒ(.) in EQ. 8 above can assume any functional form. Ideally, this function should try to minimize the difference between the actual and target distance between the i-th and j-th points. For example, ƒ(.) may be given by EQ. 9:
$$x_i(t+1) = f(t, x_i(t), x_j(t), d_{ij}) = x_i(t) + 0.5\,\lambda(t)\,\frac{d_{ij} - \delta_{ij}(t)}{\delta_{ij}(t)}\,\bigl(x_i(t) - x_j(t)\bigr) \qquad \text{EQ. 9}$$
where t is the iteration number, δij=||xi(t)−xj(t)||, and λ(t) is an adjustable parameter, referred to hereafter as the ‘learning rate.’
An analogous equation has been suggested by Kohonen for the training of self-organizing maps (Kohonen, Self-Organizing Maps, Springer-Verlag, Berlin (1995)), incorporated herein by reference in its entirety. This process is repeated for a fixed number of cycles, or until some global error criterion is minimized within some prescribed tolerance. A large number of iterations are typically required to achieve statistical accuracy.
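The pairwise refinement described in this section can be sketched as follows, under stated assumptions. The sketch applies the EQ. 9 update to the i-th point and the mirror-image update to the j-th point, uses a learning rate that decreases linearly from `lam_max` to `lam_min`, and runs for a fixed number of cycles. The function names, default parameter values, random seed handling, and the `eps` guard are all choices made for the example.

```python
import math
import random


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pairwise_refine(d, coords, steps, lam_max=1.0, lam_min=0.01, seed=0, eps=1e-12):
    """Refine map coordinates by repeatedly correcting one random pair.

    d      -- symmetric matrix of target distances d[i][j]
    coords -- initial configuration (list of coordinate vectors); not modified
    steps  -- total number of pairwise refinement steps
    """
    rng = random.Random(seed)
    x = [list(c) for c in coords]
    k = len(x)
    for t in range(steps):
        lam = lam_max + t * (lam_min - lam_max) / steps  # linear decay (assumed)
        i, j = rng.sample(range(k), 2)                   # two distinct points
        delta = max(euclidean(x[i], x[j]), eps)          # current map distance
        scale = 0.5 * lam * (d[i][j] - delta) / delta
        for q in range(len(x[i])):
            shift = scale * (x[i][q] - x[j][q])
            x[i][q] += shift  # EQ. 9 applied to point i
            x[j][q] -= shift  # symmetric correction for point j (assumed)
    return x


d = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
start = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
result = pairwise_refine(d, start, steps=5000)
```

Each step costs the same regardless of the number of points, which is what makes partial refinement of large data sets practical; the number of steps controls how closely the target distances are approximated.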
The method described above is generally reminiscent of Kohonen's self-organizing principle (Kohonen, Biological Cybernetics, 43:59 (1982)) and neural network back-propagation training (Werbos, Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences, PhD Thesis, Harvard University, Cambridge, Mass. (1974)), and Rumelhart and McClelland, Eds., Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1, MIT Press, Cambridge, Mass. (1986)), all of which are incorporated herein by reference in their entireties.
The learning rate λ(t) in EQ. 9 plays a key role in ensuring convergence. If λ is too small, the coordinate updates are small, and convergence is slow. If, on the other hand, λ is too large, the rate of learning may be accelerated, but the non-linear map may become unstable (i.e. oscillatory). Typically, λ ranges in the interval [0, 1] and may be fixed, or it may decrease monotonically during the refinement process. Moreover, λ may also be a function of i, j and/or dij, and can be used to apply different weights to certain objects, distances and/or distance pairs. For example, λ may be computed by EQ. 10:
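$$\lambda(t) = \left( \lambda_{\max} + t\,\frac{\lambda_{\min} - \lambda_{\max}}{T} \right) \frac{1}{1 + \alpha d_{ij}} \qquad \text{EQ. 10}$$
or EQ. 11:
$$\lambda(t) = \left( \lambda_{\max} + t\,\frac{\lambda_{\min} - \lambda_{\max}}{T} \right) e^{-\alpha d_{ij}} \qquad \text{EQ. 11}$$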
where λmax and λmin are the (unweighted) starting and ending learning rates such that λmax, λmin ∈ [0,1], T is the total number of refinement steps (iterations), t is the current iteration number, and α is a constant scaling factor. EQ. 10 and 11 have the effect of decreasing the correction at large separations, thus creating a non-linear map which preserves short-range interactions more faithfully than long-range ones. Weighting is discussed in greater detail below. Because of the general resemblance of the training process described above to Kohonen's self-organizing principle, these maps shall sometimes be herein called ‘Self-Organizing Non-Linear Maps.’
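The two weighted schedules can be sketched as follows, assuming that EQ. 10 scales the linearly decreasing rate by 1/(1 + α·d_ij) and that EQ. 11 scales it by exp(−α·d_ij). The function names and default parameter values are illustrative only.

```python
import math


def base_rate(t, total_steps, lam_max, lam_min):
    """Unweighted learning rate, decreasing linearly from lam_max to lam_min."""
    return lam_max + t * (lam_min - lam_max) / total_steps


def rate_eq10(t, total_steps, d_ij, lam_max=1.0, lam_min=0.01, alpha=1.0):
    """Learning rate damped by 1 / (1 + alpha * d_ij), as assumed for EQ. 10."""
    return base_rate(t, total_steps, lam_max, lam_min) / (1.0 + alpha * d_ij)


def rate_eq11(t, total_steps, d_ij, lam_max=1.0, lam_min=0.01, alpha=1.0):
    """Learning rate damped by exp(-alpha * d_ij), as assumed for EQ. 11."""
    return base_rate(t, total_steps, lam_max, lam_min) * math.exp(-alpha * d_ij)


# At the first iteration, a pair at target distance 1.0 receives half the
# correction of a pair at distance 0.0 under the EQ. 10 weighting.
print(rate_eq10(0, 100, 0.0))  # 1.0
print(rate_eq10(0, 100, 1.0))  # 0.5
```

In both forms the weight falls as d_ij grows, so pairs that are far apart in the original space are corrected less strongly than close pairs.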
One of the main advantages of this approach is that it makes partial refinements possible. It is often sufficient that the pair-wise dissimilarities are represented only approximately to reveal the general structure and topology of the data. Unlike traditional MDS, this approach allows very fine control of the refinement process. Moreover, as the non-linear map self-organizes, the pair-wise refinements become cooperative, which partially alleviates the quadratic nature of the problem. The general usefulness of multi-dimensional scaling stems from the fact that data in ℝ^d are almost never d-dimensional. Although scaling becomes more problematic as the true dimensionality of the space increases, the presence of structure in the data is very frequently reflected on the resulting map. Of course, one can easily conceive of situations where MDS is not effective, particularly when the data is random and truly hyper-dimensional. Fortunately, these situations rarely arise in practice, as some form of structure is always present in the data, particularly data related to molecular structure and function.
The embedding procedure described above does not guarantee convergence to the global minimum (i.e., the most faithful embedding in a least-squares sense). If so desired, the refinement process may be repeated a number of times from different starting configurations and/or random number seeds. It should also be pointed out that the absolute coordinates in the non-linear map carry no physical significance. What is important are the relative distances between points, and the general structure and topology of the data (presence, density and separation of clusters, etc.).
The method described above is ideally suited for both metric and non-metric scaling. The latter is particularly useful when the (dis)similarity measure is not a true metric, i.e. it does not obey the distance postulates and, in particular, the triangle inequality (such as the Tanimoto coefficient, for example). Although an ‘exact’ projection is only possible when the distance matrix is positive definite, meaningful projections can still be obtained even when this criterion is not satisfied. As mentioned above, the overall quality of the projection is determined by a sum-of-squares error function such as those shown in EQ. 1–3.
5. Evaluation Properties (Features) and Distance Measures
As mentioned above, the distances dij between chemical compounds are computed according to some prescribed measure of molecular ‘similarity’. This similarity can be based on any combination of properties or features of the compounds. For example, the similarity measure may be based on structural similarity, chemical similarity, physical similarity, biological similarity, and/or some other type of similarity measure which can be derived from the structure or identity of the compounds. Under the system of the present invention, any similarity measure can be used to construct the non-linear map. The properties or features that are being used to evaluate similarity or dissimilarity among compounds are sometimes herein collectively called “evaluation properties.”
5.1 Evaluation Properties Having Continuous or Discrete Real Values
As noted above, in a preferred embodiment of the present invention, the similarity measure may be derived from a list of physical, chemical and/or biological properties (i.e., evaluation properties) associated with a set of compounds. Under this formalism, the compounds are represented as vectors in multi-variate property space, and their similarity may be computed by some geometrical distance measure.
In a preferred embodiment, the property space is defined using one or more molecular features (descriptors). Such molecular features may include topological indices, physicochemical properties, electrostatic field parameters, volume and surface parameters, etc. For example, these features may include, but are not limited to, molecular volume and surface areas, dipole moments, octanol-water partition coefficients, molar refractivities, heats of formation, total energies, ionization potentials, molecular connectivity indices, 2D and 3D auto-correlation vectors, 3D structural and/or pharmacophoric parameters, electronic fields, etc. However, it should be understood that the present invention is not limited to this embodiment. For example, molecular features may include the observed biological activities of a set of compounds against an array of biological targets such as enzymes or receptors (also known as affinity fingerprints). In fact, any vectorial representation of chemical data can be used in the present invention.
5.2 Distance Measure Where Values of Evaluation Properties Are Continuous or Discrete Real Numbers
A “distance measure” is some algorithm or technique used to determine the difference between compounds based on the selected evaluation properties. The particular distance measure that is used in any given situation depends, at least in part, on the set of values that the evaluation properties can take.
For example, where the evaluation properties can take real numbers as values, then a suitable distance measure is the Minkowski metric, shown in EQ. 12:
$$d_{ij} = d(x_i, x_j) = \left( \sum_k |x_{ik} - x_{jk}|^r \right)^{1/r} \qquad \text{EQ. 12}$$
where k is used to index the elements of the property vector, and r∈[1, ∞). For r=1.0, EQ. 12 is the city-block or Manhattan metric. For r=2.0, EQ. 12 is the ordinary Euclidean metric. For r=∞, EQ. 12 is the maximum of the absolute coordinate distances, also referred to as the ‘dominance’ metric, the ‘sup’ metric, or the ‘ultrametric’ distance. For any value of r∈[1, ∞), it can be shown that the Minkowski metric is a true metric, i.e. it obeys the distance postulates and, in particular, the triangle inequality.
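By way of illustration only, EQ. 12 may be sketched in Python as follows. The function name `minkowski` and the example vectors are assumptions introduced for this sketch and are not part of the described system; the case r=∞ is handled as the maximum absolute coordinate difference, as stated above.

```python
import math

def minkowski(x, y, r=2.0):
    """Minkowski distance of order r between two equal-length property vectors."""
    if len(x) != len(y):
        raise ValueError("property vectors must have the same length")
    if r < 1:
        raise ValueError("r must be >= 1")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(r):
        # Limiting case r = infinity: the largest absolute coordinate difference.
        return max(diffs, default=0.0)
    return sum(d ** r for d in diffs) ** (1.0 / r)

# Hypothetical two-property vectors, chosen only to make the arithmetic easy.
a = [0.0, 0.0]
b = [3.0, 4.0]
print(minkowski(a, b, 1))         # city-block: 7.0
print(minkowski(a, b, 2))         # Euclidean: 5.0
print(minkowski(a, b, math.inf))  # dominance: 4.0
```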
5.3 Evaluation Properties Having Binary Values
Alternatively, the evaluation properties of the compounds may be represented in a binary form (i.e., either a compound has or does not have an evaluation property), where each bit is used to indicate the presence or absence (or potential presence or absence) of some molecular feature or characteristic. For example, compounds may be encoded using substructure keys where each bit is used to denote the presence or absence of a specific structural feature or pattern in the target molecule. Such features include, but are not limited to, the presence, absence or minimum number of occurrences of a particular element (e.g. the presence of at least 1, 2 or 3 nitrogen atoms), unusual or important electronic configurations and atom types (e.g. doubly-bonded nitrogen or aromatic carbon), common functional groups such as alcohols, amines, etc., certain primitive and composite rings, a pair or triplet of pharmacophoric groups at a particular separation in 3-dimensional space, and ‘disjunctions’ of unusual features that are too rare to be worth an individual bit, yet extremely important when they do occur (typically, these unusual features are assigned a common bit that is set if any one of the patterns is present in the target molecule).
Alternatively, the evaluation properties of compounds may be encoded in the form of binary fingerprints, which do not depend on a predefined fragment or feature dictionary to perform the bit assignment. Instead, every pattern in the molecule up to a predefined limit is systematically enumerated, and serves as input to a hashing algorithm that turns ‘on’ a small number of bits at pseudo-random positions in the bitmap. Although it is conceivable that two different molecules may have exactly the same fingerprint, the probability of this happening is extremely small for all but the simplest cases. Experience suggests that these fingerprints contain sufficient information about the molecular structures to permit meaningful similarity comparisons.
5.4 Distance Measures Where Values of Evaluation Properties Are Binary
A number of similarity (distance) measures can be used with binary descriptors (i.e., where evaluation properties are binary or binary fingerprints). The most frequently used ones are the normalized Hamming distance:
$$H = \frac{|\mathrm{XOR}(x, y)|}{N} \qquad \text{EQ. 13}$$
which measures the number of bits that are different between x and y, the Tanimoto or Jaccard coefficient:
$$T = \frac{|\mathrm{AND}(x, y)|}{|\mathrm{IOR}(x, y)|} \qquad \text{EQ. 14}$$
which is a measure of the number of substructures shared by two molecules relative to the ones they could have in common, and the Dice coefficient:
$$D = \frac{2\,|\mathrm{AND}(x, y)|}{|x| + |y|} \qquad \text{EQ. 15}$$
In the equations listed above, AND(x, y) is the intersection of binary sets x and y (bits that are ‘on’ in both sets), IOR(x, y) is the union or ‘inclusive or’ of x and y (bits that are ‘on’ in either x or y), XOR is the ‘exclusive or’ of x and y (bits that are ‘on’ in either x or y, but not both), |x| is the number of bits that are ‘on’ in x, and N is the length of the binary sets measured in bits (a constant).
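The three coefficients defined above may be sketched as follows, purely for illustration. The sketch assumes that each binary set is held as a Python integer used as a bitmap; the function names and the convention of returning 1.0 when both sets are empty are assumptions of this sketch, not requirements of the invention.

```python
def popcount(bits):
    """Number of bits that are 'on' in a non-negative integer bitmap."""
    return bin(bits).count("1")

def hamming(x, y, n_bits):
    """Normalized Hamming distance: |XOR(x, y)| / N."""
    return popcount(x ^ y) / n_bits

def tanimoto(x, y):
    """Tanimoto (Jaccard) coefficient: |AND(x, y)| / |IOR(x, y)|."""
    union = popcount(x | y)
    # Assumption: two empty sets are treated as identical.
    return popcount(x & y) / union if union else 1.0

def dice(x, y):
    """Dice coefficient: 2 |AND(x, y)| / (|x| + |y|)."""
    total = popcount(x) + popcount(y)
    # Assumption: two empty sets are treated as identical.
    return 2 * popcount(x & y) / total if total else 1.0

# Hypothetical 8-bit fingerprints.
x = 0b10110100
y = 0b10010110
print(hamming(x, y, 8))  # 2 differing bits out of 8: 0.25
print(tanimoto(x, y))    # 3 shared bits out of 5 in the union: 0.6
print(dice(x, y))        # 2 * 3 / (4 + 4): 0.75
```

Note that the Hamming measure is a distance (0 for identical sets), whereas the Tanimoto and Dice coefficients are similarities (1 for identical sets).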
Another popular metric is the Euclidean distance which, in the case of binary sets, can be recast in the form:
$$E = \sqrt{N - |\mathrm{XOR}(x, \mathrm{NOT}(y))|} \qquad \text{EQ. 16}$$
where NOT(y) denotes the binary complement of y. The expression |XOR(x, NOT(y))| represents the number of bits that are identical in x and y (either 1's or 0's). The Euclidean distance is a good measure of similarity when the binary sets are relatively rich, and is mostly used in situations in which similarity is measured in a relative sense.
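As an illustrative sketch of EQ. 16, assuming the binary sets are stored as integer bitmaps of a known length N, the computation may be written as below. The function name and the masking step (needed because Python integers are unbounded) are assumptions of this sketch.

```python
import math

def binary_euclidean(x, y, n_bits):
    """Euclidean distance between two N-bit binary sets, following EQ. 16."""
    mask = (1 << n_bits) - 1           # keep only the N bits of interest
    not_y = ~y & mask                  # binary complement of y within N bits
    identical = bin((x ^ not_y) & mask).count("1")  # bits equal in x and y
    return math.sqrt(n_bits - identical)

# Hypothetical 8-bit fingerprints that differ in exactly two positions.
x = 0b10110100
y = 0b10010110
print(binary_euclidean(x, y, 8))  # square root of 2
```

Because N minus the number of identical bits equals the number of differing bits, the result is the square root of the unnormalized Hamming count.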
In the examples described above, the distance between two compounds is determined using a binary or multivariate representation. However, the system of the present invention is not limited to this embodiment. For example, the similarity between two compounds may be determined by comparing the shapes of the molecules using a suitable 3-dimensional alignment method, or it may be inferred by a similarity model defined according to a prescribed procedure. For example, one such similarity model may be a neural network trained to predict a similarity coefficient given a suitably encoded pair of compounds. Such a neural network may be trained using a training set of structure pairs and a known similarity coefficient for each such pair, as determined by user input, for example.
6. Scaling of Evaluation Properties
Referring back to EQ. 12, according to the present invention, the features (i.e., evaluation properties) may be scaled differently to reflect their relative importance in assessing the proximity between two compounds. For example, suppose the user has selected two evaluation properties, Property A and Property B. If Property A has a weight of 2, and Property B has a weight of 10, then a given difference in Property B will have five times the impact on the distance calculation of the same difference in Property A.
According to this embodiment of the invention, EQ. 12 may be replaced by EQ. 17:
$$d_{ij} = d(x_i, x_j) = \left( \sum_k \left( w_k\,|x_{ik} - x_{jk}| \right)^r \right)^{1/r} \qquad \text{EQ. 17}$$
where wk is the weight of the k-th property. An example of such a weighting factor is a normalization coefficient. However, other weighting schemes may also be used.
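The weighted form of EQ. 17 may be sketched as follows, for illustration only. The function name and the example values are assumptions; the weights 2 and 10 mirror the Property A and Property B example given above.

```python
def weighted_minkowski(x, y, weights, r=2.0):
    """Weighted Minkowski distance: each coordinate difference is scaled by w_k."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("vectors and weights must have the same length")
    return sum((w * abs(a - b)) ** r
               for w, a, b in zip(weights, x, y)) ** (1.0 / r)

# Two hypothetical compounds that differ by one unit in each property.
compound_i = [1.0, 1.0]   # [Property A, Property B]
compound_j = [2.0, 2.0]
weights = [2.0, 10.0]     # Property A weight 2, Property B weight 10

# With r = 1 the contributions add directly: 2 * 1 + 10 * 1 = 12.0
print(weighted_minkowski(compound_i, compound_j, weights, r=1))
```

With unit weights the function reduces to the unweighted metric of EQ. 12.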
According to the present invention, the scaling (weights) need not be uniform throughout the entire map, i.e. the resulting map need not be isomorphic. Hereafter, maps derived from uniform weights shall be referred to as globally weighted (isomorphic), whereas maps derived from non-uniform weights shall be referred to as locally weighted (non-isomorphic). On locally-weighted maps, the distances on the non-linear map reflect a local measure of similarity. That is, what determines similarity in one domain of the non-linear map is not necessarily the same as what determines similarity in another domain of the non-linear map. For example, locally-weighted maps may be used to reflect similarities derived from a locally-weighted case-based learning algorithm. Locally-weighted learning uses locally weighted training to average, interpolate between, extrapolate from, or otherwise combine training data. Most learning methods (also referred to as modeling or prediction methods) construct a single model to fit all the training data. Local models, on the other hand, attempt to fit the training data in a local region around the location of the query. Examples of local models include nearest neighbors, weighted average, and locally weighted regression. Locally-weighted learning is reviewed in Vapnik, in Advances in Neural Information Processing Systems, 4:831, Morgan Kaufmann, San Mateo, Calif. (1982); Bottou and Vapnik, Neural Computation, 4(6):888 (1992); and Vapnik and Bottou, Neural Computation, 5(6):893 (1993), all of which are incorporated herein by reference in their entireties.
According to the present invention, it is also possible to construct a non-linear map from a distance matrix which is not strictly symmetric, i.e. a distance matrix where dij≠dji. A potential use of this approach is in situations where the distance function is defined locally, e.g. in a locally weighted model using a point-based local distance function. In this embodiment, each training case has associated with it a distance function and the values of the corresponding parameters. Preferably, to construct a non-linear map which reflects these local distance relationships, the distance between two points is evaluated twice, using the local distance functions of the respective points. The resulting distances are averaged, and are used as input in the non-linear mapping algorithm described above. If the point-based local distance functions vary in some continuous or semi-continuous fashion throughout the feature space, this approach could potentially lead to a meaningful projection.
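The averaging of the two directional distances described above may be sketched as follows. This is a minimal illustration which assumes that each point carries its own weight vector as its point-based local distance function; the function names and the weighted-Minkowski form of the local distance are assumptions of this sketch.

```python
def local_distance(x, y, weights, r=2.0):
    """Distance from x to y under one point's own (local) weights."""
    return sum((w * abs(a - b)) ** r
               for w, a, b in zip(weights, x, y)) ** (1.0 / r)

def symmetrized_distance(x_i, w_i, x_j, w_j, r=2.0):
    """Evaluate the distance twice, once per point's local function, and average."""
    d_ij = local_distance(x_i, x_j, w_i, r)   # as seen from point i
    d_ji = local_distance(x_j, x_i, w_j, r)   # as seen from point j
    return 0.5 * (d_ij + d_ji)

# Hypothetical points with different local weights.
x_i, w_i = [0.0, 0.0], [1.0, 1.0]
x_j, w_j = [3.0, 4.0], [2.0, 2.0]
print(symmetrized_distance(x_i, w_i, x_j, w_j))  # (5.0 + 10.0) / 2 = 7.5
```

The averaged value is symmetric in i and j and can therefore be supplied to a mapping procedure that expects a single distance per pair.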
7. Improvements to Map Generation Process
This section describes improvements to the chemical visualization map generation process described above. Each of the enhancements described below is under the control of the user. That is, the user can elect to perform or not perform each of the enhancements discussed below. Alternatively, the invention can be defined so that the below enhancements are automatically performed, unless specifically overridden by the user (or in some embodiments, the user may not have the option of overriding one or more of the below enhancements).
7.1 Pre-Ordering
In many cases, the approach described above for generating the non-linear map may be accelerated by pre-ordering the data using a suitable statistical method. For example, if the data is available in vectorial or binary form, the initial configuration of the points on the non-linear map may be computed using Principal Component Analysis. In a preferred embodiment, the initial configuration may be constructed from the first 3 principal components of the feature matrix (i.e. the 3 latent variables which account for most of the variance in the data). In practice, this technique can have a profound effect on the speed of refinement. Indeed, if a random initial configuration is used, a significant portion of the training time is spent establishing the general structure and topology of the non-linear map, which is typically characterized by large rearrangements. If, on the other hand, the input configuration is partially ordered, the error criterion can be reduced relatively rapidly to an acceptable level.
7.2 Localized Refinement
If the data is highly clustered, by virtue of the sampling process low-density areas may be refined less effectively than high-density areas. In a preferred embodiment, this tendency may be partially compensated by a modification to the original algorithm which increases the sampling probability in low-density areas. In one embodiment, the center of mass of the non-linear map is identified, and concentric shells centered at that point are constructed. A series of regular refinement iterations are then carried out, each time selecting points from within or between these shells. This process is repeated for a prescribed number of cycles. This phase is then followed by a phase of regular refinement using global sampling, and the process is repeated.
As mentioned above, the basic algorithm does not distinguish short- from long-range distances. EQ. 10 and 11 describe a method to ensure that short-range distances are preserved more faithfully than long-range ones through the use of weighting. An alternative (and complementary) approach is to ensure that points at close separation are sampled more extensively than points at long separation. A preferred embodiment is to use an alternating sequence of global and local refinement cycles, similar to the one described above. In this embodiment, a phase of global refinement is initially carried out. At the end of this phase, the resulting non-linear map is partitioned into a regular grid, and the points (objects) in each cell are subjected to a phase of local refinement (i.e. only points from within the same cell are compared and refined). Preferably, the number of sampling steps in each cell should be proportional to the number of points contained in that cell. This process is highly parallelizable. This local refinement phase is then followed by another global refinement phase, and the process is repeated for a prescribed number of cycles, or until the embedding error is minimized within a prescribed tolerance. Alternatively, the grid method may be replaced by another suitable method for identifying proximal points, such as a k-d tree, for example.
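The grid-based partitioning used for the local refinement phase may be sketched as follows, for illustration only. The sketch shows how map points could be assigned to cells of a regular grid and how a per-cell sampling budget proportional to the cell population could be computed; the function names, the cell size, and the `steps_per_point` parameter are assumptions, and the refinement update itself is not shown.

```python
from collections import defaultdict

def partition_into_grid(coords, cell_size):
    """Group point indices by the regular grid cell that contains each point."""
    cells = defaultdict(list)
    for index, point in enumerate(coords):
        key = tuple(int(c // cell_size) for c in point)
        cells[key].append(index)
    return dict(cells)

def local_sampling_budget(cells, steps_per_point):
    """Number of local sampling steps per cell, proportional to its population."""
    return {key: steps_per_point * len(members)
            for key, members in cells.items()}

# Hypothetical 2D map coordinates after a global refinement phase.
coords = [(0.1, 0.2), (0.4, 0.3), (1.5, 0.2), (1.7, 0.9), (1.6, 0.1)]
cells = partition_into_grid(coords, cell_size=1.0)
print(cells)                              # points 0-1 share one cell, 2-4 another
print(local_sampling_budget(cells, 100))  # 200 steps and 300 steps respectively
```

Because each cell is refined using only its own points, the cells can in principle be processed independently, which is consistent with the parallelizable character of the process noted above.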
7.3 Incremental Refinement
The approach and techniques described herein may be used for incremental refinement of a map. That is, starting from an organized non-linear map of a set of objects or points (compounds), a new set of objects (compounds) may be added without modification of the original map. Strictly speaking, this is statistically acceptable if the new set of objects is significantly smaller than the original set. In a preferred embodiment, the new set of objects may be ‘diffused’ into the existing map, using a modification of the algorithm described above. In particular, EQ. 8 and 9 can be used to update only the new objects. In addition, the sampling procedure ensures that the selected pairs contain at least one object from the incoming set. That is, two objects are selected at random so that at least one of these objects belongs to the incoming set.
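The constrained sampling described above may be sketched as follows. This minimal illustration assumes that the objects of the existing map occupy indices 0 to n_old-1 and that the incoming objects occupy the indices that follow; the function name and the returned list of movable objects are assumptions of this sketch, and the coordinate update itself (EQ. 8 and 9) is not reproduced here. The scheme shown is one simple way to satisfy the constraint and does not sample all qualifying pairs with equal probability.

```python
import random

def sample_pair_with_new(n_old, n_new, rng=random):
    """Pick two distinct objects so that at least one is from the incoming set.

    Returns (i, j, movable), where movable lists the indices whose map
    coordinates may be updated (incoming objects only).
    """
    total = n_old + n_new
    i = n_old + rng.randrange(n_new)   # always drawn from the incoming set
    j = rng.randrange(total - 1)       # any other object, old or new
    if j >= i:
        j += 1                         # skip i so that the pair is distinct
    movable = [k for k in (i, j) if k >= n_old]
    return i, j, movable

# Hypothetical sizes: 1000 objects already mapped, 50 incoming objects.
i, j, movable = sample_pair_with_new(1000, 50)
print(i, j, movable)
```

Objects of the original map never appear in `movable`, so the original map is left unmodified while the new objects are diffused into it.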
8. Operation of the Present Invention
The operation of the present invention with regard to visualizing and interactively processing chemical compounds in a non-linear map shall now be described with reference to a flowchart 302 shown in FIG. 3. Unless otherwise specified, interaction with users described below is achieved by operation of the user interface modules 108 (FIG. 1).
In step 304, the user selects one or more compounds to map in a new non-linear map. The user may select compounds to map by retrieving a list of compounds from a file, by manually typing in a list of compounds, and/or by using a graphical user interface (GUI) such as the structure browser shown in FIG. 5 (described below). The invention envisions other means for enabling the user to specify compounds to display in a non-linear map. For example, the user can also select compounds from an already existing compound visualization non-linear map (in one embodiment, the user drags and drops the compounds from the old compound visualization non-linear map to the new compound visualization non-linear map—drag and drop operations according to the present invention are described below).
In step 306, the user selects a method to be used for evaluating the molecular similarity or dissimilarity between the compounds selected in step 304. In an embodiment, the similarity/dissimilarity between the compounds selected in step 304 is determined (in step 308) based on a prescribed set of evaluation properties. As described above, evaluation properties can be any properties related to the structure, function, or identity of the compounds selected in step 304. Evaluation properties include, but are not limited to, structural properties, functional properties, chemical properties, physical properties, biological properties, etc., of the compounds selected in step 304.
In an embodiment of the present invention, the selected evaluation properties may be scaled differently to reflect their relative importance in assessing the proximity (i.e., similarity or dissimilarity) between two compounds. Accordingly, also in step 306, the user selects a scale factor for each of the selected evaluation properties. Note that such selection of scale factors is optional. The user need not select a scale factor for each selected evaluation property. If the user does not select a scale factor for a given evaluation property, then that evaluation property is given a default scale factor, such as unity.
Alternatively in step 306, the user can elect to retrieve similarity/dissimilarity values pertaining to the compounds selected in step 304 from a source, such as a database. These similarity/dissimilarity values in the database were previously generated. In another embodiment, the user in step 306 can elect to determine similarity/dissimilarity values using any well-known technique or procedure.
In step 308, the map generating module 106 generates a new non-linear map. This new non-linear map includes a point for each of the compounds selected in step 304. Also, in this new non-linear map, the distance between any two points is representative of their similarity/dissimilarity. The manner in which the map generating module 106 generates the new non-linear map shall now be further described with reference to a flowchart 402 in FIG. 4.
In step 404, coordinates on the new non-linear map of points corresponding to the compounds selected in step 304 are initialized.
In step 406, two of the compounds i, j selected in step 304 are selected for processing.
In step 408, similarity/dissimilarity dij between compounds i, j is determined based on the method selected by the user in step 306.
In step 410, based on the similarity/dissimilarity dij determined in step 408, coordinates of points corresponding to compounds i, j on the non-linear map are obtained.
In step 412, training/learning parameters are updated.
In step 414, a decision is made whether to terminate. If the decision is not to terminate at this point, then control returns to step 406. Otherwise, step 416 is performed.
In step 416, the non-linear map is output (i.e., generation of the non-linear map is complete).
Details regarding the steps of flowchart 402 are discussed above.
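The control flow of flowchart 402 may be sketched as follows, for illustration only. The coordinate update in step 410 and the learning-rate schedule in step 412 shown here are simple stand-ins chosen for this sketch (the pair is moved along the line joining the two points so that the map distance approaches the target distance); they are assumptions and should not be read as the update equations of the invention. The function name, parameters, and fixed step budget are likewise hypothetical.

```python
import math
import random

def generate_map(vectors, distance, dims=2, n_steps=20000,
                 lam_start=1.0, lam_end=0.01, seed=0):
    rng = random.Random(seed)
    n = len(vectors)
    # Step 404: initialize map coordinates (randomly in this sketch).
    coords = [[rng.random() for _ in range(dims)] for _ in range(n)]
    lam = lam_start
    for step in range(n_steps):          # step 414: stop after a fixed budget
        # Step 406: select two compounds i, j.
        i, j = rng.sample(range(n), 2)
        # Step 408: determine their similarity/dissimilarity d_ij.
        d_target = distance(vectors[i], vectors[j])
        # Step 410 (assumed update): nudge both points toward the target distance.
        d_map = math.dist(coords[i], coords[j]) or 1e-9
        shift = 0.5 * lam * (d_target - d_map) / d_map
        for k in range(dims):
            delta = shift * (coords[i][k] - coords[j][k])
            coords[i][k] += delta
            coords[j][k] -= delta
        # Step 412 (assumed schedule): decrease the learning rate linearly.
        lam = lam_start + (lam_end - lam_start) * (step + 1) / n_steps
    # Step 416: output the map.
    return coords

# Hypothetical property vectors forming a 3-4-5 triangle in property space.
vectors = [[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]]
points = generate_map(vectors, math.dist)
for a, b in [(0, 1), (0, 2), (1, 2)]:
    print(a, b, round(math.dist(points[a], points[b]), 3))
```

Any of the distance measures discussed above could be passed as the `distance` argument in place of `math.dist`.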
Referring again to FIG. 3, in step 312 the map viewer 112 displays the new non-linear map on an output device 116 (such as a computer graphics monitor). Examples of non-linear maps being displayed by the map viewer 112 are shown in FIGS. 6 and 7 (described below).
In step 314, the user interface modules 108 enable operators to interactively analyze and process the compounds represented in the displayed non-linear map. These user interface functions of the present invention are described below.
The present invention enables users to modify existing compound visualization non-linear maps (as used herein, the term “compound visualization non-linear map” refers to a rendered non-linear map). For example, users can add additional compounds to the map, remove compounds from the map, highlight compounds on the map, etc. In such cases, pertinent functional steps of flowchart 302 are repeated. For example, steps 304 (selecting compounds to map), 310 (generating the non-linear map), and 312 (displaying the map) are repeated when the user opts to add new compounds to an existing map. However, according to an embodiment of the invention, the map is incrementally refined and displayed in steps 310 and 312 when adding compounds to an existing compound visualization non-linear map (this incremental refinement is described above).
9. User Interface of the Present Invention
The user interface features of the present invention are described in this section. Various user interface modules and features are described below. Also, various functional/control threads (in the present context, a functional/control thread is a series of actions performed under the control of a user) employing these user interface modules and features are described below. It will be appreciated by persons skilled in the relevant art(s) that the user interface of the present invention is very flexible, varied, and diverse. An operator can employ the user interface of the present invention to perform a wide range of activities with respect to visualizing and interactively analyzing chemical compounds. Accordingly, it should be understood that the functional/control threads described herein are provided for illustrative purposes only. The invention is not limited to these functional/control threads.
Preferably, the invention provides the following capabilities, features, and functions: displaying 2D and/or 3D chemical structures and/or chemical names; displaying compound collections and/or libraries; displaying components of structures (i.e. building blocks) of combinatorial libraries; visualization of compound collections and/or libraries as 2D and/or 3D maps of colored objects.
Also, the present invention allows the following: (1) browsing compound collections and/or libraries; (2) selection of individual compounds, collections of compounds and/or libraries of compounds; (3) selection of compounds generated in a combinatorial fashion via selection of their respective building blocks; (4) mapping, visualization, and/or linking of compounds onto and/or from 2D and/or 3D maps; (5) manipulation of the 2D and/or 3D maps such as rotation, resizing, translation, etc.; (6) manipulation of objects on the 2D and/or 3D maps such as changing the appearance of objects (visibility, size, shape, color, etc.), changing position of objects on the map, and/or changing relationships between objects on the map; (7) interactive exploring of the 2D and/or 3D maps such as querying chemical structure, querying distance, selection of individual objects and/or areas of a map, etc.
Additional user interface features, functions, and capabilities of the present invention will be apparent to persons skilled in the relevant art(s) based on the discussion contained herein.
As shown in FIG. 1, the invention includes a structure browser 110 and a map viewer 112. At any given time, each of these can have multiple instances depending on the program use.
9.1 Structure Browser
FIG. 5 illustrates a structure browser window 502 generated by the structure browser 110. The structure browser window 502 includes a frame 504, a menu pane 506, and a group of labeled tabbed pages 508. Each tabbed page holds a molecular spreadsheet or a group of labeled tabbed pages.
Each tab is associated with a compound collection (tabs 510) or a library, such as a combinatorial library (tabs 512). Selecting a collection tab 510 brings up a table of corresponding chemical structures. Selecting a library tab 512 brings up a group of tabbed pages corresponding to the sets of building blocks used to generate the library. Each of the library's tabbed pages works the same way as a compound collection tabbed page. In the example shown in FIG. 5, the tab 510 called “DDL0” is selected. DDL0 has three building block tabs 512, called “Cores,” “Acids,” and “Amines.” The “Acids” collection tab is currently selected, so that a table 522 of the structures of the compounds in the “Acids” collection is shown.
The browser window 502 includes a table 522, a slider 514, an input field 516, and two buttons: “Prev Page” 518 and “Next Page” 520. The slider 514, the input field 516, and the buttons 518, 520 facilitate browsing the content of the Acids table 522. If the content of the table 522 is regarded as a contiguous ordered list of chemical structures (compounds or building blocks), then what is shown in the browser window 502 can be regarded as a window positioned over that list. At any given moment this window displays the part of the list at its current position, and the number of structures displayed equals the size of the window, i.e., the number of cells in the table. Initially the window displays the top of the list. Moving the slider 514 changes the position of the window over the list. Entering a value into the input field 516 specifies the position of the window over the list. Pushing the “Next Page” button 520 moves the window one window size down the list, and pushing the “Prev Page” button 518 moves the window one window size up the list.
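The windowing behavior of the browser may be sketched as follows, for illustration only. The class name, method names, and the clamping of the window position to the ends of the list are assumptions of this sketch; the description above does not specify how positions beyond the list are handled.

```python
class PagedList:
    """A fixed-size window positioned over an ordered list of structures."""

    def __init__(self, items, page_size):
        self.items = list(items)
        self.page_size = page_size   # number of cells in the table
        self.position = 0            # initially the top of the list

    def _clamp(self, position):
        # Assumption: the window never runs past either end of the list.
        last_start = max(0, len(self.items) - self.page_size)
        return min(max(0, position), last_start)

    def set_position(self, position):
        """Slider movement or a value typed into the input field."""
        self.position = self._clamp(position)

    def next_page(self):
        """Move the window one window size down the list."""
        self.set_position(self.position + self.page_size)

    def prev_page(self):
        """Move the window one window size up the list."""
        self.set_position(self.position - self.page_size)

    def visible(self):
        return self.items[self.position:self.position + self.page_size]

# Hypothetical list of ten structures shown four at a time.
browser = PagedList(range(10), page_size=4)
print(browser.visible())   # [0, 1, 2, 3]
browser.next_page()
print(browser.visible())   # [4, 5, 6, 7]
```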
The user can select compounds shown in the table 522 for various actions. For example, compounds can be selected using the browser window 502 as input for the generation of a new compound visualization non-linear map, or as input for adding compounds to an existing compound visualization non-linear map. Clicking with a left mouse button over a table cell selects or deselects the corresponding compound structure (toggling). Toggling on/off also changes the color of the cell, to indicate which cells have been selected. Selected structures are displayed on a first background color, and non-selected structures are displayed on a second background color. In the example of FIG. 5, certain cells 523 in table 522 have been selected.
The menu pane 506 contains menus: File, Edit, Selection, Map, and/or other menus. The File menu facilitates file open/save, print, and exit operations. The Edit menu contains commands for editing the content of the table 522. The Selection menu provides options to select/deselect (clear) a current compound collection, a collection of building blocks of a combinatorial library, and/or all compounds. The Map menu includes commands for creating a map viewer and for displaying a selection of compounds in that map viewer. The latter option brings up a dialog window (FIG. 8), which allows the user to specify shape, color, and/or size of the selected objects, which will be used to represent the selected compounds on the map.
9.2 Map Viewer
A map viewer window 600 generated by the map viewer 112 is shown in FIG. 6 (also see FIGS. 6–10 and 13). A compound visualization non-linear map is displayed in a render area 614 of the map viewer window 600.
In a preferred embodiment, the map viewer 112 is based on Open Inventor, a C++ library of objects and methods for interactive 3D graphics, publicly available from Silicon Graphics Inc. Open Inventor relies on OpenGL for fast and flexible rendering of 3D objects. Alternatively, the map viewer 112 can be based on a publicly available VRML viewer. Alternatively, any other software and/or hardware product allowing rendering of 3D objects/scenes can be used.
In a preferred embodiment, 3D compound visualization maps of chemical compounds are implemented as Open Inventor 3D scene databases. Each map is built as an ordered collection of nodes referred to as a scene graph. Each scene graph includes, but is not limited to, nodes representing cameras (points of view), light sources, 3D shapes, object surface materials, and geometric transformations. Each chemical compound displayed on a map is associated with a 3D shape node, a material node, and a geometric transformation node.
The geometric transformation node reflects the coordinates of the compound in the map. The 3D shape node and the material node determine the shape, size, and color of the visual object associated with the compound. Combinations of a particular shape, size, and color are used to display compounds grouped by certain criteria, thus allowing easy visual differentiation of different groups/sets of compounds. The 3D shapes of the visual objects in the map include, but are not limited to, point, cube, sphere, and cone. The color of a visual object in the map can be set to any combination of three basic colors: red, green, and blue. Besides the color, the material node can specify the transparency and shininess of a visual object's surface.
In an embodiment, an object's display properties (color, intensity of color, transparency, degree of transparency, shininess, degree of shininess, etc.) can represent physical, chemical, biological, and/or other properties of the corresponding compound, such as the cost of the compound, difficulty of synthesizing the compound, whether the compound is available in a compound repository, etc. For example, the larger the molecular weight of a compound, the larger the size of the corresponding object in the display map.
Each object or point displayed in the compound visualization non-linear map represents a chemical compound. Objects in the compound visualization non-linear map can be grouped into sets.
By default, every time a set of compounds is mapped into a compound visualization non-linear map, a new set of graphical objects is created and added to the compound visualization non-linear map. All objects in a particular set can share the same attributes: shape, color, and size, thus providing an easy visual identification of the objects belonging to the same set or to different sets.
A compound can be a member of several sets. In an embodiment, for a given compound, a different object is displayed in the compound visualization non-linear map for each set of which the compound is a member. In this case the objects in the compound visualization non-linear map that represent the compound as a member of each of the sets may overlap and only the biggest object may be visible. In this case, a toggle sets feature (described below) may be used to reveal multiple set membership.
The map viewer window 600 includes a frame 602, a menu pane 604, and a viewer module preferably implemented as an Open Inventor component (examiner viewer). The viewer module incorporates the following elements: (1) a render area 614 in which the compound visualization non-linear map is displayed; (2) combinations of thumbwheels 608, 610, 612, sliders, and/or viewer function icons/buttons 620, 622, 624, 626, 628, 630, 632; and (3) pop-up menus and dialogs 616, 702, 902 which provide access to all viewer functions, features, and/or properties.
The thumbwheels 608, 610 rotate the compound visualization non-linear map around a reference point of interest. Thumbwheel 610 rotates in the y direction, and thumbwheel 608 rotates in the x direction. The origin of rotation (i.e., the camera position) is by default the geometric center of the compound visualization map 614 (render area), but can be placed anywhere in the compound visualization non-linear map. The compound visualization non-linear map can also be panned in the screen plane, as well as dollied in and out (forward/backward movement) via thumbwheel 612.
The map viewer window 600 has several different modes or states, e.g. view, pick, panning, dolly, seek, and/or others. Each mode defines a different mouse cursor and how mouse events are interpreted.
In the view mode, mouse motions are translated into rotations of the virtual trackball and corresponding rotations of the compound visualization non-linear map. The view mode is the default mode.
In the panning mode, the compound visualization non-linear map is translated in the screen plane following the mouse movements.
In the dolly mode, a scene is moved in and out of the screen according to the vertical motions of the mouse.
Seek mode allows the user to change the point of rotation (reference point) of a scene by attaching it to an object displayed in the compound visualization non-linear map.
Pick mode is used for picking (querying) objects displayed in the compound visualization non-linear map. Picking an object in a 3D scene is achieved by projecting a conical ray from the camera through a point (defined by positioning and clicking the mouse) on the near plane of the view volume. The first object in the scene intersecting with the ray cone is picked. As a response to a pick event (an object being picked by pressing the left mouse button over the object), a small window displaying the corresponding compound pops up while the left mouse button is pressed (see, for example, window 1302 in FIG. 13). The window will automatically disappear when the button is released. In order to keep the window on the screen, it is necessary to hold the shift key while releasing the mouse button.
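The selection of the first object along the pick ray may be sketched as follows, for illustration only. The sketch simplifies the conical ray described above to an infinitely thin ray and assumes that every object is a sphere given by a center and a radius; it does not use or reproduce the Open Inventor picking interface, and the function name is hypothetical.

```python
import math

def pick_first(camera, direction, objects):
    """Return the index of the nearest sphere hit by the ray, or None.

    objects is a list of (center, radius) pairs; camera and center are
    3D coordinates; direction need not be normalized.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    unit = [c / norm for c in direction]
    best_index, best_t = None, math.inf
    for index, (center, radius) in enumerate(objects):
        to_center = [c - o for c, o in zip(center, camera)]
        # Distance along the ray to the point closest to the sphere center.
        t_closest = sum(a * b for a, b in zip(to_center, unit))
        # Squared distance from the sphere center to the ray.
        dist_sq = sum(a * a for a in to_center) - t_closest ** 2
        if dist_sq > radius ** 2:
            continue                     # the ray misses this sphere
        t_hit = t_closest - math.sqrt(radius ** 2 - dist_sq)
        if 0 <= t_hit < best_t:          # in front of the camera and nearer
            best_index, best_t = index, t_hit
    return best_index

# Hypothetical scene: the camera at the origin looks down the negative z axis.
scene = [((0, 0, -10), 1.0), ((0, 0, -5), 1.0), ((5, 0, -5), 1.0)]
print(pick_first((0, 0, 0), (0, 0, -1), scene))  # 1 (the nearer sphere on the ray)
```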
Switching between the above-described modes can be achieved by selecting a mode from a pop-up menu, by clicking on a shortcut icon/button, and/or by pressing and/or holding a combination of mouse buttons and/or keys on a keyboard. In a preferred embodiment, selecting a pointed arrow icon/button 620 switches to the pick mode. Selecting a hand icon/button 622 switches to the view mode; selecting a target icon/button 624 switches to the seek mode. Pressing and holding the middle mouse button switches to the panning mode. Pressing and holding the left and middle mouse buttons simultaneously switches to the dolly mode.
Certain actions can also be executed via thumbwheels and/or sliders, e.g. turning the dolly thumbwheel 612 moves the scene in and out of the screen. Also, turning the X and/or Y rotation thumbwheels 608, 610 rotates the scene accordingly around the point of rotation.
In a preferred embodiment, the right mouse button is reserved for the pop-up menus 616, 902. Pressing the right mouse button anywhere over an empty rendering area brings up the viewer pop-up menu 902. Pressing the right mouse button over an object brings up the object pop-up menu 616.
The viewer pop-up menu 902 allows the user to select the mode (such modes are described above), change viewer properties (set up preferences, e.g. background color), toggle on/off sets of objects, and/or access any other viewer features.
The object pop-up menu 616 allows the user to change an object's shape, color (material), and/or size, select the corresponding set of compounds, and/or define a neighborhood 3D area around the object (zoom feature, described below). In a preferred embodiment, all changes made to an object automatically apply to all other objects from the same set. The object's shape can be changed to one of the predefined basic shapes (e.g. dot, cube, sphere, cone). The object's material (color) is changed via a color dialog. The object's size is changed via a resize dialog. Any set of objects can be visible (toggled on) or hidden (toggled off). A toggle sets command brings up a list of sets defined for the current map 640. Clicking on a set in the list (highlighting/clearing) toggles the set off and on.
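One possible way to obtain the behavior described above, where a change made to one object applies to every object of the same set, is to store the appearance attributes on the set rather than on each object. The following is a minimal sketch under that assumption; the class names ObjectSet and MapScene and their methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ObjectSet:
    """A named set of objects sharing one shape, color, size, and visibility flag."""
    name: str
    members: list
    shape: str = "sphere"
    color: str = "gray"
    size: float = 1.0
    visible: bool = True


class MapScene:
    def __init__(self):
        self.sets = {}    # set name -> ObjectSet
        self.owner = {}   # object id -> name of the set it belongs to

    def add_set(self, obj_set):
        self.sets[obj_set.name] = obj_set
        for obj_id in obj_set.members:
            self.owner[obj_id] = obj_set.name

    def change_object(self, obj_id, **attrs):
        """Change shape, color, and/or size; the change applies to the whole set."""
        target = self.sets[self.owner[obj_id]]
        for key, value in attrs.items():
            if key not in ("shape", "color", "size"):
                raise ValueError(f"unsupported attribute: {key}")
            setattr(target, key, value)

    def toggle_set(self, set_name):
        """Flip a set between visible and hidden; return the new state."""
        obj_set = self.sets[set_name]
        obj_set.visible = not obj_set.visible
        return obj_set.visible

    def visible_objects(self):
        return [o for s in self.sets.values() if s.visible for o in s.members]
```

Storing attributes per set keeps the toggle sets command to a single flag flip, at the cost of not supporting per-object overrides.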
Invoking the zoom feature (via the pick neighbors command on the object pop-up menu 616, for example) creates a sphere 704 in the render area 614 (FIG. 7), which is centered on the object. The radius of the sphere 704 can be adjusted via a resize dialog 702 to select a desired neighborhood area around the object. All objects (and corresponding compounds) encompassed by the sphere 704 can then be selected, displayed in a different map, added to a new or existing set, dragged to a target (described below), and/or viewed in a structure browser window 502.
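The neighborhood selection performed by the zoom feature amounts to a radius query around the chosen object. A minimal sketch follows, assuming objects are represented by their map coordinates and that an object is "encompassed" when its center lies within the sphere. The function name pick_neighbors mirrors the menu command but is otherwise hypothetical.

```python
import math


def pick_neighbors(center_id, coords, radius):
    """Return the ids of all objects whose centers lie within the sphere.

    coords maps an object id to its (x, y, z) map coordinates. The sphere is
    centered on the object center_id, which is itself part of the result.
    """
    center = coords[center_id]
    return sorted(
        obj_id for obj_id, point in coords.items()
        if math.dist(center, point) <= radius
    )
```

Enlarging the radius, as the resize dialog does, simply reruns the same query with a larger value.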
The map viewer 112 is capable of maintaining an interactive selection of objects/compounds. All selected objects are visualized in the same shape, color, and/or size. In other words, selecting an object changes its shape, color, and/or size (e.g. to a purple cone), and deselecting the object restores its original shape, color, and/or size. Executing the select set command from the object pop-up menu 616 selects the whole set of objects to which the object belongs. Alternatively, an individual object can be selected or deselected by clicking the middle mouse button over the object. The interactive selection of objects can be converted to a set of compounds and displayed in a structure browser window 502. The current selection can be converted into a set of compounds by invoking the save selection command from a selection menu, and/or it can be cleared by executing the clear selection command from the selection menu.
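The selection behavior described above requires remembering each object's original attributes so that they can be restored on deselection. A minimal sketch of one way to do this is shown below; the class name Selection, its method names, and the purple-cone highlight style stored in HIGHLIGHT are illustrative assumptions.

```python
class Selection:
    """Tracks selected objects and restores their original look on deselect."""

    # Illustrative highlight style, following the "purple cone" example.
    HIGHLIGHT = {"shape": "cone", "color": "purple"}

    def __init__(self, attributes):
        self.attributes = attributes  # object id -> dict of display attributes
        self._saved = {}              # object id -> attributes before selection

    def select(self, obj_id):
        if obj_id in self._saved:
            return  # already selected; keep the originally saved attributes
        self._saved[obj_id] = dict(self.attributes[obj_id])
        self.attributes[obj_id].update(self.HIGHLIGHT)

    def deselect(self, obj_id):
        original = self._saved.pop(obj_id, None)
        if original is not None:
            self.attributes[obj_id] = original

    def toggle(self, obj_id):
        """Select or deselect one object, as a middle-button click would."""
        if obj_id in self._saved:
            self.deselect(obj_id)
        else:
            self.select(obj_id)

    def save_selection(self):
        """Return the current selection as a sorted list of object ids."""
        return sorted(self._saved)

    def clear(self):
        for obj_id in list(self._saved):
            self.deselect(obj_id)
```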
9.3 Interactivity of the Present Invention
As should be apparent from the above, the present invention enables users to interact with the objects/compounds displayed in a compound visualization non-linear map. This interactivity provided by the present invention shall be further illustrated below.
9.3.1 Map Viewer as Target
According to the present invention, a user can select a plurality of compounds from some source, and then add those compounds to a new or an existing compound visualization non-linear map being displayed in a map window 600. In this instance, the map window 600 (or, equivalently in this context, the map viewer 112) is acting as a target for an interactive user activity.
This operation is conceptually shown in FIG. 14. A compound visualization non-linear map 1404 is being displayed in a map window 600. According to the present invention, the user can select compounds from a structure browser window 502, and then add those selected compounds (through, for example, well known drag and drop operations) to the compound visualization non-linear map 1404. Similarly, the user can select compounds from a compound database 122, or from a MS (mass spectrometry) viewer 1402, and then add those compounds to the compound visualization non-linear map 1404.
According to an embodiment of the invention, new compounds are added to an existing compound visualization non-linear map by incremental refinement of the compound visualization non-linear map. Such incremental refinement is described above.
9.3.2 Map Viewer as Source
According to the present invention, a user can select a plurality of compounds from a map window 600, and then have those compounds processed by a target. In this instance, the map window 600 (or, equivalently in this context, the map viewer 112) is acting as a source for an interactive user activity.
This operation is conceptually shown in FIG. 13. A user selects one or more compounds from the compound visualization non-linear map being displayed in the map window 600, and then drags and drops the selected compounds to a target. The described action is interpreted as a submission of the corresponding chemical structure(s) to the receiving target for processing. The receiving object can be anything that can handle a chemical structure: another map viewer 112, a structure viewer 110, a (molecular) spreadsheet 136, a database 120, an experiment planner 140, an active site docker 144, an NMR widget 130, an MS widget 134, a QSAR model 138, a property prediction program 142, or any other suitable process. For example, dragging and dropping a compound onto an NMR widget would display this compound's NMR spectrum, either an experimental or a predicted one.
The experiment planner is described in pending U.S. Patent Application titled “SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR IDENTIFYING CHEMICAL COMPOUNDS HAVING DESIRED PROPERTIES,” Ser. No. 10/170,628, herein incorporated by reference in its entirety.
The drag and drop concept described above provides a powerful enhancement of a 3D mapping and visualization of compound collections and libraries. Any conceivable information about a set of chemical compounds can thus be easily accessed from the compound visualization non-linear map. For example, a map of compounds capable of binding to an active site of a given enzyme or receptor would benefit from the possibility to visualize how compounds from the different areas of the map bind to that enzyme or receptor.
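The submission of dropped structures to a receiving target can be pictured as a simple dispatch from a target name to a handler. The sketch below is only an illustration under stated assumptions: structures are stood in for by plain strings, the two handlers are placeholders that return a message instead of doing real processing, and the names TARGETS and drop are hypothetical.

```python
def show_nmr_spectrum(structure):
    # Placeholder for an NMR widget that would display a spectrum.
    return f"NMR spectrum requested for {structure}"


def add_to_spreadsheet(structure):
    # Placeholder for a molecular spreadsheet that would add a row.
    return f"spreadsheet row added for {structure}"


# Registry of receiving targets; any process that can handle a chemical
# structure could be registered here.
TARGETS = {
    "nmr_widget": show_nmr_spectrum,
    "spreadsheet": add_to_spreadsheet,
}


def drop(selected_structures, target_name):
    """Submit each dragged structure to the named target and collect the results."""
    handler = TARGETS.get(target_name)
    if handler is None:
        raise KeyError(f"no target registered under {target_name!r}")
    return [handler(structure) for structure in selected_structures]
```

For example, drop(["CCO"], "nmr_widget") hands the single structure "CCO" to the NMR placeholder. Adding a new kind of target only requires a new registry entry.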
9.4 Multiple Maps
According to the present invention, it is possible to create multiple visual maps for any given set of collections and/or libraries of chemical compounds. Multiple visual maps can be based on the same and/or different non-linear maps. Visual maps based on the same non-linear map can display different subsets of compounds and/or present different views of the same set of compounds (e.g. one visual map can display an XY plane view and another visual map can display an orthogonal, YZ plane view). Visual maps based on different non-linear maps can visualize the same set of compounds on different projections, for example, maps derived from different similarity relations between these compounds.
If a compound is mapped on multiple visual maps, the visual objects representing the compound on the different maps can be crosslinked. Crosslinking means that any modifications made to a visual object in one of the visual maps will be automatically reflected in the other visual maps. For example, if an object is selected on one of the visual maps, it will be displayed as selected on the other visual maps as well. In fact, all objects on all maps can be crosslinked provided that they represent the same chemical compounds. Multiple visual maps can also be crosslinked in such a way that mapping any additional compounds onto one of the visual maps will automatically map the same compounds onto the crosslinked maps.
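Crosslinking as described above can be sketched by keying every visual object on the identifier of the compound it represents and propagating each change to all linked maps. The class names VisualMap and CrosslinkGroup below are hypothetical, and only the selected state is tracked to keep the example short.

```python
class VisualMap:
    """One visual map; tracks the selected state of each compound it displays."""

    def __init__(self, name, compound_ids):
        self.name = name
        self.selected = {cid: False for cid in compound_ids}


class CrosslinkGroup:
    """Propagates changes across maps that display the same compounds."""

    def __init__(self, maps):
        self.maps = list(maps)

    def set_selected(self, compound_id, selected=True):
        # Update every linked map that displays this compound.
        for visual_map in self.maps:
            if compound_id in visual_map.selected:
                visual_map.selected[compound_id] = selected

    def add_compound(self, compound_id):
        # Mapping a new compound onto one map maps it onto all linked maps.
        for visual_map in self.maps:
            visual_map.selected.setdefault(compound_id, False)
```

A map that does not display a given compound is left untouched by set_selected, which matches the condition that crosslinked objects must represent the same chemical compound.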
10. Examples
The present invention is useful for visualizing and interactively processing any chemical entities including but not limited to small molecules, polymers, peptides, proteins, etc. It may also be used to display different similarity relationships between these compounds.
The present invention has been described above with the aid of functional building blocks illustrating the performance of specified functions and relationships thereof. The boundaries of these functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternate boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Any such alternate boundaries are thus within the scope and spirit of the claimed invention. These functional building blocks may be implemented by discrete components, application specific integrated circuits, processors executing appropriate software and the like or any combination thereof. It is well within the scope of one skilled in the relevant art(s) to develop the appropriate circuitry and/or software to implement these functional building blocks.
While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example only, and not limitation. Thus, the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims (20)

1. A method for graphically interfacing between a computer system and a user, wherein the computer system interactively displays objects representative of chemical compounds, wherein distances between the objects represent dissimilarity between the corresponding chemical compounds, comprising:
(1) receiving a user selected list of chemical compounds;
(2) displaying objects representative of the user-selected chemical compounds in a window of a display screen, wherein distances between the objects represent dissimilarity between the corresponding chemical compounds;
(3) receiving user input relating to one or more of the following:
(i) deleting one or more of the objects from the window;
(ii) adding one or more additional objects to the window;
(iii) displaying chemical compound information associated with one or more of the objects;
(iv) selecting between having the computer system evaluate the dissimilarities or retrieve dissimilarity values from a source;
(v) selecting one or more dissimilarity evaluation techniques;
(vi) selecting one or more properties to be evaluated as part of a dissimilarity evaluation;
(vii) selecting a scaling factor for one or more of the properties.
2. The method according to claim 1, further comprising:
(4) repeating steps (1) through (3) for a second user-selected list of chemical compounds, wherein objects representative of the second user-selected chemical compounds are displayed in the window.
3. The method according to claim 1, further comprising:
(4) repeating steps (1) through (3) for a second user-selected list of chemical compounds, wherein objects representative of the second user-selected chemical compounds are displayed in a second window.
4. The method according to claim 1, wherein step (3) further comprises receiving user input relating to one or more of the following:
(viii) dragging one or more objects from the first window to a second window of the display screen; and
(ix) dragging one or more objects to the first window from the second window.
5. The method according to claim 1, wherein step (3) further comprises receiving user input relating to one or more of the following:
(viii) selecting one or more of the objects; and
(ix) selecting one or more types of information related to the associated chemical compounds to be displayed.
6. The method according to claim 5, wherein the one or more types of information include one or more selected from: chemical compound information; active site docker information; and nuclear magnetic resonance information.
7. The method according to claim 1, wherein step (3) further comprises receiving user input relating to one or more of the following:
(viii) selecting one or more areas of the window; and
(ix) selecting one or more types of information related to the associated chemical compounds to be displayed.
8. The method according to claim 7, wherein the one or more types of information include one or more selected from: chemical compound information; active site docker information; and nuclear magnetic resonance information.
9. The method according to claim 1, wherein step (3) further comprises receiving user input relating to one or more of the following:
(viii) setting a number of dimensions represented in the window;
(ix) manipulating an orientation of the window;
(x) manipulating a zooming function associated with the window; and
(xi) manipulating one or more appearance features of one or more of the objects.
10. The method according to claim 9, wherein said manipulating an orientation of the window comprises manipulating one or more of rotation, resizing, and translation.
11. The method according to claim 9, wherein the one or more appearance features comprise one or more selected from: size; shape; color; intensity of color; degree of visibility; degree of transparency; and degree of shininess.
12. The method according to claim 11, wherein the one or more appearance features represent one or more of the following: a physical feature of the corresponding chemical compound; a chemical feature of the corresponding chemical compound; a biological feature of the corresponding chemical compound; a cost of the corresponding chemical compound; a difficulty of synthesizing of the corresponding chemical compound; and an availability of the corresponding compound.
13. The method according to claim 1, wherein step (3) further comprises receiving user input relating to one or more of the following:
(a) changing positions of one or more of the objects; and
(b) changing relationships between two or more of the objects.
14. The method according to claim 1, further comprising displaying multiple sets of objects on the window, wherein step (3) further comprises receiving user input commanding the computer system to toggle between the multiple sets of objects.
15. The method according to claim 1, wherein step (1) comprises allowing the user to drag a set of one or more selected compounds from a second window into the first window.
16. The method according to claim 1, wherein step (1) comprises allowing the user to select the list of chemical compounds from a structure browser window.
17. The method according to claim 1, wherein step (1) comprises allowing the user to type the list of chemical compounds.
18. The method according to claim 1, wherein step (1) comprises allowing the user to select building blocks, wherein the computer system generates a combinatorial library of chemical compounds from the user-selected building blocks.
19. The method according to claim 1, further comprising:
(4) displaying a structure browser window, said structure browser window including a plurality of user-selectable tabbed pages, each said user-selectable tabbed page associated with a set of chemical compounds or a library, wherein each said library tab is associated with a second set of tabbed pages corresponding to building blocks associated with the corresponding library, wherein the user can select one or more chemical compounds and/or de-select one or more chemical compounds and/or one or more building blocks for display in the window.
20. A computer program product comprising a computer useable medium having computer program logic stored therein, said computer program logic enabling a computer system to graphically interface with one or more users to interactively display information related to chemical compounds, wherein said computer program logic comprises:
(a) a compound selection function that enables the computer system to receive a user-selected list of chemical compounds;
(b) a displaying function that enables the computer system to display objects representative of the user-selected chemical compounds in a window of a display screen, wherein distances between the objects represent dissimilarity between the corresponding chemical compounds; and
(c) a user-input function that enables the computer system to receive user input relating to one or more of the following:
(i) deleting one or more of the objects from the window;
(ii) adding one or more additional objects to the window; and
(iii) displaying chemical compound information associated with one or more of the objects;
(iv) selecting between having the computer system evaluate the dissimilarities or retrieve dissimilarity values from a source;
(v) selecting one or more dissimilarity evaluation techniques;
(vi) selecting one or more properties to be evaluated as part of a dissimilarity evaluation;
(vii) selecting a scaling factor for one or more of the properties.
US09/802,956 1996-11-04 2001-03-12 Method, system, and computer program for displaying chemical data Expired - Fee Related US7188055B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US09/802,956 US7188055B2 (en) 1996-11-04 2001-03-12 Method, system, and computer program for displaying chemical data

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US3018796P 1996-11-04 1996-11-04
US08/963,872 US6295514B1 (en) 1996-11-04 1997-11-04 Method, system, and computer program product for representing similarity/dissimilarity between chemical compounds
US09/802,956 US7188055B2 (en) 1996-11-04 2001-03-12 Method, system, and computer program for displaying chemical data

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US08/963,872 Continuation US6295514B1 (en) 1996-11-04 1997-11-04 Method, system, and computer program product for representing similarity/dissimilarity between chemical compounds

Publications (2)

Publication Number Publication Date
US20020069043A1 US20020069043A1 (en) 2002-06-06
US7188055B2 true US7188055B2 (en) 2007-03-06

Family

ID=21852972

Family Applications (4)

Application Number Title Priority Date Filing Date
US08/963,872 Expired - Lifetime US6295514B1 (en) 1996-11-04 1997-11-04 Method, system, and computer program product for representing similarity/dissimilarity between chemical compounds
US08/963,870 Expired - Lifetime US6421612B1 (en) 1996-11-04 1997-11-04 System, method and computer program product for identifying chemical compounds having desired properties
US09/802,956 Expired - Fee Related US7188055B2 (en) 1996-11-04 2001-03-12 Method, system, and computer program for displaying chemical data
US10/170,628 Abandoned US20030014191A1 (en) 1996-11-04 2002-06-14 System, method and computer program product for identifying chemical compounds having desired properties

Family Applications Before (2)

Application Number Title Priority Date Filing Date
US08/963,872 Expired - Lifetime US6295514B1 (en) 1996-11-04 1997-11-04 Method, system, and computer program product for representing similarity/dissimilarity between chemical compounds
US08/963,870 Expired - Lifetime US6421612B1 (en) 1996-11-04 1997-11-04 System, method and computer program product for identifying chemical compounds having desired properties

Family Applications After (1)

Application Number Title Priority Date Filing Date
US10/170,628 Abandoned US20030014191A1 (en) 1996-11-04 2002-06-14 System, method and computer program product for identifying chemical compounds having desired properties

Country Status (7)

Country Link
US (4) US6295514B1 (en)
EP (2) EP0935789A1 (en)
JP (2) JP2001503546A (en)
AU (2) AU722989B2 (en)
CA (2) CA2270527A1 (en)
IL (2) IL129498A0 (en)
WO (2) WO1998020437A2 (en)

US20030168585A1 (en) * 2002-03-05 2003-09-11 Michael Wall Determination of sample purity through mass spectroscopy analysis
EP1485198A1 (en) * 2002-03-22 2004-12-15 Morphochem Aktiengesellschaft Für Kombinatorische Chemie Methods and systems for discovery of chemical compounds and their synthesis
CA2480202A1 (en) * 2002-04-10 2003-10-23 Transtech Pharma, Inc. System and method for data analysis, manipulation, and visualization
CA2484625A1 (en) * 2002-05-09 2003-11-20 Surromed, Inc. Methods for time-alignment of liquid chromatography-mass spectrometry data
US7805437B1 (en) * 2002-05-15 2010-09-28 Spotfire Ab Interactive SAR table
US7046247B2 (en) * 2002-05-15 2006-05-16 Hewlett-Packard Development Company, L.P. Method for visualizing graphical data sets having a non-uniform graphical density for display
CN1656364A (en) * 2002-05-22 2005-08-17 第一应答器系统及技术有限责任公司 Processing system for remote chemical identification
AU2003231879A1 (en) * 2002-05-28 2003-12-12 The Trustees Of The University Of Pennsylvania Methods, systems, and computer program products for computational analysis and design of amphiphilic polymers
WO2004011905A2 (en) 2002-07-29 2004-02-05 Correlogic Systems, Inc. Quality assurance/quality control for electrospray ionization processes
US6947579B2 (en) 2002-10-07 2005-09-20 Technion Research & Development Foundation Ltd. Three-dimensional face recognition
US7580903B2 (en) * 2002-10-24 2009-08-25 Complex Systems Engineering, Inc. Process for the creation of fuzzy cognitive maps from Monte Carlo simulation generated Meta Model
AU2003287250B9 (en) * 2002-10-30 2010-01-28 Ptc Therapeutics, Inc. Identifying therapeutic compounds based on their physical-chemical properties
TW200411574A (en) * 2002-12-31 2004-07-01 Ind Tech Res Inst Artificial intelligent system for classification of protein family
US7013238B1 (en) * 2003-02-24 2006-03-14 Microsoft Corporation System for delivering recommendations
WO2005007806A2 (en) * 2003-05-07 2005-01-27 Duke University Protein design for receptor-ligand recognition and binding
CA2466792A1 (en) * 2003-05-16 2004-11-16 Affinium Pharmaceuticals, Inc. Evaluation of spectra
US7853406B2 (en) * 2003-06-13 2010-12-14 Entelos, Inc. Predictive toxicology for biological systems
GB2403636A (en) * 2003-07-02 2005-01-05 Sony Uk Ltd Information retrieval using an array of nodes
US8373660B2 (en) * 2003-07-14 2013-02-12 Matt Pallakoff System and method for a portable multimedia client
AU2004261222A1 (en) * 2003-08-01 2005-02-10 Correlogic Systems, Inc. Multiple high-resolution serum proteomic features for ovarian cancer detection
JP2007501617A (en) * 2003-08-07 2007-02-01 ジーン ロジック インコーポレイテッド Primary rat hepatotoxicity modeling
US20050065733A1 (en) * 2003-08-08 2005-03-24 Paul Caron Visualization of databases
US20050079476A1 (en) * 2003-10-10 2005-04-14 Sutherland Scot M. Method of predictive assessment
CA2546567A1 (en) * 2003-11-21 2005-06-09 Optive Research, Inc. Method for providing a canonical structural representation
JPWO2005052819A1 (en) * 2003-11-28 2007-12-06 富士通株式会社 Material name setting support device, material name setting support program, and material name setting support method
JP4774534B2 (en) * 2003-12-11 2011-09-14 アングーク ファーマシューティカル カンパニー,リミティド A diagnostic method for biological status through the use of a centralized adaptive model and remotely manipulated sample processing
US20050177318A1 (en) * 2004-02-10 2005-08-11 National Institute Of Statistical Sciences Methods, systems and computer program products for identifying pharmacophores in molecules using inferred conformations and inferred feature importance
US20080281526A1 (en) * 2004-03-22 2008-11-13 Diggans James C Methods For Molecular Toxicology Modeling
US7248360B2 (en) * 2004-04-02 2007-07-24 Ppd Biomarker Discovery Sciences, Llc Polychronic laser scanning system and method of use
US20050222828A1 (en) * 2004-04-02 2005-10-06 Ehtibar Dzhafarov Method for computing subjective dissimilarities among discrete entities
WO2005100989A2 (en) * 2004-04-07 2005-10-27 Gene Logic, Inc. Hepatotoxicity molecular models
WO2006004986A1 (en) * 2004-06-29 2006-01-12 Pharmix Corporation Estimating the accuracy of molecular property models and predictions
US20060052943A1 (en) * 2004-07-28 2006-03-09 Karthik Ramani Architectures, queries, data stores, and interfaces for proteins and drug molecules
US20060031027A1 (en) * 2004-08-03 2006-02-09 Alman David H Method and apparatus for predicting properties of a chemical mixture
AU2005283930B2 (en) 2004-09-15 2011-07-14 Bp Oil International Limited Process for evaluating a refinery feedstock
US8000837B2 (en) * 2004-10-05 2011-08-16 J&L Group International, Llc Programmable load forming system, components thereof, and methods of use
DE102006001780A1 (en) * 2005-01-14 2006-08-24 Siemens Corp. Research, Inc. Method for diagnosis of amyotrophic lateral sclerosis, comprising surface-enhanced desorption-ionisation mass spectrometry of proteins from patients and analysing peak values on an alternating decision tree
EP1861704A2 (en) * 2005-02-09 2007-12-05 Correlogic Systems, Inc. Identification of bacteria and spores
US20080262467A1 (en) * 2005-02-16 2008-10-23 Humphrey Joseph A C Blood Flow Bypass Catheters and Methods for the Delivery of Medium to the Vasculature and Body Ducts
GB2439032B (en) * 2005-04-15 2010-10-20 Thermo Crs Ltd Method and system for sample testing
FI20055198A (en) * 2005-04-28 2006-10-29 Valtion Teknillinen Visualization technology for biological information
US20080312514A1 (en) * 2005-05-12 2008-12-18 Mansfield Brian C Serum Patterns Predictive of Breast Cancer
US20070150424A1 (en) * 2005-12-22 2007-06-28 Pegasus Technologies, Inc. Neural network model with clustering ensemble approach
US7978889B2 (en) * 2006-01-27 2011-07-12 Michael Valdiserri Automatic engine for 3D object generation from volumetric scan data and method
US20080015833A1 (en) * 2006-07-13 2008-01-17 Searete Llc, A Limited Liability Corporation Of The State Of Delaware Methods and systems for molecular inhibition of protein misfolding
US20080014572A1 (en) * 2006-07-13 2008-01-17 Searete Llc, A Limited Liability Corporation Of The State Of Delaware Methods and systems for molecular inhibition
US20080015835A1 (en) * 2006-07-13 2008-01-17 Searete Llc, A Limited Liability Corporation Of The State Of Delaware Methods and systems for treating disease
US20090082344A1 (en) * 2006-07-13 2009-03-26 Searete Llc Methods and systems for treating disease
US20080165135A1 (en) * 2007-01-10 2008-07-10 Jao-Ching Lin Functional expansion system for a touch pad
WO2008100633A1 (en) * 2007-02-15 2008-08-21 The Board Of Trustees Of The Leland Stanford Junior University Precursor selection method for chemical vapor deposition techniques
WO2008116495A1 (en) * 2007-03-26 2008-10-02 Molcode Ltd Method and apparatus for the design of chemical compounds with predetermined properties
US7785218B2 (en) 2007-03-26 2010-08-31 Acushnet Company Custom milled iron set
US8229875B2 (en) * 2007-04-11 2012-07-24 Oracle International Corporation Bayes-like classifier with fuzzy likelihood
SG182976A1 (en) * 2007-06-29 2012-08-30 Ahngook Pharmaceutical Co Ltd Predictive markers for ovarian cancer
EP2222851B1 (en) 2007-11-20 2017-06-28 Ionis Pharmaceuticals, Inc. Modulation of cd40 expression
US7966588B1 (en) 2008-01-26 2011-06-21 National Semiconductor Corporation Optimization of electrical circuits
US9087164B2 (en) * 2008-01-26 2015-07-21 National Semiconductor Corporation Visualization of tradeoffs between circuit designs
US20100030035A1 (en) * 2008-08-04 2010-02-04 The Hong Kong Polytechnic University Fuzzy system for cardiovascular disease and stroke risk assessment
US20100070200A1 (en) * 2008-09-17 2010-03-18 Mehmet Sarikaya Method and system for designing polypeptides and polypeptide-like polymers with specific chemical and physical characteristics
US8103672B2 (en) * 2009-05-20 2012-01-24 Detectent, Inc. Apparatus, system, and method for determining a partial class membership of a data record in a class
US20110202328A1 (en) * 2009-10-02 2011-08-18 Exxonmobil Research And Engineering Company System for the determination of selective absorbent molecules through predictive correlations
GB2475473B (en) * 2009-11-04 2015-10-21 Nds Ltd User request based content ranking
US8954893B2 (en) * 2009-11-06 2015-02-10 Hewlett-Packard Development Company, L.P. Visually representing a hierarchy of category nodes
US8530838B2 (en) * 2009-12-29 2013-09-10 Saint-Gobain Ceramics & Plastics, Inc. Radiation detection system and method of indicating presence of radiation
IN2012DN06588A (en) 2010-02-10 2015-10-23 Novartis Ag
US8712741B2 (en) 2010-06-28 2014-04-29 National Semiconductor Corporation Power supply architecture system designer
CN102541286B (en) * 2010-12-24 2015-09-16 北大方正集团有限公司 For building the method and apparatus of organic chemical structural formula
CN102541423A (en) * 2010-12-24 2012-07-04 北大方正集团有限公司 Method and device for compiling organic chemical structural formulas
CN102566876B (en) * 2010-12-24 2015-03-25 北大方正集团有限公司 Method and device for switching focus of organic chemical structural formula
CN102855230A (en) * 2011-06-30 2013-01-02 北大方正集团有限公司 Method and device for editing organic chemical structural formula
US20130041894A1 (en) * 2011-08-10 2013-02-14 International Business Machines Corporation Mitigating Environment, Health, and Safety Complications
CN103150296B (en) * 2011-12-06 2016-01-20 北大方正集团有限公司 The edit methods of atom belonging and device
US20130308840A1 (en) * 2012-04-23 2013-11-21 Targacept, Inc. Chemical entity search, for a collaboration and content management system
US9899066B2 (en) * 2012-09-10 2018-02-20 Texas Instruments Incorporated Priority based backup in nonvolatile logic arrays
US9646279B2 (en) * 2012-09-28 2017-05-09 Rex Wiig System and method of a requirement, compliance and resource management
US10268974B2 (en) * 2012-09-28 2019-04-23 Rex Wiig System and method of a requirement, compliance and resource management
KR102029055B1 (en) * 2013-02-08 2019-10-07 삼성전자주식회사 Method and apparatus for high-dimensional data visualization
BR112016003993A2 (en) * 2013-10-23 2017-09-12 Dow Global Technologies Llc method, computing device and system
US10943194B2 (en) 2013-10-25 2021-03-09 The Boeing Company Product chemical profile system
US10733499B2 (en) 2014-09-02 2020-08-04 University Of Kansas Systems and methods for enhancing computer assisted high throughput screening processes
KR101684742B1 (en) 2014-11-27 2016-12-09 이화여자대학교 산학협력단 Method and system for drug virtual screening and construction of focused screening library
GB2557113B (en) * 2015-10-30 2022-07-20 Halliburton Energy Services Inc Producing chemical formulations with cognitive computing
US10915808B2 (en) * 2016-07-05 2021-02-09 International Business Machines Corporation Neural network for chemical compounds
WO2018018025A1 (en) * 2016-07-21 2018-01-25 Ayasdi, Inc. Topological data analysis of data from a fact table and related dimension tables
US10998087B2 (en) * 2016-08-25 2021-05-04 The Government of the United States of America as represented by the Secretary of Homeland Security Systems and methodologies for designing simulant compounds
WO2018098588A1 (en) * 2016-12-02 2018-06-07 Lumiant Corporation Computer systems for and methods of identifying non-elemental materials based on atomistic properties
US10430395B2 (en) 2017-03-01 2019-10-01 International Business Machines Corporation Iterative widening search for designing chemical compounds
EP3607472A1 (en) * 2017-04-03 2020-02-12 American Chemical Society Systems and methods for query and index optimization for retrieving data in instances of a formulation data structure from a database
EP3612545A4 (en) * 2017-04-18 2021-01-13 X-Chem, Inc. Methods for identifying compounds
WO2019006391A1 (en) * 2017-06-30 2019-01-03 Sri International Apparatuses for reaction screening and optimization, and methods thereof
JP7201981B2 (en) * 2017-06-30 2023-01-11 学校法人 明治薬科大学 Prediction device, prediction method and prediction program
US10426424B2 (en) 2017-11-21 2019-10-01 General Electric Company System and method for generating and performing imaging protocol simulations
US20190286792A1 (en) * 2018-03-13 2019-09-19 International Business Machines Corporation Chemical compound discovery using machine learning technologies
CN109539596B (en) * 2018-11-28 2020-10-23 西安工程大学 GA-GRNN-based solar heat collection system photo-thermal efficiency prediction method
JP7330712B2 (en) * 2019-02-12 2023-08-22 株式会社日立製作所 Material property prediction device and material property prediction method
EP3712579B1 (en) * 2019-03-18 2021-12-29 Evonik Operations GmbH Method for generating a composition for paints, varnishes, inks, grinding resins, pigment concentrates or other coatings
AR118333A1 (en) * 2019-03-18 2021-09-29 Evonik Operations Gmbh METHOD OF GENERATING A COMPOSITION FOR PAINTS, VARNISHES, PRINTING INKS, GRINDING RESINS, PIGMENT CONCENTRATES OR OTHER COATING MATERIALS
CN110444250A (en) * 2019-03-26 2019-11-12 广东省微生物研究所(广东省微生物分析检测中心) High-throughput drug virtual screening system based on molecular fingerprint and deep learning
US12009066B2 (en) * 2019-05-22 2024-06-11 International Business Machines Corporation Automated transitive read-behind analysis in big data toxicology
US11710261B2 (en) * 2019-07-29 2023-07-25 University Of Southern California Scan-specific recurrent neural network for image reconstruction
CN115087865A (en) * 2020-01-27 2022-09-20 药水人工智能股份有限公司 Methods, systems, and apparatus for generating chemical data sequences for de novo chemical recipes using neural networks
EP4154194A1 (en) 2020-05-22 2023-03-29 BASF Coatings GmbH Prediction of properties of a chemical mixture
CN116097268A (en) * 2020-08-31 2023-05-09 松下知识产权经营株式会社 Characteristic display method, characteristic display device, information processing device, and program
US12045743B2 (en) * 2020-09-24 2024-07-23 Microsoft Technology Licensing, Llc Mixing techniques for probabilistic quantum circuits with fallback
WO2022130648A1 (en) 2020-12-18 2022-06-23 富士通株式会社 Information processing program, information processing method, and information processing device
JP2022118555A (en) * 2021-02-02 2022-08-15 富士通株式会社 Optimization device, optimization method, and optimization program
CN118140269A (en) 2021-10-28 2024-06-04 松下知识产权经营株式会社 Information processing method, information processing device, and program

Citations (63)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4773099A (en) 1985-10-10 1988-09-20 The Palantir Corporation Pattern classification means for use in a pattern recognition system
US4811217A (en) 1985-03-29 1989-03-07 Japan Association For International Chemical Information Method of storing and searching chemical structure data
US4859736A (en) 1987-03-30 1989-08-22 Ciba-Geigy Corporation Synthetic polystyrene resin and its use in solid phase peptide synthesis
US4908773A (en) 1987-04-06 1990-03-13 Genex Corporation Computer designed stabilized proteins and method for producing same
US4935875A (en) 1987-12-02 1990-06-19 Data Chem, Inc. Chemical analyzer
US4939666A (en) 1987-09-02 1990-07-03 Genex Corporation Incremental macromolecule construction methods
US5010175A (en) 1988-05-02 1991-04-23 The Regents Of The University Of California General method for producing and selecting peptides with specific properties
US5025388A (en) 1988-08-26 1991-06-18 Cramer Richard D Iii Comparative molecular field analysis (CoMFA)
WO1991019735A1 (en) 1990-06-14 1991-12-26 Bartlett Paul A Libraries of modified peptides with protease resistance
WO1992000091A1 (en) 1990-07-02 1992-01-09 Bioligand, Inc. Random bio-oligomer library, a method of synthesis thereof, and a method of use thereof
US5155801A (en) 1990-10-09 1992-10-13 Hughes Aircraft Company Clustered neural networks
US5167009A (en) 1990-08-03 1992-11-24 E. I. Du Pont De Nemours & Co. (Inc.) On-line process control neural network using data pointers
US5181259A (en) 1990-09-25 1993-01-19 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration General method of pattern classification using the two domain theory
EP0355266B1 (en) 1988-04-29 1993-06-09 Millipore Corporation Apparatus for performing repetitive chemical reactions
US5240680A (en) 1991-12-19 1993-08-31 Chiron Corporation Automated apparatus for use in peptide synthesis
WO1993020242A1 (en) 1992-03-30 1993-10-14 The Scripps Research Institute Encoded combinatorial chemical libraries
US5260882A (en) 1991-01-02 1993-11-09 Rohm And Haas Company Process for the estimation of physical and chemical properties of a proposed polymeric or copolymeric substance or material
EP0355628B1 (en) 1988-08-24 1993-11-10 Siemens Aktiengesellschaft Process for chemically decontaminating the surface of a metallic construction element of a nuclear power plant
US5265030A (en) 1990-04-24 1993-11-23 Scripps Clinic And Research Foundation System and method for determining three-dimensional structures of proteins
US5270170A (en) 1991-10-16 1993-12-14 Affymax Technologies N.V. Peptide library and screening method
US5288514A (en) 1992-09-14 1994-02-22 The Regents Of The University Of California Solid phase and combinatorial synthesis of benzodiazepine compounds on a solid support
US5323471A (en) 1991-09-12 1994-06-21 Atr Auditory And Visual Perception Research Laboratories Pattern recognition apparatus and pattern learning apparatus employing neural net including excitatory element-inhibitory element pair couplings
US5331573A (en) 1990-12-14 1994-07-19 Balaji Vitukudi N Method of design of compounds that mimic conformational features of selected peptides
WO1994028504A1 (en) 1993-05-21 1994-12-08 Arris Pharmaceutical A machine-learning approach to modeling biological activity for molecular design and to modeling other characteristics
WO1995001606A1 (en) 1993-06-30 1995-01-12 Daylight Chemical Information Systems, Inc. Method and apparatus for designing molecules with desired properties by evolving successive populations
US5436850A (en) 1991-07-11 1995-07-25 The Regents Of The University Of California Method to identify protein sequences that fold into a known three-dimensional structure
US5442122A (en) 1992-11-09 1995-08-15 Shimadzu Corporation Dibenzosuberyl and dibenzosuberenyl derivatives
US5463564A (en) 1994-09-16 1995-10-31 3-Dimensional Pharmaceuticals, Inc. System and method of automatically generating chemical compounds with desired properties
US5499193A (en) 1991-04-17 1996-03-12 Takeda Chemical Industries, Ltd. Automated synthesis apparatus and method of controlling the apparatus
US5519635A (en) 1993-09-20 1996-05-21 Hitachi Ltd. Apparatus for chemical analysis with detachable analytical units
US5524065A (en) 1992-02-07 1996-06-04 Canon Kabushiki Kaisha Method and apparatus for pattern recognition
US5549974A (en) 1994-06-23 1996-08-27 Affymax Technologies Nv Methods for the solid phase synthesis of thiazolidinones, metathiazanones, and derivatives thereof
US5553225A (en) 1994-10-25 1996-09-03 International Business Machines Corporation Method and apparatus for combining a zoom function in scroll bar sliders
US5565325A (en) 1992-10-30 1996-10-15 Bristol-Myers Squibb Company Iterative methods for screening peptide libraries
US5585277A (en) 1993-06-21 1996-12-17 Scriptgen Pharmaceuticals, Inc. Screening method for identifying ligands for target proteins
US5602755A (en) 1995-06-23 1997-02-11 Exxon Research And Engineering Company Method for predicting chemical or physical properties of complex mixtures
US5602938A (en) 1994-05-20 1997-02-11 Nippon Telegraph And Telephone Corporation Method of generating dictionary for pattern recognition and pattern recognition method using the same
WO1997009342A1 (en) 1995-09-08 1997-03-13 Scriptgen Pharmaceuticals, Inc. Screen for compounds with affinity for rna
EP0770876A1 (en) 1995-10-25 1997-05-02 Scriptgen Pharmaceuticals, Inc. A screening method for identifying ligands for target proteins
US5634017A (en) 1994-09-22 1997-05-27 International Business Machines Corporation Computer system and method for processing atomic data to calculate and exhibit the properties and structure of matter based on relativistic models
US5635598A (en) 1993-06-21 1997-06-03 Selectide Corporation Selectively cleavable linkers based on iminodiacetic acid esters for solid phase peptide synthesis
WO1997020952A1 (en) 1995-12-07 1997-06-12 Scriptgen Pharmaceuticals, Inc. A fluorescence-based screening method for identifying ligands
WO1997027559A1 (en) 1996-01-26 1997-07-31 Patterson David E Method of creating and searching a molecular virtual library using validated molecular structure descriptors
US5670326A (en) 1994-04-05 1997-09-23 Pharmagenics, Inc. Reiterative method for screening combinatorial libraries
US5679582A (en) 1993-06-21 1997-10-21 Scriptgen Pharmaceuticals, Inc. Screening method for identifying ligands for target proteins
US5699268A (en) * 1995-03-24 1997-12-16 University Of Guelph Computational method for designing chemical structures having common functional characteristics
US5703792A (en) 1993-05-21 1997-12-30 Arris Pharmaceutical Corporation Three dimensional measurement of molecular diversity
EP0818744A2 (en) 1996-07-08 1998-01-14 Proteus Molecular Design Limited Process for selecting candidate drug compounds
US5712171A (en) 1995-01-20 1998-01-27 Arqule, Inc. Method of generating a plurality of chemical compounds in a spatially arranged array
US5712564A (en) 1995-12-29 1998-01-27 Unisys Corporation Magnetic ink recorder calibration apparatus and method
US5740326A (en) 1994-07-28 1998-04-14 International Business Machines Corporation Circuit for searching/sorting data in neural networks
WO1998020459A1 (en) 1996-11-04 1998-05-14 3-Dimensional Pharmaceuticals, Inc. System, method, and computer program product for the visualization and interactive processing and analysis of chemical data
US5789160A (en) 1990-06-11 1998-08-04 Nexstar Pharmaceuticals, Inc. Parallel selex
US5807754A (en) 1995-05-11 1998-09-15 Arqule, Inc. Combinatorial synthesis and high-throughput screening of a Rev-inhibiting arylidenediamide array
US5811241A (en) 1995-09-13 1998-09-22 Cortech, Inc. Method for preparing and identifying N-substituted 1,4-piperazines and N-substituted 1,4-piperazinediones
US5832494A (en) 1993-06-14 1998-11-03 Libertech, Inc. Method and apparatus for indexing, searching and displaying data
US5861532A (en) 1997-03-04 1999-01-19 Chiron Corporation Solid-phase synthesis of N-alkyl amides
US5908960A (en) 1997-05-07 1999-06-01 Smithkline Beecham Corporation Compounds
US5993819A (en) 1987-09-08 1999-11-30 Duke University Synthetic vaccine for protection against human immunodeficiency virus infection
US6014661A (en) 1996-05-06 2000-01-11 Ivee Development Ab System and method for automatic analysis of data bases and for user-controlled dynamic querying
US6037135A (en) 1992-08-07 2000-03-14 Epimmune Inc. Methods for making HLA binding peptides and their uses
US6049797A (en) 1998-04-07 2000-04-11 Lucent Technologies, Inc. Method, apparatus and programmed medium for clustering databases with categorical attributes
US6185506B1 (en) 1996-01-26 2001-02-06 Tripos, Inc. Method for selecting an optimally diverse library of small molecules based on validated molecular structural descriptors

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US543796A (en) * 1895-07-30 Apparatus for separating dust
US4939668A (en) * 1987-08-24 1990-07-03 International Business Machines Corp. System for designing intercommunications networks
US5095443A (en) * 1988-10-07 1992-03-10 Ricoh Company, Ltd. Plural neural network system having a successive approximation learning method
EP0471857B1 (en) * 1990-03-12 2000-02-02 Fujitsu Limited Neuro-fuzzy integrated data processing system; network structure conversion system ; fuzzy model extracting system
JPH0744514A (en) * 1993-07-27 1995-02-14 Matsushita Electric Ind Co Ltd Learning data contracting method for neural network
US5598510A (en) * 1993-10-18 1997-01-28 Loma Linda University Medical Center Self organizing adaptive replicate (SOAR)
US5712565A (en) * 1994-06-22 1998-01-27 Seagate Technology, Inc. MR sensor having thick active region between two thinner inactive MR regions topped with respective permanent magnets
US5734796A (en) * 1995-09-29 1998-03-31 Ai Ware, Inc. Self-organization of pattern data with dimension reduction through learning of non-linear variance-constrained mapping
US6026397A (en) * 1996-05-22 2000-02-15 Electronic Data Systems Corporation Data analysis system and method
US6571227B1 (en) * 1996-11-04 2003-05-27 3-Dimensional Pharmaceuticals, Inc. Method, system and computer program product for non-linear mapping of multi-dimensional data
US5933819C1 (en) 1997-05-23 2001-11-13 Scripps Research Inst Prediction of relative binding motifs of biologically active peptides and peptide mimetics

Patent Citations (77)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4811217A (en) 1985-03-29 1989-03-07 Japan Association For International Chemical Information Method of storing and searching chemical structure data
US4773099A (en) 1985-10-10 1988-09-20 The Palantir Corporation Pattern classification means for use in a pattern recognition system
US4859736A (en) 1987-03-30 1989-08-22 Ciba-Geigy Corporation Synthetic polystyrene resin and its use in solid phase peptide synthesis
US4908773A (en) 1987-04-06 1990-03-13 Genex Corporation Computer designed stabilized proteins and method for producing same
US4939666A (en) 1987-09-02 1990-07-03 Genex Corporation Incremental macromolecule construction methods
US5993819A (en) 1987-09-08 1999-11-30 Duke University Synthetic vaccine for protection against human immunodeficiency virus infection
US4935875A (en) 1987-12-02 1990-06-19 Data Chem, Inc. Chemical analyzer
EP0355266B1 (en) 1988-04-29 1993-06-09 Millipore Corporation Apparatus for performing repetitive chemical reactions
US5010175A (en) 1988-05-02 1991-04-23 The Regents Of The University Of California General method for producing and selecting peptides with specific properties
EP0355628B1 (en) 1988-08-24 1993-11-10 Siemens Aktiengesellschaft Process for chemically decontaminating the surface of a metallic construction element of a nuclear power plant
US5025388A (en) 1988-08-26 1991-06-18 Cramer Richard D Iii Comparative molecular field analysis (CoMFA)
US5307287A (en) 1988-08-26 1994-04-26 Tripos Associates, Inc. Comparative molecular field analysis (COMFA)
US5265030A (en) 1990-04-24 1993-11-23 Scripps Clinic And Research Foundation System and method for determining three-dimensional structures of proteins
US5789160A (en) 1990-06-11 1998-08-04 Nexstar Pharmaceuticals, Inc. Parallel selex
WO1991019735A1 (en) 1990-06-14 1991-12-26 Bartlett Paul A Libraries of modified peptides with protease resistance
WO1992000091A1 (en) 1990-07-02 1992-01-09 Bioligand, Inc. Random bio-oligomer library, a method of synthesis thereof, and a method of use thereof
US5167009A (en) 1990-08-03 1992-11-24 E. I. Du Pont De Nemours & Co. (Inc.) On-line process control neural network using data pointers
US5181259A (en) 1990-09-25 1993-01-19 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration General method of pattern classification using the two domain theory
US5155801A (en) 1990-10-09 1992-10-13 Hughes Aircraft Company Clustered neural networks
US5331573A (en) 1990-12-14 1994-07-19 Balaji Vitukudi N Method of design of compounds that mimic conformational features of selected peptides
US5612895A (en) 1990-12-14 1997-03-18 Balaji; Vitukudi N. Method of rational drug design based on ab initio computer simulation of conformational features of peptides
US5260882A (en) 1991-01-02 1993-11-09 Rohm And Haas Company Process for the estimation of physical and chemical properties of a proposed polymeric or copolymeric substance or material
US5499193A (en) 1991-04-17 1996-03-12 Takeda Chemical Industries, Ltd. Automated synthesis apparatus and method of controlling the apparatus
US5436850A (en) 1991-07-11 1995-07-25 The Regents Of The University Of California Method to identify protein sequences that fold into a known three-dimensional structure
US5323471A (en) 1991-09-12 1994-06-21 Atr Auditory And Visual Perception Research Laboratories Pattern recognition apparatus and pattern learning apparatus employing neural net including excitatory element-inhibitory element pair couplings
US5270170A (en) 1991-10-16 1993-12-14 Affymax Technologies N.V. Peptide library and screening method
US5240680A (en) 1991-12-19 1993-08-31 Chiron Corporation Automated apparatus for use in peptide synthesis
US5524065A (en) 1992-02-07 1996-06-04 Canon Kabushiki Kaisha Method and apparatus for pattern recognition
WO1993020242A1 (en) 1992-03-30 1993-10-14 The Scripps Research Institute Encoded combinatorial chemical libraries
US6037135A (en) 1992-08-07 2000-03-14 Epimmune Inc. Methods for making HLA binding peptides and their uses
US5288514A (en) 1992-09-14 1994-02-22 The Regents Of The University Of California Solid phase and combinatorial synthesis of benzodiazepine compounds on a solid support
US5545568A (en) 1992-09-14 1996-08-13 The Regents Of The University Of California Solid phase and combinatorial synthesis of compounds on a solid support
US5565325A (en) 1992-10-30 1996-10-15 Bristol-Myers Squibb Company Iterative methods for screening peptide libraries
US5442122A (en) 1992-11-09 1995-08-15 Shimadzu Corporation Dibenzosuberyl and dibenzosuberenyl derivatives
WO1994028504A1 (en) 1993-05-21 1994-12-08 Arris Pharmaceutical A machine-learning approach to modeling biological activity for molecular design and to modeling other characteristics
US5526281A (en) 1993-05-21 1996-06-11 Arris Pharmaceutical Corporation Machine-learning approach to modeling biological activity for molecular design and to modeling other characteristics
US5703792A (en) 1993-05-21 1997-12-30 Arris Pharmaceutical Corporation Three dimensional measurement of molecular diversity
US5832494A (en) 1993-06-14 1998-11-03 Libertech, Inc. Method and apparatus for indexing, searching and displaying data
US5585277A (en) 1993-06-21 1996-12-17 Scriptgen Pharmaceuticals, Inc. Screening method for identifying ligands for target proteins
US5635598A (en) 1993-06-21 1997-06-03 Selectide Corporation Selectively cleavable linkers based on iminodiacetic acid esters for solid phase peptide synthesis
US5679582A (en) 1993-06-21 1997-10-21 Scriptgen Pharmaceuticals, Inc. Screening method for identifying ligands for target proteins
WO1995001606A1 (en) 1993-06-30 1995-01-12 Daylight Chemical Information Systems, Inc. Method and apparatus for designing molecules with desired properties by evolving successive populations
US5434796A (en) 1993-06-30 1995-07-18 Daylight Chemical Information Systems, Inc. Method and apparatus for designing molecules with desired properties by evolving successive populations
US5519635A (en) 1993-09-20 1996-05-21 Hitachi Ltd. Apparatus for chemical analysis with detachable analytical units
US5670326A (en) 1994-04-05 1997-09-23 Pharmagenics, Inc. Reiterative method for screening combinatorial libraries
US5866334A (en) 1994-04-05 1999-02-02 Genzyme Corporation Determination and identification of active compounds in a compound library
US5602938A (en) 1994-05-20 1997-02-11 Nippon Telegraph And Telephone Corporation Method of generating dictionary for pattern recognition and pattern recognition method using the same
US5549974A (en) 1994-06-23 1996-08-27 Affymax Technologies Nv Methods for the solid phase synthesis of thiazolidinones, metathiazanones, and derivatives thereof
US5740326A (en) 1994-07-28 1998-04-14 International Business Machines Corporation Circuit for searching/sorting data in neural networks
US5463564A (en) 1994-09-16 1995-10-31 3-Dimensional Pharmaceuticals, Inc. System and method of automatically generating chemical compounds with desired properties
US5574656A (en) 1994-09-16 1996-11-12 3-Dimensional Pharmaceuticals, Inc. System and method of automatically generating chemical compounds with desired properties
US5684711A (en) 1994-09-16 1997-11-04 3-Dimensional Pharmaceuticals, Inc. System, method, and computer program for at least partially automatically generating chemical compounds having desired properties
US5901069A (en) 1994-09-16 1999-05-04 3-Dimensional Pharmaceuticals, Inc. System, method, and computer program product for at least partially automatically generating chemical compounds with desired properties from a list of potential chemical compounds to synthesize
US5858660A (en) 1994-09-20 1999-01-12 Nexstar Pharmaceuticals, Inc. Parallel selex
US5634017A (en) 1994-09-22 1997-05-27 International Business Machines Corporation Computer system and method for processing atomic data to calculate and exhibit the properties and structure of matter based on relativistic models
US5553225A (en) 1994-10-25 1996-09-03 International Business Machines Corporation Method and apparatus for combining a zoom function in scroll bar sliders
US5712171A (en) 1995-01-20 1998-01-27 Arqule, Inc. Method of generating a plurality of chemical compounds in a spatially arranged array
US5736412A (en) 1995-01-20 1998-04-07 Arqule, Inc. Method of generating a plurality of chemical compounds in a spatially arranged array
US5699268A (en) * 1995-03-24 1997-12-16 University Of Guelph Computational method for designing chemical structures having common functional characteristics
US5807754A (en) 1995-05-11 1998-09-15 Arqule, Inc. Combinatorial synthesis and high-throughput screening of a Rev-inhibiting arylidenediamide array
US5602755A (en) 1995-06-23 1997-02-11 Exxon Research And Engineering Company Method for predicting chemical or physical properties of complex mixtures
WO1997009342A1 (en) 1995-09-08 1997-03-13 Scriptgen Pharmaceuticals, Inc. Screen for compounds with affinity for rna
US5811241A (en) 1995-09-13 1998-09-22 Cortech, Inc. Method for preparing and identifying N-substituted 1,4-piperazines and N-substituted 1,4-piperazinediones
EP0770876A1 (en) 1995-10-25 1997-05-02 Scriptgen Pharmaceuticals, Inc. A screening method for identifying ligands for target proteins
WO1997020952A1 (en) 1995-12-07 1997-06-12 Scriptgen Pharmaceuticals, Inc. A fluorescence-based screening method for identifying ligands
US5712564A (en) 1995-12-29 1998-01-27 Unisys Corporation Magnetic ink recorder calibration apparatus and method
WO1997027559A1 (en) 1996-01-26 1997-07-31 Patterson David E Method of creating and searching a molecular virtual library using validated molecular structure descriptors
US6185506B1 (en) 1996-01-26 2001-02-06 Tripos, Inc. Method for selecting an optimally diverse library of small molecules based on validated molecular structural descriptors
US6014661A (en) 1996-05-06 2000-01-11 Ivee Development Ab System and method for automatic analysis of data bases and for user-controlled dynamic querying
EP0818744A2 (en) 1996-07-08 1998-01-14 Proteus Molecular Design Limited Process for selecting candidate drug compounds
WO1998020437A2 (en) 1996-11-04 1998-05-14 3-Dimensional Pharmaceuticals, Inc. System, method and computer program product for identifying chemical compounds having desired properties
WO1998020459A1 (en) 1996-11-04 1998-05-14 3-Dimensional Pharmaceuticals, Inc. System, method, and computer program product for the visualization and interactive processing and analysis of chemical data
US6295514B1 (en) * 1996-11-04 2001-09-25 3-Dimensional Pharmaceuticals, Inc. Method, system, and computer program product for representing similarity/dissimilarity between chemical compounds
US6421612B1 (en) * 1996-11-04 2002-07-16 3-Dimensional Pharmaceuticals Inc. System, method and computer program product for identifying chemical compounds having desired properties
US5861532A (en) 1997-03-04 1999-01-19 Chiron Corporation Solid-phase synthesis of N-alkyl amides
US5908960A (en) 1997-05-07 1999-06-01 Smithkline Beecham Corporation Compounds
US6049797A (en) 1998-04-07 2000-04-11 Lucent Technologies, Inc. Method, apparatus and programmed medium for clustering databases with categorical attributes

Non-Patent Citations (102)

* Cited by examiner, † Cited by third party
Title
Agrafiotis, D., "Theoretical Aspects of the Complex: Arts and New Technologies" Applications and Impacts Information Processing '94, North-Holland, vol. II, 1994, pp. 714-719.
Agrafiotis, D.K. and Jaeger, E.P., "Stochastic Algorithms for Exploring Molecular Diversity," Abstracts of Papers Part 1: 213th ACS National Meeting, Apr. 13-17, 1997, p. 16-CINF.
Agrafiotis, D.K. and Jaeger, E.P., "Directed Diversity(R): An Operating System For Combinatorial Chemistry," Abstracts of Papers Part 1: 211th ACS National Meeting, Mar. 24-28, 1996, p. 46-COMP.
Agrafiotis, D.K. and Lobanov, V.S., "An Efficient Implementation of Distance-Based Diversity Measures Based on k-d Trees," J. Chem. Inf. Comput. Sci., American Chemical Society, vol. 39, No. 1, Jan./Feb. 1999, pp. 51-58.
Agrafiotis, D.K. and Lobanov, V.S., "Bridging The Gap Between Diversity And QSAR," Abstracts of Papers Part 1: 215th ACS National Meeting, Mar. 29-Apr. 2, 1998, p. 181-COMP.
Agrafiotis, D.K. et al., "Advances in diversity profiling and combinatorial series design," Molecular Diversity, Kluwer Academic Publishers, vol. 4, 1999, pp. 1-22.
Agrafiotis, D.K., "A New Method For Analyzing Protein Sequence Relationships Based on Sammon Maps," Protein Science, Cambridge University Press, vol. 6, No. 2, Feb. 1997, pp. 287-293.
Agrafiotis, D.K., "Diversity of Chemical Libraries," Encyclopedia of Computational Chemistry, John Wiley & Sons Ltd, vol. 1:A-D, 1998, pp. 742-761.
Agrafiotis, D.K., "On the Use of Information Theory for Assessing Molecular Diversity," J. Chem. Inf. Comput. Sci., American Chemical Society, vol. 37, No. 3, May/Jun. 1997, pp. 576-580.
Agrafiotis, D.K., et al., "Parallel QSAR," Abstracts of Papers Part 1: 217th ACS National Meeting, Mar. 21-25, 1999, p. 50-COMP.
Agrafiotis, D.K., et al., "PRODEN: A New Program for Calculating Integrated Projected Populations," Journal of Computational Chemistry, John Wiley & Sons, Inc., vol. 11, No. 9, Oct. 1990, pp. 1101-1110.
Agrafiotis, Dimitris K. and Lobanov, Victor S., "Ultrafast Algorithm for Designing Focused Combinatorial Arrays," J. Chem. Inf. Comput. Sci., American Chemical Society, 2000, vol. 40, No. 4, pp. 1030-1038.
Ajay et al., "Can We Learn To Distinguish between 'Drug-Like' and 'Nondrug-like' Molecules?" J. Med. Chem., 1998, American Chemical Society, vol. 41, No. 18, pp. 3314-3324.
Amzel, L.M., "Structure-based drug design," Current Opinion in Biotechnology, vol. 9, No. 4, Aug. 1998, pp. 366-369.
Barnard, John M. and Downs, Geoff M., "Computer representation and manipulation of combinatorial libraries," Perspectives in Drug Discovery and Design, Kluwer Academic Publishers, 1997, pp. 13-30.
Bellman, R.E., Adaptive Control Processes: A Guided Tour, Princeton Univ. Press, Princeton, NJ (1961).
Bezdek, J.C., Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, NY (1981).
Biswas, G. et al., "Evaluation of Projection Algorithms," IEEE Transactions On Pattern Analysis And Machine Intelligence, IEEE Computer Society, vol. PAMI-3, No. 6, Nov. 1981, pp. 701-708.
Blaney, J.M. and Martin, E.J., "Computational approaches for combinatorial library design and molecular diversity analysis," Current Opinion in Chemical Biology, Current Biology Ltd., vol. 1, No. 1, Jun. 1997, pp. 54-59.
Bonchev, D. and Trinajstic, N., "Information theory, distance matrix, and molecular branching," The Journal of Chemical Physics, American Institute of Physics, vol. 67, No. 10, Nov. 15, 1977, pp. 4517-4533.
Borg, Ingwer and Groenen, Patrick, Modern Multidimensional Scaling: Theory and Applications, Springer Series in Statistics, 1997, entire book submitted.
Brint, Andrew T. and Willett, Peter, "Upperbound procedures for the identification of similar three-dimensional chemical structures," Journal of Computer-Aided Molecular Design, ESCOM Science Publishers B.V., vol. 2, No. 4, Jan. 1989, pp. 311-320.
Brown, R.D. and Clark, D.E., "Genetic diversity: applications of evolutionary algorithms to combinatorial library design," Expert Opinion on Therapeutic Patents, vol. 8, No. 11, Nov. 1998, pp. 1447-1459.
Brown, Robert D. and Martin, Yvonne C., "Designing Combinatorial Library Mixtures Using a Genetic Algorithm," J. Med. Chem., American Chemical Society, 1997, vol. 40, No. 15, pp. 2304-2313.
Brown, Robert D. and Martin, Yvonne C., "Designing Combinatorial Library Mixtures Using a Genetic Algorithm," Journal of Medicinal Chemistry, American Chemical Society, vol. 40, No. 15, 1997, pp. 2304-2313.
Brown, Robert D. and Martin, Yvonne C., "The Information Content of 2D and 3D Structural Descriptors Relevant to Ligand-Receptor Binding," J. Chem. Info. Comput. Sci., American Chemical Society, 1997, vol. 37, No. 1, pp. 1-9.
Brown, Robert D. and Martin, Yvonne C., "Use of Structure-Activity Data To Compare Structure-Based Clustering Methods and Descriptors for Use in Compound Selection," J. Chem. Inf. Sci., American Chemical Society, 1996, vol. 36, No. 3, pp. 572-584.
Caflisch, A. and Karplus, M., "Computational combinatorial chemistry for de novo ligand design: Review and assessment," Perspectives in Drug Discovery and Design, ESCOM Science Publishers B.V., vol. 3, 1995, pp. 51-84.
Chang, C.L. and Lee, R.C.T., "A Heuristic Relaxation Method for Nonlinear Mapping in Cluster Analysis," IEEE Transactions on Systems, Man, and Cybernetics, IEEE Systems, Man, and Cybernetics Society, vol. SMC-3, Mar. 1973, pp. 197-200.
Copy of International Search Report issued Oct. 18, 1999, for Appl. No. PCT/US99/09963, 7 pages.
Cramer, R.D. et al., "Virtual Compound Libraries: A New Approach to Decision Making in Molecular Discovery Research," J. Chem. Inf. Comput. Sci., American Chemical Society, vol. 38, No. 6, Nov./Dec. 1998, pp. 1010-1023.
Cummins, David J. et al., "Molecular Diversity in Chemical Databases: Comparison of Medicinal Chemistry Knowledge Bases and Databases of Commercially Available Compounds," J. Chem. Info. Comput. Sci., American Chemical Society, 1996, vol. 36, No. 4, pp. 750-763.
Danheiser, S.L., "Current Trends in Synthetic Peptide and Chemical Diversity Library Design," Genetic Engineering News, May 1, 1994, pp. 10 and 31.
Daylight Theory: Fingerprints (visited Sep. 26, 2000) <http://www.daylight.com/dayhtml/doc/theory/theory.finger.html>, 9 pages.
Daylight Theory: SMARTS (visited Sep. 26, 2000) <http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html>, 10 pages.
DeMers, D. and Cottrell, G., "Non-Linear Dimensionality Reduction," Advances in Neural Information Processing Systems, vol. 5, 1993, pp. 580-587.
Downs, Geoff M. and Barnard, John M., "Techniques for Generating Descriptive Fingerprints in Combinatorial Libraries," J. Chem. Inf. Comput. Sci., American Chemical Society, 1997, vol. 37, No. 1, pp. 59-61.
Eicher, U. et al., "Addressing the problem of molecular diversity," Drugs of the Future, Prous Science, vol. 24, No. 2, 1999, pp. 177-190.
Felder, E.R. and Poppinger, D., "Combinatorial Compound Libraries for Enhanced Drug Discovery Approaches," Advances in Drug Research, Academic Press, vol. 30, 1997, pp. 112-199.
Frey, P.W. and Slate, D.J., "Letter Recognition Using Holland-Style Adaptive Classifiers," Machine Learning, Kluwer Academic Publishers, vol. 6, 1991, pp. 161-182.
Friedman, J.H. and Tukey, J.W., "A Projection Pursuit Algorithm for Exploratory Data Analysis," IEEE Transactions on Computers, IEEE Computer Society, vol. C-23, No. 9, Sep. 1974, pp. 881-889.
Friedman, J.H., "Exploratory Projection Pursuit," Journal of the American Statistical Association, American Statistical Association, vol. 82, No. 397, Mar. 1987, pp. 249-266.
Garrido, L. et al., "Use of Multilayer Feedforward Neural Nets As A Display Method for Multidimensional Distributions," International Journal of Neural Systems, World Scientific Publishing Co. Pte. Ltd., vol. 6, No. 3, Sep. 1995, pp. 273-282.
Gasteiger, J. et al., "Assessment of the Diversity of Combinatorial Libraries by an Encoding of Molecular Surface Properties," Abstracts of Papers Part 1: 211th ACS National Meeting, Mar. 24-28, 1996, p. 70-CINF.
Geysen, H.M. and Mason, T.J., "Screening Chemically Synthesized Peptide Libraries for Biologically-Relevant Molecules," Bioorganic & Medicinal Chemistry Letters, Pergamon Press Ltd., vol. 3, No. 3, 1993, pp. 397-404.
Ghose, A.K. et al., "Prediction of Hydrophobic (Lipophilic) Properties of Small Organic Molecules Using Fragmental Methods: An Analysis of ALOGP and CLOGP Methods," J. Phys. Chem. A, American Chemical Society, vol. 102, No. 21, May 21, 1998, pp. 3762-3772.
Gillet, Valerie J. et al., "Selecting Combinatorial Libraries to Optimize Diversity and Physical Properties," Journal of Chemical Information Computer Sciences, American Chemical Society, vol. 39, No. 1, 1999, pp. 169-177.
Gillet, Valerie J. et al., "The Effectiveness of Reactant Pools for Generating Structurally-Diverse Combinatorial Libraries," Journal of Chemical Information Computer Sciences, American Chemical Society, vol. 37, No. 4, 1997, pp. 731-740.
Gillet, Valerie J., "Background Theory of Molecular Diversity," Molecular Diversity in Drug Design, Kluwer Academic Publishers, 1999, pp. 43-65.
Gobbi, A. et al., "New Leads By Selective Screening of Compounds From Large Databases," Abstracts of Papers Part 1: 213th ACS National Meeting, Apr. 13-17, 1997, p. 64-CINF.
Good, Andrew C. and Lewis, Richard A., "New Methodology for Profiling Combinatorial Libraries and Screening Sets: Cleaning Up the Design Process with HARPick," J. Med. Chem., American Chemical Society, 1997, vol. 40, No. 24, pp. 3926-3936.
Gorse, Dominique and Lahana, Roger, "Functional diversity of compound libraries," Current Opinion in Chemical Biology, Elsevier Science Ltd., Jun. 2000, vol. 4, No. 3, pp. 287-294.
Hall, L.H. and Kier, L.B., "The Molecular Connectivity Chi Indexes and Kappa Shape Indexes in Structure-Property Modeling," Reviews in Computational Chemistry: Advances, VCH Publishers, Inc., 1991, pp. 367-422.
Hassan, Moises et al., "Optimization and visualization of molecular diversity of combinatorial libraries," Molecular Diversity, ESCOM Science Publishers B.V., 1996, vol. 2, pp. 64-74.
Hecht-Nielsen, R., "Replicator Neural Networks for Universal Optimal Source Coding," Science, American Association for the Advancement of Science, vol. 269, Sep. 29, 1995, pp. 1860-1863.
Hotelling, H., "Analysis of a Complex of Statistical Variables into Principal Components," The Journal of Educational Psychology, Warwick and York, Inc., vol. XXIV, No. 7, Oct. 1933, pp. 498-520.
Hotelling, H., "Analysis of a Complex of Statistical Variables into Principal Components," The Journal of Educational Psychology, Warwick and York, Inc., vol. XXIV, No. 6, Sep. 1933, pp. 417-441.
Houghten, R.A. et al., "The Use of Synthetic Peptide Combinatorial Libraries for the Identification of Bioactive Peptides," Peptide Research, vol. 5, No. 6, 1992, pp. 351-358.
Jamois, Eric A. et al., "Evaluation of Reagent-Based and Product-Based Strategies in the Design of Combinatorial Library Subsets," J. Chem. Inf. Comput. Sci., American Chemical Society, 2000, vol. 40, No. 1, pp. 63-70.
Johnson, M.A., and Maggiora, G.M., Concepts and Applications of Molecular Similarity, John Wiley and Sons, New York, NY (1990).
Kearsley, Simon K. et al., "Chemical Similarity Using Physiochemical Property Descriptors," Journal of Chemical Information Computer Science, American Chemical Society, vol. 36, No. 1, 1996, pp. 118-127.
Klopman, G., "Artificial Intelligence Approach to Structure-Activity Studies. Computer Automated Structure Evaluation of Biological Activity of Organic Molecules," J. Am. Chem. Soc., American Chemical Society, vol. 106, No. 24, Nov. 28, 1984, pp. 7315-7321.
Kohonen, T., Self-Organizing Maps, Springer-Verlag, Heidelberg, Germany (1995).
Lajiness, M.S. et al., "Implementing Drug Screening Programs Using Molecular Similarity Methods," QSAR: Quantitative Structure-Activity Relationships in Drug Design, Alan R. Liss, Inc., 1989, pp. 173-176.
Leach, Andrew R. and Hann, Michael M., "The in silico world of virtual libraries," Drug Discovery Today, Elsevier Science Ltd., Aug. 2000, vol. 5, No. 8, pp. 326-336.
Leach, Andrew R. et al., "Implementation of a System for Reagent Selection and Library Enumeration, Profiling, and Design," J. Chem. Inf. Comput. Sci., American Chemical Society, 1999, vol. 39, No. 6, pp. 1161-1172.
Lee, R.C.T. et al., "A Triangulation Method for the Sequential Mapping of Points from N-Space to Two-Space," IEEE Transactions on Computers, The Institute of Electrical and Electronics Engineers, Mar. 1977, pp. 288-292.
Leland, Burton A. et al., "Managing the Combinatorial Explosion," Journal of Chemical Information Computer Science, American Chemical Society, vol. 37, No. 1, 1997, pp. 62-70.
Lewis, Richard A. et al., "Similarity Measures for Rational Set Selection and Analysis of Combinatorial Libraries: The Diverse Property-Derived (DPD) Approach," Journal of Chemical Information Computer Science, American Chemical Society, vol. 37, No. 3, 1997, pp. 599-614.
Lipinski, C.A. et al., "Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings," Advanced Drug Delivery Reviews, Elsevier Science B.V., vol. 23, 1997, pp. 3-25.
Lobanov, V.S. and Agrafiotis, D.K., "Intelligent Database Mining Techniques," Abstracts of Papers Part 1: 215th ACS National Meeting, Mar. 29-Apr. 2, 1998, p. 19-COMP.
Lobanov, V.S. et al., "Rational Selections from Virtual Libraries," Abstracts of Papers Part 1: 217th ACS National Meeting, Mar. 21-25, 1999, p. 181-COMP.
Lobanov, Victor S. and Agrafiotis, Dimitris K., "Stochastic Similarity Selections from Large Combinatorial Libraries," J. Chem. Inf. Comput. Sci., American Chemical Society, Mar./Apr. 2000, vol. 40, No. 2, pp. 460-470.
Loew, G.H. et al., "Strategies for Indirect Computer-Aided Drug Design," Pharmaceutical Research, Plenum Publishing Corporation, vol. 10, No. 4, 1993, pp. 475-486.
Lynch, M.F. et al., "Generic Structure Storage and Retrieval," J. Chem. Inf. Comput. Sci., American Chemical Society, vol. 25, No. 3, Aug. 1985, pp. 264-270.
Mao, J. and Jain, A.K., "Artificial Neural Networks for Feature Extraction and Multivariate Data Projection," IEEE Transactions on Neural Networks, IEEE Neural Networks, vol. 6, No. 2, Mar. 1995, pp. 296-317.
Martin, Eric J. and Critchlow, Roger E., "Beyond Mere Diversity: Tailoring Combinatorial Libraries for Drug Discovery," Journal of Combinatorial Chemistry, American Chemical Society, vol. 1, No. 1, 1999, pp. 32-45.
Matter, Hans and Pötter, Thorsten, "Comparing 3D Pharmacophore Triplets and 2D Fingerprints for Selecting Diverse Compound Subsets," J. Chem. Inf. Comput. Sci., American Chemical Society, 1999, vol. 39, No. 6, pp. 1211-1225.
Matter, Hans, "Selecting Optimally Diverse Compounds from Structure Databases: A Validation Study of Two-Dimensional and Three-Dimensional Molecular Descriptors," J. Med. Chem., American Chemical Society, 1997, vol. 40, No. 8, pp. 1219-1229.
Myers, P.L. et al., "Rapid, Reliable Drug Discovery," Today's Chemist At Work, American Chemical Society, vol. 6, No. 7, Jul./Aug. 1997, pp. 46-48, 51 & 53.
Oja, E., "Principal Components, Minor Components, and Linear Neural Networks," Neural Networks, Pergamon Press Ltd., vol. 5, 1992, pp. 927-935.
Oja, E., Subspace Methods of Pattern Recognition, Research Studies Press, Letchworth, England (1983).
Pabo, C.O. and Suchanek, E.G., "Computer-Aided Model-Building Strategies for Protein Design," Biochemistry, American Chemical Society, vol. 25, No. 20, 1986, pp. 5987-5991.
Patterson, D.E. et al., "Neighborhood Behavior: A Useful Concept for Validation of 'Molecular Diversity' Descriptors," Journal of Medicinal Chemistry, American Chemical Society, vol. 39, No. 16, 1996, pp. 3049-3059.
Pykett, C.E., "Improving the Efficiency of Sammon's Nonlinear Mapping by Using Clustering Archetypes," Electronics Letters, The Institute of Electrical Engineers, vol. 14, No. 25, Dec. 7, 1978, pp. 799-800.
Rooks, "A Unified Framework for Visual Interactive Simulation," ACM Proceedings of the 1991 Winter Simulation Conference, pp. 1146-1155 (1991). *
Rubner, J. and Tavan, P., "A Self-Organizing Network for Principal-Component Analysis," Europhysics Letters, European Physical Society, vol. 10, No. 7, Dec. 1, 1989, pp. 693-698.
Sadowski et al., "Assessing Similarity and Diversity of Combinatorial Libraries by Spatial Autocorrelation Functions and Neural Networks," Angew. Chem. Int. Ed. Engl., vol. 34, Issue 23/24 (1995). *
Sadowski, J. et al., "Assessing Similarity and Diversity of Combinatorial Libraries by Spatial Autocorrelation Functions and Neural Networks," Angewandte Chemie, VCH, vol. 34, No. 23/24, Jan. 5, 1996, pp. 2674-2677.
Sadowski, Jens and Kubinyi, Hugo, "A Scoring Scheme for Discriminating between Drugs and Nondrugs," J. Med. Chem., American Chemical Society, 1998, vol. 41, No. 18, pp. 3325-3329.
Saudek, V. et al., "Solution Conformation of Endothelin-1 by H NMR, CD, and Molecular Modeling," International Journal of Peptide Protein Res., Munksgaard International Publishers Ltd., vol. 37, No. 3, 1991, pp. 174-179.
Schnur, Dora, "Design and Diversity Analysis of Large Combinatorial Libraries Using Cell-Based Methods," J. Chem. Inf. Comput. Sci., American Chemical Society, 1999, vol. 39, No. 1, pp. 36-45.
Schuffenhauer, Ansgar et al., "Similarity Searching in Files of Three-Dimensional Chemical Structures: Analysis of the BIOSTER Database Using Two-Dimensional Fingerprints and Molecular Field Descriptors," J. Chem. Inf. Comput. Sci., American Chemical Society, 2000, vol. 40, No. 2, pp. 295-307.
Sheridan, Robert P. et al., "Chemical Similarity Using Geometric Atom Pair Descriptors," Journal of Chemical Information Computer Science, American Chemical Society, vol. 36, No. 1, 1996, pp. 128-136.
Singh, J. et al., "Application of Genetic Algorithms to Combinatorial Synthesis: A Computational Approach to Lead Identification and Lead Optimization," J. Am. Chem. Soc., American Chemical Society, vol. 118, No. 5, Feb. 7, 1996, pp. 1669-1676.
Thompson, L.A. and Ellman, J.A., "Synthesis and Applications of Small Molecule Libraries," Chemical Reviews, American Chemical Society, vol. 96, No. 1, Jan./Feb. 1996, pp. 555-600.
Turner, David B. et al., "Rapid Quantification of Molecular Diversity for Selective Database Acquisition," J. Chem. Inf. Sci., American Chemical Society, 1997, vol. 37, No. 1, pp. 18-22.
Van Drie, J.H. and Lajiness, M.S., "Approaches to virtual library design," Drug Discovery Today, Elsevier Science Ltd., vol. 3, No. 6, Jun. 1998, pp. 274-283.
Wagener et al., "Autocorrelation of Molecular Surface Properties for Modeling Corticosteroid Binding Globulin and Cytosolic Ah Receptor Activity by Neural Networks," J. Am. Chem. Soc., vol. 117, pp. 7769-7775 (1995). *
Walters, W.P. et al., "Virtual screening-an overview," Drug Discovery Today, Elsevier Science Ltd., vol. 3, No. 4, Apr. 1998, pp. 160-178.
Wang, Jing and Ramnarayan, Kal, "Toward Designing Drug-Like Libraries: A Novel Computational Approach for Prediction of Drug Feasibility of Compounds," J. Comb. Chem., American Chemical Society, Nov./Dec. 1999, vol. 1, No. 6, pp. 524-533.
Willett, Peter et al., "Chemical Similarity Searching," Journal of Chemical Information Computer Science, American Chemical Society, vol. 38, No. 6, 1998, pp. 983-996.

Cited By (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070185657A1 (en) * 1998-10-19 2007-08-09 Symyx Technologies, Inc. Graphic design of combinatorial material libraries
US7653607B2 (en) 2000-03-24 2010-01-26 Symyx Solutions, Inc. Remote execution of materials library designs
US20070143240A1 (en) * 2000-03-24 2007-06-21 Symyx Technologies, Inc. Remote execution of materials library designs
US20070214101A1 (en) * 2000-12-15 2007-09-13 Symyx Technologies, Inc. Methods and apparatus for preparing high-dimensional combinatorial experiments
US7882053B2 (en) 2000-12-15 2011-02-01 Freeslate, Inc. Methods and apparatus for preparing high-dimensional combinatorial experiments
US7676499B2 (en) 2001-01-05 2010-03-09 Symyx Solutions, Inc. Management of data from combinatorial materials experiments
US20060277201A1 (en) * 2001-01-05 2006-12-07 Symyx Technologies, Inc. Laboratory database system and method for combinatorial materials research
US7724257B2 (en) * 2001-01-29 2010-05-25 Symyx Solutions, Inc. Systems, methods and computer program products for determining parameters for chemical synthesis
US20080015837A1 (en) * 2001-01-29 2008-01-17 Symyx Technologies, Inc. Systems, Methods and Computer Program Products for Determining Parameters for Chemical Synthesis
US7480535B2 (en) * 2002-09-30 2009-01-20 Gerhard Kranner Method for the computer-supported generation of prognoses for operative systems and a system for the generation of prognoses for operative systems
US20060009864A1 (en) * 2002-09-30 2006-01-12 Gerhard Kranner Method for the computer-supported generation of prognoses for operative systems and a system for the generation of prognoses for operative systems
US20070203951A1 (en) * 2003-01-24 2007-08-30 Symyx Technologies, Inc. User-configurable generic experiment class for combinatorial material research
US7908285B2 (en) 2003-01-24 2011-03-15 Symyx Solutions, Inc. User-configurable generic experiment class for combinatorial material research
US20050114331A1 (en) * 2003-11-26 2005-05-26 International Business Machines Corporation Near-neighbor search in pattern distance spaces
US20050130229A1 (en) * 2003-12-16 2005-06-16 Symyx Technologies, Inc. Indexing scheme for formulation workflows
US7912845B2 (en) 2004-06-01 2011-03-22 Symyx Software, Inc. Methods and systems for data integration
US20100076992A1 (en) * 2004-06-01 2010-03-25 Symyx Software, Inc. Methods and Systems for Data Integration
US20060064674A1 (en) * 2004-06-03 2006-03-23 Olson John B Jr Methods and apparatus for visual application design
US7818666B2 (en) 2005-01-27 2010-10-19 Symyx Solutions, Inc. Parsing, evaluating leaf, and branch nodes, and navigating the nodes based on the evaluation
US20060168515A1 (en) * 2005-01-27 2006-07-27 Symyx Technologies, Inc. Parser for generating structured data
US20070050092A1 (en) * 2005-08-12 2007-03-01 Symyx Technologies, Inc. Event-based library process design
US20090292517A1 (en) * 2007-02-07 2009-11-26 Fujitsu Limited Molecular design method and computer-readable storage medium
US8126656B2 (en) * 2007-02-07 2012-02-28 Fujitsu Limited Molecular design method and computer-readable storage medium
US10467325B2 (en) 2007-06-11 2019-11-05 Intel Corporation Acceleration of multidimensional scaling by vector extrapolation techniques
US20100211366A1 (en) * 2007-07-31 2010-08-19 Sumitomo Heavy Industries, Ltd. Molecular simulating method, molecular simulation device, molecular simulation program, and recording medium storing the same
US8280699B2 (en) * 2007-07-31 2012-10-02 Sumitomo Heavy Industries, Ltd. Molecular simulating method, molecular simulation device, molecular simulation program, and recording medium storing the same
US20100329930A1 (en) * 2007-10-26 2010-12-30 Imiplex Llc Streptavidin macromolecular adaptor and complexes thereof
US8993714B2 (en) 2007-10-26 2015-03-31 Imiplex Llc Streptavidin macromolecular adaptor and complexes thereof
US20090228445A1 (en) * 2008-03-04 2009-09-10 Systems Biology (1) Pvt. Ltd. Automated molecular mining and activity prediction using xml schema, xml queries, rule inference and rule engines
US20090281975A1 (en) * 2008-05-06 2009-11-12 Microsoft Corporation Recommending similar content identified with a neural network
US8032469B2 (en) * 2008-05-06 2011-10-04 Microsoft Corporation Recommending similar content identified with a neural network
US9102526B2 (en) 2008-08-12 2015-08-11 Imiplex Llc Node polypeptides for nanostructure assembly
US9285363B2 (en) 2009-05-11 2016-03-15 Imiplex Llc Method of protein nanostructure fabrication
US8672685B2 (en) * 2009-10-07 2014-03-18 Bitwixt Software Systems Llc Electron configuration teaching systems and methods
US20110081637A1 (en) * 2009-10-07 2011-04-07 Doherty David C Electron Configuration Teaching Systems and Methods
US8862520B2 (en) 2009-12-14 2014-10-14 Massachusetts Institute Of Technology Methods, systems and media utilizing ranking techniques in machine learning
WO2011081950A1 (en) * 2009-12-14 2011-07-07 Massachusetts Institute Of Technology Methods, systems and media utilizing ranking techniques in machine learning
US20110145175A1 (en) * 2009-12-14 2011-06-16 Massachusetts Institute Of Technology Methods, Systems and Media Utilizing Ranking Techniques in Machine Learning
US8594435B2 (en) * 2010-02-04 2013-11-26 Sony Corporation Image processing device and method, and program therefor
US20110188758A1 (en) * 2010-02-04 2011-08-04 Sony Corporation Image processing device and method, and program therefor
US9218460B2 (en) * 2011-05-09 2015-12-22 The Regents Of The University Of California Defining and mining a joint pharmacophoric space through geometric features
US20120290624A1 (en) * 2011-05-09 2012-11-15 The Regents Of The University Of California Defining and mining a joint pharmacophoric space through geometric features
US10254944B2 (en) 2012-03-21 2019-04-09 Zymeworks Inc. Systems and methods for making two dimensional graphs of complex molecules
US20150051889A1 (en) * 2012-03-21 2015-02-19 Zymeworks Inc. Systems and methods for making two dimensional graphs of complex molecules
US10168885B2 (en) 2012-03-21 2019-01-01 Zymeworks Inc. Systems and methods for making two dimensional graphs of complex molecules
US9336302B1 (en) 2012-07-20 2016-05-10 Zuci Realty Llc Insight and algorithmic clustering for automated synthesis
US10318503B1 (en) 2012-07-20 2019-06-11 Ool Llc Insight and algorithmic clustering for automated synthesis
US9607023B1 (en) 2012-07-20 2017-03-28 Ool Llc Insight and algorithmic clustering for automated synthesis
US11216428B1 (en) 2012-07-20 2022-01-04 Ool Llc Insight and algorithmic clustering for automated synthesis
US11061873B2 (en) 2016-03-17 2021-07-13 Elsevier, Inc. Systems and methods for electronic searching of materials and material properties
US11205103B2 (en) 2016-12-09 2021-12-21 The Research Foundation for the State University Semisupervised autoencoder for sentiment analysis
US10832800B2 (en) 2017-01-03 2020-11-10 International Business Machines Corporation Synthetic pathway engine
US10229092B2 (en) 2017-08-14 2019-03-12 City University Of Hong Kong Systems and methods for robust low-rank matrix approximation

Also Published As

Publication number Publication date
WO1998020459A1 (en) 1998-05-14
US20020069043A1 (en) 2002-06-06
US20030014191A1 (en) 2003-01-16
EP0935789A1 (en) 1999-08-18
AU5440798A (en) 1998-05-29
US6295514B1 (en) 2001-09-25
EP0935784A2 (en) 1999-08-18
AU5180098A (en) 1998-05-29
JP2001507675A (en) 2001-06-12
US6421612B1 (en) 2002-07-16
IL129728A0 (en) 2000-02-29
WO1998020437A2 (en) 1998-05-14
AU732397B2 (en) 2001-04-26
JP2001503546A (en) 2001-03-13
IL129498A0 (en) 2000-02-29
CA2269669A1 (en) 1998-05-14
CA2270527A1 (en) 1998-05-14
WO1998020437A3 (en) 1998-06-25
AU722989B2 (en) 2000-08-17

Similar Documents

Publication Publication Date Title
US7188055B2 (en) Method, system, and computer program for displaying chemical data
EP1078333B1 (en) System, method, and computer program product for representing proximity data in a multi-dimensional space
Bruckner et al. Result-driven exploration of simulation parameter spaces for visual effects design
US5379234A (en) Computer-aided chemical illustration system
US7117187B2 (en) Method, system and computer program product for non-linear mapping of multi-dimensional data
CA2286549A1 (en) Statistical deconvoluting of mixtures
WO2012102990A2 (en) Method and apparatus for selecting clusterings to classify a data set
Peltonen et al. Information retrieval approach to meta-visualization
US6332040B1 (en) Method and apparatus for sorting and comparing linear configurations
US7054757B2 (en) Method, system, and computer program product for analyzing combinatorial libraries
da Silva et al. Beyond the third dimension: Visualizing high-dimensional data with projections
Bisson et al. Improving visualization of large hierarchical clustering
AU7166900A (en) System, method, and computer program product for the visualization and interactive processing and analysis of chemical data
Lee et al. The next frontier for bio-and cheminformatics visualization
Kraemer et al. Molecules to maps: tools for visualization and interaction in support of computational biology.
US20030208322A1 (en) Apparatus, method, and computer program product for plotting proteomic and genomic data
Demšar Exploring geographical metadata by automatic and visual data mining
Wolter Navigation in time-varying scientific data.
New Visual Analytics Techniques for Trend Detection in Correlation Data
Maniyar et al. Visual data mining: integrating machine learning with information visualization
Nabi et al. Paraglide: Interactive Parameter Space Partitioning for Computer Simulations
MXPA00010727A (en) System, method, and computer program product for representing proximity data in a multi-dimensional space
Mining VDM@ ICDM

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: JOHNSON & JOHNSON PHARMACEUTICAL RESEARCH AND DEVEL

Free format text: MERGER;ASSIGNOR:3-DIMENSIONAL PHARMACEUTICALS, INC.;REEL/FRAME:016950/0704

Effective date: 20040624

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20190306