WO2002010742A2 - System and method for predicting adme/tox characteristics of a compound - Google Patents
System and method for predicting adme/tox characteristics of a compound
- Publication number
- WO2002010742A2 (application PCT/US2001/023763)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- compound
- compounds
- chemical
- property
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/30—Prediction of properties of chemical compounds, compositions or mixtures
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
- G16H20/10—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/50—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
Definitions
- the present invention relates to systems and methods for predicting the characteristics of a chemical compound.
- the present invention is related to pharmacokinetic systems and methods for predicting the Absorption, Distribution, Metabolism, Excretion and/or Toxicological (ADME/TOX) characteristics or properties of a chemical compound based on structural modeling of the chemical compound and mathematical analysis.
- Pharmacodynamics refers to the study of fundamental or molecular interactions between drug and body constituents, which through a subsequent series of events results in a pharmacological response.
- the magnitude of a pharmacological effect depends on the time-dependent concentration of drug at the site of action (e.g., target receptor-ligand/drug interaction).
- Factors that influence rates of delivery and disappearance of drug to or from the site of action over time include its ADME properties.
- the study of factors that influence how drug concentration varies with time is the subject of pharmacokinetics. Additionally, the toxicological properties of a drug should also be considered. These properties taken together represent the ADME/TOX properties of a compound.
- the site of drug action is located on the other side of a membrane from the site of drug administration.
- an orally administered drug must be absorbed through a series of physiological barriers at some point or points along the gastrointestinal (GI) tract. Once the drug is absorbed, and thus passes a membrane barrier of the GI tract, it is transported through the portal vein to the liver and then eventually into systemic circulation (i.e., blood and lymph) for delivery to other body parts and tissues by blood flow.
- how well a drug crosses membranes is of key importance in assessing the rate and extent of absorption and distribution of the drug throughout different body compartments and tissues.
- if an otherwise highly potent drug is administered extravascularly (e.g., orally) but is poorly absorbed (e.g., in the GI tract), a majority of the drug will be excreted or eliminated and thus cannot be distributed to the site of action.
- ADME/TOX properties of a candidate drug are usually determined through conventional laboratory testing (in vitro or in vivo) combined with mathematical modeling.
- pharmacokinetic data analysis may be based on empirical observations after administering a known dose of drug to an animal and fitting of the data collected from the animal (e.g., from its liver cells) by either descriptive equations or mathematical (compartmental) models.
- Time-concentration data from a subject that has been given a particular dose of a drug may be collected followed by plotting the data points on a logarithmic graph of drug concentration versus time to generate one type of concentration-time curve.
- a mathematical equation is used to model what might happen to the drug as it is transported through a human body.
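As an illustrative sketch (not part of the original disclosure), the logarithmic concentration-time plot described above corresponds to fitting a straight line to ln(concentration) versus time. The snippet assumes a simple one-compartment model with first-order elimination, C(t) = C0 * exp(-k * t); the function name `fit_log_linear` and the synthetic data values are hypothetical.

```python
import math

def fit_log_linear(times, concentrations):
    """Least-squares line through (t, ln C); returns (c0, k) for C(t) = c0 * exp(-k * t)."""
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope

# Synthetic, noise-free data generated from c0 = 10 and k = 0.5 (illustrative values only).
times = [0.5, 1.0, 2.0, 4.0, 8.0]
concentrations = [10.0 * math.exp(-0.5 * t) for t in times]

c0, k = fit_log_linear(times, concentrations)
half_life = math.log(2) / k  # elimination half-life implied by the fitted rate constant
```

Real concentration-time data are noisy and often require multi-compartment models, so this fit should be read only as the simplest instance of the curve-fitting step.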
- the present invention solves the aforementioned problems by providing new and improved systems and methods of predicting the ADME/TOX properties of candidate drugs (chemical compounds).
- Such systems and methods may use empirical statistical pattern recognition approaches to take known chemical structures and characteristics (e.g., ADME/TOX) of all compounds for which data has been generated (e.g., data is available from various labs, is published, etc.) and to relate the structures and their characteristics to experimental data in such a way to accurately predict the characteristics of a new proposed structure (compound).
- a system for predicting the target data of a compound in a mammalian (actual descriptions are human related) body comprising a database facility and a processor facility.
- the database facility is configured to store input data.
- the processor facility is configured to allow the entry of input data relating to a new proposed chemical compound including structural data, to perform an analysis of the chemical compound by mapping the data entered to produce predicted target data for the chemical compound based on the analysis.
- a method for creating or developing a model to be used for evaluating the ADME/TOX characteristics of a proposed compound comprises the following steps:
- step (a) selecting training compounds based on the characteristics to be predicted of the proposed compounds (for which a complete set of input and target data exists)
- step (b) selecting descriptors applicable to the characteristic to be predicted based on an analysis of the training compounds selected in step (a), such as via a genetic algorithm or other appropriate mathematical analysis
- Compounds should be selected for their applicability to the problem to be solved, for example Caco-2 effective permeability (Caco-2 cells possess many of the properties of the small intestine; as such, these cells represent a useful and well-accepted tool for studying the absorption and/or secretion of drugs/chemicals across the intestinal mucosa).
- drugs may be selected as compounds to be analyzed because of their proven permeability or absorption properties.
- Other compounds may similarly be selected and added to the data set. Once compounds have been analyzed for descriptors, they may be tested by conventional means (e.g., lab testing, etc.) to determine various characteristics to be predicted by the system above (e.g., Caco-2 permeability). Once all data has been analyzed and collected, it is loaded into the database for use in predicting the ADME/TOX properties of proposed compounds.
- the method may include:
- step (b) selecting training compounds from the database facility based on the characteristics to be predicted of the proposed compounds (for which a complete set of input and target data exists)
- step (c) selecting the most meaningful descriptors applicable to the characteristic to be predicted based on an analysis of the training compounds selected in step (b), such as via a genetic algorithm or other appropriate mathematical analysis
- step (h) running the model determined in either step (e), (f) or (g) using the required input data (the identity of the subset of input data itself was determined in step (c)) to predict the required target data
- a system for predicting the chemical properties of at least one proposed compound comprising: a database facility configured to store and to serve input data relating to the characteristics of training compounds (descriptor(s) (for example, structure and experimental data)) as well as target data (for example, chemical properties of selected compounds) for the training compounds; and a processor facility coupled to the database facility and configured to predict the characteristics of a proposed compound by:
- step (b) selecting descriptors applicable to the characteristic to be predicted based on an analysis of the training compounds selected in step (a), such as via a genetic algorithm or other appropriate mathematical analysis
- a system for predicting the chemical properties of at least one proposed compound comprising: a database facility configured to store and to serve input data relating to the characteristics of the proposed compound (descriptor(s) (for example, structure and experimental data)); and a processor facility coupled to the database facility and configured to predict the characteristics of a proposed compound by:
- a system for predicting the chemical properties of at least one proposed compound comprising: a database facility configured to store and to serve input data relating to the characteristics of training compounds (descriptor(s) (for example, structure and experimental data)) as well as target data (for example, chemical properties of selected compounds) for the training compounds; and a processor facility coupled to the database facility and configured to predict the characteristics of a proposed compound by:
- step (c) selecting the most meaningful descriptors applicable to the characteristic to be predicted based on an analysis of the training compounds selected in step (b), such as via a genetic algorithm or other appropriate mathematical analysis;
- step (g) combining (via boosting, committee machines, etc.) a set of two or more models produced in (e) or (f) based upon performance on validation sets obtained in (d) to form a composite model; and (h) running the model determined in either step (e), (f) or (g) using the required input data (the identity of the subset of input data itself was determined in step (c)) to predict the required target data.
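The combining step (g) can be sketched, under stated assumptions, as a simple committee machine that averages model outputs with weights derived from validation performance. The source does not specify a weighting rule; the inverse-error weighting, the function name `committee_predict`, and both toy models are hypothetical. This is a static committee, not a boosting procedure.

```python
def committee_predict(models, validation_errors, descriptors):
    """Weighted average of model outputs; each weight is 1 / validation error."""
    weights = [1.0 / err for err in validation_errors]
    total = sum(weights)
    return sum(w * m(descriptors) for w, m in zip(weights, models)) / total

# Two hypothetical models mapping a descriptor list to a predicted property value.
model_a = lambda x: 2.0 * x[0] + 1.0
model_b = lambda x: 2.0 * x[0] + 3.0

# Hypothetical validation errors: model_a was three times more accurate than model_b,
# so its prediction (3.0) pulls the composite further than model_b's (5.0).
prediction = committee_predict([model_a, model_b], [0.1, 0.3], [1.0])
```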
- Analysis used to select the most meaningful subset of input data (step (c)) for predicting target data may be performed via feature selection methods such as forwards or backwards selection and may include regression/classification methods. Such analyses should consider model bias and overtraining.
- the preceding analyses may include various data compression techniques.
- a particular model may be biased if the training data is poorly distributed (e.g., the distribution has sharp peaks, regions between nodes that are devoid of data, etc.). Accordingly, compounds may be selected and tested to improve the distribution and enhance the model's ability to generalize. Furthermore, the distributions of the input and target data, along with the proposed compound's descriptors and characteristic values, are used to calculate a confidence metric.
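Forward selection of descriptors, mentioned above as one feature selection method, can be sketched as a greedy loop that keeps adding the descriptor that most reduces an estimate of generalization error. The sketch below is an assumption-laden illustration: it scores descriptor subsets with the leave-one-out error of a 1-nearest-neighbour predictor, which the source does not prescribe, and all names and data values are hypothetical.

```python
def loo_error(rows, targets, features):
    """Leave-one-out mean squared error of a 1-nearest-neighbour predictor
    restricted to the given descriptor indices."""
    total = 0.0
    for i, row in enumerate(rows):
        best_dist, best_target = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue
            dist = sum((row[f] - other[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_target = dist, targets[j]
        total += (targets[i] - best_target) ** 2
    return total / len(rows)

def forward_select(rows, targets, max_features):
    """Greedily add the descriptor that most lowers the leave-one-out error."""
    selected, best_err = [], float("inf")
    remaining = list(range(len(rows[0])))
    while remaining and len(selected) < max_features:
        err, feat = min((loo_error(rows, targets, selected + [f]), f) for f in remaining)
        if err >= best_err:
            break  # no remaining descriptor improves the estimate
        selected.append(feat)
        remaining.remove(feat)
        best_err = err
    return selected, best_err

# Toy data: descriptor 0 tracks the target, descriptor 1 is unrelated.
rows = [[0.0, 5.0], [1.0, 1.0], [2.0, 4.0], [3.0, 0.0], [4.0, 3.0], [5.0, 2.0]]
targets = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
selected, error = forward_select(rows, targets, max_features=2)
```

Stopping as soon as an added descriptor fails to lower the held-out error is one simple guard against the overtraining concern raised above.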
- FIG. 1 is a block diagram of a system for predicting the ADME/Tox properties of a candidate drug
- FIG. 2 is a flow chart of the method for developing a model that will predict the ADME/Tox properties of a candidate drug; and for predicting the ADME/Tox properties of a candidate drug.
- FIGS. 3 - 45 are individual showings of particular points pertinent and important to the present invention and illustrate specific examples of an embodiment of the invention aimed at predicting human ADME data.
- Absorption Transfer of a compound across a physiological barrier as a function of time and initial concentration. Amount or concentration of the compound on the external and/or internal side of the barrier is a function of transfer rate and extent, and may range from zero to unity.
- Affine Regression Linearly combining input data to approximate output data. This is essentially a linear regression that does not require the regression to go through zero.
- Bioavailability Fraction of an administered dose of a compound that reaches the sampling site and/or site of action. May range from zero to unity. Can be assessed as a function of time.
- Boosting A general method which attempts to increase the accuracy of a learning algorithm.
- Compound Chemical entity.
- Computer Readable Medium Medium for storing, retrieving and/or manipulating information using a computer. Includes optical, digital, magnetic mediums and the like; examples include portable computer diskette, CD-ROMs, hard drive on computer etc. Includes remote access mediums; examples include internet or intranet systems. Permits temporary or permanent data storage, access and manipulation.
- Cross Validation Used to estimate the generalization error. This method is based on resampling the data set, using randomly (or otherwise chosen) samples of the training set as test sets.
- Input Data Data which is used as an input in the training or execution of a model.
- Target Data Data for which a model is generated. Could be either experimentally determined or predicted.
- Test Data Experimentally determined data.
- Descriptor An element of the input data.
- Regression/Classification Methods for mapping the input data to the target data.
- Regression refers to the methods applicable to forming a continuous prediction of the target data
- classification or in general pattern recognition
- the specific methods for performing the regression or classification include where appropriate:
- Mapping The process of relating the input data space to the target data space, which is accomplished by regression/classification and produces a model that predicts or classifies the target data.
- Feature Selection Methods The method of selecting desirable descriptors from the input data to enable the prediction or classification of the target data. This is typically accomplished by forward selection, backward selection, branch and bound selection, genetic algorithmic selection, or evolutionary selection.
- ADME Properties of absorption, distribution, metabolism, and excretion; encompasses other measures related to absorption, distribution, metabolism, and excretion, for example, hepatocyte turnover or Caco-2 effective permeability.
- Dissolution Process by which a compound becomes dissolved in a solvent.
- Fisher's Discriminant Analysis A linear method which reduces the input data dimension by appropriately weighting the descriptors in order to best aid the linear separation and thus classification of target data.
- Genetic Algorithms Based upon the natural selection mechanism. A population of models undergo mutations and only those which perform the best contribute to the subsequent population of models.
- Kernel Representations Variations of classical linear techniques employing a Mercer's Kernel or variations to incorporate specifically defined classes of nonlinearity. These include Fisher's Discriminant Analysis and principal component analysis.
- Kernel Representations as used by the present invention are described in the article, “Fisher Discriminant Analysis with Kernels,” Sebastian Mika, Gunnar Ratsch, Jason Weston, Bernhard Scholkopf, and Klaus-Robert Muller, GMD FIRST, Rudower Chaussee 5, 12489 Berlin, Germany, © IEEE 1999 (0-7803-5673-X/99), and in the article, “GA-based Kernel Optimization for Pattern Recognition: Theory for EHW Application,” Moritoshi Yasunaga, Taro Nakamura, Ikuo Yoshihara, and Jung Kim, © IEEE 2000 (0-7803-6375-2/00), which are both hereby incorporated herein by reference.
- Metabolism Conversion of a compound (the parent compound) into one or more different chemical entities (metabolites).
- Artificial neural networks A parallel and distributed system made up of the interconnection of simple processing units. Artificial neural networks as used in the present invention are described in detail in the book entitled, “Neural Networks: A Comprehensive Foundation,” Second Edition, Simon Haykin, McMaster University, Hamilton, Ontario, Canada, published by Prentice Hall © 1999, which is hereby incorporated herein by reference.
- Permeability Ability of a barrier to permit passage of a substance or the ability of a substance to pass through a barrier. Refers to the concentration-dependent or concentration-independent rate of transport (flux), and collectively reflects the effects of characteristics such as molecular size, charge, partition coefficient and stability of a compound on transport. Permeability is substance and/or barrier specific.
- Physiologic Pharmacokinetic Model Mathematical model describing movement and disposition of a compound in the body or an anatomical part of the body based on pharmacokinetics and physiology.
- Principal Component Analysis A type of non-directed data compression which uses a linear combination of features to produce a lower dimension representation of the data.
- An example of principal component analysis as applicable to use in the present invention is described in the article, “Nonlinear Component Analysis as a Kernel Eigenvalue Problem,” Bernhard Scholkopf, Neural Computation, Vol. 10, Issue 5, pp. 1299-1319, 1998, MIT Press, which is hereby incorporated herein by reference.
- Simulation Engine Computer-implemented instrument that simulates behavior of a system using an approximate mathematical model of the system. Combines mathematical model with user input variables to simulate or predict how the system behaves. May include system control components such as control statements (e.g., logic components and discrete objects).
- Solubility Property of being soluble; relative capability of being dissolved.
- Support Vector Machines Method which regresses/classifies by projecting input data into a higher dimensional space. Examples of Support Vector Machines and methods as applicable to the present invention are described in the article, “Support Vector Methods in Learning and Feature Extraction,” Bernhard Scholkopf, Alex Smola, Klaus-Robert Muller, Chris Burges, Vladimir Vapnik, Special issue with selected papers of ACNN'98, Australian Journal of Intelligent Information Processing Systems, 5 (1), 3-9, and in the article, “Distinctive Feature Detection using Support Vector Machines,” Partha Niyogi, Chris Burges, and Padma Ramesh, Bell Labs, Lucent Technologies, USA, © IEEE 1999 (0-7803-5041-3/99), which are both hereby incorporated herein by reference.
- ADME Absorption, Distribution, Metabolism, and Elimination
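Two of the entries defined above, Affine Regression and Cross Validation, can be illustrated together. The following is a minimal sketch for a single descriptor, assuming a deterministic k-fold split (every k-th sample held out); the function names and the synthetic data are hypothetical and are not taken from the source.

```python
def fit_affine(xs, ys):
    """Least-squares fit of y = slope * x + intercept (intercept not forced to zero)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def k_fold_mse(xs, ys, k):
    """Estimate generalization error by holding out every k-th sample in turn."""
    squared_errors = []
    for fold in range(k):
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k != fold]
        test = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i % k == fold]
        slope, intercept = fit_affine([x for x, _ in train], [y for _, y in train])
        squared_errors += [(y - (slope * x + intercept)) ** 2 for x, y in test]
    return sum(squared_errors) / len(squared_errors)

# Synthetic data lying exactly on y = 3x + 2 (illustrative values only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0 * x + 2.0 for x in xs]

slope, intercept = fit_affine(xs, ys)
cv_error = k_fold_mse(xs, ys, k=3)
```

Because the synthetic data are noise-free, the held-out error is essentially zero; with real assay data the cross-validated error would be the quantity to compare across candidate models.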
- the present invention is directed to systems and methods for predicting various characteristics (ADME/Tox characteristics) related to the way a body will absorb, distribute, metabolize, eliminate, and respond to potential toxic effects of a compound based on the compound's chemical structure and/or associated experimental data.
- the molecular structure of a proposed compound may be input as a 2-dimensional (2D) connection table, which is essentially a two-dimensional graph of how the atoms of a compound are arranged (the structures may actually be 3-dimensional (3D), but may be represented as 2D via well known methods).
- the structure may be input as a 3D structure. Either 2D or 3D structural representations are desirable inputs for models using structure to predict ADME/Tox characteristics.
- the first is whether or not it actually interacts with a particular molecular target in the body (in most cases, some kind of protein); the second is whether or not the body can absorb, metabolize, distribute and eliminate the compound adequately, and third, whether or not the compound elicits a toxic response.
- the present invention provides systems and methods for predicting the ADME/Tox properties (e.g., Caco-2 effective permeability or Caco-2 Peff), of a proposed compound through statistical analysis of compound data.
- the first section of the present invention employs mathematical analyses of a diverse compilation of training data (chemical compound data including conventional experimental results, chemical descriptor analysis, etc.) to determine what data relates to the ADME/Tox property to be predicted.
- type or types of data that are applicable to the ADME/Tox property descriptors
- mathematical analyses of the selected training data to obtain the selected ADME/Tox characteristic for each training data compound are performed in order to create a model.
- the model can then be used to predict a proposed compound's ADME/Tox property by inputting the same type of data for the proposed compound into the model. Running the model with the proposed compound's descriptors produces the predicted ADME/Tox characteristic.
- Models are only as good as the input assay and test data, and therefore, a key to producing highly accurate predictions is the use of well-defined standard operating procedures for generating data as well as ensuring that the data has a good distribution. Therefore, the present invention provides a method for collecting and compiling a diverse training data set to be used to mathematically predict the ADME/Tox characteristics of a proposed chemical compound.
- the input data is collected and/or calculated for a variety of chemical compounds preferably representing currently prescribed drugs as well as failed drugs and potential new drugs (this is a continual process, since as more data is collected, the resulting models will have improved performance).
- Assay data may be collected from well established sources or derived by conventional means.
- in vitro assays characterizing permeability and transport mechanisms may include in vitro cell-based diffusion experiments and immobilized membrane assays, as well as in situ perfusion assays, intestinal ring assays, incubation assays in rodents, rabbits, dogs, non-human primates and the like, assays of brush border membrane vesicles, and everted intestinal sacs or tissue section assays.
- In vivo assays typically are conducted in animal models such as mouse, rat, rabbit, hamster, dog, and monkey to characterize bioavailability of a compound of interest, including distribution, metabolism, elimination and toxicity.
- cell culture-based in vitro assays or biochemical assays from isolated cell components or recombinantly expressed components are preferred.
- tissue-based in vitro and/or mammal-based in vivo data are preferred.
- Cell culture models are preferred for high-throughput screening, as they allow experiments to be conducted with relatively small amounts of a test sample while maximizing surface area and can be utilized to perform large numbers of experiments on multiple samples simultaneously.
- Cell models or biochemical assays also require fewer experiments since there is no animal to animal variability.
- An array of different cell lines also can be used to systematically collect complementary input data related to a series of transport barriers (passive paracellular, active paracellular, carrier-mediated influx, carrier-mediated efflux) and metabolic barriers (protease, esterase, cytochrome P450, conjugation enzymes).
- Cells and tissue preparations employed in the assays can be obtained from repositories, or from any eukaryote, such as rabbit, mouse, rat, dog, cat, monkey, bovine, ovine, porcine, equine, humans and the like.
- a tissue sample can be derived from any region of the body, taking into consideration ethical issues. The tissue sample can then be adapted or attached to various support devices depending on the intended assay. Alternatively, cells can be cultivated from tissue. This generally involves obtaining a biopsy sample from a target tissue followed by culturing of cells from the biopsy.
- Cells and tissue also may be derived from sources that have been genetically manipulated, such as by recombinant DNA techniques, that express a desired protein or combination of proteins relevant to a given screening assay.
- Artificially engineered tissues also can be employed, such as those made using artificial scaffolds/matrices and tissue growth regulators to direct three-dimensional growth and development of cells used to inoculate the scaffolds/matrices. It will be understood that ideally any known test results could be added to a test data set in order to adjust the model or to provide a new property to solve towards.
- the drugs (compounds) selected should be as diverse in character as possible. Therefore, the compounds may be analyzed and defined in chemical space. Chemical space can be represented as an N-base coordinate system in which to plot compounds and may be used to show the diversity of a sample of compounds. The axes of the N-base coordinate system may be selected from all or some of the input data. Drugs may be eliminated from a particular training data set (the training data may be grouped to solve for a particular ADME/Tox property) if it is determined that they bias the training data set.
- a collection of drugs have been plotted in a six-base chemical space (see FIG. 3).
- the axes of the six-base are physicochemical descriptors that were selected so that the best separation of known drugs is maintained.
- Data is also selected from combinatorial libraries of chemicals which are near neighbors for each of the drugs creating an extended data set.
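A rough way to inspect diversity in such a chemical space is to scale each descriptor axis to a common range and measure how far each compound sits from its nearest neighbour. The sketch below assumes min-max scaling and Euclidean distance, neither of which is specified by the source, and uses two hypothetical descriptors (molecular weight and logP) with made-up values rather than the six axes of FIG. 3.

```python
import math

def min_max_scale(rows):
    """Rescale each descriptor (column) to [0, 1] so no single axis dominates distances."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]

def nearest_neighbour_distances(rows):
    """Euclidean distance from each compound to its closest neighbour in chemical space."""
    return [
        min(math.dist(a, b) for j, b in enumerate(rows) if j != i)
        for i, a in enumerate(rows)
    ]

# Hypothetical descriptors per compound: (molecular weight, logP).
compounds = [[100.0, 0.0], [110.0, 0.0], [300.0, 4.0]]
scaled = min_max_scale(compounds)
distances = nearest_neighbour_distances(scaled)
```

Small nearest-neighbour distances flag clusters of near-duplicate compounds that could bias a training set, while large distances flag sparsely covered regions where near neighbours from combinatorial libraries might be added.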
- the compounds are ideally each tested for the various ADME/Tox characteristics or properties to be predicted; however, it is not necessary to test every compound for actual results.
- Each data set of experimental data is analyzed to decide how it is going to be used in model building. For example, is it appropriate to use a certain data set to predict absolute values of compounds or is there too much error in the data set? If there is not enough data in a data set to cover a particular range (either coverage in the data space, representation in the data space, or certainty in the data space) it is possible to put the data into bins, such as 0 to 20, 21 to 40, 41 to 60, 61 to 80, 81 to 100. Alternatively, the data may require scaling correction to account for systematic variations in the data.
- One having ordinary skill in the art will readily understand the grouping of experimental data, scaling and systematic variations used to adjust a data set.
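The binning described above can be sketched as follows. The bin edges are those listed in the text; the rounding step is an assumption introduced here, because the bins as written leave gaps between, for example, 20 and 21. The name `assign_bin` and the sample measurements are hypothetical.

```python
BIN_EDGES = [(0, 20), (21, 40), (41, 60), (61, 80), (81, 100)]

def assign_bin(value):
    """Return the (low, high) bin containing a percentage value.

    Assumption: values are rounded to whole percentages first, since the
    bins as listed do not cover fractional values such as 20.4.
    """
    rounded = round(value)
    for low, high in BIN_EDGES:
        if low <= rounded <= high:
            return (low, high)
    raise ValueError(f"value {value} is outside the 0 to 100 range")

# Made-up measurements, for illustration only.
measurements = [3.2, 20.4, 47.0, 99.9]
binned = [assign_bin(m) for m in measurements]
```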
- Chemical descriptors are well known in the art of modeling compounds, and may be determined by analyzing a 2D or 3D structure of a compound.
- training data input and target data
- a relational database or other known means for making the data easily accessible and available to be manipulated and analyzed in accordance with the present invention.
- system 100 includes a processor facility 102 and a database facility 104 coupled to a network 106.
- the processor facility 102 may be a conventional computer, such as a PC, configured to access database facility 104 and to execute analytical software in accordance with the present invention.
- Database facility 104 may be a conventional database server running a database engine, such as SQLSERVER® or ORACLE 8i® and is configured to maintain and to serve data, such as the test data described above.
- the data may be stored and maintained by any means, such as in a relational dataspace or an object-oriented dataspace.
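One possible relational layout for the database facility is sketched below, using an in-memory SQLite database purely as a stand-in for the database servers named above. The table names, column names, and inserted values are all hypothetical; the source does not disclose a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the database facility
conn.executescript("""
    CREATE TABLE compound (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE descriptor (
        compound_id INTEGER REFERENCES compound(id),
        name TEXT NOT NULL,
        value REAL NOT NULL
    );
    CREATE TABLE target_data (
        compound_id INTEGER REFERENCES compound(id),
        property TEXT NOT NULL,
        value REAL NOT NULL
    );
""")

# Placeholder rows; the numbers are not real measurements.
conn.execute("INSERT INTO compound VALUES (1, 'example compound')")
conn.execute("INSERT INTO descriptor VALUES (1, 'molecular_weight', 180.2)")
conn.execute("INSERT INTO target_data VALUES (1, 'caco2_peff', 1.5)")

# Select training compounds that have target data for the property of interest.
rows = conn.execute(
    """
    SELECT c.name, d.name, d.value, t.value
    FROM compound c
    JOIN descriptor d ON d.compound_id = c.id
    JOIN target_data t ON t.compound_id = c.id
    WHERE t.property = ?
    """,
    ("caco2_peff",),
).fetchall()
```

Keeping descriptors and target data in separate tables keyed by compound makes it straightforward to select only compounds for which a complete set of input and target data exists.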
- the present invention includes analytical tools which may be executed on processor facility 102.
- the analytical tools may be in the form of software that is loaded locally on processor facility 102 or may be served via a server 108 (e.g., an HTML form, JAVA program, etc. served on a web server), which optionally may be included.
- a client facility 110 may be connected to the network 106, which may include parts of the Internet and World Wide Web (WWW), or local area networks (LANs).
- the client facility 110 could be a web browser or other terminal configured to access and run the analytical tools remotely or to download the analytical tools (e.g., via HTML, IIOP, etc.) via network 106 and run them locally.
- the configuration of system 100 is merely exemplary and is not meant to limit the present invention. It will be appreciated that the present invention may take many forms and configurations.
- the present invention may be implemented via a software solution including a database and forms configured to run on a stand-alone PC, or may alternatively be a combination of software and firmware, and may be implemented in a client-server, stand-alone or web configuration.
- the operational aspects of the present invention are now described with reference to the flow chart in FIG. 2.
- the flow chart represents two independent starting pathways, a model development pathway and a model execution (or prediction) pathway, which meet at step S2-5. These two initial pathways will be described independently.
- Model Development Pathway (S2-1a -> S2-5)
- the model development pathway begins in step S2-1a and immediately proceeds to step S2-2a.
- the ADME/Tox property to be predicted is selected. For example, it may be desired to predict the Caco-2 Peff of the compound, or the FDP (fraction of the dose administered that is absorbed at the portal vein).
- the system might allow for the selection to be from a table, radio group, pop-list, or by any known means.
- a set of training compounds appropriate for developing the selected ADME/Tox property model is entered into the system. Many compound descriptors may be entered or calculated, such as molecular weight, structure, specific gravity, etc.
- a group of meaningful input data is selected based on the property to be predicted or a related performance metric using feature selection methods. For example, a genetic algorithm coupled with a regression/classification method, such as a neural network, may be used to build many models predicting the Caco-2 Peff of a compound. Features are then selected from the resulting models with the objective of choosing the smallest number of dimensions that effectively describe the model space.
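The genetic-algorithm style of feature selection described above can be sketched as evolving bit masks over the available descriptors. In this minimal illustration the fitness function is a toy stand-in: in the approach the text describes, fitness would come from the performance of a regression/classification model (such as a neural network) trained on the masked descriptors. The population size, crossover and mutation scheme, seed, and all names are assumptions.

```python
import random

def genetic_select(n_features, fitness, generations=30, pop_size=12, seed=0):
    """Evolve bit masks over descriptors; the fittest masks always survive (elitism)."""
    rng = random.Random(seed)
    population = [[rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)
            child = mother[:cut] + father[cut:]   # single-point crossover
            flip = rng.randrange(n_features)
            child[flip] = not child[flip]         # single-bit mutation
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Hypothetical fitness: reward masks that keep descriptors 0 and 2, penalize mask size.
# A real fitness would be a model's validation performance on the selected descriptors.
def toy_fitness(mask):
    return 2.0 * (mask[0] + mask[2]) - 0.5 * sum(mask)

best = genetic_select(5, toy_fitness)
```

The size penalty in the toy fitness mirrors the stated objective of choosing the smallest number of dimensions that effectively describe the model space.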
- a model is created at step S2-4a by using regression/classification methods to map the input data to the ADME/Tox property to be predicted.
- the modeling effort may involve Affine Regressions, Nearest Neighbor Methods, Discriminant Analysis, Support Vector Machines, Artificial Neural Networks, Data Compression techniques (targeted and non-targeted), Genetic Algorithms, and Boosting.
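As an illustration of the simplest method in this list, the sketch below fits an affine regression from a single descriptor to a property value by ordinary least squares. The function name `fit_affine` and the numbers are hypothetical; a real model would use many descriptors and measured data.

```python
def fit_affine(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up descriptor values and property values for four training compounds
descriptor = [1.0, 2.0, 3.0, 4.0]
target = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_affine(descriptor, target)
predicted = slope * 5.0 + intercept  # prediction for a new compound
print(slope, intercept, predicted)
```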
- a method for calculating a confidence metric is created by analyzing information related to the model such as the distributions and values of the input and target data and the methods involved in building the model.
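One possible reading of such a confidence metric, sketched in Python under stated assumptions: confidence is taken to be the fraction of a compound's descriptors that lie within a chosen number of standard deviations of the training-set mean. The name `make_confidence`, the `z_limit` parameter, and the scoring rule are assumptions for illustration only; the text does not specify how the metric is computed.

```python
from statistics import mean, pstdev

def make_confidence(train_X, z_limit=2.0):
    """Build a confidence function from the distribution of training inputs."""
    columns = list(zip(*train_X))
    stats = [(mean(col), pstdev(col)) for col in columns]

    def confidence(x):
        inside = 0
        for value, (mu, sigma) in zip(x, stats):
            if sigma == 0:
                in_range = value == mu   # constant descriptor: must match exactly
            else:
                in_range = abs(value - mu) / sigma <= z_limit
            inside += 1 if in_range else 0
        return inside / len(stats)       # 1.0 = every descriptor looks familiar

    return confidence

# Made-up training descriptors (two descriptors per compound)
train_X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
confidence = make_confidence(train_X)

print(confidence([2.0, 20.0]))    # compound resembling the training set
print(confidence([2.0, 100.0]))   # second descriptor far outside the training data
```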
- the present invention may be used to classify a particular compound (e.g., can it be absorbed, is it toxic, etc.).
- a compound is classified by the same method used to predict a specific ADME/Tox property, except that the analyses performed may vary slightly, and the classifications are performed to solve for a "yes/no" (e.g., 1-bit) or "high, medium, low" binning type solution.
- the model resulting from step S2-4a is used in step S2-5 to predict the selected property for newly proposed compounds in the model execution pathway.
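The binning idea can be sketched as a small Python function that maps a predicted value onto ordered labels. The cutoff values and the name `bin_prediction` are hypothetical; suitable cutoffs would depend on the property being classified.

```python
from bisect import bisect_right

def bin_prediction(value, cutoffs=(1.0, 10.0), labels=("low", "medium", "high")):
    """Return the label of the bin that `value` falls into.
    `cutoffs` must be sorted and one shorter than `labels`."""
    return labels[bisect_right(cutoffs, value)]

# Three-way binning with made-up cutoffs
print(bin_prediction(0.5))    # low
print(bin_prediction(5.0))    # medium
print(bin_prediction(50.0))   # high

# A yes/no (1-bit) classification is the same call with a single cutoff
print(bin_prediction(0.7, cutoffs=(0.5,), labels=("no", "yes")))  # yes
```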
- Model Execution Pathway (S2-1b -> S2-7)
- the model may be used to predict the ADME/Tox property of the proposed compound.
- the model execution pathway begins at step S2-1b, and proceeds directly to S2-2b where at least one proposed compound may be entered.
- the property to be predicted is selected. For example, it may be desired to predict the Caco-2 Peff of the compound, or the FDP.
- the system may allow the selection to be made from a table, radio group, pop-list, or by any other known means.
- at step S2-5, the descriptors for the proposed compound (identified in step S2-3a) are input into the model created in step S2-4a.
- the model is run and a result (e.g., a Caco-2 Peff or FDP prediction) is produced in step S2-6.
- a measure of confidence in the result may also be produced.
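The execution pathway can be summarized in a short Python sketch: a property is selected, the proposed compound's descriptors are passed to the corresponding model, and a prediction is returned together with a confidence value. The names `PropertyModel` and `run_prediction`, the registry layout, and the stand-in model are assumptions made for illustration and are not taken from the described system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

Descriptors = Sequence[float]

@dataclass
class PropertyModel:
    predict: Callable[[Descriptors], float]      # model built in the development pathway
    confidence: Callable[[Descriptors], float]   # companion confidence metric

def run_prediction(
    models: Dict[str, PropertyModel], prop: str, descriptors: Descriptors
) -> Tuple[float, float]:
    """Select the model for `prop`, run it, and return (result, confidence)."""
    if prop not in models:
        raise KeyError(f"no model has been developed for property {prop!r}")
    model = models[prop]
    return model.predict(descriptors), model.confidence(descriptors)

# Stand-in model with made-up coefficients and a made-up validity range
models = {
    "Caco-2 Peff": PropertyModel(
        predict=lambda d: 2.0 * d[0] + 1.0,
        confidence=lambda d: 1.0 if 0.0 <= d[0] <= 5.0 else 0.0,
    )
}

result, conf = run_prediction(models, "Caco-2 Peff", [1.5])
print(result, conf)
```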
- the preceding method may be implemented via numerous configurations.
- the preceding method and analysis therein may be implemented via a C++ program coupled to a data warehouse, or alternatively may be implemented via a combination of program components and databases.
- the present invention provides a less expensive, less time-consuming, and potentially more accurate means for predicting the ADME characteristics of proposed drugs. By using the present invention, many individuals and entities will be able to screen compounds more affordably for their applicability as drugs before any animal testing or other lab testing is necessary.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/332,997 US20040009536A1 (en) | 2001-07-30 | 2001-07-30 | System and method for predicting adme/tox characteristics of a compound |
CA002416787A CA2416787A1 (en) | 2000-07-28 | 2001-07-30 | System and method for predicting adme/tox characteristics of a compound |
JP2002516618A JP2004507718A (en) | 2000-07-28 | 2001-07-30 | Systems and methods for predicting ADME / TOX properties of compounds |
EP01956015A EP1358611A2 (en) | 2000-07-28 | 2001-07-30 | System and method for predicting adme/tox characteristics of a compound |
AU2001278056A AU2001278056A1 (en) | 2000-07-28 | 2001-07-30 | System and method for predicting adme/tox characteristics of a compound |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US22154800P | 2000-07-28 | 2000-07-28 | |
US60/221,548 | 2000-07-28 | ||
US26743501P | 2001-02-09 | 2001-02-09 | |
US60/267,435 | 2001-02-09 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2002010742A2 (en) | 2002-02-07 |
WO2002010742A3 (en) | 2003-08-14 |
Family
ID=26915883
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2001/023763 WO2002010742A2 (en) | 2000-07-28 | 2001-07-30 | System and method for predicting adme/tox characteristics of a compound |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP1358611A2 (en) |
JP (1) | JP2004507718A (en) |
AU (1) | AU2001278056A1 (en) |
CA (1) | CA2416787A1 (en) |
WO (1) | WO2002010742A2 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2003048720A2 (en) * | 2001-12-07 | 2003-06-12 | Bayer Technology Services Gmbh | Computer system and method for calculating adme properties |
EP1426763A1 (en) * | 2002-12-03 | 2004-06-09 | Bayer Technology Services GmbH | Computer system and method for calculating a pharmacokinetic characteristic of a chemical substance in insects |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2487454A1 (en) * | 2002-05-28 | 2003-12-04 | The Trustees Of The University Of Pennsylvania | Methods, systems, and computer program products for computational analysis and design of amphiphilic polymers |
JP5512077B2 (en) * | 2006-11-22 | 2014-06-04 | 株式会社 資生堂 | Safety evaluation method, safety evaluation system, and safety evaluation program |
EP2153227B1 (en) * | 2007-05-29 | 2011-02-16 | Pharma Diagnostics NV | Reagents and methods for the determination of pk/adme-tox characteristics of new chemical entities and of drug candidates |
JP6903226B2 (en) * | 2018-04-11 | 2021-07-14 | 富士フイルム株式会社 | Estimator, estimation method, and estimation program |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1996003644A1 (en) * | 1994-07-27 | 1996-02-08 | The Dow Chemical Company | Determining biodegradability of aspartic acid derivatives, degradable chelants, uses and compositions thereof |
WO1998009166A1 (en) * | 1996-08-29 | 1998-03-05 | Chiron Corporation | Method for analysing absorption, distribution, metabolism, excretion (adme) and pharmacokinetics properties of compound mixtures |
WO2000079268A2 (en) * | 1999-06-18 | 2000-12-28 | Biacore Ab | Method and apparatus for assaying a drug candidate to estimate a pharmacokinetic parameter associated therewith |
EP1111533A2 (en) * | 1999-12-15 | 2001-06-27 | Pfizer Products Inc. | Logistic regression trees for drug analysis |
-
2001
- 2001-07-30 CA CA002416787A patent/CA2416787A1/en not_active Abandoned
- 2001-07-30 WO PCT/US2001/023763 patent/WO2002010742A2/en not_active Application Discontinuation
- 2001-07-30 EP EP01956015A patent/EP1358611A2/en not_active Withdrawn
- 2001-07-30 JP JP2002516618A patent/JP2004507718A/en active Pending
- 2001-07-30 AU AU2001278056A patent/AU2001278056A1/en not_active Abandoned
Non-Patent Citations (3)
Title |
---|
DATABASE BIOSIS [Online] BIOSCIENCES INFORMATION SERVICE, PHILADELPHIA, PA, US; November 1999 (1999-11) KALAMPOKIS ALKIVIADIS ET AL: "A heterogeneous tube model of intestinal drug absorption based on probabilistic concepts." Database accession no. PREV200000031330 XP002241973 & PHARMACEUTICAL RESEARCH (NEW YORK), vol. 16, no. 11, November 1999 (1999-11), pages 1764-1769, ISSN: 0724-8741 * |
NORRIS D A ET AL: "Development of predictive pharmacokinetic simulation models for drug discovery." JOURNAL OF CONTROLLED RELEASE, vol. 65, no. 1-2, 1 March 2000 (2000-03-01), pages 55-62, XP004190311 ISSN: 0168-3659 * |
See also references of EP1358611A2 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2003048720A2 (en) * | 2001-12-07 | 2003-06-12 | Bayer Technology Services Gmbh | Computer system and method for calculating adme properties |
WO2003048720A3 (en) * | 2001-12-07 | 2004-04-22 | Bayer Ag | Computer system and method for calculating adme properties |
US7765092B2 (en) | 2001-12-07 | 2010-07-27 | Bayer Technology Services Gmbh | Computer system and method for calculating ADME properties |
EP1426763A1 (en) * | 2002-12-03 | 2004-06-09 | Bayer Technology Services GmbH | Computer system and method for calculating a pharmacokinetic characteristic of a chemical substance in insects |
US7539607B2 (en) | 2002-12-03 | 2009-05-26 | Bayer Technology Services Gmbh | Computer system and method of calculating a pharmacokinetic behavior of a chemical substance in insects |
Also Published As
Publication number | Publication date |
---|---|
AU2001278056A1 (en) | 2002-02-13 |
CA2416787A1 (en) | 2002-02-07 |
WO2002010742A3 (en) | 2003-08-14 |
EP1358611A2 (en) | 2003-11-05 |
JP2004507718A (en) | 2004-03-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
AL | Designated countries for regional patents | Kind code of ref document: A2 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
WWE | Wipo information: entry into national phase | Ref document number: 2416787 Country of ref document: CA |
WWE | Wipo information: entry into national phase | Ref document number: 2001956015 Country of ref document: EP |
WWE | Wipo information: entry into national phase | Ref document number: 10332997 Country of ref document: US |
REG | Reference to national code | Ref country code: DE Ref legal event code: 8642 |
WWP | Wipo information: published in national office | Ref document number: 2001956015 Country of ref document: EP |
WWW | Wipo information: withdrawn in national office | Ref document number: 2001956015 Country of ref document: EP |