EP1866824A2 - System and method for prediction of drug metabolism, toxicity, mode of action, and side effects of novel small molecule compounds
- Publication number
- EP1866824A2 (EP06748475A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- level
- chemical compound
- metabolites
- biological organism
- biological
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B35/00—ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/30—Prediction of properties of chemical compounds, compositions or mixtures
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B5/00—ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/60—In silico combinatorial chemistry
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B45/00—ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/80—Data visualisation
Definitions
- the present invention relates to systems for the prediction of drug metabolism and the toxicity of novel compounds.
- the existing drug discovery and analysis systems can be classified into two categories.
- the first type of system analyzes high throughput (HT) data, which over the last several years has resulted in a paradigm shift for life science research due to the unprecedented scale-up of several laboratory techniques. These include automated DNA sequencing, global gene expression measurements, and proteomics and metabonomics techniques.
- High throughput data provides information on gene expression, protein interactions, and small molecule metabolism. Such data are ubiquitous throughout the drug discovery pipeline, from target identification and validation, through the development and testing of drug candidates, to clinical trials.
- Software such as MetaCore™ (GeneGo, Inc., St. Joseph, Michigan), PathArt and PathwayAssist (Ariadne Genomics, Inc., Rockville, Maryland), and Pathways Analysis (Ingenuity Systems, Mountain View, California) can be used to analyze such HT data in association with gene expression and protein pathways. These software programs can be used to predict and model which protein pathways may be affected by a small molecule. The information on protein interactions is collected from published experimental data, which is then annotated and assembled into databases on the interactions. The network data analysis software that is now commercially available is robust enough for simultaneous processing of many large data files containing thousands of data points, such as whole-genome expression microarrays. The second category of drug discovery systems has existed since the 1970s.
- ADME/Tox: systemic absorption, distribution, metabolism, elimination, and toxicology
- One aspect of the invention is a system for predicting the interaction between a chemical compound and a biological organism, the system comprising: a first means for predicting one or more first level metabolites of the chemical compound in the biological organism; a second means for predicting the interaction of the chemical compound and the first level metabolites with the biological organism; and a means for visualizing the interaction of the chemical compound, the first level metabolites, and the biological organism.
- the biological organism is a human being.
- the biological organism is modeled using one or more types of biological compounds.
- the one or more types of biological compounds is comprised of one or more of the group composed of proteins, nucleic acids, or organic compounds.
- the group of proteins is comprised of one or more of the group composed of enzymes, prions, and peptides.
- the group of nucleic acids is comprised of one or more of the group composed of DNA, RNA, genes, and chromosomes.
- the system further comprises a third means for predicting one or more higher-level metabolites of the one or more first level metabolites.
- the method for the third means of prediction is the same as the method for the first means of prediction.
- the system further comprises means for choosing the one or more predicted first level or higher-level metabolites to predict the interactions of the chemical compound in the biological organism.
- the system further comprises means for inputting one of the chemical compound name, structure, or data.
- the system comprises a fourth means for prediction of the likelihood of the one or more predicted first level or higher-level metabolites to occur in the biological organism.
- the system comprises means for analyzing high-throughput data.
- the system may further comprise means for generating one or more biological pathways from the high-throughput data.
- a system comprises databases comprising at least one or more of the group composed of xenobiotics, endobiotics, ligands, drugs, drug interactions, drug binding data, biological pathways, genes, proteins, and disease links to genes.
- a system further comprises means for comparing predicted interactions of at least one of the chemical compound, first level metabolites and higher-level metabolites in the biological organism with the biological pathways generated from the high-throughput data.
- a system may further predict that the interaction of at least one of the chemical compound, first level metabolites, and higher-level metabolites with the biological organism is advantageous. Additionally, a system may predict, based on the advantageous interaction, that a chemical compound can be used as a drug in the biological organism.
- a system predicts that the interaction of at least one of the chemical compound, first level metabolites, and higher-level metabolites with the biological organism is disadvantageous.
- a system may further predict whether the disadvantageous interaction of a chemical compound is toxic or has side or deleterious effects in the biological organism.
- a system predicts the one or more first level or higher-level metabolites of the chemical compound using predetermined rules, QSAR models, or other algorithms.
- a system user may choose the predetermined rules, QSAR models, or other algorithms to predict the one or more first level or higher-level metabolites of the chemical compound.
- Another aspect of the invention is a method for predicting the interaction between a chemical compound and a biological organism, the method comprising: a first step for predicting one or more first level metabolites of the chemical compound in the biological organism; a second step for predicting the interaction of the chemical compound and the first level metabolites with the biological organism; and a third step for visualizing the interaction of the chemical compound, the first level metabolites, and the biological organism.
- the biological organism is a human being.
- the biological organism is modeled using one or more types of biological compounds.
- the one or more types of biological compounds is comprised of one or more of the group composed of proteins, nucleic acids, or organic compounds.
- the group of proteins is comprised of one or more of the group composed of enzymes, prions, and peptides.
- the group of nucleic acids is comprised of one or more of the group composed of DNA, RNA, genes, and chromosomes.
- the method further comprises a fourth step for predicting one or more higher-level metabolites of the one or more first level metabolites.
- the means of prediction used for the fourth step may be the same as the means of prediction used for the first step.
- the method further comprises a step for choosing the one or more predicted first level or higher-level metabolites to predict the interactions of the chemical compound in the biological organism.
- the method further comprises a step for inputting one of the chemical compound name, structure, or data.
- the method comprises another step for prediction of the likelihood of the one or more predicted first level or higher-level metabolites to occur in the biological organism.
- the method comprises a step for analyzing high-throughput data.
- the method may further comprise a step(s) for generating one or more biological pathways from the high-throughput data.
- a method comprises databases comprising at least one or more of the group composed of xenobiotics, endobiotics, ligands, drugs, drug interactions, drug binding data, biological pathways, genes, proteins, and disease links to genes.
- a method further comprises a step(s) for comparing predicted interactions of at least one of the chemical compound, first level metabolites and higher-level metabolites in the biological organism with the biological pathways generated from the high-throughput data.
- a method may have additional step(s) to further predict that the interaction of at least one of the chemical compound, first level metabolites, and higher-level metabolites with the biological organism is advantageous. Additionally, a method may predict, based on the advantageous interaction, that a chemical compound can be used as a drug in the biological organism.
- a method may have a step(s) to predict that the interaction of at least one of the chemical compound, first level metabolites, and higher-level metabolites with the biological organism is disadvantageous.
- a method may have an additional step(s) that may further predict whether the disadvantageous interaction of a chemical compound is toxic or has side or deleterious effects in the biological organism.
- a method predicts the one or more first level or higher-level metabolites of the chemical compound using predetermined rules, QSAR models, or other algorithms.
- a system user may choose the predetermined rules, QSAR models, or other algorithms to predict the one or more first level or higher-level metabolites of the chemical compound.
- Another aspect of the invention is a computer means for predicting the interaction between a chemical compound and a biological organism, the computer means comprising: a first means for predicting one or more first level metabolites of the chemical compound in the biological organism; a second means for predicting the interaction of the chemical compound and the first level metabolites with the biological organism; and a means for visualizing the interaction of the chemical compound, the first level metabolites, and the biological organism.
- the biological organism is a human being.
- the biological organism is modeled using one or more types of biological compounds.
- the one or more types of biological compounds is comprised of one or more of the group composed of proteins, nucleic acids, or organic compounds.
- the group of proteins is comprised of one or more of the group composed of enzymes, prions, and peptides.
- the group of nucleic acids is comprised of one or more of the group composed of DNA, RNA, genes, and chromosomes.
- the computer means further comprises a third means for predicting one or more higher-level metabolites of the one or more first level metabolites.
- the computer means for the third means of prediction is the same as the method for the first means of prediction.
- the computer means further comprises means for choosing the one or more predicted first level or higher-level metabolites to predict the interactions of the chemical compound in the biological organism.
- the computer means further comprises means for inputting one of the chemical compound name, structure, or data.
- the computer means comprises a fourth means for prediction of the likelihood of the one or more predicted first level or higher-level metabolites to occur in the biological organism.
- the computer means comprises means for analyzing high-throughput data.
- the computer means may further comprise means for generating one or more biological pathways from the high-throughput data.
- a computer means comprises databases comprising at least one or more of the group composed of xenobiotics, endobiotics, ligands, drugs, drug interactions, drug binding data, biological pathways, genes, proteins, and disease links to genes.
- a computer means further comprises means for comparing predicted interactions of at least one of the chemical compound, first level metabolites and higher-level metabolites in the biological organism with the biological pathways generated from the high-throughput data.
- a computer means may further predict that the interaction of at least one of the chemical compound, first level metabolites, and higher-level metabolites with the biological organism is advantageous. Additionally, a computer means may predict, based on the advantageous interaction, that a chemical compound can be used as a drug in the biological organism.
- a computer means predicts that the interaction of at least one of the chemical compound, first level metabolites, and higher-level metabolites with the biological organism is disadvantageous.
- a computer means may further predict whether the disadvantageous interaction of a chemical compound is toxic or has side or deleterious effects in the biological organism.
- a computer means predicts the one or more first level or higher-level metabolites of the chemical compound using predetermined rules, QSAR models, or other algorithms.
- a user may choose the predetermined rules, QSAR models, or other algorithms to predict the one or more first level or higher-level metabolites of the chemical compound.
- Figure 1 is a representation of a computer program according to embodiments of the invention
- Figure 2 is a diagram showing a high-level flow chart of a process for identifying promising drug compounds according to embodiments of the invention
- Figure 3 is a representation of a legend for metabolic map according to embodiments of the invention
- Figure 4 is a representation of a thiamine metabolism map according to embodiments of the invention
- Figure 5 is a block diagram of a general-purpose computer system upon which various embodiments of the invention may be implemented;
- Figure 6 is a block diagram of a computer data storage system with which various embodiments of the invention may be practiced;
- Figure 7 is a diagram showing a high-level flow chart of a process for predicting side effects that may be caused by a compound;
- Figure 8 is a representation of a microarray network that confirms Iressa inhibits EGFR;
- Figure 9 is a diagram showing a high-level flow chart of a process for screening of analysis data;
- Figure 10 is a more detailed flow chart of a process for identifying promising drug compounds as shown in Fig. 2 and according to embodiments of the invention.
- Figure 11 is a representation of the information in a cell and the data generated at each level.
- Existing drug discovery systems may be categorized into distinct groups.
- the first group utilizes and analyzes HT data, which elucidates chemical interactions in the body, including protein pathways and gene expression.
- the difficulty with using this type of software is the upfront biological testing that must be performed to obtain the massive quantities of data to allow a predictive capability with other molecules.
- HT data is usually generated for one pathway at a time, and therefore information on the interaction of the pathways is also often lacking.
- This type of software may be considered to be based largely upon biochemical data generated by studying one biochemical pathway at a time.
- a second type of approach is a rule-based system that uses a knowledge base of molecules with known information such as metabolism, binding etc.
- the similarity of a new molecule to one that exists in the database can be used as a method to suggest similar metabolism or activity at a biological target.
- a third type of drug discovery system uses chemical interaction rules or QSAR methodology to predict the metabolism of xenobiotic compounds in the body as well as their affinity for other proteins such as enzymes, transporters, channels and receptors. This type of approach looks at a biological organism as a chemical system with distinct and unconnected chemical reactions.
- a biological organism is defined as a living organism, such as a plant, insect, or mammal.
- the biological organism of the invention is a human being.
- a biological organism may be a subsystem of a living being, e.g. the citric acid cycle or the lymphatic system; the subsystem of interest for study using the drug discovery system may be defined by the system user.
- databases and expert systems that contain combined data or rules from many different mammalian species may be less useful for predicting human metabolism alone.
- the data and rules for each species should be separate. Consequently, the programs using this information in combined databases tend to predict all the metabolic possibilities for an exogenous molecule, essentially creating an 'average' mammal that may be dissimilar to the human situation.
- Effective drug discovery systems for the complexity of biological organisms require a system-wide approach to data analysis, which can be defined as the integration of "OMICs" data with computational methods or chemical modeling.
- a system-wide approach uses the relationships of all elements rather than approaching them separately. This approach can be taken from the “top down” (using a conceptual framework to integrate data) or from the “bottom up” (combining individually modeled biochemical processes).
- a system-wide approach states that the identification of the “parts list” of all the genes and proteins is insufficient to understand the whole. Rather, it is the assembly of these parts (the general schema, the modules, and elements) and the dynamics of changes in response to stimuli that is truly the key to understanding a biological organism.
- the assembly of “cellular machinery” can be called the “interactome”, the network of interconnected signaling, regulatory, and biochemical networks with proteins as the nodes and physical protein-protein interactions as the edges.
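The node-and-edge description of the interactome above can be sketched as a simple undirected graph. This is a minimal illustration only: the function name `build_interactome` and the placeholder protein labels are assumptions made for the example and do not come from the patent.

```python
from collections import defaultdict


def build_interactome(interactions):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs.

    Proteins are the nodes; each physical protein-protein interaction is an edge.
    """
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


# Placeholder interactions (hypothetical labels, not real data).
edges = [
    ("protein_A", "protein_B"),
    ("protein_B", "protein_C"),
    ("protein_C", "protein_D"),
]
interactome = build_interactome(edges)
neighbors = sorted(interactome["protein_B"])
print(neighbors)
```

A set-based adjacency map keeps neighbor lookups cheap and ignores duplicate edges, which is convenient when interaction records are merged from several sources.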
- ADME/Tox and particularly drug metabolism using a system-wide approach may improve understanding and ultimately predictions associated with a biological organism.
- the perturbing effect of a molecule on the complete biological organism can be observed either experimentally (using high-throughput screening against many proteins) or theoretically (using many computational models) and across all metabolic and signaling pathways. In this way an understanding of the effects of binding to multiple proteins simultaneously can be provided.
- the iterative approach based on multiple cycles of data generation and modeling can also create dynamic hypotheses which are advantageous compared with purely static models.
- This approach also requires the collection of high-throughput and high content screening data, including global gene expression, protein content, and metabolic profiles for the same samples as well as individual genetic, clinical, and phenotypic data.
- systems for predicting the effect of a molecule on a biological organism predict the interaction of the molecule with the interactome using basic rules, QSAR modeling, or analysis of high-throughput data. Furthermore, the prediction of the system using basic rules, QSAR modeling, or analysis of high-throughput data may be improved through comparison with chemically or structurally similar xenobiotics. Information on xenobiotics may be contained within a database.
- the drug discovery system of the invention may be used to predict any advantageous or disadvantageous interaction of a specific molecule with a biological organism, including the effectiveness or non-effectiveness of a molecule as a drug; the effect of a specific molecule on a biological organism, pathway or protein; the side effects that may be expected for a specific molecule in a biological organism; or the mode of action of a specific molecule to cause a known effect in a biological organism.
- the inventive system may perform any of the above functions and will be generically called a drug discovery system.
- Figure 7 is a high-level representation of a drug discovery system that predicts advantageous and disadvantageous effects of a compound.
- a compound and its predicted human metabolites are compared by chemical similarity and substructure search against the chemical content of a built-in database.
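One common way to compare compounds by chemical similarity is the Tanimoto coefficient over fingerprint bits. The patent does not say which similarity measure is used, so the sketch below is an assumption; the fingerprints are hypothetical sets of integer bit positions, and `tanimoto` and `similar_compounds` are illustrative names.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similar_compounds(query_fp, database, threshold=0.5):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in database.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints; a real system would derive them from structures.
database = {
    "db_compound_1": {1, 2, 3, 4},
    "db_compound_2": {3, 4, 5, 6, 7, 8},
    "db_compound_3": {9, 10},
}
query = {1, 2, 3, 5}
hits = similar_compounds(query, database, threshold=0.3)
print(hits)
```

The same function could be applied to each predicted metabolite in turn, so that the parent compound and its metabolites are all matched against the database content.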
- the compounds of similar structure from the database are connected with different functional categories in the database, such as cell processes, biological networks, toxicity maps, and disease networks. P-values are then calculated for the distribution of such functional categories, and the categories are cross-referenced. Based on p-values and other statistical criteria, the highest scored potential indications (disease areas) and toxicities are calculated and then presented using simple color visualization modes.
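The text does not specify which statistical test produces the p-values. A hypergeometric over-representation test is one common choice for this kind of category analysis, and the sketch below assumes it. The function name, the category names, and the counts are all hypothetical.

```python
from math import comb


def enrichment_p_value(population, category_size, sample_size, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population:    total number of annotated compounds in the database
    category_size: compounds in the database linked to the category
    sample_size:   similar compounds retrieved for the query
    hits:          retrieved compounds that are linked to the category
    """
    total = comb(population, sample_size)
    upper = min(category_size, sample_size)
    tail = sum(
        comb(category_size, i) * comb(population - category_size, sample_size - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: (category_size, hits) per functional category.
categories = {"toxicity_map_X": (3, 3), "disease_network_Y": (5, 1)}
ranked = sorted(
    (enrichment_p_value(10, size, 3, hits), name)
    for name, (size, hits) in categories.items()
)
print(ranked)
```

Sorting by ascending p-value puts the most over-represented categories first, which is one way the highest scored indications and toxicities could be selected for display.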
- Figure 7 shows an example of a workflow path in a drug discovery system.
- the example is merely an illustrative embodiment of a flow path for a drug discovery system.
- an illustrative embodiment is not intended to limit the scope of the invention, as any of numerous other implementations of a drug discovery system, for example, variations on steps taken, are possible and are intended to fall within the scope of the invention.
- additional steps may be used or one or more steps may be removed from the example. Additionally, steps may be reversed or performed in a different order. None of the claims set forth below are intended to be limited to any particular implementation of the database structure unless such claim includes a limitation explicitly reciting a particular implementation.
- the metabolite(s) of the molecule of interest are also predicted.
- the drug discovery system may also predict the one or more metabolites of the molecule by using predetermined rules for metabolic pathways. These predicted metabolites may then be further similarly processed by the system to determine their metabolites and so on. The metabolites may then be visualized associated with the target proteins that undergo the chemical reaction within the stored biological pathways. This presents predicted metabolites in the context of the empirical data.
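The repeated application of metabolic rules to each new generation of metabolites can be sketched as a level-by-level expansion. This is a minimal sketch under stated assumptions: molecules are represented as plain string labels, and the two toy rules are hypothetical stand-ins for the predetermined rules, not real chemistry.

```python
def predict_metabolite_levels(parent, rules, max_levels=2):
    """Apply every rule to each molecule, one metabolite level at a time.

    Returns a list of levels; levels[0] holds first-level metabolites,
    levels[1] holds their metabolites, and so on.
    """
    levels = []
    seen = {parent}
    frontier = [parent]
    for _ in range(max_levels):
        next_frontier = []
        for molecule in frontier:
            for rule in rules:
                for product in rule(molecule):
                    if product not in seen:  # skip molecules already produced
                        seen.add(product)
                        next_frontier.append(product)
        if not next_frontier:
            break
        levels.append(next_frontier)
        frontier = next_frontier
    return levels


# Toy rules on string labels (hypothetical; real rules act on structures).
def toy_hydroxylation(molecule):
    return [molecule + "+OH"] if molecule.count("+OH") < 1 else []


def toy_glucuronidation(molecule):
    return [molecule + "+Gluc"] if molecule.endswith("+OH") else []


levels = predict_metabolite_levels("X", [toy_hydroxylation, toy_glucuronidation])
print(levels)
```

Tracking previously seen products keeps the expansion from looping when two rules regenerate the same molecule, and `max_levels` bounds how many metabolite generations are explored.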
- the system may then predict the interaction of each of the predicted metabolites with the interactome.
- Figure 1 represents a possible database configuration.
- the following example is merely an illustrative embodiment of the database structure. It should be appreciated that an illustrative embodiment is not intended to limit the scope of the invention, as any of numerous other implementations of a database structure for a drug discovery system, for example, variations of database content, are possible and are intended to fall within the scope of the invention. None of the claims set forth below are intended to be limited to any particular implementation of the database structure unless such claim includes a limitation explicitly reciting a particular implementation.
- Figure 1 shows three types of information (or elements) required for the database structure. The three types are Component, Transformation, and Effect.
- Component may be defined as the functional groups of molecules in biological organisms and is related to a molecular entity, localization, cell/tissue, and organism.
- a Component represents biological molecules within their biological context.
- the molecular entity may be treated in a broader sense than just being a specific chemical compound.
- a molecular entity may also be a group of molecules (e.g. a protein family or class of chemical compounds) or a molecular complex. This is particularly useful for representing the cellular processes, when the exact chemical composition or particular isoform of a protein participating in a pathway is unknown or ambiguous. Transformation is defined as a biochemical reaction, transport, transcription and translation, or any biological process with a primary function being to change the amount of a Component (e.g., through a reaction) that is considered in its particular environment as linked to a sub-cellular compartment, tissue, and organism. During the Transformation of a Component, one or more other molecules (i.e., metabolites) may be generated.
- Effect is defined as the influence that a Component(s) exert on either a Transformation(s) or another Effect(s).
- Each Effect has an agent (Component) and a target (Transformation, another Effect or Functional Block).
- Effect is the description of biological activity, whether or not its exact mechanism is known.
- the three types of elements i.e., Components, Transformations, and Effects
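The three element types described above can be sketched as simple record types. The field names below are assumptions chosen to mirror the prose (molecular entity, localization, tissue, organism, agent, target); they are not taken from Figure 1, and the example values are placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Component:
    """A molecular entity in its biological context."""
    molecular_entity: str            # a compound, a protein family, or a complex
    localization: Optional[str] = None
    tissue: Optional[str] = None
    organism: Optional[str] = None


@dataclass(frozen=True)
class Transformation:
    """A process whose primary function is to change the amount of a Component."""
    name: str
    inputs: Tuple[Component, ...]
    outputs: Tuple[Component, ...]


@dataclass(frozen=True)
class Effect:
    """Influence of an agent Component on a Transformation or another Effect."""
    agent: Component
    target: Union[Transformation, "Effect"]
    description: str


# Placeholder example (hypothetical names).
substrate = Component("substrate_S", "cytoplasm", "liver", "human")
product = Component("metabolite_M", "cytoplasm", "liver", "human")
enzyme = Component("enzyme_E", "cytoplasm", "liver", "human")
reaction = Transformation("S_to_M", (substrate,), (product,))
catalysis = Effect(enzyme, reaction, "catalyzes")
print(catalysis.target.name)
```

Because an Effect may target another Effect, regulation of a regulator can be expressed without adding a new type.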
- database examples include BIND, DIP, the Human Protein Reference Database (HPRD), MetaCore, MINT, HomoMINT, MIPS, PathArt, Pathways Analysis, BioCarta, Gene Ontology, GenMAPP, and KEGG. Additionally, data mining packages may be utilized.
- the drug discovery system may use the data as provided by the databases, or additional manual or automated processing may be performed to further parse the data.
- Table 1 lists examples of known enzymes and the number of known molecules that are associated with each enzyme.
- An associated molecule may be an input chemical, a metabolite, an enzyme regulator, or any other molecule that has an action on or is acted upon by the enzyme.
- a database associated with the summary in Table 1 may also include the actual names, structure, and/or properties of the associated molecules with each enzyme.
- Functional Blocks which are functional units, be it a particular category of metabolism or any other functional process.
- Functional Blocks link together Components, Effects and Transformations that are functionally related.
- Functional Blocks are hierarchical as they may contain other Functional Blocks as elements. Additionally, every element may be a part of more than one block. Therefore, Functional Blocks are linked to each other by shared elements. Assembling different elements within Functional Blocks enables rapid search of functional links and function-centered analysis of expression and other high-throughput molecular data.
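The hierarchy and the linking of Functional Blocks through shared elements can be sketched with nested sets. The block names and element labels are hypothetical, and the sketch assumes that no block contains itself, directly or indirectly.

```python
def flatten(block, blocks):
    """Collect all non-block elements of a block, descending into nested blocks."""
    elements = set()
    for item in blocks[block]:
        if item in blocks:          # the item is itself a Functional Block
            elements |= flatten(item, blocks)
        else:
            elements.add(item)
    return elements


def shared_elements(block_a, block_b, blocks):
    """Elements through which two Functional Blocks are linked."""
    return flatten(block_a, blocks) & flatten(block_b, blocks)


# Hypothetical blocks: block_A contains block_C as a nested element.
blocks = {
    "block_A": {"component_1", "transformation_1", "block_C"},
    "block_B": {"component_2", "effect_1"},
    "block_C": {"effect_1", "component_3"},
}
link = shared_elements("block_A", "block_B", blocks)
print(link)
```

Here block_A and block_B are linked only through an element of the nested block_C, which is the kind of indirect functional link the hierarchy is meant to expose.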
- Functional blocks may be provided by the databases providing information on the three elements but preferably are generated specifically for or by the drug discovery software.
- Figure 2 shows an example of a workflow path (item 100) in a drug discovery system
- Figure 10 is a more detailed example of the same workflow path.
- the following example is merely an illustrative embodiment of a flow path for a drug discovery system. It should be appreciated that an illustrative embodiment is not intended to limit the scope of the invention, as any of numerous other implementations of a drug discovery system, for example, variations on steps taken, are possible and are intended to fall within the scope of the invention.
- additional steps may be used or one or more steps may be removed from the example. Additionally, steps may be reversed or performed in a different order.
- a scientist inputs the chemical that he or she wants to process through the drug discovery system using a user interface at step 102.
- the chemical may be input as a name or a structure.
- the name may be input by text using any chemical identification system, including CAS number, standard chemical nomenclature, common chemical nomenclature, or a custom nomenclature system. If only a chemical identifier is entered into the system, a two-dimensional or three-dimensional chemical structure may be determined using predetermined rules (that may be able to be revised by the scientist) or by accessing a database(s) of chemical structures.
- the chemical may also be input using a chemical structure format including sdf and mol files or through a structure drawing program such as ChemDraw (CambridgeSoft Corporation, Cambridge, Massachusetts). If only the chemical structure is input, the drug discovery software may determine or assign a chemical identifier to the molecule.
- a scientist may interact with the drug discovery system using a wireless or landline telephone with a display, a handheld device, a kiosk, or a computer.
- a scientist may operate a computer system that has an Internet-enabled interface (e.g., using Macromedia Flash or Java) and the computer system may display streamed information within that interface.
- any interface may be used to interact with the drug discovery system, and the invention is not limited to any particular interface.
- it may be necessary to download information or executable subprograms prior to interacting with the drug discovery system while another medium may allow continuous interaction with the drug discovery system without such downloads.
- a "database" is an arrangement of data defined by computer-readable signals.
- a "user interface" or "UI" is an interface between a human user and a computer that enables communication between a user and a computer. Examples of UI types and components include a graphical user interface (GUI), a display screen, a mouse, a keyboard, a keypad, a track ball, a microphone (e.g., to be used in conjunction with a voice recognition system), a speaker, a touch screen, and a specialized controller (e.g., a joystick).
- the input molecule is then processed through the drug discovery system using predetermined rules or QSAR models to predict the first-level metabolites of the chemical. Examples of some of the predetermined rules or QSAR models are given in Table 2.
- the drug discovery system may also provide a statistical probability that each predicted metabolite will occur.
- there may be a number of readily interpretable molecular descriptors that are calculated for the input molecule such as the number of rotatable bonds, hydrogen bond acceptors and hydrogen bond donors.
- the input molecule may also be processed through rules developed specifically to predict likely reactive metabolites (such as quinones, aromatic and hydroxyl amines, acyl glucuronides, acyl halides, epoxides, thiophenes, furans, phenoxyl radicals, phenols and aniline radicals) and readily highlight these for the user.
- the predicted QSAR values can be filtered with user defined cutoff values; these values may also be used to prioritize metabolites.
- the drug discovery system or the scientist may then determine which metabolites to continue to process through the drug discovery system. The determination of which metabolites to continue processing may be based upon the statistical probability mentioned above, on instinct, on experimental data, or on any other criteria.
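The filtering and prioritization described above can be sketched as follows. This is a minimal illustration, assuming each predicted metabolite carries a probability of occurrence and a single QSAR-predicted value; the class `PredictedMetabolite`, the function `prioritize`, the cutoff parameters, and all numbers are hypothetical placeholders rather than anything specified in the source.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PredictedMetabolite:
    name: str
    probability: float  # assumed: predicted probability that the metabolite occurs
    qsar_score: float   # assumed: one QSAR-predicted value for the metabolite


def prioritize(metabolites: Iterable[PredictedMetabolite],
               min_probability: float = 0.5,
               min_qsar_score: float = 0.0) -> List[PredictedMetabolite]:
    """Drop metabolites below the user-defined cutoffs, then rank the rest."""
    kept = [m for m in metabolites
            if m.probability >= min_probability and m.qsar_score >= min_qsar_score]
    # Highest probability first; QSAR score breaks ties.
    return sorted(kept, key=lambda m: (m.probability, m.qsar_score), reverse=True)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    candidates = [
        PredictedMetabolite("M1", 0.90, 0.2),
        PredictedMetabolite("M2", 0.40, 0.9),
        PredictedMetabolite("M3", 0.75, 0.6),
    ]
    for m in prioritize(candidates, min_probability=0.5):
        print(m.name, m.probability, m.qsar_score)
```

The metabolites that survive the cutoffs would be the ones carried forward to the next iteration; a scientist could equally override the ranking by hand, as the text allows.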
- Steps 104 and 106 may occur as many times as desired; more iterations of steps 104 and 106 predict higher and higher-level metabolites.

Table 2. Examples of Metabolite Transformation Rules
Rule class | Example transformations |
---|---|
Glutathione conjugation | Glutathione S-transfer - halogen, Glutathione S-transfer to alkenes, Glutathione transfer to aldehyde, Glutathione replacement of sulfate, Glutathione S-transfer to quinones, Glutathione S-transfer to benzyl |
Methyl transferases | O-methyl transfer, N-methyl transfer, S-methyl transfer |
Cysteine conjugation | Cysteine S-transfer to epoxide, Cysteine S-transfer - halogen, Cysteine S-transfer to alkenes, Cysteine transfer to aldehyde, Cysteine replacement of sulfate, Cysteine S-transfer to benzyl, Cysteine transfer to Cys |
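The repeated application of transformation rules to obtain first-level, second-level, and higher-level metabolites can be sketched as a level-by-level expansion. The example below is a toy under loud assumptions: a molecule is reduced to a set of functional-group tags, and each rule simply swaps one tag for another. A real system would operate on chemical structures; the names `Rule`, `apply_rules`, and `predict_levels` and the tag strings are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Sequence

# Toy stand-in for a chemical structure: a set of functional-group tags.
Molecule = FrozenSet[str]


class Rule(NamedTuple):
    name: str      # rule label, e.g. taken from Table 2
    requires: str  # tag that must be present for the rule to fire
    produces: str  # tag that replaces the required tag in the product


def apply_rules(molecule: Molecule, rules: Sequence[Rule]) -> List[Molecule]:
    """Return every product obtained by firing one applicable rule once."""
    return [(molecule - {r.requires}) | {r.produces}
            for r in rules if r.requires in molecule]


def predict_levels(parent: Molecule, rules: Sequence[Rule],
                   levels: int) -> List[List[Molecule]]:
    """Expand the parent level by level, skipping products already seen."""
    seen = {parent}
    current = [parent]
    result: List[List[Molecule]] = []
    for _ in range(levels):
        next_level: List[Molecule] = []
        for molecule in current:
            for product in apply_rules(molecule, rules):
                if product not in seen:
                    seen.add(product)
                    next_level.append(product)
        result.append(next_level)
        current = next_level
    return result


if __name__ == "__main__":
    rules = [
        Rule("Glutathione S-transfer - halogen", "halogen", "glutathione_conjugate"),
        Rule("O-methyl transfer", "phenol", "O_methyl_ether"),
    ]
    parent = frozenset({"halogen", "phenol"})
    for depth, products in enumerate(predict_levels(parent, rules, 2), start=1):
        print(f"level {depth}:", [sorted(p) for p in products])
```

Tracking already-seen products keeps the expansion from revisiting the same metabolite when two rule orders converge on it; pruning by probability between levels, as described earlier, would slot in just before `current = next_level`.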
- the drug discovery system processes the initial chemical and all the predicted metabolites left in the model and predicts the total effect of the initial chemical on the biological organism (e.g., Homo sapiens) being studied.
- the total effect is determined after considering protein pathways, gene regulation, enzyme regulation or any biological interaction.
- the drug discovery system may automatically determine which interactions to consider or a scientist may make choices.
- the drug discovery system may also graphically represent the effect of the chemical and its metabolites on the biological organism. Due to the complexity of biological organisms, a legend may be necessary to differentiate the various elements of the map; Fig. 3 is a representation of a legend. The legend may use any combination of text, color, shape, and overlays to represent any or all elements in the biological map. With such a legend, the interaction map becomes easier to understand as represented in Fig. 4 for thiamine metabolism. The interaction map shown in Fig.
- the drug discovery system may also provide the input chemical, its predicted metabolites, statistics, and any other information as text, a worksheet (e.g., for Microsoft Excel), or as a new database to the user.
- the drug discovery system may also process OMICs or high throughput data.
- HT data is processed and in step 110 developed into a new biological pathway after analysis.
- the new biological pathway may also be graphically or textually represented.
- the HT data may be from actual biological studies of the chemical of interest in the specified or another biological organism or may be of any related or non-related chemical.
- step 112 can also be used to compare the predicted metabolic pathway for the molecule input in step 102 with the actual data generated in step 108.
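One simple way to picture the comparison of predicted and measured results is a set overlap. The sketch below assumes both the prediction and the experimental data have been reduced to sets of metabolite (or pathway node) identifiers; the function `compare_predictions` and the identifiers are hypothetical, and the source does not specify how the comparison is actually scored.

```python
from typing import Dict, Set


def compare_predictions(predicted: Set[str], observed: Set[str]) -> Dict[str, object]:
    """Summarize agreement between predicted and experimentally observed items."""
    confirmed = predicted & observed
    return {
        "confirmed": sorted(confirmed),
        "predicted_only": sorted(predicted - observed),
        "observed_only": sorted(observed - predicted),
        # Fraction of predictions supported by the data.
        "precision": len(confirmed) / len(predicted) if predicted else 0.0,
        # Fraction of observations that were predicted.
        "recall": len(confirmed) / len(observed) if observed else 0.0,
    }


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    summary = compare_predictions({"M1", "M2", "M3"}, {"M2", "M3", "M4", "M5"})
    for key, value in summary.items():
        print(key, value)
```

Items that appear only in the observed set are candidates for new rules or pathways, while items that appear only in the predicted set may indicate rules that fire too readily.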
- Figure 9 is a high level representation of a system for high-throughput screening for functional analysis of compound screening data.
- the data from both high-content screening (HCS) and high-throughput screening (HTS) assays may be analyzed using a chemical similarity search. Once similar chemicals are identified, two information sets are generated. The first dataset is based upon the known networks and information of the similar chemicals; such information may include maps of biological pathways, functional processes, diseases, toxicities, and biological networks. The second dataset is based upon the predicted metabolites of the compound being screened. All the information from both datasets is then analyzed to refine the predicted biological pathways, functional processes, diseases, toxicities, and biological networks of the screened compound.
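A chemical similarity search of the kind mentioned above is often implemented by comparing structural fingerprints. The sketch below uses the Tanimoto coefficient on fingerprints represented as sets of integer feature indices. The choice of Tanimoto, the function names `tanimoto` and `most_similar`, the threshold, and the fingerprints are all assumptions for illustration; the source does not name a similarity metric.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared features divided by total distinct features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar(query: Set[int], library: Dict[str, Set[int]],
                 threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Return library entries scoring at or above the threshold, best first."""
    hits = [(name, tanimoto(query, fingerprint))
            for name, fingerprint in library.items()]
    hits = [(name, score) for name, score in hits if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Placeholder fingerprints for illustration only.
    library = {
        "compound_A": {1, 2, 3, 4},
        "compound_B": {1, 2, 5, 6},
        "compound_C": {1, 2, 3, 7},
    }
    for name, score in most_similar({1, 2, 3, 4}, library, threshold=0.5):
        print(name, round(score, 2))
```

The compounds returned by such a search would supply the first information set (known networks of similar chemicals), which is then combined with the metabolite predictions for the screened compound.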
- Figure 11 represents the different data types that are generated in a cell, the interaction of the various data types, and the high-throughput technique that may be used to obtain the data.
- Various high-throughput or high content data can be linked to tables of human protein interactions.
- Nine levels of regulation of protein activity in a human cell can be summarized: 1) gene transcription, 2) mRNA processing and editing, 3) mRNA transport from nucleus, 4) mRNA stabilization, 5) protein translation, 6) protein transport, 7) folding and protein stabilization, 8) allosteric modulation, and 9) covalent modification.
- the types of data generated corresponding to these levels are also shown.
- a predicted biological network for Iressa, an anti-cancer drug, is represented in Figure 8. The biological network shows both predicted metabolites and the mode of action. This assessment is produced by analysis of microarray expression data in a mouse model and also by using the metabolic rules in the drug discovery system of this invention.
- the drug is predicted to inhibit EGFR as its primary target; microarray data confirms this.
- CYP2D6 and via nuclear hormone receptor PXR to GCR receptor and the upstream signaling cascades — all on one network.
- the computer or the scientist may then select the most promising chemical compounds that affect or regulate the biological organism as desired. With the appropriate parameters on the biological map, promising chemical compounds may also be chosen that create the fewest side effects or deleterious effects.
- a drug discovery system may have any combination (including a few, some, many, or all) of the following features.
- a set of databases, possibly having an Oracle-based architecture:
  - At least one or more databases;
  - At least one database of proprietary, manually curated mammalian data;
  - Greater than 10,000 compounds, preferably greater than 15,000 compounds, more preferably greater than 20,000 compounds;
  - Greater than 1,000, preferably greater than 2,000, more preferably greater than 3,000 marketed drugs, possibly with associated binding information;
  - Greater than 1,000, preferably greater than 2,000, more preferably greater than 4,000 xenobiotics;
  - Greater than 1,000, preferably greater than 2,500, more preferably greater than 5,000 endogenous metabolites;
  - Greater than 1,000, preferably greater than 2,000, more preferably greater than 3,000 binding constants for CYPs, UGTs, SULTs, and other important human enzymes;
  - Greater than 1,000, preferably greater than 5,000, more preferably greater than 10,000 xenobiotic reactions;
  - Greater than 10,000
- At least one model that determines standard deviations;
- Chemical structures may be input using sdf and mol files;
- Large files may be processed in batch mode;
- PipelinePilot (Scitegic, San Diego, CA) may be used with MetaDrug modules to create work flows for batch processing and integration with other informatics software;
- ChemDraw or other plug-in or molecular drawing device may be used for structure visualization;
- Parsers may be used for genomics, proteomics, metabolomics or other experimental or theoretical data;
- Concurrent visualization of genomics, proteomics, metabolomics or other experimental or theoretical data on objects from database;
- Networks that interact with desired chemical compounds or metabolites may be built as needed;
- At least two algorithms for network building;
- Substructure, structure, and similarity searching of databases, that may use the Accord plug-in;
- Grid or other types of visualization of multiple metabolites and predicted or empirical data points;
- Export of predicted metabolites as well as predicted scores and properties as an sdf,
- the drug discovery system and components thereof, such as the databases and software tools, may be implemented using software (e.g., C, C#, C++, Java, or a combination thereof), hardware (e.g., one or more application-specific integrated circuits), firmware (e.g., electronically programmed memory), or any combination thereof.
- One or more of the components of the drug discovery system may reside on a single computer system (e.g., the data mining subsystem), or one or more components may reside on separate, discrete computer systems. Further, each component may be distributed across multiple computer systems, and one or more of the computer systems may be interconnected.
- each of the components may reside in one or more locations on the computer system.
- different portions of the components of the drug discovery system may reside in different areas of memory (e.g., RAM, ROM, disk, etc.) on the computer system.
- Each of such one or more computer systems may include, among other components, a plurality of known components such as one or more processors, a memory system, a disk storage system, one or more network interfaces, and one or more busses or other internal communication links interconnecting the various components.
- the drug discovery system may be implemented on a computer system described below in relation to Figs. 5 and 6.
- the drug discovery system described above is merely an illustrative embodiment of a drug discovery system. Such an illustrative embodiment is not intended to limit the scope of the invention, as any of numerous other implementations of a drug discovery system, for example, variations of the databases contained within, are possible and are intended to fall within the scope of the invention. None of the claims set forth below are intended to be limited to any particular implementation of the drug discovery system unless such claim includes a limitation explicitly reciting a particular implementation.
- Various embodiments according to the invention may be implemented on one or more computer systems. These computer systems may be, for example, general-purpose computers such as those based on Intel PENTIUM-type processor, Motorola PowerPC, Sun UltraSPARC, Hewlett-Packard PA-RISC processors, or any other type of processor. It should be appreciated that one or more of any type computer system may be used to partially or fully automate play of the described game according to various embodiments of the invention. Further, the software design system may be located on a single computer or may be distributed among a plurality of computers attached by a communications network. A general-purpose computer system according to one embodiment of the invention is configured to perform any of the described drug discovery system functions. It should be appreciated that the system may perform other functions, including network communication, and the invention is not limited to having any particular function or set of functions.
- the computer system 400 may include a processor 403 connected to one or more memory devices 404, such as a disk drive, memory, or other device for storing data.
- Memory 404 is typically used for storing programs and data during operation of the computer system 400.
- Components of computer system 400 may be coupled by an interconnection mechanism 405, which may include one or more busses (e.g., between components that are integrated within a same machine) and/or a network (e.g., between components that reside on separate discrete machines).
- the interconnection mechanism 405 enables communications (e.g., data, instructions) to be exchanged between system components of system 400.
- Computer system 400 also includes one or more input devices 402, for example, a keyboard, mouse, trackball, microphone, touch screen, and one or more output devices 401, for example, a printing device, display screen, speaker.
- computer system 400 may contain one or more interfaces (not shown) that connect computer system 400 to a communication network (in addition or as an alternative to the interconnection mechanism 405).
- the storage system 406, shown in greater detail in Fig. 6, typically includes a computer readable and writeable nonvolatile recording medium 501 in which signals are stored that define a program to be executed by the processor or information stored on or in the medium 501 to be processed by the program.
- the medium may, for example, be a disk or flash memory.
- the processor causes data to be read from the nonvolatile recording medium 501 into another memory 502 that allows for faster access to the information by the processor than does the medium 501.
- This memory 502 is typically a volatile, random access memory such as a dynamic random access memory (DRAM) or static memory (SRAM). It may be located in storage system 406, as shown, or in memory system 404, not shown.
- the processor 403 generally manipulates the data within the integrated circuit memory 404, 502 and then copies the data to the medium 501 after processing is completed.
- a variety of mechanisms are known for managing data movement between the medium 501 and the integrated circuit memory element 404, 502, and the invention is not limited thereto. The invention is not limited to a particular memory system 404 or storage system 406.
- the computer system may include specially-programmed, special-purpose hardware, for example, an application-specific integrated circuit (ASIC).
- aspects of the invention may be implemented in software, hardware or firmware, or any combination thereof. Further, such methods, acts, systems, system elements and components thereof may be implemented as part of the computer system described above or as an independent component.
- although computer system 400 is shown by way of example as one type of computer system upon which various aspects of the invention may be practiced, it should be appreciated that aspects of the invention are not limited to being implemented on the computer system as shown in Fig. 5. Various aspects of the invention may be practiced on one or more computers having a different architecture or components than that shown in Fig. 5.
- Computer system 400 may be a general-purpose computer system that is programmable using a high-level computer programming language. Computer system 400 may be also implemented using specially programmed, special purpose hardware.
- processor 403 is typically a commercially available processor such as the well-known Pentium class processor available from the Intel Corporation. Many other processors are available.
- processor usually executes an operating system which may be, for example, the Windows 95, Windows 98, Windows NT, Windows 2000 (Windows ME) or Windows XP operating systems available from the Microsoft Corporation, MAC OS System X available from Apple Computer, the Solaris Operating System available from Sun Microsystems, or UNIX available from various sources. Many other operating systems may be used.
- the processor and operating system together define a computer platform for which application programs in high-level programming languages are written. It should be understood that the invention is not limited to a particular computer system platform, processor, operating system, or network. Also, it should be apparent to those skilled in the art that the present invention is not limited to a specific programming language or computer system. Further, it should be appreciated that other appropriate programming languages and other appropriate computer systems could also be used.
- One or more portions of the computer system may be distributed across one or more computer systems (not shown) coupled to a communications network. These computer systems also may be general-purpose computer systems. For example, various aspects of the invention may be distributed among one or more computer systems configured to provide a service (e.g., servers) to one or more client computers, or to perform an overall task as part of a distributed system. For example, various aspects of the invention may be performed on a client-server system that includes components distributed among one or more server systems that perform various functions according to various embodiments of the invention.
- These components may be executable, intermediate (e.g., IL), or interpreted (e.g., Java) code that communicates over a communication network (e.g., the Internet) using a communication protocol (e.g., TCP/IP).
- Various embodiments of the present invention may be programmed using an object-oriented programming language, such as SmallTalk, Java, C++, Ada, or C# (C-Sharp). Other object-oriented programming languages may also be used. Alternatively, functional, scripting, and/or logical programming languages may be used.
- Various aspects of the invention may be implemented in a non-programmed environment (e.g., documents created in HTML, XML or other format that, when viewed in a window of a browser program, render aspects of a graphical-user interface (GUI) or perform other functions).
- Various aspects of the invention may be implemented as programmed or non-programmed elements, or any combination thereof.
- the means are not intended to be limited to the means disclosed herein for performing the recited function, but are intended to cover in scope any means, known now or later developed, for performing the recited function.
- the terms are not intended to be limited to the means disclosed herein for performing the recited function, but are intended to cover in scope any means, known now or later developed, for performing the recited function.
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Biophysics (AREA)
- Chemical & Material Sciences (AREA)
- Biotechnology (AREA)
- Evolutionary Biology (AREA)
- Medical Informatics (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Physiology (AREA)
- Genetics & Genomics (AREA)
- Analytical Chemistry (AREA)
- Computing Systems (AREA)
- Crystallography & Structural Chemistry (AREA)
- Library & Information Science (AREA)
- Biochemistry (AREA)
- Medicinal Chemistry (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US66269905P | 2005-03-17 | 2005-03-17 | |
PCT/US2006/010053 WO2006099624A2 (en) | 2005-03-17 | 2006-03-17 | Chemical interaction with metabolites in organism |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1866824A2 true EP1866824A2 (en) | 2007-12-19 |
EP1866824A4 EP1866824A4 (en) | 2009-08-05 |
Family
ID=36992486
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP06748475A Withdrawn EP1866824A4 (en) | 2005-03-17 | 2006-03-17 | System and method for prediction of drug metabolism, toxicity, mode of action, and side effects of novel small molecule compounds |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP1866824A4 (en) |
GB (1) | GB2439675A (en) |
WO (1) | WO2006099624A2 (en) |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6365131B1 (en) * | 1997-10-31 | 2002-04-02 | J. B. Chemicals & Pharmaceuticals Ltd. | Pharmaceutical dental formulation for topical application of metronidazole benzoate, chlorhexidine gluconate and local anesthetic |
-
2006
- 2006-03-17 GB GB0719562A patent/GB2439675A/en not_active Withdrawn
- 2006-03-17 WO PCT/US2006/010053 patent/WO2006099624A2/en active Application Filing
- 2006-03-17 EP EP06748475A patent/EP1866824A4/en not_active Withdrawn
Non-Patent Citations (9)
Title |
---|
BUGRIM ANDREJ ET AL: "Early prediction of drug metabolism and toxicity: Systems biology approach and modeling." DRUG DISCOVERY TODAY, vol. 9, no. 3, 1 February 2004 (2004-02-01), pages 127-135, XP002532445 ISSN: 1359-6446 * |
EKINS S ET AL: "Techniques: Application of systems biology to absorption, distribution, metabolism, excretion and toxicity" TRENDS IN PHARMACOLOGICAL SCIENCES, ELSEVIER, HAYWARTH, GB, vol. 26, no. 4, 3 March 2005 (2005-03-03), pages 202-209, XP004829254 ISSN: 0165-6147 * |
EKINS S: "In silico approaches to predicting drug metabolism, toxicology and beyond." BIOCHEMICAL SOCIETY TRANSACTIONS, vol. 31, no. 3, June 2003 (2003-06), pages 611-614, XP002532447 ISSN: 0300-5127 * |
EKINS SEAN ET AL: "A combined approach to drug metabolism and toxicity assessment" DRUG METABOLISM AND DISPOSITION, WILLIAMS AND WILKINS, BALTIMORE, MD, US, vol. 34, no. 3, 1 March 2006 (2006-03-01), pages 495-503, XP002488587 ISSN: 0090-9556 * |
EKINS SEAN ET AL: "A novel method for visualizing nuclear hormone receptor networks relevant to drug metabolism." DRUG METABOLISM AND DISPOSITION: THE BIOLOGICAL FATE OF CHEMICALS MAR 2005, vol. 33, no. 3, March 2005 (2005-03), pages 474-481, XP002532446 ISSN: 0090-9556 * |
EKINS SEAN ET AL: "Computational prediction of human drug metabolism" EXPERT OPINION ON DRUG METABOLISM & TOXICOLOGY, ASHLEY PUBLICATIONS, LONDON, GB, vol. 1, no. 2, 1 August 2005 (2005-08-01), pages 303-324, XP009091066 ISSN: 1742-5255 * |
NIKOLSKY Y ET AL: "Biological networks and analysis of experimental data in drug discovery" DRUG DISCOVERY TODAY, ELSEVIER, RAHWAY, NJ, US, vol. 10, no. 9, 1 May 2005 (2005-05-01), pages 653-662, XP004890889 ISSN: 1359-6446 * |
SCOTT BOYER ET AL: "New methods in predictive metabolism" JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN, KLUWER ACADEMIC PUBLISHERS, DO, vol. 16, no. 5-6, 1 May 2002 (2002-05-01), pages 403-413, XP019248057 ISSN: 1573-4951 * |
See also references of WO2006099624A2 * |
Also Published As
Publication number | Publication date |
---|---|
GB2439675A (en) | 2008-01-02 |
WO2006099624A9 (en) | 2006-11-09 |
WO2006099624A3 (en) | 2007-12-06 |
WO2006099624A2 (en) | 2006-09-21 |
EP1866824A4 (en) | 2009-08-05 |
GB0719562D0 (en) | 2007-11-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20120029896A1 (en) | System and method for prediction of drug metabolism, toxicity, mode of action, and side effects of novel small molecule compounds | |
Paananen et al. | An omics perspective on drug target discovery platforms | |
Zheng et al. | DrugComb update: a more comprehensive drug sensitivity data repository and analysis portal | |
Kirchmair et al. | Predicting drug metabolism: experiment and/or computation? | |
Ross et al. | Rapid and accurate prediction and scoring of water molecules in protein binding sites | |
Zhu et al. | Combinatorial QSAR modeling of chemical toxicants tested against Tetrahymena pyriformis | |
Giulini et al. | An information-theory-based approach for optimal model reduction of biomolecules | |
Agrafiotis et al. | Combinatorial informatics in the post-genomics era | |
Chou et al. | Machine learning and artificial intelligence in physiologically based pharmacokinetic modeling | |
Korolev et al. | Modeling of human cytochrome P450-mediated drug metabolism using unsupervised machine learning approach | |
Martinez-Romero et al. | Artificial intelligence techniques for colorectal cancer drug metabolism: ontologies and complex networks | |
Hunt et al. | WhichP450: A multi-class categorical model to predict the major metabolising CYP450 isoform for a compound | |
Berellini et al. | In silico prediction of total human plasma clearance | |
Maggiora | The reductionist paradox: are the laws of chemistry and physics sufficient for the discovery of new drugs? | |
Clancy et al. | From proteomes to complexomes in the era of systems biology | |
Kuepfer et al. | Multiscale mechanistic modeling in pharmaceutical research and development | |
Danielson et al. | In silico ADME techniques used in early-phase drug discovery | |
Yu | Predicting total clearance in humans from chemical structure | |
Doğan et al. | Protein domain-based prediction of drug/compound–target interactions and experimental validation on LIM kinases | |
Goh et al. | NetProt: complex-based feature selection | |
Li et al. | TAIJI: approaching experimental replicates-level accuracy for drug synergy prediction | |
Konovalov et al. | Statistical confidence for variable selection in QSAR models via Monte Carlo cross-validation | |
Sucharitha et al. | Absorption, distribution, metabolism, excretion, and toxicity assessment of drugs using computational tools | |
Conev et al. | EnGens: a computational framework for generation and analysis of representative protein conformational ensembles | |
Rappoport | Reaction networks and the metric structure of chemical space (s) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20071008 |
|
AK | Designated contracting states |
Kind code of ref document: A2 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR |
|
AX | Request for extension of the european patent |
Extension state: AL BA HR MK YU |
|
R17D | Deferred search report published (corrected) |
Effective date: 20071206 |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: G06G 7/48 20060101ALI20080121BHEP Ipc: G01N 33/48 20060101AFI20080121BHEP |
|
DAX | Request for extension of the european patent (deleted) | ||
RIC1 | Information provided on ipc code assigned before grant |
Ipc: G06G 7/48 20060101ALI20090618BHEP Ipc: G01N 33/48 20060101ALI20090618BHEP Ipc: G06F 19/00 20060101AFI20090618BHEP |
|
A4 | Supplementary search report drawn up and despatched |
Effective date: 20090706 |
|
17Q | First examination report despatched |
Effective date: 20091104 |
|
RAP1 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: GENEGO, INC. |
|
RIN1 | Information on inventor provided before grant (corrected) |
Inventor name: NIKOLSKY, YURI Inventor name: NIKOLSKAYA, TATIANA Inventor name: BUGRIM, ANDREJ Inventor name: EKINS, SEAN |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
|
18D | Application deemed to be withdrawn |
Effective date: 20150505 |