CN113409899B - Method for predicting human developmental toxicity based on action mode - Google Patents


Publication number
CN113409899B
Authority
CN
China
Prior art keywords
activity
prediction
model
compound
toxicity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110677549.4A
Other languages
Chinese (zh)
Other versions
CN113409899A (en
Inventor
史薇
谭皓月
陈钦畅
于红霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN202110677549.4A
Publication of CN113409899A
Application granted
Publication of CN113409899B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references

Landscapes

  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Public Health (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Epidemiology (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Primary Health Care (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a method for predicting human developmental toxicity based on mode of action, and belongs to the field of virtual screening and activity prediction of the human developmental toxicity of chemicals. A compound activity dataset is constructed by selecting an adverse outcome pathway based on the mode of action of human developmental toxicity and collecting the study events on that pathway together with their activity data. A first prediction model is constructed from the dataset. Next, a plurality of compounds with in vivo experimental data are predicted with the first prediction model, and the resulting predictions are used to train a second prediction model with a naive Bayes algorithm. A compound to be tested is input into the first prediction model for qualitative prediction, and the qualitative prediction result is input into the second prediction model to complete the prediction of the chemical's human developmental toxicity. The invention enables high-throughput screening of chemicals that potentially cause human developmental toxicity through a given mode of action.

Description

Method for predicting human developmental toxicity based on action mode
Technical Field
The invention relates to the field of virtual screening and activity prediction of the human developmental toxicity of chemicals, and in particular to a method for predicting human developmental toxicity based on mode of action.
Background
Numerous animal experiments and epidemiological studies have found that most small-molecule compounds can interfere with nuclear-receptor-mediated nucleic acid translation and expression and thereby affect human ontogenesis, an effect known as developmental toxicity (Developmental Toxicity). In particular, many environmental contaminants identified by official and non-official organizations, including the U.S. Environmental Protection Agency (U.S. EPA) and the European Chemicals Agency (ECHA), can potentially affect adverse outcome pathways (Adverse Outcome Pathway, AOP) and cause dysplasia of human reproductive organs in both males and females. In males, for example, contaminants can produce individual-level male developmental toxicity (Male Developmental Toxicity, MDT) by affecting androgen-receptor-mediated adverse outcome pathways and causing dysplasia of male reproductive organs, including growth retardation, dysfunction, or abnormality of the testes and prostate. Environmental pollutants with MDT can lead not only to dysplasia of male reproductive organs but also potentially to male reproductive health abnormalities, including increased incidence of testicular germ cell tumors, low semen quality, cryptorchidism, and hypospadias, ultimately leading to reproductive toxicity (Reproductive Toxicity). In females, contaminants can produce individual-level female developmental toxicity (Female Developmental Toxicity, FDT) by affecting estrogen-receptor-mediated adverse outcome pathways and causing dysplasia of female reproductive organs, including the ovaries, fallopian tubes, uterus, placenta, and breasts.
Successful pregnancy requires the normal development and function of these female reproductive organs; their abnormal development can lead to difficulty or inability to conceive, failure to carry a pregnancy to term, or difficulty in feeding the infant.
In principle, environmental pollutants acting through a given mode of action interfere with adverse outcome pathways and thereby produce developmental toxicity at the molecular, cellular, and organ or tissue levels. The result is incomplete or malformed development of human reproductive organs, failure of those organs to develop and function normally, and ultimately reproductive dysfunction. Currently, hazard identification for developmental effects relies mainly on large numbers of animal experiments and epidemiological studies, with only a small number of studies on interference mechanisms. Thousands of chemicals are in commerce, but few have been tested for developmental toxicity at the individual level, and even fewer epidemiological studies address dysplasia of human reproductive organs as an endpoint. Consequently, many traditional animal experiments have been conducted over the last two decades to test chemicals for human developmental toxicity. Since 2004, however, after the European Union prohibited traditional animal testing on ethical grounds, non-animal testing methods based on in vitro testing and virtual screening have been developed. In vitro testing is time-consuming and costly, and it cannot cover the tens of thousands of registered chemicals, so virtual screening techniques are particularly important.
Scientists have developed computer-based virtual screening methods to predict the activity of chemicals at toxicity endpoints. The quantitative structure-activity relationship (Quantitative Structure-Activity Relationship, QSAR) uses molecular descriptors to extract and characterize the relationship between the biological activity of compounds and their structural features. QSAR is a mature method that has been widely used in many kinds of toxicity prediction. For example, the invention titled "A virtual screening method for human transthyretin disruptors" (patent publication No. CN106407665A, publication date 2017-02-15) was constructed using QSAR technology. The inventions titled "Fresh water acute benchmark prediction method based on metal quantitative structure-activity relationship" (patent publication No. CN104820873A, publication date 2015-08-05) and "Seawater acute benchmark prediction method based on metal quantitative structure-activity relationship" (patent publication No. CN105447248A, publication date 2015-11-24) likewise use QSAR technology to predict acute benchmarks for fresh water and seawater. Many non-QSAR techniques have also been applied to developmental toxicity prediction. One example is the invention titled "A method for evaluating the growth and development toxicity of triazole pesticides using Drosophila melanogaster" (patent publication No. CN110150236A, publication date 2021-04-06). That method, however, is narrow in scope (limited to triazole pesticides), and its developmental endpoint is measured in Drosophila melanogaster rather than being a human health toxicity endpoint, which is the endpoint of primary concern. Another example is the invention titled "Chemical developmental toxicity prediction method, prediction model, and construction method and application thereof" (patent publication No. CN112063681A, publication date 2020-12-11).
Although that method uses human cardiomyocyte data for prediction, its prediction endpoint is the activity of the alpha-Actinin and SOX17 proteins in human cardiomyocytes, so it is limited to activity prediction at the cellular level and cannot predict developmental toxicity at the organ and individual levels.
The adverse outcome pathway (Adverse Outcome Pathway, AOP) describes the known links between a direct molecular initiating event (Molecular Initiating Event, MIE), such as ligand-receptor binding, and the adverse outcomes relevant to risk assessment that occur at different levels of biological organization (e.g., cells, organs, organisms, populations). Establishing an AOP identifies not only the individual toxic events in the course of toxicity but also the relationships between them, and thus modularizes these events. An AOP ultimately links toxic events at the cellular, organ, and individual levels, supports a comprehensive evaluation of the toxic effects of a chemical along the adverse outcome pathway, and predicts the chemical's toxic mechanism of action (e.g., interference with certain key events). However, an analysis of the prior art shows that methods for high-throughput prediction of human developmental toxicity based on mode of action are lacking.
Disclosure of Invention
Technical problem: the invention aims to overcome the inability of the prior art to perform effective high-throughput prediction of human developmental toxicity based on mode of action, and provides a method for predicting human developmental toxicity based on mode of action. The method enables high-throughput screening of chemicals that potentially act on an adverse outcome pathway and thereby produce developmental toxicity, so that it can be determined accurately and quickly both whether a compound has developmental toxicity and whether that toxicity arises from interference with the adverse outcome pathway.
Technical solution: the invention provides a method for predicting human developmental toxicity based on mode of action, comprising the following steps:
constructing a compound activity dataset, which comprises selecting an adverse outcome pathway based on the mode of action and determining a first number of study events capable of describing a second number of toxicity endpoints; and collecting and curating the relevant assays and/or compound activity data for each study event;
based on the compound activity dataset and a third number of molecular descriptor libraries, training, for each toxicity endpoint, a fourth number of QSAR models with a fourth number of different machine learning algorithms, selecting for each toxicity endpoint the QSAR model with the best predictive performance, and forming a first prediction model from the resulting set of the second number of QSAR models;
predicting a plurality of compounds that have in vivo experimental data with the first prediction model, and training a second prediction model on the resulting predictions with a naive Bayes algorithm;
and inputting the compound to be tested into the first prediction model for qualitative prediction, and inputting the qualitative prediction result into the second prediction model to complete the prediction of the chemical's human developmental toxicity.
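The second-stage step above can be sketched as a small naive Bayes classifier trained on the binary calls of the first prediction model. This is only an illustrative sketch: the Bernoulli variant, the Laplace smoothing constant `alpha`, and the function names are assumptions, since the text specifies only that a naive Bayes algorithm is used.

```python
import math


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on binary feature vectors.

    X: list of 0/1 vectors, one column per first-stage endpoint call.
    y: list of 0/1 in vivo labels.
    alpha: Laplace smoothing constant (an assumption, not from the text).
    """
    n_features = len(X[0])
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        prior = (len(rows) + alpha) / (len(X) + 2 * alpha)
        # Smoothed probability that each feature equals 1 within this class.
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[label] = (math.log(prior), probs)
    return model


def predict_bernoulli_nb(model, x):
    """Return 1 if the positive class has the higher log posterior, else 0."""
    scores = {}
    for label, (log_prior, probs) in model.items():
        score = log_prior
        for xj, p in zip(x, probs):
            score += math.log(p if xj else 1.0 - p)
        scores[label] = score
    return 1 if scores[1] > scores[0] else 0
```

In use, each row of `X` would hold the first prediction model's qualitative calls for one compound with in vivo data, and `y` would hold the corresponding in vivo outcome.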
Preferably, selecting an adverse outcome pathway based on the mode of action and determining the first number of study events comprises:
collecting, based on the human developmental toxicity mechanism of the adverse outcome pathway, a first number of study events, including molecular initiating events and key events, at the molecular level, the cellular level, and the individual or organ level;
the relevant assays for each study event are collected as follows: except for individual- or organ-level study events, which are based on animal experiments, the data for all study events from the molecular initiating event onward come from high-throughput cheminformatics and in vitro experiments; the individual- or organ-level study event is taken as the adverse outcome, that is, the final study event, and the animal test used must comply with the test guidelines of the U.S. Environmental Protection Agency or the Organisation for Economic Co-operation and Development;
the activity data for each study event are curated as follows:
first, compounds without structural information, polymers, ionic compounds, and mixtures are removed;
then, compound activity is normalized using the following formula:
in the formula, Activity value represents the activity intensity value, Ki represents the inhibition constant, Kd represents the dissociation constant, AC50 represents the half-maximal activity concentration, IC50 represents the half-maximal inhibitory concentration, EC50 represents the half-maximal effective concentration, and uM denotes micromolar units;
finally, cytotoxicity assays are used as an activity filter to remove compounds that may be false positives due to cytotoxicity.
Preferably, after the activity data are curated, the collected compounds are assigned activity classes for each study event, as follows:
active: for a given study event, at least one assay shows activity; if the activity comes from antagonist in vitro data, the antagonist activity intensity must be greater than the cytotoxicity activity intensity in the same assay;
inactive: for a given study event, all assay results are inactive, or only antagonist in vitro data exist and the antagonist activity intensity is less than or equal to the cytotoxicity activity intensity in the same assay;
agonist: for a given study event, the compound is active and agonist activity data exist;
antagonist: for a given study event, the compound is active, antagonist activity data exist, and the antagonist activity intensity is greater than the cytotoxicity activity intensity.
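The four classes above can be expressed as a short rule-based function. The sketch below is illustrative only: the `AssayResult` record, its field names, and the mode strings are hypothetical, and the default cutoff of 3 follows the per-assay threshold described for the embodiment (activity value of 3 or more counts as active).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AssayResult:
    mode: str                           # "agonist", "antagonist", or "other" (hypothetical labels)
    av: float                           # normalized activity value of the assay
    cytotox_av: Optional[float] = None  # activity value of the paired cytotoxicity assay, if any


def classify_event(results, threshold=3.0):
    """Return the set of activity classes for one compound and one study event."""
    labels = set()
    for r in results:
        if r.av < threshold:
            continue  # this assay is inactive
        if r.mode == "antagonist":
            # An antagonist hit counts only if it is stronger than cytotoxicity.
            if r.cytotox_av is not None and r.av <= r.cytotox_av:
                continue
            labels.add("antagonist")
        elif r.mode == "agonist":
            labels.add("agonist")
        labels.add("active")
    return labels or {"inactive"}
```

A compound whose only hit is an antagonist readout weaker than its cytotoxicity readout is therefore returned as inactive, which mirrors the cytotoxicity filter described in the text.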
Preferably, training, for each toxicity endpoint, a fourth number of QSAR models with a fourth number of different machine learning algorithms based on the compound activity dataset and a third number of molecular descriptor libraries, selecting the QSAR model with the best predictive performance for each toxicity endpoint, and obtaining the first prediction model comprises:
dividing the constructed compound activity dataset into a training set and a test set at a ratio of 4:1, the training set being used for model construction and internal validation and the test set for external validation;
selecting a third number of molecular descriptor libraries to calculate the structural information of the compounds;
for each toxicity endpoint, training a fourth number of QSAR models with a fourth number of different machine learning algorithms;
validating the predictive performance of the QSAR models with a fifth number of different metrics, and selecting for each toxicity endpoint the QSAR model with the best predictive ability;
and forming the first prediction model from the set of the selected second number of QSAR models.
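The 4:1 division in the first step above can be sketched as follows. The text does not say how compounds are assigned to the two sets, so the seeded random shuffle (without stratification by activity class) and the function name are assumptions.

```python
import random


def split_4_to_1(compounds, seed=0):
    """Shuffle the compounds and split them 4:1 into training and test sets."""
    items = list(compounds)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    cut = (len(items) * 4) // 5         # first 80% goes to training
    return items[:cut], items[cut:]
```

The training portion would then be used for model construction and internal validation, and the held-out fifth for external validation.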
Preferably, the fourth number of machine learning algorithms includes the K-nearest neighbor algorithm, the naive Bayes algorithm, random forest, the support vector machine, and the decision tree.
Preferably, the third number of molecular descriptor libraries are OEState, Mold2, and Dragon v.7.
Preferably, the fifth number of indicators includes true positive, false positive, true negative, false negative, sensitivity, specificity, accuracy, and area under the curve.
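The metrics listed above all follow from the confusion matrix and a ranking of predicted scores. The sketch below shows one way to compute them; the function names are illustrative, labels are assumed to be coded 0/1, and the area under the curve is computed here with the pairwise-ranking (Mann-Whitney) formulation, which is one standard choice rather than something the text prescribes.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Return the counts plus sensitivity, specificity, and accuracy."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)

    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


def auc(y_true, scores):
    """Area under the ROC curve: fraction of positive/negative pairs
    ranked correctly, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```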
Preferably, when constructing the QSAR model, the training set of the constructed QSAR model is internally validated using 5-fold cross validation to test the stability of the data.
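A minimal sketch of how the training set could be partitioned for 5-fold cross-validation is given below. The shuffling, the seed, and the generator name are assumptions; the text specifies only that 5-fold cross-validation is applied to the training set.

```python
import random


def k_fold_indices(n, k=5, seed=0):
    """Yield (train_indices, validation_indices) pairs for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    for i in range(k):
        validation = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, validation
```

Each training-set compound appears in exactly one validation fold, so averaging a metric over the five folds gives an internal estimate of model stability.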
Preferably, the in vivo experimental data are animal experimental data. Because some animal experiments can detect several toxic effects of a compound at the organ or individual level, the second prediction model comprises a plurality of independent prediction models, each of which predicts one toxic effect. When a compound to be tested is evaluated by these independent models, the compound is considered to have human developmental toxicity if at least one model predicts it to be positive.
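The decision rule for combining the independent second-stage models is a logical OR over their outputs. A minimal sketch, with hypothetical effect names as dictionary keys:

```python
def has_developmental_toxicity(effect_predictions):
    """Return True if at least one independent model predicts a positive effect.

    effect_predictions: mapping from toxic-effect name to a 0/1 (or bool) call.
    """
    return any(bool(v) for v in effect_predictions.values())
```

For example, a compound predicted positive for only one organ-level effect would still be flagged as developmentally toxic under this rule.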
Further, the applicability domain is constructed as follows:
the applicability domain of each QSAR model is described using 26 physicochemical properties of the compounds in the training set, covering 1D, 2D, and 3D descriptors; the 26 properties are nAcid, ALogP, AMR, apol, naAromAtom, nAromBond, nAtom, nHeavyAtom, nBonds, nBondsD, nBondsT, nBondsQ, bpol, eta_alpha, FMF, nHBAcc, nHBDon, topoPSA, VABC, MW, AMW, XLogP, TPSA, nRing, nRotB, and nRotBt;
and the distance matrix of the training set is calculated with the Euclidean distance and used as the applicability domain of each QSAR model.
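The distance-matrix step above can be sketched as follows. The text states only that a Euclidean distance matrix of the training set defines the applicability domain; the specific membership rule used here (a query is inside the domain if its nearest training compound is no farther than the largest nearest-neighbor distance within the training set) is an assumption, as are the function names. Descriptor scaling is omitted for brevity.

```python
import math


def distance_matrix(points):
    """Pairwise Euclidean distances between descriptor vectors."""
    return [[math.dist(a, b) for b in points] for a in points]


def domain_threshold(train):
    """Largest nearest-neighbor distance within the training set (assumed rule)."""
    d = distance_matrix(train)
    n = len(train)
    nearest = [min(d[i][j] for j in range(n) if j != i) for i in range(n)]
    return max(nearest)


def in_domain(query, train, threshold):
    """True if the query lies within the threshold of some training compound."""
    return min(math.dist(query, t) for t in train) <= threshold
```

In practice each vector would hold the 26 physicochemical properties of one compound, and predictions for compounds outside the domain would be treated as less reliable.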
Beneficial effects: compared with the prior art, the invention has the following advantages:
The method provided by the embodiment of the invention selects an adverse outcome pathway based on the mode of action, determines the study events, collects and curates the relevant assays and compound activity data for each study event, and constructs a compound activity dataset. The constructed compound activity dataset is then combined with molecular descriptor libraries and machine learning algorithms to train a first prediction model. Next, a plurality of compounds with in vivo experimental data are predicted with the first prediction model, and a second prediction model is trained on the resulting predictions with a naive Bayes algorithm. A compound to be tested is then predicted with the first and second prediction models, so that it can be determined accurately and quickly both whether the compound has developmental toxicity and whether that toxicity arises from interference with the adverse outcome pathway.
The method provided by the invention enables high-throughput screening of chemicals that potentially act on an adverse outcome pathway and thereby produce developmental toxicity, so that it can be determined accurately and quickly both whether a compound has developmental toxicity and whether that toxicity arises from interference with the adverse outcome pathway. This remedies the lack, in the prior art, of high-throughput prediction of human developmental toxicity based on mode of action.
In the embodiment of the invention, the model is built on the molecular mechanism by which environmental pollutants interfere with the signaling of the adverse outcome pathway. It directly links the molecular, cellular, and individual or organ levels of the mechanism by which compounds produce human developmental toxicity, and thereby provides a key technical method and theoretical basis for extrapolating human developmental toxicity by computational toxicology.
Drawings
FIG. 1 is a flow chart of the method for predicting human developmental toxicity based on mode of action in an embodiment of the invention;
FIG. 2 is a flow chart of chemical prediction with the male developmental toxicity prediction model based on the androgen-receptor-mediated adverse outcome pathway;
FIG. 3 is a flow chart of the construction of the male developmental toxicity prediction model based on the androgen-receptor-mediated adverse outcome pathway;
FIG. 4 is a graph of the prediction metrics obtained when constructing the first prediction model of the male developmental toxicity prediction model based on the androgen-receptor-mediated adverse outcome pathway;
FIG. 5 is a graph of the external validation results for the first prediction model of the male developmental toxicity prediction model based on the androgen-receptor-mediated adverse outcome pathway;
FIG. 6 is a graph of the prediction results of the first and second prediction models of the male developmental toxicity prediction model based on the androgen-receptor-mediated adverse outcome pathway;
FIG. 7 is a prediction flow chart of the male developmental toxicity prediction model based on the androgen-receptor-mediated adverse outcome pathway;
FIG. 8 is a graph of predicted male developmental toxicity based on the androgen-receptor-mediated adverse outcome pathway.
Detailed Description
Example 1
In this example, the prediction of male developmental toxicity through the androgen-receptor-mediated adverse outcome pathway is used to describe a specific implementation of the invention in detail. FIG. 1 shows a flow chart of the method for predicting human developmental toxicity based on mode of action in an embodiment of the invention, and FIG. 2 shows a flow chart of chemical prediction with the male developmental toxicity prediction model based on the androgen-receptor-mediated adverse outcome pathway. With reference to FIG. 1 and FIG. 2, the method in this embodiment comprises:
Step S100: constructing a compound activity dataset, which comprises selecting an adverse outcome pathway based on the mode of action and determining a first number of study events capable of describing a second number of toxicity endpoints; and collecting and curating the relevant assays and/or compound activity data for each study event.
Here the first number refers to the number of study events determined. In this example, based on the mechanism of male developmental toxicity of the androgen-receptor-mediated adverse outcome pathway, experimental data were collected for seven study events, comprising a molecular initiating event (MIE) and key events (KE), at the molecular level, the cellular level, and the individual or organ level. The seven study events are ligand-receptor binding (MIE), cofactor recruitment (KE1), DNA binding (KE2), abnormal protein transcriptional activity (KE3), abnormal transcription (KE4), cell proliferation (KE5), and organ dysplasia (KE6), as shown in detail in FIG. 3.
Except for the animal-experiment-based organ dysplasia event (KE6), the high-throughput in vitro test data for the first six study events were derived from the ToxCast™/Tox21 high-throughput screening (High-Throughput Screening, HTS) project developed by four U.S. government bodies: the National Toxicology Program (NTP), the National Center for Advancing Translational Sciences (NCATS), the U.S. Food and Drug Administration (FDA), and the National Center for Computational Toxicology (NCCT) of the U.S. EPA. From the latest ToxCast™/Tox21 database, updated on February 26, 2019, a total of 20 assays describing the first six study events were collected, as shown in Table 1.
TABLE 1 Summary of the assays used in the male developmental toxicity prediction model based on the androgen-receptor-mediated adverse outcome pathway
Ligand-receptor binding (Receptor Binding), as the MIE, is the key first step in producing activity along the androgen-receptor-mediated adverse outcome pathway; it is covered by three molecular assays (NVS_NR_cAR, NVS_NR_rAR, NVS_NR_hAR).
Cofactor recruitment (COA recruitment), as KE1, is covered by two assays (OT_AR_ARSRC1_0480, OT_AR_ARSRC1_0960) that determine the effect on recruitment of co-activators (COAs) in the androgen-receptor-mediated adverse outcome pathway. DNA binding (Chromatin Binding), as KE2, is covered by one assay (TOX21_ARE_BLA_agonist_ratio) that determines whether AR binds to DNA. Notably, inhibition of protein production measured in in vitro assays is highly correlated with cell viability; that is, cytotoxic chemicals can produce apparent "false positive" activity. The cytotoxicity assay associated with each in vitro assay was therefore also selected as a threshold filter for compounds with "false positive" activity. Accordingly, TOX21_ARE_BLA_agonist_viability was also selected as a cytotoxicity assay to guard against false positives.
Abnormal protein transcriptional activity (Transcription Factor Activity), as KE3, is covered by two assays (ATG_AR_TRANS_dn, ATG_AR_TRANS_up). The two in vitro assays measure upward (ATG_AR_TRANS_up) and downward (ATG_AR_TRANS_dn) trends in protein transcriptional activity, respectively, so two models, one per transcriptional trend, were used in the subsequent model construction.
Abnormal transcription (Gene Expression), as KE4, is covered by seven assays in two groups, transcriptional activation and transcriptional inhibition. Activation of androgen-receptor-mediated transcription by a compound is covered by three in vitro assays (OT_AR_ARELUC_AG_1440, TOX21_AR_BLA_Agonist_ratio, TOX21_AR_LUC_MDAKB2_Agonist) that determine whether the compound has agonist activity. Inhibition of androgen-receptor-mediated transcription by a compound is covered by two in vitro assays (TOX21_AR_BLA_Antagonist_ratio, TOX21_AR_LUC_MDAKB2_Antagonist) and two associated cytotoxicity assays (TOX21_AR_BLA_Antagonist_viability, TOX21_AR_LUC_MDAKB2_Antagonist_viability) that determine whether the compound has antagonist activity under non-cytotoxic conditions.
Similarly, cell proliferation (Cell Proliferation), as KE5, is covered by two groups of assays, for activation and inhibition of cell proliferation, comprising two assays (ACEA_AR_agonist_80hr, ACEA_AR_antagonist_80hr) and two cytotoxicity assays (ACEA_AR_agonist_AUC_viability, ACEA_AR_antagonist_AUC_viability).
In addition, organ dysplasia was designated KE6. In this example, the rat Hershberger assay was selected to determine, at the organ level, the developmental toxicity of androgenic and anti-androgenic activity toward male reproductive organs. The rodent Hershberger assay is specified both in a U.S. EPA test guideline (EPA 890.1400) and in an OECD test guideline (OECD 441). Under the U.S. EPA/OECD guidelines, the Hershberger assay uses a castrated male rat model. Rats are castrated at around postnatal day 42 and allowed a post-operative recovery period of at least 7 days so that endogenous androgen (testosterone) levels decline. The assay readout is the change in weight of five androgen-dependent accessory sex tissues (ASTs): the ventral prostate (VP), the seminal vesicles (SV, including fluids and coagulating glands), the levator ani-bulbocavernosus (LABC) muscle, the paired Cowper's glands (COW), and the glans penis (GP). When two or more ASTs show a significant weight gain, the compound has an androgenic (agonist) effect; conversely, when two or more ASTs show a significant weight loss, the compound has an anti-androgenic (antagonist) effect. Androgenic and anti-androgenic effects are different types of chemical interference, both of which lead to abnormal development of male organs and, in turn, to male developmental toxicity (MDT). In total, 21 in vitro and in vivo assays were selected to screen the compound dataset for chemicals that produce male developmental toxicity through the androgen-receptor-mediated adverse outcome pathway.
The collected activity data are curated to clean up the compound structures and activity information, specifically as follows:
first, compounds without structural information (shown as "NA" or "FAIL"), polymers, ionic compounds, and mixtures are removed;
then, the activity value (AV) of each compound is normalized using formula (1);
in formula (1), Activity value represents the activity intensity value, Ki represents the inhibition constant, Kd represents the dissociation constant, AC50 represents the half-maximal activity concentration, IC50 represents the half-maximal inhibitory concentration, EC50 represents the half-maximal effective concentration, and uM denotes micromolar units. Under formula (1), for each assay, an activity intensity of 3 or more (AV ≥ 3) is defined as active, and an activity intensity below 3 (AV < 3) is defined as inactive.
After data curation, the collected compounds were assigned activity classes for each study event, as follows:
Active: for a given study event, at least one assay shows activity; if the activity comes from antagonist in vitro data, the antagonist activity intensity must be greater than the cytotoxicity activity intensity in the same assay;
Inactive: for a given study event, all assay results are inactive, or only antagonist in vitro data exist and the antagonist activity intensity is less than or equal to the cytotoxicity activity intensity in the same assay;
Agonist: for a given study event, the compound is active and agonist activity data exist;
Antagonist: for a given study event, the compound is active, antagonist activity data exist, and the antagonist activity intensity is greater than the cytotoxicity activity intensity.
Finally, the collected compound information is shown in table 2.
TABLE 2 data collection results of a predictive model of male developmental toxicity based on androgen receptor mediated deleterious outcome pathways
In Table 2, "1" means "active" and "0" means "inactive".
As can be seen from Table 2, the 7 study events describe 11 toxicity endpoints: ligand-receptor binding (receptor binding), cofactor recruitment (COA receptor binding), DNA binding (chromatin binding), protein transcriptional activity up-regulation (transcription factor activity-agonist), protein transcriptional activity down-regulation (transcription factor activity-antagonist), transcriptional activation (gene expression-agonist), transcriptional inhibition (gene expression-antagonist), cell proliferation-activation (cell proliferation-agonist), cell proliferation-inhibition (cell proliferation-antagonist), organ dysplasia-weight gain (Hershberger test-agonist), and organ dysplasia-weight loss (Hershberger test-antagonist). Detailed information for each toxicity endpoint is shown in Table 2.
In the embodiment, the adopted data are reliable and effective, so that the reliability and the accuracy of the subsequent model on the human developmental toxicity prediction result are improved.
Step S200: according to a compound activity data set, combining a third number of molecular descriptor libraries, respectively training a fourth number of corresponding QSAR models by a fourth number of different machine learning algorithms for each toxicity end point, screening out QSAR models with the best prediction effect corresponding to each toxicity end point, and forming a first prediction model by the acquired set of the second number of QSAR models;
the third number refers to the number of the molecular descriptor libraries, and the fourth number refers to the number of the types of the selected machine learning algorithm.
Each toxicity endpoint was therefore modeled using its own data in Table 2. In this example, three molecular descriptor libraries and five machine learning algorithms are selected, as shown in FIG. 4. More specifically, the three molecular descriptor libraries are OEState, gold 2 and Dragon v.7, and the five machine learning algorithms are K Nearest Neighbor (KNN), Naive Bayes (NB), Random Forest (RF), Support Vector Machine (SVM) and Decision Tree (DT). Thus, for the 11 toxicity endpoints of the seven study events, 5 QSAR models were trained for each toxicity endpoint, for a total of 55 QSAR models. The 5 QSAR models trained for each toxicity endpoint were then screened and the one with the best prediction effect was selected, giving 11 QSAR models for the 11 toxicity endpoints; the set of these 11 QSAR models forms the first prediction model.
Specifically, in connection with fig. 4, the following steps may be performed.
Step S201: the dataset was split into a training set (80%) and a test set (20%) in a 4:1 ratio using the "Partitioning Mode" module of the KNIME platform software; the training set is used for model construction and internal validation, and the test set is used for external validation.
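The 4:1 split can be illustrated outside KNIME with a short standard-library sketch. The function name and the fixed seed are assumptions for the example; whether the original partitioning was random or stratified is not stated in the text, so plain random shuffling is assumed here.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_4_to_1(items: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle the items and return (training set, test set) in an 80/20 ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * 0.8)
    return shuffled[:n_train], shuffled[n_train:]
```

Calling `split_4_to_1(range(100))` returns 80 training items and 20 test items with no overlap.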
Step S202: the structural information of each compound (represented as SMILES) was checked for correctness using ChemBioDraw Ultra 14.0 software. This step may be omitted if the structural information of the compounds is already known to be correct.
Step S203: three molecular descriptor libraries, OEState, gold 2 and Dragon v.7, were selected to compute structural information data for the compounds. Together the three libraries contain 6049 1D, 2D and 3D molecular descriptors, which were calculated from the SMILES of each compound on the Online Chemical Modeling Environment (OCHEM) online platform. Before calculation, each structure is optimized in four steps: Standardization, Neutralize, remove residues and Clean structure. The resulting molecular descriptors are then reduced to the relevant ones in three steps: a low variance filter, a high correlation filter and feature importance selection.
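The first two descriptor-selection steps (low variance filter and high correlation filter) can be sketched as below. The thresholds `min_var` and `max_corr` are hypothetical values chosen for illustration, since the text does not give them, and the feature importance step is omitted because it depends on the trained model.

```python
from math import sqrt
from typing import Dict, List


def variance(xs: List[float]) -> float:
    """Population variance of one descriptor column."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; returns 0.0 when either column is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def filter_descriptors(table: Dict[str, List[float]],
                       min_var: float = 0.01,
                       max_corr: float = 0.95) -> List[str]:
    """Keep descriptors that vary enough and are not near-duplicates of a kept one."""
    kept: List[str] = []
    for name, col in table.items():
        if variance(col) < min_var:
            continue  # low variance filter
        if any(abs(pearson(col, table[k])) > max_corr for k in kept):
            continue  # high correlation filter
        kept.append(name)
    return kept
```

With the greedy order used here, the first descriptor of a highly correlated pair is the one that survives; other tie-breaking rules are equally plausible.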
Step S204: for each toxicity end point, five machine learning algorithms are selected respectively, and five QSAR models are trained. Since there are 11 toxicity endpoints, a total of 55 QSAR models were trained.
In the invention, the five machine learning algorithms are K Nearest Neighbor (KNN), Naive Bayes (NB), Random Forest (RF), Support Vector Machine (SVM) and Decision Tree (DT). Therefore, for each toxicity endpoint, the 5 QSAR models obtained by training are a K-nearest-neighbor model, a naive Bayes model, a random forest model, a support vector machine model and a decision tree model.
Step S205: and verifying the prediction effect of the QSAR model by adopting a fifth number of different indexes, and selecting the QSAR model with the optimal prediction capacity for each toxicity end point.
The fifth number refers to the number of indices. In this embodiment, 8 different indices may be used to validate the resulting QSAR models: True Positive (TP), False Positive (FP), True Negative (TN), False Negative (FN), Sensitivity, Specificity, Accuracy and Area Under the Curve (AUC). In a specific implementation, one or more of these 8 indices may be used to select the best prediction model. Table 3 gives the validation results for the 55 trained QSAR models using the 8 indices.
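Seven of the eight indices follow directly from the confusion matrix of a binary classifier, as the sketch below shows. The function names are illustrative; AUC is left out because it requires continuous prediction scores rather than 0/1 labels.

```python
from typing import Dict, Sequence, Tuple


def confusion(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for binary labels, where 1 = active and 0 = inactive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy from the four counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "accuracy": (tp + tn) / total if total else 0.0,
    }
```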
TABLE 3 evaluation of the results of the first predictive model of the predictive model of male developmental toxicity based on androgen receptor mediated deleterious outcome pathways
In this example, Accuracy was used as the final criterion to screen out the QSAR model with the best prediction effect for each toxicity endpoint, as shown in FIG. 5.
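Selecting one model per endpoint by accuracy amounts to an argmax over the candidate algorithms. The sketch below assumes the accuracies are already available as a nested dictionary; the numbers in the usage note are made up for illustration and are not taken from Table 3.

```python
from typing import Dict


def select_best(accuracy_by_endpoint: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """For each toxicity endpoint, return the algorithm with the highest accuracy.

    Ties are resolved in favor of the algorithm listed first.
    """
    return {
        endpoint: max(scores, key=scores.get)
        for endpoint, scores in accuracy_by_endpoint.items()
    }
```

For instance, `select_best({"receptor binding": {"KNN": 0.80, "RF": 0.90}})` returns `{"receptor binding": "RF"}`.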
As can be seen in FIG. 5, the QSAR models trained with the Random Forest (RF) algorithm gave the best predictions for ligand-receptor binding (Model 1), cofactor recruitment (COA receptor, Model 2), DNA binding (chromatin binding, Model 3), protein transcriptional activity up-regulation (transcription factor activity-agonist, Model 4), transcriptional activation (gene expression-agonist, Model 6), transcriptional inhibition (gene expression-antagonist, Model 7), cell proliferation-activation (cell proliferation-agonist, Model 8) and cell proliferation-inhibition (cell proliferation-antagonist, Model 9).
For protein transcriptional activity down-regulation (transcription factor activity-antagonist, Model 5), the QSAR model trained with the Decision Tree (DT) algorithm predicted best. The QSAR models trained with the Support Vector Machine (SVM) algorithm predicted best for organ dysplasia-weight gain (Hershberger test-agonist, Model 10) and organ dysplasia-weight loss (Hershberger test-antagonist, Model 11).
While training the QSAR models in this embodiment, the training set used to construct each QSAR model is internally validated by five-fold cross-validation to test the stability of the data.
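Five-fold cross-validation partitions the training set into five folds and holds each fold out once. A minimal index generator is sketched below; the function name and seed are assumptions, and plain (non-stratified) folds are used because the text does not say otherwise.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n: int, k: int = 5, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of the k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i in range(k):
        validation = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, validation
```

Each sample appears in exactly one validation fold, so averaging the chosen metric over the five folds gives the internal validation estimate.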
Additionally, among the 48 compounds with in vivo results from the rat Hershberger test, 9 compounds were found to have experimental results for all seven study events of the androgen receptor mediated deleterious outcome pathway. The predictive ability of the 11 QSAR models in the first prediction model was therefore further validated with these 9 compounds. As shown in Table 4 and FIG. 6 (A), the 11 QSAR models in the first prediction model accurately predicted the experimental results of the 9 reference compounds for the seven study events of the androgen receptor mediated deleterious outcome pathway, with an accuracy of up to 92%. This demonstrates that the 11 QSAR models in the first prediction model can accurately and rapidly make qualitative predictions for the seven study events of the androgen receptor mediated deleterious outcome pathway.
TABLE 4 predictive validation of 9 representative compounds in a first predictive model of a predictive model of male developmental toxicity based on androgen receptor mediated deleterious outcome pathways
In Table 4, the color indicates that the measured and predicted results are consistent; gray indicates that the measured-predicted results are inconsistent; the experimental values are outside brackets, and the predicted values are in brackets; "1" is characterized as active and "0" is characterized as inactive.
The predictions that the first prediction model makes for each study event also indicate the mechanism of a chemical's human developmental toxicity, providing useful mechanistic information for the development of green chemicals.
Step S300: utilizing a plurality of compounds with in-vivo experimental data, utilizing the first prediction model to predict the prediction result of the compounds with in-vivo experimental data, and utilizing a naive Bayesian algorithm to train to obtain a second prediction model;
In this step, the 48 compounds with in vivo experimental data are predicted with the first prediction model, and the prediction results are shown in Table 5. These prediction results are then combined by a naive Bayes algorithm to train the second prediction model; in this embodiment, this yields a weight-based comprehensive male developmental toxicity prediction model. The second prediction model trained with the naive Bayes algorithm is essentially a weighted model.
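To illustrate why a naive Bayes combiner behaves like a weighted model, the sketch below fits a Bernoulli naive Bayes classifier to binary layer-one predictions. It is a generic textbook formulation, not the patented implementation: the class name is illustrative and the Laplace smoothing is an assumption. Each input contributes an additive log-likelihood term, which acts as its weight.

```python
from math import log
from typing import List, Sequence


class BernoulliNB:
    """Bernoulli naive Bayes for 0/1 features and 0/1 labels (Laplace smoothing assumed)."""

    def fit(self, X: Sequence[Sequence[int]], y: Sequence[int]) -> "BernoulliNB":
        n_feat = len(X[0])
        self.log_prior = {}
        self.p = {}
        for c in (0, 1):
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = log((len(rows) + 1) / (len(X) + 2))
            # P(feature j = 1 | class c), smoothed so it is never 0 or 1
            self.p[c] = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                         for j in range(n_feat)]
        return self

    def log_score(self, x: Sequence[int], c: int) -> float:
        """Log prior plus one additive log-likelihood term per feature."""
        s = self.log_prior[c]
        for xj, pj in zip(x, self.p[c]):
            s += log(pj) if xj else log(1 - pj)
        return s

    def predict(self, x: Sequence[int]) -> int:
        return int(self.log_score(x, 1) > self.log_score(x, 0))
```

In the setting of the text, each row of `X` would hold the 11 layer-one predictions for one of the 48 compounds and `y` the in vivo outcome for one effect type.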
TABLE 5 predictive outcome information for 48 compounds utilized by a predictive model of male developmental toxicity based on androgen receptor mediated deleterious outcome pathways and a first predictive model thereof
It should be noted that, at this stage, the in vivo animal test data come from the rat Hershberger test, which can detect both the androgenic effect and the antiandrogenic effect of a compound. The second prediction model therefore consists of two independent models: (i) an androgenic effect prediction model; (ii) an antiandrogenic effect prediction model. When a compound is predicted by the two models separately, it has male developmental toxicity if androgenic and/or antiandrogenic activity is predicted. In addition, because the experimental data available for the second prediction model are limited to the 48 compounds with in vivo experimental data, all of the data are used as the training set to construct the QSAR model at this stage, in order to preserve the application domain of the model, and internal validation is performed. As before, the prediction effect of the model is validated using one or more of the eight indices in step S200. Table 6 and FIG. 6 (B-C) show the results of the androgenic and antiandrogenic prediction models in the second prediction model.
TABLE 6 evaluation of the results of the second predictive model based on the predictive model of male developmental toxicity of androgen receptor mediated deleterious outcome pathways
The results show that the antiandrogenic activity prediction model correctly predicts, for all 48 compounds, whether antiandrogenic activity is present, with a prediction accuracy of 100%. The androgenic activity prediction model correctly identifies all five androgenic compounds, i.e., there are no false negatives, but 36 compounds without androgenic activity are false positives, i.e., wrongly predicted to have androgenic activity, so the accuracy is only 25%. Although a false negative rate of 0% makes the androgenic model acceptable from a regulatory point of view, this many wrong predictions defeats the purpose of prediction.
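The 25% figure can be checked from the counts given above, taking the compounds without androgenic activity to be the 48 compounds minus the five androgenic ones (43 in total):

```latex
\begin{aligned}
TP &= 5, \quad FN = 0, \quad FP = 36, \quad TN = 43 - 36 = 7, \\
\text{Accuracy} &= \frac{TP + TN}{TP + FP + TN + FN} = \frac{5 + 7}{48} = 0.25 .
\end{aligned}
```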
Therefore, in this example of the invention, the chemical structures and characteristics of the 48 compounds were studied in detail. It was found that the 5 androgenic compounds (testosterone propionate, 17-methyl testosterone, trenbolone, methyl-1-testosterone and testosterone) are all steroid compounds, while the other 43 compounds without androgenic activity are not steroids. Thus, a new screening condition was added: "Is a compound predicted to have androgenic activity a steroid?" The process is shown in FIG. 7.
During modeling, this screening condition is evaluated using chemical structural similarity (chemical similarity). Specifically, the five androgenic compounds are used as template compounds (positive controls), and the structure of each predicted compound is matched one by one against the structures of the template compounds. Chemical structural similarity is characterized by the Tanimoto Similarity Score, computed using the 12 molecular structure fingerprint libraries contained in the PaDEL-Descriptor software. The 12 fingerprint libraries are Fingerprinter, ExtendedFingerprinter, EStateFingerprinter, GraphOnlyFingerprinter, MACCSFingerprinter, PubchemFingerprinter, SubstructureFingerprinter, SubstructureFingerprintCount, KlekotaRothFingerprinter, KlekotaRothFingerprintCount, AtomPairs2DFingerprinter and AtomPairs2DFingerprintCount. The Tanimoto Score always lies between 0 and 1, and a larger score indicates higher chemical similarity. In this example, the cutoff value of the Tanimoto Score is set to 0.8: when the similarity between the test compound and at least one of the 5 androgenic compounds is greater than or equal to 0.8, the test compound is considered a steroid compound that satisfies the condition for androgenic activity. With the new screening condition, the predictive ability of the androgenic prediction model is greatly improved: as shown in Table 6 and FIG. 6 (D), it correctly predicts for all 48 compounds whether androgenic activity is present, with a prediction accuracy of 100%.
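The similarity screen can be sketched with fingerprints represented as sets of on-bit indices. This is a simplified illustration: the function names are hypothetical, a single fingerprint per compound is assumed, and how the 12 PaDEL fingerprint types were combined into one score is not stated in the text.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two bit-set fingerprints: |A & B| / |A | B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def is_steroid_like(query: Set[int], templates: Iterable[Set[int]], cutoff: float = 0.8) -> bool:
    """True if the query reaches the cutoff against at least one template compound."""
    return any(tanimoto(query, t) >= cutoff for t in templates)
```

Here `templates` would hold the fingerprints of the five androgenic template compounds, and 0.8 is the cutoff stated in the text.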
Step S400: and inputting the compound to be tested into the first prediction model to perform qualitative prediction, and inputting a qualitative prediction result into the second prediction model to complete the prediction of the human developmental toxicity of the chemical. It can be seen that the compound to be predicted is essentially predicted by two layers, the first layer being qualitatively predicted by the first prediction model and the second layer being synthetically predicted by the second prediction model.
In this example, following the schemes shown in FIG. 2 and FIG. 6, the proposed method was tested and verified using the compound flutamide. The compound was first input into the first prediction model for qualitative prediction, in which the 11 QSAR models predicted whether the test compound is active or inactive in the seven study events of the androgen receptor mediated deleterious outcome pathway. The prediction result of the first prediction model was then input into the second prediction model. Because the second prediction model in this embodiment comprises an androgenic activity prediction model and an antiandrogenic activity prediction model, the androgenic activity and the antiandrogenic activity of the compound are predicted separately. Finally, when a compound has androgenic and/or antiandrogenic activity, the compound has male developmental toxicity. FIG. 8 shows the overall output of this embodiment, which displays: (i) information on the test compound, including compound name, CAS number, structure information and male developmental toxicity prediction result; (ii) detailed male developmental toxicity prediction results (tabular format); (iii) detailed male developmental toxicity prediction results (laser pattern format).
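The final decision logic of the two-layer scheme can be summarized in a few lines. In this sketch the two second-layer models are passed in as plain callables, which is an assumption made to keep the example self-contained; the names and the `steroid_like` flag (the result of the steroid screening condition described earlier) are illustrative.

```python
from typing import Callable, Dict, Sequence

Model = Callable[[Sequence[int]], int]  # takes the layer-one calls, returns 1 or 0


def predict_mdt(layer1_calls: Sequence[int],
                androgenic_model: Model,
                antiandrogenic_model: Model,
                steroid_like: bool) -> Dict[str, bool]:
    """Combine the two second-layer models into a male developmental toxicity call."""
    # An androgenic prediction is accepted only if the steroid screen also passes.
    androgenic = bool(androgenic_model(layer1_calls)) and steroid_like
    antiandrogenic = bool(antiandrogenic_model(layer1_calls))
    return {
        "androgenic": androgenic,
        "antiandrogenic": antiandrogenic,
        "male_developmental_toxicity": androgenic or antiandrogenic,
    }
```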
Furthermore, because toxicity mechanisms differ, this embodiment can only predict small organic molecules; it cannot predict the male developmental toxicity of heavy metals, mixtures or ionic compounds. In this example, 26 physicochemical properties of the compounds in the training set are therefore used to describe the Application Domain (AD) of each QSAR prediction model. The 26 physicochemical properties, which cover 1D, 2D and 3D descriptors, are nAcid, ALogP, AMR, apol, naAromAtom, nAromBond, nAtom, nHeavyAtom, nBonds, nBondsD, nBondsT, nBondsQ, bpol, ETA_alpha, FMF, nHBAcc, nHBDon, topoPSA, VABC, MW, AMW, XLogP, TPSA, nRing, nRotB and nRotBt. The physicochemical property represented by each molecular descriptor is listed in Table 7. The physicochemical properties of the compounds were calculated using the 1D & 2D & 3D molecular descriptor library of the PaDEL-Descriptor software.
TABLE 7 The 26 physicochemical properties selected for the application domain of the male developmental toxicity prediction model based on androgen receptor mediated deleterious outcome pathways
The Application Domain (AD) is constructed in four steps. First, the 26 physicochemical properties are calculated for all compounds of the training set used for modeling, and the value of each property is normalized to between 0 and 1 using the Normalizer module of the KNIME platform software. Second, the Distance Matrix (DM) of the training set compounds of the model is calculated using the Euclidean distance in the R ComplexHeatmap package; this DM is the AD of the QSAR model. Third, the 26 physicochemical property values of the test compound are calculated and normalized against the AD of the corresponding QSAR model (i.e., against the 26 physicochemical parameters of the training set compounds), and the normalized values are compared with the DM of the model to find the training compound closest to the test compound and the corresponding similarity distance. Finally, when the similarity distance between the test compound and its nearest compound is greater than or equal to 0.5, the test compound is judged to be inside the AD; otherwise it is outside the AD.
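The normalization and nearest-compound search can be sketched with the standard library as follows. The function names are illustrative, and min-max scaling fitted on the training set is assumed as the normalization. The sketch stops at returning the distance to the nearest training compound; applying the 0.5 cutoff from the text is left to the caller.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def fit_min_max(rows: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Per-property (min, max) bounds computed on the training set."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(row: Sequence[float], bounds: Sequence[Tuple[float, float]]) -> List[float]:
    """Min-max scale one compound with training-set bounds; constant columns map to 0."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, bounds)]


def nearest_distance(query: Sequence[float], training_rows: Sequence[Sequence[float]]) -> float:
    """Euclidean distance from the scaled query to its nearest scaled training compound."""
    bounds = fit_min_max(training_rows)
    q = scale(query, bounds)
    return min(
        sqrt(sum((a - b) ** 2 for a, b in zip(q, scale(r, bounds))))
        for r in training_rows
    )
```

In the setting of the text, each row would hold the 26 physicochemical property values of one compound.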
Each QSAR model in the first prediction model has its own AD. Overall, therefore, the application domain of the method for predicting male developmental toxicity based on androgen receptor mediated deleterious outcome pathways is the union of the ADs of the 11 QSAR prediction models.
As can be seen from FIG. 8, the test compound flutamide is first predicted by the 11 QSAR models of layer 1, and both the prediction result and whether the compound is in the AD are reported (in brackets, "in" means inside the AD and "out" means outside the AD). Flutamide was found to lie within the AD of all 11 QSAR models, which supports the validity of its predictions.
At present, few valid experimental data are available from the traditional animal experiment (the rat Hershberger test). In 2018, OECD researchers carried out a highly systematic literature search to collate rat Hershberger experimental data into a valid animal experiment database (https://www.regulations.gov/document/EPA-HQ-OPPT-2009-0576-0008). The search returned nearly 3200 study records, of which only 134 compounds initially met the OECD/U.S. EPA guideline requirements. It was further found that 48 compounds had two or more studies with consistent findings; in contrast, 24 compounds had two or more studies whose findings were inconsistent, so their developmental toxicity at the individual or organ level could not be determined. Although a QSAR model built on the 48 compounds with valid animal experimental results can effectively predict the toxicity of compounds inside its AD (accuracy = 1, FIG. 4), the overly narrow AD greatly limits practical application of the prediction model. The male developmental toxicity prediction method based on androgen receptor mediated deleterious outcome pathways in this embodiment is a two-layer prediction model that is based on the androgen receptor mediated deleterious outcome pathway and constructed from multidimensional QSAR models. The model can not only resolve the interference mechanism of a compound (which key event is disturbed) but also, by combining the multidimensional models, greatly widens the narrow application range obtained from animal experiment data alone, substantially extending the prediction range and predictive ability of the model in practical application.
The above examples are only preferred embodiments of the present invention, it being noted that: it will be apparent to those skilled in the art that several modifications and equivalents can be made without departing from the principles of the invention, and such modifications and equivalents fall within the scope of the invention.

Claims (9)

1. A method of human developmental toxicity prediction based on mode of action comprising:
constructing a compound activity dataset comprising selecting deleterious outcome pathways based on the mode of action, determining a first number of research events capable of describing a second number of toxicity endpoints; collecting and collating relevant assays and compound activity data for each study event;
according to a compound activity data set, combining a third number of molecular descriptor libraries, respectively training a fourth number of corresponding QSAR models by a fourth number of different machine learning algorithms for each toxicity end point, screening out QSAR models with the best prediction effect corresponding to each toxicity end point, and forming a first prediction model by the acquired set of the second number of QSAR models;
Utilizing a plurality of compounds with in-vivo experimental data, utilizing the first prediction model to predict the prediction result of the compounds with in-vivo experimental data, and utilizing a naive Bayesian algorithm to train to obtain a second prediction model;
inputting a compound to be detected into the first prediction model to perform qualitative prediction, and inputting a qualitative prediction result into the second prediction model to complete prediction of human developmental toxicity of chemicals;
the method for determining the first number of research events based on the action mode comprises the following steps of:
collecting a first number of research events including molecular initiation events and critical events from a molecular level, a cellular level, an individual or an organ level based on a human developmental toxicity mechanism of a deleterious outcome pathway;
the manner of collecting the relevant trial for each study event is: except for individual or organ level research events based on animal experiments, all research event data from molecular start events are high-throughput chemical information, in vitro experiments; and taking an individual or organ level research event as a research harmful outcome or a final research event, wherein the used animal test needs to meet the test guidelines of the American environmental protection agency or the economic cooperation and development organization;
The manner of sorting the activity data for each study event is:
firstly, removing unstructured information, polymer type, ionic type and mixed type compounds;
compound activity was then normalized using the following formula:
in the formula, Activity value represents the activity intensity value, Ki represents the inhibition constant, Kd represents the dissociation constant, AC50 represents the half-maximal activity concentration, IC50 represents the half-maximal inhibitory concentration, EC50 represents the half-maximal effective concentration, and uM represents micromolar;
finally, the cytotoxicity assay was used as an active filter to remove potentially false positive compounds due to cytotoxicity.
2. The method of claim 1, wherein, after the activity data is collated, the collected compounds are activity classified for each study event, comprising:
active: for a given study event, at least one assay shows activity, and if the activity comes from in vitro antagonist data, the activity intensity is required to be greater than the cytotoxicity activity intensity in the same assay;
inactive: for a given study event, all assay results are inactive, or only in vitro antagonist data exist and the antagonist activity intensity is less than or equal to the cytotoxicity activity intensity in the same assay;
agonist: for a given study event, on the premise that the compound is active, agonist activity data exist;
antagonist: for a given study event, on the premise that the compound is active, antagonist activity data exist, and the antagonist activity intensity is greater than the cytotoxicity activity intensity.
3. The method of claim 2, wherein the training the fourth number of corresponding QSAR models by the fourth number of different machine learning algorithms for each toxicity endpoint according to the compound activity data set in combination with the third number of molecular descriptor libraries, and screening out the QSAR model with the best prediction effect corresponding to each toxicity endpoint, and the method for obtaining the first prediction model comprises:
dividing the constructed compound activity data set into a training set and a testing set in a ratio of 4:1, wherein the training set is used for model construction and internal verification, and the testing set is used for external verification;
selecting a third number of molecular descriptor libraries to calculate structural information data of the compound;
for each toxicity end point, respectively training to obtain a fourth number of QSAR models through a fourth number of different machine learning algorithms;
adopting a fifth number of different indexes to verify the prediction effect of the QSAR model, and selecting the QSAR model with the optimal prediction capacity for each toxicity end point;
And the set of the screened second number of QSAR models forms the first prediction model.
4. A method according to claim 3, wherein the fourth number of machine learning algorithms comprises a K-nearest neighbor algorithm, a naive Bayes algorithm, a random forest, a support vector machine, and a decision tree.
5. A method according to claim 3, wherein the third number of molecular descriptor libraries are respectively: OEState, gold 2, and Dragon v.7.
6. A method according to claim 3, wherein the fifth number of indicators comprises true positive, false positive, true negative, false negative, sensitivity, specificity, accuracy and area under the curve.
7. A method according to claim 3, wherein the training set of the QSAR model is internally validated using 5-fold cross validation to test the stability of the data when constructing the QSAR model.
8. The method of claim 1, wherein the in vivo test data is animal test data, and wherein a portion of the animal test data is capable of detecting a plurality of toxic effects exhibited by the compound at the individual or organ level, such that the second predictive model comprises a plurality of independent predictive models, each model predicting a toxic effect alone, and wherein the compound is subject to human developmental toxicity if positive in at least one of the plurality of second predictive models.
9. The method according to any one of claims 1-8, further comprising building an application domain, the method of building an application domain being:
the application domains of the relevant QSAR models are described using 26 physicochemical properties of the compounds in the training set, the 26 properties including 1D, 2D, 3D, nAcid, ALogP, AMR, apol, naAromAtom, nAromBond, nAtom, nHeavyAtom, nBonds, nBondsD, nBondsT, nBondsQ, bpol, ETA_alpha, FMF, nHBAcc, nHBDon, topoPSA, VABC, MW, AMW, XLogP, TPSA, nRing, nRotB, and nRotBt, respectively;
and calculating a distance matrix of the training set by adopting the Euclidean distance as an application domain of each QSAR model.
CN202110677549.4A 2021-06-18 2021-06-18 Method for predicting human developmental toxicity based on action mode Active CN113409899B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110677549.4A CN113409899B (en) 2021-06-18 2021-06-18 Method for predicting human developmental toxicity based on action mode

Publications (2)

Publication Number Publication Date
CN113409899A CN113409899A (en) 2021-09-17
CN113409899B true CN113409899B (en) 2024-02-09

Family

ID=77681337

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110677549.4A Active CN113409899B (en) 2021-06-18 2021-06-18 Method for predicting human developmental toxicity based on action mode

Country Status (1)

Country Link
CN (1) CN113409899B (en)

Also Published As

Publication number Publication date
CN113409899A (en) 2021-09-17

Similar Documents

Publication Publication Date Title
Rappazzo et al. Ozone exposure during early pregnancy and preterm birth: A systematic review and meta-analysis
CN109815532B (en) Method for high-throughput screening of endocrine disruptors
Ng et al. Development and validation of decision forest model for estrogen receptor binding prediction of chemicals using large data sets
Lagoa et al. The role of initial trauma in the host's response to injury and hemorrhage: insights from a correlation of mathematical simulations and hepatic transcriptomic analysis
CN113409899B (en) Method for predicting human developmental toxicity based on action mode
CN101845501A (en) Comprehensive genetic analysis method of susceptibility of complex diseases
Vakili et al. The Association of Inflammatory Biomarker of Neutrophil‐to‐Lymphocyte Ratio with Spontaneous Preterm Delivery: A Systematic Review and Meta‐analysis
Di Filippo et al. A machine learning model to predict drug transfer across the human placenta barrier
Chen et al. Shared diagnostic genes and potential mechanism between PCOS and recurrent implantation failure revealed by integrated transcriptomic analysis and machine learning
CN116864011A (en) Colorectal cancer molecular marker identification method and system based on multi-omics data
Thomas et al. Risk science in the 21st century: a data-driven framework for incorporating new technologies into chemical safety assessment
KR102111820B1 (en) Dynamic network biomarker detection device, detection method, and detection program
Yao et al. Bioinformatics searching of diagnostic markers and immune infiltration in polycystic ovary syndrome
JP2022062189A (en) Systems, methods and gene signatures for predicting biological status of individual
Tejera et al. A multi-objective approach for drug repurposing in preeclampsia
CN109545289B (en) Method for high-throughput screening of endocrine disruptors based on hierarchical warning structure
Judson et al. Using pathway modules as targets for assay development in xenobiotic screening
Marshall et al. Discriminant analysis for longitudinal data with multiple continuous responses and possibly missing data
CN113862371A (en) Prediction device for alcohol-related hepatocellular carcinoma disease progression and prognosis risk and training method of prediction model thereof
Jin et al. Identification and validation of potential hypoxia-related genes associated with coronary artery disease
Kishkovich et al. Performance of a Maternal Risk Stratification System for Predicting Low Apgar Scores
Smith et al. Finding Single and Multi-Gene Expression Patterns for Psoriasis Using Sub-Pattern Frequency Pruning
Xin et al. Knowledge-based machine learning for predicting and understanding the androgen receptor (AR)-mediated reproductive toxicity in zebrafish
US20220246232A1 (en) Method for diagnosing disease risk based on complex biomarker network
Peng et al. Using machine learning approach to predict short-term mortality risk of acute myocardial infarction after emergency admission

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant