CN110910970B - Method for predicting toxicity of chemicals by taking zebra fish embryos as receptors through building QSAR model - Google Patents

Info

Publication number
CN110910970B
CN110910970B CN201911139387.8A CN201911139387A CN110910970B CN 110910970 B CN110910970 B CN 110910970B CN 201911139387 A CN201911139387 A CN 201911139387A CN 110910970 B CN110910970 B CN 110910970B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911139387.8A
Other languages
Chinese (zh)
Other versions
CN110910970A (en)
Inventor
陈景文
吴思甜
李雪花
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-11-20
Filing date
2019-11-20
Publication date
2022-05-13
Application filed by Dalian University of Technology
Priority to CN201911139387.8A
Publication of CN110910970A
Application granted
Publication of CN110910970B

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures

Abstract

The invention discloses a method for predicting the toxicity of chemicals to zebrafish embryos by building QSAR models. Given a known molecular structure, the method only requires calculating molecular descriptors that characterize the structure and applying the established QSAR model to predict, quickly and efficiently, the median lethal concentration (LC50) of the compound with zebrafish embryos as the receptor. The models are built according to the guidelines of the Organisation for Economic Co-operation and Development (OECD) for constructing and using QSAR models and use simple, transparent multiple linear regression (MLR), so they are easy to understand and apply. The models have a defined application domain and good goodness of fit, robustness, and predictive ability; they can effectively predict the LC50 of compounds within the application domain and provide basic data needed for the ecological risk assessment and management of chemicals.

Description

Method for predicting toxicity of chemicals by taking zebra fish embryos as receptors through building QSAR model
Technical Field
The invention relates to a method for predicting toxicity of chemicals by taking zebra fish embryos as receptors through building a QSAR model, and belongs to the technical field of ecological risk evaluation test strategies.
Background
The median lethal concentration of a chemical, denoted LC50, is the concentration required to kill half of the test animals. It is commonly used to indicate the magnitude of acute toxicity: the greater the acute toxicity of a chemical, the smaller its LC50 value. Information on the toxicity of chemicals to aquatic organisms is one of the key indicators for chemical risk assessment and priority pollutant screening. The Fish Embryo Toxicity (FET) test with zebrafish is widely used to assess the acute toxicity of chemicals.
Many synthetic chemicals in everyday consumer products may have deleterious effects on aquatic species. Ecological risk assessment of chemicals is an essential measure for controlling and preventing harm from synthetic chemicals to aquatic organisms. Today, the median lethal concentration (LC50) for aquatic species such as fish is an indispensable biological endpoint. However, ecotoxicity data are scarce. LC50 data are typically obtained from experimental measurements under standardized test protocols, which in most cases are time-consuming, costly, and error-prone. Obtaining LC50 data for all existing chemicals through standardized animal tests is therefore unrealistic. Many international regulations, such as the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) regulation, accept computational techniques as replacements for animal testing. It is therefore necessary to develop non-experimental techniques that obtain the required parameter values of substances efficiently and rapidly, to meet the needs of ecological risk assessment and management of organic chemicals.
Among computational methods, the quantitative structure-activity relationship (QSAR) approach is a non-animal method that can provide toxicity data in a timely and cost-effective manner. QSAR supports the precautionary principle in managing toxic and hazardous chemical pollution: it can reduce or replace experiments, fill gaps in experimental data, and lower experimental costs, and it has been widely developed for the ecological risk assessment and management of toxic and hazardous chemicals in many countries. In 2004 the Organisation for Economic Co-operation and Development (OECD) formally adopted guidelines for the development and use of QSAR models, under which a model should have: (1) a defined endpoint; (2) an unambiguous algorithm; (3) a defined application domain; (4) appropriate measures of goodness of fit, robustness, and predictive ability; and (5) preferably, a mechanistic interpretation.
Although existing toxicity models each have their merits, they also have shortcomings, mainly in the following respects. First, no QSAR toxicity model covering a wide range of compounds has been built from zebrafish or zebrafish embryo toxicity data. Some studies have built QSAR models for a wide range of chemicals from fathead minnow toxicity data. However, the fathead minnow is relatively large and less sensitive to chemicals, the tests are easily affected by the breeding season, and the quality of the toxicity data fluctuates considerably. Models built with larger fish as the receptor are therefore likely to predict only moderately well for compounds of low toxicity. Second, current models built from zebrafish embryo data address only compounds of similar toxicity. Compounds with similar structural properties also differ little in toxicity. Some studies have modeled less toxic chemicals from embryo toxicity data, but none has modeled toxicity data spanning a wide range. Third, the modeling mechanism is not sufficiently transparent. QSAR models built with complex methods such as support vector machines predict toxicity well, but the relationship between toxicity and the descriptors cannot be written as an explicit mathematical expression and the model mechanism is hard to interpret, which hinders acceptance and wider use.
QSAR models built on the basis of mode of action (MOA) correlate satisfactorily with biological endpoints. Therefore, by understanding the toxicity mechanism and intrinsic hazard of chemicals and first classifying them by mode of action, an accurate toxicity prediction model can be obtained.
Fish embryos have become a focus of ecotoxicology and toxicology research. The Fish Embryo Acute Toxicity (FET) test, which uses zebrafish embryos as the test material, was recently adopted by the Organisation for Economic Co-operation and Development (OECD) as Test Guideline TG 236.
For these reasons, the inventors analyzed existing fish embryo toxicity databases and classified the MOA of the chemicals with the classification scheme established by Verhaar (Verhaar H J M, Van Leeuwen C J, Hermens J L M. Classifying environmental pollutants [J]. Chemosphere, 1992, 25(4): 471-491). Separate QSAR models for the LC50 of compounds, with zebrafish embryos as the receptor, were then established for the chemicals of each MOA.
Disclosure of Invention
The invention provides a simple and efficient method for predicting the toxicity of chemicals to zebrafish embryos. The method determines the toxic mode of action of a chemical from its SMILES code and then predicts its LC50 for zebrafish embryos, providing basic data needed for chemical risk assessment and management. The modeling follows the OECD guidelines for constructing and using QSAR models, and internal and external validation are performed to examine the predictive ability and robustness of the models.
Experimental zebrafish embryo toxicity data were collected from the ECOTOX database, the ECHA database, and the literature. Duplicate records were checked, and a data set of 348 compounds was finally established.
The technical scheme of the invention is as follows:
a method for predicting the toxicity of chemicals taking zebra fish embryos as receptors by establishing a QSAR model comprises the following steps:
establishing a corresponding QSAR model for each mode of action of the 348 compounds; first, the modes of action of the 348 compounds are divided into six classes, namely inert compounds, weakly inert compounds, reactive compounds, compounds acting by a specific mechanism, unclassifiable COOR-containing compounds, and unclassifiable COOR-free compounds; each class of compounds is randomly split into a training set and a validation set at a ratio of 4:1; the training-set compounds are used to construct the model, and the validation-set compounds are used to evaluate the external predictive ability of the model; the compounds in the data set are geometry-optimized to obtain their stable conformations, Dragon descriptors are calculated from these structures, and the molecular descriptors are screened by multiple linear regression (MLR) analysis to construct the prediction models;
(1) prediction model for inert compounds:
$$\log LC_{50} = -1.556\,\text{SM4\_B(p)} + 13.738\,\text{R5e+} - 1.983\,\text{Mor16m} - 0.223\,\text{RDF075m} + 7.375 \tag{1}$$
where $n_{tra} = 30$, $R^2_{tra} = 0.91$, $R^2_{adj,tra} = 0.89$, $RMSE_{tra} = 0.51$, $Q^2_{LOO} = 0.85$, $n_{ext} = 9$, $R^2_{ext} = 0.85$, $RMSE_{ext} = 0.59$; SM4_B(p) is the spectral moment of order 4 from the Burden matrix weighted by polarizability; R5e+ is the R maximal autocorrelation of lag 5 weighted by Sanderson electronegativity; Mor16m is 3D-MoRSE signal 16 weighted by mass; RDF075m is the radial distribution function 075 weighted by mass; $n_{tra}$ and $n_{ext}$ are the numbers of compounds in the training set and the validation set, respectively; $R^2$ is the coefficient of determination; $R^2_{adj}$ is the adjusted coefficient of determination; RMSE is the root mean square error; $Q^2_{LOO}$ is the leave-one-out cross-validation coefficient;
(2) prediction model for weakly inert compounds:
$$\log LC_{50} = -0.962\,\text{SpMax5\_Bh(s)} + 0.689\,\text{nHDon} + 0.391\,\text{GATS5s} + 0.177\,\text{Mor08i} + 0.922 \tag{2}$$
where $n_{tra} = 18$, $R^2_{tra} = 0.98$, $R^2_{adj,tra} = 0.97$, $RMSE_{tra} = 0.13$, $Q^2_{LOO} = 0.96$, $n_{ext} = 6$, $R^2_{ext} = 0.94$, $RMSE_{ext} = 0.34$; SpMax5_Bh(s) is the largest eigenvalue no. 5 of the Burden matrix weighted by I-state; nHDon is the number of hydrogen-bond donor atoms (N and O); GATS5s is the Geary autocorrelation of lag 5 weighted by I-state; Mor08i is 3D-MoRSE signal 08 weighted by ionization potential;
(3) prediction model for reactive compounds:
$$\log LC_{50} = -0.013\,\text{P\_VSA\_e\_2} - 1.48\,\text{B09[C-O]} + 0.607\,\text{Hy} - 12.256\,\text{R4p+} + 1.463\,\text{nOHp} + 0.281 \tag{3}$$
where $n_{tra} = 35$, $R^2_{tra} = 0.86$, $R^2_{adj,tra} = 0.84$, $RMSE_{tra} = 0.44$, $Q^2_{LOO} = 0.78$, $n_{ext} = 10$, $R^2_{ext} = 0.78$, $RMSE_{ext} = 0.56$; P_VSA_e_2 is the P_VSA-like descriptor on Sanderson electronegativity, bin 2; B09[C-O] is the presence/absence of C-O at topological distance 9; Hy is the hydrophilic factor; R4p+ is the R maximal autocorrelation of lag 4 weighted by polarizability; nOHp is the number of primary alcohols;
(4) prediction model for compounds acting by a specific mechanism:
$$\log LC_{50} = -0.221\,\text{RDF085p} + 0.887\,\text{Eig10\_EA(dm)} - 1.833\,\text{B09[C-S]} + 0.927\,\text{GATS7i} - 0.698\,\text{Mor28m} - 1.93 \tag{4}$$
where $n_{tra} = 47$, $R^2_{tra} = 0.82$, $R^2_{adj,tra} = 0.80$, $RMSE_{tra} = 0.39$, $Q^2_{LOO} = 0.75$, $n_{ext} = 12$, $R^2_{ext} = 0.76$, $RMSE_{ext} = 0.47$; RDF085p is the radial distribution function 085 weighted by polarizability; Eig10_EA(dm) is eigenvalue no. 10 from the edge adjacency matrix weighted by dipole moment; B09[C-S] is the presence/absence of C-S at topological distance 9; GATS7i is the Geary autocorrelation of lag 7 weighted by ionization potential; Mor28m is 3D-MoRSE signal 28 weighted by mass;
Because the unclassifiable chemicals are heterogeneous, a single model for all of them predicts poorly. During the MLR analysis, the COOR-containing substances were found to show a strong toxicity correlation, so they are separated out and modeled on their own.
(5) prediction model for unclassifiable compounds (COOR-containing):
$$\log LC_{50} = -4.96\,\text{TDB10m} + 2.479\,\text{nOHp} - 1.592\,\text{MATS7s} + 1.659\,\text{NssNH} - 0.367\,\text{L3s} - 1.382 \tag{5}$$
where $n_{tra} = 32$, $R^2_{tra} = 0.88$, $R^2_{adj,tra} = 0.86$, $RMSE_{tra} = 0.39$, $Q^2_{LOO} = 0.83$, $n_{ext} = 8$, $R^2_{ext} = 0.76$, $RMSE_{ext} = 0.43$; TDB10m is the 3D topological-distance-based descriptor of lag 10 weighted by mass; nOHp is the number of primary alcohols; MATS7s is the Moran autocorrelation of lag 7 weighted by I-state; NssNH is the number of atoms of type ssNH; L3s is the third-component size directional WHIM index weighted by I-state;
(6) prediction model for unclassifiable compounds (COOR-free):
$$\log LC_{50} = -2.496\,\text{SM6\_B(p)} + 0.184\,\text{Eta\_betaS} - 0.419\,\text{Eig10\_AEA(dm)} - 6.637\,\text{X3A} + 0.694\,\text{B03[O-O]} + 19.193 \tag{6}$$
where $n_{tra} = 110$, $R^2_{tra} = 0.71$, $R^2_{adj,tra} = 0.70$, $RMSE_{tra} = 0.69$, $Q^2_{LOO} = 0.68$, $n_{ext} = 31$, $R^2_{ext} = 0.74$, $RMSE_{ext} = 0.39$; SM6_B(p) is the spectral moment of order 6 from the Burden matrix weighted by polarizability; Eta_betaS is the eta sigma VEM count; Eig10_AEA(dm) is eigenvalue no. 10 from the augmented edge adjacency matrix weighted by dipole moment; X3A is the average connectivity index of order 3; B03[O-O] is the presence/absence of O-O at topological distance 3;
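As a minimal sketch of how equations (1) to (6) could be applied once the Dragon descriptor values are available, the coefficients can be stored per mode-of-action class and evaluated as a weighted sum. The class keys, the `MODELS` layout, and the function name `predict_log_lc50` are illustrative assumptions and are not part of the patent; only the coefficients and descriptor names are taken from the equations.

```python
# Coefficients transcribed from equations (1)-(6).
# The dictionary layout and class keys are hypothetical, chosen for illustration.
MODELS = {
    "inert": ({"SM4_B(p)": -1.556, "R5e+": 13.738,
               "Mor16m": -1.983, "RDF075m": -0.223}, 7.375),
    "weakly_inert": ({"SpMax5_Bh(s)": -0.962, "nHDon": 0.689,
                      "GATS5s": 0.391, "Mor08i": 0.177}, 0.922),
    "reactive": ({"P_VSA_e_2": -0.013, "B09[C-O]": -1.48, "Hy": 0.607,
                  "R4p+": -12.256, "nOHp": 1.463}, 0.281),
    "specific": ({"RDF085p": -0.221, "Eig10_EA(dm)": 0.887, "B09[C-S]": -1.833,
                  "GATS7i": 0.927, "Mor28m": -0.698}, -1.93),
    "unclassifiable_COOR": ({"TDB10m": -4.96, "nOHp": 2.479, "MATS7s": -1.592,
                             "NssNH": 1.659, "L3s": -0.367}, -1.382),
    "unclassifiable_no_COOR": ({"SM6_B(p)": -2.496, "Eta_betaS": 0.184,
                                "Eig10_AEA(dm)": -0.419, "X3A": -6.637,
                                "B03[O-O]": 0.694}, 19.193),
}


def predict_log_lc50(moa_class, descriptors):
    """Evaluate the linear model of one class for a dict of descriptor values."""
    coefficients, intercept = MODELS[moa_class]
    missing = [name for name in coefficients if name not in descriptors]
    if missing:
        raise KeyError(f"missing descriptor values: {missing}")
    return intercept + sum(coef * descriptors[name]
                           for name, coef in coefficients.items())
```

The function only evaluates the regression equation; whether the prediction is reliable still depends on the application-domain check described in the text.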
Williams plots of the standard residuals (δ) against the leverage values ($h_i$) of the organic chemicals in each model are used to characterize the application domain of the model. The standard residual (δ) is calculated as:
$$\delta = \frac{y_i - \hat{y}_i}{\sqrt{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 / (n - p - 1)}}$$
where δ is the standard residual, $y_i$ and $\hat{y}_i$ are the experimental and predicted values for the ith compound, n is the number of compounds in the data set, and p is the number of descriptors.
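The standard residual can be sketched in a few lines. This is a hedged illustration: it assumes the residual of each compound is divided by the residual standard deviation with n − p − 1 degrees of freedom, which matches the symbols defined above but is an assumption about the exact formula; the function name is hypothetical.

```python
import math


def standardized_residuals(y_obs, y_pred, p):
    """Return delta_i = (y_i - yhat_i) / sqrt(sum((y - yhat)^2) / (n - p - 1)).

    Assumption: the scaling uses n - p - 1 degrees of freedom, where n is the
    number of compounds and p is the number of descriptors.
    """
    n = len(y_obs)
    if len(y_pred) != n:
        raise ValueError("y_obs and y_pred must have the same length")
    if n - p - 1 <= 0:
        raise ValueError("need more compounds than model parameters")
    residuals = [obs - pred for obs, pred in zip(y_obs, y_pred)]
    scale = math.sqrt(sum(r * r for r in residuals) / (n - p - 1))
    if scale == 0.0:
        # Perfect fit: every residual is zero, so every delta is zero too.
        return [0.0] * n
    return [r / scale for r in residuals]
```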
The leverage value ($h_i$) and its warning value ($h^*$) are calculated as:
$$h_i = x_i^{T} (X^{T} X)^{-1} x_i \tag{7}$$
$$h^* = 3(k+1)/n \tag{8}$$
wherein xiIs the descriptor matrix for the ith compound; x is the number ofi TIs xiThe transposed matrix of (2); x is a descriptor matrix for all compounds; xTIs the transpose of X; (X)TX)-1Is a matrix XTThe inverse of X; k is the number of variables in the model and n is the number of training set samples. A compound whose standard residual (δ) falls outside (-2, +2) is considered an outlier. The rest compounds are considered to be capable of well predicting the logLC taking zebra fish embryos as receptors50The value is obtained.
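Equations (7) and (8) can be illustrated with a small dependency-free sketch. It is an illustration under assumptions, not the inventors' implementation: the helper names are hypothetical, the descriptor matrix is passed as a list of rows exactly as the caller supplies it (whether a column of ones is appended for the intercept is left to the caller), and the linear system is solved by plain Gaussian elimination instead of forming the inverse explicitly.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def leverage(x_i, X):
    """h_i = x_i^T (X^T X)^-1 x_i, equation (7)."""
    k = len(x_i)
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)]
           for a in range(k)]
    z = solve_linear(xtx, list(x_i))  # z = (X^T X)^-1 x_i
    return sum(xi * zi for xi, zi in zip(x_i, z))


def warning_leverage(k, n):
    """h* = 3(k + 1) / n, equation (8)."""
    return 3 * (k + 1) / n
```

A compound whose leverage is below the warning value of its class is treated as lying inside the structural part of the application domain; for a four-descriptor model trained on 30 compounds, `warning_leverage(4, 30)` gives 0.5.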
The invention has the beneficial effects that:
The models can be used to predict the zebrafish embryo toxicity of many kinds of compounds. The method is simple, fast, and inexpensive. Because the prediction method follows the OECD guidelines for developing and using QSAR models, the zebrafish embryo toxicity predictions of the invention can provide data support for chemical regulation and are significant for the ecological risk assessment of chemicals.
(1) Using only PM6 geometry optimization and Dragon 6.0 descriptors, the method predicts the toxicity of a compound with zebrafish embryos as the receptor; the calculation is simple and the models are readily interpreted mechanistically.
(2) The data set is comprehensive and covers several toxic modes of action, giving the models the widest prediction range among compound toxicity models that take zebrafish embryos as the receptor.
(3) The models are constructed and evaluated according to the OECD guidelines for building and using QSAR models; they show good goodness of fit, robustness, and predictive ability and can be used for the risk assessment and management of chemicals.
(4) Modeling uses a linear regression algorithm that is transparent and simple, so the models are easy to interpret, apply, and disseminate.
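The goodness-of-fit and robustness statistics quoted for each model ($R^2$, $R^2_{adj}$, RMSE, $Q^2_{LOO}$) can be reproduced for any multiple linear regression with a short sketch like the one below. The function names are hypothetical, and two definitions are assumptions because the text does not spell them out: RMSE is taken as sqrt(SS_res / n), and $Q^2_{LOO}$ is taken as 1 − PRESS / SS_tot with the full training-set mean in the denominator.

```python
import math


def _solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_mlr(X, y):
    """Ordinary least squares with intercept; returns [b0, b1, ..., bp]."""
    A = [[1.0] + list(row) for row in X]
    k = len(A[0])
    ata = [[sum(r[i] * r[j] for r in A) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(k)]
    return _solve(ata, aty)


def predict(coefs, row):
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def fit_statistics(X, y):
    """Return R2, adjusted R2, RMSE and leave-one-out Q2 for an MLR model."""
    n, p = len(y), len(X[0])
    coefs = fit_mlr(X, y)
    y_mean = sum(y) / n
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    ss_res = sum((yi - predict(coefs, row)) ** 2 for row, yi in zip(X, y))
    press = 0.0
    for i in range(n):
        # Refit without compound i, then predict compound i.
        loo_coefs = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(loo_coefs, X[i])) ** 2
    r2 = 1 - ss_res / ss_tot
    return {
        "R2": r2,
        "R2_adj": 1 - (1 - r2) * (n - 1) / (n - p - 1),
        "RMSE": math.sqrt(ss_res / n),  # assumption: divide by n
        "Q2_LOO": 1 - press / ss_tot,   # assumption: full-set mean in SS_tot
    }
```

Applying the same `R2` and `RMSE` formulas to the validation-set compounds, with predictions from the training-set model, gives the external statistics.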
Drawings
FIG. 1 is a plot of measured versus predicted logLC50 values for inert chemicals.
FIG. 2 is a plot of measured versus predicted logLC50 values for weakly inert chemicals.
FIG. 3 is a plot of measured versus predicted logLC50 values for reactive chemicals.
FIG. 4 is a plot of measured versus predicted logLC50 values for chemicals acting by a specific mechanism.
FIG. 5 is a plot of measured versus predicted logLC50 values for unclassifiable chemicals (COOR-containing).
FIG. 6 is a plot of measured versus predicted logLC50 values for unclassifiable chemicals (COOR-free).
FIG. 7 is the Williams plot for inert chemicals.
FIG. 8 is the Williams plot for weakly inert chemicals.
FIG. 9 is the Williams plot for reactive chemicals.
FIG. 10 is the Williams plot for chemicals acting by a specific mechanism.
FIG. 11 is the Williams plot for unclassifiable chemicals (COOR-containing).
FIG. 12 is the Williams plot for unclassifiable chemicals (COOR-free).
Detailed Description
The following further describes a specific embodiment of the present invention with reference to the drawings and technical solutions.
Example 1
Given the compound isoamyl alcohol (CAS number: 123-51-3), its logLC50 value is to be predicted. First, isoamyl alcohol is assigned to the inert chemicals on the basis of its SMILES code. Its 3D structure is generated with Open Babel and then optimized with PM6, and the descriptor values are calculated with Dragon 6.0 from the optimized 3D structure. The h value calculated with equation (7) is 0.087. The Williams plot for inert chemicals in FIG. 7 shows that the warning leverage (h*) for this class is 0.5; because the leverage of isoamyl alcohol is below 0.5, the compound lies within the application domain of the model. Substituting the descriptor values into equation (1) gives a predicted logLC50 of 1.24; the experimentally determined logLC50 is 1.06, so the predicted and experimental values agree closely.
Example 2
Given the compound 2,4,5-trichlorophenol (CAS number: 95-95-4), its logLC50 value is to be predicted. First, 2,4,5-trichlorophenol is assigned to the weakly inert chemicals on the basis of its SMILES code. Its 3D structure is generated with Open Babel and then optimized with PM6, and the descriptor values are calculated with Dragon 6.0 from the optimized 3D structure. The h value calculated with equation (7) is 0.147. The Williams plot for weakly inert chemicals in FIG. 8 shows that the warning leverage (h*) for this class is 0.833; because the leverage of 2,4,5-trichlorophenol is below 0.833, the compound lies within the application domain of the model. Substituting the descriptor values into equation (2) gives a predicted logLC50 of -1.73; the experimentally determined logLC50 is -1.99, so the predicted and experimental values agree closely.
Example 3
Given the compound cyhalofop-butyl (CAS number: 122008-85-9), its logLC50 value is to be predicted. First, cyhalofop-butyl is assigned to the unclassifiable chemicals on the basis of its SMILES code. Its 3D structure is generated with Open Babel and then optimized with PM6, and the descriptor values are calculated with Dragon 6.0 from the optimized 3D structure. The values of the descriptors nRCOOR and nACOR identify it as a COOR-containing compound among the unclassifiable chemicals. The h value calculated with equation (7) is 0.068. The Williams plot for COOR-containing chemicals in FIG. 11 shows that the warning leverage (h*) for this class is 0.5625; because the leverage of cyhalofop-butyl is below 0.5625, the compound lies within the application domain of the model. Substituting the descriptor values into equation (5) gives a predicted logLC50 of -2.62; the experimentally determined logLC50 is -2.53, so the predicted and experimental values agree closely.
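The arithmetic behind the three examples can be checked with a short sketch. All numbers are taken from Examples 1 to 3 and from the training-set sizes and descriptor counts of equations (1), (2) and (5); the tuple layout and the function name `check_example` are illustrative assumptions.

```python
# (compound, number of descriptors k, training-set size n,
#  leverage h, predicted logLC50, measured logLC50), as reported in Examples 1-3.
EXAMPLES = [
    ("isoamyl alcohol", 4, 30, 0.087, 1.24, 1.06),
    ("2,4,5-trichlorophenol", 4, 18, 0.147, -1.73, -1.99),
    ("cyhalofop-butyl", 5, 32, 0.068, -2.62, -2.53),
]


def check_example(k, n_tra, h, predicted, measured):
    """Recompute the warning leverage h* = 3(k + 1)/n and the prediction error."""
    h_star = 3 * (k + 1) / n_tra
    return {
        "h_star": h_star,
        "in_domain": h < h_star,
        "abs_error": abs(predicted - measured),
    }


if __name__ == "__main__":
    for name, k, n_tra, h, predicted, measured in EXAMPLES:
        print(name, check_example(k, n_tra, h, predicted, measured))
```

The recomputed warning leverages (0.5, about 0.833, and 0.5625) match the values read from FIGS. 7, 8 and 11, and the absolute prediction errors are 0.18, 0.26 and 0.09 log units.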

Claims (1)

1. A method for predicting the toxicity of chemicals taking zebra fish embryos as receptors by establishing a QSAR model is characterized by comprising the following steps:
establishing a corresponding QSAR model for each mode of action of the 348 compounds; first, the modes of action of the 348 compounds are divided into six classes, namely inert compounds, weakly inert compounds, reactive compounds, compounds acting by a specific mechanism, unclassifiable COOR-containing compounds, and unclassifiable COOR-free compounds; each class of compounds is randomly split into a training set and a validation set at a ratio of 4:1; the training-set compounds are used to construct the model, and the validation-set compounds are used to evaluate the external predictive ability of the model; the compounds in the data set are geometry-optimized to obtain their stable conformations, Dragon descriptors are calculated from these structures, and the molecular descriptors are screened by multiple linear regression (MLR) analysis to construct the prediction models;
(1) prediction model for inert compounds:
$$\log LC_{50} = -1.556\,\text{SM4\_B(p)} + 13.738\,\text{R5e+} - 1.983\,\text{Mor16m} - 0.223\,\text{RDF075m} + 7.375 \tag{1}$$
where $n_{tra} = 30$, $R^2_{tra} = 0.91$, $R^2_{adj,tra} = 0.89$, $RMSE_{tra} = 0.51$, $Q^2_{LOO} = 0.85$, $n_{ext} = 9$, $R^2_{ext} = 0.85$, $RMSE_{ext} = 0.59$; SM4_B(p) is the spectral moment of order 4 from the Burden matrix weighted by polarizability; R5e+ is the R maximal autocorrelation of lag 5 weighted by Sanderson electronegativity; Mor16m is 3D-MoRSE signal 16 weighted by mass; RDF075m is the radial distribution function 075 weighted by mass; $n_{tra}$ and $n_{ext}$ are the numbers of compounds in the training set and the validation set, respectively; $R^2$ is the coefficient of determination; $R^2_{adj}$ is the adjusted coefficient of determination; RMSE is the root mean square error; $Q^2_{LOO}$ is the leave-one-out cross-validation coefficient;
(2) prediction model for weakly inert compounds:
$$\log LC_{50} = -0.962\,\text{SpMax5\_Bh(s)} + 0.689\,\text{nHDon} + 0.391\,\text{GATS5s} + 0.177\,\text{Mor08i} + 0.922 \tag{2}$$
where $n_{tra} = 18$, $R^2_{tra} = 0.98$, $R^2_{adj,tra} = 0.97$, $RMSE_{tra} = 0.13$, $Q^2_{LOO} = 0.96$, $n_{ext} = 6$, $R^2_{ext} = 0.94$, $RMSE_{ext} = 0.34$; SpMax5_Bh(s) is the largest eigenvalue no. 5 of the Burden matrix weighted by I-state; nHDon is the number of hydrogen-bond donor atoms; GATS5s is the Geary autocorrelation of lag 5 weighted by I-state; Mor08i is 3D-MoRSE signal 08 weighted by ionization potential;
(3) prediction model for reactive compounds:
$$\log LC_{50} = -0.013\,\text{P\_VSA\_e\_2} - 1.48\,\text{B09[C-O]} + 0.607\,\text{Hy} - 12.256\,\text{R4p+} + 1.463\,\text{nOHp} + 0.281 \tag{3}$$
where $n_{tra} = 35$, $R^2_{tra} = 0.86$, $R^2_{adj,tra} = 0.84$, $RMSE_{tra} = 0.44$, $Q^2_{LOO} = 0.78$, $n_{ext} = 10$, $R^2_{ext} = 0.78$, $RMSE_{ext} = 0.56$; P_VSA_e_2 is the P_VSA-like descriptor on Sanderson electronegativity, bin 2; B09[C-O] is the presence or absence of C-O at topological distance 9; Hy is the hydrophilic factor; R4p+ is the R maximal autocorrelation of lag 4 weighted by polarizability; nOHp is the number of primary alcohols;
(4) prediction model for compounds acting by a specific mechanism:
$$\log LC_{50} = -0.221\,\text{RDF085p} + 0.887\,\text{Eig10\_EA(dm)} - 1.833\,\text{B09[C-S]} + 0.927\,\text{GATS7i} - 0.698\,\text{Mor28m} - 1.93 \tag{4}$$
where $n_{tra} = 47$, $R^2_{tra} = 0.82$, $R^2_{adj,tra} = 0.80$, $RMSE_{tra} = 0.39$, $Q^2_{LOO} = 0.75$, $n_{ext} = 12$, $R^2_{ext} = 0.76$, $RMSE_{ext} = 0.47$; RDF085p is the radial distribution function 085 weighted by polarizability; Eig10_EA(dm) is eigenvalue no. 10 from the edge adjacency matrix weighted by dipole moment; B09[C-S] is the presence or absence of C-S at topological distance 9; GATS7i is the Geary autocorrelation of lag 7 weighted by ionization potential; Mor28m is 3D-MoRSE signal 28 weighted by mass;
(5) prediction model for unclassifiable COOR-containing compounds:
$$\log LC_{50} = -4.96\,\text{TDB10m} + 2.479\,\text{nOHp} - 1.592\,\text{MATS7s} + 1.659\,\text{NssNH} - 0.367\,\text{L3s} - 1.382 \tag{5}$$
where $n_{tra} = 32$, $R^2_{tra} = 0.88$, $R^2_{adj,tra} = 0.86$, $RMSE_{tra} = 0.39$, $Q^2_{LOO} = 0.83$, $n_{ext} = 8$, $R^2_{ext} = 0.76$, $RMSE_{ext} = 0.43$; TDB10m is the 3D topological-distance-based descriptor of lag 10 weighted by mass; nOHp is the number of primary alcohols; MATS7s is the Moran autocorrelation of lag 7 weighted by I-state; NssNH is the number of atoms of type ssNH; L3s is the third-component size directional WHIM index weighted by I-state;
(6) prediction model for unclassifiable COOR-free compounds:
$$\log LC_{50} = -2.496\,\text{SM6\_B(p)} + 0.184\,\text{Eta\_betaS} - 0.419\,\text{Eig10\_AEA(dm)} - 6.637\,\text{X3A} + 0.694\,\text{B03[O-O]} + 19.193 \tag{6}$$
where $n_{tra} = 110$, $R^2_{tra} = 0.71$, $R^2_{adj,tra} = 0.70$, $RMSE_{tra} = 0.69$, $Q^2_{LOO} = 0.68$, $n_{ext} = 31$, $R^2_{ext} = 0.74$, $RMSE_{ext} = 0.39$; SM6_B(p) is the spectral moment of order 6 from the Burden matrix weighted by polarizability; Eta_betaS is the eta sigma VEM count; Eig10_AEA(dm) is eigenvalue no. 10 from the augmented edge adjacency matrix weighted by dipole moment; X3A is the average connectivity index of order 3; B03[O-O] is the presence or absence of O-O at topological distance 3.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911139387.8A 2019-11-20 2019-11-20 Method for predicting toxicity of chemicals by taking zebra fish embryos as receptors through building QSAR model
Status: Active, CN110910970B (en)

Publications (2)

Publication Number Publication Date
CN110910970A (en) 2020-03-24
CN110910970B (en) 2022-05-13
