CN110910970A - Method for predicting toxicity of chemicals by taking zebra fish embryos as receptors through building QSAR model - Google Patents

Publication number
CN110910970A
Authority
CN
China
Prior art keywords
tra
ext
compounds
weighted
rmse
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911139387.8A
Other languages
Chinese (zh)
Other versions
CN110910970B (en)
Inventor
陈景文
吴思甜
李雪花
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian University of Technology
Priority to CN201911139387.8A
Publication of CN110910970A
Application granted
Publication of CN110910970B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70: Machine learning, data mining or chemometrics
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C: COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00: Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30: Prediction of properties of chemical compounds, compositions or mixtures

Landscapes

  • Engineering & Computer Science (AREA)
  • Chemical & Material Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The invention discloses a method for predicting the toxicity of chemicals to zebrafish embryos by establishing QSAR models. Given only the known molecular structure of a compound, molecular descriptors characterizing that structure are calculated and the established QSAR model is applied, so that the median lethal concentration (LC50) of the compound toward zebrafish embryos can be predicted rapidly and efficiently. The models are built in accordance with the guidelines of the Organisation for Economic Co-operation and Development (OECD) for the construction and use of QSAR models, and they use simple, transparent multiple linear regression, so they are easy to understand and apply. The models have a defined applicability domain and good goodness-of-fit, robustness and predictive ability. They can effectively predict the zebrafish-embryo LC50 of compounds within the applicability domain and thereby provide the basic data needed for ecological risk assessment and management of chemicals.

Description

Method for predicting toxicity of chemicals by taking zebra fish embryos as receptors through building QSAR model
Technical Field
The invention relates to a method for predicting the toxicity of chemicals to zebrafish embryos by establishing QSAR models, and belongs to the technical field of test strategies for ecological risk assessment.
Background
The median lethal concentration of a chemical, denoted LC50, is the concentration required to kill half of the test animals, and it is often used to indicate the magnitude of acute toxicity. The greater the acute toxicity of a chemical, the smaller its LC50 value. Toxicity information of chemicals to aquatic organisms is one of the key indicators for chemical risk assessment and priority pollutant screening. The fish embryo toxicity (FET) test with zebrafish is widely used to assess the acute toxicity of chemicals.
Many synthetic chemicals in everyday consumer products may have a deleterious effect on aquatic species. Ecological risk assessment of chemicals is an essential measure to control and prevent the harm of synthetic chemicals to aquatic organisms. Today, the median lethal concentration (LC50) for aquatic species such as fish is an indispensable biological endpoint. However, the amount of available ecotoxicity data is small. LC50 data are typically obtained from experimental measurements according to standardized test protocols, which in most cases are time consuming, costly and prone to error. It is therefore unrealistic to obtain LC50 data for all existing chemicals by standardized animal tests. Many international regulations, such as the Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) regulation, have approved the use of computational techniques to replace animal testing. It is therefore necessary to develop non-experimental techniques that obtain the required parameter values of substances efficiently and rapidly, to meet the needs of ecological risk assessment and management of organic chemicals.
Among the computational methods, the quantitative structure-activity relationship (QSAR) is a non-animal method that can provide toxicity data in a timely and cost-effective manner. Because QSAR technology supports the precautionary principle in the management of toxic and harmful chemical pollution, and can reduce or replace experiments, fill gaps in experimental data and lower experimental cost, it has been widely developed for the ecological risk assessment and management of toxic and harmful chemicals in many countries. In 2004 the Organisation for Economic Co-operation and Development (OECD) formally established guidelines for the development and use of QSAR models, under which a model should have: (1) a well-defined endpoint; (2) an unambiguous algorithm; (3) a defined applicability domain; (4) appropriate measures of goodness-of-fit, robustness and predictive ability; and (5) a mechanistic interpretation, where possible.
Although existing toxicity models each have their merits, they also have shortcomings, mainly in the following respects. First, no QSAR model for predicting toxicity has been established for a wide range of compounds using zebrafish or zebrafish embryo toxicity data. Some studies have established QSAR models for a wide range of chemicals using toxicity data for the fathead minnow. However, the fathead minnow is large and not very sensitive to chemicals, experiments with it are easily affected by the breeding season, and the quality of the toxicity data fluctuates considerably. A model constructed with a large fish as the receptor is therefore likely to have only mediocre predictive performance for compounds of low toxicity. Second, the current models built on zebrafish embryo data cover only compounds of similar toxicity. Compounds with similar structural properties also differ little in toxicity. Some studies have used embryo toxicity data to model chemicals of low toxicity, but no study has modeled toxicity data spanning a wide range. Third, the modeling mechanism is not sufficiently transparent. QSAR models based on complex methods such as support vector machines have good toxicity prediction ability. However, the mechanism of these models is not transparent, the relationship between the toxicity of a chemical and its descriptors cannot be written as an explicit mathematical expression, and the models are difficult to interpret mechanistically, so they are not well suited to acceptance and wide adoption.
QSAR models established on the basis of mode of action (MOA) correlate satisfactorily with biological endpoints. Therefore, when the toxicity mechanism and intrinsic hazard of chemicals are understood and the chemicals are first classified by mode of action, accurate toxicity prediction models can be obtained.
Fish embryos have become a focus of ecotoxicology and toxicology research. The fish embryo acute toxicity (FET) test using zebrafish embryos was recently adopted by the Organisation for Economic Co-operation and Development (OECD) as Test Guideline TG 236.
For these reasons, the inventors analyzed the existing fish embryo toxicity data and classified the MOA of the chemicals using the classification scheme of Verhaar (Verhaar H J M, Van Leeuwen C J, Hermens J L M. Classifying environmental pollutants [J]. Chemosphere, 1992, 25(4): 471-491). QSAR models for the LC50 of compounds toward zebrafish embryos were then established separately for the chemicals of each MOA.
Disclosure of Invention
The invention provides a simple and efficient method for predicting the toxicity of chemicals to zebrafish embryos. The method determines the toxic mode of action of a chemical from its SMILES code and then predicts its LC50 toward zebrafish embryos, providing the basic data needed for chemical risk assessment and management. The modeling follows the OECD guidelines for the construction and use of QSAR models, and the predictive ability and robustness of the models are examined by internal and external validation.
Experimental zebrafish embryo toxicity data were collected from the ECOTOX database, the ECHA database and the literature. Duplicate data were checked, and a data set containing 348 compounds was finally established.
The technical scheme of the invention is as follows:
A method for predicting the toxicity of chemicals to zebrafish embryos by establishing QSAR models comprises the following steps:
Corresponding QSAR models are established according to the different modes of action of the 348 compounds. First, the modes of action of the 348 compounds are divided into six classes: inert compounds, weakly inert compounds, reactive compounds, compounds acting by a specific mechanism, and unclassifiable compounds, the last being subdivided into COOR-containing and COOR-free compounds. Each class of compounds is randomly split into a training set and a validation set at a ratio of 4:1; the training-set compounds are used to build the model, and the validation-set compounds are used to evaluate its external predictive ability. The structures of the compounds in the data set are optimized to obtain stable conformations, Dragon descriptors are calculated from these structures, and the molecular descriptors are screened by multiple linear regression (MLR) analysis to build the prediction models:
(1) Prediction model for inert compounds:
$$\log LC_{50} = -1.556\,\text{SM4\_B(p)} + 13.738\,\text{R5e+} - 1.983\,\text{Mor16m} - 0.223\,\text{RDF075m} + 7.375 \quad (1)$$
where $n_{tra} = 30$, $R^2_{tra} = 0.91$, $R^2_{adj,tra} = 0.89$, $RMSE_{tra} = 0.51$, $Q^2_{LOO} = 0.85$, $n_{ext} = 9$, $R^2_{ext} = 0.85$, $RMSE_{ext} = 0.59$; SM4_B(p) is the spectral moment of order 4 from the Burden matrix weighted by polarizability; R5e+ is the R maximal autocorrelation of lag 5 weighted by Sanderson electronegativity; Mor16m is signal 16 weighted by mass; RDF075m is the radial distribution function 075 weighted by mass; $n_{tra}$ and $n_{ext}$ are the numbers of compounds in the training set and the validation set, respectively; $R^2$ is the coefficient of determination and $R^2_{adj}$ is the adjusted coefficient of determination; RMSE is the root mean square error; $Q^2_{LOO}$ is the leave-one-out cross-validation coefficient;
(2) Prediction model for weakly inert compounds:
$$\log LC_{50} = -0.962\,\text{SpMax5\_Bh(s)} + 0.689\,\text{nHDon} + 0.391\,\text{GATS5s} + 0.177\,\text{Mor08i} + 0.922 \quad (2)$$
where $n_{tra} = 18$, $R^2_{tra} = 0.98$, $R^2_{adj,tra} = 0.97$, $RMSE_{tra} = 0.13$, $Q^2_{LOO} = 0.96$, $n_{ext} = 6$, $R^2_{ext} = 0.94$, $RMSE_{ext} = 0.34$; SpMax5_Bh(s) is the largest eigenvalue no. 5 of the Burden matrix weighted by I-state; nHDon is the number of hydrogen-bond donor atoms (N and O); GATS5s is the Geary autocorrelation of lag 5 weighted by I-state; Mor08i is signal 08 weighted by ionization potential;
(3) Prediction model for reactive compounds:
$$\log LC_{50} = -0.013\,\text{P\_VSA\_e\_2} - 1.48\,\text{B09[C-O]} + 0.607\,\text{Hy} - 12.256\,\text{R4p+} + 1.463\,\text{nOHp} + 0.281 \quad (3)$$
where $n_{tra} = 35$, $R^2_{tra} = 0.86$, $R^2_{adj,tra} = 0.84$, $RMSE_{tra} = 0.44$, $Q^2_{LOO} = 0.78$, $n_{ext} = 10$, $R^2_{ext} = 0.78$, $RMSE_{ext} = 0.56$; P_VSA_e_2 is the P_VSA-like descriptor on Sanderson electronegativity, bin 2; B09[C-O] is the presence/absence of C-O at topological distance 9; Hy is the hydrophilic factor; R4p+ is the R maximal autocorrelation of lag 4 weighted by polarizability; nOHp is the number of primary alcohols;
(4) Prediction model for compounds acting by a specific mechanism:
$$\log LC_{50} = -0.221\,\text{RDF085p} + 0.887\,\text{Eig10\_EA(dm)} - 1.833\,\text{B09[C-S]} + 0.927\,\text{GATS7i} - 0.698\,\text{Mor28m} - 1.93 \quad (4)$$
where $n_{tra} = 47$, $R^2_{tra} = 0.82$, $R^2_{adj,tra} = 0.80$, $RMSE_{tra} = 0.39$, $Q^2_{LOO} = 0.75$, $n_{ext} = 12$, $R^2_{ext} = 0.76$, $RMSE_{ext} = 0.47$; RDF085p is the radial distribution function 085 weighted by polarizability; Eig10_EA(dm) is eigenvalue no. 10 from the edge adjacency matrix weighted by dipole moment; B09[C-S] is the presence/absence of C-S at topological distance 9; GATS7i is the Geary autocorrelation of lag 7 weighted by ionization potential; Mor28m is signal 28 weighted by mass;
Because the unclassifiable chemicals are heterogeneous, a single unified model predicts them poorly. The MLR analysis showed that the toxicity of COOR-containing substances correlates strongly, so the COOR-containing substances were modeled separately.
(5) Prediction model for unclassifiable compounds containing COOR:
$$\log LC_{50} = -4.96\,\text{TDB10m} + 2.479\,\text{nOHp} - 1.592\,\text{MATS7s} + 1.659\,\text{NssNH} - 0.367\,\text{L3s} - 1.382 \quad (5)$$
where $n_{tra} = 32$, $R^2_{tra} = 0.88$, $R^2_{adj,tra} = 0.86$, $RMSE_{tra} = 0.39$, $Q^2_{LOO} = 0.83$, $n_{ext} = 8$, $R^2_{ext} = 0.76$, $RMSE_{ext} = 0.43$; TDB10m is the 3D topological distance based descriptor of lag 10 weighted by mass; nOHp is the number of primary alcohols; MATS7s is the Moran autocorrelation of lag 7 weighted by I-state; NssNH is the number of atoms of type ssNH; L3s is the third-component size directional WHIM index weighted by I-state;
(6) Prediction model for unclassifiable compounds without COOR:
$$\log LC_{50} = -2.496\,\text{SM6\_B(p)} + 0.184\,\text{Eta\_betaS} - 0.419\,\text{Eig10\_AEA(dm)} - 6.637\,\text{X3A} + 0.694\,\text{B03[O-O]} + 19.193 \quad (6)$$
where $n_{tra} = 110$, $R^2_{tra} = 0.71$, $R^2_{adj,tra} = 0.70$, $RMSE_{tra} = 0.69$, $Q^2_{LOO} = 0.68$, $n_{ext} = 31$, $R^2_{ext} = 0.74$, $RMSE_{ext} = 0.39$; SM6_B(p) is the spectral moment of order 6 from the Burden matrix weighted by polarizability; Eta_betaS is the eta sigma VEM count; Eig10_AEA(dm) is eigenvalue no. 10 from the augmented edge adjacency matrix weighted by dipole moment; X3A is the average connectivity index of order 3; B03[O-O] is the presence/absence of O-O at topological distance 3.
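The six regression equations can be applied mechanically once the descriptor values are available. The following is a minimal sketch, not the inventors' software: the coefficients are transcribed from equations (1) to (6), while the dictionary keys for the modes of action and the function name `predict_log_lc50` are illustrative assumptions. Descriptor values must come from the user's own Dragon output; none are reproduced here.

```python
# Coefficients and intercepts transcribed from equations (1)-(6).
# The mode-of-action keys and the function name are hypothetical labels.
MODELS = {
    "inert": ({"SM4_B(p)": -1.556, "R5e+": 13.738,
               "Mor16m": -1.983, "RDF075m": -0.223}, 7.375),
    "weakly_inert": ({"SpMax5_Bh(s)": -0.962, "nHDon": 0.689,
                      "GATS5s": 0.391, "Mor08i": 0.177}, 0.922),
    "reactive": ({"P_VSA_e_2": -0.013, "B09[C-O]": -1.48, "Hy": 0.607,
                  "R4p+": -12.256, "nOHp": 1.463}, 0.281),
    "specific": ({"RDF085p": -0.221, "Eig10_EA(dm)": 0.887,
                  "B09[C-S]": -1.833, "GATS7i": 0.927,
                  "Mor28m": -0.698}, -1.93),
    "unclassified_coor": ({"TDB10m": -4.96, "nOHp": 2.479,
                           "MATS7s": -1.592, "NssNH": 1.659,
                           "L3s": -0.367}, -1.382),
    "unclassified_no_coor": ({"SM6_B(p)": -2.496, "Eta_betaS": 0.184,
                              "Eig10_AEA(dm)": -0.419, "X3A": -6.637,
                              "B03[O-O]": 0.694}, 19.193),
}


def predict_log_lc50(mode, descriptors):
    """Evaluate the linear model for one compound.

    mode        -- one of the keys of MODELS
    descriptors -- mapping from descriptor name to its numeric value
    """
    coefficients, intercept = MODELS[mode]
    missing = [name for name in coefficients if name not in descriptors]
    if missing:
        raise KeyError(f"missing descriptors for {mode}: {missing}")
    return intercept + sum(
        weight * descriptors[name] for name, weight in coefficients.items()
    )
```

Keeping each model as a plain coefficient table mirrors the transparency argument made for multiple linear regression: every prediction is a weighted sum that can be checked by hand.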
The applicability domain of each model is characterized by a Williams plot of the standardized residuals (δ) against the leverage values ($h_i$) of the organic chemicals in the model. The standardized residual (δ) is calculated as:
$$\delta = \frac{y_i - \hat{y}_i}{\sqrt{\sum_{j=1}^{n}(y_j - \hat{y}_j)^2 / (n - p - 1)}}$$
where δ is the standardized residual, $y_i$ and $\hat{y}_i$ are the experimental and predicted values for the i-th compound, n is the number of compounds in the data set, and p is the number of descriptors.
The leverage value ($h_i$) and its warning value ($h^*$) are calculated as:
$$h_i = x_i^{T}(X^{T}X)^{-1}x_i \quad (7)$$
$$h^* = 3(k+1)/n \quad (8)$$
where $x_i$ is the descriptor vector of the i-th compound; $x_i^{T}$ is the transpose of $x_i$; X is the descriptor matrix of all compounds; $X^{T}$ is the transpose of X; $(X^{T}X)^{-1}$ is the inverse of the matrix $X^{T}X$; k is the number of variables in the model and n is the number of training-set samples. A compound whose standardized residual (δ) falls outside (-2, +2) is considered an outlier. For the remaining compounds, the model is considered to predict the logLC50 value toward zebrafish embryos well.
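The leverage test of equations (7) and (8) takes only a few lines of linear algebra. The sketch below assumes NumPy and treats X exactly as the text describes it, as the descriptor matrix of the training compounds; whether a constant column is included in X is not stated in the source, so that choice is left to the caller. The function names are illustrative, and the pseudo-inverse is used only as a numerically safer stand-in for the plain inverse.

```python
import numpy as np


def leverages(X_train, X_query):
    """Leverage h_i = x_i^T (X^T X)^{-1} x_i for each row of X_query (eq. 7)."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.atleast_2d(np.asarray(X_query, dtype=float))
    # pinv equals the ordinary inverse when X^T X is well conditioned.
    xtx_inv = np.linalg.pinv(X_train.T @ X_train)
    # Row-wise quadratic form x_i^T A x_i without building the full hat matrix.
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)


def warning_leverage(k, n):
    """Warning value h* = 3(k + 1) / n (eq. 8)."""
    return 3.0 * (k + 1) / n


def in_domain(h, delta, h_star, delta_limit=2.0):
    """True when the leverage is below h* and |delta| is inside the limit.

    delta_limit=2.0 follows the (-2, +2) interval stated in the text.
    """
    return h < h_star and abs(delta) < delta_limit
```

For the inert-compound model, `warning_leverage(4, 30)` gives 0.5, which matches the warning value quoted for FIG. 7 in Example 1.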
The invention has the beneficial effects that:
The models can be used to predict the zebrafish embryo toxicity of a wide variety of compounds. The method is simple, rapid and inexpensive. It conforms to the OECD guidelines for the development and use of QSAR models, so the zebrafish embryo toxicity predictions of the invention can provide data support for chemical regulation and are of value for the ecological risk assessment of chemicals.
(1) Only PM6 optimization and Dragon 6.0 are required to predict the toxicity of a compound toward zebrafish embryos; the calculation is simple, and the models are readily interpretable mechanistically.
(2) The models are built on comprehensive zebrafish embryo data covering several modes of toxic action, and they have the widest prediction range among compound toxicity prediction models that use zebrafish embryos as receptors.
(3) The models are constructed and evaluated according to the OECD guidelines for the construction and use of QSAR models; they have good goodness-of-fit, robustness and predictive ability and can be used for the risk assessment and management of chemicals.
(4) Linear regression is used for modeling; the algorithm is transparent and simple, which makes the models easy to interpret and favors their application and wide adoption.
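The statistics reported with each model ($R^2$, $R^2_{adj}$, RMSE and $Q^2_{LOO}$) can be reproduced for any multiple linear regression with a short script. The following is a sketch under stated assumptions: it uses NumPy least squares, RMSE is taken as the square root of the mean squared error, and $Q^2_{LOO}$ is computed as one minus PRESS over the total sum of squares about the overall mean, which is one common definition. The source does not state which variants were used, and the function names are illustrative.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict(beta, X):
    """Apply fitted coefficients to a 2-D array of descriptor rows."""
    return beta[0] + np.asarray(X, dtype=float) @ beta[1:]


def r2(y, y_hat):
    """Coefficient of determination."""
    y = np.asarray(y, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(r2_value, n, p):
    """Adjusted R^2 for n compounds and p descriptors."""
    return 1.0 - (1.0 - r2_value) * (n - 1) / (n - p - 1)


def rmse(y, y_hat):
    """Root mean square error (assumed here to divide by n)."""
    return float(np.sqrt(np.mean((np.asarray(y, dtype=float) - y_hat) ** 2)))


def q2_loo(X, y):
    """Leave-one-out Q^2: refit n times, each time predicting the held-out row."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        beta = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict(beta, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)
```

As a consistency check on the reported figures, `adjusted_r2(0.91, 30, 4)` evaluates to about 0.896, in line with the rounded $R^2_{adj,tra}$ given for the inert-compound model.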
Drawings
FIG. 1 is a plot of measured versus predicted logLC50 values for inert chemicals.
FIG. 2 is a plot of measured versus predicted logLC50 values for weakly inert chemicals.
FIG. 3 is a plot of measured versus predicted logLC50 values for reactive chemicals.
FIG. 4 is a plot of measured versus predicted logLC50 values for chemicals acting by a specific mechanism.
FIG. 5 is a plot of measured versus predicted logLC50 values for unclassifiable chemicals (with COOR).
FIG. 6 is a plot of measured versus predicted logLC50 values for unclassifiable chemicals (without COOR).
FIG. 7 is a Williams plot for inert chemicals.
FIG. 8 is a Williams plot for weakly inert chemicals.
FIG. 9 is a Williams plot for reactive chemicals.
FIG. 10 is a Williams plot for chemicals acting by a specific mechanism.
FIG. 11 is a Williams plot for unclassifiable chemicals (with COOR).
FIG. 12 is a Williams plot for unclassifiable chemicals (without COOR).
Detailed Description
The following further describes a specific embodiment of the present invention with reference to the drawings and technical solutions.
Example 1
Given the compound isoamyl alcohol (CAS No. 123-51-3), its logLC50 value is to be predicted. First, isoamyl alcohol is identified as an inert chemical from its SMILES code. Its 3D structure is generated with Open Babel and optimized with the PM6 method, and the descriptor values are calculated with Dragon 6.0 from the optimized 3D structure. The leverage value h calculated from equation (7) is 0.087. The Williams plot for inert chemicals in FIG. 7 shows that the warning leverage (h*) for this class is 0.5. Since the leverage of isoamyl alcohol is below 0.5, the compound lies within the applicability domain of the model. Substituting the descriptor values into equation (1) gives a predicted logLC50 of 1.24; the experimentally determined logLC50 is 1.06, so the predicted and experimental values agree well.
Example 2
Given the compound 2,4,5-trichlorophenol (CAS No. 95-95-4), its logLC50 value is to be predicted. First, 2,4,5-trichlorophenol is identified as a weakly inert chemical from its SMILES code. Its 3D structure is generated with Open Babel and optimized with the PM6 method, and the descriptor values are calculated with Dragon 6.0 from the optimized 3D structure. The leverage value h calculated from equation (7) is 0.147. The Williams plot for weakly inert chemicals in FIG. 8 shows that the warning leverage (h*) for this class is 0.833. Since the leverage of 2,4,5-trichlorophenol is below 0.833, the compound lies within the applicability domain of the model. Substituting the descriptor values into equation (2) gives a predicted logLC50 of -1.73; the experimentally determined logLC50 is -1.99, so the predicted and experimental values agree well.
Example 3
Given the compound cyhalofop-butyl (CAS No. 122008-85-9), its logLC50 value is to be predicted. First, cyhalofop-butyl is identified as an unclassifiable chemical from its SMILES code. Its 3D structure is generated with Open Babel and optimized with the PM6 method, and the descriptor values are calculated with Dragon 6.0 from the optimized 3D structure. Among the unclassifiable chemicals, those containing COOR are identified from the values of the descriptors nRCOOR and nACOR. The leverage value h calculated from equation (7) is 0.068. The Williams plot for COOR-containing chemicals in FIG. 11 shows that the warning leverage (h*) for this class is 0.5625. Since the leverage of cyhalofop-butyl is below 0.5625, the compound lies within the applicability domain of the model. Substituting the descriptor values into equation (5) gives a predicted logLC50 of -2.62; the experimentally determined logLC50 is -2.53, so the predicted and experimental values agree well.
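The domain checks in the three examples can be retraced from numbers already given in the text. The sketch below recomputes the warning leverage h* = 3(k+1)/n from the training-set sizes and descriptor counts of models (1), (2) and (5), compares it with the quoted leverage h, and reports the absolute difference between predicted and experimental logLC50. The tuple layout and the function name `summarize` are illustrative assumptions; all numeric values are taken from the examples and model statistics above.

```python
# (name, n_tra, k descriptors, leverage h, predicted logLC50, experimental logLC50)
EXAMPLES = [
    ("isoamyl alcohol", 30, 4, 0.087, 1.24, 1.06),
    ("2,4,5-trichlorophenol", 18, 4, 0.147, -1.73, -1.99),
    ("cyhalofop-butyl", 32, 5, 0.068, -2.62, -2.53),
]


def summarize(examples):
    """Recompute h* and the prediction error for each worked example."""
    rows = []
    for name, n_tra, k, h, predicted, experimental in examples:
        h_star = 3 * (k + 1) / n_tra  # equation (8)
        rows.append({
            "name": name,
            "h_star": h_star,
            "in_domain": h < h_star,
            "abs_error": abs(predicted - experimental),
        })
    return rows


if __name__ == "__main__":
    for row in summarize(EXAMPLES):
        print(row)
```

The recomputed warning values (0.5, about 0.833 and 0.5625) coincide with those read from the Williams plots in the examples, and the absolute errors are 0.18, 0.26 and 0.09 log units.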

Claims (1)

1. A method for predicting the toxicity of chemicals to zebrafish embryos by establishing QSAR models, characterized by comprising the following steps:
corresponding QSAR models are established according to the different modes of action of 348 compounds; first, the modes of action of the 348 compounds are divided into six classes: inert compounds, weakly inert compounds, reactive compounds, compounds acting by a specific mechanism, and unclassifiable compounds, the last being subdivided into COOR-containing and COOR-free compounds; each class of compounds is randomly split into a training set and a validation set at a ratio of 4:1; the training-set compounds are used to build the model, and the validation-set compounds are used to evaluate its external predictive ability; the structures of the compounds in the data set are optimized to obtain stable conformations, Dragon descriptors are calculated from these structures, and the molecular descriptors are screened by multiple linear regression (MLR) analysis to build the prediction models;
(1) prediction model for inert compounds:
$$\log LC_{50} = -1.556\,\text{SM4\_B(p)} + 13.738\,\text{R5e+} - 1.983\,\text{Mor16m} - 0.223\,\text{RDF075m} + 7.375 \quad (1)$$
where $n_{tra} = 30$, $R^2_{tra} = 0.91$, $R^2_{adj,tra} = 0.89$, $RMSE_{tra} = 0.51$, $Q^2_{LOO} = 0.85$, $n_{ext} = 9$, $R^2_{ext} = 0.85$, $RMSE_{ext} = 0.59$; SM4_B(p) is the spectral moment of order 4 from the Burden matrix weighted by polarizability; R5e+ is the R maximal autocorrelation of lag 5 weighted by Sanderson electronegativity; Mor16m is signal 16 weighted by mass; RDF075m is the radial distribution function 075 weighted by mass; $n_{tra}$ and $n_{ext}$ are the numbers of compounds in the training set and the validation set, respectively; $R^2$ is the coefficient of determination and $R^2_{adj}$ is the adjusted coefficient of determination; RMSE is the root mean square error; $Q^2_{LOO}$ is the leave-one-out cross-validation coefficient;
(2) prediction model for weakly inert compounds:
$$\log LC_{50} = -0.962\,\text{SpMax5\_Bh(s)} + 0.689\,\text{nHDon} + 0.391\,\text{GATS5s} + 0.177\,\text{Mor08i} + 0.922 \quad (2)$$
where $n_{tra} = 18$, $R^2_{tra} = 0.98$, $R^2_{adj,tra} = 0.97$, $RMSE_{tra} = 0.13$, $Q^2_{LOO} = 0.96$, $n_{ext} = 6$, $R^2_{ext} = 0.94$, $RMSE_{ext} = 0.34$; SpMax5_Bh(s) is the largest eigenvalue no. 5 of the Burden matrix weighted by I-state; nHDon is the number of hydrogen-bond donor atoms (N and O); GATS5s is the Geary autocorrelation of lag 5 weighted by I-state; Mor08i is signal 08 weighted by ionization potential;
(3) prediction model for reactive compounds:
$$\log LC_{50} = -0.013\,\text{P\_VSA\_e\_2} - 1.48\,\text{B09[C-O]} + 0.607\,\text{Hy} - 12.256\,\text{R4p+} + 1.463\,\text{nOHp} + 0.281 \quad (3)$$
where $n_{tra} = 35$, $R^2_{tra} = 0.86$, $R^2_{adj,tra} = 0.84$, $RMSE_{tra} = 0.44$, $Q^2_{LOO} = 0.78$, $n_{ext} = 10$, $R^2_{ext} = 0.78$, $RMSE_{ext} = 0.56$; P_VSA_e_2 is the P_VSA-like descriptor on Sanderson electronegativity, bin 2; B09[C-O] is the presence/absence of C-O at topological distance 9; Hy is the hydrophilic factor; R4p+ is the R maximal autocorrelation of lag 4 weighted by polarizability; nOHp is the number of primary alcohols;
(4) prediction model for compounds acting by a specific mechanism:
$$\log LC_{50} = -0.221\,\text{RDF085p} + 0.887\,\text{Eig10\_EA(dm)} - 1.833\,\text{B09[C-S]} + 0.927\,\text{GATS7i} - 0.698\,\text{Mor28m} - 1.93 \quad (4)$$
where $n_{tra} = 47$, $R^2_{tra} = 0.82$, $R^2_{adj,tra} = 0.80$, $RMSE_{tra} = 0.39$, $Q^2_{LOO} = 0.75$, $n_{ext} = 12$, $R^2_{ext} = 0.76$, $RMSE_{ext} = 0.47$; RDF085p is the radial distribution function 085 weighted by polarizability; Eig10_EA(dm) is eigenvalue no. 10 from the edge adjacency matrix weighted by dipole moment; B09[C-S] is the presence/absence of C-S at topological distance 9; GATS7i is the Geary autocorrelation of lag 7 weighted by ionization potential; Mor28m is signal 28 weighted by mass;
(5) prediction model for unclassifiable compounds containing COOR:
$$\log LC_{50} = -4.96\,\text{TDB10m} + 2.479\,\text{nOHp} - 1.592\,\text{MATS7s} + 1.659\,\text{NssNH} - 0.367\,\text{L3s} - 1.382 \quad (5)$$
where $n_{tra} = 32$, $R^2_{tra} = 0.88$, $R^2_{adj,tra} = 0.86$, $RMSE_{tra} = 0.39$, $Q^2_{LOO} = 0.83$, $n_{ext} = 8$, $R^2_{ext} = 0.76$, $RMSE_{ext} = 0.43$; TDB10m is the 3D topological distance based descriptor of lag 10 weighted by mass; nOHp is the number of primary alcohols; MATS7s is the Moran autocorrelation of lag 7 weighted by I-state; NssNH is the number of atoms of type ssNH; L3s is the third-component size directional WHIM index weighted by I-state;
(6) prediction model for unclassifiable compounds without COOR:
$$\log LC_{50} = -2.496\,\text{SM6\_B(p)} + 0.184\,\text{Eta\_betaS} - 0.419\,\text{Eig10\_AEA(dm)} - 6.637\,\text{X3A} + 0.694\,\text{B03[O-O]} + 19.193 \quad (6)$$
where $n_{tra} = 110$, $R^2_{tra} = 0.71$, $R^2_{adj,tra} = 0.70$, $RMSE_{tra} = 0.69$, $Q^2_{LOO} = 0.68$, $n_{ext} = 31$, $R^2_{ext} = 0.74$, $RMSE_{ext} = 0.39$; SM6_B(p) is the spectral moment of order 6 from the Burden matrix weighted by polarizability; Eta_betaS is the eta sigma VEM count; Eig10_AEA(dm) is eigenvalue no. 10 from the augmented edge adjacency matrix weighted by dipole moment; X3A is the average connectivity index of order 3; B03[O-O] is the presence/absence of O-O at topological distance 3.
CN201911139387.8A 2019-11-20 2019-11-20 Method for predicting toxicity of chemicals by taking zebra fish embryos as receptors through building QSAR model Active CN110910970B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911139387.8A CN110910970B (en) 2019-11-20 2019-11-20 Method for predicting toxicity of chemicals by taking zebra fish embryos as receptors through building QSAR model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911139387.8A CN110910970B (en) 2019-11-20 2019-11-20 Method for predicting toxicity of chemicals by taking zebra fish embryos as receptors through building QSAR model

Publications (2)

Publication Number Publication Date
CN110910970A (en) 2020-03-24
CN110910970B (en) 2022-05-13

Family

ID=69818137

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911139387.8A Active CN110910970B (en) 2019-11-20 2019-11-20 Method for predicting toxicity of chemicals by taking zebra fish embryos as receptors through building QSAR model

Country Status (1)

Country Link
CN (1) CN110910970B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103761431A (en) * 2014-01-10 2014-04-30 大连理工大学 Method for predicting fish bio-concentration factors of organic chemicals by quantitative structure-activity relationship
CN105044317A (en) * 2015-08-26 2015-11-11 广东省微生物研究所 Method for predicating embryotoxicity of non-steroidal anti-inflammatory drug type novel pollutants on early-phase life stage of zebra fish
US20180101664A1 (en) * 2015-06-16 2018-04-12 Chinese Research Academy Of Environmental Sciences Qsar toxicity prediction method for evaluating health effect of nano-crystalline metal oxide
CN108733970A (en) * 2018-05-16 2018-11-02 常州大学 Method for predicting the acute toxicity of organophosphorus flame retardants to zebrafish based on combined QSAR/QEcoSAR methods

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XU TONG; CHEN JINGWEN; LI CHAO; LI XUEHUA: "QSAR models for predicting hydroxyl radical reaction rate constants with organic chemicals in the atmosphere", 《HUANJING HUAXUE-ENVIRONMENTAL CHEMISTRY》 *
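Song Zhihui et al.: "Acute toxicity of chlorophenols to zebrafish and QSAR study", 《环境科学与技术》 (Environmental Science & Technology) *
Chen Jingwen et al.: "(Q)SAR techniques for ecological risk assessment of toxic organic chemicals: progress and prospects", 《中国科学 B辑:化学》 (Science in China Series B: Chemistry) *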

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634993A (en) * 2020-12-30 2021-04-09 中国科学院生态环境研究中心 Prediction model and screening method for activation activity of estrogen receptor of chemicals
CN113345524A (en) * 2021-06-02 2021-09-03 北京市疾病预防控制中心 Method and device for screening toxicological data and storage medium
CN113345524B (en) * 2021-06-02 2023-10-20 北京市疾病预防控制中心 Toxicology data screening method, screening device and storage medium

Also Published As

Publication number Publication date
CN110910970B (en) 2022-05-13

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant