CN111768813A - Method for predicting organic PDMS membrane-water distribution coefficient based on SW-SVM algorithm quantitative structure-activity relationship model


Publication number
CN111768813A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010645135.9A
Other languages
Chinese (zh)
Inventor
朱腾义
陈文瑄
程浩淼
李懿
王坤
吴晶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangzhou University
Original Assignee
Yangzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C - COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 - Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 - Prediction of properties of chemical compounds, compositions or mixtures
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C - COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C10/00 - Computational theoretical chemistry, i.e. ICT specially adapted for theoretical aspects of quantum chemistry, molecular mechanics, molecular dynamics or the like
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C - COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 - Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 - Machine learning, data mining or chemometrics

Landscapes

  • Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for predicting the PDMS membrane-water distribution coefficient of organic compounds. Molecular descriptors are calculated from the molecular structure of existing compounds, and stepwise linear regression is combined with a support vector machine (SW-SVM) to construct a quantitative structure-property relationship model that predicts the $K_{\text{PDMS-w}}$ value of organic compounds rapidly and efficiently. The method is simple, fast and inexpensive, and saves the manpower, materials and funds that experimental measurement would require. The nonlinear model is developed in the R language and has good goodness of fit, robustness, generalization ability and prediction ability. Within its application domain, the invention can effectively predict the PDMS film/water distribution coefficient of organic compounds, fills data gaps for other compounds, and provides the basic data needed for environmental monitoring of compounds and for the application of passive samplers.

Description

Method for predicting organic PDMS membrane-water distribution coefficient based on SW-SVM algorithm quantitative structure-activity relationship model
Technical Field
The invention relates to a method for predicting the PDMS (polydimethylsiloxane) film/water distribution coefficient of organic compounds, and in particular to a method based on a quantitative structure-activity relationship model built with the SW-SVM (stepwise linear regression-support vector machine) algorithm.
Background
Membrane-based passive sampling is a widely accepted technique for measuring the freely dissolved concentration of organic compounds and assessing their environmental exposure risk. PDMS (polydimethylsiloxane) films are among the most widely used passive sampling materials because of their good thermal stability and high affinity for hydrophobic compounds. The partition coefficient of an organic substance between the PDMS membrane and water ($K_{\text{PDMS-w}}$) is an important parameter for evaluating the environmental behavior of the compound and is also key to the successful application of a passive sampler. Conventional experimental measurement is time-consuming and labor-intensive and cannot keep up with the monitoring and management of a large and growing number of organic pollutants, so developing a simple and accurate theoretical method for estimating the $K_{\text{PDMS-w}}$ of organic compounds is particularly important.
The quantitative structure-property relationship (QSPR) is a computer modeling method that correlates the molecular structure of organic compounds with their physicochemical properties, environmental behavior and toxicological parameters. It can reduce or replace experiments, compensate for missing experimental data and lower experimental cost. At present there are few methods for predicting the $K_{\text{PDMS-w}}$ of organic compounds, and nonlinear methods in particular are scarce; existing models cover only a small number of compounds of a single class, and their prediction accuracy still needs to be improved. Because the environmental distribution of organic compounds is a complex process, the distribution coefficient may involve nonlinear relationships. It is therefore necessary to construct a nonlinear $K_{\text{PDMS-w}}$ prediction model that covers many kinds of compounds, has a well-defined algorithm, is easy to apply and disseminate, and does not depend on experimental data, and to verify and characterize the model according to the OECD guidelines.
Given the high dimensionality of the descriptor set, selecting the most useful subset of the original variables for modeling becomes increasingly important. To select reasonable molecular descriptors for the QSPR model, stepwise linear regression is used as a screening method to reduce the number of variables. To establish a reliable QSPR model, a support vector machine (SVM) regression algorithm is used as the nonlinear algorithm; it is simple to implement and has good robustness and generalization ability. The model is built by calling the R package e1071, and setting the random seed makes it reproducible.
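The descriptor screening described above can be illustrated with a minimal forward stepwise selection in Python. This is a sketch under stated assumptions, not the patent's procedure: the patent does not give its entry or exit criterion, so the stopping rule (`min_gain`, a relative reduction in residual sum of squares) is hypothetical, as are the function names and the toy 2x2x2 data set.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit on the chosen columns plus an intercept."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(rows, y))


def forward_stepwise(X, y, min_gain=0.05):
    """Greedily add the column that lowers RSS most; stop when the relative gain drops below min_gain."""
    selected, remaining = [], list(range(len(X[0])))
    current = rss(X, y, selected)
    while remaining:
        best = min(remaining, key=lambda c: rss(X, y, selected + [c]))
        new = rss(X, y, selected + [best])
        if current == 0 or (current - new) / current < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        current = new
    return selected


# Toy data: the response depends on columns 0 and 2 only; column 1 carries no signal.
X = [[a, b, c] for a in (0.0, 1.0) for b in (0.0, 1.0) for c in (0.0, 1.0)]
y = [3.0 + 2.0 * a - c + 0.1 * (2 * a - 1) * (2 * b - 1) * (2 * c - 1) for a, b, c in X]
print(forward_stepwise(X, y))
```

On this toy design the uninformative column is never selected, because adding it does not reduce the residual sum of squares. A real screening run would start from the preprocessed alvaDesc descriptors and would use whatever statistical criterion the stepwise implementation provides.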
Disclosure of Invention
The invention aims to provide a method for predicting the PDMS membrane-water distribution coefficient of organic compounds based on a quantitative structure-activity relationship model built with the SW-SVM algorithm. The method predicts the $K_{\text{PDMS-w}}$ of an organic compound rapidly and effectively, directly from its molecular structure descriptors. This supports pollutant risk assessment, helps decision makers and managers set chemical emission standards, and offers a new approach to environmental management.
The purpose of the invention is achieved as follows: a method for predicting the PDMS membrane-water distribution coefficient of organic compounds based on a quantitative structure-activity relationship model built with the SW-SVM algorithm, characterized by comprising the following steps:
step 1) data collection: $\log K_{\text{PDMS-w}}$ values of a number of organic compounds are collected from the literature; the resulting data set is ordered by $\log K_{\text{PDMS-w}}$ value, 1/5 of the compounds are extracted as verification set data, and the rest are training set data;
step 2) descriptor computation: the initial molecular structure of each organic compound is optimized with the MM2 molecular mechanics method, its molecular structure descriptors are calculated with alvaDesc 1.0.0, and, after preprocessing, the final descriptors are screened out by stepwise linear regression;
step 3) model construction: with the final descriptors as independent variables and the logarithm of the PDMS-water distribution coefficient, $\log K_{\text{PDMS-w}}$, as the dependent variable, a QSPR prediction model is built on the training set with a support vector machine regression algorithm; the optimized parameters are selected by k-fold cross-validation, and the QSPR model based on the optimal SW-SVM algorithm is constructed;
step 4) model verification: the model is verified in two parts: a) evaluation of goodness of fit and robustness; b) application domain characterization and performance evaluation; step 5) is entered after the model passes verification;
step 5) application domain characterization: the application domain of the model is characterized by a Williams diagram;
step 6) model application: the model is used to predict the PDMS film/water distribution coefficient of unknown compounds.
As a further limitation of the present invention, in step 1), the organic compounds include polycyclic aromatic hydrocarbons, polychlorinated biphenyls, benzenes, pesticides, ethers, dioxins, esters, aliphatics, hydrocarbons and nitrogen- and sulfur-containing compounds.
As a further limitation of the present invention, in step 1), where several values have been collected for the same substance, values that deviate markedly from the rest are removed and the mean of the remaining values is used for model construction.
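The consolidation of replicate literature values can be sketched as follows. The patent does not say how a "markedly deviating" value is identified, so the rule used here (distance from the median above `max_dev` log units) is an assumption, and the compound names and numbers are purely illustrative.

```python
from statistics import mean, median


def consolidate(values, max_dev=1.0):
    """Drop replicates farther than max_dev log units from the median, then average the rest."""
    centre = median(values)
    kept = [v for v in values if abs(v - centre) <= max_dev]
    return mean(kept)


# Hypothetical replicate log K values for two compounds (not taken from the patent).
collected = {"compound A": [3.10, 3.20, 3.30, 5.90], "compound B": [2.00]}
dataset = {name: consolidate(vals) for name, vals in collected.items()}
print(dataset)
```

For compound A the value 5.90 lies more than one log unit from the median and is discarded before averaging; a compound with a single value is passed through unchanged.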
As a further limitation of the present invention, the organic compounds in the training set of step 1) are used to construct the model and for internal verification, and the organic compounds in the verification set are used for external verification of the model.
As a further limitation of the present invention, in step 2), the preprocessing procedure includes removing descriptors that are constant or near-constant, have missing values, or have a correlation greater than 0.95.
As a further limitation of the present invention, in step 3), constructing the QSPR model based on the optimal SW-SVM algorithm with an R package specifically includes the following process:
step 3-1, the whole data set is first divided into k sets; each set is used in turn as the test set while the remaining sets form the training set, and training and testing are repeated k times so that every set serves as the test set exactly once;
step 3-2, the average cross-validation accuracy over the k rounds is calculated and compared, and the parameter pair (cost, gamma) with the highest cross-validation accuracy is selected and applied, as the optimal value from k-fold cross-validation, to support vector machine regression; the penalty factor cost controls the relative weight of the structural risk and the empirical risk of the model and so determines the quality of the model, while gamma determines the distribution of the data after mapping to the new feature space; the prediction model uses a radial basis kernel function, for which $\gamma = 1/(2\sigma^2)$, where $\sigma$ is the width parameter of the function and controls its radial range of action;
step 3-3, the parameters are applied to the model to construct the optimized model.
As a further limitation of the present invention, in step 4), the goodness-of-fit and robustness evaluation indexes used in model verification are: the degree-of-freedom-corrected (adjusted) coefficient of determination [Figure BDA0002572759970000041], the training set root mean square error $\text{RMSE}_{\text{tra}}$ and the training set mean absolute error $\text{MAE}_{\text{tra}}$.
As a further limitation of the present invention, step 5) specifically comprises: the application domain of the model is characterized by a Williams diagram of standardized residuals against leverage values $h_i$; a compound whose standardized residual has an absolute value greater than 3.0 is an outlier, and a leverage value $h_i$ greater than the warning value $h^*$ indicates that the structure of the compound differs significantly from the structures of the other compounds; $h_i$ and $h^*$ are calculated by the following formulas:
$$h_i = x_i^{T}(X^{T}X)^{-1}x_i$$
$$h^* = 3(p+1)/n$$
where $x_i$ is the descriptor vector of the ith compound; $x_i^{T}$ is the transpose of $x_i$; $X$ is the descriptor matrix of all compounds; $X^{T}$ is the transpose of $X$; $(X^{T}X)^{-1}$ is the inverse of the matrix $X^{T}X$; $p$ is the number of variables in the model and $n$ is the number of data points in the data set.
Compared with the prior art, the invention has the beneficial effects that:
1. the QSPR model is established according to the OECD guidelines for constructing and using QSPR models and has good goodness of fit, robustness and prediction ability;
2. the model has a wide range of application, covers organic compounds of various structures, and can be used to predict the $K_{\text{PDMS-w}}$ values of different compounds, providing basic data for global environmental behavior analysis and ecological risk assessment of organic compounds;
3. the model is entirely computational, unlike earlier methods that rely on experimental values, so experimental cost is greatly reduced and the $K_{\text{PDMS-w}}$ values of chemicals are obtained more efficiently;
4. the invention predicts the PDMS film/water distribution coefficient of many kinds of organic compounds quickly and effectively. The method is inexpensive, simple and fast, and saves a large amount of manpower, materials and funds. The QSPR model is established and verified strictly according to the OECD guidelines for constructing and using QSPR models, so it is accurate and reliable; the $K_{\text{PDMS-w}}$ values it provides are important basic data for chemical regulation and have guiding significance for ecological risk assessment.
Drawings
FIG. 1 is a flow chart of a prediction method according to the present invention.
FIG. 2 is a plot of experimental versus predicted $\log K_{\text{PDMS-w}}$ values for the data set of the present invention.
FIG. 3 is a Williams diagram of the present invention.
Detailed Description
A method for predicting the membrane-water distribution coefficient of organic PDMS based on the quantitative structure-activity relationship model of SW-SVM algorithm as shown in figure 1 comprises the following steps.
Step 1) The $\log K_{\text{PDMS-w}}$ values of 347 organic compounds were collected from the literature. Where several values were collected for the same substance, values deviating markedly from the rest were removed and the mean of the remaining values was used for model construction. The organic compounds include polycyclic aromatic hydrocarbons, polychlorinated biphenyls, benzenes, pesticides, ethers, dioxins, esters, aliphatics, hydrocarbons and nitrogen- and sulfur-containing compounds. The data set was sorted by $\log K_{\text{PDMS-w}}$ value; the first 4/5 of the organic compounds were taken as training set data and the rest as verification set data, giving 277 organic compounds in the training set and 70 in the verification set. The training set compounds are used to construct the model and for internal verification, and the verification set compounds are used for external verification of the model.
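The split sizes quoted in step 1 can be checked with a short Python sketch. It assumes an ascending sort (the text does not state the direction) and uses placeholder records, since the 347 collected values are not reproduced in this document; `split_sorted` is a hypothetical helper name.

```python
def split_sorted(records, train_fraction=4 / 5):
    """Sort (name, logK) records by logK and put the first train_fraction of them in the training set."""
    ordered = sorted(records, key=lambda r: r[1])
    n_train = int(len(ordered) * train_fraction)
    return ordered[:n_train], ordered[n_train:]


# 347 placeholder records stand in for the real data set.
records = [(f"compound_{i}", 0.01 * i) for i in range(347)]
training, validation = split_sorted(records)
print(len(training), len(validation))  # 277 70
```

Rounding 347 x 4/5 = 277.6 down reproduces the 277/70 split reported above.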
Step 2) The initial molecular structure of each organic compound is optimized with the MM2 molecular mechanics method and its molecular structure descriptors are calculated with alvaDesc. Descriptors that are constant or near-constant, have missing values, or have a correlation greater than 0.95 are removed, and the final descriptors are then screened out by stepwise linear regression.
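The descriptor preprocessing in step 2 can be sketched in Python as below. Only the 0.95 correlation cutoff comes from the text; the near-constant threshold (`near_constant`, the share of rows taken by the most frequent value), the keep-first rule for correlated pairs, and the descriptor names are assumptions made for illustration.

```python
from statistics import mean, pstdev


def pearson(a, b):
    """Pearson correlation of two equally long, non-constant value lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / (len(a) * pstdev(a) * pstdev(b))


def prefilter(columns, near_constant=0.8, max_corr=0.95):
    """columns maps descriptor name -> list of values (None marks a missing value)."""
    kept = {}
    for name, vals in columns.items():
        if any(v is None for v in vals):
            continue  # descriptor has missing values
        top_share = max(vals.count(v) for v in set(vals)) / len(vals)
        if top_share >= near_constant:
            continue  # constant (share 1.0) or near-constant descriptor
        if any(abs(pearson(vals, other)) > max_corr for other in kept.values()):
            continue  # too strongly correlated with a descriptor already kept
        kept[name] = vals
    return kept


# Hypothetical descriptor table with one column of each problem type.
columns = {
    "d_const": [1.0] * 6,
    "d_missing": [0.1, None, 0.3, 0.4, 0.5, 0.6],
    "d_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "d_a_scaled": [2.1, 4.0, 6.2, 8.0, 10.1, 12.0],
    "d_b": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
}
print(list(prefilter(columns)))  # ['d_a', 'd_b']
```

Of each highly correlated pair this sketch keeps whichever descriptor appears first; descriptor software may apply a different tie-breaking rule.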
Step 3) With the screened final descriptors as independent variables and the logarithm of the PDMS/water distribution coefficient, $\log K_{\text{PDMS-w}}$, as the dependent variable, the support vector machine regression algorithm of the R package e1071 is called to build the QSPR prediction model, and the optimized parameters (cost, gamma) are selected by k-fold cross-validation. The specific process of constructing the QSPR model based on the optimal SW-SVM algorithm with the R package e1071 is as follows:
step 3-1, the whole data set is first divided into k sets; each set is used in turn as the test set while the remaining sets form the training set, and training and testing are repeated k times so that every set serves as the test set exactly once;
step 3-2, the average cross-validation accuracy over the k rounds is calculated and compared, and the parameter pair (cost, gamma) with the highest cross-validation accuracy is selected and applied, as the optimal value from k-fold cross-validation, to support vector machine regression; the penalty factor cost controls the relative weight of the structural risk and the empirical risk of the model and so determines the quality of the model, while gamma determines the distribution of the data after mapping to the new feature space; the prediction model uses a radial basis kernel function, for which $\gamma = 1/(2\sigma^2)$, where $\sigma$ is the width parameter of the function and controls its radial range of action;
step 3-3, the parameters are applied to the model to construct the optimized model.
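Steps 3-1 to 3-3 can be illustrated with a self-contained Python sketch of k-fold cross-validation over a (cost, gamma) grid. To stay within the standard library, the regressor is a kernel ridge regression with the same radial basis kernel, used here as a stand-in for the e1071 support vector regression named in the text; the toy data, the grid values and all function names are assumptions. The score is the cross-validated mean squared error, so the best pair is the one with the lowest value.

```python
import math


def gamma_from_sigma(sigma):
    """gamma = 1 / (2 * sigma^2), the relation quoted in step 3-2."""
    return 1.0 / (2.0 * sigma ** 2)


def rbf(u, v, gamma):
    """Radial basis kernel exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_predict(train_X, train_y, test_X, cost, gamma):
    """Kernel ridge stand-in: solve (K + I / cost) alpha = y, then predict the test rows."""
    n = len(train_X)
    K = [[rbf(train_X[i], train_X[j], gamma) + (1.0 / cost if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, train_y)
    return [sum(a * rbf(x, t, gamma) for a, t in zip(alpha, train_X)) for x in test_X]


def kfold_mse(X, y, k, cost, gamma):
    """Average test mean squared error over k folds; each fold is the test set exactly once."""
    total = 0.0
    for fold in range(k):
        test_idx = list(range(fold, len(X), k))
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        pred = fit_predict([X[i] for i in train_idx], [y[i] for i in train_idx],
                           [X[i] for i in test_idx], cost, gamma)
        total += sum((p - y[i]) ** 2 for p, i in zip(pred, test_idx)) / len(test_idx)
    return total / k


def grid_search(X, y, k, costs, gammas):
    """Return the (cost, gamma) pair with the lowest cross-validated error."""
    return min(((c, g) for c in costs for g in gammas),
               key=lambda cg: kfold_mse(X, y, k, cg[0], cg[1]))


# Toy one-descriptor data set; the real model uses four descriptors and 277 training compounds.
X = [[i / 10] for i in range(20)]
y = [math.sin(row[0]) for row in X]
costs, gammas = [1.0, 10.0, 100.0], [0.01, 1.0, 100.0]
best = grid_search(X, y, 5, costs, gammas)
print(best)
```

In R, the corresponding search would be run with the tuning facilities of e1071, with the random seed fixed beforehand so that the fold assignment is reproducible.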
Step 4) Verification and characterization of the model comprises two parts: 1) evaluation of goodness of fit and robustness; 2) application domain characterization and performance evaluation.
The fitting ability of the model is characterized by the degree-of-freedom-corrected (adjusted) coefficient of determination [Figure BDA0002572759970000061], the training set root mean square error $\text{RMSE}_{\text{tra}}$ and the training set mean absolute error $\text{MAE}_{\text{tra}}$. In this example, the degree-of-freedom-corrected coefficient of determination is [Figure BDA0002572759970000062], $\text{RMSE}_{\text{tra}}$ = 0.457 and $\text{MAE}_{\text{tra}}$ = 0.329; smaller errors indicate a better fit. [Figure BDA0002572759970000071] The model therefore has good goodness of fit and robustness.
External verification uses the fitting coefficients between predicted and measured values [Figure BDA0002572759970000072] and the concordance correlation coefficient CCC to characterize the external prediction ability of the model. The acceptance criteria are $R^2 > 0.7$, $Q^2 > 0.6$, $R^2 - Q^2 < 0.3$ and CCC > 0.85. In this example, the final model has [Figure BDA0002572759970000073], [Figure BDA0002572759970000074] and CCC = 0.930, which indicates good external prediction ability. The fit and the verification results of the model are shown in FIG. 2.
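The evaluation indexes named in step 4 have standard definitions, sketched below in Python. The formulas for the adjusted coefficient of determination and for Lin's concordance correlation coefficient are the usual textbook ones and are assumed to match those used for the model; the observed and predicted values are toy numbers, and `meets_criteria` simply encodes the four thresholds quoted above.

```python
import math
from statistics import mean


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(obs, pred)))


def mae(obs, pred):
    """Mean absolute error."""
    return mean(abs(o - p) for o, p in zip(obs, pred))


def r2(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    m = mean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - m) ** 2 for o in obs)
    return 1 - ss_res / ss_tot


def adjusted_r2(obs, pred, n_descriptors):
    """R^2 corrected for degrees of freedom: 1 - (1 - R^2)(n - 1) / (n - p - 1)."""
    n = len(obs)
    return 1 - (1 - r2(obs, pred)) * (n - 1) / (n - n_descriptors - 1)


def ccc(obs, pred):
    """Lin's concordance correlation coefficient."""
    mo, mp = mean(obs), mean(pred)
    n = len(obs)
    s_op = sum((o - mo) * (q - mp) for o, q in zip(obs, pred)) / n
    s_o = sum((o - mo) ** 2 for o in obs) / n
    s_p = sum((q - mp) ** 2 for q in pred) / n
    return 2 * s_op / (s_o + s_p + (mo - mp) ** 2)


def meets_criteria(r2_value, q2_value, ccc_value):
    """Acceptance thresholds quoted in the text."""
    return r2_value > 0.7 and q2_value > 0.6 and r2_value - q2_value < 0.3 and ccc_value > 0.85


# Toy observed and predicted log K values (not from the patent).
obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
pred = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
print(rmse(obs, pred), mae(obs, pred), adjusted_r2(obs, pred, 1), ccc(obs, pred))
```

A perfect prediction gives RMSE = 0, $R^2$ = 1 and CCC = 1, which is a convenient sanity check before applying the functions to real training and verification sets.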
Step 5) The application domain of the model is characterized by a Williams diagram of standardized residuals against leverage values $h_i$. It is generally accepted that a compound whose standardized residual has an absolute value greater than 3.0 is an outlier, and that a leverage value $h_i$ greater than the warning value $h^*$ indicates that the structure of the compound differs significantly from the structures of the other compounds. $h_i$ and $h^*$ are calculated by the following formulas:
$$h_i = x_i^{T}(X^{T}X)^{-1}x_i$$
$$h^* = 3(p+1)/n$$
where $x_i$ is the descriptor vector of the ith compound; $x_i^{T}$ is the transpose of $x_i$; $X$ is the descriptor matrix of all compounds; $X^{T}$ is the transpose of $X$; $(X^{T}X)^{-1}$ is the inverse of the matrix $X^{T}X$; $p$ is the number of variables in the model and $n$ is the number of data points in the data set. As shown in FIG. 3, $h^*$ of the model is 0.054, so the model is suitable for predicting the $\log K_{\text{PDMS-w}}$ values of compounds with $h_i$ less than 0.054.
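The two leverage formulas can be evaluated with a short Python sketch. It avoids an explicit matrix inverse by solving $(X^{T}X)z = x_i$ and taking $h_i = x_i^{T}z$, which is algebraically the same quantity. The small descriptor matrix is made up for illustration; the values p = 4 and n = 277 are the descriptor count and training set size given in this document.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def leverages(X):
    """h_i = x_i^T (X^T X)^-1 x_i for every row x_i of the descriptor matrix X."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    return [sum(a * b for a, b in zip(row, solve(xtx, row))) for row in X]


def warning_leverage(p, n):
    """h* = 3 (p + 1) / n."""
    return 3 * (p + 1) / n


# Made-up matrix of five compounds and two descriptors; the last row is structurally extreme.
X = [[0.5, 1.0], [1.5, 0.8], [2.0, 1.2], [3.5, 0.9], [9.0, 1.1]]
h = leverages(X)
print([round(v, 3) for v in h])
print(round(warning_leverage(4, 277), 3))  # 0.054
```

With four descriptors and 277 training compounds, 3(4+1)/277 rounds to 0.054, the warning value reported for the model. In the toy matrix the last compound has by far the largest leverage, which is how a structurally unusual compound shows up on the horizontal axis of a Williams diagram.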
Step 6, model application: the model was used to predict the PDMS film/water distribution coefficient of unknown compounds.
Example 1: Given the compound 3-chlorophenol, predict its $\log K_{\text{PDMS-w}}$ value. The molecular structure of 3-chlorophenol is first optimized with the MM2 molecular mechanics method, and the values of the 4 molecular descriptors BLTD48, Hy, SpMaxA_B(s) and SpMaxA_AEA(dm) are then calculated with alvaDesc 1.0.0 from the optimized structure; they are -3.341, -0.039, 0.789 and 0.374 respectively. The calculated $h_i$ of the substance is 0.0353 < 0.054, so the compound is within the application domain of the model. Substituting the descriptor values into the model gives a predicted $\log K_{\text{PDMS-w}}$ of 0.22; the experimental value is 0.31, so the predicted value is close to the experimental value.
Example 2: Given the compound pentachlorophenol, predict its $\log K_{\text{PDMS-w}}$ value. The molecular structure of pentachlorophenol is first optimized with the MM2 molecular mechanics method, and the values of the 4 molecular descriptors BLTD48, Hy, SpMaxA_B(s) and SpMaxA_AEA(dm) are then calculated with alvaDesc 1.0.0 from the optimized structure; they are -5.033, 0.078, 0.526 and 0.313 respectively. The calculated $h_i$ of the substance is 0.0178 < 0.054, so the compound is within the application domain of the model. Substituting the descriptor values into the model gives a predicted $\log K_{\text{PDMS-w}}$ of 2.68; the experimental value is 2.65, so the predicted value is close to the experimental value.
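The arithmetic in Examples 1 and 2 can be restated as a small Python check. All numbers are quoted from the two examples and from the training set error given earlier; the helper name `in_domain` is hypothetical, and comparing each error with $\text{RMSE}_{\text{tra}}$ is only an informal yardstick, not a criterion stated in the text.

```python
H_STAR = 0.054    # warning leverage reported for the model
RMSE_TRA = 0.457  # training set root mean square error reported for the model


def in_domain(h_i, h_star=H_STAR):
    """A compound lies inside the application domain when its leverage is below h*."""
    return h_i < h_star


# (compound, h_i, predicted log K, experimental log K), quoted from Examples 1 and 2.
examples = [
    ("3-chlorophenol", 0.0353, 0.22, 0.31),
    ("pentachlorophenol", 0.0178, 2.68, 2.65),
]

for name, h_i, predicted, experimental in examples:
    error = predicted - experimental
    print(f"{name}: in domain = {in_domain(h_i)}, prediction error = {error:+.2f}")
```

Both compounds fall inside the application domain, and both prediction errors (-0.09 and +0.03 log units) are well below the training set RMSE.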
The present invention is not limited to the above-mentioned embodiments, and based on the technical solutions disclosed in the present invention, those skilled in the art can make some substitutions and modifications to some technical features without creative efforts according to the disclosed technical contents, and these substitutions and modifications are all within the protection scope of the present invention.

Claims (8)

1. A method for predicting the PDMS membrane-water distribution coefficient of organic compounds based on a quantitative structure-activity relationship model built with the SW-SVM algorithm, characterized by comprising the following steps:
step 1) data collection: $\log K_{\text{PDMS-w}}$ values of a number of organic compounds are collected from the literature; the resulting data set is ordered by $\log K_{\text{PDMS-w}}$ value, 1/5 of the compounds are extracted as verification set data, and the rest are training set data;
step 2) descriptor computation: the initial molecular structure of each organic compound is optimized with the MM2 molecular mechanics method, its molecular structure descriptors are calculated with alvaDesc 1.0.0, and, after preprocessing, the final descriptors are screened out by stepwise linear regression;
step 3) model construction: with the final descriptors as independent variables and the logarithm of the PDMS-water distribution coefficient, $\log K_{\text{PDMS-w}}$, as the dependent variable, a QSPR prediction model is built on the training set with a support vector machine regression algorithm; the optimized parameters are selected by k-fold cross-validation, and the QSPR model based on the optimal SW-SVM algorithm is constructed;
step 4) model verification: the model is verified in two parts: a) evaluation of goodness of fit and robustness; b) application domain characterization and performance evaluation; step 5) is entered after the model passes verification;
step 5) application domain characterization: the application domain of the model is characterized by a Williams diagram;
step 6) model application: the model is used to predict the PDMS film/water distribution coefficient of unknown compounds.
2. The prediction method according to claim 1, wherein in step 1), the organic compounds include polycyclic aromatic hydrocarbons, polychlorinated biphenyls, benzenes, pesticides, ethers, dioxins, esters, aliphatics, hydrocarbons and nitrogen- and sulfur-containing compounds.
3. The prediction method according to claim 1, wherein in step 1), where several values have been collected for the same substance, values that deviate markedly from the rest are removed and the mean of the remaining values is used for model construction.
4. The prediction method according to claim 1, wherein the organic compounds in the training set of step 1) are used to construct the model and for internal verification, and the organic compounds in the verification set are used for external verification of the model.
5. The prediction method according to claim 1, wherein in step 2), the preprocessing procedure comprises removing descriptors that are constant or near-constant, have missing values, or have a correlation greater than 0.95.
6. The prediction method according to claim 1, wherein in step 3), the QSPR model based on the optimal SW-SVM algorithm is constructed with an R package, specifically by the following process:
step 3-1, the whole data set is first divided into k sets; each set is used in turn as the test set while the remaining sets form the training set, and training and testing are repeated k times so that every set serves as the test set exactly once;
step 3-2, the average cross-validation accuracy over the k rounds is calculated and compared, and the parameter pair (cost, gamma) with the highest cross-validation accuracy is selected and applied, as the optimal value from k-fold cross-validation, to support vector machine regression; the penalty factor cost controls the relative weight of the structural risk and the empirical risk of the model and so determines the quality of the model, while gamma determines the distribution of the data after mapping to the new feature space; the prediction model uses a radial basis kernel function, for which $\gamma = 1/(2\sigma^2)$, where $\sigma$ is the width parameter of the function and controls its radial range of action;
step 3-3, the parameters are applied to the model to construct the optimized model.
7. The prediction method according to claim 1, wherein in step 4), the goodness-of-fit and robustness evaluation indexes used in model verification are: the degree-of-freedom-corrected (adjusted) coefficient of determination [Figure FDA0002572759960000021], the training set root mean square error $\text{RMSE}_{\text{tra}}$ and the training set mean absolute error $\text{MAE}_{\text{tra}}$.
8. The prediction method according to claim 1, wherein step 5) specifically comprises: the application domain of the model is characterized by a Williams diagram of standardized residuals against leverage values $h_i$; a compound whose standardized residual has an absolute value greater than 3.0 is an outlier, and a leverage value $h_i$ greater than the warning value $h^*$ indicates that the structure of the compound differs significantly from the structures of the other compounds; $h_i$ and $h^*$ are calculated by the following formulas:
$$h_i = x_i^{T}(X^{T}X)^{-1}x_i$$
$$h^* = 3(p+1)/n$$
where $x_i$ is the descriptor vector of the ith compound; $x_i^{T}$ is the transpose of $x_i$; $X$ is the descriptor matrix of all compounds; $X^{T}$ is the transpose of $X$; $(X^{T}X)^{-1}$ is the inverse of the matrix $X^{T}X$; $p$ is the number of variables in the model and $n$ is the number of data points in the data set.
CN202010645135.9A 2020-07-07 2020-07-07 Method for predicting organic PDMS membrane-water distribution coefficient based on SW-SVM algorithm quantitative structure-activity relationship model Pending CN111768813A (en)


Publications (1)

Publication Number Publication Date
CN111768813A (en) 2020-10-13


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722988A (en) * 2021-08-18 2021-11-30 扬州大学 Method for predicting organic PDMS membrane-air distribution coefficient by quantitative structure-activity relationship model
CN115470702A (en) * 2022-09-14 2022-12-13 中山大学 Sewage treatment water quality prediction method and system based on machine learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105868540A (en) * 2016-03-25 2016-08-17 哈尔滨理工大学 A polycyclic aromatic hydrocarbon property/toxicity prediction method using an intelligent support vector machine
CN109212096A (en) * 2018-11-02 2019-01-15 扬州大学 Hydrophobic organic compound LDPE film/water partition coefficient rapid assay methods based on surfactant strengthening extraction

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105868540A (en) * 2016-03-25 2016-08-17 哈尔滨理工大学 A polycyclic aromatic hydrocarbon property/toxicity prediction method using an intelligent support vector machine
CN109212096A (en) * 2018-11-02 2019-01-15 扬州大学 Hydrophobic organic compound LDPE film/water partition coefficient rapid assay methods based on surfactant strengthening extraction

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
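ANDREA MAURI: "alvaDesc: A Tool to Calculate and Analyze Molecular Descriptors and Fingerprints", 《ECOTOXICOLOGICAL QSARS》 *
Zhu Tengyi et al.: "Predicting partition coefficients of organic pollutants between PDMS and water based on theoretical linear solvation energy relationships", 《Journal of Southeast University (Natural Science Edition)》 *
Li Meiping: "Application of QSAR/QSPR methods in environmental, pharmaceutical and materials chemistry", 《China Doctoral Dissertations Full-text Database, Engineering Science and Technology I》 *
Li Yanwei: "Application of QSPR research in materials chemistry and environmental chemistry", 《China Master's Theses Full-text Database, Engineering Science and Technology I》 *
Nie Changming et al.: 《Computational Chemistry》, 31 January 2010, Beijing Institute of Technology Press *
Hu Guixiang et al.: "QSPR study of membrane-water partition coefficients of compounds and characterization by three-dimensional molecular parameters", 《Journal of Zhejiang University (Science Edition)》 *
Dong Jihong et al.: 《Spectral Analysis and Migration Characteristics of Heavy Metals in Reclaimed Soil of Mining Areas》, 31 May 2018, China University of Mining and Technology Press *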

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113722988A (en) * 2021-08-18 2021-11-30 Yangzhou University Method for predicting organic PDMS membrane-air distribution coefficient by quantitative structure-activity relationship model
CN113722988B (en) * 2021-08-18 2024-01-26 Yangzhou University Method for predicting organic PDMS film-air distribution coefficient by quantitative structure-activity relationship model
CN115470702A (en) * 2022-09-14 2022-12-13 Sun Yat-sen University Sewage treatment water quality prediction method and system based on machine learning
CN115470702B (en) * 2022-09-14 2024-06-11 Sun Yat-sen University Sewage treatment water quality prediction method and system based on machine learning

Similar Documents

Publication Publication Date Title
CN114424058B (en) Tracing method for VOCs pollution
CN110534163B (en) Method for predicting octanol/water distribution coefficient of organic compound by adopting multi-parameter linear free energy relation model
WO2016179864A1 (en) Fresh water acute standard prediction method based on metal quantitative structure-activity relationship
Canter et al. Handbook of variables for environmental impact assessment
Quah et al. Application of neural networks for software quality prediction using object-oriented metrics
CN111768813A (en) Method for predicting organic PDMS membrane-water distribution coefficient based on SW-SVM algorithm quantitative structure-activity relationship model
CN113155939A (en) Online volatile organic compound source analysis method, system, equipment and medium
CN103345544B (en) Method for predicting the biodegradability of organic chemicals using logistic regression
CN111768812A (en) Method for predicting organic PDMS film-water distribution coefficient
CN116187861A (en) Isotope-based water quality traceability monitoring method and related device
Kitson et al. PyKrev: a python library for the analysis of complex mixture FT-MS data
CN111554358A (en) Prediction method of heavy metal toxicity end point and ocean water quality reference threshold
CN111768815A (en) Method for predicting distribution coefficient of POPs (persistent organic pollutants) in PUF (polyurethane foam) membrane-air based on theoretical linear solvation energy relation model
CN110853701A (en) Method for predicting fish biological enrichment factor of organic compound by adopting multi-parameter linear free energy relation model
Yuan et al. Combining national and state data improves predictions of microcystin concentration
CN110910970B (en) Method for predicting toxicity of chemicals by taking zebra fish embryos as receptors through building QSAR model
CN112086141A (en) Method for predicting PA-water distribution coefficient of organic pollutant based on quantitative structure property relation
CN103714220B (en) Method for predicting elimination speed of persistent organic pollutants on coastal zones
Olenius et al. Role of gas–molecular cluster–aerosol dynamics in atmospheric new-particle formation
CN113722988B (en) Method for predicting organic PDMS film-air distribution coefficient by quantitative structure-activity relationship model
CN112200254A (en) Network intrusion detection model generation method, detection method and electronic equipment
CN107516016A (en) Method for predicting the silicone oil-air distribution coefficient of hydrophobic compounds by building a quantitative structure-activity relationship model
CN115758163A (en) Method, device, equipment and storage medium for detecting petroleum fraction composition
CN113782112B (en) Method and device for determining petroleum fraction composition model
Dai et al. Validating the accuracy of multiple sediment fingerprinting methods

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination