Method for predicting the partition constant of organic pollutants between polyethylene microplastic and the water phase
Technical Field
The invention relates to the technical field of ecological safety evaluation, in particular to quantitative structure-property relationships (QSPR) for environmental and ecological risk assessment, and specifically to a method for predicting the partition equilibrium constant of organic pollutants between polyethylene microplastic and the water phase.
Background
Microplastics (particle size < 5 mm) are a new class of global environmental pollutant and have attracted wide attention because of their bioaccumulation, ecotoxicity and adsorption behavior. After entering a water body, microplastics can interact with organic pollutants in the water environment, which enhances the toxic effect of those pollutants on organisms and can even carry them along the food chain, increasing the exposure risk to organisms and humans and thereby endangering ecological safety and human health.
K_d is an important parameter describing the partitioning behavior of organic pollutants between microplastics and the water phase. The K_d value influences the migration, transformation and fate of organic pollutants in the environment, and understanding the interaction between microplastics and organic pollutants in water is a prerequisite for studying their subsequent toxicological effects. However, research on microplastics is still at an early stage, and the number of experimentally reported K_d values is very limited and far from sufficient for subsequent research. Meanwhile, the number of K_d values that would need to be measured is very large, because organic pollutants are numerous, dissociable organic pollutants exist in different dissociation forms, and the strength of interaction between organic pollutants and microplastics differs markedly among water media. In addition, during experimental determination of K_d, microplastics are difficult to disperse uniformly in solution, which introduces experimental error. Developing a K_d prediction method with good performance is therefore particularly important.
Quantitative structure-property relationships (QSPR) avoid experimental error, are not limited by experimental conditions, and are inexpensive, convenient and fast to develop. They have been applied successfully to predict the physicochemical properties, environmental behavior and ecotoxicity parameters of organic pollutants, and can predict the K_d values of organic compounds from molecular structure information. However, the K_d prediction models reported so far have shortcomings in the types and number of organic compounds covered, in predictive ability, and in the adsorption mechanisms they reveal.
Smedes et al. (Smedes F, Geertsma R W, Zande T, et al. Polymer-water partition coefficients of hydrophobic compounds for passive sampling: application of cosolvent models for evaluation. Environmental Science & Technology, 2009, 43(18): 7047-7054.) established adsorption models for 26 polycyclic aromatic hydrocarbons based on molecular weight and on the n-octanol/water partition coefficient, respectively, and compared them to find the best way of describing the equilibrium partition coefficient. However, the models do not distinguish between different types of microplastic and have a narrow application domain, so they cannot be used to predict the adsorption capacity of different types of microplastic for organic pollutants in different water media.
Hüffer et al. (Hüffer T, Hofmann T. Sorption of non-polar organic compounds by micro-sized plastic particles in aqueous solution. Environmental Pollution, 2016, 214: 194-201.) established K_d prediction models for nonpolar organic compounds (seven compounds: n-hexane, cyclohexane, benzene, toluene, chlorobenzene, ethyl benzoate and naphthalene) using molar volume, the octanol/water partition coefficient, the hexadecane/water partition coefficient and similar parameters. However, the number of model compounds is small, the models were not validated or characterized, and their limited application domain means they cannot be applied to other classes of compounds (such as polychlorinated biphenyls).
Li Anyu et al. (Li Anyu, Chen Jingwen, Li Xuehua, et al. Linear solvation energy relationship models of microplastic/water partition coefficients for several classes of organic pollutants [J]. Asian Journal of Ecotoxicology, 2017.) used linear solvation energy relationship parameters to construct K_d prediction models for organic pollutants (polychlorinated biphenyls, polycyclic aromatic hydrocarbons, hexachlorocyclohexanes and chlorobenzenes) between polypropylene microplastic and seawater and between polyethylene microplastic and fresh water. The models perform well, but their parameters depend on experimental determination, the data sets are small, and differences between dissociation forms are not considered, which limits their use to some extent.
As can be seen, none of the current models can simply and rapidly predict the K_d values of a wide variety of organic pollutants in different aqueous media.
Therefore, developing K_d prediction models that perform well, use a simple and transparent algorithm and are highly practical, and using them to predict the K_d values of different classes of organic pollutants such as polychlorinated biphenyls, chlorobenzenes and polycyclic aromatic hydrocarbons, can effectively fill the gap in basic data for the risk assessment and management of organic chemicals and provide data support and theoretical guidance for biosafety and health risk assessment.
Disclosure of Invention
The invention aims to provide a method for predicting the partition equilibrium constant K_d of organic pollutants between polyethylene microplastic and the water phase. The method can evaluate the adsorption capacity of polyethylene (PE) microplastic surfaces for organic pollutants in different water media, and is efficient, fast, low-cost and practical.
In order to solve the technical problems, the invention adopts the following technical scheme:
(a) K_d values are collected from existing databases and the literature for 37 organic pollutants between PE and seawater, 24 organic pollutants between PE and fresh water, and 48 organic pollutants between PE and pure water;
(b) Based on an analysis of the partitioning mechanism of organic pollutants between microplastic and the water phase, the following descriptors are selected for constructing the log K_d prediction models:
ε_α (ε_α = E_LUMO - E_HOMO-water, where E_LUMO is the lowest unoccupied molecular orbital energy and E_HOMO is the highest occupied molecular orbital energy),
ε_β (ε_β = E_LUMO-water - E_HOMO), and
log D (the n-octanol/water distribution coefficient at different pH conditions);
(c) Multiple linear regression (MLR) is used to establish the relationship between log K_d and ε_α, ε_β and log D, with the regression carried out in SPSS 21.0; the squared correlation coefficient (r^2) and the root mean square error (RMSE) are used as statistical indicators of the goodness of fit, and the squared predictive correlation coefficient (q^2) is used to characterize the predictive performance of the model;
the regression models obtained by the MLR method are as follows:
PE/seawater: log K_d = 0.725 × log D - 23.169 × ε_β - 36.236 × ε_α + 17.856 (1);
PE/fresh water: log K_d = 0.667 × log D + 1.714 (2);
PE/pure water: log K_d = 0.486 × log D + 2.420 (3).
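As an illustration only, models (1) to (3) can be written as one small function. The coefficients are copied from the equations above; the dictionary layout, the medium keys and the function name predict_log_kd are hypothetical conveniences and are not part of the disclosed method.

```python
# Coefficients copied from equations (1)-(3). The names and the data
# layout are illustrative assumptions, not taken from the source.
MODELS = {
    "seawater":    {"log_d": 0.725, "eps_beta": -23.169, "eps_alpha": -36.236, "intercept": 17.856},
    "fresh_water": {"log_d": 0.667, "eps_beta": 0.0, "eps_alpha": 0.0, "intercept": 1.714},
    "pure_water":  {"log_d": 0.486, "eps_beta": 0.0, "eps_alpha": 0.0, "intercept": 2.420},
}


def predict_log_kd(medium, log_d, eps_alpha=0.0, eps_beta=0.0):
    """Return log K_d for a PE/water system from the fitted linear model.

    eps_alpha and eps_beta only matter for the seawater model; the other
    two models depend on log D alone.
    """
    if medium not in MODELS:
        raise ValueError(f"unknown medium: {medium!r}")
    c = MODELS[medium]
    return (c["log_d"] * log_d
            + c["eps_beta"] * eps_beta
            + c["eps_alpha"] * eps_alpha
            + c["intercept"])


# Hypothetical input: a compound with log D = 5.0 in PE/fresh water.
print(f"{predict_log_kd('fresh_water', log_d=5.0):.2f}")  # 5.05
```

Keeping the three coefficient sets in one table makes it easy to see that only the seawater model uses the orbital-energy descriptors.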
The organic pollutants in step (a) cover polychlorinated biphenyls, chlorobenzenes, polycyclic aromatic hydrocarbons, antibiotics, aromatic hydrocarbons, aliphatic hydrocarbons and hexachlorocyclohexanes.
Further, E_HOMO-water and E_LUMO are obtained as follows: starting from the molecular structure of the organic compound to be predicted, structure optimization and frequency analysis are performed with the density functional theory (DFT) B3LYP/6-31G(d,p) method in Gaussian 09, and the E_HOMO-water and E_LUMO values of the compound are extracted from the output file.
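The extraction step can be sketched as follows. The source only states that the orbital energies are read from the output file, so everything here is an assumption: that the file lists closed-shell orbital energies on "Alpha occ. eigenvalues" and "Alpha virt. eigenvalues" lines, that the last such block is the one wanted, and that the "-water" energies are supplied as separate inputs (the source does not spell out how they are obtained). The function names and all numbers in the sample are hypothetical, and the energies are kept in whatever unit the file uses.

```python
import re

# One orbital-energy line of a Gaussian-style log (layout assumed).
_LINE = re.compile(r"\s*Alpha\s+(occ|virt)\.\s+eigenvalues\s+--(.*)")
_NUMBER = re.compile(r"-?\d+\.\d+")


def frontier_orbitals(log_text):
    """Return (E_HOMO, E_LUMO) from the last eigenvalue block in the text."""
    occ, virt = [], []
    for line in log_text.splitlines():
        m = _LINE.match(line)
        if not m:
            continue
        values = [float(v) for v in _NUMBER.findall(m.group(2))]
        if m.group(1) == "occ":
            if virt:  # a new block begins, so drop the previous one
                occ, virt = [], []
            occ.extend(values)
        else:
            virt.extend(values)
    if not occ or not virt:
        raise ValueError("no orbital eigenvalues found")
    # Energies are listed in ascending order: last occupied is the HOMO,
    # first virtual is the LUMO.
    return occ[-1], virt[0]


def covalent_descriptors(e_homo, e_lumo, e_homo_water, e_lumo_water):
    """eps_alpha = E_LUMO - E_HOMO-water, eps_beta = E_LUMO-water - E_HOMO."""
    return e_lumo - e_homo_water, e_lumo_water - e_homo


# Made-up sample text with two blocks; only the last one is used.
SAMPLE = """\
 Alpha  occ. eigenvalues --  -10.30000  -0.90000  -0.50000  -0.35000
 Alpha virt. eigenvalues --    0.01000   0.20000
 (later output omitted)
 Alpha  occ. eigenvalues --  -10.20000  -0.80000  -0.40000  -0.25000
 Alpha virt. eigenvalues --    0.02000   0.10000   0.30000
"""

e_homo, e_lumo = frontier_orbitals(SAMPLE)
# Hypothetical "-water" orbital energies, for illustration only.
eps_alpha, eps_beta = covalent_descriptors(e_homo, e_lumo,
                                           e_homo_water=-0.30,
                                           e_lumo_water=0.05)
print(e_homo, e_lumo, round(eps_alpha, 2), round(eps_beta, 2))
```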
Further, the log D is calculated by ACD Labs 6.0 software according to the molecular structure of the organic matter to be predicted.
Further, the organic compounds include polychlorinated biphenyls, chlorobenzenes, aromatic hydrocarbons (such as monocyclic and polycyclic aromatic hydrocarbons), antibiotics, aliphatic hydrocarbons and hexachlorocyclohexanes.
When the K_d values of organic pollutants between PE/seawater, PE/fresh water and PE/pure water are collected from existing databases and the literature, the collected pollutants cover seven classes of substances: polychlorinated biphenyls, chlorobenzenes, polycyclic aromatic hydrocarbons, antibiotics, aromatic hydrocarbons, aliphatic hydrocarbons and hexachlorocyclohexanes. The log K_d values range from 0.79 to 8.84, spanning 8 orders of magnitude.
For the three models (1), (2) and (3) obtained by the invention, r^2 is 0.87, 0.90 and 0.81, respectively, indicating that the models fit well, and the prediction errors show no dependence on the experimental values. The stability and predictive power of the models were evaluated by two methods:
Method one, simulated external validation: the original data set was randomly divided into two subsets (containing 70% and 30% of the compounds, respectively). The models were rebuilt on the 70% subset using the descriptors already selected, giving fitted r^2 values of 0.86, 0.90 and 0.80, respectively; the rebuilt models were then applied to the 30% subset, giving predictive q^2 values of 0.89, 0.91 and 0.84, respectively. The statistical performance on both subsets is very similar to that on the full data set, indicating that the three models rest on an intrinsic rather than a chance correlation between K_d and the descriptors and are statistically stable;
Method two, leave-one-out cross-validation: the q^2_CV values are 0.89, 0.94 and 0.88, respectively, again demonstrating the good stability and predictive power of the models.
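The two validation methods above can be illustrated with a minimal sketch. NumPy is used here in place of SPSS 21.0, the data are synthetic and are not the collected K_d values, and the q^2 formulas are common definitions assumed for illustration, since the source does not state which definitions were used. All function names are hypothetical.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    return coef[0] + np.asarray(X) @ coef[1:]


def r2(y, y_hat):
    """Squared correlation coefficient of the fit (1 - SS_res / SS_tot)."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)


def q2_external(X_train, y_train, X_test, y_test):
    """q^2 on a held-out subset, referenced to the training-set mean."""
    coef = fit_mlr(X_train, y_train)
    press = np.sum((y_test - predict(coef, X_test)) ** 2)
    return 1.0 - press / np.sum((y_test - np.mean(y_train)) ** 2)


def q2_loo(X, y):
    """Leave-one-out q^2: refit n times, each time predicting the left-out point."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - np.mean(y)) ** 2)


# Synthetic one-descriptor data set (24 points, noise added on purpose).
rng = np.random.default_rng(0)
log_d = rng.uniform(1.0, 8.0, size=24)
X = log_d.reshape(-1, 1)
y = 0.667 * log_d + 1.714 + rng.normal(0.0, 0.3, size=24)

order = rng.permutation(24)
train, test = order[:17], order[17:]  # roughly 70% / 30%

coef = fit_mlr(X[train], y[train])
print("r2 (training subset):", r2(y[train], predict(coef, X[train])))
print("q2 (external subset):", q2_external(X[train], y[train], X[test], y[test]))
print("q2 (leave-one-out):  ", q2_loo(X, y))
```

Fit and validation statistics of similar size, as reported for the three models, are what this kind of check is meant to show; a q^2 far below r^2 would point to overfitting.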
The application domain of the models is characterized by Williams plots, in which the leverage value h_i of each compound in the descriptor matrix is plotted on the abscissa and the standardized residual (SE) on the ordinate, in order to identify highly influential compounds and outliers. The Williams plots show warning values h of 0.32, 0.25 and 0.13 for models (1), (2) and (3), respectively. The number of compounds with h_i > h is 3 in model (1) and 1 in model (3); these compounds lie far from the center of the descriptor space but are predicted well, so they strengthen the accuracy and extensibility of the models. The standardized residuals of all compounds fall within ±3, indicating that the models have no outliers. In summary, the application domain of the models is defined as: polychlorinated biphenyls, chlorobenzenes, polycyclic aromatic hydrocarbons, antibiotics, aromatic hydrocarbons, aliphatic hydrocarbons, hexachlorocyclohexanes, and other compounds of similar structure.
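The quantities on a Williams plot can be computed as in the sketch below. The source does not give the formulas, so the usual QSAR conventions are assumed: h_i is the i-th diagonal element of the hat matrix built from the descriptors plus an intercept column, and the warning value is 3(p + 1)/n for p descriptors and n compounds. With the descriptor and compound counts stated for the three models (3 and 37, 1 and 24, 1 and 48), this assumed formula gives 0.324, 0.250 and 0.125, in line with the reported 0.32, 0.25 and 0.13. Function names are hypothetical, and definitions of the standardized residual vary; the one here is one common choice.

```python
import numpy as np


def leverages(X):
    """Diagonal of the hat matrix H = A (A^T A)^-1 A^T, with A = [1, X]."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.einsum("ij,jk,ik->i", A, np.linalg.pinv(A.T @ A), A)


def warning_leverage(n_descriptors, n_compounds):
    """Assumed warning value: 3 (p + 1) / n."""
    return 3.0 * (n_descriptors + 1) / n_compounds


def standardized_residuals(y, y_hat, n_descriptors):
    """Residuals divided by the residual standard deviation of the fit."""
    resid = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    dof = len(resid) - n_descriptors - 1
    return resid / np.sqrt(np.sum(resid ** 2) / dof)


for label, p, n in [("model (1)", 3, 37), ("model (2)", 1, 24), ("model (3)", 1, 48)]:
    print(f"{label}: warning value = {warning_leverage(p, n):.3f}")
# model (1): warning value = 0.324
# model (2): warning value = 0.250
# model (3): warning value = 0.125
```

A compound is then flagged as highly influential when its leverage exceeds the warning value, and as an outlier when its standardized residual lies outside ±3.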
The prediction method provided by the invention has the following advantages:
(1) The models have a wide application range and can rapidly predict the K_d values of polychlorinated biphenyls, chlorobenzenes, polycyclic aromatic hydrocarbons, antibiotics, aromatic hydrocarbons, aliphatic hydrocarbons, hexachlorocyclohexanes and other compounds of similar structure, so as to evaluate the adsorption capacity of polyethylene (PE) microplastic surfaces for organic pollutants in different water media and provide important basic data for the ecological risk assessment of these compounds.
(2) The molecular structure descriptors used by the models are easy to obtain, the regression analysis is simple to carry out, and the models are highly practical; the prediction method is convenient, fast and low-cost.
(3) The modeling process strictly follows the guidelines of the Organisation for Economic Co-operation and Development (OECD) on the construction and use of QSAR models, and the resulting models have good fitting ability (r^2 = 0.81 to 0.90), predictive power (q^2 = 0.84 to 0.91, RMSE_ext = 0.47 to 0.75) and robustness (q^2_CV = 0.88 to 0.94).
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. In the drawings:
FIG. 1 is a plot comparing the K_d values predicted by model (1) with the literature K_d values;
FIG. 2 is a plot comparing the K_d values predicted by model (2) with the literature K_d values;
FIG. 3 is a plot comparing the K_d values predicted by model (3) with the literature K_d values;
FIG. 4 is a plot of the prediction errors of model (1) against the K_d values;
FIG. 5 is a plot of the prediction errors of model (2) against the K_d values;
FIG. 6 is a plot of the prediction errors of model (3) against the K_d values;
FIG. 7 is a Williams plot characterizing the highly influential compounds and outliers of model (1);
FIG. 8 is a Williams plot characterizing the highly influential compounds and outliers of model (2);
FIG. 9 is a Williams plot characterizing the highly influential compounds and outliers of model (3).
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Example 1: Perfluorooctanoic acid (PE/seawater)
This example predicts the log K_d value of perfluorooctanoic acid between PE and seawater. The Williams plot method gives h_i = 0.195 < h (warning value) = 0.32 and a standardized residual (SE) = 0.027 < 3, showing that the compound lies within the application domain of the QSPR model. Structure optimization and frequency analysis with the DFT B3LYP/6-31G(d,p) method in Gaussian 09 give ε_α and ε_β values of 0.31 and 0.30, respectively. The log D of the compound, calculated with ACD Labs 6.0 software, is 4.00.
Substituting the above descriptors into model (1):
log K_d = 0.725 × log D - 23.169 × ε_β - 36.236 × ε_α + 17.856
= 0.725 × 4.00 - 23.169 × 0.30 - 36.236 × 0.31 + 17.856
= 2.57
According to model (1), the predicted log K_d of perfluorooctanoic acid is 2.57; the literature log K_d value is 2.70, so the predicted value agrees well with the experimental value.
Example 2: 2,3',4'-pentachlorobiphenyl (PE/fresh water)
This example predicts the log K_d value of 2,3',4'-pentachlorobiphenyl between PE and fresh water. The Williams plot method gives h_i = 0.048 < h (warning value) = 0.25 and a standardized residual (SE) = -0.029 > -3, showing that the compound lies within the application domain of the QSPR model. The log D of the compound, calculated with ACD Labs 6.0 software, is 6.98.
Substituting the above descriptor into model (2):
log K_d = 0.667 × log D + 1.714
= 0.667 × 6.98 + 1.714
= 6.37
According to model (2), the predicted log K_d of 2,3',4'-pentachlorobiphenyl is 6.37; the literature log K_d value is 6.35, so the predicted value agrees well with the experimental value.
Example 3: Cyclohexane (PE/pure water)
This example predicts the log K_d value of cyclohexane between PE and pure water. The Williams plot method gives h_i = 0.012 < h (warning value) = 0.13 and a standardized residual (SE) = 0.539 < 3, showing that the compound lies within the application domain of the QSPR model. The log D of the compound, calculated with ACD Labs 6.0 software, is 3.18.
Substituting the above descriptor into model (3):
log K_d = 0.486 × log D + 2.420
= 0.486 × 3.18 + 2.420
= 3.97
According to model (3), the predicted log K_d of cyclohexane is 3.97; the literature log K_d value is 3.88, so the predicted value agrees well with the experimental value.
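The workflow followed in the three examples (check the application domain first, then substitute the descriptors into the model) can be sketched as below. The numerical inputs are the values quoted in Examples 1 to 3; the Case class, the function names and the exact form of the domain test (h_i below the warning value and a standardized residual within ±3) are illustrative assumptions about how such a gate might be coded.

```python
from dataclasses import dataclass

# (log D, eps_beta, eps_alpha, intercept) from models (1)-(3).
COEF = {
    "seawater": (0.725, -23.169, -36.236, 17.856),
    "fresh water": (0.667, 0.0, 0.0, 1.714),
    "pure water": (0.486, 0.0, 0.0, 2.420),
}


@dataclass
class Case:
    name: str
    medium: str
    log_d: float
    h_i: float          # leverage of the compound
    h_warning: float    # warning value of the model
    std_resid: float    # standardized residual
    log_kd_lit: float   # literature log K_d
    eps_alpha: float = 0.0
    eps_beta: float = 0.0


def in_domain(case):
    """Assumed gate: leverage below the warning value, residual within +/-3."""
    return case.h_i < case.h_warning and abs(case.std_resid) < 3.0


def predict_log_kd(case):
    a, b, c, d = COEF[case.medium]
    return a * case.log_d + b * case.eps_beta + c * case.eps_alpha + d


CASES = [
    Case("perfluorooctanoic acid", "seawater", 4.00, 0.195, 0.32, 0.027, 2.70,
         eps_alpha=0.31, eps_beta=0.30),
    Case("pentachlorobiphenyl of Example 2", "fresh water", 6.98, 0.048, 0.25, -0.029, 6.35),
    Case("cyclohexane", "pure water", 3.18, 0.012, 0.13, 0.539, 3.88),
]

for case in CASES:
    status = "in domain" if in_domain(case) else "outside domain"
    pred = predict_log_kd(case)
    print(f"{case.name} ({case.medium}): {status}, "
          f"predicted {pred:.2f}, literature {case.log_kd_lit:.2f}")
```

Running the gate before the prediction mirrors the order used in each example and keeps out-of-domain compounds from being reported with unwarranted confidence.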
The present invention is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present invention are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.