CN109493922B - Method for predicting molecular structure parameters of chemicals - Google Patents

Method for predicting molecular structure parameters of chemicals

Info

Publication number: CN109493922B
Application number: CN201811378715.5A (priority application)
Authority: CN (China)
Other languages: Chinese (zh)
Other versions: CN109493922A (en)
Inventors: 陈景文, 肖子君, 鄢世阳
Original and current assignee: Dalian Silike Environment Technology Co ltd
Legal status: Active (as listed by Google Patents; an assumption, not a legal conclusion)
Prior art keywords: molecular structure, lpcount, formula, parameter, cats2d
Landscapes: Organic Low-Molecular-Weight Compounds And Preparation Thereof (AREA)

Abstract

The invention provides a method for predicting molecular structure parameters of chemicals. The method optimizes the molecular structure of an organic compound and calculates the parameters E, S, A, B, L and V of the compound from the optimized structure. The method can be applied to many classes of organic compounds: the models are built on measured E, S, A, B, L and V data for 3838 compounds, which gives a very wide applicability domain. E, S, A, B, L and V are each modeled by linear regression, so the model algorithm is transparent, simple and easy to interpret. The method is simple, fast and inexpensive, can provide data support for chemical regulation, and is of great significance for the ecological risk assessment of chemicals.

Description

Method for predicting molecular structure parameters of chemicals
Technical Field
The invention relates to the field of testing strategies for ecological risk assessment, and in particular to a method for predicting molecular structure parameters of chemicals.
Background
The distribution, reaction rates, bioaccumulation and toxic effects of organic chemicals in ecosystems depend on their partitioning behaviour. Measuring equilibrium partition coefficients experimentally is time-consuming, costly and error-prone. When the compounds are hard to obtain, or when many compounds must be tested, it is extremely difficult to determine the molecular structure parameters of chemicals by experiment alone. Reliable methods for predicting the molecular structure parameters that govern equilibrium partitioning in the environment are therefore needed.
Multi-parameter linear free energy relationships have been shown to be useful for characterizing the equilibrium partitioning of organic chemicals in a variety of environmental and technical partitioning systems, provided that precise molecular structure parameters are available. However, obtaining the molecular structure parameters used in these relationships (E is the excess molar refraction, L is the n-hexadecane-water distribution coefficient, A is the hydrogen bond acidity, B is the hydrogen bond basicity, S is the polarity/dipole moment, and V is the McGowan characteristic molecular volume) depends on complicated and varied experimental methods. The chemicals registered by the American Chemical Abstracts Service (CAS) exceed 135 million and grow by about 15000 substances per day, so it is clearly impractical to determine the molecular structure parameters of such a huge number of organic chemicals by experiment alone. At present only about 4000 substances have experimental values for these parameters. A non-experimental technique is therefore urgently needed to obtain molecular structure parameter values efficiently and quickly, so that multi-parameter linear free energy relationships can be predicted and the needs of ecological risk assessment and management of organic chemicals can be met.
Disclosure of Invention
The first purpose of the invention is to provide a simple, quick and efficient method for predicting molecular structure parameters of organic chemicals, which can predict the molecular structure parameters of the compounds according to the molecular structures of the compounds, further evaluate the environmental distribution coefficients of the compounds and provide necessary basic data for risk evaluation and management of the chemicals.
In order to achieve the above object, the present invention provides a method for predicting a molecular structural parameter of a chemical, the method comprising:
S1, optimizing the molecular structure of an organic compound with Gaussian at the B3LYP/6-31G(d) level, applying the LANL2DZ pseudopotential to atoms outside the range covered by the basis set, and adding the keywords pop=NBO and Volume, wherein the optimized molecular structure is stable and has no imaginary frequencies;
S2, calculating, from the optimized molecular structure, the values of polarizability, EHomo-ELumo, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, EHomo-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume,
wherein polarizability is the polarizability, EHomo-ELumo is the energy difference between the frontier molecular orbitals, I_LPcount is the number of lone electron pairs on all iodine atoms, Atom_num is the total number of atoms, nBT is the number of chemical bonds, Mor32p is 3D-MoRSE signal 32 weighted by polarizability, nHDon is the number of hydrogen bond donor N and O atoms, ATSC2i is the centred Broto-Moreau autocorrelation of lag 2 weighted by ionization potential, F01[C-N] is the frequency of C-N at topological distance 1, F10[O-O] is the frequency of O-O at topological distance 10, dipole_moment is the dipole moment, CCR_energy is the CCR energy, EHomo-1 is the energy of the second-highest occupied orbital, Rperim is the ring perimeter of the molecule, Mor05u is 3D-MoRSE signal 05 unweighted, Mor02m is 3D-MoRSE signal 02 weighted by mass, nRCO is the number of aliphatic ketone groups, H-046 is hydrogen attached to an sp3-hybridized carbon atom whose adjacent carbon atom carries no halogen, SdO is the sum of E-states of =O, NtN is the number of ≡N in the molecule, H_Qmax is the highest charge on the hydrogen atoms, H_Qmean is the average charge on the hydrogen atoms, nRCONH2 is the number of aliphatic primary amides, N-067 is Al2-NH, O-057 is the oxygen atom of a phenol, enol or carboxyl group, SsNH2 is the sum of E-states of -NH2, CATS2D_01_AN is the CATS2D descriptor for hydrogen bond acceptor-negative charge at lag 01, CATS2D_03_DD is the CATS2D descriptor for hydrogen bond donor-hydrogen bond donor at lag 03, CATS2D_03_DA is the CATS2D descriptor for hydrogen bond donor-hydrogen bond acceptor at lag 03, B04[O-O] is the presence/absence of O-O at topological distance 4, nArNHR is the number of aromatic secondary amines, O_LPcount is the number of lone electron pairs on all oxygen atoms, N_Qcount is the number of nitrogen atoms, Mor12i is 3D-MoRSE signal 12 weighted by ionization potential, H-047 is hydrogen attached to sp2- and sp3-hybridized carbon atoms, O-056 is the oxygen atom of a hydroxyl group, P-116 is the number of R3-P=X groups, NddsN is the number of -N= (atom type ddsN), B01[C-N] is the presence/absence of C-N at topological distance 1, F02[C-N] is the frequency of C-N at topological distance 2, F_LPcount is the number of lone electron pairs on all fluorine atoms, Br_LPcount is the number of lone electron pairs on all bromine atoms, H_Qcount is the number of hydrogen atoms, NaasC is the number of aasC-type atoms, SssCH2 is the sum of E-states of -CH2-, F01[O-Si] is the frequency of O-Si at topological distance 1, and molarVolume is the molar volume;
S3, calculating the molecular structure parameter E of the organic compound according to formula (7), the parameter S according to formula (8), the parameter A according to formula (9), the parameter B according to formula (10), the parameter L according to formula (11), and the parameter V according to formula (12),
E = 0.61313 + 0.01169 polarizability + 0.88701 (EHomo-ELumo) + 0.12676 I_LPcount - 0.29072 Atom_num + 0.26076 nBT - 0.34881 Mor32p + 0.12675 nHDon - 0.57231 ATSC2i + 0.04305 F01[C-N] + 0.14475 F10[O-O]    formula (7),
S = 0.80280 + 0.05210 dipole_moment + 0.00023 CCR_energy + 1.96420 EHomo-1 + 0.03975 Rperim - 0.55400 ATSC2i - 0.05361 Mor05u + 0.01734 Mor02m + 0.24280 nRCO + 0.10889 nHDon - 0.02352 H-046 + 0.01438 SdO + 0.43704 NtN    formula (8),
A = -0.18760 + 0.41354 H_Qmax + 0.83897 H_Qmean + 0.20256 nRCONH2 + 0.28056 nHDon - 0.16539 N-067 + 0.08320 O-057 - 0.07177 SsNH2 + 0.14845 CATS2D_01_AN - 0.12936 CATS2D_03_DD - 0.04406 CATS2D_03_DA - 0.08829 B04[O-O] - 0.21963 nArNHR    formula (9),
B = -0.01310 + 0.08131 O_LPcount + 0.13056 N_Qcount - 0.09927 Mor12i + 0.18232 nRCO + 0.01458 H-047 + 0.14627 O-056 + 0.95757 P-116 - 0.53368 NddsN + 0.14104 B01[C-N] + 0.03503 F02[C-N]    formula (10),
L = 0.44713 + 0.03226 polarizability - 0.16282 F_LPcount + 0.07766 Br_LPcount + 0.25237 Atom_num - 0.35911 H_Qcount + 0.48173 nHDon - 0.08596 NaasC + 0.06518 SssCH2 - 0.43300 F01[O-Si]    formula (11),
V = 0.00910 + 1.027 (molarVolume/100)    formula (12),
wherein the molecular parameter E is the excess molar refraction, the molecular parameter L is the n-hexadecane-water distribution coefficient, the molecular parameter A is the hydrogen bond acidity, the molecular parameter B is the hydrogen bond basicity, the molecular parameter S is the polarity/dipole moment, and the molecular parameter V is the McGowan characteristic molecular volume;
the organic compound may be an alkane, alkene, alkyne, alcohol, ether, phenol, ketone, aldehyde, ester, quinone, substituted biphenyl, aniline, halogenated hydrocarbon, nitroaromatic, alkylbenzene, azobenzene, organic acid, benzamide, phthalate, polybrominated diphenyl ether, polycyclic aromatic hydrocarbon, sulfonic acid derivative, organophosphorus compound, organosulfide, organoiodide, organofluoride, heterocyclic compound or organosilicon compound.
The method for predicting the molecular structure parameters of chemicals provided by the invention can be applied to many classes of organic compounds. The models are built on measured values of the molecular structure parameters E, S, A, B, L and V for 3838 compounds, which gives the method a very wide applicability domain. E, S, A, B, L and V are each modeled by linear regression, so the model algorithm is transparent, simple and easy to interpret. From the predicted parameters E, S, A, B, L and V, the multi-parameter linear free energy relationship of an organic compound can be predicted accurately. The method is simple, fast and inexpensive, can provide data support for chemical regulation, and is of great significance for the ecological risk assessment of chemicals.
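Because formulas (7) to (12) are all linear, they can be evaluated from a table of coefficients. The following is a minimal Python sketch of that idea and is not part of the patent: the function name `predict` and the descriptor values in `example_descriptors` are hypothetical placeholders, while the coefficients are copied from formulas (10) and (11).

```python
# Coefficients copied from formulas (10) and (11); keys are descriptor names.
B_MODEL = {
    "intercept": -0.01310,
    "O_LPcount": 0.08131,
    "N_Qcount": 0.13056,
    "Mor12i": -0.09927,
    "nRCO": 0.18232,
    "H-047": 0.01458,
    "O-056": 0.14627,
    "P-116": 0.95757,
    "NddsN": -0.53368,
    "B01[C-N]": 0.14104,
    "F02[C-N]": 0.03503,
}
L_MODEL = {
    "intercept": 0.44713,
    "polarizability": 0.03226,
    "F_LPcount": -0.16282,
    "Br_LPcount": 0.07766,
    "Atom_num": 0.25237,
    "H_Qcount": -0.35911,
    "nHDon": 0.48173,
    "NaasC": -0.08596,
    "SssCH2": 0.06518,
    "F01[O-Si]": -0.43300,
}


def predict(model, descriptors):
    """Return intercept + sum(coefficient * descriptor value)."""
    total = model["intercept"]
    for name, coefficient in model.items():
        if name == "intercept":
            continue
        # A missing descriptor raises KeyError instead of silently counting as 0.
        total += coefficient * descriptors[name]
    return total


# Hypothetical descriptor values, for illustration only (not from the patent).
example_descriptors = {name: 0.0 for name in B_MODEL if name != "intercept"}
example_descriptors["O_LPcount"] = 2.0

b_value = predict(B_MODEL, example_descriptors)
print(f"B = {b_value:.5f}")
```

Keeping the coefficients in plain dictionaries means the same `predict` function serves all six parameters, and a descriptor that was not computed is reported as an error rather than treated as zero.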
Additional features and advantages of the invention will be set forth in the detailed description which follows.
Detailed Description
The following describes in detail specific embodiments of the present invention. It should be understood that the detailed description and specific examples, while indicating the present invention, are given by way of illustration and explanation only, not limitation.
The invention provides a method for predicting a molecular structure parameter of a chemical, which comprises the following steps:
S1, optimizing the molecular structure of an organic compound with Gaussian at the B3LYP/6-31G(d) level, applying the LANL2DZ pseudopotential to atoms outside the range covered by the basis set, and adding the keywords pop=NBO and Volume, wherein the optimized molecular structure is stable and has no imaginary frequencies;
S2, calculating, from the optimized molecular structure, the values of polarizability, EHomo-ELumo, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, EHomo-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume,
wherein polarizability is the polarizability, EHomo-ELumo is the energy difference between the frontier molecular orbitals, I_LPcount is the number of lone electron pairs on all iodine atoms, Atom_num is the total number of atoms, nBT is the number of chemical bonds, Mor32p is 3D-MoRSE signal 32 weighted by polarizability, nHDon is the number of hydrogen bond donor N and O atoms, ATSC2i is the centred Broto-Moreau autocorrelation of lag 2 weighted by ionization potential, F01[C-N] is the frequency of C-N at topological distance 1, F10[O-O] is the frequency of O-O at topological distance 10, dipole_moment is the dipole moment, CCR_energy is the CCR energy, EHomo-1 is the energy of the second-highest occupied orbital, Rperim is the ring perimeter of the molecule, Mor05u is 3D-MoRSE signal 05 unweighted, Mor02m is 3D-MoRSE signal 02 weighted by mass, nRCO is the number of aliphatic ketone groups, H-046 is hydrogen attached to an sp3-hybridized carbon atom whose adjacent carbon atom carries no halogen, SdO is the sum of E-states of =O, NtN is the number of ≡N in the molecule, H_Qmax is the highest charge on the hydrogen atoms, H_Qmean is the average charge on the hydrogen atoms, nRCONH2 is the number of aliphatic primary amides, N-067 is Al2-NH, O-057 is the oxygen atom of a phenol, enol or carboxyl group, SsNH2 is the sum of E-states of -NH2, CATS2D_01_AN is the CATS2D descriptor for hydrogen bond acceptor-negative charge at lag 01, CATS2D_03_DD is the CATS2D descriptor for hydrogen bond donor-hydrogen bond donor at lag 03, CATS2D_03_DA is the CATS2D descriptor for hydrogen bond donor-hydrogen bond acceptor at lag 03, B04[O-O] is the presence/absence of O-O at topological distance 4, nArNHR is the number of aromatic secondary amines, O_LPcount is the number of lone electron pairs on all oxygen atoms, N_Qcount is the number of nitrogen atoms, Mor12i is 3D-MoRSE signal 12 weighted by ionization potential, H-047 is hydrogen attached to sp2- and sp3-hybridized carbon atoms, O-056 is the oxygen atom of a hydroxyl group, P-116 is the number of R3-P=X groups, NddsN is the number of -N= (atom type ddsN), B01[C-N] is the presence/absence of C-N at topological distance 1, F02[C-N] is the frequency of C-N at topological distance 2, F_LPcount is the number of lone electron pairs on all fluorine atoms, Br_LPcount is the number of lone electron pairs on all bromine atoms, H_Qcount is the number of hydrogen atoms, NaasC is the number of aasC-type atoms, SssCH2 is the sum of E-states of -CH2-, F01[O-Si] is the frequency of O-Si at topological distance 1, and molarVolume is the molar volume;
S3, calculating the molecular structure parameter E of the organic compound according to formula (13), the parameter S according to formula (14), the parameter A according to formula (15), the parameter B according to formula (16), the parameter L according to formula (17), and the parameter V according to formula (18),
E = 0.61313 + 0.01169 polarizability + 0.88701 (EHomo-ELumo) + 0.12676 I_LPcount - 0.29072 Atom_num + 0.26076 nBT - 0.34881 Mor32p + 0.12675 nHDon - 0.57231 ATSC2i + 0.04305 F01[C-N] + 0.14475 F10[O-O]    formula (13),
S = 0.80280 + 0.05210 dipole_moment + 0.00023 CCR_energy + 1.96420 EHomo-1 + 0.03975 Rperim - 0.55400 ATSC2i - 0.05361 Mor05u + 0.01734 Mor02m + 0.24280 nRCO + 0.10889 nHDon - 0.02352 H-046 + 0.01438 SdO + 0.43704 NtN    formula (14),
A = -0.18760 + 0.41354 H_Qmax + 0.83897 H_Qmean + 0.20256 nRCONH2 + 0.28056 nHDon - 0.16539 N-067 + 0.08320 O-057 - 0.07177 SsNH2 + 0.14845 CATS2D_01_AN - 0.12936 CATS2D_03_DD - 0.04406 CATS2D_03_DA - 0.08829 B04[O-O] - 0.21963 nArNHR    formula (15),
B = -0.01310 + 0.08131 O_LPcount + 0.13056 N_Qcount - 0.09927 Mor12i + 0.18232 nRCO + 0.01458 H-047 + 0.14627 O-056 + 0.95757 P-116 - 0.53368 NddsN + 0.14104 B01[C-N] + 0.03503 F02[C-N]    formula (16),
L = 0.44713 + 0.03226 polarizability - 0.16282 F_LPcount + 0.07766 Br_LPcount + 0.25237 Atom_num - 0.35911 H_Qcount + 0.48173 nHDon - 0.08596 NaasC + 0.06518 SssCH2 - 0.43300 F01[O-Si]    formula (17),
V = 0.00910 + 1.027 (molarVolume/100)    formula (18),
wherein the molecular parameter E is the excess molar refraction, the molecular parameter L is the n-hexadecane-water distribution coefficient, the molecular parameter A is the hydrogen bond acidity, the molecular parameter B is the hydrogen bond basicity, the molecular parameter S is the polarity/dipole moment, and the molecular parameter V is the McGowan characteristic molecular volume;
the organic compound may be an alkane, alkene, alkyne, alcohol, ether, phenol, ketone, aldehyde, ester, quinone, substituted biphenyl, aniline, halogenated hydrocarbon, nitroaromatic, alkylbenzene, azobenzene, organic acid, benzamide, phthalate, polybrominated diphenyl ether, polycyclic aromatic hydrocarbon, sulfonic acid derivative, organophosphorus compound, organosulfide, organoiodide, organofluoride, heterocyclic compound or organosilicon compound.
The method for predicting the molecular structure parameters of chemicals provided by the invention can be applied to many classes of organic compounds. The models are built on measured values of the molecular structure parameters E, S, A, B, L and V for 3838 compounds, which gives the method a very wide applicability domain. E, S, A, B, L and V are each modeled by linear regression, so the model algorithm is transparent, simple and easy to interpret. From the predicted parameters E, S, A, B, L and V, the multi-parameter linear free energy relationship of an organic compound can be predicted. The method is simple, fast and inexpensive, can provide data support for chemical regulation, and is of great significance for the ecological risk assessment of chemicals.
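Step S1 could be prepared as a Gaussian input file along the following lines. This Python helper is an illustrative assumption rather than the script used by the inventors: the function name, the checkpoint file name, the `opt freq` job keywords (chosen here so that a frequency calculation follows the optimization) and the placeholder methane geometry are all hypothetical. Only B3LYP/6-31G(d), pop=NBO and Volume are taken from the text, and the LANL2DZ treatment of heavy atoms is left out for brevity.

```python
def gaussian_input(title, atoms, charge=0, multiplicity=1, chk="molecule.chk"):
    """Build the text of a Gaussian input file for step S1.

    atoms: list of (element, x, y, z) tuples, coordinates in angstroms.
    """
    lines = [
        f"%chk={chk}",
        # Route section: method/basis and keywords named in step S1;
        # "opt freq" is an assumed job type for optimization plus frequencies.
        "#p opt freq b3lyp/6-31g(d) pop=nbo volume",
        "",
        title,
        "",
        f"{charge} {multiplicity}",
    ]
    for element, x, y, z in atoms:
        lines.append(f"{element:<2} {x:12.6f} {y:12.6f} {z:12.6f}")
    lines.append("")  # blank line terminating the geometry section
    return "\n".join(lines) + "\n"


# Placeholder geometry (approximate methane), for illustration only.
methane = [
    ("C", 0.0, 0.0, 0.0),
    ("H", 0.629, 0.629, 0.629),
    ("H", -0.629, -0.629, 0.629),
    ("H", -0.629, 0.629, -0.629),
    ("H", 0.629, -0.629, -0.629),
]
text = gaussian_input("methane (placeholder geometry)", methane)
print(text)
```

After the job finishes, the absence of imaginary frequencies in the output would still have to be checked before the descriptors of step S2 are computed.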
The present invention will be described in further detail below with reference to examples.
Example 1
Given the compound 4-nitrochlorobenzene (CAS number: 100-00-5), its molecular structure parameter E is predicted. First, the molecular structure of the compound is optimized with Gaussian at the B3LYP/6-31G(d) level, applying the LANL2DZ pseudopotential to atoms outside the range covered by the basis set and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable and has no imaginary frequencies. Based on the optimized molecular structure, the values of polarizability, EHomo-ELumo, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N] and F10[O-O] calculated with Dragon 6.0 software are 85.02, -0.180, 0, 14, 14, -0.160, 0, 0.249, 1 and 0, respectively. The predicted value calculated according to formula (19) is 0.98, which is consistent with the experimental value, so the prediction is good.
E = 0.61313 + 0.01169 polarizability + 0.88701 (EHomo-ELumo) + 0.12676 I_LPcount - 0.29072 Atom_num + 0.26076 nBT - 0.34881 Mor32p + 0.12675 nHDon - 0.57231 ATSC2i + 0.04305 F01[C-N] + 0.14475 F10[O-O]    formula (19).
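The arithmetic of Example 1 can be checked with a short Python sketch. The coefficients are those of formula (19) and the descriptor values are the ones quoted for 4-nitrochlorobenzene; the variable names are hypothetical, and nBT = 14 is entered here from a bond count of the structure (6 ring C-C, 4 C-H, 1 C-Cl, 1 C-N, 2 N-O), which should be treated as an assumption.

```python
# Coefficients of formula (19), keyed by descriptor name.
E_INTERCEPT = 0.61313
E_COEFFS = {
    "polarizability": 0.01169,
    "EHomo-ELumo": 0.88701,
    "I_LPcount": 0.12676,
    "Atom_num": -0.29072,
    "nBT": 0.26076,
    "Mor32p": -0.34881,
    "nHDon": 0.12675,
    "ATSC2i": -0.57231,
    "F01[C-N]": 0.04305,
    "F10[O-O]": 0.14475,
}

# Descriptor values quoted in Example 1 for 4-nitrochlorobenzene.
descriptors = {
    "polarizability": 85.02,
    "EHomo-ELumo": -0.180,
    "I_LPcount": 0,
    "Atom_num": 14,
    "nBT": 14,  # ASSUMPTION: taken from a bond count of the structure
    "Mor32p": -0.160,
    "nHDon": 0,
    "ATSC2i": 0.249,
    "F01[C-N]": 1,
    "F10[O-O]": 0,
}

# Linear model: intercept plus the coefficient-weighted descriptor values.
e_value = E_INTERCEPT + sum(E_COEFFS[name] * descriptors[name] for name in E_COEFFS)
print(f"E = {e_value:.2f}")
```

Under these inputs the sketch gives a value that rounds to the 0.98 reported in the example.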
Example 2
Given the compound 1,4-diisopropylbenzene (CAS number: 100-18-5), its S value is predicted. First, the molecular structure of the compound is optimized with Gaussian at the B3LYP/6-31G(d) level, applying the LANL2DZ pseudopotential to atoms outside the range covered by the basis set and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable and has no imaginary frequencies. Based on the optimized molecular structure, the values of dipole_moment, CCR_energy, EHomo-1, Rperim, ATSC2i, Mor05u, Mor02m, nRCO, nHDon, H-046, SdO and NtN calculated with Dragon 6.0 software are 0.0191, 682.158, -0.24112, 6, 0.6, -4.027, 11.018, 0, 0, 14, 0 and 0, respectively. The predicted value calculated according to formula (20) is 0.47 and the experimental value is 0.474, so the prediction is good.
S = 0.80280 + 0.05210 dipole_moment + 0.00023 CCR_energy + 1.96420 EHomo-1 + 0.03975 Rperim - 0.55400 ATSC2i - 0.05361 Mor05u + 0.01734 Mor02m + 0.24280 nRCO + 0.10889 nHDon - 0.02352 H-046 + 0.01438 SdO + 0.43704 NtN    formula (20).
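Example 2 can be reproduced the same way. In the Python sketch below, the coefficients come from formula (20) and the non-zero descriptor values are those quoted for 1,4-diisopropylbenzene; the variable names are hypothetical, and the group counts that cannot occur in a pure hydrocarbon (nRCO, nHDon, SdO, NtN) are set to 0, which is an assumption wherever the example does not list them explicitly.

```python
# Coefficients of formula (20), keyed by descriptor name.
S_INTERCEPT = 0.80280
S_COEFFS = {
    "dipole_moment": 0.05210,
    "CCR_energy": 0.00023,
    "EHomo-1": 1.96420,
    "Rperim": 0.03975,
    "ATSC2i": -0.55400,
    "Mor05u": -0.05361,
    "Mor02m": 0.01734,
    "nRCO": 0.24280,
    "nHDon": 0.10889,
    "H-046": -0.02352,
    "SdO": 0.01438,
    "NtN": 0.43704,
}

# Descriptor values quoted in Example 2 for 1,4-diisopropylbenzene.
descriptors = {
    "dipole_moment": 0.0191,
    "CCR_energy": 682.158,
    "EHomo-1": -0.24112,
    "Rperim": 6,
    "ATSC2i": 0.6,
    "Mor05u": -4.027,
    "Mor02m": 11.018,
    "nRCO": 0,   # ASSUMPTION: no ketone group in a hydrocarbon
    "nHDon": 0,  # ASSUMPTION: no N or O hydrogen bond donors
    "H-046": 14,
    "SdO": 0,    # ASSUMPTION: no =O in a hydrocarbon
    "NtN": 0,    # ASSUMPTION: no triple-bonded N in a hydrocarbon
}

# Linear model: intercept plus the coefficient-weighted descriptor values.
s_value = S_INTERCEPT + sum(S_COEFFS[name] * descriptors[name] for name in S_COEFFS)
print(f"S = {s_value:.2f}")
```

With these inputs the result rounds to the predicted value of 0.47 given in the example.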
Example 3
Given the compound p-methoxyacetophenone (CAS number: 100-06-1), its A value is predicted. First, the molecular structure of the compound is optimized with Gaussian at the B3LYP/6-31G(d) level, applying the LANL2DZ pseudopotential to atoms outside the range covered by the basis set and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable and has no imaginary frequencies. Based on the optimized molecular structure, the values of H_Qmax and H_Qmean calculated with Dragon 6.0 software are 0.179382 and 0.1588766, respectively, and the values of nRCONH2, nHDon, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O] and nArNHR are all 0. The predicted value calculated according to formula (21) is 0.019 and the experimental value is 0, so the prediction is good.
A = -0.18760 + 0.41354 H_Qmax + 0.83897 H_Qmean + 0.20256 nRCONH2 + 0.28056 nHDon - 0.16539 N-067 + 0.08320 O-057 - 0.07177 SsNH2 + 0.14845 CATS2D_01_AN - 0.12936 CATS2D_03_DD - 0.04406 CATS2D_03_DA - 0.08829 B04[O-O] - 0.21963 nArNHR    formula (21).
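For Example 3, only the two hydrogen charge descriptors are non-zero, so formula (21) reduces to three terms. The Python sketch below illustrates this under the assumption that every descriptor other than H_Qmax and H_Qmean is 0 for p-methoxyacetophenone; the variable names are hypothetical.

```python
# Coefficients of formula (21), keyed by descriptor name.
A_INTERCEPT = -0.18760
A_COEFFS = {
    "H_Qmax": 0.41354,
    "H_Qmean": 0.83897,
    "nRCONH2": 0.20256,
    "nHDon": 0.28056,
    "N-067": -0.16539,
    "O-057": 0.08320,
    "SsNH2": -0.07177,
    "CATS2D_01_AN": 0.14845,
    "CATS2D_03_DD": -0.12936,
    "CATS2D_03_DA": -0.04406,
    "B04[O-O]": -0.08829,
    "nArNHR": -0.21963,
}

# ASSUMPTION: all descriptors except the two hydrogen charges are zero.
descriptors = {name: 0.0 for name in A_COEFFS}
descriptors["H_Qmax"] = 0.179382
descriptors["H_Qmean"] = 0.1588766

# Linear model: intercept plus the coefficient-weighted descriptor values.
a_value = A_INTERCEPT + sum(A_COEFFS[name] * descriptors[name] for name in A_COEFFS)
print(f"A = {a_value:.4f}")
```

The result is close to the predicted value of 0.019 quoted in the example and to the experimental value of 0.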
Example 4
Given the compound propylbenzene (CAS number: 103-65-1), the logarithm of its methanol/water partition coefficient is predicted. First, the molecular structure of the compound is optimized with Gaussian at the B3LYP/6-31G(d) level, applying the LANL2DZ pseudopotential to atoms outside the range covered by the basis set and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable and has no imaginary frequencies. Based on the optimized molecular structure, the values of the descriptors listed in step S2, from polarizability to molarVolume, are calculated with Dragon 6.0 software. The molecular parameters E, S, A, B and V of propylbenzene, calculated according to formulas (22), (23), (24), (25) and (26), are 0.626, 0.380, -0.016, 0.225 and 1.126, respectively. According to formula (27), described in the 2004 publication by Abraham M H et al., the logarithm of the methanol/water partition coefficient of propylbenzene is calculated as 3.42; the experimental value is 3.52, so the prediction is good.
E = 0.61313 + 0.01169 polarizability + 0.88701 (EHomo-ELumo) + 0.12676 I_LPcount - 0.29072 Atom_num + 0.26076 nBT - 0.34881 Mor32p + 0.12675 nHDon - 0.57231 ATSC2i + 0.04305 F01[C-N] + 0.14475 F10[O-O]    formula (22),
S = 0.80280 + 0.05210 dipole_moment + 0.00023 CCR_energy + 1.96420 EHomo-1 + 0.03975 Rperim - 0.55400 ATSC2i - 0.05361 Mor05u + 0.01734 Mor02m + 0.24280 nRCO + 0.10889 nHDon - 0.02352 H-046 + 0.01438 SdO + 0.43704 NtN    formula (23),
A = -0.18760 + 0.41354 H_Qmax + 0.83897 H_Qmean + 0.20256 nRCONH2 + 0.28056 nHDon - 0.16539 N-067 + 0.08320 O-057 - 0.07177 SsNH2 + 0.14845 CATS2D_01_AN - 0.12936 CATS2D_03_DD - 0.04406 CATS2D_03_DA - 0.08829 B04[O-O] - 0.21963 nArNHR    formula (24),
B = -0.01310 + 0.08131 O_LPcount + 0.13056 N_Qcount - 0.09927 Mor12i + 0.18232 nRCO + 0.01458 H-047 + 0.14627 O-056 + 0.95757 P-116 - 0.53368 NddsN + 0.14104 B01[C-N] + 0.03503 F02[C-N]    formula (25),
V = 0.00910 + 1.027 (molarVolume/100)    formula (26),
logK = 0.299 E - 0.671 S + 0.080 A - 3.389 B + 3.512 V + 0.329    formula (27).
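The last step of Example 4, applying the linear free energy relationship of formula (27) to the predicted parameters, can be sketched in Python as follows. The coefficient values are those of formula (27) and the parameter values are the ones quoted for propylbenzene; the dictionary layout and the function name `log_k` are hypothetical.

```python
# Coefficients of formula (27) for the methanol/water system.
METHANOL_WATER = {"c": 0.329, "e": 0.299, "s": -0.671, "a": 0.080, "b": -3.389, "v": 3.512}


def log_k(system, E, S, A, B, V):
    """Evaluate logK = c + e*E + s*S + a*A + b*B + v*V for one solvent system."""
    return (
        system["c"]
        + system["e"] * E
        + system["s"] * S
        + system["a"] * A
        + system["b"] * B
        + system["v"] * V
    )


# Predicted parameters of propylbenzene as quoted in Example 4.
log_k_propylbenzene = log_k(METHANOL_WATER, E=0.626, S=0.380, A=-0.016, B=0.225, V=1.126)
print(f"logK = {log_k_propylbenzene:.2f}")
```

With the rounded parameter values quoted in Example 4, this sketch evaluates to about 3.45, slightly above the 3.42 stated in the example. Other solvent systems only require a different coefficient dictionary.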
Example 5
Given the compound bromobutane (CAS number: 109-65-9), the logarithm of its ethanol/water partition coefficient is predicted. First, the molecular structure of the compound is optimized with Gaussian at the B3LYP/6-31G(d) level, applying the LANL2DZ pseudopotential to atoms outside the range covered by the basis set and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable and has no imaginary frequencies. Based on the optimized molecular structure, the values of the descriptors listed in step S2, from polarizability to molarVolume, are calculated with Dragon 6.0 software. The molecular parameters E, S, A, B and V of bromobutane, calculated according to formulas (22), (23), (24), (25) and (26), are 0.252, 0.279, 0.019, 0.052 and 0.799, respectively. According to formula (28), described in the 2004 publication by Abraham M H et al., the logarithm of the ethanol/water partition coefficient of bromobutane is calculated as 3.45, and the experimental value is 3.52.
logK = 0.409 E - 0.959 S + 0.186 A - 3.645 B + 3.928 V + 0.208    formula (28).
Example 6
Given the compound dimethyl ether (CAS number: 115-10-6), the logarithm of its pentanol/water partition coefficient is predicted. First, the molecular structure of the compound is optimized with Gaussian at the B3LYP/6-31G(d) level, applying the LANL2DZ pseudopotential to atoms outside the range covered by the basis set and adding the keywords pop=NBO and Volume; the optimized molecular structure is stable and has no imaginary frequencies. Based on the optimized molecular structure, the values of the descriptors listed in step S2, from polarizability to molarVolume, are calculated with Dragon 6.0 software. The molecular parameters E, S, A, B and V of the compound, calculated according to formulas (22), (23), (24), (25) and (26), are given as 0.252, 0.279, 0.019, 0.052 and 0.799, respectively. According to formula (29), described in the 2004 publication by Abraham M H et al., the logarithm of the partition coefficient is given as 3.45 and the experimental value as 3.52, so the prediction is good.
logK = 0.521 E - 1.294 S + 0.208 A - 3.908 B + 4.208 V + 0.08    formula (29).
As the above examples show, the method for predicting the molecular structure parameters of chemicals provided by the invention can be applied to many classes of organic compounds, such as alkanes, alkenes, alkynes, alcohols, ethers, phenols, ketones, aldehydes, esters, quinones, substituted biphenyls, anilines, halogenated hydrocarbons, nitroaromatics, alkylbenzenes, azobenzenes, organic acids, benzamides, phthalates, polybrominated diphenyl ethers, polycyclic aromatic hydrocarbons, sulfonic acid derivatives, organophosphorus compounds, organosulfides, organoiodides, organofluorides, heterocyclic compounds and organosilicon compounds. The models are built on measured values of the molecular structure parameters E, S, A, B, L and V for 3838 compounds, which gives the method a very wide applicability domain. E, S, A, B, L and V are each modeled by linear regression, so the model algorithm is transparent, simple and easy to interpret. From the predicted parameters E, S, A, B, L and V, the multi-parameter linear free energy relationship of an organic compound can be predicted.
The method for predicting the multi-parameter linear free energy relationship of the organic compound is simple, convenient and quick, has low cost, can provide data support for chemical supervision, and has important significance for ecological risk evaluation of chemicals.
The above is only a preferred embodiment of the present invention, and the scope of protection of the invention is not limited to it. Any equivalent substitution or modification that a person skilled in the art makes within the technical scope disclosed by the invention, according to its technical solutions and inventive concept, shall fall within the scope of protection of the invention.

Claims (1)

1. A method of predicting a molecular structural parameter of a chemical, the method comprising:
S1, optimizing the molecular structure of an organic compound with Gaussian at the B3LYP/6-31G(d) level, applying the LANL2DZ pseudopotential to atoms outside the range covered by the basis set, and adding the keywords pop=NBO and Volume, wherein the optimized molecular structure is stable and has no imaginary frequencies;
S2, calculating, from the optimized molecular structure, the values of polarizability, EHomo-ELumo, I_LPcount, Atom_num, nBT, Mor32p, nHDon, ATSC2i, F01[C-N], F10[O-O], dipole_moment, CCR_energy, EHomo-1, Rperim, Mor05u, Mor02m, nRCO, H-046, SdO, NtN, H_Qmax, H_Qmean, nRCONH2, N-067, O-057, SsNH2, CATS2D_01_AN, CATS2D_03_DD, CATS2D_03_DA, B04[O-O], nArNHR, O_LPcount, N_Qcount, Mor12i, H-047, O-056, P-116, NddsN, B01[C-N], F02[C-N], F_LPcount, Br_LPcount, H_Qcount, NaasC, SssCH2, F01[O-Si] and molarVolume,
wherein polarizability is the polarizability, EHomo-ELumo is the energy difference between the frontier molecular orbitals, I_LPcount is the number of lone electron pairs on all iodine atoms, Atom_num is the total number of atoms, nBT is the number of chemical bonds, Mor32p is 3D-MoRSE signal 32 weighted by polarizability, nHDon is the number of hydrogen bond donor N and O atoms, ATSC2i is the centred Broto-Moreau autocorrelation of lag 2 weighted by ionization potential, F01[C-N] is the frequency of C-N at topological distance 1, F10[O-O] is the frequency of O-O at topological distance 10, dipole_moment is the dipole moment, CCR_energy is the CCR energy, EHomo-1 is the energy of the second-highest occupied orbital, Rperim is the ring perimeter of the molecule, Mor05u is 3D-MoRSE signal 05 unweighted, Mor02m is 3D-MoRSE signal 02 weighted by mass, nRCO is the number of aliphatic ketone groups, H-046 is hydrogen attached to an sp3-hybridized carbon atom whose adjacent carbon atom carries no halogen, SdO is the sum of E-states of =O, NtN is the number of ≡N in the molecule, H_Qmax is the highest charge on the hydrogen atoms, H_Qmean is the average charge on the hydrogen atoms, nRCONH2 is the number of aliphatic primary amides, N-067 is Al2-NH, O-057 is the oxygen atom of a phenol, enol or carboxyl group, SsNH2 is the sum of E-states of -NH2, CATS2D_01_AN is the CATS2D descriptor for hydrogen bond acceptor-negative charge at lag 01, CATS2D_03_DD is the CATS2D descriptor for hydrogen bond donor-hydrogen bond donor at lag 03, CATS2D_03_DA is the CATS2D descriptor for hydrogen bond donor-hydrogen bond acceptor at lag 03, B04[O-O] is the presence/absence of O-O at topological distance 4, nArNHR is the number of aromatic secondary amines, O_LPcount is the number of lone electron pairs on all oxygen atoms, N_Qcount is the number of nitrogen atoms, Mor12i is 3D-MoRSE signal 12 weighted by ionization potential, H-047 is hydrogen attached to sp2- and sp3-hybridized carbon atoms, O-056 is the oxygen atom of a hydroxyl group, P-116 is the number of R3-P=X groups, NddsN is the number of -N= (atom type ddsN), B01[C-N] is the presence/absence of C-N at topological distance 1, F02[C-N] is the frequency of C-N at topological distance 2, F_LPcount is the number of lone electron pairs on all fluorine atoms, Br_LPcount is the number of lone electron pairs on all bromine atoms, H_Qcount is the number of hydrogen atoms, NaasC is the number of aasC-type atoms, SssCH2 is the sum of E-states of -CH2-, F01[O-Si] is the frequency of O-Si at topological distance 1, and molarVolume is the molar volume;
s3, calculating the molecular structure parameters of the organic compound: parameter E according to formula (1), parameter S according to formula (2), parameter A according to formula (3), parameter B according to formula (4), parameter L according to formula (5) and parameter V according to formula (6),
E = 0.61313 + 0.01169 polarizability + 0.88701 (E_HOMO-E_LUMO) + 0.12676 I_LPcount - 0.29072 Atom_num + 0.26076 nBT - 0.34881 Mor32p + 0.12675 nHDon - 0.57231 ATSC2i + 0.04305 F01[C-N] + 0.14475 F10[O-O]    formula (1),
S = 0.80280 + 0.05210 dipole_moment + 0.00023 CCR_energy + 1.96420 E_HOMO-1 + 0.03975 Rperim - 0.55400 ATSC2i - 0.05361 Mor05u + 0.01734 Mor02m + 0.24280 nRCO + 0.10889 nHDon - 0.02352 H-046 + 0.01438 SdO + 0.43704 NtN    formula (2),
A = -0.18760 + 0.41354 H_Qmax + 0.83897 H_Qmean + 0.20256 nRCONH2 + 0.28056 nHDon - 0.16539 N-067 + 0.08320 O-057 - 0.07177 SsNH2 + 0.14845 CATS2D_01_AN - 0.12936 CATS2D_03_DD - 0.04406 CATS2D_03_DA - 0.08829 B04[O-O] - 0.21963 nArNHR    formula (3),
B = 0.01310 + 0.08131 O_LPcount + 0.13056 N_Qcount - 0.09927 Mor12i + 0.18232 nRCO + 0.01458 H-047 + 0.14627 O-056 + 0.95757 P-116 - 0.53368 NddsN + 0.14104 B01[C-N] + 0.03503 F02[C-N]    formula (4),
L = 0.44713 + 0.03226 polarizability - 0.16282 F_LPcount + 0.07766 Br_LPcount + 0.25237 Atom_num - 0.35911 H_Qcount + 0.48173 nHDon - 0.08596 NaasC + 0.06518 SssCH2 - 0.43300 F01[O-Si]    formula (5),
V = 0.00910 + 1.027 (molarVolume/100)    formula (6),
wherein the molecular parameter E is the excess molar refraction, the molecular parameter S is the dipolarity/polarizability, the molecular parameter A is the hydrogen-bond acidity, the molecular parameter B is the hydrogen-bond basicity, the molecular parameter L is the logarithm of the n-hexadecane/air partition coefficient, and the molecular parameter V is the McGowan characteristic molecular volume.
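Because formulas (1) to (6) are plain linear regressions, step s3 reduces to an intercept plus a weighted sum of descriptor values. The following is a minimal sketch of that arithmetic, not the applicant's implementation. The coefficients are transcribed from formulas (1) to (6); the names MODELS and predict, the dictionary layout, and the key spellings E_HOMO-E_LUMO and E_HOMO-1 are assumptions made for illustration. The descriptor values are assumed to have been computed beforehand in step s2, in whatever units the regressions were fitted with, which this text does not state.

```python
# Sketch only: evaluates formulas (1)-(6) as intercept + sum(weight * descriptor).
# "const" holds the intercept; every other key is a descriptor name from step s2.
MODELS = {
    "E": {"const": 0.61313, "polarizability": 0.01169, "E_HOMO-E_LUMO": 0.88701,
          "I_LPcount": 0.12676, "Atom_num": -0.29072, "nBT": 0.26076,
          "Mor32p": -0.34881, "nHDon": 0.12675, "ATSC2i": -0.57231,
          "F01[C-N]": 0.04305, "F10[O-O]": 0.14475},
    "S": {"const": 0.80280, "dipole_moment": 0.05210, "CCR_energy": 0.00023,
          "E_HOMO-1": 1.96420, "Rperim": 0.03975, "ATSC2i": -0.55400,
          "Mor05u": -0.05361, "Mor02m": 0.01734, "nRCO": 0.24280,
          "nHDon": 0.10889, "H-046": -0.02352, "SdO": 0.01438, "NtN": 0.43704},
    "A": {"const": -0.18760, "H_Qmax": 0.41354, "H_Qmean": 0.83897,
          "nRCONH2": 0.20256, "nHDon": 0.28056, "N-067": -0.16539,
          "O-057": 0.08320, "SsNH2": -0.07177, "CATS2D_01_AN": 0.14845,
          "CATS2D_03_DD": -0.12936, "CATS2D_03_DA": -0.04406,
          "B04[O-O]": -0.08829, "nArNHR": -0.21963},
    "B": {"const": 0.01310, "O_LPcount": 0.08131, "N_Qcount": 0.13056,
          "Mor12i": -0.09927, "nRCO": 0.18232, "H-047": 0.01458,
          "O-056": 0.14627, "P-116": 0.95757, "NddsN": -0.53368,
          "B01[C-N]": 0.14104, "F02[C-N]": 0.03503},
    "L": {"const": 0.44713, "polarizability": 0.03226, "F_LPcount": -0.16282,
          "Br_LPcount": 0.07766, "Atom_num": 0.25237, "H_Qcount": -0.35911,
          "nHDon": 0.48173, "NaasC": -0.08596, "SssCH2": 0.06518,
          "F01[O-Si]": -0.43300},
    # Formula (6) scales molarVolume by 1/100 before applying the slope 1.027.
    "V": {"const": 0.00910, "molarVolume": 1.027 / 100},
}


def predict(descriptors):
    """Return {"E": ..., "S": ..., "A": ..., "B": ..., "L": ..., "V": ...}.

    `descriptors` maps descriptor names to numeric values.
    A missing descriptor raises KeyError rather than being treated as zero.
    """
    results = {}
    for parameter, coefficients in MODELS.items():
        total = coefficients["const"]
        for name, weight in coefficients.items():
            if name != "const":
                total += weight * descriptors[name]
        results[parameter] = total
    return results


if __name__ == "__main__":
    # Placeholder input, NOT real data: every descriptor set to zero except
    # a made-up molar volume, just to show the call shape.
    placeholder = {name: 0.0
                   for model in MODELS.values()
                   for name in model if name != "const"}
    placeholder["molarVolume"] = 100.0
    for parameter, value in predict(placeholder).items():
        print(parameter, round(value, 5))
```

Keeping the coefficients in one table rather than six hand-written expressions makes each term easy to check against the corresponding formula, and shared descriptors such as nHDon and ATSC2i are supplied only once.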
CN201811378715.5A 2018-11-19 2018-11-19 Method for predicting molecular structure parameters of chemicals Active CN109493922B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811378715.5A CN109493922B (en) 2018-11-19 2018-11-19 Method for predicting molecular structure parameters of chemicals

Publications (2)

Publication Number Publication Date
CN109493922A (en) 2019-03-19
CN109493922B (en) 2021-06-29

Family

ID=65696276

Country Status (1)

Country Link
CN (1) CN109493922B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111986740B (en) * 2020-09-03 2024-05-14 深圳赛安特技术服务有限公司 Method for classifying compounds and related equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AR058852A1 (en) * 2005-12-09 2008-02-27 Dow Global Technologies Inc PROCESSES TO CONTROL THE DISTRIBUTION OF THE MOLECULAR WEIGHT IN THE ETHYLENE / ALFA OLEFINE COMPOSITIONS
JP2015535343A (en) * 2012-10-29 2015-12-10 ユニバーシティ・オブ・ユタ・リサーチ・ファウンデイション Functionalized nanotube sensors and related methods
CN108140920A (en) * 2015-10-27 2018-06-08 住友化学株式会社 Magnesium air electrode for cell and magnesium air battery and aromatic compound and metal complex
CN106588802A (en) * 2016-10-31 2017-04-26 南京工程学院 Bis(tetrazole-2-oxy-4-hydro)amine, design method, and application thereof
CN107563133B (en) * 2017-08-30 2021-05-04 大连理工大学 Method for predicting chlorine free radical reaction rate constant of organic chemicals by adopting quantitative structure-activity relation model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant