CN113326664A - Method for predicting dielectric constant of glass based on M5P algorithm - Google Patents

Publication number
CN113326664A
Authority
CN
China
Prior art keywords
cation
dielectric constant
cluster
model
glass
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110717315.8A
Other languages
Chinese (zh)
Other versions
CN113326664B (en)
Inventor
赵谦
赵明
刘鑫
陈阳
匡宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Fiberglass Research and Design Institute Co Ltd
Original Assignee
Nanjing Fiberglass Research and Design Institute Co Ltd
Application filed by Nanjing Fiberglass Research and Design Institute Co Ltd
Priority to CN202110717315.8A
Publication of CN113326664A
Application granted
Publication of CN113326664B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/02 Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]

Abstract

The invention discloses a method for predicting the dielectric constant of glass based on the M5P algorithm, and belongs to the technical field of glass performance prediction. The method provides a new way of constructing atomic structure models of oxide clusters with different symmetries and takes the electronic structure into account through first-principles calculation. Because only clusters are constructed, not crystals, the state of the system is reflected without increasing the calculation cost, and the prediction accuracy of the dielectric constant is ensured. In addition, the dielectric constant prediction model is built with an M5P model tree, so a single global linear regression is not required: the M5P model tree divides the sample features into several piecewise linear regressions, the tree is more interpretable, and the training time of the algorithm is short.

Description

Method for predicting dielectric constant of glass based on M5P algorithm
Technical Field
The invention relates to a method for predicting a glass dielectric constant based on an M5P algorithm, and belongs to the technical field of glass performance prediction.
Background
Dielectric constant, also known as permittivity or relative permittivity, is an important parameter characterizing the electrical properties of a dielectric or insulating material; the smaller the dielectric constant, the faster signals propagate. With the development of electronic technology and the rapid rise of 5G communication, miniaturization of electronic devices has become a mainstream trend, and the electromagnetic wave frequencies used in electronic devices have reached the GHz level. This requires sealing glass with a low dielectric constant, which in an electronic package serves to protect the circuitry, provide electrical insulation, and prevent signal distortion. Low dielectric constant glass also reduces signal relaxation and crosstalk, so it has broad application prospects.
Publication No. CN110648727A, entitled "preparation method of glass material with specific physical properties", discloses a preparation method that takes the product of the property and the content of the cation elements of the glass network modifier and network intermediate oxides as the input-layer variables and the glass properties as the output variables, and combines them with a neural network algorithm to construct an intelligent composition design model. However, this method has two disadvantages. First, the amount of data on glass material performance is currently small, so machine learning methods such as neural networks easily overfit. Second, the descriptors fed to the machine learning model do not reflect the basic physical properties of the chemical components of the glass well, so a well-fitted model is accurate only in a limited local region of chemical composition space, such as the region covered by its training data set. Such models can only interpolate; they cannot find optimal compositions in a broader composition space, that is, they cannot extrapolate.
Publication No. CN110364231B, entitled "method for predicting the properties of a multi-component glass system", discloses a method that performs structural screening based on first principles and calculates the properties of the target glass according to a lever model of the multi-component glass system. However, this method does not reflect the basic physical properties of the chemical components of the glass well, and it does not use a machine learning algorithm to predict glass performance, so it cannot predict the dielectric constant of glass accurately.
Disclosure of Invention
In order to accurately predict the dielectric constant of glass without being limited by chemical composition, the invention provides a method for predicting the dielectric constant of glass based on an M5P algorithm, which comprises the following steps:
step 1, collecting dielectric constant data of glass materials composed of different components, and constructing a dielectric constant database, wherein the database comprises glass components mapped one by one and dielectric constants corresponding to the glass components;
step 2, constructing atomic structure models of oxide clusters with different symmetries in the oxide glass material based on first principles, and taking as performance parameters the binding energy per unit cation i of each cluster (Figure BDA0003135366490000021), the equilibrium bond length between the cation and the oxygen ions in each cluster containing the cation i, the Bader charge of the cation i in each cluster, and the HOMO-LUMO gap of each cluster containing the cation i, so as to construct descriptors containing the "material gene" of the dielectric constant; the HOMO-LUMO gap of a cluster containing the cation i is the energy difference between the highest occupied molecular orbital and the lowest unoccupied molecular orbital of that cluster structure;
step 3, constructing a training set, a verification set and a test set based on the dielectric constant database constructed in the step 1 and the descriptor constructed in the step 2;
step 4, constructing a dielectric constant prediction model based on the M5P model tree, and training the constructed dielectric constant prediction model according to the training set, the verification set and the test set constructed in the step 3 to obtain a trained dielectric constant prediction model;
step 5, for the glass material to be predicted, predicting its dielectric constant by using the trained dielectric constant prediction model.
Optionally, step 2 includes:
step 2-1, constructing atomic structure models of oxide clusters with different symmetries as the unit cells for first-principles calculation;
step 2-2, performing a first-principles calculation on the unit cell structure of each type of oxide cluster constructed in step 2-1 to obtain the cluster energy $E_{cluster}$ and the structural constants of each unit cell;
step 2-3, for each unit cell structure constructed in step 2-1, performing further first-principles calculations to obtain a set of performance parameters, and constructing from it the descriptors used for machine learning:
Figure BDA0003135366490000022
Figure BDA0003135366490000023
where n takes every non-zero integer between -3 and +3, $C_i$ is the proportion of the corresponding cation i, Cation is the set of cations i, and $x_i$ is a performance parameter of cation i obtained by first-principles calculation. The performance parameters include the binding energy per unit cation i of each cluster (Figure BDA0003135366490000024), the equilibrium bond length between the cation and the oxygen ions in each cluster containing the cation i, the Bader charge of the cation i in each cluster, and the HOMO-LUMO gap of each cluster containing the cation i; the HOMO-LUMO gap is the energy difference between the highest occupied molecular orbital and the lowest unoccupied molecular orbital of the cluster structure containing the cation i.
Optionally, when the atomic structure model of the oxide clusters with different symmetries is constructed in step 2-1, the construction is performed according to the following rules:
(1) each cluster is located in a cubic unit cell of the size given in Figure BDA0003135366490000025;
(2) for each cation present in the glass composition, the atom corresponding to that cation is placed in the center of the unit cell, around which 2 oxygen atoms are added in a linear molecular manner; simultaneously adding a hydrogen atom to each oxygen atom along the extension direction of the atomic bond from the central atom to the oxygen atom, wherein each oxygen atom and each hydrogen atom form a hydroxyl group;
(3) for each cation present in the glass composition, placing the atom corresponding to the cation in the center of the unit cell, adding 3 oxygen atoms around it in the same plane with three-fold rotational symmetry, and simultaneously adding a hydrogen atom on each oxygen atom along the extension direction of the atomic bond from the center atom to the oxygen atom, wherein each oxygen atom and hydrogen atom form a hydroxyl group;
(4) for each cation present in the glass composition, placing the atom corresponding to the cation in the center of the unit cell, adding 4 oxygen atoms around it in a tetrahedrally symmetric manner, while adding one hydrogen atom on each oxygen atom along the extension of the atomic bond from the center atom to the oxygen atom, each oxygen atom and hydrogen atom constituting one hydroxyl group;
(5) for each cation present in the glass composition, the atom corresponding to the cation is placed in the center of the unit cell, 6 oxygen atoms are added around it in an octahedral symmetry, while on each oxygen atom a hydrogen atom is added along the extension of the atomic bond from the central atom to the oxygen atom, each oxygen atom and hydrogen atom constituting a hydroxyl group.
Optionally, the binding energy per unit cation i of each cluster (Figure BDA0003135366490000031) is calculated as follows:
the sum of the energies of the isolated constituents, of the same number and type, is subtracted from the cluster energy $E_{cluster}$ of the oxide cluster in the unit cell structure constructed in step 2-1:
$$E_T = E_{cluster} - \left(E_i + l\,E_{OH}\right)$$
where $E_T$ denotes the binding energy, l is the number of oxygen atoms in the oxide cluster, and $E_i$ and $E_{OH}$ are the energies of a single cation and a single hydroxyl group, respectively, each placed in a cubic unit cell of the size given in Figure BDA0003135366490000033.
Optionally, when the dielectric constant database is constructed in step 1, the method further includes preprocessing the acquired dielectric constant data, where the preprocessing includes:
for any two glass compositions in the database, judging whether the following two conditions are satisfied simultaneously:
condition 1: for every oxide component, the difference in mole fraction between the two compositions is less than or equal to a first preset threshold, expressed as a percentage;
condition 2: the difference between the two dielectric constants is greater than a second preset threshold, expressed as a percentage;
if both conditions are satisfied, the corresponding glass compositions and their dielectric constant data are removed from the database.
Optionally, the first preset threshold is 2%, and the second preset threshold is 10%.
Optionally, in step 4, constructing a dielectric constant prediction model based on the M5P model tree, including:
step 4-1, setting the set of maximum numbers of layers of the M5P model tree, $h = \{h_1, h_2, h_3, \ldots, h_g\}$, and the set of minimum sample numbers for node splitting, $f = \{f_1, f_2, f_3, \ldots, f_j\}$;
Step 4-2, for $(h_1, f_1)$, partitioning the sample data based on its standard deviation to select the characteristic parameters (Figure BDA0003135366490000034) used as the split nodes of the binary tree;
Step 4-3, for each leaf node, establishing a multivariate linear regression model between the dielectric constant and the characteristic parameters Unpar that remain undivided along the branch:
$$\hat{y} = \theta_{0l} + \sum_{d \in D} \theta_{l,d}\, d\, I(d \in \mathrm{Unpar})$$
where Unpar is the set of descriptors not yet used for splitting in the branch, θ is the set of regression parameters, D is the set of all descriptors, and I is the indicator function;
Step 4-4, using the training set $\{D_1\}$ from step 3, determining the coefficients $[\theta_{0l}, \theta_l]$ of the linear model of each leaf node by the least squares method, where $[\theta_{0l}, \theta_l]$ are the regression parameters of the data set $D_l$;
Step 4-5, using the validation set from step 3, calculating the mean square error of the linear regression model on the validation set:
$$MSE = \frac{1}{N^*} \sum_{a=1}^{N^*} \left(\hat{y}_a - y_a\right)^2$$
where $N^*$ is the number of data points in the validation set, $\hat{y}_a$ is the predicted value for validation sample a, and $y_a$ is its actual value;
step 4-6, adopting k-fold cross validation, repeating step 4-3 to step 4-5 k times, and calculating the average mean square error of the k-fold cross validation under the condition $(h_1, f_1)$:
$$\overline{MSE}_{(h_1, f_1)} = \frac{1}{k} \sum_{l=1}^{k} MSE_l$$
where $MSE_l$ is the mean square error obtained on the l-th fold;
Step 4-7, adjusting the hyper-parameters (h, f) of the M5P tree model: setting h in turn to $h_1, h_2, h_3, \ldots, h_g$ and f in turn to $f_1, f_2, f_3, \ldots, f_j$, repeating step 4-2 to step 4-6, and calculating the corresponding average mean square error $\overline{MSE}_{(h, f)}$ in sequence;
Step 4-8, selecting the $(h, f)$ value corresponding to the smallest $\overline{MSE}_{(h, f)}$ as the optimal hyper-parameters of the M5P model;
step 4-9, after the hyper-parameter adjustment is finished, taking the union of the training set {D} and the validation set {V} as the training set of the final model, namely $\{S_1, S_2, S_3, \ldots, S_k\}$;
Step 4-10, following step 4-2 to step 4-4, training on the training set $\{S_1, S_2, S_3, \ldots, S_k\}$ to obtain the regression coefficients of the leaf nodes of the M5P model, which together form the dielectric constant prediction model.
Optionally, the training set { D } and the verification set { V } are constructed by a k-fold cross-validation method.
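The hyper-parameter selection in steps 4-5 to 4-8 can be sketched as a plain grid search. This is a minimal illustration, not the patented implementation: `fit` is a hypothetical callable standing in for the M5P tree training of steps 4-2 to 4-4, and the data layout (lists of `(x, y)` pairs) is an assumption made only for this example.

```python
from itertools import product


def mse(predicted, actual):
    """Mean square error between predicted and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def grid_search(h_values, f_values, cv_pairs, fit):
    """Return ((h, f), mean_mse) with the smallest k-fold mean MSE.

    cv_pairs: list of (train, validation) pairs, each a list of (x, y).
    fit(train, h, f): hypothetical trainer returning a predict(x) function.
    """
    best = None
    for h, f in product(h_values, f_values):
        scores = []
        for train, val in cv_pairs:
            predict = fit(train, h, f)
            predicted = [predict(x) for x, _ in val]
            scores.append(mse(predicted, [y for _, y in val]))
        mean_score = sum(scores) / len(scores)
        # Keep the first (h, f) pair that reaches the lowest mean MSE.
        if best is None or mean_score < best[1]:
            best = ((h, f), mean_score)
    return best
```

After the best pair is found, the final model would be retrained on the union of the training and validation data, as steps 4-9 and 4-10 describe.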
The invention also provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the above method when executing the computer program.
The invention has the beneficial effects that:
The invention provides a new way of constructing atomic structure models of oxide clusters with different symmetries and takes the electronic structure into account through first-principles calculation. Because only clusters are constructed, not crystals, the state of the system is reflected without increasing the calculation cost, and the prediction accuracy of the dielectric constant is ensured. In addition, the dielectric constant prediction model is built with an M5P model tree, so a single global linear regression is not required: the M5P model tree divides the sample features into several piecewise linear regressions, the tree is more interpretable, and the training time of the algorithm is short.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a method for predicting the dielectric constant of glass based on the first principle in an embodiment of the present invention.
Fig. 2 is a flow chart of a first principle-based descriptor construction in an embodiment of the invention.
Fig. 3 is a schematic view of an atomic structural model of an oxide cluster in a tetrahedrally symmetric manner.
Fig. 4 is a schematic view of an atomic structure model of an oxide cluster in an octahedral symmetry manner.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Example one:
the present embodiment provides a method for predicting the dielectric constant of glass based on M5P algorithm, referring to fig. 1, the method includes:
step 1, collecting dielectric constant data of glass materials composed of different components, and constructing a dielectric constant database, wherein the database comprises glass components mapped one by one and dielectric constants corresponding to the glass components;
step 2, constructing descriptors containing the "material gene" of the dielectric constant of the oxide glass material based on first principles, comprising:
step 2-1, constructing atomic structure models of oxide clusters with different symmetries as the unit cells for first-principles calculation;
step 2-2, performing a first-principles calculation on the unit cell structure of each type of oxide cluster constructed in step 2-1 to obtain the cluster energy $E_{cluster}$ and the structural constants of each unit cell;
step 2-3, for each unit cell structure constructed in step 2-1, performing further first-principles calculations to obtain a set of performance parameters, and constructing from it the descriptors used for machine learning:
Figure BDA0003135366490000051
Figure BDA0003135366490000052
where n takes every non-zero integer between -3 and +3, $C_i$ is the proportion of the corresponding cation i, Cation is the set of cations i, and $x_i$ is a performance parameter of cation i obtained by first-principles calculation. The performance parameters include the binding energy per unit cation i of each cluster (Figure BDA0003135366490000053), the equilibrium bond length between the cation and the oxygen ions in each cluster containing the cation i, the Bader charge of the cation i in each cluster, and the HOMO-LUMO gap of each cluster containing the cation i; the HOMO-LUMO gap is the energy difference between the highest occupied molecular orbital and the lowest unoccupied molecular orbital of the cluster structure containing the cation i.
Step 3, constructing a training set, a verification set and a test set based on the dielectric constant database constructed in the step 1 and the descriptor constructed in the step 2;
step 4, constructing a dielectric constant prediction model based on the M5P model tree, and training the constructed dielectric constant prediction model according to the training set, the verification set and the test set constructed in the step 3 to obtain a trained dielectric constant prediction model;
step 5, for the glass material to be predicted, predicting its dielectric constant by using the trained dielectric constant prediction model.
Example two:
the present embodiment provides a method for predicting the dielectric constant of glass based on M5P algorithm, referring to fig. 1, the method includes:
step 1, collecting dielectric constant data of glass materials composed of different components, and constructing a dielectric constant database, wherein the database comprises glass components mapped one by one and dielectric constants corresponding to the glass components;
in practical application, the acquired dielectric constant data are further preprocessed, the preprocessing comprising:
for any two glass compositions in the database, judging whether the following two conditions are satisfied simultaneously:
condition 1: for every oxide component, the difference in mole fraction between the two compositions is less than or equal to a first preset threshold, expressed as a percentage;
condition 2: the difference between the two dielectric constants is greater than a second preset threshold, expressed as a percentage;
if both conditions are satisfied, the corresponding glass compositions and their dielectric constant data are removed from the database.
The first and second preset thresholds in the above two conditions may be determined by those skilled in the art from prior knowledge; in the present application, the first preset threshold is 2% and the second preset threshold is 10%.
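The consistency check above can be sketched as a pairwise scan over the database. This is a minimal sketch under stated assumptions: compositions are dictionaries mapping oxide names to mol %, the dielectric constant difference is taken relative to the larger of the two values, and both members of a conflicting pair are dropped. The text does not fix these details, and the function names are hypothetical.

```python
from itertools import combinations


def find_conflicting(records, comp_tol=2.0, eps_tol=10.0):
    """Return the indices of records that conflict with another record.

    records: list of (composition, dielectric_constant) tuples, where
    composition maps oxide name -> mol %. Two records conflict when every
    oxide differs by at most comp_tol mol % while the dielectric constants
    differ by more than eps_tol percent (relative to the larger value,
    which is an assumption of this sketch).
    """
    bad = set()
    for (i, (ca, ea)), (j, (cb, eb)) in combinations(enumerate(records), 2):
        oxides = set(ca) | set(cb)
        close = all(abs(ca.get(o, 0.0) - cb.get(o, 0.0)) <= comp_tol
                    for o in oxides)
        rel_diff = abs(ea - eb) / max(abs(ea), abs(eb)) * 100.0
        if close and rel_diff > eps_tol:
            bad.update((i, j))
    return bad


def clean(records, comp_tol=2.0, eps_tol=10.0):
    """Drop every record that takes part in at least one conflict."""
    bad = find_conflicting(records, comp_tol, eps_tol)
    return [r for k, r in enumerate(records) if k not in bad]
```

The scan is quadratic in the number of records, which is acceptable for the small databases the background section describes.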
Step 2, constructing descriptors containing 'material genes' of the dielectric constant of the oxide glass material based on a first principle, specifically, see fig. 2;
step 2-1, constructing atomic structure models of oxide clusters with different symmetries as unit cells calculated by a first principle;
specifically, the construction follows these rules:
(1) each cluster is located in a cubic unit cell of the size given in Figure BDA0003135366490000061;
(2) for each cation present in the glass composition, the atom corresponding to that cation is placed in the center of the unit cell, around which 2 oxygen atoms are added in a linear molecular manner; simultaneously adding a hydrogen atom to each oxygen atom along the extension direction of the atomic bond from the central atom to the oxygen atom, wherein each oxygen atom and each hydrogen atom form a hydroxyl group;
(3) for each cation present in the glass composition, placing the atom corresponding to the cation in the center of the unit cell, adding 3 oxygen atoms around it in the same plane with three-fold rotational symmetry, and simultaneously adding a hydrogen atom on each oxygen atom along the extension direction of the atomic bond from the center atom to the oxygen atom, wherein each oxygen atom and hydrogen atom form a hydroxyl group;
(4) for each cation present in the glass composition, the atom corresponding to the cation is placed in the center of the unit cell, 4 oxygen atoms are added around the cation in a tetrahedrally symmetric manner as shown in fig. 3, and simultaneously, a hydrogen atom is added to each oxygen atom along the extension direction of the atomic bond from the center atom to the oxygen atom, and each oxygen atom and hydrogen atom form a hydroxyl group;
(5) for each cation present in the glass composition, the atom corresponding to the cation is placed in the center of the unit cell, 6 oxygen atoms are added around the cation in an octahedral symmetrical manner as shown in fig. 4, and simultaneously, one hydrogen atom is added to each oxygen atom along the extension direction of the atomic bond from the central atom to the oxygen atom, and each oxygen atom and hydrogen atom form a hydroxyl group;
that is, the above structures (2) to (5) are constructed once for each cation.
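Rules (2) to (5) can be sketched as a small coordinate generator. This is an illustration only: the starting bond lengths `d_mo` and `d_oh` are hypothetical values not given in the text, and the cluster is centred at the origin, so a real input would shift every atom by half the cell edge to place the cation at the cell center.

```python
import math

_S = 1.0 / math.sqrt(3.0)

# Unit vectors from the central cation to each oxygen atom,
# one set per coordination geometry named in rules (2)-(5).
DIRECTIONS = {
    "linear": [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)],
    "trigonal": [(math.cos(2 * math.pi * k / 3),
                  math.sin(2 * math.pi * k / 3), 0.0) for k in range(3)],
    "tetrahedral": [(_S, _S, _S), (_S, -_S, -_S),
                    (-_S, _S, -_S), (-_S, -_S, _S)],
    "octahedral": [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
                   (0.0, -1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)],
}


def build_cluster(cation, geometry, d_mo=1.8, d_oh=0.97):
    """Return [(symbol, (x, y, z)), ...] for a cation surrounded by OH groups.

    d_mo and d_oh are hypothetical starting bond lengths in angstrom;
    the structural relaxation is expected to refine them.
    """
    atoms = [(cation, (0.0, 0.0, 0.0))]
    for u in DIRECTIONS[geometry]:
        atoms.append(("O", tuple(d_mo * c for c in u)))
        # The hydrogen lies on the extension of the cation-oxygen bond,
        # so cation, O and H are collinear.
        atoms.append(("H", tuple((d_mo + d_oh) * c for c in u)))
    return atoms
```

Calling `build_cluster` once per geometry for each cation in the glass yields the four structures per cation that the rules describe.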
Step 2-2, performing a first-principles calculation on the unit cell structure of each type of oxide cluster constructed in step 2-1 to obtain the cluster energy $E_{cluster}$ and the structural constants of each unit cell;
step 4a: the first-principles software Quantum ESPRESSO is used;
step 4b: pw.x of Quantum ESPRESSO is run with the following parameters to optimize the unit cell and obtain the cluster energy $E_{cluster}$ and the structural constants of the constructed unit cell:
the pseudopotential library supplied with Quantum ESPRESSO is used, with pseudopotential type PAW, functional type PBE, and nonlinear core correction;
the cutoff energy is 45 Ry (about 612 eV), and the self-consistent field convergence criterion is $10^{-5}$ Ry;
the calculation mode obtains the equilibrium bond lengths of the cluster inside the unit cell by structural optimization (calculation = 'relax');
the unit cell structure keeps its original symmetry during the optimization (nosym = .FALSE.);
electron orbital occupation near the Fermi level is treated in the manner appropriate for insulators (occupations = 'fixed');
all calculations are non-spin-polarized (nspin = 1);
for all cluster unit cells, the k-point grid is 1 × 1 × 1;
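The settings listed above could be collected into a pw.x input file along the following lines. This is a hedged sketch, not the inventors' input: the cell edge, pseudopotential file names and atomic masses are supplied by the caller because the text does not state them, and the occupation setting is assumed to be `occupations = 'fixed'`, the pw.x keyword normally used for insulators. The function name is hypothetical.

```python
def pw_relax_input(prefix, atoms, cell_edge, pseudo_files, masses):
    """Return the text of a pw.x input for relaxing one cluster.

    atoms: list of (symbol, (x, y, z)) in angstrom.
    cell_edge: edge of the cubic cell in angstrom (not stated in the text).
    pseudo_files, masses: dicts keyed by element symbol (caller-supplied).
    """
    species = sorted({symbol for symbol, _ in atoms})
    lines = [
        "&CONTROL",
        "  calculation = 'relax'",
        f"  prefix = '{prefix}'",
        "/",
        "&SYSTEM",
        "  ibrav = 1",              # simple cubic cell
        f"  A = {cell_edge}",       # lattice parameter in angstrom
        f"  nat = {len(atoms)}",
        f"  ntyp = {len(species)}",
        "  ecutwfc = 45.0",         # cutoff energy in Ry
        "  occupations = 'fixed'",  # assumed keyword for the insulator setting
        "  nspin = 1",              # non-spin-polarized
        "  nosym = .false.",
        "/",
        "&ELECTRONS",
        "  conv_thr = 1.0d-5",      # SCF convergence criterion in Ry
        "/",
        "&IONS",
        "/",
        "ATOMIC_SPECIES",
    ]
    lines += [f"  {s} {masses[s]} {pseudo_files[s]}" for s in species]
    lines.append("ATOMIC_POSITIONS angstrom")
    lines += [f"  {s} {x:.6f} {y:.6f} {z:.6f}" for s, (x, y, z) in atoms]
    lines += ["K_POINTS automatic", "  1 1 1 0 0 0"]
    return "\n".join(lines) + "\n"
```

Generating the file from Python keeps the per-cluster inputs consistent when the same settings are applied to every cation and geometry.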
step 2-3, performing first-principles calculations on the optimized structures of the various clusters to obtain a set of performance parameters, and constructing from it the descriptors used for machine learning:
Figure BDA0003135366490000071
Figure BDA0003135366490000072
where n takes every non-zero integer between -3 and +3, $C_i$ is the proportion of the corresponding cation i, and $x_i$ is a performance parameter of cation i obtained by first-principles calculation.
The set of performance parameters includes the following performance parameters:
(1) binding energy per unit cation i of each cluster (Figure BDA0003135366490000073), unit: eV/atom;
the binding energy is calculated by subtracting, from the optimized cluster energy $E_{cluster}$, the sum of the energies of the isolated constituents of the same number and type:
$$E_T = E_{cluster} - \left(E_i + l\,E_{OH}\right)$$
where $E_T$ denotes the binding energy, l is the number of oxygen atoms in the cluster, and $E_i$ and $E_{OH}$ are the energies of a single cation and a single hydroxyl group, respectively, each placed in a cubic unit cell of the size given in Figure BDA0003135366490000082;
(2) the equilibrium bond length of the cation and oxygen ion corresponding to the cluster containing the cation i;
(3) Bader charge of cation i in the various clusters:
after the first-principles optimization is completed, the Bader charge of cation i in each cluster structure is calculated from the electron density file output by Quantum ESPRESSO;
(4) HOMO-LUMO gap of each cluster containing the cation i, calculated directly with Quantum ESPRESSO:
after the first-principles optimization, the energies of the highest occupied molecular orbital and the lowest unoccupied molecular orbital are obtained directly for each of the 4 cluster structures corresponding to cation i, and their difference is the HOMO-LUMO gap.
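The binding energy in item (1) is described in words as the cluster energy minus the energies of its isolated constituents. A minimal sketch of that arithmetic follows, assuming all energies are already in eV; the function and argument names are hypothetical, and the numbers in the usage note are made up purely to show the calculation.

```python
def binding_energy(e_cluster, e_cation, e_oh, n_oxygen):
    """Cluster energy minus the isolated cation and n_oxygen hydroxyl groups.

    All energies are assumed to be in eV and computed in the same cubic
    cell with the same settings, so that they are directly comparable.
    """
    return e_cluster - (e_cation + n_oxygen * e_oh)
```

For a tetrahedral cluster with illustrative energies of -100 eV (cluster), -10 eV (cation) and -20 eV (hydroxyl), `binding_energy(-100.0, -10.0, -20.0, 4)` gives -10.0 eV. A negative value means the cluster is bound relative to its separated parts.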
Here, the calculation of the descriptor is further elaborated, exemplarily in connection with the above-mentioned performance parameters:
one set of data collected from the database is as follows: the glass contains A mol SiO2, B mol B2O3 and C mol Na2O, and the dielectric constant of the glass of this composition is y.
The total number of ions is (3A + 5B + 3C) mol, so the proportion of Si, $C_{Si}$, is:
$$C_{Si} = \frac{A}{3A + 5B + 3C}$$
the proportion of B3+ is:
$$C_{B} = \frac{2B}{3A + 5B + 3C}$$
and the proportion of Na+ is:
$$C_{Na} = \frac{2C}{3A + 5B + 3C}$$
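The cation proportions in this example can be sketched as follows. The normalisation by the total ion count (3A + 5B + 3C) follows the sentence above; the exact expressions are formula images in the source, so treating each proportion as cations divided by total ions is an assumption of this sketch, as is the small stoichiometry table `OXIDES`.

```python
# oxide -> (cation symbol, cations per formula unit, oxygens per formula unit)
OXIDES = {
    "SiO2": ("Si", 1, 2),
    "B2O3": ("B", 2, 3),
    "Na2O": ("Na", 2, 1),
}


def cation_fractions(moles):
    """Map each cation to (moles of cation) / (total moles of ions).

    moles: dict mapping oxide formula -> amount in mol.
    """
    total_ions = sum(n * (OXIDES[ox][1] + OXIDES[ox][2])
                     for ox, n in moles.items())
    fractions = {}
    for ox, n in moles.items():
        symbol, n_cat, _ = OXIDES[ox]
        fractions[symbol] = fractions.get(symbol, 0.0) + n * n_cat / total_ions
    return fractions
```

With one mole of each oxide the total is 11 mol of ions, giving 1/11 for Si and 2/11 each for B and Na.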
in predicting the dielectric constant, for each cation i (Si, B, Na), the corresponding performance parameter x consists of the following 4 classes of parameters (16 sub-parameters in total): the binding energy (Figure BDA0003135366490000086; 4 kinds, ET1~ET4), the cluster average bond length (4 kinds, LT1~LT4), the Bader charge (4 kinds, Q1~Q4), and the HOMO-LUMO gap (4 kinds, EG1~EG4).
When n = 1, there are 4 descriptors (Figure BDA0003135366490000087), as follows:
Figure BDA0003135366490000088
Figure BDA0003135366490000089
Figure BDA0003135366490000091
Figure BDA0003135366490000092
By analogy, the descriptors for the remaining values of n (n = -3, -2, -1, 2, 3) can be constructed (4 descriptors per value of n).
Step 3, constructing a training set, a verification set and a test set based on the dielectric constant database constructed in the step 1 and the descriptor constructed in the step 2;
the specific process comprises the following steps:
step 3-1, randomly drawing $p_1$% of the total data amount N from the dielectric constant database as the first test subset $\{T_1\}$;
Step 3-2, from the remaining $(100 - p_1)$% of the data, selecting the data whose dielectric constant is less than a third preset threshold, and randomly drawing $p_2$% N of them as the second test subset $\{T_2\}$;
Step 3-3, from the data outside the test subsets $\{T_1\}$ and $\{T_2\}$, selecting the glass compositions in which a specific component of interest lies within a preset interval, and randomly drawing $p_3$% N of them as the third test subset $\{T_3\}$;
Step 3-4, combining the three test subsets into the test set of the model, $\{T\} = \{T_1, T_2, T_3\}$, and taking the remaining data as the training set and validation set of the model, where the values of $p_1$, $p_2$ and $p_3$ must make the data ratio of the training set plus the validation set ({D} + {V}) to the test set {T} equal to 9:1.
The remaining data are divided into a training set {D} and a validation set {V} by k-fold cross validation, as follows:
step 3-4-1, sorting the remaining 90% N data in the database in ascending order of dielectric constant, and then dividing them equally into k disjoint subsets $\{S_1, S_2, S_3, \ldots, S_k\}$;
Step 3-4-2, each time taking 1 subset $S_l$ as the validation set $\{V_l\}$, l = 1, 2, 3, ..., k, and the remaining k-1 subsets as the training set $\{D_l\}$; the training set $\{D_l\}$ and the validation set $\{V_l\}$ serve as the cross validation data.
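Steps 3-4-1 and 3-4-2 can be sketched as below. The text does not say whether the k subsets are contiguous slices of the sorted data or interleaved, so contiguous slices are assumed here; records are assumed to be `(identifier, dielectric_constant)` tuples, and the function names are hypothetical.

```python
def sorted_k_folds(records, k):
    """Sort by dielectric constant, then cut into k contiguous subsets.

    records: list of (identifier, dielectric_constant) tuples.
    Subset sizes differ by at most one when len(records) % k != 0.
    """
    ordered = sorted(records, key=lambda r: r[1])
    base, extra = divmod(len(ordered), k)
    folds, start = [], 0
    for t in range(k):
        size = base + (1 if t < extra else 0)
        folds.append(ordered[start:start + size])
        start += size
    return folds


def cv_pairs(folds):
    """Yield (training set, validation set) with each fold held out once."""
    for t, val in enumerate(folds):
        train = [r for s, fold in enumerate(folds) if s != t for r in fold]
        yield train, val
```

With contiguous slices each validation fold covers a narrow range of dielectric constants, so every fold tests a different part of the value range; an interleaved split would instead give each fold a similar spread of values.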
Step 4, constructing a dielectric constant prediction model based on the M5P model tree, and training the constructed dielectric constant prediction model according to the training set, the verification set and the test set constructed in the step 3 to obtain a trained dielectric constant prediction model;
the specific process comprises the following steps:
step 4-1, setting the maximum layer number set h of the tree as h ═ h1,h2,h3,.....,hgAnd the minimum set of sample numbers for node splitting f ═ f1,f2,f3,.....,fj};
Step 4-2, for the hyper-parameter pair (h_1, f_1), selecting the characteristic parameters for the split nodes of the binary tree based on the standard deviation of the sample data. The specific process comprises the following steps:
Step 4-2-1, taking each characteristic parameter in turn as the binary tree split node, and calculating the standard deviation reduction (SDR) of the tree before and after the split:
$$\mathrm{SDR} = \mathrm{sd}(T) - \sum_{b=1}^{n_{class}} \frac{|T_b|}{|T|}\,\mathrm{sd}(T_b)$$
where sd(T) represents the standard deviation of the total sample data, |T_b| denotes the number of samples in the b-th subset obtained by splitting on that characteristic parameter, b = 1, 2, ..., n_class, n_class represents the number of subsets, and |T| represents the number of samples in the population;
Step 4-2-2, selecting the characteristic parameter corresponding to the maximum SDR value as the split node;
Step 4-2-3, repeating step 4-2-1 within each subset produced by the split, selecting a series of characteristic parameters to split the nodes, until the tree depth exceeds the set maximum depth h_1 or the number of samples at a node falls below the set minimum f_1, at which point splitting stops and every branch ends in a leaf node.
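The split selection of step 4-2 can be illustrated with the short sketch below. It is a sketch under stated assumptions rather than the patented implementation: the names `sdr` and `best_split` are hypothetical, the population standard deviation is used, and candidate thresholds are taken as midpoints between consecutive distinct feature values, none of which is specified in the text.

```python
from statistics import pstdev


def sdr(targets, subsets):
    """Standard deviation reduction: sd(T) - sum(|T_b| / |T| * sd(T_b))."""
    total = len(targets)
    return pstdev(targets) - sum(len(tb) / total * pstdev(tb) for tb in subsets if tb)


def best_split(samples, min_samples=1):
    """Return (sdr, feature_index, threshold) of the binary split with the largest SDR.

    samples: list of (descriptor_vector, dielectric_constant) tuples.
    Assumption: thresholds are midpoints between consecutive distinct
    values of each feature; the source does not say how they are chosen.
    """
    targets = [y for _, y in samples]
    best = None
    for j in range(len(samples[0][0])):
        values = sorted({x[j] for x, _ in samples})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in samples if x[j] <= t]
            right = [y for x, y in samples if x[j] > t]
            if len(left) < min_samples or len(right) < min_samples:
                continue  # split would leave too few samples on one side
            score = sdr(targets, [left, right])
            if best is None or score > best[0]:
                best = (score, j, t)
    return best


# Toy samples (made up): feature 0 separates the targets, feature 1 does not.
toy = [([1.0, 5.0], 4.0), ([2.0, 1.0], 4.0), ([8.0, 4.0], 6.0), ([9.0, 2.0], 6.0)]
score, feature, threshold = best_split(toy)
```

On this toy data the chosen split is on feature 0, since splitting there leaves both subsets with zero spread in the dielectric constant.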
Step 4-3, for each leaf node, using a multivariate linear regression model to establish a linear model relating the dielectric constant to the characteristic parameters Unpar that were not used for splitting along that branch:
Figure BDA0003135366490000107
where Unpar is the set of descriptors not yet used for splitting in the branch, θ is the set of regression parameters, D is the set of all descriptors, and I is the indicator function;
Step 4-4, using the training set D_l from step 3, determining the coefficients [θ_0l, θ_l] of the linear model corresponding to each leaf node by the least squares method, where [θ_0l, θ_l] are the regression parameters for the data set D_l;
Step 4-5, using the verification set V_l from step 3, calculating the mean square error of the linear regression model on the verification set V_l:
$$\mathrm{MSE}_l = \frac{1}{N^*}\sum_{a=1}^{N^*}\left(\hat{y}_a - y_a\right)^2$$
where N^* is the number of data in the verification set, \hat{y}_a is the predicted value for the a-th verification sample, and y_a is the corresponding actual value;
Step 4-6, using the k-fold cross-validation method, repeating step 4-3 to step 4-5 k times, and calculating the average mean square error of the k-fold cross-validation under the (h_1, f_1) condition:
$$\overline{\mathrm{MSE}}(h_1, f_1) = \frac{1}{k}\sum_{l=1}^{k}\mathrm{MSE}_l$$
where MSE_l is the mean square error obtained in step 4-5 on the l-th verification set;
Step 4-7, adjusting the hyper-parameters (h, f) of the M5P tree model: setting h in turn to h_1, h_2, h_3, ..., h_g and f in turn to f_1, f_2, f_3, ..., f_j, repeating step 4-2 to step 4-6, and calculating the average k-fold cross-validation mean square error for each (h, f) combination;
Step 4-8, selecting the (h, f) values corresponding to the smallest average mean square error as the optimal hyper-parameters of the M5P model;
Step 4-9, after hyper-parameter tuning is finished, taking the union of all training sets {D} and verification sets {V} as the training set of the final model, namely {S_1, S_2, S_3, ..., S_k};
Step 4-10, following step 4-2 to step 4-4, training on the set {S_1, S_2, S_3, ..., S_k} to obtain the regression coefficients of the leaf nodes of the M5P model, which together form the dielectric constant prediction model.
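The hyper-parameter search of steps 4-5 to 4-10 can be sketched as a plain grid search over (h, f). This is a minimal sketch, not the patented implementation: the function names are hypothetical, and `fit` is a placeholder callable `fit(train, h, f)` that is assumed to grow the model tree, fit its leaf regressions, and return a prediction function.

```python
from itertools import product


def mean_squared_error(predict, validation):
    """MSE of a fitted predictor on a list of (x, y) verification samples."""
    return sum((predict(x) - y) ** 2 for x, y in validation) / len(validation)


def cross_validated_mse(folds, fit, h, f):
    """Average verification MSE over the k folds for one (h, f) pair.

    folds: list of k disjoint lists of (x, y) samples (S_1..S_k).
    fit:   hypothetical callable fit(train, h, f) -> predict(x); it stands
           in for growing the tree and fitting the leaf regressions.
    """
    errors = []
    for l, validation in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != l for s in fold]
        errors.append(mean_squared_error(fit(train, h, f), validation))
    return sum(errors) / len(errors)


def select_hyperparameters(folds, fit, h_values, f_values):
    """Return (best_h, best_f, final_predictor).

    The pair with the smallest average cross-validated MSE is chosen, and
    the final model is then refitted on all folds together.
    """
    best_h, best_f = min(product(h_values, f_values),
                         key=lambda hf: cross_validated_mse(folds, fit, *hf))
    all_data = [s for fold in folds for s in fold]   # {S_1, ..., S_k}
    return best_h, best_f, fit(all_data, best_h, best_f)
```

Passing the learner in as `fit` keeps the search independent of how the tree is grown, so the same loop works for any learner that accepts a maximum depth and a minimum sample count.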
Step 5, for the glass material to be predicted, predicting its dielectric constant using the dielectric constant prediction model.
The specific process comprises the following steps:
Step 5-1, constructing the descriptor of the glass material to be predicted according to the process of step 2;
Step 5-2, substituting the descriptor into the dielectric constant prediction model to obtain the predicted dielectric constant.
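Prediction with a trained model tree amounts to routing the descriptor to a leaf and evaluating that leaf's linear model. The sketch below assumes a hypothetical nested-dictionary tree layout (`feature`, `threshold`, `left`, `right`, `theta`) and made-up coefficients. Descriptors excluded from a leaf model are represented by zero coefficients, and any smoothing step of M5P is omitted.

```python
def predict_dielectric_constant(node, descriptor):
    """Route a descriptor vector to a leaf and apply that leaf's linear model."""
    while "theta" not in node:                       # internal (split) node
        go_left = descriptor[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    theta = node["theta"]                            # [theta_0, theta_1, ...]
    return theta[0] + sum(t * x for t, x in zip(theta[1:], descriptor))


# Hypothetical two-leaf tree; all numbers are made up for illustration.
tree = {
    "feature": 0, "threshold": 1.5,
    "left": {"theta": [4.0, 0.5, 0.0]},
    "right": {"theta": [6.0, 0.0, -0.2]},
}
low = predict_dielectric_constant(tree, [1.0, 3.0])    # falls in the left leaf
high = predict_dielectric_constant(tree, [2.0, 5.0])   # falls in the right leaf
```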
To verify the effectiveness of the method of the present application, 10 groups of glass materials with known dielectric constants were predicted by the method of the present invention, and the prediction results are shown in table 1 below.
TABLE 1 comparison of predicted values and true values of the predicted glass dielectric constant by the method
Figure BDA0003135366490000113
As can be seen from the above Table 1, the average error between the dielectric constant of the glass predicted by the method provided by the invention and the true value is only 2.83%, and compared with the existing method, the method provided by the invention can relatively accurately predict the dielectric constant, thereby verifying the effectiveness of the method. Moreover, by adopting the method to predict the glass with unknown dielectric constant, the dielectric constant of the glass with different component proportions can be quickly estimated, the trial and error cost for glass research and development can be greatly reduced, and the method has great significance for some research and development targets with strict requirements on the dielectric constant of the glass.
EXAMPLE III
The embodiment provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the following steps:
step 1, collecting dielectric constant data of glass materials composed of different components, and constructing a dielectric constant database, wherein the database comprises glass components mapped one by one and dielectric constants corresponding to the glass components;
step 2, constructing a descriptor containing a material gene of the dielectric constant of the oxide glass material based on a first principle;
step 3, constructing a training set, a verification set and a test set based on the dielectric constant database constructed in the step 1 and the descriptor constructed in the step 2;
step 4, constructing a dielectric constant prediction model based on the M5P model tree, and training the constructed dielectric constant prediction model according to the training set, the verification set and the test set constructed in the step 3 to obtain a trained dielectric constant prediction model;
step 5, for the glass material to be predicted, predicting its dielectric constant using the trained dielectric constant prediction model.
In one embodiment, a computer-readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, performs the steps of:
step 1, collecting dielectric constant data of a glass material, and constructing a dielectric constant database, wherein the database comprises glass components mapped one by one and dielectric constants corresponding to the glass components;
step 2, constructing a descriptor containing a material gene of the dielectric constant of the oxide glass material based on a first principle;
step 3, constructing a training set, a verification set and a test set based on the dielectric constant database and the descriptors constructed in the step 2;
step 4, constructing a dielectric constant prediction model based on the M5P model tree;
step 5, for the glass material to be predicted, predicting its dielectric constant using the dielectric constant prediction model.
For the specific definition of each step, see the definition of the method for predicting the dielectric constant of the glass by using the M5P algorithm, which is not described herein again.
Some steps in the embodiments of the present invention may be implemented by software, and the corresponding software program may be stored in a readable storage medium, such as an optical disc or a hard disk.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (9)

1. A method for predicting the dielectric constant of glass based on an M5P algorithm, the method comprising:
step 1, collecting dielectric constant data of glass materials composed of different components, and constructing a dielectric constant database, wherein the database comprises glass components mapped one by one and dielectric constants corresponding to the glass components;
step 2, constructing, based on a first principle, atomic structure models of oxide clusters with different symmetries in the oxide glass material, and constructing a descriptor containing a material gene of the dielectric constant by using the following as performance parameters: the binding energy per unit cation i of each cluster, the equilibrium bond length between the cation and the oxygen ions in each cluster containing the cation i, the Bader charge of the cation i in each cluster, and the HOMO-LUMO gap of each cluster containing the cation i; the HOMO-LUMO gap of each cluster containing the cation i is the difference in energy between the highest occupied molecular orbital and the lowest unoccupied molecular orbital of the cluster structure containing the cation i;
step 3, constructing a training set, a verification set and a test set based on the dielectric constant database constructed in the step 1 and the descriptor constructed in the step 2;
step 4, constructing a dielectric constant prediction model based on the M5P model tree, and training the constructed dielectric constant prediction model according to the training set, the verification set and the test set constructed in the step 3 to obtain a trained dielectric constant prediction model;
step 5, for the glass material to be predicted, predicting its dielectric constant using the trained dielectric constant prediction model.
2. The method of claim 1, wherein the step 2 comprises:
step 2-1, constructing atomic structure models of oxide clusters with different symmetries as unit cells calculated by a first principle;
step 2-2, performing first-principle calculations on the unit cell structure of each type of oxide cluster constructed in the step 2-1 to obtain the cluster energy E_cluster and the structural constants of each unit cell;
step 2-3, for each of the unit cell structures constructed in the step 2-1, performing further first-principle calculations to obtain the set of performance parameters, and constructing the descriptors for machine learning from them:
Figure FDA0003135366480000012
Figure FDA0003135366480000013
where n takes all non-zero integer values between -3 and +3, C_i is the proportion of the corresponding cation i, Cation is the set of cations i, and x_i is a performance parameter of cation i obtained by first-principle calculation; the performance parameters include the binding energy per unit cation i of each cluster, the equilibrium bond length between the cation and the oxygen ions in each cluster containing the cation i, the Bader charge of the cation i in each cluster, and the HOMO-LUMO gap of each cluster containing the cation i; the HOMO-LUMO gap of each cluster containing the cation i is the difference in energy between the highest occupied molecular orbital and the lowest unoccupied molecular orbital of the cluster structure containing the cation i.
3. The method according to claim 2, wherein the step 2-1 constructs the atomic structure model of the oxide clusters having different symmetries according to the following rules:
(1) each cluster is located in a cubic unit cell of size
Figure FDA0003135366480000021;
(2) for each cation present in the glass composition, the atom corresponding to that cation is placed in the center of the unit cell, around which 2 oxygen atoms are added in a linear molecular manner; simultaneously adding a hydrogen atom to each oxygen atom along the extension direction of the atomic bond from the central atom to the oxygen atom, wherein each oxygen atom and each hydrogen atom form a hydroxyl group;
(3) for each cation present in the glass composition, placing the atom corresponding to the cation in the center of the unit cell, adding 3 oxygen atoms around it in the same plane with 3-fold rotational symmetry, and simultaneously adding a hydrogen atom on each oxygen atom along the extension direction of the atomic bond from the central atom to the oxygen atom, wherein each oxygen atom and hydrogen atom form a hydroxyl group;
(4) for each cation present in the glass composition, placing the atom corresponding to the cation in the center of the unit cell, adding 4 oxygen atoms around it in a tetrahedrally symmetric manner, while adding one hydrogen atom on each oxygen atom along the extension of the atomic bond from the center atom to the oxygen atom, each oxygen atom and hydrogen atom constituting one hydroxyl group;
(5) for each cation present in the glass composition, the atom corresponding to the cation is placed in the center of the unit cell, 6 oxygen atoms are added around it in an octahedral symmetry, while on each oxygen atom a hydrogen atom is added along the extension of the atomic bond from the central atom to the oxygen atom, each oxygen atom and hydrogen atom constituting a hydroxyl group.
4. The method of claim 3, wherein the binding energy per unit cation i of each cluster is calculated by subtracting, from the cluster energy E_cluster of the oxide cluster in the unit cell structure constructed in step 2-1, the sum of the energies of the isolated atoms of the same number and type, according to the following formula:
Figure FDA0003135366480000023
wherein l is the number of oxygen atoms in the oxide cluster, and E_i and E_OH are the energies of a single cation and a single hydroxyl group, respectively, each placed in a cubic unit cell of size
Figure FDA0003135366480000024
5. The method according to claim 4, wherein the step 1 of constructing the dielectric constant database further comprises preprocessing the collected dielectric constant data, wherein the preprocessing comprises:
for any two glass compositions, judging whether the following two conditions are satisfied simultaneously:
condition 1: for each oxide component, the difference between the mole ratios of that component in the two compositions is less than or equal to a first preset threshold, expressed as a percentage;
condition 2: the difference between the dielectric constants is greater than a second preset threshold, expressed as a percentage;
if both conditions are satisfied, removing the corresponding glass compositions and their dielectric constant data from the database.
6. The method according to claim 5, wherein the first predetermined threshold is 2% and the second predetermined threshold is 10%.
7. The method of claim 6, wherein the step 4 of constructing the dielectric constant prediction model based on the M5P model tree comprises:
step 4-1, setting the set of candidate maximum depths of the M5P model tree h = {h_1, h_2, h_3, ..., h_g} and the set of candidate minimum sample numbers for node splitting f = {f_1, f_2, f_3, ..., f_j};
step 4-2, for the hyper-parameter pair (h_1, f_1), selecting the characteristic parameters for the split nodes of the binary tree based on the standard deviation of the sample data;
step 4-3, for each leaf node, using a multivariate linear regression model to establish a linear model relating the dielectric constant to the characteristic parameters Unpar that were not used for splitting along that branch:
Figure FDA0003135366480000032
where Unpar is the set of descriptors not yet used for splitting in the branch, θ is the set of regression parameters, D is the set of all descriptors, and I is the indicator function;
step 4-4, using the training set D_l from step 3, determining the coefficients [θ_0l, θ_l] of the linear model corresponding to each leaf node by the least squares method, where [θ_0l, θ_l] are the regression parameters for the data set D_l;
step 4-5, using the verification set from step 3, calculating the mean square error of the linear regression model on the verification set:
$$\mathrm{MSE}_l = \frac{1}{N^*}\sum_{a=1}^{N^*}\left(\hat{y}_a - y_a\right)^2$$
where N^* is the number of data in the verification set, \hat{y}_a is the predicted value for the a-th verification sample, and y_a is the corresponding actual value;
step 4-6, using the k-fold cross-validation method, repeating step 4-3 to step 4-5 k times, and calculating the average mean square error of the k-fold cross-validation under the (h_1, f_1) condition:
$$\overline{\mathrm{MSE}}(h_1, f_1) = \frac{1}{k}\sum_{l=1}^{k}\mathrm{MSE}_l$$
where MSE_l is the mean square error obtained in step 4-5 on the l-th verification set;
step 4-7, adjusting the hyper-parameters (h, f) of the M5P tree model: setting h in turn to h_1, h_2, h_3, ..., h_g and f in turn to f_1, f_2, f_3, ..., f_j, repeating step 4-2 to step 4-6, and calculating the average k-fold cross-validation mean square error for each (h, f) combination;
step 4-8, selecting the (h, f) values corresponding to the smallest average mean square error as the optimal hyper-parameters of the M5P model;
step 4-9, after hyper-parameter tuning is finished, taking the union of all training sets {D} and verification sets {V} as the training set of the final model, namely {S_1, S_2, S_3, ..., S_k};
step 4-10, following step 4-2 to step 4-4, training on the set {S_1, S_2, S_3, ..., S_k} to obtain the regression coefficients of the leaf nodes of the M5P model, which together form the dielectric constant prediction model.
8. The method of claim 7, wherein the training set { D } and validation set { V } are constructed using a k-fold cross validation method.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method of any of claims 1 to 8 are implemented when the computer program is executed by the processor.
CN202110717315.8A 2021-06-28 2021-06-28 Method for predicting dielectric constant of glass based on M5P algorithm Active CN113326664B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110717315.8A CN113326664B (en) 2021-06-28 2021-06-28 Method for predicting dielectric constant of glass based on M5P algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110717315.8A CN113326664B (en) 2021-06-28 2021-06-28 Method for predicting dielectric constant of glass based on M5P algorithm

Publications (2)

Publication Number Publication Date
CN113326664A true CN113326664A (en) 2021-08-31
CN113326664B CN113326664B (en) 2022-10-21

Family

ID=77424914

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110717315.8A Active CN113326664B (en) 2021-06-28 2021-06-28 Method for predicting dielectric constant of glass based on M5P algorithm

Country Status (1)

Country Link
CN (1) CN113326664B (en)


Also Published As

Publication number Publication date
CN113326664B (en) 2022-10-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant