Crystal property prediction and classification method based on an attention mechanism and a crystal graph convolutional neural network
Technical Field
The invention relates to crystal property prediction and classification technology, in particular to a crystal property prediction and classification method based on an attention mechanism and a crystal graph convolutional neural network.
Background
Simulation of crystal properties is usually carried out by first-principles calculation based on DFT (density functional theory), but screening crystal materials with ideal properties using first-principles methods is very time-consuming and computationally expensive. How to realize large-scale screening of crystalline materials has therefore become a difficult problem. With the development of computers, machine learning has become an important topic in academia, and attempts have been made to use machine learning methods for large-scale crystal property simulation. With the continuous optimization of machine learning algorithms, the simulation accuracy gradually approaches that of first-principles calculation. The combination of machine learning and crystal simulation helps realize large-scale crystal research simulation and accelerates the development and research of new crystal materials, and has therefore received wide attention.
The difficulty in using machine learning methods for the simulation of crystal properties lies in how to correctly encode the chemical information (such as atomic information and crystal topology) in crystals of any size in a form compatible with machine learning models, and how to train sufficiently accurate models from the limited available data.
The crystal graph convolutional neural network (abbreviated CGCNN) is a machine learning algorithm for crystal property research; it learns crystal properties directly from the connections of atoms in the crystal and provides a universal and interpretable way of encoding crystal chemical information. Various physical properties of crystals can be predicted by a CGCNN based on graph convolution (GCN). The crystal structure graph is an undirected multigraph in which atoms are represented by nodes and edges represent atomic bonds between the atoms. In CGCNN, node i is represented by a feature vector v_i, which encodes the properties of atom i. The undirected edge (i,j)_k represents the k-th bond between atoms i and j, and u_(i,j)k is the feature vector of that bond. To account for differences in interaction strength between neighbors, CGCNN designs a new convolution function,
v_i^(t+1) = v_i^(t) + Σ_(j,k) σ(z_(i,j)k^(t)·W_f^(t) + b_f^(t)) ⊙ g(z_(i,j)k^(t)·W_s^(t) + b_s^(t))
wherein z_(i,j)k^(t) = v_i^(t) ⊕ v_j^(t) ⊕ u_(i,j)k represents the concatenation of the feature vectors of the two atoms and of the atomic bond; W_f^(t), W_s^(t), b_f^(t) and b_s^(t) are respectively the convolution weight matrix, the self weight matrix and the biases of the t-th layer; σ(·) denotes the sigmoid function and g(·) represents the softplus activation function between the layers.
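For illustration, this CGCNN convolution can be sketched in numpy; the function and variable names below are illustrative, and the explicit double loop is a minimal sketch rather than an optimized implementation:

```python
import numpy as np

def softplus(x):
    return np.log1p(np.exp(x))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cgcnn_conv(v, u, nbr_idx, Wf, Ws, bf, bs):
    """One CGCNN convolution: v[i] += sigma(z Wf + bf) * g(z Ws + bs), summed
    over neighbors, where z concatenates node, neighbor and bond features."""
    n, m = nbr_idx.shape            # n atoms, m neighbors per atom
    out = v.copy()                  # residual: start from the old node features
    for i in range(n):
        for k in range(m):
            j = nbr_idx[i, k]
            z = np.concatenate([v[i], v[j], u[i, k]])
            gate = sigmoid(z @ Wf + bf)     # filter: learned interaction strength
            core = softplus(z @ Ws + bs)    # self/update term
            out[i] += gate * core
    return out

# toy sizes: 3 atoms, 2 neighbors each, atom feature dim 4, bond feature dim 2
rng = np.random.default_rng(0)
F, B = 4, 2
v = rng.normal(size=(3, F))
u = rng.normal(size=(3, 2, B))
nbr = np.array([[1, 2], [0, 2], [0, 1]])
Wf = rng.normal(size=(2 * F + B, F))
Ws = rng.normal(size=(2 * F + B, F))
bf = np.zeros(F)
bs = np.zeros(F)
v1 = cgcnn_conv(v, u, nbr, Wf, Ws, bf, bs)
```

Because the gated term is added onto v_i^(t), the layer is residual: each atom keeps its own features and accumulates weighted contributions from its bonded neighbors.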
However, as a fast machine learning method capable of large-scale screening of crystalline materials, the CGCNN method has limited prediction accuracy. This is because CGCNN reduces the complexity of the network to improve the efficiency of the machine learning algorithm; although the running speed is increased, prediction accuracy may be reduced. In addition, the default number of training cycles (epochs) of the CGCNN method is 30, which shortens the model-building process but can also affect the fitting of the network and reduce the accuracy of the prediction model.
Disclosure of Invention
The purpose of the invention is as follows: to overcome the defects in the prior art, a crystal property prediction and classification method based on an attention mechanism and a crystal graph convolutional neural network is provided. A new convolution function is designed for the crystal graph convolutional neural network, which improves the ability of graph convolution to fuse the topological structure and node features and improves calculation accuracy; a new normalization method is introduced to regularize the deep graph convolutional network, improving the fitting of the network and producing a better model. The improved network retains the ability to screen crystal materials rapidly and at large scale.
The technical scheme is as follows: to achieve the above object, the present invention provides a crystal property prediction and classification method based on an attention mechanism and a crystal graph convolutional neural network, comprising the following steps:
s1: acquiring a crystallography information file (crystal structure data) and DFT calculation data, and dividing the crystallography information file and the DFT calculation data into a training set, a verification set and a test set;
s2: extracting crystal characteristics from the crystallography information file, inputting the crystal characteristics into a neural network, and acquiring neural network output;
s3: training and verifying the constructed neural network model by adopting a training set and a verification set respectively to obtain a prediction model and a classification model, and completing prediction of crystal properties through the prediction model; classification of the crystal properties is accomplished by a classification model.
Further, the method for acquiring the crystal structure data and the DFT calculation data in step S1 includes:
a1: connecting to the Materials Project database through the pymatgen package in python, and exporting the id number of each crystal and the DFT calculation data of physical properties such as formation energy, absolute energy, band gap and Fermi energy to a csv file;
a2: connecting to the Materials Project database through the pymatgen package in python, reading the crystal id numbers from the exported csv file, and exporting the corresponding cif files (crystallography information files);
a3: preparing an atom_init.json file, a JSON file storing the initialization vector of each element.
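As an illustrative sketch of steps A1–A2 (not part of the claimed method), the export can be written with the legacy pymatgen `MPRester` interface; `fetch_and_export`, the query criteria and the property names are assumptions, and a valid Materials Project API key plus network access are required for the database calls:

```python
import csv
import os

# assumed property keys from the legacy Materials Project API
PROPS = ["material_id", "formation_energy_per_atom", "energy",
         "band_gap", "efermi"]

def export_property_csv(rows, path):
    """A1: write crystal id numbers and DFT-computed properties to a csv file."""
    with open(path, "w", newline="") as f:
        w = csv.DictWriter(f, fieldnames=PROPS)
        w.writeheader()
        for r in rows:
            w.writerow({k: r.get(k) for k in PROPS})

def fetch_and_export(api_key, csv_path, cif_dir):
    """A1 + A2: query the Materials Project, then export csv and cif files.
    Assumes the legacy pymatgen MPRester; requires an API key and network."""
    from pymatgen.ext.matproj import MPRester  # imported lazily: network-only path
    with MPRester(api_key) as m:
        rows = m.query(criteria={"band_gap": {"$gte": 0}}, properties=PROPS)
        export_property_csv(rows, csv_path)
        for r in rows:  # A2: one cif (crystallography information file) per id
            s = m.get_structure_by_material_id(r["material_id"])
            s.to(filename=os.path.join(cif_dir, r["material_id"] + ".cif"))
```

The csv produced this way pairs each crystal id with its DFT target values, matching the id_prop layout that CGCNN-style data loaders expect.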
Further, the obtaining process of the neural network output in step S2 is as follows:
b1: extracting from the cif file the atom features, the bond features between each atom and its neighbor atoms, the index of each atom's neighbor atoms and the index mapping the crystal to its atoms, and using them as the input of the neural network;
b2: passing the atom features input to the network through the embedding layer to generate a new vector, and then inputting the new atom feature vector, the bond feature vector and the neighbor-atom index vector into the convolution layer;
b3: in the convolutional layer, regarding an atom as a node and an atomic bond as an edge, connecting the node vector, the neighbor node vectors and the edge vectors through the index vector to form a new embedded vector, passing the new vector through fully connected layer 1, and then applying node normalization to the output;
b4: the node vector h^(t) obtained after node normalization and softplus activation consists of M hidden vectors Z_T ∈ R^(1×F) that merge neighbor features; these are transformed by a nonlinear transformation, and a shared attention vector q ∈ R^(F′×1) is then used to obtain the attention value ω_T;
b5: normalizing the attention values ω_1, ω_2, …, ω_M with the softmax function to obtain the final weights;
b6: combining the M hidden vectors merged with neighbor features and their attention values to obtain the final node embedding H^(t), carrying out batch normalization, adding the normalized original node feature vector input to the convolutional layer, activating with the softplus function and outputting;
b7: after 3 convolutional layers, a new vector fused with the local chemical environment is generated; the pooling layer then produces a vector representing the whole crystal, which is activated by the softplus function, connected to fully connected layer 2, activated by the same function, and input into fully connected layer 3 for output.
Further, the formula of the node normalization processing in step B3 is as follows:
Nodenorm(h^(t)) = (h^(t) − μ^(t)) / σ^(t)   (1)
wherein h^(t) is the newly generated node embedding vector, μ^(t) is the mean value of node h^(t), and σ^(t) is the standard deviation of the node;
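A minimal numpy sketch of the node normalization of step B3, assuming the per-node mean and standard deviation are taken across the feature dimension (the small `eps` term is added only for numerical safety and is not part of the formula):

```python
import numpy as np

def node_norm(h, eps=1e-5):
    """Center each node embedding by its mean and scale by its standard
    deviation, both computed over the feature axis (per node)."""
    mu = h.mean(axis=-1, keepdims=True)
    sigma = h.std(axis=-1, keepdims=True)
    return (h - mu) / (sigma + eps)

h = np.array([[1.0, 2.0, 3.0],
              [10.0, 10.0, 13.0]])  # 2 nodes, 3 hidden features each
hn = node_norm(h)
```

Unlike batch normalization, the statistics here are per node rather than per feature across a batch, which is what lets it suppress feature correlation within each hidden embedding.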
the node vector h^(t) in step B4 is expressed as:
h^(t) = [Z_1; Z_2; …; Z_M]   (2)
wherein Z_T ∈ R^(1×F) is the T-th row of h^(t), T ∈ {1, …, M}, M represents the maximum number of neighbor atoms, and F is the number of hidden atom features;
the attention value ω_T in step B4 is expressed as:
ω_T = q^T·tanh(W·(Z_T)^T + b)   (3)
wherein W ∈ R^(F′×F) is a weight matrix and b ∈ R^(F′×1) is a bias vector;
the final weight expression in step B5 is:
a_T = exp(ω_T) / Σ_(i=1)^M exp(ω_i)   (4)
the node embedding H^(t) in step B6 is expressed as:
H^(t) = a_1Z_1 + a_2Z_2 + … + a_MZ_M.   (5)
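The attention steps B4–B6 can be sketched together in numpy; the shapes follow the text above (Z ∈ R^(M×F), W ∈ R^(F′×F), q and b ∈ R^(F′×1)), and the subtraction of the maximum inside the softmax is only for numerical stability:

```python
import numpy as np

def attention_combine(Z, q, W, b):
    """Score each hidden vector Z_T with q^T tanh(W Z_T^T + b), normalize the
    scores with softmax, and return the weighted sum of the rows of Z."""
    omega = (q.T @ np.tanh(W @ Z.T + b)).ravel()   # (M,) attention values
    a = np.exp(omega - omega.max())
    a = a / a.sum()                                # softmax -> final weights
    H = a @ Z                                      # weighted node embedding
    return H, a

rng = np.random.default_rng(1)
M, F, Fp = 4, 3, 5                   # M neighbors, F hidden feats, F' proj dim
Z = rng.normal(size=(M, F))
q = rng.normal(size=(Fp, 1))         # shared attention vector
W = rng.normal(size=(Fp, F))
b = rng.normal(size=(Fp, 1))
H, a = attention_combine(Z, q, W, b)
```

Because q, W and b are shared across all M hidden vectors, the number of attention parameters is independent of the neighbor count, and the learned weights a_T adapt per node to the local chemical environment.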
further, the convolution formula of the neural network in step S2 is:
wherein, Nodnorm (cndot.) represents node normalization, g (cndot.) represents a softplus activation function, Attention (cndot.) represents an Attention mechanism, and BN (cndot.) represents batch normalization.
Further, the training method of the prediction model in step S3 includes:
using mean square loss as the loss function and stochastic gradient descent as the optimizer; the mean square loss is shown in equation (7),
loss(x_i, y_i) = (x_i − y_i)^2   (7)
where x_i is the input (predicted) value and y_i is the target attribute value, namely the DFT calculated value; the prediction model uses the mean absolute error (MAE) as the index for evaluating model performance.
Further, the mean absolute error (MAE) in step S3: MAE represents the average of the absolute errors between the predicted values and the test values, and is the evaluation index of the prediction model, as shown in equation (8),
MAE = (1/n)·Σ_(i=1)^n |x_i − y_i|   (8)
wherein x_i denotes the predicted value and y_i denotes the test value.
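As a sketch, the mean square loss and the MAE metric can be written directly in numpy (the toy prediction and DFT target arrays are illustrative):

```python
import numpy as np

def mse_loss(x, y):
    """Mean square loss, averaged over a batch: mean of (x_i - y_i)^2."""
    return np.mean((x - y) ** 2)

def mae(x, y):
    """Mean absolute error between predicted and test values."""
    return np.mean(np.abs(x - y))

pred = np.array([0.10, -0.25, 0.40])   # illustrative network predictions
dft = np.array([0.12, -0.20, 0.35])    # illustrative DFT target values
```

The squared loss drives training because it is smooth and penalizes large errors strongly, while MAE is reported for evaluation because it stays in the physical units of the target property (e.g. eV/atom).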
Further, the classification process of the classification model in the step S3 for the crystal property is as follows:
under the framework of the same neural network, the activation function of the output layer is changed into a logsoftmax activation function and matched with a negative log-likelihood loss function to realize the classification of crystal properties.
Further, the logsoftmax activation function in step S3 is shown in equation (9) and the negative log-likelihood loss function in equation (10),
logsoftmax(x_i) = log(exp(x_i) / Σ_j exp(x_j))   (9)
loss(x, class) = −x[class]   (10)
where x in equation (10) is the vector of log-probabilities output by the logsoftmax layer and class is the target class;
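The classification head can be sketched in numpy; the per-sample negative log-likelihood below (loss equals minus the log-probability of the target class) is the convention used by common deep-learning frameworks, and the two-class scores are illustrative:

```python
import numpy as np

def log_softmax(x):
    """Log of the softmax, computed in a numerically stable way."""
    x = x - x.max()
    return x - np.log(np.exp(x).sum())

def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class, given the
    log-probabilities produced by log_softmax."""
    return -log_probs[target]

scores = np.array([2.0, 0.5])        # e.g. [positive-class, negative-class] logits
lp = log_softmax(scores)
loss = nll_loss(lp, target=0)
```

Pairing logsoftmax with the negative log-likelihood in this way is equivalent to a cross-entropy loss on the raw scores, but the explicit log-probabilities are numerically better behaved.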
the classification model takes accuracy and the area under the ROC curve (AUC) as the indexes for evaluating model performance.
Further, the area under the ROC curve (AUC) is obtained by summing the areas of the portions under the ROC curve. The abscissa of the ROC curve is the false positive rate (FPR), i.e., the proportion of actual negative cases that are judged positive; the ordinate is the true positive rate (TPR), i.e., the proportion of actual positive cases that are judged positive. An AUC close to 1 indicates a better classification model.
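The AUC can be sketched with the rank (Mann-Whitney) formulation, which equals the area under the ROC curve: the fraction of positive-negative pairs in which the positive case receives the higher score (the labels and scores below are illustrative):

```python
import numpy as np

def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case; ties count as half."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

y = np.array([1, 1, 0, 0, 1])
s = np.array([0.9, 0.7, 0.4, 0.2, 0.3])
```

Because AUC depends only on the ranking of the scores, it is insensitive to the 0.5 decision threshold used for accuracy, which is why the two metrics are reported together.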
The attention mechanism is a special structure embedded in a machine learning model to automatically learn and calculate the contribution of the input data to the output data. By learning adaptive importance weights for the embeddings, it can improve the ability of graph convolution to fuse the topological structure and the node features.
The invention provides a crystal property prediction and classification method based on an attention mechanism and a crystal graph convolutional neural network. In addition, node normalization is introduced to reduce the risk of overfitting: by suppressing the feature correlation of the hidden embeddings and improving the smoothness of the model with respect to the input node features, the deep graph convolutional network is regularized and its overfitting risk is reduced.
Has the advantages that: compared with the prior art, the invention takes crystal data collection, crystal property prediction and crystal property classification as one complete system and fully combines the crystal graph convolutional neural network with the attention mechanism. It can effectively improve the precision of crystal property prediction and classification at low time cost, has practical engineering value, facilitates accurate large-scale crystal research simulation, and provides a methodological guarantee for the development and research of new crystal materials.
Drawings
FIG. 1 is a schematic flow diagram of the process of the present invention;
FIG. 2 is a diagram of the convolution structure of a convolutional layer in the present invention;
fig. 3 is a structural diagram of a neural network in the present invention.
Detailed Description
The present invention is further illustrated by the following figures and specific examples, which are to be understood as illustrative only and not as limiting the scope of the invention, which is to be given the full breadth of the appended claims and any and all equivalent modifications thereof which may occur to those skilled in the art upon reading the present specification.
The invention provides a crystal property prediction and classification method based on an attention mechanism and a crystal graph convolutional neural network, which mainly comprises two stages: crystal property prediction and crystal property classification. In the first stage, mean square loss is used as the loss function and stochastic gradient descent as the optimizer, and the crystal formation energy, absolute energy, band gap and Fermi energy are predicted and compared with DFT calculation data. In the second stage, the activation function of the output layer is changed to the logsoftmax activation function and the loss function to the negative log-likelihood loss function; crystals are classified against a total magnetic moment threshold of 0.5 μB, and wide-bandgap semiconductor crystals are classified against a bandgap threshold of 2.3 eV.
As shown in fig. 1, the method for predicting and classifying crystal properties based on an attention mechanism and a crystal graph convolutional neural network provided by the present invention specifically includes the following steps:
s1: acquiring structure data and DFT calculation data of the crystal, and dividing the structure data and the DFT calculation data into a training set, a verification set and a test set;
the method for acquiring the structure data and DFT calculation data of the crystal comprises the following steps:
a1: connecting to the Materials Project database through the pymatgen package in python, and exporting the id number of each crystal and the DFT calculation data of physical properties such as formation energy, absolute energy, band gap and Fermi energy to a csv file;
a2: connecting to the Materials Project database through the pymatgen package in python, reading the crystal id numbers from the exported csv file, and exporting the corresponding cif files (crystallography information files);
a3: preparing an atom_init.json file, a JSON file storing the initialization vector of each element.
S2: collecting a crystallography information file, extracting crystal characteristics, inputting the crystal characteristics into a neural network, and acquiring neural network output;
the acquisition process of the neural network output is as follows:
b1: extracting from the cif file the atom features, the bond features between each atom and its neighbor atoms, the index of each atom's neighbor atoms and the index mapping the crystal to its atoms, and using them as the input of the neural network;
b2: passing the atom features input to the network through the embedding layer to generate a new vector, and then inputting the new atom feature vector, the bond feature vector and the neighbor-atom index vector into the convolution layer;
b3: in the convolutional layer, an atom is regarded as a node and an atomic bond as an edge, as shown in the convolution structure diagram of fig. 2; the node vector, the neighbor node vectors and the edge vectors are connected through the index vector to form a new embedded vector, which passes through fully connected layer 1, after which node normalization is applied to the output;
the formula of the node normalization processing is as follows:
Nodenorm(h^(t)) = (h^(t) − μ^(t)) / σ^(t)   (1)
wherein h^(t) is the newly generated node embedding vector, μ^(t) is the mean value of node h^(t), and σ^(t) is the standard deviation of the node;
b4: the node vector h^(t) obtained after node normalization and softplus activation consists of M hidden vectors Z_T ∈ R^(1×F) that merge neighbor features; these are transformed by a nonlinear transformation, and a shared attention vector q ∈ R^(F′×1) is then used to obtain the attention value ω_T;
the node vector h^(t) is expressed as:
h^(t) = [Z_1; Z_2; …; Z_M]   (2)
wherein Z_T ∈ R^(1×F) is the T-th row of h^(t), T ∈ {1, …, M}, M represents the maximum number of neighbor atoms, and F is the number of hidden atom features;
the attention value ω_T is expressed as:
ω_T = q^T·tanh(W·(Z_T)^T + b)   (3)
wherein W ∈ R^(F′×F) is a weight matrix and b ∈ R^(F′×1) is a bias vector;
b5: normalizing the attention values ω_1, ω_2, …, ω_M with the softmax function to obtain the final weights;
the expression for the weights is:
a_T = exp(ω_T) / Σ_(i=1)^M exp(ω_i)   (4)
b6: combining the M hidden vectors merged with neighbor features and their attention values to obtain the final node embedding H^(t), carrying out batch normalization, adding the normalized original node feature vector input to the convolutional layer, activating with the softplus function and outputting;
the node embedding H^(t) is expressed as:
H^(t) = a_1Z_1 + a_2Z_2 + … + a_MZ_M   (5)
b7: referring to fig. 3, after 3 convolutional layers, a new vector fused with the local chemical environment is generated; the pooling layer then produces a vector representing the whole crystal, which is activated by the softplus function, connected to fully connected layer 2, activated by the same function, and input into fully connected layer 3 for output.
Based on the above process, the convolution formula of the neural network is:
v_i^(t+1) = g(BN(Attention(g(Nodenorm(z_(i,j)k^(t)·W^(t) + b^(t))))) + Nodenorm(v_i^(t)))   (6)
wherein Nodenorm(·) represents node normalization, g(·) represents the softplus activation function, Attention(·) represents the attention mechanism, and BN(·) represents batch normalization.
S3: training and verifying the constructed neural network model by adopting a training set and a verification set respectively to obtain a prediction model and a classification model, and completing prediction of crystal properties through the prediction model according to the neural network output;
here the prediction model uses mean square loss as the loss function and stochastic gradient descent as the optimizer; the mean square loss is shown in equation (7),
loss(x_i, y_i) = (x_i − y_i)^2   (7)
where x_i is the input (predicted) value and y_i is the target attribute value, namely the DFT calculated value; the prediction model uses the mean absolute error (MAE) as the index for evaluating model performance;
mean absolute error (MAE): MAE represents the average of the absolute errors between the predicted values and the test values, and is the evaluation index of the prediction model, as shown in equation (8),
MAE = (1/n)·Σ_(i=1)^n |x_i − y_i|   (8)
wherein x_i denotes the predicted value and y_i denotes the test value.
S4: classification of the crystal properties is accomplished by a classification model.
The classification process of the classification model for the crystal properties here is:
under the framework of the same neural network, the activation function of the output layer is changed into a logsoftmax activation function and matched with a negative log-likelihood loss function to realize the classification of crystal properties.
The logsoftmax activation function is shown in equation (9) and the negative log-likelihood loss function in equation (10),
logsoftmax(x_i) = log(exp(x_i) / Σ_j exp(x_j))   (9)
loss(x, class) = −x[class]   (10)
where x in equation (10) is the vector of log-probabilities output by the logsoftmax layer and class is the target class;
the classification model takes accuracy (accuracuracy) and area under ROC curve (AUC) as indexes for evaluating model performance.
The area under the ROC curve (AUC) is the sum of the areas of the portions under the ROC curve. The abscissa of the ROC curve is the false positive rate (FPR), i.e., the proportion of actual negative cases that are judged positive; the ordinate is the true positive rate (TPR), i.e., the proportion of actual positive cases that are judged positive. An AUC close to 1 indicates a better classification model.
In this step, positive and negative samples are prepared as the classification basis, specifically: crystals with a total magnetic moment greater than 0.5 μB are labeled 1, and crystals with a total magnetic moment less than 0.5 μB are labeled 0. The test outputs are values between 0 and 1; crystals with an output value greater than 0.5 are considered to have a total magnetic moment greater than 0.5 μB, and crystals with a value less than 0.5 a total magnetic moment less than 0.5 μB. Similarly, crystals with a band gap greater than 2.3 eV are labeled 1 and crystals with a band gap less than 2.3 eV are labeled 0; crystals with an output value greater than 0.5 are considered to have a band gap greater than 2.3 eV, and those with a value less than 0.5 a band gap less than 2.3 eV.
In this embodiment, the above scheme is applied in an example experiment. Data for 30,000 crystals were collected, of which 80% served as training data, 10% as validation data and 10% as test data. In the prediction experiments, the errors between the predicted values and the DFT calculated values are smallest for absolute energy and formation energy, with MAEs of 0.103 eV/atom and 0.060 eV/atom; the band gap and Fermi energy predictions deviate most from the DFT calculations, with MAEs of 0.312 eV and 0.343 eV. In the classification experiment for crystals with a total magnetic moment greater than 0.5 μB, the classification accuracy of the model reaches 87.9% with an AUC of 0.919. In the classification experiment for wide-bandgap semiconductor crystals with a band gap greater than 2.3 eV, the classification accuracy reaches 93.9% with an AUC of 0.981. The running environment is Win10 with an i7-10700k CPU and an RTX 3080 GPU.
Secondly, to better demonstrate the effect of the method of the present invention, this example performs a comparative experiment between the method of the invention and the CGCNN method. Under the same data and hyper-parameter conditions, the most obvious improvement of the method of the invention is in the band gap, whose MAE is reduced by 8.8%. In addition, the MAEs of formation energy and absolute energy are also reduced by 4.8% and 3.7%, respectively. Although the prediction error of the Fermi energy is the largest, its MAE still decreases by 1.4% in the comparison. These results further show that, compared with the CGCNN method, the method of the invention, which introduces an attention mechanism and node normalization, offers a clear improvement in prediction precision.
When classifying by total magnetic moment, CGCNN uses 80% of the collected data as training data and achieves 86.9% accuracy, whereas the method of the present invention requires only 60% of the data as training data to achieve the same accuracy. Moreover, when 80% of the data is likewise used as the training set, the accuracy of the method of the present invention is 1% higher. In classifying wide-bandgap semiconductor crystals, CGCNN achieves 92.1% accuracy using 80% of the data as training data, while the new method achieves almost the same accuracy using only 40% of the data as training data; when likewise training with 80% of the data, the accuracy of the method of the invention improves by 1.8%. Therefore, the accuracy level of CGCNN can be reached with less training data, and accuracy can be improved with a training set of the same size.