CN112765894A - K-LSTM-based aluminum electrolysis cell state prediction method - Google Patents
- Publication number
- CN112765894A (application CN202110111679.1A)
- Authority
- CN
- China
- Prior art keywords
- lstm
- information
- gate
- model
- algorithm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- C—CHEMISTRY; METALLURGY
- C25—ELECTROLYTIC OR ELECTROPHORETIC PROCESSES; APPARATUS THEREFOR
- C25C—PROCESSES FOR THE ELECTROLYTIC PRODUCTION, RECOVERY OR REFINING OF METALS; APPARATUS THEREFOR
- C25C3/00—Electrolytic production, recovery or refining of metals by electrolysis of melts
- C25C3/06—Electrolytic production, recovery or refining of metals by electrolysis of melts of aluminium
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention discloses a K-LSTM-based aluminum electrolysis cell state prediction method, which comprises the following steps: Step 1: normalize the data; Step 2: construct a training set and a test set according to a set sliding window size m; Step 3: construct the improved LSTM model and initialize its parameters; Step 4: train the prediction model on the training set, updating the parameters by gradient descent and iterating until the precision requirement is met; Step 5: feed the test set into the trained model and use the historical data to predict the value at time t+1. Based on the improved K-LSTM algorithm, the invention eliminates sample imbalance by setting a weight for the sample-imbalance problem in the LSTM forget-gate unit, and can effectively predict the aluminum cell state.
Description
Technical Field
The invention relates to the technical field of aluminum electrolysis industry, in particular to a K-LSTM-based aluminum electrolysis cell state prediction method.
Background
The production data of an aluminum electrolysis cell form a time series with high data dimensionality. Various prediction models exist for time series data, such as artificial neural networks, autoregressive moving average models, and wavelet neural networks. The study of time series prediction began with a regression equation that predicted the yearly sunspot number from data analysis. The autoregressive moving average model (ARMA) and the autoregressive integrated moving average model (ARIMA) show that time series prediction models based on regression methods have become increasingly popular.
These models are therefore also the simplest and most important models for time series prediction. However, because real data are complex, irregular, random, and nonlinear, it is difficult for such models to achieve high-precision prediction. With machine learning methods, a nonlinear prediction model can be built from a large amount of historical data. In fact, through iterative training and learned approximation, machine learning models can achieve more accurate predictions than traditional statistics-based models. Typical methods are support vector regression and kernel-based classification, artificial neural networks (ANN) with strong nonlinear function approximation, and tree-based ensemble learning methods such as gradient boosted regression trees and gradient boosted decision trees (GBRT, GBDT). However, these methods have limited effect on time series prediction tasks because they lack an effective treatment of the sequential dependency between input variables.
With continued research into deep learning algorithms, it has been found that they are well suited to predicting time series data: such an algorithm progressively analyzes the input data, extracts effective features, and uncovers implicit relations in the data sequence. To let the network process time series data more effectively, the RNN architecture introduces the concept of time steps into the neural network. The long short-term memory (LSTM) network is an improved RNN that resolves the gradient explosion, gradient vanishing, and long-sequence memory problems of the plain RNN structure and can effectively process long sequence information. The LSTM model has been applied in many fields such as speech recognition, stock price prediction, rainfall prediction, traffic flow prediction, and image and character recognition, with good results.
Because aluminum electrolysis is an industrial process with large time lags whose state changes infrequently, training a neural network on the existing data runs into a sample-imbalance problem. The first gate of the LSTM is the forget gate, which determines whether some information is discarded from the memory cell; this decision is what the forget gate must make, and it is handled by a sigmoid function. However, during operation, if the state of the input data changes infrequently, the forget gate can stay in state 1 for a long time — that is, the previous state is reused and no update is needed — so the forget-gate output f_t suffers from a sample-imbalance problem.
The sample-imbalance problem arises mainly in supervised machine learning tasks. It shows up when a model is trained on imbalanced data: in a classification prediction task the model tends to pay more attention to the classes with many samples, so prediction for the minority classes is poor, and most common machine learning methods cannot work effectively on imbalanced data sets. There are generally two approaches to the problem: undersampling and oversampling.
(1) Undersampling: undersampling reduces the number of majority-class samples and is usually chosen when the amount of data is large enough to support it. The samples of the minority classes are retained, the number of majority-class samples is reduced to match, and the model is then built on the balanced data.
(2) Oversampling: when the amount of data is not sufficient to support method (1), oversampling is chosen. Instead of removing majority-class samples, minority-class samples are added to balance the data set, for example by repetition, bootstrapping, or synthetic minority oversampling.
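As a rough illustration, the two resampling strategies can be sketched with the Python standard library (the class sizes and the random-sampling choices below are illustrative assumptions, not part of the invention):

```python
import random

def undersample(majority, minority, seed=0):
    """Randomly drop majority-class samples until both classes are the same size."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)) + list(minority)

def oversample(majority, minority, seed=0):
    """Randomly duplicate minority-class samples (sampling with replacement)."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority) + list(minority) + extra

majority = [(x, 0) for x in range(90)]   # 90 samples of the frequent class
minority = [(x, 1) for x in range(10)]   # 10 samples of the rare class

under = undersample(majority, minority)  # 20 balanced samples
over = oversample(majority, minority)    # 180 balanced samples
```

Either way, the class ratio after resampling is 1:1; the difference is whether information is discarded (undersampling) or repeated (oversampling).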
However, neither oversampling nor undersampling has an absolute advantage over the other; which of the two methods to apply depends on the data set at hand.
Disclosure of Invention
The invention aims to provide a K-LSTM-based aluminum cell state prediction method that addresses the problems described in the background art and achieves cost reduction and efficiency improvement.
To achieve this purpose, the invention targets the sample-imbalance problem in the forget gate. Drawing on the two methods described in the background art, it judges the condition and sets a corresponding weight, reducing the weight when the forget-gate output f_t stays in the same state for a long time, so as to rebalance the samples.
Therefore, the K-LSTM-based aluminum electrolysis cell state prediction method specifically comprises the following steps:
Step 1: normalize the data;
Step 2: construct a training set and a test set according to a set sliding window size m;
Step 3: construct the improved LSTM model and initialize its parameters;
Step 4: train the prediction model on the training set, updating the parameters by gradient descent and iterating until the precision requirement is met;
Step 5: feed the test set into the trained model and use the historical data to predict the value at time t+1.
The improved LSTM model is constructed based on the LSTM algorithm and the improved K-LSTM algorithm.
The K-LSTM algorithm is implemented as follows:
the three gate structures of the LSTM comprise an input gate, an output gate and a forgetting gate; the calculation process is as follows:
(1) Forget gate: used to decide which of the previous memory information to discard. The output value h_{t-1} of time t-1 and the input value x_t of the current time t are linearly combined, and the result is compressed into the range [0, 1] by a sigmoid function. The closer the value f_t is to 1, the more of the previous cell-state information is retained; the closer to 0, the more is discarded. The forget gate is computed as:
f_t = sigmoid(W_f · [h_{t-1}, x_t] + b_f)   (1)
(2) Input gate: used to process the input value x_t of the current time t and the output h_{t-1} of the previous moment as the total input information, again through a sigmoid function; x_t and h_{t-1} are then passed through a tanh layer to obtain the new candidate cell information C̃_t, which combines the input information of the current moment with the memory information of the previous moment. The calculation process is as follows:
i_t = sigmoid(W_i · [h_{t-1}, x_t] + b_i)   (2)
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)   (3)
Then the old cell information C_{t-1} is updated to the new cell information C_t. The update lets the forget gate select what to discard from the old cell information and the input gate add the new information of the current moment; together they determine the finally updated cell information C_t. The calculation process is as follows:
C_t = f_t * C_{t-1} + i_t * C̃_t   (4)
(3) Output gate: used to determine the information h_t passed to the next moment. The output information first passes through a sigmoid function to obtain a judgment condition; the cell state is then passed through a tanh function to obtain a vector with values in the range [-1, 1]. Multiplying this vector by the judgment condition gives the output of the current moment. The calculation process is as follows:
o_t = sigmoid(W_o · [h_{t-1}, x_t] + b_o)   (5)
h_t = o_t * tanh(C_t)   (6)
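Equations (1)–(6) can be sketched as a single NumPy time step. This is a minimal illustration of a standard LSTM cell, not the patent's implementation; the weight layout and the sizes (13 input features, as in the embodiment, and 8 hidden units) are assumptions chosen for the example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM time step following equations (1)-(6).
    W, b hold the parameters of the forget (f), input (i),
    candidate (c) and output (o) gates."""
    z = np.concatenate([h_prev, x_t])      # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])     # (1) forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])     # (2) input gate
    C_hat = np.tanh(W["c"] @ z + b["c"])   # (3) candidate cell information
    C_t = f_t * C_prev + i_t * C_hat       # (4) cell-state update
    o_t = sigmoid(W["o"] @ z + b["o"])     # (5) output gate
    h_t = o_t * np.tanh(C_t)               # (6) hidden output
    return h_t, C_t

# Illustrative sizes: 13 input features, 8 hidden units.
n_in, n_hid = 13, 8
rng = np.random.default_rng(0)
W = {k: rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for k in "fico"}
b = {k: np.zeros(n_hid) for k in "fico"}
h, C = np.zeros(n_hid), np.zeros(n_hid)
h, C = lstm_step(rng.normal(size=n_in), h, C, W, b)
```

Since h_t = o_t * tanh(C_t) with o_t in (0, 1) and tanh in (-1, 1), each hidden output stays strictly inside (-1, 1).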
when the network is learned, the offset derivatives of the loss functions on all the parameters are calculated, then the parameters are updated, the iteration is carried out in sequence until the loss functions are converged, and the offset derivatives of the loss functions on ft are intervened on the basis of the original LSTM gate control unit to avoid the problem of sample imbalance; according to whether the slot state is changed, if the slot state is changed, a larger weight is selected, and if the slot state is not changed, a smaller weight is selected;
f=f*k+tf.stop_gradient(f-f*k) (7)
using tf.stop _ gradient () function, according to the characteristics of tf.stop _ gradient () function, tf.stop _ gradient () does not work in the forward process, so + (f × k) and- (f × k) cancel out, leaving only f forward pass; in the reverse process, the gradient of f-f × k becomes 0 due to the action of tf.
The improved K-LSTM algorithm comprises the following steps:
Step (1): compute the inputs and outputs of the forward propagation of the K-LSTM and the output values of each hidden-layer neuron;
Step (2): compute the output error through the cross-entropy function and back-propagate the error to each layer of neural units through the back-propagation algorithm;
Step (3): in the backward propagation of the forget gate, select a weight according to the judgment condition to modify the original propagation function;
Step (4): update the parameters of each layer of neurons according to the gradient descent algorithm and the propagated error;
Step (5): repeat steps (2), (3), and (4) for the set number of iterations until convergence; model training is then complete.
Compared with the prior art, the beneficial effects of the method are: based on the improved K-LSTM algorithm, the invention eliminates sample imbalance by setting a weight for the sample-imbalance problem in the LSTM forget-gate unit, and can effectively predict the aluminum cell state.
Drawings
FIG. 1 is a flow chart of the improved K-LSTM algorithm.
FIG. 2 is a flow chart of K-LSTM-based slot state prediction.
Fig. 3 is a schematic diagram of production data storage.
FIG. 4 is a graph of the predicted results of K-LSTM.
FIG. 5 is a graph of LSTM prediction results.
FIG. 6 is a graph of EA-LSTM prediction results.
Detailed Description
The technical solution of the present patent will be described in further detail with reference to the following embodiments.
The K-LSTM algorithm is implemented as follows:
the three gate structures of the LSTM comprise an input gate, an output gate and a forgetting gate; the calculation process is as follows:
(1) Forget gate: used to decide which of the previous memory information to discard. The output value h_{t-1} of time t-1 and the input value x_t of the current time t are linearly combined, and the result is compressed into the range [0, 1] by a sigmoid function. The closer the value f_t is to 1, the more of the previous cell-state information is retained; the closer to 0, the more is discarded. The forget gate is computed as:
f_t = sigmoid(W_f · [h_{t-1}, x_t] + b_f)   (1)
(2) Input gate: used to process the input value x_t of the current time t and the output h_{t-1} of the previous moment as the total input information, again through a sigmoid function; x_t and h_{t-1} are then passed through a tanh layer to obtain the new candidate cell information C̃_t, which combines the input information of the current moment with the memory information of the previous moment. The calculation process is as follows:
i_t = sigmoid(W_i · [h_{t-1}, x_t] + b_i)   (2)
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)   (3)
Then the old cell information C_{t-1} is updated to the new cell information C_t. The update lets the forget gate select what to discard from the old cell information and the input gate add the new information of the current moment; together they determine the finally updated cell information C_t. The calculation process is as follows:
C_t = f_t * C_{t-1} + i_t * C̃_t   (4)
(3) Output gate: used to determine the information h_t passed to the next moment. The output information first passes through a sigmoid function to obtain a judgment condition; the cell state is then passed through a tanh function to obtain a vector with values in the range [-1, 1]. Multiplying this vector by the judgment condition gives the output of the current moment. The calculation process is as follows:
o_t = sigmoid(W_o · [h_{t-1}, x_t] + b_o)   (5)
h_t = o_t * tanh(C_t)   (6)
when the network is learned, the offset derivatives of the loss functions on all the parameters are calculated, then the parameters are updated, the iteration is carried out in sequence until the loss functions are converged, and the offset derivatives of the loss functions on ft are intervened on the basis of the original LSTM gate control unit to avoid the problem of sample imbalance; according to whether the slot state is changed, if the slot state is changed, a larger weight is selected, and if the slot state is not changed, a smaller weight is selected;
f=f*k+tf.stop_gradient(f-f*k) (7)
using tf.stop _ gradient () function, according to the characteristics of tf.stop _ gradient () function, tf.stop _ gradient () does not work in the forward process, so + (f × k) and- (f × k) cancel out, leaving only f forward pass; in the reverse process, the gradient of f-f × k becomes 0 due to the action of tf.
The improved algorithm flow chart is shown in fig. 1, and the improved K-LSTM algorithm comprises the following steps:
Step (1): compute the inputs and outputs of the forward propagation of the K-LSTM and the output values of each hidden-layer neuron;
Step (2): compute the output error through the cross-entropy function and back-propagate the error to each layer of neural units through the back-propagation algorithm;
Step (3): in the backward propagation of the forget gate, select a weight according to the judgment condition to modify the original propagation function;
Step (4): update the parameters of each layer of neurons according to the gradient descent algorithm and the propagated error;
Step (5): repeat steps (2), (3), and (4) for the set number of iterations until convergence; model training is then complete.
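Step (4) — the gradient descent parameter update — can be sketched as follows (the parameter names and learning rate are illustrative assumptions):

```python
import numpy as np

def gradient_descent_step(params, grads, lr=0.01):
    """Step (4): move every parameter against its propagated gradient."""
    return {name: p - lr * grads[name] for name, p in params.items()}

# Toy forget-gate parameters and gradients, purely for illustration.
params = {"W_f": np.ones((2, 2)), "b_f": np.zeros(2)}
grads = {"W_f": np.full((2, 2), 0.5), "b_f": np.ones(2)}
params = gradient_descent_step(params, grads, lr=0.1)
```

Each iteration of steps (2)–(4) applies one such update; training stops after the set number of iterations or when the loss converges.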
The modified K-LSTM algorithm is shown in Table 1:
TABLE 1 improved K-LSTM Algorithm
In this embodiment, an LSTM cell-state prediction model is constructed using the Keras framework of TensorFlow. All programs are written in Python, and the prediction experiments are run on a computer with a 2.50 GHz CPU, 8 GB of memory, and the Windows 7 operating system; the data are data with cluster attributes.
Using the improved K-LSTM algorithm, a sliding window of size m is set and the training and test sets are constructed from it; the model is trained with cross entropy chosen as the loss function to reflect the deviation between the predicted data and the real data, and the trained model is finally used to predict the state of the aluminum electrolysis cell.
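The cross-entropy loss used here can be sketched for the binary cell state (the probabilities below are made-up values for illustration):

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy between true cell states (0/1) and
    predicted probabilities of state 1."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1 - y_true) * np.log(1 - y_pred)))

y_true = np.array([0, 1, 1, 0])
y_pred = np.array([0.1, 0.9, 0.8, 0.2])  # illustrative model outputs
loss = binary_cross_entropy(y_true, y_pred)
```

The loss goes to 0 as the predicted probabilities approach the true 0/1 states, which is why it is a natural fit for the two-state cell prediction task.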
To further verify the validity of the algorithm, the improved algorithm was compared in accuracy with the conventional LSTM and the attention-based LSTM algorithms, respectively.
As shown in FIG. 2, the flow of predicting the aluminum cell state using the modified K-LSTM is as follows:
Step 1: normalize the data;
Step 2: construct a training set and a test set according to a set sliding window size m;
Step 3: construct the improved LSTM model and initialize its parameters;
Step 4: train the prediction model on the training set, updating the parameters by gradient descent and iterating until the precision requirement is met;
Step 5: feed the test set into the trained model and use the historical data to predict the value at time t+1.
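Steps 1–2 — building (window, next value) pairs with a sliding window of size m and splitting them chronologically — can be sketched as follows (the series, window size, and 80/20 split ratio are illustrative assumptions):

```python
import numpy as np

def make_windows(series, m):
    """Split a sequence into (window of m past values, next value) pairs."""
    X = np.array([series[i:i + m] for i in range(len(series) - m)])
    y = np.array([series[i + m] for i in range(len(series) - m)])
    return X, y

def train_test_split(X, y, ratio=0.8):
    """Chronological split: earlier windows for training, later for testing."""
    cut = int(len(X) * ratio)
    return X[:cut], y[:cut], X[cut:], y[cut:]

series = np.arange(100, dtype=float)   # stand-in for one normalized cell feature
X, y = make_windows(series, m=5)
X_tr, y_tr, X_te, y_te = train_test_split(X, y)
```

A chronological (rather than shuffled) split keeps the test set strictly in the future of the training set, matching the t+1 prediction setting.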
In this example, the data are derived from real aluminum cell production data (as shown in fig. 3), collected once a day; each record contains 13 features including Fe content, aluminum level, molecular ratio, Si content, alumina concentration, electrolyte level, and electrolysis temperature. Before the information hidden in the data is analyzed and mined, the flawed data must be preprocessed: because the data used in this embodiment contain null values, null-value and noise processing is performed first, and the data are then normalized in view of their high dimensionality.
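The preprocessing described above — null-value handling followed by normalization — can be sketched per feature column (the temperature values and the mean-imputation choice are illustrative assumptions, not the patent's stated method):

```python
import numpy as np

def fill_nulls(col):
    """Replace missing entries (NaN) with the column mean — one simple choice."""
    col = col.astype(float)  # copy, so the caller's array is untouched
    mask = np.isnan(col)
    col[mask] = col[~mask].mean()
    return col

def min_max_normalize(col):
    """Scale a feature column to the range [0, 1]."""
    lo, hi = col.min(), col.max()
    return (col - lo) / (hi - lo) if hi > lo else np.zeros_like(col)

raw = np.array([930.0, np.nan, 955.0, 940.0])   # e.g. daily electrolysis temperature
clean = min_max_normalize(fill_nulls(raw))
```

Each of the 13 feature columns would be treated the same way before the sliding windows are built.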
The results of predicting the state of the aluminum electrolysis cell with K-LSTM are shown in FIG. 4. For clarity, only the first 200 data points are shown. In FIG. 4 the abscissa is the time series and the ordinate is the cell state, with two states in total, 0 and 1; the two dotted lines are the true cell state and the cell state predicted by the model. The cross-entropy LOSS is 0.0418 and the accuracy reaches 99.6%.
FIG. 5 shows the prediction results of the conventional LSTM model. It can be seen that when the cell state does not change, the predicted values fit the true values well overall; but when the cell state changes suddenly, the model cannot make an accurate prediction in time, and the prediction only catches up some time after the state has changed.
FIG. 6 shows that the EA-LSTM model built on the attention mechanism predicts more accurately than the conventional LSTM model, but when the cell state changes it still fails to predict the change accurately and generally carries over the cell state of the previous moment.
To verify the effectiveness of the algorithm, the K-LSTM algorithm is compared and analyzed against the traditional LSTM algorithm and the attention-based EA-LSTM algorithm, using the same number of iterations, sliding window size, and number of neurons.
From Table 2 it can be seen that the improved K-LSTM improves the accuracy of cell-state prediction to some extent: both the model built on the traditional LSTM and the EA-LSTM model are inferior to the improved K-LSTM model in terms of error and accuracy.
TABLE 2 comparative analysis of the models
The comparison shows that the improved K-LSTM of the present invention brings a significant improvement on this problem and predicts changes in the cell state more quickly. This helps operators detect an abnormal cell state sooner and make timely decisions to prevent further deterioration of the cell condition.
The embodiment conducts experiments to verify the feasibility and effectiveness of the algorithm, including comparison experiments with the traditional LSTM model and the attention-based EA-LSTM model; the results show that the improved K-LSTM prediction is clearly superior to the other two models and improves the accuracy of cell-state prediction.
Although the preferred embodiments of the present patent have been described in detail, the present patent is not limited to the above embodiments, and various changes can be made without departing from the spirit of the present patent within the knowledge of those skilled in the art.
Claims (4)
1. A K-LSTM-based aluminum electrolysis cell state prediction method is characterized by comprising the following steps:
Step 1: normalize the data;
Step 2: construct a training set and a test set according to a set sliding window size m;
Step 3: construct the improved LSTM model and initialize its parameters;
Step 4: train the prediction model on the training set, updating the parameters by gradient descent and iterating until the precision requirement is met;
Step 5: feed the test set into the trained model and use the historical data to predict the value at time t+1.
2. The method of claim 1, wherein the improved LSTM model is constructed based on the LSTM algorithm and the improved K-LSTM algorithm.
3. The method of claim 2, wherein the improved K-LSTM algorithm comprises the steps of:
Step (1): compute the inputs and outputs of the forward propagation of the K-LSTM and the output values of each hidden-layer neuron;
Step (2): compute the output error through the cross-entropy function and back-propagate the error to each layer of neural units through the back-propagation algorithm;
Step (3): in the backward propagation of the forget gate, select a weight according to the judgment condition to modify the original propagation function;
Step (4): update the parameters of each layer of neurons according to the gradient descent algorithm and the propagated error;
Step (5): repeat steps (2), (3), and (4) for the set number of iterations until convergence; model training is then complete.
4. The K-LSTM based aluminum reduction cell condition prediction method of claim 1, wherein the K-LSTM algorithm is implemented as follows:
the three gate structures of the LSTM comprise an input gate, an output gate and a forgetting gate; the calculation process is as follows:
(1) Forget gate: used to decide which of the previous memory information to discard. The output value h_{t-1} of time t-1 and the input value x_t of the current time t are linearly combined, and the result is compressed into the range [0, 1] by a sigmoid function. The closer the value f_t is to 1, the more of the previous cell-state information is retained; the closer to 0, the more is discarded. The forget gate is computed as:
f_t = sigmoid(W_f · [h_{t-1}, x_t] + b_f)   (1)
(2) Input gate: used to process the input value x_t of the current time t and the output h_{t-1} of the previous moment as the total input information, again through a sigmoid function; x_t and h_{t-1} are then passed through a tanh layer to obtain the new candidate cell information C̃_t, which combines the input information of the current moment with the memory information of the previous moment. The calculation process is as follows:
i_t = sigmoid(W_i · [h_{t-1}, x_t] + b_i)   (2)
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C)   (3)
Then the old cell information C_{t-1} is updated to the new cell information C_t. The update lets the forget gate select what to discard from the old cell information and the input gate add the new information of the current moment; together they determine the finally updated cell information C_t. The calculation process is as follows:
C_t = f_t * C_{t-1} + i_t * C̃_t   (4)
(3) Output gate: used to determine the information h_t to be passed on to the next moment. The output likewise obtains a gating value through a sigmoid function; the cell state is then passed through a tanh function to obtain a vector with values in the range [-1,1], which is multiplied element-wise by the gating value to obtain the output at the current moment. The calculation process is as follows:
o_t = sigmoid(W_o · [h_{t-1}, x_t] + b_o)  (5)
h_t = o_t * tanh(C_t)  (6)
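A single forward step of the gate equations (1)–(6) can be sketched in plain numpy. The function name, dimensions, and random parameters here are illustrative only; the patent's actual network dimensions are not specified in this section.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev,
              W_f, b_f, W_i, b_i, W_C, b_C, W_o, b_o):
    """One LSTM time step following equations (1)-(6)."""
    z = np.concatenate([h_prev, x_t])     # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)          # (1) forget gate
    i_t = sigmoid(W_i @ z + b_i)          # (2) input gate
    C_tilde = np.tanh(W_C @ z + b_C)      # (3) candidate cell info
    C_t = f_t * C_prev + i_t * C_tilde    # (4) updated cell state
    o_t = sigmoid(W_o @ z + b_o)          # (5) output gate
    h_t = o_t * np.tanh(C_t)              # (6) output
    return h_t, C_t

# Toy dimensions: 2 inputs, 3 hidden units, random parameters.
rng = np.random.default_rng(1)
n_in, n_hid = 2, 3
params = [rng.normal(size=s)
          for s in [(n_hid, n_hid + n_in), n_hid] * 4]
h, C = np.zeros(n_hid), np.zeros(n_hid)
h, C = lstm_step(rng.normal(size=n_in), h, C, *params)
```

Because h_t = o_t * tanh(C_t) with o_t in (0,1), every component of the output lies strictly inside (-1,1).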
When the network is trained, the partial derivatives of the loss function with respect to all parameters are calculated, the parameters are updated, and the iteration proceeds until the loss function converges. On the basis of the original LSTM gate unit, the gradient of the loss function with respect to f_t is intervened upon to alleviate the problem of sample imbalance: according to whether the condition of the electrolysis cell changes, a larger weight k is selected if it changes and a smaller weight if it does not;
f=f*k+tf.stop_gradient(f-f*k) (7)
Equation (7) uses the tf.stop_gradient() function. By its nature, tf.stop_gradient() has no effect in the forward pass, so the +f*k and -f*k terms cancel and only f is passed forward; in the backward pass, the gradient of (f - f*k) is forced to 0 by tf.stop_gradient(), so the gradient flows only through the f*k term and the gradient of f is scaled by k.
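The two-sided behavior of equation (7) can be checked by hand in numpy, treating stop_gradient as the identity in the forward direction and as a zero-gradient term in the backward direction. This is a sketch of the mechanism, not TensorFlow itself; the helper name is made up.

```python
import numpy as np

def stop_gradient_forward(v):
    # stop_gradient is the identity in the forward pass
    return v

f, k = np.array([0.2, 0.7, 0.9]), 0.1

# Forward pass of equation (7): f*k and -f*k cancel, leaving f.
f_new = f * k + stop_gradient_forward(f - f * k)
print(np.allclose(f_new, f))   # prints True

# Backward pass: d(f*k)/df = k, d(stop_gradient(f - f*k))/df = 0,
# so the gradient of f_new with respect to f is scaled to k.
grad_f_new_wrt_f = k + 0.0
```

The same identity is what lets the k weight chosen from the cell-condition change rebalance the gradient without altering the forward value of the forget gate.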
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2020113346304 | 2020-11-25 | ||
CN202011334630 | 2020-11-25 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112765894A true CN112765894A (en) | 2021-05-07 |
CN112765894B CN112765894B (en) | 2023-05-05 |
Family
ID=75706135
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110111679.1A Active CN112765894B (en) | 2020-11-25 | 2021-01-27 | K-LSTM-based aluminum electrolysis cell state prediction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112765894B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2230882A1 (en) * | 1997-03-14 | 1998-09-14 | Dubai Aluminium Company Limited | Intelligent control of aluminium reduction cells using predictive and pattern recognition techniques |
CN1471627A (en) * | 2000-10-26 | 2004-01-28 | | A fault tolerant liquid measurement system using multiple-model state estimators |
CN201334531Y (en) * | 2008-12-02 | 2009-10-28 | 北方工业大学 | Novel potline stop-start shunting device and system |
WO2017026010A1 (en) * | 2015-08-07 | 2017-02-16 | 三菱電機株式会社 | Device for predicting amount of photovoltaic power generation, and method for predicting amount of photovoltaic power generation |
CN109543699A (en) * | 2018-11-28 | 2019-03-29 | 北方工业大学 | Image abstract generation method based on target detection |
CN109614885A (en) * | 2018-11-21 | 2019-04-12 | 齐鲁工业大学 | A kind of EEG signals Fast Classification recognition methods based on LSTM |
CN110770760A (en) * | 2017-05-19 | 2020-02-07 | 渊慧科技有限公司 | Object-level prediction of future states of a physical system |
WO2020075767A1 (en) * | 2018-10-10 | 2020-04-16 | 旭化成株式会社 | Planning device, planning method, and planning program |
CN111563706A (en) * | 2020-03-05 | 2020-08-21 | 河海大学 | Multivariable logistics freight volume prediction method based on LSTM network |
US20200348662A1 (en) * | 2016-05-09 | 2020-11-05 | Strong Force Iot Portfolio 2016, Llc | Platform for facilitating development of intelligence in an industrial internet of things system |
- 2021-01-27 CN CN202110111679.1A patent/CN112765894B/en active Active
Non-Patent Citations (2)
Title |
---|
Hou Jie et al., "LSTM-based prediction of aluminum electrolysis cell condition" * |
Kong Shuqi, "Research on state prediction algorithms for aluminum electrolysis cells" * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114959797A (en) * | 2022-07-04 | 2022-08-30 | 广东技术师范大学 | Aluminum electrolysis cell condition diagnosis method based on data amplification and SSKELM |
CN115081744A (en) * | 2022-07-22 | 2022-09-20 | 重庆师范大学 | Method for predicting unit energy consumption of aluminum electrolysis manufacturing system |
CN115081744B (en) * | 2022-07-22 | 2024-05-07 | 重庆师范大学 | Unit energy consumption prediction method for aluminum electrolysis manufacturing system |
CN116288532A (en) * | 2023-03-13 | 2023-06-23 | 赛富能科技(深圳)有限公司 | Method and equipment for monitoring electrolytic tank |
Also Published As
Publication number | Publication date |
---|---|
CN112765894B (en) | 2023-05-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112765894B (en) | K-LSTM-based aluminum electrolysis cell state prediction method | |
CN111860785A (en) | Time sequence prediction method and system based on attention mechanism cyclic neural network | |
CN112990556A (en) | User power consumption prediction method based on Prophet-LSTM model | |
CN110084610B (en) | Network transaction fraud detection system based on twin neural network | |
CN110909926A (en) | TCN-LSTM-based solar photovoltaic power generation prediction method | |
CN110782658B (en) | Traffic prediction method based on LightGBM algorithm | |
CN111079931A (en) | State space probabilistic multi-time-series prediction method based on graph neural network | |
CN111277434A (en) | Network flow multi-step prediction method based on VMD and LSTM | |
CN112053560B (en) | Short-time traffic flow prediction method, system and storage medium based on neural network | |
CN113642225A (en) | CNN-LSTM short-term wind power prediction method based on attention mechanism | |
CN111860989B (en) | LSTM neural network short-time traffic flow prediction method based on ant colony optimization | |
CN113487855B (en) | Traffic flow prediction method based on EMD-GAN neural network structure | |
EP3792841A1 (en) | Automated feature generation for machine learning application | |
CN112232604B (en) | Prediction method for extracting network traffic based on Prophet model | |
CN113449919B (en) | Power consumption prediction method and system based on feature and trend perception | |
CN112766603A (en) | Traffic flow prediction method, system, computer device and storage medium | |
CN116542701A (en) | Carbon price prediction method and system based on CNN-LSTM combination model | |
CN113052373A (en) | Monthly runoff change trend prediction method based on improved ELM model | |
CN114500004A (en) | Anomaly detection method based on conditional diffusion probability generation model | |
CN116665483A (en) | Novel method for predicting residual parking space | |
CN114219531A (en) | Waste mobile phone dynamic pricing method based on M-WU concept drift detection | |
CN116303786B (en) | Block chain financial big data management system based on multidimensional data fusion algorithm | |
CN107979606A (en) | It is a kind of that there is adaptive distributed intelligence decision-making technique | |
CN111524348A (en) | Long-short term traffic flow prediction model and method | |
CN109033413B (en) | Neural network-based demand document and service document matching method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||