CN117291069A

CN117291069A - LSTM sewage water quality prediction method based on improved DE and attention mechanism

Info

Publication number: CN117291069A
Application number: CN202311148279.3A
Authority: CN
Inventors: 任健; 周红标; 秦源汇; 杨丹; 刘帅祥; 徐浩渊
Original assignee: Huaiyin Institute of Technology
Current assignee: Huaiyin Institute of Technology
Priority date: 2023-09-06
Filing date: 2023-09-06
Publication date: 2023-12-26

Abstract

The invention discloses an LSTM sewage water quality prediction method based on an improved DE and an attention mechanism, which comprises the following steps: 1) preprocessing an important characteristic data set of a sewage treatment plant; 2) constructing and initializing an ATT-LSTM water quality prediction model; 3) optimizing the hyperparameters of the ATT-LSTM water quality prediction model by using ADE to obtain an ADE-ATT-LSTM water quality prediction model with optimal hyperparameters; 4) constructing a training set and a testing set based on the preprocessed data to train the ADE-ATT-LSTM water quality prediction model with optimal hyperparameters; 5) inputting the acquired data into the trained ADE-ATT-LSTM water quality prediction model to obtain a DO prediction result. The method ensures that the prediction model uses optimal parameters, thereby effectively improving the water quality prediction precision; in addition, the invention combines ADE with the ATT and LSTM models, thereby greatly improving the calculation efficiency of the water quality prediction model.

Description

LSTM sewage water quality prediction method based on improved DE and attention mechanism

Technical Field

The invention belongs to the technical field of water quality prediction, and particularly relates to an LSTM sewage water quality prediction method based on an improved DE and attention mechanism.

Background

At present, the most effective way to prevent further deterioration and pollution of water resources is to improve the ability to predict key characteristics of the sewage treatment process; however, the uncertainty, nonlinearity and time lag of the process are difficult to describe with a traditional mathematical model. Meanwhile, the change of dissolved oxygen (DO) in a water body is closely related to the metabolic rate of the water body, so DO is very important in evaluating the health of the water ecosystem of a sewage treatment plant. However, the lack of reliable DO prediction methods severely hampers the effective monitoring and control of sewage treatment quality. Although conventional methods, such as the determination of the potassium dichromate and potassium permanganate indices, can yield accurate concentrations, they inevitably introduce a significant time delay, ranging from minutes to hours. Such a delay is too long for advanced sewage treatment systems that require more accurate and timely control. Improving the ability to predict the important characteristics of a sewage treatment plant can assist the planning of the plant and is of great significance for controlling water pollution in the region.

Traditional prediction methods are time-consuming and costly, and can hardly meet the requirement of rapid prediction in practical applications. In recent years, neural networks, with their strong fitting capability and adaptability, have gradually been introduced into the field of wastewater treatment for data-driven modeling. However, conventional networks ignore the time-series characteristics of the wastewater data and do not handle the sequential dependencies between input variables effectively, which limits their ability to make time-series predictions. LSTM neural networks have been proposed to capture both the temporal and the nonlinear relationships in wastewater data. However, as with conventional neural networks, their hyperparameters are difficult to select.

To solve this problem, the present invention provides a prediction method based on a long short-term memory (LSTM) neural network model combined with an improved differential evolution algorithm (ADE) and an attention mechanism (ATT). Because the hyperparameters of the LSTM are difficult to choose, the ATT-LSTM is optimized with ADE. In addition, ATT is added to the LSTM model to mine the water quality characteristics and improve the DO prediction accuracy, which is of practical engineering significance.

Disclosure of Invention

The invention aims to: aiming at the problems pointed out in the background art, the invention provides the LSTM sewage water quality prediction method based on the improved DE and attention mechanism, and the ATT-LSTM model is optimized through ADE, and finally DO prediction is carried out on sewage, so that an accurate DO prediction result is obtained, the calculation efficiency of the model is improved, and the prediction accuracy of the model is improved.

The technical scheme is as follows: the invention provides an LSTM sewage water quality prediction method based on an improved DE and attention mechanism, which comprises the following steps:

step 1: preprocessing an important characteristic data set of a sewage treatment plant, wherein the characteristic data set comprises pH, conductivity, water temperature, turbidity, potassium permanganate, total phosphorus, ammonia nitrogen, total nitrogen and dissolved oxygen historical data of the previous moments;

step 2: constructing and initializing an ATT-LSTM water quality prediction model, wherein the ATT-LSTM water quality prediction model is a long short-term memory (LSTM) neural network model based on an attention mechanism (ATT);

step 3: based on the preprocessed data, optimizing the hyperparameters of the ATT-LSTM water quality prediction model by using an improved differential evolution algorithm (ADE) to obtain an ADE-ATT-LSTM water quality prediction model with optimal hyperparameters;

step 4: constructing a training set and a testing set based on the preprocessed data to train the ADE-ATT-LSTM water quality prediction model with optimal hyperparameters;

step 5: inputting the collected data set, comprising the historical data of pH, conductivity, water temperature, turbidity, potassium permanganate, total phosphorus, ammonia nitrogen, total nitrogen and dissolved oxygen at the previous moments, into the trained ADE-ATT-LSTM water quality prediction model to obtain a DO value prediction result.

Further, the step of preprocessing the data set in the step 1 includes:

s11: outlier processing: firstly, carrying out descriptive statistics on the characteristic values to check for unreasonable data and to check whether the data follow a normal distribution; when a sample lies more than 3 standard deviations from the mean, the sample is identified as an outlier and deleted from the data set;

s12: missing value processing: indexing the data containing the missing values in the data set, and filling the indexed missing values according to the data of the previous time point and the next time point;

s13: normalization: the data of different scales and ranges are mapped into a unified range to eliminate dimensional differences between different features.

Further, the attention mechanism in the step 2 assigns corresponding weights according to the importance of the input features, and the calculation formula is as follows:

where T is the total number of time steps; h_m is the output feature vector of the LSTM; α_m is the result of the first weighting, computed by the fully connected layer; W_l^{αh} and b_α are the weight matrix and the bias of the fully connected layer, respectively; β_m is the final weight, computed by the softmax activation function and assigned to the corresponding h_m; and the vector γ is the extracted important feature.

Further, optimizing the hyperparameters of the ATT-LSTM water quality prediction model using ADE based on the preprocessed data in step 3 includes:

s31: the number of hidden layer neurons and the learning rate of the ATT-LSTM water quality prediction model are selected as optimization parameters of an ADE algorithm, the optimization range of the parameters to be optimized is determined, and the population P is initialized randomly;

s32: calculating the fitness value of each individual in the ADE algorithm:

where p is the number of samples, y_p is the true value of the sample, and ŷ_p is the predicted value;

s33: randomly dividing the whole population P into an exploration sub-population P₁, an exploitation sub-population P₂ and a balance sub-population P₃, and adaptively selecting a mutation strategy for each;

s34: entering the mutation stage, and executing the corresponding mutation operation according to the corresponding mutation strategy and the adaptive mutation factor F;

s35: entering the crossover stage, and executing the corresponding crossover operation according to the adaptive crossover probability Cr;

s36: calculating the fitness value of the individual, determining the optimal value in the 3 sub-populations, and combining the 3 sub-populations;

s37: judging whether the condition for ending the iteration is satisfied; if the iteration number reaches the maximum value, returning the optimal hyperparameters of the model; otherwise, repeating from S33 until the termination condition is satisfied.

Further, the adaptive mutation strategy selection for the sub-populations in S33 is specifically as follows:

1) Exploration sub-population P₁

The exploration sub-population P₁ adopts DE/rand/2 as its mutation strategy; the target individual is perturbed by two random difference vectors, and the mutation strategy is as follows:

"DE/rand/2": V_{i,G} = X_{r1,G} + F × (X_{r2,G} - X_{r3,G}) + F × (X_{r4,G} - X_{r5,G})

where V_{i,G} is the mutation vector generated by the mutation strategy for individual X_{i,G} in the current population; r1, r2, r3, r4 and r5 are mutually distinct integers randomly generated in the range [1, N]; and F ∈ [0, 1] is the mutation factor;

2) Exploitation sub-population P₂

The exploitation sub-population P₂ adopts DE/best/2 as its mutation strategy to search near the best individual of the current population, and the mutation strategy is as follows:

"DE/best/2": V_{i,G} = X_{best,G} + F × (X_{r2,G} - X_{r3,G}) + F × (X_{r4,G} - X_{r5,G})

where X_{best,G} is the individual vector with the best fitness value in the G-th generation population;

3) Balance sub-population P₃

For the balance sub-population P₃, the population at the current G-th generation is divided into an excellent population and a common population according to the individual fitness values. In the mutation strategy of the improved differential evolution algorithm, X_{r1,G}, X_{r2,G} and X_{r4,G} are randomly selected from the excellent population, and X_{r3,G} and X_{r5,G} are randomly selected from the common population. The adaptive mutation strategy selection is as follows:

where the proportion of the excellent population size to the whole population size is set to 0.4, and Q_G is the probability used to select between the two mutation strategies in the above formula, calculated as follows:

Further, in step S34 the mutation stage is entered, and the corresponding mutation operation is executed according to the corresponding mutation strategy and the adaptive mutation factor F, where the adaptive mutation factor F is generated as follows:

F = F₀ · 2^λ

where F₀ is the initial value of the mutation factor, G is the current iteration number, and G_m is the maximum iteration number; F decreases from 2F₀ to F₀.

Further, the adaptive crossover probability Cr of each individual in S35 is generated as follows:

where Cr_max and Cr_min are the maximum and minimum values allowed for Cr; f_i is the fitness of the i-th individual, f_min is the fitness of the best individual, and f_max is the fitness of the worst individual; the average fitness of the population also enters the formula.

Advantageous effects

(1) Aiming at the problem of sewage quality prediction, the invention provides an improved differential evolution algorithm and designs an adaptive mutation strategy that helps P₃ expand its search range in the early stage of evolution and helps P₃ improve its convergence speed in the later stage. This promotes the balance of the sub-population P₃, and the prediction model is improved accordingly.

(2) The improved differential evolution algorithm provided by the invention uses adaptive strategies for F and Cr, which improve the optimization performance of ADE. The ADE algorithm is then used to obtain the optimal number of hidden layer neurons and the optimal learning rate, so that the prediction accuracy of the prediction model is improved.

(3) The LSTM based on the attention mechanism can assign weights according to the importance of the key characteristics of the input sewage data. This improves the ability of the LSTM neural network to learn the importance of the sewage data features, so that more accurate prediction can be achieved.

Drawings

FIG. 1 is a diagram of a multi-population evolution strategy according to the present invention;

FIG. 2 is a diagram of an attention module according to the present invention;

FIG. 3 is an algorithm flow chart of the ADE-ATT-LSTM of the present invention.

Detailed Description

Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.

The invention provides an LSTM sewage water quality prediction method based on an improved DE and attention mechanism, which comprises the following steps:

Step 1: An important characteristic data set of a sewage treatment plant is preprocessed, wherein the characteristic data set comprises historical data of pH, conductivity, water temperature, turbidity, potassium permanganate, total phosphorus, ammonia nitrogen, total nitrogen and dissolved oxygen at several moments preceding the current moment;

Outlier processing is performed on the data set, and the formula is as follows:

Z = (x - μ)/σ

where x is the value of the data point; μ is the average of the data set; and σ is the standard deviation of the data set. If the absolute value of Z for a data point exceeds a threshold (typically 2 or 3), the data point is marked as an outlier.

The missing value processing is carried out on the data set, and the formula is as follows:

x^# = (x_f + x_b)/2

where x^# is the filled value; x_f is the data at the previous time point; and x_b is the data at the next time point. Each missing value is filled with the average of the data at its adjacent time points, so that the filled value reflects the trend of the real data to a certain extent.

The data set is normalized, and the formula is as follows:

x' = (x - x_min)/(x_max - x_min)

where x' is the normalized value; x is the original value; and x_max and x_min are the maximum and minimum values in the data set, respectively. Normalization helps the prediction model capture the relations between different features and improves the sensitivity of the model to the data.
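The three preprocessing operations described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `remove_outliers`, `fill_missing` and `min_max` are hypothetical, each function handles a single feature column, and a gap is filled only when both of its immediate neighbours are present.

```python
import numpy as np

def remove_outliers(x, threshold=3.0):
    """Delete points whose |Z| = |(x - mean) / std| exceeds the threshold."""
    x = np.asarray(x, dtype=float)
    z = (x - np.nanmean(x)) / np.nanstd(x)
    # Comparisons with NaN are False, so missing values are kept for later filling.
    keep = ~(np.abs(z) > threshold)
    return x[keep]

def fill_missing(x):
    """Fill each isolated missing value with the mean of its two neighbours."""
    x = np.asarray(x, dtype=float).copy()
    for i in np.flatnonzero(np.isnan(x)):
        has_both = 0 < i < len(x) - 1
        if has_both and not np.isnan(x[i - 1]) and not np.isnan(x[i + 1]):
            x[i] = (x[i - 1] + x[i + 1]) / 2.0
    return x  # gaps at the edges or in runs are left untouched (assumption)

def min_max(x):
    """Map the values to [0, 1] using the column minimum and maximum."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = np.nanmin(x), np.nanmax(x)
    return (x - x_min) / (x_max - x_min)
```

A column would typically pass through the three functions in the order S11, S12, S13, for example `min_max(fill_missing(remove_outliers(column)))`.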

Step 2: constructing and initializing an ATT-LSTM water quality prediction model;

An ATT-LSTM water quality prediction model is constructed to obtain the initial framework of the ATT-LSTM hybrid model, which comprises two parts, ATT and LSTM. ATT can be understood simply as a weighted summation, in which weights are assigned according to the importance of the input features. The calculation formula is as follows:

where T is the total number of time steps; h_m is the output feature vector of the LSTM; α_m is the result of the first weighting, computed by the fully connected layer; W_l^{αh} and b_α are the weight matrix and the bias of the fully connected layer, respectively; β_m is the final weight, computed by the softmax activation function and assigned to the corresponding h_m; and the vector γ is the extracted important feature. The LSTM then predicts the feature-weighted water quality time series.
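The attention formula itself is not reproduced in this text, so the following NumPy sketch only illustrates the structure the symbol definitions describe: a fully connected scoring layer, a softmax over the T time steps, and a weighted sum. The `tanh` activation in the scoring layer, the scalar score per time step, and the function name `attention_pool` are assumptions rather than details taken from the patent.

```python
import numpy as np

def attention_pool(H, W, b):
    """Weight LSTM outputs by importance and sum them.

    H: array of shape (T, d), the LSTM outputs h_1..h_T.
    W: array of shape (d,), weights of the fully connected scoring layer.
    b: scalar bias of the scoring layer.
    """
    alpha = np.tanh(H @ W + b)           # first weighting, one score per time step
    e = np.exp(alpha - alpha.max())      # numerically stable softmax
    beta = e / e.sum()                   # final weights, sum to 1 over the T steps
    gamma = beta @ H                     # weighted sum of the h_m, shape (d,)
    return beta, gamma
```

With identical rows in `H` every time step receives the same weight 1/T, which is a quick sanity check on the softmax.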

Step 3: Based on the preprocessed data, ADE is used to optimize the hyperparameters of the ATT-LSTM water quality prediction model to obtain the ADE-ATT-LSTM water quality prediction model with optimal hyperparameters.

S31: the number of hidden layer neurons and the learning rate of the ATT-LSTM model are selected as optimization parameters of an ADE algorithm, the optimization range of the parameters to be optimized is determined, and the population P is initialized randomly.

Initialization of the population P: the first step is to generate an initial population of N individuals of dimension D in the search space, X_{i,G} = {x^1_{i,G}, …, x^D_{i,G}}, i = 1, …, N, G = 0, 1, …, G_max, where N represents the population size, D represents the dimension of the decision vector, and G_max represents the maximum number of iterations. In addition, the upper and lower search limits of each dimension of the decision variable are set according to the problem to be optimized, where the lower search limit is defined as X_min = {x^1_min, …, x^D_min} and the upper search limit is defined as X_max = {x^1_max, …, x^D_max}. So that the initial population covers the search space as fully as possible, the individuals are generally initialized uniformly at random within the given search limits as follows:

x^j_{i,0} = x^j_min + rand(0, 1) × (x^j_max - x^j_min), j = 1, …, D

where rand(0, 1) represents a random variable uniformly distributed in the range [0, 1].
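A uniform random initialization within per-dimension bounds can be sketched as below. The function name `init_population` and the example bounds (hidden units between 8 and 128, learning rate between 1e-4 and 1e-1) are hypothetical; the patent states only that the number of hidden layer neurons and the learning rate are the two optimized parameters.

```python
import numpy as np

def init_population(n, x_min, x_max, rng=None):
    """Draw n individuals uniformly between the lower and upper search limits."""
    rng = np.random.default_rng() if rng is None else rng
    x_min = np.asarray(x_min, dtype=float)
    x_max = np.asarray(x_max, dtype=float)
    # rng.random returns samples in [0, 1); one row per individual, one column per dimension.
    return x_min + rng.random((n, x_min.size)) * (x_max - x_min)

# Hypothetical search limits: [hidden units, learning rate]
population = init_population(20, [8, 1e-4], [128, 1e-1])
```

Because the hidden-unit dimension is continuous here, it would need to be rounded to an integer before a model is built from an individual.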

S32: calculating the fitness value of each individual in the ADE algorithm:

where p is the number of samples, y_p is the true value of the sample, and ŷ_p is the predicted value.

S33: The whole population P is randomly divided into an exploration sub-population P₁, an exploitation sub-population P₂ and a balance sub-population P₃, and a mutation strategy is adaptively selected for each. The mutation strategies are selected as follows:

Adaptive mutation strategy selection for the sub-populations:

1) Exploration sub-population P₁

The exploration sub-population P₁ adopts DE/rand/2 as its mutation strategy. This strategy perturbs the target individual with two random difference vectors, so it has stronger disturbance and better global search capability and can markedly improve the diversity of the population. The mutation strategy is as follows:

V_{i,G} = X_{r1,G} + F × (X_{r2,G} - X_{r3,G}) + F × (X_{r4,G} - X_{r5,G})

where V_{i,G} is the mutation vector generated by the mutation strategy for individual X_{i,G} in the current population; r1, r2, r3, r4 and r5 are mutually distinct integers randomly generated in the range [1, N]; and F ∈ [0, 1] is the mutation factor.

2) Exploitation sub-population P₂

The exploitation sub-population P₂ adopts DE/best/2 as its mutation strategy. This strategy searches near the best individual of the current population, has good local exploitation capability, and benefits the convergence of the whole population. The mutation strategy is as follows:

V_{i,G} = X_{best,G} + F × (X_{r2,G} - X_{r3,G}) + F × (X_{r4,G} - X_{r5,G})

where X_{best,G} is the individual vector with the best fitness value in the G-th generation population.
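The DE/rand/2 and DE/best/2 strategies used by P₁ and P₂ can be sketched as follows. The helper `distinct_indices` is hypothetical, and two details are assumptions rather than statements from the patent: the random indices also exclude the target index i (a common DE convention), and a lower fitness value is treated as better because the fitness is an error measure.

```python
import numpy as np

def distinct_indices(n, i, k, rng):
    """Pick k mutually distinct indices from 0..n-1, excluding i (assumption)."""
    candidates = np.delete(np.arange(n), i)
    return rng.choice(candidates, size=k, replace=False)

def de_rand_2(pop, i, F, rng):
    """DE/rand/2: random base vector plus two scaled difference vectors."""
    r1, r2, r3, r4, r5 = distinct_indices(len(pop), i, 5, rng)
    return pop[r1] + F * (pop[r2] - pop[r3]) + F * (pop[r4] - pop[r5])

def de_best_2(pop, fitness, i, F, rng):
    """DE/best/2: best individual plus two scaled difference vectors."""
    best = pop[np.argmin(fitness)]       # lower error is better (assumption)
    r2, r3, r4, r5 = distinct_indices(len(pop), i, 4, rng)
    return best + F * (pop[r2] - pop[r3]) + F * (pop[r4] - pop[r5])
```

Both functions need a sub-population of at least six individuals so that enough distinct indices remain after the target is excluded.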

3) Balance sub-population P₃

For the balance sub-population P₃, the population at the current G-th generation is divided into an excellent population and a common population according to the fitness values of the individuals. The cooperation of the excellent population and the common population improves the performance of the algorithm. The excellent population is intended to provide the most promising direction of evolution, thereby enhancing the local search capability of the algorithm. The common population is intended to adjust the search direction so as to increase the diversity of the population, strengthen the global search capability and avoid premature convergence. In the mutation strategy of the improved differential evolution algorithm, X_{r1,G}, X_{r2,G} and X_{r4,G} are randomly selected from the excellent population, and X_{r3,G} and X_{r5,G} are randomly selected from the common population. The adaptive mutation strategy is selected as follows:
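The strategy-selection formula and the probability Q_G are not reproduced in this text, so the sketch below covers only the part that is stated: splitting the sub-population into an excellent and a common group and drawing r1, r2, r4 from the former and r3, r5 from the latter. The function names are hypothetical, the 0.4 ratio comes from the disclosure, and treating lower fitness as better is an assumption based on the error-type fitness.

```python
import numpy as np

def split_elite_common(fitness, elite_ratio=0.4):
    """Return (excellent indices, common indices), ranked by ascending fitness."""
    order = np.argsort(fitness)          # lower error first (assumption)
    n_elite = max(1, int(round(elite_ratio * len(order))))
    return order[:n_elite], order[n_elite:]

def draw_balance_indices(fitness, rng, elite_ratio=0.4):
    """Draw r1, r2, r4 from the excellent group and r3, r5 from the common group."""
    elite, common = split_elite_common(fitness, elite_ratio)
    r1, r2, r4 = rng.choice(elite, size=3, replace=False)
    r3, r5 = rng.choice(common, size=2, replace=False)
    return r1, r2, r3, r4, r5
```

The draws require at least three excellent and two common individuals, which with a ratio of 0.4 means a sub-population of eight or more.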

S34: The mutation stage is entered, and the corresponding mutation operation is executed according to the corresponding mutation strategy and the adaptive mutation factor F. The adaptive mutation factor F is generated as follows:

F = F₀ · 2^λ

where F₀ is the initial value of the mutation factor, G is the current iteration number, and G_m is the maximum iteration number. F decreases from 2F₀ to F₀, so the global search capability of the algorithm is enhanced and the search range is enlarged in the early stage, while the convergence rate of the algorithm is improved in the later stage.
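The definition of the exponent λ is not reproduced in this text. A form often used in adaptive DE variants, λ = exp(1 - G_m / (G_m + 1 - G)), matches the stated behaviour (F starts at 2F₀ and decays towards F₀), so the sketch below adopts it as an assumption; the function name `adaptive_F` is likewise hypothetical.

```python
import math

def adaptive_F(F0, G, G_m):
    """Mutation factor F = F0 * 2**lam for iteration G in 1..G_m.

    lam = exp(1 - G_m / (G_m + 1 - G)) is an assumed schedule: it equals 1 at
    G = 1 (so F = 2 * F0) and approaches 0 at G = G_m (so F approaches F0).
    """
    lam = math.exp(1.0 - G_m / (G_m + 1.0 - G))
    return F0 * 2.0 ** lam

# Example schedule over 100 iterations with a hypothetical F0 of 0.5.
schedule = [adaptive_F(0.5, G, 100) for G in range(1, 101)]
```

Under this schedule F stays close to 2F₀ for most of the run and drops quickly near the end, which favours exploration early and convergence late.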

S35: The crossover stage is entered, and the corresponding crossover operation is executed according to the adaptive crossover probability Cr. The adaptive crossover probability Cr for each individual is generated as follows:
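The adaptive formula for Cr is not reproduced in this text, so the following sketch rests on assumptions: it uses a common fitness-based rule in which individuals worse than the population average receive a Cr scaled between Cr_min and Cr_max while the others keep Cr_min, and it pairs that rule with standard binomial crossover. The function names and the default limits 0.1 and 0.9 are hypothetical.

```python
import numpy as np

def adaptive_cr(f_i, f_mean, f_min, f_max, cr_min=0.1, cr_max=0.9):
    """Assumed rule: worse-than-average individuals get a larger crossover rate."""
    if f_i > f_mean and f_max > f_min:
        return cr_min + (cr_max - cr_min) * (f_i - f_min) / (f_max - f_min)
    return cr_min

def binomial_crossover(target, mutant, cr, rng):
    """Take each gene from the mutant with probability cr, else from the target."""
    d = target.size
    mask = rng.random(d) < cr
    mask[rng.integers(d)] = True         # guarantee at least one mutant gene
    return np.where(mask, mutant, target)
```

In a full loop the trial vector returned by `binomial_crossover` would replace the target only if its fitness is no worse, which is the usual DE selection step.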

S37: It is determined whether the algorithm satisfies the condition for terminating the iteration; if the iteration number reaches the maximum value, the optimal hyperparameters of the model are returned; otherwise, execution repeats from S33 until the algorithm satisfies the termination condition.

Step 4: Based on the preprocessed data, a training set and a testing set are constructed to train the ADE-ATT-LSTM water quality prediction model with optimal hyperparameters.

To verify that ADE-ATT-LSTM can be put into practical use, a validation data set is required, on which the model outputs predicted values. The predictive ability of the model is evaluated from the predicted values and the actual values of the validation samples.

Step 5: the collected data set comprising the historical data of pH, conductivity, water temperature, turbidity, potassium permanganate, total phosphorus, ammonia nitrogen, total nitrogen and dissolved oxygen at the previous moments is input into a trained ADE-ATT-LSTM water quality prediction model to obtain DO prediction results, wherein the specific process is as follows:

S51: acquiring a water quality data sample, and sequentially performing outlier processing, missing value processing and normalization;

S52: inputting the processed water quality data sample into the trained ADE-ATT-LSTM water quality prediction model;

S53: outputting the final water quality prediction result through the calculation of the neural network.

To objectively and fairly evaluate the predictive effect of the proposed ADE-ATT-LSTM model, the root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE) and coefficient of determination (R²) are selected as evaluation indexes. The relevant formulas are as follows:

RMSE = sqrt((1/N) × Σ (y_n - ŷ_n)²)

MAE = (1/N) × Σ |y_n - ŷ_n|

MAPE = (1/N) × Σ |(y_n - ŷ_n)/y_n| × 100%

R² = 1 - Σ (y_n - ŷ_n)² / Σ (y_n - ȳ)²

where the sums run over n = 1, …, N; N is the number of samples, y_n is the true value, ȳ is the average of the true values, and ŷ_n is the predicted value of the model.
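The four evaluation indexes named above can be computed with a few lines of NumPy. The function name `evaluate` is hypothetical, the standard textbook definitions are used, and MAPE is returned as a percentage; it is undefined when any true value is zero.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Return RMSE, MAE, MAPE (in percent) and R^2 for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)                          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100.0),
        "R2": float(1.0 - ss_res / ss_tot),
    }
```

Lower RMSE, MAE and MAPE and an R² closer to 1 indicate a better fit, which is how the models in the comparison are ranked.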

The performance index of all models is shown in table 1.

Table 1. Evaluation index results of each prediction model

The model of the present invention is compared with LSTM, ATT-LSTM, PSO-LSTM, DE-LSTM and DE-ATT-LSTM. As can be seen from Table 1, the R², RMSE, MAPE and MAE of the ADE-ATT-LSTM prediction algorithm of the present invention are the best among the compared algorithms. ADE-ATT-LSTM improves on the standard LSTM in all four criteria. Compared with DE-LSTM, after the attention mechanism and the adaptive DE are integrated, the prediction performance of ADE-ATT-LSTM is greatly improved, and the sewage quality prediction problem can be effectively solved.

The foregoing embodiments are merely illustrative of the technical concept and features of the present invention, and are intended to enable those skilled in the art to understand the present invention and to implement the same, not to limit the scope of the present invention. All equivalent changes or modifications made according to the spirit of the present invention should be included in the scope of the present invention.

Claims

1. An LSTM sewage water quality prediction method based on an improved DE and attention mechanism is characterized by comprising the following steps:

step 1: preprocessing an important characteristic data set of a sewage treatment plant, wherein the characteristic data set comprises pH, conductivity, water temperature, turbidity, potassium permanganate, total phosphorus, ammonia nitrogen, total nitrogen and dissolved oxygen DO historical data at the previous moments;

step 5: inputting the collected data set, comprising the historical data of pH, conductivity, water temperature, turbidity, potassium permanganate, total phosphorus, ammonia nitrogen, total nitrogen and dissolved oxygen at the previous moments, into the trained ADE-ATT-LSTM water quality prediction model to obtain a future DO value prediction result.

2. The LSTM sewage water quality prediction method based on an improved DE and attention mechanism according to claim 1, wherein the step of preprocessing the data set in step 1 comprises:

3. The LSTM sewage water quality prediction method based on an improved DE and attention mechanism according to claim 1, wherein the attention mechanism in step 2 assigns corresponding weights according to the importance of the input features, and the calculation formula is as follows:

where T is the total number of time steps; h_m is the output feature vector of the LSTM; α_m is the result of the first weighting, computed by the fully connected layer; W_l^{αh} and b_α are the weight matrix and the bias of the fully connected layer, respectively; β_m is the final weight, computed by the softmax activation function and assigned to the corresponding h_m; and the vector γ is the extracted important feature.

4. The LSTM sewage water quality prediction method based on an improved DE and attention mechanism according to claim 1, wherein optimizing the hyperparameters of the ATT-LSTM water quality prediction model using ADE based on the preprocessed data in step 3 comprises:

s32: calculating the fitness value of each individual in the ADE algorithm:

5. The LSTM sewage water quality prediction method based on an improved DE and attention mechanism according to claim 4, wherein the adaptive mutation strategy selection for the sub-populations in S33 is specifically as follows:

1) Exploration sub-population P₁

The exploration sub-population P₁ adopts DE/rand/2 as its mutation strategy; the target individual is perturbed by two random difference vectors, and the mutation strategy is as follows:

2) Exploitation sub-population P₂

3) Balance sub-population P₃

6. The LSTM sewage water quality prediction method based on an improved DE and attention mechanism according to claim 4, wherein in step S34 the corresponding mutation operation is executed according to the corresponding mutation strategy and the adaptive mutation factor F, and the adaptive mutation factor F is generated as follows:

F = F₀ · 2^λ

7. The LSTM sewage water quality prediction method based on an improved DE and attention mechanism according to claim 4, wherein the adaptive crossover probability Cr of each individual in S35 is generated as follows: