CN115689014A

CN115689014A - Water quality index prediction method based on bidirectional long short-term memory neural network and temporal attention mechanism

Info

Publication number: CN115689014A
Application number: CN202211344095.XA
Authority: CN
Inventors: 陈泽贤; 毕敬; 乔俊飞
Original assignee: Beijing University of Technology
Current assignee: Beijing University of Technology
Priority date: 2022-10-31
Filing date: 2022-10-31
Publication date: 2023-02-03

Abstract

The invention provides a water quality index prediction method, specifically one based on a Bidirectional Long Short-Term Memory neural network (BiLSTM) and a temporal attention mechanism (Temporal Attention). First, the acquired multi-feature water quality data are ordered chronologically. The prediction target sequence is then decomposed into k modes by Variational Mode Decomposition (VMD), and the resulting modes are combined with the remaining features to form new input data. Next, the input data are normalized, the water quality time series is divided into subsequences according to a preset sliding window size to serve as feature sequences, and the subsequences are converted into supervised data. The supervised data are fed into a neural network model based on BiLSTM and a temporal attention mechanism, which predicts the water quality index four hours ahead and yields a prediction result with high accuracy.

Description

Water quality index prediction method based on bidirectional long short-term memory neural network and temporal attention mechanism

Technical Field

The invention relates to a prediction method for water quality indexes, in particular to a water quality index prediction method based on a bidirectional long short-term memory neural network and a temporal attention mechanism.

Background

Water is a valuable resource that is closely tied to human production and daily life. Water quality indexes serve as concrete measures of the degree of water pollution. With the advent of the Internet of Things and big data, the large-scale deployment of water quality monitoring sensors in rivers and lakes has accumulated a large volume of high-frequency multivariate time series data on the water environment. An accurate, real-time water quality prediction method not only helps prevent sudden water pollution but also provides decision support for water quality monitoring and early warning. However, water environment indexes are influenced by many complex physical, chemical, and biological factors and are strongly nonlinear. Traditional models cannot sense slight changes in water quality well and cannot capture the nonlinear characteristics of large-scale water quality sequences. Moreover, because the water environment is complex, water quality index time series are noisy, which makes it difficult for traditional models to predict water environment indexes effectively under such conditions.

In recent years, as data volumes have grown, more and more data-driven models based on deep learning have been applied to water quality time series prediction. Early work used a BP (Back Propagation) neural network to predict water quality indexes. A BP neural network is easy to build and train and has some capacity to represent complex data sequences. In that scheme, the data are first normalized, the BP neural network is then pre-trained and optimized, and the trained network is finally used for prediction. However, a BP neural network retains little memory of earlier water quality index data, which limits how far prediction accuracy can be improved. Other conventional neural networks share this weakness: they cannot capture the temporal correlation in the data. By contrast, LSTM (Long Short-Term Memory) can capture long-term dependencies and effectively mitigates the vanishing gradient problem of the traditional recurrent neural network. Although LSTM is widely used in water quality index prediction, it encodes a sequence only from front to back and cannot capture information from back to front. In addition, LSTM does not distinguish among input features, so features that are irrelevant to the predicted index may degrade prediction accuracy. A suitable method is needed to solve these technical problems.

Disclosure of Invention

In view of the above disadvantages of the prior art, the present invention provides a water quality index prediction method based on Bidirectional Long Short-Term Memory (BiLSTM) and a temporal attention mechanism (Temporal Attention). The method comprises: a water quality time series decomposition scheme based on variational mode decomposition; and one-step prediction of the water quality index by a VABED model built on BiLSTM and the temporal attention mechanism. The purpose of the invention is achieved by the following technical scheme.

A water quality index prediction method based on a bidirectional long short-term memory neural network and a temporal attention mechanism comprises the following steps:

1) Acquiring time series data consisting of water quality indexes monitored in a river over a past period;

2) Performing VMD on the prediction target data to obtain k modal components, and combining the k modal components with the remaining data to form new input data;

3) Normalizing the time series data obtained in step 2), dividing it into a plurality of subsequences according to a preset sliding window size, converting the subsequences into supervised data, and splitting the result into a training set and a test set;

4) Inputting the feature sequence data obtained in step 3) into a neural network model based on bidirectional long short-term memory and a temporal attention mechanism, and outputting the predicted water quality index value four hours ahead;

5) Performing inverse normalization on the predicted value obtained in step 4) to obtain the actual predicted value of the future water quality index.

Drawings

FIG. 1 is a schematic diagram of the water quality prediction method based on bidirectional long short-term memory and a temporal attention mechanism;

FIG. 2 is a flow chart of VMD;

FIG. 3 is a schematic diagram of the network model based on bidirectional long short-term memory and a temporal attention mechanism.

Detailed Description

Features and exemplary embodiments of various aspects of the present invention will be described in detail below. The following description encompasses numerous specific details in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a clearer understanding of the present invention by illustrating examples thereof. The present invention is in no way limited to any specific configuration and algorithm set forth below, but rather covers any modifications, substitutions and alterations of the relevant elements, components and algorithms without departing from the spirit of the invention.

The specific steps of a water quality prediction method based on bidirectional long short-term memory and a temporal attention mechanism according to an embodiment of the present invention are described below with reference to FIG. 1:

The first step is to acquire time series data consisting of water quality indexes monitored in a river over a past period.

Because the actual monitoring frequency of an automatic water quality monitoring system is usually once every 4 hours, the water quality parameter data are screened in the data preprocessing stage and uniformly adjusted to equally spaced data at 4-hour intervals.

Missing data are filled in by interpolation.
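
The preprocessing described in this step can be sketched as follows. The text says only that missing data are filled "by interpolation"; linear interpolation is assumed here, and the function name `to_regular_grid` and the sample pH readings are hypothetical.

```python
import numpy as np

def to_regular_grid(hours, values, step=4.0):
    """Linearly interpolate gappy readings onto an equally spaced time grid.

    hours  : observation times in hours
    values : readings, with np.nan marking missing data
    step   : grid spacing in hours (4 h, per the monitoring frequency in the text)
    """
    hours = np.asarray(hours, dtype=float)
    values = np.asarray(values, dtype=float)
    valid = ~np.isnan(values)                       # keep only real observations
    grid = np.arange(hours[valid].min(), hours[valid].max() + step / 2, step)
    return grid, np.interp(grid, hours[valid], values[valid])

# Hypothetical pH readings every 4 hours with two missing samples.
hours = [0, 4, 8, 12, 16, 20]
ph = [7.1, 7.3, np.nan, 7.9, np.nan, 8.1]
grid, filled = to_regular_grid(hours, ph)
```

Each missing sample is replaced by the value on the straight line between its nearest valid neighbours; a different interpolation scheme could be substituted without changing the rest of the pipeline.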

The second step is to decompose the prediction target water quality time series with variational mode decomposition (VMD) to obtain k modal components, and to combine the k modal components with the remaining water quality index data.

Decomposing the original data with VMD enriches the features of the water quality data so that the neural network can learn more complex latent features. FIG. 2 is a flow chart of the VMD procedure. The principle of VMD is as follows:

Variational mode decomposition is an adaptive signal processing method. It iteratively searches for the optimal solution of a variational problem, continuously updating each mode function and its center frequency, and thereby obtains a number of intrinsic mode functions (IMFs). The variational problem can be stated as finding k IMFs such that the sum of the estimated bandwidths of the modes is minimized. The time series $l$ of the prediction target over the past period is decomposed into k modes:

$$l = \mathrm{IMF}_1 + \mathrm{IMF}_2 + \cdots + \mathrm{IMF}_k$$

VMD can reduce the nonlinearity and volatility of the time series and avoid the negative effects of mode mixing. Different modal components affect the prediction result differently. Separating the components and combining them with attention mechanisms enables the neural network model to adaptively select the important modes, filter out noise modes, and concentrate on the modes that carry important information.

The k modes obtained by VMD are merged with the remaining n feature time series $l_1, l_2, \ldots, l_n$ to obtain the new input data $X$:

$$X = \mathrm{concat}(\mathrm{IMF}_1, \mathrm{IMF}_2, \ldots, \mathrm{IMF}_k, l_1, l_2, \ldots, l_n)$$
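
The assembly of the input matrix can be illustrated with a minimal sketch. Note that the decomposition below is not VMD: to stay self-contained it uses a simple FFT band split as a stand-in, chosen only because its components also sum back to the original series. The names `band_split` and `build_input` and the synthetic data are hypothetical.

```python
import numpy as np

def band_split(signal, k):
    """Stand-in for VMD (NOT the patent's method): split the real FFT
    spectrum into k contiguous bands and transform each band back."""
    spectrum = np.fft.rfft(signal)
    edges = np.linspace(0, spectrum.size, k + 1).astype(int)
    modes = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band = np.zeros_like(spectrum)
        band[lo:hi] = spectrum[lo:hi]                # keep one frequency band
        modes.append(np.fft.irfft(band, n=signal.size))
    return np.stack(modes, axis=1)                   # shape (T, k)

def build_input(target, others, k):
    """Concatenate the k modes of the target with the n remaining features."""
    modes = band_split(target, k)
    return np.concatenate([modes, others], axis=1)   # shape (T, k + n)

rng = np.random.default_rng(0)
T = 64
target = np.sin(np.linspace(0, 8 * np.pi, T)) + 0.1 * rng.standard_normal(T)
others = rng.standard_normal((T, 2))                 # n = 2 other indexes
X = build_input(target, others, k=3)
```

In a real pipeline the call to `band_split` would be replaced by an actual VMD implementation returning an array of the same shape; the concatenation step is unchanged.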

The third step is to normalize the data and divide the feature sequence data with a sliding window.

The VMD-decomposed data are processed as follows to form the model input.

1) The data processed in the previous step are normalized according to the following formula:

$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x^{*}$ is the normalized value, $x$ is the data to be normalized, $x_{\min}$ is the minimum value in the data, and $x_{\max}$ is the maximum value in the data.

2) The width of the sliding window is set to the sum of the input sequence length and the prediction sequence length, and the sliding window is used to extract the input values and the values to be predicted.

3) The data extracted by the sliding window are separated into input values and target values and converted into supervised data.
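
The three sub-steps above can be sketched as follows, assuming min-max scaling per feature column. The function names, the `target_col` parameter, and the toy data are hypothetical; the text does not state which column of the input holds the prediction target.

```python
import numpy as np

def min_max_fit(data):
    """Per-column minimum and maximum, kept for later inverse normalization."""
    return data.min(axis=0), data.max(axis=0)

def min_max_scale(data, lo, hi):
    """Scale each column to [0, 1]; assumes no column is constant (hi > lo)."""
    return (data - lo) / (hi - lo)

def make_supervised(series, n_in, n_out, target_col=0):
    """Slide a window of width n_in + n_out over the series and split each
    window into an input block and the target values that follow it."""
    width = n_in + n_out
    inputs, targets = [], []
    for start in range(len(series) - width + 1):
        window = series[start:start + width]
        inputs.append(window[:n_in])                  # (n_in, n_features)
        targets.append(window[n_in:, target_col])     # (n_out,)
    return np.array(inputs), np.array(targets)

data = np.arange(20, dtype=float).reshape(10, 2)      # 10 time steps, 2 features
lo, hi = min_max_fit(data)
scaled = min_max_scale(data, lo, hi)
X, y = make_supervised(scaled, n_in=6, n_out=1)       # one step = 4 hours ahead
```

With 10 time steps and a window width of 7, the sketch yields 4 input/target pairs. In practice the minimum and maximum would be fitted on the training portion only, so that test data do not leak into the scaling.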

The fourth step is prediction with the neural network model based on bidirectional long short-term memory and a temporal attention mechanism.

The invention analyzes water environment data with a bidirectional long short-term memory network (BiLSTM) combined with attention mechanisms. After the processing in the previous step, the input sequence is denoted $x_0, x_1, \ldots, x_t, \ldots, x_T$. The encoder is built from a BiLSTM and an input attention mechanism, and the decoder is built from a BiLSTM and a temporal attention mechanism. The encoder processes input time series data of arbitrary length and extracts features from it, and the decoder then predicts the future time series data.

A simple RNN model is often limited in relating the final output to much earlier data, because the repeated multiplications across time steps make it very difficult to establish correlations between distant steps. LSTM is a model suited to establishing such long-range correlations. An LSTM unit has a long-term memory (the cell) and three gates (input, output, and forget gates); the gates modify the memory, as described by the following formulas:

$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$$

$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$$

$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$$

$$h_t = o_t \odot \tanh(c_t)$$

where $\odot$ denotes element-wise multiplication; the matrices $W_i$, $W_f$, $W_o$, and $W_c$ hold the parameters of the gates and the cell; and $\sigma(\cdot)$ and $\tanh(\cdot)$ are the sigmoid and tanh functions.

While extracting features, the LSTM's gating mechanism also mitigates the vanishing and exploding gradient problems that arise during propagation through a multilayer neural network.

However, LSTM can encode only from front to back and therefore ignores the back-to-front hidden information in time series data. A BiLSTM combines a forward LSTM and a backward LSTM. Using a BiLSTM as the encoder merges the information encoded from front to back with the information encoded from back to front, captures more hidden temporal information, and improves prediction accuracy.
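
The gate equations and the bidirectional combination can be illustrated with a small NumPy sketch. The cell-state update is not spelled out in the formulas above; the standard form $c_t = f_t \odot c_{t-1} + i_t \odot \tanh(W_c [h_{t-1}, x_t] + b_c)$ is assumed here. The sketch is a plain BiLSTM without the attention mechanisms, and all function names, sizes, and random weights are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_hid, rng):
    """Random weights and zero biases for gates f, i, o and cell candidate c."""
    W = {g: 0.1 * rng.standard_normal((n_hid, n_hid + n_in)) for g in "fioc"}
    b = {g: np.zeros(n_hid) for g in "fioc"}
    return W, b

def lstm_step(x_t, h_prev, c_prev, W, b):
    z = np.concatenate([h_prev, x_t])                 # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])                # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])                # input gate
    o_t = sigmoid(W["o"] @ z + b["o"])                # output gate
    # Assumed standard cell update (not given explicitly in the text).
    c_t = f_t * c_prev + i_t * np.tanh(W["c"] @ z + b["c"])
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

def run_lstm(X, W, b, n_hid):
    """Run one LSTM over the rows of X and collect every hidden state."""
    h, c = np.zeros(n_hid), np.zeros(n_hid)
    states = []
    for x_t in X:
        h, c = lstm_step(x_t, h, c, W, b)
        states.append(h)
    return np.array(states)                           # (T, n_hid)

def bilstm_encode(X, fwd, bwd, n_hid):
    """Concatenate front-to-back and back-to-front hidden states per step."""
    h_fwd = run_lstm(X, *fwd, n_hid)
    h_bwd = run_lstm(X[::-1], *bwd, n_hid)[::-1]      # re-align to time order
    return np.concatenate([h_fwd, h_bwd], axis=1)     # (T, 2 * n_hid)

rng = np.random.default_rng(0)
T, n_in, n_hid = 6, 5, 4
X = rng.standard_normal((T, n_in))
H = bilstm_encode(X, init_params(n_in, n_hid, rng),
                  init_params(n_in, n_hid, rng), n_hid)
```

Each row of `H` holds the forward state, which has seen steps up to t, next to the backward state, which has seen steps from t to the end. This is the extra back-to-front information the paragraph above refers to.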

To better capture important input features, the invention designs a bidirectional input attention mechanism suited to BiLSTM. This mechanism adaptively selects important features from a large number of input features and weakens the influence of noise. The attention weight represents the importance of a feature. The inputs of the forward and backward LSTMs after the attention mechanism at time t can be expressed by the following equations:

where $X$ is the input data after VMD; $h^F_{t-1}$ and $c^F_{t-1}$ are the hidden state and cell state of the forward LSTM encoder at time $t-1$; $h^B_{t+1}$ and $c^B_{t+1}$ are the hidden state and cell state of the backward LSTM encoder at time $t+1$; and the two attention-weighted vectors are the inputs of the forward and backward LSTM encoders at time $t$, respectively.
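
The attention equations themselves are not reproduced in this text. The sketch below therefore uses a commonly used additive scoring form as an assumption and should not be read as the patent's exact formulation: each feature series is scored against the neighbouring LSTM state, the scores are normalized with a softmax, and the input at time t is reweighted. The parameter names `W_e`, `U_e`, `v_e` and all shapes are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def input_attention(X, t, h_prev, c_prev, W_e, U_e, v_e):
    """Return the attention-weighted input at time t and the feature weights.

    X              : (T, n) input window, one column per feature
    h_prev, c_prev : (m,) hidden and cell state of the neighbouring step
                     (t-1 for the forward LSTM, t+1 for the backward LSTM)
    Assumed score for feature k: v_e . tanh(W_e [h; c] + U_e X[:, k])
    """
    state_term = W_e @ np.concatenate([h_prev, c_prev])          # (T,)
    series_term = U_e @ X                                        # (T, n)
    scores = v_e @ np.tanh(state_term[:, None] + series_term)    # (n,)
    alpha = softmax(scores)                                      # feature weights
    return alpha * X[t], alpha

rng = np.random.default_rng(0)
T, n, m = 6, 5, 4
X = rng.standard_normal((T, n))
W_e = rng.standard_normal((T, 2 * m))
U_e = rng.standard_normal((T, T))
v_e = rng.standard_normal(T)
x_tilde, alpha = input_attention(X, 2, np.zeros(m), np.zeros(m), W_e, U_e, v_e)
```

The same function serves both directions: the forward encoder would pass its states from time t-1 and the backward encoder its states from time t+1, each with its own parameters.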

The hidden state of the encoder at time t is calculated by:

where $L_F$ and $L_B$ denote the forward LSTM and the backward LSTM, respectively.

In the decoder, the invention uses a BiLSTM and designs a bidirectional temporal attention mechanism that adds attention weights to the hidden state vectors along the time dimension, so that the decoder can adaptively select important hidden states and ignore unimportant ones. The inputs of the forward and backward LSTMs after the attention mechanism at time t can be expressed as follows:

where $H = \mathrm{concat}(h_1, \ldots, h_t, \ldots, h_T)$ collects the hidden states at all time points; $d^F_{t-1}$ and $s^F_{t-1}$ are the hidden state and cell state of the forward LSTM decoder at time $t-1$; $d^B_{t+1}$ and $s^B_{t+1}$ are the hidden state and cell state of the backward LSTM decoder at time $t+1$; and the two attention-weighted vectors are the inputs of the forward and backward LSTM decoders at time $t$, respectively.
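
As with the input attention, the decoder equations are not reproduced in this text, so the following is a sketch under an assumed additive scoring form rather than the patent's exact formulation. It weights the hidden states in $H$ along the time dimension and returns their weighted sum. The parameter names `W_d`, `U_d`, `v_d` and all shapes are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def temporal_attention(H, d_prev, s_prev, W_d, U_d, v_d):
    """Weight the hidden states along the time axis.

    H              : (T, p) hidden states h_1 .. h_T
    d_prev, s_prev : (q,) decoder hidden and cell state of the neighbouring step
    Assumed score for step i: v_d . tanh(W_d [d; s] + U_d h_i)
    """
    state_term = W_d @ np.concatenate([d_prev, s_prev])   # (p,)
    scores = np.tanh(state_term + H @ U_d.T) @ v_d         # (T,)
    beta = softmax(scores)                                 # one weight per time step
    context = beta @ H                                     # (p,) weighted sum of states
    return context, beta

rng = np.random.default_rng(0)
T, p, q = 6, 8, 4
H = rng.standard_normal((T, p))
W_d = rng.standard_normal((p, 2 * q))
U_d = rng.standard_normal((p, p))
v_d = rng.standard_normal(p)
d, s = np.zeros(q), np.zeros(q)
context, beta = temporal_attention(H, d, s, W_d, U_d, v_d)

# With a zero scoring vector every step gets the same weight,
# so the context reduces to the plain average of the hidden states.
uniform_context, uniform_beta = temporal_attention(H, d, s, W_d, U_d, np.zeros(p))
```

The contrast with the input attention is the axis being weighted: there the weights range over features at one time step, here they range over time steps for the whole hidden state.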

After sufficient training, the BiLSTM decoder can extract complex temporal information, and from these effective features the final fully connected layer decodes a predicted value with reasonable accuracy.

where the matrix $W$ holds the weights of the fully connected layer and the layer's output is the predicted value.

The fifth step is to perform inverse normalization on the predicted value to obtain the actual predicted value of the water quality index.

The model generates predictions for the water quality test set. The predictions are inverse-normalized and compared with the true values using the root mean square error (RMSE). The hidden layer size of the BiLSTM in the water quality prediction model is then adjusted, the adjusted model is tested, and the best-performing set of parameters is finally selected. The water quality index prediction model can be applied to predicting indexes such as pH, dissolved oxygen (DO), ammonia nitrogen (NH3-N), and the permanganate index (CODMn) for different surface water rivers, giving accurate predictions of the relevant water quality data and supporting water quality early warning and water pollution control.
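
Inverse normalization and the RMSE comparison can be sketched as follows, assuming the min-max scaling of the third step. The scaling bounds, predictions, and true values are made-up illustration data, not results from the patent.

```python
import numpy as np

def inverse_min_max(scaled, lo, hi):
    """Undo min-max scaling: map values in [0, 1] back to the original range."""
    return scaled * (hi - lo) + lo

def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical pH example: bounds as fitted on the training data.
lo, hi = 6.5, 8.5
pred_scaled = np.array([0.25, 0.5, 0.75])       # model outputs in [0, 1]
pred = inverse_min_max(pred_scaled, lo, hi)     # back to pH units
truth = np.array([7.0, 7.4, 8.2])
score = rmse(truth, pred)
```

Computing the RMSE after inverse normalization reports the error in the index's own units, which makes scores for models with different hidden layer sizes directly comparable.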

Technical contribution of the invention

The goal of water quality index prediction is to predict accurately how water quality indexes in a river will change and to provide reliable data for water quality early warning and water pollution control. A water quality index series is essentially a nonlinear time series, but it is also affected by many factors such as the complex environment and weather and is highly unstable. This makes the series hard to model and complicates tasks such as water quality early warning. At present, most water quality index prediction models use RNNs and their variants. Although these methods obtain relatively good results, they cannot effectively distinguish the importance of features in the input data, so prediction accuracy is not high enough; they also lack bidirectional encoding, so some hidden information in the original sequence is lost. To solve these problems, this patent provides a water quality index prediction method based on bidirectional long short-term memory and a temporal attention mechanism, which overcomes them while maintaining prediction accuracy. Compared with the prior art, the main contributions of the invention are as follows:

(1) The invention adopts VMD, which enriches the dimensionality of the original sequence and lets the model better learn its complex characteristics.

(2) The invention uses an input attention mechanism combined with a BiLSTM as the encoder, so that the model can adaptively extract important input features and encode the input information in both directions to obtain more information.

(3) The invention uses a temporal attention mechanism combined with a BiLSTM as the decoder, so that the model can select important hidden states along the time dimension and decode them in both directions, thereby improving the final prediction accuracy.

The invention provides a water quality index prediction method based on bidirectional long short-term memory and a temporal attention mechanism. It should be understood that the above detailed description of the technical solution by way of preferred embodiments is illustrative and not restrictive. After reading this description, a person skilled in the art may modify the technical solutions described in the embodiments or substitute some of their technical features, but such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A water quality index prediction method based on a bidirectional long short-term memory neural network and a temporal attention mechanism, characterized by comprising the following steps:

1) Acquiring time series data consisting of water quality indexes monitored in a river over a past period;

2) Performing variational mode decomposition on the prediction target index to obtain k modal components, and concatenating the k modal components with the data obtained in step 1) to obtain new input data;

3) Normalizing the data processed in step 2), and dividing the data into a plurality of subsequences according to a preset sliding window width to serve as feature sequence data;

4) Inputting the feature sequence data into a neural network model based on a bidirectional long short-term memory neural network and an attention mechanism, outputting a predicted value for 4 hours ahead, and performing inverse normalization on the predicted value to obtain a predicted value of the flow in the future.

2. The method of claim 1, wherein training the flow prediction model based on historical water quality time series data comprises:

acquiring water quality time series data of a target area as historical data; performing variational mode decomposition on the historical data and merging the resulting modes with the original data; normalizing the merged data; dividing the normalized historical data into a training set and a test set according to a preset proportion; and training the water quality prediction model on the historical data of the training set to obtain the parameters of the water quality prediction model.

3. The method of claim 1, wherein predicting water quality based on a water quality prediction model comprises:

acquiring water quality time series data of the target area for the current time and a preset time period before the current time; performing variational mode decomposition on that water quality time series data to obtain 3 decomposition modes, and combining the 3 decomposition modes with that water quality time series data; normalizing the combined data and inputting the normalized data into the water quality prediction model; and performing inverse normalization on the output data of the water quality prediction model to obtain the water quality prediction data of the target area.

4. The method as claimed in claim 2 and claim 3, wherein constructing the water quality prediction model based on variational mode decomposition and the bidirectional long short-term memory and temporal attention mechanism network comprises:

decomposing the water quality index by variational mode decomposition, and combining the resulting modes with the original water quality time series data; and taking the decomposed and combined data as the input of the bidirectional long short-term memory and temporal attention mechanism network model to form the flow prediction model.

5. The method of claim 2, wherein testing and optimizing the water quality prediction model on the test set of the historical data comprises:

adjusting, according to the test results of the water quality prediction model, the hidden layer size of the bidirectional long short-term memory network, the time window size, and the like in the water quality prediction model, and testing the adjusted water quality prediction model, so as to optimize the parameters of the water quality prediction model.

6. The method of claim 3, wherein the subsequences are divided into feature sequences by a preset sliding window width.

The length of each subsequence is the width of the sliding window, which is the sum of the input sequence length and the prediction sequence length.

The data extracted by the sliding window are separated into input values and target values, the input sequence length and the prediction sequence length can each be set manually, and the sequence is converted into supervised data.

7. The method of claim 4, wherein prior to predicting water quality based on the water quality prediction model, further comprising:

changing the initial proportion, and dividing the normalized historical data into a training set and a test set according to the changed preset proportion; and training the water quality prediction model according to the historical data of the re-divided training set, and finely adjusting the water quality prediction model.

8. The method of claim 4, wherein predicting the flow water quality index based on the bidirectional long short-term memory and temporal attention mechanism model comprises:

adjusting the length of the input sequence to further optimize the accuracy of model prediction; the data sequence input to the prediction model may have any number of dimensions, so that multivariate prediction is realized; and the dimensions of the input sequence may be adjusted to further optimize the accuracy of model prediction.