CN115563487A

CN115563487A - Water quality monitoring method based on EMD and improved LSTM

Info

Publication number: CN115563487A
Application number: CN202211178099.5A
Authority: CN
Inventors: 徐智龙; 易辉; 俞鑫丽
Original assignee: Jiangsu Ankong Zhihui Technology Co ltd
Current assignee: Jiangsu Ankong Zhihui Technology Co ltd
Priority date: 2022-09-23
Filing date: 2022-09-23
Publication date: 2023-01-03

Abstract

The invention discloses a water quality monitoring method based on EMD and improved LSTM, which comprises the following steps: step 1, acquiring data to obtain water quality information, wherein the water quality information comprises social indexes, meteorological indexes, water quantity indexes and water quality indexes; step 2, processing the time series X(t) of the original monitoring data in the water quality indexes by means of an EMD algorithm; step 3, using the data processed in step 2 to train a model, and optimizing the parameters of the LSTM neural network by means of S-SSA (an improved sparrow search algorithm) so that the input data better match the LSTM neural network structure; and step 4, dividing the water quality information time series among a plurality of sub-LSTM networks and training the sub-LSTM networks. The superiority of the method is verified through comparison experiments. By combining EMD with an improved LSTM, the invention improves the precision of water quality detection.

Description

Water quality monitoring method based on EMD and improved LSTM

Technical Field

The invention relates to a water quality monitoring method based on EMD and improved LSTM, and belongs to the technical field of hydrology.

Background

Traditional water quality monitoring relies on manual information acquisition and laboratory determination and is affected by many uncertain factors. On the one hand, the monitoring data have large errors, which hinders water quality management and prevents a timely response to environmental emergencies; on the other hand, building automatic environmental monitoring stations and laying the corresponding network cables pollutes the environment and raises the monitoring cost. To reasonably control river basin risk, real-time water quality monitoring is urgently needed, and solving the water quality monitoring problem with the Internet of Things and big data technology is of great significance for environmental protection and people's livelihood. Detection methods based on the Internet of Things and big data technology can perform rapid, reagent-free water quality detection on multi-source data and are cleaner and more sustainable than traditional detection methods based on chemical reactions. However, their practical application is limited: the precision is unsatisfactory, the analysis results lag far behind the actual changes in water quality, the degree of automation is low, and comprehensive water quality data for a water area cannot be retrieved effectively.

Disclosure of Invention

In order to solve the problems, the invention adopts EMD combined with an improved LSTM method to improve the precision of water quality detection.

In order to solve the technical problems, the technical scheme adopted by the invention is as follows:

a water quality monitoring method based on EMD and improved LSTM comprises the following steps:

step 1, acquiring data to obtain water quality information, wherein the water quality information comprises social indexes, meteorological indexes, water quantity indexes and water quality indexes;

step 2, processing the time sequence X (t) of the original monitoring data in the water quality index by adopting an EMD algorithm;

step 3, using the data processed in step 2 to train a model, and optimizing the parameters of the LSTM neural network by means of S-SSA (an improved sparrow search algorithm) so that the input data better match the LSTM neural network structure;

step 4, the LSTM neural network divides the water quality information time sequence into a plurality of sub LSTM networks and trains the sub LSTM networks, the S-SSA extracts information from the time sequence through each time step in the iterative neuron sequence, and the output of the (i + 1) th time step is obtained after the input of the time sequence of the ith time step is iterated;

step 5, verifying the plurality of trained sub LSTM networks; comparing the water quality index detection value of the next time step of each sub-LSTM network with the water quality index prediction value of the next time step output by the sub-LSTM network, and judging whether the error is smaller than a set range;

step 6, accumulating the water quality index prediction values of the next time step output by the plurality of verified sub-LSTM networks to obtain a water quality index prediction result of the next time step.

In step 1, the social indexes include population and water supply volume; the meteorological indexes include relative humidity, air pressure and temperature; the water quantity indexes include flow, flow velocity and liquid level; the water quality indexes include pH, conductivity, chemical oxygen demand COD, biochemical oxygen demand BOD, dissolved oxygen DO, total phosphorus TP and total nitrogen TN.
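One possible way to hold a single sampling record with the 15 indexes of step 1 is sketched below. The class name, field names and the helper `target_series` are hypothetical and chosen only for illustration; the patent does not prescribe a data layout.

```python
from dataclasses import dataclass, fields


@dataclass
class WaterQualitySample:
    """One sampling record; all field names are illustrative assumptions."""
    # 2 social indexes
    population: float
    water_supply: float
    # 3 meteorological indexes
    relative_humidity: float
    air_pressure: float
    temperature: float
    # 3 water quantity indexes
    flow: float
    flow_velocity: float
    liquid_level: float
    # 7 water quality indexes
    ph: float
    conductivity: float
    cod: float
    bod: float
    do: float
    tp: float
    tn: float


# The five water quality indexes whose time series are decomposed in step 2.
DECOMPOSED_INDEXES = ("cod", "bod", "do", "tp", "tn")


def target_series(samples, name):
    """Extract the time series of one index from a list of samples."""
    return [getattr(sample, name) for sample in samples]
```

For example, `target_series(samples, "do")` would give the dissolved oxygen series that the text calls X_DO(t).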

In step 2, the time series X(t) of raw monitoring data includes the sub-datasets X_COD(t), X_BOD(t), X_DO(t), X_TP(t) and X_TN(t).

The step 2 specifically comprises the following steps:

step 2-1, obtaining all local maximum points of any sub-dataset in the time series X(t) of the original monitoring data, and forming an upper envelope X_MAX(t) from all the local maximum points;

step 2-2, obtaining all local minimum points of any sub-dataset in the time series X(t) of the original monitoring data, and forming a lower envelope X_MIN(t) from all the local minimum points;

step 2-3, calculating the mean of the upper envelope X_MAX(t) and the lower envelope X_MIN(t) to obtain the average envelope Arg:

Arg＝[X_MAX(t)+X_MIN(t)]/2

step 2-4, subtracting the average envelope Arg from the original monitoring data time series X(t) to obtain a new difference data sequence Hrg:

Hrg＝X(t)-Arg

step 2-5, checking whether the difference data sequence Hrg satisfies the conditions for an intrinsic mode function (IMF);

step 2-6, when the IMF conditions are not met, taking the difference data sequence Hrg as the original monitoring data time series X(t) and repeating steps 2-1 to 2-5 until the updated difference data sequence Hrg meets the IMF conditions; when the conditions are met, the difference data sequence Hrg becomes the first IMF C₁(t), and a first residual r₁(t) is generated according to r₁(t)＝X(t)-C₁(t) and replaces the original monitoring data time series X(t); steps 2-1 to 2-5 are then repeated to iteratively generate the remaining n IMFs;

the n IMFs include those of the X_COD(t), X_BOD(t), X_DO(t), X_TP(t) and X_TN(t) sub-datasets.

The conditions for an intrinsic mode function (IMF) include (a) and (b):

(a) The number of extreme points and the number of zero-crossing points must be equal or differ by at most one;

(b) At any time t, the mean value of the upper envelope line consisting of the local maximum value points and the lower envelope line consisting of the local minimum value points is zero;

the extreme points comprise local maximum points and local minimum points;

the zero crossing point is the point where the line connecting the adjacent local maximum point and local minimum point passes through the X axis.
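The sifting procedure of steps 2-1 to 2-6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: envelopes are built by linear interpolation with the two endpoints added as knots (EMD implementations commonly use cubic splines), condition (b) is relaxed to a tolerance `tol`, and the limits `max_imfs` and `max_sift` are hypothetical safeguards.

```python
import math


def local_extrema(x):
    """Indices of strict local maxima and minima."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through x at idx (endpoints added as knots)."""
    knots = sorted(set([0, len(x) - 1] + idx))
    env, k = [], 0
    for t in range(len(x)):
        while t > knots[k + 1]:
            k += 1
        a, b = knots[k], knots[k + 1]
        w = (t - a) / (b - a)
        env.append((1 - w) * x[a] + w * x[b])
    return env


def mean_envelope(x):
    """Average envelope Arg = (X_MAX(t) + X_MIN(t)) / 2."""
    maxima, minima = local_extrema(x)
    upper = envelope(x, maxima)
    lower = envelope(x, minima)
    return [(u + l) / 2 for u, l in zip(upper, lower)]


def is_imf(h, tol):
    """Conditions (a) and (b), with (b) relaxed to a relative tolerance."""
    maxima, minima = local_extrema(h)
    n_ext = len(maxima) + len(minima)
    # Sign changes between neighbours approximate the zero-crossing count.
    n_zero = sum(1 for a, b in zip(h, h[1:]) if a * b < 0)
    amplitude = max(abs(v) for v in h) or 1.0
    mean_ok = max(abs(m) for m in mean_envelope(h)) <= tol * amplitude
    return abs(n_ext - n_zero) <= 1 and mean_ok


def emd(x, max_imfs=6, max_sift=30, tol=0.05):
    """Return (list of IMFs, residual) such that x = sum(IMFs) + residual."""
    residual, imfs = list(x), []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) + len(minima) < 2:
            break  # too few extrema left to build envelopes
        h = residual
        for _ in range(max_sift):
            arg = mean_envelope(h)
            h = [v - m for v, m in zip(h, arg)]  # Hrg = X(t) - Arg
            if is_imf(h, tol):
                break
        imfs.append(h)  # kept even if max_sift was reached first
        residual = [r - c for r, c in zip(residual, h)]
    return imfs, residual
```

By construction, adding the returned IMFs and the residual reproduces the input series, which is a convenient sanity check for any decomposition of this kind.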

In step 3, the method for optimizing the parameters of the LSTM neural network by means of S-SSA comprises the following steps:

the formula of S-SSA is:

wherein the terms of the formula denote the following: the optimal position of the sparrows in the current population; the worst position of the sparrows in the current population; the positions of the i-th individual of the population in generation u and in generation u+1; β, a random number following the standard normal distribution; k, a uniform random number in [-1, 1]; ε, a very small number in the range [0.01, 0.1] that prevents the denominator from being zero; x_w, the fitness value of the sparrow at the worst position; x_i, the fitness value of a sparrow at any position; and x_b, the fitness value of the sparrow at the optimal position;

improvement on beta:

wherein U denotes the maximum number of iterations, U = 200, and u denotes the current number of iterations, i.e. the current generation;

Proof: let u₁ < u₂

∴ β₁ < β₂

β is therefore an increasing function of u: its value is small in the early stage, which strengthens local search, and large in the later stage, which strengthens global search, so improving β alleviates the tendency of the traditional SSA algorithm to fall into local optima;

improvement on k:

k is increased in the early period of iteration and reduced in the later period of iteration, and the convergence speed of the SSA algorithm can be improved by improving k.
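The symbols defined above (best and worst positions, β, k, ε and the fitness values x_i, x_b, x_w) match the danger-aware position update of the standard sparrow search algorithm. The sketch below assumes that S-SSA keeps that standard form; this is an assumption, since the S-SSA formula itself and the improved schedules for β and k are not reproduced in this text, so β and k are simply passed in as arguments. The function name and argument names are hypothetical.

```python
def danger_aware_update(pos, fit, best_pos, worst_pos,
                        fit_best, fit_worst, beta, k, eps=0.01):
    """Standard SSA danger-aware update (assumed form, fitness is minimized).

    pos, best_pos, worst_pos: lists of coordinates.
    fit, fit_best, fit_worst: fitness values x_i, x_b, x_w.
    beta, k: step-control values; eps prevents division by zero.
    """
    if fit > fit_best:
        # Sparrow away from the best position: move toward it.
        return [b + beta * abs(p - b) for p, b in zip(pos, best_pos)]
    if fit == fit_best:
        # Sparrow at the best fitness: step relative to the worst position.
        denom = (fit - fit_worst) + eps
        return [p + k * abs(p - w) / denom for p, w in zip(pos, worst_pos)]
    return list(pos)
```

For instance, with `pos=[2.0, 4.0]`, `best_pos=[1.0, 1.0]`, `fit=5.0`, `fit_best=1.0` and `beta=0.5`, the first branch yields `[1.5, 2.5]`.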

The step 3 specifically comprises the following steps:

step 3-1, taking the time window size, the batch size and the number of hidden layer units of the LSTM neural network as the optimization objects, determining the sparrow population size, the number of iterations U and the initial safety threshold, and initializing the SSA optimization algorithm;

step 3-2, determining the fitness value of each sparrow from the root mean square error between the values predicted by the LSTM neural network and the sample data;

step 3-3, updating the positions of the sparrows to obtain the fitness values of the sparrow population, and storing the optimal individual position and the global optimal position in the population;

step 3-4, judging whether a termination condition is met or the maximum number of iterations is reached; if so, exiting the loop and returning the optimal individual solution, i.e. the optimal parameters of the LSTM neural network structure; otherwise returning to step 3-3;

step 3-5, taking the optimal individual output by the S-SSA algorithm as the time window size, batch size and number of hidden layer units of the LSTM neural network.
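The link between a sparrow position and the LSTM hyperparameters (steps 3-1, 3-2 and 3-5) can be sketched as below. The names `decode_position`, `fitness`, `bounds` and `train_and_predict` are hypothetical, and `window_mean_model` is only a stand-in for LSTM training so that the example runs without a deep learning library.

```python
import math


def decode_position(position, bounds):
    """Round and clip a real-valued position to integer hyperparameters."""
    names = ("time_window", "batch_size", "hidden_units")
    params = {}
    for name, value in zip(names, position):
        low, high = bounds[name]
        params[name] = int(min(max(round(value), low), high))
    return params


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    total = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(total / len(actual))


def fitness(position, bounds, series, train_and_predict):
    """Fitness of one sparrow: RMSE of the model built from its position."""
    params = decode_position(position, bounds)
    actual, predicted = train_and_predict(series, params)
    return rmse(actual, predicted)


def window_mean_model(series, params):
    """Placeholder for LSTM training: predict the mean of the last window."""
    w = params["time_window"]
    actual = series[w:]
    predicted = [sum(series[i - w:i]) / w for i in range(w, len(series))]
    return actual, predicted
```

A search loop would call `fitness` once per sparrow per iteration and keep the position with the lowest value, which step 3-5 then decodes into the final network settings.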

In step 4, a plurality of sub-LSTM networks are trained; each sub-LSTM network is trained using the mean square error MSE:

MSE＝(1/N)·Σ_{i=1}^{N}(x_i−x̂_i)²

wherein N is the total number of data points, x_i is the monitored data value and x̂_i is the predicted value.

In step 5, the set range is: MSEs for COD, BOD, DO, TP and TN are set to be less than 150, 100, 10, 1 and 10 respectively.

If the error is smaller than the set range, the verification is passed; the coefficient of determination R² and the root mean square error RMSE are then used to evaluate the prediction performance of the EMD-LSTM model;

the statistical indicators R² and RMSE, which quantify model performance, are calculated as:

R²＝[Σ_{i=1}^{N}(x_i−x̄)(x̂_i−x̂_avg)]²/[Σ_{i=1}^{N}(x_i−x̄)²·Σ_{i=1}^{N}(x̂_i−x̂_avg)²]

RMSE＝√[(1/N)·Σ_{i=1}^{N}(x_i−x̂_i)²]

wherein x_i and x̂_i are the monitored and predicted data values, x̂_avg is the mean of the predicted values, and x̄ is the mean of the monitored data values.
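The three error measures used in steps 4 and 5 can be computed as in the sketch below. It assumes that R² is the squared correlation between monitored and predicted values, which is suggested by the text defining both the mean of the predicted values and the mean of the monitored values; other definitions of R² exist and would give different numbers. Function names are illustrative.

```python
import math


def mse(actual, predicted):
    """Mean square error over N paired values."""
    n = len(actual)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(mse(actual, predicted))


def r_squared(actual, predicted):
    """Squared correlation of monitored and predicted values (assumed form)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_a * var_p)
```

Note that `r_squared` is undefined when either series is constant, because one of the variance terms is then zero.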

The invention has the following beneficial effects:

(1) Accurately handling abnormal values and non-aligned data plays an important role in data preprocessing, and adopting EMD for preprocessing improves the accuracy of the prediction results.

(2) An LSTM neural network is adopted to address the inability to capture long-term dependence in water quality time series and the vanishing gradient problem.

(3) In order to further improve the accuracy of the water quality detection method, a sparrow search algorithm is introduced to optimize the LSTM, so that the precision of water quality detection is improved.

(4) The same raw water quality data set is analyzed with TO-LSTM, STFT-LSTM and EMD-LSTM, and the three methods are compared using the coefficient of determination (R²) and the root mean square error (RMSE) as evaluation indicators. The results reflect the strong nonlinear mapping capability and distinctive memory capability of the LSTM optimized by the improved sparrow search algorithm, and applying EMD to the indexes to be predicted reduces the difficulty of capturing data patterns.

Aiming at the problems of abnormal water quality detection data and low automation degree, the invention adopts a data preprocessing module centered on Empirical Mode Decomposition (EMD) to decompose the time series of parameter monitoring data in the water quality data, such as chemical oxygen demand (COD), biochemical oxygen demand (BOD), dissolved oxygen (DO), total phosphorus (TP) and total nitrogen (TN), thereby improving the precision of the modeling-based detection method. Aiming at the inability to capture long-term dependence in water quality time series and the vanishing gradient problem, a Long Short-Term Memory (LSTM) neural network is adopted, and a Sparrow Search Algorithm (SSA) is introduced to optimize the LSTM: the LSTM model parameters serve as the optimization targets of the SSA to complete the modeling, and the improved LSTM further increases the accuracy of the water quality detection method. The invention therefore combines EMD with the improved LSTM method to improve the precision of water quality detection.

The main innovations of the invention are as follows: 1) aiming at the problems of data abnormality and low automation degree in the water quality detection process, an EMD data preprocessing module is adopted to decompose the parameter time series, which improves the detection precision; 2) the LSTM is optimized by introducing an improved sparrow search algorithm, which further improves the water quality detection precision; 3) the same raw water quality data set is analyzed with two other methods for comparison, and the three methods are compared using the coefficient of determination (R²) and the root mean square error (RMSE) as evaluation indicators; the results reflect the strong nonlinear mapping capability and distinctive memory capability of the LSTM improved by the sparrow search algorithm, and applying EMD to the indexes to be predicted reduces the difficulty of capturing data patterns, thereby highlighting the superiority of the proposed method.

Drawings

FIG. 1 is a flow chart of a water quality monitoring method based on EMD and improved LSTM;

FIG. 2 is a flow chart of the sparrow search algorithm optimizing LSTM;

FIG. 3 is a graph comparing the coefficient of determination (R²) for the three methods TO-LSTM, STFT-LSTM and EMD-LSTM;

FIG. 4 is a graph comparing Root Mean Square Error (RMSE) for the three methods TO-LSTM, STFT-LSTM and EMD-LSTM.

Detailed Description

The present invention will be explained in further detail with reference to the drawings and embodiments. The specific embodiments described herein are merely illustrative of the invention and are not intended to be limiting.

Referring to fig. 1 to 4, the present embodiment provides a water quality monitoring method based on EMD and modified LSTM, including the following steps:

step 1, acquiring water quality related information by data acquisition, and sampling and recording 15 indexes in 4 categories: 2 social indexes (population and water supply volume), 3 meteorological indexes (relative humidity, air pressure and temperature), 3 water quantity indexes (flow, flow velocity and liquid level) and 7 water quality indexes (pH, conductivity, chemical oxygen demand (COD), biochemical oxygen demand (BOD), dissolved oxygen (DO), total phosphorus (TP) and total nitrogen (TN));

step 2, preprocessing the original COD, BOD, DO, TP and TN monitoring data in the water quality index data by means of an EMD algorithm; the time series X(t) of raw monitoring data comprises the sub-datasets X_COD(t), X_BOD(t), X_DO(t), X_TP(t) and X_TN(t);

specifically, the step 2 of decomposing the time series of the original COD, BOD, DO, TP, TN monitoring data comprises the following steps:

step 2-1, obtaining all local maximum points of any sub-dataset in the time series X(t) of the original monitoring data, and forming an upper envelope X_MAX(t) from all the local maximum points;

step 2-2, obtaining all local minimum points of any sub-dataset in the time series X(t) of the original monitoring data, and forming a lower envelope X_MIN(t) from all the local minimum points;

Hrg＝X(t)-Arg

step 2-5, checking whether the difference data sequence Hrg satisfies conditions (a) and (b) for an intrinsic mode function (IMF):

the extreme points comprise local maximum points and local minimum points;

a zero-crossing point is a point at which the line connecting adjacent extreme points crosses the X axis, that is, passes through zero; in other words, it is the point at which the line between an adjacent local maximum point and local minimum point crosses the X axis;

step 2-6, when the IMF conditions are not met, taking the difference data sequence Hrg as the original monitoring data time series X(t) and repeating steps 2-1 to 2-5 until the updated difference data sequence Hrg meets both conditions (a) and (b). When both conditions are met, the difference data sequence Hrg becomes the first IMF C₁(t), and a first residual r₁(t) is generated according to r₁(t)＝X(t)-C₁(t) and replaces the original monitoring data time series X(t). Steps 2-1 to 2-5 are then repeated to iteratively generate the remaining n IMFs;

the n IMFs include those of the X_COD(t), X_BOD(t), X_DO(t), X_TP(t) and X_TN(t) sub-datasets;

step 2-7, for example, the time series X_DO(t) of the original dissolved oxygen monitoring data is decomposed into the superposition of a series of IMFs and a residual r_n(t):

X_DO(t)＝Σ_{i=1}^{n}C_i(t)+r_n(t)

wherein r_n(t) denotes the residual and C_i(t) denotes the i-th IMF; the time series of the original dissolved oxygen monitoring data is the X_DO(t) sub-dataset.

Step 3, using the X (t) data processed by EMD in the step 2 to train a model, and optimizing parameters of the LSTM neural network by means of a sparrow search algorithm so as to enable input data to be better matched with the LSTM neural network structure;

specifically, in step 3, the parameters of the LSTM neural network are optimized by using S-SSA, which is expressed as:

wherein the terms of the formula denote the following: the optimal position of the sparrows in the current population; the worst position of the sparrows in the current population; the position of the i-th individual of the population in generation u+1; β, a random number following the standard normal distribution; k, a uniform random number in [-1, 1]; ε, a very small number in the range [0.01, 0.1] that prevents the denominator from being zero; x_w, the fitness value of the sparrow at the worst position; x_i, the fitness value of a sparrow at any position; and x_b, the fitness value of the sparrow at the optimal position;

improvement on beta:

wherein U denotes the maximum number of iterations, U = 200, and u denotes the current number of iterations.

Proof: let u₁ < u₂

∴ β₁ < β₂

Therefore, beta is an increasing function, the value of beta at the early stage is small, the local searching capability can be enhanced, the value of beta at the later stage is large, the global searching capability can be enhanced, and therefore the problem that the traditional SSA algorithm is easy to fall into local optimization can be solved through the improvement of beta.

Improvement on k:

according to the formula, k is increased in the early period of iteration, and is reduced in the later period of iteration, and the convergence speed of the algorithm can be improved by improving k.

Step 3-1, determining the size of a sparrow population, the number of iterations U and an initial security threshold value by taking the size of a time window, the batch processing size and the number of hidden layer units of an LSTM neural network as optimization objects, and initializing an SSA optimization algorithm; the sparrow population size is 30, the initial safety threshold ST =0.6, and the initialized SSA optimization algorithm is the S-SSA algorithm;

step 3-2, determining the fitness value of each sparrow from the root mean square error between the values predicted by the LSTM neural network and the sample data; the sample data are the X(t) data processed in step 2;

step 3-3, updating the sparrow positions to obtain the fitness values of the sparrow population, and storing the optimal individual position and the global optimal position in the population;

step 3-4, judging whether a termination condition is met or the maximum number of iterations is reached. If so, exiting the loop and returning the optimal individual solution, i.e. the optimal parameters of the LSTM neural network structure; otherwise returning to step 3-3;

step 3-5, taking the optimal individual output by the SSA algorithm as the time window size, batch size and number of hidden layer units of the LSTM neural network.

Step 4, the LSTM neural network divides the time sequence into a plurality of sub-networks, wherein the sub-networks comprise 2 social indicators, 3 meteorological indicators, 3 water quantity indicators and 7 water quality indicators, the structure enables a sparrow search algorithm to extract information from the time sequence through each time step in an iterative neuron sequence, and the output of corresponding time is obtained after the input of the time sequence at the ith time step is iterated;

specifically, step 4 trains a plurality of sub-LSTM networks; each sub-LSTM network is trained using the mean square error MSE:

MSE＝(1/N)·Σ_{i=1}^{N}(x_i−x̂_i)²

wherein N is the total number of data points, x_i is the monitored data value and x̂_i is the predicted value.
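Training a network to output the value at time step i + 1 from the preceding inputs requires turning each series into input/target pairs. The sketch below shows one common way to do this with a sliding window whose length corresponds to the time window size tuned in step 3; the function name `make_windows` is hypothetical and the patent does not specify this exact construction.

```python
def make_windows(series, window):
    """Split a series into (inputs, targets) for next-step prediction.

    Each input is `window` consecutive values; its target is the value
    that immediately follows them.
    """
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(list(series[i:i + window]))
        targets.append(series[i + window])
    return inputs, targets
```

With `series=[1, 2, 3, 4, 5]` and `window=2`, the inputs are `[[1, 2], [2, 3], [3, 4]]` and the targets are `[3, 4, 5]`.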

Step 5, verifying the plurality of trained sub LSTM networks; comparing the next COD, BOD, DO, TP and TN detection value per unit time of each sub LSTM network with the next COD, BOD, DO, TP and TN prediction value per unit time output by the sub LSTM network, and judging whether the mean square error MSE is smaller than a set range or not;

The set ranges for the MSE of COD, BOD, DO, TP and TN are less than 150, 100, 10, 1 and 10, respectively.

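If the error is smaller than the set range, the verification is passed. The coefficient of determination (R²) and the root mean square error (RMSE) are then used to evaluate the prediction performance of the EMD-LSTM model.

The statistical indicators quantifying model performance (R², RMSE) are calculated as:

R²＝[Σ_{i=1}^{N}(x_i−x̄)(x̂_i−x̂_avg)]²/[Σ_{i=1}^{N}(x_i−x̄)²·Σ_{i=1}^{N}(x̂_i−x̂_avg)²]

RMSE＝√[(1/N)·Σ_{i=1}^{N}(x_i−x̂_i)²]

wherein x_i and x̂_i are the monitored and predicted data values, x̂_avg is the mean of the predicted values, and x̄ is the mean of the monitored data values.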

Step 6, accumulating the predicted values of COD, BOD, DO, TP and TN in the next unit time output by the plurality of verified sub-LSTM networks to obtain the predicted results of COD, BOD, DO, TP and TN in the next unit time.
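Steps 5 and 6 can be illustrated with the short sketch below: each sub-LSTM network is accepted only if its MSE is below the limit set for the index, and the next-step predictions of the accepted component networks are summed. The limits come from the text; the function names and the idea of passing component predictions as a plain list are assumptions made for illustration.

```python
# MSE limits per index, taken from the set ranges stated in the text.
MSE_LIMITS = {"COD": 150.0, "BOD": 100.0, "DO": 10.0, "TP": 1.0, "TN": 10.0}


def mse(detected, predicted):
    """Mean square error between detected and predicted values."""
    n = len(detected)
    return sum((d - p) ** 2 for d, p in zip(detected, predicted)) / n


def passes_verification(index, detected, predicted):
    """Step 5: accept a sub-network if its MSE is below the set limit."""
    return mse(detected, predicted) < MSE_LIMITS[index]


def accumulate(component_predictions):
    """Step 6: sum the next-step predictions of all verified components."""
    return sum(component_predictions)
```

For example, component predictions of 0.3, -0.1 and 7.9 for dissolved oxygen accumulate to a next-step DO prediction of about 8.1.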

As shown in FIGS. 3 and 4, in a time series the value at any instant is correlated to varying degrees with the values near that instant. For the EMD-LSTM method, using non-aligned data raises the R² value by 0.28-0.57% and lowers the RMSE value by 5.72-15.01%, which shows that these data do help to improve prediction precision. The improvement in the prediction performance of STFT-LSTM is smaller than that of EMD-LSTM, which shows that EMD makes more effective use of non-aligned data. When non-aligned data are used, the overall prediction performance of the models ranks as EMD-LSTM > STFT-LSTM > TO-LSTM.

The invention discloses a water quality monitoring method based on EMD and improved LSTM, which comprises the following steps: acquiring water quality data and cleaning the data; decomposing the time series of the original chemical oxygen demand (COD), biochemical oxygen demand (BOD), dissolved oxygen (DO), total phosphorus (TP) and total nitrogen (TN) monitoring data in the water quality data by means of an EMD algorithm; using the processed data to train a model, and optimizing the parameters of the LSTM neural network by means of an improved sparrow search algorithm (S-SSA) so that the input data better match the network structure; training a plurality of sub-LSTM networks; verifying the trained sub-LSTM networks; obtaining, from the verified sub-LSTM networks, the predicted values of COD, BOD, DO, TP and TN in the next unit time for each component, and accumulating the predicted values of all components to obtain the predicted results of COD, BOD, DO, TP and TN in the next unit time; and verifying the superiority of the method through a comparison experiment. By combining EMD with an improved LSTM, the invention improves the precision of water quality detection.

In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.

Those skilled in the art will appreciate that the modules or units or groups of devices in the examples disclosed herein may be arranged in a device as described in this embodiment, or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may additionally be divided into multiple sub-modules.

Those skilled in the art will appreciate that the modules in the devices in an embodiment may be adaptively changed and arranged in one or more devices different from the embodiment. Modules or units or groups in embodiments may be combined into one module or unit or group and may furthermore be divided into sub-modules or sub-units or sub-groups. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.

Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.

Additionally, some of the embodiments are described herein as a method or combination of method elements that can be implemented by a processor of a computer system or by other means of performing the described functions. A processor with the necessary instructions for carrying out the method or the method elements thus forms a device for carrying out the method or the method elements. Further, the elements of the apparatus embodiments described herein are examples of the following apparatus: the apparatus is used to implement the functions performed by the elements for the purpose of carrying out the invention.

The various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.

In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Wherein the memory is configured to store program code; the processor is configured to perform the method of the invention according to instructions in said program code stored in the memory.

By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer-readable media includes both computer storage media and communication media. Computer storage media store information such as computer readable instructions, data structures, program modules or other data. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. Combinations of any of the above are also included within the scope of computer readable media.

As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.

While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention has been disclosed in an illustrative rather than a restrictive sense, and the scope of the present invention is defined by the appended claims.

Claims

1. A water quality monitoring method based on EMD and improved LSTM, characterized by comprising the following steps:

step 4, dividing the water quality information time series among a plurality of sub-LSTM networks by means of the LSTM neural network and training the sub-LSTM networks; the S-SSA extracts information from the time series by iterating over each time step in the neuron sequence, and, once the time series has been input, iterating the i-th time step yields the output of the (i+1)-th time step;

step 5, verifying the plurality of trained sub-LSTM networks: for each sub-LSTM network, comparing the measured value of the water quality index at the next time step with the predicted value of the water quality index at the next time step output by that sub-LSTM network, and judging whether the error is smaller than a set range;
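The one-step-ahead relationship in steps 4 and 5 (the input up to time step i yields the output for time step i+1, which is then compared with the measured value) can be illustrated by how training samples might be cut from a series. The following Python sketch is only an illustration under assumptions: the function name `make_one_step_samples`, the window length and the sample readings are hypothetical and do not come from the claims.

```python
from typing import List, Sequence, Tuple


def make_one_step_samples(
    series: Sequence[float], window: int
) -> List[Tuple[List[float], float]]:
    """Pair each run of `window` consecutive values with the value that follows it."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be between 1 and len(series) - 1")
    samples = []
    for start in range(len(series) - window):
        inputs = list(series[start:start + window])
        target = series[start + window]  # measured value at the next time step
        samples.append((inputs, target))
    return samples


# Hypothetical dissolved-oxygen readings, for illustration only.
do_readings = [6.1, 6.3, 6.0, 5.8, 6.2, 6.4]
samples = make_one_step_samples(do_readings, window=3)
for inputs, target in samples:
    print(inputs, "->", target)
```

Each pair holds the values a network would see and the measured next-step value against which its prediction would be checked in step 5.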

2. The method of claim 1, wherein in step 1, the social indexes include population size and water supply; the meteorological indexes include relative humidity, air pressure and temperature; the water quantity indexes include flow, flow velocity and liquid level; the water quality indexes include pH, conductivity, chemical oxygen demand COD, biochemical oxygen demand BOD, dissolved oxygen DO, total phosphorus TP and total nitrogen TN.

3. The method of claim 1, wherein in step 2, the raw monitoring data time series X(t) comprises the sub-datasets X_COD(t), X_BOD(t), X_DO(t), X_TP(t) and X_TN(t).

4. The method according to claim 3, characterized in that step 2 comprises in particular the steps of:

step 2-1, obtaining all local maximum points of any sub-dataset in the raw monitoring data time series X(t), and forming the upper envelope X_MAX(t) from all the local maximum points;

step 2-3, calculating the average of the upper envelope X_MAX(t) and the lower envelope X_MIN(t) to obtain the mean envelope Arg:

Arg = (X_MAX(t) + X_MIN(t)) / 2

Hrg = X(t) - Arg
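The envelope construction of claim 4 and the relation Hrg = X(t) - Arg can be sketched in Python as a single pass over one sub-dataset. This is a minimal illustration under stated assumptions, not the claimed implementation: the envelopes are joined by linear interpolation and held flat beyond the outermost extrema (a simplification; EMD implementations commonly use cubic splines), and the names `local_extrema`, `envelope` and `sift_once` are hypothetical.

```python
from typing import List, Sequence, Tuple


def local_extrema(x: Sequence[float]) -> Tuple[List[int], List[int]]:
    """Return the indices of strict interior local maxima and local minima."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x: Sequence[float], knots: List[int]) -> List[float]:
    """Piecewise-linear curve through x at the knot indices (assumed simplification)."""
    if not knots:
        raise ValueError("at least one extremum is needed to build an envelope")
    out = []
    for t in range(len(x)):
        if t <= knots[0]:
            out.append(float(x[knots[0]]))   # hold flat before the first extremum
        elif t >= knots[-1]:
            out.append(float(x[knots[-1]]))  # hold flat after the last extremum
        else:
            j = max(k for k in range(len(knots)) if knots[k] <= t)
            left, right = knots[j], knots[j + 1]
            w = (t - left) / (right - left)
            out.append((1 - w) * x[left] + w * x[right])
    return out


def sift_once(x: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return (Arg, Hrg): the mean envelope and the series minus the mean envelope."""
    maxima, minima = local_extrema(x)
    upper = envelope(x, maxima)  # X_MAX(t)
    lower = envelope(x, minima)  # X_MIN(t)
    arg = [(u + l) / 2 for u, l in zip(upper, lower)]
    hrg = [xi - a for xi, a in zip(x, arg)]
    return arg, hrg


# Hypothetical oscillating signal, for illustration only.
signal = [0, 2, 0, -2, 0, 2, 0, -2, 0]
arg, hrg = sift_once(signal)
```

For this symmetric test signal the mean envelope is zero everywhere, so Hrg equals the input; on real monitoring data the subtraction removes the slowly varying trend carried by the envelopes.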

5. The method of claim 4, wherein the conditions for the finite number of intrinsic mode functions (IMFs) include (a) and (b):

the extreme points comprise local maximum points and local minimum points;

6. The method of claim 1, wherein in step 3, the method for optimizing the parameters of the LSTM neural network by means of S-SSA comprises:

the formula of S-SSA is:

wherein,

representing the optimal position of a sparrow in the current population; β represents a random number drawn from the standard normal distribution;

representing the position of the i-th individual of generation u+1 in the population; k represents a uniform random number in [-1, 1]; ε represents a small constant in the range [0.01, 0.1] that prevents the denominator from being zero; x_w represents the fitness value of the sparrow at the worst position; x_i represents the fitness value of the sparrow at any position; x_b represents the fitness value of the sparrow at the optimal position;

representing the worst position of the sparrows in the current population;

improvement on β:

Proof: let u₁ < u₂

∴ β₁ < β₂

improvement on k:
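As a point of reference, the symbols defined in claim 6 (best and worst positions, β, k, ε and the three fitness values) correspond to those of the position update that the standard sparrow search algorithm applies to danger-aware sparrows. The Python sketch below shows that standard rule only, as an assumption about the baseline being improved; it does not implement the improved β and k of the S-SSA, and the function and parameter names are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def scout_update(
    position: Sequence[float],
    best_position: Sequence[float],
    worst_position: Sequence[float],
    fit_i: float,
    fit_best: float,
    fit_worst: float,
    eps: float = 0.05,  # small constant; claim 6 gives the range [0.01, 0.1]
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Standard SSA update for a danger-aware sparrow (assumed baseline, minimisation)."""
    rng = rng or random.Random()
    if fit_i > fit_best:
        # Sparrow is worse than the best one: move relative to the best position.
        beta = rng.gauss(0.0, 1.0)  # standard normal step-size factor
        return [b + beta * abs(p - b) for p, b in zip(position, best_position)]
    # Sparrow is at the best fitness: take a random step scaled by the
    # distance to the worst position.
    k = rng.uniform(-1.0, 1.0)
    denom = (fit_i - fit_worst) + eps
    return [p + k * abs(p - w) / denom for p, w in zip(position, worst_position)]


# Hypothetical two-dimensional example, for illustration only.
rng = random.Random(0)
moved = scout_update([3.0, 4.0], [1.0, 2.0], [9.0, 9.0],
                     fit_i=5.0, fit_best=1.0, fit_worst=20.0, rng=rng)
```

In the first branch β scales how far the sparrow lands from the best position; in the second, k sets both the direction and the size of the step.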

7. The method according to claim 6, characterized in that step 3 comprises in particular the steps of:

step 3-5, taking the optimal particle value output by the S-SSA algorithm as the time window size, batch size and number of hidden layer units of the LSTM neural network.
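Step 3-5 maps the optimizer's best position onto three integer hyperparameters. A minimal Python sketch of such a decoding is given below, assuming that each coordinate is rounded and clamped to a search range; the bounds in `BOUNDS` and the name `decode_position` are hypothetical, since the claim does not state ranges.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical search bounds (inclusive); not taken from the claims.
BOUNDS: Dict[str, Tuple[int, int]] = {
    "time_window": (2, 48),
    "batch_size": (8, 256),
    "hidden_units": (8, 256),
}


def decode_position(
    position: Sequence[float], bounds: Dict[str, Tuple[int, int]] = BOUNDS
) -> Dict[str, int]:
    """Round each coordinate of the best position and clamp it to its bounds."""
    if len(position) != len(bounds):
        raise ValueError("position must have one coordinate per hyperparameter")
    params = {}
    for value, (name, (low, high)) in zip(position, bounds.items()):
        params[name] = int(min(max(round(value), low), high))
    return params


# Hypothetical optimizer output, for illustration only.
best = [12.4, 63.7, 300.0]
print(decode_position(best))
```

Rounding and clamping keep the LSTM configuration valid even when the optimizer proposes fractional or out-of-range coordinates.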

8. The method of claim 1, wherein in step 4, a plurality of sub-LSTM networks are trained; the sub-LSTM network is trained using mean square error MSE,

wherein N is the total number of data, x is the monitoring data value,

is a predicted value.
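Using the symbols of claim 8, the conventional definition of the mean square error is the average of the squared differences between each monitoring data value and its predicted value over the N data points. A short Python sketch of that conventional definition follows; the function name and the sample values are hypothetical.

```python
from typing import Sequence


def mean_squared_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of squared differences between monitored and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)  # N, the total number of data points
    return sum((x - p) ** 2 for x, p in zip(observed, predicted)) / n


# Hypothetical values, for illustration only.
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```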

9. The method according to claim 2, wherein in step 5, the range is set as follows: MSEs for COD, BOD, DO, TP and TN are set to be less than 150, 100, 10, 1 and 10 respectively.
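The acceptance test of claim 9 amounts to a per-index threshold comparison. The Python sketch below uses the limits stated in the claim; the function name `verify` and the example MSE values are hypothetical.

```python
from typing import Dict

# MSE limits per water quality index, as stated in claim 9.
MSE_LIMITS: Dict[str, float] = {
    "COD": 150.0,
    "BOD": 100.0,
    "DO": 10.0,
    "TP": 1.0,
    "TN": 10.0,
}


def verify(mse_by_index: Dict[str, float]) -> Dict[str, bool]:
    """Return, for each index, whether its MSE is strictly below the set limit."""
    return {name: mse_by_index[name] < limit for name, limit in MSE_LIMITS.items()}


# Hypothetical validation errors, for illustration only.
results = verify({"COD": 120.5, "BOD": 101.0, "DO": 3.2, "TP": 0.4, "TN": 9.9})
all_passed = all(results.values())
```

In this example every index except BOD passes, so the corresponding sub-LSTM network for BOD would not pass verification.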

10. The method of claim 9, wherein if the error is less than the set range, the verification is passed; and the coefficient of determination R² and the root mean square error RMSE are used to evaluate the prediction performance of the EMD-LSTM model;

wherein,

is the mean of the predicted values,

is the mean of the monitored data values.
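Because the definitions above refer to the mean of the predicted values as well as the mean of the monitored values, one plausible reading is that R² is computed as the squared correlation between predictions and observations. The Python sketch below follows that reading as an assumption, alongside the conventional RMSE; the function names and sample values are hypothetical.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between monitored and predicted values."""
    n = len(observed)
    return math.sqrt(sum((x - p) ** 2 for x, p in zip(observed, predicted)) / n)


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Squared Pearson correlation (assumed form of R², using both means)."""
    n = len(observed)
    mean_x = sum(observed) / n   # mean of the monitored data values
    mean_p = sum(predicted) / n  # mean of the predicted values
    cov = sum((x - mean_x) * (p - mean_p) for x, p in zip(observed, predicted))
    var_x = sum((x - mean_x) ** 2 for x in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_x == 0 or var_p == 0:
        raise ValueError("R² is undefined for a constant series")
    return cov * cov / (var_x * var_p)


# Hypothetical values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]
score = r_squared(observed, predicted)
error = rmse(observed, predicted)
```

Under this form R² measures how well the predictions track the shape of the measurements, while RMSE captures the size of the error, which is one reason to report both.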