CN115563487A - Water quality monitoring method based on EMD and improved LSTM - Google Patents

Water quality monitoring method based on EMD and improved LSTM

Info

Publication number
CN115563487A
Authority
CN
China
Prior art keywords
lstm
water quality
value
data
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211178099.5A
Other languages
Chinese (zh)
Inventor
徐智龙
易辉
俞鑫丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Ankong Zhihui Technology Co ltd
Original Assignee
Jiangsu Ankong Zhihui Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Ankong Zhihui Technology Co ltd filed Critical Jiangsu Ankong Zhihui Technology Co ltd
Priority to CN202211178099.5A priority Critical patent/CN115563487A/en
Publication of CN115563487A publication Critical patent/CN115563487A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00: Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152: Water filtration

Abstract

The invention discloses a water quality monitoring method based on EMD and improved LSTM, comprising the following steps: step 1, acquiring data to obtain water quality information, which comprises social indexes, meteorological indexes, water quantity indexes and water quality indexes; step 2, processing the time series X(t) of the original monitoring data in the water quality indexes with the EMD algorithm; step 3, using the data processed in step 2 to train a model, and optimizing the parameters of the LSTM neural network by means of S-SSA (an improved sparrow search algorithm) so that the input data better matches the LSTM neural network structure; and step 4, dividing the water quality information time series among a plurality of sub-LSTM networks and training the sub-LSTM networks. The superiority of the method is verified through comparison experiments. By combining EMD with an improved LSTM method, the invention improves the precision of water quality detection.

Description

Water quality monitoring method based on EMD and improved LSTM
Technical Field
The invention relates to a water quality monitoring method based on EMD and improved LSTM, and belongs to the technical field of hydrology.
Background
Traditional water quality monitoring relies on manual information acquisition and laboratory determination and is affected by a large number of uncertain factors. On the one hand, the monitoring data contain large errors, which hinders water quality management and prevents a timely response to environmental emergencies. On the other hand, building automatic environmental monitoring stations and laying the corresponding network cables pollutes the environment and raises the monitoring cost. To control watershed risk reasonably, real-time water quality monitoring is urgently needed, and solving the water quality monitoring problem with the Internet of Things and big data technology is of great significance for environmental protection and people's livelihood. Detection methods based on the Internet of Things and big data technology can perform fast, reagent-free water quality detection on multi-source data and are cleaner and more sustainable than traditional detection methods based on chemical reactions. Their practical application is nevertheless limited: the precision is unsatisfactory, the analysis results lag far behind the actual changes in water quality, the degree of automation is low, and comprehensive water quality data for a water area cannot be retrieved effectively.
Disclosure of Invention
In order to solve the problems, the invention adopts EMD combined with an improved LSTM method to improve the precision of water quality detection.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
a water quality monitoring method based on EMD and improved LSTM comprises the following steps:
step 1, acquiring data to obtain water quality information, wherein the water quality information comprises social indexes, meteorological indexes, water quantity indexes and water quality indexes;
step 2, processing the time series X(t) of the original monitoring data in the water quality indexes with the EMD algorithm;
step 3, using the data processed in step 2 to train a model, and optimizing the parameters of the LSTM neural network by means of S-SSA (an improved sparrow search algorithm) so that the input data better matches the LSTM neural network structure;
step 4, the LSTM neural network divides the water quality information time series among a plurality of sub-LSTM networks and trains the sub-LSTM networks; the S-SSA extracts information from the time series at each time step in the iterated neuron sequence, and the output for the (i+1)-th time step is obtained after iterating over the time series input of the i-th time step;
step 5, verifying the plurality of trained sub-LSTM networks: comparing the detected water quality index value at the next time step with the predicted water quality index value at the next time step output by each sub-LSTM network, and judging whether the error is smaller than a set range;
step 6, accumulating the predicted water quality index values at the next time step output by the plurality of verified sub-LSTM networks to obtain the water quality index prediction result for the next time step.
In step 1, the social indexes comprise population and water supply volume; the meteorological indexes comprise relative humidity, air pressure and temperature; the water quantity indexes comprise flow, flow velocity and liquid level; the water quality indexes comprise pH, conductivity, chemical oxygen demand (COD), biochemical oxygen demand (BOD), dissolved oxygen (DO), total phosphorus (TP) and total nitrogen (TN).
In step 2, the time series X(t) of the raw monitoring data includes the sub-datasets $X_{COD}(t)$, $X_{BOD}(t)$, $X_{DO}(t)$, $X_{TP}(t)$ and $X_{TN}(t)$.
Step 2 specifically comprises the following steps:
step 2-1, obtaining all local maximum points of any sub-dataset in the time series X(t) of the original monitoring data, and forming the upper envelope $X_{MAX}(t)$ from all the local maximum points;
step 2-2, obtaining all local minimum points of the same sub-dataset in the time series X(t) of the original monitoring data, and forming the lower envelope $X_{MIN}(t)$ from all the local minimum points;
step 2-3, calculating the mean of the upper envelope $X_{MAX}(t)$ and the lower envelope $X_{MIN}(t)$ to obtain the average envelope Arg:
$$\mathrm{Arg} = \frac{X_{MAX}(t) + X_{MIN}(t)}{2}$$
step 2-4, subtracting the average envelope Arg from the original monitoring data time series X(t) to obtain a new difference data sequence Hrg:
$$\mathrm{Hrg} = X(t) - \mathrm{Arg}$$
step 2-5, checking whether the difference data sequence Hrg meets the conditions of an intrinsic mode function (IMF);
step 2-6, when the IMF conditions are not met, taking the difference data sequence Hrg as the original monitoring data time series X(t) and repeating steps 2-1 to 2-5 until the updated difference data sequence Hrg meets the IMF conditions; once the conditions are met, the difference data sequence Hrg becomes the first intrinsic mode function $C_1(t)$, and the first residual $r_1(t)$, generated according to $r_1(t) = X(t) - C_1(t)$, replaces the original monitoring data time series X(t); steps 2-1 to 2-5 are then repeated to iteratively generate the remaining n intrinsic mode functions (IMFs);
the n IMFs include those of the $X_{COD}(t)$, $X_{BOD}(t)$, $X_{DO}(t)$, $X_{TP}(t)$ and $X_{TN}(t)$ sub-datasets.
The conditions for an intrinsic mode function (IMF) are (a) and (b):
(a) The number of extreme points and the number of zero crossings must be equal or differ by at most one;
(b) At any time t, the mean of the upper envelope formed by the local maximum points and the lower envelope formed by the local minimum points is zero;
the extreme points comprise the local maximum points and the local minimum points;
a zero crossing is a point where the line connecting an adjacent local maximum point and local minimum point crosses the horizontal axis, that is, passes through zero.
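Condition (a) can be checked by counting directly on a sampled sequence. The following is a minimal sketch, not the patented implementation: the function names are hypothetical, extrema are taken as strict interior peaks and troughs, and a zero crossing is taken as a sign change between consecutive samples. Condition (b) is not checked here because it needs the envelopes.

```python
def count_extrema(values):
    """Count strict interior local maxima and local minima."""
    count = 0
    for i in range(1, len(values) - 1):
        is_max = values[i] > values[i - 1] and values[i] > values[i + 1]
        is_min = values[i] < values[i - 1] and values[i] < values[i + 1]
        if is_max or is_min:
            count += 1
    return count


def count_zero_crossings(values):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)


def satisfies_condition_a(values):
    """Extrema and zero crossings must be equal in number or differ by one."""
    return abs(count_extrema(values) - count_zero_crossings(values)) <= 1
```

A sequence that oscillates around zero passes the check, while the same oscillation shifted away from zero fails because it has extrema but no zero crossings.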
In step 3, the method for optimizing the parameters of the LSTM neural network by means of S-SSA comprises the following steps:
the formula of S-SSA is:
Figure BDA0003861169190000031
wherein,
Figure BDA0003861169190000032
representing the best position of the sparrows in the current population; β represents a random number that follows the standard normal distribution;
Figure BDA0003861169190000033
representing the position of the ith individual in the u generation in the population;
Figure BDA0003861169190000034
representing the position of the i-th individual of generation u+1 in the population; k represents a uniform random number in [-1, 1]; ε represents a very small number in the range [0.01, 0.1] that prevents the denominator from being zero; $x_w$ represents the fitness value of the sparrow at the worst position; $x_i$ represents the fitness value of a sparrow at any position; $x_b$ represents the fitness value of the sparrow at the optimal position;
Figure BDA0003861169190000035
representing the worst position of the sparrows in the current population;
improvement on beta:
Figure BDA0003861169190000036
wherein U represents the maximum number of iterations, with U = 200, and u represents the current number of iterations, that is, generation u;
Proof: let $u_1 < u_2$
Figure BDA0003861169190000037
Figure BDA0003861169190000038
Figure BDA0003861169190000039
Figure BDA00038611691900000310
∴ $\beta_1 < \beta_2$
β is therefore an increasing function of u: its value is small in the early stage, which strengthens the local search capability, and large in the later stage, which strengthens the global search capability; improving β in this way addresses the tendency of the traditional SSA algorithm to fall into local optima;
improvement on k:
Figure BDA00038611691900000311
k increases in the early period of the iteration and decreases in the later period; improving k in this way raises the convergence speed of the SSA algorithm.
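The update formula itself is an image in the source and is not reproduced in the extracted text. As a hedged illustration only, the sketch below uses the danger-aware position update of the standard sparrow search algorithm, whose symbols match the definitions above. Treating it as the exact S-SSA formula is an assumption, as are the function name, the list-based positions and the convention that a lower fitness value is better. The improved β and k schedules are not known, so both values are passed in by the caller.

```python
def danger_aware_update(position, best_position, worst_position,
                        fitness, best_fitness, worst_fitness,
                        beta, k, epsilon=0.01):
    """Standard SSA danger-aware update (assumed form, not taken from the source).

    A sparrow that is worse than the best individual moves toward the best
    position with step factor beta; a sparrow already at the best fitness
    moves relative to the worst position with step factor k.
    """
    if fitness > best_fitness:
        return [b + beta * abs(x - b)
                for x, b in zip(position, best_position)]
    # epsilon keeps the denominator away from zero when the fitness values coincide
    denominator = (fitness - worst_fitness) + epsilon
    return [x + k * abs(x - w) / denominator
            for x, w in zip(position, worst_position)]
```

For example, a sparrow at `[2.0, 4.0]` with fitness 5 and a best individual at `[1.0, 1.0]` with fitness 1 moves to `[1.5, 2.5]` when `beta = 0.5`.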
Step 3 specifically comprises the following steps:
step 3-1, taking the time window size, the batch size and the number of hidden layer units of the LSTM neural network as the optimization objects, determining the sparrow population size, the number of iterations U and the initial safety threshold, and initializing the SSA optimization algorithm;
step 3-2, determining the fitness value of each sparrow from the root mean square error between the values predicted by the LSTM neural network and the sample data;
step 3-3, updating the positions of the sparrows to obtain the fitness values of the sparrow population, and storing the optimal individual position and the global optimal position value in the population;
step 3-4, judging whether a termination condition is met or whether the maximum number of update iterations has been reached; if so, exiting the loop and returning the optimal individual solution, which determines the optimal parameters of the LSTM neural network structure; otherwise, continuing to loop over step 3-3;
step 3-5, taking the optimal values output by the S-SSA algorithm as the time window size, the batch size and the number of hidden layer units of the LSTM neural network.
Step 4 trains a plurality of sub-LSTM networks; each sub-LSTM network is trained using the mean square error (MSE):
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^2$$
wherein N is the total number of data points, $x_i$ is the monitored data value and $\hat{x}_i$ is the predicted value.
In step 5, the set range is: the MSEs of COD, BOD, DO, TP and TN are required to be less than 150, 100, 10, 1 and 10 respectively.
If the error is smaller than the set range, the verification is passed; the coefficient of determination $R^2$ and the root mean square error (RMSE) are then used to evaluate the prediction performance of the EMD-LSTM model;
the statistical indicators $R^2$ and RMSE, which quantify model performance, are calculated as:
Figure BDA0003861169190000043
Figure BDA0003861169190000044
wherein,
Figure BDA0003861169190000051
is the mean of the predicted values, and
Figure BDA0003861169190000052
is the mean of the monitored data values.
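The two formula images are not reproduced in the extracted text. The sketch below therefore rests on assumptions: RMSE takes its standard form, and $R^2$ is computed as the squared Pearson correlation between monitored and predicted values, which is one common definition that uses both means named above. The source may define $R^2$ differently, and the function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between monitored and predicted values."""
    squared = sum((x - p) ** 2 for x, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r_squared(observed, predicted):
    """Squared Pearson correlation (assumed definition of R^2)."""
    mean_obs = sum(observed) / len(observed)
    mean_pred = sum(predicted) / len(predicted)
    covariance = sum((x - mean_obs) * (p - mean_pred)
                     for x, p in zip(observed, predicted))
    spread_obs = sum((x - mean_obs) ** 2 for x in observed)
    spread_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return covariance ** 2 / (spread_obs * spread_pred)
```

Under this definition a prediction that is perfectly linearly related to the monitored values scores $R^2 = 1$ even if it is biased, which is why RMSE is reported alongside it.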
The invention has the following beneficial effects:
(1) In data preprocessing, handling abnormal values and non-aligned data accurately plays an important role, and adopting EMD improves the accuracy of the prediction result.
(2) An LSTM neural network is adopted to address the problems that long-term dependence cannot be captured in water quality time series and that gradients vanish.
(3) To further improve the accuracy of the water quality detection method, a sparrow search algorithm is introduced to optimize the LSTM, which improves the precision of water quality detection.
(4) The same raw water quality dataset is analyzed with TO-LSTM, STFT-LSTM and EMD-LSTM, and the three methods are compared using the coefficient of determination ($R^2$) and the root mean square error (RMSE) as the two evaluation indicators. $R^2$ reflects the strong nonlinear mapping capability and the unique memory capability of the LSTM improved by the sparrow search algorithm, and applying EMD to the indexes to be predicted reduces the difficulty of capturing data patterns.
To address abnormal water quality detection data and the low degree of automation, a data preprocessing module centered on Empirical Mode Decomposition (EMD) decomposes the monitoring data time series of parameters in the water quality data such as chemical oxygen demand (COD), biochemical oxygen demand (BOD), dissolved oxygen (DO), total phosphorus (TP) and total nitrogen (TN), which improves the precision of the modeling-based detection method. To address the problems that long-term dependence cannot be captured in water quality time series and that gradients vanish, a Long Short-Term Memory (LSTM) neural network is adopted, and a Sparrow Search Algorithm (SSA) is introduced to optimize the LSTM: the LSTM model parameters serve as the optimization targets of the SSA to complete the modeling, and the improved LSTM further raises the accuracy of the water quality detection method. The invention therefore combines EMD with the improved LSTM method to improve the precision of water quality detection.
The main innovations of the invention are as follows: 1) to address data abnormality and the low degree of automation in the water quality detection process, an EMD data preprocessing module decomposes the parameter time series, which improves the detection precision; 2) the LSTM is optimized by introducing an improved sparrow search algorithm, which further improves the water quality detection precision; 3) two other methods are used to analyze the same raw water quality dataset for comparison, with the coefficient of determination ($R^2$) and the root mean square error (RMSE) as the two evaluation indicators. $R^2$ reflects the strong nonlinear mapping capability and the unique memory capability of the LSTM improved by the sparrow search algorithm, and applying EMD to the indexes to be predicted reduces the difficulty of capturing data patterns, which highlights the superiority of the proposed method.
Drawings
FIG. 1 is a flow chart of a water quality monitoring method based on EMD and improved LSTM;
FIG. 2 is a flow chart of the sparrow search algorithm optimizing LSTM;
FIG. 3 is a graph comparing the coefficient of determination ($R^2$) of the three methods TO-LSTM, STFT-LSTM and EMD-LSTM;
FIG. 4 is a graph comparing Root Mean Square Error (RMSE) for the three methods TO-LSTM, STFT-LSTM and EMD-LSTM.
Detailed Description
The present invention will be explained in further detail with reference to the drawings and embodiments. The specific embodiments described herein are merely illustrative of the invention and are not intended to be limiting.
Referring to FIGS. 1 to 4, the present embodiment provides a water quality monitoring method based on EMD and improved LSTM, including the following steps:
step 1, acquiring water quality information through data acquisition, and sampling and recording 15 indexes in 4 categories: 2 social indexes (population and water supply volume), 3 meteorological indexes (relative humidity, air pressure and temperature), 3 water quantity indexes (flow, flow velocity and liquid level) and 7 water quality indexes (pH, conductivity, chemical oxygen demand (COD), biochemical oxygen demand (BOD), dissolved oxygen (DO), total phosphorus (TP) and total nitrogen (TN));
step 2, preprocessing the original COD, BOD, DO, TP and TN monitoring data in the water quality index data with the EMD algorithm; the time series X(t) of the raw monitoring data comprises the sub-datasets $X_{COD}(t)$, $X_{BOD}(t)$, $X_{DO}(t)$, $X_{TP}(t)$ and $X_{TN}(t)$;
specifically, the step 2 of decomposing the time series of the original COD, BOD, DO, TP, TN monitoring data comprises the following steps:
step 2-1, obtaining all local maximum points of any sub-dataset in the time series X(t) of the original monitoring data, and forming the upper envelope $X_{MAX}(t)$ from all the local maximum points;
step 2-2, obtaining all local minimum points of the same sub-dataset in the time series X(t) of the original monitoring data, and forming the lower envelope $X_{MIN}(t)$ from all the local minimum points;
step 2-3, calculating the mean of the upper envelope $X_{MAX}(t)$ and the lower envelope $X_{MIN}(t)$ to obtain the average envelope Arg:
$$\mathrm{Arg} = \frac{X_{MAX}(t) + X_{MIN}(t)}{2}$$
step 2-4, subtracting the average envelope Arg from the original monitoring data time series X(t) to obtain a new difference data sequence Hrg:
$$\mathrm{Hrg} = X(t) - \mathrm{Arg}$$
step 2-5, checking whether the difference data sequence Hrg satisfies conditions (a) and (b) of an intrinsic mode function (IMF):
(a) The number of extreme points and the number of zero crossings must be equal or differ by at most one;
(b) At any time t, the mean of the upper envelope formed by the local maximum points and the lower envelope formed by the local minimum points is zero;
the extreme points comprise the local maximum points and the local minimum points;
a zero crossing is a point where the line connecting an adjacent local maximum point and local minimum point crosses the horizontal axis, that is, passes through zero;
step 2-6, when the IMF conditions are not met, taking the difference data sequence Hrg as the original monitoring data time series X(t) and repeating steps 2-1 to 2-5 until the updated difference data sequence Hrg meets both conditions (a) and (b). Once both conditions are met, the difference data sequence Hrg becomes the first intrinsic mode function $C_1(t)$, and the first residual $r_1(t)$, generated according to $r_1(t) = X(t) - C_1(t)$, replaces the original monitoring data time series X(t). Steps 2-1 to 2-5 are then repeated to iteratively generate the remaining n intrinsic mode functions (IMFs);
the n IMFs include those of the $X_{COD}(t)$, $X_{BOD}(t)$, $X_{DO}(t)$, $X_{TP}(t)$ and $X_{TN}(t)$ sub-datasets;
step 2-7, for example, the original dissolved oxygen monitoring data time series $X_{DO}(t)$ is decomposed into the superposition of a series of IMFs and a residual $r_n(t)$:
$$X_{DO}(t) = \sum_{i=1}^{n} C_i(t) + r_n(t)$$
wherein $r_n(t)$ denotes the residual and $C_i(t)$ denotes the i-th intrinsic mode function; the original dissolved oxygen monitoring data time series is the $X_{DO}(t)$ sub-dataset.
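The decomposition in steps 2-1 to 2-7 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the envelopes are piecewise-linear (the source does not name an interpolation method, and cubic splines are the more common choice), each IMF is sifted a fixed number of times instead of testing conditions (a) and (b), and the names `local_extrema`, `envelope` and `emd` are hypothetical.

```python
def local_extrema(x):
    """Return index lists of strict interior local maxima and minima."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, knots):
    """Piecewise-linear envelope through x at the knots, held flat at both ends."""
    out, k = [], 0
    for t in range(len(x)):
        if t <= knots[0]:
            out.append(x[knots[0]])
        elif t >= knots[-1]:
            out.append(x[knots[-1]])
        else:
            while knots[k + 1] < t:
                k += 1
            a, b = knots[k], knots[k + 1]
            weight = (t - a) / (b - a)
            out.append(x[a] + weight * (x[b] - x[a]))
    return out


def emd(x, max_imfs=8, sift_iterations=10):
    """Split x into IMFs and a residual (simplified sifting, see assumptions)."""
    imfs, residual = [], list(x)
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residual)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema left to build envelopes
        h = list(residual)
        for _ in range(sift_iterations):
            maxima, minima = local_extrema(h)
            if not maxima or not minima:
                break
            upper = envelope(h, maxima)   # step 2-1
            lower = envelope(h, minima)   # step 2-2
            # steps 2-3 and 2-4: subtract the average envelope
            h = [v - (u + l) / 2 for v, u, l in zip(h, upper, lower)]
        imfs.append(h)
        residual = [r - c for r, c in zip(residual, h)]  # r = X - C
    return imfs, residual
```

By construction the IMFs and the residual add back up to the input, which mirrors the superposition formula in step 2-7.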
Step 3, using the X(t) data processed by EMD in step 2 to train a model, and optimizing the parameters of the LSTM neural network by means of a sparrow search algorithm so that the input data better matches the LSTM neural network structure;
specifically, in step 3, the parameters of the LSTM neural network are optimized using S-SSA, whose formula is:
Figure BDA0003861169190000072
wherein,
Figure BDA0003861169190000073
representing the optimal position of sparrows in the current population; beta represents a random number which conforms to the standard normal distribution;
Figure BDA0003861169190000074
representing the position of the ith individual in the u generation in the population;
Figure BDA0003861169190000075
representing the position of the i-th individual of generation u+1 in the population; k represents a uniform random number in [-1, 1]; ε represents a very small number in the range [0.01, 0.1] that prevents the denominator from being zero; $x_w$ represents the fitness value of the sparrow at the worst position; $x_i$ represents the fitness value of a sparrow at any position; $x_b$ represents the fitness value of the sparrow at the optimal position;
Figure BDA0003861169190000076
representing the worst position of the sparrows in the current population;
improvement on beta:
Figure BDA0003861169190000081
where U represents the maximum number of iterations, with U = 200, and u represents the current number of iterations.
Proof: let $u_1 < u_2$
Figure BDA0003861169190000082
Figure BDA0003861169190000083
Figure BDA0003861169190000084
Figure BDA0003861169190000085
∴ $\beta_1 < \beta_2$
β is therefore an increasing function of u: its value is small in the early stage, which strengthens the local search capability, and large in the later stage, which strengthens the global search capability. Improving β in this way addresses the tendency of the traditional SSA algorithm to fall into local optima.
Improvement on k:
Figure BDA0003861169190000086
according to the formula, k is increased in the early period of iteration, and is reduced in the later period of iteration, and the convergence speed of the algorithm can be improved by improving k.
Step 3-1, determining the size of a sparrow population, the number of iterations U and an initial security threshold value by taking the size of a time window, the batch processing size and the number of hidden layer units of an LSTM neural network as optimization objects, and initializing an SSA optimization algorithm; the sparrow population size is 30, the initial safety threshold ST =0.6, and the initialized SSA optimization algorithm is the S-SSA algorithm;
step 3-2, determining the fitness value of each sparrow from the root mean square error between the values predicted by the LSTM neural network and the sample data; the sample data are the X(t) data processed in step 2;
step 3-3, updating the sparrow positions to obtain the fitness values of the sparrow population, and storing the optimal individual position and the global optimal position value in the population;
step 3-4, judging whether a termination condition is met or whether the maximum number of update iterations has been reached. If so, exiting the loop and returning the optimal individual solution, which determines the optimal parameters of the LSTM neural network structure; otherwise, continuing to loop over step 3-3;
step 3-5, taking the optimal values output by the SSA algorithm as the time window size, the batch size and the number of hidden layer units of the LSTM neural network.
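Steps 3-1, 3-2 and 3-5 imply a mapping from a sparrow's position to the three LSTM hyperparameters. The sketch below illustrates one possible mapping and the selection of the best individual; it does not run the full S-SSA iteration. The search ranges in `BOUNDS`, the normalization of positions to [0, 1] and the function names are assumptions, and `toy_fitness` is only a stand-in for training an LSTM and returning its RMSE. The population size of 30 comes from step 3-1.

```python
import random

# Hypothetical search ranges; the source does not state them.
BOUNDS = {
    "window_size": (4, 48),
    "batch_size": (16, 128),
    "hidden_units": (8, 256),
}


def decode(position):
    """Map a position in [0, 1]^3 to integer LSTM hyperparameters."""
    params = {}
    for value, (name, (low, high)) in zip(position, BOUNDS.items()):
        clipped = min(max(value, 0.0), 1.0)  # keep the sparrow inside the range
        params[name] = low + round(clipped * (high - low))
    return params


def best_individual(population, fitness):
    """Return the position with the lowest fitness (lower RMSE is better)."""
    return min(population, key=lambda position: fitness(decode(position)))


def toy_fitness(params):
    # Placeholder for "train the LSTM with params and return the RMSE".
    return abs(params["window_size"] - 24) + abs(params["hidden_units"] - 64) / 8


rng = random.Random(0)
population = [[rng.random() for _ in range(3)] for _ in range(30)]
best_params = decode(best_individual(population, toy_fitness))
```

Rounding to integers after clipping keeps every candidate valid for the network, so the position update can work on continuous values.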
Step 4, the LSTM neural network divides the time series among a plurality of sub-networks, covering the 2 social indexes, 3 meteorological indexes, 3 water quantity indexes and 7 water quality indexes; this structure lets the sparrow search algorithm extract information from the time series at each time step in the iterated neuron sequence, and the output for the corresponding time is obtained after iterating over the time series input at the i-th time step;
specifically, step 4 trains a plurality of sub-LSTM networks; each sub-LSTM network is trained using the mean square error (MSE):
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^2$$
wherein N is the total number of data points, $x_i$ is the monitored data value and $\hat{x}_i$ is the predicted value.
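The time window size and the MSE loss can be illustrated on a plain list. This sketch assumes that each training sample consists of `window_size` consecutive values followed by the next value as the target, which is a common arrangement for one-step-ahead prediction but is not spelled out in the source; the function names are hypothetical.

```python
def make_windows(series, window_size):
    """Build (input window, next value) pairs for one-step-ahead prediction."""
    inputs, targets = [], []
    for start in range(len(series) - window_size):
        inputs.append(series[start:start + window_size])
        targets.append(series[start + window_size])
    return inputs, targets


def mse(observed, predicted):
    """Mean square error over N paired values."""
    return sum((x - p) ** 2 for x, p in zip(observed, predicted)) / len(observed)
```

With `series = [1, 2, 3, 4, 5]` and `window_size = 3`, the inputs are `[1, 2, 3]` and `[2, 3, 4]` and the targets are `4` and `5`.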
Step 5, verifying the plurality of trained sub-LSTM networks: comparing the detected COD, BOD, DO, TP and TN values in the next unit time with the predicted COD, BOD, DO, TP and TN values in the next unit time output by each sub-LSTM network, and judging whether the mean square error (MSE) is smaller than a set range;
the set range requires the MSEs of COD, BOD, DO, TP and TN to be less than 150, 100, 10, 1 and 10 respectively.
If the error is smaller than the set range, the verification is passed. The coefficient of determination ($R^2$) and the root mean square error (RMSE) are then used to evaluate the prediction performance of the EMD-LSTM model.
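The verification rule can be written as a lookup against the stated limits. The limits are taken from the text; the function names and the sample MSE values in the usage note are hypothetical.

```python
# MSE limits stated for step 5, per water quality index.
MSE_LIMITS = {"COD": 150, "BOD": 100, "DO": 10, "TP": 1, "TN": 10}


def passes_verification(indicator, mse_value):
    """A sub-LSTM network passes when its MSE is below the limit for its index."""
    return mse_value < MSE_LIMITS[indicator]


def verified_indicators(mse_by_indicator):
    """Return the indexes whose sub-LSTM networks pass verification."""
    return [name for name, value in mse_by_indicator.items()
            if passes_verification(name, value)]
```

For instance, `verified_indicators({"COD": 120.0, "TP": 1.5})` keeps only `"COD"`, because an MSE of 1.5 exceeds the TP limit of 1.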
The statistical indicators ($R^2$ and RMSE) that quantify model performance are calculated as:
Figure BDA0003861169190000093
Figure BDA0003861169190000094
wherein,
Figure BDA0003861169190000095
is the mean of the predicted values, and
Figure BDA0003861169190000096
is the mean of the monitored data values.
Step 6, accumulating the predicted COD, BOD, DO, TP and TN values in the next unit time output by the plurality of verified sub-LSTM networks to obtain the COD, BOD, DO, TP and TN prediction results for the next unit time.
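Step 6 amounts to summing, for each water quality index, the next-step predictions of its components. The sketch below assumes one prediction per decomposed component (each IMF plus the residual); the function name and the numbers in the usage note are hypothetical.

```python
def accumulate_predictions(component_predictions):
    """Sum the component predictions of each index into one next-step value."""
    return {indicator: sum(values)
            for indicator, values in component_predictions.items()}
```

For example, `accumulate_predictions({"DO": [0.5, -0.25, 7.75]})` returns `{"DO": 8.0}`.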
As shown in FIGS. 3 and 4, in a time series the value at any moment is correlated to varying degrees with the values near that moment. For the EMD-LSTM method, using non-aligned data raises the $R^2$ value by 0.28-0.57% and lowers the RMSE value by 5.72-15.01%, which shows that such data genuinely help to improve the prediction precision. The improvement in the prediction performance of STFT-LSTM is lower than that of EMD-LSTM, which shows that EMD makes more effective use of non-aligned data. When non-aligned data are used, the overall prediction performance of the models ranks EMD-LSTM > STFT-LSTM > TO-LSTM.
The invention discloses a water quality monitoring method based on EMD and improved LSTM, which comprises the following steps: acquiring water quality data and cleaning the data; decomposing the time series of the original chemical oxygen demand (COD), biochemical oxygen demand (BOD), dissolved oxygen (DO), total phosphorus (TP) and total nitrogen (TN) monitoring data in the water quality data with the EMD algorithm; using the processed data to train a model, and optimizing the parameters of the LSTM neural network by means of an improved sparrow search algorithm (S-SSA) so that the input data better matches the network structure; training a plurality of sub-LSTM networks; verifying the trained sub-LSTM networks; and using the verified sub-LSTM networks to obtain the predicted COD, BOD, DO, TP and TN values in the next unit time for each component, then accumulating the predicted values of all components to obtain the COD, BOD, DO, TP and TN prediction results for the next unit time. The superiority of the method is verified through a comparison experiment. By combining EMD with an improved LSTM method, the invention improves the precision of water quality detection.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that: that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or groups of devices in the examples disclosed herein may be arranged in a device as described in this embodiment, or alternatively may be located in one or more devices different from the devices in this example. The modules in the foregoing examples may be combined into one module or may additionally be divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the devices in an embodiment may be adaptively changed and arranged in one or more devices different from the embodiment. Modules or units or groups in embodiments may be combined into one module or unit or group and may furthermore be divided into sub-modules or sub-units or sub-groups. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments, rather than other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
Additionally, some of the embodiments are described herein as a method or combination of method elements that can be implemented by a processor of a computer system or by other means of performing the described functions. A processor with the necessary instructions for carrying out the method or the method elements thus forms a device for carrying out the method or the method elements. Further, an element of an apparatus embodiment described herein is an example of a means for carrying out the function performed by that element for the purpose of carrying out the invention.
The various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The memory is configured to store program code, and the processor is configured to perform the method of the invention according to the instructions in the program code stored in the memory.
By way of example, and not limitation, computer-readable media include computer storage media and communication media. Computer storage media store information such as computer-readable instructions, data structures, program modules or other data. Communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. Combinations of any of the above are also included within the scope of computer-readable media.
As used herein, unless otherwise specified, the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object merely indicates that different instances of like objects are being referred to, and is not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention has been disclosed in an illustrative rather than a restrictive sense, and the scope of the present invention is defined by the appended claims.

Claims (10)

1. A water quality monitoring method based on EMD and improved LSTM is characterized by comprising the following steps:
step 1, acquiring data to obtain water quality information, wherein the water quality information comprises social indexes, meteorological indexes, water quantity indexes and water quality indexes;
step 2, processing the original monitoring data time series X(t) in the water quality indexes using the empirical mode decomposition (EMD) algorithm;
step 3, training a model with the data processed in step 2, and optimizing the parameters of the LSTM neural network by means of S-SSA (an improved sparrow search algorithm), so that the input data better match the LSTM neural network structure;
step 4, dividing the LSTM neural network into a plurality of sub-LSTM networks according to the water quality information time series and training the sub-LSTM networks, wherein the S-SSA extracts information from the time series by iterating over each time step of the neuron sequence, and, after the time series is input, iterating the i-th time step yields the output of the (i+1)-th time step;
step 5, verifying the plurality of trained sub LSTM networks; comparing the water quality index detection value of the next time step of each sub LSTM network with the water quality index prediction value of the next time step output by the sub LSTM network, and judging whether the error is smaller than a set range;
step 6, summing the water quality index prediction values of the next time step output by the plurality of verified sub-LSTM networks to obtain the water quality index prediction result of the next time step.
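Steps 5 and 6 of claim 1 can be illustrated with a minimal sketch. It assumes the series has already been decomposed into components that sum to the original signal. The function `persistence_forecast` is a hypothetical stand-in for a trained sub-LSTM network, and the single absolute-error `tolerance` is a simplification of the per-indicator limits given in claim 9. None of these names come from the patent.

```python
from typing import Callable, List, Sequence

Forecaster = Callable[[Sequence[float]], float]


def persistence_forecast(history: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained sub-LSTM: repeat the last value."""
    return history[-1]


def validate(forecaster: Forecaster, series: Sequence[float], tolerance: float) -> bool:
    """Step 5: hold out the last observation and check the one-step error."""
    predicted = forecaster(series[:-1])
    return abs(predicted - series[-1]) < tolerance


def predict_next(
    components: List[Sequence[float]],
    forecasters: List[Forecaster],
    tolerance: float,
) -> float:
    """Step 6: sum the next-step forecasts of all validated sub-models."""
    total = 0.0
    for series, forecaster in zip(components, forecasters):
        if not validate(forecaster, series, tolerance):
            raise ValueError("a sub-model failed validation; retrain before use")
        total += forecaster(series)
    return total
```

Because the components add up to the original series, summing their one-step forecasts gives a forecast for the original indicator. In practice each forecaster would be a separately trained sub-network rather than the persistence rule used here.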
2. The method of claim 1, wherein in step 1, the social indexes include population size and water supply; the meteorological indexes include relative humidity, air pressure and temperature; the water quantity indexes include flow, flow velocity and liquid level; and the water quality indexes include pH, conductivity, chemical oxygen demand (COD), biochemical oxygen demand (BOD), dissolved oxygen (DO), total phosphorus (TP) and total nitrogen (TN).
3. The method of claim 1, wherein in step 2, the original monitoring data time series X(t) comprises the sub-datasets $X_{COD}(t)$, $X_{BOD}(t)$, $X_{DO}(t)$, $X_{TP}(t)$ and $X_{TN}(t)$.
4. The method according to claim 3, characterized in that step 2 comprises in particular the steps of:
step 2-1, obtaining all local maximum points of any sub-dataset in the original monitoring data time series X(t), and forming an upper envelope $X_{MAX}(t)$ from all the local maximum points;
step 2-2, obtaining all local minimum points of any sub-dataset in the original monitoring data time series X(t), and forming a lower envelope $X_{MIN}(t)$ from all the local minimum points;
step 2-3, calculating the average of the upper envelope $X_{MAX}(t)$ and the lower envelope $X_{MIN}(t)$ to obtain the average envelope Arg:
$$Arg = \frac{X_{MAX}(t) + X_{MIN}(t)}{2}$$
step 2-4, subtracting the average envelope Arg from the original monitoring data time series X(t) to obtain a new difference data sequence Hrg:
$$Hrg = X(t) - Arg$$
step 2-5, checking whether the difference data sequence Hrg satisfies the conditions for an intrinsic mode function (IMF);
step 2-6, when the IMF conditions are not met, taking the difference data sequence Hrg as the original monitoring data time series X(t) and repeating steps 2-1 to 2-5 until the updated difference data sequence Hrg meets the IMF conditions; when the conditions are met, the difference data sequence Hrg becomes the first intrinsic mode function (IMF) $C_1(t)$, and a first residual $r_1(t) = X(t) - C_1(t)$ is generated to replace the original monitoring data time series X(t); steps 2-1 to 2-5 are then repeated to iteratively generate the remaining n intrinsic mode functions (IMFs);
the n intrinsic mode functions (IMFs) include the $X_{COD}(t)$, $X_{BOD}(t)$, $X_{DO}(t)$, $X_{TP}(t)$ and $X_{TN}(t)$ sub-datasets.
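The sifting procedure of steps 2-1 to 2-6 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the implementation described in the patent: the envelopes are piecewise-linear (cubic splines are the more common choice), and a relative-change stopping rule stands in for the IMF check of step 2-5. The function names and the `max_imfs`, `max_sifts` and `tol` parameters are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def local_extrema(x: Sequence[float]) -> Tuple[List[int], List[int]]:
    """Indices of strict local maxima and minima (steps 2-1 and 2-2)."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x: Sequence[float], knots: List[int]) -> List[float]:
    """Piecewise-linear envelope through the knots, held flat beyond the ends."""
    out, j = [], 0
    for i in range(len(x)):
        if i <= knots[0]:
            out.append(x[knots[0]])
        elif i >= knots[-1]:
            out.append(x[knots[-1]])
        else:
            while knots[j + 1] < i:
                j += 1
            left, right = knots[j], knots[j + 1]
            w = (i - left) / (right - left)
            out.append((1 - w) * x[left] + w * x[right])
    return out


def sift_once(x: Sequence[float]) -> Optional[List[float]]:
    """Steps 2-3 and 2-4: subtract the mean envelope; None if no extrema remain."""
    maxima, minima = local_extrema(x)
    if not maxima or not minima:
        return None
    upper, lower = envelope(x, maxima), envelope(x, minima)
    return [xi - (u + l) / 2 for xi, u, l in zip(x, upper, lower)]


def emd(x: Sequence[float], max_imfs: int = 8, max_sifts: int = 50,
        tol: float = 1e-3) -> Tuple[List[List[float]], List[float]]:
    """Return (imfs, residual) such that the IMFs plus the residual rebuild x."""
    residual = [float(v) for v in x]
    imfs: List[List[float]] = []
    for _ in range(max_imfs):
        h = sift_once(residual)
        if h is None:
            break  # residual has too few extrema to extract another component
        for _ in range(max_sifts - 1):
            nxt = sift_once(h)
            if nxt is None:
                break
            change = sum((a - b) ** 2 for a, b in zip(h, nxt))
            energy = sum(a * a for a in h) + 1e-12
            h = nxt
            if change / energy < tol:  # assumed stand-in for the IMF check
                break
        imfs.append(h)
        residual = [r - c for r, c in zip(residual, h)]  # step 2-6 residual
    return imfs, residual
```

Each extracted component is subtracted from the running residual, so the components and the final residual always sum back to the input series up to floating-point rounding.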
5. The method of claim 4, wherein the conditions for an intrinsic mode function (IMF) include (a) and (b):
(a) the number of extreme points and the number of zero-crossing points must be equal or differ by at most one;
(b) at any time t, the mean of the upper envelope formed by the local maximum points and the lower envelope formed by the local minimum points is zero;
the extreme points comprise local maximum points and local minimum points;
a zero-crossing point is a point at which the line connecting an adjacent local maximum point and local minimum point crosses the X axis.
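Conditions (a) and (b) can be checked numerically along the following lines. This is a minimal sketch for sampled data: a zero crossing is counted as a sign change between consecutive samples (exact zeros are ignored), and condition (b) is tested against a tolerance `tol`, since a sampled mean envelope is rarely exactly zero. The function names and the tolerance are assumptions for illustration.

```python
from typing import Sequence


def count_extrema(x: Sequence[float]) -> int:
    """Number of strict local maxima and minima among the interior samples."""
    return sum(
        1
        for i in range(1, len(x) - 1)
        if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0
    )


def count_zero_crossings(x: Sequence[float]) -> int:
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def satisfies_condition_a(x: Sequence[float]) -> bool:
    """Condition (a): extrema and zero crossings differ by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1


def satisfies_condition_b(upper: Sequence[float], lower: Sequence[float],
                          tol: float = 1e-6) -> bool:
    """Condition (b): the mean of the two envelopes is zero within tol."""
    return all(abs((u + l) / 2) <= tol for u, l in zip(upper, lower))
```

A zero-mean oscillation passes condition (a), while the same oscillation shifted away from zero fails it because it no longer crosses the axis.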
6. The method of claim 1, wherein in step 3, the method for optimizing the parameters of the LSTM neural network by means of S-SSA comprises:
the formula of S-SSA is:
Figure FDA0003861169180000021
wherein
Figure FDA0003861169180000022
represents the optimal position of a sparrow in the current population; β represents a random number following the standard normal distribution;
Figure FDA0003861169180000023
represents the position of the i-th individual in generation u of the population;
Figure FDA0003861169180000024
represents the position of the i-th individual in generation u+1 of the population; k represents a uniform random number in [-1, 1]; ε represents a small constant in the range [0.01, 0.1] that prevents the denominator from being zero; $x_w$ represents the fitness value of the sparrow at the worst position; $x_i$ represents the fitness value of the sparrow at any position; $x_b$ represents the fitness value of the sparrow at the optimal position;
Figure FDA0003861169180000031
represents the worst position of a sparrow in the current population;
improvement of β:
Figure FDA0003861169180000032
wherein U represents the maximum number of iterations, with U = 200, and u represents the current iteration number (the u-th generation);
Proof: let $u_1 < u_2$
Figure FDA0003861169180000033
Figure FDA0003861169180000034
Figure FDA0003861169180000035
Figure FDA0003861169180000036
∴ $\beta_1 < \beta_2$
β is therefore an increasing function of u: its value is small in the early stage, which strengthens the local search capability, and large in the later stage, which strengthens the global search capability; improving β in this way addresses the tendency of the traditional SSA algorithm to fall into a local optimum;
improvement on k:
Figure FDA0003861169180000037
k increases in the early iterations and decreases in the later iterations; improving k in this way can increase the convergence speed of the SSA algorithm.
7. The method according to claim 6, characterized in that step 3 comprises in particular the steps of:
step 3-1, taking the time window size, the batch size and the number of hidden layer units of the LSTM neural network as the optimization objects, determining the sparrow population size, the number of iterations U and the initial safety threshold, and initializing the SSA optimization algorithm;
step 3-2, determining the fitness value of each sparrow from the root mean square error between the predicted values of the LSTM neural network and the sample data;
step 3-3, updating the sparrow positions to obtain the fitness values of the sparrow population, and storing the optimal individual position and the global optimal position in the population;
step 3-4, judging whether the termination condition is met or the maximum number of iterations is reached; if so, exiting the loop and returning the optimal individual solution, thereby determining the optimal parameters of the LSTM neural network structure; otherwise returning to step 3-3;
step 3-5, taking the optimal values output by the S-SSA algorithm as the time window size, batch size and number of hidden layer units of the LSTM neural network.
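The control flow of steps 3-1 to 3-5 can be sketched as a generic population-based search over three integer hyperparameters. The position update below is a deliberately simple placeholder and is not the S-SSA update of claim 6, whose formulas are only available as images here. The `fitness` callable is a hypothetical stand-in for training an LSTM and returning its validation error, and `population_search`, `bounds`, `target` and `seed` are illustrative names.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Position = List[int]  # [time window size, batch size, hidden units]


def population_search(
    fitness: Callable[[Position], float],
    bounds: Sequence[Tuple[int, int]],
    population: int = 10,
    max_iter: int = 200,
    target: Optional[float] = None,
    seed: int = 0,
) -> Tuple[Position, float]:
    rng = random.Random(seed)
    # Step 3-1: random initial positions inside the bounds.
    flock = [[rng.randint(lo, hi) for lo, hi in bounds] for _ in range(population)]
    # Step 3-2: score every individual (lower is better, as with an error metric).
    scores = [fitness(p) for p in flock]
    best_score = min(scores)
    best = list(flock[scores.index(best_score)])
    for _ in range(max_iter):
        # Step 3-4: stop early once the optional target fitness is reached.
        if target is not None and best_score <= target:
            break
        for k, position in enumerate(flock):
            # Step 3-3 (placeholder rule, NOT the patent's S-SSA formula):
            # jump near the best position with a random, distance-scaled step.
            candidate = []
            for value, goal, (lo, hi) in zip(position, best, bounds):
                step = rng.gauss(0.0, 1.0) * abs(value - goal) + rng.uniform(-1.0, 1.0)
                candidate.append(min(hi, max(lo, round(goal + step))))
            flock[k] = candidate
            score = fitness(candidate)
            if score < best_score:
                best, best_score = list(candidate), score
    # Step 3-5: the best position supplies the LSTM hyperparameters.
    return best, best_score
```

Candidates are clamped to the bounds after every move, so each evaluated position is a valid hyperparameter setting. Replacing the placeholder rule with the S-SSA update would leave the rest of the loop unchanged.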
8. The method of claim 1, wherein in step 4, a plurality of sub-LSTM networks are trained; each sub-LSTM network is trained using the mean square error (MSE),
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^2$$
wherein N is the total number of data points, $x_i$ is the monitoring data value, and $\hat{x}_i$ is the predicted value.
9. The method according to claim 2, wherein in step 5, the range is set as follows: MSEs for COD, BOD, DO, TP and TN are set to be less than 150, 100, 10, 1 and 10 respectively.
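The MSE criterion of claim 8 and the limits of claim 9 can be combined into a short check. The limit values are taken from claim 9; the dictionary and function names are assumptions made for illustration.

```python
from typing import Dict, Sequence

# MSE upper limits per water quality indicator, as listed in claim 9.
MSE_LIMITS: Dict[str, float] = {
    "COD": 150.0,
    "BOD": 100.0,
    "DO": 10.0,
    "TP": 1.0,
    "TN": 10.0,
}


def mse(monitored: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean square error between monitored and predicted values."""
    if len(monitored) != len(predicted) or not monitored:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - p) ** 2 for x, p in zip(monitored, predicted)) / len(monitored)


def passes_validation(indicator: str, monitored: Sequence[float],
                      predicted: Sequence[float]) -> bool:
    """True when the MSE for the indicator is below its limit."""
    return mse(monitored, predicted) < MSE_LIMITS[indicator]
```

For example, a TP forecast that is off by 2 at every point has an MSE of 4 and fails the TP limit of 1, while the same error would pass the COD limit of 150.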
10. The method of claim 9, wherein if the error is smaller than the set range, the verification is passed; and the coefficient of determination $R^2$ and the root mean square error (RMSE) are used to evaluate the prediction performance of the EMD-LSTM model;
the statistical indicators $R^2$ and RMSE, which quantify model performance, are calculated as:
Figure FDA0003861169180000043
Figure FDA0003861169180000044
wherein
Figure FDA0003861169180000045
is the mean of the predicted values, and
Figure FDA0003861169180000046
is the mean of the monitoring data values.
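The two evaluation metrics can be sketched as follows. The formulas in the claim are only available as images, so the exact form of $R^2$ is an assumption: because the claim refers to the means of both the predicted and the monitored values, the sketch uses the squared Pearson correlation. Another common definition is one minus the ratio of the residual to the total sum of squares. RMSE is the square root of the mean squared difference.

```python
import math
from typing import Sequence


def rmse(monitored: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between monitored and predicted values."""
    n = len(monitored)
    return math.sqrt(sum((x - p) ** 2 for x, p in zip(monitored, predicted)) / n)


def r_squared(monitored: Sequence[float], predicted: Sequence[float]) -> float:
    """Squared Pearson correlation (assumed form of the claim's R^2)."""
    n = len(monitored)
    mean_x = sum(monitored) / n
    mean_p = sum(predicted) / n
    cov = sum((x - mean_x) * (p - mean_p) for x, p in zip(monitored, predicted))
    var_x = sum((x - mean_x) ** 2 for x in monitored)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_x == 0 or var_p == 0:
        raise ValueError("R^2 is undefined for a constant series")
    return cov * cov / (var_x * var_p)
```

With this definition a prediction that is perfectly correlated with the data but wrongly scaled still scores $R^2 = 1$, which is one reason to report RMSE alongside it.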
CN202211178099.5A 2022-09-23 2022-09-23 Water quality monitoring method based on EMD and improved LSTM Pending CN115563487A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211178099.5A CN115563487A (en) 2022-09-23 2022-09-23 Water quality monitoring method based on EMD and improved LSTM

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211178099.5A CN115563487A (en) 2022-09-23 2022-09-23 Water quality monitoring method based on EMD and improved LSTM

Publications (1)

Publication Number Publication Date
CN115563487A true CN115563487A (en) 2023-01-03

Family

ID=84743590

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211178099.5A Pending CN115563487A (en) 2022-09-23 2022-09-23 Water quality monitoring method based on EMD and improved LSTM

Country Status (1)

Country Link
CN (1) CN115563487A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200104639A1 (en) * 2018-09-28 2020-04-02 Applied Materials, Inc. Long short-term memory anomaly detection for multi-sensor equipment monitoring
CN111898673A (en) * 2020-07-29 2020-11-06 武汉大学 Dissolved oxygen content prediction method based on EMD and LSTM
CN113361115A (en) * 2021-06-11 2021-09-07 仲恺农业工程学院 Method for predicting dissolved oxygen change of industrial aquaculture water of tilapia
KR102440372B1 (en) * 2022-01-07 2022-09-05 니브스코리아 주식회사 Providing method, apparatus and computer-readable medium of managing influent environmental information of sewage treatment facilities based on big data and artificial intelligence
CN115062750A (en) * 2022-06-16 2022-09-16 合肥学院 Compound water solubility prediction method of dynamic evolution whale optimization algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
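Guo Songlin et al.: "Power load forecasting model based on a BP neural network optimized by an improved SSA algorithm", Journal of Heilongjiang University of Science and Technology, 30 June 2022 (2022-06-30), pages 401 - 405 *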

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination