CN115456245A - Prediction method for dissolved oxygen in tidal river network area - Google Patents

Prediction method for dissolved oxygen in tidal river network area Download PDF

Info

Publication number
CN115456245A
CN115456245A CN202210967488.XA CN202210967488A CN115456245A CN 115456245 A CN115456245 A CN 115456245A CN 202210967488 A CN202210967488 A CN 202210967488A CN 115456245 A CN115456245 A CN 115456245A
Authority
CN
China
Prior art keywords
dissolved oxygen
time
value
information
long
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210967488.XA
Other languages
Chinese (zh)
Inventor
赵长进
叶颖欣
范中亚
杨汉杰
王文才
房怀阳
罗千里
曾凡棠
胡艳芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Institute of Environmental Science of Ministry of Ecology and Environment
Original Assignee
South China Institute of Environmental Science of Ministry of Ecology and Environment
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Institute of Environmental Science of Ministry of Ecology and Environment filed Critical South China Institute of Environmental Science of Ministry of Ecology and Environment
Priority to CN202210967488.XA priority Critical patent/CN115456245A/en
Publication of CN115456245A publication Critical patent/CN115456245A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/02Agriculture; Fishing; Forestry; Mining
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A20/00Water conservation; Efficient water supply; Efficient water use
    • Y02A20/152Water filtration

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • Marketing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Biomedical Technology (AREA)
  • Mining & Mineral Resources (AREA)
  • Animal Husbandry (AREA)
  • Agronomy & Crop Science (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Primary Health Care (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Marine Sciences & Fisheries (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a prediction method of dissolved oxygen in a tidal river network area, which comprises the following steps: s1, data acquisition; s2, data screening: s2-1, mutual information definition; s2-2, value domain division; s2-3, solving a maximum value; s2-4, analyzing relevance; s3, establishing a long-time and short-time memory network model: s3-1, constructing a framework; s3-2, initializing; s3-3, calculating forward propagation; s3-4, updating the weight; s3-5, evaluating the root mean square error; s4, performing k-fold cross validation; and S5, calculating and predicting. The invention fully considers the characteristics that the tidal river network area is influenced by tides and the dissolved oxygen shows periodic change, selects the dissolved oxygen data with time lag as input variables, identifies key factors influencing the dissolved oxygen change as the input variables by a maximum mutual information coefficient method, and effectively solves the problem that the gradient in the traditional circulation network disappears by establishing a long-term and short-term memory network by using a deep machine learning model.

Description

Prediction method for dissolved oxygen in tidal river network area
Technical Field
The invention relates to the technical field of water quality prediction, in particular to a prediction method of dissolved oxygen in a tidal river network area.
Background
Dissolved oxygen is a key metric of water environment and is generally used for evaluating the health condition of an aquatic ecosystem, and the metabolism, heredity and reproduction of aquatic organisms are greatly influenced by the oxygen deficiency of a water body. The tidal river network area is influenced by runoff and tide, the power condition is complex, factors such as temperature, salinity and water stratification can influence the reoxygenation of water, and the hypoxic phenomenon (the dissolved oxygen concentration is less than or equal to 3 mg/L) is often generated in the tidal river network area. The prediction of the change of the dissolved oxygen concentration in the tidal river network area is beneficial to early warning and forecasting and risk optimization control of the sudden hypoxia event of the water environment, and the water quality risk prevention and control and decision support capability of the tidal river network area are improved.
The dissolved oxygen prediction model is mainly divided into a process driving model and a data driving model. The process driving model can capture the nonlinear interaction of water body dynamics and nutrient component circulation and chemical and biological processes in the water body based on a physical law, and fully simulates the mechanism of the water pollution process, but the demand and the dependency of the modeling process on environmental data are large, the solving process is complex, a large amount of calculation cost is needed, and the water pollution process is difficult to simulate when data is lost or the environment is changed. The data driving model is different from a process driving model, does not depend on a physical mechanism, can capture a complex nonlinear relation between a target variable and an explanatory variable, can be used for predicting nonlinearity and high randomness through dynamically and adaptively modifying model elements (such as structures, algorithms and parameters), and is widely applied to relevant research in the field of hydrological water environment. The classic data driving model time sequence prediction model requires data to have certain stationarity and linear correlation, and cannot process the nonlinear problem; support Vector Machines (SVMs), boosting algorithms, maximum entropy methods (MaxEnt) and the like belong to the category of shallow machine learning, and a system structure usually comprises one-to-two layers of nonlinear feature transformation at most, so that the method is effective in solving many simple or well-constrained problems, but the limited modeling and expression capabilities of the method cause difficulties in processing more complex realistic problems.
A Long Short-Term Memory Network (LSTM) is one of deep learning machine models, an input gate, a forgetting gate and an output gate are introduced on the basis of a recurrent neural Network to realize automatic retention and rejection of information, effective association among past, present and future information can be realized in a prediction process, the problem of gradient disappearance in the traditional recurrent Network is solved, and the Long Short-Term Memory Network has better prediction performance compared with the traditional shallow learning Network. In actual prediction, excessive input variables can increase the complexity of model calculation and reduce the performance of the model, at the moment, identifying and screening important factors driving the change of dissolved oxygen as input variables of a prediction model has important significance for predicting the dissolved oxygen, and a maximum Mutual Information Coefficient (MIC) can effectively capture the linear and nonlinear relation between variables, so that the method is widely used for screening the input variables in various research fields. The dissolved oxygen change in the tidal river network area has strong daily periodicity, and the dissolved oxygen change at the same time every day has similar change trend, but at present, a method for more accurately predicting the dissolved oxygen by combining the daily periodicity of the long-time memory network and the short-time memory network with the dissolved oxygen change does not exist.
Disclosure of Invention
Aiming at the problems, the invention provides a method for predicting dissolved oxygen in a tidal river network area.
The technical scheme of the invention is as follows:
a prediction method for dissolved oxygen in a tidal river network area comprises the following steps:
s1, data acquisition: establishing a water quality automatic station in a tidal river network area needing dissolved oxygen prediction, acquiring water quality time sequence data through the water quality automatic station, and preprocessing the acquired water quality time sequence data, wherein the water quality time sequence data comprise dissolved oxygen and other environment variables;
s2, data screening: calculating the maximum mutual information coefficient of the dissolved oxygen and other environment variables in the water quality time series data obtained in the step S1, screening out other environment variables with large correlation with the dissolved oxygen, and taking the other environment variables as input variables of a long-time memory network;
s2-1, mutual information definition: mutual information is an index for measuring the degree of correlation between other environmental variables and dissolved oxygen, and a given variable A = { x = { (x) i I =1,2,. Cndot., n } and B = { y = i I =1, 2.,. N }, where n is the number of samples, and the mutual information I (a; B) of a and B defines the formula:
Figure BDA0003795331990000021
wherein p (x, y) is the joint probability density of A and B, p (x) is the edge probability density of A, and p (y) is the edge probability density of B;
s2-2, value domain division: let D = { (a) i B), i =1,2,.. Multidot.n } is a finite set, and meanwhile, the value ranges of the variable a and the variable B are divided into x sections and y sections respectively to obtain x × y grids G, then the mutual information MI (a, B) is calculated inside each obtained grid division to obtain the maximum value G of the mutual information MI (a, B), and then the maximum normalization value formula of the finite set D under the condition of defining the maximum value G is as follows:
MI*(D,x,y)=maxMI(D│G)
wherein D | G is a finite set D divided using a grid G, and MI (D, x, y) is a maximum normalized value;
s2-3, calculating the maximum value: the maximum value of the feature matrix composed of the maximum normalized values obtained under each grid division is obtained, and the formula of the maximum information coefficient is obtained as follows:
Figure BDA0003795331990000031
wherein MIC (D) is the maximum information coefficient;
s2-4, relevance analysis: calculating the value of the maximum information coefficient MIC (D) of the dissolved oxygen and other environment variables by taking the dissolved oxygen as a variable A and other environment variables as a variable B, wherein the obtained value of the maximum information coefficient MIC (D) is in a [0,1] interval, the greater the value of the maximum information coefficient MIC (D), the greater the relevance of the dissolved oxygen and other environment variables is, the smaller the value of the maximum information coefficient MIC (D), the smaller the relevance of the dissolved oxygen and other environment variables is, and the other environment variables with greater relevance to the dissolved oxygen are selected as input variables of a prediction model;
s3, establishing a long-time memory network model:
s3-1, framework construction: the long-time and short-time memory network model comprises 1 input layer, 1 output layer and a plurality of hidden layers, each hidden layer is composed of a plurality of memory units, the memory units control the updating and utilization of historical information by introducing a gating mechanism, and the gating mechanism comprises an input gate i t Forget gate i t f t And an output gate o t Input door i t Forgetting door f t And an output gate o t All values of (A) are in [0,1]]The interval represents that the information is passed by a certain proportion, the cell state is reset periodically to avoid the short accumulation of the cell state, and the cell state comprises a candidate state
Figure BDA0003795331990000032
Internal state C t And an external state h t Input door i t Controlling candidate states at the current time
Figure BDA0003795331990000033
How much information needs to be stored, forget the door f t Controlling the internal state C of the last moment t The door o is output according to the information to be forgotten t Controlling the internal state C at the present time t How much information needs to be output to the external state h t Simultaneously activating the sigmoid (σ) function and the tanh hyperbolic tangent function layer as shown in the following formula:
Figure BDA0003795331990000041
Figure BDA0003795331990000042
s3-2, initialization: initializing the matrix and the vector of the memory unit for storing model parameters, storing intermediate calculation results, and storing the number of neurons of an input layer and an output layer, the number of cells of a hidden layer and a network state;
s3-3, forward propagation calculation: the long-time and short-time memory network model determines the information discarded from the cell state, and the step is completed by a forgetting gate, firstly, aiming at the input information x at the current moment t And the hidden layer external state h at the previous moment t-1 The output information of the system is processed by a sigmoid (sigma) function layer to obtain an output between 0 and 1 as an internal state C at the last moment t-1 The filtering value of (f) is obtained as the forgetting gate t The formula of (1) is as follows:
f t =σ(W xf x t +W hf h t―1 +b f )
wherein, W is a weight matrix, the subscript of W represents the connection weight between two units, b represents the bias term;
secondly, the long-time memory network model judges the information stored in the cell state, firstly, the input information x at the current time is t And the hidden layer external state h at the previous moment t-1 The output information is calculated by a sigmoid function layer to obtain an input gate i t The value is shown as the following formula:
i t =σ(W xi x t +W hi h t―1 +b i )
a candidate state is then generated by the hyperbolic tangent function layer tanh
Figure BDA0003795331990000043
For the renewal of the cell state, as shown in the following formula:
Figure BDA0003795331990000044
finally, the long-time and short-time memory network model determines the output information of the cell, and inputs the input information x at the current moment t And the hidden layer external state h at the previous moment t-1 Output information of the output gate is calculated by a sigmoid (sigma) function layer to form an output gate o t As shown in the following formula:
o t =σ(W xo x t +W ho h t―1 +b o )
then the internal state C of the current cell t Compression to [ -1, 1] by tanh function]Interval of (2), internal state C of the compressed cell t And an output gate o t Multiplying to obtain the external state h of the hidden layer at the current moment t Outputting information as shown in the following formula:
h t =o t tanh(C t )
the memory unit can be connected with other parts in the long-time and short-time memory network model, and the external state h of the hidden layer at the current moment t As the hidden layer external state h t Is passed on to the next instant, on the other hand as the hidden layer external state h t The output information is transmitted to a next long-term and short-term memory network, when the next long-term and short-term memory network is a full connection layer, a transformation is carried out on a hidden layer result to obtain final output information, and therefore a predicted value of a time sequence is obtained
Figure BDA0003795331990000051
As shown in the following formula:
Figure BDA0003795331990000052
in the formula, vout is a weight matrix of the full connection layer, and b represents an offset term;
s3-4, updating the weight: solving the gradient of each weight of the long-time memory network, finding an optimal solution by using training data to perform random gradient descent, solving the gradient from the weight of an output layer to the weight of an input layer, sequentially updating each weight, resetting an internal state, designing an error function, and calculating and checking the gradient;
s3-5, root mean square error evaluation: training time sequence data of other environment variables related to dissolved oxygen through a long-short-term memory network model, training the long-short-term memory network model by taking the time sequence data of other environment variables which are normalized and subjected to MIC screening as a training data set, adding a Drapout mechanism into a training mechanism of a hidden layer in order to relieve the overfitting problem in the training process of a multivariable prediction model neural network, and calculating a root mean square error after training to evaluate the prediction result of the long-short-term memory network model, wherein the root mean square error is as shown in the following formula:
Figure BDA0003795331990000061
in the formula (I), the compound is shown in the specification,
Figure BDA0003795331990000062
y (i) is a predicted value of dissolved oxygen and an actual measured value of dissolved oxygen;
s4, k-fold cross validation: dividing the input variable obtained in the step S2-4 into k equal parts as an original data set, selecting k-1 parts as a training set each time, using 1 part as a test set, training k-1 parts and testing the rest 1 parts by using different hyper-parameter combinations, calculating the RMSE value of the test set, repeating the steps of the step S3-2 to the step S3-5 for long and short time memory of network model training and testing until each hyper-parameter combination in the k original data set is tested, and calculating the RMSE average value of each final output information, wherein the parameter group with the minimum RMSE average value is an optimal combination as shown in the following formula:
Figure BDA0003795331990000063
s5, calculating and predicting: real-time data of a water quality automatic station in a tidal river network area are input into an established long-time and short-time memory network model after being preprocessed, a predicted value of dissolved oxygen is obtained through scaling of a result output by the long-time and short-time memory network model, and a trend graph of the dissolved oxygen is drawn by adopting a rolling forecasting method.
Further, other environmental variables of the water time series data in step S1 include pH, water temperature, conductivity, turbidity, water level, flow rate, ammonia nitrogen, total phosphorus, permanganate index, chemical oxygen demand, total nitrogen and DO 25h Said DO 25h For corrected time-series data of dissolved oxygen, DO 25h The correction method comprises the following steps: the duration of one tidal cycle is 24h50min, the lag time is increased to 25h, and the corrected dissolved oxygen time sequence data obtained at the moment is DO 25h
Further, the preprocessing method in step S1 includes: carrying out missing value interpolation and normalization processing on the collected water quality time sequence data;
s1-1, missing value interpolation: when the water quality time sequence data is missing, the average value of the data at two adjacent moments is used for interpolation;
s1-2, normalization processing: the formula of the normalization process is:
Figure BDA0003795331990000064
wherein x' is water quality time sequence data after normalization, x is water quality time sequence data before normalization, and x min As a minimum in water quality time series dataValue, x max Is the maximum value in the water quality time series data.
Further, when the value of the maximum information coefficient MIC (D) in step S2-4 is greater than 0.8, it is considered that the correlation between the other environmental variables and the dissolved oxygen is large. Normally dissolved oxygen and DO 25 The MIC (D) value of (1) is large, about 0.7-0.8, and the correlation calculation is performed by normalized _ mutual _ info _ score in python module skleern. Metrics. Cluster.
Further, the model parameters in step S3-2 include a weight matrix W and a bias term b, and the intermediate calculation result includes an external state h t Output information, input gate f t Forgotten door i t Output gate o t
Further, the Dropout mechanism in step S3-5 is: the neural units and their connections are randomly lost during the training process with time series data of other environmental variables.
Further, the long-time memory network model is built in the step S3-1 based on a TensorFlow deep learning framework.
Further, the method for scrolling and forecasting in step S5 specifically includes: according to the sampling interval of the predicted value of the existing dissolved oxygen, a reasonable prediction time step length is set, the long-time memory network model can calculate the dissolved oxygen data of t + n days and output the calculated dissolved oxygen data to obtain a true dissolved oxygen value on the assumption that the predicted time is n days according to the t-day dissolved oxygen data concentrated in the test and the important parameters screened by the method of S2, then the true dissolved oxygen value of t + n days and other environment variables screened by the method of S2 are screened out on the t +2n days, and the sequence information is updated in time by adopting a rolling prediction method, so that error accumulation is avoided.
The invention has the beneficial effects that:
the invention provides a solution for predicting dissolved oxygen in a tidal river Network area, fully considers the characteristics that the tidal river Network area is influenced by tides and the dissolved oxygen shows periodic change, selects time-lag dissolved oxygen data as input variables, identifies key factors influencing the change of the dissolved oxygen as the input variables by a Maximum Information Coefficient (MIC) method, establishes a Long-Short-Term Memory Network (LSTM) by using a deep machine learning model, effectively solves the problem of gradient disappearance in the traditional circulating Network, selects an optimal hyper-parameter combination of the model by using a K-fold cross-validation grid searching method, and improves the accuracy of predicting the dissolved oxygen in the tidal river Network area.
Drawings
FIG. 1 is a flow chart of the method for predicting dissolved oxygen in a tidal river network area according to the present invention;
FIG. 2 is a schematic view of step S3 in an experimental example of the method for predicting dissolved oxygen in a tidal river network area according to the present invention;
FIG. 3 is a schematic diagram of the testing and training results of the long-term and short-term memory network model in Experimental example 1 of the prediction method of dissolved oxygen in the tidal river network area according to the present invention;
FIG. 4 is a schematic diagram of the test and training results of a long-term and short-term memory network model in Experimental example 2 of the prediction method of dissolved oxygen in a tidal river network area according to the present invention;
fig. 5 is a schematic diagram of the test and training results of the long-term and short-term memory network model in experimental example 3 of the prediction method for dissolved oxygen in tidal river network areas of the present invention.
Detailed Description
Example 1
A method for predicting dissolved oxygen in a tidal river network area, as shown in fig. 1, comprising the following steps:
s1, data acquisition: establishing a water quality automatic station in a tidal river network area needing dissolved oxygen prediction, collecting water quality time sequence data through the water quality automatic station, and preprocessing the collected water quality time sequence data, wherein the water quality time sequence data comprise dissolved oxygen and other environmental variables, and the other environmental variables of the water quality time sequence data comprise pH, water temperature, conductivity, turbidity, water level, flow, ammonia nitrogen, total phosphorus, permanganate index, chemical oxygen demand, total nitrogen and DO 25h Said DO 25h For corrected time series data of dissolved oxygen, DO 25h The correction method comprises the following steps: the duration of one tidal cycle is 24h50min, the lag time is increased to 25h, and the corrected dissolved oxygen time sequence data obtained at the moment is DO 25h
The pretreatment method comprises the following steps: carrying out missing value interpolation and normalization processing on the collected water quality time sequence data;
s1-1, missing value interpolation: when the water quality time sequence data is missing, the average value interpolation of the data at two adjacent moments is used;
outliers (identified by L or more 000 in the data quadratic table) and default values of the data are identified, labeled as nan. When the water quality time sequence data is missing, the average value of the data at two adjacent moments is used for interpolation;
the sampling frequency is unified, the non-integral point or integral day recording situation can occur in the data recording of the water quality automatic station, the situations are screened, and the data are unified to the whole day or the whole hour according to the actual situation of each station;
missing value interpolation: according to the unified data sampling frequency of each station, under the condition that no effective data exists at the corresponding time point, the nearest effective data is used for filling, and if the missing data is more than 12 time steps, linear interpolation is used for interpolation.
S1-2, normalization processing: the formula for the normalization process is:
Figure BDA0003795331990000091
wherein x' is water quality time sequence data after normalization, x is water quality time sequence data before normalization, and x min Is the minimum value, x, in the water quality time series data max Is the maximum value in the water quality time sequence data;
s2, data screening: calculating the maximum mutual information coefficient of the dissolved oxygen and other environment variables in the water quality time series data obtained in the step S1, screening out other environment variables with large correlation with the dissolved oxygen, and taking the other environment variables as input variables of a long-time memory network;
s2-1, mutual information definition: mutual information is an index that measures the degree of correlation between other environmental variables and dissolved oxygen, givenVariable a = { x i I =1,2,. Cndot., n } and B = { y = i I =1,2,., n }, where n is the number of samples, the mutual information I (a; B) of a and B is defined by the formula:
Figure BDA0003795331990000092
wherein p (x, y) is the joint probability density of A and B, p (x) is the edge probability density of A, and p (y) is the edge probability density of B;
s2-2, value domain division: let D = { (a) i And B), i =1,2,.. Multidot.n } is a finite set, meanwhile, the value ranges of the variable A and the variable B are respectively divided into an x section and a y section to obtain a grid G of x y, then mutual information MI (A, B) is calculated in each obtained grid division to obtain the maximum value G of the mutual information MI (A, B), and then the maximum normalization value formula of the finite set D under the condition of defining the maximum value G is as follows:
MI*(D,x,y)=maxMI(D│G)
wherein D | G is a finite set D divided using a grid G, and MI (D, x, y) is the maximum normalized value;
s2-3, calculating the maximum value: the maximum value of the feature matrix composed of the maximum normalized values obtained under each grid division is obtained, and the formula of the maximum information coefficient is as follows:
Figure BDA0003795331990000101
wherein MIC (D) is the maximum information coefficient;
s2-4, correlation analysis: calculating the value of the maximum information coefficient MIC (D) of the dissolved oxygen and other environment variables by taking the dissolved oxygen as a variable A and other environment variables as a variable B, wherein the obtained value of the maximum information coefficient MIC (D) is in a [0,1] interval, the greater the value of the maximum information coefficient MIC (D), the greater the correlation between the dissolved oxygen and other environment variables, the smaller the value of the maximum information coefficient MIC (D), the smaller the correlation between the dissolved oxygen and other environment variables, selecting other environment variables with greater correlation with the dissolved oxygen as input variables of the prediction model, and considering that the correlation between the other environment variables and the dissolved oxygen is greater when the value of the maximum information coefficient MIC (D) is greater than 0.8;
s3, establishing a long-time memory network model:
s3-1, framework construction: a long-and-short-term memory network model is built based on a TensorFlow deep learning framework, the long-and-short-term memory network model comprises 1 input layer, 1 output layer and 3 hidden layers, each hidden layer is composed of 20 memory units, the memory units control updating and utilization of historical information by introducing a gating mechanism, and the gating mechanism comprises an input gate i t Forget gate i t f t And an output gate o t Input door i t Forgetting door f t And an output gate o t All values of (A) are in [0,1]]The interval represents that the information is passed by a certain proportion, the cell state is reset periodically to avoid the short accumulation of the cell state, and the cell state comprises a candidate state
Figure BDA0003795331990000111
Internal state C t And an external state h t Input door i t Controlling candidate states at the current time
Figure BDA0003795331990000112
How much information needs to be stored, forget the door f t Controlling the internal state C of the last moment t The information to be forgotten is output through an output gate o t Controlling the internal state C at the present time t How much information needs to be output to the external state h t Simultaneously activating the sigmoid (σ) function and the tanh hyperbolic tangent function layer as shown in the following formula:
Figure BDA0003795331990000113
Figure BDA0003795331990000114
s3-2, initialization:initializing the matrix and vector of the memory unit for storing model parameters including weight matrix W and bias item b and storing intermediate calculation result including external state h t Output information, input gate f t Forget gate i t Output gate o t Storing the neuron number, the hidden layer cell number and the network state of the input layer and the output layer;
s3-3, forward propagation calculation: the long-time and short-time memory network model determines the information discarded from the cell state, and the step is completed by a forgetting gate, firstly, aiming at the input information x at the current moment t And the hidden layer external state h at the previous moment t-1 The output information of (2) is processed by sigmoid (sigma) function layer to obtain an output between 0 and 1 as the internal state C at the last time t-1 The filtering value of (f) is obtained as the forgetting gate t The formula of (1) is:
f t =σ(W xf x t +W hf h t―1 +b f )
wherein, W is a weight matrix, the subscript of W represents the connection weight between two units, and b represents an offset term;
secondly, the long-time and short-time memory network model judges the information stored in the cell state, firstly, the input information x at the current time is input t And the hidden layer external state h at the previous moment t-1 The output information is calculated by a sigmoid function layer to obtain an input gate i t The value is shown as the following formula:
i t =σ(W xi x t +W hi h t―1 +b i )
a candidate state is then generated by means of the tanh layer
Figure BDA0003795331990000121
For the renewal of the cell state, as shown in the following formula:
Figure BDA0003795331990000122
finally, the long-time and short-time memory network model determines the output information of the cell, and inputs the input information x at the current moment t And the hidden layer external state h at the previous moment t-1 The output information of the output gate is subjected to sigmoid (sigma) function layer calculation to obtain an output gate o t As shown in the following formula:
o t =σ(W xo x t +W ho h t―1 +b o )
then the internal state C of the current cell t Compression to [ -1, 1] by tanh function]Interval of (2), internal state C of the compressed cell t And an output gate o t Multiplying to obtain the external state h of the hidden layer at the current moment t Outputting information as shown in the following formula:
h t =o t tanh(C t )
the memory unit is also connected with other parts in the long-time memory network model, and the external state h of the hidden layer at the current time t As the hidden layer external state h t Is passed on to the next instant, on the other hand as the hidden layer external state h t The output information is transmitted to a next long-term and short-term memory network, when the next long-term and short-term memory network is a full connection layer, a transformation is carried out on a hidden layer result to obtain final output information, and therefore a predicted value of a time sequence is obtained
Figure BDA0003795331990000123
As shown in the following formula:
Figure BDA0003795331990000124
in the formula, vout is a weight matrix of the full connection layer, and b represents an offset term;
s3-4, updating the weight: solving the gradient of each weight of the long-time memory network, finding an optimal solution by using training data to perform random gradient descent, solving the gradient from the weight of an output layer to the weight of an input layer, sequentially updating each weight, resetting an internal state, designing an error function, and calculating and checking the gradient;
s3-5, root mean square error evaluation: training time sequence data of other environment variables related to dissolved oxygen through a long-short-term memory network model, training the long-short-term memory network model by taking the time sequence data of other environment variables which are normalized and subjected to MIC screening as a training data set, and adding a Dropout mechanism into a training mechanism of a hidden layer in order to relieve the overfitting problem in the training process of a multivariate prediction model neural network, wherein the Dropout mechanism is as follows: the neural unit and the connection thereof are randomly lost in the process of training time sequence data of other environment variables, and after the training is finished, the root mean square error is calculated to evaluate the prediction result of the long-time and short-time memory network model, wherein the root mean square error is shown as the following formula:
Figure BDA0003795331990000131
in the formula (I), the compound is shown in the specification,
Figure BDA0003795331990000132
y (i) is a predicted value of the dissolved oxygen and an actual measurement value of the dissolved oxygen;
s4, k-fold cross validation: dividing the input variable obtained in the step S2-4 as an original data set into k equal parts, taking k as 5, selecting k-1 part as a training set each time, taking the remaining 1 part as a test set, training k-1 part and testing the rest 1 part by using different hyper-parameter combinations, calculating the RMSE value of the test set, repeating the steps of training and testing the network model in the steps S3-2 to S3-5 at long and short times until each hyper-parameter combination in the k original data set is tested, and calculating the RMSE average value of each final output information, wherein the parameter with the smallest RMSE average value is the optimal combination, and the following formula shows that:
Figure BDA0003795331990000133
s5, calculating and predicting: preprocessing real-time data of a water quality automatic station in a tidal river network area, inputting the preprocessed real-time data into a built long-and-short-term memory network model, obtaining a predicted value of dissolved oxygen through scaling of a result output by the long-and-short-term memory network model, and drawing a trend graph of the dissolved oxygen by adopting a rolling forecasting method, wherein the rolling forecasting method in the step S5 specifically comprises the following steps of: according to the sampling interval of the predicted value of the existing dissolved oxygen, a reasonable prediction time step length is set, the long-time memory network model can calculate the dissolved oxygen data of t + n days and output the calculated dissolved oxygen data to obtain a true dissolved oxygen value on the assumption that the predicted time is n days according to the t-day dissolved oxygen data concentrated in the test and the important parameters screened by the method of S2, then the true dissolved oxygen value of t + n days and other environment variables screened by the method of S2 are screened out on the t +2n days, and the sequence information is updated in time by adopting a rolling prediction method, so that error accumulation is avoided.
Example 2
This embodiment is substantially the same as embodiment 1, except that: and S3-1, the number of the hidden layers in the framework construction is different.
S3-1, framework construction: a long-short-term memory network model is built based on a TensorFlow deep learning framework, and the long-short-term memory network model comprises 1 input layer, 1 output layer and 3 hidden layers.
Example 3
This embodiment is substantially the same as embodiment 1, except that: the values of the maximum information coefficients MIC (D) in steps S2-4 are different. The maximum information coefficient MIC (D) was 0.5, and the variables used for prediction included ammonia nitrogen and total phosphorus.
Experimental example 1
In order to verify the actual application effect of the invention, the actually measured water quality online observation data actually operated by a certain water quality automatic online site is selected for verification. The prediction is carried out by using the method for predicting the dissolved oxygen in the tidal river network area in the embodiment 1, the selected station is a Dalongyong station, and the time span is 1/1 day in 2019 to 3/29 days in 2021. The sampling frequency of permanganate index, ammonia nitrogen, total phosphorus, and total nitrogen was 4 hours, and the time sampling frequency of the remaining variables was 1 hour, as shown in table 1.
8832 time sequence samples processed in the data acquisition of the step S1 are respectively calculated in the step S2, and according to the MIC (D) value and 0.85 as a threshold value, DO25, conductivity, water temperature, ammonia nitrogen and total nitrogen concentration are selected as prediction variables of a long-time memory network model, wherein the MIC (D) values of temperature, pH, DO25, conductivity, turbidity, permanganate index, ammonia nitrogen, total phosphorus, total nitrogen and dissolved oxygen are respectively calculated; in step S3, a long-time and short-time memory network model is built based on a mainstream TensorFlow deep learning framework, for the hyper-parameters related to the prediction model, as shown in fig. 2, in step S4, a k-fold cross validation grid search method is adopted for optimization to obtain an optimal hyper-parameter combination, 67% of data in a sample is selected as a training set, the long-time and short-time memory network model is trained, the remaining 33% of samples are used as a test set, training and test results are shown in fig. 3, calculation results of each relevant variable are shown in table 1, and a model parameter setting and result evaluation list is shown in table 2. After training, the root mean square error was calculated to evaluate model performance, with a training set RMSE of 0.29 and a test set RMSE of 0.22.
Experimental example 2
This example is basically the same as example 1, except that: the selected observation stations are different, the data of pier bases are selected to train and predict the model, the calculation results of all relevant variables are shown in table 1, the model parameter setting and result evaluation list is shown in table 2, and the training and testing results are shown in fig. 4.
Experimental example 3
This example is basically the same as example 2, except that: the number of the selected grid layers is different, the calculation results of the relevant variables are shown in table 1, the model parameter setting and result evaluation list is shown in table 2, and the training and testing results are shown in fig. 5.
Experimental example 4
This example is substantially the same as example 2 except that: based on the fact that the maximum information coefficient MIC (D) in example 3 was 0.5, variables used for prediction included ammonia nitrogen and total phosphorus, the calculation results of each relevant variable are shown in table 1, and the model parameter setting and result evaluation list is shown in table 2.
Experimental example 5
This example is basically the same as example 1, except that: the step size is changed, more input and output time steps are used, the calculation results of all relevant variables are shown in table 1, and the model parameter setting and result evaluation list is shown in table 2.
TABLE 1 List of the results of MIC (D) calculations for the maximum information coefficient for each of the relevant variables in the Large Surge site and pier-head-based site
Figure BDA0003795331990000151
Table 2 model parameter settings and result evaluation list in experimental cases 1-5
Figure BDA0003795331990000161

Claims (8)

1. A prediction method for dissolved oxygen in a tidal river network area is characterized by comprising the following steps:
s1, data acquisition: establishing a water quality automatic station in a tidal river network area needing dissolved oxygen prediction, collecting water quality time series data through the water quality automatic station, and preprocessing the collected water quality time series data, wherein the water quality time series data comprise dissolved oxygen and other environment variables;
s2, data screening: calculating the maximum mutual information coefficient of the dissolved oxygen and other environment variables in the water quality time sequence data obtained in the step S1, screening out other environment variables with larger correlation with the dissolved oxygen, and using the other environment variables as input variables of a long-term and short-term memory network;
s2-1, mutual information definition: mutual information is an index for measuring the degree of correlation between other environmental variables and dissolved oxygen, and a given variable A = { x = i I =1,2,. Cndot., n } and B = { y = i I =1, 2.,. N }, where n is the number of samples, and the mutual information I (a; B) of a and B defines the formula:
Figure FDA0003795331980000011
wherein p (x, y) is the joint probability density of A and B, p (x) is the edge probability density of A, and p (y) is the edge probability density of B;
s2-2, value domain division: let D = { (a) i And B), i =1,2,.. Multidot.n } is a finite set, meanwhile, the value ranges of the variable A and the variable B are respectively divided into an x section and a y section to obtain a grid G of x y, then mutual information MI (A, B) is calculated in each obtained grid division to obtain the maximum value G of the mutual information MI (A, B), and then the maximum normalization value formula of the finite set D under the condition of defining the maximum value G is as follows:
MI*(D,x,y)=maxMI(D│G)
wherein D | G is a finite set D divided using a grid G, and MI (D, x, y) is a maximum normalized value;
s2-3, calculating the maximum value: the maximum value of the feature matrix composed of the maximum normalized values obtained under each grid division is obtained, and the formula of the maximum information coefficient is obtained as follows:
Figure FDA0003795331980000012
wherein MIC (D) is the maximum information coefficient;
s2-4, correlation analysis: calculating the value of the maximum information coefficient MIC (D) of the dissolved oxygen and other environment variables by taking the dissolved oxygen as a variable A and other environment variables as a variable B, wherein the obtained value of the maximum information coefficient MIC (D) is in a [0,1] interval, the greater the value of the maximum information coefficient MIC (D), the greater the correlation between the dissolved oxygen and the other environment variables, and the smaller the value of the maximum information coefficient MIC (D), the smaller the correlation between the dissolved oxygen and the other environment variables, and selecting the other environment variables with greater correlation with the dissolved oxygen as the input variables of the prediction model;
s3, establishing a long-time and short-time memory network model:
s3-1, framework construction: the long-time and short-time memory network model comprises 1 input layer, 1 output layer and a plurality of hidden layers, wherein each hidden layerIs composed of multiple memory units for controlling the update and utilization of history information by introducing a gating mechanism including an input gate i t Forgotten door i t f t And an output gate o t Input door i t Forgetting door f t And an output gate o t All values of (A) are in [0,1]]The interval represents that the information is passed by a certain proportion, the cell state is reset periodically to avoid the short accumulation of the cell state, and the cell state comprises a candidate state
Figure FDA0003795331980000021
Internal state C t And an external state h t Input door i t Controlling candidate states at the current time
Figure FDA0003795331980000022
How much information needs to be stored and left behind f t Controlling the internal state C of the previous moment t The door o is output according to the information to be forgotten t Controlling the internal state C at the current time t How much information needs to be output to the external state h t Simultaneously activating a sigmoid (σ) function and a tanh function layer, as shown in the following formula:
Figure FDA0003795331980000023
Figure FDA0003795331980000024
s3-2, initialization: initializing the matrix and the vector of the memory unit for storing model parameters, storing intermediate calculation results, and storing the number of neurons of an input layer and an output layer, the number of cells of a hidden layer and a network state;
s3-3, forward propagation calculation: the long-time and short-time memory network model can determine the information discarded from the cell state, and the step is completed by a forgetting gate, firstly, aiming at the output of the current timeEntry information x t And the hidden layer external state h at the previous moment t-1 The output information of the system is processed by a sigmoid (sigma) function layer to obtain an output between 0 and 1 as an internal state C at the last moment t-1 The filtering value of (f) is obtained as the forgetting gate t The formula of (1) is:
f t =σ(W xf x t +W hf h t―1 +b f )
wherein, W is a weight matrix, the subscript of W represents the connection weight between two units, and b represents an offset term;
secondly, the long-time and short-time memory network model judges the information stored in the cell state, firstly, the input information x at the current time is input t And the hidden layer external state h at the previous moment t-1 The output information is calculated by a sigmoid function layer to obtain an input gate i t The value is shown as the following formula:
i t =σ(W xi x t +W hi h t―1 +b i )
a candidate state is then generated by the hyperbolic tangent function layer tanh
Figure FDA0003795331980000031
For the renewal of the cell state, as shown in the following formula:
Figure FDA0003795331980000032
finally, the long-time and short-time memory network model determines the output information of the cell, and inputs the input information x at the current moment t And the hidden layer external state h at the previous moment t-1 Output information of the output gate is calculated by a sigmoid (sigma) function layer to form an output gate o t As shown in the following formula:
o t =σ(W xo x t +W ho h t―1 +b o )
then the internal state C of the current cell t Compression to [ -1, 1] by tanh function]Interval of (2), internal state C of the compressed cell t And output gate o t Multiplying to obtain the external state h of the hidden layer at the current moment t Outputting information as shown in the following formula:
h t =o t tanh(C t )
the memory unit is also connected with other parts in the long-time memory network model, and the external state h of the hidden layer at the current time t As the hidden layer external state h t Is passed on to the next instant, on the other hand as the hidden layer external state h t The output information is transmitted to a next long-term and short-term memory network, when the next long-term and short-term memory network is a full connection layer, a transformation is carried out on a hidden layer result to obtain final output information, and therefore a predicted value of a time sequence is obtained
Figure FDA0003795331980000033
As shown in the following formula:
Figure FDA0003795331980000041
in the formula, vout is a weight matrix of the full connection layer, and b represents an offset term;
s3-4, updating the weight: solving the gradient of each weight of the long-time memory network, finding an optimal solution by using training data to perform random gradient descent, solving the gradient from the weight of an output layer to the weight of an input layer, sequentially updating each weight, resetting an internal state, designing an error function, and calculating and checking the gradient;
s3-5, root mean square error evaluation: training time sequence data of other environment variables related to dissolved oxygen through a long-short-term memory network model, training the long-short-term memory network model by taking the time sequence data of other environment variables which are normalized and subjected to MIC screening as a training data set, adding a Drapout mechanism into a training mechanism of a hidden layer in order to relieve the overfitting problem in the training process of a multivariable prediction model neural network, and calculating a root mean square error after training to evaluate the prediction result of the long-short-term memory network model, wherein the root mean square error is as shown in the following formula:
Figure FDA0003795331980000042
in the formula (I), the compound is shown in the specification,
Figure FDA0003795331980000043
y (i) is a predicted value of the dissolved oxygen and an actual measurement value of the dissolved oxygen;
s4, k-fold cross validation: dividing the input variable obtained in the step S2-4 as an original data set into k equal parts, selecting k-1 part as a training set each time, using 1 part as a test set, training k-1 part and testing the rest 1 part by using different hyper-parameter combinations, calculating the RMSE value of the test set, repeating the steps of the step S3-2 to the step S3-5 for long and short time memory network model training and testing until each hyper-parameter combination in the k original data set is tested, calculating the RMSE average value of each final output information, and combining the parameter set with the minimum RMSE average value into an optimal combination as shown in the following formula:
Figure FDA0003795331980000044
s5, calculating and predicting: real-time data of a water quality automatic station in a tidal river network area are input into an established long-time and short-time memory network model after being preprocessed, a predicted value of dissolved oxygen is obtained through scaling of a result output by the long-time and short-time memory network model, and a trend graph of the dissolved oxygen is drawn by adopting a rolling forecasting method.
2. The method for predicting dissolved oxygen in tidal river network area according to claim 1, wherein the other environmental variables of the water quality time series data in step S1 comprise pH, water temperature, conductivity, turbidity, water level, flow rate, ammonia nitrogen, total phosphorus, permanganate index, chemical oxygen demand, total nitrogen and DO 25h Said DO 25h As corrected dissolved oxygenData of intermediate sequence, DO 25h The correction method comprises the following steps: the duration of one tidal cycle is 24h50min, the lag time is increased to 25h, and the corrected dissolved oxygen time sequence data obtained at the moment is DO 25h
3. The method for predicting dissolved oxygen in a tidal river network area according to claim 1, wherein the preprocessing in the step S1 comprises: carrying out missing value interpolation and normalization processing on the collected water quality time sequence data;
s1-1, missing value interpolation: when the water quality time sequence data is missing, the average value interpolation of the data at two adjacent moments is used;
s1-2, normalization processing: the formula for the normalization process is:
Figure FDA0003795331980000051
wherein x' is the water quality time sequence data after normalization, x is the water quality time sequence data before normalization, and x min Is the minimum value, x, in the water quality time series data max Is the maximum value in the water quality time series data.
4. The method for predicting dissolved oxygen in a tidal river network area according to claim 1, wherein the correlation between other environmental variables and dissolved oxygen is considered to be greater when the value of the maximum information coefficient MIC (D) in the step S2-4 is greater than 0.8.
5. The method for predicting dissolved oxygen in tidal river network area according to claim 1, wherein the model parameters in step S3-2 comprise a weight matrix W and an offset term b, and the intermediate calculation result comprises an external state h t Output information, input gate f t Forgotten door i t Output gate o t
6. The method for predicting dissolved oxygen in a tidal river network area according to claim 1, wherein the Dropout mechanism in the step S3-5 is: the neural units and their connections are randomly lost during the training process with time series data of other environmental variables.
7. The method for predicting the dissolved oxygen in the tidal river network area according to claim 1, wherein the long-time memory network model built in the step S3-1 is built based on a TensorFlow deep learning framework.
8. The method for predicting dissolved oxygen in the tidal river network area according to claim 1, wherein the rolling forecasting method in the step S5 specifically comprises the following steps: and setting a reasonable prediction time step length according to the sampling interval of the existing predicted value of the dissolved oxygen, supposing that the predicted time is n days, calculating the dissolved oxygen data of t + n days by a long-time memory network model according to the t-day dissolved oxygen data concentrated in the test and the important parameters screened by the method S2, outputting to obtain a true value of the dissolved oxygen, screening out other environment variables by using the t + n-day dissolved oxygen true value and the method S2 on the t +2n days, and updating the sequence information in time by adopting a rolling prediction method.
CN202210967488.XA 2022-08-12 2022-08-12 Prediction method for dissolved oxygen in tidal river network area Pending CN115456245A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210967488.XA CN115456245A (en) 2022-08-12 2022-08-12 Prediction method for dissolved oxygen in tidal river network area

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210967488.XA CN115456245A (en) 2022-08-12 2022-08-12 Prediction method for dissolved oxygen in tidal river network area

Publications (1)

Publication Number Publication Date
CN115456245A true CN115456245A (en) 2022-12-09

Family

ID=84298975

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210967488.XA Pending CN115456245A (en) 2022-08-12 2022-08-12 Prediction method for dissolved oxygen in tidal river network area

Country Status (1)

Country Link
CN (1) CN115456245A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116703455A (en) * 2023-08-02 2023-09-05 北京药云数据科技有限公司 Medicine data sales prediction method and system based on time series hybrid model
CN116969582A (en) * 2023-09-22 2023-10-31 深圳市友健科技有限公司 Intelligent regulation and control method and system for sewage treatment
CN118067200A (en) * 2024-04-17 2024-05-24 河北省沧州生态环境监测中心 River water quality real-time monitoring and early warning system

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116703455A (en) * 2023-08-02 2023-09-05 北京药云数据科技有限公司 Medicine data sales prediction method and system based on time series hybrid model
CN116703455B (en) * 2023-08-02 2023-11-10 北京药云数据科技有限公司 Medicine data sales prediction method and system based on time series hybrid model
CN116969582A (en) * 2023-09-22 2023-10-31 深圳市友健科技有限公司 Intelligent regulation and control method and system for sewage treatment
CN116969582B (en) * 2023-09-22 2023-12-08 深圳市友健科技有限公司 Intelligent regulation and control method and system for sewage treatment
CN118067200A (en) * 2024-04-17 2024-05-24 河北省沧州生态环境监测中心 River water quality real-time monitoring and early warning system

Similar Documents

Publication Publication Date Title
CN111967688B (en) Power load prediction method based on Kalman filter and convolutional neural network
Wang et al. Adaptive learning hybrid model for solar intensity forecasting
CN115456245A (en) Prediction method for dissolved oxygen in tidal river network area
CN113297801A (en) Marine environment element prediction method based on STEOF-LSTM
CN113554466B (en) Short-term electricity consumption prediction model construction method, prediction method and device
CN114119273B (en) Non-invasive load decomposition method and system for park comprehensive energy system
CN114547974B (en) Dynamic soft measurement modeling method based on input variable selection and LSTM neural network
Dong et al. An integrated deep neural network approach for large-scale water quality time series prediction
CN114282443B (en) Residual service life prediction method based on MLP-LSTM supervised joint model
Li et al. A novel multichannel long short-term memory method with time series for soil temperature modeling
CN115495991A (en) Rainfall interval prediction method based on time convolution network
CN112434848A (en) Nonlinear weighted combination wind power prediction method based on deep belief network
CN114492922A (en) Medium-and-long-term power generation capacity prediction method
Ehsan et al. Wind speed prediction and visualization using long short-term memory networks (LSTM)
CN114862032A (en) XGboost-LSTM-based power grid load prediction method and device
CN114444561A (en) PM2.5 prediction method based on CNNs-GRU fusion deep learning model
CN116703644A (en) Attention-RNN-based short-term power load prediction method
CN116720080A (en) Homologous meteorological element fusion inspection method
CN118157127A (en) Multi-weather photovoltaic power generation power prediction digital twin system based on LSTM-MM model
Wang et al. A transformer-based multi-entity load forecasting method for integrated energy systems
CN113151842B (en) Method and device for determining conversion efficiency of wind-solar complementary water electrolysis hydrogen production
Kerboua et al. Recurrent neural network optimization for wind turbine condition prognosis
CN116613732A (en) Multi-element load prediction method and system based on SHAP value selection strategy
Xu et al. Prediction of the Wastewater's pH Based on Deep Learning Incorporating Sliding Windows.
CN115860232A (en) Steam load prediction method, system, electronic device and medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination