CN111310968B - LSTM neural network circulating hydrologic forecasting method based on mutual information - Google Patents

Publication number
CN111310968B
Authority
CN
China
Prior art keywords
mutual information
hydrologic
rainfall
model
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911329550.7A
Other languages
Chinese (zh)
Other versions
CN111310968A (en
Inventor
陈晨
梁肖旭
吕宁
周扬
肖凤林
李暨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Application filed by Xidian University
Priority to CN201911329550.7A
Publication of CN111310968A
Application granted
Publication of CN111310968B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A10/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
    • Y02A10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Game Theory and Decision Science (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Remote Sensing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the technical field of data processing and discloses a mutual-information-based LSTM neural network cyclic hydrologic forecasting method. Original data are screened and classified through mutual information analysis, and the hydrologic characteristics of rainfall, reservoir water level and flow are used as input features of a long short-term memory cyclic (LSTMC) prediction model. The structure of the LSTMC model is trained and determined by simulating the rainfall process, so that the model reflects the long-term evolution of a flood, and the output of the model is verified against actual flood data. The data set is analysed with a mutual-information-based method that captures the hydrologic characteristics of the current flow and of a longer preceding period and dynamically selects the input features of the model. Because the cyclic prediction model is built on a deep-learning LSTM neural network, it addresses the strong dependence of the hydrologic process on earlier conditions when forecasting flood-flow time series, and it captures effective features automatically.

Description

LSTM neural network circulating hydrologic forecasting method based on mutual information
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to an LSTM neural network circulating hydrologic forecasting method based on mutual information.
Background
Currently, the closest prior art is as follows. As one of the major natural disasters, flooding is an indispensable subject of hydrologic research. A flood is a water flow phenomenon in which the water volume of rivers and lakes increases rapidly or the water level rises rapidly, caused by natural factors such as rainstorms, rapid ice and snow melt, and storm surges. A flood disaster causes many hazards, including danger to human life, disruption of transportation and communication networks, damage to buildings and infrastructure, and loss of crops. Flood control and disaster reduction are therefore particularly important, and accurate and reliable flood forecasting is one of the most important means of improving the response time to flood disasters.
Conventional methods mainly include conceptual methods and physical methods, which are widely accepted and applied because they have clear hydrologic meanings. Although these methods enrich the theory of flood forecasting, the hydrologic process is nonlinear and the conditions of different river basins differ greatly, so it is difficult to simulate the complex physical relationships in the flood process of each basin. Data-driven methods have therefore been introduced into hydrologic forecasting. They construct and train models by analysing historical hydrologic data with various artificial intelligence algorithms, so that the models have a certain self-adjusting capacity and the accuracy of flood forecasting improves. At present, most data-driven hydrologic forecasting models are based on artificial neural network (ANN) algorithms, and their accuracy drops sharply for long-lead-time flood forecasts. On the one hand, in flow prediction the correlation between the current rainfall and the subsequent flow decreases as the forecast horizon increases. On the other hand, a disadvantage of these models when used for time series analysis is that all information about the order of the input sequence is lost. Because the hydrologic process is strongly affected by earlier conditions, a "memory" capability that retains previously computed information is needed. Long short-term memory (LSTM), a special kind of recurrent neural network (RNN), better handles the vanishing or exploding gradient problem that may occur in RNNs.
The first prior art is the causal analysis method: runoff forecasting is studied with a dynamic model method that comprehensively considers the influence of atmospheric circulation, meteorological factors and the physical environment of the underlying surface on hydrology, thereby forecasting flood flow. The principle of the causal analysis method is relatively simple, and it is widely used to study the relationship between atmospheric circulation and hydrologic factors, but it has the following disadvantages. Because meteorological data are random, identifying the effective meteorological data among a large amount of data becomes more uncertain for long-lead-time flow prediction, so accurate prediction becomes harder and the accuracy drops rapidly. In addition, the causal analysis method uses a large amount of meteorological data, places high demands on data precision, and lacks effective means of handling the randomness in hydrologic sequences.
The second prior art is the artificial neural network method: a data-driven hydrologic forecasting method that can model nonlinear and complex systems without an explicit physical explanation. Features are selected by analysing historical hydrologic data, the model is then constructed and trained, and the accuracy of flood-flow forecasting is improved by combining other methods (such as the autoregressive moving average (ARMA) method, tabu search and principal component analysis). On the one hand, all inputs and outputs of an artificial neural network are mutually independent, so all information about the order of the input sequence is lost when it is used for time series analysis and prediction, whereas the hydrologic process is strongly affected by various earlier factors. On the other hand, the artificial neural network method suffers from slow convergence, a tendency to become trapped in local optima, and overfitting.
In summary, the problems of the prior art are: traditional models have difficulty simulating the complex physical relationships in the hydrologic process; the inputs and outputs of current artificial neural networks are mutually independent, whereas the hydrologic process is strongly affected by various earlier factors; and the accuracy of long-lead-time flood forecasting drops rapidly.
The difficulty of solving these technical problems lies in the same points. Because the hydrologic process is strongly affected by various earlier factors, a suitable algorithm is needed to select its sequence features. LSTM can automatically and effectively capture the features of an input sequence when handling a time series prediction problem. For long-lead-time flow prediction, a cyclic prediction procedure is adopted, and according to the experimental results it improves the accuracy of flood forecasting.
The significance of solving these technical problems: based on an application in the county river basin, the invention analyses actual hydrologic data, studies the characteristics of the hydrologic data and the forecasting performance of the model, supports both single-point forecasting and longer-lead-time flood-flow forecasting, and is expected to provide a useful reference for similar projects.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention provides an LSTM neural network circulating hydrologic forecasting method based on mutual information.
The invention is realized as follows. The LSTM neural network circulating hydrologic forecasting method based on mutual information comprises the following steps:
first, screening and classifying the original data through mutual information analysis, and taking the hydrologic characteristics of rainfall, reservoir water level and flow as input features of a long short-term memory cyclic (LSTMC) prediction model;
second, training and determining the structure of the LSTMC model by simulating the rainfall process, so as to reflect the long-term evolution of the flood;
third, verifying the output of the model using actual flood data.
Further, the first step specifically comprises: missing rainfall data are interpolated using the inverse distance weighting method, and missing flow and reservoir water level data are filled in using quadratic interpolation; then, using the processed rainfall and flow data at equal time intervals, the mutual information between the rainfall of each rainfall station and the flow of the hydrologic station is calculated as follows:
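$$MI(X,Y) = H(X) + H(Y) - H(X,Y)$$
$$H(X) = -\sum_{i=1}^{N} p(x_i)\log p(x_i), \qquad H(X,Y) = -\sum_{x}\sum_{y} p_{XY}(x,y)\log p_{XY}(x,y)$$
where $X$ is a discrete random variable with sample size $N$, taking the values $x_1, x_2, \ldots, x_N$ with probabilities $p(x_1), p(x_2), \ldots, p(x_N)$; $H(X)$ and $H(Y)$ are the entropies (information content) of $X$ and $Y$ respectively, with $H(Y)$ defined in the same way as $H(X)$; $X$ and $Y$ refer to rainfall or flow in the invention; $MI(X,Y)$ is the mutual information of $X$ and $Y$; $H(X,Y)$ is the joint entropy of $X$ and $Y$; and $p_{XY}(x,y)$ is the joint probability of $x$ and $y$.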
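As a minimal sketch of the mutual information calculation above, the entropies can be estimated from sample frequencies after discretizing the rainfall and flow series. The equal-width binning, the default bin count and the function names are illustrative assumptions and are not taken from the patent.

```python
import math
from collections import Counter

def entropy(counts, n):
    """Shannon entropy (in nats) from a Counter of occurrences over n samples."""
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def discretize(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1] (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant series
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=4):
    """MI(X, Y) = H(X) + H(Y) - H(X, Y), estimated on binned samples."""
    n = len(x)
    bx, by = discretize(x, bins), discretize(y, bins)
    return (entropy(Counter(bx), n) + entropy(Counter(by), n)
            - entropy(Counter(zip(bx, by)), n))
```

With this estimator, a series compared with itself yields its own entropy, and two independent series yield a value near zero, which is the behaviour the station ranking relies on.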
Further, the weight of each rainfall station is obtained from the magnitude of its mutual information, where $r_i$ is the mutual information between the $i$-th rainfall station and the flow, $m$ is the total number of rainfall stations, and $\alpha_i$ is the weight of the $i$-th rainfall station. The rainfall of the individual rainfall stations is then combined as a weighted sum.
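The weighting step can be sketched as follows. The weight expression itself is not reproduced in this text, so the sketch assumes the simplest rule consistent with the description, namely that each weight is the station's share of the total mutual information, $\alpha_i = r_i / \sum_{j=1}^{m} r_j$. That rule and the function names are assumptions.

```python
def station_weights(mi_values):
    """Assumed rule: alpha_i = r_i / sum_j r_j, so the weights sum to 1."""
    total = sum(mi_values)
    if total <= 0:
        raise ValueError("mutual information values must have a positive sum")
    return [r / total for r in mi_values]

def weighted_rainfall(station_rain, weights):
    """Weighted sum over stations for each time step.

    station_rain[i][t] is the rainfall of station i in hour t.
    """
    steps = len(station_rain[0])
    return [sum(w * series[t] for w, series in zip(weights, station_rain))
            for t in range(steps)]
```

A station whose rainfall carries more information about the flow therefore contributes more to the combined rainfall series fed to the model.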
Further, a reference moment is selected in the processed hydrologic data, and the mutual information between the current flow and the rainfall of different preceding periods is analysed for each year, with a period length of 1 hour. From the resulting set of mutual information values between rainfall and flow, $\rho = [\rho_1, \rho_2, \ldots, \rho_h]$, the lag $k$ at which the mutual information is largest is determined, and the rainfall variation within the time difference $k$, obtained through simulation, is used as part of the input features of the model:
$$\Delta x_i = x_i - x_{i-p}, \quad 0 \le p \le k$$
where $x_i$ is the rainfall of the $i$-th hour and $x_{i-p}$ is the rainfall $p$ hours before the $i$-th hour. This gives a set of input feature variables $x(t), x(t-1), \ldots, x(t-k)$.
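The lag selection and the rainfall difference features described above can be sketched as follows. The sketch assumes that the $j$-th mutual information value corresponds to a lag of $j$ hours; that indexing convention and the function names are assumptions.

```python
def best_lag(rho):
    """Return the lag k (in hours) with the largest mutual information.

    rho[j] is assumed to be the MI between the current flow and the
    rainfall (j + 1) hours earlier.
    """
    return max(range(len(rho)), key=lambda j: rho[j]) + 1

def rainfall_differences(x, i, k):
    """Delta x_i = x_i - x_{i-p} for p = 0, 1, ..., k."""
    if i - k < 0:
        raise ValueError("not enough history for the requested lag")
    return [x[i] - x[i - p] for p in range(k + 1)]

def lagged_inputs(x, t, k):
    """The feature window x(t), x(t-1), ..., x(t-k)."""
    if t - k < 0:
        raise ValueError("not enough history for the requested lag")
    return [x[t - p] for p in range(k + 1)]
```

For example, `best_lag([0.1, 0.4, 0.9, 0.3])` selects a 3-hour lag, and the window passed to the model then spans the current hour and the three hours before it.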
Further, the acquired input feature data and the prediction data set are normalized using min-max normalization:
$$x_i^{*} = \frac{x_i - X_{\min}}{X_{\max} - X_{\min}}$$
where $x_i^{*}$ is the $i$-th element after normalization, $x_i$ is the $i$-th element of the sequence to be processed, $X_{\max}$ and $X_{\min}$ are the maximum and minimum values of the sequence, $1 \le i \le N$, and $N$ is the total number of data.
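A short sketch of min-max normalization and its inverse, which is needed to convert normalized predictions back to flow units. The handling of a constant sequence and the function names are assumptions.

```python
def min_max_normalize(seq):
    """Scale a sequence to [0, 1] using its own minimum and maximum."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        # Assumption: a constant sequence maps to all zeros.
        return [0.0 for _ in seq]
    return [(x - lo) / (hi - lo) for x in seq]

def min_max_denormalize(normalized, lo, hi):
    """Map normalized values back to the original range [lo, hi]."""
    return [v * (hi - lo) + lo for v in normalized]
```

In practice the minimum and maximum would be taken from the training data and reused for the test data, so that the test set does not influence the scaling.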
Further, the second step constructs the long short-term memory cyclic (LSTMC) prediction model. Each LSTM unit contains three gates that control its state: an input gate, a forget gate and an output gate. The input gate selectively records new information into the unit state, the forget gate selectively discards information from the unit state, and the output gate selectively outputs information from the unit state;
based on the model input features obtained above, once the flow $y_t$ at time $t$ has been predicted, the result is used as part of the feature input of the next neural network, and this operation is repeated so that the output of each network becomes part of the subsequent input. The resulting LSTMC prediction model consists of $n$ identical LSTM neural network structures, each of which is a predictor, and finally produces a series of predicted results; $x(t), x(t-1), \ldots, x(t-k)$ and $p(t), p(t-1), \ldots, p(t-k)$ are respectively the rainfall and the reservoir water level from time $t-k$ to time $t$, and $y(t), y(t+1), \ldots, y(t+n)$ are the flow values predicted at times $t, t+1, \ldots, t+n$.
Further, the input gate:
$$i_t = \sigma(u_i x_t + w_i h_{t-1} + b_i)$$
Forget gate:
$$f_t = \sigma(u_f x_t + w_f h_{t-1} + b_f)$$
Output gate:
$$o_t = \sigma(u_o x_t + w_o h_{t-1} + b_o)$$
Candidate state:
$$\tilde{c}_t = \tanh(u_c x_t + w_c h_{t-1} + b_c)$$
Cell state:
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
Unit output:
$$h_t = o_t \odot \tanh(c_t)$$
where $i_t$, $f_t$ and $o_t$ are the state vectors of the input gate, forget gate and output gate at time $t$; $c_{t-1}$ and $c_t$ are the state vectors of the LSTM unit at times $t-1$ and $t$; $\tilde{c}_t$ is the update information of the LSTM unit state at time $t$; $h_{t-1}$ and $h_t$ are the output vectors of the unit at times $t-1$ and $t$; $x_t$ is the input of the unit at time $t$; $u_i, u_f, u_o, u_c$ are the weight matrices applied to the input $x_t$ for the input gate, forget gate, output gate and LSTM unit; $w_i, w_f, w_o, w_c$ are the weight matrices applied to the previous output $h_{t-1}$ for the input gate, forget gate, output gate and LSTM unit; $b_i, b_f, b_o, b_c$ are the corresponding bias vectors; and $\odot$ denotes element-wise multiplication.
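To make the gate equations concrete, the following sketch performs one LSTM time step in plain Python. It assumes the standard LSTM form for the candidate-state and cell-state updates, represents matrices as lists of rows, and uses a hypothetical `params` dictionary keyed by gate name; none of these names come from the patent.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(gate, x, h):
    """Compute u @ x + w @ h + b for one gate; gate = (u, w, b)."""
    u, w, b = gate
    return [sum(a * v for a, v in zip(u_row, x))
            + sum(a * v for a, v in zip(w_row, h)) + b_j
            for u_row, w_row, b_j in zip(u, w, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; params maps 'i', 'f', 'o', 'c' to (u, w, b)."""
    i_t = [sigmoid(z) for z in affine(params["i"], x_t, h_prev)]    # input gate
    f_t = [sigmoid(z) for z in affine(params["f"], x_t, h_prev)]    # forget gate
    o_t = [sigmoid(z) for z in affine(params["o"], x_t, h_prev)]    # output gate
    c_hat = [math.tanh(z) for z in affine(params["c"], x_t, h_prev)]  # candidate
    # Keep part of the old state and add part of the candidate state.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate state to 0, so the cell state is simply halved at each step; this makes a convenient sanity check.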
Further, the third step of verifying the output of the model using actual flood data specifically comprises: dividing the normalized data set into a training set (85%) and a test set (15%); partitioning the training set by 10-fold cross-validation; training the model with the Adam algorithm to determine the weights of each layer; and evaluating the prediction accuracy of the final model on the test set.
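The data partitioning described above can be sketched as follows. The sketch assumes a chronological (unshuffled) 85/15 split and contiguous folds, which the text does not specify; the function names are illustrative, and model training with Adam is left to whichever deep learning framework is used.

```python
def split_train_test(samples, train_ratio=0.85):
    """Split in order: the first 85% for training, the rest for testing."""
    cut = int(round(len(samples) * train_ratio))
    return samples[:cut], samples[cut:]

def k_fold_indices(n, k=10):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over n samples."""
    sizes = [n // k + (1 if j < n % k else 0) for j in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size
```

Each training sample is used for validation exactly once across the ten folds, while the test set stays untouched until the final evaluation.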
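Further, the relative error and the root mean square error (RMSE) are used for evaluation. The relative error is the deviation of the predicted value $y_{pred}$ from the measured value $y_{real}$, expressed relative to $y_{real}$, and the root mean square error is:
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{real}(i) - y_{pred}(i)\right)^2}$$
where $y_{real}(i)$ and $y_{real}$ are measured values, $y_{pred}(i)$ and $y_{pred}$ are model predictions, and $n$ is the total number of predicted samples. For flood peak forecasting, 20% of the measured peak flow is taken as the allowable error; for peak time forecasting, 30% of the interval between the forecast time and the actual peak time is taken as the allowable error.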
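The two evaluation metrics can be sketched as follows. The relative error is written here in absolute form and as a percentage, which is an assumption about the exact expression; the function names are illustrative.

```python
import math

def relative_error(y_pred, y_real):
    """Relative error in percent (absolute form, assumed)."""
    return abs(y_pred - y_real) / abs(y_real) * 100.0

def rmse(y_real, y_pred):
    """Root mean square error over paired measured and predicted values."""
    n = len(y_real)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(y_real, y_pred)) / n)
```

The relative error is typically applied to single quantities such as the flood peak, while the RMSE summarizes the fit over an entire flood process.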
The invention further aims to provide an intelligent terminal for realizing the LSTM neural network circulation hydrologic forecasting method based on the mutual information.
In summary, the advantages and positive effects of the invention are as follows. Original data are screened and classified through mutual information analysis, and hydrologic characteristics such as rainfall, reservoir water level and flow are used as input features of the long short-term memory cyclic (LSTMC) prediction model. The structure of the LSTMC model is trained and determined by simulating the rainfall process so as to reflect the long-term evolution of floods, and the output of the model is verified using actual flood data. The model has high forecasting accuracy; in particular, the accuracy of the forecast peak time and peak value during the flood peak stage is greatly improved.
The invention uses a mutual information algorithm to analyse the data set, which makes the selection of model input features dynamic and adaptive: the method captures the hydrologic characteristics of the current flow and of a longer preceding period and selects the input features accordingly. The flood forecasting model is designed around LSTM cyclic prediction and benefits from the ability of LSTM to capture effective features automatically, so it addresses the strong dependence of the hydrologic process on earlier conditions when forecasting flood-flow time series. The invention supports both single-point prediction and long-lead-time flood-flow prediction, achieves a better forecasting effect than traditional linear regression and neural network models, and offers stable performance and high portability.
Drawings
Fig. 1 is a flowchart of a method for forecasting circulation hydrologic of an LSTM neural network based on mutual information provided by an embodiment of the present invention.
Fig. 2 is a flowchart of an implementation method of the LSTM neural network circulation hydrologic forecasting method based on mutual information provided by the embodiment of the invention.
Fig. 3 is a schematic structural diagram of an LSTM unit according to an embodiment of the present invention.
Fig. 4 is a schematic structural diagram of a predictor according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of a result of preprocessing county data according to an embodiment of the present invention.
FIG. 6 is a schematic diagram of performance of an evaluation model provided by an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Aiming at the problems existing in the prior art, the invention provides an LSTM neural network circulating hydrologic forecasting method based on mutual information, and the invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 1, the LSTM neural network circulation hydrologic forecasting method based on mutual information provided by the embodiment of the invention includes the following steps:
S101: screening and classifying the original data through mutual information analysis, and taking hydrologic characteristics such as rainfall, reservoir water level and flow as input features of the long short-term memory cyclic (LSTMC) prediction model;
S102: training and determining the structure of the LSTMC model by simulating the rainfall process, so as to reflect the long-term evolution of the flood;
S103: verifying the output of the model using actual flood data.
The technical scheme of the invention is further described below with reference to the accompanying drawings.
As shown in fig. 2, the LSTM neural network circulation hydrologic forecasting method based on mutual information provided by the embodiment of the invention includes the following steps:
(1) Data preprocessing and analysis: the data collected by each measuring station, such as flow, reservoir water level and rainfall, are each averaged over equal time periods with a period length of 1 hour. Because of equipment failure, poor field conditions, program maintenance and similar causes, the collected data may contain missing values. To handle them, missing rainfall data are interpolated using the inverse distance weighting method, and missing flow and reservoir water level data are filled in using quadratic interpolation. Using the processed rainfall and flow data at equal time intervals, the mutual information between the rainfall of each rainfall station and the flow of the hydrologic station is then calculated as follows:
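$$MI(X,Y) = H(X) + H(Y) - H(X,Y)$$
$$H(X) = -\sum_{i=1}^{N} p(x_i)\log p(x_i), \qquad H(X,Y) = -\sum_{x}\sum_{y} p_{XY}(x,y)\log p_{XY}(x,y)$$
where $X$ is a discrete random variable with sample size $N$, taking the values $x_1, x_2, \ldots, x_N$ with probabilities $p(x_1), p(x_2), \ldots, p(x_N)$; $H(X)$ and $H(Y)$ are the entropies (information content) of $X$ and $Y$ respectively, with $H(Y)$ defined in the same way as $H(X)$; $X$ and $Y$ refer to rainfall or flow in the invention; $MI(X,Y)$ is the mutual information of $X$ and $Y$; $H(X,Y)$ is the joint entropy of $X$ and $Y$; and $p_{XY}(x,y)$ is the joint probability of $x$ and $y$.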
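The gap filling in step (1) precedes the mutual information calculation, and its inverse distance weighting part can be sketched as follows. The power parameter of 2, the planar station coordinates and the function name are assumptions; the patent only names the method.

```python
import math

def idw_fill(target_xy, neighbours, power=2):
    """Estimate a missing rainfall value at target_xy by inverse distance weighting.

    neighbours is a list of ((x, y), rainfall) pairs from stations that
    have a valid reading for the same hour.
    """
    num = den = 0.0
    for (x, y), rain in neighbours:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0:
            return rain  # a co-located station supplies the value directly
        weight = 1.0 / d ** power
        num += weight * rain
        den += weight
    return num / den
```

Nearby stations dominate the estimate: with readings of 10 and 20 at distances 1 and 2, the filled value is 12 rather than the plain average of 15.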
The weight of each rainfall station is obtained from the magnitude of its mutual information, where $r_i$ is the mutual information between the $i$-th rainfall station and the flow, $m$ is the total number of rainfall stations, and $\alpha_i$ is the weight of the $i$-th rainfall station. The rainfall of the individual rainfall stations is then combined as a weighted sum.
The most important step in developing the model is the selection of input features. A reference moment is selected in the processed hydrologic data, and the mutual information between the current flow and the rainfall of different preceding periods is analysed for each year, with a period length of 1 hour. From the resulting set of mutual information values between rainfall and flow, $\rho = [\rho_1, \rho_2, \ldots, \rho_h]$, the lag $k$ at which the mutual information is largest is determined, and the rainfall variation within the time difference $k$, obtained through simulation, is then used as part of the input features of the model:
$$\Delta x_i = x_i - x_{i-p}, \quad 0 \le p \le k$$
where $x_i$ is the rainfall of the $i$-th hour and $x_{i-p}$ is the rainfall $p$ hours before the $i$-th hour. The invention thus obtains a set of input feature variables $x(t), x(t-1), \ldots, x(t-k)$.
The acquired input feature data and the prediction data set are normalized using min-max normalization:
$$x_i^{*} = \frac{x_i - X_{\min}}{X_{\max} - X_{\min}}$$
where $x_i^{*}$ is the $i$-th element after normalization, $x_i$ is the $i$-th element of the sequence to be processed, $X_{\max}$ and $X_{\min}$ are the maximum and minimum values of the sequence, $1 \le i \le N$, and $N$ is the total number of data.
(2) Constructing the long short-term memory cyclic (LSTMC) prediction model:
The LSTMC model comprises an input layer, a hidden layer and an output layer. Each LSTM unit in the hidden layer contains three gates that control its state: an input gate, a forget gate and an output gate. The input gate selectively records new information into the unit state, the forget gate selectively discards information from the unit state, and the output gate selectively outputs information from the unit state. The LSTM unit is shown in fig. 3:
Input gate:
$$i_t = \sigma(u_i x_t + w_i h_{t-1} + b_i)$$
Forget gate:
$$f_t = \sigma(u_f x_t + w_f h_{t-1} + b_f)$$
Output gate:
$$o_t = \sigma(u_o x_t + w_o h_{t-1} + b_o)$$
Candidate state:
$$\tilde{c}_t = \tanh(u_c x_t + w_c h_{t-1} + b_c)$$
Cell state:
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
Unit output:
$$h_t = o_t \odot \tanh(c_t)$$
where $i_t$, $f_t$ and $o_t$ are the state vectors of the input gate, forget gate and output gate at time $t$; $c_{t-1}$ and $c_t$ are the state vectors of the LSTM unit at times $t-1$ and $t$; $\tilde{c}_t$ is the update information of the LSTM unit state at time $t$; $h_{t-1}$ and $h_t$ are the output vectors of the unit at times $t-1$ and $t$; $x_t$ is the input of the unit at time $t$; $u_i, u_f, u_o, u_c$ are the weight matrices applied to the input $x_t$ for the input gate, forget gate, output gate and LSTM unit; $w_i, w_f, w_o, w_c$ are the weight matrices applied to the previous output $h_{t-1}$ for the input gate, forget gate, output gate and LSTM unit; $b_i, b_f, b_o, b_c$ are the corresponding bias vectors; and $\odot$ denotes element-wise multiplication.
Based on the model input features obtained above, once the flow $y_t$ at time $t$ has been predicted, the result is used as part of the feature input of the next neural network, and this operation is repeated so that the output of each network becomes part of the subsequent input. The resulting LSTMC prediction model consists of $n$ identical LSTM neural network structures, each of which is a predictor, and finally produces a series of predicted results. The overall structure is shown in fig. 4, in which $x(t), x(t-1), \ldots, x(t-k)$ and $p(t), p(t-1), \ldots, p(t-k)$ are respectively the rainfall and the reservoir water level from time $t-k$ to time $t$, and $y(t), y(t+1), \ldots, y(t+n)$ are the flow values predicted at times $t, t+1, \ldots, t+n$.
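The cyclic prediction scheme can be sketched as a loop in which each predicted flow is appended to the features of the next predictor. Here `predictor` stands for any trained one-step model and is a hypothetical callable; keeping the rainfall and water level windows fixed across steps is a simplifying assumption, since in the method the rainfall process is simulated.

```python
def cyclic_forecast(predictor, rain_window, level_window, last_flow, n_steps):
    """Chain n_steps identical predictors.

    Each step receives the rainfall window, the reservoir level window and
    the most recent flow (measured for the first step, predicted afterwards).
    """
    flows = []
    previous = last_flow
    for _ in range(n_steps):
        features = list(rain_window) + list(level_window) + [previous]
        previous = predictor(features)  # predicted flow feeds the next step
        flows.append(previous)
    return flows
```

Because every step consumes the previous prediction, errors can accumulate over long horizons, which is why the per-step predictor quality matters for long-lead-time forecasts.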
(3) Model training and evaluation:
The normalized data set is divided into a training set (85%) and a test set (15%). The training set is partitioned by 10-fold cross-validation, the model is trained with the Adam algorithm to determine the weights of each layer, and the prediction accuracy of the final model is evaluated on the test set. To evaluate the performance of the proposed model and of the data processing method, and in accordance with the national standard for hydrologic forecasting, the invention uses the relative error and the root mean square error (RMSE). The relative error is the deviation of the predicted value $y_{pred}$ from the measured value $y_{real}$, expressed relative to $y_{real}$, and the root mean square error is:
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_{real}(i) - y_{pred}(i)\right)^2}$$
where $y_{real}(i)$ and $y_{real}$ are measured values, $y_{pred}(i)$ and $y_{pred}$ are model predictions, and $n$ is the total number of predicted samples. For flood peak forecasting, 20% of the measured peak flow is taken as the allowable error. For peak time forecasting, 30% of the interval between the forecast time and the actual peak time is taken as the allowable error. For accuracy assessment, the overall accuracy of multiple forecasts is expressed by the pass rate, that is, the percentage of qualified forecasts among all forecasts.
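The allowable-error checks and the pass rate can be sketched as follows. Times are assumed to be expressed in hours, and `start_time` is assumed to be the moment from which the interval to the actual peak is measured; both conventions and the function names are assumptions.

```python
def peak_qualified(pred_peak, real_peak, tolerance=0.20):
    """Peak forecast passes if its error is within 20% of the measured peak flow."""
    return abs(pred_peak - real_peak) <= tolerance * real_peak

def peak_time_qualified(pred_time, real_time, start_time, tolerance=0.30):
    """Peak-time forecast passes if its error is within 30% of the interval
    from start_time to the actual peak time (all in hours)."""
    return abs(pred_time - real_time) <= tolerance * (real_time - start_time)

def pass_rate(flags):
    """Percentage of qualified forecasts among all forecasts."""
    return 100.0 * sum(flags) / len(flags)
```

For instance, nine qualified peak forecasts out of ten flood events give a pass rate of 90%.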
The technical effects of the present invention will be described in detail with reference to experiments.
To verify the prediction effect of the invention, data from each rainfall station and hydrologic station in the target river basin (county) for 2011-2018 were selected and analyzed according to the data processing method described above; the data preprocessing result is shown in fig. 5. The features were input into the model for training, and the trained model was obtained by adjusting the parameters. The model was then tested and, if its performance reached the standard, stored. Finally, the established LSTMC prediction model was compared with a back-propagation neural network cycle (BPNNC) prediction model and a linear regression cycle (LRC) prediction model; the prediction results are shown in table 1.
TABLE 1 forecast results
The results show that the LSTMC model clearly outperforms the BPNNC and LRC models in terms of RMSE. To better illustrate the effectiveness of the LSTMC prediction model, 10 flood events were selected from the test set to assess model performance, as shown in fig. 6. For each of the 10 flood events, the start time of flood peak formation, the actual peak arrival time, the predicted peak arrival time, the actual peak, the predicted peak, the peak time error, the peak error, and the root mean square error of the flood process are listed in table 2.
TABLE 2
The results show that the flood peak arrival time errors all fall within the 30% allowable error, giving a pass rate of 100%, while the pass rate for peak flow errors against the 20% allowable error is 90%. The model of the invention therefore produces a good forecasting effect and meets the requirements of flood forecasting.
The invention obtains the input features of the model by mutual information analysis. The algorithm used for feature selection and data analysis may also be chosen by comparing a covariance method, a correlation coefficient method, a regression analysis method and the like, according to the data characteristics of different river basins.
It should be noted that embodiments of the present invention may be realized in hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those of ordinary skill in the art will appreciate that the apparatus and methods described above may be implemented using computer-executable instructions and/or embodied in processor control code, for example provided on a carrier medium such as a magnetic disk, CD or DVD-ROM, a programmable memory such as a read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The device of the present invention and its modules may be implemented by hardware circuitry, such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices; by software executed by various types of processors; or by a combination of the above hardware circuitry and software, such as firmware.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (8)

1. An LSTM neural network circulating hydrologic forecasting method based on mutual information, characterized by comprising the following steps:
firstly, screening and classifying original data through mutual information analysis, and taking the rainfall, reservoir water level and flow hydrologic features as input features of a long short-term memory cycle (LSTMC) prediction model;
secondly, training and determining the structure of the LSTMC model through a rainfall simulation process, so as to reflect the long-term change of the flood;
thirdly, verifying the output of the model by using actual flood data;
wherein the first step, screening and classifying the original data through mutual information analysis and taking the rainfall, reservoir water level and flow hydrologic features as the input features of the long short-term memory cycle prediction model, specifically comprises: interpolating the rainfall data by an inverse distance weighting method, and filling gaps in the flow and reservoir water level by a quadratic interpolation method; and, using the processed rainfall and flow data at equal time intervals, calculating by the mutual information method the mutual information between the rainfall of each rainfall station and the flow of the hydrologic station, the mutual information being calculated as follows:
MI(X,Y)=H(X)+H(Y)-H(X,Y);
wherein X is a discrete random variable with sample size N, taking the values x_1, x_2, …, x_N with probabilities p(x_1), p(x_2), …, p(x_N); H(X) and H(Y) are the entropy, or information content, of X and Y respectively; X and Y denote rainfall or flow in the invention; MI(X, Y) is the mutual information of X and Y, H(X, Y) is the joint entropy of X and Y, and p_XY(x, y) is the joint probability of X and Y;
the weight of each rainfall station is obtained according to the magnitude of the mutual information, and the expression is as follows:
where r_i is the mutual information between the i-th rainfall station and the flow, m is the total number of rainfall stations, and α_i is the weight of the i-th rainfall station; the rainfall of the rainfall stations is then weighted and summed.
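As an illustration of the mutual information and weighting step in claim 1, the following sketch estimates MI(X, Y) = H(X) + H(Y) - H(X, Y) from binned series. Several points are assumptions made here and are not stated in the text: the equal-width binning, the base-2 logarithm, and in particular the weight expression α_i = r_i / (r_1 + … + r_m), which is one plausible reading since the expression itself is not reproduced. All names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def discretize(values: Sequence[float], bins: int = 4) -> List[int]:
    """Map continuous values to equal-width bin indices (an assumed choice)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0   # avoid zero width for a constant series
    return [min(int((v - lo) / width), bins - 1) for v in values]


def entropy(symbols: Sequence) -> float:
    """Shannon entropy H = -sum p * log2(p) from empirical frequencies."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def mutual_information(x: Sequence[float], y: Sequence[float], bins: int = 4) -> float:
    """MI(X, Y) = H(X) + H(Y) - H(X, Y) on the discretized series."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    return entropy(dx) + entropy(dy) - entropy(list(zip(dx, dy)))


def station_weights(mi_values: Sequence[float]) -> List[float]:
    """Assumed weighting: alpha_i = r_i / sum_j r_j."""
    total = sum(mi_values)
    return [r / total for r in mi_values]


# Toy data: station A tracks the flow, station B alternates independently of it.
flow = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
rain_a = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
rain_b = [3.0, 0.0, 3.0, 0.0, 3.0, 0.0, 3.0, 0.0]
r = [mutual_information(rain_a, flow), mutual_information(rain_b, flow)]
alpha = station_weights(r)
# Weighted areal rainfall at each time step.
areal_rain = [alpha[0] * a + alpha[1] * b for a, b in zip(rain_a, rain_b)]
```

With real records the histogram estimate is sensitive to the number of bins and to sample size, so the bin count would need to be chosen for the data at hand.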
2. The LSTM neural network circulating hydrologic forecasting method based on mutual information according to claim 1, wherein a given moment is selected in the processed hydrologic data, and the mutual information between the current flow in each year and the rainfall in different preceding time periods is analyzed, the unit of time period length being 1 hour; from the resulting set of mutual information values between rainfall and flow, ρ = [ρ_1, ρ_2, …, ρ_h], the moment k corresponding to the maximum mutual information is determined, and the rainfall change within the time difference range k is taken, through simulation, as part of the input features of the model, according to the formula:
Δx_i = x_i - x_{i-p}, (0 ≤ p ≤ k);
where x_i is the rainfall at the i-th hour and x_{i-p} is the rainfall p hours before the i-th hour, thereby obtaining a set of input feature variables x(t), x(t-1), …, x(t-k).
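The lag selection and differencing of claim 2 can be illustrated with a short sketch. It assumes that ρ_j is the mutual information at a lag of j hours, so that k is the 1-based position of the largest entry; the function names and the numbers are hypothetical.

```python
from typing import List, Sequence


def best_lag(rho: Sequence[float]) -> int:
    """Return the lag k (in hours, 1-based) with the largest mutual information."""
    return max(range(len(rho)), key=lambda j: rho[j]) + 1


def rainfall_differences(x: Sequence[float], i: int, k: int) -> List[float]:
    """Delta x_i = x_i - x_{i-p} for p = 0..k (requires i >= k)."""
    if i < k:
        raise ValueError("not enough history before hour i")
    return [x[i] - x[i - p] for p in range(k + 1)]


# Hypothetical mutual information values for lags of 1..5 hours.
rho = [0.12, 0.31, 0.47, 0.29, 0.10]
k = best_lag(rho)                      # lag with maximum MI
hourly_rain = [0.0, 1.0, 3.0, 6.0, 4.0, 2.0]
deltas = rainfall_differences(hourly_rain, i=5, k=k)
```

Here the maximum falls at a lag of 3 hours, so the rainfall changes over p = 0, 1, 2, 3 hours are produced for hour 5.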
3. The LSTM neural network circulating hydrologic forecasting method based on mutual information according to claim 2, wherein the acquired input feature data and the prediction data set are normalized using Min-max normalization, according to the following formula:
wherein the formula yields the i-th element after normalization processing; x_i is the i-th element in the sequence to be processed; X_max and X_min are respectively the maximum value and the minimum value in the sequence; and 1 ≤ i ≤ N, where N is the total number of data in the data set.
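Min-max normalization as named in claim 3 is conventionally (x_i - X_min) / (X_max - X_min). The sketch below assumes that conventional form; the handling of a constant sequence and the inverse mapping are additions made here for illustration, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(seq: Sequence[float]) -> Tuple[List[float], float, float]:
    """Scale a sequence to [0, 1]; also return (X_min, X_max) for inversion."""
    x_min, x_max = min(seq), max(seq)
    span = x_max - x_min
    if span == 0:
        # Constant sequence: map everything to 0 to avoid division by zero.
        return [0.0 for _ in seq], x_min, x_max
    return [(x - x_min) / span for x in seq], x_min, x_max


def denormalize(scaled: Sequence[float], x_min: float, x_max: float) -> List[float]:
    """Map normalized predictions back to physical units."""
    return [s * (x_max - x_min) + x_min for s in scaled]


flows = [120.0, 80.0, 200.0, 160.0]
scaled, lo, hi = min_max_normalize(flows)
restored = denormalize(scaled, lo, hi)
```

Keeping X_min and X_max from the training data allows predicted flows to be converted back to physical units before the error criteria are applied.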
4. The LSTM neural network circulating hydrologic forecasting method based on mutual information according to claim 1, wherein the second step constructs a long short-term memory cycle (LSTMC) prediction model: the LSTM unit comprises three gates that control its state, namely an input gate, a forget gate and an output gate, wherein the input gate selectively records new information into the unit state, the forget gate selectively forgets information in the unit state, and the output gate selectively outputs information from the unit state;
according to the obtained model input features, once the flow y_t at time t has been predicted, the result is used as part of the feature input of the next-layer neural network, and the operation is repeated, with the output of each layer used as part of the subsequent input, thereby constructing the LSTMC prediction model, which consists of n identical LSTM neural network structures, each structure being a predictor, and which finally generates a series of predicted results.
5. The LSTM neural network circulating hydrologic forecasting method based on mutual information according to claim 4, wherein:
input gate:
i_t = σ(u_i * x_t + w_i * h_{t-1} + b_i);
forget gate:
f_t = σ(u_f * x_t + w_f * h_{t-1} + b_f);
output gate:
o_t = σ(u_o * x_t + w_o * h_{t-1} + b_o);
cell state:
h_t = o_t * tanh(c_t);
wherein i_t, f_t, o_t are the state vectors of the input gate, the forget gate and the output gate at time t respectively; c_{t-1} and c_t are the state vectors of the LSTM unit at times t-1 and t respectively; c̃_t is the update information of the LSTM unit state at time t; h_{t-1} and h_t are the output state vectors at times t-1 and t respectively, and x_t is the input of the unit at time t; u_i, u_f, u_o are the weight matrices between the input gate, the forget gate, the output gate and the hidden layer respectively, and u_c is a weight matrix of the LSTM unit; w_i, w_f, w_o are the weight matrices between the input gate, the forget gate, the output gate and the input layer respectively, and w_c is a weight matrix of the LSTM unit; b_i, b_f, b_o, b_c are the bias vectors of the input gate, the forget gate, the output gate and the LSTM unit respectively.
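A single LSTM time step following the gate equations of claim 5 can be sketched as below, using one hidden unit and scalar weights so that only the standard library is needed. The candidate update and cell-state lines are not written out in the text; the sketch assumes the standard LSTM forms tanh(u_c * x_t + w_c * h_{t-1} + b_c) and c_t = f_t * c_{t-1} + i_t * (candidate). The parameter values are arbitrary and the names are hypothetical.

```python
import math
from typing import Dict, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t: float, h_prev: float, c_prev: float,
              p: Dict[str, float]) -> Tuple[float, float]:
    """One LSTM time step for a single hidden unit (scalars for brevity)."""
    i_t = sigmoid(p["u_i"] * x_t + p["w_i"] * h_prev + p["b_i"])   # input gate
    f_t = sigmoid(p["u_f"] * x_t + p["w_f"] * h_prev + p["b_f"])   # forget gate
    o_t = sigmoid(p["u_o"] * x_t + p["w_o"] * h_prev + p["b_o"])   # output gate
    # Assumed standard forms, not spelled out in the claim text:
    c_tilde = math.tanh(p["u_c"] * x_t + p["w_c"] * h_prev + p["b_c"])
    c_t = f_t * c_prev + i_t * c_tilde                             # cell state
    h_t = o_t * math.tanh(c_t)                                     # unit output
    return h_t, c_t


NAMES = ("u_i", "w_i", "b_i", "u_f", "w_f", "b_f",
         "u_o", "w_o", "b_o", "u_c", "w_c", "b_c")
# Arbitrary illustrative parameters, not trained values.
params = {name: 0.5 for name in NAMES}
h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.1]:        # a short normalized input sequence
    h, c = lstm_step(x, h, c, params)
```

With all parameters set to zero every gate equals 0.5 and the candidate is 0, so the cell state is simply halved at each step; this gives a quick sanity check of the update.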
6. The LSTM neural network circulating hydrologic forecasting method based on mutual information according to claim 1, wherein the third step, verifying the output of the model by using actual flood data, specifically comprises: dividing the normalized data set into a training set and a test set in a ratio of 85% to 15%, dividing the training set by a 10-fold cross-validation method, carrying out model training by using the Adam algorithm to determine the weights of each layer in the model, and evaluating the prediction accuracy of the final model on the test set.
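The data partitioning in claim 6 can be sketched as follows. The text does not say whether the split is chronological or random, so the time-ordered split and contiguous folds used here are assumptions; the Adam training itself is not shown, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def chronological_split(samples: Sequence, train_frac: float = 0.85) -> Tuple[list, list]:
    """Split into training and test sets (85% / 15%), keeping time order."""
    cut = round(len(samples) * train_frac)
    return list(samples[:cut]), list(samples[cut:])


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return (train_idx, val_idx) pairs for k contiguous folds over n samples."""
    # Spread the remainder so that fold sizes differ by at most one.
    sizes = [n // k + (1 if j < n % k else 0) for j in range(k)]
    folds, start = [], 0
    for size in sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        folds.append((train, val))
        start += size
    return folds


data = list(range(200))                 # placeholder for normalized samples
train_set, test_set = chronological_split(data)
folds = k_fold_indices(len(train_set), k=10)
```

Each of the ten folds serves once as the validation part while the remaining nine are used for fitting; the held-out 15% is touched only for the final accuracy assessment.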
7. The LSTM neural network circulating hydrologic forecasting method based on mutual information according to claim 6, wherein the relative error and the root mean square error are defined as follows:
wherein y_real(i), y_real are the measured values, y_pred(i), y_pred are the model-predicted values, and n is the total number of predicted samples;
for flood peak forecasting, 20% of the measured peak flow is taken as the allowable error; for flood peak time forecasting, 30% of the time interval from the time the forecast is made to the actual flood peak time is taken as the allowable error.
8. An intelligent terminal for implementing the LSTM neural network circulation hydrologic forecasting method based on mutual information according to any one of claims 1 to 7.
CN201911329550.7A 2019-12-20 2019-12-20 LSTM neural network circulating hydrologic forecasting method based on mutual information Active CN111310968B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911329550.7A CN111310968B (en) 2019-12-20 2019-12-20 LSTM neural network circulating hydrologic forecasting method based on mutual information

Publications (2)

Publication Number Publication Date
CN111310968A CN111310968A (en) 2020-06-19
CN111310968B true CN111310968B (en) 2024-02-09

Family

ID=71156313

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911329550.7A Active CN111310968B (en) 2019-12-20 2019-12-20 LSTM neural network circulating hydrologic forecasting method based on mutual information

Country Status (1)

Country Link
CN (1) CN111310968B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103839265A (en) * 2014-02-26 2014-06-04 西安电子科技大学 SAR image registration method based on SIFT and normalized mutual information
CN104463358A (en) * 2014-11-28 2015-03-25 大连理工大学 Small hydropower station power generation capacity predicating method combining coupling partial mutual information and CFS ensemble forecast
CN106600634A (en) * 2016-12-12 2017-04-26 哈尔滨工业大学 Maximum mutual information image registration method based on improved volume interpolation method
CN107463993A (en) * 2017-08-04 2017-12-12 贺志尧 Medium-and Long-Term Runoff Forecasting method based on mutual information core principle component analysis Elman networks
CN109615011A (en) * 2018-12-14 2019-04-12 河海大学 A kind of middle and small river short time flood forecast method based on LSTM

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7263243B2 (en) * 2003-12-29 2007-08-28 Carestream Health, Inc. Method of image registration using mutual information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
张继国 ; 谢平 ; 龚艳冰 ; 刘高峰 ; .降雨信息空间插值研究评述与展望.水资源与水工程学报.2012,全文. *

Also Published As

Publication number Publication date
CN111310968A (en) 2020-06-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant