CN114386677A - Flood forecasting method based on a novel universal input/output structure and a long short-term memory network
- Publication number
- CN114386677A (CN202111594179.4A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2113/00—Details relating to the application field
- G06F2113/08—Fluids
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A10/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE at coastal zones; at river basins
- Y02A10/40—Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping
Abstract
A flood forecasting method based on a novel universal input/output structure and a long short-term memory (LSTM) network is disclosed. First, the confluence characteristics of the study basin are analyzed from its historical flood data, and the average confluence time of the basin is calculated. Second, the number of output time steps of the LSTM flood forecasting model is determined from the average confluence time, and the number of hidden layers and the number of neurons per hidden layer are specified. Third, the input/output structure of the novel universal LSTM flood forecasting model is designed, and models are trained on training-set and validation-set samples, yielding flood forecasting models with different structures and parameters. Finally, the performance of the LSTM flood forecasting models under different input lengths is compared on the training and validation sets, the preferred model is selected, and its forecasting performance is evaluated on the test set. The method is broadly applicable; the resulting LSTM flood forecasting model achieves good forecasting performance and provides new technical support for basin flood defense.
Description
Technical Field
The invention belongs to the technical field of basin flood forecasting and relates to a flood forecasting method based on a novel universal input/output structure and a long short-term memory network.
Background
The long short-term memory network (LSTM) is currently one of the most common models in research that applies machine learning to basin flood forecasting. An LSTM flood forecasting model is trained on samples, each of which pairs input feature variables with the corresponding output target values, so the way samples are generated depends entirely on the model's input/output structure. Determining that structure is therefore a key step in building a machine learning flood forecasting model and bears directly on the reliability and soundness of the LSTM model. In existing research and applications, antecedent rainfall and antecedent flow are the two most commonly used input features, and the model output is the flow at the basin outlet over the forecast period. Practical results show, however, that when measured antecedent rainfall and flow are both used as inputs, the strong correlation between flows at successive time steps lets the antecedent flow dominate the output. The model then struggles to identify the effect of antecedent rainfall on flow, and the simulated or forecast hydrograph shows a lagged peak or a spurious double peak; the longer the forecast period, the more pronounced the peak lag.
Therefore, a flood forecasting model that takes rainfall and antecedent flow as input and the forecast flow as the output target may not conform to the runoff formation mechanism of the basin. The input/output structure of the LSTM flood forecasting model needs to be redesigned to improve the interpretability of machine learning flood forecasting models.
Starting from the hydrological interpretability of machine learning flood forecasting models, the invention therefore provides a flood forecasting method based on a novel universal input/output structure and a long short-term memory network. Drawing on the theory and methods of basin hydrological simulation and on the basin's rainstorm-flood response mechanism, rainfall is selected as the only input feature of the LSTM flood forecasting model, and the long-sequence rainfall of every rainfall station in the basin is used as the model input (the number of features equals the number of rainfall stations in the basin). The recurrent connections, control gates and cell state of the LSTM give it a capacity for long-term learning and memory; by repeatedly learning from the long input sequences, hydrological information such as the spatio-temporal distribution of basin rainfall, the rainstorm-flood response time and the antecedent soil water storage state is built into the input/output structure of the model. On this basis, a flood forecasting model based on the long short-term memory network is constructed, which enriches the flood forecasting models available for regions with observation data and improves basin flood forecasting accuracy.
Disclosure of Invention
To address the problems in the prior art, the invention provides a flood forecasting method based on a novel universal input/output structure and a long short-term memory network.
To achieve this purpose, the invention adopts the following technical scheme:
A flood forecasting method based on a novel universal input/output structure and a long short-term memory network comprises the following steps:
First, collect and organize historical flood event data for the basin.
Collect and organize the flood event data of the study basin, and divide all flood events into training-set, validation-set and test-set events. Training-set events are used to optimize the internal weight matrices and bias vectors of the LSTM flood forecasting model; validation-set events help determine external model settings such as hyperparameters and the loss function; test-set events are used to check the extrapolation (prediction) ability of the trained model.
Second, calculate the average confluence time of the basin.
The average confluence time of a study basin is taken as the forecast period of flood forecasting for that basin; it reflects the overall regulation and storage effect of the basin on the process by which rainfall converges into streamflow. From the historical flood event data collected for the basin, the time difference between the main rainfall and the corresponding peak flow of each flood event, i.e. the peak lag time, is computed. The mean peak lag time over all flood events is the average confluence time of the basin.
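The peak-lag calculation described in the second step can be sketched as follows. This is a minimal illustration, not the patent's procedure: it assumes each flood event is given as equal-length hourly series of basin-average rainfall and outlet flow, takes the hour of maximum rainfall as the "main rainfall", and rounds the mean lag up to whole time steps. The function names and the toy series are hypothetical.

```python
import math

def peak_lag(rain, flow):
    """Time steps between the rainfall peak and the flow peak of one flood event."""
    rain_peak = max(range(len(rain)), key=lambda i: rain[i])
    flow_peak = max(range(len(flow)), key=lambda i: flow[i])
    return flow_peak - rain_peak

def mean_confluence_time(events):
    """Mean peak lag over all events, each given as a (rain, flow) pair."""
    lags = [peak_lag(rain, flow) for rain, flow in events]
    return sum(lags) / len(lags)

# Toy hourly series (illustrative values only, not data from the patent).
toy_events = [
    ([0, 2, 10, 3, 0, 0, 0, 0, 0, 0], [5, 5, 6, 9, 15, 30, 48, 60, 40, 20]),
    ([1, 8, 2, 0, 0, 0, 0, 0], [4, 4, 6, 10, 20, 35, 28, 15]),
]

mean_lag = mean_confluence_time(toy_events)
# Assumption: the output length l is the mean lag rounded up to whole steps.
forecast_steps = math.ceil(mean_lag)
print(mean_lag, forecast_steps)
```

Rounding up is only one possible convention; any rule that maps the mean lag to a whole number of time steps would fit the description equally well.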
Third, set the number of hidden layers and the number of neurons per hidden layer of the long short-term memory (LSTM) flood forecasting model.
Fourth, design the input/output structure of the novel universal LSTM flood forecasting model.
The LSTM cell state is analogous to the soil water storage state of the basin, and the action of the three control gates (forget gate, input gate and output gate) on the cell state can be regarded as the depletion, replenishment and release of soil water storage. Following traditional basin hydrological simulation theory and methods, rainfall is selected as the only input factor of the LSTM flood forecasting model. The model input is the long-sequence rainfall of every rainfall station in the basin, with an input length of n + l time steps. Here n is the input length of the antecedent rainfall, which can be regarded as reflecting the influence on the output flow of information such as the antecedent soil water storage state of the basin and short-term rainfall; several values of n can be tried, and the preferred model is chosen according to subsequent model performance. The model outputs a flow sequence whose length equals the confluence time of the study basin (the result of the second step), i.e. the output length is l time steps, where l equals the basin confluence time and l ≤ n.
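To make the gate and cell-state analogy concrete, here is a minimal sketch of one step of a standard LSTM cell with a single input feature and a single hidden unit. It uses the usual LSTM equations rather than anything specific to the patent; the weight names and values are hypothetical placeholders, and a real model would use vectors and trained weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell; w holds hypothetical weights."""
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate: depletes storage
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate: admits new rainfall
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate cell update
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate: releases storage
    c = f * c_prev + i * g     # new cell state (the "soil water storage" analogue)
    h = o * math.tanh(c)       # new hidden state passed on to the output layer
    return h, c

# Untrained placeholder weights, chosen only so the example runs.
weights = {name: 0.1 for name in
           ("wf", "uf", "bf", "wi", "ui", "bi", "wg", "ug", "bg", "wo", "uo", "bo")}

h, c = 0.0, 0.0
for rainfall in [0.0, 5.0, 12.0, 3.0, 0.0, 0.0]:  # toy rainfall sequence
    h, c = lstm_step(rainfall, h, c, weights)
    print(round(h, 4), round(c, 4))
```

The cell state c carries information forward across time steps, which is the property the analogy with antecedent soil water storage relies on.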
Fifth, generate the training, validation and test sample sets.
The sample length follows from the input/output structure (the values of n and l) designed in the fourth step: the input rainfall sequence of each sample spans n + l time steps and the output target flow sequence spans l time steps. For the training-set, validation-set and test-set events divided in the first step, samples are generated from each flood event by sliding a window forward one time step at a time, so each event yields multiple samples. Each sample pairs an input rainfall sequence $P = [P_{t-n+1}, P_{t-n+2}, \ldots, P_t, \ldots, P_{t+l}]$ with an output target flow sequence $Q = [Q_{t+1}, Q_{t+2}, \ldots, Q_{t+l}]$. Here n is the number of time steps of antecedent rainfall input to the LSTM flood forecasting model, l is the number of time steps of flow output by the model, and the input rainfall $P_t$ of the t-th time step contains the measured rainfall at every rainfall station in the basin.
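The sliding-window sample generation of the fifth step can be sketched as below. This is an illustration under stated assumptions: `make_samples` is a hypothetical helper, the rainfall of each time step is a list with one value per station, and only windows that fit entirely inside the event are kept (the text does not say how event boundaries are handled).

```python
def make_samples(rain, flow, n, l):
    """Slide a window over one flood event and return (input, target) pairs.

    rain: list of per-step lists, one rainfall value per station
    flow: list of outlet flow values, same length as rain
    n:    number of antecedent rainfall steps
    l:    number of output flow steps
    """
    samples = []
    total = len(flow)
    # t is the 0-based index of the current time step.
    for t in range(n - 1, total - l):
        x = rain[t - n + 1 : t + l + 1]  # n + l steps: P_{t-n+1} ... P_{t+l}
        y = flow[t + 1 : t + l + 1]      # l steps:     Q_{t+1}   ... Q_{t+l}
        samples.append((x, y))
    return samples

# Toy event: 12 time steps, one station (illustrative values only).
rain = [[float(k)] for k in range(12)]
flow = [10.0 * k for k in range(12)]

samples = make_samples(rain, flow, n=4, l=2)
print(len(samples))           # number of windows in this event
print(samples[0][0][0], samples[0][1])
```

An event with T time steps yields T - n - l + 1 samples under this convention, and consecutive samples differ by a one-step shift of both the input and target windows.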
Sixth, construct and train the model.
The LSTM flood forecasting model is trained by supervised learning using the backpropagation through time (BPTT) algorithm, and model construction and training are implemented with the open-source Keras and TensorFlow libraries. Keras, running on the TensorFlow platform, integrates mature machine learning algorithm packages whose routines can be called directly to construct and train the LSTM flood forecasting model.
6.1) Constructing the LSTM flood forecasting model: according to the number of hidden layers, the number of neurons per hidden layer and the model input/output structure designed in the fourth step, call the Keras layers module (layers) to define the input layer, hidden layer and output layer of the LSTM, and construct the LSTM flood forecasting model;
6.2) Training the LSTM flood forecasting model: input the training and validation sample sets generated in the fifth step, set the hyperparameters, activation functions, loss function, optimization algorithm and other settings involved in model training, and run the LSTM flood forecasting model built with the Keras framework on the TensorFlow platform to obtain the trained LSTM flood forecasting model.
Seventh, determine the preferred model, and extract and analyze the flood event simulation results for the basin.
Compare the performance of the LSTM flood forecasting models under different input lengths, determine the final preferred LSTM flood forecasting model, extract the simulated and forecast flow processes of the flood events of the study basin, and analyze the forecasting performance of the LSTM flood forecasting model.
The invention introduces the long short-term memory network as a modeling tool for basin flood forecasting and designs an input/output structure that takes long-sequence rainfall as input and multi-step flow as output. The cell state learns and remembers the antecedent soil water storage information implied by the long rainfall sequence, and the control gates adjust the dynamic change of that storage state, which provides theoretical and methodological support for the design. An LSTM-based flood forecasting model is thus built on an interpretable basis for machine learning flood forecasting, improving basin flood forecasting accuracy.
The flood forecasting method based on the novel universal input/output structure and the long short-term memory network is applied to basin flood forecasting.
The invention has the following beneficial effects:
Traditional machine learning flood forecasting models mostly take both rainfall and flow as input variables, so the rainfall input cannot play its proper role in forecasting the flow, and the hydrological interpretability of such models is poor. The invention analyzes the basic principle and computation mechanism of the control gates and cell state inside the long short-term memory network and designs a novel universal LSTM flood forecasting input/output structure with long-sequence rainfall input and multi-step flow output. The structure is applicable to flood forecasting for basins of different spatial scales and makes full use of the LSTM's long-term learning and memory capacity and of the forgetting, storing and releasing mechanisms of its cells. It has been applied to a mountainous basin, where it effectively improved the accuracy of flood event simulation and forecasting, and it provides new technical support for basin flood forecasting and early warning.
Drawings
FIG. 1 is a map of the Anhe basin used in the application example of the present invention;
FIG. 2 is a schematic diagram of the hydrological meaning of the internal structure of the LSTM of the present invention;
FIG. 3 is a schematic diagram of the input/output structure of the LSTM flood forecasting model designed by the present invention;
FIG. 4a is a schematic diagram of the sample generated at time t of a flood event according to the present invention;
FIG. 4b is a schematic diagram of the sample generated at time t + 1 of a flood event according to the present invention;
FIG. 5 is a schematic diagram of the layer hierarchy and matrix feature dimensions of the LSTM flood forecasting model of the present invention;
FIG. 6 is a schematic diagram of the training process of the LSTM flood forecasting model of the present invention;
FIG. 7(a) shows the mean NSE over training-set flood events for different input rainfall lengths of the LSTM flood forecasting model of the present invention;
FIG. 7(b) shows the mean MAE over training-set flood events for different input rainfall lengths of the LSTM flood forecasting model of the present invention;
FIG. 7(c) shows the mean RMSE over training-set flood events for different input rainfall lengths of the LSTM flood forecasting model of the present invention;
FIG. 7(d) shows the mean NSE over test-set flood events for different input rainfall lengths of the LSTM flood forecasting model of the present invention;
FIG. 7(e) shows the mean MAE over test-set flood events for different input rainfall lengths of the LSTM flood forecasting model of the present invention;
FIG. 7(f) shows the mean RMSE over test-set flood events for different input rainfall lengths of the LSTM flood forecasting model of the present invention;
FIG. 8 is a schematic diagram of the input/output structure of the preferred LSTM flood forecasting model in the application example of the present invention;
FIG. 9(a) is a comparison graph of the flow process simulated by the LSTM flood forecasting model of the present invention for training-set flood event 19840501;
FIG. 9(b) is a comparison graph of the flow process simulated by the LSTM flood forecasting model of the present invention for training-set flood event 19900730;
FIG. 9(c) is a comparison graph of the flow process simulated by the LSTM flood forecasting model of the present invention for training-set flood event 19940613;
FIG. 9(d) is a comparison graph of the flow process simulated by the LSTM flood forecasting model of the present invention for training-set flood event 19970620;
FIG. 9(e) is a comparison graph of the flow process simulated by the LSTM flood forecasting model of the present invention for training-set flood event 20020805;
FIG. 9(f) is a comparison graph of the flow process simulated by the LSTM flood forecasting model of the present invention for test-set flood event 20060505;
FIG. 9(g) is a comparison graph of the flow process simulated by the LSTM flood forecasting model of the present invention for test-set flood event 20060725;
FIG. 9(h) is a comparison graph of the flow process simulated by the LSTM flood forecasting model of the present invention for test-set flood event 20080524;
FIG. 9(i) is a comparison graph of the flow process simulated by the LSTM flood forecasting model of the present invention for test-set flood event 20080608;
FIG. 9(j) is a comparison graph of the flow process simulated by the LSTM flood forecasting model of the present invention for test-set flood event 20120621.
Detailed Description
The present invention is further illustrated by the following specific examples.
The invention provides a flood forecasting method based on a novel universal input/output structure and a long short-term memory network. In the application example, the results for training-set and test-set flood events represent the simulation performance and the forecasting performance of the LSTM flood forecasting model, respectively. The invention is further explained below with reference to the embodiment and the accompanying drawings.
The Anhe basin is located in Ganzhou city, Uygur county, Jiangxi province, at east longitude 114-114 degrees 40' and north latitude 25-26 degrees 01', and has an area of 251 km². Hydrological observation data for the Anhe basin are of good quality and provide sound data support for building a machine learning flood forecasting model. The basin is well vegetated and lies in a humid subtropical monsoon climate zone of hills and mountains, with abundant rainfall (mean annual rainfall 1497 mm) and a mean air temperature of 18.8 °C. The terrain and the distribution of hydrological stations in the basin are shown in FIG. 1. There are 8 rainfall stations in the Anhe basin, and their spatial distribution is relatively uniform. This hilly basin is selected as the study case to implement basin flood forecasting based on a long short-term memory network. The main steps are as follows:
First, collect and organize historical flood event data for the basin.
Measured hydrological data of the Anhe basin for 1984-2012 (8 rainfall stations and 1 hydrological station) are collected. The measured rainfall and flow series of each year are cross-checked, and a small number of flood events with an unbalanced water budget are removed to ensure the reliability of the hydrological data. The collected rainfall and flow data also contain problems such as logically inconsistent start and end times and duplicated observation records, which must be identified and corrected manually. In addition, the rainfall data are recorded period by period with non-uniform start and end times, and the flow data are instantaneous values, so preprocessing such as plausibility checks and time-step interpolation is needed to arrange the rainfall and flow data into hourly series of accumulated rainfall and average flow. On this basis, in chronological order of flood occurrence, the first 50% of all events are selected as the training set, the middle 20% as the validation set and the remaining 30% as the test set, so the ratio of training, validation and test flood events is 5:2:3.
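The chronological 5:2:3 split of flood events can be sketched as follows. This is a minimal illustration that assumes event identifiers sort in chronological order; `split_events` and the placeholder identifiers are hypothetical, not names from the patent.

```python
def split_events(event_ids, train_frac=0.5, val_frac=0.2):
    """Split flood events chronologically into training, validation and test sets."""
    ordered = sorted(event_ids)  # assumes identifiers sort by date
    n_train = round(len(ordered) * train_frac)
    n_val = round(len(ordered) * val_frac)
    train = ordered[:n_train]
    val = ordered[n_train:n_train + n_val]
    test = ordered[n_train + n_val:]  # everything after the validation block
    return train, val, test

# Ten placeholder event identifiers (hypothetical, not the patent's events).
event_ids = [f"event_{k:02d}" for k in range(1, 11)]

train, val, test = split_events(event_ids)
print(len(train), len(val), len(test))
```

Splitting by whole events, rather than by individual samples, keeps the overlapping sliding-window samples of one flood from appearing in more than one set.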
Second, calculate the average confluence time of the Anhe basin.
According to the rainfall and flow process data of the flood events compiled for the Anhe basin, the peak lag time of most flood events is 4-6 hours, with a mean of 5.21 hours. Accordingly, the average confluence time of the Anhe basin is set to 6 h, i.e. the flood forecast period is 6 h.
Third, set the hidden layer of the LSTM flood forecasting model to a single layer with 5 neurons.
Fourth, design the input/output structure of the novel universal LSTM flood forecasting model.
The LSTM cell state is analogous to the soil water storage state of the basin, and the action of the three control gates (forget gate, input gate and output gate) on the cell state can be regarded as the depletion, replenishment and release of soil water storage. FIG. 2 shows the input/output computation of the LSTM model in the t-th (current) time step and the hydrological meaning of the model's input variables. The input/output structure designed for the LSTM flood forecasting model is shown in FIG. 3. The model output is the flow at each time step within the flood forecast period of the Anhe basin, $Q = [Q_{t+1}, Q_{t+2}, \ldots, Q_{t+6}]$, and the input is the long-sequence measured rainfall (antecedent rainfall plus short-term rainfall) of every rainfall station in the basin. The antecedent rainfall sequence reflects the influence on the output flow of changes in basin state such as the antecedent soil water storage; how many time steps of antecedent rainfall to use is determined by trial and error in light of the basin characteristics. A series of antecedent rainfall input schemes (24 h, 48 h, 120 h, 240 h, 480 h, 720 h, 960 h and 1440 h) is set, an LSTM flood forecasting model is built for each scheme, and the optimal antecedent rainfall length is determined by comparing the simulation and forecasting performance of the models.
Fifth, generate the training, validation and test sample sets.
According to the training-set, validation-set and test-set events of the study basin divided in the first step, and the model input/output structures corresponding to the different antecedent rainfall input schemes of the fourth step, samples are generated from each flood event by sliding a window forward one time step at a time. Taking a model with 16 h of input rainfall and 6 h of output target flow as an example, FIG. 4 illustrates the machine learning samples generated at adjacent times t and t + 1 of a flood event. The numbers of samples in the Anhe basin training, validation and test sets are listed in Table 1.
Table 1. Number of samples in the Anhe basin training, validation and test sets
Sixth, construct and train the LSTM flood forecasting model.
The LSTM model is trained by supervised learning using the backpropagation through time (BPTT) algorithm, and model construction and training are implemented with the open-source Keras and TensorFlow libraries. Following the model structure settings determined in the third and fourth steps and taking an antecedent rainfall input of 480 h as an example, FIG. 5 visualizes the layer hierarchy of the LSTM flood forecasting model built with TensorFlow and Keras and the feature dimensions of its input and output variables. The model input is the long-sequence (486 h) measured rainfall of the 8 rainfall stations in the Anhe basin; the model output is the flow corresponding to the last 6 time steps of the long sequence. The TimeDistributed layer connects the hidden state $h_t$ at every time step to an output (Dense) layer with 1 neuron, so that a flow value is output at every time step. The Lambda layer is a custom layer that slices the output flow sequence so that the flow values of the last 6 time steps of each sample are extracted as the final output sequence of the model.
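The layer stack described above can be checked with some simple shape and parameter bookkeeping. The sketch below is plain Python rather than Keras code, and rests on assumptions: the parameter formula is the one for a standard LSTM layer without peephole connections, the per-sample shapes omit the batch dimension, and the exact shape produced by the slicing layer is assumed, since the text gives only the number of time steps.

```python
def lstm_param_count(input_dim, units):
    """Trainable parameters of a standard LSTM layer.

    Four blocks (forget, input, candidate, output), each with input weights,
    recurrent weights and a bias vector.
    """
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(input_dim, units):
    """Trainable parameters of a fully connected layer with bias."""
    return input_dim * units + units

n_stations, n_hidden = 8, 5      # rainfall stations and hidden neurons from the text
n_early, n_lead = 480, 6         # antecedent rainfall steps and output steps
timesteps = n_early + n_lead     # total input length per sample

# Per-sample tensor shapes through the stack (batch dimension omitted).
shapes = [
    ("input rainfall", (timesteps, n_stations)),
    ("LSTM hidden states, all steps", (timesteps, n_hidden)),
    ("per-step dense output", (timesteps, 1)),
    ("slice of last steps (assumed shape)", (n_lead, 1)),
]
for name, shape in shapes:
    print(f"{name}: {shape}")

total_params = lstm_param_count(n_stations, n_hidden) + dense_param_count(n_hidden, 1)
print(total_params)
```

The same dense layer is applied at every time step, so its parameters are counted once regardless of the sequence length.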
Fig. 6 is a schematic diagram of the training process of the LSTM flood forecasting model under the TensorFlow deep learning framework. The hyperparameters are 300 epochs and a batch size of 64; the loss function is the mean squared error (MSE); the output layer uses the rectified linear unit (ReLU) activation function; and the optimization algorithm is Adam with a learning rate of 0.0006. The MSE is given by formula (1).
$$\mathrm{MSE}=\frac{1}{l}\sum_{k=1}^{l}\left(y_{k,\mathrm{obs}}-y_{k,\mathrm{out}}\right)^{2}\tag{1}$$
In the formula: $y_{k,\mathrm{obs}}$ and $y_{k,\mathrm{out}}$ are the measured and forecast flow values of a flood event in the k-th time period, in m³/s; l is the output length of the simulated flow. The ReLU activation function converges quickly and is simple to compute, and it guarantees that the flow values output by the LSTM flood forecasting model are all non-negative. Its expression is given by formula (2).
$$\mathrm{ReLU}(x)=\max(x,0)\tag{2}$$
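The layer structure and training settings described in this step can be sketched with the Keras functional API as follows. This is a hedged illustration rather than the patent's actual code: the function name `build_lstm_flood_model`, the single hidden layer, and the 64 hidden units are assumptions (the real values are fixed in the third step), and the arrays named in the commented `fit` call are placeholders.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_lstm_flood_model(n_steps=486, n_stations=8, out_len=6, units=64):
    """n_steps = n + l input periods; out_len = l output periods.

    units (hidden size) is an assumed value, not taken from the source.
    """
    inputs = layers.Input(shape=(n_steps, n_stations))
    # return_sequences=True exposes the hidden state h_t at every time step.
    hidden = layers.LSTM(units, return_sequences=True)(inputs)
    # One Dense node applied at each time step; ReLU keeps flows non-negative.
    flow_seq = layers.TimeDistributed(layers.Dense(1, activation="relu"))(hidden)
    # Keep only the last out_len periods as the model output.
    outputs = layers.Lambda(lambda seq: seq[:, -out_len:, :])(flow_seq)
    model = models.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.0006),
        loss="mse",
    )
    return model

# Training with the quoted hyperparameters (X_*, Y_* are placeholder arrays;
# targets need a trailing feature axis, i.e. shape (samples, out_len, 1)):
# model.fit(X_train, Y_train, validation_data=(X_val, Y_val),
#           epochs=300, batch_size=64)
```

With the default arguments the model maps an input of shape (batch, 486, 8) to an output of shape (batch, 6, 1), matching the 480 h example in the text.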
Seventh, determining the preferred model, and extracting and analyzing the flood event simulation results for the basin.
LSTM flood forecasting models are constructed for each antecedent rainfall input scheme (24 h, 48 h, 120 h, 240 h, 480 h, 720 h, 960 h, and 1440 h), and the Nash-Sutcliffe efficiency coefficient (NSE), mean absolute error (MAE), and root mean square error (RMSE) of the models in the flood event simulation and flow prediction processes of the Anhe basin are compared across the schemes, as shown in Fig. 7. NSE, MAE, and RMSE are calculated as follows:
$$\mathrm{NSE}=1-\frac{\sum_{k=1}^{l}\left(y_{k,\mathrm{obs}}-y_{k,\mathrm{out}}\right)^{2}}{\sum_{k=1}^{l}\left(y_{k,\mathrm{obs}}-\bar{y}_{\mathrm{obs}}\right)^{2}}\tag{3}$$
$$\mathrm{MAE}=\frac{1}{l}\sum_{k=1}^{l}\left|y_{k,\mathrm{obs}}-y_{k,\mathrm{out}}\right|\tag{4}$$
$$\mathrm{RMSE}=\sqrt{\frac{1}{l}\sum_{k=1}^{l}\left(y_{k,\mathrm{obs}}-y_{k,\mathrm{out}}\right)^{2}}\tag{5}$$
In the formulas: $y_{k,\mathrm{obs}}$ and $y_{k,\mathrm{out}}$ are the measured and forecast flow values of a flood event in the k-th time period, in m³/s; $\bar{y}_{\mathrm{obs}}$ is the mean of the measured flow values; l is the output length of the simulated flow.
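The three evaluation metrics can be computed directly from a measured and a simulated flow series. The following NumPy sketch uses the standard definitions of NSE, MAE, and RMSE; the function names are illustrative and not from the source.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def mae(obs, sim):
    """Mean absolute error, in the same unit as the flow (m^3/s)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.mean(np.abs(obs - sim)))

def rmse(obs, sim):
    """Root mean square error, in the same unit as the flow (m^3/s)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))
```

A higher NSE and lower MAE and RMSE indicate a better simulation, which is the sense in which the input schemes are ranked in Fig. 7.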
According to Fig. 7, when the antecedent rainfall input is 480 h or longer, the mean NSE, MAE, and RMSE of the training and validation sets in the flood event simulation and flow prediction processes are clearly better than under the schemes with an antecedent rainfall input of 240 h or shorter, while further lengthening the antecedent rainfall (720 h, 960 h, and 1440 h) does not improve the model results. The preferred input/output structure of the LSTM flood forecasting model thus determined for the Anhe basin is shown in Fig. 8.
The finally established LSTM flood forecasting model is applied to flood forecasting in the Anhe basin, and the flood event results for the training and test sets are extracted and analyzed. In addition to NSE, MAE, and RMSE, the qualification rate of peak flow (QRP) and the qualification rate of peak time (QRT) are used to evaluate the model's flood peak forecasts for the flood events. QRP and QRT are calculated by formulas (6) and (7):
$$\mathrm{QRP}=\frac{N_{P}}{N}\times 100\%\tag{6}$$
$$\mathrm{QRT}=\frac{N_{T}}{N}\times 100\%\tag{7}$$
In the formulas: $N_{P}$ is the number of flood events with a qualified peak flow, $N_{T}$ is the number of flood events with a qualified peak time, and N is the total number of flood events. Taking the simulation and prediction results for the 6th output time period of the model as an example, Table 2 gives the flood event simulation results of the LSTM flood forecasting model, with the traditional conceptual Xin'anjiang (XAJ) hydrological model as the comparison benchmark. As can be seen from Table 2, the flood peak simulation and prediction results of the LSTM model are both significantly better than those of the XAJ model. Compared with the XAJ model, the QRP of the LSTM model rises from 77% to 82% for the training set flood events and from 72% to 80% for the test set flood events; the mean event NSE rises from 0.825 to 0.871 for the training set and from 0.815 to 0.821 for the test set; and the MAE and RMSE of the LSTM model are lower for both the training set and the test set flood events. These results show that the constructed LSTM flood forecasting model successfully establishes the complex nonlinear relationship between rainfall and outlet flow in the basin, and that its flood forecasting accuracy in the Anhe basin is high.
TABLE 2 Flood event simulation statistics of the LSTM flood forecasting model for the Anhe basin
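The two qualification rates are simple ratios once a criterion for a "qualified" peak is fixed. The text here does not state that criterion, so the sketch below treats it as a pair of parameters: `peak_tol` (allowed relative peak flow error) and `time_tol` (allowed peak timing error, in time periods). Both default values, and the function name, are placeholders rather than values from the source.

```python
import numpy as np

def qualification_rates(obs_events, sim_events, peak_tol=0.20, time_tol=1):
    """Return (QRP, QRT) in percent over a list of flood events.

    obs_events / sim_events: lists of 1-D flow arrays, one pair per event.
    peak_tol and time_tol are assumed thresholds, not taken from the source.
    """
    n_p = n_t = 0
    for obs, sim in zip(obs_events, sim_events):
        obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
        k_obs, k_sim = int(np.argmax(obs)), int(np.argmax(sim))
        # Peak flow qualifies if its relative error is within peak_tol.
        if abs(sim[k_sim] - obs[k_obs]) <= peak_tol * obs[k_obs]:
            n_p += 1
        # Peak time qualifies if the peaks are at most time_tol periods apart.
        if abs(k_sim - k_obs) <= time_tol:
            n_t += 1
    n = len(obs_events)
    return 100.0 * n_p / n, 100.0 * n_t / n
```

In practice the thresholds would be replaced by whatever forecasting standard the evaluation follows.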
Taking the flood simulation and flow prediction processes for the 6th output time period of the model as an example, 5 flood events are selected from each of the training and test sets, and the simulated and predicted flow processes of the LSTM and XAJ flood forecasting models are plotted against the measured flow process, as shown in Fig. 9. As can be seen from Fig. 9, the simulated and predicted flow processes of the LSTM flood forecasting model agree more closely with the measured flow than those of the XAJ model, and the rising and falling stages of the flood events are basically consistent with the results of the XAJ model. This further indicates that the LSTM flood forecasting model has learned the rainfall-runoff conversion mechanism of the basin and has established a better nonlinear mapping between rainfall and outlet flow.
The results show that the flood forecasting method based on the novel universal input/output structure and the long short-term memory network can successfully guide the LSTM flood forecasting model to mine and learn the complex nonlinear mapping between basin rainfall and outlet flow. The model fits the rainfall-flow relationship well and forecasts flood peaks well, and its simulation and forecasting accuracy for the flood peak values and peak times of flood events in mountainous basins is high. In addition, the novel universal input/output structure is highly practical for flood forecast modeling in basins of different spatial scales. A machine learning flood forecasting model constructed according to this input/output structure can make full use of the contributions of rainfall and the basin water storage state to the rise and recession of floods, and therefore has a strong hydrological basis for interpretation.
The embodiments described above only illustrate implementations of the present invention and should not be understood as limiting the scope of the patent. It should be noted that those skilled in the art can make several variations and modifications without departing from the concept of the present invention, and these all fall within the protection scope of the present invention.
Claims (1)
1. A flood forecasting method based on a novel universal input/output structure and a long short-term memory (LSTM) network, characterized in that the method has a clear hydrological meaning and its model input/output structure is applicable to basin flood forecasting at different spatial scales, the method comprising the following steps:
firstly, collecting and organizing historical flood event data of the basin;
collecting and organizing the flood event data of the study basin, and dividing all flood events into training, validation, and test events; wherein the training events are used to optimize the internal weight matrices and bias vectors of the LSTM flood forecasting model; the validation events assist in determining external model settings such as hyperparameters and the loss function; and the test events are used to test the extrapolation ability of the trained model;
secondly, calculating the average confluence time of the basin;
determining the average confluence time of the study basin, which equals the flood forecast lead time of the basin and comprehensively reflects the regulation and storage effect of the basin on the rainfall-runoff concentration process; from the historical flood event data collected and organized for the basin, determining for each flood event the time difference between the main rainfall and the corresponding peak flow, namely the peak lag time; and calculating the mean of the peak lag times of all flood events, namely the average confluence time of the basin;
thirdly, specifying the number of hidden layers and the number of hidden-layer neuron nodes of the long short-term memory (LSTM) network flood forecasting model;
fourthly, designing the input/output structure of the novel universal LSTM flood forecasting model;
the state of the LSTM cell unit is analogous to the soil water storage state of the basin, and the interactions between the three control gates and the cell state can be regarded as the consumption, increase, and release of the basin soil water storage, wherein the three control gates are the forget gate, the input gate, and the output gate; following traditional basin hydrological simulation theory and methods, rainfall is selected as the only input factor of the LSTM flood forecasting model, the model input is the long-sequence rainfall information of each rainfall station in the basin, and the input length is n + l time periods; wherein n is the input length of the antecedent rainfall, which can be regarded as reflecting the influence of information such as the antecedent soil water storage state of the basin and short-term rainfall on the model output flow values, several values of n can be selected, and the preferred model is determined from the subsequent model performance; the LSTM flood forecasting model outputs a flow sequence whose length equals the confluence time of the study basin, namely the output length is l time periods, l equals the basin confluence time, and l is less than or equal to n;
fifthly, generating the training, validation, and test sample sets;
determining the sample length according to the input/output structure (the values of n and l) of the novel universal LSTM flood forecasting model designed in the fourth step, wherein the input rainfall sequence of each sample has a length of n + l time periods and the output target flow sequence has a length of l time periods; according to the training, validation, and test events divided in the first step, generating the corresponding training, validation, and test sample sets from each flood event by sliding interception one time period at a time, each flood event generating a plurality of samples, and each sample being an input/output data pair formed by an input rainfall sequence $P=[P_{t-n+1},P_{t-n+2},\ldots,P_{t},\ldots,P_{t+l}]$ and an output target flow sequence $Q=[Q_{t+1},Q_{t+2},\ldots,Q_{t+l}]$; wherein n is the number of time periods of the antecedent rainfall sequence input to the LSTM flood forecasting model, l is the number of time periods of the model output flow, and the input rainfall $P_{t}$ of the t-th time period comprises the measured rainfall values of all rainfall stations in the basin;
sixthly, constructing and training the model;
the LSTM flood forecasting model is trained by supervised learning using the backpropagation through time (BPTT) algorithm, and model construction and training are implemented with the open-source Keras and TensorFlow libraries; the Keras framework running on the TensorFlow platform integrates mature machine learning algorithm packages, and the corresponding algorithms can be called directly to construct and train the LSTM flood forecasting model;
6.1) constructing the LSTM flood forecasting model: according to the number of hidden layers and the number of hidden-layer neuron nodes specified in the third step and the model input/output structure designed in the fourth step, calling the Keras layers package to define the input layer, hidden layers, and output layer of the LSTM, thereby constructing the LSTM flood forecasting model;
6.2) training the LSTM flood forecasting model: inputting the training and validation sample sets generated in the fifth step, setting the hyperparameters, activation function, loss function, optimization algorithm, and other settings involved in model training, and running the LSTM flood forecasting model built with the Keras framework on the TensorFlow platform to obtain the trained LSTM flood forecasting model;
seventhly, determining the preferred model, and extracting and analyzing the flood event simulation results of the basin;
comparing and analyzing the performance of the LSTM flood forecasting models under different input lengths, determining the final preferred LSTM flood forecasting model, extracting the simulated and predicted flow processes of the flood events of the study basin, and analyzing the forecasting performance of the LSTM flood forecasting model.
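The average confluence time defined in the second step of the claim above is the mean of the peak lag times over all flood events, and can be sketched as follows. The claim does not say how the "main rainfall" time is located, so this sketch assumes it is the time period with the largest basin-average rainfall; that choice, and the function names, are illustrative only.

```python
import numpy as np

def peak_lag(rain, flow):
    """Lag, in time periods, from the heaviest rainfall period to the peak flow.

    Using the maximum-rainfall period as the 'main rainfall' time is an
    assumption of this sketch.
    """
    return int(np.argmax(flow)) - int(np.argmax(rain))

def average_confluence_time(events):
    """events: iterable of (basin-average rainfall series, flow series) pairs."""
    return float(np.mean([peak_lag(rain, flow) for rain, flow in events]))
```

The result, rounded to a whole number of time periods, would give the output length l used in the fourth step.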
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111594179.4A CN114386677B (en) | 2021-12-24 | Flood forecasting method based on novel general input/output structure and long-short-time memory network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114386677A (en) | 2022-04-22 |
CN114386677B (en) | 2024-10-25 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109615011A (en) * | 2018-12-14 | 2019-04-12 | 河海大学 | A kind of middle and small river short time flood forecast method based on LSTM |
US20190227194A1 (en) * | 2015-12-15 | 2019-07-25 | Wuhan University | System and method for forecasting floods |
CN112396152A (en) * | 2020-11-17 | 2021-02-23 | 郑州大学 | Flood forecasting method based on CS-LSTM |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114707753A (en) * | 2022-04-25 | 2022-07-05 | 河海大学 | Regional LSTM flood forecasting method |
CN114707753B (en) * | 2022-04-25 | 2022-12-09 | 河海大学 | Regional LSTM flood forecasting method |
CN114970377A (en) * | 2022-07-29 | 2022-08-30 | 水利部交通运输部国家能源局南京水利科学研究院 | Method and system for field flood forecasting based on Xinanjiang and deep learning coupling model |
CN118504430A (en) * | 2024-07-18 | 2024-08-16 | 江西省水利科学院(江西省大坝安全管理中心、江西省水资源管理中心) | Zero sample warehouse-in flood prediction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |