CN112528557A

CN112528557A - Flood flow prediction system and method based on deep learning

Info

Publication number: CN112528557A
Application number: CN202011379571.2A
Authority: CN
Inventors: 周扬; 肖凤林; 李暨
Original assignee: Beijing Jinshui Information Technology Development Co ltd
Current assignee: Beijing Jinshui Information Technology Development Co ltd
Priority date: 2020-11-30
Filing date: 2020-11-30
Publication date: 2021-03-19

Abstract

The invention discloses a flood flow prediction system and method based on deep learning. Each site end sends single-end original rainfall data to a prediction end, and the site end of the target hydrological site also sends original flow data. The prediction end obtains the data sent by the site ends, prior rainfall data for the future P hours, and original position information of the N hydrological sites; preprocesses the original rainfall data, the prior rainfall data, the original flow data and the original position information; forms gridded rainfall data from the processed rainfall data, the processed prior rainfall data and the processed position information; extracts spatial distribution features of the gridded rainfall data and time-series features of the rainfall data over the historical T hours and the future P hours to obtain a first output feature; extracts time-series features of the flow data over the historical T hours from the processed flow data to obtain a second output feature; and merges the first output feature and the second output feature and performs classified prediction on them to obtain predicted flow values of the target hydrological site for the future P hours.

Description

Flood flow prediction system and method based on deep learning

Technical Field

The invention belongs to the field of flood flow prediction, and particularly relates to a flood flow prediction system and method based on deep learning.

Background

Floods are among the most common natural disasters. Hundreds of millions of people are affected by floods every year, many of them displaced from their homes, and the resulting financial and material losses are enormous. Predicting flood flow effectively, so that early warnings can be issued in time, is therefore of great significance for flood control and disaster reduction.

Current flood flow prediction models fall mainly into traditional physical models and intelligent flood prediction models. A traditional physical model, such as the Xinanjiang model, is a region-specific prediction model built by calibrating the parameters of the physical process after thoroughly investigating local physical characteristics such as landform, evaporation capacity and vegetation coverage. An intelligent flood prediction model uses massive historical data as prior knowledge and applies intelligent methods such as machine learning to obtain a function mapping, or a joint distribution, from input features to output features.

However, existing flood flow prediction models perform single-point prediction: they can only predict the flow at one future time point, and a predicted flow value for a single time point has limited practical value. In addition, existing models treat rainfall data only as a time series and do not consider the spatial distribution of rainfall, so the information carried by the actual rainfall data cannot be fully mined and the prediction accuracy is low.

Disclosure of Invention

In order to solve the above problems in the prior art, the present invention provides a flood flow prediction system and method based on deep learning. The technical problem addressed by the invention is solved by the following technical solution:

In a first aspect, a flood flow prediction system based on deep learning provided in an embodiment of the present invention includes N site ends and a prediction end, where the N site ends are terminals corresponding to N hydrological sites of a predetermined drainage basin, and one target hydrological site of the N hydrological sites is located at an outlet section of the predetermined drainage basin;

each site end is configured to send, to the prediction end, single-end original rainfall data representing the hourly rainfall data of its hydrological site over the historical K years, and the site end corresponding to the target hydrological site also sends, to the prediction end, original flow data representing the hourly flow data of the target hydrological site over the historical K years;

the forecasting terminal is used for acquiring single-terminal original rainfall data, the original flow data, prior rainfall data of each hydrological site in the future P hours and original position information of the N hydrological sites, and acquiring original rainfall data from all the single-terminal original rainfall data; preprocessing the original rainfall data, the prior rainfall data, the original flow data and the original position information to respectively obtain processed rainfall data, processed prior rainfall data, processed flow data and processed position information; forming gridded rainfall data based on the processed rainfall data, the processed prior rainfall data and the processed position information; based on the gridded rainfall data, extracting spatial distribution characteristics by using a first branch network of a flood flow prediction model based on deep learning obtained by pre-training, and extracting time sequence characteristics of rainfall data in historical T hours and future P hours to obtain first output characteristics; based on the processed flow data, extracting time sequence characteristics of historical T-hour flow data by using a second branch network of the flood flow prediction model based on deep learning to obtain second output characteristics; and performing merging, classifying and predicting on the first output characteristic and the second output characteristic by using a third network of the flood flow prediction model based on deep learning to obtain a flow prediction value of the target hydrological station in the future P hours, wherein N, K, T and P are natural numbers greater than 1, and T is less than or equal to the number of hours corresponding to the K years.

Optionally, the preprocessing, by the prediction end, of the original rainfall data, the prior rainfall data, the original flow data and the original position information to respectively obtain processed rainfall data, processed prior rainfall data, processed flow data and processed position information includes:

performing data elimination, data completion processing and normalization processing on the original rainfall data to obtain processed rainfall data;

carrying out normalization processing on the prior rainfall data to obtain processed prior rainfall data;

carrying out normalization processing on the original flow data to obtain processed flow data;

and carrying out gridding processing on the original position information to obtain processed position information.

Optionally, the predicting end performs data elimination, data completion processing and normalization processing on the original rainfall data to obtain processed rainfall data, and the method includes:

removing data corresponding to hydrological sites with the data number lower than the preset number in the original rainfall data to obtain residual rainfall data;

performing data completion on the rainfall data missing from the residual rainfall data by using an inverse distance weighting method to obtain completed rainfall data;

and carrying out normalization processing on the supplemented rainfall data to obtain processed rainfall data.

Optionally, the normalization process includes a [0,1] normalization process.

Optionally, the forming, by the prediction end, of gridded rainfall data based on the processed rainfall data, the processed prior rainfall data and the processed position information includes:

for each hour, selecting the rainfall data of each hydrological site in that hour from the processed rainfall data and the processed prior rainfall data, and filling the rainfall data into the position corresponding to each hydrological site in the processed position information to form first gridded rainfall data corresponding to that hour, wherein the gridded rainfall data consists of all the first gridded rainfall data corresponding to the processed rainfall data and the processed prior rainfall data.
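The per-hour filling step can be illustrated with a minimal Python sketch. The function name `fill_hourly_grids`, the dictionary layout, and the choice to leave cells that hold no site at 0.0 are assumptions made for illustration; the text does not specify them.

```python
def fill_hourly_grids(site_cells, hourly_rain, height, width, empty=0.0):
    """Build one 2-D rainfall grid per hour.

    site_cells  : dict mapping site id -> (x, y) grid cell (assumed layout)
    hourly_rain : list with one dict per hour, mapping site id -> rainfall
    empty       : value for cells that hold no site (an assumption)
    """
    grids = []
    for hour_values in hourly_rain:
        # Start from an empty grid, then write each site's value into its cell.
        grid = [[empty] * width for _ in range(height)]
        for site, (x, y) in site_cells.items():
            grid[y][x] = hour_values[site]
        grids.append(grid)
    return grids


# Hypothetical example: two sites, two hours, a 2 x 3 grid.
cells = {"A1": (0, 0), "A2": (2, 1)}
rain = [{"A1": 0.1, "A2": 0.4}, {"A1": 0.0, "A2": 0.9}]
grids = fill_hourly_grids(cells, rain, height=2, width=3)
```

Stacking the per-hour grids in time order gives a sequence that a convolutional layer can process grid by grid.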

Optionally, the predicting end extracts, based on the gridded rainfall data, a spatial distribution feature by using a first branch network of a flood flow prediction model based on deep learning obtained through pre-training, and extracts a time sequence feature of the rainfall data in historical T hours and future P hours to obtain a first output feature, where the extracting includes:

inputting the gridded rainfall data into a dimensionality reduction network of the first branch network to obtain dimensionality reduction data;

performing feature flattening on the dimension reduction data by using a feature transformation network of the first branch network to obtain flattened data;

selecting data corresponding to historical T hours and future P hours from the flattened data to form a first vector;

extracting time sequence characteristics of the first vector by utilizing a plurality of GRU layers of the first branch network to obtain first output characteristics;

the dimensionality reduction network comprises a two-dimensional convolution layer and a maximum pooling layer.
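As a rough illustration of how the data shapes evolve through the first branch, the following Python sketch traces them for one two-dimensional convolution layer followed by one maximum pooling layer, then feature flattening. The grid size, channel count, kernel size, pooling window, T and P used here are hypothetical values chosen only for the example; the text does not fix them, and the helper names are likewise assumptions.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window):
    """Output length of max pooling with stride equal to the window."""
    return (size - window) // window + 1


def first_branch_shapes(height, width, channels, kernel, window, t_hours, p_hours):
    """Trace shapes for one window of T historical plus P future hours."""
    steps = t_hours + p_hours
    h = pool_out(conv_out(height, kernel), window)
    w = pool_out(conv_out(width, kernel), window)
    return {
        "gridded": (steps, height, width),       # one grid per hour
        "reduced": (steps, channels, h, w),      # after conv + max pooling
        "first_vector": (steps, channels * h * w),  # after flattening
    }


# Hypothetical sizes: 32 x 48 grid, 8 channels, 3 x 3 kernel, 2 x 2 pooling.
shapes = first_branch_shapes(32, 48, 8, 3, 2, t_hours=72, p_hours=24)
```

Under these assumed sizes, each of the T + P time steps is reduced to a single feature vector, which is the form a GRU layer expects as its per-step input.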

Optionally, the step of extracting, by the predicting end, a time series feature of the historical T-hour flow data by using the second branch network of the flood flow prediction model based on deep learning based on the processed flow data to obtain a second output feature includes:

selecting data corresponding to T hours from the processed flow data to form a second vector;

and extracting time sequence characteristics of the second vector by using a plurality of GRU layers of the second branch network to obtain second output characteristics.

Optionally, the merging, classifying and predicting the first output feature and the second output feature by the predicting end by using the third network of the flood flow prediction model based on deep learning to obtain a flow prediction value of the target hydrological station in the future P hours includes:

merging the first output characteristic and the second output characteristic by using a concat module of the third network to obtain a merged characteristic;

and carrying out classified prediction on the merging characteristics by utilizing a plurality of full connection layers of the third network to obtain a flow prediction value of the target hydrological site in the future P hours.

Optionally, the prediction end is the site end corresponding to the target hydrological site.

In a second aspect, an embodiment of the present invention further provides a flood flow prediction method based on deep learning, which is applied to a prediction end in a flood flow prediction system based on deep learning, where the flood flow prediction system based on deep learning further includes N site ends, where the N site ends are terminals corresponding to N hydrological sites of a predetermined drainage basin, and one target hydrological site of the N hydrological sites is located at an exit section of the predetermined drainage basin; each site end sends single-end original rainfall data representing historical K-year hourly rainfall data of the hydrological site to the forecasting end, and the site end corresponding to the target hydrological site also sends original flow data representing the historical K-year hourly flow data of the target hydrological site to the forecasting end; the flood flow prediction method based on deep learning comprises the following steps:

acquiring the single-end original rainfall data, the original flow data, the prior rainfall data of each hydrological site for the future P hours and the original position information of the N hydrological sites, and obtaining the original rainfall data from all the single-end original rainfall data; preprocessing the original rainfall data, the prior rainfall data, the original flow data and the original position information to respectively obtain processed rainfall data, processed prior rainfall data, processed flow data and processed position information; forming gridded rainfall data based on the processed rainfall data, the processed prior rainfall data and the processed position information; based on the gridded rainfall data, extracting spatial distribution features by using a first branch network of a pre-trained flood flow prediction model based on deep learning, and extracting time-series features of the rainfall data over the historical T hours and the future P hours, to obtain a first output feature; based on the processed flow data, extracting time-series features of the flow data over the historical T hours by using a second branch network of the flood flow prediction model based on deep learning, to obtain a second output feature; and merging the first output feature and the second output feature and performing classified prediction on them by using a third network of the flood flow prediction model based on deep learning, to obtain a flow prediction value of the target hydrological site for the future P hours, wherein the flood flow prediction model based on deep learning comprises a branch network group and the third network connected in series, the branch network group comprises the first branch network and the second branch network connected in parallel, N, K, T and P are natural numbers greater than 1, and T is less than or equal to the number of hours corresponding to the K years.

According to the flood flow prediction system and method based on deep learning provided by the embodiments of the present invention, the position information of each hydrological site is used to grid the historical rainfall data and the future prior rainfall data, thereby introducing the spatial distribution information of rainfall. One pre-trained branch network of the flood flow prediction model based on deep learning extracts spatial distribution features and time-series features from the gridded rainfall data; the other branch network extracts time-series features from the historical flow data; and the output features of the two branch networks are merged and subjected to classified prediction to obtain predicted flood flow values for a future time period. The prediction system can therefore obtain the flood flow prediction result for a whole future time period at one time. Because the result takes the spatial distribution of rainfall into account, the information carried by the actual rainfall data can be fully mined, the prediction result conforms to the actual rainfall conditions, and the prediction accuracy is higher.

Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.

The present invention will be described in further detail with reference to the accompanying drawings and examples.

Drawings

Fig. 1 is a schematic structural diagram of a flood flow prediction system based on deep learning according to an embodiment of the present invention;

Fig. 2 is a schematic structural diagram of a prediction end of the flood flow prediction system based on deep learning according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of a flood flow prediction model based on deep learning according to an embodiment of the present invention;

Fig. 4 is a detailed structural diagram of a first branch network according to an embodiment of the present invention;

Fig. 5 is a detailed structural diagram of a second branch network according to an embodiment of the present invention;

Fig. 6 is a detailed structural diagram of a third network according to an embodiment of the present invention;

Fig. 7 is a schematic diagram of a prediction process according to an embodiment of the present invention;

Fig. 8 is an exemplary diagram of input data according to an embodiment of the present invention;

Fig. 9 is a diagram illustrating a training structure of an autoencoder according to an embodiment of the present invention;

Fig. 10 is a comparison graph of flood peak prediction effects according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In order to realize the flood flow prediction in a future time period, completely mine information described by actual rainfall data and obtain a prediction result which is consistent with the actual rainfall condition and has high accuracy, the embodiment of the invention provides a flood flow prediction system and a flood flow prediction method based on deep learning.

In a first aspect, a flood flow prediction system based on deep learning according to an embodiment of the present invention is first introduced.

Referring to fig. 1, fig. 1 is a schematic structural diagram of a flood flow prediction system based on deep learning according to an embodiment of the present invention.

The flood flow prediction system 100 based on deep learning according to an embodiment of the present invention may include N site terminals 110 and a prediction terminal 120.

The N site terminals 110 are terminals corresponding to N hydrological sites of the predetermined drainage basin, and one target hydrological site of the N hydrological sites is located at an outlet section of the predetermined drainage basin.

Each site end 110 is configured to send, to the prediction end 120, single-end original rainfall data representing the hourly rainfall data of its hydrological site over the historical K years, and the site end corresponding to the target hydrological site also sends, to the prediction end 120, original flow data representing the hourly flow data of the target hydrological site over the historical K years.

The prediction end 120 is configured to: acquire the single-end original rainfall data, the original flow data, prior rainfall data of each hydrological site for the future P hours and original position information of the N hydrological sites, and obtain original rainfall data from all the acquired single-end original rainfall data; preprocess the original rainfall data, the prior rainfall data for the future P hours, the original flow data and the original position information to respectively obtain processed rainfall data, processed prior rainfall data, processed flow data and processed position information; form gridded rainfall data based on the processed rainfall data, the processed prior rainfall data and the processed position information; based on the gridded rainfall data, extract spatial distribution features by using a first branch network of a pre-trained flood flow prediction model based on deep learning, and extract time-series features of the rainfall data over the historical T hours and the future P hours, to obtain a first output feature; based on the processed flow data, extract time-series features of the flow data over the historical T hours by using a second branch network of the flood flow prediction model based on deep learning, to obtain a second output feature; and merge the first output feature and the second output feature and perform classified prediction on them by using a third network of the flood flow prediction model based on deep learning, to obtain a flow prediction value of the target hydrological site for the future P hours, wherein N, K, T and P are natural numbers greater than 1, and T is less than or equal to the number of hours corresponding to the K years.

The following description is made for each part:

1. Site end 110:

The predetermined drainage basin is a geographical area containing N hydrological sites that monitor rainfall. One target hydrological site among the N hydrological sites is located at the outlet section of the predetermined drainage basin and monitors changes in the water level and flow at that outlet section.

Each hydrological site records and stores its hourly rainfall data, and the target hydrological site also records and stores its hourly flow data. A hydrological site may store the rainfall data or the flow data in the corresponding site end 110, which may be a processor or another electronic device. Alternatively, the hydrological site may store the rainfall data or the flow data at a storage address that has a communication connection with the corresponding site end 110.

In response to a data request from the prediction end 120, each site end 110 may extract the hourly rainfall data of its hydrological site over the historical K years from the stored rainfall data to form single-end original rainfall data, and send the single-end original rainfall data to the prediction end 120. The site end 110 corresponding to the target hydrological site may further extract the hourly flow data of the target hydrological site over the historical K years from the stored flow data to form original flow data, and send the original flow data to the prediction end 120.

As a preferred embodiment, the historical K years may be consecutive K years in the historical data that are closest to the current time.

2. The prediction end 120:

Referring to Fig. 2, Fig. 2 is a schematic structural diagram of a prediction end of the flood flow prediction system based on deep learning according to an embodiment of the present invention. The prediction end may include an original data acquisition module 1201, a data preprocessing module 1202, an input data construction module 1203, and a model prediction module 1204. The specific operation of each module is described below.

(1) The original data acquisition module 1201 is configured to acquire single-ended original rainfall data, original flow data, prior rainfall data of each hydrological site in P hours in the future, and original position information of the N hydrological sites, and acquire the original rainfall data from all the acquired single-ended original rainfall data.

The original data acquisition module 1201 merges and integrates all the acquired single-ended original rainfall data according to time to obtain original rainfall data.
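The time-based merge can be sketched in a few lines of Python. The function name `merge_by_time`, the nested-dictionary layout, the ISO-style timestamp strings and the use of `None` for hours that a site did not report are all assumptions made for illustration.

```python
def merge_by_time(single_end):
    """Merge per-site series into one table with a row per timestamp.

    single_end : dict mapping site id -> dict of timestamp -> rainfall
    Returns (timestamps, sites, rows); a missing reading becomes None.
    """
    sites = sorted(single_end)
    # Union of every timestamp reported by any site, in chronological order
    # (ISO-style strings sort chronologically).
    stamps = sorted(set().union(*(series.keys() for series in single_end.values())))
    rows = [[single_end[s].get(t) for s in sites] for t in stamps]
    return stamps, sites, rows


# Hypothetical readings from two site ends.
data = {
    "S1": {"2013-01-01 00:00": 0.0, "2013-01-01 01:00": 1.2},
    "S2": {"2013-01-01 01:00": 0.4},
}
stamps, sites, rows = merge_by_time(data)
```

Keeping missing readings explicit at this stage leaves the decision about removal or completion to the preprocessing step.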

The original rainfall data comprises the hourly rainfall data of the N hydrological sites of the predetermined drainage basin over the historical K years. The original flow data comprises the hourly flow data, over the historical K years, of the target hydrological site located at the outlet section of the predetermined drainage basin. The original position information includes the longitudes and latitudes of the N hydrological sites. N and K are natural numbers greater than 1.

Therefore, in the embodiment of the present invention, the hourly historical rainfall data over K years may be acquired from the N hydrological sites to generate the original rainfall data, and the hourly historical flow data over K years may be acquired from the target hydrological site to generate the original flow data. As a preferred embodiment, the historical K years may be the K consecutive years in the historical data that are closest to the current time.

As a specific example of the embodiment of the present invention, the drainage basin of a county in Henan Province, China is selected as the predetermined drainage basin. The basin has 50 hydrological sites in total, and the time range is 2013 to 2018, that is, N = 50 and K = 6. Examples of the obtained original rainfall data, original flow data and original position information are shown in Tables 1 to 3; the tables illustrate the form only, so specific values and names are omitted. TM is a timestamp with a granularity of one hour; S1 to S50 denote the rainfall data of the 50 hydrological sites; Q denotes the flow data of the target hydrological site; A1 to A50 denote the identifiers of the 50 hydrological sites; LGTD denotes the longitude coordinate of a hydrological site; LTTD denotes the latitude coordinate of a hydrological site; and S1 to S50 and Q are all given in millimetres.

Table 1 original rainfall data example table

TM	S1/mm	S2/mm	……	S50/mm
2013-01-01 00:00	/	/	/	/
2013-01-01 01:00	/	/	/	/
……	/	/	/	/
2013-12-31 23:00	/	/	/	/
……	/	/	/	/
2018-12-31 23:00	/	/	/	/

Table 2 original flow data example table

TM	Q/mm
2013-01-01 00:00	/
2013-01-01 01:00	/
……	/
2013-12-31 23:00	/
……	/
2018-12-31 23:00	/

Table 3 original position information example table

Site	LGTD	LTTD
A1	/	/
A2	/	/
……	/	/
A50	/	/

Those skilled in the art will appreciate that the dimensions of the original rainfall data illustrated in Table 1 are [(6 × 365 × 24), 50] = [52560, 50], the dimensions of the original flow data illustrated in Table 2 are [52560, 1], and the dimensions of the original position information illustrated in Table 3 are [50, 2].
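The dimension arithmetic can be checked with a short Python sketch. The variable names are illustrative. Note that the figure in the text uses 365-day years; since 2016 is a leap year, a calendar-exact count of hours from 2013 to 2018 is 24 rows larger, which the sketch computes separately.

```python
from datetime import datetime

N_SITES = 50   # number of hydrological sites in the example
K_YEARS = 6    # 2013 to 2018 inclusive

# Row count as stated in the text: 365-day years, leap day ignored.
rows_nominal = K_YEARS * 365 * 24
rain_shape = (rows_nominal, N_SITES)
flow_shape = (rows_nominal, 1)
position_shape = (N_SITES, 2)   # longitude and latitude per site

# Calendar-exact hour count for the same period, including 29 Feb 2016.
span = datetime(2019, 1, 1) - datetime(2013, 1, 1)
rows_calendar = span.days * 24
```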

The original data acquisition module 1201 can acquire the single-end original rainfall data and the original flow data from the site ends, and can acquire the prior rainfall data of each hydrological site for the future P hours from sources such as weather forecasts. The future P hours is the future time period to be predicted. For example, P may be 24.

(2) The data preprocessing module 1202 is configured to preprocess the original rainfall data, the prior rainfall data, the original flow data, and the original position information to obtain processed rainfall data, processed prior rainfall data, processed flow data, and processed position information, respectively.

The processing of the raw rainfall data by the data preprocessing module 1202 may include:

and carrying out data elimination, data completion processing and normalization processing on the original rainfall data to obtain the processed rainfall data.

In practice, rainfall data of a hydrological site may be missing because of recording omissions and similar causes. If original rainfall data with too many missing values is used for calculation, the accuracy of subsequent prediction suffers. Therefore, in this step, the data of hydrological sites with too many missing values is removed first.

In an alternative embodiment, the above process may include the following three steps:

step one, removing data corresponding to hydrological sites of which the data number is lower than a preset number from original rainfall data to obtain residual rainfall data;

For example, for the original rainfall data of Table 1, if the number of rainfall records of a hydrological site is lower than a preset number of 30000, the proportion of existing data is smaller than $30000 / 52560 \approx 57\%$.

This means that almost half of the data is missing, and it is not appropriate to fill in such a large number of missing values in the subsequent process: doing so would introduce too many artificial factors, reduce the generalization capability of the model, and cause the rules learned by the final neural network to deviate from the real data relationships. Therefore, the data of hydrological sites with many missing values is removed and the rainfall data of the better hydrological sites is retained, yielding the residual rainfall data.

The preset quantity can be reasonably selected according to the dimension of the original rainfall data and the requirement of data precision.
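Step one can be sketched as a simple filter in Python. The function name `drop_sparse_sites`, the dictionary layout and the use of `None` to mark a missing record are assumptions for illustration; the threshold in the toy example is scaled down from the 30000 used in the text.

```python
def drop_sparse_sites(rain, min_count):
    """Keep only sites with at least min_count non-missing records.

    rain : dict mapping site id -> list of hourly values (None = missing)
    """
    kept = {}
    for site, series in rain.items():
        present = sum(1 for value in series if value is not None)
        # Sites whose record count is lower than the preset number are removed.
        if present >= min_count:
            kept[site] = series
    return kept


# Hypothetical four-hour series for three sites.
rain = {
    "S1": [0.0, 1.5, None, 0.2],
    "S2": [None, None, 0.3, None],
    "S3": [0.0, 0.0, 0.0, 0.0],
}
remaining = drop_sparse_sites(rain, min_count=3)
```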

And step two, performing data completion on the rainfall data missing from the residual rainfall data by using an inverse distance weighting method to obtain completed rainfall data.

It can be understood by those skilled in the art that there may still be some missing rainfall data at some time points in the remaining rainfall data, and the missing rainfall data can be complemented by using a missing value complementing algorithm.

Among such algorithms, the inverse distance weighting (IDW) algorithm is a classic algorithm for missing value completion. Following the principle that the farther apart two sites are, the weaker their correlation, the inverse distance weighting algorithm completes the missing rainfall data of a hydrological site by using the rainfall data of adjacent hydrological sites. The formula of the inverse distance weighting algorithm is shown in the following formula (1).

Where $q$ is the estimated rainfall value of the hydrological site with missing data; $m$ is the number of adjacent hydrological sites participating in the calculation; $q_r$ is the actual rainfall value of adjacent hydrological site $r$; $d_r$ is the actual distance from adjacent hydrological site $r$ to the hydrological site with missing data; and $p$ is the norm type used in calculating the distance vector, which generally takes the value 2.

The inverse distance weighting algorithm can thus fill in the rainfall data of the hydrological sites whose rainfall data is missing in the residual rainfall data, producing complete rainfall data, namely the completed rainfall data. That is, the data in Table 1 is completely filled in.
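The following is a minimal Python sketch of inverse distance weighting in its commonly used form, in which each neighbouring site is weighted by 1 / d^power. Planar Euclidean distance, the function name `idw_estimate`, the `power` parameter and the sample coordinates are assumptions made for illustration; the text states only that the distance is the actual distance between sites and that p generally takes the value 2.

```python
import math


def idw_estimate(target_xy, neighbours, power=2):
    """Estimate a missing value from neighbouring sites.

    target_xy  : (x, y) of the site with missing data
    neighbours : list of ((x, y), value) for sites with data in that hour
    power      : exponent applied to the distance (assumed role of p)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in neighbours:
        distance = math.hypot(x - target_xy[0], y - target_xy[1])
        if distance == 0:
            # A neighbour at the same location determines the value directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical neighbours at distances 1 and 2 from the target site.
estimate = idw_estimate((0.0, 0.0), [((1.0, 0.0), 10.0), ((2.0, 0.0), 20.0)])
```

With a power of 2 the nearer neighbour carries four times the weight of the one twice as far away, so the estimate stays close to the nearer reading.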

And step three, performing normalization processing on the rainfall data of all hydrological stations in the supplemented rainfall data.

Normalization is a dimensionless processing technique that converts the absolute values of physical quantities into relative values. It is an effective way to simplify calculation and reduce magnitude. Proper normalization speeds up the search for the optimal solution by gradient descent, reduces the magnitude of the model parameter values, accelerates convergence during model training, and improves the performance and accuracy of the model.

Common normalization methods include: min-max normalization, standard deviation normalization, non-linear normalization, etc. Since the rainfall data and the flow data do not satisfy the normal distribution, the normalization process includes a [0,1] normalization process in a preferred embodiment of the present invention.

The [0,1] normalization process is a form of min-max normalization, which rescales the values of each dimension of the data so that the final data vector falls within [0,1]. The specific formula of the [0,1] normalization process is shown in the following formula (2).

$$ h^{*} = \frac{h - h_{\min}}{h_{\max} - h_{\min}} \tag{2} $$

Where $h_{\max}$ and $h_{\min}$ are the maximum and minimum values in the data samples, respectively; $h$ is the raw data before normalization; and $h^{*}$ is the normalized data.

Applying the [0,1] normalization process to the completed rainfall data yields the processed rainfall data.
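A minimal Python sketch of [0,1] (min-max) normalization follows. The function name `min_max_normalize` and the handling of a constant series, which is mapped to zeros to avoid division by zero, are assumptions not covered by the text.

```python
def min_max_normalize(values):
    """Rescale a list of numbers linearly so that they fall within [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Constant series: the formula is undefined, so return zeros (assumption).
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical hourly rainfall values in millimetres.
normalized = min_max_normalize([0.0, 5.0, 10.0])
```

To map predictions back to physical units, the minimum and maximum used here would need to be kept alongside the model.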

The processing of the original flow data by the data preprocessing module 1202 may include:

performing normalization processing on the original flow data to obtain the processed flow data.

The normalization process of this process is the same as the normalization process in the above step. Through the normalization processing of [0,1], the processed flow data of the original flow data after normalization processing can be obtained.

The processing of the a priori rainfall data by the data preprocessing module 1202 may include:

performing normalization processing on the prior rainfall data to obtain the processed prior rainfall data.

The normalization process of this process is the same as the normalization process in the above step. Through the normalization processing of the [0,1], the processed prior rainfall data after normalization processing of the prior rainfall data can be obtained.

The processing of the original position information by the data preprocessing module 1202 may include:

In this embodiment, data gridding generates coordinates from the longitude and latitude of each station with a grid division value of 0.01; the coordinate calculation is shown in formula (3) below.

Wherein lg_i is the longitude of hydrological station i, la_i is the latitude of hydrological station i, lg_max and lg_min are respectively the maximum and minimum longitude among all hydrological stations, and la_max and la_min are respectively the maximum and minimum latitude among all hydrological stations. The calculation result is rounded to obtain the position coordinates (x_i, y_i) of the corresponding hydrological station. The position coordinates of each hydrological station are represented by one grid cell, so the processed position information is obtained by combining the grid cells corresponding to the position coordinates of all hydrological stations.

It is to be understood that the processed location information contains a plurality of grids, each grid representing location information for one hydrological site.
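The mapping from longitude and latitude to grid cells can be sketched as follows. Because formula (3) is not reproduced here, the code assumes x_i = round((lg_i - lg_min) / 0.01) and y_i = round((la_i - la_min) / 0.01); the actual formula may differ. The function name and the station coordinates are hypothetical.

```python
def station_grid_coords(stations, step=0.01):
    """Map (longitude, latitude) pairs to integer grid coordinates.

    Assumption: x_i = round((lg_i - lg_min) / step) and
    y_i = round((la_i - la_min) / step); the source's formula (3) may differ.
    """
    lg_min = min(lg for lg, _ in stations)
    lg_max = max(lg for lg, _ in stations)
    la_min = min(la for _, la in stations)
    la_max = max(la for _, la in stations)
    coords = [(round((lg - lg_min) / step), round((la - la_min) / step))
              for lg, la in stations]
    # Grid extent: one cell per step between the extreme stations, inclusive.
    shape = (round((lg_max - lg_min) / step) + 1,
             round((la_max - la_min) / step) + 1)
    return coords, shape


# Hypothetical longitude/latitude values for three stations.
coords, shape = station_grid_coords(
    [(110.00, 30.00), (110.05, 30.02), (110.10, 30.10)])
```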

(3) And the input data construction module 1203 is configured to form gridded rainfall data based on the processed rainfall data, the processed prior rainfall data and the processed position information.

And aiming at each hour, selecting rainfall data corresponding to each hydrological site in the hour from the processed rainfall data and the processed prior rainfall data, respectively filling the rainfall data into corresponding positions of each hydrological site in the processed position information to form first gridded rainfall data corresponding to the hour, and forming the gridded rainfall data by all the first gridded rainfall data corresponding to the processed rainfall data and the processed prior rainfall data.

It can be understood that, after the above steps, the N hydrological stations are represented on grid coordinates. In this step, the rainfall data of each hydrological station for each hour is taken from the processed rainfall data and the processed prior rainfall data and filled into that station's position in the processed position information. Once all hydrological stations have been filled in for a given hour, the first gridded rainfall data for that hour is obtained. The same operation is carried out for every hour of the historical K years and the future P hours. Each hour thus yields one first gridded rainfall data, and all of them together form the gridded rainfall data.

As will be appreciated by those skilled in the art, the dimension of the gridded rainfall data is [52560, x, y], where x and y respectively represent the matrix spans of the gridded rainfall data.
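The construction of the gridded rainfall data can be sketched with NumPy as below. The function name `build_gridded_rainfall`, the zero fill for cells without a station, and the sample data are assumptions; the text does not state which value empty cells take.

```python
import numpy as np


def build_gridded_rainfall(rainfall, coords, grid_shape):
    """Place per-station hourly rainfall onto a grid.

    rainfall: array of shape [hours, N], one column per station.
    coords: list of N (x, y) integer grid positions.
    Returns an array of shape [hours, x, y]; cells without a station stay 0
    (an assumption, the source does not state the fill value).
    """
    rainfall = np.asarray(rainfall, dtype=float)
    hours, n_stations = rainfall.shape
    if n_stations != len(coords):
        raise ValueError("one coordinate pair is required per station")
    grid = np.zeros((hours, *grid_shape), dtype=float)
    for station, (x, y) in enumerate(coords):
        grid[:, x, y] = rainfall[:, station]
    return grid


# Hypothetical data: 4 hours, 2 stations, a 3 x 3 grid.
rain = [[0.1, 0.0], [0.2, 0.5], [0.0, 0.4], [0.3, 0.3]]
gridded = build_gridded_rainfall(rain, [(0, 0), (2, 1)], (3, 3))
```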

(4) The model prediction module 1204 is configured to extract spatial distribution features from the meshed rainfall data by using a first branch network of a flood flow prediction model based on deep learning obtained through pre-training, and extract time sequence features of rainfall data in historical T hours and future P hours to obtain first output features; based on the processed flow data, extracting time sequence characteristics of the flow data in T hours of history by using a second branch network of a flood flow prediction model based on deep learning to obtain second output characteristics; and carrying out merging, classifying and predicting on the first output characteristics and the second output characteristics by using a third network of a flood flow prediction model based on deep learning to obtain a flow prediction value of the target hydrological station in the future P hours.

The model prediction module 1204 is internally provided with a flood flow prediction model based on deep learning and obtained by pre-training. In order to facilitate understanding of the prediction process of the embodiment of the present invention, first, a structure and a training process of the flood flow prediction model based on deep learning of the embodiment of the present invention are described.

In order to realize flood flow prediction in a future time period, the embodiment of the invention provides a specific flood flow prediction model based on deep learning by using a Convolutional Neural Network (CNN) technology and a Recurrent Neural Network (RNN) technology.

The prediction task can be expressed as shown in formula (4) below.

Q_[1...P]＝f(X_[T+P...0])s.p.N≤0 (4)

Wherein X_[T+P...0] represents the input data, namely the rainfall data and flow data at known hours, and Q_[1...P] represents the flow prediction results for hours 1 to P in the future.

The invention can forecast the future flow in P hours by combining rainfall data.

The convolutional neural network is a neural network designed for processing grid-structured data. Compared with a traditional fully connected network structure, it has few parameters and an easily adjustable receptive field, and it has great advantages in image processing. Applied to water conservancy in combination with the generated two-dimensional rainfall spatial distribution data, it can effectively extract the spatial distribution information of the rainfall data.

The recurrent neural network is a neural network designed for processing time sequence features. Its hidden state transfer mechanism passes the features of the current data on to the next time step for merging and analysis, which gives the whole sequence causal continuity in time. Compared with a plain RNN, the GRU network, an RNN variant, is provided with gating (described here as a forgetting gate; a GRU's gates are its update and reset gates), which can effectively prevent the gradient problems, such as gradient explosion, caused by an overly long time sequence.

The embodiment of the invention builds the flood flow prediction model based on deep learning in Python, and FIG. 3 is a schematic structural diagram of the flood flow prediction model based on deep learning. The flood flow prediction model based on deep learning comprises a branch network group and a third network connected in series, wherein the branch network group comprises a first branch network and a second branch network connected in parallel.

The first branch network comprises a dimensionality reduction network, a feature transformation network and a plurality of GRU layers. The second branch network comprises a plurality of GRU layers. The third network comprises a concat module and a plurality of fully connected layers.

Each network of the flood flow prediction model based on deep learning is specifically described below.

(1) A first branch network:

Referring to FIG. 4, FIG. 4 is a specific structural diagram of the first branch network according to an embodiment of the present invention. The dimensionality reduction network comprises a cascade structure formed by convolution layers and pooling layers. In this embodiment, a CNN structure is used for the dimensionality reduction network.

The convolutional layer in the embodiment of the present invention may be a two-dimensional convolutional layer, represented by Conv2D, and the Pooling layer represented by Pooling. The parameters for the specific layers are shown in Table 4.

Table 4 layer parameters of dimensionality reduction network of flood flow prediction model based on deep learning

In the model, Conv2D is used for feature extraction, and Pooling is used for reducing the number of parameters and performing feature screening. All convolutions use the VALID mode, and all pooling layers use max pooling (MaxPooling). The resulting output vector has dimensions [T, k1, k2], where k1 and k2 depend on the convolution block size settings.
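How k1 and k2 follow from the layer settings can be traced with a short calculation. The sketch below uses the standard output-size rule for VALID (no padding) convolution and pooling. The layer stack is hypothetical, chosen only so that a 150-cell axis shrinks to 30 cells; it is not the content of Table 4.

```python
def conv_valid_out(size, kernel, stride=1):
    """Output length of a VALID (no padding) convolution along one axis."""
    return (size - kernel) // stride + 1


def max_pool_out(size, pool, stride=None):
    """Output length of max pooling without padding along one axis."""
    stride = pool if stride is None else stride
    return (size - pool) // stride + 1


# Hypothetical layer stack; Table 4's actual parameters are not reproduced here.
size = 150
trace = [size]
for kind, k in [("conv", 3), ("pool", 2), ("conv", 3), ("pool", 2), ("conv", 7)]:
    size = conv_valid_out(size, k) if kind == "conv" else max_pool_out(size, k)
    trace.append(size)
```

Because the grid is square and the kernels are square, the same trace applies to both spatial axes.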

The output of the dimensionality reduction network is three-dimensional data, whereas the GRU network requires two-dimensional input data. The first dimension of the GRU input is the time step T, which is consistent with the output of the dimensionality reduction network, and the second dimension is the feature vector at each time point. In the embodiment of the invention, a feature transformation network is therefore added that reconstructs the dimensions by reshaping: k1 and k2 of the output of the dimensionality reduction network are merged into a single dimension that serves as the feature vector of the corresponding time point. The resulting dimension is [T, k1 × k2], which meets the input dimension requirement of the GRU network.
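The reshape can be sketched with NumPy as follows; the sizes T = 124 and k1 = k2 = 30 and the placeholder values are assumptions used only for illustration.

```python
import numpy as np

T, k1, k2 = 124, 30, 30  # hypothetical sizes for illustration
reduced = np.arange(T * k1 * k2, dtype=float).reshape(T, k1, k2)

# Merge the two spatial axes into one feature axis per time step.
flattened = reduced.reshape(T, k1 * k2)
```

The time axis is left untouched, so row t of `flattened` holds exactly the k1 × k2 values of frame t in row-major order.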

The first branch network comprises a GRU layer 1, a GRU layer 2 and a GRU layer 3.

In this embodiment, the RNN computation module adopts a GRU network as its structural core, and uses a model architecture of multiple layers of GRUs, and the specific structure is shown in fig. 4.

In this embodiment, a GRU network is constructed using three GRU layers. The numbers of neurons in the layers of the GRU network are 50, 30 and 10 in sequence, and the dimension of the output data is [T, 10]. The parameters of the specific layers are shown in Table 5.

Table 5 Layer parameters of the GRU network

GRU layer	Number of neurons
GRU layer 1	50
GRU layer 2	30
GRU layer 3	10
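To make the shape changes through the three GRU layers concrete, the following is a minimal forward-pass sketch in NumPy. It uses the standard GRU update equations with randomly initialized weights. The class name `GRULayer`, the weight scale and the input size [124, 900] are assumptions; a real implementation would normally use a deep learning framework and trained weights.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GRULayer:
    """Minimal GRU layer (forward pass only) that returns every time step."""

    def __init__(self, input_dim, units, rng):
        scale = 0.1
        # Input weights, recurrent weights and biases for the update gate (z),
        # reset gate (r) and candidate state, stacked along the last axis.
        self.W = rng.normal(0.0, scale, (input_dim, 3 * units))
        self.U = rng.normal(0.0, scale, (units, 3 * units))
        self.b = np.zeros(3 * units)
        self.units = units

    def forward(self, x):
        n = self.units
        h = np.zeros(n)
        outputs = []
        for x_t in x:  # x has shape [T, input_dim]
            xw = x_t @ self.W + self.b
            z = sigmoid(xw[:n] + h @ self.U[:, :n])
            r = sigmoid(xw[n:2 * n] + h @ self.U[:, n:2 * n])
            cand = np.tanh(xw[2 * n:] + (r * h) @ self.U[:, 2 * n:])
            h = z * h + (1.0 - z) * cand
            outputs.append(h)
        return np.stack(outputs)  # shape [T, units]


rng = np.random.default_rng(0)
x = rng.random((124, 900))  # hypothetical input of shape [time steps, features]
shapes = []
for units in (50, 30, 10):  # neuron counts from Table 5
    x = GRULayer(x.shape[1], units, rng).forward(x)
    shapes.append(x.shape)
```

Each layer keeps the time axis and only changes the feature width, so the stack ends with one 10-dimensional vector per time step.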

(2) A second branch network:

Referring to FIG. 5, FIG. 5 is a specific structural diagram of the second branch network according to an embodiment of the present invention. Like the first branch network, the second branch network uses three GRU layers, with the same parameters as the GRU layers of the first branch network, which are not described here again.

(3) A third network:

Referring to FIG. 6, FIG. 6 is a specific structural diagram of the third network according to an embodiment of the present invention.

Concat is a feature fusion method in neural networks that merges features along the channel dimension.

The fully connected layers are mainly used for numerical fitting and increase the fitting capacity of the model. A cascade of multiple fully connected layers is used, with dropout and batch normalization operations added to avoid overfitting. The numbers of neurons in the three fully connected layers are 100, 50 and P in sequence. Dropout uses a discard ratio of 0.1. The output dimension of the model is [P, 1], which is the flow in the future P hours. The parameters of the specific layers are shown in Table 6.

Table 6 Layer parameters of the fully connected layers

Fully connected layer	Number of neurons
Fully connected layer 1	100
Fully connected layer 2	50
Fully connected layer 3	P

The flood flow prediction model based on deep learning is obtained through iterative training on the historical data of the N hydrological stations, using the rainfall data and flow data of the previous T hours together with the rainfall data and flow data of the subsequent P hours.

For example, the sample input data may be constructed from 100 hours of rainfall data and flow data drawn from the historical data of 2013 to 2017, together with the known rainfall data for the 24-hour period to be predicted (see above). Iterative training is then performed with the known flow data for the period to be predicted as the true value, for example for 100 iterations, until the trained model parameters are obtained, yielding the trained flood flow prediction model based on deep learning. The model training process is outlined as follows.

1) Each sample data in the sample input data, together with its corresponding true value, is passed through the initial network model with the structure shown in FIG. 3 to obtain the training result of each sample data.

2) The training result of each sample data is compared with the true value corresponding to the sample data to obtain the prediction result corresponding to the sample data.

3) The loss value of the network model is calculated according to the prediction result corresponding to each sample data.

4) The parameters of the network model are adjusted according to the loss value, and steps 1) to 3) are repeated until the loss value of the network model meets a convergence condition, that is, the loss value reaches its minimum, which means that the training result of each sample data is consistent with the true value corresponding to the sample data. This completes the training of the network model and yields the trained flood flow prediction model based on deep learning.

Specifically, the training process uses the Adam (adaptive moment estimation) gradient descent optimization algorithm to obtain a better loss descent process. The loss function is MSE (mean squared error), and the learning rate is 0.01. A Dropout operation is also added. Dropout is a parameter regularization technique used in deep learning and can serve as a trick for training deep neural networks: by ignoring a portion of the feature detectors in each training batch (for example, setting half of the hidden layer node values to 0), the overfitting phenomenon can be significantly reduced.
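The loop of forward pass, loss calculation and parameter adjustment can be sketched on a toy problem. The code below applies the Adam update rule with an MSE loss and the learning rate of 0.01 to a two-parameter linear model that stands in for the network. The model, the data and the function name `adam_fit` are assumptions for illustration and not the training code of the embodiment.

```python
import math


def adam_fit(xs, ys, lr=0.01, steps=2000, beta1=0.9, beta2=0.999, eps=1e-8):
    """Fit y = w * x + b by minimizing MSE with the Adam update rule."""
    params = [0.0, 0.0]          # w, b
    m = [0.0, 0.0]               # first-moment estimates
    v = [0.0, 0.0]               # second-moment estimates
    n = len(xs)
    losses = []
    for t in range(1, steps + 1):
        w, b = params
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        losses.append(sum(e * e for e in errors) / n)          # MSE loss
        grads = [2.0 * sum(e * x for e, x in zip(errors, xs)) / n,
                 2.0 * sum(errors) / n]
        for i, g in enumerate(grads):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g
            v[i] = beta2 * v[i] + (1.0 - beta2) * g * g
            m_hat = m[i] / (1.0 - beta1 ** t)                   # bias correction
            v_hat = v[i] / (1.0 - beta2 ** t)
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params, losses


# Hypothetical normalized data generated from y = 0.5 * x + 0.1.
xs = [i / 10 for i in range(11)]
ys = [0.5 * x + 0.1 for x in xs]
(w, b), losses = adam_fit(xs, ys)
```

The recorded `losses` list plays the role of the loss curve that is monitored for convergence in step 4).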

After training is completed, the input data to be predicted can be fed into the pre-trained flood flow prediction model based on deep learning for prediction.

The specific process is shown in FIG. 7, which is a schematic diagram of a prediction process according to an embodiment of the present invention. T = 100 and P = 24 are taken as an example.

Preferably, when the historical T hours and the future P hours are continuous time, the obtained flow prediction value is more accurate.

To facilitate understanding of the form of the input data, please refer to FIG. 8, which is an exemplary diagram of the input data according to an embodiment of the invention. The figure is merely an example of the form, and specific data is not shown. In the figure, 0 represents the current time. FIG. 8 is simplified for illustration.
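The form of one input sample can be sketched as slices around the current time. In the sketch below, the rainfall window covers the historical T hours and the future P hours, while the flow window covers only the historical T hours. The function name `build_sample`, the convention that index `now` is the first future hour, and the placeholder series are assumptions.

```python
def build_sample(rain_frames, flow, now, T=100, P=24):
    """Slice one training or prediction sample around index `now`.

    rain_frames: per-hour rainfall features (observed before `now`,
                 prior rainfall from `now` onward), indexed by hour.
    flow: per-hour flow values, indexed by hour.
    Returns (rain_window, flow_window, target), where target is the flow in
    the next P hours and is only available for historical samples.
    """
    if now < T or now + P > len(rain_frames):
        raise ValueError("not enough data around the chosen time")
    rain_window = rain_frames[now - T:now + P]   # T history + P future hours
    flow_window = flow[now - T:now]              # T history hours only
    target = flow[now:now + P]                   # flow to be predicted
    return rain_window, flow_window, target


# Hypothetical series of 200 hours; values are just hour indices.
hours = list(range(200))
rain_w, flow_w, target = build_sample(hours, hours, now=150)
```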

Specifically, the prediction process of the model prediction module 1204 may include the following steps S1 to S3, wherein the sequence of the steps S1 and S2 is not limited.

And step S1, based on the gridded rainfall data, extracting spatial distribution characteristics by using a first branch network of a flood flow prediction model based on deep learning obtained by pre-training, and extracting time sequence characteristics of rainfall data of historical T hours and future P hours to obtain a first output characteristic.

The first branch network of the flood flow prediction model based on deep learning reduces the dimension of the original grid data using an autoencoder (AE) scheme. An autoencoder learns from the input information and is a form of representation learning of the input features. This reduces the dimension of the gridded data while retaining the characteristics of the original data to the greatest extent. The specific structure is shown in FIG. 9, which is a schematic diagram of the training structure of the autoencoder according to an embodiment of the present invention. The encoder module encodes the input and the decoder module decodes it; during AE training, the difference between the output of the decoder and the input of the encoder is made as small as possible. After the model has been trained, the decoder module is removed, and the output of the encoder module is used as the dimension reduction result.

Besides the above dimension reduction method, the dimension can also be reduced by PCA from traditional machine learning, network embedding, and the like.

It can be understood that, the dimension reduction can reduce the calculation amount, and can effectively remove some noises in the data, thereby improving the training speed and the prediction accuracy of the model.
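As an illustration of the PCA alternative mentioned above (not the autoencoder used in the embodiment), the following sketch projects flattened grid frames onto their leading principal components using a singular value decomposition. The function name `pca_reduce`, the grid size and the number of components are assumptions.

```python
import numpy as np


def pca_reduce(frames, n_components):
    """Project flattened grid frames onto their top principal components.

    frames: array of shape [hours, x * y].
    Returns an array of shape [hours, n_components].
    """
    frames = np.asarray(frames, dtype=float)
    centered = frames - frames.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
grid = rng.random((200, 15 * 15))      # hypothetical [hours, x * y] data
reduced = pca_reduce(grid, n_components=9)
```

Unlike the convolutional approach, PCA treats each flattened frame as a plain vector, so it does not exploit the spatial neighborhood structure of the grid.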

Specifically, step S1 may include the following steps S11 to S14.

And step S11, inputting the gridded rainfall data into a dimensionality reduction network of the first branch network to obtain dimensionality reduction data.

Because the stations cover a large area, generating the grid at a scale of 0.01 produces a very large grid, which places a heavy burden on hardware data storage and on model training computation. If a larger scale were used to generate the grid, the matrix would shrink but a large amount of spatial distribution information would be lost, which is contrary to the purpose of using a CNN in this embodiment. Therefore, in the embodiment of the present invention, the dimensionality reduction network of the first branch network is the autoencoder described above and is used to reduce the dimension of the gridded rainfall data.

In this embodiment, taking 50 hydrological stations of a basin in a county as an example, processed position information with dimension [150, 150] is obtained.

Based on the processed rainfall data and the processed position information, the dimension of the resulting gridded rainfall data is [K × 365 × 24 + P, 150, 150] = [8760K + P, 150, 150]. After the two-dimensional convolution layers and max pooling layers of the dimensionality reduction network, the dimension of the resulting dimension-reduced data is [8760K + P, 30, 30].

And step S12, performing feature flattening on the dimension reduction data by using the feature transformation network of the first branch network to obtain flattened data.

To match the input vector dimensions of the GRU network, the dimensions of the dimension-reduced data need to be reconstructed. Therefore, feature flattening is performed on the dimension-reduced data, and the dimension of the resulting flattened data is [8760K + P, 30 × 30] = [8760K + P, 900].

In step S13, data corresponding to the historical T hours and the future P hours are selected from the flattened data to form a first vector.

Rainfall data of the historical 100 hours and the future 24 hours is selected, and the dimension of the resulting first vector is [124, 900].

Step S14, the timing characteristics of the first vector are extracted by using a plurality of GRU layers of the first branch network, so as to obtain a first output characteristic.

After the first vector passes through GRU layer 1, the dimension changes from [124, 900] to [124, 50]; after GRU layer 2, the dimension changes from [124, 50] to [124, 30]; after GRU layer 3, the dimension changes from [124, 30] to [124, 10], consistent with the neuron counts in Table 5.

Finally, the time sequence features of the first vector are extracted through the three GRU layers of the first branch network to obtain the first output feature, and the dimension changes from [124, 10] to [10].

And step S2, based on the processed flow data, extracting the time sequence characteristics of the flow data in the historical T hours by using a second branch network based on the flood flow prediction model of deep learning, and obtaining second output characteristics.

Specifically, step S2 may include the following steps S21 to S22.

And step S21, selecting data corresponding to T hours from the processed flow data to form a second vector.

The processed flow data of the historical 100 hours is selected to form the second vector, whose dimension is [100, 1].

Step S22, extracting the timing characteristics of the second vector by using the plurality of GRU layers of the second branch network, and obtaining a second output characteristic.

After the second vector passes through GRU layer 1, the dimension changes from [100, 1] to [100, 50]; after GRU layer 2, the dimension changes from [100, 50] to [100, 30]; after GRU layer 3, the dimension changes from [100, 30] to [100, 10], consistent with the neuron counts in Table 5.

Finally, the time sequence features of the second vector are extracted through the three GRU layers of the second branch network. For the resulting [100, 10] vector, an attention mechanism assigns a weight to the data of each of the 100 hours, and weighted summation yields the second output feature with dimension [10].
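The weighted summation over the time axis can be sketched as follows. The text does not specify how the attention weights are computed, so the sketch assumes a simple dot-product score against a learned vector followed by a softmax. The name `attention_pool`, the score vector and the random features are hypothetical.

```python
import numpy as np


def attention_pool(sequence, score_vector):
    """Collapse a [T, d] sequence into a [d] vector by weighted summation.

    score_vector is a hypothetical learned parameter of shape [d]; the source
    does not specify how the attention weights are computed.
    """
    scores = sequence @ score_vector            # one score per hour, shape [T]
    scores = scores - scores.max()              # for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()   # softmax over time
    return weights @ sequence, weights


rng = np.random.default_rng(0)
features = rng.random((100, 10))    # hypothetical GRU output, shape [T, 10]
pooled, weights = attention_pool(features, rng.normal(size=10))
```

Because the weights are non-negative and sum to 1, the pooled vector is a convex combination of the per-hour feature vectors.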

And step S3, merging, classifying and predicting the first output characteristics and the second output characteristics by using a third network of the flood flow prediction model based on deep learning to obtain a flow prediction value of the target hydrological station in the future P hours.

Specifically, step S3 may include the following steps S31 to S32.

And step S31, merging the first output characteristic and the second output characteristic by using a concat module of the third network to obtain a merged characteristic.

The concat module of the third network merges the first output feature and the second output feature as vectors, giving a merged feature of dimension [20].

And step S32, carrying out classification prediction on the merged features by utilizing a plurality of full connection layers of the third network to obtain a predicted flow value of the target hydrological site in the future P hours.

In this embodiment, the predicted flow rate is the predicted flow rate in the future 24 hours.

The number of neurons in the three fully-connected layers is 100, 50 and 24 respectively. And carrying out classification prediction on the merging characteristics by utilizing three full-connection layers to realize numerical fitting of the network.

After the merged features pass through fully connected layer 1, the dimension changes from [20] to [100]; after fully connected layer 2, the dimension changes from [100] to [50]; after fully connected layer 3, the dimension changes from [50] to [24], consistent with the neuron counts of 100, 50 and 24 given above.

And finally, obtaining a flow predicted value of the target hydrological site with the dimensionality of [24] in the future P hours.
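The merging and fully connected stage can be sketched in NumPy as below, using the neuron counts 100, 50 and P from Table 6 with P = 24. The random weights, the ReLU activation on the hidden layers and the omission of dropout and batch normalization (as at inference time) are assumptions; the text names no activation function.

```python
import numpy as np


def dense(x, weights, bias, activation=None):
    """Fully connected layer: y = activation(x @ W + b)."""
    y = x @ weights + bias
    return np.maximum(y, 0.0) if activation == "relu" else y


rng = np.random.default_rng(0)
first_feature = rng.random(10)     # hypothetical first-branch output
second_feature = rng.random(10)    # hypothetical second-branch output
merged = np.concatenate([first_feature, second_feature])   # shape [20]

P = 24
x = merged
shapes = [x.shape]
layer_sizes = [100, 50, P]         # neuron counts from Table 6
for i, units in enumerate(layer_sizes):
    w = rng.normal(0.0, 0.1, (x.shape[0], units))
    b = np.zeros(units)
    # ReLU on hidden layers is an assumption; the source names no activation.
    last = i == len(layer_sizes) - 1
    x = dense(x, w, b, activation=None if last else "relu")
    shapes.append(x.shape)
```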

The data dimensions after each layer of processing are labeled in FIG. 7 to facilitate understanding of how the data changes. For the specific processing of the GRU layers and the fully connected layers, please refer to the prior art; a detailed description is omitted here.

Referring to FIG. 10, FIG. 10 is a comparison of water peak prediction results according to the embodiment of the present invention. The upper graph is the actual flow in 2018, and the lower graph is the predicted flow in 2018. The flow axis is the water peak value in millimeters, and the time axis is in hours (h). The upper and lower graphs are highly similar, which shows that the flood flow prediction model based on deep learning of the embodiment of the invention can achieve high prediction accuracy.

Therefore, the flood flow prediction method based on deep learning of the embodiment of the invention can realize accurate prediction of the water flow in a future time period.

It can be understood that the flow prediction value obtained by the flood flow prediction model based on deep learning in the embodiment of the present invention is also normalized data, and then, optionally, a value having the same data specification as the actually measured flow value can be obtained by using corresponding data processing. This will not be described in detail.
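One possible form of this data processing is the inverse of the [0,1] normalization, h = h^* × (h_max − h_min) + h_min, sketched below. The function name `denormalize` and the flow range are hypothetical; h_min and h_max are assumed to be the values used when the flow data was normalized.

```python
def denormalize(normalized, h_min, h_max):
    """Invert [0, 1] normalization: h = h* * (h_max - h_min) + h_min."""
    return [value * (h_max - h_min) + h_min for value in normalized]


# Hypothetical flow range from the training data.
flows = denormalize([0.0, 0.25, 1.0], h_min=10.0, h_max=410.0)
```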

In an optional embodiment, the predicting end may output the obtained predicted flow value of the target hydrologic site in the future P hours, for example, send the predicted flow value to another device or display the predicted flow value, for example, display the predicted flow value on a display screen of a predetermined electronic device.

In an optional implementation manner, the predicting end may further compare the obtained flow predicted value of the target hydrologic site in the P hours in the future with a predetermined threshold, and send out warning information when the flow predicted value of the target hydrologic site in the P hours in the future is greater than the predetermined threshold.
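The threshold comparison can be sketched as follows. The function name `flood_warnings`, the predicted values and the threshold are hypothetical, and printing stands in for whatever warning channel the prediction end actually uses.

```python
def flood_warnings(predicted_flow, threshold):
    """Return (hours_ahead, flow) pairs whose predicted flow exceeds threshold."""
    return [(hour, flow)
            for hour, flow in enumerate(predicted_flow, start=1)
            if flow > threshold]


# Hypothetical predictions for the next 6 hours and a hypothetical threshold.
alerts = flood_warnings([120.0, 180.0, 260.0, 310.0, 240.0, 150.0],
                        threshold=250.0)
if alerts:
    print(f"WARNING: threshold exceeded at hours {[h for h, _ in alerts]}")
```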

In the scheme provided by the embodiment of the invention, historical rainfall data and future prior rainfall data are gridded by utilizing the position information of each hydrological site, so that the spatial distribution information of rainfall is introduced, and the gridded rainfall data is used for extracting spatial distribution characteristics and time sequence characteristics by utilizing one of the branch networks of the pre-trained flood flow prediction model based on deep learning; and performing time sequence feature extraction on the historical flow data by using the other branch network of the flood flow prediction model based on deep learning, and performing merging, classifying and predicting on the output features of the two branch networks to obtain a flood flow prediction value in a future time period. The prediction system provided by the embodiment of the invention can obtain the flood flow prediction result of a future time period at one time, and the result considers the space distribution condition of rainfall, so that the information described by the actual rainfall data can be fully mined to obtain the prediction result conforming to the actual rainfall condition, and the prediction accuracy is higher.

In a second aspect, the flood flow prediction method based on deep learning provided in the embodiments of the present invention is applied to a prediction end in a flood flow prediction system based on deep learning, where the flood flow prediction system based on deep learning further includes N site ends, where the N site ends are terminals corresponding to N hydrological sites of a predetermined drainage basin, and one target hydrological site of the N hydrological sites is located at an outlet section of the predetermined drainage basin; each site end sends single-end original rainfall data representing historical K-year hourly rainfall data of the hydrological site to a prediction end, and the site end corresponding to the target hydrological site also sends original flow data representing the historical K-year hourly flow data of the target hydrological site to the prediction end; the flood flow prediction method based on deep learning comprises the following steps:

acquiring single-ended original rainfall data, original flow data, prior rainfall data of each hydrological site in the future P hours and original position information of N hydrological sites, and acquiring the original rainfall data from all the acquired single-ended original rainfall data; preprocessing original rainfall data, original flow data and prior rainfall data to respectively obtain processed rainfall data, processed flow data and processed position information; forming gridding rainfall data based on the processed rainfall data, the processed prior rainfall data and the processed position information; based on gridding rainfall data, extracting spatial distribution characteristics by using a first branch network of a flood flow prediction model based on deep learning obtained by pre-training, and extracting time sequence characteristics of rainfall data of historical T hours and future P hours to obtain first output characteristics; based on the processed flow data, extracting time sequence characteristics of the flow data in T hours of history by using a second branch network of a flood flow prediction model based on deep learning to obtain second output characteristics; and carrying out merging, classifying and predicting on the first output characteristics and the second output characteristics by utilizing a third network of a flood flow prediction model based on deep learning to obtain a flow prediction value of the target hydrological station in the future P hours, wherein the flood flow prediction model based on deep learning comprises a branch network group and a third network which are connected in series, the branch network group comprises a first branch network and a second branch network which are connected in parallel, N, K, T and P are natural numbers which are more than 1, and T is less than or equal to the number of hours corresponding to K years.

The flood flow prediction system based on deep learning provided by the embodiment of the invention is the system of the first aspect.

For related specific contents, refer to the contents of the flood flow prediction system based on deep learning in the first aspect, which are not described herein again.

In the scheme provided by the embodiment of the invention, historical rainfall data and future prior rainfall data are gridded by utilizing the position information of each hydrological site, so that the spatial distribution information of rainfall is introduced, and the gridded rainfall data is used for extracting spatial distribution characteristics and time sequence characteristics by utilizing one of the branch networks of the pre-trained flood flow prediction model based on deep learning; and performing time sequence feature extraction on the historical flow data by using the other branch network of the flood flow prediction model based on deep learning, and performing merging, classifying and predicting on the output features of the two branch networks to obtain a flood flow prediction value in a future time period. The prediction method provided by the embodiment of the invention can obtain the flood flow prediction result of a period of time in the future at one time, and the result considers the space distribution condition of rainfall, so that the information described by the actual rainfall data can be fully mined to obtain the prediction result conforming to the actual rainfall condition, and the prediction accuracy is higher.

It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

All the embodiments in the present specification are described in a related manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is substantially similar to the method embodiment, it is described only briefly; for the relevant points, reference may be made to the corresponding description of the method embodiment.

The foregoing is a more detailed description of the invention in connection with specific preferred embodiments, and the invention is not intended to be limited to these specific details. Those skilled in the art to which the invention pertains may make several simple deductions or substitutions without departing from the spirit of the invention, and all of these shall be considered as belonging to the protection scope of the invention.

Claims

1. A flood flow prediction system based on deep learning, characterized by comprising N site ends and a prediction end, wherein the N site ends are terminals corresponding to N hydrological sites of a preset basin, and one target hydrological site among the N hydrological sites is located at an outlet section of the preset basin;

2. The flood flow prediction system based on deep learning of claim 1, wherein the prediction end preprocessing the original rainfall data, the prior rainfall data, the original flow data and the original position information to obtain processed rainfall data, processed flow data and processed position information, respectively, comprises:

3. The flood flow prediction system based on deep learning of claim 2, wherein the prediction end performing data elimination, data completion processing and normalization processing on the original rainfall data to obtain processed rainfall data comprises:

4. The deep learning based flood flow prediction system of claim 2, wherein the normalization process comprises a [0,1] normalization process.

5. The flood flow prediction system based on deep learning of claim 1 or 2, wherein the prediction end forming gridded rainfall data based on the processed rainfall data, the processed prior rainfall data and the processed position information comprises:

6. The flood flow prediction system based on deep learning of claim 1, wherein the prediction end extracting, based on the gridded rainfall data, spatial distribution features and time sequence features of the rainfall data of the historical T hours and the future P hours by using a first branch network of a pre-trained flood flow prediction model based on deep learning, to obtain a first output feature, comprises:

7. The flood flow prediction system based on deep learning of claim 6, wherein the prediction end extracting, based on the processed flow data, a time sequence feature of the flow data of the historical T hours by using a second branch network of the flood flow prediction model based on deep learning, to obtain a second output feature, comprises:

8. The flood flow prediction system based on deep learning of claim 7, wherein the prediction end performing merged classified prediction on the first output feature and the second output feature by using a third network of the flood flow prediction model based on deep learning, to obtain a predicted flow value of the target hydrological site for the future P hours, comprises:

9. The flood flow prediction system based on deep learning of claim 1, wherein the prediction end is a station end corresponding to the target hydrological station.

10. A flood flow prediction method based on deep learning, characterized by being applied to a prediction end in a flood flow prediction system based on deep learning, wherein the flood flow prediction system based on deep learning further comprises N site ends, the N site ends are terminals corresponding to N hydrological sites of a preset basin, and one target hydrological site among the N hydrological sites is located at an outlet section of the preset basin; each site end sends, to the prediction end, single-end original rainfall data representing the historical K-year hourly rainfall data of its hydrological site, and the site end corresponding to the target hydrological site also sends, to the prediction end, original flow data representing the historical K-year hourly flow data of the target hydrological site; the flood flow prediction method based on deep learning comprises the following steps: