CN112182063A - Method for constructing hydrological forecasting model based on space-time characteristics

Info

Publication number
CN112182063A
Authority
CN
China
Prior art keywords
hydrological
matrix
distance
constructing
follows
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202010974378.7A
Other languages
Chinese (zh)
Inventor
朱跃龙
赵群
万定生
余宇峰
杨志勇
姚成
王继民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU
Priority to CN202010974378.7A
Publication of CN112182063A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Abstract

The invention discloses a method for constructing a hydrological forecasting model based on space-time characteristics. The method combines temporal and spatial features to build a hydrological forecasting model driven by spatiotemporal features. First, three hydrological relationship graphs (a river channel distance matrix, a Euclidean distance matrix and a correlation coefficient matrix) are established from spatial characteristics and time-series similarity, and the three graphs are fused into a hydrological topological structure graph. Then a graph convolutional network (GCN) and a gated recurrent unit (GRU) are combined to learn the spatial feature representation and capture the hydrological temporal features at the same time, after which the hydrological forecast is made, thereby improving forecast accuracy.

Description

Method for constructing hydrological forecasting model based on space-time characteristics
Technical Field
The invention belongs to the technical field of hydrologic prediction, and particularly relates to a method for constructing a hydrologic prediction model based on space-time characteristics.
Background
With the continuing development of the information age, more and more intelligent hydrological monitoring stations have been built and improved, hydrological data coverage has become increasingly comprehensive, and massive amounts of historical and real-time hydrological data are collected and stored in databases. Hydrological data are large in volume, varied in category, spatiotemporal in nature and quickly updated; they are also influenced by conditions such as seasonal climate, geomorphic characteristics and hydrological laws, so many valuable patterns and much information remain hidden in them. Mining more practical and valuable information from hydrological big data, so as to improve the accuracy of hydrological forecasting and provide useful early forecasts and warnings for hydrological monitoring stations, is attracting more and more attention.
Hydrological forecasting predicts hydrological information (such as runoff and water level) for a future period (such as several hours) from the hydrometeorological data of the preceding period. It provides a basis for flood control and disaster relief decisions and is important for the rational use of water resources. Timely and effective forecasting of hydrological events is therefore of great significance to the decisions of hydrological workers, and it is especially important for mitigating the adverse effects of flood disasters.
Existing hydrological forecasting models fall into three main categories: conceptual models, physical models, and data-driven models. Conceptual models are based on physical concepts of the hydrological phenomenon and on empirical formulas, while physical models are based on the physical process. Both can predict hydrological events for a particular river. However, these models may require a large number of hydrological parameters and are not readily adaptable to other watersheds. In particular, because the physical process parameters of different basins vary widely, the parameters of a physical model, and even its structure, may need to be significantly modified.
Therefore, in recent years, data-driven hydrological forecasting models have attracted increasing attention in hydrological event forecasting. However, most existing data-driven models focus on learning temporal features from historical data and do not consider the abundant spatial features, and adding spatial features increases the prediction difficulty for an ordinary forecasting model.
Disclosure of Invention
Purpose of the invention: the invention aims to overcome the defects in the prior art and provides a method for constructing a hydrological forecasting model based on space-time characteristics, which adds abundant spatial features to the forecasting model to improve forecast accuracy.
Technical scheme: the method for constructing a hydrological forecasting model based on space-time characteristics according to the invention comprises the following steps:
S1, establishing hydrological relationship graphs: a river channel distance matrix, a Euclidean distance matrix and a correlation coefficient matrix are established from the historical hydrological database and the digital geographic information;
S2, establishing a hydrological topological structure graph: the river channel distance matrix, the Euclidean distance matrix and the correlation coefficient matrix established in step S1 are fused into the hydrological topological structure graph;
S3, extracting spatial features: the spatial features of the hydrological topological structure graph are mined using a graph convolutional network (GCN);
the spatial features refer to characteristics of the hydrological topological structure graph, such as the relationship between the target hydrological station and the other hydrological stations and the influence of the other stations on the target station;
S4, extracting temporal features: the temporal features of the hydrological topological structure graph are captured using a gated recurrent unit (GRU);
S5, establishing the hydrological forecasting model.
Further, the detailed method of step S1 is as follows:
1) Establishing the river channel distance matrix
The river length between a target basin and a basin upstream of it is taken as the river channel distance (Hydraulic Distance) and is estimated from a digital elevation model (DEM). If water from the upstream basin flows through the downstream basin, the river channel distance is the flow path length between the two basins; if the two are not connected by a river channel, the river channel distance is 0, and the distance from a basin to itself is also 0;
the matrix of river channel distances is as follows:
$$\begin{bmatrix} dh_{1,1} & \cdots & dh_{1,n} \\ \vdots & \ddots & \vdots \\ dh_{n,1} & \cdots & dh_{n,n} \end{bmatrix}$$
where dh_{m,k} represents the river channel distance between hydrological station m and hydrological station k, n is the number of hydrological stations, and the distance is 0 when m = k;
2) Establishing the Euclidean distance matrix
The Euclidean distance refers to the horizontal distance between two hydrological stations.
The matrix of Euclidean distances is as follows:
$$\begin{bmatrix} de_{1,1} & \cdots & de_{1,n} \\ \vdots & \ddots & \vdots \\ de_{n,1} & \cdots & de_{n,n} \end{bmatrix}$$
where de_{m,k} represents the horizontal distance between hydrological station m and hydrological station k, and the distance is 0 when m = k;
3) Establishing the hydrological correlation coefficient matrix
The Pearson correlation coefficient is a number between -1 and 1 that represents the degree to which two variables are linearly related. Here, the Pearson correlation coefficient is used to calculate the correlation of the daily flow between hydrological stations, and a correlation coefficient matrix is established. The correlation is calculated on the coarse-grained (daily) time series to assist in predicting the fine-grained hourly data.
The matrix is as follows:
$$\begin{bmatrix} c_{1,1} & \cdots & c_{1,n} \\ \vdots & \ddots & \vdots \\ c_{n,1} & \cdots & c_{n,n} \end{bmatrix}$$
where c_{m,k} is the correlation between hydrological station m and hydrological station k.
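As an illustration only, the correlation coefficient matrix of step 3) could be computed as in the following minimal sketch. The function name `correlation_matrix`, the array layout (one row per day, one column per station) and the synthetic flow series are assumptions made for this example; they are not taken from the patent.

```python
import numpy as np

def correlation_matrix(daily_flow):
    """Pearson correlation between stations.

    daily_flow: array of shape (n_days, n_stations); column k holds the
    daily flow series of station k (assumed layout).
    Returns an (n_stations, n_stations) matrix whose entry (m, k) is c_{m,k}.
    """
    daily_flow = np.asarray(daily_flow, dtype=float)
    # rowvar=False tells NumPy that each column is one variable (station).
    return np.corrcoef(daily_flow, rowvar=False)

# Synthetic example: station 1 closely follows station 0, station 2 is unrelated.
rng = np.random.default_rng(0)
base = rng.gamma(2.0, 50.0, size=365)
daily_flow = np.column_stack([
    base,
    0.8 * base + rng.normal(0.0, 5.0, size=365),
    rng.gamma(2.0, 50.0, size=365),
])
C = correlation_matrix(daily_flow)
```

Note that a correlation matrix has ones on its diagonal; whether the diagonal is reset before the graphs are fused is not stated in the text.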
Further, in step S2, the adjacency matrices of the river channel distance graph, the Euclidean distance graph and the correlation coefficient graph are combined by a weighted average, fusing the three graphs into the hydrological topological structure graph.
The elements of the adjacency matrix of the fused hydrological topological structure graph are:
$$a_{i,j} = \alpha \cdot dh_{i,j} + \beta \cdot de_{i,j} + \gamma \cdot c_{i,j}$$
where a_{i,j} is the element in row i and column j of the adjacency matrix, and α, β and γ may take the values 0.4, 0.2 and 0.4.
Further, the specific method for mining the spatial features of the hydrological topological structure using the graph convolutional network (GCN) in step S3 is as follows:
the GCN model is defined as:
$$X^{(l+1)} = f(X^{(l)}, A) \quad (5)$$
where l is the layer index of the GCN, X^{(l)} is the feature matrix of the nodes in layer l, and A is the adjacency matrix of the hydrological topological structure;
the node features are first transformed, and the adjacency matrix is normalized by the degree matrix; after self-loops are added, the relationship between each node and its neighboring nodes is taken into account, and the specific model formula is:
$$x_i^{(l+1)} = \sigma\Big( \sum_{j \in N_i} \frac{\tilde{A}_{i,j}}{\sqrt{\tilde{D}_{i,i} \tilde{D}_{j,j}}} \, x_j^{(l)} W^{(l)} + b^{(l)} \Big)$$
where x_i^{(l+1)} is the feature of node i in layer l+1, x_j^{(l)} is the feature in layer l of each neighboring node j (including node i itself), σ is a nonlinear transformation, A is the adjacency matrix, \tilde{A} is the adjacency matrix with self-loops added, \tilde{D} is the degree matrix corresponding to \tilde{A}, N_i is the set of all neighbors of node i (including itself), W^{(l)} is the weight of layer l, and b^{(l)} is the intercept of layer l.
Further, the specific method for extracting temporal features using the gated recurrent unit in step S4 is as follows:
1) establishing the update gate: z_t = σ(W_1·[h_{t-1}, x_t])
W_1 is the weight of the update gate and h_{t-1} is the hydrological output of the neuron at the previous time step; the larger the value of z_t, the less information from the previous time step is retained and the more information from the current time step is kept;
2) constructing the reset gate: r_t = σ(W_2·[h_{t-1}, x_t])
W_2 is the weight matrix of the reset gate; when r_t is 0, the information passed on from the previous time step is discarded and only the input at the current time step is kept;
3) candidate output of the neuron:
$$\tilde{h}_t = \tanh(W_3 \cdot [r_t \odot h_{t-1}, x_t])$$
W_3 is the weight matrix of the candidate output and tanh is the hyperbolic tangent function;
4) neuron output:
$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$
Further, the method for constructing the final hydrological forecasting model in step S5 is as follows:
establishing the hydrological forecasting model, taking a two-layer GCN as an example:
$$f(X_t, A) = \sigma\big( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} \, \sigma( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} X_t W_0 ) \, W_1 \big)$$
where X_t is the feature matrix, A is the adjacency matrix, σ is a nonlinear transformation, \tilde{A} is the adjacency matrix with self-loops added, \tilde{D} is the degree matrix corresponding to \tilde{A}, W_0 is the weight of the first layer, and W_1 is the weight of the second layer;
the GRU is then used for time-series prediction, with the following steps:
1) establishing the update gate: z_t = σ(W_z·[h_{t-1}, f(X_t, A)]);
2) constructing the reset gate: r_t = σ(W_r·[h_{t-1}, f(X_t, A)]);
3) candidate output of the neuron:
$$\tilde{h}_t = \tanh(W_h \cdot [r_t \odot h_{t-1}, f(X_t, A)])$$
4) neuron output:
$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$
Beneficial effects: the invention combines temporal and spatial features to construct a hydrological forecasting model driven by spatiotemporal features. First, three hydrological relationship graphs (a river channel distance matrix, a Euclidean distance matrix and a correlation coefficient matrix) are established from spatial characteristics and time-series similarity, and the three graphs are fused into a hydrological topological structure graph. Then the graph convolutional network and the gated recurrent unit are combined to learn the spatial feature representation and capture the hydrological temporal features at the same time, after which the hydrological forecast is made, thereby improving forecast accuracy.
Drawings
FIG. 1 is a block diagram of an overall architecture according to an embodiment of the present invention;
FIG. 2 is a schematic overall flow chart according to an embodiment of the present invention;
FIG. 3 is a schematic diagram comparing the Euclidean distance and the river channel distance in the embodiment;
FIG. 4 is a schematic diagram of a convolutional neural network in the embodiment;
FIG. 5 is a map of the hydrological stations selected in Jiangxi Province in the embodiment;
FIG. 6 is a diagram of the prediction results of the models in the embodiment.
Detailed Description
The technical solution of the invention is described in detail below, but the scope of protection of the invention is not limited to the embodiments.
As shown in FIG. 1 and FIG. 2, the method for constructing a hydrological forecasting model based on space-time characteristics specifically comprises the following steps:
S1, establishing hydrological relationship graphs: a river channel distance matrix, a Euclidean distance matrix and a correlation coefficient matrix are established from the historical hydrological database and the digital geographic information;
S2, establishing a hydrological topological structure graph: the river channel distance matrix, the Euclidean distance matrix and the correlation coefficient matrix established in step S1 are fused into the hydrological topological structure graph;
S3, extracting spatial features: the spatial features of the hydrological topological structure graph are mined using a graph convolutional network (GCN). A conventional convolutional neural network (CNN) can only mine the features of images, that is, data arranged as a regular grid of pixels. However, many kinds of data form topological or relational structures, and such structures can be mined by a graph convolutional network, so this embodiment uses the GCN to mine the spatial features of the hydrological topological structure graph.
S4, extracting temporal features: the temporal features of the hydrological topological structure graph are captured using a gated recurrent unit (GRU), a model that currently performs well and is widely used. The GRU improves on the gate design of the LSTM by replacing the LSTM's input gate, forget gate and output gate with an update gate and a reset gate, which simplifies the computation of the model, speeds up convergence and alleviates the time-consuming training of the LSTM.
S5, establishing the hydrological forecasting model.
As shown in FIG. 3, for hydrological station A and hydrological station B in the target basin, de_{a,b} is the Euclidean distance and dh_{a,b} is the river channel distance.
Establishment of the river channel distance matrix
The river length from the target basin to a basin upstream of it is estimated from the digital elevation model (DEM). If water from the upstream basin flows through the downstream basin, the river channel distance is the flow path length between the two basins. If there is no river connection, the river channel distance is 0. The distance from a basin to itself is also 0.
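The rule above can be sketched as follows, under the assumption that the flow paths have already been extracted from the DEM into a simple table giving, for each station, the station directly downstream of it and the length of the reach between them. The function `river_distance_matrix`, the `downstream` and `reach_length` arrays, the symmetric storage of distances and the sample network are all hypothetical; the DEM processing itself is not shown.

```python
import numpy as np

def river_distance_matrix(downstream, reach_length):
    """Build a river channel distance matrix from a downstream-pointer table.

    downstream[i]   : index of the station directly downstream of station i,
                      or -1 if station i is an outlet (assumed encoding)
    reach_length[i] : flow path length from station i to downstream[i]
    Entry (m, k) is the flow path length when water from one station reaches
    the other; it stays 0 when the two are not connected and when m == k.
    The network is assumed to contain no cycles.
    """
    n = len(downstream)
    dist = np.zeros((n, n))
    for m in range(n):
        travelled = 0.0
        k = m
        # Walk downstream from station m, accumulating reach lengths.
        while downstream[k] != -1:
            travelled += reach_length[k]
            k = downstream[k]
            dist[m, k] = dist[k, m] = travelled  # stored symmetrically (assumption)
    return dist

# Hypothetical network: stations 0 and 1 both drain into 2, which drains into 3.
# Station 4 lies on a separate river.
downstream = [2, 2, 3, -1, -1]
reach_length = [12.0, 7.5, 20.0, 0.0, 0.0]
D = river_distance_matrix(downstream, reach_length)
```

Stations 0 and 1 are both upstream of station 2, but water from one does not pass the other, so their mutual distance remains 0 under the rule stated above.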
The matrix of river channel distances is defined as follows:
$$\begin{bmatrix} dh_{1,1} & \cdots & dh_{1,n} \\ \vdots & \ddots & \vdots \\ dh_{n,1} & \cdots & dh_{n,n} \end{bmatrix}$$
where dh_{m,k} represents the river channel distance between station m and station k, n is the number of stations, and the distance is 0 when m = k.
Euclidean distance matrix establishment
The horizontal distance between two hydrological stations in the target basin is taken as their Euclidean distance.
Besides the upstream-downstream relationship, the hydrological process is influenced by other processes, such as rainfall: the closer two hydrological stations are horizontally, the more similar their rainfall processes.
The Euclidean distance formula is as follows:
$$de_{i,j} = 2R \cdot \operatorname{atan2}\big(\sqrt{a}, \sqrt{1 - a}\big)$$
where R is the radius of the earth and atan2(y, x) is the two-argument arctangent function, which returns the angle of the point (x, y).
a is calculated as follows:
$$a = \sin^2(\Delta\theta/2) + \cos\theta_i \cdot \cos\theta_j \cdot \sin^2(\Delta\alpha/2)$$
where θ_i and θ_j are the latitudes of the two geographic locations, Δθ is their difference in latitude, and Δα is their difference in longitude. All angles must be in radians rather than degrees.
The matrix of Euclidean distances is as follows:
$$\begin{bmatrix} de_{1,1} & \cdots & de_{1,n} \\ \vdots & \ddots & \vdots \\ de_{n,1} & \cdots & de_{n,n} \end{bmatrix}$$
where de_{m,k} represents the horizontal distance between station m and station k, and the distance is 0 when m = k.
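A minimal sketch of the distance calculation above, written as the standard haversine formula with latitudes inside the cosine terms. The function name `haversine_matrix`, the value used for the earth radius and the sample coordinates are assumptions for illustration only.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean earth radius, an assumed value for R

def haversine_matrix(lat_deg, lon_deg):
    """Pairwise horizontal distance (km) between stations.

    lat_deg, lon_deg: station latitudes and longitudes in degrees.
    Returns an (n, n) matrix whose entry (m, k) is de_{m,k}.
    """
    # The formula requires radians, so convert from degrees first.
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    a = np.clip(a, 0.0, 1.0)  # guard against rounding slightly outside [0, 1]
    return 2 * EARTH_RADIUS_KM * np.arctan2(np.sqrt(a), np.sqrt(1 - a))

# Hypothetical station coordinates in degrees.
lat = [28.68, 29.27, 27.10]
lon = [115.89, 116.20, 114.98]
DE = haversine_matrix(lat, lon)
```

The diagonal of the result is 0, which matches the convention that the distance is 0 when m = k.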
Hydrological correlation coefficient matrix establishment
The Pearson correlation coefficient is a number between -1 and 1 that represents the degree to which two variables are linearly related. The invention uses the Pearson correlation coefficient to calculate the correlation of the daily flow between stations and establishes a correlation coefficient matrix.
The matrix is as follows:
$$\begin{bmatrix} c_{1,1} & \cdots & c_{1,n} \\ \vdots & \ddots & \vdots \\ c_{n,1} & \cdots & c_{n,n} \end{bmatrix}$$
where c_{m,k} is the correlation between station m and station k.
Hydrological topological structure graph establishment
The river channel distance matrix, the Euclidean distance matrix and the correlation coefficient matrix are fused: the different relationship graphs are combined by a weighted sum of their adjacency matrices.
The elements of the adjacency matrix of the hydrological topological structure graph are:
$$a_{i,j} = \alpha \cdot dh_{i,j} + \beta \cdot de_{i,j} + \gamma \cdot c_{i,j}$$
where a_{i,j} is the element in row i and column j of the adjacency matrix, and α, β and γ are the weights of the three graphs.
As shown in FIG. 4, X_1 is the target station and the other nodes are the other stations in the hydrological topological graph.
Spatial feature extraction: a graph convolutional network (GCN) is used to mine the spatial features of the hydrological topological structure. The model is defined as follows:
$$X^{(l+1)} = f(X^{(l)}, A) \quad (5)$$
where l is the layer index, X^{(l)} is the feature matrix of the nodes in layer l, and A is the adjacency matrix. In graph-structured data, the feature information and the structural information of the nodes, that is, the feature matrix X and the adjacency matrix A, should be considered together. For example, when predicting runoff, X consists of daily runoff data and rainfall data.
The node features are first transformed, and the adjacency matrix is normalized by the degree matrix. After self-loops are added, the relationship between each node and its neighboring nodes is taken into account, and the specific model formula is:
$$x_i^{(l+1)} = \sigma\Big( \sum_{j \in N_i} \frac{\tilde{A}_{i,j}}{\sqrt{\tilde{D}_{i,i} \tilde{D}_{j,j}}} \, x_j^{(l)} W^{(l)} + b^{(l)} \Big)$$
where x_i^{(l+1)} is the feature of node i in layer l+1, x_j^{(l)} is the feature in layer l of each neighboring node j (including node i itself), σ is a nonlinear transformation, A is the adjacency matrix, \tilde{A} is the adjacency matrix with self-loops added, \tilde{D} is the degree matrix corresponding to \tilde{A}, N_i is the set of all neighbors of node i (including itself), W^{(l)} is the weight of layer l, and b^{(l)} is the intercept of layer l.
Temporal feature extraction: temporal features are extracted using the gated recurrent unit. The gated recurrent unit formulas are as follows:
(1) Establishing the update gate: z_t = σ(W_1·[h_{t-1}, x_t])
W_1 is the weight of the update gate and h_{t-1} is the output of the neuron at the previous time step; the larger the value of z_t, the less information from the previous time step is retained and the more information from the current time step is kept.
(2) Constructing the reset gate: r_t = σ(W_2·[h_{t-1}, x_t])
W_2 is the weight matrix of the reset gate; when r_t is 0, the information passed on from the previous time step is discarded and only the input at the current time step is kept.
(3) Candidate output of the neuron:
$$\tilde{h}_t = \tanh(W_3 \cdot [r_t \odot h_{t-1}, x_t])$$
W_3 is the weight matrix of the candidate output and tanh is the hyperbolic tangent function.
(4) Neuron output:
$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$
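As an illustration, one time step of the gated recurrent unit can be sketched in NumPy as below. The sketch assumes the standard GRU form for the candidate output, tanh(W_3·[r_t ⊙ h_{t-1}, x_t]), and the output h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ candidate, which agrees with the described behavior of z_t and r_t; bias terms are omitted. The function names, array shapes and random weights are assumptions.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(h_prev, x_t, W1, W2, W3):
    """One GRU step for a single neuron layer.

    h_prev: previous output, shape (hidden,)
    x_t   : current input, shape (inputs,)
    W1, W2, W3: weights of the update gate, reset gate and candidate
                output, each of shape (hidden, hidden + inputs)
    """
    concat = np.concatenate([h_prev, x_t])
    z_t = sigmoid(W1 @ concat)  # update gate
    r_t = sigmoid(W2 @ concat)  # reset gate
    # Candidate output: the reset gate scales how much of h_prev is used.
    h_cand = np.tanh(W3 @ np.concatenate([r_t * h_prev, x_t]))
    # Larger z_t keeps less of h_prev and more of the candidate.
    return (1.0 - z_t) * h_prev + z_t * h_cand

rng = np.random.default_rng(0)
hidden, inputs = 4, 3
W1, W2, W3 = (rng.normal(0.0, 0.1, (hidden, hidden + inputs)) for _ in range(3))
h = np.zeros(hidden)
for x_t in rng.normal(size=(5, inputs)):  # a hypothetical 5-step input sequence
    h = gru_step(h, x_t, W1, W2, W3)
```

With all weights set to zero, both gates equal 0.5 and the candidate is 0, so each step simply halves the previous output; this is a convenient sanity check for the update rule.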
As shown in FIG. 4, the hydrological forecasting model is established as follows: the GCN is first used to mine the spatial features of the hydrological topological structure graph, and the GRU is then used to mine the temporal features. Taking a two-layer GCN as an example, the GCN formula is:
$$f(X_t, A) = \sigma\big( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} \, \sigma( \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} X_t W_0 ) \, W_1 \big)$$
where X_t is the feature matrix, A is the adjacency matrix, σ is a nonlinear transformation, \tilde{A} is the adjacency matrix with self-loops added, \tilde{D} is the degree matrix corresponding to \tilde{A}, W_0 is the weight of the first layer, and W_1 is the weight of the second layer.
The GRU is then used for time-series prediction, with the following steps:
(1) Establishing the update gate: z_t = σ(W_z·[h_{t-1}, f(X_t, A)]);
(2) Constructing the reset gate: r_t = σ(W_r·[h_{t-1}, f(X_t, A)]);
(3) Candidate output of the neuron:
$$\tilde{h}_t = \tanh(W_h \cdot [r_t \odot h_{t-1}, f(X_t, A)])$$
(4) Neuron output:
$$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$$
Example: as shown in FIG. 5, the Poyang Lake area of Jiangxi is the region of the middle and lower reaches of the Yangtze River where flood disasters occur most frequently. Analysis of the hydrological data of this area can provide strong support for the prevention and control of flood disasters. This embodiment selects the Waizhou station in the Poyang Lake basin as the display point for the hydrological forecasting results.
A hydrological data set of Jiangxi Province, China is used; it comprises runoff data, longitude and latitude data, and rainfall data. Considering problems such as missing data, station relocation and station shutdown, 139 stations in Jiangxi Province are finally selected as research stations, as shown in FIG. 5. Flow and rainfall data from the summer flood seasons of 1998-2010 are selected as the experimental data. Because part of the rainfall record is incomplete, the missing rainfall is filled in by linear interpolation, a method commonly used in hydrology. Each hydrological station is taken as a node, the hydrological topological graph as the adjacency matrix, and the historical runoff and rainfall information as the feature matrix. 80% of the data set is used as the training set and 20% as the test set.
Hydrological relationship graph establishment
The river channel distance matrix, the Euclidean distance matrix and the correlation coefficient matrix are established, giving three relationship matrices, which are converted into three adjacency matrices:
[Adjacency matrices of the river channel distance graph, the Euclidean distance graph and the correlation coefficient graph; given as images in the original publication]
Establishing the hydrological topological structure graph: because the adjacency matrices of the different hydrological relationship graphs contain different information and their values differ greatly in magnitude, this embodiment first normalizes the adjacency matrix of each hydrological relationship graph, then weights them, and finally fuses the relationship graphs to obtain the hydrological topological structure graph.
Normalization: A' = D^{-1}A, where D is the degree matrix of A, with D_{i,i} = Σ_j A_{i,j}.
Fusion:
$$A = \sum_i w_i \odot A'_i$$
where w_i is the weight and ⊙ denotes multiplying the weight by each corresponding element of the matrix A'_i.
The elements of the adjacency matrix of the hydrological topological structure graph are:
$$a_{i,j} = w_1 \cdot h'_{i,j} + w_2 \cdot e'_{i,j} + w_3 \cdot c'_{i,j}$$
where a_{i,j} is the element in row i and column j of the adjacency matrix, h'_{i,j} is an element of the adjacency matrix of the normalized river channel distance graph, e'_{i,j} is an element of the adjacency matrix of the normalized Euclidean distance graph, and c'_{i,j} is an element of the adjacency matrix of the normalized correlation coefficient graph.
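The normalization and fusion steps can be sketched as follows. The sketch assumes non-negative adjacency entries (so every row sum is a valid degree) and uses the weights 0.4, 0.2 and 0.4 mentioned earlier in the description; the function names `row_normalize` and `fuse` and the three small sample matrices are hypothetical.

```python
import numpy as np

def row_normalize(A):
    """A' = D^{-1} A, where D_{i,i} is the sum of row i of A."""
    A = np.asarray(A, dtype=float)
    degree = A.sum(axis=1, keepdims=True)
    # Rows that sum to zero (isolated stations) are left as zeros.
    return np.divide(A, degree, out=np.zeros_like(A), where=degree != 0)

def fuse(adjacencies, weights):
    """Weighted sum of the row-normalized adjacency matrices."""
    return sum(w * row_normalize(A) for w, A in zip(weights, adjacencies))

# Hypothetical adjacency matrices for three stations.
H = [[0, 10, 0], [10, 0, 30], [0, 30, 0]]              # river channel distance graph
E = [[0, 5, 8], [5, 0, 4], [8, 4, 0]]                  # Euclidean distance graph
C = [[0, 0.9, 0.2], [0.9, 0, 0.5], [0.2, 0.5, 0]]      # correlation coefficient graph
A = fuse([H, E, C], [0.4, 0.2, 0.4])
```

Because each normalized matrix has rows summing to 1 and the weights sum to 1, every row of the fused matrix also sums to 1 in this example, which puts the three graphs on a comparable scale before they are combined.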
The adjacency matrix of the hydrological topological structure graph is as follows:
[Fused adjacency matrix of the hydrological topological structure graph; given as an image in the original publication]
The hydrological data are then processed.
Linear interpolation is applied to the rainfall information. Linear interpolation calculates the value of an unknown quantity lying between two known quantities from the straight line connecting them. Given the coordinates (x_0, y_0) and (x_1, y_1), the value on the straight line at a position x within the interval [x_0, x_1] is:
$$y = y_0 + (x - x_0) \frac{y_1 - y_0}{x_1 - x_0}$$
Rainfall data with a sample size of 50 and a missing rate of 20% are used; the missing values occur at positions 3, 10, 14, 32, 36 and 42. Mean interpolation and linear interpolation are each applied at the missing positions, and Table 1 compares the linear interpolation results with the true values.
Table 1. Comparison of linear interpolation results with true values
Serial number    True value    Linear interpolation
3                4             3.38
10               1             0.65
14               3             2.5
32               1.3           1.15
36               3             2
42               1             1.5
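The linear interpolation of missing rainfall values can be sketched with `numpy.interp`, as below. The function name `fill_missing_linear`, the use of NaN to mark missing values and the short sample series are assumptions for illustration; they are not the data behind Table 1.

```python
import numpy as np

def fill_missing_linear(series):
    """Fill NaN entries by linear interpolation between known neighbors.

    Missing values before the first or after the last known value are set
    to the nearest known value, which is how numpy.interp treats points
    outside the known range.
    """
    y = np.asarray(series, dtype=float)
    idx = np.arange(y.size)
    known = ~np.isnan(y)
    filled = y.copy()
    filled[~known] = np.interp(idx[~known], idx[known], y[known])
    return filled

# Hypothetical rainfall series with missing values marked as NaN.
rain = [2.0, np.nan, 4.0, 1.0, np.nan, np.nan, 4.0]
filled = fill_missing_linear(rain)
# filled is [2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
```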
Max-Min normalization is then applied to all the data so that the value of each element of the resulting sequence lies in the range [0, 1].
For a sequence X, where X_max is the maximum value in the sequence and X_min is the minimum value, each element x_i of X is transformed as:
$$x'_i = \frac{x_i - X_{min}}{X_{max} - X_{min}}$$
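A minimal sketch of the Max-Min normalization, together with the inverse mapping that would be needed to turn normalized forecasts back into physical units. The function names and the handling of a constant sequence are assumptions made for this example.

```python
import numpy as np

def min_max_normalize(x):
    """Scale a sequence to [0, 1]; returns the scaled values and (min, max)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        # A constant sequence has no range; map it to zeros (assumed convention).
        return np.zeros_like(x), (x_min, x_max)
    return (x - x_min) / (x_max - x_min), (x_min, x_max)

def min_max_denormalize(x_norm, bounds):
    """Map normalized values back to the original scale."""
    x_min, x_max = bounds
    return np.asarray(x_norm, dtype=float) * (x_max - x_min) + x_min

flow = [120.0, 300.0, 480.0]  # hypothetical flow values
flow_norm, bounds = min_max_normalize(flow)
# flow_norm is [0.0, 0.5, 1.0]
```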
Parameter settings of the hydrological forecasting model: the number of hidden nodes is 32, the learning rates are 0.005 and 0.007, the batch size is 64, and the number of iterations is 200.
Spatial feature mining
(1) Determine the sample set. The sample set consists of the flow information of each station, the areal average rainfall information, and the topological structure information among the stations. The stations are the nodes, the adjacency matrix holds the elements of the hydrological topological network structure, and the feature matrix holds the flow and rainfall information.
(2) Self-connection of nodes:
$$\tilde{A} = A + I$$
where I is the identity matrix. Adding the identity matrix to the adjacency matrix sets the diagonal elements to 1 and introduces self-connections, so that each node retains its own information in the convolution operation.
(3) Normalization of the adjacency matrix:
$$\hat{A} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$$
where D is the degree matrix of A, \tilde{D} is the degree matrix of \tilde{A} with \tilde{D}_{i,i} = \sum_j \tilde{A}_{i,j}, and \hat{A} is the normalized adjacency matrix. A itself is not normalized, so multiplying it directly with the feature matrix would change the original distribution of the features and cause unpredictable problems.
(4) From one hidden layer to the next, the layer propagation formula is:
$$Y = \hat{A} X^{(l)} W^{(l)}$$
$$X^{(l+1)} = \sigma(Y)$$
where W is the weight matrix, l is the layer index, and Y is the node matrix after convolution.
(5) Taking into account the relationship between each node and its adjacent nodes, the features are summed as follows:
$$x_i^{(l+1)} = \sigma\left(\sum_{j \in N_i} \frac{\tilde{A}_{ij}}{\sqrt{\tilde{D}_{ii}\,\tilde{D}_{jj}}}\; x_j^{(l)}\, W^{(l)} + b^{(l)}\right)$$
where $x_i^{(l+1)}$ is the feature of node i in layer l+1; $x_j^{(l)}$ is the feature in layer l of each neighboring node j (including node i itself); σ is a nonlinear transformation; $\tilde{A}$ is the adjacency matrix with self-loops; $\tilde{D}$ is the degree matrix corresponding to $\tilde{A}$; $N_i$ is the set of all neighbors of node i (including itself); $W^{(l)}$ is the weight of layer l; and $b^{(l)}$ is the intercept of layer l.
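Steps (2) to (5) can be sketched in Python as follows. This is a minimal sketch, assuming the symmetric normalization of the adjacency matrix with self-loops and a ReLU activation; the function names, the three-station chain, and the random features and weights are hypothetical. The sketch writes one layer both in matrix form and as the per-node sum over neighbors, and the two are expected to agree.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D~^(-1/2) (A + I) D~^(-1/2) for a non-negative adjacency matrix A."""
    A_tilde = np.asarray(A, dtype=float) + np.eye(len(A))   # add self-loops
    degree = A_tilde.sum(axis=1)                             # diagonal of D~
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

def gcn_layer(A, X, W, b):
    """Matrix form of one layer: ReLU(A_hat X W + b)."""
    return np.maximum(normalized_adjacency(A) @ X @ W + b, 0.0)

def gcn_layer_nodewise(A, X, W, b):
    """The same layer written as the per-node sum over neighbors N_i (self included)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    A_tilde = A + np.eye(n)
    degree = A_tilde.sum(axis=1)
    out = np.zeros((n, W.shape[1]))
    for i in range(n):
        agg = np.zeros(X.shape[1])
        for j in np.nonzero(A_tilde[i])[0]:
            agg += A_tilde[i, j] / np.sqrt(degree[i] * degree[j]) * X[j]
        out[i] = np.maximum(agg @ W + b, 0.0)
    return out

# Hypothetical chain of three stations: 0 - 1 - 2
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.random((3, 2))          # two features per station, e.g. flow and rainfall
W = rng.normal(size=(2, 4))     # layer weights
b = np.zeros(4)                 # layer intercept
A_hat = normalized_adjacency(A)
H_matrix = gcn_layer(A, X, W, b)
H_nodes = gcn_layer_nodewise(A, X, W, b)
```

The matrix form is the one normally used in practice; the node-wise loop is included only to make the neighbor summation explicit.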
Model building
1) First, the GCN is used to mine the spatial features of the hydrological topological structure. The ReLU function is selected as the nonlinear transformation function, and the graph convolution operation is:
$$f(X_t, A) = \sigma\left(\hat{A}\;\mathrm{ReLU}\left(\hat{A}\,X_t\,W_0\right) W_1\right), \qquad \hat{A} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}$$
where $X_t$ is the feature matrix, A is the adjacency matrix, σ is a nonlinear transformation, $\tilde{A}$ denotes the adjacency matrix with self-loops, $\tilde{D}$ is the degree matrix corresponding to $\tilde{A}$, $W_0$ is the weight of the first layer, and $W_1$ is the weight of the second layer.
2) The GRU network weights are initialized. The sample set consists of the spatial features together with the flow data and rainfall data, and the loss function is the mean square error function mse, calculated as follows:
$$\mathrm{mse} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - y_i\right)^2$$
where $y_i$ is the predicted value at time i, $Y_i$ is the true value at time i, and n is the number of samples input to the model.
2) Forward calculation of each unit of the hidden layer of the GRU hydrological model
Update gate: $z_t = \sigma(W_z \cdot [h_{t-1}, f(X_t, A)])$;
Reset gate: $r_t = \sigma(W_r \cdot [h_{t-1}, f(X_t, A)])$;
Candidate output of the neuron:
$$\tilde{h}_t = \tanh\left(W_h \cdot [r_t * h_{t-1}, f(X_t, A)]\right)$$
Neuron output:
$$h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t$$
where $W_z$ is the weight of the update gate, $W_r$ is the weight of the reset gate, $W_h$ is the weight of the candidate output, and $h_{t-1}$ is the output of the neuron at the previous time step.
3) Calculate the prediction error of the model, back-propagate the error to each neuron, update the network weights, and train the network iteratively.
4) When the mean square error no longer decreases or a preset stopping condition is met, the iteration ends and network training is complete.
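The forward pass described in the steps above can be sketched in Python as follows. This is a minimal sketch under several assumptions: the outer activation σ of the two-layer GCN is taken to be ReLU, the GRU output is taken to be $h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t$, biases are omitted as in the gate formulas, and weights are applied to row vectors. All function names, the four-station graph, and the random inputs are hypothetical; training by back-propagation is not shown.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def graph_conv(X, A, W0, W1):
    """Two-layer GCN f(X_t, A); the outer activation is assumed to be ReLU."""
    A_tilde = A + np.eye(A.shape[0])                       # self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
    return relu(A_hat @ relu(A_hat @ X @ W0) @ W1)

def gru_step(h_prev, g_t, Wz, Wr, Wh):
    """One GRU step on spatial features g_t = f(X_t, A); biases are omitted."""
    concat = np.concatenate([h_prev, g_t], axis=1)
    z = sigmoid(concat @ Wz)                               # update gate
    r = sigmoid(concat @ Wr)                               # reset gate
    h_cand = np.tanh(np.concatenate([r * h_prev, g_t], axis=1) @ Wh)
    return (1.0 - z) * h_prev + z * h_cand

rng = np.random.default_rng(0)
n_stations, n_features, n_hidden = 4, 2, 32                # 32 hidden nodes, as in the text
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)                  # hypothetical station graph
W0 = rng.normal(scale=0.1, size=(n_features, n_hidden))
W1 = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
Wz, Wr, Wh = (rng.normal(scale=0.1, size=(2 * n_hidden, n_hidden)) for _ in range(3))

h = np.zeros((n_stations, n_hidden))
for t in range(5):                                         # five hypothetical time steps
    X_t = rng.random((n_stations, n_features))             # flow and rainfall per station
    h = gru_step(h, graph_conv(X_t, A, W0, W1), Wz, Wr, Wh)
```

In a full model, the final hidden state would be passed through an output layer and the weights fitted by minimizing the mean square error; an automatic-differentiation framework would normally be used for that part.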
In the experiment, a support vector machine (SVM), a long short-term memory network (LSTM), and a standalone gated recurrent unit (GRU) are compared with the hydrological prediction model ST-Hydro of the invention. The experimental results are shown in Figure 6, and the comparison is shown in Table 2.
The evaluation indexes are the deterministic coefficient and the root mean square error.
The deterministic coefficient represents the degree of agreement between the predicted values and the measured values. Its value range is [0,1]; the closer it is to 1, the better the independent variables explain the variance of the dependent variable and the higher the prediction accuracy. It is calculated as follows:
$$DC = 1 - \frac{\sum_{i=1}^{n}\left(Y_i - y_i\right)^2}{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2}$$
The root mean square error represents the deviation between the predicted values and the actual values. It is the square root of the mean of the squared errors between corresponding sample points of the fitted data and the original data; the smaller the value, the better the fit. It is calculated as follows:
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Y_i - y_i\right)^2}$$
where $Y_i$ is the measured value, $y_i$ is the predicted value, $\bar{Y}$ is the mean of the measured values, and n is the number of samples.
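The two evaluation indexes can be computed as in the following Python sketch. It assumes the usual definitions (one minus the ratio of the residual sum of squares to the total sum of squares for the deterministic coefficient, and the square root of the mean squared error for the root mean square error); the function names and the sample values are hypothetical.

```python
import numpy as np

def deterministic_coefficient(observed, predicted):
    """1 - sum((Y_i - y_i)^2) / sum((Y_i - mean(Y))^2)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = np.sum((observed - predicted) ** 2)
    total = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / total

def root_mean_square_error(observed, predicted):
    """sqrt(mean((Y_i - y_i)^2))."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.sqrt(np.mean((observed - predicted) ** 2))

observed = [1.0, 2.0, 3.0, 4.0]      # hypothetical measured flows
predicted = [1.0, 2.0, 3.0, 5.0]     # hypothetical predicted flows
dc = deterministic_coefficient(observed, predicted)
rmse = root_mean_square_error(observed, predicted)
```

For these sample values the residual sum of squares is 1 and the total sum of squares is 5, so the deterministic coefficient is 0.8 and the root mean square error is 0.5.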
Table 2. Model comparison for forecasts one hour ahead
Model                        SVM       LSTM     GRU      The invention
Deterministic coefficient    0.95      0.98     0.97     0.99
Root mean square error       128.34    91.66    92.73    72.34
Table 2 and Figure 6 show that the hydrological prediction model ST-Hydro of the invention performs best. The existing LSTM performs slightly better than the existing GRU, but the difference is not significant, while the existing support vector machine performs worst.
As the evaluation indexes of the prediction results in Table 2 show, the deterministic coefficient of the invention is the highest, the difference between LSTM and GRU is not significant, and SVM is slightly worse. The ST-Hydro model therefore gives the best overall prediction results. The benefit of the method is that the GCN hydrological model captures the spatial features of runoff and the rainfall of multiple spatially related stations is added to the prediction process, which avoids the problems of vanishing gradients and mean shift.
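To put the root mean square errors of Table 2 in relative terms, the following Python sketch computes the percentage reduction achieved by the invention against each baseline. Only the four values come from Table 2; the dictionary and function names are hypothetical.

```python
# Root mean square errors copied from Table 2
rmse_table = {"SVM": 128.34, "LSTM": 91.66, "GRU": 92.73, "ST-Hydro": 72.34}

def relative_reduction(baseline, model):
    """Percentage by which `model` lowers the error relative to `baseline`."""
    return (baseline - model) / baseline * 100.0

reductions = {
    name: relative_reduction(value, rmse_table["ST-Hydro"])
    for name, value in rmse_table.items()
    if name != "ST-Hydro"
}
```

With these values the reduction is about 44% against SVM, about 21% against LSTM, and about 22% against GRU.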
As the above embodiments show, the invention mines spatial features with a graph convolutional neural network. A graph convolutional neural network operates on a graph structure, that is, a relational and topological structure. The invention therefore first constructs a hydrological topological structure graph, which is composed of three hydrological relation graphs, and then mines this hydrological topological graph with the graph convolutional network.
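The construction of the relation graphs and their fusion can be sketched in Python as follows. This is a minimal sketch: the Euclidean distance matrix and the Pearson correlation matrix are computed directly, while the three adjacency matrices that are fused are supplied as hypothetical 0/1 matrices, because the rule for deriving an adjacency matrix from each relation graph is not restated here. The equal fusion weights, the station coordinates, and the flow series are also hypothetical.

```python
import numpy as np

def euclidean_distance_matrix(coords):
    """Pairwise straight-line distances between stations; the diagonal is 0."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def correlation_matrix(flows):
    """Pearson correlation between station flow series (one row per station)."""
    return np.corrcoef(np.asarray(flows, dtype=float))

def fuse_adjacency(matrices, weights):
    """Weighted average of adjacency matrices of identical shape."""
    weights = np.asarray(weights, dtype=float)
    stacked = np.stack([np.asarray(m, dtype=float) for m in matrices])
    return np.tensordot(weights, stacked, axes=1) / weights.sum()

coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]        # hypothetical station positions
flows = [[1.0, 2.0, 3.0, 4.0],                       # hypothetical daily flows
         [2.0, 4.0, 6.0, 8.0],
         [4.0, 3.0, 2.0, 1.0]]
De = euclidean_distance_matrix(coords)
C = correlation_matrix(flows)

# Hypothetical adjacency matrices for the three relation graphs
A_river = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
A_euclid = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
A_corr = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
A_fused = fuse_adjacency([A_river, A_euclid, A_corr], weights=[1.0, 1.0, 1.0])
```

The fused matrix then serves as the adjacency matrix A of the hydrological topological graph that the graph convolutional network operates on.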

Claims (6)

1. A method for constructing a hydrological prediction model based on space-time characteristics, characterized by comprising the following steps:
S1, establishing hydrological relation graphs, namely respectively establishing a river channel distance matrix, a Euclidean distance matrix and a correlation coefficient matrix;
S2, establishing a hydrological topological structure graph, namely fusing the river channel distance matrix, the Euclidean distance matrix and the correlation coefficient matrix established in step S1 into the hydrological topological structure graph;
S3, extracting spatial features, namely mining the spatial features of the hydrological topological structure graph by using a graph convolution network GCN;
S4, extracting temporal features, namely capturing the temporal features of the hydrological topological structure graph by using a gated recurrent unit GRU;
S5, establishing the hydrological forecasting model.
2. The method for constructing a spatiotemporal feature-based hydrological prediction model according to claim 1, wherein: the detailed method of the step S1 is as follows:
1) establishing the river channel distance matrix
The river length between a target watershed and its upstream watershed is taken as the river channel distance, and the river length is estimated from a digital elevation model DEM; wherein, if water from the upstream watershed flows through the downstream watershed, the river channel distance is the flow path length between the two watersheds; if no upstream-downstream relationship exists between two watersheds, that is, the river channels are not connected, the river channel distance is 0; and the distance from a watershed to itself is 0;
the river channel distance matrix is as follows:
$$\begin{bmatrix} 0 & dh_{1,2} & \cdots & dh_{1,n} \\ dh_{2,1} & 0 & \cdots & dh_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ dh_{n,1} & dh_{n,2} & \cdots & 0 \end{bmatrix} \qquad (1)$$
wherein $dh_{m,k}$ represents the river channel distance between hydrological station m and hydrological station k, and when m = k, the distance is 0;
2) establishing the Euclidean distance matrix
The horizontal distance between two hydrological stations is taken as the Euclidean distance, and the Euclidean distance matrix is as follows:
$$\begin{bmatrix} 0 & de_{1,2} & \cdots & de_{1,n} \\ de_{2,1} & 0 & \cdots & de_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ de_{n,1} & de_{n,2} & \cdots & 0 \end{bmatrix} \qquad (2)$$
wherein $de_{m,k}$ represents the horizontal distance between hydrological station m and hydrological station k, and when m = k, the distance is 0;
3) establishing the hydrological correlation coefficient matrix
The daily flow correlation between hydrological stations is calculated by using the Pearson correlation coefficient, and a correlation coefficient graph is established;
the matrix formula is as follows:
$$\begin{bmatrix} 1 & c_{1,2} & \cdots & c_{1,n} \\ c_{2,1} & 1 & \cdots & c_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n,1} & c_{n,2} & \cdots & 1 \end{bmatrix} \qquad (3)$$
wherein $c_{m,k}$ is the correlation between hydrological station m and hydrological station k, and the 1s on the diagonal of the matrix in formula (3) mean that the correlation coefficient of every station with itself is 1.
3. The method for constructing a spatiotemporal feature-based hydrological prediction model according to claim 1, wherein: in step S2, the adjacency matrices of the river channel distance matrix, the Euclidean distance matrix and the correlation coefficient matrix are combined by weighted averaging, thereby fusing the three into the hydrological topological structure graph,
the elements of the adjacency matrix of the fused hydrological topological structure graph are as follows:
Figure FDA0002685252920000023
wherein $a_{i,j}$ is the element in row i and column j of the adjacency matrix.
4. The method for constructing a spatiotemporal feature-based hydrological prediction model according to claim 1, wherein: the specific method for mining the spatial features of the hydrological topological structure by using the graph convolution network GCN in step S3 is as follows:
the graph convolution network GCN model is defined as follows:
$$X^{(l+1)} = f\left(X^{(l)}, A\right) \qquad (5)$$
wherein l is the layer index of the graph convolution network GCN model, $X^{(l)}$ is the feature of the nodes in layer l, and A is the adjacency matrix of the hydrological topological structure graph;
then the node features are transformed, and the adjacency matrix is normalized by using the degree matrix; after self-loops are added, the relationship between each node and its adjacent nodes is considered, and the specific model formula is as follows:
$$x_i^{(l+1)} = \sigma\left(\sum_{j \in N_i} \frac{\tilde{A}_{ij}}{\sqrt{\tilde{D}_{ii}\,\tilde{D}_{jj}}}\; x_j^{(l)}\, W^{(l)} + b^{(l)}\right)$$
wherein $x_i^{(l+1)}$ is the feature of node i in layer l+1, $x_j^{(l)}$ is the feature in layer l of each neighboring node j, σ is a nonlinear transformation, A is the adjacency matrix, $\tilde{A}$ denotes the adjacency matrix with self-loops, $\tilde{D}$ is the degree matrix corresponding to $\tilde{A}$, $N_i$ is the set of all neighbors of node i, $W^{(l)}$ is the weight of the l-th layer, and $b^{(l)}$ is the intercept of the l-th layer.
5. The method for constructing a spatiotemporal feature-based hydrological prediction model according to claim 1, wherein: the specific method for extracting the temporal features by using the gated recurrent unit in step S4 is as follows:
1) establishing the update gate: $z_t = \sigma(W_1 \cdot [h_{t-1}, x_t])$
wherein $W_1$ is the weight of the update gate and $h_{t-1}$ is the hydrological output of the neuron at the previous time step; the larger the value of $z_t$, the less information is retained from the neuron at the previous time step and the more information is retained from the neuron at the current time step;
2) constructing the reset gate: $r_t = \sigma(W_2 \cdot [h_{t-1}, x_t])$
wherein $W_2$ is the weight matrix of the reset gate; when $r_t$ is 0, the useless information passed on by the neuron at the previous time step is discarded, and only the input of the neuron at the current time step is retained as the input;
3) candidate output of the neuron:
$$\tilde{h}_t = \tanh\left(W_3 \cdot [r_t * h_{t-1}, x_t]\right)$$
wherein $W_3$ is the weight vector of the candidate output and tanh is the hyperbolic tangent function;
4) neuron output:
$$h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t$$
6. The method for constructing a spatiotemporal feature-based hydrological prediction model according to claim 1, wherein: the method for constructing the final hydrological forecasting model in step S5 comprises: inputting the space-time feature matrix and the adjacency matrix of the hydrological topological structure graph into the GRU hydrological model:
$$f(X_t, A) = \sigma\left(\hat{A}\;\mathrm{ReLU}\left(\hat{A}\,X_t\,W_0\right) W_1\right), \qquad \hat{A} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}$$
wherein $X_t$ is the feature matrix, A is the adjacency matrix, σ is a nonlinear transformation, $\tilde{A}$ denotes the adjacency matrix with self-loops, $\tilde{D}$ is the degree matrix corresponding to $\tilde{A}$, $W_0$ is the weight of the first layer, and $W_1$ is the weight of the second layer;
then the GRU is used for time series prediction, with the following steps:
1) establishing the update gate: $z_t = \sigma(W_z \cdot [h_{t-1}, f(X_t, A)])$;
2) constructing the reset gate: $r_t = \sigma(W_r \cdot [h_{t-1}, f(X_t, A)])$;
3) candidate output of the neuron:
$$\tilde{h}_t = \tanh\left(W_h \cdot [r_t * h_{t-1}, f(X_t, A)]\right)$$
4) neuron output:
$$h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t$$
wherein $W_z$, $W_r$ and $W_h$ are, in sequence, the weight of the update gate, the weight matrix of the reset gate, and the weight vector of the candidate output.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010974378.7A CN112182063A (en) 2020-09-16 2020-09-16 Method for constructing hydrological forecasting model based on space-time characteristics

Publications (1)

Publication Number Publication Date
CN112182063A true CN112182063A (en) 2021-01-05

Family

ID=73921377

Country Status (1)

Country Link
CN (1) CN112182063A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210105