CN114819386A - Conv-Transformer-based flood forecasting method - Google Patents

Conv-Transformer-based flood forecasting method

Info

Publication number
CN114819386A
Authority
CN
China
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210532559.3A
Other languages
Chinese (zh)
Inventor
冯钧
王众沂
巫义锐
陆佳民
Current Assignee
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date
Filing date
Publication date
Application filed by Hohai University (HHU)
Priority to CN202210532559.3A
Publication of CN114819386A

Classifications

    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06N3/044: Recurrent networks, e.g. Hopfield networks
    • G06N3/045: Combinations of networks
    • G06N3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/08: Learning methods
    • G06Q50/265: Personal security, identity or safety


Abstract

The invention belongs to the technical field of data-driven flood forecasting and discloses a Conv-Transformer-based flood forecasting method. First, hydrological data of the large-scale basin under study are collected, and after preprocessing the collected hydrological historical data are input into the model. Second, the hydrological historical data undergo data cleaning, data transformation, data-set division and the like. Third, a Transformer-based flood forecasting model is constructed: a convolutional long short-term memory network performs relative position coding and extracts spatial information, improving the model's ability to learn long-term dependency information; the self-attention mechanism in the Transformer module captures the dynamic spatio-temporal correlation between hydrological elements by capturing the internal correlation of the hydrological sequences, and the multi-head attention mechanism lets the model learn long-term and short-term hydrological historical information simultaneously. Test data are then input to evaluate the model's forecasting performance; if the network performance does not meet the requirements, the parameters are adjusted until a satisfactory prediction result is reached. Finally, the model is analyzed against the evaluation criteria to complete the flood forecast. The beneficial effects of the invention are that flood-peak magnitude and flood trend can be forecast effectively, making the method an effective tool for real-time flood forecasting in large-scale drainage basins.

Description

Conv-Transformer-based flood forecasting method
Technical Field
The invention relates to the technical field of data-driven flood forecasting, in particular to a Conv-Transformer-based flood forecasting method.
Background
Flood forecasting is one of the important non-engineering measures for preventing flood disasters. Timely and effective flood early warning and forecasting helps people defend against floods and reduce flood damage, and is therefore an important application for disaster prevention and mitigation.
At present, flood forecasting generally adopts two approaches, hydrological models based on the runoff process and data-driven intelligent models, and the two complement each other in practical forecasting. Data-driven modeling largely ignores the physical mechanism of the hydrological process; it is a black-box method whose aim is to establish the optimal mathematical relationship between input and output data. Flood disasters caused by long-term rainfall require long-term dependency information to be considered. Unlike sudden flood disasters caused by short-term heavy rainfall, floods caused by long-term rainfall last longer and their peaks arrive later, so the long-term dependency characteristics of the hydrological data must be modeled properly rather than considering short-term information alone. Moreover, in flow-prediction tasks for large drainage basins, flood events caused by long-term rainfall are numerous. However, existing intelligent flood models still tend to lose long-term sequence information during network training. Considering long-term and short-term characteristics jointly when constructing the intelligent model, and in particular strengthening the learning of the easily lost long-term dependency information, improves the accuracy of flood forecasting for large-scale drainage basins.
To alleviate the difficulty of parallelizing the recurrent computation of recurrent neural networks, the Google team proposed the Transformer model in 2017. The model uses no recurrent unit structure; instead it uses a multi-head attention mechanism to learn the dependency relationships between word vectors, enabling efficient parallel processing. Recently, the Transformer model has been widely applied in fields such as computer vision, natural language processing and time-series prediction. The invention constructs a Transformer flood forecasting model in which a convolutional long short-term memory network performs relative position coding: the model first performs relative position coding and spatial information extraction through the convolutional long short-term memory network and a fully connected layer, and then mines the association relationships between all feature elements with a submodule based on the Transformer encoder, whose multi-head attention mechanism lets the model learn long-term and short-term hydrological historical information simultaneously, thereby achieving more accurate hydrological forecasting.
The Transformer in the natural language processing field abandons the loop structure of the recurrent neural network and resolves the relations within each input sequence, and between input and output sequences, relying solely on the self-attention mechanism; the original NLP Transformer applies sine-cosine absolute position coding to the input sentences. However, the linear transformation of absolute coding easily loses position information, makes the relative distance information between the various hydrological features hard to control, and hinders the extraction of hydrological spatial information. Therefore, by changing the coding method so that the data contain relative position information, the intelligent flood forecasting model can learn the long-term dependency and the spatial dependency of the hydrological sequence, improving forecasting accuracy. The invention adopts a convolutional long short-term memory network for relative position coding, extracting the spatial features of the hydrological data while preserving global information extraction.
Disclosure of Invention
The purpose of the invention is as follows: the invention aims to remedy the defects of the prior art and provide a Conv-Transformer-based flood forecasting method, which performs relative position coding through a convolutional long short-term memory network, extracts the spatial features of the hydrological data while preserving global information extraction, efficiently mines hydrological temporal information through a Transformer submodule, and improves the accuracy of flood forecasting.
The technical scheme is as follows: the invention relates to a Conv-Transformer-based flood forecasting method comprising the following steps:
step S1, collecting hydrological historical data of the large-scale basin;
step S2, preprocessing the collected hydrologic history data;
s3, carrying out relative position coding on the data subjected to data preprocessing through a long-term and short-term memory network based on convolution operation, and then carrying out time feature modeling on the hydrological historical data subjected to spatial information extraction through a Transformer submodule;
and step S4, testing the performance of the forecasting model through the forecast values obtained by the model at each stage for the basin, and analyzing the model output data against the evaluation criteria to complete the flood forecast.
Step S1 collects basin historical data. Further: when the flow data are collected, the historical flow data of the basin outlet section are generally collected, covering recent data from the survey station. The meteorological data in the basin mainly comprise attributes such as evaporation, rainfall, temperature and wind speed.
Step S2 performs the data preprocessing. Further:
step S2.1, preprocessing the ground rainfall measurement station data in step S2 comprises data cleaning, data transformation and normalization;
the normalization formula is as follows:
Figure BSA0000273487360000021
wherein X is a normalized value, X i Is an original value, X min Is the minimum value in the original sequence, X max Is the maximum value in the original sequence;
step S2.2, the preprocessing of the weather and flow attribute data in step S2 includes the construction of a two-dimensional matrix, and the specific operations are as follows: the column attributes formed by the two-dimensional matrix are historical runoff monitored by the hydrological station and various meteorological values detected by a plurality of meteorological monitoring stations; combining the runoff quantity and the meteorological value of each time period to obtain a final input two-dimensional matrix, and inputting the input matrix into the model subsequently;
s2.3, taking the first 80% of the data preprocessed in the step S2 as a model training set, and taking the last 20% of the data as a test set;
the drainage basin spatial geographic feature relationship is complex, and the drainage basin spatial geographic feature relationship needs to be researched from multiple angles, so as to fully mine the complex space-time features of the medium and small drainage basins, and the step S3 includes:
s3.1, performing relative position coding by using a long-term and short-term memory neural network based on convolution operation, extracting spatial information by perfecting the learning of global characteristic information, and improving the learning capability of a model to long-term dependency information, wherein the specific calculation method of the long-term and short-term memory network based on convolution operation in a relative position coding module is as follows:
f t =σ(W xf *x t +W hf *h t-1 +W cf ⊙C t-1 +b f )
i t =σ(W xi *x t +W hi *h t-1 +W ci ⊙C t-1 +b i )
C t =f t ⊙C t-1 +i t ⊙tanh(W xc *x t +W hc *h t-1 +b c )
o t =σ(W xo *x t +W ho *h t-1 +W co ⊙C t-1 +b o )
h t =o t ⊙tanh(C t )
wherein [ ] represents convolution operation, the [ ] represents Hadamard product, the σ represents activation function sigmoid function, and x t Represented by the input data of the neuronal cells at time t, h t Representing the state of the information passed to the next layer at time t, C t Representing the value of the information state of the neuronal cell at time t, f t 、i t And o t Respectively representing a forgetting gate, an input gate and an output gate;
Step S3.2, after the hidden-layer processing of the convolutional LSTM network, the output is passed through a fully connected layer, whose output serves as the relative position coding result, i.e. the input of the subsequent Transformer submodule. The fully connected layer is:

R_L(x_t) = ReLU(W_R h_t + b_R)

where ReLU is the activation function, W_R is a weight matrix, and b_R is a bias.
Step S3.3, constructing a Transformer submodule to extract temporal features. The module consists of a multi-head self-attention layer, a feed-forward network layer, residual connections and normalization operations, computed as follows:

Z = LayerNorm(X + MultiHead(X))
X_out = LayerNorm(Z + FFN(Z))

where X is the input matrix. The input first passes through the multi-head self-attention layer, completing the MultiHead(·) operation, then through the residual connection and layer normalization; finally the feed-forward network FFN(·) performs the dimension conversion, followed once more by residual connection and normalization. The detailed operation of the multi-head self-attention mechanism is as follows:
q_i = W_Q a_i
k_i = W_K a_i
v_i = W_V a_i

First, the input matrix X = {x_1, x_2, ..., x_N} is processed through the embedding layer to obtain a_i = W x_i (i = 1, 2, ..., N); then the three linear transformation weight matrices W_Q, W_K and W_V are applied to obtain q_i, k_i and v_i, the sub-vectors of the query Q, key K and value V required by the following calculations. For input x_1, the output b_1 of the multi-head self-attention module is computed as follows:
α_{1,i} = q_1 · k_i,  i = 1, 2, ..., N
α̂_{1,i} = Softmax(α_{1,i}) = exp(α_{1,i}) / Σ_{j=1}^{N} exp(α_{1,j})
b_1 = Σ_{i=1}^{N} α̂_{1,i} v_i

where α_{1,1}, α_{1,2}, ..., α_{1,N} are obtained as the vector dot products of q_1 with each k_i; the Softmax operation normalizes them into the weights α̂_{1,i}, which are finally multiplied with the corresponding v_i and summed to obtain b_1. The other b_i are computed analogously by substituting the corresponding index i, and after passing through the different heads the final output B = {b_1, b_2, ..., b_N} is obtained. The residual connection is the superposition of the input and the output of the multi-head self-attention layer. Layer normalization LayerNorm_{γ,β}(x) operates as follows:

μ = (1/H) Σ_{i=1}^{H} x_i
σ² = (1/H) Σ_{i=1}^{H} (x_i - μ)²
LayerNorm_{γ,β}(x) = γ ⊙ (x - μ) / √(σ² + ε) + β

where x represents the input of the neuron, H is the number of units in the layer, μ the mean, σ² the variance, ε a constant added to prevent a zero denominator, and γ, β are coefficients. The feed-forward neural network mainly comprises two fully connected layers and a ReLU activation function, operating as:

FeedForward(Z) = ReLU(Z W_1 + b_1) W_2 + b_2

where Z is the output of the multi-head self-attention layer, W_1 and W_2 are weight matrices, and b_1 and b_2 are biases;
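The Transformer encoder submodule can be sketched in NumPy as follows. The head count and toy dimensions are our simplifying assumptions, no output projection is applied after head concatenation, and the sketch follows the unscaled dot product written above (standard Transformers additionally scale the dot products by 1/√d of the head dimension).

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-6):
    """LayerNorm_{γ,β}(x): normalize each row by its mean μ and variance σ²."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def multi_head_self_attention(A, Wq, Wk, Wv, n_heads):
    """b_i = Σ_j softmax(q_i · k_j) v_j per head; heads are concatenated."""
    N, d = A.shape
    dh = d // n_heads
    Q, K, V = A @ Wq, A @ Wk, A @ Wv        # q_i = W_Q a_i, etc., in row form
    heads = []
    for h in range(n_heads):
        s = slice(h * dh, (h + 1) * dh)
        alpha = softmax(Q[:, s] @ K[:, s].T)  # α̂ attention weights
        heads.append(alpha @ V[:, s])
    return np.concatenate(heads, axis=-1)

def encoder_block(X, p, n_heads=2):
    """Z = LayerNorm(X + MultiHead(X)); out = LayerNorm(Z + FFN(Z))."""
    Z = layer_norm(X + multi_head_self_attention(X, p["Wq"], p["Wk"], p["Wv"], n_heads))
    ffn = np.maximum(0.0, Z @ p["W1"] + p["b1"]) @ p["W2"] + p["b2"]  # ReLU feed-forward
    return layer_norm(Z + ffn)
```

The residual additions `X + ...` and `Z + ...` are the superposition of each sublayer's input and output described above.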
After the construction of the complete Conv-Transformer model, a series of experiments is performed to verify its feasibility. Step S4 includes:
s4.1, inputting the predicted value to test the performance of the prediction model, and judging the size change trend of the loss function value of the whole model until the loss function value is in a decreasing trend and tends to be gentle;
step S4.2, evaluating the performance of the model by using the data obtained by the test set, so as to improve the model by changing the parameters of the model, specifically, evaluating the flood forecasting result based on the attention mechanism by using three evaluation standards, wherein the three evaluation standards are average absolute error, decision coefficient and root mean square error, and the three evaluation standard formulas are as follows:
1) mean absolute error MAE:

MAE = (1/N) Σ_{m=1}^{N} | y_m - ŷ_m |

where y_m is the observed river flow of the m-th sample, ŷ_m is the predicted river flow of the m-th sample, and N is the number of test samples;

2) coefficient of determination R²:

R² = 1 - Σ_{m=1}^{N} (y_m - ŷ_m)² / Σ_{m=1}^{N} (y_m - ȳ)²

where y_m is the observed river flow of the m-th sample, ŷ_m is the predicted river flow of the m-th sample, ȳ is the mean of the observed river flows, and N is the number of test samples;

3) root mean square error RMSE:

RMSE = √[ (1/N) Σ_{m=1}^{N} (y_m - ŷ_m)² ]

where y_m is the observed river flow of the m-th sample, ŷ_m is the predicted river flow of the m-th sample, and N is the number of test samples;
and step S4.3, outputting the model's forecast result.
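The three evaluation criteria can be computed as follows (an illustrative sketch; y denotes the observed series, y_hat the forecast, and the function names are ours):

```python
import numpy as np

def mae(y, y_hat):
    """Mean absolute error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean(np.abs(y - y_hat)))

def rmse(y, y_hat):
    """Root mean square error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def r2(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

Lower MAE and RMSE and an R² closer to 1 indicate a better forecast.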
Beneficial effects: compared with the prior art, the invention has the following advantages.
The method uses a deep learning algorithm and adopts a Transformer flood forecasting approach in which a convolutional long short-term memory network performs the relative position coding, called the Conv-Transformer method for short. Compared with traditional methods, it strengthens the modeling of local context information in the hydrological sequence and improves flood forecasting accuracy. The model not only draws on the self-attention mechanism to strengthen the key feature information among the hydrological feature elements, but also attends to long-term and short-term information in the historical flood data simultaneously through the different heads of the multi-head attention mechanism, capturing time-series information more efficiently.
In addition, the invention adds the relative position coding of the convolutional long short-term memory neural network, which describes the relation between the information at each moment and the current information, captures long-term dependency information, accomplishes the acquisition of global information from local perception, and extracts the spatial features of the multidimensional hydrological data. The invention jointly considers the dynamic relevance of temporal and spatial features, improves the ability to learn long-term dependency information, and effectively improves the accuracy of flood forecasting.
Drawings
FIG. 1 is a flow chart of an experiment according to the present invention;
FIG. 2 is a schematic diagram of the convolutional long short-term memory network unit structure of the flood forecasting model of the present invention;
fig. 3 is a detailed block diagram of the present invention.
Detailed Description
The technical solution of the present invention is described in detail below, but the scope of the present invention is not limited to the embodiments.
As shown in fig. 1, a Conv-Transformer-based flood forecasting method of the present embodiment includes the following steps:
step S1, collecting the hydrological historical data of the Cuntan hydrological station in the Yangtze River basin.
When the flow data are collected, they are the data of the Cuntan basin flow station, namely the historical flow data of the outlet section, covering recent years of data from the survey station. The meteorological data in the basin are the data of 110 meteorological monitoring stations in the Yangtze River basin, obtained from the China National Meteorological Information Center, and comprise evaporation, rainfall, temperature and wind-speed values.
Step S2, preprocessing the collected flow data and meteorological data of the Cuntan hydrological station;
step S2.1, preprocessing the ground rainfall measurement station data in step S2 comprises data cleaning, data transformation and normalization;
the normalization formula is as follows:
Figure BSA0000273487360000061
wherein X is a normalized value, X i Is an original value, X min Is the minimum value in the original sequence, X max Is the maximum value in the original sequence;
the preprocessing of the weather and flow attribute data in the step S2.2 and the step S2 includes the construction of a two-dimensional matrix, and the specific operations are as follows: the column attributes of the two-dimensional matrix are historical runoff monitored by hydrographic monitoring stations in the cun-beach basin and various weather values detected by a plurality of weather monitoring stations; combining the runoff and meteorological values of each day in the data set to obtain a final input two-dimensional matrix, and inputting the input matrix into the model subsequently;
and S2.3, taking the first 80% of the data of the cun-beach watershed preprocessed in the step S2 as a model training set, and taking the last 20% of the data as a test set.
Step S3, performing relative position coding on the preprocessed Cuntan hydrological data through the convolutional long short-term memory network, and then performing temporal feature modeling through the Transformer submodule on the hydrological historical data after spatial information extraction;
s3.1, performing relative position coding by using a long-term and short-term memory neural network based on convolution operation, extracting spatial information by perfecting the learning of global characteristic information, and improving the learning capability of a model to long-term dependency information, wherein the specific calculation method of the long-term and short-term memory network based on convolution operation in a relative position coding module is as follows:
f t =σ(W xf *x t +W hf *h t-1 +W cf ⊙C t-1 +b f )
i t =σ(W xi *x t +W hi *h t-1 +W ci ⊙C t-1 +b i )
C t =f t ⊙C t-1 +i t ⊙tanh(W xc *x t +W hc *h t-1 +b c )
o t =σ(W xo *x t +W ho *h t-1 +W co ⊙C t-1 +b o )
h t =o t ⊙tanh(C t )
wherein [ ] represents convolution operation, the [ ] represents Hadamard product, the σ represents activation function sigmoid function, and x t Represented by the input data of the neuronal cells at time t, h t Representing the state of the information passed to the next layer at time t, C t Representing the value of the information state of the neuronal cell at time t, f t 、i t And o t Respectively representing a forgetting gate, an input gate and an output gate;
Step S3.2, after the hidden-layer processing of the convolutional LSTM network, the output is passed through a fully connected layer, whose output serves as the relative position coding result, i.e. the input of the subsequent Transformer submodule. The fully connected layer is:

R_L(x_t) = ReLU(W_R h_t + b_R)

where ReLU is the activation function, W_R is a weight matrix, and b_R is a bias.
Step S3.3, constructing a Transformer submodule to extract temporal features. The module consists of a multi-head self-attention layer, a feed-forward network layer, residual connections and normalization operations, computed as follows:

Z = LayerNorm(X + MultiHead(X))
X_out = LayerNorm(Z + FFN(Z))

where X is the input matrix. The input first passes through the multi-head self-attention layer, completing the MultiHead(·) operation, then through the residual connection and layer normalization; finally the feed-forward network FFN(·) performs the dimension conversion, followed once more by residual connection and normalization. The detailed operation of the multi-head self-attention mechanism is as follows:
q_i = W_Q a_i
k_i = W_K a_i
v_i = W_V a_i

First, the input matrix X = {x_1, x_2, ..., x_N} is processed through the embedding layer to obtain a_i = W x_i (i = 1, 2, ..., N); then the three linear transformation weight matrices W_Q, W_K and W_V are applied to obtain q_i, k_i and v_i, the sub-vectors of the query Q, key K and value V required by the following calculations. For input x_1, the output b_1 of the multi-head self-attention module is computed as follows:
α_{1,i} = q_1 · k_i,  i = 1, 2, ..., N
α̂_{1,i} = Softmax(α_{1,i}) = exp(α_{1,i}) / Σ_{j=1}^{N} exp(α_{1,j})
b_1 = Σ_{i=1}^{N} α̂_{1,i} v_i

where α_{1,1}, α_{1,2}, ..., α_{1,N} are obtained as the vector dot products of q_1 with each k_i; the Softmax operation normalizes them into the weights α̂_{1,i}, which are finally multiplied with the corresponding v_i and summed to obtain b_1. The other b_i are computed analogously by substituting the corresponding index i, and after passing through the different heads the final output B = {b_1, b_2, ..., b_N} is obtained. The residual connection is the superposition of the input and the output of the multi-head self-attention layer. Layer normalization LayerNorm_{γ,β}(x) operates as follows:

μ = (1/H) Σ_{i=1}^{H} x_i
σ² = (1/H) Σ_{i=1}^{H} (x_i - μ)²
LayerNorm_{γ,β}(x) = γ ⊙ (x - μ) / √(σ² + ε) + β

where x represents the input of the neuron, H is the number of units in the layer, μ the mean, σ² the variance, ε a constant added to prevent a zero denominator, and γ, β are coefficients. The feed-forward neural network mainly comprises two fully connected layers and a ReLU activation function, operating as:

FeedForward(Z) = ReLU(Z W_1 + b_1) W_2 + b_2

where Z is the output of the multi-head self-attention layer, W_1 and W_2 are weight matrices, and b_1 and b_2 are biases.
Step S4, testing the performance of the forecasting model through the forecast values obtained by the model at each stage for the Cuntan basin, and analyzing the model's forecast data against the evaluation criteria to complete the flood forecast.
Step S4.1, testing the performance of the forecasting model through the predicted values obtained at each stage on the Cuntan data set, and monitoring the trend of the model's loss value until the loss decreases and levels off;
Step S4.2, evaluating the model performance with the data obtained on the test set so as to improve the model by adjusting its parameters; specifically, the flood forecasting results of the attention-based model are evaluated with three criteria, namely mean absolute error, coefficient of determination and root mean square error, whose formulas are as follows:
1) mean absolute error MAE:
Figure BSA0000273487360000084
wherein the content of the first and second substances,
Figure BSA0000273487360000085
-the actual observed value of the flow of the cun-beach basin of the mth sample,
Figure BSA0000273487360000086
-the m-th sample river flow prediction value, N-the number of test samples;
2) coefficient of determination R²:

R² = 1 − [Σ_{m=1}^{N} (Q_m − Q̂_m)²] / [Σ_{m=1}^{N} (Q_m − Q̄)²]

wherein Q_m is the actual observed flow of the Cuntan basin for the m-th sample, Q̂_m is the predicted river flow for the m-th sample, Q̄ is the mean of the observed river flows over the test samples, and N is the number of test samples;
3) root mean square error RMSE:

RMSE = √[(1/N) Σ_{m=1}^{N} (Q_m − Q̂_m)²]

wherein Q_m is the actual observed flow of the Cuntan basin for the m-th sample, Q̂_m is the predicted river flow for the m-th sample, and N is the number of test samples;
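The three criteria can be computed directly from paired observed and predicted flow series; a minimal NumPy sketch follows. The flow values below are illustrative only and are not Cuntan basin data.

```python
import numpy as np

def mae(obs, pred):
    # Mean absolute error over N test samples
    return np.mean(np.abs(obs - pred))

def r2(obs, pred):
    # Coefficient of determination: 1 - SSE / total variance of observations
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def rmse(obs, pred):
    # Root mean square error
    return np.sqrt(np.mean((obs - pred) ** 2))

# Illustrative flows (m^3/s); not real Cuntan observations
obs = np.array([1200.0, 1500.0, 1800.0, 1600.0])
pred = np.array([1150.0, 1550.0, 1750.0, 1650.0])
print(mae(obs, pred), rmse(obs, pred), round(r2(obs, pred), 4))
# → 50.0 50.0 0.9467
```

Lower MAE and RMSE and an R² closer to 1 indicate a better fit between forecast and observation, which is the basis on which the model parameters are adjusted in step S4.2.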
and S4.3, outputting a model prediction result.

Claims (5)

1. A Conv-Transformer-based flood forecasting method, characterized in that the method comprises the following steps:
step S1, collecting hydrological historical data of the large-scale basin;
step S2, preprocessing the collected hydrologic history data;
s3, carrying out relative position coding on the preprocessed data through a long short-term memory network based on convolution operation, and then carrying out temporal feature modeling on the hydrological historical data from which spatial information has been extracted, through a Transformer submodule;
and step S4, testing the performance of the forecasting model through the predicted values obtained for the Cuntan basin at each stage of the model, and analyzing the output data of the model against evaluation criteria to complete flood forecasting.
2. The Conv-Transformer-based flood forecasting method according to claim 1, wherein: the collected hydrological data comprise historical meteorological data within the basin and historical flow data at the outlet section of the basin, and the data preprocessing comprises data cleaning, data transformation and data set division.
3. The Conv-Transformer-based flood forecasting method according to claim 1, wherein:
step S2.1, the preprocessing of the hydrological historical data in step S2 comprises data cleaning, data transformation and normalization;
step S2.2, the preprocessing of the hydrological historical data in the step S2 comprises the construction of a two-dimensional input matrix;
and S2.3, taking the first 80% of the data preprocessed in the step S2 as a training set, and taking the last 20% of the data as a test set.
4. The Conv-Transformer-based flood forecasting method according to claim 1, wherein:
s3.1, performing relative position coding by using a long short-term memory neural network based on convolution operation, extracting spatial information by improving the learning of global feature information, and enhancing the model's ability to learn long-term dependency information;
s3.2, realizing linearization through a fully connected layer;
and S3.3, performing temporal feature modeling on the hydrological historical data from which spatial information has been extracted, through the multi-head self-attention mechanism of the Transformer submodule.
5. The Conv-Transformer-based flood forecasting method according to claim 1, wherein:
s4.1, inputting the predicted values to test the performance of the forecasting model, and monitoring the trend of the overall model's loss function value until it shows a decreasing trend and levels off;
s4.2, evaluating the performance of the model with the data obtained from the test set, so as to improve the model by adjusting its parameters; specifically, the attention-based flood forecasting results are evaluated with three criteria, namely mean absolute error, coefficient of determination and root mean square error;
and S4.3, outputting a model prediction result.
CN202210532559.3A 2022-05-19 2022-05-19 Conv-Transformer-based flood forecasting method Pending CN114819386A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210532559.3A CN114819386A (en) 2022-05-19 2022-05-19 Conv-Transformer-based flood forecasting method


Publications (1)

Publication Number Publication Date
CN114819386A true CN114819386A (en) 2022-07-29

Family

ID=82516176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210532559.3A Pending CN114819386A (en) 2022-05-19 2022-05-19 Conv-Transformer-based flood forecasting method

Country Status (1)

Country Link
CN (1) CN114819386A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117132606A (en) * 2023-10-24 2023-11-28 四川大学 Segmentation method for lung lesion image
CN117132606B (en) * 2023-10-24 2024-01-09 四川大学 Segmentation method for lung lesion image
CN117575873A (en) * 2024-01-15 2024-02-20 安徽大学 Flood warning method and system for comprehensive meteorological hydrologic sensitivity
CN117575873B (en) * 2024-01-15 2024-04-05 安徽大学 Flood warning method and system for comprehensive meteorological hydrologic sensitivity


Legal Events

Date Code Title Description
PB01 Publication
DD01 Delivery of document by public notice

Addressee: Wang Zhongyi

Document name: Notification on Qualification of Preliminary Examination of Invention Patent Application

Addressee: Wang Zhongyi

Document name: Notice of Publication of Patent Application for Invention
