CN115933010A - Radar echo extrapolation near weather prediction method - Google Patents

Radar echo extrapolation near weather prediction method

Info

Publication number
CN115933010A
Authority
CN
China
Prior art keywords
output
matrix
layer
radar echo
att
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211688110.2A
Other languages
Chinese (zh)
Inventor
程勇
钱坤
王军
何光鑫
渠海峰
王伟
何佳信
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN202211688110.2A priority Critical patent/CN115933010A/en
Publication of CN115933010A publication Critical patent/CN115933010A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A 90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a radar echo extrapolation nowcasting method, which comprises the following steps: obtaining historical radar echo sequence samples; building and training an AFR-LSTM-based predictive neural network model, dividing the radar echo sequence samples into batches of size batch_size and inputting them into the predictive neural network model, where each batch is forward-propagated through the multilayer network and then back-propagated to update the network weights, to obtain the trained predictive neural network model; inputting the radar echo sequence samples of a set time period into the trained predictive neural network model to obtain a radar echo extrapolation image sequence; and determining a nowcasting result according to the radar echo extrapolation image sequence.

Description

Radar echo extrapolation near weather prediction method
Technical Field
The invention belongs to the technical field of short-term weather forecasting, and particularly relates to a radar echo extrapolation nowcasting method.
Background
Radar echo extrapolation can be regarded as estimating and predicting the trend of a continuous time series of images, i.e., predicting the radar echo images for a certain period in the future from the existing radar echo images of a past period. Nowcasting generally refers to describing the current weather conditions and forecasting the weather within the next two hours; its main forecast targets include severe weather such as heavy precipitation, strong wind and hail. For example, the goal of heavy-precipitation nowcasting is to forecast the regional precipitation intensity and distribution in the next two hours accurately and in a timely manner. It can be seen that radar echo extrapolation can provide an intuitive radar-echo image reference for nowcasting, so rapidly and accurately predicting weather radar image sequences has become one of the research hotspots in the meteorological field.
Deep learning methods are capable of modeling highly nonlinear complex systems; combining deep learning with radar echo extrapolation can uncover latent patterns in massive radar data and thereby improve the accuracy of weather prediction for a specified area over a future period. Long Short-Term Memory (LSTM) is a variant of the recurrent neural network (RNN) that solves the long-term dependence problem of sequences by introducing memory and gating units into the RNN cell. Many improved models have been derived from it, such as ConvLSTM (convolutional long short-term memory) and PredRNN (predictive recurrent neural network). To maintain long-term spatiotemporal correlations, Eidetic 3D LSTM and SA-ConvLSTM use attention mechanisms. An attention mechanism can retrieve information from historical memories and retain more spatiotemporal representations. However, these models recall previous memories with only a single attention mechanism and can therefore recall information from only a single channel; moreover, they do not consider the information loss that occurs during encoding and decoding. As a result, the information-transfer capability of these networks is insufficient, which affects the accuracy of precipitation prediction at future times.
Disclosure of Invention
The invention aims to overcome the defects of the prior art that the attention mechanism can only recall information of a single channel and that information loss during encoding and decoding is not fully considered.
To solve the above problems, the invention provides a radar echo extrapolation nowcasting method. An Attention Fusion module is proposed, which uses attention to fuse channel information and spatiotemporal information and obtain a better long-term spatiotemporal representation, so that the memory unit can effectively recall stored memories across multiple timestamps even after long-term interference. In addition, an information Recall mechanism is added to the encoding and decoding stages of information transmission to help recall the encoded inputs during decoding, thereby achieving radar echo extrapolation prediction with higher accuracy.
In order to achieve the purpose, the invention adopts the following technical scheme:
in a first aspect, a radar echo extrapolation nowcasting method is provided, comprising:
s1, obtaining a historical radar echo sequence sample;
s2, constructing and training an AFR-LSTM-based predictive neural network model, dividing the radar echo sequence samples into batches of size batch_size and inputting them into the predictive neural network model, and performing back-propagation to update the network weights after each batch is forward-propagated through the multilayer network, to obtain the trained predictive neural network model;
s3, inputting a radar echo sequence sample in a set time period into a trained prediction neural network model to obtain a radar echo extrapolation image sequence;
and S4, determining a nowcasting result according to the radar echo extrapolation image sequence.
In some embodiments, in step S1, obtaining historical radar echo sequence samples includes:
sequentially performing coordinate conversion, data interpolation and horizontal sampling preprocessing on the radar echo maps acquired by a Doppler radar to obtain grayscale images.
Further, the coordinate transformation includes: converting radar echo map data under three-dimensional polar coordinates into a three-dimensional Cartesian rectangular coordinate system;
the data interpolation includes: performing data interpolation by adopting an inverse distance weighting method to obtain regular grid data under a three-dimensional Cartesian rectangular coordinate system;
the horizontal sampling comprises: horizontally sampling the regular grid data in the three-dimensional Cartesian rectangular coordinate system, extracting the two-dimensional plane data at a given height, and mapping the data to 0-255 to obtain an echo-intensity CAPPI grayscale image; the data mapping formula linearly maps the intensity value Z of the data onto 0-255 and rounds down to give the grayscale pixel P, where ⌊·⌋ indicates rounding down.
In some embodiments, step S1 further comprises: converting the grayscale data into normalized gray data normalized_data through normalization; the resulting normalized grayscale data take values in [0,1].
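For illustration, the sketch below shows this preprocessing chain in Python. It is a minimal sketch, not the patent's exact formulas: the linear dBZ-to-gray mapping bounds and the division by 255 for normalization are assumptions chosen only to match the 0-255 and [0,1] ranges described above.

```python
import numpy as np

def reflectivity_to_gray(Z, z_min=-10.0, z_max=75.0):
    """Map reflectivity Z (dBZ) linearly onto 0-255 and round down.

    z_min / z_max are assumed bounds; the patent gives the exact
    mapping formula only as an image.
    """
    P = np.floor((np.clip(Z, z_min, z_max) - z_min) * 255.0 / (z_max - z_min))
    return P.astype(np.uint8)

def normalize_gray(P):
    """Assumed normalization: scale grayscale pixels into [0, 1]."""
    return P.astype(np.float32) / 255.0

# usage: Z is an (H, W) CAPPI reflectivity slice after interpolation
# gray = reflectivity_to_gray(Z); normalized_data = normalize_gray(gray)
```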
In some embodiments, in step S2, the AFR-LSTM-based predictive neural network model sequentially includes: an Encoder Encoder, an AF-LSTM module and a Decoder;
the Encoder comprises 5 convolutional layers for extracting radar echo sequence sample I t Depth feature X of t
The AF-LSTM module comprises 4 layers of AF-LSTM network units which are sequentially stacked behind the Encoder Encoder network in order and used for extracting depth characteristics X of radar echo sequence sample t Temporal and spatial information of (2), hidden state of output
Figure BDA0004021513060000023
Inputting the data into a Decoder;
the AF-LSTM module is used for outputting a memory unit of the same layer network at the previous moment
Figure BDA0004021513060000024
And hidden state
Figure BDA0004021513060000025
Hidden state output by one layer of network before current moment>
Figure BDA0004021513060000026
Space-time memory unit M of the previous layer l-1 And a set M of spatiotemporal memory cells of the front τ layer l-τ:l-1 Inputting the data into AF-LSTM network unit at the l-th layer at the t moment, and obtaining the hidden state/combination outputted by the current network unit after forward propagation>
Figure BDA0004021513060000031
Memory cell>
Figure BDA0004021513060000032
Spatiotemporal memory cell>
Figure BDA0004021513060000033
Wherein t =1,2 \ 8230; 10,l =1,2,3,4;
Figure BDA0004021513060000034
Figure BDA0004021513060000035
Setting parameters through initialization;
the Decoder comprises 5 convolutional layers for hiding the output of the AF-LSTM module
Figure BDA0004021513060000036
Decoding and correspondingly fusing the output of each convolution layer of the encoder to obtain the output radar echoPush image sequence
Figure BDA0004021513060000037
In some embodiments, the processing of the AF-LSTM module comprises:
step 2-1, taking the spatiotemporal memory cell M^{l-1} of the previous layer, the forget gate f_t' and the set M^{l-τ:l-1} of continuous historical spatiotemporal memory cells of the previous layers as input, and outputting through the fused attention mechanism to obtain the multi-time-step spatiotemporal memory term AttFusion;
step 2-2, taking the hidden state H_t^{l-1} output by the previous layer at the current time step and the hidden state H_{t-1}^l and memory cell C_{t-1}^l output by the same layer at the previous time step as input, and updating the current memory cell C_t^l through the input modulation gate g_t, input gate i_t and forget gate f_t; the formulas are:
g_t = tanh(W_xg * H_t^{l-1} + W_hg * H_{t-1}^l + b_g)
i_t = σ(W_xi * H_t^{l-1} + W_hi * H_{t-1}^l + b_i)
f_t = σ(W_xf * H_t^{l-1} + W_hf * H_{t-1}^l + b_f)
C_t^l = f_t ⊙ C_{t-1}^l + i_t ⊙ g_t
where "*" denotes the convolution operation, "⊙" denotes the element-wise (dot) product of matrices, tanh denotes the hyperbolic tangent activation function tanh(x) = (e^x - e^{-x}) / (e^x + e^{-x}), and σ denotes the Sigmoid activation function σ(x) = 1 / (1 + e^{-x}); the convolution kernels W_xg, W_hg, W_xi, W_hi, W_xf, W_hf all have size filter_size × filter_size and number num_hidden × num_hidden; b_g, b_i, b_f denote biases;
step 2-3, taking the hidden state H_t^{l-1} output by the previous layer at the current time step, the spatiotemporal memory cell M^{l-1} of the previous layer and the set M^{l-τ:l-1} of continuous historical spatiotemporal memory cells as input, and updating the current spatiotemporal memory cell M^l through the spatiotemporal memory term AttFusion of step 2-1, the input modulation gate g_t', the input gate i_t' and the forget gate f_t'; the formulas are:
g_t' = tanh(W_xg' * H_t^{l-1} + W_hg' * M^{l-1} + b_g')
i_t' = σ(W_xi' * H_t^{l-1} + W_hi' * M^{l-1} + b_i')
f_t' = σ(W_xf' * H_t^{l-1} + W_hf' * M^{l-1} + b_f')
M^l = f_t' ⊙ AttFusion + i_t' ⊙ g_t'
where "*" denotes the convolution operation, "⊙" denotes the element-wise (dot) product of matrices, and tanh and σ are the hyperbolic tangent and Sigmoid activation functions as defined above; the convolution kernels W_xi', W_hi', W_xg', W_hg', W_xf', W_hf' all have size filter_size × filter_size and number num_hidden × num_hidden; b_i', b_g', b_f' denote biases;
step 2-4, taking the hidden state H_t^{l-1} output by the previous layer at the current time step, the hidden state H_{t-1}^l output by the same layer at the previous time step, the memory cell C_t^l updated in step 2-2 and the spatiotemporal memory cell M^l updated in step 2-3 as input to the output gate O_t, which updates the hidden state H_t^l; the formulas are:
O_t = σ(W_xo * H_t^{l-1} + W_ho * H_{t-1}^l + W_co * C_t^l + W_mo * M^l + b_o)
H_t^l = O_t ⊙ tanh(W_1*1 * [C_t^l, M^l])
where "*" denotes the convolution operation, "⊙" denotes the element-wise (dot) product of matrices, and [·,·] denotes concatenating the two matrices by columns while keeping the rows unchanged; tanh denotes the hyperbolic tangent activation function; the convolution kernel W_1*1 has size 1 × 1 and number num_hidden × num_hidden; W_xo, W_ho, W_co, W_mo have size 5 × 5 and number num_hidden × num_hidden; b_o denotes the bias.
In some embodiments, step 2-1 comprises: the spatiotemporal memory term AttFusion is produced by a spatiotemporal attention module, a channel attention module and a fused attention module;
step 2-1-1, the spatiotemporal attention module: the forget gate f_t' ∈ R^{B×C×H×W} is regarded as the query matrix Q_l, where B, C, H and W respectively denote the feature-map batch size, number of channels, image height and image width; the query matrix Q_l is reshaped into Q_l ∈ R^{B×(H*W)×C}; the corresponding set M^{l-τ:l-1} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_l and the value matrix V_l, where τ is the length of the time series; likewise, the key matrix K_l and value matrix V_l are reshaped into K_l ∈ R^{B×(τ*H*W)×C} and V_l ∈ R^{B×(τ*H*W)×C}; from Q_l, K_l and V_l the output ST_ATT of the spatiotemporal attention module is obtained:
ST_ATT = AttST(M^{l-1}, f_t', M^{l-τ:l-1}) = layernorm(M^{l-1} + softmax(Q_l · K_l^T) · V_l)
Q_l = f_t';  K_l = V_l = M^{l-τ:l-1}
where softmax(Q_l · K_l^T) denotes applying a softmax layer to the product of the query matrix Q_l and the transposed key matrix K_l, and represents the positional similarity between Q_l and K_l, i.e., the degree of correlation between the forget gate f_t' and the set M^{l-τ:l-1} of continuous historical spatiotemporal feature maps; the matrix product with the value matrix V_l is then computed as the weight of the updated information, selectively aggregating the spatiotemporal information of M^{l-τ:l-1}, after which the matrix is reshaped back to its original shape; finally, the result is summed with the spatiotemporal memory cell M^{l-1} of the previous layer and applied to a layernorm layer to obtain the output ST_ATT of the spatiotemporal attention module;
step 2-1-2, the channel attention module: the forget gate f_t' ∈ R^{B×C×H×W} is likewise regarded as the query matrix Q_c and reshaped into Q_c ∈ R^{B×C×(H*W)}; the corresponding set M^{l-τ:l-1} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_c and the value matrix V_c, and the key matrix K_c and value matrix V_c are reshaped into K_c ∈ R^{B×(τ*C)×(H*W)} and V_c ∈ R^{B×(τ*C)×(H*W)}; from Q_c, K_c and V_c the output C_ATT of the channel attention module is obtained:
C_ATT = AttC(M^{l-1}, f_t', M^{l-τ:l-1}) = layernorm(M^{l-1} + softmax(Q_c · K_c^T) · V_c)
Q_c = f_t';  K_c = V_c = M^{l-τ:l-1}
where softmax(Q_c · K_c^T) represents the degree to which the query matrix Q_c and key matrix K_c influence the channels; the matrix product with the value matrix V_c is then taken as the weight of the updated information, selectively aggregating the channel information of M^{l-τ:l-1}, after which the matrix is reshaped back to its original shape; finally, the result is summed with the spatiotemporal memory cell M^{l-1} of the previous layer and applied to a layernorm layer to obtain the output C_ATT of the channel attention module;
step 2-1-3, the fused attention module: the output ST_ATT of the spatiotemporal attention module and the output C_ATT of the channel attention module are fused to obtain the fused attention result AttFusion:
AttFusion = Sum(ST_ATT, C_ATT)
          = conv(conv(layernorm(ReLU(conv(ST_ATT))))
            + conv(layernorm(ReLU(conv(C_ATT)))))
ST_ATT and C_ATT each pass through a convolutional layer with kernel size 3, a layernorm normalization layer, a ReLU activation layer and a convolutional layer with kernel size 1; the two results are summed element-wise, and finally a convolutional layer generates the final fused attention result, i.e., the spatiotemporal memory term AttFusion output by this module.
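As an illustration of steps 2-1-1 to 2-1-3, the following PyTorch sketch implements the fused attention computation under the shapes described above. It is a minimal sketch, not the patent's reference implementation: the module and argument names are made up here, the parameter-free layer normalization and the internal channel widths of the fusion convolutions are assumptions, and the branch order follows the textual description (conv 3, layernorm, ReLU, conv 1).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def _ln(x):
    # parameter-free layer norm over (C, H, W); whether the patent's layernorm
    # is learnable is not specified, so this is an assumption
    return F.layer_norm(x, x.shape[1:])

class AttentionFusion(nn.Module):
    """Sketch of AttFusion: spatiotemporal attention + channel attention, then fusion."""

    def __init__(self, channels=64):
        super().__init__()
        self.st_conv3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.st_conv1 = nn.Conv2d(channels, channels, 1)
        self.c_conv3 = nn.Conv2d(channels, channels, 3, padding=1)
        self.c_conv1 = nn.Conv2d(channels, channels, 1)
        self.out_conv = nn.Conv2d(channels, channels, 1)

    def forward(self, m_prev, f_gate, m_hist):
        # m_prev: (B, C, H, W)    spatiotemporal memory of the previous layer, M^{l-1}
        # f_gate: (B, C, H, W)    forget gate f_t', used as the query
        # m_hist: (B, C, T, H, W) previous tau spatiotemporal memories, M^{l-tau:l-1}
        B, C, T, H, W = m_hist.shape

        # spatiotemporal attention (ST_ATT): similarity over positions
        q = f_gate.reshape(B, C, H * W).transpose(1, 2)               # (B, H*W, C)
        kv = m_hist.permute(0, 2, 3, 4, 1).reshape(B, T * H * W, C)   # (B, T*H*W, C)
        attn = torch.softmax(q @ kv.transpose(1, 2), dim=-1)          # (B, H*W, T*H*W)
        st = (attn @ kv).transpose(1, 2).reshape(B, C, H, W)
        st_att = _ln(m_prev + st)

        # channel attention (C_ATT): similarity over channels
        qc = f_gate.reshape(B, C, H * W)                               # (B, C, H*W)
        kvc = m_hist.permute(0, 2, 1, 3, 4).reshape(B, T * C, H * W)   # (B, T*C, H*W)
        attn_c = torch.softmax(qc @ kvc.transpose(1, 2), dim=-1)       # (B, C, T*C)
        cc = (attn_c @ kvc).reshape(B, C, H, W)
        c_att = _ln(m_prev + cc)

        # fusion: conv3 -> layernorm -> ReLU -> conv1 on each branch, sum, final conv
        st_b = self.st_conv1(F.relu(_ln(self.st_conv3(st_att))))
        c_b = self.c_conv1(F.relu(_ln(self.c_conv3(c_att))))
        return self.out_conv(st_b + c_b)
```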
In some embodiments, decoding the hidden state H_t^4 output by the AF-LSTM module and fusing it with the outputs of the corresponding convolutional layers of the encoder comprises:

Dec_l = Dec_{l-1} + Enc^{-1}()

where Dec_{l-1} denotes the output of the previous convolutional layer of the decoder, Enc^{-1}() denotes the output of the corresponding convolutional layer of the encoder, and Dec_l denotes the final result obtained by adding the two.
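A minimal sketch of this information Recall step follows, assuming the encoder's per-layer feature maps have been stored; the mirrored layer indexing, the shape guard and all names are illustrative assumptions rather than the patent's wiring.

```python
from typing import List
import torch
import torch.nn as nn

def decode_with_recall(h_top: torch.Tensor,
                       dec_layers: nn.ModuleList,
                       enc_feats: List[torch.Tensor]) -> torch.Tensor:
    """Run the decoder layers and, after each one, add the output of the
    mirrored encoder layer (Dec_l = Dec_{l-1} + Enc^{-1}())."""
    x = h_top
    for i, layer in enumerate(dec_layers):
        x = layer(x)
        skip = enc_feats[-(i + 1)]        # encoder output of the corresponding layer (assumed mirror order)
        if skip.shape == x.shape:         # guard: only fuse when shapes match
            x = x + skip
    return x
```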
In a second aspect, the present invention provides a radar echo extrapolation nowcasting device, comprising a processor and a storage medium;
the storage medium is used for storing instructions;
the processor is configured to operate in accordance with the instructions to perform the steps of the method according to the first aspect.
In a third aspect, the present invention provides a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of the first aspect.
In a fourth aspect, the present invention provides a computer device comprising a processor and a storage medium;
the storage medium is used for storing instructions;
the processor is configured to operate in accordance with the instructions to perform the steps of the method according to the first aspect.
Beneficial effects: compared with the prior art, the radar echo extrapolation nowcasting method provided by the invention has the following advantages:
(1) An Attention Fusion mechanism is proposed, in which channel information and spatiotemporal information are fused with each other to obtain a better long-term spatiotemporal representation that replaces the forget-gate spatiotemporal memory update of the LSTM neural network, so that more spatiotemporal historical information is associated, the loss during information transmission is reduced, and a better spatiotemporal representation is formed;
(2) An information Recall module is added between the encoder and the decoder, and the decoder results are fused with the encoder inputs, so that the stacked multi-stage encoder information is recalled and the prediction details are better preserved;
(3) A long short-term memory network structure based on the attention fusion mechanism and information recall is designed: the depth features of the radar samples are extracted by the encoding structure; multiple layers of prediction units are then stacked to extract the spatiotemporal information of the data; and the spatiotemporal information output by the last layer of prediction units is decoded and output.
Drawings
FIG. 1 is a flow chart of a method of an embodiment of the present invention.
FIG. 2 is a schematic diagram of an attention fusion module according to an embodiment of the invention.
FIG. 3 is a schematic diagram of the structure of the AF-LSTM unit in the embodiment of the invention.
FIG. 4 is a schematic diagram of an information recall module in an embodiment of the invention;
fig. 5 is a schematic structural diagram of a stacked network element prediction network according to an embodiment of the present invention.
Detailed Description
The invention is further explained below with reference to the drawings and the embodiments. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby.
In the description of the present invention, "several" means one or more and "a plurality of" means two or more; "greater than", "less than", "exceeding", etc. are understood as excluding the stated number, while "above", "below", "within", etc. are understood as including the stated number. Where "first" and "second" are described, they are used only to distinguish technical features and are not to be understood as indicating or implying relative importance, implicitly indicating the number of technical features indicated, or implicitly indicating the precedence of the technical features indicated.
In the description of the present invention, reference to the description of the terms "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Example 1
As shown in FIG. 1, a radar echo extrapolation nowcasting method comprises:
S1, obtaining historical radar echo sequence samples;
S2, constructing and training an AFR-LSTM-based predictive neural network model, dividing the radar echo sequence samples into batches of size batch_size and inputting them into the predictive neural network model, and performing back-propagation to update the network weights after each batch is forward-propagated through the multilayer network, to obtain the trained predictive neural network model;
S3, inputting the radar echo sequence samples of a set time period into the trained predictive neural network model to obtain a radar echo extrapolation image sequence;
and S4, determining a nowcasting result according to the radar echo extrapolation image sequence.
In some embodiments, in step S1, obtaining historical radar echo sequence samples includes:
sequentially performing coordinate conversion, data interpolation and horizontal sampling preprocessing on the radar echo maps acquired by a Doppler radar to obtain grayscale images.
Further, the coordinate conversion includes: converting radar echo map data under three-dimensional polar coordinates into a three-dimensional Cartesian rectangular coordinate system;
the data interpolation includes: performing data interpolation by adopting an inverse distance weighting method to obtain regular grid data under a three-dimensional Cartesian rectangular coordinate system;
the horizontal sampling comprises: horizontally sampling the regular grid data in the three-dimensional Cartesian rectangular coordinate system, extracting the two-dimensional plane data at a given height, and mapping the data to 0-255 to obtain an echo-intensity CAPPI grayscale image; the data mapping formula linearly maps the intensity value Z of the data onto 0-255 and rounds down to give the grayscale pixel P, where ⌊·⌋ indicates rounding down.
In some embodiments, step S1 further comprises: converting the grayscale data into normalized gray data normalized_data through normalization; the resulting normalized grayscale data take values in [0,1].
In some embodiments, in step S2, the AFR-LSTM-based predictive neural network model sequentially comprises: an Encoder, an AF-LSTM module and a Decoder;
the Encoder comprises 5 convolutional layers and is used for extracting the depth feature X_t of the radar echo sequence sample I_t;
the AF-LSTM module comprises 4 layers of AF-LSTM network units stacked in order behind the Encoder network, and is used for extracting the temporal and spatial information of the depth feature X_t of the radar echo sequence sample; the output hidden state H_t^4 is input into the Decoder;
the AF-LSTM module takes the memory cell C_{t-1}^l and hidden state H_{t-1}^l output by the same layer at the previous time step, the hidden state H_t^{l-1} output by the previous layer at the current time step, the spatiotemporal memory cell M^{l-1} of the previous layer and the set M^{l-τ:l-1} of spatiotemporal memory cells of the previous τ layers as inputs to the l-th AF-LSTM network unit at time t, and after forward propagation obtains the hidden state H_t^l, memory cell C_t^l and spatiotemporal memory cell M^l output by the current network unit, where t = 1,2,...,10 and l = 1,2,3,4; H_0^l, C_0^l and M^0 are set by initialization;
the Decoder comprises 5 convolutional layers and is used for decoding the hidden state H_t^4 output by the AF-LSTM module and fusing it with the outputs of the corresponding convolutional layers of the Encoder to obtain the output radar echo extrapolation image sequence Î_{t+1}.
In some embodiments, as shown in FIG. 1, the radar echo extrapolation method based on a spatiotemporal prediction neural network structure with an attention fusion mechanism and information recall comprises the following steps:
step 1: and (4) preprocessing data. The method comprises the steps of removing invalid data with no rainfall or little rainfall from Doppler weather radar base data, obtaining CAPPI data through data interpolation, converting the CAPPI data into normalized gray data and obtaining a gray image data set, and finally dividing the data set into a training sample set and a testing sample set.
The step 1 comprises the following steps:
step 1-1: data interpolation: and converting the data under the three-dimensional polar coordinate into a three-dimensional Cartesian rectangular coordinate system, and performing data interpolation by adopting an inverse distance weighting method to obtain regular grid data under the three-dimensional Cartesian rectangular coordinate system. And then, carrying out horizontal sampling on the data, extracting two-dimensional plane data under a certain height, and mapping the data to 0-255 to obtain an echo intensity CAPPI gray image. And then converting the reflectivity data into normalized gray data through normalization.
The data mapping formula linearly maps the intensity value Z of the data onto 0-255 and rounds down to give the grayscale pixel P, where ⌊·⌋ indicates rounding down; the subsequent normalization yields normalized grayscale data with values in [0,1].
Step 1-2: data set partitioning: total_length is set to 20, i.e., every 20 consecutive frames are divided into one sequence, of which the first 10 frames form the input sequence and the last 10 frames form the comparison sequence. All sequences of each month in the data set are randomly divided into a training sequence sample subset and a test sequence sample subset according to the ratio of 3.
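A minimal sketch of this sequence construction and split, assuming the frames of one month are already an ordered array; the 3:1 train/test ratio used below is an assumption, since the second term of the ratio is truncated in the source text.

```python
import numpy as np

def make_sequences(frames: np.ndarray, total_length: int = 20, input_length: int = 10):
    """Split a month of frames (N, H, W) into non-overlapping sequences of
    total_length frames: the first input_length frames are the model input,
    the remaining frames are the comparison (ground-truth) sequence."""
    n_seq = len(frames) // total_length
    seqs = frames[: n_seq * total_length].reshape(n_seq, total_length, *frames.shape[1:])
    return seqs[:, :input_length], seqs[:, input_length:]

def train_test_split(seq_indices: np.ndarray, train_ratio: float = 0.75, seed: int = 0):
    """Random split of sequence indices; a 3:1 ratio (0.75) is assumed here."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(seq_indices)
    cut = int(len(perm) * train_ratio)
    return perm[:cut], perm[cut:]
```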
Step 2: and constructing and training an AFR-LSTM network. Inputting the divided training sequence sample set train _ data into a convolution space-time prediction neural network, and training through a multilayer network.
The step 2 comprises the following steps:
Step 2-1: initialize the training parameters. The height, width and number of channels of the input image, the convolution kernel size filter_size, the convolution step size stride, the number num_layers of stacked prediction-unit layers, the number num_hidden of convolution kernels, the number batch_size of samples per input in the training stage, the maximum number of training rounds max_epoch, the learning rate λ, the input sequence length input_length, the extrapolated sequence length output_length, and so on, are set.
In this embodiment, the image height = 480, width = 480 and number of channels channel = 1; the AF-LSTM module (as shown in FIG. 3) has num_layers = 4 stacked layers, convolution kernel size filter_size = 5, step size stride = 1 and number of convolution kernels num_hidden = 64; the learning rate λ = 0.001, input sequence length input_length = 10, extrapolation sequence length output_length = 10, number of samples per input in the training phase batch_size = 4, and maximum number of iterations max_iterations = 80000.
Step 2-2: construct the neural network. First, the Encoder is constructed, comprising 5 convolutional layers: the 1st convolutional layer has input channel 1, output channel 64, convolution kernel 1 and step size 1; the 2nd convolutional layer has input channel 64, output channel 64, convolution kernel 3, step size 2 and padding 1; the 3rd convolutional layer has input channel 64, output channel 64, convolution kernel 3, step size 2 and padding 1; the 4th convolutional layer has input channel 64, output channel 64, convolution kernel 3, step size 2 and padding 1; the 5th convolutional layer has input channel 64, output channel 64, convolution kernel 3, step size 2 and padding 1. Each convolutional layer is followed by a nonlinear activation. Next, 4 layers of AF-LSTM units are constructed according to the number of stacked layers, convolution kernel size, step size and number of convolution kernels of the AF-LSTM module set in step 2-1, and stacked in order behind the Encoder network. The Decoder is then constructed, comprising 5 convolutional layers: the 1st convolutional layer has input channel 64, output channel 64, convolution kernel 3, step size 2 and padding 1; the 2nd convolutional layer has input channel 64, output channel 64, convolution kernel 3, step size 2 and padding 1; the 3rd convolutional layer has input channel 64, output channel 64, convolution kernel 3, step size 2 and padding 1; the 4th convolutional layer has input channel 64, output channel 64, convolution kernel 3, step size 2 and padding 1; the 5th convolutional layer has input channel 64, output channel 1, convolution kernel 1 and step size 1.
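A minimal PyTorch sketch of this encoder/decoder construction follows. It is an illustration, not the patent's code: the decoder's stride-2 layers are written here as transposed convolutions so that the 30x30 features are upsampled back to 480x480 (an assumption, since the text only calls them convolutional layers), and LeakyReLU is an assumed choice for the unspecified nonlinear activation.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """5 conv layers, 1->64 channels, 480x480 -> 30x30 (four stride-2 layers)."""
    def __init__(self):
        super().__init__()
        layers = [nn.Conv2d(1, 64, kernel_size=1, stride=1)]
        for _ in range(4):
            layers += [nn.LeakyReLU(0.2),
                       nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):             # x: (B, 1, 480, 480)
        return self.net(x)            # -> (B, 64, 30, 30)

class Decoder(nn.Module):
    """5 layers mirroring the encoder; stride-2 layers assumed to be transposed convs."""
    def __init__(self):
        super().__init__()
        ups = []
        for _ in range(4):
            ups += [nn.ConvTranspose2d(64, 64, kernel_size=3, stride=2,
                                       padding=1, output_padding=1),
                    nn.LeakyReLU(0.2)]
        self.up = nn.Sequential(*ups)
        self.out = nn.Conv2d(64, 1, kernel_size=1, stride=1)

    def forward(self, h):             # h: (B, 64, 30, 30)
        return self.out(self.up(h))   # -> (B, 1, 480, 480)
```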
In this embodiment, the hidden state H_0^l, memory cell C_0^l and spatiotemporal memory cell M^0 are initialized to all-zero tensors of size (4, 64, 30, 30); the set M^{l-τ:l-1} of spatiotemporal memory cells of the first τ time steps is likewise initialized to an all-zero tensor of size (τ, 4, 64, 30, 30), and the output of each layer is updated as each time step elapses. In this example τ = 3.
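The state initialization can be sketched as below, with the sizes taken from this embodiment (batch_size 4, num_hidden 64, 30x30 features, τ = 3, 4 layers); the list/tensor layout is an illustrative choice, not the patent's data structure.

```python
import torch

def init_states(num_layers=4, batch=4, hidden=64, height=30, width=30, tau=3, device="cpu"):
    zeros = lambda *shape: torch.zeros(*shape, device=device)
    h = [zeros(batch, hidden, height, width) for _ in range(num_layers)]  # hidden states H_0^l
    c = [zeros(batch, hidden, height, width) for _ in range(num_layers)]  # memory cells C_0^l
    m = zeros(batch, hidden, height, width)                               # spatiotemporal memory M^0
    m_hist = zeros(batch, hidden, tau, height, width)                     # buffer for M^{l-tau:l-1}
    return h, c, m, m_hist
```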
Step 2-3: read training samples. At each training iteration, batch_size = 4 sequence samples are taken from the training sample set as the input I_t of the network.
Step 2-4: at a given time step t (t = 1,2,...,10), the input I_t has size (4, 1, 480, 480); I_t is input into the Encoder to extract the depth features of the sample, and after the 5-layer convolution of the Encoder the output is X_t of size (4, 64, 30, 30). The formula is:

X_t = Enc(I_t)

where Enc() denotes the encoder used to extract the depth features from the input.
Step 2-5: the memory cell C_{t-1}^l and hidden state H_{t-1}^l output by the same layer at the previous time step, the hidden state H_t^{l-1} output by the previous layer at the current time step, the spatiotemporal memory cell M^{l-1} of the previous layer and the set M^{l-τ:l-1} of spatiotemporal memory cells of the previous τ layers are input into the l-th AF-LSTM network unit at time t; after forward propagation, the unit outputs the hidden state H_t^l, memory cell C_t^l and spatiotemporal memory cell M^l, where t = 1,2,...,10 and l = 1,2,3,4; H_0^l, C_0^l and M^0 are set by initialization. The structure of the AF-LSTM network unit is shown in FIG. 3 and comprises the following steps:
Step 2-5-1: the spatiotemporal memory cell M^{l-1} of the previous layer, the forget gate f_t' and the set M^{l-τ:l-1} of continuous historical spatiotemporal memory cells of the previous layers are taken as input, and the output is computed through the fused attention mechanism to obtain the multi-time-step spatiotemporal memory term AttFusion. As shown in FIG. 2, this comprises the following steps:
Step 2-5-1-1: the forget gate f_t' ∈ R^{B×C×H×W} is regarded as the query matrix Q_l, where B, C, H and W respectively denote the feature-map batch size, number of channels, image height and image width. It is first directly reshaped into Q_l ∈ R^{B×(H*W)×C}. The corresponding set M^{l-τ:l-1} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_l and the value matrix V_l, where τ is the length of the time series; likewise, they are respectively reshaped into K_l ∈ R^{B×(τ*H*W)×C} and V_l ∈ R^{B×(τ*H*W)×C}. The output ST_ATT of the spatiotemporal attention module is then obtained as:

ST_ATT = AttST(M^{l-1}, f_t', M^{l-τ:l-1}) = layernorm(M^{l-1} + softmax(Q_l · K_l^T) · V_l)

Q_l = f_t';  K_l = V_l = M^{l-τ:l-1}

As shown in the blue part of FIG. 2, softmax(Q_l · K_l^T) denotes applying a softmax layer to the product of the query matrix Q_l and the transposed key matrix K_l, representing the positional similarity between Q_l and K_l, i.e., the degree of correlation between the forget gate f_t' and the set M^{l-τ:l-1} of continuous historical spatiotemporal feature maps. The matrix product with the value matrix V_l is then computed as the weight of the updated information, selectively aggregating the spatiotemporal information of M^{l-τ:l-1}, and the matrix is reshaped back to its original shape. Finally, the result is summed with the spatiotemporal memory cell M^{l-1} of the previous layer and applied to a layernorm layer to obtain the final output ST_ATT of the spatiotemporal attention module.
Step 2-5-1-2: the forget gate f_t' ∈ R^{B×C×H×W} is likewise used as the query matrix Q_c; unlike the spatiotemporal attention module, it is reshaped into Q_c ∈ R^{B×C×(H*W)}. The corresponding set M^{l-τ:l-1} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_c and the value matrix V_c, which are reshaped into K_c ∈ R^{B×(τ*C)×(H*W)} and V_c ∈ R^{B×(τ*C)×(H*W)}. The output C_ATT of the channel attention module is then obtained as:

C_ATT = AttC(M^{l-1}, f_t', M^{l-τ:l-1}) = layernorm(M^{l-1} + softmax(Q_c · K_c^T) · V_c)

Q_c = f_t';  K_c = V_c = M^{l-τ:l-1}

As shown in the orange part of FIG. 2, softmax(Q_c · K_c^T) represents the degree to which the query matrix Q_c and key matrix K_c influence the channels. The matrix product with the value matrix V_c is then taken as the weight of the updated information, selectively aggregating the channel information of M^{l-τ:l-1}, and the matrix is reshaped back to its original shape. Finally, the result is summed with the spatiotemporal memory cell M^{l-1} of the previous layer and applied to a layernorm layer to obtain the output C_ATT of the final channel attention module.
Step 2-5-1-3: the output ST_ATT of the spatiotemporal attention module of step 2-5-1-1 and the output C_ATT of the channel attention module of step 2-5-1-2 are fused, as shown in the green part of FIG. 2. Specifically, ST_ATT and C_ATT each pass through a convolutional layer with kernel size 3, a layernorm normalization layer, a ReLU activation layer and a convolutional layer with kernel size 1; the two results are then summed element-wise, and finally a convolutional layer generates the final fused attention result AttFusion. The specific calculation formula is:

AttFusion = Sum(ST_ATT, C_ATT)
          = conv(conv(layernorm(ReLU(conv(ST_ATT))))
            + conv(layernorm(ReLU(conv(C_ATT)))))
Step 2-5-2: the hidden state H_t^{l-1} output by the previous layer at the current time step, and the hidden state H_{t-1}^l and memory cell C_{t-1}^l output by the same layer at the previous time step, are taken as input, and the current memory cell C_t^l is updated through the input modulation gate g_t, input gate i_t and forget gate f_t. The formulas are:

g_t = tanh(W_xg * H_t^{l-1} + W_hg * H_{t-1}^l + b_g)
i_t = σ(W_xi * H_t^{l-1} + W_hi * H_{t-1}^l + b_i)
f_t = σ(W_xf * H_t^{l-1} + W_hf * H_{t-1}^l + b_f)
C_t^l = f_t ⊙ C_{t-1}^l + i_t ⊙ g_t

where "*" denotes the convolution operation, "⊙" denotes the element-wise (dot) product of matrices, tanh denotes the hyperbolic tangent activation function tanh(x) = (e^x - e^{-x}) / (e^x + e^{-x}), and σ denotes the Sigmoid activation function σ(x) = 1 / (1 + e^{-x}); the convolution kernels W_xg, W_hg, W_xi, W_hi, W_xf, W_hf all have size filter_size × filter_size and number num_hidden × num_hidden; b_g, b_i, b_f denote biases.
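A minimal PyTorch sketch of this gate computation follows. The single fused convolution over the concatenated inputs and the exact gate equations are written in the standard ConvLSTM/ST-LSTM form, an assumption consistent with the gates and weights named above rather than a copy of the patent's implementation.

```python
import torch
import torch.nn as nn

class TemporalMemoryUpdate(nn.Module):
    """g_t, i_t, f_t from (X_t = H_t^{l-1}, H_{t-1}^l); then C_t^l = f_t*C_{t-1}^l + i_t*g_t."""
    def __init__(self, hidden=64, filter_size=5):
        super().__init__()
        pad = filter_size // 2
        # one convolution producing the three gates at once (equivalent to separate W_x*, W_h*)
        self.gates = nn.Conv2d(2 * hidden, 3 * hidden, filter_size, padding=pad)

    def forward(self, x_t, h_prev, c_prev):
        g, i, f = torch.chunk(self.gates(torch.cat([x_t, h_prev], dim=1)), 3, dim=1)
        g, i, f = torch.tanh(g), torch.sigmoid(i), torch.sigmoid(f)
        return f * c_prev + i * g      # updated memory cell C_t^l
```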
Step 2-5-3: the hidden state H_t^{l-1} output by the previous layer at the current time step, the spatiotemporal memory cell M^{l-1} of the previous layer and the set M^{l-τ:l-1} of continuous historical spatiotemporal memory cells are taken as input, and the current spatiotemporal memory cell M^l is updated through the fused attention result AttFusion of step 2-5-1, the input modulation gate g_t', the input gate i_t' and the forget gate f_t'. The formulas are:

g_t' = tanh(W_xg' * H_t^{l-1} + W_hg' * M^{l-1} + b_g')
i_t' = σ(W_xi' * H_t^{l-1} + W_hi' * M^{l-1} + b_i')
f_t' = σ(W_xf' * H_t^{l-1} + W_hf' * M^{l-1} + b_f')
M^l = f_t' ⊙ AttFusion + i_t' ⊙ g_t'

where "*" denotes the convolution operation, "⊙" denotes the element-wise (dot) product of matrices, and tanh and σ are the hyperbolic tangent and Sigmoid activation functions as defined above; the convolution kernels W_xi', W_hi', W_xg', W_hg', W_xf', W_hf' all have size filter_size × filter_size and number num_hidden × num_hidden; b_i', b_g', b_f' denote biases.
Step 2-5-4: the hidden state H_t^{l-1} output by the previous layer at the current time step, the hidden state H_{t-1}^l output by the same layer at the previous time step, the memory cell C_t^l updated in step 2-5-2 and the spatiotemporal memory cell M^l updated in step 2-5-3 are taken as input to the output gate O_t, which updates the hidden state H_t^l. The formulas are:

O_t = σ(W_xo * H_t^{l-1} + W_ho * H_{t-1}^l + W_co * C_t^l + W_mo * M^l + b_o)
H_t^l = O_t ⊙ tanh(W_1*1 * [C_t^l, M^l])

where "*" denotes the convolution operation, "⊙" denotes the element-wise (dot) product of matrices, and [·,·] denotes concatenating the two matrices by columns while keeping the rows unchanged; tanh denotes the hyperbolic tangent activation function; the convolution kernel W_1*1 has size 1 × 1 and number num_hidden × num_hidden; W_xo, W_ho, W_co, W_mo have size 5 × 5 and number num_hidden × num_hidden; b_o denotes the bias.
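The output-gate step can be sketched as follows; as above, the fused convolution and the placement of σ and tanh follow the standard ST-LSTM output gate, an assumption consistent with the weights named in this step.

```python
import torch
import torch.nn as nn

class OutputGate(nn.Module):
    """O_t from (X_t, H_{t-1}^l, C_t^l, M^l); H_t^l = O_t * tanh(conv1x1([C_t^l, M^l]))."""
    def __init__(self, hidden=64, filter_size=5):
        super().__init__()
        pad = filter_size // 2
        self.conv_o = nn.Conv2d(4 * hidden, hidden, filter_size, padding=pad)
        self.conv_1x1 = nn.Conv2d(2 * hidden, hidden, kernel_size=1)

    def forward(self, x_t, h_prev, c_new, m_new):
        o = torch.sigmoid(self.conv_o(torch.cat([x_t, h_prev, c_new, m_new], dim=1)))
        return o * torch.tanh(self.conv_1x1(torch.cat([c_new, m_new], dim=1)))
```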
Step 2-6: as shown in FIG. 5, after step 2-5 has been repeated four times (once per layer), the hidden state H_t^4 of the top layer is input into the Decoder and then fused with the output of each convolutional layer of the encoder, as shown in FIG. 4. The formula is:

Dec_l = Dec_{l-1} + Enc^{-1}()

where Enc^{-1}() denotes the output of the encoder layer corresponding to the current decoder layer, Dec_{l-1} denotes the decoder output passed through the stacked network, and Dec_l denotes the final result obtained by adding the two.
Step 2-7: the decoding result Dec_l output in step 2-6 is the prediction result image Î_{t+1} output by the network, of size (4, 1, 480, 480), finally completing the radar echo extrapolation from the input I_t to Î_{t+1}.
Step 2-8: when t ≥ 10, the output Î_{t+1} of step 2-7 is used as the input, and steps 2-4 to 2-7 are repeated until t = 19, yielding in turn the image sequence Î_11, Î_12, ..., Î_20 of the predicted future time steps and completing the extrapolation of the radar echo sequence.
Step 2-9: compute the loss function value. For the prediction sequence {Î_11, Î_12, ..., Î_20} obtained by the forward propagation of steps 2-4 to 2-8 and the extrapolation reference sequence ground_truths = {I_11, I_12, ..., I_20}, the mean square error is taken as the loss function. The network parameter gradients are computed from the value of the loss function, the network parameters are updated, and back-propagation is completed.
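A minimal sketch of one training iteration under these definitions; `model` is assumed to wrap the encoder, the stacked AF-LSTM units and the decoder, and to run the closed-loop extrapolation of steps 2-4 to 2-8 internally (its interface here is illustrative, not the patent's). The optimizer type is also an assumption; only the learning rate λ = 0.001 comes from this embodiment.

```python
import torch
import torch.nn as nn

def train_step(model, optimizer, batch):
    """batch: (B, 20, 1, 480, 480); frames 0-9 are inputs, frames 10-19 the reference."""
    inputs, targets = batch[:, :10], batch[:, 10:]
    optimizer.zero_grad()
    preds = model(inputs, out_len=10)               # predicted frames, hat{I}_11..hat{I}_20
    loss = nn.functional.mse_loss(preds, targets)   # mean square error loss
    loss.backward()                                 # back-propagation
    optimizer.step()                                # update network weights
    return loss.item()

# usage sketch: optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```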
In a specific embodiment, the above steps 2-4 to 2-9 may be embodied as steps (1) to (16):
Step (1): the sample I_1 (t = 1) is input into the Encoder, which comprises a 5-layer convolution structure; the depth feature X_1 of the sample is preliminarily extracted through this convolution structure.
Step (2): the initialized all-zero memory cell C_0^1 and hidden state H_0^1, the hidden state H_1^0 output by the previous layer at the current time step (equal to X_1), the spatiotemporal memory cell M^0 of the previous layer (initialized to an all-zero tensor) and the set M^{0:0} of spatiotemporal memory cells of the previous τ layers are input into the layer-1 AF-LSTM network unit at time 1. After forward propagation, the hidden state H_1^1, memory cell C_1^1 and spatiotemporal memory cell M^1 output by the current network unit are obtained; at the same time, the set M^{0:1} of continuous historical spatiotemporal memory cells of the previous τ time steps is updated as the input to the attention module of the next unit. The AF-LSTM network unit, as shown in FIG. 3, comprises the following steps:
Step (2-1): the spatiotemporal memory cell M^0 of the previous layer (initialized to an all-zero tensor), the forget gate f_1^{1'} and the set M^{0:0} of continuous historical spatiotemporal memory cells of the previous layers are taken as input, and the output is computed through the fused attention mechanism to obtain the multi-time-step spatiotemporal memory term AttFusion. As shown in FIG. 2, this comprises the following steps:
Step (2-1-1): the forget gate f_1^{1'} ∈ R^{B×C×H×W} is regarded as the query matrix Q_l, where B, C, H and W respectively denote the feature-map batch size, number of channels, image height and image width. It is first directly reshaped into Q_l ∈ R^{B×(H*W)×C}. The corresponding set M^{0:0} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_l and the value matrix V_l; likewise, they are respectively reshaped into K_l ∈ R^{B×(τ*H*W)×C} and V_l ∈ R^{B×(τ*H*W)×C}. The output ST_ATT of the spatiotemporal attention module is then obtained as:

ST_ATT = AttST(M^0, f_1^{1'}, M^{0:0}) = layernorm(M^0 + softmax(Q_l · K_l^T) · V_l)

Q_l = f_1^{1'};  K_l = V_l = M^{0:0}

As shown in the blue part of FIG. 2, softmax(Q_l · K_l^T) denotes applying a softmax layer to the product of the query matrix Q_l and the transposed key matrix K_l, representing the positional similarity between Q_l and K_l, i.e., the degree of correlation between the forget gate f_1^{1'} and the set M^{0:0} of continuous historical spatiotemporal feature maps. The matrix product with the value matrix V_l is then computed as the weight of the updated information, selectively aggregating the spatiotemporal information of M^{0:0}, and the matrix is reshaped back to its original shape. Finally, the result is summed with the spatiotemporal memory cell M^0 of the previous layer and applied to a layernorm layer to obtain the final output ST_ATT of the spatiotemporal attention module.
Step (2-1-2): the forget gate f_1^{1'} ∈ R^{B×C×H×W} is likewise used as the query matrix Q_c; unlike the spatiotemporal attention module, it is reshaped into Q_c ∈ R^{B×C×(H*W)}. The corresponding set M^{0:0} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_c and the value matrix V_c, which are reshaped into K_c ∈ R^{B×(τ*C)×(H*W)} and V_c ∈ R^{B×(τ*C)×(H*W)}. The output C_ATT of the channel attention module is then obtained as:

C_ATT = AttC(M^0, f_1^{1'}, M^{0:0}) = layernorm(M^0 + softmax(Q_c · K_c^T) · V_c)

Q_c = f_1^{1'};  K_c = V_c = M^{0:0}

As shown in the orange part of FIG. 2, softmax(Q_c · K_c^T) represents the degree to which the query matrix Q_c and key matrix K_c influence the channels. The matrix product with the value matrix V_c is then taken as the weight of the updated information, selectively aggregating the channel information of M^{0:0}, and the matrix is reshaped back to its original shape. Finally, the result is summed with the spatiotemporal memory cell M^0 of the previous layer and applied to a layernorm layer to obtain the output C_ATT of the final channel attention module.
Step (2-1-3): the output ST_ATT of the spatiotemporal attention module of step (2-1-1) and the output C_ATT of the channel attention module of step (2-1-2) are fused, as shown in the green part of FIG. 2. Specifically, ST_ATT and C_ATT each pass through a convolutional layer with kernel size 3, a layernorm normalization layer, a ReLU activation layer and a convolutional layer with kernel size 1; the two results are then summed element-wise, and finally a convolutional layer generates the final fused attention result AttFusion. The specific calculation formula is:

AttFusion = Sum(ST_ATT, C_ATT)
          = conv(conv(layernorm(ReLU(conv(ST_ATT))))
            + conv(layernorm(ReLU(conv(C_ATT)))))
Step (2-2): the hidden state X_1 input at the current time step, and the hidden state H_0^1 and memory cell C_0^1 output by the same layer at the previous time step, are taken as input, and the current memory cell C_1^1 is updated through the input modulation gate g_1^1, input gate i_1^1 and forget gate f_1^1, using the formulas of step 2-5-2 with t = 1 and l = 1.
Step (2-3): the hidden state X_1 output by the previous layer at the current time step, the spatiotemporal memory cell M^0 of the previous layer and the set M^{0:0} of continuous historical spatiotemporal memory cells are taken as input, and the current spatiotemporal memory cell M^1 is updated through the fused attention result AttFusion of step (2-1), the input modulation gate g_1^{1'}, the input gate i_1^{1'} and the forget gate f_1^{1'}, using the formulas of step 2-5-3 with t = 1 and l = 1.
Step (2-4): the hidden state X_1 output by the previous layer at the current time step, the hidden state H_0^1 output by the same layer at the previous time step, and the memory cell C_1^1 and spatiotemporal memory cell M^1 updated in steps (2-2) and (2-3), are taken as input to the output gate O_1^1, which updates the hidden state H_1^1, using the formulas of step 2-5-4 with t = 1 and l = 1.
Step (3): the hidden state H_1^1, memory cell C_1^1 and spatiotemporal memory cell M^1 output in step (2), the hidden state H_0^2 output by the same layer at the previous time step, and the set M^{0:1} of spatiotemporal memory cells of the previous τ layers are input into the layer-2 AF-LSTM network unit at time 1; after forward propagation, the hidden state H_1^2, memory cell C_1^2 and spatiotemporal memory cell M^2 output by this layer are obtained, and at the same time the set M^{0:2} of continuous historical spatiotemporal memory cells of the previous τ time steps is updated as the input to the attention module of the next unit. The AF-LSTM network unit, as shown in FIG. 3, comprises the following steps:
Step (3-1): the spatiotemporal memory cell M^1 of the previous layer, the forget gate f_1^{2'} and the set M^{0:1} of continuous historical spatiotemporal memory cells of the previous layers are taken as input, and the output is computed through the fused attention mechanism to obtain the multi-time-step spatiotemporal memory term AttFusion. As shown in FIG. 2, this comprises the following steps:
Step (3-1-1): the forget gate f_1^{2'} ∈ R^{B×C×H×W} is regarded as the query matrix Q_l, where B, C, H and W respectively denote the feature-map batch size, number of channels, image height and image width. It is first directly reshaped into Q_l ∈ R^{B×(H*W)×C}. The corresponding set M^{0:1} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_l and the value matrix V_l, where τ is the length of the time series; likewise, they are respectively reshaped into K_l ∈ R^{B×(τ*H*W)×C} and V_l ∈ R^{B×(τ*H*W)×C}. The output ST_ATT of the spatiotemporal attention module is then obtained as:

ST_ATT = AttST(M^1, f_1^{2'}, M^{0:1}) = layernorm(M^1 + softmax(Q_l · K_l^T) · V_l)

Q_l = f_1^{2'};  K_l = V_l = M^{0:1}

As shown in the blue part of FIG. 2, softmax(Q_l · K_l^T) denotes applying a softmax layer to the product of the query matrix Q_l and the transposed key matrix K_l, representing the positional similarity between Q_l and K_l, i.e., the degree of correlation between the forget gate f_1^{2'} and the set M^{0:1} of continuous historical spatiotemporal feature maps. The matrix product with the value matrix V_l is then computed as the weight of the updated information, selectively aggregating the spatiotemporal information of M^{0:1}, and the matrix is reshaped back to its original shape. Finally, the result is summed with the spatiotemporal memory cell M^1 of the previous layer and applied to a layernorm layer to obtain the final output ST_ATT of the spatiotemporal attention module.
Step (3-1-2): the forget gate f_1^{2'} ∈ R^{B×C×H×W} is likewise used as the query matrix Q_c; unlike the spatiotemporal attention module, it is reshaped into Q_c ∈ R^{B×C×(H*W)}. The corresponding set M^{0:1} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_c and the value matrix V_c, which are reshaped into K_c ∈ R^{B×(τ*C)×(H*W)} and V_c ∈ R^{B×(τ*C)×(H*W)}. The output C_ATT of the channel attention module is then obtained as:

C_ATT = AttC(M^1, f_1^{2'}, M^{0:1}) = layernorm(M^1 + softmax(Q_c · K_c^T) · V_c)

Q_c = f_1^{2'};  K_c = V_c = M^{0:1}

As shown in the orange part of FIG. 2, softmax(Q_c · K_c^T) represents the degree to which the query matrix Q_c and key matrix K_c influence the channels. The matrix product with the value matrix V_c is then taken as the weight of the updated information, selectively aggregating the channel information of M^{0:1}, and the matrix is reshaped back to its original shape. Finally, the result is summed with the spatiotemporal memory cell M^1 of the previous layer and applied to a layernorm layer to obtain the output C_ATT of the final channel attention module.
Step (3-1-3): the output ST_ATT of the spatiotemporal attention module of step (3-1-1) and the output C_ATT of the channel attention module of step (3-1-2) are fused, as shown in the green part of FIG. 2. Specifically, ST_ATT and C_ATT each pass through a convolutional layer with kernel size 3, a layernorm normalization layer, a ReLU activation layer and a convolutional layer with kernel size 1; the two results are then summed element-wise, and finally a convolutional layer generates the final fused attention result AttFusion. The specific calculation formula is:

AttFusion = Sum(ST_ATT, C_ATT)
          = conv(conv(layernorm(ReLU(conv(ST_ATT))))
            + conv(layernorm(ReLU(conv(C_ATT)))))
Step (3-2): the hidden state H_1^1 output by the previous layer at the current time step, and the hidden state H_0^2 and memory cell C_0^2 output by the same layer at the previous time step, are taken as input, and the current memory cell C_1^2 is updated through the input modulation gate g_1^2, input gate i_1^2 and forget gate f_1^2, using the formulas of step 2-5-2 with t = 1 and l = 2.
Step (3-3): the hidden state H_1^1 output by the previous layer at the current time step, the spatiotemporal memory cell M^1 of the previous layer and the set M^{0:1} of continuous historical spatiotemporal memory cells are taken as input, and the current spatiotemporal memory cell M^2 is updated through the fused attention result AttFusion of step (3-1), the input modulation gate g_1^{2'}, the input gate i_1^{2'} and the forget gate f_1^{2'}, using the formulas of step 2-5-3 with t = 1 and l = 2.
Step (3-4): the hidden state H_1^1 output by the previous layer at the current time step, the hidden state H_0^2 output by the same layer at the previous time step, and the memory cell C_1^2 and spatiotemporal memory cell M^2 updated in steps (3-2) and (3-3), are taken as input to the output gate O_1^2, which updates the hidden state H_1^2, using the formulas of step 2-5-4 with t = 1 and l = 2.
Step (4): the hidden state H_1^2 and spatiotemporal memory cell M^2 output in step (3) are input into the layer-3 spatiotemporal convolutional long short-term memory network of the network; after forward propagation, the hidden state H_1^3, memory cell C_1^3 and spatiotemporal memory cell M^3 output by this layer are obtained, and at the same time the set M^{0:2} of continuous historical spatiotemporal memory cells of the previous τ time steps is updated. The specific steps are the same as step (3).
Step (5): the hidden state H_1^3 and spatiotemporal memory M_1^3 output in step (4) are input into the layer-4 spatiotemporal convolutional long short-term memory network; after forward propagation, the hidden state H_1^4, memory cell C_1^4 and spatiotemporal memory M_1^4 output by this layer are obtained, and the set M^{1:3} of continuous historical spatiotemporal memory units of the previous τ time steps is updated at the same time. The concrete steps are the same as step (3).
Step (6): the hidden state H_1^4 output in step (5) is input into the Decoder for decoding, and the decoded output is correspondingly fused with the output of each convolution layer of the Encoder. The formula is as follows:

Dec_l = Dec_{l-1} + Enc_{-1}(·)

where Dec_{l-1} is the output of the previous convolution layer of the decoder and Enc_{-1}(·) is the output of the corresponding convolution layer of the encoder.
Step (7): the decoder result Dec_l output in step (6) is the prediction image Î_{t+1} output by the network, finally completing the radar echo extrapolation from the input I_t to Î_{t+1}. The formula is as follows:

Î_{t+1} = Dec_l
Step (8): the samples I_t (t = 2,3,…,10) are input into the Encoder, which comprises a 5-layer convolution structure; the depth feature X_t of each sample is preliminarily extracted by the Encoder.
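A minimal sketch of such an Encoder is given below; the patent only specifies that it contains 5 convolution layers, so the channel width, kernel size and activation used here are assumptions. The per-layer outputs are kept because the Decoder later fuses with them.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Hypothetical 5-layer convolutional encoder producing the depth feature X_t."""
    def __init__(self, in_channels=1, hidden=64):
        super().__init__()
        chans = [in_channels] + [hidden] * 5
        self.layers = nn.ModuleList([
            nn.Sequential(nn.Conv2d(chans[k], chans[k + 1], 3, padding=1), nn.ReLU())
            for k in range(5)
        ])

    def forward(self, x):
        feats = []
        for layer in self.layers:
            x = layer(x)
            feats.append(x)   # kept for the decoder's skip fusion
        return x, feats       # x is the depth feature X_t
```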
Step (9): the memory unit C_{t-1}^1 and hidden state H_{t-1}^1 output by layer 1 at the previous time, the hidden state H_t^0 output by the previous layer at the current time (equal to X_t), the spatiotemporal memory unit M^{l-1} of the previous layer (the spatiotemporal memory unit of layer 4 at the previous time) and the set M^{l-τ:l-1} of spatiotemporal memory units of the previous τ layers are input into the layer-1 spatiotemporal long short-term memory network unit at time t. After forward propagation, the hidden state H_t^1, cell state C_t^1 and spatiotemporal memory unit M_t^1 are output, and the set M^{l-τ:l-1} of continuous historical spatiotemporal memory units of the previous τ time steps is updated at the same time as the input to the attention module of the next unit. The AF-LSTM network unit, as shown in FIG. 3, comprises the following steps:
Step (9-1): the spatiotemporal memory unit M^{l-1} of the previous layer, the forget gate f_t^{1'} and the set M^{l-τ:l-1} of continuous historical spatiotemporal memory units of the previous layers are taken as inputs, and the output is computed by the fused attention mechanism, yielding a spatiotemporal memory unit AttFusion that aggregates multiple time steps. As shown in FIG. 2, this comprises the following steps:
Step (9-1-1): the forget gate f_t^{1'} ∈ R^{B×C×H×W} is regarded as the query matrix Q_l, where B, C, H and W denote the feature-image batch size, the number of channels, the image height and the image width, respectively. It is first reshaped directly into Q_l ∈ R^{B×(H*W)×C}. The corresponding set M^{l-τ:l-1} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_l and the value matrix V_l, where τ is the length of the time series; they are likewise reshaped into K_l ∈ R^{B×(τ*H*W)×C} and V_l ∈ R^{B×(τ*H*W)×C}. The output ST_ATT of the spatiotemporal attention module is then obtained, and the specific formula is as follows:

ST_ATT = LayerNorm(M^{l-1} + reshape(softmax(Q_l·K_l^T)·V_l))
Q_l = f_t^{1'}; K_l = V_l = M^{l-τ:l-1}
As shown in the blue part of FIG. 2, softmax(Q_l·K_l^T) denotes the transposed matrix multiplication of the query matrix Q_l and the key matrix K_l followed by a softmax layer; it represents the positional similarity between Q_l and K_l, i.e. the degree of correlation between the forget gate f_t^{1'} and the set M^{l-τ:l-1} of continuous historical spatiotemporal feature maps. The matrix product with the value matrix V_l is then computed as the weight of the updated information, selectively aggregating the spatiotemporal information of M^{l-τ:l-1}, and the matrix is reshaped back to its original shape. Finally, the result is summed with the spatiotemporal memory unit M^{l-1} of the previous layer and passed through a LayerNorm layer to obtain the final output ST_ATT of the spatiotemporal attention module.
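The reshape-and-matmul pattern of the spatiotemporal attention module can be written compactly as follows (PyTorch-style sketch; the function name and argument order are illustrative):

```python
import torch
import torch.nn.functional as F

def spatiotemporal_attention(f_gate, m_hist, m_prev):
    """Sketch of ST_ATT.
    f_gate: (B, C, H, W) forget gate used as the query.
    m_hist: (B, C, tau, H, W) stacked historical spatiotemporal memories (keys/values).
    m_prev: (B, C, H, W) spatiotemporal memory of the previous layer."""
    B, C, H, W = f_gate.shape
    tau = m_hist.shape[2]
    q = f_gate.reshape(B, C, H * W).permute(0, 2, 1)               # (B, H*W, C)
    kv = m_hist.permute(0, 2, 3, 4, 1).reshape(B, tau * H * W, C)  # (B, tau*H*W, C)
    attn = F.softmax(torch.matmul(q, kv.transpose(1, 2)), dim=-1)  # positional similarity
    out = torch.matmul(attn, kv)                                   # (B, H*W, C)
    out = out.permute(0, 2, 1).reshape(B, C, H, W)                 # reshape back
    return F.layer_norm(m_prev + out, m_prev.shape[1:])            # sum + LayerNorm
```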
Step (9-1-2): the forget gate f_t^{1'} ∈ R^{B×C×H×W} is again used as the query matrix Q_c; unlike in the spatiotemporal attention module, it is reshaped into Q_c ∈ R^{B×C×(H*W)}. The corresponding set M^{l-τ:l-1} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_c and the value matrix V_c, which are reshaped into K_c ∈ R^{B×(τ*C)×(H*W)} and V_c ∈ R^{B×(τ*C)×(H*W)}. The output C_ATT of the channel attention module is then obtained, and the specific formula is as follows:

C_ATT = LayerNorm(M^{l-1} + reshape(softmax(Q_c·K_c^T)·V_c))
Q_c = f_t^{1'}; K_c = V_c = M^{l-τ:l-1}
As shown in the orange part of FIG. 2, softmax(Q_c·K_c^T) ∈ R^{B×C×(τ*C)} represents the degree to which the query matrix Q_c influences the key matrix K_c along the channel dimension. The matrix product with the value matrix V_c is then taken as the weight of the updated information, selectively aggregating the channel information of M^{l-τ:l-1}, after which the matrix is reshaped back to its original shape. Finally, the result is summed with the spatiotemporal memory unit M^{l-1} of the previous layer and passed through a LayerNorm layer to obtain the final output C_ATT of the channel attention module.
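The channel attention differs only in which axes are treated as the query and key dimensions; a matching sketch under the same assumptions as above:

```python
import torch
import torch.nn.functional as F

def channel_attention(f_gate, m_hist, m_prev):
    """Sketch of C_ATT: attention is computed between channels instead of positions."""
    B, C, H, W = f_gate.shape
    tau = m_hist.shape[2]
    q = f_gate.reshape(B, C, H * W)                                # (B, C, H*W)
    kv = m_hist.permute(0, 2, 1, 3, 4).reshape(B, tau * C, H * W)  # (B, tau*C, H*W)
    attn = F.softmax(torch.matmul(q, kv.transpose(1, 2)), dim=-1)  # (B, C, tau*C)
    out = torch.matmul(attn, kv).reshape(B, C, H, W)               # aggregate channel info
    return F.layer_norm(m_prev + out, m_prev.shape[1:])            # sum + LayerNorm
```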
Step (9-1-3): the output ST_ATT of the spatiotemporal attention module of step 9-1-1 and the output C_ATT of the channel attention module of step 9-1-2 are fused, as shown in the green part of FIG. 2. Specifically, ST_ATT and C_ATT each pass through a convolution layer with kernel size 3, a LayerNorm normalization layer, a ReLU activation layer and a convolution layer with kernel size 1; the two results are then summed element-wise, and a final convolution layer produces the fused attention result AttFusion. The specific calculation formula is as follows:
AttFusion=Sum(ST_ATT,C_ATT)
=conv(conv(layernorm(ReLU(conv(ST_ATT))))
+conv(layernorm(ReLU(conv(C_ATT)))))
Step (9-2): the hidden state X_t input at the current time, the hidden state H_{t-1}^1 output by the same layer at the previous time and the memory cell C_{t-1}^1 are used to update the current memory cell C_t^1 through the input modulation gate g_t^1, the input gate i_t^1 and the forget gate f_t^1. The formulas are as follows:

g_t^1 = tanh(W_xg * X_t + W_hg * H_{t-1}^1 + b_g)
i_t^1 = σ(W_xi * X_t + W_hi * H_{t-1}^1 + b_i)
f_t^1 = σ(W_xf * X_t + W_hf * H_{t-1}^1 + b_f)
C_t^1 = f_t^1 ⊙ C_{t-1}^1 + i_t^1 ⊙ g_t^1
Step (9-3): the hidden state X_t output by the previous layer at the current time, the spatiotemporal memory unit M^{l-1} of the previous layer and the set M^{l-τ:l-1} of continuous historical spatiotemporal memory units are taken as inputs; using the fused attention result AttFusion of step 9-1, the input modulation gate g_t^{1'}, the input gate i_t^{1'} and the forget gate f_t^{1'} update the current spatiotemporal memory unit M_t^1. The formulas are as follows:

g_t^{1'} = tanh(W'_xg * X_t + W_mg * M^{l-1} + b'_g)
i_t^{1'} = σ(W'_xi * X_t + W_mi * M^{l-1} + b'_i)
f_t^{1'} = σ(W'_xf * X_t + W_mf * M^{l-1} + b'_f)
M_t^1 = f_t^{1'} ⊙ AttFusion + i_t^{1'} ⊙ g_t^{1'}
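Tying the pieces together, the spatiotemporal-memory update of step (9-3) could look like the following sketch, where the forget gate f' is computed first and then reused as the query of the fused attention (attention_fn stands for a composition of the two attention sketches above plus the fusion module; all names are assumptions):

```python
import torch
import torch.nn as nn

class SpatiotemporalMemoryUpdate(nn.Module):
    """Sketch of M_t = f' * AttFusion + i' * g'."""
    def __init__(self, channels, filter_size=5):
        super().__init__()
        pad = filter_size // 2
        self.conv_x = nn.Conv2d(channels, 3 * channels, filter_size, padding=pad)
        self.conv_m = nn.Conv2d(channels, 3 * channels, filter_size, padding=pad, bias=False)

    def forward(self, x, m_prev_layer, m_hist, attention_fn):
        gx, ix, fx = torch.chunk(self.conv_x(x), 3, dim=1)
        gm, im, fm = torch.chunk(self.conv_m(m_prev_layer), 3, dim=1)
        g = torch.tanh(gx + gm)        # input modulation gate g'
        i = torch.sigmoid(ix + im)     # input gate i'
        f = torch.sigmoid(fx + fm)     # forget gate f'
        # f' is reused as the query of the fused attention over the history M^{l-tau:l-1}
        att_fusion = attention_fn(f, m_hist, m_prev_layer)
        return f * att_fusion + i * g  # updated spatiotemporal memory M_t
```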
Step (9-4): the hidden state X_t output by the previous layer at the current time, the hidden state H_{t-1}^1 output by the same layer at the previous time, and the memory cell C_t^1 and spatiotemporal memory M_t^1 updated in steps 9-2 and 9-3 are used to update the hidden state H_t^1 through the output gate o_t^1. The formulas are as follows:

o_t^1 = σ(W_xo * X_t + W_ho * H_{t-1}^1 + W_co * C_t^1 + W_mo * M_t^1 + b_o)
H_t^1 = o_t^1 ⊙ tanh(W_1*1 * [C_t^1, M_t^1])
Step (10): the hidden state H_t^1, memory unit C_t^1 and spatiotemporal memory M_t^1 output in step (9), the hidden state H_{t-1}^2 output by the same layer network at the previous time, and the set M^{l-τ:l-1} of spatiotemporal memory units of the previous τ layers are input into the layer-2 AF-LSTM network unit at time t. After forward propagation, the hidden state H_t^2, memory unit C_t^2 and spatiotemporal memory M_t^2 output by this layer are obtained, and the set M^{l-τ:l-1} of continuous historical spatiotemporal memory units of the previous τ time steps is updated at the same time as the input to the attention module of the next unit. The AF-LSTM network unit, as shown in FIG. 3, comprises the following steps:
Step (10-1): the spatiotemporal memory unit M^{l-1} of the previous layer, the forget gate f_t^{2'} and the set M^{l-τ:l-1} of continuous historical spatiotemporal memory units of the previous layers are taken as inputs, and the output is computed by the fused attention mechanism, yielding a spatiotemporal memory unit AttFusion that aggregates multiple time steps. As shown in FIG. 2, this comprises the following steps:
Step (10-1-1): the forget gate f_t^{2'} ∈ R^{B×C×H×W} is regarded as the query matrix Q_l, where B, C, H and W denote the feature-image batch size, the number of channels, the image height and the image width, respectively. It is first reshaped directly into Q_l ∈ R^{B×(H*W)×C}. The corresponding set M^{l-τ:l-1} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_l and the value matrix V_l, where τ is the length of the time series; they are likewise reshaped into K_l ∈ R^{B×(τ*H*W)×C} and V_l ∈ R^{B×(τ*H*W)×C}. The output ST_ATT of the spatiotemporal attention module is then obtained, and the specific formula is as follows:

ST_ATT = LayerNorm(M^{l-1} + reshape(softmax(Q_l·K_l^T)·V_l))
Q_l = f_t^{2'}; K_l = V_l = M^{l-τ:l-1}
As shown in the blue part of FIG. 2, softmax(Q_l·K_l^T) denotes the transposed matrix multiplication of the query matrix Q_l and the key matrix K_l followed by a softmax layer; it represents the positional similarity between Q_l and K_l, i.e. the degree of correlation between the forget gate f_t^{2'} and the set M^{l-τ:l-1} of continuous historical spatiotemporal feature maps. The matrix product with the value matrix V_l is then computed as the weight of the updated information, selectively aggregating the spatiotemporal information of M^{l-τ:l-1}, and the matrix is reshaped back to its original shape. Finally, the result is summed with the spatiotemporal memory unit M^{l-1} of the previous layer and passed through a LayerNorm layer to obtain the final output ST_ATT of the spatiotemporal attention module.
Step (10-1-2): the forget gate f_t^{2'} ∈ R^{B×C×H×W} is again used as the query matrix Q_c; unlike in the spatiotemporal attention module, it is reshaped into Q_c ∈ R^{B×C×(H*W)}. The corresponding set M^{l-τ:l-1} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_c and the value matrix V_c, which are reshaped into K_c ∈ R^{B×(τ*C)×(H*W)} and V_c ∈ R^{B×(τ*C)×(H*W)}. The output C_ATT of the channel attention module is then obtained, and the specific formula is as follows:

C_ATT = LayerNorm(M^{l-1} + reshape(softmax(Q_c·K_c^T)·V_c))
Q_c = f_t^{2'}; K_c = V_c = M^{l-τ:l-1}
As shown in the orange part of FIG. 2, softmax(Q_c·K_c^T) ∈ R^{B×C×(τ*C)} represents the degree to which the query matrix Q_c influences the key matrix K_c along the channel dimension. The matrix product with the value matrix V_c is then taken as the weight of the updated information, selectively aggregating the channel information of M^{l-τ:l-1}, after which the matrix is reshaped back to its original shape. Finally, the result is summed with the spatiotemporal memory unit M^{l-1} of the previous layer and passed through a LayerNorm layer to obtain the final output C_ATT of the channel attention module.
Step (10-1-3): the output ST_ATT of the spatiotemporal attention module of step 10-1-1 and the output C_ATT of the channel attention module of step 10-1-2 are fused, as shown in the green part of FIG. 2. Specifically, ST_ATT and C_ATT each pass through a convolution layer with kernel size 3, a LayerNorm normalization layer, a ReLU activation layer and a convolution layer with kernel size 1; the two results are then summed element-wise, and a final convolution layer produces the fused attention result AttFusion. The specific calculation formula is as follows:
AttFusion=Sum(ST_ATT,C_ATT)
=conv(conv(layernorm(ReLU(conv(ST_ATT))))
+conv(layernorm(ReLU(conv(C_ATT)))))
Step (10-2): the hidden state H_t^1 output by the previous layer at the current time, the hidden state H_{t-1}^2 output by the same layer at the previous time and the memory cell C_{t-1}^2 are used to update the current memory cell C_t^2 through the input modulation gate g_t^2, the input gate i_t^2 and the forget gate f_t^2. The formulas are as follows:

g_t^2 = tanh(W_xg * H_t^1 + W_hg * H_{t-1}^2 + b_g)
i_t^2 = σ(W_xi * H_t^1 + W_hi * H_{t-1}^2 + b_i)
f_t^2 = σ(W_xf * H_t^1 + W_hf * H_{t-1}^2 + b_f)
C_t^2 = f_t^2 ⊙ C_{t-1}^2 + i_t^2 ⊙ g_t^2
Step (10-3): the hidden state H_t^1 output by the previous layer at the current time, the spatiotemporal memory unit M^{l-1} of the previous layer and the set M^{l-τ:l-1} of continuous historical spatiotemporal memory units are taken as inputs; using the fused attention result AttFusion of step 10-1, the input modulation gate g_t^{2'}, the input gate i_t^{2'} and the forget gate f_t^{2'} update the current spatiotemporal memory unit M_t^2. The formulas are as follows:

g_t^{2'} = tanh(W'_xg * H_t^1 + W_mg * M^{l-1} + b'_g)
i_t^{2'} = σ(W'_xi * H_t^1 + W_mi * M^{l-1} + b'_i)
f_t^{2'} = σ(W'_xf * H_t^1 + W_mf * M^{l-1} + b'_f)
M_t^2 = f_t^{2'} ⊙ AttFusion + i_t^{2'} ⊙ g_t^{2'}
Step (10-4): the hidden state H_t^1 output by the previous layer at the current time, the hidden state H_{t-1}^2 output by the same layer at the previous time, and the memory cell C_t^2 and spatiotemporal memory M_t^2 updated in steps 10-2 and 10-3 are used to update the hidden state H_t^2 through the output gate o_t^2. The formulas are as follows:

o_t^2 = σ(W_xo * H_t^1 + W_ho * H_{t-1}^2 + W_co * C_t^2 + W_mo * M_t^2 + b_o)
H_t^2 = o_t^2 ⊙ tanh(W_1*1 * [C_t^2, M_t^2])
Step (11): the hidden state H_t^2 and spatiotemporal memory M_t^2 output in step (10) are input into the layer-3 spatiotemporal convolutional long short-term memory network; after forward propagation, the hidden state H_t^3, memory cell C_t^3 and spatiotemporal memory M_t^3 output by this layer are obtained. The concrete steps are the same as step (10).
Step (12): the hidden state H_t^3 and spatiotemporal memory M_t^3 output in step (11) are input into the layer-4 spatiotemporal convolutional long short-term memory network; after forward propagation, the hidden state H_t^4, memory cell C_t^4 and spatiotemporal memory M_t^4 output by this layer are obtained. The concrete steps are the same as step (10).
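To make the flow of the three kinds of state across the four layers explicit, here is a hedged sketch of one time step through the stacked AF-LSTM cells; the cell interface is a placeholder, and the zigzag reuse of the layer-4 spatiotemporal memory at the next time step follows the description in step (9):

```python
import torch

def afr_lstm_step(cells, x_t, h_prev, c_prev, m_last, m_hist):
    """One time step through 4 stacked AF-LSTM cells (sketch).
    cells: 4 cell modules, each returning (h, c, m).
    h_prev, c_prev: per-layer hidden/memory states from time t-1.
    m_last: spatiotemporal memory of layer 4 at time t-1, fed to layer 1.
    m_hist: list of the last tau spatiotemporal memories (attention history)."""
    h, c = list(h_prev), list(c_prev)
    m = m_last
    for l, cell in enumerate(cells):
        inp = x_t if l == 0 else h[l - 1]   # layer 1 takes the encoder feature X_t
        hist = torch.stack(m_hist, dim=2)   # (B, C, tau, H, W)
        h[l], c[l], m = cell(inp, h[l], c[l], m, hist)
        m_hist = m_hist[1:] + [m]           # slide the tau-step history window
    return h, c, m, m_hist
```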
Step (13): the hidden state H_t^4 output in step (12) is input into the Decoder for decoding, and the decoded output is correspondingly fused with the output of each convolution layer of the Encoder. The formula is as follows:

Dec_l = Dec_{l-1} + Enc_{-1}(·)
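A hedged sketch of a Decoder with the additive skip fusion Dec_l = Dec_{l-1} + Enc(·) described above; the number of channels, the pairing order of encoder features and the output head are assumptions of this example:

```python
import torch.nn as nn

class Decoder(nn.Module):
    """Hypothetical decoder: each layer output is summed with the corresponding
    encoder feature map (additive skip fusion), then a head produces the image."""
    def __init__(self, hidden=64, out_channels=1):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU())
            for _ in range(4)
        ])
        self.head = nn.Conv2d(hidden, out_channels, 3, padding=1)

    def forward(self, h, enc_feats):
        dec = h
        # encoder features are consumed deepest-first in this sketch
        for layer, enc in zip(self.layers, reversed(enc_feats)):
            dec = layer(dec) + enc   # Dec_l = Dec_{l-1} fused with the encoder output
        return self.head(dec)        # predicted radar echo image
```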
Step (14): the decoder result Dec_l output in step (13) is the prediction image Î_{t+1} output by the network, finally completing the radar echo extrapolation from the input I_t to Î_{t+1}. The formula is as follows:

Î_{t+1} = Dec_l
Step (15): when t = 11,12,…,19, the prediction Î_t output through the prediction layer at the previous time is used as the input of the network, and steps (9) to (14) are repeatedly executed until t = 19, sequentially obtaining the predicted image sequence of future times {Î_11, Î_12, …, Î_20} and completing the extrapolation of the radar echo sequence.
Step (16): calculate the loss function value. For the prediction sequence {Î_11, Î_12, …, Î_20} obtained in step (15) and the extrapolation reference sequence ground_truths = {I_11, I_12, …, I_20}, the mean square error is calculated as the loss function. The gradients of the network parameters are computed from the loss value by back propagation, and the network parameters are updated.
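A minimal sketch of one training iteration as described in steps (15)-(16), assuming the model maps the first 10 frames of a 20-frame sequence to the last 10 (the model interface and batch layout are assumptions):

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, batch):
    """batch: (B, 20, C, H, W); frames 1-10 are inputs, frames 11-20 are targets."""
    inputs, targets = batch[:, :10], batch[:, 10:]
    optimizer.zero_grad()
    preds = model(inputs)              # extrapolated frames 11-20
    loss = F.mse_loss(preds, targets)  # mean square error loss (step 16)
    loss.backward()                    # back propagation
    optimizer.step()                   # update the network weights
    return loss.item()
```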
Step 2-10: one pass over all the data in the training set constitutes one round; steps 2-3 to 2-9 are repeatedly executed until the maximum number of training rounds is reached or the convergence condition is met, completing the AFR-LSTM network training.
Step 3: AFR-LSTM network prediction. Prediction is performed using the AFR-LSTM network trained in step 2 and the test sequence sample set obtained by the division in step 1. During prediction, 1 sequence sample is read from the test sequence sample set test_data each time and input into the trained AFR-LSTM network to obtain the final extrapolated image sequence.
In this embodiment, step 3 includes the following steps:
Step 3-1: read a test set sample. One sequence sample is read each time from the test sequence sample set test_data.
Step 3-2: radar echo image extrapolation. The test sequence sample is input into the trained AFR-LSTM network, and a radar echo extrapolated image sequence of length output_length = 10 is finally obtained through forward propagation.
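A matching sketch of the test-time procedure of step 3 (interface names such as test_data and the model call are placeholders):

```python
import torch

@torch.no_grad()
def predict_test_set(model, test_data, output_length=10):
    """Read one sequence sample at a time, run the trained network forward, and
    collect the extrapolated image sequences."""
    model.eval()
    results = []
    for sample in test_data:                      # 1 sequence sample per iteration
        sample = sample.unsqueeze(0)              # add a batch dimension
        preds = model(sample[:, :10])             # forward propagation only
        results.append(preds[0, :output_length])  # extrapolated sequence (length 10)
    return results
```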
The present invention provides a radar echo extrapolation near weather prediction method, and there are many specific methods and approaches for implementing this technical solution; the above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several improvements and embellishments without departing from the principle of the present invention, and these improvements and embellishments should also be regarded as falling within the protection scope of the present invention. All components not specified in this embodiment can be realized by the prior art.
Example 2
In a second aspect, this embodiment provides a radar echo extrapolation adjacent weather prediction device, including a processor and a storage medium;
the storage medium is used for storing instructions;
the processor is configured to operate in accordance with the instructions to perform the steps of the method according to embodiment 1.
Example 3
In a third aspect, the present embodiment provides a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of embodiment 1.
Example 4
In a fourth aspect, the present embodiment provides a computer device, including a processor and a storage medium;
the storage medium is used for storing instructions;
the processor is configured to operate in accordance with the instructions to perform the steps of the method according to embodiment 1.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only of the preferred embodiments of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention, and such modifications and adaptations are intended to be within the scope of the invention.

Claims (10)

1. A method for predicting the weather near the radar echo extrapolation is characterized by comprising the following steps:
s1, obtaining a historical radar echo sequence sample;
s2, constructing and training a prediction neural network model based on AFR-LSTM, dividing a radar echo sequence sample into batch _ sizes, inputting the samples into the prediction neural network model, and performing backward propagation to update network weights after the samples are subjected to forward propagation of a multilayer network to obtain the trained prediction neural network model;
s3, inputting a radar echo sequence sample in a set time period into a trained prediction neural network model to obtain a radar echo extrapolation image sequence;
and S4, determining a prediction result of the adjacent weather according to the radar echo extrapolation image sequence.
2. The method for predicting weather near radar echo extrapolation according to claim 1, wherein in step S1, obtaining a historical radar echo sequence sample comprises:
and (3) sequentially carrying out coordinate conversion, data interpolation and horizontal sampling pretreatment on the radar echo map acquired by the Doppler radar to obtain a gray scale map.
3. The method of radar echo extrapolation neighboring weather prediction according to claim 2, characterized in that the coordinate transformation comprises: converting radar echo map data under three-dimensional polar coordinates into a three-dimensional Cartesian rectangular coordinate system;
the data interpolation includes: performing data interpolation by adopting a reverse distance weighting method to obtain regular grid data under a three-dimensional Cartesian rectangular coordinate system;
the horizontal sampling comprises the following steps: performing horizontal sampling on regular grid data under a three-dimensional Cartesian rectangular coordinate system, extracting two-dimensional plane data under a height, and mapping the two-dimensional plane data to 0-255 to obtain a CAPPI gray image of echo intensity; wherein the data mapping formula is as follows:
Figure FDA0004021513050000011
wherein P is the grayscale pixel value, Z is the intensity value of the data, and ⌊·⌋ indicates that the value is rounded down.
4. The method for predicting radar echo adjacent weather according to claim 3, wherein the step S1 further includes: converting the data into normalized gray data normalized _ data through normalization;
Figure FDA0004021513050000021
the resulting normalized gray scale data has a value of [0,1].
5. The method for predicting radar echo adjacent weather according to claim 1, wherein in step S2, the AFR-LSTM-based prediction neural network model sequentially comprises: an Encoder, an AF-LSTM module and a Decoder;
the Encoder comprises 5 convolution layers for extracting the depth feature X_t of the radar echo sequence sample I_t;
the AF-LSTM module comprises 4 layers of AF-LSTM network units stacked sequentially after the Encoder network, and is used for extracting the spatiotemporal information of the depth feature X_t of the radar echo sequence sample; the output hidden state H_t^4 is input into the Decoder;
the AF-LSTM module is used for inputting the memory unit C_{t-1}^l and hidden state H_{t-1}^l output by the same-layer network at the previous time, the hidden state H_t^{l-1} output by the previous layer network at the current time, the spatiotemporal memory unit M^{l-1} of the previous layer and the set M^{l-τ:l-1} of spatiotemporal memory units of the previous τ layers into the l-th layer AF-LSTM network unit at time t, and obtaining, after forward propagation, the hidden state H_t^l, memory unit C_t^l and spatiotemporal memory unit M_t^l output by the current network unit, wherein t = 1,2,…,10 and l = 1,2,3,4; the initial hidden states, memory units and spatiotemporal memory units are set by initialization parameters;
the Decoder comprises 5 convolution layers for decoding the hidden state H_t^4 output by the AF-LSTM module, which is correspondingly fused with the output of each convolution layer of the Encoder to obtain the output radar echo extrapolated image sequence.
6. The method of claim 5, wherein the processing of the AF-LSTM module comprises:
step 2-1, the spatiotemporal memory unit M^{l-1} of the previous layer, the forget gate f'_t and the set M^{l-τ:l-1} of continuous historical spatiotemporal memory units of the previous layers are taken as inputs, and the output is computed by a fused attention mechanism, yielding a spatiotemporal memory unit AttFusion that aggregates multiple time steps;
step 2-2, the hidden state H_t^{l-1} output by the previous layer network at the current time, the hidden state H_{t-1}^l output by the same layer network at the previous time and the memory unit C_{t-1}^l are used to update the current memory cell C_t^l through the input modulation gate g_t, the input gate i_t and the forget gate f_t, with the following formulas:

g_t = tanh(W_xg * H_t^{l-1} + W_hg * H_{t-1}^l + b_g)
i_t = σ(W_xi * H_t^{l-1} + W_hi * H_{t-1}^l + b_i)
f_t = σ(W_xf * H_t^{l-1} + W_hf * H_{t-1}^l + b_f)
C_t^l = f_t ⊙ C_{t-1}^l + i_t ⊙ g_t

wherein "*" denotes the convolution operation and "⊙" denotes the dot product (element-wise product) of matrices; tanh denotes the hyperbolic tangent activation function tanh(x) = (e^x − e^{−x})/(e^x + e^{−x}), and σ denotes the Sigmoid activation function σ(x) = 1/(1 + e^{−x}); the convolution kernels W_xg, W_hg, W_xi, W_hi, W_xf, W_hf all have size filter_size × filter_size and number num_hidden × num_hidden; b_g, b_i, b_f denote biases;
step 2-3, the hidden state H_t^{l-1} output by the previous layer network at the current time, the spatiotemporal memory unit M^{l-1} of the previous layer and the set M^{l-τ:l-1} of continuous historical spatiotemporal memory units are taken as inputs; using the spatiotemporal memory unit AttFusion of step 2-1, the input modulation gate g'_t, the input gate i'_t and the forget gate f'_t update the current spatiotemporal memory unit M_t^l, with the following formulas:

g'_t = tanh(W'_xg * H_t^{l-1} + W_mg * M^{l-1} + b'_g)
i'_t = σ(W'_xi * H_t^{l-1} + W_mi * M^{l-1} + b'_i)
f'_t = σ(W'_xf * H_t^{l-1} + W_mf * M^{l-1} + b'_f)
M_t^l = f'_t ⊙ AttFusion + i'_t ⊙ g'_t

wherein "*" denotes the convolution operation and "⊙" denotes the dot product of matrices; tanh denotes the hyperbolic tangent activation function and σ denotes the Sigmoid activation function; the convolution kernels W'_xg, W_mg, W'_xi, W_mi, W'_xf, W_mf all have size filter_size × filter_size and number num_hidden × num_hidden; b'_i, b'_g, b'_f denote biases;
step 2-4, the hidden state H_t^{l-1} output by the previous layer network at the current time, the hidden state H_{t-1}^l output by the same layer at the previous time, the memory cell C_t^l updated in step 2-2 and the spatiotemporal memory unit M_t^l updated in step 2-3 are used to update the hidden state H_t^l through the output gate O_t, with the following formulas:

O_t = σ(W_xo * H_t^{l-1} + W_ho * H_{t-1}^l + W_co * C_t^l + W_mo * M_t^l + b_o)
H_t^l = O_t ⊙ tanh(W_1*1 * [C_t^l, M_t^l])

wherein "*" denotes the convolution operation, "⊙" denotes the dot product of matrices, and [·,·] indicates that the two matrices are concatenated by columns with the rows kept unchanged; tanh denotes the hyperbolic tangent activation function; the convolution kernel W_1*1 has size 1×1 and number num_hidden × num_hidden; W_xo, W_ho, W_co, W_mo have size 5×5 and number num_hidden × num_hidden; b_o denotes the bias.
7. The method of claim 6, wherein step 2-1 comprises: each spatiotemporal memory unit AttFusion comprises a spatiotemporal attention module, a channel attention module and a fusion attention module;
step 2-1-1, the spatiotemporal attention module: the forget gate f'_t ∈ R^{B×C×H×W} is regarded as the query matrix Q_l, where B, C, H and W denote the feature-image batch size, the number of channels, the image height and the image width, respectively; the query matrix Q_l is reshaped into Q_l ∈ R^{B×(H*W)×C}; the corresponding set M^{l-τ:l-1} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_l and the value matrix V_l, where τ refers to the length of the time series; likewise, the key matrix K_l and the value matrix V_l are respectively reshaped into K_l ∈ R^{B×(τ*H*W)×C} and V_l ∈ R^{B×(τ*H*W)×C}; from Q_l, K_l and V_l, the output ST_ATT of the spatiotemporal attention module is obtained:

ST_ATT = LayerNorm(M^{l-1} + reshape(softmax(Q_l·K_l^T)·V_l))

wherein softmax(Q_l·K_l^T) denotes the transposed matrix multiplication of the query matrix Q_l and the key matrix K_l followed by a softmax layer, and represents the positional similarity between Q_l and K_l, i.e. the degree of correlation between the forget gate f'_t and the set M^{l-τ:l-1} of continuous historical spatiotemporal feature maps; the matrix product with the value matrix V_l is then computed as the weight of the updated information, selectively aggregating the spatiotemporal information of M^{l-τ:l-1}, after which the matrix is reshaped back to its original shape; finally, the result is summed with the spatiotemporal memory unit M^{l-1} of the previous layer and passed through a LayerNorm layer to obtain the output ST_ATT of the spatiotemporal attention module;
step 2-1-2, the channel attention module: the forget gate f'_t ∈ R^{B×C×H×W} is used as the query matrix Q_c and reshaped into Q_c ∈ R^{B×C×(H*W)}; the corresponding set M^{l-τ:l-1} ∈ R^{B×C×τ×H×W} of continuous historical spatiotemporal feature maps is regarded as the key matrix K_c and the value matrix V_c, which are reshaped into K_c ∈ R^{B×(τ*C)×(H*W)} and V_c ∈ R^{B×(τ*C)×(H*W)}; from Q_c, K_c and V_c, the output C_ATT of the channel attention module is obtained:

C_ATT = LayerNorm(M^{l-1} + reshape(softmax(Q_c·K_c^T)·V_c))

wherein softmax(Q_c·K_c^T) represents the degree to which the query matrix Q_c influences the key matrix K_c along the channel dimension; the matrix product with the value matrix V_c is then taken as the weight of the updated information, selectively aggregating the channel information of M^{l-τ:l-1}, after which the matrix is reshaped back to its original shape; finally, the result is summed with the spatiotemporal memory unit M^{l-1} of the previous layer and passed through a LayerNorm layer to obtain the output C_ATT of the channel attention module;
step 2-1-3, the fusion attention module: the output ST_ATT of the spatiotemporal attention module and the output C_ATT of the channel attention module are fused to obtain the fused attention result AttFusion:
AttFusion=Sum(ST_ATT,C_ATT)
=conv(conv(layernorm(ReLU(conv(ST_ATT))))+conv(layernorm(ReLU(conv(C_ATT)))))
wherein ST_ATT and C_ATT each pass through a convolution layer with kernel size 3, a LayerNorm normalization layer, a ReLU activation layer and a convolution layer with kernel size 1; element-wise summation is performed on the two results, and finally a convolution layer is used to generate the final fused attention result, i.e. the spatiotemporal memory unit AttFusion is output.
8. The method for predicting radar echo adjacent weather according to claim 5, wherein decoding the hidden state H_t^4 output by the AF-LSTM module, and correspondingly fusing it with the output of each convolution layer of the encoder, comprises:

Dec_l = Dec_{l-1} + Enc_{-1}(·)

wherein Dec_{l-1} represents the output of the previous convolution layer of the decoder, Enc_{-1}(·) represents the output of the corresponding convolution layer of the encoder, and Dec_l represents the final decoder result obtained by adding the two.
9. A radar echo extrapolation adjacent weather prediction device is characterized by comprising a processor and a storage medium;
the storage medium is to store instructions;
the processor is configured to operate in accordance with the instructions to perform the steps of the method according to any one of claims 1 to 8.
10. A storage medium having a computer program stored thereon, the computer program, when being executed by a processor, performing the steps of the method of any one of claims 1 to 8.
CN202211688110.2A 2022-12-28 2022-12-28 Radar echo extrapolation near weather prediction method Pending CN115933010A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211688110.2A CN115933010A (en) 2022-12-28 2022-12-28 Radar echo extrapolation near weather prediction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211688110.2A CN115933010A (en) 2022-12-28 2022-12-28 Radar echo extrapolation near weather prediction method

Publications (1)

Publication Number Publication Date
CN115933010A true CN115933010A (en) 2023-04-07

Family

ID=86554115

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211688110.2A Pending CN115933010A (en) 2022-12-28 2022-12-28 Radar echo extrapolation near weather prediction method

Country Status (1)

Country Link
CN (1) CN115933010A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116953653A (en) * 2023-09-19 2023-10-27 成都远望科技有限责任公司 Networking echo extrapolation method based on multiband weather radar
CN117634930A (en) * 2024-01-24 2024-03-01 南京信息工程大学 Typhoon cloud picture prediction method, typhoon cloud picture prediction system and storage medium
CN117665825A (en) * 2024-01-31 2024-03-08 南京信息工程大学 Radar echo extrapolation prediction method, system and storage medium
CN118379642A (en) * 2024-06-27 2024-07-23 安徽省公共气象服务中心(安徽省突发公共事件预警信息发布中心) Probability distribution-based self-supervision short-cut rainfall prediction method and system
CN118604917A (en) * 2024-08-09 2024-09-06 南京信息工程大学 Precipitation proximity forecasting method based on spatial correlation feature extraction and depth space-time fusion network

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116953653A (en) * 2023-09-19 2023-10-27 成都远望科技有限责任公司 Networking echo extrapolation method based on multiband weather radar
CN116953653B (en) * 2023-09-19 2023-12-26 成都远望科技有限责任公司 Networking echo extrapolation method based on multiband weather radar
CN117634930A (en) * 2024-01-24 2024-03-01 南京信息工程大学 Typhoon cloud picture prediction method, typhoon cloud picture prediction system and storage medium
CN117634930B (en) * 2024-01-24 2024-03-29 南京信息工程大学 Typhoon cloud picture prediction method, typhoon cloud picture prediction system and storage medium
CN117665825A (en) * 2024-01-31 2024-03-08 南京信息工程大学 Radar echo extrapolation prediction method, system and storage medium
CN117665825B (en) * 2024-01-31 2024-05-14 南京信息工程大学 Radar echo extrapolation prediction method, system and storage medium
CN118379642A (en) * 2024-06-27 2024-07-23 安徽省公共气象服务中心(安徽省突发公共事件预警信息发布中心) Probability distribution-based self-supervision short-cut rainfall prediction method and system
CN118604917A (en) * 2024-08-09 2024-09-06 南京信息工程大学 Precipitation proximity forecasting method based on spatial correlation feature extraction and depth space-time fusion network

Similar Documents

Publication Publication Date Title
CN115933010A (en) Radar echo extrapolation near weather prediction method
CN108846199B (en) Ultra-high arch dam deformation space-time sequence prediction method based on space-time integration
CN115390164B (en) Radar echo extrapolation forecasting method and system
CN112446419A (en) Time-space neural network radar echo extrapolation forecasting method based on attention mechanism
CN111680784B (en) Sea surface temperature deep learning prediction method based on space-time multidimensional influence
CN117665825B (en) Radar echo extrapolation prediction method, system and storage medium
CN116030537B (en) Three-dimensional human body posture estimation method based on multi-branch attention-seeking convolution
CN115482656B (en) Traffic flow prediction method by using space dynamic graph convolutional network
CN112288690A (en) Satellite image dense matching method fusing multi-scale and multi-level features
CN116612396A (en) Ocean surface temperature sequence prediction method based on space-time double-flow non-stationary sensing
CN116844041A (en) Cultivated land extraction method based on bidirectional convolution time self-attention mechanism
CN117172355A (en) Sea surface temperature prediction method integrating space-time granularity context neural network
CN117011668A (en) Weather radar echo extrapolation method based on time sequence prediction neural network
CN116486102A (en) Infrared dim target detection method based on mixed spatial modulation characteristic convolutional neural network
CN115860215A (en) Photovoltaic and wind power generation power prediction method and system
CN113935458A (en) Air pollution multi-site combined prediction method based on convolution self-coding deep learning
CN117131991A (en) Urban rainfall prediction method and platform based on hybrid neural network
CN116106909A (en) Radar echo extrapolation method, system and storage medium
CN114594443B (en) Meteorological radar echo extrapolation method and system based on self-attention mechanism and predictive recurrent neural network
CN115830707A (en) Multi-view human behavior identification method based on hypergraph learning
CN115797181A (en) Image super-resolution reconstruction method for mine fuzzy environment
CN115100599A (en) Mask transform-based semi-supervised crowd scene abnormality detection method
CN114694037A (en) Tropical cyclone track prediction method based on deep learning algorithm
CN117634930B (en) Typhoon cloud picture prediction method, typhoon cloud picture prediction system and storage medium
Zhu et al. An Image Fingerprint and AttentionMechanism Based Load Estimation Algorithm for Electric Power System.

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination