CN117237781B - Attention mechanism-based double-element fusion space-time prediction method - Google Patents

Publication number
CN117237781B
Authority
CN
China
Prior art keywords
prediction
data
space
fusion
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311522863.0A
Other languages
Chinese (zh)
Other versions
CN117237781A (en)
Inventor
周志权
徐天亮
王晨旭
李迎春
李剑锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology Weihai
Original Assignee
Harbin Institute of Technology Weihai
Application filed by Harbin Institute of Technology Weihai
Priority to CN202311522863.0A
Publication of CN117237781A
Application granted
Publication of CN117237781B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The application provides a double-element fusion space-time prediction method based on an attention mechanism, which addresses the insufficient prediction accuracy of existing space-time prediction methods. The method comprises: using a dual-element input module to standardize the dual-element data, eliminating order-of-magnitude differences between the dimensions of the data and scaling the data to a standard interval; using a 3D convolution module to extract the temporal and spatial information hidden in each element; using an attention fusion module to fuse the temporal and spatial features of the two elements and reassign feature weights for the ocean element to be predicted; using a convolutional long short-term memory module to capture temporal dependencies and generate, by conversion, a prediction feature matrix of the target length; and using a single-element prediction output module to map the prediction feature matrix to the output space and visualize the prediction result through inverse standardization. The method is applicable to the technical field of space-time prediction of marine elements.

Description

Attention mechanism-based double-element fusion space-time prediction method
Technical Field
The application relates to a double-element fusion space-time prediction method, in particular to a double-element fusion space-time prediction method based on an attention mechanism.
Background
Surface data on ocean elements are acquired mainly by two means: direct observation and satellite remote sensing. Direct observation collects data through buoys and offers accurate sampling, low cost, and real-time uploading, but it is limited by factors such as changeable sea conditions and the maintenance of instruments and equipment, and its coverage is small. Satellite remote sensing relies mainly on microwave and infrared observation and offers wider coverage and high resolution. The steadily accumulating and increasingly refined observation data on marine elements lay the data foundation for data-driven prediction methods, which are currently an important means of studying and observing marine elements.
Deep learning networks are increasingly applied to the spatiotemporal prediction of marine element surface data, in particular to judging the formation of El Niño by predicting changes in the sea surface temperature (SST) of the Pacific Ocean, so as to prevent ocean disasters and reduce economic losses. Liu et al. proposed a convolutional long short-term memory neural network (ConvLSTM) that uses a CNN to extract relevant features from temperature and salinity data over time and then uses an LSTM network to predict SST. Xu et al. proposed a regional convolutional long short-term memory model (RC-LSTM) with space-time information processing capability, which solves the problem of losing regional information, and carried out experiments in China's coastal waters. Gou et al. proposed a general space-time ocean deep learning framework (deep) that uses a multi-layer perceptron (MLP) generation module and a multivariate convolutional LSTM (MVC-LSTM) neural network prediction module.
However, marine environmental element prediction is difficult because ocean dynamic processes are complex and nonlinear; when El Niño forms in particular, SST changes are more unusual, which makes prediction more challenging. The above methods do not consider the influence of other marine elements on SST changes and cannot fuse the correlated features among multiple elements. Adding multi-element features is therefore critical to improving the prediction accuracy of SST.
Disclosure of Invention
In order to solve the problem of insufficient prediction accuracy of existing space-time prediction methods, the technical scheme adopted by the application is as follows: the application provides a double-element fusion space-time prediction method based on an attention mechanism, which comprises the following steps:
using a dual-element input module to standardize the dual-element data, eliminating order-of-magnitude differences between the dimensions of the data so that the data are scaled to a standard interval;
using a 3D convolution module to extract the temporal and spatial information hidden in each element;
using an attention fusion module to fuse the temporal and spatial features of the two elements and reassign feature weights for the ocean element to be predicted;
using a convolutional long short-term memory module to capture temporal dependencies and generate, by conversion, a prediction feature matrix of the target length;
using a single-element prediction output module to map the prediction feature matrix to the output space and visualize the prediction result through inverse standardization;
wherein the dual elements comprise SST as element 1 and SLA or SSW as element 2.
Preferably, the dual-element data are standardized according to the following formula:
$$x_{std} = \frac{x_i - x_{min}}{x_{max} - x_{min}} \qquad (1)$$
wherein $x_{std}$ is the standardized data in the range [0, 1], $x_{max}$ and $x_{min}$ are the maximum and minimum values of the original data within the [rows, cols] region, and $x_i$ is the original data at each location i within the region;
the standardized data obtained have the shape (1, timeteps, rows, cols, 1).
Preferably, the 3D convolution module extracts the temporal and spatial information hidden in each element by using Conv3D units, and outputs data with the shape (1, timeteps, rows, cols, 40).
Preferably, the implementation process of the attention fusion module includes:
element 1 passes through a dimension permutation (Transpose) that swaps the 1st and 4th dimensions of the data, giving the hidden information H_i of element 1;
element 2 is fused with element 1 and passes in sequence through element fusion (Add), a ReLU activation function, a Conv3D_A unit, a sigmoid activation function, and a dimension permutation (Transpose), giving the correlation feature weights of the two elements;
element-wise multiplication (Multiply) computes the element-by-element product of the correlation feature weights and the hidden information H_i, giving the new feature X_i of element 1, whose data shape is (1, 40, rows, cols, timeteps).
Preferably, the convolutional long short-term memory module further extracts features, captures temporal dependencies, and generates by conversion a prediction feature matrix of the target length.
Preferably, the convolutional long short-term memory module is implemented as follows:
the new feature X_i of element 1 first undergoes a dimension permutation (Transpose); features are then further extracted by encoding through 2 ConvLSTM 2D layers; a conversion layer then generates a prediction feature matrix of the target length targetsizes; finally, feature decoding is performed through 2 ConvLSTM 2D layers, and the output data have the shape (1, targetsizes, rows, cols, 32).
Preferably, the single-element prediction output module maps the prediction feature matrix output by the convolutional long short-term memory module to the output space, and then visualizes the prediction result through inverse standardization.
Preferably, the prediction result is visualized as follows:
a TimeDistributed(Dense) fully connected layer compresses the extracted features and maps them to the output space through a nonlinear transformation;
the prediction result is visualized through inverse standardization, and the prediction result of element 1 for targetsizes days is output.
Preferably, the inverse standardization formula is as follows:
$$x_i = x_{std}\,(x_{max} - x_{min}) + x_{min} \qquad (2)$$
wherein $x_{std}$ is the standardized prediction data in the interval [0, 1] that the model maps onto the output space; $x_{max}$ and $x_{min}$ are the maximum and minimum values of the original data within the [rows, cols] region, equal to $x_{max}$ and $x_{min}$ in formula (1); and $x_i$ is the prediction in original units at each location i within the region, which is used for visualization.
The invention has the beneficial effect of providing a double-element fusion space-time prediction method based on an attention mechanism, which comprises a dual-element input module, a 3D convolution module, an attention fusion module, a convolutional long short-term memory module, and a single-element prediction output module. The 3D convolution module extracts the temporal and spatial information hidden in each element; the attention fusion module fuses the temporal and spatial features of the two elements and reassigns feature weights for the ocean element to be predicted; the convolutional long short-term memory module captures short-term temporal dependencies and generates by conversion a prediction matrix of the target length, realizing space-time prediction of ocean elements. By predicting SST through double-element fusion with SST as element 1 and SLA or SSW as element 2, the invention overcomes the shortcomings of single-element SST prediction models, effectively reduces the root mean square error of marine element space-time prediction, and improves prediction accuracy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required for the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a method for dual-element fusion spatiotemporal prediction based on an attention mechanism according to an embodiment of the present application;
FIG. 2 is a schematic workflow diagram of an attention fusion module according to an embodiment of the present application;
FIG. 3 is a structural diagram of a ConvLSTM 2D layer and its update and transfer process according to an embodiment of the present application.
Detailed Description
In order to make the technical problems, technical schemes and beneficial effects to be solved by the present application more clear, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
It should be noted that the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more such features. In the description of the present application, "a plurality" means two or more, unless explicitly defined otherwise.
Referring to fig. 1, a flow chart of a dual-element fusion spatiotemporal prediction method based on an attention mechanism according to an embodiment of the present application is provided, and for convenience of explanation, only the relevant parts of the embodiment are shown, and the detailed description is as follows:
in one embodiment, a method for dual element fusion spatiotemporal prediction based on an attention mechanism includes the steps of:
S101, performing standardized processing on the double-element data by using a double-element input module, eliminating the order-of-magnitude differences between the dimensions of the data so that the data are scaled to a reasonable interval.
S102, extracting time and space information of element hiding by using a 3D convolution module.
S103, fusing time and space characteristics among the double elements by using an attention fusion module, and reallocating characteristic weights for the predicted ocean elements.
S104, capturing a time sequence dependency relationship by using a convolution long-term and short-term memory module, and converting to generate a prediction feature matrix of the target length.
S105, mapping the prediction feature matrix to an output space by using a single element prediction output module, and visualizing a prediction result through inverse standardization.
Specifically, the detailed description of the dual element fusion spatiotemporal prediction method is as follows:
referring to fig. 1, the structure of the method for spatial-temporal prediction based on double-element fusion of attention mechanism is composed of 5 modules, including a double-element input module, a 3D convolution module, an attention fusion module, a convolution long-short-term memory module and a single element prediction output module.
In one embodiment, the implementation of the dual element input module is as follows:
First, the input data of two marine elements, SST as element 1 and SLA or SSW as element 2, are standardized. The input data have the shape (1, timeteps, rows, cols, 1), where timeteps is the length of the input historical driving period, rows is the number of rows of data in the region, cols is the number of columns of data in the region, and the number of samples and the number of channels are both 1. SST is global daily satellite sea surface temperature observation data provided by the National Oceanic and Atmospheric Administration (NOAA); it achieves higher accuracy by using a new land mask, has a grid resolution of 0.25°, and fully meets the GDS 2.0 data specification. SLA is global daily gridded sea level anomaly data provided by AVISO (Archiving, Validation and Interpretation of Satellite Oceanographic data) of the French National Centre for Space Studies (CNES), with a grid resolution of 0.25° and a wide observation range. SSW is high-resolution blended sea surface wind data provided by NOAA, generated by blending observations from multiple satellites, which reduces gaps in the sampled data and random errors. Its grid resolution is 0.25°, the same as for SST and SLA.
The purpose of standardization is to eliminate the order-of-magnitude differences between the dimensions of the data so that the data are scaled to a reasonable interval. Standardization greatly reduces computational cost, improves training efficiency, avoids large fluctuations while searching for the optimal solution, and makes it easier to converge accurately to the optimal solution. The standardization formula is as follows:
$$x_{std} = \frac{x_i - x_{min}}{x_{max} - x_{min}} \qquad (1)$$
wherein $x_{std}$ is the standardized data in the range [0, 1], $x_{max}$ and $x_{min}$ are the maximum and minimum values of the original data within the [rows, cols] region, and $x_i$ is the original data at each location i within the region.
Finally, the resulting normalized data shape is (1, timeteps, rows, cols, 1) which is fed into a 3D convolution module.
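The standardization step can be illustrated with a minimal NumPy sketch. It assumes formula (1) is ordinary min-max scaling and that the minimum and maximum are taken over the whole input array; the text only says they are taken within the [rows, cols] region, so that scope is an assumption. The function name `min_max_normalize` and the toy values are hypothetical.

```python
import numpy as np

def min_max_normalize(x):
    """Scale raw data to [0, 1] using the minimum and maximum of the array.

    Returns the scaled data together with x_min and x_max, which are
    needed later to undo the scaling.
    """
    x = np.asarray(x, dtype=np.float64)
    x_min = x.min()
    x_max = x.max()
    if x_max == x_min:
        raise ValueError("constant field: x_max equals x_min")
    x_std = (x - x_min) / (x_max - x_min)
    return x_std, x_min, x_max

# Hypothetical toy input shaped (samples, timesteps, rows, cols, channels).
raw = np.arange(1 * 2 * 3 * 3 * 1, dtype=np.float64).reshape(1, 2, 3, 3, 1) + 20.0
x_std, x_min, x_max = min_max_normalize(raw)
print(x_std.shape, x_min, x_max)
```

Returning `x_min` and `x_max` alongside the scaled data keeps the values that the inverse standardization needs later.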
The 3D convolution module extracts the temporal and spatial information hidden in each element by using Conv3D units. In one embodiment, the Conv3D unit performs a 3-dimensional convolution: a 3 × 3 × 3 convolution kernel slides over the 2-dimensional image plane and along the third (depth) dimension, and at each position the values of the corresponding 3 × 3 × 3 block of the element matrix are multiplied by the kernel values at the same positions and summed. A 3-dimensional convolution with 40 filters is applied to each of the two element input matrices separately, giving 40 groups of output results, so the output data have the shape (1, timeteps, rows, cols, 40) and are sent to the attention fusion module.
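The multiply-and-sum operation of a single Conv3D filter can be sketched naively in NumPy. This is an illustration only, not the patent's implementation: zero ("same") padding is assumed because the stated output shape keeps timeteps, rows, and cols unchanged, and the function name `conv3d_same_single_filter` is hypothetical. A real Conv3D layer would apply 40 such filters plus a bias.

```python
import numpy as np

def conv3d_same_single_filter(volume, kernel):
    """Naive 3D convolution (cross-correlation, as deep learning libraries
    compute it) of a (timesteps, rows, cols) volume with one small kernel.
    The volume is zero-padded so the output keeps the input shape."""
    t, r, c = volume.shape
    kt, kr, kc = kernel.shape
    padded = np.pad(volume, ((kt // 2,) * 2, (kr // 2,) * 2, (kc // 2,) * 2))
    out = np.zeros(volume.shape, dtype=np.float64)
    for i in range(t):
        for j in range(r):
            for k in range(c):
                # Multiply the local block by the kernel and sum the products.
                block = padded[i:i + kt, j:j + kr, k:k + kc]
                out[i, j, k] = np.sum(block * kernel)
    return out

volume = np.ones((4, 5, 5))   # toy (timesteps, rows, cols) input
kernel = np.ones((3, 3, 3))   # 3 x 3 x 3 kernel of ones
out = conv3d_same_single_filter(volume, kernel)
print(out.shape, out[1, 2, 2], out[0, 0, 0])
```

With an all-ones input and kernel, an interior output value counts the 27 cells under the kernel, while a corner value counts only the 8 cells that are not padding.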
In one embodiment, the attention fusion module is specifically implemented as follows:
The attention fusion module fuses the temporal and spatial features of the two elements and reassigns feature weights for the ocean element to be predicted, element 1. As shown in FIG. 2, the workflow has two parts. In the first part, the hidden information of element 1 output by the previous module undergoes a dimension permutation (Transpose) that swaps the 1st and 4th dimensions of the data, so that the dimension order [0, 1, 2, 3, 4] becomes [0, 4, 2, 3, 1]; this gives the hidden information H_i of element 1, with data shape (1, 40, rows, cols, timeteps). In the second part, the hidden information of element 1 and element 2 output by the previous module passes in sequence through element fusion (Add), a ReLU activation function, a Conv3D_A unit, a sigmoid activation function, and a dimension permutation (Transpose), giving the correlation feature weights of the two elements, with data shape (1, 1, rows, cols, timeteps). Specifically, the element fusion Add fuses the temporal and spatial features of the two elements; it only increases the amount of information and does not change the dimensions of the data. The activation functions perform nonlinear mapping. The Conv3D_A unit performs the same 3-dimensional convolution as the Conv3D unit, except that Conv3D_A uses 1 filter.
Finally, element-wise multiplication (Multiply) computes the element-by-element product of the correlation feature weights and the hidden information H_i, giving the new feature X_i of element 1, whose data shape is (1, 40, rows, cols, timeteps); X_i is sent to the convolutional long short-term memory module.
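The data flow of the attention fusion module can be traced with a small NumPy sketch. One simplification is made and should be read as an assumption: the one-filter Conv3D_A unit is replaced here by a 1 × 1 × 1 one-filter convolution (a matrix product over the channel axis), which gives the same output shape but ignores spatial and temporal neighbours. The names `attention_fuse` and `w` and the random inputs are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def attention_fuse(feat1, feat2, w):
    """feat1, feat2: Conv3D outputs of elements 1 and 2,
    shape (1, timesteps, rows, cols, 40).
    w: (40, 1) weights of a 1x1x1 one-filter convolution, used as a
    simplified stand-in for the Conv3D_A unit."""
    perm = (0, 4, 2, 3, 1)                         # swap dimensions 1 and 4
    h = np.transpose(feat1, perm)                  # (1, 40, rows, cols, timesteps)
    fused = np.maximum(feat1 + feat2, 0.0)         # Add, then ReLU
    scores = fused @ w                             # (1, timesteps, rows, cols, 1)
    weights = np.transpose(sigmoid(scores), perm)  # (1, 1, rows, cols, timesteps)
    return weights * h                             # broadcast over the 40 channels

rng = np.random.default_rng(0)
feat1 = rng.normal(size=(1, 10, 4, 4, 40))
feat2 = rng.normal(size=(1, 10, 4, 4, 40))
w = rng.normal(size=(40, 1))
x_new = attention_fuse(feat1, feat2, w)
print(x_new.shape)
```

Because the sigmoid keeps every weight between 0 and 1, the fusion can only attenuate the hidden information of element 1, position by position and time step by time step; it never amplifies it.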
The convolution long-short-term memory module is used for further extracting features, capturing time sequence dependency and converting the time sequence dependency into a prediction feature matrix of the target length.
The new feature X_i of element 1 first undergoes a dimension permutation (Transpose), which exchanges dimensions of the input tensor: the 1st and 4th dimensions of the data are swapped so that the dimension order [0, 1, 2, 3, 4] becomes [0, 4, 2, 3, 1], and the transformed data have the shape (1, timeteps, rows, cols, 40);
features are then further extracted by encoding through 2 ConvLSTM 2D layers, which capture temporal dependencies and memorize and store information. Specifically, each ConvLSTM 2D layer performs a convolutional long short-term memory operation on the features, using 32 filters, a 3 × 3 convolution kernel, a stride of 1 × 1, and a dropout rate of 0.2, giving 32 groups of output results; the data shape at this point is (1, timeteps, rows, cols, 32);
a conversion layer then converts timeteps in the input shape into targetsizes, generating a prediction feature matrix of the target length with data shape (1, targetsizes, rows, cols, 32);
finally, feature decoding is performed through 2 ConvLSTM 2D layers with the same parameter settings as the other ConvLSTM 2D layers; the output data have the shape (1, targetsizes, rows, cols, 32) and are sent to the single-element prediction output module.
The single-element prediction output module maps the prediction feature matrix to the output space and visualizes the prediction result through inverse standardization.
A TimeDistributed(Dense) fully connected layer compresses the extracted features and maps them to the output space through a nonlinear transformation, giving data of shape (1, targetsizes, rows, cols, 1); the prediction result is then visualized through inverse standardization, and the prediction result of element 1 for targetsizes days is output, where targetsizes is the number of days of direct prediction output.
The inverse standardization formula is as follows:
$$x_i = x_{std}\,(x_{max} - x_{min}) + x_{min} \qquad (2)$$
wherein $x_{std}$ is the standardized prediction data in the interval [0, 1] that the model maps onto the output space; $x_{max}$ and $x_{min}$ are the maximum and minimum values of the original data within the [rows, cols] region, equal to $x_{max}$ and $x_{min}$ in formula (1); and $x_i$ is the prediction in original units at each location i within the region, which is used for visualization.
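Inverse standardization can be sketched in a few lines of NumPy, assuming formula (2) simply inverts min-max scaling with the same x_min and x_max that were used for the input data. The function name `inverse_normalize` and the numeric values are hypothetical and serve only to make the arithmetic concrete.

```python
import numpy as np

def inverse_normalize(x_std, x_min, x_max):
    """Map standardized predictions in [0, 1] back to original units."""
    return np.asarray(x_std, dtype=np.float64) * (x_max - x_min) + x_min

# Hypothetical SST extremes (degrees Celsius) saved from the standardization step.
x_min, x_max = 24.0, 30.0
pred_std = np.array([0.0, 0.25, 1.0])   # standardized model outputs
pred = inverse_normalize(pred_std, x_min, x_max)
print(pred)
```

A standardized output of 0 maps back to x_min, 1 maps back to x_max, and 0.25 lands a quarter of the way between them.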
In one embodiment, the ConvLSTM 2D layer structure in the convolutional long short-term memory module is described as follows. For space-time sequence prediction, the convolutional long short-term memory network (ConvLSTM) can make full use of the high dimensionality of the space-time sequence to capture the spatiotemporal structure and spatial characteristics of the data effectively; its unit structure and update and transfer process are shown in FIG. 3. A notable feature of ConvLSTM is that all input elements, state elements, hidden elements, and gating elements are three-dimensional tensors instead of vectors, each of which can be regarded as a set of vectors arranged on a spatial grid. ConvLSTM uses convolution kernels to predict the future state of a grid cell from the inputs and past states of its spatial neighbors; the update mechanism, shown in FIG. 3, is:
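Forget gate:
$$f_t = \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right)$$
Input gate:
$$i_t = \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right)$$
State update:
$$C_t = f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_C\right)$$
Output gate:
$$o_t = \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right), \qquad H_t = o_t \circ \tanh(C_t)$$
wherein the input $X_t$ at the current time and the hidden state $H_{t-1}$ and cell state $C_{t-1}$ at the previous time are the inputs of ConvLSTM; the outputs are the hidden state $H_t$ and cell state $C_t$ at the current time. The Hadamard product $\circ$, the convolution $*$, the sigmoid activation function $\sigma(\cdot)$, and the hyperbolic tangent activation function $\tanh(\cdot)$ are the operations used in the update process. $f_t$, $i_t$, and $o_t$ are the values of the gating units at the current time. $W_{x(f,i,c,o)}$, $W_{h(f,i,c,o)}$, and $W_{c(f,i,o)}$ are the weight matrices of the gating units, and $b_f$, $b_i$, $b_C$, and $b_o$ are their biases.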
The invention takes two elements as input, extracts the temporal and spatial information hidden in the two elements through 3D convolution, fuses the correlated features between the two ocean elements, and captures temporal dependencies with convolutional long short-term memory, thereby realizing the prediction of SST. By adding SSW or SLA as auxiliary element 2 to predict SST, the invention overcomes the shortcomings of single-element prediction models, reduces the root mean square error (RMSE) of prediction, and improves the prediction accuracy (PACC).
Next, the prediction process of the attention mechanism-based double-element fusion space-time prediction method is described with a model instance, and the prediction results are evaluated.
Experimental environment:
Hardware environment: Ubuntu, NVIDIA 2060 Ti, 8 GB GPU memory, 16 GB RAM;
Software environment: the model was developed in PyCharm using Python 3.6 and TensorFlow-GPU 2.0; TensorFlow-GPU provides GPU acceleration, which increases running speed and reduces training time.
Experimental data:
The ocean surface remote sensing data used as multi-element inputs are SST, SLA, and SSW. SST is global daily satellite sea surface temperature observation data provided by the National Oceanic and Atmospheric Administration (NOAA); it achieves higher accuracy by using a new land mask, has a grid resolution of 0.25°, and fully meets the GDS 2.0 data specification. SLA is global daily gridded sea level anomaly data provided by AVISO (Archiving, Validation and Interpretation of Satellite Oceanographic data) of the French National Centre for Space Studies (CNES), with a grid resolution of 0.25° and a wide observation range. SSW is high-resolution blended sea surface wind data provided by NOAA, generated by blending observations from multiple satellites, which reduces gaps in the sampled data and random errors. Its grid resolution is 0.25°, the same as for SST and SLA.
Experimental area:
The selected experimental area is a sea area in the central and eastern Pacific. The multi-element ocean surface data span the spatial range (5°S-5°N, 140°W-130°W) and the time range 1999-2018; data from 1999-2014 are used as the training and validation sets, and data from 2015-2018 as the test set. The experimental region has no missing grid data.
Evaluation index:
the prediction performance of the model was evaluated using Root Mean Square Error (RMSE) and Prediction Accuracy (PACC).
Here, Y_obs and Y_pre are the true observations and the predictions of the marine element, and N is the total number of grid points in the region, N = rows × cols. The smaller the RMSE, the more accurate the prediction; the closer the PACC value is to 1, the better the predicted values fit the true values.
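The RMSE index can be sketched in plain Python, assuming the conventional definition (the square root of the mean squared difference between Y_obs and Y_pre over the N grid points), since the formula itself is not reproduced in the text. PACC is left out because its exact definition cannot be recovered from the text. The function name `rmse` and the sample values are hypothetical.

```python
import math

def rmse(y_obs, y_pre):
    """Root mean square error over N paired observations and predictions."""
    if len(y_obs) != len(y_pre) or not y_obs:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((o - p) ** 2 for o, p in zip(y_obs, y_pre))
    return math.sqrt(squared_error / len(y_obs))

# Hypothetical flattened grid values (N = 4 points), in degrees Celsius.
y_obs = [28.0, 27.5, 29.0, 28.5]
y_pre = [28.0, 28.5, 28.0, 28.5]
error = rmse(y_obs, y_pre)
print(round(error, 4))
```

For a full (rows, cols) grid, the values would be flattened to N = rows × cols points before the call.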
Determining training parameters and model parameters:
According to the attention mechanism-based double-element fusion space-time prediction method, the regional parameters of the model are set to rows = 40 and cols = 40; the input and output parameters are set to an input historical driving length of timeteps = 10 and a direct prediction output of targetsizes = 3 days. The model is tuned with the control variable method (varying one parameter at a time), and the training parameters of the final model are: the Adam optimizer, a learning rate of 0.001, mean square error as the loss function, cosine similarity as the training metric, and 100 training epochs.
Finally, 2 trained double-element fusion space-time prediction models are obtained. Model A uses 10 days of SST and SLA as dual-element input and predicts SST for the next 3 days. Model B uses 10 days of SST and SSW as dual-element input and predicts SST for the next 3 days.
Using the attention mechanism-based double-element fusion space-time prediction method provided by the application, 10 days of SST and SLA are used as dual-element input to generate model A, which predicts SST for the next 3 days; SST is element 1 and SLA is element 2, and the specific process is as follows:
SST and SLA for 10 days are input as the two elements, each with data shape (1, 10, 40, 40, 1).
(1) The dual-element input module standardizes the SST and SLA input data; the standardized data have the shape (1, 10, 40, 40, 1).
(2) The 3D convolution module extracts the temporal and spatial information hidden in the 2 elements. Conv3D uses 40 filters, so the output of each of the 2 elements has the shape (1, 10, 40, 40, 40).
(3) In the attention fusion module, the first part applies a dimension permutation (Transpose) to SST, swapping the 1st and 4th dimensions of the data to obtain the hidden information H_i of SST. The second part passes SST and SLA in sequence through element fusion (Add), a ReLU activation function, a Conv3D_A unit, a sigmoid activation function, and a dimension permutation (Transpose) to obtain the correlation feature weights that fuse the two ocean elements SST and SLA. Finally, element-wise multiplication (Multiply) computes the element-by-element product of the correlation feature weights and the hidden information H_i, giving the new feature X_i of SST, whose data shape is (1, 40, 40, 40, 10).
(4) In the convolutional long short-term memory module, a dimension permutation (Transpose) first swaps the 1st and 4th dimensions of the data, giving transformed data of shape (1, 10, 40, 40, 40); encoding through 2 ConvLSTM 2D layers gives 32 groups of output results with data shape (1, 10, 40, 40, 32); the conversion layer converts the 10 in the input shape into 3, generating a prediction feature matrix of the target length with data shape (1, 3, 40, 40, 32); finally, feature decoding is performed through 2 ConvLSTM 2D layers, and the output data have the shape (1, 3, 40, 40, 32).
(5) The single-element prediction output module maps the prediction feature matrix to the output space; the prediction result is visualized through inverse standardization, and the SST prediction for the next 3 days is output with shape (1, 3, 40, 40, 1).
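The shape bookkeeping in steps (1) to (5) can be checked with a short plain-Python trace. It performs no computation on data; it only follows how the tensor shape changes from module to module using the parameters stated in the text. The function names and stage labels are hypothetical.

```python
def swap_dims_1_and_4(shape):
    """Shape after the Transpose that swaps dimensions 1 and 4."""
    s = list(shape)
    s[1], s[4] = s[4], s[1]
    return tuple(s)

def trace_shapes(timesteps=10, rows=40, cols=40, targetsizes=3,
                 conv3d_filters=40, convlstm_filters=32):
    """Return (stage name, tensor shape) pairs for one forward pass."""
    stages = []
    shape = (1, timesteps, rows, cols, 1)
    stages.append(("standardized input", shape))
    shape = shape[:4] + (conv3d_filters,)
    stages.append(("Conv3D", shape))
    shape = swap_dims_1_and_4(shape)
    stages.append(("attention fusion", shape))
    shape = swap_dims_1_and_4(shape)
    stages.append(("Transpose", shape))
    shape = shape[:4] + (convlstm_filters,)
    stages.append(("ConvLSTM2D encoder", shape))
    shape = (shape[0], targetsizes) + shape[2:]
    stages.append(("target-length conversion", shape))
    stages.append(("ConvLSTM2D decoder", shape))
    shape = shape[:4] + (1,)
    stages.append(("TimeDistributed(Dense)", shape))
    return stages

for name, shape in trace_shapes():
    print(f"{name:26s} {shape}")
```

Model B follows the same trace, since only the second input element changes from SLA to SSW.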
Similarly, the attention mechanism-based double-element fusion space-time prediction method provided by the application is applied with 10 days of SST and SSW as the double-element input to generate model B and predict SST for the next 3 days, with SST as element 1 and SSW as element 2. The specific process is as follows:
10 days of SST and SSW are input as the two elements, each with data shape (1, 10, 40, 40, 1).
(1) Using the dual-element input module, the input data of the two ocean elements SST and SSW are standardized; the resulting standardized data has shape (1, 10, 40, 40, 1).
(2) Using the 3D convolution module, the hidden temporal and spatial information of the 2 elements is extracted. Conv3D uses 40 filters, so the output of each of the 2 elements has data shape (1, 10, 40, 40, 40).
(3) Using the attention fusion module. First branch: the SST features first undergo a dimension permutation (Transpose) that swaps the 1st and 4th dimensions of the data, giving the hidden information H_i of SST. Second branch: the SST and SSW features pass in sequence through element fusion (Add), a ReLU activation function, the Conv3D_A unit, a sigmoid activation function and a dimension permutation (Transpose), giving the correlation feature weights between the two fused ocean elements SST and SSW. Finally, the correlation feature weights and the hidden information H_i are multiplied element by element (Multiply) to obtain the new feature X_i of SST, whose data shape is (1, 40, 40, 40, 10).
(4) Using the convolutional long short-term memory module. A dimension permutation (Transpose) first swaps the 1st and 4th dimensions of the data, giving transformed data of shape (1, 10, 40, 40, 40); encoding through 2 ConvLSTM 2D layers yields 32 groups of output results, with data shape (1, 10, 40, 40, 32); the target layer is then generated by conversion, turning the 10 in the input shape into 3, which produces a prediction feature matrix of the target length with data shape (1, 3, 40, 40, 32); finally, feature decoding is performed through 2 ConvLSTM 2D layers, and the output data shape is (1, 3, 40, 40, 32).
(5) Using the single element prediction output module, the prediction feature matrix is mapped to the output space; the prediction result is visualized through inverse normalization, and the SST prediction for the next 3 days is output with data shape (1, 3, 40, 40, 1).
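Steps (1) and (5) form a round trip: the inputs are scaled before entering the network, and the network output is scaled back to physical units. The sketch below assumes the standardization is ordinary min-max scaling over the region, which is what the [0, 1] range and the use of regional maxima and minima suggest; the function names and the sample temperatures are made up for illustration.

```python
import numpy as np

def standardize(field):
    """Scale a (rows, cols) field to [0, 1] and return the extrema needed to invert."""
    x_min, x_max = float(field.min()), float(field.max())
    # Assumes the field is not constant, otherwise x_max - x_min is zero.
    return (field - x_min) / (x_max - x_min), x_min, x_max

def inverse_standardize(x_std, x_min, x_max):
    """Map values in [0, 1] back to the original physical range."""
    return x_std * (x_max - x_min) + x_min

sst = np.array([[18.2, 19.0], [20.5, 22.1]])  # made-up SST values in °C
sst_std, lo, hi = standardize(sst)
restored = inverse_standardize(sst_std, lo, hi)

print(sst_std.min(), sst_std.max())  # 0.0 1.0
print(np.allclose(restored, sst))    # True
```

The extrema of the predicted element have to be kept from the input stage, since the inverse step needs the same x_min and x_max to recover temperatures from the model output.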
Space-time prediction assessment:
To verify the performance of the attention mechanism-based double-element fusion space-time prediction method in predicting SST for the next 3 days, the double-element fusion space-time prediction models A and B are compared with single-SST prediction models in terms of RMSE and PACC, and the strengths and weaknesses of the models are analyzed. The single-SST prediction models are: (1) model Single, which uses the same structure as the present application but with the double-element input replaced by a single SST input; (2) model LSTM; (3) an LSTM model based on the attention mechanism, referred to as model Attention.
600 prediction experiments were performed, and the RMSE and PACC were averaged over the 600 predictions; the comparison results are shown in Tables 1 and 2. Models A and B have a smaller RMSE and a higher PACC than the other models, so the attention mechanism-based double-element fusion space-time prediction method provided by the invention predicts SST with a smaller error and higher accuracy, which verifies that the method has a better prediction effect.
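As an illustration of how a per-day RMSE averaged over many experiments can be computed, the following sketch uses synthetic data only. It assumes the RMSE is taken over the spatial grid of each forecast and then averaged across experiments; the text does not state the exact averaging order, and PACC is left out because its formula is not given in this passage. The function name `rmse_per_day` is hypothetical.

```python
import numpy as np

def rmse_per_day(pred, truth):
    """Mean RMSE per lead day; inputs have shape (experiments, days, rows, cols)."""
    sq_err = (pred - truth) ** 2
    per_run = np.sqrt(sq_err.mean(axis=(2, 3)))  # RMSE of each forecast, (experiments, days)
    return per_run.mean(axis=0)                  # average over the experiments

rng = np.random.default_rng(0)
truth = rng.uniform(15.0, 25.0, size=(600, 3, 40, 40))  # synthetic SST in °C
pred = truth + rng.normal(0.0, 0.3, size=truth.shape)   # synthetic forecast error

scores = rmse_per_day(pred, truth)
print(scores.shape)  # (3,)
```

With a synthetic error of standard deviation 0.3 °C, each of the three values comes out close to 0.3, which is a quick sanity check on the reduction axes.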
Table 1 RMSE comparison results of the models
Table 2 PACC comparison results of the models
Compared with model A, model B has a smaller RMSE and a higher PACC, which shows that, when predicting SST, the attention mechanism-based double-element fusion space-time prediction method achieves a better prediction effect with SSW as the auxiliary prediction element.
We therefore conclude that: (1) the double-element fusion space-time prediction method has a smaller prediction error and higher prediction accuracy than the single element prediction method; (2) when predicting SST, SSW as the auxiliary prediction element has a more pronounced effect on prediction accuracy than SLA.
In summary, the attention mechanism-based double-element fusion space-time prediction method disclosed by the application fuses the correlated features of two elements and improves the prediction accuracy of a single element. SST for the next 1 to 3 days was predicted, and the prediction results of different methods were compared over the same experimental area. The results show that: (1) compared with the single element prediction method, the double-element fusion space-time prediction method has a smaller prediction error and higher prediction accuracy; (2) when predicting SST, SSW as the auxiliary prediction element has a more pronounced effect on prediction accuracy than SLA. Using 10 days of SST and SSW as the double-element input gives the better prediction of SST for the next 3 days: the RMSE for days 1, 2 and 3 is approximately 0.1879 °C, 0.2873 °C and 0.3516 °C, and the PACC is approximately 0.9944, 0.9913 and 0.9892, respectively.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims (8)

1. A double-element fusion space-time prediction method based on an attention mechanism is characterized by comprising the following steps:
using a dual-element input module to standardize the dual-element data, eliminating order-of-magnitude differences between the dimensions of the data and scaling the data to a standard interval;
extracting time and space information of element hiding by using a 3D convolution module;
using an attention fusion module to fuse time and space characteristics between the double elements and reallocate characteristic weights for predicted ocean elements;
capturing a time sequence dependency relationship by using a convolution long-short-term memory module, and converting to generate a prediction feature matrix of a target length;
mapping the prediction feature matrix to an output space by using a single element prediction output module, and visualizing a prediction result through inverse standardization;
wherein the double elements comprise element 1, which is SST, and element 2, which is SLA or SSW;
the implementation process of the attention fusion module comprises the following steps:
element 1 undergoes a dimension permutation Transpose, which swaps the 1st and 4th dimensions of the data to obtain the hidden information H_i of element 1;
element 2 passes in sequence through element fusion Add, a ReLU activation function, the Conv3D_A unit, a sigmoid activation function and a dimension permutation Transpose to obtain the correlation feature weights of the double elements;
the element-wise product of the correlation feature weights and the hidden information H_i is computed by position-wise multiplication Multiply to obtain the new feature X_i of element 1, whose data shape is (1, 40, rows, cols, timeteps).
2. The attention-mechanism-based bi-element fusion spatiotemporal prediction method of claim 1, wherein: the dual-element data is standardized according to the following formula:
$$x_{std} = \frac{x_i - x_{min}}{x_{max} - x_{min}} \tag{1}$$
wherein x_std is the standardized data in the range [0, 1], x_max and x_min are the maximum and minimum values of the original data within the [rows, cols] region, and x_i is the raw data at each location i within the [rows, cols] region;
the resulting standardized data has the shape (1, timeteps, rows, cols, 1), wherein timeteps represents the input history driving time length, rows represents the number of rows of data in the region, cols represents the number of columns of data in the region, and the number of samples and the number of channels are both 1.
3. The attention-mechanism-based bi-element fusion spatiotemporal prediction method of claim 1, wherein: the 3D convolution module extracts the hidden temporal and spatial information of the elements by using a Conv3D unit, and the output data has the shape (1, timeteps, rows, cols, 40).
4. The attention-mechanism-based bi-element fusion spatiotemporal prediction method of claim 1, wherein: the convolution long-short-term memory module further extracts features, captures the time sequence dependency relationship and converts the features into a prediction feature matrix of the target length.
5. The attention-mechanism-based bi-element fusion spatiotemporal prediction method of claim 4, wherein: the specific implementation process of the convolution long-short-term memory module comprises the following steps:
the new feature X_i of element 1 first undergoes a dimension permutation Transpose; features are then further extracted by encoding through 2 ConvLSTM 2D layers; a target layer is then generated by conversion, producing a prediction feature matrix of target length targetsizes; finally, feature decoding is performed through 2 ConvLSTM 2D layers, and the output data has the shape (1, targetsizes, rows, cols, 32).
6. The attention-mechanism-based bi-element fusion spatiotemporal prediction method of claim 1, wherein: the single element prediction output module maps the prediction feature matrix output by the convolution long-short-term memory module to an output space, and visualizes a prediction result through inverse standardization.
7. The attention-mechanism-based bi-element fusion spatiotemporal prediction method of claim 1, wherein: the specific implementation mode of the prediction result visualization is as follows:
using a TimeDistributed(Dense) fully connected layer to compress the extracted features and map them to the output space through a nonlinear transformation;
the prediction result is visualized through the inverse normalization process, and the targetsizes day prediction result of the element 1 is output.
8. The attention-mechanism-based bi-element fusion spatiotemporal prediction method of claim 7, wherein: the inverse normalization formula is as follows:
$$x_i = x_{std}\,(x_{max} - x_{min}) + x_{min} \tag{2}$$
wherein x_std is the standardized prediction data in the interval [0, 1] that the model maps onto the output space; x_max and x_min are the maximum and minimum values of the raw data within the [rows, cols] region; x_i is the actual predicted value at each location i within the [rows, cols] region, used for visualization.
CN202311522863.0A 2023-11-16 2023-11-16 Attention mechanism-based double-element fusion space-time prediction method Active CN117237781B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311522863.0A CN117237781B (en) 2023-11-16 2023-11-16 Attention mechanism-based double-element fusion space-time prediction method

Publications (2)

Publication Number Publication Date
CN117237781A CN117237781A (en) 2023-12-15
CN117237781B true CN117237781B (en) 2024-03-19

Family

ID=89096949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311522863.0A Active CN117237781B (en) 2023-11-16 2023-11-16 Attention mechanism-based double-element fusion space-time prediction method

Country Status (1)

Country Link
CN (1) CN117237781B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant