CN107958044A - High-dimensional sequence data prediction method and system based on a deep spatiotemporal memory network - Google Patents


Info

Publication number
CN107958044A
Authority
CN
China
Prior art keywords
gate
layer
memory
time
input
Legal status
Pending
Application number
CN201711190694.XA
Other languages
Chinese (zh)
Inventor
Long Mingsheng
Wang Jianmin
Gao Zhifeng
Wang Yunbo
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Application filed by Tsinghua University
Priority to CN201711190694.XA
Publication of CN107958044A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/28 Databases characterised by their database models, e.g. relational or object models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The present invention provides a high-dimensional sequence data prediction method and system based on a deep spatiotemporal memory network. The method includes: feeding high-dimensional sequence data into a trained predictive recurrent neural network model to obtain a prediction result. The trained predictive recurrent neural network model is obtained as follows: build the memory of any time step from a first input gate, a first forget gate and a first input modulation gate; build the memory of any layer from a second input gate, a second forget gate and a second input modulation gate; build an output gate; update the hidden state; build a spatiotemporal memory unit from the time-step memory, the layer memory and the updated hidden state; assemble the predictive recurrent neural network model from all spatiotemporal memory units, feed tensor sequence data into it for training, and obtain the trained model. The prediction result thus covers trends in both the temporal and the spatial dimension, making the prediction more accurate.

Description

High-dimensional sequence data prediction method and system based on a deep spatiotemporal memory network
Technical field
The present invention relates to the field of computer data analysis, and in particular to a high-dimensional sequence data prediction method and system based on a deep spatiotemporal memory network.
Background art
Data mining research builds on the characteristics of the data itself: it exploits the information hidden in massive data through modeling. Whether a model can fully capture the many implicit relationships in the data is therefore an important criterion for judging its quality. As beings living in both time and space, we collect data that carries both temporal and spatial structure. Precipitation data, for example, embodies not only the spatial distribution of rainfall over a region at a particular time, but also the temporal distribution of precipitation at a particular location. Analyzing such data along the spatial dimension or the temporal dimension alone inevitably causes significant information loss.
Recently, frontier data mining research has achieved breakthroughs on problems that involve only the temporal or only the spatial dimension, for example time-series analysis methods represented by recurrent neural networks, and spatial data analysis methods represented by convolutional neural networks. Methods that integrate temporal and spatial data into a single analysis, however, still fall far short of expectations. Meanwhile, the demand for spatiotemporal data analysis is enormous: real application scenarios such as weather forecasting, video classification and image prediction all need to process spatiotemporal data.
Summary of the invention
To overcome the above problems, the present invention provides a high-dimensional sequence data prediction method and system based on a deep spatiotemporal memory network.
According to one aspect of the present invention, a high-dimensional sequence data prediction method based on a deep spatiotemporal memory network is provided, including: feeding high-dimensional sequence data into a trained predictive recurrent neural network model to obtain a prediction result. The trained predictive recurrent neural network model is obtained as follows: build the memory of any time step from a first input gate, a first forget gate and a first input modulation gate; build the memory of any layer from a second input gate, a second forget gate and a second input modulation gate; build an output gate; update the hidden state from the time-step memory, the layer memory and the output gate; build a spatiotemporal memory unit from the time-step memory, the layer memory and the updated hidden state, each spatiotemporal memory unit being a long short-term memory (LSTM) network; build the predictive recurrent neural network model from all spatiotemporal memory units, the model being a two-dimensional grid indexed by time step and layer, with each spatiotemporal memory unit located at the layer of its layer memory and the time step of its time-step memory; feed tensor sequence data composed of observations into the predictive recurrent neural network model for training, obtaining the trained predictive recurrent neural network model.
Preferably, building the time-step memory from the first input gate, the first forget gate and the first input modulation gate further comprises:
obtaining the first input gate by

i_t = σ(W_xi * X_t + W_hi * H_{t-1} + b_i)

where i_t is the first input gate, σ is the sigmoid function, * is the convolution operation, X_t is the input at the current time step, H_{t-1} is the hidden state at the previous time step, W_xi and W_hi are the convolution kernels applied to X_t and H_{t-1} when computing the first input gate, and b_i is the first input gate bias;
obtaining the first forget gate by

f_t = σ(W_xf * X_t + W_hf * H_{t-1} + b_f)

where f_t is the first forget gate, W_xf and W_hf are the corresponding convolution kernels, and b_f is the first forget gate bias;
obtaining the first input modulation gate by

g_t = φ(W_xg * X_t + W_hg * H_{t-1} + b_g)

where g_t is the first input modulation gate, φ is the hyperbolic tangent function, W_xg and W_hg are the corresponding convolution kernels, and b_g is the first input modulation gate bias;
and obtaining the time-step memory from the first input gate, the first forget gate and the first input modulation gate by

C_t = f_t ⊙ C_{t-1} + i_t ⊙ g_t

where C_t is the time-step memory, ⊙ is the Hadamard product, and C_{t-1} is the time-step memory at the previous time step.
Preferably, building the layer memory from the second input gate, the second forget gate and the second input modulation gate further comprises:
obtaining the second input gate by

i'_t = σ(W'_xi * X_t + W_mi * M_t^{l-1} + b'_i)

where i'_t is the second input gate, σ is the sigmoid function, * is the convolution operation, X_t is the input at the current time step, M_t^{l-1} is the memory of the previous layer, W'_xi and W_mi are the convolution kernels applied to X_t and M_t^{l-1} when computing the second input gate, and b'_i is the second input gate bias;
obtaining the second forget gate by

f'_t = σ(W'_xf * X_t + W_mf * M_t^{l-1} + b'_f)

where f'_t is the second forget gate, W'_xf and W_mf are the corresponding convolution kernels, and b'_f is the second forget gate bias;
obtaining the second input modulation gate by

g'_t = φ(W'_xg * X_t + W_mg * M_t^{l-1} + b'_g)

where g'_t is the second input modulation gate, φ is the hyperbolic tangent function, W'_xg and W_mg are the corresponding convolution kernels, and b'_g is the second input modulation gate bias;
and obtaining the layer memory from the second input gate, the second forget gate and the second input modulation gate by

M_t^l = f'_t ⊙ M_t^{l-1} + i'_t ⊙ g'_t

where M_t^l is the layer memory.
Preferably, building the output gate further comprises obtaining it by

o_t = σ(W_xo * X_t + W_ho * H_{t-1} + W_mo * M_t^{l-1} + b_o)

where o_t is the output gate, σ is the sigmoid function, * is the convolution operation, X_t is the input at the current time step, H_{t-1} is the hidden state at the previous time step, M_t^{l-1} is the memory of the previous layer, W_xo, W_ho and W_mo are the convolution kernels applied to X_t, H_{t-1} and M_t^{l-1} when computing the output gate, and b_o is the output gate bias.
Preferably, from the time-step memory, the layer memory and the output gate, the hidden state is updated by

H_t = o_t ⊙ φ(W_{1×1} * [C_t, M_t^l])

where H_t is the hidden state, o_t is the output gate, ⊙ is the Hadamard product, φ is the hyperbolic tangent function, W_{1×1} is a convolution kernel of size 1 × 1, * is the convolution operation, C_t is the time-step memory, M_t^l is the layer memory, and [·, ·] concatenates two tensors along the channel dimension.
Preferably, feeding the tensor sequence data composed of observations into the predictive recurrent neural network model for training further comprises: S1, feeding the tensor sequence data of the first time step in the tensor sequence data composed of observations into the predictive recurrent neural network model; S2, at the first layer of the predictive recurrent neural network model, extracting corresponding information with the first-layer spatiotemporal memory unit of the first time step, the corresponding information including that unit's hidden state, its time-step memory and its layer memory, passing the hidden state and the time-step memory to the first-layer spatiotemporal memory unit of the next time step, while passing the hidden state and the layer memory to the next-layer spatiotemporal memory unit of the same time step; S3, incrementing the layer index of the extracted information by one and repeating step S2 until that layer index exceeds the stacking number, obtaining the model's prediction at the first time step, the stacking number being the total number of layers of the predictive recurrent neural network model.
Preferably, after step S3 the method further comprises: S4, feeding the tensor sequence data of the next time step in the tensor sequence data composed of observations into the predictive recurrent neural network model, and executing steps S2 and S3 in a loop until the time-step index of the extracted information is greater than or equal to the length of the past tensor sequence, obtaining predictions from every layer of the model at each processed time step; S5, once the time-step index of the extracted corresponding information is greater than or equal to the length of the past tensor sequence, comparing the current time-step index with a reference value, the reference value being the sum of the length of the past tensor sequence and the length of the future tensor sequence, wherein the length of the past tensor sequence equals the index of the previous time step and the length of the future tensor sequence is a preset value; S6, if the current time-step index is greater than or equal to the reference value, measuring the gap between the prediction and the actual result with a loss function, updating the parameters of the predictive recurrent neural network model according to the back-propagation algorithm, and executing steps S2, S3, S4 and S5 in a loop until the gap between prediction and actual result is below the gap threshold.
Preferably, step S6 further comprises: if the current time-step index is smaller than the reference value, the predictive recurrent neural network model outputs its prediction for the current time step and feeds that prediction back into the model, executing steps S2, S3, S4 and S5 in a loop until the current time-step index is greater than or equal to the reference value.
According to another aspect of the present invention, a high-dimensional sequence data prediction system based on a deep spatiotemporal memory network is provided, including: a prediction result acquisition module for feeding high-dimensional sequence data into a trained predictive recurrent neural network model to obtain a prediction result; wherein the trained predictive recurrent neural network model is obtained by the following submodules: a hidden state update submodule, which builds the time-step memory from the first input gate, the first forget gate and the first input modulation gate, builds the layer memory from the second input gate, the second forget gate and the second input modulation gate, builds the output gate, and updates the hidden state from the time-step memory, the layer memory and the output gate; a spatiotemporal memory unit submodule, which builds a spatiotemporal memory unit, a long short-term memory network, from the time-step memory, the layer memory and the updated hidden state; a model building submodule, which builds the predictive recurrent neural network model, a two-dimensional grid indexed by time step and layer in which each spatiotemporal memory unit is located at the layer of its layer memory and the time step of its time-step memory, from all spatiotemporal memory units; and a trained model acquisition submodule, which feeds tensor sequence data composed of observations into the predictive recurrent neural network model for training, obtaining the trained model.
According to a further aspect of the present invention, an electronic device for high-dimensional sequence data prediction is provided, including: a memory and a processor, the processor and the memory communicating with each other over a bus; the memory stores program instructions executable by the processor, and the processor calls those program instructions to perform the prediction method described in any of the items above.
In the high-dimensional sequence data prediction method and system based on a deep spatiotemporal memory network provided by the present invention, each spatiotemporal memory unit is designed to contain both the memory of a time step and the memory of a layer, i.e. both temporal and spatial information. Time and space are modeled in one unified memory cell, and these memory states are propagated vertically (across LSTM layers) and horizontally (along time), so the prediction covers trends in both the temporal and the spatial dimension. Compared with the prior art, which mostly analyzes only the temporal dimension, the prediction of the present invention is more accurate.
Brief description of the drawings
Fig. 1 is a flow chart of obtaining a trained predictive recurrent neural network model in an embodiment of the present invention;
Fig. 2 is a schematic diagram of building the first input gate, the first forget gate and the first input modulation gate in an embodiment of the present invention;
Fig. 3 is a schematic diagram of building the second input gate, the second forget gate and the second input modulation gate in an embodiment of the present invention;
Fig. 4 is a structural diagram of a spatiotemporal memory unit in an embodiment of the present invention;
Fig. 5 is a structural diagram of a predictive recurrent neural network model in an embodiment of the present invention;
Fig. 6 is a flow chart of obtaining a trained predictive recurrent neural network model in an embodiment of the present invention;
Fig. 7 is a flow chart of building the time-step memory, the layer memory and the spatiotemporal memory unit in an embodiment of the present invention;
Fig. 8 is a structural diagram of an electronic device for high-dimensional sequence data prediction in an embodiment of the present invention.
Detailed description
The embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following examples illustrate the present invention but do not limit its scope.
To address the current inability to learn, at the same time, the spatial correlations and the temporal trends of high-dimensional spatiotemporal data, an embodiment of the present invention provides a high-dimensional sequence data prediction method.
Fig. 1 is a flow chart of obtaining a trained predictive recurrent neural network model in an embodiment of the present invention. As shown in Fig. 1, high-dimensional sequence data is fed into the trained predictive recurrent neural network model to obtain a prediction result; the trained model is obtained as follows: the time-step memory is built from a first input gate, a first forget gate and a first input modulation gate; the layer memory is built from a second input gate, a second forget gate and a second input modulation gate; an output gate is built; the hidden state is updated from the time-step memory, the layer memory and the output gate; a spatiotemporal memory unit, which is a long short-term memory network, is built from the time-step memory, the layer memory and the updated hidden state; the predictive recurrent neural network model, a two-dimensional grid indexed by time step and layer in which each spatiotemporal memory unit is located at the layer of its layer memory and the time step of its time-step memory, is built from all spatiotemporal memory units; and tensor sequence data composed of observations is fed into the model for training, obtaining the trained predictive recurrent neural network model.
Specifically, the long short-term memory network used in this embodiment is explained below.
To process sequences in which relevant events are separated by gaps of varying length, those skilled in the art designed the recurrent neural network (RNN). An ordinary RNN, however, suffers from two problems: first, long-range dependencies are hard to capture; second, gradients vanish or explode, a problem that is especially evident when processing long sequences.
To solve these problems, those skilled in the art proposed the long short-term memory network (Long Short-Term Memory, LSTM). This RNN architecture is dedicated to fixing the vanishing and exploding gradients of RNN models. Three multiplicative gates control the activation state of a memory block: the input gate, the output gate and the forget gate. This structure allows previously input information to be preserved in the network and carried forward: new input can change the stored historical state only when the input gate opens, the stored historical state can be read out and influence the output only when the output gate opens, and the forget gate is used to clear previously stored historical information.
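As a point of reference, this standard gating can be written out as a minimal NumPy sketch (one cell step; all names are illustrative and not part of the patent):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One step of a standard LSTM cell; W maps the concatenated
    [x, h_prev] to the four stacked gate pre-activations, b is the bias."""
    z = W @ np.concatenate([x, h_prev]) + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget and output gates
    g = np.tanh(g)                                # input modulation
    c = f * c_prev + i * g                        # history kept by the forget gate plus gated new input
    h = o * np.tanh(c)                            # stored state exposed only through the output gate
    return h, c
```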
Further, the tensor sequence data composed of observations proposed in this embodiment is handled based on the following idea. High-dimensional spatiotemporal data is defined as follows: suppose P measured quantities of a dynamical system (e.g. the RGB channels) are monitored continuously, each quantity being recorded at every position of a spatial region and thus expressed as an M × N matrix. From the spatial viewpoint, the P measurements at any time step form a tensor X_t ∈ R^{P×M×N}; from the temporal viewpoint, the observations form a tensor sequence {X_1, …, X_T} over T timestamps.
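Concretely, such a sequence can be laid out as follows (an illustrative NumPy sketch of the shapes only; the values are placeholders):

```python
import numpy as np

P, M, N, T = 3, 64, 64, 20         # e.g. RGB measurements on a 64 x 64 grid, 20 timestamps
x_t = np.zeros((P, M, N))          # observations at one time step: X_t in R^{P x M x N}
sequence = np.zeros((T, P, M, N))  # the tensor sequence {X_1, ..., X_T}
print(sequence.shape)              # (20, 3, 64, 64)
```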
Further, every spatiotemporal memory unit sits at one layer and one time step, and any (time step, layer) pair corresponds to exactly one spatiotemporal memory unit.
It should be noted that every layer of the predictive recurrent neural network model spans the same number of time steps, and every time step spans the same number of layers.
In the high-dimensional sequence data prediction method provided by the present invention, each spatiotemporal memory unit is designed to contain both the memory of a time step and the memory of a layer, i.e. both temporal and spatial information. Time and space are modeled in one unified memory cell, and these memory states are propagated vertically (across LSTM layers) and horizontally (along time), so the prediction covers trends in both the temporal and the spatial dimension; compared with the prior art, which mostly analyzes only the temporal dimension, the prediction of the present invention is more accurate. Because the memory state of a spatiotemporal memory unit travels in two dimensions, vertically and horizontally, richer historical spatiotemporal information can be delivered to future time steps. And because the predictive recurrent network with spatiotemporal memory units stacks multiple spatiotemporal-LSTM layers, the model gains stronger expressive power, better suited to the complex dynamical system of spatiotemporal prediction. The present invention is well suited to video surveillance and weather forecasting.
On the basis of the above embodiment, Fig. 2 is a schematic diagram of building the first input gate, the first forget gate and the first input modulation gate in an embodiment of the present invention. As shown in Fig. 2, this embodiment further explains the building of the time-step memory; in embodiments of the present invention, the memory of a time step represents temporal memory, i.e. the memory information of the temporal dimension.
Building the time-step memory from the first input gate, the first forget gate and the first input modulation gate further comprises:
obtaining the first input gate by

i_t = σ(W_xi * X_t + W_hi * H_{t-1} + b_i)

where i_t is the first input gate, σ is the sigmoid function, * is the convolution operation, X_t is the input at the current time step, H_{t-1} is the hidden state at the previous time step, W_xi and W_hi are the convolution kernels applied to X_t and H_{t-1} when computing the first input gate, and b_i is the first input gate bias;
obtaining the first forget gate by

f_t = σ(W_xf * X_t + W_hf * H_{t-1} + b_f)

where f_t is the first forget gate, W_xf and W_hf are the corresponding convolution kernels, and b_f is the first forget gate bias;
obtaining the first input modulation gate by

g_t = φ(W_xg * X_t + W_hg * H_{t-1} + b_g)

where g_t is the first input modulation gate, φ is the hyperbolic tangent function, W_xg and W_hg are the corresponding convolution kernels, and b_g is the first input modulation gate bias;
and obtaining the time-step memory from the first input gate, the first forget gate and the first input modulation gate by

C_t = f_t ⊙ C_{t-1} + i_t ⊙ g_t

where C_t is the time-step memory, ⊙ is the Hadamard product, and C_{t-1} is the time-step memory at the previous time step.
Specifically, the first input gate decides which information from the current input and from the hidden state of the previous time step is added into the memory of the spatiotemporal memory unit. The first forget gate decides which information stored in the spatiotemporal memory unit at the previous time step is retained. The first input modulation gate integrates the current input and the output of the previous time step into one tensor.
Further, in this embodiment, the sigmoid is a common S-shaped function in biology, also known as the S-shaped growth curve. In information science, because it is monotonically increasing and so is its inverse, the sigmoid function is often used as the threshold function of neural networks, mapping variables into the interval (0, 1).
Further, convolution is a common method in image processing: given an input image, each pixel of the output image is a weighted average of the pixels in a small region of the input image, the weights being defined by a function called the convolution kernel.
In the high-dimensional sequence data prediction method provided by the present invention, building the time-step memory involves multiple convolution operations, so the spatiotemporal memory unit can reduce the dimensionality of high-dimensional sequence data; a sketch of this temporal path follows.
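A minimal PyTorch-style sketch of the temporal path, assuming odd square kernels with "same" padding and biases stored as tensors broadcastable over the feature maps (the container names w and b are illustrative):

```python
import torch
import torch.nn.functional as F

def temporal_memory(x_t, h_prev, c_prev, w, b):
    """Time-step memory C_t from the input X_t, the previous hidden state H_{t-1}
    and the previous memory C_{t-1}. w holds the kernels W_xi, W_hi, ...;
    b holds biases of shape (1, C, 1, 1)."""
    pad = w["xi"].shape[-1] // 2                       # 'same' padding for odd kernels
    conv = lambda k, v: F.conv2d(v, w[k], padding=pad)
    i_t = torch.sigmoid(conv("xi", x_t) + conv("hi", h_prev) + b["i"])  # first input gate
    f_t = torch.sigmoid(conv("xf", x_t) + conv("hf", h_prev) + b["f"])  # first forget gate
    g_t = torch.tanh(conv("xg", x_t) + conv("hg", h_prev) + b["g"])     # first input modulation gate
    return f_t * c_prev + i_t * g_t                    # C_t = f_t ⊙ C_{t-1} + i_t ⊙ g_t
```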
On the basis of the above embodiment, Fig. 3 is a schematic diagram of building the second input gate, the second forget gate and the second input modulation gate in an embodiment of the present invention. As shown in Fig. 3, this embodiment further explains the building of the layer memory; in embodiments of the present invention, the memory of a layer represents spatial memory, i.e. the memory information of the spatial dimension.
Building the layer memory from the second input gate, the second forget gate and the second input modulation gate further comprises:
obtaining the second input gate by

i'_t = σ(W'_xi * X_t + W_mi * M_t^{l-1} + b'_i)

where i'_t is the second input gate, σ is the sigmoid function, * is the convolution operation, X_t is the input at the current time step, M_t^{l-1} is the memory of the previous layer, W'_xi and W_mi are the convolution kernels applied to X_t and M_t^{l-1} when computing the second input gate, and b'_i is the second input gate bias;
obtaining the second forget gate by

f'_t = σ(W'_xf * X_t + W_mf * M_t^{l-1} + b'_f)

where f'_t is the second forget gate, W'_xf and W_mf are the corresponding convolution kernels, and b'_f is the second forget gate bias;
obtaining the second input modulation gate by

g'_t = φ(W'_xg * X_t + W_mg * M_t^{l-1} + b'_g)

where g'_t is the second input modulation gate, φ is the hyperbolic tangent function, W'_xg and W_mg are the corresponding convolution kernels, and b'_g is the second input modulation gate bias;
and obtaining the layer memory from the second input gate, the second forget gate and the second input modulation gate by

M_t^l = f'_t ⊙ M_t^{l-1} + i'_t ⊙ g'_t

where M_t^l is the layer memory.
Specifically, the second input gate decides which information from the current input and from the memory of the previous layer is added into the memory of the spatiotemporal memory unit. The second forget gate decides which information stored in the spatiotemporal memory unit by the previous layer is retained. The second input modulation gate integrates the current input and the memory of the previous layer into one tensor.
Further, the sigmoid function and the convolution operation here are as described in the previous embodiment.
In the high-dimensional sequence data prediction method provided by the present invention, building the layer memory likewise involves multiple convolution operations, so the spatiotemporal memory unit can reduce the dimensionality of high-dimensional sequence data; the spatial path mirrors the temporal one, as sketched below.
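A matching PyTorch-style sketch of the spatial path, with the previous layer's memory M_t^{l-1} in place of the previous hidden state (same illustrative conventions as above):

```python
import torch
import torch.nn.functional as F

def layer_memory(x_t, m_below, w, b):
    """Layer memory M_t^l from the input X_t and the previous layer's
    memory M_t^{l-1}; kernel and bias containers follow the temporal sketch."""
    pad = w["xi2"].shape[-1] // 2
    conv = lambda k, v: F.conv2d(v, w[k], padding=pad)
    i_t = torch.sigmoid(conv("xi2", x_t) + conv("mi", m_below) + b["i2"])  # second input gate
    f_t = torch.sigmoid(conv("xf2", x_t) + conv("mf", m_below) + b["f2"])  # second forget gate
    g_t = torch.tanh(conv("xg2", x_t) + conv("mg", m_below) + b["g2"])     # second input modulation gate
    return f_t * m_below + i_t * g_t                   # M_t^l = f'_t ⊙ M_t^{l-1} + i'_t ⊙ g'_t
```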
Based on the above embodiments, this embodiment further explains the building of the output gate.
The output gate is obtained by

o_t = σ(W_xo * X_t + W_ho * H_{t-1} + W_mo * M_t^{l-1} + b_o)

where o_t is the output gate, σ is the sigmoid function, * is the convolution operation, X_t is the input at the current time step, H_{t-1} is the hidden state at the previous time step, M_t^{l-1} is the memory of the previous layer, W_xo, W_ho and W_mo are the convolution kernels applied to X_t, H_{t-1} and M_t^{l-1} when computing the output gate, and b_o is the output gate bias.
Specifically, the sigmoid function here is as described in the previous embodiments.
In the high-dimensional sequence data prediction method provided by the present invention, building the output gate involves multiple convolution operations, so the spatiotemporal memory unit can reduce the dimensionality of high-dimensional sequence data.
Based on the above embodiments, this embodiment further explains the updating of the hidden state.
From the time-step memory, the layer memory and the output gate, the hidden state is updated by

H_t = o_t ⊙ φ(W_{1×1} * [C_t, M_t^l])

where H_t is the hidden state, o_t is the output gate, ⊙ is the Hadamard product, φ is the hyperbolic tangent function, W_{1×1} is a convolution kernel of size 1 × 1, * is the convolution operation, C_t is the time-step memory, M_t^l is the layer memory, and [·, ·] concatenates two tensors along the channel dimension.
Fig. 4 is a structural diagram of a spatiotemporal memory unit in an embodiment of the present invention. As shown in Fig. 4, the spatiotemporal memory unit is built from the time-step memory, the layer memory and the updated hidden state.
In the high-dimensional sequence data prediction method provided by the present invention, concatenating the two tensors along the channel direction enables the spatiotemporal memory unit to reduce the dimensionality of high-dimensional sequence data; a sketch of the output gate and the hidden-state update follows.
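A PyTorch-style sketch that assembles the output gate and the hidden state (the 1 × 1 kernel maps the concatenated channels back to the hidden width; container names remain illustrative):

```python
import torch
import torch.nn.functional as F

def hidden_state(x_t, h_prev, m_below, c_t, m_t, w, b):
    """Output gate o_t and updated hidden state H_t of one spatiotemporal memory unit."""
    pad = w["xo"].shape[-1] // 2
    conv = lambda k, v, p=pad: F.conv2d(v, w[k], padding=p)
    o_t = torch.sigmoid(conv("xo", x_t) + conv("ho", h_prev)
                        + conv("mo", m_below) + b["o"])  # o_t over X_t, H_{t-1}, M_t^{l-1}
    fused = torch.cat([c_t, m_t], dim=1)                 # [C_t, M_t^l] joined along channels
    return o_t * torch.tanh(conv("1x1", fused, 0))       # H_t = o_t ⊙ φ(W_1×1 * [C_t, M_t^l])
```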
Based on the above embodiments, this embodiment further explains how the trained predictive recurrent neural network model is obtained.
Fig. 5 is a structural diagram of a predictive recurrent neural network model in an embodiment of the present invention, and Fig. 6 is a flow chart of obtaining the trained predictive recurrent neural network model. This embodiment refers to Fig. 5 and Fig. 6.
S1, the tensor sequence data of the first time step in the tensor sequence data composed of observations is fed into the predictive recurrent neural network model. S2, at the first layer of the predictive recurrent neural network model, corresponding information is extracted by the first-layer spatiotemporal memory unit of the first time step, the corresponding information including that unit's hidden state, its time-step memory and its layer memory; the hidden state and the time-step memory are passed to the first-layer spatiotemporal memory unit of the next time step, while the hidden state and the layer memory are passed to the next-layer spatiotemporal memory unit of the same time step. S3, the layer index of the extracted information is incremented by one and step S2 is repeated until that layer index exceeds the stacking number, yielding the model's prediction at the first time step, the stacking number being the total number of layers of the predictive recurrent neural network model.
At this point, it should be noted that each execution of step S2 extracts exactly one set of corresponding information and passes it on to the spatiotemporal memory units of the next time step and the next layer.
Further, this embodiment extracts all the corresponding information of the first time step and makes a prediction from it, i.e. it completes the prediction of only one time step; see the sketch after this paragraph. The predictive recurrent neural network model provided by the present invention extracts information and predicts data at every subsequent time step in the same way, each time step incorporating the historical information of the previous one.
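The zigzag flow of steps S1 to S3 through a stack of L layers can be sketched as follows (illustrative Python; `cell` stands for one spatiotemporal memory unit returning (H, C, M) as derived above, and the top layer's memory is assumed to re-enter at the bottom of the next time step):

```python
def forward_one_step(x_t, cells, h_prev, c_prev, m_top):
    """One time step through L stacked spatiotemporal memory units (steps S2-S3).
    h_prev, c_prev: per-layer hidden states and time-step memories from t-1;
    m_top: layer memory handed down from the top layer at t-1."""
    m = m_top                           # layer memory enters at the bottom layer
    h, c = [], []
    inp = x_t
    for l, cell in enumerate(cells):    # climb the stack until the stacking number is reached
        h_l, c_l, m = cell(inp, h_prev[l], c_prev[l], m)  # one unit yields (H, C, M)
        h.append(h_l)
        c.append(c_l)
        inp = h_l                       # the hidden state feeds the next layer up
    return h, c, m                      # h[-1] gives the prediction at time t
```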
Based on the above embodiments, this embodiment again refers to Fig. 5 and Fig. 6. After step S3 the method further comprises: S4, the tensor sequence data of the next time step in the tensor sequence data composed of observations is fed into the predictive recurrent neural network model, and steps S2 and S3 are executed in a loop until the time-step index of the extracted information is greater than or equal to the length of the past tensor sequence, yielding predictions from every layer of the model at each processed time step. S5, once the time-step index of the extracted corresponding information is greater than or equal to the length of the past tensor sequence, the current time-step index is compared with a reference value, the reference value being the sum of the length of the past tensor sequence and the length of the future tensor sequence; the length of the past tensor sequence equals the index of the previous time step, and the length of the future tensor sequence is a preset value. S6, if the current time-step index is greater than or equal to the reference value, the gap between the prediction and the actual result is measured with a loss function, the parameters of the predictive recurrent neural network model are updated according to the back-propagation algorithm, and steps S2, S3, S4 and S5 are executed in a loop until the gap between prediction and actual result falls below the gap threshold.
Specifically, the length of the past tensor sequence equals the index of the previous time step: for the first time step it is 0, and for the N-th time step it is N-1, N being a natural number.
The length of the future tensor sequence is a preset value, i.e. the number of time steps to be predicted.
The loss function in this embodiment is preferably the MSE; the present invention is not limited thereto, however, and schemes applying other types of loss function are also protected.
The gap threshold is a preset value.
Further, in step S6, hyperparameters may be adjusted if necessary.
Further, in step S6, if the current time-step index is smaller than the reference value, the predictive recurrent neural network model outputs its prediction for the current time step and feeds that prediction back in as input, executing steps S2, S3, S4 and S5 in a loop until the current time-step index is greater than or equal to the reference value.
In the high-dimensional sequence data prediction method provided by the present invention, the parameters of the spatiotemporal memory units are updated by the back-propagation algorithm, so that the updated predictive recurrent neural network model produces better predictions, narrowing the gap between prediction and actual result; the following training-loop sketch summarizes steps S4 to S6.
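A minimal sketch of this loop, assuming a PyTorch `model` that maps one input frame to the next predicted frame while carrying its recurrent state internally (`reset_state` and all other names are illustrative, not from the patent):

```python
import torch

def train_step(model, optimizer, frames, t_past, t_future):
    """frames: tensor of shape (T, B, C, H, W) with T = t_past + t_future."""
    criterion = torch.nn.MSELoss()
    model.reset_state()                       # clear H, C, M before a new sequence
    loss, x = 0.0, frames[0]
    for t in range(1, t_past + t_future):
        pred = model(x)                       # steps S2/S3: one pass through the stack
        loss = loss + criterion(pred, frames[t])
        # S4 vs S6: feed ground truth while inside the past window,
        # feed the model's own prediction once we are predicting the future
        x = frames[t] if t < t_past else pred
    optimizer.zero_grad()
    loss.backward()                           # back-propagation updates the unit parameters
    optimizer.step()
    return loss.item()
```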
The high-dimensional sequence data prediction method provided by the present invention is further explained below, taking prediction on the Moving Mnist data set as an example. Fig. 7 is a flow chart of building the time-step memory, the layer memory and the spatiotemporal memory unit in an embodiment of the present invention. This embodiment refers to Fig. 7.
The data is preprocessed first. Each sample of the Moving Mnist data set consists of 20 frames, 10 used as input and 10 to be predicted, each frame of size 64 × 64; its content is two random Mnist handwritten digits moving inside the picture. Each frame yields a tensor X ∈ R^{1×64×64}; from the temporal viewpoint, the observations form a tensor sequence of 20 timestamps.
The time-step memory is built next: the first input gate, the first forget gate and the first input modulation gate are obtained by

i_t = σ(W_xi * X_t + W_hi * H_{t-1} + b_i)
f_t = σ(W_xf * X_t + W_hf * H_{t-1} + b_f)
g_t = φ(W_xg * X_t + W_hg * H_{t-1} + b_g)

with the symbols as defined in the above embodiments; each convolution kernel W may be set to size 3 × 3 with 128 feature maps. From these gates the time-step memory is obtained by

C_t = f_t ⊙ C_{t-1} + i_t ⊙ g_t.
The layer memory is built next: the second input gate, the second forget gate and the second input modulation gate are obtained by

i'_t = σ(W'_xi * X_t + W_mi * M_t^{l-1} + b'_i)
f'_t = σ(W'_xf * X_t + W_mf * M_t^{l-1} + b'_f)
g'_t = φ(W'_xg * X_t + W_mg * M_t^{l-1} + b'_g)

again with 3 × 3 convolution kernels and 128 feature maps. From these gates the layer memory is obtained by

M_t^l = f'_t ⊙ M_t^{l-1} + i'_t ⊙ g'_t.
The output gate is built next:

o_t = σ(W_xo * X_t + W_ho * H_{t-1} + W_mo * M_t^{l-1} + b_o)

with 3 × 3 convolution kernels and 128 feature maps.
The hidden state is then updated by

H_t = o_t ⊙ φ(W_{1×1} * [C_t, M_t^l])

with 128 feature maps, [·, ·] concatenating the two tensors along the channel direction.
The trained predictive recurrent neural network model is then obtained by the method of the above embodiments. The loss function is MSE, and the model is trained with the Adam algorithm as the back-propagation optimizer at a training speed of 0.001 until the MSE falls below the gap threshold, computing 16 sequences at a time.
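Under the hyperparameters stated here, the setup might be instantiated as follows (a hedged sketch; `PredictiveRNN` is an illustrative wrapper around the spatiotemporal memory units sketched earlier, not an API defined by the patent, and the stacking number is an assumed value):

```python
import torch

# Hypothetical wrapper around stacked spatiotemporal memory units.
model = PredictiveRNN(
    in_channels=1,        # Moving Mnist frames are 1 x 64 x 64
    hidden_channels=128,  # 128 feature maps per gate
    kernel_size=3,        # 3 x 3 convolution kernels
    num_layers=4,         # stacking number (an assumed value)
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # training speed 0.001
criterion = torch.nn.MSELoss()
batch_size = 16           # 16 sequences computed at a time
```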
Finally, the trained model is tested on Moving Mnist data not used for training; if it achieves a good result, i.e. the prediction is accurate, the trained predictive recurrent neural network model is saved for later use in concrete application scenarios.
Based on the above embodiments, another embodiment of the present invention discloses a high-dimensional sequence data prediction system based on a deep spatiotemporal memory network, the system including: a prediction result acquisition module for feeding high-dimensional sequence data into a trained predictive recurrent neural network model to obtain a prediction result; wherein the trained predictive recurrent neural network model is obtained by the following submodules: a hidden state update submodule, which builds the time-step memory from the first input gate, the first forget gate and the first input modulation gate, builds the layer memory from the second input gate, the second forget gate and the second input modulation gate, builds the output gate, and updates the hidden state from the time-step memory, the layer memory and the output gate; a spatiotemporal memory unit submodule, which builds a spatiotemporal memory unit, a long short-term memory network, from the time-step memory, the layer memory and the updated hidden state; a model building submodule, which builds the predictive recurrent neural network model, a two-dimensional grid indexed by time step and layer in which each spatiotemporal memory unit is located at the layer of its layer memory and the time step of its time-step memory, from all spatiotemporal memory units; and a trained model acquisition submodule, which feeds tensor sequence data composed of observations into the predictive recurrent neural network model for training, obtaining the trained model.
Based on the above embodiments, Fig. 8 is a structural diagram of an electronic device for high-dimensional sequence data prediction in an embodiment of the present invention. As shown in Fig. 8, another embodiment of the present invention discloses an electronic device for high-dimensional sequence data prediction, including: a memory and a processor, the processor and the memory communicating with each other over a bus; the memory stores program instructions executable by the processor, and the processor calls those program instructions to perform the prediction method described in any of the above embodiments, for example: feeding high-dimensional sequence data into a trained predictive recurrent neural network model to obtain a prediction result, the trained model being obtained by building the time-step memory, the layer memory, the output gate and the hidden state, building spatiotemporal memory units from them, assembling the predictive recurrent neural network model, a two-dimensional grid indexed by time step and layer, from all spatiotemporal memory units, and training it on tensor sequence data composed of observations.
In the high-dimensional sequence data prediction method and system based on a deep spatiotemporal memory network provided by the present invention, each spatiotemporal memory unit is designed to contain both the memory of a time step and the memory of a layer, i.e. both temporal and spatial information. Time and space are modeled in one unified memory cell, and these memory states are propagated vertically (across LSTM layers) and horizontally (along time), so the prediction covers trends in both the temporal and the spatial dimension; compared with the prior art, which mostly analyzes only the temporal dimension, the prediction of the present invention is more accurate. The memory state of a spatiotemporal memory unit travels in two dimensions, vertically and horizontally, so richer historical spatiotemporal information can be delivered to future time steps. Because the predictive recurrent network with spatiotemporal memory units stacks multiple spatiotemporal-LSTM layers, the model gains stronger expressive power, better suited to the complex dynamical system of spatiotemporal prediction. The present invention is well suited to video surveillance and weather forecasting. It simultaneously captures dynamic information in time and coarse- to fine-grained visual information in space. The predictive recurrent neural network model with spatiotemporal memory units can be applied to spatiotemporal sequence prediction problems, where it shows excellent performance. Moreover, whereas a traditional recurrent neural network allocates an independent memory block for each stacked layer, the spatiotemporal memory units of different layers in the present invention share one unified memory pool, saving memory overhead. The high-dimensional sequence data prediction method provided by the present invention occupies little memory during training, converges in a short time, fully captures the features in spatiotemporal data, predicts the evolution of future spatiotemporal sequences with high accuracy, and runs in real time.
Finally, the above methods of the present invention are only preferred embodiments and are not intended to limit the scope of the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (10)

  1. A high-dimensional sequence data prediction method, characterised by comprising:
    feeding high-dimensional sequence data into a trained predictive recurrent neural network model to obtain a prediction result;
    wherein the trained predictive recurrent neural network model is obtained as follows:
    building the memory of any time step from a first input gate, a first forget gate and a first input modulation gate; building the memory of any layer from a second input gate, a second forget gate and a second input modulation gate; building an output gate; updating the hidden state from the time-step memory, the layer memory and the output gate;
    building a spatiotemporal memory unit from the time-step memory, the layer memory and the updated hidden state, the spatiotemporal memory unit being a long short-term memory network;
    building the predictive recurrent neural network model from all spatiotemporal memory units, wherein the predictive recurrent neural network model is a two-dimensional grid indexed by time step and layer, and each spatiotemporal memory unit is located at the layer of its layer memory and the time step of its time-step memory;
    feeding tensor sequence data composed of observations into the predictive recurrent neural network model for training, obtaining the trained predictive recurrent neural network model.
  2. 2. Forecasting Methodology according to claim 1, it is characterised in that it is described according to the first input gate, first forget door and First input modulation door is built any instant memory and is further comprised:
    First input gate is obtained by the following formula:
    Wherein, itFor the first input gate, σ is S type functions Sigmoid, WxiFor calculate the first input gate when and xtDo convolution operation Convolution kernel, * are convolution operation, xtFor the input quantity of any instant, WhiFor calculate the first input gate andDo convolution operation Convolution kernel,For the hidden state of the last moment of any instant, biFor the first input gate deviation;
    obtaining the first forget gate by the following formula:
    f_t = σ(W_xf * x_t + W_hf * H_{t-1}^l + b_f)
    wherein f_t is the first forget gate, σ is the sigmoid function, W_xf is the convolution kernel applied to x_t when computing the first forget gate, * is the convolution operation, x_t is the input at the time instant, W_hf is the convolution kernel applied to H_{t-1}^l when computing the first forget gate, H_{t-1}^l is the hidden state at the previous time instant, and b_f is the first forget gate bias;
    obtaining the first input modulation gate by the following formula:
    g_t = φ(W_xg * x_t + W_hg * H_{t-1}^l + b_g)
    wherein g_t is the first input modulation gate, φ is the hyperbolic tangent function, W_xg is the convolution kernel applied to x_t when computing the first input modulation gate, * is the convolution operation, x_t is the input at the time instant, W_hg is the convolution kernel applied to H_{t-1}^l when computing the first input modulation gate, H_{t-1}^l is the hidden state at the previous time instant, and b_g is the first input modulation gate bias;
    obtaining the memory for the time instant by the following formula according to the first input gate, the first forget gate and the first input modulation gate:
    C_t^l = f_t ⊙ C_{t-1}^l + i_t ⊙ g_t
    wherein C_t^l is the memory for the time instant, f_t is the first forget gate, ⊙ is the Hadamard product, C_{t-1}^l is the memory at the previous time instant, i_t is the first input gate, and g_t is the first input modulation gate (an illustrative code sketch of this update follows this claim).
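Viewed as an update rule, claim 2 is a convolutional-LSTM-style temporal memory: three gates computed from x_t and H_{t-1}^l control how C_{t-1}^l is carried forward. Below is a minimal sketch assuming PyTorch; the function name temporal_memory, the dict-based parameter passing, and the bias shapes are illustrative assumptions, not the claimed implementation.

import torch
import torch.nn.functional as F

def temporal_memory(x_t, h_prev, c_prev, w, b):
    """Compute C_t^l from x_t, H_{t-1}^l and C_{t-1}^l (claim-2 sketch).

    w maps gate names to conv kernels (w['xi'] plays the role of W_xi, etc.);
    b maps gate names to biases shaped for broadcasting, e.g. (1, C, 1, 1).
    """
    # "same"-size convolution for odd kernel widths, as the gates require
    conv = lambda k, z: F.conv2d(z, k, padding=k.shape[-1] // 2)
    i_t = torch.sigmoid(conv(w['xi'], x_t) + conv(w['hi'], h_prev) + b['i'])
    f_t = torch.sigmoid(conv(w['xf'], x_t) + conv(w['hf'], h_prev) + b['f'])
    g_t = torch.tanh(conv(w['xg'], x_t) + conv(w['hg'], h_prev) + b['g'])
    return f_t * c_prev + i_t * g_t   # Hadamard products, as in the claim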
  3. The prediction method according to claim 1, characterized in that constructing the memory for any layer according to the second input gate, the second forget gate and the second input modulation gate further comprises:
    obtaining the second input gate by the following formula:
    i'_t = σ(W'_xi * x_t + W_mi * M_t^{l-1} + b'_i)
    wherein i'_t is the second input gate, σ is the sigmoid function, W'_xi is the convolution kernel applied to x_t when computing the second input gate, * is the convolution operation, x_t is the input at the time instant, W_mi is the convolution kernel applied to M_t^{l-1} when computing the second input gate, M_t^{l-1} is the layer memory of the previous layer, and b'_i is the second input gate bias;
    obtaining the second forget gate by the following formula:
    f'_t = σ(W'_xf * x_t + W_mf * M_t^{l-1} + b'_f)
    wherein f'_t is the second forget gate, σ is the sigmoid function, W'_xf is the convolution kernel applied to x_t when computing the second forget gate, * is the convolution operation, x_t is the input at the time instant, W_mf is the convolution kernel applied to M_t^{l-1} when computing the second forget gate, M_t^{l-1} is the layer memory of the previous layer, and b'_f is the second forget gate bias;
    obtaining the second input modulation gate by the following formula:
    g'_t = φ(W'_xg * x_t + W_mg * M_t^{l-1} + b'_g)
    wherein g'_t is the second input modulation gate, φ is the hyperbolic tangent function (consistent with the first input modulation gate of claim 2), W'_xg is the convolution kernel applied to x_t when computing the second input modulation gate, * is the convolution operation, x_t is the input at the time instant, W_mg is the convolution kernel applied to M_t^{l-1} when computing the second input modulation gate, M_t^{l-1} is the layer memory of the previous layer, and b'_g is the second input modulation gate bias;
    obtaining the memory for the layer by the following formula according to the second input gate, the second forget gate and the second input modulation gate:
    M_t^l = f'_t ⊙ M_t^{l-1} + i'_t ⊙ g'_t
    wherein M_t^l is the memory for the layer, f'_t is the second forget gate, ⊙ is the Hadamard product, M_t^{l-1} is the layer memory of the previous layer, i'_t is the second input gate, and g'_t is the second input modulation gate (an illustrative code sketch of this update follows this claim).
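Claim 3 mirrors claim 2, but conditions on the layer memory M_t^{l-1} handed up from the layer below instead of on the previous hidden state. A minimal sketch under the same assumptions; the primed kernels and biases of the claim (W'_xi, b'_i, ...) are stored under hypothetical keys such as 'xi2'.

import torch
import torch.nn.functional as F

def layer_memory(x_t, m_below, w, b):
    """Compute M_t^l from x_t and the previous layer's memory M_t^{l-1} (claim-3 sketch)."""
    conv = lambda k, z: F.conv2d(z, k, padding=k.shape[-1] // 2)
    i2 = torch.sigmoid(conv(w['xi2'], x_t) + conv(w['mi'], m_below) + b['i2'])
    f2 = torch.sigmoid(conv(w['xf2'], x_t) + conv(w['mf'], m_below) + b['f2'])
    g2 = torch.tanh(conv(w['xg2'], x_t) + conv(w['mg'], m_below) + b['g2'])
    return f2 * m_below + i2 * g2   # M_t^l, the layer memory passed upward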
  4. The prediction method according to claim 1, characterized in that constructing any output gate further comprises:
    obtaining the output gate by the following formula:
    o_t = σ(W_xo * x_t + W_ho * H_{t-1}^l + W_mo * M_t^{l-1} + b_o)
    wherein o_t is the output gate, σ is the sigmoid function, W_xo is the convolution kernel applied to x_t when computing the output gate, * is the convolution operation, x_t is the input at the time instant, W_ho is the convolution kernel applied to H_{t-1}^l when computing the output gate, H_{t-1}^l is the hidden state at the previous time instant, W_mo is the convolution kernel applied to M_t^{l-1} when computing the output gate, M_t^{l-1} is the layer memory of the previous layer, and b_o is the output gate bias.
  5. The prediction method according to claim 1, characterized in that any hidden state is updated by the following formula according to the memory for the time instant, the memory for the layer and the output gate:
    H_t^l = o_t ⊙ φ(W_{1×1} * [C_t^l, M_t^l])
    wherein H_t^l is the hidden state, o_t is the output gate, ⊙ is the Hadamard product, φ is the hyperbolic tangent function, W_{1×1} is a convolution kernel of size 1 × 1, * is the convolution operation, C_t^l is the memory for the time instant, M_t^l is the memory for the layer, and [·,·] denotes concatenation of two tensors along the channel dimension (an illustrative code sketch of claims 4 and 5 follows this claim).
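Claims 4 and 5 close the unit: an output gate computed from x_t, H_{t-1}^l and M_t^{l-1} modulates a tanh of the channel-wise concatenation [C_t^l, M_t^l] compressed by a 1 × 1 convolution. A minimal sketch under the same assumed PyTorch conventions; names and parameter layout remain illustrative assumptions.

import torch
import torch.nn.functional as F

def hidden_state(x_t, h_prev, m_below, c_t, m_t, w, b):
    """Compute o_t, then H_t^l from C_t^l and M_t^l (claims 4-5 sketch)."""
    conv = lambda k, z: F.conv2d(z, k, padding=k.shape[-1] // 2)
    o_t = torch.sigmoid(conv(w['xo'], x_t) + conv(w['ho'], h_prev)
                        + conv(w['mo'], m_below) + b['o'])
    fused = torch.cat([c_t, m_t], dim=1)               # [C_t^l, M_t^l] along channels
    return o_t * torch.tanh(F.conv2d(fused, w['1x1']))  # 1x1 kernel needs no padding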
  6. The prediction method according to claim 1, characterized in that inputting the tensor sequence data composed of observations into the predictive recurrent neural network model for training, to obtain the trained predictive recurrent neural network model, further comprises:
    S1, inputting the tensor sequence data of the first time instant, among the tensor sequence data composed of observations, into the predictive recurrent neural network model;
    S2, at the first layer of the predictive recurrent neural network model, extracting corresponding information through the first-layer spatiotemporal memory unit of the first time instant, the corresponding information comprising the hidden state corresponding to that unit, the memory for the first time instant, and the memory for the first layer; transferring the hidden state and the memory for the first time instant to the first-layer spatiotemporal memory unit of the next time instant, while transferring the hidden state and the memory for the first layer to the spatiotemporal memory unit of the next layer at the first time instant;
    S3, increasing by one the layer from which the corresponding information is extracted, and repeating step S2 until that layer exceeds the stacking number, thereby obtaining the prediction result of the predictive recurrent neural network model at the first time instant, the stacking number being the total number of layers of the predictive recurrent neural network model (an illustrative code sketch of this propagation follows this claim).
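Read as code, steps S1-S3 propagate one frame up the stack at a single time instant: the hidden state feeds the layer above, the layer memory M is handed upward within the same time instant, and the per-layer hidden states and time memories are carried forward to the next time instant. The sketch below composes the gate functions from the sketches after claims 2, 3 and 5; the composition st_lstm_cell and the parameter layout are assumptions, not the claimed implementation.

def st_lstm_cell(x, h_prev, c_prev, m_below, p):
    """One spatiotemporal memory unit at grid point (t, l): composes the
    claim 2-5 updates (temporal_memory, layer_memory, hidden_state above)."""
    c_t = temporal_memory(x, h_prev, c_prev, p['w'], p['b'])
    m_t = layer_memory(x, m_below, p['w'], p['b'])
    h_t = hidden_state(x, h_prev, m_below, c_t, m_t, p['w'], p['b'])
    return h_t, c_t, m_t

def step(frame, h, c, m, params):
    """Claim-6 steps S1-S3: propagate one frame up through all stacked layers.
    h, c are per-layer lists carried to the next time instant; m is handed
    from each layer to the one above within the same time instant."""
    x = frame
    for l in range(len(params)):     # S3: climb until the stacking number
        h[l], c[l], m = st_lstm_cell(x, h[l], c[l], m, params[l])
        x = h[l]                     # S2: the hidden state feeds the next layer
    return h, c, m                   # the top layer's h yields this step's prediction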
  7. The prediction method according to claim 6, characterized in that, after step S3, the method further comprises:
    S4, inputting the tensor sequence data of the next time instant, among the tensor sequence data composed of observations, into the predictive recurrent neural network model, and performing steps S2 and S3 in a loop until the time-instant index of the extracted information is greater than or equal to the length of the past tensor sequence data, obtaining the prediction result of each layer of the predictive recurrent neural network model at every time instant for which corresponding information has been extracted;
    S5, when it is judged that the time-instant index of the extracted corresponding information is greater than or equal to the length of the past tensor sequence data, comparing the magnitude of the current time-instant index with a comparison value, the comparison value being the sum of the length of the past tensor sequence data and the length of the future tensor sequence data; wherein the length of the past tensor sequence data is the value of the time-instant index immediately preceding the current one, and the length of the future tensor sequence data is a set value;
    S6, if the current time-instant index is greater than or equal to the comparison value, obtaining the gap between the prediction result and the actual result through a loss function, updating the predictive recurrent neural network model according to the back-propagation algorithm, and performing steps S2, S3, S4 and S5 in a loop until the gap between the prediction result and the actual result is less than a gap threshold.
  8. The prediction method according to claim 7, characterized in that step S6 further comprises:
    if the current time-instant index is less than the comparison value, outputting, by the predictive recurrent neural network model, the prediction result for the current time instant, feeding that prediction result back into the predictive recurrent neural network model as input, and performing steps S2, S3, S4 and S5 in a loop until the current time-instant index is greater than or equal to the comparison value (an illustrative training-loop sketch follows this claim).
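Claims 7 and 8 together describe the training schedule: observations are fed in for the length of the past sequence, the model's own outputs are fed back until past-plus-future steps have been produced, and the gap to the actual results then drives back-propagation. A minimal sketch under assumed interfaces; model.init_state, model.step, model.readout and the mean-squared-error loss are illustrative assumptions (the claims do not fix a particular loss function).

import torch
import torch.nn.functional as F

def train_step(model, frames, t_past, t_future, optimizer):
    """One claim 7-8 style training iteration on a list of frame tensors."""
    h, c, m = model.init_state(frames[0])       # assumed state initializer
    preds, x = [], frames[0]
    for t in range(1, t_past + t_future):       # S4: loop over time instants
        h, c, m = model.step(x, h, c, m)        # S2/S3 inside one time instant
        y = model.readout(h[-1])                # prediction for time instant t
        preds.append(y)
        # claim 8: feed the observation while it exists, else the prediction
        x = frames[t] if t < t_past else y
    # S6: gap between predictions and actual results, then back-propagation
    loss = F.mse_loss(torch.stack(preds),
                      torch.stack(frames[1:t_past + t_future]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()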
  9. A high-dimensional sequence data prediction system, characterized by comprising:
    a prediction result acquisition module, configured to input high-dimensional sequence data into a trained predictive recurrent neural network model to obtain a prediction result;
    wherein the trained predictive recurrent neural network model is obtained through the following submodules:
    a hidden state updating submodule, configured to construct the memory for any time instant according to the first input gate, the first forget gate and the first input modulation gate; construct the memory for any layer according to the second input gate, the second forget gate and the second input modulation gate; construct any output gate; and update any hidden state according to the time-instant memory, the layer memory and the output gate;
    a spatiotemporal memory unit construction submodule, configured to construct any spatiotemporal memory unit based on the time-instant memory, the layer memory and the updated hidden state, each spatiotemporal memory unit being a long short-term memory (LSTM) network;
    a model construction submodule, configured to construct the predictive recurrent neural network model from all the spatiotemporal memory units, wherein the predictive recurrent neural network model is a two-dimensional model with time instant and layer as its dimensions, and each spatiotemporal memory unit is located at the layer corresponding to its layer memory and at the time instant corresponding to its time-instant memory;
    a trained model acquisition submodule, configured to input tensor sequence data composed of observations into the predictive recurrent neural network model for training, to obtain the trained predictive recurrent neural network model.
  10. An electronic device for high-dimensional sequence data prediction, characterized by comprising:
    a memory and a processor, the processor and the memory completing communication with each other through a bus; the memory storing program instructions executable by the processor, the processor invoking the program instructions to perform the prediction method of any one of claims 1 to 8.
CN201711190694.XA 2017-11-24 2017-11-24 Higher-dimension sequence data Forecasting Methodology and system based on depth space-time memory network Pending CN107958044A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711190694.XA CN107958044A (en) 2017-11-24 2017-11-24 Higher-dimension sequence data Forecasting Methodology and system based on depth space-time memory network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711190694.XA CN107958044A (en) 2017-11-24 2017-11-24 Higher-dimension sequence data Forecasting Methodology and system based on depth space-time memory network

Publications (1)

Publication Number Publication Date
CN107958044A true CN107958044A (en) 2018-04-24

Family

ID=61962436

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711190694.XA Pending CN107958044A (en) 2017-11-24 2017-11-24 Higher-dimension sequence data Forecasting Methodology and system based on depth space-time memory network

Country Status (1)

Country Link
CN (1) CN107958044A (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108427989B (en) * 2018-06-12 2019-10-11 中国人民解放军国防科技大学 Deep space-time prediction neural network training method for radar echo extrapolation
CN108427989A (en) * 2018-06-12 2018-08-21 中国人民解放军国防科技大学 Deep space-time prediction neural network training method for radar echo extrapolation
CN111163690A (en) * 2018-09-04 2020-05-15 深圳先进技术研究院 Arrhythmia detection method and device, electronic equipment and computer storage medium
CN111050219A (en) * 2018-10-12 2020-04-21 奥多比公司 Spatio-temporal memory network for locating target objects in video content
CN109829631A (en) * 2019-01-14 2019-05-31 北京中兴通网络科技股份有限公司 A kind of business risk early warning analysis method and system based on memory network
CN109829631B (en) * 2019-01-14 2020-10-09 北京中兴通网络科技股份有限公司 Enterprise risk early warning analysis method and system based on memory network
CN110415521A (en) * 2019-07-31 2019-11-05 京东城市(北京)数字科技有限公司 Prediction technique, device and the computer readable storage medium of traffic data
CN110718304A (en) * 2019-10-10 2020-01-21 电子科技大学 Method for monitoring medication compliance of AIDS patient
CN110929559A (en) * 2019-10-10 2020-03-27 北京理工大学 Method and system for storing space-time sequence data and extracting features
CN111046740A (en) * 2019-11-17 2020-04-21 杭州电子科技大学 Classification method for human motion video based on full-scale quantitative recurrent neural network
CN111046740B (en) * 2019-11-17 2023-05-19 杭州电子科技大学 Classification method for human action video based on full tensor cyclic neural network
CN111143691A (en) * 2019-12-31 2020-05-12 四川长虹电器股份有限公司 Joint information extraction method and device
CN111143691B (en) * 2019-12-31 2023-04-18 四川长虹电器股份有限公司 Joint information extraction method and device
CN111797979A (en) * 2020-07-17 2020-10-20 中国人民解放军海军工程大学 Vibration transmission system based on LSTM model
CN112200198A (en) * 2020-07-31 2021-01-08 厦门星宸科技有限公司 Target data feature extraction method and device and storage medium
CN112200198B (en) * 2020-07-31 2023-11-24 星宸科技股份有限公司 Target data feature extraction method, device and storage medium
CN112001482A (en) * 2020-08-14 2020-11-27 佳都新太科技股份有限公司 Vibration prediction and model training method and device, computer equipment and storage medium
WO2022116056A1 (en) * 2020-12-01 2022-06-09 深圳先进技术研究院 Training method and training apparatus for continuous motion information prediction model, and computer-readable storage medium
CN112887239B (en) * 2021-02-15 2022-04-26 青岛科技大学 Method for rapidly and accurately identifying underwater sound signal modulation mode based on deep hybrid neural network
CN112887239A (en) * 2021-02-15 2021-06-01 青岛科技大学 Method for rapidly and accurately identifying underwater sound signal modulation mode based on deep hybrid neural network
CN113936742A (en) * 2021-09-14 2022-01-14 上海中科新生命生物科技有限公司 Peptide spectrum retention time prediction method and system based on mass spectrometry
CN114862906A (en) * 2022-04-11 2022-08-05 中山大学 Visible light positioning and tracking method based on bidirectional cyclic convolution neural network
CN116098595A (en) * 2023-01-16 2023-05-12 广东海纳医疗科技有限公司 System and method for monitoring and preventing sudden cardiac death and sudden cerebral death
CN116098595B (en) * 2023-01-16 2023-09-05 广东海纳医疗科技有限公司 System and method for monitoring and preventing sudden cardiac death and sudden cerebral death

Similar Documents

Publication Publication Date Title
CN107958044A (en) Higher-dimension sequence data Forecasting Methodology and system based on depth space-time memory network
CN107992938B (en) Space-time big data prediction technique and system based on positive and negative convolutional neural networks
CN107748942B (en) Radar Echo Extrapolation prediction technique and system based on velocity field sensing network
CN109886358B (en) Human behavior recognition method based on multi-time-space information fusion convolutional neural network
CN106157319B (en) The conspicuousness detection method in region and Pixel-level fusion based on convolutional neural networks
CN107092870B (en) A kind of high resolution image Semantic features extraction method
CN107122736A (en) A kind of human body based on deep learning is towards Forecasting Methodology and device
CN109410242A (en) Method for tracking target, system, equipment and medium based on double-current convolutional neural networks
CN110490202A (en) Detection model training method, device, computer equipment and storage medium
CN112052787A (en) Target detection method and device based on artificial intelligence and electronic equipment
CN107844753A (en) Pedestrian in video image recognition methods, device, storage medium and processor again
CN107563422A (en) A kind of polarization SAR sorting technique based on semi-supervised convolutional neural networks
CN103226708B (en) A kind of multi-model fusion video hand division method based on Kinect
CN112132023A (en) Crowd counting method based on multi-scale context enhanced network
CN107463919A (en) A kind of method that human facial expression recognition is carried out based on depth 3D convolutional neural networks
CN108961675A (en) Fall detection method based on convolutional neural networks
CN108846473A (en) Light field depth estimation method based on direction and dimension self-adaption convolutional neural networks
CN112085738B (en) Image segmentation method based on generation countermeasure network
CN110334589A (en) A kind of action identification method of the high timing 3D neural network based on empty convolution
CN109766995A (en) The compression method and device of deep neural network
CN110222760A (en) A kind of fast image processing method based on winograd algorithm
CN109886356A (en) A kind of target tracking method based on three branch's neural networks
CN109740656A (en) A kind of ore method for separating based on convolutional neural networks
CN110163196A (en) Notable feature detection method and device
CN113837461B (en) Ship track prediction method based on LSTM network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20180424)