CN108108475B - Time series prediction method based on a deep restricted Boltzmann machine - Google Patents

Info

Publication number: CN108108475B
Application number: CN201810004236.0A
Authority: CN (China)
Other versions: CN108108475A (Chinese)
Inventors: Ma Qianli (马千里), Qu Yiru (曲怡茹)
Assignee: South China University of Technology (SCUT)
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 - Information retrieval of structured data, e.g. relational data
    • G06F 16/24 - Querying
    • G06F 16/245 - Query processing
    • G06F 16/2458 - Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F 16/2465 - Query processing support for facilitating data mining operations in structured databases
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks


Abstract

The invention discloses a time series prediction method based on a deep restricted Boltzmann machine, comprising a training process and a testing process. The training process comprises: reconstructing the network structure, introducing a structured masking matrix, updating the parameters of the structured spike-and-slab recurrent temporal restricted Boltzmann machine of each layer, and feeding the hidden units of each trained layer to the next layer as input. The testing process comprises: initializing each layer under the trained model and predicting layer by layer from the top layer, where the hidden-unit values of each lower layer are determined by the visible units of the layer above, and the visible-unit values of the first layer are the predictions of the model. The effectiveness of the method is verified by experiments, and the experimental results show that the method improves prediction on motion-sequence data.

Description

Time series prediction method based on a deep restricted Boltzmann machine
Technical Field
The invention relates to the technical field of time series data mining, and in particular to a time series prediction method based on a deep restricted Boltzmann machine.
Background
An important problem in data mining is the basic problem of studying the relationship between data and the time dimension, i.e. time series data. Time series data are everywhere: a segment of action video, a weather record, a stock index sequence, and so on. Common models include hidden Markov models and Bayesian models, but these models have difficulty capturing dependencies in the data over long time ranges. Another class of temporal models combines recurrent neural networks (RNNs) with variants of the restricted Boltzmann machine (RBM). Boltzmann machines (BMs) are powerful tools for highly complex, high-dimensional data such as time series and can describe high-order interactions between variables. The restricted Boltzmann machine (RBM) further simplifies the structure, broadening the applicability of Boltzmann machines and allowing the model to better capture long-term dependencies. The temporal restricted Boltzmann machine (TRBM) is a directed graphical model consisting of a series of restricted Boltzmann machines, but exact inference in such temporal Boltzmann machines is not easy: each Gibbs sampling update has exponential cost.
For this reason, Ilya Sutskever, Geoffrey Hinton et al. improved the original temporal restricted Boltzmann machine into the recurrent temporal restricted Boltzmann machine (RTRBM), which learns the information transferred through the connections between hidden units better than the original TRBM. The study of the RTRBM by Roni Mittelman et al. showed that the model assumes all visible units and hidden units are fully connected, i.e. it ignores the dependency structure between units and cannot identify important dependencies; yet for a given data set, learning the dependency structure lets the model learn the patterns better and improves predictive power. Roni Mittelman et al. therefore proposed the SRTRBM, a model that learns the temporal restricted Boltzmann machine parameters in a structured way: graphical-model principles are used to model the data set's dependencies with a graph topology, a masking matrix is constructed from the graph, and a logistic function replaces the masking matrix so that the graph structure and the parameters are learned jointly. This better reveals the latent structure and patterns of the data set and improves the prediction.
Disclosure of Invention
The invention aims to overcome the above defects in the prior art and provides a time series prediction method based on a deep restricted Boltzmann machine.
The purpose of the invention can be achieved by adopting the following technical scheme:
A time series prediction method based on a deep restricted Boltzmann machine, implemented on the basis of the structured spike-and-slab recurrent temporal restricted Boltzmann machine. Before the structuring is added, the per-time-step probability distribution and energy function of the spike-and-slab recurrent temporal restricted Boltzmann machine are expressed as follows:
P(v_t, h_t | r_{t−1}) = (1/Z) Σ_{s_t} exp( −E(v_t, s_t, h_t | r_{t−1}) )

E(v_t, s_t, h_t | r_{t−1}) = Σ_i [ (α/2) s_{t,i}² − s_{t,i} h_{t,i} (W_{i,·} v_t + α μ_i) + (α/2) μ_i² h_{t,i} ] + (1/2) v_tᵀ ( diag(λ) + Σ_i h_{t,i} diag(Φ_{i,·}) ) v_t − (b + U r_{t−1})ᵀ h_t

In the above formulas, the hidden bias of time step t is b + U r_{t−1} (the first time step uses b_init instead), and the recurrent hidden input is the expected hidden state:

r_{t,i} = σ( (1/2) α⁻¹ (W_{i,·} v_t)² + μ_i W_{i,·} v_t − (1/2) v_tᵀ diag(Φ_{i,·}) v_t + b_i + U_{i,·} r_{t−1} )

(the original renders these equations only as images; the forms above are reconstructed to be consistent with the conditional distributions below). The model parameters are {W, U, b, b_init, Φ, μ, λ, α}; W_i denotes the i-th row of the matrix W and Φ_i denotes the i-th row of the matrix Φ (likewise for the other matrices); v_t denotes the visible units of the t-th time step, h_t the hidden units of the t-th time step, s_t the real-valued slab variables of the t-th time step, and r_t the recurrent hidden input of the t-th time step; I is the identity matrix, diag(·) denotes a diagonal matrix, ⊙ denotes element-wise multiplication, and σ(·) denotes the logistic function. The conditional probability distributions are as follows:

p(h_i = 1 | v) = σ( (1/2) α⁻¹ (W_{i,·} v)² + μ_i W_{i,·} v − (1/2) vᵀ diag(Φ_{i,·}) v + b_i )

p(s_i | v, h_i) = N( (α⁻¹ W_{i,·} v + μ_i) h_i , α⁻¹ )

p(v | s, h) = N( C_{v|s,h} Σ_i W_{i,·}ᵀ s_i h_i , C_{v|s,h} )

In the above formulas, C_{v|s,h} = ( diag(λ) + Σ_i h_i diag(Φ_{i,·}) )⁻¹ is the conditional covariance matrix.
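To make the conditional distributions above concrete, the following is a minimal NumPy sketch of computing P(h_i = 1 | v) and sampling the slab variables. The function names, array shapes and the treatment of Φ as per-unit diagonal rows are illustrative assumptions, not part of the patent:

```python
import numpy as np

def hidden_activation(v, W, Phi, mu, b, alpha):
    """P(h_i = 1 | v) for a spike-and-slab RBM:
    sigma(0.5/alpha*(W_i v)^2 + mu_i*(W_i v) - 0.5 v^T diag(Phi_i) v + b_i)."""
    Wv = W @ v                                 # (n_hidden,)
    quad = 0.5 * (Phi * v ** 2).sum(axis=1)    # v^T diag(Phi_i) v, one value per unit
    pre = 0.5 / alpha * Wv ** 2 + mu * Wv - quad + b
    return 1.0 / (1.0 + np.exp(-pre))

def sample_slab(v, h, W, mu, alpha, rng):
    """Sample s_i | v, h_i ~ N((W_i v / alpha + mu_i) h_i, 1/alpha)."""
    mean = (W @ v / alpha + mu) * h
    return mean + rng.standard_normal(len(mu)) / np.sqrt(alpha)
```

In a Gibbs sweep, these draws would alternate with a draw of v from its conditional Gaussian N(C Σ_i W_iᵀ s_i h_i, C).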
The structured spike-and-slab recurrent temporal restricted Boltzmann machine assumes a true structure for the connections between visible and hidden units, and simulates a graph structure by applying a masking matrix to the connection matrices between visible and hidden units. Training only requires replacing the original matrices with the structured (masked) matrices, so the training procedure for all parameters except those introduced by the structuring is exactly the same as for the spike-and-slab recurrent temporal restricted Boltzmann machine. The invention uses this model's training and inference at each layer and adds a deep structure on that basis. The method comprises a training process and a prediction process. The training process comprises: reconstructing the network structure, introducing a structured masking matrix, updating the parameters of the structured spike-and-slab recurrent temporal restricted Boltzmann machine of each layer, and feeding the hidden units of each trained layer to the next layer as input. The prediction process comprises: initializing each layer under the trained model and predicting layer by layer from the top layer, where the hidden-unit values of each lower layer are determined by the visible units of the layer above, and the visible-unit values of the first layer are the predictions of the model.
The training process is as follows:
S1, reconstructing the network structure. Assume there exists an undirected graph representing the true connection structure between visible and hidden units and between the recurrent hidden input and the hidden units: G = (V, E), with V = {1, …, |V|}, where |V| denotes the number of nodes and E denotes the edges of the undirected graph. The visible units are assigned to the nodes: every unit is assigned, each unit can be assigned to only one node, and each node holds exactly one visible unit or an associated group of visible units; the corresponding hidden units are assigned similarly. The resulting numbers of visible and hidden nodes are N_v and N_h.
S2, inputting the training-set observation sequence v_1, …, v_T, setting the number of model layers n and the training parameters required by the spike-and-slab recurrent temporal restricted Boltzmann machine.
S3, adding the structuring via masking matrices. Set a masking matrix M_W between the nodes and a masking matrix M_U between the recurrent hidden input and the hidden units, initialize both to all-ones matrices, and replace the parameters W, U and Φ by

W ⊙ M_W,  U ⊙ M_U  and  Φ ⊙ M_W

respectively, where W is the weight parameter between the visible and hidden units, U is the weight parameter between the recurrent hidden input and the hidden units, Φ is the hidden-state parameter, and ⊙ denotes element-wise multiplication.
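Imposing the graph structure in step S3 amounts to an element-wise (Hadamard) product with the masks; a minimal sketch (function and variable names are illustrative):

```python
import numpy as np

def masked_params(W, U, Phi, M_w, M_u):
    """Replace W, U, Phi by W*M_W, U*M_U and Phi*M_W element-wise;
    only entries whose mask value is 1 remain active connections."""
    return W * M_w, U * M_u, Phi * M_w
```

The masks start as all-ones matrices, i.e. the model is initially fully connected, and step S9 later relaxes them to logistic values so the structure itself can be learned.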
S4, for each time step t = 1, …, T: first, for all hidden units i = 1, 2, …, N_h in the reconstructed network, compute the hidden-unit bias of the current time step together with the recurrent hidden input from the previous time step using

r_{t,i} = σ( (1/2) α⁻¹ (W_{i,·} v_t)² + μ_i W_{i,·} v_t − (1/2) v_tᵀ diag(Φ_{i,·}) v_t + b_i + U_{i,·} r_{t−1} )

(the original gives this formula only as an image; the form above is reconstructed to be consistent with the conditional distributions), where r_{t,i} is the recurrent input of the i-th hidden unit at time t, v_t denotes the visible units of the t-th time step, α and μ are parameters, Φ_i denotes the i-th row of the matrix Φ (likewise elsewhere), diag(·) denotes a diagonal matrix, σ(·) denotes the logistic function, and ⊙ denotes element-wise multiplication.
Then, for this time step, perform Gibbs sampling according to the conditional probability distribution formulas of the spike-and-slab recurrent temporal restricted Boltzmann machine to obtain the estimates of the visible units, hidden units and slab variables, v⁽ⁿ⁾, h⁽ⁿ⁾ and s⁽ⁿ⁾ for n = 0, …, N_CD.
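The per-time-step recursion of step S4 can be sketched as follows. Since the original gives the bias formula only as an image, the form used here is reconstructed from the model's conditional distributions and should be read as an assumption:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def recurrent_inputs(V, W, Phi, mu, b, b_init, U, alpha):
    """Compute r_t for t = 1..T: the expected hidden state at time t
    given v_t and r_{t-1}; the first time step uses the bias b_init."""
    T, n_h = V.shape[0], W.shape[0]
    R = np.zeros((T, n_h))
    r_prev = None
    for t in range(T):
        v = V[t]
        Wv = W @ v
        quad = 0.5 * (Phi * v ** 2).sum(axis=1)   # v^T diag(Phi_i) v per unit
        bias = b_init if t == 0 else b + U @ r_prev
        R[t] = sigmoid(0.5 / alpha * Wv ** 2 + mu * Wv - quad + bias)
        r_prev = R[t]
    return R
```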
S5, computing the recursive term of the spike-and-slab recurrent temporal restricted Boltzmann machine objective Q². Using the results v⁽ⁿ⁾, h⁽ⁿ⁾ and s⁽ⁿ⁾ of step S4, recursively compute D_t for t = T, …, 2 (the recursion formula is given only as an image in the original), where D_{t+1} is the recursive term, v_t denotes the visible units of the t-th time step, r_t the recurrent hidden input of the t-th time step, h_t the hidden units of the t-th time step, E the energy, U a parameter, and ⊙ element-wise multiplication.
S6, computing the gradients of the spike-and-slab recurrent temporal restricted Boltzmann machine objective Q² with respect to the model parameters W, μ, U, b and b_init (the six gradient formulas are given only as images in the original). Here ∂Q²/∂W denotes the gradient of Q² with respect to W (likewise for the other parameters), D_{t+1} is the recursive term, r_t the recurrent hidden input of the t-th time step, v_t the visible units of the t-th time step, h_t the hidden units of the t-th time step, E the energy, α a hyper-parameter, and ⊙ element-wise multiplication.
S7, for each parameter Θ ∈ {W, U, b, b_init, Φ, μ, λ}, update by a gradient step on H + Q²,

Θ ← Θ + η ∂(H + Q²)/∂Θ

with learning rate η (the full update formulas are given only as images in the original). Here ∂(H + Q²)/∂Θ denotes the gradient of H + Q² with respect to Θ and ∂Q²/∂Θ the gradient of Q² alone; W is a weight parameter, b a bias-term parameter, v_t denotes the visible units of the t-th time step, h_t the hidden units of the t-th time step, α and μ are hyper-parameters, I is the identity matrix, and ⊙ denotes element-wise multiplication.
S8, set the negative entries of Φ to 0.
S9, use the logistic function σ(ν_{ji}), with σ(ν) = (1 + exp{−ν})⁻¹, in place of the (j,i)-th block of the masking matrices M_W and M_U, thereby converting the graph structure into the parameters ν_{ji}, and update each ν_{ji} by its gradient update formula (given only as an image in the original).
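The per-node-pair logistic parameterisation of step S9 can be sketched as follows: each node pair (j, i) carries one scalar logit ν_ji, expanded over the block of units assigned to those nodes. The grouping interface below is an assumption for illustration, not from the patent:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def expand_block_mask(nu, row_groups, col_groups):
    """Build a full mask from per-node-pair logits: nu[j, i] is the
    logit of the edge between hidden node j and visible node i;
    row_groups/col_groups map nodes to their unit indices."""
    n_rows = sum(len(g) for g in row_groups)
    n_cols = sum(len(g) for g in col_groups)
    mask = np.zeros((n_rows, n_cols))
    for j, rows in enumerate(row_groups):
        for i, cols in enumerate(col_groups):
            mask[np.ix_(rows, cols)] = sigmoid(nu[j, i])
    return mask
```

Gradient updates are then made on the logits ν rather than on a hard 0/1 mask, so the graph structure is learned continuously.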
S10, for each layer, save the hidden-unit values obtained by N_CD steps of Gibbs sampling as the observation sequence of the next layer, and save the layer's trained parameter results after the training of steps S3-S9 finishes.
The prediction process is as follows:
S11, read in the parameters trained for each layer to construct the n-layer prediction model, and set the number of time steps T of the prediction sequence to be generated.
S12, input the initial test value of the prediction sequence and initialize the first layer's visible units with it.
S13, from the first layer to the (n-1)-th layer, each layer obtains the value of its first-time-step hidden units by N_CD steps of Gibbs sampling, using the conditional probability distribution represented by the layer's model parameters and its visible-unit initial value, and passes this value on as the visible-unit initial value of the next layer.
S14, for each predicted time step t = 1, …, T of the n-th layer, obtain an estimate of the hidden units at each time step from the initial value using Gibbs sampling, in the same way as in step S4.
S15, for each time step t = 1, …, T, going from layer n-1 down to layer 1, take the visible-layer prediction of the layer above as the layer's hidden values, and obtain the layer's visible-layer prediction by Gibbs sampling based on the probability distribution represented by the layer's model parameters. The visible-layer predictions of the first layer over the T time steps form the generated prediction sequence.
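Steps S11-S15 amount to an upward initialisation pass followed by a downward prediction pass. A schematic sketch, where the sampler callbacks stand in for the Gibbs steps and the whole interface is hypothetical:

```python
def predict_top_down(layers, v_init, T, sample_visible, sample_hidden):
    """layers[0] is the bottom layer; sample_hidden(params, v) and
    sample_visible(params, h) each stand in for one Gibbs step."""
    # upward pass: initialise each layer's visible units (S12-S13)
    v, inits = v_init, [v_init]
    for params in layers[:-1]:
        v = sample_hidden(params, v)
        inits.append(v)
    # top layer generates T steps as an independent model (S14)
    top, h, preds = layers[-1], inits[-1], []
    for _ in range(T):
        h = sample_hidden(top, sample_visible(top, h))
        preds.append(h)
    # downward pass: each layer's hidden values are the visible
    # predictions of the layer above (S15)
    for params in reversed(layers[:-1]):
        preds = [sample_visible(params, h) for h in preds]
    return preds
```

With two layers this reduces exactly to the embodiment below: the bottom layer's hidden sample initialises the second layer, which generates the sequence, and the bottom layer maps it back to visible predictions.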
Compared with the prior art, the invention has the following advantages and effects:
in conclusion, the invention provides a multilayer model construction method based on a structured Spike-and-Slab recursion time sequence limited Boltzmann machine, which can effectively improve the time sequence prediction effect. The training process is carried out layer by layer, and prediction is carried out from top to bottom during testing. Experiments show that the constructed deep model has a good prediction effect.
Drawings
FIG. 1 is a flow chart of time series prediction using the depth model based on the structured recurrent temporal restricted Boltzmann machine according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
Aiming at the time series prediction problem, the invention provides an effective deep method for improving time series prediction, based on the spike-and-slab recurrent temporal restricted Boltzmann machine model combined with the structuring idea.
The embodiment of the invention takes a motion-sequence data set from the CMU human motion database as a concrete example. The CMU human motion database was captured by a Vicon optical motion capture system consisting of 12 infrared MX-40 cameras, each recording at 120 Hz with 4-megapixel resolution. The cameras were placed around a 3 m by 8 m rectangular area, and only motion inside this area was captured. For ease of capture, markers were attached to the subject's elastic black clothing, and the optical cameras captured these markers by infrared to complete the recording. The CMU human motion database stores the captured data in several formats; the experiments here use the asf/amc file format, a format created by the game company Acclaim for capturing and applying character motion in games. Acclaim's character motion format consists of two files, a skeleton file (ASF) and a motion file (AMC), so that only one skeleton needs to be stored for many different motions instead of storing the same skeleton in every motion file. asf/amc files follow ASCII encoding and are easy to convert; the recorded data take the form of Euler angles, and the CMU graphics laboratory provides many tool packages to conveniently convert asf/amc files into forms usable from C++ and MATLAB. This embodiment uses the subject35 data set, which records 33 walking and running motions of subject 35. The complete subject35 data set has 34 records in total: 23 walking records, 10 running records, and one record of navigating around obstacles; only the walking and running records are used here.
Based on the subject35 data set of the CMU human motion database described above, this embodiment trains and models the data set by the following steps, and at the same time evaluates the prediction performance of the proposed method and compares it with other models:
Step S00, preprocess the data: remove unchanging joints and normalize, reducing the CMU data from 96 observation points to 62, and record the deleted positions so that the motion data can be restored later;
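A minimal sketch of the preprocessing in step S00: drop unchanging joints, normalise the rest, and remember which columns were kept so predictions can be mapped back later. The function is illustrative, not from the patent:

```python
import numpy as np

def preprocess(X, eps=1e-8):
    """Drop constant columns (unchanging joints) and z-normalise the
    rest; return the kept-column indices so predictions can later be
    mapped back into the original observation layout."""
    std = X.std(axis=0)
    keep = np.where(std > eps)[0]
    Xk = X[:, keep]
    return (Xk - Xk.mean(axis=0)) / Xk.std(axis=0), keep
```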
Step S01, take two records from the CMU data set, the 1st and the 16th, as the test set, and use the remaining 31 records as the training set;
Step S1, reconstruct the network structure, assigning data representing the same joint to the same node, and likewise for the corresponding hidden units;
Step S2, input the observation sequence groups of the training set, the time steps of each group being the actual time steps of the 31 data records; set the number of model layers to 2, the number of CD iterations to 1, and the learning rate to 0.001, with the number of hidden units 10 times the number of visible units; in the models using spike-and-slab, set α = 1 and β = 0.01;
Step S3, add the structuring: set a masking matrix M_W between the nodes and a masking matrix M_U between the recurrent hidden input and the hidden units, initialize both to all-ones matrices, and replace the parameters W, U and Φ by W ⊙ M_W, U ⊙ M_U and Φ ⊙ M_W respectively;
Step S4, for each time step of each data group in the training set and for hidden units i = 1, …, N_h, compute the recurrent hidden input of each time step using the formula of step S4 of the training procedure; then perform 1 step of Gibbs sampling according to the conditional probability distribution formulas to obtain v⁽ⁿ⁾, h⁽ⁿ⁾ and s⁽ⁿ⁾ for n = 0, 1;
Step S5, for each data group of the training set, using the samples obtained in step S4, recursively compute D_t for t = T, …, 2 according to the recursion formula;
Step S6, for each data group of the training set, compute the gradients of Q² with respect to the model parameters (the gradient formulas are given only as images in the original);
Step S7, using each data group of the training set, train each model parameter Θ ∈ {W, U, b_init, Φ, μ, λ}, updating with the parameter update formula;
Step S8, set the negative entries of Φ in the currently trained model to 0.
Step S9, learn the structure: use the logistic function σ(ν_{ji}), with σ(ν) = (1 + exp{−ν})⁻¹, in place of the (j,i)-th block of the masking matrices M_W and M_U, converting the graph structure into the parameters ν_{ji}; the constant here is taken to be 8, and ν_{ji} is updated with its update formula (given only as an image in the original);
Step S10, after training the first layer using steps S4-S9, save the first layer's trained parameters and hidden-unit values; input this group of hidden values to the second layer as the observed values of the second layer's visible units, train the second layer using the same steps S4-S9, and save the second layer's trained parameters.
Then, the prediction process is as follows:
Step S11, read the parameters trained for the first and second layers and construct the prediction model; test the two data groups separately, with the number of predicted time steps of each group set to its actual number of time steps.
Step S12, for each group of test data, initialize the first time step of the model's first-layer visible units with the value of the group's first time step.
Step S13, for each group of test data, obtain the value of the first-time-step hidden units by 1 step of Gibbs sampling, using the conditional probability distribution represented by the first layer's model parameters and the visible-unit initial value obtained in step S12, and use this value as the initial value of the second layer's visible units.
Step S14, for each group of test data, use the second layer's model parameters to obtain the recurrent hidden input from the previous unit at each time step, and perform 1 step of Gibbs sampling with its probability distribution to obtain the visible- and hidden-unit estimates of each time step at this layer.
Step S15, for each group of prediction data, take the second layer's visible-layer predictions as the first layer's hidden layer, then obtain the first layer's visible-layer predictions by 1 step of Gibbs sampling based on the probability distribution represented by the first layer's model parameters, yielding the predictions for this group of data.
Finally, the mean square error (MSE) is used to evaluate the model:

MSE = (1/T) Σ_{t=1..T} (predicted_t − observed_t)²

that is, for each group of data, the average over the time steps of the squared difference between the predicted value predicted_t and the actual value observed_t obtained at each time step; the overall result averages the MSE of the two data groups, and the results obtained are shown in Table 1.
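The evaluation metric can be written as a short NumPy sketch:

```python
import numpy as np

def mse(pred, obs):
    """Mean over time steps of the squared prediction error."""
    pred, obs = np.asarray(pred, float), np.asarray(obs, float)
    return float(np.mean((pred - obs) ** 2))
```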
TABLE 1 Experimental results

Type                     Test set MSE
ss-RTRBM                 39.88 ± 2.278
ss-RTRBM_ss-RTRBM        21.05 ± 0.7162
ss-SRTRBM                20.18 ± 0.4041
ss-SRTRBM_ss-SRTRBM      19.64 ± 0.213
ss-SRTRBM_ss-RTRBM       19.63 ± 0.3281
TABLE 2 Model type key

ss-RTRBM       single-layer spike-and-slab recurrent temporal restricted Boltzmann machine
ss-SRTRBM      single-layer structured spike-and-slab recurrent temporal restricted Boltzmann machine
A_B            two-layer model whose first layer is of type A and whose second layer is of type B
The meanings of the model types in Table 1 are given in Table 2. In the experimental results of Table 1, the mean square error of the single-layer structured spike-and-slab recurrent temporal restricted Boltzmann machine (ss-SRTRBM) is larger than that of the two-layer version (ss-SRTRBM_ss-SRTRBM), i.e. deepening the structured spike-and-slab model with the proposed method predicts better than the single-layer model. In addition, keeping the first layer of the two-layer structured model and replacing the second layer with an unstructured spike-and-slab recurrent temporal restricted Boltzmann machine yields the model ss-SRTRBM_ss-RTRBM, which also improves the prediction. Table 1 likewise shows that the two-layer spike-and-slab recurrent temporal restricted Boltzmann machine (ss-RTRBM_ss-RTRBM) obtained by deepening predicts better than the single-layer spike-and-slab recurrent temporal restricted Boltzmann machine (ss-RTRBM).
In conclusion, the invention provides a multilayer model construction method based on the structured spike-and-slab recurrent temporal restricted Boltzmann machine, which effectively improves time series prediction. During training, each layer is trained as an independent structured spike-and-slab recurrent temporal restricted Boltzmann machine, and the layers are linked by using each layer's hidden-unit values as the next layer's visible-unit values. During prediction, the top layer predicts as an independent model, while the hidden units of every layer below it are determined by the visible-unit values of the layer above rather than by the layer's own conditional probabilities given its visible units; the first layer's visible-unit values obtained according to this principle are the predictions. The deep model constructed in this way predicts well.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (5)

1. A time series prediction method based on a depth limited Boltzmann machine is applied to the prediction of a human movement time series, and comprises a training process and a prediction process, wherein the training process comprises the following steps: reconstructing a network structure, introducing a structured shielding matrix, updating parameters of a Spike-and-Slab recursive time sequence limited Boltzmann machine based on a structured recursive time sequence of each layer, and taking hidden units of the upper layer of training of the next layer as input; and (3) prediction process: initializing each layer under the model obtained by training, and predicting layer by layer from the highest layer, wherein the value of a hidden unit of the next layer is determined by an upper layer display unit, and the value of a first layer display unit is the predicted value of the model;
the training process comprises the following steps:
s1, reconstructing a network structure, distributing data representing the same joint point in the character motion time sequence data to the same node in the graph, and the corresponding hidden units are the same, wherein the graph is G (V, E), V (1, …, | V | }, V represents the number of nodes, E represents the edge of an undirected graph, the display units are distributed to the nodes, each unit is distributed, one unit can be distributed to only one node, each node is distributed with the same hidden unit, and the nodes are distributed with the same hidden unitThe point can only have one display unit or an associated group of display units, the corresponding hidden units are similar, and the number of the newly obtained display and hidden unit nodes is NvAnd Nh
S2, inputting observation sequence of training set
Figure FDA0002609327000000012
The time step of the observation sequence, namely the actual data time step of the character movement time sequence, is set with the model layer number n and the training parameters required by the Spike-and-Slab recursion time sequence limited Boltzmann machine;
s3, adding structuralization by adding shielding matrix, setting shielding matrix M between nodeswAnd a mask matrix M between the hidden input and the hidden unitUAnd initialized to a full 1 matrix, and W, U and phi parameters are respectively used as W [ < i > M >W,U⊙MUAnd Φ -wInstead, respectively denote
Figure FDA0002609327000000011
Wherein W is a weight parameter between the display unit and the hidden unit, U is a weight parameter between the display unit and the recursive hidden input, Φ is a hidden state parameter, which indicates element-by-element multiplication operation;
S4, for each time step, calculating the bias values of the hidden units at the current time step and the hidden input carried over from the previous time step, and, for that time step, performing NCD steps of Gibbs sampling according to the conditional probability distribution formula of the Spike-and-Slab recurrent temporal restricted Boltzmann machine to obtain estimates of the visible units, the hidden units and the slab variables;
S5, computing the quantity Q2 of the Spike-and-Slab recurrent temporal restricted Boltzmann machine;
S6, computing the gradients of the Spike-and-Slab recurrent temporal restricted Boltzmann machine objective
Figure FDA0002609327000000021
with respect to the model parameters W, μ, U and binit;
S7, updating each model parameter using the parameter update formula;
S8, setting negative values of the hidden state parameter Φ to 0;
S9, replacing the (j, i)-th block of the mask matrices MW and MU with the logistic function σ(νji), where σ(ν) = (1 + exp{−ν})^(−1), thereby converting the graph structure into the parameter νji, and updating νji with its update formula, wherein νji is a parameter and ν is the visible-unit input;
S10, for each layer, storing the hidden-unit values obtained by NCD steps of Gibbs sampling as the observation sequence of the next layer; and, for each layer, storing that layer's training parameter results after the training of steps S3 to S9 is finished;
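The per-layer sampling in steps S4 and S10 follows the usual alternating block-Gibbs pattern of contrastive divergence. The sketch below uses a plain binary-RBM conditional purely for illustration; the claim's Spike-and-Slab conditionals, which are given only as figure images, would replace the two sigmoid lines:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_cd(v0, W, b_v, b_h, n_cd):
    """N_CD steps of alternating block Gibbs sampling (binary RBM).

    Illustrative only: the patent's Spike-and-Slab conditionals differ,
    but the alternating visible/hidden sampling skeleton is the same.
    """
    v = v0
    for _ in range(n_cd):
        p_h = sigmoid(v @ W + b_h)                       # P(h = 1 | v)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        p_v = sigmoid(h @ W.T + b_v)                     # P(v = 1 | h)
        v = (rng.random(p_v.shape) < p_v).astype(float)
    return v, h

n_v, n_h = 5, 3
W = rng.normal(scale=0.1, size=(n_v, n_h))
v0 = (rng.random(n_v) < 0.5).astype(float)
v_sample, h_sample = gibbs_cd(v0, W, np.zeros(n_v), np.zeros(n_h), n_cd=3)
```

In step S10, the sampled hidden values (h_sample here) would be stored as the observation sequence for the next layer up.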
the prediction process comprises the following steps:
S11, reading the parameters obtained from training each layer so as to construct an n-layer prediction model, and setting the number of time steps of the prediction sequence to be generated
Figure FDA0002609327000000025
S12, inputting an initial test value of the prediction sequence and initializing the first-layer visible units with it;
S13, from the first layer to the (n−1)-th layer, each layer obtains the values of the first-time-step hidden units through NCD steps of Gibbs sampling, using the conditional probability distribution represented by that layer's model parameters and the initial visible-unit values; these hidden values serve as the initial visible-unit values of the next layer;
S14, for the n-th layer
Figure FDA0002609327000000022
starting from the initial value at each predicted time step and using the same method as in step S4, obtaining estimates of the visible and hidden units at each time step through NCD steps of Gibbs sampling;
S15, for
Figure FDA0002609327000000023
each time step, from layer n−1 down to layer 1, the visible-layer prediction value of the layer immediately above is used as the hidden value of the current layer, and the visible-layer prediction value of the current layer is obtained through NCD steps of Gibbs sampling based on the probability distribution represented by that layer's model parameters; the first layer's
Figure FDA0002609327000000024
visible-layer prediction values at each time step constitute the generated
Figure FDA0002609327000000031
time-step prediction sequence.
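The prediction process of steps S11 to S15 can be sketched as a control-flow skeleton. Everything below is a hypothetical stand-in (the predict function, the method names, and the _IdentityLayer class are invented for illustration); the real per-layer routines are the NCD-step Gibbs samplers defined by the trained parameters:

```python
def predict(layers, v_init, n_steps):
    """Layer-by-layer prediction: up for initial values, down for outputs."""
    # Upward pass (steps S12-S13): each layer's first-time-step hidden
    # values become the next layer's initial visible values.
    v = v_init
    inits = [v]
    for layer in layers[:-1]:
        v = layer.sample_hidden(v)
        inits.append(v)

    # Top layer generates a full trajectory of visible values (step S14).
    preds = layers[-1].generate(inits[-1], n_steps)

    # Downward pass (step S15): the visible prediction of the layer above
    # is taken as the hidden value of the layer below.
    for layer in reversed(layers[:-1]):
        preds = [layer.sample_visible(h) for h in preds]
    return preds  # first-layer visible predictions


class _IdentityLayer:
    """Trivial stand-in so the control flow can be exercised."""
    def sample_hidden(self, v):
        return v
    def sample_visible(self, h):
        return h
    def generate(self, v, n_steps):
        return [v] * n_steps


out = predict([_IdentityLayer() for _ in range(3)], v_init=1.0, n_steps=4)
```

With real layers, sample_hidden, sample_visible and generate would each run NCD steps of Gibbs sampling under that layer's conditional distribution.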
2. The time series prediction method based on the deep restricted Boltzmann machine according to claim 1, wherein step S4 is performed as follows:
for the
Figure FDA0002609327000000032
for each time step therein and for all hidden units i = 1, 2, …, Nh in the reconstructed network, calculating the bias value of the hidden units at the current time step and the hidden input from the previous time step using the following formulas
Figure FDA0002609327000000033
b′1,i = binit,i  (t = 1)
Figure FDA0002609327000000034
wherein rt,i denotes the input of the i-th hidden unit at time t, vt denotes the visible units at the t-th time step,
Figure FDA0002609327000000035
b′t,i and α are parameters,
Figure FDA0002609327000000036
denotes the matrix
Figure FDA0002609327000000037
row i, wherein diag(·) denotes a diagonal matrix, σ(·) denotes the logistic function, and ⊙ denotes element-wise multiplication;
then, for that time step, NCD steps of Gibbs sampling are performed according to the conditional probability distribution formula of the Spike-and-Slab recurrent temporal restricted Boltzmann machine to obtain estimates of the visible and hidden units and the slab variables
Figure FDA0002609327000000038
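The logistic function of step S9 and the hidden-bias recursion of step S4 can be written out directly. The t = 1 case b′1,i = binit,i appears in the claim text; the t > 1 case used below (b′t = b + U rt−1, the usual recurrent-temporal-RBM form) is an assumption, since that formula is given only as an image:

```python
import numpy as np

def sigma(x):
    # Logistic function of step S9: sigma(v) = (1 + exp(-v))^(-1).
    return (1.0 + np.exp(-x)) ** -1

def hidden_bias(t, b_init, b, U, r_prev):
    """Hidden-unit bias at time step t (t >= 1).

    The t = 1 case is from the claim; the t > 1 form is an assumed
    RTRBM-style recursion, not taken verbatim from the patent.
    """
    if t == 1:
        return b_init
    return b + U @ r_prev

b_init = np.array([0.1, -0.2])   # per-hidden-unit initial biases
b = np.zeros(2)                  # static bias term (illustrative)
U = np.eye(2)                    # recurrent weights (illustrative)
r_prev = np.array([0.5, 0.5])    # hidden input from the previous step
```

With U the identity and b zero, the t > 1 bias simply echoes the previous hidden input, which makes the recursion easy to check.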
3. The time series prediction method based on the deep restricted Boltzmann machine according to claim 2, wherein step S5 is performed as follows:
to compute the quantity Q2 of the Spike-and-Slab recurrent temporal restricted Boltzmann machine, first, according to the formula
Figure FDA0002609327000000039
Using results of step S4
Figure FDA00026093270000000310
recursively computing Dt, wherein Dt+1 is the recursive term, vt denotes the visible units at the t-th time step, rt denotes the hidden-unit input at the t-th time step, ht denotes the hidden units at the t-th time step, E denotes the energy, U is a parameter, and ⊙ denotes element-wise multiplication.
4. The time series prediction method based on the deep restricted Boltzmann machine according to claim 1, wherein the gradients with respect to the model parameters W, μ, U and binit in step S6 are as follows:
Figure FDA0002609327000000041
Figure FDA0002609327000000042
Figure FDA0002609327000000043
Figure FDA0002609327000000044
Figure FDA0002609327000000045
Figure FDA0002609327000000046
wherein,
Figure FDA0002609327000000047
denotes the gradient of Q2 with respect to the parameter W; Dt+1 is the recursive term, rt denotes the hidden-unit input at the t-th time step, vt denotes the visible units at the t-th time step, ht denotes the hidden units at the t-th time step, E denotes the energy, α is a hyperparameter, and ⊙ denotes element-wise multiplication.
5. The time series prediction method based on the deep restricted Boltzmann machine according to claim 1, wherein step S7 is performed as follows:
each parameter Θ ∈ {W, U, binit, Φ, μ, λ} is updated using the following formula:
Figure FDA0002609327000000048
wherein,
Figure FDA0002609327000000051
wherein MΦ = MW,
Figure FDA0002609327000000052
Figure FDA0002609327000000053
denotes the gradient of H + Q2 with respect to the parameter Θ,
Figure FDA0002609327000000054
denotes the gradient of Q2 with respect to the parameter Θ; W is a weight parameter, b is a bias parameter, vt denotes the visible units at the t-th time step, ht denotes the hidden units at the t-th time step, α and μ are hyperparameters, I is the identity matrix, ⊙ denotes element-wise multiplication, and diag(·) denotes a diagonal matrix.
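Since the update formulas of step S7 are given only as images, the sketch below shows just the generic pattern implied by claims 1 and 5: a gradient step on each parameter followed by re-application of its structural mask, so that pruned entries stay zero. The names (update_param, lr) are illustrative, not from the patent:

```python
import numpy as np

def update_param(theta, grad, mask, lr=0.01):
    """One masked gradient-ascent step (hypothetical sketch)."""
    theta = theta + lr * grad   # gradient step on the parameter
    return theta * mask         # corresponds to theta ⊙ M_theta

W = np.ones((2, 2))
grad = np.full((2, 2), 0.5)
mask = np.array([[1.0, 0.0],
                 [1.0, 1.0]])
W_new = update_param(W, grad, mask)
```

Masked entries stay exactly zero after the update, while unmasked entries move by lr times their gradient.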
CN201810004236.0A 2018-01-03 2018-01-03 Time sequence prediction method based on depth-limited Boltzmann machine Active CN108108475B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810004236.0A CN108108475B (en) 2018-01-03 2018-01-03 Time sequence prediction method based on depth-limited Boltzmann machine


Publications (2)

Publication Number Publication Date
CN108108475A CN108108475A (en) 2018-06-01
CN108108475B true CN108108475B (en) 2020-10-27

Family

ID=62218745

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810004236.0A Active CN108108475B (en) 2018-01-03 2018-01-03 Time sequence prediction method based on depth-limited Boltzmann machine

Country Status (1)

Country Link
CN (1) CN108108475B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110061961B (en) * 2019-03-05 2020-08-25 中国科学院信息工程研究所 Anti-tracking network topology intelligent construction method and system based on limited Boltzmann machine

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103824282A (en) * 2013-12-11 2014-05-28 香港应用科技研究院有限公司 Touch and motion detection using surface map, object shadow and a single camera
CN104933417A (en) * 2015-06-26 2015-09-23 苏州大学 Behavior recognition method based on sparse spatial-temporal characteristics
CN105335816A (en) * 2015-10-13 2016-02-17 国网安徽省电力公司铜陵供电公司 Electric power communication operation trend and business risk analyzing method based on deep learning
CN105894114A (en) * 2016-03-31 2016-08-24 华中科技大学 Solar energy prediction method based on dynamic condition Boltzmann machine

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8233541B2 (en) * 2008-03-26 2012-07-31 Sony Corporation Recursive image quality enhancement on super resolution video
JP5943358B2 (en) * 2014-09-30 2016-07-05 インターナショナル・ビジネス・マシーンズ・コーポレーションInternational Business Machines Corporation Learning device, processing device, prediction system, learning method, processing method, and program


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Aircraft abnormal flight state recognition based on deep learning; Wu Qi et al.; Civil Aircraft Design and Research; 2019-09-01; pp. 68-78 *
Time series prediction model based on ensemble deep learning; He Zhengyi et al.; Journal of Shandong University (Engineering Science); 2013-12-31; Vol. 46, No. 6; pp. 40-47 *


Similar Documents

Publication Publication Date Title
CN114882421B (en) Skeleton behavior recognition method based on space-time characteristic enhancement graph convolution network
Majhi et al. New robust forecasting models for exchange rates prediction
DE112016004534T5 (en) Unmonitored matching in fine-grained records for single-view object reconstruction
CN111145917B (en) Epidemic prevention and control-oriented large-scale population contact network modeling method
CN111079507B (en) Behavior recognition method and device, computer device and readable storage medium
CN115206092B (en) Traffic prediction method of BiLSTM and LightGBM models based on attention mechanism
CN112651360B (en) Skeleton action recognition method under small sample
CN108090686B (en) Medical event risk assessment analysis method and system
KR20190125029A (en) Methods and apparatuses for generating text to video based on time series adversarial neural network
CN112926485A (en) Few-sample sluice image classification method
Zhang et al. Tensor graph convolutional neural network
CN112116137A (en) Student class dropping prediction method based on mixed deep neural network
CN116502161A (en) Anomaly detection method based on dynamic hypergraph neural network
Gajamannage et al. Recurrent neural networks for dynamical systems: Applications to ordinary differential equations, collective motion, and hydrological modeling
CN111598032B (en) Group behavior recognition method based on graph neural network
CN117671787A (en) Rehabilitation action evaluation method based on transducer
CN108108475B (en) Time sequence prediction method based on depth-limited Boltzmann machine
CN116052254A (en) Visual continuous emotion recognition method based on extended Kalman filtering neural network
Zhang et al. Granger causal inference for interpretable traffic prediction
Zhang et al. IA-CNN: A generalised interpretable convolutional neural network with attention mechanism
Connors et al. Semi-supervised deep generative models for change detection in very high resolution imagery
Esan et al. Surveillance detection of anomalous activities with optimized deep learning technique in crowded scenes
CN114399901B (en) Method and equipment for controlling traffic system
CN111709553B (en) Subway flow prediction method based on tensor GRU neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant