CN108108475B - Time sequence prediction method based on depth-limited Boltzmann machine - Google Patents
Time sequence prediction method based on depth-limited Boltzmann machine
- Publication number: CN108108475B (application CN201810004236.0A)
- Authority: CN (China)
- Legal status: Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a time series prediction method based on a deep restricted Boltzmann machine, comprising a training process and a testing process. The training process comprises: reconstructing the network structure, introducing a structured mask matrix, updating the parameters of the structured Spike-and-Slab recurrent temporal restricted Boltzmann machine of each layer, and taking the hidden units trained at one layer as the input of the next layer. The testing process comprises: initializing each layer under the trained model and predicting layer by layer from the highest layer downwards, where the values of each lower layer's hidden units are determined by the visible units of the layer above, and the values of the first layer's visible units are the predicted values of the model. The effectiveness of the proposed method is verified through experiments, and the experimental results show that the method improves prediction on action sequence data.
Description
Technical Field
The invention relates to the technical field of time series data mining, and in particular to a time series prediction method based on a deep restricted Boltzmann machine.
Background
An important problem in the field of data mining is the fundamental task of studying the relationship between data and the time dimension, i.e. time series data. Time series data are everywhere, e.g. a segment of action video, a stretch of weather records, or a stock index sequence. Common models include hidden Markov models and Bayesian models, but such models have difficulty capturing data dependencies over long time ranges. Another class of temporal models combines recurrent neural networks (RNNs) with variants of the restricted Boltzmann machine (RBM). Boltzmann machines (BMs) are powerful tools for highly complex, high-dimensional data such as time series, and can describe high-order interactions between variables; the restricted Boltzmann machine (RBM) further simplifies the structure, which strengthens its applicability and lets the model better capture long-term dependencies. The temporal restricted Boltzmann machine (TRBM) is a directed graphical model consisting of a sequence of restricted Boltzmann machines, but exact inference in the TRBM is not easy, and every Gibbs-sampling update carries an exponential cost.
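As background, the contrastive-divergence training that underlies the RBM variants discussed here can be sketched in a few lines. This is a generic binary-RBM example with illustrative names and sizes, not the patent's spike-and-slab model:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.01):
    """One contrastive-divergence (CD-1) update for a binary RBM.
    v0: (n_vis,) visible sample; W: (n_hid, n_vis); b: hidden bias; c: visible bias."""
    ph0 = sigmoid(W @ v0 + b)                       # p(h=1 | v0)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(W.T @ h0 + c)                     # p(v=1 | h0)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(W @ v1 + b)                       # p(h=1 | v1)
    # positive phase minus negative phase
    W += lr * (np.outer(ph0, v0) - np.outer(ph1, v1))
    b += lr * (ph0 - ph1)
    c += lr * (v0 - v1)
    return W, b, c

n_vis, n_hid = 6, 4
W = rng.normal(0, 0.1, (n_hid, n_vis))
b = np.zeros(n_hid)
c = np.zeros(n_vis)
v = rng.integers(0, 2, n_vis).astype(float)
W, b, c = cd1_step(v, W, b, c)
```

The spike-and-slab and recurrent variants below replace the binary conditionals with the richer distributions of the ss-RTRBM but keep this same positive-minus-negative-phase update.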
For this reason, Ilya Sutskever, Geoffrey Hinton et al. improved the original temporal restricted Boltzmann machine into the recurrent temporal restricted Boltzmann machine (RTRBM), which can learn the information transferred through the connections between hidden units better than the original TRBM. The research of Roni Mittelman et al. on the RTRBM temporal model showed that the model assumes all visible and hidden units are fully connected, i.e. it ignores the dependency structure between units and cannot identify the important dependencies; however, for a given data set, learning the dependency structure allows the patterns to be learned better and improves predictive power. Roni Mittelman et al. therefore proposed the SRTRBM, a model that learns the temporal restricted Boltzmann machine parameters in a structured way: graphical-model principles are used to model the data set's dependencies, a mask matrix is constructed from the graph topology, and a logistic function replaces the mask matrix so that the graph structure and the parameters can be learned jointly, which better reveals the latent structure and patterns of the data set and improves the prediction effect.
Disclosure of Invention
The invention aims to overcome the drawbacks of the prior art by providing a time series prediction method based on a deep restricted Boltzmann machine.
The purpose of the invention can be achieved by adopting the following technical scheme:
A time series prediction method based on a deep restricted Boltzmann machine is realized on the basis of the structured Spike-and-Slab recurrent temporal restricted Boltzmann machine. Before structuring is added, the probability distribution of the Spike-and-Slab recurrent temporal restricted Boltzmann machine at each time step is

$$P(v_t, h_t \mid r_{t-1}) \propto \int \exp\{-E(v_t, s_t, h_t) + (b + U r_{t-1})^T h_t\}\, ds_t$$

where the model parameters are $\{W, U, b, b_{init}, \Phi, \mu, \lambda, \alpha\}$, $W_{i,\cdot}$ denotes the $i$-th row of the matrix $W$, $\Phi_{i,\cdot}$ denotes the $i$-th row of the matrix $\Phi$ (and similarly for the others), $v_t$ denotes the visible units of the $t$-th time step, $h_t$ the hidden units of the $t$-th time step, $s_t$ the real-valued Slab variable of the $t$-th time step, and $r_t$ the recurrent hidden input of the $t$-th time step; $I$ is the identity matrix, $\mathrm{diag}(\cdot)$ denotes a diagonal matrix, $\odot$ denotes element-wise multiplication, and $\sigma(\cdot)$ denotes the logistic function. The energy equation takes the form

$$E(v_t, s_t, h_t) = \sum_i \Big[-v_t^T W_{i,\cdot}^T s_{t,i} h_{t,i} + \tfrac{\alpha}{2} s_{t,i}^2 - \alpha \mu_i s_{t,i} h_{t,i} + \tfrac{\alpha}{2}\mu_i^2 h_{t,i} + \tfrac{1}{2} h_{t,i}\, v_t^T \mathrm{diag}(\Phi_{i,\cdot})\, v_t\Big] + \tfrac{\lambda}{2} v_t^T v_t$$

The conditional probability distributions are as follows:

$$p(h_i = 1 \mid v) = \sigma\big(0.5\,\alpha^{-1}(W_{i,\cdot} v)^2 + \mu_i W_{i,\cdot} v - 0.5\, v^T \mathrm{diag}(\Phi_{i,\cdot})\, v + b_i\big)$$

$$p(s_i \mid v, h_i) = \mathcal{N}\big((\alpha^{-1} W_{i,\cdot} v + \mu_i)\, h_i,\ \alpha^{-1}\big)$$

$$p(v \mid s, h) = \mathcal{N}\Big(C_{v|s,h} \sum_i W_{i,\cdot}^T s_i h_i,\ C_{v|s,h}\Big),\qquad C_{v|s,h} = \Big(\lambda I + \sum_i h_i\, \mathrm{diag}(\Phi_{i,\cdot})\Big)^{-1}$$

In the above formula, $C_{v|s,h}$ is the conditional covariance matrix.
The structured Spike-and-Slab recurrent temporal restricted Boltzmann machine assumes that the connections between the visible units and the hidden units follow a true underlying structure, and simulates that graph structure by placing a mask matrix on the connection matrices between the visible and hidden units. Training only requires replacing the original matrices with the structured, masked matrices, so the training process of all parameters other than those introduced by the structuring is exactly the same as for the Spike-and-Slab recurrent temporal restricted Boltzmann machine. The invention uses the training and inference scheme of this model in each layer and adds a deep structure on top of it. The method comprises a training process and a prediction process. The training process comprises: reconstructing the network structure, introducing a structured mask matrix, updating the parameters of the structured Spike-and-Slab recurrent temporal restricted Boltzmann machine of each layer, and taking the hidden units trained at one layer as the input of the next layer. The prediction process comprises: initializing each layer under the trained model and predicting layer by layer from the highest layer downwards, where the values of each lower layer's hidden units are determined by the visible units of the layer above, and the values of the first layer's visible units are the predicted values of the model.
The training process is as follows:
S1. Reconstruct the network structure. Assume that an undirected graph G = (V, E) represents the real connection structure between visible and hidden units and between the recurrent hidden input and the hidden units, where V = {1, …, |V|}, |V| denotes the number of nodes, and E denotes the edges of the undirected graph. The visible units are distributed over the nodes: every unit is assigned, each unit can be assigned to only one node, and each node may hold either a single visible unit or an associated group of visible units; the corresponding hidden units are assigned in the same way. The numbers of visible-unit and hidden-unit nodes so obtained are N_v and N_h.
S2. Input the observation sequence v_1, …, v_T of the training set, and set the number of model layers n and the training parameters required by the Spike-and-Slab recurrent temporal restricted Boltzmann machine.
S3. Add structuring by introducing mask matrices. Set a mask matrix M_W between the nodes and a mask matrix M_U between the recurrent hidden input and the hidden units, initialize both to all-ones matrices, and replace the parameters W, U and Φ by W ⊙ M_W, U ⊙ M_U and Φ ⊙ M_W respectively, where W is the weight parameter between the visible and hidden units, U is the weight parameter between the recurrent hidden input and the hidden units, Φ is the hidden state parameter, and ⊙ denotes the element-wise multiplication operation.
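The masking in step S3 is a plain element-wise product; a minimal sketch, with a hypothetical 3-node graph chosen only for illustration:

```python
import numpy as np

# Hypothetical small example: 3 hidden nodes, 3 visible nodes. The mask
# starts as all ones (fully connected) and edges absent from the assumed
# graph structure are zeroed out.
n_h, n_v = 3, 3
W = np.arange(9, dtype=float).reshape(n_h, n_v)   # weight matrix
M_W = np.ones((n_h, n_v))                          # mask, initialized to all ones
M_W[0, 2] = M_W[2, 0] = 0.0                        # remove two edges from the graph
W_masked = W * M_W                                 # element-wise product W ⊙ M_W
```

Training then proceeds exactly as in the unstructured model, but with `W_masked` (and likewise `U * M_U`, `Phi * M_W`) substituted wherever the original matrices appear.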
S4. For each time step t = 1, …, T: for all hidden units i = 1, 2, …, N_h in the reconstructed network, compute the bias value of the hidden units of the current time step together with the recurrent hidden input from the previous time step using the formula

$$r_{t,i} = \sigma\big(0.5\,\alpha^{-1}(W_{i,\cdot} v_t)^2 + \mu_i W_{i,\cdot} v_t - 0.5\, v_t^T \mathrm{diag}(\Phi_{i,\cdot})\, v_t + b_i + U_{i,\cdot}\, r_{t-1}\big)$$

where r_{t,i} denotes the recurrent input of the i-th hidden unit at time t (for t = 1, the initial bias b_init is used in place of b_i + U_{i,·} r_0), v_t denotes the visible units of the t-th time step, α and μ are parameters, Φ_{i,·} denotes the i-th row of the matrix Φ (and similarly for the others), diag(·) denotes a diagonal matrix, σ(·) denotes the logistic function, and ⊙ denotes element-wise multiplication. Then, for this time step, carry out Gibbs sampling (n = 0, …, N_CD) according to the conditional probability distribution formulas of the Spike-and-Slab recurrent temporal restricted Boltzmann machine to obtain the estimates of the visible units, hidden units and Slab.
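The recurrence of step S4 can be sketched as follows, under the assumption that it takes the form reconstructed above; all names, scales and sizes are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def recurrent_inputs(V, W, Phi, U, b, b_init, mu, alpha):
    """Compute r_t for t = 1..T given a visible sequence V of shape (T, n_vis).
    Sketch of the ss-RTRBM recurrence; the first step uses the initial bias b_init."""
    T = V.shape[0]
    R = np.zeros((T, W.shape[0]))
    for t in range(T):
        v = V[t]
        Wv = W @ v
        act = 0.5 * (Wv ** 2) / alpha + mu * Wv - 0.5 * (Phi @ v ** 2)
        if t == 0:
            R[t] = sigmoid(act + b_init)          # initial time step
        else:
            R[t] = sigmoid(act + b + U @ R[t - 1])  # carry over r_{t-1}
    return R

rng = np.random.default_rng(2)
T, n_v, n_h = 4, 5, 3
V = rng.normal(size=(T, n_v))
W = rng.normal(scale=0.1, size=(n_h, n_v))
Phi = np.abs(rng.normal(scale=0.1, size=(n_h, n_v)))
U = rng.normal(scale=0.1, size=(n_h, n_h))
b = np.zeros(n_h)
b_init = np.zeros(n_h)
mu = np.zeros(n_h)
alpha = 1.0
R = recurrent_inputs(V, W, Phi, U, b, b_init, mu, alpha)
```

Each row `R[t]` is the recurrent hidden input that biases the hidden units at the next time step and seeds the Gibbs sampling of that step.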
S5. Compute the recurrent gradient term Q_2 of the Spike-and-Slab recurrent temporal restricted Boltzmann machine. First, using the results of step S4, recursively compute D_t for t = T, …, 2 according to the backward recursion formula, where D_{t+1} is the recursive term, v_t denotes the visible units of the t-th time step, r_t the recurrent hidden input of the t-th time step, h_t the hidden units of the t-th time step, E denotes the energy, U is a parameter, and ⊙ denotes element-wise multiplication.
S6. Compute the gradients of the Spike-and-Slab recurrent temporal restricted Boltzmann machine term Q_2 with respect to the model parameters W, μ, U and b_init, where ∂Q_2/∂W denotes the gradient of Q_2 with respect to W (and similarly for the other parameters), D_{t+1} is the recursive term, r_t denotes the recurrent hidden input of the t-th time step, v_t the visible units of the t-th time step, h_t the hidden units of the t-th time step, E denotes the energy, α is a hyper-parameter, and ⊙ denotes element-wise multiplication.
S7. For each parameter Θ ∈ {W, U, b_init, Φ, μ, λ}, update using the parameter update formula, where ∂(H + Q_2)/∂Θ denotes the gradient of H + Q_2 with respect to the parameter Θ and ∂Q_2/∂Θ the gradient of Q_2 with respect to Θ; W is a weight parameter, b is a bias term parameter, v_t denotes the visible units of the t-th time step, h_t the hidden units of the t-th time step, α and μ are hyper-parameters, I is the identity matrix, and ⊙ denotes element-wise multiplication.
S8. Set the negative-valued entries of Φ to 0.
S9. Replace the (j,i)-th block of the mask matrices M_W and M_U with the logistic function σ(ν_{ji}), where σ(ν) = (1 + exp{−ν})^{−1}, thereby converting the graph structure into the parameters ν_{ji}, and update ν_{ji} using its update formula.
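The point of step S9 is that the discrete mask becomes a differentiable function of continuous parameters ν, so the graph structure itself can be learned by gradient steps. A minimal sketch, with a hypothetical 2×2 mask and a made-up gradient:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Each block (j, i) of the mask is replaced by sigma(nu_ji), so the discrete
# graph structure becomes a continuous parameter nu learnable by gradient
# descent. The gradient values here are illustrative placeholders.
nu = np.zeros((2, 2))                         # structure parameters, one per node pair
M_W = sigmoid(nu)                             # soft mask: all entries 0.5 at nu = 0
grad = np.array([[1.0, -1.0], [0.5, 0.0]])    # hypothetical gradient w.r.t. nu
lr = 0.1
nu += lr * grad                               # update nu
M_W = sigmoid(nu)                             # the mask follows via sigma(nu)
```

As |ν_{ji}| grows during training, σ(ν_{ji}) saturates toward 0 or 1, recovering a hard graph structure.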
S10. For each layer, save the hidden unit values obtained by N_CD steps of Gibbs sampling as the observation sequence of the next layer; for every layer, after the training of steps S3–S9 has finished, save that layer's trained parameter results.
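The layer-by-layer stacking of step S10 can be sketched as a greedy loop in which each layer's hidden sequence becomes the next layer's observations. The per-layer training here is a deliberate placeholder (a deterministic sigmoid projection with an illustrative layer-size rule), not the actual ss-SRTRBM updates of steps S3–S9:

```python
import numpy as np

rng = np.random.default_rng(3)

def train_layer(observations):
    """Stand-in for training one structured ss-RTRBM layer (steps S3-S9).
    Returns a trained-parameter dict and the hidden-unit sequence that serves
    as the next layer's observations."""
    T, n_v = observations.shape
    n_h = 2 * n_v                                  # hidden size (illustrative choice)
    params = {"W": rng.normal(scale=0.1, size=(n_h, n_v))}
    hidden = 1.0 / (1.0 + np.exp(-(observations @ params["W"].T)))
    return params, hidden

def train_deep(observations, n_layers):
    """Greedy layer-by-layer training: each layer's hidden values become
    the observation sequence of the layer above (step S10)."""
    layers = []
    data = observations
    for _ in range(n_layers):
        params, data = train_layer(data)
        layers.append(params)
    return layers

V = rng.normal(size=(6, 4))                        # T = 6 time steps, 4 visible units
model = train_deep(V, n_layers=2)
```

Swapping `train_layer` for the real structured training recovers the patent's scheme: only the data flow between layers is shown here.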
the prediction process is as follows:
S11. Read in the parameters trained for each layer in preparation for constructing the n-layer prediction model, and set the number of time steps T of the prediction sequence to be generated.
S12. Input the initial value of the prediction sequence and use it to initialize the visible units of the first layer.
S13. From the first layer up to the (n−1)-th layer, each layer obtains the value of its first-time-step hidden units through N_CD steps of Gibbs sampling, using the conditional probability distribution represented by that layer's model parameters and the initial value of its visible units; this value serves as the initial value of the next layer's visible units.
S14. For each predicted time step t = 1, …, T of the n-th layer, obtain an estimate of the visible and hidden units of each time step from the initial value using Gibbs sampling, in the same way as in step S4.
S15. For each time step t = 1, …, T, from layer n−1 down to layer 1, take the visible-layer predicted values of the layer above as the hidden values of the current layer, and obtain the current layer's visible-layer predicted values by Gibbs sampling based on the probability distribution represented by that layer's model parameters. The visible-layer predicted values of the T time steps of the first layer form the generated prediction sequence of T time steps.
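The top-down data flow of steps S13–S15 can be sketched for a two-layer model. The Gibbs-sampling steps are replaced by deterministic mean-field placeholders, and all weights and sizes are illustrative, so this shows only how values propagate between layers:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict(W1, W2, v0, T):
    """Two-layer top-down prediction sketch.
    W1: (n_h1, n_vis) first-layer weights; W2: (n_h2, n_h1) second-layer weights."""
    h1 = sigmoid(W1 @ v0)                    # S13: seed the top layer's visibles
    seq2 = [h1]
    for _ in range(T - 1):                   # S14: top layer rolls the sequence forward
        seq2.append(sigmoid(W2.T @ sigmoid(W2 @ seq2[-1])))
    # S15: layer-2 visibles act as layer-1 hiddens -> layer-1 visible predictions
    preds = [sigmoid(W1.T @ h) for h in seq2]
    return np.stack(preds)

rng = np.random.default_rng(4)
W1 = rng.normal(scale=0.5, size=(8, 4))
W2 = rng.normal(scale=0.5, size=(16, 8))
v0 = rng.normal(size=4)
P = predict(W1, W2, v0, T=5)                 # (T, n_vis) predicted sequence
```

In the patented method each `sigmoid(...)` would be replaced by N_CD steps of Gibbs sampling under the corresponding layer's conditional distributions.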
Compared with the prior art, the invention has the following advantages and effects:
In summary, the invention provides a multilayer model construction method based on the structured Spike-and-Slab recurrent temporal restricted Boltzmann machine, which can effectively improve the time series prediction effect. Training is carried out layer by layer, and prediction proceeds from top to bottom during testing. Experiments show that the constructed deep model achieves a good prediction effect.
Drawings
FIG. 1 is a flow chart of time series prediction using a depth model based on the structured recurrent temporal restricted Boltzmann machine according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
For the time series prediction problem, the invention provides an effective deep method for improving time series prediction, based on the spike-and-slab recurrent temporal restricted Boltzmann machine model combined with the idea of structuring.
The embodiment of the invention takes the action sequence data set of the CMU human motion database as a concrete example. The CMU human motion database was captured by a Vicon optical motion capture system consisting of 12 infrared MX-40 cameras, each recording at 120 Hz with 4-megapixel resolution. The cameras were placed around a 3-by-8-meter rectangular area, and only the motion of a person inside this area was captured. To ease capture, the captured subject wears tight black clothing with markers attached, and the optical cameras record the markers via infrared to complete the recording. The CMU human motion database stores the captured data in a variety of formats; the experiments here use the asf/amc file format, a format created by the game company Acclaim for capturing and applying character motion in games. The character motion format proposed by Acclaim consists of two files, a skeleton file (ASF) and a motion file (AMC), so that only one skeleton structure needs to be stored to perform the different actions, and the same skeleton need not be stored in every motion file. The asf/amc files follow ASCII encoding and are easy to convert; the recorded data take the form of Euler angles, and the CMU graphics laboratory provides many tool packages to conveniently convert asf/amc files into forms usable by C++ and Matlab. This embodiment uses the 33 walk and run data records of the subject 35 data set in the CMU human motion database. The complete data set of subject 35 has 34 records in total: 23 records of walking, 10 records of running, and one record of navigating around obstacles; only the walk and run records are used here.
Based on the above subject 35 data set of the CMU human motion database, this embodiment trains and models the data set through the following steps, while testing the prediction effect of the proposed method and comparing it with other models:
Step S00. Preprocess the data: remove unchanging joints and normalize, reducing the CMU data set from 96 observation points to 62, and record the deleted positions so that the data can be restored when reconstructing the character motion later.
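The preprocessing of step S00 amounts to dropping constant channels and normalizing the rest while remembering which columns were kept. A minimal sketch on a made-up sequence (the tiny array stands in for the real 96-channel motion data):

```python
import numpy as np

def preprocess(X, eps=1e-8):
    """Drop constant (unchanging) observation channels and normalize the rest.
    X: (T, n_channels) motion sequence. Returns the reduced data plus the kept
    column indices, so deleted joints can be restored later."""
    std = X.std(axis=0)
    keep = np.where(std > eps)[0]               # indices of non-constant channels
    Xr = X[:, keep]
    Xr = (Xr - Xr.mean(axis=0)) / Xr.std(axis=0)  # zero mean, unit variance
    return Xr, keep

X = np.column_stack([
    np.linspace(0, 1, 10),          # moving joint
    np.full(10, 3.0),               # unchanged joint -> removed
    np.sin(np.linspace(0, 3, 10)),  # moving joint
])
Xr, keep = preprocess(X)
```

Storing `keep` (and the dropped constant values) is what allows the 62-channel model output to be expanded back to the full 96-channel skeleton afterwards.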
Step S01. Take two groups of data, the 1st record and the 16th record, from the CMU data set as the test set, and use the remaining 31 records as the training set.
Step S1. Reconstruct the network structure, assigning the data representing the same joint point to the same node; the corresponding hidden units are assigned likewise.
Step S2. Input the observation sequence groups of the training set, where the time steps of each group are the actual data time steps of the 31 groups of data. Set the number of model layers to 2, the number of CD steps to 1, and the learning rate to 0.001; the number of hidden units is 10 times the number of visible units; in the models using Spike-and-Slab, α is set to 1 and β to 0.01.
Step S3. Add structuring: set a mask matrix M_W between the nodes and a mask matrix M_U between the recurrent hidden input and the hidden units, initialize both to all-ones matrices, and replace the parameters W, U and Φ by W ⊙ M_W, U ⊙ M_U and Φ ⊙ M_W respectively.
Step S4. For each time step of each group of data in the training set, for hidden units i = 1, …, N_h, compute the recurrent hidden input of each time step; then carry out 1 step of Gibbs sampling (n = 0, 1) according to the conditional probability distribution formulas.
Step S5. Using each group of data of the training set and the hidden inputs obtained in step S4, recursively compute D_t for t = T, …, 2 according to the backward recursion formula.
Step S7. Using each group of data of the training set, update each model parameter Θ ∈ {W, U, b_init, Φ, μ, λ} with the parameter update formula.
Step S8. Set the negative-valued entries of Φ in the currently trained model to 0.
Step S9. Learn the structuring: replace the (j,i)-th block of the mask matrices M_W and M_U with the logistic function σ(ν_{ji}), where σ(ν) = (1 + exp{−ν})^{−1}, converting the graph structure into the parameters ν_{ji}; the corresponding coefficient is taken as 8, and ν_{ji} is updated using its update formula.
Step S10. After training the first layer using steps S4–S9, save the trained parameters and the hidden unit values of the first layer; input this group of hidden values to the second layer as the observed values of its visible units, train the second layer using the same steps S4–S9, and save the trained parameters of the second layer.
Then, the prediction process is as follows:
Step S11. Read in the parameters trained for the first and second layers respectively, prepare to construct the prediction model, and test the two groups of data separately, with the number of predicted time steps for each group set to that group's actual number of time steps.
Step S12. For each group of test data, initialize the first time step of the model's first-layer visible units with the value of the group's first time step.
Step S13. For each group of test data, obtain the value of the first-time-step hidden units by 1 step of Gibbs sampling, using the conditional probability distribution represented by the first layer's model parameters and the initial visible-unit value obtained in step S12; this value serves as the initial value of the second layer's visible units.
Step S14. For each group of test data, use the model parameters of layer 2 to obtain, at each time step, the hidden input derived from the previous time step, and carry out 1-step Gibbs sampling under the corresponding probability distribution to obtain the estimates of the visible and hidden units of each time step at this layer.
Step S15. For each group of prediction data, take the visible-layer predicted values of the second layer as the hidden layer of the first layer, then obtain the visible-layer predicted values of the first layer using 1-step Gibbs sampling based on the probability distribution represented by the first layer's model parameters, thereby obtaining the predicted values of the group of data.
Finally, the mean square error (MSE) is used to evaluate the model effect:

$$\mathrm{MSE} = \frac{1}{T}\sum_{t=1}^{T}(\mathrm{predicted}_t - \mathrm{observed}_t)^2$$

i.e., for each group of data, the average over time steps of the squared difference between the predicted value and the actual value; the overall result averages the mean square errors obtained from the two groups of data. The obtained results are shown in Table 1.
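The evaluation is a plain MSE averaged over the test groups; a short sketch with made-up numbers in place of the real predictions:

```python
import numpy as np

def mse(pred, obs):
    """Mean square error: average over time steps of the squared difference
    between predicted and observed values."""
    pred, obs = np.asarray(pred, float), np.asarray(obs, float)
    return np.mean((pred - obs) ** 2)

# two hypothetical test groups, averaged as in the experiment
groups = [
    (np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 5.0])),
    (np.array([0.0, 1.0]),      np.array([1.0, 1.0])),
]
overall = np.mean([mse(p, o) for p, o in groups])
```

In the experiment, `pred` and `obs` would be the multichannel predicted and actual motion sequences of the two test records.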
TABLE 1 Experimental results Table
Type | Test set MSE |
ss-RTRBM | 39.88±2.278 |
ss-RTRBM_ss-RTRBM | 21.05±0.7162 |
ss-SRTRBM | 20.18±0.4041 |
ss-SRTRBM_ss-SRTRBM | 19.64±0.213 |
ss-SRTRBM_ss-RTRBM | 19.63±0.3281 |
TABLE 2 type interpretation Table
The meanings of the model types in Table 1 are given in Table 2. In the experimental results of Table 1, the mean square error of the predictions of the single-layer structured spike-and-slab recurrent temporal restricted Boltzmann machine (ss-SRTRBM) is larger than that of the two-layer version (ss-SRTRBM_ss-SRTRBM), i.e., deepening the structured spike-and-slab model with the proposed method gives a better prediction effect than the single-layer model. In addition, keeping the first layer as a structured spike-and-slab recurrent temporal restricted Boltzmann machine and replacing the second layer with an unstructured spike-and-slab recurrent temporal restricted Boltzmann machine yields a new model, ss-SRTRBM_ss-RTRBM, which also improves the prediction effect. Table 1 further shows that the two-layer spike-and-slab recurrent temporal restricted Boltzmann machine (ss-RTRBM_ss-RTRBM) obtained by deepening with the proposed method predicts better than the single-layer ss-RTRBM.
In conclusion, the invention provides a multilayer model construction method based on the structured Spike-and-Slab recurrent temporal restricted Boltzmann machine, which can effectively improve the time series prediction effect. During training, each layer is trained as an independent structured Spike-and-Slab recurrent temporal restricted Boltzmann machine, and the data relationship between layers is realized by taking a layer's hidden unit values as the next layer's visible unit values. During prediction, the highest layer predicts according to its own independent model, while the hidden units of every layer below it are determined by the visible-unit values of the layer above rather than by that layer's conditional probabilities given its visible units; the first layer's visible-unit values obtained by this principle are the predicted values. The deep model constructed in this way achieves a good prediction effect.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.
Claims (5)
1. A time series prediction method based on a deep restricted Boltzmann machine, applied to the prediction of human motion time series, comprising a training process and a prediction process, wherein the training process comprises: reconstructing the network structure, introducing a structured mask matrix, updating the parameters of the structured Spike-and-Slab recurrent temporal restricted Boltzmann machine of each layer, and taking the hidden units trained at one layer as the input of the next layer; and the prediction process comprises: initializing each layer under the trained model and predicting layer by layer from the highest layer downwards, where the values of each lower layer's hidden units are determined by the visible units of the layer above, and the values of the first layer's visible units are the predicted values of the model;
the training process comprises the following steps:
S1. Reconstruct the network structure, assigning the data representing the same joint point in the human motion time series data to the same node in the graph, with the corresponding hidden units assigned likewise, wherein the graph is G = (V, E), V = {1, …, |V|}, |V| denotes the number of nodes, and E denotes the edges of the undirected graph; the visible units are distributed over the nodes, every unit is assigned, each unit can be assigned to only one node, and each node may hold only one visible unit or an associated group of visible units, with the corresponding hidden units assigned in the same way; the numbers of visible-unit and hidden-unit nodes so obtained are N_v and N_h;
S2. Input the observation sequence of the training set, whose time steps are the actual data time steps of the human motion time series, and set the number of model layers n and the training parameters required by the Spike-and-Slab recurrent temporal restricted Boltzmann machine;
S3. Add structuring by introducing mask matrices: set a mask matrix M_W between the nodes and a mask matrix M_U between the recurrent hidden input and the hidden units, initialize both to all-ones matrices, and replace the parameters W, U and Φ by W ⊙ M_W, U ⊙ M_U and Φ ⊙ M_W respectively, wherein W is the weight parameter between the visible and hidden units, U is the weight parameter between the recurrent hidden input and the hidden units, Φ is the hidden state parameter, and ⊙ denotes the element-wise multiplication operation;
S4. For each time step, compute the bias value of the hidden units of the current time step together with the recurrent hidden input from the previous time step, and for this time step carry out N_CD steps of Gibbs sampling according to the conditional probability distribution formulas of the Spike-and-Slab recurrent temporal restricted Boltzmann machine to obtain the estimates of the visible units, hidden units and Slab;
S5. Compute the recurrent gradient term Q_2 of the Spike-and-Slab recurrent temporal restricted Boltzmann machine;
S6. Compute the gradients of the Spike-and-Slab recurrent temporal restricted Boltzmann machine term Q_2 with respect to the model parameters W, μ, U and b_init;
S7. Update each parameter of the model using the parameter update formula;
S8. Set the negative-valued entries of the hidden state parameter Φ to 0;
s9, using logical equation sigma(vji),σ(v)=(1+exp{-v})-1Instead of the mask matrix MwAnd MUJi ofthBlock, converting the graph structure into the parameter vjiAnd updates v using its update formulajiWherein, is a parameter, v is a display unit input;
S10, for each layer, storing the hidden-unit values obtained by the N_CD steps of Gibbs sampling as the observation sequence of the next layer; for each layer, after the training of steps S3-S9 is finished, saving the training-parameter results of that layer.
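The layer-wise loop of steps S3-S10 can be outlined as follows. Here `train_layer` is a stand-in for the full S3-S9 training of one spike-and-slab recurrent temporal restricted Boltzmann machine; it is assumed to return that layer's parameters together with the hidden-unit sequence, which becomes the next layer's observations:

```python
def train_stack(observations, n_layers, train_layer):
    """Greedy layer-wise training (step S10): the hidden-unit values
    produced while training one layer are stored as the observation
    sequence of the next layer, and each layer's parameters are saved."""
    saved_params = []
    obs = observations
    for _ in range(n_layers):
        params, hidden_seq = train_layer(obs)   # steps S3-S9 for one layer
        saved_params.append(params)
        obs = hidden_seq                        # next layer's observations
    return saved_params

# A dummy trainer, only to exercise the control flow of the sketch.
def dummy_trainer(obs):
    return {"n_obs": len(obs)}, list(obs)

params = train_stack([0.1, 0.2, 0.3], n_layers=2, train_layer=dummy_trainer)
assert len(params) == 2
```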
The prediction process comprises the following steps:
S11, reading the parameters obtained by training each layer to construct an n-layer prediction model, and setting the number of time steps of the prediction sequence to be generated;
S12, inputting an initial test value of the prediction sequence and initializing the first-layer visible units with it;
S13, from the first layer to the (n-1)-th layer, each layer obtains the hidden-unit values of the first time step by N_CD steps of Gibbs sampling, using the conditional probability distribution represented by that layer's model parameters and the initial visible-unit values, and takes these values as the initial visible-unit values of the next layer;
S14, for the n-th layer, starting from the initial value, using the same method as in step S4 at each predicted time step: obtaining the estimates of the visible and hidden units of each time step by N_CD steps of Gibbs sampling;
S15, for each time step, from layer n-1 down to layer 1, taking the visible-layer prediction of the layer immediately above as the hidden-unit values of the current layer, and obtaining the visible-layer prediction of the current layer by N_CD steps of Gibbs sampling based on the probability distribution represented by that layer's model parameters; the visible-layer predictions of the first layer at each time step form the generated prediction sequence over the set number of time steps.
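The prediction pass of steps S11-S15 can be sketched as a control-flow skeleton. The functions `up`, `generate` and `down` stand in for the N_CD-step Gibbs samplers of each layer and are not specified by the patent text reproduced here:

```python
def predict(v_init, layer_params, up, generate, down, n_steps):
    """Sketch of steps S11-S15 for an n-layer prediction model."""
    n = len(layer_params)
    # S12-S13: propagate the initial visible value up to the top layer.
    x = v_init
    for layer in range(n - 1):
        x = up(layer_params[layer], x)
    # S14: generate n_steps time steps at the n-th layer.
    top_sequence = generate(layer_params[n - 1], x, n_steps)
    # S15: pass each time step's prediction down from layer n-1 to layer 1.
    prediction = []
    for value in top_sequence:
        for layer in range(n - 2, -1, -1):
            value = down(layer_params[layer], value)
        prediction.append(value)
    return prediction

# Dummy samplers, only to exercise the control flow of the sketch.
up_ = lambda p, x: x + 1
gen_ = lambda p, x, T: [x] * T
down_ = lambda p, x: x - 1
out = predict(0, [None, None, None], up_, gen_, down_, n_steps=4)
assert out == [0, 0, 0, 0]
```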
2. The method for predicting a time series based on the deep restricted Boltzmann machine according to claim 1, wherein step S4 is performed as follows:
For each time step of the observation sequence, for all hidden units i = 1, 2, …, N_h of the reconstructed network, the bias value of the hidden unit at the current time step and the hidden input from the previous time step are computed with the following formula:
b'_{1,i} = b_{init,i}    (t = 1)
where r_{t,i} denotes the input of the i-th hidden unit at time t, v_t denotes the visible units at the t-th time step, b'_{t,i} and α are parameters, (·)_i denotes the i-th row of the matrix, diag(·) denotes a diagonal matrix, σ(·) denotes the logistic function, and ⊙ denotes element-wise multiplication;
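The given t = 1 case and the standard recurrent-temporal-RBM construction suggest the following recurrence for the dynamic hidden bias. This is an assumed simplification, not the patent's exact formula: the spike-and-slab terms involving α, Φ and diag(·) are omitted:

```python
import numpy as np

def sigmoid(v):
    # the logistic function sigma(v) = (1 + exp{-v})^(-1) from step S9
    return 1.0 / (1.0 + np.exp(-v))

def dynamic_biases(v_seq, W, U, b_init):
    """b'_1 = b_init (the given t = 1 case); for t > 1 the previous
    hidden input r_{t-1} is assumed to feed in through U."""
    T, n_h = v_seq.shape[0], b_init.shape[0]
    b = np.zeros((T, n_h))
    r = np.zeros((T, n_h))
    for t in range(T):
        b[t] = b_init if t == 0 else b_init + U @ r[t - 1]
        r[t] = sigmoid(v_seq[t] @ W + b[t])   # hidden input r_t
    return b, r

rng = np.random.default_rng(0)
b, r = dynamic_biases(rng.normal(size=(5, 4)),
                      rng.normal(scale=0.1, size=(4, 3)),
                      rng.normal(scale=0.1, size=(3, 3)),
                      np.zeros(3))
assert np.allclose(b[0], 0.0)   # b'_1 equals b_init
assert r.shape == (5, 3)
```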
3. The method for predicting a time series based on the deep restricted Boltzmann machine according to claim 2, wherein step S5 is performed as follows:
To compute Q_2 of the spike-and-slab recurrent temporal restricted Boltzmann machine, first compute D_t recursively according to its formula, using the results of step S4, where D_{t+1} is the recursive term, v_t denotes the visible units at the t-th time step, r_t denotes the hidden-unit input at the t-th time step, h_t denotes the hidden units at the t-th time step, E denotes the energy, U is a parameter, and ⊙ denotes element-wise multiplication.
4. The method for predicting a time series based on the deep restricted Boltzmann machine according to claim 1, wherein the gradients with respect to the model parameters W, μ, U and b_init in step S6 are as follows:
where ∇_W Q_2 denotes the gradient of Q_2 with respect to the parameter W, D_{t+1} is the recursive term, r_t denotes the hidden-unit input at the t-th time step, v_t denotes the visible units at the t-th time step, h_t denotes the hidden units at the t-th time step, E denotes the energy, α is a hyperparameter, and ⊙ denotes element-wise multiplication.
5. The method for predicting a time series based on the deep restricted Boltzmann machine according to claim 1, wherein step S7 is performed as follows:
For each parameter Θ ∈ {W, U, b_init, Φ, μ, λ}, the following parameter-update formula is used:
where M_Φ = M_W, ∇_Θ(H + Q_2) denotes the gradient of H + Q_2 with respect to the parameter Θ, ∇_Θ Q_2 denotes the gradient of Q_2 with respect to Θ, W is the weight parameter, b is the bias parameter, v_t denotes the visible units at the t-th time step, h_t denotes the hidden units at the t-th time step, α and μ are hyperparameters, I is the identity matrix, ⊙ denotes element-wise multiplication, and diag(·) denotes a diagonal matrix.
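Steps S7 and S8 together amount to a gradient step followed by clipping Φ at zero. The generic ascent step below is a sketch under that assumption; the patent's exact update formula, with its masks and hyperparameters, is not reproduced:

```python
import numpy as np

def update_parameters(params, grads, lr=0.01):
    """Step S7: move each parameter Theta along its gradient of H + Q_2."""
    return {name: params[name] + lr * grads[name] for name in params}

params = {"W": np.zeros((2, 2)), "Phi": np.array([0.5, -0.3])}
grads  = {"W": np.ones((2, 2)),  "Phi": np.zeros(2)}
params = update_parameters(params, grads, lr=0.1)
params["Phi"] = np.maximum(params["Phi"], 0.0)   # step S8: negative Phi -> 0

assert np.allclose(params["W"], 0.1)
assert np.all(params["Phi"] >= 0.0)
```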
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810004236.0A CN108108475B (en) | 2018-01-03 | 2018-01-03 | Time sequence prediction method based on depth-limited Boltzmann machine |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108108475A CN108108475A (en) | 2018-06-01 |
CN108108475B true CN108108475B (en) | 2020-10-27 |
Family
ID=62218745
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810004236.0A Active CN108108475B (en) | 2018-01-03 | 2018-01-03 | Time sequence prediction method based on depth-limited Boltzmann machine |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108108475B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110061961B (en) * | 2019-03-05 | 2020-08-25 | 中国科学院信息工程研究所 | Anti-tracking network topology intelligent construction method and system based on limited Boltzmann machine |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103824282A (en) * | 2013-12-11 | 2014-05-28 | 香港应用科技研究院有限公司 | Touch and motion detection using surface map, object shadow and a single camera |
CN104933417A (en) * | 2015-06-26 | 2015-09-23 | 苏州大学 | Behavior recognition method based on sparse spatial-temporal characteristics |
CN105335816A (en) * | 2015-10-13 | 2016-02-17 | 国网安徽省电力公司铜陵供电公司 | Electric power communication operation trend and business risk analyzing method based on deep learning |
CN105894114A (en) * | 2016-03-31 | 2016-08-24 | 华中科技大学 | Solar energy prediction method based on dynamic condition Boltzmann machine |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8233541B2 (en) * | 2008-03-26 | 2012-07-31 | Sony Corporation | Recursive image quality enhancement on super resolution video |
JP5943358B2 (*) | 2014-09-30 | 2016-07-05 | International Business Machines Corporation | Learning device, processing device, prediction system, learning method, processing method, and program |
Non-Patent Citations (2)
Title |
---|
Recognition of abnormal aircraft flight states based on deep learning; Wu Qi et al.; Civil Aircraft Design & Research; 2019-09-01; pp. 68-78 *
A time-series prediction model based on ensemble deep learning; He Zhengyi et al.; Journal of Shandong University (Engineering Science); 2013-12-31; Vol. 46, No. 6; pp. 40-47 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||