CN109643387A - Systems and methods for learning and predicting time-series data using deep multiplicative networks - Google Patents
Systems and methods for learning and predicting time-series data using deep multiplicative networks
- Publication number
- CN109643387A (application CN201780053794.XA)
- Authority
- CN
- China
- Prior art keywords
- layer
- current
- encoder
- decoder
- feedback information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
A method includes using a computing network (100) to learn and predict time-series data. The computing network includes one or more layers (102a, 102b, 102c), each layer having an encoder (104a, 104b, 104c) and a decoder (106a, 106b, 106c). The encoder of each layer multiplicatively combines (i) current feed-forward information from a lower layer or a computing network input (112) and (ii) past feedback information from a higher layer or this layer. The encoder of each layer generates current feed-forward information for the higher layer or this layer. The decoder of each layer multiplicatively combines (i) current feedback information from the higher layer or this layer and (ii) at least one of the current feed-forward information from the lower layer or the computing network input or past feed-forward information from the lower layer or the computing network input. The decoder of each layer generates current feedback information for the lower layer or a computing network output (114).
Description
Technical field
The present disclosure relates generally to machine learning and data prediction. More specifically, this disclosure relates to systems and methods for learning and predicting time-series data using deep multiplicative networks.
Background
"Machine learning" generally refers to computing technology that is designed to learn from data and perform predictive analyses on data. Neural networks are one example type of machine learning technique based on biological networks, such as the human brain. In a neural network, data processing is performed using artificial neurons, which are coupled together and exchange processed data over various communication links. "Learning" in a neural network can be achieved by changing the weights associated with the communication links so that some data is treated as being more important than other data.
"Time-series prediction" refers to predictions made by a machine learning algorithm using time-series data, such as data values collected over time via one or more sensory inputs. Time-series prediction is an important component of intelligence. For example, the ability of an intelligent entity to predict an input time series can allow the intelligent entity to create a model of the world (or some smaller portion of it).
Summary of the invention
The present disclosure provides systems and methods for learning and predicting time-series data using deep multiplicative networks.
In a first embodiment, a method includes using a computing network to learn and predict time-series data. The computing network includes one or more layers, and each layer includes an encoder and a decoder. The encoder of each layer multiplicatively combines (i) current feed-forward information from a lower layer or a computing network input and (ii) past feedback information from a higher layer or this layer. The encoder of each layer generates current feed-forward information for the higher layer or this layer. The decoder of each layer multiplicatively combines (i) current feedback information from the higher layer or this layer and (ii) at least one of current feed-forward information from the lower layer or the computing network input or past feed-forward information from the lower layer or the computing network input. The decoder of each layer generates current feedback information for the lower layer or a computing network output.
In a second embodiment, an apparatus includes at least one processing device and at least one memory storing instructions that, when executed by the at least one processing device, cause the at least one processing device to use a computing network to learn and predict time-series data. The computing network includes one or more layers, and each layer includes an encoder and a decoder. The encoder of each layer is configured to multiplicatively combine (i) current feed-forward information from a lower layer or a computing network input and (ii) past feedback information from a higher layer or this layer. The encoder of each layer is configured to generate current feed-forward information for the higher layer or this layer. The decoder of each layer is configured to multiplicatively combine (i) current feedback information from the higher layer or this layer and (ii) at least one of current feed-forward information from the lower layer or the computing network input or past feed-forward information from the lower layer or the computing network input. The decoder of each layer is configured to generate current feedback information for the lower layer or a computing network output.
In a third embodiment, a non-transitory computer-readable medium contains instructions that, when executed by at least one processing device, cause the at least one processing device to use a computing network to learn and predict time-series data. The computing network includes one or more layers, and each layer includes an encoder and a decoder. The encoder of each layer is configured to multiplicatively combine (i) current feed-forward information from a lower layer or a computing network input and (ii) past feedback information from a higher layer or this layer. The encoder of each layer is configured to generate current feed-forward information for the higher layer or this layer. The decoder of each layer is configured to multiplicatively combine (i) current feedback information from the higher layer or this layer and (ii) at least one of current feed-forward information from the lower layer or the computing network input or past feed-forward information from the lower layer or the computing network input. The decoder of each layer is configured to generate current feedback information for the lower layer or a computing network output.
Other technical features may be readily apparent to one skilled in the art from the following figures, descriptions, and claims.
Brief description of the drawings
For a more complete understanding of this disclosure and its features, reference is now made to the following description, taken in conjunction with the accompanying drawings, in which:
Fig. 1 illustrates an example architecture implementing a deep multiplicative network for learning and predicting time-series data according to this disclosure;
Fig. 2 illustrates an example system for learning and predicting time-series data using a deep multiplicative network according to this disclosure; and
Fig. 3 illustrates an example method for learning and predicting time-series data using a deep multiplicative network according to this disclosure.
Detailed description
Figs. 1 through 3, discussed below, and the various embodiments used to describe the principles of the present invention in this patent document are by way of illustration only and should not be construed in any way to limit the scope of the invention. Those skilled in the art will understand that the principles of the present invention may be implemented in any type of suitably arranged device or system.
As noted above, time-series prediction is an important component of intelligence, such as when it allows an intelligent entity (such as a person) to create a predictive model of the world around him or her. The motor intentions of an intelligent entity can also naturally form part of a time series. A "motor intention" generally refers to a desired motor action associated with neural signals, such as moving a person's arm or leg or opening/closing a person's hand based on different neural signals. Including past motor intentions in a prediction allows the effects of those motor intentions on the surrounding world to be modeled. Moreover, if an intelligent entity includes a control system that can compute optimal motor intentions with respect to some high-level goal of affecting the world, the ability to predict future motor intentions can allow those intentions to occur more accurately without always having to be fully optimized, which can provide large savings in terms of computation and energy. Specific examples of time-series data that could be learned and predicted include natural language (including text or speech) and video.
Time-series prediction using neural networks has conventionally been done using purely feed-forward neural networks or shallow recurrent neural networks. A recurrent neural network refers to a neural network in which the connections between nodes form a "directed cycle" or closed loop, where no node or connection is repeated in the cycle other than the starting and ending node (which denote the same node). More recently, deep recurrent networks using long short-term memory have been devised. Although these networks use some multiplicative elements, they are primarily additive in order to keep backpropagation feasible.
In one aspect of this disclosure, devices, systems, methods, and computer-readable media for learning and predicting time-series data are provided. Learning and prediction are accomplished (i) by extracting high-level information via the multiplicative combination (with possible discarding of information) of current low-level information and past high-level information and (ii) by feeding back predictions of the future of the time series via the multiplicative combination of predicted future high-level information and current and/or past low-level information. In this approach, feed-forward and feedback are combined via the multiplicative combination of high-level and low-level information, and a deep recurrent network is formed.
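Written as update equations, and introduced here purely as an illustration (the symbols $f$, $b$, $E$, $D$, $\otimes$, and the delays $d_\ell$, $d'_\ell$ are not the disclosure's notation), the scheme for a stack of $L$ layers is roughly:

$$
f^{(\ell)}_t = E^{(\ell)}\big(f^{(\ell-1)}_t \otimes b^{(\ell+1)}_{t-d_\ell}\big),
\qquad
b^{(\ell)}_t = D^{(\ell)}\big(b^{(\ell+1)}_t \otimes f^{(\ell-1)}_{t'}\big),
\quad t' \in \{t,\; t-d'_\ell\},
$$

where $f^{(0)}_t$ is the network input at time $t$, $b^{(L+1)}_t \equiv f^{(L)}_t$ (the top layer feeds its own encoding back to itself), $\otimes$ denotes the multiplicative combination, and the prediction of the next input is $\hat{x}_{t+1} = b^{(1)}_t$.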
Fig. 1 illustrates an example architecture 100 implementing a deep multiplicative network for learning and predicting time-series data according to this disclosure. As shown in Fig. 1, the architecture 100 includes one or more layers 102a-102c. In this example, the architecture 100 includes three layers 102a-102c, although any other number of layers could be used in the architecture 100.
The layers 102a-102c respectively include encoders 104a-104c and respectively include decoders 106a-106c. The encoders 104a-104c are configured to generate and output feed-forward information 108a-108c, respectively. The encoders 104a-104b in the layers 102a-102b are configured to output the feed-forward information 108a-108b to the next higher layers 102b-102c. The encoder 104c of the top layer 102c is configured to output the feed-forward information 108c for use by the top layer 102c itself (for both feedback and feed-forward purposes).
The decoders 106a-106c are configured to generate and output feedback information 110a-110c, respectively. The decoders 106b-106c in the layers 102b-102c are configured to output the feedback information 110b-110c to the next lower layers 102a-102b. The decoder 106a of the lowest layer 102a is configured to output the feedback information 110a from the architecture 100.
Feed-forward information is received as inputs 112 at the lowest layer 102a of the architecture 100. A single input 112 denotes a current time-series value, and multiple inputs 112 denote a sequence of values provided to the lowest layer 102a, forming a time series of data. Feedback information is provided from the lowest layer 102a of the architecture 100 as predicted next values 114. A single predicted next value 114 denotes a predicted future time-series value, and multiple predicted next values 114 denote a sequence of values provided from the lowest layer 102a, forming a predicted time series of data. The highest feed-forward (and first feedback) information 108c denotes the highest encoding of the input time-series data.
The layers 102a-102c also respectively include delay units 116a-116c for feedback and optionally respectively include delay units 118a-118c for feed-forward. The delay units 116a-116c are configured to receive feedback information and delay that information by one or more time units. In some embodiments, the delay units 116a-116c could provide different delays of information, such as when the delay(s) of higher layers are longer than the delay(s) of lower layers. The delayed information is then provided from the delay units 116a-116c to the encoders 104a-104c. The delay units 118a-118c are configured to receive the inputs 112 or feed-forward information and possibly delay that information by zero or more time units. Again, in some embodiments, the delay units 118a-118c could provide different delays of information, such as when the delay(s) of higher layers are longer than the delay(s) of lower layers. The (possibly) delayed information is provided from the delay units 118a-118c to the decoders 106a-106c.
In some embodiments, the inputs 112 or feed-forward information 108a-108b to be provided to the encoders 104a-104c could first be passed through non-linear pooling units 120a-120c, respectively. The pooling units 120a-120c operate to reduce the dimensionality of data in a manner that increases its transformation invariance. For example, so-called ℓ2 pooling units can provide invariance to representations of unitary groups, such as translations and rotations.
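By way of illustration only (this sketch is not taken from the disclosure), an ℓ2 pooling unit can be written as grouping adjacent feature channels and emitting the square root of the sum of squares within each group, which reduces the dimensionality by the group size and is invariant to sign changes and rotations within each group:

```python
import numpy as np

def l2_pool(features: np.ndarray, group_size: int) -> np.ndarray:
    """Hypothetical l2 pooling: combine each group of `group_size` adjacent
    feature channels into one output via sqrt(sum of squares). The output
    has reduced dimension and is invariant to rotations (including sign
    flips) of the feature values within each group."""
    n = features.shape[-1]
    assert n % group_size == 0, "feature count must divide evenly into groups"
    grouped = features.reshape(*features.shape[:-1], n // group_size, group_size)
    return np.sqrt((grouped ** 2).sum(axis=-1))

# Example: a 6-dimensional feature vector pooled into 3 outputs.
x = np.array([0.6, 0.8, 1.0, 0.0, 0.3, -0.4])
print(l2_pool(x, group_size=2))  # -> [1.0, 1.0, 0.5]
```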
Each of the encoders 104a-104c and decoders 106a-106c is configured to multiplicatively combine its inputs. For example, each encoder 104a-104c is configured to multiplicatively combine (i) current (and possibly pooled) feed-forward information from a lower layer or the inputs 112 and (ii) delayed feedback information from a higher layer (or the top layer's own layer) to generate current feed-forward information. The current feed-forward information is then provided to a higher layer (or the same layer at the top). Each decoder 106a-106c is configured to multiplicatively combine (i) current feedback information from a higher layer (or the top layer's own layer) and (ii) current (and possibly delayed) feed-forward information from a lower layer or the inputs 112 to generate current feedback information. The current feedback information is then provided to a lower layer or serves as a predicted next value. As shown in Fig. 1, the feed-forward information 108c from the top layer 102c is fed back to itself, delayed by the delay unit 116c as appropriate.
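Purely as an illustration of this encoder/decoder wiring, one layer could be sketched as follows. This is a minimal sketch under assumptions, not the disclosed implementation: the multiplicative combination is taken to be a bilinear form over all pairwise products, each learned map is a single weight matrix, and `sigmoid` stands in for the damping discussed below.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class MultiplicativeLayer:
    """Illustrative layer: the encoder and decoder each form all pairwise
    products of their two input vectors (a bilinear, i.e. multiplicative,
    combination) and apply a learned linear map plus sigmoidal damping."""

    def __init__(self, in_dim, code_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Encoder weights act on (feed-forward from below) x (past feedback).
        self.W_enc = rng.normal(0.0, 0.1, (code_dim, in_dim * code_dim))
        # Decoder weights act on (feedback from above) x (feed-forward from below).
        self.W_dec = rng.normal(0.0, 0.1, (in_dim, code_dim * in_dim))
        self.past_feedback = np.ones(code_dim)    # delay unit 116 (identity at start)
        self.past_feedforward = np.zeros(in_dim)  # delay unit 118

    def encode(self, ff_below):
        # (i) current feed-forward info x (ii) past feedback info -> info sent up.
        pairwise = np.outer(ff_below, self.past_feedback).ravel()
        return sigmoid(self.W_enc @ pairwise)

    def decode(self, fb_above, ff_below):
        # (i) current feedback info x (ii) current feed-forward info -> info sent
        # down. (The delayed self.past_feedforward could be used here instead.)
        pairwise = np.outer(fb_above, ff_below).ravel()
        return sigmoid(self.W_dec @ pairwise)
```

In a stack of such layers, each layer's `in_dim` would equal the `code_dim` of the layer below it, so feedback descending from above has the right dimension for the decoder's bilinear combination.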
In some embodiments, the architecture 100 shown in Fig. 1 can be used to implement an auto-encoder. An "auto-encoder" is a neural network or other machine learning algorithm that attempts to generate an encoding for a set of data. The encoding denotes a representation of the set of data but has reduced dimensionality. Ideally, the encoding allows the auto-encoder to predict future values in time-series data based on initial values in the time-series data. The ability to predict time-series data can find uses in a wide range of applications.
This can be accomplished by having each decoder 106a-106c multiplicatively combine (i) current feedback information from a higher layer (or the top layer's own layer) and (ii) current and past feed-forward information from a lower layer or the inputs 112. This allows the network to generalize an inertial auto-encoder, which uses an inertial combination of a current feed-forward value, one past feed-forward value, and constant higher-level feedback.
In general, a network implementing an auto-encoder is typically designed so that its outputs approximately reproduce its inputs. When applied to time-series data, an auto-encoder is "causal" in the sense that it reproduces future information using only past information. Iteratively, such a causal auto-encoder can reproduce an entire time series from itself, meaning the causal auto-encoder can identify an entire time series based on initial values of the time series. Ideally, the layers 102a-102c complete an encoding of the inputs 112 such that the final encoded representation of the inputs 112 (the information 108c) is highly regularized (such as sparse). The encoded representation of the inputs 112 can also ideally be used to generate the predicted next values 114, which denote an approximate reproduction of the inputs 112. For time-series data, a causal auto-encoder would approximately reproduce future inputs as the predicted next values 114 based on past inputs 112, thereby allowing the causal auto-encoder to make predictions of the time-series data.
In some embodiments, a causal auto-encoder may be most useful when the final encoding is as high-level and invariant as possible so that the same encoding can be used for many time steps. Invariance can be achieved in Fig. 1 by pooling and/or by multiplicatively encoding the time-series data into lower-dimensional encodings. However, in order to approximately reproduce the original inputs 112 at slightly later time steps (as required of a causal auto-encoder), the discarded low-level information needs to be added back into the calculation. With this understanding, rather than using a purely feed-forward network for the auto-encoder, the feed-forward information 108a-108b can be used to compute the high-level invariant encoding (the information 108c), and the feedback information 110a-110b through the same network can be used, via multiplicative decoding, to enrich the predicted next values 114 with non-invariant information.
Each of the layers 102a-102c includes any suitable structure(s) for encoding data, providing dimensionality reduction, or performing any other suitable processing operations. For example, each of the layers 102a-102c could be implemented using hardware or a combination of hardware and software/firmware instructions.
The multiplicative combinations in each of the encoders 104a-104c and decoders 106a-106c can take various forms. For example, a multiplicative combination could include numerical multiplication or a Boolean AND function. A multiplicative combination generally forms part of the transfer function of an encoding or decoding node, which may also include or perform other mathematical operations (such as sigmoidal damping of input signals). As a particular example, the multiplicative combination could provide some approximation of a Boolean AND operation, thereby allowing a node to operate as a generalized state machine. Thus, a node could examine whether an input is x and a state is y and, if so, determine that a new state should be z.
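To make the state-machine reading concrete, consider the following illustrative (not patent-specified) soft-AND node: with inputs and states damped into [0, 1], the product of an input indicator and a state indicator is near 1 only when both hold, so a weight from that product to a new state implements "if the input is x and the state is y, the new state is z."

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_and_transition(input_onehot, state_onehot, T):
    """Hypothetical multiplicative state-transition node. T[z, x, y] weights
    the pairwise product of input x and state y toward new state z,
    approximating 'input is x AND state is y -> new state z'."""
    pairwise = np.outer(input_onehot, state_onehot)        # near 1 only where both are active
    return sigmoid(np.tensordot(T, pairwise, axes=2) - 3)  # bias keeps inactive outputs near 0

# Toy transition table with 2 inputs and 2 states: (x=1, y=0) -> z=1.
T = np.zeros((2, 2, 2))
T[1, 1, 0] = 6.0
print(soft_and_transition(np.array([0., 1.]), np.array([1., 0.]), T))
# -> approximately [0.05, 0.95]: the new state is 1 only for that input/state pair
```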
As with other machine learning systems, the architecture 100 can be trained so that the encoders 104a-104c, decoders 106a-106c, and delay units 116a-116c, 118a-118c operate as desired. Until now, deep multiplicative networks have generally been avoided because purely feed-forward deep multiplicative networks are difficult to train using standard backpropagation techniques. In some embodiments, a training approach for the architecture 100 (which combines feed-forward and feedback units) would be to repeatedly apply the following steps given time-series training data. For each time step in the training data, forward propagate the training data through each encoder 104a-104c and delay unit 116a-116c/118a-118c (the latter constituting forward propagation in time), and backward propagate the training data through each decoder 106a-106c (updating its non-delayed feedback inputs 110b-110c/108c). Then, simultaneously across all time steps, update the weights of the encoders 104a-104c and decoders 106a-106c to better reproduce the current training outputs from the current training inputs. In some embodiments, if desired, post-processing could also be performed at each encoder 104a-104c and/or decoder 106a-106c, such as by normalizing and/or sparsifying its weights. This leads to stable convergence to a locally optimal network.
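The shape of that loop, sketched against the illustrative `MultiplicativeLayer` class above (the squared-error objective and the single-series epoch are assumptions, and the weight-update step is only indicated by comments):

```python
import numpy as np

def run_epoch(layers, series):
    """Illustrative pass over one training series. `layers` is a list of
    MultiplicativeLayer objects (lowest layer first); `series` has shape
    (T, input_dim of layer 0). Returns the mean prediction error."""
    errors = []
    for t in range(len(series) - 1):
        # Forward propagation: feed-forward info climbs through the encoders.
        ff, ffs = series[t], []
        for layer in layers:
            ffs.append(ff)
            ff = layer.encode(ff)

        # Feedback pass: the top layer's code is fed back to itself, and each
        # decoder enriches the descending feedback with feed-forward info.
        fb = ff
        for layer, ff_below in zip(reversed(layers), reversed(ffs)):
            layer.past_feedback = fb      # delay unit 116: available next step
            fb = layer.decode(fb, ff_below)

        errors.append(np.mean((fb - series[t + 1]) ** 2))

        # Delay unit 118: current feed-forward values become past values.
        for layer, ff_below in zip(layers, ffs):
            layer.past_feedforward = ff_below

        # Weight updates (omitted here): per the text above, gradients would
        # be applied simultaneously across all time steps, followed by optional
        # normalization and/or sparsification of the weights.
    return float(np.mean(errors))
```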
In other embodiments, the encoders 104a-104c could instead be trained using sparse coding techniques adapted to the recurrent and multiplicative nature of the architecture 100. In this unsupervised procedure, training involves alternately (i) updating the weights of the encoders and (ii) updating the encoding states (which, due to the recurrence of the architecture 100, are more than outputs and also serve as inputs to the training). At each iteration, the activations of an encoder's outputs are normalized both across the training set as a whole and across each training pair of inputs and states individually. All of the encoder's weights are then reduced by a fixed amount. The combination of normalization and reduction is intended to keep the weights sparse. Sparsity can be particularly useful for multiplicative networks since the total number of possible weights is very large. Once an encoder has a good representation of combinations of contexts and inputs via sparse coding, the associated decoder in that layer can be trained, such as by using a frequency analysis of how these encoding states combine with actual future values of the time series. Of course, any other suitable training mechanism could be used with the components of the architecture 100.
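As a rough illustration of the normalize-and-shrink idea only (the disclosure gives no formulas for this step; the `shrink` amount and the soft-threshold form are assumptions):

```python
import numpy as np

def sparsify_weights(W, activations, shrink=1e-3, eps=1e-8):
    """Hypothetical sparse-coding maintenance step: scale each encoder
    output's weights so its activation has unit scale across the training
    set, then shrink all weights toward zero by a fixed amount (a soft
    threshold), which drives small weights exactly to zero."""
    scale = activations.std(axis=0) + eps  # per-output activation scale, shape (code_dim,)
    W = W / scale[:, None]                 # normalize output activations
    return np.sign(W) * np.maximum(np.abs(W) - shrink, 0.0)
```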
The architecture 100 shown in Fig. 1 could find uses in a large number of applications. For example, the architecture 100 could be applied to natural language understanding and generation. As a particular example, assume the architecture 100 includes four levels. Through feed-forward, the four layers of the architecture 100 (moving up in the architecture 100) could encode letters into phonemes, encode phonemes into words, encode words into phrases, and encode phrases into sentences. Through feedback, the four layers of the architecture 100 (moving down in the architecture 100) could combine sentence context with current and/or past phrase information to predict the next phrase, combine phrase context with current and/or past word information to predict the next word, combine word context with current and/or past phoneme information to predict the next phoneme, and combine phoneme context with current and/or past letter information to predict the next letter. In the architecture 100, each layer (other than the lowest layer 102a) would transition states more slowly than the layer below it since the information at that layer denotes a more invariant encoding state. The less invariant information of the lower layers is then fed back into the prediction by the decoders. Both the encoders and the decoders, due to their multiplicative nature, can be thought of as "state machines" representing the grammar of that particular level of abstraction.
Although Fig. 1 illustrates one example of an architecture 100 implementing a deep multiplicative network for learning and predicting time-series data, various changes may be made to Fig. 1. For example, the architecture 100 need not include three layers and could include any other number of layers in any suitable arrangement (including a single layer).
Fig. 2 illustrates an example system 200 for learning and predicting time-series data using a deep multiplicative network according to this disclosure. As shown in Fig. 2, the system 200 denotes a computing system that includes at least one processing device 202, at least one storage device 204, at least one communications unit 206, and at least one input/output (I/O) unit 208.
The processing device 202 executes instructions that may be loaded into a memory 210. The processing device 202 includes any suitable number(s) and type(s) of processors or other devices in any suitable arrangement. Example types of processing devices 202 include microprocessors, microcontrollers, digital signal processors, field programmable gate arrays, application specific integrated circuits, and discrete circuitry.
The memory device 210 and persistent storage 212 are examples of storage devices 204, which denote any structure(s) capable of storing and facilitating retrieval of information (such as data, program code, and/or other suitable information on a temporary or permanent basis). The memory device 210 may denote a random access memory or any other suitable volatile or non-volatile storage device(s). The persistent storage 212 may contain one or more components or devices supporting longer-term storage of data, such as a read only memory, hard drive, Flash memory, or optical disc.
The communications unit 206 supports communications with other systems or devices. For example, the communications unit 206 could include a network interface card or a wireless transceiver facilitating communications over a wired or wireless network. The communications unit 206 may support communications through any suitable physical or wireless communication link(s).
The I/O unit 208 allows for the input and output of data. For example, the I/O unit 208 may provide a connection for user input through a keyboard, mouse, keypad, touchscreen, or other suitable input device. The I/O unit 208 may also send output to a display, printer, or other suitable output device.
In some embodiments, the instructions executed by the processing device 202 could include instructions that implement the architecture 100 of Fig. 1. For example, the instructions executed by the processing device 202 could include instructions that implement the various encoders, decoders, and delay units shown in Fig. 1, as well as instructions that support the data flows and data exchanges involving these components.
Although Fig. 2 illustrates one example of a system 200 for learning and predicting time-series data using a deep multiplicative network, various changes may be made to Fig. 2. For example, it is assumed here that the architecture 100 of Fig. 1 is implemented using software/firmware executed by the processing device 202. However, any suitable hardware-only implementation or any suitable hardware and software/firmware implementation could be used to implement this functionality. Also, computing devices come in a wide variety of configurations, and Fig. 2 does not limit this disclosure to any particular computing device.
Fig. 3 illustrates an example method 300 for learning and predicting time-series data using a deep multiplicative network according to this disclosure. For ease of explanation, the method 300 is described as being implemented by the device 200 of Fig. 2 using the architecture 100 of Fig. 1. Note, however, that the method 300 could be implemented in any other suitable manner.
As shown in Fig. 3, a computing network is trained at step 302. This could include, for example, the processing device 202 of the device 200 receiving training time-series data and providing the data to the architecture 100 of Fig. 1. As noted above, the architecture 100 includes one or more layers 102a-102c, and each layer includes a respective encoder 104a-104c and a respective decoder 106a-106c. In some embodiments, the training could occur by repeating the following operations. For each time step, forward propagate training data through the encoders 104a-104c and the delay units 116a-116c/118a-118c. For each time step, backward propagate the training data through the decoders 106a-106c. Update the encoders 104a-104c and decoders 106a-106c to better reproduce the training outputs from the training inputs across all time steps. Apply any desired post-processing to the encoders 104a-104c and decoders 106a-106c, such as normalization and/or sparsification.
Input time-series data is received at the computing network at step 304. This could include, for example, the processing device 202 of the device 200 receiving the time-series data from any suitable source, such as one or more sensors or other input devices. This could also include the processing device 202 of the device 200 providing the time-series data to the layer 102a of the architecture 100 as the inputs 112.
At each layer of the computing network, current feed-forward information from a lower layer or a computing network input is multiplicatively combined with past feedback information from a higher layer or the same layer at step 306. Each encoder generates current feed-forward information for a higher layer or for itself at step 308. This could include, for example, the processing device 202 of the device 200 using the encoder 104a to multiplicatively combine the feed-forward information (the inputs 112) with the past feedback information from the decoder 106b of the layer 102b. This could also include the processing device 202 of the device 200 using the encoder 104b to multiplicatively combine the feed-forward information 108a from the encoder 104a with the past feedback information from the decoder 106c of the layer 102c. This could further include the processing device 202 of the device 200 using the encoder 104c to multiplicatively combine the feed-forward information 108b from the encoder 104b with past feedback information from itself.
At each layer of the computing network, current feedback information from a higher layer or the same layer is multiplicatively combined with current and/or past feed-forward information from a lower layer or the computing network input at step 310. Each decoder generates current feedback information for a lower layer or for itself at step 312. This could include, for example, the processing device 202 of the device 200 using the decoder 106c to multiplicatively combine the feedback information (the information 108c) from the encoder 104c with current/past feed-forward information from the encoder 104b of the layer 102b. This could also include the processing device 202 of the device 200 using the decoder 106b to multiplicatively combine the feedback information 110c from the decoder 106c with current/past feed-forward information from the encoder 104a of the layer 102a. This could further include the processing device 202 of the device 200 using the decoder 106a to multiplicatively combine the feedback information 110b from the decoder 106b with the current/past feed-forward information (the inputs 112).
Note that, in steps 306-312, each of the layers 102a-102b other than the top layer 102c sends its current feed-forward information to the next higher layer 102b-102c, and each of the layers 102b-102c other than the lowest layer 102a sends its current feedback information to the next lower layer 102a-102b. The top layer 102c uses its current feed-forward information 108c as its current feedback information, and the lowest layer 102a sends its current feedback information to a computing network output as the predicted next values 114. The current feed-forward information provided to the lowest layer 102a denotes current time-series values, and the current feedback information provided from the lowest layer 102a denotes predicted future time-series values. Note that, for each layer 102a-102c, past feedback information can be generated by delaying the current feedback information from the higher layer or from itself. Also, for each layer 102a-102c, past feed-forward information can be generated by delaying the current feed-forward information from the lower layer. In addition, the current feed-forward information to be provided to the encoders 104a-104c can first be passed through the pooling units 120a-120c to reduce the dimensionality or increase the transformational invariance of the time-series data.
In this way, the computing network is used to predict time-series data at step 314. This could include, for example, the processing device 202 of the device 200 using the computing network to predict an entire sequence of time-series data based on a limited number of inputs 112.
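Reusing the illustrative classes above, such iterative prediction could be sketched as follows (again an assumption-laden illustration rather than the disclosed implementation): after the network is seeded with a limited number of observed inputs 112, each predicted next value 114 is fed back in as the next input 112.

```python
def predict_sequence(layers, seed_values, horizon):
    """Seed the network with observed values, then free-run: each predicted
    next value 114 is fed back in as the next input 112 (causal use)."""
    outputs = []
    for t in range(len(seed_values) + horizon):
        x = seed_values[t] if t < len(seed_values) else outputs[-1]
        ff, ffs = x, []
        for layer in layers:                          # encoders, bottom to top
            ffs.append(ff)
            ff = layer.encode(ff)
        fb = ff                                       # top layer feeds itself back
        for layer, ff_below in zip(reversed(layers), reversed(ffs)):
            layer.past_feedback = fb                  # becomes "past" next step
            fb = layer.decode(fb, ff_below)           # decoders, top to bottom
        for layer, ff_below in zip(layers, ffs):
            layer.past_feedforward = ff_below
        outputs.append(fb)                            # predicted next value 114
    return outputs[len(seed_values):]                 # the free-run predictions
```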
Although Fig. 3 illustrates one example of a method 300 for learning and predicting time-series data using a deep multiplicative network, various changes may be made to Fig. 3. For example, while shown as a series of steps, various steps could overlap, occur in parallel, occur in a different order, or occur any number of times. As a particular example, steps 306-314 could generally overlap with each other.
In some embodiments, various functions described in this patent document are implemented or supported by a computer program that is formed from computer readable program code and that is embodied in a computer readable medium. The phrase "computer readable program code" includes any type of computer code, including source code, object code, and executable code. The phrase "computer readable medium" includes any type of medium capable of being accessed by a computer, such as read only memory (ROM), random access memory (RAM), a hard disk drive, a compact disc (CD), a digital video disc (DVD), or any other type of memory. A "non-transitory" computer readable medium excludes wired, wireless, optical, or other communication links that transport transitory electrical or other signals. A non-transitory computer readable medium includes media where data can be permanently stored and media where data can be stored and later overwritten, such as a rewritable optical disc or an erasable memory device.
It may be advantageous to set forth definitions of certain words and phrases used throughout this patent document. The terms "application" and "program" refer to one or more computer programs, software components, sets of instructions, procedures, functions, objects, classes, instances, related data, or a portion thereof adapted for implementation in suitable computer code (including source code, object code, or executable code). The term "communicate," as well as derivatives thereof, encompasses both direct and indirect communication. The terms "include" and "comprise," as well as derivatives thereof, mean inclusion without limitation. The term "or" is inclusive, meaning and/or. The phrase "associated with," as well as derivatives thereof, may mean to include, be included within, interconnect with, contain, be contained within, connect to or with, couple to or with, be communicable with, cooperate with, interleave, juxtapose, be proximate to, be bound to or with, have, have a property of, have a relationship to or with, or the like. The phrase "at least one of," when used with a list of items, means that different combinations of one or more of the listed items may be used, and only one item in the list may be needed. For example, "at least one of: A, B, and C" includes any of the following combinations: A, B, C, A and B, A and C, B and C, and A and B and C.
The description in this patent document should not be read as implying that any particular element, step, or function is an essential or critical element that must be included in the claim scope. Also, none of the claims is intended to invoke 35 U.S.C. § 112(f) with respect to any of the appended claims or claim elements unless the exact words "means for" or "step for" are explicitly used in the particular claim, followed by a participle phrase identifying a function. Use of terms such as (but not limited to) "mechanism," "module," "device," "unit," "component," "element," "member," "apparatus," "machine," "system," "processor," "processing device," or "controller" within a claim is understood and intended to refer to structures known to those skilled in the relevant art, as further modified or enhanced by the features of the claims themselves, and is not intended to invoke 35 U.S.C. § 112(f).
While this disclosure has described certain embodiments and generally associated methods, alterations and permutations of these embodiments and methods will be apparent to those skilled in the art. Accordingly, the above description of example embodiments does not define or constrain this disclosure. Other changes, substitutions, and alterations are also possible without departing from the spirit and scope of this disclosure, as defined by the following claims.
Claims (27)
1. A method comprising:
using a computing network to learn and predict time-series data, the computing network comprising one or more layers, each layer comprising an encoder and a decoder;
wherein the encoder of each layer multiplicatively combines (i) current feed-forward information from a lower layer or a computing network input and (ii) past feedback information from a higher layer or this layer, the encoder of each layer generating current feed-forward information for the higher layer or this layer; and
wherein the decoder of each layer multiplicatively combines (i) current feedback information from the higher layer or this layer and (ii) at least one of: the current feed-forward information from the lower layer or the computing network input or past feed-forward information from the lower layer or the computing network input, the decoder of each layer generating current feedback information for the lower layer or a computing network output.
2. The method of claim 1, wherein:
the computing network comprises multiple layers;
each layer other than a top layer sends its current feed-forward information to a next higher layer; and
each layer other than a lowest layer sends its current feedback information to a next lower layer.
3. The method of claim 2, wherein:
the top layer uses its current feed-forward information as its current feedback information; and
the lowest layer sends its current feedback information to the computing network output.
4. The method of claim 2, wherein:
the current feed-forward information provided to the lowest layer denotes current time-series values; and
the current feedback information provided from the lowest layer denotes predicted future time-series values.
5. The method of claim 1, further comprising:
for each layer, generating the past feedback information from the higher layer or this layer by delaying the current feedback information from the higher layer or this layer.
6. The method of claim 1, further comprising:
for each layer, generating the past feed-forward information from the lower layer or the computing network input by delaying the current feed-forward information from the lower layer or the computing network input.
7. The method of claim 1, wherein the current feed-forward information from the lower layer or the computing network input to be provided to the encoder of each layer is first passed through a pooling unit, the pooling unit reducing a dimensionality or increasing a transformational invariance of the current feed-forward information.
8. The method of claim 1, further comprising:
training the encoder and the decoder of each layer.
9. The method of claim 8, wherein:
the computing network comprises multiple layers that collectively comprise multiple encoders, multiple decoders, and multiple delay units; and
training the encoder and the decoder of each layer comprises:
for each of multiple time steps, forward propagating training data through the encoders and the delay units;
for each of the time steps, backward propagating the training data through the decoders;
updating the encoders and the decoders to improve their reproduction of the training data across the time steps; and
applying post-processing to the encoders and the decoders.
10. An apparatus comprising:
at least one processing device; and
at least one memory storing instructions that, when executed by the at least one processing device, cause the at least one processing device to use a computing network to learn and predict time-series data, the computing network comprising one or more layers, each layer comprising an encoder and a decoder;
wherein the encoder of each layer is configured to multiplicatively combine (i) current feed-forward information from a lower layer or a computing network input and (ii) past feedback information from a higher layer or this layer, the encoder of each layer configured to generate current feed-forward information for the higher layer or this layer; and
wherein the decoder of each layer is configured to multiplicatively combine (i) current feedback information from the higher layer or this layer and (ii) at least one of: the current feed-forward information from the lower layer or the computing network input or past feed-forward information from the lower layer or the computing network input, the decoder of each layer configured to generate current feedback information for the lower layer or a computing network output.
11. The apparatus of claim 10, wherein:
the computing network comprises multiple layers;
each layer other than a top layer is configured to send its current feed-forward information to a next higher layer; and
each layer other than a lowest layer is configured to send its current feedback information to a next lower layer.
12. The apparatus of claim 11, wherein:
the top layer is configured to use its current feed-forward information as its current feedback information; and
the lowest layer is configured to send its current feedback information to the computing network output.
13. The apparatus of claim 11, wherein:
the lowest layer is configured to receive current feed-forward information comprising current time-series values; and
the lowest layer is configured to provide current feedback information comprising predicted future time-series values.
14. The apparatus of claim 10, wherein each layer is configured to delay the current feedback information from the higher layer or this layer in order to generate the past feedback information for this layer.
15. The apparatus of claim 10, wherein each layer is configured to delay the current feed-forward information from the lower layer or the computing network input in order to generate the past feed-forward information for this layer.
16. The apparatus of claim 10, wherein the computing network further comprises multiple pooling units, each pooling unit configured to receive the current feed-forward information from the lower layer or the computing network input and to reduce a dimensionality or increase a transformational invariance of the current feed-forward information.
17. The apparatus of claim 10, wherein the at least one processing device is further configured to train the encoder and the decoder of each layer.
18. The apparatus of claim 17, wherein:
the computing network comprises multiple layers that collectively comprise multiple encoders, multiple decoders, and multiple delay units; and
to train the encoder and the decoder of each layer, the at least one processing device is configured to:
for each of multiple time steps, forward propagate training data through the encoders and the delay units;
for each of the time steps, backward propagate the training data through the decoders;
update the encoders and the decoders to improve their reproduction of the training data across the time steps; and
apply post-processing to the encoders and the decoders.
19. A non-transitory computer-readable medium containing instructions that, when executed by at least one processing device, cause the at least one processing device to:
use a computing network to learn and predict time-series data, the computing network comprising one or more layers, each layer comprising an encoder and a decoder;
wherein the encoder of each layer is configured to multiplicatively combine (i) current feed-forward information from a lower layer or a computing network input and (ii) past feedback information from a higher layer or this layer, the encoder of each layer configured to generate current feed-forward information for the higher layer or this layer; and
wherein the decoder of each layer is configured to multiplicatively combine (i) current feedback information from the higher layer or this layer and (ii) at least one of: the current feed-forward information from the lower layer or the computing network input or past feed-forward information from the lower layer or the computing network input, the decoder of each layer configured to generate current feedback information for the lower layer or a computing network output.
20. The non-transitory computer-readable medium of claim 19, wherein:
the computing network comprises multiple layers;
each layer other than a top layer is configured to send its current feed-forward information to a next higher layer; and
each layer other than a lowest layer is configured to send its current feedback information to a next lower layer.
21. The non-transitory computer-readable medium of claim 20, wherein:
the top layer is configured to use its current feed-forward information as its current feedback information; and
the lowest layer is configured to send its current feedback information to the computing network output.
22. The non-transitory computer-readable medium of claim 20, wherein:
the lowest layer is configured to receive current feed-forward information comprising current time-series values; and
the lowest layer is configured to provide current feedback information comprising predicted future time-series values.
23. The non-transitory computer-readable medium of claim 19, wherein each layer is configured to delay the current feedback information from the higher layer or this layer in order to generate the past feedback information for this layer.
24. The non-transitory computer-readable medium of claim 19, wherein each layer is configured to delay the current feed-forward information from the lower layer or the computing network input in order to generate the past feed-forward information for this layer.
25. The non-transitory computer-readable medium of claim 19, wherein the computing network further comprises multiple pooling units, each pooling unit configured to receive the current feed-forward information from the lower layer or the computing network input and to reduce a dimensionality or increase a transformational invariance of the current feed-forward information.
26. The non-transitory computer-readable medium of claim 19, further containing instructions that, when executed by the at least one processing device, cause the at least one processing device to train the encoder and the decoder of each layer.
27. The non-transitory computer-readable medium of claim 26, wherein:
the computing network comprises multiple layers that collectively comprise multiple encoders, multiple decoders, and multiple delay units; and
the instructions that cause the at least one processing device to train the encoder and the decoder of each layer comprise instructions that, when executed, cause the at least one processing device to:
for each of multiple time steps, forward propagate training data through the encoders and the delay units;
for each of the time steps, backward propagate the training data through the decoders;
update the encoders and the decoders to improve their reproduction of the training data across the time steps; and
apply post-processing to the encoders and the decoders.
Applications Claiming Priority (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201662382774P | 2016-09-01 | 2016-09-01 | |
US62/382774 | 2016-09-01 | ||
US15/666379 | 2017-08-01 | ||
US15/666,379 US10839316B2 (en) | 2016-08-08 | 2017-08-01 | Systems and methods for learning and predicting time-series data using inertial auto-encoders |
US15/681,942 US11353833B2 (en) | 2016-08-08 | 2017-08-21 | Systems and methods for learning and predicting time-series data using deep multiplicative networks |
US15/681942 | 2017-08-21 | ||
PCT/US2017/049358 WO2018045021A1 (en) | 2016-09-01 | 2017-08-30 | Systems and methods for learning and predicting time-series data using deep multiplicative networks |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109643387A true CN109643387A (en) | 2019-04-16 |
Family
ID=61301606
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201780053794.XA Pending CN109643387A (en) | 2016-09-01 | 2017-08-30 | Systems and methods for learning and predicting time-series data using deep multiplicative networks
Country Status (5)
Country | Link |
---|---|
EP (1) | EP3507746A4 (en) |
CN (1) | CN109643387A (en) |
AU (1) | AU2017321524B2 (en) |
CA (1) | CA3033753A1 (en) |
WO (1) | WO2018045021A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110175338A (en) * | 2019-05-31 | 2019-08-27 | 北京金山数字娱乐科技有限公司 | Data processing method and device
CN112581031A (en) * | 2020-12-30 | 2021-03-30 | 杭州朗阳科技有限公司 | Method for realizing real-time monitoring of motor abnormity by Recurrent Neural Network (RNN) through C language |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111241688B (en) * | 2020-01-15 | 2023-08-25 | 北京百度网讯科技有限公司 | Method and device for monitoring composite production process |
CN111709785B (en) * | 2020-06-18 | 2023-08-22 | 抖音视界有限公司 | Method, apparatus, device and medium for determining user retention time |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6125105A (en) * | 1997-06-05 | 2000-09-26 | Nortel Networks Corporation | Method and apparatus for forecasting future values of a time series |
US9146546B2 (en) * | 2012-06-04 | 2015-09-29 | Brain Corporation | Systems and apparatus for implementing task-specific learning using spiking neurons |
TR201514432T1 (en) * | 2013-06-21 | 2016-11-21 | Aselsan Elektronik Sanayi Ve Ticaret Anonim Sirketi | Method for pseudo-recurrent processing of data using a feedforward neural network architecture |
US11080587B2 (en) * | 2015-02-06 | 2021-08-03 | Deepmind Technologies Limited | Recurrent neural networks for data item generation |
-
2017
- 2017-08-30 AU AU2017321524A patent/AU2017321524B2/en not_active Ceased
- 2017-08-30 CN CN201780053794.XA patent/CN109643387A/en active Pending
- 2017-08-30 CA CA3033753A patent/CA3033753A1/en active Pending
- 2017-08-30 WO PCT/US2017/049358 patent/WO2018045021A1/en unknown
- 2017-08-30 EP EP17847459.9A patent/EP3507746A4/en not_active Withdrawn
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110175338A (en) * | 2019-05-31 | 2019-08-27 | 北京金山数字娱乐科技有限公司 | Data processing method and device |
CN110175338B (en) * | 2019-05-31 | 2023-09-26 | 北京金山数字娱乐科技有限公司 | Data processing method and device |
CN112581031A (en) * | 2020-12-30 | 2021-03-30 | 杭州朗阳科技有限公司 | Method for realizing real-time monitoring of motor abnormity by Recurrent Neural Network (RNN) through C language |
CN112581031B (en) * | 2020-12-30 | 2023-10-17 | 杭州朗阳科技有限公司 | Method for implementing real-time monitoring of motor abnormality by Recurrent Neural Network (RNN) through C language |
Also Published As
Publication number | Publication date |
---|---|
EP3507746A4 (en) | 2020-06-10 |
AU2017321524A1 (en) | 2019-02-28 |
WO2018045021A1 (en) | 2018-03-08 |
EP3507746A1 (en) | 2019-07-10 |
AU2017321524B2 (en) | 2022-03-10 |
CA3033753A1 (en) | 2018-03-08 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 40001498; Country of ref document: HK
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190416