CN108510065A - Computing device and computation method applied to a long short-term memory neural network - Google Patents

Computing device and computation method applied to a long short-term memory neural network

Info

Publication number
CN108510065A
Authority
CN
China
Prior art keywords
current time
vector
value vector
term
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810275313.6A
Other languages
Chinese (zh)
Inventor
韩银和
许浩博
王颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Computing Technology of CAS
Original Assignee
Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Computing Technology of CAS filed Critical Institute of Computing Technology of CAS
Priority to CN201810275313.6A priority Critical patent/CN108510065A/en
Publication of CN108510065A publication Critical patent/CN108510065A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a computing device and computation method applied to a long short-term memory (LSTM) neural network. The computing device includes: a computing unit for executing the calculations of the forget gate function, the input gate function, the output gate function and the memory cell state of the LSTM neural network; and a vector calculation unit for obtaining the output value vector of the LSTM neural network at the current time step and the memory cell state value vector at the current time step, based on the memory cell state value vector of the previous time step together with the forget gate value vector, the input gate value vector, the output gate value vector and the candidate state value vector of the memory cell obtained for the current time step. The computation method and computing device according to the present invention can improve the resource utilization and computational efficiency of LSTM neural networks.

Description

Computing device and computation method applied to a long short-term memory neural network
Technical field
The present invention relates to the field of deep learning, and more particularly to a computing device and computation method applied to a long short-term memory neural network.
Background art
A long short-term memory (LSTM) neural network is a special kind of recurrent neural network that has the ability to learn long-term dependencies, and it can be applied to language translation, robot control, image analysis, document summarization, speech recognition, handwriting recognition and the like. An LSTM network can remember information over time while processing target data. In LSTM applications, a processing unit with gating decisions, comprising an input gate, a forget gate and an output gate, is typically added to the algorithm. This processing unit judges according to learned rules whether input information is useful: information that meets the algorithm's criterion is retained, while non-conforming information is discarded through the forget gate. The bulk of the LSTM computation consists of multiply-accumulate operations and the iterative computation of the gate value vectors at each layer: gate value vectors are multiplied and accumulated with weights, and the result is passed on as the input of the next layer of the network, where the same operations are executed.
However, in existing LSTM network computation, a single gate structure is generally used for the calculation, so that the multiply-accumulate operations on vectors and weights and the computation of some gate value vectors stand in a serial dependency. As a result, it is difficult to maintain high utilization of computing resources and fast data processing at the same time, and idle states of the computing resources inevitably occur, reducing resource utilization. In addition, LSTM networks suffer from a high number of memory accesses and high operating power consumption.
Therefore, the prior art needs to be improved: by increasing the real-time data processing capability and the computing resource utilization of LSTM neural networks and by reducing computation power consumption, LSTM neural networks can be pushed into wider fields such as smart wearables, intelligent robots, autonomous driving and pattern recognition.
Summary of the invention
It is an object of the present invention to overcome the above defects of the prior art and to provide a computing device adapted to the matrix operations and vector operations in an LSTM network.
According to a first aspect of the invention, a computing device applied to a long short-term memory neural network is provided. The computing device includes:
a computing unit for executing the calculations of the forget gate function, the input gate function, the output gate function and the memory cell state of the long short-term memory neural network, to obtain the forget gate value vector at the current time step, the input gate value vector at the current time step, the output gate value vector at the current time step and the candidate state value vector of the memory cell at the current time step, wherein the candidate state value vector of the memory cell at the current time step reflects the short-term memory of the long short-term memory neural network;
a vector calculation unit for obtaining the output value vector of the long short-term memory neural network at the current time step and the memory cell state value vector at the current time step, based on the memory cell state value vector of the previous time step together with the forget gate value vector at the current time step, the input gate value vector at the current time step, the output gate value vector at the current time step and the candidate state value vector of the memory cell at the current time step, wherein the memory cell state value vector of the previous time step reflects the long-term memory of the long short-term memory neural network.
In one embodiment, the computing device of the invention includes four computing units that execute the relevant calculations in parallel, wherein a first computing unit executes the calculation of the forget gate function to obtain the forget gate value vector at the current time step; a second computing unit executes the calculation of the input gate function to obtain the input gate value vector at the current time step; a third computing unit executes the calculation of the memory cell state to obtain the candidate state value vector of the memory cell at the current time step; and a fourth computing unit executes the calculation of the output gate function to obtain the output gate value vector at the current time step.
In one embodiment, the vector calculation unit includes:
a first multiplication unit for receiving the memory cell state value vector of the previous time step and the forget gate value vector at the current time step from the first computing unit;
a second multiplication unit for receiving the input gate value vector at the current time step from the second computing unit and the candidate state value vector of the memory cell at the current time step from the third computing unit;
an addition unit for receiving the outputs of the first multiplication unit and the second multiplication unit and performing an addition operation to obtain the memory cell state value vector at the current time step;
an activation processing unit for performing activation processing on the memory cell state value vector at the current time step coming from the addition unit;
a third multiplication unit for receiving the result coming from the activation processing unit and the output gate value vector at the current time step coming from the fourth computing unit, to obtain the output value vector of the long short-term memory neural network at the current time step.
In one embodiment, the activation processing unit uses a tanh activation function for its processing.
In one embodiment, the four computing units have the same circuit structure.
In one embodiment, the computing unit includes:
a plurality of multipliers for performing the multiplication of corresponding elements of a weight matrix of the long short-term memory network and an input vector;
an addition tree structure composed of a plurality of adders, for performing addition on the results of the plurality of multipliers so as to obtain the product of the weight matrix and the input vector;
an accumulator for accumulating the results obtained by the addition tree structure;
an activation function processing unit for performing activation processing on the result of the accumulator.
In one embodiment, the activation function processing unit uses a sigmoid function for the activation processing.
According to a second aspect of the invention, a computation method applied to a long short-term memory neural network is provided. The computation method uses the computing device according to the present invention to execute the relevant calculations of the long short-term memory neural network, so as to obtain the output value vector at the current time step and the memory cell state value vector at the current time step.
In the computation method of the present invention, the computing unit in the computing device executes the following calculations:
f_t = σ(W_f · x_t + U_f · h_{t-1} + b_f)
i_t = σ(W_i · x_t + U_i · h_{t-1} + b_i)
o_t = σ(W_o · x_t + U_o · h_{t-1} + b_o)
c̃_t = tanh(W_c · x_t + U_c · h_{t-1} + b_c)
The vector calculation unit in the computing device executes the following calculations:
c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t
h_t = o_t ⊙ tanh(c_t)
where "·" denotes matrix-vector multiplication, "⊙" denotes the element-wise multiplication of vectors, "σ" denotes the sigmoid activation operation, "tanh" denotes the hyperbolic tangent activation operation, W_f and U_f are the weight matrices of the forget gate, b_f is the bias term of the forget gate, W_i and U_i are the weight matrices of the input gate, b_i is the bias term of the input gate, W_c and U_c are the weight matrices of the memory cell state, b_c is the bias term of the memory cell state, W_o and U_o are the weight matrices of the output gate, b_o is the bias term of the output gate, x_t is the input vector at the current time step, h_{t-1} is the output value vector of the long short-term memory neural network at the previous time step, h_t is the output value vector of the long short-term memory neural network at the current time step, f_t is the forget gate value vector at the current time step, i_t is the input gate value vector at the current time step, o_t is the output gate value vector at the current time step, c̃_t is the candidate state value vector of the memory cell at the current time step, c_{t-1} is the memory cell state value vector of the previous time step, and c_t is the calculated memory cell state value vector at the current time step.
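To make these formulas concrete, the following is a minimal NumPy sketch of one LSTM time step as split between the computing units and the vector calculation unit. It is an illustration only, not the hardware implementation described herein; the function names, the dictionary layout of the weights and all shapes are our own assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b are dicts keyed by 'f', 'i', 'c', 'o'."""
    # Computing-unit part: the four gate/state functions (parallelizable).
    f_t = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])      # forget gate
    i_t = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])      # input gate
    o_t = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])      # output gate
    c_tilde = np.tanh(W['c'] @ x_t + U['c'] @ h_prev + b['c'])  # candidate state
    # Vector-calculation-unit part: element-wise combination.
    c_t = f_t * c_prev + i_t * c_tilde  # memory cell state (long-term memory)
    h_t = o_t * np.tanh(c_t)            # output value vector
    return h_t, c_t
```

A caller would initialize each W[k] as an n-by-m matrix, each U[k] as an n-by-n matrix and each b[k] as a length-n vector, and iterate lstm_step over the input sequence, feeding h_t and c_t back in at the next time step.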
According to a third aspect of the invention, a long short-term memory neural network processor is provided. The neural network processor includes:
the computing device provided by the invention, for executing the relevant calculations of the long short-term memory network;
a storage unit for storing data and instructions;
a control unit for controlling, according to the instructions stored in the storage unit, the relevant calculations of the computing device and the input and output of data.
Compared with the prior art, the present invention has the following advantage: aimed at the characteristics of the matrix operations and vector operations in LSTM networks, it provides a device adapted to executing the relevant calculations in an LSTM, which improves resource utilization and reduces computation power consumption.
Description of the drawings
The following drawings merely illustrate and explain the present invention schematically and do not limit its scope, in which:
Fig. 1 shows a schematic structural diagram of a long short-term memory neural network in the prior art;
Fig. 2 shows a schematic diagram of a computing device for a long short-term memory neural network according to an embodiment of the invention;
Fig. 3 shows a schematic structural diagram of a computing unit in the computing device of Fig. 2;
Fig. 4 shows a schematic structural diagram of a neural network processor based on the computing device of the invention.
Detailed description of embodiments
In order to make the objects, technical solutions, design methods and advantages of the present invention clearer, the present invention is described in more detail below through specific embodiments in conjunction with the accompanying drawings. It should be understood that the specific embodiments described herein are only intended to explain the present invention and are not intended to limit it.
Fig. 1 shows a schematic structural diagram of a typical long short-term memory network in the prior art. The typical structure is mainly composed of an input gate (i_t denotes the input gate value vector or gate function at time t), an output gate (o_t denotes the output gate value vector or gate function at time t), a forget gate (f_t denotes the forget gate value vector or gate function at time t) and a memory cell (c_t denotes the state value vector of the memory cell at time t).
As can be seen from Fig. 1, at time t the LSTM has three input values: the network input value vector x_t at the current time step, the LSTM output value vector h_{t-1} of the previous time step, and the state value vector c_{t-1} of the memory cell at the previous time step, where c_{t-1} carries the long-term memory of the LSTM (not explicitly shown in Fig. 1). The LSTM has two output values: the LSTM output value vector h_t at the current time step and the state value vector c_t of the memory cell at the current time step. The input gate determines how much of the network input value x_t at the current time step is saved into the current memory cell state c_t; the forget gate determines how much of the memory cell state c_{t-1} of the previous time step is carried over into the memory cell state c_t at the current time step; and the output gate controls how much of the memory cell state c_t is output to the current output value h_t of the LSTM. Each of the above gate values is a real vector with entries between 0 and 1: if a gate value is 1, multiplying any vector by it leaves the vector unchanged, and if a gate value is 0, multiplying any vector by it yields the zero vector.
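As a purely numerical illustration of this gating behaviour (the vectors below are arbitrary example values, not taken from the patent):

```python
import numpy as np

c_prev = np.array([0.5, -1.2, 3.0])    # some previous memory cell state
gate_open = np.array([1.0, 1.0, 1.0])  # gate value 1: the vector is unchanged
gate_shut = np.array([0.0, 0.0, 0.0])  # gate value 0: the vector is zeroed
gate_mix = np.array([0.9, 0.1, 0.5])   # partial gating, element by element

print(gate_open * c_prev)   # [ 0.5 -1.2  3. ]
print(gate_shut * c_prev)   # [ 0. -0.  0.]
print(gate_mix * c_prev)    # [ 0.45 -0.12  1.5 ]
```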
In the LSTM structure, the input gate, the output gate and the forget gate jointly regulate the data transmitted to the memory cell at each time point. The memory cell processes input data over a number of consecutive time points, each pair of adjacent time points being separated by the same time interval. Typically, at each time point the LSTM determines its two output values h_t and c_t according to the following formulas:
Input gate function:
i_t = σ(W_i · x_t + U_i · h_{t-1} + b_i) (1)
Forget gate function:
f_t = σ(W_f · x_t + U_f · h_{t-1} + b_f) (2)
Output gate function:
o_t = σ(W_o · x_t + U_o · h_{t-1} + b_o) (3)
Candidate state of the current memory cell:
c̃_t = tanh(W_c · x_t + U_c · h_{t-1} + b_c) (4)
Memory cell state at the current time step:
c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t (5)
The output value of the LSTM network can be expressed as:
h_t = o_t ⊙ tanh(c_t) (6)
where "·" denotes matrix-vector multiplication, "⊙" denotes the element-wise multiplication of vectors, "σ" denotes the sigmoid activation operation, "tanh" denotes the hyperbolic tangent activation operation, W_f and U_f are the weight matrices of the forget gate, b_f is the bias term of the forget gate, W_i and U_i are the weight matrices of the input gate, b_i is the bias term of the input gate, W_c and U_c are the weight matrices of the memory cell state, b_c is the bias term of the memory cell state, W_o and U_o are the weight matrices of the output gate, b_o is the bias term of the output gate, x_t is the input vector at the current time step, h_{t-1} is the output value vector of the LSTM neural network at the previous time step, h_t is the output value vector of the LSTM neural network at the current time step, f_t is the forget gate value vector at the current time step, i_t is the input gate value vector at the current time step, o_t is the output gate value vector at the current time step, c̃_t is the candidate state value vector of the memory cell at the current time step (it reflects the current memory of the LSTM), c_{t-1} is the memory cell state value vector of the previous time step (it reflects the long-term memory of the LSTM), and c_t is the calculated memory cell state value vector at the current time step.
According to one embodiment of the present invention, a computing device oriented to the matrix and vector operations of an LSTM is provided. Referring to Fig. 2, the computing device includes four computing units (labeled computing units 0-3) and a vector calculation unit 200, wherein computing unit 0 is used to execute the forget gate function in the LSTM, i.e. to compute formula (2), obtaining the vector f_t; computing unit 1 is used to execute the input gate function in the LSTM, i.e. to compute formula (1), obtaining the vector i_t; computing unit 2 is used to execute the memory cell state in the LSTM, i.e. to compute formula (4), obtaining the vector c̃_t; and computing unit 3 is used to execute the output gate function in the LSTM, i.e. to compute formula (3), obtaining the vector o_t.
The vector calculation unit 200 uses the obtained f_t, i_t, c̃_t and o_t together with the memory cell state value vector c_{t-1} of the previous time step to obtain the current LSTM output value vector h_t, i.e. formula (6), and the current memory cell state value vector c_t, i.e. formula (5).
Specifically, in this embodiment, the vector calculation unit 200 includes a multiplication unit 210, a multiplication unit 220, an addition unit 230, an activation processing unit 240 and a multiplication unit 250, wherein the multiplication unit 210 receives the output f_t of computing unit 0 and the state c_{t-1} of the memory cell at the previous time step and executes the corresponding vector multiplication, obtaining f_t ⊙ c_{t-1}; the multiplication unit 220 receives the output i_t of computing unit 1 and the output c̃_t of computing unit 2 and executes the corresponding vector multiplication, obtaining i_t ⊙ c̃_t; the addition unit 230 receives the outputs of the multiplication unit 210 and the multiplication unit 220 and executes the addition, obtaining the sum f_t ⊙ c_{t-1} + i_t ⊙ c̃_t, i.e. the state c_t of the current memory cell, and the state c_t can also serve as an output to participate in the computation of the next time step; the activation processing unit 240 performs activation processing on the result of the addition unit 230, obtaining tanh(c_t) when, for example, the activation function f(z) = tanh(z) is used; and the multiplication unit 250 executes the vector multiplication of the activation result with the output o_t of computing unit 3, thereby obtaining the current LSTM output value h_t = o_t ⊙ tanh(c_t).
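The dataflow through these five subunits can be summarized by the following software model (an illustrative sketch under our own naming conventions, not the circuit itself):

```python
import numpy as np

def vector_calc_unit_200(f_t, i_t, c_tilde, o_t, c_prev):
    """Software model of vector calculation unit 200 of Fig. 2."""
    p1 = f_t * c_prev    # multiplication unit 210: f_t ⊙ c_{t-1}
    p2 = i_t * c_tilde   # multiplication unit 220: i_t ⊙ c̃_t
    c_t = p1 + p2        # addition unit 230: memory cell state c_t
    a = np.tanh(c_t)     # activation processing unit 240: tanh(c_t)
    h_t = o_t * a        # multiplication unit 250: h_t = o_t ⊙ tanh(c_t)
    return h_t, c_t
```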
It should be understood that the multiplication unit 210, multiplication unit 220, addition unit 230, activation processing unit 240 and multiplication unit 250 in the vector calculation unit 200 can be realized with general-purpose multipliers and adders, with FPGAs or with other dedicated devices. In addition, the activation processing unit 240 need not exist as a separate processing unit and may, for example, be integrated into the addition unit 230 or the multiplication unit 250.
It should be noted that the computing device shown in Fig. 2, with its group of four computing units, can compute f_t, i_t, c̃_t and o_t in parallel, so as to quickly obtain the gate functions and the memory cell state required for calculating the LSTM output values. In other embodiments, however, more or fewer than four computing units may be used; for example, a single computing unit may compute f_t, i_t, c̃_t and o_t in turn and then feed them into the vector calculation unit together.
In the case of multiple computing units, the individual computing units may have the same or different circuit structures, as long as the functions of the invention can be realized.
Fig. 3 shows a schematic structural diagram of a computing unit according to an embodiment of the present invention. The computing unit includes a plurality of multipliers (labeled multipliers 0-n), an addition tree structure composed of a plurality of adders, an accumulator 340 and an activation function processing unit 350. A two-level addition tree structure composed of adder 310, adder 320 and adder 330 is illustrated; for example, when n is 9, the results of multipliers 0-4 are output to adder 310 to be added, and the results of multipliers 5-9 are output to adder 320 to be added.
It should be understood that those skilled in the art can design multi-level addition tree structures according to the requirements of computation scale and computational efficiency, so as to increase the computation speed; for example, every two multipliers may feed one adder, every two adders may feed one adder of the next level, and so on, as the sketch below illustrates.
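As an illustration of such a pairwise addition tree, the following sketch (a software model under our own assumptions, not the patented circuit) halves the number of partial sums at each level:

```python
def adder_tree_sum(products):
    """Sum a list of multiplier outputs level by level,
    the way a binary tree of adders would."""
    level = list(products)
    while len(level) > 1:
        if len(level) % 2:  # an odd operand is padded with zero
            level.append(0.0)
        level = [level[k] + level[k + 1] for k in range(0, len(level), 2)]
    return level[0]

# For example, adder_tree_sum([1.0, 2.0, 3.0, 4.0, 5.0]) returns 15.0
# via the levels [3.0, 7.0, 5.0] and [10.0, 5.0].
```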
Multipliers 0-n receive input data and weights, execute the multiplication operations and output their results to adder 310 and adder 320. Here, the input data comprise the elements of the LSTM input vector x_t or of the LSTM output value h_{t-1} of the previous time step, and the weights comprise the elements of the weight matrices corresponding to the forget gate, the input gate, the output gate and the memory cell.
Adder 330 receives the results coming from adder 310 and adder 320 and executes the addition.
The accumulator 340 accumulates the results coming from adder 330 over multiple passes and outputs the final result (for example, W_i · x_t + U_i · h_{t-1} + b_i) to the activation function processing unit 350.
The activation function processing unit 350 applies an activation function to the accumulated result coming from the accumulator 340, for example processing it with the sigmoid function σ(z) = 1 / (1 + e^(-z)).
For example, when the computing unit of Fig. 3 computes the formula i_t = σ(W_i · x_t + U_i · h_{t-1} + b_i), multipliers 0-n first execute the multiplications of the corresponding elements of W_i and x_t, the results are summed through adder 310, adder 320 and adder 330, and the result W_i · x_t is output to the accumulator 340. Next, multipliers 0-n execute the multiplications of the corresponding elements of U_i and h_{t-1}; the results are likewise summed through adder 310, adder 320 and adder 330, and the result U_i · h_{t-1} is output to the accumulator 340. The accumulator 340 executes the accumulation of W_i · x_t and U_i · h_{t-1} and can further accumulate the bias term b_i (not shown), obtaining W_i · x_t + U_i · h_{t-1} + b_i. Finally, the activation function processing unit 350 performs the activation processing, ultimately obtaining i_t = σ(W_i · x_t + U_i · h_{t-1} + b_i). Formulas (2), (3) and (4) can be executed in the same way with this computing unit structure.
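The following sketch models this dataflow for a single element of i_t (our own illustrative assumptions: the adder tree is modeled as a flat sum, and in hardware the two passes would reuse the same multipliers 0-n in parallel across a row):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def compute_unit_element(w_row, x_t, u_row, h_prev, bias):
    """One element of i_t = sigmoid(W_i·x_t + U_i·h_{t-1} + b_i)."""
    acc = 0.0                      # accumulator 340, initially empty
    acc += np.sum(w_row * x_t)     # pass 1: multipliers 0-n, then the adder tree
    acc += np.sum(u_row * h_prev)  # pass 2: same multipliers on U_i and h_{t-1}
    acc += bias                    # accumulate the bias term b_i
    return sigmoid(acc)            # activation function processing unit 350
```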
The computing device of the present invention can be applied in a long short-term memory neural network processor to accelerate the data operations. Referring to Fig. 4, the neural network processor 400 includes a storage structure 410, the computing units 420 of the invention, the vector calculation units 430 of the invention and a control unit 440, wherein the computing units 420 are shown as a plurality labeled 0-n, and the vector calculation units 430 are likewise shown as a plurality labeled 0-k.
In general, the neural network processor 400 provided by the invention is based on a storage-control-computation structure. The storage structure 410 stores the data participating in the computation, the neural network weights and the processor operation instructions; the control structure (control unit 440) parses the operation instructions and generates control signals, which are used to control the scheduling and storage of data within the neural network processor and the computation process of the neural network; and the computation structure (including the computing units 420 and the vector calculation units 430) executes the relevant calculations in the neural network processor.
In the embodiment shown in Fig. 4, the storage structure 410 is further subdivided into an input data storage unit 411, a weight storage unit 412, an instruction storage unit 413 and an output data storage unit 414.
The input data storage unit 411 stores the data participating in the computation, including the LSTM input data vectors and intermediate results; the output data storage unit 414 stores the computed results, e.g. c_t and h_t; the weight storage unit 412 stores the weights of the LSTM neural network, e.g. the forget gate weights W_f and U_f and the input gate weights W_i and U_i; and the instruction storage unit 413 stores the instruction information participating in the computation, the instructions being parsed to realize the computation of the LSTM neural network.
The control unit 440 is connected to the output data storage unit 414, the weight storage unit 412, the instruction storage unit 413, the computing units 420 and the vector calculation units 430, respectively. The control unit 440 obtains the instructions stored in the instruction storage unit 413 and parses them (for example, instructions for loading data into a computing unit, starting a computation, ending a computation, or storing computation results into a storage unit), and it can control the computing units 420 and the vector calculation units 430 to carry out the neural network computation according to the control signals obtained by parsing the instructions. The control unit 440 itself can be a microcontroller.
The computing units 420 and the vector calculation units 430 execute the corresponding neural network computations according to the control signals generated by the control unit 440. The computing units 420 and the vector calculation units 430 are associated with one or more storage units; the computing units 420 can obtain the data to be computed from the input data storage unit 411 associated with them and can write data to the output data storage unit 414 associated with them. The computing units 420 and the vector calculation units 430 complete most of the operations in the LSTM neural network, namely the above-mentioned vector-matrix multiplications, vector-vector multiplications and activation processing.
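For intuition only, a schematic software model of this instruction-driven control flow is given below; the patent does not specify an instruction set, so the opcodes and the dictionary standing in for the storage units are entirely hypothetical:

```python
from enum import Enum, auto

class Op(Enum):
    LOAD = auto()     # move data from a storage unit into a computing unit
    COMPUTE = auto()  # run a computation on the staged operands
    STORE = auto()    # write a result back to the output data storage unit

def run_program(program, storage):
    """Hypothetical control loop of control unit 440: 'storage' is a dict
    standing in for storage units 411-414; the opcodes are invented."""
    staged = {}  # operands staged inside the computing units
    for op, src, dst in program:
        if op is Op.LOAD:
            staged[dst] = storage[src]                 # control signal: load
        elif op is Op.COMPUTE:
            staged[dst] = sum(staged[k] for k in src)  # placeholder operation
        elif op is Op.STORE:
            storage[dst] = staged[src]                 # control signal: store

# Example: storage = {'x': 1.0, 'h': 2.0}
# run_program([(Op.LOAD, 'x', 'a'), (Op.LOAD, 'h', 'b'),
#              (Op.COMPUTE, ('a', 'b'), 'r'), (Op.STORE, 'r', 'out')], storage)
```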
It should be noted that the data paths between the storage units (including buffer units), the control unit and the computing units (including the vector calculation units) can use interconnection techniques such as H-TREE or FAT-TREE. The storage units can be common storage media such as static random-access memory (SRAM), dynamic random-access memory (DRAM) or register files, or novel storage types such as 3D memory devices. In addition, in some cases the data required for a computation may not be stored in the above storage units; for example, for LSTM neural networks of larger computation scale, data can also be exchanged with an external data storage (see Fig. 4).
Those skilled in the art should understand that, although not shown in Fig. 4, the processor also includes an address-addressing function for mapping an input index to the correct storage address, so as to obtain the required data or instructions from the storage units. The address-addressing function can be implemented in the control unit or realized in the form of a separate unit.
The neural network processor provided by the invention can be a microprocessor designed for neural network computation, or it can be only a part of a microprocessor. This neural network processor can be applied to fields such as word processing, speech recognition and processing, multilingual translation, image recognition, biometric recognition and intelligent control; it can serve as an intelligent computing processor, in robots and in mobile devices, and it can also be used to build supercomputers for large-scale neural network computation.
It should be noted that, although the steps are described above in a particular order, this does not mean that the steps must be executed in that particular order. In fact, some of these steps can be executed concurrently, or even in a different order, as long as the required functions can be realized.
The present invention may be a system, a method and/or a computer program product. The computer program product may include a computer-readable storage medium carrying computer-readable program instructions for causing a processor to realize various aspects of the present invention.
The computer-readable storage medium can be a tangible device that holds and stores the instructions used by an instruction execution device. The computer-readable storage medium may include, but is not limited to, an electric storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device or any suitable combination of the above. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanical coding device such as a punch card or an in-groove raised structure on which instructions are stored, and any suitable combination of the above.
Various embodiments of the present invention have been described above. The above description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the illustrated embodiments. The choice of terms used herein is intended to best explain the principles of the embodiments, their practical application or their technical improvement over the market, or to enable other persons of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A computing device applied to a long short-term memory neural network, comprising:
a computing unit for executing the calculations of the forget gate function, the input gate function, the output gate function and the memory cell state of the long short-term memory neural network, obtaining the forget gate value vector at the current time step, the input gate value vector at the current time step, the output gate value vector at the current time step and the candidate state value vector of the memory cell at the current time step, wherein the candidate state value vector of the memory cell at the current time step reflects the short-term memory of the long short-term memory neural network;
a vector calculation unit for obtaining the output value vector of the long short-term memory neural network at the current time step and the memory cell state value vector at the current time step, based on the memory cell state value vector of the previous time step together with the forget gate value vector at the current time step, the input gate value vector at the current time step, the output gate value vector at the current time step and the candidate state value vector of the memory cell at the current time step, wherein the memory cell state value vector of the previous time step reflects the long-term memory of the long short-term memory neural network.
2. The computing device according to claim 1, characterized in that it comprises four computing units that execute the relevant calculations in parallel, wherein a first computing unit executes the calculation of the forget gate function to obtain the forget gate value vector at the current time step; a second computing unit executes the calculation of the input gate function to obtain the input gate value vector at the current time step; a third computing unit executes the calculation of the memory cell state to obtain the candidate state value vector of the memory cell at the current time step; and a fourth computing unit executes the calculation of the output gate function to obtain the output gate value vector at the current time step.
3. The computing device according to claim 2, characterized in that the vector calculation unit comprises:
a first multiplication unit for receiving the memory cell state value vector of the previous time step and the forget gate value vector at the current time step from the first computing unit;
a second multiplication unit for receiving the input gate value vector at the current time step from the second computing unit and the candidate state value vector of the memory cell at the current time step from the third computing unit;
an addition unit for receiving the outputs of the first multiplication unit and the second multiplication unit and performing an addition operation to obtain the memory cell state value vector at the current time step;
an activation processing unit for performing activation processing on the memory cell state value vector at the current time step coming from the addition unit;
a third multiplication unit for receiving the result coming from the activation processing unit and the output gate value vector at the current time step coming from the fourth computing unit, to obtain the output value vector of the long short-term memory neural network at the current time step.
4. The computing device according to claim 3, characterized in that the activation processing unit uses a tanh activation function for its processing.
5. The computing device according to claim 2, characterized in that the four computing units have the same circuit structure.
6. The computing device according to claim 5, characterized in that the computing unit comprises:
a plurality of multipliers for performing the multiplication of corresponding elements of a weight matrix of the long short-term memory network and an input vector;
an addition tree structure composed of a plurality of adders, for performing addition on the results of the plurality of multipliers so as to obtain the product of the weight matrix and the input vector;
an accumulator for accumulating the results obtained by the addition tree structure;
an activation function processing unit for performing activation processing on the result of the accumulator.
7. The computing device according to claim 6, characterized in that the activation function processing unit uses a sigmoid function for the activation processing.
8. A computation method applied to a long short-term memory neural network, the method using the computing device according to any one of claims 1 to 7 to execute the relevant calculations of the long short-term memory neural network, so as to obtain the output value vector at the current time step and the memory cell state value vector at the current time step.
9. The computation method according to claim 8, wherein the computing unit in the computing device executes the following calculations:
f_t = σ(W_f · x_t + U_f · h_{t-1} + b_f)
i_t = σ(W_i · x_t + U_i · h_{t-1} + b_i)
o_t = σ(W_o · x_t + U_o · h_{t-1} + b_o)
c̃_t = tanh(W_c · x_t + U_c · h_{t-1} + b_c)
and the vector calculation unit in the computing device executes the following calculations:
c_t = f_t ⊙ c_{t-1} + i_t ⊙ c̃_t
h_t = o_t ⊙ tanh(c_t)
where "·" denotes matrix-vector multiplication, "⊙" denotes the element-wise multiplication of vectors, "σ" denotes the sigmoid activation operation, "tanh" denotes the hyperbolic tangent activation operation, W_f and U_f are the weight matrices of the forget gate, b_f is the bias term of the forget gate, W_i and U_i are the weight matrices of the input gate, b_i is the bias term of the input gate, W_c and U_c are the weight matrices of the memory cell state, b_c is the bias term of the memory cell state, W_o and U_o are the weight matrices of the output gate, b_o is the bias term of the output gate, x_t is the input vector at the current time step, h_{t-1} is the output value vector of the long short-term memory neural network at the previous time step, h_t is the output value vector of the long short-term memory neural network at the current time step, f_t is the forget gate value vector at the current time step, i_t is the input gate value vector at the current time step, o_t is the output gate value vector at the current time step, c̃_t is the candidate state value vector of the memory cell at the current time step, c_{t-1} is the memory cell state value vector of the previous time step, and c_t is the calculated memory cell state value vector at the current time step.
10. A long short-term memory neural network processor, comprising:
the computing device according to any one of claims 1 to 7, for executing the relevant calculations of the long short-term memory network;
a storage unit for storing data and instructions;
a control unit for controlling, according to the instructions stored in the storage unit, the relevant calculations of the computing device and the input and output of data.
CN201810275313.6A 2018-03-30 2018-03-30 Computing device and computation method applied to a long short-term memory neural network Pending CN108510065A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810275313.6A 2018-03-30 2018-03-30 Computing device and computation method applied to a long short-term memory neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810275313.6A 2018-03-30 2018-03-30 Computing device and computation method applied to a long short-term memory neural network

Publications (1)

Publication Number Publication Date
CN108510065A true CN108510065A (en) 2018-09-07

Family

ID=63377959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810275313.6A Computing device and computation method applied to a long short-term memory neural network 2018-03-30 2018-03-30

Country Status (1)

Country Link
CN (1) CN108510065A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615449A (en) * 2018-10-25 2019-04-12 阿里巴巴集团控股有限公司 Prediction method and device, computing device and storage medium
CN109711540A (en) * 2018-12-20 2019-05-03 北京中科寒武纪科技有限公司 Computing device and board card
CN110221611A (en) * 2019-06-11 2019-09-10 北京三快在线科技有限公司 Trajectory tracking control method and device, and autonomous vehicle
CN110347506A (en) * 2019-06-28 2019-10-18 Oppo广东移动通信有限公司 Data processing method, device, storage medium and electronic equipment based on LSTM
CN110390386A (en) * 2019-06-28 2019-10-29 南京信息工程大学 Sensitive long short-term memory method based on input change differential
CN110490299A (en) * 2019-07-25 2019-11-22 南京信息工程大学 Sensitive long short-term memory method based on state change differential
CN111967566A (en) * 2019-05-20 2020-11-20 天津科技大学 Edge computing offloading decision making based on long-short term memory neural network in Internet of vehicles environment
CN112036546A (en) * 2020-08-24 2020-12-04 上海交通大学 Sequence processing method and related equipment
CN113158569A (en) * 2021-04-23 2021-07-23 东南大学 High-reliability estimation method for tank truck roll state based on long short-term memory network
CN115719087A (en) * 2022-09-08 2023-02-28 清华大学 Long-short term memory neural network circuit and control method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105844330A (en) * 2016-03-22 2016-08-10 华为技术有限公司 Data processing method of neural network processor and neural network processor
CN106485322A (en) * 2015-10-08 2017-03-08 上海兆芯集成电路有限公司 Neural network unit that simultaneously performs long short-term memory cell computations
CN106775599A (en) * 2017-01-09 2017-05-31 南京工业大学 Multi-computing-unit coarse-grained reconfigurable system and method for recurrent neural networks
CN107341542A (en) * 2016-04-29 2017-11-10 北京中科寒武纪科技有限公司 Apparatus and method for performing recurrent neural network and LSTM computations
CN107622329A (en) * 2017-09-22 2018-01-23 深圳市景程信息科技有限公司 Electric load forecasting method based on multi-time-scale long short-term memory neural networks
US20180060720A1 (en) * 2016-08-30 2018-03-01 Samsung Electronics Co., Ltd. System and method for information highways in a hybrid feedforward-recurrent deep network

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106485322A (en) * 2015-10-08 2017-03-08 上海兆芯集成电路有限公司 Neural network unit that simultaneously performs long short-term memory cell computations
CN105844330A (en) * 2016-03-22 2016-08-10 华为技术有限公司 Data processing method of neural network processor and neural network processor
CN107341542A (en) * 2016-04-29 2017-11-10 北京中科寒武纪科技有限公司 Apparatus and method for performing recurrent neural network and LSTM computations
US20180060720A1 (en) * 2016-08-30 2018-03-01 Samsung Electronics Co., Ltd. System and method for information highways in a hybrid feedforward-recurrent deep network
CN106775599A (en) * 2017-01-09 2017-05-31 南京工业大学 Multi-computing-unit coarse-grained reconfigurable system and method for recurrent neural networks
CN107622329A (en) * 2017-09-22 2018-01-23 深圳市景程信息科技有限公司 Electric load forecasting method based on multi-time-scale long short-term memory neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YIJIN GUAN ET AL.: "FPGA-based Accelerator for Long Short-Term Memory Recurrent Neural Networks", 《2017 22ND ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE (ASP-DAC)》 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109615449A (en) * 2018-10-25 2019-04-12 阿里巴巴集团控股有限公司 Prediction method and device, computing device and storage medium
CN109711540A (en) * 2018-12-20 2019-05-03 北京中科寒武纪科技有限公司 Computing device and board card
CN109711540B (en) * 2018-12-20 2021-09-21 中科寒武纪科技股份有限公司 Computing device and board card
CN111967566A (en) * 2019-05-20 2020-11-20 天津科技大学 Edge computing offloading decision making based on long-short term memory neural network in Internet of vehicles environment
CN110221611B (en) * 2019-06-11 2020-09-04 北京三快在线科技有限公司 Trajectory tracking control method and device and unmanned vehicle
CN110221611A (en) * 2019-06-11 2019-09-10 北京三快在线科技有限公司 Trajectory tracking control method and device, and autonomous vehicle
CN110347506B (en) * 2019-06-28 2023-01-06 Oppo广东移动通信有限公司 Data processing method and device based on LSTM, storage medium and electronic equipment
CN110390386A (en) * 2019-06-28 2019-10-29 南京信息工程大学 Sensitive long short-term memory method based on input change differential
CN110347506A (en) * 2019-06-28 2019-10-18 Oppo广东移动通信有限公司 Data processing method, device, storage medium and electronic equipment based on LSTM
CN110490299A (en) * 2019-07-25 2019-11-22 南京信息工程大学 Sensitive long short-term memory method based on state change differential
CN110490299B (en) * 2019-07-25 2022-07-29 南京信息工程大学 Sensitive long-short term memory method based on state change differential
CN112036546A (en) * 2020-08-24 2020-12-04 上海交通大学 Sequence processing method and related equipment
CN112036546B (en) * 2020-08-24 2023-11-17 上海交通大学 Sequence processing method and related equipment
CN113158569A (en) * 2021-04-23 2021-07-23 东南大学 High-reliability estimation method for tank truck roll state based on long short-term memory network
CN115719087A (en) * 2022-09-08 2023-02-28 清华大学 Long-short term memory neural network circuit and control method
WO2024051525A1 (en) * 2022-09-08 2024-03-14 清华大学 Long short-term memory neural network circuit and control method

Similar Documents

Publication Publication Date Title
CN108510065A (en) Computing device and computation method applied to a long short-term memory neural network
US11586920B2 (en) Neural network processor
US12014272B2 (en) Vector computation unit in a neural network processor
US10691996B2 (en) Hardware accelerator for compressed LSTM
CN109190756B (en) Arithmetic device based on Winograd convolution and neural network processor comprising same
CN105892989B (en) Neural network accelerator and operational method thereof
CN107169563B (en) Processing system and method applied to two-value weight convolutional network
US11847553B2 (en) Parallel computational architecture with reconfigurable core-level and vector-level parallelism
CN110110851A (en) FPGA accelerator for LSTM neural networks and acceleration method thereof
CN107506828A (en) Computing device and method
CN108446761A (en) Neural network accelerator and data processing method
CN109697510A (en) Method and apparatus with neural network
CN105844330A (en) Data processing method of neural network processor and neural network processor
TWI417797B (en) A Parallel Learning Architecture and Its Method for Transferred Neural Network
CN108376285A (en) Accelerator for variable heterogeneous LSTM neural networks and data processing method
CN109359730A (en) Neural network processor for Winograd convolution with a fixed output paradigm
CN108921288A (en) Neural network activation processing device and neural network processor based on the device
CN109583586A (en) Convolution kernel processing method and device
Tu et al. Multitarget prediction using an aim-object-based asymmetric neuro-fuzzy system: A novel approach
CN108734270A (en) Compatible neural network accelerator and data processing method
Li et al. Pipuls: Predicting i/o patterns using lstm in storage systems
Lu et al. Forecasting csi 300 index using a hybrid functional link artificial neural network and particle swarm optimization with improved wavelet mutation
US20220129743A1 (en) Neural network accelerator output ranking
Khan et al. Bitcoin Price Prediction in a Distributed Environment Using a Tensor Processing Unit: A Comparison With a CPU-Based Model
Feng et al. CUDA Optimization Method for Activation Function in Convolution Operation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180907

RJ01 Rejection of invention patent application after publication