CN109102074A - Training device - Google Patents
Training device
- Publication number
- CN109102074A CN109102074A CN201710474297.9A CN201710474297A CN109102074A CN 109102074 A CN109102074 A CN 109102074A CN 201710474297 A CN201710474297 A CN 201710474297A CN 109102074 A CN109102074 A CN 109102074A
- Authority
- CN
- China
- Prior art keywords
- data
- value
- judgment
- synapse
- training device
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The present disclosure provides a training device, comprising: a data processing module; and a computing module connected to the data processing module and configured to receive data processed by the data processing module and perform operations on the data. The training device of the disclosure can effectively accelerate the backward-training part, and can also be used to accelerate the training process of an entire neural network; it improves training and operation speed, reduces training power consumption, and improves the effectiveness of operations.
Description
Technical field
The present disclosure relates to the field of artificial intelligence, and more particularly to a sparse training device.
Background technique
Deep neural networks are the foundation of many current artificial-intelligence applications and have achieved breakthrough use in speech recognition, image processing, data analysis, advertisement recommendation systems, autonomous driving, and other areas, bringing deep neural networks into many aspects of daily life. However, the enormous amount of computation required by deep neural networks has always constrained their faster development and wider application. When accelerator designs are considered to speed up the operation of deep neural networks, the huge amount of computation inevitably incurs a large energy overhead, which likewise restricts the further widespread use of accelerators.
The operation of a neural network mainly includes two parts: forward inference and backward training. However, most existing accelerators support only the forward-inference part and do not consider the backward-training part. This raises a problem: such an accelerator can only accelerate forward inference and cannot accelerate backward training, so it cannot complete the entire training process of a neural network, which is an obvious limitation.
Summary of the invention
(1) technical problems to be solved
To solve, or at least partly alleviate, the above technical problem, the present disclosure provides a sparse training device. The disclosed sparse training device can support the forward-inference part of a sparse or dense neural network and can also accelerate the backward-training part, so it can be used to accelerate the training process of an entire neural network.
(2) technical solution
According to one aspect of the disclosure, a training device is provided, comprising:
A data processing module, for compressing or expanding input data; and
A computing module, connected to the data processing module, for receiving data processed by the data processing module and performing operations on the data.
In some embodiments, the data processing module includes:
A data compression unit, for compressing input data according to a compression judgment condition; and
A data expansion unit, for expanding input data according to an expansion judgment condition.
In some embodiments, the data expansion unit is configured to expand input data, expanding compressed sparse data into an uncompressed format.
In some embodiments, the compression judgment condition and the expansion judgment condition include a threshold judgment condition or a function-mapping judgment condition.
In some embodiments, the threshold judgment condition includes: less than a given threshold, greater than a given threshold, within a given value range, or outside a given value range.
In some embodiments, the data compression unit screens and compresses input data according to the sparse index values of the data to obtain the data to be operated on; or it judges according to the values of the data themselves, thereby screening and compressing to obtain the values that satisfy the compression judgment condition.
In some embodiments, the data compression unit screens and compresses input neuron data according to the sparse index values of the synapse data to obtain the neuron data to be operated on, or screens and compresses input synapse data according to the sparse index values of the neuron data to obtain the synapse data to be operated on.
In some embodiments, the data compression unit compares the values of the synapses themselves with a given threshold, screening and compressing to obtain the synapse data whose absolute values are not less than the given threshold; or it compares the values of the neurons themselves with a given threshold, screening and compressing to obtain the neuron data whose absolute values are not less than the given threshold.
In some embodiments, the data processing module is also used to determine, according to a gradient-value judgment condition, whether to send the gradient values and the data to be operated on to the computing module.
In some embodiments, the gradient-value judgment condition includes a threshold judgment condition or a function-mapping judgment condition.
In some embodiments, the threshold judgment condition includes: less than a given threshold, greater than a given threshold, within a given value range, or outside a given value range.
In some embodiments, if the data processing module determines that the absolute value of a neuron gradient value is less than the given threshold, the gradient value and the corresponding synapse to be operated on are compressed away, i.e., they are not sent to the computing module for operation; otherwise, if the absolute value of the gradient value is not less than the given threshold, the gradient value and the corresponding synapse to be operated on are sent to the computing module for operation.
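The gradient screening described above can be sketched as a simple software predicate over (gradient, synapse) pairs. This is a minimal illustration under assumed list-based data, not the hardware implementation; the function name is hypothetical.

```python
def screen_by_gradient(gradients, synapses, threshold):
    """Keep only the (gradient, synapse) pairs whose gradient magnitude
    is not less than the threshold; the rest are compressed away and
    never reach the computing module."""
    kept_g, kept_w = [], []
    for g, w in zip(gradients, synapses):
        if abs(g) >= threshold:
            kept_g.append(g)
            kept_w.append(w)
    return kept_g, kept_w

# Gradients 0.01 and 0.0 fall below the threshold 0.1 and are skipped.
g, w = screen_by_gradient([0.5, -0.01, 0.2, 0.0], [1.0, 2.0, 3.0, 4.0], 0.1)
```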
In some embodiments, if the synapses are stored in sparse form, the synapses and the synapse index values are expanded by the data processing module and converted into non-sparse form before being input to the computing module for operation.
In some embodiments, the computing module includes:
A first arithmetic unit comprising multiple PEs, each PE including a multiplier and/or an adder, for completing multiplication, addition, or multiply-add operations;
A second arithmetic unit comprising two groups of adder trees, each group including multiple adder trees, for completing accumulation operations; and
A third arithmetic unit comprising ALUs.
In some embodiments, the first arithmetic unit includes M*N PEs, each PE including one multiplier and one adder; the second arithmetic unit includes two groups of adder trees, one group including M N-input adder trees and the other group including N M-input adder trees; the third arithmetic unit includes max(M, N) ALUs; where M and N are positive integers.
In some embodiments, in the computing module, the first arithmetic unit is used to complete the multiplication of gradient values and synapses, i.e., the synapses are multiplied element-wise with the corresponding gradient values, and the second arithmetic unit uses adder trees to accumulate the data of the same column. If the accumulation result is not the final accumulation result needed, i.e., the accumulation operation is not yet complete, the third arithmetic unit is skipped and the intermediate result is saved in a buffer to await the next round of accumulation; otherwise, the ALUs in the third arithmetic unit complete the subsequent operations.
In some embodiments, if the upper layer has an activation function, the third arithmetic unit is also used to multiply the accumulation result by the derivative of the activation function to obtain the final gradient value.
In some embodiments, the third arithmetic unit is also used to zero out gradient values according to a zeroing judgment condition.
In some embodiments, the zeroing judgment condition is a threshold judgment condition: if the absolute value of a gradient value is less than the zeroing threshold, the gradient value is set to 0; otherwise the gradient value remains unchanged.
In some embodiments, the device further includes:
A memory module, for storing data; and
A control module, for storing and issuing instructions to control the memory module, the data processing module, and the computing module.
Another aspect of the disclosure provides a chip comprising the training device.
Another aspect of the disclosure provides a chip packaging structure comprising the chip.
Another aspect of the disclosure provides a board comprising the chip packaging structure.
Another aspect of the disclosure provides an electronic device comprising the board.
(3) Beneficial effects
From the above technical solutions it can be seen that the sparse training device of the disclosure has at least one of the following advantages:
(1) The sparse training device of the disclosure can effectively accelerate the backward-training part, greatly increasing training speed and reducing training power consumption.
(2) By expanding or compressing the data, training power consumption is reduced.
(3) At the same time, the sparse training device of the disclosure can also well support the forward-inference part of a sparse or dense neural network, and can thus accelerate the training process of an entire neural network.
(4) During backward training, by adding a gradient-value judgment condition and a zeroing judgment condition, the effectiveness of operations is further improved and operation speed is increased.
Brief description of the drawings
Fig. 1 is a functional block diagram of the sparse training device according to the disclosure.
Fig. 2 is a schematic structural diagram of the computing module according to the disclosure.
Fig. 3 is another schematic structural diagram of the computing module according to the disclosure.
Fig. 4 is a schematic structural diagram of the data processing module according to the disclosure.
Fig. 5 is a schematic diagram of a sparse-data compression process according to the disclosure.
Fig. 6 is another schematic diagram of a sparse-data compression process according to the disclosure.
Fig. 7 is a schematic diagram of a data-expansion process according to the disclosure.
Fig. 8 is another schematic structural diagram of the computing module according to the disclosure.
Specific embodiments
To make the purposes, technical solutions, and advantages of the disclosure clearer, the disclosure is described in further detail below with reference to specific embodiments and the accompanying drawings.
It should be noted that similar or identical parts use the same figure numbers in the drawings and the description. Implementations not shown or described in the drawings take forms known to persons of ordinary skill in the technical field. In addition, although examples of parameters with particular values may be provided herein, it should be understood that the parameters need not exactly equal the corresponding values but may approximate them within acceptable error margins or design constraints. Direction terms mentioned in the following embodiments, such as "upper", "lower", "front", "rear", "left", and "right", refer only to the directions in the drawings; they are used for explanation and are not intended to limit the disclosure.
The disclosure mainly proposes a sparse training device for neural networks that can support the entire training process of a neural network. As shown in Fig. 1, the sparse training device includes a control module, a memory module, a data processing module, and a computing module.
The control module is mainly used to store instructions and issue instructions to regulate the memory module, the data processing module, and the computing module, so that the whole sparse training device cooperates harmoniously and in an orderly fashion.
The memory module is mainly used to store data, including the neuron data and synapse data to be operated on in the computation process, intermediate result data, final result data, and other relevant parameters needed in the operations.
The data processing module is mainly used to process and screen the data to be operated on, selecting, according to certain judgment conditions, the data that need to be input into the computing module for operation.
The computing module is mainly used to carry out neural-network operations and to send intermediate result data and final result data that need to be stored back to the memory module for saving.
Further, the computing module can complete the training process through cooperation. Specifically, as shown in Figs. 2-3, the computing module includes multiple groups of arithmetic units, each group including a first arithmetic unit 1, a second arithmetic unit 2, and a third arithmetic unit 3. The first arithmetic unit includes multiple PEs, each PE including a multiplier and/or an adder, for completing multiplication, addition, or multiply-add operations. The second arithmetic unit includes multiple adder trees for completing accumulation operations. The third arithmetic unit includes ALUs, preferably lightweight ALUs, i.e., ALUs containing only the required functions. The computing module can complete multiplications, a series of basic nonlinear operations such as powers, activation, and comparison, and individual multiplications and additions. In actual operation, the first, second, and third arithmetic units can be pipelined according to actual needs, and some arithmetic unit, or some operation part of some arithmetic unit, may also be skipped; for example, a POOLING layer does not need the accumulation operation in the second arithmetic unit, which can then be skipped directly.
As shown in Fig. 4, the data processing module includes a data compression unit and a data expansion unit. The data compression unit compresses input data according to a compression judgment condition; the data expansion unit expands input data according to an expansion judgment condition. The compression judgment condition and the expansion judgment condition include a threshold judgment condition or a function-mapping judgment condition. The threshold judgment condition includes: less than a given threshold, greater than a given threshold, within a given value range, or outside a given value range.
Specifically, the data compression unit is used to compress data; it can compress neuron data and it can also compress synapse data. More specifically, when the synapse data are in sparse representation, the input neurons can be screened and compressed according to the sparse index values of the synapse data, filtering out the "effective" neuron data to be operated on, which are sent together with the synapse values to the subsequent computing module for operation. Conversely, when the neuron data are in sparse representation, the input synapse data can be screened and compressed according to the sparse index values of the neuron data.
As shown in Fig. 5, the synapses are sparsely represented data and the neurons are densely represented data; the neuron values are compressed according to the index values of the sparsely represented synapses. The sparse representation chosen here contains two arrays: the first records the synapse values of the sparse representation, and the other saves the positions, i.e., the indices, corresponding to these synapse values. Assume the original sequence has length 8; the index values then indicate that the synapse values of the sparse representation are located at positions 1, 3, 6, and 7 of the original sequence (the starting position is 0). The same index values can be used to filter out, from the neuron values, the neurons located at positions 1, 3, 6, and 7 that take part in the operation. The neuron data to be operated on are therefore screened out, yielding the compressed neuron sequence N1N3N6N7, which is then sent to the arithmetic units together with the sparsely represented synapse values for operation.
In addition, the data compression unit can also judge according to the values of the data themselves, screening and compressing to obtain the values that satisfy the compression judgment condition, which are passed to the subsequent computing module for operation. Taking a threshold judgment condition as an example, the data compression unit can compare the values of the synapses themselves with a given threshold, screening and compressing to obtain the synapse data whose absolute values are not less than the given threshold; it can also compare the values of the neurons themselves with a given threshold, screening and compressing to obtain the neuron data whose absolute values are not less than the given threshold.
As shown in Fig. 6, the disclosed data compression unit compresses according to the values themselves. Assume the original sequence has length 8 and is 3 5 2 1 0 7 4 1. A compression condition is given, namely removing values less than 3; the values 0, 1, and 2 are then compressed and screened out. The other data are retained, forming the compressed sequence 3 5 7 4.
The data expansion unit is used to expand input data, including neuron data and synapse data, i.e., to expand originally compressed sparse data into an uncompressed format.
Specifically, as shown in Fig. 7, the disclosed data expansion unit expands sparsely represented synapse data. The chosen sparse representation contains two arrays: the first array records the synapse values of the sparse representation, and the other array saves the positions, i.e., the indices, corresponding to these synapse values. Assume the original sequence has length 8; each entry in the index array then corresponds to one number of the original sequence, where 1 indicates that the original sequence value is valid and 0 indicates that the original sequence value is 0. The synapse values of this sparse representation are thus located at positions 1, 3, 6, and 7 of the original sequence (the starting position is 0). Using these index values, the synapse values are filled into their original positions in turn and the other positions are set to 0, which yields the expanded, uncompressed sequence.
If the data need neither compression nor expansion, the data processing module can be skipped and the data passed directly from the memory module to the computing module for operation.
In an embodiment of the disclosure, as shown in Fig. 8, the first arithmetic unit 1 includes M*N PEs, each PE including one multiplier and one adder. The second arithmetic unit 2 includes two groups of adder trees: one group includes M N-input adder trees and the other group includes N M-input adder trees. The third arithmetic unit 3 includes max(M, N) (the larger of M and N) lightweight ALUs, i.e., ALUs containing only the arithmetic units needed for the operations. Persons skilled in the art will appreciate that the second arithmetic unit 2 may also include a single group of max(M, N) adder trees; it is only necessary to add the corresponding data-transmission lines to complete the same operations.
Below, taking the convolutional layer of a sparse convolutional neural network as an example, the acceleration of the backward-training process by the disclosed sparse training device is described. During backward training, assume the input gradient values transmitted from the lower layer are gradient, the synapses used during forward inference are w, the corresponding sparse indices are index, and the input neuron values during forward inference are input.
It should be noted that during backward training the data processing module determines, according to the gradient-value judgment condition, whether to send the gradient values and the data to be operated on to the computing module. The gradient-value judgment condition includes a threshold judgment condition or a function-mapping judgment condition. The threshold judgment condition includes: less than a given threshold, greater than a given threshold, within a given value range, or outside a given value range.
First, the upper layer's neuron gradient values gradient' are updated. The control module issues a load instruction to read the gradient values gradient and the corresponding synapses w, which are sent to the data compression unit of the data processing module for screening and compression. If the absolute value of a gradient is less than the given threshold, the gradient and the corresponding synapses needing operation are skipped, i.e., they are not sent into the computing module for operation; otherwise, if the absolute value of the gradient value is not less than the given threshold, it is sent together with the corresponding synapses w into the computing module for operation. In addition, if the synapses w are saved in sparse form, then to facilitate the accumulation of the neuron gradient values, w and index need to be sent together to the data expansion unit of the data processing module before entering the arithmetic part; the unit expands them into non-sparse form, and they are then sent together with gradient into the computing module for operation. In the computing module, the gradient values and synapses are first sent into the first arithmetic unit 1, which completes the multiplication of gradient values and synapses, i.e., w is multiplied element-wise with the corresponding gradient; the results are then sent into the adder trees in the corresponding second arithmetic unit 2 of that column, which accumulate the data of the same column awaiting accumulation. If the accumulation result is not the final accumulation result needed, i.e., the accumulation operation is not yet complete, the third arithmetic unit 3 is skipped, the result is saved, and it awaits the next round of accumulation; otherwise it is sent into the ALU in the corresponding third arithmetic unit 3 to complete the subsequent operations. If the upper layer has an activation function, the accumulation result is multiplied in the third arithmetic unit 3 by the derivative of the activation function, yielding the final gradient'.
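The gradient-update flow just described can be sketched for a small dense case — a hypothetical software illustration of the skip-multiply-accumulate-scale sequence, with assumed shapes; the per-output derivative values are assumed to have been saved during forward inference.

```python
def backprop_gradients(grad, w, threshold, act_deriv):
    """Sketch of the upper-layer gradient update: gradients below the
    threshold are skipped by the data processing module; the rest are
    multiplied with their synapse row (first unit) and accumulated per
    column (second unit); the ALU (third unit) then multiplies by the
    activation derivative of the upper layer.
    grad: length-M lower-layer gradients; w: M x N synapses;
    act_deriv: length-N derivative values from forward inference."""
    M, N = len(w), len(w[0])
    acc = [0.0] * N
    for i in range(M):
        if abs(grad[i]) < threshold:      # gradient-value judgment condition
            continue                      # compressed away, no operation
        for j in range(N):                # first arithmetic unit: multiplies
            acc[j] += grad[i] * w[i][j]   # second arithmetic unit: adder trees
    return [acc[j] * act_deriv[j] for j in range(N)]  # third unit: ALU

g_up = backprop_gradients([1.0, 0.001, -2.0],
                          [[1, 2], [5, 5], [1, 1]],
                          0.01, [1.0, 0.5])
```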
In addition, the ALU can be used to zero out gradient values according to a zeroing judgment condition. Taking a zeroing-threshold judgment condition as the zeroing judgment condition as an example: if the user provides a zeroing threshold, a judgment is made in the third arithmetic unit 3; if the absolute value of gradient' is less than the zeroing threshold, then gradient' = 0, and otherwise it is unchanged. The result is written back to the memory module for storage. The zeroing condition can also be another judgment condition, such as a function mapping: when gradient' satisfies the given judgment condition, gradient' is zeroed before saving; otherwise the original value is saved.
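The threshold form of the zeroing judgment condition amounts to a one-line element-wise rule — a minimal sketch with an assumed list representation.

```python
def zero_small_gradients(grads, zero_threshold):
    """Apply the zeroing judgment condition: gradients whose absolute
    value falls below the zeroing threshold are set to 0; the rest
    remain unchanged."""
    return [0.0 if abs(g) < zero_threshold else g for g in grads]

z = zero_small_gradients([0.4, -0.02, 0.0, -1.5], 0.1)
```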
Then, the synapses are updated. The control module issues a load instruction to read out the neuron data input, the index data index corresponding to the synapses w, and the gradient values gradient, which are sent to the data compression unit of the data processing module for compression, i.e., input is compressed accordingly according to index and then sent into the computing module for operation. In each group of arithmetic units, the gradient values gradient and the original neuron data input are first sent into the first arithmetic unit 1 to complete the multiplication, i.e., input is multiplied element-wise with the corresponding gradient; the results are then sent into the second arithmetic unit 2, where the adder trees accumulate the data of the same row awaiting accumulation. If the accumulation result is not the final accumulation result needed, i.e., the accumulation operation is not yet complete, the third arithmetic unit 3 is skipped, the result is saved, and it awaits the next round of accumulation; otherwise it is sent into the ALU in the corresponding third arithmetic unit 3 for the subsequent operations. The third arithmetic unit 3 receives the accumulated data and divides it by the number of gradients connected to this synapse, yielding the update amount of the synapse. The update amount is multiplied by the learning rate and saved to the memory module. Then, the synapses w and the synapse update amounts are read out from the memory module and sent into the first arithmetic unit 1 of the computing module to complete an individual addition operation; since no other accumulation or nonlinear operations are needed, the second arithmetic unit 2 and the third arithmetic unit 3 can be skipped directly, and the updated synapse data are saved.
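The synapse-update flow for a shared (convolutional) synapse can be sketched as follows — a hypothetical scalar illustration of accumulate, average, scale, and add; the update is added to the old value as the text describes, and all shapes are assumptions.

```python
def update_synapse(w, grad, inputs, learning_rate):
    """Sketch of the convolutional synapse update: the update amount is
    the accumulated product of the gradients and inputs the synapse
    connects (units 1 and 2), divided by the number of connected
    gradients and scaled by the learning rate (unit 3), then added to
    the old synapse value in an individual addition (unit 1 again).
    grad and inputs: per-position samples; w: the shared scalar synapse."""
    n = len(grad)
    acc = sum(g * x for g, x in zip(grad, inputs))  # multiply + adder tree
    update = (acc / n) * learning_rate              # ALU: average and scale
    return w + update                               # individual addition

new_w = update_synapse(2.0, [1.0, -1.0], [4.0, 2.0], 0.5)
```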
The disclosed sparse training device can also support the forward-inference part of a sparse or dense neural network; the process is similar to the synapse-update process. The disclosed sparse training device can thus be used to accelerate the training process of an entire neural network.
Below, taking the fully connected layer of a sparse convolutional neural network as an example, the acceleration of the backward-training process by the disclosed sparse training device is described. During backward training, assume the gradient values transmitted from the lower layer are gradient, the synapses used during forward inference are w, the corresponding sparse indices are index, and the input neuron values during forward inference are input.
First, the upper layer's gradient values gradient' are updated. The control module issues a load instruction to read the gradient values gradient and the corresponding synapses w, which are sent to the data compression unit of the data processing module for screening and compression. Assume a certain judgment threshold is given here: if the absolute value of a gradient is less than the given threshold, the gradient and the corresponding synapses needing operation are skipped, i.e., they are not sent into the computing module for operation; otherwise, if the absolute value of the gradient value is not less than the given threshold, it is sent together with the corresponding synapses w into the computing module for operation. The judgment threshold here can also be another condition, such as a given judgment range or a given function mapping: when the given condition is satisfied, the gradient and the corresponding synapses needing operation are skipped. If the synapses w are saved in sparse form, then before being sent into the computing module, w and index need to be sent together to the data expansion unit of the data processing module, which expands them into non-sparse form; they are then sent together with gradient into the computing module for operation. In the computing module, the data are first sent into the first arithmetic unit 1, which completes the multiplication of gradient values and synapses, i.e., w is multiplied element-wise with the corresponding gradient; the results are then sent into the adder trees in the corresponding second arithmetic unit 2 of that column, which accumulate the N data of the same column awaiting accumulation. If the accumulation result is not the final accumulation result needed, i.e., the accumulation operation is not yet complete, the third arithmetic unit 3 is skipped, the result is saved, and it awaits the next round of accumulation; otherwise it is sent into the ALU in the corresponding third arithmetic unit 3 to complete the subsequent operations: if the upper layer has an activation function, the accumulation result is multiplied in the third arithmetic unit 3 by the derivative of the activation function, yielding the final gradient'. If the user provides a zeroing threshold, a judgment is made in the third arithmetic unit 3: if the absolute value of gradient' is less than the zeroing threshold, then gradient' = 0; otherwise it is unchanged. The result is written back to the memory module for storage. The zeroing threshold can also be another judgment condition, such as a function mapping: when gradient' satisfies the given judgment condition, gradient' is zeroed before saving; otherwise the original value is saved.
Then, the synapses are updated. The control module issues a load instruction to read out the neuron data input, the index data index corresponding to the synapses w, and the gradient values gradient, which are sent to the data compression unit of the data processing module for compression: the input neurons input are compressed accordingly according to the synapse index data index and then sent into the computing module for operation. In each group of arithmetic units, the data are first sent into the first arithmetic unit 1 to complete the multiplication, i.e., input is multiplied element-wise with the corresponding gradient, yielding the synapse update amount, which is then multiplied by the learning rate and added in the PE to the original synapse received from the memory module, directly producing the new synapse w. The second arithmetic unit 2 and the third arithmetic unit 3 are then skipped, and the result is saved back to the memory module.
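For the fully connected case, each PE's multiply-scale-add step amounts to an outer-product update — a hypothetical dense sketch with assumed shapes; as in the text, the scaled update is added directly to the original synapse inside the PE.

```python
def update_fc_synapses(w, grad, inputs, learning_rate):
    """Sketch of the fully connected synapse update: each PE multiplies
    an input with its gradient, scales by the learning rate, and adds
    the result to the original synapse it received from memory,
    directly producing the new synapse matrix.
    w: M x N synapses; inputs: length M; grad: length N."""
    M, N = len(w), len(w[0])
    return [[w[i][j] + learning_rate * inputs[i] * grad[j]
             for j in range(N)] for i in range(M)]

new_w = update_fc_synapses([[1.0, 0.0]], [2.0, -1.0], [3.0], 0.1)
```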
In addition, although the gradient value judgment in the above embodiment is performed against a given threshold, the gradient value judgment condition of the present disclosure is not limited to a threshold judgment condition; it may also be a function mapping judgment condition. The threshold judgment condition may include: less than a given threshold, greater than a given threshold, within a given value range, or outside a given value range. Moreover, in the present disclosure the gradient value judgment condition, the compression judgment condition, and the extension judgment condition may use the same kind of judgment condition (e.g., all use threshold judgment, where the chosen threshold sizes may be the same or different), or may use different kinds of judgment conditions (e.g., threshold judgment for one and mapping judgment for another, where the specific thresholds and mapping relations may also differ), without affecting the implementation of the present disclosure.
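The judgment-condition variants enumerated here might be modeled as interchangeable predicates, e.g. as below. This is illustrative only; the disclosure does not fix a concrete mapping function, and both helper names are assumptions:

```python
def threshold_condition(lo=None, hi=None, inside=True):
    """Build a threshold-style judgment predicate: with inside=True the
    value is judged to lie within [lo, hi]; with inside=False, outside it.
    Leaving lo or hi as None gives the one-sided 'greater than / less
    than a given threshold' variants."""
    def judge(x):
        in_range = (lo is None or x >= lo) and (hi is None or x <= hi)
        return in_range if inside else not in_range
    return judge

def mapping_condition(fn, target=0.0):
    """Function-mapping judgment: apply a mapping and compare with a
    target value (one possible concrete form of such a condition)."""
    return lambda x: fn(x) == target
```

Since gradient judgment, compression, and extension each just consume such a predicate, the same or different conditions can be plugged in independently, as the text states.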
The sparse training device of the present disclosure can also support the forward inference part of sparse or dense neural networks; the process is similar to the synapse update process. The sparse training device of the present disclosure can thus be used to accelerate the training process of an entire neural network. In addition, the application of the sparse training device of the present disclosure to neural networks is not limited to fully connected layers and convolutional layers; it can also be applied to other layers without affecting the implementation of the present disclosure.
All modules of the sparse training device of the present disclosure may be hardware structures. Physical realizations of these hardware structures include but are not limited to physical devices, which include but are not limited to transistors, memristors, and DNA computers.
In one embodiment, the present disclosure discloses a chip comprising the above neural network training device.
In one embodiment, the present disclosure discloses a chip packaging structure comprising the above chip.
In one embodiment, the present disclosure discloses a board comprising the above chip packaging structure.
In one embodiment, the present disclosure discloses an electronic device comprising the above board.
The electronic device includes a data processing device, robot, computer, printer, scanner, tablet computer, intelligent terminal, mobile phone, driving recorder, navigator, sensor, webcam, cloud server, camera, video camera, projector, watch, earphone, mobile storage, wearable device, vehicle, household appliance, and/or medical device.
The vehicle includes an airplane, ship, and/or car; the household appliance includes a television, air conditioner, microwave oven, refrigerator, electric rice cooker, humidifier, washing machine, electric lamp, gas stove, and range hood; the medical device includes a nuclear magnetic resonance instrument, B-mode ultrasound scanner, and/or electrocardiograph.
The specific embodiments described above further explain the purpose, technical solutions, and beneficial effects of the present disclosure in detail. It should be understood that the foregoing are merely specific embodiments of the present disclosure and are not intended to limit the present disclosure. Any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the present disclosure shall be included within the protection scope of the present disclosure.
Claims (10)
1. A training device, comprising:
a data processing module, for compressing or extending input data; and
a computing module, connected with the data processing module, for receiving data processed by the data processing module and performing operations on the data.
2. The training device according to claim 1, wherein the data processing module comprises:
a data compression unit, for compressing input data according to a compression judgment condition; and
a data expansion unit, for extending input data according to an extension judgment condition.
3. The training device according to claim 2, wherein the data expansion unit is configured to extend input data by expanding compressed sparse data into an uncompressed format.
4. The training device according to claim 2, wherein the compression judgment condition and the extension judgment condition include a threshold judgment condition or a function mapping judgment condition.
5. The training device according to claim 4, wherein the threshold judgment condition comprises: less than a given threshold, greater than a given threshold, within a given value range, or outside a given value range.
6. The training device according to claim 2, wherein the data compression unit screens and compresses input data according to sparse index values of the data to obtain the data to be operated on; or judges according to the values of the data themselves, screening and compressing to obtain the numerical values that satisfy the compression judgment condition.
7. The training device according to claim 6, wherein the data compression unit screens and compresses the input neuron data according to the sparse index values of the synapse data to obtain the neuron data to be operated on, or screens and compresses the input synapse data according to the sparse index values of the neuron data to obtain the synapse data to be operated on.
8. The training device according to claim 6, wherein the data compression unit compares the values of the synapses themselves with a given threshold, screening and compressing to obtain the synapse data whose absolute values are not less than the given threshold; or compares the values of the neurons themselves with a given threshold, screening and compressing to obtain the neuron data whose absolute values are not less than the given threshold.
9. The training device according to claim 1, wherein the data processing module is further configured to determine, according to a gradient value judgment condition, whether to send the gradient values and the data to be operated on to the computing module.
10. The training device according to claim 9, wherein the gradient value judgment condition includes a threshold judgment condition or a function mapping judgment condition.
Priority Applications (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710474297.9A CN109102074B (en) | 2017-06-21 | 2017-06-21 | Training device |
EP19217768.1A EP3657403A1 (en) | 2017-06-13 | 2018-06-12 | Computing device and method |
EP18818258.8A EP3637327B1 (en) | 2017-06-13 | 2018-06-12 | Computing device and method |
PCT/CN2018/090901 WO2018228399A1 (en) | 2017-06-13 | 2018-06-12 | Computing device and method |
US16/698,976 US11544542B2 (en) | 2017-06-13 | 2019-11-28 | Computing device and method |
US16/698,984 US11544543B2 (en) | 2017-06-13 | 2019-11-28 | Apparatus and method for sparse training acceleration in neural networks |
US16/698,988 US11537858B2 (en) | 2017-06-13 | 2019-11-28 | Computing device and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710474297.9A CN109102074B (en) | 2017-06-21 | 2017-06-21 | Training device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109102074A true CN109102074A (en) | 2018-12-28 |
CN109102074B CN109102074B (en) | 2021-06-01 |
Family
ID=64795966
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710474297.9A Active CN109102074B (en) | 2017-06-13 | 2017-06-21 | Training device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109102074B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109726800A (en) * | 2018-12-29 | 2019-05-07 | 北京中科寒武纪科技有限公司 | Operation method, device and Related product |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105512723A (en) * | 2016-01-20 | 2016-04-20 | 南京艾溪信息科技有限公司 | Artificial neural network calculating device and method for sparse connection |
CN105913450A (en) * | 2016-06-22 | 2016-08-31 | 武汉理工大学 | Tire rubber carbon black dispersity evaluation method and system based on neural network image processing |
CN106027300A (en) * | 2016-05-23 | 2016-10-12 | 深圳市飞仙智能科技有限公司 | System and method for parameter optimization of intelligent robot applying neural network |
CN106548234A (en) * | 2016-11-17 | 2017-03-29 | 北京图森互联科技有限责任公司 | A kind of neural networks pruning method and device |
CN106650928A (en) * | 2016-10-11 | 2017-05-10 | 广州视源电子科技股份有限公司 | Method and device for optimizing neural network |
CN106796668A (en) * | 2016-03-16 | 2017-05-31 | 香港应用科技研究院有限公司 | For the method and system that bit-depth in artificial neural network is reduced |
CN106779051A (en) * | 2016-11-24 | 2017-05-31 | 厦门中控生物识别信息技术有限公司 | A kind of convolutional neural networks model parameter processing method and system |
CN106779068A (en) * | 2016-12-05 | 2017-05-31 | 北京深鉴智能科技有限公司 | The method and apparatus for adjusting artificial neural network |
CN107004157A (en) * | 2015-01-22 | 2017-08-01 | 高通股份有限公司 | Model compression and fine setting |
- 2017-06-21: CN CN201710474297.9A, patent CN109102074B (en), status Active
Also Published As
Publication number | Publication date |
---|---|
CN109102074B (en) | 2021-06-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||