CN109212960A - Binary neural network hardware compression method based on weight sensitivity - Google Patents
Binary neural network hardware compression method based on weight sensitivity
- Publication number: CN109212960A (application number CN201811000016.7A)
- Authority
- CN
- China
- Prior art keywords
- sensitivity
- neural network
- particle
- weight matrix
- sensitive set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B13/00—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
- G05B13/02—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
- G05B13/0205—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric not using a model or a simulator of the controlled system
- G05B13/024—Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric not using a model or a simulator of the controlled system in which a parameter or coefficient is automatically adjusted to optimise the performance
Abstract
The invention discloses a binary neural network hardware compression method based on weight sensitivity, comprising the following steps: training a binary neural network to obtain its weight matrices and original accuracy; assessing the sensitivity of each weight matrix; presetting a sensitivity threshold to divide the weight matrices into a sensitive set and a non-sensitive set; assessing the sensitivity of the non-sensitive set; adjusting the sensitivity threshold to obtain the best non-sensitive set of weight matrices, whose sensitivity equals a preset maximum tolerable accuracy loss; and storing the best non-sensitive set in emerging memory devices, or in conventional memory operated with near-threshold/sub-threshold voltage technology. Through the above scheme, the present invention has the advantages of low power consumption, high recognition accuracy, good versatility and low cost, and has broad market prospects in the field of hardware compression.
Description
Technical field
The present invention relates to the field of hardware compression, and in particular to a binary neural network hardware compression method based on weight sensitivity.
Background technique
At present, the mainstream approaches to reducing the resource overhead and power consumption of neural network hardware implementations include hardware architecture optimization, neural network compression, and binary neural networks. Hardware architecture optimization designs more efficient hardware-level implementations of neural networks, reducing the memory occupied by data and the redundancy in memory accesses and computation, so as to reduce resource overhead and power consumption. Neural network compression reduces the number of weights and the number of quantization bits in a neural network to compress the network model while keeping the recognition accuracy of the compressed network unaffected.
In general, a large fraction of the weights in a neural network have absolute values close to 0, so weights with very small absolute value can be removed (set to 0), eliminating the corresponding connections and reducing the total number of weights in the network. The weights of a network are high-precision decimals; for hardware storage, they must be quantized to fixed-point numbers with a fixed number of bits. To preserve the precision of a weight, 32 bits are usually used to store it, which causes a large storage overhead. Each weight can instead be quantized more coarsely, i.e. the high-precision decimal is represented with fewer bits (e.g. 3), and different layers of the network can use different quantization bit widths, so as to maintain the recognition accuracy of the neural network.
Traditional neural network compression proceeds as follows. First, a neural network is trained normally; the weights whose absolute value falls below a threshold are then set to 0 (i.e. the connection between the two neurons is removed), making the network sparse and reducing the number of weights, after which the sparse network is retrained. Next, the remaining weights are grouped into several classes by K-means clustering and each class is assigned a code; each weight is then represented by the code of its class (with 4 classes, only 2 bits are needed per weight), all weights in a class share the same numerical value, and the coded weights are retrained. Finally, Huffman coding is used to further optimize the coding of the weights, achieving effective compression.
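The prune-then-cluster stage of the pipeline described above can be sketched in a few lines. This is an illustrative toy, not the patent's method: `cluster_weights`, its parameters, and the 1-D k-means used here are assumptions for the demonstration.

```python
import numpy as np

def cluster_weights(w, k=4, iters=20, seed=0):
    """Toy prune -> cluster -> code step: k-means groups the surviving
    (non-zero) weights into k shared values, so each weight can be stored
    as a log2(k)-bit class index plus one codebook entry per class."""
    rng = np.random.default_rng(seed)
    flat = w[w != 0]                          # pruned weights (zeros removed)
    centers = rng.choice(flat, size=k, replace=False)
    for _ in range(iters):                    # plain 1-D k-means
        idx = np.argmin(np.abs(flat[:, None] - centers[None, :]), axis=1)
        for c in range(k):
            if np.any(idx == c):
                centers[c] = flat[idx == c].mean()
    idx = np.argmin(np.abs(flat[:, None] - centers[None, :]), axis=1)
    return centers, idx                       # codebook + per-weight class index
```

With `k = 4` each index needs only 2 bits, matching the 2-bit example in the text.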
A binary neural network directly quantizes the weights of a neural network (and, in some variants, the input values of every layer) to 1 or -1, so only 1 bit is needed to represent each weight in hardware, greatly reducing the number of bits per weight. There are four main forms of binary neural network: BinaryConnect, BNN (binarized neural network), BWN (binary weight network) and XNOR-Net. They differ only in what is quantized; all of them replace numerical values with 1 or -1. BinaryConnect and BWN quantize only the weights to binary 1 or -1, whereas BNN and XNOR-Net quantize both the weights and the input values of each layer to binary 1 or -1. In a neural network, the computation of each layer is mainly the multiplication between an input vector and a weight matrix. If only the weights are quantized to 1 or -1, the multiplications between input vector and weight matrix become additions and subtractions, reducing multiplication operations; if both the weights and the layer inputs are quantized to 1 or -1, the multiplication becomes a 1-bit XNOR operation, which saves even more power than addition and subtraction. BWN and XNOR-Net introduce scale factors that BinaryConnect and BNN lack, and can therefore better preserve recognition accuracy on complex tasks.
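The XNOR replacement for multiplication described above can be illustrated with ±1 vectors encoded as 0/1 bits (a minimal sketch of the arithmetic identity, not the patent's hardware implementation):

```python
def binary_dot(x_bits, w_bits):
    """Dot product of two {-1,+1} vectors stored as 0/1 bit lists:
    XNOR replaces multiplication (1 where the bits agree), and a
    popcount of the agreements recovers the signed sum."""
    n = len(x_bits)
    xnor = [1 - (xb ^ wb) for xb, wb in zip(x_bits, w_bits)]  # 1 = agreement
    matches = sum(xnor)                                       # popcount
    return 2 * matches - n     # +1 per agreement, -1 per disagreement
```

For example, `binary_dot([1, 0, 1, 1], [1, 1, 0, 1])` encodes (+1,-1,+1,+1)·(+1,+1,-1,+1) and returns 0, the same value as the floating-point dot product.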
Traditional compression methods have the following disadvantages:
First, hardware architecture optimization and neural network compression save comparatively little hardware resource and power. Relative to them, a binary neural network achieves at least 32x compression (the original network uses 32 bits per weight, a binary neural network only 1 bit), and computationally it converts multiplications into additions/subtractions or 1-bit XNOR operations, greatly reducing storage overhead and computation power. Although hardware architecture optimization and neural network compression save storage and power to a certain extent, they are not as simple as a binary neural network.
Second, the recognition accuracy of binary neural networks is lower. Among the various binary neural networks, for classification tasks, BinaryConnect and BNN perform well only on smaller datasets, such as the handwritten digit set MNIST, the common-object recognition dataset CIFAR, and the real-world street number dataset SVHN; on a very large dataset such as ImageNet, their recognition accuracy drops sharply. For this reason, BWN and XNOR-Net need additional scale factors to guarantee network recognition accuracy.
Third, traditional compression methods use memory devices such as 6T SRAM, which make both the hardware resource cost and the power consumption large and limit the scale of the neural network a chip can implement; although a binary neural network maps well onto conventional hardware, its fault tolerance is not fully exploited. The current mainstream trend is to use emerging memory devices such as RRAM (resistive random-access memory), which can greatly save hardware resources and allow larger neural networks to be deployed on hardware, but which are unreliable: storing the weights of the entire neural network on emerging memory devices can severely degrade the recognition accuracy of the network, so emerging memory devices still pose challenges.
Fourth, conventional memory is powered at the normal supply voltage; to reduce circuit power consumption, near-threshold/sub-threshold voltage technology can be used. Near-threshold voltage technology adjusts the circuit supply voltage to near the transistor turn-on voltage (slightly above or below it), which greatly improves performance metrics such as operating frequency and energy efficiency; sub-threshold voltage technology adjusts the supply voltage below the transistor turn-on voltage to provide the lowest energy consumption. Near-threshold/sub-threshold voltage technology can therefore be applied to the conventional memory devices that store the neural network weights, so as to reduce power consumption. However, this technology still faces challenges: it is unreliable and error-prone. Under a low supply voltage, the circuit is susceptible to interference, causing errors in the weights stored in the conventional memory device. If the entire conventional memory device uses near-threshold/sub-threshold voltage technology, the recognition accuracy of the neural network can be severely degraded.
Summary of the invention
In view of the above problems, the purpose of the present invention is to provide a binary neural network hardware compression method based on weight sensitivity. The technical solution adopted by the invention is as follows:
A binary neural network hardware compression method based on weight sensitivity, comprising the following steps:
Step S1: train a binary neural network to obtain its weight matrices and original accuracy.
Step S2: assess the sensitivity of each weight matrix.
Step S3: preset a sensitivity threshold and divide the weight matrices into a sensitive set and a non-sensitive set.
Step S4: assess the sensitivity of the non-sensitive set of weight matrices.
Step S5: adjust the sensitivity threshold to obtain the best non-sensitive set of weight matrices; the sensitivity of the best non-sensitive set equals the preset maximum tolerable accuracy loss.
Step S6: store the best non-sensitive set in emerging memory devices, or in conventional memory using near-threshold/sub-threshold voltage technology.
Further, in step S2, assessing the sensitivity of a weight matrix comprises the following steps:
Step S21: preset an error probability P to model the unreliability of emerging memory devices and of near-threshold/sub-threshold voltages; P is a number greater than 0 and less than 1.
Step S22: let each binary weight of the weight matrix flip with error probability P, and obtain the first accuracy of the binary neural network.
Step S23: repeat step S22 N times to obtain a frequency histogram of the first accuracy; N is a natural number greater than 100.
Step S24: compute the mean of the frequency histogram of the first accuracy, and take this mean as the second accuracy, i.e. the accuracy the binary neural network reaches when errors occur in that weight matrix.
Step S25: compute the sensitivity of the weight matrix as the difference between the original accuracy and the second accuracy of step S24.
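The per-matrix assessment of steps S21-S25 is a Monte Carlo bit-flip injection. A minimal sketch follows; the `evaluate` callback (network accuracy on the test set) and all names are illustrative assumptions, not part of the patent:

```python
import numpy as np

def matrix_sensitivity(weights, evaluate, layer, P=0.01, N=200, rng=None):
    """Estimate the sensitivity of one binary weight matrix (steps S21-S25).

    weights  : list of {-1,+1} numpy arrays, one per layer
    evaluate : function(list of matrices) -> recognition accuracy
    layer    : index of the matrix under assessment
    """
    rng = rng or np.random.default_rng(0)
    original = evaluate(weights)                  # original accuracy (step S1)
    accuracies = []
    for _ in range(N):                            # N > 100 Monte Carlo trials (S23)
        trial = [w.copy() for w in weights]
        flips = rng.random(trial[layer].shape) < P
        trial[layer][flips] *= -1                 # 1 -> -1, -1 -> 1 with prob. P (S22)
        accuracies.append(evaluate(trial))
    second = float(np.mean(accuracies))           # mean of the histogram (S24)
    return original - second                      # sensitivity (S25)
```

Running this once per weight matrix yields the per-matrix sensitivities that step S3 sorts and thresholds.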
Further, in step S3, presetting a sensitivity threshold and dividing the weight matrices into a sensitive set and a non-sensitive set comprises the following steps:
Step S31: sort the sensitivities of the weight matrices in descending order, and preset a sensitivity threshold.
Step S32: assign the weight matrices whose sensitivity is greater than the threshold to the sensitive set, and the weight matrices whose sensitivity is less than or equal to the threshold to the non-sensitive set.
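Steps S31-S32 amount to a simple threshold partition; a hedged sketch (names are illustrative):

```python
def partition(sensitivities, threshold):
    """Split weight-matrix indices into sensitive / non-sensitive sets (S31-S32).

    sensitivities : dict mapping matrix index -> sensitivity from step S2
    """
    ordered = sorted(sensitivities, key=sensitivities.get, reverse=True)  # S31
    sensitive = [i for i in ordered if sensitivities[i] > threshold]      # S32
    non_sensitive = [i for i in ordered if sensitivities[i] <= threshold]
    return sensitive, non_sensitive
```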
Preferably, in step S4, assessing the sensitivity of the non-sensitive set of weight matrices comprises the following steps:
Step S41: let the non-sensitive set incur errors with error probability P, and obtain the third accuracy of the non-sensitive set; P is a number greater than 0 and less than 1.
Step S42: repeat step S41 N times, obtain the mean of the frequency histogram of the third accuracy, and take the difference between the original accuracy and this mean as the sensitivity of the non-sensitive set.
Further, in step S5, adjusting the sensitivity threshold to obtain the best non-sensitive set of weight matrices comprises the following steps:
Step S51: preset a maximum tolerable accuracy loss;
Step S52: adjust the sensitivity threshold. If the sensitivity of the non-sensitive set equals the maximum tolerable accuracy loss, take the non-sensitive set as the best non-sensitive set and proceed to step S6; otherwise, continue adjusting the sensitivity threshold and return to step S3 to re-divide the weight matrices into sensitive and non-sensitive sets.
A binary neural network hardware compression method based on weight sensitivity, comprising the following steps:
Step K1: train a binary neural network to obtain its weight matrices and original accuracy.
Step K2: initialize a particle swarm to obtain the initialization of each particle's D-dimensional vector; D is the number of weight matrices in the neural network.
Step K3: add the constraint condition.
Step K4: compute the fitness of each particle in the swarm, and compute the sensitivity of the particle's non-sensitive set.
Step K5: update each particle's personal best and the global best.
Step K6: compute the update velocity and mutation probability of each particle dimension, and generate a random number T; T is a number greater than or equal to 0 and less than or equal to 1.
Step K7: compare each particle's mutation probability with the random number T. If T is less than or equal to the mutation probability, flip the particle's dimension value; if T is greater than the mutation probability, keep the dimension value unchanged. This realizes the update of the particle's D-dimensional vector.
Step K8: repeat steps K4 to K7 to iterate the particles, and judge whether the number of iterations equals the preset maximum number of iterations; if so, output the global best, otherwise return to step K4. The best non-sensitive set is obtained from the global best.
Step K9: store the best non-sensitive set in emerging memory devices, or in conventional memory using near-threshold/sub-threshold voltage technology.
Preferably, in step K2, one particle in the swarm is chosen and its D-dimensional vector is initialized using the result of the sensitivity analysis, while the D-dimensional vectors of the remaining particles are initialized randomly; the swarm consists of M particles, where M is a natural number greater than 1.
Further, in step K4, computing the fitness of a particle and the sensitivity of its non-sensitive set comprises the following steps:
Step K41: mark the weight matrices whose position in the D-dimensional vector is 1 as non-sensitive, and the weight matrices whose position is 0 as sensitive; the number of positions equal to 1 in the D-dimensional vector is the particle's fitness.
Step K42: gather the non-sensitive matrices of the D-dimensional vector into a non-sensitive set, let this set incur errors with error probability P, and obtain the fourth accuracy of the binary neural network; repeat this N times to obtain a frequency histogram of the fourth accuracy, and take the mean of the histogram as the fifth accuracy, i.e. the accuracy the binary neural network reaches when errors occur in the non-sensitive set; P is a number greater than 0 and less than 1, and N is a natural number greater than 100. The difference between the original accuracy and the fifth accuracy is the sensitivity of the non-sensitive set.
Further, in step K5, if the sensitivity of the non-sensitive set obtained in step K42 is less than or equal to the preset maximum tolerable accuracy loss, judge whether the particle's fitness is greater than its personal best; if so, take the fitness as the new personal best, otherwise keep the personal best unchanged. The personal bests of the M particles are then compared, and the largest is taken as the global best.
Further, in step K6, updating the velocity and mutation probability of a particle comprises the following steps:
Step K61: calculate the update velocity v_id of every dimension of the particle by the expression:
v_id = w·v_id + c1·rand()·(p_id − x_id) + c2·rand()·(p_gd − x_id)  ①
where w is the inertia weight, c1 and c2 are acceleration constants, rand() generates a random number, p_id is the particle's personal best, x_id is the particle's current value, and p_gd is the global best of the M particles.
Step K62: map the update velocity v_id to the mutation probability of the dimension value via the mapping expression ②, in which v_id is the update velocity of each dimension.
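Steps K6-K7 can be sketched as one binary-PSO update per particle. Note a loudly labeled assumption: the patent's mapping expression ② is not reproduced here, so the sigmoid mapping standard in binary PSO is assumed in its place; all names are illustrative.

```python
import math
import random

def bpso_update(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """One update of a particle's D-dimensional binary vector (steps K6-K7).

    Formula (1) updates the velocity; the patent's mapping formula (2) is not
    given here, so the standard binary-PSO sigmoid is ASSUMED instead.
    """
    new_x, new_v = [], []
    for d in range(len(x)):
        vid = (w * v[d]
               + c1 * rng.random() * (pbest[d] - x[d])   # pull toward personal best
               + c2 * rng.random() * (gbest[d] - x[d]))  # pull toward global best
        prob = 1.0 / (1.0 + math.exp(-vid))              # assumed mapping (2)
        T = rng.random()                                 # random number T in [0, 1]
        new_x.append(1 - x[d] if T <= prob else x[d])    # flip 1 <-> 0 (step K7)
        new_v.append(vid)
    return new_x, new_v
```

A strongly negative velocity yields a near-zero mutation probability, so the dimension value is almost always kept, matching the intent of step K7.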
Compared with the prior art, the invention has the following advantages:
(1) The present invention can obtain the weight matrices in a neural network that are non-sensitive to accuracy, as follows: in method 1, the sensitivity threshold is adjusted to divide the weight matrices; in method 2, the result of the sensitivity analysis is used to initialize the D-dimensional vector of one particle, the other particles are initialized randomly, and the D-dimensional vectors of the M particles are iteratively updated to obtain a division of the weight matrices. The benefit of this design is that the best non-sensitive set of weight matrices can be obtained: even if errors occur in the best non-sensitive set, the accuracy of the network will not suffer heavy losses.
(2) In method 1, the invention continually adjusts the sensitivity threshold by comparison with the user-set maximum tolerable accuracy loss, guaranteeing that the best non-sensitive set of weight matrices is obtained under that loss constraint. In addition, in method 2, the user-set maximum tolerable accuracy loss is added as a constraint condition, guaranteeing the recognition accuracy of the binary neural network.
(3) The present invention is versatile across emerging memory devices and near-threshold/sub-threshold voltage technologies under different processes and technologies. Because the reliability of emerging memory devices and of near-threshold/sub-threshold voltage techniques varies with process and technology, the error probability P is used to model their unreliability, and P can be determined according to the actual process and technology.
(4) The present invention cleverly exploits the neural network weights to reduce hardware resource overhead. Storing the weights of the best non-sensitive set on emerging memory devices, which are structurally simpler than conventional memory devices and have small resource overhead, reduces the use of conventional memory and saves hardware resources. Alternatively, the invention stores the weights of the best non-sensitive set in conventional memory devices using near-threshold/sub-threshold voltage technology, whose supply voltage is below the normal turn-on voltage of the transistor, thereby reducing circuit power consumption.
Detailed description of the invention
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only certain embodiments of the present invention and are therefore not to be construed as limiting the scope of protection; those of ordinary skill in the art may derive other relevant drawings from these drawings without creative effort.
Fig. 1 is a flow chart (one) of the invention.
Fig. 2 is the flow chart of assessing the sensitivity of a weight matrix in the invention.
Fig. 3 is the flow chart of dividing the weight matrices in the invention.
Fig. 4 is the flow chart of assessing the sensitivity of the non-sensitive set in the invention.
Fig. 5 is the flow chart of obtaining the best non-sensitive set in the invention.
Fig. 6 is a flow chart (two) of the invention.
Fig. 7 is a flow chart (three) of the invention.
Specific embodiment
To make the purposes, technical schemes and advantages of the application clearer, the present invention is further described below with reference to the drawings and embodiments; embodiments of the present invention include but are not limited to the following examples. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative work fall within the scope of protection of this application.
Embodiment 1
As shown in Figures 1 to 5, this embodiment provides a binary neural network hardware compression method based on weight sensitivity. It should be noted that ordinal terms such as "first", "second" and "third" in this embodiment are only used to distinguish items of the same kind. The method comprises the following steps:
In the first step, a binary neural network is trained to obtain its weight matrices and original accuracy.
In the second step, the sensitivity of each weight matrix is assessed, as follows:
(21) Preset an error probability P to model the unreliability of emerging memory devices and of near-threshold/sub-threshold voltages; P is a number greater than 0 and less than 1. That is, each weight in the weight matrix incurs an error (1 → -1, -1 → 1) with probability P.
(22) Let each binary weight of the weight matrix flip with error probability P, and obtain the first accuracy of the binary neural network. While errors occur in this weight matrix, the other weight matrices of the binary neural network remain unchanged, and the recognition accuracy of the network is tested on the dataset.
(23) Repeat step (22) at least 100 times to obtain the frequency histogram of the first accuracy. Since whether each weight in the weight matrix incurs an error is a random event, the error pattern differs in every trial; the experiment is therefore repeated at least 100 times, and a frequency histogram is built from the results as the probability distribution.
(24) Compute the mean of the frequency histogram of the first accuracy, and take this mean as the second accuracy, i.e. the accuracy the binary neural network reaches when errors occur in that weight matrix.
(25) Compute the sensitivity of the weight matrix as the difference between the original accuracy and the second accuracy of step (24). Every weight matrix of the neural network obtains its own sensitivity according to the above process.
Third step presets susceptibility threshold values, divides the sensitive collection and non-sensitive collection of weight matrix, specifically:
(31) by the susceptibility of the weight matrix by from successively sorting to small greatly, and default susceptibility threshold values.
(32) susceptibility of weight matrix is greater than susceptibility threshold values and is divided into sensitive collection, and by the susceptibility of weight matrix
Non-sensitive collection is divided into less than or equal to susceptibility threshold values.
In the fourth step, the sensitivity of the non-sensitive set of weight matrices is assessed, comprising the following steps:
(41) Let the non-sensitive set incur errors with error probability P, and obtain the third accuracy of the non-sensitive set.
(42) Repeat step (41) at least 100 times, obtain the mean of the frequency histogram of the third accuracy, and take the difference between the original accuracy and this mean as the sensitivity of the non-sensitive set.
In the fifth step, the sensitivity threshold is adjusted to obtain the best non-sensitive set of weight matrices, whose sensitivity equals the preset maximum tolerable accuracy loss. Specifically:
(51) Preset a maximum tolerable accuracy loss;
(52) Adjust the sensitivity threshold. If the sensitivity of the non-sensitive set equals the maximum tolerable accuracy loss, take the non-sensitive set as the best non-sensitive set and proceed to the sixth step; otherwise, continue adjusting the threshold and return to the third step to re-divide the weight matrices into sensitive and non-sensitive sets. The threshold is adjusted as follows: if the sensitivity of the non-sensitive set is less than the maximum tolerable accuracy loss, increase the threshold; if it is greater, decrease the threshold; repeat until the sensitivity of the non-sensitive set equals the maximum tolerable accuracy loss.
In the sixth step, the best non-sensitive set is stored in emerging memory devices, or in conventional memory using near-threshold/sub-threshold voltage technology.
Embodiment 2
As shown in Fig. 6, this embodiment provides a binary neural network hardware compression method based on weight sensitivity that combines sensitivity analysis with the binary particle swarm optimization algorithm to search for the combination of weight matrices with low sensitivity to recognition accuracy in a binary neural network. In binary particle swarm optimization, a swarm of M particles searches a D-dimensional binary target space; particle positions are updated according to the velocity update formula, the quality of each solution is evaluated with a fitness function, and the optimum is found through iterative updates. It should be noted that ordinal terms such as "fourth" and "fifth" in this embodiment are only used to distinguish items of the same kind. The binary neural network hardware compression method comprises the following steps:
In the first step, a binary neural network is trained to obtain its weight matrices and original accuracy.
In the second step, the particle swarm is initialized to obtain each particle's D-dimensional vector. Here D is the number of weight matrices in the neural network, i.e. each dimension corresponds to one weight matrix: "1" means the corresponding weight matrix is non-sensitive, and "0" means it is sensitive. Concretely: suppose the swarm contains M particles, each represented by a D-dimensional vector whose every dimension is binary (1 or 0). One particle in the swarm is chosen and its D-dimensional vector is initialized from the sensitivity analysis; the D-dimensional vectors of the other (M−1) particles are initialized randomly.
In the third step, the constraint condition is added: the preset maximum tolerable accuracy loss serves as a constraint in the algorithm, i.e. the search result maximizes the number of non-sensitive weight matrices subject to the accuracy-loss constraint.
In the fourth step, the fitness of each particle in the swarm is computed, and the sensitivity of the particle's non-sensitive set is obtained. Specifically:
(41) Mark the weight matrices whose position in the D-dimensional vector is 1 as non-sensitive, and those whose position is 0 as sensitive. The number of positions equal to 1 in the D-dimensional vector is the particle's fitness.
(42) Gather the non-sensitive matrices of the D-dimensional vector into a non-sensitive set, let this set incur errors with error probability P, and obtain the fourth accuracy of the binary neural network; repeat at least 100 times to obtain the frequency histogram of the fourth accuracy, and take the mean of the histogram as the fifth accuracy, i.e. the accuracy the binary neural network reaches when errors occur in the non-sensitive set. The difference between the original accuracy and the fifth accuracy is the sensitivity of the non-sensitive set.
In the fifth step, each particle's personal best and the global best are updated. If the sensitivity of the non-sensitive set obtained in step (42) is less than or equal to the preset maximum tolerable accuracy loss, judge whether the particle's fitness is greater than its personal best: if so, take the fitness as the new personal best; otherwise keep the personal best unchanged. The personal bests of the M particles are then compared, and the largest is taken as the global best.
In the sixth step, the update velocity and mutation probability of each particle dimension are computed, and a random number T is generated; T is a number greater than or equal to 0 and less than or equal to 1. Specifically:
(61) Calculate the update velocity v_id of every dimension of the particle by the expression:
v_id = w·v_id + c1·rand()·(p_id − x_id) + c2·rand()·(p_gd − x_id)  ①
where w is the inertia weight, c1 and c2 are acceleration constants, rand() generates a random number, p_id is the particle's personal best, x_id is the particle's current value, and p_gd is the global best of the M particles.
(62) Map the update velocity v_id to the mutation probability of the dimension value via the mapping expression ②, in which v_id is the update velocity of each dimension.
The seventh step compares each particle's mutation probability with the random number T. If T is less than or equal to the mutation probability, the dimension value is inverted, i.e. a 1 becomes 0 and a 0 becomes 1; if T is greater than the mutation probability, the dimension value is kept unchanged. This realizes the update of the particle's D-dimensional vector.
The eighth step repeats the fourth through seventh steps to iterate the particles and checks whether the number of iterations equals the preset maximum number of iterations: if so, the global best is output; otherwise the procedure returns to the fourth step. The best insensitive set is then obtained from the global best.
The ninth step stores the best insensitive set in an emerging memory device or in a conventional memory operated at near-threshold/sub-threshold voltage.
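The velocity update of expression ① and the bit-flip rule of the seventh step can be sketched in a few lines. Note the hedges: the sigmoid used to map speed to mutation probability is an assumption (the usual choice in binary PSO), since mapping expression ② is not reproduced in the text; the flip rule itself follows the seventh step as stated.

```python
import numpy as np

def pso_bit_update(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    # Velocity update per expression (1):
    #   vid = w*vid + c1*rand()*(pid - xid) + c2*rand()*(pgd - xid)
    rng = rng or np.random.default_rng(0)
    v = (w * v
         + c1 * rng.random(x.shape) * (pbest - x)
         + c2 * rng.random(x.shape) * (gbest - x))
    # Assumed sigmoid mapping from update speed to mutation probability
    # (expression (2) is not reproduced in the text).
    mutate_prob = 1.0 / (1.0 + np.exp(-v))
    t = rng.random(x.shape)                   # random number T in [0, 1]
    # Flip the bit (1 -> 0, 0 -> 1) when T <= mutation probability.
    x = np.where(t <= mutate_prob, 1 - x, x)
    return x, v
```

With zero velocity the assumed mapping gives a mutation probability of 0.5; a pull toward set bits in the personal and global bests makes the velocity positive and raises the flip probability of disagreeing bits.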
Embodiment 3
As shown in Figure 7, this embodiment provides a binary neural network hardware compression method based on weight sensitivity. Ordinal terms such as "first", "second" and "third" in this embodiment serve only to distinguish like items. Specifically, the method comprises the following steps:
The first step trains the binary neural network to obtain the weight matrices and the original accuracy.
The second step assesses the sensitivity of each weight matrix, as follows:
(21) Preset an error probability P, with 0 < P < 1, to model the unreliability of the emerging memory device or of near-threshold/sub-threshold voltage operation; that is, each weight in a weight matrix suffers an error (1 → -1 or -1 → 1) with probability P.
(22) Inject errors with probability P into the weights of one weight matrix in turn and obtain the first accuracy of the binary neural network. While that matrix is perturbed, the other weight matrices of the binary neural network remain unchanged, and the recognition accuracy of the network is measured on the data set.
(23) Repeat step (22) at least 100 times to obtain the frequency histogram of the first accuracy. Because whether each weight suffers an error is a random event, the error pattern differs from trial to trial; the experiment is therefore repeated at least 100 times and the frequency histogram is taken as the probability distribution of the result.
(24) Take the mean of the frequency histogram of the first accuracy as the second accuracy, i.e. the accuracy the binary neural network reaches when errors occur in that weight matrix.
(25) The sensitivity of the weight matrix is the difference between the original accuracy and the second accuracy of step (24). Every weight matrix of the neural network obtains its sensitivity by this procedure.
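The repeated error-injection experiment of steps (21)-(25) can be sketched as follows. The +1/-1 weight encoding and the accuracy-evaluation callback `test_fn` are assumptions standing in for the trained network and data set, which the text does not specify in code form.

```python
import numpy as np

def flip_with_probability(w, p, rng):
    # Each binary weight (+1/-1) flips sign independently with probability p,
    # modeling an unreliable memory cell (1 -> -1, -1 -> 1).
    mask = rng.random(w.shape) < p
    return np.where(mask, -w, w)

def matrix_sensitivity(weights, layer_idx, p, test_fn, original_acc,
                       n_trials=100, seed=0):
    # Monte Carlo estimate of one weight matrix's sensitivity: inject errors
    # into only that matrix, leave the others unchanged, and average the
    # resulting accuracy over n_trials repetitions (the "frequency histogram").
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(n_trials):
        perturbed = [w.copy() for w in weights]
        perturbed[layer_idx] = flip_with_probability(perturbed[layer_idx], p, rng)
        accs.append(test_fn(perturbed))      # one trial's "first accuracy"
    second_acc = float(np.mean(accs))        # mean of the histogram
    return original_acc - second_acc         # sensitivity = accuracy drop
```

Running this once per weight matrix yields the per-matrix sensitivities used by the threshold split of the third step.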
The third step presets a sensitivity threshold and divides the weight matrices into a sensitive set and an insensitive set, specifically:
(31) Sort the weight matrices by sensitivity from largest to smallest and preset a sensitivity threshold.
(32) Place the weight matrices whose sensitivity exceeds the threshold in the sensitive set, and those whose sensitivity is less than or equal to the threshold in the insensitive set.
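The threshold split of steps (31)-(32) amounts to a sort and a comparison; a minimal sketch:

```python
def partition_by_threshold(sensitivities, threshold):
    # Sort matrix indices by sensitivity (descending) and split around the
    # threshold: strictly above -> sensitive set, at or below -> insensitive set.
    order = sorted(range(len(sensitivities)),
                   key=lambda i: sensitivities[i], reverse=True)
    sensitive = [i for i in order if sensitivities[i] > threshold]
    insensitive = [i for i in order if sensitivities[i] <= threshold]
    return sensitive, insensitive
```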
The fourth step assesses the sensitivity of the insensitive set of weight matrices, as follows:
(41) Inject errors into the insensitive set with error probability P and obtain the third accuracy.
(42) Repeat step (41) at least 100 times, take the mean of the frequency histogram of the third accuracy, and use the difference between the original accuracy and that mean as the sensitivity of the insensitive set.
The fifth step adjusts the sensitivity threshold to obtain the suboptimal insensitive set of weight matrices; the sensitivity of the suboptimal insensitive set equals the preset maximum accuracy-loss value. The specific steps are:
(51) Preset the maximum accuracy-loss value.
(52) Adjust the sensitivity threshold. If the sensitivity of the insensitive set equals the maximum accuracy-loss value, take that set as the suboptimal insensitive set and proceed to the sixth step; otherwise continue adjusting the threshold and return to the third step to re-divide the sensitive and insensitive sets. The adjustment works as follows: if the sensitivity of the insensitive set is below the maximum accuracy-loss value, raise the threshold; if it is above the maximum accuracy-loss value, lower the threshold; repeat until the sensitivity of the insensitive set equals the maximum accuracy-loss value.
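Step (52)'s adjustment loop can be sketched as below. Two hedges: exact equality of two measured floating-point quantities is unlikely in practice, so a tolerance `tol` is assumed; and `set_sensitivity_fn` is an assumed helper that measures the sensitivity of a candidate insensitive set (e.g. by the repeated error-injection experiment of the fourth step).

```python
def tune_threshold(sensitivities, set_sensitivity_fn, max_loss,
                   t0=0.0, step=0.001, tol=1e-3, max_iter=1000):
    # Raise/lower the sensitivity threshold until the insensitive set's joint
    # sensitivity matches the preset maximum accuracy-loss value (within tol).
    t = t0
    for _ in range(max_iter):
        # Re-divide: matrices at or below the threshold form the insensitive set.
        insens = [i for i, s in enumerate(sensitivities) if s <= t]
        loss = set_sensitivity_fn(insens)
        if abs(loss - max_loss) <= tol:
            return t, insens
        t += step if loss < max_loss else -step   # raise if under, lower if over
    return t, insens
```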
The sixth step initializes the particle swarm to obtain the initial D-dimensional vector of each particle, where D is the number of weight matrices in the neural network, i.e. each dimension corresponds to one weight matrix: a "1" marks the corresponding weight matrix as insensitive and a "0" marks it as sensitive. Concretely, suppose the swarm contains M particles; choose any one particle and initialize its D-dimensional vector from the suboptimal insensitive set, while the D-dimensional vectors of the other M-1 particles are initialized randomly.
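The swarm initialization of the sixth step might look as follows; `suboptimal_insensitive` is the index set found by the threshold search, and seeding particle 0 with it (rather than any other particle) is one possible arrangement assumed here.

```python
import numpy as np

def init_swarm(d, m, suboptimal_insensitive, rng=None):
    # One particle is seeded with the suboptimal insensitive set found by the
    # threshold search (bit i = 1 means weight matrix i is insensitive); the
    # other M-1 particles start from random 0/1 vectors.
    rng = rng or np.random.default_rng(0)
    swarm = rng.integers(0, 2, size=(m, d))
    seed = np.zeros(d, dtype=int)
    seed[list(suboptimal_insensitive)] = 1
    swarm[0] = seed
    return swarm
```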
The seventh step adds the constraint condition: the preset maximum accuracy-loss value serves as a constraint in the algorithm, i.e. the search result maximizes the number of insensitive weight matrices while satisfying the accuracy-loss value.
The eighth step obtains the fitness value of each particle in the swarm and the sensitivity of the particle's insensitive set, through the following steps:
(81) Mark every weight matrix whose position in the D-dimensional vector holds a 1 as insensitive, and every weight matrix whose position holds a 0 as sensitive. The number of positions holding a 1 in the D-dimensional vector is the particle's fitness value.
(82) Gather the matrices marked insensitive in the D-dimensional vector into an insensitive set, inject errors into that set with error probability P, and obtain the fourth accuracy of the binary neural network. Repeat this at least 100 times to obtain a frequency histogram of the fourth accuracy; the mean of the histogram is taken as the fifth accuracy, i.e. the accuracy the binary neural network reaches when errors occur in the insensitive set. The difference between the original accuracy and the fifth accuracy is the sensitivity of the insensitive set.
The ninth step updates each particle's personal best and the global best. If the sensitivity of the insensitive set obtained in step (82) is less than or equal to the preset maximum accuracy-loss value, check whether the particle's fitness value exceeds its personal best: if so, the fitness value becomes the new personal best; otherwise the personal best is kept unchanged. The personal bests of the M particles are then compared, and the largest one is taken as the global best.
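The fitness computation of the eighth step and the constrained best update of the ninth step can be sketched together; `set_sensitivity_fn` is an assumed helper returning the sensitivity of a given insensitive set (e.g. via the repeated error-injection experiment of step (82)).

```python
def update_bests(swarm, pbest_fit, pbest_pos, set_sensitivity_fn, max_loss):
    # Fitness = number of bits set to 1 (matrices marked insensitive).
    # A personal best advances only when the constraint holds (the insensitive
    # set's sensitivity stays within the preset maximum accuracy-loss value)
    # AND the fitness improves; the global best is the largest personal best.
    for i, particle in enumerate(swarm):
        insensitive = [d for d, bit in enumerate(particle) if bit == 1]
        fitness = len(insensitive)
        if set_sensitivity_fn(insensitive) <= max_loss and fitness > pbest_fit[i]:
            pbest_fit[i] = fitness
            pbest_pos[i] = list(particle)
    g = max(range(len(pbest_fit)), key=pbest_fit.__getitem__)
    return pbest_fit[g], pbest_pos[g]
```

A particle that marks too many matrices insensitive fails the constraint and simply never advances its personal best, which keeps the search inside the feasible region.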
The tenth step obtains the update speed and mutation probability of each particle dimension and generates a random number T, where T lies between 0 and 1 inclusive. Specifically:
(101) Compute the update speed vid of each dimension of the particle:
vid = w·vid + c1·rand()·(pid - xid) + c2·rand()·(pgd - xid) ①
where w is the inertia factor, c1 and c2 are acceleration constants, rand() generates a random number, pid is the particle's personal best, xid is the particle's current value, and pgd is the best value among the M particles.
(102) Map the update speed vid to the mutation probability of the dimension value through mapping expression ②, where vid denotes the update speed of each dimension.
The eleventh step compares the mutation probability with the random number T. If T is less than or equal to the mutation probability, the particle's dimension value is inverted, i.e. a 1 becomes 0 and a 0 becomes 1; if T is greater than the mutation probability, the dimension value is kept unchanged. This realizes the update of the particle's D-dimensional vector.
The twelfth step repeats the eighth through eleventh steps to iterate the particles and checks whether the number of iterations equals the preset maximum number of iterations: if so, the global best is output; otherwise the procedure returns to the eighth step to continue iterating the particles. The best insensitive set is then obtained from the global best.
The thirteenth step stores the best insensitive set in an emerging memory device or in a conventional memory operated at near-threshold/sub-threshold voltage.
The above embodiments are merely preferred embodiments of the present invention and are not intended to limit its scope of protection; any non-inventive variation made on the basis of the design principles of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
1. A binary neural network hardware compression method based on weight sensitivity, comprising the following steps:
Step S1: train the binary neural network to obtain the weight matrices and the original accuracy;
Step S2: assess the sensitivity of each weight matrix;
Step S3: preset a sensitivity threshold and divide the weight matrices into a sensitive set and an insensitive set;
Step S4: assess the sensitivity of the insensitive set of weight matrices;
Step S5: adjust the sensitivity threshold to obtain the best insensitive set of weight matrices, the sensitivity of the best insensitive set being equal to the preset maximum accuracy-loss value;
Step S6: store the best insensitive set in an emerging memory device or in a conventional memory operated at near-threshold/sub-threshold voltage.
2. The binary neural network hardware compression method based on weight sensitivity according to claim 1, characterized in that, in step S2, the sensitivity of each weight matrix is assessed through the following steps:
Step S21: preset an error probability P, 0 < P < 1, to model the unreliability of the emerging memory device or of near-threshold/sub-threshold voltage operation;
Step S22: inject errors with probability P into the weights of one weight matrix in turn and obtain the first accuracy of the binary neural network;
Step S23: repeat step S22 N times to obtain the frequency histogram of the first accuracy, N being a natural number greater than 100;
Step S24: take the mean of the frequency histogram of the first accuracy as the second accuracy, i.e. the accuracy the binary neural network reaches when errors occur in that weight matrix;
Step S25: obtain the sensitivity of the weight matrix as the difference between the original accuracy and the second accuracy of step S24.
3. The binary neural network hardware compression method based on weight sensitivity according to claim 1, characterized in that, in step S3, a sensitivity threshold is preset and the weight matrices are divided into a sensitive set and an insensitive set through the following steps:
Step S31: sort the weight matrices by sensitivity from largest to smallest and preset a sensitivity threshold;
Step S32: place the weight matrices whose sensitivity exceeds the threshold in the sensitive set, and those whose sensitivity is less than or equal to the threshold in the insensitive set.
4. The binary neural network hardware compression method based on weight sensitivity according to claim 1, characterized in that, in step S4, the sensitivity of the insensitive set of weight matrices is assessed through the following steps:
Step S41: inject errors into the insensitive set with error probability P, 0 < P < 1, and obtain the third accuracy of the insensitive set;
Step S42: repeat step S41 N times, take the mean of the frequency histogram of the third accuracy, and use the difference between the original accuracy and that mean as the sensitivity of the insensitive set.
5. The binary neural network hardware compression method based on weight sensitivity according to claim 1, characterized in that, in step S5, the sensitivity threshold is adjusted to obtain the best insensitive set of weight matrices through the following steps:
Step S51: preset the maximum accuracy-loss value;
Step S52: adjust the sensitivity threshold; if the sensitivity of the insensitive set equals the maximum accuracy-loss value, take the insensitive set as the best insensitive set and proceed to step S6; otherwise continue adjusting the threshold and return to step S3 to re-divide the sensitive and insensitive sets of weight matrices.
6. A binary neural network hardware compression method based on weight sensitivity, comprising the following steps:
Step K1: train the binary neural network to obtain the weight matrices and the original accuracy;
Step K2: initialize the particle swarm to obtain the initial D-dimensional vector of each particle, D being the number of weight matrices in the neural network;
Step K3: add the constraint condition;
Step K4: obtain the fitness value of each particle in the swarm and the sensitivity of the particle's insensitive set;
Step K5: update each particle's personal best and the global best;
Step K6: obtain the update speed and mutation probability of each particle dimension and generate a random number T, T lying between 0 and 1 inclusive;
Step K7: compare the particle's mutation probability with the random number T; if T is less than or equal to the mutation probability, invert the dimension value; if T is greater than the mutation probability, keep the dimension value unchanged, thereby updating the particle's D-dimensional vector;
Step K8: repeat steps K4 to K7 to iterate the particles and check whether the number of iterations equals the preset maximum number of iterations; if so, output the global best, otherwise return to step K4; obtain the best insensitive set from the global best;
Step K9: store the best insensitive set in an emerging memory device or in a conventional memory operated at near-threshold/sub-threshold voltage.
7. The binary neural network hardware compression method based on weight sensitivity according to claim 6, characterized in that, in step K2, any one particle of the swarm is chosen and its D-dimensional vector is initialized using the sensitivity analysis, while the D-dimensional vectors of the remaining particles are initialized randomly; the swarm consists of M particles, M being a natural number greater than 1.
8. The binary neural network hardware compression method based on weight sensitivity according to claim 6, characterized in that, in step K4, the fitness value of each particle in the swarm and the sensitivity of its insensitive set are obtained through the following steps:
Step K41: mark every weight matrix whose position in the D-dimensional vector holds a 1 as insensitive, and every weight matrix whose position holds a 0 as sensitive; the number of positions holding a 1 in the D-dimensional vector is the particle's fitness value;
Step K42: gather the matrices marked insensitive in the D-dimensional vector into an insensitive set, inject errors into that set with error probability P, and obtain the fourth accuracy of the binary neural network; repeat this N times to obtain the frequency histogram of the fourth accuracy, and take the mean of the histogram as the fifth accuracy, i.e. the accuracy the binary neural network reaches when errors occur in the insensitive set; P is a number greater than 0 and less than 1, and N is a natural number greater than 100; the difference between the original accuracy and the fifth accuracy is the sensitivity of the insensitive set.
9. The binary neural network hardware compression method based on weight sensitivity according to claim 8, characterized in that, in step K5, if the sensitivity of the insensitive set obtained in step K42 is less than or equal to the preset maximum accuracy-loss value, it is checked whether the particle's fitness value exceeds its personal best: if so, the fitness value becomes the new personal best; otherwise the personal best is kept unchanged; the personal bests of the M particles are then compared, and the largest one is taken as the global best.
10. The binary neural network hardware compression method based on weight sensitivity according to claim 6, characterized in that, in step K6, the update speed and mutation probability of each particle are obtained through the following steps:
Step K61: compute the update speed vid of each dimension of the particle:
vid = w·vid + c1·rand()·(pid - xid) + c2·rand()·(pgd - xid) ①
where w is the inertia factor, c1 and c2 are acceleration constants, rand() generates a random number, pid is the particle's personal best, xid is the particle's current value, and pgd is the best value among the M particles;
Step K62: map the update speed vid to the mutation probability of the dimension value through mapping expression ②, where vid denotes the update speed of each dimension.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811000016.7A CN109212960B (en) | 2018-08-30 | 2018-08-30 | Weight sensitivity-based binary neural network hardware compression method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109212960A true CN109212960A (en) | 2019-01-15 |
CN109212960B CN109212960B (en) | 2020-08-14 |
Family
ID=64986164
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811000016.7A Active CN109212960B (en) | 2018-08-30 | 2018-08-30 | Weight sensitivity-based binary neural network hardware compression method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109212960B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7372713B2 (en) * | 2006-04-17 | 2008-05-13 | Texas Instruments Incorporated | Match sensing circuit for a content addressable memory device |
CN107729999A (en) * | 2016-08-12 | 2018-02-23 | 北京深鉴科技有限公司 | Consider the deep neural network compression method of matrix correlation |
CN107967515A (en) * | 2016-10-19 | 2018-04-27 | 三星电子株式会社 | The method and apparatus quantified for neutral net |
CN108322221A (en) * | 2017-01-18 | 2018-07-24 | 华南理工大学 | A method of being used for depth convolutional neural networks model compression |
CN108334945A (en) * | 2018-01-30 | 2018-07-27 | 中国科学院自动化研究所 | The acceleration of deep neural network and compression method and device |
Non-Patent Citations (3)
Title |
---|
YIXING LI et al.: "Build a compact binary neural network through bit-level sensitivity and data pruning", Neurocomputing * |
CAO Wenlong et al.: "A survey of neural network model compression methods", Application Research of Computers * |
LEI Jie et al.: "A survey of deep network model compression", Journal of Software * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109978160A * | 2019-03-25 | 2019-07-05 | Beijing Zhongke Cambricon Technology Co., Ltd. | Configuration device and method of an artificial intelligence processor and related products |
CN109978160B * | 2019-03-25 | 2021-03-02 | Cambricon Technologies Corporation Limited | Configuration device and method of artificial intelligence processor and related products |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||