CN109212960A - Binary neural network hardware compression method based on weight sensitivity - Google Patents

Binary neural network hardware compression method based on weight sensitivity

Info

Publication number
CN109212960A
Authority
CN
China
Prior art keywords
sensitivity
neural network
particle
weight matrix
sensitive set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811000016.7A
Other languages
Chinese (zh)
Other versions
CN109212960B (en)
Inventor
周军
王尹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201811000016.7A
Publication of CN109212960A
Application granted
Publication of CN109212960B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00: Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02: Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion; electric
    • G05B13/0205: Adaptive control systems, electric, not using a model or a simulator of the controlled system
    • G05B13/024: Adaptive control systems, electric, not using a model or a simulator of the controlled system, in which a parameter or coefficient is automatically adjusted to optimise the performance

Abstract

The invention discloses a binary neural network hardware compression method based on weight sensitivity, comprising the following steps: training a binary neural network to obtain the weight matrices and the original accuracy; evaluating the sensitivity of each weight matrix; presetting a sensitivity threshold and partitioning the weight matrices into a sensitive set and a non-sensitive set; evaluating the sensitivity of the non-sensitive set of weight matrices; and adjusting the sensitivity threshold to obtain the optimal non-sensitive set of weight matrices, the sensitivity of the optimal non-sensitive set being equal to a preset maximum tolerable loss value. The optimal non-sensitive set is stored in a novel memory device or in a conventional memory that uses near-threshold/sub-threshold voltage techniques. With the above scheme, the invention has the advantages of low power consumption, high recognition rate, good versatility and low cost, and has broad market prospects in the field of hardware compression.

Description

Binary neural network hardware compression method based on weight sensitivity
Technical field
The present invention relates to the technical field of hardware compression, and in particular to a binary neural network hardware compression method based on weight sensitivity.
Background art
At present, the mainstream approaches for reducing the resource overhead and power consumption of neural network hardware implementations include hardware architecture optimization, neural network compression and binary neural networks. Hardware architecture optimization designs more efficient ways to realize the neural network at the hardware level, reducing the memory occupied by data and the redundancy in memory accesses and computation, so as to reduce resource overhead and power consumption. Neural network compression reduces the number of weights and the number of quantization bits to compress the network model while keeping the recognition accuracy of the compressed network unaffected. In general, a large number of weights in a neural network have absolute values close to 0, so weights with very small absolute values can be removed (set to 0), meaning that there is no connection at that position, which reduces the total number of weights in the network. The weights of a network are high-precision decimals; for hardware storage they must be quantized to fixed-point numbers of a fixed bit width, and to preserve precision a weight is usually stored with 32 bits, which causes a large storage overhead. Each weight can instead be quantized more coarsely, i.e. the high-precision decimal is represented with fewer bits (e.g. 3 bits), and different layers of the network may use different quantization bit widths so that the recognition accuracy of the neural network is maintained. Traditional neural network compression proceeds as follows: first, a neural network is trained normally, the weights of the trained network whose absolute values are below a threshold are set to 0 (i.e. the connection between the two neurons is removed) so that the network becomes sparse and the number of weights is reduced, and the sparse network is then retrained. Next, the remaining weights are grouped into several classes by K-means clustering and each class is encoded; each weight is then represented by its class code (if there are 4 classes, 2 bits suffice per weight), the weights of a class share the same value, and the codes are retrained. Finally, Huffman coding is used to further optimize the encoding of each weight, achieving effective compression.
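For illustration only (not code from this patent), the following Python sketch shows the magnitude-pruning and K-means weight-sharing stages of the traditional compression pipeline just described; the threshold, cluster count and toy matrix are placeholder choices, and the retraining and Huffman-coding stages are omitted:

```python
# Illustrative sketch of traditional compression: magnitude pruning followed by
# K-means weight sharing (retraining and Huffman coding omitted for brevity).
import numpy as np
from sklearn.cluster import KMeans

def prune(weights, threshold=0.05):
    """Set weights with small absolute value to 0, removing the connection."""
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

def share_weights(weights, mask, n_clusters=4):
    """Cluster the remaining weights into n_clusters classes; every weight in a
    class shares the class centroid, so only a short code per weight is stored."""
    remaining = weights[mask].reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(remaining)
    shared = weights.copy()
    shared[mask] = km.cluster_centers_[km.labels_].ravel()
    return shared, km  # with 4 classes, each weight needs only a 2-bit code

if __name__ == "__main__":
    w = np.random.randn(64, 64) * 0.1            # toy weight matrix
    w_pruned, m = prune(w)
    w_shared, _ = share_weights(w_pruned, m)
    print("nonzero weights:", int(m.sum()), "of", w.size)
```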
A binary neural network directly quantizes the weights (and, in some variants, the input values of every layer) to 1 or -1, so that only 1 bit per weight is needed in hardware, greatly reducing the number of bits required for each weight. There are four main forms of binary neural network: BinaryConnect, BNN (binary neural network), BWN (binary weight network) and XNOR-Net; they differ only in what is quantized, but all represent the values as 1 or -1. BinaryConnect and BWN quantize only the weights to 1 or -1, while BNN and XNOR-Net quantize both the weights and the input values of every layer to 1 or -1. In a neural network, the computation of each layer is mainly a multiplication between the input vector and the weight matrix. If only the weights are quantized to 1 or -1, the multiplication between the input vector and the weight matrix turns into additions and subtractions, eliminating multiplications; if both the weights and the layer inputs are quantized to 1 or -1, the multiplication turns into 1-bit XNOR operations, which save even more power than additions and subtractions. Compared with BinaryConnect and BNN, BWN and XNOR-Net additionally introduce scale factors and can therefore better preserve the recognition accuracy of complex tasks.
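As a rough illustration of the binarization and 1-bit arithmetic just described (a sketch, not code from this patent), the following Python snippet quantizes vectors to ±1 and checks that their dot product can be computed with XNOR and popcount:

```python
# Illustrative sketch: binarize to +1/-1 and replace the multiply-accumulate of
# a layer with XNOR + popcount, in the style of BNN/XNOR-Net networks.
import numpy as np

def binarize(x):
    """Quantize to +1/-1 (sign function, with 0 mapped to +1)."""
    return np.where(x >= 0, 1, -1).astype(np.int8)

def dot_xnor(a_bits, w_bits):
    """Dot product of two +1/-1 vectors via XNOR/popcount on a 0/1 encoding.
    With +1/-1 encoded as bit 1/0: dot = 2 * popcount(XNOR(a, w)) - n."""
    a = (a_bits > 0).astype(np.uint8)
    w = (w_bits > 0).astype(np.uint8)
    xnor = 1 - (a ^ w)                      # 1 where the bits agree
    return 2 * int(xnor.sum()) - a_bits.size

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = binarize(rng.standard_normal(128))
    w = binarize(rng.standard_normal(128))
    assert dot_xnor(a, w) == int(a.astype(int) @ w.astype(int))
```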
Traditional compression methods have the following disadvantages:
First, hardware architecture optimization and neural network compression save less hardware resource and power. Compared with them, a binary neural network achieves at least 32x compression (the weights of the original network are represented with 32 bits, whereas a binary neural network uses only 1 bit), and computationally the multiplications are converted into additions/subtractions or 1-bit XNOR operations, which greatly reduces storage overhead and computation power. Although hardware architecture optimization and neural network compression save storage and power to some extent, they are not as simple and effective as a binary neural network.
Second, the recognition accuracy of binary neural networks is lower. Among the various binary neural networks, for classification tasks, BinaryConnect and BNN perform well only on relatively small data sets, such as the handwritten digit set MNIST, the common-object data set CIFAR and the real-world street-number data set SVHN; when the data set becomes very large, e.g. ImageNet, the recognition accuracy of BinaryConnect and BNN degrades sharply. For this reason, BWN and XNOR-Net need additional scale factors to guarantee the recognition accuracy of the network.
Third, traditional compression methods use memory devices such as 6T SRAM, whose hardware resource cost and power consumption are both large, which limits the scale of neural network a chip can implement. Although a binary neural network performs well on such conventional hardware, its fault tolerance is not fully exploited. The current trend of the mainstream is to use novel memory devices such as RRAM (resistive memory), which can greatly save hardware resources and allow larger neural networks to be deployed on hardware, but which are unreliable: if the weights of the entire neural network are stored on a novel memory device, the recognition accuracy of the network can be severely affected, so using novel memory devices still poses challenges.
Fourth, conventional memories are supplied at the nominal voltage; to reduce circuit power consumption, near-threshold/sub-threshold voltage techniques can be used. The near-threshold technique adjusts the circuit supply voltage to near the transistor turn-on voltage (slightly above or below it), which greatly improves performance in terms of working frequency, energy efficiency and other aspects; the sub-threshold technique lowers the supply voltage below the transistor turn-on voltage to provide the lowest energy consumption. Near-threshold/sub-threshold voltage techniques can therefore be applied to the conventional memory that stores the neural network weights in order to reduce power consumption. However, these techniques still face challenges of uncertainty and variability: at low supply voltage the circuit is susceptible to interference, causing the weights stored in the conventional memory to be corrupted. If the whole conventional memory is operated at near-threshold/sub-threshold voltage, the recognition accuracy of the neural network can be severely affected.
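As a simple working model of the unreliability described above (an assumption chosen for illustration, not a characterization of any particular device or process), each stored binary weight can be treated as flipping sign independently with an error probability P; a minimal Python sketch of this error model follows:

```python
# Minimal sketch of the storage-error model used throughout this description:
# every binary weight stored on an unreliable memory (a novel memory device or
# a conventional memory at near-/sub-threshold voltage) flips sign with
# probability P, independently of the other weights.
import numpy as np

def inject_errors(binary_weights, p, rng=None):
    """Flip each +1/-1 weight with probability p (1 -> -1, -1 -> 1)."""
    rng = rng or np.random.default_rng()
    flips = rng.random(binary_weights.shape) < p
    return np.where(flips, -binary_weights, binary_weights)

if __name__ == "__main__":
    w = np.where(np.random.default_rng(0).random((4, 4)) < 0.5, 1, -1)
    print(inject_errors(w, p=0.05))
```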
Summary of the invention
In view of the above problems, the object of the present invention is to provide a binary neural network hardware compression method based on weight sensitivity. The technical solution adopted by the invention is as follows:
A binary neural network hardware compression method based on weight sensitivity, comprising the following steps:
Step S1: train a binary neural network to obtain the weight matrices and the original accuracy.
Step S2: evaluate the sensitivity of each weight matrix.
Step S3: preset a sensitivity threshold and partition the weight matrices into a sensitive set and a non-sensitive set.
Step S4: evaluate the sensitivity of the non-sensitive set of weight matrices.
Step S5: adjust the sensitivity threshold to obtain the optimal non-sensitive set of weight matrices; the sensitivity of the optimal non-sensitive set is equal to the preset maximum tolerable loss value.
Step S6: store the optimal non-sensitive set in a novel memory device or in a conventional memory that uses near-threshold/sub-threshold voltage techniques.
Further, in the step S2, evaluating the sensitivity of each weight matrix comprises the following steps:
Step S21: preset an error probability P to characterize the unreliability of the novel memory device and of the near-threshold/sub-threshold voltage; P is a number greater than 0 and less than 1.
Step S22: inject errors into the weights of the weight matrix under test, each binary weight being flipped with error probability P, and obtain the first accuracy of the binary neural network.
Step S23: repeat step S22 N times to obtain a frequency histogram of the first accuracy; N is a natural number greater than 100.
Step S24: compute the average of the frequency histogram of the first accuracy and take it as the second accuracy, i.e. the accuracy the binary neural network reaches when the weight matrix is corrupted.
Step S25: obtain the sensitivity of the weight matrix, the sensitivity being the difference between the original accuracy and the second accuracy of step S24.
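A minimal Python sketch of the sensitivity evaluation of steps S21 to S25 is given below; `evaluate_accuracy` is a placeholder for whatever test-set evaluation the trained binary network provides, and the values of P, N and the random seed are illustrative choices rather than values specified by this description:

```python
# Sketch of steps S21-S25: estimate the sensitivity of one weight matrix by
# injecting sign-flip errors with probability P, repeating N times, and taking
# the drop from the original accuracy to the mean corrupted accuracy.
import numpy as np

def inject_errors(w, p, rng):
    return np.where(rng.random(w.shape) < p, -w, w)

def matrix_sensitivity(weights, index, evaluate_accuracy, original_accuracy,
                       p=0.01, n_trials=200, seed=0):
    """weights: list of +1/-1 weight matrices; index: matrix under test;
    evaluate_accuracy(list_of_matrices) -> accuracy on the test set."""
    rng = np.random.default_rng(seed)
    accuracies = []
    for _ in range(n_trials):                        # step S23: repeat N times
        corrupted = list(weights)                    # other matrices unchanged
        corrupted[index] = inject_errors(weights[index], p, rng)   # step S22
        accuracies.append(evaluate_accuracy(corrupted))            # 1st accuracy
    mean_acc = float(np.mean(accuracies))            # step S24: 2nd accuracy
    return original_accuracy - mean_acc              # step S25: sensitivity
```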
Further, in the step S3, presetting a sensitivity threshold and partitioning the weight matrices into a sensitive set and a non-sensitive set comprises the following steps:
Step S31: sort the sensitivities of the weight matrices in descending order and preset a sensitivity threshold.
Step S32: assign the weight matrices whose sensitivity is greater than the sensitivity threshold to the sensitive set, and the weight matrices whose sensitivity is less than or equal to the sensitivity threshold to the non-sensitive set.
Preferably, in the step S4, evaluating the sensitivity of the non-sensitive set of weight matrices comprises the following steps:
Step S41: corrupt the non-sensitive set with error probability P and obtain the third accuracy of the non-sensitive set; P is a number greater than 0 and less than 1.
Step S42: repeat step S41 N times, obtain the average of the frequency histogram of the third accuracy, and take the difference between the original accuracy and this average as the sensitivity of the non-sensitive set.
Further, in the step S5, adjusting the sensitivity threshold to obtain the optimal non-sensitive set of weight matrices comprises the following steps:
Step S51: preset a maximum tolerable loss value;
Step S52: adjust the sensitivity threshold; if the sensitivity of the non-sensitive set is equal to the maximum tolerable loss value, take the non-sensitive set as the optimal non-sensitive set and proceed to step S6; otherwise, continue to adjust the sensitivity threshold and return to step S3 to re-partition the weight matrices into a sensitive set and a non-sensitive set.
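The following sketch strings steps S3 to S5 together: partition by a sensitivity threshold, evaluate the sensitivity of the candidate non-sensitive set, and adjust the threshold until that sensitivity reaches the preset maximum tolerable loss. The helper `set_sensitivity` stands in for the error-injection evaluation of steps S41-S42, the adjustment direction follows the fifth step of Embodiment 1, and the exact-equality test of step S52 is relaxed to a tolerance as an implementation choice:

```python
# Sketch of steps S3-S5: partition weight matrices into sensitive/non-sensitive
# sets by a sensitivity threshold, then adjust the threshold until the
# sensitivity of the non-sensitive set reaches the preset maximum tolerable loss.
def partition(sensitivities, threshold):
    """sensitivities: dict {matrix_index: sensitivity}. Matrices whose
    sensitivity exceeds the threshold form the sensitive set (step S32)."""
    sensitive = [i for i, s in sensitivities.items() if s > threshold]
    non_sensitive = [i for i, s in sensitivities.items() if s <= threshold]
    return sensitive, non_sensitive

def find_best_non_sensitive_set(sensitivities, set_sensitivity, max_loss,
                                threshold=0.0, step=1e-3, tol=1e-3,
                                max_iters=1000):
    """set_sensitivity(indices) -> accuracy loss when the whole candidate
    non-sensitive set is corrupted with error probability P (steps S41-S42)."""
    for _ in range(max_iters):
        _, non_sensitive = partition(sensitivities, threshold)     # step S3
        loss = set_sensitivity(non_sensitive)                      # step S4
        if abs(loss - max_loss) <= tol:                            # step S52
            return non_sensitive, threshold
        # Embodiment 1, fifth step: raise the threshold if the loss is still
        # below the budget, lower it if the budget is exceeded.
        threshold += step if loss < max_loss else -step
    return non_sensitive, threshold
```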
A binary neural network hardware compression method based on weight sensitivity, comprising the following steps:
Step K1: train a binary neural network to obtain the weight matrices and the original accuracy.
Step K2: initialize a particle swarm to obtain the initialization of the D-dimensional vector of each particle; D is the number of weight matrices in the neural network.
Step K3: add the constraint condition.
Step K4: compute the fitness value of each particle in the swarm, and compute the sensitivity of the particle's non-sensitive set.
Step K5: update the historical best value of each particle and the global best value.
Step K6: compute the update velocity and mutation probability of each dimension of the particle, and generate a random number T; T is a number greater than or equal to 0 and less than or equal to 1.
Step K7: compare the mutation probability of the particle with the random number T; if T is less than or equal to the mutation probability, flip the value of that dimension; if T is greater than the mutation probability, keep the dimension value unchanged, thereby updating the particle's D-dimensional vector.
Step K8: repeat steps K4 to K7 to iterate the particles, and judge whether the number of iterations equals the preset maximum number of iterations; if so, output the global best value, otherwise return to step K4; the optimal non-sensitive set is obtained from the global best value.
Step K9: store the optimal non-sensitive set in a novel memory device or in a conventional memory that uses near-threshold/sub-threshold voltage techniques.
Preferably, in the step K2, one particle of the swarm is selected and the initialization of its D-dimensional vector is obtained from the sensitivity analysis, while the D-dimensional vectors of the remaining particles are initialized randomly; the swarm consists of M particles, M being a natural number greater than 1.
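A small sketch of this initialization (one particle seeded from the non-sensitive set found by sensitivity analysis, the remaining M-1 particles random) might look as follows; the function name and parameters are illustrative:

```python
# Sketch of step K2: initialize a swarm of M binary particles of dimension D.
# One particle is seeded from the non-sensitive set found by sensitivity
# analysis (method 1); the other M-1 particles are random.
import numpy as np

def init_swarm(num_particles, num_matrices, non_sensitive_indices, seed=0):
    rng = np.random.default_rng(seed)
    swarm = rng.integers(0, 2, size=(num_particles, num_matrices))  # random 0/1
    seeded = np.zeros(num_matrices, dtype=int)
    seeded[list(non_sensitive_indices)] = 1     # 1 = matrix marked non-sensitive
    swarm[0] = seeded                           # particle seeded from method 1
    return swarm

if __name__ == "__main__":
    print(init_swarm(num_particles=5, num_matrices=8, non_sensitive_indices=[2, 5, 6]))
```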
Further, in the step K4, computing the fitness value of each particle in the swarm and the sensitivity of its non-sensitive set comprises the following steps:
Step K41: mark the weight matrices whose position in the D-dimensional vector is 1 as non-sensitive, and the weight matrices whose position is 0 as sensitive; the number of positions equal to 1 in the D-dimensional vector is the fitness value of the particle.
Step K42: gather the matrices marked non-sensitive in the D-dimensional vector into a non-sensitive set, corrupt this set with error probability P, and obtain the fourth accuracy of the binary neural network; repeat this N times to obtain a frequency histogram of the fourth accuracy, and take the average of the frequency histogram as the fifth accuracy, i.e. the accuracy the binary neural network reaches when the non-sensitive set is corrupted; P is a number greater than 0 and less than 1, and N is a natural number greater than 100; the difference between the original accuracy and the fifth accuracy is the sensitivity of the non-sensitive set.
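The sketch below mirrors steps K41 and K42: the fitness of a particle is the number of 1 entries in its D-dimensional vector, and the sensitivity of its non-sensitive set is the drop from the original accuracy when all matrices marked 1 are corrupted together with error probability P; `evaluate_accuracy` is again a placeholder for the network's test-set evaluation:

```python
# Sketch of steps K41-K42: fitness = number of matrices marked non-sensitive;
# sensitivity of the particle's non-sensitive set = original accuracy minus the
# mean accuracy over N error-injection trials on all matrices marked 1.
import numpy as np

def inject_errors(w, p, rng):
    return np.where(rng.random(w.shape) < p, -w, w)

def particle_fitness(particle):
    return int(np.sum(particle))                     # step K41

def particle_set_sensitivity(particle, weights, evaluate_accuracy,
                             original_accuracy, p=0.01, n_trials=200, seed=0):
    rng = np.random.default_rng(seed)
    non_sensitive = [i for i, bit in enumerate(particle) if bit == 1]
    accuracies = []
    for _ in range(n_trials):                        # repeat N times
        corrupted = list(weights)
        for i in non_sensitive:                      # corrupt the whole set
            corrupted[i] = inject_errors(weights[i], p, rng)
        accuracies.append(evaluate_accuracy(corrupted))      # 4th accuracy
    return original_accuracy - float(np.mean(accuracies))    # step K42
```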
Further, in the step K5, if the sensitivity of the non-sensitive set obtained in step K42 is less than or equal to the preset maximum tolerable loss value, judge whether the fitness value of the particle is greater than that particle's historical best value; if the fitness value is greater than the historical best value, take the fitness value as the new historical best value, otherwise keep the historical best value unchanged; then compare the historical best values of the M particles and take the largest of them as the global best value.
Further, in the step K6, updating the velocity and mutation probability of the particle comprises the following steps:
Step K61: compute the update velocity v_id of each dimension of the particle, with the expression:
v_id = w·v_id + c1·rand()·(p_id − x_id) + c2·rand()·(p_gd − x_id)  ①
where w denotes the inertia weight, c1 and c2 denote acceleration constants, rand() denotes a generated random number, p_id denotes the historical best value of the particle, x_id denotes the current value of the particle, and p_gd denotes the best value among the M particles.
Step K62: map the update velocity v_id to the mutation probability of the dimension value according to mapping expression ②, where v_id denotes the update velocity of each dimension.
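Mapping expression ② is not reproduced in this text, so the sketch below assumes the sigmoid mapping that is standard in binary particle swarm optimization; the inertia weight and acceleration constants are likewise placeholder values rather than parameters taken from this description:

```python
# Sketch of steps K5-K7: velocity update per expression (1), a sigmoid mapping
# of velocity to a mutation probability (assumed in place of expression (2)),
# and a probabilistic bit flip of each dimension against a random number T.
import numpy as np

def update_particle(x, v, p_best, g_best, rng, w=0.7, c1=1.5, c2=1.5):
    """x: current 0/1 vector; v: velocity vector; p_best: particle's historical
    best vector; g_best: global best vector. Returns updated (x, v)."""
    # Expression (1): v_id = w*v_id + c1*rand()*(p_id - x_id) + c2*rand()*(p_gd - x_id)
    v = (w * v
         + c1 * rng.random(x.shape) * (p_best - x)
         + c2 * rng.random(x.shape) * (g_best - x))
    mutation_prob = 1.0 / (1.0 + np.exp(-v))     # assumed sigmoid mapping for (2)
    t = rng.random(x.shape)                      # random number T per dimension
    x = np.where(t <= mutation_prob, 1 - x, x)   # step K7: flip 1 <-> 0
    return x, v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.integers(0, 2, 8)
    v = np.zeros(8)
    x, v = update_particle(x, v, p_best=x.copy(), g_best=np.ones(8, dtype=int), rng=rng)
    print(x, v)
```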
Compared with the prior art, the present invention has the following advantages:
(1) The present invention can obtain the weight matrices of the neural network that are non-sensitive to accuracy, specifically as follows: in method 1, a sensitivity threshold is set and adjusted to partition the weight matrices; in method 2, the result of the sensitivity analysis is used to initialize the D-dimensional vector of one of the particles while the other particles are initialized randomly, and the D-dimensional vectors of the M particles are updated through continuous iteration to obtain a partition of the weight matrices. The benefit of this design is that the optimal non-sensitive set of weight matrices can be obtained; optimal here means that even if errors occur in the optimal non-sensitive set, the accuracy of the network will not suffer heavy losses.
(2) In method 1, the present invention repeatedly adjusts the sensitivity threshold by comparing against the user-set maximum tolerable loss, which guarantees that the optimal non-sensitive set of weight matrices is obtained under the user-set maximum loss condition. In method 2, the user-set maximum tolerable loss is added as a constraint condition, which guarantees the recognition accuracy of the binary neural network.
(3) The present invention is versatile with respect to novel memory devices and near-threshold/sub-threshold voltage techniques under different processes and technologies. Because the reliability of novel memory devices and of near-threshold/sub-threshold voltage differs between processes and technologies, the unreliability of the novel memory device or of the near-threshold/sub-threshold voltage is characterized by the error probability P, and P can be determined according to the actual process and technology.
(4) The present invention cleverly exploits the neural network weights to reduce hardware resource overhead. The weights of the optimal non-sensitive set are stored on a novel memory device, which has a simpler structure and smaller resource overhead than a conventional memory device, so the use of conventional memory devices is reduced and hardware resources are saved. Alternatively, the present invention stores the weights of the optimal non-sensitive set in a conventional memory that uses near-threshold/sub-threshold voltage techniques; the near-threshold/sub-threshold supply voltage is lower than the nominal turn-on voltage of the transistors, thereby reducing the power consumption of the circuit.
Brief description of the drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required in the embodiments are briefly described below. It should be understood that the following drawings illustrate only certain embodiments of the present invention and are therefore not to be construed as limiting the scope of protection; those of ordinary skill in the art can obtain other relevant drawings from these drawings without creative effort.
Fig. 1 is a flow chart (one) of the present invention.
Fig. 2 is a flow chart of the evaluation of the sensitivity of a weight matrix according to the invention.
Fig. 3 is a flow chart of the partition of the weight matrices according to the invention.
Fig. 4 is a flow chart of the evaluation of the sensitivity of the non-sensitive set according to the invention.
Fig. 5 is a flow chart of the search for the optimal non-sensitive set according to the invention.
Fig. 6 is a flow chart (two) of the present invention.
Fig. 7 is a flow chart (three) of the present invention.
Detailed description of the embodiments
To make the purpose, technical solutions and advantages of the present application clearer, the present invention is further described below with reference to the drawings and embodiments. The embodiments of the present invention include, but are not limited to, the following examples. Based on the embodiments of the present application, all other embodiments obtained by those of ordinary skill in the art without creative work belong to the scope of protection of the present application.
Embodiment 1
As shown in Fig. 1 to Fig. 5, this embodiment provides a binary neural network hardware compression method based on weight sensitivity. It should be noted that ordinal terms such as "first", "second" and "third" in this embodiment are only used to distinguish items of the same kind. The method comprises the following steps:
First step: train a binary neural network to obtain the weight matrices and the original accuracy.
Second step: evaluate the sensitivity of each weight matrix, specifically as follows:
(21) Preset an error probability P to characterize the unreliability of the novel memory device and of the near-threshold/sub-threshold voltage; P is a number greater than 0 and less than 1. That is, each weight in the weight matrix is flipped (1 → -1, -1 → 1) with probability P.
(22) Inject errors into the weights of the weight matrix under test with error probability P and obtain the first accuracy of the binary neural network. While errors are injected into this weight matrix, the other weight matrices of the binary neural network remain unchanged, and the recognition accuracy of the network is tested on the data set.
(23) Repeat step (22) at least 100 times and obtain the frequency histogram of the first accuracy. Since whether each weight of the weight matrix is corrupted is a random event, the error pattern differs from trial to trial; the experiment is therefore repeated at least 100 times and the frequency histogram obtained from the results serves as the probability distribution.
(24) Compute the average of the frequency histogram of the first accuracy and take it as the second accuracy, i.e. the accuracy the binary neural network reaches when the weight matrix is corrupted.
(25) Obtain the sensitivity of the weight matrix, the sensitivity being the difference between the original accuracy and the second accuracy of step (24). Each weight matrix of the neural network obtains its own sensitivity according to the above procedure.
Third step: preset a sensitivity threshold and partition the weight matrices into a sensitive set and a non-sensitive set, specifically:
(31) Sort the sensitivities of the weight matrices in descending order and preset a sensitivity threshold.
(32) Assign the weight matrices whose sensitivity is greater than the sensitivity threshold to the sensitive set, and the weight matrices whose sensitivity is less than or equal to the sensitivity threshold to the non-sensitive set.
Fourth step: evaluate the sensitivity of the non-sensitive set of weight matrices, comprising the following steps:
(41) Corrupt the non-sensitive set with error probability P and obtain the third accuracy of the non-sensitive set.
(42) Repeat step (41) at least 100 times, obtain the average of the frequency histogram of the third accuracy, and take the difference between the original accuracy and this average as the sensitivity of the non-sensitive set.
Fifth step: adjust the sensitivity threshold and obtain the optimal non-sensitive set of weight matrices, where the sensitivity of the optimal non-sensitive set is equal to the preset maximum tolerable loss value. Specifically:
(51) Preset the maximum tolerable loss value;
(52) Adjust the sensitivity threshold; if the sensitivity of the non-sensitive set is equal to the maximum tolerable loss value, take the non-sensitive set as the optimal non-sensitive set and proceed to the sixth step; otherwise, continue to adjust the sensitivity threshold and return to the third step to re-partition the weight matrices into a sensitive set and a non-sensitive set. Here, the sensitivity threshold is adjusted as follows: if the sensitivity of the non-sensitive set is less than the maximum tolerable loss value, increase the sensitivity threshold; if the sensitivity of the non-sensitive set is greater than the maximum tolerable loss value, decrease the sensitivity threshold, until the sensitivity of the non-sensitive set equals the maximum tolerable loss value.
Sixth step: store the optimal non-sensitive set in a novel memory device or in a conventional memory that uses near-threshold/sub-threshold voltage techniques.
Embodiment 2
As shown in Fig. 6, this embodiment provides a binary neural network hardware compression method based on weight sensitivity. The method combines sensitivity analysis with a binary particle swarm algorithm to search for the combination of weight matrices in the binary neural network that is least sensitive to the recognition accuracy. In the binary particle swarm algorithm, a swarm of M particles searches for the optimal value in a D-dimensional target space: the position of each particle is updated according to the velocity update formula, the quality of each solution is evaluated by a fitness function, and the optimal value is found by iterative updating. It should be noted that ordinal terms such as "fourth" and "fifth" in this embodiment are only used to distinguish items of the same kind. The binary neural network hardware compression method comprises the following steps:
First step: train a binary neural network to obtain the weight matrices and the original accuracy.
Second step: initialize the particle swarm to obtain the initialization of the D-dimensional vector of each particle. Here D is the number of weight matrices in the neural network, i.e. each dimension corresponds to one weight matrix; a "1" means the corresponding weight matrix is non-sensitive, and a "0" means it is sensitive. The concrete operation is as follows: suppose the swarm contains M particles, each represented by a D-dimensional vector whose entries are binary (1 or 0); one particle of the swarm is selected and the initialization of its D-dimensional vector is obtained from the sensitivity analysis, while the D-dimensional vectors of the other (M-1) particles are initialized randomly.
Third step: add the constraint condition, taking the preset maximum tolerable loss value as a constraint of the algorithm, i.e. the search maximizes the number of non-sensitive weight matrices under the condition that the accuracy-loss budget is satisfied.
Fourth step: compute the fitness value of each particle in the swarm and the sensitivity of the particle's non-sensitive set. Specifically:
(41) Mark the weight matrices whose position in the D-dimensional vector is 1 as non-sensitive, and the weight matrices whose position is 0 as sensitive. The number of positions equal to 1 in the D-dimensional vector is the fitness value of the particle.
(42) Gather the matrices marked non-sensitive in the D-dimensional vector into a non-sensitive set, corrupt this set with error probability P, and obtain the fourth accuracy of the binary neural network; repeat this at least 100 times to obtain the frequency histogram of the fourth accuracy, and take the average of the frequency histogram as the fifth accuracy, i.e. the accuracy the binary neural network reaches when the non-sensitive set is corrupted. The difference between the original accuracy and the fifth accuracy is the sensitivity of the non-sensitive set.
Fifth step: update the historical best value of each particle and the global best value. If the sensitivity of the non-sensitive set obtained in step (42) is less than or equal to the preset maximum tolerable loss value, judge whether the fitness value of the particle is greater than that particle's historical best value; if the fitness value is greater than the historical best value, take the fitness value as the new historical best value, otherwise keep the historical best value unchanged. Then compare the historical best values of the M particles and take the largest of them as the global best value.
Sixth step: compute the update velocity and mutation probability of each dimension of the particle, and generate a random number T; T is a number greater than or equal to 0 and less than or equal to 1. Specifically:
(61) Compute the update velocity v_id of each dimension of the particle, with the expression:
v_id = w·v_id + c1·rand()·(p_id − x_id) + c2·rand()·(p_gd − x_id)  ①
where w denotes the inertia weight, c1 and c2 denote acceleration constants, rand() denotes a generated random number, p_id denotes the historical best value of the particle, x_id denotes the current value of the particle, and p_gd denotes the best value among the M particles.
(62) Map the update velocity v_id to the mutation probability of the dimension value according to mapping expression ②, where v_id denotes the update velocity of each dimension.
Seventh step: compare the mutation probability of each dimension with the random number T; if T is less than or equal to the mutation probability, flip the value of that dimension, i.e. a 1 becomes 0 and a 0 becomes 1; if T is greater than the mutation probability, keep the dimension value unchanged, thereby updating the particle's D-dimensional vector.
Eighth step: repeat the fourth step to the seventh step to iterate the particles, and judge whether the number of iterations equals the preset maximum number of iterations; if so, output the global best value, otherwise return to the fourth step; the optimal non-sensitive set is obtained from the global best value.
Ninth step: store the optimal non-sensitive set in a novel memory device or in a conventional memory that uses near-threshold/sub-threshold voltage techniques.
Embodiment 3
As shown in Fig. 7, this embodiment provides a binary neural network hardware compression method based on weight sensitivity. Ordinal terms such as "first", "second" and "third" in this embodiment are only used to distinguish items of the same kind. Specifically, the method comprises the following steps:
First step: train a binary neural network to obtain the weight matrices and the original accuracy.
Second step: evaluate the sensitivity of each weight matrix, specifically as follows:
(21) Preset an error probability P to characterize the unreliability of the novel memory device and of the near-threshold/sub-threshold voltage; P is a number greater than 0 and less than 1. That is, each weight in the weight matrix is flipped (1 → -1, -1 → 1) with probability P.
(22) Inject errors into the weights of the weight matrix under test with error probability P and obtain the first accuracy of the binary neural network. While errors are injected into this weight matrix, the other weight matrices of the binary neural network remain unchanged, and the recognition accuracy of the network is tested on the data set.
(23) Repeat step (22) at least 100 times and obtain the frequency histogram of the first accuracy. Since whether each weight of the weight matrix is corrupted is a random event, the error pattern differs from trial to trial; the experiment is therefore repeated at least 100 times and the frequency histogram obtained from the results serves as the probability distribution.
(24) Compute the average of the frequency histogram of the first accuracy and take it as the second accuracy, i.e. the accuracy the binary neural network reaches when the weight matrix is corrupted.
(25) Obtain the sensitivity of the weight matrix, the sensitivity being the difference between the original accuracy and the second accuracy of step (24). Each weight matrix of the neural network obtains its own sensitivity according to the above procedure.
Third step: preset a sensitivity threshold and partition the weight matrices into a sensitive set and a non-sensitive set, specifically:
(31) Sort the sensitivities of the weight matrices in descending order and preset a sensitivity threshold.
(32) Assign the weight matrices whose sensitivity is greater than the sensitivity threshold to the sensitive set, and the weight matrices whose sensitivity is less than or equal to the sensitivity threshold to the non-sensitive set.
Fourth step: evaluate the sensitivity of the non-sensitive set of weight matrices, with the following steps:
(41) Corrupt the non-sensitive set with error probability P and obtain the third accuracy of the non-sensitive set.
(42) Repeat step (41) at least 100 times, obtain the average of the frequency histogram of the third accuracy, and take the difference between the original accuracy and this average as the sensitivity of the non-sensitive set.
Fifth step: adjust the sensitivity threshold and obtain the suboptimal non-sensitive set of weight matrices. The sensitivity of the suboptimal non-sensitive set is equal to the preset maximum tolerable loss value. The specific steps are as follows:
(51) Preset the maximum tolerable loss value.
(52) Adjust the sensitivity threshold; if the sensitivity of the non-sensitive set is equal to the maximum tolerable loss value, take the non-sensitive set as the suboptimal non-sensitive set and proceed to the sixth step; otherwise, continue to adjust the sensitivity threshold and return to the third step to re-partition the weight matrices into a sensitive set and a non-sensitive set. The sensitivity threshold is adjusted as follows: if the sensitivity of the non-sensitive set is less than the maximum tolerable loss value, increase the sensitivity threshold; if the sensitivity of the non-sensitive set is greater than the maximum tolerable loss value, decrease the sensitivity threshold, until the sensitivity of the non-sensitive set equals the maximum tolerable loss value.
Sixth step: initialize the particle swarm to obtain the initialization of the D-dimensional vector of each particle, where D is the number of weight matrices in the neural network, i.e. each dimension corresponds to one weight matrix; a "1" means the corresponding weight matrix is non-sensitive, and a "0" means it is sensitive. The concrete operation is as follows: suppose the swarm contains M particles; select any one particle and initialize its D-dimensional vector with the suboptimal non-sensitive set, and initialize the D-dimensional vectors of the other (M-1) particles randomly.
Seventh step: add the constraint condition, taking the preset maximum tolerable loss value as a constraint of the algorithm, i.e. the search maximizes the number of non-sensitive weight matrices under the condition that the accuracy-loss budget is satisfied.
Eighth step: compute the fitness value of each particle in the swarm and the sensitivity of the particle's non-sensitive set, with the following steps:
(81) Mark the weight matrices whose position in the D-dimensional vector is 1 as non-sensitive, and the weight matrices whose position is 0 as sensitive. The number of positions equal to 1 in the D-dimensional vector is the fitness value of the particle.
(82) Gather the matrices marked non-sensitive in the D-dimensional vector into a non-sensitive set, corrupt this set with error probability P, and obtain the fourth accuracy of the binary neural network; repeat this at least 100 times to obtain the frequency histogram of the fourth accuracy, and take the average of the frequency histogram as the fifth accuracy, i.e. the accuracy the binary neural network reaches when the non-sensitive set is corrupted. The difference between the original accuracy and the fifth accuracy is the sensitivity of the non-sensitive set.
Ninth step: update the historical best value of each particle and the global best value. If the sensitivity of the non-sensitive set obtained in step (82) is less than or equal to the preset maximum tolerable loss value, judge whether the fitness value of the particle is greater than that particle's historical best value; if the fitness value is greater than the historical best value, take the fitness value as the new historical best value, otherwise keep the historical best value unchanged; then compare the historical best values of the M particles and take the largest of them as the global best value.
Tenth step: compute the update velocity and mutation probability of each dimension of the particle, and generate a random number T, T being a number greater than or equal to 0 and less than or equal to 1. Specifically:
(101) Compute the update velocity v_id of each dimension of the particle, with the expression:
v_id = w·v_id + c1·rand()·(p_id − x_id) + c2·rand()·(p_gd − x_id)  ①
where w denotes the inertia weight, c1 and c2 denote acceleration constants, rand() denotes a generated random number, p_id denotes the historical best value of the particle, x_id denotes the current value of the particle, and p_gd denotes the best value among the M particles.
(102) Map the update velocity v_id to the mutation probability of the dimension value according to mapping expression ②, where v_id denotes the update velocity of each dimension.
Eleventh step: compare the mutation probability with the random number T; if T is less than or equal to the mutation probability, flip the value of the dimension, i.e. a 1 becomes 0 and a 0 becomes 1; if T is greater than the mutation probability, keep the dimension value unchanged, thereby updating the particle's D-dimensional vector.
Twelfth step: repeat the eighth step to the eleventh step to iterate the particles, and judge whether the number of iterations equals the preset maximum number of iterations; if so, output the global best value, otherwise return to the eighth step and continue iterating the particles; the optimal non-sensitive set is obtained from the global best value.
Thirteenth step: store the optimal non-sensitive set in a novel memory device or in a conventional memory that uses near-threshold/sub-threshold voltage techniques.
The above embodiments are merely preferred embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any non-creative variation made on the basis of the design principle of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A binary neural network hardware compression method based on weight sensitivity, characterized by comprising the following steps:
Step S1: train a binary neural network to obtain the weight matrices and the original accuracy;
Step S2: evaluate the sensitivity of each weight matrix;
Step S3: preset a sensitivity threshold and partition the weight matrices into a sensitive set and a non-sensitive set;
Step S4: evaluate the sensitivity of the non-sensitive set of weight matrices;
Step S5: adjust the sensitivity threshold to obtain the optimal non-sensitive set of weight matrices, the sensitivity of the optimal non-sensitive set being equal to a preset maximum tolerable loss value;
Step S6: store the optimal non-sensitive set in a novel memory device or in a conventional memory that uses near-threshold/sub-threshold voltage techniques.
2. The binary neural network hardware compression method based on weight sensitivity according to claim 1, characterized in that, in the step S2, evaluating the sensitivity of each weight matrix comprises the following steps:
Step S21: preset an error probability P to characterize the unreliability of the novel memory device and of the near-threshold/sub-threshold voltage, P being a number greater than 0 and less than 1;
Step S22: inject errors into the weights of the weight matrix under test, each binary weight being flipped with error probability P, and obtain the first accuracy of the binary neural network;
Step S23: repeat step S22 N times to obtain a frequency histogram of the first accuracy, N being a natural number greater than 100;
Step S24: compute the average of the frequency histogram of the first accuracy and take it as the second accuracy, i.e. the accuracy the binary neural network reaches when the weight matrix is corrupted;
Step S25: obtain the sensitivity of the weight matrix, the sensitivity being the difference between the original accuracy and the second accuracy of step S24.
3. The binary neural network hardware compression method based on weight sensitivity according to claim 1, characterized in that, in the step S3, presetting a sensitivity threshold and partitioning the weight matrices into a sensitive set and a non-sensitive set comprises the following steps:
Step S31: sort the sensitivities of the weight matrices in descending order and preset a sensitivity threshold;
Step S32: assign the weight matrices whose sensitivity is greater than the sensitivity threshold to the sensitive set, and the weight matrices whose sensitivity is less than or equal to the sensitivity threshold to the non-sensitive set.
4. The binary neural network hardware compression method based on weight sensitivity according to claim 1, characterized in that, in the step S4, evaluating the sensitivity of the non-sensitive set of weight matrices comprises the following steps:
Step S41: corrupt the non-sensitive set with error probability P and obtain the third accuracy of the non-sensitive set, P being a number greater than 0 and less than 1;
Step S42: repeat step S41 N times, obtain the average of the frequency histogram of the third accuracy, and take the difference between the original accuracy and this average as the sensitivity of the non-sensitive set.
5. The binary neural network hardware compression method based on weight sensitivity according to claim 1, characterized in that, in the step S5, adjusting the sensitivity threshold to obtain the optimal non-sensitive set of weight matrices comprises the following steps:
Step S51: preset a maximum tolerable loss value;
Step S52: adjust the sensitivity threshold; if the sensitivity of the non-sensitive set is equal to the maximum tolerable loss value, take the non-sensitive set as the optimal non-sensitive set and proceed to step S6; otherwise, continue to adjust the sensitivity threshold and return to step S3 to re-partition the weight matrices into a sensitive set and a non-sensitive set.
6. A binary neural network hardware compression method based on weight sensitivity, characterized by comprising the following steps:
Step K1: train a binary neural network to obtain the weight matrices and the original accuracy;
Step K2: initialize a particle swarm to obtain the initialization of the D-dimensional vector of each particle, D being the number of weight matrices in the neural network;
Step K3: add the constraint condition;
Step K4: compute the fitness value of each particle in the swarm and the sensitivity of the particle's non-sensitive set;
Step K5: update the historical best value of each particle and the global best value;
Step K6: compute the update velocity and mutation probability of each dimension of the particle, and generate a random number T, T being a number greater than or equal to 0 and less than or equal to 1;
Step K7: compare the mutation probability of the particle with the random number T; if T is less than or equal to the mutation probability, flip the value of that dimension; if T is greater than the mutation probability, keep the dimension value unchanged, thereby updating the particle's D-dimensional vector;
Step K8: repeat steps K4 to K7 to iterate the particles, and judge whether the number of iterations equals the preset maximum number of iterations; if so, output the global best value, otherwise return to step K4; the optimal non-sensitive set is obtained from the global best value;
Step K9: store the optimal non-sensitive set in a novel memory device or in a conventional memory that uses near-threshold/sub-threshold voltage techniques.
7. The binary neural network hardware compression method based on weight sensitivity according to claim 6, characterized in that, in the step K2, one particle of the swarm is selected and the initialization of its D-dimensional vector is obtained from the sensitivity analysis, while the D-dimensional vectors of the remaining particles are initialized randomly; the swarm consists of M particles, M being a natural number greater than 1.
8. The binary neural network hardware compression method based on weight sensitivity according to claim 6, characterized in that, in the step K4, computing the fitness value of each particle in the swarm and the sensitivity of its non-sensitive set comprises the following steps:
Step K41: mark the weight matrices whose position in the D-dimensional vector is 1 as non-sensitive, and the weight matrices whose position is 0 as sensitive; the number of positions equal to 1 in the D-dimensional vector is the fitness value of the particle;
Step K42: gather the matrices marked non-sensitive in the D-dimensional vector into a non-sensitive set, corrupt this set with error probability P, and obtain the fourth accuracy of the binary neural network; repeat this N times to obtain a frequency histogram of the fourth accuracy, and take the average of the frequency histogram as the fifth accuracy, i.e. the accuracy the binary neural network reaches when the non-sensitive set is corrupted; P is a number greater than 0 and less than 1, and N is a natural number greater than 100; the difference between the original accuracy and the fifth accuracy is the sensitivity of the non-sensitive set.
9. The binary neural network hardware compression method based on weight sensitivity according to claim 8, characterized in that, in the step K5, if the sensitivity of the non-sensitive set obtained in step K42 is less than or equal to the preset maximum tolerable loss value, it is judged whether the fitness value of the particle is greater than that particle's historical best value; if the fitness value is greater than the historical best value, the fitness value is taken as the new historical best value, otherwise the historical best value is kept unchanged; the historical best values of the M particles are then compared and the largest of them is taken as the global best value.
10. The binary neural network hardware compression method based on weight sensitivity according to claim 6, characterized in that, in the step K6, updating the velocity and mutation probability of the particle comprises the following steps:
Step K61: compute the update velocity v_id of each dimension of the particle, with the expression:
v_id = w·v_id + c1·rand()·(p_id − x_id) + c2·rand()·(p_gd − x_id)  ①
where w denotes the inertia weight, c1 and c2 denote acceleration constants, rand() denotes a generated random number, p_id denotes the historical best value of the particle, x_id denotes the current value of the particle, and p_gd denotes the best value among the M particles;
Step K62: map the update velocity v_id to the mutation probability of the dimension value according to mapping expression ②, where v_id denotes the update velocity of each dimension.
CN201811000016.7A 2018-08-30 2018-08-30 Weight sensitivity-based binary neural network hardware compression method Active CN109212960B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811000016.7A CN109212960B (en) 2018-08-30 2018-08-30 Weight sensitivity-based binary neural network hardware compression method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811000016.7A CN109212960B (en) 2018-08-30 2018-08-30 Weight sensitivity-based binary neural network hardware compression method

Publications (2)

Publication Number Publication Date
CN109212960A true CN109212960A (en) 2019-01-15
CN109212960B CN109212960B (en) 2020-08-14

Family

ID=64986164

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811000016.7A Active CN109212960B (en) 2018-08-30 2018-08-30 Weight sensitivity-based binary neural network hardware compression method

Country Status (1)

Country Link
CN (1) CN109212960B (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7372713B2 (en) * 2006-04-17 2008-05-13 Texas Instruments Incorporated Match sensing circuit for a content addressable memory device
CN107729999A (en) * 2016-08-12 2018-02-23 北京深鉴科技有限公司 Deep neural network compression method considering matrix correlation
CN107967515A (en) * 2016-10-19 2018-04-27 三星电子株式会社 Method and device for neural network quantization
CN108322221A (en) * 2017-01-18 2018-07-24 华南理工大学 A method for deep convolutional neural network model compression
CN108334945A (en) * 2018-01-30 2018-07-27 中国科学院自动化研究所 Deep neural network acceleration and compression method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YIXING LI et al.: "Build a compact binary neural network through bit-level sensitivity and data pruning", 《NEUROCOMPUTING》 *
曹文龙 et al.: "神经网络模型压缩方法综述" [A survey of neural network model compression methods], 《计算机应用研究》 *
雷杰 et al.: "深度网络模型压缩综述" [A survey of deep network model compression], 《软件学报》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978160A * 2019-03-25 2019-07-05 北京中科寒武纪科技有限公司 Configuration device and method for an artificial intelligence processor and related products
CN109978160B (en) * 2019-03-25 2021-03-02 中科寒武纪科技股份有限公司 Configuration device and method of artificial intelligence processor and related products

Also Published As

Publication number Publication date
CN109212960B (en) 2020-08-14


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant