CN108985453A - Deep neural network model compression method based on asymmetric ternary weight quantization - Google Patents

Deep neural network model compression method based on asymmetric ternary weight quantization

Info

Publication number
CN108985453A
Authority
CN
China
Prior art keywords
deep neural
neural network
weight
quantization
ternary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810674698.3A
Other languages
Chinese (zh)
Inventor
吴俊敏
丁杰
吴焕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Institute for Advanced Study USTC
Original Assignee
Suzhou Institute for Advanced Study USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Institute for Advanced Study USTC filed Critical Suzhou Institute for Advanced Study USTC
Priority to CN201810674698.3A priority Critical patent/CN108985453A/en
Publication of CN108985453A publication Critical patent/CN108985453A/en
Pending legal-status Critical Current


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • H - ELECTRICITY
    • H03 - ELECTRONIC CIRCUITRY
    • H03M - CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 - Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 - Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a deep neural network model compression method based on asymmetric ternary weight quantization, comprising: during deep neural network training, before each forward pass, quantizing the floating-point weights of each network layer to asymmetric ternary values, while the parameter-update stage uses the original floating-point network weights; and storing the trained deep neural network in compressed form. The method removes redundant parameters of the deep neural network, compresses the network model, and effectively improves the recognition accuracy of the quantization method on large datasets.

Description

Deep neural network model compression method based on asymmetric ternary weight quantization
Technical field
The present invention relates to the field of convolutional neural network compression, and more particularly to a deep neural network model compression method based on asymmetric ternary weight quantization.
Background technique
In recent years, with the rapid development of deep learning algorithms, deep neural networks have achieved state-of-the-art results in a series of machine learning tasks such as speech recognition, image classification, and natural language processing. However, a typical deep neural network usually carries millions of parameters, which makes it difficult to deploy on embedded devices with only limited storage and computing resources; how to compress deep neural network models has therefore become an important research direction in current deep learning.
At present, typical model compression methods fall into two categories. The first optimizes the network structure to reduce the number of parameters; the ICLR 2016 best paper Deep Compression describes such methods in detail, and compression ratios of several tens of times can be achieved, but these methods are difficult to implement and their steps are relatively complex. The second reduces network storage by lowering numerical precision, as in the currently common binarized networks (BinaryConnect) and symmetric ternary networks (Ternary Weight Networks). These methods achieve accuracy no lower than that of floating-point networks on smaller datasets, but their accuracy loss on larger datasets such as ImageNet is considerable.
The most recent ternary weight quantization method (Ternary Weight Networks) quantizes the network weights into the ternary values {-α, 0, +α} using the following quantization rule:
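For reference, the symmetric rule published as Ternary Weight Networks uses a single per-layer threshold Δ_l and scaling factor α_l; written here as a reconstruction of the published method, it is:

$$
W_{li}^{t}=\begin{cases}+\alpha_l, & W_{li}>\Delta_l\\ 0, & |W_{li}|\le\Delta_l\\ -\alpha_l, & W_{li}<-\Delta_l\end{cases},
\qquad
\Delta_l\approx\frac{0.7}{n}\sum_{i=1}^{n}|W_{li}|,
\qquad
\alpha_l=\frac{1}{|I_{\Delta_l}|}\sum_{i\in I_{\Delta_l}}|W_{li}|,
$$

where $I_{\Delta_l}=\{i\mid |W_{li}|>\Delta_l\}$.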
In choosing this quantization rule it is assumed that, after training, the positive and negative weights of the network follow the same distribution, which significantly limits the expressive power of the ternary weight network.
Summary of the invention
In view of the above technical problems, the object of the present invention is to provide a deep neural network model compression method based on asymmetric ternary weight quantization that removes redundant parameters of the deep neural network, compresses the network model, and effectively improves the recognition accuracy of the quantization method on large datasets.
The technical scheme of the invention is as follows:
A deep neural network model compression method based on asymmetric ternary weight quantization, comprising the following steps:
S01: during deep neural network training, before each forward pass, quantizing the floating-point weights of each network layer to asymmetric ternary values, the parameter-update stage using the original floating-point network weights;
S02: storing the trained deep neural network in compressed form.
In a preferred technical solution, the ternary values W_l^t are obtained by asymmetric thresholding of the floating-point weights, where l denotes the corresponding network layer, Δ_p^l and Δ_n^l are the thresholds used in the quantization process, and α_p^l and α_n^l are the corresponding scaling factors.
In a preferred technical solution, the loss introduced by the quantization process is reduced by L2-norm minimization: for any given thresholds Δ_p^l and Δ_n^l, the scaling factors α_p^l and α_n^l are computed in closed form from the weights that exceed the corresponding threshold, where |I_Δp^l| and |I_Δn^l| denote the number of elements in the index sets I_Δp^l and I_Δn^l; the threshold factors Δ_p^l and Δ_n^l are then obtained from the same minimization.
In a preferred technical solution, approximations of the threshold factors are obtained by an approximate calculation over the index sets I_p = {i | W_li ≥ 0, i = 1, 2, ..., n} and I_n = {i | W_li < 0, i = 1, 2, ..., n}.
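A plausible reconstruction of these quantities, consistent with the description above and with the cited Asymmetric Ternary Networks paper (the exact symbols and the 0.7 constant are assumptions, not quoted from the patent), is:

$$
W_{li}^{t}=\begin{cases}+\alpha_p^l, & W_{li}>\Delta_p^l\\ 0, & -\Delta_n^l\le W_{li}\le\Delta_p^l\\ -\alpha_n^l, & W_{li}<-\Delta_n^l\end{cases}
$$

$$
\alpha_p^l=\frac{1}{|I_{\Delta_p}^l|}\sum_{i\in I_{\Delta_p}^l} W_{li},
\qquad
\alpha_n^l=\frac{1}{|I_{\Delta_n}^l|}\sum_{i\in I_{\Delta_n}^l} |W_{li}|,
$$

with $I_{\Delta_p}^l=\{i\mid W_{li}>\Delta_p^l\}$ and $I_{\Delta_n}^l=\{i\mid W_{li}<-\Delta_n^l\}$, and the approximate thresholds

$$
\Delta_p^l\approx\frac{0.7}{|I_p|}\sum_{i\in I_p}|W_{li}|,
\qquad
\Delta_n^l\approx\frac{0.7}{|I_n|}\sum_{i\in I_n}|W_{li}|.
$$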
In a preferred technical solution, compression storage uses 2-bit encoding; during compression, 16 ternary values are packed into one 32-bit fixed-point integer by shift operations.
The invention also discloses a deep neural network model compression apparatus based on asymmetric ternary weight quantization, comprising:
an asymmetric ternary weight network training module which, during deep neural network training and before each forward pass, quantizes the floating-point weights of each network layer to asymmetric ternary values, the parameter-update stage using the original floating-point network weights;
an asymmetric ternary weight storage module which stores the trained deep neural network in compressed form.
In a preferred technical solution, the ternary values W_l^t are obtained by asymmetric thresholding of the floating-point weights, where l denotes the corresponding network layer, Δ_p^l and Δ_n^l are the thresholds used in the quantization process, and α_p^l and α_n^l are the corresponding scaling factors.
In a preferred technical solution, the loss introduced by the quantization process is reduced by L2-norm minimization: for any given thresholds Δ_p^l and Δ_n^l, the scaling factors α_p^l and α_n^l are computed in closed form from the weights that exceed the corresponding threshold, where |I_Δp^l| and |I_Δn^l| denote the number of elements in the index sets I_Δp^l and I_Δn^l; the threshold factors Δ_p^l and Δ_n^l are then obtained from the same minimization.
In a preferred technical solution, approximations of the threshold factors are obtained by an approximate calculation over the index sets I_p = {i | W_li ≥ 0, i = 1, 2, ..., n} and I_n = {i | W_li < 0, i = 1, 2, ..., n}.
In a preferred technical solution, compression storage uses 2-bit encoding; during compression, 16 ternary values are packed into one 32-bit fixed-point integer by shift operations.
Compared with the prior art, the invention has the following advantages:
1. Different constraints are applied to the positive and negative weights to improve the expressive power of the ternary network, and the relationship between the positive and negative thresholds and the corresponding scaling factors is obtained through an L2 constraint, reducing the loss introduced during quantization and effectively improving the recognition accuracy of the quantization method on large datasets.
2. Redundant parameters of the deep neural network are removed and the network model is compressed, reducing the storage required by the deep neural network model so that it can be ported to and executed on embedded devices more easily.
Detailed description of the invention
The invention will be further described below with reference to the accompanying drawings and embodiments:
Fig. 1 is the training flow chart of the quantized network of the present invention;
Fig. 2 is a schematic diagram of the weight encoding method of the present invention;
Fig. 3 shows the accuracy of the quantized VGG16 network on the CIFAR-10 dataset;
Fig. 4 shows the accuracy of the quantized AlexNet network on ImageNet.
Specific embodiment
The above scheme is further described below in conjunction with specific embodiments. It should be understood that these embodiments are intended to illustrate the invention and not to limit its scope. The implementation conditions used in the examples may be further adjusted according to the conditions of a specific manufacturer; implementation conditions that are not specified are usually those of routine experiments.
Embodiment:
Deep neural networks generally contain millions of parameters, which makes them difficult to apply on devices with only limited resources, yet most of a network's parameters are usually redundant. The main object of the present invention is therefore to remove the redundant parameters and implement model compression. The technique is realized in three main steps:
(1) Asymmetric ternary weight quantization process:
The asymmetric ternary weight quantization method quantizes the traditional floating-point network weights into the ternary values {-α_n^l, 0, +α_p^l} during network training, using a threshold-based rule: Δ_p^l and Δ_n^l are the thresholds used in the quantization process, and any floating-point number is assigned to one of the three ternary values according to its range; α_p^l and α_n^l are the corresponding scaling factors, used to reduce the loss introduced by the quantization process. In the threshold rule set out above there are four independent parameter factors: Δ_p^l, Δ_n^l, α_p^l, and α_n^l.
The present invention uses L2-norm minimization to reduce the loss introduced by the quantization process; the formula is as follows:
Substituting formula (1) into formula (2), the objective can be rewritten in a form in which |I_Δp^l| and |I_Δn^l| denote the number of elements in I_Δp^l and I_Δn^l, the positive and negative threshold factors are independent of each other, and the remaining term is a constant independent of the quantization parameters.
The solution of formula (3) can then be converted into formula (4).
For any given thresholds Δ_p^l and Δ_n^l, the scaling factors α_p^l and α_n^l can be calculated as in formulas (5) and (6).
Substituting these back into formula (4), the threshold factors Δ_p^l and Δ_n^l can be calculated as in formulas (7) and (8).
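Assuming the derivation mirrors the TWN one carried out separately on the positive and negative parts, formulas (7) and (8) would take the form of the following maximizations (a reconstruction, not quoted from the patent):

$$
\Delta_p^{l*}=\arg\max_{\Delta_p>0}\ \frac{1}{|I_{\Delta_p}^l|}\Bigl(\sum_{i\in I_{\Delta_p}^l} W_{li}\Bigr)^{2},
\qquad
\Delta_n^{l*}=\arg\max_{\Delta_n>0}\ \frac{1}{|I_{\Delta_n}^l|}\Bigl(\sum_{i\in I_{\Delta_n}^l} |W_{li}|\Bigr)^{2}.
$$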
The threshold factors in the above formulas are positive values. Since formulas (7) and (8) have no exact closed-form solution, it is assumed in the experiments that after training the positive and negative parts of the network weights W_l still essentially follow their basic distributions, so that approximate values of the threshold factors can be obtained by the approximate calculation of formulas (9) and (10).
Here I_p = {i | W_li ≥ 0, i = 1, 2, ..., n} and I_n = {i | W_li < 0, i = 1, 2, ..., n}. Finally, combining formulas (5), (6), (9), and (10) and substituting into formula (1), the original floating-point weights can be quantized into the corresponding ternary weight values, realizing the discretization of the network weights.
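As an illustration of this step, the following Python/NumPy sketch quantizes one layer's weights under the reconstructed formulas above (the helper name, the 0.7 constant, and the exact form of the scaling factors are assumptions consistent with a TWN-style derivation, not the patent's reference implementation):

```python
import numpy as np

def asymmetric_ternary_quantize(w):
    """Quantize one layer's floating-point weights to {-alpha_n, 0, +alpha_p}."""
    shape = w.shape
    w = w.astype(np.float32).ravel()
    pos, neg = w[w >= 0], w[w < 0]

    # Approximate threshold factors: assumed 0.7 * mean(|w|) over each sign set.
    delta_p = 0.7 * np.mean(pos) if pos.size else 0.0
    delta_n = 0.7 * np.mean(np.abs(neg)) if neg.size else 0.0

    # Index sets of weights that survive thresholding.
    i_dp, i_dn = w > delta_p, w < -delta_n

    # Scaling factors: mean magnitude of the surviving weights on each side.
    alpha_p = w[i_dp].mean() if i_dp.any() else 0.0
    alpha_n = np.abs(w[i_dn]).mean() if i_dn.any() else 0.0

    w_t = np.zeros_like(w)
    w_t[i_dp] = alpha_p
    w_t[i_dn] = -alpha_n
    return w_t.reshape(shape), (delta_p, delta_n, alpha_p, alpha_n)
```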
(2) Asymmetric ternary weight network training process
The asymmetric ternary weight quantization method constrains each layer's floating-point weights to ternary values, which greatly reduces parameter redundancy and effectively prevents overfitting. However, directly applying the quantization method to an already trained network severely impacts its accuracy, so the quantization method needs to be incorporated into the network training process to reduce the loss in accuracy. The training method is similar to that of a traditional floating-point network, and the training process is shown in Fig. 1.
Fig. 1 highlights two key points of asymmetric ternary weight network training. First, the quantization method is applied before each forward pass, and the loss of the network is computed from the quantized weights; the main purpose is to capture the influence of the quantization method on the final result. Second, the parameter-update stage uses the original floating-point network weights rather than the quantized ternary weights, so that small gradient updates are accumulated and the network always moves in the optimal direction.
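A minimal PyTorch-style sketch of this training scheme follows; the names model, loader, optimizer, and criterion, the restriction to tensors with more than one dimension, and the reuse of the asymmetric_ternary_quantize helper above are illustrative assumptions rather than the patent's reference implementation:

```python
import torch

def train_quantized(model, loader, optimizer, criterion, epochs=1):
    """Quantize before each forward pass; update the full-precision weights."""
    for _ in range(epochs):
        for inputs, targets in loader:
            # Weight tensors to quantize (biases / BN parameters stay in floating point).
            weights = [p for p in model.parameters() if p.dim() > 1]
            fp_copy = [p.detach().clone() for p in weights]

            # Quantize each layer to asymmetric ternary values before the forward pass,
            # so the loss reflects the effect of quantization on the final result.
            with torch.no_grad():
                for p in weights:
                    w_t, _ = asymmetric_ternary_quantize(p.detach().cpu().numpy())
                    p.copy_(torch.as_tensor(w_t, device=p.device))

            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()

            # Restore the original floating-point weights before the update, so that
            # small gradient steps accumulate and the network keeps moving toward the optimum.
            with torch.no_grad():
                for p, w in zip(weights, fp_copy):
                    p.copy_(w)
            optimizer.step()
```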
(3) Asymmetric ternary weight storage method
After the asymmetric ternary weight network has been trained, the weights of each network layer are quantized into ternary values, where l denotes the corresponding network layer, but the ternary weights are still represented in floating point. To compress the model storage, this technique adopts 2-bit encoding; the specific encoding is shown in Fig. 2. A 2-bit code can represent four values, of which three are used here. During compression, 16 ternary values are packed into one 32-bit fixed-point integer by shift operations, which in theory yields a model compression ratio of about 16x.
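A minimal sketch of such a packing scheme is shown below; the specific 2-bit codes (0b00 for 0, 0b01 for +α_p, 0b10 for -α_n) are an assumption, since Fig. 2 defines the actual mapping, and the two per-layer scaling factors are assumed to be stored separately as ordinary floats:

```python
import numpy as np

def pack_ternary(w_t):
    """Pack ternary weights into 32-bit words, 16 two-bit codes per word."""
    codes = np.zeros(w_t.size, dtype=np.uint32)
    codes[w_t.ravel() > 0] = 0b01
    codes[w_t.ravel() < 0] = 0b10
    pad = (-codes.size) % 16                      # pad to a multiple of 16 codes
    codes = np.concatenate([codes, np.zeros(pad, dtype=np.uint32)]).reshape(-1, 16)
    shifts = np.arange(16, dtype=np.uint32) * 2   # each code gets its own 2-bit slot
    return (codes << shifts).sum(axis=1).astype(np.uint32)

def unpack_ternary(words, n, alpha_p, alpha_n):
    """Recover n ternary weights from packed 32-bit words."""
    shifts = np.arange(16, dtype=np.uint32) * 2
    codes = ((words[:, None] >> shifts) & 0b11).ravel()[:n]
    w_t = np.zeros(n, dtype=np.float32)
    w_t[codes == 0b01] = alpha_p
    w_t[codes == 0b10] = -alpha_n
    return w_t
```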
The training curves of the asymmetric ternary weight network (Asymmetric Ternary Networks, ATNs) on the CIFAR-10 and ImageNet datasets are shown in Figs. 3 and 4. Compared with the traditional ternary weight network (Ternary Weight Networks, TWNs), the present invention effectively improves the recognition accuracy of the quantized network on the CIFAR-10 and ImageNet datasets; the specific results are shown in Tables 1 and 2:
Table 1: VGG network accuracy on the CIFAR-10 dataset
Table 2: AlexNet network accuracy on the ImageNet dataset
It can be seen that on the CIFAR-10 dataset ATNs improve recognition accuracy by 0.41% compared with TWNs, and are also 0.33% higher than the floating-point network. On the ImageNet dataset, ATNs improve accuracy by 2.25% compared with TWNs and lose only 0.63% relative to the floating-point network, effectively improving the recognition accuracy of the quantization method on large datasets.
The foregoing examples merely illustrate the technical concept and features of the invention; their purpose is to enable a person skilled in the art to understand the content of the invention and implement it accordingly, and they are not intended to limit the scope of the invention. All equivalent transformations or modifications made according to the spirit and essence of the invention shall be covered by the protection scope of the invention.

Claims (10)

1. A deep neural network model compression method based on asymmetric ternary weight quantization, characterized by comprising the following steps:
S01: during deep neural network training, before each forward pass, quantizing the floating-point weights of each network layer to asymmetric ternary values, the parameter-update stage using the original floating-point network weights;
S02: storing the trained deep neural network in compressed form.
2. The deep neural network model compression method based on asymmetric ternary weight quantization according to claim 1, characterized in that the ternary values W_l^t are obtained by asymmetric thresholding of the floating-point weights, wherein l represents the corresponding network layer, Δ_p^l and Δ_n^l are the thresholds used in the quantization process, and α_p^l and α_n^l are the corresponding scaling factors.
3. The deep neural network model compression method based on asymmetric ternary weight quantization according to claim 2, characterized in that the loss introduced by the quantization process is reduced by L2-norm minimization: for any given thresholds Δ_p^l and Δ_n^l, the scaling factors α_p^l and α_n^l are computed in closed form from the weights that exceed the corresponding threshold, wherein |I_Δp^l| and |I_Δn^l| denote the number of elements in the index sets I_Δp^l and I_Δn^l, and the threshold factors Δ_p^l and Δ_n^l are obtained from the same minimization.
4. The deep neural network model compression method based on asymmetric ternary weight quantization according to claim 3, characterized in that approximations of the threshold factors are obtained by an approximate calculation over the index sets I_p = {i | W_li ≥ 0, i = 1, 2, ..., n} and I_n = {i | W_li < 0, i = 1, 2, ..., n}.
5. The deep neural network model compression method based on asymmetric ternary weight quantization according to claim 1, characterized in that compression storage uses 2-bit encoding, and during compression 16 ternary values are packed into one 32-bit fixed-point integer by shift operations.
6. A deep neural network model compression apparatus based on asymmetric ternary weight quantization, characterized by comprising:
an asymmetric ternary weight network training module which, during deep neural network training and before each forward pass, quantizes the floating-point weights of each network layer to asymmetric ternary values, the parameter-update stage using the original floating-point network weights;
an asymmetric ternary weight storage module which stores the trained deep neural network in compressed form.
7. The deep neural network model compression apparatus based on asymmetric ternary weight quantization according to claim 6, characterized in that the ternary values W_l^t are obtained by asymmetric thresholding of the floating-point weights, wherein l represents the corresponding network layer, Δ_p^l and Δ_n^l are the thresholds used in the quantization process, and α_p^l and α_n^l are the corresponding scaling factors.
8. The deep neural network model compression apparatus based on asymmetric ternary weight quantization according to claim 7, characterized in that the loss introduced by the quantization process is reduced by L2-norm minimization: for any given thresholds Δ_p^l and Δ_n^l, the scaling factors α_p^l and α_n^l are computed in closed form from the weights that exceed the corresponding threshold, wherein |I_Δp^l| and |I_Δn^l| denote the number of elements in the index sets I_Δp^l and I_Δn^l, and the threshold factors Δ_p^l and Δ_n^l are obtained from the same minimization.
9. The deep neural network model compression apparatus based on asymmetric ternary weight quantization according to claim 8, characterized in that approximations of the threshold factors are obtained by an approximate calculation over the index sets I_p = {i | W_li ≥ 0, i = 1, 2, ..., n} and I_n = {i | W_li < 0, i = 1, 2, ..., n}.
10. The deep neural network model compression apparatus based on asymmetric ternary weight quantization according to claim 6, characterized in that compression storage uses 2-bit encoding, and during compression 16 ternary values are packed into one 32-bit fixed-point integer by shift operations.
CN201810674698.3A 2018-06-27 2018-06-27 Deep neural network model compression method based on the quantization of asymmetric ternary weight Pending CN108985453A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810674698.3A CN108985453A (en) 2018-06-27 2018-06-27 Deep neural network model compression method based on the quantization of asymmetric ternary weight

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810674698.3A CN108985453A (en) 2018-06-27 2018-06-27 Deep neural network model compression method based on the quantization of asymmetric ternary weight

Publications (1)

Publication Number Publication Date
CN108985453A true CN108985453A (en) 2018-12-11

Family

ID=64538977

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810674698.3A Pending CN108985453A (en) 2018-06-27 2018-06-27 Deep neural network model compression method based on the quantization of asymmetric ternary weight

Country Status (1)

Country Link
CN (1) CN108985453A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110942148A (en) * 2019-12-11 2020-03-31 北京工业大学 Adaptive asymmetric quantization deep neural network model compression method
CN111353517A (en) * 2018-12-24 2020-06-30 杭州海康威视数字技术股份有限公司 License plate recognition method and device and electronic equipment
CN111681263A (en) * 2020-05-25 2020-09-18 厦门大学 Multi-scale adversarial target tracking algorithm based on ternary quantization
CN112561050A (en) * 2019-09-25 2021-03-26 杭州海康威视数字技术股份有限公司 Neural network model training method and device
CN114492779A (en) * 2022-02-16 2022-05-13 安谋科技(中国)有限公司 Method for operating neural network model, readable medium and electronic device
WO2022148071A1 (en) * 2021-01-07 2022-07-14 苏州浪潮智能科技有限公司 Image feature extraction method, apparatus and device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104751228A (en) * 2013-12-31 2015-07-01 安徽科大讯飞信息科技股份有限公司 Method and system for constructing deep neural network
CN107644254A (en) * 2017-09-09 2018-01-30 复旦大学 Convolutional neural network weight parameter quantization training method and system
CN107688849A (en) * 2017-07-28 2018-02-13 北京深鉴科技有限公司 Dynamic strategy fixed-point training method and device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104751228A (en) * 2013-12-31 2015-07-01 安徽科大讯飞信息科技股份有限公司 Method and system for constructing deep neural network
CN107688849A (en) * 2017-07-28 2018-02-13 北京深鉴科技有限公司 Dynamic strategy fixed-point training method and device
CN107644254A (en) * 2017-09-09 2018-01-30 复旦大学 Convolutional neural network weight parameter quantization training method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JIE DING: "Asymmetric Ternary Networks", 2017 International Conference on Tools with Artificial Intelligence *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111353517A (en) * 2018-12-24 2020-06-30 杭州海康威视数字技术股份有限公司 License plate recognition method and device and electronic equipment
CN111353517B (en) * 2018-12-24 2023-09-26 杭州海康威视数字技术股份有限公司 License plate recognition method and device and electronic equipment
CN112561050A (en) * 2019-09-25 2021-03-26 杭州海康威视数字技术股份有限公司 Neural network model training method and device
CN112561050B (en) * 2019-09-25 2023-09-05 杭州海康威视数字技术股份有限公司 Neural network model training method and device
CN110942148A (en) * 2019-12-11 2020-03-31 北京工业大学 Adaptive asymmetric quantization deep neural network model compression method
CN111681263A (en) * 2020-05-25 2020-09-18 厦门大学 Multi-scale adversarial target tracking algorithm based on ternary quantization
CN111681263B (en) * 2020-05-25 2022-05-03 厦门大学 Multi-scale adversarial target tracking algorithm based on ternary quantization
WO2022148071A1 (en) * 2021-01-07 2022-07-14 苏州浪潮智能科技有限公司 Image feature extraction method, apparatus and device, and storage medium
CN114492779A (en) * 2022-02-16 2022-05-13 安谋科技(中国)有限公司 Method for operating neural network model, readable medium and electronic device

Similar Documents

Publication Publication Date Title
CN108985453A (en) Deep neural network model compression method based on the quantization of asymmetric ternary weight
WO2020238237A1 (en) Power exponent quantization-based neural network compression method
CN107644254A (en) Convolutional neural network weight parameter quantization training method and system
WO2021258752A1 (en) 4-bit quantization method and system for neural network
CN108764317A (en) Residual convolutional neural network image classification method based on multi-channel feature weighting
CN110276451A (en) Deep neural network compression method based on weight normalization
CN108664993B (en) Dense weight connection convolutional neural network image classification method
CN107395211A (en) Data processing method and device based on a convolutional neural network model
CN110443172A (en) Object detection method and system based on super-resolution and model compression
CN111931906A (en) Deep neural network mixed-precision quantization method based on structure search
CN113660113B (en) Self-adaptive sparse parameter model design and quantization transmission method for distributed machine learning
CN110188877A (en) Neural network compression method and device
CN109409505A (en) Gradient compression method for distributed deep learning
CN109325590A (en) Device for implementing a neural network processor with variable computational accuracy
CN110942148B (en) Adaptive asymmetric quantization deep neural network model compression method
CN110837890A (en) Weight value fixed-point quantization method for lightweight convolutional neural network
CN112488304A (en) Heuristic filter pruning method and system in convolutional neural network
CN107748913A (en) General miniaturization method for deep neural networks
CN117521763A (en) Artificial intelligence model compression method integrating regularized pruning and importance pruning
CN114756517A (en) Visual Transformer compression method and system based on micro-quantization training
CN111831358A (en) Weight precision configuration method, device, equipment and storage medium
CN110110852A (en) Method for transplanting a deep learning network to an FPAG platform
CN114707637A (en) Neural network quantitative deployment method, system and storage medium
CN110263917A (en) Neural network compression method and device
WO2020253692A1 (en) Quantification method for deep learning network parameters

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181211