CN107368885A - Network model compression method and device based on multi-granularity pruning - Google Patents

Network model compression method and device based on multi-granularity pruning

Info

Publication number
CN107368885A
CN107368885A (application CN201710568710.8A)
Authority
CN
China
Prior art keywords
pruning
granularity level
unimportant element
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710568710.8A
Other languages
Chinese (zh)
Inventor
曾建平
王军
李志国
班华忠
朱明
张智鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhi Xinyuandong Science And Technology Ltd
Original Assignee
Beijing Zhi Xinyuandong Science And Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhi Xinyuandong Science And Technology Ltd
Priority to CN201710568710.8A
Publication of CN107368885A
Legal status: Pending

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks

Abstract

The invention provides a network model compression method based on multi-granularity pruning. The method comprises one, two, or three of the following steps: an input-channel granularity-level pruning step, in which an unimportant-element pruning method is used to prune the unimportant elements at the granularity level of the input channels of the network model; a convolution-kernel granularity-level pruning step, in which the unimportant-element pruning method is used to prune the unimportant elements at the granularity level of the convolution kernels of the network model; and a weight-parameter granularity-level pruning step, in which the unimportant-element pruning method is used to prune the unimportant elements at the granularity level of the weight parameters of the network model. Compared with the prior art, the present invention effectively solves the network model compression problem by pruning at multiple granularity levels.

Description

Network model compression method and device based on multi-granularity pruning
Technical field
The present invention relates to image processing, video surveillance, and deep neural networks, and more particularly to a network model compression method and device based on multi-granularity pruning.
Background technology
In recent years, with the rapid development of artificial intelligence, deep learning networks, which build high-level features by combining low-level image features and are therefore relatively insensitive to environmental change, have achieved breakthrough results in computer vision; in tasks such as face recognition and image classification they have even surpassed human recognition accuracy.
However, existing high-performance deep learning networks typically have millions or even hundreds of millions of parameters, which makes their storage and computation costs enormous and prevents their deployment on devices with limited storage and computing resources. Compressing deep learning network models is therefore a key step toward solving this problem.
Existing model compression techniques, however, typically reduce model size by sparsifying the model weights, and they cannot significantly reduce the storage and computing resources required to run a deep learning network.
In summary, there is a need for a deep learning network model compression method that reduces both storage and computing resource consumption.
Summary of the invention
In view of this, a primary object of the present invention is to compress network models so as to reduce storage and computing resource consumption.
To achieve the above object, according to a first aspect of the invention, there is provided a network model compression method based on multi-granularity pruning, the method comprising one, two, or three of the following steps:
an input-channel granularity-level pruning step: using an unimportant-element pruning method, pruning the unimportant elements at the granularity level of the input channels of the network model;
a convolution-kernel granularity-level pruning step: using the unimportant-element pruning method, pruning the unimportant elements at the granularity level of the convolution kernels of the network model;
a weight-parameter granularity-level pruning step: using the unimportant-element pruning method, pruning the unimportant elements at the granularity level of the weight parameters of the network model.
Further, the unimportant-element pruning method comprises:
an unimportant-element zeroing step: computing the importance of each element at the current granularity level and setting the values corresponding to unimportant elements to zero;
a pruning fine-tuning step: fine-tuning the whole network model on training data;
a loss judgment step: computing the accuracy loss of the fine-tuned network model; if the accuracy loss is below the accuracy-loss threshold, returning to the unimportant-element zeroing step, otherwise terminating.
Further, the unimportant-element zeroing step comprises:
an element importance sorting step: collecting the weight parameter vector W_e of each element to be pruned at the current granularity level, computing the importance EIV_i of each element to be pruned from W_e, sorting the elements to be pruned in ascending order of importance EIV_i to obtain an ascending importance set, and computing the total importance of the elements to be pruned, EIVS = EIV_1 + EIV_2 + … + EIV_N, where N is the number of elements at the current granularity level;
a pruning threshold calculation step: given the energy pruning rate threshold EPR, computing the pruning energy EP = EIVS × EPR, computing the cumulative distribution of importance over the ascending set, and choosing as the pruning threshold the importance value at which the cumulative importance equals the pruning energy;
a loss-function influence calculation step: inputting a set of test data, computing the loss function value Loss of the network, and computing the loss-function influence EL_i of each element to be pruned;
a zeroing step: for each element i to be pruned at the current granularity level, if its importance EIV_i is below the pruning threshold and its loss-function influence EL_i is below 0, setting the values corresponding to element i to zero.
According to another aspect of the present invention, there is provided a network model compression device based on multi-granularity pruning, the device comprising one, two, or three of the following modules:
an input-channel granularity-level pruning module, configured to use an unimportant-element pruning module to prune the unimportant elements at the granularity level of the input channels of the network model;
a convolution-kernel granularity-level pruning module, configured to use the unimportant-element pruning module to prune the unimportant elements at the granularity level of the convolution kernels of the network model;
a weight-parameter granularity-level pruning module, configured to use the unimportant-element pruning module to prune the unimportant elements at the granularity level of the weight parameters of the network model.
Further, the unimportant-element pruning module comprises:
an unimportant-element zeroing module, configured to compute the importance of each element at the current granularity level and set the values corresponding to unimportant elements to zero;
a pruning fine-tuning module, configured to fine-tune the whole network model on training data;
a loss judgment module, configured to compute the accuracy loss of the fine-tuned network model; if the accuracy loss is below the accuracy-loss threshold, the unimportant-element zeroing module is executed again, otherwise processing terminates.
Further, the unimportant-element zeroing module comprises:
an element importance sorting module, configured to collect the weight parameter vector W_e of each element to be pruned at the current granularity level, compute the importance EIV_i of each element to be pruned from W_e, sort the elements to be pruned in ascending order of importance EIV_i to obtain an ascending importance set, and compute the total importance of the elements to be pruned, EIVS = EIV_1 + EIV_2 + … + EIV_N, where N is the number of elements at the current granularity level;
a pruning threshold computing module, configured to compute, from the energy pruning rate threshold EPR, the pruning energy EP = EIVS × EPR, compute the cumulative distribution of importance over the ascending set, and choose as the pruning threshold the importance value at which the cumulative importance equals the pruning energy;
a loss-function influence computing module, configured to input a set of test data, compute the loss function value Loss of the network, and compute the loss-function influence EL_i of each element to be pruned;
a zeroing module, configured to, for each element i to be pruned at the current granularity level, set the values corresponding to element i to zero if its importance EIV_i is below the pruning threshold and its loss-function influence EL_i is below 0.
Compared with existing network model compression methods, the model compression method based on multi-granularity pruning of the present invention, by pruning at one or more granularity levels, not only compresses the network model size but also, because the resulting sparsity pattern of the network model is regular, reduces the computational cost of running the network.
Brief description of the drawings
Fig. 1 shows a flow chart of the network model compression method based on multi-granularity pruning according to one embodiment of the invention.
Fig. 2 shows a schematic structural diagram of the network model compression device based on multi-granularity pruning according to one embodiment of the invention.
Detailed description of the embodiments
To enable the examiner to further understand the structure, features, and other objects of the present invention, preferred embodiments are described in detail below with reference to the accompanying drawings. The illustrated preferred embodiments serve only to explain the technical solution of the invention and do not limit it.
The network model compression method based on multi-granularity pruning according to the present invention comprises one, two, or three of the following steps:
an input-channel granularity-level pruning step: using an unimportant-element pruning method, pruning the unimportant elements at the granularity level of the input channels of the network model;
a convolution-kernel granularity-level pruning step: using the unimportant-element pruning method, pruning the unimportant elements at the granularity level of the convolution kernels of the network model;
a weight-parameter granularity-level pruning step: using the unimportant-element pruning method, pruning the unimportant elements at the granularity level of the weight parameters of the network model.
Fig. 1 gives a flow chart of the network model compression method based on multi-granularity pruning according to one embodiment of the invention. As shown in Fig. 1, the network model compression method based on multi-granularity pruning according to the present invention comprises:
an input-channel granularity-level pruning step S1: using an unimportant-element pruning method S10, pruning the unimportant elements at the granularity level of the input channels of the network model;
a convolution-kernel granularity-level pruning step S2: using the unimportant-element pruning method S10, pruning the unimportant elements at the granularity level of the convolution kernels of the network model;
a weight-parameter granularity-level pruning step S3: using the unimportant-element pruning method S10, pruning the unimportant elements at the granularity level of the weight parameters of the network model.
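To make the three granularity levels concrete, the sketch below enumerates the elements to be pruned as slices of a 4-D convolution weight tensor of shape (out_channels, in_channels, kH, kW): an input-channel element spans all kernels that read one input channel, a convolution-kernel element is a single 2-D kernel, and a weight-parameter element is one scalar. This is an illustration only — the patent prescribes no implementation, and all function and variable names here are assumptions:

```python
import numpy as np

def elements_at_granularity(weights, level):
    """Enumerate (index, W_e) pairs for one granularity level.

    weights: 4-D array of shape (out_channels, in_channels, kH, kW).
    Each yielded W_e is the flattened weight slice that would be zeroed
    if the corresponding element were pruned.
    """
    out_c, in_c, kh, kw = weights.shape
    if level == "input_channel":        # step S1: one element per input channel
        for c in range(in_c):
            yield ("channel", c), weights[:, c, :, :].ravel()
    elif level == "conv_kernel":        # step S2: one element per 2-D kernel
        for o in range(out_c):
            for c in range(in_c):
                yield ("kernel", o, c), weights[o, c, :, :].ravel()
    elif level == "weight":             # step S3: one element per scalar weight
        for idx in np.ndindex(*weights.shape):
            yield ("weight",) + idx, np.atleast_1d(weights[idx])
    else:
        raise ValueError("unknown granularity level: %s" % level)

def zero_element(weights, index):
    """Zero, in place, the weight slice named by an elements_at_granularity index."""
    kind = index[0]
    if kind == "channel":
        weights[:, index[1], :, :] = 0.0
    elif kind == "kernel":
        weights[index[1], index[2], :, :] = 0.0
    else:                               # a single scalar weight
        weights[index[1:]] = 0.0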
Further, the unimportant-element pruning method S10 comprises:
an unimportant-element zeroing step S11: computing the importance of each element at the current granularity level and setting the values corresponding to unimportant elements to zero;
a pruning fine-tuning step S12: fine-tuning the whole network model on training data;
a loss judgment step S13: computing the accuracy loss of the fine-tuned network model; if the accuracy loss is below the accuracy-loss threshold, returning to the unimportant-element zeroing step S11, otherwise terminating.
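As a rough illustration of how steps S11–S13 interact, the following sketch alternates zeroing and fine-tuning until the accuracy loss reaches the threshold. The model interface (evaluate, finetune, weights, loss_delta) is an assumed placeholder, not part of the patent, and zero_unimportant_elements is sketched after step S114 below:

```python
def prune_at_granularity(model, level, train_data, val_data,
                         epr=0.1, acc_loss_threshold=0.05):
    """Alternate zeroing (S11) and fine-tuning (S12) until the accuracy
    loss reaches the threshold (S13). A minimal sketch under the stated
    placeholder assumptions.
    """
    baseline_acc = model.evaluate(val_data)
    while True:
        # step S11: zero the unimportant elements at this granularity level
        zero_unimportant_elements(model.weights, level, epr, model.loss_delta)
        model.finetune(train_data)              # step S12, e.g. (stochastic) SGD
        acc_loss = baseline_acc - model.evaluate(val_data)
        if acc_loss >= acc_loss_threshold:      # step S13: stop on too much loss
            break
```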
Further, the unimportant-element zeroing step S11 comprises:
an element importance sorting step S111: collecting the weight parameter vector W_e of each element to be pruned at the current granularity level, computing the importance EIV_i of each element to be pruned from W_e, sorting the elements to be pruned in ascending order of importance EIV_i to obtain an ascending importance set, and computing the total importance of the elements to be pruned, EIVS = EIV_1 + EIV_2 + … + EIV_N, where N is the number of elements at the current granularity level;
a pruning threshold calculation step S112: given the energy pruning rate threshold EPR, computing the pruning energy EP = EIVS × EPR, computing the cumulative distribution of importance over the ascending set, and choosing as the pruning threshold the importance value at which the cumulative importance equals the pruning energy;
a loss-function influence calculation step S113: inputting a set of test data, computing the loss function value Loss of the network, and computing the loss-function influence EL_i of each element to be pruned;
a zeroing step S114: for each element i to be pruned at the current granularity level, if its importance EIV_i is below the pruning threshold and its loss-function influence EL_i is below 0, setting the values corresponding to element i to zero.
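The formula images for EIV_i and EL_i are not reproduced in the published text. Under two explicitly flagged assumptions — that the importance EIV_i is the energy ||W_e||² of the element's weight vector (consistent with the "energy pruning rate" terminology), and that loss_delta_fn(index) returns the loss-function influence EL_i as the change in Loss when the element is zeroed on a batch of test data — steps S111–S114 might look like this sketch:

```python
import numpy as np

def zero_unimportant_elements(weights, level, epr, loss_delta_fn):
    """Sketch of steps S111-S114; EIV_i and EL_i are assumptions, see above."""
    elems = list(elements_at_granularity(weights, level))
    eiv = np.array([np.sum(w ** 2) for _, w in elems])  # S111: importance EIV_i
    order = np.argsort(eiv)                             # ascending importance set
    eivs = float(eiv.sum())                             # total importance EIVS
    ep = eivs * epr                                     # S112: pruning energy EP
    cum = np.cumsum(eiv[order])                         # cumulative distribution
    k = int(np.searchsorted(cum, ep))                   # where cumsum reaches EP
    threshold = eiv[order[min(k, len(order) - 1)]]      # pruning threshold
    for i, (index, _) in enumerate(elems):              # S113/S114: zero elements
        if eiv[i] < threshold and loss_delta_fn(index) < 0:
            zero_element(weights, index)
```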
Further, the value range of the energy pruning rate threshold EPR is 0.01 to 0.2. Preferably, the value range of the energy pruning rate threshold EPR is 0.05 to 0.18.
Specifically, the current granularity level in the unimportant-element pruning method S10 of the input-channel granularity-level pruning step S1 is the granularity level of the input channels of the network model; the current granularity level in the unimportant-element pruning method S10 of the convolution-kernel granularity-level pruning step S2 is the granularity level of the convolution kernels of the network model; and the current granularity level in the unimportant-element pruning method S10 of the weight-parameter granularity-level pruning step S3 is the granularity level of the weight parameters of the network model.
Further, the pruning fine-tuning step S12 fine-tunes the whole network model using gradient descent. Preferably, the pruning fine-tuning step S12 fine-tunes the whole network model using stochastic gradient descent.
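Step S12 only specifies (stochastic) gradient descent fine-tuning of the whole network. A minimal PyTorch-style sketch follows, with one added assumption not stated in the patent: a 0/1 mask per parameter is re-applied after each update so that pruned weights stay at zero during fine-tuning:

```python
import torch

def finetune_pruned(model, loader, masks, lr=1e-3, epochs=1):
    """Plain-SGD fine-tuning of the whole network (step S12). `masks` maps
    parameter names to 0/1 tensors marking surviving positions; re-zeroing
    pruned weights after each step is an assumed detail.
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for inputs, targets in loader:
            optimizer.zero_grad()
            criterion(model(inputs), targets).backward()
            optimizer.step()
            with torch.no_grad():                # keep pruned weights at zero
                for name, param in model.named_parameters():
                    if name in masks:
                        param.mul_(masks[name])
```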
Further, the value range of the accuracy-loss threshold in the loss judgment step S13 is 0.01 to 0.1. Preferably, the value range of the accuracy-loss threshold is 0.05 to 0.08.
Further, the network is a deep learning network. Preferably, the network includes, but is not limited to, one of the following or a combination thereof: a convolutional neural network, a deep belief network, a recurrent neural network.
The network model compression device based on multi-granularity pruning according to the present invention comprises one, two, or three of the following modules:
an input-channel granularity-level pruning module, configured to use an unimportant-element pruning module to prune the unimportant elements at the granularity level of the input channels of the network model;
a convolution-kernel granularity-level pruning module, configured to use the unimportant-element pruning module to prune the unimportant elements at the granularity level of the convolution kernels of the network model;
a weight-parameter granularity-level pruning module, configured to use the unimportant-element pruning module to prune the unimportant elements at the granularity level of the weight parameters of the network model.
Fig. 2 gives a schematic structural diagram of the network model compression device based on multi-granularity pruning according to one embodiment of the invention. As shown in Fig. 2, the network model compression device based on multi-granularity pruning according to the present invention comprises:
an input-channel granularity-level pruning module 1, configured to use an unimportant-element pruning module to prune the unimportant elements at the granularity level of the input channels of the network model;
a convolution-kernel granularity-level pruning module 2, configured to use the unimportant-element pruning module to prune the unimportant elements at the granularity level of the convolution kernels of the network model;
a weight-parameter granularity-level pruning module 3, configured to use the unimportant-element pruning module to prune the unimportant elements at the granularity level of the weight parameters of the network model.
Further, the unimportant-element pruning module 10 comprises:
an unimportant-element zeroing module 11, configured to compute the importance of each element at the current granularity level and set the values corresponding to unimportant elements to zero;
a pruning fine-tuning module 12, configured to fine-tune the whole network model on training data;
a loss judgment module 13, configured to compute the accuracy loss of the fine-tuned network model; if the accuracy loss is below the accuracy-loss threshold, the unimportant-element zeroing module 11 is executed again, otherwise processing terminates.
Further, the unimportant-element zeroing module 11 comprises:
an element importance sorting module 111, configured to collect the weight parameter vector W_e of each element to be pruned at the current granularity level, compute the importance EIV_i of each element to be pruned from W_e, sort the elements to be pruned in ascending order of importance EIV_i to obtain an ascending importance set, and compute the total importance of the elements to be pruned, EIVS = EIV_1 + EIV_2 + … + EIV_N, where N is the number of elements at the current granularity level;
a pruning threshold computing module 112, configured to compute, from the energy pruning rate threshold EPR, the pruning energy EP = EIVS × EPR, compute the cumulative distribution of importance over the ascending set, and choose as the pruning threshold the importance value at which the cumulative importance equals the pruning energy;
a loss-function influence computing module 113, configured to input a set of test data, compute the loss function value Loss of the network, and compute the loss-function influence EL_i of each element to be pruned;
a zeroing module 114, configured to, for each element i to be pruned at the current granularity level, set the values corresponding to element i to zero if its importance EIV_i is below the pruning threshold and its loss-function influence EL_i is below 0.
Compared with existing network model compression methods, the model compression method based on multi-granularity pruning of the present invention, by pruning at one or more granularity levels, not only compresses the network model size but also, because the resulting sparsity pattern of the network model is regular, reduces the computational cost of running the network.
The foregoing are only preferred embodiments of the present invention and are not intended to limit its scope. It should be understood that the invention is not limited to the implementations described herein, which are described to help those skilled in the art practice the invention. Any person skilled in the art can easily make further improvements and refinements without departing from the spirit and scope of the invention; the invention is therefore limited only by the content and scope of its claims, which are intended to cover all alternatives and equivalents falling within the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A network model compression method based on multi-granularity pruning, characterized in that the method comprises one, two, or three of the following steps:
an input-channel granularity-level pruning step: using an unimportant-element pruning method, pruning the unimportant elements at the granularity level of the input channels of the network model;
a convolution-kernel granularity-level pruning step: using the unimportant-element pruning method, pruning the unimportant elements at the granularity level of the convolution kernels of the network model;
a weight-parameter granularity-level pruning step: using the unimportant-element pruning method, pruning the unimportant elements at the granularity level of the weight parameters of the network model.
2. The method of claim 1, characterized in that the unimportant-element pruning method comprises:
an unimportant-element zeroing step: computing the importance of each element at the current granularity level and setting the values corresponding to unimportant elements to zero;
a pruning fine-tuning step: fine-tuning the whole network model on training data;
a loss judgment step: computing the accuracy loss of the fine-tuned network model; if the accuracy loss is below the accuracy-loss threshold, returning to the unimportant-element zeroing step, otherwise terminating.
3. The method of claim 2, characterized in that the unimportant-element zeroing step comprises: an element importance sorting step: collecting the weight parameter vector W_e of each element to be pruned at the current granularity level, computing the importance EIV_i of each element to be pruned, sorting the elements to be pruned in ascending order of importance EIV_i to obtain an ascending importance set, and computing the total importance of the elements to be pruned, EIVS = EIV_1 + EIV_2 + … + EIV_N, where N is the number of elements at the current granularity level;
a pruning threshold calculation step: given the energy pruning rate threshold EPR, computing the pruning energy EP = EIVS × EPR, computing the cumulative distribution of importance over the ascending set, and choosing as the pruning threshold the importance value at which the cumulative importance equals the pruning energy;
a loss-function influence calculation step: inputting a set of test data, computing the loss function value Loss of the network, and computing the loss-function influence EL_i of each element to be pruned;
a zeroing step: for each element i to be pruned at the current granularity level, if its importance EIV_i is below the pruning threshold and its loss-function influence EL_i is below 0, setting the values corresponding to element i to zero.
4. The method of claim 2, wherein the current granularity level in the unimportant-element pruning method of the input-channel granularity-level pruning step is the granularity level of the input channels of the network model; the current granularity level in the unimportant-element pruning method of the convolution-kernel granularity-level pruning step is the granularity level of the convolution kernels of the network model; and the current granularity level in the unimportant-element pruning method of the weight-parameter granularity-level pruning step is the granularity level of the weight parameters of the network model.
5. The method of claim 3, wherein the value range of the energy pruning rate threshold EPR is 0.01 to 0.2.
6. The method of claim 2, wherein the value range of the accuracy-loss threshold is 0.01 to 0.1.
7. The method of claim 4, wherein the network is a deep learning network; preferably, the network includes, but is not limited to, one of the following or a combination thereof: a convolutional neural network, a deep belief network, a recurrent neural network.
8. A network model compression device based on multi-granularity pruning, characterized in that the device comprises one, two, or three of the following modules:
an input-channel granularity-level pruning module, configured to use an unimportant-element pruning module to prune the unimportant elements at the granularity level of the input channels of the network model;
a convolution-kernel granularity-level pruning module, configured to use the unimportant-element pruning module to prune the unimportant elements at the granularity level of the convolution kernels of the network model;
a weight-parameter granularity-level pruning module, configured to use the unimportant-element pruning module to prune the unimportant elements at the granularity level of the weight parameters of the network model.
9. The device of claim 8, characterized in that the unimportant-element pruning module comprises:
an unimportant-element zeroing module, configured to compute the importance of each element at the current granularity level and set the values corresponding to unimportant elements to zero;
a pruning fine-tuning module, configured to fine-tune the whole network model on training data;
a loss judgment module, configured to compute the accuracy loss of the fine-tuned network model; if the accuracy loss is below the accuracy-loss threshold, the unimportant-element zeroing module is executed again, otherwise processing terminates.
10. The device of claim 9, characterized in that the unimportant-element zeroing module comprises: an element importance sorting module, configured to collect the weight parameter vector W_e of each element to be pruned at the current granularity level, compute the importance EIV_i of each element to be pruned, sort the elements to be pruned in ascending order of importance EIV_i to obtain an ascending importance set, and compute the total importance of the elements to be pruned, EIVS = EIV_1 + EIV_2 + … + EIV_N, where N is the number of elements at the current granularity level; a pruning threshold computing module, configured to compute, from the energy pruning rate threshold EPR, the pruning energy EP = EIVS × EPR, compute the cumulative distribution of importance over the ascending set, and choose as the pruning threshold the importance value at which the cumulative importance equals the pruning energy;
a loss-function influence computing module, configured to input a set of test data, compute the loss function value Loss of the network, and compute the loss-function influence EL_i of each element to be pruned;
a zeroing module, configured to, for each element i to be pruned at the current granularity level, set the values corresponding to element i to zero if its importance EIV_i is below the pruning threshold and its loss-function influence EL_i is below 0.
CN201710568710.8A 2017-07-13 2017-07-13 Network model compression method and device based on multi-granularity pruning Pending CN107368885A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710568710.8A 2017-07-13 2017-07-13 Network model compression method and device based on multi-granularity pruning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710568710.8A 2017-07-13 2017-07-13 Network model compression method and device based on multi-granularity pruning

Publications (1)

Publication Number Publication Date
CN107368885A true CN107368885A (en) 2017-11-21

Family

ID=60308043

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710568710.8A Pending CN107368885A (en) 2017-07-13 2017-07-13 Network model compression method and device based on multi-granularity pruning

Country Status (1)

Country Link
CN (1) CN107368885A (en)

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019127362A1 (en) * 2017-12-29 2019-07-04 清华大学 Neural network model block compression method, training method, computing device and system
US11704544B2 (en) 2017-12-30 2023-07-18 Cambricon Technologies Corporation Limited Integrated circuit chip device and related product
US11710031B2 (en) 2017-12-30 2023-07-25 Cambricon Technologies Corporation Limited Parallel processing circuits for neural networks
US11734548B2 (en) 2017-12-30 2023-08-22 Cambricon Technologies Corporation Limited Integrated circuit chip device and related product
TWI768167B (en) * 2017-12-30 2022-06-21 大陸商中科寒武紀科技股份有限公司 Integrated circuit chip device and related products
US11651202B2 (en) 2017-12-30 2023-05-16 Cambricon Technologies Corporation Limited Integrated circuit chip device and related product
CN108629288A (en) * 2018-04-09 2018-10-09 华中科技大学 A kind of gesture identification model training method, gesture identification method and system
CN108629288B (en) * 2018-04-09 2020-05-19 华中科技大学 Gesture recognition model training method, gesture recognition method and system
WO2019226686A3 (en) * 2018-05-23 2020-02-06 Movidius Ltd. Deep learning system
US11900256B2 (en) 2018-05-23 2024-02-13 Intel Corporation Deep learning system
CN108898168A (en) * 2018-06-19 2018-11-27 清华大学 The compression method and system of convolutional neural networks model for target detection
CN108898168B (en) * 2018-06-19 2021-06-01 清华大学 Compression method and system of convolutional neural network model for target detection
CN109523017B (en) * 2018-11-27 2023-10-17 广州市百果园信息技术有限公司 Gesture detection method, device, equipment and storage medium
CN109523017A (en) * 2018-11-27 2019-03-26 广州市百果园信息技术有限公司 Compression method, device, equipment and the storage medium of deep neural network
CN109583586A (en) * 2018-12-05 2019-04-05 东软睿驰汽车技术(沈阳)有限公司 A kind of convolution kernel processing method and processing device
CN109583586B (en) * 2018-12-05 2021-03-23 东软睿驰汽车技术(沈阳)有限公司 Convolution kernel processing method and device in voice recognition or image recognition
CN109634401A (en) * 2018-12-29 2019-04-16 联想(北京)有限公司 A kind of control method and electronic equipment
CN109634401B (en) * 2018-12-29 2023-05-02 联想(北京)有限公司 Control method and electronic equipment
CN109344921A (en) * 2019-01-03 2019-02-15 湖南极点智能科技有限公司 A kind of image-recognizing method based on deep neural network model, device and equipment
CN111695375A (en) * 2019-03-13 2020-09-22 上海云从企业发展有限公司 Face recognition model compression algorithm based on model distillation, medium and terminal
CN110059823A (en) * 2019-04-28 2019-07-26 中国科学技术大学 Deep neural network model compression method and device
KR102247896B1 (en) 2019-05-18 2021-05-04 주식회사 디퍼아이 Convolution neural network parameter optimization method, neural network computing method and apparatus
KR20210015990A (en) * 2019-05-18 2021-02-10 주식회사 디퍼아이 Convolution neural network parameter optimization method, neural network computing method and apparatus
CN112036563A (en) * 2019-06-03 2020-12-04 国际商业机器公司 Deep learning model insights using provenance data
WO2021185121A1 (en) * 2020-03-17 2021-09-23 北京京东乾石科技有限公司 Model generation method and apparatus, object detection method and apparatus, device, and storage medium
CN113408561A (en) * 2020-03-17 2021-09-17 北京京东乾石科技有限公司 Model generation method, target detection method, device, equipment and storage medium
CN112132062A (en) * 2020-09-25 2020-12-25 中南大学 Remote sensing image classification method based on pruning compression neural network
CN112329931B (en) * 2021-01-04 2021-05-07 北京智源人工智能研究院 Countermeasure sample generation method and device based on proxy model
CN112329931A (en) * 2021-01-04 2021-02-05 北京智源人工智能研究院 Countermeasure sample generation method and device based on proxy model

Similar Documents

Publication Publication Date Title
CN107368885A (en) Network model compression method and device based on multi-granularity pruning
CN105512289B (en) Image search method based on deep learning and Hash
CN103810101B Software defect prediction method and software defect prediction system
CN109146076A (en) model generating method and device, data processing method and device
CN107273836A Pedestrian detection and recognition method, device, model and medium
CN109754068A (en) Transfer learning method and terminal device based on deep learning pre-training model
CN110263863A Fine-grained mushroom phenotype recognition method based on transfer learning and bilinear InceptionResNetV2
CN108614997B (en) Remote sensing image identification method based on improved AlexNet
CN108388957B (en) Medium and small river flood forecasting method and forecasting system based on multi-feature fusion technology
CN110322453A (en) 3D point cloud semantic segmentation method based on position attention and auxiliary network
WO2020238237A1 (en) Power exponent quantization-based neural network compression method
CN105354595A (en) Robust visual image classification method and system
CN104616029B (en) Data classification method and device
CN103313047B Video coding method and device
CN109993100A Implementation method of facial expression recognition based on deep feature clustering
CN106780737A Method for calculating geomorphologic instantaneous unit hydrograph probability using a digital elevation model
CN111435463A (en) Data processing method and related equipment and system
CN110060489A Neural-network-based traffic signal timing scheme recommendation method
CN109284761A Image feature extraction method, device, equipment and readable storage medium
CN104318241A (en) Local density spectral clustering similarity measurement algorithm based on Self-tuning
CN105631469A (en) Bird image recognition method by multilayer sparse coding features
CN109472352A Deep neural network model pruning method based on feature map statistical features
CN110070106A Smoke detection method, device and electronic equipment
Cheni et al. Reducing latency in a converted spiking video segmentation network
CN103281555B (en) Half reference assessment-based quality of experience (QoE) objective assessment method for video streaming service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20171121)