CN113065640B - Image classification network compression method based on convolution kernel shape automatic learning - Google Patents

Image classification network compression method based on convolution kernel shape automatic learning

Info

Publication number
CN113065640B
Authority
CN
China
Prior art keywords
convolution kernel
network
image classification
convolution
term
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110283921.3A
Other languages
Chinese (zh)
Other versions
CN113065640A (en)
Inventor
张科 (Zhang Ke)
刘广哲 (Liu Guangzhe)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University filed Critical Northwestern Polytechnical University
Priority to CN202110283921.3A
Publication of CN113065640A
Application granted
Publication of CN113065640B
Legal status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections


Abstract

The invention relates to an image classification network compression method based on automatic learning of convolution kernel shapes, and belongs to the technical field of image processing and recognition. By applying multiple sparse regularization constraints to the parameter at each position of a conventional convolution kernel, the parameters inside the kernel are sparsified during network training, and setting a clipping threshold according to the compression rate yields the automatically learned convolution kernel shape, so redundant parameters inside the kernel are effectively eliminated. Applied to the image classification task, the method further improves the compression rate of the network model while preserving classification accuracy, reduces the parameter count and computation of the model, and facilitates deployment and application on resource-limited mobile devices.

Description

Image classification network compression method based on convolution kernel shape automatic learning
Technical Field
The invention belongs to the technical field of image processing and recognition, and particularly relates to an image classification network compression method based on convolution kernel shape automatic learning.
Background
Image classification and recognition is an important subject in the field of machine vision. Early image recognition methods relied mainly on manually extracted features, which gave low accuracy and limited applicability across scenes. With the advent of deep learning, convolutional neural networks have achieved great success in machine vision fields such as image recognition and object detection; deep neural networks can effectively extract high-level semantic features from images and can reach recognition capability exceeding that of humans.
However, as network performance improves, network structures become increasingly complex, and the requirements on the storage and computing capacity of devices grow accordingly, which limits application and development on resource-limited mobile devices. Large-scale neural network models contain considerable redundancy: not all parameters play an effective role in network performance, and excessive parameters cause problems such as slow network convergence and parameter overfitting. To facilitate the deployment and application of neural networks, neural network compression methods are receiving more and more attention.
Parameter pruning is an effective neural network compression method that reduces model complexity by cutting redundant or unimportant parameters in the network. Wei Yue, Chen Shichao, Zhu Fenghua (Model pruning method based on sparse convolutional neural networks; Computer Engineering; DOI: https://doi.org/10.19678/j.issn.1000-3428.0059375) propose a model pruning algorithm based on sparse convolutional neural networks: sparse regularization constraints are applied to the convolutional layers and batch normalization (BN) layers during training so that the network weights become sparse; a pruning threshold is set; filter channels of lower importance in the network are pruned; and the accuracy of the model is recovered by fine-tuning. This method belongs to structured pruning and prunes with the convolution channel as the smallest unit, so it cannot remove redundant parameters inside the convolution kernel. Smaller pruning units are required to achieve higher compression rates.
Disclosure of Invention
Technical problem to be solved
Existing sparse pruning methods for convolutional neural networks perform sparse training on whole convolution channels and cannot eliminate redundant parameters inside the convolution kernel, so the compression rate of the network model is low and the final image classification accuracy is affected. The invention provides an image classification network compression method based on automatic learning of convolution kernel shapes.
Technical proposal
An image classification network compression method based on convolution kernel shape automatic learning is characterized by comprising the following steps:
step 1: building a convolutional neural network for image classification;
step 2: introducing a coefficient matrix F into the conventional convolution process, and adding to the loss function a weight-sparsity regularization term L_sparse, a distribution-balance regularization term L_dist and an inter-group balance regularization term L_group:

L = L_cls + L_wd + λ1·L_sparse + λ2·L_dist + λ3·L_group

where λ1, λ2 and λ3 are coefficients for balancing the terms, L_cls is the classification loss term, and L_wd is the weight-decay regularization term;

solving, during network training, the partial derivatives of the three regularization terms with respect to each element f_ij of the coefficient matrix, and using them to update the coefficient matrix F by back-propagation; obtaining a sparse coefficient matrix F after training is completed;

step 3: setting a threshold according to the model compression rate to be reached, and removing the convolution kernel parameters at the positions whose f_ij is below the threshold, to obtain the convolution kernel shape of each convolution layer;
step 4: replacing the original conventional convolution kernels with the sparse shapes obtained by automatic learning, and training the network again to obtain the final image classification neural network model.
Preferably: the convolutional neural network in step 1 is VGG.
Preferably: the convolutional neural network in step 1 is ResNet.
Advantageous effects
According to the image classification network compression method based on automatic learning of convolution kernel shapes provided by the invention, multiple sparse regularization constraints are applied to the parameter at each position of a conventional convolution kernel, so the parameters inside the kernel are sparsified during network training; a clipping threshold set according to the compression rate then yields the automatically learned convolution kernel shape, so redundant parameters inside the kernel are effectively eliminated. Applied to the image classification task, the method further improves the compression rate of the network model while preserving classification accuracy, reduces the parameter count and computation of the model, and facilitates deployment and application on resource-limited mobile devices.
The image classification network compression method based on automatic learning of convolution kernel shapes designed by the invention can automatically learn the convolution kernel shape of each convolutional layer during network training, so that the receptive field of the convolution kernel adapts to the network depth while redundant parameters inside the kernel are eliminated, achieving a good network compression effect.
The convolution kernel shape automatic learning method also provides a new idea for the design of efficient network structures: adding the convolution kernel shape to the search space of neural architecture search (NAS) yields a larger search space, allowing richer target features to be extracted and network performance to be improved.
The image classification network compression method based on automatic learning of convolution kernel shapes designed by the invention can effectively compress the parameters in the convolutional layers of a neural network; for example, 59.07% of the parameters and 51.91% of the computation of a VGG-16 network can be removed without reducing accuracy, so the image classification network can be conveniently deployed on terminal mobile devices.
The image classification network compression method based on automatic learning of convolution kernel shapes can reduce redundant parameters in the convolutional layers, thereby lowering the overfitting risk of the image classification network and improving its classification accuracy; classification accuracy can be improved by 0.72% while the VGG-16 network is compressed.
Drawings
Fig. 1 is a diagram of a convolution calculation process incorporating a matrix of convolution kernel coefficients.
Fig. 2 is a 3×3 convolution kernel parameter number and packet division diagram.
Fig. 3 is a convolution kernel shape auto-learning flow chart.
FIG. 4 shows the convolution kernel shape of each convolutional layer automatically learned on the CIFAR-10 dataset using a VGG-16 network.
Detailed Description
The invention will now be further described with reference to examples and figures:
the invention provides an image classification network compression method based on convolution kernel shape automatic learning, wherein the convolution kernel shape automatic learning flow is shown in figure 3. The following describes embodiments of the present invention in connection with image classification examples, but the technical content of the present invention is not limited to the described scope, and the embodiments include the following steps:
(1) A convolutional neural network for image classification is built, and an image dataset with a large number of training samples and labels is constructed.
(2) For a convolution layer in a neural network, the convolution calculation process of a conventional convolution kernel is as follows:
Y=X*w
where X ∈ R^(c×h×w) is the input feature map tensor, Y ∈ R^(n×h'×w') is the output feature map tensor, w ∈ R^(n×c×k×k) is the convolution weight tensor, c and n are the numbers of input and output channels, h and w are the height and width of the input feature map, h' and w' are the height and width of the output feature map, k×k is the size of the convolution kernel, and * is the image convolution operation.
The convolution kernels of the n output channels are divided equally into d groups, each containing n/d convolution channels. To sparsify the parameters inside the convolution kernel, a coefficient matrix F ∈ R^(d×k×k) (one k×k coefficient matrix per group) is introduced; it is multiplied point by point with the convolution weights w of each group, and the result is then convolved with the input X, namely:
Y=X*(F⊙w)
in the formula, the term ". As used herein, is a point-wise multiplication operation. The entire calculation process is referred to in fig. 1.
The loss function in the conventional convolutional neural network training process is:

L = L_cls + L_wd

where L_cls is the classification loss term, computed from the input images and predicted labels during network training, and L_wd is the weight-decay regularization term, which reduces network overfitting.
A coefficient matrix F is introduced into the conventional convolution process, and a weight-sparsity regularization term L_sparse, a distribution-balance regularization term L_dist and an inter-group balance regularization term L_group are added to the loss function. During network training, the partial derivatives of these three loss terms with respect to each element f_ij of the coefficient matrix are solved and used to update the coefficient matrix F by back-propagation. A sparse coefficient matrix F is obtained after training is completed.
In order to learn the convolution kernel shape automatically, sparse regularization constraints are applied to the coefficient matrix F, and the loss function becomes:

L = L_cls + L_wd + λ1·L_sparse + λ2·L_dist + λ3·L_group

where L_sparse is the regularization term that drives the convolution kernel weights toward sparsity, L_dist is the regularization term that balances the distribution of the convolution kernel parameters, L_group is the regularization term that balances the parameters between groups, and λ1, λ2, λ3 are coefficients for balancing the terms.
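A sketch assembling this training loss is given below; the λ values are illustrative assumptions, and sparse_loss, dist_loss and group_loss are sketched alongside the corresponding terms that follow.

def total_loss(logits, labels, model, lam=(1e-4, 1e-5, 1e-5)):
    # L = L_cls + L_wd + λ1·L_sparse + λ2·L_dist + λ3·L_group
    # (L_wd, the weight-decay term, is typically handled by the optimizer)
    l_cls = TF.cross_entropy(logits, labels)
    reg = logits.new_zeros(())
    for m in model.modules():
        if isinstance(m, ShapeConv2d):
            reg = reg + lam[0] * sparse_loss(m.coeff) \
                      + lam[1] * dist_loss(m.coeff) \
                      + lam[2] * group_loss(m.coeff)
    return l_cls + reg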
Each regularization term is constructed separately below. Taking a 3×3 convolution kernel as an example, the 9 parameters in the kernel are numbered and divided into three position groups: corner (G_corner), edge (G_edge) and center (G_center); see Fig. 2 for the numbering and grouping.
1) L_sparse is the regularization term that drives the weights toward sparsity:

L_sparse = Σ_i Σ_j k_j·g(f_ij)

where k_j is the coefficient for position j; these coefficients form a vector k ∈ R^(1×9) that applies regularization constraints of different strengths to the corner, edge and center positions. For example, taking k = [4,2,4,2,1,2,4,2,4] applies to the G_corner positions 4 times the constraint of G_center and to the G_edge positions 2 times that of G_center, and therefore favors preserving parameters close to the center of the convolution kernel.

g(·) is the regularization norm; using the L1 norm, for example:

g(f_ij) = |f_ij|
during the training process, due toIndependent of the training sample, it can be obtained in advanceWith respect to each coefficient f ij For back-propagating the update coefficient matrix F. The partial derivatives are:
where sgn (·) is a sign function.
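A sketch of this term for the (d, 1, 1, 3, 3) coefficient tensor of the ShapeConv2d layer above, using the example position weights k = [4,2,4,2,1,2,4,2,4]; autograd recovers the k_j·sgn(f_ij) gradient automatically.

k_pos = torch.tensor([[4., 2., 4.],
                      [2., 1., 2.],
                      [4., 2., 4.]])

def sparse_loss(coeff):
    # L_sparse = Σ_i Σ_j k_j·|f_ij|, broadcast over the d groups
    return (k_pos * coeff.abs()).sum()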
2) L_dist is the regularization term that balances the distribution of the convolution kernel parameters, so that the convolution parameters in all directions are taken into account and a directional shift of the feature map is avoided. For each position j, the absolute values of the coefficients f_ij of all d groups are summed:

F_j = Σ_i |f_ij|

For the F_j of the positions within G_corner and within G_edge, the pairwise differences are taken and their squares summed:

L_dist = Σ_{j1,j2 ∈ G_corner} (F_j1 - F_j2)² + Σ_{j1,j2 ∈ G_edge} (F_j1 - F_j2)²
according to the rule of the chain-type derivation,with respect to each coefficient f ij The partial derivatives of (2) are:
in the method, in the process of the invention,
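A sketch of this term under the same (d, 1, 1, 3, 3) coefficient layout, with the Fig. 2 grouping written out explicitly; the index lists are assumptions about the figure's numbering.

CORNER = [(0, 0), (0, 2), (2, 0), (2, 2)]
EDGE   = [(0, 1), (1, 0), (1, 2), (2, 1)]

def dist_loss(coeff):
    # F_j: sum of |f_ij| over the d groups at each of the 9 positions
    F_pos = coeff.abs().sum(dim=0).view(3, 3)
    loss = coeff.new_zeros(())
    for grp in (CORNER, EDGE):
        vals = torch.stack([F_pos[r, c] for r, c in grp])
        # squared pairwise differences within the position group
        loss = loss + ((vals[:, None] - vals[None, :]) ** 2).sum() / 2
    return loss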
3)the method is a regular term for equalizing parameters among groups, and the excessive difference of the number of d parameters among groups is avoided. Calculation of G in d groups respectively corner 、G edge 、G center F at each position ij Sum of absolute values, yield:
wherein F is i corner Representing the position G in the ith group of convolution kernels corner Coefficient f of position ij Sum of absolute values, F i edge Representing the position G in the ith group of convolution kernels edge Coefficient f of position ij Sum of absolute values, F i center Representing the position G in the ith group of convolution kernels center Coefficient f of position ij Sum of absolute values.
For each F in d groups i corner 、F i edge 、F i center And (3) carrying out difference between every two pairs, and obtaining the square sum:
in the method, in the process of the invention,represents G corner Position-generated inter-group equalization loss, +.>Represents G edge Position-generated inter-group equalization loss, +.>Represents G center Position-generated groupInter-equalization loss. The total inter-group equalization loss is:
according to the rule of the chain-type derivation,with respect to each coefficient f ij The partial derivatives of (2) are:
in the method, in the process of the invention,
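A sketch of this term, reusing CORNER and EDGE from above and adding the center position:

CENTER = [(1, 1)]

def group_loss(coeff):
    F_grid = coeff.abs().view(-1, 3, 3)   # (d, 3, 3): per-group |f_ij|
    loss = coeff.new_zeros(())
    for grp in (CORNER, EDGE, CENTER):
        # F_i^p: per-group sum of |f_ij| over one position group p, shape (d,)
        vals = sum(F_grid[:, r, c] for r, c in grp)
        loss = loss + ((vals[:, None] - vals[None, :]) ** 2).sum() / 2
    return loss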
(3) The coefficient matrix F in (1) is trained to be sparse using the loss function in (2). After training, a sparse F is obtained; a clipping threshold is set, and the convolution kernel parameters at the positions whose f_ij is below the threshold are removed, yielding the automatically learned convolution kernel shape.
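A sketch of this clipping step; the helper shape_mask and the keep_ratio parameter are illustrative, with the ratio derived from the desired compression rate (e.g. keep_ratio=0.4 removes 60% of the positions, as in the experiment below).

def shape_mask(coeff, keep_ratio=0.4):
    flat = coeff.abs().flatten()
    # threshold chosen so that about keep_ratio of the positions survive
    thresh = flat.sort(descending=True).values[int(keep_ratio * flat.numel()) - 1]
    return (coeff.abs() >= thresh).float()

# usage for step (4): freeze the learned shape, then retrain
# mask = shape_mask(conv.coeff)
# conv.coeff.data.copy_(mask)        # fix the kernel shape as a 0/1 gate
# conv.coeff.requires_grad_(False)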
(4) The original conventional convolution kernels are replaced with the sparse shapes obtained by automatic learning, and the network is trained again to obtain the final neural network model. The parameter count and computation of this model are lower than those of the original model, achieving network compression while ensuring correct classification results.
Based on the method for automatically learning the shape of the convolution kernel, the shape of the convolution kernel adapting to each convolution layer can be obtained, and redundant parameters in the convolution kernel can be effectively removed, so that the purpose of model compression is achieved.
FIG. 4 shows the convolution kernel shapes of the convolutional layers of a VGG-16 network learned automatically on the CIFAR-10 dataset. d = 2 was selected during learning, i.e., the convolution kernels of the n output channels of each convolutional layer were divided equally into 2 groups, and 60% of the parameters in the network were removed during clipping. The leftmost column (scheme one) is the result of adding only the sparse regularization term L_sparse with the same constraint coefficient k_j at the corner, edge and center positions. The second column from the left (scheme two) builds on the first by taking different constraint coefficients k_j at the corner, edge and center positions; compared with the first column, more parameters near the center of the convolution kernel are retained. The third column from the left (scheme three) builds on the second by adding the distribution-balance regularization term L_dist; compared with the second column, the resulting convolution kernels take the convolution parameters in all directions into account, particularly in layers 1 and 10, which avoids the extracted feature maps shifting in a certain direction. The fourth column from the left (scheme four) builds on the third by adding the inter-group balance regularization term L_group; compared with the third column, the parameter counts of the two groups of convolution kernels are better balanced, especially in layer 5.
Table 1 shows the compression results of the invention. The original VGG-16 model contains 15.0M parameters and 314M computations, with an accuracy of 93.45% on CIFAR-10. The model obtained with the traditional structured pruning method has 5.4M parameters and 206M computations, with accuracy dropping to 93.40%. With the convolution kernel shape automatic learning method, each added regularization constraint improves the accuracy of the model while compressing the network, showing that each constraint plays a beneficial role in the compression result; the finally obtained model (scheme four) has 6.14M parameters and 151M computations, and its accuracy rises to 94.17%.
Table 1. Network compression results of the invention

Model                      Parameters  Computation  Accuracy (CIFAR-10)
Original VGG-16            15.0M       314M         93.45%
Structured pruning         5.4M        206M         93.40%
Invention (scheme four)    6.14M       151M         94.17%
Claims (3)

1. An image classification network compression method based on convolution kernel shape automatic learning is characterized by comprising the following steps:
step 1: building a convolutional neural network for image classification;
step 2: introducing a coefficient matrix F into the conventional convolution process, and adding to the loss function a weight-sparsity regularization term L_sparse, a distribution-balance regularization term L_dist and an inter-group balance regularization term L_group:

L = L_cls + L_wd + λ1·L_sparse + λ2·L_dist + λ3·L_group

where λ1, λ2 and λ3 are coefficients used to balance the terms, L_cls is the classification loss term, and L_wd is the weight-decay regularization term;

the weight-sparsity regularization term L_sparse is calculated according to the formula:

L_sparse = Σ_i Σ_j k_j·|f_ij|

where k_j is the coefficient for position j, f_ij represents the coefficients, i denotes the i-th group and j the j-th position;

the distribution-balance regularization term L_dist is calculated according to the formula:

L_dist = Σ_{j1,j2 ∈ G_corner} (F_j1 - F_j2)² + Σ_{j1,j2 ∈ G_edge} (F_j1 - F_j2)²

where G_corner denotes the corner positions and G_edge the edge positions; F_i represents the sum of the absolute values of the coefficients f_ij in the i-th group of convolution kernels, and F_j represents the sum of the absolute values of the coefficients f_ij at the j-th position over all groups;

the inter-group balance regularization term L_group is calculated according to the formula:

L_group = L_group^corner + L_group^edge + L_group^center

where L_group^corner represents the inter-group balance loss generated by the G_corner positions, L_group^edge the inter-group balance loss generated by the G_edge positions, and L_group^center the inter-group balance loss generated by the G_center position;
solving, during network training, the partial derivatives of the three loss terms with respect to each element f_ij of the coefficient matrix, and using them to update the coefficient matrix F by back-propagation; obtaining a sparse coefficient matrix F after training is completed;
step 3: setting a threshold according to the model compression rate to be reached, and removing the convolution kernel parameters at the positions whose f_ij is below the threshold, to obtain the convolution kernel shape of each convolution layer;
step 4: replacing the original conventional convolution kernels with the sparse shapes obtained by automatic learning, and training the network again to obtain the final image classification neural network model.
2. The method for compressing the image classification network based on the convolutional kernel shape automatic learning according to claim 1, wherein the convolutional neural network in the step 1 is VGG.
3. The method for compressing the image classification network based on the convolutional kernel shape automatic learning according to claim 1, wherein the convolutional neural network in the step 1 is ResNet.
CN202110283921.3A 2021-03-17 2021-03-17 Image classification network compression method based on convolution kernel shape automatic learning Active CN113065640B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110283921.3A CN113065640B (en) 2021-03-17 2021-03-17 Image classification network compression method based on convolution kernel shape automatic learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110283921.3A CN113065640B (en) 2021-03-17 2021-03-17 Image classification network compression method based on convolution kernel shape automatic learning

Publications (2)

Publication Number Publication Date
CN113065640A CN113065640A (en) 2021-07-02
CN113065640B (en) 2024-01-09

Family

ID=76560834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110283921.3A Active CN113065640B (en) 2021-03-17 2021-03-17 Image classification network compression method based on convolution kernel shape automatic learning

Country Status (1)

Country Link
CN (1) CN113065640B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104200224A (en) * 2014-08-28 2014-12-10 西北工业大学 Valueless image removing method based on deep convolutional neural networks
WO2018028255A1 (en) * 2016-08-11 2018-02-15 深圳市未来媒体技术研究院 Image saliency detection method based on adversarial network
KR20180052063A (en) * 2016-11-07 2018-05-17 한국전자통신연구원 Convolution neural network system and operation method thereof
CN107609525A (en) * 2017-09-19 2018-01-19 吉林大学 Remote Sensing Target detection method based on Pruning strategy structure convolutional neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Accelerating convolutional networks via global & dynamic filter pruning";S.Lin et al.;《Joint Conf. Artif. Intell》;全文 *
Research on an environmental sound classification system based on fused features and convolutional neural networks; Zhang Ke (张科); Su Yu (苏雨); Wang Jingyu (王靖宇); Wang Xianyu (王霰宇); Zhang Yanhua (张彦华); Journal of Northwestern Polytechnical University (西北工业大学学报), No. 01; full text *

Also Published As

Publication number Publication date
CN113065640A (en) 2021-07-02

Similar Documents

Publication Publication Date Title
CN108985317B (en) Image classification method based on separable convolution and attention mechanism
CN108596258B (en) Image classification method based on convolutional neural network random pooling
CN111079781B (en) Lightweight convolutional neural network image recognition method based on low rank and sparse decomposition
CN107145889B (en) Target identification method based on double CNN network with RoI pooling
CN112036512B (en) Image classification neural network architecture searching method and device based on network clipping
CN111696101A (en) Light-weight solanaceae disease identification method based on SE-Inception
CN109214353B (en) Training method and device for rapid detection of face image based on pruning model
CN111507319A (en) Crop disease identification method based on deep fusion convolution network model
CN108205703B (en) Multi-input multi-output matrix average value pooling vectorization implementation method
CN109376787B (en) Manifold learning network and computer vision image set classification method based on manifold learning network
CN112381764A (en) Crop disease and insect pest detection method
CN111275165A (en) Network intrusion detection method based on improved convolutional neural network
CN113420651B (en) Light weight method, system and target detection method for deep convolutional neural network
CN115049814B (en) Intelligent eye protection lamp adjusting method adopting neural network model
CN113837314A (en) Hyperspectral image classification method based on hybrid convolutional neural network
QingJie et al. Research on image retrieval using deep convolutional neural network combining L1 regularization and PRelu activation function
CN116740119A (en) Tobacco leaf image active contour segmentation method based on deep learning
CN109740481A (en) Atrial fibrillation Modulation recognition method of the CNN based on jump connection in conjunction with LSTM
CN113505856B (en) Non-supervision self-adaptive classification method for hyperspectral images
CN111461978A (en) Attention mechanism-based resolution-by-resolution enhanced image super-resolution restoration method
CN113065640B (en) Image classification network compression method based on convolution kernel shape automatic learning
CN110399887B (en) Representative color extraction method based on visual saliency and histogram statistical technology
Rui et al. Smart network maintenance in an edge cloud computing environment: An adaptive model compression algorithm based on model pruning and model clustering
CN113837263B (en) Gesture image classification method based on feature fusion attention module and feature selection
CN112785663B (en) Image classification network compression method based on convolution kernel of arbitrary shape

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant