Disclosure of Invention
Aiming at the defects or improvement requirements of the prior art, the present invention provides a heuristic filter pruning method and system for a convolutional neural network, and aims to solve the technical problem that the pruned convolutional neural network suffers severe precision loss because existing methods do not consider the dynamic changes in the distance and direction of the filters during convolutional neural network training.
To achieve the above object, according to an aspect of the present invention, there is provided a heuristic filter pruning method in a convolutional neural network, which includes the following steps:
(1) obtaining the adjusted cosine similarity of each filter of each convolutional layer of a convolutional neural network between the jth period and the (j-off)th period of the convolutional neural network;
wherein i is the number of a convolutional layer in the convolutional neural network, i ∈ [1, total number of convolutional layers in the convolutional neural network]; k is the number of a filter in the convolutional layer, k ∈ [1, total number of filters in the convolutional layer]; j > off, off being a preset threshold;
(2) for the first convolutional layer of the convolutional neural network, arranging all the filters in ascending order of the adjusted cosine similarity, obtained in step (1), of all filters in the convolutional layer between the jth period and the (j-off)th period of the convolutional neural network, and soft-pruning the sorted filters according to a preset pruning rate;
(3) for the remaining convolutional layers of the convolutional neural network, repeating step (2) until all convolutional layers are soft-pruned, thereby obtaining the soft-pruning-updated convolutional neural network;
(4) reconstructing the soft-pruning-updated convolutional neural network for off + 1 periods to obtain a reconstructed convolutional neural network;
(5) setting j to j + off + 1, repeating steps (1) to (4), and judging whether the network precision of the reconstructed convolutional neural network tends to be stable; if so, the convolutional neural network with stable soft pruning is obtained and the method proceeds to step (6); otherwise, this step continues to be repeated;
(6) obtaining the adjusted cosine similarity of each filter of each convolutional layer of the convolutional neural network after soft pruning is stable, between the (j+off+1)th period and the (j+1)th period of the convolutional neural network;
(7) for the first convolutional layer of the convolutional neural network after soft pruning is stable, arranging all the filters in ascending order of the adjusted cosine similarity, obtained in step (6), of all filters in the convolutional layer between the (j+off+1)th period and the (j+1)th period of the convolutional neural network, and hard-pruning the sorted filters according to a preset pruning rate;
(8) for the remaining convolutional layers of the convolutional neural network after the soft pruning is stable, repeating step (7) until all the convolutional layers are hard-pruned, thereby obtaining the hard-pruning-updated convolutional neural network;
(9) fine-tuning the hard-pruning-updated convolutional neural network until the network precision of the fine-tuned convolutional neural network reaches a stable value, thereby obtaining the pruned convolutional neural network.
Preferably, the adjusted cosine similarity of each filter of each convolutional layer between the jth period and the (j-off)th period of the convolutional neural network is calculated according to the following substeps:
(1-1) separately obtaining the tensor parameter (hereinafter referred to as the first tensor parameter) of each filter of each convolutional layer of the convolutional neural network after the jth period, and the tensor parameter (hereinafter referred to as the second tensor parameter) of each filter of each convolutional layer after the (j-off)th period, and obtaining the average tensor parameter M_i of each convolutional layer from the tensor parameters of all the filters of the convolutional layer after the two periods;
(1-2) using the average tensor parameter M_i of each convolutional layer obtained in step (1-1) to correct the first tensor parameter and the second tensor parameter, respectively, thereby obtaining the corrected first tensor parameter and the corrected second tensor parameter of each filter of the convolutional layer;
(1-3) obtaining the adjusted cosine similarity of each filter of the convolutional layer between the jth period and the (j-off)th period of the convolutional neural network from the corrected first tensor parameter and the corrected second tensor parameter of each filter of the convolutional layer.
Preferably, the average tensor parameter M_i of each convolutional layer is the mean of the 2·O_i tensor parameters of all the filters of the ith convolutional layer over the two periods, wherein O_i is the total number of filters in the ith convolutional layer, and k ∈ [1, O_i].
Preferably, the corrected first tensor parameter and the corrected second tensor parameter of each filter of each convolutional layer are obtained by subtracting the average tensor parameter M_i of the convolutional layer from the first tensor parameter and the second tensor parameter, respectively.
preferably, each filter of each convolutional layer adjusts cosine similarity between the jth and jth-off periods of the convolutional neural network
Comprises the following steps:
where | | | represents the calculation of the L2 norm.
Preferably, the soft pruning operation sets the weight parameters of the filter to zero, and the hard pruning operation removes the filter.
Preferably, the reconstruction trains the soft-pruning-updated convolutional neural network for off + 1 periods, and the fine-tuning trains the hard-pruning-updated convolutional neural network for an additional 10-20 periods.
According to another aspect of the present invention, there is provided a heuristic filter pruning system in a convolutional neural network, comprising the following modules:
a first module for obtaining the adjusted cosine similarity of each filter of each convolutional layer of the convolutional neural network between the jth period and the (j-off)th period of the convolutional neural network;
wherein i is the number of a convolutional layer in the convolutional neural network, i ∈ [1, total number of convolutional layers in the convolutional neural network]; k is the number of a filter in the convolutional layer, k ∈ [1, total number of filters in the convolutional layer]; j > off, off being a preset threshold;
a second module for, for the first convolutional layer of the convolutional neural network, arranging all the filters in ascending order of the adjusted cosine similarity, obtained by the first module, of all filters in the convolutional layer between the jth period and the (j-off)th period of the convolutional neural network, and soft-pruning the sorted filters according to a preset pruning rate;
a third module, configured to repeat the second module for the remaining convolutional layers of the convolutional neural network until all convolutional layers are soft-pruned, so as to obtain a soft-pruned updated convolutional neural network;
a fourth module for reconstructing the soft-pruning-updated convolutional neural network for off + 1 periods to obtain the reconstructed convolutional neural network;
a fifth module, configured to set j to j + off +1, repeat the first to fourth modules, and determine whether the network accuracy of the reconstructed convolutional neural network tends to be stable, if so, indicate that the convolutional neural network after soft pruning is stable is obtained, then enter the sixth module, otherwise, continue to execute the present module;
a sixth module, configured to obtain an adjusted cosine similarity of each filter of each convolutional layer of the convolutional neural network after soft pruning is stabilized between the (j + off + 1) th period and the (j + 1) th period of the convolutional neural network
a seventh module for, for the first convolutional layer of the convolutional neural network after soft pruning is stable, arranging all the filters in ascending order of the adjusted cosine similarity, obtained by the sixth module, of all filters in the convolutional layer between the (j+off+1)th period and the (j+1)th period of the convolutional neural network, and hard-pruning the sorted filters according to a preset pruning rate;
an eighth module, configured to repeat the seventh module for the remaining convolutional layers of the convolutional neural network after the soft pruning is stable until all convolutional layers are hard-pruned, so as to obtain a convolutional neural network after the hard pruning is updated;
and a ninth module for fine-tuning the hard-pruning-updated convolutional neural network until the network precision of the fine-tuned convolutional neural network reaches a stable value, thereby obtaining the pruned convolutional neural network.
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:
(1) the adjusted cosine similarity in step (1) characterizes the changes in direction and distance of the filters during convolutional neural network training and updating, so the importance of the filters can be evaluated more accurately; soft pruning and hard pruning are performed according to the adjusted cosine similarity, so that filters with smaller information entropy are pruned while filters with larger information entropy remain, and the precision of the pruned convolutional neural network is higher;
(2) the invention continuously performs soft pruning during training, and can more accurately find the filters with lower true importance, thereby pruning more accurately;
(3) the invention combines soft pruning and hard pruning and iterates over a plurality of periods, so that at the same model compression rate and acceleration rate, the pruned convolutional neural network achieves higher network precision; at the same network precision, the pruned convolutional neural network achieves a higher model compression rate and acceleration rate.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The technical terms appearing in the present invention are explained and illustrated below:
epoch (Epoch): one epoch is one complete pass of the entire data set forward through the neural network and back; in other words, 1 epoch equals training the neural network once using all the samples in the training set.
As shown in fig. 1, a heuristic filter pruning method in a convolutional neural network specifically includes the following steps:
(1) obtaining the adjusted cosine similarity of each filter of each convolutional layer of a convolutional neural network between the jth period (i.e., the jth epoch) and the (j-off)th period of the convolutional neural network;
wherein i is the number of a convolutional layer in the convolutional neural network, i ∈ [1, total number of convolutional layers in the convolutional neural network]; k is the number of a filter in the convolutional layer, k ∈ [1, total number of filters in the convolutional layer]; j > off, off being a preset threshold;
Specifically, the adjusted cosine similarity of each filter of each convolutional layer between the jth period and the (j-off)th period of the convolutional neural network is calculated according to the following substeps:
(1-1) separately obtaining the tensor parameter (hereinafter referred to as the first tensor parameter) of each filter of each convolutional layer of the convolutional neural network after the jth period, and the tensor parameter (hereinafter referred to as the second tensor parameter) of each filter of each convolutional layer after the (j-off)th period, and obtaining the average tensor parameter M_i of each convolutional layer from the tensor parameters of all the filters of the convolutional layer after the two periods:
wherein O_i is the total number of filters in the ith convolutional layer, and k ∈ [1, O_i]. The magnitude of off significantly influences the network precision after the convolutional neural network is pruned: if off is set too small, the interval during training is too short to accurately compare the trend and degree of change of the filters; if off is set too large, the filter tensor direction changes many times during training and cannot be captured finely, so the post-pruning network precision is severely degraded. The value range of off is [1, 10], preferably 2;
(1-2) using the average tensor parameter M_i of each convolutional layer obtained in step (1-1) to correct the first tensor parameter and the second tensor parameter, respectively, thereby obtaining the corrected first tensor parameter and the corrected second tensor parameter of each filter of the convolutional layer; specifically, the correction subtracts M_i from each tensor parameter;
(1-3) obtaining the adjusted cosine similarity of each filter of the convolutional layer between the jth period and the (j-off)th period of the convolutional neural network from the corrected first tensor parameter and the corrected second tensor parameter of each filter of the convolutional layer, where || · || represents the L2 norm;
The above steps have the advantage that the adjusted cosine similarity combines the metric characteristics of the Euclidean distance and the cosine similarity, characterizes the changes in direction and distance of the filter tensors during convolutional neural network training and updating, and therefore evaluates the importance of the filters more accurately.
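Substeps (1-1) to (1-3) above can be sketched in NumPy; this is a minimal illustration, assuming each layer's filters are stacked into an array of shape (O_i, ...), and the function name and the small denominator guard are illustrative rather than from the source:

```python
import numpy as np

def adjusted_cosine_similarity(filters_j, filters_joff):
    """Per-filter adjusted cosine similarity between period j and period j-off.

    filters_j, filters_joff: arrays of shape (O_i, ...) holding the tensor
    parameters of the O_i filters of one convolutional layer at the two
    periods.  Returns an array of O_i similarities.
    """
    O_i = filters_j.shape[0]
    a = filters_j.reshape(O_i, -1)
    b = filters_joff.reshape(O_i, -1)
    # Step (1-1): layer-wise average tensor parameter M_i over all filters
    # of both periods.
    M_i = (a.sum(axis=0) + b.sum(axis=0)) / (2.0 * O_i)
    # Step (1-2): correct both tensor parameters by subtracting M_i.
    a_c = a - M_i
    b_c = b - M_i
    # Step (1-3): cosine similarity of the corrected tensors via L2 norms.
    num = (a_c * b_c).sum(axis=1)
    den = np.linalg.norm(a_c, axis=1) * np.linalg.norm(b_c, axis=1)
    return num / np.maximum(den, 1e-12)  # guard against zero norms
```

For identical tensors at the two periods the similarity is 1; when the corrected directions are opposite it approaches -1, reflecting a large change in direction between the two periods.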
(2) for the first convolutional layer of the convolutional neural network, arranging all the filters in ascending order of the adjusted cosine similarity, obtained in step (1), of all filters in the convolutional layer between the jth period and the (j-off)th period of the convolutional neural network, and soft-pruning the sorted filters according to a preset pruning rate;
specifically, the soft pruning operation sets the weight parameters of the filter to zero;
(3) for the remaining convolutional layers of the convolutional neural network, repeating step (2) until all convolutional layers are soft-pruned, thereby obtaining the soft-pruning-updated convolutional neural network;
The advantage of steps (1) to (3) is that, since the filters of each convolutional layer receive the same input on each channel, a smaller adjusted cosine similarity means that the filter is updated with a smaller amplitude (or not at all) during training, indicating that the filter's output carries less information for the same input data, i.e., the filter has lower information entropy; a larger adjusted cosine similarity means a larger difference in the change of distance and direction between the two periods relative to the other filters, and hence higher information entropy. Soft-pruning the filters with smaller adjusted cosine similarity therefore maximizes the total information entropy of the remaining filters. By contrast, existing heuristic algorithms evaluate filter importance by the L1 norm or the sparsity of feature maps and cannot reflect the dynamic change of the filters.
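The per-layer soft-pruning step of (2) can be sketched as follows, assuming NumPy arrays; the function name, the return convention, and the truncation of the pruned-filter count are illustrative assumptions:

```python
import numpy as np

def soft_prune_layer(weights, similarities, prune_rate):
    """Soft-prune one convolutional layer.

    weights: (O_i, ...) filter tensor; similarities: (O_i,) adjusted cosine
    similarities; prune_rate: fraction of filters to prune, in [0, 1).
    The filters with the smallest similarity are zeroed; the layer keeps
    its shape so the filters can recover during reconstruction.
    """
    O_i = weights.shape[0]
    n_prune = int(O_i * prune_rate)
    # Ascending sort: the filters with the smallest similarity come first.
    pruned_idx = np.argsort(similarities)[:n_prune]
    out = weights.copy()
    out[pruned_idx] = 0.0  # soft pruning: weights zeroed, filters kept
    return out, pruned_idx
```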
(4) reconstructing the soft-pruning-updated convolutional neural network for off + 1 periods to obtain a reconstructed convolutional neural network;
Specifically, in the soft-pruning-updated convolutional neural network of step (3), the weight parameters of some of the filters have been set to zero; after off + 1 periods of training, the weight parameters of the soft-pruned filters are updated to non-zero values again, which reconstructs the convolutional neural network and lays the groundwork for the next round of soft pruning.
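As a toy illustration of why the off + 1 reconstruction periods restore the soft-pruned filters, the snippet below applies plain gradient-descent updates to a weight vector whose second entry has been zeroed; the loss, learning rate, and target values are entirely hypothetical:

```python
import numpy as np

# A soft-pruned (zeroed) filter weight is still part of the network, so the
# ordinary gradient updates of the off + 1 reconstruction periods drive it
# back toward a non-zero value.
w = np.array([0.7, 0.0, -0.3])       # second entry was soft-pruned to zero
target = np.array([0.5, 0.4, -0.2])  # hypothetical optimum of the toy loss
lr = 0.1
for _ in range(3):                   # stands in for off + 1 training periods
    grad = 2.0 * (w - target)        # gradient of the quadratic toy loss
    w -= lr * grad
```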
(5) setting j to j + off + 1, repeating steps (1) to (4), and judging whether the network precision of the reconstructed convolutional neural network tends to be stable; if so, the convolutional neural network with stable soft pruning is obtained and the method proceeds to step (6); otherwise, this step continues to be repeated;
Steps (4) and (5) have the advantage that soft pruning is performed while the convolutional neural network continues to be trained until its precision is stable, without actually deleting any filter, so the capacity and representation ability of the convolutional neural network are maintained, and unimportant filters are found progressively, allowing the convolutional neural network to be pruned more accurately.
(6) Obtaining the adjusted cosine similarity of each filter of each convolution layer of the convolutional neural network after soft pruning stabilization between the (j + off + 1) th period and the (j + 1) th period of the convolutional neural network
(7) for the first convolutional layer of the convolutional neural network after soft pruning is stable, arranging all the filters in ascending order of the adjusted cosine similarity, obtained in step (6), of all filters in the convolutional layer between the (j+off+1)th period and the (j+1)th period of the convolutional neural network, and hard-pruning the sorted filters according to a preset pruning rate;
specifically, the hard pruning operation is to remove the filter;
(8) for the remaining convolutional layers of the convolutional neural network after the soft pruning is stable, repeating step (7) until all the convolutional layers are hard-pruned, thereby obtaining the hard-pruning-updated convolutional neural network;
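The per-layer hard-pruning step can be sketched analogously; unlike soft pruning, the layer actually shrinks (and, in a real network, the next layer's input channels must shrink accordingly, which is omitted here for brevity). The function name and return convention are illustrative assumptions:

```python
import numpy as np

def hard_prune_layer(weights, similarities, prune_rate):
    """Hard-prune one convolutional layer.

    The filters with the smallest adjusted cosine similarity are removed
    outright: the output-channel dimension shrinks from O_i to
    O_i - n_prune.  Returns the reduced weights and the kept indices.
    """
    O_i = weights.shape[0]
    n_prune = int(O_i * prune_rate)
    # Drop the n_prune smallest similarities; keep the original filter order.
    keep_idx = np.sort(np.argsort(similarities)[n_prune:])
    return weights[keep_idx], keep_idx
```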
(9) fine-tuning the hard-pruning-updated convolutional neural network until the network precision of the fine-tuned convolutional neural network reaches a stable value, thereby obtaining the pruned convolutional neural network;
Specifically, fine-tuning (fine-tune) trains the hard-pruning-updated convolutional neural network for an additional 10-20 periods.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.