Disclosure of Invention
Aiming at the defects or improvement requirements of the prior art, the present invention provides a heuristic filter pruning method and system for a convolutional neural network, and aims to solve the technical problem that the pruned convolutional neural network suffers severe precision loss because existing methods do not consider the dynamic changes in the distance and direction of the filters during convolutional neural network training.
To achieve the above object, according to an aspect of the present invention, there is provided a heuristic filter pruning method in a convolutional neural network, which includes the following steps:
(1) obtaining the adjusted cosine similarity of each filter of each convolutional layer of a convolutional neural network between the jth period and the (j-off)th period of the convolutional neural network;
wherein i is the number of a convolutional layer in the convolutional neural network, i ∈ [1, total number of convolutional layers in the convolutional neural network]; k is the number of a filter in the convolutional layer, k ∈ [1, total number of filters in the convolutional layer]; j > off, off being a preset threshold;
(2) for the first convolutional layer of the convolutional neural network, arranging all the filters in ascending order of the adjusted cosine similarity, obtained in step (1), of all filters in the convolutional layer between the jth period and the (j-off)th period of the convolutional neural network, and soft-pruning the sorted filters according to a preset pruning rate;
(3) for the remaining convolutional layers of the convolutional neural network, repeating step (2) until all convolutional layers are soft-pruned, thereby obtaining the soft-pruning-updated convolutional neural network;
(4) reconstructing the soft-pruning-updated convolutional neural network for off + 1 periods to obtain a reconstructed convolutional neural network;
(5) setting j to j + off + 1, repeating steps (1) to (4), and judging whether the network precision of the reconstructed convolutional neural network tends to be stable; if so, the convolutional neural network with stable soft pruning is obtained and the method proceeds to step (6); otherwise, this step continues to be repeated;
(6) obtaining the adjusted cosine similarity of each filter of each convolutional layer of the convolutional neural network after soft pruning is stable, between the (j+off+1)th period and the (j+1)th period of the convolutional neural network;
(7) for the first convolutional layer of the convolutional neural network after soft pruning is stable, arranging all the filters in ascending order of the adjusted cosine similarity, obtained in step (6), of all filters in the convolutional layer between the (j+off+1)th period and the (j+1)th period of the convolutional neural network, and hard-pruning the sorted filters according to a preset pruning rate;
(8) for the remaining convolutional layers of the convolutional neural network after the soft pruning is stable, repeating step (7) until all the convolutional layers are hard-pruned, thereby obtaining the hard-pruning-updated convolutional neural network;
(9) fine-tuning the hard-pruning-updated convolutional neural network until the network precision of the fine-tuned convolutional neural network reaches a stable value, thereby obtaining the pruned convolutional neural network.
Preferably, the adjusted cosine similarity of each filter of each convolutional layer between the jth period and the (j-off)th period of the convolutional neural network is calculated according to the following substeps:
(1-1) separately obtaining the tensor parameter (hereinafter referred to as the first tensor parameter) of each filter of each convolutional layer of the convolutional neural network after the jth period, and the tensor parameter (hereinafter referred to as the second tensor parameter) of each filter of each convolutional layer after the (j-off)th period, and obtaining the average tensor parameter M_i of each convolutional layer from the tensor parameters of all the filters of the convolutional layer after the two periods;
(1-2) using the average tensor parameter M_i of each convolutional layer obtained in step (1-1) to correct the first tensor parameter and the second tensor parameter, respectively, thereby obtaining the corrected first tensor parameter and the corrected second tensor parameter of each filter of the convolutional layer;
(1-3) obtaining the adjusted cosine similarity of each filter of the convolutional layer between the jth period and the (j-off)th period of the convolutional neural network from the corrected first tensor parameter and the corrected second tensor parameter of each filter of the convolutional layer.
Preferably, the average tensor parameter M_i of each convolutional layer is the mean of the 2·O_i tensor parameters of all the filters of the ith convolutional layer over the two periods, wherein O_i is the total number of filters in the ith convolutional layer, and k ∈ [1, O_i].
Preferably, the corrected first tensor parameter and the corrected second tensor parameter of each filter of each convolutional layer are obtained by subtracting the average tensor parameter M_i of the convolutional layer from the first tensor parameter and the second tensor parameter, respectively.
preferably, each filter of each convolutional layer adjusts cosine similarity between the jth and jth-off periods of the convolutional neural network
Comprises the following steps:
where | | | represents the calculation of the L2 norm.
Preferably, the soft pruning operation sets the weight parameters of the filter to zero, and the hard pruning operation removes the filter.
Preferably, the reconstruction trains the soft-pruning-updated convolutional neural network for off + 1 periods, and the fine-tuning trains the hard-pruning-updated convolutional neural network for an additional 10-20 periods.
According to another aspect of the present invention, there is provided a heuristic filter pruning system in a convolutional neural network, comprising the following modules:
a first module for obtaining the adjusted cosine similarity of each filter of each convolutional layer of the convolutional neural network between the jth period and the (j-off)th period of the convolutional neural network;
wherein i is the number of a convolutional layer in the convolutional neural network, i ∈ [1, total number of convolutional layers in the convolutional neural network]; k is the number of a filter in the convolutional layer, k ∈ [1, total number of filters in the convolutional layer]; j > off, off being a preset threshold;
a second module for, for the first convolutional layer of the convolutional neural network, arranging all the filters in ascending order of the adjusted cosine similarity, obtained by the first module, of all filters in the convolutional layer between the jth period and the (j-off)th period of the convolutional neural network, and soft-pruning the sorted filters according to a preset pruning rate;
a third module, configured to repeat the second module for the remaining convolutional layers of the convolutional neural network until all convolutional layers are soft-pruned, so as to obtain a soft-pruned updated convolutional neural network;
a fourth module for reconstructing the soft-pruning-updated convolutional neural network for off + 1 periods to obtain the reconstructed convolutional neural network;
a fifth module, configured to set j to j + off +1, repeat the first to fourth modules, and determine whether the network accuracy of the reconstructed convolutional neural network tends to be stable, if so, indicate that the convolutional neural network after soft pruning is stable is obtained, then enter the sixth module, otherwise, continue to execute the present module;
a sixth module, configured to obtain an adjusted cosine similarity of each filter of each convolutional layer of the convolutional neural network after soft pruning is stabilized between the (j + off + 1) th period and the (j + 1) th period of the convolutional neural network
a seventh module for, for the first convolutional layer of the convolutional neural network after soft pruning is stable, arranging all the filters in ascending order of the adjusted cosine similarity, obtained by the sixth module, of all filters in the convolutional layer between the (j+off+1)th period and the (j+1)th period of the convolutional neural network, and hard-pruning the sorted filters according to a preset pruning rate;
an eighth module, configured to repeat the seventh module for the remaining convolutional layers of the convolutional neural network after the soft pruning is stable until all convolutional layers are hard-pruned, so as to obtain a convolutional neural network after the hard pruning is updated;
and a ninth module for fine-tuning the hard-pruning-updated convolutional neural network until the network precision of the fine-tuned convolutional neural network reaches a stable value, thereby obtaining the pruned convolutional neural network.
In general, compared with the prior art, the above technical solution contemplated by the present invention can achieve the following beneficial effects:
(1) the adjusted cosine similarity in step (1) characterizes the changes in direction and distance of the filters during convolutional neural network training and updating, so the importance of the filters can be evaluated more accurately; soft pruning and hard pruning are performed according to the adjusted cosine similarity, so that filters with smaller information entropy are pruned while filters with larger information entropy remain, and the precision of the pruned convolutional neural network is higher;
(2) the invention continuously performs soft pruning during training, and can more accurately find the filters with lower true importance, thereby pruning more accurately;
(3) the invention combines soft pruning and hard pruning and iterates over a plurality of periods, so that at the same model compression rate and acceleration rate, the pruned convolutional neural network achieves higher network precision; at the same network precision, the pruned convolutional neural network achieves a higher model compression rate and acceleration rate.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The technical terms appearing in the present invention are explained and illustrated below:
epoch (Epoch): one epoch is one complete pass of the entire data set forward through the neural network and back; in other words, 1 epoch equals training the neural network once using all the samples in the training set.
As shown in fig. 1, a heuristic filter pruning method in a convolutional neural network specifically includes the following steps:
(1) obtaining the adjusted cosine similarity of each filter of each convolutional layer of a convolutional neural network between the jth period (i.e., the jth epoch) and the (j-off)th period of the convolutional neural network;
wherein i is the number of a convolutional layer in the convolutional neural network, i ∈ [1, total number of convolutional layers in the convolutional neural network]; k is the number of a filter in the convolutional layer, k ∈ [1, total number of filters in the convolutional layer]; j > off, off being a preset threshold;
Specifically, the adjusted cosine similarity of each filter of each convolutional layer between the jth period and the (j-off)th period of the convolutional neural network is calculated according to the following substeps:
(1-1) separately obtaining the tensor parameter (hereinafter referred to as the first tensor parameter) of each filter of each convolutional layer of the convolutional neural network after the jth period, and the tensor parameter (hereinafter referred to as the second tensor parameter) of each filter of each convolutional layer after the (j-off)th period, and obtaining the average tensor parameter M_i of each convolutional layer from the tensor parameters of all the filters of the convolutional layer after the two periods:
wherein O_i is the total number of filters in the ith convolutional layer, and k ∈ [1, O_i]. The magnitude of off significantly influences the network precision after the convolutional neural network is pruned: if off is set too small, the interval during training is too short to accurately compare the trend and degree of change of the filters; if off is set too large, the filter tensor direction changes many times during training and cannot be captured finely, so the post-pruning network precision is severely degraded. The value range of off is [1, 10], preferably 2;
(1-2) using the average tensor parameter M_i of each convolutional layer obtained in step (1-1) to correct the first tensor parameter and the second tensor parameter, respectively, thereby obtaining the corrected first tensor parameter and the corrected second tensor parameter of each filter of the convolutional layer; specifically, the correction subtracts M_i from each tensor parameter;
(1-3) obtaining the adjusted cosine similarity of each filter of the convolutional layer between the jth period and the (j-off)th period of the convolutional neural network from the corrected first tensor parameter and the corrected second tensor parameter of each filter of the convolutional layer, where || · || represents the L2 norm;
The above steps have the advantage that the adjusted cosine similarity combines the metric characteristics of the Euclidean distance and the cosine similarity, characterizes the changes in direction and distance of the filter tensors during convolutional neural network training and updating, and therefore evaluates the importance of the filters more accurately.
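Substeps (1-1) to (1-3) above can be sketched in NumPy; this is a minimal illustration, assuming each layer's filters are stacked into an array of shape (O_i, ...), and the function name and the small denominator guard are illustrative rather than from the source:

```python
import numpy as np

def adjusted_cosine_similarity(filters_j, filters_joff):
    """Per-filter adjusted cosine similarity between period j and period j-off.

    filters_j, filters_joff: arrays of shape (O_i, ...) holding the tensor
    parameters of the O_i filters of one convolutional layer at the two
    periods.  Returns an array of O_i similarities.
    """
    O_i = filters_j.shape[0]
    a = filters_j.reshape(O_i, -1)
    b = filters_joff.reshape(O_i, -1)
    # Step (1-1): layer-wise average tensor parameter M_i over all filters
    # of both periods.
    M_i = (a.sum(axis=0) + b.sum(axis=0)) / (2.0 * O_i)
    # Step (1-2): correct both tensor parameters by subtracting M_i.
    a_c = a - M_i
    b_c = b - M_i
    # Step (1-3): cosine similarity of the corrected tensors via L2 norms.
    num = (a_c * b_c).sum(axis=1)
    den = np.linalg.norm(a_c, axis=1) * np.linalg.norm(b_c, axis=1)
    return num / np.maximum(den, 1e-12)  # guard against zero norms
```

For identical tensors at the two periods the similarity is 1; when the corrected directions are opposite it approaches -1, reflecting a large change in direction between the two periods.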
(2) for the first convolutional layer of the convolutional neural network, arranging all the filters in ascending order of the adjusted cosine similarity, obtained in step (1), of all filters in the convolutional layer between the jth period and the (j-off)th period of the convolutional neural network, and soft-pruning the sorted filters according to a preset pruning rate;
specifically, the soft pruning operation sets the weight parameters of the filter to zero;
(3) for the remaining convolutional layers of the convolutional neural network, repeating step (2) until all convolutional layers are soft-pruned, thereby obtaining the soft-pruning-updated convolutional neural network;
The advantage of steps (1) to (3) is that, since the filters of each convolutional layer receive the same input on each channel, a smaller adjusted cosine similarity means that the filter is updated with a smaller amplitude (or not at all) during training, indicating that the filter's output carries less information for the same input data, i.e., the filter has lower information entropy; a larger adjusted cosine similarity means a larger difference in the change of distance and direction between the two periods relative to the other filters, and hence higher information entropy. Soft-pruning the filters with smaller adjusted cosine similarity therefore maximizes the total information entropy of the remaining filters. By contrast, existing heuristic algorithms evaluate filter importance by the L1 norm or the sparsity of feature maps and cannot reflect the dynamic change of the filters.
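The per-layer soft-pruning step of (2) can be sketched as follows, assuming NumPy arrays; the function name, the return convention, and the truncation of the pruned-filter count are illustrative assumptions:

```python
import numpy as np

def soft_prune_layer(weights, similarities, prune_rate):
    """Soft-prune one convolutional layer.

    weights: (O_i, ...) filter tensor; similarities: (O_i,) adjusted cosine
    similarities; prune_rate: fraction of filters to prune, in [0, 1).
    The filters with the smallest similarity are zeroed; the layer keeps
    its shape so the filters can recover during reconstruction.
    """
    O_i = weights.shape[0]
    n_prune = int(O_i * prune_rate)
    # Ascending sort: the filters with the smallest similarity come first.
    pruned_idx = np.argsort(similarities)[:n_prune]
    out = weights.copy()
    out[pruned_idx] = 0.0  # soft pruning: weights zeroed, filters kept
    return out, pruned_idx
```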
(4) reconstructing the soft-pruning-updated convolutional neural network for off + 1 periods to obtain a reconstructed convolutional neural network;
Specifically, in the soft-pruning-updated convolutional neural network of step (3), the weight parameters of some of the filters have been set to zero; after off + 1 periods of training, the weight parameters of the soft-pruned filters are updated to non-zero values again, which reconstructs the convolutional neural network and lays the groundwork for the next round of soft pruning.
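As a toy illustration of why the off + 1 reconstruction periods restore the soft-pruned filters, the snippet below applies plain gradient-descent updates to a weight vector whose second entry has been zeroed; the loss, learning rate, and target values are entirely hypothetical:

```python
import numpy as np

# A soft-pruned (zeroed) filter weight is still part of the network, so the
# ordinary gradient updates of the off + 1 reconstruction periods drive it
# back toward a non-zero value.
w = np.array([0.7, 0.0, -0.3])       # second entry was soft-pruned to zero
target = np.array([0.5, 0.4, -0.2])  # hypothetical optimum of the toy loss
lr = 0.1
for _ in range(3):                   # stands in for off + 1 training periods
    grad = 2.0 * (w - target)        # gradient of the quadratic toy loss
    w -= lr * grad
```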
(5) setting j to j + off + 1, repeating steps (1) to (4), and judging whether the network precision of the reconstructed convolutional neural network tends to be stable; if so, the convolutional neural network with stable soft pruning is obtained and the method proceeds to step (6); otherwise, this step continues to be repeated;
Steps (4) and (5) have the advantage that soft pruning is performed while the convolutional neural network continues to be trained until its precision is stable, without actually deleting any filter, so the capacity and representation ability of the convolutional neural network are maintained, and unimportant filters are found progressively, allowing the convolutional neural network to be pruned more accurately.
(6) Obtaining the adjusted cosine similarity of each filter of each convolution layer of the convolutional neural network after soft pruning stabilization between the (j + off + 1) th period and the (j + 1) th period of the convolutional neural network
(7) for the first convolutional layer of the convolutional neural network after soft pruning is stable, arranging all the filters in ascending order of the adjusted cosine similarity, obtained in step (6), of all filters in the convolutional layer between the (j+off+1)th period and the (j+1)th period of the convolutional neural network, and hard-pruning the sorted filters according to a preset pruning rate;
specifically, the hard pruning operation is to remove the filter;
(8) for the remaining convolutional layers of the convolutional neural network after the soft pruning is stable, repeating step (7) until all the convolutional layers are hard-pruned, thereby obtaining the hard-pruning-updated convolutional neural network;
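The per-layer hard-pruning step can be sketched analogously; unlike soft pruning, the layer actually shrinks (and, in a real network, the next layer's input channels must shrink accordingly, which is omitted here for brevity). The function name and return convention are illustrative assumptions:

```python
import numpy as np

def hard_prune_layer(weights, similarities, prune_rate):
    """Hard-prune one convolutional layer.

    The filters with the smallest adjusted cosine similarity are removed
    outright: the output-channel dimension shrinks from O_i to
    O_i - n_prune.  Returns the reduced weights and the kept indices.
    """
    O_i = weights.shape[0]
    n_prune = int(O_i * prune_rate)
    # Drop the n_prune smallest similarities; keep the original filter order.
    keep_idx = np.sort(np.argsort(similarities)[n_prune:])
    return weights[keep_idx], keep_idx
```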
(9) fine-tuning the hard-pruning-updated convolutional neural network until the network precision of the fine-tuned convolutional neural network reaches a stable value, thereby obtaining the pruned convolutional neural network;
Specifically, fine-tuning (fine-tune) trains the hard-pruning-updated convolutional neural network for an additional 10-20 periods.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.