CN112633472A - Convolutional neural network compression method based on channel pruning - Google Patents
- Publication number
- CN112633472A (application number CN202011505386.3A)
- Authority
- CN
- China
- Prior art keywords
- model
- pruning
- channel
- method based
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a convolutional neural network compression method based on channel pruning, which comprises the following steps: within each convolutional layer, channels are selected using a criterion based on the average activation of feature maps; between convolutional layers, channels are selected using a criterion based on loss estimation; and the model is fine-tuned whenever its accuracy drops. The invention achieves adaptive pruning across channels while controlling the overall pruning proportion, and obtains a good pruning effect.
Description
Technical Field
The invention belongs to the field of computer vision, relates to a deep learning technology, and particularly relates to a convolutional neural network compression method based on channel pruning.
Background
In recent years, deep learning has developed rapidly in the vision field, and increasingly accurate models have been proposed. However, as models grow deeper, their computation and storage requirements grow accordingly. For example, VGG16 has more than 130 million parameters and requires nearly 15 billion floating-point operations to complete a single image recognition task. Practical applications often face resource constraints, so network models must be compressed and accelerated.
Many methods exist for model compression and acceleration, and channel pruning of convolutional layers is among the most common. Channel pruning does not alter the original model structure, and the surviving parameters of the original model can be copied directly into the new model without modification, so it is easy to implement and does not depend on specific hardware or third-party libraries. However, pruning is a coarse operation that can easily damage the generalization ability of the model and cause its quality to degrade rapidly. Existing pruning methods usually either ignore the sensitivity differences between convolutional layers or require the pruning proportion of each layer to be set by hand from experience, so the pruning effect is unsatisfactory, flexibility is poor, and application is difficult.
Disclosure of Invention
To solve these problems, the invention discloses a convolutional neural network compression method based on channel pruning. By using feature-map activation as the pruning criterion within layers and loss estimation as the criterion between layers, the method fully accounts for the different sensitivities of the convolutional layers and achieves adaptive pruning while controlling the overall pruning proportion.
In order to achieve the purpose, the technical scheme of the invention is as follows:
a convolutional neural network compression method based on channel pruning comprises the following steps:
step one, calculating the average activation of all convolutional-layer feature maps in the model, and sorting the feature maps of each layer by average activation;
step two, among the feature maps with the smallest average activation in each convolutional layer, selecting the one with the least influence on the final model loss, and pruning the corresponding channel from the model;
step three, judging whether the accuracy of the model has fallen below a threshold T; if so, fine-tuning the model until it approximately converges, and returning to step one;
step four, judging whether the pruning proportion has reached the preset proportion R; if not, returning to step two;
step five, fine-tuning the model again to recover its accuracy.
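The five steps above can be sketched as the following loop. All arguments are hypothetical placeholder callables standing in for framework-specific operations, not part of the patent itself:

```python
def channel_pruning(rank, prune_one, accuracy, finetune, ratio, T=0.90, R=0.50):
    """Skeleton of the five-step procedure (placeholder callables):

    rank()      - step 1: recompute per-layer channel orderings
    prune_one() - step 2: prune the single least-costly channel
    accuracy()  - step 3: model accuracy on the test set
    finetune()  - step 3/5: retrain briefly at a small learning rate
    ratio()     - step 4: current pruning proportion of the model
    """
    rank()                          # step 1
    while ratio() < R:              # step 4: stop at the preset proportion
        prune_one()                 # step 2
        if accuracy() < T:          # step 3: accuracy dropped too far
            finetune()
            rank()                  # re-rank activations after fine-tuning
    finetune()                      # step 5: final recovery
```

Note that re-ranking happens only after a fine-tuning pass, matching the return to step one in step three.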
Further, the first step includes the following processes:
A portion of the training set of the dataset is divided off as a sample set; let the number of samples be N and the convolutional-layer input matrix be a ∈ R^(N×H×W×C), where H, W and C are the height, width and number of channels of the feature map. The average activation of the feature map corresponding to the k-th channel is then:

ā_k = (1 / (N·H·W)) · Σ_{n=1..N} Σ_{i=1..H} Σ_{j=1..W} a_{n,i,j,k}
the C channels of each layer are ordered according to the average activation of the signature.
Further, the second step includes the following processes:
Let S^(l) denote the set of all channels of layer l, and k^(l) the channel with the smallest average feature-map activation in layer l, namely:

k^(l) = argmin_{k ∈ S^(l)} ā_k^(l)
Let K = {k^(1), k^(2), ..., k^(L)} be the set of channels with the smallest average activation in each layer, where L is the number of convolutional layers. The set K is maintained throughout pruning: each channel in K is tentatively pruned, the resulting change in model loss on the test set is evaluated, and the channel yielding the smallest model loss is finally selected as the pruning channel and pruned from the model.
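The tentative prune-and-evaluate selection over the candidate set K can be sketched as follows; `prune`, `restore` and `eval_loss` are hypothetical hooks standing in for the framework-specific channel surgery and test-set evaluation:

```python
def select_prune_channel(K, prune, restore, eval_loss):
    """Tentatively prune each candidate channel in K, evaluate the model
    loss on the test set, and permanently prune the one that hurts least."""
    best_ch, best_loss = None, float('inf')
    for ch in K:
        prune(ch)              # temporarily remove the channel
        loss = eval_loss()     # model loss with this channel gone
        restore(ch)            # put it back before trying the next one
        if loss < best_loss:
            best_ch, best_loss = ch, loss
    prune(best_ch)             # the winning channel is pruned for good
    return best_ch
```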
Further, the third step includes the following processes:
Test the accuracy of the pruned model on the test set of the dataset and judge whether it is below the preset threshold T. If it is below the threshold, fine-tune the model, i.e. retrain the convolutional layers at a small learning rate starting from the existing parameters. Early termination is adopted: training stops when the model loss has not decreased for several consecutive rounds, after which the procedure returns to the first step.
Further, the fourth step includes the following processes:
Calculate the pruning proportion of the model, namely:

r = (number of pruned channels) / (total number of channels in the original model)

If the pruning proportion is smaller than the preset proportion R, return to step two.
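Measuring the proportion as the fraction of channels removed across all convolutional layers (one plausible reading, since the patent's formula is given only as an image), the check can be sketched as:

```python
def pruning_ratio(pruned_per_layer, total_per_layer):
    """Fraction of the original model's channels that have been pruned,
    aggregated over all convolutional layers."""
    return sum(pruned_per_layer) / sum(total_per_layer)

def should_continue(pruned_per_layer, total_per_layer, R=0.50):
    """True while the pruning proportion is still below the preset R."""
    return pruning_ratio(pruned_per_layer, total_per_layer) < R
```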
Further, the fifth step includes the following processes:
In the fifth step, the whole model is fine-tuned for a larger number of rounds, using a method similar to the third step, to recover the model's accuracy.
The invention has the beneficial effects that:
the invention adopts two different channel selection standards, adopts a channel selection method based on characteristic diagram activation in the convolution layer, and adopts a channel selection method based on loss estimation between the convolution layers.
Drawings
Fig. 1 is a flowchart of a model compression method based on channel pruning according to the present invention.
FIG. 2 is a comparison of pruning results of the present invention on VGG model using Cifar-10 data set with other methods.
Detailed Description
The present invention will be further illustrated with reference to the accompanying drawings and specific embodiments, which are to be understood as merely illustrative of the invention and not as limiting the scope of the invention.
In this embodiment, TensorFlow is used as the deep learning framework to prune the convolutional layers of a VGG16 model trained on the Cifar-10 dataset; the flow is shown in FIG. 1. The embodiment adopts the following steps:
step 1, randomly selecting 100 pictures from a cifar-10 training set as a sample set, taking the sample set as input, and countingCalculating the output matrix of each convolutional layer, i.e. the input matrix a ∈ R of the next convolutional layerN×H×W×CH, W and C are the height, width, and number of channels, respectively, of the feature map. Calculating the average activation of the characteristic diagram corresponding to the kth channel of the convolutional layer as follows:
and sorting the C channels of each layer according to the average activation of the feature map, and maintaining a sorted channel set.
Step 2: poll all convolutional layers: for each layer, temporarily cut the channel whose feature map has the smallest average activation, test and record the loss on the Cifar-10 test set, and then restore the model. After all convolutional layers have been polled, select the channel with the smallest loss as the final channel, prune it, and update the set of convolutional-layer channels.
Step 3: compute the accuracy of the model on the Cifar-10 test set, with the threshold T set to 90%; if the accuracy is below 90%, train the pruned model. Training uses stochastic gradient descent with a fixed learning rate of 2.4e-5 and Nesterov momentum of 0.9. With the early-termination strategy, fine-tuning stops when the loss has not decreased for 20 consecutive rounds, the weights with the lowest loss are restored, and at most 250 rounds are run. After fine-tuning, return to step 1, recompute the average activations of the feature maps, and start the next round of pruning.
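The early-termination rule of step 3 (stop after 20 stagnant rounds, restore the lowest-loss weights, at most 250 rounds) can be sketched framework-independently; `train_round` is a hypothetical callable that runs one training round and returns the loss:

```python
def finetune(train_round, max_rounds=250, patience=20):
    """Run fine-tuning rounds until the loss has not decreased for
    `patience` consecutive rounds (or max_rounds is hit), and report the
    best round and loss so the corresponding weights can be restored."""
    best_loss, best_round, stale = float('inf'), -1, 0
    for i in range(max_rounds):
        loss = train_round(i)
        if loss < best_loss:
            best_loss, best_round, stale = loss, i, 0  # new best round
        else:
            stale += 1
            if stale >= patience:
                break                                  # early termination
    return best_round, best_loss
```

In a Keras-based setup the same behavior would typically be expressed with an early-stopping callback that restores the best weights, but the loop above keeps the logic explicit.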
Step 4: calculate the pruning proportion of the model, namely:

r = (number of pruned channels) / (total number of channels in the original model)
and (3) setting the preset pruning proportion R to be 50%, and if the pruning proportion of the model is less than 50%, returning to the step (2) to continue pruning.
Step 5: fine-tune the model with the same parameter settings as in step 3 to recover its accuracy.
To illustrate the superiority of the present invention, we also compare its pruning results on the VGG model with those of other pruning methods on the Cifar-10 dataset, including the greedy search method (ThiNet), the LASSO-regression reconstruction method (CP), the batch-normalization constraint method (Slimming), the discrimination-aware method (DCP), and the width-scaling method (WM). The comparison results are shown in FIG. 2.
It can be seen that ThiNet, CP and DCP all clip a fixed ratio at each layer, i.e. the number of parameters and floating-point operations are each about 50% of the original; this approach is relatively inflexible, and choosing the clipping ratio in practice requires experience. Both Slimming and the present invention can adopt a different clipping proportion in each layer, but the present invention can additionally control the overall pruning proportion on top of the adaptive pruning. From the pruning results, Slimming has the highest parameter compression rate, while DCP performs best in accuracy, even improving on the reference model by 0.17%. It should also be noted that DCP is a training-stage pruning method that requires full knowledge of the dataset and trains the model from scratch, making it the most complex. In conclusion, the present invention achieves very high compression rates and the most balanced performance at the cost of only a small loss of accuracy (-0.42%).
The technical means disclosed by the invention are not limited to those disclosed in the above embodiments, but also include technical schemes formed by any combination of the above technical features. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the invention, and such improvements and modifications are also considered to be within the scope of the invention.
Claims (5)
1. A convolutional neural network model compression method based on channel pruning is characterized by comprising the following steps:
step one, calculating the average activation of all convolutional-layer feature maps in the model, and sorting the feature maps of each layer by average activation;
step two, among the feature maps with the smallest average activation in each convolutional layer, selecting the one with the least influence on the final model loss, and pruning the corresponding channel from the model;
step three, judging whether the accuracy of the model has fallen below a threshold T; if so, fine-tuning the model until it converges, and returning to step one;
step four, judging whether the pruning proportion has reached the preset proportion R; if not, returning to step two; otherwise, executing step five;
step five, fine-tuning the model again to recover its accuracy.
2. The convolutional neural network model compression method based on channel pruning as claimed in claim 1, wherein the calculation method of the mean activation of feature maps in the first step is as follows:
assuming that the number of samples is N and the convolutional-layer input matrix is a ∈ R^(N×H×W×C), where H, W and C are the height, width and number of channels of the feature map, the average activation of the feature map corresponding to the k-th channel is:

ā_k = (1 / (N·H·W)) · Σ_{n=1..N} Σ_{i=1..H} Σ_{j=1..W} a_{n,i,j,k}
3. the convolutional neural network model compression method based on channel pruning as claimed in claim 1, wherein the specific process of the second step is:
let S^(l) denote the set of all channels of layer l, and k^(l) the channel with the smallest average feature-map activation in layer l, namely:

k^(l) = argmin_{k ∈ S^(l)} ā_k^(l)
let K = {k^(1), k^(2), ..., k^(L)} be the set of channels with the smallest average activation in each layer, where L is the number of convolutional layers; maintain the set K during pruning, tentatively prune each channel, evaluate the change in model loss on the test set, and finally select the channel yielding the smallest model loss as the pruned channel.
4. The convolutional neural network model compression method based on channel pruning as claimed in claim 1, wherein in the third step, the method for fine tuning the model is as follows:
training all the convolutional layers of the pruned model at a small learning rate; using the early-termination method, training is stopped when the model loss no longer decreases within a fixed number of rounds.
5. The convolutional neural network model compression method based on channel pruning as claimed in claim 1, wherein the fine tuning method adopted in the fifth step is the same as that adopted in the third step.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011505386.3A CN112633472A (en) | 2020-12-18 | 2020-12-18 | Convolutional neural network compression method based on channel pruning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112633472A true CN112633472A (en) | 2021-04-09 |
Family
ID=75317190
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011505386.3A Pending CN112633472A (en) | 2020-12-18 | 2020-12-18 | Convolutional neural network compression method based on channel pruning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112633472A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114881227A (en) * | 2022-05-13 | 2022-08-09 | 北京百度网讯科技有限公司 | Model compression method, image processing method, device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||