CN112149803A - Channel pruning method suitable for deep neural network - Google Patents

Channel pruning method suitable for deep neural network

Info

Publication number
CN112149803A
CN112149803A CN202011002072.1A
Authority
CN
China
Prior art keywords
layer
pruning
channel
channels
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011002072.1A
Other languages
Chinese (zh)
Inventor
陈彦明
闻翔
施巍松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University filed Critical Anhui University
Priority to CN202011002072.1A priority Critical patent/CN112149803A/en
Publication of CN112149803A publication Critical patent/CN112149803A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of convolutional neural network compression, in particular to a channel pruning method suitable for a deep neural network. The invention can greatly reduce the storage space and the computation of a model without reducing its performance, yielding a lightweight neural network model that has promising applications in edge devices and vehicle-mounted systems.

Description

Channel pruning method suitable for deep neural network
Technical Field
The invention relates to the technical field of convolutional neural network compression, in particular to a channel pruning method suitable for a deep neural network.
Background
Current deep convolutional neural networks have achieved great success in fields such as image classification and target detection, but as network performance improves, network models become deeper and larger, and the required storage space and computation cost keep increasing. This makes deep neural network technology difficult to apply in daily life, especially on resource-constrained devices. In view of this, more and more attention is being paid to making a model occupy as few resources as possible without reducing its performance. To address this problem, the invention adopts a channel pruning method that reduces the size of the model while hardly affecting its performance.
In the field of deep neural network model compression, compression methods can be divided into four categories: weight quantization, tensor decomposition, knowledge distillation, and pruning. In most scenarios, weight quantization and tensor decomposition require special software and hardware support and cannot make use of basic linear algebra subprogram (BLAS) libraries. For knowledge distillation, the structure of the student network is difficult to determine and generally requires specialized manual design. Pruning is gaining increasing attention due to its simple concept and efficient performance.
In view of this, the present invention provides a channel pruning method suitable for a deep neural network.
Disclosure of Invention
The present invention is directed to a channel pruning method for a deep neural network, so as to solve the problems mentioned in the background art.
In order to achieve the purpose, the invention provides the following technical scheme:
a channel pruning method suitable for a deep neural network comprises the following steps:
step 1: in order to make unimportant channels easy to identify, L1 regularization is applied to the convolutional layer weights and the BN layer scaling factors respectively, and the network model is trained sparsely;
step 2: combining the convolutional layer weights and the BN layer scaling factors to obtain the channel importance vector S_l of the l-th convolutional layer (1 ≤ l ≤ L), and combining the channel importance vectors of all layers to obtain the global channel importance vector S;
and step 3: in the pruning process, if all the channels of a certain layer are judged to be unimportant, the pruning operation on that layer is cancelled for the current round, so that all the channels remaining after the previous pruning are retained.
Preferably, when the network model is trained sparsely, the objective function is optimized according to the following formula:
LOSS = L(f(X, W), y) + α1·R(W) + α2·R(γ),
R(W) = Σ_{l=1}^{L} ||W^l||_1,
R(γ) = Σ_{l=1}^{L} ||γ^l||_1,
where X represents the input data set, y represents the corresponding labels, W represents the set of trainable weights of all convolutional layers, W^l represents the weights of the l-th convolutional layer, γ^l denotes the scaling factors of the l-th layer, n_{i+1} is the number of output channels, n_i is the number of input channels, H is the convolution kernel height, G is the convolution kernel width, γ is the set of all BN layer scaling factors with each output channel corresponding to one scaling factor, and α1 and α2 are hyper-parameters that balance the normal LOSS function and the sparse regularization terms. The invention introduces L1 regularization on the convolutional layer weights and the BN layer scaling factors respectively to push unimportant channels toward zero, which makes it convenient to screen out unimportant channels jointly from the weights and the scaling factors and delete them.
Preferably, the single layer channel importance vector may utilize the following formula:
[Equation image: the single-layer channel importance vector S_l, computed from the l-th layer convolutional weights W^l and the BN scaling factors γ^l]
where S_l is a vector with n_{i+1} values, each value representing the importance of one output channel; because the convolutional layer is coupled with its BN layer, judging importance from a single layer alone may mistakenly delete an important channel.
Preferably, pruning evaluates channel importance in a global scope, so it may happen that all channels of a certain layer are evaluated as having low importance and the entire layer would be deleted during pruning, which would destroy the structure of the model; therefore, in the pruning process, if all channels of a layer would be deleted, the pruning operation on that layer is cancelled for the current round, so that all the channels remaining after the previous pruning are retained.
Compared with the prior art, the invention has the following beneficial effects: evaluating channel importance jointly from the convolutional layer weights and the BN layer scaling factors avoids judging importance from a single convolutional layer or BN layer alone, which ignores whether the corresponding scaling factor or channel is important; that is, a channel that appears unimportant in the convolutional layer may still have an important scaling factor, and its output feature map would then be deleted by mistake. The method provided by the invention can greatly reduce the storage space and the computation of the model without reducing its performance, yielding a lightweight neural network model that has promising applications in edge devices and vehicle-mounted systems.
Drawings
FIG. 1 is a display diagram of convolutional layer weight and BN layer scaling factors and channel importance vectors in the present invention;
FIG. 2 is a flow chart of a channel pruning method suitable for a deep neural network according to the present invention;
FIG. 3 is the pruning result of the proposed method of the present invention on the cifar-10 dataset using the VGG19_bn model;
FIG. 4 is the pruning result of the proposed method of the present invention on the cifar-100 dataset using the VGG19_bn model;
FIG. 5 is a diagram of GPU memory footprint during training before and after pruning on the cifar-10 and cifar-100 datasets using the VGG19_bn model in accordance with the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1-5, the present invention provides a technical solution:
examples
Considering the pruning process of the VGG19_bn network on the cifar-10 and cifar-100 data sets: VGG19_bn originally has 16 convolutional layers and 3 fully-connected layers. However, in current models the fully-connected layers are generally replaced by a global pooling layer, so the VGG19_bn model to be pruned has 16 convolutional layers and 1 fully-connected layer.
The cifar-10 data set comprises 50000 training pictures and 10000 test pictures in 10 classes; each class has 5000 pictures in the training set and 1000 pictures in the test set. The cifar-100 data set also consists of 50000 training pictures and 10000 test pictures. Unlike cifar-10, cifar-100 has 100 classes, each class having 500 pictures in the training set and 100 pictures in the test set.
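The patent does not name a software framework; the following is a minimal data-loading sketch under the assumption that PyTorch/torchvision is used (the mini-batch size of 64 matches the experiments reported later, while the augmentation choices are illustrative only):

```python
import torch
import torchvision
import torchvision.transforms as T

transform = T.Compose([
    T.RandomCrop(32, padding=4),   # common CIFAR augmentation (assumed, not stated in the patent)
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])

# swap CIFAR10 for CIFAR100 to reproduce the cifar-100 experiments
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                        download=True, transform=T.ToTensor())

train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=64, shuffle=False)
```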
A channel pruning method suitable for a deep neural network comprises the following steps:
step 1, initializing weight parameters and setting values of hyper-parameters in a model.
Step 2: to make it convenient to identify unimportant channels, L1 regularization is applied to the convolutional layer weights W and the BN layer scaling factors γ respectively, and the network model is trained sparsely.
When the network model is trained by using a sparse mode, the target optimization function is as follows:
LOSS = L(f(X, W), y) + α1·R(W) + α2·R(γ),
R(W) = Σ_{l=1}^{L} ||W^l||_1,
R(γ) = Σ_{l=1}^{L} ||γ^l||_1,
where X represents the input data set, y represents the corresponding labels, W represents the set of trainable weights of all convolutional layers, W^l represents the weights of the l-th convolutional layer, γ^l denotes the scaling factors of the l-th layer, n_{i+1} is the number of output channels, n_i is the number of input channels, H is the convolution kernel height, G is the convolution kernel width, and γ is the set of all BN layer scaling factors, one scaling factor per output channel. α1 and α2 balance the normal LOSS function and the sparse regularization terms. L(f(X, W), y) is the normal LOSS function on the dataset X and R(·) represents the sparse regularization term; L1 regularization is selected in the present invention to push unimportant channels toward zero, in order to better screen out unimportant channels for deletion.
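A minimal sketch of this sparse-training objective, assuming a PyTorch implementation; the function name and the default values of alpha1 and alpha2 (taken from the experiments reported later) are illustrative only:

```python
import torch
import torch.nn as nn

def sparse_loss(model, outputs, targets, alpha1=1e-6, alpha2=1e-4):
    """LOSS = L(f(X, W), y) + alpha1 * R(W) + alpha2 * R(gamma), with R(.) as the L1 norm."""
    ce = nn.functional.cross_entropy(outputs, targets)          # normal loss L(f(X, W), y)
    r_w = sum(m.weight.abs().sum() for m in model.modules()
              if isinstance(m, nn.Conv2d))                      # R(W): L1 on conv weights
    r_gamma = sum(m.weight.abs().sum() for m in model.modules()
                  if isinstance(m, nn.BatchNorm2d))             # R(gamma): L1 on BN scaling factors
    return ce + alpha1 * r_w + alpha2 * r_gamma

# usage inside a training step:
#   loss = sparse_loss(model, model(images), labels)
#   loss.backward(); optimizer.step()
```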
Step 3: combining the convolutional layer weights and the BN layer scaling factors, calculate the channel importance vector S_l of the l-th layer (1 ≤ l ≤ L), and combine the channel importance vectors of all layers to obtain the global channel importance vector S. The channel importance vector is calculated with the following formula:
[Equation images: the per-layer channel importance vector S_l, computed from the l-th layer convolutional weights W^l and the BN scaling factors γ^l, and the global channel importance vector S formed from all S_l]
where S_l is a vector with n_{i+1} values, each value representing the importance of one output channel. After the channel importance has been calculated, the importance of each channel is judged on a global scale.
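The exact per-channel formula appears only as an equation image in the original; the sketch below assumes, for illustration, that each output channel's score is the product of its filter's L1 norm and the absolute value of its BN scaling factor, and that every convolutional layer is immediately followed by a BN layer (PyTorch, hypothetical helper names):

```python
import torch
import torch.nn as nn

def layer_importance(conv, bn):
    """One importance score per output channel of a conv layer (assumed combination:
    L1 norm of the filter multiplied by the absolute BN scaling factor)."""
    w_l1 = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # per-filter L1 norm
    gamma = bn.weight.detach().abs()                       # |gamma| of the following BN layer
    return w_l1 * gamma                                    # S_l, length n_{i+1}

def global_importance(model):
    """Concatenate the per-layer vectors S_l into the global importance vector S."""
    convs = [m for m in model.modules() if isinstance(m, nn.Conv2d)]
    bns = [m for m in model.modules() if isinstance(m, nn.BatchNorm2d)]
    per_layer = [layer_importance(c, b) for c, b in zip(convs, bns)]  # assumes conv->BN pairing
    return torch.cat(per_layer), per_layer, list(zip(convs, bns))
```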
Step 4: sort the values of the global channel importance vector S by magnitude to obtain a new channel importance vector S_sort.
Step 5: set the pruning rate P and delete the channels corresponding to the smallest P% of the values in S_sort to obtain a new model.
As shown in fig. 1, the grey channels represent unimportant channels that will be deleted. When an unimportant channel of a convolutional layer is deleted, the corresponding scaling factor in the BN layer and the corresponding output feature map are also deleted, and so is the corresponding input channel of the next convolutional layer. Pruning evaluates channel importance on a global scale, so it may happen that all channels of a certain layer are evaluated as having low importance and the entire layer would be deleted during pruning, which would destroy the structure of the model. The invention addresses this problem in a simple and efficient manner: during pruning, if all channels of a layer would be deleted, the pruning operation on that layer is cancelled, so that the channels remaining after the previous pruning are retained.
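A sketch of steps 4-5 together with this whole-layer safeguard, again assuming PyTorch; `prune_rate` is the pruning rate P expressed as a fraction, and `prev_masks` holds the boolean channel masks left by the previous pruning round (all-ones before the first round):

```python
import torch

def build_masks(per_layer_scores, prev_masks, prune_rate):
    """Steps 4-5 plus the safeguard: sort the global importance vector S, remove the
    channels whose scores fall into the smallest `prune_rate` fraction, but never
    empty a layer completely."""
    s_global = torch.cat(per_layer_scores)
    s_sort, _ = torch.sort(s_global)                          # S_sort, ascending
    threshold = s_sort[int(prune_rate * s_global.numel())]    # smallest P% are pruned
    masks = []
    for s_l, prev in zip(per_layer_scores, prev_masks):
        mask = s_l > threshold                                # True = keep this output channel
        if mask.sum() == 0:                                   # the entire layer would be deleted
            mask = prev.clone()                               # cancel pruning for this layer and
        masks.append(mask)                                    # keep last round's channels instead
    return masks
```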
Step 6: assign values to the parameters of the new model by copying, from the old model, the values of the channels that correspond to the channels kept in the new model.
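A sketch of this weight-copying step for one convolutional layer and its BN layer, assuming PyTorch; `out_idx` and `in_idx` are index tensors of the kept output channels of the current and previous layer (for example obtained from the boolean masks above with `mask.nonzero().flatten()`):

```python
import torch

def copy_conv(old_conv, new_conv, out_idx, in_idx):
    """Copy the kept output channels (and the matching input channels) of the old
    conv layer into the new, narrower conv layer."""
    w = old_conv.weight.data.index_select(0, out_idx)          # kept output channels
    w = w.index_select(1, in_idx)                              # kept input channels
    new_conv.weight.data.copy_(w)
    if old_conv.bias is not None:
        new_conv.bias.data.copy_(old_conv.bias.data.index_select(0, out_idx))

def copy_bn(old_bn, new_bn, out_idx):
    """The BN scaling factors, biases and running statistics follow the kept channels."""
    new_bn.weight.data.copy_(old_bn.weight.data.index_select(0, out_idx))
    new_bn.bias.data.copy_(old_bn.bias.data.index_select(0, out_idx))
    new_bn.running_mean.copy_(old_bn.running_mean.index_select(0, out_idx))
    new_bn.running_var.copy_(old_bn.running_var.index_select(0, out_idx))
```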
Step 7: fine-tune the new model to recover its accuracy.
Compared with one-shot pruning, pruning the network iteratively makes it much easier to recover the accuracy of the model. After each pruning round, the network is fine-tuned to recover its accuracy, and the fine-tuned network is taken as the network to be pruned in the next round. One-shot pruning is more likely to delete important channels of the network model, seriously damaging the accuracy in a way that fine-tuning cannot recover. In contrast, iterative pruning yields a smoother model, does not harm the performance of the network, and can even improve the accuracy after fine-tuning.
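A sketch of the iterative prune-and-fine-tune loop described here, reusing the helpers sketched above; `rebuild_and_copy`, `finetune_fn` and `acc_fn` are hypothetical placeholders for building the slimmer model, the fine-tuning loop and the evaluation routine:

```python
import torch

def iterative_prune(model, prune_rate, rounds, finetune_fn, acc_fn):
    """Steps 4-8: prune a fraction of the channels each round, then fine-tune to
    recover accuracy, instead of removing everything in a single shot."""
    prev_masks = None
    for r in range(rounds):
        _, per_layer_scores, _ = global_importance(model)        # step 3 helper above
        if prev_masks is None:                                    # before the first round
            prev_masks = [torch.ones_like(s, dtype=torch.bool) for s in per_layer_scores]
        masks = build_masks(per_layer_scores, prev_masks, prune_rate)  # steps 4-5 + safeguard
        model = rebuild_and_copy(model, masks)   # hypothetical: build slim model, copy weights (step 6)
        finetune_fn(model)                       # step 7: recover accuracy after each pruning
        print(f"round {r}: accuracy = {acc_fn(model):.2f}")
        prev_masks = masks
    return model
```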
Step 8: repeat steps 4-7 according to how much the accuracy has dropped, to obtain the final lightweight model.
The pruning procedure is shown in FIG. 2.
Experiment of
The effects of the present invention can be further illustrated by the following experiments.
In the experiments, the proposed method is used to prune the VGG19_bn model. All initialized networks are trained from scratch using stochastic gradient descent (SGD), with the weight decay set to 10^-4, the Nesterov momentum set to 0.9, and the initial learning rate set to 0.1. On the CIFAR datasets we train for 160 epochs with a mini-batch size of 64, dividing the learning rate by 10 at epochs 80 and 120. During sparse training, the hyper-parameters α1 and α2 are the terms used to balance the normal LOSS function and the sparse regularization; we set their values empirically, and on VGG we set α1 = 10^-6, α2 = 10^-4.
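A sketch of this training configuration, assuming PyTorch; `make_vgg19_bn` is a hypothetical model constructor standing in for the modified VGG19_bn described above:

```python
import torch

model = make_vgg19_bn(num_classes=10)          # hypothetical constructor for the modified VGG19_bn
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                            nesterov=True, weight_decay=1e-4)   # SGD, weight decay 10^-4
# 160 epochs in total, learning rate divided by 10 at epochs 80 and 120
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[80, 120], gamma=0.1)
alpha1, alpha2 = 1e-6, 1e-4                     # sparsity weights used for VGG in the experiments
```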
After the sparse training, the channel importance vectors are obtained and, following steps 4 and 5, the corresponding number of channels is deleted according to the set proportion.
FIG. 3 shows the pruning effect of VGG19_bn on the cifar-10 dataset. It can be seen from the figure that, on cifar-10, when the invention reduces the computation by 48.92%, the accuracy of the model is not only not reduced but improved by 0.07%. When the computation is reduced by 85.50%, only 0.46M (million) parameters are retained, and the accuracy of the model drops by only 0.35%.
FIG. 4 shows the pruning effect of VGG19_bn on the cifar-100 dataset; cifar-100 has more classes and is more difficult to prune than cifar-10. It can be seen from the figure that when the computation is reduced by 42.48%, the accuracy of the model is improved by 0.85%. When the computation is reduced by 62.10%, only 2.49M parameters are retained, and the accuracy of the model drops by only 0.22%.
FIG. 5 shows the GPU memory usage during training before and after pruning. It can be seen that the pruned model occupies fewer memory resources, which is very important for devices with limited resources.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (4)

1. A channel pruning method suitable for a deep neural network, characterized by comprising the following steps:
step 1: in order to make unimportant channels easy to identify, L1 regularization is applied to the convolutional layer weights and the BN layer scaling factors respectively, and the network model is trained sparsely;
step 2: combining the convolutional layer weights and the BN layer scaling factors to obtain the channel importance vector S_l of the l-th layer (1 ≤ l ≤ L), and combining the channel importance vectors of all layers to obtain the global channel importance vector S;
step 3: in the pruning process, if all the channels of a certain layer are judged to be unimportant, the pruning operation on that layer is cancelled for the current round, so that all the channels remaining after the previous pruning are retained.
2. The channel pruning method suitable for the deep neural network according to claim 1, wherein: when the network model is trained sparsely, the objective function is optimized according to the following formula:
LOSS = L(f(X, W), y) + α1·R(W) + α2·R(γ),
R(W) = Σ_{l=1}^{L} ||W^l||_1,
R(γ) = Σ_{l=1}^{L} ||γ^l||_1,
where X represents the input data set, y represents the corresponding labels, W represents the set of trainable weights of all convolutional layers, W^l represents the weights of the l-th convolutional layer, γ^l denotes the scaling factors of the l-th layer, n_{i+1} is the number of output channels, n_i is the number of input channels, H is the convolution kernel height, G is the convolution kernel width, γ is the set of all BN layer scaling factors with each output channel corresponding to one scaling factor, α1 and α2 are hyper-parameters to balance the normal LOSS function and the sparse regularization terms, L(f(X, W), y) is the normal LOSS function on the dataset X, and R(·) represents a sparse regularization term; L1 regularization is introduced on the convolutional layer weights and the BN layer scaling factors respectively to push unimportant channels toward zero, so that unimportant channels are screened out jointly from the weights and the scaling factors and deleted.
3. The channel pruning method suitable for the deep neural network according to claim 1, wherein: the single layer channel importance vector may utilize the following formula:
[Equation image: the single-layer channel importance vector S_l, computed from the l-th layer convolutional weights W^l and the BN scaling factors γ^l]
where S_l is a vector with n_{i+1} values, each value representing the importance of one output channel; because the convolutional layer is coupled with its BN layer, judging importance from a single layer alone may mistakenly delete an important channel.
4. The channel pruning method suitable for the deep neural network according to claim 1, wherein: pruning evaluates channel importance on a global scale, so it may happen that all channels of a certain layer are evaluated as having low importance and the entire layer would be deleted during pruning, which would destroy the structure of the model; therefore, in the pruning process, if all channels of a layer would be deleted, the pruning operation on that layer is cancelled for the current round, so that all the channels remaining after the previous pruning are retained.
CN202011002072.1A 2020-09-22 2020-09-22 Channel pruning method suitable for deep neural network Pending CN112149803A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011002072.1A CN112149803A (en) 2020-09-22 2020-09-22 Channel pruning method suitable for deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011002072.1A CN112149803A (en) 2020-09-22 2020-09-22 Channel pruning method suitable for deep neural network

Publications (1)

Publication Number Publication Date
CN112149803A true CN112149803A (en) 2020-12-29

Family

ID=73896117

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011002072.1A Pending CN112149803A (en) 2020-09-22 2020-09-22 Channel pruning method suitable for deep neural network

Country Status (1)

Country Link
CN (1) CN112149803A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837381A (en) * 2021-09-18 2021-12-24 杭州海康威视数字技术股份有限公司 Network pruning method, device, equipment and medium for deep neural network model
CN113837381B (en) * 2021-09-18 2024-01-05 杭州海康威视数字技术股份有限公司 Network pruning method, device, equipment and medium of deep neural network model
CN114154626A (en) * 2021-12-14 2022-03-08 中国人民解放军国防科技大学 Deep neural network filter pruning method based on filter weight comprehensive evaluation
CN114154626B (en) * 2021-12-14 2022-08-16 中国人民解放军国防科技大学 Filter pruning method for image classification task

Similar Documents

Publication Publication Date Title
CN109657156B (en) Individualized recommendation method based on loop generation countermeasure network
CN108765506B (en) Layer-by-layer network binarization-based compression method
CN109002889B (en) Adaptive iterative convolution neural network model compression method
CN111582397B (en) CNN-RNN image emotion analysis method based on attention mechanism
Bhandari et al. Optimal sub-band adaptive thresholding based edge preserved satellite image denoising using adaptive differential evolution algorithm
CN110428045A (en) Depth convolutional neural networks compression method based on Tucker algorithm
CN112365514A (en) Semantic segmentation method based on improved PSPNet
CN112016674A (en) Knowledge distillation-based convolutional neural network quantification method
CN112149803A (en) Channel pruning method suitable for deep neural network
CN111415323B (en) Image detection method and device and neural network training method and device
CN112418261B (en) Human body image multi-attribute classification method based on prior prototype attention mechanism
CN111833322B (en) Garbage multi-target detection method based on improved YOLOv3
CN113112446A (en) Tunnel surrounding rock level intelligent judgment method based on residual convolutional neural network
CN114511576B (en) Image segmentation method and system of scale self-adaptive feature enhanced deep neural network
CN109949200B (en) Filter subset selection and CNN-based steganalysis framework construction method
CN112488070A (en) Neural network compression method for remote sensing image target detection
WO2020260656A1 (en) Pruning and/or quantizing machine learning predictors
CN115035418A (en) Remote sensing image semantic segmentation method and system based on improved deep LabV3+ network
CN111582091A (en) Pedestrian identification method based on multi-branch convolutional neural network
CN112183742A (en) Neural network hybrid quantization method based on progressive quantization and Hessian information
Lacey et al. Stochastic layer-wise precision in deep neural networks
CN114283088A (en) Low-dose CT image noise reduction method and device
CN114819143A (en) Model compression method suitable for communication network field maintenance
WO2020165490A1 (en) A method, an apparatus and a computer program product for video encoding and video decoding
CN114882278A (en) Tire pattern classification method and device based on attention mechanism and transfer learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (Application publication date: 20201229)