CN113516240A - Neural network structured progressive pruning method and system - Google Patents
Neural network structured progressive pruning method and system
- Publication number
- CN113516240A (application number CN202110697462.3A)
- Authority
- CN
- China
- Prior art keywords
- neural network
- pruning
- layer
- training
- progressive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention relates to the field of computer vision, in particular to a neural network structured progressive pruning method and system, wherein the method comprises the following steps: step S1: setting the clipping rate and the pruning criterion of each layer of the neural network and the number of neural network training cycles; step S2: inputting pictures to train the neural network, gradually increasing the clipping rate of each layer from zero to the set clipping rate over a certain number of training cycles, and setting the redundant information of each layer, determined according to the pruning criterion, to 0; step S3: after the preset clipping rate is reached, removing the redundant information from the neural network and reconstructing the original network layers; step S4: after the neural network is reconstructed, continuing training until the set number of neural network training cycles is reached. The method is simple to operate and has few steps, achieves pruning during the normal neural network training process, and does not require a fine-tuning process after pruning, which greatly reduces processing time; compared with the prior art, it attains higher performance while achieving a higher pruning rate.
Description
Technical Field
The invention relates to the field of computer vision, in particular to a neural network structured progressive pruning method and system.
Background
At present, neural networks achieve good performance in the field of computer vision, in particular image classification and object detection, even exceeding human recognition capability. However, despite this high performance, these neural networks usually require large amounts of floating-point operations and storage; for example, for an input picture of size 224 × 224, ResNet-50 requires 4.1B floating-point operations and 25.6 MB of parameters. Such computation and storage demands require the operating platform to have substantial computing and storage resources. Therefore, these high-performance neural networks cannot be deployed on resource-limited platforms such as mobile phones and embedded devices.
In fact, a neural network contains a large amount of redundant information, and cutting out this redundant information, especially structured channels and filters, reduces the computation and memory required at run time. Existing neural network pruning methods need to train a complete neural network, set a fixed pruning rate for each layer after training is finished, identify redundant information according to a certain criterion and prune it, and finally fine-tune the pruned neural network. This pruning procedure is cumbersome and requires a long processing time.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention provides a neural network structured progressive pruning method and system, and the specific technical scheme is as follows:
a neural network structured progressive pruning method comprises the following steps:
step S1: setting the cutting rate and the pruning standard of each layer of the neural network and the training cycle number of the neural network;
step S2: inputting pictures to train the neural network, gradually increasing the cutting rate of each layer of the neural network from zero to the cutting rate set in the step S1 in a certain training period, selecting redundant information of each layer of the neural network according to the pruning standard determined in the step S1, and temporarily setting the value of the redundant information as 0;
step S3: after the set clipping rate is reached, removing the redundant information with the value of 0 processed in the step S2 from each layer of the neural network, and reconstructing the original neural network layer;
step S4: and after the neural network is reconstructed, continuing training until the set neural network training cycle number is reached, and obtaining the pruned lightweight neural network model after training is finished.
Preferably, the cutting rate of each layer of the neural network in step S1 is specifically: each network layer sets different positive values, or the network layers needing pruning in the network set the same positive values.
Preferably, the pruning criteria in step S1 include the Lp norm, random selection, and computation based on back-propagated gradients.
Preferably, one neural network training cycle in step S1 consists of training all pictures in the data set once, or training a certain number of pictures sampled from the data set once.
Preferably, the training period in step S2 is less than or equal to the number of neural network training periods set in step S1.
Preferably, the clipping rate of each layer of the neural network in step S2 gradually increases from zero to the clipping rate set in step S1, which can be formally expressed as

P_t = P + (P_0 − P) · (1 − (t − t_0) / (n·Δt))^k

wherein P_t represents the clipping rate of a layer at the current training cycle, P represents the clipping rate set in step S1, P_0 represents the set initial clipping rate of each layer, t represents the current training cycle number, t_0 represents the initial training cycle number, Δt represents the pruning interval, k represents the clipping-rate attenuation rate, and n takes any positive integer value.
Preferably, in step S2, the redundant information value is temporarily set to 0, specifically, the redundant channel or the redundant filter value in the neural network is set to 0.
Preferably, in step S3, specifically, when the redundant information is structured filter information, the filter whose value is 0 in the weight of the current network layer is removed; at the same time, the channel of the current network layer corresponding to the removed filter is removed, and the convolution kernel in the weight of the next network layer corresponding to the removed channel of the current network layer is removed;

and when the redundant information is structured channel information, the channel whose value is 0 in the current network layer is removed; at the same time, the filter in the current network layer corresponding to the removed channel is removed, and the convolution kernel in the weight of the next network layer corresponding to the removed channel of the current network layer is removed.
Preferably, in the lightweight neural network model obtained after pruning in step S4, if a batch normalization operation is used, the parameters of that operation are merged with the convolutional layer parameters of the preceding network layer.
A neural network structured progressive pruning system is characterized by comprising a parameter setting module, a progressive pruning module, a network reconstruction module and a continuous training module, wherein the parameter setting module is used for setting the pruning rate, the pruning standard and the neural network training cycle number of each layer of a neural network; the progressive pruning module is used for inputting pictures to train the neural network, gradually increasing the pruning rate of each layer of the neural network from zero to the pruning rate set by the parameter setting module in a certain training period, selecting redundant information of each layer of the neural network according to the pruning standard determined by the parameter setting module, and temporarily setting the value of the redundant information to be 0; the network reconstruction module is used for removing the redundant information with the value of 0 processed by the progressive pruning module from each layer of the neural network after the set cutting rate is reached and reconstructing an original neural network layer; and the continuous training module is used for continuously training the reconstructed model, and after the neural network is reconstructed, the training is continuously carried out until the set neural network training period number is reached, so that the light weight neural network model after pruning can be obtained after the training is finished.
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. the structured progressive pruning method is simple to operate and few in steps, and can achieve the purpose of pruning in the normal neural network training process;
2. the invention does not need the fine adjustment process after pruning, thereby greatly reducing the processing time;
3. compared with the prior art, the method can achieve higher performance while achieving higher cutting rate.
Drawings
FIG. 1 is a flow chart of a neural network structured progressive pruning method in an embodiment of the present invention;
FIG. 2 is a diagram of removing redundant structured filter information in an embodiment of the present invention;
FIG. 3 is a diagram of removing redundant structured channel information in an embodiment of the present invention;
FIG. 4 is a schematic diagram of a neural network structured progressive pruning system in an embodiment of the present invention;
in the figure, 21 is a parameter setting module, 22 is a progressive pruning module, 23 is a network reconstruction module, and 24 is a continuous training module.
Detailed Description
In order to make the objects, technical solutions and technical effects of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings.
The first embodiment is as follows:
as shown in fig. 1, a neural network structured progressive pruning method in this embodiment includes the following steps:
step S1: setting the cutting rate and the pruning standard of each layer of the neural network and the training cycle number of the neural network;
step S2: inputting pictures to train the neural network, gradually increasing the cutting rate of each layer of the neural network from zero to the cutting rate set in the step S1 in a certain training period, selecting redundant information of each layer of the neural network according to the pruning standard determined in the step S1, and temporarily setting the value of the redundant information as 0;
step S3: after the set clipping rate is reached, removing the redundant information with the value of 0 processed in the step S2 from each layer of the neural network, and reconstructing the original neural network layer;
step S4: and after the neural network is reconstructed, continuing training until the set neural network training cycle number is reached, and obtaining the pruned lightweight neural network model after training is finished.
As a preferred embodiment, the pruning rate of each layer of the neural network in step S1 may be set to a different positive value for each network layer, or may be set to the same positive value for the network layer that needs pruning in the network.
As a preferred embodiment, the pruning criteria in step S1 include the Lp norm, random selection, or computation based on back-propagated gradients.
In a preferred embodiment, one neural network training cycle in step S1 consists of training all pictures in the data set once, or training a certain number of pictures sampled from the data set once.
Step S2 includes both conventional neural network training and the step of determining, during training, the redundant structured information to be removed.
In a preferred embodiment, the training period in step S2 is less than or equal to the number of neural network training periods set in step S1.
As a preferred embodiment, the clipping rate of each layer of the neural network in step S2 gradually increases from zero to the clipping rate set in step S1, which can be formally expressed as

P_t = P + (P_0 − P) · (1 − (t − t_0) / (n·Δt))^k

wherein P_t represents the clipping rate of a layer at the current training cycle, P represents the clipping rate set in step S1, P_0 represents the set initial clipping rate of each layer, t represents the current training cycle number, t_0 represents the initial training cycle number, Δt represents the pruning interval, k represents the clipping-rate attenuation rate, and n takes any positive integer value.
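The progressive increase of the per-layer clipping rate described above can be sketched in Python as follows. This is a minimal illustration under assumptions: the polynomial decay form and all names (`clipping_rate`, `p_target`, `p_init`, `delta_t`, `k`, `n`) are not taken verbatim from the patent, whose formula image is not reproduced in the text.

```python
def clipping_rate(t, p_target, p_init=0.0, t0=0, delta_t=1, k=3, n=10):
    """Progressive per-layer clipping-rate schedule (illustrative sketch).

    The rate rises from p_init at training cycle t0 to p_target after
    n pruning steps spaced delta_t cycles apart; k is the attenuation
    exponent. Names and the exact functional form are assumptions.
    """
    if t < t0:
        return p_init
    # Fraction of the pruning interval completed so far, capped at 1.
    progress = min(1.0, (t - t0) / float(n * delta_t))
    # Interpolate from p_init toward p_target with polynomial attenuation.
    return p_target + (p_init - p_target) * (1.0 - progress) ** k
```

With these defaults the rate starts at zero, rises quickly at first, and levels off at the target, so most redundancy is marked early while later cycles let the surviving weights recover.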
In a preferred embodiment, the redundant structured information value in step S2 is set to 0, in particular, the redundant channels or the redundant filter values in the neural network are set to 0.
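Selecting redundant filters by an Lp-norm criterion and temporarily zeroing them (rather than removing them, which happens later in step S3) can be sketched as follows. The nested-list weight layout and the function names are illustrative assumptions, not the patent's implementation.

```python
def lp_norm(filt, p=2):
    # Flatten a filter (nested lists) and compute its Lp norm.
    flat = [abs(x) for row in filt for x in row]
    return sum(v ** p for v in flat) ** (1.0 / p)

def zero_redundant_filters(weights, rate, p=2):
    """Soft pruning step: set the lowest-Lp-norm filters to zero.

    `weights` is a list of filters (each a 2-D nested list) and `rate`
    is the current clipping rate. The filters stay in place so training
    can continue; only their values become 0.
    """
    n_prune = int(len(weights) * rate)
    # Rank filters by Lp norm; the smallest ones are deemed redundant.
    ranked = sorted(range(len(weights)), key=lambda i: lp_norm(weights[i], p))
    redundant = set(ranked[:n_prune])
    return [[[0.0 for _ in row] for row in f] if i in redundant else f
            for i, f in enumerate(weights)]
```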
Step S3 performs the actual pruning operation, i.e., removes the redundant structured information determined in step S2 from the corresponding network layers of the neural network.
As a preferred embodiment, in step S3, specifically, when the redundant information is structured filter information, the filter whose value is 0 in the weight of the current network layer is removed; the channel of the current network layer corresponding to the removed filter also needs to be removed, and the convolution kernel in the weight of the next network layer corresponding to the removed channel of the current network layer is removed as well.
As shown in Fig. 2, W1 is the weight of convolutional layer conv1, where F_i denotes a redundant filter; I_1 denotes the input feature map of conv1 and O_1 its output feature map, where C_i denotes the output channel corresponding to F_i; conv2 denotes the layer immediately after conv1, and W2 is its weight, where F2_i denotes the convolution kernel corresponding to C_i. In the reconstruction process, F_i, C_i and F2_i are all removed from their respective positions.
Fig. 2 is a schematic diagram of only one case; in practice there may be multiple redundant filters in each layer of the neural network.
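The structural removal of a redundant filter, together with its output channel and the matching convolution kernels in the next layer, can be sketched as follows. The weight layout (`w[filter][input_channel]`) and function names are illustrative assumptions.

```python
def remove_filter(w_conv1, w_conv2, idx):
    """Remove filter `idx` from conv1 and the matching kernels from conv2.

    w_conv1 is indexed by output filter; w_conv2 is indexed as
    w_conv2[filter][input_channel] -> kernel. Dropping filter idx from
    conv1 removes output channel idx, so every filter of conv2 loses
    its kernel for that input channel. Layout and names are assumptions.
    """
    # Remove the zeroed filter F_i from the current layer's weight W1.
    new_w1 = [f for i, f in enumerate(w_conv1) if i != idx]
    # Remove the per-filter kernel for the vanished channel C_i from W2.
    new_w2 = [[kern for j, kern in enumerate(filt) if j != idx]
              for filt in w_conv2]
    return new_w1, new_w2
```

The channel-pruning case of Fig. 3 is symmetric: the redundant channel determines which conv1 filter and which conv2 kernels to drop.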
As a preferred embodiment, in step S3, specifically, when the redundant information is structured channel information, the channel whose value is 0 in the current network layer needs to be removed; the filter in the current network layer corresponding to the removed channel also needs to be removed, and the convolution kernel in the weight of the next network layer corresponding to the removed channel of the current network layer is removed as well.
As shown in Fig. 3, O_1 denotes the output feature map of convolutional layer conv1, where C_i denotes a redundant channel; W1 is the weight of conv1, where F_i denotes the filter corresponding to C_i; conv2 denotes the layer immediately after conv1, and W2 is its weight, where F2_i denotes the convolution kernel corresponding to C_i. In the reconstruction process, F_i, C_i and F2_i are all removed from their respective positions.
Fig. 3 is a schematic diagram of only one case; in practice there may be multiple redundant channels in each layer of the neural network.
In step S4, optionally, when obtaining the pruned lightweight neural network model, if a batch normalization operation is used, the parameters of that operation are merged with the parameters of the preceding convolutional layer, thereby further reducing the parameters and computation of the network.
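Merging the batch-normalization parameters into the preceding convolution can be sketched per output channel as follows; for brevity the sketch treats each output channel's convolution weight as a single scalar, and the function name and signature are illustrative assumptions.

```python
import math

def fold_bn_into_conv(conv_w, conv_b, gamma, beta, mean, var, eps=1e-5):
    """Fold batch normalization into the preceding convolution (sketch).

    BN computes y = gamma * (conv(x) - mean) / sqrt(var + eps) + beta,
    which is equivalent to a convolution with
        w' = w * gamma / sqrt(var + eps)
        b' = (b - mean) * gamma / sqrt(var + eps) + beta.
    All arguments are per-output-channel lists of scalars.
    """
    out_w, out_b = [], []
    for w, b, g, bt, m, v in zip(conv_w, conv_b, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        out_w.append(w * scale)
        out_b.append((b - m) * scale + bt)
    return out_w, out_b
```

After folding, the separate BN layer can be dropped at inference time, which removes its parameters and per-element normalization cost.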
The second embodiment is as follows:
The embodiment of the invention also provides a neural network structured progressive pruning system, as shown in fig. 4, the system comprises a parameter setting module 21, a progressive pruning module 22, a network reconstruction module 23 and a continuous training module 24.
The parameter setting module 21 is configured to set the clipping rate and pruning criterion of each layer of the neural network, and the number of neural network training cycles; the progressive pruning module 22 is configured to input pictures to train the neural network, gradually increase the clipping rate of each layer of the neural network from zero to the clipping rate set by the parameter setting module 21 over a certain number of training cycles, select the redundant information of each layer of the neural network according to the pruning criterion determined by the parameter setting module 21, and temporarily set the value of the redundant information to 0; the network reconstruction module 23 is configured to remove the redundant information with value 0 processed by the progressive pruning module 22 from each layer of the neural network after the set clipping rate is reached, and reconstruct the original network layers; the continuous training module 24 is configured for continued training of the reconstructed model: after the neural network is reconstructed, training continues until the set number of neural network training cycles is reached, and the pruned lightweight neural network model is obtained after training is finished.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention in any way. Although the practice of the present invention has been described in detail above, those skilled in the art may modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their features. All changes, equivalents and modifications that come within the spirit and scope of the invention are intended to be protected.
Claims (10)
1. A neural network structured progressive pruning method is characterized by comprising the following steps:
step S1: setting the cutting rate and the pruning standard of each layer of the neural network and the training cycle number of the neural network;
step S2: inputting pictures to train the neural network, gradually increasing the cutting rate of each layer of the neural network from zero to the cutting rate set in the step S1 in a certain training period, selecting redundant information of each layer of the neural network according to the pruning standard determined in the step S1, and temporarily setting the value of the redundant information as 0;
step S3: after the set clipping rate is reached, removing the redundant information with the value of 0 processed in the step S2 from each layer of the neural network, and reconstructing the original neural network layer;
step S4: and after the neural network is reconstructed, continuing training until the set neural network training cycle number is reached, and obtaining the pruned lightweight neural network model after training is finished.
2. The neural network structured progressive pruning method according to claim 1, wherein the clipping rate of each layer of the neural network in the step S1 is specifically: each network layer sets different positive values, or the network layers needing pruning in the network set the same positive values.
3. The neural network structured progressive pruning method according to claim 1, wherein the pruning criteria in step S1 include the Lp norm, random selection, and computation based on back-propagated gradients.
4. The neural network structured progressive pruning method according to claim 1, wherein one neural network training cycle in step S1 consists of training all pictures in the data set once, or training a certain number of pictures sampled from the data set once.
5. The method of claim 1, wherein the training period in step S2 is less than or equal to the number of training periods of the neural network set in step S1.
6. The neural network structured progressive pruning method according to claim 1, wherein the gradual increase of the per-layer clipping rate of the neural network in step S2 from zero to the clipping rate set in step S1 is formally expressed as

P_t = P + (P_0 − P) · (1 − (t − t_0) / (n·Δt))^k

wherein P_t represents the clipping rate of a layer at the current training cycle, P represents the clipping rate set in step S1, P_0 represents the set initial clipping rate of each layer, t represents the current training cycle number, t_0 represents the initial training cycle number, Δt represents the pruning interval, k represents the clipping-rate attenuation rate, and n takes any positive integer value.
7. The neural network structured progressive pruning method of claim 1, wherein the redundant information value is temporarily set to 0 in step S2, specifically, the redundant channel or the redundant filter value in the neural network is set to 0.
8. The neural network structured progressive pruning method according to claim 1, wherein in step S3, specifically, when the redundant information is filter information, the filter with a value of 0 in the weight of the current network layer is removed, and the channel of the current network layer corresponding to the removed filter is also removed, and the convolution kernel of the filter corresponding to the removed channel of the current network layer in the weight of the next network layer is also removed;
and when the redundant information is channel information, removing the channel with the value of 0 in the current network layer, simultaneously removing the filter corresponding to the removal channel in the current network layer, and removing the convolution kernel of the filter corresponding to the removal channel in the current network layer in the weight of the next network layer.
9. The neural network structured progressive pruning method of claim 1, wherein in the lightweight neural network model after pruning obtained in step S4, if a batch normalization operation is used, parameters of the operation are merged with convolutional layer parameters of a previous network layer.
10. The neural network structured progressive pruning system is characterized by comprising a parameter setting module (21), a progressive pruning module (22), a network reconstruction module (23) and a continuous training module (24), wherein the parameter setting module (21) is used for setting the pruning rate, the pruning standard and the neural network training cycle number of each layer of a neural network; the progressive pruning module (22) is used for inputting pictures to train the neural network, gradually increasing the pruning rate of each layer of the neural network from zero to the pruning rate set by the parameter setting module (21) in a certain training period, selecting redundant information of each layer of the neural network according to the pruning standard determined by the parameter setting module (21), and temporarily setting the value of the redundant information to be 0; the network reconstruction module (23) is used for removing the redundant information with the value of 0 processed by the progressive pruning module (22) from each layer of the neural network after the set clipping rate is reached and reconstructing an original neural network layer; and the continuous training module (24) is used for continuously training the reconstructed model, and after the neural network is reconstructed, the training is continuously carried out until the set neural network training cycle number is reached, and the light weight neural network model after pruning can be obtained after the training is finished.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110697462.3A CN113516240A (en) | 2021-06-23 | 2021-06-23 | Neural network structured progressive pruning method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110697462.3A CN113516240A (en) | 2021-06-23 | 2021-06-23 | Neural network structured progressive pruning method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113516240A true CN113516240A (en) | 2021-10-19 |
Family
ID=78065852
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110697462.3A Pending CN113516240A (en) | 2021-06-23 | 2021-06-23 | Neural network structured progressive pruning method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113516240A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117557857A (en) * | 2023-11-23 | 2024-02-13 | 哈尔滨工业大学 | Detection network light weight method combining progressive guided distillation and structural reconstruction |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110263841A (en) * | 2019-06-14 | 2019-09-20 | 南京信息工程大学 | A kind of dynamic, structured network pruning method based on filter attention mechanism and BN layers of zoom factor |
CN111461322A (en) * | 2020-03-13 | 2020-07-28 | 中国科学院计算技术研究所 | Deep neural network model compression method |
- 2021-06-23: CN CN202110697462.3A patent/CN113516240A/en active Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110263841A (en) * | 2019-06-14 | 2019-09-20 | 南京信息工程大学 | A kind of dynamic, structured network pruning method based on filter attention mechanism and BN layers of zoom factor |
CN111461322A (en) * | 2020-03-13 | 2020-07-28 | 中国科学院计算技术研究所 | Deep neural network model compression method |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117557857A (en) * | 2023-11-23 | 2024-02-13 | 哈尔滨工业大学 | Detection network light weight method combining progressive guided distillation and structural reconstruction |
CN117557857B (en) * | 2023-11-23 | 2024-06-04 | 哈尔滨工业大学 | Detection network light weight method combining progressive guided distillation and structural reconstruction |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108596841B (en) | Method for realizing image super-resolution and deblurring in parallel | |
CN111950723B (en) | Neural network model training method, image processing method, device and terminal equipment | |
CN110555847B (en) | Image processing method and device based on convolutional neural network | |
CN111709883B (en) | Image detection method, device and equipment | |
CN113450290B (en) | Low-illumination image enhancement method and system based on image inpainting technology | |
CN109858613B (en) | Compression method and system of deep neural network and terminal equipment | |
CN111951167B (en) | Super-resolution image reconstruction method, super-resolution image reconstruction device, computer equipment and storage medium | |
CN112862681A (en) | Super-resolution method, device, terminal equipment and storage medium | |
CN110189260B (en) | Image noise reduction method based on multi-scale parallel gated neural network | |
CN113450288A (en) | Single image rain removing method and system based on deep convolutional neural network and storage medium | |
CN113673675A (en) | Model training method and device, computer equipment and storage medium | |
CN112801906A (en) | Cyclic iterative image denoising method based on cyclic neural network | |
CN112686869A (en) | Cloth flaw detection method and device | |
CN110619391B (en) | Detection model compression method and device and computer readable storage medium | |
CN113516240A (en) | Neural network structured progressive pruning method and system | |
CN110599495B (en) | Image segmentation method based on semantic information mining | |
CN112990299B (en) | Depth map acquisition method based on multi-scale features, electronic equipment and storage medium | |
CN116029905A (en) | Face super-resolution reconstruction method and system based on progressive difference complementation | |
CN116468947A (en) | Cutter image recognition method, cutter image recognition device, computer equipment and storage medium | |
CN111402140A (en) | Single image super-resolution reconstruction system and method | |
CN112529064B (en) | Efficient real-time semantic segmentation method | |
CN115937121A (en) | Non-reference image quality evaluation method and system based on multi-dimensional feature fusion | |
CN114612316A (en) | Method and device for removing rain from nuclear prediction network image | |
CN115423697A (en) | Image restoration method, terminal and computer storage medium | |
CN113436245B (en) | Image processing method, model training method, related device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||