CN114819141A - Intelligent pruning method and system for deep network compression - Google Patents


Info

Publication number
CN114819141A
CN114819141A
Authority
CN
China
Prior art keywords: neural network, convolutional neural, filter, convolutional, training
Prior art date
Legal status
Granted
Application number
CN202210360695.9A
Other languages
Chinese (zh)
Other versions
CN114819141B (en)
Inventor
王颖
陈怡桦
李洁
王斌
胡留成
马浩中
张建龙
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202210360695.9A priority Critical patent/CN114819141B/en
Publication of CN114819141A publication Critical patent/CN114819141A/en
Application granted granted Critical
Publication of CN114819141B publication Critical patent/CN114819141B/en
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation using electronic means
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections


Abstract

The invention relates to an intelligent pruning method and system for deep network compression. The method comprises the following steps: acquiring a training sample set and an untrained convolutional neural network to be compressed; training the convolutional neural network with the training sample set according to a neuronal sleep-and-wake mechanism, while updating the information entropy of each filter in each convolutional layer during training, to obtain a trained convolutional neural network; and pruning the filters of each convolutional layer according to the ranking of their information entropies and a preset pruning proportion, to obtain a trained, compressed convolutional neural network. The compressed network obtained by the method avoids redundant computation when classifying security inspection images, saves computation time, and can run on platforms with limited computing resources.

Description

Intelligent pruning method and system for deep network compression
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to an intelligent pruning method and system for deep network compression.
Background
With rapid economic and social development, population mobility in China has greatly increased. To maintain public safety, security inspection must be carried out in public places such as airports, railway stations, bus stations and subway entrances to ensure that people travel safely.
At present, the most common security inspection instrument is the X-ray security inspection machine. During its use, workers must carefully examine each X-ray luggage image to judge whether it contains dangerous goods. The degree of automation of this device is low, manual inspection is costly, and misjudgments and missed detections can occur.
With the rapid development of artificial intelligence, neural network methods are widely applied in image processing, and security inspection systems based on convolutional neural network algorithms have begun to appear. The convolutional neural network is a representative network in the field of deep learning, and intelligent security inspection systems that integrate deep learning algorithms and convolutional neural networks greatly improve the degree of intelligence of security inspection devices, improve the accuracy of dangerous-goods identification, and effectively relieve the pressure on security inspection workers.
However, the structure of a convolutional neural network is complex and cannot meet the requirements of fast, real-time application, which may cause congestion at security inspection channels. In addition, convolutional neural networks generally require substantial computing resources, while in practical image classification and recognition scenarios computing resources are limited by space and cost, so convolutional neural networks cannot yet be applied to daily life on a large scale.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides an intelligent pruning method and system for deep network compression. The technical problem to be solved by the invention is realized by the following technical scheme:
the invention provides an intelligent pruning method for deep network compression, which comprises the following steps:
acquiring a training sample set and an untrained convolutional neural network to be compressed;
training the convolutional neural network by using the training sample set according to the sleeping and waking mechanism of the neuron, and updating the information entropy of each filter in each convolutional layer in the convolutional neural network simultaneously in the training process to obtain the trained convolutional neural network;
and performing pruning processing on the filters of each convolutional layer according to the information-entropy ranking of the filters of each convolutional layer in the trained convolutional neural network and a preset pruning proportion, to obtain a trained and compressed convolutional neural network, so that security inspection image data can be classified by using the compressed convolutional neural network.
In an embodiment of the present invention, training the convolutional neural network by using the training sample set according to a sleep and wake mechanism of a neuron, and updating an information entropy of each filter in each convolutional layer in the convolutional neural network simultaneously during a training process to obtain a trained convolutional neural network, including:
inputting the training sample set into the convolutional neural network, training the convolutional neural network alternately according to the turns of sleep stages and waking stages, updating the parameters of the convolutional neural network, and calculating the information entropy of each filter in each convolutional layer in the convolutional neural network after each turn of training;
and obtaining the trained convolutional neural network after a preset network convergence condition is reached.
In one embodiment of the present invention, training the convolutional neural network in a sleep stage, and updating parameters of the convolutional neural network comprises:
obtaining the ranking of the information entropies of the filters of each convolutional layer according to the information entropy of each filter in each convolutional layer in the convolutional neural network obtained after the previous round of waking-stage training;
for the filters of each convolutional layer, determining the filters with the number corresponding to the pruning proportion according to the sequence from small entropy to large entropy as candidate filters to be pruned corresponding to the convolutional layers;
in the training process of the current round of sleep stage, deleting the weight parameters of the candidate filter to be pruned of each convolution layer, and updating the weight parameters of the rest filters so as to realize the parameter updating of the convolution neural network.
In one embodiment of the present invention, training the convolutional neural network in the waking stage, and updating the parameters of the convolutional neural network, includes:
and in the training process of the waking stage of the current turn, updating the weight parameters of all the filters of each convolution layer so as to realize the parameter updating of the convolution neural network.
In an embodiment of the present invention, pruning each convolutional layer according to the information entropy sorting of the filter of each convolutional layer in the trained convolutional neural network and a preset pruning proportion to obtain a trained and compressed convolutional neural network, including:
obtaining the information entropy sorting of the filters of each convolutional layer according to the information entropy of each filter in each convolutional layer in the trained convolutional neural network;
for the filters of each convolutional layer, determining the filters with the number corresponding to the pruning proportion according to the sequence from small entropy to large entropy as the filters to be pruned corresponding to the convolutional layers;
and pruning the filter to be pruned of each convolutional layer to obtain the compressed convolutional neural network.
In one embodiment of the present invention, the information entropy of each filter in each convolutional layer is calculated as:

H_{i,j} = -\sum_{n=1}^{N} p_n \log p_n

p_n = \frac{1}{K_1 \times K_2} \sum_{k_1=1}^{K_1} \sum_{k_2=1}^{K_2} \mathbf{1}\left( w_{i,j}[k_1][k_2] \in [B_{n1}, B_{n2}) \right)

where H_{i,j} denotes the information entropy of the j-th filter in the i-th convolutional layer, N denotes the number of intervals into which the filter's weight range is uniformly divided, n indexes the weight intervals, p_n denotes the probability that a weight of the filter falls in the n-th weight interval, K_1 denotes the height of the filter, K_2 denotes the width of the filter, w_{i,j}[k_1][k_2] denotes the weight at position (k_1, k_2) in the j-th filter of the i-th convolutional layer, and [B_{n1}, B_{n2}) denotes the value range of the n-th weight interval.
The invention provides a compression system of a convolutional neural network for image classification, which comprises the following components:
the acquisition module is used for acquiring a training sample set and an untrained convolutional neural network to be compressed;
the network training module is used for training the convolutional neural network by using the training sample set according to the sleep and waking mechanism of the neuron, and updating the information entropy of each filter in each convolutional layer in the convolutional neural network simultaneously in the training process to obtain the trained convolutional neural network;
and the pruning module is used for carrying out pruning processing on the filter of each convolutional layer according to the information entropy sequencing of the filter of each convolutional layer in the trained convolutional neural network and a preset pruning proportion to obtain the trained and compressed convolutional neural network.
The invention provides an electronic device comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory is used for storing a computer program; and the processor is used for implementing the method steps of any of the above embodiments when executing the program stored in the memory.
The present invention provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the method steps of any of the above-mentioned embodiments.
Compared with the prior art, the invention has the beneficial effects that:
1. The intelligent pruning method for deep network compression adopts information entropy as the evaluation criterion for filters, fully considering the influence of the filter weights on the feature-expression capability of the convolutional neural network. It also introduces the concepts of sleeping and waking from neuroscience and sets a global pruning threshold, forming an intelligent structured pruning method: filters to be pruned are selected during training, and the network is pruned after training to obtain a compressed convolutional neural network. When the compressed network is used for image classification, it retains high generalization capability while the classification accuracy remains almost unchanged; the pruned network avoids redundant computation when classifying security inspection images, saves computation time, and thus avoids congestion at security inspection channels.
2. The intelligent pruning method for deep network compression compresses the deep convolutional neural network so that the compressed network occupies less memory and consumes fewer computing resources. With its lower requirements on computing resources and memory, it can run in a wide range of image classification scenarios, in particular on embedded platforms and mobile devices with limited computing resources.
The foregoing is only an overview of the technical solutions of the present invention. In order to make the technical means of the invention clearly understood and implementable in accordance with this description, and to make the above and other objects, features and advantages of the invention more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
Fig. 1 is a schematic diagram of an intelligent pruning method for deep network compression according to an embodiment of the present invention;
FIG. 2 is a flowchart of an intelligent pruning method for deep network compression according to an embodiment of the present invention;
fig. 3 is a block diagram of an intelligent pruning system for deep network compression according to an embodiment of the present invention.
Detailed Description
To further illustrate the technical means adopted by the present invention to achieve the intended object and their effects, an intelligent pruning method and system for deep network compression according to the invention are described in detail below with reference to the accompanying drawings and specific embodiments.
The foregoing and other technical contents, features and effects of the present invention will be more clearly understood from the following detailed description of the embodiments taken in conjunction with the accompanying drawings. The technical means and effects of the present invention adopted to achieve the predetermined purpose can be more deeply and specifically understood through the description of the specific embodiments, however, the attached drawings are provided for reference and description only and are not used for limiting the technical scheme of the present invention.
It should be noted that the intelligent pruning method and system for deep network compression provided by the embodiments of the present invention are suitable for compressing a convolutional neural network for image classification, for example, suitable for security inspection image detection based on the convolutional neural network, where the convolutional neural network for image classification includes, but is not limited to, a CNN series.
Example one
The convolutional neural network is a representative neural network in the field of deep learning, and an important basis for the breakthrough achievements of deep learning in computer vision. As task complexity grows, the number of layers of a convolutional neural network increases and so does the model size. Current mainstream deep neural networks have millions of parameters, which multiplies the difficulty of model training. In order to deploy trained deep convolutional neural networks on Internet-of-Things and edge devices and achieve real-time, fast inference, the networks need to be compressed so as to effectively reduce the memory occupied by the model and the energy consumed by model inference.
Referring to fig. 1 and fig. 2 in combination, fig. 1 is a schematic diagram of an intelligent pruning method for deep network compression according to an embodiment of the present invention; fig. 2 is a flowchart of an intelligent pruning method for deep network compression according to an embodiment of the present invention. As shown in the figure, the intelligent pruning method for deep network compression in this embodiment includes:
step 1: acquiring a training sample set and an untrained convolutional neural network to be compressed;
in an embodiment, the training sample set includes several training samples, the training samples are security inspection images attached with class labels, and the untrained convolutional neural network to be compressed is a convolutional neural network for image classification.
Step 2: training the convolutional neural network by utilizing a training sample set according to a sleeping and waking mechanism of a neuron, and updating the information entropy of each filter in each convolutional layer in the convolutional neural network simultaneously in the training process to obtain the trained convolutional neural network;
specifically, step 2 comprises:
step 2.1: inputting a training sample set into a convolutional neural network, training the convolutional neural network alternately according to turns of sleep stages and waking stages, updating parameters of the convolutional neural network, and calculating the information entropy of each filter in each convolutional layer in the convolutional neural network after each turn of training;
wherein, training the convolutional neural network in the sleep stage, and updating the parameters of the convolutional neural network, comprises:
step a: obtaining the ranking of the information entropies of the filters of each convolutional layer according to the information entropy of each filter in each convolutional layer of the convolutional neural network obtained after the previous round of waking-stage training;
step b: for the filters of each convolutional layer, determining the filters with the number corresponding to the pruning proportion according to the sequence from small entropy to large entropy as candidate filters to be pruned corresponding to the convolutional layers;
step c: in the training process of the current round of sleep stage, the weight parameters of the candidate filters to be pruned of each convolution layer are deleted, and the weight parameters of the rest filters are updated, so that the parameter update of the convolution neural network is realized.
Further, training the convolutional neural network in the waking stage, and updating parameters of the convolutional neural network, including: and in the training process of the waking stage of the current turn, updating the weight parameters of all the filters of each convolution layer so as to realize the parameter updating of the convolution neural network.
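As an illustrative, non-authoritative sketch of the alternating updates above (plain NumPy standing in for a real training framework; the 8-filter layer shape, the 0.25 pruning threshold, and the random stand-in for a gradient step are all invented for demonstration), one sleep round followed by one waking round might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def filter_entropy(f, n_bins=10):
    # Shannon entropy of a filter's weight histogram over n_bins
    # uniform intervals between its min and max weight.
    counts, _ = np.histogram(f, bins=n_bins)
    p = counts / f.size
    p = p[p > 0]  # convention: 0 * log 0 = 0
    return float(-(p * np.log(p)).sum())

filters = rng.normal(size=(8, 3, 3))  # one layer: 8 filters of 3x3 (made-up shape)
prune_ratio = 0.25                    # global pruning threshold (illustrative)

# Sleep stage: rank filters by entropy, delete (zero) the weights of the
# lowest-entropy candidates, and update only the remaining filters.
entropies = np.array([filter_entropy(f) for f in filters])
n_prune = int(prune_ratio * len(filters))
candidates = np.argsort(entropies)[:n_prune]
mask = np.ones(len(filters), dtype=bool)
mask[candidates] = False
filters[~mask] = 0.0                                          # transient forgetting
filters[mask] -= 0.01 * rng.normal(size=filters[mask].shape)  # stand-in for a gradient step

# Waking stage: every filter is active again and all weights are updated.
filters -= 0.01 * rng.normal(size=filters.shape)
```

In the actual method the updates come from backpropagation on the training sample set, and sleep and waking rounds alternate over the whole training schedule.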
Each filter in a convolutional neural network can be regarded as a separate feature-extraction unit, and the information entropy of its weights reflects how the filter is constructed and whether it attends to and extracts salient features of the image. In this embodiment, information entropy is used to analyze the weight complexity of a single filter, and the filter's information entropy serves as the theoretical basis for evaluating its importance (the larger a filter's information entropy, the more important the filter). In this way, the feature-expression capability of the convolutional neural network is preserved as much as possible during structured pruning, while redundant network parameters are deleted.
Specifically, in the present embodiment, the information entropy of each filter in each convolutional layer is calculated as:

H_{i,j} = -\sum_{n=1}^{N} p_n \log p_n    (1)

where H_{i,j} denotes the information entropy of the j-th filter in the i-th convolutional layer, N denotes the number of intervals into which the filter's weight range is uniformly divided, n indexes the weight intervals, and p_n denotes the probability that a weight of the filter falls in the n-th weight interval.
When the information entropy of a single filter is calculated, the weight range of the filter is uniformly divided into N intervals by taking the maximum weight and the minimum weight of the filter as references, the probability that each weight interval contains the weight of the filter is calculated, and finally the information entropy of the single filter is calculated according to the probability that the weight of the filter appears in the weight interval.
The size of a single weight interval is:

scale = \frac{w_{i,j}[k_1][k_2]_{\max} - w_{i,j}[k_1][k_2]_{\min}}{N}    (2)

The value range [B_{n1}, B_{n2}) of the n-th weight interval is computed as:

B_{n1} = w_{i,j}[k_1][k_2]_{\min} + (n-1) \times scale    (3)

B_{n2} = w_{i,j}[k_1][k_2]_{\min} + n \times scale    (4)

where 1 \le k_1 \le K_1 and K_1 denotes the height of the filter, 1 \le k_2 \le K_2 and K_2 denotes the width of the filter, w_{i,j}[k_1][k_2] denotes the weight at position (k_1, k_2) in the j-th filter of the i-th convolutional layer, and w_{i,j}[k_1][k_2]_{\max} and w_{i,j}[k_1][k_2]_{\min} denote the maximum and minimum weights of the current filter.

According to equations (2)-(4), the probability that a weight of the filter falls in each interval can be calculated as:

p_n = \frac{1}{K_1 \times K_2} \sum_{k_1=1}^{K_1} \sum_{k_2=1}^{K_2} \mathbf{1}\left( w_{i,j}[k_1][k_2] \in [B_{n1}, B_{n2}) \right)    (5)
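The calculation in equations (1)-(5) can be sketched in NumPy as follows (the two example filters are invented to show the behavior; `np.linspace` is used for the interval edges so that the endpoint arithmetic of equations (3)-(4) is exact in floating point):

```python
import numpy as np

def filter_entropy(w, N=10):
    """Information entropy of one filter per equations (1)-(5).
    w: the filter's weight array; N: number of uniform weight intervals."""
    w = np.asarray(w, dtype=float).ravel()
    w_min, w_max = w.min(), w.max()
    if w_max == w_min:
        return 0.0  # degenerate filter: every weight falls in one interval
    # eqs. (2)-(4): N uniform intervals [B_n1, B_n2) of size (w_max - w_min) / N
    edges = np.linspace(w_min, w_max, N + 1)
    counts, _ = np.histogram(w, bins=edges)
    p = counts / w.size                   # eq. (5): fraction of weights per interval
    p = p[p > 0]                          # convention: 0 * log 0 = 0
    return float(-(p * np.log(p)).sum())  # eq. (1)

# Weights concentrated in one interval give low entropy; weights spread
# over many intervals give high entropy (here log 9, with 9 occupied bins).
clustered = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 9.0]])
spread = np.arange(9.0).reshape(3, 3)
```

This matches the intuition stated above: a filter whose weights cluster in few intervals carries little information and becomes a pruning candidate.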
step 2.2: and obtaining the trained convolutional neural network after the preset network convergence condition is reached.
When the convolutional neural network reaches the total number of the set training rounds or the network loss function value reaches a preset threshold value in the training process, the network is considered to be converged, and the convolutional neural network training is completed.
It should be noted that the pruning proportion may be selected by using a global threshold or a local threshold. In this embodiment, a global threshold is adopted as the pruning proportion, that is, a uniform threshold is set for the whole convolutional neural network as the pruning proportion, and the pruning proportion setting mode is simple and convenient to control. In other embodiments, the pruning proportion may also be selected by using a local threshold, that is, a threshold is set for each of the different convolutional layers in the convolutional neural network as the pruning proportion of the layer, but the selection of the local threshold requires a large amount of experiments and rich experience intervention.
Step 3: pruning the filters of each convolutional layer according to the information-entropy ranking of the filters in the trained convolutional neural network and the preset pruning proportion, to obtain a trained, compressed convolutional neural network, which is then used to classify security inspection image data.
Specifically, step 3 comprises:
step 3.1: obtaining the information entropy sorting of the filters of each convolutional layer according to the information entropy of each filter in each convolutional layer in the trained convolutional neural network;
step 3.2: for the filters of each convolutional layer, determining the filters with the number corresponding to the pruning proportion according to the sequence from small entropy to large entropy as the filters to be pruned corresponding to the convolutional layers;
step 3.3: and pruning the filter to be pruned of each convolutional layer to obtain the compressed convolutional neural network.
In this embodiment, a neuronal sleep-and-wake mechanism is introduced into convolutional neural network training. In the sleep stage, the importance of the filters in each layer is evaluated according to the proposed information-entropy criterion: the filters are ranked from small to large by information entropy, the number of filters corresponding to the preset pruning proportion is determined, the weight parameters of these unimportant filters are deleted to realize transient forgetting, and the weight parameters of the remaining, important filters are updated, so that the model is maintained in a low-energy state during the sleep stage.
In the waking stage, all filters are awakened, kept active and participate in training, receiving new memory updates, so that the model is maintained in a high-energy state. Through alternating iterative training of the sleep and waking stages, aided by a loss function with a sparsifying effect in both stages, the importance of the filters in the trained convolutional neural network becomes clearly differentiated. The information entropies of the filters of each convolutional layer are then ranked, and the number of filters corresponding to the preset pruning proportion (i.e., the batch of filters with the smallest information entropy) is permanently pruned, yielding the trained and compressed convolutional neural network.
The compressed convolutional neural network obtained by this intelligent pruning method maintains the model's classification performance while retaining high generalization capability. Its parameter count and number of floating-point operations are smaller than those of the original network; when used to classify security inspection images, it avoids redundant computation, saves computation time, and prevents congestion at security inspection channels.
In addition, the traditional structured pruning method requires three steps (pre-training, pruning and fine-tuning): the original network is fully trained to obtain high performance, its filters are pruned based on weight-norm magnitudes and empirically set per-layer pruning proportions, and the pruned model is finally retrained to restore its performance as far as possible. By contrast, the compressed convolutional neural network of this embodiment is already trained and can be used directly for subsequent image classification without secondary training.
The intelligent pruning method for deep network compression of this embodiment compresses the deep convolutional neural network so that the compressed network occupies less memory and consumes fewer computing resources. With its lower requirements on computing resources and memory, it can run in a wide range of application scenarios including but not limited to image classification, in particular on embedded platforms and mobile devices with limited computing resources.
Fig. 3 is a structural block diagram of an intelligent pruning system for depth network compression according to an embodiment of the present invention, and as shown in the drawing, the intelligent pruning system for depth network compression according to the embodiment includes: the system comprises an acquisition module, a network training module and a pruning module, wherein the acquisition module is used for acquiring a training sample set and an untrained convolutional neural network to be compressed; the network training module is used for training the convolutional neural network by utilizing a training sample set according to the sleeping and waking mechanism of the neuron, and simultaneously updating the information entropy of each filter in each convolutional layer in the convolutional neural network in the training process to obtain the trained convolutional neural network; and the pruning module is used for carrying out pruning processing on the filter of each convolutional layer according to the information entropy sequencing of the filter of each convolutional layer in the trained convolutional neural network and the preset pruning proportion to obtain the trained and compressed convolutional neural network. In this embodiment, the compressed convolutional neural network may be applied to a security inspection apparatus to implement classification of security inspection image data.
The intelligent pruning system for deep network compression provided by the embodiment of the invention can implement the method embodiment, the implementation principle and the technical effect are similar, and the details are not repeated herein.
Based on the same inventive concept, the embodiment of the invention also provides electronic equipment, which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete mutual communication through the communication bus; the memory is used for storing computer programs; the processor is used for implementing any one of the above method steps of the intelligent pruning method for deep network compression when executing the program stored in the memory.
An embodiment of the invention further provides a computer-readable storage medium in which a computer program is stored; when executed by a processor, the computer program carries out the steps of any of the above intelligent pruning methods for deep network compression.
Example two
This embodiment demonstrates the effect of the intelligent pruning method and system for deep network compression of the present invention through a specific simulation experiment.
1. Simulation conditions
In this embodiment, the simulation experiment was implemented in Python on a PC with an Intel(R) Core(TM) i7-9700K CPU @ 3.60 GHz, 32 GB of memory, a single NVIDIA GeForce RTX 2070 GPU, CUDA version 11.3 and the Windows 10 operating system.
2. Content of simulation experiment
The data used in this embodiment are the 60,000 three-channel color images of CIFAR10, covering 10 classes with 5,000 training images and 1,000 test images per class. When evaluating filter importance, the number of weight intervals N is 10 and the pruning ratio T is 0.6. The training optimizer is mini-batch SGD with momentum and weight decay: the learning rate is set to 0.1, the momentum to 0.9, the weight decay to 0.0005 and the batch size to 128. The total number of training epochs is 200, with the learning rate adjusted to 0.02, 0.004 and 0.0008 at epochs 60, 120 and 160, respectively. The sleep duration and the wake duration are each a single epoch, and the sleep and wake stages alternate. The classification performance of the convolutional neural network is tested after training: the initial deep convolutional neural network (ResNet56) achieves a highest classification accuracy of 93.17%, while the network compressed by the present method achieves 93.11%, a drop of only 0.06 percentage points; compared with the original network, the floating-point operations (FLOPs) of the compressed network are reduced by 52.63%.
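The learning-rate schedule reported above can be expressed as a small helper. This is only an illustrative sketch of the stated settings (initial rate 0.1 multiplied by a factor of 0.2 at epochs 60, 120 and 160, giving 0.02, 0.004 and 0.0008); the function name is my own, not part of the patent.

```python
def learning_rate(epoch):
    """Piecewise-constant schedule matching the reported settings:
    0.1 initially, reduced by a factor of 5 (gamma = 0.2) at
    epochs 60, 120 and 160."""
    lr = 0.1
    for milestone in (60, 120, 160):
        if epoch >= milestone:
            lr *= 0.2
    return lr
```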
It should be noted that, in this document, the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, so that an article or apparatus comprising a series of elements includes not only those elements but also other elements not explicitly listed. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the article or device comprising that element. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments, and it is not intended that the invention be limited to these specific details. Those skilled in the art to which the invention pertains may make several simple deductions or substitutions without departing from the spirit of the invention, and all such modifications shall be considered to fall within the protection scope of the invention.

Claims (9)

1. An intelligent pruning method for deep network compression, comprising:
acquiring a training sample set and an untrained convolutional neural network to be compressed;
training the convolutional neural network by using the training sample set according to the sleep and wake mechanism of neurons, and updating the information entropy of each filter in each convolutional layer of the convolutional neural network simultaneously during training, to obtain a trained convolutional neural network;
and performing pruning processing on the filter of each convolutional layer according to the information entropy sequencing of the filter of each convolutional layer in the trained convolutional neural network and a preset pruning proportion to obtain a trained and compressed convolutional neural network, so as to classify the security inspection image data by using the compressed convolutional neural network.
2. The intelligent pruning method for deep network compression according to claim 1, wherein the convolutional neural network is trained by using the training sample set according to the sleep and wake mechanism of neurons, and the information entropy of each filter in each convolutional layer in the convolutional neural network is updated simultaneously in the training process to obtain the trained convolutional neural network, which includes:
inputting the training sample set into the convolutional neural network, training the convolutional neural network in alternating rounds of sleep stages and waking stages, updating the parameters of the convolutional neural network, and calculating the information entropy of each filter in each convolutional layer of the convolutional neural network after each round of training;
and obtaining the trained convolutional neural network after a preset network convergence condition is reached.
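The alternating training procedure of claim 2 can be sketched as a skeleton loop. This is an illustrative sketch only, not the patented implementation: the four callables (`train_epoch`, `compute_entropies`, `select_candidates`) and all names are placeholders I introduce for illustration.

```python
def train_with_sleep_wake(num_epochs, train_epoch, compute_entropies,
                          select_candidates):
    """Alternate wake and sleep epochs; after every epoch, recompute
    per-filter entropies and pick the tentative filters to silence
    during the next sleep epoch. Returns the phase sequence."""
    candidates = []   # filters tentatively pruned during sleep
    phases = []
    for epoch in range(num_epochs):
        phase = "wake" if epoch % 2 == 0 else "sleep"
        # In the wake phase all filters train; in the sleep phase the
        # current candidates are silenced.
        train_epoch(phase, candidates if phase == "sleep" else [])
        entropies = compute_entropies()
        candidates = select_candidates(entropies)
        phases.append(phase)
    return phases
```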
3. The intelligent pruning method for deep network compression according to claim 2, wherein training the convolutional neural network in a sleep stage, and updating parameters of the convolutional neural network comprises:
obtaining the information entropy ranking of the filters of each convolutional layer according to the information entropy of each filter in each convolutional layer of the convolutional neural network obtained after the previous round of waking-phase training;
for the filters of each convolutional layer, selecting, in ascending order of information entropy, a number of filters corresponding to the pruning proportion as the candidate filters to be pruned for that convolutional layer;
in the training process of the current round of the sleep stage, deleting the weight parameters of the candidate filters to be pruned of each convolutional layer and updating the weight parameters of the remaining filters, so as to realize the parameter updating of the convolutional neural network.
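A minimal sketch of one sleep-stage update for a single convolutional layer, assuming plain SGD on a per-layer weight tensor of shape (num_filters, ...). All names are my own and this is not the patented code: the candidate filters' weights are cleared and excluded from the update, while the remaining filters are updated normally.

```python
import numpy as np

def sleep_phase_step(weights, grads, candidate_idx, lr=0.1):
    """One sleep-phase SGD step for a conv layer.

    weights, grads: arrays of shape (num_filters, ...).
    candidate_idx: indices of filters provisionally pruned this epoch.
    """
    w = weights.copy()
    w[candidate_idx] = 0.0                 # "delete" the candidates' weights
    mask = np.ones(len(w), dtype=bool)
    mask[candidate_idx] = False
    w[mask] -= lr * grads[mask]            # update only the surviving filters
    return w
```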
4. The intelligent pruning method for deep network compression according to claim 2, wherein the training of the convolutional neural network in the awake phase, and the updating of the parameters of the convolutional neural network, comprise:
and in the training process of the waking stage of the current turn, updating the weight parameters of all the filters of each convolution layer so as to realize the parameter updating of the convolution neural network.
5. The intelligent pruning method for deep network compression according to claim 1, wherein the pruning is performed on each convolutional layer according to the information entropy sorting of the filter of each convolutional layer in the trained convolutional neural network and a preset pruning proportion, so as to obtain the trained and compressed convolutional neural network, and the method comprises the following steps:
obtaining the information entropy sorting of the filters of each convolutional layer according to the information entropy of each filter in each convolutional layer in the trained convolutional neural network;
for the filters of each convolutional layer, selecting, in ascending order of information entropy, a number of filters corresponding to the pruning proportion as the filters to be pruned for that convolutional layer;
and pruning the filter to be pruned of each convolutional layer to obtain the compressed convolutional neural network.
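The ranking-and-selection step above can be sketched as follows. This is an illustrative sketch, not the patented implementation; the function name is an assumption of mine. Given the per-filter entropies of one layer and the pruning ratio, it returns the indices of the filters with the smallest information entropy.

```python
import numpy as np

def select_filters_to_prune(entropies, prune_ratio):
    """Indices of filters to prune in one conv layer: the prune_ratio
    fraction with the smallest information entropy (least informative)."""
    num_prune = int(len(entropies) * prune_ratio)
    order = np.argsort(entropies)          # ascending entropy
    return sorted(order[:num_prune].tolist())
```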
6. The intelligent pruning method for deep network compression according to claim 1, wherein the calculation formula of the information entropy of each filter in each convolutional layer is as follows:
H_{i,j} = -\sum_{n=1}^{N} p_n \log p_n

p_n = \frac{1}{K_1 K_2} \sum_{k_1=1}^{K_1} \sum_{k_2=1}^{K_2} \mathbf{1}\big( w_{i,j}[k_1][k_2] \in [B_{n1}, B_{n2}) \big)

in the formula, H_{i,j} denotes the information entropy of the jth filter in the ith convolutional layer; N denotes the number of intervals into which the weight range of the filter is divided uniformly; n indexes the nth weight interval of the filter; p_n denotes the probability that a weight of the filter falls in the nth weight interval; K_1 denotes the height of the filter and K_2 its width; w_{i,j}[k_1][k_2] denotes the weight at height position k_1 and width position k_2 in the jth filter of the ith convolutional layer; \mathbf{1}(\cdot) denotes the indicator function; and [B_{n1}, B_{n2}) denotes the value range of the nth weight interval.
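A minimal Python sketch of this per-filter entropy, using numpy's histogram for the uniform division of the weight range into N intervals. The function name is my own and the natural logarithm is assumed, since the claim does not fix the base.

```python
import numpy as np

def filter_entropy(w, num_bins=10):
    """Information entropy of one filter's weights: the weight range is
    divided uniformly into num_bins intervals, p_n is the fraction of
    the K1*K2 weights falling into interval n, and H = -sum p_n log p_n."""
    flat = np.asarray(w).reshape(-1)
    counts, _ = np.histogram(flat, bins=num_bins)
    p = counts / flat.size
    p = p[p > 0]                           # convention: 0 * log(0) = 0
    return float(-np.sum(p * np.log(p)))
```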
7. A compression system for a convolutional neural network for image classification, comprising:
the acquisition module is used for acquiring a training sample set and an untrained convolutional neural network to be compressed;
the network training module is used for training the convolutional neural network by using the training sample set according to the sleep and waking mechanism of the neuron, and updating the information entropy of each filter in each convolutional layer in the convolutional neural network simultaneously in the training process to obtain the trained convolutional neural network;
and the pruning module is used for carrying out pruning processing on the filter of each convolutional layer according to the information entropy sequencing of the filter of each convolutional layer in the trained convolutional neural network and a preset pruning proportion to obtain the trained and compressed convolutional neural network.
8. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory is used for storing a computer program; and the processor is used for implementing the method steps of any one of claims 1-6 when executing the program stored in the memory.
9. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 6.
CN202210360695.9A 2022-04-07 2022-04-07 Intelligent pruning method and system for deep network compression Active CN114819141B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210360695.9A CN114819141B (en) 2022-04-07 2022-04-07 Intelligent pruning method and system for deep network compression


Publications (2)

Publication Number Publication Date
CN114819141A true CN114819141A (en) 2022-07-29
CN114819141B CN114819141B (en) 2024-08-13

Family ID: 82534838

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210360695.9A Active CN114819141B (en) 2022-04-07 2022-04-07 Intelligent pruning method and system for deep network compression

Country Status (1)

Country Link
CN (1) CN114819141B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110084281A (en) * 2019-03-31 2019-08-02 华为技术有限公司 Image generating method, the compression method of neural network and relevant apparatus, equipment
CN111612143A (en) * 2020-05-22 2020-09-01 中国科学院自动化研究所 Compression method and system of deep convolutional neural network
CN113657595A (en) * 2021-08-20 2021-11-16 中国科学院计算技术研究所 Neural network real-time pruning method and system and neural network accelerator


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LAI Yejing; HAO Shanfeng; HUANG Dingjiang: "Deep neural network model compression methods and progress", Journal of East China Normal University (Natural Science Edition), no. 05, 25 September 2020 (2020-09-25), pages 77-91 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117217281A (en) * 2023-09-18 2023-12-12 华中科技大学 Convolutional neural network lightweight pruning method and system based on multi-view features
CN116992946A (en) * 2023-09-27 2023-11-03 荣耀终端有限公司 Model compression method, apparatus, storage medium, and program product
CN116992946B (en) * 2023-09-27 2024-05-17 荣耀终端有限公司 Model compression method, apparatus, storage medium, and program product

Also Published As

Publication number Publication date
CN114819141B (en) 2024-08-13

Similar Documents

Publication Publication Date Title
CN114022432B (en) Insulator defect detection method based on improved yolov5
CN108764471B (en) Neural network cross-layer pruning method based on feature redundancy analysis
CN114819141A (en) Intelligent pruning method and system for deep network compression
CN111091130A (en) Real-time image semantic segmentation method and system based on lightweight convolutional neural network
CN109242033A (en) Wafer defect method for classifying modes and device, storage medium, electronic equipment
CN112597815A (en) Synthetic aperture radar image ship detection method based on Group-G0 model
US20220398835A1 (en) Target detection system suitable for embedded device
CN111462090B (en) Multi-scale image target detection method
CN113591978B (en) Confidence penalty regularization-based self-knowledge distillation image classification method, device and storage medium
CN108090472A (en) Pedestrian based on multichannel uniformity feature recognition methods and its system again
CN113221687A (en) Training method of pressing plate state recognition model and pressing plate state recognition method
CN109446897B (en) Scene recognition method and device based on image context information
CN115048870A (en) Target track identification method based on residual error network and attention mechanism
CN112288700A (en) Rail defect detection method
CN114972753B (en) Lightweight semantic segmentation method and system based on context information aggregation and assisted learning
CN116206214A (en) Automatic landslide recognition method, system, equipment and medium based on lightweight convolutional neural network and double attention
Rui et al. Smart network maintenance in an edge cloud computing environment: An adaptive model compression algorithm based on model pruning and model clustering
CN113887330A (en) Target detection system based on remote sensing image
CN113033489B (en) Power transmission line insulator identification positioning method based on lightweight deep learning algorithm
CN116051961A (en) Target detection model training method, target detection method, device and medium
CN116386847A (en) DB-LSTM neural network-based intelligent prediction algorithm for mobile medical Internet of things spectrum
CN115147432A (en) First arrival picking method based on depth residual semantic segmentation network
Jiao et al. Realization and improvement of object recognition system on raspberry pi 3b+
CN112488291B (en) 8-Bit quantization compression method for neural network
CN114065920A (en) Image identification method and system based on channel-level pruning neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant