CN114819141A - Intelligent pruning method and system for deep network compression - Google Patents


Info

Publication number
CN114819141A
CN114819141A
Authority
CN
China
Prior art keywords: neural network, convolutional neural, filter, convolutional, training
Prior art date
Legal status
Granted
Application number
CN202210360695.9A
Other languages
Chinese (zh)
Other versions
CN114819141B (en)
Inventor
王颖
陈怡桦
李洁
王斌
胡留成
马浩中
张建龙
Current Assignee
Xidian University
Original Assignee
Xidian University
Priority date
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202210360695.9A priority Critical patent/CN114819141B/en
Publication of CN114819141A publication Critical patent/CN114819141A/en
Application granted granted Critical
Publication of CN114819141B publication Critical patent/CN114819141B/en
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N 3/063 Physical realisation using electronic means
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections


Abstract

The invention relates to an intelligent pruning method and system for deep network compression. The method comprises the following steps: acquiring a training sample set and an untrained convolutional neural network to be compressed; training the convolutional neural network with the training sample set according to a neuronal sleep-and-wake mechanism, while updating the information entropy of each filter in each convolutional layer during training, to obtain a trained convolutional neural network; and pruning the filters of each convolutional layer according to the ranking of their information entropies and a preset pruning proportion, to obtain a trained, compressed convolutional neural network. The compressed network obtained by the method avoids redundant computation when classifying security inspection images, saves computation time, and can run on platforms with limited computing resources.

Description

Intelligent pruning method and system for deep network compression
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to an intelligent pruning method and system for deep network compression.
Background
With rapid economic and social development, population mobility in China has greatly increased. To maintain public safety, security inspection must be carried out in public places such as airports, railway stations, bus stations and subway entrances to ensure that people travel safely.
At present, the most common security inspection instrument is the X-ray security inspection machine. During its use, workers must carefully examine each X-ray luggage image to judge whether it contains dangerous goods. The degree of automation of this device is low, manual inspection is costly, and misjudgments and missed detections can occur.
With the rapid development of artificial intelligence, neural network methods are widely applied in image processing, and security inspection systems based on convolutional neural network algorithms have begun to appear. The convolutional neural network is a representative network in the field of deep learning, and intelligent security inspection systems that integrate deep learning algorithms and convolutional neural networks greatly improve the degree of intelligence of security inspection devices, improve the accuracy of dangerous-goods identification, and effectively relieve the pressure on security inspection workers.
However, the structure of a convolutional neural network is complex and cannot meet the requirements of fast, real-time application, which may cause congestion at security inspection channels. In addition, convolutional neural networks generally require substantial computing resources, while in practical image classification and recognition scenarios computing resources are limited by space and cost, so convolutional neural networks cannot yet be applied to daily life on a large scale.
Disclosure of Invention
In order to solve the above problems in the prior art, the present invention provides an intelligent pruning method and system for deep network compression. The technical problem to be solved by the invention is realized by the following technical scheme:
the invention provides an intelligent pruning method for deep network compression, which comprises the following steps:
acquiring a training sample set and an untrained convolutional neural network to be compressed;
training the convolutional neural network by using the training sample set according to the sleeping and waking mechanism of the neuron, and updating the information entropy of each filter in each convolutional layer in the convolutional neural network simultaneously in the training process to obtain the trained convolutional neural network;
and performing pruning processing on the filters of each convolutional layer according to the information-entropy ranking of the filters of each convolutional layer in the trained convolutional neural network and a preset pruning proportion, to obtain a trained and compressed convolutional neural network, so that security inspection image data can be classified by using the compressed convolutional neural network.
In an embodiment of the present invention, training the convolutional neural network by using the training sample set according to a sleep and wake mechanism of a neuron, and updating an information entropy of each filter in each convolutional layer in the convolutional neural network simultaneously during a training process to obtain a trained convolutional neural network, including:
inputting the training sample set into the convolutional neural network, training the convolutional neural network alternately according to the turns of sleep stages and waking stages, updating the parameters of the convolutional neural network, and calculating the information entropy of each filter in each convolutional layer in the convolutional neural network after each turn of training;
and obtaining the trained convolutional neural network after a preset network convergence condition is reached.
In one embodiment of the present invention, training the convolutional neural network in a sleep stage, and updating parameters of the convolutional neural network comprises:
obtaining the ranking of the information entropies of the filters of each convolutional layer according to the information entropy of each filter in each convolutional layer in the convolutional neural network obtained after the previous round of waking-stage training;
for the filters of each convolutional layer, determining the filters with the number corresponding to the pruning proportion according to the sequence from small entropy to large entropy as candidate filters to be pruned corresponding to the convolutional layers;
in the training process of the current round of sleep stage, deleting the weight parameters of the candidate filter to be pruned of each convolution layer, and updating the weight parameters of the rest filters so as to realize the parameter updating of the convolution neural network.
In one embodiment of the present invention, training the convolutional neural network in the waking stage, and updating the parameters of the convolutional neural network, includes:
and in the training process of the waking stage of the current turn, updating the weight parameters of all the filters of each convolution layer so as to realize the parameter updating of the convolution neural network.
In an embodiment of the present invention, pruning each convolutional layer according to the information entropy sorting of the filter of each convolutional layer in the trained convolutional neural network and a preset pruning proportion to obtain a trained and compressed convolutional neural network, including:
obtaining the information entropy sorting of the filters of each convolutional layer according to the information entropy of each filter in each convolutional layer in the trained convolutional neural network;
for the filters of each convolutional layer, determining the filters with the number corresponding to the pruning proportion according to the sequence from small entropy to large entropy as the filters to be pruned corresponding to the convolutional layers;
and pruning the filter to be pruned of each convolutional layer to obtain the compressed convolutional neural network.
In one embodiment of the present invention, the information entropy of each filter in each convolutional layer is calculated as:

H_{i,j} = -\sum_{n=1}^{N} p_n \log p_n

p_n = \frac{1}{K_1 \times K_2} \sum_{k_1=1}^{K_1} \sum_{k_2=1}^{K_2} \mathbf{1}\left( w_{i,j}[k_1][k_2] \in [B_{n1}, B_{n2}) \right)

where H_{i,j} denotes the information entropy of the j-th filter in the i-th convolutional layer, N denotes the number of intervals into which the filter's weight range is uniformly divided, n indexes the weight intervals, p_n denotes the probability that a weight of the filter falls in the n-th weight interval, K_1 denotes the height of the filter, K_2 denotes the width of the filter, w_{i,j}[k_1][k_2] denotes the weight at position (k_1, k_2) in the j-th filter of the i-th convolutional layer, and [B_{n1}, B_{n2}) denotes the value range of the n-th weight interval.
The invention provides a compression system of a convolutional neural network for image classification, which comprises the following components:
the acquisition module is used for acquiring a training sample set and an untrained convolutional neural network to be compressed;
the network training module is used for training the convolutional neural network by using the training sample set according to the sleep and waking mechanism of the neuron, and updating the information entropy of each filter in each convolutional layer in the convolutional neural network simultaneously in the training process to obtain the trained convolutional neural network;
and the pruning module is used for carrying out pruning processing on the filter of each convolutional layer according to the information entropy sequencing of the filter of each convolutional layer in the trained convolutional neural network and a preset pruning proportion to obtain the trained and compressed convolutional neural network.
The invention provides an electronic device comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory is used for storing a computer program; and the processor is used for implementing the method steps of any of the above embodiments when executing the program stored in the memory.
The present invention provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the method steps of any of the above-mentioned embodiments.
Compared with the prior art, the invention has the beneficial effects that:
1. The intelligent pruning method for deep network compression adopts information entropy as the evaluation criterion for filters, fully considering the influence of the filter weights on the feature-expression capability of the convolutional neural network. It also introduces the concepts of sleeping and waking from neuroscience and sets a global pruning threshold, forming an intelligent structured pruning method: filters to be pruned are selected during training, and the network is pruned after training to obtain a compressed convolutional neural network. When the compressed network is used for image classification, it retains high generalization capability while the classification accuracy remains almost unchanged; the pruned network avoids redundant computation when classifying security inspection images, saves computation time, and thus avoids congestion at security inspection channels.
2. The intelligent pruning method for deep network compression compresses the deep convolutional neural network so that the compressed network occupies less memory and consumes fewer computing resources. With its lower requirements on computing resources and memory, it can run in a wide range of image classification scenarios, in particular on embedded platforms and mobile devices with limited computing resources.
The foregoing is only an overview of the technical solutions of the present invention. In order to make the technical means of the invention clearly understood and implementable in accordance with this description, and to make the above and other objects, features and advantages of the invention more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
Fig. 1 is a schematic diagram of an intelligent pruning method for deep network compression according to an embodiment of the present invention;
FIG. 2 is a flowchart of an intelligent pruning method for deep network compression according to an embodiment of the present invention;
fig. 3 is a block diagram of an intelligent pruning system for deep network compression according to an embodiment of the present invention.
Detailed Description
To further illustrate the technical means adopted by the present invention to achieve the intended object and their effects, an intelligent pruning method and system for deep network compression according to the invention are described in detail below with reference to the accompanying drawings and specific embodiments.
The foregoing and other technical contents, features and effects of the present invention will be more clearly understood from the following detailed description of the embodiments taken in conjunction with the accompanying drawings. The technical means and effects of the present invention adopted to achieve the predetermined purpose can be more deeply and specifically understood through the description of the specific embodiments, however, the attached drawings are provided for reference and description only and are not used for limiting the technical scheme of the present invention.
It should be noted that the intelligent pruning method and system for deep network compression provided by the embodiments of the present invention are suitable for compressing a convolutional neural network for image classification, for example, suitable for security inspection image detection based on the convolutional neural network, where the convolutional neural network for image classification includes, but is not limited to, a CNN series.
Example one
The convolutional neural network is a representative neural network in the field of deep learning, and an important basis for the breakthrough achievements of deep learning in computer vision. As task complexity grows, the number of layers of a convolutional neural network increases and so does the model size. Current mainstream deep neural networks have millions of parameters, which multiplies the difficulty of model training. In order to deploy trained deep convolutional neural networks on Internet-of-Things and edge devices and achieve real-time, fast inference, the networks need to be compressed so as to effectively reduce the memory occupied by the model and the energy consumed by model inference.
Referring to fig. 1 and fig. 2 in combination, fig. 1 is a schematic diagram of an intelligent pruning method for deep network compression according to an embodiment of the present invention; fig. 2 is a flowchart of an intelligent pruning method for deep network compression according to an embodiment of the present invention. As shown in the figure, the intelligent pruning method for deep network compression in this embodiment includes:
step 1: acquiring a training sample set and an untrained convolutional neural network to be compressed;
in an embodiment, the training sample set includes several training samples, the training samples are security inspection images attached with class labels, and the untrained convolutional neural network to be compressed is a convolutional neural network for image classification.
Step 2: training the convolutional neural network by utilizing a training sample set according to a sleeping and waking mechanism of a neuron, and updating the information entropy of each filter in each convolutional layer in the convolutional neural network simultaneously in the training process to obtain the trained convolutional neural network;
specifically, step 2 comprises:
step 2.1: inputting a training sample set into a convolutional neural network, training the convolutional neural network alternately according to turns of sleep stages and waking stages, updating parameters of the convolutional neural network, and calculating the information entropy of each filter in each convolutional layer in the convolutional neural network after each turn of training;
wherein, training the convolutional neural network in the sleep stage, and updating the parameters of the convolutional neural network, comprises:
step a: obtaining the ranking of the information entropies of the filters of each convolutional layer according to the information entropy of each filter in each convolutional layer of the convolutional neural network obtained after the previous round of waking-stage training;
step b: for the filters of each convolutional layer, determining the filters with the number corresponding to the pruning proportion according to the sequence from small entropy to large entropy as candidate filters to be pruned corresponding to the convolutional layers;
step c: in the training process of the current round of sleep stage, the weight parameters of the candidate filters to be pruned of each convolution layer are deleted, and the weight parameters of the rest filters are updated, so that the parameter update of the convolution neural network is realized.
Further, training the convolutional neural network in the waking stage, and updating parameters of the convolutional neural network, including: and in the training process of the waking stage of the current turn, updating the weight parameters of all the filters of each convolution layer so as to realize the parameter updating of the convolution neural network.
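As an illustrative, non-authoritative sketch of the alternating updates above (plain NumPy standing in for a real training framework; the 8-filter layer shape, the 0.25 pruning threshold, and the random stand-in for a gradient step are all invented for demonstration), one sleep round followed by one waking round might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def filter_entropy(f, n_bins=10):
    # Shannon entropy of a filter's weight histogram over n_bins
    # uniform intervals between its min and max weight.
    counts, _ = np.histogram(f, bins=n_bins)
    p = counts / f.size
    p = p[p > 0]  # convention: 0 * log 0 = 0
    return float(-(p * np.log(p)).sum())

filters = rng.normal(size=(8, 3, 3))  # one layer: 8 filters of 3x3 (made-up shape)
prune_ratio = 0.25                    # global pruning threshold (illustrative)

# Sleep stage: rank filters by entropy, delete (zero) the weights of the
# lowest-entropy candidates, and update only the remaining filters.
entropies = np.array([filter_entropy(f) for f in filters])
n_prune = int(prune_ratio * len(filters))
candidates = np.argsort(entropies)[:n_prune]
mask = np.ones(len(filters), dtype=bool)
mask[candidates] = False
filters[~mask] = 0.0                                          # transient forgetting
filters[mask] -= 0.01 * rng.normal(size=filters[mask].shape)  # stand-in for a gradient step

# Waking stage: every filter is active again and all weights are updated.
filters -= 0.01 * rng.normal(size=filters.shape)
```

In the actual method the updates come from backpropagation on the training sample set, and sleep and waking rounds alternate over the whole training schedule.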
Each filter in a convolutional neural network can be regarded as a separate feature-extraction unit, and the information entropy of its weights reflects how the filter is constructed and whether it attends to and extracts salient features of the image. In this embodiment, information entropy is used to analyze the weight complexity of a single filter, and the filter's information entropy serves as the theoretical basis for evaluating its importance (the larger a filter's information entropy, the more important the filter). In this way, the feature-expression capability of the convolutional neural network is preserved as much as possible during structured pruning, while redundant network parameters are deleted.
Specifically, in the present embodiment, the information entropy of each filter in each convolutional layer is calculated as:

H_{i,j} = -\sum_{n=1}^{N} p_n \log p_n    (1)

where H_{i,j} denotes the information entropy of the j-th filter in the i-th convolutional layer, N denotes the number of intervals into which the filter's weight range is uniformly divided, n indexes the weight intervals, and p_n denotes the probability that a weight of the filter falls in the n-th weight interval.
When the information entropy of a single filter is calculated, the weight range of the filter is uniformly divided into N intervals by taking the maximum weight and the minimum weight of the filter as references, the probability that each weight interval contains the weight of the filter is calculated, and finally the information entropy of the single filter is calculated according to the probability that the weight of the filter appears in the weight interval.
The size of a single weight interval is:

scale = \frac{w_{i,j}[k_1][k_2]_{\max} - w_{i,j}[k_1][k_2]_{\min}}{N}    (2)

The value range [B_{n1}, B_{n2}) of the n-th weight interval is computed as:

B_{n1} = w_{i,j}[k_1][k_2]_{\min} + (n-1) \times scale    (3)

B_{n2} = w_{i,j}[k_1][k_2]_{\min} + n \times scale    (4)

where 1 \le k_1 \le K_1 and K_1 denotes the height of the filter, 1 \le k_2 \le K_2 and K_2 denotes the width of the filter, w_{i,j}[k_1][k_2] denotes the weight at position (k_1, k_2) in the j-th filter of the i-th convolutional layer, and w_{i,j}[k_1][k_2]_{\max} and w_{i,j}[k_1][k_2]_{\min} denote the maximum and minimum weights of the current filter.

According to equations (2)-(4), the probability that a weight of the filter falls in each interval can be calculated as:

p_n = \frac{1}{K_1 \times K_2} \sum_{k_1=1}^{K_1} \sum_{k_2=1}^{K_2} \mathbf{1}\left( w_{i,j}[k_1][k_2] \in [B_{n1}, B_{n2}) \right)    (5)
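The calculation in equations (1)-(5) can be sketched in NumPy as follows (the two example filters are invented to show the behavior; `np.linspace` is used for the interval edges so that the endpoint arithmetic of equations (3)-(4) is exact in floating point):

```python
import numpy as np

def filter_entropy(w, N=10):
    """Information entropy of one filter per equations (1)-(5).
    w: the filter's weight array; N: number of uniform weight intervals."""
    w = np.asarray(w, dtype=float).ravel()
    w_min, w_max = w.min(), w.max()
    if w_max == w_min:
        return 0.0  # degenerate filter: every weight falls in one interval
    # eqs. (2)-(4): N uniform intervals [B_n1, B_n2) of size (w_max - w_min) / N
    edges = np.linspace(w_min, w_max, N + 1)
    counts, _ = np.histogram(w, bins=edges)
    p = counts / w.size                   # eq. (5): fraction of weights per interval
    p = p[p > 0]                          # convention: 0 * log 0 = 0
    return float(-(p * np.log(p)).sum())  # eq. (1)

# Weights concentrated in one interval give low entropy; weights spread
# over many intervals give high entropy (here log 9, with 9 occupied bins).
clustered = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 9.0]])
spread = np.arange(9.0).reshape(3, 3)
```

This matches the intuition stated above: a filter whose weights cluster in few intervals carries little information and becomes a pruning candidate.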
step 2.2: and obtaining the trained convolutional neural network after the preset network convergence condition is reached.
When the convolutional neural network reaches the total number of the set training rounds or the network loss function value reaches a preset threshold value in the training process, the network is considered to be converged, and the convolutional neural network training is completed.
It should be noted that the pruning proportion may be selected by using a global threshold or a local threshold. In this embodiment, a global threshold is adopted as the pruning proportion, that is, a uniform threshold is set for the whole convolutional neural network as the pruning proportion, and the pruning proportion setting mode is simple and convenient to control. In other embodiments, the pruning proportion may also be selected by using a local threshold, that is, a threshold is set for each of the different convolutional layers in the convolutional neural network as the pruning proportion of the layer, but the selection of the local threshold requires a large amount of experiments and rich experience intervention.
Step 3: pruning the filters of each convolutional layer according to the information-entropy ranking of the filters in the trained convolutional neural network and the preset pruning proportion, to obtain a trained, compressed convolutional neural network, which is then used to classify security inspection image data.
Specifically, step 3 comprises:
step 3.1: obtaining the information entropy sorting of the filters of each convolutional layer according to the information entropy of each filter in each convolutional layer in the trained convolutional neural network;
step 3.2: for the filters of each convolutional layer, determining the filters with the number corresponding to the pruning proportion according to the sequence from small entropy to large entropy as the filters to be pruned corresponding to the convolutional layers;
step 3.3: and pruning the filter to be pruned of each convolutional layer to obtain the compressed convolutional neural network.
In this embodiment, a neuronal sleep-and-wake mechanism is introduced into convolutional neural network training. In the sleep stage, the importance of the filters in each layer is evaluated according to the proposed information-entropy criterion: the filters are ranked from small to large by information entropy, the number of filters corresponding to the preset pruning proportion is determined, the weight parameters of these unimportant filters are deleted to realize transient forgetting, and the weight parameters of the remaining, important filters are updated, so that the model is maintained in a low-energy state during the sleep stage.
In the waking stage, all filters are awakened, kept active and participate in training, receiving new memory updates, so that the model is maintained in a high-energy state. Through alternating iterative training of the sleep and waking stages, aided by a loss function with a sparsifying effect in both stages, the importance of the filters in the trained convolutional neural network becomes clearly differentiated. The information entropies of the filters of each convolutional layer are then ranked, and the number of filters corresponding to the preset pruning proportion (i.e., the batch of filters with the smallest information entropy) is permanently pruned, yielding the trained and compressed convolutional neural network.
The compressed convolutional neural network obtained by this intelligent pruning method maintains the model's classification performance while retaining high generalization capability. Its parameter count and number of floating-point operations are smaller than those of the original network; when used to classify security inspection images, it avoids redundant computation, saves computation time, and prevents congestion at security inspection channels.
In addition, the traditional structured pruning method requires three steps (pre-training, pruning and fine-tuning): the original network is fully trained to obtain high performance, its filters are pruned based on weight-norm magnitudes and empirically set per-layer pruning proportions, and the pruned model is finally retrained to restore its performance as far as possible. By contrast, the compressed convolutional neural network of this embodiment is already trained and can be used directly for subsequent image classification without secondary training.
The intelligent pruning method for deep network compression of this embodiment compresses the deep convolutional neural network so that the compressed network occupies less memory and consumes fewer computing resources. With its lower requirements on computing resources and memory, it can run in a wide range of application scenarios including but not limited to image classification, in particular on embedded platforms and mobile devices with limited computing resources.
Fig. 3 is a structural block diagram of an intelligent pruning system for depth network compression according to an embodiment of the present invention, and as shown in the drawing, the intelligent pruning system for depth network compression according to the embodiment includes: the system comprises an acquisition module, a network training module and a pruning module, wherein the acquisition module is used for acquiring a training sample set and an untrained convolutional neural network to be compressed; the network training module is used for training the convolutional neural network by utilizing a training sample set according to the sleeping and waking mechanism of the neuron, and simultaneously updating the information entropy of each filter in each convolutional layer in the convolutional neural network in the training process to obtain the trained convolutional neural network; and the pruning module is used for carrying out pruning processing on the filter of each convolutional layer according to the information entropy sequencing of the filter of each convolutional layer in the trained convolutional neural network and the preset pruning proportion to obtain the trained and compressed convolutional neural network. In this embodiment, the compressed convolutional neural network may be applied to a security inspection apparatus to implement classification of security inspection image data.
The intelligent pruning system for deep network compression provided by the embodiment of the invention can implement the method embodiment, the implementation principle and the technical effect are similar, and the details are not repeated herein.
Based on the same inventive concept, the embodiment of the invention also provides electronic equipment, which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete mutual communication through the communication bus; the memory is used for storing computer programs; the processor is used for implementing any one of the above method steps of the intelligent pruning method for deep network compression when executing the program stored in the memory.
An embodiment of the invention further provides a computer-readable storage medium in which a computer program is stored; when executed by a processor, the computer program carries out the steps of any of the above intelligent pruning methods for deep network compression.
Example two
This embodiment demonstrates the effect of the intelligent pruning method and system for deep network compression of the present invention through a specific simulation experiment.
1. Simulation conditions
In this embodiment, the simulation experiment was implemented in Python on a PC with an Intel(R) Core(TM) i7-9700K CPU @ 3.60 GHz, 32 GB of memory, a single NVIDIA GeForce RTX 2070 GPU, CUDA version 11.3 and the Windows 10 operating system.
2. Content of simulation experiment
The data used in this embodiment are the 60,000 three-channel color images of CIFAR10, covering 10 classes with 5,000 training images and 1,000 test images per class. When evaluating filter importance, the number of weight intervals N is 10 and the pruning ratio T is 0.6. The training optimizer is mini-batch SGD with momentum and weight decay: the learning rate is set to 0.1, the momentum to 0.9, the weight decay to 0.0005 and the batch size to 128. The total number of training epochs is 200, with the learning rate adjusted to 0.02, 0.004 and 0.0008 at epochs 60, 120 and 160, respectively. The sleep duration and the wake duration are each a single epoch, and the sleep and wake stages alternate. The classification performance of the convolutional neural network is tested after training: the initial deep convolutional neural network (ResNet56) achieves a highest classification accuracy of 93.17%, while the network compressed by the present method achieves 93.11%, a drop of only 0.06 percentage points; compared with the original network, the floating-point operations (FLOPs) of the compressed network are reduced by 52.63%.
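The learning-rate schedule reported above can be expressed as a small helper. This is only an illustrative sketch of the stated settings (initial rate 0.1 multiplied by a factor of 0.2 at epochs 60, 120 and 160, giving 0.02, 0.004 and 0.0008); the function name is my own, not part of the patent.

```python
def learning_rate(epoch):
    """Piecewise-constant schedule matching the reported settings:
    0.1 initially, reduced by a factor of 5 (gamma = 0.2) at
    epochs 60, 120 and 160."""
    lr = 0.1
    for milestone in (60, 120, 160):
        if epoch >= milestone:
            lr *= 0.2
    return lr
```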
It should be noted that, in this document, the terms "comprises", "comprising" or any other variation thereof are intended to cover a non-exclusive inclusion, so that an article or apparatus comprising a series of elements includes not only those elements but also other elements not explicitly listed. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the article or device comprising that element. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments, and it is not intended that the invention be limited to these specific details. Those skilled in the art to which the invention pertains may make several simple deductions or substitutions without departing from the spirit of the invention, and all such modifications shall be considered to fall within the protection scope of the invention.

Claims (9)

1. An intelligent pruning method for deep network compression, comprising:
acquiring a training sample set and an untrained convolutional neural network to be compressed;
training the convolutional neural network by using the training sample set according to the sleep and wake mechanism of neurons, and updating the information entropy of each filter in each convolutional layer of the convolutional neural network simultaneously during training, to obtain a trained convolutional neural network;
and performing pruning processing on the filter of each convolutional layer according to the information entropy sequencing of the filter of each convolutional layer in the trained convolutional neural network and a preset pruning proportion to obtain a trained and compressed convolutional neural network, so as to classify the security inspection image data by using the compressed convolutional neural network.
2. The intelligent pruning method for deep network compression according to claim 1, wherein the convolutional neural network is trained by using the training sample set according to the sleep and wake mechanism of neurons, and the information entropy of each filter in each convolutional layer in the convolutional neural network is updated simultaneously in the training process to obtain the trained convolutional neural network, which includes:
inputting the training sample set into the convolutional neural network, training the convolutional neural network in alternating rounds of sleep stages and waking stages, updating the parameters of the convolutional neural network, and calculating the information entropy of each filter in each convolutional layer of the convolutional neural network after each round of training;
and obtaining the trained convolutional neural network after a preset network convergence condition is reached.
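The alternating training procedure of claim 2 can be sketched as a skeleton loop. This is an illustrative sketch only, not the patented implementation: the four callables (`train_epoch`, `compute_entropies`, `select_candidates`) and all names are placeholders I introduce for illustration.

```python
def train_with_sleep_wake(num_epochs, train_epoch, compute_entropies,
                          select_candidates):
    """Alternate wake and sleep epochs; after every epoch, recompute
    per-filter entropies and pick the tentative filters to silence
    during the next sleep epoch. Returns the phase sequence."""
    candidates = []   # filters tentatively pruned during sleep
    phases = []
    for epoch in range(num_epochs):
        phase = "wake" if epoch % 2 == 0 else "sleep"
        # In the wake phase all filters train; in the sleep phase the
        # current candidates are silenced.
        train_epoch(phase, candidates if phase == "sleep" else [])
        entropies = compute_entropies()
        candidates = select_candidates(entropies)
        phases.append(phase)
    return phases
```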
3. The intelligent pruning method for deep network compression according to claim 2, wherein training the convolutional neural network in a sleep stage, and updating parameters of the convolutional neural network comprises:
obtaining the information entropy ranking of the filters of each convolutional layer according to the information entropy of each filter in each convolutional layer of the convolutional neural network obtained after the previous round of waking-phase training;
for the filters of each convolutional layer, selecting, in ascending order of information entropy, a number of filters corresponding to the pruning proportion as the candidate filters to be pruned for that convolutional layer;
in the training process of the current round of the sleep stage, deleting the weight parameters of the candidate filters to be pruned of each convolutional layer and updating the weight parameters of the remaining filters, so as to realize the parameter updating of the convolutional neural network.
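A minimal sketch of one sleep-stage update for a single convolutional layer, assuming plain SGD on a per-layer weight tensor of shape (num_filters, ...). All names are my own and this is not the patented code: the candidate filters' weights are cleared and excluded from the update, while the remaining filters are updated normally.

```python
import numpy as np

def sleep_phase_step(weights, grads, candidate_idx, lr=0.1):
    """One sleep-phase SGD step for a conv layer.

    weights, grads: arrays of shape (num_filters, ...).
    candidate_idx: indices of filters provisionally pruned this epoch.
    """
    w = weights.copy()
    w[candidate_idx] = 0.0                 # "delete" the candidates' weights
    mask = np.ones(len(w), dtype=bool)
    mask[candidate_idx] = False
    w[mask] -= lr * grads[mask]            # update only the surviving filters
    return w
```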
4. The intelligent pruning method for deep network compression according to claim 2, wherein the training of the convolutional neural network in the awake phase, and the updating of the parameters of the convolutional neural network, comprise:
and in the training process of the waking stage of the current turn, updating the weight parameters of all the filters of each convolution layer so as to realize the parameter updating of the convolution neural network.
5. The intelligent pruning method for deep network compression according to claim 1, wherein the pruning is performed on each convolutional layer according to the information entropy sorting of the filter of each convolutional layer in the trained convolutional neural network and a preset pruning proportion, so as to obtain the trained and compressed convolutional neural network, and the method comprises the following steps:
obtaining the information entropy sorting of the filters of each convolutional layer according to the information entropy of each filter in each convolutional layer in the trained convolutional neural network;
for the filters of each convolutional layer, selecting, in ascending order of information entropy, a number of filters corresponding to the pruning proportion as the filters to be pruned for that convolutional layer;
and pruning the filter to be pruned of each convolutional layer to obtain the compressed convolutional neural network.
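The ranking-and-selection step above can be sketched as follows. This is an illustrative sketch, not the patented implementation; the function name is an assumption of mine. Given the per-filter entropies of one layer and the pruning ratio, it returns the indices of the filters with the smallest information entropy.

```python
import numpy as np

def select_filters_to_prune(entropies, prune_ratio):
    """Indices of filters to prune in one conv layer: the prune_ratio
    fraction with the smallest information entropy (least informative)."""
    num_prune = int(len(entropies) * prune_ratio)
    order = np.argsort(entropies)          # ascending entropy
    return sorted(order[:num_prune].tolist())
```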
6. The intelligent pruning method for deep network compression according to claim 1, wherein the calculation formula of the information entropy of each filter in each convolutional layer is as follows:
H_{i,j} = -\sum_{n=1}^{N} p_n \log p_n

p_n = \frac{1}{K_1 K_2} \sum_{k_1=1}^{K_1} \sum_{k_2=1}^{K_2} \mathbf{1}\big( w_{i,j}[k_1][k_2] \in [B_{n1}, B_{n2}) \big)

in the formula, H_{i,j} denotes the information entropy of the jth filter in the ith convolutional layer; N denotes the number of intervals into which the weight range of the filter is divided uniformly; n indexes the nth weight interval of the filter; p_n denotes the probability that a weight of the filter falls in the nth weight interval; K_1 denotes the height of the filter and K_2 its width; w_{i,j}[k_1][k_2] denotes the weight at height position k_1 and width position k_2 in the jth filter of the ith convolutional layer; \mathbf{1}(\cdot) denotes the indicator function; and [B_{n1}, B_{n2}) denotes the value range of the nth weight interval.
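A minimal Python sketch of this per-filter entropy, using numpy's histogram for the uniform division of the weight range into N intervals. The function name is my own and the natural logarithm is assumed, since the claim does not fix the base.

```python
import numpy as np

def filter_entropy(w, num_bins=10):
    """Information entropy of one filter's weights: the weight range is
    divided uniformly into num_bins intervals, p_n is the fraction of
    the K1*K2 weights falling into interval n, and H = -sum p_n log p_n."""
    flat = np.asarray(w).reshape(-1)
    counts, _ = np.histogram(flat, bins=num_bins)
    p = counts / flat.size
    p = p[p > 0]                           # convention: 0 * log(0) = 0
    return float(-np.sum(p * np.log(p)))
```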
7. A compression system for a convolutional neural network for image classification, comprising:
the acquisition module is used for acquiring a training sample set and an untrained convolutional neural network to be compressed;
the network training module is used for training the convolutional neural network by using the training sample set according to the sleep and waking mechanism of the neuron, and updating the information entropy of each filter in each convolutional layer in the convolutional neural network simultaneously in the training process to obtain the trained convolutional neural network;
and the pruning module is used for carrying out pruning processing on the filter of each convolutional layer according to the information entropy sequencing of the filter of each convolutional layer in the trained convolutional neural network and a preset pruning proportion to obtain the trained and compressed convolutional neural network.
8. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory is used for storing a computer program; and the processor is used for implementing the method steps of any one of claims 1-6 when executing the program stored in the memory.
9. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 6.
CN202210360695.9A 2022-04-07 2022-04-07 Intelligent pruning method and system for deep network compression Active CN114819141B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210360695.9A CN114819141B (en) 2022-04-07 2022-04-07 Intelligent pruning method and system for deep network compression


Publications (2)

Publication Number Publication Date
CN114819141A true CN114819141A (en) 2022-07-29
CN114819141B CN114819141B (en) 2024-08-13

Family ID: 82534838

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210360695.9A Active CN114819141B (en) 2022-04-07 2022-04-07 Intelligent pruning method and system for deep network compression

Country Status (1)

Country Link
CN (1) CN114819141B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110084281A (en) * 2019-03-31 2019-08-02 华为技术有限公司 Image generating method, the compression method of neural network and relevant apparatus, equipment
CN111612143A (en) * 2020-05-22 2020-09-01 中国科学院自动化研究所 Compression method and system of deep convolutional neural network
CN113657595A (en) * 2021-08-20 2021-11-16 中国科学院计算技术研究所 Neural network real-time pruning method and system and neural network accelerator


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LAI Yejing; HAO Shanfeng; HUANG Dingjiang: "Deep neural network model compression methods and progress", Journal of East China Normal University (Natural Science Edition), no. 05, 25 September 2020 (2020-09-25), pages 77-91 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117217281A (en) * 2023-09-18 2023-12-12 华中科技大学 Convolutional neural network lightweight pruning method and system based on multi-view features
CN116992946A (en) * 2023-09-27 2023-11-03 荣耀终端有限公司 Model compression method, apparatus, storage medium, and program product
CN116992946B (en) * 2023-09-27 2024-05-17 荣耀终端有限公司 Model compression method, apparatus, storage medium, and program product

Also Published As

Publication number Publication date
CN114819141B (en) 2024-08-13

Similar Documents

Publication Publication Date Title
CN114022432B (en) Insulator defect detection method based on improved yolov5
CN108764471B (en) Neural network cross-layer pruning method based on feature redundancy analysis
CN114819141A (en) Intelligent pruning method and system for deep network compression
CN111091130A (en) Real-time image semantic segmentation method and system based on lightweight convolutional neural network
CN109242033A (en) Wafer defect method for classifying modes and device, storage medium, electronic equipment
CN112597815A (en) Synthetic aperture radar image ship detection method based on Group-G0 model
US20220398835A1 (en) Target detection system suitable for embedded device
CN111462090B (en) Multi-scale image target detection method
CN113591978B (en) Confidence penalty regularization-based self-knowledge distillation image classification method, device and storage medium
CN108090472A (en) Pedestrian based on multichannel uniformity feature recognition methods and its system again
CN113221687A (en) Training method of pressing plate state recognition model and pressing plate state recognition method
CN109446897B (en) Scene recognition method and device based on image context information
CN115048870A (en) Target track identification method based on residual error network and attention mechanism
CN112288700A (en) Rail defect detection method
CN114972753B (en) Lightweight semantic segmentation method and system based on context information aggregation and assisted learning
CN116206214A (en) Automatic landslide recognition method, system, equipment and medium based on lightweight convolutional neural network and double attention
Rui et al. Smart network maintenance in an edge cloud computing environment: An adaptive model compression algorithm based on model pruning and model clustering
CN113887330A (en) Target detection system based on remote sensing image
CN113033489B (en) Power transmission line insulator identification positioning method based on lightweight deep learning algorithm
CN116051961A (en) Target detection model training method, target detection method, device and medium
CN116386847A (en) DB-LSTM neural network-based intelligent prediction algorithm for mobile medical Internet of things spectrum
CN115147432A (en) First arrival picking method based on depth residual semantic segmentation network
Jiao et al. Realization and improvement of object recognition system on raspberry pi 3b+
CN112488291B (en) 8-Bit quantization compression method for neural network
CN114065920A (en) Image identification method and system based on channel-level pruning neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant