CN112101547A - Pruning method and device for network model, electronic equipment and storage medium - Google Patents


Info

Publication number: CN112101547A (application CN202010964152.9A; granted publication CN112101547B)
Authority: CN (China)
Prior art keywords: network model, pruning, parameter, current network, training image
Legal status: Granted; currently Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 谷宇章, 邱守猛, 袁泽强, 张晓林
Current and original assignee: Shanghai Institute of Microsystem and Information Technology of CAS (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Application: filed by Shanghai Institute of Microsystem and Information Technology of CAS under application number CN202010964152.9A
Priority: CN202010964152.9A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present application disclose a pruning method and apparatus for a network model, an electronic device, and a storage medium. The method comprises: obtaining a training image set and a current network model; inputting the training images into the current network model and determining, according to the output of the current network model, a parameter corresponding to each of a plurality of convolutional layers; performing attenuation processing on the parameter corresponding to each convolutional layer based on the preset pruning rate of that layer to obtain an attenuation parameter; and, if the difference between the attenuation parameter and a preset threshold is within a preset interval, removing the parameter corresponding to the attenuation parameter from the convolutional layer to obtain the pruned network model. By attenuating the parameters of the convolutional layers, the knowledge learned by the parameters about to be removed is forced to transfer to the remaining parameters, so the number of parameters is reduced without increasing the training burden, and the recognition accuracy of the network model is ensured.

Description

Pruning method and device for network model, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of deep learning, in particular to a pruning method and device for a network model, electronic equipment and a storage medium.
Background
In recent years, as deep learning technology has developed, network models have become increasingly complex and their parameter counts ever larger, which imposes a heavy computational burden on practical applications of deep learning models. Research has shown, however, that trained network models often contain many redundant parameters, and that after these parameters are removed and the model is given some fine-tuning, the model can recover its original performance. For example, ResNet-50 has 50 convolutional layers and the whole model requires about 95 MB of storage space, yet it still works well after 75% of its parameters are removed, and its running time can be reduced by as much as 50%. Removing the redundant parameters in a network model, that is, pruning the network model, therefore makes the model lightweight and easy to deploy and apply in real scenarios.
In the prior art, pruning of a network model falls mainly into two categories: sparsification during training and pruning after training. Sparsification during training applies a sparsity constraint to the parameters or structure of the model while it is being trained, yielding sparse parameters or a sparse structure; this reduces the model size, cuts the time consumed by inference, and speeds inference up. Pruning after training deletes unimportant weights from an already trained model to make the network sparse and compact. However, with pruning after training, the accuracy of the network model often drops once the unimportant weights are deleted, so the pruned model generally needs to be fine-tuned to restore its performance.
Disclosure of Invention
The embodiments of the present application provide a pruning method and apparatus for a network model, an electronic device, and a storage medium, which can reduce the number of parameters without increasing the training burden while ensuring the recognition accuracy of the network model.
The embodiment of the application provides a pruning method for a network model, which comprises the following steps:
acquiring a training image set and a current network model; the current network model comprises a plurality of convolutional layers;
pruning the current network model based on the training image set to obtain a pruned network model;
wherein the pruning processing comprises the following steps:
inputting the training image into a current network model, and determining a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter;
and if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, removing the parameter corresponding to the attenuation parameter in the convolutional layer to obtain the pruned network model.
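Read as an algorithm, the three claimed steps can be sketched as follows. This is only an illustration of that reading, not the patented implementation: the use of NumPy, the L1-norm channel importance, and the values of the decay coefficient, threshold, and interval are all assumptions introduced here.

```python
import numpy as np

def pruning_step(weights, prune_rate, decay=0.9, threshold=0.0, interval=1e-3):
    """One pruning pass over per-layer convolution weights.

    weights    : list of np.ndarray, one (out_ch, in_ch, kH, kW) tensor per layer
    prune_rate : preset pruning rate, the fraction of channels to attenuate
    decay, threshold, interval : illustrative values (assumptions)
    """
    pruned = []
    for w in weights:
        # A per-channel "parameter" stands in for the one determined from the
        # model output; here the L1 norm of each output channel (assumption).
        importance = np.abs(w).reshape(w.shape[0], -1).sum(axis=1)

        # Attenuate the parameters of the lowest-ranked channels at the
        # preset pruning rate to obtain the attenuation parameters.
        n_target = int(round(prune_rate * w.shape[0]))
        target = np.argsort(importance)[:n_target]
        w = w.copy()
        w[target] *= decay

        # If the difference between the attenuation parameter and the preset
        # threshold is within the preset interval, remove (zero) the channel.
        decayed = np.abs(w).reshape(w.shape[0], -1).sum(axis=1)
        removable = [c for c in target if abs(decayed[c] - threshold) < interval]
        w[removable] = 0.0
        pruned.append(w)
    return pruned
```

Channels whose attenuated parameter has not yet reached the interval around the threshold are merely shrunk, so the removal happens gradually over repeated passes rather than in a single hard cut.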
Further, after obtaining the pruned network model, the method further includes:
and re-determining the pruned network model as the current network model, and returning to execute the step of pruning the current network model based on the training image set to obtain the pruned network model.
Further, inputting the training image into the current network model, and determining a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to the output of the current network model, including:
inputting the training image into a current network model, and determining a feature atlas output by each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
determining parameters corresponding to the feature map set output by each convolution layer;
and determining the corresponding parameters of each convolution layer according to the preset mapping relation and the parameters corresponding to the feature map set output by each convolution layer.
Further, based on the preset pruning rate corresponding to each convolution layer, performing attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter, including:
determining a target feature map set from the feature map set output by each convolutional layer based on the corresponding preset pruning rate of each convolutional layer; the ratio of the number of channels of the target feature map set to the number of channels of the feature map set is a preset pruning rate;
determining parameters corresponding to the target feature atlas;
performing attenuation processing on parameters corresponding to the target feature map set based on a preset coefficient to obtain transition parameters;
and determining attenuation parameters corresponding to each convolution layer according to the preset mapping relation and the transition parameters.
Further, after obtaining the pruned network model, the method further includes:
re-determining the pruned network model as the current network model;
and training the current network model by using the training image set to obtain the trained network model.
Further, training the current network model by using the training image set to obtain a trained network model, including:
inputting the training image into a current network model, and determining parameter sets corresponding to a plurality of convolutional layers according to the output of the current network model;
determining parameter sets to be pruned from the parameter sets corresponding to the plurality of convolutional layers, and determining parameters except the parameter sets to be pruned in the parameter sets corresponding to the plurality of convolutional layers as parameter sets to be updated;
and suspending updates to the parameter set to be pruned while updating the parameter set to be updated, to obtain the trained network model.
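The paused-update training described above can be illustrated with a single gradient step in which the to-be-pruned parameter set is masked out. The SGD form, the boolean-mask representation, and the learning rate are assumptions for illustration only.

```python
import numpy as np

def masked_sgd_update(params, grads, prune_mask, lr=0.01):
    """One SGD step in which updates to the to-be-pruned set are paused.

    params     : np.ndarray of weights (a convolutional layer's parameter set)
    grads      : np.ndarray of the same shape, gradient of the loss
    prune_mask : boolean np.ndarray, True where a parameter belongs to the
                 parameter set to be pruned (its update is suspended)
    lr         : learning rate (illustrative value)
    """
    update = lr * grads
    update[prune_mask] = 0.0   # pause updating the parameter set to be pruned
    return params - update     # update only the parameter set to be updated
```

Freezing the to-be-pruned set this way lets the remaining parameters absorb the training signal, which matches the stated goal of transferring knowledge away from the parameters about to be removed.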
Correspondingly, the embodiment of the present application further provides a pruning device for a network model, and the pruning device includes:
the acquisition module is used for acquiring a training image set and a current network model; the current network model comprises a plurality of convolutional layers;
the pruning module is used for carrying out pruning processing on the current network model based on the training image set to obtain a pruned network model;
wherein the pruning processing comprises the following steps:
inputting the training image into a current network model, and determining a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter;
and if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, removing the parameter corresponding to the attenuation parameter in the convolutional layer to obtain the pruned network model.
Further, the pruning module comprises:
the determining module is used for inputting the training image into the current network model and determining the corresponding parameter of each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
the attenuation module is used for carrying out attenuation processing on the parameters corresponding to each convolution layer based on the preset pruning rate corresponding to each convolution layer to obtain attenuation parameters;
and the rejecting module is used for rejecting parameters corresponding to the attenuation parameters in the convolutional layer to obtain the pruned network model if the difference value of the attenuation parameters and the preset threshold value is within a preset interval.
Accordingly, an embodiment of the present application further provides an electronic device, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or a set of instructions, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by the processor to implement the pruning method for the network model.
Accordingly, an embodiment of the present application further provides a computer-readable storage medium, where at least one instruction, at least one program, a code set, or a set of instructions is stored in the storage medium, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the pruning method for the network model.
The embodiment of the application has the following beneficial effects:
the pruning method specifically comprises the steps of obtaining a training image set and a current network model, wherein the current network model comprises a plurality of convolutional layers, and carrying out pruning processing on the current network model based on the training image set to obtain a pruned network model; the pruning processing step comprises the steps of inputting a training image into a current network model, determining a parameter corresponding to each convolutional layer in a plurality of convolutional layers according to the output of the current network model, carrying out attenuation processing on the parameter corresponding to each convolutional layer based on a preset pruning rate corresponding to each convolutional layer to obtain an attenuation parameter, and if the difference value of the attenuation parameter and a preset threshold value is within a preset interval, rejecting the parameter corresponding to the attenuation parameter in the convolutional layer to obtain a pruned network model. Based on the embodiment of the application, attenuation processing is carried out on the parameters corresponding to the convolutional layers, so that the knowledge of parameter learning corresponding to the convolutional layers with the parameters to be eliminated is forced to be transferred, the number of the parameters is reduced, the training burden is not increased, and the identification accuracy of the network model can be ensured.
Drawings
In order to more clearly illustrate the technical solutions and advantages of the embodiments of the present application or of the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application;
fig. 2 is a schematic flowchart of a pruning method for a network model according to an embodiment of the present application;
FIG. 3 is a flow chart illustrating the steps of a pruning process provided by an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating a method for determining a% of feature maps in a feature map set output by each convolutional layer as a target feature map set according to an embodiment of the present application;
fig. 5 is a data comparison graph of the recognition accuracy on the training image set of the AlexNet model before pruning and after pruning, where pruning with a preset pruning rate of 80% is performed at both the 25th and the 75th training epoch, according to an embodiment of the present application;
fig. 6 is a data comparison graph of the recognition accuracy on the training image set of the AlexNet model before pruning and after pruning, where pruning with a preset pruning rate of 60% is performed at both the 25th and the 75th training epoch;
fig. 7 is a data comparison graph of the recognition accuracy on the training image set of the VGG19 model before pruning and after pruning, where pruning with a preset pruning rate of 80% is performed at both the 25th and the 75th training epoch, according to an embodiment of the present application;
fig. 8 is a data comparison graph of the recognition accuracy on the training image set of the VGG19 model before pruning and after pruning, where pruning with a preset pruning rate of 60% is performed at both the 25th and the 75th training epoch;
fig. 9 is a schematic structural diagram of a pruning apparatus for a network model according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
An "embodiment" as referred to herein relates to a particular feature, structure, or characteristic that may be included in at least one implementation of the present application. In the description of the embodiments of the present application, it is to be understood that the terms "comprises" and "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, or article that comprises a list of steps or modules is not necessarily limited to those steps or modules explicitly listed, but may include other steps or modules not expressly listed or inherent to such process, method, apparatus, or article.
Please refer to fig. 1, which is a schematic diagram of an application environment provided in an embodiment of the present application. The environment includes a server 101, and the server 101 includes a pruning device for a network model. The server 101 obtains a training image set and a current network model, where the current network model includes a plurality of convolutional layers, and prunes the current network model based on the training image set to obtain a pruned network model. The pruning processing includes: inputting the training images into the current network model; determining, according to the output of the current network model, a parameter corresponding to each of the plurality of convolutional layers; performing attenuation processing on the parameter corresponding to each convolutional layer based on the preset pruning rate of that layer to obtain an attenuation parameter; and, if the difference between the attenuation parameter and a preset threshold is within a preset interval, removing the parameter corresponding to the attenuation parameter from the convolutional layer to obtain the pruned network model.
A specific embodiment of a pruning method for a network model according to the present application is described below. Fig. 2 is a schematic flow chart of a pruning method for a network model according to an embodiment of the present application. The present specification provides the method operation steps as shown in the embodiment or the flow chart, but more or fewer operation steps may be included without creative effort. The order of steps recited in the embodiment is only one of many possible orders of execution and does not represent the only order; in actual execution, the steps may be performed sequentially or in parallel (for example, on parallel processors or with multi-threaded processing). Specifically, as shown in fig. 2, the method includes:
s201: acquiring a training image set and a current network model; the current network model contains multiple convolutional layers.
In the embodiment of the application, the server acquires a training image set and a current network model. The training image set may be the CIFAR100 data set, the OTB50 data set, the OTB100 data set, or the GOT-10K data set. The current network model may be an AlexNet model, a VGG19 model, or a SiamFC model; the current network model may also be a target recognition network model already trained on the training image set.
S203: and pruning the current network model based on the training image set to obtain a pruned network model.
In this embodiment, in an optional implementation, after obtaining the pruned network model, the server re-determines the pruned network model as the current network model and returns to the step of pruning the current network model based on the training image set, thereby obtaining a further pruned network model.
In this embodiment, in another optional implementation, after obtaining the pruned network model, the server re-determines the pruned network model as the current network model and trains it with the training image set to obtain the trained network model. Specifically, the training images are input into the current network model; parameter sets corresponding to the plurality of convolutional layers are determined according to the output of the current network model; a parameter set to be pruned is determined from these parameter sets, and the remaining parameters are determined as the parameter set to be updated; updates to the parameter set to be pruned are then suspended while the parameter set to be updated is updated, yielding the trained network model.
Fig. 3 is a flow chart illustrating the steps of a pruning process provided in an embodiment of the present application. The present specification provides the method operation steps as shown in the embodiment or the flow chart, but more or fewer operation steps may be included without creative effort. The order of steps recited in the embodiment is only one of many possible orders of execution and does not represent the only order; in actual execution, the steps may be performed sequentially or in parallel (for example, on parallel processors or with multi-threaded processing). Specifically, as shown in fig. 3, the method includes:
s301: and inputting the training image into the current network model, and determining the corresponding parameters of each convolutional layer in the plurality of convolutional layers according to the output of the current network model.
In an embodiment of the present application, in a specific implementation of determining the parameter corresponding to each of the plurality of convolutional layers, the server inputs the acquired training images into the current network model and determines the feature map set output by each convolutional layer according to the output of the current network model. The server then determines the parameter corresponding to the feature map set output by each convolutional layer, and determines the parameter corresponding to each convolutional layer according to a preset mapping relationship and the parameter corresponding to that feature map set, where the preset mapping relationship is the relationship between the parameter corresponding to a convolutional layer and the parameter corresponding to the feature map set output by that layer.
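The "parameter corresponding to the feature map set" and the "preset mapping relationship" are left abstract in the description. One plausible instantiation, shown below purely as an assumption, scores each channel by the mean absolute activation of its feature map and attaches that score to the convolution filter that produces the channel.

```python
import numpy as np

def channel_scores(feature_maps):
    """A parameter per output channel, derived from the feature map set.

    feature_maps : np.ndarray of shape (batch, channels, H, W)
    The mean absolute activation per channel is an assumption; the patent
    only speaks of "a parameter corresponding to the feature map set".
    """
    return np.abs(feature_maps).mean(axis=(0, 2, 3))

def map_scores_to_filters(scores, conv_weight):
    """An illustrative preset mapping relationship: output channel i of the
    feature map set is produced by filter i of the convolution weight
    (out_ch, in_ch, kH, kW), so the channel score is attached to filter i."""
    assert scores.shape[0] == conv_weight.shape[0]
    return {i: float(scores[i]) for i in range(conv_weight.shape[0])}
```

Any monotone importance measure (and any consistent channel-to-filter mapping) would slot into the same two roles; the essential point is that the per-layer parameter is computed from the model's output rather than from the weights alone.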
S303: and based on the preset pruning rate corresponding to each convolution layer, performing attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter.
In the embodiment of the application, the server determines a target feature map set from the feature map set output by each convolutional layer based on the preset pruning rate corresponding to that layer. The preset pruning rates of the plurality of convolutional layers may all be the same, may all be different, or some layers may share the same rate while the remaining layers differ.
In an optional implementation, when the preset pruning rate is the same for every convolutional layer, the server may arbitrarily select, from the feature map set output by each layer, a subset whose channel count accounts for a% of that layer's total output channels as the target feature map set. "Arbitrarily" here may mean randomly selecting feature maps from the set output by the convolutional layer, or selecting them in order; for example, fig. 4 illustrates taking the last a% of the feature maps output by each convolutional layer as the target feature map set.
In another optional implementation, the preset pruning rates corresponding to the convolutional layers may be completely different. For example, assume the current network model includes three convolutional layers: a first convolutional layer outputting a feature map set of 10 channels, a second convolutional layer outputting a feature map set of 10 channels, and a third convolutional layer outputting a feature map set of 20 channels. If the preset pruning rate is 20% for the first layer, 40% for the second layer, and 50% for the third layer, the server selects a feature map set of 2 channels from the 10 channels output by the first layer, a feature map set of 4 channels from the 10 channels output by the second layer, and a feature map set of 10 channels from the 20 channels output by the third layer, and takes the selected 16 channels of feature maps together as the target feature map set. Similarly, when some of the convolutional layers share the same preset pruning rate while the others differ, the target feature map set can be determined in the same way as when the rates are completely different, and the details are not repeated in this application.
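The channel counts in this three-layer example reduce to simple per-layer arithmetic, sketched below (the helper name is introduced here only for illustration):

```python
def channels_to_prune(channel_counts, prune_rates):
    """Per-layer number of target channels and their total.

    channel_counts : channels output by each convolutional layer
    prune_rates    : preset pruning rate of each layer
    """
    per_layer = [int(round(n * r)) for n, r in zip(channel_counts, prune_rates)]
    return per_layer, sum(per_layer)

# The three-layer example from the text: 10, 10 and 20 channels pruned at
# 20%, 40% and 50% give 2 + 4 + 10 = 16 target channels.
per_layer, total = channels_to_prune([10, 10, 20], [0.20, 0.40, 0.50])
```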
In the embodiment of the application, after determining the target feature map set from the feature map set output by each convolutional layer based on that layer's preset pruning rate, the server determines the parameters corresponding to the target feature map set and performs attenuation processing on them based on a preset coefficient to obtain transition parameters. The server then determines the attenuation parameters corresponding to each convolutional layer from the obtained transition parameters and the preset mapping relationship, that is, the relationship between the parameters corresponding to a convolutional layer and the parameters corresponding to the feature map set output by that layer.
S305: and if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, removing the parameter corresponding to the attenuation parameter in the convolutional layer to obtain the pruned network model.
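Steps S303 and S305 together amount to gradually attenuating a parameter until its difference from the preset threshold falls within the preset interval. The sketch below traces one parameter through repeated attenuation; the coefficient, threshold, and interval values are illustrative assumptions, not values taken from the patent.

```python
def decay_until_removable(value, coeff=0.9, threshold=0.0, interval=1e-3, max_steps=10000):
    """Attenuate one parameter until it becomes removable.

    Repeatedly multiplies the parameter by the attenuation coefficient (S303)
    and reports the first step at which the difference to the preset threshold
    falls within the preset interval (the removal condition of S305).
    coeff, threshold, and interval are illustrative assumptions.
    """
    for step in range(1, max_steps + 1):
        value *= coeff                        # S303: attenuation processing
        if abs(value - threshold) < interval:  # S305: removal condition met
            return step, value
    return None, value
```

Because the decay is geometric, a parameter approaches the threshold smoothly over many iterations, which is consistent with the smooth accuracy curves reported in the experiments below.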
Next, based on the training image set described above, the current network model is pruned to obtain a pruned network model, and a specific experimental description is performed.
In an optional implementation, the AlexNet model is used as the current network model, the CIFAR100 data set is used as the training image set, and the AlexNet model is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 80% is performed on the AlexNet model at both the 25th epoch and the 75th epoch to obtain the pruned AlexNet model, where the computation between the convolutional layers after pruning is 41% of the computation between the convolutional layers before pruning. The recognition accuracy of the AlexNet model on the training image set is determined both before and after pruning, and a performance analysis of the pruned model is carried out on the basis of these two accuracy values. Fig. 5 is a data comparison graph of the recognition accuracy of the AlexNet model on the training image set before and after pruning when pruning with the preset pruning rate of 80% is performed at both the 25th epoch and the 75th epoch. In the figure, the dotted line Baseline represents the recognition accuracy of the AlexNet model before pruning on the training image set, and the solid line Smooth_Pruning represents the recognition accuracy of the AlexNet model after pruning on the training image set.
As can be clearly seen from Fig. 5, throughout the pruning process the recognition accuracy of the pruned AlexNet model on the CIFAR100 data set remains basically consistent with that of the AlexNet model before pruning; the accuracy does not suddenly drop when the model is pruned at the 25th and 75th epochs, so the whole pruning process is very smooth, and the recognition accuracy of the pruned AlexNet model on the CIFAR100 data set finally exceeds that of the model before pruning.
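The training schedule used in this experiment, with pruning events interleaved into ordinary training rather than applied as one abrupt cut, can be sketched as follows. This is a hypothetical illustration; the epoch constants come from the experiment above, while the function and action names are our assumptions.

```python
# Hypothetical sketch of the schedule in the experiment above: train the
# model for 250 epochs and trigger a pruning event at epochs 25 and 75,
# so pruning is interleaved with ordinary training instead of being a
# single abrupt cut.

PRUNE_EPOCHS = {25, 75}   # epochs at which the pruning event fires
TOTAL_EPOCHS = 250

def schedule(total_epochs=TOTAL_EPOCHS, prune_epochs=PRUNE_EPOCHS):
    """Yield (epoch, action) pairs for the smooth-pruning training loop."""
    for epoch in range(1, total_epochs + 1):
        yield epoch, "prune_then_train" if epoch in prune_epochs else "train"

actions = dict(schedule())
print(actions[25], actions[75], actions[100])  # prune_then_train prune_then_train train
```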
In an optional implementation, the AlexNet model is used as the current network model, the CIFAR100 data set is used as the training image set, and the AlexNet model is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 60% is performed on the AlexNet model at both the 25th epoch and the 75th epoch to obtain the pruned AlexNet model, where the computation between the convolutional layers after pruning is 13% of the computation between the convolutional layers before pruning. The recognition accuracy of the AlexNet model on the training image set is determined both before and after pruning, and a performance analysis of the pruned model is carried out on the basis of these two accuracy values. Fig. 6 is a data comparison graph of the recognition accuracy of the AlexNet model on the training image set before and after pruning when pruning with the preset pruning rate of 60% is performed at both the 25th epoch and the 75th epoch. In the figure, the dotted line Baseline represents the recognition accuracy of the AlexNet model before pruning on the training image set, and the solid line Smooth_Pruning represents the recognition accuracy of the AlexNet model after pruning on the training image set.
As can be clearly seen from Fig. 6, the recognition accuracy of the pruned AlexNet model on the CIFAR100 data set finally exceeds that of the AlexNet model before pruning.
In an optional implementation, VGG19 is used as the current network model, the CIFAR100 data set is used as the training image set, and VGG19 is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 80% is performed on VGG19 at both the 25th epoch and the 75th epoch to obtain the pruned VGG19, where the computation between the convolutional layers after pruning is 41% of the computation between the convolutional layers before pruning. The recognition accuracy of VGG19 on the training image set is determined both before and after pruning, and a performance analysis of the pruned model is carried out on the basis of these two accuracy values. Fig. 7 is a data comparison graph of the recognition accuracy of VGG19 on the training image set before and after pruning when pruning with the preset pruning rate of 80% is performed at both the 25th epoch and the 75th epoch. In the figure, the dashed line Baseline represents the recognition accuracy of the VGG19 model before pruning on the training image set, and the solid line Smooth_Pruning represents the recognition accuracy of the VGG19 model after pruning on the training image set.
As can be clearly seen from Fig. 7, during the training process the convergence rate of the VGG19 model gradually increases, and the recognition accuracy of the pruned VGG19 on the CIFAR100 data set finally exceeds that of VGG19 before pruning.
In an optional implementation, VGG19 is used as the current network model, the CIFAR100 data set is used as the training image set, and VGG19 is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 60% is performed on VGG19 at both the 25th epoch and the 75th epoch to obtain the pruned VGG19, where the computation between the convolutional layers after pruning is 13% of the computation between the convolutional layers before pruning. The recognition accuracy of VGG19 on the training image set is determined both before and after pruning, and a performance analysis of the pruned model is carried out on the basis of these two accuracy values. Fig. 8 is a data comparison graph of the recognition accuracy of VGG19 on the training image set before and after pruning when pruning with the preset pruning rate of 60% is performed at both the 25th epoch and the 75th epoch. In the figure, the dashed line Baseline represents the recognition accuracy of the VGG19 model before pruning on the training image set, and the solid line Smooth_Pruning represents the recognition accuracy of the VGG19 model after pruning on the training image set.
As can be clearly seen from Fig. 8, in the early stage of training the convergence rate of the VGG19 model gradually increases, and the recognition accuracy of the pruned VGG19 on the CIFAR100 data set finally exceeds that of VGG19 before pruning.
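The 41% and 13% computation figures quoted in these experiments are consistent with simple channel arithmetic: the computation between two adjacent convolutional layers scales with the product of the input and output channel counts, and each of the two pruning events keeps 80% (or 60%) of the channels in each layer. A small sketch of that arithmetic (the function name is ours, not the patent's):

```python
# The document's "pruning rate" is read here as the fraction of channels
# KEPT at each pruning event (80% or 60%), applied at both epoch 25 and
# epoch 75, which reproduces the quoted 41% and 13% computation figures.

def remaining_computation(keep_rate: float, num_events: int = 2) -> float:
    """FLOPs between two conv layers scale with in_channels * out_channels.

    After each pruning event both layers keep `keep_rate` of their
    channels, so the computation between them shrinks by keep_rate**2
    per event.
    """
    channels_kept = keep_rate ** num_events   # e.g. 0.8 * 0.8 = 0.64
    return channels_kept ** 2                 # both factor layers shrink

print(round(remaining_computation(0.8), 2))   # 0.41 -> 41% of original FLOPs
print(round(remaining_computation(0.6), 2))   # 0.13 -> 13% of original FLOPs
```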
In an alternative embodiment, the SiamFC model is used as the current network model, and the OTB50 data set, the OTB100 data set, or the GOT-10K data set is used as the training image set; the SiamFC model is trained with that data set for 50 iteration batches (epochs). Pruning with a preset pruning rate of 80% may be performed on the SiamFC model at the 5th epoch and the 15th epoch, giving a final pruning rate of 0.64 and yielding the pruned SiamFC model, where the computation between the convolutional layers after pruning is 41% of the computation between the convolutional layers before pruning. The recognition accuracy of the SiamFC model on the training image set is determined both before and after pruning. The table below compares the recognition accuracy and success rate on the training image set of the SiamFC model before pruning (SiamFC) and after pruning (Prun_SiamFC), where the recognition accuracy refers to how precisely the model locates the actual position of the target, and the success rate refers to the probability that the region framed by the model covers the target.
[Table image BDA0002680292320000131: recognition accuracy and success rate of SiamFC and Prun_SiamFC on the training image set; table content not reproduced]
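The success-rate metric mentioned above is conventionally computed from bounding-box overlap in tracking benchmarks such as OTB. The sketch below assumes that convention (fraction of frames where the predicted box overlaps the ground truth with IoU above a threshold); the patent itself does not spell out the formula, so treat the names and the 0.5 threshold as illustrative assumptions.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax1 + aw, bx1 + bw), min(ay1 + ah, by1 + bh)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def success_rate(predictions, ground_truth, threshold=0.5):
    """Fraction of frames where the framed region covers the target,
    i.e. predicted-vs-actual IoU exceeds `threshold` (assumed value)."""
    hits = sum(iou(p, g) > threshold for p, g in zip(predictions, ground_truth))
    return hits / len(ground_truth)

preds = [(0, 0, 10, 10), (100, 100, 10, 10)]
gts = [(0, 0, 10, 10), (0, 0, 10, 10)]
print(success_rate(preds, gts))  # 0.5: one exact hit, one complete miss
```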
From the above experimental data, it can be clearly seen that the network model pruning disclosed in the present application effectively lightens the current model without reducing its performance.
With the pruning method for a network model provided by the embodiments of the present application, attenuation processing is performed on the parameters corresponding to the convolutional layers, so that the knowledge learned by the parameters to be eliminated is forced to transfer to the remaining parameters. The number of parameters is thus reduced without increasing the training burden, and the recognition accuracy of the network model is preserved.
Fig. 9 is a schematic structural diagram of a pruning apparatus for a network model according to an embodiment of the present application, and as shown in fig. 9, the pruning apparatus includes:
the obtaining module 901 is configured to obtain a training image set and a current network model; the current network model comprises a plurality of convolutional layers;
the pruning module 903 is configured to perform pruning processing on the current network model based on the training image set to obtain a pruned network model;
wherein the pruning processing comprises the following steps:
inputting the training image into a current network model, and determining a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter;
and if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, removing the parameter corresponding to the attenuation parameter in the convolutional layer to obtain the pruned network model.
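As a rough illustration of the attenuate-then-remove procedure above, the following NumPy sketch operates on one convolutional layer's weight tensor. The choice of L1 channel importance, the decay coefficient, the threshold, and the interval width are all our assumptions for illustration; the patent does not fix these details.

```python
import numpy as np

def pruning_step(weights, keep_rate, decay=0.5, threshold=0.0, eps=1e-3):
    """One attenuation-and-removal step for a single conv layer (sketch).

    weights: (out_channels, in_channels, k, k) weight tensor.
    keep_rate: fraction of output channels to keep (the preset pruning rate).
    Returns the weights with attenuated-to-threshold channels removed.
    """
    out_ch = weights.shape[0]
    # Per-channel importance: L1 norm of each output channel (assumed metric).
    importance = np.abs(weights.reshape(out_ch, -1)).sum(axis=1)

    n_keep = int(round(keep_rate * out_ch))
    to_attenuate = np.argsort(importance)[: out_ch - n_keep]  # least important

    weights = weights.copy()
    weights[to_attenuate] *= decay  # gradual attenuation, not hard zeroing

    # Remove channels whose attenuated norm is within eps of the threshold,
    # i.e. the difference falls inside the preset interval.
    norms = np.abs(weights.reshape(out_ch, -1)).sum(axis=1)
    keep_mask = np.ones(out_ch, dtype=bool)
    keep_mask[to_attenuate] = np.abs(norms[to_attenuate] - threshold) >= eps
    return weights[keep_mask]
```

In this sketch a targeted channel survives (attenuated) until repeated decay drives its norm into the preset interval around the threshold, at which point it is eliminated, which is what lets accuracy degrade smoothly instead of abruptly.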
In the embodiment of the present application, the pruning module 903 described above includes:
the determining module 9031 is configured to input the training image into the current network model, and determine a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to an output of the current network model;
the attenuation module 9033 is configured to perform attenuation processing on a parameter corresponding to each convolution layer based on a preset pruning rate corresponding to each convolution layer to obtain an attenuation parameter;
the rejecting module 9035 is configured to reject a parameter corresponding to the attenuation parameter in the convolutional layer if the difference between the attenuation parameter and the preset threshold is within the preset interval, so as to obtain the pruned network model.
The device embodiments and the method embodiments of the present application are based on the same inventive concept.
An embodiment of the present application further provides an electronic device, which may be disposed in a server and includes a memory storing at least one instruction, at least one program, a code set, or a set of instructions related to implementing the pruning method for a network model in the method embodiments; the at least one instruction, the at least one program, the code set, or the set of instructions is loaded from the memory and executed to implement the pruning method for a network model.
The present application also provides a storage medium, which may be disposed in a server to store at least one instruction, at least one program, a code set, or a set of instructions related to implementing the pruning method for a network model in the method embodiments; the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the pruning method for a network model.
Optionally, in this embodiment, the storage medium may be located in at least one of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to, various media that can store program code, such as a USB flash disk, a read-only memory (ROM), a removable hard disk, a magnetic disk, or an optical disk.
As can be seen from the above embodiments of the pruning method, device, electronic device, and storage medium for a network model provided by the present application, the method includes obtaining a training image set and a current network model, where the current network model includes a plurality of convolutional layers, and pruning the current network model based on the training image set to obtain a pruned network model. The pruning processing includes: inputting the training image into the current network model; determining a parameter corresponding to each of the plurality of convolutional layers according to the output of the current network model; performing attenuation processing on the parameter corresponding to each convolutional layer based on the preset pruning rate corresponding to that layer to obtain an attenuation parameter; and, if the difference between the attenuation parameter and a preset threshold is within a preset interval, removing the parameter corresponding to the attenuation parameter from the convolutional layer to obtain the pruned network model. Based on the embodiments of the present application, attenuation processing on the parameters corresponding to the convolutional layers forces the knowledge learned by the parameters to be eliminated to transfer to the remaining parameters, so the number of parameters is reduced without increasing the training burden, and the recognition accuracy of the network model is preserved.
It should be noted that the foregoing order of the embodiments of the present application is for description only and does not indicate the relative merit of the embodiments; the specific embodiments are described in this specification, and other embodiments are also within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve the desired results; in some embodiments, multitasking and parallel processing may also be possible or advantageous.
All the embodiments in this specification are described in a progressive manner; the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, the device embodiment is described briefly because it is substantially similar to the method embodiment; for relevant points, reference may be made to the partial description of the method embodiment.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims (10)

1. A pruning method for a network model is characterized by comprising the following steps:
acquiring a training image set and a current network model; the current network model comprises a plurality of convolutional layers;
pruning the current network model based on the training image set to obtain a pruned network model;
wherein the pruning processing comprises the following steps:
inputting the training image into the current network model, and determining a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter;
and if the difference value between the attenuation parameter and a preset threshold value is within a preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolutional layer to obtain the pruned network model.
2. The method of claim 1, wherein after obtaining the pruned network model, the method further comprises:
and re-determining the pruned network model as the current network model, and returning to execute the step of pruning the current network model based on the training image set to obtain the pruned network model.
3. The method of claim 1, wherein inputting the training image into the current network model and determining parameters corresponding to each of the plurality of convolutional layers based on an output of the current network model comprises:
inputting the training image into the current network model, and determining a feature atlas output by each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
determining parameters corresponding to the feature map set output by each convolution layer;
and determining the corresponding parameter of each convolution layer according to the preset mapping relation and the corresponding parameter of the feature map set output by each convolution layer.
4. The method of claim 3, wherein the attenuating the parameter corresponding to each convolutional layer based on the preset pruning rate corresponding to each convolutional layer to obtain an attenuation parameter comprises:
determining a target feature map set from the feature map set output by each convolutional layer based on the preset pruning rate corresponding to each convolutional layer; the ratio of the number of channels of the target feature map set to the number of channels of the feature map set is the preset pruning rate;
determining parameters corresponding to the target feature atlas;
based on a preset coefficient, carrying out attenuation processing on parameters corresponding to the target feature map set to obtain transition parameters;
and determining attenuation parameters corresponding to each convolution layer according to the preset mapping relation and the transition parameters.
5. The method of claim 1, wherein after obtaining the pruned network model, the method further comprises:
re-determining the pruned network model as the current network model;
and training the current network model by using the training image set to obtain a trained network model.
6. The method of claim 5, wherein the training the current network model using the training image set to obtain a trained network model comprises:
inputting the training image into the current network model, and determining parameter sets corresponding to the plurality of convolutional layers according to the output of the current network model;
determining parameter sets to be pruned from the parameter sets corresponding to the plurality of convolutional layers, and determining parameters except the parameter sets to be pruned in the parameter sets corresponding to the plurality of convolutional layers as parameter sets to be updated;
and carrying out pause updating processing on the parameter set to be pruned, and updating the parameter set to be updated to obtain the trained network model.
7. A pruning apparatus for a network model, comprising:
the acquisition module is used for acquiring a training image set and a current network model; the current network model comprises a plurality of convolutional layers;
the pruning module is used for carrying out pruning processing on the current network model based on the training image set to obtain a pruned network model;
wherein the pruning processing comprises the following steps:
inputting the training image into the current network model, and determining a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter;
and if the difference value between the attenuation parameter and a preset threshold value is within a preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolutional layer to obtain the pruned network model.
8. The apparatus of claim 7, wherein the pruning module comprises:
a determining module, configured to input the training image into the current network model, and determine a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to an output of the current network model;
the attenuation module is used for carrying out attenuation processing on the parameters corresponding to each convolution layer based on the preset pruning rate corresponding to each convolution layer to obtain attenuation parameters;
and the rejecting module is used for rejecting parameters corresponding to the attenuation parameters in the convolutional layer to obtain the pruned network model if the difference value between the attenuation parameters and a preset threshold value is within a preset interval.
9. An electronic device, comprising a processor and a memory, wherein the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the method of pruning a network model according to any one of claims 1-6.
10. A computer-readable storage medium, having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method of pruning a network model according to any one of claims 1-6.
CN202010964152.9A 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium Active CN112101547B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010964152.9A CN112101547B (en) 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN112101547A true CN112101547A (en) 2020-12-18
CN112101547B CN112101547B (en) 2024-04-16

Family

ID=73751627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010964152.9A Active CN112101547B (en) 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112101547B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734029A (en) * 2020-12-30 2021-04-30 中国科学院计算技术研究所 Neural network channel pruning method, storage medium and electronic equipment
CN113111925A (en) * 2021-03-29 2021-07-13 宁夏新大众机械有限公司 Feed qualification classification method based on deep learning
CN113361697A (en) * 2021-07-14 2021-09-07 深圳思悦创新有限公司 Convolution network model compression method, system and storage medium
CN115186937A (en) * 2022-09-09 2022-10-14 闪捷信息科技有限公司 Prediction model training and data prediction method and device based on multi-party data cooperation

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104684041A (en) * 2015-02-06 2015-06-03 中国科学院上海微系统与信息技术研究所 Real-time wireless sensor network routing method supporting large-scale node application
CN109671063A (en) * 2018-12-11 2019-04-23 西安交通大学 A kind of image quality measure method of importance between the network characterization based on depth
CN109886391A (en) * 2019-01-30 2019-06-14 东南大学 A kind of neural network compression method based on the positive and negative diagonal convolution in space
CN110619385A (en) * 2019-08-31 2019-12-27 电子科技大学 Structured network model compression acceleration method based on multi-stage pruning
CN110874631A (en) * 2020-01-20 2020-03-10 浙江大学 Convolutional neural network pruning method based on feature map sparsification
US20200234130A1 (en) * 2017-08-18 2020-07-23 Intel Corporation Slimming of neural networks in machine learning environments
CN111461291A (en) * 2020-03-13 2020-07-28 西安科技大学 Long-distance pipeline inspection method based on YO L Ov3 pruning network and deep learning defogging model
CN111652236A (en) * 2020-04-21 2020-09-11 东南大学 Lightweight fine-grained image identification method for cross-layer feature interaction in weak supervision scene
US20200301994A1 (en) * 2019-03-20 2020-09-24 Imagination Technologies Limited Methods and Systems for Implementing a Convolution Transpose Layer of a Neural Network


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHOUMENG QIU 等: "BFRIFP: Brain Functional Reorganization Inspired Filter Pruning", 《ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING – ICANN 2021》, vol. 12894, 7 September 2021 (2021-09-07), pages 16 - 28, XP047607094, DOI: 10.1007/978-3-030-86380-7_2 *


Also Published As

Publication number Publication date
CN112101547B (en) 2024-04-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant