CN112101547A - Pruning method and device for network model, electronic equipment and storage medium - Google Patents


Info

Publication number: CN112101547A (application CN202010964152.9A; granted publication CN112101547B)
Authority: CN (China)
Prior art keywords: network model, pruning, parameter, current network, training image
Legal status: Granted; currently Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 谷宇章, 邱守猛, 袁泽强, 张晓林
Current and original assignee: Shanghai Institute of Microsystem and Information Technology of CAS (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Application: filed by Shanghai Institute of Microsystem and Information Technology of CAS under application number CN202010964152.9A
Priority: CN202010964152.9A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present application disclose a pruning method and apparatus for a network model, an electronic device, and a storage medium. The method comprises: obtaining a training image set and a current network model; inputting the training images into the current network model and determining, according to the output of the current network model, a parameter corresponding to each of a plurality of convolutional layers; performing attenuation processing on the parameter corresponding to each convolutional layer based on the preset pruning rate of that layer to obtain an attenuation parameter; and, if the difference between the attenuation parameter and a preset threshold is within a preset interval, removing the parameter corresponding to the attenuation parameter from the convolutional layer to obtain the pruned network model. By attenuating the parameters of the convolutional layers, the knowledge learned by the parameters about to be removed is forced to transfer to the remaining parameters, so the number of parameters is reduced without increasing the training burden, and the recognition accuracy of the network model is ensured.

Description

Pruning method and device for network model, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of deep learning, in particular to a pruning method and device for a network model, electronic equipment and a storage medium.
Background
In recent years, as deep learning technology has developed, network models have become increasingly complex and their parameter counts ever larger, which imposes a heavy computational burden on practical applications of deep learning models. Research has shown, however, that trained network models often contain many redundant parameters, and that after these parameters are removed and the model is given some fine-tuning, the model can recover its original performance. For example, ResNet-50 has 50 convolutional layers and the whole model requires about 95 MB of storage space, yet it still works well after 75% of its parameters are removed, and its running time can be reduced by as much as 50%. Removing the redundant parameters in a network model, that is, pruning the network model, therefore makes the model lightweight and easy to deploy and apply in real scenarios.
In the prior art, pruning of a network model falls mainly into two categories: sparsification during training and pruning after training. Sparsification during training applies a sparsity constraint to the parameters or structure of the model while it is being trained, yielding sparse parameters or a sparse structure; this reduces the model size, cuts the time consumed by inference, and speeds inference up. Pruning after training deletes unimportant weights from an already trained model to make the network sparse and compact. However, with pruning after training, the accuracy of the network model often drops once the unimportant weights are deleted, so the pruned model generally needs to be fine-tuned to restore its performance.
Disclosure of Invention
The embodiments of the present application provide a pruning method and apparatus for a network model, an electronic device, and a storage medium, which can reduce the number of parameters without increasing the training burden while ensuring the recognition accuracy of the network model.
The embodiment of the application provides a pruning method for a network model, which comprises the following steps:
acquiring a training image set and a current network model; the current network model comprises a plurality of convolutional layers;
pruning the current network model based on the training image set to obtain a pruned network model;
wherein the pruning processing comprises the following steps:
inputting the training image into a current network model, and determining a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter;
and if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, removing the parameter corresponding to the attenuation parameter in the convolutional layer to obtain the pruned network model.
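Read as an algorithm, the three claimed steps can be sketched as follows. This is only an illustration of that reading, not the patented implementation: the use of NumPy, the L1-norm channel importance, and the values of the decay coefficient, threshold, and interval are all assumptions introduced here.

```python
import numpy as np

def pruning_step(weights, prune_rate, decay=0.9, threshold=0.0, interval=1e-3):
    """One pruning pass over per-layer convolution weights.

    weights    : list of np.ndarray, one (out_ch, in_ch, kH, kW) tensor per layer
    prune_rate : preset pruning rate, the fraction of channels to attenuate
    decay, threshold, interval : illustrative values (assumptions)
    """
    pruned = []
    for w in weights:
        # A per-channel "parameter" stands in for the one determined from the
        # model output; here the L1 norm of each output channel (assumption).
        importance = np.abs(w).reshape(w.shape[0], -1).sum(axis=1)

        # Attenuate the parameters of the lowest-ranked channels at the
        # preset pruning rate to obtain the attenuation parameters.
        n_target = int(round(prune_rate * w.shape[0]))
        target = np.argsort(importance)[:n_target]
        w = w.copy()
        w[target] *= decay

        # If the difference between the attenuation parameter and the preset
        # threshold is within the preset interval, remove (zero) the channel.
        decayed = np.abs(w).reshape(w.shape[0], -1).sum(axis=1)
        removable = [c for c in target if abs(decayed[c] - threshold) < interval]
        w[removable] = 0.0
        pruned.append(w)
    return pruned
```

Channels whose attenuated parameter has not yet reached the interval around the threshold are merely shrunk, so the removal happens gradually over repeated passes rather than in a single hard cut.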
Further, after obtaining the pruned network model, the method further includes:
and re-determining the pruned network model as the current network model, and returning to execute the step of pruning the current network model based on the training image set to obtain the pruned network model.
Further, inputting the training image into the current network model, and determining a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to the output of the current network model, including:
inputting the training image into a current network model, and determining a feature atlas output by each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
determining parameters corresponding to the feature map set output by each convolution layer;
and determining the corresponding parameters of each convolution layer according to the preset mapping relation and the parameters corresponding to the feature map set output by each convolution layer.
Further, based on the preset pruning rate corresponding to each convolution layer, performing attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter, including:
determining a target feature map set from the feature map set output by each convolutional layer based on the corresponding preset pruning rate of each convolutional layer; the ratio of the number of channels of the target feature map set to the number of channels of the feature map set is a preset pruning rate;
determining parameters corresponding to the target feature atlas;
performing attenuation processing on parameters corresponding to the target feature map set based on a preset coefficient to obtain transition parameters;
and determining attenuation parameters corresponding to each convolution layer according to the preset mapping relation and the transition parameters.
Further, after obtaining the pruned network model, the method further includes:
re-determining the pruned network model as the current network model;
and training the current network model by using the training image set to obtain the trained network model.
Further, training the current network model by using the training image set to obtain a trained network model, including:
inputting the training image into a current network model, and determining parameter sets corresponding to a plurality of convolutional layers according to the output of the current network model;
determining parameter sets to be pruned from the parameter sets corresponding to the plurality of convolutional layers, and determining parameters except the parameter sets to be pruned in the parameter sets corresponding to the plurality of convolutional layers as parameter sets to be updated;
and suspending updates to the parameter set to be pruned while updating the parameter set to be updated, to obtain the trained network model.
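The paused-update training described above can be illustrated with a single gradient step in which the to-be-pruned parameter set is masked out. The SGD form, the boolean-mask representation, and the learning rate are assumptions for illustration only.

```python
import numpy as np

def masked_sgd_update(params, grads, prune_mask, lr=0.01):
    """One SGD step in which updates to the to-be-pruned set are paused.

    params     : np.ndarray of weights (a convolutional layer's parameter set)
    grads      : np.ndarray of the same shape, gradient of the loss
    prune_mask : boolean np.ndarray, True where a parameter belongs to the
                 parameter set to be pruned (its update is suspended)
    lr         : learning rate (illustrative value)
    """
    update = lr * grads
    update[prune_mask] = 0.0   # pause updating the parameter set to be pruned
    return params - update     # update only the parameter set to be updated
```

Freezing the to-be-pruned set this way lets the remaining parameters absorb the training signal, which matches the stated goal of transferring knowledge away from the parameters about to be removed.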
Correspondingly, the embodiment of the present application further provides a pruning device for a network model, and the pruning device includes:
the acquisition module is used for acquiring a training image set and a current network model; the current network model comprises a plurality of convolutional layers;
the pruning module is used for carrying out pruning processing on the current network model based on the training image set to obtain a pruned network model;
wherein the pruning processing comprises the following steps:
inputting the training image into a current network model, and determining a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter;
and if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, removing the parameter corresponding to the attenuation parameter in the convolutional layer to obtain the pruned network model.
Further, the pruning module comprises:
the determining module is used for inputting the training image into the current network model and determining the corresponding parameter of each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
the attenuation module is used for carrying out attenuation processing on the parameters corresponding to each convolution layer based on the preset pruning rate corresponding to each convolution layer to obtain attenuation parameters;
and the rejecting module is used for rejecting parameters corresponding to the attenuation parameters in the convolutional layer to obtain the pruned network model if the difference value of the attenuation parameters and the preset threshold value is within a preset interval.
Accordingly, an embodiment of the present application further provides an electronic device, which includes a processor and a memory, where the memory stores at least one instruction, at least one program, a code set, or a set of instructions, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by the processor to implement the pruning method for the network model.
Accordingly, an embodiment of the present application further provides a computer-readable storage medium, where at least one instruction, at least one program, a code set, or a set of instructions is stored in the storage medium, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the pruning method for the network model.
The embodiment of the application has the following beneficial effects:
the pruning method specifically comprises the steps of obtaining a training image set and a current network model, wherein the current network model comprises a plurality of convolutional layers, and carrying out pruning processing on the current network model based on the training image set to obtain a pruned network model; the pruning processing step comprises the steps of inputting a training image into a current network model, determining a parameter corresponding to each convolutional layer in a plurality of convolutional layers according to the output of the current network model, carrying out attenuation processing on the parameter corresponding to each convolutional layer based on a preset pruning rate corresponding to each convolutional layer to obtain an attenuation parameter, and if the difference value of the attenuation parameter and a preset threshold value is within a preset interval, rejecting the parameter corresponding to the attenuation parameter in the convolutional layer to obtain a pruned network model. Based on the embodiment of the application, attenuation processing is carried out on the parameters corresponding to the convolutional layers, so that the knowledge of parameter learning corresponding to the convolutional layers with the parameters to be eliminated is forced to be transferred, the number of the parameters is reduced, the training burden is not increased, and the identification accuracy of the network model can be ensured.
Drawings
In order to more clearly illustrate the technical solutions and advantages of the embodiments of the present application or of the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application;
fig. 2 is a schematic flowchart of a pruning method for a network model according to an embodiment of the present application;
FIG. 3 is a flow chart illustrating the steps of a pruning process provided by an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating a method for determining a% of feature maps in a feature map set output by each convolutional layer as a target feature map set according to an embodiment of the present application;
fig. 5 is a data comparison graph of the recognition accuracy on the training image set of the AlexNet model before pruning and after pruning, where pruning with a preset pruning rate of 80% is performed at both the 25th and the 75th training epoch, according to an embodiment of the present application;
fig. 6 is a data comparison graph of the recognition accuracy on the training image set of the AlexNet model before pruning and after pruning, where pruning with a preset pruning rate of 60% is performed at both the 25th and the 75th training epoch;
fig. 7 is a data comparison graph of the recognition accuracy on the training image set of the VGG19 model before pruning and after pruning, where pruning with a preset pruning rate of 80% is performed at both the 25th and the 75th training epoch, according to an embodiment of the present application;
fig. 8 is a data comparison graph of the recognition accuracy on the training image set of the VGG19 model before pruning and after pruning, where pruning with a preset pruning rate of 60% is performed at both the 25th and the 75th training epoch;
fig. 9 is a schematic structural diagram of a pruning apparatus for a network model according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
An "embodiment" as referred to herein relates to a particular feature, structure, or characteristic that may be included in at least one implementation of the present application. In the description of the embodiments of the present application, it is to be understood that the terms "comprises" and "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, apparatus, or article that comprises a list of steps or modules is not necessarily limited to those steps or modules explicitly listed, but may include other steps or modules not expressly listed or inherent to such process, method, apparatus, or article.
Please refer to fig. 1, which is a schematic diagram of an application environment provided in an embodiment of the present application. The environment includes a server 101, and the server 101 includes a pruning device for a network model. The server 101 obtains a training image set and a current network model, where the current network model includes a plurality of convolutional layers, and prunes the current network model based on the training image set to obtain a pruned network model. The pruning processing includes: inputting the training images into the current network model; determining, according to the output of the current network model, a parameter corresponding to each of the plurality of convolutional layers; performing attenuation processing on the parameter corresponding to each convolutional layer based on the preset pruning rate of that layer to obtain an attenuation parameter; and, if the difference between the attenuation parameter and a preset threshold is within a preset interval, removing the parameter corresponding to the attenuation parameter from the convolutional layer to obtain the pruned network model.
A specific embodiment of a pruning method for a network model according to the present application is described below. Fig. 2 is a schematic flow chart of a pruning method for a network model according to an embodiment of the present application. The present specification provides the method operation steps as shown in the embodiment or the flow chart, but more or fewer operation steps may be included without creative effort. The order of steps recited in the embodiment is only one of many possible orders of execution and does not represent the only order; in actual execution, the steps may be performed sequentially or in parallel (for example, on parallel processors or with multi-threaded processing). Specifically, as shown in fig. 2, the method includes:
s201: acquiring a training image set and a current network model; the current network model contains multiple convolutional layers.
In the embodiment of the application, the server acquires a training image set and a current network model. The training image set may be the CIFAR100 data set, the OTB50 data set, the OTB100 data set, or the GOT-10K data set. The current network model may be an AlexNet model, a VGG19 model, or a SiamFC model; the current network model may also be a target recognition network model already trained on the training image set.
S203: and pruning the current network model based on the training image set to obtain a pruned network model.
In this embodiment, in an optional implementation, after obtaining the pruned network model, the server re-determines the pruned network model as the current network model and returns to the step of pruning the current network model based on the training image set, thereby obtaining a further pruned network model.
In this embodiment, in another optional implementation, after obtaining the pruned network model, the server re-determines the pruned network model as the current network model and trains it with the training image set to obtain the trained network model. Specifically, the training images are input into the current network model; parameter sets corresponding to the plurality of convolutional layers are determined according to the output of the current network model; a parameter set to be pruned is determined from these parameter sets, and the remaining parameters are determined as the parameter set to be updated; updates to the parameter set to be pruned are then suspended while the parameter set to be updated is updated, yielding the trained network model.
Fig. 3 is a flow chart illustrating the steps of a pruning process provided in an embodiment of the present application. The present specification provides the method operation steps as shown in the embodiment or the flow chart, but more or fewer operation steps may be included without creative effort. The order of steps recited in the embodiment is only one of many possible orders of execution and does not represent the only order; in actual execution, the steps may be performed sequentially or in parallel (for example, on parallel processors or with multi-threaded processing). Specifically, as shown in fig. 3, the method includes:
s301: and inputting the training image into the current network model, and determining the corresponding parameters of each convolutional layer in the plurality of convolutional layers according to the output of the current network model.
In an embodiment of the present application, in a specific implementation of determining the parameter corresponding to each of the plurality of convolutional layers, the server inputs the acquired training images into the current network model and determines the feature map set output by each convolutional layer according to the output of the current network model. The server then determines the parameter corresponding to the feature map set output by each convolutional layer, and determines the parameter corresponding to each convolutional layer according to a preset mapping relationship and the parameter corresponding to that feature map set, where the preset mapping relationship is the relationship between the parameter corresponding to a convolutional layer and the parameter corresponding to the feature map set output by that layer.
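The "parameter corresponding to the feature map set" and the "preset mapping relationship" are left abstract in the description. One plausible instantiation, shown below purely as an assumption, scores each channel by the mean absolute activation of its feature map and attaches that score to the convolution filter that produces the channel.

```python
import numpy as np

def channel_scores(feature_maps):
    """A parameter per output channel, derived from the feature map set.

    feature_maps : np.ndarray of shape (batch, channels, H, W)
    The mean absolute activation per channel is an assumption; the patent
    only speaks of "a parameter corresponding to the feature map set".
    """
    return np.abs(feature_maps).mean(axis=(0, 2, 3))

def map_scores_to_filters(scores, conv_weight):
    """An illustrative preset mapping relationship: output channel i of the
    feature map set is produced by filter i of the convolution weight
    (out_ch, in_ch, kH, kW), so the channel score is attached to filter i."""
    assert scores.shape[0] == conv_weight.shape[0]
    return {i: float(scores[i]) for i in range(conv_weight.shape[0])}
```

Any monotone importance measure (and any consistent channel-to-filter mapping) would slot into the same two roles; the essential point is that the per-layer parameter is computed from the model's output rather than from the weights alone.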
S303: and based on the preset pruning rate corresponding to each convolution layer, performing attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter.
In the embodiment of the application, the server determines a target feature map set from the feature map set output by each convolutional layer based on the preset pruning rate corresponding to that layer. The preset pruning rates of the plurality of convolutional layers may all be the same, may all be different, or some layers may share the same rate while the remaining layers differ.
In an optional implementation, when the preset pruning rate is the same for every convolutional layer, the server may arbitrarily select, from the feature map set output by each layer, a subset whose channel count accounts for a% of that layer's total output channels as the target feature map set. "Arbitrarily" here may mean randomly selecting feature maps from the set output by the convolutional layer, or selecting them in order; for example, fig. 4 illustrates taking the last a% of the feature maps output by each convolutional layer as the target feature map set.
In another optional implementation, the preset pruning rates corresponding to the convolutional layers may be completely different. For example, assume the current network model includes three convolutional layers: a first convolutional layer outputting a feature map set of 10 channels, a second convolutional layer outputting a feature map set of 10 channels, and a third convolutional layer outputting a feature map set of 20 channels. If the preset pruning rate is 20% for the first layer, 40% for the second layer, and 50% for the third layer, the server selects a feature map set of 2 channels from the 10 channels output by the first layer, a feature map set of 4 channels from the 10 channels output by the second layer, and a feature map set of 10 channels from the 20 channels output by the third layer, and takes the selected 16 channels of feature maps together as the target feature map set. Similarly, when some of the convolutional layers share the same preset pruning rate while the others differ, the target feature map set can be determined in the same way as when the rates are completely different, and the details are not repeated in this application.
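The channel counts in this three-layer example reduce to simple per-layer arithmetic, sketched below (the helper name is introduced here only for illustration):

```python
def channels_to_prune(channel_counts, prune_rates):
    """Per-layer number of target channels and their total.

    channel_counts : channels output by each convolutional layer
    prune_rates    : preset pruning rate of each layer
    """
    per_layer = [int(round(n * r)) for n, r in zip(channel_counts, prune_rates)]
    return per_layer, sum(per_layer)

# The three-layer example from the text: 10, 10 and 20 channels pruned at
# 20%, 40% and 50% give 2 + 4 + 10 = 16 target channels.
per_layer, total = channels_to_prune([10, 10, 20], [0.20, 0.40, 0.50])
```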
In the embodiment of the application, after determining the target feature map set from the feature map set output by each convolutional layer based on that layer's preset pruning rate, the server determines the parameters corresponding to the target feature map set and performs attenuation processing on them based on a preset coefficient to obtain transition parameters. The server then determines the attenuation parameters corresponding to each convolutional layer from the obtained transition parameters and the preset mapping relationship, that is, the relationship between the parameters corresponding to a convolutional layer and the parameters corresponding to the feature map set output by that layer.
S305: and if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, removing the parameter corresponding to the attenuation parameter in the convolutional layer to obtain the pruned network model.
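Steps S303 and S305 together amount to gradually attenuating a parameter until its difference from the preset threshold falls within the preset interval. The sketch below traces one parameter through repeated attenuation; the coefficient, threshold, and interval values are illustrative assumptions, not values taken from the patent.

```python
def decay_until_removable(value, coeff=0.9, threshold=0.0, interval=1e-3, max_steps=10000):
    """Attenuate one parameter until it becomes removable.

    Repeatedly multiplies the parameter by the attenuation coefficient (S303)
    and reports the first step at which the difference to the preset threshold
    falls within the preset interval (the removal condition of S305).
    coeff, threshold, and interval are illustrative assumptions.
    """
    for step in range(1, max_steps + 1):
        value *= coeff                        # S303: attenuation processing
        if abs(value - threshold) < interval:  # S305: removal condition met
            return step, value
    return None, value
```

Because the decay is geometric, a parameter approaches the threshold smoothly over many iterations, which is consistent with the smooth accuracy curves reported in the experiments below.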
Next, based on the training image set described above, the current network model is pruned to obtain a pruned network model, and a specific experimental description is performed.
In an optional implementation, the AlexNet model is used as the current network model, the CIFAR100 data set is used as the training image set, and the AlexNet model is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 80% is performed on the AlexNet model at both the 25th epoch and the 75th epoch to obtain the pruned AlexNet model, where the computation between the convolutional layers after pruning is 41% of the computation between the convolutional layers before pruning. The recognition accuracy of the AlexNet model on the training image set is determined both before and after pruning, and a performance analysis of the pruned model is carried out on the basis of these two accuracy values. Fig. 5 is a data comparison graph of the recognition accuracy of the AlexNet model on the training image set before and after pruning when pruning with the preset pruning rate of 80% is performed at both the 25th epoch and the 75th epoch. In the figure, the dotted line Baseline represents the recognition accuracy of the AlexNet model before pruning on the training image set, and the solid line Smooth_Pruning represents the recognition accuracy of the AlexNet model after pruning on the training image set.
As can be clearly seen from Fig. 5, throughout the pruning process the recognition accuracy of the pruned AlexNet model on the CIFAR100 data set remains basically consistent with that of the AlexNet model before pruning; the accuracy does not suddenly drop when the model is pruned at the 25th and 75th epochs, so the whole pruning process is very smooth, and the recognition accuracy of the pruned AlexNet model on the CIFAR100 data set finally exceeds that of the model before pruning.
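The training schedule used in this experiment, with pruning events interleaved into ordinary training rather than applied as one abrupt cut, can be sketched as follows. This is a hypothetical illustration; the epoch constants come from the experiment above, while the function and action names are our assumptions.

```python
# Hypothetical sketch of the schedule in the experiment above: train the
# model for 250 epochs and trigger a pruning event at epochs 25 and 75,
# so pruning is interleaved with ordinary training instead of being a
# single abrupt cut.

PRUNE_EPOCHS = {25, 75}   # epochs at which the pruning event fires
TOTAL_EPOCHS = 250

def schedule(total_epochs=TOTAL_EPOCHS, prune_epochs=PRUNE_EPOCHS):
    """Yield (epoch, action) pairs for the smooth-pruning training loop."""
    for epoch in range(1, total_epochs + 1):
        yield epoch, "prune_then_train" if epoch in prune_epochs else "train"

actions = dict(schedule())
print(actions[25], actions[75], actions[100])  # prune_then_train prune_then_train train
```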
In an optional implementation, the AlexNet model is used as the current network model, the CIFAR100 data set is used as the training image set, and the AlexNet model is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 60% is performed on the AlexNet model at both the 25th epoch and the 75th epoch to obtain the pruned AlexNet model, where the computation between the convolutional layers after pruning is 13% of the computation between the convolutional layers before pruning. The recognition accuracy of the AlexNet model on the training image set is determined both before and after pruning, and a performance analysis of the pruned model is carried out on the basis of these two accuracy values. Fig. 6 is a data comparison graph of the recognition accuracy of the AlexNet model on the training image set before and after pruning when pruning with the preset pruning rate of 60% is performed at both the 25th epoch and the 75th epoch. In the figure, the dotted line Baseline represents the recognition accuracy of the AlexNet model before pruning on the training image set, and the solid line Smooth_Pruning represents the recognition accuracy of the AlexNet model after pruning on the training image set.
As can be clearly seen from Fig. 6, the recognition accuracy of the pruned AlexNet model on the CIFAR100 data set finally exceeds that of the AlexNet model before pruning.
In an optional implementation, VGG19 is used as the current network model, the CIFAR100 data set is used as the training image set, and VGG19 is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 80% is performed on VGG19 at both the 25th epoch and the 75th epoch to obtain the pruned VGG19, where the computation between the convolutional layers after pruning is 41% of the computation between the convolutional layers before pruning. The recognition accuracy of VGG19 on the training image set is determined both before and after pruning, and a performance analysis of the pruned model is carried out on the basis of these two accuracy values. Fig. 7 is a data comparison graph of the recognition accuracy of VGG19 on the training image set before and after pruning when pruning with the preset pruning rate of 80% is performed at both the 25th epoch and the 75th epoch. In the figure, the dashed line Baseline represents the recognition accuracy of the VGG19 model before pruning on the training image set, and the solid line Smooth_Pruning represents the recognition accuracy of the VGG19 model after pruning on the training image set.
As can be clearly seen from Fig. 7, during the training process the convergence rate of the VGG19 model gradually increases, and the recognition accuracy of the pruned VGG19 on the CIFAR100 data set finally exceeds that of VGG19 before pruning.
In an optional implementation, VGG19 is used as the current network model, the CIFAR100 data set is used as the training image set, and VGG19 is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 60% is performed on VGG19 at both the 25th epoch and the 75th epoch to obtain the pruned VGG19, where the computation between the convolutional layers after pruning is 13% of the computation between the convolutional layers before pruning. The recognition accuracy of VGG19 on the training image set is determined both before and after pruning, and a performance analysis of the pruned model is carried out on the basis of these two accuracy values. Fig. 8 is a data comparison graph of the recognition accuracy of VGG19 on the training image set before and after pruning when pruning with the preset pruning rate of 60% is performed at both the 25th epoch and the 75th epoch. In the figure, the dashed line Baseline represents the recognition accuracy of the VGG19 model before pruning on the training image set, and the solid line Smooth_Pruning represents the recognition accuracy of the VGG19 model after pruning on the training image set.
As can be clearly seen from Fig. 8, in the early stage of training the convergence rate of the VGG19 model gradually increases, and the recognition accuracy of the pruned VGG19 on the CIFAR100 data set finally exceeds that of VGG19 before pruning.
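The 41% and 13% computation figures quoted in these experiments are consistent with simple channel arithmetic: the computation between two adjacent convolutional layers scales with the product of the input and output channel counts, and each of the two pruning events keeps 80% (or 60%) of the channels in each layer. A small sketch of that arithmetic (the function name is ours, not the patent's):

```python
# The document's "pruning rate" is read here as the fraction of channels
# KEPT at each pruning event (80% or 60%), applied at both epoch 25 and
# epoch 75, which reproduces the quoted 41% and 13% computation figures.

def remaining_computation(keep_rate: float, num_events: int = 2) -> float:
    """FLOPs between two conv layers scale with in_channels * out_channels.

    After each pruning event both layers keep `keep_rate` of their
    channels, so the computation between them shrinks by keep_rate**2
    per event.
    """
    channels_kept = keep_rate ** num_events   # e.g. 0.8 * 0.8 = 0.64
    return channels_kept ** 2                 # both factor layers shrink

print(round(remaining_computation(0.8), 2))   # 0.41 -> 41% of original FLOPs
print(round(remaining_computation(0.6), 2))   # 0.13 -> 13% of original FLOPs
```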
In an alternative embodiment, the SiamFC model is used as the current network model, and the OTB50 data set, the OTB100 data set, or the GOT-10K data set is used as the training image set; the SiamFC model is trained with that data set for 50 iteration batches (epochs). Pruning with a preset pruning rate of 80% may be performed on the SiamFC model at the 5th epoch and the 15th epoch, giving a final pruning rate of 0.64 and yielding the pruned SiamFC model, where the computation between the convolutional layers after pruning is 41% of the computation between the convolutional layers before pruning. The recognition accuracy of the SiamFC model on the training image set is determined both before and after pruning. The table below compares the recognition accuracy and success rate on the training image set of the SiamFC model before pruning (SiamFC) and after pruning (Prun_SiamFC), where the recognition accuracy refers to how precisely the model locates the actual position of the target, and the success rate refers to the probability that the region framed by the model covers the target.
[Table image BDA0002680292320000131: recognition accuracy and success rate of SiamFC and Prun_SiamFC on the training image set; table content not reproduced]
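The success-rate metric mentioned above is conventionally computed from bounding-box overlap in tracking benchmarks such as OTB. The sketch below assumes that convention (fraction of frames where the predicted box overlaps the ground truth with IoU above a threshold); the patent itself does not spell out the formula, so treat the names and the 0.5 threshold as illustrative assumptions.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax1 + aw, bx1 + bw), min(ay1 + ah, by1 + bh)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def success_rate(predictions, ground_truth, threshold=0.5):
    """Fraction of frames where the framed region covers the target,
    i.e. predicted-vs-actual IoU exceeds `threshold` (assumed value)."""
    hits = sum(iou(p, g) > threshold for p, g in zip(predictions, ground_truth))
    return hits / len(ground_truth)

preds = [(0, 0, 10, 10), (100, 100, 10, 10)]
gts = [(0, 0, 10, 10), (0, 0, 10, 10)]
print(success_rate(preds, gts))  # 0.5: one exact hit, one complete miss
```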
From the above experimental data, it can be clearly seen that the network model pruning disclosed in the present application effectively lightens the current model without reducing its performance.
With the pruning method for a network model provided by the embodiments of the present application, attenuation processing is performed on the parameters corresponding to the convolutional layers, so that the knowledge learned by the parameters to be eliminated is forced to transfer to the remaining parameters. The number of parameters is thus reduced without increasing the training burden, and the recognition accuracy of the network model is preserved.
Fig. 9 is a schematic structural diagram of a pruning apparatus for a network model according to an embodiment of the present application, and as shown in fig. 9, the pruning apparatus includes:
the obtaining module 901 is configured to obtain a training image set and a current network model; the current network model comprises a plurality of convolutional layers;
the pruning module 903 is configured to perform pruning processing on the current network model based on the training image set to obtain a pruned network model;
wherein the pruning processing comprises the following steps:
inputting the training image into a current network model, and determining a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter;
and if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, removing the parameter corresponding to the attenuation parameter in the convolutional layer to obtain the pruned network model.
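As a rough illustration of the attenuate-then-remove procedure above, the following NumPy sketch operates on one convolutional layer's weight tensor. The choice of L1 channel importance, the decay coefficient, the threshold, and the interval width are all our assumptions for illustration; the patent does not fix these details.

```python
import numpy as np

def pruning_step(weights, keep_rate, decay=0.5, threshold=0.0, eps=1e-3):
    """One attenuation-and-removal step for a single conv layer (sketch).

    weights: (out_channels, in_channels, k, k) weight tensor.
    keep_rate: fraction of output channels to keep (the preset pruning rate).
    Returns the weights with attenuated-to-threshold channels removed.
    """
    out_ch = weights.shape[0]
    # Per-channel importance: L1 norm of each output channel (assumed metric).
    importance = np.abs(weights.reshape(out_ch, -1)).sum(axis=1)

    n_keep = int(round(keep_rate * out_ch))
    to_attenuate = np.argsort(importance)[: out_ch - n_keep]  # least important

    weights = weights.copy()
    weights[to_attenuate] *= decay  # gradual attenuation, not hard zeroing

    # Remove channels whose attenuated norm is within eps of the threshold,
    # i.e. the difference falls inside the preset interval.
    norms = np.abs(weights.reshape(out_ch, -1)).sum(axis=1)
    keep_mask = np.ones(out_ch, dtype=bool)
    keep_mask[to_attenuate] = np.abs(norms[to_attenuate] - threshold) >= eps
    return weights[keep_mask]
```

In this sketch a targeted channel survives (attenuated) until repeated decay drives its norm into the preset interval around the threshold, at which point it is eliminated, which is what lets accuracy degrade smoothly instead of abruptly.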
In the embodiment of the present application, the pruning module 903 described above includes:
the determining module 9031 is configured to input the training image into the current network model, and determine a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to an output of the current network model;
the attenuation module 9033 is configured to perform attenuation processing on a parameter corresponding to each convolution layer based on a preset pruning rate corresponding to each convolution layer to obtain an attenuation parameter;
the rejecting module 9035 is configured to reject a parameter corresponding to the attenuation parameter in the convolutional layer if the difference between the attenuation parameter and the preset threshold is within the preset interval, so as to obtain the pruned network model.
The device embodiments and the method embodiments of the present application are based on the same inventive concept.
An embodiment of the present application further provides an electronic device, which may be disposed in a server and includes a memory storing at least one instruction, at least one program, a code set, or a set of instructions related to implementing the pruning method for a network model in the method embodiments; the at least one instruction, the at least one program, the code set, or the set of instructions is loaded from the memory and executed to implement the pruning method for a network model.
The present application also provides a storage medium, which may be disposed in a server to store at least one instruction, at least one program, a code set, or a set of instructions related to implementing the pruning method for a network model in the method embodiments; the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the pruning method for a network model.
Optionally, in this embodiment, the storage medium may be located in at least one of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to, various media that can store program code, such as a USB flash disk, a read-only memory (ROM), a removable hard disk, a magnetic disk, or an optical disk.
As can be seen from the above embodiments of the pruning method, device, electronic device, and storage medium for a network model provided by the present application, the method includes obtaining a training image set and a current network model, where the current network model includes a plurality of convolutional layers, and pruning the current network model based on the training image set to obtain a pruned network model. The pruning processing includes: inputting the training image into the current network model; determining a parameter corresponding to each of the plurality of convolutional layers according to the output of the current network model; performing attenuation processing on the parameter corresponding to each convolutional layer based on the preset pruning rate corresponding to that layer to obtain an attenuation parameter; and, if the difference between the attenuation parameter and a preset threshold is within a preset interval, removing the parameter corresponding to the attenuation parameter from the convolutional layer to obtain the pruned network model. Based on the embodiments of the present application, attenuation processing on the parameters corresponding to the convolutional layers forces the knowledge learned by the parameters to be eliminated to transfer to the remaining parameters, so the number of parameters is reduced without increasing the training burden, and the recognition accuracy of the network model is preserved.
It should be noted that the foregoing order of the embodiments of the present application is for description only and does not indicate the relative merit of the embodiments; the specific embodiments are described in this specification, and other embodiments are also within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve the desired results; in some embodiments, multitasking and parallel processing may also be possible or advantageous.
All the embodiments in this specification are described in a progressive manner; the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, the device embodiment is described briefly because it is substantially similar to the method embodiment; for relevant points, reference may be made to the partial description of the method embodiment.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims (10)

1. A pruning method for a network model is characterized by comprising the following steps:
acquiring a training image set and a current network model; the current network model comprises a plurality of convolutional layers;
pruning the current network model based on the training image set to obtain a pruned network model;
wherein the pruning processing comprises the following steps:
inputting the training image into the current network model, and determining a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter;
and if the difference value between the attenuation parameter and a preset threshold value is within a preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolutional layer to obtain the pruned network model.
2. The method of claim 1, wherein after obtaining the pruned network model, the method further comprises:
and re-determining the pruned network model as the current network model, and returning to execute the step of pruning the current network model based on the training image set to obtain the pruned network model.
3. The method of claim 1, wherein inputting the training image into the current network model and determining parameters corresponding to each of the plurality of convolutional layers based on an output of the current network model comprises:
inputting the training image into the current network model, and determining a feature atlas output by each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
determining parameters corresponding to the feature map set output by each convolution layer;
and determining the corresponding parameter of each convolution layer according to the preset mapping relation and the corresponding parameter of the feature map set output by each convolution layer.
4. The method of claim 3, wherein the attenuating the parameter corresponding to each convolutional layer based on the preset pruning rate corresponding to each convolutional layer to obtain an attenuation parameter comprises:
determining a target feature map set from the feature map set output by each convolutional layer based on the preset pruning rate corresponding to each convolutional layer; the ratio of the number of channels of the target feature map set to the number of channels of the feature map set is the preset pruning rate;
determining parameters corresponding to the target feature atlas;
based on a preset coefficient, carrying out attenuation processing on parameters corresponding to the target feature map set to obtain transition parameters;
and determining attenuation parameters corresponding to each convolution layer according to the preset mapping relation and the transition parameters.
5. The method of claim 1, wherein after obtaining the pruned network model, the method further comprises:
re-determining the pruned network model as the current network model;
and training the current network model by using the training image set to obtain a trained network model.
6. The method of claim 5, wherein the training the current network model using the training image set to obtain a trained network model comprises:
inputting the training image into the current network model, and determining parameter sets corresponding to the plurality of convolutional layers according to the output of the current network model;
determining parameter sets to be pruned from the parameter sets corresponding to the plurality of convolutional layers, and determining parameters except the parameter sets to be pruned in the parameter sets corresponding to the plurality of convolutional layers as parameter sets to be updated;
and carrying out pause updating processing on the parameter set to be pruned, and updating the parameter set to be updated to obtain the trained network model.
7. A pruning apparatus for a network model, comprising:
the acquisition module is used for acquiring a training image set and a current network model; the current network model comprises a plurality of convolutional layers;
the pruning module is used for carrying out pruning processing on the current network model based on the training image set to obtain a pruned network model;
wherein the pruning processing comprises the following steps:
inputting the training image into the current network model, and determining a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation processing on the parameter corresponding to each convolution layer to obtain an attenuation parameter;
and if the difference value between the attenuation parameter and a preset threshold value is within a preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolutional layer to obtain the pruned network model.
8. The apparatus of claim 7, wherein the pruning module comprises:
a determining module, configured to input the training image into the current network model, and determine a parameter corresponding to each convolutional layer in the plurality of convolutional layers according to an output of the current network model;
the attenuation module is used for carrying out attenuation processing on the parameters corresponding to each convolution layer based on the preset pruning rate corresponding to each convolution layer to obtain attenuation parameters;
and the rejecting module is used for rejecting parameters corresponding to the attenuation parameters in the convolutional layer to obtain the pruned network model if the difference value between the attenuation parameters and a preset threshold value is within a preset interval.
9. An electronic device, comprising a processor and a memory, wherein the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by the processor to implement the method of pruning a network model according to any one of claims 1-6.
10. A computer-readable storage medium, having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method of pruning a network model according to any one of claims 1-6.
CN202010964152.9A 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium Active CN112101547B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010964152.9A CN112101547B (en) 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN112101547A true CN112101547A (en) 2020-12-18
CN112101547B CN112101547B (en) 2024-04-16

Family

ID=73751627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010964152.9A Active CN112101547B (en) 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112101547B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734029A (en) * 2020-12-30 2021-04-30 中国科学院计算技术研究所 Neural network channel pruning method, storage medium and electronic equipment
CN113111925A (en) * 2021-03-29 2021-07-13 宁夏新大众机械有限公司 Feed qualification classification method based on deep learning
CN113361697A (en) * 2021-07-14 2021-09-07 深圳思悦创新有限公司 Convolution network model compression method, system and storage medium
CN115186937A (en) * 2022-09-09 2022-10-14 闪捷信息科技有限公司 Prediction model training and data prediction method and device based on multi-party data cooperation

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104684041A (en) * 2015-02-06 2015-06-03 中国科学院上海微系统与信息技术研究所 Real-time wireless sensor network routing method supporting large-scale node application
CN109671063A (en) * 2018-12-11 2019-04-23 西安交通大学 A kind of image quality measure method of importance between the network characterization based on depth
CN109886391A (en) * 2019-01-30 2019-06-14 东南大学 A kind of neural network compression method based on the positive and negative diagonal convolution in space
CN110619385A (en) * 2019-08-31 2019-12-27 电子科技大学 Structured network model compression acceleration method based on multi-stage pruning
CN110874631A (en) * 2020-01-20 2020-03-10 浙江大学 Convolutional neural network pruning method based on feature map sparsification
US20200234130A1 (en) * 2017-08-18 2020-07-23 Intel Corporation Slimming of neural networks in machine learning environments
CN111461291A (en) * 2020-03-13 2020-07-28 西安科技大学 Long-distance pipeline inspection method based on YO L Ov3 pruning network and deep learning defogging model
CN111652236A (en) * 2020-04-21 2020-09-11 东南大学 Lightweight fine-grained image identification method for cross-layer feature interaction in weak supervision scene
US20200301994A1 (en) * 2019-03-20 2020-09-24 Imagination Technologies Limited Methods and Systems for Implementing a Convolution Transpose Layer of a Neural Network


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHOUMENG QIU 等: "BFRIFP: Brain Functional Reorganization Inspired Filter Pruning", 《ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING – ICANN 2021》, vol. 12894, 7 September 2021 (2021-09-07), pages 16 - 28, XP047607094, DOI: 10.1007/978-3-030-86380-7_2 *


Also Published As

Publication number Publication date
CN112101547B (en) 2024-04-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant