WO2023098544A1 - Structured pruning method and device based on local sparse constraints - Google Patents
Structured pruning method and device based on local sparse constraints
- Publication number
- WO2023098544A1 (PCT/CN2022/133849)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- neural network
- network model
- sparse
- pruning
- training
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Definitions
- the present application relates to the field of computer technology, in particular to a method and device for structured pruning based on local sparse constraints.
- model pruning generally trains a larger neural network model to fit a large amount of data; after training is completed, unimportant weights or channels are removed, so that the neural network model retains its superior performance while its parameter count is reduced and its forward propagation is accelerated.
- structured pruning, as a form of model pruning, prunes at the granularity of convolution kernels.
- the pruned neural network model keeps a conventional convolutional network structure and can be deployed and run for forward inference without specific inference libraries or hardware support.
- the currently common structured pruning method performs L1-regularized sparse training on the neural network model before pruning, then removes the parameters and network connections corresponding to the sparsified channels from the neural network model, and finally fine-tunes the pruned neural network model to restore its accuracy. By constraining the neural network model to a sparse state through L1-regularized sparse training, this method can reduce the impact of the pruning operation on the performance of the neural network model.
- however, this common structured pruning method still has the following problem in practice: L1 regularization makes the neural network model sparse, but it also constrains the channels that are to be retained.
- the sparse training of this common structured pruning method therefore limits the expressive ability of the retained channels, which affects the convergence of the pruned neural network model and, in turn, its accuracy.
- the present application provides a structured pruning method and device based on local sparse constraints, which are used to solve the defect in the prior art that sparse training limits the expressive ability of the retained channels and thereby affects the convergence of the pruned neural network model.
- by limiting the scope of the sparse training, the convergence of the neural network model obtained by structured pruning can be guaranteed.
- the present application provides a structured pruning method based on local sparse constraints, including:
- based on a preset mask, scope-limited sparse training is performed on a neural network model using a sample data set to obtain a neural network model with sparse weights; wherein the mask is preset based on a pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model;
- based on the preset mask, the parameters and network connections of the output channels in the weight-sparse neural network model are pruned;
- the neural network model obtained by the pruning process is fine-tuned using the sample data set to obtain a target neural network model.
- according to the structured pruning method based on local sparse constraints provided by the present application, performing the scope-limited sparse training on the neural network model using the sample data set based on the preset mask to obtain the weight-sparse neural network model includes: obtaining the preset mask, wherein the mask is preset based on the pruning rate, the number of layers of the neural network model, and the number of output channels of each layer; modifying the L1 regularization term in the objective function of the sparse training based on the obtained mask; and performing sparse training on the neural network model using the sample data set based on the modified objective function, to obtain the neural network model with sparse weights.
- the mask is a set of vectors consisting of 0s and 1s; each vector corresponds to one layer of the neural network model, the number of elements in each vector equals the number of output channels of the corresponding layer of the neural network model, and the numbers of 0s and 1s in each vector are determined by the pruning rate and the number of output channels of the corresponding layer of the neural network model.
- according to the structured pruning method based on local sparse constraints, if 1 indicates that an output channel is constrained by the L1 regularization term, the number of 1s in each vector is determined by the product of the pruning rate and the number of output channels of the corresponding layer of the neural network model; if 0 indicates that an output channel is not constrained by the L1 regularization term, the number of 0s in each vector is determined by the difference between the number of output channels of the corresponding layer and the number of 1s in the vector;
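- as an illustration of how such per-layer mask vectors could be built, the following is a minimal Python sketch. The convention that 1 marks a channel to be sparsified (and later pruned) follows the description above; the function name, the rounding of the product and the placement of the 1s at the start of each vector are assumptions made for illustration only.

```python
from typing import Dict, List

def build_masks(pruning_rate: float, out_channels_per_layer: List[int]) -> Dict[int, List[int]]:
    """Build one 0/1 mask vector per layer.

    A value of 1 marks an output channel that receives the L1 (Lasso) constraint
    during sparse training and is removed at pruning time; a value of 0 marks a
    channel that is left unconstrained and retained.
    """
    masks = {}
    for layer_idx, n_out in enumerate(out_channels_per_layer):
        num_ones = int(round(pruning_rate * n_out))   # pruning rate x output channels
        num_zeros = n_out - num_ones                   # remaining, retained channels
        masks[layer_idx] = [1] * num_ones + [0] * num_zeros
    return masks

# Example: a 50% pruning rate on three layers with 64, 128 and 256 output channels.
masks = build_masks(0.5, [64, 128, 256])
print(sum(masks[0]))  # 32 channels of layer 0 are marked for sparsification
```

- in practice the choice of which specific channels receive the 1s is left open by the description above; the sketch simply marks the first channels of each layer.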
- pruning the parameters and network connections of the output channels in the weight-sparse neural network model based on the preset mask includes: for each output channel of each layer of the weight-sparse neural network model, judging whether the value corresponding to the output channel in the mask is 1; and if the value corresponding to the output channel in the mask is 1, removing the parameters and network connections corresponding to the output channel from the weight-sparse neural network model.
- the objective function of the sparse training further includes a pre-training objective function item, and the neural network model is obtained through the pre-training.
- according to the structured pruning method based on local sparse constraints provided by the present application, before the scope-limited sparse training is performed on the neural network model using the sample data set to obtain the weight-sparse neural network model, the method includes: pre-training an initialized neural network model using the sample data set to obtain the neural network model.
- the present application also provides a structured pruning device based on local sparse constraints, including:
- a sparse training module, configured to perform scope-limited sparse training on the neural network model using the sample data set based on the preset mask, to obtain a neural network model with sparse weights; wherein the mask is preset based on the pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model;
- a pruning processing module, configured to prune the parameters and network connections of the output channels in the weight-sparse neural network model based on the preset mask;
- a fine-tuning training module, configured to perform fine-tuning training on the neural network model obtained by the pruning process using the sample data set, to obtain a target neural network model.
- the present application also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of any of the structured pruning methods based on local sparse constraints described above.
- the application also provides a non-transitory computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, it implements the steps of any of the structured pruning methods based on local sparse constraints described above.
- the application also provides a computer program product on which a computer program is stored; when the computer program is executed by a processor, it implements the steps of any of the structured pruning methods based on local sparse constraints described above.
- the structured pruning method and device based on local sparse constraints provided by this application use the mask so that, when the neural network model is structurally pruned, sparse training is applied only to the channels that the pruning process needs to remove and not to the channels that do not need to be removed. Limiting the scope of the sparse training in this way means that sparse training does not restrict the expressive ability of the channels retained by pruning, so structured pruning does not affect the convergence of the neural network model. This guarantees the convergence, and in turn the accuracy, of the neural network model obtained by structured pruning, so that it performs better than neural network models obtained by existing pruning methods.
- FIG. 1 is a schematic flow diagram of a structured pruning method based on local sparse constraints provided by the present application
- Fig. 2 is the schematic flow chart of the sparsification training that limits the scope of action of the neural network model provided by the present application;
- Fig. 3 is a schematic flow chart of the pruning process provided by the present application to a neural network model with sparse weights
- FIG. 4 is a schematic flow diagram of another structured pruning method based on local sparse constraints provided by the present application.
- FIG. 5 is a schematic diagram of an application scenario of a structured pruning method based on local sparse constraints provided by the present application
- FIG. 6 is a schematic diagram of the composition and structure of the structured pruning device based on local sparse constraints provided by the present application;
- FIG. 7 is a schematic diagram of the composition and structure of the electronic device provided by the present application.
- Fig. 1 is a schematic flow diagram of the structured pruning method based on local sparse constraints provided by the present application. The structured pruning method based on local sparse constraints shown in Fig. 1 can be executed by a structured pruning device based on local sparse constraints.
- the structured pruning device based on local sparse constraints can be set on a client or a server. The client can be a smart phone, a notebook computer, a vehicle-mounted computer, a robot, a wearable device, etc., and the server can be a physical server containing an independent host, a virtual server carried by a host cluster, a cloud server, etc., which is not limited in the embodiment of the present application.
- as shown in Fig. 1, the structured pruning method based on local sparse constraints at least includes:
- 101, based on a preset mask, performing scope-limited sparse training on the neural network model using a sample data set to obtain a neural network model with sparse weights; wherein the mask is preset based on the pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model.
- the neural network model for structured pruning with local sparse constraints may be a neural network model for computer vision processing, or a neural network model for natural language processing, etc.
- the disclosed embodiments do not limit the application field of the neural network model for structured pruning with local sparsity constraints.
- the neural network model for structured pruning with local sparse constraints can be a neural network model obtained after conventional training of the neural network model.
- the neural network model can be a convolutional neural network (CNN), a deep neural network (DNN), a recurrent neural network (RNN), etc.
- the embodiment of the present application does not limit the type of neural network model for structured pruning with local sparse constraints.
- the sample data set can be an image data set, or a text data set, or a voice data set, etc.
- the embodiment of the present application does not limit the type of the sample data set.
- the sample data set may be an existing data set or may be obtained by collecting sample data; the embodiments of this application do not limit how the sample data set is acquired.
- before structured pruning with local sparse constraints is performed on the neural network model, a mask can be preset according to the pruning rate of the neural network model; the mask specifies the channels or convolution kernels in the neural network model on which sparse training is performed.
- when the neural network model is sparsely trained according to the mask, sparse training is applied only to the channels or convolution kernels specified by the mask, while the channels or convolution kernels not specified by the mask are not sparsely trained.
- the pruning rate may be a global pruning rate, or may be a pruning rate set separately for each layer of the neural network model, which is not limited in the embodiment of the present application.
- the mask can be a string of binary codes consisting of 0s and 1s. A string of binary codes can be set for each layer in the neural network model, and each bit in the binary code corresponds to one convolution kernel or output channel of the neural network model; by operating the mask on the convolution kernels of the neural network model, the output channels of the neural network model can be selected.
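- as a small illustration of how the mask can be operated with the convolution kernels to select output channels, the following toy snippet broadcasts a per-channel 0/1 mask over a convolution weight tensor; the tensor shapes and values are assumptions for illustration, not taken from the application.

```python
import torch

# A toy convolution layer with 8 output channels, 3 input channels and 3x3 kernels.
weight = torch.randn(8, 3, 3, 3)
# Mask bits: 1 selects a channel for the scope-limited sparse training, 0 leaves it alone.
mask = torch.tensor([1, 1, 1, 1, 0, 0, 0, 0], dtype=weight.dtype).view(-1, 1, 1, 1)
selected = weight * mask          # only the selected output channels remain non-zero
print(selected[4].abs().sum())    # tensor(0.) -- an unselected channel contributes nothing
```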
- the method of performing sparse training on the neural network model through the sample data set can be realized by using an existing sparse training method, such as a supervised L1 regularized sparse training method, which is not limited in this embodiment of the present application.
- 102, based on the preset mask, the parameters and network connections of the output channels in the weight-sparse neural network model are pruned.
- after the weight-sparse neural network model is obtained through sparse training, it can be pruned according to the preset mask: the parameters corresponding to the channels or convolution kernels specified by the mask are removed from the weight-sparse neural network model, together with the network connections of those channels, that is, the parameters of the corresponding input channels of the next layer of the neural network model; the parameters corresponding to the channels or convolution kernels not specified by the mask are retained in the weight-sparse neural network model.
- 103, the neural network model obtained by the pruning process is fine-tuned using the sample data set, restoring its accuracy and yielding a lightweight, small-scale target neural network model.
- the method of performing fine-tuning training on the neural network model obtained by pruning through the sample data set can be realized by using an existing fine-tuning training method, such as a supervised fine-tuning training method, which is not limited in this embodiment of the present application.
- the structured pruning method based on local sparse constraints provided by the embodiment of this application uses the mask so that, when the neural network model is structurally pruned, sparse training is applied only to the channels that the pruning process needs to remove and not to the channels that do not need to be removed. Limiting the scope of the sparse training in this way means that sparse training does not restrict the expressive ability of the channels retained by pruning, so structured pruning does not affect the convergence of the neural network model. This guarantees the convergence, and in turn the accuracy, of the neural network model obtained by structured pruning, so that it performs better than neural network models obtained by existing pruning methods.
- Figure 2 is a schematic flow diagram of the scope-limited sparse training of the neural network model provided by the present application. As shown in Figure 2, performing scope-limited sparse training on the neural network model using the sample data set based on the preset mask to obtain the weight-sparse neural network model includes at least:
- 201 Acquire a preset mask; wherein, the mask is preset based on the pruning rate, the number of layers of the neural network model, and the number of output channels of each layer.
- the mask of the neural network model may be preset according to the pruning rate of the neural network model, the number of layers of the neural network model that need to be pruned in a structured manner, and the number of output channels of each layer.
- the mask can be a set of vectors consisting of 0 and 1, each vector corresponds to a layer of the neural network model, and the number of elements contained in each vector is the number of output channels of the corresponding layer of the neural network model, and each vector The number of 0s and 1s included can be determined by the pruning rate and the number of output channels of the corresponding layer of the neural network model.
- the implementation form of the mask is not limited in this embodiment of the application.
- 202, based on the obtained mask, the L1 regularization term in the objective function of the sparse training is modified.
- in the embodiment of the present application, an L1-regularized sparse training method is used to sparsely train the neural network model; after the preset mask is obtained, it can be used to modify the L1 regularization term in the objective function of the sparse training.
- in some optional examples, if 1 indicates that an output channel is constrained by the L1 regularization term, the number of 1s in each vector is determined by the product of the pruning rate and the number of output channels of the corresponding layer of the neural network model; if 0 indicates that an output channel is not constrained by the L1 regularization term, the number of 0s in each vector is determined by the difference between the number of output channels of the corresponding layer and the number of 1s in the vector.
- in other optional examples, if 0 indicates that an output channel is constrained by the L1 regularization term, the number of 0s in each vector is determined by the product of the pruning rate and the number of output channels of the corresponding layer; if 1 indicates that an output channel is not constrained by the L1 regularization term, the number of 1s in each vector is determined by the difference between the number of output channels of the corresponding layer and the number of 0s in the vector.
- for example, the modified objective function of the sparse training is Loss = loss_pretrain + Lasso(W)', where loss_pretrain is the pre-training objective term, obtained by pre-training the neural network model before the structured pruning, and Lasso(W)' is the modified L1 regularization term, i.e., the scope-limited Lasso constraint function, with λ as the sparsity factor. The mask value for the i-th output channel of the l-th layer is 0 or 1: a value of 0 means that the i-th output channel of the l-th layer is not subject to the Lasso constraint, and a value of 1 means that it is.
- 203, based on the modified objective function, sparse training is performed on the neural network model using the sample data set to obtain the weight-sparse neural network model.
- after the L1 regularization term in the objective function has been modified by the mask, L1 sparse training along the output-channel dimension can be performed on the convolution kernels of the neural network model using the sample data set according to the modified objective function. The mask limits the scope of the L1 sparse training so that it is applied only to the channels of the neural network model that need to be removed, while the channels that do not need to be removed are not L1-sparse-trained.
- for example, the modified objective function includes both the L1 regularization term and the pre-training objective term. When the neural network model is trained with the modified objective function, the L1 regularization term and the pre-training objective term act together on the channels that need to be removed, while only the pre-training objective term acts on the channels that do not need to be removed.
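- the following sketch shows one way the modified objective could be assembled in PyTorch for a purely convolutional model: the pre-training objective term plus a Lasso term that only touches the masked output channels. The layer traversal, the cross-entropy task loss and the value of the sparsity factor are assumptions made for illustration; the patent does not prescribe this exact implementation.

```python
import torch
import torch.nn as nn

def masked_lasso(model: nn.Module, masks: dict, sparsity_factor: float) -> torch.Tensor:
    """Scope-limited L1 (Lasso) term: only output channels whose mask value is 1
    contribute; channels with mask value 0 are left unconstrained."""
    device = next(model.parameters()).device
    penalty = torch.zeros((), device=device)
    convs = [m for m in model.modules() if isinstance(m, nn.Conv2d)]
    for layer_idx, conv in enumerate(convs):
        mask = torch.tensor(masks[layer_idx], dtype=conv.weight.dtype, device=device)
        # conv.weight has shape (out_channels, in_channels, k, k); take |w| per output channel.
        per_channel_l1 = conv.weight.abs().sum(dim=(1, 2, 3))
        penalty = penalty + (mask * per_channel_l1).sum()
    return sparsity_factor * penalty

def training_loss(outputs, targets, model, masks, sparsity_factor=1e-4):
    # Loss = loss_pretrain + Lasso(W)': task loss plus the mask-limited Lasso term.
    loss_pretrain = nn.functional.cross_entropy(outputs, targets)
    return loss_pretrain + masked_lasso(model, masks, sparsity_factor)
```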
- Fig. 3 is a schematic flow diagram of the pruning process for the weight-sparse neural network model provided by the present application. As shown in Fig. 3, pruning the parameters and network connections of the output channels in the weight-sparse neural network model based on the preset mask includes at least:
- 301, for each output channel of each layer of the weight-sparse neural network model, judging whether the value corresponding to the output channel in the mask is 1; if the value is 1, executing 302; if the value is 0, not operating on the parameters and network connections corresponding to the output channel.
- 302, removing the parameters and network connections corresponding to the output channel from the weight-sparse neural network model.
- since the weight-sparse neural network model is obtained by performing scope-limited sparse training on the neural network model according to the mask, the parameters and network connections of the output channels to be removed can be determined according to the value in the mask of each output channel of each layer of the weight-sparse neural network model.
- in some optional examples, for each output channel of each layer of the weight-sparse neural network model, it can be judged whether the value corresponding to the current output channel in the mask is 1; if the value is 1, the parameters and network connections corresponding to the current output channel are removed from the weight-sparse neural network model; if the value is 0, the parameters and network connections corresponding to the current output channel in the weight-sparse neural network model are not operated on.
- in other optional examples, the convention is reversed: it can be judged whether the value corresponding to the current output channel in the mask is 0; if the value is 0, the parameters and network connections corresponding to the current output channel are removed from the weight-sparse neural network model; if the value is 1, they are not operated on.
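- a minimal sketch of the pruning step for one pair of adjacent convolution layers is given below. It follows the first convention above (a mask value of 1 means "remove") and assumes a plain Conv2d-to-Conv2d structure without batch normalization, which is a simplification of real networks such as ResNet-50.

```python
import torch
import torch.nn as nn

def prune_conv_pair(conv: nn.Conv2d, next_conv: nn.Conv2d, mask: list):
    """Remove the output channels of `conv` whose mask value is 1, together with their
    network connections, i.e. the matching input channels of `next_conv`."""
    keep = [i for i, bit in enumerate(mask) if bit == 0]       # retained channels
    new_conv = nn.Conv2d(conv.in_channels, len(keep), conv.kernel_size,
                         stride=conv.stride, padding=conv.padding,
                         bias=conv.bias is not None)
    new_next = nn.Conv2d(len(keep), next_conv.out_channels, next_conv.kernel_size,
                         stride=next_conv.stride, padding=next_conv.padding,
                         bias=next_conv.bias is not None)
    with torch.no_grad():
        new_conv.weight.copy_(conv.weight[keep])               # keep retained filters
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias[keep])
        new_next.weight.copy_(next_conv.weight[:, keep])       # drop matching input channels
        if next_conv.bias is not None:
            new_next.bias.copy_(next_conv.bias)
    return new_conv, new_next
```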
- Fig. 4 is a schematic flow chart of another structured pruning method based on local sparse constraints provided by the present application. As shown in Fig. 4, this structured pruning method based on local sparse constraints at least includes:
- 401, pre-training an initialized neural network model using the sample data set to obtain the neural network model.
- before the scope-limited sparse training is performed on the neural network model, the initialized neural network model can be pre-trained using the sample data set, and the neural network model obtained from the pre-training is then used for the scope-limited sparse training. The initialized neural network model is a neural network model whose initial parameters have been set after the model is constructed.
- the pre-training of the initialized neural network model can be conventional neural network model training, such as supervised training, and the neural network model obtained after pre-training is a converged neural network model.
- by inputting an initialized neural network model and performing pre-training, scope-limited sparse training, pruning and fine-tuning training, a lightweight, small-scale neural network model that converges to good results can be output.
- 402, based on the preset mask, performing scope-limited sparse training on the neural network model using the sample data set to obtain a neural network model with sparse weights; wherein the mask is preset based on the pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model.
- the description of operation 402 may refer to the description of operation 101 in FIG. 1 and is not repeated here.
- 403, based on the preset mask, pruning the parameters and network connections of the output channels in the weight-sparse neural network model. The description of operation 403 may refer to the description of operation 102 in FIG. 1 and is not repeated here.
- 404, performing fine-tuning training on the neural network model obtained by the pruning process using the sample data set to obtain a target neural network model. The description of operation 404 may refer to the description of operation 103 in FIG. 1 and is not repeated here.
- Figure 5 is a schematic diagram of an application scenario of the structured pruning method based on local sparse constraints provided by the present application, as shown in Figure 5, for the input deep neural network model, after step 1 pre-training, step 2 Sparse training with limited scope, step 3 pruning processing and step 4 fine-tuning training can output lightweight and small-scale neural network models. Step 1, Step 2, Step 3 and Step 4 are described in detail below.
- Step 1 For an initialized deep neural network model, pre-train the initialized deep neural network model through the image dataset to obtain a converged deep neural network model.
- Step 2: Obtain the mask M set according to the global pruning rate for each output channel of each layer of the deep neural network model, modify the sparse training objective function, and, according to the modified objective function, perform scope-limited L1-regularized sparse training on the deep neural network model using the image dataset.
- during training, all hyperparameters are exactly the same as in step 1. The sparsity factor controls the strength of the Lasso constraint; generally it can be set to the order of magnitude of the weight-parameter gradients observed during training.
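- as a sketch of this heuristic, one could probe the weight gradients on a single batch before the sparse training starts and pick a sparsity factor of the same order of magnitude. The probing procedure below is an illustrative assumption, not a step mandated by the method.

```python
import math
import torch

def estimate_sparsity_factor(model, data_loader, loss_fn) -> float:
    """Return a value on the order of magnitude of the mean absolute weight gradient,
    as a starting point for the strength of the Lasso constraint."""
    inputs, targets = next(iter(data_loader))
    model.zero_grad()
    loss_fn(model(inputs), targets).backward()
    grads = torch.cat([p.grad.abs().flatten()
                       for p in model.parameters() if p.grad is not None])
    mean_grad = grads.mean().item()
    return 10.0 ** round(math.log10(mean_grad))   # e.g. 1e-4 if gradients are ~3e-4
```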
- Step 3: Use the mask M to prune the weight-sparse deep neural network model obtained in step 2.
- when the value corresponding to an output channel in the mask M is 1, all parameters and network connections corresponding to that output channel are removed from the weight-sparse deep neural network model; when the value corresponding to an output channel in the mask M is 0, no operation is performed on the parameters and network connections of that output channel.
- Step 4: After obtaining the pruned neural network model, perform fine-tuning training on it using the image dataset. During training, all hyperparameters are exactly the same as in step 1, except that the learning rate is adjusted to one hundredth of its original value.
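- the fine-tuning step can be sketched as follows; the optimizer type, momentum and epoch count are placeholders, since step 4 only specifies that all hyperparameters match step 1 except that the learning rate is divided by one hundred.

```python
import torch

def finetune(pruned_model, train_loader, loss_fn, base_lr: float, epochs: int = 10):
    """Fine-tune the pruned model with the pre-training hyperparameters, except that
    the learning rate is reduced to one hundredth of the original value."""
    optimizer = torch.optim.SGD(pruned_model.parameters(), lr=base_lr / 100.0, momentum=0.9)
    for _ in range(epochs):
        for inputs, targets in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(pruned_model(inputs), targets)   # plain task loss, no Lasso term
            loss.backward()
            optimizer.step()
    return pruned_model
```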
- the method provided by the embodiments of this application, the most commonly used direct pruning method L1-Norm, and the global sparse pruning method Network Slimming were each used to prune the ResNet-50 neural network model on the public ImageNet dataset. The accuracy and computational complexity of the pruned neural network models are shown in Table 1, where accuracy is measured by ACC (higher is better) and computational complexity by FLOPS. With the same reduction in FLOPS, the method provided by the embodiments of this application significantly improves the accuracy of the pruned model.
- the structured pruning device based on local sparse constraints provided by this application is described below; the structured pruning device based on local sparse constraints described below and the structured pruning method based on local sparse constraints described above may be referred to in correspondence with each other.
- Figure 6 is a schematic diagram of the composition and structure of the structured pruning device based on local sparse constraints provided by the present application. The structured pruning device based on local sparse constraints shown in Figure 6 can be used to execute the structured pruning method based on local sparse constraints of Figure 1. As shown in Figure 6, the structured pruning device based on local sparse constraints at least includes:
- the sparse training module 610, configured to perform scope-limited sparse training on the neural network model using the sample data set based on the preset mask, to obtain a neural network model with sparse weights; wherein the mask is preset based on the pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model.
- the pruning processing module 620, configured to prune the parameters and network connections of the output channels in the weight-sparse neural network model based on the preset mask.
- the fine-tuning training module 630, configured to perform fine-tuning training on the neural network model obtained by the pruning process using the sample data set, to obtain a target neural network model.
- the sparse training module 610 includes:
- the mask acquisition unit is configured to acquire a preset mask; wherein, the mask is preset based on the pruning rate, the number of layers of the neural network model, and the number of output channels of each layer.
- the objective function modification unit is configured to modify the L1 regularization item in the objective function of the sparse training based on the obtained mask.
- the sparse training unit is used to perform sparse training on the neural network model through the sample data set based on the modified objective function, so as to obtain a neural network model with sparse weights.
- the mask is a set of vectors consisting of 0 and 1, each vector corresponds to a layer of the neural network model, and the number of elements contained in each vector is the number of output channels of the corresponding layer of the neural network model, and each vector contains The number of 0s and 1s in is determined by the pruning rate and the number of output channels of the corresponding layer of the neural network model.
- optionally, if 1 indicates that an output channel is constrained by the L1 regularization term, the number of 1s in each vector is determined by the product of the pruning rate and the number of output channels of the corresponding layer of the neural network model; if 0 indicates that an output channel is not constrained by the L1 regularization term, the number of 0s in each vector is determined by the difference between the number of output channels of the corresponding layer and the number of 1s in the vector;
- the pruning processing module 620 includes:
- a mask judging unit is used to judge whether the value of the corresponding output channel in the mask is 1 for each output channel of each layer of the neural network model with sparse weights;
- the pruning processing unit, configured to, according to the judgment result of the mask judging unit, remove the parameters and network connections corresponding to the output channel from the weight-sparse neural network model if the value corresponding to the output channel in the mask is 1.
- the objective function of the sparse training further includes a pre-training objective function item, and the neural network model is obtained through pre-training.
- optionally, before the sparse training module 610, the device includes:
- a pre-training module, configured to pre-train the initialized neural network model using the sample data set to obtain the neural network model.
- FIG. 7 illustrates a schematic diagram of the physical structure of an electronic device.
- the electronic device may include a processor (processor) 710, a communications interface (Communications Interface) 720, a memory (memory) 730 and a communication bus 740, where the processor 710, the communications interface 720 and the memory 730 communicate with each other through the communication bus 740.
- the processor 710 can invoke logic instructions in the memory 730 to perform a structured pruning method based on local sparse constraints, the method comprising:
- based on a preset mask, performing scope-limited sparse training on a neural network model using a sample data set to obtain a neural network model with sparse weights; wherein the mask is preset based on a pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model;
- based on the preset mask, pruning the parameters and network connections of the output channels in the weight-sparse neural network model;
- performing fine-tuning training on the neural network model obtained by the pruning process using the sample data set to obtain a target neural network model.
- the above-mentioned logic instructions in the memory 730 may be implemented in the form of software functional units and when sold or used as an independent product, may be stored in a computer-readable storage medium.
- the technical solution of the present application is essentially or the part that contributes to the prior art or the part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium, including Several instructions are used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
- the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
- the present application also provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium, and the computer program includes program instructions; when the program instructions are executed by a computer, the computer can execute the structured pruning method based on local sparse constraints provided by the above method embodiments, the method including:
- based on a preset mask, performing scope-limited sparse training on a neural network model using a sample data set to obtain a neural network model with sparse weights; wherein the mask is preset based on a pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model;
- based on the preset mask, pruning the parameters and network connections of the output channels in the weight-sparse neural network model;
- performing fine-tuning training on the neural network model obtained by the pruning process using the sample data set to obtain a target neural network model.
- the present application also provides a non-transitory computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, it implements the structured pruning method based on local sparse constraints provided by the above method embodiments, the method including:
- based on a preset mask, performing scope-limited sparse training on a neural network model using a sample data set to obtain a neural network model with sparse weights; wherein the mask is preset based on a pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model;
- based on the preset mask, pruning the parameters and network connections of the output channels in the weight-sparse neural network model;
- performing fine-tuning training on the neural network model obtained by the pruning process using the sample data set to obtain a target neural network model.
- the device embodiments described above are only illustrative; the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment, which can be understood and implemented by those skilled in the art without creative effort.
- through the above description of the implementations, those skilled in the art can clearly understand that each implementation can be realized by means of software plus a necessary general-purpose hardware platform, or of course by hardware.
- the essence of the above technical solution or the part that contributes to the prior art can be embodied in the form of software products, and the computer software products can be stored in computer-readable storage media, such as ROM/RAM, magnetic discs, optical discs, etc., including several instructions to make a computer device (which may be a personal computer, server, or network device, etc.) execute the methods described in various embodiments or some parts of the embodiments.
Abstract
This application provides a structured pruning method and device based on local sparse constraints. The method includes: based on a preset mask, performing scope-limited sparse training on a neural network model using a sample data set to obtain a neural network model with sparse weights, wherein the mask is preset based on a pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model; based on the preset mask, pruning the parameters and network connections of the output channels in the weight-sparse neural network model; and performing fine-tuning training on the neural network model obtained by the pruning process using the sample data set to obtain a target neural network model. This application can ensure that the neural network model obtained by structured pruning performs better than neural network models obtained by existing pruning methods.
Description
This application claims priority to the Chinese patent application filed with the China Patent Office on December 3, 2021, with application number CN202111475019.8 and entitled "Structured pruning method and device based on local sparse constraints", the entire contents of which are incorporated herein by reference.
This application relates to the field of computer technology, and in particular to a structured pruning method and device based on local sparse constraints.
With the development of computer technology, various deep learning techniques are widely used in fields such as computer vision and natural language processing. Although deep neural network models perform excellently on such tasks, their large parameter counts and high demands on computing power and storage make them difficult to deploy on terminal devices. Model pruning provides an effective way to solve this problem. Model pruning generally trains a larger neural network model to fit a large amount of data; after training, unimportant weights or channels are removed so that the neural network model retains its superior performance while its parameter count is reduced and its forward propagation is accelerated. Structured pruning, a form of model pruning, prunes at the granularity of convolution kernels; the pruned neural network model keeps a conventional convolutional network structure and can be deployed and run for forward inference without specific inference libraries or hardware support.
The currently common structured pruning method performs L1-regularized sparse training on the neural network model before pruning, then removes the parameters and network connections corresponding to the sparsified channels from the neural network model, and finally fine-tunes the pruned neural network model to restore its accuracy. By constraining the neural network model to a sparse state through L1-regularized sparse training, this method can reduce the impact of the pruning operation on the performance of the neural network model.
However, this common structured pruning method still has the following problem in practice: L1 regularization makes the neural network model sparse, but it also constrains the channels that are to be retained. The sparse training of the common structured pruning method therefore limits the expressive ability of the retained channels, which affects the convergence of the pruned neural network model and, in turn, its accuracy.
Summary of the application
This application provides a structured pruning method and device based on local sparse constraints, which are used to overcome the defect in the prior art that sparse training limits the expressive ability of the retained channels and thereby affects the convergence of the pruned neural network model. By limiting the scope of the sparse training, the convergence of the neural network model obtained by structured pruning can be guaranteed.
In a first aspect, this application provides a structured pruning method based on local sparse constraints, including:
based on a preset mask, performing scope-limited sparse training on a neural network model using a sample data set to obtain a neural network model with sparse weights; wherein the mask is preset based on a pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model;
based on the preset mask, pruning the parameters and network connections of the output channels in the weight-sparse neural network model;
performing fine-tuning training on the neural network model obtained by the pruning process using the sample data set to obtain a target neural network model.
According to the structured pruning method based on local sparse constraints provided by this application, performing the scope-limited sparse training on the neural network model using the sample data set based on the preset mask to obtain the weight-sparse neural network model includes:
obtaining the preset mask; wherein the mask is preset based on the pruning rate, the number of layers of the neural network model, and the number of output channels of each layer;
based on the obtained mask, modifying the L1 regularization term in the objective function of the sparse training;
based on the modified objective function, performing sparse training on the neural network model using the sample data set to obtain the weight-sparse neural network model.
According to the structured pruning method based on local sparse constraints provided by this application, the mask is a set of vectors consisting of 0s and 1s; each vector corresponds to one layer of the neural network model, the number of elements in each vector equals the number of output channels of the corresponding layer, and the numbers of 0s and 1s in each vector are determined by the pruning rate and the number of output channels of the corresponding layer of the neural network model.
According to the structured pruning method based on local sparse constraints provided by this application, if 1 indicates that an output channel is constrained by the L1 regularization term, the number of 1s in each vector is determined by the product of the pruning rate and the number of output channels of the corresponding layer of the neural network model; if 0 indicates that an output channel is not constrained by the L1 regularization term, the number of 0s in each vector is determined by the difference between the number of output channels of the corresponding layer and the number of 1s in the vector;
pruning the parameters and network connections of the output channels in the weight-sparse neural network model based on the preset mask includes:
for each output channel of each layer of the weight-sparse neural network model, judging whether the value corresponding to the output channel in the mask is 1;
if the value corresponding to the output channel in the mask is 1, removing the parameters and network connections corresponding to the output channel from the weight-sparse neural network model.
According to the structured pruning method based on local sparse constraints provided by this application, the objective function of the sparse training further includes a pre-training objective term, and the neural network model is obtained through the pre-training.
According to the structured pruning method based on local sparse constraints provided by this application, before the scope-limited sparse training is performed on the neural network model using the sample data set based on the preset mask to obtain the weight-sparse neural network model, the method includes:
pre-training an initialized neural network model using the sample data set to obtain the neural network model.
In a second aspect, this application also provides a structured pruning device based on local sparse constraints, including:
a sparse training module, configured to perform scope-limited sparse training on a neural network model using a sample data set based on a preset mask, to obtain a neural network model with sparse weights; wherein the mask is preset based on a pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model;
a pruning processing module, configured to prune the parameters and network connections of the output channels in the weight-sparse neural network model based on the preset mask;
a fine-tuning training module, configured to perform fine-tuning training on the neural network model obtained by the pruning process using the sample data set, to obtain a target neural network model.
In a third aspect, this application also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of any of the structured pruning methods based on local sparse constraints described above.
In a fourth aspect, this application also provides a non-transitory computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, it implements the steps of any of the structured pruning methods based on local sparse constraints described above.
In a fifth aspect, this application also provides a computer program product on which a computer program is stored; when the computer program is executed by a processor, it implements the steps of any of the structured pruning methods based on local sparse constraints described above.
The structured pruning method and device based on local sparse constraints provided by this application use the mask so that, when the neural network model is structurally pruned, sparse training is applied only to the channels that the pruning process needs to remove and not to the channels that do not need to be removed. Limiting the scope of the sparse training in this way means that sparse training does not restrict the expressive ability of the channels retained by pruning, so structured pruning does not affect the convergence of the neural network model. This guarantees the convergence, and in turn the accuracy, of the neural network model obtained by structured pruning, so that it performs better than neural network models obtained by existing pruning methods.
In order to explain the technical solutions of this application or of the prior art more clearly, the drawings required in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of this application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a schematic flow diagram of the structured pruning method based on local sparse constraints provided by this application;
Figure 2 is a schematic flow diagram of the scope-limited sparse training of the neural network model provided by this application;
Figure 3 is a schematic flow diagram of the pruning of the weight-sparse neural network model provided by this application;
Figure 4 is a schematic flow diagram of another structured pruning method based on local sparse constraints provided by this application;
Figure 5 is a schematic diagram of an application scenario of the structured pruning method based on local sparse constraints provided by this application;
Figure 6 is a schematic diagram of the composition and structure of the structured pruning device based on local sparse constraints provided by this application;
Figure 7 is a schematic diagram of the composition and structure of the electronic device provided by this application.
In order to make the purpose, technical solutions and advantages of this application clearer, the technical solutions in this application are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are part of the embodiments of this application rather than all of them. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of this application.
The structured pruning method based on local sparse constraints of this application is described below with reference to Figures 1 to 5.
Referring to Figure 1, Figure 1 is a schematic flow diagram of the structured pruning method based on local sparse constraints provided by this application. The method shown in Figure 1 can be executed by a structured pruning device based on local sparse constraints, which can be set on a client or a server; for example, the client can be a smart phone, a notebook computer, a vehicle-mounted computer, a robot, a wearable device, etc., and the server can be a physical server containing an independent host, a virtual server carried by a host cluster, a cloud server, etc., which is not limited in the embodiments of this application. As shown in Figure 1, the structured pruning method based on local sparse constraints at least includes:
101, based on a preset mask, perform scope-limited sparse training on the neural network model using a sample data set to obtain a neural network model with sparse weights; wherein the mask is preset based on the pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model.
In the embodiment of this application, the neural network model subjected to structured pruning with local sparse constraints may be a neural network model for computer vision processing or a neural network model for natural language processing, among others; the embodiments do not limit its application field. The neural network model may be one obtained after conventional training of a neural network model, for example a convolutional neural network (CNN), a deep neural network (DNN) or a recurrent neural network (RNN); the embodiments of this application do not limit its type. Depending on the application scenario of the neural network model, the sample data set may be an image data set, a text data set, a voice data set, etc., and its type is not limited by the embodiments of this application. The sample data set may be an existing data set or may be obtained by collecting sample data; the embodiments of this application do not limit how it is acquired.
In the embodiment of this application, before structured pruning with local sparse constraints is performed on the neural network model, a mask can be preset according to the pruning rate of the neural network model; the mask specifies the channels or convolution kernels of the neural network model on which sparse training is performed. When the neural network model is sparsely trained according to the mask, sparse training is applied only to the channels or convolution kernels specified by the mask, while the channels or convolution kernels not specified by the mask are not sparsely trained.
In the embodiment of this application, the pruning rate may be a global pruning rate or a pruning rate set separately for each layer of the neural network model, which is not limited in the embodiments of this application. The mask can be a string of binary codes consisting of 0s and 1s; a string of binary codes can be set for each layer of the neural network model, each bit corresponding to one convolution kernel or output channel, and by operating the mask on the convolution kernels of the neural network model the output channels of the neural network model can be selected. The sparse training of the neural network model on the sample data set can be realized with an existing sparse training method, for example a supervised L1-regularized sparse training method, which is not limited in the embodiments of this application.
102, based on the preset mask, prune the parameters and network connections of the output channels in the weight-sparse neural network model.
In the embodiment of this application, after the weight-sparse neural network model is obtained through sparse training, it can be pruned according to the preset mask: the parameters corresponding to the channels or convolution kernels specified by the mask are removed from the weight-sparse neural network model, together with the network connections of those channels, that is, the parameters of the corresponding input channels of the next layer of the neural network model; the parameters corresponding to the channels or convolution kernels not specified by the mask are retained in the weight-sparse neural network model.
103, perform fine-tuning training on the neural network model obtained by the pruning process using the sample data set to obtain a target neural network model.
In the embodiment of this application, after the weight-sparse neural network model has been pruned, fine-tuning training is performed on the pruned neural network model using the sample data set to restore its accuracy, yielding a lightweight, small-scale target neural network model. The fine-tuning training of the pruned neural network model on the sample data set can be realized with an existing fine-tuning training method, for example a supervised fine-tuning method, which is not limited in the embodiments of this application.
The structured pruning method based on local sparse constraints provided by the embodiments of this application uses the mask so that, when the neural network model is structurally pruned, sparse training is applied only to the channels that the pruning process needs to remove and not to the channels that do not need to be removed. Limiting the scope of the sparse training in this way means that sparse training does not restrict the expressive ability of the channels retained by pruning, so structured pruning does not affect the convergence of the neural network model. This guarantees the convergence, and in turn the accuracy, of the neural network model obtained by structured pruning, so that it performs better than neural network models obtained by existing pruning methods.
Referring to Figure 2, Figure 2 is a schematic flow diagram of the scope-limited sparse training of the neural network model provided by this application. As shown in Figure 2, performing scope-limited sparse training on the neural network model using the sample data set based on the preset mask to obtain the weight-sparse neural network model at least includes:
201, obtain the preset mask; wherein the mask is preset based on the pruning rate, the number of layers of the neural network model, and the number of output channels of each layer.
In the embodiment of this application, the mask of the neural network model can be preset according to the pruning rate of the neural network model, the number of layers of the neural network model to be structurally pruned, and the number of output channels of each layer. Optionally, the mask can be a set of vectors consisting of 0s and 1s; each vector corresponds to one layer of the neural network model, the number of elements in each vector equals the number of output channels of the corresponding layer, and the numbers of 0s and 1s in each vector can be determined by the pruning rate and the number of output channels of the corresponding layer. The embodiments of this application do not limit the implementation form of the mask.
For example, if the given pruning rate of the neural network model is α%, and the convolution-kernel parameter set of the neural network model to be structurally pruned is W = {W_1, W_2, …, W_L}, where L is a positive integer equal to the number of layers of the deep neural network model, W_l is the convolution-kernel parameter of the l-th layer with l = 1, 2, …, L, o_l is the number of input channels of the l-th layer, n_l is the number of output channels of the l-th layer, and k_l is the size of the convolution kernel, then the vector of the mask M corresponding to the l-th layer of the neural network model is M_l, whose length is n_l; M_l may consist of α% × n_l ones and n_l − α% × n_l zeros, i.e., M_l = {1, 1, …, 0, …, 0}.
202, based on the obtained mask, modify the L1 regularization term in the objective function of the sparse training.
In the embodiment of this application, an L1-regularized sparse training method is used to sparsely train the neural network model; after the preset mask is obtained, it can be used to modify the L1 regularization term in the objective function of the sparse training.
In some optional examples, if 1 indicates that an output channel is constrained by the L1 regularization term, the number of 1s in each vector is determined by the product of the pruning rate and the number of output channels of the corresponding layer of the neural network model; if 0 indicates that an output channel is not constrained by the L1 regularization term, the number of 0s in each vector is determined by the difference between the number of output channels of the corresponding layer and the number of 1s in the vector.
In other optional examples, if 0 indicates that an output channel is constrained by the L1 regularization term, the number of 0s in each vector is determined by the product of the pruning rate and the number of output channels of the corresponding layer; if 1 indicates that an output channel is not constrained by the L1 regularization term, the number of 1s in each vector is determined by the difference between the number of output channels of the corresponding layer and the number of 0s in the vector.
For example, the modified objective function of the sparse training is Loss = loss_pretrain + Lasso(W)', where loss_pretrain is the pre-training objective term, obtained by pre-training the neural network model before the structured pruning is performed, and Lasso(W)' is the modified L1 regularization term, i.e., the scope-limited Lasso constraint function, with λ as the sparsity factor. The mask value for the i-th output channel of the l-th layer is 0 or 1: a value of 0 means that the i-th output channel of the l-th layer is not subject to the Lasso constraint, and a value of 1 means that it is subject to the Lasso constraint.
203, based on the modified objective function, perform sparse training on the neural network model using the sample data set to obtain the weight-sparse neural network model.
In the embodiment of this application, after the L1 regularization term in the objective function of the sparse training has been modified by the mask, L1 sparse training along the output-channel dimension can be performed on the convolution kernels of the neural network model using the sample data set according to the modified objective function. The mask limits the scope of the L1 sparse training, so that it is applied only to the channels of the neural network model that need to be removed, while the channels that do not need to be removed are not L1-sparse-trained. For example, the modified objective function includes both the L1 regularization term and the pre-training objective term; when the neural network model is trained with the modified objective function, the L1 regularization term and the pre-training objective term act together on the channels that need to be removed, while only the pre-training objective term acts on the channels that do not need to be removed.
Referring to Figure 3, Figure 3 is a schematic flow diagram of the pruning of the weight-sparse neural network model provided by this application. As shown in Figure 3, pruning the parameters and network connections of the output channels in the weight-sparse neural network model based on the preset mask at least includes:
301, for each output channel of each layer of the weight-sparse neural network model, judge whether the value corresponding to the output channel in the mask is 1.
If the value corresponding to the output channel in the mask is 1, execute 302; if the value corresponding to the output channel in the mask is 0, the parameters and network connections corresponding to that output channel in the weight-sparse neural network model are not operated on.
302, remove the parameters and network connections corresponding to the output channel from the weight-sparse neural network model.
In the embodiment of this application, since the weight-sparse neural network model is obtained by performing scope-limited sparse training on the neural network model according to the mask, when pruning the weight-sparse neural network model, the parameters and network connections of the output channels to be removed can be determined according to the value in the mask of each output channel of each layer of the weight-sparse neural network model.
In some optional examples, for each output channel of each layer of the weight-sparse neural network model, it can be judged whether the value corresponding to the current output channel in the mask is 1; if the value is 1, the parameters and network connections corresponding to the current output channel are removed from the weight-sparse neural network model; if the value is 0, the parameters and network connections corresponding to the current output channel in the weight-sparse neural network model are not operated on.
In other optional examples, for each output channel of each layer of the weight-sparse neural network model, it can be judged whether the value corresponding to the current output channel in the mask is 0; if the value is 0, the parameters and network connections corresponding to the current output channel are removed from the weight-sparse neural network model; if the value is 1, the parameters and network connections corresponding to the current output channel in the weight-sparse neural network model are not operated on.
Referring to Figure 4, Figure 4 is a schematic flow diagram of another structured pruning method based on local sparse constraints provided by this application. As shown in Figure 4, this structured pruning method based on local sparse constraints at least includes:
401, pre-train an initialized neural network model using the sample data set to obtain the neural network model.
In the embodiment of this application, before the scope-limited sparse training is performed on the neural network model, the initialized neural network model can be pre-trained using the sample data set, and the neural network model obtained from the pre-training is then used for the scope-limited sparse training. The initialized neural network model is a neural network model whose initial parameters have been set after the model is constructed. The pre-training of the initialized neural network model can be conventional neural network model training, such as supervised training, and the neural network model obtained after pre-training is a converged neural network model. By inputting an initialized neural network model and performing pre-training, scope-limited sparse training, pruning and fine-tuning training, a lightweight, small-scale neural network model that converges to good results can be output.
402, based on the preset mask, perform scope-limited sparse training on the neural network model using the sample data set to obtain a neural network model with sparse weights; wherein the mask is preset based on the pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model.
In the embodiment of this application, the description of operation 402 may refer to the description of operation 101 in Figure 1 and is not repeated here.
403, based on the preset mask, prune the parameters and network connections of the output channels in the weight-sparse neural network model.
In the embodiment of this application, the description of operation 403 may refer to the description of operation 102 in Figure 1 and is not repeated here.
404, perform fine-tuning training on the neural network model obtained by the pruning process using the sample data set to obtain a target neural network model.
In the embodiment of this application, the description of operation 404 may refer to the description of operation 103 in Figure 1 and is not repeated here.
Referring to Figure 5, Figure 5 is a schematic diagram of an application scenario of the structured pruning method based on local sparse constraints provided by this application. As shown in Figure 5, for an input deep neural network model, after pre-training (step 1), scope-limited sparse training (step 2), pruning (step 3) and fine-tuning training (step 4), a lightweight, small-scale neural network model can be output. Steps 1 to 4 are described in detail below.
Step 1: For an initialized deep neural network model, pre-train the initialized deep neural network model on the image dataset to obtain a converged deep neural network model.
Step 2: Obtain the mask M set according to the global pruning rate for each output channel of each layer of the deep neural network model, modify the sparse training objective function, and, according to the modified objective function, perform scope-limited L1-regularized sparse training on the deep neural network model using the image dataset. During training, all hyperparameters are exactly the same as in step 1; the sparsity factor controls the strength of the Lasso constraint and can generally be set to the order of magnitude of the weight-parameter gradients observed during training.
Step 3: Use the mask M to prune the weight-sparse deep neural network model obtained in step 2. When the value corresponding to an output channel in the mask M is 1, all parameters and network connections corresponding to that output channel are removed from the weight-sparse deep neural network model; when the value corresponding to an output channel in the mask M is 0, no operation is performed on the parameters and network connections of that output channel.
Step 4: After obtaining the pruned neural network model, perform fine-tuning training on it using the image dataset. During training, all hyperparameters are exactly the same as in step 1, except that the learning rate is adjusted to one hundredth of its original value.
The method provided by the embodiments of this application, the most commonly used direct pruning method L1-Norm, and the global sparse pruning method Network Slimming were each used to prune the ResNet-50 neural network model on the public ImageNet dataset. The accuracy and computational complexity of the pruned neural network models are shown in Table 1, where accuracy is measured by ACC (higher is better) and computational complexity by FLOPS.
Table 1
As can be seen from Table 1, with the same reduction in FLOPS, the method provided by the embodiments of this application significantly improves the accuracy of the pruned model.
The structured pruning device based on local sparse constraints provided by this application is described below; the structured pruning device based on local sparse constraints described below and the structured pruning method based on local sparse constraints described above may be referred to in correspondence with each other.
Referring to Figure 6, Figure 6 is a schematic diagram of the composition and structure of the structured pruning device based on local sparse constraints provided by this application. The structured pruning device based on local sparse constraints shown in Figure 6 can be used to execute the structured pruning method based on local sparse constraints of Figure 1. As shown in Figure 6, the structured pruning device based on local sparse constraints at least includes:
A sparse training module 610, configured to perform scope-limited sparse training on the neural network model using the sample data set based on the preset mask, to obtain a neural network model with sparse weights; wherein the mask is preset based on the pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model.
A pruning processing module 620, configured to prune the parameters and network connections of the output channels in the weight-sparse neural network model based on the preset mask.
A fine-tuning training module 630, configured to perform fine-tuning training on the neural network model obtained by the pruning process using the sample data set, to obtain a target neural network model.
Optionally, the sparse training module 610 includes:
a mask acquisition unit, configured to acquire the preset mask; wherein the mask is preset based on the pruning rate, the number of layers of the neural network model, and the number of output channels of each layer;
an objective function modification unit, configured to modify the L1 regularization term in the objective function of the sparse training based on the obtained mask;
a sparse training unit, configured to perform sparse training on the neural network model using the sample data set based on the modified objective function, to obtain the weight-sparse neural network model.
Optionally, the mask is a set of vectors consisting of 0s and 1s; each vector corresponds to one layer of the neural network model, the number of elements in each vector equals the number of output channels of the corresponding layer, and the numbers of 0s and 1s in each vector are determined by the pruning rate and the number of output channels of the corresponding layer of the neural network model.
Optionally, if 1 indicates that an output channel is constrained by the L1 regularization term, the number of 1s in each vector is determined by the product of the pruning rate and the number of output channels of the corresponding layer of the neural network model; if 0 indicates that an output channel is not constrained by the L1 regularization term, the number of 0s in each vector is determined by the difference between the number of output channels of the corresponding layer and the number of 1s in the vector.
The pruning processing module 620 includes:
a mask judging unit, configured to judge, for each output channel of each layer of the weight-sparse neural network model, whether the value corresponding to the output channel in the mask is 1;
a pruning processing unit, configured to, according to the judgment result of the mask judging unit, remove the parameters and network connections corresponding to the output channel from the weight-sparse neural network model if the value corresponding to the output channel in the mask is 1.
Optionally, the objective function of the sparse training further includes a pre-training objective term, and the neural network model is obtained through the pre-training.
Optionally, before the sparse training module 610, the device includes:
a pre-training module, configured to pre-train the initialized neural network model using the sample data set to obtain the neural network model.
Figure 7 illustrates a schematic diagram of the physical structure of an electronic device. As shown in Figure 7, the electronic device may include a processor 710, a communications interface 720, a memory 730 and a communication bus 740, where the processor 710, the communications interface 720 and the memory 730 communicate with each other through the communication bus 740. The processor 710 can invoke logic instructions in the memory 730 to execute the structured pruning method based on local sparse constraints, the method including:
based on a preset mask, performing scope-limited sparse training on a neural network model using a sample data set to obtain a neural network model with sparse weights; wherein the mask is preset based on a pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model;
based on the preset mask, pruning the parameters and network connections of the output channels in the weight-sparse neural network model;
performing fine-tuning training on the neural network model obtained by the pruning process using the sample data set to obtain a target neural network model.
In addition, the logic instructions in the memory 730 may be implemented in the form of software functional units and, when sold or used as an independent product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
In another aspect, this application also provides a computer program product. The computer program product includes a computer program stored on a non-transitory computer-readable storage medium, and the computer program includes program instructions; when the program instructions are executed by a computer, the computer can execute the structured pruning method based on local sparse constraints provided by the above method embodiments, the method including:
based on a preset mask, performing scope-limited sparse training on a neural network model using a sample data set to obtain a neural network model with sparse weights; wherein the mask is preset based on a pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model;
based on the preset mask, pruning the parameters and network connections of the output channels in the weight-sparse neural network model;
performing fine-tuning training on the neural network model obtained by the pruning process using the sample data set to obtain a target neural network model.
In yet another aspect, this application also provides a non-transitory computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, it implements the structured pruning method based on local sparse constraints provided by the above method embodiments, the method including:
based on a preset mask, performing scope-limited sparse training on a neural network model using a sample data set to obtain a neural network model with sparse weights; wherein the mask is preset based on a pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model;
based on the preset mask, pruning the parameters and network connections of the output channels in the weight-sparse neural network model;
performing fine-tuning training on the neural network model obtained by the pruning process using the sample data set to obtain a target neural network model.
The device embodiments described above are only illustrative; the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over multiple network elements. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment, which can be understood and implemented by those skilled in the art without creative effort.
Through the above description of the implementations, those skilled in the art can clearly understand that each implementation can be realized by means of software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the above technical solution, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product; the computer software product can be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application and not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions recorded in the foregoing embodiments or replace some of the technical features with equivalents; such modifications and replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.
Claims (10)
- A structured pruning method based on local sparse constraints, characterized by including: based on a preset mask, performing scope-limited sparse training on a neural network model using a sample data set to obtain a neural network model with sparse weights, wherein the mask is preset based on a pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model; based on the preset mask, pruning the parameters and network connections of the output channels in the weight-sparse neural network model; and performing fine-tuning training on the neural network model obtained by the pruning process using the sample data set to obtain a target neural network model.
- The structured pruning method based on local sparse constraints according to claim 1, characterized in that performing the scope-limited sparse training on the neural network model using the sample data set based on the preset mask to obtain the weight-sparse neural network model includes: obtaining the preset mask, wherein the mask is preset based on the pruning rate, the number of layers of the neural network model, and the number of output channels of each layer; modifying the L1 regularization term in the objective function of the sparse training based on the obtained mask; and performing sparse training on the neural network model using the sample data set based on the modified objective function, to obtain the weight-sparse neural network model.
- The structured pruning method based on local sparse constraints according to claim 2, characterized in that the mask is a set of vectors consisting of 0s and 1s, each vector corresponds to one layer of the neural network model, the number of elements in each vector equals the number of output channels of the corresponding layer of the neural network model, and the numbers of 0s and 1s in each vector are determined by the pruning rate and the number of output channels of the corresponding layer of the neural network model.
- The structured pruning method based on local sparse constraints according to claim 3, characterized in that, if 1 indicates that an output channel is constrained by the L1 regularization term, the number of 1s in each vector is determined by the product of the pruning rate and the number of output channels of the corresponding layer of the neural network model; if 0 indicates that an output channel is not constrained by the L1 regularization term, the number of 0s in each vector is determined by the difference between the number of output channels of the corresponding layer of the neural network model and the number of 1s in the vector; and pruning the parameters and network connections of the output channels in the weight-sparse neural network model based on the preset mask includes: for each output channel of each layer of the weight-sparse neural network model, judging whether the value corresponding to the output channel in the mask is 1; and if the value corresponding to the output channel in the mask is 1, removing the parameters and network connections corresponding to the output channel from the weight-sparse neural network model.
- The structured pruning method based on local sparse constraints according to any one of claims 1 to 4, characterized in that the objective function of the sparse training further includes a pre-training objective term, and the neural network model is obtained through the pre-training.
- The structured pruning method based on local sparse constraints according to claim 5, characterized in that, before performing the scope-limited sparse training on the neural network model using the sample data set based on the preset mask to obtain the weight-sparse neural network model, the method includes: pre-training an initialized neural network model using the sample data set to obtain the neural network model.
- A structured pruning device based on local sparse constraints, characterized by including: a sparse training module, configured to perform scope-limited sparse training on a neural network model using a sample data set based on a preset mask, to obtain a neural network model with sparse weights, wherein the mask is preset based on a pruning rate and is used to specify the channels on which the sparse training is performed in the neural network model; a pruning processing module, configured to prune the parameters and network connections of the output channels in the weight-sparse neural network model based on the preset mask; and a fine-tuning training module, configured to perform fine-tuning training on the neural network model obtained by the pruning process using the sample data set, to obtain a target neural network model.
- An electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the structured pruning method based on local sparse constraints according to any one of claims 1 to 6.
- A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the structured pruning method based on local sparse constraints according to any one of claims 1 to 6.
- A computer program product on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the structured pruning method based on local sparse constraints according to any one of claims 1 to 6.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111475019.8A CN114282666A (zh) | 2021-12-03 | 2021-12-03 | 基于局部稀疏约束的结构化剪枝方法和装置 |
CN202111475019.8 | 2021-12-03 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023098544A1 true WO2023098544A1 (zh) | 2023-06-08 |
Family
ID=80870874
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/133849 WO2023098544A1 (zh) | 2021-12-03 | 2022-11-24 | 基于局部稀疏约束的结构化剪枝方法和装置 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114282666A (zh) |
WO (1) | WO2023098544A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116912637A (zh) * | 2023-09-13 | 2023-10-20 | 国网山东省电力公司济南供电公司 | 输变电缺陷识别的方法、装置、计算机设备和存储介质 |
CN116992945A (zh) * | 2023-09-27 | 2023-11-03 | 之江实验室 | 一种基于贪心策略反向通道剪枝的图像处理方法及装置 |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114282666A (zh) * | 2021-12-03 | 2022-04-05 | 中科视语(北京)科技有限公司 | 基于局部稀疏约束的结构化剪枝方法和装置 |
TWI833209B (zh) | 2022-04-27 | 2024-02-21 | 緯創資通股份有限公司 | 用於神經網路的優化方法、電腦系統及電腦可讀取媒體 |
CN115829024B (zh) * | 2023-02-14 | 2023-06-20 | 山东浪潮科学研究院有限公司 | 一种模型训练方法、装置、设备及存储介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111667068A (zh) * | 2020-06-02 | 2020-09-15 | 清华大学 | 一种基于掩码的深度图卷积神经网络模型剪枝方法与系统 |
CN111738435A (zh) * | 2020-06-22 | 2020-10-02 | 上海交通大学 | 一种基于移动设备的在线稀疏训练方法及系统 |
CN112396179A (zh) * | 2020-11-20 | 2021-02-23 | 浙江工业大学 | 一种基于通道梯度剪枝的柔性深度学习网络模型压缩方法 |
CN114282666A (zh) * | 2021-12-03 | 2022-04-05 | 中科视语(北京)科技有限公司 | 基于局部稀疏约束的结构化剪枝方法和装置 |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111931914A (zh) * | 2020-08-10 | 2020-11-13 | 北京计算机技术及应用研究所 | 一种基于模型微调的卷积神经网络通道剪枝方法 |
CN112508190A (zh) * | 2020-12-10 | 2021-03-16 | 上海燧原科技有限公司 | 结构化稀疏参数的处理方法、装置、设备及存储介质 |
CN113128355A (zh) * | 2021-03-29 | 2021-07-16 | 南京航空航天大学 | 一种基于通道剪枝的无人机图像实时目标检测方法 |
CN112990458B (zh) * | 2021-04-14 | 2024-06-04 | 北京灵汐科技有限公司 | 卷积神经网络模型的压缩方法及装置 |
2021
- 2021-12-03 CN CN202111475019.8A patent/CN114282666A/zh active Pending
2022
- 2022-11-24 WO PCT/CN2022/133849 patent/WO2023098544A1/zh unknown
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111667068A (zh) * | 2020-06-02 | 2020-09-15 | 清华大学 | 一种基于掩码的深度图卷积神经网络模型剪枝方法与系统 |
CN111738435A (zh) * | 2020-06-22 | 2020-10-02 | 上海交通大学 | 一种基于移动设备的在线稀疏训练方法及系统 |
CN112396179A (zh) * | 2020-11-20 | 2021-02-23 | 浙江工业大学 | 一种基于通道梯度剪枝的柔性深度学习网络模型压缩方法 |
CN114282666A (zh) * | 2021-12-03 | 2022-04-05 | 中科视语(北京)科技有限公司 | 基于局部稀疏约束的结构化剪枝方法和装置 |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116912637A (zh) * | 2023-09-13 | 2023-10-20 | 国网山东省电力公司济南供电公司 | 输变电缺陷识别的方法、装置、计算机设备和存储介质 |
CN116912637B (zh) * | 2023-09-13 | 2023-12-22 | 国网山东省电力公司济南供电公司 | 输变电缺陷识别的方法、装置、计算机设备和存储介质 |
CN116992945A (zh) * | 2023-09-27 | 2023-11-03 | 之江实验室 | 一种基于贪心策略反向通道剪枝的图像处理方法及装置 |
CN116992945B (zh) * | 2023-09-27 | 2024-02-13 | 之江实验室 | 一种基于贪心策略反向通道剪枝的图像处理方法及装置 |
Also Published As
Publication number | Publication date |
---|---|
CN114282666A (zh) | 2022-04-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22900349; Country of ref document: EP; Kind code of ref document: A1
| NENP | Non-entry into the national phase | Ref country code: DE