CN113919484A: Structured pruning method and device based on deep convolutional neural network model
Publication number: CN113919484A (application CN202111148560.8A)
Authority: CN (China)
Prior art keywords: model, pruning, layer, channel, structured
Legal status: Pending (the legal status is an assumption, not a legal conclusion; no legal analysis has been performed)
Classifications

G06N 3/045
G — Physics
G06 — Computing; Calculating; Counting
G06N — Computing arrangements based on specific computational models
G06N 3/00 — Computing arrangements based on biological models
G06N 3/02 — Computing arrangements based on biological models using neural network models
G06N 3/08 — Learning methods
G06N 3/082 — Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
Abstract
The application discloses a structured pruning method and device based on a deep convolutional neural network model. The method comprises the following steps: performing sparsity-regularized training on a pre-trained deep convolutional neural network model based on the scaling factors of its BN layers to obtain a sparse model; calculating importance scores for the residual structures in the sparse model according to the first BN-layer parameters of the sparse model; performing layer pruning on the sparse model according to the residual-structure importance scores and a preset number of layers to prune; calculating an importance score for each channel according to the second BN-layer parameters of the sparse model; performing channel pruning on the layer-pruned model according to each channel's importance score, a preset global pruning rate, and a preset local protection pruning rate, to obtain a channel-pruned model; and finalizing the channel-pruned model to obtain a lightweight model for deployment on embedded devices. The method and device address the difficulty of directly deploying high-precision, complex models on (mid- to low-end) mobile devices.
Description
Technical Field
The application relates to the technical field of computers, in particular to a structured pruning method and device based on a deep convolutional neural network model, computer equipment and a storage medium.
Background
Since 2012, deep convolutional neural networks have gradually replaced traditional statistical learning as the mainstream framework of computer vision and have been widely applied in areas such as face recognition and driver assistance. However, high-precision models are generally complex in design and typically consume large amounts of storage space and computational resources, making them difficult to deploy directly on (mid- to low-end) mobile devices.
Model compression can effectively reduce both model size and computation. The inventors observe that current mainstream compression methods such as network architecture redesign and knowledge distillation require deep-learning expertise and hyperparameter-tuning experience, and have long iteration cycles; quantization only reduces the space required per weight and contributes little to reducing computation. Model pruning, by contrast, is conceptually simple, has begun to be deployed at scale on terminal devices, and has achieved good market acceptance. Research has shown that many deep convolutional neural networks contain significant redundancy: only a small fraction of the weights is sufficient to predict the rest, so model pruning can achieve very considerable compression rates.
To address the difficulty of directly deploying high-precision, complex models on (mid- to low-end) mobile devices, the invention provides a structured pruning method based on a deep convolutional neural network model.
Disclosure of Invention
The application mainly aims to provide a structured pruning method and device based on a deep convolutional neural network model, so as to solve the problem that high-precision, complex models are difficult to deploy directly on (mid- to low-end) mobile devices. The design can effectively reduce the storage footprint and computational resource consumption of deep convolutional neural network models on mobile devices, and greatly improves the performance of deep learning algorithms on embedded platforms.
In order to achieve the above object, according to one aspect of the present application, a structured pruning method based on a deep convolutional neural network model is provided.
The method mainly targets pruning of the classical detection network YOLOv3, so the accuracy evaluation metric is mean Average Precision (mAP), and the model size and speed metrics are parameter count and computational cost.
The structured pruning method based on the deep convolutional neural network model comprises the following steps:
performing sparsity-regularized training on a pre-trained deep convolutional neural network model based on the scaling factors of the BN layers to obtain a sparse model;
calculating importance scores for the residual structures in the sparse model according to the first BN-layer parameters of the sparse model;
performing layer pruning on the sparse model according to the residual-structure importance scores and a preset number of layers to prune, to obtain a layer-pruned model;
calculating an importance score for each channel according to the second BN-layer parameters of the sparse model;
performing channel pruning on the layer-pruned model according to each channel's importance score, a preset global pruning rate, and a preset local protection pruning rate, to obtain a channel-pruned model;
and finalizing the channel-pruned model to obtain a lightweight model for deployment on embedded devices.
Further, performing sparsity-regularized training on the pre-trained deep convolutional neural network model based on the scaling factors of the BN layers to obtain a sparse model includes:
adding an L1 regularization constraint to the BN-layer scaling factors to induce sparsity in the BN layers, yielding the loss function for sparse training;
and training with this loss function to update the BN-layer scaling factors, thereby obtaining the sparse model.
Further, the first BN-layer parameters are the scaling factors of the output channels in a residual structure; calculating the importance score of a residual structure in the sparse model according to the first BN-layer parameters comprises:
identifying the residual structures in the sparse model, each containing a 1 × 1 convolution and a 3 × 3 convolution;
and averaging the scaling factors of the residual structure's output channels to obtain its importance score.
Further, performing layer pruning on the sparse model according to the residual-structure importance scores and the preset number of layers to prune, to obtain a layer-pruned model, comprises:
ranking the importance scores of the residual structures;
and pruning the preset number of lowest-scoring residual structures from the sparse model to obtain the layer-pruned model.
Further, after obtaining the layer-pruned model, the method further includes:
evaluating whether the parameter count, computational cost, and accuracy of the layer-pruned model meet preset performance targets;
if they do, ending the layer-pruning process;
if they do not, iteratively executing the steps of claims 2 to 4.
Further, the second BN-layer parameter of the sparse model is the scaling factor of its BN layers.
Further, performing channel pruning on the layer-pruned model according to each channel's importance score, the preset global pruning rate, and the preset local protection pruning rate, to obtain a channel-pruned model, includes:
ranking the importance scores of the channels;
and, according to the preset global pruning rate, the preset local protection pruning rate, and the ranked channel scores, pruning the channels that satisfy the preset channel-pruning conditions to obtain the channel-pruned model.
Further, after obtaining the channel-pruned model, the method further comprises:
evaluating whether the parameter count, computational cost, and accuracy of the channel-pruned model meet preset performance targets;
if they do, ending the channel-pruning process;
if they do not, iteratively executing the steps of claims 2, 6, and 7.
Further, finalizing the channel-pruned model to obtain a lightweight model for deployment on embedded devices includes:
folding the BN parameters into the computations of the channel-pruned model, where BN parameter folding means merging the BN parameters into the convolution parameters so that the separate BN operation is eliminated;
if the BN parameter folding succeeds, a lightweight model for deployment on embedded devices is obtained;
the lightweight model is the final output of sequentially applying the structured pruning steps to the pre-trained deep convolutional neural network model, where the lightweight model's parameter count is 1/10 to 1/5 of the original model's parameter count, and its computational cost is 1/10 to 1/5 of the original model's computational cost.
In order to achieve the above object, according to another aspect of the present application, a structured pruning apparatus based on a deep convolutional neural network model is provided.
The structured pruning device based on the deep convolutional neural network model comprises:
the sparsity-regularization training module, used to perform sparsity-regularized training on a pre-trained deep convolutional neural network model based on the BN-layer scaling factors, to obtain a sparse model;
the first score-calculation module, used to calculate the importance scores of the residual structures in the sparse model according to the first BN-layer parameters;
the layer-pruning module, used to perform layer pruning on the sparse model according to the residual-structure importance scores and the preset number of layers to prune, to obtain a layer-pruned model;
the second score-calculation module, used to calculate each channel's importance score according to the second BN-layer parameters;
the channel-pruning module, used to perform channel pruning on the layer-pruned model according to each channel's importance score, the preset global pruning rate, and the preset local protection pruning rate, to obtain a channel-pruned model;
and the model-finalization module, used to finalize the channel-pruned model into a lightweight model for deployment on embedded devices.
A computer device comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method embodiments described above.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
According to the structured pruning method and device based on the deep convolutional neural network model, sparsity-regularized training of the pre-trained model based on the BN-layer scaling factors avoids destroying the distribution of the kernel weights as much as possible and effectively distinguishes each weight's contribution to model performance; layers or channels that contribute little are pruned, achieving maximal compression. Meanwhile, the accuracy loss caused by each pruning pass is recovered through iterative training, yielding a lightweight model with low accuracy loss. This reduces the storage space the model requires and the resources its forward pass consumes, facilitates deployment of deep convolutional neural network models on (mid- to low-end) mobile devices, and accelerates their adoption in mobile applications.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, provide a further understanding of the application and make its other features, objects, and advantages more apparent. The drawings and their description illustrate embodiments of the invention and do not limit it. In the drawings:
FIG. 1 is a schematic flow diagram of a structured pruning method based on a deep convolutional neural network model in one embodiment;
FIG. 2 is a schematic flow chart illustrating a step of sparse regularization training of a pretrained deep convolutional neural network model based on a scaling factor of a BN layer to obtain a sparse model in one embodiment;
fig. 3 is a schematic flowchart illustrating a step of performing layer pruning on the sparse model according to the importance score of the residual structure and the preset number of pruning layers to obtain a model after structured layer pruning in one embodiment;
FIG. 4 is a flowchart illustrating a step of performing channel pruning on a model after structured layer pruning according to an importance score, a preset global pruning rate, and a preset local protection pruning rate of each channel to obtain a model after structured channel pruning in one embodiment;
FIG. 5 is a schematic flow chart of a structured pruning method based on a deep convolutional neural network model in another embodiment;
FIG. 6(a) is a schematic diagram showing the detection results of the deep convolutional neural network model before pruning;
FIG. 6(b) is a schematic diagram showing the detection results after structured pruning of the YOLOv3 model using the structured pruning method of the present application;
FIG. 7 is a schematic structural diagram of a structured pruning device based on a deep convolutional neural network model in one embodiment;
FIG. 8 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings are used to distinguish between similar elements and do not necessarily describe a particular sequence or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances, so that the embodiments of the application described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
In one embodiment, as shown in fig. 1, a structured pruning method based on a deep convolutional neural network model is provided, which specifically includes the following steps:
and 101, performing sparse regularization training on a pretrained deep convolutional neural network model based on the scaling factor of the BN layer to obtain a sparse model.
A pre-trained deep convolutional neural network model is loaded. For example, the pre-trained model may be a YOLOv3 network: a full-precision, complex model containing BN layers, trained on the large-scale drone-captured dataset VisDrone-DET2019 provided by the AISKYEYE team of the Machine Learning and Data Mining Lab at Tianjin University. The dataset contains ten categories, including pedestrians, cars, and bicycles; mean Average Precision (mAP) serves as the accuracy metric, and parameter count and computational cost serve as the model size and speed metrics.
Sparsity-regularized training is then performed on the deep convolutional neural network model based on the BN-layer scaling factors. The scaling factor γ of a BN layer is a parameter that learns the distribution of the real data and represents the importance of a feature-map channel very well. Pruning the pre-trained model directly is inefficient; for example, channel-pruning a pre-trained ResNet may reduce the parameter count by only about 10% before accuracy degrades. The method instead adds an L1 regularization constraint to the BN scaling factors γ, forcing the pre-trained model to become sparse; unimportant layers or channels can then be identified automatically, reducing the accuracy loss caused by pruning.
And 102, calculating importance scores of residual error structures in the sparse model according to the first BN layer parameters of the sparse model.
And 103, carrying out layer pruning on the sparse model according to the importance score of the residual structure and the preset pruning layer number to obtain a structured layer pruned model.
The residual structure (Residual) is the core module of the backbone in YOLOv3. It is formed by two convolution operations, 1 × 1 × n and 3 × 3 × 2n, connected by one shortcut, where n denotes the number of output channels. The YOLOv3 network contains 23 Residual structures, stacked in the pattern 1, 2, 8, 8, 4 after each stride-2 downsampling convolution. To preserve the integrity of the YOLOv3 design, the pruning limits are set so that at least 1, 1, 1, 1, 1 structures remain in the respective groups, i.e. at most 18 of the 23 Residual structures may be pruned, and the preset number of structures to prune is l (0 ≤ l ≤ 18).
The first BN-layer parameters of the sparse model are the scaling factors of all output channels in a Residual structure, i.e. of its 3n channels. In this embodiment, layer pruning uses the Residual structure as the pruning unit, and the importance score of a Residual structure is the average of the scaling factors of its 3n channels.
The importance scores of the Residual structures are then sorted from small to large, and the l lowest-scoring structures are pruned, yielding the layer-pruned model. The layer-pruned model is fine-tuned to recover the accuracy lost to pruning. It is then evaluated against the preset performance targets for accuracy, computational cost, and parameter count; if they are not met, this process is executed iteratively.
And 104, calculating the importance scores of the channels according to the second BN layer parameters of the sparse model.
And 105, performing channel pruning on the model subjected to structured layer pruning according to the importance score, the preset global pruning rate and the preset local protection pruning rate of each channel to obtain the model subjected to structured channel pruning.
The second BN-layer parameter is the scaling factor γ of the BN layers. Each channel's importance score is its scaling factor γ; all channels in the YOLOv3 backbone and detection heads participate in channel pruning except the last layer.
The channel importance scores are sorted from small to large. A global pruning threshold θ_pg is computed from the preset global pruning rate, and a per-layer pruning-protection threshold θ_pli is computed from the preset local protection pruning rate; the local threshold θ_pli prevents any single layer from being pruned excessively. If a channel's importance score γ satisfies both γ < θ_pg and γ < θ_pli, the channel is pruned, yielding the channel-pruned model. The channel-pruned model is fine-tuned to recover the accuracy lost to pruning, and is then evaluated against the preset performance targets for accuracy, computational cost, and parameter count; if they are not met, this process is executed iteratively.
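The dual-threshold channel-pruning rule described above can be sketched as follows. This is a hypothetical illustration, not code from the patent: the quantile-style threshold computation, function names, and all numeric values are assumptions.

```python
# Hypothetical sketch of dual-threshold channel pruning: a channel is
# pruned only if its scaling factor is below BOTH the global threshold
# theta_pg and its layer's protection threshold theta_pli.

def channel_prune_masks(gammas_per_layer, global_rate, local_rate):
    """Return per-layer 0/1 masks (0 = prune, 1 = keep)."""
    all_gammas = sorted(abs(g) for layer in gammas_per_layer for g in layer)
    # Global threshold: the global_rate quantile of all scaling factors.
    theta_pg = all_gammas[int(global_rate * len(all_gammas))]
    masks = []
    for layer in gammas_per_layer:
        ranked = sorted(abs(g) for g in layer)
        # Local protection threshold: at most local_rate of this layer's
        # channels can fall below it, so every layer keeps some channels.
        theta_pli = ranked[int(local_rate * len(ranked))]
        masks.append([0 if abs(g) < theta_pg and abs(g) < theta_pli else 1
                      for g in layer])
    return masks

# Two illustrative layers of scaling factors.
masks = channel_prune_masks(
    [[0.001, 0.9, 0.002, 0.7], [0.8, 0.0005, 0.6, 0.5]],
    global_rate=0.5, local_rate=0.5)
```

Note how the local threshold protects the second layer: the channel with γ = 0.6 survives even though half of all channels globally score below it.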
And 106, arranging the model after the structured channel is pruned to obtain a lightweight model for deployment and application at the embedded end.
Finalizing the channel-pruned model mainly consists of folding the BN-layer parameters into the convolutional-layer parameters, which reduces computation and simplifies deployment.
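The BN-folding step can be sketched per output channel as follows; this is a minimal illustration under the assumption that each output channel's weights are a flat list of scalars, and the function name is hypothetical, not from the patent.

```python
import math

# Fold y = gamma * (conv(x) + b - mu) / sqrt(var + eps) + beta
# into new convolution weights and bias, removing the BN step.

def fold_bn_into_conv(weights, bias, gamma, beta, mu, var, eps=1e-5):
    """Per-channel BN folding: w' = w * s, b' = (b - mu) * s + beta,
    where s = gamma / sqrt(var + eps)."""
    new_w, new_b = [], []
    for w_ch, b, g, bt, m, v in zip(weights, bias, gamma, beta, mu, var):
        scale = g / math.sqrt(v + eps)
        new_w.append([w * scale for w in w_ch])
        new_b.append((b - m) * scale + bt)
    return new_w, new_b
```

A quick check of the algebra: with weight 2.0, bias 1.0, γ = 0.5, β = 0.1, μ = 0, σ² = 1 (and ε = 0), conv-then-BN on input x = 3 gives 0.5 · (2·3 + 1) + 0.1 = 3.6, and the folded convolution gives 1.0 · 3 + 0.6 = 3.6, the same result with one fewer operation.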
In this embodiment, sparsity-regularized training of the pre-trained deep convolutional neural network model based on the BN-layer scaling factors avoids destroying the distribution of the kernel weights as much as possible, so each weight's contribution to model performance can be effectively distinguished and the layers or channels contributing little can be pruned, achieving a very high compression ratio. Layer pruning is applied to the sparse model according to the residual-structure importance scores and the preset number of structures to prune, and channel pruning is then applied to the layer-pruned model according to each channel's importance score, the preset global pruning rate, and the preset local protection pruning rate. This coarse-to-fine scheme (layer pruning first, then channel pruning) compresses complex models by a large ratio, effectively reducing the storage footprint and computational resource consumption of deep convolutional neural network models on (mid- to low-end) mobile devices and greatly improving the performance of deep learning algorithms on embedded platforms.
In one embodiment, as shown in fig. 2, in step 101, sparse regularization training is performed on a pretrained deep convolutional neural network model based on a scaling factor of a BN layer, so as to obtain a sparse model, which includes the specific steps of:
step 201, adding an L1 regular constraint to the scaling factor of the BN layer to induce the BN layer to be sparse, so as to obtain a loss function of sparse training.
Step 202, training the loss function to update the scaling factor of the BN layer to obtain a sparse model.
In the present embodiment, the importance of different network layers is distinguished mainly through sparse training, which involves the sparsification factor α: for example, α is typically set to 0.0001 and applies a regularization constraint to the BN scaling factor γ. The larger the sparsification factor, the sparser the model.
And adding an L1 regular constraint to the scaling factor gamma of the BN layer, and constructing a sparsely trained loss function, namely the loss function of the final model.
The BN layer normalization formula is:

y = γ · (x − μ) / √(σ² + ε) + β

where μ and σ² are statistical parameters denoting the mean and variance of the data in a batch, ε is a small constant for numerical stability, and γ and β are trainable parameters that respectively scale and shift the normalized distribution to preserve the originally learned features. From this formula, γ is a trainable parameter that mainly scales the normalized distribution while retaining the learned feature distribution. Because the different channels of each feature map have different scaling factors γ, γ represents the importance of a feature-map channel very well. Using the BN scaling factor γ as the importance measure for pruning lets the network keep the BN layer's original advantages while avoiding the introduction of extra parameters during the pruning process.
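The BN formula above can be verified numerically with a tiny single-channel example; this sketch and its values are purely illustrative and not part of the patent.

```python
import math

# Single-channel batch normalization over one batch, matching
# y = gamma * (x - mu) / sqrt(var + eps) + beta.

def batch_norm(xs, gamma, beta, eps=1e-5):
    mu = sum(xs) / len(xs)                          # batch mean
    var = sum((x - mu) ** 2 for x in xs) / len(xs)  # batch variance
    return [gamma * (x - mu) / math.sqrt(var + eps) + beta for x in xs]
```

For the batch [1.0, 3.0] with γ = 2 and β = 1 (ε = 0), the mean is 2 and the variance is 1, so the outputs are 2·(−1) + 1 = −1 and 2·(+1) + 1 = 3: the normalized values are rescaled by γ and shifted by β exactly as described.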
Regularization is a common technique in machine learning whose main purpose is to control model complexity and reduce overfitting. L1 regularization penalizes the sum of the absolute values of the scaling factors γ, usually written ‖γ‖₁; it drives the scaling factors toward a sparse distribution, producing a sparse model suitable for feature selection. To better separate important parameters from redundant ones, this embodiment adds an L1 regularization constraint on γ for sparse training. The training loss function then becomes:

L = Σ_(x,y) l(f(x, W), y) + α Σ_(γ∈Γ) f(γ)

where f(γ) = |γ| is the L1 regularization function, α is the sparsification factor, l(·) is the original task loss, W denotes the trainable weights, and Γ is the set of BN scaling factors.
In this embodiment, a weight-update rule for training can be defined. Because the gradient of the L1 regularization term does not exist everywhere (it is non-differentiable at 0), a proximal-gradient method is introduced for the optimization. Optimizing the L1 regularization function is a special case of convex optimization: the loss function trained in this embodiment has the same form as the classical Lasso problem, and given the special structure of the Lasso problem, this embodiment uses proximal gradient descent to accelerate convergence. Assuming the objective can be written f(x) = g(x) + h(x), where g(x) is convex and differentiable and h(x) is convex but non-differentiable, proximal gradient descent attains a good convergence rate. The Lasso problem is restated as:

min_γ g(γ) + α‖γ‖₁

from which the iterative update can be derived:

γ^(k+1) = S_α(γ^(k) − η∇g(γ^(k)))

where η is the step size and S_α(t) is the soft-threshold function:

S_α(t) = t − α, if t > α; 0, if |t| ≤ α; t + α, if t < −α.
through the analysis, in this embodiment, after the loss function is propagated in reverse in each iteration process and before the gradient is decreased, the scaling factor γ of the BN layer is updated by using the above iteration formula, which is equivalent to using a subgradient decreasing method to add L1 regularization to γ and optimize, and finally receiving the effect of sparse training.
The YOLOv3 model is sparsely trained with the sparsification factor, the L1 regularization function, and the proximal-gradient-descent method until the loss no longer decreases or falls below a threshold, and the model parameters are saved to obtain the sparse model for pruning. This permits a high compression ratio while improving the model's convergence speed.
In one embodiment, the step 102 of calculating the importance scores of the residual structures in the sparse model according to the first BN-layer parameters comprises: identifying the residual structures in the sparse model, each containing a 1 × 1 convolution and a 3 × 3 convolution; and averaging the scaling factors of each residual structure's output channels to obtain its importance score.
The first BN-layer parameters are the scaling factors of the output channels of a residual structure in the sparse model. The Residual structure contains one 1 × 1 convolution and one 3 × 3 convolution; specifically, it is formed by the two convolution operations 1 × 1 × n and 3 × 3 × 2n connected by one shortcut, where n denotes the number of output channels, for a total of 3n output channels.
In this embodiment, the importance score of the Residual structure can be calculated as:
S_Residual = (1/N) · Σ_{i=1}^{N} γ_i
where N represents the sum of the channel counts of the 1 × 1 convolution and the 3 × 3 convolution, γ_i represents the scaling factor corresponding to the i-th channel, and S_Residual means that the γ values of the N channels are summed and averaged to obtain the importance score of the Residual structure.
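The score above is a straightforward mean over the block's BN scaling factors; a minimal sketch (function name is illustrative):

```python
import numpy as np

def residual_importance(gammas_1x1, gammas_3x3):
    """Importance score S_Residual of a Residual block: the mean of the
    BN scaling factors over all N output channels of its 1x1 and 3x3
    convolutions."""
    gammas = np.concatenate([np.asarray(gammas_1x1), np.asarray(gammas_3x3)])
    return float(gammas.mean())
```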
In an embodiment, as shown in fig. 3, in step 103, performing layer pruning on the sparse model according to the importance score of the residual structure and a preset pruning layer number, and obtaining a structured layer pruned model includes:
step 301, ranking the importance scores of the residual structure.
And 302, pruning the residual structure with the preset pruning layer number and low score in the sparse model to obtain a structured layer pruned model.
The importance scores of the residual structures are sorted in ascending order. The preset number of pruning layers is l (0 ≤ l ≤ 18). The masks of the l lowest-scoring layers are set to 0 and the masks of the other layers are set to 1, thereby obtaining a mask table indicating whether each layer of the whole network needs pruning. All layers of the network are traversed according to the layer mask table; if a layer's mask is 0, the layer is pruned, otherwise the layer is retained, thus obtaining the model after structured layer pruning.
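The mask-table construction described above can be sketched as follows (the function name is an illustrative assumption):

```python
import numpy as np

def layer_prune_mask(scores, num_prune):
    """Build a 0/1 mask over residual layers: the `num_prune` layers with
    the lowest importance scores get mask 0 (pruned), the rest get 1."""
    scores = np.asarray(scores)
    mask = np.ones(len(scores), dtype=int)
    # indices of the lowest-scoring layers, via an ascending sort
    mask[np.argsort(scores)[:num_prune]] = 0
    return mask
```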
In one embodiment, after obtaining the model after structured layer pruning, the method further comprises: the method comprises the following steps of evaluating whether a model after structured layer pruning meets a preset performance index, and specifically comprises the following steps: evaluating whether the parameter quantity, the calculated quantity and the precision of the model after structured layer pruning meet preset performance indexes; if the preset performance index is met, ending the layer pruning process; if the preset performance index is not met, the steps in claims 2 to 4 are executed iteratively.
In this embodiment, the preset performance indexes are used to evaluate whether the current pruning satisfies the conditions. Specifically, the precision, parameter quantity and calculation quantity of the model after structured layer pruning are evaluated against the precision index mAP, the parameter-quantity index and the calculation-quantity index. If they meet the preset performance indexes, the layer pruning process ends; otherwise, the steps of sparse regularization training, calculation of the importance scores of the residual structures in the sparse model, and layer pruning are repeated. According to the evaluation result, if the preset performance indexes are met, the model parameters are saved, and these model parameters are used for the subsequent fine-grained channel pruning process.
Through this iterative training mode, the precision loss caused by a single pruning pass is compensated, so that a lightweight model with smaller precision loss is obtained. This further reduces the storage space required by the model and the resource consumption required for its forward propagation, which is favorable for deploying the deep convolutional neural network model on (mid- and low-end) mobile devices and accelerates the practical deployment of deep convolutional neural network models on the mobile end.
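The iterate-until-target procedure described above (sparse training → pruning → evaluation, repeated until the performance indexes are met) can be sketched generically; the helper callables are hypothetical stand-ins for the patent's training, pruning and evaluation steps:

```python
def iterative_pruning(train, prune, evaluate, meets_target, max_rounds=10):
    """Generic sketch of the iterative prune-and-evaluate loop: repeat
    sparse training, pruning and evaluation until the pruned model meets
    the preset performance indexes (parameter count, FLOPs, accuracy)."""
    model = train(None)                # initial sparse-regularized training
    metrics = None
    for _ in range(max_rounds):
        model = prune(model)           # layer (or channel) pruning
        metrics = evaluate(model)      # params / FLOPs / mAP
        if meets_target(metrics):
            break                      # performance indexes satisfied
        model = train(model)           # re-train sparsely and iterate
    return model, metrics
```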
Before the step of evaluating whether the model after structured layer pruning meets the preset performance indexes, the model after structured layer pruning can be fine-tuned to recover the precision loss caused by pruning. Because a model is generally accompanied by a certain precision loss after pruning, it needs to be fine-tuned to compensate for the precision reduction and obtain a model with higher precision.
In one embodiment, the second BN layer parameter of the sparse model is a scaling factor of the BN layer of the sparse model. The importance score of each channel can be calculated according to the scaling factor of the BN layer of the sparse model. The scaling factor of the BN layer of the sparse model refers to model parameters after sparse regularization training, the importance score of each channel can be represented by the scaling factor of the BN layer, and different channels of each feature map have different scaling factors, so that the scaling factors can represent the importance of the feature map channels very well.
In an embodiment, as shown in fig. 4, in step 105, performing channel pruning on the structured layer pruned model according to the importance score of each channel, the preset global pruning rate, and the preset local protection pruning rate, and obtaining the structured channel pruned model includes:
step 401, ranking the importance scores of the channels.
And step 402, pruning channels meeting the pruning conditions of the preset channels according to the preset global pruning rate, the preset local protection pruning rate and the sorted channel scores to obtain a model after structured channel pruning.
After the importance scores of the channels are obtained through calculation, the importance scores of the channels are sorted from small to large, and the contribution degree of different channels to the model performance is shown.
The preset global pruning rate represents the proportion of channels to be pruned out of the total number of channels in the network; it is generally set to 50% and can be adjusted according to the actual pruning effect. The preset local protection pruning rate represents the maximum proportion of channels that may be pruned in each layer relative to the total number of channels in that layer; at least 2 channels are generally retained in each layer to prevent excessive pruning.
A global pruning threshold γ_g is calculated according to the preset global pruning rate as γ_g = g(M · p_g), where M represents the total number of channels of the model participating in pruning, p_g represents the preset global pruning rate, and g(·) returns the importance score at the p_g proportion of the sorted network channel importance scores. A pruning protection threshold γ_l for each layer is calculated according to the preset local protection pruning rate and used as the local protection threshold to prevent excessive pruning.
The preset channel pruning condition may be that the importance score γ of a channel falls below both the global pruning threshold and the local protection threshold of its layer. If the importance score γ of the current channel satisfies this condition, the channel mask is set to 0; otherwise the channel mask is set to 1, so as to obtain a mask table indicating whether each channel of the whole network needs channel pruning. All channels of the network are traversed according to the channel mask table; if a channel's mask is 0, the channel is pruned, otherwise the channel is retained, thereby obtaining the model after structured channel pruning.
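The global-threshold-with-local-protection rule can be sketched as follows (a NumPy sketch under an assumed reading of the patent's condition: a channel is pruned only if its γ falls below both the global quantile threshold and its layer's protection threshold; function names are illustrative):

```python
import numpy as np

def channel_prune_mask(layer_gammas, p_global, p_local):
    """Per-layer 0/1 channel masks. A channel is pruned (mask 0) only if
    its BN scaling factor is below BOTH the global threshold (the p_global
    quantile over all channels) and the layer's local protection threshold
    (the p_local quantile within the layer); at least 2 channels per layer
    are always kept."""
    all_gammas = np.concatenate(layer_gammas)
    gamma_g = np.quantile(all_gammas, p_global)   # global pruning threshold
    masks = []
    for gammas in layer_gammas:
        gamma_l = np.quantile(gammas, p_local)    # local protection threshold
        mask = (gammas >= min(gamma_g, gamma_l)).astype(int)
        if mask.sum() < 2:                        # excessive-pruning guard
            keep = np.argsort(gammas)[-2:]
            mask[:] = 0
            mask[keep] = 1
        masks.append(mask)
    return masks
```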
In one embodiment, after obtaining the pruned model of the structured channel, the method further comprises: the method comprises the following steps of evaluating whether a model after structured channel pruning meets a preset performance index, wherein the steps specifically comprise: evaluating whether the parameter quantity, the calculated quantity and the precision of the model after the structured channel is pruned meet preset performance indexes; if the preset performance index is met, ending the channel pruning process; if the performance index does not meet the preset performance index, the steps in the claims 2, 6 and 7 are executed iteratively.
In this embodiment, the preset performance index is used to evaluate whether the current channel pruning satisfies the condition. Specifically, the precision, the parameter quantity and the calculated quantity of the model after structured channel pruning are evaluated according to the precision index mAP, the model parameter index parameter quantity and the calculated quantity, if the precision, the parameter quantity and the calculated quantity meet the preset performance index, the channel pruning process is finished, and otherwise, the steps of sparse regularization training, calculation of importance scores of all channels and channel pruning are repeated. And if the process is successfully executed, generating a lightweight model with smaller precision error, wherein the lightweight model can be used for the embedded platform deployment application.
Through this iterative training mode, the precision loss caused by a single pruning pass is compensated, so that a lightweight model with smaller precision loss is obtained. This further reduces the storage space required by the model and the resource consumption required for its forward propagation, which is favorable for deploying the deep convolutional neural network model on (mid- and low-end) mobile devices and accelerates the practical deployment of deep convolutional neural network models on the mobile end.
Before the step of evaluating whether the model after structured channel pruning meets the preset performance indexes, the model after structured channel pruning can be fine-tuned to recover the precision loss caused by pruning. The model after structured channel pruning generally has a certain precision loss, so it needs to be fine-tuned to compensate for the precision reduction and obtain a lightweight model with higher precision.
In one embodiment, the step 106 of collating the structured-channel-pruned model to obtain a lightweight model for deploying applications at the embedded end comprises: performing BN parameter merging on the calculation process involved in the model after structured channel pruning, where BN parameter merging refers to merging the BN parameters into the convolution parameters and omitting the BN operation step; and, if the BN parameter merging process is successfully executed, obtaining a lightweight model for deploying applications at the embedded end. The lightweight model is the final output obtained by sequentially executing the structured pruning steps on a pre-trained deep convolutional neural network model, wherein the number of lightweight model parameters is 1/10 to 1/5 of the number of deep convolutional neural network model parameters, and the lightweight model calculation amount is 1/10 to 1/5 of the deep convolutional neural network model calculation amount.
In this embodiment, the calculation process involved in the model after structured channel pruning may include a convolutional layer calculation process and a BN layer calculation process. The relevant network parameters may be obtained first. Specifically, the calculation of one layer in a typical network mainly comprises the convolutional layer calculation, the BN layer calculation and an activation function. The relevant network model parameters mainly comprise convolutional layer parameters and BN layer parameters, where the convolutional layer parameters comprise a weight matrix W and a bias vector B, and the BN layer parameters comprise a mean μ, a variance σ², a scaling factor γ, a translation parameter β and a small constant ε.
The BN parameters are then merged into the convolution parameters, and the BN operation step is omitted.
The convolutional layer calculation process is Y = W * X + B, where Y is the convolutional layer output, W is the weight matrix, X is the convolutional layer input, and B is the bias vector.
The BN layer calculation process is Y = γ · (X − μ) / √(σ² + ε) + β, where X is the BN layer input, Y is the BN layer output, μ is the mean, σ² is the variance, γ is the scaling factor, β is the translation parameter, and ε is a small constant that prevents the denominator from being 0.
The BN parameter merging may specifically be merging the convolutional layer calculation process and the BN layer calculation process: substituting the convolution output into the BN formula gives the merged calculation process Y = γ · (W * X + B − μ) / √(σ² + ε) + β, where X is the convolutional layer input and Y is the BN layer output. The weight and bias parameters after BN parameter merging, which contain the original convolution parameters and the related parameters in the BN, are thereby obtained.
Rearranging the merged calculation process gives the merged weight parameter W_merged = γ · W / √(σ² + ε) and the merged bias parameter B_merged = γ · (B − μ) / √(σ² + ε) + β, so that the merged convolution operation is Y = W_merged * X + B_merged.
If the BN parameter merging process is successfully executed, the model parameters after BN parameter merging are generated and stored. The new model parameters may be used for embedded-end deployment applications. The lightweight model is the final output obtained by sequentially executing the structured pruning steps on a pre-trained deep convolutional neural network model. The number of lightweight model parameters is 1/10 to 1/5 of the number of deep convolutional neural network model parameters, the lightweight model calculation amount is 1/10 to 1/5 of the deep convolutional neural network model calculation amount, and the precision of the lightweight model is basically consistent with that of the deep convolutional neural network model.
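The conv–BN folding described above can be sketched as a small NumPy function (a sketch of the standard folding formulas; the function name and tensor shapes are illustrative assumptions):

```python
import numpy as np

def fold_bn(W, B, gamma, beta, mu, var, eps=1e-5):
    """Fold BN parameters into the preceding convolution's weight and bias:
        W_merged = gamma * W / sqrt(var + eps)
        B_merged = gamma * (B - mu) / sqrt(var + eps) + beta
    W has shape (out_channels, ...); the BN vectors have length out_channels."""
    scale = gamma / np.sqrt(var + eps)
    # broadcast the per-channel scale over the remaining weight dimensions
    W_merged = W * scale.reshape(-1, *([1] * (W.ndim - 1)))
    B_merged = (B - mu) * scale + beta
    return W_merged, B_merged
```

After folding, a conv-plus-BN pair reduces to a single convolution Y = W_merged * X + B_merged, which removes the BN step at inference time.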
As shown in fig. 5, the steps of a structured pruning method based on a deep convolutional neural network model in another embodiment are as follows:
and 501, performing sparse regularization training on the pretrained deep convolutional neural network model based on the scaling factor of the BN layer to obtain a sparse model.
Step 502, performing summation average calculation on the scaling factors of a plurality of output channels of the residual error structure in the sparse model to obtain the importance score of the residual error structure.
Step 503, sorting the importance scores of the residual structures, and pruning the preset number of lowest-scoring residual structures in the sparse model to obtain the model after structured layer pruning.
And step 504, fine tuning the model after the structural layer pruning.
And 505, evaluating whether the parameters, the calculated amount and the precision of the model after structured layer pruning meet preset performance indexes.
Step 506, calculating the importance scores of the channels according to the scaling factors of the BN layer of the sparse model.
And 507, sequencing the importance scores of the channels, and pruning the channels meeting the pruning conditions of the preset channels according to the preset global pruning rate, the preset local protection pruning rate and the sequenced channel scores to obtain a model after structured channel pruning.
Step 508, fine-tuning the model after structured channel pruning.
And 509, evaluating whether the parameters, the calculated amount and the precision of the model after the structured channel pruning meet preset performance indexes.
And 510, combining BN parameters in the calculation process related to the model after the structured channel is pruned, and if the BN parameter combination process is successfully executed, obtaining a lightweight model for deploying and applying at the embedded end.
The specific processes of the steps in this embodiment can be seen from the description in the above specific embodiments, and are not described herein again.
In one embodiment, as shown in fig. 6, the operation effect of the method on the common classical detection network YOLOv3 is shown, where graph (a) shows the detection result of the deep convolutional neural network model before pruning, and graph (b) shows the detection result when the structured pruning method of the present application is adopted to prune 85% of the calculation amount and 92% of the parameter amount of the YOLOv3 model; the results of the two graphs are basically consistent. This shows that the structured pruning method is effective: model parameters and calculation amount can be effectively reduced on the premise of ensuring precision, and model operation efficiency is improved.
As shown in table 1, the statistical results of running the method on the common classical detection network YOLOv3 are as follows: when the structured pruning method is adopted to prune 70% of the calculation amount and 76% of the parameter amount of the YOLOv3 model, the mAP error is less than or equal to 1.8; when 75% of the calculation amount and 80% of the parameter amount are pruned, the mAP error is less than or equal to 2.5; and when 85% of the calculation amount and 92% of the parameter amount are pruned, the mAP error is less than or equal to 3.3. This shows that the structured pruning method is effective: model parameters and calculation amount can be effectively reduced on the premise of ensuring precision, the operation efficiency of the model is improved, and the deployment and application of high-precision complex models on (mid- and low-end) mobile devices can be further promoted.
TABLE 1
Model            | Input size | Param/rate    | GFLOPs/rate  | mAP
YOLOv3           | 640        | 61.6M         | 154.87       | 44.3
YOLOv3_prune_65  | 640        | 14.63M/76.25% | 46.6/69.91%  | 42.5
YOLOv3_prune_75  | 640        | 11.97M/80.57% | 38.67/75.03% | 41.8
YOLOv3_prune_85  | 640        | 4.55M/92.61%  | 22.61/85.4%  | 41.0
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computerexecutable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.
In one embodiment, as shown in fig. 7, there is provided a structured pruning device based on a deep convolutional neural network model, including: sparse regularization training module 701, first score calculating module 702, layer pruning module 703, second score calculating module 704, channel pruning module 705, and model sorting module 706, wherein:
and the sparse regularization training module 701 is configured to perform sparse regularization training on the pretrained deep convolutional neural network model based on the scaling factor of the BN layer to obtain a sparse model.
A first score calculating module 702, configured to calculate an importance score of a residual structure in the sparse model according to the first BN layer parameter of the sparse model.
The layer pruning module 703 is configured to perform layer pruning on the sparse model according to the importance score of the residual structure and the preset number of pruning layers, so as to obtain a structured layer pruned model.
And a second score calculating module 704, configured to calculate an importance score of each channel according to the second BN layer parameter of the sparse model.
The channel pruning module 705 is configured to perform channel pruning on the model after structured layer pruning according to the importance score of each channel, the preset global pruning rate, and the preset local protection pruning rate, so as to obtain a model after structured channel pruning.
And the model sorting module 706 is used for sorting the model after the structured channel is pruned to obtain a lightweight model for deployment application at the embedded end.
In one embodiment, the sparse regularization training module 701 is further configured to add an L1 regularization constraint to the scaling factor of the BN layer to induce the BN layer to be sparse, so as to obtain a loss function of sparse training; and training the loss function to update the scaling factor of the BN layer so as to obtain a sparse model.
In one embodiment, the first BN layer parameter is a scaling factor for a plurality of output channels in the residual structure; the first score calculating module 702 is further configured to determine a residual structure comprising a 1 × 1 convolution and a 3 × 3 convolution in the sparse model; and performing summation average calculation on the scaling factors of a plurality of output channels in the residual error structure to obtain the importance score of the residual error structure.
In one embodiment, the layer pruning module 703 is further configured to rank the importance scores of the residual structures; and pruning the residual structure with the preset pruning layer number and lower score in the sparse model to obtain the model after structured layer pruning.
In one embodiment, the above apparatus further comprises: the first model evaluation module is used for evaluating whether the parameter quantity, the calculated quantity and the precision of the model after structured layer pruning meet preset performance indexes; if the preset performance index is met, ending the layer pruning process; if the preset performance index is not met, the steps in claims 2 to 4 are executed iteratively.
In one embodiment, the second BN layer parameter of the sparse model is a scaling factor of the BN layer of the sparse model.
In one embodiment, the channel pruning module 705 is further configured to rank the importance scores of the channels; and according to the preset global pruning rate, the preset local protection pruning rate and the sorted channel scores, pruning the channels which meet the preset channel pruning conditions, and obtaining the model after structured channel pruning.
In one embodiment, the above apparatus further comprises: the second model evaluation module is used for evaluating whether the parameter quantity, the calculated quantity and the precision of the model after the structured channel pruning meet the preset performance indexes; if the preset performance index is met, ending the channel pruning process; if the performance index does not meet the preset performance index, the steps in the claims 2, 6 and 7 are executed iteratively.
In one embodiment, the model sorting module 706 is further configured to perform BN parameter merging on the calculation process involved in the model after structured channel pruning, where BN parameter merging refers to merging the BN parameters into the convolution parameters and omitting the BN operation step, and, if the BN parameter merging process is successfully executed, to obtain a lightweight model for deploying applications at the embedded end. The lightweight model is the final output obtained by sequentially executing the structured pruning steps on a pre-trained deep convolutional neural network model, wherein the number of lightweight model parameters is 1/10 to 1/5 of the number of deep convolutional neural network model parameters, and the lightweight model calculation amount is 1/10 to 1/5 of the deep convolutional neural network model calculation amount.
For specific definition of the structured pruning device based on the deep convolutional neural network model, reference may be made to the above definition of the structured pruning method based on the deep convolutional neural network model, and details are not described here. The modules in the structured pruning device based on the deep convolutional neural network model can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 8. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The nonvolatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the nonvolatile storage medium. The database of the computer device is used for storing data of a structured pruning based on a deep convolutional neural network model. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a structured pruning method based on a deep convolutional neural network model.
Those skilled in the art will appreciate that the architecture shown in fig. 8 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory storing a computer program and a processor implementing the steps of the various embodiments described above when the processor executes the computer program.
In one embodiment, a computerreadable storage medium is provided, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the respective embodiments described above.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware instructions of a computer program, which can be stored in a nonvolatile computerreadable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include nonvolatile and/or volatile memory, among others. Nonvolatile memory can include readonly memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The abovementioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A structured pruning method based on a deep convolutional neural network model is characterized by comprising the following steps:
carrying out sparse regularization training on a pretrained deep convolutional neural network model based on the scaling factor of the BN layer to obtain a sparse model;
calculating importance scores of residual error structures in the sparse model according to the first BN layer parameters of the sparse model;
performing layer pruning on the sparse model according to the importance score of the residual error structure and the preset pruning layer number to obtain a structured layer pruned model;
calculating the importance score of each channel according to the second BN layer parameter of the sparse model;
channel pruning is carried out on the model subjected to structured layer pruning according to the importance score of each channel, the preset global pruning rate and the preset local protection pruning rate, so that a model subjected to structured channel pruning is obtained;
and arranging the model after the structured channel is pruned to obtain a lightweight model for deployment and application at the embedded end.
2. The method of claim 1, wherein the scaling factor based on the BN layer is used for sparse regularization training of a pretrained deep convolutional neural network model, and obtaining a sparse model comprises:
adding an L1 regular constraint to the scaling factor of the BN layer to induce the sparsification of the BN layer to obtain a loss function of sparse training;
and training the loss function to update the scaling factor of the BN layer so as to obtain a sparse model.
3. The method of claim 1, wherein the first BN layer parameter is a scaling factor of a plurality of output channels in the residual structure; the calculating the importance score of the residual error structure in the sparse model according to the first BN layer parameter of the sparse model comprises:
determining a residual structure comprising a 1 × 1 convolution and a 3 × 3 convolution in the sparse model;
and performing summation average calculation on the scaling factors of a plurality of output channels in the residual error structure to obtain the importance score of the residual error structure.
4. The method according to claim 1, wherein the performing layer pruning on the sparse model according to the importance score of the residual structure and a preset pruning layer number to obtain a structured layer pruned model comprises:
ranking the importance scores of the residual structure;
and pruning the residual structure with the preset pruning layer number and low score in the sparse model to obtain a model after structured layer pruning.
5. The method of claim 4, wherein after the obtaining the model after structured layer pruning, the method further comprises:
evaluating whether the parameter quantity, the calculated quantity and the precision of the model after the structured layer pruning meet preset performance indexes or not;
if the preset performance index is met, ending the layer pruning process;
if the preset performance index is not met, the steps in claims 2 to 4 are executed iteratively.
6. The method of claim 1, wherein the second BN layer parameter of the sparse model is a scaling factor of a BN layer of the sparse model.
7. The method of claim 1, wherein the performing channel pruning on the structured layer pruned model according to the importance score of each channel, the preset global pruning rate and the preset local protection pruning rate to obtain the structured channel pruned model comprises:
ranking the importance scores of the channels;
and according to the preset global pruning rate, the preset local protection pruning rate and the sorted channel scores, pruning the channels which meet the preset channel pruning conditions, and obtaining the model after structured channel pruning.
8. The method of claim 7, wherein after the obtaining the poststructured channel pruning model, the method further comprises:
evaluating whether the parameter quantity, the calculated quantity and the precision of the model after the structured channel is pruned meet preset performance indexes;
if the preset performance index is met, ending the channel pruning process;
if the performance index does not meet the preset performance index, the steps in the claims 2, 6 and 7 are executed iteratively.
9. The method of any one of claims 1 to 8, wherein the sorting out of the model after structured channel pruning to obtain a lightweight model for deployment at an embedded end comprises:
performing BN parameter merging on the calculation processes involved in the model after structured channel pruning, wherein BN parameter merging refers to merging the BN parameters into the convolution parameters so that the separate BN operation step is omitted;
if the BN parameter merging process is executed successfully, obtaining a lightweight model for deployment at the embedded end;
the lightweight model is the final output obtained by sequentially executing the structured pruning steps on the pretrained deep convolutional neural network model, wherein the parameter quantity of the lightweight model is 1/10 to 1/5 of that of the deep convolutional neural network model, and the computation amount of the lightweight model is 1/10 to 1/5 of that of the deep convolutional neural network model.
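The BN parameter merging of claim 9 is the standard "BN folding" identity: since batch normalization applies y = gamma * (z - mean) / sqrt(var + eps) + beta to the convolution output z, the per-channel scale can be absorbed into the convolution weights and the shift into the bias. A minimal sketch, assuming an (out_ch, in_ch, kh, kw) weight layout and illustrative function names:

```python
import numpy as np

def fold_bn_into_conv(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BN parameters into the preceding convolution so that the separate
    BN step can be omitted at inference. W has shape (out_ch, in_ch, kh, kw)."""
    scale = gamma / np.sqrt(var + eps)            # per-output-channel scale
    W_folded = W * scale.reshape(-1, 1, 1, 1)     # absorb scale into weights
    b_folded = (b - mean) * scale + beta          # absorb shift into bias
    return W_folded, b_folded
```

After folding, a single convolution with `W_folded` and `b_folded` reproduces conv-then-BN exactly, which is why the merge step only succeeds (and only makes sense) at inference time, when the BN statistics are frozen.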
10. A structured pruning device based on a deep convolutional neural network model, characterized by comprising:
a sparse regularization training module, configured to perform sparse regularization training on a pretrained deep convolutional neural network model based on the scaling factors of its BN layers, to obtain a sparse model;
a first score calculation module, configured to calculate the importance score of each residual structure in the sparse model according to a first BN layer parameter of the sparse model;
a layer pruning module, configured to perform layer pruning on the sparse model according to the importance scores of the residual structures and a preset number of pruned layers, to obtain a model after structured layer pruning;
a second score calculation module, configured to calculate the importance score of each channel according to a second BN layer parameter of the sparse model;
a channel pruning module, configured to perform channel pruning on the model after structured layer pruning according to the importance score of each channel, a preset global pruning rate and a preset local protection pruning rate, to obtain a model after structured channel pruning;
and a model sorting module, configured to sort out the model after structured channel pruning to obtain a lightweight model for deployment and application at an embedded end.
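The sparse regularization training performed by the device's first module is commonly realized by adding an L1 penalty on the BN scaling factors to the task loss, which drives the factors of unimportant channels toward zero. The patent does not state the exact penalty, so the L1 form and the function name below are assumptions; a hedged numpy sketch of one update step:

```python
import numpy as np

def sparse_step(gamma, task_grad, lr, lam):
    """One SGD step with loss = task_loss + lam * sum(|gamma|):
    the L1 term contributes a lam * sign(gamma) subgradient, shrinking
    BN scaling factors toward zero during sparse regularization training."""
    return gamma - lr * (task_grad + lam * np.sign(gamma))
```

Repeated over training, channels whose scaling factors settle near zero become the low-importance candidates that the channel pruning module removes.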
Priority Applications (1)
Application Number  Priority Date  Filing Date  Title 

CN202111148560.8A CN113919484A (en)  2021-09-29  2021-09-29  Structured pruning method and device based on deep convolutional neural network model 
Publications (1)
Publication Number  Publication Date 

CN113919484A (en)  2022-01-11 
Family
ID=79236735
Family Applications (1)
Application Number  Title  Priority Date  Filing Date 

CN202111148560.8A  Pending  CN113919484A (en)  2021-09-29  2021-09-29  Structured pruning method and device based on deep convolutional neural network model 
Country Status (1)
Country  Link 

CN (1)  CN113919484A (en) 

2021
 2021-09-29  CN  CN202111148560.8A patent/CN113919484A/en  active Pending
Similar Documents
Publication  Publication Date  Title 

CN108764471B (en)  Neural network cross-layer pruning method based on feature redundancy analysis  
CN107689224B (en)  Deep neural network compression method for reasonably using mask  
CN107729999B (en)  Deep neural network compression method considering matrix correlation  
CN110598731B (en)  Efficient image classification method based on structured pruning  
US20180046919A1 (en)  Multi-iteration compression for deep neural networks  
CN111126668B (en)  Spark operation time prediction method and device based on graph convolution network  
CN111950656B (en)  Image recognition model generation method and device, computer equipment and storage medium  
CN113159276A (en)  Model optimization deployment method, system, equipment and storage medium  
JP6950756B2 (en)  Neural network rank optimizer and optimization method  
CN109034372B (en)  Neural network pruning method based on probability  
CN113222014A (en)  Image classification model training method and device, computer equipment and storage medium  
CN114037844A (en)  Global rank perception neural network model compression method based on filter characteristic diagram  
Wang et al.  Towards efficient convolutional neural networks through low-error filter saliency estimation  
CN113919484A (en)  Structured pruning method and device based on deep convolutional neural network model  
CN110705708A (en)  Compression method and device of convolutional neural network model and computer storage medium  
CN111562977A (en)  Neural network model splitting method, device, storage medium and computer system  
CN114186671A (en)  Large-batch decentralized distributed image classifier training method and system  
CN109740734B (en)  Image classification method of convolutional neural network by optimizing spatial arrangement of neurons  
US20200349416A1 (en)  Determining computer-executed ensemble model  
CN111079899A (en)  Neural network model compression method, system, device and medium  
Urgun et al.  Composite power system reliability evaluation using importance sampling and convolutional neural networks  
CN108932550B (en)  Method for classifying images based on fuzzy dense sparse dense algorithm  
CN110490323A (en)  Network model compression method, device, storage medium and computer equipment  
Yang et al.  MultiAdapt: A Neural Network Adaptation For Pruning Filters Base on Multilayers Group  
CN114037073A (en)  Image identification method based on weight pruning quantification 
Legal Events
Date  Code  Title  Description 

PB01  Publication  
SE01  Entry into force of request for substantive examination 