CN112802141A - Model compression method and terminal applied to image target detection - Google Patents

Model compression method and terminal applied to image target detection

Info

Publication number
CN112802141A
CN112802141A (application CN202110300622.6A)
Authority
CN
China
Prior art keywords
importance factor
layer
model
preset
importance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110300622.6A
Other languages
Chinese (zh)
Other versions
CN112802141B (en)
Inventor
潘成龙
张宇
刘东剑
杨伟强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Santachi Video Technology Shenzhen Co ltd
Original Assignee
Santachi Video Technology Shenzhen Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Santachi Video Technology Shenzhen Co ltd filed Critical Santachi Video Technology Shenzhen Co ltd
Priority to CN202110300622.6A priority Critical patent/CN112802141B/en
Publication of CN112802141A publication Critical patent/CN112802141A/en
Application granted granted Critical
Publication of CN112802141B publication Critical patent/CN112802141B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 9/00 Image coding
    • G06T 9/002 Image coding using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a model compression method and a terminal applied to image target detection. An importance factor layer independent of the original convolutional network is added after each convolutional layer to be pruned in a preset target detection algorithm, and the importance factor vector of each importance factor layer is sparsified, so that features that do not contribute to the algorithm model are preliminarily removed. A threshold for the importance factor parameters is determined according to a preset pruning rate, the importance of each convolutional-layer channel is judged against this threshold, and the convolutional-layer channels whose importance factor parameters fall below the threshold are deleted. The algorithm model can therefore be pruned without depending on any specific layer structure; using this compression method in a target detection algorithm greatly reduces the model size while limiting the precision loss. Finally, the pruned model is fine-tuned to a preset precision, so that model precision and accuracy are preserved while the model is compressed, and the method is easy to implement and deploy without requiring large amounts of computing time or resources.

Description

Model compression method and terminal applied to image target detection
Technical Field
The invention relates to the field of computer image processing, in particular to a model compression method and a terminal applied to image target detection.
Background
A common way to compress a convolutional neural network model exploits the feature relationships between successive layers: a specific subset of the channels of one layer is often sufficient to reproduce, exactly or approximately, the output of the next layer, so all channels outside this subset can be deleted, which removes channels and compresses the model. In practice, however, this approach must minimize the reconstruction error between the complete model and the pruned model with a least-squares fit, which makes it difficult to apply in industry.
Another common method uses known parameters of the network, or parameters of an existing network layer, as the criterion of channel importance, for example the γ parameter of the BatchNorm layer (a layer originally introduced to accelerate convergence and prevent overfitting). The γ parameters of the BatchNorm layers are sparsified before pruning; a larger γ then indicates that the corresponding BatchNorm channel, and hence the corresponding convolutional-layer channel, is more important, while a channel with a small γ is considered unimportant and the corresponding convolutional-layer channel is deleted. This method, however, requires the deep neural network to contain BatchNorm layers and cannot be used without them, and it consumes a large amount of time and resources for later training and fine-tuning.
Another commonly used method treats the weights themselves as the measure of channel importance, using the L1 norm of a channel's weights as the criterion: when the L1 norm of a channel's weights is small, the channel is judged unimportant and can be deleted. In practice, this criterion is too simplistic, and the resulting loss of accuracy is severe, especially in the field of target detection.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a model compression method and terminal applied to image target detection that do not depend on any specific layer structure during compression and that preserve precision after compression.
In order to solve the technical problems, the invention adopts the technical scheme that:
a model compression method applied to image target detection comprises the following steps:
training a preset target detection algorithm based on a preset data set to obtain a converged target detection algorithm model;
adding an importance factor layer independent of an original convolution network after a convolution layer needing pruning in the target detection algorithm model, and sparsifying the importance factor vector of each importance factor layer;
calculating the threshold value of the importance factor parameter in the importance factor vector according to a preset pruning rate;
judging whether each importance factor parameter of each importance factor layer is smaller than the threshold value, if so, cutting out the convolutional layer channel corresponding to the importance factor parameter;
training the pruned target detection algorithm model based on the preset data set to obtain a fine tuning model, judging whether the fine tuning model reaches preset precision, if so, stopping training, and if not, continuing training the fine tuning model until the preset precision is reached.
In order to solve the technical problem, the invention adopts another technical scheme as follows:
a model compression terminal for image object detection, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
training a preset target detection algorithm based on a preset data set to obtain a converged target detection algorithm model;
adding an importance factor layer independent of an original convolution network after a convolution layer needing pruning in the target detection algorithm model, and sparsifying the importance factor vector of each importance factor layer;
calculating the threshold value of the importance factor parameter in the importance factor vector according to a preset pruning rate;
judging whether each importance factor parameter of each importance factor layer is smaller than the threshold value, if so, cutting out the convolutional layer channel corresponding to the importance factor parameter;
training the pruned target detection algorithm model based on the preset data set to obtain a fine tuning model, judging whether the fine tuning model reaches preset precision, if so, stopping training, and if not, continuing training the fine tuning model until the preset precision is reached.
The invention has the beneficial effects that: an importance factor layer independent of the original convolutional network is added after each convolutional layer to be pruned in a preset target detection algorithm, and the importance factor vector of each importance factor layer is sparsified; a threshold for the importance factor parameters is determined according to a preset pruning rate, the importance of each convolutional-layer channel is judged against this threshold, and the convolutional-layer channels whose importance factor parameters fall below the threshold are deleted. The prior art usually sparsifies the parameters of the original convolutional layers or other original network parameters, which changes the original network parameters substantially, greatly reduces the precision after pruning, and requires extensive subsequent training and fine-tuning. In contrast, the present method adds a new importance factor layer that is unrelated to the original network and sparsifies only this newly added layer, so the parameter distribution of the original network is not changed during sparsification, the original network is unaffected, the precision after pruning is almost lossless, and little time is needed to retrain and fine-tune the pruned model, saving computation time and resources. The algorithm model can be pruned without depending on a specific layer structure, so the method is applicable to any network, and using this compression method in a target detection algorithm greatly reduces the model size while limiting the precision loss. The pruned model is fine-tuned to a preset precision, so that model precision is guaranteed while the model is compressed, the accuracy of the compressed model is further ensured, and the method improves training accuracy while remaining easy to implement and deploy.
Drawings
FIG. 1 is a flowchart of a model compression method applied to image target detection according to an embodiment of the present invention;
FIG. 2 is a diagram of a model compression terminal applied to image target detection according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a model compression method applied to image target detection according to an embodiment of the present invention, in which an importance factor layer is added after a BatchNorm layer;
FIG. 4 is a schematic diagram of a model compression method applied to image target detection according to an embodiment of the present invention, in which an importance factor layer is added after another Norm layer;
FIG. 5 is a line graph of the number of channels retained after the RetinaNet algorithm is compressed by three model compression methods applied to image target detection, according to an embodiment of the present invention.
Detailed Description
In order to explain technical contents, achieved objects, and effects of the present invention in detail, the following description is made with reference to the accompanying drawings in combination with the embodiments.
The noun explains:
the deep learning target detection means that a deep learning technology is utilized to find out all interested targets in an image, the category and the position of the targets in the image are determined at the same time, the position of an object in the image is marked by using an identification frame on the basis of positioning the targets, and the category of the object is given;
model pruning, which means measuring the importance of each neuron weight in deep learning by different methods, and pruning unimportant neurons according to the importance degree of the neurons to achieve the purpose of model compression;
finenetune model: the original data set is utilized to perform a small amount of training on a new model, and the long time of the original training model is not needed, which is also called as a fine tuning model;
mAP: namely mean-AP, mAP is an algorithm evaluation standard of passalvac challenge, each category has an AP value, and finally, the average of the AP values of all the categories is the mAP, and the closer to 1, the more excellent the algorithm is;
BatchNorm layer: the name of a specific layer in the neural network, which is used to speed up algorithm convergence and prevent overfitting;
backbone: a backbone network in the convolutional neural network, i.e., a main network used for extracting features;
RetinaNet: a classical target detection algorithm that can be combined with different backbones to form detectors with different performance;
ResNet50: a typical residual network (ResNet), widely used in object classification and related fields; a classical neural network commonly used as the backbone of computer vision tasks.
Referring to fig. 1, an embodiment of the present invention provides a model compression method applied to image target detection, including the steps of:
training a preset target detection algorithm based on a preset data set to obtain a converged target detection algorithm model;
adding an importance factor layer independent of an original convolution network after a convolution layer needing pruning in the target detection algorithm model, and sparsifying the importance factor vector of each importance factor layer;
calculating the threshold value of the importance factor parameter in the importance factor vector according to a preset pruning rate;
judging whether each importance factor parameter of each importance factor layer is smaller than the threshold value, if so, cutting out the convolutional layer channel corresponding to the importance factor parameter;
training the pruned target detection algorithm model based on the preset data set to obtain a fine tuning model, judging whether the fine tuning model reaches preset precision, if so, stopping training, and if not, continuing training the fine tuning model until the preset precision is reached.
From the above description, the beneficial effects of the present invention are: an importance factor layer independent of the original convolutional network is added after each convolutional layer to be pruned in a preset target detection algorithm, and the importance factor vector of each importance factor layer is sparsified; a threshold for the importance factor parameters is determined according to a preset pruning rate, the importance of each convolutional-layer channel is judged against this threshold, and the convolutional-layer channels whose importance factor parameters fall below the threshold are deleted. The prior art usually sparsifies the parameters of the original convolutional layers or other original network parameters, which changes the original network parameters substantially, greatly reduces the precision after pruning, and requires extensive subsequent training and fine-tuning. In contrast, the present method adds a new importance factor layer that is unrelated to the original network and sparsifies only this newly added layer, so the parameter distribution of the original network is not changed during sparsification, the original network is unaffected, the precision after pruning is almost lossless, and little time is needed to retrain and fine-tune the pruned model, saving computation time and resources. The algorithm model can be pruned without depending on a specific layer structure, so the method is applicable to any network, and using this compression method in a target detection algorithm greatly reduces the model size while limiting the precision loss. The pruned model is fine-tuned to a preset precision, so that model precision is guaranteed while the model is compressed, the accuracy of the compressed model is further ensured, and the method improves training accuracy while remaining easy to implement and deploy.
Further, the adding of the importance factor layer independent of the original convolutional layer network after the convolutional layer needing pruning in the target detection algorithm model comprises:
and judging whether a BatchNorm layer is included after the convolutional layer needing pruning of the target detection algorithm model, if so, adding the importance factor layer after the BatchNorm layer, and if not, adding the importance factor layer after other Norm layers after the convolutional layer.
As can be seen from the above description, the importance factor layer is added after the BatchNorm layer after the convolutional layer or after the other Norm layer after the convolutional layer, and is applicable to any convolutional layer structure.
Further, the importance factor vector sparsifying for each of the importance factor layers comprises:
forming the importance factor parameters of each importance factor layer into corresponding importance factor vectors S;
the importance factor vector is sparsified by minimizing:
L = Σ_(x, y) l( f(x, W), y ) + λ · Σ_(s ∈ Γ) g(s)
wherein x represents the input data during training, y represents the desired output data, W represents the weights of the current convolutional layer, l() represents the loss function used during training, f() represents the output value computed for the input data based on the weights of the current convolutional layer, λ represents a balance coefficient, Γ represents the set of importance factor vectors of all importance factor layers, g(s) represents the L1 regularization of an importance factor vector, L represents the loss function with the adjustment factor added, and λ · Σ_(s ∈ Γ) g(s) is the added adjustment factor;
and sparsifying the importance factor vector to a stable state.
According to the above description, the importance factor vectors are sparsified and regularized according to the formula, which gives a preliminary ordering of the importance vectors, removes the influence of unimportant features, simplifies the subsequent calculation, and reduces the computational workload.
Further, the calculating the threshold value of the importance factor parameter in the importance factor vector according to the preset pruning rate includes:
calculating the threshold index threshold_id of the importance factor parameter by the following formula:
threshold_id = floor( len(sorted_s) * p);
in the formula, floor() represents the round-down function, len() returns the length of an array, sorted_s represents the array obtained by sorting all importance factor parameters of each importance factor layer, and p represents the pruning rate;
calculating the threshold value threshold of the importance factor parameter from the threshold index threshold_id by the following formula:
threshold = sorted_s[threshold_id].
As can be seen from the above description, the threshold index is calculated from the pruning rate and the number of importance factor parameters of each importance factor layer, an array is obtained by sorting the importance factor parameters of each importance factor layer, and the corresponding threshold is taken from the sorted array at that index, so the threshold can be obtained simply and intuitively.
Further, the determining whether the fine tuning model reaches a preset precision includes:
based on a preset algorithm evaluation standard, calculating the average number of data obtained by the fine tuning model in all evaluation categories of the algorithm evaluation standard, and judging whether the average number reaches a preset precision.
According to the above description, the average number of data obtained by the fine tuning model in all evaluation categories of the algorithm evaluation standard can be used for accurately obtaining the evaluation value of the current fine tuning model, and the precision of the model after compression is further ensured by comparing the evaluation value with the preset precision.
Referring to fig. 2, an embodiment of the present invention provides a model compression terminal applied to image object detection, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer program:
training a preset target detection algorithm based on a preset data set to obtain a converged target detection algorithm model;
adding an importance factor layer independent of an original convolution network after a convolution layer needing pruning in the target detection algorithm model, and sparsifying the importance factor vector of each importance factor layer;
calculating the threshold value of the importance factor parameter in the importance factor vector according to a preset pruning rate;
judging whether each importance factor parameter of each importance factor layer is smaller than the threshold value, if so, cutting out the convolutional layer channel corresponding to the importance factor parameter;
training the pruned target detection algorithm model based on the preset data set to obtain a fine tuning model, judging whether the fine tuning model reaches preset precision, if so, stopping training, and if not, continuing training the fine tuning model until the preset precision is reached.
From the above description, the beneficial effects of the present invention are: an importance factor layer independent of the original convolutional network is added after each convolutional layer to be pruned in a preset target detection algorithm, and the importance factor vector of each importance factor layer is sparsified; a threshold for the importance factor parameters is determined according to a preset pruning rate, the importance of each convolutional-layer channel is judged against this threshold, and the convolutional-layer channels whose importance factor parameters fall below the threshold are deleted. The prior art usually sparsifies the parameters of the original convolutional layers or other original network parameters, which changes the original network parameters substantially, greatly reduces the precision after pruning, and requires extensive subsequent training and fine-tuning. In contrast, the present method adds a new importance factor layer that is unrelated to the original network and sparsifies only this newly added layer, so the parameter distribution of the original network is not changed during sparsification, the original network is unaffected, the precision after pruning is almost lossless, and little time is needed to retrain and fine-tune the pruned model, saving computation time and resources. The algorithm model can be pruned without depending on a specific layer structure, so the method is applicable to any network, and using this compression method in a target detection algorithm greatly reduces the model size while limiting the precision loss. The pruned model is fine-tuned to a preset precision, so that model precision is guaranteed while the model is compressed, the accuracy of the compressed model is further ensured, and the method improves training accuracy while remaining easy to implement and deploy.
Further, the adding of the importance factor layer independent of the original convolutional layer network after the convolutional layer needing pruning in the target detection algorithm model comprises:
and judging whether a BatchNorm layer is included after the convolutional layer needing pruning of the target detection algorithm model, if so, adding the importance factor layer after the BatchNorm layer, and if not, adding the importance factor layer after other Norm layers after the convolutional layer.
As can be seen from the above description, the importance factor layer is added after the BatchNorm layer after the convolutional layer or after the other Norm layer after the convolutional layer, and is applicable to any convolutional layer structure.
Further, the importance factor vector sparsifying for each of the importance factor layers comprises:
forming the importance factor parameters of each importance factor layer into corresponding importance factor vectors S;
the importance factor vector is sparsified by minimizing:
L = Σ_(x, y) l( f(x, W), y ) + λ · Σ_(s ∈ Γ) g(s)
wherein x represents the input data during training, y represents the desired output data, W represents the weights of the current convolutional layer, l() represents the loss function used during training, f() represents the output value computed for the input data based on the weights of the current convolutional layer, λ represents a balance coefficient, Γ represents the set of importance factor vectors of all importance factor layers, g(s) represents the L1 regularization of an importance factor vector, L represents the loss function with the adjustment factor added, and λ · Σ_(s ∈ Γ) g(s) is the added adjustment factor;
and sparsifying the importance factor vector to a stable state.
According to the above description, the importance factor vectors are sparsified and regularized according to the formula, which gives a preliminary ordering of the importance vectors, removes the influence of unimportant features, simplifies the subsequent calculation, and reduces the computational workload.
Further, the calculating the threshold value of the importance factor parameter in the importance factor vector according to the preset pruning rate includes:
calculating the threshold index threshold_id of the importance factor parameter by the following formula:
threshold_id = floor( len(sorted_s) * p);
in the formula, floor() represents the round-down function, len() returns the length of an array, sorted_s represents the array obtained by sorting all importance factor parameters of each importance factor layer, and p represents the pruning rate;
calculating the threshold value threshold of the importance factor parameter from the threshold index threshold_id by the following formula:
threshold = sorted_s[threshold_id].
As can be seen from the above description, the threshold index is calculated from the pruning rate and the number of importance factor parameters of each importance factor layer, an array is obtained by sorting the importance factor parameters of each importance factor layer, and the corresponding threshold is taken from the sorted array at that index, so the threshold can be obtained simply and intuitively.
Further, the determining whether the fine tuning model reaches a preset precision includes:
based on a preset algorithm evaluation standard, calculating the average number of data obtained by the fine tuning model in all evaluation categories of the algorithm evaluation standard, and judging whether the average number reaches a preset precision.
According to the above description, the average number of data obtained by the fine tuning model in all evaluation categories of the algorithm evaluation standard can be used for accurately obtaining the evaluation value of the current fine tuning model, and the precision of the model after compression is further ensured by comparing the evaluation value with the preset precision.
The above-mentioned model compression method and terminal applied to image target detection of the present invention are applicable to the compression of convolutional layer structures of various convolutional neural networks, and are particularly applicable to the model compression in the field of target detection, and the following description is made by specific embodiments:
example one
Referring to fig. 1, a model compression method applied to image target detection includes the steps of:
S1, training a preset target detection algorithm based on a preset data set to obtain a converged target detection algorithm model;
specifically, in this embodiment, a baseline deep learning target detection algorithm is selected, the target detection algorithm is trained to convergence on a preset existing data set, and objective model evaluation data are obtained by evaluating the algorithm model with the PASCAL VOC test standard;
S2, adding an importance factor layer independent of the original convolution network after each convolution layer needing pruning in the target detection algorithm model, and sparsifying the importance factor vector of each importance factor layer;
wherein adding an importance factor layer independent of an original convolutional layer network after a convolutional layer needing pruning in the target detection algorithm model comprises:
judging whether a BatchNorm layer is included after the convolutional layer needing pruning of the target detection algorithm model, if so, adding the importance factor layer after the BatchNorm layer, and if not, adding the importance factor layer after other Norm layers after the convolutional layer;
specifically, referring to fig. 3 and 4, in the present embodiment, depending on the actual structure of the convolutional layer, if a BatchNorm layer follows the convolutional layer, the importance factor layer is added after the BatchNorm layer; otherwise the importance factor layer is added after the other Norm layer that follows the convolutional layer, such as a GroupNorm layer. The newly added importance factor layer has no substantive relationship with the original convolutional layer, so the original convolutional layer is not affected during sparsification;
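As an illustration of this step, the following is a minimal PyTorch-style sketch of one possible importance factor layer: a channel-wise learnable scale inserted after the BatchNorm (or other Norm) layer that follows a convolutional layer. The names ImportanceFactor and conv_block, and the choice to initialize the factors to one, are assumptions of this sketch and are not taken from the patent:

```python
import torch
import torch.nn as nn

class ImportanceFactor(nn.Module):
    """Channel-wise importance factor layer, independent of the original network.

    Holds one learnable scalar per channel; the vector s = (s1, ..., sc) is the
    importance factor vector of this layer.
    """
    def __init__(self, num_channels: int):
        super().__init__()
        # Initialized to 1 so that inserting the layer leaves the network output
        # unchanged (an assumption of this sketch, not a requirement of the patent).
        self.s = nn.Parameter(torch.ones(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (N, C, H, W); scale each channel by its importance factor.
        return x * self.s.view(1, -1, 1, 1)

def conv_block(in_ch: int, out_ch: int, use_batchnorm: bool = True) -> nn.Sequential:
    """Convolution followed by a Norm layer, with an importance factor layer appended."""
    norm = nn.BatchNorm2d(out_ch) if use_batchnorm else nn.GroupNorm(1, out_ch)
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        norm,                       # BatchNorm when available, another Norm layer otherwise
        ImportanceFactor(out_ch),   # added after the BatchNorm / other Norm layer
        nn.ReLU(inplace=True),
    )
```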
wherein the importance factor vector sparsifying for each of the importance factor layers comprises:
forming the importance factor parameters of each importance factor layer into corresponding importance factor vectors S;
the importance factor vector is thinned by:
Figure 924449DEST_PATH_IMAGE007
wherein x represents input data during training, y represents desired output data, W represents a weight of a current convolutional layer, l () represents a loss function during training, f () represents output value calculation for the input data based on the weight of the current convolutional layer, λ represents a balance coefficient,
Figure 592190DEST_PATH_IMAGE008
a set of importance factor vectors representing all importance factor layers, g(s) a L1 regularization of the importance factor vectors, L a loss function with adjustment factors added,
Figure 464331DEST_PATH_IMAGE009
is an added regulatory factor;
sparsifying the importance factor vector to a steady state;
the importance factor layer has no correlation with the original network layers and is used to reflect the importance of the channels in the corresponding convolutional layer: each importance factor parameter of the importance factor layer corresponds to one channel of the convolutional layer, and the importance factor parameter serves as the importance criterion of that convolutional-layer channel;
specifically, S = (s1, s2, s3, …, sc), where c is the number of channels of the convolutional layer and each entry corresponds to one channel, so that a trainable, learnable one-dimensional importance factor vector S is multiplied onto the output of each original channel;
specifically, in this embodiment, the importance factor vectors are sparsified according to the sparsification formula, and the importance factor vector of each layer is sparsified until it is stable; at this point, within each importance factor layer, the larger the value of an importance factor parameter, the more important the corresponding channel of that layer and hence the corresponding convolutional-layer channel, so the importance of a convolutional-layer channel can be judged from its importance factor parameter;
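A minimal sketch of how this sparsification objective might be trained, reusing the ImportanceFactor layer from the earlier sketch; the balance coefficient value lam and the helper names importance_l1 and sparsify_step are illustrative assumptions:

```python
def importance_l1(model: nn.Module) -> torch.Tensor:
    """Sum of the L1 norms g(s) of all importance factor vectors in the model."""
    return sum(m.s.abs().sum() for m in model.modules() if isinstance(m, ImportanceFactor))

def sparsify_step(model, images, targets, task_loss_fn, optimizer, lam=1e-4):
    """One optimization step of L = l(f(x, W), y) + lambda * sum over s of g(s)."""
    optimizer.zero_grad()
    loss = task_loss_fn(model(images), targets) + lam * importance_l1(model)
    loss.backward()
    optimizer.step()
    return float(loss)
```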
S3, calculating the threshold value of the importance factor parameter in the importance factor vector according to a preset pruning rate;
wherein the threshold index threshold_id of the importance factor parameter is calculated by the following formula:
threshold_id = floor( len(sorted_s) * p);
in the formula, floor() represents the round-down function, len() returns the length of an array, sorted_s represents the array obtained by sorting all importance factor parameters of each importance factor layer, and p represents the pruning rate;
the threshold value threshold of the importance factor parameter is then obtained from the threshold index threshold_id by the following formula:
threshold = sorted_s[threshold_id];
specifically, with this method of calculating the threshold of the importance factor parameter, the threshold can be computed from the desired pruning rate p, and the pruning strength of each layer is then determined by this threshold;
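A short sketch of this threshold computation, assuming a single global threshold taken over the importance factor parameters of all importance factor layers (a per-layer threshold could be computed the same way on each layer's own vector); the function name importance_threshold is an illustrative assumption:

```python
import math

def importance_threshold(model: nn.Module, p: float) -> float:
    """Threshold of the importance factor parameters for pruning rate p."""
    all_s = torch.cat([m.s.detach().abs().flatten()
                       for m in model.modules() if isinstance(m, ImportanceFactor)])
    sorted_s, _ = torch.sort(all_s)                      # ascending order
    threshold_id = math.floor(len(sorted_s) * p)         # threshold_id = floor(len(sorted_s) * p)
    threshold_id = min(threshold_id, len(sorted_s) - 1)  # guard against p = 1.0
    return sorted_s[threshold_id].item()                 # threshold = sorted_s[threshold_id]
```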
S4, judging whether each importance factor parameter of each importance factor layer is smaller than the threshold value, and if so, pruning away the convolutional layer channel corresponding to that importance factor parameter;
specifically, in this embodiment, after the threshold is determined, the channels of the convolutional layers to be pruned whose importance factor parameters are smaller than the threshold are deleted, which completes the model pruning;
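One way this pruning step could look in code, continuing the sketches above: building the per-layer keep masks is shown in full, while prune_conv_bn only rebuilds a single Conv2d + BatchNorm2d pair, and rewiring the input channels of the following layer is architecture-specific and omitted. All names are illustrative assumptions:

```python
def build_keep_masks(model: nn.Module, threshold: float) -> dict:
    """For each importance factor layer, mark which channels of the preceding conv layer to keep."""
    return {name: (m.s.detach().abs() >= threshold)
            for name, m in model.named_modules() if isinstance(m, ImportanceFactor)}

def prune_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d, keep: torch.Tensor):
    """Rebuild a Conv2d + BatchNorm2d pair keeping only the selected output channels."""
    idx = keep.nonzero(as_tuple=True)[0]
    new_conv = nn.Conv2d(conv.in_channels, len(idx), conv.kernel_size,
                         stride=conv.stride, padding=conv.padding,
                         bias=conv.bias is not None)
    new_conv.weight.data.copy_(conv.weight.data[idx])
    if conv.bias is not None:
        new_conv.bias.data.copy_(conv.bias.data[idx])
    new_bn = nn.BatchNorm2d(len(idx))
    new_bn.weight.data.copy_(bn.weight.data[idx])
    new_bn.bias.data.copy_(bn.bias.data[idx])
    new_bn.running_mean.copy_(bn.running_mean[idx])
    new_bn.running_var.copy_(bn.running_var[idx])
    return new_conv, new_bn
```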
S5, training the pruned target detection algorithm model based on the preset data set to obtain a fine-tuning model, judging whether the fine-tuning model reaches the preset precision, stopping training if so, and otherwise continuing to train the fine-tuning model until the preset precision is reached;
specifically, in this embodiment, the preset data set is used to perform a small amount of training on the pruned algorithm model to obtain a Finetune model, and training is stopped once the Finetune model reaches the preset precision;
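A minimal sketch of the fine-tuning loop with this stopping condition; evaluate_map is assumed to be a user-supplied function returning the mAP on the preset data set, and all names and the epoch limit are illustrative assumptions:

```python
def finetune_until(model, train_loader, optimizer, task_loss_fn,
                   evaluate_map, target_map: float, max_epochs: int = 20):
    """Fine-tune the pruned model until it reaches the preset precision (mAP)."""
    for _ in range(max_epochs):
        model.train()
        for images, targets in train_loader:
            optimizer.zero_grad()
            loss = task_loss_fn(model(images), targets)
            loss.backward()
            optimizer.step()
        if evaluate_map(model) >= target_map:   # stop once the preset precision is reached
            break
    return model
```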
Referring to fig. 5, in this embodiment model pruning is performed on the RetinaNet algorithm with a ResNet50 backbone, and the characteristics and differences of the pruning algorithms are compared through the channel indices and channel counts retained in the same network under different pruning schemes. Since model pruning must obey the corresponding pruning rules, only the first two convolutional layers of each block that makes up ResNet50 are pruned and the third convolutional layer of each block is left intact; on this principle 32 layers can be pruned, and fig. 5 shows the number of channels retained in each layer after the same trained RetinaNet model is pruned with the different methods;
In fig. 5, L1_Norm represents the L1 regularization method, the network-slimming curve represents the existing network slimming model pruning method, and Scale represents the method provided in this embodiment. The channels finally retained by the different algorithms on the same model all differ. Because this method also uses regularization-sparsified parameters as the criterion of channel importance, its channel-count distribution is close to that of the network slimming method; however, this method sparsifies a newly added importance factor layer, so the original convolutional layers and Norm layers are not affected by the sparsification, whereas the network slimming method affects the parameters of the whole network by modifying the γ parameters of the existing BatchNorm layers, thereby causing an accuracy loss for the whole network. Since almost no precision is lost after sparsification, this method achieves essentially lossless pruning, and few fine-tuning iterations are needed, or even none;
The deeper a convolutional layer lies in the network, the more channels it has, and the shallow channels are more important than the deep ones; combining the high-precision pruning method of this embodiment with the experimental data in fig. 5, the pruned model obtained here tends to preserve the channel counts of the shallow layers and to discard channels in the deepest layers, which helps prevent overfitting of the compressed model and leaves less redundancy in the shallow filters.
Example two
The difference between this embodiment and the first embodiment is that it further defines how to determine whether the fine-tuning model reaches the preset precision:
wherein, the judging whether the fine tuning model reaches the preset precision comprises:
based on a preset algorithm evaluation standard, calculating the average number of data obtained by the fine tuning model in all evaluation categories of the algorithm evaluation standard, and judging whether the average number reaches a preset precision;
specifically, in this embodiment the mAP (mean AP), the algorithm evaluation standard of the PASCAL VOC challenge, is used for algorithm evaluation; each category has an AP value, the mAP is the average of the AP values over all categories, and the closer the mAP is to 1, the better the algorithm;
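As a simple illustration, the mAP used as the precision criterion is just the mean of the per-class AP values; the sketch below assumes the per-class APs have already been computed by a PASCAL-VOC-style evaluator, and the class names and AP values in the comment are hypothetical:

```python
def mean_average_precision(per_class_ap: dict) -> float:
    """mAP: the average of the AP values over all evaluation categories."""
    if not per_class_ap:
        return 0.0
    return sum(per_class_ap.values()) / len(per_class_ap)

# Hypothetical example:
# mean_average_precision({"person": 0.81, "car": 0.78, "dog": 0.74})  # -> about 0.777
```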
specifically, in this embodiment, pruning precision and model size are tested on the RetinaNet algorithm with ResNet50 as the feature-extraction backbone; RetinaNet is trained and tested on the public PASCAL VOC data set, training on the VOC2012 + VOC2007 training sets and testing on the VOC2007 test set; the pruning ratio is set to 0.5, and the per-class AP values and the mAP after pruning with this method are shown in Table 1;
TABLE 1 Comparison of AP and mAP for the RetinaNet algorithm without pruning and with pruning by the method of this example
(The per-class AP values of Table 1 are provided as an image in the original publication and are not reproduced here.)
As can be seen from Table 1, when the pruning ratio is set to 0.5, the size of the pruned model is about 0.5 times that of the original model and the precision loss is 0.043, which demonstrates that the algorithm is effective; ResNet50 is already a relatively compact feature-extraction backbone, and a more complex, more redundant network would allow a larger compression ratio with a smaller precision loss.
EXAMPLE III
Referring to fig. 2, a model compression terminal applied to image object detection includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps of the model compression method applied to image object detection in the first embodiment or the second embodiment.
In summary, the model compression method and terminal applied to image target detection provided by the invention select the target detection algorithm to be used, add an importance factor layer independent of the original convolutional network after each convolutional layer to be pruned, and sparsify the importance factor vector of each importance factor layer; a threshold for the importance factor parameters is determined according to a preset pruning rate, the importance of each convolutional-layer channel is judged against this threshold, and the convolutional-layer channels whose importance factor parameters fall below the threshold are deleted. The prior art usually takes a scale parameter of an existing network layer associated with the convolutional layer as the importance index, but sparsifying these existing parameters affects that network layer and the convolutional-layer data, so the precision after sparsification drops sharply and a large amount of subsequent fine-tuning and training is needed. Here, because the importance factor layer is newly added and has no relation to the original network layer structure, the original network is not affected during sparsification; only the newly added importance factor layer is sparsified, lossless pruning is achieved, and the pruned model needs only a small amount of fine-tuning training, or none at all, which saves computation time and resources. The algorithm model is pruned without depending on a specific layer structure, so the method is applicable to any network, and using this compression method in a target detection algorithm greatly reduces the model size while limiting the precision loss. The pruned model is trained to a preset precision by fine-tuning: the new pruned model is evaluated with the mAP, and training continues whenever the evaluation result has not reached the preset precision. The compression process also tends to keep more channels in the shallow layers and to prune more channels in the deep layers, so the redundancy of the shallow filters is smaller, the precision after compression is further ensured, a large amount of computation time and resources is not required, and training accuracy is improved while the method remains easy to implement and deploy.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all equivalent changes made by using the contents of the present specification and the drawings, or applied directly or indirectly to the related technical fields, are included in the scope of the present invention.

Claims (10)

1. A model compression method applied to image target detection is characterized by comprising the following steps:
training a preset target detection algorithm based on a preset data set to obtain a converged target detection algorithm model;
adding an importance factor layer independent of an original convolution network after a convolution layer needing pruning in the target detection algorithm model, and sparsifying the importance factor vector of each importance factor layer;
calculating the threshold value of the importance factor parameter in the importance factor vector according to a preset pruning rate;
judging whether each importance factor parameter of each importance factor layer is smaller than the threshold value, if so, cutting out the convolutional layer channel corresponding to the importance factor parameter;
training the pruned target detection algorithm model based on the preset data set to obtain a fine tuning model, judging whether the fine tuning model reaches preset precision, if so, stopping training, and if not, continuing training the fine tuning model until the preset precision is reached.
2. The method of claim 1, wherein the adding of the importance factor layer independent of the original convolutional layer network after the convolutional layer requiring pruning in the object detection algorithm model comprises:
and judging whether a BatchNorm layer is included after the convolutional layer needing pruning of the target detection algorithm model, if so, adding the importance factor layer after the BatchNorm layer, and if not, adding the importance factor layer after other Norm layers after the convolutional layer.
3. The method of claim 1, wherein the sparsifying of the importance factor vector for each of the importance factor layers comprises:
forming the importance factor parameters of each importance factor layer into corresponding importance factor vectors S;
the importance factor vector is sparsified by minimizing:
L = Σ_(x, y) l( f(x, W), y ) + λ · Σ_(s ∈ Γ) g(s)
wherein x represents the input data during training, y represents the desired output data, W represents the weights of the current convolutional layer, l() represents the loss function used during training, f() represents the output value computed for the input data based on the weights of the current convolutional layer, λ represents a balance coefficient, Γ represents the set of importance factor vectors of all importance factor layers, g(s) represents the L1 regularization of an importance factor vector, L represents the loss function with the adjustment factor added, and λ · Σ_(s ∈ Γ) g(s) is the added adjustment factor;
and sparsifying the importance factor vector to a stable state.
4. The method of claim 1, wherein the calculating the threshold value of the importance factor parameter in the importance factor vector according to the preset pruning rate comprises:
calculating the threshold index threshold_id of the importance factor parameter by the following formula:
threshold_id = floor( len(sorted_s) * p);
in the formula, floor() represents the round-down function, len() returns the length of an array, sorted_s represents the array obtained by sorting all importance factor parameters of each importance factor layer, and p represents the pruning rate;
calculating the threshold value threshold of the importance factor parameter from the threshold index threshold_id by the following formula:
threshold = sorted_s[threshold_id].
5. The method as claimed in any one of claims 1 to 4, wherein the determining whether the fine tuning model reaches a predetermined precision comprises:
based on a preset algorithm evaluation standard, calculating the average number of data obtained by the fine tuning model in all evaluation categories of the algorithm evaluation standard, and judging whether the average number reaches a preset precision.
6. A model compression terminal applied to image object detection, comprising a memory, a processor and a computer program stored on the memory and operable on the processor, wherein the processor implements the following steps when executing the computer program:
training a preset target detection algorithm based on a preset data set to obtain a converged target detection algorithm model;
adding an importance factor layer independent of an original convolution network after a convolution layer needing pruning in the target detection algorithm model, and sparsifying the importance factor vector of each importance factor layer;
calculating the threshold value of the importance factor parameter in the importance factor vector according to a preset pruning rate;
judging whether each importance factor parameter of each importance factor layer is smaller than the threshold value, if so, cutting out the convolutional layer channel corresponding to the importance factor parameter;
training the pruned target detection algorithm model based on the preset data set to obtain a fine tuning model, judging whether the fine tuning model reaches preset precision, if so, stopping training, and if not, continuing training the fine tuning model until the preset precision is reached.
7. The model compression terminal applied to image object detection according to claim 6, wherein the adding of the importance factor layer independent of the original convolution network after the convolution layer requiring pruning in the object detection algorithm model comprises:
and judging whether a BatchNorm layer is included after the convolutional layer needing pruning of the target detection algorithm model, if so, adding the importance factor layer after the BatchNorm layer, and if not, adding the importance factor layer after other Norm layers after the convolutional layer.
8. The model compression terminal applied to image target detection as claimed in claim 6, wherein the importance factor vector sparsification for each of the importance factor layers comprises:
forming the importance factor parameters of each importance factor layer into corresponding importance factor vectors S;
the importance factor vector is sparsified by minimizing:
L = Σ_(x, y) l( f(x, W), y ) + λ · Σ_(s ∈ Γ) g(s)
wherein x represents the input data during training, y represents the desired output data, W represents the weights of the current convolutional layer, l() represents the loss function used during training, f() represents the output value computed for the input data based on the weights of the current convolutional layer, λ represents a balance coefficient, Γ represents the set of importance factor vectors of all importance factor layers, g(s) represents the L1 regularization of an importance factor vector, L represents the loss function with the adjustment factor added, and λ · Σ_(s ∈ Γ) g(s) is the added adjustment factor;
and sparsifying the importance factor vector to a stable state.
9. The model compression terminal applied to image target detection according to claim 6, wherein the calculating the threshold value of the importance factor parameter in the importance factor vector according to the preset pruning rate comprises:
calculating the threshold index threshold_id of the importance factor parameter by the following formula:
threshold_id = floor( len(sorted_s) * p);
in the formula, floor() represents the round-down function, len() returns the length of an array, sorted_s represents the array obtained by sorting all importance factor parameters of each importance factor layer, and p represents the pruning rate;
calculating the threshold value threshold of the importance factor parameter from the threshold index threshold_id by the following formula:
threshold = sorted_s[threshold_id].
10. The terminal of any one of claims 6 to 9, wherein the determining whether the fine-tuning model reaches the preset precision comprises:
based on a preset algorithm evaluation standard, calculating the average number of data obtained by the fine tuning model in all evaluation categories of the algorithm evaluation standard, and judging whether the average number reaches a preset precision.
CN202110300622.6A 2021-03-22 2021-03-22 Model compression method and terminal applied to image target detection Active CN112802141B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110300622.6A CN112802141B (en) 2021-03-22 2021-03-22 Model compression method and terminal applied to image target detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110300622.6A CN112802141B (en) 2021-03-22 2021-03-22 Model compression method and terminal applied to image target detection

Publications (2)

Publication Number Publication Date
CN112802141A true CN112802141A (en) 2021-05-14
CN112802141B CN112802141B (en) 2021-08-24

Family

ID=75817309

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110300622.6A Active CN112802141B (en) 2021-03-22 2021-03-22 Model compression method and terminal applied to image target detection

Country Status (1)

Country Link
CN (1) CN112802141B (en)

Citations (7)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764471A (en) * 2018-05-17 2018-11-06 西安电子科技大学 The neural network cross-layer pruning method of feature based redundancy analysis
CN109445935A (en) * 2018-10-10 2019-03-08 杭州电子科技大学 A kind of high-performance big data analysis system self-adaption configuration method under cloud computing environment
CN109543766A (en) * 2018-11-28 2019-03-29 钟祥博谦信息科技有限公司 Image processing method and electronic equipment, storage medium
CN110276450A (en) * 2019-06-25 2019-09-24 交叉信息核心技术研究院(西安)有限公司 Deep neural network structural sparse system and method based on more granularities
CN111062382A (en) * 2019-10-30 2020-04-24 北京交通大学 Channel pruning method for target detection network
CN112183748A (en) * 2020-09-30 2021-01-05 中国科学院自动化研究所 Model compression method, system and related equipment based on sparse convolutional neural network
CN112101487A (en) * 2020-11-17 2020-12-18 深圳感臻科技有限公司 Compression method and device for fine-grained recognition model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
杨文慧: "Neural network model compression method based on parameter and feature redundancy", China Master's Theses Full-text Database *
纪荣嵘 et al.: "A survey of deep neural network compression and acceleration", Journal of Computer Research and Development *

Also Published As

Publication number Publication date
CN112802141B (en) 2021-08-24

Similar Documents

Publication Publication Date Title
CN108615071B (en) Model testing method and device
US10733332B2 (en) Systems for solving general and user preference-based constrained multi-objective optimization problems
US7545986B2 (en) Adaptive resampling classifier method and apparatus
CN111612144B (en) Pruning method and terminal applied to target detection
CN107729999A (en) Consider the deep neural network compression method of matrix correlation
CN112001403B (en) Image contour detection method and system
CN114037844A (en) Global rank perception neural network model compression method based on filter characteristic diagram
CN112016674A (en) Knowledge distillation-based convolutional neural network quantification method
CN111079780A (en) Training method of space map convolution network, electronic device and storage medium
US20210073633A1 (en) Neural network rank optimization device and optimization method
CN112488313A (en) Convolutional neural network model compression method based on explicit weight
CN113125440A (en) Method and device for judging object defects
Woods Ramsay curve IRT for Likert-type data
CN114662666A (en) Decoupling method and system based on beta-GVAE and related equipment
CN112802141B (en) Model compression method and terminal applied to image target detection
CN113239199B (en) Credit classification method based on multi-party data set
CN108090564A (en) Based on network weight is initial and the redundant weighting minimizing technology of end-state difference
CN105117330B (en) CNN code test methods and device
CN111539508A (en) Generator excitation system parameter identification algorithm based on improved wolf algorithm
WO2023102844A1 (en) Method and apparatus for determining pruning module, and computer-readable storage medium
CN116468102A (en) Pruning method and device for cutter image classification model and computer equipment
CN115170902A (en) Training method of image processing model
CN110751201A (en) SAR equipment task failure cause reasoning method based on textural feature transformation
CN113610350B (en) Complex working condition fault diagnosis method, equipment, storage medium and device
CN106709598B (en) Voltage stability prediction and judgment method based on single-class samples

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant