CN114913441A - Channel pruning method, target detection method and remote sensing image vehicle detection method - Google Patents
- Publication number: CN114913441A
- Application number: CN202210738608.9A
- Authority
- CN
- China
- Prior art keywords
- convolution
- channel
- model
- decoupling
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06V20/17—Terrestrial scenes taken from planes or by drones
- G06N3/045—Combinations of networks
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06V10/82—Image or video recognition or understanding using neural networks
- G06V20/54—Surveillance or monitoring of activities of traffic, e.g. cars on the road, trains or boats
- G06V2201/08—Detecting or categorising vehicles
Abstract
The invention discloses a channel pruning method comprising the steps of: determining a target network model; training the target network model to obtain a basic network model; equivalently decoupling the convolution layers of the basic network model to obtain a basic network decoupling model; training the basic network decoupling model to obtain a decoupling model; determining the channels that can finally be compressed and the channels to be reserved; and equivalently merging the decoupling model to obtain the network model after channel pruning, completing the final channel pruning. The invention also discloses a target detection method comprising the channel pruning method and a remote sensing image vehicle detection method comprising the target detection method. The convolution layers in the model are equivalently decoupled into a cascade of the original convolution and a structural convolution, trained separately, and then equivalently merged back into the original network, and the channels are finally cut according to the parameters in the structural convolution; the method therefore preserves the original precision of the model while achieving a high compression rate and good reliability.
Description
Technical Field
The invention belongs to the field of digital signal processing, and particularly relates to a channel pruning method, a target detection method and a remote sensing image vehicle detection method.
Background
With the development of economic technology and the improvement of people's living standards, target detection technology has been widely applied in people's production and daily life, bringing great convenience. Ensuring the accuracy and speed of target detection has therefore become a key focus of research on target detection technology.
At present, unmanned aerial vehicles are already widely used for target detection. Unlike offline target detection, target detection on edge devices such as unmanned aerial vehicles must detect targets in captured images in real time. However, such platforms are limited in computing power, memory and power consumption, so target detection methods based on deep learning generally cannot be deployed in real time; achieving high-precision, lightweight target detection is therefore particularly important for edge devices such as unmanned aerial vehicles.
To enable real-time deployment of deep neural networks on the end side, researchers have studied model compression methods extensively; the purpose of such methods is to simplify the model so as to reduce its computation and storage requirements without affecting its performance. Channel pruning is an important model compression method: the structure of the model does not need to be redefined, and the model size is reduced by directly deleting redundant channels, which shortens the training time of the deep neural network and accelerates model inference. Channel pruning thus makes it possible to deploy deep-learning target detection methods on edge devices.
However, the performance of a deep neural network is closely related to the number of convolution channels, and pruning channels may affect model performance to some extent, so a trade-off must be made between the degree of pruning and performance. In traditional model pruning, every parameter participates in training and pruning simultaneously, i.e. precision training and pruning training are coupled. On the one hand, the weight penalty term introduced in pruning training (such as structured sparsity) changes the optimization target of the model and can seriously degrade the performance of the deep neural network during training; on the other hand, if the pruning constraint is relaxed in order to maintain model performance, the degree of pruning cannot be guaranteed, and a pruned model with a high compression rate cannot be obtained.
Disclosure of Invention
The invention aims to provide a channel pruning method which has a high compression rate and good reliability and can preserve the original precision of the model.
It is a further object of the present invention to provide a target detection method comprising said channel pruning method.
The invention also aims to provide a remote sensing image vehicle detection method comprising the target detection method.
The channel pruning method provided by the invention comprises the following steps:
s1, determining a target network model;
s2, acquiring a training data set and a loss function, and training the target network model determined in the step S1 by using the acquired training data set and the loss function to obtain a basic network model;
s3, carrying out equivalent decoupling on the convolution layer of the basic network model obtained in the step S2 to obtain a basic network decoupling model;
s4, training the basic network decoupling model obtained in the step S3 by adopting the training data set and the loss function obtained in the step S2 to obtain a decoupling model;
s5, determining the channels that can finally be compressed and the channels to be reserved according to the decoupling model obtained in step S4;
and S6, equivalently merging the decoupling model obtained in step S4 according to the compressible channels and reserved channels determined in step S5 to obtain the network model after channel pruning, completing the channel pruning of the final target network model.
The acquiring of the training data set in step S2 specifically comprises the following steps:
acquiring training pictures; carrying out random multi-scale transformation on the obtained training pictures; after the transformation, randomly flipping the pictures left-right with a set probability; finally, unifying the picture sizes by padding with gray values;
arranging the picture labels into a uniform format (n, x, y, w, h), where n is the target category, (x, y) are the centre coordinates of the target box normalized by the image width and height, and (w, h) are the normalized width and height of the target box.
Step S3, performing equivalent decoupling on the convolution layers of the basic network model obtained in step S2 to obtain a basic network decoupling model, specifically comprises the following steps:
equivalently decoupling the c-th convolution layer w_c of the basic network model W obtained in step S2 into a cascade of the original layer w_c and a structural convolution w_e;
wherein the structural convolution w_e is a convolution layer with 1×1 kernels; the initial weight of the structural convolution w_e is the d_o × d_o identity matrix, where d_o is the number of output channels of the original convolution layer w_c.
To streamline the data processing flow, the structural convolution w_e is translated to after the batch normalization layer that follows the original convolution layer w_c.
Step S4, training the basic network decoupling model obtained in step S3 with the training data set and the loss function obtained in step S2 to obtain a decoupling model, specifically comprises the following steps:
A. setting a learning rate, and training the basic network decoupling model obtained in step S3 again with the training data set and the loss function obtained in step S2;
during training, the first N rounds are trained normally; after round N, the parameters of the structural convolution are sorted by magnitude, the channels needing to be compressed are selected, and an additional penalty gradient is applied to the corresponding structural-convolution parameters;
B. recording the parameters of the structural convolution as a matrix Q of size D × D, where D is the number of convolution kernel channels of the structural convolution layer; the channel importance I_d of the d-th channel of the original convolution is then calculated from the structural-convolution parameters as the norm of the d-th row of Q, I_d = ||Q(d,:)||;
C. selecting the number of channels to be compressed, M: initially M = 0; starting from round N, after every X batches of training, M is increased by Y until the preset channel compression ratio is reached; meanwhile, when selecting channels, the number of channels of each convolution is kept no lower than a set value S; X, Y and S are all preset positive integers;
D. the convolution parameters are updated as W' = W − l·G, where W' are the updated convolution parameters, W the convolution parameters before the update, l the learning rate and G the back-propagated gradient of the loss function with respect to the convolution;
in the structural convolution, the parameters of channels that do not need to be compressed are updated in the same way as the original convolution parameters; for channels needing to be compressed, the gradient update is changed and an additional penalty gradient is applied, the update being Q' = Q − l·G − l·g, where Q are the structural-convolution parameters before the update, Q' the updated parameters, and g = λ·sign(Q) the imposed penalty gradient; λ > 0 is the penalty factor and sign(·) is the sign function.
Step S5, determining the channels that can finally be compressed and the channels to be reserved according to the decoupling model obtained in step S4, specifically comprises the following steps:
calculating the channel importance of each channel of the original convolution from the parameters of the structural convolution, the channel importance of the d-th channel being I_d;
if the importance I_d of a channel of the original convolution satisfies I_d < k, where k is the pruning threshold and k = 0.01, the channel corresponding to the original convolution is determined to be a pruned channel, and pruning it does not reduce the performance of the model.
The equivalent merging of the decoupling model obtained in step S4 according to the compressible and reserved channels determined in step S5, described in step S6, specifically comprises the following steps:
a. combining the calculation formulas of the convolution layer and the batch normalization layer to obtain y = γ·(w ⊛ x + b − μ)/(σ + ε) + β, where x is the input feature, y is the output of the input feature after passing through the convolution layer and the batch normalization layer, w is the weight parameter of the convolution layer, b is the bias parameter of the convolution layer, γ is the scaling coefficient of the batch normalization layer, μ is the mean of the batch normalization layer, σ is the standard deviation of the batch normalization layer, ε is a set minimum value, β is the offset coefficient of the batch normalization layer, and ⊛ is the convolution operator;
b. arranging the combined formula into convolution form to obtain y = ŵ ⊛ x + b̂; the corresponding convolution is the new convolution;
c. calculating the weight and bias of the new convolution obtained in step b as ŵ = γ·w/(σ + ε) and b̂ = γ·(b − μ)/(σ + ε) + β, where ŵ is the weight parameter and b̂ the bias parameter of the new convolution;
d. merging the new convolution obtained in step b with the structural convolution, and calculating the weight and bias of the merged convolution as w̄ = w_Q ⊛ ŵ and b̄ = w_Q ⊛ b̂, where w̄ is the weight of the merged convolution layer, w_Q is the weight of the structural convolution, w is the weight of the original convolution, b̄ is the bias of the merged convolution layer, b is the bias of the original convolution, and ⊛ is the convolution operator;
e. in the merged convolution layer of step d, if the convolution layer contains a channel to be pruned, the parameters on the corresponding channels of w̄ and b̄ are deleted simultaneously, completing the pruning of the corresponding channels.
The invention also provides a target detection method comprising the channel pruning method, which comprises the following steps:
(1) constructing an original target detection model;
(2) performing channel pruning on the original target detection model constructed in step (1) by the channel pruning method described above, thereby obtaining a target detection model;
(3) using the target detection model obtained in step (2) to perform actual target detection.
The invention also provides a remote sensing image vehicle detection method comprising the target detection method, which comprises the following steps:
1) acquiring a remote sensing image vehicle detection data set;
2) constructing the original target detection model as a Yolov5 model;
3) performing channel pruning on the original target detection model constructed in step 2) by the channel pruning method described above, thereby obtaining the pruned target detection model;
4) using the target detection model obtained in step 3) to perform actual vehicle detection on remote sensing images.
According to the channel pruning method, the target detection method and the remote sensing image vehicle detection method of the invention, the convolution layers in the model are equivalently decoupled into a cascade of the original convolution and a structural convolution, exploiting the fact that convolution is linear, separable and composable. Precision-related training then follows the normal training procedure, while pruning-related training operates only on the structural convolution; after training, the layers are equivalently merged back into the original network, and channel pruning is finally performed according to the parameters in the structural convolution. The method therefore preserves the original precision of the model while achieving a high compression rate and good reliability.
Drawings
FIG. 1 is a schematic process flow diagram of the channel pruning method of the present invention.
Fig. 2 is a schematic view of the pruning principle of the channel pruning method of the present invention.
FIG. 3 is a schematic method flow chart of the target detection method of the present invention.
FIG. 4 is a schematic method flow diagram of the remote sensing image vehicle detection method of the present invention.
Detailed Description
Fig. 1 is a schematic flow chart of the channel pruning method of the present invention: the channel pruning method provided by the invention comprises the following steps:
s1, determining a target network model, such as a Yolov5 network;
s2, acquiring a training data set and a loss function, and training the target network model determined in the step S1 by using the acquired training data set and the loss function to obtain a basic network model; the method specifically comprises the following steps:
acquiring training pictures; carrying out random multi-scale transformation on the obtained training pictures (with a preset scale parameter); after the transformation, randomly flipping the pictures left-right with a set probability (preferably 50%); finally, unifying the picture sizes (preferably to 640 × 640) by padding with gray values;
arranging the picture labels into a uniform format (n, x, y, w, h), where n is the target category, (x, y) are the centre coordinates of the target box normalized by the image width and height, and (w, h) are the normalized width and height of the target box;
then training the target network model determined in step S1 with the obtained training data set and the loss function; during training, the learning rate is preferably set to 0.01; after training, the basic network model W is obtained;
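The (n, x, y, w, h) label format above can be illustrated with a short sketch; the function name and the pixel-corner box convention below are illustrative assumptions, not part of the patent:

```python
def to_normalized_label(n, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-coordinate target box to the (n, x, y, w, h) format:
    class id plus centre and size normalized by image width and height."""
    x = (x_min + x_max) / 2 / img_w   # normalized centre x
    y = (y_min + y_max) / 2 / img_h   # normalized centre y
    w = (x_max - x_min) / img_w       # normalized width
    h = (y_max - y_min) / img_h       # normalized height
    return (n, x, y, w, h)

# A 320 x 160 pixel box in a 640 x 640 picture:
label = to_normalized_label(0, 160, 160, 480, 320, 640, 640)
```

Each annotation line of the training data set then carries one such tuple per target.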
S3, carrying out equivalent decoupling on the convolution layer of the basic network model obtained in the step S2 to obtain a basic network decoupling model; the method specifically comprises the following steps:
the basic network model obtained in the step S2WTo (1) acA convolution layerw c Equivalent decoupling as cascaded protolayersw c And structural convolutionw e ;
Wherein the structure is convolutedw e A convolutional layer of 1 x 1 cores; structural convolutionw e Has an initial weight ofd o *d o The unit matrix of (a) is,d o is laminated to the original coilw c The number of output channels.
To speed up the data processing flow, the structure is convolvedw e Translating to the original convolution layerw c The subsequent batch normalization layer;
the specific process of this step is as follows:
for the c-th convolution layer w_c of the model W, let the input feature map be x_i and the output feature map be y_c; the process is expressed as y_c = w_c ⊛ x_i;
then the structural convolution w_e is appended after the original convolution layer w_c, where the input and output channels of the original convolution are d_i and d_o respectively; w_e is a convolution layer with 1×1 kernels whose initial weight is the d_o × d_o identity matrix; setting the input feature map x_e = y_c, the output feature map of the structural convolution is y_e = w_e ⊛ x_e; since the initial weight of the structural convolution is an identity matrix, x_e = y_e;
the convolution layer w_c is thus equivalently decoupled into the cascade of the original layer w_c and the structural convolution w_e, the whole process being y_e = w_e ⊛ (w_c ⊛ x_i); the decoupling transformation is therefore mathematically exact, and the performance of the model before and after adding the structural convolution is identical; to simplify the data processing entailed by merging the model in step S6, in the actual decoupling operation the structural convolution is translated to after the batch normalization layer, and the performance before and after the translation remains identical;
s4, training the basic network decoupling model obtained in step S3 with the training data set and the loss function obtained in step S2 to obtain a decoupling model; the method specifically comprises the following steps:
A. setting the learning rate (the same as that set in step S2), and training the basic network decoupling model obtained in step S3 again with the training data set and the loss function of step S2;
during training, the first N rounds are trained normally; after round N, the model has finished adapting to the decoupled parameters; the parameters of the structural convolution are then sorted by magnitude, the channels needing to be compressed are selected, and an additional penalty gradient is applied to the corresponding structural-convolution parameters; N is preferably 5;
B. the parameters of the structural convolution are recorded as a matrix Q of size D × D, where D is the number of convolution kernel channels of the structural convolution layer; the channel importance I_d of the d-th channel of the original convolution is then calculated from the structural-convolution parameters as the norm of the d-th row of Q, I_d = ||Q(d,:)||;
C. selecting the number of channels to be compressed, M: initially M = 0; starting from round N, after every X batches of training, M is increased by Y until the preset channel compression ratio is reached; meanwhile, when selecting channels, the number of channels of each convolution is kept no lower than a set value S; X, Y and S are all preset positive integers; X is preferably 256 and Y preferably 16; when the compression requirement is not high, S is preferably 8, which gives better network performance;
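The channel-count schedule of step C can be sketched as a small helper; counting the warm-up phase in batches rather than rounds, and the function name, are simplifying assumptions for illustration:

```python
def num_compressed(batches_done, warmup_batches, X, Y, M_max):
    """Number of channels M selected for compression so far: zero during
    warm-up, then increased by Y after every X batches until the number
    matching the preset compression ratio (M_max) is reached."""
    if batches_done < warmup_batches:
        return 0
    return min(M_max, Y * ((batches_done - warmup_batches) // X))

# With the preferred values X = 256 and Y = 16, M grows in steps of 16.
```

A floor S on the remaining channel count per convolution would be enforced separately when the actual channels are picked.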
D. the convolution parameters are updated as W' = W − l·G, where W' are the updated convolution parameters, W the convolution parameters before the update, l the learning rate and G the back-propagated gradient of the loss function with respect to the convolution;
in the structural convolution, the parameters of channels that do not need to be compressed are updated in the same way as the original convolution parameters; for channels needing to be compressed, the gradient update is changed and an additional penalty gradient is applied, the update being Q' = Q − l·G − l·g, where Q are the structural-convolution parameters before the update, Q' the updated parameters, and g = λ·sign(Q) the imposed penalty gradient; λ > 0 is the penalty factor and sign(·) is the sign function;
under the effect of the penalty gradient, the structural-convolution parameters corresponding to a channel needing to be compressed gradually approach zero; when a row of parameters in the structural convolution approaches zero, the neurons of the corresponding channel of the preceding convolution layer are deactivated, and that channel can be removed in the subsequent steps;
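The effect of the sign-based penalty gradient g = λ·sign(Q) on the selected rows can be simulated directly. In this sketch the loss gradient G is taken as zero, and clamping a parameter to zero once an update crosses zero is our addition so that the illustration settles exactly; the patent only specifies the penalty gradient itself:

```python
import numpy as np

Q = np.eye(4)            # structural convolution weights (identity at init)
lr, lam = 0.1, 0.5       # learning rate l and penalty factor lambda (illustrative)
to_prune = [1, 3]        # rows selected for compression

for _ in range(30):
    rows = Q[to_prune, :]
    step = lr * lam * np.sign(rows)      # penalty term l * lambda * sign(Q)
    updated = rows - step
    # clamp parameters that crossed zero (our addition, for a clean settle)
    updated[np.sign(updated) != np.sign(rows)] = 0.0
    Q[to_prune, :] = updated

# Penalized rows decay to zero; untouched rows keep their values.
assert np.allclose(Q[to_prune, :], 0.0)
assert np.allclose(Q[[0, 2], :], np.eye(4)[[0, 2], :])
```

Once a row of Q is zero, the corresponding output channel of the decoupled block is zero regardless of its input, which is exactly the deactivation described above.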
s5, determining the channels that can finally be compressed and the channels to be reserved according to the decoupling model obtained in step S4; the method specifically comprises the following steps:
after pruning training, some channels in the structural convolution approach zero, i.e. under the effect of the structural convolution the output of the corresponding channel of the original convolution filter can be neglected, so removing those channels does not affect network performance;
the channel importance of each channel of the original convolution is calculated from the parameters of the structural convolution, the channel importance of the d-th channel being I_d;
a threshold k is set; if the importance I_d of a channel of the original convolution satisfies I_d < k, where k is the pruning threshold and k = 0.01, the channel corresponding to the original convolution is determined to be a pruned channel, and pruning it does not reduce the performance of the model;
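The selection rule of step S5 can be sketched as follows; using the L1 norm of each structural-convolution row as the importance I_d is an illustrative choice, since the patent specifies the magnitude of the structural-convolution parameters without fixing a particular norm:

```python
import numpy as np

def channel_importance(Q):
    """Importance I_d of original-convolution channel d: magnitude (L1 norm
    here) of row d of the structural-convolution weight matrix Q."""
    return np.abs(Q).sum(axis=1)

Q = np.diag([1.0, 0.003, 0.8, 0.0001])   # two rows driven near zero by pruning training
k = 0.01                                  # pruning threshold
imp = channel_importance(Q)
pruned = [d for d in range(Q.shape[0]) if imp[d] < k]
kept = [d for d in range(Q.shape[0]) if imp[d] >= k]
```

Here channels 1 and 3 fall below the threshold and would be cut, while channels 0 and 2 are reserved.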
s6, equivalently merging the decoupling model obtained in step S4 according to the compressible and reserved channels determined in step S5 to obtain the network model after channel pruning, completing the channel pruning of the final target network model; the method specifically comprises the following steps:
a. the convolution layer is computed as y_1 = w ⊛ x + b; the batch normalization layer is computed as y = γ·(y_1 − μ)/(σ + ε) + β; combining the calculation formulas of the convolution layer and the batch normalization layer gives y = γ·(w ⊛ x + b − μ)/(σ + ε) + β, where x is the input feature, y is the output of the input feature after passing through the convolution layer and the batch normalization layer, w is the weight parameter of the convolution layer, b is its bias parameter, γ is the scaling coefficient of the batch normalization layer, μ is its mean, σ is its standard deviation, ε is a set minimum value, β is its offset coefficient, and ⊛ is the convolution operator;
b. the combined formula is arranged into convolution form, y = ŵ ⊛ x + b̂; the corresponding convolution is the new convolution;
c. the weight and bias of the new convolution obtained in step b are calculated as ŵ = γ·w/(σ + ε) and b̂ = γ·(b − μ)/(σ + ε) + β, where ŵ is the weight parameter and b̂ the bias parameter of the new convolution;
d. the new convolution is computed as y = ŵ ⊛ x + b̂ and the structural convolution as y_e = w_Q ⊛ y; arranging the combination into convolution form gives y_e = w̄ ⊛ x + b̄; the new convolution obtained in step b is then merged with the structural convolution, and the weight and bias of the merged convolution layer are calculated as w̄ = w_Q ⊛ ŵ and b̄ = w_Q ⊛ b̂, where w̄ is the weight of the merged convolution layer, w_Q is the weight of the structural convolution, w is the weight of the original convolution, b̄ is the bias of the merged convolution layer, b is the bias of the original convolution, and ⊛ is the convolution operator;
e. in the merged convolution layer of step d, if the convolution layer contains a channel to be pruned, the parameters on the corresponding channels of w̄ and b̄ are deleted simultaneously, completing the pruning of the corresponding channels.
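The merging chain of steps a through e can be verified end-to-end numerically. As before, the convolutions are written as channel-space matrix multiplies (an illustrative simplification that is exact for 1×1 kernels), and dividing by (σ + ε) follows the standard-deviation convention used above:

```python
import numpy as np

rng = np.random.default_rng(1)
d_i, d_o, n_pix = 3, 4, 7
w = rng.standard_normal((d_o, d_i))          # original convolution weight
b = rng.standard_normal(d_o)                 # original convolution bias
gamma = rng.uniform(0.5, 1.5, d_o)           # BN scaling coefficient
beta = rng.standard_normal(d_o)              # BN offset coefficient
mu = rng.standard_normal(d_o)                # BN mean
sigma = rng.uniform(0.5, 1.5, d_o)           # BN standard deviation
eps = 1e-5                                   # set minimum value
w_Q = rng.standard_normal((d_o, d_o))        # structural convolution weight
x = rng.standard_normal((d_i, n_pix))        # input feature map

# Reference path: convolution -> batch normalization -> structural convolution
y1 = w @ x + b[:, None]
y = gamma[:, None] * (y1 - mu[:, None]) / (sigma[:, None] + eps) + beta[:, None]
y_ref = w_Q @ y

# Steps b-c: fold BN into the convolution (the "new convolution")
w_hat = (gamma / (sigma + eps))[:, None] * w
b_hat = gamma * (b - mu) / (sigma + eps) + beta

# Step d: merge the new convolution with the structural convolution
w_bar = w_Q @ w_hat
b_bar = w_Q @ b_hat

y_merged = w_bar @ x + b_bar[:, None]
assert np.allclose(y_ref, y_merged)          # merging is mathematically exact

# Step e: keeping only the reserved output channels keeps the matching
# rows of w_bar and entries of b_bar
keep = [0, 2, 3]
assert np.allclose(y_ref[keep], w_bar[keep] @ x + b_bar[keep][:, None])
```

Deleting a channel in step e is lossless precisely because pruning training has already driven the corresponding row of w_Q, and hence of w̄, to (near) zero.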
Fig. 2 is a schematic diagram of the pruning principle of the channel pruning method of the present invention: the structurally decoupled channel pruning method uses the magnitude of the structural-convolution parameters to represent the importance of each channel of the original convolution, which characterizes how strongly the corresponding convolution channel transmits information. After the prunable channels are selected progressively and iteratively, the penalty gradient gradually attenuates the unimportant channels toward zero; these channels are gradually deactivated during iterative pruning, and when the networks are merged they can be cut with almost no loss of performance, achieving performance-lossless pruning.
Fig. 3 is a schematic flow chart of the target detection method of the present invention: the invention provides a target detection method comprising the channel pruning method, which comprises the following steps:
(1) constructing an original model of target detection;
(2) performing channel pruning on the target detection original model constructed in the step (1) by adopting the channel pruning method, thereby obtaining a target detection model;
(3) performing actual target detection by using the target detection model obtained in step (2).
FIG. 4 is a schematic flow chart of the remote sensing image vehicle detection method of the present invention: the invention provides a remote sensing image vehicle detection method comprising the target detection method, which comprises the following steps:
1) acquiring a remote sensing image vehicle detection data set;
2) constructing the original target detection model as a YOLOv5 model;
3) performing channel pruning on the target detection original model constructed in the step 2) by adopting the channel pruning method, thereby obtaining a cut target detection model;
4) performing actual vehicle detection on remote sensing images by using the target detection model obtained in step 3).
Claims (9)
1. A channel pruning method comprises the following steps:
s1, determining a target network model;
s2, acquiring a training data set and a loss function, and training the target network model determined in the step S1 by using the acquired training data set and the loss function to obtain a basic network model;
s3, carrying out equivalent decoupling on the convolution layer of the basic network model obtained in the step S2 to obtain a basic network decoupling model;
s4, training the basic network decoupling model obtained in the step S3 by adopting the training data set and the loss function obtained in the step S2 to obtain a decoupling model;
s5, determining a channel which can be compressed finally and a reserved channel according to the decoupling model obtained in the step S4;
S6, equivalently merging the decoupling model obtained in step S4 according to the compressible channels and the reserved channels determined in step S5 to obtain the network model after channel pruning, completing the channel pruning of the final target network model.
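The six steps of claim 1 can be read as a small driver function; every stage function below is a hypothetical placeholder standing in for the corresponding procedure, not an API from the disclosure:

```python
def prune_channels(model, train, decouple, select_channels, merge):
    """Sketch of steps S1-S6; all stage callables are placeholders."""
    base = train(model)                 # S2: train the target model into a base model
    decoupled = train(decouple(base))   # S3 + S4: equivalent decoupling, then retraining
    keep = select_channels(decoupled)   # S5: decide compressible vs. reserved channels
    return merge(decoupled, keep)       # S6: equivalent merge and channel cutting
```

A trivial instantiation with stub stages shows the data flow without any real training.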
2. The channel pruning method according to claim 1, wherein the step of obtaining the training data set in step S2 specifically includes the following steps:
acquiring training pictures; carrying out random multi-scale transformation on the acquired pictures; after the transformation, randomly flipping them left-right with a set probability; finally, unifying the picture size by padding with a gray value;
arranging the labels into a uniform format (n, x, y, w, h), in which n is the target category; (x, y) are the center coordinates of the target box after normalization by the image width and height; and (w, h) are the width and height of the target box after normalization.
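A minimal helper for producing the (n, x, y, w, h) label format from a corner-format box might look like the following; the function name and the corner-format input are assumptions for illustration:

```python
def to_yolo_label(cls, x1, y1, x2, y2, img_w, img_h):
    """Convert a corner box (x1, y1, x2, y2) in pixels to the
    (n, x, y, w, h) format: class id plus box center and size,
    each normalized by the image width/height."""
    cx = (x1 + x2) / 2 / img_w
    cy = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return (cls, cx, cy, w, h)
```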
3. The channel pruning method according to claim 2, wherein the step S3 of equivalently decoupling the convolutional layer of the basic network model obtained in the step S2 to obtain a basic network decoupling model specifically comprises the following steps:
equivalently decoupling the c-th convolution layer w_c of the basic network model W obtained in step S2 into the cascaded original convolution layer w_c and structural convolution w_e;
wherein the structural convolution w_e is a convolution layer with 1×1 kernels; the initial weight of the structural convolution w_e is a d_o×d_o identity matrix, d_o being the number of output channels of the original convolution layer w_c.
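Because the structural convolution starts as a 1×1 identity, inserting it leaves the network's output unchanged; a quick numpy check (names illustrative, operating directly on a feature map that stands in for the output of w_c):

```python
import numpy as np

rng = np.random.default_rng(1)
d_o = 4                                  # output channels of the original conv w_c
y = rng.normal(size=(d_o, 6, 6))         # feature map produced by w_c
w_e = np.eye(d_o)                        # structural conv: 1x1 kernel, identity init

# a 1x1 convolution mixes channels independently at every pixel
z = np.einsum('qd,dhw->qhw', w_e, y)
```

With the identity initialization z equals y exactly, so training of the decoupled model starts from a network equivalent to the base model.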
4. The channel pruning method according to claim 3, wherein, to speed up the data processing flow, the structural convolution w_e is moved after the batch normalization layer that follows the original convolution layer w_c.
5. The channel pruning method according to claim 4, wherein the step S4 of training the basic network decoupling model obtained in the step S3 by using the training data set and the loss function obtained in the step S2 to obtain the decoupling model specifically comprises the steps of:
A. setting a learning rate by adopting the training data set and the loss function obtained in the step S2, and training the basic network decoupling model obtained in the step S3 again;
during training, the first N rounds are trained normally; after round N, the channels are sorted by the magnitude of the structural convolution's parameters, the channels to be compressed are selected, and an additional penalty gradient is applied to the corresponding parameters of the structural convolution;
B. writing the parameters of the structural convolution as a D×D matrix Q, where D is the number of convolution-kernel channels of the structural convolution layer; then calculating, from the parameters of the structural convolution, the channel importance I_d of the d-th channel of the original convolution as I_d = ||Q_d||_2, the L2 norm of the parameters of Q associated with channel d;
C. selecting the number M of channels to be compressed: initially M = 0; starting from round N, after every X batches of training, M is increased by Y until the preset channel compression ratio is reached; meanwhile, when channels are selected, the number of channels of each convolution is not allowed to fall below a set value S, wherein X, Y and S are all set positive integers;
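The growth schedule for M can be sketched as a pure function; the function name and the exact capping behaviour are assumptions consistent with the claim:

```python
def scheduled_m(batches_done, X, Y, m_target):
    """Number of channels selected for compression after `batches_done`
    training batches past round N: grow by Y every X batches, capped at
    the target implied by the preset channel compression ratio."""
    return min((batches_done // X) * Y, m_target)
```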
D. the convolution parameters are updated as W' = W − l·G, in which W' is the updated convolution parameter; W is the convolution parameter before the update; l is the learning rate; and G is the gradient of the loss function returned to the convolution;
in the structural convolution, for a channel that does not need to be compressed, the parameters are updated in the same way as the original convolution parameters; for a channel that needs to be compressed, the gradient update is modified by applying an additional penalty gradient, and the parameters are updated as Q' = Q − l·(G + P), in which Q is the structural-convolution parameter before the update; Q' is the updated structural-convolution parameter; P = λ·sgn(Q) is the imposed penalty gradient; λ is a penalty factor with λ > 0; and sgn(·) is the sign function.
6. the channel pruning method according to claim 5, wherein the step S5 of determining the channels that can be finally compressed and the remaining channels according to the decoupling model obtained in the step S4 specifically comprises the steps of:
calculating the channel importance of each channel of the original convolution by using the parameters of the structural convolution, wherein the channel importance of the d-th channel is I_d;
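One plausible reading of the importance computation, in line with the decoupling idea (the L2-norm metric is an assumption; the claim only states that I_d is derived from the structural convolution's parameters):

```python
import numpy as np

rng = np.random.default_rng(2)
D = 6
Q = rng.normal(size=(D, D))              # structural 1x1 conv parameters
Q[:, 3] *= 1e-3                          # a channel driven toward zero by the penalty

# assumed metric: I_d is the L2 norm of the parameters tied to channel d
importance = np.linalg.norm(Q, axis=0)
weakest_first = np.argsort(importance)   # candidates for compression come first
```

The channel whose parameters were attenuated toward zero is ranked least important and becomes the first candidate for compression.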
7. The channel pruning method according to claim 6, wherein the step S6 of equivalently merging the decoupling models obtained in the step S4 according to the channels that can be compressed and the reserved channels determined in the step S5 includes the following steps:
a. combining the calculation formulas of the convolution layer and the batch normalization layer to obtain y = (γ/(σ+ε))·(w⊛x + b − μ) + β, in which x is the input feature; y is the output of the input feature after passing through the convolution layer and the batch normalization layer; w is the weight parameter of the convolution layer; b is the bias parameter of the convolution layer; γ is the scaling factor of the batch normalization layer; μ is the mean of the batch normalization layer; σ is the standard deviation of the batch normalization layer; ε is a set minimum value; β is the offset coefficient of the batch normalization layer; and ⊛ is the convolution operator;
b. rearranging the combined calculation formula into the convolution calculation format y = w'⊛x + b'; the convolution corresponding to w' and b' is the new convolution;
c. calculating the weight and the bias of the new convolution obtained in step b using the following formulas: w' = (γ/(σ+ε))·w and b' = (γ/(σ+ε))·(b − μ) + β, in which w' is the weight parameter of the new convolution and b' is the bias parameter of the new convolution;
d. merging the new convolution obtained in step b with the structural convolution, and calculating the weight and the bias of the merged convolution layer: W = w_Q⊛w' and B = w_Q·b', in which W is the weight of the merged convolution layer; w_Q is the weight of the structural convolution; w' is the weight of the new convolution; B is the bias of the merged convolution layer; b' is the bias of the new convolution; and ⊛ is the convolution operator.
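Steps a–c amount to the standard conv–BN fusion; the numpy check below uses 1×1 kernels so the convolution reduces to a matrix product (all names illustrative, and the (σ+ε) denominator follows the formula above):

```python
import numpy as np

rng = np.random.default_rng(3)
c_in, d_o, n_pix = 3, 4, 10
w = rng.normal(size=(d_o, c_in))          # conv weight (1x1 kernel)
b = rng.normal(size=d_o)                  # conv bias
gamma = rng.normal(size=d_o)              # BN scaling factor
beta = rng.normal(size=d_o)               # BN offset coefficient
mu = rng.normal(size=d_o)                 # BN mean
sigma = rng.uniform(0.5, 1.5, size=d_o)   # BN standard deviation
eps = 1e-5                                # set minimum value

x = rng.normal(size=(c_in, n_pix))
# step a: convolution followed by batch normalization
y_ref = (gamma / (sigma + eps))[:, None] * (w @ x + b[:, None] - mu[:, None]) \
        + beta[:, None]

# steps b-c: fold BN into the new convolution w', b'
scale = gamma / (sigma + eps)
w_new = scale[:, None] * w
b_new = scale * (b - mu) + beta
y_fused = w_new @ x + b_new[:, None]
```

The fused layer reproduces the conv+BN output exactly, which is what makes the merge "equivalent" rather than approximate.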
8. An object detection method comprising the channel pruning method according to any one of claims 1 to 7, characterized by comprising the steps of:
(1) constructing an original model of target detection;
(2) performing channel pruning on the target detection original model constructed in the step (1) by adopting the channel pruning method, thereby obtaining a target detection model;
(3) performing actual target detection by using the target detection model obtained in step (2).
9. A remote sensing image vehicle detection method including the object detection method of claim 8, characterized by comprising the steps of:
1) acquiring a remote sensing image vehicle detection data set;
2) constructing the original target detection model as a YOLOv5 model;
3) performing channel pruning on the target detection original model constructed in the step 2) by adopting the channel pruning method, thereby obtaining a cut target detection model;
4) performing actual vehicle detection on remote sensing images by using the target detection model obtained in step 3).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210738608.9A CN114913441B (en) | 2022-06-28 | 2022-06-28 | Channel pruning method, target detection method and remote sensing image vehicle detection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114913441A true CN114913441A (en) | 2022-08-16 |
CN114913441B CN114913441B (en) | 2024-04-16 |
Family
ID=82772813
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210738608.9A Active CN114913441B (en) | 2022-06-28 | 2022-06-28 | Channel pruning method, target detection method and remote sensing image vehicle detection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114913441B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115730654A (en) * | 2022-11-23 | 2023-03-03 | 湖南大学 | Layer pruning method, kitchen garbage detection method and remote sensing image vehicle detection method |
CN116579409A (en) * | 2023-07-11 | 2023-08-11 | 菲特(天津)检测技术有限公司 | Intelligent camera model pruning acceleration method and acceleration system based on re-parameterization |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110009095A (en) * | 2019-03-04 | 2019-07-12 | 东南大学 | Road driving area efficient dividing method based on depth characteristic compression convolutional network |
CN111680781A (en) * | 2020-04-20 | 2020-09-18 | 北京迈格威科技有限公司 | Neural network processing method, neural network processing device, electronic equipment and storage medium |
CN111967594A (en) * | 2020-08-06 | 2020-11-20 | 苏州浪潮智能科技有限公司 | Neural network compression method, device, equipment and storage medium |
US20210049423A1 (en) * | 2019-07-31 | 2021-02-18 | Zhejiang University | Efficient image classification method based on structured pruning |
WO2021129570A1 (en) * | 2019-12-25 | 2021-07-01 | 神思电子技术股份有限公司 | Network pruning optimization method based on network activation and sparsification |
CN113222142A (en) * | 2021-05-28 | 2021-08-06 | 上海天壤智能科技有限公司 | Channel pruning and quick connection layer pruning method and system |
CN113255892A (en) * | 2021-06-01 | 2021-08-13 | 上海交通大学烟台信息技术研究院 | Method and device for searching decoupled network structure and readable storage medium |
WO2021208151A1 (en) * | 2020-04-13 | 2021-10-21 | 商汤集团有限公司 | Model compression method, image processing method and device |
CN114065923A (en) * | 2021-11-30 | 2022-02-18 | 南京航空航天大学 | Compression method, system and accelerating device of convolutional neural network |
CN114594461A (en) * | 2022-03-14 | 2022-06-07 | 杭州电子科技大学 | Sonar target detection method based on attention perception and zoom factor pruning |
Non-Patent Citations (2)
Title |
---|
XIAOHAN DING et al.: "ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting", arXiv:2007.03260v4, 14 August 2021 (2021-08-14), pages 1-11 *
GUO Qingbei: "Research on Compression and Acceleration Techniques for Deep Convolutional Neural Networks", China Doctoral Dissertations Full-text Database (Information Science and Technology), 15 March 2022 (2022-03-15), pages 140-26 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||